Computes the mean, standard deviation, coefficient of variation (CV), and number of non-missing observations for each feature (for example, gene or protein) within each condition defined in colData(se).

Usage

calc_gene_CV_by_condition(
  se,
  assay_name = "conc",
  condition_col = "condition",
  na_rm = TRUE
)

Arguments

se

A SummarizedExperiment object.

assay_name

A character scalar specifying which assay in se should be used for CV calculation. Default is "conc".

condition_col

A character scalar specifying the column name in colData(se) that defines sample conditions. Default is "condition".

na_rm

Logical, whether missing values should be removed when calculating means and standard deviations. Default is TRUE.

Value

A data.frame with one row per feature-condition pair and the following columns:

feature

Feature name, taken from rownames(se).

condition

Condition label from colData(se). The actual column name in the returned data frame will match condition_col.

mean_val

Mean value of the feature within the condition.

sd_val

Standard deviation of the feature within the condition.

CV

Coefficient of variation, calculated as sd_val / mean_val.

n

Number of non-missing observations used in the calculation.
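As a small worked illustration of how these columns relate to each other, the following base R snippet computes them for one hypothetical feature within one condition. The vector x and its values are made up for illustration, and the use of mean() and sd() (which uses the n - 1 denominator) is an assumption about how the summary statistics are obtained, not a statement about the package internals.

```r
# Hypothetical measurements of one feature across four samples of one condition
x <- c(10, 12, NA, 14)

n        <- sum(!is.na(x))          # number of non-missing observations: 3
mean_val <- mean(x, na.rm = TRUE)   # 12
sd_val   <- sd(x, na.rm = TRUE)     # 2 (sample SD, n - 1 denominator)
CV       <- sd_val / mean_val       # 2 / 12, about 0.167
```

With na.rm = FALSE, the NA would propagate, so mean_val, sd_val, and CV would all be NA for this feature.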

If every sample belongs to a unique condition, the function returns NULL, because CV is not meaningful without replication.

Details

This implementation works directly on the assay matrix and is typically much faster and more memory-efficient than reshaping the assay into a long-format data frame before summarisation.
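To make the matrix-based idea concrete, here is a minimal sketch in base R. It is not the package's actual implementation: the helper name cv_by_condition_sketch, its arguments mat and cond, and the toy data are all hypothetical, and it takes a plain numeric matrix plus a vector of condition labels rather than a SummarizedExperiment. It also hard-codes the condition column name instead of matching condition_col.

```r
# Hypothetical helper illustrating a matrix-based per-condition summary.
# mat:  numeric matrix, features in rows, samples in columns
# cond: condition label for each column of mat
cv_by_condition_sketch <- function(mat, cond, na_rm = TRUE) {
  cond <- as.character(cond)
  stopifnot(ncol(mat) == length(cond))

  # No condition has replicates: mirror the documented NULL return
  if (all(table(cond) == 1L)) {
    return(NULL)
  }

  pieces <- lapply(unique(cond), function(cn) {
    # Subset columns for this condition; no reshaping to long format
    sub      <- mat[, cond == cn, drop = FALSE]
    mean_val <- rowMeans(sub, na.rm = na_rm)
    sd_val   <- apply(sub, 1, sd, na.rm = na_rm)
    data.frame(
      feature   = rownames(mat),
      condition = cn,
      mean_val  = mean_val,
      sd_val    = sd_val,
      CV        = sd_val / mean_val,
      n         = rowSums(!is.na(sub)),
      row.names = NULL,
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, pieces)
}

# Toy data (made up): two features, three control and two treated samples
mat <- rbind(
  geneA = c(10, 12, 14, 20, 22),
  geneB = c(5, NA, 7, 8, 8)
)
cond <- c("ctrl", "ctrl", "ctrl", "trt", "trt")

res <- cv_by_condition_sketch(mat, cond)
```

The result has one row per feature-condition pair (four rows here). Each condition is summarised with vectorised row operations on a column subset of the matrix, which is the general reason a matrix-based approach can avoid the overhead of building a long-format table first.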

Examples

# cv_df <- calc_gene_CV_by_condition(se_obj$se)