Calculate coefficient of variation (CV) for each feature within each condition

Computes the mean, standard deviation, coefficient of variation (CV), and number of non-missing observations for each feature (for example, gene or protein) within each condition defined in colData(se).

Usage

calc_gene_CV_by_condition(
  se,
  assay_name = "conc",
  condition_col = "condition",
  na_rm = TRUE
)

Arguments

se: A SummarizedExperiment object.
assay_name: A character scalar specifying which assay in se should be used for CV calculation. Default is "conc".
condition_col: A character scalar specifying the column name in colData(se) that defines sample conditions. Default is "condition".
na_rm: Logical, whether missing values should be removed when calculating means and standard deviations. Default is TRUE.

Value

A data.frame with one row per feature-condition pair and the following columns:

feature: Feature name, taken from rownames(se).
condition: Condition label from colData(se). The actual column name in the returned data frame will match condition_col.
mean_val: Mean value of the feature within the condition.
sd_val: Standard deviation of the feature within the condition.
CV: Coefficient of variation, calculated as sd_val / mean_val.
n: Number of non-missing observations used in the calculation.

If every sample belongs to its own condition (that is, no condition contains more than one sample), the function returns NULL, because a CV cannot be estimated without replicates.
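
The no-replication case can be illustrated with a minimal base R sketch. The helper name no_replication is hypothetical and is not part of this function's interface; it only shows one way such a check could be written, assuming the condition labels are available as a plain vector.

```r
# Hypothetical helper (illustration only): TRUE when no condition
# label occurs more than once, i.e. there are no replicates.
no_replication <- function(cond) {
  all(table(cond) == 1L)
}

no_replication(c("A", "B", "C"))       # TRUE: every sample is its own condition
no_replication(c("A", "A", "B", "B"))  # FALSE: each condition has two samples
```

With a single observation per condition the standard deviation is undefined (sd() of a length-one vector is NA in R), so every CV would be missing.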

Details

This implementation operates directly on the assay matrix, which is typically faster and more memory-efficient than reshaping the assay into a long-format data frame before summarisation.
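
As a rough illustration of the matrix-based approach, the sketch below computes the same summary columns with vectorised row operations in base R. It is not the function's actual source: the name cv_by_condition_sketch, its arguments, and the toy data are assumptions made for this example, and it takes a plain features-by-samples matrix rather than a SummarizedExperiment.

```r
# Hypothetical sketch (not the package implementation).
# mat:  numeric matrix, features in rows, samples in columns, with rownames.
# cond: one condition label per column of mat.
cv_by_condition_sketch <- function(mat, cond, na_rm = TRUE) {
  cond <- as.character(cond)
  pieces <- lapply(unique(cond), function(cn) {
    sub <- mat[, cond == cn, drop = FALSE]
    n <- rowSums(!is.na(sub))                  # non-missing values per feature
    mean_val <- rowMeans(sub, na.rm = na_rm)
    # Row-wise sample SD: sqrt(sum((x - mean)^2) / (n - 1)).
    # Subtracting a length-nrow vector recycles down columns, so each
    # row has its own mean removed.
    ss <- rowSums((sub - mean_val)^2, na.rm = na_rm)
    denom <- if (na_rm) n - 1 else ncol(sub) - 1
    sd_val <- sqrt(ss / denom)
    sd_val[denom < 1] <- NA_real_              # SD undefined with < 2 values
    data.frame(feature = rownames(mat), condition = cn,
               mean_val = mean_val, sd_val = sd_val,
               CV = sd_val / mean_val, n = n,
               row.names = NULL, stringsAsFactors = FALSE)
  })
  do.call(rbind, pieces)
}

# Toy data: two features, two conditions with two replicates each.
mat <- matrix(c(1, 2, 3, 4,
                10, 20, 30, 40),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("g1", "g2"), paste0("s", 1:4)))
cond <- c("A", "A", "B", "B")
res <- cv_by_condition_sketch(mat, cond)
```

Here res has one row per feature-condition pair, and no long-format copy of the matrix is ever created; only one column subset per condition is held in memory at a time.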

Examples

# cv_df <- calc_gene_CV_by_condition(se_obj$se)