Factoring Matrices

xynergy.factor.matrix_factorize(df: DataFrame, dose_cols: list[str] = ['dose_a', 'dose_b'], response_col: str = 'resp_imputed', experiment_cols: str | list[str] | None = 'experiment_id', method: list[str] | str = ['NMF', 'SVD', 'PMF', 'RPCA'], og_response_col: str | None = 'response', log: str = 'all')

Estimate dose-response data via matrix factorization.

Parameters

df: polars.DataFrame

Usually the output from tidy or one of its downstream functions

dose_cols: list, default [“dose_a”, “dose_b”]

A list of exactly two column names that contain untransformed numeric values of agent dose

response_col: string, default “resp_imputed”

The name of the column containing responses. This column should not contain missing values. In a typical workflow, it will hold responses that have already been imputed

experiment_cols: list[str], string, or None, default “experiment_id”

The names of columns that should be used to distinguish one dose pair’s response from another. If none are supplied, two rows with the same doses will be considered replicates.

method: list[str] or str, default [“NMF”, “SVD”, “PMF”, “RPCA”]

The method(s) used for matrix factorization

log: string, default “all”

Verbosity of function. Options include “all”, “warn”, and “none”.

If “all”, will emit notes and warnings.
If “warn”, will emit only warnings.
If “none”, will not emit anything (except errors)
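To make the idea behind the method parameter concrete, here is a minimal sketch of approximating a dose-response grid with a low-rank product. It is not xynergy's implementation and does not implement NMF, SVD, PMF, or RPCA as the library does; it swaps in a plain power iteration to find a rank-1 approximation. The function name rank1_approx and the example effects are hypothetical.

```python
# Illustrative only: rank-1 approximation by power iteration, in plain Python.
# This is a stand-in for the general idea of matrix factorization, NOT the
# algorithm xynergy uses.

def rank1_approx(matrix, n_iter=50):
    """Approximate `matrix` (a list of equal-length rows) by an outer product u * v^T."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    v = [1.0] * n_cols  # starting guess; suitable for non-negative response grids
    u = [0.0] * n_rows
    for _ in range(n_iter):
        # u <- M v, then normalise u to unit length
        u = [sum(matrix[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        norm = sum(x * x for x in u) ** 0.5
        if norm == 0.0:
            # An all-zero matrix is its own rank-1 approximation
            return [[0.0] * n_cols for _ in range(n_rows)]
        u = [x / norm for x in u]
        # v <- M^T u; v keeps the scale (singular value) because u is unit length
        v = [sum(matrix[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
    return [[u[i] * v[j] for j in range(n_cols)] for i in range(n_rows)]


# Hypothetical grid: rows are doses of agent A, columns are doses of agent B.
# Each cell is the product of two single-agent effects, so the grid is rank 1.
effect_a = [1.0, 0.8, 0.5]
effect_b = [1.0, 0.9, 0.6, 0.2]
grid = [[a * b for b in effect_b] for a in effect_a]

approx = rank1_approx(grid)
max_error = max(
    abs(approx[i][j] - grid[i][j])
    for i in range(len(grid))
    for j in range(len(grid[0]))
)
```

Because the example grid is an exact outer product, the approximation reproduces it up to floating-point error. Real response data would not be exactly low rank, and the approximated values would differ from the supplied ones.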

Returns

polars.DataFrame: Input with [response_col]_[method] columns appended. These columns contain the supplied response values approximated by the respective method(s)
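As a rough illustration of the [response_col]_[method] naming pattern, the sketch below builds the column names one might expect for a given call. The helper expected_output_columns is hypothetical and not part of xynergy, and the exact casing of the method suffix in the real output is an assumption.

```python
# Hypothetical helper, not part of xynergy: derive expected output column names
# from the documented [response_col]_[method] pattern.
# Assumption: the method string is appended unchanged (casing not confirmed).

def expected_output_columns(response_col, method):
    # `method` may be a single string or a list of strings, as in the signature
    methods = [method] if isinstance(method, str) else list(method)
    return [f"{response_col}_{m}" for m in methods]


default_columns = expected_output_columns(
    "resp_imputed", ["NMF", "SVD", "PMF", "RPCA"]
)
single_column = expected_output_columns("resp_imputed", "SVD")
```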

Notes

If there are multiple responses per dose pair per experiment (that is, replicates), this function will silently take the mean.
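The averaging of replicates, and the effect of experiment_cols on what counts as a replicate, can be sketched in plain Python. This is an illustration under assumed data, not xynergy's code: the helper average_replicates and the example rows are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-format rows: two replicates in experiment 1 at the same dose pair
rows = [
    {"experiment_id": 1, "dose_a": 0.0, "dose_b": 1.0, "resp_imputed": 0.90},
    {"experiment_id": 1, "dose_a": 0.0, "dose_b": 1.0, "resp_imputed": 0.80},
    {"experiment_id": 1, "dose_a": 1.0, "dose_b": 1.0, "resp_imputed": 0.50},
    {"experiment_id": 2, "dose_a": 0.0, "dose_b": 1.0, "resp_imputed": 0.70},
]


def average_replicates(rows, key_cols, response_col):
    """Group rows by `key_cols` and return the mean response for each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[c] for c in key_cols)].append(row[response_col])
    return {key: mean(values) for key, values in groups.items()}


# With an experiment column, only rows sharing experiment AND doses are replicates
with_experiment = average_replicates(
    rows, ["experiment_id", "dose_a", "dose_b"], "resp_imputed"
)

# Without one, any two rows with the same doses are treated as replicates
without_experiment = average_replicates(rows, ["dose_a", "dose_b"], "resp_imputed")
```

In this sketch, dropping the experiment column pools the experiment 2 measurement with the two experiment 1 replicates, which changes the mean for that dose pair.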