Massive univariate linear model.
Residualization of Y data on a set of variables, possibly adjusted for other variables.
class mulm.residualizer.residualizer.Residualizer(data=None, formula_res=None, formula_full=None, contrast_res=None)
Residualization of Y data on a set of variables, possibly adjusted for other variables.
Example: Y is an (n, p) array of p dependent variables; we want to residualize for "site", adjusted for "age + sex".
1) Use of a DataFrame and formulas:
1.1) Residualizer(data=df, formula_res="site", formula_full="site + age + sex")
1.2) Z = get_design_mat(data) returns the design matrix as a numpy (n, k) array. Row selection can then be applied to both Y and the design matrix (cross-validation, etc.).
2) Use of raw arrays, if you choose to write your design matrix manually. In this case, provide res_mask, i.e., the residualization mask within your full model. For example, Residualizer(mask=[False, True, False, False]) will fit the whole model and residualize on the second regressor, i.e., site. Note that the constructor signature lists no res_mask or mask argument; contrast_res appears to be the corresponding parameter.
3) fit(Y, X) fits the model Y = b0 + b1 site + b2 age + b3 sex + eps, i.e., it learns and stores b1, b2, b3.
transform(Y, X) residualizes Y on X, i.e., it returns Y - b1 site.
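The fit and transform steps above can be sketched with plain NumPy. This is only an illustration of the underlying arithmetic, not the mulm implementation: the simulated data, the column order (intercept, site, age, sex) and the variable names are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 5

# Hypothetical design matrix columns: intercept, site, age, sex.
site = rng.integers(0, 2, size=n).astype(float)
age = rng.normal(40.0, 10.0, size=n)
sex = rng.integers(0, 2, size=n).astype(float)
Z = np.column_stack([np.ones(n), site, age, sex])

# Residualization mask: True for the regressors to remove (here, site only).
mask = np.array([False, True, False, False])

# Simulated Y: linear effects of every regressor plus a little noise.
true_coef = rng.normal(size=(4, p))
Y = Z @ true_coef + rng.normal(scale=0.1, size=(n, p))

# Row selection applies to both Y and Z (e.g., a train/test split).
train, test = np.arange(0, 70), np.arange(70, n)

# "fit": estimate the full model on the training rows,
# one least-squares solve for all p columns of Y.
coef, *_ = np.linalg.lstsq(Z[train], Y[train], rcond=None)  # shape (k, p)

# "transform": remove only the masked regressors from the test rows,
# i.e., Y - b1 * site, while the age and sex effects stay in Y.
Y_res = Y[test] - Z[test][:, mask] @ coef[mask, :]
```

Fitting the full model while subtracting only the masked term is what "adjusted for age + sex" means here: the site coefficient is estimated with age and sex in the model, but their contributions are left in the residualized data.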
fit(Y, X)
Fit the parameters of p linear models, where each column of Y is regressed on X.
Parameters
Y: array (n, p)
Dependent variables.
X: array (n, k)
Design matrix of independent variables.
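To make the shapes concrete, the following sketch shows how p linear models that share one design matrix can be fitted in a single solve, giving a (k, p) coefficient array. It uses only NumPy; the pseudo-inverse approach and the variable names are assumptions for illustration and not necessarily what fit does internally.

```python
import numpy as np

rng = np.random.default_rng(42)
n, k, p = 50, 3, 4

# (n, k) design matrix with an intercept column, and (n, p) dependent variables.
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
Y = rng.normal(size=(n, p))

# One solve for all p models via the pseudo-inverse: coefficients have shape (k, p).
coef_joint = np.linalg.pinv(X) @ Y

# The same coefficients obtained by fitting each column of Y separately.
coef_loop = np.column_stack(
    [np.linalg.lstsq(X, Y[:, j], rcond=None)[0] for j in range(p)]
)
```

Because the design matrix is shared, the pseudo-inverse is computed once and reused for every column of Y, which is what makes the massive univariate setting cheap.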
class mulm.residualizer.residualizer.ResidualizerEstimator(residualizer)
Wrap Residualizer into an estimator compatible with the sklearn API.
Note that, to be consistent with the sklearn API, X here contains the input variables and Z is the design matrix for residualization.
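One way to picture an sklearn-style wrapper is a transformer whose fit and transform take a single array. The sketch below is hypothetical and does not reproduce ResidualizerEstimator: the class name PackedResidualizerSketch, the n_design and mask parameters, and the convention of placing Z in the leading columns of the input are all assumptions chosen for the example.

```python
import numpy as np


class PackedResidualizerSketch:
    """Hypothetical transformer exposing fit/transform on a single array.

    Assumption: the first n_design columns of the input hold the design
    matrix Z; the remaining columns hold the variables to residualize.
    """

    def __init__(self, n_design, mask):
        self.n_design = n_design
        self.mask = np.asarray(mask, dtype=bool)

    def _unpack(self, X):
        # Split the packed array back into (Z, data).
        return X[:, :self.n_design], X[:, self.n_design:]

    def fit(self, X, y=None):
        Z, data = self._unpack(X)
        # Fit the full model; coef_ has shape (n_design, n_variables).
        self.coef_, *_ = np.linalg.lstsq(Z, data, rcond=None)
        return self

    def transform(self, X):
        Z, data = self._unpack(X)
        # Subtract only the contribution of the masked regressors.
        return data - Z[:, self.mask] @ self.coef_[self.mask, :]

    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)


rng = np.random.default_rng(1)
n = 60
# Hypothetical design matrix: intercept, site, age.
Z = np.column_stack([np.ones(n), rng.integers(0, 2, size=n), rng.normal(size=n)])
data = rng.normal(size=(n, 3)) + 2.0 * Z[:, [1]]  # add a site effect to every column
X_packed = np.hstack([Z, data])

est = PackedResidualizerSketch(n_design=3, mask=[False, True, False])
data_res = est.fit_transform(X_packed)
```

Passing everything through a single array keeps the fit(X, y) and transform(X) call shapes that sklearn tools expect, so row selection during cross-validation applies to Z and the data together.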