Analysis (Python)¶

raven-toolbox objects in raven_toolbox.analysis, collected from the source of the tracked branch.

Functions¶

Function	Summary
`find_good_reactions`	Reactions usable as random objectives: carry real (non-loop) flux.
`FluxSamplingResult`	Output of `random_sampling`.
`fseof`	Run FSEOF for over-production of `target_rxn`'s product.
`FSEOFResult`	FSEOF output.
`max_volume_ellipsoid`	Maximum-volume ellipsoid inscribed in the polytope `{z : A z <= b}`.
`random_sampling`	Sample `model`'s flux space — entry point for all sampling methods.
`reporter_metabolites`	Compute Reporter Metabolites from per-gene differential-expression p-values.
`ReporterResult`	Reporter-metabolite scores for one gene set.

Reference¶

find_good_reactions¶

Reactions usable as random objectives: carry real (non-loop) flux.

A reaction is kept if its FVA range spans more than `flux_tol`. With `loopless=True` (the default) the FVA is loopless (cycleFreeFlux), so a reaction that can carry flux only through a thermodynamically infeasible cycle has a loopless range of approximately zero and is dropped. This is the right test for loop-only reactions: a fixed bound threshold would wrongly drop legitimate reactions that simply reach the model's default bound (e.g. 1000). Pass `loopless=False` for a faster, looser pass that keeps any flux-carrying reaction, loops included.
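The range test described above can be sketched in a few lines. This is only an illustration, not the library's implementation: the helper `keep_flux_carrying`, the reaction ids, the FVA numbers, and the `1e-6` tolerance are all hypothetical.

```python
def keep_flux_carrying(fva_ranges, flux_tol=1e-6):
    """Return the ids whose FVA range (maximum - minimum) exceeds flux_tol.

    flux_tol=1e-6 is an assumed value for this sketch only.
    """
    return [rid for rid, (lo, hi) in fva_ranges.items() if hi - lo > flux_tol]


# Hypothetical loopless FVA results: (minimum, maximum) flux per reaction.
loopless_ranges = {
    "R_uptake": (0.0, 1000.0),  # legitimately reaches the default bound
    "R_cycle": (0.0, 0.0),      # could only carry flux inside a loop
    "R_blocked": (0.0, 0.0),    # carries no flux at all
}

kept = keep_flux_carrying(loopless_ranges)
print(kept)
```

Note that `R_uptake` survives even though it sits at the 1000 bound, which is exactly the case a fixed bound threshold would get wrong.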

FluxSamplingResult¶

Output of `raven_toolbox.analysis.sampling.random_sampling`.

Attributes:

Name	Type	Description
`samples`	`DataFrame`	Flux vectors shaped n_samples × n_reactions (one sample per row, reaction ids as columns — the `cobra.sampling` layout).
`method`	`str`	`"achr"`, `"chrr"`, or `"random_objective"`.
`good_reactions`	`list[str] \| None`	`random_objective` only — reaction ids eligible as random objectives; `None` for the MCMC methods.
`n_dimensions`	`int \| None`	`chrr` only — dimension of the full-dimensional flux polytope sampled (degrees of freedom after fixing implicitly-determined reactions).
`mve_converged`	`bool \| None`	`chrr` only — whether the MVE rounding solver reached its tolerance. A `False` here is not fatal (the last ellipsoid iterate is still a valid rounding) but very elongated results may warrant more thinning.
`n_warmup`	`int \| None`	`achr` only — number of FVA warmup directions.
`fixed_reactions`	`list[str]`	`chrr` only — reactions folded into the equality system as fixed.

fseof¶

Run FSEOF for over-production of target_rxn's product.

Enforces the target flux at `n_steps` levels, from `max_fraction / n_steps` up to `max_fraction` of the theoretical maximum, and at each level maximises growth (`biomass_rxn` or the model's current objective) with pFBA. Returns an `FSEOFResult`.
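As a rough illustration of the enforced levels, the sketch below assumes they are evenly spaced between the two endpoints named above. The helper `enforced_levels` and the numeric values are hypothetical and are not taken from the library.

```python
def enforced_levels(theoretical_max, max_fraction, n_steps):
    """Evenly spaced enforced target fluxes (assumed spacing, for illustration).

    Level k is (k / n_steps) * max_fraction * theoretical_max for k = 1..n_steps,
    so the first level is max_fraction / n_steps of the maximum and the last
    is max_fraction of the maximum.
    """
    return [theoretical_max * max_fraction * k / n_steps
            for k in range(1, n_steps + 1)]


# Hypothetical numbers: theoretical maximum of 10 flux units, scan up to 90 %.
levels = enforced_levels(theoretical_max=10.0, max_fraction=0.9, n_steps=3)
print(levels)
```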

FSEOFResult¶

FSEOF output.

`scan` is reactions × enforced-flux-levels (the full flux scan); `enforced` holds the enforced target fluxes; `targets` is the classified per-reaction table (sorted by score). The `gene_targets` property aggregates `targets` to genes.

gene_targets `property` ¶

gene_targets: DataFrame

Per-gene aggregation: the target reactions each gene is associated with.
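A minimal sketch of what such a per-gene aggregation could look like, using plain dictionaries instead of the DataFrame the property returns. The helper `aggregate_to_genes` and the reaction-to-gene mapping are hypothetical.

```python
from collections import defaultdict


def aggregate_to_genes(target_rxns, rxn_genes):
    """Map each gene to the target reactions it is associated with."""
    by_gene = defaultdict(list)
    for rxn in target_rxns:
        # Reactions without gene associations simply contribute nothing.
        for gene in rxn_genes.get(rxn, ()):
            by_gene[gene].append(rxn)
    return dict(by_gene)


# Hypothetical gene-reaction associations.
rxn_genes = {"R1": ["g1"], "R2": ["g1", "g2"], "R3": []}
gene_table = aggregate_to_genes(["R1", "R2", "R3"], rxn_genes)
print(gene_table)
```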

max_volume_ellipsoid¶

Maximum-volume ellipsoid inscribed in the polytope {z : A z <= b}.

Solves max log det E over centre x and SPD E such that the ellipsoid {x + E s : ||s||_2 <= 1} is contained in {z : A z <= b}, using the Zhang & Gao (2003) primal-dual interior-point method (the regularised variant shipped in COBRA's chrrSampler).

Parameters:

Name	Type	Description	Default
`A`	`ndarray`	Constraint matrix of the polytope `A z <= b`, shape (m, n) with `m >= n + 1`. The polytope must have a non-empty interior.	required
`b`	`ndarray`	Right-hand side of `A z <= b`, shape (m,).	required
`x0`	`ndarray \| None`	A strictly interior point (`A x0 < b`). If `None` a Chebyshev centre is computed by LP.	`None`
`maxiter`	`int`	Iteration cap.	`150`
`tol`	`float`	Convergence tolerance.	unspecified
`reg`	`float`	Diagonal/Levenberg regularisation that keeps the two Newton solves well-conditioned.	unspecified

Returns:

Name	Type	Description
`x`	`ndarray`	Ellipsoid centre, shape (n,).
`E`	`ndarray`	Lower-triangular rounding transform with `E @ E.T == E2` (the SPD ellipsoid matrix). The rounding substitution is `z = x + E y`.
`converged`	`bool`	Whether `tol` was reached within `maxiter`.

Notes

Validated against analytic cases: a box maps to the unit ball (E = I); a scaled/sheared box {-1 <= M z <= 1} gives E E^T = M^{-1} M^{-T}; the standard simplex gives its inscribed ball.
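The analytic cases can be checked by hand with the standard containment condition: the ellipsoid {x + E s : ||s||_2 <= 1} lies inside {z : A z <= b} exactly when a_i · x + ||E^T a_i||_2 <= b_i for every row a_i. The sketch below evaluates the slack of that inequality for the box and the sheared-box cases. The helper `containment_slacks` and the matrix `M` are illustrative choices, not part of the library.

```python
import math


def containment_slacks(A, b, x, E):
    """Return b_i - (a_i . x + ||E^T a_i||_2) for every constraint row.

    All slacks >= 0 means the ellipsoid {x + E s : ||s|| <= 1} is contained
    in the polytope; a zero slack means it touches that facet.
    """
    n = len(x)
    slacks = []
    for a_i, b_i in zip(A, b):
        centre_term = sum(a_i[j] * x[j] for j in range(n))
        # (E^T a_i)_k = sum_j E[j][k] * a_i[j]
        Et_a = [sum(E[j][k] * a_i[j] for j in range(n)) for k in range(n)]
        reach = math.sqrt(sum(v * v for v in Et_a))
        slacks.append(b_i - (centre_term + reach))
    return slacks


# Case 1: the box {-1 <= z <= 1} with the unit ball (E = I).
A_box = [[1, 0], [0, 1], [-1, 0], [0, -1]]
box_slacks = containment_slacks(A_box, [1, 1, 1, 1], [0, 0], [[1, 0], [0, 1]])

# Case 2: the sheared box {-1 <= M z <= 1} with E = M^{-1}.
M = [[2, 0], [1, 1]]
M_inv = [[0.5, 0.0], [-0.5, 1.0]]  # inverse of M, computed by hand
A_shear = M + [[-v for v in row] for row in M]
shear_slacks = containment_slacks(A_shear, [1, 1, 1, 1], [0, 0], M_inv)

print(box_slacks, shear_slacks)
```

In both cases every slack is zero, so the ellipsoid touches every facet, which is what one expects of the maximum-volume inscribed ellipsoid of a (sheared) box.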

random_sampling¶

Sample model's flux space — entry point for all sampling methods.

Dispatches on method:

"achr" (default) — Artificially Centered Hit-and-Run; near-uniform MCMC sampling of the polytope interior (wraps :class:cobra.sampling.ACHRSampler).
"chrr" — Coordinate Hit-and-Run with Rounding; near-uniform MCMC with maximum-volume-ellipsoid rounding, for better mixing on thin/ill-conditioned polytopes such as enzyme-constrained models.
"random_objective" — the random-objective vertex method of Bordel et al. (2010): each sample maximises a small random objective, returning a polytope vertex. This was random_sampling's historical behaviour; it is no longer the default.

The "achr"/"chrr" methods draw the (near-)uniform interior distribution; "random_objective" draws diverse vertices. Set any constraints you want to condition on (e.g. a biomass lower bound, measured fluxes, enzyme-usage bounds) on the model before calling.
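To make the roles of `thinning`, `warmup`, and `seed` concrete, here is a toy coordinate hit-and-run sampler on a small polytope {z : A z <= b}. It is a teaching sketch under simplifying assumptions (no rounding, no equality constraints, a bounded polytope, a strictly interior start point) and is not how `random_sampling` is implemented; the function `coordinate_hit_and_run` is hypothetical.

```python
import random


def coordinate_hit_and_run(A, b, start, n_samples, thinning=10, warmup=100, seed=None):
    """Toy coordinate hit-and-run on a bounded polytope {z : A z <= b}."""
    rng = random.Random(seed)
    z = list(start)
    samples = []
    total_steps = warmup + n_samples * thinning
    for step in range(1, total_steps + 1):
        k = rng.randrange(len(z))  # coordinate direction to move along
        lo, hi = -float("inf"), float("inf")
        for a_i, b_i in zip(A, b):
            slack = b_i - sum(a * v for a, v in zip(a_i, z))
            # Moving by t along e_k keeps row i feasible iff t * a_i[k] <= slack.
            if a_i[k] > 0:
                hi = min(hi, slack / a_i[k])
            elif a_i[k] < 0:
                lo = max(lo, slack / a_i[k])
        z[k] += rng.uniform(lo, hi)  # uniform point on the feasible chord
        # Discard the burn-in, then record every `thinning`-th state.
        if step > warmup and (step - warmup) % thinning == 0:
            samples.append(list(z))
    return samples


# Triangle z1 >= 0, z2 >= 0, z1 + z2 <= 1.
A_tri = [[-1, 0], [0, -1], [1, 1]]
b_tri = [0, 0, 1]
samples = coordinate_hit_and_run(A_tri, b_tri, [0.25, 0.25], n_samples=50, seed=0)
print(len(samples))
```

Larger `thinning` reduces correlation between consecutive recorded samples at the cost of more chain steps, which is why thin or elongated polytopes benefit from rounding before sampling.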

Parameters:

Name	Type	Description	Default
`n_samples`	`int`	Number of flux vectors to return.	`1000`
`method`	`str`	`"achr"` (default), `"chrr"`, or `"random_objective"`.	`'achr'`
`seed`	`int \| None`	Seed for reproducible chains/draws.	`None`
`thinning`	`int`	`achr`/`chrr` — Markov-chain steps between recorded samples (default 100).	`100`
`warmup`	`int`	`chrr` — burn-in steps discarded before the first recorded sample.	`1000`
`fixed_width_tol`	`float`	`chrr` — a reaction whose FVA range is narrower than this is folded into the equality system as fixed (keeps the reduced polytope full-dimensional).	`1e-07`
`n_objectives`	`int`	`random_objective` only; see the method's parameters.	`2`
`good_reactions`	`list[str] \| None`	`random_objective` only; reaction ids eligible as random objectives. Pass back `good_reactions` from a previous result to skip the one-off FVA.	unspecified
`replace_max_bound`	unspecified	`random_objective` only; see the method's parameters.	unspecified
`min_flux`	unspecified	`random_objective` only; see the method's parameters.	unspecified
`loopless_good_reactions`	unspecified	`random_objective` only; see the method's parameters.	unspecified
`exclude_reactions`	unspecified	`random_objective` only; see the method's parameters.	unspecified
`max_attempts`	unspecified	`random_objective` only; see the method's parameters.	unspecified
`suppress_errors`	unspecified	`random_objective` only; see the method's parameters.	unspecified

Returns:

Type	Description
`FluxSamplingResult`	The sampled flux vectors together with method-specific metadata.

reporter_metabolites¶

Compute Reporter Metabolites from per-gene differential-expression p-values.

gene_pvalues maps gene id → p-value (genes not in the model, or with a NaN or out-of-[0, 1] p-value, are dropped — a stray invalid p-value would otherwise turn the whole result NaN). If gene_fold_changes (gene id → log fold change) is given, two extra results are returned for the up- (fc ≥ 0) and down- (fc < 0) regulated gene subsets, in addition to "all".
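The filtering and the up/down partition described above can be sketched as follows. The helper `clean_and_split` is hypothetical, and the handling of genes that have a p-value but no fold change (they are left out of both "up" and "down" here) is an assumption of this sketch, not documented behaviour.

```python
import math


def clean_and_split(gene_pvalues, model_genes, gene_fold_changes=None):
    """Drop unusable p-values, then partition the rest by fold-change sign."""
    valid = {
        gene: p
        for gene, p in gene_pvalues.items()
        if gene in model_genes and not math.isnan(p) and 0.0 <= p <= 1.0
    }
    subsets = {"all": valid}
    if gene_fold_changes is not None:
        # Assumption: genes without a fold change join neither subset.
        subsets["up"] = {g: p for g, p in valid.items()
                         if g in gene_fold_changes and gene_fold_changes[g] >= 0}
        subsets["down"] = {g: p for g, p in valid.items()
                           if g in gene_fold_changes and gene_fold_changes[g] < 0}
    return subsets


# Hypothetical inputs: g3 is NaN, g4 is out of range, g5 is not in the model.
pvals = {"g1": 0.01, "g2": 0.5, "g3": float("nan"), "g4": 1.2, "g5": 0.03}
fold = {"g1": 1.5, "g2": -0.7}
subsets = clean_and_split(pvals, {"g1", "g2", "g3", "g4"}, fold)
print(subsets)
```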

Parity with RAVEN's reporterMetabolites: the z_score and underlying background correction match exactly (exact closed-form instead of RAVEN's Monte-Carlo, see IMPROVEMENTS RM1). The reported p_value is the one-sided ("up") enrichment 1 - Φ(z) and the result is sorted by z_score descending. RAVEN sorts by p-value and reports both tails (allPValues, allUpPValues, allDownPValues); the up/down splits here come from the gene_fold_changes subset partition instead, so the same information is available via the three returned ReporterResult rows.
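The one-sided p-value 1 - Φ(z) and the descending sort can be reproduced with the standard library alone, using the identity 1 - Φ(z) = ½ erfc(z / √2). The metabolite ids and z-scores below are made up for illustration.

```python
import math


def upper_tail_p(z):
    """One-sided ("up") p-value 1 - Phi(z) for a standard normal z-score."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical (metabolite, z_score) pairs.
scores = [("m1", 0.0), ("m2", 1.96), ("m3", -1.0)]

# Attach the p-value and sort by z_score descending, as the result table is.
ranked = sorted(
    ((met, z, upper_tail_p(z)) for met, z in scores),
    key=lambda row: row[1],
    reverse=True,
)
for met, z, p in ranked:
    print(f"{met}\t{z:+.2f}\t{p:.4f}")
```

Because 1 - Φ(z) decreases monotonically in z, sorting by descending `z_score` gives the same order as sorting by ascending `p_value`.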

ReporterResult¶

Reporter-metabolite scores for one gene set.

test is "all", "up" or "down"; table is a DataFrame with columns metabolite, name, z_score, p_value, n_genes, mean_z, std_z sorted by descending z_score.