Analysis (Python)

raven-toolbox objects in raven_toolbox.analysis, collected from the source of the tracked branch.

Functions

Function Summary
find_good_reactions Reactions usable as random objectives: carry real (non-loop) flux.
FluxSamplingResult Output of raven_toolbox.analysis.sampling.random_sampling.
fseof Run FSEOF for over-production of target_rxn's product.
FSEOFResult FSEOF output.
max_volume_ellipsoid Maximum-volume ellipsoid inscribed in the polytope {z : A z <= b}.
random_sampling Sample model's flux space — entry point for all sampling methods.
reporter_metabolites Compute Reporter Metabolites from per-gene differential-expression p-values.
ReporterResult Reporter-metabolite scores for one gene set.

Reference

find_good_reactions

Reactions usable as random objectives: carry real (non-loop) flux.

A reaction is kept if its FVA range spans more than flux_tol. With loopless=True (the default) the FVA is loopless (cycleFreeFlux), so a reaction that can carry flux only through a thermodynamically infeasible cycle has a loopless range of about 0 and is dropped. This is the right test for "loopy" reactions: a fixed bound threshold would wrongly drop legitimate reactions that simply reach the model's default bound (e.g. 1000). Pass loopless=False for a faster, looser pass that keeps any flux-carrying reaction, loops included.
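The range test described above can be sketched in a few lines. This is a minimal illustration, not raven-toolbox's implementation: the helper name keep_flux_carrying, the dict-of-tuples input, and the flux_tol default of 1e-6 are all assumptions. In practice the ranges would come from a (loopless) FVA run.

```python
def keep_flux_carrying(fva_ranges, flux_tol=1e-6):
    """Return reaction ids whose FVA range (max - min) exceeds flux_tol.

    fva_ranges is a hypothetical mapping: reaction id -> (min_flux, max_flux).
    The flux_tol default is an assumption made for this sketch.
    """
    return [rid for rid, (lo, hi) in fva_ranges.items() if hi - lo > flux_tol]


# Made-up loopless FVA ranges for three reactions.
loopless_ranges = {
    "PGI": (-5.0, 12.0),       # ordinary flux-carrying reaction
    "EX_glc": (-1000.0, 0.0),  # reaches the default bound, still legitimate
    "LOOP_1": (0.0, 1e-9),     # only active inside a cycle: ~0 loopless range
}
good = keep_flux_carrying(loopless_ranges)
```

Note that EX_glc survives even though it touches the 1000 bound, whereas a fixed bound threshold would have discarded it.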

FluxSamplingResult

Output of raven_toolbox.analysis.sampling.random_sampling.

Attributes:

Name Type Description
samples DataFrame

Flux vectors shaped n_samples × n_reactions (one sample per row, reaction ids as columns — the cobra.sampling layout).

method str

"achr", "chrr", or "random_objective".

good_reactions list[str] | None

random_objective only — reaction ids eligible as random objectives; None for the MCMC methods.

n_dimensions int | None

chrr only — dimension of the full-dimensional flux polytope sampled (degrees of freedom after fixing implicitly-determined reactions).

mve_converged bool | None

chrr only — whether the MVE rounding solver reached its tolerance. A False here is not fatal (the last ellipsoid iterate is still a valid rounding) but very elongated results may warrant more thinning.

n_warmup int | None

achr only — number of FVA warmup directions.

fixed_reactions list[str]

chrr only — reactions folded into the equality system as fixed.

fseof

Run FSEOF for over-production of target_rxn's product.

Enforces target flux from max_fraction/n_steps up to max_fraction of the theoretical maximum in n_steps steps, maximising growth (biomass_rxn or the model's current objective) with pFBA at each step. Returns an FSEOFResult.
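The enforced flux levels can be illustrated as follows. This is a sketch only: it assumes the levels are evenly spaced, and the helper name enforced_levels and its default values are hypothetical rather than taken from fseof.

```python
def enforced_levels(max_target_flux, max_fraction=0.9, n_steps=10):
    """Evenly spaced enforced target fluxes (even spacing is an assumption).

    The first level is (max_fraction / n_steps) of the theoretical maximum
    and the last level is max_fraction of it.
    """
    step = max_fraction / n_steps
    return [max_target_flux * step * k for k in range(1, n_steps + 1)]


# Theoretical maximum of 10 flux units, scanned in 3 steps up to 90 %.
levels = enforced_levels(10.0, max_fraction=0.9, n_steps=3)
```

Each level would then be imposed as a constraint on the target reaction before growth is maximised at that step.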

FSEOFResult

FSEOF output.

scan is reactions × enforced-flux-levels (the full flux scan); enforced are the enforced target fluxes; targets is the classified per-reaction table (sorted by score). The gene_targets property aggregates targets to genes.

gene_targets property

gene_targets: DataFrame

Per-gene aggregation: the target reactions each gene is associated with.
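The idea of a per-gene aggregation can be sketched with plain dictionaries. The function name aggregate_to_genes and the reaction-to-genes mapping are hypothetical; the real property returns a DataFrame whose exact columns are not documented here.

```python
from collections import defaultdict


def aggregate_to_genes(target_reactions, reaction_genes):
    """Invert a reaction -> genes mapping for the given target reactions.

    Returns gene id -> list of target reactions that gene is associated with.
    Reactions without gene annotations contribute nothing.
    """
    by_gene = defaultdict(list)
    for rxn in target_reactions:
        for gene in reaction_genes.get(rxn, ()):
            by_gene[gene].append(rxn)
    return dict(by_gene)


# Made-up targets and gene associations.
targets = ["R1", "R2", "R3"]
reaction_genes = {"R1": ["g1"], "R2": ["g1", "g2"], "R3": []}
gene_table = aggregate_to_genes(targets, reaction_genes)
```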

max_volume_ellipsoid

Maximum-volume ellipsoid inscribed in the polytope {z : A z <= b}.

Solves max log det E over centre x and SPD E such that the ellipsoid {x + E s : ||s||_2 <= 1} is contained in {z : A z <= b}, using the Zhang & Gao (2003) primal-dual interior-point method (the regularised variant shipped in COBRA's chrrSampler).

Parameters:

Name Type Description Default
A ndarray

Constraint matrix of the polytope A z <= b, of shape (m, n) with m >= n + 1. The polytope must have a non-empty interior.

required
b ndarray

Right-hand side of the polytope A z <= b, of shape (m,).

required
x0 ndarray | None

A strictly interior point (A x0 < b). If None a Chebyshev centre is computed by LP.

None
maxiter int

Iteration cap.

150
tol, reg

Convergence tolerance (tol) and the diagonal/Levenberg regularisation (reg) that keeps the two Newton solves well-conditioned.

Returns:

Name Type Description
x ndarray

Ellipsoid centre, shape (n,).

E ndarray

Lower-triangular rounding transform with E @ E.T == E2 (the SPD ellipsoid matrix). The rounding substitution is z = x + E y.

converged bool

Whether tol was reached within maxiter.

Notes

Validated against analytic cases: a box maps to the unit ball (E = I); a scaled/sheared box {-1 <= M z <= 1} gives E E^T = M^{-1} M^{-T}; the standard simplex gives its inscribed ball.
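The containment requirement can be checked row by row: the ellipsoid {x + E s : ||s||_2 <= 1} lies inside {z : A z <= b} exactly when a_i·x + ||E^T a_i||_2 <= b_i for every row a_i of A. The sketch below applies that test to the box case from the notes. It is an independent check written in plain Python, not part of max_volume_ellipsoid, and the function name is hypothetical.

```python
import math


def ellipsoid_inside_polytope(A, b, x, E, tol=1e-9):
    """Check a_i.x + ||E^T a_i||_2 <= b_i for every constraint row a_i."""
    n = len(x)
    for row, b_i in zip(A, b):
        centre_term = sum(a * xi for a, xi in zip(row, x))
        # (E^T a)_j = sum_k E[k][j] * a[k]
        Et_a = [sum(E[k][j] * row[k] for k in range(n)) for j in range(n)]
        reach = math.sqrt(sum(v * v for v in Et_a))
        if centre_term + reach > b_i + tol:
            return False
    return True


# The box -1 <= z <= 1 in two dimensions, written as A z <= b.
A = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
b = [1.0, 1.0, 1.0, 1.0]
centre = [0.0, 0.0]
identity = [[1.0, 0.0], [0.0, 1.0]]  # unit ball, E = I
stretched = [[1.5, 0.0], [0.0, 1.0]]  # pokes out along the first axis

unit_ball_fits = ellipsoid_inside_polytope(A, b, centre, identity)
stretched_fits = ellipsoid_inside_polytope(A, b, centre, stretched)
```

The unit ball touches all four faces, so no larger ellipsoid centred at the origin passes the test.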

random_sampling

Sample model's flux space — entry point for all sampling methods.

Dispatches on method:

  • "achr" (default): Artificially Centered Hit-and-Run; near-uniform MCMC sampling of the polytope interior (wraps cobra.sampling.ACHRSampler).
  • "chrr": Coordinate Hit-and-Run with Rounding; near-uniform MCMC with maximum-volume-ellipsoid rounding, for better mixing on thin/ill-conditioned polytopes such as enzyme-constrained models.
  • "random_objective": the random-objective vertex method of Bordel et al. (2010), in which each sample maximises a small random objective and returns a polytope vertex. This was random_sampling's historical behaviour; it is no longer the default.

The "achr"/"chrr" methods draw the (near-)uniform interior distribution; "random_objective" draws diverse vertices. Set any constraints you want to condition on (e.g. a biomass lower bound, measured fluxes, enzyme-usage bounds) on the model before calling.
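To make the roles of thinning and warmup concrete, here is a toy coordinate hit-and-run sampler on a small bounded polytope. It is a teaching sketch in plain Python that omits the rounding step and is not the code behind the "chrr" method. The function name, its signature, and the defaults are assumptions.

```python
import random


def coordinate_hit_and_run(A, b, start, n_samples, thinning=10, warmup=100, seed=None):
    """Toy coordinate hit-and-run on the bounded polytope {z : A z <= b}.

    start must be feasible. No rounding is applied.
    """
    rng = random.Random(seed)
    z = list(start)
    n = len(z)
    samples = []
    total_steps = warmup + n_samples * thinning
    for step in range(1, total_steps + 1):
        j = rng.randrange(n)  # pick a coordinate direction at random
        # Find the interval of t for which z + t * e_j stays feasible.
        t_lo, t_hi = float("-inf"), float("inf")
        for row, b_i in zip(A, b):
            slack = b_i - sum(a * v for a, v in zip(row, z))
            if row[j] > 0:
                t_hi = min(t_hi, slack / row[j])
            elif row[j] < 0:
                t_lo = max(t_lo, slack / row[j])
        z[j] += rng.uniform(t_lo, t_hi)  # uniform move along the chord
        # Discard the burn-in, then record every `thinning`-th state.
        if step > warmup and (step - warmup) % thinning == 0:
            samples.append(list(z))
    return samples


# Sample the box -1 <= z <= 1 in two dimensions.
box_A = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
box_b = [1.0, 1.0, 1.0, 1.0]
samples = coordinate_hit_and_run(box_A, box_b, [0.0, 0.0], n_samples=50,
                                 thinning=5, warmup=20, seed=1)
```

On a thin polytope the feasible chord along most coordinates is short, which is why the real method rounds the polytope before sampling.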

Parameters:

Name Type Description Default
n_samples int

Number of flux vectors to return.

1000
method str

"achr" (default), "chrr", or "random_objective".

'achr'
seed int | None

Seed for reproducible chains/draws.

None
thinning int

achr/chrr — Markov-chain steps between recorded samples (default 100).

100
warmup int

chrr — burn-in steps discarded before the first recorded sample.

1000
fixed_width_tol float

chrr — a reaction whose FVA range is narrower than this is folded into the equality system as fixed (keeps the reduced polytope full-dimensional).

1e-07
n_objectives int

random_objective only; see the method's parameters.

2
good_reactions, replace_max_bound, min_flux, loopless_good_reactions, exclude_reactions, max_attempts, suppress_errors

random_objective only; see the method's parameters. good_reactions can be passed back from a previous result to skip the one-off FVA.

Returns:

Type Description
FluxSamplingResult

reporter_metabolites

Compute Reporter Metabolites from per-gene differential-expression p-values.

gene_pvalues maps gene id → p-value (genes not in the model, or with a NaN or out-of-[0, 1] p-value, are dropped — a stray invalid p-value would otherwise turn the whole result NaN). If gene_fold_changes (gene id → log fold change) is given, two extra results are returned for the up- (fc ≥ 0) and down- (fc < 0) regulated gene subsets, in addition to "all".
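The input handling described above can be sketched as two small steps: drop unusable p-values, then partition by fold-change sign. The helper names are hypothetical, and skipping genes that have no fold change is an assumption of this sketch, not documented behaviour.

```python
import math


def clean_pvalues(gene_pvalues, model_genes):
    """Keep genes that are in the model and have a p-value in [0, 1]."""
    return {
        gene: p
        for gene, p in gene_pvalues.items()
        if gene in model_genes and not math.isnan(p) and 0.0 <= p <= 1.0
    }


def split_by_fold_change(gene_pvalues, gene_fold_changes):
    """Partition into up (fc >= 0) and down (fc < 0) subsets.

    Genes without a fold change are skipped here (an assumption).
    """
    up, down = {}, {}
    for gene, p in gene_pvalues.items():
        fc = gene_fold_changes.get(gene)
        if fc is None:
            continue
        (up if fc >= 0 else down)[gene] = p
    return up, down


# Made-up inputs: g2 is NaN, g3 is out of range, gX is not in the model.
gene_pvalues = {"g1": 0.01, "g2": float("nan"), "g3": 1.5, "g4": 0.2, "gX": 0.03}
model_genes = {"g1", "g2", "g3", "g4"}
cleaned = clean_pvalues(gene_pvalues, model_genes)
up, down = split_by_fold_change(cleaned, {"g1": 1.2, "g4": -0.7})
```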

Parity with RAVEN's reporterMetabolites: the z_score and underlying background correction match exactly (exact closed-form instead of RAVEN's Monte-Carlo, see IMPROVEMENTS RM1). The reported p_value is the one-sided ("up") enrichment 1 - Φ(z) and the result is sorted by z_score descending. RAVEN sorts by p-value and reports both tails (allPValues, allUpPValues, allDownPValues); the up/down splits here come from the gene_fold_changes subset partition instead, so the same information is available via the three returned ReporterResult rows.
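The one-sided p-value 1 - Φ(z) and the descending sort can be reproduced with the standard library alone. The metabolite ids and z-scores below are made up for illustration, and the function name is hypothetical.

```python
import math


def upper_tail_p(z):
    """One-sided ("up") p-value 1 - Phi(z) for a standard normal z-score."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Made-up metabolite z-scores.
scores = {"pyr_c": -1.0, "atp_c": 2.5, "nadh_c": 0.0}

# Rows of (metabolite, z_score, p_value), sorted by z_score descending.
ranked = sorted(
    ((met, z, upper_tail_p(z)) for met, z in scores.items()),
    key=lambda row: row[1],
    reverse=True,
)
```

Because 1 - Φ(z) decreases monotonically in z, sorting by z_score descending gives the same order as sorting by this p_value ascending.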

ReporterResult

Reporter-metabolite scores for one gene set.

test is "all", "up" or "down"; table is a DataFrame with columns metabolite, name, z_score, p_value, n_genes, mean_z, std_z sorted by descending z_score.