Localization (Python)¶
raven-toolbox objects in raven_toolbox.localization, collected from the source of the tracked branch.
Functions¶
| Object | Summary |
|---|---|
| apply_localization | Apply a :class:LocalizationProposal to model: move reactions, add the inter-compartment transports the proposal listed, and return (model_copy, added). |
| load_deeploc | Parse DeepLoc 2 CSV output into a normalised :class:LocalizationScores. |
| load_wolfpsort | Parse WoLF PSORT summary output (runWolfPsortSummary) into a normalised :class:LocalizationScores. |
| LocalizationProposal | What :func:predict_localization proposes, before applying it. |
| LocalizationResult | Outcome of :func:predict_localization (when apply=True). |
| LocalizationScores | Per-gene compartment scores. df is indexed by gene_id with one column per compartment id. |
| predict_localization | Place a caller-specified set of reactions in compartments via MILP. |
Reference¶
apply_localization¶
Apply a :class:LocalizationProposal to model: move reactions, add the
inter-compartment transports the proposal listed, and return (model_copy, added).
The returned model is a deep copy of the input (original left untouched). Moved
reactions get their metabolites' compartment suffix swapped (e.g. A_c → A_m);
new compartment-specific metabolite copies are added on demand. Each added
transport is a passive diffusion M[default] ⇌ M[c] (RAVEN convention),
named tr_<met>_<c>.
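The suffix swap and the transport naming described above can be sketched in a few lines. The helper names `swap_compartment_suffix` and `transport_id` are hypothetical and are not part of raven_toolbox. The sketch also assumes that `<met>` in `tr_<met>_<c>` is the metabolite id without its compartment suffix, which the text does not state explicitly.

```python
def swap_compartment_suffix(met_id: str, old: str, new: str) -> str:
    """Return met_id with a trailing _<old> replaced by _<new> (e.g. A_c -> A_m)."""
    suffix = f"_{old}"
    if not met_id.endswith(suffix):
        raise ValueError(f"{met_id!r} does not end with {suffix!r}")
    return met_id[: -len(suffix)] + f"_{new}"


def transport_id(met_base: str, compartment: str) -> str:
    """Build a tr_<met>_<c> style name (assumption: <met> is the base id)."""
    return f"tr_{met_base}_{compartment}"


print(swap_compartment_suffix("A_c", "c", "m"))  # A_m
print(transport_id("A", "m"))                    # tr_A_m
```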
load_deeploc¶
Parse DeepLoc 2 CSV output into a normalised :class:LocalizationScores.
DeepLoc 2's per-protein CSV has columns Protein_ID, Localizations, Signals,
<Compartment1>, <Compartment2>, ... where columns 4+ are per-class probabilities.
The first three metadata columns are dropped; the rest become compartment columns.
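To illustrate the column handling, here is a minimal standard-library sketch that drops the three metadata columns and keeps the rest as per-compartment scores. It is not the library's implementation: `parse_deeploc_like` is a hypothetical name, the sample rows are made up, and it returns a plain dict rather than a :class:LocalizationScores.

```python
import csv
import io

# Made-up sample in the documented layout: three metadata columns,
# then one probability column per compartment.
SAMPLE = """Protein_ID,Localizations,Signals,Cytoplasm,Mitochondrion
gene1,Cytoplasm,,0.81,0.12
gene2,Mitochondrion,Mitochondrial transit peptide,0.20,0.77
"""


def parse_deeploc_like(text: str) -> dict[str, dict[str, float]]:
    """Map Protein_ID -> {compartment: probability}, ignoring columns 2 and 3."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    compartments = header[3:]  # columns 4+ are the per-class probabilities
    scores: dict[str, dict[str, float]] = {}
    for row in reader:
        if not row:  # tolerate blank lines
            continue
        scores[row[0]] = {c: float(v) for c, v in zip(compartments, row[3:])}
    return scores


scores = parse_deeploc_like(SAMPLE)
print(scores["gene2"]["Mitochondrion"])  # 0.77
```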
load_wolfpsort¶
Parse WoLF PSORT summary output (runWolfPsortSummary) into a normalised
:class:LocalizationScores. Rows like PROT: treating N X's as ... are skipped.
LocalizationProposal¶
What :func:predict_localization proposes, before applying it.
All DataFrames have one row per item. Call :func:predict_localization with
apply=False to obtain a proposal and preview the changes; then pass it to
:func:apply_localization to commit, or diff it against a curator's expectations.
LocalizationResult¶
Outcome of :func:predict_localization (when apply=True).
LocalizationScores¶
Per-gene compartment scores. df is indexed by gene_id with one column per
compartment id; values are floats (higher = stronger evidence for that compartment).
Genes absent from df and NaN entries are treated as "no signal" by
:func:raven_toolbox.localization.predict_localization (uniform prior contribution).
with_compartments¶
Rename compartment columns via {old: new} (e.g. predictor labels →
model compartments). Unmapped columns are kept; multiple sources can be merged
with df.combine_first afterwards.
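The renaming rule (mapped columns are renamed, unmapped columns pass through unchanged) can be sketched on plain dictionaries. This only illustrates the semantics: `rename_compartments` is a hypothetical helper, the scores are made up, and the real method operates on the DataFrame held in df.

```python
def rename_compartments(rows, mapping):
    """Rename compartment keys via {old: new}; keys not in mapping are kept.

    rows is {gene_id: {compartment: score}}.
    """
    return {
        gene: {mapping.get(comp, comp): score for comp, score in by_comp.items()}
        for gene, by_comp in rows.items()
    }


# Made-up predictor labels mapped onto model compartment ids.
rows = {"gene1": {"Cytoplasm": 0.81, "Mitochondrion": 0.12, "Peroxisome": 0.03}}
renamed = rename_compartments(rows, {"Cytoplasm": "c", "Mitochondrion": "m"})
print(renamed)  # {'gene1': {'c': 0.81, 'm': 0.12, 'Peroxisome': 0.03}}
```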
predict_localization¶
Place a caller-specified set of reactions in compartments via MILP.
Returns a :class:LocalizationProposal (when apply=False) or a
:class:LocalizationResult (when apply=True).
reactions_to_relocate: the reaction ids to (re-)place. Everything else stays
where it is. Boundary reactions and existing multi-compartment transports in
this set are silently filtered out, because they are always pinned. If the set is
empty, or contains no relocatable reactions after filtering, the call is a no-op.
transport_cost: either a scalar (same cost per added transport) or a mapping
{metabolite_id_base: cost} (where the base id strips the compartment suffix,
e.g. "glc__D" matches "glc__D_c"/"glc__D_e"). Negative costs favour
adding the transport.
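The two accepted forms of transport_cost can be illustrated with a small lookup sketch. The helpers `base_id` and `cost_for` are hypothetical, and so is the `fallback` used for metabolites missing from the mapping; the text does not say what the library does in that case. The sketch also assumes the compartment suffix is the last underscore-delimited token of the id.

```python
from numbers import Real


def base_id(met_id: str) -> str:
    """Strip the trailing _<compartment> suffix, e.g. glc__D_c -> glc__D."""
    return met_id.rsplit("_", 1)[0]


def cost_for(met_id, transport_cost, fallback=1.0):
    """Resolve the cost of adding a transport for met_id.

    transport_cost is either a scalar or a {metabolite_id_base: cost} mapping.
    fallback is an assumption of this sketch, not documented behaviour.
    """
    if isinstance(transport_cost, Real):
        return float(transport_cost)
    return float(transport_cost.get(base_id(met_id), fallback))


per_met = {"glc__D": 0.1}
print(cost_for("glc__D_c", per_met))  # 0.1
print(cost_for("glc__D_e", per_met))  # 0.1
print(cost_for("atp_c", 2))           # 2.0
```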
Multi-compartment gene scoring (default behaviour): a gene contributes its
predictor score in each compartment it lands in. The highest-scoring compartment
is "free"; each additional compartment costs multi_compartment_penalty. A
secondary compartment is only worth picking when its score (typically lower than
the primary) still exceeds the penalty. There is no hard cutoff, just an explicit
score-vs-penalty trade-off. Set multi_compartment_penalty to a very large value to
make genes effectively mono-localised.
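The score-vs-penalty trade-off for a single gene can be sketched in isolation. This is a deliberate simplification: the actual MILP weighs all genes, reactions and transport costs jointly, so a compartment that looks worthwhile here may still be rejected. The function names and the example scores are hypothetical.

```python
def worthwhile_compartments(scores, penalty):
    """Compartments worth picking for one gene, viewed in isolation.

    The top-scoring compartment is free; any other compartment is kept
    only if its score exceeds the penalty.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    if not ranked:
        return []
    primary, rest = ranked[0], ranked[1:]
    return [primary] + [c for c in rest if scores[c] > penalty]


def net_contribution(scores, chosen, penalty):
    """Sum of scores in the chosen compartments minus penalty per extra one."""
    return sum(scores[c] for c in chosen) - penalty * max(len(chosen) - 1, 0)


gene_scores = {"c": 0.7, "m": 0.25, "x": 0.05}  # made-up predictor scores
print(worthwhile_compartments(gene_scores, penalty=0.2))  # ['c', 'm']
print(worthwhile_compartments(gene_scores, penalty=1e6))  # ['c']
```

With a penalty of 0.2, adding "m" gains 0.25 and costs 0.2, so it still pays off; adding "x" gains only 0.05 and does not.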