Input / output (Python)

raven-toolbox objects in raven_toolbox.io, collected from the source of the tracked branch.

Functions

  • ec_data_from_yaml_sections: Build an EcData from the ec-* top-level YAML sections.
  • ec_data_to_yaml_sections: Serialise an EcData to a dict suitable for YAML emission.
  • EcData: Typed enzyme-constrained ecModel substructure attached as model.ec.
  • export_for_git: Write model into a Standard-GEM repository layout.
  • export_model_to_sif: Write model to a Cytoscape SIF file.
  • export_to_excel: Write model to a RAVEN-format .xlsx file.
  • model_from_yaml_data: Build a cobra.Model from an already-parsed RAVEN/cobrapy YAML dict.
  • read_yaml_model: Read a RAVEN/cobrapy YAML model into a cobra.Model.
  • write_yaml_model: Write a cobra.Model to RAVEN/cobrapy (!!omap) YAML.

Reference

ec_data_from_yaml_sections

Build an EcData from the ec-* top-level YAML sections.

Returns None when ec-rxns and ec-enzymes are both absent; the caller treats that as "this YAML is not an ec-model" and leaves model.ec = None. If exactly one of the two is present, the YAML is malformed and a ValueError is raised.

sections is the dict of foreign top-level keys captured by the YAML loader. gecko_light defaults to False when the key is absent.
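The presence rule can be sketched in a few lines. This is a minimal illustration rather than the library's implementation: classify_ec_sections is a hypothetical helper, and it only looks at which keys are present in the sections dict.

```python
from typing import Any, Mapping, Optional


def classify_ec_sections(sections: Mapping[str, Any]) -> Optional[dict]:
    """Hypothetical helper mirroring the presence rule described above."""
    has_rxns = "ec-rxns" in sections
    has_enzymes = "ec-enzymes" in sections

    if not has_rxns and not has_enzymes:
        # Not an ec-model: the caller would leave model.ec = None.
        return None
    if has_rxns != has_enzymes:
        # Exactly one of the two sections is present: malformed YAML.
        raise ValueError("ec-rxns and ec-enzymes must appear together")

    return {
        "ec-rxns": sections["ec-rxns"],
        "ec-enzymes": sections["ec-enzymes"],
        # gecko_light defaults to False when the key is absent.
        "gecko_light": bool(sections.get("gecko_light", False)),
    }
```

The real function returns an EcData instead of a plain dict; the sketch only shows the three outcomes (None, ValueError, parsed sections).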

ec_data_to_yaml_sections

Serialise an EcData to a dict suitable for YAML emission.

Returns a fresh dict with three keys: gecko_light (bool), ec-rxns (list of mappings), ec-enzymes (list of mappings). Values are native Python primitives — no numpy/ruamel scalars — so the YAML writer can dump them directly without further coercion.

Empty optional fields are omitted to keep the file compact; the loader fills them back in.

EcData

Typed enzyme-constrained ecModel substructure attached as model.ec.

Field semantics match MATLAB GECKO's model.ec struct one-to-one. Two parallel index spaces:

  • per-reaction arrays (rxns, kcat, source, notes, eccodes) of length n_rxns;
  • per-enzyme arrays (genes, enzymes, mw, sequence, concs) of length n_enzymes.

Connected by the sparse rxn_enz_mat of shape (n_rxns, n_enzymes) whose [i, j] entry is the subunit count of enzyme j in reaction i (typically 0 or 1; >1 for heteromeric complexes).
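As a rough picture of how the two index spaces and the coupling matrix fit together, the sketch below uses made-up IDs and a plain dict of (i, j) entries as a stand-in for the sparse matrix. The real container type is not assumed here, and enzymes_of is a hypothetical helper.

```python
# Stand-in data for illustration; not taken from any real model.
rxns = ["R1", "R2"]            # per-reaction index space (n_rxns = 2)
enzymes = ["P1", "P2", "P3"]   # per-enzyme index space (n_enzymes = 3)

# Sparse coupling matrix as {(i, j): subunit_count}; absent entries are 0.
rxn_enz_mat = {
    (0, 0): 1,  # R1 uses one copy of P1
    (1, 1): 2,  # R2 is catalysed by a complex with two copies of P2 ...
    (1, 2): 1,  # ... and one copy of P3
}


def enzymes_of(i: int) -> dict:
    """Return {enzyme_id: subunit_count} for reaction row i."""
    return {enzymes[j]: n for (row, j), n in rxn_enz_mat.items() if row == i}
```

Here enzymes_of(1) yields the two subunits of the complex with their counts, while enzymes_of(0) yields the single enzyme of R1.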

Sentinels (mirror MATLAB GECKO):

  • kcat == 0 means "no kcat assigned" (zero is the unset state; real turnover numbers are always positive).
  • NaN in mw means "MW unknown" (the writer omits NaN mw entries).
  • NaN in concs means "not measured" (the writer omits NaN concs entries).
  • empty strings in source / notes / eccodes / sequence are omitted on write and restored as "" on load.
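The omit-on-write, restore-on-load behaviour for one enzyme row might look like the sketch below. The per-entry key names are assumptions borrowed from the field names above, and both helpers are hypothetical. Note that NaN has to be tested with math.isnan, because a direct comparison against NaN is always false.

```python
import math


def enzyme_to_entry(enzyme: str, gene: str, mw: float, sequence: str, conc: float) -> dict:
    """Write side: drop NaN numbers and empty strings (key names are assumed)."""
    entry = {"enzymes": enzyme, "genes": gene}
    if not math.isnan(mw):
        entry["mw"] = float(mw)
    if sequence:
        entry["sequence"] = sequence
    if not math.isnan(conc):
        entry["concs"] = float(conc)
    return entry


def entry_to_enzyme(entry: dict) -> tuple:
    """Load side: restore the sentinel for every omitted field."""
    return (
        entry["enzymes"],
        entry["genes"],
        entry.get("mw", math.nan),
        entry.get("sequence", ""),
        entry.get("concs", math.nan),
    )
```

An enzyme with unknown MW and no sequence is written with only its identifiers and concentration, and loading that entry brings back NaN and "" in the omitted slots.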

gecko_light marks the gecko-light layout: cobra reactions stay singular, ec.rxns carries one entry per isozyme distinguished by a ###_ counter prefix, and per-enzyme prot_<id> / usage reactions are skipped in favour of the shared protein pool. False is the default (full layout, where ec.rxns matches cobra reactions one-to-one after isozyme expansion).
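To make the counter prefix concrete, here is a small sketch. It assumes that ###_ stands for a zero-padded three-digit counter starting at 001; the source does not spell out the exact format, and both helper names are hypothetical.

```python
def light_ec_rxn_ids(rxn_id: str, n_isozymes: int) -> list:
    """Assumed format: zero-padded three-digit counter, starting at 001."""
    return [f"{k:03d}_{rxn_id}" for k in range(1, n_isozymes + 1)]


def strip_counter(ec_rxn_id: str) -> str:
    """Recover the cobra reaction ID by dropping the counter prefix."""
    return ec_rxn_id.split("_", 1)[1]
```

Under that assumption a reaction with two isozymes contributes two ec.rxns entries that both map back to the same, single cobra reaction.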

empty staticmethod

empty(
    n_rxns: int,
    n_enzymes: int = 0,
    *,
    gecko_light: bool = False
) -> EcData

Preallocate an EcData with the canonical sentinel values.

Per-rxn fields get empty strings; per-enzyme fields get empty strings and NaN arrays. kcat starts at 0 (0 marks "no kcat assigned"). mw and concs start at NaN, since their physical default is "unknown" rather than zero.

Used by makeEcModel-style builders that allocate the structure up-front, then fill it row by row.
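The allocate-then-fill pattern can be illustrated with a reduced stand-in class. EcSketch below is hypothetical: it uses plain lists, carries only a subset of the fields, and is not the real EcData; only the empty signature and the sentinel values follow the description above.

```python
import math
from dataclasses import dataclass, field


@dataclass
class EcSketch:
    """Stand-in for illustration only; not the real EcData."""

    rxns: list = field(default_factory=list)
    kcat: list = field(default_factory=list)
    enzymes: list = field(default_factory=list)
    mw: list = field(default_factory=list)
    concs: list = field(default_factory=list)
    gecko_light: bool = False

    @staticmethod
    def empty(n_rxns: int, n_enzymes: int = 0, *, gecko_light: bool = False) -> "EcSketch":
        return EcSketch(
            rxns=[""] * n_rxns,
            kcat=[0.0] * n_rxns,           # 0 marks "no kcat assigned"
            enzymes=[""] * n_enzymes,
            mw=[math.nan] * n_enzymes,     # NaN marks "MW unknown"
            concs=[math.nan] * n_enzymes,  # NaN marks "not measured"
            gecko_light=gecko_light,
        )


# Allocate up-front, then fill row by row.
ec = EcSketch.empty(2, 1)
ec.rxns[0], ec.kcat[0] = "R1", 12.5
ec.enzymes[0], ec.mw[0] = "P1", 45.2
```

Rows that are never filled keep their sentinels, so a later pass can tell "unset" from "set" without extra bookkeeping.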

validate

validate() -> None

Raise ValueError if internal field lengths are inconsistent.

Cheap sanity check: catches accidental drift between the per-rxn arrays, the per-enzyme arrays, and the coupling matrix shape. Called by pipeline stages after they mutate the data, and by the YAML loader after construction.
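A length check of this kind could be sketched as follows. check_lengths is a hypothetical stand-alone function operating on plain dicts of lists and a shape tuple; the real method works on the EcData fields directly.

```python
def check_lengths(per_rxn: dict, per_enzyme: dict, mat_shape: tuple) -> None:
    """Raise ValueError when array lengths and the matrix shape disagree."""
    rxn_lengths = {len(values) for values in per_rxn.values()}
    enz_lengths = {len(values) for values in per_enzyme.values()}

    if len(rxn_lengths) > 1:
        raise ValueError(f"per-reaction arrays differ in length: {sorted(rxn_lengths)}")
    if len(enz_lengths) > 1:
        raise ValueError(f"per-enzyme arrays differ in length: {sorted(enz_lengths)}")

    n_rxns = rxn_lengths.pop() if rxn_lengths else 0
    n_enzymes = enz_lengths.pop() if enz_lengths else 0
    if tuple(mat_shape) != (n_rxns, n_enzymes):
        raise ValueError(
            f"rxn_enz_mat has shape {tuple(mat_shape)}, expected {(n_rxns, n_enzymes)}"
        )
```

Because the check only compares lengths and a shape, it is cheap enough to run after every mutating stage.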

export_for_git

Write model into a Standard-GEM repository layout.

Parameters:

  • path (str | Path, default '.'): Directory to populate.
  • prefix (str, default 'model'): Base filename for every format.
  • formats (Iterable[str], default ('yml', 'xml', 'mat', 'xlsx')): Which formats to write; any of "yml", "xml", "mat", "xlsx", "txt".
  • sub_dirs (bool, default True): If True, write model/<fmt>/<prefix>.<fmt> (standard-GEM layout); otherwise all files go directly in path.

Returns:

  • Path: The root directory written to.
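The effect of sub_dirs on the file layout can be previewed with a small path calculation. planned_paths is a hypothetical helper that writes nothing, and it assumes the model/<fmt>/ tree is created relative to path.

```python
from pathlib import Path


def planned_paths(path=".", prefix="model", formats=("yml", "xml", "mat", "xlsx"), sub_dirs=True) -> list:
    """List the files each layout would contain, without touching the disk."""
    root = Path(path)
    planned = []
    for fmt in formats:
        name = f"{prefix}.{fmt}"
        if sub_dirs:
            # Standard-GEM layout: one sub-directory per format.
            planned.append(root / "model" / fmt / name)
        else:
            # Flat layout: every file directly in the root directory.
            planned.append(root / name)
    return planned
```

With the defaults this lists model/yml/model.yml, model/xml/model.xml, and so on; with sub_dirs=False every file sits next to the others in path.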

export_model_to_sif

Write model to a Cytoscape SIF file.

Parameters:

  • graph_type (str, default 'rc'): "rc" (reaction–compound), "rr" (reaction–reaction), or "cc" (compound–compound).
  • reaction_labels (Mapping[str, str] | None, default None): Optional {id: label} map overriding the reaction node labels (default: IDs).
  • metabolite_labels (Mapping[str, str] | None, default None): Optional {id: label} map overriding the metabolite node labels (default: IDs).
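For orientation, a SIF file is a plain-text edge list with one "source, relationship, target" triple per line. The sketch below builds reaction–compound lines from a toy mapping and applies optional label overrides. rc_sif_lines is a hypothetical helper, and the relationship token "rc" is an assumption; the token the exporter actually writes is not documented here.

```python
def rc_sif_lines(stoichiometry: dict, reaction_labels=None, metabolite_labels=None) -> list:
    """Build tab-separated SIF lines for a reaction-compound graph."""
    reaction_labels = reaction_labels or {}
    metabolite_labels = metabolite_labels or {}
    lines = []
    for rxn_id, met_ids in stoichiometry.items():
        # Fall back to the ID when no label override is given.
        rxn = reaction_labels.get(rxn_id, rxn_id)
        for met_id in met_ids:
            met = metabolite_labels.get(met_id, met_id)
            lines.append(f"{rxn}\trc\t{met}")
    return lines
```

Passing reaction_labels={"R1": "hexokinase"} relabels only that reaction node; metabolites without an override keep their IDs.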

export_to_excel

Write model to a RAVEN-format .xlsx file.

For enzyme-constrained models (a populated raven_toolbox.io.EcData on model.ec), two further export-only sheets are written: ENZYMES (one row per enzyme) and ENZRXNS (one row per ec-reaction).

Parameters:

  • sort_ids (bool, default False): If True, write reactions/metabolites/genes sorted alphabetically by ID (the model itself is not modified). The ec sheets are not reordered.

model_from_yaml_data

Build a cobra.Model from an already-parsed RAVEN/cobrapy YAML dict.

Performs three jobs in order:

  1. cobra-shaped portion: strips and restores RAVEN-only per-entry side-fields onto each entry's .notes; lifts id / name out of legacy metaData; preserves version and metaData on model.notes for round-trip.
  2. legacy quirks: lifts per-metabolite top-level smiles into annotation['smiles'] and per-reaction top-level eccodes into annotation['ec-code'] (older RAVEN/MATLAB GECKO files emitted these at the top level); flips the older reverse-direction usage_prot_* / prot_pool_exchange convention to the forward convention.
  3. GECKO ec sections: when ec-rxns / ec-enzymes are present, parses them into a typed EcData and attaches it as model.ec. Other unknown top-level keys land opaquely on model.notes['_yaml_sections'] for round-trip.

raw is mutated in place — copy it first if the caller needs the original.
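One of the legacy lifts, together with the copy-first advice, can be sketched as below. lift_smiles is a hypothetical helper that assumes plain dicts and lists for the parsed document; it stands in for only one small part of what the loader does.

```python
import copy


def lift_smiles(raw: dict) -> None:
    """Move each metabolite's top-level 'smiles' into annotation['smiles'], in place."""
    for met in raw.get("metabolites", []):
        if "smiles" in met:
            met.setdefault("annotation", {})["smiles"] = met.pop("smiles")


raw = {"metabolites": [{"id": "glc", "smiles": "OCC1OC(O)C(O)C(O)C1O"}]}
original = copy.deepcopy(raw)  # keep an untouched copy before the in-place pass
lift_smiles(raw)
```

A deep copy is needed because the nested metabolite dicts are modified; a shallow dict(raw) would still share them with the mutated document.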

read_yaml_model

Read a RAVEN/cobrapy YAML model into a cobra.Model.

Convenience wrapper around model_from_yaml_data that opens the file (transparently un-gzipping .gz) and parses the YAML. Callers that need to pre-process the document (e.g. lift further non-standard fields that cobra doesn't recognise) can read+normalise themselves and call model_from_yaml_data with the resulting dict.

Accepts both the cobra !!omap shape and a very old RAVEN shape where the document root is a bare - sequence of single-key mappings; the latter is merged into one mapping before parsing.
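The two conveniences described here, gzip-aware opening and merging a root sequence into a single mapping, could look roughly like this. Both helper names are hypothetical, only the standard library is used, and the YAML parsing step itself is left out.

```python
import gzip
from pathlib import Path


def open_text(path):
    """Open a YAML file as text, un-gzipping when the name ends in .gz."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def merge_root_sequence(doc):
    """Merge a root list of single-key mappings into one mapping."""
    if isinstance(doc, list):
        merged = {}
        for item in doc:
            merged.update(item)
        return merged
    # Already a mapping: nothing to merge.
    return doc
```

A caller that wants to pre-process the document can parse the stream from open_text with a YAML library of choice, normalise the result with merge_root_sequence, and hand the dict to model_from_yaml_data.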

write_yaml_model

Write a cobra.Model to RAVEN/cobrapy (!!omap) YAML.

When model.ec is a populated EcData, the gecko_light flag and the ec-rxns / ec-enzymes top-level sections are emitted from it (numpy/ruamel scalars are coerced to plain Python primitives en route, so the dumper never sees them).

With sort_ids=True metabolites/reactions/genes/compartments are written in alphabetical order (diff-friendly), without modifying model.
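Sorting for output without touching the model amounts to ordering a copy. The sketch below shows the idea on a list of plain dicts. sorted_by_id is a hypothetical helper, and it uses Python's default string ordering, which may differ from the ordering the writer actually applies.

```python
def sorted_by_id(entries: list) -> list:
    """Return a new list ordered by 'id'; the input list is left untouched."""
    return sorted(entries, key=lambda entry: entry["id"])


reactions = [{"id": "R2"}, {"id": "R10"}, {"id": "R1"}]
written = sorted_by_id(reactions)
```

Plain string ordering places "R10" before "R2", which is stable across runs and therefore still diff-friendly even though it is not numeric order.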