
Localization

MATLAB functions in RAVEN/localization of the RAVEN toolbox. Help text is collected from the source of the tracked branch.

Functions

Function Summary
getWoLFScores Predict protein sub-cellular localization with WoLF PSORT.
mapCompartments Map compartments in the geneScoreStructure.
parseScores Parse the output from a predictor to generate the GSS.
predictLocalization Assign reactions to compartments using localization predictors.

Reference

getWoLFScores

Predict protein sub-cellular localization with WoLF PSORT.

The output can be used as input to predictLocalization. This function is currently only available on Linux and requires Perl to be installed. To use another predictor, see parseScores. The function normalizes the scores so that the best score for each gene is 1.0.
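
The normalization described above can be illustrated with a minimal sketch. The field names genes, compartments and scores are assumptions made for illustration and are not taken from the help text, so the actual layout of GSS may differ.

```matlab
% Hypothetical raw predictor scores: one row per gene, one column per
% compartment. The GSS field names used here are assumed, not documented.
GSS.genes = {'geneA'; 'geneB'; 'geneC'};
GSS.compartments = {'Cytosol', 'Mitochondria', 'Nucleus'};
rawScores = [ 8  4  2
              1  5 10
              3  3  6];

% Divide each row by its maximum so that the best score per gene is 1.0.
bestPerGene = max(rawScores, [], 2);
GSS.scores = rawScores ./ bestPerGene;

disp(GSS.scores);
```

Dividing by the row maximum keeps the ranking of compartments for each gene while making scores comparable between genes.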

Input arguments:

Name Type Description Default
inputFile char

a FASTA file with protein sequences.

required
kingdom char

the kingdom of the organism, "animal", "fungi" or "plant".

required

Output arguments:

Name Type Description
GSS struct

a gene scoring structure to be used in predictLocalization.

Examples:

GSS = getWoLFScores(inputFile, kingdom);

See also

parseScores, predictLocalization

mapCompartments

Map compartments in the geneScoreStructure.

Use this function when the model should not contain all of the compartments reported by the predictor. It lets you define rules for how compartments should be kept, merged, or split.

Input arguments:

Name Type Description Default
geneScoreStructure struct

a structure to be used in predictLocalization.

required
varargin char or cell

any number of rules, defined as consecutive strings or in a cell array:

  • "comp1" : comp1 is kept in the structure.
  • "comp1=comp2" : The scores in comp2 are merged into comp1, and comp2 is removed from the structure. This automatically keeps comp1 in the structure.
  • "comp1=comp2 comp3" : The scores in comp2 and comp3 are merged into comp1, and comp2 and comp3 are removed from the structure. This automatically keeps comp1 in the structure.
  • "comp1 comp2=comp3" : The scores in comp3 are split between comp1 and comp2. This automatically keeps comp1 and comp2 in the structure.
  • "comp1=other" : The scores in any compartment not covered by another rule are merged into comp1. This rule is applied after all other rules.
required
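
To make the rule syntax concrete, the following sketch splits a few rule strings into the compartments that are kept and the compartments whose scores are merged into them. It is only an illustration of the syntax listed above, not the parser used by mapCompartments; the struct fields keep and mergeFrom are hypothetical names.

```matlab
% Illustrative rule strings following the syntax listed above.
rules = {'Extracellular', 'Peroxisome=Lysosome', ...
         'Cytosol Nucleus=ER', 'Cytosol=other'};

% Hypothetical container: one element per rule.
parsed = struct('keep', {}, 'mergeFrom', {});
for i = 1:numel(rules)
    % Names left of '=' are kept; names right of it are merged into them.
    parts = strsplit(rules{i}, '=');
    parsed(i).keep = strsplit(strtrim(parts{1}), ' ');
    if numel(parts) > 1
        parsed(i).mergeFrom = strsplit(strtrim(parts{2}), ' ');
    else
        parsed(i).mergeFrom = {};
    end
end

for i = 1:numel(parsed)
    if isempty(parsed(i).mergeFrom)
        fprintf('%s: kept as is\n', strjoin(parsed(i).keep, ', '));
    else
        fprintf('%s <- %s\n', strjoin(parsed(i).keep, ', '), ...
            strjoin(parsed(i).mergeFrom, ', '));
    end
end
```

A rule with more than one name on the left-hand side corresponds to a split, and the name other on the right-hand side stands for every compartment not mentioned elsewhere.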

Output arguments:

Name Type Description
geneScoreStructure struct

a structure to be used in predictLocalization.

Examples:

The predictor you use gives predictions for Extracellular, Cytosol, Nucleus, Peroxisome, Mitochondria, ER, and Lysosome. You want a model with Extracellular, Cytosol, Mitochondria, and Peroxisome, where Lysosome is merged into Peroxisome and all other compartments are merged into Cytosol:

GSS = mapCompartments(GSS, "Extracellular", "Mitochondria", ...
    "Peroxisome=Lysosome", "Cytosol=other");

Notes

When one compartment is merged into another, the resulting score for each gene is the best of its scores in the two compartments. When one compartment is split among several, its scores are weighted by the number of compartments they are split among.
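
The merge and split behaviour in this note can be sketched with plain vectors. The merge step follows the note directly. For the split step, the sketch assumes that "weighted" means dividing the scores by the number of target compartments before taking the best score; that interpretation is an assumption and should be checked against the source of mapCompartments.

```matlab
% Hypothetical scores for three genes (one row per gene).
peroxisome = [1.0; 0.2; 0.5];
lysosome   = [0.4; 0.9; 0.5];

% Merge ("Peroxisome=Lysosome"): keep the best score per gene.
mergedPeroxisome = max(peroxisome, lysosome);

% Split ("Cytosol Nucleus=ER"). ASSUMPTION: the ER scores are divided by
% the number of target compartments before the best score is taken.
cytosol = [0.3; 1.0; 0.1];
nucleus = [0.6; 0.1; 0.2];
er      = [1.0; 0.4; 0.8];
nTargets = 2;
splitCytosol = max(cytosol, er / nTargets);
splitNucleus = max(nucleus, er / nTargets);

disp([mergedPeroxisome, splitCytosol, splitNucleus]);
```

Taking the maximum rather than the sum means that a merged compartment never scores higher than the best of its parts.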

parseScores

Parse the output from a predictor to generate the GSS.

The function normalizes the scores so that the best score for each gene is 1.0.

Input arguments:

Name Type Description Default
inputFile char

a file with the output from the predictor.

required

Name-value arguments:

Name Type Description Default
predictor char

the predictor that was used. "wolf" for WoLF PSORT, "cello" for CELLO, "deeploc" for DeepLoc (default "wolf").

Output arguments:

Name Type Description
GSS struct

a gene scoring structure to be used in predictLocalization.

Examples:

GSS = parseScores(inputFile, predictor);

See also

predictLocalization, getWoLFScores

predictLocalization

Assign reactions to compartments using localization predictors.

Tries to assign reactions to compartments in a manner that is in agreement with localization predictors while at the same time maintaining connectivity.

Input arguments:

Name Type Description Default
model struct

a model structure. If the model contains several compartments, they will be merged.

required
GSS struct

a gene scoring structure, as returned by parseScores or getWoLFScores.

required
defaultCompartment char

the compartment that all transport reactions connect to, usually the cytosol. Transport reactions are expressed as diffusion between defaultCompartment and each of the other compartments. The default compartment must have a match in GSS.

required

Name-value arguments:

Name Type Description Default
transportCost double

the cost for including a transport reaction. If this is a scalar, the same cost is used for all metabolites. It can also be a vector of costs with the same dimension as model.mets. A negative cost encourages transport of that metabolite (default 0.5).

maxTime double

maximum optimization time in minutes (default 15).

plotResults logical

true if the results should be plotted during the optimization (default false).

Output arguments:

Name Type Description
outModel struct

the resulting model structure.

geneLocalization struct

structure with the genes and their resulting localization.

transportStruct struct

structure with the transport reactions that had to be inferred and between which compartments.

scores struct

structure that contains the total score history together with the score based on gene localization and the score based on included transport reactions.

removedRxns cell

cell array with the IDs of the reactions that had to be removed in order to have a connected input model.

Notes

This function requires that the starting network is connected when it is in one compartment. Reactions that are unconnected are removed and saved in removedRxns. If there are many such reactions, try running fillGaps first to obtain a more connected input model. The input model should also not include any exchange, demand, or sink reactions; otherwise the function will not return any results.

In the final model all metabolites are produced in at least one reaction. This does not guarantee a fully functional model since there can be internal loops. Transport reactions are only included as passive diffusion (A <=> B).

The score of a model is the sum of scores for all genes in their assigned compartment minus the cost of all transport reactions that had to be included. A gene can only be assigned to one compartment. This is a simplification to keep the problem size down. The problem is solved using simulated annealing.
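
The scoring rule described above can be written out for a toy case. All values and variable names below are hypothetical; the sketch only evaluates the score of one fixed assignment and does not implement the simulated annealing search itself.

```matlab
% Hypothetical normalized scores: 3 genes (rows), 2 compartments
% (columns: Cytosol, Mitochondria).
geneScores = [1.0 0.2
              0.4 1.0
              1.0 0.9];

% Each gene is assigned to exactly one compartment (column index).
assignment = [1; 2; 2];

% Sum each gene's score in its assigned compartment.
idx = sub2ind(size(geneScores), (1:3)', assignment);
geneScore = sum(geneScores(idx));

% Assume two transport reactions had to be included, each at cost 0.5.
transportCost = 0.5;
nTransport = 2;
transportScore = nTransport * transportCost;

% Total score: gene localization score minus transport cost.
totalScore = geneScore - transportScore;
fprintf('Total score: %.2f\n', totalScore);
```

Here the gene score is 1.0 + 1.0 + 0.9 = 2.9 and the transport cost is 1.0, so the total is 1.9. Moving the third gene to the cytosol would raise the gene score to 3.0, which illustrates the trade-off the optimization has to resolve when such a move requires additional transport reactions.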

Examples:

[outModel, geneLocalization, transportStruct, scores, removedRxns] = ...
    predictLocalization(model, GSS, defaultCompartment, ...
        transportCost, maxTime, plotResults);