Bases: ete2.coretype.tree.TreeNode
Extends the standard TreeNode instance. It adds specific attributes and methods to work with phylogenetic trees.
Returns: a tree node object that represents the base of the tree.
Add NCBI taxonomy annotation to all descendant nodes. Leaf nodes are expected to contain a feature (name, by default) encoding a valid taxid number.
All descendant nodes (including internal nodes) are annotated with the following new features:
Node.spname: scientific species name as encoded in the NCBI taxonomy database
Node.named_lineage: the NCBI lineage track using scientific names
Node.taxid: NCBI taxid number
Node.lineage: same as named_lineage but using taxid codes.
Note that for internal nodes, NCBI information will refer to the first common lineage of the grouped species.
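One way to picture the rule for internal nodes is as the longest shared prefix of the leaf lineages grouped under a node. The sketch below is plain Python and only an illustration under that assumption: `common_lineage` is a hypothetical helper, the integers are placeholders rather than real NCBI taxids, and lineages are assumed to be stored root-first.

```python
def common_lineage(lineages):
    """Return the longest shared prefix of several root-first lineage lists."""
    if not lineages:
        return []
    shared = []
    # zip(*lineages) walks all lineages in parallel, one position at a time
    for ranks in zip(*lineages):
        if all(r == ranks[0] for r in ranks):
            shared.append(ranks[0])
        else:
            break
    return shared


# Toy lineages (placeholder integers, not real NCBI taxids)
leaf_a = [1, 10, 20, 31]
leaf_b = [1, 10, 20, 32]
leaf_c = [1, 10, 25, 40]

internal_ab = common_lineage([leaf_a, leaf_b])           # [1, 10, 20]
internal_abc = common_lineage([leaf_a, leaf_b, leaf_c])  # [1, 10]
```

Under this reading, the taxid reported for an internal node would be the last element of the shared lineage.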
Parameters: tax2name (None) – A dictionary where keys are taxid numbers and values are their translation into NCBI scientific names. Its use is optional and avoids database queries when annotating many trees containing the same set of taxids.
Parameters: tax2track (None) – A dictionary where keys are taxid numbers and values are their translation into NCBI lineage tracks (taxids). Its use is optional and avoids database queries when annotating many trees containing the same set of taxids.
Parameters: tax2rank (None) – A dictionary where keys are taxid numbers and values are their translation into NCBI rank names. Its use is optional and avoids database queries when annotating many trees containing the same set of taxids.
Parameters: dbfile (None) – If provided, this file is used as a local copy of the NCBI taxonomy database.
Returns: tax2name (a dictionary translating taxid numbers into scientific names), tax2lineage (a dictionary translating taxid numbers into their corresponding NCBI lineage tracks) and tax2rank (a dictionary translating taxid numbers into rank names).
Converts lineage-specific expansion nodes into a single tip node (randomly chosen from the tips within the expansion).
Parameters: species (None) – If supplied, only expansions matching the species criteria will be pruned. When None, all expansions within the tree will be processed.
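The collapsing step can be sketched on a toy tree made of nested tuples. This is not the library's implementation: `species_of`, `leaves` and `collapse_expansions` are hypothetical helpers, the leaf names follow an assumed `<gene>_<species>` scheme, and the `species` argument is assumed to be a collection of species names.

```python
import random


def species_of(leaf_name):
    # Assumed naming scheme: "<gene>_<species>"
    return leaf_name.split("_")[1]


def leaves(tree):
    """Return all leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaves(child)]


def collapse_expansions(tree, species=None, rng=random):
    """Replace each single-species subtree with one randomly chosen tip."""
    if isinstance(tree, str):
        return tree
    tips = leaves(tree)
    found = {species_of(name) for name in tips}
    # An expansion is a subtree whose tips all belong to one species
    if len(found) == 1 and (species is None or found <= set(species)):
        return rng.choice(tips)
    return tuple(collapse_expansions(child, species, rng) for child in tree)


toy = ((("g1_Hsa", "g2_Hsa"), "g3_Ptr"), ("g4_Mmu", ("g5_Mmu", "g6_Mmu")))
collapsed = collapse_expansions(toy, rng=random.Random(0))
only_mouse = collapse_expansions(toy, species=["Mmu"], rng=random.Random(0))
```

With `species=["Mmu"]`, the two-tip `Hsa` expansion is left untouched and only the `Mmu` subtree is reduced to a single tip.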
Implements the phylostratigraphic method described in:
Huerta-Cepas, J., & Gabaldon, T. (2011). Assigning duplication events to relative temporal scales in genome-wide studies. Bioinformatics, 27(1), 38-45.
New in version 2.2.
Returns the node that best balances the current tree structure according to the topological age of the different leaves and internal node sizes.
Parameters: species2age – A dictionary translating from leaf names into a topological age.
Returns a list of all duplication and speciation events detected after this node. Nodes are assumed to be duplications when a species overlap is found between their child lineages. The method is described in more detail in:
“The Human Phylome.” Huerta-Cepas J, Dopazo H, Dopazo J, Gabaldon T. Genome Biol. 2007;8(6):R109.
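The species overlap rule can be sketched on a toy binary tree of nested tuples: an internal node is labeled a duplication when the species sets of its two children intersect, and a speciation otherwise. The helper names and the `<gene>_<species>` leaf naming are assumptions made for this illustration, not part of the library.

```python
def species_of(leaf_name):
    # Assumed naming scheme: "<gene>_<species>"
    return leaf_name.split("_")[1]


def species_set(tree):
    """Return the set of species found under a nested-tuple tree."""
    if isinstance(tree, str):
        return {species_of(tree)}
    result = set()
    for child in tree:
        result |= species_set(child)
    return result


def label_events(tree, events=None):
    """Collect (etype, side_a, side_b) for each binary internal node, in preorder."""
    if events is None:
        events = []
    if isinstance(tree, str):
        return events
    left, right = tree
    side_a, side_b = species_set(left), species_set(right)
    # Shared species on both sides are taken as evidence of a duplication
    etype = "D" if side_a & side_b else "S"
    events.append((etype, sorted(side_a), sorted(side_b)))
    label_events(left, events)
    label_events(right, events)
    return events


toy = (("g1_Hsa", "g2_Ptr"), (("g3_Hsa", "g4_Ptr"), "g5_Mmu"))
events = label_events(toy)
```

In this toy tree only the root is labeled `D`, because `Hsa` and `Ptr` appear on both sides of it; the three remaining internal nodes are labeled `S`.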
Returns the farthest oldest leaf from the current one. It requires a species2age dictionary with the age estimation for all species.
Parameters: is_leaf_fn (None) – A pointer to a function that receives a node instance as its only argument and returns True or False. It can be used to dynamically collapse nodes, so they are seen as leaves.
New in version 2.1.
Returns the farthest oldest node (leaf or internal). The difference from get_farthest_oldest_leaf() is that in this function internal nodes grouping sequences from the same species are collapsed.
Returns a list of duplication and speciation events in which the current node has been involved. Scanned nodes are also labeled internally as dup=True|False. You can access these labels using the ‘node.dup’ syntax.
Method: the algorithm scans all nodes from the given leafName to the root. Nodes are assumed to be duplications when a species overlap is found between their child lineages. The method is described in more detail in:
“The Human Phylome.” Huerta-Cepas J, Dopazo H, Dopazo J, Gabaldon T. Genome Biol. 2007;8(6):R109.
Calculates all possible species trees contained within a duplicated gene family tree as described in Treeko (see Marcet and Gabaldon, 2011).
Parameters: autodetect_duplications (True) – If True, duplication nodes will be automatically detected using the Species Overlap algorithm (PhyloNode.get_descendants_evol_events()). If False, duplication nodes within the original tree are expected to contain the feature “evoltype=D”.
Parameters: features (None) – A list of features that should be mapped from the original gene family tree to each species tree subtree.
Returns: (number_of_sptrees, number_of_dups, species_tree_iterator)
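A simplified reading of this enumeration: at each duplication node keep one child at a time, and at each speciation node combine one alternative from every child. The sketch below works on a toy tree whose internal nodes already carry a `"D"` or `"S"` label, which loosely mirrors the `autodetect_duplications=False` case. `speciation_trees` and `count_dups` are hypothetical helpers, not the library's implementation.

```python
from itertools import product


def speciation_trees(node):
    """Yield every duplication-free tree obtained by keeping one child per "D" node."""
    if isinstance(node, str):
        yield node
        return
    evoltype, *children = node
    if evoltype == "D":
        # A duplication contributes the alternatives of each child separately
        for child in children:
            yield from speciation_trees(child)
    else:
        # A speciation combines one alternative from every child
        options = [list(speciation_trees(child)) for child in children]
        for combo in product(*options):
            yield combo


def count_dups(node):
    """Count the internal nodes labeled "D"."""
    if isinstance(node, str):
        return 0
    evoltype, *children = node
    return int(evoltype == "D") + sum(count_dups(c) for c in children)


# Internal nodes are (label, child, child); leaves are plain strings
toy = ("D",
       ("S", "g1_Hsa", "g2_Ptr"),
       ("S", ("S", "g3_Hsa", "g4_Ptr"), "g5_Mmu"))
trees = list(speciation_trees(toy))
summary = (len(trees), count_dups(toy), iter(trees))
```

Here `summary` imitates the documented return shape: the number of species trees, the number of duplications, and an iterator over the trees.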
Returns the set of species covered by its partition.
Returns an iterator over the species grouped by this node.
Returns the reconciled topology with the provided species tree, and a list of evolutionary events inferred from such reconciliation.
Sets the parsing function used to extract species name from a node’s name.
Parameters: fn – Pointer to a parsing Python function that receives the node name as first argument and returns the species name.
# Example of a parsing function to extract species names for
# all nodes in a given tree.
def parse_sp_name(node_name):
    return node_name.split("_")[1]

# `tree` is an existing PhyloNode instance
tree.set_species_naming_function(parse_sp_name)
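To see what such a parsing function produces without needing a tree object, the sketch below applies the same rule as the example above to a list of toy leaf names and collects the species they cover. `species_in` and the leaf names are hypothetical and used only for illustration.

```python
def parse_sp_name(node_name):
    # Same rule as the example above: the species code is the second "_" field
    return node_name.split("_")[1]


def species_in(leaf_names, naming_fn):
    """Return the set of species covered by a group of leaf names."""
    return {naming_fn(name) for name in leaf_names}


leaf_names = ["g1_Hsa", "g2_Hsa", "g3_Ptr"]  # toy names
covered = species_in(leaf_names, parse_sp_name)
```

Passing the parser as an argument keeps the naming convention in one place, so switching to another scheme only requires a different function.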
Returns the list of all subtrees resulting from splitting the current tree by its duplication nodes.
Parameters: autodetect_duplications (True) – If True, duplication nodes will be automatically detected using the Species Overlap algorithm (PhyloNode.get_descendants_evol_events()). If False, duplication nodes within the original tree are expected to contain the feature “evoltype=D”.
Returns: species_trees
Basic evolutionary event. It stores all the information about an event (node) that occurred in a phylogenetic tree.
etype : D (Duplication), S (Speciation), L (gene loss)
in_seqs : the list of sequences on one side of the event.
out_seqs : the list of sequences on the other side of the event.
node : link to the event node in the tree.
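The four attributes listed above can be pictured as a small record type. `EventRecord` below is a hypothetical stand-in written for illustration, not the library's event class, and the sequence names are toy values.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class EventRecord:
    """Hypothetical stand-in mirroring the four documented attributes."""
    etype: str                                         # "D", "S" or "L"
    in_seqs: List[str] = field(default_factory=list)   # sequences on one side
    out_seqs: List[str] = field(default_factory=list)  # sequences on the other side
    node: Optional[Any] = None                         # link to the event node, if any


events = [
    EventRecord("D", ["g1_Hsa", "g2_Ptr"], ["g3_Hsa", "g4_Ptr", "g5_Mmu"]),
    EventRecord("S", ["g1_Hsa"], ["g2_Ptr"]),
]

# Filtering by etype is the typical way to separate duplications from speciations
duplications = [e for e in events if e.etype == "D"]
```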