Look up records of species, genes, proteins, and cell markers#

Biological entities and ontologies can be complex: a single entity may have many different identifiers, and the reference data often differs between species.

Here we show Bionty’s Entity model for species, genes, proteins, and cell markers. You’ll see how to:

  • initialize an Entity model with different identifiers

  • access the reference table via .df

  • look up an entity record via .lookup().{term}

import bionty as bt

Species#

To examine the Species ontology, we create the corresponding object and look at the associated pandas DataFrame.

species_bionty = bt.Species()
species_bionty

Reference table#

df = species_bionty.df()
df.head()

Lookup records#

Terms can be searched with auto-complete using a lookup object:

Tip

By default, the name field is used to generate the lookup. You can change the field via:

species_bionty.lookup_field = <new field>

Duplicated names are made unique by appending __0, __1, __2, …

species_bionty_lookup = species_bionty.lookup()
species_bionty_lookup.white_tufted_ear_marmoset
species_bionty_lookup.white_tufted_ear_marmoset.scientific_name
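The naming rule described in the tip above can be illustrated with a small standalone sketch. This is not Bionty's implementation: `to_attribute_name` and `build_lookup` are hypothetical helpers, the records are hand-typed toy data (including an artificial duplicate), and the choice to suffix every duplicated name starting at `__0` is an assumption based on the tip.

```python
import re
from collections import Counter
from types import SimpleNamespace


def to_attribute_name(term):
    """Replace every run of non-word characters with a single underscore."""
    name = re.sub(r"\W+", "_", term).strip("_")
    # Python attribute names cannot start with a digit, so prefix those.
    return f"_{name}" if name[:1].isdigit() else name


def build_lookup(records, field="name"):
    """Expose each record as an attribute named after record[field]."""
    keys = [to_attribute_name(record[field]) for record in records]
    totals = Counter(keys)  # how often each attribute name occurs
    seen = Counter()        # running index per duplicated name
    entries = {}
    for key, record in zip(keys, records):
        if totals[key] > 1:
            # Duplicated names get a running suffix: __0, __1, __2, ...
            unique_key = f"{key}__{seen[key]}"
            seen[key] += 1
        else:
            unique_key = key
        entries[unique_key] = record
    return SimpleNamespace(**entries)


# Toy records; the second "mouse" row exists only to show the suffix rule.
records = [
    {"name": "white-tufted-ear marmoset", "scientific_name": "callithrix_jacchus"},
    {"name": "mouse", "scientific_name": "mus_musculus"},
    {"name": "mouse", "scientific_name": "toy duplicate entry"},
]

lookup = build_lookup(records)
print(lookup.white_tufted_ear_marmoset["scientific_name"])
print(lookup.mouse__0["scientific_name"])
print(lookup.mouse__1["scientific_name"])
```

Because the entries become plain attributes, an interactive shell can offer them for tab completion, which is the convenience the lookup object provides.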

To access the records for several species at once, for example human, mouse, and pig, we index the reference table by name and select the rows with pandas:

df = species_bionty.df()
df.set_index("name", inplace=True)
df.loc[["human", "mouse", "pig"]]
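One caveat with this pattern is that `.loc` with a list of labels raises a `KeyError` when any label is absent from the index. The sketch below shows one way to guard against that. It uses a hand-typed toy table rather than the real reference, and the `scientific_name` values and the "zebrafish" query are illustrative assumptions.

```python
import pandas as pd

# Toy stand-in for the species reference table; the real columns may differ.
species_df = pd.DataFrame(
    {
        "name": ["human", "mouse", "pig"],
        "scientific_name": ["homo_sapiens", "mus_musculus", "sus_scrofa"],
    }
)

# set_index without inplace=True returns a new frame and leaves the original intact.
indexed = species_df.set_index("name")

wanted = ["human", "mouse", "zebrafish"]

# Split the request into labels that exist and labels that do not,
# so that .loc is only ever called with labels present in the index.
present = [name for name in wanted if name in indexed.index]
missing = [name for name in wanted if name not in indexed.index]

selected = indexed.loc[present]
print(selected)
print("not in table:", missing)
```

Reporting the missing labels separately makes it obvious when a species name is misspelled or simply not covered by the reference.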

Gene#

Next, let’s take a look at genes, which follow the same design choices as Species.

The only difference is that the Gene class is initialized with a species parameter, so you retrieve only the gene entries of the specified species.

gene_bionty = bt.Gene(species="human")
gene_bionty
df = gene_bionty.df()
df.head()
gene_bionty_lookup = gene_bionty.lookup()
gene_bionty_lookup.TCF7

To convert between identifiers, filter the reference table with pandas; the matching rows list every identifier stored for those genes:

df.loc[df["symbol"].isin(["BRCA1", "BRCA2"])]
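When the goal is to translate a whole list of identifiers while keeping its order, a mapping built from two columns can be more convenient than a row filter. The following is a minimal sketch on a hand-typed toy table: the column name `ensembl_gene_id` is an assumption about the reference table's layout, and `NOT_A_GENE` is a made-up symbol used to show how unmatched entries behave.

```python
import pandas as pd

# Toy stand-in for the gene reference table; the column names are assumptions.
genes_df = pd.DataFrame(
    {
        "symbol": ["BRCA1", "BRCA2", "TP53"],
        "ensembl_gene_id": [
            "ENSG00000012048",
            "ENSG00000139618",
            "ENSG00000141510",
        ],
    }
)

# A Series indexed by symbol acts as a symbol -> Ensembl ID mapping.
symbol_to_ensembl = genes_df.set_index("symbol")["ensembl_gene_id"]

# Series.map keeps the input order and yields NaN for unknown symbols.
query = pd.Series(["BRCA2", "BRCA1", "NOT_A_GENE"])
converted = query.map(symbol_to_ensembl)
print(converted)
```

This approach requires the symbols in the mapping to be unique; if a real table contains repeated symbols, they would need to be deduplicated first.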

The mouse reference is also available from Ensembl:

gene_bionty_mouse = bt.Gene(species="mouse")
df = gene_bionty_mouse.df()
df.head()

Protein#

The protein reference uses the UniProt ID as the standardized identifier.

protein_bionty = bt.Protein(species="human")
protein_bionty
protein_bionty_lookup = protein_bionty.lookup()
protein_bionty_lookup.ABC_transporter_domain_containing_protein
df = protein_bionty.df()
df.head()

Cell marker#

The cell marker ontology works similarly.

cell_marker_bionty = bt.CellMarker(species="human")
cell_marker_bionty
df = cell_marker_bionty.df()
df.head()
cell_marker_bionty_lookup = cell_marker_bionty.lookup()
cell_marker_bionty_lookup.CD45