Ontologies

Our AI platform leverages ontologies to make sense of healthcare concepts.

The ScienceIO AI platform uses a Knowledge Graph to connect healthcare concepts to 20+ leading ontologies. These ontologies contain millions of concepts and we refresh them regularly.

Note

To make your results easier to understand, we’ve defined our own concept types (concept_type) that provide higher-level views of the data. You can use them to group similar kinds of data, and then examine each group more closely using the concept_id.

See Concept Types for more information.
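
As a rough illustration of the grouping described in the note above, the sketch below buckets result records by concept_type and then reads the concept_id values from one group. Only the field names concept_type and concept_id come from this page. The record shape, the "text" field, the helper name group_by_concept_type, and every value are hypothetical placeholders, not actual API output.

```python
from collections import defaultdict

# Hypothetical result records. Only the concept_type and concept_id field
# names come from this page; the "text" field and all values are made up.
sample_results = [
    {"text": "aspirin", "concept_type": "Chemicals and Drugs", "concept_id": "EXAMPLE-1"},
    {"text": "BRCA1", "concept_type": "Genetics", "concept_id": "EXAMPLE-2"},
    {"text": "ibuprofen", "concept_type": "Chemicals and Drugs", "concept_id": "EXAMPLE-3"},
]


def group_by_concept_type(results):
    """Return a dict mapping each concept_type to its list of records."""
    groups = defaultdict(list)
    for record in results:
        groups[record["concept_type"]].append(record)
    return dict(groups)


groups = group_by_concept_type(sample_results)

# Drill into one group and collect its concept_id values.
drug_ids = [record["concept_id"] for record in groups["Chemicals and Drugs"]]
```

Grouping first and reading concept_id second mirrors the workflow in the note: the concept type gives the high-level view, and the identifier points at the specific concept.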

What is an Ontology?

In a healthcare data setting, an ontology is a standardized way of identifying different types of healthcare information across the entire industry and around the globe. This is usually done through codes, groupings, and specific naming conventions.

Ontologies are used to help medical data move between different electronic systems, and to semantically represent healthcare concepts. Many ontologies focus on one type of concept (procedure codes, medical conditions, clinical drugs, etc.) but others, like the UMLS, include a variety of concept types and have some overlap with other ontologies.

Supported Ontologies

ScienceIO’s Knowledge Graph categorizes each of our supported ontologies as primary or secondary when it uses them to map your data. We create these relationships internally, and they will evolve as we continue to train our models.

Primary Ontologies

A primary ontology is considered a parent-level ontology within our API. The Knowledge Graph uses it to map the healthcare concepts found within that ontology, and then looks for any additional relationships in associated secondary ontologies. For example, UMLS maps to a number of secondary ontologies, including LOINC, CPT, and RxNorm. Our primary ontologies include:

UMLS
ChEMBL and ChEBI
dbSNP
Cell Line Ontology (CLO and CVCL)
GeneID
ClinVar
NCBI Taxonomy ID

Primary ontologies are shown in concept_type.

UMLS

The Unified Medical Language System (UMLS) is composed of the Metathesaurus, the Semantic Network, and the SPECIALIST Lexicon and Lexical Tools. It is widely used to develop digital tools and applications, and to link terms and codes across different systems or interested parties (doctors, pharmacies, insurance companies, hospital departments, etc.). It is also used in search engines, data mining, research, and statistics.

The UMLS is exceptionally comprehensive, and includes records that span all ScienceIO concept types; it is the basis for many of ScienceIO’s internal concept mappings.
The UMLS also maps to LOINC, SNOMED-CT, ICD-9/10, ICD-10-CM, CPT, HCPCS, OMIM, MedDRA, MeSH, NCIt, and RxNorm.
The UMLS code displays in the concept_id for each piece of UMLS healthcare data identified.
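
If the UMLS code shown in concept_id is a UMLS Concept Unique Identifier (CUI), which is the letter "C" followed by seven digits, a check along the following lines could pick UMLS-mapped results out of a response. This is a sketch under assumptions: this page does not state the exact format of concept_id, so the code simply searches the string for a CUI-shaped token. The function name extract_cui is hypothetical.

```python
import re

# A UMLS CUI is the letter "C" followed by exactly seven digits.
# Assumption: the CUI appears somewhere in the concept_id string,
# either on its own or after some prefix.
CUI_PATTERN = re.compile(r"\bC\d{7}\b")


def extract_cui(concept_id):
    """Return the first CUI-shaped token in concept_id, or None."""
    match = CUI_PATTERN.search(concept_id)
    return match.group(0) if match else None
```

The word boundaries keep the pattern from matching identifiers that merely start with "C" or that carry more than seven digits.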

ChEMBL and ChEBI

ChEMBL is a database of bioactive molecules with drug-like properties. Its goal is to help translate genomic information into new drugs. ChEMBL also maps to RxNorm, and includes the following:

2.3 million compounds
1.5 million assays
85,000 documents
43,000 indications
15,000 targets
14,000 drugs
6,300 mechanisms
2,000 cells
1,200 drug warnings
757 tissues

ChEBI is a non-proprietary dictionary focused on “small” molecular entities (chemical compounds). These entities may be products of nature or synthetic products used to intervene in the processes of living organisms. ChEBI includes classes of molecular entities and part-molecular entities, but does not include nucleic acids, proteins, or peptides derived from proteins by cleavage.

ScienceIO Concept Type:

Chemicals and Drugs

CLO

Cell Line Ontology Database (CLO) is a community-based ontology of cell lines that is designed to create a standardized, logically defined format for publicly available cell line entry data. CLO includes more than 36,000 cell lines that are drawn from the following repositories:

Cell Line Knowledgebase (CLKB)
European Bioinformatics Institute (EMBL-EBI)
Coriell Catalog
Bioassay Ontology (BAO)

ScienceIO Concept Type:

Cell Biology

Gene (GeneID)

Gene provides detailed information about genes, identifies gene-specific connections, and assigns genes a unique identifier. It includes over 33 million entries for a wide range of species that are pulled from all major taxonomic groups. Gene’s records include:

Nomenclature
Reference Sequences (RefSeqs)
Maps
Pathways
Variations
Phenotypes
Links to worldwide resources related to genome, phenotype, and locus

ScienceIO Concept Type:

Genetics

dbSNP

The Single Nucleotide Polymorphism database (dbSNP) is an authoritative central repository for simple genetic polymorphisms that spans all classes of simple molecular variation, including neutral polymorphisms and those that cause rare clinical phenotypes. dbSNP includes:

Single-base nucleotide substitutions, also known as single nucleotide polymorphisms (SNPs)
Small-scale multi-base deletions or insertions, also known as deletion insertion polymorphisms (DIPs)
Retroposable element insertions and microsatellite repeat variations, also known as short tandem repeats (STRs)
Genomic and RefSeq mapping for common variations and clinical mutations
Population frequency
Molecular consequence
Publication information

ScienceIO Concept Type:

Genetics

ClinVar

ClinVar is a public archive of the relationships between human variations and phenotypes. Its goal is to aggregate information about genomic variation so that the relationship between that variation and human health can be understood. ClinVar provides:

Records for a gene
Records by chromosome location
Records for a disease or phenotype

ScienceIO Concept Type:

Genetics

NCBI Taxonomy

The National Center for Biotechnology Information (NCBI) Taxonomy includes organism names as well as classifications, and spans every sequence in the nucleotide and protein sequence databases of the International Nucleotide Sequence Database Collaboration (INSDC). It distinguishes between formal and informal names. NCBI Taxonomy is also the standard nomenclature and classification repository for:

GenBank
The European Molecular Biology Laboratory (EMBL)
DNA Data Bank of Japan (DDBJ)

ScienceIO Concept Type:

Species & Viruses
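
The concept type listed under each primary ontology above can be collected into a single lookup table, which may be handy when filtering results. The mapping below is a reading of this page, not an official API artifact. The names CONCEPT_TYPE_BY_ONTOLOGY and ontologies_for are hypothetical, and the sketch assumes the "Chemicals and Drugs" label applies to both ChEMBL and ChEBI. UMLS is left out because it spans all ScienceIO concept types.

```python
# Concept type per primary ontology, as listed in the sections above.
# UMLS is omitted because it spans every concept type.
CONCEPT_TYPE_BY_ONTOLOGY = {
    "ChEMBL": "Chemicals and Drugs",
    "ChEBI": "Chemicals and Drugs",
    "CLO": "Cell Biology",
    "GeneID": "Genetics",
    "dbSNP": "Genetics",
    "ClinVar": "Genetics",
    "NCBI Taxonomy": "Species & Viruses",
}


def ontologies_for(concept_type):
    """Return the primary ontologies listed under the given concept type."""
    return [
        ontology
        for ontology, listed_type in CONCEPT_TYPE_BY_ONTOLOGY.items()
        if listed_type == concept_type
    ]


genetics_sources = ontologies_for("Genetics")
```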
