[BETA] Structure with Ontologies
Now our API can return the ontologies along with the UMLS codes in each JSON response.
The API's enhanced Structure model (structure-ontologies) automatically maps UMLS codes to a select set of ontologies. This means you no longer have to manually look up every UMLS code received from the API to find the ontologies that are related to it.
The endpoint is currently available only via HTTP or the Python requests library. SDK support is coming soon!
Note that we cannot guarantee backward compatibility for any beta endpoint. API signatures and response schemas are subject to change in future releases.
Why Try It?
Use this new endpoint if you want to:
- See if ScienceIO maps your data to the ontologies you care about
- See how many ontologies ScienceIO can find for an identified piece of healthcare data
Supported Ontologies
In this first phase, the following ontologies are identified and supported:
- CPT
- HCPCS
- ICD-10
- SNOMED-CT
The API uses a Knowledge Graph to connect healthcare concepts to ontologies. The ontologies above are mapped by leveraging UMLS as a primary ontology.
These are not the only ontologies.
Additional ontologies like ChEMBL, NCIT, RxNORM, GeneID, Cell Line, LOINC, and more are still used by the API to identify and structure healthcare information, but are not yet supported by this feature. For more information about ontologies, see our Ontologies page.
How to Access the New Endpoint
In this release, Python users can access the new endpoint via the requests library. HTTP users will need to update the endpoint in their code to structure-ontologies in order to automatically map to the ontologies.
HTTP
Updating the endpoint to structure-ontologies (line one in the example below) is optional and reversible; you may continue to call the current structure endpoint.
curl https://api.aws.science.io/v2/structure-ontologies \
--request POST \
--header "Content-type: application/json" \
--header "x-api-id: $SCIENCEIO_KEY_ID" \
--header "x-api-secret: $SCIENCEIO_KEY_SECRET" \
--data '{ "text": "ALS is often called Lou Gehrigs disease, after the baseball player who was diagnosed with it. Doctors usually do not know why ALS occurs."}'
For additional help, see Make an API Call (HTTP).
Requests Library (Python)
First, create a mini SDK for the API that includes your API keys. The helper polls for results using exponential backoff, and ScienceIO recommends capping the number of attempts at 8. Make sure you add your API keys in the appropriate variables (the last two lines of this code sample).
When you have finished and tested this piece, go to the Examples section to learn how to make your API calls.
import time

import requests

MAX_ATTEMPTS = 8
INITIAL_TIMEOUT_SECS = 1


def get_result_with_exponential_backoff(base_url: str, request_id: str, api_key_id: str, api_key_secret: str):
    # Poll for the result, doubling the wait after each unsuccessful attempt.
    url = f"{base_url}/{request_id}"
    headers = {
        "Content-Type": "application/json",
        "x-api-id": api_key_id,
        "x-api-secret": api_key_secret,
    }
    current_timeout = INITIAL_TIMEOUT_SECS
    for _ in range(MAX_ATTEMPTS):
        response = requests.get(url, headers=headers)
        response.raise_for_status()
        response_json = response.json()
        #print(response_json)
        inference_result = response_json.get("inference_result", None)
        if inference_result is not None:
            return inference_result
        time.sleep(current_timeout)
        current_timeout *= 2
    raise Exception("Number of attempts exhausted, try again later")


def call_short_async(model: str, text: str, api_key_id: str, api_key_secret: str):
    # Submit the text, then poll for the result using the returned request ID.
    url = f"https://api.aws.science.io/v2/{model}"
    json_request = {"text": text}
    headers = {
        'x-api-id': api_key_id,
        'x-api-secret': api_key_secret,
        'Content-Type': 'application/json'
    }
    response = requests.request("POST", url, headers=headers, json=json_request)
    status_code = response.status_code
    if status_code != 201:
        reason = response.reason
        raise Exception(f"Request failed with status code {status_code} and reason: {reason}")
    request_id = response.json()["request_id"]
    return get_result_with_exponential_backoff(url, request_id, api_key_id, api_key_secret)


# Add your ScienceIO API keys here.
api_key_id = "<YOUR_API_KEY_ID>"
api_key_secret = "<YOUR_API_SECRET_KEY>"
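To see what the retry settings in the mini SDK imply, the sketch below lists the wait that follows each unsuccessful poll and the worst-case total. It reuses the same two constants; backoff_schedule is a hypothetical helper written for illustration and is not part of the ScienceIO API.

```python
MAX_ATTEMPTS = 8
INITIAL_TIMEOUT_SECS = 1


def backoff_schedule(initial_secs=INITIAL_TIMEOUT_SECS, attempts=MAX_ATTEMPTS):
    """Return the sleep duration (in seconds) that follows each unsuccessful poll."""
    waits = []
    current = initial_secs
    for _ in range(attempts):
        waits.append(current)
        current *= 2  # same doubling rule as the polling loop in the mini SDK
    return waits


schedule = backoff_schedule()
print(schedule)       # [1, 2, 4, 8, 16, 32, 64, 128]
print(sum(schedule))  # 255
```

With these defaults the helper sleeps for at most 255 seconds in total before raising, not counting the time spent on the requests themselves.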
Examples
Use these examples after you have successfully accessed the new endpoint via the requests library or HTTP, as described in How to Access the New Endpoint.
Make a Basic Call
Use the following code to make a call to the structure-ontologies endpoint with your healthcare text.
# Basic call to the endpoint. Replace the text with yours.
response = call_short_async("structure-ontologies", "The patient presents with a complaint of allergies to ragweed.", api_key_id, api_key_secret)
response
The full API response looks like this, with the new ontologies dictionary in each span (note that call_short_async returns only the inference_result portion):
{
  "request_id": "0cefa1a8-2320-4684-9a3b-2c081dfb9e3b",
  "inference_result": {
    "text": "The patient presents with a complaint of allergies to ragweed.",
    "spans": [
      {
        "concept_id": "UMLS:C0086418",
        "concept_name": "Homo sapiens",
        "concept_type": "Species & Viruses",
        "pos_end": 11,
        "pos_start": 4,
        "score_id": 0.999993085861206,
        "score_type": 0.9999990463256836,
        "text": "patient",
        "ontologies": {
          "SNOMEDCT_US": [
            {
              "aui": "A2882492",
              "code": "337915000",
              "name": "Homo sapiens"
            },
            {
              "aui": "A3497260",
              "code": "278412004",
              "name": "Human - origin"
            }
          ]
        }
      },
      {
        "concept_id": "UMLS:C0020517",
        "concept_name": "Hypersensitivity",
        "concept_type": "Medical Conditions",
        "pos_end": 50,
        "pos_start": 41,
        "score_id": 0.6348877549171448,
        "score_type": 0.9998083710670471,
        "text": "allergies",
        "ontologies": {
          "ICD10": [
            {
              "aui": "A0244105",
              "code": "T78.4",
              "name": "Allergy, unspecified"
            }
          ],
          "SNOMEDCT_US": [
            {
              "aui": "A10863724",
              "code": "421961002",
              "name": "Hypersensitivity reaction"
            },
            {
              "aui": "A9379243",
              "code": "418634005",
              "name": "Allergic reaction to substance"
            }
          ]
        }
      },
      {
        "concept_id": "UMLS:C0946568",
        "concept_name": "Ambrosia artemisiifolia",
        "concept_type": "Species & Viruses",
        "pos_end": 61,
        "pos_start": 54,
        "score_id": 0.8726935386657715,
        "score_type": 0.6667567491531372,
        "text": "ragweed",
        "ontologies": {
          "SNOMEDCT_US": [
            {
              "aui": "A24089354",
              "code": "41020006",
              "name": "Ambrosia artemisiifolia"
            }
          ]
        }
      }
    ]
  },
  "model_type": "structure-ontologies",
  "inference_status": "COMPLETED",
  "message": "Your inference results are ready."
}
To format these results into a table, use pandas (see the next example for a more detailed look at pandas):
# Pandas can help you better analyze the response.
import pandas as pd
df = pd.DataFrame(response['spans'])
df
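If you would rather not depend on pandas, the same result can be flattened with plain Python. This is a minimal sketch: flatten_spans is a hypothetical helper, and sample_result is a trimmed copy of one span from the inference_result documented on this page, so the snippet runs without calling the API. It also assumes that the ontologies key may be missing or empty when no mapping is found.

```python
def flatten_spans(inference_result):
    """Return one row per (span, ontology, code) combination."""
    rows = []
    for span in inference_result.get("spans", []):
        # Assumption: "ontologies" may be absent or empty for unmapped spans.
        for ontology, entries in (span.get("ontologies") or {}).items():
            for entry in entries:
                rows.append({
                    "text": span["text"],
                    "concept_id": span["concept_id"],
                    "ontology": ontology,
                    "code": entry["code"],
                    "name": entry["name"],
                })
    return rows


# Trimmed copy of the "allergies" span from the documented response.
sample_result = {
    "text": "The patient presents with a complaint of allergies to ragweed.",
    "spans": [
        {
            "concept_id": "UMLS:C0020517",
            "text": "allergies",
            "ontologies": {
                "ICD10": [
                    {"aui": "A0244105", "code": "T78.4", "name": "Allergy, unspecified"}
                ],
                "SNOMEDCT_US": [
                    {"aui": "A10863724", "code": "421961002", "name": "Hypersensitivity reaction"},
                    {"aui": "A9379243", "code": "418634005", "name": "Allergic reaction to substance"},
                ],
            },
        }
    ],
}

rows = flatten_spans(sample_result)
for row in rows:
    print(row["text"], row["ontology"], row["code"], row["name"])
```

With live data, pass the value returned by call_short_async in place of sample_result.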
Extract Concepts that Map to a Specific Ontology Using Pandas
In this example, we'll take some sample text and extract the ICD-10 codes.
First, let's input some sample healthcare text (make sure you have pandas installed before beginning):
# First, we need to input the text.
text = """Patient is a 35-year-old male with a history of hypertension and
obesity. He presents today with chest pain and shortness of breath.
Physical examination reveals elevated blood pressure and increased heart rate.
EKG shows evidence of acute myocardial infarction. Patient is started on aspirin
and referred for urgent cardiac catheterization. Further management to be
determined following catheterization."""
# Now we'll call the endpoint and load the returned spans into a pandas DataFrame.
response = call_short_async("structure-ontologies", text, api_key_id, api_key_secret)
pd.DataFrame(response["spans"])
The result is a DataFrame with one row per identified span (output table not shown).
The far right column contains the ontologies. You can use a wrapper function to extract only the healthcare concepts that are mapped to a specific ontology (in this example, ICD-10, entered as "ICD10"), and return a pandas dataframe with only the ontology codes you were looking for.
Note: You could find SNOMED-CT, instead, by changing the last line in the code below to: extract_specific_codes(response,"SNOMEDCT_US")
# In this example, we are looking for ICD10.
def extract_specific_codes(response, ontology="ICD10"):
    """
    Parse the response from the API and extract concepts that map to a specific ontology
    """
    df = pd.DataFrame(response["spans"])
    temp_df = pd.json_normalize(df["ontologies"])
    df = df.join(temp_df)
    if ontology in ["SNOMEDCT_US", "ICD10"]:
        if ontology in df.columns:
            temp_df = df[~df[ontology].isna()].reset_index()
            temp_df[ontology + "_codes"] = temp_df[ontology].apply(lambda x: [r["code"] for r in x])
            temp_df[ontology + "_count"] = temp_df[ontology].apply(len)
            return temp_df
        else:
            return print("Response had no annotations in", ontology)
    else:
        return print("Ontology requested can be one of SNOMEDCT_US,ICD10")

# Specify the ontology here. You can also try this exercise with "SNOMEDCT_US".
extract_specific_codes(response, "ICD10")
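The result is a DataFrame containing only the spans that map to the requested ontology, with added ICD10_codes and ICD10_count columns (output table not shown).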
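To get a quick sense of how many codes each ontology contributed across a whole response, you can count entries per ontology. The sketch below uses only the standard library. count_codes_by_ontology is a hypothetical helper, and sample_result is a hand-trimmed version of the ragweed response shown earlier on this page, keeping only the code fields.

```python
from collections import Counter


def count_codes_by_ontology(inference_result):
    """Count how many ontology entries appear under each ontology key."""
    counts = Counter()
    for span in inference_result.get("spans", []):
        # Assumption: "ontologies" may be absent or empty for unmapped spans.
        for ontology, entries in (span.get("ontologies") or {}).items():
            counts[ontology] += len(entries)
    return counts


# Trimmed from the documented response: codes only, other fields omitted.
sample_result = {
    "spans": [
        {"text": "patient",
         "ontologies": {"SNOMEDCT_US": [{"code": "337915000"}, {"code": "278412004"}]}},
        {"text": "allergies",
         "ontologies": {"ICD10": [{"code": "T78.4"}],
                        "SNOMEDCT_US": [{"code": "421961002"}, {"code": "418634005"}]}},
        {"text": "ragweed",
         "ontologies": {"SNOMEDCT_US": [{"code": "41020006"}]}},
    ]
}

counts = count_codes_by_ontology(sample_result)
print(counts["SNOMEDCT_US"])  # 5
print(counts["ICD10"])        # 1
```

Unlike extract_specific_codes, this helper does not restrict the ontology name, so any key present in the response is counted.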
Interpreting the New Ontologies Dictionary
The JSON response maps each concept_id (UMLS code) to one or more secondary ontologies by using its atom unique identifier (AUI), providing insight into the specific ontologies that are associated with each piece of healthcare information. For existing users, the arrays are the same as before except they now include the ontologies dictionary, with the following information for each mapped ontology:
- aui = the atom unique identifier
- code = the source identifier in the ontology
- name = the concept name in the ontology
In the example below, both UMLS codes have been mapped to SNOMED-CT.
- UMLS code C5203670 for "COVID-19" was mapped to SNOMED-CT based on the AUI A31531574. The corresponding concept name in SNOMED-CT for "COVID-19" is "Disease caused by 2019-nCoV."
- UMLS code C0012634 for "Disease" was mapped to SNOMED-CT based on the AUI A2880798. The corresponding concept name in SNOMED-CT for "Disease" is the same as the UMLS concept_name.
Remember that only the currently supported ontologies will display; look for additional ontologies in future releases.
{
  "text": "COVID-19 is a disease",
  "spans": [
    {
      "concept_id": "UMLS:C5203670",
      "concept_name": "COVID-19",
      "concept_type": "Medical Conditions",
      "pos_end": 8,
      "pos_start": 0,
      "score_id": 0.9998598098754883,
      "score_type": 0.9999113082885742,
      "text": "COVID-19",
      "ontologies": {
        "SNOMEDCT_US": [
          {
            "aui": "A31531574",
            "code": "840539006",
            "name": "Disease caused by 2019-nCoV"
          }
        ]
      }
    },
    {
      "concept_id": "UMLS:C0012634",
      "concept_name": "Disease",
      "concept_type": "Medical Conditions",
      "pos_end": 21,
      "pos_start": 14,
      "score_id": 0.9999176263809204,
      "score_type": 0.9999895095825195,
      "text": "disease",
      "ontologies": {
        "SNOMEDCT_US": [
          {
            "aui": "A2880798",
            "code": "64572001",
            "name": "Disease"
          }
        ]
      }
    }
  ],
  "model_type": "structure-ontologies",
  "inference_status": "COMPLETED",
  "message": "Your inference results are ready."
}
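The lookup described in this section can be sketched in a few lines of Python. mappings_for is a hypothetical helper, and sample_result is trimmed from the COVID-19 example on this page, so nothing here calls the API.

```python
def mappings_for(inference_result, concept_id, ontology):
    """Return (aui, code, name) tuples for one UMLS concept in one ontology."""
    matches = []
    for span in inference_result.get("spans", []):
        if span.get("concept_id") != concept_id:
            continue
        # Assumption: a span without a mapping has no entry for that ontology.
        for entry in (span.get("ontologies") or {}).get(ontology, []):
            matches.append((entry["aui"], entry["code"], entry["name"]))
    return matches


# Trimmed from the documented COVID-19 example.
sample_result = {
    "text": "COVID-19 is a disease",
    "spans": [
        {
            "concept_id": "UMLS:C5203670",
            "concept_name": "COVID-19",
            "ontologies": {
                "SNOMEDCT_US": [
                    {"aui": "A31531574", "code": "840539006",
                     "name": "Disease caused by 2019-nCoV"}
                ]
            },
        },
        {
            "concept_id": "UMLS:C0012634",
            "concept_name": "Disease",
            "ontologies": {
                "SNOMEDCT_US": [
                    {"aui": "A2880798", "code": "64572001", "name": "Disease"}
                ]
            },
        },
    ],
}

covid = mappings_for(sample_result, "UMLS:C5203670", "SNOMEDCT_US")
print(covid)
```

An empty list means the concept was not found in the response or has no mapping in the requested ontology.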
Troubleshooting
NameError or AttributeError
Check your code and the endpoint to be sure everything is correct.
- Check your Python code to be sure the mini SDK code was copied and edited correctly, and that you included your API keys
- In cURL, make sure you have "v2" in your URL and that you have used a dash and not an underscore in the endpoint (https://api.aws.science.io/v2/structure-ontologies)
If these steps fail, you may wish to generate new API keys from the ScienceIO dashboard and try again. Note, however, that your old API keys will no longer work. Be sure they are not being used in production (or that you are prepared to update your keys) before you generate new ones.
SyntaxError
This error is usually caused by apostrophes in your query text, such as those in contractions.
- Manually remove the apostrophes in your query text
- Write code to automatically clean up your query text and remove apostrophes
If this does not resolve the issue, make sure you have enclosed your query text with quotation marks (single or double).
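One way to automate the apostrophe cleanup is sketched below. strip_apostrophes is a hypothetical helper, not something ScienceIO provides. It removes straight and curly apostrophes, which changes the text slightly, so pos_start and pos_end in the response will refer to the cleaned text rather than your original.

```python
# Straight apostrophe plus the two curly variants often pasted from documents.
APOSTROPHES = ("'", "\u2019", "\u2018")


def strip_apostrophes(text):
    """Remove apostrophe characters from the query text."""
    for mark in APOSTROPHES:
        text = text.replace(mark, "")
    return text


cleaned = strip_apostrophes("ALS is often called Lou Gehrig's disease.")
print(cleaned)  # ALS is often called Lou Gehrigs disease.
```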
Ontology Not Found
Make sure you have used the correct nomenclature, and that you are requesting one of the supported ontologies. They should be written exactly as follows in your code:
- CPT
- HCPCS
- ICD10
- SNOMEDCT_US
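A small guard can catch naming mistakes before you filter a response. This is a minimal sketch: normalize_ontology is a hypothetical helper, the supported keys are the four listed above, and the alias table is an assumption about spellings people commonly type, not something defined by the API.

```python
SUPPORTED_ONTOLOGIES = ("CPT", "HCPCS", "ICD10", "SNOMEDCT_US")

# Assumed aliases for common spellings; extend or remove as needed.
ALIASES = {
    "ICD-10": "ICD10",
    "SNOMED-CT": "SNOMEDCT_US",
    "SNOMEDCT": "SNOMEDCT_US",
}


def normalize_ontology(name):
    """Return the key used in the response, or raise ValueError if unsupported."""
    key = name.strip().upper()
    key = ALIASES.get(key, key)
    if key not in SUPPORTED_ONTOLOGIES:
        raise ValueError(
            f"Unsupported ontology {name!r}; expected one of {', '.join(SUPPORTED_ONTOLOGIES)}"
        )
    return key


print(normalize_ontology("icd-10"))     # ICD10
print(normalize_ontology("SNOMED-CT"))  # SNOMEDCT_US
```

Raising an error early makes the problem easier to spot than a result that silently comes back empty.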
Feedback
We'd love your feedback! Tell us what you think about this product update. Email us at [email protected].