Filter Messages by Concept Type
Identify trends in messages with filtering.
In this example, we will filter patient messages by concept type (concept_type) in a couple of different ways.
This example uses the structured patient messages from Structure a Series of Messages. We recommend executing that code first, choosing the option on this page you're most interested in, and then adding that code to it.
Advanced users may wish to edit the code and apply the steps to their own messages so they can use filtering (some guidance has been provided for each option). You must still call the structure endpoint before you can filter by concept type.
To see a more basic filtering example, go to Step 4 of the Make an API Call (Python SDK) page.
For this exercise only, remember that:
- When concept_type equals "Medical Conditions", the concept_name indicates the specific side effect
- When concept_type equals "Chemicals & Drugs", the concept_name indicates the medication
These code samples use version 2.0.0 of the ScienceIO API. Before you begin, please check your version and upgrade your scienceio Python package if necessary.
Option 1: Get a Count of Side Effects
One way to examine trends in messages is to use counts. After running the code from Structure a Series of Messages, add this code to get a count of all side effects that were identified in the messages.
# Subset the df to only medical conditions
side_effects_db = structured_results[structured_results["concept_type"]=="Medical Conditions"]
# Create a count of each
side_effects_db["concept_name"].value_counts()
The resulting table looks like this:

A few notes:
- The name of the referenced DataFrame in this example is structured_results, which we created in step 3 of Structure a Series of Messages.
- The concept_type equals "Medical Conditions" because the API assigned all side effects the Medical Conditions concept type when it structured the messages.
- The concept_name is what generates the count for each side effect; in this example, the API assigned all side effects to the concept_name variable when it structured the messages.
Advanced Users
To use this example on your own messages, change the name of the DataFrame to yours, and decide which concept_type you want to count. ScienceIO has nine different concept types.
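To reuse the count across concept types, the filter-and-count step could be wrapped in a small helper. This is a minimal sketch, not part of the ScienceIO package: count_concepts is a hypothetical name, and the toy DataFrame below is made-up data standing in for your own structured results. Only the concept_type and concept_name columns come from the example above.

```python
import pandas as pd


def count_concepts(df: pd.DataFrame, concept_type: str) -> pd.Series:
    """Count how often each concept_name appears for one concept_type."""
    subset = df[df["concept_type"] == concept_type]
    return subset["concept_name"].value_counts()


# Toy data (hypothetical), standing in for structured_results
toy_results = pd.DataFrame({
    "concept_type": [
        "Medical Conditions",
        "Chemicals & Drugs",
        "Medical Conditions",
        "Medical Conditions",
    ],
    "concept_name": ["Nausea", "Ibuprofen", "Headache", "Nausea"],
})

# Swap in your own DataFrame and whichever concept type you want to count
side_effect_counts = count_concepts(toy_results, "Medical Conditions")
print(side_effect_counts)
```

Passing a different concept type string is the only change needed to count another kind of concept.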
Option 2: Filter by Medications
Another way to examine messages is to list every concept_name found for a particular concept_type. After running the code from Structure a Series of Messages, add this code to create a list of all medications that were identified in the messages.
# Subset the df to only medications
structured_results[structured_results["concept_type"]=="Chemicals & Drugs"]
The resulting table looks like this:

A few notes:
- The name of the referenced DataFrame in this example is structured_results, which we created in step 3 of Structure a Series of Messages.
- The concept_type equals "Chemicals & Drugs" because the API assigned all medications the Chemicals & Drugs concept type when it structured the messages.
Advanced Users
To use this example on your own messages, change the name of the DataFrame to yours, and decide which concept_type you want to list. ScienceIO has nine different concept types.
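If you only need the distinct names rather than every matching row, the filter can be followed by a de-duplication step. The sketch below assumes the same concept_type and concept_name columns; list_concepts is a hypothetical helper name and the toy DataFrame is made-up data used for illustration only.

```python
import pandas as pd


def list_concepts(df: pd.DataFrame, concept_type: str) -> list:
    """Return the sorted, de-duplicated concept names for one concept_type."""
    subset = df[df["concept_type"] == concept_type]
    return sorted(subset["concept_name"].drop_duplicates().tolist())


# Toy data (hypothetical), standing in for structured_results
toy_results = pd.DataFrame({
    "concept_type": [
        "Chemicals & Drugs",
        "Medical Conditions",
        "Chemicals & Drugs",
        "Chemicals & Drugs",
    ],
    "concept_name": ["Metformin", "Nausea", "Ibuprofen", "Metformin"],
})

medications = list_concepts(toy_results, "Chemicals & Drugs")
print(medications)
```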
Option 3: Find Side Effects Associated with Medications
Building on Options 1 and 2 above, you can run a meta-analysis to gain a deeper understanding of your data.
This example is based on the simplifying assumption that each patient message mentions only a single medication. The patient messages also included only two different concept types, and both apply to this meta-analysis (so we did not filter out any others). You may need to first filter by concept type when trying the code on your own data.
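Before relying on the single-medication assumption with your own data, it may help to check it. The sketch below counts distinct medications per message and reports any message that does not have exactly one. The toy DataFrame is made-up data; the column names message_id, concept_type, and concept_name are the ones used in this example.

```python
import pandas as pd

# Toy data (hypothetical): message 2 names two medications, message 3 names none
toy_results = pd.DataFrame({
    "message_id": [1, 1, 2, 2, 2, 3],
    "concept_type": [
        "Chemicals & Drugs", "Medical Conditions",
        "Chemicals & Drugs", "Chemicals & Drugs", "Medical Conditions",
        "Medical Conditions",
    ],
    "concept_name": [
        "Ibuprofen", "Nausea",
        "Metformin", "Aspirin", "Headache",
        "Dizziness",
    ],
})

# Count distinct medications mentioned in each message
meds = toy_results[toy_results["concept_type"] == "Chemicals & Drugs"]
meds_per_message = meds.groupby("message_id")["concept_name"].nunique()

# Include messages with no medication at all, with a count of 0
all_ids = toy_results["message_id"].unique()
meds_per_message = meds_per_message.reindex(all_ids, fill_value=0)

# Messages that break the "exactly one medication" assumption
violations = meds_per_message[meds_per_message != 1]
print(violations)
```

An empty result means every message mentions exactly one medication, so the mapping step that follows applies as written.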
First, use this code to create a mapping of the medication to each patient message. This will add another column called medication to the DataFrame, which indicates the medication a patient mentioned in their message.
structured_results_medications = structured_results.copy()
# Build a lookup from each message_id to the medication named in that message
medication_map = {}
for _, row in structured_results_medications.iterrows():
    if row["concept_type"] == "Chemicals & Drugs":
        medication_map[row["message_id"]] = row["concept_name"]
print(medication_map)
# Add a "medication" column by looking up each row's message_id
structured_results_medications["medication"] = structured_results_medications["message_id"].apply(lambda x: medication_map[x])
structured_results_medications.head()
The results look like this:

Next, use pandas to tie each medication to its reported side effects: filter the structured results DataFrame to only the "Medical Conditions" concept_type, group the results by the new medication column, and count the values in the concept_name column.
# Filter to only medical conditions
filtered_conditions = structured_results_medications[structured_results_medications["concept_type"]=="Medical Conditions"]
filtered_conditions.groupby("medication")["concept_name"].value_counts().to_frame("counts")
The resulting table looks like this:

Advanced Users
To use this example on your own messages, change the name of the DataFrame to yours, decide which variables you want to map, and decide how you wish to group the results. You may also want to first filter the results to only the concept types you are analyzing.
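On your own data, some messages may not mention a medication at all, in which case a dictionary lookup keyed on message_id would raise a KeyError. One possible variation, sketched below on made-up toy data, uses Series.map so that such messages get a missing value instead and can be dropped before grouping. It keeps only the first medication found per message, which is an assumption carried over from this example rather than a general rule.

```python
import pandas as pd

# Toy data (hypothetical): message 3 mentions no medication
toy_results = pd.DataFrame({
    "message_id": [1, 1, 2, 2, 3],
    "concept_type": [
        "Chemicals & Drugs", "Medical Conditions",
        "Chemicals & Drugs", "Medical Conditions",
        "Medical Conditions",
    ],
    "concept_name": ["Ibuprofen", "Nausea", "Ibuprofen", "Headache", "Dizziness"],
})

# Lookup Series: message_id -> first medication named in that message
meds = toy_results[toy_results["concept_type"] == "Chemicals & Drugs"]
medication_map = (
    meds.drop_duplicates(subset="message_id")
    .set_index("message_id")["concept_name"]
)

# map() leaves NaN where a message has no medication, instead of raising
with_medication = toy_results.copy()
with_medication["medication"] = with_medication["message_id"].map(medication_map)

# Count side effects per medication, skipping rows without a medication
conditions = with_medication[with_medication["concept_type"] == "Medical Conditions"]
counts = (
    conditions.dropna(subset=["medication"])
    .groupby("medication")["concept_name"]
    .value_counts()
    .to_frame("counts")
)
print(counts)
```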
Questions?
If you need additional help, we're standing by ready to assist! Contact support.