Filter Messages by Concept Type

Identify trends in messages with filtering.

In this example, we filter patient messages by concept type (concept_type) in three ways: counting side effects, listing medications, and pairing medications with their reported side effects. These code samples use version 2.0.0 of the ScienceIO API. Before you begin, please check your version and upgrade your scienceio Python package if necessary.

For this exercise only, remember that:

  • When concept_type equals “Medical Conditions”, the concept_name indicates the specific side effect.
  • When concept_type equals “Chemicals & Drugs”, the concept_name indicates the medication.

Option 1: Get a Count of Side Effects

One way to examine trends in messages is to use counts. After running the code from Structure, Parse, and Analyze Messages, add this code to get a count of all side effects that were identified in the messages.

# Subset the df to only medical conditions
side_effects_db = structured_results[structured_results["concept_type"]=="Medical Conditions"]

# Create a count of each
side_effects_db["concept_name"].value_counts()

The resulting table looks like this:

Notes:

  • The name of the referenced DataFrame in this example is structured_results, which we created in step 3 of Structure, Parse, and Analyze Messages.
  • The concept_type equals “Medical Conditions” because the API assigned all side effects the Medical Conditions concept type when it structured the messages.
  • The count is computed over concept_name; in this example, the API stored the name of each side effect in the concept_name column when it structured the messages.
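
To try the counting step without calling the API, the following minimal sketch builds a small stand-in for structured_results. The rows are hypothetical sample data, not real API output, and the only columns assumed are message_id, concept_type, and concept_name as used in this guide.

```python
import pandas as pd

# Hypothetical stand-in for structured_results (made-up rows, not API output)
structured_results = pd.DataFrame({
    "message_id": [1, 1, 2, 2, 3, 3],
    "concept_type": ["Chemicals & Drugs", "Medical Conditions"] * 3,
    "concept_name": ["ibuprofen", "nausea", "ibuprofen", "headache",
                     "metformin", "nausea"],
})

# Keep only the rows the API labeled as medical conditions
side_effects = structured_results[
    structured_results["concept_type"] == "Medical Conditions"
]

# Count how often each side effect name appears, most frequent first
side_effect_counts = side_effects["concept_name"].value_counts()
print(side_effect_counts)
```

With this sample data, nausea is counted twice and headache once. Passing normalize=True to value_counts would return proportions instead of raw counts.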

Option 2: Filter by Medications

Another way to examine messages is to list every concept_name found for a particular concept_type. After running the code from Structure, Parse, and Analyze Messages, add this code to create a list of all medications that were identified in the messages.

# Subset the df to only medications
structured_results[structured_results["concept_type"]=="Chemicals & Drugs"]

The resulting table looks like this:

Notes:

  • The name of the referenced DataFrame in this example is structured_results, which we created in step 3 of Structure, Parse, and Analyze Messages.
  • The concept_type equals “Chemicals & Drugs” because the API assigned all medications the Chemicals & Drugs concept type when it structured the messages.
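
The filter above returns one row per medication mention, so the same medication can appear many times. If you only need the distinct medication names, one possible follow-up is sketched below. The DataFrame is a hypothetical stand-in with made-up rows, and unique_medications is an illustrative name that does not appear elsewhere in this guide.

```python
import pandas as pd

# Hypothetical stand-in for structured_results (made-up rows, not API output)
structured_results = pd.DataFrame({
    "message_id": [1, 1, 2, 2, 3, 3],
    "concept_type": ["Chemicals & Drugs", "Medical Conditions"] * 3,
    "concept_name": ["ibuprofen", "nausea", "ibuprofen", "headache",
                     "metformin", "nausea"],
})

# Keep only the rows the API labeled as medications
medications = structured_results[
    structured_results["concept_type"] == "Chemicals & Drugs"
]

# Reduce the mentions to a sorted list of distinct medication names
unique_medications = sorted(medications["concept_name"].unique().tolist())
print(unique_medications)
```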

Option 3: Find Side Effects Associated with Medications

Building on Options 1 and 2, you can combine the two filters to see which side effects were reported alongside each medication.

First, use this code to map each patient message (message_id) to the medication it mentions. This adds a column called medication to the DataFrame, which indicates the medication a patient mentioned in their message.

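# Work on a copy so the original DataFrame is left unchanged
structured_results_medications = structured_results.copy()

# Build a message_id -> medication lookup.
# If a message mentions several medications, only the last one is kept.
medication_map = {}
for _, row in structured_results_medications.iterrows():
    if row["concept_type"] == "Chemicals & Drugs":
        medication_map[row["message_id"]] = row["concept_name"]
print(medication_map)

# Add the medication column; messages with no medication get NaN
# instead of raising a KeyError
structured_results_medications["medication"] = structured_results_medications["message_id"].map(medication_map)
structured_results_medications.head()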

The results look like this:

Next, use pandas to tie each medication to its reported side effects: filter the DataFrame to only the “Medical Conditions” concept_type, group the rows by the new medication column, and count how often each concept_name occurs within each group.

# Filter to only medical conditions
filtered_conditions = structured_results_medications[structured_results_medications["concept_type"]=="Medical Conditions"]

# Count each side effect per medication
filtered_conditions.groupby("medication")["concept_name"].value_counts().to_frame("counts")

The resulting table looks like this:
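
The loop in the first step keeps one medication per message. As an alternative, the following sketch pairs medications with side effects using a pandas merge, which avoids the row-by-row loop and keeps every medication when a message mentions more than one. This is a swapped-in technique rather than the method used earlier in this guide, and the DataFrame is a hypothetical stand-in with made-up rows.

```python
import pandas as pd

# Hypothetical stand-in for structured_results (made-up rows, not API output).
# Message 4 reports a side effect but mentions no medication.
structured_results = pd.DataFrame({
    "message_id": [1, 1, 2, 2, 3, 3, 4],
    "concept_type": ["Chemicals & Drugs", "Medical Conditions"] * 3
                    + ["Medical Conditions"],
    "concept_name": ["ibuprofen", "nausea", "ibuprofen", "headache",
                     "metformin", "nausea", "fatigue"],
})

is_medication = structured_results["concept_type"] == "Chemicals & Drugs"
is_condition = structured_results["concept_type"] == "Medical Conditions"

# One row per (message, medication), with the name column renamed
meds = structured_results.loc[is_medication, ["message_id", "concept_name"]]
meds = meds.rename(columns={"concept_name": "medication"})

conditions = structured_results[is_condition]

# Inner merge on message_id: side effects from messages without a medication
# are dropped, and a message with several medications yields one row per
# medication.
paired = conditions.merge(meds, on="message_id", how="inner")

# Count each side effect per medication
counts = (
    paired.groupby("medication")["concept_name"]
    .value_counts()
    .to_frame("counts")
)
print(counts)
```

Because the merge is an inner join, the fatigue row from message 4 does not appear in the result. Use how="left" instead if you want to keep side effects that have no associated medication.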

Questions?

If you need additional help, we’re standing by ready to assist! Contact support.