Introduction
The llm_interpret() function uses a Large Language Model (LLM) to
interpret epidemiological visualisations and data automatically. It can
summarise patterns and provide contextual analysis of your charts and
datasets, which makes it useful for surveillance reporting and data
exploration.
Environment Setup
Before using llm_interpret(), set the provider-agnostic
environment variables that the function reads.
Required Environment Variables
Set these in your R session with Sys.setenv() as shown here, or add them to your .Renviron file as plain NAME=value lines:
# Provider: one of "openai", "gemini", or "anthropic"
Sys.setenv(LLM_PROVIDER = "openai")
# API key for the chosen provider
Sys.setenv(LLM_API_KEY = "your-api-key")
# Model for the chosen provider (examples)
Sys.setenv(LLM_MODEL = "gpt-4.1-nano") # OpenAI
# Sys.setenv(LLM_MODEL = "gemini-2.5-flash-lite") # Google
# Sys.setenv(LLM_MODEL = "claude-sonnet-4-20250514") # Anthropic
Verify Setup
# Check that all three variables are set before calling llm_interpret()
llm_vars <- c("LLM_PROVIDER", "LLM_API_KEY", "LLM_MODEL")
if (all(nzchar(Sys.getenv(llm_vars)))) {
  cat("LLM environment is configured\n")
} else {
  cat("Please set LLM_PROVIDER, LLM_API_KEY and LLM_MODEL\n")
}
Example 1: Interpreting an epidemic curve
This example demonstrates how to use LLM interpretation to analyse an epidemic curve and generate insights about temporal patterns.
Prepare the data and create a visualisation
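# Load epiviz (plot functions and example data) and dplyr (%>% and filter())
library(epiviz)
library(dplyr)

# Filter the example lab data to Staphylococcus aureus specimens from 2023
epi_data <- epiviz::lab_data %>%
  filter(
    organism_species_name == "STAPHYLOCOCCUS AUREUS",
    specimen_date >= as.Date("2023-01-01"),
    specimen_date <= as.Date("2023-12-31")
  )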
# Create the epidemic curve
epi_curve_plot <- epi_curve(
  dynamic = FALSE,
  params = list(
    df = epi_data,
    date_var = "specimen_date",
    date_start = "2023-01-01",
    date_end = "2023-12-31",
    time_period = "year_month",
    fill_colours = "#007C91",
    chart_title = "Monthly Staph aureus detections (2023)",
    x_axis_title = "Month",
    y_axis_title = "Number of detections"
  )
)
# Display the plot
print(epi_curve_plot)
Interpret the visualisation
# Use LLM to interpret the epidemic curve
interpretation <- llm_interpret(
  input = epi_curve_plot,
  word_limit = 120,
  prompt_extension = "Analyse this epidemic curve and identify notable patterns, trends, or anomalies. Focus on seasonal patterns and potential outbreak periods."
)
# Display the interpretation
cat("LLM Interpretation:\n")
cat(interpretation)
Interpretation: The LLM will analyse the epidemic curve and provide insights about temporal patterns, seasonal trends, and any notable peaks or anomalies in the data.
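The word_limit argument is passed to the model as part of the request, and it may not be enforced strictly, so it can be worth checking the length of the returned text before it goes into a report. The sketch below is one minimal way to do that. It assumes llm_interpret() returns a single character string (the cat() call above suggests this, but it is not confirmed in this article). count_words() and within_limit() are hypothetical helper names and are not part of epiviz.

```r
# Hypothetical helper: count whitespace-separated words in a single string
count_words <- function(text) {
  words <- strsplit(trimws(text), "\\s+")[[1]]
  sum(nzchar(words))
}

# Hypothetical helper: TRUE if the text is within the limit,
# allowing a small tolerance (10% by default) for slight overruns
within_limit <- function(text, word_limit, tolerance = 0.1) {
  count_words(text) <= word_limit * (1 + tolerance)
}

# Placeholder text; in practice this would be the value returned
# by llm_interpret() (assumed to be a character string)
placeholder_text <- "one two three four five"
count_words(placeholder_text)
within_limit(placeholder_text, word_limit = 120)
```

If the check fails, one option is to rerun the call with a smaller word_limit rather than truncating the text, since truncation can cut a sentence in half.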
Example 2: Custom interpretation with specific focus
This example shows how to provide a custom prompt to guide the LLM’s analysis toward specific epidemiological concerns.
Prepare the data and create a visualisation
# Filter the example lab data to Klebsiella pneumoniae specimens from January to June 2023
pyramid_data <- epiviz::lab_data %>%
  filter(
    organism_species_name == "KLEBSIELLA PNEUMONIAE",
    specimen_date >= as.Date("2023-01-01"),
    specimen_date <= as.Date("2023-06-30")
  )
# Create the age-sex pyramid
pyramid_plot <- age_sex_pyramid(
  dynamic = FALSE,
  params = list(
    df = pyramid_data,
    var_map = list(dob_var = "date_of_birth", sex_var = "sex"),
    grouped = FALSE,
    mf_colours = c("#440154", "#2196F3"),
    x_axis_title = "Number of cases",
    y_axis_title = "Age group (years)",
    legend_title = "Klebsiella pneumoniae cases by age and sex (H1 2023)"
  )
)
# Display the plot
print(pyramid_plot)
Interpret with custom epidemiological focus
# Use LLM with a custom prompt focused on public health implications
custom_interpretation <- llm_interpret(
  input = pyramid_plot,
  word_limit = 150,
  prompt_extension = "As a public health epidemiologist, analyze this age-sex pyramid for Klebsiella pneumoniae cases. Identify which demographic groups are most at risk, discuss potential risk factors, and suggest targeted prevention strategies. Consider healthcare-associated infections and community transmission patterns."
)
# Display the custom interpretation
cat("Custom Epidemiological Analysis:\n")
cat(custom_interpretation)
Interpretation: The LLM will provide a detailed epidemiological analysis focusing on risk groups, potential transmission patterns, and public health recommendations based on the demographic distribution shown in the pyramid.
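When the same kind of custom prompt is needed for several organisms or chart types, it may be easier to assemble the prompt_extension string from parts than to edit one long literal. The sketch below shows one way to do this in base R. build_prompt() and its arguments are hypothetical names introduced for illustration and are not part of epiviz.

```r
# Hypothetical helper: assemble a prompt_extension string from reusable parts
build_prompt <- function(role, chart_type, organism, focus) {
  # Opening sentence that sets the role and the subject of the chart
  intro <- sprintf(
    "As a %s, analyse this %s for %s cases.",
    role, chart_type, organism
  )
  # Append each focus sentence, separated by single spaces
  paste(c(intro, focus), collapse = " ")
}

pyramid_prompt <- build_prompt(
  role = "public health epidemiologist",
  chart_type = "age-sex pyramid",
  organism = "Klebsiella pneumoniae",
  focus = c(
    "Identify which demographic groups are most at risk.",
    "Consider healthcare-associated infections and community transmission patterns."
  )
)
cat(pyramid_prompt, "\n")
```

The resulting string can then be passed as prompt_extension = pyramid_prompt in the llm_interpret() call shown above.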
Tips for LLM interpretation
- API Key Management:
  - Store API keys securely in your .Renviron file
  - Never commit API keys to version control
  - Use different keys for different environments (development, production)
- Prompt Engineering:
  - Be specific about what you want the LLM to focus on
  - Include context about the epidemiological scenario
  - Ask for actionable insights and recommendations
  - Specify the level of detail you need
- Data Privacy:
  - Be cautious with sensitive health data
  - Consider using aggregated or anonymised data for LLM interpretation
  - Review your organisation’s data sharing policies
- Cost Management:
  - LLM API calls have costs based on usage
  - Start with smaller datasets for testing
- Quality Control:
  - Always review LLM interpretations for accuracy
  - Cross-reference with domain expertise
  - Use LLM insights as a starting point for further analysis
- Integration with Workflows:
  - Use LLM interpretation for automated report generation
  - Incorporate into surveillance dashboards
  - Generate insights for outbreak investigation
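The data privacy tip about aggregated data can be sketched as follows. This is a minimal base R illustration that collapses a record-level line list into monthly counts, so that no patient-level rows would leave your environment. The line_list data frame and its values are made up for the example. How llm_interpret() handles a data frame passed as input is not shown in this article, so check the function documentation before relying on it.

```r
# Toy line list with made-up values, standing in for record-level data
line_list <- data.frame(
  patient_id = c("P01", "P02", "P03", "P04", "P05"),
  specimen_date = as.Date(c(
    "2023-01-05", "2023-01-20", "2023-02-03", "2023-02-14", "2023-02-27"
  )),
  stringsAsFactors = FALSE
)

# Reduce each specimen date to its year and month
line_list$month <- format(line_list$specimen_date, "%Y-%m")

# Count records per month; the result has one row per month
monthly_counts <- aggregate(patient_id ~ month, data = line_list, FUN = length)

# Rename the count column so the identifier name does not carry over
names(monthly_counts) <- c("month", "detections")
print(monthly_counts)
```

Aggregation on its own does not guarantee anonymity, because very small counts can still identify individuals, so your organisation's disclosure rules still apply to the summarised table.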
