
⚡ The Problem & Why It Mattered To Me
Rare disease trials are few and scattered, and they change without warning.
Manually re-checking ClinicalTrials.gov (opening every record, scanning for new studies or status changes) ate hours I’d rather spend reading the actual data. I needed updates to come to me, not the other way around.
⚡ TL;DR
A Python script that:
- Queries the ClinicalTrials.gov API on demand.
- Compiles a tidy table of new or updated trials for my keywords.
- Outputs a CSV/Excel file for quick filtering.
One run surfaces changes in seconds, with no slogging through the website (and v2 will email the delta straight to my inbox).
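To make the "queries on demand" step concrete, here is a minimal sketch of one keyword-and-date query. It assumes the public v2 REST endpoint of ClinicalTrials.gov and its `query.cond`, `query.term`, and `pageSize` parameters; the names `API_URL`, `build_params`, and `fetch_studies` are hypothetical, and the real script may be organized differently.

```python
import requests

# Assumption: the public v2 REST endpoint of ClinicalTrials.gov.
API_URL = "https://clinicaltrials.gov/api/v2/studies"


def build_params(keywords, updated_since, page_size=100):
    """Build the query-string parameters for one search.

    The parameter names follow the v2 API as I understand it; treat them
    as assumptions and check them against the official API reference.
    """
    return {
        # Match any of the keywords in the condition/disease field.
        "query.cond": " OR ".join(f'"{kw}"' for kw in keywords),
        # Keep only records last updated on or after `updated_since` (YYYY-MM-DD).
        "query.term": f"AREA[LastUpdatePostDate]RANGE[{updated_since},MAX]",
        "pageSize": page_size,
    }


def fetch_studies(keywords, updated_since):
    """Run one on-demand query and return the raw study records (first page only)."""
    response = requests.get(
        API_URL, params=build_params(keywords, updated_since), timeout=30
    )
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json().get("studies", [])
```

A call such as `fetch_studies(["Fabry disease"], "2024-01-01")` would return the first page of matching records; the keyword and date here are placeholders, not the ones the project tracks.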
🛠️ Approach (data flow)
ClinicalTrials.gov API
↓ (keyword & date filters)
JSON response
↓
Pandas ETL pipe
↓
Output: Excel update
Core libs: requests, pandas, smtplib
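The pandas stage of the flow above can be sketched roughly as follows: flatten the nested JSON into one row per trial, then compare against the table saved by the previous run to label each trial as new or updated. The JSON paths (`protocolSection`, `identificationModule`, `statusModule`) are assumptions about the v2 response shape, and the column names and the functions `flatten` and `label_changes` are hypothetical.

```python
import pandas as pd

COLUMNS = ["nct_id", "title", "status", "last_update"]


def flatten(studies):
    """Turn raw study records (nested JSON) into a flat table, one row per trial."""
    rows = []
    for study in studies:
        protocol = study.get("protocolSection", {})
        ident = protocol.get("identificationModule", {})
        status = protocol.get("statusModule", {})
        rows.append({
            "nct_id": ident.get("nctId"),
            "title": ident.get("briefTitle"),
            "status": status.get("overallStatus"),
            "last_update": status.get("lastUpdatePostDateStruct", {}).get("date"),
        })
    return pd.DataFrame(rows, columns=COLUMNS)


def label_changes(current, previous):
    """Return only the trials that are new or changed since the previous run."""
    merged = current.merge(
        previous[["nct_id", "status", "last_update"]],
        on="nct_id", how="left", suffixes=("", "_prev"), indicator=True,
    )
    # Rows with no match in the previous table are new trials.
    is_new = merged["_merge"] == "left_only"
    # Known trials count as updated if their status or update date differs.
    is_updated = ~is_new & (
        (merged["status"] != merged["status_prev"])
        | (merged["last_update"] != merged["last_update_prev"])
    )
    merged["change"] = "unchanged"
    merged.loc[is_new, "change"] = "new"
    merged.loc[is_updated, "change"] = "updated"
    changed = merged.loc[merged["change"] != "unchanged", COLUMNS + ["change"]]
    return changed.reset_index(drop=True)
```

The result can then be written with `changes.to_csv("trial_updates.csv", index=False)`, or with `changes.to_excel(...)` if an Excel engine such as openpyxl is installed. The file name is a placeholder.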
📚 Things I Learned
- Stitching together small API calls rather than scraping HTML.
- Designing with the end output in mind—Excel vs. live dashboard dictates very different code paths.
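As an illustration of stitching small API calls together, here is a sketch of a paging loop. It assumes each response carries a `studies` list and, when more results exist, a `nextPageToken` to send back with the next request (my understanding of the v2 API, not something this post confirms). The name `iter_studies` and the `fetch_page` callback are hypothetical; the callback keeps the loop independent of how the HTTP request is made.

```python
from typing import Callable, Iterator, Optional


def iter_studies(
    fetch_page: Callable[[Optional[str]], dict],
    max_pages: int = 50,
) -> Iterator[dict]:
    """Yield study records page by page, following the continuation token.

    `fetch_page(token)` must return the decoded JSON of one page;
    `token` is None for the first request.
    """
    token = None
    for _ in range(max_pages):  # hard cap so a bad token cannot loop forever
        payload = fetch_page(token)
        yield from payload.get("studies", [])
        token = payload.get("nextPageToken")
        if not token:  # no token means this was the last page
            break
```

In practice `fetch_page` would wrap a single `requests.get` call that passes the token along; in tests it can be a plain function returning canned pages, which is much easier than faking scraped HTML.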
🤯 Toughest Bug
Determining what output I wanted. It sounds simple, but deciding exactly what my updated Excel file should look like took the longest!
🚀 If I Had Infinite Time…
- Run a cron job to schedule automatic checks and email a digest of changes straight to me.
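Since this part is not built yet, the following is only a sketch of how the digest could work with the standard library. The functions `build_digest` and `send_digest`, the record keys, the SMTP host and port, and the crontab line are all placeholders, not a finished design.

```python
import smtplib
from email.message import EmailMessage


def build_digest(changes, sender, recipient):
    """Format a list of change records (dicts) as a plain-text email."""
    message = EmailMessage()
    message["Subject"] = f"Trial watch: {len(changes)} change(s)"
    message["From"] = sender
    message["To"] = recipient
    lines = [
        f"{c['nct_id']}  {c['change']:<8} {c['status']}  {c['title']}"
        for c in changes
    ]
    message.set_content("\n".join(lines) or "No changes since the last run.")
    return message


def send_digest(message, host="localhost", port=25):
    """Send the digest through an SMTP server (host and port are placeholders)."""
    with smtplib.SMTP(host, port) as server:
        server.send_message(message)


# Hypothetical crontab entry: run the checker every day at 07:00.
# 0 7 * * * /usr/bin/python3 /path/to/trial_watch.py
```

Keeping message building separate from sending means the digest text can be checked without a mail server. A real mail provider would likely also need TLS and a login step, which this sketch leaves out.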