apple-health
Install
mkdir -p .claude/skills/apple-health && curl -L -o skill.zip "https://agentskills.codes/api/skills/download/14418" && unzip -o skill.zip -d .claude/skills/apple-health && rm skill.zip
Installs to .claude/skills/apple-health
Activation
This is the description your AI agent reads to decide when to run this skill — the better it matches your request, the more reliably it fires.
Reads, parses, and summarizes data from an Apple Health "Export your Data" archive: the folder containing export.xml, export_cda.xml, electrocardiograms/*.csv, and workout-routes/*.gpx. Use whenever the user mentions Apple Health, HealthKit, an Apple Health export, an export.xml file, an apple_health_export folder, ECG/EKG CSVs from an Apple Watch, sleep data from Apple Watch or WHOOP, workout routes from Apple Health, or any analysis of step counts, heart rate, energy burned, sleep stages, VO2max, HRV, or related HealthKit metrics. Trigger even when the user only says things like "I just exported my health data" or "process the export from my Watch" without naming the format explicitly; those exports are this format. Do not try to read export.xml with a normal DOM parser; it is routinely 1–5 GB and will OOM.
About this skill
Apple Health export
An Apple Health export is what you get from Health app → profile photo → Export All Health Data. It unzips to a folder with this shape:
apple_health_export/
├── export.xml # ALL HealthKit data — usually 0.5–5 GB
├── export_cda.xml # same data in HL7 CDA — usually skip unless asked
├── electrocardiograms/ # one CSV per ECG reading from Apple Watch
└── workout-routes/ # one GPX per workout with GPS
The export is large but the structure is well-defined and unchanging. The job of this skill is to turn it into something you can actually query: tidy CSVs per metric, plus convenience parsers for the three other formats.
Workflow
When the user asks anything that requires looking inside the export, the path is almost always:
- Locate the export folder. Confirm it contains export.xml. If the user only has a .zip, unzip it first.
- Run scripts/parse_export_xml.py once to fan export.xml out into one tidy CSV per HealthKit record type (plus workouts.csv, activity_summary.csv, me.json). This is the foundation: every later question is fast because you're reading 50 MB of CSV instead of streaming 1.5 GB of XML.
- Answer the user's question against the derived CSVs using pandas (or whatever fits). For ECG / GPX / CDA, use the dedicated parser script.
Re-running parse_export_xml.py is only necessary when the user re-exports their data. Cache its outputs in a parsed/ directory next to export.xml.
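One way to decide whether the cached outputs are still valid is to compare modification times. The sketch below is only an illustration: the helper name needs_reparse is hypothetical and not part of the provided scripts, and it assumes the parsed/ directory holds the CSVs written by an earlier run.

```python
from pathlib import Path


def needs_reparse(export_xml: Path, parsed_dir: Path) -> bool:
    """Return True when parsed/ is missing, empty, or older than export.xml.

    Hypothetical helper, not part of the skill's scripts.
    """
    csvs = list(parsed_dir.glob("*.csv")) if parsed_dir.is_dir() else []
    if not csvs:
        return True  # nothing cached yet
    # If export.xml changed after the oldest derived CSV was written,
    # the user has probably re-exported and the cache is stale.
    oldest_output = min(p.stat().st_mtime for p in csvs)
    return export_xml.stat().st_mtime > oldest_output
```

If this returns False, the existing CSVs in parsed/ can be read directly and the streaming pass skipped.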
Why streaming matters
export.xml is a single XML document with millions of <Record> elements. Loading it with ElementTree.parse() or lxml.etree.parse() builds the whole tree in memory, which takes many gigabytes and usually ends with the process being killed for running out of memory. Always use iterparse (in the stdlib as xml.etree.ElementTree.iterparse) and call elem.clear() after handling each element. The provided parse_export_xml.py does this correctly; prefer it over rolling your own.
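For illustration, a minimal streaming pass might look like the sketch below. It is not the code in parse_export_xml.py, and the function name count_record_types is made up for this example. It also clears the root element instead of only the current element, a small variation on the elem.clear() advice above, so that emptied elements do not pile up under the root.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_record_types(xml_path):
    """Stream the export and count <Record> elements per type attribute."""
    counts = Counter()
    context = ET.iterparse(xml_path, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == "Record":
            counts[elem.get("type")] += 1
            # Drop everything parsed so far; memory stays roughly constant
            # regardless of how many records the file contains.
            root.clear()
    return counts
```

A count like this is a cheap first look at what an export contains before committing to the full parse.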
The provided scripts
All scripts live in scripts/ and are runnable with python3 <script> --help.
| Script | What it does |
|---|---|
| parse_export_xml.py | Streams export.xml and writes one CSV per record type, plus workouts.csv, activity_summary.csv, me.json. Memory-bounded. |
| parse_ecg.py | Reads an Apple Watch ECG CSV (header + waveform) into a dict + numpy array. |
| parse_gpx.py | Reads a GPX route file into a DataFrame of track points; computes distance/elevation/duration. |
| parse_cda.py | Walks export_cda.xml (HL7 CDA) for clinical observations (labs, vitals, FHIR clinical records). |
| sleep_summary.py | Reads parsed/HKCategoryTypeIdentifierSleepAnalysis.csv and produces a per-night summary (time-in-bed, asleep, REM, deep, core, awake) per source. |
| workouts_with_routes.py | Joins workouts.csv to the GPX files in workout-routes/ so each workout has its route file path. |
Run them with no arguments for the --help output, or read the top of each file — they're short and the docstrings document the I/O.
Record types you will see
Apple Health uses opaque identifiers like HKQuantityTypeIdentifierHeartRate and HKCategoryTypeIdentifierSleepAnalysis. The full catalog observed in real exports, with units and meaning, is in references/record_types.md. Read it before answering a metric-specific question so you pick the right type and unit and don't mix up, e.g., ActiveEnergyBurned with BasalEnergyBurned.
Sleep is its own thing
Sleep data lives in HKCategoryTypeIdentifierSleepAnalysis records with these values:
- HKCategoryValueSleepAnalysisInBed: bedtime window (older devices only emit this)
- HKCategoryValueSleepAnalysisAwake: awake during sleep window
- HKCategoryValueSleepAnalysisAsleepUnspecified: asleep, stage unknown (pre-watchOS 9)
- HKCategoryValueSleepAnalysisAsleepCore / AsleepDeep / AsleepREM: modern stages
Multiple sources frequently overlap (Apple Watch, WHOOP, Sleep Cycle, iPhone). Do not naively sum durations; choose one source per night or merge carefully. The full discussion and a working aggregation are in references/sleep.md and scripts/sleep_summary.py.
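The "one source per night" rule can be sketched as follows. This is not the logic in scripts/sleep_summary.py. The column names (sourceName, value, startDate, endDate), the date format, and the noon cutoff for assigning a sample to a night are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

ASLEEP_PREFIX = "HKCategoryValueSleepAnalysisAsleep"
DATE_FORMAT = "%Y-%m-%d %H:%M:%S %z"  # assumed, e.g. "2024-01-15 23:00:00 -0800"


def night_of(start):
    """Samples starting before noon are attributed to the previous evening."""
    return (start - timedelta(hours=12)).date().isoformat()


def asleep_hours_per_night(rows, preferred_sources):
    """Return {night: (chosen_source, asleep_hours)} using one source per night."""
    totals = defaultdict(lambda: defaultdict(float))  # night -> source -> hours
    for row in rows:
        if not row["value"].startswith(ASLEEP_PREFIX):
            continue  # skip InBed and Awake samples
        start = datetime.strptime(row["startDate"], DATE_FORMAT)
        end = datetime.strptime(row["endDate"], DATE_FORMAT)
        hours = (end - start).total_seconds() / 3600
        totals[night_of(start)][row["sourceName"]] += hours

    result = {}
    for night, by_source in totals.items():
        ranked = [s for s in preferred_sources if s in by_source]
        # Fall back to the source with the most recorded sleep that night.
        chosen = ranked[0] if ranked else max(by_source, key=by_source.get)
        result[night] = (chosen, round(by_source[chosen], 2))
    return result
```

Picking a single source avoids double counting when, say, a Watch and a WHOOP both recorded the same night. Merging overlapping intervals across sources is a harder problem that this sketch does not attempt.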
Workout routes
A <Workout> element can contain a <WorkoutRoute> with a <FileReference path="/workout-routes/route_2024-01-15_10.27pm.gpx"/>. Despite the leading slash, that path is relative to the export folder root, not to the filesystem root. workouts_with_routes.py handles the join.
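The leading slash is easy to trip over when joining paths by hand, because pathlib discards the left-hand side when the right-hand side looks absolute. A minimal sketch of a safe join, where resolve_route is a hypothetical helper and not part of workouts_with_routes.py:

```python
from pathlib import Path


def resolve_route(export_root, file_reference_path):
    """Join a FileReference path onto the export folder.

    Path("apple_health_export") / "/workout-routes/x.gpx" would yield
    "/workout-routes/x.gpx" on POSIX, so strip the leading slash first.
    """
    relative = file_reference_path.lstrip("/")
    return Path(export_root) / relative
```

For example, resolve_route("apple_health_export", "/workout-routes/route_2024-01-15_10.27pm.gpx") points inside the export folder rather than at the filesystem root.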
ECG files
Each electrocardiograms/ecg_YYYY-MM-DD[_n].csv starts with an 8-line header block (Name, DOB, Recorded Date, Classification, Symptoms, Software Version, Device, Sample Rate), a blank line, two more lines (Lead,Lead I and Unit,µV), another blank line, then ~15k rows of sample_index,microvolt_value. parse_ecg.py returns (metadata_dict, numpy_array_microvolts) so it is straightforward to plot or run filters on.
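To make the layout concrete, here is a dependency-free sketch that splits such a file into metadata and samples. It is not the implementation of parse_ecg.py: it returns a plain list where that script returns a numpy array, and the function name parse_ecg_text is invented for this example. It assumes every non-numeric row is a key,value metadata line and that the last field of each numeric row is the microvolt value.

```python
import csv
import io


def parse_ecg_text(text):
    """Split ECG CSV text into (metadata dict, list of microvolt samples)."""
    metadata, samples = {}, []
    for row in csv.reader(io.StringIO(text)):
        if not any(cell.strip() for cell in row):
            continue  # blank separator line
        try:
            float(row[0])           # numeric first field marks a sample row
            value = float(row[-1])  # last field is taken as the microvolt value
        except ValueError:
            # Header rows such as "Sample Rate,512 hertz" or "Lead,Lead I"
            metadata[row[0].strip()] = ",".join(row[1:]).strip()
            continue
        samples.append(value)
    return metadata, samples
```

Classifying rows by whether they are numeric, not by fixed line offsets, keeps the sketch tolerant of small differences in header length.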
CDA / clinical records
export_cda.xml is the same data restructured as HL7 CDA. For most questions it is redundant with export.xml. Reach for it only when the user asks about clinical records imported from a provider (lab results, immunizations, conditions) — those live as <ClinicalRecord> references in export.xml pointing at FHIR JSON, and as <observation>/<organizer> in the CDA. See references/cda.md.
Sources and devices
Real exports mix data from several devices and apps (Apple Watch, iPhones, WHOOP, Sleep Cycle, Headspace, etc.). The same metric can appear from multiple sources at the same timestamp. When summing or averaging, always check sourceName and consider filtering or deduplicating — references/source_devices.md has the patterns.
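As a rough illustration of filtering by source before aggregating, the sketch below keeps one source per day and sums only that source's values. The patterns in references/source_devices.md may differ. The column names (sourceName, value, startDate) and the assumption that startDate begins with YYYY-MM-DD are guesses about the parsed CSVs, and daily_totals_one_source is a hypothetical name.

```python
from collections import defaultdict


def daily_totals_one_source(rows, preferred_sources):
    """Sum a cumulative metric per day using a single source for each day."""
    per_day = defaultdict(lambda: defaultdict(float))  # day -> source -> total
    for row in rows:
        day = row["startDate"][:10]  # assumes a leading YYYY-MM-DD
        per_day[day][row["sourceName"]] += float(row["value"])

    totals = {}
    for day, by_source in per_day.items():
        # First preferred source that reported that day, else the largest total.
        chosen = next((s for s in preferred_sources if s in by_source), None)
        if chosen is None:
            chosen = max(by_source, key=by_source.get)
        totals[day] = by_source[chosen]
    return totals
```

This suits cumulative metrics such as step count. For sampled metrics such as heart rate, averaging within one source is the analogous move.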
What to do when the user asks an open question
Examples and the recommended approach for each:
"How many steps did I average in 2024?"
→ Parse if not already parsed → read HKQuantityTypeIdentifierStepCount.csv → group by date → mean. Note that the same StepCount minute can come from both Watch and iPhone; pick one source (usually Watch when present) before summing per day.
"Show my resting heart rate trend."
→ HKQuantityTypeIdentifierRestingHeartRate.csv is already daily; just plot it.
"How well did I sleep last month?"
→ Run sleep_summary.py to get a per-night DataFrame, then filter and chart.
"Map my longest run."
→ Find the longest HKWorkoutActivityTypeRunning (or Walking) row in workouts.csv, look up its GPX path via workouts_with_routes.py, plot with parse_gpx.py.
"What does my ECG from July 9th look like?"
→ Use parse_ecg.py on the matching CSV; plot the waveform.
Default output formats: derived data → CSV in parsed/; summaries the user asked for → xlsx or PNG charts in the workspace folder.