resume
I build ML systems in healthcare—messy data, tight timelines, real consequences. 4+ years shipping production pipelines: constrained decoding (180x speedup), clinical RAG (3 weeks to trial), PHI de-identification (45% → 90% F1).
highlights
Constrained Decoding for Medical JSON — Pathology reports → structured JSON. 180x speedup (3 min → 1 sec), 100% schema validity. Scaling to 1M+ reports.
Caring Contacts RAG — Personalized letters of hope for patients following psychiatric discharge. Concept to clinical trial in 3 weeks. Presenting at ISBD 2025 and IASR 2025.
NLP to OMOP Pipeline — Radiology reports → NLP classification → FHIR R4 → OMOP tables. End-to-end. Extended fhir.resources with Breast Radiology IG profiles.
FHIR Extraction — Needed 2M+ reports, but existing queries took 120 days. Reverse-engineered the undocumented structure. 700x speedup, found 500K missing records.
PHI De-identification — Built augmented data pipeline for hospital-specific NER. 45% → 90% F1.
Mammography Classification — Sorted DICOMs by artifact type when metadata was too inconsistent to rely on. EfficientNetB0 + active learning loop.
currently
Research Software Engineer · Sunnybrook Research Institute · Toronto
skills
LLMs & NLP · Constrained Decoding · RAG · vLLM · llama.cpp
Quantization · Active Learning · Synthetic Data Generation
PyTorch · FHIR · OMOP · Postgres · Docker · FastAPI
education
BEng, Biomedical Engineering — Toronto Metropolitan University