last updated: jan 2026

I build ML systems in healthcare—messy data, tight timelines, real consequences. 4+ years shipping production pipelines: constrained decoding (180x speedup), clinical RAG (3 weeks to trial), PHI de-identification (45% → 90% F1).

highlights

Constrained Decoding for Medical JSON — Pathology reports → structured JSON. 180x speedup (3 min → 1 sec), 100% schema validity. Scaling to 1M+ reports.

Caring Contacts RAG — Personalized letters of hope for patients following psychiatric discharge. Concept to clinical trial in 3 weeks. Presented at ISBD 2025 and IASR 2025.

NLP to OMOP Pipeline — Radiology reports → NLP classification → FHIR R4 → OMOP tables. End-to-end. Extended fhir.resources with Breast Radiology IG profiles.

FHIR Extraction — Needed 2M+ reports, but existing queries took 120 days. Reverse-engineered the undocumented structure. 700x speedup, found 500K missing records.

PHI De-identification — Built augmented data pipeline for hospital-specific NER. 45% → 90% F1.

Mammography Classification — Sorted DICOMs by artifact type when metadata was too inconsistent to rely on. EfficientNetB0 + active learning loop.

currently

Research Software Engineer · Sunnybrook Research Institute · Toronto

skills

LLMs & NLP · Constrained Decoding · RAG · vLLM · llama.cpp
Quantization · Active Learning · Synthetic Data Generation
PyTorch · FHIR · OMOP · Postgres · Docker · FastAPI

education

BEng, Biomedical Engineering — Toronto Metropolitan University