Do LLMs Think in English?
Models use multilingual concepts as rhetorical devices, not reasoning primitives.
Building an ML Pipeline for Microcalcification Classification in OMOP
OMOP's non-auto-incrementing PKs taught me more about database design than any tutorial.
On Building Augmented Datasets: A Practical Case Study
Why hospital-specific training data matters more than MIMIC.
When 75% Isn't Enough: Trying to Distill GPT-4 into GLiNER
Prompted probabilities aren't real probabilities. That's why this failed.
Road to a SOTA PII Model
From PhysioNet heuristics tagging 'pain' as a name to 75% F1.
Using Decoder-Only LLMs for PHI De-Identification: A Minimal Setup
Skip asking for character positions: just get spans and regex the rest.
RAG Experiments: Chunking, Retrieval, Reformulation
Proposition-level chunking beat paragraphs for pulling personal details from clinical notes.
Framing Survival Prediction as Next-Token: A Failed Experiment
Cell2Sentence orders genes by expression. Could clinical features be ordered the same way?
Building a Bookkeeping Agent: What Worked, What Didn't
I built tools for a financial agent from scratch while interviewing for an agents role. Calculator failed, DuckDB saved the day, and I found my $70/month Starbucks habit.
MedColBERT: Late Interaction Retrieval for Clinical Text
JaColBERT's training recipe fixed ColBERT's in-domain underperformance. Could it work for medical retrieval?