Applied AI · NLP · Research
Applied AI · Capital One
I lead a Data Science and AI team within the Card Intelligence group at Capital One. Over my career, I have worked on projects in clinical NLP, generative AI, information extraction, and retrieval. I hold a Ph.D. in Computer Science, with publications in venues such as SIGIR, ACL, NAACL, EMNLP, ECIR, and AMIA. Beyond day-to-day team work, I co-author papers with my colleagues at Capital One and stay close to where the field is heading. I'm particularly drawn to technology that earns its place quietly: reducing friction and fitting into daily life without demanding attention.
LLMs often produce output that is conceptually correct but violates formal constraints such as word limits or formatting rules. This paper proposes a multi-agent workflow that separates optimization of the core task from optimization of its output constraints, using quantitative compliance scores as iterative feedback signals. The method yields significantly higher instruction-following scores on Llama 3.1 8B and Mixtral-8x7B without any model fine-tuning.
arXiv · 2026
This work proposes an automated multi-stage pipeline for checking marketing content against compliance requirements, without relying on external knowledge bases. It benchmarks two fine-tuning strategies (SFT vs. GRPO) and evaluates how reasoning tokens improve smaller models' ability to detect violations. The study also systematically tests how different reward function combinations shape model behavior under GRPO training.
arXiv · Dec 2025
GRAID tackles data scarcity in harmful text classification by generating training examples that are geometrically spread across the embedding space, then diversifying them stylistically through a multi-agent reflection loop. The pipeline is model-agnostic and domain-agnostic, designed to improve guardrail coverage without manual annotation. On two benchmark datasets it achieves an average F1 gain of 12% over baselines.
EMNLP 2025
Using Apple's FoundationModels to generate stories calibrated to the learner's vocabulary, entirely on-device, with no API key needed.
Building a food tracking app with fast on-device search using BM25, without any cloud dependencies.
The surprisingly tricky problem of sub-millisecond timing on mobile and how AVAudioEngine solves it.
A deep dive into the differences between qualitative and quantitative content analysis and when to use each approach.