Alberto Purpura

Applied AI · NLP · Research


Applied AI · Capital One

I lead a Data Science and AI team within the Card Intelligence group at Capital One. Over my career, I have worked on projects spanning clinical NLP, generative AI, information extraction, and retrieval. I hold a Ph.D. in Computer Science, with publications in venues such as SIGIR, ACL, NAACL, EMNLP, ECIR, and AMIA. Beyond the day-to-day work with my team, I co-author papers with my colleagues at Capital One and stay close to where the field is heading. I'm particularly drawn to technology that earns its place quietly — reducing friction, fitting into daily life without demanding attention.

Recent publications

All papers →
Jan 2026
Enhancing LLM Instruction Following: An Evaluation-Driven Multi-Agentic Workflow for Prompt Instructions Optimization

LLMs often produce output that is conceptually correct but violates formal constraints like word limits or formatting rules. This paper proposes a multi-agentic workflow that separates optimization of the core task from its specific output constraints, using quantitative compliance scores as iterative feedback signals. The method yields significantly higher instruction-following scores on Llama 3.1 8B and Mixtral-8x7B without any model fine-tuning.

arXiv · Jan 2026
Dec 2025
A Multi-Stage Workflow for the Review of Marketing Content with Reasoning Large Language Models

This work proposes an automated multi-stage pipeline for checking marketing content against compliance requirements, without relying on external knowledge bases. It benchmarks fine-tuning strategies — SFT vs. GRPO — and evaluates how reasoning tokens improve smaller models' ability to detect violations. The study also systematically tests how different reward function combinations shape model behavior under GRPO training.

arXiv · Dec 2025
EMNLP 2025
GRAID: Synthetic Data Generation with Geometric Constraints and Multi-Agentic Reflection for Harmful Content Detection

GRAID tackles data scarcity in harmful text classification by generating training examples that are geometrically spread across the embedding space, then diversifying them stylistically through a multi-agent reflection loop. The pipeline is model-agnostic and domain-agnostic, designed to improve guardrail coverage without manual annotation. On two benchmark datasets it achieves an average F1 gain of 12% over baselines.

EMNLP 2025