Available for senior AI engineering roles

Building production AI
that actually ships.

Senior AI Engineer with 8+ years architecting RAG systems, ML pipelines, and vector-search products in Python — from notebook to scaled production.


8+

Years experience

40+

Models shipped

12M+

Vectors indexed

99.9%

Uptime SLO

About

Engineer first. AI native.

I help teams take AI from prototype to production — focusing on retrieval quality, evaluation harnesses, and the unglamorous infrastructure that keeps LLM products reliable.

Applied ML

Fine-tuning, evaluation, and deploying transformer models for real workloads.

Vector Systems

Designing retrieval pipelines on pgvector, Pinecone, Weaviate and Qdrant.

Data Engineering

Streaming, batch, and feature pipelines with Airflow, Spark, and dbt.

Production AI

Latency, cost, observability and guardrails for LLM systems at scale.

Skills

The stack I use to ship.

A focused toolkit refined across eight years of building data, ML, and AI systems in production.

Languages & Core

Python, SQL, TypeScript, Bash, FastAPI, Pydantic

AI / LLM

OpenAI, Anthropic, LangChain, LlamaIndex, Hugging Face, LoRA / PEFT, Prompt Engineering, RAG, Agents

Machine Learning

PyTorch, TensorFlow, scikit-learn, XGBoost, MLflow, Weights & Biases, Optuna

Vector DBs & Search

pgvector, Pinecone, Weaviate, Qdrant, FAISS, Elasticsearch, Hybrid Search, Reranking

Data Engineering

Apache Airflow, Spark, Kafka, dbt, Snowflake, BigQuery, Postgres, Redis

Cloud & MLOps

AWS SageMaker, GCP Vertex AI, Docker, Kubernetes, Terraform, GitHub Actions, Grafana

Selected work

Projects that moved real metrics.

A snapshot of recent AI systems I designed, built and shipped end-to-end.

RAG · Production

Enterprise RAG Platform

Multi-tenant retrieval system over 12M+ documents with hybrid search, reranking, and citation-grounded answers. Reduced hallucinations by 64% vs baseline.

12M+

Documents

180ms

p95 latency

64%

Hallucination ↓

Python, FastAPI, pgvector, OpenAI, Cohere Rerank, Redis
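For readers unfamiliar with hybrid search, here is a minimal sketch of one common way to merge a keyword ranking with a vector ranking: reciprocal rank fusion (RRF). The card above does not say which fusion method the platform uses, so treat RRF, the function name `reciprocal_rank_fusion`, the constant `k=60`, and the document IDs as illustrative assumptions rather than the project's actual implementation.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) per list it appears in; scores are summed.
    `k` dampens the influence of top ranks (60 is a commonly used default).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a lexical index and a vector index.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top. In a pipeline like the one described, a reranker would typically rescore the first few fused candidates before answer generation.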

LLM Agents

Agentic Research Assistant

Tool-using agent with planning, web search, and code execution. Built deterministic eval harness with 200+ scenarios and CI gating on regression.

92%

Task success

200+

Eval scenarios

Faster research

LangGraph, Anthropic, Playwright, DuckDB, Pytest
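As a rough illustration of CI gating on regression, the sketch below compares the pass rate of an eval run against a stored baseline and fails the gate when it drops by more than a tolerance. The function name `gate_on_regression`, the `tolerance` parameter, and the sample data are hypothetical. The real harness is not described in enough detail here to reproduce it.

```python
def gate_on_regression(results, baseline_pass_rate, tolerance=0.01):
    """Return True when the eval pass rate has not regressed beyond `tolerance`.

    results: list of booleans, one per eval scenario (True = scenario passed).
    baseline_pass_rate: pass rate of the last accepted run, between 0 and 1.
    """
    if not results:
        raise ValueError("no eval results to gate on")
    pass_rate = sum(results) / len(results)
    return pass_rate >= baseline_pass_rate - tolerance


# Illustrative run: 200 scenarios, 184 passing.
current_run = [True] * 184 + [False] * 16
gate_ok = gate_on_regression(current_run, baseline_pass_rate=0.92)

# Illustrative regressed run: 170 of 200 passing.
regressed_run = [True] * 170 + [False] * 30
gate_regressed = gate_on_regression(regressed_run, baseline_pass_rate=0.92)
```

In CI, a failing gate would usually translate to a non-zero exit code (for example a failed Pytest assertion) so the merge is blocked.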

MLOps · Data

Real-time ML Feature Store

Streaming feature pipeline serving 30+ models with point-in-time correctness. Migrated from batch to event-driven, cutting model staleness from hours to seconds.

30+

Models served

<2s

Feature freshness

$140k

Annual savings

Kafka, Spark, Feast, Snowflake, Airflow, Terraform
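Point-in-time correctness means a training row only sees feature values that existed at the moment of the event it describes, so no future data leaks in. A minimal standard-library sketch of that lookup follows. The function name `point_in_time_value`, the integer timestamps, and the sample history are assumptions for illustration, not the feature store's actual API.

```python
from bisect import bisect_right


def point_in_time_value(feature_history, as_of):
    """Return the latest feature value recorded at or before `as_of`.

    feature_history: list of (timestamp, value) tuples sorted by timestamp.
    Returns None when no value had been recorded yet at `as_of`.
    """
    timestamps = [ts for ts, _ in feature_history]
    # Index of the first entry strictly after `as_of`.
    idx = bisect_right(timestamps, as_of)
    return feature_history[idx - 1][1] if idx else None


# Hypothetical feature updates: (event time, value).
history = [(100, 3), (200, 5), (300, 9)]

value_mid = point_in_time_value(history, 250)     # latest update is the one at t=200
value_early = point_in_time_value(history, 50)    # nothing recorded yet
value_exact = point_in_time_value(history, 300)   # an update at exactly `as_of` counts
```

Feature stores apply the same rule as an as-of join across many entities at once. This sketch handles a single entity's history.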

Computer Vision

Vision Defect Detection

Fine-tuned vision transformer for manufacturing QA on edge devices. Active-learning loop reduced labeled data needed by 70%.

99.2%

Recall

70%

Fewer labels

45ms

Edge inference

PyTorch, ViT, ONNX, Triton, Label Studio

Experience

Eight years, one through-line.

From data pipelines to LLM platforms — building systems that earn their keep in production.

2023 — Present
Senior AI Engineer
— Confidential (Series C SaaS)
- Led design of company-wide RAG platform serving 5 product surfaces.
- Owned LLM evaluation, observability, and prompt-versioning stack.
- Mentored 4 ML engineers; set the hiring bar and wrote onboarding playbooks.
2020 — 2023
Machine Learning Engineer
— FinTech Scale-up
- Built real-time fraud models on streaming features (Kafka + Spark).
- Cut model serving cost 38% via batching, quantization, and autoscaling.
2018 — 2020
Data Scientist
— Healthcare Analytics
- Shipped NLP pipelines for clinical document classification.
- Productionized first transformer model — replaced legacy regex stack.
2016 — 2018
Data Engineer
— E-commerce
- Built batch ETL on Airflow + BigQuery powering BI for 200+ users.