Portfolio projects

Build production AI systems you can actually put on your resume.

Each project includes an architecture walkthrough, stack choices, implementation guidance, deployment notes, an evaluation rubric, and ready-to-use resume bullet points.

Every project includes:

Architecture overview and design rationale
Recommended tech stack with alternatives
Step-by-step implementation guidance
Evaluation rubric and quality criteria
Deployment guide (Vercel, Railway, Fly.io)
3 ready-to-use resume bullet points

Free projects

Open access
Intermediate
10h

Production RAG Chatbot

Build an end-to-end RAG chatbot that ingests documents, chunks and embeds them, retrieves relevant context with hybrid search, reranks results, and returns grounded, cited answers.

Skills

RAG, Vector Search, Embeddings, Python

Stack

Python, FastAPI, LangChain or custom, OpenAI

Built a production RAG chatbot with hybrid retrieval (BM25 + dense), reranking, and citation grounding, reducing hallucination rate by ~40%
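
One common way to merge the BM25 and dense result lists mentioned above is reciprocal rank fusion (RRF). The sketch below assumes RRF is used; the project text does not name a fusion method. The function name, the document IDs, and the constant `k=60` are illustrative choices.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a keyword retriever and a vector retriever.
bm25_hits = ["a", "b", "c"]
dense_hits = ["b", "c", "d"]
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
```

The fused list would then be passed to the reranker, which only has to score a short candidate set rather than the whole corpus.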

Intermediate
12h

Agentic Customer Support Assistant

Design and build a tool-calling AI agent that handles customer support tasks: looking up orders, checking policies, escalating to humans, and logging outcomes — with guardrails and audit trails.

Skills

Agents, Tool Calling, Guardrails, Memory

Stack

Python, OpenAI function calling, FastAPI, Redis (session memory)

Built a production agentic customer support system with 12 tools, structured outputs, and automatic escalation to human agents
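
The guardrail and audit-trail idea in this project can be illustrated without any model in the loop. The following is a minimal sketch under stated assumptions: the `tool` decorator, `dispatch`, the `lookup_order` stub, and the in-memory `AUDIT_LOG` are all hypothetical names, and a real system would receive the tool name and JSON arguments from the model's function-calling output.

```python
import json
from typing import Any, Callable, Dict, List

TOOLS: Dict[str, Callable[..., Any]] = {}   # allowlist of callable tools
AUDIT_LOG: List[Dict[str, Any]] = []        # stand-in for a durable audit store


def tool(name: str):
    """Register a function under a tool name the model is allowed to call."""
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        TOOLS[name] = fn
        return fn
    return decorator


@tool("lookup_order")
def lookup_order(order_id: str) -> Dict[str, str]:
    # Stub: a real implementation would query the order system.
    return {"order_id": order_id, "status": "shipped"}


def dispatch(name: str, arguments_json: str) -> Dict[str, Any]:
    """Run an allowlisted tool; escalate to a human on anything unexpected."""
    entry: Dict[str, Any] = {"tool": name, "arguments": arguments_json}
    try:
        if name not in TOOLS:
            raise LookupError(f"unknown tool: {name}")
        # Malformed JSON raises ValueError; wrong argument names raise TypeError.
        result = TOOLS[name](**json.loads(arguments_json))
        entry["outcome"] = "ok"
    except (LookupError, TypeError, ValueError) as exc:
        result = {"escalate": True, "reason": str(exc)}
        entry["outcome"] = "escalated"
    AUDIT_LOG.append(entry)  # every call is logged, successful or not
    return result
```

Routing every call through one `dispatch` function keeps the allowlist check, the escalation path, and the audit entry in a single place.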

Intermediate
8h

Document Q&A Platform

Build a multi-document Q&A platform where users can upload PDFs, ask questions, and get answers grounded in specific document sections — with permission-aware retrieval for multi-tenant scenarios.

Skills

RAG, Multi-tenant, Permissions, PDF

Stack

Next.js, Python, LangChain, Pinecone namespaces

Built multi-tenant document Q&A platform with permission-aware retrieval using Pinecone namespaces and row-level auth
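
Permission-aware retrieval comes down to filtering candidates before they reach the prompt. The sketch below uses an in-memory list as a stand-in for the vector store, so it does not show any Pinecone client calls; `Chunk`, `permitted_chunks`, and the sample IDs are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Chunk:
    tenant_id: str   # plays the role of a per-tenant namespace
    doc_id: str
    text: str
    score: float     # similarity score from the retriever


def permitted_chunks(
    candidates: List[Chunk],
    tenant_id: str,
    allowed_doc_ids: Set[str],
    top_k: int = 3,
) -> List[Chunk]:
    """Keep only chunks in the caller's tenant and in documents they may read."""
    visible = [
        c for c in candidates
        if c.tenant_id == tenant_id and c.doc_id in allowed_doc_ids
    ]
    return sorted(visible, key=lambda c: c.score, reverse=True)[:top_k]


# Hypothetical candidates: the highest-scoring chunk belongs to another tenant.
candidates = [
    Chunk("acme", "handbook", "Refunds take 5 days.", 0.90),
    Chunk("acme", "payroll", "Salary bands ...", 0.80),
    Chunk("globex", "handbook", "Refunds take 9 days.", 0.99),
]
context = permitted_chunks(candidates, tenant_id="acme", allowed_doc_ids={"handbook"})
```

In a hosted vector store the tenant check would typically be pushed into the query itself, so that other tenants' chunks are never returned in the first place.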

Intermediate
8h

Synthetic Evaluation Dataset Generator

Build a pipeline that generates synthetic question-answer pairs from source documents using LLMs, validates quality with LLM-as-judge, and produces ready-to-use evaluation datasets for RAG systems.

Skills

Evaluation, RAG, Synthetic Data, LLM-as-judge

Stack

Python, OpenAI, RAGAS, Pandas

Built synthetic eval dataset pipeline generating 500+ QA pairs from internal docs, filtered by LLM judge quality threshold
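
The quality-threshold filter in this pipeline can be sketched with the judge passed in as a plain function. Everything here is an assumption for illustration: `filter_by_judge`, the 0.7 threshold, and especially `overlap_judge`, which is a toy word-overlap heuristic standing in for a real LLM-as-judge call.

```python
from typing import Callable, Dict, List

QAPair = Dict[str, str]  # expects "question", "answer", and "context" keys


def filter_by_judge(
    pairs: List[QAPair],
    judge: Callable[[QAPair], float],
    threshold: float = 0.7,
) -> List[Dict[str, object]]:
    """Score each pair with the judge and keep those at or above the threshold."""
    kept: List[Dict[str, object]] = []
    for pair in pairs:
        score = judge(pair)
        if score >= threshold:
            kept.append({**pair, "judge_score": score})
    return kept


def overlap_judge(pair: QAPair) -> float:
    """Toy judge: fraction of answer words that appear in the source context."""
    answer_words = pair["answer"].lower().split()
    if not answer_words:
        return 0.0
    context_words = set(pair["context"].lower().split())
    return sum(word in context_words for word in answer_words) / len(answer_words)


# Hypothetical generated pairs; the second answer is not supported by its context.
generated = [
    {"question": "What is the capital?", "answer": "paris", "context": "the capital is paris"},
    {"question": "What is the largest animal?", "answer": "blue whale", "context": "the capital is paris"},
]
dataset = filter_by_judge(generated, overlap_judge)
```

Keeping the judge as a parameter makes it easy to swap the heuristic for a model-backed scorer and to unit-test the filter without network calls.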

Premium projects

Subscription
Advanced
14h

LLM Observability Dashboard

Build a tracing and observability dashboard for LLM applications that captures token usage, latency distributions, tool-call outcomes, evaluation scores, and cost-per-request trends over time.

Skills

LLMOps, Observability, Tracing, Cost Monitoring

Stack

Python, OpenTelemetry, ClickHouse or TimescaleDB, Grafana or custom Next.js

Built LLM observability platform tracking token cost, latency p95, and eval scores across 50k+ daily LLM calls
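
Two of the metrics named in this project, cost per request and p95 latency, are simple to compute once each call is recorded. This is a rough sketch: the `LLMCall` record, the function names, and the per-1k-token prices in the usage lines are made-up values, and the percentile uses the nearest-rank definition, which is only one of several conventions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LLMCall:
    prompt_tokens: int
    completion_tokens: int
    latency_ms: float


def call_cost(call: LLMCall, usd_per_1k_prompt: float, usd_per_1k_completion: float) -> float:
    """Cost of one request given separate prompt and completion token prices."""
    return (
        call.prompt_tokens / 1000 * usd_per_1k_prompt
        + call.completion_tokens / 1000 * usd_per_1k_completion
    )


def p95_latency(calls: List[LLMCall]) -> float:
    """Nearest-rank 95th percentile: the smallest value with at least 95% of calls at or below it."""
    if not calls:
        raise ValueError("need at least one call")
    ordered = sorted(call.latency_ms for call in calls)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n), avoids float rounding
    return ordered[rank - 1]


# Hypothetical trace data: 100 calls with latencies of 1..100 ms.
calls = [LLMCall(prompt_tokens=1000, completion_tokens=500, latency_ms=float(i)) for i in range(1, 101)]
latency = p95_latency(calls)
cost = call_cost(calls[0], usd_per_1k_prompt=0.5, usd_per_1k_completion=1.5)  # illustrative prices
```

At dashboard scale these aggregates would more likely be computed in the database, but the same definitions apply.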

Advanced
10h

AI Interview Coach

Build an AI-powered interview coach that gives engineers practice with behavioral and system design questions, evaluates their answers using LLM-as-judge, and tracks improvement over time.

Skills

Interview Prep, LLM-as-judge, System Design, Evaluation

Stack

Next.js, OpenAI, Supabase, Firebase Auth

Built AI interview coach using LLM-as-judge evaluation with per-rubric scoring across clarity, depth, and tradeoff awareness

Advanced
12h

LLM Evaluation Harness

Build a reusable evaluation framework for LLM applications with dataset management, multiple evaluator types (deterministic, LLM-as-judge, human), regression test runner, and result dashboards.

Skills

Evaluation, LLMOps, Testing, CI/CD

Stack

Python, FastAPI, Supabase, OpenAI

Built LLM evaluation harness with deterministic, LLM-as-judge, and human evaluator types, used to gate prompt version releases
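
A regression gate of the kind described here can be reduced to three pieces: an evaluator, a suite runner, and a comparison against a baseline score. The sketch below shows only a deterministic evaluator; `exact_match`, `run_suite`, `passes_gate`, and the 0.02 tolerance are hypothetical, and `generate` stands in for whatever prompt version is under test.

```python
from typing import Callable, Dict, List

Example = Dict[str, str]  # expects "input" and "expected" keys


def exact_match(expected: str, actual: str) -> float:
    """Deterministic evaluator: 1.0 on a case-insensitive, whitespace-trimmed match."""
    return 1.0 if expected.strip().lower() == actual.strip().lower() else 0.0


def run_suite(
    dataset: List[Example],
    generate: Callable[[str], str],
    evaluator: Callable[[str, str], float],
) -> float:
    """Return the mean evaluator score of `generate` over the dataset."""
    if not dataset:
        raise ValueError("dataset is empty")
    scores = [evaluator(ex["expected"], generate(ex["input"])) for ex in dataset]
    return sum(scores) / len(scores)


def passes_gate(candidate_score: float, baseline_score: float, tolerance: float = 0.02) -> bool:
    """Allow a release only if the candidate is no worse than baseline minus tolerance."""
    return candidate_score >= baseline_score - tolerance


# Hypothetical dataset and a canned "model" standing in for a new prompt version.
dataset = [
    {"input": "2+2", "expected": "4"},
    {"input": "capital of France", "expected": "Paris"},
]
canned = {"2+2": "4", "capital of France": "paris "}
score = run_suite(dataset, lambda question: canned[question], exact_match)
release_ok = passes_gate(score, baseline_score=0.9)
```

An LLM-as-judge or human evaluator would slot into the same `evaluator` parameter, as long as it returns a score for an expected/actual pair.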

Advanced
16h

MCP Server Marketplace

Build a minimal MCP server registry where developers can publish tool definitions, browse available servers, and connect them to agent applications with auth and versioning.

Skills

MCP, Agents, API Design, Security

Stack

Node.js, Next.js, Supabase, Redis

Built MCP server marketplace with schema validation, versioning, and auth-gated tool discovery for agent integrations
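
The schema-validation and versioning rules of such a registry can be sketched in a few lines, shown here in Python for consistency with the other examples even though this project's stack is Node.js. The `Registry` class and its rules are assumptions: it requires `name`, `description`, and `inputSchema` on every tool (a stricter policy chosen for this sketch), accepts only `major.minor.patch` versions, and leaves auth out entirely.

```python
from typing import Any, Dict, List, Tuple

REQUIRED_FIELDS = ("name", "description", "inputSchema")
Version = Tuple[int, int, int]


def validate_tool(definition: Dict[str, Any]) -> List[str]:
    """Return a list of problems with a tool definition (empty means valid)."""
    errors = [f"missing field: {field}" for field in REQUIRED_FIELDS if field not in definition]
    schema = definition.get("inputSchema")
    if schema is not None and not (isinstance(schema, dict) and schema.get("type") == "object"):
        errors.append("inputSchema must be a JSON Schema with type 'object'")
    return errors


def parse_version(version: str) -> Version:
    """Parse 'major.minor.patch'; raises ValueError on any other shape."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


class Registry:
    def __init__(self) -> None:
        self._servers: Dict[str, Dict[Version, List[Dict[str, Any]]]] = {}

    def publish(self, server: str, version: str, tools: List[Dict[str, Any]]) -> None:
        """Validate every tool, then accept only strictly newer versions."""
        errors = [error for tool in tools for error in validate_tool(tool)]
        if errors:
            raise ValueError("; ".join(errors))
        parsed = parse_version(version)
        published = self._servers.setdefault(server, {})
        if published and parsed <= max(published):
            raise ValueError(f"version {version} is not newer than the latest published version")
        published[parsed] = tools

    def latest(self, server: str) -> List[Dict[str, Any]]:
        """Tool definitions of the highest published version."""
        published = self._servers[server]
        return published[max(published)]
```

Comparing versions as integer tuples avoids the string-ordering trap where "1.10.0" would sort before "1.9.0".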

Advanced
14h

Multi-Agent Code Review Assistant

Build a multi-agent system where specialized sub-agents review code for security, performance, test coverage, and documentation quality — coordinated by a planner agent with human approval gates.

Skills

Multi-agent, Code Review, LangGraph, GitHub

Stack

Python, LangGraph, OpenAI, GitHub API

Built multi-agent code review system with 4 specialized sub-agents orchestrated by a planner using LangGraph state machines
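
The coordination pattern in this project, a planner fanning out to specialized reviewers and pausing for human approval, can be shown in plain Python before any graph framework is involved. This sketch deliberately does not use LangGraph; the reviewers are string-matching stubs rather than LLM agents, and `Finding`, `plan_and_review`, and the `approve` callback are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Finding:
    reviewer: str
    severity: str  # "info", "warning", or "blocker"
    message: str


Reviewer = Callable[[str], List[Finding]]


def security_reviewer(diff: str) -> List[Finding]:
    # Stub check; a real sub-agent would prompt a model with the diff.
    if "eval(" in diff:
        return [Finding("security", "blocker", "eval() on untrusted input")]
    return []


def docs_reviewer(diff: str) -> List[Finding]:
    if "def " in diff and '"""' not in diff:
        return [Finding("docs", "warning", "new function lacks a docstring")]
    return []


def plan_and_review(
    diff: str,
    reviewers: Dict[str, Reviewer],
    approve: Callable[[List[Finding]], bool],
) -> Dict[str, object]:
    """Run every reviewer, then ask a human to sign off if any blocker was found."""
    findings = [f for name in sorted(reviewers) for f in reviewers[name](diff)]
    blockers = [f for f in findings if f.severity == "blocker"]
    if blockers and not approve(blockers):
        return {"status": "changes_requested", "findings": findings}
    return {"status": "approved", "findings": findings}


# Hypothetical diff and a human gate that rejects anything with blockers.
risky_diff = "def run(x):\n    return eval(x)\n"
result = plan_and_review(
    risky_diff,
    {"security": security_reviewer, "docs": docs_reviewer},
    approve=lambda blockers: False,
)
```

In a graph framework, each reviewer would become a node and the approval callback an interrupt point, but the data flowing between them can keep this shape.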

Intermediate
12h

AI SaaS Starter Template

Build a production-ready AI SaaS template with Google auth, Stripe subscriptions, content gating, SEO pages, admin dashboard, and an AI feature — ready to clone and monetize.

Skills

AI SaaS, Stripe, Firebase, Next.js

Stack

Next.js, TypeScript, Tailwind CSS, Firebase

Built production AI SaaS template with Google OAuth, Stripe subscriptions, webhook sync, and middleware-based content gating

Unlock all premium projects

Get full access to all 7 premium projects with complete implementation guides, evaluation rubrics, and resume bullet templates — plus all premium lessons and quizzes.

See pricing
