Gen AI Systems
Generative AI Systems: Architecture, LLMs, RAG, and Production Considerations
Learn the architecture of generative AI systems through a beginner-friendly story covering LLMs, prompts, embeddings, RAG, guardrails, latency, cost, and production trade-offs.
Embeddings and Vector Databases: Semantic Search at Scale
Learn embeddings and vector databases through a beginner-friendly story covering semantic search, vector similarity, indexing, metadata filtering, HNSW, and production trade-offs.
RAG Architecture: Chunking, Retrieval, Reranking, and Generation
Learn Retrieval-Augmented Generation through a beginner-friendly story covering ingestion, chunking, embeddings, retrieval, reranking, context assembly, citations, evaluation, and advanced RAG patterns.
LLM Gateway and Routing: Model Selection, Fallbacks, and Cost Control
Design the gateway layer between applications and LLM providers, including model routing, provider fallback, rate limiting, semantic routing, observability, and cost tracking.
Prompt Caching and Semantic Caching: Lower Latency and Cost
Learn exact prompt caching, prefix caching, semantic caching, TTLs, invalidation, and cache safety, plus when caching LLM responses is a bad idea.
Agentic Patterns and Tool Use: ReAct, Function Calling, and Orchestration
Design LLM systems that use tools safely, including ReAct loops, function calling, planning, supervisor-worker orchestration, multi-agent patterns, and safety controls.
Streaming and Latency Optimization: TTFT, SSE, KV Cache, and Batching
Design low-latency LLM experiences using streaming, Server-Sent Events, time-to-first-token optimization, KV cache management, speculative decoding, batching, and context reduction.
Guardrails and Output Validation: Safer LLM Responses
Protect LLM systems with structured outputs, schema validation, moderation, jailbreak resistance, hallucination checks, retries, and human-in-the-loop workflows.
LLM Observability and Evaluation: Traces, Quality Metrics, and Experiments
Build observability and evaluation for LLM systems, including prompt traces, cost tracking, model versions, RAG metrics, LLM-as-judge, A/B tests, and regression datasets.