Career Roadmap

Generative AI Engineer: Zero to Hero

This roadmap reflects the 2026 generative AI certification landscape, including critical exam transitions happening June-July 2026. AI-900 retires June 30, 2026 and is replaced by AI-901. AI-102 retires June 30, 2026 and is replaced by AI-103. AZ-204 retires July 31, 2026 and is replaced by AI-200. The roadmap covers both Microsoft Azure and AWS tracks, with specialization guidance for candidates who want to go deep on one platform versus maintaining multi-cloud breadth. Use ExamOS practice quizzes at every step to make progress measurable before each exam attempt.

10 steps5 certifications~6-9 months01-Jun-202643 views

Embark on your career roadmap by setting a target and staying accountable

Set target

Step 0 - Engineering and AI foundations

Build the programming, data, and conceptual foundations that every generative AI engineering task depends on. GenAI engineering is software engineering with AI components — the engineering fundamentals matter as much as the AI concepts.

3-4 weeks

Python proficiency — functions, classes, async/await, virtual environments, pip, data structures, file I/O
REST APIs and HTTP — making API calls, handling responses, authentication headers, rate limiting, error handling
JSON and data handling — parsing, serializing, working with nested structures, schema validation
Git fundamentals — version control for AI projects, branching, pull requests, managing large files
Cloud basics — what managed AI services are, API-first services, serverless compute concepts
Linear algebra intuition — vectors, dimensions, similarity, dot product (conceptual only, no deep math required)
Data fundamentals — structured versus unstructured data, what embeddings represent, why vector search works

💡 Python proficiency is explicitly required for AI-103 and AI-200. "Comfortable writing simple scripts" from the original roadmap is not sufficient — you need to be able to write real application code, handle exceptions, manage dependencies, and read SDK documentation.

💡 If Python is new to you, invest 4-6 weeks here rather than 2. Every subsequent step assumes you can write Python code fluently enough to build working applications. FastAPI, Pydantic, and async programming patterns appear throughout production GenAI development.

💡 JavaScript/TypeScript is a valid alternative for frontend and serverless GenAI work, but Python is the dominant language across both Azure and AWS AI SDKs.

Step 1 - AI and LLM fundamentals (AI-901)

Build the conceptual foundation for how modern AI systems work, focused on generative AI, agents, and Microsoft Foundry. This step also covers the first certification in the path.

2-3 weeks

How large language models work — tokens, attention mechanisms, context windows, temperature, top-p (conceptual)
Generative AI concepts — text generation, image generation, multimodal models, foundation models versus fine-tuned models
Prompt engineering fundamentals — system prompts, user prompts, few-shot examples, chain-of-thought, prompt templates
LLM limitations — hallucination, knowledge cutoffs, context window limits, bias, inconsistency
Responsible AI principles — fairness, reliability, privacy, inclusiveness, transparency, accountability
Microsoft Foundry overview — the unified platform for developing AI apps and agents, model catalog, deployments
Agents at a conceptual level — what makes a system agentic, tools, memory, planning, orchestration
Azure Content Understanding — extracting structured information from documents, images, and audio
Multimodal AI — vision models, speech-to-text, text-to-speech, combining modalities

Certifications

Microsoft Azure AI Fundamentals (AI-901)

💡 AI-901 is more implementation-oriented than AI-900 was. It tests judgment about real implementation decisions — not just awareness of what AI services exist.

💡 Use ExamOS quizzes to test conceptual GenAI understanding before sitting AI-901 or AIF-C01.

Step 2 - Microsoft Foundry and Azure AI platform

Build hands-on fluency with Microsoft Foundry as the primary development platform for Azure AI engineering. AI-103 is Foundry-centric — candidates who haven't used Foundry directly will find the exam significantly harder.

3-4 weeks

Microsoft Foundry architecture — hubs, projects, connections, deployments, model catalog
Azure OpenAI Service — model deployment, endpoints, API versions, quotas, regional availability
Model selection in Foundry — choosing between GPT-4o, GPT-4o-mini, Phi-4, Mistral, Llama and the criteria that determine the right model for a workload
Azure AI Foundry SDK — azure-ai-projects, azure-ai-inference, azure-ai-evaluation packages
Prompt Flow — prompt engineering workflows, flow creation, evaluation flows, deployment from Foundry
Azure AI Search — vector indexes, hybrid search (keyword plus vector), semantic ranking, index schema design
Azure Content Safety — content filters, harm categories, severity thresholds, custom blocklists, groundedness detection
Azure AI Evaluation SDK — evaluating response quality, groundedness, relevance, coherence at scale
Managed identities for Foundry — authentication without keys, role assignments for AI resources

Certifications

Microsoft Certified: Azure AI Apps and Agents Developer Associate (AI-103)

💡 AI-103 is Microsoft's replacement for AI-102, launching beta April 21, 2026 with GA targeted June 2026. AI-102 retires June 30, 2026. If you are starting fresh, prepare for AI-103 directly.

💡 The hardest part of AI-103 is choosing the correct Foundry component for a described scenario — model deployment versus agent versus Prompt Flow versus AI Search versus Document Intelligence versus Content Understanding. Each solves a different problem and the exam tests whether you can distinguish between them.

💡 Hands-on time in Foundry is essential. Create a Foundry hub, deploy a model, build a Prompt Flow, create a vector index in AI Search, and evaluate outputs using the evaluation SDK before your exam.

💡 The Microsoft Spring Skills Challenge 2026 offers discounted beta exam fees for AI-103. Check Microsoft Learn events for current availability.

Step 3 - RAG architecture and vector databases

Build the most important practical skill in production generative AI engineering — Retrieval-Augmented Generation systems that ground LLM responses in real organizational data.

4-5 weeks

RAG architecture fundamentals — why RAG exists, the retrieve-then-generate pattern, when RAG versus fine-tuning
Embedding models — what embeddings are, embedding dimensions, choosing embedding models, Azure OpenAI text-embedding-3-large versus small
Vector databases — Azure AI Search as a vector store, pgvector with Azure Database for PostgreSQL, Azure Cosmos DB for NoSQL vector search, Redis vector search
Chunking strategies — fixed-size chunking, semantic chunking, recursive character splitting, parent-document retrieval
Document ingestion pipelines — parsing PDFs and Office documents, handling tables and images, metadata extraction
Hybrid search — combining dense vector search with BM25 keyword search, reciprocal rank fusion, semantic reranking
RAG evaluation — groundedness (is the answer supported by the retrieved context?), relevance, coherence, RAGAS framework
Advanced RAG patterns — HyDE (hypothetical document embeddings), query expansion, contextual compression, multi-vector retrieval
Azure AI Search index design — fields, analyzers, scoring profiles, semantic configurations

Certifications

Microsoft Certified: Azure AI Apps and Agents Developer Associate (AI-103)

Azure AI Cloud Engineer Associate (AI-200)

💡 RAG is the most tested production pattern across both AI-103 and AI-200. Build at least two complete RAG applications — one using Azure AI Search directly and one using the Foundry SDK — before sitting either exam.

💡 Azure AI Search is the primary vector store tested on AI-103. Cosmos DB for NoSQL with vector search and pgvector are tested on AI-200 given its focus on back-end data services.

💡 Evaluation is not optional in production RAG systems. The AI-103 exam specifically tests evaluation methodology — candidates who only know how to build RAG without evaluating it will miss a significant portion of the exam.

💡 LangChain and LlamaIndex are widely used open-source frameworks for RAG development. Neither is explicitly tested on AI-103 or AI-200, but fluency with one significantly accelerates hands-on development throughout this step.

Step 4 - Agentic AI systems and multi-agent orchestration

Build autonomous AI agents that can plan, use tools, and complete multi-step tasks. Agentic solutions represent 35-40% of AI-103 and are the fastest-evolving area of GenAI engineering.

4-5 weeks

What agents are — the reasoning loop (think, act, observe), tool calling, memory, planning
Azure AI Foundry Agent Service — creating agents, defining tools, managing threads, streaming responses
Tool design — function tools, code interpreter, Azure AI Search grounding tool, Bing grounding tool, custom tools
Agent memory — thread-based memory in Foundry, external memory patterns, memory compression for long conversations
Semantic Kernel — the Microsoft SDK for agent orchestration, plugins, planners, kernel function patterns
AutoGen — multi-agent conversation patterns, user proxy agents, assistant agents, group chat orchestration
Multi-agent architectures — orchestrator agents, specialist sub-agents, agent handoff patterns, parallel execution
Agent evaluation — measuring task completion, tool call accuracy, hallucination in agentic contexts
Agent safety — prompt injection risks in agentic systems, tool call validation, human-in-the-loop controls
Durable agent patterns — resumable agents with Azure Durable Functions, checkpoint and resume strategies

Certifications

Microsoft Certified: Azure AI Apps and Agents Developer Associate (AI-103)

💡 Agentic solutions at 35-40% is the largest and most heavily weighted AI-103 domain. Candidates who treat agents as "prompts with tools" and don't understand orchestration, memory management, and multi-agent coordination will miss a significant portion of the exam.

💡 Azure AI Foundry Agent Service is the primary tested platform. Semantic Kernel is the primary tested SDK for orchestration. Know both.

💡 Prompt injection in agentic systems is a specific security risk that AI-103 tests under responsible AI. An agent that processes untrusted content (emails, web pages, user-uploaded documents) can be manipulated through indirect prompt injection. Know what this is and what architectural controls mitigate it.

💡 Build at least one multi-agent system before your exam — a simple orchestrator with two specialist agents completing a multi-step task is sufficient to develop the intuition the exam tests.

Step 5 - AI cloud infrastructure and back-end development (AI-200)

Build the back-end cloud infrastructure that AI applications run on — containers, event-driven pipelines, vector data services, security, and observability. AI-200 is the back-end complement to AI-103's application and agent focus.

4-5 weeks

Containerized AI applications — Docker for AI apps, Azure Container Registry, Azure Container Apps, AKS for AI workloads, KEDA for event-driven scaling
Azure Cosmos DB for NoSQL with vector search — vector index configuration, similarity search, partitioning for AI workloads
Azure Database for PostgreSQL with pgvector — vector extension setup, ivfflat and hnsw indexes, similarity operators
Azure Cache for Redis — semantic caching for LLM responses, vector search in Redis, cache-aside patterns for AI
Event-driven AI pipelines — Azure Event Grid for AI triggers, Azure Service Bus for reliable message delivery, Azure Functions for serverless AI processing
Serverless AI processing — Azure Functions triggers for AI workloads, Durable Functions for long-running AI workflows
AI application security — Azure Key Vault for API key management, Managed Identities for keyless access, RBAC for AI endpoints, network isolation for AI services
Azure API Management for AI — rate limiting AI endpoints, token quota management, semantic caching at the gateway layer
Distributed observability — Azure Monitor for AI applications, Application Insights distributed tracing, OpenTelemetry for AI workloads, KQL for token usage analysis

Certifications

Azure AI Cloud Engineer Associate (AI-200)

💡 AI-200 is explicitly about the back-end of AI solutions. Where AI-103 tests whether you can build the AI application and agent layer, AI-200 tests whether you can wire those applications into a production cloud infrastructure. Both credentials together form the most complete Azure GenAI engineering profile available in 2026.

💡 Candidates who are primarily AI researchers or data scientists will find AI-200 more demanding than AI-103 because of its infrastructure and cloud development requirements. Candidates coming from AZ-204 development experience will find AI-200 more accessible.

💡 Azure API Management's AI Gateway capabilities (token quota management, semantic caching, load balancing across multiple Azure OpenAI deployments) are new content specific to AI-200 that has no equivalent in AZ-204.

Step 6 - AWS generative AI track (parallel or alternative path)

Build generative AI engineering skills on AWS using Amazon Bedrock, Amazon Q, and the AWS AI services ecosystem. Follow this track if your organization runs primarily on AWS or if you want multi-cloud GenAI credibility.

4-6 weeks

Amazon Bedrock — foundation model access, model providers (Anthropic Claude, Meta Llama, Mistral, Amazon Titan), API patterns
Bedrock Agents — creating agents, action groups, knowledge bases, guardrails, agent collaboration
Amazon Bedrock Knowledge Bases — RAG implementation on AWS, OpenSearch Serverless as vector store, S3 data sources
Amazon Bedrock Guardrails — content filtering, topic denial, PII redaction, grounding checks
AWS Bedrock model evaluation — automatic evaluation, human evaluation, custom metrics
Amazon Q Developer and Amazon Q Business — AI assistant deployment for enterprise, customization with company data
Amazon SageMaker AI — fine-tuning foundation models, model deployment, inference endpoints
AWS Lambda for AI workloads — triggering Bedrock from Lambda, streaming responses, error handling
AWS AI services — Amazon Transcribe, Amazon Comprehend, Amazon Textract for document processing in AI pipelines

Certifications

AWS Certified Generative AI Developer - Professional (AIP-C01)

💡 AWS Certified AI Practitioner (AIP-C01) is the associate-level AWS AI credential validated specifically for generative AI developers. It covers Amazon Bedrock, responsible AI, AI security, and the full AWS AI services stack. 85 questions, 120 minutes, 700/1000 passing score, $150 USD.

💡 AIF-C01 (AWS Certified AI Practitioner Foundational) is the entry-level equivalent — appropriate at Step 1 for the AWS track. AIP-C01 is the associate-level credential for this step.

💡 Candidates who have completed the Azure track (AI-103 and AI-200) will find the AWS track more approachable because the underlying patterns — RAG, agents, evaluation, safety, observability — are the same across platforms. The service names and APIs differ, not the concepts.

Step 7 - Production AI systems, security, and responsible AI

Build production-ready AI applications that are secure, observable, cost-controlled, and compliant with responsible AI requirements. This is where the gap between demo applications and enterprise deployments lives.

3-4 weeks

AI security in production — prompt injection defense strategies, input validation for LLM inputs, output validation before display
Content safety at scale — Azure Content Safety integration patterns, configuring harm thresholds by use case
Cost management for LLM applications — token budgeting, prompt compression, model routing (expensive model for complex tasks, cheap model for simple ones)
Latency optimization — streaming responses, caching strategies, async processing, choosing the right model size
Rate limiting and retry patterns — exponential backoff for API throttling, circuit breakers for AI services
Model version management — handling model deprecations, pinning API versions, testing before migration
AI monitoring in production — tracking token usage, latency percentiles, error rates, cost per conversation
Responsible AI in practice — bias testing, red-teaming AI applications, documenting model cards, responsible AI impact assessment
AI compliance — GDPR implications for LLM data processing, EU AI Act obligations for high-risk AI systems, organizational AI governance
Disaster recovery for AI applications — multi-region Azure OpenAI deployment, Bedrock cross-region failover

💡 Prompt injection is the OWASP

💡 Cost is the most consistently underestimated production concern for GenAI applications. A poorly designed RAG system that retrieves too much context per query can have 10-50x higher costs than a well-designed equivalent. Token budgeting and context compression are engineering skills, not afterthoughts.

💡 The EU AI Act's risk tiers (minimal, limited, high, unacceptable) are increasingly relevant for AI engineers in regulated industries. High-risk AI systems require conformity assessments, documentation, and human oversight. Know which categories your applications fall into.

💡 Use ExamOS to practice responsible AI and security scenario questions that test the reasoning the AI-103 planning domain requires.

Step 8 - Advanced agentic patterns and emerging capabilities

Stay current with the fastest-evolving area of GenAI engineering. Agentic architectures are evolving faster than any certification syllabus can track — this step is about building the continuous learning habit alongside the formal credentials.

Ongoing

Model Context Protocol (MCP) — the emerging standard for agent-tool communication, server implementations, client integration
Computer use agents — agents that control browser and desktop environments, Azure AI Foundry integration
Long-horizon task completion — agents that work across hours or days, persistent state management, human checkpoint design
Multimodal agents — agents that process images, audio, and video alongside text, Azure Content Understanding integration
Agent evaluation at scale — measuring task completion rates across diverse test sets, regression testing for agentic systems
Fine-tuning versus RAG versus prompting — choosing the right knowledge injection strategy for different accuracy and cost requirements
Small language models (SLMs) — Phi-4, Phi-4-mini on Azure, edge deployment, when SLMs outperform large models
Reasoning models — o1, o3-mini deployment patterns, when reasoning models justify their higher cost and latency

💡 MCP (Model Context Protocol) is becoming the dominant standard for agent-tool communication. While not yet explicitly tested on AI-103, it is increasingly present in production implementations and job descriptions. Understanding the protocol at a conceptual level future-proofs your agent architecture knowledge.

💡 The generative AI engineering field is moving faster than any certification syllabus. The credentials in this roadmap validate a point-in-time snapshot of best practices. Building the daily practice habit — following Azure AI updates, AWS Bedrock release notes, and the academic papers that inform production practice — is what keeps your engineering judgment current between certification cycles.

💡 ExamOS tracks the AI certification landscape actively. Use daily scenario practice to maintain the reasoning sharpness that production GenAI work requires, not just during pre-exam sprints.

Final step - Certification, validation, and the 2026 exam transition

The 2026 Microsoft AI certification transition is the most significant restructuring in years. AI-900 retires June 30, 2026 (replaced by AI-901). AI-102 retires June 30, 2026 (replaced by AI-103). AZ-204 retires July 31, 2026 (replaced by AI-200). If you are mid-preparation for any of the retiring exams, the decision is clear: sit before the retirement date if you are close to ready, or prepare for the replacement if you are just starting. The credentials that result from the new exams are more aligned with what generative AI engineering actually requires in 2026 — both for the exam and for the job. Before booking AI-103, have Foundry hands-on experience and understand the distinction between every major Foundry component. Before booking AI-200, have real experience with containerized AI applications, vector data services, and event-driven pipelines. Use ExamOS scenario practice to measure readiness objectively. Consistent performance above 80% on Legend mode across multiple sessions is the threshold that correlates with genuine exam readiness rather than pre-exam anxiety.

Certifications

Microsoft Azure AI Fundamentals (AI-901)

Microsoft Certified: Azure AI Apps and Agents Developer Associate (AI-103)

Azure AI Cloud Engineer Associate (AI-200)

AWS Certified AI Practitioner (AIF-C01)

AWS Certified Generative AI Developer - Professional (AIP-C01)

Realistic timeline

2 hours per day: approximately 6-9 months for the full path including both Azure and AWS tracks
Azure track only (AI-901, AI-103, AI-200): approximately 4-6 months at 2 hours per day
AWS track only (AIF-C01, AIP-C01): approximately 2-3 months at 2 hours per day
Candidates with existing Python and cloud development experience: compress Steps 0-1 to 2 weeks and focus time on Foundry hands-on work
Hands-on lab time building real AI applications counts as study time and produces significantly better exam outcomes than reading documentation
Steps 3-4 (RAG and agents) together represent the core of AI-103 and should receive the most dedicated preparation time
Step 5 (AI-200) requires genuine back-end development experience — candidates without containerization or event-driven architecture experience should plan additional time here
The generative AI field evolves faster than most certification syllabi — treat Step 8 as a permanent ongoing commitment rather than a fixed curriculum with an end date

Embark on your career roadmap by setting a target and staying accountable

Set target

Share your feedback

Generative AI Engineer: Zero to Hero

Step 0 - Engineering and AI foundations

Step 1 - AI and LLM fundamentals (AI-901)

Step 2 - Microsoft Foundry and Azure AI platform

Step 3 - RAG architecture and vector databases

Step 4 - Agentic AI systems and multi-agent orchestration

Step 5 - AI cloud infrastructure and back-end development (AI-200)

Step 6 - AWS generative AI track (parallel or alternative path)

Step 7 - Production AI systems, security, and responsible AI

Step 8 - Advanced agentic patterns and emerging capabilities

Final step - Certification, validation, and the 2026 exam transition

Realistic timeline