Abstract illustration of AI with silhouette head full of eyes, symbolizing observation — Photo by Tara Winstead on Pexels

Last updated 2026-05-03 · By Shawn Nunley · Vendor-neutral · View source on GitHub

The honest version: Most security professionals don't need to train models from scratch. You need to understand how LLMs actually work, how to use them well, and how they fail — because soon you will be asked to defend an agent that has access to your production cloud. This roadmap is the shortest path I'd recommend to a working CNAPP engineer, detection engineer, or cloud architect who has used ChatGPT but hasn't yet built or attacked anything with an LLM.

📖 The Path

Why bother (and why now)
Prerequisites
Stage 1: Become a daily user (Weeks 0–2)
Stage 2: Mental model (Weeks 2–6)
Stage 3: Build something (Months 1.5–3)
Stage 4: AI/LLM security (Months 3–6)
Stage 5: Agents & MCP (Months 5–8)
Stage 6: Specialize
Hands-on labs & CTFs
Books & courses
People & newsletters to follow
Certifications worth your time
Project ideas to actually build
Frameworks & standards to know cold
Stay current
Common mistakes

Why bother (and why now)

Three reasons this is on a cloud-security site:

Agents are landing in your cloud. An agent with an IAM role and a browser tool has the blast radius of a compromised admin. Defending that is a cloud-security problem, not a separate AI team's problem.
Your job is changing. Detection engineers are writing prompts to triage alerts. IR teams are using LLMs to summarize CloudTrail. Pen testers are using AI to draft attacks faster than humans can review. If you can't read a system prompt, you can't review what your team ships.
The attack surface is genuinely new. Prompt injection isn't XSS. RAG isn't SQL. Most of the OWASP LLM Top 10 has no clean analog from the appsec world. You have to learn it on its own terms.

You don't need to become a machine-learning researcher. You need working AI literacy: enough to use these tools well, build small things, and reason about how they break.

Prerequisites

You'll have a much easier time if you already have:

Comfort in a terminal and one scripting language. Python is the lingua franca of the AI world. You don't need to be fluent — you need to read other people's notebooks and modify them.
HTTP and JSON intuition. Every model API is a POST request returning JSON. If you can hit an API with curl, you have what you need.
Cloud security fundamentals. If you're new to cloud security, do the cloud security learning path first. AI security is a layer on top of cloud security, not a substitute for it.
A credit card with a $20–50/month budget. You'll burn this on API calls. It's the cheapest tuition in tech.

That's it. You do not need linear algebra, calculus, or a PyTorch tattoo to be useful here.

Stage 1: Become a daily user (Weeks 0–2)

Goal: Build instinct for what frontier models can and can't do. You can't defend a tool you've never used in anger.

1. Pick one frontier model and pay for it

Claude (Anthropic), ChatGPT (OpenAI), or Gemini (Google). Pay the $20/month — the gap between the free and paid tier is enormous, and the gap between "occasional user" and "daily user" is bigger still. Don't try to evaluate three at once; you'll learn more by going deep on one.

2. Use it for everything for two weeks

Drafting Slack messages and emails.
Reading a CloudTrail log dump and explaining what an attacker just did.
Writing the boilerplate of a Terraform module before you fix it by hand.
Summarizing a security paper or vendor whitepaper.
Explaining a concept you almost-but-not-quite understand.

Notice when it's confidently wrong. Notice when it saves you an hour. Notice how rephrasing a prompt changes the answer. That instinct is the foundation of everything below.

3. Try a coding agent

Use Claude Code, Cursor, GitHub Copilot, or Codex on a real task in a real repo. Write a small tool, debug a Lambda, refactor a script. The leap from "chat assistant" to "agent that edits files and runs commands" is exactly the shift you'll later need to defend.

Stage 1 milestone: You instinctively reach for an LLM for the right kinds of tasks and avoid it for the wrong ones. You can describe — to a colleague — three things this model does well and three it fails at.

Elegant 3D visualization of neural networks showcasing abstract connections in digital space — Photo by Google DeepMind on Pexels

Stage 2: Mental model (Weeks 2–6)

Goal: Understand what's actually happening when you press send. No more magic.

1. The 30-minute mental model

You don't need to derive attention from scratch. You do need crisp answers to:

What is a token, and why does it cost money?
What is the context window and what happens when you exceed it?
What does temperature actually change?
What is an embedding and how is it different from a token?
What's the difference between fine-tuning, RAG, and prompting — and when do you reach for each?
What is a system prompt and why is it not a security boundary?
Why do models hallucinate, and what kinds of hallucinations are dangerous?

2. Recommended primers

Andrej Karpathy — "Intro to Large Language Models" (1 hour, free). The single best onboarding video. Watch it twice.
DeepLearning.AI Short Courses — free, 1–2 hours each. Start with "ChatGPT Prompt Engineering for Developers" and "Building Systems with the ChatGPT API."
Anthropic's prompt-engineering docs and OpenAI's prompt-engineering guide — vendor docs, but unusually high signal.
Anthropic Courses (GitHub) — interactive Jupyter notebooks for prompt engineering, tool use, RAG, evals.

3. Prompt engineering, briefly

Most "prompt engineering" content online is hype. The 80/20:

Be specific. Vague input, vague output.
Show, don't tell. Few-shot examples beat instructions almost every time.
Ask the model to think step by step (chain of thought) for hard reasoning, then ask it to verify its own answer.
Constrain the output format. JSON schemas and clear delimiters reduce variance.
Iterate. The first prompt is rarely the right prompt.

Stage 2 milestone: You can read a prompt-engineering blog post or a model evaluation paper and not get lost in the jargon. You stop calling LLMs "AI" in casual conversation because the precision matters.

A robot arm assists a professional with a book and coffee in a modern office — Photo by Pavel Danilyuk on Pexels

Most AI security work in 2026 is plumbing — guardrails, secrets, observability — not exotic adversarial ML. — what the job actually looks like

Stage 3: Build something with the API (Months 1.5–3)

Goal: Get past consumer-app polish and into raw model behavior.

1. The canonical first project: RAG over your own notes

Build a retrieval-augmented Q&A bot that answers questions about a corpus you control — your blog posts, last year's incident write-ups, every CloudTrail finding from a CTF, your runbooks, whatever. You will:

Embed documents into a vector store (start with FAISS or Chroma locally — don't reach for Pinecone yet).
Implement retrieval: top-k similarity, then a re-ranker.
Stuff the results into a prompt with a system message.
Stream the response back.

It will work badly the first time. Fixing it is where the real learning lives.

2. Your second project: an evaluation harness

Pick a security task — "extract IOCs from this incident report," "classify this CloudTrail event as benign or suspicious," "summarize this CVE in three bullets." Build a small dataset of ~30 examples with ground truth. Run the same prompt against it on every change. Watch your changes regress things. This is what real AI engineering feels like.

3. Recommended starting kits

OpenAI Cookbook — runnable recipes for almost everything.
Anthropic Cookbook — same idea, Claude-flavored.
LangChain if you like batteries included; the raw SDK if you want to actually understand what's happening (recommended).
Hugging Face Learn — free, deeper into the ML side if you want to go there.

Stage 3 milestone: You've shipped two things you wrote yourself, in code, that call a model API. You have an opinion about model providers based on real use, not Twitter takes.

Close-up of a vintage typewriter with 'AI ETHICS' typed on paper — Photo by Markus Winkler on Pexels

Stage 4: AI/LLM security (Months 3–6)

Goal: Move from "I can build" to "I can break, and I can defend."

1. Read the canon

OWASP Top 10 for LLM Applications. Read every entry. Re-read it next quarter — it's a living document.
NIST AI Risk Management Framework (AI RMF 1.0). Govern / Map / Measure / Manage. Boring, important.
MITRE ATLAS — the ATT&CK equivalent for ML systems. Map techniques to detections.
ISO/IEC 42001 — AI management systems standard. Useful when GRC inevitably knocks.
Google's Secure AI Framework (SAIF) and CISA "Guidelines for Secure AI System Development".

2. The attack classes you must understand

Prompt injection (direct and indirect). The defining LLM vulnerability. An attacker hides instructions in retrieved content, a webpage, an email, a calendar invite — and the model executes them.
Jailbreaks. Inputs that defeat safety training. Different from injection; sometimes overlapping.
Training-data extraction. Coercing a model into reproducing memorized training data — secrets, PII, proprietary code.
Model denial of service. Inputs that blow up token budgets or latency.
Insecure output handling. The model returns code, SQL, or shell that gets executed downstream. The model is now an attack proxy.
Supply-chain attacks. Tampered model weights, poisoned datasets, malicious Hugging Face repos.
Excessive agency. An agent given more tools, scopes, or authority than the threat model warrants.

3. Defenses you'll build (or buy)

Input/output filtering — Llama Guard, Lakera, Prompt Guard, Rebuff, NeMo Guardrails. Helpful, never sufficient.
Strong system prompts paired with structured output and schema validation.
Tight tool scoping — never give an agent write access to anything you wouldn't give a junior contractor on day one.
Logging every prompt, completion, tool call, and model identity. You can't investigate what you didn't record.
Human-in-the-loop for irreversible actions. "Send email," "delete resource," "pay invoice" should not be one prompt away.

Stage 4 milestone: You can review an LLM-powered feature in your company's product and produce a credible threat model. You stop treating "the LLM is powerful" as a magic explanation.

Business professionals discussing graphs on a flipchart during a daylight meeting — Photo by Kaboompics on Pexels

Stage 5: Agents & MCP (Months 5–8)

Goal: Internalize what changes when an LLM stops talking and starts doing.

Agents are LLMs in a loop with tools. Model Context Protocol (MCP) is the emerging open standard for plugging tools and data into them. Together they are the most consequential security shift since cloud itself, and most companies have no idea yet.

1. Build an agent end-to-end

Pick a small, well-scoped goal: "summarize my unread security newsletters" or "open a GitHub issue when a new high-severity CVE matches our stack."
Implement tool calling against the raw API. Watch the agent loop in your terminal.
Add a "max steps" cap, a budget cap, and a require-confirmation hook for destructive actions. Notice how often you would have nuked something without them.

2. Set up MCP locally

Install Claude Desktop or another MCP-capable client.
Connect at least two MCP servers — file system, GitHub, a database, etc.
Read the spec. Note that the security boundary is the client, not the protocol.
Now design an injection attack: a malicious file/issue/row that flips the agent's intent. Try it on your own setup.

3. Read the agent-security canon

Simon Willison's writing on prompt injection — the most important running commentary on this attack class.
Embrace The Red (Johann Rehberger) — practical agent and assistant exploits.
Anthropic's research on agentic misuse and misalignment.
The CSOH Breach Kill Chains — agent-related incidents are starting to show up here. Read them as they land.

Stage 5 milestone: You can sketch the threat model for an agentic system on a whiteboard, identify the three highest-blast-radius tools, and explain to an exec why "but the system prompt forbids that" is not a control.

Stage 6: Specialize

Pick the lane your career and curiosity pull you toward:

AI red teaming. Build offensive skill against LLM apps and agents. Adversarial prompting, jailbreak research, agent-tool exploitation, multi-modal attacks. Communities: HackerOne AI Safety Fund, AI Village at DEF CON.
Defending agentic systems. Detection content for prompt-injection patterns, tool-misuse anomalies, model-identity tracking, and the analogue of CloudTrail for AI control planes.
AI governance & risk. Map NIST AI RMF and ISO 42001 onto your existing GRC stack. Run AI inventories. Approve and review use cases. Talk to legal regularly.
MLSecOps. Pipeline security — model registry, dataset lineage, signed weights, deployment guardrails. The ML world's answer to DevSecOps.
Building AI-native security tooling. Triage agents for the SOC, alert summarization, IaC review, autonomous IR copilots. The market is wide open.
AI-APP / runtime protection. The emerging category — see Wiz's AI-APP framing — for monitoring deployed model endpoints, prompts, completions, and agent behavior in production.

Hands-on labs & CTFs

You can read for a year and learn less than you do in one weekend of CTFs. The full list lives in the CSOH CTF directory; here's what to start with:

Gandalf (Lakera) — eight levels of prompt-injection puzzles. The free, browser-based gateway drug. Do all eight.
Prompt Airlines — trick a customer-service chatbot into giving you a free flight. Realistic.
Doublespeak — extract a secret from an AI character with deception-resistant defenses.
AI Goat (Orca Research) — Terraform-based vulnerable AI/ML pipeline in AWS. Model poisoning, insecure endpoints, exfil paths.
Dreadnode Crucible — adversarial-ML challenges, free tier, leaderboards.
Wiz / Lakera / Hugging Face CTF events — watch the CTF directory for live challenges; AI-themed events are happening multiple times a year now.
Hugging Face Spaces — find a small open-source LLM, run it locally, try to break your own deployment.

For each lab, write up what you tried, what worked, and what didn't. Public write-ups are the highest-leverage portfolio item in this field right now.

Books & courses

The published-book layer is thinner than for cloud, but a few are worth it. The CSOH reading list tracks updates; here's a focused starter set:

Hands-On Large Language Models — Jay Alammar & Maarten Grootendorst (O'Reilly, 2024). The clearest end-to-end "what's actually happening" book.
Adversarial AI Attacks, Mitigations, and Defense Strategies — John Sotiropoulos (Packt, 2024). Explicitly security-focused.
The Developer's Playbook for Large Language Model Security — Steve Wilson (O'Reilly, 2024). Aligned with the OWASP LLM Top 10 (the author leads it).
Generative AI Security — Ken Huang et al. (Springer, 2024). Heavier, more reference-style.
AI Engineering — Chip Huyen (O'Reilly, 2024). Best book on building production systems on top of foundation models. Not security-specific, but you can't secure what you can't build.

Free courses worth your evenings:

DeepLearning.AI Short Courses — start with prompt engineering, then the agentic and red-teaming tracks.
Hugging Face NLP Course & Agents Course.
DAIR.AI Prompt Engineering Guide — open-source, frequently updated.
Microsoft "Generative AI for Beginners" — free, 21 lessons.

People & newsletters to follow

The half-life of an AI blog post is about six weeks. Follow people, not blog posts:

Simon Willison — the most reliable signal source for prompt injection, agent capabilities, and what frontier models can actually do.
Johann Rehberger / Embrace The Red — original agent/assistant exploit research.
Latent Space (swyx) — newsletter and podcast, the "what changed this week" feed for AI engineers.
The Pragmatic Engineer — Gergely Orosz writes the most grounded coverage of AI in real engineering teams.
Jack Clark — Import AI — weekly, policy + capabilities.
Nathan Lambert — Interconnects — model evaluation and post-training, deeper end of the pool.
Ethan Mollick — One Useful Thing — "what does this mean for normal humans" framing.
Lilian Weng — long-form technical surveys; the prompt-engineering and agents posts are canonical.
Anthropic, OpenAI, and Google AI blogs — read the model cards and system cards, not the marketing posts.

Certifications worth your time

The AI security cert market is young. Most of the value is still in the work, not the badges. That said:

ISC2 / CSA Certificate of Competence in AI/ML Security (CCAIML) and the CSA AI Security Certificate — cheap, vendor-neutral, sane curricula.
OWASP LLM Top 10 working-group resources — not a cert, but reading the whole canon and engaging with the community has more signal than most paid certs.
SANS SEC545 (Cloud Security Architecture and Operations) and the newer SANS AI & LLM training tracks — pricey, but the SANS quality bar is high.
Provider-specific: AWS now has AI Practitioner and ML Engineer associate certs; Azure has AI-102; Google has the Generative AI Leader path. Useful if you work in that cloud — skippable otherwise.
Don't pay for an "AI security expert" cert from a vendor you've never heard of. The market is full of them. Wait six months and ask people whose work you respect.

Project ideas to actually build

The portfolio that matters in 2026 is a small handful of running, public, AI-flavored projects. Pick one or two:

CloudTrail summarizer. Stream events, summarize the interesting ones to Slack with explanations a tier-1 analyst can act on.
IAM policy reviewer. Take an arbitrary IAM policy and explain in English what it allows, what it implicitly denies, and the three closest privilege-escalation paths.
IaC linter. An LLM pass on Terraform / Bicep / Helm that flags risky patterns the static linters miss (because they need context to judge).
Phishing-email triage agent. Classify, extract IOCs, draft a response. Local model, no data leaves the box.
Prompt-injection canary. A small site or document with hidden instructions; track which assistants execute them and report back.
Local-only RAG over your runbooks. Ollama + Llama 3.x + a vector store. Useful, private, and a great way to feel the limits of small models.
An "AI security review checklist" generator for a given application architecture, mapped to OWASP LLM Top 10 and MITRE ATLAS.

Publish the code. Write up what worked and what didn't. That artifact will outrun any cert in interviews.

Frameworks & standards to know cold

You'll be asked about all of these in interviews and audits. Skim them now, deep-read the ones your job touches:

OWASP Top 10 for LLM Applications — the working list of the things that go wrong.
NIST AI Risk Management Framework (AI RMF 1.0) — Govern / Map / Measure / Manage. The closest thing to NIST CSF for AI.
NIST AI 600-1: Generative AI Profile — applies the AI RMF to GenAI specifically.
MITRE ATLAS — adversarial techniques, mapped like ATT&CK.
ISO/IEC 42001 — AI management system standard, increasingly asked for in vendor questionnaires.
EU AI Act — risk-tiered regulation; you'll touch this if your company sells in the EU.
Google SAIF and Microsoft's Responsible AI Standard — vendor frameworks, useful as templates.
CSA AI Controls Matrix — practical control mappings.

Stay current

This field changes faster than cloud, which is saying something. A sustainable rhythm:

Daily: use a frontier model on real work. Skim Simon Willison.
Weekly: read Latent Space and Import AI. Skim the CSOH news feed for AI-tagged items. Show up to Friday Zoom.
Monthly: read one model card or system card start to finish. Try the newest frontier model.
Quarterly: ship something. A new lab, a new project, a new write-up.
Annually: re-read the OWASP LLM Top 10. Refresh your AI RMF mapping. Audit your company's AI inventory.

Common mistakes

Treating the system prompt as a security boundary. It isn't. Anything you tell the model in the system prompt, an attacker can convince it to ignore.
Confusing fluency with capability. Models that sound confident are sometimes wrong; models that hedge are sometimes right. Calibrate on outcomes, not vibes.
Going wide before going deep. One serious project beats six toy demos.
Skipping the API in favor of just chatting. The chat product hides the levers. Spend time at the raw API at least once.
Ignoring evals. If you can't measure it, you can't improve it. Build a small dataset before you build a feature.
Granting agents broad cloud roles "just to demo." Demos become production. Scope tightly from day one.
Trusting any single benchmark. Benchmarks leak into training data. Trust your own evals on your own task.
Believing AI security is unrelated to cloud security. Almost every interesting AI system runs in someone's cloud, with cloud identities, talking to cloud APIs. The two are converging fast.

Ready to start?

If you haven't done it yet, do the cloud security learning path first — AI security stacks on top of it.
Pick one frontier model, pay for a month, and use it for everything.
Watch Karpathy's "Intro to Large Language Models" tonight.
Beat all eight Gandalf levels this weekend.
Browse the CSOH CTF directory — filter for AI/ML.
Bookmark the reading list — it tracks AI security additions as they land.
Skim the AI & LLM section of the CSOH glossary — every term is defined for security people.
Join us on Friday Zoom — AI-related topics show up most weeks now.

Learn AI: A Roadmap for Cloud Security Pros