Skip to content

From LLM to Agent

An LLM answers questions. An agent gets things done.

What an agent is

An agent is an LLM working in a loop to solve a task using tools.

A plain LLM reads your message and writes back — one response, done. An agent keeps going: it plans steps, calls tools (read a file, run a build, check test results), reads the output, and decides what to do next. The loop runs until the task is complete or it gets stuck.

One important thing to understand: an agent has no inherent memory. Every time you send a message, the model reads the current conversation, responds, and ceases to exist. The next message, it starts fresh. The only reason it appears to remember is that the framework re-sends the full context — history, files, tool instructions — on every call.

Remove the context and you lose the memory.

The three pillars

Every agentic setup has three components, and the quality of your results depends on all three:

Context — everything the model can see: your instructions, the codebase, the spec files, MCP outputs. This is the biggest lever and the main focus of this workshop.

Model — the weights doing the predicting. Opus 4.7 for deep research, Sonnet 4.6 as a daily driver. Different tasks warrant different models.

Harness — the runtime around the agent: Cursor, Claude Code, GitHub Copilot, Codex. It decides what tools are available, how the approval gates work, and what the agent can and cannot do.

Why the same prompt gives different results in different tools

If you've ever tried a prompt in ChatGPT, then tried the same prompt in Cursor and got a completely different result — this is why. The harness changes what tools are available. The harness changes what context is loaded. A different harness may even use a different model. You changed all three pillars at once.

Your job in this workshop

You are not the typist. You are the architect.

You decide what context the agent gets. You choose which model fits the moment. You configure the harness with the right tools. You review the output and push back when it's wrong.

The agent writes the code. You own the result.