Reference

Glossary & Concept Index

Short definitions for the terms that repeat across the guide, with chapter links for deeper reading.

Agent

An AI system that can plan, call tools, observe results, and continue through a task instead of returning one immediate answer.

Chapter 6
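
The plan, call, observe, continue cycle can be sketched as a short loop. This is a minimal illustration, not a real framework: `run_agent`, `scripted_policy`, and `add_tool` are hypothetical names, and the scripted policy stands in for the model that would normally choose each action.

```python
def run_agent(task, policy, tools, max_steps=5):
    """Ask the policy for an action, run the chosen tool, feed back the result."""
    history = [("task", task)]
    for _ in range(max_steps):
        action = policy(history)
        if action["type"] == "finish":
            return action["answer"], history
        # Call the requested tool and record what it returned.
        result = tools[action["tool"]](action["input"])
        history.append(("observation", result))
    return None, history  # step budget exhausted without a final answer


def add_tool(pair):
    """Hypothetical calculator tool: adds two numbers."""
    return pair[0] + pair[1]


def scripted_policy(history):
    """Stand-in for a model: call the calculator once, then finish."""
    kind, payload = history[-1]
    if kind == "observation":
        return {"type": "finish", "answer": f"The sum is {payload}"}
    return {"type": "tool", "tool": "calculator", "input": (2, 3)}


answer, history = run_agent("add 2 and 3", scripted_policy, {"calculator": add_tool})
```

The `max_steps` cap is a design choice in this sketch: without it, a policy that never finishes would loop forever.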

Alignment

Training and product work that steers a model toward useful, honest, safe, and controllable behavior.

Chapter 2

Attention

The Transformer mechanism that lets each token weigh which other tokens matter most for interpreting it.

Chapter 1

Context Window

The model's immediate working memory, bounded by a token limit: the prompt, retrieved documents, tool outputs, and prior messages it can inspect at once.

Chapter 3

Diffusion

A generative media approach that starts from noise and repeatedly removes predicted noise until an image, video, or audio sample emerges.

Chapter 5

Embedding

A vector representation that places text, images, or other data into a mathematical space where similar meanings sit near each other.

Chapter 3
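
To make "similar meanings sit near each other" concrete, here is a small sketch using cosine similarity, one common way to compare embedding vectors. The three-dimensional vectors are invented for illustration; real embeddings come from a trained model and have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, not produced by any real model.
toy_embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "invoice": [0.1, 0.2, 0.95],
}


def nearest(word, embeddings):
    """Return the other word whose vector is most similar to `word`."""
    others = [w for w in embeddings if w != word]
    return max(others, key=lambda w: cosine_similarity(embeddings[word], embeddings[w]))
```

With these toy vectors, `nearest("cat", toy_embeddings)` picks "kitten" over "invoice".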

Eval

A repeatable test that checks whether an AI system behaves correctly on cases that matter for the product.

Chapter 8
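
A repeatable test of this kind can be as small as a list of cases with a check for each. The sketch below assumes a hypothetical `toy_system` standing in for the AI system under test; the case format and the `run_eval` helper are illustrative, not a standard API.

```python
def run_eval(system, cases):
    """Run every case and return the pass rate plus the inputs that failed."""
    failures = []
    for case in cases:
        output = system(case["input"])
        if not case["check"](output):
            failures.append(case["input"])
    pass_rate = 1 - len(failures) / len(cases)
    return pass_rate, failures


def toy_system(question):
    """Hypothetical system under test; it gets the arithmetic case wrong."""
    answers = {"capital of France?": "Paris", "2 + 2?": "5"}
    return answers.get(question, "I don't know")


cases = [
    {"input": "capital of France?", "check": lambda out: "Paris" in out},
    {"input": "2 + 2?", "check": lambda out: out.strip() == "4"},
]

pass_rate, failures = run_eval(toy_system, cases)
```

Returning the failing inputs alongside the pass rate makes it easier to see which cases regressed between runs.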

Groundedness

The degree to which an answer is supported by the evidence the system retrieved or was given.

Chapter 8

Guardrail

A product-level check or constraint around the model, such as input filtering, output validation, citation checks, or tool permission rules.

Chapter 8
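
Two of the checks named above, tool permission rules and citation checks, might look like the following sketch. The allowlist contents and the `[n]` citation format are assumptions made for this example.

```python
import re

# Hypothetical allowlist: the only tools the model may call.
ALLOWED_TOOLS = {"search", "calculator"}


def check_tool_call(tool_name):
    """Tool permission rule: reject any tool that is not on the allowlist."""
    return tool_name in ALLOWED_TOOLS


def validate_citations(answer, num_sources):
    """Citation check: require at least one [n] marker, each within range."""
    cited = [int(n) for n in re.findall(r"\[(\d+)\]", answer)]
    return bool(cited) and all(1 <= n <= num_sources for n in cited)
```

Both checks run outside the model, so they hold even when the model's own output is wrong or manipulated.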

LLM-as-Judge

Using one model to grade another model's output against a rubric. It scales well, but the judge needs calibration against human labels because judge models carry their own biases, such as favoring longer answers.

Chapter 8

Mixture of Experts

A sparse architecture where a router activates only a few specialized expert networks for each token.

Chapter 4
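
The routing step can be sketched with top-k selection over router scores. This is a simplified illustration under stated assumptions: in a real model the router computes the scores from the token itself, whereas here they are passed in directly, and the "experts" are plain functions standing in for feed-forward networks.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k=2):
    """Keep the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in top)
    return {i: probs[i] / total for i in top}


def moe_layer(x, experts, router_scores, k=2):
    """Weighted sum of the outputs of the selected experts only."""
    weights = route_top_k(router_scores, k)
    return sum(w * experts[i](x) for i, w in weights.items())


# Hypothetical experts for illustration.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x * x]
```

Only the selected experts are evaluated for a given input, which is where the sparsity (and the compute saving) comes from.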

Prompt Injection

An attack where untrusted text tries to override the system's instructions, often through retrieved documents, web pages, or user input.

Chapter 8

Query, Key, Value

The three vectors computed for each token in attention: the query (what a token seeks), the key (what each token offers for matching), and the value (the content that gets blended into the result).

Chapter 1
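
A minimal sketch of how the three vectors combine, using scaled dot-product attention for a single query. It uses plain Python lists for readability; real implementations operate on batched matrices, and the example vectors are made up.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    d = len(query)
    # Match the query against every key, scaled by sqrt(dimension).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # Blend the value vectors using the attention weights.
    dim = len(values[0])
    output = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return weights, output


weights, output = attend(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[10.0, 0.0], [0.0, 10.0]],
)
```

Here the query matches the first key more closely, so the first value vector receives the larger weight in the blended output.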

Retrieval-Augmented Generation

A pattern where the system retrieves relevant external evidence, places it in context, and asks the model to answer from that evidence.

Chapter 3
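
The retrieve, place in context, and answer pattern can be sketched as follows. The word-overlap scorer is a deliberately crude stand-in for embedding-based retrieval, the documents are invented, and the final model call is left out because it depends on whichever model API is in use.

```python
def score(query, document):
    """Count shared lowercase words; a stand-in for embedding similarity."""
    return len(set(query.lower().split()) & set(document.lower().split()))


def retrieve(query, documents, top_k=2):
    """Return the top_k documents ranked by overlap with the query."""
    ranked = sorted(documents, key=lambda d: score(query, d), reverse=True)
    return ranked[:top_k]


def build_prompt(query, evidence):
    """Place numbered evidence in the context ahead of the question."""
    numbered = "\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(evidence))
    return (
        "Answer using only the evidence below. "
        "If the evidence is insufficient, say so.\n\n"
        f"Evidence:\n{numbered}\n\nQuestion: {query}"
    )


# Hypothetical knowledge base for illustration.
documents = [
    "Refunds are available within 30 days of purchase.",
    "Our office is closed on public holidays.",
    "Shipping takes 3 to 5 business days.",
]

query = "how many days for refunds"
prompt = build_prompt(query, retrieve(query, documents, top_k=1))
```

The resulting `prompt` string is what would be sent to the model, so the answer can be checked against the numbered evidence it was given.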

Test-Time Compute

Extra inference-time work spent on planning, searching, checking, or tool use before the final answer is returned.

Chapter 7
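
One simple form of this extra work is to generate several candidates and check each before answering. The sketch below is a toy under stated assumptions: `toy_generate` returns scripted guesses in place of model samples, and `toy_verify` is a hypothetical checker for the question "which number squared is 144?".

```python
def best_of_n(generate, verify, n=4):
    """Try up to n candidates; return the first that passes and the attempts used."""
    for attempt in range(n):
        candidate = generate(attempt)
        if verify(candidate):
            return candidate, attempt + 1
    return None, n


def toy_generate(attempt):
    """Scripted guesses standing in for sampled model outputs."""
    return [10, 11, 12, 13][attempt % 4]


def toy_verify(candidate):
    """Cheap check that the candidate actually solves the problem."""
    return candidate * candidate == 144


answer, attempts_used = best_of_n(toy_generate, toy_verify)
```

More attempts cost more inference time, which is the trade-off the term describes.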