== Architecture and training ==

=== Model architecture ===
Claude models are built on the [[transformer (machine learning)|transformer]] architecture, using a decoder-only (causal) design. All current Claude models support the following (see the first API sketch at the end of this section):
* Text and image input (vision capability across all Claude 3+ models)
* Text output with multilingual capability
* A 200,000-token context window (standard across the Claude 4 and 4.5 generations)
* Up to 64,000 tokens of output per response (Claude 4.5 generation)

Claude models are trained on a dataset assembled from publicly available internet sources, licensed third-party content, and data provided by users and crowdsourced annotators.<ref name="britannica-claude">Britannica, ''Claude AI'', 2025.</ref> Anthropic's data acquisition has included an internal operation called '''Project Panama''', disclosed in January 2026 through unsealed court filings from a 2024 copyright lawsuit, in which Anthropic purchased, physically sliced, and scanned millions of print books from online retailers to augment its training data.<ref name="anthropic-wiki"/> Judge William Alsup ruled in that case that the destructive scanning of legally purchased books constituted fair use.

=== Training pipeline ===
Claude's training pipeline follows three major phases:
# '''Pretraining''': unsupervised training on large text corpora using a next-token prediction loss (sketched at the end of this section).
# '''Supervised fine-tuning (SFT)''': training on curated (instruction, response) pairs to improve instruction-following.
# '''Constitutional AI and RLAIF''': a self-critique and revision loop against the published constitution, followed by reinforcement learning from AI-generated preference judgments. This replaces or supplements traditional human-preference RLHF.

=== Hybrid reasoning ===
Beginning with Claude 3.7 and extending through the Claude 4 and 4.5 families, Claude models support '''hybrid reasoning''': the ability to generate an extended internal chain-of-thought before producing a final response. This mode trades higher latency for substantially improved accuracy on complex logical, mathematical, and coding tasks. Users and API callers can request extended thinking explicitly or allow the model to invoke it adaptively (see the extended-thinking sketch at the end of this section).

=== Knowledge cutoffs ===
Each Claude model has a distinct training knowledge cutoff. As representative examples: Claude 3 Haiku's cutoff was August 2023, Claude 3.7 Sonnet's was November 2024, and Claude Opus 4.1's was March 2025.<ref name="britannica-claude"/> From March 2025 onward, all Claude models gained access to a live web search capability that supplements (but does not replace) training-derived knowledge (see the final sketch below).
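The capabilities listed under "Model architecture" can be exercised through Anthropic's official <code>anthropic</code> Python SDK. The following is a minimal sketch, not Anthropic's reference code: the model identifier, file name, and token limit are illustrative assumptions and should be checked against current documentation.

<syntaxhighlight lang="python">
import base64
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Illustrative assumption: a local PNG used to exercise vision input.
image_b64 = base64.standard_b64encode(open("chart.png", "rb").read()).decode()

message = client.messages.create(
    model="claude-sonnet-4-5",  # assumed model identifier; check current docs
    max_tokens=64000,           # the Claude 4.5-generation output cap cited above
    messages=[{
        "role": "user",
        "content": [
            # A single user turn may mix image and text blocks.
            {"type": "image",
             "source": {"type": "base64",
                        "media_type": "image/png",
                        "data": image_b64}},
            {"type": "text", "text": "Summarize this chart in one paragraph."},
        ],
    }],
)
print(message.content[0].text)
</syntaxhighlight>

The 200,000-token context window bounds the combined size of the input blocks, while <code>max_tokens</code> caps the length of the generated response.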
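Anthropic has not released Claude's training code. As a generic illustration of the next-token prediction objective used in the pretraining phase, the following PyTorch sketch computes the causal language-modeling loss over a batch of token ids; the shapes and names are assumptions, not details of Claude's actual pipeline.

<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def next_token_loss(logits: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
    """Causal language-modeling loss: each position predicts the next token.

    logits: (batch, seq_len, vocab_size) output of a decoder-only transformer
    tokens: (batch, seq_len) input token ids
    """
    # Shift by one so position t is scored against the token at t + 1.
    shift_logits = logits[:, :-1, :]
    shift_labels = tokens[:, 1:]
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
    )

# Toy usage with random data; real pretraining uses web-scale corpora.
batch, seq_len, vocab = 2, 16, 1000
logits = torch.randn(batch, seq_len, vocab)
tokens = torch.randint(0, vocab, (batch, seq_len))
print(next_token_loss(tokens=tokens, logits=logits))
</syntaxhighlight>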
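Extended thinking, described under "Hybrid reasoning", is exposed in the Messages API through a <code>thinking</code> parameter. A minimal sketch follows; the model name and token budgets are illustrative, and the exact parameter shape should be confirmed against Anthropic's current API reference.

<syntaxhighlight lang="python">
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-5",  # assumed model identifier
    max_tokens=16000,           # must exceed the thinking budget below
    # Request extended thinking explicitly; budget_tokens caps the
    # internal chain-of-thought spent before the final answer.
    thinking={"type": "enabled", "budget_tokens": 8000},
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
)

# The response interleaves "thinking" blocks with the final "text" blocks.
for block in response.content:
    if block.type == "thinking":
        print("[thinking]", block.thinking[:200], "...")
    elif block.type == "text":
        print(block.text)
</syntaxhighlight>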
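Finally, the web search capability noted under "Knowledge cutoffs" is available as a server-side tool in the Messages API. The version-dated tool type below (<code>web_search_20250305</code>) is an assumption that may change between releases; consult the current tool documentation.

<syntaxhighlight lang="python">
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-5",  # assumed model identifier
    max_tokens=2048,
    # Server-side tool: the model decides when to search, and max_uses
    # caps how many searches a single request may trigger.
    tools=[{"type": "web_search_20250305", "name": "web_search", "max_uses": 3}],
    messages=[{"role": "user",
               "content": "What did Anthropic announce this month?"}],
)

# Print only the text blocks; search results arrive as separate block types.
for block in response.content:
    if block.type == "text":
        print(block.text)
</syntaxhighlight>

Because the tool runs server-side, search results are folded into the model's answer without any client-side retrieval code, supplementing rather than replacing training-derived knowledge.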