== Architecture and training ==

=== Model architecture ===
Claude models are built on the [[transformer (machine learning)|transformer]] architecture, using a decoder-only (causal) design. All current Claude models support the following (see the first API sketch at the end of this section):
* Text and image input (vision capability across all Claude 3+ models)
* Text output with multilingual capability
* A 200,000-token context window (standard across the Claude 4 and 4.5 generations)
* Up to 64,000 tokens of output per response (Claude 4.5 generation)

Claude models are trained on a dataset assembled from publicly available internet sources, licensed third-party content, and data provided by users and crowdsourced annotators.<ref name="britannica-claude">Britannica, ''Claude AI'', 2025.</ref> Anthropic's data acquisition has included an internal operation called '''Project Panama''', disclosed in January 2026 through unsealed court filings from a 2024 copyright lawsuit, in which Anthropic purchased, physically sliced, and scanned millions of print books from online retailers to augment its training data.<ref name="anthropic-wiki"/> Judge William Alsup ruled in that case that the destructive scanning of legally purchased books constituted fair use.

=== Training pipeline ===
Claude's training pipeline follows three major phases:
# '''Pretraining''': unsupervised training on large text corpora using a next-token prediction loss (sketched at the end of this section).
# '''Supervised fine-tuning (SFT)''': training on curated (instruction, response) pairs to improve instruction-following.
# '''Constitutional AI and RLAIF''': a self-critique and revision loop against the published constitution, followed by reinforcement learning from AI-generated preference judgments. This replaces or supplements traditional human-preference RLHF.

=== Hybrid reasoning ===
Beginning with Claude 3.7 and extending through the Claude 4 and 4.5 families, Claude models support '''hybrid reasoning''': the ability to generate an extended internal chain-of-thought before producing a final response. This mode trades higher latency for substantially improved accuracy on complex logical, mathematical, and coding tasks. Users and API callers can request extended thinking explicitly or allow the model to invoke it adaptively (see the extended-thinking sketch at the end of this section).

=== Knowledge cutoffs ===
Each Claude model has a distinct training knowledge cutoff. As representative examples: Claude 3 Haiku's cutoff was August 2023, Claude 3.7 Sonnet's was November 2024, and Claude Opus 4.1's was March 2025.<ref name="britannica-claude"/> From March 2025 onward, all Claude models gained access to a live web search capability that supplements (but does not replace) training-derived knowledge (see the final sketch below).
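The capabilities listed under "Model architecture" can be exercised through Anthropic's official <code>anthropic</code> Python SDK. The following is a minimal sketch, not Anthropic's reference code: the model identifier, file name, and token limit are illustrative assumptions and should be checked against current documentation.

<syntaxhighlight lang="python">
import base64
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Illustrative assumption: a local PNG used to exercise vision input.
image_b64 = base64.standard_b64encode(open("chart.png", "rb").read()).decode()

message = client.messages.create(
    model="claude-sonnet-4-5",  # assumed model identifier; check current docs
    max_tokens=64000,           # the Claude 4.5-generation output cap cited above
    messages=[{
        "role": "user",
        "content": [
            # A single user turn may mix image and text blocks.
            {"type": "image",
             "source": {"type": "base64",
                        "media_type": "image/png",
                        "data": image_b64}},
            {"type": "text", "text": "Summarize this chart in one paragraph."},
        ],
    }],
)
print(message.content[0].text)
</syntaxhighlight>

The 200,000-token context window bounds the combined size of the input blocks, while <code>max_tokens</code> caps the length of the generated response.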
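Anthropic has not released Claude's training code. As a generic illustration of the next-token prediction objective used in the pretraining phase, the following PyTorch sketch computes the causal language-modeling loss over a batch of token ids; the shapes and names are assumptions, not details of Claude's actual pipeline.

<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def next_token_loss(logits: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
    """Causal language-modeling loss: each position predicts the next token.

    logits: (batch, seq_len, vocab_size) output of a decoder-only transformer
    tokens: (batch, seq_len) input token ids
    """
    # Shift by one so position t is scored against the token at t + 1.
    shift_logits = logits[:, :-1, :]
    shift_labels = tokens[:, 1:]
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
    )

# Toy usage with random data; real pretraining uses web-scale corpora.
batch, seq_len, vocab = 2, 16, 1000
logits = torch.randn(batch, seq_len, vocab)
tokens = torch.randint(0, vocab, (batch, seq_len))
print(next_token_loss(tokens=tokens, logits=logits))
</syntaxhighlight>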
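Extended thinking, described under "Hybrid reasoning", is exposed in the Messages API through a <code>thinking</code> parameter. A minimal sketch follows; the model name and token budgets are illustrative, and the exact parameter shape should be confirmed against Anthropic's current API reference.

<syntaxhighlight lang="python">
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-5",  # assumed model identifier
    max_tokens=16000,           # must exceed the thinking budget below
    # Request extended thinking explicitly; budget_tokens caps the
    # internal chain-of-thought spent before the final answer.
    thinking={"type": "enabled", "budget_tokens": 8000},
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
)

# The response interleaves "thinking" blocks with the final "text" blocks.
for block in response.content:
    if block.type == "thinking":
        print("[thinking]", block.thinking[:200], "...")
    elif block.type == "text":
        print(block.text)
</syntaxhighlight>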
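Finally, the web search capability noted under "Knowledge cutoffs" is available as a server-side tool in the Messages API. The version-dated tool type below (<code>web_search_20250305</code>) is an assumption that may change between releases; consult the current tool documentation.

<syntaxhighlight lang="python">
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-5",  # assumed model identifier
    max_tokens=2048,
    # Server-side tool: the model decides when to search, and max_uses
    # caps how many searches a single request may trigger.
    tools=[{"type": "web_search_20250305", "name": "web_search", "max_uses": 3}],
    messages=[{"role": "user",
               "content": "What did Anthropic announce this month?"}],
)

# Print only the text blocks; search results arrive as separate block types.
for block in response.content:
    if block.type == "text":
        print(block.text)
</syntaxhighlight>

Because the tool runs server-side, search results are folded into the model's answer without any client-side retrieval code, supplementing rather than replacing training-derived knowledge.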