Giung Nam, 2026-03-07
靑出於藍
學不可以已。青,取之於藍,而青於藍;冰,水為之,而寒於水。("Learning must never cease. Blue is taken from the indigo plant, yet it is bluer than indigo; ice is made of water, yet it is colder than water.")
- 荀子 (Xunzi), "勸學篇" (An Exhortation to Learning)
In East Asian tradition, the phrase "靑出於藍" describes a student who surpasses their teacher. The metaphor is one of physical refinement: indigo dye is extracted from the indigo plant, yet through the process of dyeing, the resulting pigment becomes deeper and more vibrant than the source.
This represents the provocative question at the heart of modern AI: is it possible to train models using only human-level supervision (the Indigo) and still elicit a capability (the Blue) that is fundamentally stronger than the data that birthed it? While this remains a subject of intense debate, some researchers suggest that we are beginning to see the first glimpses of such a phenomenon.
Wrong, but useful
Essentially, all models are wrong, but some are useful.
- George Box and Norman Draper, "Empirical Model-Building and Response Surfaces"
In the standard trajectory of human education, we are rarely told the absolute truth from the start.1 Instead, we are fed a series of increasingly sophisticated, useful lies.
Looking back at my high school science classes, we were initially taught the Bohr model of the atom: a system of little planetary electrons orbiting a sun-like nucleus. It was a neat, intuitive, and entirely graspable visualization. Yet, it was merely a provisional truth; the curriculum soon confessed that those tidy orbits were but a shorthand for a more complex reality and replaced them with the abstract, buzzing picture of probability clouds and wave functions.
We weren't being lied to out of malice; we were being provided with a weak supervisor. The Bohr model served as the necessary, simplified scaffold that allowed our minds to eventually grasp the stronger (and far more complex) truth of quantum mechanics. We used a flawed model to bootstrap ourselves toward a more accurate one.
We know that the aether transmits transverse vibrations to very great distances without sensible loss of energy by dissipation.
- James Clerk Maxwell, "Encyclopædia Britannica, Ninth Edition"
History shows us that this isn't just a classroom tactic; it is the engine of discovery. Consider James Clerk Maxwell, who derived his groundbreaking equations for electromagnetism by imagining space was filled with literal mechanical "gears and idle wheels."
We now know there are no such invisible gears in the vacuum of space. The aether does not exist, and the model was fundamentally wrong. Yet, by using this mechanical Indigo as a conceptual springboard, Maxwell was able to produce a mathematical Blue that was far more vibrant than the model itself. The equations lived on; the gears were discarded.
In both cases, a stronger understanding emerged from training under a weaker supervisor. This is the essence of progress: using a flawed guide to find a truth that the guide itself could never have reached.
The Blue Emergence
For a long time,2 the cynical view of LLMs was that they were "stochastic parrots": sophisticated mirrors that could only reflect the logic already contained in human writing. In this view, if you provide a model with human-level data, you should expect nothing more than human-level output; you cannot get Blue if you only have Indigo to work with.
However, we are beginning to see glimpses that challenge this mirror-image theory. When a model solves an open problem that had occupied Donald Knuth for weeks, it suggests the system is navigating a latent space of logic that goes beyond simple mimicry:
Shock! Shock! I learned yesterday that an open problem I'd been working on for several weeks had just been solved by Claude Opus 4.6. It seems that I'll have to revise my opinions about "generative AI" one of these days.
- Donald Knuth, "Claude's Cycles"
This isn't an isolated anecdote. Terence Tao also recently mentioned a similar phase shift regarding the application of AI tools to Erdős problems:
Recently, the application of AI tools to Erdos problems passed a milestone: an Erdos problem #728 was solved more or less autonomously by AI... with the result not replicated in existing literature.
- Terence Tao, tooted on mathstodon.xyz
The phrase "not replicated in existing literature" is a compelling marker of this emergence.3 It suggests the model is producing results that its Indigo supervisors (the human researchers) had not yet discovered. These models are beginning to produce useful answers for questions no human has ever successfully answered before; they are showing signs of superhuman performance.
From Demonstration to Elicitation
As a result, for the purposes of alignment we do not need the weak supervisor to teach the strong model new capabilities; instead, we simply need the weak supervisor to elicit what the strong model already knows.
- Burns et al., "Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision"
If we accept that these models can outgrow us, we face a fundamental challenge in alignment. In the past, we aligned models through demonstration ("do as I do"). But you cannot provide a "gold standard" label for a problem you cannot solve yourself.
This realization is the driving force behind recent advancements in Weak-to-Strong Generalization. If the student model can be "bluer than the indigo," our role as supervisors must change:
- Elicitation over Demonstration: We are no longer teaching the model new capabilities. Instead, we are trying to find the right way to coax out the strong latent knowledge the model already possesses.
- Verification over Generation: While we may be "weak" at generating the solution to a complex new theorem, we are often "strong" enough to verify the logic once the model presents it to us. (A toy sketch of this setup follows below.)
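To make this concrete, here is a minimal toy sketch of the weak-to-strong protocol from Burns et al. Small scikit-learn classifiers stand in for the weak supervisor and the strong student; the paper itself works with language models, so every model and dataset choice here is an illustrative assumption, not their implementation.

```python
# Toy sketch of the weak-to-strong protocol from Burns et al.
# Assumption: small scikit-learn classifiers stand in for the weak
# supervisor and the strong student; the paper uses language models.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# A synthetic task standing in for the "ground truth" concept.
X, y = make_classification(n_samples=6000, n_features=20,
                           n_informative=15, random_state=0)
X_sup, X_rest, y_sup, y_rest = train_test_split(
    X, y, test_size=0.5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0)

# 1. The weak supervisor (the Indigo): deliberately limited capacity,
#    a linear model that only sees the first four features.
weak = LogisticRegression().fit(X_sup[:, :4], y_sup)
weak_acc = weak.score(X_test[:, :4], y_test)

# 2. The strong student (the Blue), trained on *weak* labels only.
weak_labels = weak.predict(X_train[:, :4])
w2s = GradientBoostingClassifier(random_state=0).fit(X_train, weak_labels)
w2s_acc = w2s.score(X_test, y_test)

# 3. The ceiling: the same strong model trained on ground truth.
ceiling = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
ceiling_acc = ceiling.score(X_test, y_test)

# Performance Gap Recovered (PGR), as defined in the paper: the share
# of the weak-to-ceiling gap the student closes despite never having
# seen a ground-truth label.
pgr = (w2s_acc - weak_acc) / (ceiling_acc - weak_acc)
print(f"weak {weak_acc:.3f} | weak-to-strong {w2s_acc:.3f} | "
      f"ceiling {ceiling_acc:.3f} | PGR {pgr:.2f}")
```

A PGR of 0 means the student merely imitates its supervisor's mistakes; a PGR of 1 means full Blue from Indigo. How much this toy run recovers will depend on the data and the seed; the point is the measurement protocol, not the particular number.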
Repaying the Teacher
Man vergilt einem Lehrer schlecht, wenn man immer nur der Schüler bleibt. ("One repays a teacher badly if one always remains merely a pupil.")
- Friedrich Nietzsche, "Also sprach Zarathustra: Ein Buch für Alle und Keinen" (Thus Spoke Zarathustra: A Book for All and None)
If we view AI alignment as a strictly corrective process (keeping the model within the bounds of what we already know), we are essentially demanding that the student never surpass the teacher. We are asking for an Indigo that never yields Blue.
But the ultimate goal of any education (and perhaps any creation) is to be surpassed. If our models only ever reached the ceiling of human performance, they would be a monumental disappointment. They would be a "bad repayment" to the centuries of human data and ingenuity that went into their construction.
Perhaps the path forward is not to resist this shift, but to adapt our roles as Indigo supervisors. If these models truly begin to outpace us, we must be open to the possibility that our own knowledge serves as a provisional scaffold: the mechanical gears that start the engine, allowing the student to eventually explore Blue horizons we cannot yet see. Even if it makes us feel a little blue.