A Stanford study tracked what happened when students practiced math with a chatbot. They performed better in the moment — tasks went faster, outputs looked polished. Then they took a closed-book exam and scored up to 17% worse than peers who had studied alone.

The OECD’s Digital Education Outlook 2026 calls this a “mirage of false mastery.” The performance improves. The learning doesn’t.

This is the thing that’s hard to see from the outside and obvious from the inside. A student can produce a sophisticated essay, a strong analysis, a well-structured argument — and have no idea how any of it was assembled. The output is real. The understanding isn’t.

What We Mean When We Talk About AI Use

Researchers are starting to pin down a distinction worth paying attention to. Cognitive offloading is what humans have always done — writing things down so you don’t have to hold them in your head, using a calculator so you can focus on the problem structure, looking something up instead of memorizing it. Tools extend thinking. That’s what tools are for.

Cognitive outsourcing is different. That’s when the thinking itself gets handed off — when the student isn’t using a tool to extend their reasoning but replacing it entirely. No wrestling with competing evidence. No personally building the argument. The product exists; the cognitive work never happened.

This distinction matters because schools have a theory — or should have one — about what they’re building. We’re not just grading essays. We’re building the kind of mind that can produce an essay when there’s no AI around, can hold a position under pressure, can change course when the evidence demands it. A student who can prompt effectively but can’t think independently isn’t prepared for the moments when prompting isn’t an option: a conversation, a performance, a job that requires real-time judgment.

The Electric Bike Problem

The U.S. Department of Education, in its AI guidance, offered a telling metaphor. AI should work like an electric bike, they said — it “multiplies effort” and reduces the burden of the task.

The analogy was meant to be reassuring. It is actually an admission of the problem.

Because if you stop pedaling an electric bike, you are being carried. And the legs don’t grow. Cognitive muscles — the ability to sustain focused attention, hold competing ideas in tension, push through confusion toward a conclusion — develop through resistance, not assistance. The struggle isn’t the obstacle to learning. The struggle is the mechanism.

Anthropic’s own education research found that 76% of students primarily use AI for higher-order thinking tasks — analysis, synthesis, evaluation. Those are the exact tasks that build the cognitive capacity we’re trying to develop. Students aren’t using AI to format citations. They’re using it to do the thinking that teachers assume they’re doing themselves.

Vermont Is the Rare Exception

Vermont released a developmental framework in January 2026 that is worth knowing about, partly because most states haven’t attempted one. Their guidance recommends no AI chatbot use at all for PreK through second grade, curriculum-embedded AI only for grades 3–5, structured education-specific tools for middle school, and broader AI fluency development in high school.

You don’t write that policy if you believe AI is just another productivity tool. You write it if you believe cognitive development is staged — that there are foundational years when students need to build capacity before you introduce anything that can do the work for them.

Most districts have landed somewhere between “ban everything” and “AI literacy is the future, embrace it” — without asking what the developmental cost of unstructured early outsourcing actually is. Vermont is at least asking the question. The rest of us are just hoping it turns out fine.

When You Use AI Matters More Than Whether You Use It

A 2026 study looked at when, in a problem-solving task, students consulted AI — early in the process, or after they had already partially worked through it themselves. Students who waited performed significantly better on follow-up assessments than students who opened the chatbot first.

The thinking that happened before the AI was the point. The answer the AI produced was beside it.

This matters for how we design instruction, not just how we design policy. A student who is required to commit to a position, work through the evidence, and articulate their reasoning before accessing any AI will learn something the student who opens ChatGPT first will not. The sequencing forces the student to do the cognitive work that the tool cannot do for them.

The Larger Stakes

The bigger question sits behind all of this. We’re at a point where AI tools are useful, increasingly embedded, and actively reshaping how people engage with information and argument. The students who are 12 and 15 and 17 right now will be the ones who inherit the decisions about how AI develops. They will vote on it, build with it, regulate it, and live with the consequences of how those systems get designed.

If we raise a generation that knows how to prompt but not how to think — that can generate output but can’t interrogate it — we don’t produce the kind of citizens who can navigate what comes next. We need people who can look at an AI-generated conclusion and ask whether it’s correct. People who can push back on a model’s reasoning. People who can make judgments about when to trust the tool and when to override it.

That requires having built the capacity for judgment in the first place. You can’t develop it by skipping the years when it’s supposed to form.

Takeaway for Teachers

The product tells you almost nothing about what a student actually knows. If you want to find out, ask them to explain their reasoning out loud — not in writing, not after preparation, just: “Walk me through why you made that choice.” Thirty seconds of verbal explanation will reveal more about student understanding than a polished paragraph will.

A student who can describe the confusion they worked through, why they chose one piece of evidence over another, where they changed their mind — that student did the work. A student who can’t field a follow-up question about their own argument didn’t, regardless of how good the essay looks.

Redesigning every assignment is a serious lift. Adding a thirty-second oral check to one assignment per week isn’t. Start there, and see what you learn about what’s actually happening.


If you’re thinking through questions like this one — what AI does to developing minds, and what we owe students in response — my book goes deeper. The AI Doesn’t Know Your Students is available on Amazon and at shouldiuse.ai/book/.

David Jacobson is a high school history teacher. He writes about AI, education, and the messy intersection of the two at shouldiuse.ai.
