One of my students, sharp and a little mischievous, walked up to the whiteboard during a compare-and-contrast activity on the Silk Roads and the Indian Ocean Trade Networks. She wrote: “The Silk Roads were on land, and the Indian Ocean Trade Routes were on water.”
She giggled when I read it out loud. A few others smiled. It looked like the most obvious thing a person could say.
She was right: not just factually correct, but right in the deeper sense that matters for historical thinking. What she had done was name the foundational distinction. Everything else about those two networks follows from it. Good historical thinking works this way: you name the obvious first, and you build from it. The broad frame is what makes the specific insight visible.
Here’s what connects that moment to AI: a generative tool would never do this. Ask AI to compare the Silk Roads and the Indian Ocean Trade Networks and you’ll get something immediately impressive — a well-organized paragraph on goods, cultural exchange, religious diffusion, political dynamics. What you won’t get is a student standing at the board, a little nervous, writing the thing that seems too simple to say. AI produces the shape of sophisticated analysis. It skips the step where you figure out what the obvious actually is. Students need to practice that step.
That student was doing something AI can’t do for her. She was learning to think like a historian. The AI would only have performed the part of one.
The wrong question
Most conversations about AI and student work are organized around a compliance question: how do we stop students from using AI? That question produces detection tools, honor codes, academic integrity policies — all of which have their place, but none of which change what actually happens when students sit down to do the work.
The more useful question is: how do we design so the thinking happens regardless?
This is the central argument of Chapter Twelve of The AI Doesn’t Know Your Students, and it’s the one I return to most often in practice. Not because it’s idealistic, but because it’s operational. You can actually do something with it on Monday.
The design mistake most teachers make — and it’s understandable — is treating the output as if it were the learning. If the output is all that matters, AI is a threat to the assignment. But if the learning is what matters, the output is just evidence of it, and evidence can be gathered in other ways: through conversation, through revision, through documented process, through questions that require the student’s specific knowledge — knowledge that no AI can supply, because no AI was in your classroom last Tuesday.
What four decades of research has been pointing toward
Manu Kapur’s work on productive failure found, across more than 12,000 participants, that students who attempt problems before receiving instruction outperform students who receive instruction first, particularly on measures of conceptual understanding and transfer. The struggle activates learning processes that smooth delivery sequences don’t. The implication for AI follows directly: if a student uses AI to produce a first draft and is then asked to do something genuinely hard with it (identify what it got wrong, defend its choices against your questions, revise toward greater specificity), the productive struggle has moved to the revision stage. If that stage requires real thinking, the sequence still works.
John Sweller’s cognitive load theory gives you the design language. The distinction between extraneous load (the cognitive cost of managing mechanics: formatting, locating information, constructing serviceable sentences) and intrinsic load (the actual thinking: argumentation, evaluation, synthesis) clarifies where AI belongs. What AI does well is handle extraneous load. What it cannot do is handle intrinsic load. It cannot bring the student’s specific classroom experience, their evolving understanding, their genuine uncertainty, their emerging argument. If we design so the extraneous work can be delegated while the intrinsic work stays with the student, we’ve created a more focused cognitive environment, not a degraded one.
Gert Biesta names the stakes most directly. Education has three functions: qualification (developing skills and knowledge), socialization (learning to participate in existing norms), and subjectification — developing students as genuine agents capable of thinking, choosing, and acting on their own judgment. An AI-completed assignment submitted unchanged is not subjectification. A student who uses AI as a starting point and then exercises real judgment — about what to keep, what to cut, what the AI got wrong, what their own position actually is — is doing something much closer to it. The difference is almost entirely in the design.
Three moves that actually work
None of these require banning AI. None require pretending students aren’t using it.
Require visible AI use, not hidden use. When AI use is disclosed and documented — when students submit their original prompt, the AI output, and a brief annotation explaining what they changed and why — it becomes an object of instruction. The annotation alone requires the student to read critically, articulate a judgment, and account for their choices. That’s the thinking. Designing for it costs almost nothing.
Design the defense, not just the draft. If students know they’ll have to explain their work — in a short conversation, a follow-up question, five minutes at the start of the next class — the AI draft changes character. It becomes material they need to understand, not just something they submitted. A single appended question works: “What would a thoughtful skeptic say about your argument, and how would you respond?” No AI can answer that the way a student who actually worked through the material can.
Build a classroom-anchored question into every major assignment. Include one question that explicitly requires reference to something from your specific classroom — a primary source examined together, a distinction drawn in discussion, a piece of evidence the unit foregrounded. “Based on the Ottoman sources we looked at last week — not what you could find online, but specifically what came up in class — what’s missing from this analysis?” No AI can answer that. The student either was in the room or wasn’t. This isn’t a gotcha mechanism. It’s an acknowledgment that classroom learning happens in a specific place with specific people and produces knowledge that can’t be replicated from a search.
The problem at the other end
Most of the conversation about AI and academic integrity focuses on students who use it to skip work they couldn’t otherwise do. But there’s a failure mode on the other end that gets less attention: students who could do the work well, and still find AI flattening their thinking.
A strong writer who relies on an AI scaffold doesn’t get to practice the part where they fight through an argument and come out the other side with something that’s genuinely theirs. For these students, AI functions as a ceiling rather than a floor. The productive struggle they need isn’t about getting to a serviceable draft — it’s about going somewhere genuinely unexpected, and that destination doesn’t show up in AI output.
The design challenge isn’t only “how do we make sure struggling students do real thinking?” It’s also “how do we make sure strong students don’t settle?”
Robert Bjork’s research on desirable difficulties offers the diagnostic: conditions that make performance feel easy in the moment often reduce long-term retention. If AI handles the initial retrieval effort, that ease may come at the cost of durable learning unless the difficult work is restored at a later stage. The design moves above attempt exactly that — restore productive difficulty at the revision and critique stage, rather than eliminating it at the drafting stage. Not to make tasks harder for their own sake. To make sure the difficulty that actually builds capacity stays with the student.
Takeaway for Teachers
Tell students they may use AI to produce a first draft on an upcoming assignment — but they must submit three things alongside the final work: the original prompt they gave the AI, the AI’s initial output, and a brief annotation (150–200 words) explaining what they kept, what they changed, and why. Students who engaged critically will have substantive things to say. Students who pasted without reading will reveal it. You’re not grading the AI’s draft. You’re grading whether the student understood it well enough to improve it.
Then add one question that requires the student’s specific experience in your classroom. Something no AI could answer. Because the student either was in the room, or wasn’t.
Chapter Twelve of The AI Doesn’t Know Your Students goes deeper on all of this — the research foundations, the specific classroom sequences, and the failure modes on both ends. If this framing is useful to you, the full book is available now.
David Jacobson is a high school history teacher. He writes about AI, education, and the messy intersection of the two at shouldiuse.ai.
