<?xml version="1.0" encoding="UTF-8"?>
<rss  xmlns:atom="http://www.w3.org/2005/Atom" 
      xmlns:media="http://search.yahoo.com/mrss/" 
      xmlns:content="http://purl.org/rss/1.0/modules/content/" 
      xmlns:dc="http://purl.org/dc/elements/1.1/" 
      version="2.0">
<channel>
<title>Jon Cardoso-Silva</title>
<link>https://jonjoncardoso.github.io/blog/</link>
<atom:link href="https://jonjoncardoso.github.io/blog/index.xml" rel="self" type="application/rss+xml"/>
<description>Data science education, generative AI in teaching, and research at LSE.</description>
<image>
<url>https://jonjoncardoso.github.io/images/Cardoso-Silva_Jon.jpeg</url>
<title>Jon Cardoso-Silva</title>
<link>https://jonjoncardoso.github.io/blog/</link>
</image>
<generator>quarto-1.9.37</generator>
<lastBuildDate>Sat, 11 Apr 2026 00:00:00 GMT</lastBuildDate>
<item>
  <title>The Wrong Test: How AI Exposed a Flaw in How We Measure Learning</title>
  <dc:creator>Jon Cardoso-Silva</dc:creator>
  <link>https://jonjoncardoso.github.io/blog/posts/2026-04-11-performance-vs-learning.html</link>
  <description><![CDATA[ 





<details class="key-definitions">
<summary>
<strong>Key Definitions</strong> used in this article
</summary>
<table class="glossary_table table">
<thead><tr><th>Term</th><th>Definition</th></tr></thead>
<tbody>
<tr><td>desirable difficulties</td><td>Manipulations that slow acquisition but enhance retention and transfer, such as spacing practice, interleaving task types, and testing rather than re-reading. Term coined by Robert Bjork. Conditions that make practice feel harder often produce better long-term outcomes (Soderstrom and Bjork, 2015).</td></tr>
<tr><td>learning</td><td>Used in two senses across this blog. When discussing outcomes, learning consists of relatively permanent changes in knowledge or skills that persist beyond practice and transfer to new contexts (Soderstrom and Bjork, 2015). But when discussing process, learning refers to knowledge created through the transformation of experience (Kolb, 1984), where the doing, reflecting, and experimenting are the learning.</td></tr>
<tr><td>meta-analysis</td><td>A statistical method that combines results from multiple independent studies to estimate an overall effect. A single study might be too small to be conclusive; a meta-analysis pools the evidence across many studies.</td></tr>
<tr><td>performance</td><td>Temporary changes in behaviour or knowledge observable during or immediately after practice. A student can perform well today and fail the same task next month. Performance during acquisition does not reliably indicate learning (Soderstrom and Bjork, 2015).</td></tr>
<tr><td>transfer</td><td>Using what you learned in one context to handle a different one. Includes solving a new problem unaided (direct application) and learning something new faster because of prior experience (preparation for future learning). See Bransford and Schwartz (1999).</td></tr>
</tbody>
</table>
</details>
<p>Since ChatGPT launched in late 2022, anyone who teaches has had to reconsider how their students learn. I teach data science and programming at LSE, and the early conversation about the “death of the essay” applied to my courses too, since students write up their analyses alongside their code. But I worried at least as much about the coding, because ChatGPT could already write working code. The tools have only got better since: Claude Code, GitHub Copilot, Cursor, Lovable, and something new every few weeks.</p>
<p>The recent experimental evidence suggests this worry is grounded. Bastani et al. <span class="citation" data-cites="bastani2025generative">(2025)</span> gave roughly a thousand high school maths students access to GPT-4 for practice and found that the group with unrestricted access scored 17% worse on the subsequent exam than students who never had the tool, even though their practice scores had gone up. Fan et al. <span class="citation" data-cites="fan2025metacognitive">(2025)</span> ran a similar comparison with essay writing: the ChatGPT group produced higher-scoring work than those who used no AI, and even those paired with a human writing expert, but showed no advantage on a knowledge test or a <button class="glossary"><span class="def">Using what you learned in one context to handle a different one. Includes solving a new problem unaided (direct application) and learning something new faster because of prior experience (preparation for future learning). See Bransford and Schwartz (1999).</span>transfer</button> task given the same day.</p>
<p>I use these tools myself every day, and I am excited about what they let me do that I could not do before. But it does worry me when I cannot tell whether my students have actually understood the material or whether they just handed in whatever the chatbot produced. That worry gets a bit existential when you sit down to revise your syllabus or plan the graded assignments for a new term. What am I actually measuring in an assignment? How do I design coursework that rewards the learning process rather than just a coherent output? This post is about grappling with those questions.</p>
<section id="the-gap-between-output-and-understanding" class="level2 page-columns page-full">
<h2 class="anchored" data-anchor-id="the-gap-between-output-and-understanding">The gap between output and understanding</h2>
<p>A <button class="glossary"><span class="def">A statistical method that combines results from multiple independent studies to estimate an overall effect. A single study might be too small to be conclusive; a meta-analysis pools the evidence across many studies.</span>meta-analysis</button> of 69 ChatGPT experiments from 2022 to 2024 <span class="citation" data-cites="deng2025chatgpt">(Deng et al., 2025)</span> finds that students using ChatGPT achieve higher academic performance, feel more motivated, and score better on higher-order thinking tasks, all while exerting <em>less</em> mental effort. More learning for less effort? Sounds great, right?!</p>
<div class="page-columns page-full"><p>But Deng et al.&nbsp;flag a measurement problem in their own data: of the 51 studies that contributed to the performance estimate, nine allowed students to use ChatGPT during the assessment itself, 33 did not report whether it was allowed, and only nine clearly prohibited it. The positive findings may reflect the quality of ChatGPT’s output rather than anything the students learned <span class="citation" data-cites="yan2025distinguishing">(Yan et al., 2025)</span>. A separate review <span class="citation" data-cites="walker2025learning">(Walker &amp; Vorvoreanu, 2025)</span> reaches a similar conclusion: <span class="ink-highlight ink-p-coral">unstructured generative AI use in formal learning is associated with weaker memory, less critical engagement, and growing dependence on the tool</span><span class="ink-ref" data-ink-ref="ink-auto-1">1</span>.</p><div class="no-row-height column-margin column-container"><span class="ink-note margin-aside"><span class="ink-ref" data-ink-ref="ink-auto-1">1</span> It’s not all just bad news, though! I have a new blog post coming soon about the conditions under which Generative AI helps with learning.</span></div></div>
<p>When researchers remove the AI from students, the gains disappear. Akgun &amp; Toker <span class="citation" data-cites="akgun2025shortterm">(2025)</span> found that advantages measured on immediate tasks had vanished entirely three weeks later. Darvishi et al. <span class="citation" data-cites="darvishi2024impact">(2024)</span> found that students who had relied on AI produced lower-quality work after its removal than students who never had it. Even Bastani et al.’s pedagogically constrained GPT Tutor, which withheld direct answers and pushed students to reason through problems, only managed to avoid harm and did not outperform the no-AI control on the subsequent exam <span class="citation" data-cites="bastani2025generative">(Bastani et al., 2025)</span>.</p>
<p>Interestingly, Bastani et al.&nbsp;also ran an NLP classification of student messages and found that in 95% of GPT Base conversations, students asked for the answer at least once. The error pattern in the data tells us something about how they treated those answers. GPT-4 was correct only 51% of the time, making logical errors on 42% of problems and arithmetic errors on 8%. A student who was reading and evaluating the solutions would catch arithmetic mistakes more easily than logical ones, since checking a calculation is simpler than evaluating a line of reasoning. But both error types reduced practice scores by similar amounts, suggesting students were accepting the output wholesale rather than, say, internalising wrong concepts they had ‘learned’ from the AI. It is also telling that the students themselves never reported feeling that they had learned less, even though their exam scores had dropped by 17%.</p>
<div class="page-columns page-full"><p>Kosmyna et al. <span class="citation" data-cites="kosmyna2025brain">(2025)</span> approached the question from a different direction<span class="ink-ref" data-ink-ref="ink-auto-2">2</span>. Using EEG to measure brain activity during essay writing across three conditions (ChatGPT, a search engine, and no tools), they found that cognitive engagement scaled down with the level of external support, with ChatGPT users showing the weakest neural connectivity. Participants who later switched from ChatGPT to working alone showed reduced neural engagement compared to those who had never used it, a pattern the researchers call <span class="rough-underline">cognitive debt</span> <span class="ink-ref" data-ink-ref="ink-auto-3">3</span>. Over four months, the ChatGPT group underperformed at neural, linguistic, and behavioural levels, and struggled to recall or quote their own essays.</p><div class="no-row-height column-margin column-container"><span class="ink-note margin-aside"><span class="ink-ref" data-ink-ref="ink-auto-2">2</span> You might have come across this study before as it made quite the splash <a href="https://www.media.mit.edu/posts/your-brain-on-chatgpt-in-the-news/">in the media</a> and online in general.</span><span class="ink-note margin-aside"><span class="ink-ref" data-ink-ref="ink-auto-3">3</span> There has been a proliferation of new terms to describe this in the literature! I plan to write separately about the difference between them. There’s <span class="rough-underline">cognitive offloading</span>, <span class="rough-underline">cognitive debt</span>, our own <span class="rough-underline">cognitive bypass</span> <span class="citation" data-cites="sallai2024bypass">(Sallai et al., 2024)</span>, and the newly coined <span class="rough-underline">cognitive surrender</span> <span class="citation" data-cites="shaw2026trisystem">(Shaw &amp; Nave, 2026)</span>.</span></div></div>
</section>
<section id="learning-science-predicted-this" class="level2 page-columns page-full">
<h2 class="anchored" data-anchor-id="learning-science-predicted-this">Learning science predicted this</h2>
<div class="page-columns page-full"><p>I would wager that none of the findings above would come as a surprise to a cognitive psychologist. The gap between how well students perform during practice and how much they actually retain has been studied for decades, long before anyone had heard of ChatGPT. Soderstrom &amp; Bjork’s <span class="citation" data-cites="soderstrom2015learning">(2015)</span> <em>Learning versus performance: An integrative review</em><span class="ink-ref" data-ink-ref="ink-auto-4">4</span> synthesises that body of work. What looks like learning during practice often is not. Students can perform well while receiving instruction but fail tests on the same topic weeks later. Conversely, and counterintuitively, students who struggle during practice often outperform on delayed tests. That is, conditions that make acquisition <em>feel</em> easier often produce worse long-term outcomes.</p><div class="no-row-height column-margin column-container"><span class="ink-note margin-aside"><span class="ink-ref" data-ink-ref="ink-auto-4">4</span> Google Scholar counts <a href="https://scholar.google.co.uk/scholar?cluster=17718342143045074281&amp;hl=en&amp;as_sdt=2005&amp;sciodt=0,5">nearly 1000 citations</a> for this paper.</span></div></div>
<div class="page-columns page-full"><p>Bjork &amp; Bjork’s <span class="citation" data-cites="bjork1992newtheory">(1992)</span> “new theory of disuse” offers a possible mechanism for this. They distinguish <span class="rough-underline"><em>storage strength</em></span> (how integrated a memory is with other knowledge) from <span class="rough-underline"><em>retrieval strength</em></span> (how accessible it is right now). Gains in storage strength are greater when current retrieval strength is lower. The harder you work to retrieve something, the more that retrieval strengthens the memory. Bjork coined the term <span class="hand-circled ink-p-secondary" style="color:#a9371b"><strong>“desirable difficulties”</strong></span> <span class="ink-ref" data-ink-ref="ink-auto-5">5</span> for manipulations that slow acquisition but enhance retention and transfer: spacing practice across sessions rather than cramming, interleaving different task types rather than practising one to mastery, and testing rather than re-reading.</p><div class="no-row-height column-margin column-container"><span class="ink-note margin-aside"><span class="ink-ref" data-ink-ref="ink-auto-5">5</span> I love this concept! My students, not so much… I might write about how I engineer pedagogical attrition in my courses in the future.</span></div></div>
<blockquote class="blockquote">
<p>Generative AI provides exactly the conditions that the cognitive science research would predict to be harmful. GenAI reduces difficulty, provides constant guidance, and makes work <em>feel</em> fluent. The apparent fluency is itself a huge part of the problem: students may believe they have learned because the task felt easy, when in fact they have not built durable memory or transferable skills.</p>
</blockquote>
<p>Several mechanisms contribute to this, from the straightforward (the model does the cognitive work so the student never practises it) to the subtler (the student stops evaluating the model’s output, and then stops noticing when it is wrong). I unpack the different labels researchers have given to these processes, and how they relate to each other, in an upcoming blog post. The short version is that these labels describe the same problem at different levels, from the general mechanism to the behaviour you see in a student’s chat log.</p>
<p>The decades of research that Soderstrom and Bjork review show that ease during practice is a weak guide to what people retain weeks later and to how they fare on a genuinely new task. The mismatch between performance and learning existed long before GenAI, but a chatbot can now produce the polished output that used to require understanding. When we grade what students can generate in a single sitting, with or without a model whispering in the tab, we are mostly auditing that output.</p>
</section>
<section id="assessment-built-on-the-wrong-theory" class="level2 page-columns page-full">
<h2 class="anchored" data-anchor-id="assessment-built-on-the-wrong-theory">Assessment built on the wrong theory</h2>
<p>Most university courses and the studies I described above share the same assumption about how to find out whether students learned: teach them, let them practise, then remove the support and judge how they tackle a similar task on their own. Bransford &amp; Schwartz <span class="citation" data-cites="bransford1999rethinking">(1999)</span> call this <span class="rough-underline">“sequestered problem solving” (SPS)</span>.</p>
<p>Bransford &amp; Schwartz argue that SPS, and the “direct application” theory of transfer that accompanies it, are responsible for much of the pessimism about whether education produces transfer at all. Under SPS, transfer looks rare. Students tested in isolation frequently fail to produce adequate solutions <span class="citation" data-cites="bransford1999rethinking">(1999, pp. 66–68)</span>, and the conclusion is that their education did not prepare them.</p>
<p>But consider what SPS misses. When Bransford &amp; Schwartz <span class="citation" data-cites="bransford1999rethinking">(1999, pp. 66–67)</span> asked fifth graders and college students to create recovery plans for bald eagles, neither group produced adequate plans. Under SPS, both failed. But when asked what they would need to research, the groups diverged: fifth graders asked about individual eagles (“How big are they?”), while college students asked structural questions about ecosystems, historical threats, and the kinds of specialists needed. Their prior learning had not given them the answer, but it had prepared them to ask better questions.</p>
<p>Bransford &amp; Schwartz call this <span class="rough-underline">“preparation for future learning” (PFL)</span>. Where direct application asks “can you apply what you know?”, PFL asks “has what you know prepared you to learn new things?” The evidence for PFL is found in process: the sophistication of questions asked, the quality of hypotheses formed, the ability to seek and use resources effectively, the trajectory of improvement when given the chance to revise.</p>
<p>Nobody would assess a newly qualified teacher by locking her in a room and testing whether she can recall her education courses from memory. You would watch her teach over time: how she adapts to her students, how she seeks feedback, how her practice improves. That is PFL assessment. Yet for most students, in most courses, we give them the locked-room version and call the result “evidence of learning.”</p>
<div class="page-columns page-full"><p>Broudy (1977, as discussed by Bransford &amp; Schwartz <span class="citation" data-cites="bransford1999rethinking">(1999)</span>) offers a useful concept here: <span class="rough-underline">“knowing with.”</span> Beyond replicating facts (knowing that) and applying procedures (knowing how), people perceive and interpret the world <em>through</em> their accumulated knowledge. An educated person <em>“thinks, perceives and judges with everything that he has studied in school, even though he cannot recall these learnings on demand.”</em> You forget the details of a biology course, but the concept of bacterial infection still shapes how you interpret illness. You forget specific statistical formulas, but the idea that data has variability still shapes how you read a graph <span class="ink-ref" data-ink-ref="ink-auto-6">6</span>. This residual framework, largely tacit, is what Broudy calls “knowing with,” and it is what PFL-style assessment tries to detect.</p><div class="no-row-height column-margin column-container"><span class="ink-note margin-aside"><span class="ink-ref" data-ink-ref="ink-auto-6">6</span> This vibes a lot with the style of education Paulo Freire advocated in his critical pedagogy work.</span></div></div>
<p>SPS cannot tell us whether GenAI-mediated learning changed what students “know with,” whether it shifted their questions or their readiness to learn the next thing. Our conventional assessments cannot tell us either. As Bransford &amp; Schwartz write: <em>“Despite the value of the SPS methodology, it often comes with a set of unexamined assumptions about what it means to know and understand. The most important assumption is that ‘real transfer’ involves only the direct application of previous learning; we believe that this assumption has unduly limited the field’s perspective.”</em></p>
<blockquote class="blockquote">
<p>We still treat one solo shot at the problem as the verdict on learning. Preparation for future learning looks elsewhere: at how students approach problems they have not seen before, and whether they get better when given the chance to revise.</p>
</blockquote>
</section>
<section id="watching-how-students-learn" class="level2 page-columns page-full">
<h2 class="anchored" data-anchor-id="watching-how-students-learn">Watching how students learn</h2>
<div class="page-columns page-full"><p>The alternative to sequestered problem solving assessment is to assess how students learn when given the opportunity to do so, using all the resources available to them, <span class="ink-highlight ink-p-coral">including Generative AI</span>. This is what Bransford &amp; Schwartz’s PFL perspective implies, and it shifts attention from product to process<span class="ink-ref" data-ink-ref="ink-auto-7">7</span>.</p><div class="no-row-height column-margin column-container"><span class="ink-note margin-aside"><span class="ink-ref" data-ink-ref="ink-auto-7">7</span> Much of the assessment theory here draws on work from the late 1990s. From what I can tell, the arguments still hold up. A recent systematic review of reviews on problem-based learning <span class="citation" data-cites="amoa-danquah_systematic_2025">(Amoa-Danquah &amp; Carbonneau, 2025)</span> confirms that process-focused, learner-centred approaches continue to show benefits for engagement and critical thinking. I plan to write separately about PBL and project-based learning.</span></div></div>
<p>Bransford &amp; Schwartz <span class="citation" data-cites="bransford1999rethinking">(1999)</span> gave a concrete example in 1999: students attempted a geometry challenge, rated how confident they were, and chose how much help they wanted (from a brief definition up to an interactive simulation). They then tried an analogous problem. What mattered was how they responded to difficulty: did the student recognise they were stuck, pick help that addressed the gap, and improve on the second attempt?</p>
<p>Current assessment rarely captures any of this. A student who uses ChatGPT to produce correct code and a student who struggled with the problem, asked the AI specific questions, and modified what it returned would receive the same mark on most rubrics. The first student may not even notice a gap in their understanding when the problem changes. The second has already shown that they can notice one and act on it.</p>
<hr>
<div class="page-columns page-full"><p>Although I do not yet have answers for how to implement process-based assessment at scale, my colleagues and I have been developing a framework to help make sense of what students actually do when they interact with GenAI. The GENIAL Framework <span class="citation" data-cites="cardoso-silva_mapping_2025">(Cardoso-Silva et al., 2025)</span> provides a tool for investigating the process of learning when mediated by GenAI. It describes several engagement patterns, of which two are most relevant here: “Resourceful” (students who use AI to support their own thinking by adapting suggestions, asking follow-up questions, and testing alternatives) and “Receptive” (students who delegate thinking to AI by copying output without modification and accepting answers without evaluation). In Bransford &amp; Schwartz’s terms, Resourceful engagement looks like PFL evidence and Receptive engagement looks like SPS failure made visible in real time <span class="ink-ref" data-ink-ref="ink-auto-8">8</span>.</p><div class="no-row-height column-margin column-container"><span class="ink-note margin-aside"><span class="ink-ref" data-ink-ref="ink-auto-8">8</span> I plan to write separately about what process-based assessment looks like in practice.</span></div></div>
<p>Observing learning trajectories takes time. It requires reading students’ process alongside their products, examining how they interact with tools, tracking their questions and revisions. This is labour-intensive work that does not scale through automation (yet?), because the judgment required is precisely the kind that cannot be delegated to an algorithm without recreating the problem we are trying to solve. If we outsource the evaluation of learning processes to AI, we risk recreating at the assessment level the same gap we identified at the student level: the appearance of rigour without the substance.</p>
<p>Educators who want to assess learning rather than performance need time to observe processes, design dynamic assessments, and evaluate trajectories. This means smaller class sizes, so that markers have the time and motivation to investigate their students’ learning processes when grading, or more teaching staff, or both. It means administrators recognising that AI-resilient assessment is not a technology problem with a technology solution. The answer may be, uncomfortably to some, more humans, not more AI.</p>
<div class="page-columns page-full"><p>That said, the evidence is not uniformly bleak. There are conditions under which students who use AI learn more than those who work without it, and the design choices that separate productive use from dependency are becoming clearer. <span class="ink-ref" data-ink-ref="ink-auto-9">9</span></p><div class="no-row-height column-margin column-container"><span class="ink-note margin-aside"><span class="ink-ref" data-ink-ref="ink-auto-9">9</span> I plan to cover those findings in a new blog post.</span></div></div>
</section>
<section id="references" class="level2">
<h2 class="anchored" data-anchor-id="references">References</h2>
<div id="refs" class="references csl-bib-body hanging-indent" data-entry-spacing="0" data-line-spacing="2">
<div id="ref-akgun2025shortterm" class="csl-entry">
Akgun, M., &amp; Toker, S. (2025). Short-<span>Term</span> <span>Gains</span>, <span>Long</span>-<span>Term</span> <span>Gaps</span>: The <span>Impact</span> of <span>GenAI</span> and <span>Search</span> <span>Technologies</span> on <span>Retention</span>. <em>arXiv.org</em>. <a href="https://doi.org/10.48550/ARXIV.2507.07357">https://doi.org/10.48550/ARXIV.2507.07357</a>
</div>
<div id="ref-amoa-danquah_systematic_2025" class="csl-entry">
Amoa-Danquah, P., &amp; Carbonneau, K. J. (2025). A <span>Systematic</span> <span>Review</span> of <span>Reviews</span> on <span>Problem</span>-<span>Based</span> <span>Learning</span> and <span>Its</span> <span>Effectiveness</span>. <em>Current Issues in Education</em>, <em>26</em>(2). <a href="https://doi.org/10.14507/cie.vol26iss2.2293">https://doi.org/10.14507/cie.vol26iss2.2293</a>
</div>
<div id="ref-bastani2025generative" class="csl-entry">
Bastani, H., Bastani, O., Sungu, A., Ge, H., Kabakcı, Ö., &amp; Mariman, R. (2025). Generative <span>AI</span> without guardrails can harm learning: Evidence from high school mathematics. <em>Proceedings of the National Academy of Sciences</em>, <em>122</em>(26), e2422633122. <a href="https://doi.org/10.1073/pnas.2422633122">https://doi.org/10.1073/pnas.2422633122</a>
</div>
<div id="ref-bjork1992newtheory" class="csl-entry">
Bjork, R. A., &amp; Bjork, E. L. (1992). A new theory of disuse and an old theory of stimulus fluctuation. In A. F. Healy, S. M. Kosslyn, &amp; R. M. Shiffrin (Eds.), <em>From learning processes to cognitive processes: Essays in honor of William K. Estes</em> (Vol. 2, pp. 35–67). Erlbaum.
</div>
<div id="ref-bransford1999rethinking" class="csl-entry">
Bransford, J. D., &amp; Schwartz, D. L. (1999). Rethinking transfer: A simple proposal with multiple implications. In <em>Review of research in education</em> (Vol. 24, pp. 61–100). American Educational Research Association. <a href="https://doi.org/10.2307/1167267">https://doi.org/10.2307/1167267</a>
</div>
<div id="ref-cardoso-silva_mapping_2025" class="csl-entry">
Cardoso-Silva, J., Sallai, D., Kearney, C., Panero, F., &amp; Barreto, M. E. (2025). Mapping <span>Student</span>-<span>GenAI</span> <span>Interactions</span> onto <span>Experiential</span> <span>Learning</span>: <span>The</span> <span>GENIAL</span> <span>Framework</span>. <em>SSRN Electronic Journal</em>, 22. <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5674422">https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5674422</a>
</div>
<div id="ref-darvishi2024impact" class="csl-entry">
Darvishi, A., Khosravi, H., Sadiq, S., Gašević, D., &amp; Siemens, G. (2024). Impact of <span>AI</span> assistance on student agency. <em>Computers &amp; Education</em>, <em>210</em>, 104967. <a href="https://doi.org/10.1016/j.compedu.2023.104967">https://doi.org/10.1016/j.compedu.2023.104967</a>
</div>
<div id="ref-deng2025chatgpt" class="csl-entry">
Deng, R., Jiang, M., Yu, X., Lu, Y., &amp; Liu, S. (2025). Does <span>ChatGPT</span> enhance student learning? A systematic review and meta-analysis of experimental studies. <em>Computers &amp; Education</em>, <em>227</em>, 105224. <a href="https://doi.org/10.1016/j.compedu.2024.105224">https://doi.org/10.1016/j.compedu.2024.105224</a>
</div>
<div id="ref-fan2025metacognitive" class="csl-entry">
Fan, Y., Tang, L., Le, H., Shen, K., Tan, S., Zhao, Y., Shen, Y., Li, X., &amp; Gašević, D. (2025). Beware of metacognitive laziness: <span>Effects</span> of generative artificial intelligence on learning motivation, processes, and performance. <em>British Journal of Educational Technology</em>, <em>56</em>(2), 489–530. <a href="https://doi.org/10.1111/bjet.13544">https://doi.org/10.1111/bjet.13544</a>
</div>
<div id="ref-kosmyna2025brain" class="csl-entry">
Kosmyna, N., Hauptmann, E., Yuan, Y. T., Situ, J., Liao, X.-H., Beresnitzky, A. V., Braunstein, I., &amp; Maes, P. (2025). <em>Your <span>Brain</span> on <span>ChatGPT</span>: <span>Accumulation</span> of <span>Cognitive</span> <span>Debt</span> when <span>Using</span> an <span>AI</span> <span>Assistant</span> for <span>Essay</span> <span>Writing</span> <span>Task</span></em>. arXiv. <a href="https://doi.org/10.48550/ARXIV.2506.08872">https://doi.org/10.48550/ARXIV.2506.08872</a>
</div>
<div id="ref-sallai2024bypass" class="csl-entry">
Sallai, D., Cardoso-Silva, J., Barreto, M. E., Panero, F., Berrada, G., &amp; Luxmoore, S. (2024). Approach generative AI tools proactively or risk bypassing the learning process in higher education. <em>LSE Public Policy Review</em>, <em>3</em>(3), 7. <a href="https://doi.org/10.31389/lseppr.108">https://doi.org/10.31389/lseppr.108</a>
</div>
<div id="ref-shaw2026trisystem" class="csl-entry">
Shaw, S. D., &amp; Nave, G. (2026). <em>Thinking—<span>Fast</span>, <span>Slow</span>, and <span>Artificial</span>: <span>How</span> <span>AI</span> is <span>Reshaping</span> <span>Human</span> <span>Reasoning</span> and the <span>Rise</span> of <span>Cognitive</span> <span>Surrender</span></em>. PsyArXiv. <a href="https://doi.org/10.31234/osf.io/yk25n_v1">https://doi.org/10.31234/osf.io/yk25n_v1</a>
</div>
<div id="ref-soderstrom2015learning" class="csl-entry">
Soderstrom, N. C., &amp; Bjork, R. A. (2015). Learning versus performance: An integrative review. <em>Perspectives on Psychological Science</em>, <em>10</em>(2), 176–199. <a href="https://doi.org/10.1177/1745691615569000">https://doi.org/10.1177/1745691615569000</a>
</div>
<div id="ref-walker2025learning" class="csl-entry">
Walker, K., &amp; Vorvoreanu, M. (2025). <em>Learning outcomes with <span>GenAI</span> in the classroom: A review of empirical evidence</em> [Microsoft Aether Psychological Influences of AI (Psi) working group]. Microsoft Research. <a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2025/10/GenAILearningOutcomes-Report-published-10-07-2025.pdf">https://www.microsoft.com/en-us/research/wp-content/uploads/2025/10/GenAILearningOutcomes-Report-published-10-07-2025.pdf</a>
</div>
<div id="ref-yan2025distinguishing" class="csl-entry">
Yan, L., Greiff, S., Lodge, J. M., &amp; Gašević, D. (2025). Distinguishing performance gains from learning when using generative <span>AI</span>. <em>Nature Reviews Psychology</em>, <em>4</em>, 435–436. <a href="https://doi.org/10.1038/s44159-025-00467-5">https://doi.org/10.1038/s44159-025-00467-5</a>
</div>
</div>


</section>

 ]]></description>
  <category>performance vs learning</category>
  <category>desirable difficulties</category>
  <guid>https://jonjoncardoso.github.io/blog/posts/2026-04-11-performance-vs-learning.html</guid>
  <pubDate>Sat, 11 Apr 2026 00:00:00 GMT</pubDate>
  <media:content url="https://jonjoncardoso.github.io/blog/images/slab_seed_81081.webp" medium="image" type="image/webp"/>
</item>
<item>
  <title>Coming Soon</title>
  <link>https://jonjoncardoso.github.io/blog/posts/coming-soon.html</link>
  <description><![CDATA[ 





<p>This space will feature insights on data science education, generative AI in teaching, and evidence-based pedagogical research.</p>



 ]]></description>
  <category>announcement</category>
  <guid>https://jonjoncardoso.github.io/blog/posts/coming-soon.html</guid>
  <pubDate>Fri, 22 Aug 2025 00:00:00 GMT</pubDate>
  <media:content url="https://jonjoncardoso.github.io/blog/posts/images/synthesis-cover.webp" medium="image" type="image/webp"/>
</item>
</channel>
</rss>
