Principles of Learning

💡

Description:

This is a living document consolidating principles for increasing one’s learning rate, based on the writing of Math Academy’s chief quant and director of analytics, Justin Skycak.

A (long) personal thought

In September I wrote about signing my 6th grader and myself up for courses on mathacademy.com. It’s been a month and we’re addicted and competing with each other to level up in our respective leagues by gaining XP. A unit of XP approximates “one minute of focused effort by a serious but imperfect student”.

[I’ve turned a bunch of readers onto this site just as I was turned on to it by another reader and now I got peeps texting me questions about it or telling me about their kids’ progress. You love to see it. Random side note: I have a good friend who just moved from my neighborhood to Austin because he’s deep in the education/AI intersection and the weird city is the scenius for education experimentation. I mentioned it to him and let’s just say he knew all about it from different angles. What he told me only got me even more stoked about what mathacademy is doing fwiw.]

In that post, I pasted links to 30 articles by the site’s chief quant, Justin Skycak, that I planned to read after already doing a fair bit of reading on the blog. I’ve plowed thru the 30 articles and then some, which is still just a fraction of what’s on there.

I’m personally highly interested in the entire topic of using AI to develop talent and learn at rates that were previously unthinkable. I have a large unfinished document with years of insights that I’ve pulled together from various sources that probably won’t see the light of day. For education I’m a big fan of writers like Scott H. Young, Cedric Chin, Matt Bateman, Kathleen Mercury and Freddie deBoer. You can search the substack for all the times I’ve referenced their work and I have plenty more in backlog. I’ve also harped on the degree to which SIG’s education was extremely well-mapped out from a pedagogical point of view. It wasn’t until I heard Todd Simkin explain the educational influences that informed how they taught that I appreciated the extent to which education theory underpinned their methods.

See:

🔗Educational Ideas Inspired By Seymour Papert’s Constructionism (Moontower)

🔗Notes From Todd Simkin On The Knowledge Project (Moontower)

🔗General & Childhood Education Articles (Moontower)

I’m adding Justin to my list of must-reads. After spending most of Sunday with the blog, I’ve synthesized a much more condensed version of Principles of Learning except it’s fully based on Justin’s insights.

I reached out to him when I first discovered the site and made my interest in what he’s doing as plain as possible. I told him:

I think being born on 3rd is to get exposure to someone when you are young who shows just how self imposed our speed limits are.

He hadn’t heard it put that way before.

I harp on this stuff. You’ve seen it on my affirmations page.

The wealth you give a youth is self-efficacy. A chance to match their abilities to the needs of communities they find themselves in as they get older. Autonomy and confidence through competence.

When I say “speed limit” I’m not referring to speed only, or even necessarily. It’s more about limits in general. In athletics, you can’t be LeBron no matter what you do. But whatever your limit is, it’s further than you think. It goes without saying that finding your limit requires brutal effort and commitment…but however far that gets you, personalized instruction will get you even further.

If a great teacher/mentor/coach will get you further than the frontier that caps out at a given level of effort, then that role has insane leverage. The very act of pushing through a previously-conceived frontier will increase your motivation and effort as you see what’s possible.

There was a Washington Post article several years ago referring to “America’s most advanced math program” in Pasadena. The kids were crushing the AP Calc BC exam in 8th grade.

Who were the teachers?

The founders of mathacademy.com

The Math Academy began as a tutoring program run by husband-and-wife duo Jason and Sandy Roberts before being formally adopted into the PUSD curriculum in 2017.

Seen narrowly, MathAcademy is an AI program that helps you learn math faster.

I think this is to miss what’s coming.

The instruction portion of the personalized coach is being automated.

I’m fairly convinced that we aren’t too far from knowledge not just being democratized (I mean Wikipedia already exists) but structured for delivery on incredibly effective, personalized rails.

Before someone’s reactance reflex gets all buzzy, I don’t mean “education” will be solved by a robot. Instruction is simply one component of education. Motivation, support, guidance, as well as the type of storytelling and conversation that relates classroom learning to the world and others are as human-based as a warm hug. But if the price for personalized instruction craters, the secondary effects are going to be large and visible.

At scale, we are going to find out just how many kids are capable of finishing Calc BC by grade 8 or publishing a novel in middle school. We hear those stories now and we dismiss them as “genius” or “privileged”.

But what if a low price for personalized instruction tells us we’re wrong about this? There will always be examples of genius or privilege. But if stories of insane achievement start multiplying amongst broken-English immigrants or other groups who are not advantaged in any way EXCEPT in motivation, then you’ll know that the things Justin is writing about turned out to be true.

The price of personalized instruction falling is not a panacea. The cost is only a bottleneck after basic needs like stability and safety are met. But the cost is an active bottleneck for all but the rich once those needs are met. Even expensive schools are only incrementally better on truly personalized instruction (their primary advantage might be the compression of the classroom range to a higher functioning average but that’s not the same as personalized instruction so much as a release from tolerating a small number of disproportionally disruptive students).

I’m fascinated by mathacademy because of what it telegraphs — a future of cheap personalized instruction. I’m not picturing slicker edtech apps here. This is a glimpse of something different.

Libraries were free. The internet is free, convenient, and wider reaching. Sal Khan is a prophet who built on its rails. Well, the tracks are being upgraded.

The trains are going to go faster.

This is a synthesis of what I got from reading Justin’s blog.

Maximizing the Learning Rate: A Neuroscience-Informed Approach to Education

The objective function of the educational strategy outlined below is to maximize the learning rate, helping students acquire and retain knowledge more effectively. There are certainly great programs for independent learning out there but the objective in this discussion is to leverage technology and cog sci to progress through levels of mastery faster.

What Neuroscience Has Taught Us About the Brain

These are some of the most durable findings in cognitive science.

Neuroplasticity: The brain’s ability to rewire itself through new experiences is one of the most significant findings in neuroscience. Neuroplasticity means that the brain continually adjusts its neural connections in response to new learning. This allows learners to develop new skills and adapt to challenges. Methods like deliberate practice, particularly its "effortful repetition" and "successive refinement" aspects, repeatedly strengthen neural pathways until tasks become second nature (Talent Development vs Traditional Schooling).

Dopamine and Motivation: Neuroscience has shown that dopamine, a neurotransmitter, plays a critical role in motivation and reward-based learning. When learners experience success, dopamine is released, reinforcing the behavior and encouraging continued effort. This makes motivation a crucial component of the learning process, as it directly influences how willing a learner is to persevere through challenges.

Working Memory and Its Limitations: The brain’s working memory, or the ability to hold and manipulate information temporarily, is limited. Overloading this system can impede learning, as the brain can only focus on a few pieces of information at once. Techniques like chunking—breaking down complex tasks into smaller, more manageable units—can help mitigate this overload (When Should You Do Math in Your Head vs Writing It Out on Paper?).

The Science of Forgetting: One of the most critical insights from cognitive psychology is the concept of forgetting curves. The theory, which dates back to Hermann Ebbinghaus’s pioneering research, shows that learners forget newly acquired information rapidly unless there is some form of reinforcement. The brain’s natural tendency to forget is often visualized in a forgetting curve, which steeply declines in the hours or days after learning. A related finding is the spacing effect: spacing out practice produces more long-term retention than massing it, even when the total amount of practice is the same.

Forgetting Curves and Memory Decay: Ebbinghaus’s forgetting curve demonstrates that without review or rehearsal, retention of new knowledge drops quickly over time. However, the rate of forgetting slows down when learners engage in retrieval practice and spaced repetition—both of which can flatten the curve, leading to more durable retention.
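To make the shape of this concrete, here is a minimal sketch of a forgetting curve. The exponential form, the `stability` parameter, and the idea that each review doubles stability are all my own simplifying assumptions for illustration. They are not taken from Justin's posts or from Ebbinghaus's data.

```python
import math

def retention(days_since_review: float, stability: float) -> float:
    """Recall probability under an ASSUMED exponential curve: exp(-t / stability)."""
    return math.exp(-days_since_review / stability)

def retention_at(horizon: float, review_days=(), stability: float = 1.0,
                 growth: float = 2.0) -> float:
    """Retention on day `horizon`, given the days on which reviews happened.

    Assumption: each review resets the clock and multiplies stability
    by `growth` (a hypothetical knob), which is what flattens the curve.
    """
    last_review = 0.0
    for day in sorted(review_days):
        if day > horizon:
            break
        stability *= growth
        last_review = day
    return retention(horizon - last_review, stability)

no_review = retention_at(30)
with_reviews = retention_at(30, review_days=[1, 3, 7, 15])
print(f"day 30, no review:    {no_review:.3f}")
print(f"day 30, four reviews: {with_reviews:.3f}")
```

Under these made-up parameters, retention on day 30 is essentially zero with no review and roughly 0.39 after four reviews. The numbers themselves mean nothing. The point is only that each review makes the next stretch of the curve decay more slowly.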

Spaced Repetition Leads to Automaticity: Over time, repeated retrieval practice pushes learners toward automaticity—the ability to recall information effortlessly. Once information is retrieved enough times across spaced intervals, it becomes deeply embedded in long-term memory. Efficiency is achieved through repeated activation and myelination – a process where neural pathways are coated with a substance called myelin, increasing the speed and efficiency of signal transmission.

This is the key loop:

Retrieval practice > Automaticity > Reduced demand on working memory >

The learner frees up cognitive resources for more complex tasks, facilitating better problem-solving and higher-order thinking.

“Automaticity frees up cognitive resources that would otherwise be consumed by basic recall tasks, allowing for higher-order cognitive tasks to take place in the working memory.”

What Pedagogy Research Has Taught Us

Pedagogy research provides practical strategies that align with neuroscience insights, helping us understand how to optimize learning environments:

Deliberate Practice: One of the most well-established findings in educational research is the importance of deliberate practice. Unlike passive or rote learning, deliberate practice focuses on honing specific skills through effortful repetition and immediate feedback. This approach helps students achieve automaticity, where foundational skills become second nature and free up cognitive resources for more complex problem-solving. This is why “deliberate practice” is regarded as the most effective training technique across talent domains (The Pedagogically Optimal Way to Learn Math).

Worked Examples to Reduce Cognitive Load: Especially in subjects like mathematics, worked examples are invaluable for novice learners. By showing step-by-step problem-solving processes, worked examples reduce cognitive load, allowing learners to focus on understanding the process rather than inventing solutions. This strategy is effective in reducing overwhelm, a key barrier to learning (Which Cognitive Psychology Findings Are Solid, That Can Be Used to Help Students Learn Better?).

Active Learning for Deeper Understanding: Research consistently shows that active learning—engaging students in activities like problem-solving, discussion, and teaching others—leads to better retention and understanding than passive learning methods like lectures. However, this active engagement must be paired with direct instruction, especially for novices, to prevent cognitive overload (Why is the EdTech Industry So Damn Soft?).

Interleaving Practice: Interleaving, or mixing different topics or skills within a study session, forces the brain to continually retrieve and apply information, strengthening neural connections. While it may feel harder for learners, this desirable difficulty improves long-term retention and the ability to transfer knowledge to new contexts (Which Cognitive Psychology Findings Are Solid, That Can Be Used to Help Students Learn Better?).

Implications for Implementation

Here’s how these insights translate into effective learning strategies:

Retrieval Practice and Minimizing Forgetting: The act of retrieving information from memory, rather than passively reviewing material, significantly boosts retention. Each successful retrieval attempt strengthens neural pathways and makes the knowledge more durable. As learners engage in retrieval, they disrupt the forgetting curve and prolong the retention of knowledge (Which Cognitive Psychology Findings Are Solid, That Can Be Used to Help Students Learn Better?).

Spaced Repetition for Long-Term Retention: A profound consequence of the spacing effect is that the more reviews are completed (with appropriate spacing), the longer the memory will be retained, and the longer one can wait until the next review is needed. This observation gives rise to a systematic method for reviewing previously-learned material called spaced repetition (or distributed practice). A "repetition" is a successful review at the appropriate time. By structuring review sessions at increasingly spaced intervals, learners allow time for memory consolidation. This reduces the steep decline of the forgetting curve, especially in the early stages of learning. Over time, the intervals between repetitions can be extended without significant loss in retention, enabling efficient long-term learning. The use of spaced repetition systems (SRS) has demonstrated significant improvements in student performance (Optimized, Individualized Spaced Repetition in Hierarchical Knowledge Structures).
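Here is a rough sketch of what "increasingly spaced intervals" can look like as a scheduling rule. The doubling multiplier and the reset-to-one-day rule on a failed review are assumptions I picked for simplicity. Real spaced repetition systems, Math Academy's included, tune these per student and per topic.

```python
from dataclasses import dataclass

@dataclass
class ReviewState:
    interval_days: float = 1.0  # how long to wait before the next review
    repetitions: int = 0        # successful reviews in a row

def update(state: ReviewState, passed: bool, multiplier: float = 2.0) -> ReviewState:
    """Stretch the interval after a successful review; start over after a miss."""
    if passed:
        return ReviewState(state.interval_days * multiplier, state.repetitions + 1)
    return ReviewState()  # assumed rule: a failed review resets the schedule

def schedule(outcomes) -> list:
    """Return the day on which each review happens, given pass/fail outcomes."""
    state, day, days = ReviewState(), 0.0, []
    for passed in outcomes:
        day += state.interval_days
        days.append(day)
        state = update(state, passed)
    return days

print(schedule([True, True, True, False, True]))
```

With three passes, a miss, and another pass, the reviews land on days 1, 3, 7, 15, and 16. The gaps widen while things are going well and snap back to one day after the miss.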

The testing effect, also known as the retrieval practice effect: the best way to review material is to test yourself on it, that is, practice retrieving it from memory, unassisted. To maximize the amount by which your memory is extended when solving review problems, it's necessary to avoid looking back at reference material unless you are totally stuck and cannot remember how to proceed. The testing effect can be combined with spaced repetition to produce an even more potent learning technique known as spaced retrieval practice.

During review, it's also best to spread minimal effective doses of practice across various skills. This is known as mixed practice or interleaving -- it's the opposite of "blocked" practice, which involves extensive consecutive repetition of a single skill. Blocked practice can give a false sense of mastery and fluency because it allows students to settle into a robotic rhythm of mindlessly applying one type of solution to one type of problem. Mixed practice, on the other hand, creates a "desirable difficulty" that promotes vastly superior retention and generalization, making it a more effective review strategy.
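The difference between blocked and mixed practice is easy to see in how a problem set gets ordered. This is a toy sketch with made-up skill names and problem IDs. Round-robin is just one simple way to interleave, and a real system would presumably choose the mix based on what each student is due to review.

```python
from itertools import chain, zip_longest

def blocked(problems_by_skill: dict) -> list:
    """All problems for one skill, then all problems for the next, and so on."""
    return list(chain.from_iterable(problems_by_skill.values()))

def interleaved(problems_by_skill: dict) -> list:
    """Round-robin across skills so consecutive problems need different methods."""
    rounds = zip_longest(*problems_by_skill.values())
    return [p for rnd in rounds for p in rnd if p is not None]

# Hypothetical review set
skills = {
    "fractions": ["f1", "f2", "f3"],
    "exponents": ["e1", "e2"],
    "linear_eq": ["l1", "l2", "l3"],
}

print(blocked(skills))
print(interleaved(skills))
```

The blocked order lets the student settle into one solution type at a time. The interleaved order forces them to decide which technique applies on every single problem, which is where the desirable difficulty comes from.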

An emphasis on automaticity: To free up mental processing power, it's critical to practice low-level skills enough that they can be carried out without requiring conscious effort. This is known as automaticity. Think of a basketball player who is running, dribbling, and strategizing all at the same time -- if they had to consciously manage every bounce and every stride, they'd be too overwhelmed to look around and strategize. The same is true in learning.

Common Misconceptions in Learning

It’s easy to fall into widely accepted beliefs about how people learn, but research has debunked many of these ideas. Here are a few myths that might surprise you:

Learning Styles: Contrary to popular belief, the idea that individuals have specific "learning styles" (e.g., visual, auditory, kinesthetic) and that teaching should be tailored to these styles is unsupported by research. While students may have preferences, tailoring instruction to those preferences does not significantly improve learning outcomes. Instead, using varied teaching methods that engage multiple senses enhances learning for all students (Why is the EdTech Industry So Damn Soft?). Veritasium has also called this the “biggest myth in education”.

The Myth of Productive Struggle: While allowing learners to struggle through difficult problems might seem beneficial, research has shown that this is often counterproductive, particularly for novices. Without proper guidance, prolonged struggle leads to frustration and disengagement. Scaffolding and explicit instruction provide the necessary support to avoid cognitive overload and enable meaningful progress (What’s the Best Way to Teach Math: Explicit Instruction or Less Guided Learning?).

Discovery Learning vs. Direct Instruction: The idea that students should learn concepts through self-discovery has been largely debunked, especially for beginners. Direct instruction, which provides clear guidance and support, has proven far more effective in most learning scenarios. Discovery learning works well for experts but can leave novices overwhelmed and unproductive, a paradoxical finding known as the “expertise reversal effect” (The Pedagogically Optimal Way to Learn Math).

The Illusion of Comprehension: Learners often mistake familiarity with material for true understanding—a phenomenon known as the illusion of comprehension. Just because something feels familiar doesn't mean the learner can apply it effectively. Combatting this requires practices like retrieval practice and interleaving, which force deeper engagement with the material (Which Cognitive Psychology Findings Are Solid, That Can Be Used to Help Students Learn Better?).

This knowledge isn’t new. Why does it feel that way?

Most key findings have been known for many decades.

It’s just that they’re not widely known / circulated outside the niche fields of cognitive science & talent development, not even in seemingly adjacent fields like education.

Why?

Let’s start with a foundational insight to this approach…

The most effective type of active learning is deliberate practice, which consists of individualized training activities specially chosen to improve specific aspects of a student's performance through repetition (effortful repetition, not mindless repetition) and successive refinement.

…feels underemphasized in conventional education circles.

A few possible reasons:

Because deliberate practice requires intense effort focused in areas beyond one's repertoire, which tends to be more effortful and less enjoyable, people will tend to avoid it, instead opting to ineffectively practice within their level of comfort (which is never a form of deliberate practice, no matter what activities are performed).

Instructional techniques that promote the most learning in experts, promote the least learning in beginners, and vice versa. This is known as the expertise reversal effect. An important consequence is that effective methods of practice for students typically should NOT emulate what experts do in the professional workplace (e.g., working in groups to solve open-ended problems). Beginners (i.e. students) learn most effectively through direct instruction.

Leveraging the insights is expensive because it requires additional effort from both teachers and students.

Each strategy increases the intensity of effort required from students and/or instructors, and the extra effort is then converted into an outsized gain in learning. This theme is so well-documented in the literature that it even has a catchy name: a practice condition that makes the task harder, slowing down the learning process yet improving recall and transfer, is known as a desirable difficulty. Desirable difficulties make practice more representative of true assessment conditions. Consequently, it is easy for students (and their teachers) to vastly overestimate their knowledge if they do not leverage desirable difficulties during practice, a phenomenon known as the illusion of comprehension.

Incentives: The typical teacher is incentivized to maximize the immediate performance and/or happiness of their students, which biases them against introducing desirable difficulties and incentivizes them to promote illusions of comprehension. Using desirable difficulties exposes the reality that students didn’t actually learn as much as they (and their teachers) “felt” they did under less effortful conditions. This reality is inconvenient to students and teachers alike; therefore, it is common to simply believe the illusion of learning and avoid activities that might present evidence to the contrary.

💡

Most edtech systems do not actually leverage the above findings.

If you pick any edtech system off the shelf and check whether it leverages each of the cognitive learning strategies I’ve described above, you’ll probably be surprised at how few it actually uses. For instance:

Tons of systems don't scaffold their content into bite-sized pieces.

Tons of systems allow students to move on to more material despite not demonstrating knowledge of prerequisite material.

Tons of systems don't do spaced review. (Moreover, tons of systems don't do ANY review.)

Sometimes a system will appear to leverage some finding, but if you look more closely it turns out that this is actually an illusion that is made possible by cutting corners somewhere less obvious.

For instance:

Tons of systems offer bite-sized pieces of content, BUT they accomplish this by watering down the content, cherry-picking the simplest cases of each problem type, and skipping lots of content that would reasonably be covered in a standard textbook.

Tons of systems make students do prerequisite lessons before moving on to more advanced lessons, BUT they don't actually measure tangible mastery on prerequisite lessons. Simply watching a video and/or attempting some problems is not mastery. The student has to actually be getting problems right, and those problems have to be representative of the content covered in the lesson.

Tons of systems claim to help students when they're struggling, BUT the way they do this is by lowering the bar for success on the learning task (e.g., by giving away hints). Really, what the system needs to do is take actions that are most likely to strengthen a student's area of weakness and empower them to clear the bar fully and independently on their next attempt.
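The gap between "did the prerequisite lesson" and "demonstrated mastery of the prerequisite" can be sketched as a gate. Everything here is hypothetical: the topic names, the prerequisite map, and the thresholds (five recent attempts at 80% accuracy) are placeholders I chose, not Math Academy's actual criteria.

```python
# Hypothetical prerequisite map: topic -> topics that must be mastered first
PREREQS = {
    "fractions": [],
    "linear_equations": ["fractions"],
    "quadratics": ["linear_equations"],
}

def is_mastered(results: list, min_attempts: int = 5, min_accuracy: float = 0.8) -> bool:
    """Mastery means getting recent problems RIGHT, not merely attempting them."""
    recent = results[-min_attempts:]
    if len(recent) < min_attempts:
        return False
    return sum(recent) / len(recent) >= min_accuracy

def unlocked(topic: str, history: dict) -> bool:
    """A topic opens up only once every prerequisite has been mastered."""
    return all(is_mastered(history.get(p, [])) for p in PREREQS[topic])

# Per-topic answer history for one imaginary student (True = correct)
history = {
    "fractions": [False, True, True, True, True, True],
    "linear_equations": [True, False],
}

print(unlocked("linear_equations", history))  # fractions has been mastered
print(unlocked("quadratics", history))        # linear_equations has not
```

A system that only checks "was the lesson opened?" would let this student into quadratics. A gate like this one would not, because two attempts at linear equations is not evidence of anything.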

I’m not saying that these issues apply to all edtech systems. I do think edtech is the way forward here – optimal teaching is an inhuman amount of work, and technology is needed. Heck, I personally developed all the quantitative software behind one system that properly handles the above challenges. All I’m saying is that you can’t just take these things at face value. Many edtech systems don’t really work from a learning standpoint, just as many psychology findings don’t hold up in replication – but at the same time, some edtech systems do work, shockingly well, just as some cognitive psychology findings do hold up and can be leveraged to massively increase student learning.

Accountability: Even if you leverage the above findings, you still have to hold students accountable for learning. Suppose you have the Platonic ideal of an edtech system that leverages all the above cognitive learning strategies to their fullest extent. Can you just put a student on it and expect them to learn? Heck no! That would only work for exceptionally motivated students. Most students are not motivated to learn the subject material. They need a responsible adult – such as a parent or a teacher – to incentivize them and hold them accountable for their behavior. I can’t tell you how many times I’ve seen the following situation play out:

Adult puts a student on an edtech system.

Student goofs off doing other things instead (e.g., watching YouTube).

Adult checks in, realizes the student is not accomplishing anything, and asks the student what's going on.

Student says that the system is too hard or otherwise doesn't work.

Adult might take the student's word at face value. Or, if the adult notices that the student hasn't actually attempted any work and calls them out on it, the scenario repeats with the student putting forth as little effort as possible -- enough to convince the adult that they're trying, but not enough to really make progress. In these situations, here’s what needs to happen:

The adult needs to sit down next to the student and force them to actually put forth the effort required to use the system properly.

Once it's established that the student is able to make progress by putting forth sufficient effort, the adult needs to continue holding the student accountable for their daily progress. If the student ever stops making progress, the adult needs to sit down next to the student again and get them back on the rails.

To keep the student on the rails without having to sit down next to them all the time, the adult needs to set up an incentive structure. Even little things go a long way, like "if you complete all your work this week then we'll go get ice cream on the weekend," or "no video games tonight until you complete your work." The incentive has to be centered around something that the student actually cares about, whether that be dessert, gaming, movies, books, etc. Even if an adult puts a student on an edtech system that is truly optimal, if the adult clocks out and stops holding the student accountable for completing their work every day, then of course the overall learning outcome is going to be worse.

None of this is a knock on teachers or parents. The truth is that the problem is irreducibly difficult: leveraging cognitive learning strategies to their fullest extent requires an inhuman amount of effort from teachers. Consider the ideal classroom:

Every individual student is fully engaged in productive problem-solving, with immediate feedback (including remedial support when necessary), on the specific types of problems, and in the specific types of settings (e.g., with vs without reference material, blocked vs interleaved, timed vs untimed), that will move the needle the most for their personal learning progress at that specific moment in time.
This is happening throughout the entirety of class time, the only exceptions being those brief moments when a student is introduced to a new topic and observes a worked example before jumping into active problem-solving.

Why is this an inhuman amount of work?

Mapping and responding to the student’s level

First of all, it's at best extremely difficult, and at worst (and most commonly) impossible, to find a type of problem that is productive for all students in the class. Even if a teacher chooses a type of problem that is appropriate for what they perceive to be the "class average" knowledge profile, it will typically be too hard for many students and too easy for many others (an unproductive use of time for those students either way).
To even know the specific problem types that each student needs to work on, the teacher has to separately track each student's progress on each problem type, manage a spaced repetition schedule of when each student needs to review each topic, and continually update each schedule based on the student's performance (which can be incredibly complicated given that each time a student learns or reviews an advanced topic, they're implicitly reviewing many simpler topics, all of whose repetition schedules need to be adjusted as a result, depending on how the student performed). This is an inhuman amount of bookkeeping and computation.
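Even the simplest slice of that bookkeeping, figuring out who needs to review what today, is a lookup across every student and topic pair. The sketch below uses invented students, topics, and due dates, and it ignores the hard part entirely (updating schedules after each attempt).

```python
from collections import defaultdict

# (student, topic) -> day on which the next review is due; all values hypothetical
due_day = {
    ("ana", "fractions"): 3,
    ("ana", "exponents"): 10,
    ("ben", "fractions"): 5,
    ("ben", "exponents"): 2,
}

def due_reviews(due_day: dict, today: int) -> dict:
    """Group the topics each student is due to review on or before `today`."""
    plan = defaultdict(list)
    for (student, topic), day in sorted(due_day.items()):
        if day <= today:
            plan[student].append(topic)
    return dict(plan)

print(due_reviews(due_day, today=5))
```

Two students and two topics fit in your head. Thirty students across a few hundred topics, with every schedule shifting after every answer, is the part no human teacher can keep up with by hand.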

Each at their own pace

Even on the rare occasion that a teacher manages to find a type of problem that is productive for all students in the class, different students will require different amounts of practice to master the solution technique. Some students will catch on quickly and be ready to move on to more difficult problems after solving just a couple problems of the given type, while other students will require many more attempts before they are able to solve problems of the given type successfully on their own. Additionally, some students will solve problems quickly while others will require more time.

In the absence of the proper technology, it is impossible for a single human teacher to deliver an optimal learning experience to a classroom of many students with heterogeneous knowledge profiles, who all need to work on different types of problems and receive immediate feedback on each attempt.

Connecting It All: The Flywheel of Competence, Confidence, and Motivation

When neuroscience and pedagogy principles are applied in tandem, they create a reinforcing cycle that propels students toward continuous growth and mastery:

Competence: Effective learning techniques, such as deliberate practice and retrieval practice, build competence. As learners master fundamental skills, they achieve automaticity, allowing them to perform basic tasks effortlessly, freeing up mental resources for tackling more advanced problems (Automaticity for Cognitive Efficiency).

Confidence: With growing competence comes confidence. When learners see themselves succeeding—whether it's mastering a math concept or improving in a skill—they are more likely to tackle new challenges with a positive mindset. This confidence feeds into their willingness to engage with difficult tasks (Recreational Mathematics: Why Focus on Projects Over Puzzles).

Motivation: Confidence breeds motivation. As students become more confident in their abilities, they are more driven to continue learning. This motivation reinforces their engagement in deliberate practice, completing the flywheel and leading to greater competence over time. Accountability, whether through structured learning programs or paid educational platforms, also plays a role in keeping learners committed to their goals.

Key points from select posts

Recreational Mathematics: Why Focus on Projects Over Puzzles (2 min read)

There’s only so much fun you can have trying to follow another person’s footsteps to arrive at a known solution. There’s only so much confidence you can build from fighting against a problem that someone else has intentionally set up to be well-posed and elegantly solvable if you think about it the right way.

Why is the EdTech Industry So Damn Soft? (11 min read)

The expectation of free or cheap online learning is detrimental to the industry. Skycak argues that this leads companies to prioritize a massive user base over effective learning strategies. The reasoning:

1) The Problem with Free: When companies offer their products for free or at very low prices, they need a massive user base to survive. This creates a problem because most people are not serious about learning and prefer less effective, easier methods.

2) Effective Learning is Hard Work: Learning is like an "exhausting workout with a personal trainer," requiring consistent effort. Only a small percentage of people are willing to invest this time and effort.

3) The Trap of "Unserious Learners": Catering to a large base of unserious learners forces companies to adopt ineffective teaching methods that don't challenge students. This makes these companies "frauds" because they prioritize engagement over actual learning.

4) The Power of Accountability: Charging a fair price for educational products allows companies to be accountable for student results. When students pay, they demand results, pushing companies to provide valuable learning experiences.

The Situation with AI in STEM Education (11 min read)

LLMs fall short in teaching complex technical skills and concepts. While LLMs can provide surface-level information, they struggle to convey the depth required for practical application in fields like neural networks. For example, an LLM might help a student understand the general idea of neural networks, but it can't teach them how to code one from scratch, troubleshoot errors, or explain the nuances of different model components.

The major limitation of LLMs in education is their reliance on student-initiated questions. Effective teachers don't simply answer questions; they guide students through a structured learning process, scaffolding information and addressing knowledge gaps. LLMs, like ChatGPT, primarily respond to prompts, lacking the pedagogical ability to anticipate a student's needs or direct their learning path.

A successful STEM education requires breaking down complex concepts into a progression of manageable steps, providing appropriate practice, and addressing individual student struggles.

The promise of AI in education overemphasizes the role of "explanation". Scaffolding and learning management are equally important. Skycak cautions against prioritizing AI's ability to engage in conversational dialogue over its capacity to deliver well-structured, personalized learning experiences. He emphasizes the need for frequent assessments, personalized remediation, and spaced repetition to reinforce learning.

Optimized, Individualized Spaced Repetition in Hierarchical Knowledge Structures (22 min read)

Fractional Implicit Repetition (FIRe): This model, developed by the author for mathematics education, recognizes the interconnected nature of mathematical knowledge. Repetitions on advanced topics "trickle down" to implicitly reinforce simpler, encompassed topics. The model adjusts repetition schedules accordingly, leading to significant efficiency gains.

Repetition Compression: This key concept highlights how practicing advanced mathematical skills implicitly reinforces simpler, encompassed skills. This interconnectedness allows for fewer explicit reviews as advanced practice inherently covers foundational elements, leading to more efficient learning.

Theoretical Maximum Learning Efficiency: In physics, nothing can travel faster than the speed of light. It is the theoretical maximum speed that any physical object can attain. A universal constant. In the context of spaced repetition, there is an analogous concept: theoretical maximum learning efficiency, which posits that in a perfectly encompassed body of knowledge, it's theoretically possible to achieve mastery through continuously learning new, progressively advanced topics without ever explicitly reviewing old material. This idea, while theoretical, underscores the power of leveraging knowledge interconnectedness.

Calibrating to Individual Students and Topics: The model tailors the learning pace to each student and topic. It recognizes that learning speed varies based on individual ability and topic difficulty, leading to a personalized learning experience. Weaker students, particularly on difficult topics, might benefit less from implicit repetitions, requiring more explicit reviews.

Importance of Encompassing Graphs (as opposed to prerequisite graphs): Unlike prerequisite graphs, which show learning dependencies, encompassing graphs map how practicing advanced topics reinforces prior knowledge. Constructing these graphs is a laborious, manual process requiring significant domain expertise, highlighting the importance of expert-designed learning pathways.
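To make the "trickle down" idea concrete, here is a minimal Python sketch of how one explicit review might hand out fractional credit through an encompassing graph. This is not Math Academy's actual algorithm: the topic names, the edge weights, and the rule of keeping the largest credit across paths (rather than summing) are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical encompassing graph (an assumption, not real Math Academy data):
# topic -> {encompassed topic: fraction of a repetition it implicitly receives}
ENCOMPASSES = {
    "solving_quadratics": {"factoring": 0.5, "solving_linear_equations": 0.5},
    "factoring": {"multiplication": 0.5},
    "solving_linear_equations": {"multiplication": 0.25},
}


def implicit_credit(topic, graph=ENCOMPASSES):
    """Return the repetition credit each topic gets from one review of `topic`.

    The reviewed topic gets a full repetition (1.0). Credit flows down to
    encompassed topics, multiplied by the edge fraction at every step.
    Assumption: when several paths reach the same topic, keep the largest
    credit instead of adding them up, so no topic exceeds one repetition.
    """
    credit = defaultdict(float)
    credit[topic] = 1.0
    stack = [(topic, 1.0)]
    while stack:
        current, weight = stack.pop()
        for child, fraction in graph.get(current, {}).items():
            flowed = weight * fraction
            if flowed > credit[child]:
                credit[child] = flowed
                stack.append((child, flowed))
    return dict(credit)


if __name__ == "__main__":
    for name, value in implicit_credit("solving_quadratics").items():
        print(f"{name}: {value}")
```

In this toy setup a single review of the most advanced topic partially refreshes everything beneath it, which is why a scheduler built on such a graph could push back the explicit reviews of simpler topics. How much to push them back, and how to discount credit for weaker students or harder topics, is left out of the sketch.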

From the Pasadena experience covered in the media

Evidence of our learning efficiency from our original in-school program in Pasadena. Sixth-graders begin with Prealgebra and progress through the entire high school math curriculum (Algebra 1, Geometry, Algebra 2, Precalculus) by 8th grade. They then tackle AP Calculus BC and take the AP exam.

Initially, AP scores were decent with manual teaching. However, after implementing our automated system—which includes the spaced repetition system (SRS) described here—AP Calculus BC exam scores soared. Most students passed, with the majority achieving the maximum score of 5 out of 5. That same year, four students unaffiliated with our Pasadena program used our system independently. Three scored a perfect 5 on the AP exam, while one received a 4.

We even witnessed seemingly impossible feats. Some highly motivated 6th graders, starting midway through Prealgebra, completed the entire high school math curriculum and began AP Calculus BC within a single school year. This progress was so rapid that when Math Academy founders Jason and Sandy first saw a 6th grader receiving Calculus tasks, Jason's reaction was, "WTF is happening with the model? Why is this kid getting calculus tasks? He placed into Prealgebra last fall—this doesn't make any sense." Upon investigation, I confirmed it was legitimate—this student had indeed mastered all of high school math in just one year.

If you're interested in learning more, check out these links: mathacademy.us/press, a relevant Reddit thread, a story I posted on X/Twitter (and some follow-up to that story).

Talent Development vs Traditional Schooling (12 min read)

Orthogonality of Talent Development and Schooling: The source argues that traditional schooling, with its age-based grouping and standardized curricula, often fails to effectively nurture talent. This stark contrast emphasizes the need for specialized approaches outside the traditional classroom setting. Talent development is not only different from schooling, but in many cases completely orthogonal to schooling: "For one portion of our sample, talent development and schooling were almost two separate spheres of their life. ... Usually the student made the adjustments, resolving the conflict by doing all that was a part of schooling and then finding the additional time, energy, and resources for talent development. ... Mathematicians found and worked through special books and engaged in special projects and programs outside of school. Sometimes the schools or particular teachers made minor adjustments to dissipate the conflict. Mathematicians were sometimes excused from a class they were too advanced for and allowed to work on their own in the library. Sometimes they were accelerated one grade as a concession to their outside learning. ... Whether the individual or the school made these adjustments, it was clear that these adjustments minimized conflict but did little to assist in talent development. The individual was able to work at both schooling and talent development, although with minimum interaction. ... Talent development and schooling were isolated from one another. Schooling did not assist in talent development, but in these instances it did not interfere with talent development."

Individualized Instruction in Talent Development: Unlike the group-focused approach of schools, talent development thrives on personalized instruction, tailoring learning tasks to individual needs and ensuring mastery before moving on. This distinction underscores the importance of personalized learning pathways in maximizing potential.

Longitudinal Accountability vs. Cross-Sectional Focus: In talent development, instructors often work with individuals over extended periods, fostering a deep understanding of their strengths and weaknesses. This contrasts with traditional schooling's focus on short-term performance within a specific grade or course.

Prohibitive Cost of Effective Talent Development: The source highlights the significant financial barrier to personalized talent development, particularly in mathematics. The high cost of private tutoring makes it inaccessible for many, raising concerns about equity and access.

The Pedagogically Optimal Way to Learn Math (14 min read)

Deliberate Practice over Challenge Problems: Focusing solely on challenging problems can be inefficient, especially for learners who haven't mastered foundational skills. The source advocates for deliberate practice - focused repetition with continual refinement - as a more effective approach, aligning with the emphasis on deliberate practice in other sources.

Expertise Reversal Effect: Instructional strategies effective for experts may not be ideal for beginners. While experts thrive on open-ended problem solving, novices benefit more from direct instruction and scaffolded learning experiences.

Worked Examples for Cognitive Load Management: The source argues for the use of worked examples to reduce cognitive load and facilitate learning, particularly as mathematical concepts become more complex.

Layering Skills for Deeper Understanding: Mastering advanced skills built upon foundational knowledge naturally fosters a deeper understanding of those foundations. This "structural integrity" of knowledge highlights how progression itself can deepen understanding, challenging the notion that worked examples hinder deep learning.

Importance of Expert Guidance in Deliberate Practice: The source stresses that even with metacognitive skills, learners often need expert guidance to design effective deliberate practice activities. Experts can identify weaknesses, target specific skills, and create appropriately challenging exercises.

What’s the Best Way to Teach Math: Explicit Instruction or Less Guided Learning? (10 min read)

Unequivocal Support for Explicit Instruction: This is a fake debate. Explicit instruction dominates minimally guided approaches, and decades of research support its effectiveness, particularly for novice learners. The source challenges the notion that "discovery learning" is inherently superior.

Addressing the "Struggle Builds Problem-Solving" Argument: The source refutes the common argument that students need prolonged struggle with minimal guidance to develop problem-solving skills. It argues that such an approach is not supported by evidence, especially for non-expert learners.

Active Learning with Direct Instruction: The source clarifies that active learning is superior to passive learning but emphasizes that this doesn't equate to endorsing minimal guidance. It highlights the effectiveness of combining active learning with direct instruction, embodied in the concept of deliberate practice.

Addressing Misinterpretations of Research: The source cautions against misinterpreting research findings that show active learning with minimal guidance outperforming passive learning with direct instruction. It emphasizes that such results don't necessarily mean minimal guidance is superior, as other factors are often at play.

For anyone who has dug into the scientific literature on direct vs unguided instruction, I should point out a common area of confusion. In some scientific experiments, active learning with minimal guidance has outperformed passive learning with direct instruction. These experiments are often misinterpreted by the lay audience as providing support for minimal guidance – which is not even remotely true, even though minimal guidance is a part of the more successful experimental condition. The thing is, it’s widely known that active learning is superior to passive learning. Students who spend a class period actively solving problems, hands-on, learn more than students who sit and passively listen to a lecture. So, any experimental condition involving active learning will likely outperform any experimental condition involving passive learning, no matter what other second-order strategies are tacked on! You could take tired students, engage them in active learning, and probably get a better learning outcome than well-rested students experiencing passive learning. But does that mean being tired is good for learning? Heck no! By the same reasoning, if you find that active learning with minimal guidance outperforms passive learning with direct instruction, you cannot conclude that minimal guidance is superior to direct instruction. In general, if you change two variables at once and get a better outcome, you can’t conclude anything about either of the individual variables. All you can conclude is that one combination is better than the other combination. In the case of minimal guidance vs direct instruction, it turns out that when you do both in an active learning setting, direct instruction outperforms minimal guidance. Here’s the big picture: active/direct > active/unguided > passive/direct. (I didn’t include passive/unguided here because I’m not sure it’s even possible to create such a combination.) Now, how do you do direct instruction and active learning simultaneously? Is that even possible? Well, that’s essentially what “deliberate practice” is, and in the academic field of talent development, there’s a mountain of evidence supporting deliberate practice as the most effective training technique across a wide variety of talent domains. In particular, a key finding is that the volume of accumulated deliberate practice is the single biggest factor responsible for individual differences in performance among elite performers. (The next biggest factor is genetics, and the relative contributions of deliberate practice vs genetics can vary significantly across talent domains.) This seems to be a general result across all talent domains: as far as I’m aware, no counterexamples have been found.
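The "two variables at once" point in the passage above can be checked with a toy model. The Python sketch below is only an illustration: the additive scoring rule and every number in it are made up, not taken from any study. It builds two hypothetical worlds, one where direct instruction helps and one where it hurts, and shows that the confounded comparison (active/unguided vs passive/direct) comes out the same in both, so it cannot tell them apart.

```python
def outcome(active, direct, effects):
    """Toy additive model: base score plus an effect for each condition that is on."""
    return effects["base"] + effects["active"] * active + effects["direct"] * direct


# Two hypothetical worlds with invented effect sizes (illustrative only).
WORLD_DIRECT_HELPS = {"base": 50, "active": 20, "direct": 5}
WORLD_DIRECT_HURTS = {"base": 50, "active": 20, "direct": -5}


def confounded_comparison(effects):
    """Active/unguided vs passive/direct: both variables change at once."""
    return outcome(True, False, effects) > outcome(False, True, effects)


def controlled_comparison(effects):
    """Active/direct vs active/unguided: only the guidance variable changes."""
    return outcome(True, True, effects) > outcome(True, False, effects)


if __name__ == "__main__":
    for name, world in [("direct helps", WORLD_DIRECT_HELPS),
                        ("direct hurts", WORLD_DIRECT_HURTS)]:
        print(name,
              "| active/unguided beats passive/direct:", confounded_comparison(world),
              "| active/direct beats active/unguided:", controlled_comparison(world))
```

Only the controlled comparison, where both groups learn actively and guidance is the single thing that differs, gives different answers in the two worlds. That is the kind of comparison the passage relies on for its ranking.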

Which Cognitive Psychology Findings are Solid, That Can Be Used to Help Students Learn Better? (15 min read)

Active Learning Superiority: This source reaffirms the well-established finding that active learning, involving problem-solving and engagement, leads to better learning outcomes than passive approaches like traditional lectures.

Spacing Effect and Spaced Retrieval Practice: The source highlights the power of spaced repetition and retrieval practice, scientifically validated techniques for enhancing long-term retention. These strategies are presented as fundamental principles for effective learning.

Mixed Practice and Interleaving: The source advocates for interleaving different skills during practice rather than focusing on one skill at a time (blocked practice). This mixed practice promotes better retention and generalization by forcing learners to retrieve and apply different concepts, creating "desirable difficulty."

Automaticity for Cognitive Efficiency: Mastering fundamental skills to the point of automaticity (performing them without conscious effort) frees up cognitive resources for higher-order thinking. This concept highlights the importance of fluency in foundational knowledge for tackling more complex problems.

Desirable Difficulties and Illusion of Comprehension: The source introduces the concepts of "desirable difficulties" and the "illusion of comprehension." Desirable difficulties, while making learning more effortful in the short term, lead to long-term benefits. The illusion of comprehension refers to overestimating understanding when learning feels easy, often due to ineffective strategies.

Challenges in Implementing Effective Strategies: The source acknowledges the challenges in implementing research-backed learning strategies, citing factors like increased effort for teachers and students, the difficulty of individualization in traditional classrooms, and the lack of effective technology solutions.

My notes from Justin’s book The Math Academy Way

🎗️

A useful reminder from You Will Never Achieve Your Goals Unless You Transform Yourself Into a Person Who is Capable of Achieving Them:

You want to do something that sets you apart? You’re going to have to work harder than most.

Actually, let’s re-print the entire post:

The #1 confusion that I hear when people ask me about math, ML/AI, startups, etc., is they think there’s a way to achieve outsized success without putting in an outsized amount of work. You want to do something that sets you apart? You’re going to have to work harder than most. There is no way around it. You think you can get good at math by watching YouTube videos? Develop cutting-edge ML/AI by asking ChatGPT to code it up for you? Put a dent in the universe working 40 hours per week? If you think any of those things, then you will never achieve your goals because you will never transform yourself into a person who is capable of achieving them. And guess what? It’s not enough to simply work hard. To achieve outsized success, it’s critical to not only put in enough time/effort, but also to work productively. You have to work hard AND work smart. And furthermore, work in a direction where you have some competitive advantage (or, at least, you’re not at a disadvantage). Part of this work involves engaging in activities that maximize the likelihood of you getting some lucky breaks. You have to work to maximize your luck surface area.

Principles of Learning Fast