How to Teach Element 2 of 4: Transformations

Sometimes a process isn't a process...

Jun 02, 2025

Podcast is AI generated, and has definitely made mistakes (e.g. no mention of verbalising a process.) Interactive transcript available in the podcast post.

In a previous post, How to Teach Anything, we saw that nearly everything, the hundreds and thousands of ideas we’ll come to teach in maths, can be atomised into one of just four elements.

How to Teach Anything

Kristopher Boulton

May 19

“It feels like magic. Feels like there’s this button you pressed, and suddenly all the kids are getting everything right.”

Previously we looked at how to teach Element 1: Categoricals.

How to Teach Element 1 of 4: Categoricals

Kristopher Boulton

May 23

Next up:

How do you identify whether one of those atoms is a transformation?
And once you do, how exactly do you teach it?

Identifying Transformations

Transformations have an input and an output. A prompt and a response. You present something to students and then they present something back to you, according to the transformation.

For example:

Transformations

In each case something is happening according to a set rule that takes the prompt and turns out the correct response.

Importantly, in each case, it is immediately apparent from the surface details of the prompt how the response was arrived at. This means that none of the examples below are valid transformations.

Not Transformations

None of the above are transformations because it is never immediately apparent just from the surface details how the prompt became the response.

Now, a distinction between categoricals and transformations is how students can respond.

For categoricals, response options are restricted, finite.

You can say ‘yes/no’ this ‘is/isn’t’ an example of X.

You can say that, out of examples A-F, only ‘B, D and E’ are all examples of X.

However you pose the questions, responses will always be restricted to a finite set of options.

For transformations, responses are potentially unrestricted, infinite.

When you ask students to ‘expand this bracket,’ there is an infinity of number/letter/symbol combinations they can theoretically choose from, as in the case where you ask them to expand this:

\(10(4x+5)\)

And they respond with this:

\(104x+50\)

Or this:

\(40x+20\)

This is called the response dimension of the task. For categoricals the response dimension is finite. For transformations it is infinite.
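
For reference, here is the expansion worked through, one term at a time. It is included only to show that neither of the two student responses above matches the correct one, even though both are perfectly possible answers when the response dimension is unrestricted:

\(10(4x+5) = 10 \times 4x + 10 \times 5 = 40x + 50\)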

Thanks for reading Unstoppable Learning! This post is public so feel free to share it.

Teaching Transformations

There are nuanced choices you can make for how to teach these, but here is a simple rule of thumb that will work in 80% or more of cases.

Say the goal was to teach students how to substitute into an expression.

We might start with this:

\(10x-7y+11\)

And say:

“I’m going to show what we mean when we say ‘substitute’ in algebra.

First, I substitute this:

\(x=3\)

Like this:

\(10(3)-7y+11\)

I substitute this:

\(x=38\)

Like this:

\(10(38)-7y+11\)

And I substitute this:

\(y=19a\)

Like this:”

\(10(38)-7(19a)+11\)

Job done.

No analogies about substituting footballers, no long-winded explanation, no need for any of it. Ten to twenty seconds later, every kid in your class can substitute.

A simple sequence of three examples. Sometimes even fewer will do for teaching transformations, just one or two, and rarely are more needed. Unlike with categoricals, three examples are usually enough.

The examples in the sequence above adhere to the setup principle, with irrelevant features being held constant each time (for example, the expression we’re substituting into is held constant for now.)

They adhere to the difference principle by showing minimal variation between examples, and then treating them differently (we go from x = 3 to x = 38, and this changes what we substitute.)

They adhere to the sameness principle by showing a maximal variation and treating it the same as every other example (we move to y = 19a, which both shows substitution for y for the first time, and shows substitution of a letter for another letter, but we do the same thing as before - replace the y with brackets, and write in what it was to the right of its equals sign.)
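
One way to see that all three examples really are the same transformation is to write the rule down as a tiny function. This is only a minimal sketch, not part of the teaching sequence itself: the function name `substitute` and the choice to treat expressions as plain strings are assumptions made for illustration. It does exactly what the paragraph above describes, replacing the letter with brackets containing whatever was to the right of its equals sign.

```python
def substitute(expression: str, letter: str, value: str) -> str:
    """Replace every occurrence of `letter` with `value` wrapped in brackets.

    Hypothetical helper for illustration: a naive text replacement,
    not a real algebra system.
    """
    return expression.replace(letter, f"({value})")


expression = "10x-7y+11"

first = substitute(expression, "x", "3")    # x = 3
second = substitute(expression, "x", "38")  # x = 38 (minimal difference)
third = substitute(second, "y", "19a")      # y = 19a (maximal difference)

print(first)   # 10(3)-7y+11
print(second)  # 10(38)-7y+11
print(third)   # 10(38)-7(19a)+11
```

The same single line of logic handles a small number, a larger number, and a letter-for-letter substitution, which is the point of the sameness principle. Because it is plain text replacement, it would go wrong with multi-letter variable names or if a substituted value contained a letter that is replaced later.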

So what do you do now? Did this really work? How do you know?

We’ll look at that later in the week.

Jack Styles

Jun 4

Yes, your rationale makes perfect sense for why to show those 3 examples: a first trial that is potentially a bit too hard allows you to get to efficient communication much more quickly than starting with examples that are too easy, which are potentially slower to communicate the rule and more likely to induce stipulation (e.g. that you can only substitute for numbers). And certainly it will come down to a judgement of the class too, and you can respond in the moment if you have gone too hard. I think I will change some of my sequences based on this conversation.

I still don't yet believe that the sameness or difference principles apply here or to any transformation as every change in the input results in a change to the output. The difference principle is that when a minimal change results in a change in label then we narrow down what causes a change in the label and by extrapolation can rule out as many possible reasons for the change in label as possible. Whereas the sameness principle is that a great change in input resulting in no change to the label implies a large number of possible cases for which the label still applies by interpolation. But perhaps my understanding of the principles is limited to (categoricals)?

Jun 3

Thanks for this.

To me it seems that the examples you present seem reasonable in showing the transformation rule here (replace the letter with its given value in brackets) and show a good range of variation which would hopefully be sufficient to induce generalisation.

I don't quite understand what you say about the sameness and difference prompts. If changing from x = 3 to x = 31 leads to a difference in the way we treat the example, then changing from x = 31 to y = 6a also leads to a difference in the way we treat the example.

It seems you've deviated from Theory of Instruction in this, which suggests it is preferable to model the first 2 to 5 with minimum difference variations, to include at least 2 test examples as minimum difference variations, and then potentially have maximum difference variations that learners attempt (which forms part of the expansion sequence). My guess is that you've done that to a) reinforce the important idea of minimal and maximal difference prompts, b) make it more consistent with the approach for ‘Categoricals’ and hence make it easier for teachers to remember, and c) reduce the time spent on the modelled examples, hoping that the rule has already been made sufficiently clear so showing the maximal variation in your example is more likely to induce the desired generalisation. Is that correct? Under what conditions would you err more on doing it in the manner prescribed in ToI?

Thanks, very much enjoying the blog.
