I don't think I'd found anything yet that pointed to 2!
Do you mean that for finding a linear sequence you take the 1st common difference and divide by 1!
For quads 2nd common difference divide by 2!
For cubics 3rd common difference divide by 3!
For quartics 4th common difference divide by 4!
For Nth common difference, divide by N!
?
If so then yeah that's great in terms of structure and generalisation.
You have to balance it against the need to teach factorials first, and it adds a lot of additional information that they don't 'need' to know for the exam / in school, which raises questions about time cost and whether you can teach it all in the time available.
Yes to the above, it’s a lovely generalisation to divide by the difference level factorial.
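To make the "divide by the difference level factorial" idea concrete, here is a rough Python sketch. The function names are made up for illustration, and it assumes enough terms are supplied for a difference row to actually become constant.

```python
from fractions import Fraction
from math import factorial


def differences(terms):
    """Return the gaps between consecutive terms."""
    return [b - a for a, b in zip(terms, terms[1:])]


def leading_coefficient(terms):
    """Difference repeatedly until the row is constant.

    Returns (level, coefficient), where level is how many times we
    differenced and coefficient is the common difference divided by level!.
    """
    level = 0
    row = list(terms)
    while len(set(row)) > 1:
        row = differences(row)
        level += 1
    return level, Fraction(row[0], factorial(level))


# Quadratic n^2 + 2n: second difference is 2, and 2 / 2! = 1
print(leading_coefficient([3, 8, 15, 24, 35]))
# Cubic 2n^3: third difference is 12, and 12 / 3! = 2
print(leading_coefficient([2, 16, 54, 128, 250]))
```

For a linear sequence the level is 1, so dividing by 1! changes nothing, which is why the factorial never shows up until cubics.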
In terms of teaching factorials, would you be doing this when it comes to product rule? I’d say you could probably teach it like a fact: multiply together every natural number up to and including that number. In which case it shouldn’t take long to add. Is it essential for all? Definitely not, and those students will likely simply revert to divide by 2.
I think some of the maths challenges also do a little bit with factorials, or it simplifies the maths if they do?
Definitely agree with the time balance element and going too far into non-curriculum content.
I have also been teaching in Australia and was planning to post something very similar to that Alex! Though your atomisation is a bit better than what I'd planned. I agree that there is no point teaching a skill very successfully in isolation if the skill is completely unrelated to all others. At its most extreme, I could teach this one this way: "The nth term of this sequence (...) is 40 - 5n. What is the nth term of this sequence?" Obviously this would be both successful - with sufficient retrieval practice - and utterly pointless as it wouldn't generalise to anything else and would be a useless fact connected to nothing else. On the other hand, I could go far the other way and include a check to see that it's linear, and make it more general - have a table of values in terms of x and y where x goes up in increments other than 1. Or where we're told it's linear, but the x values go up in irregular increments, and some process such as change in y over change in x is required. This would then generalise completely to finding the equation of a straight line given a table of values.
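As a sketch of that more general version (table of values, irregular x increments, change in y over change in x), something like the following would do the check and produce the rule. The function name and the sample table are my own illustration, and it assumes integer values with no repeated x.

```python
from fractions import Fraction


def line_through_table(xs, ys):
    """Return (gradient, intercept) if every step has the same
    change-in-y over change-in-x, otherwise None."""
    points = list(zip(xs, ys))
    gradient = Fraction(ys[1] - ys[0], xs[1] - xs[0])
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if Fraction(y1 - y0, x1 - x0) != gradient:
            return None  # not linear
    intercept = ys[0] - gradient * xs[0]
    return gradient, intercept


# x goes up in irregular steps, but the relationship is still y = 40 - 5x
print(line_through_table([1, 2, 5, 7], [35, 30, 15, 5]))
# Square numbers fail the linearity check
print(line_through_table([1, 2, 3], [1, 4, 9]))
```

When x goes up in ones, the gradient is just the common difference, so the usual nth term routine is the special case.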
Interesting Kris' point about the split attention effect and how a routine is better as one train of thought, I need to think about that more, seems useful.
There are three general principles I can think of when inventing a cognitive routine that need to be weighed up:
1) The more generalisable the routine is, the better. This relates to both the generalisability/usefulness of the atoms that need to be taught as well as the generalisability of the routine itself. Here, I agree that it should be a routine which clearly leads into understanding of linear relationships and graphing straight lines etc (as yours does Alex), as well as finding the nth term for quadratic sequences.
2) The more intuitive the routine is, the better. If steps make sense so that there are mechanisms for students to self check then that is better.
3) The simpler it is for students to learn, and the less time taken to teach and to do, the better.
Clearly, principle 1 is often in conflict with 2 and 3, and the desired generalisability is a function of many things.
For this specific case, I prefer the second approach you outline for finding the nth term of a quadratic sequence Naveen - atoms 9-11 should be a familiar subroutine and could simply be condensed to 'find nth term of resulting linear sequence' which should be secure by the time of teaching this. In the first one, atom 7 is quite a useless fact in terms of generalisation of understanding and would be relatively hard to recall; it's a shortcut to solving simultaneous equations (I also think maybe it shouldn't be divided by 2?). It might be simpler to learn, as principle 3 asks for, but is in conflict with both principles 1 and 2.
In terms of the nth term - I think a link to a linear sequence as in Alex's routine, increasing generalisability, is probably worth the hit to its intuitiveness. But maybe that's not the case given the GCSE syllabus.
Engelmann is clearer than most that what we aim to teach is generalisation. So of course little or nothing useful is learnt if you only learn to memorise the Nth term rule of one specific linear sequence.
I wouldn't say that there is *no* point in teaching something in mathematics in isolation. Sometimes that's unavoidable - there are just no further connections to form within the limits of what we'll be teaching. Other times it's the first step on a longer journey, where connections will be drawn in the future, near or distant (for this I draw inspiration from Willingham's idea of Flexible Knowledge.)
(2) is interesting. I've found that intuition is sometimes a barrier to learning.
For example, if I start kids on a unit on similar shapes and scale factors using SFs of 2 and 3 (which is typical,) whatever else I tell them, they usually intuit the scale factor and ignore anything else I say. Then, with even simple scale factors like 1.5, I've seen top set groups completely fall apart - they can't figure out how to find the scale factor because they can't intuit the relationship between 12 and 8.
I've seen a similar thing happen where a very bright but also very recalcitrant 16 year old refused to listen to anything I said about solving simultaneous equations because he could brute force the simple integer solutions through trial and improvement. Though of course, as soon as the solutions were modified as little as x = 2.5 and y = 3, he could no longer find them (not that that inclined him any further towards wanting to listen to the formal method!)
What I learnt is that intuition doesn't generalise. We created formal methods for a reason, and often they are unintuitive (if they weren't, we wouldn't need them.)
On the other hand, formal methods can seem like magic without meaning if we don't actively do work to help students construct it. Sometimes that's easy to do, other times it's borderline impossible for the level we're working at (e.g. formulae for volumes of curved solids.)
My own conclusion here has been to treat intuition and formal methodology as independent of one another, teach them independently of one another, and then connect them where possible.
For example, for similar shapes I will now start with shapes where it's impossible to intuit the scale factor, e.g. corresponding lengths of 7 and 11.
They learn the rule to 'times by a fraction,' in this case
* 11/7
This idea carries through more or less all proportional reasoning, including unit conversion, ratio, gradient, and trigonometry.
It's easy to learn, easy to apply, hyper-generalised, but it doesn't carry any intuition.
By contrast, the unitary method is very intuitive.
If 7 units are worth this much, how much is 1 unit worth? Now, how much are 11 units worth?
Learn that separately, and then connect the two based on how we 'times by a fraction' - divide by the bottom, times by the top - just what we do step by step for the unitary method. Essentially 'times by a fraction' encodes and formalises the steps of the more intuitive unitary method.
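A small sketch of that connection, using the 7 and 11 from the similar shapes example. The function names and the side length of 21 are invented for illustration; the point is only that the two routes are the same arithmetic in a different order.

```python
from fractions import Fraction


def unitary(value, from_units, to_units):
    """Unitary method: find what 1 unit is worth, then scale up."""
    one_unit = Fraction(value) / from_units  # divide by the bottom
    return one_unit * to_units               # times by the top


def times_by_fraction(value, from_units, to_units):
    """Formal rule: multiply by the fraction to_units / from_units."""
    return Fraction(value) * Fraction(to_units, from_units)


# Corresponding lengths 7 and 11; another side measures 21 on the smaller shape
print(unitary(21, 7, 11))            # 33
print(times_by_fraction(21, 7, 11))  # 33

# The 'hard to intuit' 8 -> 12 case is just times 12/8
print(times_by_fraction(10, 8, 12))  # 15
```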
--
So I don't think your (1) is always in conflict with (2) and (3.) More often than not I find they can be aligned.
The atomisation I offered for linear sequences, for example, generalised to all linear sequences. If you opt for the subtraction method for quadratics then it becomes an atom in the routine for quadratics as well (a subroutine,) generalising further.
The method of substitution for linear sequences, also, does not actually generalise to quadratics; nor does the method for quadratics generalise to cubics.
What it provides is more exposure to and practice of a kind of 'meta-idea' in maths, which is that sometimes we can model a situation by finding the parameters of a general form, and then substituting.
While that idea holds true for both linear and quadratic sequences, *how* you find the parameters is different each time.
Interesting and yeah I take that point about intuitiveness. I learnt from something you'd said at some point or other and had my first gradient examples as fractions and yes, so so much better when integers are special cases, and I'm sure with similar shapes too. I like what you said re example of teaching transformation (multiply by the fractional scale factor) and then later linking it to make sense via the unitary method.
Sometimes I've been gung ho about teaching the what and then the why later and I have found that - despite my framing that I would tell the why later and to trust me and go along with it - students often feel very dissatisfied with it. I did it once with subtracting negatives and had to concede as some students just wouldn't let it go until I explained it and it kind of ruined the lesson. I guess it's just being selective about when it's worth that trade-off. Have you encountered that resistance and do you have any other heuristic of when it's worth telling the why upfront? Feel free to just say it'll become clear with further posts. When students have just accepted how to do it and do it, I also find that I'm not motivated to explain the why to as full of an extent as I would have done so, and kids are often not as interested as they can already do the skill. Is there a danger of them internalizing that maths is not something you're supposed to understand, you just follow the method, with frequent use of what-then-why approach?
Perhaps another way of framing what I'd meant re intuitiveness with a routine was the extent to which one atom or subroutine prompts the next. Maybe the principle should be the greater extent to which each step is prompted by the prior step the better. So in this case I also think Naveen's second quadratic sequence is better for that reason too.
I think the quadratic method does generalise to higher order polynomial sequences too with a slight tweak: To find leading coefficient a_n you find the nth order common difference and divide by n! (There's a link to the successive differentiation of polynomials in generating the factorial terms). Then you just iteratively use that to obtain each successive coefficient. So coefficient of n^3 is third order common difference/3!. I think it can be made more general by writing the nth term as an equation (T_n = -5n + 40) rather than an expression, as that then links to both linear relations and also the sequences and series that they might do at A-level. More teaching required though and harder to get 100% learning it quickly for many classes. Would you do more general approaches with stronger classes? You said 10-15 mins to teach this which sounds far quicker than I could manage doing any approach.
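Here is one way the "iteratively obtain each successive coefficient" step could look, as a sketch rather than a definitive routine. It assumes n starts at 1, that the sequence really is polynomial, and that enough terms are given; the function name is made up. After each leading coefficient is found, that term is subtracted from the sequence and the process repeats on what is left (the subtraction method).

```python
from fractions import Fraction
from math import factorial


def polynomial_coefficients(terms):
    """Return {power: coefficient} for the nth term, with n starting at 1."""
    remaining = [Fraction(t) for t in terms]
    coefficients = {}
    while any(remaining):
        # Difference down until the row is constant
        level, row = 0, remaining
        while len(set(row)) > 1:
            row = [b - a for a, b in zip(row, row[1:])]
            level += 1
        # Leading coefficient = common difference / level!
        coeff = row[0] / factorial(level)
        coefficients[level] = coeff
        # Subtract coeff * n^level from each term, leaving a lower order sequence
        remaining = [t - coeff * n**level
                     for n, t in enumerate(remaining, start=1)]
    return coefficients


# 6, 15, 28, 45, 66 is 2n^2 + 3n + 1
print(polynomial_coefficients([6, 15, 28, 45, 66]))
# 35, 30, 25, 20 is -5n + 40
print(polynomial_coefficients([35, 30, 25, 20]))
```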
"Would you do more general approaches with stronger classes?"
I prefer the same approach for all classes, and the difference comes in terms of:
(1) how many new atoms and chains a higher set can learn in a unit of time (more)
(2) how much practice a higher set needs to remember ideas in the future (less)
So I picture it as the same journey, but higher sets move along it faster.
Seen this way, the 95% Grade 9 (or equivalent) makes sense as something like: higher ability achieve it around age 10-12, and lower ability achieve it around age 16-18.
More general approaches both help to better see the structure of mathematics per se, so learn actual mathematics, and also reduce the amount you have to learn and remember in the future - or at least make it easier to access and retrieve, since it's all connected.
But as with the last comment, this only makes sense in a certain structure.
In the structure that basically all schools currently adopt, I would personally be making decisions depending on context.
Back to the e.g. of bottom set year 9 linear sequences, descending and starting with negative terms, and knowing their negative arithmetic is weak, and I'm just designing 'tomorrow's lesson,' I'm going to make the call to teach them what they need to 'do the thing,' guaranteed, simple as possible to learn and recall in the future, and knowing that they might never get to see all this deeper structure.
In this context, if I were to try and go the other way and focus on generalisability in the 15 minutes I have, odds are they would succeed at and recall nothing (except their prior failure, and knowledge that 'they're no good at maths.)
"You said 10-15 mins to teach this which sounds far quicker than I could manage doing any approach."
The point about time only makes sense once you fully internalise some structural changes.
If you think of a lesson as a 60 minute block in which you have to teach everything on a topic, it doesn't make much sense.
If you think of it as 60 minutes of time to spend on learning new content, and remembering and connecting, and you split that in a ratio of roughly 20:80, then 12 minutes becomes your upper bound for learning 'anything' new.
Not a lot of time.
Unless you then also consider 'anything' as either an atom, or a chain of known atoms, taught through instructional sequences. Then suddenly 12 minutes is a huge amount of time. You can cover many atoms in that time.
Then, since you know you're getting ~48 minutes to practice prior learning tomorrow, and the day after, you don't have to practice that atom as much right now, today.
And then, since this system leads to 100% of kids both learning successfully and remembering everything, you get gargantuan efficiency gains over time. So you can progress more slowly when first learning a topic, in terms of *days* or *lessons* spent on a topic, knowing that you won't need to 'reteach' it in the future.
Am I making any sense?
It's a clear picture in my head, but in words alone it might not make a lot of sense (will also come in future posts / book.)
"I think the quadratic method does generalise to higher order polynomial sequences too with a slight tweak: To find leading coefficient a_n you find the nth order common difference and divide by n! (There's a link to the successive differentiation of polynomials in generating the factorial terms). Then you just iteratively use that to obtain each successive coefficient. So coefficient of n^3 is third order common difference/3!. I think it can be made more general by writing the nth term as an equation (T_n = -5n + 40) rather than an expression, as that then links to both linear relations and also the sequences and series that they might do at A-level."
Okay, I did not realise this. That is *AWESOME*
I think this is starting to get into why designing 'the perfect maths curriculum' is so difficult.
There are priorities that can sometimes compete e.g. 'do the thing' versus 'connect the thing to other bits of maths'
Which might be what you were saying about how point (1) might sometimes be in conflict with (2) and (3).
My preferred sword to take to this Gordian knot is still 'what then why.' So I'll reply to your other questions about that in follow up replies.
Now that I know this, though, I would definitely want to get it in there somewhere. Whether it's just learning this pattern that to get 'a' you divide by the highest power, or explicitly connecting it to differentiation at the right moment.
"Sometimes I've been gung ho about teaching the what and then the why later and I have found that - despite my framing that I would tell the why later and to trust me and go along with it - students often feel very dissatisfied with it."
Three ways to judge this:
1) How much sense does what you're saying currently make?
2) How much later? How long do they have to wait for the why?
3) Why are you teaching the why at all?
(1) if you were to say 'I'm going to show you how to add fractions,' and kids have no concept of 'a fraction'... or 'addition,' then this is all completely meaningless.
My first hypothesis would be that they have very little conception of negative numbers, nor of addition and subtraction in the context of signed or directed numbers. They were probably using a primary school mental model of 'counters' which I then 'add or take away,' which makes no sense in the context of signed / directed numbers.
The why of processes can wait, but knowledge of the concepts they operate on cannot wait.
In the negatives example, they have to have some concept of what a negative number is. That could come from the number line (left of or below zero,) or it could come from zero sum pairs, but they have to have a sense of what we mean by 'a negative number,' otherwise this is all so meaningless.
Then they need to have some concept of addition and subtraction with respect to the chosen model: either pairing up +1 and -1s, or moving right and left along the number line to become 'more positive' or 'more negative.'
A headache might be introduced. e.g. you could choose to start with something that is intuitive e.g. zero sum pairs for small numbers like (-5) + 3 or 'start, direction, distance' for a number line
Then throw in the headache:
(-152) + 50
That's borderline impossible to evaluate with either conceptual model.
Instead you need to know that the sign will be given by the larger magnitude (-) and then the signs are different so we find the difference (152 - 50 = 102)
So the result is (-102)
That's how most of us work these numbers in our head, even if we do it implicitly.
That's the 'what,' and it's satisfying because it's the aspirin to a headache.
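Written out as a sketch, the rule uses only magnitudes and signs. The function name is made up, and treating zero as positive is my own arbitrary choice to keep it short.

```python
def add_signed(a, b):
    """Add two integers using the 'sign of the larger magnitude' rule."""
    def sign(x):
        return -1 if x < 0 else 1  # zero treated as positive (assumption)

    if sign(a) == sign(b):
        # Same sign: add the magnitudes, keep the shared sign
        return sign(a) * (abs(a) + abs(b))
    # Different signs: find the difference, take the sign of the larger magnitude
    larger, smaller = (a, b) if abs(a) >= abs(b) else (b, a)
    return sign(larger) * (abs(larger) - abs(smaller))


print(add_signed(-152, 50))  # -102
print(add_signed(-5, 3))     # -2
print(add_signed(-7, -4))    # -11
```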
And then this can be related back to either model to offer the 'why'
In the case of the number line, if you map the numbers on to each other, the bigger magnitude always determines where you land, left or right of zero.
Then if they're pointing the same way, you're going to be adding (same sign sum,) if they point a different way you're going to be finding a difference (this is helped a lot if they have a concept of 'difference,' and not just a concept of 'subtraction.')
Or for zero sum pairs you can again see that the greater quantity of +1s or -1s is going to dictate whether you end up with only +1s or -1s left over.
Then, if you have different signs you're going to be finding the difference between the quantities of each. If they have the same sign you're going to be 'adding' more counters of the same type
The model helps make sense of what you were doing, but doesn't help you to learn what to do.
To later deal with adding and subtracting negative numbers, that's a simple transformation:
(-152) - (-50)
Becomes:
(-152) + 50
Why?
Number line: when you add or subtract a negative you reverse direction.
Zero sum pairs: when you add -1s you're 'becoming more negative,' equivalent to subtracting positives, and vice versa
Note that there's a kind of 'limit' to all these whys. The models help to make sense of them, but e.g. they don't explain 'why' you reverse direction when adding or subtracting a negative.
But this is a point Feynman made decades ago, and Neumann echoed a similar sentiment - you can ask 'but why?' ad infinitum. So really all 'whys' attempt to do is 'satisfy' us, and to ever be satisfied you must answer the question in a framework where you agree some things to be true. For maths, this can go all the way down to foundational axioms, but few school kids care enough or could really follow that far down.
The point about caring can be answered a bit by the next two points, because most kids can be made to care at least a bit.
(2) How explicit are you being about how much later? If you're just saying 'I'll tell you later,' then that will meet with resistance, because who knows how long I have to wait, or whether you'll keep your promise.
If instead you say '20 minutes from now I'll explain why,' or 'tomorrow,' or 'three days from now,' now we're foreshadowing, which is a story telling technique, and builds suspense and intrigue.
Even, if necessary 'I can't explain why this works today, because it needs lots more mathematics to make sense of - so this is one where you'll just have to take my word for it, but if you go on to study maths two years from now, you'll cover a topic called 'integration,' and in that topic you'll be taught why this works' - any of those bound the waiting period and also allow kids to hold you to account for sharing the why; it's definitely coming.
(3) Sometimes we really, really don't care about the why, or don't need to teach it. There is an almost infinity of whys and connections between ideas in maths. So we don't need to commit to teaching a why every time. If we do that and the why isn't even interesting, then that will probably be low interest to you in teaching it, and low interest for the kids in being shown it.
Personally, as a rule of thumb, I prefer to stick to showing 'why' if I find it really interesting or exciting. That way I can make a big deal about how cool it is what I'm about to show. And then generally also no need to practice or remember these things; it's all just for genuine interest, and all much easier to hold on to if you already understand the nodes, the 'what's,' that this why is connecting.
As another rule of thumb, I usually need the 'why' to be pretty easy to follow. I want to know there's a very high probability of a 'light bulb' moment for most kids.
I was observing one of the tutors on our tuition programme last week, working with a five year old, and she came to a part in our programme where the child had been asked to evaluate both of these, using column addition:
999 + 3
1000 + 2
The second is much easier, because you just 'change the units column from 0 to 2'
But for the first, you have to 'carry' three times.
When the tutor then showed the girl 'look, you got the same number both times' - she gave a most amazing wide-eyed 'Oh yeah!'
Which then led the tutor to explain bridging as an efficient method of addition.
Headache set up.
Aspirin delivered.
Then the 'why' of that more efficient method was simple enough for a five year old to follow.
"Perhaps another way of framing what I'd meant re intuitiveness with a routine was the extent to which one atom or subroutine prompts the next. Maybe the principle should be the greater extent to which each step is prompted by the prior step the better."
I agree with this.
I said it in the original post, I think, but I try as much as possible to start with routines where the next atom logically follows from the previous.
Whenever you have to follow one chain of logic to deduce a value, and then note it for later use, while you go away and process a new chain of logic, these are much harder questions to answer.
It's the typical structure of what we call AO3 questions in the English exam system (ostensibly questions designed to test mathematical reasoning; really they tend to just be simple arithmetic problems set in some kind of 'real world context,' that require these kinds of breaks in logic, like Best Buy questions.)
Re what you said about approach vs class in your first reply - yes that makes total sense. I've seen the same video of the nursery children solving simultaneous equations, but it doesn't seem like the results of Project Follow Through, and of scaling up implementation, put the time frames for the student outcomes you mention within reach - I'm sure you have a rationale so I'm looking forward to finding out more about your vision for that as a realistic scalable possibility.
Re timing, your second reply - I think I have some understanding of what you mean here, though it seems to me that when I have tried this strand approach each time I return to an atom I inevitably end up increasing the difficulty or adding new learning to it. So, for example, I teach how to add or subtract from algebraic expressions vertically, to mirror the layout for solving equations. Then the next time, I do the same but with equations so they have to add or subtract the specified quantity to the expressions on each side. Then the next time they do it, it is interleaved with multiplying expressions by given quantities (already block practised in isolation). It just seems like there are so many little things to practise that I can't not add a little extra each time. Is that not in the intended spirit of the 80% review?
Re what then why, 3rd reply - oo I like the idea of being explicit about when I will explain the why. I feel that would help them to set aside misgivings and hold me accountable too. Hmm I'm not sure your hypothesis is correct, we had already briefly discussed conceptual meanings, used number lines to move to the left and right depending on adding or subtracting (a positive number to/from an integer), developed and practised the rule you specified about same direction, add magnitudes; different direction, find the difference; and then finally (i.e. lessons later) were looking at adding or subtracting negatives from an integer. And my 'why' eventually consisted of one of the typical type explanations, a floating basket with balloons (+ve numbers) and weights (-ve numbers) attached to it. I understand this is hardly rigorous and definitely longer, but I do feel it's much more satisfying than 'trust me, subtracting a negative is the same as adding'. And my perception of the payout in terms of student buy in has been that it's worth it. Incidentally, I don't think the explanation of it makes too much difference to whether students can ultimately do it fluently, which is just way more a function of how much it has been practised, retrieved and used in different contexts.
I basically haven't used a headache aspirin type motivation tool before but would certainly be interested to try it, seems likely it would be an effective tool for many students.
Re one atom prompting the next, 4th reply - ah yes you had made this point.
Thanks for your thorough and thought through responses
Not many things have given me pause about the atomisation process since I've come across it. But I do worry about the lack of explicit teaching about conceptual generalisation at work here.
In this case, what a and b "do" for the sequence. I suppose the idea is that kids generalise themselves from examples? It just seems like a case where having a conceptual picture of what's going on is helpful for internalising the steps (either a literal picture or some mechanical sense of how the rule generates the sequence).
But it seems like maths teaching is like political spin - if you're explaining you're losing?
No I think it's completely fine to explain things using language, I'm just very judicious about when and how I do it.
In this instance my goal was to ensure even the weakest class, with next to no ability to process negative arithmetic, could form the Nth term rule of a descending linear sequence that starts with a negative term.
I have, say, 5-10 minutes to achieve that goal.
To my mind, this is an efficient method, that will work every time, doesn't contradict anything they will learn in the future, and will not confuse anyone. It's so effective that, as I've started working on some linear sequences tasks for a group of tutees recently, this is how I figured out the answers to my own questions.
So in my example I don't even have an 'a' and a 'b,' but it *is* clear that the difference forms the term in N, and the constant is formed by a combination of the term in N and the first term. Whether or not students will consciously process that or not... probably not, given what I've presented so far. No tasks I've suggested here force that kind of cognitive activity.
So, if we want those connections to be made, we have to do something else.
I have one 'why after what' that I've put together (for the afore mentioned tutees.) That is just an explanation of why writing '5 - 3 + 3n' gives the Nth term rule (generalised to any linear sequence.)
I don't think it's my best work, so I want to improve upon that in the future.
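For reference, one standard algebraic route to that result (not necessarily the explanation referred to above) is to start at the first term and add the difference once for every step after it. Writing t_1 for the first term and d for the common difference, both labels being my own:

```latex
\begin{align*}
T_n &= t_1 + (n - 1)d \\
    &= t_1 - d + dn
\end{align*}
% With first term t_1 = 5 and difference d = 3:
% T_n = 5 - 3 + 3n = 2 + 3n
```

So the constant is "first term minus the difference", and the difference becomes the coefficient of n.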
Naveen gives the general model for the sequence, which I think is an excellent way of doing it.
I can't attach photos in comments, but I'll see if I can reply to the Note with a photo.
To try to summarise:
- use of language: yes, could do, but there's a time and a place. More on this coming up in regular posts soon
- explaining 'why' this works: something we try to do as much as possible. I personally favour 'what then why,' and in this instance there was probably only enough time for the what
- generalisation: I think general forms are very powerful, but rather than giving students general forms, which tend to be *very* difficult for them to unpack and make sense of, I usually get *them* to construct the general form as a part of an expansion sequence
e.g.
Generate the first four terms of the sequence given by 2 + 3n
Generate the first four terms of the sequence given by 2 + an
Generate the first four terms of the sequence given by b + an
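For illustration, and assuming n starts at 1 as is conventional in England, the three prompts would produce something like:

```latex
\begin{align*}
2 + 3n &: \quad 5,\; 8,\; 11,\; 14 \\
2 + an &: \quad 2 + a,\; 2 + 2a,\; 2 + 3a,\; 2 + 4a \\
b + an &: \quad b + a,\; b + 2a,\; b + 3a,\; b + 4a
\end{align*}
```

By the last line the gap between consecutive terms is visibly a each time, and b only shifts where the sequence starts.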
A lot of the difficulties here seem downstream of the labelling. Maybe this is just an annoying UK convention that you're stuck with, but here in Australia I'd:
1. label the first point as the Zeroth term (maybe a fact, or maybe the act of labelling is a transformation?).
2. Introduce the parameterisation T_n=a+bn (Fact)
3. Finding the difference (Routine)
4. Call it b (transformation)
5. a= the Zeroth term, starting point, etc. (transformation)
6. T_n = 40 - 5n
(This way of doing things also has the advantage of being a closer analogue to linear equations where a=the y intercept, and x=0 is a more natural "starting point")
I'm also interested in why you don't state the rule as an equation:
Nth term = rule
Both of you left the rule as an isolated expression. Is there a deep rationale for that? Or is it just one less thing to explain?
"1. label the first point as the Zeroth term (maybe a fact, or maybe the act of labelling is a transformation?)."
Something in this is brilliant, imho.
I have never before considered labelling the first term as the 0th term 🤔 It gave me a lot to think about.
Where I ended up:
Given this sequence: 5, 7, 9, 11
We would normally say the Nth term rule is 2n + 3
If we label 5 the 0th term, then the rule becomes 2n + 5
Which is much simpler - it has a much closer relationship to what we observe in the sequence at the surface level. (Even more so if you adopt Naveen's rearrangement of 5 + 2n, which I've started doing.)
But, it only follows *if* you consistently label the 'first term' with n = 0.
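A quick sketch of how the choice of index changes the constant but nothing else. The function name and the first_index parameter are invented for illustration.

```python
def linear_rule(terms, first_index=1):
    """Return (coefficient of n, constant) for a linear sequence,
    where the first listed term is labelled n = first_index."""
    difference = terms[1] - terms[0]
    if any(b - a != difference for a, b in zip(terms, terms[1:])):
        raise ValueError("not a linear sequence")
    constant = terms[0] - difference * first_index
    return difference, constant


print(linear_rule([5, 7, 9, 11]))                 # (2, 3) -> 2n + 3
print(linear_rule([5, 7, 9, 11], first_index=0))  # (2, 5) -> 2n + 5
```

With first_index=0 the constant is simply the first number you can see, which is the appeal of the zeroth-term labelling.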
There are parts of mathematics where we do this by convention e.g. Pascal's triangle.
I also really like that it more closely models the general form of a linear equation, as you say - I think that's very powerful.
But for whatever reason, we do not do this as standard for linear and quadratic sequences - certainly not in England.
If a student wrote that the Nth term rule was 2n + 5 they would get 1 out of 2 marks, and the + 5 would be considered a common misconception.
If a student explicitly showed that they were defining their indexing to start with n = 0, on the other hand, I don't know whether they would pick up all the marks in the exam or not. Everything I've found defines the first term as n=1 by convention.
So it's a great idea that might lead to a lot of confusion for novice learners. I don't think I would advocate for it for that reason - it eventually ends up with you having to state lots of caveats, exceptions, conventions, expectations, in order to fully make sense of it all, to make sure they pick up all the marks on the exam, and to make sure they're not confused by what they'll read elsewhere.
I would add that Naveen often asks students to 'find the 0th term' as a part of her process. So given this:
1 2 3 4
5, 7, 9, 11
She asks them to write this:
0 1 2 3 4
3, 5, 7, 9, 11
For my part, as a part of a broader *conceptual atomisation* I might include something like this, but as part of a *routine atomisation* to get from that initial prompt to the correct Nth term rule, I'm not convinced it's 'unstoppable;' I think given a more difficult sequence a lot of students will struggle to figure out 'what would the 0th term be.'
So helpful in a broader sense to have been introduced to that idea and seen the relationship between it and the final Nth term rule, but not the more straightforward way to construct the rule.
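As a rough illustration of the 'find the 0th term' idea above, here is a minimal Python sketch. The function name is made up for this example, and it assumes the sequence is listed starting from n = 1.

```python
def linear_rule_from_zeroth_term(terms):
    """Return (zeroth_term, difference) for a linear sequence listed from n = 1.

    Hypothetical helper for illustration only.
    """
    difference = terms[1] - terms[0]
    # Check that the sequence really is linear (constant first difference).
    if any(b - a != difference for a, b in zip(terms, terms[1:])):
        raise ValueError("sequence is not linear")
    # Step back once from the first term to get the term at n = 0.
    zeroth_term = terms[0] - difference
    return zeroth_term, difference


zeroth, d = linear_rule_from_zeroth_term([5, 7, 9, 11])
print(f"{zeroth} + {d}n")  # 3 + 2n
```

The rule is then read off as (0th term) + (difference) * n, which matches the table with 3 sitting above the 5.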
"I'm also interested in why you don't state the rule as an equation..."
It's not the norm here, but it's my preference.
There are several ways of doing that. Something like T_n is probably most common, but I think I would prefer to define it as a function T(n).
However, to do that, you need to define functions and function notation first, and these are generally very, very poorly taught topics, and poorly understood by students (and by many teachers - it took me a long time to feel like I had fully gotten my head around them - and I was coming to maths teaching with a Masters in Physics. Many maths specialists in England don't come from a STEM background.)
So I (a) have to assume that a bottom set class will find that even more difficult and (b) there isn't time to teach them that, ad hoc, *and* teach them to find Nth term rules that involve processing negatives, which they are also not strong with.
If I were giving this a more thorough treatment and assumed I had a lot more time, and was planning over years, then I probably would choose to define the rule as a function, but even if I chose to do that it would still be against the grain of what's typical in England (though I am quite happy to go against the grain of what's typical; what's typical isn't working for most kids.)
I suppose a final note on that is that expressions are valid ways of defining functions. There are 8 different ways of defining a function, and isolated expressions are one of them.
This is lovely! Using Naveen's approach, the zeroth term appears as the value of a, so it could be pointed out afterwards. But the calculations etc required are identical either way for linear sequences. However, for quadratic sequences, things are different: finding the zeroth term first (which means working out the zeroth first difference and then the zeroth term - that's a non-trivial routine) makes things a bit easier in some ways: you can then read off the nth term as (zeroth term) + (zeroth first difference) * n + (zeroth second difference) * n(n-1)/2. But it's still somewhat complicated, and if the examiners expect a simplified form, there's now more algebra to do. It does extend nicely to cubics, though: you just add an extra term of (zeroth third difference) * n(n-1)(n-2)/3! .
Actually, writing this makes me realise that we can just write down a formula for the nth term of a quadratic sequence where our sequence starts counting from the first term: just replace n by n-1 in the above formula (so the nth term becomes the (n-1)th term in a reindexed sequence that starts at the zeroth term): the resulting formula is:
(first term) + (first first difference) * (n-1) + (first second difference) * (n-1)(n-2)/2 (where the final /2 is actually /2!),
and this equally extends to cubics, etc. If students' algebra is competent, then there is only one - slightly more complicated - fact to be memorised, and the chain becomes (in compressed form):
1. Calculate the first few differences
2. Calculate the first few second differences
3. Write down the above formula (presumably with letters replacing "first first difference" etc)
4. Substitute in values
5. Expand and simplify the resulting expression
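The five steps above can be sketched in Python as follows. This is only an illustration: the function name and the letters a, b, c (first term, first first difference, first second difference) are my own labels, and the expansion in step 5 has been done by hand in the comments.

```python
from fractions import Fraction


def quadratic_nth_term(terms):
    """Return (p, q, r) so that the nth term is p*n^2 + q*n + r, n starting at 1.

    Hypothetical helper; assumes the sequence is genuinely quadratic (or linear)
    and that at least three terms are given.
    """
    # Steps 1 and 2: first and second differences.
    first_diffs = [y - x for x, y in zip(terms, terms[1:])]
    second_diffs = [y - x for x, y in zip(first_diffs, first_diffs[1:])]
    if len(set(second_diffs)) != 1:
        raise ValueError("second differences are not constant")

    # Steps 3 and 4: a + b(n-1) + c(n-1)(n-2)/2 with the values substituted.
    a, b, c = terms[0], first_diffs[0], second_diffs[0]

    # Step 5: expanding gives (c/2)n^2 + (b - 3c/2)n + (a - b + c).
    p = Fraction(c, 2)
    q = b - Fraction(3 * c, 2)
    r = Fraction(a - b + c)
    return p, q, r


# 3, 8, 15, 24 has a = 3, b = 5, c = 2, giving coefficients 1, 2, 0 (n^2 + 2n).
print(quadratic_nth_term([3, 8, 15, 24]))
```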
In response to an earlier comment, yes, this is intimately connected to calculus, and is sometimes called "finite calculus". The reason the above works is that (using zero-based indexing) the first difference of n(n-1)...(n-(r-1))/r! is n(n-1)...(n-(r-2))/(r-1)!, the second difference is n(n-1)...(n-(r-3))/(r-2)!, etc. The values of these expressions at n=0 are all zero, except for the rth difference, which is 1. So we can get the coefficient of n(n-1)...(n-(r-1))/r! by taking the initial rth difference. (It's worth calculating the sequences given by some of these expressions and their first, second,... differences to get a sense of what's going on.) This is the discrete analogue of the derivative of x^n/n! being x^(n-1)/(n-1)!, so we can find a polynomial f by calculating the repeated derivatives at zero: f(x) = f(0) + f'(0)x + f''(0)x^2/2! +... and this is the Maclaurin series! (How and whether it works when f is not a polynomial is far more subtle, and certainly not for A level!)
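For anyone who wants to try the suggested calculation, here is a small Python sketch for r = 3. It relies on the fact that n(n-1)(n-2)/3! is the binomial coefficient "n choose 3", which `math.comb` computes; the helper name is just for this example.

```python
from math import comb


def forward_difference(seq):
    """Differences between consecutive terms of a sequence."""
    return [y - x for x, y in zip(seq, seq[1:])]


r = 3
basis = [comb(n, r) for n in range(8)]  # n(n-1)(n-2)/3! for n = 0..7
first = forward_difference(basis)
second = forward_difference(first)
third = forward_difference(second)

print(basis)   # [0, 0, 0, 1, 4, 10, 20, 35]
print(first)   # [0, 0, 1, 3, 6, 10, 15]  which is n(n-1)/2!
print(second)  # [0, 1, 2, 3, 4, 5]       which is n/1!
print(third)   # [1, 1, 1, 1, 1]
```

The initial entries of the first three lists are all 0 and the initial third difference is 1, which is the property the argument relies on.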
And it also suggests a "best" formula for the linear case: a + b(n-1), where a = first term, b = (first) first difference, as this part remains unchanged when you go on to quadratic sequences. But the obvious downside of this is that there is more algebraic fluency required.
And it dawns on me now that many GCSE questions don't specify the form of expression required for the nth term, so the "expand and simplify the resulting expression" step is not even needed in those cases!
We should always teach forward facing methods. Teaching approaches that transfer across to complex questions rather than rules and tricks that fall apart later. Reforming long-term memory connections is more onerous than learning correctly first time.
Couldn't agree more with this. It's a key part of Engelmann's Theory of Instruction, as well. In his 'Rubric for Identifying Authentic DI Programmes,' it's axiom 1n:
"1 n. A rule or information must not be contradicted by what will be introduced later."
Although I think that axiom allows us a little leeway.
For example, my final routine does not directly help you with finding a quadratic Nth term rule, assuming you use the substitution method (it can actually assist with the subtraction method,) but nor does it *contradict* anything in there.
So I still view it as 'allowable.'
I might write a little more on why I'm still very in favour of my atomisation in response to comments below.
Is there a reason why when you teach finding a you don’t teach them to divide by 2 factorial?
By doing so this would be truly forward facing, although I appreciate not many people go on to find cubics using this approach.
Product rule for counting is in the spec for GCSE. But yes also goes well for early introduction into ideas which go into A Level.
I don't think I'd found anything yet that pointed to 2!
Do you mean that for finding a linear sequence you take the 1st common difference and divide by 1!
For quads 2nd common difference divide by 2!
For cubics 3rd common difference divide by 3!
For quartics 4th common difference divide by 4!
For Nth common difference, divide by N!
?
If so then yeah that's great in terms of structure and generalisation.
You have to balance it against the need to teach factorials first, and it adds a lot of additional information that they don't 'need' to know for the exam / in school, which raises questions about time cost and whether you can teach it all in the time available.
Yes to the above, it’s a lovely generalisation to divide by the difference level factorial.
In terms of teaching factorials, would you be doing this when it comes to product rule? I’d say you could probably teach it like a fact. Multiply together every natural number up to and including that number. In which case it shouldn’t take long to add. Is it essential for all? Definitely not and for those they will likely simply revert to divide by 2.
I think some of the maths challenges also do a little bit with factorials, or it simplifies the maths if they do?
Definitely agree with the time balance element and going too far into non-curriculum content.
Oh are you picturing A Level students? I was imagining up to GCSE
I have also been teaching in Australia and was planning to post something very similar to that Alex! Though your atomisation is a bit better than what I'd planned. I agree that there is no point teaching a skill very successfully in isolation if the skill is completely unrelated to all others. At its most extreme, I could teach this one this way: "The nth term of this sequence (...) is 40 - 5n. What is the nth term of this sequence?" Obviously this would be both successful - with sufficient retrieval practice - and utterly pointless as it wouldn't generalise to anything else and would be a useless fact connected to nothing else. On the other hand, I could go far the other way and include a check to see that it's linear, and make it more general - have a table of values in terms of x and y where x goes up in increments other than 1. Or where we're told it's linear, but the x values go up in irregular increments, and some process such as change in y over change in x, is required. This would then generalise completely to finding the equation of a straight line given a table of values.
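To make the 'change in y over change in x' version concrete, here is a minimal Python sketch with a linearity check. The function name and the table of values are invented for the example (the values are taken from 40 - 5x at irregular x values).

```python
from fractions import Fraction


def line_through_table(xs, ys):
    """Return (gradient, intercept) for a table of values lying on a straight line.

    Hypothetical helper; works when x goes up in irregular increments.
    """
    gradient = Fraction(ys[1] - ys[0], xs[1] - xs[0])
    # Linearity check: every consecutive pair must give the same gradient.
    for x0, y0, x1, y1 in zip(xs, ys, xs[1:], ys[1:]):
        if Fraction(y1 - y0, x1 - x0) != gradient:
            raise ValueError("table is not linear")
    # Use the first point to find the constant: y = gradient * x + intercept.
    intercept = ys[0] - gradient * xs[0]
    return gradient, intercept


# x = 1, 2, 5, 9 with y = 35, 30, 15, -5 gives gradient -5 and intercept 40.
print(line_through_table([1, 2, 5, 9], [35, 30, 15, -5]))
```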
Interesting Kris' point about the split attention effect and how a routine is better as one train of thought, I need to think about that more, seems useful.
There are three general principles I can think of when inventing a cognitive routine that need to be weighed up:
1) The more generalisable the routine is, the better. This relates to both the generalisability/usefulness of the atoms that need to be taught as well as the generalisability of the routine itself. Here, I agree that it should be a routine which clearly leads into understanding of linear relationships and graphing straight lines etc (as yours does Alex), as well as finding the nth term for quadratic sequences.
2) The more intuitive the routine is, the better. If steps make sense so that there are mechanisms for students to self check then that is better.
3) The simpler it is for students to learn, and the less time taken to teach and to do, the better.
Clearly, principle 1 is often in conflict with 2 and 3, and the desired generalisability is a function of many things.
For this specific case, I prefer the second approach you outline for finding the nth term of a quadratic sequence Naveen - atoms 9-11 should be a familiar subroutine and could simply be condensed to 'find nth term of resulting linear sequence' which should be secure by the time teaching this. In the first one, atom 7 is quite a useless fact in terms of generalisation of understanding and would be relatively hard to recall, it's a shortcut to solving simultaneous equations (I also think maybe it shouldn't be divided by 2?) It might be simpler to learn, as principle 3 asks for, but is in conflict with both principle 1 and 2.
In terms of the nth term - I think a link to a linear sequence as in Alex's routine, increasing generalisability, is probably worth the hit to its intuitiveness. But maybe that's not the case given the GCSE syllabus.
Engelmann is clearer than most that what we aim to teach is generalisation. So of course little or nothing useful is learnt if you only learn to memorise the Nth term rule of one specific linear sequence.
I wouldn't say that there is *no* point in teaching something in mathematics in isolation. Sometimes that's unavoidable - there are just no further connections to form within the limits of what we'll be teaching. Other times it's the first step on a longer journey, where connections will be drawn in the future, near or distant (for this I draw inspiration from Willingham's idea of Flexible Knowledge.)
https://www.aft.org/ae/winter2002/willingham
(2) is interesting. I've found that intuition is sometimes a barrier to learning.
For example, if I start kids on a unit on similar shapes and scale factors using SFs of 2 and 3 (which is typical,) whatever else I tell them, they usually intuit the scale factor and ignore anything else I say. Then, with even simple scale factors like 1.5, I've seen top set groups completely fall apart - they can't figure out how to find the scale factor because they can't intuit the relationship between 12 and 8.
I've seen a similar thing happen where a very bright but also very recalcitrant 16 year old refused to listen to anything I said about solving simultaneous equations because he could brute force the simple integer solutions through trial and improvement. Though of course, as soon as the solutions were modified as little as to x = 2.5 and y = 3, he could no longer find them (not that that inclined him any further towards wanting to listen to the formal method!)
What I learnt is that intuition doesn't generalise. We created formal methods for a reason, and often they are unintuitive (if they weren't, we wouldn't need them.)
On the other hand, formal methods can seem like magic without meaning if we don't actively do work to help students construct it. Sometimes that's easy to do, other times it's borderline impossible for the level we're working at (e.g. formulae for volumes of curved solids.)
My own conclusion here has been to treat intuition and formal methodology as independent of one another, teach them independently of one another, and then connect them where possible.
For example, for similar shapes I will now start with shapes where it's impossible to intuit the scale factor, e.g. corresponding lengths of 7 and 11.
They learn the rule to 'times by a fraction,' in this case
* 11/7
This idea carries through more or less all proportional reasoning, including unit conversion, ratio, gradient, and trigonometry.
It's easy to learn, easy to apply, hyper-generalised, but it doesn't carry any intuition.
By contrast, the unitary method is very intuitive.
If 7 units are worth this much, how much is 1 unit worth? Now, how much are 11 units worth?
Learn that separately, and then connect the two based on how we 'times by a fraction' - divide by the bottom, times by the top - just what we do step by step for the unitary method. Essentially 'times by a fraction' encodes and formalises the steps of the more intuitive unitary method.
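A small Python sketch of that connection, using the 7 and 11 example. The function names and the side length of 21 are made up for illustration; exact fractions are used so nothing is lost to rounding.

```python
from fractions import Fraction


def times_by_fraction(length, from_length, to_length):
    """Formal rule: multiply by the fraction to_length / from_length."""
    return length * Fraction(to_length, from_length)


def unitary_method(length, from_length, to_length):
    """Intuitive route: find what 1 unit is worth, then scale up."""
    one_unit = Fraction(length, from_length)  # divide by the bottom
    return one_unit * to_length               # times by the top


# Corresponding lengths 7 and 11; another side of the first shape is 21.
print(times_by_fraction(21, 7, 11))  # 33
print(unitary_method(21, 7, 11))     # 33
```

Both routes carry out the same two operations, which is the sense in which 'times by a fraction' encodes the unitary method.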
--
So I don't think your (1) is always in conflict with (2) and (3.) More often than not I find they can be aligned.
The atomisation I offered for linear sequences, for example, generalised to all linear sequences. If you opt for the subtraction method for quadratics then it becomes an atom in the routine for quadratics as well (a subroutine,) generalising further.
The method of substitution for linear sequences, also, does not actually generalise to quadratics; nor does the method for quadratics generalise to cubics.
What it provides is more exposure to and practice of a kind of 'meta-idea' in maths, which is that sometimes we can model a situation by finding the parameters of a general form, and then substituting.
While that idea holds true for both linear and quadratic sequences, *how* you find the parameters is different each time.
Interesting and yeah I take that point about intuitiveness. I learnt from something you'd said at some point or other and had my first gradient examples as fractions and yes, so so much better when integers are special cases, and I'm sure with similar shapes too. I like what you said re example of teaching transformation (multiply by the fractional scale factor) and then later linking it to make sense via the unitary method.
Sometimes I've been gung ho about teaching the what and then the why later and I have found that - despite my framing that I would tell the why later and to trust me and go along with it - students often feel very dissatisfied with it. I did it once with subtracting negatives and had to concede as some students just wouldn't let it go until I explained it and it kind of ruined the lesson. I guess it's just being selective about when it's worth that trade-off. Have you encountered that resistance and do you have any other heuristic of when it's worth telling the why upfront? Feel free to just say it'll become clear with further posts. When students have just accepted how to do it and do it, I also find that I'm not motivated to explain the why to as full of an extent as I would have done so, and kids are often not as interested as they can already do the skill. Is there a danger of them internalizing that maths is not something you're supposed to understand, you just follow the method, with frequent use of what-then-why approach?
Perhaps another way of framing what I'd meant re intuitiveness with a routine was the extent to which one atom or subroutine prompts the next. Maybe the principle should be the greater extent to which each step is prompted by the prior step the better. So in this case I also think Naveen's second quadratic sequence is better for that reason too.
I think the quadratic method does generalise to higher order polynomial sequences too with a slight tweak: To find leading coefficient a_n you find the nth order common difference and divide by n! (There's a link to the successive differentiation of polynomials in generating the factorial terms). Then you just iteratively use that to obtain each successive coefficient. So coefficient of n^3 is third order common difference/3!. I think it can be made more general by writing the nth term as an equation (T_n = -5n + 40) rather than expression, as it then links to both linear relations and also sequences and series that they might do at A-level. More teaching required though and harder to get 100% learning it quickly for many classes. Would you do more general approaches with stronger classes? You said 10-15 mins to teach this which sounds far quicker than I could manage doing any approach.
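Here is one way the iterative version could be sketched in Python. The names are my own, and it assumes the terms really do come from a polynomial, listed from n = 1, with enough terms to see the common difference repeat at least once.

```python
from fractions import Fraction
from math import factorial


def differences(seq):
    """Differences between consecutive terms."""
    return [y - x for x, y in zip(seq, seq[1:])]


def common_difference(seq):
    """Return (level, value): how many times we difference before it is constant."""
    level = 0
    while len(set(seq)) > 1:
        seq = differences(seq)
        level += 1
    if len(seq) < 2:
        raise ValueError("not enough terms to confirm a common difference")
    return level, seq[0]


def polynomial_nth_term(terms):
    """Return {power: coefficient} for the nth term rule, n starting at 1."""
    residual = [Fraction(t) for t in terms]
    coefficients = {}
    while any(residual):
        # Leading coefficient = (common difference at this level) / level!
        power, diff = common_difference(residual)
        coeff = diff / factorial(power)
        coefficients[power] = coeff
        # Subtract the leading term and repeat on what is left.
        residual = [t - coeff * n**power for n, t in enumerate(residual, start=1)]
    return coefficients


# 5, 18, 55, 128, 249 comes out as 2n^3 - n + 4.
print(polynomial_nth_term([5, 18, 55, 128, 249]))
```

For 35, 30, 25, 20 the same function gives a coefficient of -5 for n and a constant of 40, i.e. T_n = -5n + 40.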
"Would you do more general approaches with stronger classes?"
I prefer the same approach for all classes, and the difference comes in terms of:
(1) how many new atoms and chains a higher set can learn in a unit of time (more)
(2) how much practice a higher set needs to remember ideas in the future (less)
So I picture it as the same journey, but higher sets move along it faster.
Seen this way, the 95% Grade 9 (or equivalent) makes sense as something like: higher ability achieve it around age 10-12, and lower ability achieve it around age 16-18.
More general approaches both help to better see the structure of mathematics per se, so learn actual mathematics, and also reduce the amount you have to learn and remember in the future - or at least make it easier to access and retrieve, since it's all connected.
But as with the last comment, this only makes sense in a certain structure.
In the structure that basically all schools currently adopt, I would personally be making decisions depending on context.
Back to the e.g. of bottom set year 9 linear sequences, descending and starting with negative terms, and knowing their negative arithmetic is weak, and I'm just designing 'tomorrow's lesson,' I'm going to make the call to teach them what they need to 'do the thing,' guaranteed, simple as possible to learn and recall in the future, and knowing that they might never get to see all this deeper structure.
In this context, if I were to try and go the other way and focus on generalisability in the 15 minutes I have, odds are they would succeed at and recall nothing (except their prior failure, and knowledge that 'they're no good at maths.)
"You said 10-15 mins to teach this which sounds far quicker than I could manage doing any approach."
The point about time only makes sense once you fully internalise some structural changes.
If you think of a lesson as a 60 minute block in which you have to teach everything on a topic, it doesn't make much sense.
If you think of it as 60 minutes of time to spend on learning new content, and remembering and connecting, and you split that in a ratio of roughly 20:80, then 12 minutes becomes your upper bound for learning 'anything' new.
Not a lot of time.
Unless you then also consider 'anything' as either an atom, or a chain of known atoms, taught through instructional sequences. Then suddenly 12 minutes is a huge amount of time. You can cover many atoms in that time.
Then, since you know you're getting ~48 minutes to practice prior learning tomorrow, and the day after, you don't have to practice that atom as much right now, today.
And then, since this system leads to 100% of kids both learning successfully and remembering everything, you get gargantuan efficiency gains over time. So you can progress more slowly when first learning a topic, in terms of *days* or *lessons* spent on a topic, knowing that you won't need to 'reteach' it in the future.
Am I making any sense?
It's a clear picture in my head, but in words alone it might not make a lot of sense (will also come in future posts / book.)
🤯
"I think the quadratic method does generalise to higher order polynomial sequences too with a slight tweak: To find leading coefficient a_n you find the nth order common difference and divide by n! (There's a link to the successive differentiation of polynomials in generating the factorial terms). Then you just iteratively use that to obtain each successive coefficient. So coefficient of n^3 is third order common difference/3!. I think can be made more general by writing the nth term as an equation (T_n =-5n + 40) rather than expression, as then links to both linear relations and also sequences and series that they'll might do at A-level."
Okay, I did not realise this. That is *AWESOME*
I think this is starting to get into why designing 'the perfect maths curriculum' is so difficult.
There are priorities that can sometimes compete e.g. 'do the thing' versus 'connect the thing to other bits of maths'
Which might be what you were saying about how point (1) might sometimes be in conflict with (2) and (3).
My preferred sword to take to this Gordian knot is still 'what then why.' So I'll reply to your other questions about that in follow up replies.
Now that I know this, though, I would definitely want to get it in there somewhere. Whether it's just learning this pattern that to get 'a' you divide by the factorial of the highest power, or explicitly connecting it to differentiation at the right moment.
"Sometimes I've been gung ho about teaching the what and then the why later and I have found that - despite my framing that I would tell the why later and to trust me and go along with it - students often feel very dissatisfied with it."
Three ways to judge this:
1) How much sense does what you're saying currently make?
2) How much later? How long do they have to wait for the why?
3) Why are you teaching the why at all?
(1) if you were to say 'I'm going to show you how to add fractions,' and kids have no concept of 'a fraction'... or 'addition,' then this is all completely meaningless.
My first hypothesis would be that they have very little conception of negative numbers, nor of addition and subtraction in the context of signed or directed numbers. They were probably using a primary school mental model of 'counters' which I then 'add or take away,' which makes no sense in the context of signed / directed numbers.
The why of processes can wait, but knowledge of the concepts they operate on cannot wait.
In the negatives example, they have to have some concept of what a negative number is. That could come from the number line (left of or below zero,) or it could come from zero sum pairs, but they have to have a sense of what we mean by 'a negative number,' otherwise this is all so meaningless.
Then they need to have some concept of addition and subtraction with respect to the chosen model: either pairing up +1 and -1s, or moving right and left along the number line to become 'more positive' or 'more negative.'
A headache might be introduced. e.g. you could choose to start with something that is intuitive e.g. zero sum pairs for small numbers like (-5) + 3 or 'start, direction, distance' for a number line
Then throw in the headache:
(-152) + 50
That's borderline impossible to evaluate with either conceptual model.
Instead you need to know that the sign will be given by the larger magnitude (-) and then the signs are different so we find the difference (152 - 50 = 102)
So the result is (-102)
That's how most of us work these numbers in our head, even if we do it implicitly.
That's the 'what,' and it's satisfying because it's the aspirin to a headache.
And then this can be related back to either model to offer the 'why'
In the case of the number line, if you map the numbers on to each other, the bigger magnitude always determines where you land, left or right of zero.
Then if they're pointing the same way, you're going to be adding (same sign sum,) if they point a different way you're going to be finding a difference (this is helped a lot if they have a concept of 'difference,' and not just a concept of 'subtraction.')
Or for zero sum pairs you can again see that the greater quantity of +1s or -1s is going to dictate whether you end up with only +1s or -1s left over.
Then, if you have different signs you're going to be finding the difference between the quantities of each. If they have the same sign you're going to be 'adding' more counters of the same type
The model helps make sense of what you were doing, but doesn't help you to learn what to do.
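The 'what' described above (the larger magnitude gives the sign; same signs add, different signs find the difference) can be written out as a short Python sketch. The function name is made up, and it deliberately avoids using Python's own signed addition to get the answer.

```python
def add_signed(a, b):
    """Add two signed integers using the sign-and-magnitude rule."""
    # The number with the larger magnitude decides the sign of the result.
    larger, smaller = (a, b) if abs(a) >= abs(b) else (b, a)
    sign = -1 if larger < 0 else 1

    same_sign = (a < 0) == (b < 0)
    if same_sign:
        magnitude = abs(a) + abs(b)             # same sign: sum
    else:
        magnitude = abs(larger) - abs(smaller)  # different signs: difference
    return sign * magnitude


print(add_signed(-152, 50))  # -102
print(add_signed(-5, 3))     # -2
```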
To later deal with adding and subtracting negative numbers, that's a simple transformation:
(-152) - (-50)
Becomes:
(-152) + 50
Why?
Number line: when you add or subtract a negative you reverse direction.
Zero sum pairs: when you add -1s you're 'becoming more negative,' equivalent to subtracting positives, and vice versa
Note that there's a kind of 'limit' to all these whys. The models help to make sense of them, but e.g. they don't explain 'why' you reverse direction when adding or subtracting a negative.
But this is a point Feynman made decades ago, and Neumann echoed a similar sentiment - you can ask 'but why?' ad infinitum. So really all 'whys' attempt to do is 'satisfy' us, and to ever be satisfied you must answer the question in a framework where you agree some things to be true. For maths, this can go all the way down to foundational axioms, but few school kids care enough or could really follow that far down.
The point about caring can be answered a bit by the next two points, because most kids can be made to care at least a bit.
(2) How explicit are you being about how much later? If you're just saying 'I'll tell you later,' then that will meet with resistance, because who knows how long I have to wait, or whether you'll keep your promise.
If instead you say '20 minutes from now I'll explain why,' or 'tomorrow,' or 'three days from now,' now we're foreshadowing, which is a story telling technique, and builds suspense and intrigue.
Even, if necessary 'I can't explain why this works today, because it needs lots more mathematics to make sense of - so this is one where you'll just have to take my word for it, but if you go on to study maths two years from now, you'll cover a topic called 'integration,' and in that topic you'll be taught why this works' - any of those bound the waiting period and also allow kids to hold you to account for sharing the why; it's definitely coming.
(3) Sometimes we really, really don't care about the why, or don't need to teach it. There is an almost infinity of whys and connections between ideas in maths. So we don't need to commit to teaching a why every time. If we do that and the why isn't even interesting, then that will probably be low interest to you in teaching it, and low interest for the kids in being shown it.
Personally, as a rule of thumb, I prefer to stick to showing 'why' if I find it really interesting or exciting. That way I can make a big deal about how cool it is what I'm about to show. And then generally also no need to practice or remember these things; it's all just for genuine interest, and all much easier to hold on to if you already understand the nodes, the 'what's,' that this why is connecting.
As another rule of thumb, I usually need the 'why' to be pretty easy to follow. I want to know there's a very high probability of a 'light bulb' moment for most kids.
I was observing one of the tutors on our tuition programme last week, working with a five year old, and she came to a part in our programme where the child had been asked to evaluate both of these, using column addition:
999 + 3
1000 + 2
The second is much easier, because you just 'change the units column from 0 to 2'
But for the first, you have to 'carry' three times.
When the tutor then showed the girl 'look, you got the same number both times' - she gave a most amazing wide-eyed 'Oh yeah!'
Which then led the tutor to explain bridging as an efficient method of addition.
Headache set up.
Aspirin delivered.
Then the 'why' of that more efficient method was simple enough for a five year old to follow.
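To see the 'headache' in numbers, here is a small Python sketch that counts the carries column addition needs for each calculation. The function name is hypothetical and it assumes non-negative whole numbers.

```python
def count_carries(a, b):
    """Count how many carries column addition of a + b requires."""
    carries = 0
    carry = 0
    while a > 0 or b > 0:
        column = a % 10 + b % 10 + carry  # add the current column
        carry = column // 10              # 1 if this column overflows, else 0
        carries += carry
        a //= 10
        b //= 10
    return carries


print(count_carries(999, 3))   # 3
print(count_carries(1000, 2))  # 0
print(999 + 3 == 1000 + 2)     # True
```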
"Perhaps another way of framing what I'd meant re intuitiveness with a routine was the extent to which one atom or subroutine prompts the next. Maybe the principle should be the greater extent to which each step is prompted by the prior step the better."
I agree with this.
I said it in the original post, I think, but I try as much as possible to start with routines where the next atom logically follows from the previous.
Whenever you have to follow one chain of logic to deduce a value, and then note it for later use, while you go away and process a new chain of logic, these are much harder questions to answer.
It's the typical structure of what we call AO3 questions in the English exam system (ostensibly questions designed to test mathematical reasoning; really they tend to just be simple arithmetic problems set in some kind of 'real world context,' that require these kinds of breaks in logic, like Best Buy questions.)
Re what you said about approach vs class in your first reply - yes that makes total sense. I've seen the same video of the nursery children solving simultaneous equations, but it doesn't seem like the results of Project Follow Through and scaling up implementation put the time frames for the student outcomes you mention within reach - I'm sure you have a rationale so I'm looking forward to finding out more about your vision for that as a realistic scalable possibility.
Re timing, your second reply - I think I have some understanding of what you mean here, though it seems to me that when I have tried this strand approach each time I return to an atom I inevitably end up increasing the difficulty or adding new learning to it. So, for example, I teach how to add or subtract from algebraic expressions vertically, to mirror the layout for solving equations. Then the next time, I do the same but with equations so they have to add or subtract the specified quantity to the expressions on each side. Then the next time they do it, it is interleaved with multiplying expressions by given quantities (already block practised in isolation). It just seems like there are so many little things to practise that I can't not add a little extra each time. Is that not in the intended spirit of the 80% review?
Re what then why, 3rd reply - oo I like the idea of being explicit about when I will explain the why. I feel that would help them to set aside misgivings and hold me accountable too. Hmm I'm not sure your hypothesis is correct, we had already briefly discussed conceptual meanings, used number lines to move to the left and right depending on adding or subtracting (a positive number to/from an integer), developed and practised the rule you specified about same direction, add magnitudes, different direction, find the difference, and then finally (i.e. lessons later) were looking at adding or subtracting negatives from an integer. And my 'why' eventually consisted of one of the typical type explanations, a floating basket with balloons (+ve numbers) and weights (-ve numbers) attached to it. I understand this is hardly rigorous and definitely longer, but I do feel it's much more satisfying than 'trust me, subtracting a negative is the same as adding'. And my perception of the payout in terms of student buy in has been that it's worth it. Incidentally, I don't think the explanation of it makes too much difference to whether students can ultimately do it fluently which is just way more a function of how much it has been practised and retrieved and used in different contexts.
I basically haven't used a headache aspirin type motivation tool before but would certainly be interested to try it, seems likely it would be an effective tool for many students.
Re one atom prompting the next, 4th reply - ah yes you had made this point.
Thanks for your thorough and thought through responses
Second thought:
Not many things have given me pause about the atomisation process since I've come across it. But I do worry about the lack of explicit teaching about conceptual generalisation at work here.
In this case, what a and b "do" for the sequence. I suppose the idea is that kids generalise themselves from examples? It just seems like a case where having a conceptual picture of what's going on is helpful for internalising the steps (either a literal picture or some mechanical sense of how the rule generates the sequence).
But it seems like maths teaching is like political spin - if you're explaining you're losing?
Ha! That's a funny way of putting it.
No I think it's completely fine to explain things using language, I'm just very judicious about when and how I do it.
In this instance my goal was to ensure even the weakest class, with next to no ability to process negative arithmetic, could form the Nth term rule of a descending linear sequence that starts with a negative term.
I have, say, 5-10 minutes to achieve that goal.
To my mind, this is an efficient method, that will work every time, doesn't contradict anything they will learn in the future, and will not confuse anyone. It's so effective that, as I've started working on some linear sequences tasks for a group of tutees recently, this is how I figured out the answers to my own questions.
So in my example I don't even have an 'a' and a 'b,' but it *is* clear that the difference forms the term in N, and the constant is formed by a combination of the term in N and the first term. Whether students will consciously process that... probably not, given what I've presented so far. No tasks I've suggested here force that kind of cognitive activity.
So, if we want those connections to be made, we have to do something else.
I have one 'why after what' that I've put together (for the aforementioned tutees). That is just an explanation of why writing '5 - 3 + 3n' gives the Nth term rule (generalised to any linear sequence).
I don't think it's my best work, so I want to improve upon that in the future.
Naveen gives the general model for the sequence, which I think is an excellent way of doing it.
I can't attach photos in comments, but I'll see if I can reply to the Note with a photo.
To try to summarise:
- use of language: yes, could do, but there's a time and a place. More on this coming up in regular posts soon
- explaining 'why' this works: something we try to do as much as possible. I personally favour 'what then why,' and in this instance there was probably only enough time for the what
- generalisation: I think general forms are very powerful, but rather than giving students general forms, which tend to be *very* difficult for them to unpack and make sense of, I usually get *them* to construct the general form as a part of an expansion sequence
e.g.
Generate the first four terms of the sequence given by 2 + 3n
Generate the first four terms of the sequence given by 2 + an
Generate the first four terms of the sequence given by b + an
First thought:
A lot of the difficulties here seem downstream of the labelling. Maybe this is just an annoying UK convention that you're stuck with, but here in Australia I'd:
1. label the first point as the Zeroth term (maybe a fact, or maybe the act of labelling is a transformation?).
2. Introduce the parameterisation T_n=a+bn (Fact)
3. Finding the difference (Routine)
4. Call it b (transformation)
5. a= the Zeroth term, starting point, etc. (transformation)
6. T_n = 40 - 5n
(This way of doing things also has the advantage of being a closer analogue to linear equations where a=the y intercept, and x=0 is a more natural "starting point")
I'm also interested in why you don't state the rule as an equation:
Nth term = rule
Both of you left the rule as an isolated expression. Is there a deep rationale for that? Or is it just one less thing to explain?
"1. label the first point as the Zeroth term (maybe a fact, or maybe the act of labelling is a transformation?)."
Something in this is brilliant, imho.
I have never before considered labelling the first term as the 0th term 🤔 It gave me a lot to think about.
Where I ended up:
Given this sequence: 5, 7, 9, 11
We would normally say the Nth term rule is 2n + 3
If we label 5 the 0th term, then the rule becomes 2n + 5
Which is much simpler - it has a much closer relationship to what we observe in the sequence at the surface level. (Even more so if you adopt Naveen's rearrangement of 5 + 2n, which I've started doing.)
But, it only follows *if* you consistently label the 'first term' with n = 0.
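As a quick check of the point above, here is a minimal Python sketch (the variable names are just illustrative) showing that 2n + 3 counted from n = 1 and 2n + 5 counted from n = 0 describe the same four terms:

```python
# Same sequence, two indexing conventions.
# Conventional labelling: first term is n = 1, rule is 2n + 3.
one_based = [2 * n + 3 for n in range(1, 5)]

# Zeroth-term labelling: first listed term is n = 0, rule is 2n + 5.
zero_based = [2 * n + 5 for n in range(0, 4)]

print(one_based)   # [5, 7, 9, 11]
print(zero_based)  # [5, 7, 9, 11]
```

The rule is only meaningful alongside the statement of where n starts, which is the caveat raised above.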
There are parts of mathematics where we do this by convention e.g. Pascal's triangle.
I also really like that it more closely models the general form of a linear equation, as you say - I think that's very powerful.
But for whatever reason, we do not do this as standard for linear and quadratic sequences - certainly not in England.
If a student wrote that the Nth term rule was 2n + 5 they would get 1 out of 2 marks, and the + 5 would be considered a common misconception.
If a student explicitly showed that they were defining their indexing to start with n = 0, on the other hand, I don't know whether they would pick up all the marks in the exam or not. Everything I've found defines the first term as n=1 by convention.
So it's a great idea that might lead to a lot of confusion for novice learners. I don't think I would advocate for it for that reason - it eventually ends up with you having to state lots of caveats, exceptions, conventions, expectations, in order to fully make sense of it all, to make sure they pick up all the marks on the exam, and to make sure they're not confused by what they'll read elsewhere.
I would add that Naveen often asks students to 'find the 0th term' as a part of her process. So given this:
1 2 3 4
5, 7, 9, 11
She asks them to write this:
0 1 2 3 4
3, 5, 7, 9, 11
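A rough Python sketch of that "find the 0th term" step, assuming the sequence is linear and its terms are listed from n = 1. The function name is hypothetical, not something from Naveen's materials:

```python
def zeroth_term_and_difference(terms):
    """Step back one difference from the first term of a linear sequence.

    `terms` is assumed to list the 1st, 2nd, 3rd, ... terms in order.
    Returns (zeroth term, common difference), so the rule is
    zeroth + difference * n with n starting at 1.
    """
    difference = terms[1] - terms[0]
    # Check the sequence really is linear before trusting the rule.
    if any(b - a != difference for a, b in zip(terms, terms[1:])):
        raise ValueError("differences are not constant, so not linear")
    zeroth = terms[0] - difference
    return zeroth, difference


zeroth, difference = zeroth_term_and_difference([5, 7, 9, 11])
print(f"{zeroth} + {difference}n")  # 3 + 2n
```

For a descending sequence the subtraction becomes "first term minus a negative", which is where the negative arithmetic difficulty mentioned earlier would come in.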
For my part, as a part of a broader *conceptual atomisation* I might include something like this, but as part of a *routine atomisation* to get from that initial prompt to the correct Nth term rule, I'm not convinced it's 'unstoppable;' I think given a more difficult sequence a lot of students will struggle to figure out 'what would the 0th term be.'
So helpful in a broader sense to have been introduced to that idea and seen the relationship between it and the final Nth term rule, but not the more straightforward way to construct the rule.
"I'm also interested in why you don't state the rule as an equation..."
It's not the norm here, but it's my preference.
There are several ways of doing that. Something like T_n is probably most common, but I think I would prefer to define it as a function T(n).
However, to do that, you need to define functions and function notation first, and these are generally very, very poorly taught topics, and poorly understood by students (and by many teachers - it took me a long time to feel like I had fully gotten my head around them - and I was coming to maths teaching with a Masters in Physics. Many maths specialists in England don't come from a STEM background.)
So I (a) have to assume that a bottom set class will find that even more difficult and (b) there isn't time to teach them that, ad hoc, *and* teach them to find Nth term rules that involve processing negatives, which they are also not strong with.
If I were giving this a more thorough treatment and assumed I had a lot more time, and was planning over years, then I probably would choose to define the rule as a function, but even if I chose to do that it would still be against the grain of what's typical in England (though I am quite happy to go against the grain of what's typical; what's typical isn't working for most kids.)
I suppose a final note on that is that expressions are valid ways of defining functions. There are 8 different ways of defining a function, and isolated expressions are one of them.
This is lovely! Using Naveen's approach, the zeroth term appears as the value of a, so it could be pointed out afterwards. But the calculations etc required are identical either way for linear sequences. However, for quadratic sequences, things are different: finding the zeroth term first (which means working out the zeroth first difference and then the zeroth term - that's a non-trivial routine) makes things a bit easier in some ways: you can then read off the nth term as (zeroth term) + (zeroth first difference) * n + (zeroth second difference) * n(n-1)/2. But it's still somewhat complicated, and if the examiners expect a simplified form, there's now more algebra to do. It does extend nicely to cubics, though: you just add an extra term of (zeroth third difference) * n(n-1)(n-2)/3! .
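To make the zero-based formula concrete, here is a small Python sketch. It assumes the list starts at the zeroth term and that the sequence is polynomial; the function names are made up for illustration. `math.comb(n, r)` equals n(n-1)...(n-(r-1))/r!, so it supplies the n, n(n-1)/2, n(n-1)(n-2)/3! factors directly:

```python
from math import comb


def leading_differences(terms):
    """Return [zeroth term, zeroth first difference, zeroth second difference, ...]."""
    leads = []
    row = list(terms)
    while row:
        leads.append(row[0])
        # Each pass replaces the row by its differences.
        row = [b - a for a, b in zip(row, row[1:])]
    return leads


def nth_term_zero_based(terms, n):
    """Evaluate the difference formula at index n (terms[0] is the zeroth term)."""
    return sum(d * comb(n, r) for r, d in enumerate(leading_differences(terms)))


cubes = [0, 1, 8, 27, 64]            # n^3 for n = 0..4
print(leading_differences(cubes))     # [0, 1, 6, 6, 0]
print(nth_term_zero_based(cubes, 5))  # 125
```

The fourth difference comes out as 0 here, which is how the differences signal that a cubic is enough.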
Actually, writing this makes me realise that we can just write down a formula for the nth term of a quadratic sequence where our sequence starts counting from the first term: just replace n by n-1 in the above formula (so the nth term becomes the (n-1)th term in a reindexed sequence that starts at the zeroth term): the resulting formula is:
(first term) + (first first difference) * (n-1) + (first second difference) * (n-1)(n-2)/2 (where the final /2 is actually /2!),
and this equally extends to cubics, etc. If students' algebra is competent, then there is only one - slightly more complicated - fact to be memorised, and the chain becomes (in compressed form):
1. Calculate the first few differences
2. Calculate the first few second differences
3. Write down the above formula (presumably with letters replacing "first first difference" etc)
4. Substitute in values
5. Expand and simplify the resulting expression
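The five steps above can be sketched in Python for the quadratic case. This is only an illustration under the assumption that the terms are listed from n = 1 and the second differences are constant; the function name is hypothetical. The expansion step uses (n-1)(n-2)/2 = (n^2 - 3n + 2)/2:

```python
from fractions import Fraction


def quadratic_nth_term(terms):
    """Return (a, b, c) such that the nth term is a*n^2 + b*n + c, n starting at 1."""
    # Steps 1 and 2: first and second differences.
    first_diffs = [y - x for x, y in zip(terms, terms[1:])]
    second_diffs = [y - x for x, y in zip(first_diffs, first_diffs[1:])]
    if len(set(second_diffs)) != 1:
        raise ValueError("second differences are not constant")

    # Steps 3 and 4: first + d1*(n-1) + d2*(n-1)(n-2)/2 with values substituted.
    first, d1, d2 = terms[0], first_diffs[0], second_diffs[0]

    # Step 5: expand and collect powers of n.
    a = Fraction(d2, 2)
    b = d1 - 3 * a
    c = first - d1 + d2
    return a, b, c


print(quadratic_nth_term([3, 8, 15, 24]))
# (Fraction(1, 1), Fraction(2, 1), 0), i.e. n^2 + 2n
```

`Fraction` is used so that an odd second difference gives an exact half-integer coefficient rather than a rounded float.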
In response to an earlier comment, yes, this is intimately connected to calculus, and is sometimes called "finite calculus". The reason the above works is that (using zero-based indexing) the first difference of n(n-1)...(n-(r-1))/r! is n(n-1)...(n-(r-2))/(r-1)!, the second difference is n(n-1)...(n-(r-3))/(r-2)!, etc. The values of these expressions at n=0 are all zero, except for the rth difference, which is 1. So we can get the coefficient of n(n-1)...(n-(r-1))/r! by taking the initial rth difference. (It's worth calculating the sequences given by some of these expressions and their first, second,... differences to get a sense of what's going on.) This is the discrete analogue of the derivative of x^n/n! being x^(n-1)/(n-1)!, so we can find a polynomial f by calculating the repeated derivatives at zero: f(x) = f(0) + f'(0)x + f''(0)x^2/2! +... and this is the Maclaurin series! (How and whether it works when f is not a polynomial is far more subtle, and certainly not for A level!)
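For anyone who wants to try the suggested calculation, here is a short Python sketch (helper names are illustrative). It tabulates n(n-1)...(n-(r-1))/r!, which is `math.comb(n, r)`, and checks that its first difference is the same expression with r replaced by r-1:

```python
from math import comb


def falling_over_factorial(n, r):
    """n(n-1)...(n-(r-1))/r!, which equals the binomial coefficient C(n, r)."""
    return comb(n, r)


def first_difference(values):
    """Forward differences of a list: next term minus current term."""
    return [b - a for a, b in zip(values, values[1:])]


r = 3
sequence = [falling_over_factorial(n, r) for n in range(8)]
print(sequence)                    # [0, 0, 0, 1, 4, 10, 20, 35]
print(first_difference(sequence))  # [0, 0, 1, 3, 6, 10, 15]

# The first difference matches the r-1 expression, term by term.
lower = [falling_over_factorial(n, r - 1) for n in range(7)]
print(first_difference(sequence) == lower)  # True
```

Note that both lists start at zero when n = 0, which is the property the argument above relies on.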
(Though the examiners probably won't love it.)
And it also suggests a "best" formula for the linear case: a + b(n-1), where a = first term, b = (first) first difference, as this part remains unchanged when you go on to quadratic sequences. But the obvious downside of this is that there is more algebraic fluency required.
And it dawns on me now that many GCSE questions don't specify the form of expression required for the nth term, so the "expand and simplify the resulting expression" step is not even needed in those cases!
We should always teach forward facing methods. Teaching approaches that transfer across to complex questions rather than rules and tricks that fall apart later. Reforming long-term memory connections is more onerous than learning correctly first time.
Couldn't agree more with this. It's a key part of Engelmann's Theory of Instruction, as well. In his 'Rubric for Identifying Authentic DI Programmes,' it's axiom 1n:
"1 n. A rule or information must not be contradicted by what will be introduced later."
https://www.zigsite.com/PDFs/rubric.pdf
Although I think that axiom allows us a little leeway.
For example, my final routine does not directly help you with finding a quadratic Nth term rule, assuming you use the substitution method (it can actually assist with the subtraction method), but nor does it *contradict* anything in there.
So I still view it as 'allowable.'
I might write a little more on why I'm still very in favour of my atomisation in response to comments below.