in touch with real speech

Speech in Action

39 – Randy Newman can or can’t

I am a big fan of Randy Newman. I have been since my late teens when I first saw him on BBC television (black and white I seem to remember). For those of you who don’t know him, he is the composer of ‘You’ve got a friend in me’ and the music for Toy Story, and he has been nominated for an Oscar twenty times, and won twice (a poor success rate, which amuses him greatly). Outside of writing for films, his songs are renowned for their dark lyrics (the persona of the song is often a very unpleasant character) and bitter-sweet melodies and instrumentation/orchestration. He recently recorded an interview with the BBC (you can download the podcast from here) which I listened to with great glee.
Because I am working on a book about listening (A Syllabus for Listening: Decoding) I tend to listen not only to the content, but also to how people say things. And it dawned on me that there was one part of the interview where I was not sure what Randy said.
Listen to this sentence, where he is giving the second half of his answer to an audience member who asked what advice he would give to young songwriters. Does he say can or can’t?

And … in terms of writing I xxxxx give you advice

I had at first thought that he was saying can’t. In the immediate context, he does go on to give advice, but I thought that this was a kind of humble disclaimer – a move that many people use as a prelude to giving advice (often in the form ‘I’m not sure I’m the right person to give you advice but …’). But on re-listening, I thought that it actually could be can. So I did a quick survey of colleagues in a discussion group which specialises in pronunciation. Eighteen people responded and ‘voted’:

  • Eleven people ‘voted’ for can
  • Six people ‘voted’ for can’t
  • One person ‘voted’ for don’t know (‘stumped’)

So experts disagreed on what he actually said. When this kind of thing happens, I always think that we have an insight into something important. As I said, my interests these days are in the teaching and learning of listening, and so what is of concern to me is what teachers and (especially) learners make of the stream of speech as they listen and re-listen to it in the classroom. My experience is that it is possible for there to be different reasonable hearings of many stretches of speech. In the case of Randy Newman’s example it seems to me that both can and can’t are reasonable hearings of the sound substance.

And when learners comment ‘Oh I cannot tell when Americans/Scots are saying can and can’t’ – they have made a judgement that to their ears, the two things (such as can and can’t) sound the same – and often they are right, as far as the sound substance is concerned! But being expert listeners we ignore the mushy characteristics of the sound substance and allow our certainty of what was meant to shape our perceptions of what we have heard. We are deaf to the indeterminacy, the in-between-i-ness of the sound substance. For teaching decoding, we need to allow for the fact that opposites may sound the same: we need – to be effective teachers of listening – to be undeaf to these happenings.

Even when we – the teachers – are certain what meanings were conveyed, and we are certain what words were intended, the sound substance may well – with justification – be heard to be contrary to both of our personal certainties. Even when we believe that the occurrence of a word in the context of the recording is ‘impossible’, its sound shape may be present. Our expertise as native listeners, or our status as expert listeners (expert = we know what the words are), may deafen us to the fact that the sound shape may be hugely at variance with our expectations.

Amongst the comments I received from colleagues about this ‘survey’ were two of particular interest. The first was from Professor John Levis of Iowa State University who wrote …

…given that the succeeding context has him giving advice … “can” makes the most sense, but the phonetic output alone seems ambiguous.

And Helen Fraser from Australia and creator of the Rethink Speech website wrote …

In some ways this is, to my mind, the most interesting thing of all: how often unresolvable ambiguity occurs in speech, usually unnoticed by conversation participants.

Lastly, on this page, you can hear different cans and can’ts from other parts of the same interview.

Photograph of Randy Newman by Robb Bradley from here

38 – Lifetime achievement award – Ron Carter

At the British Council’s annual awards ceremony, the ELTons, Professor Ron Carter was given the lifetime achievement award. Unfortunately, because of illness, he was not able to attend in person, but his friend and longtime colleague Chris Kennedy read his acceptance speech (beautifully). Below is an extract from the acceptance speech, which I find particularly important … ‘don’t be dazzled, there is a lot we don’t know’.

Each year we see, as witnessed at the ELTons, astonishing levels of pedagogic and technical innovation in all aspects of course materials. The field we are in is exciting. My main hope for the future is that we do also continue to keep a precise description of the English language in our sights. It is easy to think we know a lot about the English language, and of course we do. But there is a risk that dazzled, and rightly so, by ever more creative technologies, that we may take for granted our knowledge of the English language. For there is a lot we don’t know, especially about the spoken language, about language beyond the level of the sentence, and about its newest forms in e-communication. We all need to continue learning about the English language in all its globally relevant forms.

You can see the full acceptance speech here, starting about 01:25:45.

37 – Earworms 2 – Would you like a … July … liar?

Last summer (2016) I was invited to IH London to give a two-hour seminar to teachers of English from the Basque country. They were amazingly enthusiastic and very receptive. I had given (as I usually do) a talk/workshop on how to prepare students for their listening encounters with normal everyday speech, focussing on examples of fast speech.

At the end of the session, one of the teachers came up to me and told of her experience shopping at a local supermarket, and more particularly, a question that she was asked at the check out. The question was ‘Would you like a receipt with that?’ but it was spoken so fast that the teacher had – at first – no idea what had been said. She was utterly bamboozled. Quite how she got from being bamboozled to knowing what the words were (I regret to say) I did not find out.

But I thought I might make another earworm out of it, for the purposes of demonstration at the latest IATEFL conference in Glasgow. At the conference I justified the use of earworms such as this by arguing that they will stick annoyingly (or amusingly) in the heads of learners, and thereby get them accustomed to short stretches of rhythmic sound substance. My hope and belief is that such repetitions would accustom their short-term memory and their mechanisms of speech perception to better decode the stream of speech of the language they were learning – in this case English. So below, in the table, you can hear the Greenhouse, Garden and Jungle versions of this question. The Greenhouse and Garden versions contain just one run-through, but in the Jungle versions you will hear them twice, interrupted by July and liar.

 Version         Wording                              Transcription
 Greenhouse      Would you like a receipt with that   |wʊd juː laɪk ə risiːt wɪð ðæt|
 Garden          Wuhju lykuh receipt withthat         |wʊʤjuː laɪkə risiːt wɪðæt|
 Jungle – July   Wuh julyuhareseewithat?              |wʊjuːlaɪərisiːʔwɪðæt|
 Jungle – liar   Wuhyouliareseewithat?                |wʊjuːlaɪərisiːʔwɪðæt|

The reason that we have a July version of the Jungle version is because part of the sound substance (the end of would, the whole of you, and the beginning of like) is hearable as July. And the reason we have a liar version is because part of the sound substance (like – with the |k| dropped, thus giving us lie – and the indefinite article a) is hearable as liar.

Those four parts were stitched together to give us the following:

I am not teaching at the moment, so I don’t know myself whether these earworms are as useful as I like to think they are. They certainly amused my audience of teachers very much. But that is not (of course!) proof of their usefulness.



36 – Earworms

One of the teacher-trainers that I most admire is Adrian Underhill. I like the way he encourages learners and teacher trainees to explore. He encourages people to mouth (maʊð) a full range of sounds, not simply the ‘correct’ ones – and he encourages people to dance around, using their whole body to get the feel of sounds.

And I think those of us who teach decoding in our listening classes have something to learn from his methods – and this brings me to the notion of earworms. This is an idea that I first heard about in a presentation by Annie McDonald of which you can find more details here.
An earworm is an annoying extract from a song that keeps on re-playing in your head long after you have heard the song. For me, this song by Hank Williams creates an ear worm: ‘Why don’t you love me like you used to do? How come you treat me like a worn-out shoe? My hair’s still curly, and my eyes are still blue, why don’t you love me like you used to do?’

I wonder (I’ve never tried it, so I don’t know if it will work) if we could attempt to create and plant earworm-like stretches of speech in our learners’ heads, and encourage them to cherish them (rather than banish them) and repeat them over and over as they walk, jog, run, or exercise in the gym.

I’ve created one, which you can hear below, which uses the words ‘where there were’ (these words also feature on this page) as part of a follow-up to a Jungle Listening lesson (no. 10 of a pilot publication you can find here).

The idea of this particular ear worm is to provide as many different soundshapes of the word cluster ‘where there were’ as it was possible for me to do (yep, the voice is me).

The idea is to explore a sample of the range of ways in which these words might be said and heard in a world (the real world) where people have accents and the words can be said in an infinite number of ways which cannot be constrained by rules. An earworm such as this is a form of vocal gymnastics which might go some way towards fulfilling an important requirement of any decoding work identified by John Field:

‘… to encounter the same words in a wide range of contexts and voices … [by assembling] … examples of the same groups of words uttered in different circumstances and at different speeds by a number of different speakers’ (Field, 2008: 166)

My hope is that, by creating earworms, we can produce the effect of hearing words in different voices, accents, and speeds that learners can carry around in their heads, and repeat. The desired effect would be to get them accustomed to handling real-life English at fast speeds, expanding the capacity of their short-term memory to hold stretches of the sound substance of English in their minds.

35 – Travelling without a map

The way we teach listening is like insisting that travellers arrive at a destination via several stops without giving them the means of travelling.

It’s like asking people to move from point to point with a map that has geographical features (hills, valleys, rivers) but with the roads, tracks and trails missing. You may want them to be able to identify Mount Big Meaning, but not allow for the fact that they may find themselves in a tunnel when the moment for identification comes. You may want them to identify Castle Tikbox, without allowing for the fact that they are wholly focussed on crossing a wild river – jumping from stone to stone – without losing their balance.

We need to describe the whole journey, teach the means, teach the patterns of the stream of speech: the roads, the trails, the footpaths. We need to focus far more on the relationship between sound substance and its interpretation (decoding, meaning building).

What we currently do is pretend our learners can travel meaningfully, without giving them the means of navigating through the stream of the sound substance. We keep them in ignorance of the sound substance (because we teachers are ourselves ignorant) and therefore deny learners the means of learning how to navigate.

Listening Cherry 34 – Aping the goal

Imagine a concert pianist, on stage playing a virtuoso sonata by Liszt. Wonderful patterns of notes, a beautiful and moving (in all senses) soundscape of colours, major and minor keys, cascades, soft then loud, etc. etc. This is a public display of expertise which is a wonder to behold. Expert behaviour, hard earned and hard learned over a long period of time.

But what if someone claiming to be a teacher decided to teach pupils such expert behaviour by focussing on the visible, observable aspects of performance? Their pupils would be encouraged to sit at a piano and splash away at the keys, imitating the observable rapid movement of the fingers, the coordination of the hands, and the foot movements on the pedals. All without learning the means of playing the piano: disciplined, controlled slow movements, simple scales, starting with easy pieces. The result may look enchanting, an accurate depiction of what it takes to be a famous pianist, but the sound would be awful. All without paying attention to the sound.

This would be a case of aping the goal (where ‘ape’ means ‘to imitate someone or something, especially in an absurd or unthinking way’) at the cost of dealing with the major issue of sound.

In many approaches to listening comprehension exercises we ape the behaviour that is the goal, while minimising the amount of detailed instruction that provides the means towards this goal – increasing students’ mastery of the sound substance. We are thus goal-obsessed, and we starve our learners of the means of achieving the goal.

We expect learners to role-play native speaker/expert listener behaviour in listening comprehension lessons by catching meanings. But we don’t teach them how to perceive words in the sound substance of speech.

We get them to ape the goal behaviour (the describable elements of it) without giving them the means (the dimension of sound) whereby the full goal behaviour can be achieved.

The belief seems to be that through undergoing repeated listening comprehension exercises of this type, learners will eventually learn how to perceive words in the sound substance. It is as if we are leaving the undescribable (or what we believe to be the undescribable) to work its magic on the learners’ perception unconsciously, while we focus attention on what we can describe.

Image from here. Oh, and Igor Levit, who is pictured, is a wonderful pianist who produces the most gorgeous sounds.

Listening Cherry 33 – Selective reality

We like to think that making listening as real-life as possible is the best way to teach listening. But our use of ‘reality’ in the design of lesson materials is selective. We steer our lessons as close as we dare to real-life listening, and we focus on extracting meaning, but we remain in denial about the realities of the sound substance.

We keep the number of listenings as close as possible to one, because – the argument goes – in real life we only get one chance. We make the learners listen as though they are present and active at the interaction that has been recorded. And we fill their minds with contextual information about the people, the situation, the purpose and predictions about what will be said. We plug learners into a reality role. We plug them into a mindset and situation.

The problem is that the more we steer closer to these realities at the level of meaning, the less time there is to focus on the realities of the sound substance of speech – the normal messiness of everyday speech. The urge to mimic reality leads us to forget that the classroom is a place for teaching and learning, and that (pretty much) anything goes as long as learning is effective.

But it’s nobody’s fault. ELT simply does not (yet) have a model of speech which encompasses the messiness and wildness of everyday speech (as I have said frequently in this blog, the ‘rules of connected speech’ are wholly inadequate). The only model of speech that exists in ELT is the Careful Speech Model – optimised for clear, intelligible pronunciation.

In the absence of a model of spontaneous speech (optimised for listening), the requirement to mimic reality is convenient for us, because it takes up a lot of time and it enables us to feel we are doing a good job as teachers and materials writers. Because ‘that is the way good teachers teach listening’ – we conform to the expected behaviour. The trouble is we are ignoring the realities of everyday normal speech.

We are in denial about the realities of everyday speech. To adapt the words of a famous Calvin and Hobbes cartoon ‘It’s not denial, we’re just very selective about the reality we accept’.

Listening Cherry 32 – The black box

We still behave, as a profession, as if the secrets of learning to listen are hidden inside a black box whose mechanisms are unknowable and unteachable. Two things inside the black box seem particularly unknowable and un-teachable: (a) the messy, unruly sound substance of normal everyday speech and (b) knowledge of what our students make of this sound substance. Because we ‘don’t know’ what goes on in this black box we focus almost all our efforts on what happens before and after the black box. We focus on the input and the output.

We strive very hard to make the input authentic, useful and appropriate – matching topics, vocabulary, context, and characters in a way that will motivate learners and facilitate transition to work on other parts of the syllabus.

We also strive hard to make the output appropriate: making the tasks that the students have to do while/after listening valid acts of meaning and communication.

We put extraordinary focus on the input and output, relying on the power of contextual meaning and contextual appropriacy to skip over the problems and challenges of the black box processes. We seem content to let the black box continue to be impenetrable and intractable.

But hang on, is that fair? Don’t we give students strategies to take with them while they are engaged inside the black box? Indeed we do. Before they enter the black box, we get them into a good learner frame of mind (focussed on the task, feeling good about themselves as learners) and we exhort them to apply good behaviours (don’t strive to hear every word, listen for the stresses, build meanings, re-evaluate and reconsider). We then exhort them to apply these behaviours when they go through the black box. And after they have been through the black box we focus on their performance of these good behaviours.

But this is still about input and output – it’s like giving people warm clothes and motivating talks before they go for a walk through an unlit mine – they have to navigate without a light, at speed, and afterwards report what they sensed in the mine (they couldn’t see anything, remember). And they have to report on the state of their clothes, whether they stayed warm, and whether they still felt good about themselves after the walk. So the preparation before and the report after are more concerned with the mine walkers’ self-management strategies than with the nature of the mine. So it is with listening classes. We are expert at the before and after, but largely inexpert in our knowledge of the sound substance of speech. We do the before and after very well – but we avoid the sound substance, with our focus on the peripheral (worthy, useful, but still peripheral) rather than on the central issues.

This idea of listening as a black box comes from Michael Rost, who wrote fifteen years ago:

Listening is still often considered a mysterious “black box” for which the best approach seems to be ‘more practice’. Much work needs to be done to modernise the teaching of listening. (Rost, 2001: 13)

Personally, I am wholly against the idea that the best approach to listening is ‘more practice’. If you are interested in modernising the teaching of listening, keep following this blog. You can also attend a workshop I am giving in April 2017 in London at the London Language Lab here. You can buy my Phonology for Listening: Teaching the Stream of Speech here, or wait for my Syllabus for Listening: Bottom-up approach – due late 2017.

Rost, M. (2001). Listening. In R. Carter & D. Nunan (Eds). The Cambridge Guide to Teaching English to Speakers of Other Languages. Cambridge: Cambridge University Press.

Listening Cherry 31 – Thinking warm

One of the problems with our current approach to teaching listening is that we can overdo/dose on the preparatory and post-listening activities. And we thereby run the risk of stealing time away from direct encounters with the sound substance of speech which is contained in the recordings.

It is like spending most of the time of a swimming lesson outside the pool, having long preparation and post-swim talks which deal with:

  • Security of belongings
  • Lifeguards and first aid
  • Being safe – no jumping
  • Following health procedures – foot bath, hair wash before entering the pool
  • Warming up activities
  • [Swim]
  • Showering
  • Drying
  • Dressing
  • Feedback
  • Filling in evaluation forms for the pool administration

And rather than teaching them to swim, we give them things to think about while in the water which will make them good controllers of their own metabolism, as they move from the warmth of their clothes, to the cold of the pool, and back again.

Yes, it will feel cold, but how are you feeling at the moment in your everyday clothes? Warm, good. So while you are swimming, I want you to remember how you feel right now, in warm clothes. I want you to ‘think warm’ throughout the whole swim. You will feel a whole lot better about swimming when you think warm – you almost won’t notice the cold.

Listening Cherry 30 – Waterfall listening

Some listening lessons are like standing under a waterfall – you centre the student under the main flow of the water so that it is directed at the centre of their head. Sometimes the flow of water is a gentle trickle and they wonder what the value of standing there is. Suddenly torrents hit them hard on the head and cascade down over the shoulders and become a force under which it is difficult to stand. The student moves to one side and looks up as if to reprimand the waterfall and is surprised by a new, differently-angled cascade that catches them full in the face. They take in mouthfuls of water, and are blinded by dollops of water catching them in the eyes, which they have to rub to clear. In doing so they lose balance and stagger, and a new harder cascade catches the top of their swimming costume, and hands come off the eyes onto the swimming costume to prevent it slipping. The student is flustered, embarrassed, temporarily blinded, coughing and spluttering water.

And the teacher asks ‘Did you see the way the sunlight caught the stream of water and made a rainbow out of the fine spray?’

This was, of course, a strategy-free lesson.



Richard can be contacted at

Tel: 07790 629859