Voice (AI driven) – gift for teachers and learners?
‘Voice’ is one of Mary Meeker’s top five internet trends. Voice could be a game changer in interface behaviour, like keyboards, mice, joysticks and touch, but what impact could it have in learning?
Voice recognition, driven by leaps in AI, can now be personalised to the speaker, can even recognise tone and emotion, and is hands-free. It’s also low cost, requiring only a microphone and speaker, and chimes with the rise of the Internet of Things. The three big barriers to adoption are accuracy, latency and social awkwardness.
Recognition and response must be accurate and fast. Failures and slow speeds turn users off. The good news is that we’ve punched through the 90% accuracy barrier and are moving fast. At 95-99% (not easy) the show really is on the road. Already, the proportion of smartphone users using voice went over 60% in 2015 and that number is rising as the technology gets better and habits change.
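To see why those last few percentage points matter, here is a back-of-envelope sketch. It assumes (my simplification, not Meeker's) that the accuracy figure applies per word and that errors on each word are independent, so the chance of a whole request coming through untouched is the accuracy raised to the number of words. The function name and the ten-word request are purely illustrative.

```python
def whole_request_accuracy(word_accuracy: float, words: int) -> float:
    """Chance that every word in a request is recognised correctly.

    Assumes per-word accuracy and independent errors (a simplification).
    """
    return word_accuracy ** words


if __name__ == "__main__":
    for accuracy in (0.90, 0.95, 0.99):
        chance = whole_request_accuracy(accuracy, words=10)
        print(f"{accuracy:.0%} per word -> {chance:.0%} of 10-word requests error-free")
```

On those assumptions, 90% per-word accuracy gets roughly one ten-word request in three through cleanly, while 99% gets about nine in ten.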
The uses, ranked, are: 1) General information (30%), 2) Personal assistant (27%), 3) Local information (22%), 4) Fun and entertainment (21%). An astonishing 1 in 5 searches on mobile in the US (Android) are now by voice. But there are problems Meeker doesn’t mention. Sure, we can speak (150 wpm) faster than we can type (40 wpm), but we can also read faster than we can listen. There’s also the huge embarrassment hurdle of speaking to non-humans in public spaces. She herself sees the initial impact in hands-free environments – home (43%), car (36%), on the move (19%), at work (3%). This is where avoiding typing and menus, speed and convenience really matter. But in many other contexts, silence may remain golden.
AI driven voice
With Siri and Amazon Echo we see early signs of the power of voice. Viv is coming and a slew of innovations in AI has improved the efficacy of voice recognition. Jeff Bezos thinks that AI will underpin tech for the foreseeable future. He goes further and thinks we currently understate its potential impact. Of course AI is not one thing; it is many very different things, and voice recognition is just one of its many stunning applications. Sci-fi films have been showing us voice activated worlds for decades – it is now a reality. Natural language voice recognition, with the help of machine learning, has accelerated in just the last few years to become a mainstream consumer product.
Amazon’s Echo is a home-based, voice activated personal assistant, a competitor to Siri, using a platform called Alexa. Alexa has two software development kits: a voice service and a skills kit. As a customer, you get a weekly email telling you about new skills (>1000). This is a big push with over 1000 Amazon staff and tons of folk doing 3rd party apps. Echo will play your music from Spotify, using just voice commands, even from the far side of the room when music is playing (clever), handle Google Calendar, read audiobooks, deliver news, sports results and weather, order a pizza, get a cab on Uber and control lights, switches, thermostats and so on. It is a frictionless interface to the ‘Internet of Things’. (An interesting tangent for voice activation is its use by those who are disabled.)
Above all, as a cloud-based AI service, on tap, it learns fast and is always adapting to your speech, vocabulary and preferences. It becomes, in effect, a personal assistant that learns not only about you but also, as an agent, from aggregated data. This is where it gets interesting.
Echo as teacher
Spoken versus written word
Behind this shift from text to voice is an interesting debate. One could argue that it could push us towards a more authentic and, I would argue, balanced form of education and learning. Typing will always be an awkward interface. It is difficult to learn, error prone and requires physical input devices. Reading is another skill that takes years to master. The spoken word is a skill we do not have to be taught. We are grammatical geniuses aged three. Speech is primary and normal, reading and writing relatively recent adjuncts. So, when it comes to learning, speech recognition (input) and synthesised voice (output) give us frictionless dialogue. It could stimulate a return to more Socratic forms of teaching and learning.
This may result in significant improvements in teaching and learning, both of which have, arguably, been over-colonised by ‘text’. Schooling, in all of its manifestations, has become ever more obsessed by text and there is a good argument for rebalancing the system away from an endless series of written tasks, essays and dissertations towards more efficient, meaningful and relevant teaching and learning. It may also help redress the balance between the academic and vocational.
The blackboard has a lot to answer for. The moment it arrived, teachers turned their backs on dialogue and conversation with learners and began to lecture, mediating their teaching with text. It can be even worse with text-laden PowerPoint. The teacher’s voice started to get lost. Nowhere was this more evident than in HE, where the blackboard reinforced and deeply embedded the ‘lecture’. To this day, especially in maths and sciences, ‘lecturers’ (a job title that uniquely identifies the profession’s problem) turn their backs to learners and start writing. I, and millions of others, have endured the ‘three huge blackboard’ method of teaching. It is the opposite of teaching; it is writing. You may as well have emailed it to me.
The essay as assessment has now descended into a game where students know that they will not get feedback for days, even weeks (often a grade with a few skimpy comments). So they share, plagiarise, buy essays and, in exams, memorise them, so that critical thinking is abandoned in favour of regurgitation. This is not to argue for the abandonment of writing, or essays, just less dependence on this one-size-fits-all form of assessment.
In schooling we also had the drift towards text-based subjects. Latin is the most surreal example: a dead language, no longer spoken, no longer even written, taught for no other reason than the fact that it got embedded in the curriculum. What a waste. Shakespeare is largely taught off the page, killing it stone dead for many who should have been excited by its searing effect when spoken on the stage. The obsession with ‘maths’, in slavish adherence to PISA (which was never PISA’s intention), is also made easier by its essentially written nature.
We have a system that teaches to the text and the tests of the text. Almost everything we test is in the form of the written word. Oral and social skills count for little in education. Practical skills are shoved below stairs and we send our kids off in lock-step to universities where the process is extended for year after year, often an inefficient and expensive paper trail that results in a huge paper IOU for the student and state.
In my lifetime I have seen HE morph into a global paper farm, with exponential growth of journals and text output, matched only by the inverse growth in readers. Research is falsely equated with paper output, where the paper is the end-in-itself. Teaching is often side-lined as this paper mill becomes the dominant goal.
In the professions, and especially in institutions, I have witnessed bureaucratic systems whose function is often simply to produce ‘reports’. These are invariably overwritten, skimmed, then often binned. Report writing, plans and rhetoric are so often substitutes for action. Nowhere is this more apparent than in the reports that invariably conclude that “more research is needed in…”. Reports beget reports.
In a way I think writing has been saved from the historic, educational obsession with long-form text by the internet, where it has returned to a broader set of forms. Young people have taken to writing like demons, in txts, messages, posts, Tweets and blogs. There has been a renaissance of writing, reflecting a wide set of forms of communication, supplemented by images and video.
So how will voice manifest itself in learning?
Voice and learning
Google is great but we still largely write our requests. This is partly because it is quicker than speech. However, as speech recognition gets better, it will become quicker and easier to simply ask verbally. As Amazon Echo, Siri and other services take many of us into the Internet of Things, in our homes, cars and other places, we will want voice to trigger events, get help, find answers and arrange our lives.
Dialogue with smart people on any topic is often a powerful form of learning. They challenge, probe, contradict. This type of collaborative learning may come into its own with speech and dialogue. There is also the sense in which some topics benefit from the lack of images and writing. It allows the imagination to construct personal imagery and links to what is being heard.
The car is now a room, somewhere you can learn. The home is now networked, a place you can learn. Your mobile is voice ready, a place where you can learn. In some of these environments, having your hands free is essential (driving) or useful (home). How-to tasks, like cooking, repairing things and finding things out, make sense.
Adaptive learning, intelligent tutoring, chatbots… all of these are with us now. This form of technology-enhanced teaching can be further enhanced with voice recognition and feedback. One can see how AI-driven, adaptive tutoring software could turn this first into a homework support tool, then a tutoring tool, through to the delivery of more sophisticated learning. It has the advantage of being able to both push and pull learning. I like this idea of encouraging habitual learning, the delivery of short questions, quizzes and spaced practice, via voice on the Echo, in a personalised sequence. In the privacy of your own home, this takes away the public embarrassment factor.
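As a rough sketch of what a personalised sequence of spaced practice might look like, here is a tiny Leitner-box scheduler. Everything in it is an assumption for illustration: the box intervals, the `Card` class and the function names are hypothetical, and nothing here touches the real Alexa or Echo APIs. The voice layer would simply read out whichever questions are due.

```python
from dataclasses import dataclass

# Days to wait before asking again, per box (illustrative values, not from any product).
INTERVALS = [1, 2, 4, 8, 16]


@dataclass
class Card:
    question: str
    answer: str
    box: int = 0       # index into INTERVALS; higher means better known
    due_day: int = 0   # day number on which the card is next asked


def review(card: Card, correct: bool, today: int) -> None:
    """Promote a card on a correct answer, send it back to box 0 on a miss."""
    card.box = min(card.box + 1, len(INTERVALS) - 1) if correct else 0
    card.due_day = today + INTERVALS[card.box]


def due_cards(cards: list[Card], today: int) -> list[Card]:
    """Cards to ask today, least-known first."""
    return sorted((c for c in cards if c.due_day <= today), key=lambda c: c.box)


if __name__ == "__main__":
    deck = [Card("Capital of France?", "Paris"), Card("7 x 8?", "56")]
    review(deck[0], correct=True, today=0)   # moves to box 1, due on day 2
    review(deck[1], correct=False, today=0)  # stays in box 0, due on day 1
    print([c.question for c in due_cards(deck, today=1)])
```

A right answer pushes a question further into the future; a wrong one brings it back the next day, so the sequence each learner hears drifts towards what they find hardest.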
Voice moves us one step closer to frictionless, anywhere, anytime learning. Places other than institutions and classrooms become learning spaces. The classroom and lecture hall were never the places where the majority of learning took place. Context matters and as learning becomes a utility, like water, we will be able to call upon it at any time and see learning as habitual and informal, not timetabled and formal.
I am not denigrating the written word. It matters. What matters more is a rebalance in education towards knowledge and skills that are not wholly text-based, one that recognises that speech is as important, sometimes more important, and that skills also matter. Imagine a world where the only response to a request or problem is… I’ll write that up. That’s a problem. Education in its current form is not the solution to that problem but part of the problem itself.