Thursday, May 30, 2019

10 ways AI is used in video for learning – from deepfakes to personalisation

Video is the medium of the age and AI is the technology of the age. Combine the two and you have a potent mixture. I’ve been involved with both, working in a video production company and using video on all sorts of media, from interactive videotape machines, laserdiscs, compact discs and CDi to streaming, even making a feature film called The Killer Tongue (you really don’t want to know). Believe me, that last one lost me a ton of money. I now run an AI for learning company, WildFire, and am writing a book, AI for Learning. I know these two worlds well. But how do they interact?

1. Edit
There are tools that allow you to edit video much faster and to higher quality. Different cameras shoot different colour balances – that can be fixed with AI. Same actor in different scenes with different skin tones? That can be fixed with facial recognition and skin tone matching, using AI. Need your music mixed down behind dialogue? Use AI. AI is increasingly used to fix, augment and enhance moving images.

2. Fake
Of course, easy editing with AI also means easy fakes. AI-generated avatars have appeared as TV presenters, reading the news using text-to-speech software. You can have Obama saying whatever you like, with a voiceover artist mimicking his voice. Similarly, with a fake teacher, one can deliver talking-head content. Even more worrying is fake porn. Many famous actresses and actors have had their faces transposed to create ‘deepfake’ porn scenes. This mimics what is possible with fake homework, essays and text output using OpenAI’s GPT-2 software, deemed so dangerous that they initially decided not to release the full model. Just feed it a question and it produces an essay.

3. Create
Beyond fakery lies the world of complete video creation. Alibaba’s Aliwood software uses AI to create 20-second product videos from a company’s existing stills, identifying and selecting key images, close-ups and so on. The selected images are then edited together with AI, with cuts that even change with musical shifts. Alibaba claims this increases online purchases by 2.6%. Some video creation software goes further and adds a text-to-speech narration, with edits at appropriate points. Many pop videos and films have been made with AI tools such as Deep Dream for image creation, along with style capture and flow tools. There are even complete films made from AI-created scripts. We already see services that create learning videos quickly and cheaply using the same methods.

4. Caption
Once you have created a video, AI can also add captions. This type of software can even pick up on dog barks and other sounds, and is now standard on TV, YouTube, Facebook, even Android phones, increasing accessibility. It is also useful in noisy environments. Language learners commonly report captioning as having benefits in self-directed language learning. One must be careful here, though, as Mayer’s research shows that narration and on-screen text together can have an inhibitory effect on learning.
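
To make the captioning step concrete, here is a minimal sketch of the last stage of the pipeline. It assumes an upstream speech-to-text engine has already produced per-word timings (the hard AI part); the sketch only groups those timed words into numbered cues in the standard SRT subtitle format. The word timings below are invented for illustration.

```python
# Sketch: turn ASR word timings into SRT caption cues.
# Assumes (word, start_sec, end_sec) tuples from a speech-to-text engine.

def to_srt_time(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

def words_to_srt(words, max_words=7):
    """Group timed words into numbered SRT cues of at most max_words each."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start, end = chunk[0][1], chunk[-1][2]
        text = " ".join(w for w, _, _ in chunk)
        cues.append(f"{len(cues) + 1}\n{to_srt_time(start)} --> {to_srt_time(end)}\n{text}")
    return "\n\n".join(cues)

timed = [("Welcome", 0.0, 0.4), ("to", 0.4, 0.5), ("the", 0.5, 0.6),
         ("course", 0.6, 1.1)]
print(words_to_srt(timed))
```

The real intelligence sits in the speech recognition model; once you have timings, caption generation is mechanical, which is why it is now a free, standard feature on the platforms mentioned above.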

5. Transcribe
Speech-to-text is also useful in transcription, where a learner may want the actual transcript of a video as notes. Some tools, such as WildFire, take these transcripts and use them to create online learning that supplements the video with effortful learning. The learner watches the video, which is good for attitudinal and affective learning, even processes and procedures, but poor on semantic knowledge and detail. Adding an online treatment of the transcript, created and assessed by AI, can provide that extra dimension to the learning experience.
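
As a crude illustration of turning a transcript into effortful learning, here is a sketch that blanks out a keyword per sentence to create open-input retrieval items. This is not WildFire’s actual method (which is not public); the keyword-selection heuristic, stopword list and example text are all simplifications of my own.

```python
# Sketch: generate open-input (cloze) items from a video transcript.
# A crude stand-in for an AI content engine: blank out a longer content
# word per sentence and ask the learner to type it back in.

import re

STOPWORDS = {"the", "and", "that", "with", "from", "this", "have", "which"}

def make_cloze_items(transcript, min_len=6):
    """Return (question, answer) pairs, one blanked keyword per sentence."""
    items = []
    for sentence in re.split(r"(?<=[.!?])\s+", transcript.strip()):
        candidates = [w for w in re.findall(r"[A-Za-z]+", sentence)
                      if len(w) >= min_len and w.lower() not in STOPWORDS]
        if not candidates:
            continue
        answer = max(candidates, key=len)   # pick the longest keyword
        question = re.sub(rf"\b{answer}\b", "_" * len(answer), sentence, count=1)
        items.append((question, answer))
    return items

text = "Working memory is limited in capacity. Retrieval practice consolidates learning."
for q, a in make_cloze_items(text):
    print(q, "->", a)
```

A real engine would use NLP to pick pedagogically meaningful concepts rather than just long words, but the shape of the output – transcript in, typed-recall questions out – is the same.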

6. Translate
Once you have the transcribed text, translation is also possible. This has improved enormously, with reduced latency, from Google Translate to more sophisticated services. Google’s Translatotron promises speech-to-speech translation with an end-to-end model that delivers accurate results with low latency. Advances like these will allow any video to be translated into multiple languages, allowing low-cost and quick global distribution of learning videos.

7. Filter
Ever wondered why YouTube and other video services can prevent porn and other undesirable material from appearing? AI filters use image recognition to search and delete. Facebook claims that AI now identifies 96.8% of prohibited content. Not that AI does the whole job here: removing dick pics and beheadings relies on algorithms and image recognition, but there’s also community flagging and real people sitting watching this stuff. AI is increasingly used to protect us from undesirable content.

8. Find
Want to know something or do something? Searching YouTube is increasingly the first option chosen by learners. YouTube is probably the most used learning platform on the planet, yet we tend to forget that it is only functional with good search. AI search techniques are what give YouTube its power. Note that YouTube search is different from Google search. Google uses authority, relevancy, site structure and organisation; whereas YouTube, being in control of all its content, uses growth in viewing, patterns in viewing, view time, peak view times, and social signals such as shares, comments, likes and repeat views. Search is what makes YouTube such a convenient learning tool.
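
YouTube’s actual ranking model is proprietary, but the signals listed above can be sketched as a simple weighted score. Everything here – the weights, the signal names and the candidate videos – is invented for illustration; the only point is that watch time, not raw popularity, dominates.

```python
# Sketch: a toy ranking score over the engagement signals mentioned above.
# Weights and signal names are illustrative, not YouTube's real model.

def rank_videos(videos):
    """Sort candidate videos by a weighted engagement score."""
    def score(v):
        return (0.5 * v["avg_view_time"]        # minutes actually watched
                + 0.2 * v["view_growth"]        # recent growth in views
                + 0.2 * v["repeat_views"]
                + 0.1 * (v["likes"] + v["shares"] + v["comments"]))
    return sorted(videos, key=score, reverse=True)

candidates = [
    {"title": "Intro to Python", "avg_view_time": 8.0, "view_growth": 2.0,
     "repeat_views": 1.0, "likes": 10, "shares": 2, "comments": 3},
    {"title": "Clickbait tutorial", "avg_view_time": 1.5, "view_growth": 9.0,
     "repeat_views": 0.2, "likes": 20, "shares": 3, "comments": 5},
]
for v in rank_videos(candidates):
    print(v["title"])
```

With these weights the heavily watched tutorial outranks the briefly viewed but more liked one, which is the behavioural difference from Google’s authority-based ranking.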

9. Personalise
Video services such as YouTube, Vimeo and Netflix use AI to algorithmically present content. AI is the new UI, and most video content is served up in this personalised fashion. Netflix famously handed out a $1 million prize for a better recommendation algorithm and has since refined its approach. This is exactly what is happening in adaptive learning systems, where individual and aggregated data is used to personalise and optimise the learning experience for each individual, so that everyone is educated uniquely.
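
The family of technique behind the Netflix Prize was collaborative filtering: recommend what similar users rated highly. Here is a minimal user-based sketch, recast with made-up learners and courses to show the adaptive-learning parallel; Netflix’s production system is far more sophisticated.

```python
# Sketch: user-based collaborative filtering, the family of technique
# behind the Netflix Prize. Ratings, users and courses are invented.

from math import sqrt

ratings = {
    "ana":  {"Algebra 101": 5, "Intro Stats": 4, "Art History": 1},
    "ben":  {"Algebra 101": 4, "Intro Stats": 5, "Linear Algebra": 4},
    "cara": {"Art History": 5, "Algebra 101": 1},
}

def similarity(a, b):
    """Cosine similarity over the courses both users rated."""
    shared = set(ratings[a]) & set(ratings[b])
    if not shared:
        return 0.0
    dot = sum(ratings[a][c] * ratings[b][c] for c in shared)
    na = sqrt(sum(ratings[a][c] ** 2 for c in shared))
    nb = sqrt(sum(ratings[b][c] ** 2 for c in shared))
    return dot / (na * nb)

def recommend(user):
    """Score unseen courses by similarity-weighted ratings of other users."""
    scores = {}
    for other in ratings:
        if other == user:
            continue
        sim = similarity(user, other)
        for item, r in ratings[other].items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * r
    return max(scores, key=scores.get) if scores else None

print(recommend("ana"))
```

Ana rates like Ben, not Cara, so she is steered towards Ben’s Linear Algebra course – the same aggregated-data logic that adaptive learning platforms apply to sequence content.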

10. Analyse
Talking of Netflix, there is now a huge amount of data collected on global services that can inform future decision making. This can be data about when people cut out of a show, literally revealing favourite characters and sub-plots, which can be used to inform future script writing. Data on stars and genres can also be used to guide scripting and spend on original content. Similarly in learning, analytics around usage, drop-out points and so on can inform decisions about the efficacy of the learning.
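
The drop-out analysis described above can be sketched in a few lines: bucket per-session exit times into a retention curve, then find the window where the audience loss is largest. The exit times below are invented; a real service would aggregate millions of sessions.

```python
# Sketch: find where viewers drop out of a video, from per-session
# exit times in seconds. The numbers are invented for illustration.

def retention_curve(exit_times, duration, bucket=10):
    """Fraction of sessions still watching at the start of each bucket."""
    n = len(exit_times)
    return [sum(1 for t in exit_times if t >= start) / n
            for start in range(0, duration, bucket)]

def biggest_dropoff(curve, bucket=10):
    """Start of the window in which the largest share of viewers quit."""
    losses = [curve[i] - curve[i + 1] for i in range(len(curve) - 1)]
    return losses.index(max(losses)) * bucket

exits = [65, 70, 12, 68, 61, 90, 64, 88, 63, 9]
curve = retention_curve(exits, duration=90)
print(biggest_dropoff(curve))  # → 60: most viewers quit in the 60-70s window
```

For a show, that window points at a scene or character; for a learning video, it is a signal that the segment is too long, too fast or mis-targeted.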

All of the above are affecting, and will continue to affect, the delivery of video in learning. Several are already de facto techniques. We can expect them all to develop in line with advances in AI, as well as learner demand. This is clearly an area where the learning world has lots to learn and lots to gain from consumer services. Most of the above techniques are being built, honed and delivered on consumer platforms first, then being used in a learning context.


Sunday, May 19, 2019

How to turn video into deep learning

With video in learning one can feel as though one is learning, as the medium holds your attention, but as you are hurtled forward, that knowledge disappears off the back. It’s like a shooting star: it looks and feels great, but the reality is that it burns up as it enters the atmosphere and rarely ever lands.
Video and learning
We have evolved to learn our first language, walk, recognise faces and so on. This primary knowledge was not learnt in the sense of being schooled or deliberately studied; it is embodied in our evolutionary past and evolved brains. Note that some of this learning is patently wrong. Our intuitive view of inertia, forces, astronomy, biology and many other things is simply wrong, which is why we, as a species, developed science, maths, literature and… education. This secondary knowledge is not easily acquired – it has to be deliberately learned and takes effort. It includes maths, medicine, the sciences and most forms of intellectual and even practical endeavour. That brings us to the issue of how we learn this stuff.
Working and long-term memory
Let’s start with the basics. What you are conscious of is what’s in working memory, limited in capacity to 2-4 elements of information at any time. We can literally only hold these conscious thoughts in memory for 20 or so seconds. So our minds move through a learning experience with limited capacity and duration. This is true of all experience, and with video it has some interesting consequences.
We also have a long-term memory, which has no known limits in capacity or duration, although lifespan is its obvious limit. We can transfer thoughts from long-term memory back into working memory quickly and effortlessly. This is why ‘knowing’ matters. In maths, it is useful to automatically know your times tables, to allow working memory to manipulate recalled results more efficiently. We also use existing knowledge to cope with and integrate novel information. The more you know, the easier it is to learn new information. Old, stored, processed information effectively extends working memory through effortless recall from long-term memory.
All of this raises the question of how we can get video-based learning into long-term memory.
Episodic and semantic memory
There is also the distinction, in long-term memory, between episodic and semantic memory. Episodic memories are those experiences such as what you did last night, what you ate for dinner, recalling your experience at a concert. They are, in a sense, like recalling short video sequences (albeit reconstructed). Semantic memory is the recall of facts, numbers, rules and language. They are different types of memory processed in different ways and places by the brain.
When dealing with video in learning, it is important to know what you are targeting. Video appeals far more to episodic than semantic memory – the recall of scenes, events, procedures, places and people doing things.
Element interactivity
When learning meaningful information that has to be processed, for example in multiplication, you have only 2-4 registers for the numbers being multiplied. The elements have to be manipulated within working memory, and that adds extra load. Element interactivity is always extra load. Learning simple additions or subtractions has low element interactivity, but multiplication is more difficult. Learning vocabulary has low element interactivity; learning how to put the words together into meaningful sentences is more difficult.
In video, element interactivity is very difficult, as the brain is coping with newly presented material and the pace is not under your control. This makes video a difficult medium for learning semantic information, as well as for consolidating learning through cognitive effort and deeper processing.
Video not sufficient
Quite simply, we engage in teaching, whether offline or online, to get things into long-term memory via working memory. You must take this learning theory into account when designing video content. When using video, we tend to forget about working memory as a limitation, and about the absence of opportunity to move working memory experiences into long-term memory. We also tend to shove in material that is more suited to other media – semantic content such as facts, figures and conceptual manipulations. So video is often too long, shows points too quickly and is packed with inappropriate content.
We can recognise that video has some great learning affordances in that it can capture experiences that one may not be able to experience easily for real – human interactions, processes, procedures, places and so on. Video can also enhance learning experiences, reveal the internal thoughts of people with voiceover and use techniques that compress, focus in and highlight points that need to be learnt. When done well, it can also have an emotional or affective impact, making it good for attitudinal change. The good news is that video has had a century or so to develop a rich grammar of techniques designed to telescope, highlight and get points across. The range of techniques from talking heads to drama, with sophisticated editing techniques and the ability to play with time, people and place, makes it a potent and engaging medium.
The mistake is to see video as a complete learning medium in itself. Video is a great learning medium if its content is paced and reinforced, but it is made greater if the learner has the opportunity to supplement the video experience with some effortful learning.
Illusion of learning
However, the danger is that, on its own, video can encourage the illusion of learning. This phenomenon was uncovered by Bjork and others, showing that learners are easily fooled into thinking that learning experiences have stuck, when they have actually decayed from memory, often within the first 20 minutes. 
Video plus…
How do we make sure that the video learning experience is not lost and forgotten? The evidence is clear: the learner needs some effortful learning – they need to supplement their video learning experience with deeper learning that moves that experience from working to long-term memory.
The first is repeated access to the video, so that second and third bites of the cherry are possible. Everything in the psychology of learning tells us that repeated access to content helps us understand, process and embed learning for retention and later recall. But while repeated watching helps consolidate the learning, it is not enough, and it is an inefficient, long-winded learning strategy.
The second is to take notes. This increases retention significantly, by up to 20-30% if done well, as deeper processing comes into play as you write, generate your own words, draw diagrams and so on.
The third, and far more effective, is to engage in a form of deeper, effortful learning that involves retrieval and recall. We have built a tool, WildFire, that does exactly this.
How do you ensure that your learning is not lost and forgotten? Strangely enough, it is by engaging in a learning experience that makes you recall what you think you’ve learnt. We grab the transcript of the video and put it into an AI engine that creates a supplementary learning experience, where you have to type in what you ‘think’ you know. This covers simple concepts and numbers, but also open-input sentences, where the AI semantically interprets your answers. This powerful form of retrieval learning not only gives you reinforcement through a second bite of the cherry but also consolidates the learning. Research has shown that recalling back into memory – literally looking away and thinking about what you know – is even more powerful than the original teaching experience or exposure. In addition, the AI creates links out to supplementary material (curation, if you wish) to further consolidate memory through deeper thought and processing.
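
To give a feel for what semantic interpretation of a typed answer involves, here is a deliberately crude sketch. WildFire’s real engine does proper semantic analysis; this stand-in just measures word overlap (Jaccard similarity) between the learner’s answer and the expected one, with a threshold I have picked arbitrarily.

```python
# Sketch: accept a learner's free-text answer if it is "close enough"
# to the expected answer. A crude token-overlap (Jaccard) stand-in for
# genuine semantic interpretation.

import re

def tokens(text):
    """Lower-cased word set for a piece of text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def accept_answer(expected, given, threshold=0.5):
    """Accept if the token overlap between answers meets the threshold."""
    e, g = tokens(expected), tokens(given)
    if not e or not g:
        return False
    jaccard = len(e & g) / len(e | g)
    return jaccard >= threshold

print(accept_answer("working memory is limited in capacity",
                    "memory capacity is limited"))   # True
print(accept_answer("working memory is limited in capacity",
                    "videos are engaging"))          # False
```

A production system would use embeddings or a language model so that paraphrases with no shared words still score well, but the principle – grade meaning, not exact strings – is the same.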


Thursday, May 02, 2019

‘Machines Like Me’ by Ian McEwan – a flawed machinage a trois

Ian McEwan’s 'Machines Like Me' is a machinage a trois between Charlie, Miranda and Adam. Now Ian can pen a sentence and, at times, writes beautifully, but this is a rather mechanical, predictable and, at times, flawed effort.
Robot Fallacy
The plot is remarkably similar to the 2015 threesome-with-a-robot movie Uncanny (which also has an Adam), and the film is somewhat better than this novel. But the real problem is the Robot Fallacy – the idea that AI is all about robots. It’s not. AI, even robotics, is not about creating interesting characters for second-rate novels and films, and is not on a quest to create anthropoid human robots as some sort of undefined companions. Art likes to think it is, as art needs characterisation and physical entities. AI is mostly bits, not atoms: largely invisible, quite difficult to reveal and mostly online, but that’s difficult for authors and film makers. That’s why the film Her was also superior to this novel – it doesn’t fall into the idea that it’s all about physical robots. McEwan’s robot and plot limit any real depth of analysis, as the book is stuck in the Mary Shelley Frankenstein myth, with Turing as the gratuitous Frankenstein. In fact, it is a simple retelling of that tale, yet another in a long line of dystopian views of technology. McEwan compounds the Robot Fallacy by making Adam appear, almost perfectly formed, from nowhere. In reality, AI is a long haul with tons of incremental trials and failures. Adam appears as if created by God. Then there’s the confusion of complexity with autonomy; Steven Pinker and others have pointed out the muddle-headed nature of this line of thought in Enlightenment Now, and it is easy to avoid autonomy in the engineering of such systems. The novel tries to introduce some pathos at the end, but ultimately it’s an old tale not very well told.
Oddities and flaws
Putting that aside, there are some real oddities, even clangers, in McEwan’s text. The robot often washes the dishes by hand, as if we have invented a realistic human companion but not a dishwasher. In fact, dishwashers exist in this world, as one pops up, oddly as an analogy, later in the book. The robot can’t drive, and self-driving cars are said to have appeared but failed because of a traffic jam – yet self-driving cars make an appearance later in the book.
Counterfactuals are tricky to handle, as they make suspension of disbelief that much harder, and in this case the entire edifice of losing the Falklands war and muddling up political events seems like artifice without any real justification. One counterfactual completely threw me. It’s one thing to counterfactually ‘extend’ Turing’s life, another to recalibrate someone’s birth date, taking it back a couple of decades, as in the appearance of Demis Hassabis (of DeepMind fame). Hassabis pops up as Turing’s brilliant young colleague in 1968 – odd, as he wasn’t born until 1976 (as stated on the final page)!
Then there’s an even odder insertion into the novel – Brexit. McEwan is a famous Remain campaigner and, for no reason other than pettifoggery, he drags the topic into the narrative. I have no idea why. It has no causality within the plot and no relevance to the story. It just comes across as an inconsequential and personal gripe.
The yarn has one other fatal flaw – the odd way the child is introduced into the story, via a manufactured incident in the park, a continuing thread that is about as believable as a chocolate robot. I’m not the first to spot the straight-up snobbery in his handling of this plot line – working-class people as hapless thugs.
To be fair there are some interesting ideas, such as the couple choosing personality settings for their robot in a weird form of parenting and this blurring of boundaries is the book’s strength. The robot shines through as being by far the most interesting character in the book, curiously philosophical, and there’s some exploration of loyalty, justice and self.
Did I learn anything about AI from this novel? Unfortunately not. In the end it’s a rather mechanical and, at times, petty work. It was difficult to maintain suspension of disbelief, as so many points were unbelievable. McEwan seems to have lost his imaginative flair, along with his ability to surprise and transgress. His fictional progeny are more ciphers than people. In truth, AI is only software, and all of this angst about robots murdering us in our sleep is hyperbolic and doesn’t really tackle the main issues around automation, and perhaps the good that can come out of such technology.
