Sunday, June 09, 2019

Can AI be creative?

A friend, Mark Harrison, interviewed me for his film on AI and creativity. It made me think - really think. Can AI be creative? Easy to ask, difficult to answer, because it involves complex philosophical, aesthetic and technical issues. Whenever the subject is brought up in conversation, I can feel the visceral reaction among many - that creativity is the last bastion of humanity, what makes us human-all-too-human, and not the domain of machines. Yet...

Problems with definition
The problem with ‘creativity’ is that it is so difficult to pin down. The word ‘creative’ is a bit like Wittgenstein’s word ‘game’. A game can be a sport, a board game, a card game, even just bouncing a ball against a wall. It defies exact definition, as words are defined not by dictionaries but by use. So it is with ‘creativity’: it can describe a creative work of art, creative play in sport, creative decision making, even creative accounting.
This is also a problem with AI, a phrase coined by John McCarthy in 1956. There is no exact mathematical or computing definition of AI; it is many different things. Like ‘Ethics and AI’, ‘Aesthetics and AI’ suffers from difficulties of definition, anthropomorphism and often a failure to discuss the philosophical complexities of the issue. That, of course, does not prevent us from trying.

Is AI intrinsically creative?
Language, poetry, music and image generators have used many AI techniques, often in combination, as well as outside data sources: TALE-SPIN, MINSTREL and BRUTUS for storytelling; JAPE and STANDUP for jokes; ASPERA for poetry; AARON, NEvAr and The Painting Fool for visual works of art. In addition, the use of AI in areas such as research, maths, business and other domains could also be seen as intrinsically creative. When problems in these fields are solved in an innovative manner, are these creative acts? There are literally dozens of systems that claim to have produced creative output.
Some AI systems, often combinations of AI techniques, are claimed to be creative, as they can produce, in a controlled or unpredictable manner, new, innovative and useful outputs. Contemporary AI techniques such as neural networks, machine learning and semantic networks, some claim, generate creative output in themselves. Stephen Thaler claims that neural networks and deep learning have already exhibited true creativity, as they are intrinsically creative.
Recently, GPT-2, a model from the not-for-profit OpenAI, showed how potent generative AI can be, producing articles, essays and text from just general queries and requests. Google’s DeepDream is another open-source resource that uses neural networks to produce strange psychedelic imagery, used in print and moving images, such as music videos. DeepArt produces new images in the style of famous artists.
Others, however, argue that if software has to be programmed by humans it is by definition not creative - that AI can never be creative because it can do nothing other than transform inputs into outputs. The rebuttal is that this argument could equally be applied to humans. So let’s look at some specific creative domains.

Creativity, AI and language
There is linguistic creativity using AI around many forms of language, everything from puns, sarcasm and irony to similes, metaphors, analogies, witticisms and jokes. Sometimes linguistic creativity involves the intensification of existing rules, sometimes the breaking of those rules. Narrative Science and many other companies have been using AI to generate text for sports, financial and other articles. These have been widely syndicated, published, read and evaluated.

Creativity, AI and games
DeepMind’s Atari-playing agent, when it played Breakout, did something quite astonishing. It tunnelled up either side of the wall of bricks so that the ball could attack them from above, a strategy its human observers had not anticipated. In chess and Go we see this a lot: seemingly odd, unorthodox and surprising moves that turn out to be turning points that win the game are masterfully creative. In computer games such as Dota 2, AI agents are beating humans in complex team environments.

Creativity, AI and music
The one area of computational creativity that has received most attention is music. Could AI-composed music win a Grammy? It hasn’t yet, but some argue that one day it could. Classical music, many would say, is a crowning human achievement. It is regarded as high art, its composition creative and complex. Jazz is wonderfully improvisational. Whatever the genre, music has the ability to be transformative and plays a significant role in most of our lives. But can AI compose transformative music?
At a concert in Santa Cruz the audience clapped loudly and politely praised the pieces played. It was a success. No one knew that it had all been composed by AI. Its creator, or at least the author of the composing software, was David Cope, a professor at the University of California, Santa Cruz, and an expert in AI-composed music. He developed software called EMI (Experiments in Musical Intelligence) and has been creating AI-composed music for decades.
Prof Steve Larson, of the University of Oregon, heard about this concert and was sceptical. He challenged Cope to a showdown, a concert where three pianists would play three pieces, composed by:
   1. Bach
   2. EMI (AI)
   3. Larson himself
Bach was a natural choice, as his output is enormous and his style distinctive. Larson was certain of the outcome, and in front of a large audience of lecturers, students and music fans, in the University of Oregon Concert Hall, the three pieces were played. The result was fascinating. The audience believed that:
   1. Bach’s was composed by Larson
   2. EMI’s piece was by Bach
   3. Larson’s piece was composed by EMI.
Interesting result. (You can buy Cope’s album Classical Music Composed by Computer.) 
Iamus, named after the Greek figure who could understand birdsong, created at the University of Malaga, composed a piece called Transits - Into the Abyss, which was performed by the London Symphony Orchestra in 2012 and also released as an album. Unlike Cope’s software, Iamus creates original, modern pieces that are not based on any previous style or composer. Its Melomics website has an enormous catalogue of music and an API that lets you integrate it into your own software. It even offers adaptive music that reacts to your driving habits, or lulls you to sleep in bed by reacting to your body movements.
Further examples of the Turing Test for music have been applied to work by Kulitta at Yale. But is a Turing test really necessary? One could argue that all we’re doing is fooling people - that a machine which merely passes itself off as a human composer is cheating. Cope has been creating music with computers since 1975, when he used punch cards on a mainframe. He really does believe that computers are creative. Others are not so sure and argue that his AI simply mimics the great work of the past and doesn’t produce new work. Then again, most human composers also borrow and steal from the past. The debate continues, as it should. What we need to do is look beneath the surface to see how AI works when it ‘composes’.
The mathematical nature of harmony and music has been known since the Pre-Socratics, and music has strong connections with mathematics in terms of tempo, form, scales, pitch, transformations, inversions and so on. Its structural qualities make it a good candidate for AI production.
Remember - AI is not one thing, it is many things, and most have been used, in some form, to create music. Beyond mimicry, algorithms can be used to make compositional decisions. One of the more interesting phenomena is improvisation through algorithms that can, in a sense, randomise and play with algorithmic structures, such as Markov chains and Monte Carlo tree searches, to create not deterministic outcomes but uniquely generated compositions. Evolutionary algorithms have been used to generate variations that are then honed towards a musical goal. Algorithms can also be combined to produce music; this use of multiple algorithms is not unusual in AI and often plays to the multiple modalities of musical structure, exploiting different strengths to produce aesthetically beautiful music. In a more recent development, machine learning presents data to the algorithmic set, which learns from that data and goes on to refine and produce composed music, bringing an extra layer of compositional sophistication.
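The Markov-chain improvisation described above can be sketched in a few lines of Python. The transition table here is a toy assumption for illustration, not learned from any real corpus; a real system would estimate these transitions from a body of existing music.

```python
import random

# Toy transition table: which notes may follow each note.
# These choices are illustrative assumptions, not derived from real music.
TRANSITIONS = {
    "C": ["D", "E", "G"],
    "D": ["E", "C"],
    "E": ["F", "G", "C"],
    "F": ["G", "E"],
    "G": ["A", "C", "E"],
    "A": ["G", "F"],
}

def compose(start="C", length=16, seed=None):
    """Walk the transition table to generate a non-deterministic melody."""
    rng = random.Random(seed)
    melody = [start]
    for _ in range(length - 1):
        # Each step picks randomly among the notes allowed after the last one.
        melody.append(rng.choice(TRANSITIONS[melody[-1]]))
    return melody

print(compose(seed=42))
```

Because the walk is random, each run yields a different melody; fixing the seed makes a run reproducible, which is one way to tame the unpredictability while keeping the generative character.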
We, and all composers, are organisms built from a bundle of organic algorithms over millions of years. These algorithms are not tied to the material from which the composer is made. Whether the composer is man or machine, music is music. There is no fatal objection to the idea that non-organic algorithms could replicate, even surpass, what organic composers can do.

Creativity, AI and aesthetics
The AI v human composition of music also opens up several interesting debates within aesthetics. What is art? Does ‘art’ reside in the work of art itself or in the act of appreciation or interrogation by the spectator? Does art need intention by a human artist or can other forms of ‘intelligence’ create art? Does AI challenge the institutional theory of art, as new forms of intelligent creation and judgement are in play? Does beauty itself contain algorithmic acts within our brains that are determined by our evolutionary past? AI opens up new vistas in the philosophy of art that challenge (possibly refute, possibly support) existing theories of aesthetics. This may indeed be a turning point in art. If art can be anything, can it be the product of AI? 
This area is rich in innovation and challenges us to think about what art is and could be. Is the defence of the ‘artist’ or ‘composer’ just a human conceit, built on the libertarian idea of human freedom and the sanctity of the individual, that makes us recoil from the idea of AI-generated music and art? The advent of computers, used by musicians to compose and in live performance, has produced amazing music, some created live, even through ‘live coding’. As in other areas where AI is delivering real solutions, music is being created that is music and is liked. These are early days, but it may be that musical composition, with its strong grounding in mathematical structures, is one of those things that AI will eventually do as well as, if not better than, we mere mortals.
Let’s focus on the question, ‘What is art?’ Is it defined in terms of the:

  1. aesthetic effect of the object itself 
  2. intention of the artist
  3. institutional affirmation
If it is 1, AI-created art could be eminently possible, as many AI-created works have already been judged to be art.
If it is 2 and you need intention, then we will have to abandon hope for AI, or wait until AI has intention. ‘The Intentional Fallacy’, written in 1946, attempted to strip away the intention of the artist. This argument gained momentum in the 1960s with Barthes’ ‘death of the author’, Foucault’s abandonment of the ‘author function’ and Derrida’s ‘nothing but the text’.
If it is 3, then the arts community may at some time agree that something created by AI is art. Put it another way: if you define ‘creative’ as something that is, in its essence, ‘human’, then by definition you have to be human to be creative, and AI can, logically, never be creative. If, however, we accept that AI is not about copying what it is to be human, we leave room for it being called creative. We see this in technology all the time. We didn’t copy the flapping of bird wings, we invented the airplane. We didn’t go faster by copying the legs of a cheetah but by inventing the wheel.
So how do you decide what is art when generating AI output? It is easy to mimic or appropriate existing artworks to create what many regard as pastiches. One fundamental problem here is anthropomorphism. When we say ‘Can AI be creative?’ we may already have anthropomorphized the issue. Our benchmark may already be human capabilities and output, so that creative acts may be limited to human endeavour. 
We may, literally, in our language, be unable to envisage creative acts and works beyond our human-all-too-human abilities. What would such a thing be? How would we make that judgement? Some have proposed formal criteria, such as novelty and quality, to judge creative outputs. The danger with many systems is that the AI has produced lots of works but a human has ‘curated’ them, so that only the best are selected for scrutiny. Another solution is to posit an equivalent of Turing’s Test. The problem is that this assumes creativity is a judgement on the work itself, without requiring intention.
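As a toy illustration of how a formal novelty criterion might work, novelty can be operationalised as distance from a reference corpus. The word-overlap (Jaccard) measure and the tiny corpus below are my own illustrative assumptions, far cruder than anything a serious evaluation system would use:

```python
def jaccard(a, b):
    """Word-overlap similarity between two texts, from 0.0 to 1.0."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def novelty(candidate, corpus):
    """Novelty = 1 minus similarity to the nearest corpus item."""
    return 1.0 - max(jaccard(candidate, item) for item in corpus)

# An illustrative, hypothetical reference corpus.
corpus = ["the sun sets over the sea", "a bird sings at dawn"]

print(novelty("the sun sets over the sea", corpus))  # 0.0 - pure mimicry
print(novelty("machines dream in silicon", corpus))  # 1.0 - no overlap at all
```

Even this crude sketch exposes the limitation noted above: it scores the work itself, saying nothing about quality or intention, and a human still chooses the corpus and the threshold.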

Beyond the human
We seem to want creative technology to be more human but this may be a red herring. It may well be that creativity is that layer that lies just beyond the edge of our normal capabilities – that’s certainly how creative acts are sometimes described, as pushing boundaries. So why not consider acts that come from another source, such as DeepMind, OpenAI or Watson? If AI transcends what it is to be human then we may have to accept that acts of creativity may do the same. Our expectations may have to change. In art we saw this with Duchamp’s urinal (Fountain). Could it be that a Duchamp-like event could take us into the next phase of art history, where it is precisely because it was NOT created by a human that it is considered art – art as a transgressive and transcendental act?

This is a lively field of human inquiry that has a long way to go. It is easy to jump to conclusions and underestimate the complexity of the issues, which need careful unpacking. We need to be clear in the language we use, the claims we make and the evaluative judgements we make, as it is too easy to come to premature conclusions. Moffat and Kelly (2006) produced evidence that people are biased against machines when making judgements about creativity. Others are too quick to claim that outputs constitute art.
There are several possible futures here, where: 
   AI plays no role in creative output
   AI enhances human creative output
   AI produces creative output that is valued, appreciated and bought and accepted as creative by humans 
   AI creations transcend those of humans and art becomes the domain of AI

Time and technology will tell…

Friday, June 07, 2019

Podcasts - 20 reasons why we should be using more podcasts in learning

Even the Obamas are in on the podcast act, signing a deal with Spotify. Hardly surprising, as over 50% of Americans have now listened to a podcast - very much a medium for people of working age, with listening figures dropping off in the 55+ age group. Reuters’ Journalism, Media and Technology Trends report highlights audio as a significant growth area, and Spotify are investing $500 million in the medium.

I’m a podcast fan myself, whether it is Talking Politics, where some of the best minds in political science discuss a contemporary political topic, or In Our Time, where history, philosophy, art and science are brought alive by a trio of academics. If I want an in-depth learning experience, this is often my medium of choice. For real depth I prefer text - books, papers, articles and blogs. For practical learning, video. For really practical learning - doing stuff and experience. But podcasts lie in that niche between long-form text and short-form video and have their own special allure, as well as being so very convenient. So why use podcasts in learning? What type of content is suitable? How does one make one?

Podcasts tend to be long form and content rich. They are, in a sense, the opposite of microlearning and the tendency to reduce things to small pieces. They also have different cognitive affordances from video, text or graphics. Video is great for ‘showing’ things such as drama, objects, places, processes and procedures, with more of an emphasis on attitudinal or practical learning. Text may be better for semantic and symbolic knowledge, where the art of the wordsmith comes into play, and for subjects like maths. Graphics, of course, can visualise data and show schemas; diagrams can illustrate what you want to teach, and photographs give a sense of realism. Podcasts, however, tend to deal with more conceptual knowledge, where ideas and discussion matter. They seem better at allowing experts or leaders to explain more complex thoughts and issues, where genuine discussion or stories can reveal the learning, with deeper levels of reflection and different perspectives. Relying on spoken language alone often gives them a depth that other media don’t carry.

Many like to listen to podcasts when walking, running, in the gym, in the car or commuting. The sheer convenience of time-shifting the experience, of using this dead time in what Marc Augé called ‘non-places’, even if only to hold off boredom, is what makes podcasts a form of productive, mobile learning. I personally like to listen while sitting down, with headphones, as I’m a note taker, but many listen while doing other things. This convenience factor is a big plus.

Oral communication is more natural and feels more authentic than written text, as it carries many of the human flavours of the speaker, such as tone, intonation, accent, emotion and emphasis. We are grammatical geniuses by the age of three, able to listen to and understand complex language without formal learning. This makes such content easy to access, especially for those with lower levels of reading literacy. It is, in this sense, a very natural form of learning. This more frictionless form of communication allows us to take deeper dives, through attentive listening (as opposed to hearing), making podcasts potent learning experiences.

Listening to a podcast, especially with headphones, can be an intense, intimate and private affair. Many podcast fans report that sense of eavesdropping on an intimate conversation; you feel as though you get to know the people over time. There is a sense of focus and attention that the learner feels, as if one were part of the conversation, sitting right there next to the participant(s). So in this process of eavesdropping, how many participants should one have in a podcast?

It is hard to hold full attention for long with a single podcaster. Imagine sitting on a plane, asking someone a question, and they come back with a 40-minute reply. Although, as a fan of the comedian Bill Burr’s podcast, I know it can be done. The difference is that Bill has decades of experience as a stand-up comic and can hold an audience’s attention.

A more popular format is the interview. Joe Rogan, with his massive audiences, is a good example - there are many others. He interviews an individual, drawing out their stories and anecdotes. The questions in an interview format act as breaks, chunking the content into meaningful pieces and making it easier to learn. It sometimes feels as if you, the learner, are asking the questions, and in that sense it feels like a personal dialogue.

Some of my favourite podcasts, the BBC’s In Our Time and Talking Politics, often have three or four participants. Interestingly, both have an anchor, Melvyn Bragg and David Runciman respectively, who holds the discussion together and gives it shape and direction. The advantage of this format is that it provides different angles on the same subject, sometimes different areas of expertise, even disagreements.

Some of the most popular podcasts have been series, which build an audience over time. These segment the content and often use cliffhangers to make you want to listen to the next episode. In learning, of course, this has the advantage of splitting material over time, introducing spaced practice: taking just a minute or so to recap the previous episode at the start, and to summarise at the end - to top and tail - improves retention.

Media rich is not mind rich
Mayer and others have, over decades, shown us through pinpoint research with good controls that rich media, used unwisely, can inhibit learning - in learning, ‘less is often more’. This has much to do with the limitations of working memory, but also with using up cognitive channels. Yet online learning seems to ignore that simple, popular, single-channel medium - the podcast. Podcasts have the advantage of low cognitive bandwidth and low costs, along with several other advantages in learning.

Audio has the advantage of taking up only one channel, the auditory channel, leaving the mind free to generate, through the imagination, your own interpretation, allowing the brain to integrate new knowledge with your prior existing knowledge. As working memory has a limit of 4 or so registers, which we can hold for around 20 seconds, keeping some free from imagery can, for some types of knowledge, be a powerful advantage, especially for conceptual content, as it gives your working memory some time to interpret, even manipulate ideas.

Take notes
Podcasts have one great advantage over video or text/graphics. For active learners, the simple fact that you don’t have to look at a screen allows you to take notes. Research shows that note taking can increase retention by 20-30%. In learning podcasts it is worth recommending note taking: listeners are hands- and eyes-free, allowing more sophisticated notes in their own words.

Speed control
Many listen to podcasts at 1.5 times normal speed, as they can still understand what is being said. We read faster than we listen, and many find that they can still get the full meaning at speeds beyond that of spoken delivery. This variability of speed allows different learners to listen at different rates, giving the learning experience an almost personalised quality.
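The arithmetic behind speed control is simple: listening time scales inversely with playback rate. A quick sketch, with illustrative durations:

```python
def listening_minutes(duration_min, rate):
    """Minutes needed to hear a podcast of duration_min at playback speed rate."""
    return duration_min / rate

# A 60-minute episode at 1.5x speed takes 40 minutes - a third of the time saved.
print(listening_minutes(60, 1.5))
```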

Content control
Another form of control is stop, back, forward and control over a visible timeline. Most find themselves doing a lot of this when using podcasts to learn - when they don’t understand something, want to reflect more, skip extraneous material, or take more detailed notes. This, again, allows the learner to process content at a much deeper level for retention.

Audio quality
Nass and Reeves, in their book The Media Equation, showed that although one can get away with low-fidelity images in video, this doesn’t work for audio. Poor quality audio has a significantly detrimental effect on learning, lowering retention. We have evolved visual systems that can adapt to twilight and distance; our auditory systems are less forgiving and expect high-fidelity audio, as if delivered by a person speaking in front of you. Distance, volume, tinny sound and mechanical delivery all diminish attention and learning. Experienced podcast producers recommend a studio, or as quiet an environment as possible, with a good microphone to get the best results. Some avoid table-top mics and prefer lapel or head-mounted ones.

Reading content from a script can be a killer as delivery really does matter. Listeners want energy, passion and expert or academic gravitas. Humour often helps to punctuate, dwell, then move on. It is that sense of listening to an ‘expert’, also shown by Nass and Reeves to increase retention, that is so important in learning. Above all, podcasts seem to give authenticity to the ‘voice’ of the speaker. It must be and sound natural. Over-produced podcasts can often be counter-productive.

A good podcast also needs good preparation. Make sure your technical set-up is solid, then prep the participants. A structured script is useful, even if it is just a series of agreed questions, along with advice to keep answers short. Test the levels, and make sure the atmosphere is relaxed to encourage good discussion.

There’s an argument for having music as a lead-in, even a lead-out at the end, as it helps brand podcasts, especially across a series of episodes - but avoid laying a music track behind the speakers; it just kills attention and retention.

While recording
‘Go again… this time shorter’ is often good advice, editing out the longer version. Try to avoid recording over several sessions, as it is difficult to get the same levels and sound the second and third time around. And if you think you can simply drop a word into a sentence that was mispronounced or wrong, think again - this is notoriously difficult. For ‘learning’ podcasts, there is something to be said for more structure in the content and clear edit points for different learning objectives. There are also strong arguments for more recaps, summaries and repetition to increase retention.

AI generated podcasts
One can already generate speech from text with relative ease. This is passable but still a little ‘mechanical’. However, we are reaching a position where it will feel very natural, so automatic podcasts from text scripts will be quick and cheap to produce and one can change and update them by simply changing the text, without going back into a recording studio. We already do this in WildFire for short introductions to pieces of learning.

Google is introducing real time transcription. This is a boon for note taking, as you can annotate, add your own words, summarise, mind-map, whatever. This is often difficult when you have to ‘watch’ a lecturer, PowerPoint or video. With WildFire we have also grabbed podcast transcripts, used AI to generate active online content to supplement the listening experience and solidify knowledge.

Before commissioning or producing podcasts, listen to a few. They’re everywhere on the web. But listen to those that are most popular. You will find all sorts of subjects, by all sorts of people and variations on formats. For learning, listen to some of the more serious podcasts, although there’s nothing wrong with lightening things up. I know of several companies who do ‘learning’ podcasts and have been on the end of quite a few. Given that it is a massively popular medium, cheap to produce, with significant advantages in terms of learning, why not give them a try?
Edison Research (2019)
Llinares, D. (2018) Podcasting: New Aural Cultures and Digital Media
Nass, C. and Reeves, B. (1996) The Media Equation
Newman, N. (2019) Journalism, Media and Technology Trends and Predictions. Reuters Institute
For some interesting, and detailed research on podcasts, try Steve Rayson's blog. He has a strong learning background and is doing detailed research on who uses podcasts and why. Some of the ideas in this piece have come from his blog.