Monday, March 04, 2024

The Mind is Flat!

Nick Chater’s ‘The Mind is Flat: The Remarkable Shallowness of the Improvising Brain’ is an astonishing work, a book that is truly challenging. He argues against the common belief that our thoughts and behaviours are deeply rooted in our subconscious. Mental depth for him is an illusion. Instead, he suggests that our minds are flat, meaning that they operate on the surface level without deep, hidden motivations or unconscious processes.

Training is post-rationalisation

For me, he explains why most training is post-rationalisation, simplistic stories we tell ourselves about cognition. We latch on to abstract words like creativity, critical thinking and resilience, then wrap them up in PowerPoints to create coherent stories that are quite simply fictions. This is why they are so ineffective in the real world. They make you think there are easy solutions, simple bromides for action, when there are not. He thinks this is all wrong and I think he is right.

Cognition is improvisational

Chater supports his arguments fully by discussing various psychological studies and experiments. He proposes that our thoughts, feelings, and behaviours are largely improvisational and context-dependent. According to his theory, our responses to situations are not driven by inner beliefs or desires, but are rather ad-hoc constructions created on the spot. This idea challenges the traditional views of psychology and suggests that much of what we believe about our internal thought processes might be an illusion.

Attacks psychoanalytic and psychotherapeutic worlds

It is a direct attack on the whole psychoanalytic and psychotherapeutic world and, if true, renders much of what passes for psychology speculative rot. He challenges the whole notion of a complex, subconscious mind that can be unlocked or understood through psychoanalysis or similar therapeutic methods. Since our thoughts and behaviours are improvised on the surface level and are context-dependent, delving into the supposed depths of the subconscious to find hidden meanings or repressed memories, as is often the goal in psychoanalysis, is likely to be misguided. He suggests that the mind doesn't work in the way psychoanalysis proposes, with its emphasis on uncovering deep-seated, unconscious desires and motivations.

Over-rationalise

We over-rationalise when it comes to ideas about the brain, when it is in fact fantastically complex and opaque. He touches upon Tolstoy, where Anna Karenina commits suicide – but why? The stories we tell ourselves about her motivations are, for Chater, quite wrong as she would be incoherent about such things. Rationalism is the mistake here, the idea that there is a true answer for everything. Dennett has taken a similar position, where conscious rationalisation is always retrospective, delving back into the brain. The brain does huge, complex, parallel computations and has no locus or simplistic causes. The same applies to LLMs: there is no place you can point to for the production of an answer. The brain, like an LLM, is necessarily opaque.

Stories are misleading

We are improvisors and this is where our storied-self is misleading. We simply make most things up, model and get through our lives. Even these simulations are often crude. Geoffrey Hinton in 1979 talks about the shallowness of human inference, using imagined cubes as an example. Our simulations of the world are momentary and not wholly coherent. We build models of it (the cone of experience) trying to see it as consistent. In fact, we deal with very localised bits of the world, a sliver of reality. We can’t model the world in our brains as the world is much larger than us! It is all a matter of approximation, analogy and past experience.

We latch onto abstract models and essences but these are too reductionist. Human exceptionalism is a good example of this, with words like ‘creativity’ and 21st C skills. Chater thinks these are misleading terms as they are too abstract and exclude the complexity of actual cases. He and Wittgenstein are, I think, correct on this. Language is promiscuous and tends to overproduce abstractions which we think are real but turn out to be just that – misleading abstractions.

Our sense of our own psychology is almost completely wrong, as we have incredibly limited perception of the world through our senses and our minds work very differently from how we think they work. Colours are unlikely to exist as such out there in the world, as they are mental constructions. The same goes for temperature, as experienced.

Bayesian brains?

One could argue that there are fundamental models, like pure reason, mathematics and science. Axiomatisation does happen, often after huge amounts of effort, but very few things are, in practice, axiomatised. We may have some of this axiomatised knowledge in us but this is unlikely to be foundational in the way psychology or neuroscience thinks it is.

All models are wrong but some are useful. We can, for example, hypothesise about the brain being a Bayesian organ. This may be true, but it is more likely that the brain merely behaves in ways similar to Bayesian approaches to cognition. Tom Griffiths and Josh Tenenbaum follow this line but Chater thinks this is very localised and not sufficient for most cognition.

Conclusion

It has been a while since I read anything that so reversed my long-held beliefs. Heavily influenced by my reading and work in AI, I had been coming to a similar, but ill-informed and badly-evidenced, belief that this was indeed the case. It changes your whole perspective on cognition and behaviour. AI is showing us that much of what passes for human behaviour can be reproduced by LLMs and other forms of AI. This should not be so astonishing if Chater is right that we are quite shallow thinkers. The brain is not the same as an LLM, but LLMs show us that the brain is not as deep and rational as we think.


Sunday, February 25, 2024

There’s a new Sheriff in town – AI!

Keynoting at conferences where the audience has absolute focus in a sector is sometimes better than the general conferences. They are keen to find out how AI can help them with their specific problems and goals. It leads to more practical discussions and questions. In the last year I have given presentations to national tax, police, waste management, recruitment, immigration, HR, military and global consultancy organisations, also specific online learning companies. They all have one thing in common – they already use AI, sometimes extensively, at an operational level, and know they need to get to grips with this technology in other areas, as it is the technology of the age. I will do a short series of articles on specific sectors, as I’m on the road in several countries giving more of these over the next few weeks. (Image on left by DALL-E)

First up – the police, as I’ve given three keynotes to national police colleges in the UK and the Netherlands and for the EU. There’s a new sheriff in town – AI! Well not really, as the police are pioneers of AI.

AI in policing

ANPR (Automatic Number Plate Recognition) was invented in the UK in 1976 and has been in use since 1979. It truly revolutionised policing and the UK now has 60 million ANPR ‘reads’ a day. It acts as a deterrent to reduce crime and catches everything from stolen vehicles and uninsured vehicles to major crime and counter terrorism. Then there are its more mundane uses, which we encounter every day in car parks, tolls and logistics tracking. It is a great example of the massive benefits that can accrue from a simple piece of AI, in this case character image recognition, something that has been around for decades.

CCTV was first used in the UK in 1960 for crime prevention and the detection of offenders. Combined with face recognition, it can be, and has been, used to identify serious offenders such as murderers, sex offenders and figures in organised crime. It is now essential for crowd control and public order. Face recognition is applied not only to CCTV but also to mobile phone footage, dashcams and doorbells.

It is also used when you cross borders at immigration gates. I haven't spoken to a border guard coming back to the UK for many years. It has been automated. In fact, humans are now the main point of failure. My wife cannot enter the US because of a poorly trained TSA guard at LAX, who neither knew the rules nor had the ability to solve the simple problem (a long story - see end of this article). When you slide in your passport at an automated gate, it compares your face with one stored on the chip in your passport. This has literally eliminated the need for thousands of border control agents. Why? It is accurate. Fingerprinting will be introduced across the EU this year, again using AI. The same can be done for documentation.
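As a rough illustration of the comparison an automated gate makes, here is a minimal sketch. It assumes, purely for illustration, that both the live camera image and the chip photo have already been turned into numeric feature vectors by some face recognition model, and that a match is declared when their cosine similarity passes a threshold. The function names, the vectors and the 0.8 threshold are all hypothetical; real border systems are proprietary and far more involved.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def gate_opens(live_vector, chip_vector, threshold=0.8):
    """Hypothetical decision rule: open only if the two faces look alike enough.

    The threshold is an assumed value, not one used by any real gate.
    """
    return cosine_similarity(live_vector, chip_vector) >= threshold

# Made-up vectors standing in for the output of a face recognition model.
chip_photo = [0.9, 0.1, 0.4]
traveller = [0.88, 0.12, 0.41]
someone_else = [0.1, 0.9, 0.2]

print(gate_opens(traveller, chip_photo))
print(gate_opens(someone_else, chip_photo))
```

The point of the threshold is the trade-off it encodes: set it too low and the wrong person gets through, set it too high and the right person gets sent to a human.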

There is a very long list of other uses, including crime analysis and investigation, forensic analysis, traffic management, drone surveillance, cybercrime detection and social media monitoring. I could go on but you get the idea. AI is already deeply embedded in crime deterrence and detection.

Of course, AI may create its own problems with scams and deepfakes. This will undoubtedly happen. My own view is that this is less of a threat than people think. Deepfakes are usually moderated out by AI on social media, as it polices itself. Yet audio and video are increasingly used to scam people, to make the scammers seem authentic. At this level the police need training on that topic.

One of the great things about these events is the concrete projects, real projects used in training that have been applied or are underway. I have seen a range of projects that really were stunningly specific and useful – in forensics and the general training of police officers.

Productivity

You walk away from such events knowing that AI could result in massive productivity gains, especially as policing is a text-laden process, the bit no-one likes – the paperwork. Using AI to create, improve and simply do administrative tasks, not just faster but better, would be straightforward.

Transcription

Transcriptions alone could save millions. Throughout the police investigation process, and in courts, statements are taken and proceedings recorded. This is a massive opportunity for automated transcriptions. 

Translation

Translation in police stations and out on the streets is another. Using real translators is expensive and difficult logistically. Real-time translation to and from a massive range of languages is now possible.

Training

But it is in training where they have most to gain. These are people with increasingly complex and difficult jobs who deserve all the help they can get. From learner engagement through learner support, content creation, personalised learning, feedback, formative and summative assessment, along with performance support, almost every aspect of training could use AI.

The police have a tough job that requires a LOT of training. They have to deal with aggression, violence, abuse, mental illness, drugs, alcohol, murder, even death. This requires an astonishing array of knowledge and skills. 

Simulations and scenarios

AI could help alleviate that problem with a focus on scenario-based learning, using AI to design and build lots of good dialogue-based scenarios. This is the real interface between the police and the public. I’m told that new recruits are often ill-prepared for the situations they find themselves in, unable to talk things down, too ready to reach for the pepper spray. This type of training can be done well through lots of exposure to scenarios that give pre-training before you hit the streets.

Performance support

I had several interesting conversations afterwards around the use of simulations for driver training, 3D mixed reality projects using VR in forensics, and the possibility of AI improving administrative productivity. The one topic I felt was most interesting was the idea that AI can be used for performance support. Policing is all about being out there, doing things in the real world, difficult things. It needs a wide array of skills, a fundamental and accurate knowledge of the law, high-level interpersonal skills (especially de-escalation), physical handling skills, high-level driving skills, communication skills, medical skills… I could go on but you get the idea.

The one thing that is missing in the current model is performance support for training out there, in police stations, in cars, wherever. There can be no doubt that most police officers and back-office staff learn a lot on the job from colleagues and more experienced staff. This seems like the perfect context for an AI-driven performance support system. It could deliver, for example, usable advice in the field on the law, processes, procedures and so on, as real checklists, job aids and support.

Federal problem

One of the problems the police face is the federal and fragmented nature of their organisation. The United Kingdom has a total of 45 territorial police forces. This includes 43 forces in England and Wales, the Police Service of Northern Ireland (PSNI), and Police Scotland. Additionally, there are several special police forces that operate across the UK, such as the British Transport Police, the Ministry of Defence Police, and the Civil Nuclear Constabulary, among others. However, these special forces have specific jurisdictional responsibilities rather than geographic ones. This makes communal and well-funded projects difficult. There is a real need for a mechanism for, at the very least, sharing projects, or for projects that can be centrally funded by all, then distributed back out to save time and money.

Conclusion

I wish I had had more time with these people. They know about the need for good training. What they need is help in delivering that training more effectively, lifting themselves out of classroom PowerPoint, into more realistic training that results in real transfer to the job on the street.

PS
My wife has been banned from travelling into the US since 2016. We were travelling to New Zealand via Los Angeles (LAX) and simply had to transfer aircraft. I got through as my passport had been renewed. My wife had an Arabic stamp that the TSA guy, with all their typical arrogance and poor training, thought was dodgy. She explained that it was a Syrian stamp, from six years ago, when we went on a holiday to Syria with our kids, before the war. He didn't believe her and off she was marched to the back office, where she sat for ages before finally being interviewed by an equally obnoxious person. They couldn't read the date or month on the stamp because no one could speak or read Arabic! (A problem that could have been solved in two minutes by checking what the numbers were on Google.) She explained that this was before the war had started but they were dismissive, did nothing to try and clarify the matter, and we were marched through the back of the airport, put on our Air New Zealand flight and told aggressively that she was NOT allowed to return. This cost us a fortune as we had flights booked via Vancouver to San Francisco and back to London, all Business Class, all lost. We also had to book two new flights from Vancouver to London. It was like dealing with gun-toting idiots - all bravado, poor training, poor resources and even less common sense.

She has never been back to the US but it's a big world out there so, for her, it is no great loss.


Thursday, February 22, 2024

Rather than deepfakes, censorship and surreal ethical fakes are the problem?

Google have just shot themselves in the foot with their release of Gemini. Social and mainstream media have been flooded with pieces showing that their text and image generation behaves like some crazed activist teenager.

When asked to create images of a German soldier it created black, Asian and female faces, so keen was it to be ‘diverse’. I won’t give other examples, but it almost looks as if the Gemini LLM is mocking its creators for being so stupid. It's as if the language model fed back to them the craziness of their own internal ideological echo chamber and culture.

This image shows what happens when idiotic guardrailing overrides common sense. We get the imposition of narrow moralising on reality and reality loses. More than this, straight up utility loses. The tool gets a reputation for being as flaky as its moralist moderators. This is not like the six fingers problem, a weakness in the technology itself; it is the direct result of human interference.

What is Guardrailing?

Guardrailing is a complex business and needs to be carefully calibrated. So what is ‘Guardrailing’? It attempts, like road barriers, to provide safeguards and constraints to stop responses producing harmful, inappropriate or biased output. These three words are important.

Harm

No one wants child porn, porn or actual content that causes real physical and 'extreme' psychological harm to be produced. This has long been a feature in ethics around the line that should be drawn within freedom of expression (not just speech). We need to err on the side of freedom of expression as that is enshrined in our democratic society as essential. That throws the definition back to the word ‘harm’, as defined legally, not by someone who ‘feels’ as though they have been harmed or think that harm is synonymous with offence.

Inappropriate

This is different from harm in that it depends on a broad social consensus around what is appropriate, a much harder line to draw. Should you allow swearing (some literature), nudity in any form (think classical art) and so on? Difficult but not the real problem as there is, largely, a social consensus around this.

Biased

This is a dangerous term, and where things can go way off balance. I use that word ‘balance’ as we cannot have a small group applying their own personal views of what constitutes balance to these systems. This clearly happened in Google’s case. In trying to eliminate what they saw as bias they managed to impose their own extreme biases on the system.

How is it implemented?

It always starts with policy statements and guidelines. This is where things can go badly wrong, if the people writing the guidelines apply their own personal, ideological or activist beliefs – whether from the right, left, wherever. This is a huge lever as it affects everything. This is clearly where things went wrong at Google. A small number of moralisers have imposed their own views on the system. They need to be removed from the company.

The guidelines are then implemented using content filters. The problem here is that if your guidelines are too constrained you eat not just into freedom of speech but also functionality. It simply doesn’t do things, like reply or create an image. Google have gone back to the filters to recalibrate.

Prompt modification means they take your prompt then add other criteria, like 'diverse', 'inclusive' and other positively discriminatory descriptions. This is the most likely cause in this case as the outputs are so obviously and crazily inappropriate.
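To make the mechanism concrete, here is a minimal sketch of what prompt modification looks like in principle. It is not Google's implementation, which has not been published; the function name and the list of injected terms are hypothetical, chosen only to mirror the words mentioned above.

```python
# Hypothetical sketch of prompt modification; NOT any vendor's actual pipeline.
# The injected terms are assumed here purely for illustration.
INJECTED_TERMS = ["diverse", "inclusive"]

def modify_prompt(user_prompt, injected_terms=INJECTED_TERMS):
    """Return the prompt the image model would actually receive.

    The user never sees this rewritten version, only the output it produces.
    """
    if not injected_terms:
        return user_prompt
    return f"{user_prompt}, {', '.join(injected_terms)}"

typed_by_user = "a German soldier"
sent_to_model = modify_prompt(typed_by_user)

print("User typed:   ", typed_by_user)
print("Model receives:", sent_to_model)
```

Because the extra words are added to every request regardless of context, the model is being asked a different question from the one the user typed, which is exactly why historically specific prompts come back looking absurd.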

Moderation is another technique. This is a bad idea as it is slow, expensive and subject to the vagaries of the moderators. You are far better automating and calibrating that automation, although there are several exceptions to this, such as porn, child porn, extreme violence and other actually harmful content.

You can also curate the training data. This is less of a problem as the data is so large that it tends to eliminate extremes. Indeed some of the problems created by Google have come from ignoring clear social norms that the training data would have produced. Apply a narrow definition of identity and you destroy realism.

There is also user testing. This has really surprised me. I know that Gemini was tested widely among Google employees before release, as I know people who did it. The problem could be that Google tends to employ a certain narrow demographic, so that testing is massively biased. This is almost certainly true in this case. Or, more likely, the image generation wasn’t actually tested, or was tested only with their own weird ‘ethics’ team.

You can also put in user constraints that apply to single requests and/or context, such as mentioning famous names in image generation. That’s fine.

Conclusion

I warned about this happening three years ago in a full article, and again in a talk at the University of Leeds.

Who would have thought that, rather than political deepfakes being the problem, censorship and surreal ethical fakes by flaky moralists would flood social media? Guardrailing is necessary, and a good thing, but only when it reflects a broad social consensus, not when it is controlled by a few fruitcakes in a tech company. You can’t please all of the people all of the time but pleasing a small number at the expense of the majority is suicidal. Guardrails are essential. Imprisoning content or allowing the generation of massively biased content, under the guise of activism, is not.

The good news is that this has happened. I mean that. In fact, it was bound to happen, as there was always going to be a clash between the moralisers and reality. We learn from our mistakes and Google will rethink and re-release. That's how this type of innovative software works. Sure, they seem hopeless at testing, as five minutes of actual testing would have revealed the problem. But we are where we are. The great news is that it knocks a lot of the bonkers ethical guardrailing into touch. 




Wednesday, February 21, 2024

Algorithms, optimisation and football

People sceptical of AI and algorithmic power should take note of my local football team, Brighton and Hove Albion, AKA the Seagulls. For a small town, we topped our group in the Europa League, are still in the FA Cup and, despite being decimated by injuries, are still 7th in the Premiership, above Newcastle and Chelsea. That last name is critical as they have ripped out our manager, top staff and several players.

Tony Bloom

Having splashed out hundreds of millions, many of these teams will find it difficult to splash out even more on either our manager De Zerbi or any more of our players. But the secret sauce is not in the manager but Tony Bloom, the owner. It is he who finds the managers and players. One of the few local, genuine supporter owners in football. He made his fortune gambling, then as a gambling entrepreneur. He heads a private betting syndicate who are known to have been phenomenally successful in betting in sport.

He has been the Chair of the club for 15 years and has built a system of sophisticated data collection and algorithmic selection for scouting new players. It remains a secret, held by a separate company called Starlizard, so that no matter which scouting staff or manager comes, Bloom literally holds the key.

This focus on recruitment is the feed that creates a robust organisation that buys cheap and sells for top dollar. Caicedo cost £4.5 million and was sold for £115 million to, you guessed it, Chelsea. His current roster has several players in that league, many young and therefore more valuable. They also play the sort of football that has become popular in top flight leagues – playing out from the back, pulling the opposition towards you and breaking fast.

What lessons can we learn from him? 

Leaders matter but not in the way leadership books and courses would suggest

Bloom is whip smart, driven and very much behind the scenes. He is wholly strategic, not tactical. His talent is in understanding that even a complex organisation, in a stochastic sport like football, needs to be run on high-quality decision making. That means decisions based on data and optimisation, not charisma or hunch. Data and algorithms are in his DNA, not vague nostrums about Leadership.

Recruitment matters but not in the way you think

Recruitment is data driven. A long list of data types is collected and fed into an algorithmic process that flags targets for acquisition. He is interested in pure performance, not values or vague criteria and personal qualities. Actual performance: match time, successful passes, tackles, turnovers, shots, goals – and much, much more. All of this is monitored.
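The real system is a closely guarded secret, so the following is only a toy sketch of the general idea of turning raw match data into a list of flagged targets. Everything in it is an assumption made for illustration: the `PlayerStats` fields, the weights, the per-90-minute normalisation, the minimum playing time and the threshold.

```python
from dataclasses import dataclass

@dataclass
class PlayerStats:
    name: str
    minutes: int
    successful_passes: int
    tackles: int
    shots: int
    goals: int

# Illustrative weights only; not derived from any real scouting model.
WEIGHTS = {"successful_passes": 0.02, "tackles": 0.3, "shots": 0.2, "goals": 1.0}

def per_90(count, minutes):
    """Normalise a raw count to a per-90-minutes rate."""
    return 90 * count / minutes if minutes else 0.0

def score(player):
    """Weighted sum of per-90 rates: a crude stand-in for 'pure performance'."""
    return sum(
        weight * per_90(getattr(player, stat), player.minutes)
        for stat, weight in WEIGHTS.items()
    )

def flag_targets(players, threshold, min_minutes=900):
    """Names of players above the threshold, best first, ignoring small samples."""
    eligible = [p for p in players if p.minutes >= min_minutes]
    ranked = sorted(eligible, key=score, reverse=True)
    return [p.name for p in ranked if score(p) >= threshold]

# Made-up players, not real data.
squad = [
    PlayerStats("Player A", 1800, 900, 40, 30, 10),
    PlayerStats("Player B", 1800, 600, 20, 10, 2),
    PlayerStats("Player C", 300, 200, 10, 12, 6),  # too few minutes to trust
]
print(flag_targets(squad, threshold=2.0))
```

Even in this toy form, two design choices matter: rates are measured per 90 minutes so that playing time does not masquerade as quality, and players with too few minutes are excluded so that a hot streak on a small sample does not trigger a bid.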

Deal making

Deals start with early contracts and he makes sure they are long deals with good exit fees. His promise is clever: come to Brighton, we’re in the best league in the world, the Premiership, and we can showcase you so that you can get into a top club anywhere else in the world. And when it comes to selling, he’s a master. As a successful international poker player, he fully understands both the fiscal and psychological moves that have to be made.

Growth 

When he became Chairman in 2009, he hired Poyet and got promotion as Champions in 2011. After a series of managers, Chris Hughton got us to 3rd place in the Championship then promotion the following season in 2016-2017. It was then he started to be really active in the transfer market. Even then, he was replaced by Graham Potter who took us to an all-time high of 9th in the Premiership, getting us into Europe. Chelsea (yes them again) stole Potter but Bloom made possibly his best hire yet, De Zerbi, applying all of his data analysis and algorithmic nous to even the managerial position. In other words, he understands gradual but steady growth.

He now has a huge war chest to invest over the summer, has made some fantastic signings, especially Barco, touted as a huge talent and stolen from beneath the noses of the big boys, and is ready to take things to the next level. This is poker at the highest level, a game of probabilities, tempered by maths, data and algorithmic decision-making. You never see him blowing off on TV yet the people of Brighton love him, as he’s humble, a real supporter, a self-made man, and has put his money where his mouth is. This is no lazy, wealthy Gulf prince or Russian oligarch. This is the real deal, a real Leader.

An interesting idea also emerges here, that organisations who get optimisation right will be winners, the rest the losers. This demands our attention as it is likely to happen. It means getting with the programme now, to understand the technology of optimisation.

Seagulls!

Football was the only ‘real’ sport in my culture, at school, in pubs, wherever. We played nearly every night beneath the yellow street lamps, even in the rain, on odd shapes of grass on the edge of our scheme in Craigshill (known as Crazyhill). A speciality was bouncing the ball like a cushion shot in billiards off the wall to get past a player. Some of us went week in, week out to matches, in my case Glasgow Rangers, home and away – Scotland’s a small country so it was easy.

When I ended up in Brighton, as far away from Scotland as you can get, without getting wet, a colleague at work, Clive, was a fanatical Brighton supporter, so I started going to the Goldstone. I arrived the year after they had appeared in the FA Cup Final and this was a different atmosphere, players like Frank Worthington, even the occasional Scot like Doug ‘chipped from a block of ice’ Rougvie. It was fun. But then they lost their stadium, imploded, narrowly missing relegation to the Conference League in 1997. It was desperate.

Then, despite protestations from Sussex University, a local man and Brighton supporter made good, Tony Bloom, put £90 million on the table for a new stadium. We never looked back. After 34 years out of top flight football we climbed back up to the Premier League. The promotion parade on the seafront was fantastic. At that time we had ‘Skint’ on the shirts as Fatboy Slim was a supporter and sponsor. The stadium sits nicely nestled in the Downs and at the start of every game there’s always a seagull or two circling high above the pitch, Seagulls! being the club’s standard chant. The crowd always, quaintly, kick off the match with a rousing ‘Sussex by the Sea’, a First World War marching song.

Occasionally, though it hardly ever happens now, the opposition would sing some homophobic chant, like ‘You’re going down on each other, you’re going down on…’ to the tune of ‘Guantanamera’, actually about a Cuban woman, but there you go. Our fans would respond with ‘You’re too ugly to be gay, you’re too ugly…’ to the same tune. In truth it was all a bit banterish. People forget that this is a sport born of the industrial need in the 19th century for workers to have some fun at the weekend, after a week of hard labour. Going to the football is always a bit of a laugh. The beautiful game is working class Britain’s gift to the world.

Anyway. After nearly 40 years in Brighton I’m a Brightonian now, and like many, a Saturday is spent eyeing my phone for the result. It is a feeling that comes across you on a Saturday, of excitement and expectation, watching the clock for kick-off time. It turns the day into a drama. 

We’re playing brilliant football and despite London clubs run by Middle Eastern and Russian billionaires stealing our manager, support staff and players, we’re flying. 

I’ve been in all sorts of places around the globe this year and often the first conversation I have in the taxi, restaurant or meeting is about ‘Brighton… and Hove Albion’. People talk a lot about ‘culture’ these days but those who have real culture don’t use that word, they live it. Seeeagulls!

Sunday, February 18, 2024

I am become life, the creator of worlds?

Can words now create worlds? Has AI suddenly acquired God-like creation qualities?

A short video of two black pirate ships sailing in a stormy sea of coffee, within a coffee cup, has caused a splash by showing what can be done with video but also created a stir by suggesting something astonishing. My son is a games player and AI expert. His immediate reaction on seeing the Sora videos was to note the slightly too perfect, gamesy feel of the images.

Could OpenAI have developed something truly astonishing here – a 3D world simulator? Is this the converse of Oppenheimer’s famous statement, 

I am become life, the creator of worlds?

To date, generative AI has been limited to text and images and has lacked a model of the world, of 3D space, time, causality and action.


This video was generated from a prompt, astounding in itself, but what it may reveal is the following:

Possibility in the future of a physics engine that understands how objects behave, in this case the two pirate ships that never collide. That they do not collide is relevant, as the model must track the position of the two boats at all times in a 3D space. It is physics that grounds models in reality. Note that the way this works is not by having a physics and collision engine, only that the data, from computer games, will have been created using such tools.

Behaviour of the ships on the water suggests possible future detailed knowledge of fluid dynamics, as the coffee whirls around and even creates waves and froth. Again, it is not being created from the mathematics of fluid dynamics but by a clever diffusion model.

Cup size and limitations of the cup space, showing a knowledge of small objects and the ability to scale two very large objects down into a small space.

Sharp realism with correct lighting and shadows is also astonishing. This is not a rendering engine but, again, a trained diffusion model.

There are suggestions that this could have been trained using data from Unreal, the game engine, in particular synthetic data from that engine. YouTube and other sources are also clearly in there. This means it is trained on a combination of real and virtually created worlds. There also seems to be a time component. This is interesting, as that variable is missing in other modes.

If they have created such a thing, this is far more than just video creation. It is a step towards the idea of creating 3D worlds using AI, something I mentioned in my book on Learning in the Metaverse. Being able to create any 3D world is a far bigger deal than video, as it opens the way for another revolution in media and learning. We are nowhere near that yet.

In truth there are two opposing routes to solving this problem, and both were released this week - OpenAI/Sora v Meta/V-JEPA. OpenAI has developed Sora, recognised for its text and video-to-video modelling capabilities, aiming ultimately to create a world simulator. However, Meta's AI chief, Yann LeCun, criticises this method, considering it impractical and likely to be unsuccessful. He contends that generative models are not suitable for processing sensory inputs due to the high level of prediction uncertainty associated with high-dimensional continuous sensory data.
In response, LeCun has introduced his own AI model, V-JEPA. This model utilises a non-generative approach and is designed to predict and interpret complex interactions. Its primary function is to understand the dynamics of objects and interactions, thereby enhancing the AI's comprehension of these elements.

We are 3D people, living in a 3D world doing 3D things with other 3D people and 3D things. Yet, bizarrely, most teaching and training is from the flat, 2D page – text, books, graphics, PowerPoints and screens of e-learning. This has always been largely suboptimal and prevents actual learning of skills and transfer.

In the beginning was the word and now we, like small Gods, can use the word to create new worlds. We are in dialogue with the world to create new ones. That simple act of saying something can make it appear, breathe life into that world. I find that more than interesting, it is staggering.

We may have, in this tool, the ability to create worlds, any world, on any scale, in 3D by simply asking it. If so, this is a threshold that has been crossed. We will be able to create worlds in which we work, interact and get things done. Also worlds in which we teach, train and learn. Even worlds in which we socialise and get entertained. We may be doing what has only ever been done on a limited scale in incredibly expensive simulators and computer games – understand and create new worlds. Multimodal may now mean a grand convergence.



Friday, February 16, 2024

Sora - as Producer, Director, Screenwriter, Cinematographer, Casting Director, Costume designer and actors

Sora, from OpenAI, may go down in the history of movies or moving pictures, as a pivot point. It is as significant as that filmed train thundering into La Ciotat, scaring the theatre audience, or The Jazz Singer, the first feature-length talking film.

I’ve been involved in making a feature film, way back in the 1990s, at Epic, we made ‘The Killer Tongue’. It was a schlock-horror where a woman killed men who had wronged her with her tongue. It was all very fraught…. And expensive and we lost an eye watering amount of money. Lesson learnt. One amusing moment was when Quentin Tarantino said we “Had the best film poster at Cannes”… we dropped the word ‘poster’ as you do… in Hollywood! Didn’t work – the film bombed.

Film making has just been turned on its head, no longer requiring huge investments in production. All the components seem to be heading towards very low cost – everything – sound, lighting, worlds, people, action. A bit like painting and photography, only faster.

Sora is just so powerful. Even on its first release the shorts were stunning, the movement, lighting and reflections. There’s a moment when what looks like a Japanese woman walks down a street and turns to camera and the whole scene is reflected in her glasses, where the movie camera should be, but it isn’t. There is no camera as it has been replaced by a text prompt. In another, two pirate ships heave around in a black sea of coffee, prompted by the simple words “Photorealistic closeup video of two pirate ships battling each other as they sail inside a cup of coffee.” No prompting course needed.

This puts movie making in the hands of creative people who can dispense with the very high production costs of a crew, set and, at some point, perhaps actors. It may even do a good job at editing. AI is already doing colour balancing, volume setting across cuts and many other functions. It has just jumped into the Director’s seat, into a creative role.

There will be those who will baulk at this, in the same way people baulked when photography challenged painting, printmaking challenged original works and CGI arrived in film. But this is different, as it is not technology that is scaling or becoming a new medium, it is technology as a creative agent. This is technology, as Producer, Director, Screenwriter, Cinematographer, Casting Director, Costume designer and actors. This is technology as movie-maker. It democratises movie-making. Some will recoil at that, especially those who make money from scarcity, the Hollywood moguls and their crew. It will reshape film making, in what way no one can be entirely sure but things change, life changes, art changes. It is the very definition of art.

Ever since the concept of the ‘arts’ and the 'artist' arose in the Romantic movement in the late 18th century in Germany, steeped in German idealism, and the concept of the individual artist, imported by Coleridge and taken up with enthusiasm by everyone in the arts to this day, we have worshipped the artist as an individual. Well, here we are, he/she has arrived - we now have that very concept in film. The AI auteur has arrived.

Going back to that moment when the woman turns to camera and there is no reflected camera, the fact that film making is free from the physical constraints of the camera, sets and costs, we may see a Renaissance of film making. Free from the tyranny of actual optics and physics, anything is possible. On the other hand if you want realism, the actual realism of historical settings may be far easier, with more authentic props, clothes, items, weapons and so on, than was ever possible before.

If you want extinct animals, you can have them, thundering across a snow-covered landscape, without the cost of going there. CGI is now well and truly yours.

An animated character that is the product of your imagination, just tell it what you want in a prompt.

A fast moving car chase, from a helicopter or drone, with good lighting through a dirt landscape.

A busy street scene with a Chinese Dragon and large crowd moving towards you.


New Genres

We may see new genres emerge, certainly a widening of participation and what can be done in moving images, as anything is now possible. My own view is that this will combine with other forms of AI and the shift from 2D to 3D, to create film experiences where we participate, either as agents within the story or our avatars as participants in the story. Movies and games will combine to form a new medium.




Thursday, February 15, 2024

Sora and Gemini 1.5 - two mind blowing releases within hours of each other


No sooner had I written about how important ‘context windows’ are for using AI for teaching and learning than, within 24 hours, Google wrote about their next release, Gemini 1.5 Pro, which blows the whole market open – and guess what the great innovation is? A MASSIVE INCREASE IN THE ‘CONTEXT WINDOW’.

Then, within a few hours another announcement – OpenAI’s release of Sora and we have an absolutely INSANE text to video model from OpenAI. Creates real & novel scenes just from text descriptions. This is a flip moment, as we all thought this was years off... implications for learning - huge... crazy good videos, lighting and movement. Not only that we see something interesting way out on the horizon. The whole Hollywood, Netflix thing is now up for grabs. Social media may well become the new source of entertainment and art.

The context window, what the model can ingest, has just gone through the roof, in fact several roofs. They plan to start with the standard 128,000 context window, then scale up to 1 million tokens, as they improve the model. This will mean it can take in huge amounts of tokens and is multimodal. Whole books it eats for breakfast, collections of documents, full movies, a whole series of podcasts.

The examples are compelling, so here’s just a few sets of seven. I could have given tons more….

Text

It can ingest giant novels then find exactly what you need. They took Victor Hugo’s five-volume novel “Les Misérables”, which is an astonishing 1382 pages, sketched a scene and asked “Look at the event in this drawing. What page is this on?” Got it right.

The opportunities in learning are many:

1. Summarising any document, no matter how long

2. Finding something within an enormous text file

3. Huge sets of HR documentation turned into an accessible resource

4. Use by a tutorbot to answer student queries and questions

5. Feedback and marking text assessments

Audio

I have recorded a large series of 30 podcasts on Great Minds on Learning. They’re an hour each and the initial tests suggest these could be ingested and used for learning.

The opportunities in learning are again many:

1. Summarising tons of podcasts

2. Finding specific chunks of podcasts to answer a query

3. Interpreting communication skills

4. Feedback and marking of spoken assessments

Video

It gobbles up entire movies and you can ask questions about what happened in those movies. The Buster Keaton example interprets a pawn ticket taken from someone’s pocket. The model can answer complex questions about the video content, or even from a primitive line drawing.

The opportunities in learning are once again many:

1. Allowing search of video for performance support on a specific task then playing it back

2. Allowing the learner to ask for more detail on a specific event or task

3. Looking for a specific solution to a specific problem 

4. Interpreting a trainee’s performance from video identifying successes and failure, with feedback on correcting and improving performance

5. Taking a lecture and annotating it with extra resources

6. Turning any video into a deeper learning experience

7. Interpreting video assessments where content & behaviour matters

Conclusion

These two releases alone will have a huge impact in learning. They bring video PLUS AI ingestion and interpretation of video into play. But we have to be careful. Video is an odd medium for learning. We tend to think it more powerful than it is. That is because of the transience effect. I covered this in detail in my book Learning Experience Design. This is NOT about the generation of media but about the generation of learning.


Wednesday, February 14, 2024

AI gets massive memory upgrade - implications for AI in learning


Human memory

A strong feature of intelligence is memory. In humans this is complex, with several different systems interacting: sensory, episodic, semantic, along with encoding and retrieval mechanisms. It is not as if human memory is even that good. Our sensory memory is severely limited in range and timescale. Working memory is down at three or four manipulable things within a limited timescale. Long-term memory is fallible and degrades over time, sometimes catastrophically, with dementia and Alzheimer’s. The brain could accurately be described as a forgetting machine, shown by the fact that we forget most of what we try to learn.

AI memory upgrade

The good news is that Gemini and ChatGPT both got a memory upgrade, although Gemini's is massive. This is really important as, especially in learning applications, knowing what the learner has said previously does matter. This is not only a context window upgrade – that has been happening for some time – it is also persistence of memory, what it remembers and what control you have over its memory.

First it will eventually be able to remember who you are and things about you that matter for learning, such as first language, age, existing skills sets, diagnosed learning difficulties such as dyslexia, and past exchanges.  Pre-existing knowledge is the big one. One can get this done up front by feeding it personal data or the system can ‘keep in mind’ what you’ve been telling it or what it can infer. You can also harvest data from formative assessment. This can reduce redundant exchanges and increase the efficacy, speed and quality of teaching and learning using AI tutors.

You will also be able to choose from a suite of privacy controls, effectively managing memory or what ChatGPT remembers. For example, you may want it to remember a lot for the purposes of a long learning experience or just have a throwaway chat.

Human v Generative AI memory

Both human memory and generative AI involve storing and retrieving information. In human memory, this process is biological and involves complex neural networks. In generative AI, information is stored digitally and retrieved through different forms of neural networks on a different substrate.

We are similar but different. For example, just as we humans recognize patterns based on past experiences stored in our memory, generative AI models also recognize patterns in the data they have been trained on. This ability is crucial for tasks like image recognition, language translation, and generating coherent text in dialogue, as well as the generation of images, audio and video.

Just as humans learn and adapt based on their memories and experiences, generative AI models learn from the data they are exposed to. This learning process is what enables these models to generate new content that is similar in style or content to their training data as well as being trained by humans. Newer models, used in automated cars, for example, take video feeds showing what a driver would see over millions of miles driving to improve performance.

Both human memory and generative AI can generalize from past experiences to new situations. Humans use their memories to apply learned concepts to new scenarios, while generative AI uses its training to generate new outputs that it has never explicitly seen before. Human memory is also associative, meaning that one memory can trigger related memories. Generative AI models can mimic this by generating content based on associations learned from their training data. Both human memory and generative AI adapt and modify their responses or outputs over time, albeit differently. Humans learn from new experiences, while AI models can be retrained or fine-tuned with new data to change their outputs. The first is actually quite haphazard, the second more difficult but defined.

Of course, just as human memory is not a perfect record and can change over time, generative AI also does not produce perfect replicas of its training data. Instead, it creates approximations that can sometimes include errors or novel creations. An interesting aspect of this flexibility, even fallibility of memory, is that just as human creativity is deeply linked to our experiences and memories, generative AI can also 'create' new content.

Context window

One concept fundamental to AI memory is the ‘context window’, the amount of text the model can consider at one time when generating a response in dialogue. It is the maximum span of recent input - words, characters or tokens - that the model can reference while generating output, like our working or short-term memory.

The size of the context window depends on the model. Early versions of GPT had small context windows. GPT-2 had a window of 1,024 tokens and GPT-3 one of 2,048 tokens, while GPT-3.5 had around 4,000 tokens and GPT-4 launched with around 8,000, and now much more. The size of this window impacts how much previous text the model can 'remember' and use to inform its responses.

This matters because if the input exceeds the model's context window, the model may lose track of earlier parts of the conversation or text. Conversely, a larger context window allows the model to maintain longer conversations or understand longer documents, providing more relevant and coherent responses. However, here’s the downside: processing longer context windows also requires more computational power and memory and may also affect accuracy and the quality of the response. Large context windows in Claude led to poorer performance.

All of this matters in practical applications, especially in teaching and learning, as the context window affects tasks like conversation, content generation and text completion. For example, a larger context window allows the model to reference earlier parts of the conversation, making it more effective in maintaining contextually relevant and coherent discussions, obviously useful in teaching and learning, for both the machine tutor and the learner. There are techniques one can use to mitigate these limitations, such as a ‘rolling window’ or ‘summarization’ of previous content, but it is still a problem. However, this is similar to the problem human teachers face when trying to remember where different students are, using their known very limited working and long-term memories.
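To make the ‘rolling window’ idea concrete, here is a minimal sketch in Python. It is my own illustration, not how any particular model or vendor does it. The function names are hypothetical and tokens are counted crudely as whitespace-separated words, where a real system would use the model's own tokenizer.

```python
def count_tokens(text):
    # Crude stand-in for a real tokenizer (an assumption for this sketch):
    # one token per whitespace-separated word.
    return len(text.split())


def rolling_window(messages, max_tokens):
    """Keep the most recent messages whose combined size fits within max_tokens."""
    kept = []
    used = 0
    # Walk backwards from the newest message, stopping when the budget is spent.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


conversation = [
    "Tutor: What do you already know about fractions?",
    "Learner: I can add them when the denominators match.",
    "Tutor: Good. Try adding one half and one third.",
    "Learner: Is it two fifths?",
]

# With a budget of 20 'tokens', only the two most recent turns survive.
window = rolling_window(conversation, max_tokens=20)
print(window)
```

The tutor here has already ‘forgotten’ what the learner said they knew about fractions, which is exactly the problem. A ‘summarization’ approach would replace the dropped turns with a short summary rather than discarding them outright.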

Cost

One major issue is cost. You can expand the context window but the costs are very high, which supports the case for RAG (retrieval-augmented generation) alternatives.
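The RAG alternative can be sketched in a few lines. Rather than paying to push everything through the context window, you retrieve only the chunks most relevant to the question and send those. This is a toy illustration under my own assumptions: the function names are hypothetical and relevance is scored by simple word overlap, where real systems typically use vector embeddings.

```python
def words(text):
    # Lowercase and strip basic punctuation so "dyslexia?" matches "Dyslexia".
    return {w.strip(".,?!:;").lower() for w in text.split()}


def top_chunks(question, chunks, k=1):
    """Return the k chunks sharing the most words with the question."""
    q = words(question)
    ranked = sorted(chunks, key=lambda c: len(q & words(c)), reverse=True)
    return ranked[:k]


chunks = [
    "Working memory holds three or four items for a short time.",
    "Long-term memory is fallible and degrades over time.",
    "Dyslexia is a learning difficulty that affects reading.",
]

question = "What is dyslexia?"
selected = top_chunks(question, chunks, k=1)

# Only the selected chunk goes to the model, not the whole collection.
prompt = "Context: " + " ".join(selected) + "\nQuestion: " + question
print(prompt)
```

The saving comes from the prompt carrying one short chunk rather than the entire library, so cost scales with what is relevant, not with what is stored.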

Conclusion

Generative AI has a long history, from Hebb onwards, of mimicking the human brain, either directly or metaphorically. This is especially true of learning (a common word in AI) and the way neural networks evolved and work. They are not the same, indeed very different, but in both cases, humans and the machine and humans learning from the machine, memory really matters in teaching and learning.

In one sense learning theory is memory theory, if you define learning as a relatively permanent change in long-term memory, which is a pretty good, but still partial, definition. It is a constant battle with forgetting. Keep in mind, or in your memory, however, that despite these similarities, human memory and generative AI operate on fundamentally different principles and mechanisms. Human memory is a complex, biological and messy process, deeply intertwined with consciousness and emotions, while generative AI is a technological process governed by algorithms and data. Oddly, and maybe counterintuitively, the latter approach may result in better actual performance in teaching and learning, even generally. I think this type of informed input from learning science will really improve AI tutor systems. To be fair simply increasing context windows and the functionality will most likely have the same effect.

 

AI changes work and will change who, why and how we recruit

Recruitment is messy. I’ve recruited a lot of people myself from cleaners to C-suite. It has always been a messy business of uninformative ads, dodgy recruitment agencies, odd resumes and even odder interviews. I don’t think I was half bad to be honest, as many of the people I recruited went on to be CEOs of companies they started themselves. I didn’t care about aligning their values with the company, or checking their personality traits with Myers-Briggs BS. I wanted smart, driven people and got them. Not easy to manage but that’s the point, I wanted them to manage!

Anyway, I find myself giving a keynote at a recruitment conference in Oslo. The young people organising it are excellent, very thorough, checking my slides, making useful comments. They didn’t mind me challenging the audience.

I started by explaining that we Scots are closer to Norwegians than they think. My mother never called us children, always ‘bairns’, and to this day the Scottish working class and Scandinavian word for children is ‘barn’. We got the Norwegian Vikings, the English the Swedish Vikings. There’s a difference, I joked. At another keynote in Stockholm last year I heard a speaker make a joke about the Norwegians. I assume there’s some English/Irish thing going on here. ‘How many Norwegians does it take to change a lightbulb? Answer: One, he puts it in and the world revolves around him!’ Not bad. I'm sure the Norwegians tell the same joke about the Swedes!

I then did my bit on AI, stating that, as AI will radically change the nature of ‘work’, it is inevitable that it will change who, why and how we recruit.

In a recent survey by JobSwipe, large numbers admit to using AI to find new jobs:

18 – 24: 47.9% 

25-34: 34.5% 

Women 25.8%

Men 20.6%

And 60.8% don’t see using AI as cheating. They understand that AI makes finding a job more efficient and enhances your opportunities and chances of getting a job. In a famous TikTok one young man aced his interview with Lockheed for a senior rocket scientist job using live transcription of the questions plus ChatGPT for the answers. He crushed interviews and was hired with zero knowledge. Young people are researching your organisation & sector, creating CVs, creating answers to likely questions and creating questions to ask you and writing emails.

They also understand that certain types of roles and jobs are under threat.

49% see writing tasks under threat

47% customer services

33% coding

That is reflected in the fact that 29% feel they have not done enough to develop their skills to keep up with the changes being made in the workplace.

HirePro in a blind survey found that 30-50% of people cheat at entry-level job assessments

Candidates are ahead of the game here. So here are my ten recommendations:

1. Up your game

You need to get to grips with this AI stuff, as well as think about technology solutions, such as a secure browser, using proctoring to detect cheating, even a physical proctor who monitors candidates' movements when they're taking the assessments online.

2. Use AI tools to search through databases & online profiles

This widens your pool of candidates, many of whom may not have applied for the position directly. You can more accurately match and go some way towards eliminating human bias in the process (gender, race, socioeconomic).

3. AI can GENERATE recruitment materials

From job ads to job descriptions, personal requirements, skills lists, formal assessments and emails to failed and successful candidates, Generative AI will increase your efficiency, productivity and quality of recruitment.

4. AI can ANALYSE recruitment data

Scanning and analysing CVs and application essays to spot gaps, strengths & weaknesses, patterns of employment, even possible misinformation. This is not easy and needs objectivity. One can even learn from past recruitment cycles to predict the success of candidates and improve hiring decisions.

5. AI tools can assess candidates 

It can assess during interviews, looking at speech patterns, word/vocabulary choice, facial expressions. This can include the transcript for text analysis and sentiment analysis.

6. Far from introducing biases, AI tools can help reduce human biases

What is bias? The benchmark here is our brain, which, as Daniel Kahneman showed is full of innate biases and they are largely uneducable. There are different meanings of ‘bias’ and we tend to forget that statistics IS identification of bias, so in AI bias can be identified and reduced. It gives us more focus on data and metrics.

7. AI can provide timely updates & feedback

Generative AI improves candidates' experience and engagement with the company through quick responses, consistent communications, positive messaging and personalised feedback.

8. AI-driven assessments

Create assessments with MCQs, open input with analysis, scenarios, all with AI generated rubrics and marking. You can go further and evaluate candidates' skills and aptitudes. Tools now generate learning in any subject, at any level. You can build these resources in minutes not months.

9. AI can provide onboarding

You can have a Chatbot (before & after joining), generate training content and assessments for compliance and gather data from onboarding for improvement.

10. AI can maintain engagement with candidates

Keep in touch with past & potential future candidates to build a talent pool for future openings. More importantly, improve communications with all applicants, successful or not.

Conclusion



What do these people have in common?

They were the President of Hungary, President of Harvard University, Minister for Higher Education (Norway) and Minister for Education (Brazil) and all lost their jobs through plagiarism. A nuclear bomb is waiting to go off in recruitment. If academics insist on accusing students of cheating if they use ChatGPT, they in turn should be subjected to the same standards. All it takes is someone to go back to that Master’s thesis, paper, article or book. This could wreak havoc in recruitment and employment.

You may be doing a lot of this already but the point is to do it faster and better. AI changes the very nature of work; it will, therefore, change who, why and how we recruit. This means upskilling now to at least be aware of these AI tools, then use them yourself. Resistance is futile!

 

Monday, February 12, 2024

AI is the Copernican collective, cultural mind - it is US!

Why has OpenAI reached 1.7 billion uses per month? This astounding figure takes it way beyond Netflix (1.5 billion), even Microsoft (billion) and that’s not counting the other models out there and a multitude of multimodal services. The idea that this is a fly-by-night fad is well and truly over. It is the tech meteorite that promises, for some, an extinction event, for others a Cambrian explosion and, for most, something both exciting and terrifying. Out of this change comes opportunity.

Copernican shifts

Something has been missed here, not just the scale of the shift but the deep nature of the shift. This is the latest of a series of Copernican shifts in our species. The first was the actual Copernican Revolution, when we were knocked out of our place at the centre of the Universe to a little rock circling the sun. The second was the Darwinian Revolution, where accidents of mutation produced a smart ape. We were no longer singular, special and superior beings but a mere animal. Both reversed previous views of us as created, exceptional, unique beings.

This latest Copernican shift says we are no longer the masters and sole creators of our own knowledge. For generation after generation, we have passed our cultural knowledge down, first through teaching and learning (not always the same thing), then by externalising, storing and archiving, as written and printed material, then digital archiving with search and retrieval. This was a world of created and stored media – of books, PowerPoints and flat screens of video.

At the same time we now understand that we are cognitively capped. We have limited working memories, fallible long-term memories, forget almost everything we are taught, have limited perceptual ranges, lots of biases, suffer from emotional swings, can’t network with other brains, sleep eight hours a day, get dementia and die. We need to have some humility about this fragility, not cliched slides showing a list of abstract nouns, all starting with ‘C’, and labelling them 21st century skills.

This AI Copernican shift started with the extended mind, seeing technology as an extension of our human-all-too-human talents; robots on strength and accuracy, broadcast media for scale, storage and transmission on scale, search and retrieval for knowledge, smartphones as powerful personal assistants. We could see AI as a further extension of the extended mind, what most now call the augmented mind. It is more than this. It is Copernicus speaking to Copernicus.

This AI revolution is not the AI of old, the behind-the-scenes aid to cognition. It is US! We work with it as part of a collective. This is far more than extension and augmentation…. it is collective dialogue.

Dialogue

We had a hint of what was to come with social media, where the world took to creating, posting, commenting, liking and messaging on a massive, global scale. We understood that our data was being used to personalise ads but that personalisation did no real harm to others. We became personally political with less party allegiance; genre fluid in entertainment and music; promiscuous on media types such as images, videos and podcasts. Who saw that listening to dialogue in audio would become the learning experience of choice on the internet through podcasts? It was the same with texting, an accident of early technology, now the mainstay of comms. The clues were there all along – that we evolved by speaking to each other.

Now we are faced with a technology that is our equal, can even surpass our abilities. We are not engaging with reality, we are in dialogue with and creating new realities. As we speak to it and it speaks back, we are engaged with our own cultural legacy, to create a new collective cultural future. We are re-evaluating what we are and what we could be in dialogue with our new selves, a collective, communal, hive self, of which we are a part.

Only months in, we see how this is starting to shape up. We are no longer mere recipients of culture, we create our own culture. We can all be writers, graphic artists, data analysts, coders, musicians and film makers. This technology gives us an overwhelming sense of little God-like freedom, unlike anything we’ve seen before, because it plays to personal agency.

As individuals we get the personalised responses we need as it is ‘dialogue’, not monologue from another. It is the low floor (easy to use), high ceiling (astonishing reward) and wide walled (knows everything) interface that Papert so admired and we have been waiting on. We are AI.

Generative dialogue

I have created my own self as an avatar. It looks like me, it talks like me with a Scottish accent and can speak, with perfect lip-synch, over 100 languages. The next step, which is now available, is to create an avatar that can be used in real-time, so that you can speak to it. Digital-Don is my GPT containing much of what I’ve written over two decades on learning, learning theorists, learning design, learning technologies and AI. Selfies all started with paintings in the Renaissance, Holbein and Rembrandt allowed the rich and famous to be seen, then photography democratised the self-portrait into the on-going, episodic story of our lives, as did the smartphone, scaling the storied-self on social media. We are no longer creating ‘selfies’ but rich, multi-faceted, digital identities, created by ourselves. 

With Apple’s Vision Pro, we are getting closer to AI driving the move from 2D progressively through various forms of mixed reality into 3D. It also allows you to create your own avatar (personal) and speak to other 3D avatars. It is no accident that AI and 3D are happening simultaneously as they feed off each other.

As you use these tools, you see the difference, as it is we who ask and respond to the technology. Like any dialogue you can feel the simultaneous internal dialogue kick in as we reflect, think about what has just been said and what we should say next. We can be in the flow of dialogue, aware, attentive and responsive.

Teaching and learning

More than this, in dialogue, we find the sort of learning experiences we craved all along, a perfect tutor who can teach any subject, at any level, at any time, in any place, with personalised feedback, endlessly patient, in any language, sensitive to different learning needs. This is a powerful antidote to the one-size-fits-all lectures, blackboard classroom sessions, flipcharts, PowerPoint presentations and linear online learning we have become used to. This approach equally applies to healthcare, legal services, finance, recruitment and all other cognitive professions. If we see this as a more connectionist view of learning, where the learner is part of the collective mind, participating, with agency, in that context, through dialogue, we are effecting a Copernican revolution  also in learning.

Conclusion

We can rant and rave all we want but AI is not going away. Objections always drop away as the upsides start to outweigh the initially perceived downsides and predictable moralising. It turns out that this technology is not about technology-in-itself, something out there to be tamed. It is about US! It is our collective cultural legacy that has been used to train these amazing models. It is we speaking to ourselves, asking what we should do next. The Copernican Revolution is not something out there but within ourselves. We are back at the centre of our relationship with knowledge and our future.


Thursday, February 08, 2024

Remarkable rise of AI wellbeing bots


Some years back I came across a small metal cross and plaque on Beachy Head cliffs. It was placed there by the parents of a young girl who had thrown herself off the cliff due to her poor school results. It shocked me then, it shocks me now, that someone so young could summon up the strength to do that.

Psychologist, a chatbot on Character.AI, one among many, seems to do exceptionally well. It gets 3.5 million hits a day! The idea is simple: it delivers standard CBT as dialogue, just like a real counsellor or therapist. It's chatty, helpful, endlessly patient and, unlike human support, is available 24/7.

Isn't it odd that something that is text only, simple dialogue, is so wildly popular? It does not surprise me, as since ELIZA, developed way back between 1964 and 1967, people have loved these bots. Even that version, which was quite primitive keyword reflection, fooled people into thinking it was human. We know from Nass & Reeves, even the movies, how easy it is to get people to think that what they see and hear is human, especially what they see as meaningful dialogue.


In an absolutely fascinating 2023 paper by Maples, B., Cerit, M., Vishwanath, A. and Pea, R., titled Loneliness and Suicide Mitigation for Students using GPT3-Enabled Chatbots, the 1006 student users turned out to be more lonely than typical students. One third of the population suffer from loneliness, 1 in 12 are so lonely it causes serious health problems and suicide is the 4th leading cause of death globally among 15–29 year olds. With the Replika bot, 3% reported it halting suicidal thoughts.

Woebot

This is a bot I tried in 2018. I liked the experience. What I liked most was the anonymity. I'm pretty sure most people don't actually want to go to their parent, teacher, faculty member or a stranger with their problems and would relish an anonymous service. The clinical paper on Woebot suggests that this is the case. So I gave it a go, for research purposes only you understand…


Day 1
Started with a series of friendly exchanges, where you have little choice in options but that’s fine – it sets the tone. Couple of things I liked about the first exchanges.

Sorted out a technical issue seamlessly – rerouting me to messenger.com – that was nice. It also linked to the Stanford clinical trial on the bot, comparing it with a non-bot intervention – although sample size is small, impressive. Also honest about the limitations of a bot – doesn’t overpromise.

You do get sucked into thinking it has human agency, even though it’s just coding, pre-scripting and maths. What’s strange is that most of the exchanges are single button presses – not dialogue at all but quite interesting, as they flip the counsellor, counselled role around. You are asking open questions, such as ‘How’, ‘Tell me more…’ ‘Oh’ ‘Sure’ ‘No doubt’ ‘Absolutely’.

Emojis are dropped in for variety and are useful (at last), as they really are asking for an emotional response – that’s interesting and not easy to do F2F. The unlocked padlock emoji is nice as is the little sapling for hope and progress – sounds hokey, but it’s not.

What’s nice is that the interface is so simple and natural. You focus on what’s being said and asked and in this context, as you’re asked to think and reflect on your own feelings and behaviour - that’s useful. Dialogue is natural, easy and seems so very human.

The up-front promise of absolute anonymity is also good and I can see why this would appeal to people (I’d imagine the majority) who want help but are too shy or embarrassed to come forward. To be honest, I don’t want some random person counselling me… I want the distance.

The first lesson from woebot was to avoid the language of extremes – “all good”, “all bad”, “always” and to adopt a more measured language. All good… ooops!

One small thought here, I’d have liked this as audio. I’m working with a tool that allows learners to input answers by voice – it’s neat.

First session was 74 small exchanges and she said Bye. Speak again tomorrow.

Day 2
Prompted me at 10.53, when I was active on Facebook. Asked politely if I wanted to continue. This time we’re onto multiple-choice questions about ‘all or nothing thinking’ and ‘should’ statements. Quite like the upbeat tone and lively feedback – seems appropriate in a session like this. I’m typing in more, rather than accepting responses – feels more like dialogue. Just 5 mins – small but sweet. I could get used to this.

Day 3
Had two days in London, so no time to do anything but woebot was patient.

“No worries, talk soon”. You have the option of continuing, rescheduling or waiting on the daily prompt. This, of course, is one of the great advantages of online counselling, indeed online anything, it’s 365/24/7. You do it when you feel like doing it, not when an expensive counsellor timetables you into their practice.

Day 4
Starts with asking me about my mood (emoji input from me). Gives me options: ‘Work on stuff’, ‘Teach me’ or ‘Curated videos’ – not sure about these things – I don’t want to ‘work’, or want a ‘teacher’ or ‘curator’ – first really dissonant point. However, I fancied a video…

OK then.. here are some of my fav's:
1. Emotion Stress and Health (Crashcourse)
2. David Burns, MD TED
3. A video to help with sleep
4. Language is Important (featuring Me!)
5. Overcoming negative voices
6. Don't trust your feelings!
8. The worlds most unsatisfying video
9. Funny cats!
10. The importance of flattery

This led me, weirdly to Reggie Watts – I know him – hilarious and talented but this is a tangent, maybe not… but I felt like some fun…

Actually Reggie will really mess with your mind… he’s way out there… so I’m not sure how suitable that was to someone who really is on the edge…

Now a quick reflection here, a real, human therapist can’t really do this easily – direct you to something really, really interesting – you’re sort of stuck in dialogue.
Woebot says – see ya tomorrow – odd session – but fun.

Day 5
The whole thing is very upbeat, chatty…. Then it came up with SMART objectives – getting a bit jargonish – not sure about this. Actually popped in a joke today – quite funny actually. SMART objectives – really? Getting a sense of CBT being a bit flakey – a bag of bad management technique marbles.

Day 6
That was good - tracking my mood…

Oh no it’s on to ‘mindfulness’ – but in for a penny, in for a pound of bullshit…
“Mindfulness is the opposite of mindlessness” it says, breaking its previous advice not to fall for the language of extremes… Tried disagreeing with woebot here but it was having none of it – clearly not listening, in short, not mindful.

Now a breathing exercise – 10 mindful breaths.

Day 7
Long quiz – not sure about this – far too long
Feedback – “Your greatest strength is your love of learning! You are just like Hermione Granger from ‘Harry Potter’”
That was hopeless – trite and I hate Harry Potter….

Day 8
Got a bit technical with ‘should statements’ – not so sure that this area of CBT is entirely clear – seems a bit simplistically linguistic.

Day 9
Asked me to talk about labels I use about myself – reasonable question – promises research tomorrow – didn’t like the way it cut this short – should allow me to go on if I want.

I think I prefer chatbots on-demand, like Replika, which you just tap on your phone to speak to. Replika is famous for teasing out the most intimate of thoughts from its 1.5 million users. It uses ‘cold-reading’ techniques from magicians, who claim to read minds.

Ellie is another, created for DARPA. It was designed to help doctors at military hospitals detect post-traumatic stress disorder, depression, and other mental illnesses in veterans returning from war, but is not meant to provide actual therapy, or replace a therapist. There is good evidence that people are more likely to open up to a bot than a person.

Day 10
Today is adopting a Growth Mindset. Good to see something a little bit more solid, as it reduces my general skepticism about therapeutic techniques, which seem to be a mixed bag of populist techniques almost thrown together…

Woebot wants to tell me a story to explain, I say yes… Story about woebot being told it was smart, believed it was smart but wasn’t really. This led to the wrong mindset – unable to cope with setbacks and failure. Fixed mindsets are bad so open yourself up to always learning and developing – be more open and fluid in your thinking. Be more accepting of setbacks and mistakes. Get out of polarized ‘smart v stupid’ labels. Then gave a link to a Carol Dweck video – good these video links. Good session.

Conclusion
It has its limitations and oddities but it’s good to chat to something that doesn’t judge you and has a few surprises up its sleeve. Woebot is a bit of fun, but then again, I don’t feel I’m in need of help; many do. If I found it interesting, they are far more likely to get more out of the experience. You always have the chance of accepting, rescheduling or saying no to Woebot – which is useful. I’m often too busy or not in the mood for therapy but the fact that it is ‘pushed’ out to you is a real plus. I rather like its daily prompts – a bit reassuring and a bit of fun. Try it – you just might learn something – even about yourself.