Wednesday, March 06, 2019

Summarising learning materials using AI - abundance of stuff

We’ve been using AI to create online learning for some time now. AI, we believe, works far better on precise goals, such as identifying key learning points, creating links to external content, creating podcasts using text to speech and the semantic interpretation of free-text input by learners. We’ve done all of this, but one thing always plagues the use of AI in learning: the source material.


Paucity of data, abundance of stuff
Walk into many large organisations and you’ll encounter a ton of documents and PowerPoints. They’re often over-written and far too long to be useful in an efficient learning process. That doesn’t put people off: in many organisations, 50-120 or more PowerPoint slides delivered in a room with a projector still passes as training. It’s not much better in Higher Education, where the one-hour lecture is still the dominant teaching method. The trick is to have a filter that can automate the shortening of all of this stuff.

Summarisation
To summarise or précis documents (text), reducing them to the ‘need to know’ content, there are three processes:

1. Human edit
No matter what AI techniques you use to précis text, it is wise to first edit out, by hand, the extraneous material that learners will not be expected to learn: supplementary information, disclaimers, who wrote the document and so on. With large, well-structured documents, PDFs and PPTs, it is often easy to simply identify the introductions or summaries in each section. These form ready-made summaries of the essential content for learning. Regard this step as simple data cleansing or hand washing!

Learning professionals are good at this, as we often know what to get rid of and how to simplify text, images, even video, for learning. Documents especially are usually over-written and not designed for learning. PowerPoints less so, but even then we all know that they are usually over-engineered with far too much text.
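As a small illustration of this preparation step (not the WildFire pipeline itself), here is a minimal sketch that pulls the raw text out of a .pptx deck so it can be triaged and hand-edited before any AI summarisation. It assumes the python-pptx package is installed; the file name is purely illustrative.

```python
# Minimal sketch: extract the raw text from a .pptx deck so it can be
# triaged and hand-edited before any AI summarisation step.
# Assumes the python-pptx package is installed; the file name is illustrative.
from pptx import Presentation

def extract_slide_text(path):
    """Return a list of (slide_number, text) tuples for every slide."""
    deck = Presentation(path)
    slides = []
    for number, slide in enumerate(deck.slides, start=1):
        chunks = []
        for shape in slide.shapes:
            if shape.has_text_frame:
                chunks.append(shape.text_frame.text)
        slides.append((number, "\n".join(chunks)))
    return slides

if __name__ == "__main__":
    for number, text in extract_slide_text("compliance_training.pptx"):
        print(f"--- Slide {number} ---")
        print(text.strip())
```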

The simple act of reducing large documents, PowerPoints and video to essential 'need to know' content is often ignored in favour of an even longer, over-engineered 'learning experience', where the content is rebuilt from scratch with expensive media production. It is often just as appropriate to reduce and summarise this source material into screen-friendly text or other media, such as:
  • abstract
  • checklist
  • infographic 
  • short PPT
  • short podcast
  • short video
Now you are ready for further steps with AI....

2. Extractive AI
This technique produces a summary that keeps the original sentences intact and only ‘extracts’ the most relevant material. We usually do a quick human edit first, then extract the relevant shortened text, which can then be used in WildFire or on its own. This is especially useful where the content is already subject to regulatory control (approved by an expert, lawyer or regulator), for example medical content in the pharmaceutical industry or compliance.
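To make the idea concrete, here is a minimal, deliberately naive sketch of extractive summarisation using only the Python standard library: sentences are scored by the frequency of the words they contain and the top scorers are kept verbatim, so the approved wording is never rewritten. It illustrates the principle only, not the method used in WildFire; the stop-word list is a stub.

```python
# Naive extractive summarisation: keep the highest-scoring sentences verbatim.
import re
from collections import Counter
from heapq import nlargest

# Stub stop-word list; a real one would be far longer.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "are",
              "for", "on", "with", "that", "this", "it", "as", "be", "by"}

def extractive_summary(text, max_sentences=3):
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    words = re.findall(r"[a-z']+", text.lower())
    freq = Counter(w for w in words if w not in STOP_WORDS)

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens if t not in STOP_WORDS)

    # Pick the top-scoring sentences, then restore original document order.
    top = set(nlargest(max_sentences, sentences, key=score))
    return " ".join(s for s in sentences if s in top)
```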

3. Abstractive AI
This approach rewrites the source, using machine learning and training data to generate a summary in fresh wording. Note that it needs a large domain-specific training set, and by large we mean as large as possible: some training sets run to gigabytes of data. That data also has to be cleaned.
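As an illustration only (the tooling here post-dates this post and is not what WildFire uses), an abstractive summary can now be produced with a pre-trained sequence-to-sequence model via the Hugging Face transformers library. The model name, length limits and sample text below are assumptions for the sketch.

```python
# Minimal sketch of abstractive summarisation with a pre-trained model.
# The model choice, length limits and sample text are illustrative assumptions.
from transformers import pipeline

summariser = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

document = """Long policy document text goes here. In practice this would be
the hand-edited, cleaned source material described above, split into chunks
that fit within the model's input limit."""

result = summariser(document, max_length=60, min_length=20, do_sample=False)
print(result[0]["summary_text"])
```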

Conclusion
The end result is automatically shortened content from large original documents, PowerPoints and even video transcripts.

It is usually best to input summarised material into WildFire: rather than delivering intense training on huge pieces of content, you get the essentials. The summaries themselves can be useful in the context of the learning experience. So if you have a ton of documents and PowerPoints, we can shorten them quickly and produce online learning in minutes not months, at a fraction of the cost of traditional online learning, with very high retention.
