Thursday, May 23, 2024

Context matters? Yes, and AI has made its move...

Just back from Japan, and the ‘point and read’ Google Lens on the smartphone was a saviour. Japanese inscriptions in museums, gardens, menus in restaurants, products in shops… using Lens was easy and reliable. No sooner were we back in the UK than a raft of announcements from OpenAI, Google and Microsoft put ‘contextual computing’ on the map. One of the primary problems in learning and performance support is CONTEXT. By that I mean the teacher (human or AI) needs to know WHO, WHAT, WHERE and WHEN to deliver great support. Context always matters.

RAG is a good example in AI and shows the direction of travel. Retrieval Augmented Generation says: retrieve the relevant additional context first, then include it in the prompt to get your answer. Yann LeCun takes this further and sees a multimodal space for contextual analysis as essential for AGI.
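For the curious, here is a minimal sketch of the pattern. The retriever is a toy keyword matcher and llm_complete is a placeholder for whatever model API you actually use; none of this is a real vendor API.

```python
# A minimal sketch of the RAG pattern: retrieve context, then prompt with it.
documents = [
    "Google Lens translates text seen through the phone camera.",
    "Copilot+ PCs index on-screen activity to give task-aware help.",
    "RAG grounds a model's answer in retrieved reference documents.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by crude keyword overlap with the query."""
    q_words = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def llm_complete(prompt: str) -> str:
    """Placeholder: swap in a call to your model of choice."""
    return f"[model answer to: {prompt[:60]}...]"

def rag_answer(query: str) -> str:
    # Retrieve first, then prompt: the retrieved text grounds the answer.
    context = "\n".join(retrieve(query, documents))
    prompt = f"Using only this context:\n{context}\n\nAnswer: {query}"
    return llm_complete(prompt)

print(rag_answer("How does RAG ground an answer?"))
```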

Smartphone as remote control for life 

Where AI has been lacking is in knowing your immediate context and intent. That nut is now being cracked, as AI’s multimodal abilities hit the market. By this I mean its ability to ‘hear’ things, ‘see’ through your camera, identify things in video or know what’s on your screen. Both OpenAI and Google have started to crack multimodal delivery: whether it is text, images, speech or video, it understands. Microsoft have gone after your work tasks: what application you are using, what’s local on your PC. Copilot+ knows what you are up to so it can provide the right support. Agents are starting to understand you and your context so they can act as your assistant.

Your smartphone has become a robot that knows you. You can point at anything and ask questions. It’s a hearing, seeing, understanding, almost sentient thing. In a sense your smartphone is fast becoming a remote control for your life.

Context aids cognition

Context is a particular boon in learning, helping deliver learning, formative assessment and assessment in situ, even in imagined 3D spaces. We can transcend the tyranny of text to do so much more in the real world: doing real things, with real people, in real places.

In terms of performance support, you will be able to point your camera at anything and ask a question: what is this, give me more detail, how do I use this, teach me to use it… You may need new knowledge, more knowledge, to learn unfamiliar tasks, to apply things, or to learn changes to familiar tasks… all of these famous ‘moments of need’ can be satisfied by contextual analysis.

If you are working on a screen, it knows what software you are using, what is in your document, image or PPT, and what you need to improve in that document, image, PPT or spreadsheet. It may either help you or do that task itself. In a hospital you may get help in that specific radiology room, with that specific medical apparatus. In a factory it may know the machines, how to use them, give real health and safety guidance in that specific area, and show you what to do when things go wrong. You get the idea. Context aids cognition.

Agents

Widening the perspective, the introduction of ‘agents’ signals something else. These are entities that act on your behalf and may know what your overall goal, sub-goals and tasks are. This is clearly a direction of travel: smart GPTs and recent announcements show agentic workflow, a fancy name for giving AI agents control to support you in specific tasks or goals.
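Stripped to its bones, an agentic workflow is just a loop: plan the next step towards the goal, act with a tool, observe the result, repeat. Here is a toy sketch; the tools and the decide() policy are illustrative placeholders, not any real agent framework.

```python
# A toy agentic loop: pick a tool for each sub-goal, act, observe, repeat.
def search_docs(query: str) -> str:
    return f"results for '{query}'"

def draft_email(notes: str) -> str:
    return f"email drafted from: {notes}"

TOOLS = {"search": search_docs, "draft": draft_email}

def decide(goal: str, history: list[str]) -> tuple[str, str] | None:
    """Stand-in for a model call that plans the next step."""
    if not history:
        return ("search", goal)       # step 1: gather context
    if len(history) == 1:
        return ("draft", history[0])  # step 2: act on what was found
    return None                       # goal satisfied, stop

def run_agent(goal: str) -> list[str]:
    history: list[str] = []
    while (step := decide(goal, history)) is not None:
        tool, arg = step
        history.append(TOOLS[tool](arg))  # act, then observe the result
    return history

print(run_agent("summarise this week's customer complaints"))
```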

Moderna and others are using up to 400 agents within their organisations to help with specific tasks. With this new agentic workflow, we will see significant productivity gains, as agents literally automate what humans used to do.

Context options

When discussing the context that computers can capture, we generally refer to the various dimensions that provide situational information to enhance the interaction between the user and the computer. It is worth identifying the sheer range of contextual support that is now possible (a sketch of what such a context record might look like in code follows the list):

1. Physical Context

Your smartphone has long used GPS, as well as accelerometer and gyroscope data, to determine motion and orientation. It knows what’s in the vicinity, say a traffic jam that may hold you up. It also knows the time and date, holidays, calendar events, timezones, even the weather.

2. Personal Context

It knows your identity: not just your name, age and gender, but possibly user profiles with data about your socio-economic status. A step further is what you are currently doing, often inferred from app usage, keyboard activity or wearable devices. Then there are your ‘preferences’: likes, dislikes and choices inferred from past actions, purchases and explicit settings. At the physiological level, your heart rate, stress level and other biometrics are knowable, captured via wearables such as bands, watches and rings. Eye, head, face and gesture tracking can be used to read not just your movements but also your intentions. Voice may also be read for signs of emotion, stress, interest and intent. All of this can provide health and activity insights. You are literally a mass of fixed and variable data as you move through the world.

3. Social Context

Don’t think it stops there. Your social context, inferred from social media, identifies your friends, connections and interactions. Information about other users in the physical vicinity, detected through Bluetooth, Wi-Fi or other proximity technologies, is also usable. Then there’s your long and detailed communication history across email, chat, calls, search and other forms of communication.

4. Cultural Context

It is not difficult to use the language settings of the device, and the content you typically engage with, to determine your socio-economic position and cultural interests. Bourdieu identified the forms this cultural capital takes: your memberships, associations, the language you use, your accent, the sports you like and the credentials you have and value.

5. Task Context

More specific task or application data can be gleaned from the screen, application or activity you are engaged in at that moment. Inferences can also be made from recently completed tasks or frequently performed actions, even transcripts from meetings. What are you watching, did you pause, cut out, review? All of these micro-behaviours are useful. Menus can be adapted to suit your preferences.

6. Device Context

What device are you on? A smartphone, tablet, laptop, desktop or wearable, along with its current battery status, can influence how the device manages resources. We may even know what bandwidth you have available over Wi-Fi or telecoms, the available space on your device, CPU use and available memory.

7. Application Context

What applications are currently in use and what is their state? Analysis of the frequency and duration of tool and app usage can help predict future behaviour; even permissions tell a story.

All of the above can be used to provide personalised notifications and reminders or sell you stuff with targeted advertising. By integrating these various dimensions of context, computers can offer more intuitive, responsive, and personalized user experiences.
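To make this concrete, here is a minimal sketch of a context record aggregating the seven dimensions above, with a toy rule that tailors support to it. All field names and sample values are illustrative, not any real vendor’s schema.

```python
# A sketch of a context record covering the seven dimensions above.
from dataclasses import dataclass, field

@dataclass
class Context:
    # 1. Physical: where and when
    location: tuple[float, float] = (51.5, -0.12)   # lat, lon
    local_time: str = "2024-05-23T09:00"
    # 2. Personal: who, and current state
    user_id: str = "anon"
    activity: str = "walking"
    heart_rate: int | None = None
    # 3. Social: who is nearby
    nearby_devices: list[str] = field(default_factory=list)
    # 4. Cultural: language and content preferences
    language: str = "en-GB"
    # 5. Task: what you are doing right now
    active_task: str = "reading"
    # 6. Device: what you are doing it on
    device_type: str = "smartphone"
    battery_pct: int = 100
    # 7. Application: what is open
    foreground_app: str = "browser"

def support_hint(ctx: Context) -> str:
    """Toy rule: tailor support to the captured context."""
    if ctx.device_type == "smartphone" and ctx.activity == "walking":
        return "offer voice guidance, not long text"
    return "offer on-screen, task-specific help"

print(support_hint(Context()))
```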

Conclusion

Context is the new frontier for computing, and with edge computing, local AI and agents, we can see how much work can be made more efficient. This scales people, which is what most organisations, wallowing in the flat slums of productivity, need. We also have to be honest and admit that these agents may automate what humans thought they needed to do.

