Wednesday, March 13, 2024

AI moves from 2D to 3D

Quite a remarkable achievement by DeepMind. I wrote about this idea, that AI accelerates the move from 2D to 3D, in my 'Learning in the Metaverse' book and in the 2nd edition of my book on GenAI, coming out on May 4.

This software turns language prompts into actions within 3D worlds. For the first time, the agent actually understands the 3D world in which it operates and can perform tasks just like a human.

How it works

All it needs are images from a screen of the game/environment and text instructions. It can therefore interact with any virtual environment. Menus, navigation through the environments, actions and interactions with objects are all executed. DeepMind partnered with eight games companies so that it could perform different tasks across their games. SIMA is the AI agent that, after training, perceives and understands a variety of environments, so that it can take actions to achieve an instructed goal.
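For those who like to see the shape of this in code, here is a minimal sketch of the loop described above: a screen image and a text instruction go in, an action comes out, and this repeats. Everything in it (the Frame and Action types, run_episode, toy_policy) is my own hypothetical illustration, not DeepMind's interface, and the toy policy is a simple keyword rule standing in for the trained model.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical types for illustration only; not DeepMind's actual interface.
Frame = List[List[int]]  # a screen image as rows of pixel values


@dataclass
class Action:
    kind: str   # e.g. "key" or "mouse"
    value: str  # e.g. "W" or "click"


def run_episode(
    policy: Callable[[Frame, str], Action],
    get_frame: Callable[[], Frame],
    apply_action: Callable[[Action], None],
    instruction: str,
    max_steps: int = 10,
) -> List[Action]:
    """Repeat: look at the screen, choose an action, send it to the environment."""
    history: List[Action] = []
    for _ in range(max_steps):
        frame = get_frame()                  # only input: pixels from the screen
        action = policy(frame, instruction)  # plus the text instruction
        apply_action(action)                 # keyboard/mouse style output
        history.append(action)
    return history


def toy_policy(frame: Frame, instruction: str) -> Action:
    """Stand-in for a trained model: a keyword rule, assumed for the demo."""
    if "forward" in instruction.lower():
        return Action("key", "W")
    return Action("key", "noop")
```

The point of the sketch is that the loop needs nothing from the game itself except its screen and a way to send inputs, which is why the same agent can, in principle, be dropped into any virtual environment.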



Transfer

Even more remarkable is the fact that agents seem to transfer learning, so playing in one environment helps them succeed in others.



Multimodal now also 3D

Far too much debate around AI focuses on text-only LLM capabilities and not on their expansion into multimodal capabilities, now including 3D worlds. The goal is to get agents to perform tasks in the virtual and/or real 3D world intelligently, like a human.

Applications

Its obvious application is in performing risky tasks in high-risk environments but also in any 3D world. It can also be used in online 3D worlds to help with training. These are the early signs of a tutor or buddy within these worlds, or of a simulated patient, employee or customer for training.

Full paper
