Friday, July 26, 2024

Choosing a GenAI model? Tricky... quick guide...

Choosing a GenAI model has become more difficult at the professional, even personal, level. This is to be expected in this explosive market but at least the frontrunners are clear. Note that the investment and entrepreneurial spirit to do this are not trivial, so it is unlikely that these names will change much in the near future. I suspect we'll have a few big vendors, some open source, with many falling by the wayside - that's the way markets work in tech.

BLOOM was a salutary lesson here. A large language model (LLM) created by over 1000 researchers from 60+ countries and 250+ institutions, it was released to help advance research work on LLMs. Its focus on being widely multilingual seemed like a strength but turned out to give little advantage. The lack of a chatbot, no RLHF and being outrun by Llama and Mistral didn’t help.

But it is not easy to keep up to date, stay familiar with the options and discriminate between them to choose the model that works best for your needs. Here’s the good news: you can swap them out, but be careful, as they differ in many ways.

Different models

Now that we have a range of models available, not all the same, AI has become a dynamic field that benefits the consumer, with new models being regularly released. But it is not just LLMs that are being released. There are the top-end SOTA models: GPT-4o, Claude 3.5 Sonnet, Gemini 1.5 Pro. Then there are open-source options such as Llama 3.1 and Mistral Large 2. There is also a range of smaller models. Here's a quick crib sheet...

They all come with a range of additional functionality: integration within large IT environments, tools, mobile apps, image generation (not all), different context windows, web search (not all), validation with sources, different uploading and editing capabilities, foreign language capabilities, along with specialist services such as GPTs or Artifacts. It is complex when you get into the detail.

Choosing a model

Choosing a model depends on breadth of functionality, speed, cost, quality, access to external tools and availability. This is often poorly understood.

We have found that, depending on what you are delivering at the front-end, it is often not difficult to swap out models at the back-end, getting reductions in price, better functionality and speed. New releases often come with choices on size, price and functionality. A good technology partner will keep a close eye on all of this and make the right choices for you.
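To make the swapping point concrete, here is a minimal sketch in Python of one way a back-end can be kept swappable. It is an illustration under stated assumptions, not a description of any vendor's API: `TextModel`, `EchoModel`, `ShoutModel`, `BACKENDS` and `answer` are all hypothetical names, and the two stand-in models make no network calls. In a real project each stand-in would wrap a vendor's own client.

```python
from dataclasses import dataclass
from typing import Protocol


class TextModel(Protocol):
    """Hypothetical interface: anything that turns a prompt into a text reply."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class EchoModel:
    """Stand-in for one hosted model (hypothetical, makes no network call)."""

    name: str

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


@dataclass
class ShoutModel:
    """Stand-in for a second model that behaves differently (hypothetical)."""

    name: str

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt.upper()}"


# Registry keyed by a config value: swapping models means changing one string.
BACKENDS: dict[str, TextModel] = {
    "vendor-a": EchoModel("vendor-a"),
    "vendor-b": ShoutModel("vendor-b"),
}


def answer(question: str, backend: str = "vendor-a") -> str:
    """Front-end code calls this and never touches a vendor client directly."""
    model = BACKENDS[backend]
    return model.complete(question)
```

The front-end only ever calls `answer`, so moving from one model to another is a one-line configuration change. The differences between models (context window, multimodality, availability) still have to be handled inside each wrapper, which is where the care is needed.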

It is only when you work with all of the major models, in real projects with real clients, that you really understand the strengths and weaknesses of each, especially when you hit what appear to be arbitrary limits or unreliability on access and speed.

It is easy, for example, to assume that all are multimodal, all are available in Europe, all are available as mobile apps on both Android and iOS, and all have massive context windows and code interpreters. That is not the case. Their editing and integration capabilities also vary. Some have very specific functionality that others do not have, like GPTs and Artifacts.

Open source

Don’t assume that because a model is open source, like Llama and Mistral, it is free and allows you to dispense with the paid services. These models come with licences that set terms of use. Open-source software, and we’ve had experience of this in other domains such as with VLEs and LMSs, is not easy to use on its own and, especially in AI, needs considerable in-house expertise.

Nevertheless, they open up a new territory for development and need to be considered. Meta believes this is a move towards the integration of open-source models, just as Linux has become an industry standard. Integrity and transparency are also increased, as you can test these models yourself.

Conclusion 

Having a real ‘working’ knowledge of these models is essential for any real implementation and especially development work. Before working with a consultant or vendor, ask them a few questions on the differences between models. Avoid vendors who just use one model, as that often shows a prejudice towards ‘their’ model. Asking these questions really does separate the competent from the bandwagon jumpers.

