Thursday, January 04, 2024

A Large Language Model (LLM) happened once before in history and it changed the world forever…

The Large Language Model GPT, from OpenAI, is one of the great wonders of the modern world. It is captivating, intriguing and above all useful. For the first time in the history of our species we have personal access, on a global scale, to the sum of human culture. When the world speaks to a language model through ChatGPT, we speak to ourselves, the global mind. An LLM, like a brain and language, is unfathomable but dialogue gradually reveals its nature. Yet this is not the first time this has happened.

2300 years ago, in Alexandria, a city at the crossroads of Europe, Asia and Africa, Ptolemy I decided to do the same thing. He collected as much of the world’s literature as he could, paying for much of it, confiscating some, even stealing some, whatever it took to create the sum of known human knowledge, in many languages, from many lands: “to collect, if possible, all the books in the world.” Reading Islam Issa's wonderful book, Alexandria, I was struck by the parallels.

Global dataset

Ptolemy I had global ambitions and written knowledge was all within his reach, except for China, the only other place on earth where writing had been invented. He wrote to many other leaders and had no cultural bias – anything from any land in any language was welcome. It is thought that something in the order of 700,000 scrolls were assembled into this one, huge, dataset.

In gathering content for the Library he used other people's content: he reached out to other rulers, bought other libraries, had every book his agents could find copied, and begged, borrowed and even stole. This is close to what has happened with the training data for LLMs, where a huge corpus of text, that would take 22,000 years to read, has been used to train the GPT model. GPT was trained on around 300 billion words. The average scroll in Classical Greece was around 10,000-15,000 words. If we take 700,000 scrolls at an average of 12,000 words each, the total number of words in the Library was around 8.4 billion. That was impressive!
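The arithmetic above is easy to check. Here is a minimal Python sketch that uses only the figures quoted in this post (700,000 scrolls, an assumed average of 12,000 words per scroll, and roughly 300 billion training words for GPT). The function names are my own, purely for illustration.

```python
# Figures quoted in the post; the per-scroll average is an estimate, not a measurement.
SCROLLS = 700_000
WORDS_PER_SCROLL = 12_000
GPT_TRAINING_WORDS = 300_000_000_000


def library_words(scrolls: int = SCROLLS, words_per_scroll: int = WORDS_PER_SCROLL) -> int:
    """Estimated total number of words held in the Library."""
    return scrolls * words_per_scroll


def scale_ratio(training_words: int = GPT_TRAINING_WORDS) -> float:
    """How many times larger the GPT training corpus is than the Library estimate."""
    return training_words / library_words()


if __name__ == "__main__":
    print(f"Library estimate: {library_words():,} words")
    print(f"GPT corpus is roughly {scale_ratio():.1f} times larger")
```

On these assumptions the Library held about 8.4 billion words, so the GPT corpus is around 36 times larger. Change the average scroll length and the ratio shifts, but the order of magnitude holds.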

Like GPT, it was easily the largest dataset in the world, way bigger and therefore more useful than smaller libraries. The Alexandrians also had a technical advantage - the means of production and delivery – papyrus. Egypt owned the papyrus trade, limited supply to foreign buyers and therefore controlled the means of distribution, just like the data and compute clout of a Microsoft for ChatGPT. Egypt even embargoed papyrus in 190 BC, intentionally restricting the growth of other libraries, like Pergamum. Scale mattered.

They even invented the idea of metadata for large datasets. First came data preparation: translating everything into Greek, giving a single data standard. Seventy-two Jewish scholars were employed and paid to translate the Bible. Then came labelling, with each scroll tagged with the author’s name and location. Further metadata was produced with categories such as doctors, historians, legislators, philosophers, rhetoricians, comic poets, epic poets and miscellaneous. Within these categories they then went alphabetical. Finally, a complete catalogue was produced. All of this increased the efficacy of research through more efficient search. They understood that the interface, ease of access to knowledge, mattered. This is what gave ChatGPT its status as the fastest adopted technology in the history of our species - ease of access.

The point was not just to collect all known papyrus scrolls, it was to learn from them and to globalise knowledge. The parallels between that ancient act and the current appearance of Generative AI are fascinating. Knowledge and access to that knowledge is power and this is a story of power. That power was instantiated when the library encouraged debate, discussion and outputs.

Access was the real key to success. Ptolemy allowed any scholar from anywhere to come and use the dataset, and they did. This is why this new AI tech is so exciting - anyone has access to it at little or no cost, from anywhere. Once we democratise intelligence, we democratise (to a degree) power. LLMs are currently affecting research and outputs in many different fields or sectors, just like the Library of Alexandria, which accelerated research, productivity and the creation of ideas for centuries to come.

Rapid achievements

Foundationally in mathematics, Euclid of Alexandria wrote his 13 volume Elements here, which included the first ever written algorithm, a method to calculate the greatest common divisor of any two numbers. His theorems and, more importantly, proofs were deduced from axioms. Familiar examples include the proof that the angles of a triangle add up to 180 degrees and Pythagoras’s Theorem. It is this logical rigour that is remarkable, influencing the entire history of mathematics and science. It was used as the main textbook in mathematics for over 2000 years, well into the 20th century, and for centuries university students used this book as part of the quadrivium. Beyond this he wrote on the rigour of mathematical proof, conic sections, the geometry of spheres and number theory. In his Phaenomena, Euclid aims at astronomy with a treatment of spherical geometry. Like LLMs, mathematics lay at the root of this project.
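Euclid's method for finding the greatest common divisor still works exactly as described in the Elements, and it fits in a few lines of Python. This is a sketch of the usual modern remainder-based form rather than Euclid's original repeated-subtraction wording; the function name is my own.

```python
def euclid_gcd(a: int, b: int) -> int:
    """Greatest common divisor of two non-negative integers, by Euclid's method.

    Repeatedly replace the pair (a, b) with (b, a mod b). When the remainder
    reaches zero, the other number is the greatest common divisor.
    """
    while b != 0:
        a, b = b, a % b
    return a


if __name__ == "__main__":
    # 1071 = 2 * 462 + 147, 462 = 3 * 147 + 21, 147 = 7 * 21 + 0
    print(euclid_gcd(1071, 462))  # prints 21
```

Python ships the same idea as math.gcd in its standard library, which is a nice reminder of how long a good algorithm can last.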

Conon of Samos developed conical mathematics. In astronomy they discovered the planet Mercury, compiled a catalogue of stars and developed a heliocentric view 1800 years before Copernicus. Map drawing and geography flourished with Claudius Ptolemy’s book Geography. He also saw mathematics, an Alexandrian obsession, as being superior to the metaphysics of Plato and Aristotle. Eratosthenes calculated the earth’s orbit around the sun, creating an accurate calendar. Realising that the earth was round, he also calculated its circumference using rods and their shadows at two locations (he was accurate to within 50 miles). Foundation models are essentially language as maths with outputs as freshly minted language. Mathematics, once again, has proved its worth in the transmission and creation of knowledge.
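The rods-and-shadows calculation is simple enough to sketch in Python. The idea is that the difference in shadow angle between two places is the same fraction of a full circle as the distance between them is of the whole circumference. The numbers in the example (a 7.2 degree angle and 5,000 stadia between the two sites) are the figures traditionally reported for Eratosthenes, not figures from this post, so treat them as assumptions. The function names are mine.

```python
import math


def shadow_angle_degrees(rod_height: float, shadow_length: float) -> float:
    """Angle of the sun away from vertical, from a vertical rod and its shadow."""
    return math.degrees(math.atan2(shadow_length, rod_height))


def circumference(distance_between_sites: float, angle_difference_degrees: float) -> float:
    """Circumference of the earth, in the same unit as the distance supplied.

    The angle difference is to 360 degrees as the distance is to the circumference.
    """
    return distance_between_sites * 360.0 / angle_difference_degrees


if __name__ == "__main__":
    # Traditionally reported figures (assumed here): 7.2 degrees and 5,000 stadia.
    print(f"{circumference(5000, 7.2):,.0f} stadia")
```

With those traditional inputs the answer comes out at about 250,000 stadia. How close that is to the modern figure depends on how long a stadion was, which is still debated.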

Medicine also advanced with anatomy, dissection (even on live criminals!) and the pulse as a diagnostic sign. Something similar is happening with AI in healthcare and science: new drugs and new materials are being discovered, and 200 million proteins were unlocked with AI, saving 1 billion years of research.

In literature, there was the invention of the dictionary, and creative output in poetry, drama, music, sculpture and mosaic work, just as we are seeing with the augmentation of art and ideas by LLMs.

It did not stop there, as astonishing feats of engineering also emerged from the work at the Library. Archimedes studied here and went back to Syracuse to build sophisticated war machines to defend it against the Romans. The mechanical astrolabe was invented, along with mechanical objects such as keyboard instruments, water clocks and automatons such as singing statues, chirping birds and dancing puppets. There were self-trimming oil lamps, syringes, lab equipment for chemistry, a coin-operated vending machine for Holy Water, fire engines, even a steam engine! With AI we are also seeing the rise of speaking robots, Tesla’s self-driving cars and the astonishing Optimus robot.

Just like modern LLMs, deepfaking started almost immediately in Alexandria’s Library with scams and forgeries. What’s new? But this was a temporary problem and soon overcome. Alexandria grew rapidly as a centre of mathematics, art and philosophy. It came to an end hundreds of years later, in the early fifth century, when the mathematician Hypatia, a woman of mathematical renown and intellectual stature, was murdered by a Christian mob. The Classical world was nearing its end and monotheistic religion was starting to dominate leaders and the intellectual world. By the late 4th century AD, Christianity was banning all books but scripture, and the library was in decline, largely through censorship and eventually the banning of non-Christian books as heretical. Yet its influence remains as a conduit for knowledge, creation and invention. There was no fire; it suffered a slow decline through intolerance and misguided moral certainty. There is, perhaps, another lesson to be learnt here - not to let moralisers destroy what is good on the back of their dogmatic belief that learning and innovation are bad.

Conclusion

Alexandria teaches us a lesson: when we pool resources and create something unique that benefits the whole of our species, wonderful things happen. It became the intellectual centre of the world for several centuries, one of the most important cities in the world, for the Ptolemies, Romans, Arabs, Ottomans, French and British.

With GPT we have achieved something similar but this will not take centuries, or even decades, to prove its worth. It is already bearing similar fruit, wonderful things; real leaps in research, going multimodal, with dialogue, voice, images and video. Significant advances in unlocking 200 million proteins, drug discovery and millions of new materials have already emerged, along with billions of uses a month.

We are tapping into the hive mind, just as the Alexandrians did over two thousand years ago, to further improve the minds of all. We can do this if we focus on learning. We should not allow it to get crushed by the usual moralisers and religiously inspired end-of-days dogma but look for the bounty that it offers. We cannot say with certainty what will happen but we can be sure that it will be full of surprises and challenges. AI is the new Alexandria.

PS

This is linked to my idea for an AI University. The Library at Alexandria was the first University. Plato's Academy and Aristotle's Lyceum came earlier but they were really schools built around one man and his ideas. Alexandria was a different vision; cheap, open, secular and multicultural.

It had no faculty other than those interested in cataloguing and keeping the system going. It had no formal teaching, just debate and discussion with further writing and practical invention. The idea of researchers as teachers came in the early 19th century with Humboldt.

It was also a powerful generator of ideas and inventions, not too abstract but as keen on the real world as ideas themselves. It was the retreat back into the scholastic world of theological beliefs that banned the books and put an end to the Library after 600 years. 

 
