Friday, August 09, 2024

Does Derrida's View of Language help us understand Generative AI?

Many years ago, I had lunch with Jacques Derrida in the staff club at the University of Edinburgh. He was as aloof and obscure as you would imagine. The French philosopher taught at the Sorbonne and the École Normale Supérieure, and his work gained huge attention in the United States, influencing literary theory, cultural studies, and philosophy. He later taught at various institutions, including Yale University and the University of California, Irvine.

What is less known is that he was influenced by J.L. Austin’s philosophy of language and speech acts, but went in another direction, with a purist focus on text in his books Of Grammatology (1967), Writing and Difference (1967), and Speech and Phenomena (1967), where he introduced his deconstructive methods.

His fascinating insights into language, especially his concepts of deconstruction, différance, the fluidity of meaning and the dislocation of text from authorship, resonate powerfully with the way Large Language Models (LLMs) operate.

Big picture

But there is a bigger picture than just language here. His target is the Metaphysics of Presence, the philosophical tradition that seeks to establish a fixed, unchanging foundation of meaning or reality, privileging belief in a fundamental essence or truth that underlies appearances and change. This turns Structuralism in on itself, de-anchoring structures and denying the objectivity of science and reality. Like many in the Critical Theory tradition, he ‘deconstructs’ the large metaphysical and secular narratives through the deconstruction of their texts. His ideas challenge traditional assumptions about meaning, truth, and interpretation, offering a complex and nuanced view of language and texts. This was somewhat prophetic of Generative AI, LLMs in particular, where we feel a sense of dislocation of text from its past and origins. LLMs produce texts not as a truth machine but as something that has to be interpreted.

Text not speech

Jacques Derrida, like Heidegger, introduced an entire vocabulary of new terms, which he invented or qualified in his philosophy (différance, intertextuality, trace, aporia, supplement, polysemy etc). He had an unerring focus on texts, as he saw Western thought and culture as over-dominated by speech (phonocentrism). This led him to elevate the written word, importantly and oddly, seen separately from its author. Rejecting the phenomenology of Husserl, with its focus on consciousness and sense-data, of which speech is a part, he saw traditional philosophy as tied to the language of speech, as opposed to writing. This prioritisation of speech over writing was based on the assumption that speech is a more immediate, reliable, genuine and authentic expression of thought. Text freed us from this fixity of thought.

Generative AI has revived this focus on text as LLMs were introduced as text only tools and adopted on a global scale. Anyone with an internet connection could suddenly create, summarise and manipulate text in the context of dialogue. Derrida would have been fascinated by this development. He would have a lot to say about the way they are created, trained and delivered.

Texts

Knowledge is constructed through language and texts, which are ‘always’ open to interpretation and re-interpretation as there are no totalising narratives that claim to provide complete and final explanations. Everything is contested. So, Derrida encourages dialogue, critical engagement, and the deconstruction of traditional educational structures.

Teaching and learning should be a dialogical process, which aligns with AI dialogue if there is a mutual exchange of ideas, as the interaction allows for the exploration of different viewpoints and the questioning of assumptions. It should also involve critical engagement with texts and ideas, encouraging students to reflect on the underlying assumptions and power dynamics that shape knowledge.

He highlights the central role of language in shaping our understanding of reality, the fluidity and indeterminacy of meaning, and the interconnectedness of all texts. This idea is a foundational element of his broader philosophical project of deconstruction, which seeks to uncover and challenge the assumptions and oppositions that underpin traditional ways of thinking.

Deconstruction

He is best known for developing the concept of deconstruction, a critical approach that seeks to expose and undermine the assumptions and oppositions that structure our thinking and language. Derrida used deconstruction to show how texts and philosophical concepts are inherently unstable and open to multiple interpretations. He used various techniques to analyse a text and reveal how its apparent meaning depends on unstable structures and oppositions, such as presence/absence or speech/writing. For him, text is not a truth machine but a human activity that is complex and ambiguous; a similar view could be taken of generative LLMs.

To say, as he did in Of Grammatology (1967), that “there is nothing outside of the text”, on first appearance, seems ridiculous. The Holocaust is not a text. What he meant was not the text itself but something beyond. What that beyond is remains problematic, as Derrida refused to engage in much interrogation of his terms. But by stating that there is nothing outside the text, Derrida aims to deconstruct traditional hierarchies that privilege speech over writing or reality over representation.

He argues that every aspect of our understanding and experience is mediated through language and texts. There is no direct, unmediated access to reality; everything is interpreted through the ‘text’ of language, culture, and context. This means that meaning is always dependent on the interplay of signs within a given text and the context surrounding it. He challenges the idea that words have stable, inherent meanings. Instead, he posits that meaning is generated through the relationships between words and their differences from each other.

Deconstruction in learning

Jacques Derrida held distinctive and complex views on teaching and learning that emphasise the fluidity and uncertainty of knowledge. He pushes the interpretative, dynamic, and contingent nature of knowledge. His ‘deconstructive’ approach sees critical engagement, dialogue, and the continuous questioning of assumptions as fundamental. This pedagogical model sees the teacher facilitate interpretation and exploration, and education as an open-ended process that values ambiguity. It is a more fluid, reflective, and inclusive approach to education that aligns with the complexities and uncertainties of contemporary knowledge.

Deconstruction in particular plays a crucial role in his views on teaching and learning. Deconstruction involves critically examining and unravelling texts and concepts to reveal hidden assumptions and contradictions. It challenges the foundational assumptions and binaries, such as true vs. false, author vs. reader, that underpin education. This encourages students to question and critically analyse accepted knowledge rather than passively absorbing it, an attitude we see frequently expressed in relation to LLMs.

Teachers need to guide students to interpret texts in a way that exposes multiple meanings and interpretations, in a process of engaging with texts and ideas in a way that reveals their complexity and ambiguity. That is because Derrida viewed knowledge as inherently unstable and contingent, not fixed and objective.

Instability of Meaning

Derrida brilliantly argued that meaning in language is never fixed but perpetually ‘deferred’, with meaning arising in the act of its production. Words derive meaning through their differences from other words and their ever-changing contextual usage. This constant flux and unsettled nature of meaning is mirrored in LLMs. These models generate text based on patterns learned from vast corpora of text, with the meaning of any output hinging on the context provided by the input and the probabilistic associations within the model's structure. Just as Derrida proposed, the meaning in LLM outputs isn't fixed or stored as entities in a database to be retrieved; it is created afresh and shifts with different inputs and contexts. This relational nature of language, captured in LLMs, implies that meaning is always deferred, never fully present or complete, leading to his concept of "différance".
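The point that the meaning of a word hinges on its context can be sketched with a deliberately tiny model. This is an illustration, not a real LLM: the contexts and their probabilities below are invented for the example.

```python
# Toy illustration (not a real LLM): the next-word distribution shifts with
# context, so what a word "means" is never fixed, only deferred to context.
# The probability tables are invented for the example.

context_model = {
    ("river", "bank"): {"flooded": 0.6, "eroded": 0.4},
    ("savings", "bank"): {"charged": 0.7, "closed": 0.3},
}

def next_word_distribution(context):
    """Return the distribution over next words for a two-word context."""
    return context_model.get(tuple(context), {})

# The same word "bank" points to entirely different continuations
# depending on what precedes it:
print(next_word_distribution(["river", "bank"]))    # watery senses
print(next_word_distribution(["savings", "bank"]))  # financial senses
```

A real LLM does the same thing at vastly greater scale: the distribution over the next token is recomputed for every new context, so no output has a fixed, retrievable meaning.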

Différance

Derrida uses the term ‘différance’ in two senses, both ‘to defer’ and ‘to differ’, to indicate that meaning is never final and is constructed by differences, specifically by oppositions. It suggests that meaning is always deferred in language because words only have meaning in relation to other words, leading to an endless play of differences. The concept of différance captures the essence of meaning as always deferred and differentiated, never fully present in a single term but emerging through a network of differences. This is akin to the text generation process in LLMs, where each word and sentence is produced based on its differences from and deferrals of other possible words and sentences. The model’s understanding is built on these differences to predict and generate new coherent text.

Intertextuality

Intertextuality then emphasises the interconnectedness of texts. Derrida thought that language is inherently intertextual: texts are inextricably linked to, and derive meaning from, other texts, creating a web of meanings that extends infinitely. This intertextuality means that no text can be understood in isolation, as its meaning is shaped by its references to other texts.

This is an inherent quality of LLMs. However, text in a Large Language Model (LLM), like GPT-4 or Claude, is not stored in the traditional sense of having a database of phrases or sentences.  The text, as tokens, is interconnected in a highly abstracted form, through embeddings and neural network parameters. The model doesn’t store explicit sentences or phrases but instead captures the underlying statistical and semantic patterns of the language during training. These patterns are then used to generate contextually appropriate text during inference.
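The claim that a model retains patterns rather than sentences can be made concrete with a toy ‘training’ step. This is a hedged sketch, not how a transformer actually works: we simply count word co-occurrences in an invented three-sentence corpus, then discard the corpus, so that only statistical associations remain and no original sentence can be read back.

```python
# Sketch of "patterns, not sentences": train by counting co-occurrences,
# then throw the corpus away. The corpus is invented for the example.
from collections import Counter
from itertools import combinations

corpus = [
    "writing shapes meaning",
    "meaning shapes interpretation",
    "interpretation shapes writing",
]

cooccur = Counter()
for sentence in corpus:
    words = sentence.split()
    for a, b in combinations(words, 2):
        cooccur[frozenset((a, b))] += 1

del corpus  # the "training data" is gone; only associations remain

# Association strengths survive; the sentences themselves do not:
print(cooccur[frozenset(("shapes", "writing"))])
print(cooccur[frozenset(("writing", "meaning"))])
```

An LLM's parameters play an analogous role to this table, at enormous scale: a compressed record of statistical relationships, not a retrievable archive of text.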

Understanding a text involves recognising its relationship to other texts. Similarly, LLMs are trained on a vast and diverse corpus of texts, making their outputs inherently intertextual. The responses generated by LLMs reflect the influences of countless other texts, creating a rich web of textual references that Derrida described.

Trace

A ‘trace’ is the notion that every element of language and meaning carries traces of other elements, which contribute to its meaning. This trace undermines the idea of pure presence or complete meaning. For example, the word ‘present’ carries traces of the word ‘absent’, as our understanding of presence is inherently tied to our understanding of absence. This chimes with the way tokens are represented as embeddings: each token’s vector is shaped during training by its co-occurrence with other text, so it carries mathematical ‘traces’ of all the other text the model was trained on.
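A minimal sketch of how an embedding can carry ‘traces’ of other words, assuming invented three-dimensional vectors (real models learn vectors with hundreds or thousands of dimensions from data):

```python
# Hedged sketch: a word's embedding is shaped by the contexts it shared with
# other words, so related words end up geometrically close. The vectors
# below are invented for illustration only.
import math

embeddings = {
    "present": [0.9, 0.1, 0.4],
    "absent":  [0.8, 0.2, 0.5],
    "banana":  [0.1, 0.9, 0.0],
}

def cosine(u, v):
    """Cosine similarity: 1.0 means identical direction, 0.0 unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# 'present' sits far closer to its opposite 'absent' than to an
# unrelated word, because opposites share contexts during training:
print(cosine(embeddings["present"], embeddings["absent"]))
print(cosine(embeddings["present"], embeddings["banana"]))
```

The geometric closeness of ‘present’ and ‘absent’ is the mathematical analogue of Derrida's point: each term carries the other as a trace.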

Aporia

With ‘aporia’ we reach a state of puzzlement or doubt, a term Derrida often used to describe moments in texts where contradictory meanings coexist, making a definitive interpretation impossible. LLMs, when they reach such a state, famously hallucinate, or make an effort to resolve the dialogue between user and model. The model may even apologise for not getting something right first time. It expresses puzzlement, even doubt, about its own interpretations and positions. It is in an aporetic state.

Writing and the Supplement

Derrida focuses on écriture (writing) and the idea of the ‘supplement’, that which adds to and completes something else, but also replaces and displaces it. Derrida used this concept to show how what is considered secondary can actually be fundamental. This may have come to pass with LLMs, where new text is tied to and produced from old text but replaces it entirely. Every new word is freshly minted; there is no retrieval or copying of stored text.

Writing does not just record knowledge but creates and transforms understanding as a supplement to speech. Teaching should therefore emphasise the active role of writing in shaping and reshaping knowledge. The ‘supplement’ represents the idea that meaning is never complete and always requires additional context or interpretation. This concept implies that learning is an ongoing process of adding new perspectives and insights rather than reaching a final, complete understanding. It helps deconstruct and reconstruct knowledge in meaningful ways.

We can see how Generative AI is a ‘supplement’ to forms of writing, whether a summary, rewriting, expansion or even translation, as an open process of development rather than a final product.

Absence of Authorial Intention

Derrida also challenged the idea of authorial intention, suggesting that the meaning of a text emerges from the interaction of the text with readers and other texts, not from the author's intended meaning. LLMs, devoid of intentions or understanding, generate outputs through statistical associations rather than deliberate meaning-making, so the focus falls on the interaction between the LLM's output and the reader. LLMs de-anchor text from the authored data used in their training. The meaning in LLM responses arises from patterns in the data rather than any inherent intention, aligning perfectly with Derrida's de-emphasis on authorial intention.

Textual Play and Polysemy

Derrida highlighted the playful nature of language and the multiplicity of meanings (polysemy) that any word or text can have. LLMs exhibit this same playfulness and multiplicity in their responses. A single input can lead to various outputs based on slight contextual variations, and even across models, showcasing these models’ ability to handle and generate numerous forms of language.

Criticism 

Avoiding the reality of even speech restricts debate to texts. Yet it is not clear that education is, as he claims, ‘phonocentric’, and his evidence for this is vague and unconvincing. His denial of oppositional thought, which he tries to deconstruct through reversal, denies biological distinctions such as gender and the persistence of a subject in relation to objective reality. It becomes an excuse for avoiding debate by reducing the other person’s views to a vague text. It matters not what your intention was, only what was said.

He refuses to define or even defend his concepts, and it is not clear that concepts such as ‘différance’, which he defines rather confusingly as both deferral and difference, are of any relevance in education and learning. As his writing moved further into wordplay, playing around with prefixes and salacious references to death and sex, it drove him further away from any relevance to education, teaching and learning theory, apart from literary theory.

Deconstruction of texts is his method of instruction, but it is also his only method of instruction. Ultimately it is an inward-looking technique that cannot escape its own gravity. No amount of debate can produce enough escape velocity to deny the results of deconstruction. His obsession with double entendres, puns, sex and death also detracts from his theorising. Derrida's writing style is often seen as dense and obscure, and some, like John Searle and Jürgen Habermas, criticised him for lacking clarity and precision in his arguments. Searle accused Derrida of "bad writing", and Habermas critiqued his approach as relativistic and lacking commitment to rational discourse.

With Derrida we are at the tail-end of critical theory, where the object of criticism is reduced to texts and methods at the level of the ironic. His impact on education has been almost nil, as there is little that had enough force or meaning to have impact. Having rejected all Enlightenment values, large narratives, even speech, Derrida’s postmodernism is its own end in itself. His reputation lives on in the self-referential pomposity that postmodernism created, mostly limited to academia, and even there only in a subset of the humanities, where spoof papers that mimic its vagueness and verbosity have been regularly accepted for publication.

Our exploration of Derrida and LLMs comes up against the fact that Generative AI is now multimodal, not just text. It can engage in speech dialogue, generate images, audio, avatars, even video. Here, to find useful insights, one has to stretch Derrida to breaking point!

Conclusion

The parallels between Derrida's theories and LLMs reveal a fascinating intersection between philosophy and technology, illustrating how philosophical ideas about language can find new life in the digital age. He takes critical theory down, away from larger narratives, groups or individuals, to language itself. Through ‘deconstruction’ he looks at the ambiguity of texts, de-anchored from reality, even their authors. Derrida's deconstructionist view of language and the functioning of Large Language Models both emphasise the fluid, dynamic, and context-dependent nature of meaning. While Derrida's theories stem from deep philosophical inquiry into language and meaning, the operational mechanics of LLMs echo these ideas through their probabilistic, context-sensitive text generation. Both challenge the notion of absolute, fixed meaning and highlight the complexity and interconnectedness inherent in linguistic communication, and the dislocation of text from its origins and provenance.

Bibliography

Derrida, J., 2001. Writing and difference. Routledge.

Derrida, J., 1998. Of grammatology. Baltimore: Johns Hopkins University Press.

Derrida, J., 1982. Margins of philosophy. University of Chicago Press.

Pluckrose, H. and Lindsay, J.A., 2020. Cynical theories: How activist scholarship made everything about race, gender, and identity—and why this harms everybody. Pitchstone Publishing (US&CA).
