Mark Betnel


March 30, 2023

Plato’s allegory of the cave presents a model of human interaction with the world: humans imprisoned inside a cave, chained facing a wall so that all we can see are the shadows cast against that wall. We spend our time observing those shadows, constructing theories that explain and help us understand their behavior, believing that the shadows are “real”. Farther up toward the mouth of the cave, behind us where we can’t see, there is a light source and objects moving around, casting the shadows we see on the far wall. These objects are the “real” ones. Philosophers are supposed to get themselves unchained, leave the cave, and see the real world. Then they’re supposed to come back and try to free those still imprisoned.

If I map the allegory onto language, then my thoughts are the “real” objects, and the words I produce are the light that carries the signal about those thoughts to others, who, on hearing the shadow-words, construct theories about the real thoughts that produced them. When they respond with words of their own, casting shadow puppets back to me, I form theories about the thoughts that produced them, and thus refine my own thoughts.

One use of large language models (LLMs) is to facilitate communication by taking over some of the work of composing the text that I might send to someone else. This introduces another layer of translation into my system of shadow-words: I have a thought I wish to communicate (or a topic I wish to communicate about), so I compose a prompt for the LLM. The LLM produces text that matches the prompt. If I am careful, I will edit the result, thus getting one layer of feedback about my thoughts and causing me to refine them. The produced message will not be precisely what I intended (it never is and never has been), but I will send it anyway because it’s close enough. The receiver won’t read the message; it will be too long, and anyway, they will know that it was produced by an LLM, not by me, so there’s no insult in feeding the message into their own LLM. The receiver’s LLM will interpret the result and possibly summarize it for the receiver, who will form theories about the mind that produced the text before responding. Or perhaps the receiving LLM will skip that step and just compose the reply. Either way, I will receive a reply mediated by one or more translations through another LLM, finally causing me to update my own thoughts based on the feedback.

Are there instances where these extra layers will enable communication with greater fidelity? Probably. Will there be instances where the layers introduce more noise? Or steer my thoughts? Certainly. And if I don’t control the intermediary LLMs, how can I be sure that the steering isn’t intended to manipulate me, the way every other algorithmic communication system has been repurposed for manipulation?

In the allegory, the philosopher is supposed to fight to get free of the cave and then fight to free everyone else. LLMs, by contrast, are more like a series of mirrors and lenses added between the shadow-puppet casters and the trapped humans.