Socrates and the possibility of artificial intelligence
What can political philosophy teach us about the limits of knowledge and the possibility of building mechanical minds?
The field of artificial intelligence seems to stand apart from every other field of study today, operating on the most exciting and dangerous frontiers of man's technological capability. Because of this unique position, A.I. research garners reactions ranging from unbridled optimism to catastrophic pessimism. Pessimists like Eliezer Yudkowsky think efforts at building artificial general intelligence are so risky as to be practically suicidal. Optimists like Marc Andreessen, on the other hand, see in A.I. a utopian future of unimaginable wealth, productivity, and knowledge.
It's necessary to step back from the drama to ask: what, precisely, is all the optimism and pessimism about? Many people have used the current generation of cutting-edge tools, and few deny their exciting and fearsome character. Such is the excitement that it becomes tempting to look further and further ahead to the most extreme possible outcomes without ever pausing to look to the past. That is, it is all too tempting to believe that there is no room for old teachings to moderate and inform the discussion around the promise and perils of A.I., or even its essential character.
Yet, for the student of classical political philosophy, the headlines, whitepapers, and dramas unfolding across the A.I. landscape can occasion what might be characterized as a rediscovery of essential insights that Socrates himself brought to light in ancient Athens—insights which give rise to a cautious and inquisitive moderation. In order to investigate those insights, one must start with a spare and provisional definition of artificial intelligence as it exists today, in the form of machine learning.
What is machine learning?
It would be premature to speculate about what artificial general intelligence might entail, because no such thing yet exists; the final section of this essay will return to such possibilities. The A.I. tools that people do encounter today, e.g., large language models (LLMs), fall into a category of technologies known as deep learning. Deep learning is a subset of the broader field of machine learning. Machine learning refers to a family of statistical techniques that extract general patterns from vast data sets so that a model can respond accurately to novel prompts; deep learning does so by means of many-layered neural networks.
When speaking of machine learning it is all too tempting to anthropomorphize, in part because the design is loosely inspired by the neural connections in human brains, but also because the surface-level interactions people have with machine learning technologies are designed to feel like interactions one would have with a person behind a screen. For example, one can query in plain English, “What were the circumstances which led to the Thirty Years’ War?” and receive a plausible answer that might receive passing marks in a high school exam on the topic.
Behind the facade of human-like interaction, though, the fact remains that machine learning tools are fundamentally computers running software programs to manipulate information into a useful form. In the case of machine learning, the software involves transforming large amounts of data in service of two broad goals, the first serving the second. The first goal is to “train” the neural network to generalize relevant patterns from the data it is given. For a relatively crude model that identifies letters and numbers in images, this might involve a human providing millions of images of written text, each paired with the letter or number it depicts. At first the neural network will do nothing more than guess letters and numbers at random. But the software is designed to use the human-provided labels to correct its predictions, thereby “training” it to detect letters and numbers accurately. The second goal is to use the now-trained model to respond to novel prompts. Continuing the previous example, a well-trained letter-detecting model can be given images it has not seen before and accurately identify the characters they contain.
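For readers who prefer to see the two goals in concrete form, the following is a minimal, illustrative sketch in Python, assuming the scikit-learn library is available. Its small bundled set of handwritten digits stands in for the millions of letter images, and a simple logistic-regression classifier stands in for a deep neural network; the shape of the process is the same.

```python
# A minimal, illustrative sketch of the two goals described above, assuming
# scikit-learn is installed. Its small handwritten-digit dataset stands in for
# "millions of images," and logistic regression stands in for a neural network;
# the process (train on labeled examples, then predict on novel ones) is the same.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

digits = load_digits()  # 8x8 images, each paired with the digit it depicts

# Goal 1: "train" the model on human-labeled examples.
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0
)
model = LogisticRegression(max_iter=2000)
model.fit(X_train, y_train)  # predictions are corrected against the labels

# Goal 2: use the now-trained model on images it has never seen.
predictions = model.predict(X_test)
print("accuracy on unseen images:", model.score(X_test, y_test))
```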
The core mechanism underpinning all machine learning is prediction through mathematical means. Every neural network that has been trained to respond to prompts with satisfactory answers has been trained to predict a sequence of bits, i.e., 1s and 0s, that will satisfy the needs of the human who submitted the prompt. As this is somewhat abstract, it's worth considering a very basic example of a much simpler technique that does not qualify as "artificial intelligence," but which performs a similar operation. Linear regression is a statistical technique that takes as input a set of data with measurable parameters (in this simple case, represented by X and Y coordinates) and calculates a linear equation that "fits" that data. A linear equation may be said to "fit" a set of data if it seems to "take the shape" of the data from which it is derived, such that points along the line pass very close to the actual data, i.e., with sufficiently low error. An adequately well-fit equation can then be used to predict where other, similar data might fall. A common textbook application uses linear regression to predict house prices from a set of historical sales and relevant variables, e.g., square footage, number of rooms, etc.
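To make the house-price example concrete, here is a minimal sketch using NumPy's ordinary least-squares fit; the handful of data points is invented purely for illustration.

```python
# A minimal sketch of linear regression "fitting" a line to data and then
# predicting a novel point. The data points are invented for illustration.
import numpy as np

square_feet = np.array([900, 1200, 1500, 1800, 2400])            # measurable parameter (X)
price = np.array([150_000, 200_000, 240_000, 290_000, 380_000])  # observed prices (Y)

# Find the line price = slope * square_feet + intercept that minimizes squared error.
slope, intercept = np.polyfit(square_feet, price, deg=1)

# Use the fitted equation to predict where similar, unseen data might fall.
predicted = slope * 2000 + intercept
print(f"predicted price for a 2,000 sq. ft. house: ${predicted:,.0f}")
```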
Neural networks are almost incomparably more complex than linear regression models, but the two share a very important core characteristic. Namely, both use statistics and historical data to build a mathematical model, which can then be used to make predictions. In the case of LLMs, the predictive step is called "next-token prediction," which means that, given a prompt, an LLM will generate its response one token (roughly, a word or word fragment) at a time until a suitably complete answer has been produced. Although next-token prediction sounds quite a lot more like the way linear regression functions than the way human intelligence functions, Ilya Sutskever, Chief Scientist at OpenAI, believes that it alone is enough to endow a machine with a capacity for so-called "general" intelligence:
What does it mean to predict the next token well enough? It's a deeper question than it seems. Predicting the next token well means you understand the underlying reality that led to the creation of that token.
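To see how mechanical next-token prediction is at its core, consider a deliberately crude toy sketch: a character-level model that "learns" only by counting which character tends to follow which, and then generates text one character at a time. Real LLMs replace the count table with an enormous neural network operating over subword tokens, but the generation loop has the same shape. The training text below is invented for illustration, and the output will be crude, which is rather the point.

```python
# A toy sketch of next-token prediction: a character-level model that "learns"
# only by counting which character tends to follow which, then generates text
# one character at a time by choosing the likeliest next one.
from collections import Counter, defaultdict

training_text = "the cat sat on the mat. the dog sat on the rug. "  # invented data

# "Training": tally how often each character follows each character.
follow_counts = defaultdict(Counter)
for current, nxt in zip(training_text, training_text[1:]):
    follow_counts[current][nxt] += 1

def predict_next(char: str) -> str:
    """Return the character most likely to follow, based on the tallies."""
    return follow_counts[char].most_common(1)[0][0]

# "Generation": start from a prompt and extend it one character at a time.
text = "the c"
for _ in range(20):
    text += predict_next(text[-1])
print(text)  # crude output, but produced by nothing more than prediction
```

Whether such prediction, scaled up enormously, amounts to understanding "the underlying reality that led to the creation of that token" is precisely the question at issue.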
Readers may sense in this description a certain skepticism about the relationship between the function of neural networks and the faculty of human minds which we call “intelligence.” That is intentional, not least because we simply do not know the causes behind the intelligence of the human mind. Honest neuroscientists will readily admit that the field is pre-paradigmatic, meaning that it still awaits the kind of fundamental breakthrough that would give it a unifying framework. Certainly, a great many neuroscientists and A.I. researchers would seek to draw stronger connections between cognitive science and contemporary work on artificial intelligence. Holding such a view is only natural for scientists and entrepreneurs whose hopes for success hinge on their ability to resolve the complexity of the human mind, but their overzealousness commits them to the principal error of bad philosophy—namely, to lack adequate awareness of what one does not know, and thus to stand with confidence atop an intellectually uncertain foundation.
If intelligence pertains to knowledge, as most seem to agree, then an understanding of knowledge and its limits will be essential for proceeding to investigate the possibility of mechanical minds. The most prominent man to declare that he knew what he did not know was Socrates. We now turn to his view of knowledge in order to shed light on modern A.I. efforts.
Socrates on knowledge
Socrates based his philosophical investigations on “a new approach to the understanding of all things,” distinguishing himself from the “madness” of philosophers who came before him by his characteristic combination of wisdom and moderation (Strauss 122-123). Many of his predecessors attempted to explain the workings of the cosmos by theorizing about the materialistic roots of things, i.e., the matter and processes that lead to the creation of the various beings that exist. Socrates, on the other hand, preferred to situate knowledge within the horizon of experiences that are natural for man in common life.
Thus, a Socratic treatment of knowledge necessarily begins with an important question. Why did Socrates pursue a different path than his predecessors did? What did Socrates think was wrong with the scientific investigation of beings in terms of matter and motion?
To help answer these questions, we can trace Socrates’s own philosophical development, drawing on the detailed account provided by Dustin Sebell in The Socratic Turn.
Problems of causality
Before he turned away from natural science, in favor of his own new approach, the young Socrates studied natural science in hopes of explaining the world. He looked upon the world of human awareness, full of things like tables and dogs and people, and sought to explain everything by reducing each thing to a set of underlying causes. In particular, natural science was concerned with material and efficient causes. Studying material causes involves investigating the essence of matter, itself. Studying efficient causes involves investigating that which is said to move matter into position. For example, one might seek to explain a wooden table by investigating the material of the table (i.e. wood) and the movement or positioning of that material (i.e. the manipulation of wood involved in carpentry). Young Socrates seems to have believed that such a style of inquiry, if successful, could explain all of the beings that man encounters.
For natural science to succeed, it had to satisfy two main conditions. First, it had to confirm its presupposition that nothing comes from nothing, i.e., that “nothing can come to be without a cause.” This amounts to a belief that all things, which both come into being and perish, are underpinned by an eternal, stable material that does neither. Such a material was called “an Atlas” by pre-Socratics, but modern readers will recognize echoes of this idea in the discovery of the atom and the principle of mass conservation. Second, science had to provide an adequate explanation of the “way of being” of each thing in terms of its matter, or physical parts and elements, and its motion, or the physical processes leading to its generation and corruption. Again, readers may recognize a modern version of this idea in claims that every single thing known to pre-scientific awareness is merely some composition of physical and chemical interactions. Everything is “just atoms,” nothing more, and the full essence of each thing can be reconstituted from a view of atoms, alone. Failing to satisfy either of these two conditions would amount to an admission that natural science, although useful in many respects, cannot explain the being of things with genuine knowledge.
In the course of his investigations as a young natural scientist, Socrates ran into a series of problems. The first pertains to the inadequacy of material and efficient causes for explaining the existence of things. As Sebell argues, “the need natural science has to reduce the way of being of each thing to its materials or elements coexists, uneasily it seems, with the need it also has to reconstitute that same way of being out of them again” (Sebell 44). That is, if there is nothing that truly exists but a homogeneous “soup” of identical, stable and eternal matter; and if there are a wide variety of heterogeneous beings which come to exist and to perish, necessarily out of that soup; then the problem of form arises. Recall, the second condition for the success of science demanded that it explain the ways of beings, not atoms. Atoms, by definition, must take many forms if they are to constitute tables and dogs and people. But what explains the existence of those forms, as wholes, in the first place? Anything worthy of being called a cause must account for why each form exists and is the way that it is. That why, or account, of each discernible form cannot exist at the atomic level, but according to materialistic natural science atoms are the only substrate in which anything can exist (without coming into being out of nothing). Here, Socrates seems to have reached an impasse.
Having recognized that matter and motion are not causes of particular forms of existence, but merely conditions for existence, Socrates turned his hopes toward that which seemed to provide the order and form which is otherwise absent: mind. Consider the example of dogs. What makes a dog a dog? For any given instance of a dog, what is responsible for the class of “dog” and the identification of a particular dog with that class? Nothing internal to the dog bears that responsibility. Rather, the mind that perceives the dog also classifies it. Likewise, where there is both a big dog and a small dog, beside one another, what can be said to explain the bigness of the one and the smallness of the other? It cannot be reduced to their position in space, nor to anything internal to each, as an even bigger dog would make the big dog seem small, and vice versa. Again, it is rather the perceiving mind’s relational capacity that provides for the phenomenon in question. As Sebell, again, writes, “The primacy of form, together with the fact that what accounts for form is some mind’s eye, means that the coming into being and perishing to which the forms are subject can be fully understood if—and only if—a mind is, and is known to be, their orderer or cause” (Sebell 71). The view of mind-as-ordering-cause is known as “teleology.” Whereas material and efficient conditions could be described as a “bottom-up” view, starting with the smallest possible parts and working towards the whole, teleology takes a “top-down” view, starting with a whole, or an end (telos), that determines what would be good for that whole, and working towards ordering the parts. When a human makes a machine, he can be said to do so teleologically—with an end in mind, he orders the parts.
As promising as a teleological science may sound, Socrates’s hopes for such a science were soon dashed by further insoluble problems. Teleology focuses on knowledge of ends, or purposes. That is, it "supposes that knowledge of what is 'best for [each thing]' is somehow the same as knowledge of the cause of each thing" (Sebell 81). Each thing is somehow explained by pointing back to the purpose by which it was ordered. But in considering purposes there is a distinction to make between that of the whole and that of the parts that make up the whole. Does purpose pertain to each thing, separately? Or does it pertain to all things, as a whole? Socrates found that this investigation led to two very different conclusions, each incompatible with a firmly grounded natural science. When teleology focuses on parts, it falls into the trap of needing to presuppose a prior standard, i.e., a nature inherent to each thing to which it can point back. Instead of answering the relevant causal question, e.g., why a dog is the way that it is, such a teleology ends up merely describing how the dog fulfills what is needful according to the nature of dogs, leaving the presupposed nature of dogs unexplained. Furthermore, a teleology of parts quickly becomes reductionist, or concerned with parts, and parts of parts, all the way down to the level of the atom. Socrates had already rejected reductionist natural science as untenable, and this path would lead to the same dead end. But the opposite path, a teleology focused on the whole, presents its own problems. Namely, such a view terminates in the wholly unscientific mystery of divine will. Can it be said that there is an “end” in “mind” for all of being? Is there, at the highest cosmic level, one universal purpose of being? If so, Socrates recognized that such knowledge could never be acquired by man. In the end, “teleology comes perilously close to—it may even be the same as—theology” (Sebell 77). Faced with the dogmatic incoherence of reductionism, on the one hand, and the dogmatic mists of theology on the other hand, Socrates was forced to abandon teleology as a ground for non-dogmatic, reason-based science.
With both approaches, the bottom-up and the top-down, lying in ruin, is there anywhere left to turn? It is from this point that Socrates set off on his “second sailing” in search of knowledge.
Knowledge through dialectic
The genius of the Socratic way is that it proceeds from a starting point that is very difficult to dispute. Namely, it begins from that which is already known to pre-scientific awareness, i.e., the noetically heterogeneous beings. His new approach dictated that an inquiry into the being of these things—of tables and dogs and humans, as well as the many human phenomena like courage and justice and love—can, and in fact must, proceed on their own terms. Both the “too low” view of reductionism and the “too high” view of teleology were found to obscure these heterogeneous beings, as they are experienced, and each terminated in dogma as a result.
Furthermore, Socrates's approach avoids reductionism by refusing to focus exclusively on the sense perceptions of material reality. That is, his method does not blind itself to opinions about tables and dogs and humans, in favor of physically indisputable “facts.” Quite the opposite. He recognizes that the world present to our awareness, and which we hope to explain, presents itself in two distinct aspects, both of which must be acknowledged in any inquiry that does not suffer from “a kind of narrowness or incompleteness of perspective” (Sebell 108). A complete inquiry has to take into account both the presence of things through the body, in terms of sense perceptions, and the presence of things through the soul, in terms of speeches and beliefs.
In Natural Right and History, Leo Strauss gives another account of the Socratic method, which takes soul and speech into account. He writes:
Socrates started not from what is first in itself or first by nature but from what is first for us, from what comes to sight first, from the phenomena. But the being of things, their What, comes first to sight, not in what we see of them, but in what is said about them or in opinions about them. (Strauss 124)
Starting from one’s own common sense, and the common sense opinions of others, is crucial for Socrates because the alternatives tempt the inquirer to become obsessed with abstractions. Abstractions prove to be very useful because they are graspable and easily manipulated towards a use. But they purchase their usefulness at the cost of drawing man into a kind of delusion about the world—namely, that it is essentially homogeneous, rather than a complex and often contradictory composition of heterogeneous parts. Such abstractions may be useful, but that doesn't make them real or true. Furthermore, the utility and power of abstractions can come to distract the thinking man from his most important faculty, which is perception in the mind’s eye of what is real for man qua man. As Strauss puts it, referring to man’s common sense opinions:
Socrates implied that disregarding the opinions about the natures of things would amount to abandoning the most important access to reality which we have, or the most important vestiges of the truth which are within our reach… opinions are thus seen to be fragments of the truth, soiled fragments of the pure truth. (Strauss 124)
But, of course, Socrates was not satisfied with mere opinion, as opinions are almost always wrong, or at the very least incomplete. An opinion does not imply knowledge. In order to arrive at understanding, opinions have to become knowledge somehow. To that end, Socrates was famous for asking his interlocutors challenging questions, because he saw in his method of conversation the opportunity to refine opinions about a topic into genuine knowledge pertaining to some fundamental question. This conversational method is known as “Socratic dialectic.” Strauss describes Socratic dialectic as “the art of conversation or of friendly dispute” through which contradictory opinions are distilled into knowledge. No one opinion about a thing will ever be purely true, but each genuinely-held opinion contains “soiled fragments of the pure truth.” “Philosophy consists, therefore, in the ascent from opinions to knowledge or to the truth, in an ascent that may be said to be guided by opinions.” (Strauss 124)
In summary, then, Socrates seems to have believed that genuine knowledge—knowledge pertaining to the fundamental questions that man faces in his direct experience of living—is only attainable by having friendly conversations with people whose common sense opinions result in contradictions, which can then be resolved through questioning. If the right questions are asked, then the interlocutors’ false opinions can be challenged and ultimately relinquished, while preserving and refining the fragmentary and perhaps ineffable truths embedded in them.
Adherents to modern versions of physical and social science will, no doubt, bristle at certain aspects of this description of knowledge. People have become habituated to think of knowledge as something that institutional science somehow creates. By "institutional science," people tend to mean organizations, like government and university research labs, carrying out a systematic process that includes formulating plausible hypotheses, collecting vast amounts of data, and using statistical methods to try to falsify the hypotheses through hard-nosed analysis.
Anyone who enjoys the fruits of modern medicine and technology must admit that modern scientific techniques have amazing power and utility. But likewise it must be acknowledged that the entire edifice of such material science is built atop the same foundation of philosophy just now articulated. That is, science cannot reject but can only build on the fundamental premise that there is a natural whole, which is composed of parts, and which the mind’s eye can directly perceive, and about which a reasoning being can make informed judgments. As Socrates discovered in his investigations as a young scientist, without this firm basis, science dissolves into incoherence and dogma. Strauss, again, explains beautifully:
All knowledge, however limited or “scientific,” presupposes a horizon, a comprehensive view within which knowledge is possible. All understanding presupposes a fundamental awareness of the whole: prior to any perception of particular things, the human soul must have had a vision of the ideas, of the articulated whole. (Strauss 125)
Thus we are returned to the core faculty underpinning human knowledge and understanding, which is a certain “fundamental awareness of the whole.” It would seem that without the soul’s capacity to intuit the whole of being, and the many noetically heterogeneous beings, there would be no way to form opinions, or even to perceive the parts that make up the whole. Those who speak of “the hard problem of consciousness” may well be speaking about something along these lines. And, to repeat, this mystery is a core reason that honest neuroscientists consider their field to be pre-paradigmatic.
Having traced the course of Socrates’s quest for knowledge, we can now return to modern questions about artificial intelligence. Given a fresh understanding of the character and limitations of knowledge and intelligent minds, what light can be cast on the prospects for building mechanical forms of intelligence?
Artificial intelligence in light of Socrates
In order to inquire into the possibility and limits of artificial intelligence, a standard of intelligence is required. As we have said, intelligence seems to pertain to the acquisition and application of knowledge. At a glance, various LLM technologies seem to already achieve something like this standard, in that they can provide relevant information in response to a given query. Is that not "application of knowledge," in some sense? Yet, haven't computers always done this, to an extent, since the advent of databases and the Internet? And doesn't the growing sense of fear and excitement indicate that many people want to measure artificial intelligence against a much more ambitious standard, which might equal or exceed human intellectual capacities? For clarity, then, we can formulate the question at hand as follows. Is it possible for people to build computers that exhibit the kind of full intelligence characteristic of human (or possibly even super-human) beings?
Our foray into Socratic science and dialectic will provide a standard for human intelligence. A truly intelligent computer would have to meet or exceed the human ability to derive knowledge from an experience of being, and it would have to be able to adequately apply that knowledge to some end. We will leave open the provocative question of whether or not computers will be able to choose ends of their own, provisionally assuming that humans will remain "in the loop" of such an artificial general intelligence.
The first obstacle to building fully intelligent computers is that computers fundamentally lack awareness. In place of awareness, computers have all manner of ingresses for information in the form of devices like cameras, microphones, thermometers, gyroscopes, etc. Smartphones are so amply outfitted with these devices as to have become eerie, quasi-aware things that more than a few people have suspected of discreetly spying. But measuring various kinds of stimuli, no matter how accurately or surreptitiously, does not meet any reasonable standard for “awareness,” and comes nowhere near the “fundamental awareness of the whole” that people clearly exhibit. Awareness is a deeply mysterious faculty of the soul, manifesting for us effortlessly and opening us to an unmediated flow of experience of the world in which we exist. Unless a computer can exhibit such a capacity, it would seem that the full scope of human intelligence is unavailable to machines. After all, without this fundamental awareness of the whole, the entire process that follows cannot begin. Parts cannot be identified, opinions cannot be formed, and dialectic in a quest for knowledge cannot commence.
Even if a lack of awareness prohibits machines from reaching "full" intelligence, we should not stop the analysis just yet. Perhaps something equivalently intelligent, or even more intelligent, could emerge from a computer that has been adequately primed for experience. After all, as we have noted, machine learning algorithms involve a “training” step for a reason. Namely, computers don’t just miraculously wake up and start exploring the world on their own, so they must be “educated,” in a way. And the fact that computer scientists can and do train models would seem to suggest that machines are capable of learning, in some sense. Following the prerequisites of Socratic dialectic, then, one might say that people initially supply something like opinions to computers in the form of very large databases, containing data collected from a great many people over a long period of time. These data still need to be transformed into knowledge, though. Nobody believes that it is enough to mount a hard drive containing a very large database to a computer, and to call that “intelligence.”
For neural networks, it would seem that the training step supplies the apparent transformation of data into "knowledge." Interestingly, such training generally takes a quasi-dialectical form. The program does not proceed from axioms, but rather takes each piece of information as it is presented, and works with human-defined "correctives" to hone mistaken guesses into accurate responses. One can imagine that an image classification A.I. model says, “This is a cat,” and the training program responds, “No, that is not a cat,” or “Yes, that is a cat.” Such a dialectical conversation is not exactly friendly, in a human sense, nor is it of the same philosophic or literary caliber as The Republic, one must admit, but it is nevertheless broadly dialectical.
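As a minimal, invented illustration of that corrective loop, the sketch below trains a single linear unit (standing in for a real image-classification network) on four made-up two-number "images," nudging its weights whenever its guess disagrees with the human-provided label.

```python
# A minimal sketch of the corrective "dialogue" in a training loop. The data
# are invented: each "image" is just two numbers, and each label records the
# human's verdict, cat (1) or not a cat (0). A single linear unit with a
# sigmoid stands in for a real image-classification network.
import numpy as np

images = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])
labels = np.array([1, 1, 0, 0])  # "Yes, that is a cat" / "No, that is not a cat"

rng = np.random.default_rng(0)
weights, bias, lr = rng.normal(size=2), 0.0, 0.5

for epoch in range(100):
    for x, label in zip(images, labels):
        guess = 1 / (1 + np.exp(-(x @ weights + bias)))  # the model "says" cat or not
        error = guess - label       # the corrective: how wrong the guess was
        weights -= lr * error * x   # nudge the model toward the right answer
        bias -= lr * error

for x, label in zip(images, labels):
    prob = 1 / (1 + np.exp(-(x @ weights + bias)))
    print(f"image {x}: cat with probability {prob:.2f} (label was {label})")
```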
Yet, compared to human intelligence, A.I. models exhibit a further deficiency stemming from the faultiness of equating “data” with “opinions.” It may seem like splitting hairs, but the distinction is a crucial one. Data are inert pieces of information, 1s and 0s, arranged in some state on some storage medium. A datum, itself, is not able to provide further explanation, nor is it able to frame itself within a context. In that sense, it is like the atom that cannot explain the form of the beings. A single datum, or, for that matter, a group of data of arbitrary size, is still lacking an ordering cause – that which transforms the bits into knowledge. With human opinions, on the other hand, further explanation and framing within a context are available, by virtue of the fact that they are held by an already-intelligent human mind. The human holding the opinion can be asked questions. Perhaps a person shares the opinion, “I don’t like anchovies because they are stinky.” An interlocutor replies, “You say you don’t like anchovies, but at dinner last week you ate a Caesar salad!” The person responds, “Oh, well, in that case they're alright because romaine lettuce is so plain, otherwise, and at any rate…” The datum “anchovies are stinky” is at most a written artifact of an opinion, but its lifeless inability to say any more, especially in the face of a fruitful contradiction, betrays that it is not, internal to itself, a real opinion. If an opinion is to be rich enough to contain a "soiled fragment of the truth," it surely must exist in a mind that already has the intelligence to reconsider.
As a brief aside, it should be noted that A.I. researchers claim to have discovered a "mysterious emergent behavior" known as in-context learning, whereby already-trained LLMs seem to be capable of inferring patterns from examples supplied in the prompt itself, without further training or explicit, per-datum human correctives. While that is very interesting, indeed, it invalidates neither the claim that computers do not hold opinions, nor the claim that they are not aware. Rather, it speaks powerfully to what computers are capable of, namely extremely complex forms of prediction, as described above. And, as we have also seen, mere prediction, no matter how sophisticated, does not qualify as intelligence.
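For readers unfamiliar with the term, the following invented prompt shows what in-context learning looks like from the outside: the pattern is supplied as examples inside the prompt itself, and an already-trained LLM will tend to continue it, even though no weights are updated and no correctives are given. No model is actually called in this sketch.

```python
# An invented, illustrative few-shot prompt. An already-trained LLM given this
# text as input will tend to continue the pattern (English to French) for the
# final line, even though its weights are never updated. No model is called
# here; the prompt would simply be submitted to an LLM as ordinary input.
prompt = (
    "English: good morning -> French: bonjour\n"
    "English: thank you -> French: merci\n"
    "English: good night -> French:"
)
print(prompt)
```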
Returning to the question of intelligence, it seems that without both awareness and opinions, machines are bound to remain fundamentally unintelligent. Compared to living, aware, opinionated beings, computers are seen as exactly what they are: lifeless, unaware repositories for written artifacts of already-gained human knowledge. In spite of the amazing feats of computer science that allow for plain-language queries and rich lingual and visual responses, nothing worthy of being called full intelligence seems to be remotely possible within a computer. The living mind that orders the computer's repository of information and logical structure through instruction, it seems, can never quite make its way into the machine, remaining forever on the periphery.
In the end, such an inquiry leaves one rather disenchanted with (or relieved about) the prospects for building fully intelligent machine minds, and rather amazed by the mysteries of the human mind. It would seem we are safe from the most radical kind of possible upheavals at the hands of super-intelligent computer overlords. Then again, no philosophic reasoning can ever close, once and for all, the door to new revelations out of the purely unknown. It is always hypothetically possible that a miraculous breakthrough will "solve" the mystery of the conscious, intelligent mind and usher in an age of super-intelligence. Such hopes, however, are grounded not on a rational assessment of computers and intelligence, but on a dogmatic belief in the possibility for nous to spontaneously emerge out of a bundle of connected transistors.
That said, even limited forms of artificial intelligence still pose very real, very serious threats to the stability of any regime and the flourishing of its people. Political philosophers as far back as Plato have warned that the innovations of technology inevitably cause massive upheavals. Much simpler technologies, like writing and the printing press, have borne this out. In a future essay, I hope to draw on political philosophy to investigate the character of the sorts of upheavals that A.I. tools, however limited, may precipitate.