I finally took the time to read Scott Aaronson’s new paper Why philosophers should care about computational complexity (there’s a post about it with lots of comments on his blog).

Summary: Aaronson, knowing that I’m writing my thesis at the moment, took the liberty of preparing an introduction for me. Thanks! I’m going to include it verbatim. (Just kidding of course, though I’m actually going to cite this paper extensively.)

The early work on computability had a lot of philosophical significance; actually, we can say that computability was *born* to answer a philosophical question, Hilbert’s *Entscheidungsproblem*. We all know that the answer turned out to be negative (or do we? See Section 3.1). Apparently computational complexity theory, the theory of computing under limited resources, didn’t turn out to be so popular among philosophers. That’s obviously a problem, and this paper tries to do something about it.

After a brief introduction to complexity theory (Section 2), Aaronson turns his attention to one of the main cornerstones of this field, which is also one the points that are usually criticised: the relevance of polynomial time, as opposed to exponential time. Here he argues that this distinction is at least as interesting as the distinction between computable and uncomputable. Section 3.3 contains an interesting question that can be answered using a complexity-theoretic argument: why would we call 2^{43112609} − 1 (together with a proof of its primality) a “known” prime, while “the first prime large than 2^{43112609} − 1” feels somehow “unknown”?

Section 4 is about the Turing test. This is the first time I see in print what Aaronson calls the “lookup-table argument” (though he cites Ned Block and others for it): namely, that there’s no doubt whatsoever that an abstract program *P* able to pass the Turing test does actually exist (you just need a huge but finite table of replies dependent on the previous history of the conversation). The only way to claim the impossibility of a machine passing the Turing test, apart from metaphysical arguments, requires addressing concerns such as the efficiency of *P* or the availability of enough space in the universe to store it. That is, complexity-theoretic questions.

Section 5, possibly the most interesting one, addresses the problem of logical omniscience. It is usually assumed that knowledge is closed under deduction rules; for instance, if I know *A*, and I know *B*, then I certainly know *A* ∧ *B*. But consider this: while I know enough basic axioms of maths, I surely don’t *know* that Fermat’s last theorem holds (except by trusting the mathematical community, which I usually do), even though it’s a mere deductive consequence of said axioms. We can argue that human beings do not possess the computational resources needed to actually achieve omniscience.

Section 6 is about a “waterfall argument” against computationalism. The idea is that the meaning of what a supposed computer actually computes is always imposed by someone looking at it, that is, it’s always *relative to some external observer*. I freely admit I am (was?) a fan of this argument, though I don’t necessarily see it as an attack to computationalism (I use to call this “computational relativism”, a term whose invention I claim, since Google returns zero hits for it :-)). According to some proponents, this leads to some “strange” consequences.

For instance, assume that we encode chess positions in the physical states of a waterfall, then take a look at some “final” state of the waterfall, and once again interpret that as a chess position. Can the waterfall be said to play chess? Aaronson argues that it is not so, unless the encoding (which is just a reduction from chess to computing the state of the waterfall) can be computed by a procedure requiring *less* resources than those needed to actually compute the state of the waterfall. But then a question emerges: does DNA compute Hamilton paths?

Section 7 describes how computational complexity can help to define what inductive inference is about, including Occam’s razor and the “new riddle of induction”.

Then, in Section 8, Aaronson argues that complexity theory might inspire some debate about the interpretation of quantum physics, particularly about the many-worlds interpretation.

In Section 9 new notions of proof, quite distinct from the usual notion of “formal” and “informal” proofs, are described. These are all based on complexity-theoretic and cryptographic results: interactive proof systems, zero-knowledge proofs, probabilistically checkable proofs.

Section 10 is about the difference between time and space. This is usually stated as “space can be reused, time cannot”. What if we allow time travel (i.e., closed timelike curves)? The assumption that **P** ≠ **NP** might suggest an argument for the impossibility of time travel, even assuming the validity of quantum physics.

Section 11 is about economics and game theory; here complexity theory can be useful to describe the behaviour of agents under bounded rationality assumptions.

Finally, in Section 12, Aaronson concludes that *lots* of philosophical problems seem to have interesting complexity-theoretic aspects. Several questions remain open, and the most important of those is still “If **P** ≠ **NP**, then how have humans managed to make such enormous mathematical progress, even in the face of the general intractability of theorem-proving?”.

I really hope that Aaronson’s paper spurs a lot of philosophical discussion involving and concerning complexity theory; I too believe there’s much to write about that.

Thanks for the great summary! Of all the people who commented on my essay on blogs, reddit, etc., I now know there’s at least one who read it from start to finish… :-)

[...] more from the original source: Do waterfalls play chess? and other stories « Antonio E. Porreca Filed Under: Chess, General Tagged With: again-interpret, chess, look-at-some, physical, [...]

You write, “But then a question emerges: does DNA compute Hamilton paths?”. But why does that question emerge? (Aaronson’s paper doesn’t include the word “DNA”)

Well, it emerges for me because I’m personally quite familiar with Adleman’s paper, and with natural computing in general. The idea there is precisely that of encoding an instance of the problem we want to solve into some natural phenomenon, then “let it compute” and in the end read the result in the final state of the system.

Of course, in the case of Adleman’s experiment, the initial state of the system (the DNA strands used to perform it) is engineered, and human intervention is needed during the process; so, someone might argue that the situation is different from that of the waterfall (I personally don’t believe so).

[...] is some comment on the paper, and whether waterfalls play [...]

It is not enough (at all) to simply store query-response pairs and retrieve them if one wishes to approximate a human. The set of convincing responses to a message depends on the entire context of the message, including the fact that a Turing test is in progress. That someone would suggest that a simple lookup table can suffice is a clear indication of a lack of clear thinking on the subject.

As for the relationship between P=NP and time travel, I am shocked that a philosopher would miss this one. If P != NP, then the existence of a time-like loop simply implies that mathematics does not model the universe at that level. No more, no less.

The set of convincing responses to a message depends on the entire context of the message, including the fact that a Turing test is in progress.Indeed. The table we’re talking about is not just a mapping from queries to replies, but rather a mapping from dialogue histories to replies (Section 4.1). Sorry for the oversight.

If I understand Aaronson correctly, his argument would be that whether DNA can be said to compute Hamilton paths depends on the computational complexity of the encoding/decoding process. I don’t understand how the possible need for human intervention is relevant.

Josh, I was just anticipating the objection that, once a human being is involved, the process might not be considered “natural” any more. But, in my opinion, the only difference is that we’re switching from a many-one reduction to a Turing reduction.

I have only read the first part of the paper, so this question might be answered in it, but regarding the waterfall, isn’t it true that more interpretative machinery is at work in getting a third grader to understand “2 x 2 is 4″ than in actually multiplying in a 3-bit space? So wouldn’t that mean our minds wouldn’t count as computers when we’re computing using natural language etc? That seems like a weird result.

But when you include the history of the dialog in your magical lookup table, it doesn’t take long before you exhaust the amount of energy in the universe. (Note that the “history of the dialog” includes things like pauses, even inter-syllabic.)

If you think that you can dodge this problem with compression, guess what? You’re attempting to compress a space which is growing exponentially each time you add the next interaction to the dialog. (I’m assuming a skeptical interlocutor here.)

@myself, the point of the lookup table argument is to show that there *exists*, in the mathematical sense, a machine that can pass the Turing test. (This wasn’t obvious to me and I’m sure many others until we heard the argument.) Therefore, it is a matter of “mere engineering” to determine whether it’s possible to actually build such a machine.

Let me know when you find a Turing machine, and I’ll help you program it.

Seriously, the question of if one can, given infinite resources, construct a set of rules to mime some aspect of human behavior is entirely uninteresting.

Complexity theory is about what can be achieved given finite resources. If a proposed solution requires more resources than are available in the universe, then you need a different solution.

@I: A lookup table doesn’t require “infinite resources”, but yes, the question of whether it’s logically impossible to build a superintelligent machine is now “uninteresting” to the extent that it has been answered, by the argument Aaronson puts forth, if not by others. “Our best solution requires too many resources” is unequivocally an advance over “the problem may be (like the Halting problem) impossible to solve, in principle”.

[...] Do waterfalls play chess? and other stories: After a brief introduction to complexity theory (Section 2), Aaronson turns his attention to one of the main cornerstones of this field, which is also one the points that are usually criticised: the relevance of polynomial time, as opposed to exponential time. Here he argues that this distinction is at least as interesting as the distinction between computable and uncomputable. Section 3.3 contains an interesting question that can be answered using a complexity-theoretic argument: why would we call 243112609 − 1 (together with a proof of its primality) a “known” prime, while “the first prime large than 243112609 − 1” feels somehow “unknown”? [...]

You realize that the halting problem is quite solvable if you limit yourself to a set number of states? The proof that the Halting problem is unsolvable depends on having an infinite space.

The argument that the Turing test can be passed by a large hash is merely an assertion that the problem is finite in nature. If that is a surprise, well, okay.

Obviously, the Halting problem is to determine whether an arbitrary Turing machine halts, and Turing machines are defined to have access to a tape that is unbounded in one direction. I admit the point that “the Turing test can be passed by a large hash” does sound obvious in retrospect, but it wasn’t obvious to me until I heard the argument, and I know Scott wasn’t writing his paper just for me.

There’s also the problem that if we admit lookup-table programs as a valid solution to the Turing test, then we’ll create the somewhat paradoxical situation that there exist machines that pass the Turing test (are ‘Turing-conscious’) according to our best judgement, but that don’t judge each other Turing-conscious — one machine with resources, i.e lookup-table, greater than another will exhaust the other’s, and thus see it as merely putting up a show. This makes the property of being ‘Turing-conscious’ uncomfortably anthropocentric, to me.

Then there’s the issue of how these lookup-tables are created in the first place. It’s easy to create a table of all possible conversations of a certain length, but to pick out the ‘meaningful’ ones, which will be a small subset, the majority being largely gibberish, additional input is needed — so in Turing-testing these machines, who is really tested: the machine itself, or who- (or what-)ever supplied the additional input to weed out nonsensical dialogue histories? This seems somehow akin to one of those ‘reductions that do all the work’.

In analogy, one could put the requisite lookup table into a book that follows the style of these ‘choose your own adventure’-books; a ‘reader’ thinks of a certain dialogue part, then looks up the next one in the book. Should one consider the book to be ‘Turing-conscious’, i.e to pass the Turing test?

Perhaps one should consider a ‘strong’ Turing test somewhat like the following: an entity is judged to pass the strong Turing test if it passes the Turing test as administered by any other entity that passes the Turing test. This would exclude lookup-table machines, since there always exists another lookup-table machine with a larger lookup table that exhausts the first one’s table, and thus, which could administer a Turing test the first lookup machine could not pass, but would leave humans in the running (barring unfortunate practical issues such as age and death).

I think Josh’s replies are essentially the ones I would have written.

Jochen’s observations are quite interesting. I agree that we’re definitely talking about something analogous to the “reductions that do all the work”; indeed, what’s missing here is exactly the complexity-theoretic analysis of the question, as the problem is essentially solved from a purely computability-theoretic standpoint.

I don’t think the question related to machines having tables of different sizes is relevant, though. It is usually assumed that a Turing test session has a pre-defined duration, so beyond a certain size that won’t make a difference any more. (At the moment I don’t recall if the issue of duration is addressed by Turing’s original paper.)

WRT to the ‘larger table’ issue, it’s more of an ‘in-principle’-thing for me, rather than relating to ‘actual’ Turing tests one might in practice perform, perhaps overly so. Consider a Turing test analogue to examine a program for whether it computes prime numbers by observing the output: obviously, a finite, and quite short (to the point of realizability in practice) table would suffice to output enough prime numbers to convince a tester subject to any kind of reasonable bounds that the program indeed does generate prime numbers. But there’s still a difference between a program that merely outputs primes from some pre-defined table and a program that actually computes primes. Perhaps this should rather be read as an inherent limitation of the Turing test, though.

Jochen: It’s an inherent limitation of any test that takes place over a finite-capacity communication channel for a finite amount of time. That is to say, any test that conforms to the laws of physics as we currently understand them.

The way I see it, the whole point of the big table solution is to show that complexity theory gives interesting results for philosophy questions.

If a table is possible in the math sense, and the only thing that prevents us from building one is the circumstantial engineering problem that our universe has no sufficient resources, then the only thing preventing us from solving the Turing test is a complexity-theoretical issue. Therefore complexity theory is cool. QED.

anon, that’s exactly my reasoning.

(Also, I love how you define “not having sufficient resources in our universe” as a “circumstantial engineering problem”. :-D)