
Prologomenon Tackles Logical Empiricism

Back, at last! This time we’ll have a look at a movement close to our hearts: logical empiricism. The subject is not really related to logic programming at all, but I wrote this text as an assignment in a Philosophy of Science course and figured that I might as well post it here in case someone finds it enjoyable.

Disclaimer one: It’s not particularly well-written and is probably full of errors and misunderstandings.
Disclaimer two: I did an automatic LaTeX to HTML conversion that apparently couldn’t handle footnotes and references. Since I’m a lazy and amoral person I simply left them as is.

1. Introduction

Science is concerned with theories that make claims about the physical world. Hence it should not come as a great surprise that there exists a relation between observed facts and general laws. Can universal laws be logically derived from observational facts? The explication of this relation is not a problem of science, but rather a philosophical one that has plagued philosophers for centuries, and it is surprisingly difficult to answer without resorting to circular or ad hoc arguments. This text will give a brief introduction to two important movements in the history of the philosophy of science: empiricism\footnote{Empiricism is sometimes grouped together with positivism and inductivism.} and logical empiricism. The focus will be on one of the most important principles in the logical empiricist movement: the distinction between analytic and synthetic sentences.

2. Empiricism

Empiricism takes a radical stance on the question of how science is related to facts, and asserts the importance and infallibility of sensory experience and observation. The idealized view is that the human mind is a blank slate, a tabula rasa, and that knowledge can only be gained from experience. Hence, a scientist should primarily conduct research by observing the world and deriving general conclusions from the observations. This methodology is called induction. \footnote{Not to be confused with the concept of mathematical induction.} The logical form is:

{property(a_0) \wedge property(a_1) \wedge ... \wedge property(a_n) \Rightarrow \forall x:\, property(x)}

Where we assume that {x} is restricted to a suitable domain. Let us say that we are interested in the question of whether all copper conducts electricity, and have successfully verified this property in {n} experiments. An application of the induction principle would then have the form:

{conducts\_electricity(copper\_sample_1) \,\wedge\, conducts\_electricity(copper\_sample_2) \,\wedge\, \ldots \,\wedge\, conducts\_electricity(copper\_sample_n)}

{\Rightarrow \forall x \in Copper:\, conducts\_electricity(x)}

Both the premises and the conclusion are true, yet the reasoning is not logically valid: no finite number of affirmative observations can establish the necessity of the general hypothesis. This is known as the problem of induction and was elucidated by David Hume in the eighteenth century. At first it might appear to be a pedantic and irrelevant objection, since scientists rarely claim to work with absolute knowledge. If the induction principle allows us to derive more and more probable hypotheses, then that should be good enough. Here is the catch: for any set of observed facts there exists an infinite number of mutually incompatible hypotheses consistent with them. Hence we cannot claim that our particular hypothesis is probable in the usual mathematical sense of the word. This should of course not be taken as an indication that inductive testing is devoid of meaning, but rather that it is inherently difficult to assign a truth value to a scientific theory (that has not been falsified).

This is not the only objection to empiricism. Among other things it can be argued that too much weight is placed on sensory experience, which can hardly be considered realistic since almost all observations depend on an underlying theory. Furthermore, why should one demand that theories be derived by induction when the principle is not logically valid?

Further critique can be found in e.g. Chalmers \cite{Chalmers}.

3. Logical Empiricism

Logic made great strides in the late nineteenth century and the beginning of the twentieth century. The German mathematician and philosopher Gottlob Frege developed the first formal system capable of higher-order logic in the outstanding, but also rather impenetrable, Begriffsschrift \cite{Frege}. Frege’s aim was to show that mathematics, starting with number theory, did not depend on empirical experience but was true a priori and could be inferred from pure logic alone. That is, it is analytic rather than synthetic. The sentence There exists an infinite number of primes is to Frege an analytic proposition that is true solely in virtue of the meaning of its constituents. At the same time the sentence It will rain tomorrow is synthetic since there is no logical reason why it has to be true or false; it can only be confirmed or falsified by observation.

We have learned that pure, strict empiricism fell short for a number of reasons. Does this mean that there is no difference between science and e.g. religion? Intuitively one would like to claim that the most important difference is that the domains of discourse are different. In the first case claims are made about the physical world. In the second case the claims are, at least partly, metaphysical in nature. If one adheres to this demarcation principle then it becomes pivotal to filter out metaphysical statements from a scientific theory. This was one of the aims set forth by Alfred Jules Ayer in Language, Truth and Logic \cite{Ayer}, which later became the epitome of logical empiricism \cite{Stanford}. With the advances made in formal logic, it was believed that it would be possible to define a verification principle capable of identifying metaphysical propositions. Just like Frege, Ayer made a distinction between analytic and synthetic propositions. For a verification principle to be successful, then, it needs to be capable of ascertaining whether a synthetic proposition is empirical. Any sentence that failed the verification principle would be labeled as metaphysical and asserted to be devoid of any meaningful content. This is a harsh claim, and it is perhaps not too surprising that the definition of a verification principle is irksome and prone to errors. Ayer gave the following criterion \cite{Stanford}:

A (synthetic) sentence {A} has empirical content if either:

  • {A \Rightarrow O}, where {O} is an observational sentence.
  • There exists a sentence {B} such that {A \wedge B \Rightarrow O}, where {O} is an observational sentence and {B \nRightarrow O}.

The second criterion means that {A} can be used with another sentence, {B}, to derive an observational sentence that could not have been derived from {B} alone.

It can be argued that this criterion is at least necessary, but is it sufficient? Not at all. Consider any sentence {A} whatsoever and any observational sentence {O}. If we can always find a sentence {B} such that {A \wedge B} implies {O} while {B} alone does not, then every sentence passes the test and the principle fails to exclude anything. Finding such a {B} is simple: take {B = A \Rightarrow O}. An application of modus ponens then shows that {A \wedge B \Rightarrow O}, while {B} by itself does not imply {O}.
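Spelled out (my own reconstruction of the step), let {A} be any sentence that is not itself a logical truth and {O} any observational sentence, and take:

{B = (A \Rightarrow O)}

{A \wedge (A \Rightarrow O) \Rightarrow O}

{(A \Rightarrow O) \nRightarrow O}

The second line shows {A \wedge B \Rightarrow O} by modus ponens, while the third says that {B} alone does not imply {O} (as long as {A} is not a logical truth); hence the second criterion is met by {A}, no matter how metaphysical {A} may be.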

Even though this simplistic attempt failed, it is not impossible that a more complicated version could do the trick. Of course, this presupposes that a distinction between synthetic and analytic propositions can be drawn at all. If it can be argued that no such distinction exists, then we would simply have to abandon the idea behind Ayer’s verification principle. This was the approach taken by Willard Van Orman Quine in his seminal paper Two Dogmas of Empiricism \cite{Quine} \footnote{The second “dogma” being reductionism.}. Quine argues that the prevailing notion of analyticity is circular in the sense that it relies on another concept, synonymy, which cannot be explained without referring back to analyticity. The argument is best illustrated with examples. One of the hallmark sentences in the analytic tradition is No unmarried man is married, which is true no matter what “man” and “married” refer to, as long as we agree that the logical syntax remains fixed. Quine then puts forth a second category of analytic sentences: those that can be transformed into logically true (analytic) sentences by replacing synonyms with synonyms. For example: No bachelor is married. We have to agree that this sentence is not logically true, at least not in the same sense as the preceding one, but it could be taken as analytic once bachelor is replaced with unmarried man.

But when are two words synonyms? A dictionary is of no help here since it only records usage that is already established; it offers no explanation of the facts it states. The reasoning that two words are synonyms if they are interchangeable in all contexts fares no better, since it fails for e.g. “bachelor” and “unmarried man” in the context “… has fewer than 10 letters”. Let us, however, for the sake of the argument ignore such syntactic differences between words. Is this kind of interchangeability good enough to describe synonymy?

Assume that “bachelor” and “unmarried man” are interchangeable in the sense just described. Then from the statement Necessarily all and only bachelors are bachelors we can conclude that Necessarily all and only bachelors are unmarried men by substituting “unmarried men” for the second occurrence of “bachelors”. But to claim this last statement is to claim that the statement All and only bachelors are unmarried men is analytic! Why? Because to say that a statement is necessarily true is just to say that it is analytic, and we are back where we started: we need the concept of analyticity to explain the concept of synonymy.

If one agrees with Quine’s reasoning then no sharp distinction can be made between synthetic and analytic propositions, which would leave most of the original logicism program in shambles.

4. Discussion

As far as I can tell there is no logical flaw in Quine’s argument. I am, however, a bit skeptical of how quickly he dismisses definition as an account of synonymy. Does it really matter why two words are synonyms? I see no inherent problem in simply accepting the fact that some words have evolved to mean the same thing. This would however have the consequence that it is impossible to say that a sentence is analytic without assuming a specific, underlying theory. But is this a problem? Most mathematicians would argue that the sentence {1 + 1 = 2} is analytic, but this is only true if we assume the usual definitions of {1}, {+}, {=} and {2}; it would not be true if we interpreted {+} as some other operation. A better example can be found in geometry: is the parallel postulate analytic? Yes, if we are working in Euclidean geometry. No, if we are working in a non-Euclidean geometry.

Hence I still think that we can make a distinction between synthetic and analytic propositions, but only with respect to an underlying theory; though I would suspect that this notion of analyticity is not strong enough to be of any use in the logicism program.


References

\bibitem{Chalmers} Alan Chalmers. What Is This Thing Called Science? Hackett Publishing Company, 1999.

\bibitem{Frege} Michael Potter and Tom Ricketts (eds.). The Cambridge Companion to Frege. Cambridge University Press, 2010.

\bibitem{Ayer} Alfred Jules Ayer. Language, Truth and Logic. Dover Publications, 2nd edition, 1952.

\bibitem{Stanford} Richard Creath. Logical Empiricism. The Stanford Encyclopedia of Philosophy, Summer 2011 Edition.

\bibitem{Quine} Willard Van Orman Quine. Two Dogmas of Empiricism. In From a Logical Point of View, 1961.


A Gentle Introduction to Herbrand Models

Almost related pun of the day: Some interpretations are more equal than others.

I have not yet encountered a single person, myself included, who has grasped the concept and utility of Herbrand interpretations at first glance. This might lead one to believe that the theory is so hard and abstract that it’s not really related to everyday logic programming anyway, but nothing could be further from the truth. On the contrary, the problem is rather that the definition of a Herbrand interpretation is too simple to make any sense at first: a wolf in sheep’s clothing. Hence we have a good subject at hand and a splendid opportunity for me to confuse you further with deceptive didacticism.

A logic program is a finite set of rules and facts. Let’s not bother ourselves by giving a formal definition and instead cut right to the chase with an example:

natural(zero).
natural(s(N)) \leftarrow natural(N).

From a pragmatic perspective it should be construed as: zero is natural (i.e. a natural number), and if it is known that N is a natural number, then it’s the case that s(N) is a natural number. I’ll be the first one to admit that it’s not a very interesting program, but it’ll have to do for the moment. Following this intuition it’s possible to answer queries to the program in two different ways: top-down and bottom-up. Let our query be natural(s(zero)). Here’s how the two different strategies differ:

Top-down

To solve natural(s(zero)), we unify natural(s(zero)) with natural(s(N)) and obtain the unifier \{N= zero\}. Then we recursively solve the new goal obtained from the right-hand side of the rule: natural(zero). This can be done in one step, and since there are no more goals to resolve we’re done.
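To make the top-down reading concrete, here is a small Python sketch (the term representation and the name prove_natural are my own; since the query is ground, unification collapses into simple pattern matching on the term structure):

# Terms are either the constant 'zero' or a nested pair ('s', subterm),
# so s(s(zero)) is written ('s', ('s', 'zero')).

def prove_natural(term):
    """Top-down proof search for the goal natural(term)."""
    # First clause: natural(zero) is a fact, so the goal succeeds immediately.
    if term == 'zero':
        return True
    # Second clause: natural(s(N)) <- natural(N). Matching the head binds N
    # to the subterm, and we recursively solve the body natural(N).
    if isinstance(term, tuple) and len(term) == 2 and term[0] == 's':
        return prove_natural(term[1])
    # No clause head matches the goal, so the proof fails.
    return False

print(prove_natural(('s', 'zero')))  # True, mirroring the derivation above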

Bottom-up

We begin by noting that natural(zero) holds. A solid start! Then, since the body of the second clause is true with the unifier \{N = zero\} (the same as before), we conclude that natural(s(zero)) must also be true. Since this is the same as the original query there’s no reason to continue.
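The bottom-up reading can be sketched in the same style (again just my own illustration for this particular program; the step bound is needed only because the set of derivable facts is infinite):

# Terms are represented as before: 'zero' or a nested pair ('s', subterm).

def derive_natural(query, max_steps=100):
    """Bottom-up derivation for the natural/1 program, with a step bound."""
    derived = {'zero'}  # natural(zero) holds outright
    for _ in range(max_steps):
        if query in derived:
            return True
        # Apply the rule natural(s(N)) <- natural(N) to every derived fact.
        derived |= {('s', term) for term in derived}
    return query in derived

print(derive_natural(('s', 'zero')))  # True after one application of the rule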

I’ve admittedly glossed over a lot of intricate details, but I should at least mention that the top-down approach corresponds to SLD-resolution while the bottom-up approach corresponds to unit-resolution (closely related to hyper-resolution). The difference between the two lies in the application of the resolution rule: in SLD-resolution we produce resolvents between the head of a clause and a goal clause; in unit-resolution we produce resolvents between a unit clause and the body of a clause. So we have two ways of answering a query, but how do we know that they are correct? For that we need to backtrack (pun not intended) and carefully explicate the meaning of the program. This is formally done with a model. The vocabulary of this particular program is:

  1. A single constant: zero.
  2. A single function symbol of arity 1: s.
  3. A single predicate symbol of arity 1: natural.

And to provide a meaning to the program we need to explain exactly what these symbols mean. We need two functions: one for the constant symbol, f_{zero}, and one for the function symbol, f_s. In the domain of the natural numbers these can be defined as:

  1. f_{zero}(zero) = 0
  2. f_s(N) = N + 1

Which may then be combined into a single function, f, that allows us to recursively compute the value of any term in the program. It is defined as:

f(zero) = f_{zero}(zero)

f(s(N)) = f_s(f(N))

For example:

f(s(zero)) = f_s(f_{zero}(zero)) = f_s(0) = 0 + 1 = 1

The only predicate symbol, natural, must be mapped to a relation on the domain. The simplest choice is the whole domain itself, the set \{0, 1, ...\}. And we’re done with this particular model.
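If it helps, here is a rough Python rendering of this interpretation (the names simply mirror f_{zero}, f_s and f above; the term representation is my own):

def f_zero(_constant):
    # The constant symbol zero denotes the number 0.
    return 0

def f_s(n):
    # The function symbol s denotes the successor function.
    return n + 1

def f(term):
    """Recursively compute the value of a ground term under this interpretation."""
    if term == 'zero':
        return f_zero(term)
    if isinstance(term, tuple) and term[0] == 's':
        return f_s(f(term[1]))
    raise ValueError('not a term of the program')

def in_natural_relation(n):
    # Membership test for the (infinite) relation {0, 1, 2, ...}.
    return isinstance(n, int) and n >= 0

print(f(('s', 'zero')))                       # 1
print(in_natural_relation(f(('s', 'zero'))))  # True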

Is it a good model? That depends. In the context of our logic program it seems like overkill. Previously we only had to worry about the symbols that actually appear in the program: s, zero and natural; but now we’ve introduced arithmetical notation that, so to speak, goes beyond what is present in the original program. There’s really no reason at all to demand that zero must be evaluated to 0 and that s(zero) must be evaluated to 1. They are just two different representations. With this in mind we can construct a simpler model:

  1. f_{zero}(zero) = zero
  2. f_s(N) = s(N)

And in the same vein as before this may be combined into a single function f. Applied to the same term as before we obtain:
f(s(zero)) = f_s(f_{zero}(zero)) = f_s(zero) = s(zero)

We get back exactly what we started with! This means that the terms in the program are simply mapped to themselves, and why not? The predicate symbol natural will now be mapped to the relation \{zero, s(zero), ...\} instead of \{0, 1, ...\}.

These kinds of models are known as Herbrand models. The general definition closely mimics the one above, with the effect that all terms are mapped to themselves and that predicates are mapped to relations over the universe of terms (the Herbrand universe). Since the only interesting part of a Herbrand model is the mapping of the predicates, it’s often presented as a set that enumerates the values for which the predicates hold. In our example, which is in fact the least Herbrand model, this was \{natural(zero), natural(s(zero)), ...\}. As should be evident by now, this is precisely the set that we already had in mind as programmers: it’s the intended meaning of the program.
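Since the least Herbrand model of this program is infinite we can only ever inspect a finite prefix of it, but enumerating such a prefix is easy. A small sketch, with a term representation and generator name of my own choosing:

from itertools import islice

# Terms as before: 'zero' or ('s', subterm); atoms as ('natural', term).

def least_herbrand_model():
    """Enumerate the atoms natural(zero), natural(s(zero)), ... in order."""
    term = 'zero'
    while True:
        yield ('natural', term)
        term = ('s', term)  # wrap the previous term in one more application of s/1

# The first three atoms of the least Herbrand model:
print(list(islice(least_herbrand_model(), 3)))
# [('natural', 'zero'), ('natural', ('s', 'zero')), ('natural', ('s', ('s', 'zero')))]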

But was this just a fluke? Could there be cases where it’s impossible to find a Herbrand model even though there exists some other model? Fortunately the answer is no: if there exists a model for a logic program then there also exists a Herbrand model. This provides a theoretical justification for both SLD- and unit-resolution: in the first case we search for unsatisfiable sets of ground clauses built from the Herbrand universe; in the second case we actually build the (least) Herbrand model from the bottom up.