Assumption 1: Real Plausibility: The plausibility of a proposition A under information X can be represented by a Real number (A|X) so that larger plausibility corresponds to a larger number.
Why should we accept such an assumption? Well, because of what we want to use this kind of reasoning for.
We want to be able to contemplate the truth values of propositions. This means that we want to allow for being uncertain of a proposition’s truth value, but also to allow for certainty. The “plausibility” mentioned in Assumption 1 is a measure of how certain we are that the proposition is true. Clearly, we want to allow absolute certainty as the highest value, but also allow lower levels of certainty*.
We also want to be able to compare the plausibilities of different propositions. We want to be able to say things like “Proposition A is very likely, and proposition B is plausible, but proposition C isn’t serious”. The demand to be able to compare the plausibility of all propositions is the demand of strong ordering – that we could arrange any two propositions in an order, from less to more plausible.
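Stated as a formula, and holding the information X fixed for simplicity (a restriction I am adding only for this sketch), the demand of strong ordering is roughly that any two plausibilities are comparable and that the comparison is transitive:

```latex
\[
\forall A, B:\quad (A|X) \le (B|X) \quad\text{or}\quad (B|X) \le (A|X),
\]
\[
\forall A, B, C:\quad (A|X) \le (B|X) \ \text{and}\ (B|X) \le (C|X) \;\Longrightarrow\; (A|X) \le (C|X).
\]
```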
A Real number allows strong ordering. In principle, it is possible to achieve strong ordering with weaker assumptions. Using just fractions instead of real numbers would suffice, for example. But it is unclear why we would generally only want to consider fractions. The same is true for only a few, discrete, levels of certainty. For a general theory of how to think about plausibility, it appears much more sensible to allow more leeway in choosing the values, and allow any real number.
Another critique of using real numbers is that they are too “exact”. Degrees of belief shouldn’t be considered so precisely; they should be fuzzy, vague things. I find this objection unconvincing. It is true that “fuzzy” ordering makes sense for human thinking, in that we might say that e.g. A is more likely than B but we’re not certain if it’s more likely than C. But I’m not sure if that is a result of the way we think or of our ignorance of how our own brain computes, and in any case I see no reason for such fuzziness in a perfectly rational agent. Exactness is a virtue, not a flaw.
I was careful to define plausibility as the degree of belief that the proposition is true. But what about our uncertainty over whether the proposition is false? Bayesians maintain that a separate measure is not needed for that – it is already inherent in ‘plausibility’. Maximum plausibility implies absolute certainty that the proposition is true, and therefore minimum certainty, in any meaningful sense, that the proposition is false: we will not be willing to bet on it being false, or in any way act on the assumption that there is even the slightest chance that it is. Maximum plausibility of A therefore implies minimum certainty that A is false. Putting this argument in reverse, minimum plausibility of A must therefore imply absolute certainty that “not A” is true, i.e. that A is false.
Those who do not accept such lines of thought may want to keep track of our degree of certainty of “not A” separately from our degree of certainty in “A”. They can perhaps argue that minimal certainty is a state of ignorance, not of knowledge, and that one should not confuse the data held about a proposition with its evaluation for the purpose of taking a distinct action. This kind of approach is taken by the Dempster-Shafer theory, which is perhaps the major contender to Bayesianism. It can be understood as keeping track not of our degree of certainty in, but rather of our support for, the truth of propositions. If we denote “no support for proposition A” as s(A)=0 and “demonstrative proof that proposition A is true” as s(A)=1, we have a Real measure that characterises the proposition much like the Bayesian “plausibility” does. But we have a separate such measure for “not A”. The support for A is not simply a mirror image of the support for not-A; it may be that both lack support (s(A)=s(not-A)=0), or that we have a little evidence in favor of both (e.g. s(A)=0.2, s(not-A)=0.1), and so on. We can think of the difference from “1” (1-s(A)-s(not-A)) as a measure of our “ignorance” on whether A is true or not.
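To make the bookkeeping concrete, here is a minimal Python sketch of the two support values and the resulting “ignorance”. The class name `Support` and its fields are my own illustrative choices, not part of Dempster-Shafer theory's standard vocabulary, and I am assuming that each support lies in [0, 1] and that the two together do not exceed 1, so that the ignorance is never negative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Support:
    """Hypothetical container for the two support values of one proposition."""
    for_a: float      # s(A): support for "A is true"
    for_not_a: float  # s(not-A): support for "A is false"

    def __post_init__(self):
        # Assumption: each support is in [0, 1] and together they do not exceed 1.
        if not (0.0 <= self.for_a <= 1.0 and 0.0 <= self.for_not_a <= 1.0):
            raise ValueError("each support must lie in [0, 1]")
        if self.for_a + self.for_not_a > 1.0 + 1e-12:
            raise ValueError("s(A) + s(not-A) must not exceed 1")

    def ignorance(self) -> float:
        # The "difference from 1": 1 - s(A) - s(not-A).
        return 1.0 - self.for_a - self.for_not_a


no_evidence = Support(0.0, 0.0)       # both lack support: total ignorance
a_little_of_both = Support(0.2, 0.1)  # the example values from the text
print(no_evidence.ignorance())
print(a_little_of_both.ignorance())
```

In a single-measure picture there is nothing left over to call ignorance, since learning that A has minimum plausibility already fixes our certainty that A is false.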
Note that such a two-valued approach implies that there is no strong order, as having several “plausibility” measures (s(A) and s(not-A)) prohibits simply comparing our state of knowledge about proposition A using a “greater than” relation. One may attempt to consider theories with even more parameters (e.g. reliability of A) attached to every belief, and these too will violate strong order.
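The loss of strong order can be illustrated with a short Python sketch. The comparison rule used here is only one natural candidate and is my own assumption: a pair (s(A), s(not-A)) counts as “at least as favourable to A” as another when it has no less support for A and no more support for not-A. Under that rule some pairs simply cannot be ranked.

```python
from typing import Optional, Tuple

# Hypothetical representation: (s(A), s(not-A)) for a proposition.
Pair = Tuple[float, float]


def compare(p: Pair, q: Pair) -> Optional[int]:
    """Compare two support pairs under the assumed dominance rule.

    Returns 1 if p is more favourable than q, -1 if q is more favourable
    than p, 0 if they are equal, and None if they are incomparable.
    """
    p_at_least_q = p[0] >= q[0] and p[1] <= q[1]
    q_at_least_p = q[0] >= p[0] and q[1] <= p[1]
    if p_at_least_q and q_at_least_p:
        return 0
    if p_at_least_q:
        return 1
    if q_at_least_p:
        return -1
    return None  # neither dominates: no "greater than" relation applies


print(compare((0.2, 0.1), (0.1, 0.2)))  # more support for A, less for not-A
print(compare((0.2, 0.1), (0.3, 0.3)))  # more support for both: incomparable
```

Two single real numbers, by contrast, can always be compared, which is exactly what Assumption 1 buys us.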
Who is right, Shafer or Bayes? I am not certain (huh!), but I am inclined to support the Bayesians to the extent that for choosing between options the single measure of Bayesianism seems needed, so I think ultimately Dempster-Shafer should reduce to Bayesianism once one generates measures of “certainty” from it. At any rate, Bayesianism discusses only a single measure of belief, “certainty” or “plausibility”, and we will proceed from now on under this assumption.
Another thing we want to be able to do with plausibilities is to change our beliefs given new information. This is why “under information X” should be part of the definition of plausibility. We have not clarified just what this “information” means, and we won’t yet – that will wait for another post. But I do want to define at this stage what “further information” means.
I will mark the plausibility of A given information X and the further information that B is true as (A|B, X). And I will demand that information that a proposition is true (or false) will change the plausibility of that proposition accordingly.
Definition 1: Further Information: The plausibility (A|A,X) of A given information X and the further information that A is true is that of truth, (A|A,X)=(T|A,X). The plausibility (A|A̲,X) of A given information X and the further information that A is false is that of falsehood, (A|A̲,X)=(F|A̲,X).
I have here written the logical value of “True” as T, and the logical value of “False” as F. I have marked “not A” by an underline, A̲. I am implicitly assuming that A can in principle be either true or false – no information in the world can make a tautology false, or a contradiction true.
I have written that the “demand” is a definition, because it really only elaborates on what is meant by “information”. We will return to information later, and provide a set of assumptions that underlies its use. The reader may, however, choose to view this “definition” as another assumption, if he wishes. I find it difficult to raise any arguments against it, once Assumption 1 is accepted – clearly, given this information these are the degrees of certainty that are implied.
* The choice of a larger Real value for larger plausibility is merely a convention. We could have chosen the opposite – lower values for greater plausibility – and it would work just as well. The important part of the definition is that plausibilities can be arranged with a “greater than” relation; the direction is just a convention.