## Labels

### On the Reference Class Problem

The reference class problem is an extremely important conceptualization of a key problem in many fields of inquiry.

Reichenbach defines it:
“If we are asked to find the probability holding for an individual future event, we must first incorporate the case in a suitable reference class. An individual thing or event may be incorporated in many reference classes, from which different probabilities will result. This ambiguity has been called the problem of the reference class.” (Reichenbach, H. (1949), The Theory of Probability, University of California Press, p. 374.)
Reichenbach also has more to say on how the reference class is arrived at:
“If probability belongs to a class, its numerical value is determined because for a class of events a frequency of occurrence may be determined. A single event, however, belongs to many classes; which of the classes are we to choose as determining the weight? Suppose a man forty years old has tuberculosis; we want to know the probability of his death. Shall we consider for that purpose the frequency of death within the class of men forty years old, or within the class of tubercular people? And there are, of course, many other classes to which the man belongs. The answer is, I think, obvious. We take the narrowest class for which we have reliable statistics.”
Of course, “reliable statistics” presupposes detailed knowledge of causes, among other things. The classical formulation, by Venn, is:
“every single thing or event has an indefinite number of properties or attributes observable in it, and might therefore be considered as belonging to an indefinite number of different classes of things”, which leads to problems in assigning probabilities to a single case. Venn's example was the probability that John Smith, a consumptive Englishman aged fifty, will live to sixty-one.
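
Venn's ambiguity and Reichenbach's rule can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the counts are invented (they are not real mortality data), the names `frequency` and `narrowest_reliable_class` are hypothetical, and a minimum class size (`min_size`) is only a crude stand-in for “reliable statistics”. When two classes use the same number of attributes, neither contains the other, so the sketch arbitrarily prefers the smaller one.

```python
from itertools import combinations


def make_population():
    """Build synthetic records. The counts are invented for illustration only."""
    # (age group, tubercular?, number of people, number of deaths)
    cells = [
        ("40", True, 50, 20),
        ("40", False, 950, 19),
        ("other", True, 150, 45),
        ("other", False, 8850, 265),
    ]
    people = []
    for age, tubercular, n, deaths in cells:
        for i in range(n):
            people.append({"age": age, "tubercular": tubercular, "died": i < deaths})
    return people


def frequency(people, reference_class):
    """Return (class size, relative frequency of death) for one reference class.

    A reference class is a dict of attribute values that every member must share.
    """
    members = [p for p in people
               if all(p[k] == v for k, v in reference_class.items())]
    if not members:
        return 0, None
    return len(members), sum(p["died"] for p in members) / len(members)


def narrowest_reliable_class(people, individual, min_size):
    """Pick the class with the most shared attributes that has >= min_size members.

    Assumption: "reliable" is approximated by class size alone. Ties between
    classes with equally many attributes go to the smaller class.
    """
    keys = sorted(individual)
    for r in range(len(keys), -1, -1):          # most specific classes first
        best = None
        for subset in combinations(keys, r):
            ref = {k: individual[k] for k in subset}
            n, freq = frequency(people, ref)
            if n >= min_size and (best is None or n < best[1]):
                best = (ref, n, freq)
        if best is not None:
            return best
    return None


if __name__ == "__main__":
    people = make_population()
    man = {"age": "40", "tubercular": True}
    # The same man, three reference classes, three different frequencies.
    for ref in ({"age": "40"}, {"tubercular": True}, man):
        print(ref, frequency(people, ref))
    # The chosen class depends on how much data we demand.
    for min_size in (30, 100, 5000):
        print(min_size, narrowest_reliable_class(people, man, min_size))
```

With these invented counts the same man gets a frequency of death of 0.039 as a forty-year-old, 0.325 as a tubercular person, and 0.4 as a tubercular forty-year-old. Which of these the rule selects depends entirely on the threshold chosen for “reliable”, and that threshold is exactly the part that Reichenbach's rule leaves open.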
Durkheim (The Rules of Sociological Method, Free Press, 1982, trans. W. D. Halls, p. 110), in proposing methods of sociological research, analyzes the difficulties inherent in finding clear-cut definitions of species and genera, and offers useful insights into them:
“...It is untrue that science can formulate laws only after having reviewed all the facts they express, or arrive at categories only after having described, in their totality, the individuals that they include. The true experimental method tends rather to substitute for common facts, which only give rise to proofs when they are very numerous and which consequently allow conclusions which are always suspect, decisive or crucial facts, as Bacon said, which by themselves and regardless of their number, have scientific value and interest. It is particularly necessary to proceed in this fashion when one sets about constituting genera and species. This is because to attempt an inventory of all the characteristics peculiar to an individual, is an insoluble problem. Every individual is an infinity, and infinity cannot be exhausted. Should we therefore stick to the most essential properties? If so, on what principle will we then make a selection? For this a criterion is required which is beyond the capacity of the individual and which consequently even the best monographs could not provide. Without carrying matters to this extreme of rigour, we can envisage that, the more numerous the characteristics to serve as the basis for a classification, the more difficult it will also be, in view of the different ways in which these characteristics combine together in particular cases, to present similarities and distinctions which are clear-cut enough to allow the constitution of definite groups and sub-groups.

“Even were a classification possible using this method, it would present a major drawback in that it would not have the usefulness it should possess. Its main purpose should be to expedite the scientific task by substituting for an indefinite multiplicity of individuals a limited number of types. But this advantage is lost if these types can only be constituted after all individuals have been investigated and analysed in their entirety. It can hardly facilitate the research if it does no more than summarise research already carried out. It will only be really useful if it allows us to classify characteristics other than those which serve as a basis for it, and if it furnishes us with a framework for future facts. Its role is to supply us with reference points to which we can add observations other than those which these reference points have already provided. But for this the classification must be made, not on the basis of a complete inventory of all individual characteristics, but according to a small number of them, carefully selected. Under these conditions it will not only serve to reduce to some order knowledge already discovered, but also to produce more. It will spare the observer from following up many lines of enquiry because it will serve as a guide. Thus once a classification has been established according to this principle, in order to know whether a fact is general throughout a particular species, it will be unnecessary to have observed all societies belonging to this species - the study of a few will suffice. In many cases even one observation well conducted will be enough, just as often an experiment efficiently carried out is sufficient to establish a law.”
The value of Durkheim's observation lies more in its statement of the difficulties of the reference class problem than in the solution he proposes, which seems to raise many difficulties of its own.
In any case, here is an interesting example of a taxonomy that runs into a reference class problem:

[Figure: Variation in size of different types of soil. From Kay (1990).]

Another interesting classification:

[Figure: Scale definitions and different processes with characteristic and horizontal scales. From Isidoro Orlanski, BAMS, 56(5), May 1975, p. 528.]

See also Collingwood's definition of universal and reference classes in Speculum Mentis, pp. 162-163.

Cicero in the Orator (4.16) says:
“Nor, indeed, without having studied in the schools of philosophers, can we discern the genus (classes) and species of everything; nor explain them by proper definitions; nor distribute them into their proper divisions; nor decide what is true and what is false; nor discern consequences, perceive inconsistencies, and distinguish what is doubtful.”
See also Keynes's discussion of the concept in A Treatise on Probability (Macmillan, 1948), p. 103ff.