Sunday, 29 May 2016

Reference observers

Evolutionary biology is intimately involved with the topic of how information about environments is transmitted down the generations. There's a fairly mature mathematical framework which engineers use for discussing this sort of thing, namely Shannon/Weaver information theory.

Crick famously mentioned information when specifying the central dogma. However, over the years, a number of people have complained about attempts to apply information theory to biology. The complaints are various: information is subjective; it isn't clear how to apply the theory; information theory is confusing; organisms inherit more than just information from their ancestors; the results are not very useful - and so on. Others think an information-based analysis is useful, but prefer other information metrics.

To give some examples, here is Daniel Dennett explaining why he doesn't use Shannon information (44 minutes in):

I'm not talking about bits when I'm talking about information, I'm talking about information in a more fundamental sense. Shannon information measured in bits is a recent and very important refinement of one concept of information but it's not the concept I'm talking about. I'm talking about the concept of information where, when one chimpanzee learns how to crack nuts by watching his mother crack nuts, there's information passed from mother to offspring and that is not in bits; that is an informational transfer but it is not accomplished in any Shannon channel that is worth talking about.

Here is John Wilkins asserting that Shannon-Weaver information theory has not been very useful:

Attempts have been made to apply the Shannon-Weaver theory of communication to genetics but have typically failed to assist research (Hariri, Weber, and Olmsted 1990). The broader discipline of bioinformatics makes use of this and other analytic techniques to find patterns in data sets that may turn out to be functional or significant (Mount 2004), but such techniques require validation by experiment, and there is debate over how useful it is. Part of the problem with the Shannon account is that it is hard to find analogues for the general abstract entities of Shannon’s account.

Another common complaint is that creationists frequently use information theory to criticize evolutionary theory. Here, information theory seems to be getting tarred by association. For more examples, see the references of this post. I think that Shannon/Weaver information theory is applicable to evolutionary biology and is useful when applied there. This post is not really about that, though - instead it introduces a concept which I think is useful when applying information theory.

A conventional interpretation of the term "information" involves the "unexpected" content of a message. A novel message contains information; a message that you already know the contents of does not. This concept can be formalized and quantified if the observer assigns a probability distribution over the possible input symbols before receiving them. A symbol to which the observer assigned probability p then carries -log2(p) bits, which allows the 'surprise value' of the whole message to be quantified in bits.
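To make the 'surprise value' idea concrete, here is a minimal Python sketch. The function name `surprisal_bits`, the four-letter DNA alphabet and the uniform prior are all assumptions chosen for illustration, and the observer is assumed to treat each symbol independently.

```python
import math

def surprisal_bits(message, prior):
    """Total surprise, in bits, of `message` for an observer holding `prior`.

    `prior` maps each symbol to the probability the observer assigned to it
    before seeing the message. Symbols are treated as independent.
    """
    return sum(-math.log2(prior[symbol]) for symbol in message)

# Hypothetical observer: expects the four DNA bases to be equally likely.
uniform_dna = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

print(surprisal_bits("GATTACA", uniform_dna))  # 14.0 (7 symbols at 2 bits each)
```

An observer that had already assigned probability 1 to every symbol in the message would measure zero bits, which matches the intuition that a message you already know tells you nothing.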

However, this concept of information faces a problem when applied to scientific domains: namely, it is subjective. Two observers can easily differ on the issue of what the information content of a message actually is. Subjectivity is a problem in scientific domains: scientists go to considerable lengths to find objective metrics, to help other scientists reproduce their work. This post describes a way to resolve this issue.

It is true that the conventional interpretation of "information" is subjective. However, it is pretty easy to convert this into an objective metric - simply by specifying the observer involved. If scientists do not observe a message directly, but instead use a clearly-specified reference observer to observe it, they can agree on the information content of a message.
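A small Python sketch shows both halves of this argument: two observers with different priors disagree about the same message, while anyone who adopts the same named observer gets the same number. The two priors and the message here are made up for illustration.

```python
import math

def surprisal_bits(message, prior):
    """Bits of surprise in `message` for an observer with the given prior."""
    return sum(-math.log2(prior[symbol]) for symbol in message)

message = "AAAT"

# Two hypothetical observers with different expectations about the symbols.
observer_uniform = {"A": 0.5, "T": 0.5}
observer_skewed = {"A": 0.75, "T": 0.25}

bits_uniform = surprisal_bits(message, observer_uniform)
bits_skewed = surprisal_bits(message, observer_skewed)

# The observers disagree about the information content of the message...
print(bits_uniform, bits_skewed)

# ...but once a paper says "measured by observer_uniform", every reader who
# reruns the calculation with that observer gets the same value.
print(surprisal_bits(message, observer_uniform) == bits_uniform)
```

The disagreement lives entirely in the choice of prior, so publishing the prior along with the measurement is enough to make the result reproducible.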

Reference observers are sometimes called "virtual observers" or "standard observers". To give an example of a reference observer, consider an agent with a maximum entropy prior over the available symbols and no memory or state variables. Such an observer would measure the information-carrying capacity of a message. To such an agent, a 650 MB CD-ROM would contain 650 MB of information. A 4.7 GB DVD would contain 4.7 GB of information - and so on.
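For the maximum entropy observer the calculation is just the number of symbols times log2 of the alphabet size. Here is a sketch in Python. The function name is hypothetical, the symbols are assumed to be bytes (an alphabet of 256 values), and "MB" is assumed to mean 10^6 bytes.

```python
import math

def max_entropy_bits(num_symbols, alphabet_size):
    """Information measured by a memoryless observer with a uniform prior.

    Every symbol is equally unexpected, so each one carries
    log2(alphabet_size) bits regardless of what the message says.
    """
    return num_symbols * math.log2(alphabet_size)

# Assumption: a 650 MB disc holds 650 * 10**6 bytes, each one of 256 values.
cd_bytes = 650 * 10**6
cd_bits = max_entropy_bits(cd_bytes, 256)

print(cd_bits / 8 / 10**6)  # 650.0 (megabytes, the disc's raw capacity)
```

This observer ignores the content entirely: a blank disc and a disc full of genome data measure the same, which is why it reports capacity rather than anything about the message itself.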

Other portable observers could be based on standard compressors. PKZIP and GZIP are examples of widely available compression programs that could be used. They have their own prior probabilities and learning algorithms, and are standard and so can be specified by simply naming them.
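As a rough sketch of a compressor-based observer, the Python below takes the compressed size of a message as its information content. It uses the standard library's `zlib` module as a stand-in: zlib implements DEFLATE, the same algorithm GZIP uses, but with a different container format, so the byte counts will not exactly match the `gzip` command-line tool. The function name and the test data are assumptions for illustration.

```python
import os
import zlib

def compressed_bits(data: bytes, level: int = 9) -> int:
    """Information content of `data` as seen by a DEFLATE-based observer.

    The result includes a few bytes of zlib header and checksum overhead.
    """
    return 8 * len(zlib.compress(data, level))

repetitive = b"ACGT" * 10_000       # highly predictable, 40,000 bytes
random_like = os.urandom(40_000)    # essentially incompressible, 40,000 bytes

print(compressed_bits(repetitive))   # far fewer than 320,000 bits
print(compressed_bits(random_like))  # roughly 320,000 bits or slightly more
```

To make such a measurement reproducible in practice, the compressor's version and settings (here, the compression level) would need to be stated along with its name.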

A related complaint is that with lots of possible reference observers available, researchers will pick ones that promote their own theories or results, again eliminating the objectivity of science. That is a genuine concern. However, pretty much the same problem applies to Kolmogorov complexity, where the choice of reference machine is arbitrary, and to priors in Bayesian statistics. This is a well-known issue which scientists should be familiar with handling. IMO, having multiple reference observers available is better than attempting to promote a one-size-fits-all scheme for measuring information scientifically.

I think that the concept of "reference observer" fairly neatly overcomes many of the objections to the use of Shannon/Weaver information theory which claim that information theory is subjective. If you specify the observer involved, the subjectivity vanishes. It can be complex to specify some observers - but other observers are very simple and easy to specify, and some standard observers are widely available.

