78

We're all familiar with basic tenets such as "information cannot be transmitted faster than light" and ideas such as information conservation in scenarios like Hawking radiation (and in general, obviously). The Holographic Principle says, loosely, that information about a volume of space is encoded on its two-dimensional surface in Planck-sized bits.

In all these contexts, I can take "information" to mean predictive or postdictive capability, i.e. information is what enables us to state what the outcome of a measurement was or will be (locally). But what is information, exactly? Do we have any kind of microscopic description of it? Is it just a concept and, if so, how can we talk about transmitting it?

I suspect this is probably as unanswerable as what constitutes an observer/measurement for wave function collapse, but I'd love to know if we have any formulation of what information is made of, so to speak. If I'm talking nonsense, as I suspect I may be, feel free to point this out.

ACuriousMind
  • 124,833
Mitchell
  • 995

5 Answers

37

In short:

information contained in a physical system = the number of yes/no questions you need to get answered to fully specify the system.
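
As a toy illustration (a minimal Python sketch of my own; the example state counts are arbitrary): for a system with $N$ equally likely configurations, the number of yes/no questions needed is $\log_2 N$.

```python
import math

# Number of yes/no questions (bits) needed to single out one of N
# equally likely configurations of a system.
def bits_to_specify(num_states: int) -> float:
    return math.log2(num_states)

print(bits_to_specify(2))       # 1.0 bit: a single yes/no question
print(bits_to_specify(8))       # 3.0 bits: e.g. "is it in the left half?" asked three times
print(bits_to_specify(2**300))  # 300 bits: still tiny next to a macroscopic system's microstate count
```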

Johannes
  • 19,015
  • 12
    +1 but could be improved into "the minimum number of..." IMHO – Tobias Kienzler Apr 04 '11 at 12:35
  • nice indeed, eventually not true – foggy Jul 28 '15 at 15:55
  • @Probably - Sure, you can sharpen the above answer by making the more elaborate statement "the entropy for a given macroscopic state of an object is the number of yes/no questions you need to answer at a minimum to fully specify the detailed microscopic state of the object". however, this elaboration doesn't render the above more succinct answer "eventually not true". Also, please keep in mind that the number of yes/no questions required is typically Avogadro's number (or much, much larger in case of the context of this question which involves quantum gravity degrees of freedom). – Johannes Apr 30 '16 at 05:24
  • 1
    I disagree with this answer. Suppose I have signal arising from the voltage noise of a resistor. I have to ask a lot of yes/no questions to specify that signal, but it doesn't contain much information. To understand information, you really have to talk about prior knowledge and constraints. – DanielSank Aug 29 '17 at 07:43
  • @DanielSank --the voltage noise of a resistor, originating from the detailed thermal agitations in the electronic states, contains a huuuuge amount of information.... – Johannes Sep 06 '17 at 13:56
  • 3
    @DanielSank Just because you're not interested in the information does not prevent it from being information. – klutt Dec 18 '18 at 23:40
  • @Johannes so TV static noise would have more information content than any video. Random noise can't be compressed into any smaller number of bits. I thought information had a subjective nature to it. – Aditya P Apr 04 '19 at 13:22
  • @Aditya - that's correct: per definition you can compress a signal no further than down to the minimum number of bits required to reconstruct the signal. So strings of bits resulting from coin tosses have the highest information content. Of course Broman is right: whether you are, or are not, interested in this information is a different (and in general a subjective) matter. The information you are not interested in, you can refer to as entropy. – Johannes Apr 05 '19 at 16:52
  • Using this definition, how much information does a hydrogen atom contain? Would the filesize of the compressed Wikipedia page “hydrogen atom” be a good or bad approximation? – Jackson Walters Feb 18 '20 at 23:47
20

Information is a purely mathematical concept, usually a characteristic of uncertainty (of a probability distribution function), but it can be interpreted in different ways. In its simplest form it is introduced in information theory as the difference between the uncertainties of two distributions, with uncertainty being the logarithm of the number of possible, equally probable states of a discrete random variable. For a continuous distribution it can be introduced as the logarithm of an integral. One sometimes introduces proper information, a quantity which differs from negative entropy only by a constant independent of the distribution (this constant can be taken to be zero).

Thus information is a difference of the proper information (a difference of the negative entropies) of two states. The states are represented by probability distribution functions, so information is a formal functional of two such functions.

For continuous distributions (of which the discrete case is a variant), the proper information of a distribution $w$ is

$$I[w]=-H(w)=-\int_{-\infty}^{+\infty}w(x)\log(w(x))dx$$

and relative information of $w_2$ compared to $w_1$ is

$$I[w_2,w_1]=H(w_1)-H(w_2)=I[w_2]-I[w_1]$$

or

$$I[w_2,w_1]=\int_{-\infty}^{+\infty}\log \left(\frac{w_2(x)^{w_2(x)}}{w_1(x)^{w_1(x)}}\right)dx$$
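
As a numerical check of these formulas, here is a rough Python sketch; the Gaussian examples and the integration grid are my own arbitrary choices.

```python
import numpy as np

# Numerical check of the formulas above.
x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]

def gaussian(x, sigma):
    return np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

def proper_information(w):
    # I[w] = -H(w) = integral of w(x) log w(x) dx; w is strictly positive on this grid
    return np.sum(w * np.log(w)) * dx

w1 = gaussian(x, sigma=2.0)   # broader density: more uncertainty, lower proper information
w2 = gaussian(x, sigma=1.0)   # narrower density: less uncertainty, higher proper information

relative_information = proper_information(w2) - proper_information(w1)  # = H(w1) - H(w2)
print(relative_information)   # ~0.693, i.e. log(sigma1/sigma2) = log 2 for these Gaussians
```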

This functional is not much different from a norm or an angle in a vector space. It is just one measure attributed to members of the space.

Compare this with the definition of norm:

$$||w||=\sqrt{\int_{-\infty}^{+\infty}w(x)^2dx}$$

distance

$$D[w_1,w_2]=||w_1-w_2||=\sqrt{\int_{-\infty}^{+\infty}(w_1(x)-w_2(x))^2dx}$$

angle

$$\Phi[w_1,w_2]=\arccos \frac{\int_{-\infty}^{+\infty}w_1(x)w_2(x)dx}{\sqrt{\int_{-\infty}^{+\infty}w_1(x)^2dx}\sqrt{\int_{-\infty}^{+\infty}w_2(x)^2dx}}$$

So think of information as a mathematical quantity similar to an angle.
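
For comparison, here are the norm, distance and angle functionals quoted above evaluated for two toy densities (again just a numerical sketch; the Gaussians and the grid are arbitrary choices of mine).

```python
import numpy as np

# Norm, distance, angle and relative information for two example densities.
x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]

def gaussian(x, mu, sigma):
    return np.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

w1 = gaussian(x, 0.0, 1.0)
w2 = gaussian(x, 1.0, 1.0)

def norm(w):
    return np.sqrt(np.sum(w**2) * dx)

distance = norm(w1 - w2)
angle = np.arccos(np.sum(w1 * w2) * dx / (norm(w1) * norm(w2)))
# Zero here: both densities have the same width, hence the same uncertainty,
# even though their distance and angle are nonzero.
relative_information = np.sum(w2 * np.log(w2) - w1 * np.log(w1)) * dx

print(norm(w1), distance, angle, relative_information)
```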

Anixx
  • 11,159
  • 5
    Any talk of information from a mathematical perspective really needs to mention Shannon entropy. – Noldorin Jan 10 '11 at 21:50
  • It's mentioned above. – Anixx Jan 10 '11 at 21:55
  • To elaborate on this one can write expression for proper information using multiplicative integral: $$I(w)=\log \int_{-\infty}^{\infty} (w^w)^{dx}$$ and compare it with the expression for norm $$||w||=\sqrt {\int_{-\infty}^{\infty} w^2 dx}$$ – Anixx Jan 10 '11 at 22:06
  • 3
    To say in simple words, norm answers the question how big something is, angle answers the question how oriented something is and entropy/information answers the question how complex something is. – Anixx Jan 10 '11 at 22:39
  • 2
    Welcome to physics.se @anixx. This is quite a remarkable answer. The viewpoint that information is a "mathematical quantity similar to angle" is also at the heart of quantum mechanics. In fact in 1981 Wootters ("Statistical distance and hilbert space", PRD) showed that the "statistical distance" between two sets of observations coincides with the angle between rays of a Hilbert space. Of course none of this would come as a surprise to R. A. Fisher ;) –  Jan 10 '11 at 22:46
  • 2
    Ah ok. I missed it it seems. You should definitely call it "Shannon entropy" or "Information entropy" to distinguish it from thermodynamic entropy, given this is a physics site. – Noldorin Jan 10 '11 at 22:59
  • @space_cadet Actually, if you compare the formulas, information is more like distance (just substitute multiplication with exponentiation and subtraction with division). But unlike distance it is independent of the scale (as an angle is). – Anixx Jan 10 '11 at 23:08
  • @Anixx to see the analogy I'd suggest you take a look at the paper I mentioned. But I get your point too. –  Jan 11 '11 at 15:03
  • 3
    @Anixx @space_cadet - to be a bit pedantic, you should call what you wrote differential entropy. In Shannon entropy the random variable is discrete. Differential entropy extends Shannon entropy to continuous probability density functions, but these can be tricky and can take values greater than 1. I agree, I like the answer. +1 – Gordon Feb 18 '11 at 04:59
  • @Anixx Thinking information as "an angle" raises some questions.. Does it have a maximum and then repeats its properties ? like $\alpha$ and $\alpha+2\pi$. Are there "orthogonal information"? and so on.. – HDE Dec 14 '12 at 15:14
  • @HDE no, it is more like distance in that it is unlimited, and more like angle in that it is dimensionless. As you can see from the above, the formula resembles that for distance, with the square replaced by a self-power, subtraction by division, and the root by a logarithm. – Anixx Dec 14 '12 at 15:45
20

Since there are already outstanding technical answers to this question, I think we should add some better philosophical underpinnings for you to explore that might help with gaining a better intuitive feel for what information is.

Warren Weaver provided an excellent discussion on information theory in 1949 in his paper entitled "Recent Contributions to The Mathematical Theory of Communication".

In the paper he breaks down communications problems into three main categories: technical, semantic and effectiveness. He further explains that the concept of information is purely derived to address the technical problem in communications theory.

A simple definition of information, provided by Weaver, is that "information is a measure of one's freedom of choice when one selects a message"; or more correctly, the logarithm of that freedom of choice. Information is thus more clearly understood as the number of combinations of component parts that are available to be chosen arbitrarily.

In this sense, one can view it as a measure of the randomness associated with a string of letters. A great example is Wheel of Fortune. When Pat Sajak shows you the board with the white and green blocks, he has already provided you with a lot of information by placing spaces between the white blocks, because he has drastically reduced the number of possible combinations that could fill in the white blocks.

The maximum information (or entropy) of the board with 52 boxes or "trilons" and using 26 letters is $26^{52} = 3.8\times 10^{73}$ combinations, or between $244$ and $245$ bits of information in binary. However, if only 11 boxes are illuminated white, then the actual information of the board suddenly drops to $26^{11} = 3.6\times 10^{15}$ combinations, giving an actual information content (or entropy) of $51$ to $52$ bits. The relative information is $\dfrac{51}{244} \approx 0.21$, or 21%. The redundancy is then given by $1 - 0.21 = 0.79$, or 79%.
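
The arithmetic above is easy to reproduce (a quick Python sketch, assuming each box is filled independently and uniformly from the 26 letters):

```python
import math

ALPHABET = 26       # letters available for each box
TOTAL_BOXES = 52    # boxes ("trilons") on the full board
LIT_BOXES = 11      # boxes actually lit white for this puzzle

max_bits = TOTAL_BOXES * math.log2(ALPHABET)     # ~244.4 bits (26**52 combinations)
actual_bits = LIT_BOXES * math.log2(ALPHABET)    # ~51.7 bits (26**11 combinations)

relative_information = actual_bits / max_bits    # ~0.21
redundancy = 1.0 - relative_information          # ~0.79

print(f"{max_bits:.1f} {actual_bits:.1f} {relative_information:.2f} {redundancy:.2f}")
```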

As Vanna flips boxes, she is decreasing the relative entropy and increasing the redundancy to a point where the probability of solving the puzzle becomes very high. So in this sense, information, like entropy, is a measure of uncertainty about the system.

Now there are different types of uncertainty: one is the uncertainty associated with the freedom of choice of message, and the other is noise. The uncertainty discussed in the Wheel of Fortune example is due to the freedom of choice. In a noiseless situation, we would expect the word or phrase that Vanna unveils to be exactly the one chosen before the show. In a noisy environment, for instance one where there is some probability of a crew member misspelling the word while setting up the blocks, it is possible that the final word shown is not the one chosen before the show. That uncertainty, or noise, is called equivocation, and is brought in by the environment itself.
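
To put a rough number on that kind of equivocation, here is a deliberately crude sketch (my own toy model, not Weaver's or Shannon's worked example): treat each box as a binary correct/misspelled channel with error probability $p$, so the receiver's leftover uncertainty per box is the binary entropy $H(p)$.

```python
import math

# Toy model: each box is either set correctly or misspelled with probability p.
# The residual uncertainty (equivocation) per box is the binary entropy H(p).
def binary_entropy(p: float) -> float:
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

for p in (0.0, 0.01, 0.1, 0.5):
    print(f"misspelling probability {p}: equivocation {binary_entropy(p):.3f} bits per box")
```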

The distinction between a noisy and a noiseless environment is very important. William Tuller in 1949 published a paper, "Theoretical Limitations on the Rate of Transmission of Information", which proved that there was no limit to the amount of information that could be transmitted in a noiseless channel. This was why Shannon's paper "Communication in the Presence of Noise" was critical to communication theory, in that it properly quantified what noise actually was and how it affected communication and the transfer of information.

Now, before finishing, it should be noted that Hartley, in his 1928 paper "Transmission of Information", was the first to really give a modern definition of information and to give it a quantitative measure. I would recommend reviewing that paper as a starting point. Other significant contributions were made by other scientists, such as Wiener, whose work is best captured in Cybernetics.

On a closing note, it is refreshing that the significance of quantum noise is beginning to be discussed, and I hope it continues in the future.

HDE
  • 2,899
Humble
  • 2,194
4

Information is a dimensionless (unitless) - and, in this sense, "purely mathematical" - quantity measuring how much one has to learn to know something, relative to the point where one doesn't know it, expressed in particular units. Operationally speaking, it is the number of RAM chips (or parts of them) one needs in order to remember some piece of knowledge. Of course, by using the word "knowledge", I am just avoiding the word "information", and it is impossible to define any of these terms without "circular references", because one has to know at least something to be able to define concepts as elementary as knowledge.

One bit of information is the knowledge needed to know whether a number that can be 0 or 1 with the same probability turned out (or will turn out) to be 0 or 1. In mathematics, a more natural unit than one bit is one "e-bit", which is such that 1 bit is ln(2) "e-bits". The information needed to distinguish among "N" equally likely alternatives is ln(N) "e-bits". The natural logarithm is always more natural than other logarithms - that's why it's called natural. For example, its derivative equals 1/x, without complicated constants. The formulae for the "Shannon" dimensionless information, assuming an arbitrary probability distribution, are given above.
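
A quick sanity check of the unit conversion (a tiny Python sketch; $N=8$ is just an arbitrary example):

```python
import math

# 1 bit = ln(2) "e-bits" (nats); distinguishing N equally likely alternatives
# takes log2(N) bits, i.e. ln(N) e-bits.
N = 8
bits = math.log2(N)     # 3.0
ebits = math.log(N)     # ~2.079
print(ebits / bits)     # ln(2) ~ 0.693 e-bits per bit
```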

In physics, every physical system with some degrees of freedom may carry some information. In quantum information, the "alternatives" are usually associated with basis vectors of the allowed Hilbert space of states. But in that context, one "bit" of information is usually referred to as a "qubit" or "quantum bit" which means that in the real world, the alternatives may also be combined into arbitrary complex linear superpositions, like the postulates of quantum mechanics dictate.
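
A minimal numerical sketch of that last point (the amplitudes below are arbitrary choices of mine, only meant to show a normalised superposition and the resulting readout probabilities):

```python
import numpy as np

# A classical bit picks one of the two basis alternatives; a qubit may sit in
# any normalised complex superposition of them.
ket0 = np.array([1.0, 0.0], dtype=complex)
ket1 = np.array([0.0, 1.0], dtype=complex)

psi = np.sqrt(0.25) * ket0 + np.sqrt(0.75) * np.exp(1j * np.pi / 3) * ket1

probabilities = np.abs(psi)**2   # Born rule: chance of reading out 0 or 1
print(probabilities)             # [0.25 0.75]
print(probabilities.sum())       # 1.0 -> the state is normalised
```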

In discussions about causality, we mean that the spatially separated objects can't really influence each other. This is guaranteed by the Lorentz symmetry. In field theory, the condition is equivalent to the constraint that the space-like separated fields $\phi(x)$ and $\phi(y)$ commute with each other (or anticommute if both of them are fermionic).

Best wishes Lubos

sunspots
  • 702
Luboš Motl
  • 179,018
0

I recently gave a brief but serious attempt at a different way to interpret one bit in terms of quantum physics, so perhaps it's worth mentioning that answer in the context of this much older question:

https://physics.stackexchange.com/a/91035/7670

"In terms of space, time, momentum, and matter, a single bit of information is the choice of one quantum path over another equally likely one. When applied at the level of atoms and particles, the result is a tapestry of choices that quickly becomes nearly infinite in complexity."

This definition is readily compatible with MWI approaches, as it defines the total sets of bits in the universe you can see as the "address" of your universe within the multiverse.

For better or worse, this definition is my own, not one that I'm quoting from anything. But it is nicely compatible with such simple experiments as Feynman's electron slit analysis, where photons determine the path of the electron if you get nosy about it, and so add one more bit to the definition of our observable universe.