Since there are already outstanding technical answers to this question, I think it is worth adding some philosophical underpinnings for you to explore, which might help you gain a more intuitive feel for what information is.
Warren Weaver provided an excellent discussion of information theory in 1949 in his paper entitled "Recent Contributions to The Mathematical Theory of Communication".
In the paper he breaks communication problems down into three main categories: technical, semantic, and effectiveness. He further explains that the concept of information was derived purely to address the technical problem in communication theory.
A simple definition of information, provided by Weaver, is that "information is a measure of one's freedom of choice when one selects a message"; or, more correctly, the logarithm of that freedom of choice. Information is thus more clearly understood as the number of combinations of component parts that are available to be chosen arbitrarily.
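In symbols, for the simplest (equally likely) case: if a message is selected freely from $N$ equally likely possibilities, the information carried by that selection is

$$I = \log_2 N \ \text{bits},$$

so doubling the freedom of choice adds exactly one bit; choosing among $8$ equally likely messages, for example, carries $\log_2 8 = 3$ bits. (The general definition weights each possible choice by its probability, but the equal-probability case is all we need below.)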
In this sense, one can view it as a measure of the randomness associated with a string of letters. A great example is Wheel of Fortune. When Pat Sajak shows you the board with the white and green blocks, he has already provided you with a lot of information by placing spaces between the white blocks, because he has drastically reduced the number of combinations that could possibly fill in the white blocks.
The maximum information (or entropy) of the board, with 52 boxes or "trilons" and 26 letters to choose from, is $26^{52} = 3.8\times 10^{73}$ combinations, or between $244$ and $245$ bits of information in binary. However, if only 11 boxes are illuminated white, then the actual information of the board suddenly drops to $26^{11} = 3.6\times 10^{15}$ combinations, giving an actual information content (or entropy) of $51$ to $52$ bits. The relative information is $\dfrac{51}{244} = 0.21$, or 21%. The redundancy is then $1 - 0.21 = 0.79$, or 79%.
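If you want to check those numbers yourself, here is a minimal sketch (the variable names are mine, and it uses the same idealization as above: each box independently holds any one of the 26 letters):

```python
import math

ALPHABET = 26      # letters available for each box
TOTAL_BOXES = 52   # "trilons" on the full board
LIT_BOXES = 11     # boxes illuminated white for this particular puzzle

# Maximum entropy: every box on the board could hold any of the 26 letters.
max_bits = TOTAL_BOXES * math.log2(ALPHABET)    # log2(26**52), about 244.4 bits

# Actual entropy once we know only 11 boxes are in play.
actual_bits = LIT_BOXES * math.log2(ALPHABET)   # log2(26**11), about 51.7 bits

relative_information = actual_bits / max_bits   # about 0.21
redundancy = 1 - relative_information           # about 0.79

print(f"maximum entropy:      {max_bits:.1f} bits")
print(f"actual entropy:       {actual_bits:.1f} bits")
print(f"relative information: {relative_information:.2f}")
print(f"redundancy:           {redundancy:.2f}")
```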
As Vanna flips boxes, she is decreasing the relative entropy and increasing the redundancy to the point where the probability of solving the puzzle becomes very high. So in this sense information, like entropy, is a measure of uncertainty about the system.
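A toy continuation of the sketch above makes that trend concrete, under the same crude assumption that every still-hidden box is an independent, uniform choice from 26 letters, so each reveal removes $\log_2 26 \approx 4.7$ bits. (Real English phrases carry far fewer bits per letter; that extra redundancy is exactly what lets contestants solve the puzzle early.)

```python
import math

ALPHABET = 26
LIT_BOXES = 11
MAX_BITS = 52 * math.log2(ALPHABET)   # full-board maximum, about 244.4 bits

# Each flipped box removes log2(26), about 4.7 bits, of remaining uncertainty
# under the uniform, independent-letter idealization.
for flipped in range(LIT_BOXES + 1):
    remaining_bits = (LIT_BOXES - flipped) * math.log2(ALPHABET)
    relative = remaining_bits / MAX_BITS
    print(f"{flipped:2d} boxes flipped: {remaining_bits:5.1f} bits left, "
          f"relative entropy {relative:.2f}, redundancy {1 - relative:.2f}")
```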
Now, there are different types of uncertainty: one is the uncertainty associated with the freedom of choice of the message, and the other is noise. The uncertainty discussed in the Wheel of Fortune example is due to freedom of choice. In a noiseless situation, we would expect the word or phrase that Vanna unveils to be exactly the one chosen before the show. In a noisy environment, for instance one where there is some probability of a crew member misspelling the word while setting up the blocks, it is possible that the final word shown is not the one chosen before the show. That residual uncertainty about the original message, introduced by the noise, is called equivocation, and it is brought in by the environment itself.
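In Shannon's formulation this residual uncertainty is the conditional entropy of the transmitted message $X$ given the received message $Y$, written $H(X \mid Y)$, and the useful rate of transmission is the source entropy minus the equivocation:

$$R = H(X) - H(X \mid Y).$$

In a noiseless setting $H(X \mid Y) = 0$ and the unveiled board tells you exactly which phrase was chosen; the more likely the crew is to misspell, the larger the equivocation and the less the board tells you about the phrase actually chosen before the show.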
The distinction between a noisy and a noiseless environment is very important. In 1949 William Tuller published a paper, "Theoretical Limitations on the Rate of Transmission of Information", which proved that there is no limit to the amount of information that can be transmitted over a noiseless channel. This is why Shannon's paper "Communication in the Presence of Noise" was so critical to communication theory: it properly quantified what noise actually is and how it affects communication and the transfer of information.
Now, before finishing, it should be noted that Hartley, in his 1928 paper "Transmission of Information", was the first to really give a modern definition of information and a quantitative measure for it. I would recommend reviewing that paper as a starting point. Other significant contributions were made by other scientists, such as Wiener, whose work is best captured in Cybernetics.
On a closing note, it is refreshing that the significance of quantum noise is beginning to be discussed, and I hope it continues in the future.