Great question about the relationship between thermodynamic entropy and information entropy. I think the most important thing to say up front is that this is still an open scientific question. Jaynes (1957) is often cited as reconciling the two, but many disagree; read the paper and judge for yourself.
One key question is the role of the "alphabet" in information theory. Information entropy is computed over the alphabet of symbols a source can emit (and their probabilities), which makes the entropy relative to the lexicon of the observer. That would imply that thermodynamic entropy is likewise relative, a conclusion many vehemently reject. Thermodynamic entropy is, however, defined with respect to a frame of reference (the system being investigated). Further, it depends on distinguishable differences: if two samples of the same, indistinguishable gas are mixed, the entropy does not change (the resolution of the Gibbs mixing paradox). The number of distinguishable elements might serve as the "alphabet" in thermodynamics, but even that still seems to require an observer to do the distinguishing.
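As a rough illustration of the "alphabet" point (a minimal sketch of my own, not taken from any of the cited papers): the same source carries less entropy for an observer whose lexicon cannot tell some of the symbols apart.

```python
import math
from collections import Counter

def shannon_entropy(probs, base=2):
    """Shannon entropy H = -sum(p * log p) over an alphabet of symbol probabilities."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# A fine-grained observer distinguishes four equally likely symbols...
fine = {"a1": 0.25, "a2": 0.25, "b1": 0.25, "b2": 0.25}
print(shannon_entropy(fine.values()))    # 2.0 bits

# ...while a coarse-grained observer cannot tell a1 from a2 or b1 from b2,
# so for them the same source has a smaller alphabet and lower entropy.
coarse = Counter()
for symbol, p in fine.items():
    coarse[symbol[0]] += p               # merge indistinguishable symbols
print(shannon_entropy(coarse.values()))  # 1.0 bit
```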
One issue is that entropy is used in several different ways. It sometimes describes a system state (how much entropy the system contains, as in the Shannon, Boltzmann, and Gibbs equations), sometimes a change in state (Clausius' equation), and sometimes even a force (see the work of Adrian Bejan at Duke). Schrödinger famously described "negative entropy" as what all living things feed on, in his essay "What Is Life?". That is, the intelligence of life, the information processing that living things must do, keeps local entropy low (at the cost of exporting entropy to the surroundings). This points to the importance of the links between thermodynamics and information processing.
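For concreteness, here are the standard forms of the equations just mentioned (W counts microstates, p_i is the probability of state i, k_B is Boltzmann's constant, and the heat is exchanged reversibly at temperature T):

$$H = -\sum_i p_i \log_2 p_i \;(\text{Shannon}), \qquad S = k_B \ln W \;(\text{Boltzmann}),$$
$$S = -k_B \sum_i p_i \ln p_i \;(\text{Gibbs}), \qquad dS = \frac{\delta Q_{\text{rev}}}{T} \;(\text{Clausius}).$$

The first three describe how much entropy a state has; Clausius' describes how entropy changes when heat flows.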
One area where thermodynamics and information converge is artificial intelligence, and specifically the Boltzmann Machine. Boltzmann Machines (from Ackley, Hinton, and Sejnowski, 1985; precursors to today's "deep learning" algorithms) use simulated annealing: a temperature parameter is "heated up" and then gradually "cooled" while the stochastic units settle, and this cooling minimizes the network's internal energy. Paul Smolensky's "Harmonium" describes the same process but frames the objective as maximizing harmony. When Hinton and Smolensky collaborated (Rumelhart, Smolensky, McClelland, & Hinton, 1986), they compromised and called the metric "goodness of fit."
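To make the annealing picture concrete, here is a minimal sketch with toy, hand-picked weights and a simple Gibbs sweep; it is not the learning algorithm from Ackley et al. (1985), just the energy function and cooling schedule. At high temperature the binary units flip almost at random; as the temperature drops, the network settles into low-energy (high-"harmony") configurations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy symmetric weights and biases (illustrative numbers, not a trained model).
W = np.array([[0.0, 1.5, -2.0],
              [1.5, 0.0,  0.5],
              [-2.0, 0.5, 0.0]])
b = np.array([0.1, -0.2, 0.3])

def energy(s):
    """Boltzmann machine energy E(s) = -1/2 s^T W s - b^T s for binary units s in {0, 1}."""
    return -0.5 * s @ W @ s - b @ s

def anneal(s, temps):
    """Gibbs-sample each unit in turn while lowering the temperature ('cooling')."""
    for T in temps:
        for i in range(len(s)):
            gap = W[i] @ s + b[i]                 # energy drop from turning unit i on
            p_on = 1.0 / (1.0 + np.exp(-gap / T))
            s[i] = 1.0 if rng.random() < p_on else 0.0
    return s

s = rng.integers(0, 2, size=3).astype(float)
schedule = np.geomspace(10.0, 0.1, num=50)        # high temperature -> low temperature
s = anneal(s, schedule)
print("final state:", s, "energy:", energy(s))
```

Minimizing this energy and maximizing Smolensky's harmony are the same settling process described with opposite signs.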
How does harmony relate to entropy? In music analysis, "harmonic entropy" is used to quantify the dissonance of an interval; that is still an informational measure. However, there is also a strong relationship between the coherence and decoherence of harmonic oscillators (a model that applies to essentially any particle) and thermodynamic entropy (see Tegmark & Shapiro, 1994).
Tegmark, M., & Shapiro, H. S. (1994). Decoherence produces coherent states: An explicit proof for harmonic chains. Physical Review E, 50(4), 2538.
Jaynes, E. T. (1957). Information theory and statistical mechanics. Physical Review, 106(4), 620.
Rumelhart, D. E., Smolensky, P., McClelland, J. L., & Hinton, G. E. (1986). Sequential thought processes in PDP models. Parallel Distributed Processing: Explorations in the Microstructure of Cognition, 2, 3–57.
Ackley, D. H., Hinton, G. E., & Sejnowski, T. J. (1985). A learning algorithm for Boltzmann machines. Cognitive Science, 9(1), 147–169.