It is because quantum entanglement is not a descriptor of properties but of correlations.
To say of two things that they are in a quantum entangled state is to say that there is a physical quantity (e.g. spin direction) that can no longer be assigned to the things individually, and instead their physical state jointly embodies a correlation associated with that physical quantity. A correlation is a statement such as "A and B are the same" or "A and B are different" or "in 33 percent of cases they are the same". In classical physics, correlation is only ever expressed indirectly, as a "parasite" on the back of properties that are there. For example, if two classical coins have the same picture printed on them, then it is because each coin individually has that picture. In quantum entanglement, a different possibility occurs: one can have a case where the difference between two imprints on two coins is guaranteed to be zero, but neither coin has a well-defined picture printed on it.
The observations have the following property.
Suppose two coins are quantum-entangled as in the example above. If one observes (measures) either coin on its own, one finds an imprint drawn randomly from the set of all imprints. Upon comparing results, one also discovers that the imprint seen on coin A is the same as the one on coin B. One can repeat the experiment with lots of coins.
Such observations could also be obtained in classical physics, and none of them represents any form of communication from one coin to the other.
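To make the classical case concrete, here is a minimal sketch, assuming a source that stamps the same randomly chosen picture on both coins of each pair. The function name `classical_coin_pairs` and the three imprint labels are made up for illustration. Each coin on its own looks random, yet the two always agree, and nothing passes between them.

```python
import random


def classical_coin_pairs(n_pairs, imprints, seed=0):
    """Hypothetical classical source: each pair gets one picture, fixed in advance."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        picture = rng.choice(imprints)    # chosen once, at the source
        pairs.append((picture, picture))  # coin A and coin B both carry it
    return pairs


pairs = classical_coin_pairs(10_000, ["sun", "moon", "star"])

# Comparing results: the two coins of a pair always show the same imprint.
all_match = all(a == b for a, b in pairs)

# Looking at coin A alone: the imprint is random, roughly one third each.
fraction_sun_on_a = sum(a == "sun" for a, _ in pairs) / len(pairs)

print(all_match, round(fraction_sun_on_a, 2))
```

Here the correlation is a "parasite" on properties that each coin has individually, which is exactly the situation that entanglement departs from.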
In the quantum case, new possibilities arise: one can look for superpositions of the imprints on the coins. One now needs some mathematical tools to understand it properly. Rather than imprints of pictures, one might simplify and take the case of pairs of arrows that can point in some direction in three-dimensional space. One finds that the complete set of observations that one can make on a group of such pairs of entangled arrows includes a degree of correlation that could not be attained by any physical model in which each arrow on its own has a definite direction. That is Bell's result; the bound that any such model must obey is called Bell's inequality.
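One common form of the inequality is the CHSH version, and a rough numerical sketch of it may help. The assumptions here are mine, not part of the argument above: each observer picks one of two measurement directions in a plane, outcomes are recorded as +1 or -1, and the entangled pair is taken to give an average product of outcomes equal to the cosine of the angle between the two directions (for a spin singlet the sign would be reversed). The names `chsh`, `quantum_E` and the chosen angles are illustrative.

```python
import itertools
import math


def chsh(E, a, a2, b, b2):
    """CHSH combination of four correlations, for settings a, a2 (Alice) and b, b2 (Bob)."""
    return E(a, b) - E(a, b2) + E(a2, b) + E(a2, b2)


def quantum_E(x, y):
    """Assumed quantum prediction for a perfectly correlated pair: cos of the angle between settings."""
    return math.cos(x - y)


# Classical side: each arrow carries a predetermined answer (+1 or -1) for each setting.
# Enumerate all 16 such assignments and take the largest |S| that any of them reaches.
classical_max = max(
    abs(A1 * B1 - A1 * B2 + A2 * B1 + A2 * B2)
    for A1, A2, B1, B2 in itertools.product((-1, 1), repeat=4)
)

# Quantum side: a standard choice of angles (in radians) that maximises S.
a, a2 = 0.0, math.pi / 2
b, b2 = math.pi / 4, 3 * math.pi / 4
quantum_S = chsh(quantum_E, a, a2, b, b2)

print(classical_max, quantum_S)
```

Averaging over any random mixture of the predetermined assignments cannot exceed the largest single value, so the classical bound is 2, whereas the assumed quantum correlation reaches 2√2.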
Notice that I have not said, and I did not need to say, anything about wavefunction collapse in order to describe what is going on. The language of wavefunction collapse can be an aid to the imagination, but one should use it with caution: it can mislead. For example, it can imply that a physical cause at one place can immediately produce a physical effect at another place, but this is quite wrong. If your picture of wavefunction collapse suggests that to you, then your picture is misleading you and you would do better to drop it and try a different picture.
What Bell did was to analyse the observations that can be made on entangled quantum systems without introducing any particular model of wavefunction collapse. He simply focussed his attention on the predictions made by quantum theory about observations that can be made on sets of entangled pairs. He calculated average values of various correlations.
Now let's see how one might try to use these effects to make a form of signalling. One sets up a stream of entangled arrow-pairs, going to Alice and Bob. Let's say they both observe the arrows in the north-south direction (using the pole star Polaris to indicate north), and thus get perfectly correlated results. Then one day Alice, who lives near Alpha Centauri, wants to tell Bob, on Earth, that her answer is "yes". She wants him to know straight away, not after 4 years. They have pre-arranged that she will switch her measuring apparatus to east-west to signal "yes", and leave it at north-south to signal "no". So she switches her apparatus.
What is the effect at Bob's end? There is no effect at all: all his observed outcomes continue to be random, just as they were before. But meanwhile Alice decides to send him her observed outcomes, which she can do by light-speed-limited "snail mail". Four years after Alice switched her apparatus, Bob starts to notice that their two observation streams stopped being correlated at a moment 4 years ago.
Thus he receives the information communicated by Alice, but he can only receive it by observing the change in the correlation, not by any change in the data at his end alone. And he can only find out the change in the correlation by having access to both sets of data. But the measurement outcomes can only be communicated from one place to another by light-speed-limited methods.
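The Alice and Bob story can be checked with a small calculation. This is a sketch under an assumed model, not a full quantum treatment: outcomes are +1 or -1, and the joint probability for a pair measured along directions separated by a given angle is taken to be (1 + a·b·cos(angle))/4, which gives perfect correlation when the directions agree. The function names are illustrative.

```python
import math


def joint_probability(a, b, angle):
    """Assumed probability that Alice sees a and Bob sees b (each +1 or -1)."""
    return (1 + a * b * math.cos(angle)) / 4


def bob_marginal(b, angle):
    """What Bob sees on his own: sum over Alice's outcomes, which he does not know."""
    return sum(joint_probability(a, b, angle) for a in (-1, 1))


def correlation(angle):
    """Average of the product of outcomes; needs BOTH data streams to evaluate."""
    return sum(
        a * b * joint_probability(a, b, angle)
        for a in (-1, 1)
        for b in (-1, 1)
    )


SAME_AXIS = 0.0        # Alice leaves her apparatus at north-south: "no"
CROSSED = math.pi / 2  # Alice switches to east-west: "yes"

for label, angle in (("no", SAME_AXIS), ("yes", CROSSED)):
    print(label, bob_marginal(+1, angle), round(correlation(angle), 12))
```

In this model Bob's own statistics are 50/50 whichever setting Alice chooses, so his data alone carry no message. Only the correlation differs between the two cases, and that can be evaluated only once Alice's outcomes have arrived by ordinary means.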
When you see popular presentations in which an observation of one arrow is said to collapse the state of the other arrow, you should understand that this is just a useful aid to the intuition about the results of quantum measurements. A better intuition is, I think, to say that the entangled state is distributed between the two locations, and measurements at either location find out something about the entangled state. This helps you to see that neither measurement itself affects the other, but both are nevertheless correlated because they are measuring the same "thing", a thing (the pair of arrows) whose properties cannot be wholly accounted for as if each arrow had all its properties all to itself.