As for what is "really" going on, nobody knows for sure. However, the following is an account I have been developing for a while, based on some of the most modern work in the field of quantum foundations and interpretations, including the new "quantum reconstruction" programs, which seek to derive quantum mechanics from elementary postulates in a manner similar to Einstein's derivation of special and general relativity.
The general thrust that much of this work seems to be circling around is that a quantum state vector
$$|\psi\rangle$$
is not an object that should be attributed directly to a physical system. Rather, it is a mathematical representation, or model, of information held by an agent about the physical system in question. In scientific usage the agent is typically taken to be a human experimenter in a lab, but there is absolutely no reason it needs to be. The only things the agent needs to be able to do are to store information, to acquire information from the external system, to update its internal store accordingly, and, of course, to have some way of deciding which of these actions to perform. In this setup, the "wave function collapse" is just the update of the agent's information store with new information.
To get a better handle on how that works, consider the following simple scenario. Suppose that you and a friend (I'll call her Kionna) are about to leave your house to go somewhere. There is a tea kettle in the house, and Kionna had been using it earlier. You both go out the door and get in the car. Just as you do so, you ask Kionna if she left the kettle on.
We can, in this mathematical setup, model Kionna's knowledge of the kettle as follows. Let her knowledge be denoted $|\psi\rangle_\mathrm{Kionna}$. There are two vectors, $|\mbox{kettle on}\rangle$ and $|\mbox{kettle off}\rangle$, representing the state of the kettle. If she knows it is still on, we have
$$|\psi\rangle_\mathrm{Kionna} = |\mbox{kettle on}\rangle$$
and if she knows it is off,
$$|\psi\rangle_\mathrm{Kionna} = |\mbox{kettle off}\rangle$$
as you might think. But what if you ask her and she says "Well, I'm not entirely sure. I think I turned it off", and you ask "Are you sure?" and she says "Well, maybe 75% sure"? What do you make of this? Would you find it less informative than you'd like, compared with her saying "yes, it's on" or "no, I turned it off" and "I'm sure"? I'd expect you would.
That, essentially, is the informational interpretation of probability, and in the formalism here, we'd describe her knowledge with
$$|\psi\rangle_\mathrm{Kionna} = \sqrt{0.75}\ |\mbox{kettle off}\rangle + \sqrt{0.25}\ |\mbox{kettle on}\rangle$$
This, of course, is a superposition - and this is its interpretation: it means that the information she has is less than if she knew for sure it was in either state alone. She cannot give you as much information. Nonetheless, there is a way to find out: either you go back in and look, or you ask her to. If she goes back in, she will, of course, see the kettle's state, and her knowledge should then be considered as changed; if she finds it off, it becomes
$$|\psi\rangle_\mathrm{Kionna} = |\mbox{kettle off}\rangle$$
and you then hear this new datum and are relieved.
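To make the bookkeeping in the kettle example concrete, here is a minimal sketch in Python. The function names (`probabilities`, `missing_information_bits`, `update`) are my own, purely for illustration. It assumes the usual Born rule, where the probability of each answer is the squared magnitude of its amplitude, and it uses the Shannon entropy of those probabilities as one possible measure of how much information Kionna is missing. That choice of measure is an assumption on my part, not the only one available.

```python
import math

# Kionna's state of knowledge from the example: amplitudes, not probabilities.
before = {"kettle off": math.sqrt(0.75), "kettle on": math.sqrt(0.25)}

def probabilities(state):
    """Born rule: probability of each answer = |amplitude|^2."""
    return {label: abs(amp) ** 2 for label, amp in state.items()}

def missing_information_bits(state):
    """Shannon entropy (in bits) of the answer probabilities.

    0 bits means the agent is certain; 1 bit means a 50/50 guess.
    """
    return sum(-p * math.log2(p) for p in probabilities(state).values() if p > 0)

def update(state, observed):
    """The 'collapse': the agent looks and records what it saw."""
    if probabilities(state)[observed] == 0:
        raise ValueError("observed an answer the state had ruled out")
    return {label: (1.0 if label == observed else 0.0) for label in state}

# Kionna goes back in and finds the kettle off.
after = update(before, "kettle off")

print(probabilities(before))             # roughly 0.75 for off, 0.25 for on
print(missing_information_bits(before))  # roughly 0.81 bits missing
print(missing_information_bits(after))   # nothing missing after looking
```

Nothing about the kettle changes in this sketch; only the agent's record does, which is the whole point of the classical analogy.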
Likewise, the same goes for a quantum system such as, in your question, a qubit, where now the states are $|0\rangle$ and $|1\rangle$, or $\left|\uparrow\right\rangle$ and $\left|\downarrow\right\rangle$, as in the case where the qubit is the direction (up or down) of one component of a spin angular momentum vector. When it is in a superposition, that means that we (or, more generally, the agent to whom we attribute this information) have impoverished information about it. But that raises the question: if this $|\psi\rangle$ is "just" information held by the agent, and not by the qubit directly, then what is all the hoopla about? Why is this different from any other case where one is simply missing information, as we just used to illustrate these concepts?
Well, the way it seems is that in many experiments on real quantum systems, it is not only our "agent" (a person experimenting on it in the lab, or anything else) that lacks the information: the qubit itself actually behaves "in reality" as if its information were likewise limited! This cannot be detected with a single qubit, because there is no way to distinguish empirically whether the result you get by asking for its value was there to begin with, or was created at the time you looked at it, or perhaps at some time between when you first became curious and when you looked. Instead, it requires many qubits, prepared in specially correlated "entangled" states, to recover statistical phenomena showing that they cannot have been in only the states $|0\rangle$ or $|1\rangle$ all along or, at least, that if that information did pre-exist, it would require a violation of relativistic causality under the hood, which seems to cast doubt upon it. Moreover, it can be derived that an agent querying the state of the object must invariably change it, on pain of gaining no information, and thus we cannot be sure exactly how much information was there before we did so. That doesn't mean there wasn't any, but we can't know it, except perhaps by bounding it.
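One standard way to see the kind of statistical phenomenon meant here is the CHSH form of Bell's inequality. I did not name it above, so take the following as my choice of illustration rather than the only one. The sketch assumes two entangled qubits in the singlet state, each measured along one of two directions in a plane. The helper names and the particular angles are hypothetical choices of mine. It compares the CHSH combination of correlations predicted by quantum theory with the largest value reachable if every answer had been fixed in advance.

```python
import itertools
import math

def spin_op(theta):
    """Spin measurement along angle theta in the x-z plane (outcomes +1/-1)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]

def kron(a, b):
    """Tensor product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def expectation(op, psi):
    """<psi|op|psi> for a real state vector psi."""
    return sum(psi[i] * op[i][j] * psi[j] for i in range(4) for j in range(4))

# Singlet state (|01> - |10>)/sqrt(2), basis order |00>, |01>, |10>, |11>.
singlet = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]

def correlation(a, b):
    """Average product of the two outcomes for measurement angles a and b."""
    return expectation(kron(spin_op(a), spin_op(b)), singlet)

def chsh_quantum(a, a2, b, b2):
    """CHSH combination of four correlations."""
    return (correlation(a, b) - correlation(a, b2)
            + correlation(a2, b) + correlation(a2, b2))

def chsh_predetermined_max():
    """Largest |S| when all four answers are fixed +1/-1 values in advance."""
    return max(abs(A * B - A * B2 + A2 * B + A2 * B2)
               for A, A2, B, B2 in itertools.product([-1, 1], repeat=4))

s_quantum = chsh_quantum(0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)
print(abs(s_quantum))            # about 2.83, i.e. 2*sqrt(2)
print(chsh_predetermined_max())  # 2
```

The gap between the two numbers is the point: averaging over any mixture of pre-assigned answers cannot exceed 2, so statistics near 2.83 cannot come from values that were simply sitting there all along, unless something non-local is going on underneath.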
So the answer to "what happens when a system is put in a superposition?" or "is it in two states at once?" is that it is best thought of as being in a state in which it kind of doesn't know which one it's in. It's in a state where there is only a fraction of a bit, if you will, specifying which of those states it is in, like a "splinter" of its classical, fully-resolved state. We cannot really visualize that, because our minds are only built for visualizing things that can, effectively, be expressed as a grid of pixels (even though a human brain doesn't literally work that way, hence why I said "expressed"), with classically many bits in each pixel. It is, in effect, the same situation as trying to visualize 4 or more dimensions: it would require a higher-dimensional retina, and thus a higher-dimensional visual cortex, and thus a higher-dimensional brain altogether, to actually do it justice. The mathematical structures involved simply do not match or embed faithfully into the perceptual structures our brains operate on.
As to how exactly these things manage to exist in such a way that these parameters have these reduced, non-classical levels of information - now that's the part we don't know and, most likely, can't know. This is actually a generic feature of empirical science: it only provides us with models. Where quantum theory takes this a step further is that it tells us that, unless the theory is wrong, we cannot have any empirical means even to construct a model of the world that is both independent of the knowledge of the different agents within it and separable from their actions upon it: all such models are neither verifiable nor refutable. Attempts to construct such models are basically what various other "interpretational" schemes like "many worlds" and so forth amount to, and thus there is no way to tell if any of them are "correct".
That said, one thing you may have heard in "popular" expositions of quantum theory, and a favorite jumping-off point for many "kooky" theorizers, is the claim that quantum mechanics says a "conscious observer" is required, or that "consciousness" changes reality in some "magical" way (like "psychic" powers) upon an observation, or that "reality" is "all in our minds". None of this is required or implied by quantum mechanics, any more than it is by classical mechanics. The reason some come to this conclusion is the need to include the agent in the description, but where it goes wrong is in assuming additional qualifiers on the agents that are not at all necessary, such as that they be specifically conscious beings, when all that is "really" required is information processing. Of course, since conscious beings like ourselves are the chief users of the theory, they are also the chief agents we are concerned with, but that's a far cry from saying that the theory somehow demands consciousness specifically on the part of the agent we have to posit in order to use it as a descriptive tool. Nor does the theory require a belief that no "reality" which can be described independently of the agents exists: it constrains such a reality, and it prevents building a verifiable model of it, but those are not the same as a statement of nonexistence. Again, you can believe that if you want, but it is a stronger position than is required.