I believe the key words here are time shift invariance, flow, state, linearity (or homogeneity), continuity and unitarity and the answer to your question is quite independent of whether or not we are doing quantum mechanics.
Suppose we are groping around in the dark trying to describe some new phenomenon as the early (actually 1920s) physicists were. In the tradition of Laplace, you hope that a deterministic system description will work (and the part of QM that uses complex numbers, namely unitary state evolution, is altogether and utterly deterministic). So we are going to begin with a state: some information - an array of numbers $\psi$ - real ones for now if you like - that will wholly determine the system's future (and past) if the system is sundered from the rest of the Universe.
So we have some $\psi\in\mathbb{R}^N = X$ that is our basic description.
Now an isolated system evolves on its own: with the lapsing of time the system changes in some way. Let $\varphi: X\times\mathbb{R}\to X$ be our time evolution operation: for $x\in X$ $\varphi(x,t)\in X$ is what the state evolves to after a time $t$. Moreover, the description cannot be dependent on when we let it happen: it must have a basic time shift invariance. The experiment's results cannot depend on whether I do it now or whether I wait till I've had my cup of tea. So the description of the change in some time interval $\Delta t$ must be the same as that for the change in any other $\Delta t$. If we're not interacting with the system, then there are no privileged time intervals. Therefore, we must have:
$$\varphi(x,t+s) = \varphi(x,s+t) = \varphi(\varphi(x,t),s) = \varphi(\varphi(x,s),t),\;\forall s,t\in\mathbb{R}$$
and
$$\varphi(x,N \Delta t) = \underbrace{\varphi(\varphi(\cdots \varphi(\varphi(}_{N\ \text{iterations}}x,\overbrace{\Delta t),\Delta t)\cdots,\Delta t)}^{N\ \text{iterations}}$$
so, from our Copernican notion of time shift invariance we have our first next big idea: that of a flow defined by the state transition operator $\varphi$. So our state transition operators form a one parameter, continuous group. Here we have brought our fundamental idea of continuity to bear.
This may seem to be saying (since we now have a one-parameter Lie group) that the only systems that fulfill these ideas are linear ones, but this is not quite so: the Lie group only acts on $X$ so there is always the possibility of some local, $X$-dependent nonlinear "stretching" or "shrinking" of the path. So now we make an assumption of system linearity or of some other notion of a homogeneous action (so that, intuitively, $\varphi(-,t):X\to X$ conserves some local "structure" of our state space $X$). Then the only continuous, linear, homogeneous state transition flow operator on $X$ is:
$$\varphi(x, t) = \exp(H\,t) x \stackrel{def}{=} \left({\rm id} + H\,t + H^2\,\frac{t^2}{2!} + H^3\,\frac{t^3}{2!} + \cdots \right) x$$
for some linear operator $H:X\to X$.
So now, to investigate the behaviour of this basic model, we need to think about the eigenvalues of $H$, so that, for example, we can "spectrally factorise" the operator ("diagonalise" it - or at least come as near as we can to diagonalising it - but this always can be done in QM). The natural home for the eigenvalues of an operator are $\mathbb{C}$, not $\mathbb{R}$, because the former is an algebraically closed field and the latter is not. There exist finite dimensional linear operators whose characteristic equations have solutions in $\mathbb{C}-\mathbb{R}$. Moreover, we must consider the behaviours of $e^{\lambda\,t}$ where $\lambda$ are the eigenvalues of $H$ (this is easy in a finite dimensional space - for an infinite dimensional one, see my answer here to the Physics SE question "Are all scattering states un-normalizable?")- how do these square with our physics?
If, for any eigenvalue, $\lambda$ is real and positive, or if complex and its real part is real and positive, our state vector diverges to infinite length as $t\to\infty$. If they are all complex with negative real part, then our state vector swiftly dwindles to the zero vector. Even before we have crystallised the idea of probability amplitude properly, we may have the idea that we "want as much state to stick around for as long as possible*. The system must end up in some nonzero state: our particles don't all collapse to the same state $0\in X$. So all eigenvalues being wholly imaginary so that $e^{\lambda\,t}$ has bounded magnitude that doesn't rush of to infinity or to nought, might be a fair bet. Bringing this into sharper focus: having $\exp(H\,t)$ conserve length might be another reasonable assumption. The easiest "length" in state space is of course the $\mathbf{L}^2$ one. So now we might postulate, even before our ideas about probability amplitude in QM are fully crystallised, that:
The operator $\exp(H\,t)$ is unitary
This equivalently means that:
$H$ is skew-Hermitian
or that our operator is $\exp(i\,\hat{H}\,t)$ where:
$\hat{H}$ is Hermitian
Hermitian operators, given very mild assumptions, are always diagonalisable, and they always have real eigenvalues. So entities like $\exp(i\,\omega\,t)$ where $\omega \in \mathbb{R}$ are inevitable in our state transition operator expansions.
Now you are right and we could keep everything real, use $\cos$ and $\sin$ instead and keep our state space as $\mathbb{R}^{2 N}$ were we would have $\mathbb{C}^N$ in the conventional description. Whether or not we choose to single out an object like:
$$\left(\begin{array}{cc}0&-1\\1&0\end{array}\right)\in U(1), SU(2), SO(3), U(N) \cdots$$
and give it a special symbol $i$ where $i^2=-1$ is a "matter of taste", so in this sense the use of complex numbers is not essential. Nonetheless, we would needfully still meet this object in decomposing our state transition operator and $H$, $\hat{H}$ operators and ones like it and would have to handle statements involving such objects when describing physics - there's no way around this. So, in particular, if we have bounded, but everlasting wave behaviour, we must be encountering pairs of states in our $\mathbb{R}^{2 N}$ representation of the conventional $\mathbb{C}^N$ that evolve with time through linear differential operators with submatrices like the $i$ object above.
So you actually see that complex numbers arise very naturally out of the very classical and indeed Renaissance ideas of people like Laplace, Copernicus, Leibnitz, Galileo and Newton. We simply have a better and more refined mathematical language to talk smoothly about these ideas where these guys were groping in the dark.
And now, if we keep linearity but drop time shift invariance (say we have some "control" of our linear operator $H(t)$: we might have an electron in a classical magnetic field which we can vary to "steer" the electron's state, for example) we still get most of the above ideas. For now we solve:
$${\rm d}_t U(t) = H(t) U(t)$$
which we can think of as having some dials to twiddle to steer our Lie algebra member $H$ in the tangent space at $U$ to the Lie group of state transition operators. $H$ must still be skew-hermitian and $U\in U(N)$ (if our system is finite dimensional). See my answers here to the Physics SE question "Why can the Schrödinger equation be used with a time-dependent Hamiltonian?" as well as, for a practical example, here to the Physics SE question "fixed input qubit state to an arbirary pure state using two variable rotations and one fixed rotation"