Indeed, you could quite happily assume that the joint wavefunction is the sum of two $1s$ hydrogenic wavefunctions, say,
$$
\psi_\mathrm{sum} (r_1, r_2) = \psi_{1s}(r_1) + \psi_{1s}(r_2).
$$
However, you get into trouble the minute you start wanting to use it. To start with, consider the probability that the first electron is in the interval $[a,b]$, with no restrictions on $r_2$, which should naturally be given by the integral over the entire space of $r_2$:
\begin{align}
P_{(\rm sum)}(r_1\in[a,b])
& =
\int_a^b \mathrm dr_1 \int \mathrm dr_2 |\psi_\mathrm{sum}(r_1,r_2)|^2
\\ & =
\int_a^b \mathrm dr_1 \int \mathrm dr_2 |\psi_{1s}(r_1) + \psi_{1s}(r_2)|^2
\\ & =
\int_a^b \mathrm dr_1 \int \mathrm dr_2 \left[|\psi_{1s}(r_1)|^2 + 2\mathrm{Re}\left(\psi_{1s}(r_1)\psi_{1s}(r_2)^*\right) + |\psi_{1s}(r_2)|^2 \right].
\end{align}
This is going to give a fixed contribution of $[a,b]$ from the $|\psi_{1s}(r_2)|^2$ term (which then tells you that this probability doesn't even have the right units), and a contribution of $\infty$ from the $\int_a^b \mathrm dr_1|\psi_{1s}(r_1)|^2$ term (which is what really interests us) integrated over the unbounded interval $\int\mathrm dr_2$ with nothing to temper the $r_2$ dependence.
Consider, on the other hand, the behaviour of the tensor-product wavefunction for this observable:
\begin{align}
P(r_1\in[a,b])
& =
\int_a^b \mathrm dr_1 \int \mathrm dr_2 |\psi(r_1,r_2)|^2
\\ & =
\int_a^b \mathrm dr_1 \int \mathrm dr_2 |\psi_{1s}(r_1)\psi_{1s}(r_2)|^2
\\ & =
\int_a^b \mathrm dr_1|\psi_{1s}(r_1)|^2 \int \mathrm dr_2 |\psi_{1s}(r_2)|^2
\\ & =
\int_a^b \mathrm dr_1|\psi_{1s}(r_1)|^2,
\end{align}
i.e. exactly what you need it to be.
OK, so that rules out one alternative, but it doesn't fully prove that the tensor product is the only working alternative. However , it does provide a harder criterion for what we need: we want the probability densities $|\psi_i(r_i)|^2$ to combine together like standard probability densities do when the underlying events are independent. In other words, the position-space probability density needs to be
$$
|\psi(r_1,r_2)|^2 = |\psi_{1s}(r_1)\psi_{1s}(r_2)|^2.
$$
By itself, that's not yet enough, but we can say more - we need this to work for all observables, not just for position-space measurements. This takes some doing, but one can show that when you do that general formalism, you just wind up at the universal property of the tensor product, which constrains the resulting structure to being canonically isomporphic to the tensor product.