Summary of text by Pal
First, let me summarize the relevant parts of the source linked by the OP.
Consider a $1+1$ dimensional spacetime (one spatial and one time dimension). Let $x, t$ be the coordinates of an event in an inertial frame $S$, and $x', t'$ be the coordinates of the same event in inertial frame $S'$, moving at velocity $v$ with respect to $S$.
Now $x'$ and $t'$ are related to $x$ and $t$ via
\begin{eqnarray}
x' &=& X(x, t, v) \\
t' &=& T(x, t, v)
\end{eqnarray}
Now we consider a rod that is stationary in frame $S$ with endpoints $x_1, x_2$ ($x_2>x_1$), with length $\ell$. In $S'$, at time $t$ the endpoints will be at $X(x_1, t, v)$ and $X(x_2, t, v)$, so the length in $S'$ is given by
$$
\ell' = X(x_2, t, v) - X(x_1, t, v)
$$
Now we displace the rod so that, in frame $S$, the endpoints shift to $x_1+h$ and $x_2+h$. Then in frame $S'$, the endpoints are also shifted. However, by the assumed homogeneity of space, the length in $S'$ can't be affected by simply shifting the endpoints of the rod. Therefore
$$
\ell' = X(x_2+h, t, v) - X(x_1+h, t, v)
$$
Equating the two expressions for $\ell'$ and rearranging gives $X(x_2+h, t, v) - X(x_2, t, v) = X(x_1+h, t, v) - X(x_1, t, v)$. Dividing by $h$ and taking the limit $h\rightarrow 0$, we conclude
$$
\frac{\partial X}{\partial x}\Big|_{x_1} = \frac{\partial X}{\partial x}\Big|_{x_2}
$$
As the OP argues, this condition alone is enough to conclude that $X$ is affine in $x$ at each fixed time, that is,
$$
X(x, t, v) = f(t)\, x + g(t)
$$
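To spell out that step (a short sketch, suppressing the dependence on $v$): since $x_1$ and $x_2$ are arbitrary, $\partial X/\partial x$ cannot depend on $x$ at all, so it is some function of $t$ only, which we call $f(t)$. Integrating with respect to $x$ at fixed $t$ gives
$$
\frac{\partial X}{\partial x} = f(t) \quad\Longrightarrow\quad X(x, t, v) = f(t)\, x + g(t)
$$
where $g(t)$ is the "constant" of integration, which is constant in $x$ but may still depend on $t$. Both $f$ and $g$ may also depend on $v$.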
Pal then states:
One can similarly argue, invoking the homogeneity of time as well, that both $X(x, t, v)$ and $T(x, t, v)$ are linear in the arguments $x$ and $t$.
I believe the OP's question can be answered by unpacking this statement. In particular, there are other relations that hold, such as
\begin{eqnarray}
\frac{\partial X}{\partial t}\Big|_{t_1} &=& \frac{\partial X}{\partial t}\Big|_{t_2} \\
\frac{\partial T}{\partial x}\Big|_{x_1} &=& \frac{\partial T}{\partial x}\Big|_{x_2} \\
\frac{\partial T}{\partial t}\Big|_{t_1} &=& \frac{\partial T}{\partial t}\Big|_{t_2}
\end{eqnarray}
where $x_1, x_2$ and $t_1, t_2$ are arbitrary (and therefore the equations hold for any $x, t$).
Once all the conditions on the partial derivatives are derived, it follows (as Pal says, "making the trivial choice that the origins of the two frames coincide, i.e., $x = t = 0$ implies $x' = t' = 0$") that
\begin{eqnarray}
X(x,t,v) &=& A_v x + B_v t \\
T(x,t,v) &=& C_v x + D_v t
\end{eqnarray}
I'll show how this works for $X$ explicitly, then wave my hands and say $T$ is similar.
Finishing the case of $X$
Recall that the length in frame $S'$ was originally given by
$$
\ell' = X(x_2, t, v) - X(x_1, t, v)
$$
To generalize the argument above to look at time derivatives of $X$, we merely shift the rod in time by an amount $\delta$. By homogeneity in time, the length in frame $S'$ can't change if we merely shift the rod forward in time. So we must have
$$
\ell' = X(x_2, t+\delta, v) - X(x_1, t+\delta, v)
$$
Equating the two expressions for $\ell'$, rearranging, dividing by $\delta$, and taking the limit $\delta\rightarrow 0$, we get that
$$
\frac{\partial X}{\partial t}\Big|_{x_1, t} = \frac{\partial X}{\partial t}\Big|_{x_2, t}
$$
If we apply this to the form $X = f(t) x + g(t)$ we derived earlier, we find
$$
f'(t) (x_2 - x_1) = 0
$$
Since this condition must hold for any $x_2$ and $x_1$, we conclude that $f'(t) = 0$. Therefore, $f$ must simply be a constant in time.
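Written out in full, the substitution is just the time derivative of $X = f(t) x + g(t)$ evaluated at the two endpoints:
\begin{eqnarray}
\frac{\partial X}{\partial t}\Big|_{x_1, t} &=& f'(t)\, x_1 + g'(t) \\
\frac{\partial X}{\partial t}\Big|_{x_2, t} &=& f'(t)\, x_2 + g'(t)
\end{eqnarray}
Setting these equal, $g'(t)$ cancels and only $f'(t)(x_2 - x_1) = 0$ survives. That cancellation is why this condition constrains $f$ but says nothing about $g$.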
However, we still have $g(t)$.
To get a handle on this, we have to consider a different scenario. We consider two points at the same position but different times in frame $S$. We then consider what those points look like in frame $S'$.
We can consider
$$
\Delta x' = X(x, t_2, v) - X(x, t_1, v)
$$
Now here's the kicker. If we shift $t_2$ and $t_1$ by the same amount $\delta$, by homogeneity in space and time, $\Delta x'$ can't change. So
$$
\Delta x' = X(x, t_2 + \delta, v) - X(x, t_1 + \delta, v)
$$
Rearranging, dividing by $\delta$, and taking the limit $\delta\rightarrow 0$, we get
$$
\frac{\partial X}{\partial t}\Big|_{t_1} = \frac{\partial X}{\partial t}\Big|_{t_2}
$$
Applying that to what we had before, we find $g'(t_1)=g'(t_2)$ for all $t_1$ and $t_2$, so $g^\prime$ is a constant and $g$ is a linear function of $t$.
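Putting the pieces together (the offset $c_v$ is my own label, not Pal's notation): with $f$ a constant, which we name $A_v$, and $g'$ a constant, which we name $B_v$, we have
$$
X(x, t, v) = A_v x + B_v t + c_v
$$
Pal's choice that the origins coincide means $X(0, 0, v) = 0$, which forces $c_v = 0$ and leaves $X(x, t, v) = A_v x + B_v t$, matching the form quoted above.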
Sketch of $T$
To do the argument for $T$, instead of the length of a rod, we can consider the time difference between two events that occur at the same spatial location in the frame $S$.
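As a rough sketch of how this could go, mirroring the steps used for $X$ (this is my reading of "similarly", not something spelled out in Pal's text): for two events at the same position $x$ and times $t_1$, $t_2$ in $S$, the time difference in $S'$ is
$$
\Delta t' = T(x, t_2, v) - T(x, t_1, v)
$$
Requiring that $\Delta t'$ be unchanged under a shift $t \rightarrow t + \delta$ gives $\partial T/\partial t|_{t_1} = \partial T/\partial t|_{t_2}$, so $T = p(x)\, t + q(x)$ for some functions $p$ and $q$ (hypothetical names, playing the roles of $f$ and $g$). Requiring that it be unchanged under a shift $x \rightarrow x + h$ gives
$$
p'(x)\, (t_2 - t_1) = 0
$$
so $p$ is a constant. Finally, considering two events at the same time but different positions $x_1$, $x_2$, and shifting both positions by $h$, gives $q'(x_1) = q'(x_2)$, so $q$ is linear in $x$. With the origins chosen to coincide, this yields $T(x, t, v) = C_v x + D_v t$.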