Since the OP appears to have decoupled, and so cannot fix the misleading question (it seeks a proof of a non-fact), I'll answer @Emilio Pisanty's dare/call-of-the-bluff: this is a recurring confusion arising from the otherwise outstanding Nieto et al. paper, probably on this site as well. The Truax paper cited in the OP's answer is a red herring: it does not address the essence of the OP's misreading of formula (3.1) in Nieto et al., and it adds nothing to what Nieto et al. already cover nicely and explicitly on p. 1109. (At best, Truax is a footnote in R. Gilmore's book, which details and explains the technique.)
The standard result @mike stone reminds you of in his answer, (3.7), may be recast through the magnificent CBH rearrangement (3.8) into S(z) = (3.2). (But note the obvious typo in the upper right component of the last matrix of (3.5).)
Note (3.2) is in normal order, i.e. correct and complete without any commutators having been sacrificed in any normal-ordering truncation whatsoever!
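For anyone who wants to see the rearrangement at work without trusting anybody's algebra, here is a minimal numerical sketch. It assumes the familiar dictionary $K_+=\tfrac12 a^\dagger a^\dagger$, $K_-=\tfrac12 aa$, $K_0=\tfrac12(a^\dagger a+\tfrac12)$ and the usual non-unitary 2×2 representation of these generators; the specific matrices below are my choice and are not claimed to be the ones printed in (3.5). The function names are hypothetical. Since the CBH rearrangement relies only on the commutators, checking it in a faithful 2×2 representation is the usual argument for the operator identity.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, m):
    """Multiply every entry of the matrix m by the scalar c."""
    return [[c * x for x in row] for row in m]

def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def expm(m, terms=60):
    """Matrix exponential via its Taylor series (ample for 2x2 and modest r)."""
    result = [[1 + 0j, 0j], [0j, 1 + 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for n in range(1, terms):
        term = [[x / n for x in row] for row in matmul(term, m)]
        result = add(result, term)
    return result

# Assumed 2x2 representation, satisfying
# [K0, K+] = K+, [K0, K-] = -K-, [K+, K-] = -2 K0.
K_PLUS = [[0, 1], [0, 0]]
K_MINUS = [[0, 0], [-1, 0]]
K_ZERO = [[0.5, 0], [0, -0.5]]

def squeeze_both_ways(r, theta):
    """Return (single exponential, ordered product) for z = r e^{i theta}."""
    z = r * cmath.exp(1j * theta)
    tau = cmath.exp(1j * theta) * math.tanh(r)
    # exp(z K+ - z* K-), i.e. exp(z a†a†/2 - z* aa/2) in the dictionary above
    single = expm(add(scale(z, K_PLUS), scale(-z.conjugate(), K_MINUS)))
    # exp(tau K+) exp(-2 ln(cosh r) K0) exp(-tau* K-): the normal-ordered factors
    ordered = matmul(
        matmul(expm(scale(tau, K_PLUS)),
               expm(scale(-2 * math.log(math.cosh(r)), K_ZERO))),
        expm(scale(-tau.conjugate(), K_MINUS)))
    return single, ordered

def max_abs_diff(a, b):
    """Largest entrywise discrepancy between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

if __name__ == "__main__":
    single, ordered = squeeze_both_ways(0.7, 1.1)
    print("max entrywise difference:", max_abs_diff(single, ordered))
```

The middle factor, $\exp(-2\ln(\cosh r)\,K_0)$, is $(\operatorname{sech} r)^{a^\dagger a+1/2}$ in the oscillator language, so the ordered product has every $a^\dagger$ to the left of every $a$, with nothing truncated.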
From (3.2), one then may work backwards to the completely superfluous (! yes, I'll stick my neck out) (3.1), correctly copied as
$$
\begin{aligned}
S(z) &= \eta S'(z) \\
&= \eta \exp \left(
\frac{1}{2} e^{i\theta} \tanh{r}\, a^\dagger a^\dagger
- \frac{1}{2} e^{-i\theta}\tanh{r}\, a a
+ (\operatorname{sech} r-1)a^\dagger a - \frac{1}{2}\ln{(\cosh{r})}
\right) ,
\end{aligned}
$$
where $\eta$ is the normal-ordering operator: it trivializes any and all commutators in its argument, so noncommuting operators inside its scope are to be handled as commuting symbols, with every $a^\dagger$ placed to the left of every $a$. It is an empty and inessential flourish, unfamiliar to lots of non-QFT students, and it adds little (nothing?) to the discussion. A good referee would have advised the authors to skip superfluous asides such as this to enhance readability; this very question is proof of how sage that hypothetical advice would have been. So the question is a distraction, a misreading of a classic and important group-theoretical technique. (E.g., I'm using it here.)
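To spell out, as a sketch, what $\eta$ does here and why it is dispensable: inside $\eta$ the terms of the exponent commute, so the exponential factorizes, and the only nontrivial ingredient is the standard identity $\eta \exp\left((e^{\lambda}-1)a^\dagger a\right)=\exp(\lambda a^\dagger a)$, used with $e^{\lambda}=\operatorname{sech} r$. The last line assumes (3.2) is the standard normal-ordered form of the squeeze operator:

$$
\begin{aligned}
&\eta \exp \left(
\tfrac{1}{2} e^{i\theta} \tanh{r}\, a^\dagger a^\dagger
- \tfrac{1}{2} e^{-i\theta}\tanh{r}\, a a
+ (\operatorname{sech} r-1)a^\dagger a - \tfrac{1}{2}\ln{(\cosh{r})}
\right) \\
&\qquad = (\operatorname{sech} r)^{1/2}\;
e^{\frac{1}{2} e^{i\theta} \tanh{r}\, a^\dagger a^\dagger}\;
\eta\left[ e^{(\operatorname{sech} r-1)a^\dagger a} \right]\;
e^{-\frac{1}{2} e^{-i\theta}\tanh{r}\, a a} \\
&\qquad = e^{\frac{1}{2} e^{i\theta} \tanh{r}\, a^\dagger a^\dagger}\;
(\operatorname{sech} r)^{a^\dagger a+\frac{1}{2}}\;
e^{-\frac{1}{2} e^{-i\theta}\tanh{r}\, a a} .
\end{aligned}
$$

The identity itself follows on number states from $a^{\dagger k}a^{k}|n\rangle = \frac{n!}{(n-k)!}|n\rangle$ and the binomial theorem, $\sum_k \binom{n}{k}(e^{\lambda}-1)^k = e^{\lambda n}$. So (3.1) is merely (3.2) with the middle factor rewritten, and no commutator has been discarded along the way.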