Moler and Van Loan's paper "Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later" contains the following extracts:
In estimating the time required by matrix computations it is traditional to estimate the number of multiplications and then employ some factor to account for the other operations. We suggest making this slightly more precise by defining a basic floating point operation, or “flop”, to be the time required for a particular computer system to execute the FORTRAN statement
`A(I,J) = A(I,J) + T*A(I,K)`.
This involves one floating point multiplication, one floating point addition, a few
subscript and index calculations, and a few storage references. We can then say,
for example, that Gaussian elimination requires $n^3/3$ flops to solve an $n$-by-$n$ linear system $Ax = b$....
About $qn^3$ flops are required to evaluate [a Padé approximant]...
The scaling and squaring method of section 3 and some of the matrix decomposition methods of section 6 require on the order of $10n^3$ to $20n^3$ flops and they obtain higher accuracies than those obtained with $200 n^3$ or more flops for the o.d.e. solvers...
Most of the computational cost [in block diagonal methods] lies in obtaining the basic Schur decomposition. Although this cost varies somewhat from matrix to matrix because of the iterative nature of the QR algorithm, a good average figure is $15 n^3$ flops, including the further reduction to block diagonal form.
and there are a few more nuggets buried in there.
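To make the quoted flop convention concrete, here is a minimal sketch (mine, not from the paper) that counts how many times the basic statement above executes during plain Gaussian elimination on an $n$-by-$n$ system; the count comes out at roughly $n^3/3$, matching the figure Moler and Van Loan quote. The function name `gauss_elim_flops` is just illustrative.

```python
import numpy as np

def gauss_elim_flops(A, b):
    """Forward elimination on A x = b, counting multiply-add pairs ("flops" in the quoted sense)."""
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    n = len(b)
    flops = 0
    for k in range(n - 1):                        # eliminate column k
        for i in range(k + 1, n):                 # rows below the pivot
            t = -A[i, k] / A[k, k]                # the division is not counted, as in the paper's convention
            for j in range(k, n):
                A[i, j] = A[i, j] + t * A[k, j]   # the basic flop: one multiply, one add
                flops += 1
            b[i] = b[i] + t * b[k]                # right-hand-side update (ignored in the n^3/3 count)
    return flops

n = 50
rng = np.random.default_rng(0)
print(gauss_elim_flops(rng.standard_normal((n, n)), rng.standard_normal(n)))  # 41650
print(n**3 // 3)                                                              # 41666
```

The $qn^3$ Padé figure and the $10n^3$ to $20n^3$ figure for scaling and squaring are stated in the same unit, so they can be compared directly with counts like this one.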
Based purely on this (without having read the paper you cite, but noting that its title is "A new scaling and squaring algorithm for the matrix exponential"; see also here and http://eprints.ma.man.ac.uk/1442/), I would guess that Al-Mohy and Higham's scaling and squaring method costs on the order of $Cn^3$ flops with $C \approx 15$.
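If you want to sanity-check the cubic scaling empirically rather than by flop counting, SciPy's `expm` is based on the Al-Mohy and Higham algorithm, so a crude timing experiment (a sketch, not a careful benchmark) should show run times growing roughly eightfold when $n$ doubles. The constant you observe depends on the BLAS in use and on how many squarings the norm of $A$ forces, not only on $C$.

```python
import time
import numpy as np
from scipy.linalg import expm   # scaling and squaring, after Al-Mohy and Higham

rng = np.default_rng(0) if hasattr(np, "default_rng") else np.random.default_rng(0)
for n in (200, 400, 800):
    A = rng.standard_normal((n, n))
    t0 = time.perf_counter()
    expm(A)
    print(f"n = {n:4d}:  {time.perf_counter() - t0:6.3f} s")
# If the cost is ~ C n^3, each doubling of n should multiply the time by about 8.
```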