@Qmechanic answered your question in the comments, but evidently the message did not automatically sink in. Let me try to illustrate it with the routine demonstration you probably were exposed to when learning the uses of the Dirac bra-ket notation.
The short answer is that, for an infinite-dimensional Hilbert space, the nonvanishing trace of the identity on the r.h.s. of your commutator equation leads to $\langle a | a\rangle \neq 1$, because this norm is actually singular. So your equation 1) is fine, since the r.h.s. is infinite. But your equation 2) is flawed, since the relevant expression involves a 0 multiplying a stronger infinity, amounting to infinity again, as in 1).
I will illustrate this with $A=\hat x$ and $B=\hat p /\hbar$, as in standard QM courses. Absorb $\hbar$ in $\hat p$ to make the formalism more familiar.
Starting from the standard operator equation $[\hat x ,\hat p]=i 1\!\!1 $, first take its non-diagonal matrix elements, before building up to your 2),
$$
\begin{aligned}
\langle x|\hat x \hat p - \hat p \hat x|y\rangle &= (x-y)\langle x|\hat p|y\rangle=(x-y)\int dp ~ \langle x| p\rangle \langle p| \hat p|y\rangle \\
&=(x-y)\int dp ~ \langle x| p\rangle \, p \, \langle p |y\rangle \\
&=\frac{ (x-y)}{2\pi}\int dp ~ p ~e^{i(x-y)p} =-i (x-y)\partial_x \delta (x-y) \\
&=i \delta(x-y).
\end{aligned}
$$
As always, $\langle x| p\rangle=\exp(ixp)/\sqrt{2\pi}$. Check the last equality by operating on a well-behaved test function. It simply reflects the homogeneity of degree $-1$ of the delta function, $\delta(\lambda x)=\delta(x)/\lambda$ for $\lambda>0$: differentiate this with respect to $\lambda$ and set $\lambda=1$ to get $x\,\delta'(x)=-\delta(x)$.
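For completeness, here is a sketch of the test-function check of that last equality. The shorthand $u=x-y$ and the smooth, rapidly decaying test function $f$ are my own notation, not anything in your question. Integrating by parts, with vanishing boundary terms,

$$
\int du ~ f(u)\, u\, \delta'(u) = -\int du ~ \frac{d}{du}\Bigl( u f(u)\Bigr)\, \delta(u) = -\Bigl( f(0) + 0\cdot f'(0)\Bigr) = -f(0),
$$

so that, as a distribution, $u\,\delta'(u)=-\delta(u)$, and hence $-i(x-y)\partial_x\delta(x-y)=i\delta(x-y)$.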
That is, the expression diverges for $x\to y$, just like 1). The crucial point is that, as the prefactor $(x-y)$ decreases, the matrix element multiplying it diverges, and does so faster.
All matrix elements of a commutator being available, as above, you may reconstitute your original operator equations from these, by insertion of resolutions of the identity on either side.