This is really two questions in one, namely what is a ket and what is a state. Both questions are bound to cause some degree of controversy between physicists of different mathematical pedigrees. I will assume finite-dimensional Hilbert spaces for the entirety of this answer, since that case contains the relevant conceptual details - infinite dimensions are morally the same, but come with significantly increased technical baggage.
Kets
Def: A ket is an element of the Hilbert space $\mathscr H$, and a bra is an element of the dual space $\mathscr H'$.
This definition is straightforward, and is basically defining a convenient notation. From this perspective, $|a\rangle$ is simply a vector in $\mathscr H$, and $a$ is a label we have chosen for it. This is particularly convenient when talking about eigenvectors; if a particular vector is an eigenvector of some operator $A$ with eigenvalue $a$, it is convenient to call that vector $|a\rangle$ as a helpful reminder of its most contextually relevant characteristic (namely, that $A|a\rangle = a |a\rangle$). Note that $a$ is a number, while $|a\rangle$ is the ket (i.e. element of $\mathscr H$) which we have labeled by that number.
Let's say $\psi_a$ and $\psi_b$ are eigenvectors of some self-adjoint operator $A$ with eigenvalues $a$ and $b$, respectively. We will write them in "ket form" as $|a\rangle$ and $|b\rangle$. The action of the bra $\langle a|$ on the ket $|b\rangle$ is given by
$$\langle a| \big( |b\rangle\big) \equiv \langle a|b\rangle = \langle \psi_a,\psi_b\rangle$$
Therefore, we can compute the inner product between $|a\rangle$ and $|b\rangle$ by turning $|a\rangle$ into its bra "dual" $\langle a|$ and then operating on $|b\rangle$.
This notation is exceedingly convenient. As an example (which we will see again in a bit), we might consider the operator which projects an arbitrary vector onto (the linear span of) $\psi_a$:
$$ \phi \mapsto \frac{\langle \psi_a, \phi\rangle}{\langle \psi_a,\psi_a\rangle }\psi_a$$
Utilizing bra-ket notation, this operator takes the straightforward form
$$\frac{|a\rangle\langle a|}{\langle a|a\rangle}$$
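As a quick sanity check, here is a minimal sketch of this projector in plain Python. It assumes vectors are stored as lists of complex numbers and that the inner product is conjugate-linear in its first argument. The helper names (`inner`, `projector`, `apply`) and the sample vectors are illustrative choices, not anything fixed by the notation.

```python
# Minimal sketch: vectors are lists of complex numbers; all names are illustrative.

def inner(u, v):
    """Inner product <u, v>, conjugate-linear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

def projector(a):
    """Matrix of |a><a| / <a|a>, stored as a list of rows."""
    norm_sq = inner(a, a).real
    return [[ai * aj.conjugate() / norm_sq for aj in a] for ai in a]

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

a = [1 + 0j, 1j]          # an unnormalized vector playing the role of psi_a
phi = [2 + 0j, 3 + 0j]    # an arbitrary vector to project

P = projector(a)

# Route 1: act with the matrix |a><a| / <a|a>.
via_matrix = apply(P, phi)

# Route 2: use the formula phi -> (<psi_a, phi> / <psi_a, psi_a>) psi_a.
coeff = inner(a, phi) / inner(a, a)
via_formula = [coeff * ai for ai in a]

print(via_matrix)
print(via_formula)
```

Both routes give the same vector, and applying `P` a second time leaves the result unchanged, as expected of a projection.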
Infinite dimensional caveat: One of the subtleties which arises in infinite dimensions is the fact that we have to consider generalized (or non-normalizable) eigenvalues and eigenvectors, corresponding to operators with continuous spectra. Some authors define bras and kets such that they need not actually belong to the Hilbert space itself, as long as they obey some looser requirements. This is what happens with the "position eigenkets" $|x\rangle$, for example. For more, see e.g. my answer here.
States
This question is more subtle, so let's zoom out a bit. In classical mechanics, a state is generally specified by giving the values of all of the possible observables which we might measure. For example:
For a system consisting of a single point particle we might specify a state by the values of the position $\mathbf x$ and the momentum $\mathbf p$; any other observable (kinetic energy, angular momentum, etc) is then determined by these values.
For a more complex system (e.g. a rigidly-rotating sphere), we might specify its center-of-mass position $\mathbf X_{cm}$, its total linear momentum $\mathbf P = M \dot{\mathbf X}_{cm}$, and its angular momentum $\mathbf L$.
When we learn statistical mechanics, we discover that it's often extremely useful to generalize this. Rather than defining a state as a specification of the precise value for each observable, we can specify a state by providing a probability distribution for the values of each observable. From this perspective, the fundamental questions we can ask take the following form:
$$\text{What is the probability that the observable $\mathcal O$ takes its value in some subset $E\subseteq \mathbb R$?}\tag{$1$}$$
This way of thinking is deeply physical, in the sense that this is the only thing we can ever actually measure. If we imagine that we can know the exact values of all of the relevant observables, then these probabilities are always either 0 or 1; if the exact value of $\mathcal O$ is 5, then the probability that $\mathcal O$ takes its value in $E$ is 1 if $5\in E$ and 0 otherwise. However, once we add in uncertainty, the probabilities become non-trivial.
All that is to say, a very physical definition of a state is that it is a prescription of a probability distribution to each possible observable - in other words, a mechanism for answering all possible questions of the form $(1)$.
In quantum mechanics, this definition continues to hold. Given the standard formulation of quantum mechanics in which observables are represented by self-adjoint operators on some Hilbert space $\mathscr H$, whose spectra (loosely, eigenvalues) correspond to the possible measurement outcomes, one might ask how we can assign probability distributions to observables.
Here is one possibility, in which I restrict my attention to finite-dimensional Hilbert spaces for simplicity (the fundamental idea remains the same for infinite-dimensional spaces). Any self-adjoint operator $A$ can be written in the form $A = \sum_{i} \lambda_i \mathbb P_i$, where the $\lambda_i$ are the distinct eigenvalues of $A$ and $\mathbb P_i$ is the self-adjoint projection operator onto the eigenspace corresponding to $\lambda_i$. Furthermore, the projection operators sum to the identity operator: $\sum_i \mathbb P_i = \mathbf 1$.
Example:
$$A = \pmatrix{1 & 0 \\ 0 & -1} = (1) \cdot \pmatrix{1&0\\0&0} + (-1) \cdot \pmatrix{0&0\\0&1}$$
$$B = \pmatrix{0 & i \\ -i & 0} = (1)\cdot\pmatrix{1/2 & i/2 \\ -i/2 & 1/2} + (-1)\cdot \pmatrix{1/2 & -i/2 \\ i/2 & 1/2}$$
It is a straightforward exercise to demonstrate that each matrix written on the right-hand side of the above expressions is indeed a projection operator onto the appropriate eigenspace, and that the projection operators sum to the identity operator.
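For readers who would rather check than prove, the exercise for $B$ can be sketched numerically. The snippet below uses plain Python complex numbers and small hand-written helpers (`matmul`, `combine`, `close`), whose names are illustrative assumptions. It tests idempotence, mutual orthogonality, completeness, and that the weighted sum reproduces $B$.

```python
# Sketch with plain Python complex numbers; helper names are illustrative.

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(c1, X, c2, Y):
    """Linear combination c1*X + c2*Y, entry by entry."""
    return [[c1 * x + c2 * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def close(X, Y, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(x - y) < tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

B = [[0, 1j], [-1j, 0]]
P_plus = [[0.5, 0.5j], [-0.5j, 0.5]]     # claimed projector for eigenvalue +1
P_minus = [[0.5, -0.5j], [0.5j, 0.5]]    # claimed projector for eigenvalue -1
identity = [[1, 0], [0, 1]]
zero = [[0, 0], [0, 0]]

checks = {
    "P_plus is idempotent": close(matmul(P_plus, P_plus), P_plus),
    "P_minus is idempotent": close(matmul(P_minus, P_minus), P_minus),
    "projectors are mutually orthogonal": close(matmul(P_plus, P_minus), zero),
    "projectors sum to the identity": close(combine(1, P_plus, 1, P_minus), identity),
    "B equals (+1) P_plus + (-1) P_minus": close(combine(1, P_plus, -1, P_minus), B),
    "B acts as +1 on the range of P_plus": close(matmul(B, P_plus), P_plus),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

Every check should come out `True`; the same helpers work for $A$ by swapping in its two diagonal projectors.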
This structure, which underlies all self-adjoint operators, is neatly summarized in the form of a projection-valued measure $\mu$, which consists of the following map. For any subset $E\subseteq \mathbb R$,
$$ \mu(E) = \sum_{\lambda_i \in E} \mathbb P_i\tag{$2$}$$
In words, it is the sum of all of the projection operators which correspond to eigenvalues which lie in $E$.
Example:
In the case of the operator $B$ given above, we would have
$$\mu_B(\{1\}) = \pmatrix{1/2&i/2\\-i/2&1/2} \qquad \mu_B\big((-\infty,0)\big) = \pmatrix{1/2&-i/2\\i/2&1/2}$$
$$\mu_B\big(\{0\}\big) = \pmatrix{0&0\\0&0} \qquad \mu_B(\mathbb R) = \pmatrix{1/2&i/2\\-i/2&1/2}+\pmatrix{1/2&-i/2\\i/2&1/2} = \pmatrix{1&0\\0&1}$$
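The map $(2)$ translates almost directly into code. The following is a rough sketch, assuming the operator is supplied as a list of (eigenvalue, projector) pairs and the subset $E$ is represented by a membership predicate. Both representations, and the name `pvm`, are choices made here for illustration.

```python
# Sketch of a projection-valued measure in finite dimensions.
# Assumptions: the operator is given as (eigenvalue, projector) pairs, and a
# subset E of the reals is represented by a predicate in_E(x) -> bool.

def pvm(spectral_pairs, in_E):
    """Return the sum of the projectors whose eigenvalue lies in E."""
    dim = len(spectral_pairs[0][1])
    total = [[0j] * dim for _ in range(dim)]
    for eigenvalue, P in spectral_pairs:
        if in_E(eigenvalue):
            total = [[t + p for t, p in zip(rt, rp)] for rt, rp in zip(total, P)]
    return total

# Spectral data of the operator B from the example above.
B_spectral = [
    (1, [[0.5, 0.5j], [-0.5j, 0.5]]),
    (-1, [[0.5, -0.5j], [0.5j, 0.5]]),
]

mu_one = pvm(B_spectral, lambda x: x == 1)        # E = {1}
mu_negative = pvm(B_spectral, lambda x: x < 0)    # E = (-infinity, 0)
mu_zero = pvm(B_spectral, lambda x: x == 0)       # E = {0}
mu_reals = pvm(B_spectral, lambda x: True)        # E = R

for label, M in [("{1}", mu_one), ("(-inf, 0)", mu_negative),
                 ("{0}", mu_zero), ("R", mu_reals)]:
    print(label, M)
```

The four results reproduce the matrices listed in the example: one projector each for the first two sets, the zero matrix for $\{0\}$, and the identity for $\mathbb R$.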
With the concept of a projection-valued measure in hand, I make the following claim. If you pick out any non-zero vector $\psi\in \mathscr H$, the following constitutes a probability distribution for the observable represented by the self-adjoint operator $A$:
$$ \mathrm{Prob}_A(E) := \frac{\langle \psi, \mu(E) \psi\rangle}{\langle \psi, \psi\rangle}\tag{$3$}$$
Observe that $\mathrm{Prob}_A(\mathbb R) = 1$, which comes from the fact that $\mu(\mathbb R) = \mathbf 1$. It is a good exercise to prove that:
$\mathrm{Prob}_A(E^c) = 1-\mathrm{Prob}_A(E)$, where $E^c$ is the complement of $E$
$E\cap F = \emptyset \implies \mathrm{Prob}_A(E\cup F) = \mathrm{Prob}_A(E) + \mathrm{Prob}_A(F)$
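To make $(3)$ concrete, here is a small sketch that evaluates it for the operator $A$ from the earlier example and a deliberately unnormalized vector. As before, subsets of $\mathbb R$ are represented by predicates, and the function names `pvm` and `prob` as well as the sample vector $\psi = (3, 4i)$ are assumptions made for illustration.

```python
# Sketch: probabilities from a vector via Prob(E) = <psi, mu(E) psi> / <psi, psi>.
# Subsets of the reals are represented by predicates; all names are illustrative.

def inner(u, v):
    """Inner product <u, v>, conjugate-linear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def pvm(spectral_pairs, in_E):
    """Sum of the projectors whose eigenvalue lies in E."""
    dim = len(spectral_pairs[0][1])
    total = [[0j] * dim for _ in range(dim)]
    for eigenvalue, P in spectral_pairs:
        if in_E(eigenvalue):
            total = [[t + p for t, p in zip(rt, rp)] for rt, rp in zip(total, P)]
    return total

def prob(spectral_pairs, psi, in_E):
    """Probability that the observable takes its value in E, given psi."""
    mu_E = pvm(spectral_pairs, in_E)
    return (inner(psi, apply(mu_E, psi)) / inner(psi, psi)).real

# Spectral data of A = diag(1, -1).
A_spectral = [(1, [[1, 0], [0, 0]]), (-1, [[0, 0], [0, 1]])]
psi = [3 + 0j, 4j]   # not normalized: <psi, psi> = 25

p_E = prob(A_spectral, psi, lambda x: x == 1)          # E = {1}, equals 9/25
p_F = prob(A_spectral, psi, lambda x: x == -1)         # F = {-1}, equals 16/25
p_union = prob(A_spectral, psi, lambda x: x in (1, -1))
p_complement = prob(A_spectral, psi, lambda x: x != 1)  # complement of E
p_all = prob(A_spectral, psi, lambda x: True)           # E = R

print(p_E, p_F, p_union, p_complement, p_all)
```

Since $E$ and $F$ are disjoint here, `p_union` equals `p_E + p_F`, and `p_complement` equals `1 - p_E`, which illustrates the two properties listed above for this particular case.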
So with this in mind, do vectors in the Hilbert space correspond to states? The answer is almost, but not quite - observe that for any nonzero $\alpha\in \mathbb C$, $\psi$ and $\alpha \psi$ yield exactly the same probability distribution, and therefore exactly the same physical state. As a result, a state is associated not to a single vector $\psi$ but rather the set $\{\alpha \psi \in \mathscr H : \alpha\in\mathbb C\}$. Such a set is called a ray in $\mathscr H$.
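The invariance under rescaling can be seen in one line. Since $\mu(E)$ is linear, the factor $|\alpha|^2$ appears in both the numerator and the denominator of $(3)$ and cancels:

$$\frac{\langle \alpha\psi, \mu(E)\, \alpha\psi\rangle}{\langle \alpha\psi, \alpha\psi\rangle} = \frac{|\alpha|^2\,\langle \psi, \mu(E) \psi\rangle}{|\alpha|^2\,\langle \psi, \psi\rangle} = \frac{\langle \psi, \mu(E) \psi\rangle}{\langle \psi, \psi\rangle}$$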
Since states correspond to entire rays, you are free to pick any vector inside the ray to use to calculate your probabilities. It is particularly (computationally) convenient if the denominator in $(3)$ is equal to 1 - i.e. if your chosen ray representative is normalized. For this reason, it is not uncommon to simply say that states correspond to normalized elements of $\mathscr H$, though this is not quite true, since even if $\psi$ is normalized, $\psi$ and $e^{i\beta}\psi$ correspond to the same state for all $\beta\in \mathbb R$.
This isn't quite the end of the story. Note that we may utilize the bra-ket notation to define the following operator:
$$\rho = \frac{|\psi\rangle\langle \psi|}{\langle\psi|\psi\rangle}$$
in which case
$$\mathrm{Prob}_A(E) = \mathrm{Tr}\big(\mu(E)\rho\big)$$
as can be straightforwardly shown in a line or two. Observe that for any $\psi,\psi'$ in the same ray, $\rho_\psi=\rho_{\psi'}$ - therefore, we might use this object - called a density operator or density matrix - as an unambiguous way to define a state.
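For completeness, one way to carry out the short computation behind the trace formula is to expand the trace in an orthonormal basis $\{|e_n\rangle\}$ of $\mathscr H$ (the basis is an auxiliary choice introduced here; the result does not depend on it) and use $\sum_n |e_n\rangle\langle e_n| = \mathbf 1$:

$$\mathrm{Tr}\big(\mu(E)\rho\big) = \sum_n \frac{\langle e_n|\mu(E)|\psi\rangle\langle\psi|e_n\rangle}{\langle\psi|\psi\rangle} = \frac{\langle\psi|\Big(\sum_n |e_n\rangle\langle e_n|\Big)\mu(E)|\psi\rangle}{\langle\psi|\psi\rangle} = \frac{\langle\psi, \mu(E)\psi\rangle}{\langle\psi,\psi\rangle} = \mathrm{Prob}_A(E)$$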
However, this isn't the most general possible definition. If the state can be defined in this way (using a single ray in the Hilbert space) it is called pure. However, for any set of pure density operators $\{\rho_i\}$, the operator
$$\rho = \sum_i p_i \rho_i, \quad p_i\in[0,1] \quad \sum_i p_i = 1$$
also satisfies the requirements of a state. Such a state is called mixed.
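As an illustration, the sketch below builds a mixed state from two pure density matrices and evaluates $\mathrm{Tr}\big(\mu(E)\rho\big)$ for the operator $A = \mathrm{diag}(1,-1)$ with $E=\{1\}$. The particular pure states, the weights $1/4$ and $3/4$, and the helper names are all assumptions chosen for the example.

```python
# Sketch: a mixed state as a convex combination of pure density matrices.
# The chosen states, weights, and helper names are illustrative assumptions.

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(M):
    """Sum of the diagonal entries."""
    return sum(M[i][i] for i in range(len(M)))

def mix(weights, states):
    """Convex combination sum_i p_i rho_i of density matrices."""
    if not (all(0 <= p <= 1 for p in weights) and abs(sum(weights) - 1) < 1e-12):
        raise ValueError("weights must lie in [0, 1] and sum to 1")
    dim = len(states[0])
    return [[sum(p * rho[r][c] for p, rho in zip(weights, states))
             for c in range(dim)] for r in range(dim)]

rho_1 = [[1, 0], [0, 0]]            # pure state of the ray through (1, 0)
rho_2 = [[0.5, 0.5], [0.5, 0.5]]    # pure state of the ray through (1, 1)
weights = [0.25, 0.75]

rho = mix(weights, [rho_1, rho_2])

P_up = [[1, 0], [0, 0]]             # mu_A({1}) for A = diag(1, -1)

# Probability of finding the value +1, computed directly from rho ...
prob_mixed = trace(matmul(P_up, rho))
# ... and as the weighted average of the pure-state probabilities.
prob_from_parts = sum(p * trace(matmul(P_up, r))
                      for p, r in zip(weights, [rho_1, rho_2]))

print(rho, trace(rho), prob_mixed, prob_from_parts)
```

The two probabilities agree because the trace is linear in $\rho$, so a mixed state assigns to each question of the form $(1)$ the $p_i$-weighted average of the answers given by its pure components.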