Hope this doesn't come off as too pedantic or overinterpreted. I've been working on revisiting electrostatic and electrodynamic energy from first principles, and I have the following stumbling block. I'm working within the framework of classical electrodynamics only (mostly ignoring the interaction of radiation and matter), and this is on purpose. It's about the classical theory and how it is developed.
We can say that a charge density $\rho$ and a current density $\mathbf J$ are the fundamental sources of the electromagnetic field. We will then consider a point charge to be a sufficiently small and localized charge distribution which has a monopole moment and negligible higher multipole moments, for all observation points of interest.
Suppose we impose electrostatic conditions, such that all motions of charges must be infinitely slow (adiabatic, in the late 19th-century sense). This is a valid approximation, e.g., when charges are confined to a conductor and are in equilibrium. We define the electrostatic potential due to the charge density $\rho$ by $$ \Phi(\mathbf r) = \frac{1}{4\pi\epsilon_0}\int \frac{\rho(\mathbf r')}{|\mathbf r - \mathbf r'|}\, d^3 r'. $$ Note that the electric field is $\mathbf E = -\nabla \Phi$, and that $$ \Phi(\mathbf r) = -\int_{\mathbf r_0}^{\mathbf r} \mathbf E(\mathbf r') \cdot d\mathbf r', $$ where $\mathbf r_0$ is a point of zero potential.
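
To make the definition of $\Phi$ concrete, here is a minimal numerical sketch that evaluates the integral directly for a uniformly charged ball and compares it with the standard closed-form result. The function name, the values of `Q` and `R`, and the grid size `n` are illustrative assumptions, not part of the question; the midpoint rule is just one convenient quadrature choice.

```python
import numpy as np

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m


def potential_uniform_ball(r, Q=1e-9, R=0.1, n=400):
    """Phi at distance r from the centre of a uniformly charged ball.

    Evaluates Phi = (1/4 pi eps0) * integral of rho / |r - r'| d^3r'
    with a midpoint rule in r' and u = cos(theta'); the azimuthal
    integral contributes a factor 2*pi by symmetry.
    Q, R and n are illustrative (hypothetical) values.
    """
    rho = Q / (4.0 / 3.0 * np.pi * R**3)            # uniform charge density
    rp = (np.arange(n) + 0.5) * (R / n)             # midpoints in r'
    u = -1.0 + (np.arange(n) + 0.5) * (2.0 / n)     # midpoints in cos(theta')
    RP, U = np.meshgrid(rp, u, indexing="ij")
    dist = np.sqrt(r**2 + RP**2 - 2.0 * r * RP * U)  # |r - r'|, observer on the z-axis
    integrand = rho * RP**2 / dist
    cell = 2.0 * np.pi * (R / n) * (2.0 / n)        # 2*pi * dr' * du
    return integrand.sum() * cell / (4.0 * np.pi * EPS0)


def potential_exact(r, Q=1e-9, R=0.1):
    """Standard closed form for the same ball (inside and outside)."""
    if r <= R:
        return Q * (3.0 - r**2 / R**2) / (8.0 * np.pi * EPS0 * R)
    return Q / (4.0 * np.pi * EPS0 * r)


if __name__ == "__main__":
    for r in (0.05, 0.2):
        print(r, potential_uniform_ball(r), potential_exact(r))
```

Inside the ball the integrand has an integrable singularity at $\mathbf r' = \mathbf r$, so the midpoint rule converges more slowly there than outside; the agreement should still be well within a percent at this grid size.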
Question: How can the electrostatic energy of a continuous charge distribution be derived without heuristic arguments or generalizations from configurations of point charges, in the framework of classical electrostatics (as laid out above)?
Remarks. Energy does not have an exact definition in classical physics; the concept is only as useful as its features allow (conservation, modes of energy transfer). The success of the common definition of electrostatic energy is not evidence that its derivation is accurate, rigorous, or unique. A formulation of electrostatic energy from first principles in the classical model is open to some interpretation. For example, any constant term can be added to the total energy of a system without changing the energy balance, including terms calculated from point-charge self-energy. Since there is no way to transfer this part of the energy, it makes no difference to the theory and should be eliminated from it.
To make this more exact, then, I say that any definition of energy should have the following features:
- It must have units of joules, and any work done between charges in the system must be accounted for by a corresponding change in the energy of the system.
- The relationship between electric potential $\Phi$ and the energy would ideally follow from the product of $\Phi$ and an infinitesimal charge $\rho\, dV$.
- In the case of "point charges", the energy of the configuration should be obtained by substituting Dirac delta functions for each of the charges, so long as the monopole approximation holds, although the concept of a point charge can be discarded if singularities come up.
- The energy should include as few "untransferable" components as possible, meaning all the energy in the system would ideally be assignable to a physical, reversible process, and there should be a mechanism of transferring whatever energy is stored in the system. Since work is the only mechanism I'm aware of for energy transfer in electrostatics, all energy should arise from forces required to (adiabatically) bring the system into its equilibrium state.
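
To illustrate the delta-function criterion in the list above, here is a short worked sketch. It assumes the usual $\frac{1}{2}\int \rho\,\Phi$ form of the energy quoted further down, and the symbols $q_i$, $\mathbf r_i$ for the charges and positions of $N$ idealized point charges are notation introduced here. Substituting $$ \rho(\mathbf r) = \sum_{i=1}^{N} q_i\, \delta^3(\mathbf r - \mathbf r_i) $$ into the definition of $\Phi$ gives $$ \Phi(\mathbf r) = \frac{1}{4\pi\epsilon_0}\sum_{j=1}^{N} \frac{q_j}{|\mathbf r - \mathbf r_j|}, $$ and therefore $$ \frac{1}{2}\int \rho(\mathbf r)\,\Phi(\mathbf r)\, d^3 r = \frac{1}{8\pi\epsilon_0}\sum_{i=1}^{N}\sum_{j=1}^{N} \frac{q_i q_j}{|\mathbf r_i - \mathbf r_j|}. $$ The $i = j$ terms diverge; these are the point-charge self-energies. Discarding them leaves the familiar pair sum $$ \frac{1}{4\pi\epsilon_0}\sum_{i<j} \frac{q_i q_j}{|\mathbf r_i - \mathbf r_j|}, $$ which is exactly the kind of singularity at which, by the criterion above, the point-charge idealization would have to be dropped in favour of a finite charge density.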
My point is to remove some of the hand-waving steps in the usual derivations. A self-consistent theory should be purely classical, and based only on physically realizable quantities and processes. Thus a pair of charged conducting spheres being brought slowly together is a good example, while point charges floating in space and held in place by an unspecified constraining force are discouraged. Most derivations determine the energy in a configuration of point charges, then generalize to charge densities, but this introduces problems like infinite point-charge self-energy, which are not really necessary to a classical theory (which in the above framework treats point charges as approximations of localized charge densities). The resulting energy from these derivations is $$ U_E = \frac{1}{2}\int_V \rho(\mathbf r)\Phi(\mathbf r)\ d^3 r. $$ But this is derived from the replacement $q\Phi \rightarrow \rho\, dV\, \Phi$ and from the concept of the work performed on a point charge being equal to $q\Phi$, which comes from the expression $$ U_E(\mathbf r) = -\int_{\mathbf r_0}^{\mathbf r} q\mathbf E(\mathbf r')\cdot d \mathbf r' $$ after substituting the expression for $\Phi$ in terms of $\mathbf E$ given above. The notion of "bringing an infinitesimal volume charge $\rho\, dV$ from zero potential to its location in the configuration" is not obviously physically meaningful, because there is no mechanism I'm aware of by which this can occur adiabatically, but I'm not completely against the idea.
A resolution to this is to simply define the electrostatic energy by the expression above, and to demonstrate that it meets our criteria. But this would be admitting that the electrostatic energy is not unique (unless a proof exists that it is). Perhaps a better way forward is to determine the work which can be performed by the charge distribution upon itself until no more work can possibly be done. But I'm not sure of the "right" approach, subjective as that might be.
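
As a numerical sanity check on the expression $U_E = \frac{1}{2}\int \rho\,\Phi\, d^3 r$, the sketch below evaluates it for a uniformly charged ball and compares it with two other routes to the same number: a shell-by-shell assembly (each thin shell brought in against the potential of the charge already present) and the standard closed form $3Q^2/(20\pi\epsilon_0 R)$. The function names and the values of `Q`, `R`, and `n` are illustrative assumptions. The interior potential used is the standard result for a uniform ball, taken as given rather than derived here. This checks consistency only; it does not settle the uniqueness question raised above.

```python
import numpy as np

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m


def energy_half_rho_phi(Q, R, n=2000):
    """U = (1/2) * integral of rho * Phi over the ball (midpoint rule in r)."""
    rho = 3.0 * Q / (4.0 * np.pi * R**3)
    dr = R / n
    r = (np.arange(n) + 0.5) * dr
    # Standard interior potential of a uniformly charged ball (assumed, not derived here)
    phi = Q * (3.0 * R**2 - r**2) / (8.0 * np.pi * EPS0 * R**3)
    return 0.5 * np.sum(rho * phi * 4.0 * np.pi * r**2) * dr


def energy_shell_assembly(Q, R, n=2000):
    """Work to build the ball shell by shell, each shell brought from infinity."""
    rho = 3.0 * Q / (4.0 * np.pi * R**3)
    dr = R / n
    r = (np.arange(n) + 0.5) * dr
    q_enclosed = rho * (4.0 / 3.0) * np.pi * r**3   # charge already assembled
    dq = rho * 4.0 * np.pi * r**2 * dr              # charge in the next shell
    return np.sum(q_enclosed / (4.0 * np.pi * EPS0 * r) * dq)


def energy_closed_form(Q, R):
    """Standard closed-form self-energy of a uniformly charged ball."""
    return 3.0 * Q**2 / (20.0 * np.pi * EPS0 * R)


if __name__ == "__main__":
    Q, R = 1e-9, 0.1  # illustrative values: 1 nC on a 10 cm ball
    print(energy_half_rho_phi(Q, R))
    print(energy_shell_assembly(Q, R))
    print(energy_closed_form(Q, R))
```

The two integrals weight the charge differently (one uses the final potential with a factor of $\frac{1}{2}$, the other the potential at each intermediate stage), yet they agree, which is the path independence one would expect of a quantity that deserves to be called an energy.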