Everything follows from the equations! It’s a combination of
- a symmetry (of the geometry/equations) maps solutions to solutions in a suitable sense
- solutions are unique (for given initial/boundary value problems).
That was vague, so let me be slightly more precise about the setup of the problem: let $\Omega\subset\Bbb{R}^n$ be an open set with smooth boundary $\partial\Omega$ (possibly empty), and suppose $P$ is a differential operator which sends maps $u:\Omega\to\Bbb{R}^m$ to maps $Pu:\Omega\to\Bbb{R}^k$. Suppose we are also given a map $f:\Omega\to\Bbb{R}^k$ and suppose we have a PDE of the form
\begin{align}
\begin{cases}
Pu&=f\\
\text{BIC}(u)&=\tilde{u},
\end{cases}\tag{$*$}
\end{align}
where $\tilde{u}$ is a given function and $\text{BIC}(u)$ means ‘boundary or initial conditions of $u$’. I’m purposely being vague on the domain and target of $\tilde{u}$, and what exactly the boundary conditions are because this is very problem-dependent.
Definition 1. (Pullback)
Let $\Omega\subset\Bbb{R}^n$ be open and suppose we have a pair of maps $\Phi=(\psi,\phi)$ where $\phi:\Omega\to\Omega$ is a diffeomorphism, and $\psi:\Omega\times\Bbb{R}^m\to\Bbb{R}^m$ is a smooth map such that for each $x\in\Omega$, we have that the map $\psi_x=\psi(x,\cdot):\Bbb{R}^m\to\Bbb{R}^m$ is a diffeomorphism (in the case $\partial\Omega$ is non-empty, we should assume these maps also extend nicely to the boundary as well).
Then, for each function $u:\Omega\to\Bbb{R}^m$, we shall transform $u$ by $\Phi$ and define a new function $u_{\Phi}:\Omega\to\Bbb{R}^m$ by the rule
\begin{align}
u_{\Phi}(x):=\psi_x^{-1}(u(\phi(x))).
\end{align}
We call $u_{\Phi}$ the pullback of $u$ by $\Phi$.
Don’t get bogged down by technicalities here: the point is that I start with a function $u$ and then by some ‘standard’ prescription, I define a corresponding function $u_{\Phi}$. We think of $\Phi$ as ‘transforming’ the function $u$ into the function $u_{\Phi}$. You may wonder why the inverse… well technically I could have avoided inverses if I simply renamed $\psi_x$ as $\psi_x^{-1}$, but I keep it this way to stay consistent with already existing terminology (in manifold and fiber bundle theory… not that you need to know any of that).
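If it helps, here is a tiny Python sketch of Definition 1 in the simplest case $\Omega=\Bbb{R}$, $m=1$; the specific maps $\phi,\psi$ and function $u$ below are my own illustrative choices, nothing canonical:

```python
# A toy instance of Definition 1 with Omega = R and m = 1; the specific maps
# phi, psi below are illustrative choices, nothing canonical.
phi     = lambda x: x + 1.0                 # diffeomorphism phi: Omega -> Omega
psi     = lambda x, v: 2.0 * v              # psi_x(v) = 2v, a diffeomorphism of R
psi_inv = lambda x, w: w / 2.0              # inverse of psi_x

u     = lambda x: x**2                      # the function being transformed
u_Phi = lambda x: psi_inv(x, u(phi(x)))     # u_Phi(x) = psi_x^{-1}(u(phi(x)))

print(u_Phi(1.0))   # psi_1^{-1}(u(2)) = 4/2 = 2.0
```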
Definition 2. (Well-Posedness)
We say the problem $(*)$ is well-posed if there exists a unique solution $u$.
Ok technically, there is an extra condition in the definition of well-posedness (continuous dependence on the data) but let me gloss over that right now. A well-posed problem corresponds to a ‘physically realistic’ one. If there is no solution (i.e existence fails) then the theory is vacuous, while if uniqueness fails, it means we don’t yet understand everything, i.e we’re missing some key physics.
Now, here comes the definition of symmetry:
Definition 3. (Symmetry)
Let $\Omega,P,f,\tilde{u}$ be the domain, differential operator, inhomogeneity, and boundary/initial conditions as introduced above. Suppose that $u:\Omega\to\Bbb{R}^m$ is a solution to the problem $(*)$ (i.e $u$ solves the PDE $Pu=f$ along with the correct boundary/initial conditions). A pair of maps $\Phi=(\psi,\phi)$ as in definition 1 is called a symmetry of the problem $(*)$ if the pullback $u_{\Phi}$ also solves the problem $(*)$.
This should be a very obvious sounding definition. I start with my PDE and boundary/initial conditions. To have a symmetry means I take my solution, and transform it and upon doing so, nothing should change: I should still have a solution to the same problem. Now, here comes the ‘main theorem’:
Theorem.
Let $\Omega,P,f,\tilde{u}$ be as above, and assume the problem $(*)$ is well-posed. If $u$ is the solution to $(*)$ and $\Phi=(\psi,\phi)$ is a symmetry, then we have $u_{\Phi}=u$.
The proof is a one-liner: because $\Phi$ is a symmetry, by definition it means $u_{\Phi}$ solves the problem $(*)$; on the other hand $u$ also solves the problem by assumption, so by the uniqueness condition in well-posedness, these solutions are equal, hence we’re done.
All that stuff above was quite abstract, and you may wonder why this completely answers your question. So here are some examples (and in a sense, the most important examples). I’ll talk first about Poisson’s equation and then the wave equation. These are both scalar equations $(m=k=1)$, so we can take $\psi_x=\text{id}_{\Bbb{R}}$ (you’ll see why below), meaning that for $\Phi=(\psi,\phi)$, we have that $u_{\Phi}(x)=\text{id}_{\Bbb{R}}(u(\phi(x)))=u(\phi(x))$. Finally, I’ll make some comments about electromagnetism.
Example 1: Poisson’s equation on a bounded domain, with Dirichlet boundary conditions
Here, we suppose $\Omega\subset\Bbb{R}^n$ is a (connected) open and bounded set with smooth boundary $\partial\Omega$. We set $m=k=1$, and the differential operator $P$ is the Laplacian $\Delta$. Suppose we are given a function $f:\Omega\to\Bbb{R}$ and a function $b:\partial\Omega\to\Bbb{R}$. Then the problem
\begin{align}
\begin{cases}
\Delta u&=f\\
u|_{\partial \Omega}&=b
\end{cases}
\end{align}
is known as Poisson’s equation with Dirichlet boundary conditions.
Under suitable assumptions on the functions $f$ and $b$, one can show that this problem has a solution and the solution is unique (and the dependence of $u$ on the data $f,b$ is ‘continuous’ in a suitable sense), i.e the problem is well-posed. Just FYI: this problem has a very rich history, which you should definitely google if interested. This is 100% motivated by physics since this is the field equation for Newtonian gravity, and also for electrostatics.
Here’s a fact that I leave to you to verify: if $R:\Bbb{R}^n\to\Bbb{R}^n$ is an orthogonal linear map (i.e the matrix representation satisfies $RR^t=R^tR=I$), i.e a rotation (perhaps a reflection too, since I’m not insisting $\det R=+1$), then for any function $u:\Bbb{R}^n\to\Bbb{R}$, we have that
\begin{align}
\Delta(u\circ R)=(\Delta u)\circ R.
\end{align}
So, it doesn’t matter if you rotate the arguments first and then apply the Laplacian, or the other way around. In fact, the same statement holds for any isometry of $\Bbb{R}^n$ (i.e for translations+rotations as well). The ‘underlying’ reason is that the Laplacian $\Delta$ is really a differential operator which comes from Euclidean geometry (with the metric tensor $\delta_{ij}$)… anyway this last bit was just mumbo jumbo so feel free to ignore.
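If you want to check the identity $\Delta(u\circ R)=(\Delta u)\circ R$ without grinding through indices, here is a quick sympy verification in two dimensions (the test function $u$ is an arbitrary concrete choice of mine):

```python
import sympy as sp

x, y, th = sp.symbols('x y theta', real=True)
c, s = sp.cos(th), sp.sin(th)

# an arbitrary concrete test function
u = sp.exp(x) * sp.sin(y) + x**3 * y

lap = lambda f: sp.diff(f, x, 2) + sp.diff(f, y, 2)

# compose the arguments with the rotation R = [[c, -s], [s, c]]
rot = lambda f: f.subs({x: c*x - s*y, y: s*x + c*y}, simultaneous=True)

# Delta(u o R) == (Delta u) o R
assert sp.simplify(lap(rot(u)) - rot(lap(u))) == 0
```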
Ok, so we know how Laplacians interact with rotations. Now, let us specialize to spherically-symmetric situations, i.e suppose the domain $\Omega$ is a ball centered at the origin of a fixed positive radius. Suppose the source/inhomogeneity $f$ is rotationally-invariant (i.e $f\circ R=f$ for all rotations $R$… i.e $f$ depends only on the radial coordinate $r$), and that the same is true for the boundary function $b$ (i.e $b\circ R=b$ for all rotations $R$, which implies $b$ is simply a constant). Therefore, if $u$ solves the problem $(*)$, then by what we mentioned in the previous paragraph, we have that for any rotation $R$,
\begin{align}
\Delta(u\circ R)=(\Delta u)\circ R=f\circ R=f,
\end{align}
(first equality is by previous paragraph, second is since $u$ solves the problem, the third is since $f$ is assumed spherically-symmetric). Next, for the boundary condition, we similarly have $(u\circ R)|_{\partial\Omega}=b\circ R|_{\partial\Omega}=b$. Therefore we have proven that a spherically-symmetric setup implies that for any solution $u$, and any rotation $R$, we have that $u\circ R$ also solves the same problem. So, by well-posedness, we have $u\circ R=u$. This is saying that the function $u$ is also rotationally-invariant.
Hopefully you understand the logic of the argument: assume the setup is spherically symmetric, take your solution and transform it by a rotation to get $u\circ R$ and show this transformed function solves the same problem, hence by uniqueness they are equal. Since this is true for all $R$, it follows $u$ only depends on the radial distance, as expected from intuition. Notice carefully that this is a dual combination of the ‘structure’ of the equations, and uniqueness of solutions.
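As a concrete sanity check (the data here are my own choice): on the unit ball in $\Bbb{R}^3$ with $f\equiv 1$ and $b\equiv 0$, the argument above forces the unique solution to be radial, and indeed it is $u(x)=\frac{|x|^2-1}{6}$, which you can verify with sympy:

```python
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)
n = 3                                         # dimension

# radial ansatz on the unit ball in R^3, for the data f = 1, b = 0
u = (x**2 + y**2 + z**2 - 1) / (2*n)

lap = sp.diff(u, x, 2) + sp.diff(u, y, 2) + sp.diff(u, z, 2)
assert sp.simplify(lap - 1) == 0              # Delta u = 1 = f
assert u.subs({x: 0, y: 0, z: 1}) == 0        # u vanishes at a boundary point
                                              # (hence on all of |x| = 1, u being radial)
```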
Example 2: Wave Equation on all space
Here, we take $\Omega=\Bbb{R}\times\Bbb{R}^n$ (the coordinates denoted as $(t,x^1,\dots, x^n)$), and let $\Sigma_0=\{0\}\times\Bbb{R}^n$ be “the initial time slice”, and $m=k=1$, and $P$ is the wave operator, also known as the d’Alembertian,
\begin{align}
P:=\Box=-\partial_t^2+\Delta_x=-\partial_t^2+\partial_{x^1}^2+\dots+\partial_{x^n}^2.
\end{align}
Now, again suppose $f:\Omega=\Bbb{R}\times\Bbb{R}^n\to\Bbb{R}$ is a given function, and let $u_0,u_1:\Sigma_0\to\Bbb{R}$ be given functions. The problem
\begin{align}
\begin{cases}
\Box u&=f\\
(u,\partial_tu)|_{\Sigma_0}&=(u_0,u_1)
\end{cases}
\end{align}
is known as the initial-value-problem for the inhomogeneous wave equation on $\Bbb{R}^{1+n}$. In words, we are solving the inhomogeneous wave equation $\Box u=f$, along with the initial conditions $u(0,x)=u_0(x)$ and $(\partial_tu)(0,x)=u_1(x)$. Once again, one can show that for suitable $f,u_0,u_1$, there exists a unique solution.
The wave equation is also deeply rooted in physics; indeed, if instead of restricting to $m=k=1$, we allowed ourselves $m=k=3+3=6$ then we’d be able to describe the wave equations satisfied by the electric and magnetic fields of Maxwell’s electrodynamics.
You can now play a similar game as in example 1, but in fact here we have more to play with. The general fact here is that Lorentz transformations are to the wave operator, what the rotations were to the Laplacian. Thus, if $\Box u=f$, and $L:\Bbb{R}^{1+n}\to\Bbb{R}^{1+n}$ is a Lorentz transformation, then
\begin{align}
\Box(u\circ L)=(\Box u)\circ L=f\circ L.
\end{align}
So, if you assume that $f$ is invariant under that Lorentz transformation, then you can continue the RHS to $=f$. So, $u\circ L$ solves the same PDE as $u$. For initial conditions, one has to be a little careful though, because a general Lorentz transformation $L$ will mix up time and space, so the initial hyperplane $\Sigma_0$ will get mapped to a new hyperplane $L[\Sigma_0]$. So, let us make the assumption that $L$ maps $\Sigma_0$ onto $\Sigma_0$. This implies $L(t,x)=(\pm t,R(x))$ for some orthogonal map $R:\Bbb{R}^n\to\Bbb{R}^n$.
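Here is a sympy check of the identity $\Box(u\circ L)=(\Box u)\circ L$ for a boost in $1+1$ dimensions (the test function is an arbitrary concrete choice of mine):

```python
import sympy as sp

t, x, v = sp.symbols('t x v', real=True)
g = 1 / sp.sqrt(1 - v**2)                     # Lorentz factor (assume |v| < 1)

# an arbitrary concrete test function
def u(t, x):
    return t**3 * x**2 + sp.exp(x) * sp.cos(t)

def box(w):                                   # wave operator -d_t^2 + d_x^2
    return -sp.diff(w, t, 2) + sp.diff(w, x, 2)

# the boost L(t, x) = (g(t - v x), g(x - v t))
Lt, Lx = g*(t - v*x), g*(x - v*t)

lhs = box(u(Lt, Lx))                                          # Box(u o L)
rhs = box(u(t, x)).subs({t: Lt, x: Lx}, simultaneous=True)    # (Box u) o L
assert sp.simplify(lhs - rhs) == 0
```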
- Case 1: $L(t,x)=(t,R(x))$. In this case, if the initial conditions $(u_0,u_1)$ are rotationally-symmetric, then we see that $u\circ L$ solves the PDE $\Box(u\circ L)=f\circ L=f$ with initial conditions
\begin{align}
\begin{cases}
(u\circ L)|_{\Sigma_0}&=u_0\circ R=u_0\\
(\partial_t(u\circ L))|_{\Sigma_0}&=+(\partial_tu)\circ L|_{\Sigma_0}=u_1\circ R=u_1
\end{cases}
\end{align}
Therefore, both $u$ and $(u\circ L)$ solve the same PDE with the same initial conditions, and so $u=u\circ L$ by well-posedness of the wave equation (a mathematical fact). In particular, this shows that $u(t,x)=u(t,R(x))$ for all orthogonal maps $R$, so for each fixed $t$, $u$ depends on $x$ only through the distance $|x|$.
- Case 2: $L(t,x)=(-t,R(x))$. This time, assume $f=0$ and $u_0=0$ and $u_1$ is rotationally invariant. So, on the one hand, $u$ solves
\begin{align}
\begin{cases}
\Box u&=0\\
u|_{\Sigma_0}&=0\\
(\partial_tu)|_{\Sigma_0}&=u_1
\end{cases}
\end{align}
On the other hand, $L$ is a Lorentz transformation so $\Box(u\circ L)=(\Box u)\circ L=0$, i.e $u\circ L$ solves the same PDE as $u$. This time, the initial conditions of $u\circ L$ are
\begin{align}
\begin{cases}
(u\circ L)|_{\Sigma_0}&=0\\
(\partial_t(u\circ L))|_{\Sigma_0}&=-(\partial_tu)\circ L|_{\Sigma_0}=-u_1\circ R=-u_1
\end{cases}
\end{align}
Therefore, you see that $-u$ and $(u\circ L)$ solve the same PDE with the same initial conditions, and so $u\circ L=-u$ by well-posedness of the wave equation (a mathematical fact). In particular, this shows that $u(-t,R(x))=-u(t,x)$ for all orthogonal maps $R$, so $u$ depends on $x$ only through $|x|$, and has a time-inversion anti-symmetry.
I broke up the two cases to show you how the uniqueness argument implies certain symmetry/anti-symmetry properties of the solution, purely from the equations and initial/boundary conditions themselves, and in particular notice the importance of the initial conditions.
Btw a final remark on this example, particularly in case 2: here, the time-reversal $L(t,x)=(-t,R(x))$ is a Lorentz transformation (which is an isometry of Minkowski space)… but very crucial here is that the Minkowski metric is a quadratic form, so the minus sign gets cancelled out. At the level of the equation, this has to do with the fact that the wave operator $\Box=-\partial_t^2+\Delta$ is second order in time, NOT first order (as is the case for the heat equation $\partial_tu-\Delta u=f$). Physically, what case 2 is saying is that if you start off a wave at equilibrium position, and initially knock it in one direction vs knocking it in the opposite direction, then for all times afterward, the waves will oscillate in opposite directions (a fact which of course sounds intuitive).
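Both cases can be seen concretely in $1+1$ dimensions via d’Alembert’s formula (my concrete choice of data: $u_0=0$ and $u_1(x)=\cos x$, which is even, i.e invariant under the only nontrivial ‘rotation’ $x\mapsto -x$ of $\Bbb{R}$):

```python
import sympy as sp

t, x, s = sp.symbols('t x s', real=True)

# d'Alembert solution with u0 = 0 and the even initial velocity u1(x) = cos(x):
# u(t, x) = (1/2) * integral of u1 over [x - t, x + t]
u = sp.integrate(sp.cos(s), (s, x - t, x + t)) / 2    # = cos(x)*sin(t)

assert sp.simplify(-sp.diff(u, t, 2) + sp.diff(u, x, 2)) == 0   # Box u = 0
assert sp.simplify(u.subs(t, 0)) == 0                           # u|_{t=0} = 0
assert sp.simplify(sp.diff(u, t).subs(t, 0) - sp.cos(x)) == 0   # (d_t u)|_{t=0} = u1
assert sp.simplify(u.subs(x, -x) - u) == 0    # case 1: u(t, -x) = u(t, x)
assert sp.simplify(u.subs(t, -t) + u) == 0    # case 2: u(-t, x) = -u(t, x)
```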
Example 3: Maxwell’s equations
I intentionally postponed the discussion of Maxwell’s equations because these are vector equations (well, really tensorial equations, particularly about differential forms… if you knew them, then what I’m about to say can be rephrased pretty trivially) so I didn’t want to complicate matters initially.
Consider the domain $\Omega=\Bbb{R}\times\Bbb{R}^3$, i.e all of space and time. Now I will describe how the pullback is defined for vector-valued maps $u:\Bbb{R}^{1+3}\to\Bbb{R}^3$ by a simple class of maps. Let $\phi:\Bbb{R}^3\to\Bbb{R}^3$ be a diffeomorphism. We shall define the pullback of $u$ by $\phi$ to be the map $u_{\phi}:\Bbb{R}^{1+3}\to\Bbb{R}^3$ given by
\begin{align}
u_{\phi}(t,x)&:=(D\phi_x)^{-1}(u(t,\phi(x))).
\end{align}
So, for the first time you see how the $\psi_x$ from the general definition above shows up. In fact, let us consider only orthogonal maps $R:\Bbb{R}^3\to\Bbb{R}^3$. Then, $R$ is linear, so its derivative at every point is $R$ itself, i.e
\begin{align}
u_R(t,x)&=R^{-1}(u(t,R(x)))
\end{align}
So, this is how we ‘transform’ a time-dependent vector field by a rotation $R$. You can now check that
\begin{align}
\nabla\cdot (u_R)= (\nabla\cdot u)_R,\quad\text{and}\quad \nabla\times (u_R)=\text{sign}(\det R)(\nabla\times u)_R,
\end{align}
or in words, the pullback of the divergence is the divergence of the pullback, and the pullback of the curl is (up to sign) the curl of the pullback. In particular if $R\in SO(3)$, i.e $R$ is an orthogonal map and $\det R=+1$, then the pesky signs disappear, and pullbacks really do commute with divergence and curl (abstract reason for why this works: pullbacks commute with exterior derivatives, and curl is really an exterior derivative with some Hodge-star inside… Hodge stars are defined using the metric and an orientation, so if you have an orientation-preserving isometry, then you will preserve the curl as well, and for the divergence, there are two Hodge stars appearing, which is why the signs cancel out).
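You can let sympy do this check for you; here is a verification for rotations about the $z$-axis (the vector field $u$ is an arbitrary concrete choice of mine):

```python
import sympy as sp

x, y, z, th = sp.symbols('x y z theta', real=True)
X = sp.Matrix([x, y, z])
c, s = sp.cos(th), sp.sin(th)
R = sp.Matrix([[c, -s, 0], [s, c, 0], [0, 0, 1]])   # rotation about z, det R = +1

# an arbitrary concrete vector field
u = sp.Matrix([x*y, sp.sin(z), x**2 - y*z])

def at_R(f):                     # compose the arguments with R
    RX = R * X
    return f.subs({x: RX[0], y: RX[1], z: RX[2]}, simultaneous=True)

uR = R.T * at_R(u)               # pullback u_R(x) = R^{-1} u(R x)

div  = lambda F: sp.diff(F[0], x) + sp.diff(F[1], y) + sp.diff(F[2], z)
curl = lambda F: sp.Matrix([sp.diff(F[2], y) - sp.diff(F[1], z),
                            sp.diff(F[0], z) - sp.diff(F[2], x),
                            sp.diff(F[1], x) - sp.diff(F[0], y)])

assert sp.simplify(div(uR) - at_R(div(u))) == 0                       # divergence
assert all(sp.simplify(e) == 0 for e in curl(uR) - R.T * at_R(curl(u)))  # curl
```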
With all that preliminary stuff out of the way, we see that if $R\in SO(3)$, and we apply the reasoning above to $u$ being the electric and magnetic fields $E,B:\Bbb{R}\times\Bbb{R}^3\to\Bbb{R}^3$, then we see that if $E,B$ satisfy Maxwell’s equations with sources $(\rho,J)$, then $E_R,B_R$ satisfy Maxwell’s equations with sources $\rho_R,J_R$, i.e
\begin{align}
\begin{cases}
\nabla\cdot E&=\frac{\rho}{\epsilon_0}\\
\nabla\cdot B&=0\\
\nabla\times E&=-\frac{\partial B}{\partial t}\\
\nabla\times B&=\mu_0J+\mu_0\epsilon_0\frac{\partial E}{\partial t}
\end{cases}
\quad\implies \quad
\begin{cases}
\nabla\cdot (E_R)&=\frac{\rho_R}{\epsilon_0}\\
\nabla\cdot (B_R)&=0\\
\nabla\times (E_R)&=-\frac{\partial B_R}{\partial t}\\
\nabla\times(B_R)&=\mu_0J_R+\mu_0\epsilon_0\frac{\partial E_R}{\partial t}
\end{cases}
\end{align}
One can prove that Maxwell’s equations with sufficiently smooth initial conditions form a well-posed system of evolution equations. Therefore, if you start with sources $\rho,J$ which are invariant under a rotation $R$ (i.e $\rho_R=\rho$ and $J_R=J$), and if the initial data are also invariant under the rotation, then so are the resulting electric and magnetic fields $E,B$ (i.e $E_R=E$, $B_R=B$).
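To see the transformation rule in action, here is a sympy check with concrete choices of mine: natural units $\mu_0=\epsilon_0=1$, a vacuum plane wave travelling in the $z$-direction, and a rotation about the $x$-axis; the rotated fields still solve the vacuum Maxwell equations:

```python
import sympy as sp

t, x, y, z, th = sp.symbols('t x y z theta', real=True)
X = sp.Matrix([x, y, z])
c, s = sp.cos(th), sp.sin(th)
R = sp.Matrix([[1, 0, 0], [0, c, -s], [0, s, c]])   # rotation about the x-axis

div  = lambda F: sp.diff(F[0], x) + sp.diff(F[1], y) + sp.diff(F[2], z)
curl = lambda F: sp.Matrix([sp.diff(F[2], y) - sp.diff(F[1], z),
                            sp.diff(F[0], z) - sp.diff(F[2], x),
                            sp.diff(F[1], x) - sp.diff(F[0], y)])

# a vacuum plane wave travelling in the z-direction (natural units mu0 = eps0 = 1)
E = sp.Matrix([sp.cos(z - t), 0, 0])
B = sp.Matrix([0, sp.cos(z - t), 0])

def pullback(F):                 # F_R(t, x) = R^{-1} F(t, R x)
    RX = R * X
    return R.T * F.subs({x: RX[0], y: RX[1], z: RX[2]}, simultaneous=True)

ER, BR = pullback(E), pullback(B)
assert sp.simplify(div(ER)) == 0                                 # Gauss (rho = 0)
assert sp.simplify(div(BR)) == 0                                 # no monopoles
assert all(sp.simplify(e) == 0 for e in curl(ER) + sp.diff(BR, t))  # Faraday
assert all(sp.simplify(e) == 0 for e in curl(BR) - sp.diff(ER, t))  # Ampere (J = 0)
```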
In particular if you assume your sources $\rho,J$ and initial conditions on $E,B$ are invariant under all rotations, then the same is true for the solutions. If you only assume this for rotations about a specific axis, then the solutions $E,B$ will remain invariant about that axis. So, this finally answers your question of why such symmetry arguments work in E&M.
Once again, I want to highlight that it all boils down to the symmetry properties of your PDE itself and the initial/boundary conditions. And often, if your differential operator $P$ itself arises from a geometric situation (like the Laplacian or wave operator, which are really operators arising purely from the metric tensor in question, and similarly, the divergence also relies solely on the metric tensor, while the curl relies on the metric tensor and an orientation), then isometries (or perhaps orientation-preserving isometries) of your underlying geometry will give you a corresponding symmetry of your differential operator. So, in this manner, there’s a deep connection between symmetries of your geometry and those of your equation. If you then assume the sources and initial/boundary conditions are also invariant under the given symmetry transformation of the geometry, then by well-posedness you deduce the same for the solution. In this manner, everything ties together.