There is a mathematical point to be made here, one that in my opinion is related to a deeper understanding of what it means to solve a (partial) differential equation.
I will try to keep things simple, and consider only linear models.
Suppose that you have a space $X$ with some structure, for example a topology. We suppose that the state of our system is an element of that space, $x\in X$. The evolution of the state is given by a map $x(\cdot): I\to X$, where $I\subseteq \mathbb{R}$ represents the time interval. Now this is what "nature" offers at the most basic level, and it gives no a priori information about causality: nothing guarantees a relation between $x(t_1)$ and $x(t_2)$ when $t_1<t_2$. Nevertheless, we usually observe one thing: the map $x(\cdot)$ is continuous with respect to time on the bounded intervals $I$ we are able to observe. Given such a continuous map, we may ask what happens to its derivative, and we may define the derivative so as to preserve causality: we set
$$\partial_t^- x(t)\big|_{t=t_0}=\lim_{h\to 0^-}\frac{x(t_0+h)-x(t_0)}{h}\;,$$
so the information that comes from the derivative is only "from the past". If the limit exists, we say that $x(t)$ is (left-)differentiable at $t_0$. Now suppose that we observe that for any $t> t^-$, where $t^-$ is the minimum of our bounded interval, $\partial_t^-x(t)=A^- x(t)$ for some linear operator $A^-$ acting on $X$. Here you have your differential equation "from the left", which takes only information about the past. However, since the interval is bounded (and we can observe only bounded intervals of time), we can also define "a posteriori" the right derivative $\partial_t^+ x(t)$. Suppose that, again, for any $t< t^+$, where $t^+$ is the maximum of the interval, $\partial_t^+ x(t)= A^+x(t)$, where $A^+$ is a linear operator that takes information "from the future". In most cases $A^-=A^+=A$, and in that case we may infer (this is not provable, just an inference) that the differential equation $$\partial_t x(t)= Ax(t)$$ describes the state $x(t)$ at any time in a unique fashion (once the value at one point is fixed).
So as you see, it is not necessary that the differential equation contain "information about the future": in fact we may restrict ourselves to the hypothesis that the system obeys
$$\partial_t^-x(t)=A^- x(t)$$
for any $t> t^-$. It is observation that leads us to infer that the correct equation is actually $\partial_t x(t)= Ax(t)$.
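To make this inference concrete, here is a minimal numerical sketch, assuming a finite-dimensional state space $X=\mathbb{R}^2$ and a specific matrix $A$ (both are my choices, purely for illustration): we sample a trajectory and check that the backward and forward difference quotients both reproduce $Ax(t)$.

```python
# Minimal sketch: left and right difference quotients of a sampled
# trajectory both agree with A x(t), so we would infer A- = A+ = A.
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])   # hypothetical linear operator on X = R^2
x0 = np.array([1.0, 0.0])

def x(t):
    """The observed trajectory; here we *generate* it as exp(tA) x0."""
    return expm(t * A) @ x0

t0, h = 0.7, 1e-6
left  = (x(t0) - x(t0 - h)) / h   # backward quotient: information from the past
right = (x(t0 + h) - x(t0)) / h   # forward quotient: information from the future

print(np.allclose(left,  A @ x(t0), atol=1e-4))  # True
print(np.allclose(right, A @ x(t0), atol=1e-4))  # True
```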
Nevertheless, there is a difference between the differential and the integral formulation of an equation, although it is more a semantic one. Consider the so-called Cauchy problem:
$$\left\{\begin{aligned}&\partial_t x(t)= Ax(t)\\&x(0)=x_0\end{aligned}\right.\;.$$
For the equation to be satisfied, the derivative must be defined everywhere and take values in the space $X$; in addition, $x(t)$ must lie in the domain $D(A)\subseteq X$ of $A$ for every $t$. So a solution, if it exists, will be of the type $x(t)\in C^0(I,D(A))\cap C^1(I,X)$, i.e. a continuous map from an interval $I$ (containing zero) to $D(A)$ that is differentiable, with continuous derivative taking values in $X$. This imposes restrictions on the map $x(t)$: it has to be quite regular (in mathematical terminology). From the point of view of numerical simulations, it is this required regularity that, in my opinion, gives the additional computational cost.
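For intuition on how the differential formulation is consumed numerically, here is a minimal explicit-Euler sketch (the concrete $A$, $x_0$, and step count are mine, for illustration only); note that each step evaluates $Ax(t)$ directly, which is exactly where the requirement $x(t)\in D(A)$ enters.

```python
# Explicit Euler for the Cauchy problem dx/dt = A x, x(0) = x0: each step
# applies A to the current state, i.e. it uses the differential form and
# implicitly assumes x(t) stays where A can be applied (x(t) in D(A)).
import numpy as np

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])  # same illustrative A as above
x0 = np.array([1.0, 0.0])

def euler(A, x0, T=1.0, n=10_000):
    dt, x = T / n, x0.copy()
    for _ in range(n):
        x = x + dt * (A @ x)   # one step of the differential formulation
    return x

print(euler(A, x0))  # ~ (cos 1, -sin 1) for this particular A
```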
Are we able to formulate the equation in another way, one that enables us to consider more general solutions? A priori we need only that $x(t)\in C^0(I,X)$, i.e. that it is a continuous function. The answer is yes: we may write the integral equation
$$x(t)=x_0 +\int_0^t Ax(s)ds\; .$$
Obviously it depends on the case, but it is often possible to find solutions of the equation in this form (especially for nonlinear systems) that require less regularity than before; for example, we may find solutions $x(t)\in C^0(I, X)$ for any $x_0\in X$. This weaker regularity requirement should, computationally, give better performance.
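As an illustration of how one actually constructs solutions of the integral form, here is a minimal Picard-iteration sketch, again assuming the finite-dimensional $X=\mathbb{R}^2$ and the same illustrative matrix $A$ as above (in finite dimensions $D(A)=X$, so this shows only the mechanics, not the regularity gain):

```python
# Picard iteration on the integral form x(t) = x0 + \int_0^t A x(s) ds,
# discretised with the trapezoidal rule on a grid (all numerical choices
# here are mine, for illustration only).
import numpy as np

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
x0 = np.array([1.0, 0.0])
ts = np.linspace(0.0, 1.0, 201)
dt = ts[1] - ts[0]

xs = np.tile(x0, (len(ts), 1))        # initial guess: the constant map x0
for _ in range(60):                   # Picard iterations
    integrand = xs @ A.T              # row k holds A x(t_k)
    integral = np.vstack(
        [np.zeros(2),
         np.cumsum((integrand[1:] + integrand[:-1]) / 2.0, axis=0) * dt])
    xs = x0 + integral                # next iterate of the integral equation

print(xs[-1])                         # ~ (cos 1, -sin 1): matches exp(A) x0
```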
However, it is clear that a solution of the integral equation that is only $C^0(I, X)$ is not, strictly speaking, a solution of the differential equation; but any solution of the integral equation that is also $C^0(I,D(A))\cap C^1(I,X)$ is a solution of the differential equation. Conversely, every solution of the differential equation is also a solution of the integral one. So the two formulations are not exactly equivalent.
A concrete example is the Schrödinger equation $\partial_t \psi = -iH\psi$ in $L^2$. Often $H$ is defined on a domain $D(H)$ smaller than the whole of $L^2$, so the differential equation has a solution only for maps $\psi(t)$ that remain in $D(H)$ at all times. If, as usual, $H$ is self-adjoint, the solution is written $\psi(t)=e^{-itH}\psi_0$, and this solves the integral equation for any $\psi_0\in L^2$; however, it solves the differential (usual) form only if $\psi_0\in D(H)$.
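To make this last point explicit, here is a worked instance (the specific operator and initial state are my choice, purely for illustration). Take the free particle in suitable units, $H=-\partial_x^2$ on $L^2(\mathbb{R})$ with $D(H)=H^2(\mathbb{R})$ (the second Sobolev space), and the discontinuous initial state
$$\psi_0=\mathbf{1}_{[0,1]}\in L^2(\mathbb{R})\setminus H^2(\mathbb{R})\;,\qquad \psi(t)=e^{-itH}\psi_0\;.$$
Since the unitary group $e^{-itH}$ maps $D(H)$ onto itself, $\psi(t)\notin D(H)$ for every $t$, and by Stone's theorem the curve $t\mapsto\psi(t)$ is then nowhere differentiable in $L^2$. Yet it is continuous and satisfies the integral equation, understood in the mild sense (with $-iH$ applied to $\int_0^t\psi(s)\,ds$, which does lie in $D(H)$): a so-called mild solution that is never a classical one.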