First let me address several points that may interest you, and then we will get to the answer.
1) Not all PDEs modelling physical phenomena are linear. There are cases where the dynamics are non-linear and we simply neglect the non-linear terms, because non-linear PDEs cannot be solved analytically in most cases. In many situations the linear equations are a very good approximation. For example, for wave propagation we have the PDE,
$${\partial_t^2 f(x,t)}=c^2\partial_x ^2f(x,t)$$
which is linear and can be solved. This represents a wave. But not all waves are described by this equation: there are also non-linear waves! For example, the cnoidal waves and solitons that you can sometimes observe in fluids are described by the Korteweg-de Vries equation:
$$\partial_t f(x,t)=6f(x,t)\partial_x f(x,t) - \partial _x ^3 f(x,t)$$
This is a non-linear PDE that has solutions that are physically waves, but that are not solutions of the linear wave equation.
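As a concrete example, taking the KdV equation in its standard form $\partial_t f = 6f\,\partial_x f - \partial_x^3 f$, a single soliton reads as follows (the symbols $v>0$ for its speed and $x_0$ for its initial position are introduced here only for illustration):
$$f(x,t) = -\frac{v}{2}\,\operatorname{sech}^2\!\left[\frac{\sqrt{v}}{2}\,(x - vt - x_0)\right]$$
Its amplitude $v/2$ is tied to its speed $v$, so taller solitons travel faster. In the linear wave equation, by contrast, every solution travels at the same speed $c$ whatever its amplitude.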
So, linear equations can account for a lot of phenomena, but there are many cases where they do not describe the physical system accurately.
2) Differential equations and difference equations are not the same thing. This point is very, very important. For example, take the following differential equation:
$$\dot{x} = 4rx(1-x)$$
This is the logistic equation (by the way, note that it is non-linear). In ecology, it represents the growth of a population that has a density denoted by $x$ and grows at a rate $4r$. This equation has two fixed points, $x=0$ (extinction) and $x=1$ (the population reaches its maximum value). It turns out that extinction is unstable, while $x=1$ is a stable point, so for $r>0$ every positive initial condition ends up at $x=1$, following a simple, smooth trajectory.
However, you may think that in biology it is better to work with discrete data: I measure the population every week, and the population at that time depends only on the last record. Then, take the logistic map,
$$x_{n+1}=4rx_n(1-x_n)$$
where $x_n$ is the population at week $n$. It turns out that this map is chaotic for some values of $r$, so it jumps between different values of $x_n$ without apparent order, forever.
Note that the behaviour of these two things is very, very different, even though they represent the same process. In biology, the continuous case is often preferred since it describes the population dynamics better.
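The contrast can be checked numerically. The following is a minimal sketch: the function names, the initial value $x_0=0.2$ and the choice $r=1$ are assumptions made for illustration. It compares the closed-form solution of the logistic equation, sampled at integer times, with the orbit of the logistic map.

```python
import math

def logistic_ode_exact(x0, r, t):
    """Closed-form solution of dx/dt = 4 r x (1 - x) with x(0) = x0."""
    k = 4.0 * r
    return x0 / (x0 + (1.0 - x0) * math.exp(-k * t))

def logistic_map_orbit(x0, r, n_steps):
    """Iterate x_{n+1} = 4 r x_n (1 - x_n), returning [x_0, ..., x_n_steps]."""
    orbit = [x0]
    for _ in range(n_steps):
        x = orbit[-1]
        orbit.append(4.0 * r * x * (1.0 - x))
    return orbit

# Illustrative values (assumed, not taken from the text above).
x0, r = 0.2, 1.0
continuous = [logistic_ode_exact(x0, r, t) for t in range(11)]
discrete = logistic_map_orbit(x0, r, 10)

for n, (xc, xd) in enumerate(zip(continuous, discrete)):
    print(f"n={n:2d}  ODE x(t=n)={xc:.4f}  map x_n={xd:.4f}")
```

The first column grows monotonically towards 1, while the second keeps jumping around the unit interval.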
You may think that the problem is that if you write $\dot{x}$ as differences, then the equation you have for your system is different. However, even when you discretize a differential equation with finite differences using the explicit Euler method, you can get divergent behaviour, depending on your integration step $h$ and the other constants of the problem (even for simple cases such as $\dot{x}=-rx$).
Even worse, if you work in 2-D space, for example, you need to select a discretization for the Laplacian. The centered difference is very common, but you also have to take into account the symmetries of your problem: in a problem with a non-conserved flux in the X direction you may want to use forward (one-sided) differences for X and centered differences for Y. If you use centered differences for X as well, the numerical solution will not be correct, meaning that the discrete map and the continuous system are not equivalent!
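For $\dot{x}=-rx$ the explicit Euler update is $x_{n+1}=(1-rh)\,x_n$, which decays only if $|1-rh|<1$, that is, $0<rh<2$. A minimal sketch of this follows; the function name and the step sizes are chosen here only for illustration.

```python
def euler_decay(x0, r, h, n_steps):
    """Explicit Euler for dx/dt = -r x: x_{n+1} = x_n + h * (-r * x_n)."""
    x = x0
    for _ in range(n_steps):
        x = x + h * (-r * x)
    return x

# Illustrative values (assumed): r = 1, so the stability limit is h < 2.
r, x0 = 1.0, 1.0
for h in (0.1, 1.5, 2.5):
    x_end = euler_decay(x0, r, h, 50)
    print(f"h={h}: |x_50| = {abs(x_end):.3e}")
```

The exact solution $x_0e^{-rt}$ always decays, but for $h=2.5$ the numerical iterates grow in magnitude and alternate in sign.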
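For concreteness, on a uniform grid with spacing $h$ in both directions (the notation $u_{i,j}$ for the field at grid point $(i,j)$ is assumed here), the usual centered five-point Laplacian is
$$\nabla^2 u \approx \frac{u_{i+1,j}+u_{i-1,j}+u_{i,j+1}+u_{i,j-1}-4u_{i,j}}{h^2}$$
while the centered and forward (one-sided) first differences in X are
$$\partial_x u \approx \frac{u_{i+1,j}-u_{i-1,j}}{2h}, \qquad \partial_x u \approx \frac{u_{i+1,j}-u_{i,j}}{h}$$
The centered one is second-order accurate in $h$ and treats both neighbours symmetrically; the forward one is only first-order accurate and singles out a direction.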
So, two conclusions:
a) When you directly replace the continuous system with a discrete one, they are not exactly the same thing,
b) When you take discrete differences, depending on how you do them you may obtain a map which is not equivalent to your original continuous system.
So, are there versions of the differential equations written as difference equations that are equivalent? Well, for numerical computations, if you do it well, in the limit $\Delta t\rightarrow0$, $\Delta x\rightarrow0$ they are the same. Are they equivalent outside this limit? The answer is no.
3) Why PDEs? Well, using the information I have written above let me give my opinion on why PDEs and not any other description.
3a) They are intuitive. It is really easy to write these equations from an intuitive point of view. For example, speed. Let's say I increase it at a constant rate, and that the friction with the air depends on the speed. Then, the speed is going to decrease because of the friction. This is written simply as
$$\dot{v} = a - f(v)$$
Now you only have to select how the friction depends on the velocity. If $f(v)$ is a smooth, well-behaved function, and we assume that for low velocities the friction is not too high, then you can Taylor expand at $v=0$ and simply take the linear part, $f(v)\simeq b v$.
You see: you have a physical phenomenon that you can easily model with a differential equation. Notice that I didn't invoke Newton's law nor any other physical concept. Simply a bit of "common sense" will do.
3b) We work with undetermined fields, on which we need conditions. These conditions are usually set on how the function evolves. See what I said before: speed changes depending on... This is by definition a derivative. Conditions on derivatives lead to differential equations.
If you work, for example, with probabilities, you often face functional equations, because the conditions are set on the arguments of the functions and not on how they change. Or you can even have functional differential equations, which is the case for master equations.
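With the linear friction $f(v)\simeq bv$, the model in 3a becomes $\dot{v}=a-bv$, which can be solved in closed form. Writing $v_0$ for the initial speed (a symbol introduced here) and assuming $b>0$:
$$v(t) = \frac{a}{b} + \left(v_0 - \frac{a}{b}\right)e^{-bt}$$
So the speed relaxes exponentially to the terminal value $a/b$, where the constant push and the friction balance.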
3c) Well... you don't like PDEs? Not a problem, you can change all of them into operators and try to solve them using algebra. When the problem is linear, this is simply the diagonalization of the operator. However, for non-linear problems... Well, what do you know about non-linear algebra? Because I think that physicists, in general, are not very well trained in non-linear algebra.
Take for example the Schrödinger equation. Heisenberg found it earlier... written in terms of operators. Nobody at the time knew operators, so when Schrödinger found his equation, everybody left that complicated and strange thing and went for the traditional calculus everybody knew. Even now... what is easier, to diagonalize the infinite-dimensional matrix of the momentum operator, or to solve a first-order linear differential equation?
So basically many of the PDEs could be replaced by other descriptions, but those are more complicated or we are not trained in them.
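To make the comparison explicit: in the position representation, the eigenvalue problem of the momentum operator is a first-order linear differential equation, solved in one line (here $\psi$ is the eigenfunction, $p$ the eigenvalue and $A$ an arbitrary constant, notation assumed for this illustration):
$$-i\hbar\,\partial_x \psi(x) = p\,\psi(x) \quad\Longrightarrow\quad \psi(x) = A\,e^{ipx/\hbar}$$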