In my opinion, @gandalf61 has provided the correct answer. Let me just expand on it a little.
Measurement errors
The discrepancy between the law and the results of measurement may come from different sources, notably from measurement errors, but also possibly from sample preparation, different conditions at the time of the experiments, etc. Often these errors cannot be controlled or reduced to zero, even by improving the measurement techniques. A discrepancy between the observed results and the theory may also suggest that the theory is not correct.
Hypothesis testing
If we suspect that the theory is not correct, we need to perform hypothesis testing. This is a rather well-defined statistical procedure, but unfortunately not given sufficient attention in modern physics education, since the high precision of measurements makes it rather unnecessary (with the important exception of particle physics, see this review).
One takes as the null hypothesis the assumption that the existing theory is correct and calculates the p-value, which is the probability of obtaining data at least as anomalous as those observed if the null hypothesis is true, i.e., if the deviations are due to statistical error only. If the p-value is smaller than the chosen significance level, the null hypothesis is rejected, i.e., we conclude that the data are inconsistent with the theory.
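As a rough illustration of this procedure, here is a minimal sketch of a two-sided z-test. It assumes Gaussian measurement errors with a known standard deviation `sigma`; the function name `two_sided_p_value`, the numbers in `measurements`, the predicted value and the threshold `alpha` are all hypothetical and chosen only for the example.

```python
import math

def two_sided_p_value(measurements, predicted, sigma):
    """p-value of a two-sided z-test.

    Null hypothesis: the theory is correct, so the measurements scatter
    around `predicted` with known Gaussian standard deviation `sigma`
    (an assumption of this sketch).
    """
    n = len(measurements)
    mean = sum(measurements) / n
    # Deviation of the sample mean in units of its standard error.
    z = (mean - predicted) / (sigma / math.sqrt(n))
    # Probability of a deviation at least this large under the null.
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical data: five measurements of a quantity predicted to be 9.81.
measurements = [9.79, 9.83, 9.85, 9.80, 9.84]
alpha = 0.05  # chosen significance level
p = two_sided_p_value(measurements, predicted=9.81, sigma=0.02)
print(f"p = {p:.3f}; reject the null hypothesis: {p < alpha}")
```

If the error distribution is not Gaussian or `sigma` has to be estimated from the data, a different test (for example a t-test) would be the appropriate choice.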
Note that the whole procedure is statistical in nature - we can never be 100% sure that our conclusions are correct!
The reason why we try to disprove the existing theory, rather than trying to prove it, is that doing the latter also requires calculating the statistical power, which is usually a more difficult problem, requiring more assumptions.
There exists a multitude of statistical tests for various types of situations, which allows us to account for various sources of statistical error.
Update
Note that the familiar confidence intervals are actually a rather involved concept, grounded in hypothesis testing: the interval has an associated confidence level, which gives the long-run fraction of such intervals, constructed from repeated experiments, that contain the true value of the parameter.
Their everyday interpretation as the spread of measurement values around the "true" value is actually that of the credible interval in Bayesian statistics.
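The frequentist meaning of the confidence level can be checked by simulation. The sketch below repeats a hypothetical experiment many times, builds a 95% interval from each repetition, and counts how often the interval contains the true value. It assumes Gaussian errors with known `sigma`; the function name `coverage` and all numerical values are made up for illustration.

```python
import math
import random

def coverage(true_value, sigma, n, trials, z=1.96, seed=0):
    """Fraction of simulated intervals that contain `true_value`.

    Each trial draws `n` Gaussian measurements and forms the interval
    mean +/- z * sigma / sqrt(n); z = 1.96 corresponds to roughly 95%.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_value, sigma) for _ in range(n)]
        mean = sum(sample) / n
        if mean - half_width <= true_value <= mean + half_width:
            hits += 1
    return hits / trials

frac = coverage(true_value=9.81, sigma=0.02, n=5, trials=2000)
print(f"fraction of intervals containing the true value: {frac:.3f}")
```

The fraction should come out close to 0.95. Note that this is a statement about the procedure over many repetitions, not about any single interval.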
Anyone might doubt experimental methods or results and so what?
However strange they seem, results can be replicated, or not.
Next…
– Robbie Goodwin Apr 01 '21 at 22:52