
Perhaps an elementary question. Given a time-limited measurement situation, would it be better to measure more averages or more data points?

More averages will increase the SNR by a factor of $\sqrt{n}$, i.e., make each data point more reliable, but more data points may make the fit better.

Consider the model $A e^{-Bt} + C e^{-Dt}$, which is a difficult model to fit once noise is introduced.

Assume that, due to the time limit, one can only make 100 measurements in total. Should one measure 20 data points with 5 averages each, 100 data points with no averaging, or 5 data points with 20 averages each?
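(Added for illustration, not part of the original question.) One way to get a feel for the trade-off is a quick Monte Carlo sketch in Python. The parameter values, noise level, and time grid below are my own assumptions, and the starting guess is set to the true values so that only the design of the measurement is being compared:

```python
import numpy as np
from scipy.optimize import curve_fit

def model(t, A, B, C, D):
    return A * np.exp(-B * t) + C * np.exp(-D * t)

TRUE = (1.0, 1.0, 0.5, 0.05)  # illustrative A, B, C, D
SIGMA = 0.05                  # assumed single-measurement noise
rng = np.random.default_rng(0)

def one_experiment(n_points, n_avg):
    """Simulate n_points distinct times, each averaged over n_avg repeats."""
    t = np.linspace(0.1, 60.0, n_points)
    err = SIGMA / np.sqrt(n_avg)          # averaging improves SNR by sqrt(n)
    y = model(t, *TRUE) + rng.normal(0.0, err, n_points)
    try:
        popt, _ = curve_fit(model, t, y, p0=TRUE,
                            sigma=np.full(n_points, err),
                            absolute_sigma=True, maxfev=5000)
        return popt
    except RuntimeError:                  # fit failed to converge
        return None

for n_points, n_avg in [(100, 1), (20, 5), (5, 20)]:
    fits = [one_experiment(n_points, n_avg) for _ in range(500)]
    ok = np.array([f for f in fits if f is not None])
    print(f"{n_points:3d} points x {n_avg:2d} avg: "
          f"{len(ok)} fits converged, std(B) = {ok[:, 1].std():.3f}")
```

The spread of the fitted $B$ across repeated simulated experiments is one rough measure of which allocation is best for this particular model and noise level.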

Mark Eichenlaub
Dan
    Averaging can always be done after the experiment. Measure real data points, maybe you want to look at other properties of the distribution that would get lost by early averaging. – Alexander Nov 21 '11 at 12:11
  • This doesn't look like a physics question to me. – Mark Eichenlaub Nov 21 '11 at 12:40
  • Well, it's not a theoretical problem, but more of a practical problem for experimental physicists. – Dan Nov 21 '11 at 12:41
  • It is unclear to me what the distinction is between "averages" and "measurements." By "averages" do you mean repetitions of the same experiment with the same initial conditions? and by "measurements" do you mean running the experiment with different initial conditions? – AdamRedwine Nov 21 '11 at 12:54
  • So "averages" means measuring repeatedly at the same $t$ for the above model and then averaging those measurements for that $t$, and "measurements" means measuring at different $t$. – Dan Nov 21 '11 at 12:56
  • This is a fine question but it would get better answers on stats.SE. (not that the answers it got here are bad.) – Colin K Nov 21 '11 at 23:48

2 Answers


In the practice of measurement control, a very good rule of thumb is to have at least 30 measurement points under the same conditions when characterizing an unknown situation. If you have time for 100 measurements, I would suggest making 33 "repetition" measurements at each of three values of $t$: low, mid, and high.

The justification for the number 30 comes from analysis of populations of measured variables using the Student's T distribution. In your case, the degrees of freedom would be one less than the number of "repetition measurements," i.e. 32.
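(Not in the original answer.) To see where the rule of thumb comes from, compare the 95 % two-sided Student's t critical value with the normal-distribution value of 1.96 as the number of repetitions grows; by around 30 points the two nearly agree. A quick check with scipy:

```python
from scipy import stats

# 95% two-sided critical value of Student's t for n repeated measurements
# (df = n - 1), compared with the normal-distribution value 1.96:
for n in (3, 5, 10, 30, 100):
    print(f"n = {n:3d}: t_crit = {stats.t.ppf(0.975, df=n - 1):.3f}")
# n =   3: t_crit = 4.303
# n =   5: t_crit = 2.776
# n =  10: t_crit = 2.262
# n =  30: t_crit = 2.045
# n = 100: t_crit = 1.984
```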

EDIT

I should probably point out that this rule is used for characterizing the variability of the measurement system, not the measurand. If you are using a well-calibrated piece of equipment, e.g. a scale with well-known uncertainty, you can make fewer measurements.

Nonetheless, I would still recommend multiple measurements of the measurand. When adding up the contributions to the total uncertainty, the standard deviation of the measuring system is divided by the square root of the number of measurements taken. That is,

$$\sigma_m = \frac{\sigma_s}{\sqrt{n}}$$

where

  • $\sigma_m$ is the contribution to the total uncertainty in the measured value
  • $\sigma_s$ is the characterized uncertainty of the measurement system and
  • $n$ is the number of "repetition" measurements taken

You should aim to make $n$ sufficiently large that $\sigma_m$ is small relative to the other contributors to the total uncertainty.
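For a concrete (made-up) numeric example: if the measurement system has been characterized at $\sigma_s = 0.5$ units and you take $n = 25$ repetition measurements, then

$$\sigma_m = \frac{0.5}{\sqrt{25}} = 0.1\ \text{units},$$

which would be a minor contribution next to, say, another uncertainty term of 0.4 units.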

AdamRedwine
  • FYI, there's thousands of books on measurement control and, while you shouldn't try to become an expert if you don't need to, it might benefit you to read a bit. A lot of resources are available free online through national metrology laboratories like NIST. Google "introduction to measurement uncertainty" and you should get more than enough. – AdamRedwine Nov 21 '11 at 13:27
  • Sure thing. I happen to do measurement control for a living and coincidentally just had training on exactly this topic earlier this week. The instructors for my class highly recommended the UKAS guide M 3003. You get it from Google. – AdamRedwine Nov 22 '11 at 13:37

If you already know the uncertainties of your measurement process and you care about minimizing the uncertainties in the fitted parameters, I believe it makes no difference whether you repeat your measurement at the same $t$ several times or not, with one important caveat (below). I can prove that this is the case for a linear fit. I've never done the calculation for a nonlinear fit like the one in your example, so it may not hold true in that case.

Proof for a linear fit

I'm sorry if this is too complicated. I can clarify anything in the comments, or you can just skip this part if you're not interested in it, of course ;-)

From Bevington, "Data Reduction and Error Analysis for the Physical Sciences": if you fit a function of the form $y(x) = \sum_{k=1}^N a_k f_k(x)$, then the coefficients are given by ratios of determinants (Cramer's rule applied to the normal equations), e.g.

$$a_1 = \frac{1}{\Delta} \begin{vmatrix} \sum_l \frac{y_l f_1(x_l)}{\sigma_l^2} & \sum_l \frac{f_1(x_l) f_2(x_l)}{\sigma_l^2} & \sum_l \frac{f_1(x_l) f_3(x_l)}{\sigma_l^2} \\ \sum_l \frac{y_l f_2(x_l)}{\sigma_l^2} & \sum_l \frac{f_2(x_l) f_2(x_l)}{\sigma_l^2} & \sum_l \frac{f_2(x_l) f_3(x_l)}{\sigma_l^2} \\ \sum_l \frac{y_l f_3(x_l)}{\sigma_l^2} & \sum_l \frac{f_3(x_l) f_2(x_l)}{\sigma_l^2} & \sum_l \frac{f_3(x_l) f_3(x_l)}{\sigma_l^2} \end{vmatrix}$$

where $\Delta$ is the same determinant with the $y_l$ in the first column replaced by $f_1(x_l)$ (this is for $N = 3$, but it's the same for larger $N$). The vertical bars denote the determinant of the matrix inside them.

The only important thing to notice here is that each term in the expansion of $a_k$ contains $y_l$ exactly once, and only to the first power (i.e., there's no $y_l^2$ or higher).

The variance of $a_k$ is $\sigma_{a_k}^2 = \sum_m \left(\frac{\partial a_k}{\partial y_m} \sigma_m\right)^2$ but, since you differentiate with respect to $y_m$, the measured values themselves don't appear in the final expression. You just get lots of terms and products of the form $\sum_{l=1}^N \frac{f_i(x_l) f_j(x_l)}{\sigma_l^2}$ (the sum now running over the $N$ data points). If the first $n$ terms in the sum were measured at the same point $x_1$, you get $$\sum_{l=1}^{n} \frac{f_i(x_1) f_j(x_1)}{\sigma_1^2} + \sum_{l=n+1}^N \frac{f_i(x_l) f_j(x_l)}{\sigma_l^2} = n \frac{f_i(x_1) f_j(x_1)}{\sigma_1^2} + \sum_{l=n+1}^N \frac{f_i(x_l) f_j(x_l)}{\sigma_l^2}$$ This can be written as $$\frac{f_i(x_1) f_j(x_1)}{(\sigma_1/\sqrt{n})^2} + \sum_{l=n+1}^N \frac{f_i(x_l) f_j(x_l)}{\sigma_l^2}$$ This last expression is exactly what you would get if you measured $n$ times at the same $x$ and averaged those data points before feeding them to your fitting routine.

So you see that, in the linear case, there is no advantage in measuring repeatedly at the same point.
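(An illustrative check, not part of the original answer.) The equivalence is easy to verify numerically with a weighted least-squares fit: $n$ repeated points at the same $x$, each with uncertainty $\sigma$, give exactly the same coefficients and covariance as one pre-averaged point with uncertainty $\sigma/\sqrt{n}$. A sketch for a straight-line fit:

```python
import numpy as np

rng = np.random.default_rng(1)
SIGMA = 0.2

# Design with 4 repeats at x = 1.0, plus three other points
x_rep = np.array([1.0, 1.0, 1.0, 1.0, 2.0, 3.0, 4.0])
y_rep = 0.5 + 1.5 * x_rep + rng.normal(0.0, SIGMA, x_rep.size)

def wls(x, y, s):
    """Weighted least squares for y = a1 + a2*x; coefficients and covariance."""
    X = np.column_stack([np.ones_like(x), x])
    W = np.diag(1.0 / s**2)
    cov = np.linalg.inv(X.T @ W @ X)
    return cov @ X.T @ W @ y, cov

# Fit 1: every repeated point individually, each with uncertainty SIGMA
a1, c1 = wls(x_rep, y_rep, np.full(x_rep.size, SIGMA))

# Fit 2: average the 4 repeats first; that point gets uncertainty SIGMA/sqrt(4)
x_avg = np.array([1.0, 2.0, 3.0, 4.0])
y_avg = np.array([y_rep[:4].mean(), *y_rep[4:]])
s_avg = np.array([SIGMA / 2, SIGMA, SIGMA, SIGMA])
a2, c2 = wls(x_avg, y_avg, s_avg)

print(np.allclose(a1, a2), np.allclose(c1, c2))  # True True
```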

Discussion for a nonlinear fit

In the nonlinear case, as in your example, I'm not so sure if it's the same or not. I hope other users here can answer you in more detail.

In my (not so long) experience, though, I prefer to measure at different points, but (and this is the caveat) taking care to constrain the possible values of the parameters (the "wiggle room" the function has, so to speak). For example, if in your function with two exponential decays you have $B = 1/(1\,\mathrm{s})$ and $D = 1/(20\,\mathrm{s})$, you need several points on the timescale of each decay: several around $t \approx 1\,\mathrm{s}$ and several around $t \approx 20\,\mathrm{s}$. For example, you could take 50 points from 0 s to 5 s and 50 from 10 s to 100 s.

If you don't do this, the data may not determine one of the parameters very well, simply because you don't have much information in the range where that parameter dominates.
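(Again, an illustrative sketch with made-up numbers.) One way to make this quantitative is linearized error propagation: near the true parameters, the fit covariance is approximately $\sigma^2 (J^\top J)^{-1}$, where $J$ is the Jacobian of the model with respect to $(A, B, C, D)$. Comparing a design that covers both timescales with one that only covers the fast decay:

```python
import numpy as np

# Linearized error propagation: near the true parameters the fit covariance
# is approximately sigma^2 * (J^T J)^{-1}, with J the model Jacobian.
A, B, C, D = 1.0, 1.0, 0.5, 0.05   # made-up values: timescales 1 s and 20 s
SIGMA = 0.05

def jacobian(t):
    eB, eD = np.exp(-B * t), np.exp(-D * t)
    return np.column_stack([eB, -A * t * eB, eD, -C * t * eD])

def predicted_std(t):
    """Predicted standard deviations of the fitted (A, B, C, D)."""
    J = jacobian(t)
    return np.sqrt(np.diag(SIGMA**2 * np.linalg.inv(J.T @ J)))

t_both = np.concatenate([np.linspace(0.1, 5, 50), np.linspace(10, 100, 50)])
t_fast = np.linspace(0.1, 5, 100)   # same number of points, fast decay only

print("both timescales:", predicted_std(t_both))
print("fast decay only:", predicted_std(t_fast))  # D is poorly constrained
```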

Again, maybe someone can justify the above reasoning more rigorously, but qualitatively I think it is correct. And, as AdamRedwine wrote, this holds only when you already know your uncertainties.

Arnoques