Mean Spirited

One of the more interesting things about programming for the “real world” is that you get to see a lot of practical applications for the math you learned (or didn’t) in school. Every few years I find I have to refresh my memory on certain things and, at least so far, it usually comes back pretty quickly.

It isn’t surprising to forget how to do a double integral or a non-homogeneous differential equation. But even something as seemingly innocuous as an average can turn out to be more complicated than you think. Quick. What’s the average of 60 and 30?

I’m going to guess you said 45. But a better answer might be to ask what 60 and 30 represent. If they are successive measurements from an A/D converter measuring battery voltage, for example, then 45 is the right answer. That’s the arithmetic mean:

double avg=(60.0+30.0)/2.0;

What if I turned it into a word problem: When I go to the beach (it is that time of year) I like to get there in a hurry. So I drive 60 miles per hour. When I go home, I’m tired and I don’t want to go home so I drive back at 30 miles per hour. What’s my average speed? Raise your hand if you think the answer is 45.

Given that I can’t see if you raised your hand, I’m going to pretend that you did. The average of two speeds is not a straight average. The simple reason is that even though the distance is the same, I will spend twice as much time going home as I did driving to the beach.

The formula is simple but it is easier to think about it intuitively. What I really want is miles per hour. So the real question is how many miles did I go in how many hours. Assume that driving to the beach takes me T hours. Since I go home at half the speed, I will spend 2*T hours going home. The distance, call it D, is the same in both directions. So:

float T=0.5f;  // hours to the beach; any positive value gives the same result
float D=T*60;  // it also equals 2*T*30
float speed=2*D/(3*T);  // distance coming and going, divided by the total time (T+2*T)
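If you would rather let the computer do the thinking, here is a minimal sketch of the same reasoning as a function: total miles divided by total hours. The function and parameter names are made up for illustration, and it assumes the distance and both speeds are positive.

```c
#include <assert.h>

/* Average speed over a round trip: total distance divided by total time.
   round_trip_mph, one_way_miles, mph_out and mph_back are hypothetical names. */
double round_trip_mph(double one_way_miles, double mph_out, double mph_back)
{
    /* Assumption: a real trip, so everything is strictly positive. */
    assert(one_way_miles > 0.0 && mph_out > 0.0 && mph_back > 0.0);

    double hours_out  = one_way_miles / mph_out;   /* time spent going */
    double hours_back = one_way_miles / mph_back;  /* time spent returning */

    return (2.0 * one_way_miles) / (hours_out + hours_back);
}
```

Calling round_trip_mph(30.0, 60.0, 30.0) should come out to 40, and so should any other positive distance, since the distance cancels out.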

Of course, that’s specific to our example. The general form is to add up the reciprocals of the numbers, divide by the count (in other words, take the regular average of the reciprocals), and then take the reciprocal of the result. With a little algebra you wind up with the count divided by the sum of the reciprocals:

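unsigned count;  // set externally
double *data;    // points to count nonzero items, set externally
unsigned i;
double avg=0.0;

for (i=0;i<count;i++) avg+=1.0/data[i];  // sum of the reciprocals
avg=count/avg;  // count divided by that sum is the harmonic mean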

If you don’t feel like compiling that, the answer is 40 miles per hour. That makes sense. I live about 30 minutes away from the Galveston beach, at 60 miles an hour. So that’s 30 miles. It takes me a half hour to get there and an hour to get back (at 30 MPH). Therefore, I went 60 miles in an hour and a half. That’s 40 MPH average.
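If you do feel like compiling something, here is a self-contained sketch of that calculation as a function. The name harmonic_mean and its signature are my own invention for illustration, and it assumes the array is non-empty and contains no zeros (a zero speed would mean you never arrive).

```c
#include <assert.h>
#include <stddef.h>

/* Harmonic mean: the count divided by the sum of the reciprocals.
   harmonic_mean is a hypothetical helper, not a library function.
   Assumes count > 0 and every value is nonzero. */
double harmonic_mean(const double *data, size_t count)
{
    assert(count > 0);

    double sum = 0.0;
    for (size_t i = 0; i < count; i++) {
        assert(data[i] != 0.0);  /* a zero would divide by zero */
        sum += 1.0 / data[i];
    }
    return (double)count / sum;
}
```

Feeding it an array holding 60.0 and 30.0 should give 40, matching the beach trip. Note that this only matches the average speed when each leg covers the same distance.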

By the way, in person last week I told someone this was the geometric mean, but that just shows that I am getting forgetful. It is actually the harmonic mean. The geometric mean is a different formula.

In some cases, it is easier to not even do that much math. For example, if you are reading three redundant sensors, sometimes it is easier to just sort the numbers and pick the one in the middle. That’s called the median. Suppose two sensors agree that you have 100 counts, but the other one shorted to ground and is reading 0. Picking the middle value will give you one of the good values. If the two good ones don’t agree, you’ll still get one of the good values, because the bad reading ends up at one end of the sorted list.

If you have a lot of values, you probably would sort (although you can also figure out a way to stop sorting once you find the middle if you have a very large data set). If you have just a few values, some nested if statements will do the trick.
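For the three-sensor case, the nested if statements might look something like this. It is only a sketch: median3 is a hypothetical name, and I am assuming the readings are plain integer counts.

```c
/* Median of three readings using nested ifs, no sorting required.
   median3 is a hypothetical helper name. */
int median3(int a, int b, int c)
{
    if (a <= b) {
        if (b <= c) return b;     /* a <= b <= c, so b is the middle */
        return (a <= c) ? c : a;  /* b is the largest: middle is the larger of a and c */
    }
    /* From here on, b < a. */
    if (a <= c) return a;         /* b < a <= c, so a is the middle */
    return (b <= c) ? c : b;      /* a is the largest: middle is the larger of b and c */
}
```

With the shorted sensor from above, median3(100, 0, 100) returns 100 no matter which of the three positions the bad reading lands in.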

You might think that exhausts the whole topic of averages, but it really doesn’t even scratch the surface! Don’t believe me? Have a look at and don’t ask me what you do with a winsorized mean – I have no idea.

For completeness, here’s a general form for the “regular” average:

double avg=0.0;
unsigned count,i;  // count set externally
double *data;      // points to count items, set externally

for (i=0;i<count;i++) avg+=data[i];  // sum the items
avg/=count;  // divide by the count (assumes count is not zero)

I’m guessing you haven’t forgotten that. Or at least I haven’t. Not yet, anyway.