I missed Cosma Shalizi’s comment on my first post on averages versus means. Rather than write a blog-length reply, I’m pulling it out into its own little lexicographic excursion. To borrow stylistically from Cosma’s blog, I’ll warn you, the reader, that this post is more linguistics than statistics.
Three Concepts and Terminology
Presumably everyone’s clear on the distinctions among the three concepts,
1. the [arithmetic] sample mean,
2. the mean of a distribution, and
3. the expectation of a random variable.
The relations among these concepts are very rich, which is what I conjecture is causing their conflation.
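For concreteness, here is one way to write the three concepts down in the discrete case. The notation is an assumption for illustration and is not taken from any of the sources discussed: $x_1, \ldots, x_n$ are observed values, $p$ is a probability function, and $p_X$ is the probability function of a random variable $X$.

```latex
% (1) arithmetic sample mean of observations x_1, ..., x_n
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i

% (2) mean of a discrete distribution with probability function p
\mathrm{mean}(p) = \sum_{x} x \, p(x)

% (3) expectation of a discrete random variable X with probability function p_X
\mathbb{E}[X] = \sum_{x} x \, p_X(x)
```

Written this way, (2) and (3) share a right-hand side but take different kinds of arguments, a distribution in one case and a random variable in the other.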
Let’s set aside the discussion of “average”, as it’s less precise terminologically. But even the very precision of the term “average” is debatable! The Random House Dictionary lists the meaning of “average” in statistics as the arithmetic mean. Wikipedia, on the other hand, agrees with Cosma, and lists a number of statistics of centrality (sample mean, sample median, etc.) as being candidates for the meaning of “average”.
Distinguishing Means and Sample Means
Getting to the references Cosma asked for, all of my intro textbooks (Ash, Feller, DeGroot and Schervish, Larsen and Marx) distinguished sense (1) from senses (2) and (3). Even the Wikipedia entry for “mean” leads off with
In statistics, mean has two related meanings:
the arithmetic mean (and is distinguished from the geometric mean or harmonic mean).
the expected value of a random variable, which is also called the population mean.
Unfortunately, the invocation of the population mean here is problematic. Random variables aren’t intrinsically related to populations in any way (at least under the Bayesian conception of what can be modeled as random). Populations can be characterized by a set of (conditionally) independent and identically distributed (i.i.d.) random variables, each corresponding to a measurable quantity of a member of the population. And of course, averages of random variables are themselves random variables.
This reminds me of the common typological mistake of talking about “sampling from a random variable” (a web search for the phrase turns up plenty of hits).
Population Means and Empirical Distributions
Wikipedia introduces a fourth concept, the population mean, which is just the arithmetic mean of a given population. This is related to the point Cosma brought up in his comment, that you can think of a sample mean as the mean of the empirically observed distribution. For instance, if you observe three heads and a tail in four coin flips, you can create a discrete random variable $X$ with $p_X(1) = 3/4$ and $p_X(0) = 1/4$; the average number of heads per flip is then equal to the expectation of $X$, or equivalently the mean of $p_X$.
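The coin-flip example can be checked with a few lines of Python. This is only a sketch: the data are hypothetical, taken here to be three heads and one tail (an assumption about the intended example), with heads coded as 1 and tails as 0, and the names `flips` and `p_X` are made up for illustration.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical observations: 1 = heads, 0 = tails (three heads, one tail).
flips = [1, 1, 1, 0]

# Sense (1): the arithmetic sample mean of the observations.
sample_mean = Fraction(sum(flips), len(flips))

# Empirical distribution: each observed value gets its relative frequency.
counts = Counter(flips)
p_X = {value: Fraction(count, len(flips)) for value, count in counts.items()}

# Sense (2): the mean of that distribution, the sum of value * probability.
distribution_mean = sum(value * prob for value, prob in p_X.items())

print(sample_mean, distribution_mean)
```

Exact fractions are used so the two quantities can be compared with `==` rather than up to floating-point error.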
Conflating Means and Expectations
I was surprised that, like Wikipedia, almost all the sources I consulted explicitly conflated senses (2) and (3). Feller’s 1950 An Introduction to Probability Theory and Its Applications, Vol. 1, says the following on page 221.
The terms mean, average, and mathematical expectation are synonymous. We also speak of the mean of a distribution instead of referring to a corresponding random variable.
The second sentence is telling. Distributions have means independently of whether we’re talking about a random variable or not. If one forbids talk of distributions as first-class objects with their own existence free of random variables, one might argue that concepts (2) and (3) should always be conflated.
Metonymy and Lexical Semantic Coercion
I think the short story about what’s going on in conflating (2) and (3) is metonymy. For example, I can use “New York” to refer to the New York Yankees or to the city government, but no one will understand me if I try to use “New York Yankees” to refer to the city or its government. In the first case I’m taking one aspect of the team, namely its location, and using that to refer to the whole.
This can happen implicitly with other kinds of associations. I can talk about the “mean” of a random variable $X$ by implicitly invoking its probability function $p_X$. I can also talk about the expectation of a distribution by implicitly invoking the appropriate random variable. Sometimes authors try to sidestep random variable notation by writing $\mathbb{E}_p[x]$, which to my type-sensitive mind appears ill-formed; what they really mean to write is $\mathbb{E}[X]$ where $p_X = p$.
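One way to make the type distinction concrete is a small typed sketch in Python. Everything here is hypothetical and for illustration only: `Distribution`, `RandomVariable`, and `expectation` are made-up names, and a random variable is reduced to nothing more than a label plus the distribution it follows, which is a simplification of the usual definition as a function on a sample space.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Dict


@dataclass
class Distribution:
    """A discrete probability function mapping each value to its probability."""
    pmf: Dict[int, Fraction]

    def mean(self) -> Fraction:
        # Means belong to distributions: sum of value * probability.
        return sum((v * p for v, p in self.pmf.items()), Fraction(0))


@dataclass
class RandomVariable:
    """A named random variable, reduced here to the distribution it follows."""
    name: str
    distribution: Distribution


def expectation(x: RandomVariable) -> Fraction:
    # Expectations belong to random variables and defer to the distribution.
    return x.distribution.mean()


p = Distribution({1: Fraction(3, 4), 0: Fraction(1, 4)})
X = RandomVariable("X", p)
print(expectation(X), p.mean())
```

In this sketch `p.mean()` and `expectation(X)` return the same number, but `expectation(p)` and `X.mean()` would both be flagged by a type checker, and those are the coercions that the metonymic readings quietly perform.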
I found it painfully difficult to learn statistics because of this sloppiness with respect to the types of things, especially among less careful authors. Bayesians, myself now included, often write the same symbol, such as $\theta$, for both a random variable and a bound variable; see Gelman et al.’s Bayesian Data Analysis for a typical example of this style.
Settling down into my linguistic armchair, I’ll conclude by noting that it seems to me more felicitous to say
the expectation of a random variable is the mean of its distribution.
than to say
the mean of a random variable is the expectation of its distribution.