To understand Sigma, we must crack open our old statistics textbooks and grasp the most basic principles of the subject. We all understand the concept of an average, which in statistical terms is called the “mean.” Take any set of data, add up the values and divide by the number of data points, and we have the mean; pretty simple.
The mean, though, only tells us one thing: where our data is centered. It does not tell us how widely that data is spread about the mean. You may at this juncture be wondering why we would even care; good question.
If all we calculate is the mean, we have described what is happening with our data, but we are poorly placed to predict what will happen when more data is collected. Example (as found in the March 2006 issue of The Canmaker - http://www.sayers-publishing.com): the mean of the following 10 beverage can diameters is 2.600 inches:
2.600
2.601
2.599
2.599
2.600
2.601
2.600
2.599
2.601
2.600
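As a quick check of the arithmetic, the mean of the ten diameters above can be reproduced in a few lines of Python. This is only a sketch: the variable names are ours, not The Canmaker's, and the values are copied from the list above.

```python
from statistics import mean

# Diameters of the first set of cans, in inches (copied from the list above).
first_set = [2.600, 2.601, 2.599, 2.599, 2.600,
             2.601, 2.600, 2.599, 2.601, 2.600]

# Add up the values and divide by the number of data points.
manual_mean = sum(first_set) / len(first_set)

print(f"{manual_mean:.3f}")       # mean worked out by hand
print(f"{mean(first_set):.3f}")   # same calculation from the standard library
```

Both lines should print 2.600, matching the figure quoted above.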
If you were asked the probability of choosing a can with a diameter of 2.600 from this population (the production line), you might say that probability is "good." Were you further asked the probability of choosing another can between 2.599 and 2.601, you might say "very good" or "a sure thing."
Now, let us look at another set of 10 beverage can diameters, whose mean diameter is also 2.600 inches:
2.601
2.598
2.601
2.602
2.600
2.599
2.601
2.598
2.602
2.598
Were you asked the probability of choosing another can from this population with a diameter of 2.600, you might say "not very good." Asked the same question as for the first set of cans - the probability of choosing a can with a diameter between 2.599 and 2.601 - you might say "good."
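To put rough numbers on "good" and "not very good," we can count how many cans in each sample meet each condition. This is a minimal sketch, assuming the observed share in a 10-can sample is a fair stand-in for the probability on the production line, which is a strong assumption for so small a sample. The function names are hypothetical, and Decimal is used only so that 2.600 compares exactly.

```python
from decimal import Decimal

# Both samples, in inches, copied from the lists above.
first_set = [Decimal(x) for x in
             "2.600 2.601 2.599 2.599 2.600 2.601 2.600 2.599 2.601 2.600".split()]
second_set = [Decimal(x) for x in
              "2.601 2.598 2.601 2.602 2.600 2.599 2.601 2.598 2.602 2.598".split()]

def share_equal(cans, target):
    """Fraction of cans whose diameter is exactly `target`."""
    return sum(1 for c in cans if c == target) / len(cans)

def share_between(cans, low, high):
    """Fraction of cans whose diameter lies in [low, high]."""
    return sum(1 for c in cans if low <= c <= high) / len(cans)

target = Decimal("2.600")
low, high = Decimal("2.599"), Decimal("2.601")

print("first set: ", share_equal(first_set, target), share_between(first_set, low, high))
print("second set:", share_equal(second_set, target), share_between(second_set, low, high))
# first set:  0.4 1.0
# second set: 0.1 0.5
```

The two samples share a mean, yet the second gives noticeably lower shares on both questions, which is the gap that a measure of spread is meant to capture.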
Enter the statistical concept of Sigma, or standard deviation. Here we are concerned with how far each data point in the dataset is from the calculated mean. To calculate Sigma for each set of our cans, which are a sample from a larger population (the production line), we:
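1.) Calculate the difference between each data point and the mean and square each result. If a can diameter measures 2.601, the difference between it and the mean of 2.600 is .001. Squaring .001 yields .000001. A can measuring 2.599 differs by -.001, but the tables list only the size of each difference, since squaring removes the sign.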
First set of cans, mean = 2.600 inches
| diameter | difference from the mean | difference squared |
| 2.600 | 0 | 0 |
| 2.601 | .001 | .000001 |
| 2.599 | .001 | .000001 |
| 2.599 | .001 | .000001 |
| 2.600 | 0 | 0 |
| 2.601 | .001 | .000001 |
| 2.600 | 0 | 0 |
| 2.599 | .001 | .000001 |
| 2.601 | .001 | .000001 |
| 2.600 | 0 | 0 |
Second set of cans, mean = 2.600 inches
| diameter | difference from the mean | difference squared |
| 2.601 | .001 | .000001 |
| 2.598 | .002 | .000004 |
| 2.601 | .001 | .000001 |
| 2.602 | .002 | .000004 |
| 2.600 | 0 | 0 |
| 2.599 | .001 | .000001 |
| 2.601 | .001 | .000001 |
| 2.598 | .002 | .000004 |
| 2.602 | .002 | .000004 |
| 2.598 | .002 | .000004 |
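Step 1 can be sketched in Python as follows, here for the second set of cans. The variable names are our own, and the mean of 2.600 is taken as given from the text rather than recomputed.

```python
# Second set of can diameters, in inches (copied from the list above).
second_set = [2.601, 2.598, 2.601, 2.602, 2.600,
              2.599, 2.601, 2.598, 2.602, 2.598]
mean_diameter = 2.600  # stated in the text for both sets

# Step 1: difference from the mean, squared, for every can.
squared_differences = [(d - mean_diameter) ** 2 for d in second_set]

# Print one row per can: diameter, size of the difference, squared difference.
for d, sq in zip(second_set, squared_differences):
    print(f"{d:.3f}  {abs(d - mean_diameter):.3f}  {sq:.6f}")
```

Each printed row should match the corresponding row of the table for the second set, apart from the leading zero that Python adds.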
...then we...
2.) Sum all these squared values,
First set of cans, mean = 2.600 inches
| diameter | difference from the mean | difference squared |
| 2.600 | 0 | 0 |
| 2.601 | .001 | .000001 |
| 2.599 | .001 | .000001 |
| 2.599 | .001 | .000001 |
| 2.600 | 0 | 0 |
| 2.601 | .001 | .000001 |
| 2.600 | 0 | 0 |
| 2.599 | .001 | .000001 |
| 2.601 | .001 | .000001 |
| 2.600 | 0 | 0 |
| sum of squared differences | | .000006 |
Second set of cans, mean = 2.600 inches
| diameter | difference from the mean | difference squared |
| 2.601 | .001 | .000001 |
| 2.598 | .002 | .000004 |
| 2.601 | .001 | .000001 |
| 2.602 | .002 | .000004 |
| 2.600 | 0 | 0 |
| 2.599 | .001 | .000001 |
| 2.601 | .001 | .000001 |
| 2.598 | .002 | .000004 |
| 2.602 | .002 | .000004 |
| 2.598 | .002 | .000004 |
| sum of squared differences | | .000024 |
3.) Divide by the number of data points minus 1.
First set of cans, mean = 2.600 inches
| diameter | difference from the mean | difference squared |
| 2.600 | 0 | 0 |
| 2.601 | .001 | .000001 |
| 2.599 | .001 | .000001 |
| 2.599 | .001 | .000001 |
| 2.600 | 0 | 0 |
| 2.601 | .001 | .000001 |
| 2.600 | 0 | 0 |
| 2.599 | .001 | .000001 |
| 2.601 | .001 | .000001 |
| 2.600 | 0 | 0 |
| sum of squared differences | | .000006 |
| divide by n-1, or 10-1 = 9 | | .000000666... |
Second set of cans, mean = 2.600 inches
| diameter | difference from the mean | difference squared |
| 2.601 | .001 | .000001 |
| 2.598 | .002 | .000004 |
| 2.601 | .001 | .000001 |
| 2.602 | .002 | .000004 |
| 2.600 | 0 | 0 |
| 2.599 | .001 | .000001 |
| 2.601 | .001 | .000001 |
| 2.598 | .002 | .000004 |
| 2.602 | .002 | .000004 |
| 2.598 | .002 | .000004 |
| sum of squared differences | | .000024 |
| divide by n-1, or 10-1 = 9 | | .000002666... |
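Steps 2 and 3 together give what statisticians call the sample variance. Dividing by n-1 rather than n is the usual correction when the data are a sample and the mean has been estimated from that same sample. The sketch below uses a hypothetical helper, sample_variance, and checks it against variance from Python's standard statistics module, which also divides by n-1.

```python
from statistics import variance

# Both samples, in inches, copied from the lists above.
first_set = [2.600, 2.601, 2.599, 2.599, 2.600,
             2.601, 2.600, 2.599, 2.601, 2.600]
second_set = [2.601, 2.598, 2.601, 2.602, 2.600,
              2.599, 2.601, 2.598, 2.602, 2.598]

def sample_variance(data):
    """Sum of squared differences from the mean, divided by n - 1."""
    n = len(data)
    m = sum(data) / n
    sum_of_squares = sum((x - m) ** 2 for x in data)  # step 2
    return sum_of_squares / (n - 1)                   # step 3

for name, cans in (("first", first_set), ("second", second_set)):
    # Hand-rolled result next to the standard-library result.
    print(name, f"{sample_variance(cans):.9f}", f"{variance(cans):.9f}")
```

The two columns should agree, at about .000000667 for the first set and .000002667 for the second.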
...finally we take this result and...
4.) Calculate the square root.
First set of cans, mean = 2.600 inches
| diameter | difference from the mean | difference squared |
| 2.600 | 0 | 0 |
| 2.601 | .001 | .000001 |
| 2.599 | .001 | .000001 |
| 2.599 | .001 | .000001 |
| 2.600 | 0 | 0 |
| 2.601 | .001 | .000001 |
| 2.600 | 0 | 0 |
| 2.599 | .001 | .000001 |
| 2.601 | .001 | .000001 |
| 2.600 | 0 | 0 |
| sum of squared differences | | .000006 |
| divide by n-1, or 10-1 = 9 | | .000000666... |
| take the square root | | .000816496 |
Second set of cans, mean = 2.600 inches
| diameter | difference from the mean | difference squared |
| 2.601 | .001 | .000001 |
| 2.598 | .002 | .000004 |
| 2.601 | .001 | .000001 |
| 2.602 | .002 | .000004 |
| 2.600 | 0 | 0 |
| 2.599 | .001 | .000001 |
| 2.601 | .001 | .000001 |
| 2.598 | .002 | .000004 |
| 2.602 | .002 | .000004 |
| 2.598 | .002 | .000004 |
| sum of squared differences | | .000024 |
| divide by n-1, or 10-1 = 9 | | .000002666... |
| take the square root | | .001632993 |
That number we have calculated is Sigma, or the sample standard deviation. In our first example above, Sigma is .000816496 inches; in the second, .001632993 inches. We may now use the combination of the mean and Sigma to carry out any number of statistical calculations. Notice that the second sample's Sigma is twice the first's, which shows that the second set is much more dispersed about the calculated mean of 2.600 inches.
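All four steps can be collected into one short function. This is a sketch rather than a recommended implementation: sample_sigma is a name of our own, and in practice stdev from Python's standard statistics module does the same job, so we use it here as a cross-check.

```python
import math
from statistics import stdev

# Both samples, in inches, copied from the lists above.
first_set = [2.600, 2.601, 2.599, 2.599, 2.600,
             2.601, 2.600, 2.599, 2.601, 2.600]
second_set = [2.601, 2.598, 2.601, 2.602, 2.600,
              2.599, 2.601, 2.598, 2.602, 2.598]

def sample_sigma(data):
    """Sample standard deviation, following steps 1 to 4."""
    n = len(data)
    m = sum(data) / n
    squared = [(x - m) ** 2 for x in data]   # step 1: squared differences
    total = sum(squared)                     # step 2: sum them
    var = total / (n - 1)                    # step 3: divide by n - 1
    return math.sqrt(var)                    # step 4: square root

sigma_1 = sample_sigma(first_set)
sigma_2 = sample_sigma(second_set)

print(f"{sigma_1:.9f}  {stdev(first_set):.9f}")
print(f"{sigma_2:.9f}  {stdev(second_set):.9f}")
print(f"ratio: {sigma_2 / sigma_1:.3f}")
```

Python rounds the last printed digit, so a value worked by hand and truncated may differ from the output in the ninth decimal place. The ratio line should print 2.000, confirming that the second set is twice as dispersed as the first.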