Standard Deviation Calculator
This standard deviation calculator computes the population and sample standard deviation of a data set. It also provides the variance, mean, and sum. To use the calculator, please provide two or more numbers separated by a comma or semicolon and click the "Calculate" button.
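As a rough illustration of the quantities listed above, the following Python sketch parses a comma- or semicolon-separated string and reports the same statistics. It is not the calculator's actual implementation. The function name `summarize` and the dictionary keys are assumptions made for this example, and the input values are the five weights used in the example further down the page.

```python
import re
import statistics

def summarize(text):
    """Parse numbers separated by commas or semicolons and summarize them.

    Hypothetical helper for illustration only.
    """
    # Split on either separator and skip empty tokens (e.g. a trailing comma).
    values = [float(tok) for tok in re.split(r"[,;]", text) if tok.strip()]
    if len(values) < 2:
        raise ValueError("provide two or more numbers")
    return {
        "sum": sum(values),
        "mean": statistics.mean(values),
        "population_variance": statistics.pvariance(values),
        "sample_variance": statistics.variance(values),
        "population_sd": statistics.pstdev(values),
        "sample_sd": statistics.stdev(values),
    }

result = summarize("103, 110; 115, 120, 126")
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The standard-library `statistics` module is used here so that the population (`pstdev`, `pvariance`) and sample (`stdev`, `variance`) versions stay clearly separated.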
What is standard deviation?
Standard deviation is a measure of variability in a given set of data. More specifically, it is a measure of how far each value in the data set, on average, lies from the mean. A higher standard deviation tells us that the values in the data set tend to be farther from the mean, meaning that the values are spread over a wider range. A lower standard deviation tells us that the values in the data set tend to be closer to the mean; such a data set is more consistent, and its mean is a more reliable summary of the data, than a data set with a high standard deviation.
For example, given that the weights of 5 people are 103, 110, 115, 120, and 126 pounds, the mean of their weights is 114.8 pounds with a population standard deviation of 7.935 pounds. When using standard deviation to examine a data set, we often assume that the data follows a normal distribution. If it does, we can expect about 68% of the values in the data set to lie within 1 standard deviation of the mean. Referencing our example, we would add and subtract 7.935 from the mean of 114.8 to find the range within which we expect 68% of values to lie:
114.8 - 7.935 = 106.865
114.8 + 7.935 = 122.735
Thus, anything between 106.865 and 122.735 is within one standard deviation of the mean. In our case, 3 of the 5 values (60%) are within a single standard deviation of the mean, while the remaining 2 are within 2 standard deviations. The data set is very small, so we likely wouldn't draw many conclusions from this result: with only 5 values, the proportion can only change in steps of 20%, and 60% is the closest achievable figure to 68%. In a larger data set, a proportion well below 68% might lead us to suspect that the data does not closely follow a normal distribution.
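The check described above can be sketched in a few lines of Python. This is only an illustration using the five weights from the example; the helper name `within_one_sd` is an assumption, and the population standard deviation is used to match the 7.935 figure.

```python
import statistics

def within_one_sd(values):
    """Return the values lying within one population SD of the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    lower, upper = mean - sd, mean + sd
    return [v for v in values if lower <= v <= upper]

weights = [103, 110, 115, 120, 126]
inside = within_one_sd(weights)
share = len(inside) / len(weights)

print(inside)  # [110, 115, 120]
print(share)   # 0.6
```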
Standard deviation formula
The formula for calculating standard deviation depends on whether the data set represents a sample or a population (discussed below). The two formulas are nearly identical, however, and differ only in the divisor.
Population standard deviation:
Population standard deviation is calculated using the following formula,
$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}$$
where σ is the population standard deviation, N is the population size (the number of data points), x_i is the ith value, and μ is the population mean.
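Sample standard deviation:
Sample standard deviation is calculated using the following formula,
$$s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(x_i - \bar{x})^2}$$
where s is the sample standard deviation, N is the sample size, x_i is the ith value, and x̄ is the sample mean.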
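Although the formulas may look daunting, they are not that complicated to use in practice, at least for smaller data sets. Referencing the example from the previous section, and assuming that the 5 people represent the entire population, we can use the population formula to calculate the standard deviation as follows:
$$\sigma = \sqrt{\frac{(103-114.8)^2 + (110-114.8)^2 + (115-114.8)^2 + (120-114.8)^2 + (126-114.8)^2}{5}} = \sqrt{\frac{314.8}{5}} = \sqrt{62.96} \approx 7.935$$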
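As a rough illustration, the two formulas can also be written out directly in Python and applied to the five weights. The function names `population_std` and `sample_std` are assumptions for this sketch. The only difference between them is the divisor, N versus N - 1.

```python
import math

def population_std(values):
    """Population standard deviation: squared deviations divided by N."""
    n = len(values)
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)

def sample_std(values):
    """Sample standard deviation: squared deviations divided by N - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample standard deviation needs at least two values")
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))

weights = [103, 110, 115, 120, 126]
print(round(population_std(weights), 3))  # 7.935
print(round(sample_std(weights), 3))      # 8.871
```

If the same 5 people were treated as a sample rather than the whole population, the second function would apply, giving the slightly larger value.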
Population vs. sample standard deviation
In order to understand the difference between a population standard deviation and a sample standard deviation, we must understand what is meant by the terms population and sample in a statistical context. In statistics, a population refers to the entire group being studied. If we want to collect data about students in a given high school, the population of students would be the entire student body of the school. A population does not necessarily have to be a group of people; it can be a group of any objects. For example, the population of a person's pencil case is all the contents within the case.
In contrast, a statistical sample is a subset of the population. For example, rather than considering every single student within a given high school, we may instead study students from only 3 classrooms, students who participate in band, students who bring their own lunches to school, etc. Also, because samples of a population are subsets of that population, a sample will always have fewer data points than a population.
A population standard deviation is therefore the standard deviation given that the data is collected from an entire population. Collecting data from an entire population is often very time-consuming. It can also be prohibitively expensive, or sometimes may not even be possible. Thus, while researchers would ideally be able to collect data from entire populations to conduct their studies, they often have to collect data from samples of the population instead. These samples are meant to represent the population as a whole, and researchers use statistical methods to extrapolate characteristics about the population based on these samples.
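To make the relationship between a population and a sample concrete, here is a small Python sketch. The population is simulated and entirely hypothetical (1,000 made-up heights), and the sample size of 30 is an arbitrary choice. It is meant only to show a sample standard deviation being used as an estimate of a population standard deviation.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 1,000 simulated heights in centimetres.
population = [rng.gauss(170, 10) for _ in range(1000)]

# A sample is a subset of the population, here 30 values chosen at random.
sample = rng.sample(population, 30)

# The whole population is known, so divide by N.
population_sd = statistics.pstdev(population)
# Only a sample is known, so divide by N - 1 to estimate the population value.
estimated_sd = statistics.stdev(sample)

print(f"population SD:   {population_sd:.2f}")
print(f"sample estimate: {estimated_sd:.2f}")
```

The two printed values will generally be close but not identical, since the sample carries only part of the information in the population.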