00:00 While variance is a common measure of data dispersion, in most cases the figure you obtain is large and hard to compare, because the unit of measurement is squared. The easy fix is to calculate its square root and obtain a statistic known as standard deviation. 00:15 In most analyses you perform, standard deviation will be much more meaningful than variance. As we saw in the previous lecture, there are different measures for the population and sample variance. 00:28 Consequently, there are also population and sample standard deviations. 00:33 The formulas are the square root of the population variance and the square root of the sample variance, respectively. 00:40 I believe there is no need for an example of the calculation, right? 00:43 If you have a calculator in your hands, you'll be able to do the job.
00:50 All right, the other measure we still have to introduce is the coefficient of variation. 00:54 It is equal to the standard deviation divided by the mean. 00:59 Another name for it is relative standard deviation. 01:02 This is an easy way to remember its formula: 01:04 it is simply the standard deviation relative to the mean. 01:09 As you probably guessed, there is a population and a sample formula once again.
01:15 So standard deviation is the most common measure of variability for a single data set. But why do we need yet another measure such as the coefficient of variation? Well, comparing the standard deviations of two different data sets, for instance ones measured in different units, is meaningless, but comparing coefficients of variation is not.
01:34 Aristotle once said: "Tell me, I'll forget. Show me, I'll remember. Involve me, I'll understand." 01:44 To make sure you remember, here's an example of a comparison between standard deviations. 01:49 Let's take the prices of pizza at ten different places in New York. 01:54 They range from $1 to $11.
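Written out, the verbal definitions given earlier in the lecture correspond to the formulas below. The symbols are a conventional choice and are not taken from the lecture: N and \mu stand for the population size and mean, n and \bar{x} for the sample size and mean.

```latex
% Population and sample standard deviation: square roots of the variances
\sigma = \sqrt{\sigma^2} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2},
\qquad
s = \sqrt{s^2} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2}

% Population and sample coefficient of variation: standard deviation over mean
c_v = \frac{\sigma}{\mu},
\qquad
\hat{c}_v = \frac{s}{\bar{x}}
```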
01:57 Now imagine that you only have Mexican pesos, so to you the prices look more like 18.81 to 206.91 pesos, given an exchange rate of 18.81 pesos per dollar. 02:11 Let's combine our knowledge so far and find the standard deviations and coefficients of variation of these two data sets.
02:19 First, we have to see if this is a sample or a population. 02:22 Are there only ten restaurants in New York? Of course not. This is obviously a sample drawn from all the restaurants in the city. 02:29 So we have to use the formulas for sample measures of variability. 02:34 Second, we have to find the mean. 02:36 The mean in dollars is equal to 5.5 and the mean in pesos to 103.46. The third step of the process is finding the sample variance. 02:47 Following the formula that we showed earlier, we obtain 10.72 dollars squared and 3,793.69 pesos squared. 02:58 The respective sample standard deviations are 3.27 dollars and 61.59 pesos.
Let's make a couple of observations. First, variance gives results in squared units, while standard deviation gives them in the original units. 03:13 This is the main reason why professionals prefer to use standard deviation as the main measure of variability: 03:19 it is directly interpretable. 03:21 Squared dollars mean nothing, even in the field of statistics. 03:26 Second, we got standard deviations of 3.27 and 61.59 for the same pizza at the same ten restaurants in New York City. 03:35 Seems wrong, right? Don't worry. It is time to use our last tool, the coefficient of variation. Dividing the standard deviations by the respective means, 03:46 we get the two coefficients of variation. 03:48 The result is the same: 0.60. 03:53 Notice that it is not dollars, pesos, dollars squared, or pesos squared. 03:57 It is just 0.60. 04:00 This shows us the great advantage that the coefficient of variation gives us. 04:05 Now we can confidently say that the two data sets have the same variability, which is what we expected beforehand.
04:14 Let's recap what we have learned so far.
04:16 There are three main measures of variability: variance, standard deviation, and coefficient of variation. 04:23 Each of them has different strengths and applications. 04:26 You should feel confident using all of them, as we are getting closer to more complex statistical topics. 04:31 And remember Aristotle's advice: 04:34 "Involve me, I'll understand." 04:37 So please don't forget to get involved with the exercises.
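The pizza comparison can be reproduced with a short Python sketch. The transcript does not list the ten individual prices, so the list below is a hypothetical data set, chosen only because it spans $1 to $11 and matches the stated mean of 5.5 and sample variance of 10.72. The function name `summarize` is likewise illustrative. The exchange rate of 18.81 pesos per dollar is the one quoted in the lecture.

```python
import statistics

# Hypothetical prices in dollars (NOT given in the transcript); picked to match
# the lecture's stated range ($1 to $11), mean (5.5) and sample variance (10.72).
prices_usd = [1, 2, 3, 3, 5, 6, 7, 8, 9, 11]

PESOS_PER_DOLLAR = 18.81  # exchange rate quoted in the lecture
prices_mxn = [p * PESOS_PER_DOLLAR for p in prices_usd]


def summarize(data):
    """Return (mean, sample variance, sample std dev, coefficient of variation)."""
    mean = statistics.mean(data)
    variance = statistics.variance(data)  # sample variance: divides by n - 1
    std_dev = statistics.stdev(data)      # square root of the sample variance
    cv = std_dev / mean                   # unitless: the units cancel out
    return mean, variance, std_dev, cv


for label, data in [("dollars", prices_usd), ("pesos", prices_mxn)]:
    mean, variance, std_dev, cv = summarize(data)
    print(f"{label}: mean={mean:.2f} variance={variance:.2f} "
          f"std={std_dev:.2f} cv={cv:.2f}")
```

Under these assumptions, the variance and standard deviation change with the currency, while the coefficient of variation comes out at about 0.60 for both lists, because multiplying every price by the same rate scales the standard deviation and the mean by the same factor.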
The lecture Univariate Measures: Data Scatter (Standard Deviation and Coefficient of Variation) by 365 Careers is from the course Statistics for Data Science and Business Analysis (EN).