00:01 All right. Excellent. 00:03 We've covered all univariate measures. 00:05 Now it is time to see measures that are used when we work with more than one variable. 00:10 In the next two lessons, we'll look at measures that can help us explore the relationship between variables. 00:16 Our focus will be on covariance and the linear correlation coefficient. 00:21 Let's zoom out a bit and think of an example that is very easy to understand and will help us grasp the nature of the relationship between two variables a bit better. 00:30 Think about real estate. What is one of the main factors that determine house prices? Their size, right? 00:37 Typically, larger houses are more expensive, as people like having extra space. 00:42 The table that you can see here shows us data about several houses. 00:46 On the left side, we can see the size of each house, and on the right we have the price at which it's been listed in the local newspaper. 00:54 We can present these data points in a scatterplot. 00:57 The X-axis will show a house's size, and the Y-axis will provide information about its price. We can certainly notice a pattern. 01:06 There is a clear relationship between these variables. 01:10 We say that the two variables are correlated, and the main statistic to measure this correlation is called covariance. 01:17 Unlike variance, covariance may be positive, equal to zero, or negative. 01:22 To understand the concept better, I would like to show you the formulas that allow us to calculate the covariance between two variables. 01:29 It is "formulas", with an S, because once again there is a sample formula and a population formula. Here they are. 01:37 Since this is obviously sample data, we should use the sample covariance formula. 01:43 Let's apply it in practice to the example that we saw earlier. 01:47 X will be house size and Y stands for house price. 01:51 First, we need to calculate the mean size and the mean price.
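The two formulas shown on screen at this point are not reproduced in the transcript. In standard textbook notation (the symbols below are an assumption, not copied from the lecture's slide), the sample and population covariance are usually written as:

```latex
% Sample covariance: n observations, sample means \bar{x} and \bar{y}
s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}

% Population covariance: N observations, population means \mu_x and \mu_y
\sigma_{xy} = \frac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}
```

The only differences are the means used (sample versus population) and the divisor: n - 1 for a sample, N for a population.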
01:55 I will also compute the sample standard deviations in case we need them later on. 02:01 OK, done. 02:03 Now let's calculate the numerator of the covariance formula. 02:08 Starting with the first house, I'll multiply the difference between its size and the average house size by the difference between the price of the same house and the average house price. Once we're ready, we have to perform this calculation for all houses that we have in the table and then sum the numbers we've obtained. 02:26 See? Great. 02:29 Our sample size is five. 02:31 Now we have to divide the sum above by the sample size minus one. 02:37 The result is the covariance. 02:39 It gives us a sense of the direction in which the two variables are moving. 02:43 If they go in the same direction, the covariance will have a positive sign, while if they move in opposite directions, the covariance will have a negative sign. 02:51 Finally, if their movements are independent, the covariance between the house size and its price will be equal to zero. 02:58 There is just one tiny problem with covariance, though. 03:02 It could be a number like five or 50, but it can also be something like 0.0023456, or even over 30 million, as in our example. These are values of completely different scales. 03:15 How could one interpret such numbers? Proceed to the next lecture to find out how the correlation coefficient can help us with this issue. Thanks for watching.
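The steps described in the lecture (means, products of deviations, their sum, division by the sample size minus one) can be sketched in a few lines of Python. The table from the video is not included in this transcript, so the five size and price pairs below are illustrative placeholder values, and the function name `covariance` and its `sample` flag are likewise assumptions made for this sketch.

```python
def covariance(xs, ys, sample=True):
    """Covariance of two equally long sequences.

    sample=True divides by n - 1 (sample formula);
    sample=False divides by n (population formula).
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")

    # Step 1: mean of each variable
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Step 2: numerator, the sum of products of paired deviations
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    # Step 3: divide by n - 1 for a sample, or by n for a population
    return numerator / (n - 1 if sample else n)


# Placeholder data (NOT the lecture's table): size and listed price
sizes = [650, 785, 1200, 720, 975]
prices = [772_000, 998_000, 1_200_000, 800_000, 895_000]

cov = covariance(sizes, prices)
print(f"sample covariance: {cov:,.0f}")

# Same houses, with price expressed in thousands instead of units
prices_k = [p / 1000 for p in prices]
cov_k = covariance(sizes, prices_k)
print(f"sample covariance (price in thousands): {cov_k:,.2f}")
```

With these placeholder values the sample covariance is 33,491,250, which is positive because size and price move in the same direction. Expressing price in thousands shrinks the result by a factor of 1,000 without changing the relationship at all, which illustrates why the raw magnitude of a covariance is hard to interpret.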
The lecture Bivariate Measures: Covariance by 365 Careers is from the course Statistics for Data Science and Business Analysis (EN).