A probability model is a mathematical representation of a random phenomenon. It consists of a sample space, the events within that sample space, and the probability associated with each event. A random variable is a numeric value determined by the outcome of a random event, and random variables are central to any probability model.
Example: A random variable assigns a number to each outcome of a random process. Tossing a coin, we can get heads (coded as 0) or tails (coded as 1). The outcome itself (heads or tails) is not numeric, but the random variable that codes it as 0 or 1 is.
The set of all possible values of a random variable is known as its sample space. For example, if we throw a die, the possible outcomes are 1, 2, 3, 4, 5 and 6. This set of values, which can occur as a result of the event (throwing a die), is the sample space.
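As a rough illustration, a probability model for a discrete random variable can be sketched as a mapping from each value in the sample space to its probability. The fair die and the helper name is_valid_model below are assumptions made for this example only; they do not come from the text.

```python
from fractions import Fraction

# Hypothetical fair six-sided die: each face is assumed equally likely.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

def is_valid_model(model):
    """Return True if every probability lies in [0, 1] and they sum to exactly 1."""
    probs = list(model.values())
    return all(0 <= p <= 1 for p in probs) and sum(probs) == 1

print(sorted(fair_die))          # the sample space: [1, 2, 3, 4, 5, 6]
print(is_valid_model(fair_die))  # True
```

Exact fractions are used instead of floats so that the check "probabilities sum to 1" is not affected by rounding.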
Types of Random Variables
There are two main types of random variables: discrete and continuous.
Continuous random variable
This type of random variable has an uncountable set of possible outcomes. Continuous random variables are associated with measurements, such as the height of the people in a group or the time taken for something to happen. Counts, such as the score of a soccer team or the number of correct answers on a test, are discrete rather than continuous. The possible values of a continuous random variable form a range of numbers rather than a list that can be counted.
Example: An advertisement airs 10 to 15 times a day on different channels. A viewer cannot list every possible waiting time until it next airs during a 24-hour period. It can be 10 minutes, 1 hour, 10 hours, or any value in between.
Another example of a continuous random variable is the time a passenger has to wait at a bus stop for the bus. It can be a few seconds, 2 minutes or 10 minutes. It is not possible to list all the possible waiting times until the bus arrives at the stop.
Discrete random variables
A discrete random variable is a variable with countable outcomes, that is, outcomes that can be listed. The probability associated with a discrete random variable is known as a discrete probability.
Example: A jar is filled with 10 red, 12 purple and 5 blue marbles. The number of red marbles obtained in a draw from the jar is countable, and the probability associated with each count can be calculated; hence, it is a discrete random variable.
Another example of a discrete random variable is the number of students present in a class. It is countable, and the probability associated with each possible number of students can be calculated using the relevant formulas.
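The marble jar can be turned into a small discrete probability model. The sketch below assumes a single marble is drawn at random and that every marble is equally likely to be picked; the variable names are illustrative only.

```python
from fractions import Fraction

# Jar contents taken from the example: 10 red, 12 purple, 5 blue.
jar = {"red": 10, "purple": 12, "blue": 5}
total = sum(jar.values())

# Probability of each colour on one random draw (assumes equally likely marbles).
p_colour = {colour: Fraction(count, total) for colour, count in jar.items()}

# X = number of red marbles in a single draw, so X is either 0 or 1.
x_dist = {1: p_colour["red"], 0: 1 - p_colour["red"]}

print(total)   # 27
print(x_dist)  # X = 1 with probability 10/27, X = 0 with probability 17/27
```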
The measure of the center of a random variable is known as the expected value. It is the long-run average of the values the random variable takes, with each value weighted by its probability.
Once a probability model is established, the related expected values can be calculated. Finding the expected value of a continuous random variable requires calculus, so the discussion here is restricted to discrete random variables. The random variable is denoted by X; hence, the expected value of the random variable is written as E[X].
The following formula can be used to calculate the expected value of a discrete random variable, where the sum runs over every value x in the sample space:
E[X] = Σ x P(X = x)
The roll of a die has a sample space of 1, 2, 3, 4, 5 and 6. Suppose the probability model for a die roll is:
P(X=1) = P(X=5) = 1/12, P(X=2) = P(X=4) = P(X=6) = 1/6, and P(X=3) = 1/3.
The expected value of this discrete random variable will be as follows:
E[X] = 1(1/12) + 2(1/6) + 3(1/3) + 4(1/6) + 5(1/12) + 6(1/6) = 42/12 = 3 1/2.
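The sum E[X] = Σ x P(X = x) can be checked with a few lines of code. This is a minimal sketch that uses the probabilities appearing in the sum above (1/12 for faces 1 and 5, 1/6 for faces 2, 4 and 6, and 1/3 for face 3); the names die_model and expected_value are illustrative.

```python
from fractions import Fraction as F

# Probability model for the die roll, using the weights from the sum above.
die_model = {1: F(1, 12), 2: F(1, 6), 3: F(1, 3), 4: F(1, 6), 5: F(1, 12), 6: F(1, 6)}

def expected_value(model):
    """E[X] = sum of x * P(X = x) over every value x in the sample space."""
    return sum(x * p for x, p in model.items())

# A valid probability model must have probabilities that sum to 1.
assert sum(die_model.values()) == 1

print(expected_value(die_model))  # 7/2
```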
Properties of Expected Value
The properties of the expected value of discrete random variables include the following.
Let c be a constant and
X be a random variable.
With a constant, the expected value of a random variable will be as follows:
E[X + c] = E[X] + c
E[cX] = cE[X]
Suppose X and Y are two random variables; then the expected value of their sum will be as follows:
E[X + Y] = E[X] + E[Y]
Now, with three constants (a, b, c) and two random variables (X, Y), the expected value can be calculated as follows:
E[aX + c] = aE[X] + c
E[aX + bY + c] = aE[X] + bE[Y] + c
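The last property can be verified numerically by enumerating every pair of outcomes. In the sketch below, X uses the die probabilities from the expected value example, Y is a hypothetical fair die, and the two are assumed independent only so that the joint probabilities are easy to write down; the property itself does not require independence. The constants a, b and c are arbitrary.

```python
from fractions import Fraction as F
from itertools import product

x_model = {1: F(1, 12), 2: F(1, 6), 3: F(1, 3), 4: F(1, 6), 5: F(1, 12), 6: F(1, 6)}
y_model = {face: F(1, 6) for face in range(1, 7)}  # hypothetical fair die

def expected_value(model):
    """E[X] = sum of x * P(X = x)."""
    return sum(v * p for v, p in model.items())

a, b, c = 3, 2, 1  # arbitrary constants chosen for illustration

# Left-hand side: average aX + bY + c over every (x, y) pair.
# Independence is assumed here, so P(X = x, Y = y) = P(X = x) * P(Y = y).
lhs = sum(
    (a * x + b * y + c) * px * py
    for (x, px), (y, py) in product(x_model.items(), y_model.items())
)

# Right-hand side: aE[X] + bE[Y] + c.
rhs = a * expected_value(x_model) + b * expected_value(y_model) + c

print(lhs, rhs)  # 37/2 37/2
```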
Spread measures how similar or how varied the observed values in a set of data are. The measures of spread include the range, quartiles and interquartile range. How spread out (varied) a distribution is can be measured by its standard deviation. To find the standard deviation, we first have to calculate the variance by using this formula:
Var(X) = Σ (x - E[X])² P(X = x)
In the above formula:
X = random variable
Var(X) = variance of X
Using the above formula with E[X] = 3 1/2 = 3.5, the variance of the die-roll random variable will be:
Var(X) = (1 - 3.5)²(1/12) + (2 - 3.5)²(1/6) + (3 - 3.5)²(1/3) + (4 - 3.5)²(1/6) + (5 - 3.5)²(1/12) + (6 - 3.5)²(1/6)
Var(X) = 2.25
Standard deviation is the square root of the variance. Putting the value of the variance into the formula for standard deviation, we get:
SD(X) = √Var(X) = √2.25 = 1.5
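The variance and standard deviation of the die-roll model can be computed directly from the definition. This is a minimal sketch that assumes the die probabilities from the expected value example (1/12 for faces 1 and 5, 1/6 for faces 2, 4 and 6, 1/3 for face 3); the function names are illustrative.

```python
import math
from fractions import Fraction as F

die_model = {1: F(1, 12), 2: F(1, 6), 3: F(1, 3), 4: F(1, 6), 5: F(1, 12), 6: F(1, 6)}

def expected_value(model):
    """E[X] = sum of x * P(X = x)."""
    return sum(x * p for x, p in model.items())

def variance(model):
    """Var(X) = sum of (x - E[X])^2 * P(X = x)."""
    mu = expected_value(model)
    return sum((x - mu) ** 2 * p for x, p in model.items())

var_x = variance(die_model)
sd_x = math.sqrt(var_x)  # standard deviation is the square root of the variance

print(var_x, sd_x)  # 9/4 1.5
```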
Properties of Variance
If there are three constants (a, b, c) and two independent random variables (X, Y), the properties of variance will be as follows:
Var(X + c) = Var(X)
Var(aX) = a² Var(X)
Var(X + Y) = Var(X - Y) = Var(X) + Var(Y)
Var(aX + bY + c) = a² Var(X) + b² Var(Y)
Example: Taking the previous example of the roll of the die from the expected value section, let X be that roll, with variance 2.25 (calculated from E[X] = 3.5), and let Y be a second, independent roll of a die whose variance has been calculated. The two variances are:
Var(Y) = 1.98
Var(X) = 2.25
Then the variance and standard deviation of a dice game with score (3X - 2Y + 1) will be as follows:
Var(3X - 2Y + 1) = 9 Var(X) + 4 Var(Y)
= 9(2.25) + 4(1.98) = 28.17
SD(3X - 2Y + 1) = √Var(3X - 2Y + 1) = √28.17 ≈ 5.31
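The same calculation can be sketched in code. The helper name var_linear_combo is hypothetical, X and Y are assumed to be independent, and Var(X) = 2.25 is the value that follows from the die model with E[X] = 3.5, while Var(Y) = 1.98 is taken from the example as given.

```python
import math

def var_linear_combo(a, var_x, b, var_y):
    """Var(aX + bY + c) for independent X and Y; the constant c has no effect."""
    return a ** 2 * var_x + b ** 2 * var_y

var_x = 2.25  # variance of the die model with E[X] = 3.5
var_y = 1.98  # variance of the second roll, as stated in the example

var_score = var_linear_combo(3, var_x, -2, var_y)  # score = 3X - 2Y + 1
sd_score = math.sqrt(var_score)

print(round(var_score, 2), round(sd_score, 2))  # 28.17 5.31
```

Because the coefficients are squared, the result is the same for a score of 3X + 2Y + 1.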
Note on Continuous Random Variables
Continuous random variables model random phenomena whose outcomes are measured rather than counted. Medical students will not be tested on, or required to carry out, the calculation of expected values and variances of continuous random variables. These calculations require calculus, which is not part of the course for medical students.
Issues in Probability Models and Random Variables
Some of the issues students may encounter when using probability models or dealing with random variables include:
- Probability models are not always correct. The probabilities assigned in a model, and the data they are based on, should be questioned in order to ensure accuracy.
- If a wrong or unsuitable probability model is chosen during the research process, it undermines all of the data collected. If a probability model is wrong, the results derived from it are wrong as well.
- Dependence between variables should always be considered carefully. Expected values of random variables can always be added. Variances, however, can be added only when the two random variables are independent; otherwise, the results will be wrong.
- We only add variances of independent variables. Standard deviations should not be added.
- Variances of independent variables are always added, even when you are finding the difference between the two variables.
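The last two points can be illustrated by exact enumeration. The sketch below assumes two independent rolls of a hypothetical fair die and builds the distributions of the sum and the difference; the helper names variance and combine are illustrative.

```python
import math
from fractions import Fraction as F
from itertools import product

die = {face: F(1, 6) for face in range(1, 7)}  # hypothetical fair die

def variance(model):
    """Var(X) = sum of (x - E[X])^2 * P(X = x)."""
    mu = sum(x * p for x, p in model.items())
    return sum((x - mu) ** 2 * p for x, p in model.items())

def combine(model_x, model_y, op):
    """Distribution of op(X, Y), assuming X and Y are independent."""
    out = {}
    for (x, px), (y, py) in product(model_x.items(), model_y.items()):
        value = op(x, y)
        out[value] = out.get(value, 0) + px * py
    return out

total = combine(die, die, lambda x, y: x + y)
diff = combine(die, die, lambda x, y: x - y)

# Variances add for both the sum and the difference.
print(variance(die) + variance(die))    # 35/6
print(variance(total), variance(diff))  # 35/6 35/6

# Standard deviations do not add: SD(X - Y) is smaller than SD(X) + SD(Y).
print(math.sqrt(variance(diff)), 2 * math.sqrt(variance(die)))
```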