Image: “Crowd.” by James Cridland. License: CC BY 2.0

Common Terminology in Population Genetics

  • A population can be defined as a group of interbreeding individuals that are present together in the same place at the same time.
  • Genetic variation is the degree of difference seen among individuals in traits such as height and color.
  • A genotype refers to the particular set of genes carried by an individual.
  • The gene pool is the collection of all genes, and all of their alleles, found in a population.

Genotype and Allelic Frequencies

Each cell of the normal human body contains 23 pairs of chromosomes, for a total of 46. Each chromosome carries hundreds to thousands of genes that contain the code for protein synthesis. Each gene occupies a fixed position on the chromosome called its locus; an individual carries two copies of the gene at each autosomal locus, one inherited from each parent. An allele is a variant or alternative form of a gene that arises as a result of mutation; it occupies the same locus and controls the same characteristic as the other forms of that gene.

A genotype is the particular combination of alleles carried by an individual; it determines which proteins can be synthesized and how. The phenotype, on the other hand, is the observable appearance of the individual. It differs from the genotype because not all of the instructions in the genes are expressed (or synthesized).

Note: The gene pool in a freely interbreeding population is formed by all the alleles of all the genes found in that population. Population genetics studies the allelic and genotypic variation in this gene pool and its influences on succeeding generations.

Measuring Genetic Variation

The mean number of alleles (MNA) is the average number of distinct alleles per locus in a population. Calculated over several loci, it is an indicator of genetic variation. A low MNA indicates low genetic variation, which is found in genetically isolated (inbred) populations or after population bottlenecks; it is also associated with founder effects. A high MNA, on the other hand, indicates greater allelic diversity, probably as a result of crossbreeding.
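As a rough illustration, the MNA can be computed by counting the distinct alleles seen at each locus and averaging those counts. The sketch below assumes a simple dictionary layout; the function name, the locus names, and the allele labels are all hypothetical and not taken from any real data set.

```python
def mean_number_of_alleles(genotypes_by_locus):
    """Average count of distinct alleles per locus.

    genotypes_by_locus maps a locus name to a list of (allele, allele)
    tuples, one tuple per diploid individual.
    """
    counts = []
    for genotypes in genotypes_by_locus.values():
        # Collect every allele observed at this locus, ignoring repeats.
        distinct = {allele for pair in genotypes for allele in pair}
        counts.append(len(distinct))
    return sum(counts) / len(counts)


# Hypothetical genotypes for three individuals at three loci.
sample = {
    "locus1": [("a1", "a2"), ("a2", "a3"), ("a1", "a1")],  # 3 alleles
    "locus2": [("b1", "b1"), ("b1", "b2"), ("b2", "b2")],  # 2 alleles
    "locus3": [("c1", "c1"), ("c1", "c1"), ("c1", "c1")],  # 1 allele
}
mna = mean_number_of_alleles(sample)
print(mna)  # 2.0
```

A locus such as the hypothetical `locus3`, where every individual carries the same allele, pulls the MNA down, which is the pattern expected after a bottleneck or in an inbred population.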

Variation in gene frequency amongst loci and different breeds is tested using chi-square analysis, while contingency chi-square analysis helps determine whether genotypes are independent across breeds.

The Hardy–Weinberg (HW) Principle

In a population at equilibrium, allelic and genotypic frequencies in that population will remain constant from generation to generation.

Godfrey Hardy & Wilhelm Weinberg, 1908

According to this principle, when the gene and genotype frequencies remain constant over generations, the population is said to be in Hardy–Weinberg equilibrium (HWE). This presumes the following conditions:

  • the population is large
  • there is no immigration or emigration of individuals in the population
  • there is no mutation
  • there is random mating
  • there is no natural selection

Mutation, migration, and selection with the non-random union of gametes can influence gene and genotype frequencies. Deviation from HWE can occur as a result of inbreeding, genotyping problems, or population stratification. Several statistical methods are helpful in calculating the deviation.

Note: The HW principle is illustrated here for a gene with two alleles, one of which is dominant while the other is recessive.

Genotype frequency calculation

Assuming that the total genotype frequency is 1.0, the hypothetical frequency of each genotype is calculated as the number of individuals in the population with that genotype divided by the total number of individuals in the population.

Genotype   Number of individuals   Genotype frequency
AA         647                     0.821
Aa         134                     0.170
aa         7                       0.009
Total      788                     1.0

Table 1: Calculation of genotype frequency
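The division described above can be written as a short sketch that recomputes the frequencies from the counts in Table 1. The function and variable names are illustrative choices, not part of the original article.

```python
# Genotype counts taken from Table 1.
genotype_counts = {"AA": 647, "Aa": 134, "aa": 7}


def genotype_frequencies(counts):
    """Divide each genotype count by the total number of individuals."""
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}


freqs = genotype_frequencies(genotype_counts)
for genotype, freq in freqs.items():
    print(f"{genotype}: {freq:.3f}")
# AA: 0.821
# Aa: 0.170
# aa: 0.009
```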

Allele frequency calculation from genotype frequencies

Using genotype numbers, the population allele frequencies can be calculated. The number of copies of a given allele in the population divided by the total number of copies of all alleles of that gene provides the allele frequency. Consider the following example:

The total number of dominant A alleles in the population above is the sum of the following:

Number of AA individuals x 2 (number of A alleles per person) = 647 x 2 = 1294

Number of Aa individuals x 1 (number of A alleles per person) = 134 x 1 = 134

Total: 1294 + 134 = 1428

Since there are 788 individuals in the population and since they are diploid, the total number of alleles will be 788 x 2 = 1576.

So, the frequency of the dominant allele A in the population will be 1428/1576 = 0.906.

Since the total allele frequency is 1.0, and since there are two alleles, A and a, the derived recessive allele a frequency = 1.0 - 0.906 = 0.094.
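The allele count above can be sketched as a small function. It assumes a diploid population and a gene with exactly two alleles; the function name is an illustrative choice.

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (p, q), the frequencies of alleles A and a."""
    # Every diploid individual contributes two allele copies.
    total_alleles = 2 * (n_AA + n_Aa + n_aa)
    p = (2 * n_AA + n_Aa) / total_alleles  # copies of A
    q = (2 * n_aa + n_Aa) / total_alleles  # copies of a
    return p, q


p, q = allele_frequencies(647, 134, 7)
print(f"p = {p:.3f}, q = {q:.3f}")  # p = 0.906, q = 0.094
```

Counting the a alleles directly, rather than taking 1 - p, gives a built-in check: the two frequencies should sum to 1.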

Genotypic frequency calculation from allele frequencies

If the population is in equilibrium, the HW principle postulates that it should be possible to calculate genotype frequency from the allele frequencies; if it is presumed that the frequency of dominant allele A = p and recessive allele a = q, then,

Frequency of the AA genotype is p² (homozygous)

Frequency of the Aa genotype is 2pq (heterozygous)

Frequency of the aa genotype is q² (homozygous)

Since there are only two alleles and the sum total of the three probable genotypes is 1.0,

p² + 2pq + q² = 1

Plugging in values from the example mentioned above in the formula, we arrive at the following:

(0.906)² + 2 (0.906) (0.094) + (0.094)² = 1

i.e. 0.821 + 0.170 + 0.009 = 1
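One way to check whether observed genotype counts agree with these expected proportions is a chi-square goodness-of-fit test. The sketch below is a minimal version under stated assumptions: it uses the genotype counts from Table 1, estimates p from those same counts, and compares the statistic with 3.84, the 5% critical value for one degree of freedom. The function name is hypothetical, and this is only one of several possible tests.

```python
def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square goodness-of-fit statistic against HW expectations."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele a
    expected = {
        "AA": p ** 2 * n,
        "Aa": 2 * p * q * n,
        "aa": q ** 2 * n,
    }
    observed = {"AA": n_AA, "Aa": n_Aa, "aa": n_aa}
    chi2 = sum(
        (observed[g] - expected[g]) ** 2 / expected[g] for g in observed
    )
    return expected, chi2


expected, chi2 = hwe_chi_square(647, 134, 7)
for genotype, count in expected.items():
    print(f"{genotype}: expected {count:.1f}")
print(f"chi-square = {chi2:.4f}")
# With one degree of freedom, a statistic below about 3.84 is
# consistent with HWE at the 5% significance level.
```

For the example population the statistic is far below 3.84, so these counts give no evidence of a departure from HWE.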

The Punnett Square

This is a diagram devised by Reginald Punnett to calculate the probability of a couple bearing an offspring with a specific genotype. For example, consider the trait (phenotype) eye color as follows:

A = black color, a = brown color

If both the mother and the father have the genotype Aa (black eyes), then the probability of their offspring having genotype AA is 25%, Aa is 50%, and aa is 25%. Each child therefore has a 75% chance of having black eyes and a 25% chance of having brown eyes; on average, three out of four children born to this couple will have black eyes, while one will have brown eyes.

             Paternal A   Paternal a
Maternal A   AA           Aa
Maternal a   Aa           aa
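The square can also be built programmatically by pairing every maternal allele with every paternal allele. This is a sketch for a single gene with two alleles, with each parent written as a two-letter string such as "Aa"; the function name is an illustrative choice.

```python
from collections import Counter
from itertools import product


def punnett_square(mother, father):
    """Offspring genotype probabilities for a one-gene cross."""
    offspring = Counter()
    for m_allele, f_allele in product(mother, father):
        # Sort so that "aA" and "Aa" count as the same genotype;
        # uppercase letters sort before lowercase ones.
        genotype = "".join(sorted(m_allele + f_allele))
        offspring[genotype] += 1
    total = sum(offspring.values())
    return {g: n / total for g, n in offspring.items()}


cross = punnett_square("Aa", "Aa")
print(cross)  # {'AA': 0.25, 'Aa': 0.5, 'aa': 0.25}
```

The same function handles other crosses: for example, `punnett_square("aa", "Aa")` gives equal probabilities for Aa and aa.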

HWE in Autosomal Recessive Disease

In certain genetic disorders, such as autosomal recessive conditions, the disease allele is rare and heterozygotes greatly outnumber affected homozygotes. This means that 2pq is much greater than q². For example, the incidence of phenylketonuria (PKU) is 1 in 10,000 live births. Heterozygous carriers are unaffected and so are difficult to count directly but, using the HW principle, it is possible to calculate the frequency of heterozygous carriers as follows:

q² = 1/10,000, therefore q = 1/100 = 0.01

Since p + q = 1, p = 1 - q = 0.99

Therefore, 2pq = 2 x 0.99 x 0.01 = 0.0198, or approximately 0.02 (about 1 in 50), is the frequency of heterozygous carriers.
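The same three steps can be sketched as a function that takes the disease incidence and returns the carrier frequency. It assumes the population is in HWE for this locus; the function name is hypothetical.

```python
import math


def carrier_frequency(disease_incidence):
    """Carrier frequency 2pq for an autosomal recessive disease.

    Assumes HWE, so the incidence of affected homozygotes equals q**2.
    """
    q = math.sqrt(disease_incidence)
    p = 1 - q
    return 2 * p * q


carriers = carrier_frequency(1 / 10_000)
print(f"{carriers:.4f}")  # 0.0198
```

A disease seen in only 1 in 10,000 births thus implies roughly 1 carrier in every 50 people, which is why rare recessive alleles persist largely in heterozygotes.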

Predicting the probability of the progeny having PKU

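For example, suppose the mother has PKU (homozygous) and the father is an unaffected man from the general population. He has a probability of about 0.02 (2pq) of being a carrier (heterozygous), and a carrier has a 50% chance of passing the recessive allele to his offspring. Since the mother always passes on a recessive allele, the probability of PKU in their offspring can be calculated as follows:

0.02 x 0.50 = 0.01 = 1% probability of the couple bearing a child with PKU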
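The product above multiplies the probability that the father is a carrier (0.02, the population carrier frequency) by the 50% chance that a carrier transmits the recessive allele; the affected mother always transmits one. A minimal sketch of that reasoning follows, with a hypothetical function name and parameters. It also shows how the risk changes when the father is already known to be a carrier.

```python
def risk_affected_child(p_father_carrier, p_mother_transmits=1.0):
    """Probability that a child inherits a recessive allele from each parent.

    p_father_carrier: probability that the father is heterozygous.
    p_mother_transmits: 1.0 for an affected (homozygous) mother.
    """
    # A heterozygous father passes on the recessive allele half the time.
    p_father_transmits = p_father_carrier * 0.5
    return p_mother_transmits * p_father_transmits


population_father = risk_affected_child(0.02)  # carrier status unknown
known_carrier = risk_affected_child(1.0)       # carrier status confirmed
print(f"{population_father:.2f}")  # 0.01
print(f"{known_carrier:.2f}")      # 0.50
```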

HWE in X-linked Disease

Conditions like hemophilia and color blindness are inherited as X-linked disorders. The allele frequency in these disorders can be calculated by observing the number of males with these conditions, as males have only one X-chromosome (hemizygous). For example, consider color blindness and the following table:

Sex      Genotype   Phenotype               Incidence (approx)
Male     X+         Normal                  p = 0.92
Male     Xcb        Color blind             q = 0.08
Female   X+/X+      Normal (homozygote)     p² = (0.92)² = 0.8464
Female   X+/Xcb     Normal (heterozygote)   2pq = 2 (0.92) (0.08) = 0.1472
Female              Normal (combined)       p² + 2pq = 0.9936
Female   Xcb/Xcb    Color blind             q² = (0.08)² = 0.0064

The incidence of female carriers (heterozygous i.e. 2pq) can be calculated as follows:

2pq = 2 (0.08) (0.92) = 0.1472 which means that approximately 15% of the women will be carriers of the allele.
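Because the frequency of affected males equals q directly, every other figure for an X-linked recessive trait follows from that single number. The sketch below assumes HWE; the function name and dictionary keys are illustrative.

```python
def x_linked_frequencies(q):
    """Expected frequencies for an X-linked recessive trait under HWE.

    q is the allele frequency, read off as the proportion of affected
    males (males are hemizygous for the X chromosome).
    """
    p = 1 - q
    return {
        "male_affected": q,
        "female_carrier": 2 * p * q,
        "female_affected": q ** 2,
        "female_unaffected": p ** 2 + 2 * p * q,
    }


color_blindness = x_linked_frequencies(0.08)
for group, freq in color_blindness.items():
    print(f"{group}: {freq:.4f}")
# male_affected: 0.0800
# female_carrier: 0.1472
# female_affected: 0.0064
# female_unaffected: 0.9936
```

The gap between 8% of males and under 1% of females being affected is the characteristic signature of X-linked recessive inheritance.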

Factors Responsible for Genetic Variation

The HW principle assumes an ideal equilibrium between the genotypic and allelic frequencies in a large, freely interbreeding ideal population without mutation, genetic drift, natural selection, or non-random mating. However, over time, the gene pool of a real population can change. This change depends upon several factors, which can be classified as follows:

  1. Genetic factors, e.g., non-random mating, mutation, genetic drift, natural selection
  2. Environmental factors, e.g., diversity in the environment
  3. Societal factors, e.g., population size, migration of populations

Non-random mating


Consanguineous mating refers to mating between relatives. It is also known as inbreeding and is the most common form of non-random mating. Consanguineous mating increases homozygosity and decreases heterozygosity, so genotype frequencies deviate from HW expectations. Inbreeding can be harmful because rare recessive alleles become homozygous and manifest phenotypically. Amongst humans, the most common type of inbreeding is that between first cousins.
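The shift towards homozygosity can be made concrete with the inbreeding coefficient F, which is not introduced in the text above and is brought in here only as an assumption for illustration. Under the standard model, genotype frequencies become p² + Fpq, 2pq(1 - F), and q² + Fpq, and the offspring of first cousins have F = 1/16. The sketch applies this to a rare recessive allele with q = 0.01; the function name is hypothetical.

```python
def genotype_frequencies_inbred(p, F):
    """Genotype frequencies for allele frequency p and inbreeding coefficient F.

    F = 0 reproduces the HW proportions; F = 1/16 corresponds to the
    offspring of first cousins.
    """
    q = 1 - p
    return {
        "AA": p ** 2 + F * p * q,
        "Aa": 2 * p * q * (1 - F),
        "aa": q ** 2 + F * p * q,
    }


random_mating = genotype_frequencies_inbred(0.99, 0.0)
first_cousins = genotype_frequencies_inbred(0.99, 1 / 16)
print(f"aa under random mating:     {random_mating['aa']:.6f}")
print(f"aa among first-cousin offspring: {first_cousins['aa']:.6f}")
```

Allele frequencies are unchanged in this model; only the way alleles are packaged into genotypes differs, with affected homozygotes becoming several times more frequent among the offspring of first cousins.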

Stratification occurs when mates are selected from a restricted sub-group.

In assortative mating, like attracts like: individuals preferentially choose mates who resemble themselves. This can actually be beneficial.

Mutation


Mutation is defined as a structural change in a gene that creates genetic variants, which are transmitted to subsequent generations. Alterations in the DNA, including deletion, insertion, or rearrangement of genes or chromosome segments, give rise to mutations, which are the main cause of genetic variation. They are usually deleterious, but can sometimes be beneficial too.

Illustrations of five types of chromosomal mutations.


Genetic drift

Genetic drift refers to random fluctuations in allele frequencies from one generation to the next, caused by the chance sampling of the alleles that are passed on. Its effect is strongest in small populations, such as those formed after an adverse environmental event (bottleneck effect) or after the geographical separation of a subset of the population (founder effect). Genetic drift leads to random changes in allele frequencies over time but, unlike inbreeding, does not produce a systematic excess of homozygotes relative to HW expectations.
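Drift can be illustrated with a simple simulation in which each generation draws its allele copies at random from the previous generation's frequencies. The sketch below uses a Wright-Fisher style sampling scheme, which is an assumed model and not something the text above specifies; the function name, the seed, and the population sizes are arbitrary.

```python
import random


def simulate_drift(p0, population_size, generations, seed=0):
    """Track the frequency of allele A under random sampling alone."""
    rng = random.Random(seed)          # seeded for repeatable runs
    n_copies = 2 * population_size     # diploid individuals
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Each allele copy in the next generation is A with probability p.
        count_A = sum(1 for _ in range(n_copies) if rng.random() < p)
        p = count_A / n_copies
        trajectory.append(p)
    return trajectory


small = simulate_drift(p0=0.5, population_size=10, generations=50, seed=1)
large = simulate_drift(p0=0.5, population_size=500, generations=50, seed=1)
print(f"N = 10:  final frequency of A = {small[-1]:.2f}")
print(f"N = 500: final frequency of A = {large[-1]:.2f}")
```

Running it with different seeds typically shows the small population wandering far from 0.5, and sometimes losing one allele entirely, while the large population stays much closer to its starting frequency.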

Bottleneck effect


Image: “Representation of a Population Bottleneck. Colored balls represent the alleles present in the population. The population size was initially 500; within five years, the size of the population has dwindled to 50, and within ten years, to just ten. As a consequence of the population bottleneck, there has been a random drift in the allele frequency distribution and a loss of two of the original five alleles.” by Professor Marginalia – Own work. License: CC BY-SA 3.0

Founder effect


Image: “Representation of the Founder Effect. The colored balls represent the two alleles for a specific locus, which are present in a hypothetical population; once a random sub-group of a population becomes separated from its ancestral population, the allele frequencies in the subsequent generations of the two groups can diverge widely within a relatively short period of time as a consequence of a purely random selection of alleles for reproduction.” by Professor Marginalia – Own work. License: CC BY-SA 3.0

Natural selection

Natural selection is the process whereby genotypes that promote survival and reproduction in the existing environment become more common among reproducing individuals from one generation to the next. This enables the population to adapt to environmental conditions, survive, and reproduce.

The effects of natural selection are directional. The allele can either be beneficial and increase within the population’s gene pool or can be deleterious and disappear. Different populations have different habitats; natural selection can create differences among populations through different alleles in different areas.

The best example of this phenomenon is sickle cell anemia. Homozygous individuals with two copies of the mutant gene for sickle hemoglobin (HbS/HbS) suffer from the disease, while heterozygous individuals (HbS/HbA) are carriers of this condition. These heterozygous individuals are known to be more resistant to malaria than homozygous individuals with the normal gene (HbA/HbA). This heterozygote advantage under natural selection is responsible for the maintenance of the HbS gene in the population.
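How a heterozygote advantage keeps a harmful allele in the population can be sketched with a one-locus selection model. The fitness values below are hypothetical and chosen only for illustration: HbA/HbA individuals have relative fitness 1 - s (malaria risk), carriers have fitness 1, and HbS/HbS individuals have fitness 1 - t (sickle cell disease). Under this assumed model the HbS frequency settles at s / (s + t).

```python
def next_generation_q(q, s, t):
    """HbS allele frequency after one generation of selection.

    Relative fitnesses (hypothetical): HbA/HbA = 1 - s,
    HbA/HbS = 1, HbS/HbS = 1 - t.
    """
    p = 1 - q
    mean_fitness = p ** 2 * (1 - s) + 2 * p * q + q ** 2 * (1 - t)
    # HbS copies come from half of the heterozygotes and all HbS/HbS.
    return (p * q + q ** 2 * (1 - t)) / mean_fitness


def equilibrium_q(q0, s, t, generations=500):
    """Iterate selection from a starting frequency q0."""
    q = q0
    for _ in range(generations):
        q = next_generation_q(q, s, t)
    return q


s, t = 0.1, 0.8  # hypothetical selection coefficients
q_final = equilibrium_q(0.01, s, t)
print(f"simulated HbS frequency: {q_final:.3f}")    # 0.111
print(f"predicted s / (s + t):   {s / (s + t):.3f}")  # 0.111
```

Even with a severe cost to HbS/HbS individuals, the allele does not disappear in this model; it stabilizes at an intermediate frequency because carriers out-reproduce both homozygotes.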


Image: “These charts depict different types of genetic selection. On each graph, the x-axis denotes the type of phenotypic trait, while the y-axis depicts the number of organisms. Group A is the original population and Group B is the population after selection. Graph 1 shows directional selection, in which a single extreme phenotype is favored. Graph 2 depicts stabilizing selection, where the intermediate phenotype is favored over the extreme traits. Graph 3 shows disruptive selection, in which extreme phenotypes are favored over the intermediate variant.” by Ealbert17 – Own work. License: CC BY-SA 4.0

Cancer Genetics

Cancer is a disease caused by changes in our genes. These changes, which result from mutations in the DNA, are of the following types:

Inherited changes, known as germline mutations, are present in the gametes (ova and sperm) of a parent and are therefore found in every cell of the progeny.

Somatic changes are acquired during the lifetime of the individual due to exposure to carcinogens such as tobacco and radiation.

Oncogenes and Tumor Suppressor Genes

A single mutation rarely causes cancer. Gene mutations build up over time, cause changes in cell proliferation and apoptosis, and eventually lead to the development of cancers. Therefore, malignancies are usually observed in the elderly. A majority of cancers are sporadic, while a few are inherited.

Tumor suppressor genes control the rate of cell growth, monitor cell division, and repair mismatched DNA. Mutation of a tumor suppressor gene leads to uncontrolled proliferation of cells and the formation of a tumor. Mutations in the p53 gene are found in a large proportion of cancers and are usually acquired. Germline mutations in p53 are rare and cause Li-Fraumeni syndrome. On the other hand, inherited mutations in BRCA1 and BRCA2 are associated with a high incidence of hereditary malignancies of the ovary and breast.

Oncogenes, such as ras and HER2, are genes whose mutations lead to the development of malignancies. The ras genes encode proteins that regulate cell communication, cell growth, and apoptosis pathways, while HER2 promotes the growth and spread of cancers, especially those of the breast and ovary.

DNA repair genes help repair errors made during DNA replication; mutations in these genes allow errors to accumulate and can lead to malignancies, especially if the unrepaired errors occur in oncogenes. Mutations in DNA repair genes can be either acquired or inherited. Lynch syndrome occurs due to inherited mutations.

Hereditary Cancers and Expression Profiling in Prognosis

Genetic microarray analysis is the study of variations in gene transcription between normal and malignant cells. It compares the expression of hundreds of genes at once and creates gene-expression profiles for various malignancies. It has revolutionized our understanding of the heterogeneity of malignancies, serves as a prognostic indicator, and helps develop targeted treatments. For example, expression profiling in breast cancer is useful to screen for estrogen/progesterone receptors and HER2 oncogene amplification, and to exclude metastatic lymph nodes.
