Probability Concepts
In this lesson, we review the main probability concepts to prepare for interviews. For each concept, you should understand the math, the underlying assumptions, its applications, and its constraints.
These concepts include:
- Laws of probability
- Random variable and probability functions
- Expected value
- Probability distributions
- Central limit theorem
- Law of large numbers
Laws of probability
The laws of probability are fundamental principles that govern the behavior and relationships between events in probability theory.
Numerical questions are common for this topic. Understand the principles of conditional probability and joint probability that underlie Bayes' theorem, and work through the derivations of common formulas so you can fall back on them if you forget the final formula.
If a question asks you to calculate the probability of an event given another event, it's highly likely that you should solve it using Bayes' theorem.
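For instance, here is a minimal Python sketch of the classic disease-testing application of Bayes' theorem. The prevalence and test accuracy figures below are made up for illustration:

```python
# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
# Classic example: probability of having a disease given a positive test.
# The prevalence and accuracy numbers are illustrative, not real data.

p_disease = 0.01            # P(D): prevalence of the disease
p_pos_given_disease = 0.95  # P(+|D): test sensitivity
p_pos_given_healthy = 0.05  # P(+|not D): false positive rate

# Law of total probability: P(+) = P(+|D)P(D) + P(+|not D)P(not D)
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)

# Bayes' theorem: P(D|+) = P(+|D)P(D) / P(+)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(f"P(disease | positive test) = {p_disease_given_pos:.3f}")  # ~0.161
```

Note how the low prevalence drags the posterior probability down to about 16% despite the test's high sensitivity, which is the counterintuitive result these questions usually probe.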

Random variable and probability functions
A random variable is a numerical quantity that varies at random, taking on different values determined by chance. It is a foundational concept of probability theory.
Random variables are often characterized by probability functions, such as probability mass functions (PMFs) for discrete random variables and probability density functions (PDFs) for continuous random variables. These functions describe the probabilities of different outcomes or values.
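As a quick illustration, the sketch below (assuming SciPy is installed) evaluates a PMF for a discrete random variable and a PDF for a continuous one; the parameter values are arbitrary:

```python
from scipy import stats

# PMF of a discrete random variable: P(X = k) for X ~ Binomial(n=10, p=0.5)
print(stats.binom.pmf(k=4, n=10, p=0.5))  # probability of exactly 4 successes, ~0.205

# PDF of a continuous random variable: density (not a probability) at x for X ~ N(0, 1)
print(stats.norm.pdf(x=0.0, loc=0.0, scale=1.0))  # ~0.3989

# For continuous variables, probabilities come from integrating the PDF,
# which the CDF provides: P(X <= 1) for the standard normal
print(stats.norm.cdf(1.0))  # ~0.8413
```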
Interview questions for this topic are usually related to the expected value and variance of random variables.
Expected value
The expected value, or mean, provides insight into the "average" outcome or performance of a random variable over many trials or observations. It can be interpreted as the value one would expect to obtain on average if the random experiment were repeated many times. It also represents the average payoff or utility of different possible actions, helping to identify the most favorable decision in probabilistic scenarios.
For common probability distributions (see the table below), we recommend memorizing the formulas for the mean and using them to solve expected-value questions.
For any random variable X, the variance of X is the expected value of the squared difference between X and its expected value:
Var[X] = E[(X - E[X])²] = E[X²] - (E[X])²
Variance of the sum of two random variables:
Var[X+Y] = Var[X] + Var[Y] + 2⋅(E[XY] - E[X]⋅E[Y]), or equivalently
Var[X+Y] = Var[X] + Var[Y] + 2⋅Cov[X,Y].
If X and Y are independent, Cov[X,Y] = 0, so the formula simplifies to Var[X+Y] = Var[X] + Var[Y].
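These identities are easy to verify numerically. Below is a minimal NumPy sketch using two simulated, deliberately correlated variables; the distributions and sample size are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1_000_000

# Two deliberately correlated variables: Y depends partly on X.
x = rng.normal(0.0, 1.0, n)
y = 0.5 * x + rng.normal(0.0, 1.0, n)

# Population-style covariance (bias=True matches np.var's default ddof=0).
cov_xy = np.cov(x, y, bias=True)[0, 1]

# Var[X+Y] = Var[X] + Var[Y] + 2*Cov[X,Y]
print(np.var(x + y), np.var(x) + np.var(y) + 2 * cov_xy)

# Cov[X,Y] = E[XY] - E[X]*E[Y]
print(cov_xy, np.mean(x * y) - np.mean(x) * np.mean(y))
```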
Probability distributions
A probability distribution describes the possible values of a random variable and their associated probabilities. Probability distributions are mathematically represented by the probability functions mentioned above.
Common distributions you should know are:
- Normal (Gaussian): symmetric bell-shaped continuous probability distribution
- Bernoulli: discrete distribution with two possible outcomes (0 or 1)
- Binomial: discrete distribution representing the number of successes in n independent trials
- Poisson: discrete distribution representing the number of events occurring in a fixed interval of time or space
- Exponential: continuous distribution representing the time between events in a Poisson process
- Uniform: continuous distribution where all outcomes are equally likely within a specified range
You should understand the characteristics, applications, and properties (probability function, mean, and variance) of these distributions, which are described in the table below.
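As a sanity check on those properties, the sketch below (assuming NumPy) draws samples from several of these distributions with arbitrary parameter values and compares sample means and variances to the textbook formulas:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

lam, p, n_trials = 3.0, 0.4, 10

# (name, samples, theoretical mean, theoretical variance)
checks = [
    ("Bernoulli",   rng.binomial(1, p, n),        p,            p * (1 - p)),
    ("Binomial",    rng.binomial(n_trials, p, n), n_trials * p, n_trials * p * (1 - p)),
    ("Poisson",     rng.poisson(lam, n),          lam,          lam),
    ("Exponential", rng.exponential(1 / lam, n),  1 / lam,      1 / lam**2),
    ("Uniform",     rng.uniform(0, 1, n),         0.5,          1 / 12),
]

for name, samples, mean, var in checks:
    print(f"{name:12s} mean {samples.mean():.4f} (theory {mean:.4f})  "
          f"var {samples.var():.4f} (theory {var:.4f})")
```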

Central limit theorem
The central limit theorem (CLT) states that the distribution of the sample mean of a sufficiently large number of independent, identically distributed samples is approximately normal, regardless of the shape of the population distribution. This empowers data scientists to make reliable statistical inferences and analyze data effectively, even when the underlying population distribution is unknown or non-normal.
CLT requires the following assumptions and conditions:
- The samples should be selected randomly from the population.
- The samples should be independent of each other.
- The sample size should be sufficiently large. While there is no strict rule, a sample size of at least 30 is often considered sufficient for the CLT to hold, although it can vary depending on the population distribution.
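The following simulation sketch (assuming NumPy and SciPy) illustrates the CLT with a deliberately skewed exponential population; the seed and sample counts are arbitrary:

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(7)

# Population: exponential with scale 1, which is heavily right-skewed.
# Draw many samples of size 30 and inspect the distribution of sample means.
sample_size, num_samples = 30, 100_000
samples = rng.exponential(scale=1.0, size=(num_samples, sample_size))
sample_means = samples.mean(axis=1)

# CLT: sample means are approximately normal with mean equal to the
# population mean (1.0) and std equal to population std / sqrt(n).
print(f"mean of sample means: {sample_means.mean():.4f} (theory 1.0000)")
print(f"std of sample means:  {sample_means.std():.4f} "
      f"(theory {1 / np.sqrt(sample_size):.4f})")

# Skewness of the sample means is far closer to 0 (normal) than the
# exponential population's skewness of 2.
print(f"skewness of sample means: {skew(sample_means):.3f} (population: 2)")
```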
Law of large numbers
The law of large numbers (LLN) states that as the size of a sample increases, the sample mean approaches the population mean. In other words, the average of a large number of independent and identically distributed (i.i.d.) random variables tends to converge to the expected value of the underlying distribution.
Understanding LLN enables data scientists to interpret sample statistics accurately, make reliable inferences about population parameters, and effectively analyze data to extract meaningful insights.
LLN makes a key assumption that the random variables in the sample are i.i.d. A collection of random variables is i.i.d. if each random variable has the same probability distribution as the others, and all are mutually independent.
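A minimal simulation sketch (assuming NumPy) makes this concrete with fair die rolls, whose expected value is 3.5:

```python
import numpy as np

rng = np.random.default_rng(1)

# X ~ fair six-sided die, so E[X] = 3.5. Track the running sample mean.
rolls = rng.integers(1, 7, size=100_000)  # upper bound is exclusive
running_mean = np.cumsum(rolls) / np.arange(1, rolls.size + 1)

for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>6d}: sample mean = {running_mean[n - 1]:.4f} (E[X] = 3.5)")
```

As n grows, the running mean settles ever closer to 3.5, which is exactly the convergence LLN describes.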
Practice statistics questions asked by top companies and receive peer feedback on our interview question database.