# 7 Statistical Distributions That Every Data Scientist Should Know, with Intuitive Explanations (2023)

## Intuitive explanations for the Normal, Bernoulli, Binomial, Poisson, Exponential, Gamma and Weibull distributions, with Python example code

Statistical distributions are an important tool in data science. A distribution helps us understand a variable by telling us which values the variable is most likely to take.

Moreover, once we know the distribution of a variable, we can compute the probability of specific situations occurring.

In this article, I share 7 statistical distributions that often occur in real-life data, each with an intuitive example.

The Normal or Gaussian distribution is arguably the most famous distribution, as it occurs in many natural situations.

A variable with a normal distribution has an average, which is also the most common value. Values closer to the average are more likely to occur, and the further a value is away from the average, the less likely it is to occur.


The normal distribution is also characterized by symmetric variation around the average, described by the standard deviation. This means that values a given distance above the average are as common as values the same distance below it.

Examples of the normal distribution can be found in many natural, continuous variables. For example, the weight or height of animals of a given species roughly follows a normal distribution: most animals are close to the average weight, some are a little over or underweight, and few are extremely skinny or extremely fat.

Human IQ is also a very famous example of the normal distribution, where the average is 100 and the standard deviation is 15. Most people are of average intelligence, some are a bit smarter or a bit less smart, and few are very intelligent or very unintelligent.
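The IQ example can be checked numerically. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library; the variable names and the chosen thresholds (85, 115, 130, 80, 120) are illustrative assumptions, not values from the article.

```python
from statistics import NormalDist

# IQ modelled as a normal distribution with mean 100 and standard deviation 15
iq = NormalDist(mu=100, sigma=15)

# Probability of an IQ within one standard deviation of the mean (85 to 115)
p_within_one_sd = iq.cdf(115) - iq.cdf(85)

# Probability of an IQ above 130 (more than two standard deviations above the mean)
p_above_130 = 1 - iq.cdf(130)

# Symmetry: 20 points below the mean is as likely as 20 points above it
p_below_80 = iq.cdf(80)
p_above_120 = 1 - iq.cdf(120)

print(f"P(85 < IQ < 115) = {p_within_one_sd:.3f}")
print(f"P(IQ > 130)      = {p_above_130:.3f}")
print(f"P(IQ < 80) = {p_below_80:.3f}, P(IQ > 120) = {p_above_120:.3f}")
```

About two thirds of the probability lies within one standard deviation of the average, and the two tail probabilities at equal distance from the mean match, which is the symmetry described above.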

The Bernoulli distribution describes a random experiment that is performed only once and that has only 2 possible outcomes. Those two outcomes are usually called Success, or 1, and Failure, or 0 (you can label either outcome as the success, depending on what you are looking at).

This distribution is therefore quite simple. It has only one parameter, which is the probability of success.

A famous example is the coin flip, in which we could call either side a success. The probability of success is 0.5. This would lead to the following graph:

But a 50/50 split is not a requirement of the Bernoulli distribution. Another example is throwing a dart at the bull’s eye. It’s either in there, or it isn’t, so this is a 2-outcome situation. For a bad darts player, the probability of success could be 0.1, giving the following distribution:
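Both cases can also be written out in a few lines of Python. This is a small sketch using only the standard library; the helper names `bernoulli_pmf` and `bernoulli_trial`, the seed, and the number of simulated throws are assumptions made for illustration.

```python
import random

def bernoulli_pmf(outcome: int, p_success: float) -> float:
    """Probability of outcome 1 (success) or 0 (failure)."""
    if outcome not in (0, 1):
        raise ValueError("outcome must be 0 or 1")
    return p_success if outcome == 1 else 1 - p_success

def bernoulli_trial(p_success: float, rng: random.Random) -> int:
    """Simulate one trial: 1 with probability p_success, otherwise 0."""
    return 1 if rng.random() < p_success else 0

# Probabilities of [failure, success] for the coin and for the bad darts player
coin = [bernoulli_pmf(k, 0.5) for k in (0, 1)]
dart = [bernoulli_pmf(k, 0.1) for k in (0, 1)]

# Simulate many independent dart throws; the hit rate should be close to 0.1
rng = random.Random(42)
throws = [bernoulli_trial(0.1, rng) for _ in range(10_000)]
hit_rate = sum(throws) / len(throws)

print("coin:", coin)
print("dart:", dart)
print(f"simulated hit rate: {hit_rate:.3f}")
```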


The Binomial distribution is like a bigger brother of the Bernoulli distribution. It models the number of successes in a situation of repeated Bernoulli experiments. So rather than focusing on the probability of success, we focus on a success count.

The two parameters for the Binomial distribution are the number of experiments and the probability of success. A basic example of flipping a coin ten times would have the number of experiments equal to 10 and the probability of success equal to 0.5. This gives the following probability for each number of successes out of 10:

Another example of the Binomial distribution would be the number of traffic jams you get into in a given week, knowing that the probability of getting into a traffic jam on any given day is 0.2. This is a repetition of one Bernoulli yes/no variable over 5 working days, so the parameters are: number of experiments is 5 and the probability of success is 0.2. The outcome graph below shows that 1 traffic jam is most likely, followed by 0, and then 2, 3, 4, and 5 in that order.
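The ordering of outcomes in the traffic jam example can be verified directly from the Binomial probability formula. The sketch below uses only `math.comb` from the standard library; the function name `binomial_pmf` and the variable names are illustrative assumptions.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent Bernoulli trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Traffic jam example: 5 working days, probability 0.2 of a jam on each day
n_days, p_jam = 5, 0.2
jam_probs = [binomial_pmf(k, n_days, p_jam) for k in range(n_days + 1)]

# Rank the possible weekly jam counts from most to least likely
ranking = sorted(range(n_days + 1), key=lambda k: jam_probs[k], reverse=True)

for k, prob in enumerate(jam_probs):
    print(f"P({k} jams) = {prob:.4f}")
print("most to least likely:", ranking)

# Coin example: probability of exactly 5 heads in 10 flips
p_five_heads = binomial_pmf(5, 10, 0.5)
print(f"P(5 heads in 10 flips) = {p_five_heads:.4f}")
```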

The Poisson distribution describes the number of events in a fixed time frame. A typical example is the number of customers entering a store every 15 minutes. In this case, we keep the 15 minutes as a fixed value (unit time) so that we can ignore it in the rest of the calculations.

In this scenario, there is an average number of customers entering per unit time, which is called the rate. This rate is denoted lambda, and it is the only parameter needed for the Poisson distribution.

In the following example, the rate lambda is 4, so on average 4 events happen every unit time (15 minutes in this example). In the graph we can see that 3 or 4 events are most likely, and the probabilities diminish gradually on both sides. Anything over 12 events per unit time is so improbable that its bar is not visible on the graph.
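These statements can be reproduced from the Poisson probability formula, P(k) = lambda^k * e^(-lambda) / k!. The following standard-library sketch is illustrative; the function name `poisson_pmf` and the variable names are assumptions.

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events in one unit of time at average rate lam."""
    return lam**k * exp(-lam) / factorial(k)

lam = 4  # average of 4 customers per 15-minute unit time

# Probabilities for 0 to 12 events per unit time
probs = [poisson_pmf(k, lam) for k in range(13)]

# Everything not covered by 0..12 events is the probability of more than 12
p_more_than_12 = 1 - sum(probs)

for k, prob in enumerate(probs):
    print(f"P({k:2d} events) = {prob:.4f}")
print(f"P(more than 12 events) = {p_more_than_12:.6f}")
```

With a whole-number rate such as 4, the counts 3 and 4 are exactly tied for most likely, which is why both stand out in the graph.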


Other examples of Poisson events could be the number of cars passing at a certain location. Also, almost anything that has a count per unit time could be considered for a Poisson distribution.

The Exponential distribution is related to the Poisson distribution. Where the Poisson distribution describes the number of events per unit time, the exponential distribution describes the waiting time between events.

It takes the same parameter as the Poisson distribution: the event rate. Some implementations, however (Python’s SciPy among them), are parameterized by 1 / event rate instead, usually called the scale.

Read the x-axis as a fraction of unit time. In the Poisson example, we said that the unit time is 15 minutes. If 4 people arrive every 15 minutes on average, the average wait for each new person is 1 / 4 = 0.25, or 25% of this unit time. 25% of 15 minutes is 3.75 minutes.
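The conversion between rate, scale and minutes can be sketched as follows. This uses only the standard library and writes the density and cumulative probability by hand; the function and variable names are assumptions. In SciPy, the equivalent would be `scipy.stats.expon` with `scale` set to 1 / rate.

```python
from math import exp

def exponential_pdf(x: float, rate: float) -> float:
    """Density of the waiting time x (in unit times) between two events."""
    return rate * exp(-rate * x) if x >= 0 else 0.0

def exponential_cdf(x: float, rate: float) -> float:
    """Probability that the next event arrives within x unit times."""
    return 1 - exp(-rate * x) if x >= 0 else 0.0

rate = 4            # 4 customers per unit time on average
unit_minutes = 15   # one unit time is 15 minutes

# The average wait is 1 / rate; this is the "scale" parameter
mean_wait_units = 1 / rate
mean_wait_minutes = mean_wait_units * unit_minutes

# Probability that the next customer arrives within the average wait
p_within_mean = exponential_cdf(mean_wait_units, rate)

print(f"average wait: {mean_wait_units} unit times = {mean_wait_minutes} minutes")
print(f"P(wait <= average wait) = {p_within_mean:.3f}")
print(f"density at x = 0: {exponential_pdf(0, rate)}")
```

Note that the density is highest at zero and only decreases from there, so 0.25 is the average wait rather than the single most likely one.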

The Gamma distribution is a variation on the exponential distribution. Rather than describing the time between events, it describes the time to wait for a fixed number of events.

It takes two parameters: the lambda parameter of the exponential distribution, plus a k parameter for the number of events to wait for.

As an example, think of an amusement park ride that can only start when it is full, say with 10 people. If customers arrive at a rate of 4 every 2 minutes on average, the waiting time until the ride can start could be described with a Gamma distribution.


In the graph below we can see that the peak lies close to 2.5. The average waiting time for 10 people is 10 / 4 = 2.5 times the unit time of 2 minutes (that is, 5 minutes). This makes sense since we have 4 people arriving every unit time and need 10 in total.

Something interesting from the graph is that we could be waiting up to 6 times the unit time, so (6 times 2 minutes =) 12 minutes. That is way more than the average of 5 minutes and it might be too long to wait for the customers that arrived first.
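The numbers in this example can be checked with a short standard-library sketch. It assumes a whole number of events k, which allows the closed-form cumulative probability used below; the function and variable names are illustrative assumptions.

```python
from math import exp, factorial

def gamma_pdf(x: float, k: int, rate: float) -> float:
    """Density of the waiting time x (in unit times) until the k-th event."""
    if x < 0:
        return 0.0
    return rate**k * x ** (k - 1) * exp(-rate * x) / factorial(k - 1)

def gamma_cdf(x: float, k: int, rate: float) -> float:
    """Probability that the k-th event has arrived by time x (integer k only)."""
    if x <= 0:
        return 0.0
    # Complement of "fewer than k events so far", a sum of Poisson probabilities
    return 1 - sum(exp(-rate * x) * (rate * x) ** n / factorial(n) for n in range(k))

k, rate, unit_minutes = 10, 4, 2  # wait for 10 people, 4 arrivals per 2 minutes

mean_wait_units = k / rate                        # average waiting time
mean_wait_minutes = mean_wait_units * unit_minutes
mode_units = (k - 1) / rate                       # location of the density's peak

# Probability of waiting longer than 6 unit times (12 minutes)
p_longer_than_6_units = 1 - gamma_cdf(6, k, rate)

print(f"average wait: {mean_wait_units} unit times = {mean_wait_minutes} minutes")
print(f"peak of the density at {mode_units} unit times")
print(f"P(wait > 6 unit times) = {p_longer_than_6_units:.5f}")
```

The peak of the density sits at (k - 1) / rate = 2.25, slightly below the average of 2.5, and waits beyond 6 unit times are possible but rare.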

Other examples of the Gamma distribution are situations where you have to wait for a given number of events that arrive one after another rather than all at once.

When the number of events to wait for is a whole number, as in this example, the Gamma distribution is also called the Erlang distribution, but I won’t go into more detail here.

The Weibull distribution is another distribution that is a variation of the waiting time problem. It describes a waiting time for one event, if that event becomes more or less likely with time.

A clear example would be the lifetime of a computer. You can use it for a certain time until it becomes too old and breaks. The longer you have your computer, the more likely it is to break. So the probability of failure, or the rate, is not constant.

The Weibull distribution takes two parameters. The first is the rate parameter, as in the Poisson and exponential distributions. The second is a shape parameter c. A c of 1 means a constant event rate (which is exactly the exponential distribution). A c higher than 1 means that the event rate increases with time, and a c below 1 means that it decreases with time.

If we take the same value of lambda as in the Poisson example (lambda = 4), and we add a value for c of 1.1 (so an increasing rate with time), we get the following result:
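As a rough numerical companion to this example, the sketch below evaluates the event rate over time (the hazard) for a few values of c. It assumes one common parameterization, in which the density is c * lambda * (lambda * x)^(c - 1) * exp(-(lambda * x)^c); other libraries use a scale parameter equal to 1 / lambda instead. The function names, the time points, and the extra c value of 0.5 are illustrative assumptions.

```python
from math import exp

def weibull_pdf(x: float, rate: float, c: float) -> float:
    """Density of the waiting time x under the assumed rate/shape form."""
    if x <= 0:
        return 0.0  # simplification: the boundary x = 0 is treated as density 0
    return c * rate * (rate * x) ** (c - 1) * exp(-((rate * x) ** c))

def weibull_hazard(x: float, rate: float, c: float) -> float:
    """Instantaneous event rate at time x, given that no event has happened yet."""
    if x <= 0:
        raise ValueError("x must be positive")
    return c * rate * (rate * x) ** (c - 1)

rate = 4
times = [0.1, 0.25, 0.5, 1.0]  # in unit times

# Hazard over time for a decreasing (0.5), constant (1.0) and increasing (1.1) rate
hazards_by_c = {
    c: [weibull_hazard(t, rate, c) for t in times] for c in (0.5, 1.0, 1.1)
}

for c, hazards in hazards_by_c.items():
    formatted = ", ".join(f"{h:.2f}" for h in hazards)
    print(f"c = {c}: hazard at {times} -> {formatted}")

print(f"density at x = 0.25 with c = 1.1: {weibull_pdf(0.25, rate, 1.1):.3f}")
```

With c = 1 the hazard stays at lambda for every time point, which is the exponential case; with c = 1.1 it rises slowly, matching the idea of a computer that becomes more likely to fail as it ages.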

Article information

Author: Jerrold Considine

Last Updated: 13/09/2023
