Solving Common Probability Problems with Python Pt.2 — Continuous Data

norm_dist.cdf(210) - norm_dist.cdf(190)

# 0.71463211317374
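The subtraction above relies on the identity P(a < X ≤ b) = F(b) − F(a), where F is the cumulative distribution function. As a minimal sketch of the same pattern, the snippet below uses a standard normal distribution (mean 0, standard deviation 1). That distribution is chosen here purely for illustration, and the name `std_normal` is an assumption, not something from the original example.

```python
from scipy.stats import norm

# Standard normal (mean 0, std 1), used purely for illustration.
std_normal = norm(0, 1)

# P(-1 < X <= 1) = F(1) - F(-1)
p_within_one_sd = std_normal.cdf(1) - std_normal.cdf(-1)
print(round(p_within_one_sd, 4))  # 0.6827
```

This reproduces the familiar rule of thumb that roughly 68% of a normal distribution lies within one standard deviation of the mean.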

Example

Question to answer:

What is the probability that an NBA player’s height falls between 2.10 m and 2.20 m (210–220 cm, the unit used in the dataset)?

import numpy as np
import pandas as pd
from scipy.stats import norm
import matplotlib.pyplot as plt

df = pd.read_csv('files/Players.csv')


df['height'].describe()


"""
count    3921.000000
mean      198.704922
std         9.269761
min       160.000000
25%       190.000000
50%       198.000000
75%       206.000000
max       231.000000
Name: height, dtype: float64
"""


# Get a Normal Distribution instance for our height data.
dist = norm(df['height'].mean(), df['height'].std())

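# Probability that a height falls between 210 cm and 220 cm (2.10–2.20 m),
# computed as the difference of the CDF at the two endpoints.
p = dist.cdf(220) - dist.cdf(210)
print(p)

# 0.10071770329577223

# Plot the fitted density at every observed height.
# x-axis: all height data points
# y-axis: probability density at each data point (calculated by pdf());
#         a density value is not itself a probability.
plt.bar(df['height'], dist.pdf(df['height']))
plt.show()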
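As a quick cross-check that does not need the CSV file or SciPy, the same interval probability can be sketched with the standard library's `statistics.NormalDist`. The mean and standard deviation are copied from the `describe()` output above, so the result is only as precise as those rounded figures. The names `height_model` and `p_check` are illustrative.

```python
from statistics import NormalDist

# Mean and standard deviation copied from the describe() output above.
height_model = NormalDist(mu=198.704922, sigma=9.269761)

# P(210 < height <= 220) under the fitted normal model
p_check = height_model.cdf(220) - height_model.cdf(210)
print(round(p_check, 4))  # 0.1007
```

Under this model, roughly 10% of players would be expected to stand between 2.10 m and 2.20 m. This is a statement about the fitted normal curve, not a direct count from the data.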

Conclusion

I occasionally write about software, web, blockchain, machine learning, random thoughts.

Weiming Chen
