Probabilities in statistics.
Probability theory is a branch of mathematics concerned with the analysis of random phenomena.
Probability is the measure of the likelihood that an event will occur in a random experiment. A random experiment is a situation whose outcome cannot be predicted with certainty until it is observed.
Probability is quantified as a number between 0 and 1, where 0 indicates impossibility and 1 indicates certainty. The higher the probability of an event, the more likely it is that the event will occur.
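To make the 0-to-1 scale concrete, here is a minimal sketch that assumes a fair six-sided die, where every outcome is equally likely. The die and the helper name event_probability are illustrative assumptions, not part of the examples that follow. Under that assumption, the probability of an event is the number of favorable outcomes divided by the total number of outcomes.

```python
from fractions import Fraction

# Sample space of a fair six-sided die (an assumed example for illustration)
outcomes = [1, 2, 3, 4, 5, 6]

def event_probability(event, sample_space):
    """Probability of an event when all outcomes are equally likely.

    `event` is a function that returns True for favorable outcomes.
    """
    favorable = [o for o in sample_space if event(o)]
    return Fraction(len(favorable), len(sample_space))

p_even = event_probability(lambda o: o % 2 == 0, outcomes)    # rolling an even number
p_seven = event_probability(lambda o: o == 7, outcomes)       # impossible event
p_any = event_probability(lambda o: 1 <= o <= 6, outcomes)    # certain event

print(p_even, p_seven, p_any)  # 1/2 0 1
```

The impossible event gets probability 0, the certain event gets probability 1, and everything else falls in between.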
Calculating probabilities for a number of events:
import matplotlib.pyplot as plt
import numpy as np

## the basic formula

# counts of the different events
c = np.array([ 1, 2, 4, 3 ])

# convert to probability (%)
prob = 100*c / np.sum(c)
print(prob)
OUT: [10. 20. 40. 30.]
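The same counts can also be expressed on the 0-to-1 scale rather than as percentages. The sketch below wraps the formula in a small helper. The function name counts_to_probabilities and its input checks are assumptions added for illustration.

```python
import numpy as np

def counts_to_probabilities(counts):
    """Convert event counts to probabilities on the 0-to-1 scale."""
    counts = np.asarray(counts, dtype=float)
    if counts.ndim != 1 or counts.size == 0:
        raise ValueError("counts must be a non-empty 1-D sequence")
    if np.any(counts < 0) or counts.sum() == 0:
        raise ValueError("counts must be non-negative and not all zero")
    # each probability is that event's share of the total count
    return counts / counts.sum()

p = counts_to_probabilities([1, 2, 4, 3])
print(p)  # [0.1 0.2 0.4 0.3]

# the probabilities of all possible events add up to 1 (within rounding)
print(np.isclose(p.sum(), 1.0))  # True
```

Multiplying the result by 100 gives the percentages printed above.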
Calculating probabilities for drawing marbles from a jar:
The following Python code calculates probabilities and proportions for drawing marbles of three colors (blue, yellow and orange) out of a jar. It shows the difference between the theoretical probability of each color and the proportion actually observed in a set of draws, and how far apart the two can be depending on the number of draws.
# colored marble counts
blue   = 40
yellow = 30
orange = 20
totalMarbs = blue + yellow + orange

# put them all in a jar
jar = np.hstack((1*np.ones(blue),2*np.ones(yellow),3*np.ones(orange)))

# now we draw 500 marbles (with replacement)
numDraws = 500
drawColors = np.zeros(numDraws)

for drawi in range(numDraws):

    # generate a random integer to draw
    randmarble = int(np.random.rand()*len(jar))

    # store the color of that marble
    drawColors[drawi] = jar[randmarble]

# now we need to know the proportion of colors drawn
propBlue = sum(drawColors==1) / numDraws
propYell = sum(drawColors==2) / numDraws
propOran = sum(drawColors==3) / numDraws

# plot those against the theoretical probability
plt.bar([1,2,3],[ propBlue, propYell, propOran ],label='Proportion')
plt.plot([0.5, 1.5],[blue/totalMarbs, blue/totalMarbs],'b',linewidth=3,label='Probability')
plt.plot([1.5, 2.5],[yellow/totalMarbs,yellow/totalMarbs],'b',linewidth=3)
plt.plot([2.5, 3.5],[orange/totalMarbs,orange/totalMarbs],'b',linewidth=3)

plt.xticks([1,2,3],labels=('Blue','Yellow','Orange'))
plt.xlabel('Marble color')
plt.ylabel('Proportion/probability')
plt.legend()
plt.show()
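To see how the gap between proportion and probability depends on the number of draws, the experiment can be repeated with different sample sizes. This is a rough sketch, not the code above. It samples color indices directly with NumPy's Generator.choice instead of indexing into the jar array. The seed, the list of sample sizes and the helper name max_abs_error are all arbitrary assumptions.

```python
import numpy as np

# Theoretical probabilities for the jar above (40 blue, 30 yellow, 20 orange)
counts = np.array([40, 30, 20])
probs = counts / counts.sum()

# Fixed seed so the run is repeatable (the seed value is an arbitrary choice)
rng = np.random.default_rng(0)

def max_abs_error(num_draws):
    """Largest gap between drawn proportions and theoretical probabilities."""
    # draw color indices 0, 1, 2 with replacement, weighted by probs
    draws = rng.choice(len(counts), size=num_draws, p=probs)
    # count how often each color was drawn and convert to proportions
    proportions = np.bincount(draws, minlength=len(counts)) / num_draws
    return np.abs(proportions - probs).max()

# compare the gap for increasingly large numbers of draws
for n in (10, 100, 1000, 10000, 100000):
    print(n, round(max_abs_error(n), 4))
```

Individual runs fluctuate, but the gap tends to shrink as the number of draws grows. That is why the bars in the plot sit close to the probability lines, though usually not exactly on them, at 500 draws.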