First, we need to know what a frequency table is. Suppose you have a set of members, each with a value. We can build a table whose first column lists buckets (bins), each covering a range of values. For each bucket, we count the members whose values fall into it and record that count next to the bucket. A bucket that no member falls into simply gets a count of zero.
If bins is an integer, it defines the number of equal-width bins in the range.
If bins is a sequence, it defines the bin edges, including the left edge of the first bin and the right edge of the last bin; in this case, bins may be unequally spaced.
If bins is not given, 10 equal-width bins are used by default.
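To make the two forms of bins concrete, here is a minimal sketch that builds the frequency table directly with np.histogram, the NumPy function that plt.hist relies on for binning. The scores are the ones used in the cells below; the variable names and the edge values [1, 3, 5] are chosen only for illustration.

```python
import numpy as np

scores = [1, 3, 5, 1, 2, 4, 4, 2, 5, 4, 3, 1, 2]

# Integer bins: 5 equal-width bins spanning min(scores)..max(scores).
counts, edges = np.histogram(scores, bins=5)
print(edges)   # 6 edges delimit 5 bins
print(counts)  # number of members per bin

# Sequence bins: explicit, possibly unequal, edges.
# Each bin is half-open [left, right) except the last,
# which also includes its right edge.
uneven_counts, uneven_edges = np.histogram(scores, bins=[1, 3, 5])
print(uneven_counts)
```

With bins=[1, 3, 5], the scores 1 and 2 land in the first bin and the scores 3, 4, and 5 land in the second, because only the last bin includes its right edge.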
In [1]:
import numpy as np
import matplotlib.pyplot as plt
In [2]:
scores = [1, 3, 5, 1, 2, 4, 4, 2, 5, 4, 3, 1, 2]
In [3]:
plt.hist(scores)
plt.show()
In [4]:
plt.hist(scores, bins = 5)
plt.show()
A histogram of one-dimensional data is a 2D figure: the x-axis shows the bin ranges and the y-axis shows the count in each bin.
In [5]:
plt.hist(scores, bins = 5, edgecolor="red")
plt.show()
In [6]:
plt.hist(scores, bins = 15, edgecolor="red")
plt.show()
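With 15 bins over only five distinct scores, the plot above has gaps between its bars. A small sketch with np.histogram (variable names are illustrative) shows why: most of the bins contain no values at all.

```python
import numpy as np

scores = [1, 3, 5, 1, 2, 4, 4, 2, 5, 4, 3, 1, 2]

# 15 equal-width bins between the smallest and largest score.
counts, edges = np.histogram(scores, bins=15)
print(counts)

# Bins that no score falls into have a count of zero.
empty_bins = int(np.sum(counts == 0))
print("empty bins:", empty_bins)
```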
In [7]:
x = np.random.randn(1000)
x[:20]
In [8]:
plt.hist(x)
plt.show()
In [9]:
plt.hist(x, bins=30)
plt.show()
In [10]:
plt.hist(x, bins=30, density=True, histtype='step')
plt.show()
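The density=True argument rescales the bar heights so that the total area under the histogram is 1, which makes the plot comparable to a probability density. The following sketch checks this numerically with np.histogram; the seed and the name rng are assumptions added here so the result is reproducible.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
x = rng.standard_normal(1000)

# density=True returns bar heights instead of raw counts.
density, edges = np.histogram(x, bins=30, density=True)
widths = np.diff(edges)

# Area under the histogram = sum of height * width over all bins.
area = float(np.sum(density * widths))
print(area)
```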
In [11]:
x1 = np.random.normal(0, 0.8, 1000)
x2 = np.random.normal(-2, 1, 1000)
x3 = np.random.normal(3, 2, 1000)
In [12]:
print(x1[:10])
print(x2[:10])
print(x3[:10])
In [13]:
plt.hist(x1)
plt.hist(x2)
plt.hist(x3)
plt.show()
In [14]:
kwargs = dict(alpha=0.3, histtype='stepfilled', bins=40)
plt.hist(x1, **kwargs)
plt.hist(x2, **kwargs)
plt.hist(x3, **kwargs)
plt.show()
In [15]:
kwargs = dict(alpha=0.3, histtype='stepfilled', bins=40, facecolor = 'g')
plt.hist(x1, **kwargs)
plt.hist(x2, **kwargs)
plt.hist(x3, **kwargs)
plt.show()
In [16]:
kwargs = dict(alpha=0.3, histtype='stepfilled', bins=40)
plt.hist(x1, **kwargs, label = 'x1')
plt.hist(x2, **kwargs, label = 'x2')
plt.hist(x3, **kwargs, label = 'x3')
plt.title('Histogram')
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.legend()
plt.show()
In [17]:
n_points = 10000
bins = 50
In [18]:
x = np.random.randn(n_points)
y = np.random.randn(n_points)
In [19]:
print(x[:10])
print(y[:10])
In [20]:
plt.hist2d(x, y, bins=bins)
plt.show()
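plt.hist2d divides the plane into a grid of rectangular cells and colors each cell by the number of points that fall into it. As a rough sketch of what is being counted, np.histogram2d returns that grid as an array; the seed and the name rng are illustrative assumptions, not part of the cells above.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n_points = 10000
bins = 50
x = rng.standard_normal(n_points)
y = rng.standard_normal(n_points)

# counts[i, j] is the number of points in x-bin i and y-bin j.
counts, x_edges, y_edges = np.histogram2d(x, y, bins=bins)
print(counts.shape)       # one cell per (x-bin, y-bin) pair
print(int(counts.sum()))  # every point lands in exactly one cell
```

When plotting, calling plt.colorbar() after plt.hist2d(...) adds a scale that shows which color corresponds to which count.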