Draw samples from a Zipf distribution.
Samples are drawn from a Zipf distribution with specified parameter
a > 1.
The Zipf distribution (also known as the zeta distribution) is a
continuous probability distribution that satisfies Zipf’s law: the
frequency of an item is inversely proportional to its rank in a
frequency table.
Parameters : | a : float > 1
size : int or tuple of int, optional
Output shape. If the given shape is, e.g., (m, n, k), then
m * n * k samples are drawn; a single integer is equivalent in
its result to providing a mono-tuple, i.e., a 1-D array of length
size is returned. The default is None, in which case a single
scalar is returned.
|
Returns : | samples : scalar or ndarray
The returned samples are greater than or equal to one.
|
See also
- scipy.stats.distributions.zipf
- probability density function, distribution, or cumulative density function, etc.
Notes
The probability density for the Zipf distribution is
System Message: WARNING/2 (p(x) = \frac{x^{-a}}{\zeta(a)},
)
latex exited with error:
[stderr]
[stdout]
This is pdfTeX, Version 3.1415926-2.5-1.40.14 (TeX Live 2013/TeX Live for SUSE Linux)
restricted \write18 enabled.
entering extended mode
(./math.tex
LaTeX2e <2011/06/27>
Babel <3.9f> and hyphenation patterns for 78 languages loaded.
(/usr/share/texmf/tex/latex/base/article.cls
Document Class: article 2007/10/19 v1.4h Standard LaTeX document class
(/usr/share/texmf/tex/latex/base/size12.clo))
(/usr/share/texmf/tex/latex/base/inputenc.sty
! LaTeX Error: File `utf8x.def’ not found.
Type X to quit or <RETURN> to proceed,
or enter new name. (Default extension: def)
Enter file name:
! Emergency stop.
<read *>
l.131 \endinput
^^M
No pages of output.
Transcript written on math.log.
where
System Message: WARNING/2 (\zeta)
latex exited with error:
[stderr]
[stdout]
This is pdfTeX, Version 3.1415926-2.5-1.40.14 (TeX Live 2013/TeX Live for SUSE Linux)
restricted \write18 enabled.
entering extended mode
(./math.tex
LaTeX2e <2011/06/27>
Babel <3.9f> and hyphenation patterns for 78 languages loaded.
(/usr/share/texmf/tex/latex/base/article.cls
Document Class: article 2007/10/19 v1.4h Standard LaTeX document class
(/usr/share/texmf/tex/latex/base/size12.clo))
(/usr/share/texmf/tex/latex/base/inputenc.sty
! LaTeX Error: File `utf8x.def’ not found.
Type X to quit or <RETURN> to proceed,
or enter new name. (Default extension: def)
Enter file name:
! Emergency stop.
<read *>
l.131 \endinput
^^M
No pages of output.
Transcript written on math.log.
is the Riemann Zeta function.
It is named for the American linguist George Kingsley Zipf, who noted
that the frequency of any word in a sample of a language is inversely
proportional to its rank in the frequency table.
References
Zipf, G. K., Selected Studies of the Principle of Relative Frequency
in Language, Cambridge, MA: Harvard Univ. Press, 1932.
Examples
Draw samples from the distribution:
>>> a = 2. # parameter
>>> s = np.random.zipf(a, 1000)
Display the histogram of the samples, along with
the probability density function:
>>> import matplotlib.pyplot as plt
>>> import scipy.special as sps
Truncate s values at 50 so plot is interesting
>>> count, bins, ignored = plt.hist(s[s<50], 50, normed=True)
>>> x = np.arange(1., 50.)
>>> y = x**(-a)/sps.zetac(a)
>>> plt.plot(x, y/max(y), linewidth=2, color='r')
>>> plt.show()
(Source code)