Draw samples from a Hypergeometric distribution.
Samples are drawn from a Hypergeometric distribution with specified
parameters, ngood (ways to make a good selection), nbad (ways to make
a bad selection), and nsample = number of items sampled, which is less
than or equal to the sum ngood + nbad.
Parameters : | ngood : float (but truncated to an integer)
nbad : float
nsample : float
parameter, > 0 and <= ngood+nbad
size : {tuple, int}
Output shape. If the given shape is, e.g., (m, n, k), then
m * n * k samples are drawn.
|
Returns : | samples : {ndarray, scalar}
where the values are all integers in [0, n].
|
See also
- scipy.stats.distributions.hypergeom
- probability density function, distribution or cumulative density function, etc.
Notes
The probability density for the Hypergeometric distribution is
System Message: WARNING/2 (P(x) = \frac{\binom{m}{n}\binom{N-m}{n-x}}{\binom{N}{n}},
)
latex exited with error:
[stderr]
[stdout]
This is pdfTeX, Version 3.1415926-2.5-1.40.14 (TeX Live 2013/TeX Live for SUSE Linux)
restricted \write18 enabled.
entering extended mode
(./math.tex
LaTeX2e <2011/06/27>
Babel <3.9f> and hyphenation patterns for 78 languages loaded.
(/usr/share/texmf/tex/latex/base/article.cls
Document Class: article 2007/10/19 v1.4h Standard LaTeX document class
(/usr/share/texmf/tex/latex/base/size12.clo))
(/usr/share/texmf/tex/latex/base/inputenc.sty
! LaTeX Error: File `utf8x.def’ not found.
Type X to quit or <RETURN> to proceed,
or enter new name. (Default extension: def)
Enter file name:
! Emergency stop.
<read *>
l.131 \endinput
^^M
No pages of output.
Transcript written on math.log.
where
System Message: WARNING/2 (0 \le x \le m)
latex exited with error:
[stderr]
[stdout]
This is pdfTeX, Version 3.1415926-2.5-1.40.14 (TeX Live 2013/TeX Live for SUSE Linux)
restricted \write18 enabled.
entering extended mode
(./math.tex
LaTeX2e <2011/06/27>
Babel <3.9f> and hyphenation patterns for 78 languages loaded.
(/usr/share/texmf/tex/latex/base/article.cls
Document Class: article 2007/10/19 v1.4h Standard LaTeX document class
(/usr/share/texmf/tex/latex/base/size12.clo))
(/usr/share/texmf/tex/latex/base/inputenc.sty
! LaTeX Error: File `utf8x.def’ not found.
Type X to quit or <RETURN> to proceed,
or enter new name. (Default extension: def)
Enter file name:
! Emergency stop.
<read *>
l.131 \endinput
^^M
No pages of output.
Transcript written on math.log.
and
System Message: WARNING/2 (n+m-N \le x \le n)
latex exited with error:
[stderr]
[stdout]
This is pdfTeX, Version 3.1415926-2.5-1.40.14 (TeX Live 2013/TeX Live for SUSE Linux)
restricted \write18 enabled.
entering extended mode
(./math.tex
LaTeX2e <2011/06/27>
Babel <3.9f> and hyphenation patterns for 78 languages loaded.
(/usr/share/texmf/tex/latex/base/article.cls
Document Class: article 2007/10/19 v1.4h Standard LaTeX document class
(/usr/share/texmf/tex/latex/base/size12.clo))
(/usr/share/texmf/tex/latex/base/inputenc.sty
! LaTeX Error: File `utf8x.def’ not found.
Type X to quit or <RETURN> to proceed,
or enter new name. (Default extension: def)
Enter file name:
! Emergency stop.
<read *>
l.131 \endinput
^^M
No pages of output.
Transcript written on math.log.
for P(x) the probability of x successes, n = ngood, m = nbad, and
N = number of samples.
Consider an urn with black and white marbles in it, ngood of them
black and nbad are white. If you draw nsample balls without
replacement, then the Hypergeometric distribution describes the
distribution of black balls in the drawn sample.
Note that this distribution is very similar to the Binomial
distribution, except that in this case, samples are drawn without
replacement, whereas in the Binomial case samples are drawn with
replacement (or the sample space is infinite). As the sample space
becomes large, this distribution approaches the Binomial.
References
[R196] | Lentner, Marvin, “Elementary Applied Statistics”, Bogden
and Quigley, 1972. |
Examples
Draw samples from the distribution:
>>> ngood, nbad, nsamp = 100, 2, 10
# number of good, number of bad, and number of samples
>>> s = np.random.hypergeometric(ngood, nbad, nsamp, 1000)
>>> hist(s)
# note that it is very unlikely to grab both bad items
Suppose you have an urn with 15 white and 15 black marbles.
If you pull 15 marbles at random, how likely is it that
12 or more of them are one color?
>>> s = np.random.hypergeometric(15, 15, 15, 100000)
>>> sum(s>=12)/100000. + sum(s<=3)/100000.
# answer = 0.003 ... pretty unlikely!