Python modules

Access functions in the Python module with commands like the following:

import selectiontest
selectiontest.selectiontest.test_neutrality(sfs, variates0=None, variates1=None, reps=10000)

Calculate \(\rho\), the log odds ratio of the data for the distribution given by variates0 over the distribution given by variates1.

Parameters:
  • sfs (list) – Site frequency spectrum, e.g. [1, 3, 0, 2, 1]
  • variates0 (numpy array) – Array of variates from null hypothesis distribution. Default uses Wright-Fisher model.
  • variates1 (numpy array) – Array of variates from alternative distribution. Default uses the uniform model.
  • reps (int) – Number of variates to generate if default is used.
Returns:

\(\rho\) (value of log odds ratio). Values may be inf, -inf or nan if one or both probabilities underflow to zero.

Return type:

numpy.float64
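
As an illustration of what the log odds ratio measures (a minimal sketch, not the package's implementation; `log_odds_sketch` is a hypothetical name), each variate can be viewed as a probability vector over SFS classes, and the likelihood of the data under a distribution is the average multinomial likelihood over its variates. The multinomial coefficient is identical in numerator and denominator, so it cancels in the ratio:

```python
import math

def log_odds_sketch(sfs, variates0, variates1):
    # Average the (unnormalized) multinomial likelihood of the SFS over
    # each set of probability-vector variates, then take log10 of the ratio.
    def mean_likelihood(variates):
        total = 0.0
        for q in variates:
            lik = 1.0
            for count, p in zip(sfs, q):
                lik *= p ** count
            total += lik
        return total / len(variates)
    return math.log10(mean_likelihood(variates0) / mean_likelihood(variates1))

# A spectrum rich in low-frequency variants scores higher under the
# distribution that puts more weight on the first class.
rho = log_odds_sketch([2, 1], [[0.7, 0.3], [0.8, 0.2]], [[0.5, 0.5]])
```

A positive \(\rho\) favours the distribution given by variates0; a negative value favours variates1.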

selectiontest.selectiontest.calculate_D(sfs)

Calculate Tajima’s D from a site frequency spectrum.

Parameters:
  • sfs (list) – Site frequency spectrum, e.g. [1, 3, 0, 2, 1]
Returns:

Value of Tajima’s D.

Return type:

numpy.float64

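
For reference, Tajima's D can be computed from the SFS alone with the standard formula (an illustrative re-derivation, not the package source; `tajimas_d_sketch` is a hypothetical name):

```python
import math

def tajimas_d_sketch(sfs):
    # Standard Tajima's D: difference between mean pairwise diversity (pi)
    # and Watterson's estimator (S / a1), scaled by its sampling variance.
    n = len(sfs) + 1                     # sample size
    S = sum(sfs)                         # number of segregating sites
    if S == 0:
        return float('nan')
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n ** 2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # mean pairwise difference from the SFS
    pi = sum(i * (n - i) * x for i, x in enumerate(sfs, start=1)) / (n * (n - 1) / 2.0)
    return (pi - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))
```

An excess of low-frequency variants drives D negative; an excess of intermediate-frequency variants drives it positive.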
selectiontest.selectiontest.sample_wf_distribution(n, reps)

Calculate variates for the probability distribution Q under the Wright-Fisher model.

Parameters:
  • n (int) – Sample size
  • reps (int) – Number of variates to generate.
Yields:

numpy.ndarray – Array of variates (length n - 1)
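
To illustrate the idea (a pure-Python sketch under the standard Kingman coalescent; the package's actual sampling scheme may differ, and `wf_branch_variates_sketch` is a hypothetical name), one can simulate a coalescent tree and record the relative total branch length subtending i sample members, for i = 1 to n - 1:

```python
import random

def wf_branch_variates_sketch(n, reps, rng=random):
    # While k lineages remain, draw an exponential epoch length with rate
    # k(k-1)/2, credit that length to the SFS class of every live lineage
    # (the number of sample members it subtends), then merge a random pair.
    for _ in range(reps):
        lineages = [1] * n               # leaves subtended by each lineage
        lengths = [0.0] * (n - 1)        # class i is stored at index i - 1
        while len(lineages) > 1:
            k = len(lineages)
            t = rng.expovariate(k * (k - 1) / 2.0)
            for leaves in lineages:
                lengths[leaves - 1] += t
            a, b = rng.sample(range(k), 2)
            merged = lineages[a] + lineages[b]
            lineages = [x for i, x in enumerate(lineages) if i not in (a, b)]
            lineages.append(merged)
        total = sum(lengths)
        yield [x / total for x in lengths]
```

Each yielded vector is a point on the probability simplex, so it can play the role of a variates0 row in test_neutrality.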

selectiontest.selectiontest.sample_uniform_distribution(n, reps)

Calculate variates for the uniform probability distribution Q.

Parameters:
  • n (int) – Sample size
  • reps (int) – Number of variates to generate.
Returns:

Array of variates, shape (reps, n - 1)

Return type:

numpy.ndarray
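
Sampling uniformly from the probability simplex can be sketched with normalized standard exponentials, which is equivalent to Dirichlet(1, ..., 1) (an illustrative sketch, not the package source; `uniform_simplex_variates_sketch` is a hypothetical name):

```python
import random

def uniform_simplex_variates_sketch(n, reps, rng=random):
    # Normalized i.i.d. Exp(1) draws are Dirichlet(1, ..., 1), i.e.
    # uniform on the (n-2)-dimensional probability simplex.
    out = []
    for _ in range(reps):
        e = [rng.expovariate(1.0) for _ in range(n - 1)]
        s = sum(e)
        out.append([x / s for x in e])
    return out
```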

selectiontest.selectiontest.compute_threshold(n, seg_sites, sreps=10000, wreps=10000, fpr=0.02)

Calculate threshold value of \(\rho\) corresponding to a given false positive rate (FPR). For values of \(\rho\) above the threshold we reject the null (by default neutral) hypothesis.

Parameters:
  • n (int) – Sample size
  • seg_sites (int) – Number of segregating sites in sample.
  • sreps (int) – Number of SFS configurations and of uniform variates to generate.
  • wreps (int) – Number of Wright-Fisher variates to generate.
  • fpr (float) – Selected FPR tolerance.
Returns:

Threshold value for log odds ratio

Return type:

numpy.float64
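
The core idea is an empirical quantile: simulate \(\rho\) under the null hypothesis many times and return the value exceeded by a fraction fpr of the simulated statistics (a minimal sketch of the principle, not the package's implementation; `empirical_threshold_sketch` is a hypothetical name):

```python
import math

def empirical_threshold_sketch(null_rhos, fpr=0.02):
    # Sort the simulated null statistics and return the (1 - fpr) empirical
    # quantile; rejecting the null when rho exceeds this threshold gives a
    # false positive rate of approximately fpr.
    xs = sorted(null_rhos)
    k = min(len(xs) - 1, math.ceil((1.0 - fpr) * len(xs)) - 1)
    return xs[k]
```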

selectiontest.selectiontest.piecewise_constant_variates(n, timepoints, pop_sizes, reps=10000)

Generate variates corresponding to a piecewise constant demographic history.

Parameters:
  • n (int) – Sample size
  • timepoints (array-like) – Times at which population changes (in generations, backward from the present).
  • pop_sizes (array-like) – Population sizes between timepoints (only relative sizes matter).
  • reps (int) – Number of variates to generate.
Yields:

numpy.ndarray – Variates
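
One building block of such a simulation is mapping coalescent time, scaled to the initial population size, to real time under a piecewise-constant history (an illustrative helper only, assuming pop_sizes[0] applies from time 0 to timepoints[0], pop_sizes[1] to the next interval, and so on; `rescale_time_sketch` is a hypothetical name):

```python
def rescale_time_sketch(u, timepoints, pop_sizes):
    # Scaled time accrues at rate 1/size, so a segment of real length L
    # holds L/size scaled units; walk the segments until u is used up.
    start, remaining = 0.0, u
    for size, end in zip(pop_sizes, list(timepoints) + [float('inf')]):
        capacity = (end - start) / size
        if remaining <= capacity:
            return start + remaining * size
        remaining -= capacity
        start = end
    return start
```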

selectiontest.selectiontest.vcf2sfs(vcf_file, panel, coord, start, end, select_chr=True)

Get the site frequency spectrum (SFS) from VCF data for a given population and sequence. The panel file is used to select samples.

Parameters:
  • vcf_file (pyVCF Reader, see https://pyvcf.readthedocs.io/en/latest/) – Variant details
  • panel (pandas DataFrame) – Proband details
  • coord (str) – Coordinate (e.g. chromosome).
  • start (int) – Start position of sequence.
  • end (int) – End position of sequence.
  • select_chr (bool) – If True, sample first chromosome. If False, use both.
Returns:

  • list – Site frequency spectrum
  • int – Sample size
  • list – Names of variants common to all elements of the sample.

The module vcf2sfs uses the pyVCF library for VCF processing: see https://pypi.org/project/PyVCF/.
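
The tallying step at the heart of this function can be sketched without any VCF parsing: given per-variant derived-allele counts, build the SFS by dropping non-segregating sites and counting how many variants fall in each frequency class (an illustrative sketch, not the package source; `counts_to_sfs_sketch` is a hypothetical name):

```python
from collections import Counter

def counts_to_sfs_sketch(derived_counts, n):
    # Keep only segregating sites (derived count strictly between 0 and n)
    # and tally them into an SFS of length n - 1.
    tally = Counter(c for c in derived_counts if 0 < c < n)
    return [tally.get(i, 0) for i in range(1, n)]
```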