Python modules
Access the selectiontest Python module with the following import (or a variation thereof):
import selectiontest
selectiontest.selectiontest.test_neutrality(sfs, variates0=None, variates1=None, reps=10000)
Calculate \(\rho\), the log odds ratio of the data under the distribution given by variates0 relative to the distribution given by variates1.
Parameters: - sfs (list) – Site frequency spectrum, e.g. [1, 3, 0, 2, 1]
- variates0 (numpy array) – Array of variates from the null hypothesis distribution. Default uses the Wright-Fisher model.
- variates1 (numpy array) – Array of variates from the alternative distribution. Default uses the 'uniform' model.
- reps (int) – Number of variates to generate if the default is used.
Returns: \(\rho\) (value of the log odds ratio). Values can include inf, -inf or nan if one or both probabilities are zero due to underflow.
Return type: numpy.float64
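The shape of this statistic can be illustrated with a minimal sketch, under the assumption that the likelihood of an SFS given a variate q (branch-length proportions) is multinomial, and using flat-Dirichlet samples as stand-ins for both sets of variates; the real function defaults to Wright-Fisher and 'uniform' variates, which this sketch does not reproduce:

```python
import numpy as np
from math import lgamma, log10

def multinomial_loglik(sfs, q):
    # log P(sfs | q) for a multinomial with cell probabilities q (assumption:
    # this is the likelihood model; it is not taken from the library's source).
    s = sum(sfs)
    ll = lgamma(s + 1)
    for k, p in zip(sfs, q):
        ll += k * np.log(p) - lgamma(k + 1)
    return ll

def rho_sketch(sfs, variates0, variates1):
    # Log odds ratio: mean likelihood under variates0 over mean under variates1.
    l0 = np.mean([np.exp(multinomial_loglik(sfs, q)) for q in variates0])
    l1 = np.mean([np.exp(multinomial_loglik(sfs, q)) for q in variates1])
    return log10(l0 / l1)   # can yield inf, -inf or nan on underflow

rng = np.random.default_rng(0)
sfs = [1, 3, 0, 2, 1]
# Stand-ins for the two sets of variates; the library would supply these.
variates0 = rng.dirichlet(np.ones(len(sfs)), size=1000)
variates1 = rng.dirichlet(np.ones(len(sfs)), size=1000)
rho = rho_sketch(sfs, variates0, variates1)
```

Because both stand-in distributions are identical here, rho should hover near zero; with genuine Wright-Fisher versus 'uniform' variates it discriminates between the hypotheses.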
selectiontest.selectiontest.calculate_D(sfs)
Calculate Tajima's D from a site frequency spectrum.
Parameters: sfs (list) – Site frequency spectrum, e.g. [1, 3, 0, 2, 1]
Returns: Value of Tajima's D.
Return type: numpy.float64
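For orientation, Tajima's D can be computed directly from an SFS with the standard formulas; this is a self-contained re-implementation for illustration, not the library's code:

```python
import numpy as np

def tajimas_d(sfs):
    # sfs[i-1] = number of segregating sites where the derived allele
    # appears i times in a sample of n sequences (so len(sfs) == n - 1).
    sfs = np.asarray(sfs, dtype=float)
    n = len(sfs) + 1
    i = np.arange(1, n)
    S = sfs.sum()                                  # segregating sites
    a1 = np.sum(1.0 / i)
    a2 = np.sum(1.0 / i**2)
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n**2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    theta_pi = np.sum(i * (n - i) * sfs) / (n * (n - 1) / 2.0)  # pairwise diversity
    theta_w = S / a1                                            # Watterson's estimator
    return (theta_pi - theta_w) / np.sqrt(e1 * S + e2 * S * (S - 1))

d = tajimas_d([1, 3, 0, 2, 1])
```

An SFS dominated by singletons (e.g. [10, 0, 0, 0, 0]) gives a negative D, the classic signature of an excess of rare variants.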
selectiontest.selectiontest.sample_wf_distribution(n, reps)
Calculate variates for the probability distribution Q under the Wright-Fisher model.
Parameters: - n (int) – Sample size
- reps (int) – Number of variates to generate.
Yields: numpy.ndarray – Array of variates of length n - 1 (one array per replicate)
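One way to picture these variates: under the Kingman coalescent (the large-population limit of the Wright-Fisher model), each replicate corresponds to a random genealogy, and a variate can be read as the proportion of total branch length subtending 1, 2, ..., n-1 leaves. The simulation below is a sketch of that picture, an assumption about the method rather than the library's implementation:

```python
import numpy as np

def wf_variate(n, rng):
    # Simulate one Kingman-coalescent genealogy and return the relative
    # branch length subtending i leaves, for i = 1 .. n-1.
    lengths = np.zeros(n - 1)
    leaves = [1] * n                 # number of leaves subtended by each lineage
    k = n
    while k > 1:
        t = rng.exponential(2.0 / (k * (k - 1)))  # waiting time, rate C(k, 2)
        for c in leaves:
            lengths[c - 1] += t      # every active lineage accrues length t
        i, j = rng.choice(k, size=2, replace=False)  # coalesce a random pair
        merged = leaves[i] + leaves[j]
        leaves = [c for idx, c in enumerate(leaves) if idx not in (i, j)]
        leaves.append(merged)
        k -= 1
    return lengths / lengths.sum()

rng = np.random.default_rng(1)
v = wf_variate(8, rng)
```

Each variate is a probability vector: nonnegative entries of length n - 1 summing to one, with mass concentrated on the low-frequency classes on average.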
selectiontest.selectiontest.sample_uniform_distribution(n, reps)
Calculate variates for the 'uniform' probability distribution Q.
Parameters: - n (int) – Sample size
- reps (int) – Number of variates to generate.
Returns: Array of variates, shape (reps, n - 1)
Return type: numpy.ndarray
selectiontest.selectiontest.compute_threshold(n, seg_sites, sreps=10000, wreps=10000, fpr=0.02)
Calculate the threshold value of \(\rho\) corresponding to a given false positive rate (FPR). For values of \(\rho\) above the threshold we reject the null (by default, neutral) hypothesis.
Parameters: - n (int) – Sample size
- seg_sites (int) – Number of segregating sites in the sample.
- sreps (int) – Number of SFS configurations and uniform variates to generate if the default is used.
- wreps (int) – Number of Wright-Fisher variates to generate if the default is used.
- fpr (float) – Selected FPR tolerance.
Returns: Threshold value for the log odds ratio
Return type: numpy.float64
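The core of any such threshold is an empirical quantile: simulate the statistic under the null hypothesis, then take the (1 - FPR) quantile. A schematic, where rho_null is a hypothetical stand-in for \(\rho\) values computed from null simulations:

```python
import numpy as np

rng = np.random.default_rng(2)
# Stand-in for rho values computed from SFS configurations simulated
# under the neutral null; the library generates these internally.
rho_null = rng.normal(loc=-1.0, scale=0.5, size=10000)
fpr = 0.02
threshold = np.quantile(rho_null, 1 - fpr)
# By construction, about fpr of null simulations exceed the threshold,
# so rejecting when observed rho > threshold gives the chosen FPR.
```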
selectiontest.selectiontest.piecewise_constant_variates(n, timepoints, pop_sizes, reps=10000)
Generate variates corresponding to a piecewise-constant demographic history.
Parameters: - n (int) – Sample size
- timepoints (array-like) – Times at which the population size changes (in generations, backward from the present).
- pop_sizes (array-like) – Population sizes between timepoints (only relative sizes matter).
- reps (int) – Number of variates to generate.
Yields: numpy.ndarray – Variates
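The mechanism behind a piecewise-constant demography is time rescaling: coalescence runs faster while the relative population size is small. The helper below is hypothetical (not part of the library) and maps a coalescent time, expressed in units of the present-day size, back to generations under such a history:

```python
import numpy as np

def scaled_to_generations(tau, timepoints, pop_sizes):
    # timepoints: epoch boundaries in generations, backward in time, ascending.
    # pop_sizes: relative sizes on [0, t1), [t1, t2), ..., [tk, inf),
    # so len(pop_sizes) == len(timepoints) + 1.
    bounds = [0.0] + list(timepoints) + [np.inf]
    acc = 0.0                        # scaled time consumed so far
    for lo, hi, nu in zip(bounds[:-1], bounds[1:], pop_sizes):
        seg = (hi - lo) / nu         # scaled time elapses at rate 1/nu
        if acc + seg >= tau:
            return lo + (tau - acc) * nu
        acc += seg

# Constant size: scaled time equals generations.
t0 = scaled_to_generations(1.5, [], [1.0])
# Size halves two generations back: later coalescent time is compressed.
t1 = scaled_to_generations(3.0, [2.0], [1.0, 0.5])
```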
selectiontest.selectiontest.vcf2sfs(vcf_file, panel, coord, start, end, select_chr=True)
Get the SFS from VCF data for a given population and sequence. The panel file is used to select samples.
Parameters: - vcf_file (pyvcf Reader; see https://pyvcf.readthedocs.io/en/latest/) – Variant details
- panel (pandas DataFrame) – Proband details
- coord (str) – Coordinate (e.g. chromosome).
- start (int) – Start position of the sequence.
- end (int) – End position of the sequence.
- select_chr (bool) – If True, sample the first chromosome of each individual. If False, use both.
Returns: - list – Site frequency spectrum
- int – Sample size
- list – Names of variants common to all elements of the sample.
The function vcf2sfs uses the PyVCF library for VCF processing: see https://pypi.org/project/PyVCF/.
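The SFS construction itself can be illustrated without a VCF file: given the derived-allele count of each variant across the sampled haplotypes, increment the matching SFS bin. This is a toy sketch of the kind of output vcf2sfs returns, not its implementation:

```python
def counts_to_sfs(derived_counts, n):
    # n = number of haplotypes sampled; bins cover counts 1 .. n-1,
    # since only segregating sites contribute to the SFS.
    sfs = [0] * (n - 1)
    for c in derived_counts:
        if 0 < c < n:                # skip monomorphic sites
            sfs[c - 1] += 1
    return sfs

# Six variants observed across n = 6 haplotypes.
sfs = counts_to_sfs([1, 2, 1, 4, 5, 1], n=6)  # -> [3, 1, 0, 1, 1]
```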