Cultural advice

The Australian National University acknowledges, celebrates and pays our respects to the Ngunnawal and Ngambri people of the Canberra region and to all First Nations Australians on whose traditional lands we meet and work, and whose cultures are among the oldest continuing cultures in human history.

Aboriginal and Torres Strait Islander peoples are advised that ANU Library collections may include images, names, voices, and other representations of deceased persons.

Material in the collection may contain terms, language or views that reflect the period in which the item was created and may be considered inappropriate today.

Regarding the F-word: The effects of data filtering on inferred genotype-environment associations

Loading...
Thumbnail Image

Date

Authors

Ahrens, Collin W.
Jordan, Rebecca
Bragg, Jason
Harrison, Peter A.
Hopley, Tara
Bothwell, Helen
Murray, Kevin
Steane, Dorothy
Whale, John
Byrne, Margaret

Journal Title

Journal ISSN

Volume Title

Publisher

Wiley-Blackwell

Abstract

Genotype-environment association (GEA) methods have become part of the standard landscape genomics toolkit, yet, we know little about how to best filter genotype-by-sequencing data to provide robust inferences for environmental adaptation. In many cases, default filtering thresholds for minor allele frequency and missing data are applied regardless of sample size, having unknown impacts on the results, negatively affecting management strategies. Here, we investigate the effects of filtering on GEA results and the potential implications for assessment of adaptation to environment. We use empirical and simulated data sets derived from two widespread tree species to assess the effects of filtering on GEA outputs. Critically, we find that the level of filtering of missing data and minor allele frequency affect the identification of true positives. Even slight adjustments to these thresholds can change the rate of true positive detection. Using conservative thresholds for missing data and minor allele frequency substantially reduces the size of the data set, lessening the power to detect adaptive variants (i.e., simulated true positives) with strong and weak strengths of selection. Regardless, strength of selection was a good predictor for GEA detection, but even some SNPs under strong selection went undetected. False positive rates varied depending on the species and GEA method, and filtering significantly impacted the predictions of adaptive capacity in downstream analyses. We make several recommendations regarding filtering for GEA methods. Ultimately, there is no filtering panacea, but some choices are better than others, depending on the study system, availability of genomic resources, and desired objectives.

Description

Citation

Source

Molecular Ecology Resources

Book Title

Entity type

Access Statement

License Rights

Restricted until

2099-12-31
abcd