A survey of the statistical power of research in behavioral ecology and animal behavior

Authors

Jennions, Michael
Moller, Anders Pape

Publisher

Oxford University Press

Abstract

We estimated the statistical power of the first and last statistical test presented in 697 papers from 10 behavioral journals. First tests had significantly greater statistical power and reported more significant results (smaller p values) than did last tests. This trend was consistent across journals, taxa, and the type of statistical test used. On average, statistical power was 13-16% to detect a small effect and 40-47% to detect a medium effect. This is far lower than the general recommendation of a power of 80%. By this criterion, only 2-3%, 13-21%, and 37-50% of the tests examined had the requisite power to detect a small, medium, or large effect, respectively. Neither p values nor statistical power varied significantly across the 10 journals or 11 taxa. However, mean p values of first and last tests were significantly correlated across journals (r = .67, n = 10, p = .034), with a similar trend for mean power (r = .63, n = 10, p = .051). There is therefore some evidence that power and p values are repeatable among journals. Mean p values or power of first and last tests were, however, uncorrelated across taxa. Finally, there was a significant correlation between power and reported p value for both first (r = .13, n = 684, p = .001) and last tests (r = .16, n = 654, p < .0001). If true effect sizes are unrelated to study sample sizes, the average true effect size must be nonzero for this pattern to emerge. This suggests that failure to observe significant relationships is partly owing to small sample sizes, as power increases with sample size.
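
As a rough illustration of how power depends on effect size and sample size, the sketch below approximates the two-tailed power of a test of a Pearson correlation using the Fisher z transformation. It is not the authors' procedure. The benchmark effect sizes (r = 0.1, 0.3, 0.5 for small, medium, and large), the significance level of 0.05, and the function name `correlation_power` are all assumptions made for this example.

```python
import math
from statistics import NormalDist

# Assumed benchmark effect sizes for a correlation (Cohen's conventions);
# the abstract does not state which values were used.
COHEN_R = {"small": 0.1, "medium": 0.3, "large": 0.5}


def correlation_power(r, n, alpha=0.05):
    """Approximate two-tailed power to detect a true correlation r
    with n observations, using the Fisher z normal approximation."""
    if n <= 3:
        raise ValueError("n must be greater than 3")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Under the alternative, atanh(r_hat) * sqrt(n - 3) is roughly
    # normal with mean delta and unit variance.
    delta = math.atanh(r) * math.sqrt(n - 3)
    return std_normal.cdf(delta - z_crit) + std_normal.cdf(-delta - z_crit)


# Hypothetical sample sizes, chosen only to show the trend.
for n in (20, 50, 100, 800):
    row = ", ".join(
        f"{label}: {correlation_power(r, n):.2f}" for label, r in COHEN_R.items()
    )
    print(f"n = {n:>3} -> {row}")
```

Under this approximation, roughly 85 observations are needed to reach the recommended power of 80% for a medium effect, and several hundred for a small one, which is consistent with the abstract's point that power increases with sample size.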

Source

Behavioral Ecology
