On the prior distribution of extinction time.

Conservation biology

rsbl.royalsocietypublishing.org

Andrew R. Solow

Research

Cite this article: Solow AR. 2016 On the prior distribution of extinction time. Biol. Lett. 12: 20160089. http://dx.doi.org/10.1098/rsbl.2016.0089

Received: 31 January 2016
Accepted: 10 May 2016

Subject Areas: ecology

Keywords: Bayesian inference, extinction time, non-informative prior distribution

Author for correspondence: Andrew R. Solow
e-mail: [email protected]

Woods Hole Oceanographic Institution, Woods Hole, MA 02543, USA

ARS, 0000-0003-1978-1372

Bayesian inference about the extinction of a species based on a record of its sightings requires the specification of a prior distribution for extinction time. Here, I critically review some specifications in the context of a specific model of the sighting record. The practical implication of the choice of prior distribution is illustrated through an application to the sighting record of the Caribbean monk seal.

1. Introduction

Understanding the timing of species extinctions is of interest to both palaeobiologists concerned with macroevolution and ecologists concerned with the fate of modern species. A variety of methods have been proposed for statistical inference about the extinction of a species based on a record of its sightings [1–3]. Recent work has focused on Bayesian methods [4–6]. The Bayesian approach can be appealing for both philosophical and practical reasons [7]. Among the latter is the fact that standard results for non-Bayesian inference (e.g. that the likelihood ratio statistic has an approximate $\chi^2$ distribution) do not hold in the case of an endpoint like extinction time [8].

As described below, Bayesian methods require the specification of a prior distribution for extinction time. There has been some debate in the literature over aspects of this specification and the purpose of this paper is to lay out the underlying statistical issues in one place. For concreteness, this discussion will be framed in the context of calculating extinction probability under a statistical model in which the pre-extinction sighting rate is constant over continuous time. However, the issues raised here also arise in other situations, e.g. when the sighting rate declines prior to extinction or when interest centres on estimating the time of extinction.

The remainder of this paper is organized in the following way. The next section presents the basic statistical model and Bayesian inference up to the specification of the prior distribution for extinction time. Some alternative choices for this prior distribution are then discussed and the practical implications of these choices are illustrated using the sighting record of the Caribbean monk seal (Monachus tropicalis). The final section contains some concluding remarks.

2. Statistical model

One contribution to the special feature ‘Biology of extinction: inferring events, patterns and processes’ edited by Barry Brook and John Alroy.

The situation described here is the continuous version of the discrete one laid out in [9] in a discussion of [5]. Suppose that during the observation period $(0, T)$ sightings of a species occur at times $s = (s_1, s_2, \ldots, s_n)$. Note that the convention here is that time increases from a baseline in the past toward the present. The reverse convention, in which time increases from the present into the past, is used in palaeobiology. These sightings are assumed to follow a Poisson process with rate function

$$\lambda(t) = \begin{cases} \lambda, & 0 \le t < \tau \\ 0, & \tau \le t \le T, \end{cases} \tag{2.1}$$

where $\tau$ is the unknown extinction time. That is, the sighting rate is constant before extinction and 0 after. By itself, the total number of sightings, as opposed to their

© 2016 The Author(s). Published by the Royal Society. All rights reserved.

$$p(\tau \mid s) = \frac{p(s \mid \tau)\, p(\tau)}{\int_0^\infty p(s \mid \tau)\, p(\tau)\, d\tau}, \tag{2.2}$$

where $p(s \mid \tau)$ is the likelihood of $s$ given $\tau$ and $p(\tau)$ is the prior pdf of $\tau$. For the model outlined above

$$p(s \mid \tau) = \begin{cases} 0, & \tau < s_{(n)} \\ \tau^{-n}, & s_{(n)} < \tau < T \\ T^{-n}, & \tau > T, \end{cases} \tag{2.3}$$

where $s_{(n)}$ is the most recent sighting time. Note that, for this model, the sighting record enters this likelihood only through $s_{(n)}$: in statistical terminology, $s_{(n)}$ is a sufficient statistic. It follows from (2.2) and (2.3) that

$$p(\tau \mid s) = \frac{\tau^{-n}\, p(\tau)}{\int_{s_{(n)}}^{T} \tau^{-n}\, p(\tau)\, d\tau + T^{-n}\, \mathrm{pr}(\tau > T)}, \qquad s_{(n)} < \tau < T. \tag{2.4}$$

The term $\mathrm{pr}(\tau > T)$ in the denominator of this expression is the prior probability that the species is not extinct by the end of the observation period. The posterior pdf in (2.4) provides the basis for Bayesian inference about extinction time.
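The posterior in (2.4) can be evaluated numerically for any prior that supplies a density on $(s_{(n)}, T)$ and a value for $\mathrm{pr}(\tau > T)$. The following is a minimal sketch under stated assumptions: the function names `simpson` and `posterior_extinction_prob`, the record values, and the flat prior on $(0, 2T)$ are all hypothetical and chosen only for illustration. The likelihood is rescaled by $T^n$, which cancels between numerator and denominator of (2.4).

```python
def simpson(f, a, b, m=2000):
    """Composite Simpson's rule on [a, b] using m subintervals (m must be even)."""
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def posterior_extinction_prob(n, s_n, T, prior_pdf, prior_surv_T):
    """Integrate the posterior pdf (2.4) over (s_n, T).

    prior_pdf(tau) plays the role of p(tau); prior_surv_T is pr(tau > T).
    The likelihood tau**-n is multiplied by T**n to avoid very small numbers.
    """
    weight = simpson(lambda tau: (T / tau) ** n * prior_pdf(tau), s_n, T)
    return weight / (weight + prior_surv_T)


# Hypothetical sighting record, for illustration only.
n, s_n, T = 5, 30.0, 100.0


def flat_prior(tau):
    """Assumed prior: uniform on (0, 2T), so that pr(tau > T) = 1/2."""
    return 1.0 / (2.0 * T)


p_ext = posterior_extinction_prob(n, s_n, T, flat_prior, 0.5)
print(f"posterior probability of extinction: {p_ext:.4f}")
```

Any other prior can be substituted by passing a different density and the matching value of $\mathrm{pr}(\tau > T)$.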

3. Prior specification

To evaluate the posterior pdf $p(\tau \mid s)$ in (2.4), it is necessary to specify the prior pdf $p(\tau)$. This prior pdf encodes available knowledge about $\tau$ that is independent of $s$. This knowledge may come from theory, expert opinion or results for similar situations. For example, if the scientific assumption is made that the instantaneous risk of extinction is constant over time, then the conditional prior distribution for $\tau$ is exponential with pdf [5]

$$p(\tau \mid \theta) = \theta \exp(-\theta\tau). \tag{3.1}$$

This pdf is conditional on the parameter $\theta$, which is the reciprocal of the expected extinction time. More generally, the exponential model is reasonable if extinction risk is roughly constant over the observation period. It is important to emphasize that (3.1) is a stochastic model of the lifetime of a species and not a representation of human uncertainty. In the Bayesian formulation, human uncertainty is reflected in a prior pdf $p(\theta)$ for the parameter of this distribution which can be combined with (3.1) to produce the unconditional prior pdf of $\tau$

$$p(\tau) = \int_0^\infty p(\tau \mid \theta)\, p(\theta)\, d\theta \tag{3.2}$$

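needed to evaluate $p(\tau \mid s)$. A flexible and convenient choice for $p(\theta)$ is the gamma pdf

$$p(\theta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, \theta^{\alpha - 1} \exp(-\beta\theta), \tag{3.3}$$

in which case $p(\tau)$ is the Pareto pdf

$$p(\tau) = \frac{\alpha\beta^{\alpha}}{(\tau + \beta)^{\alpha + 1}}. \tag{3.4}$$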
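The step from the gamma mixing density to (3.4) can be written out as a short worked derivation. The intermediate lines below are a reconstruction using only (3.1), (3.2) and (3.3) together with the standard gamma integral; they are not part of the original text.

```latex
\begin{align*}
p(\tau)
  &= \int_0^\infty \theta e^{-\theta\tau}\,
     \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, \theta^{\alpha-1} e^{-\beta\theta}\, d\theta
   = \frac{\beta^{\alpha}}{\Gamma(\alpha)} \int_0^\infty \theta^{\alpha} e^{-(\tau+\beta)\theta}\, d\theta \\
  &= \frac{\beta^{\alpha}}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha+1)}{(\tau+\beta)^{\alpha+1}}
   = \frac{\alpha\beta^{\alpha}}{(\tau+\beta)^{\alpha+1}}, \\
\mathrm{pr}(\tau > T)
  &= \int_T^\infty \frac{\alpha\beta^{\alpha}}{(t+\beta)^{\alpha+1}}\, dt
   = \left(\frac{\beta}{\beta+T}\right)^{\alpha}.
\end{align*}
```

The second line uses $\Gamma(\alpha+1) = \alpha\,\Gamma(\alpha)$, and the last line gives the prior probability that the species survives beyond $T$.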

For this prior distribution, $\mathrm{pr}(\tau > T) = (\beta/(\beta + T))^{\alpha}$ and the integral in the denominator of (2.4) must be evaluated numerically. The problem now is to specify $\alpha$ and $\beta$ which, in Bayesian terminology, are referred to as hyperparameters. It is useful here to distinguish between choices of these hyperparameters that are intended to reflect prior information and those that are intended to be neutral or non-informative. Two options in the first instance are the elicitation [10] and combination [11] of the subjective opinions of experts and parametric empirical Bayes methods [12] that essentially fit the hyperparameters using sighting data for similar species. The way in which these approaches are applied depends on the particular situation and, although both can be useful, I will not pursue them here.

In contrast with the informative case, the specification of non-informative hyperparameters can often be based on formal rules (see [13] for a review and critique). The most popular of these is the Jeffreys prior, which ensures that the results of a Bayesian analysis are invariant to alternative parametrizations of the model. For the exponential distribution, the Jeffreys prior corresponds to the limiting case $\alpha, \beta \to 0$, so that $p(\theta) \propto 1/\theta$. For this choice, $p(\tau) \propto 1/\tau$. This pdf is improper (i.e. its integral diverges) and it is not possible to find $\mathrm{pr}(\tau > T)$, which is needed to evaluate (2.4). A standard approach in this situation is to approximate the Jeffreys prior by taking $\alpha$ and $\beta$ close to 0.

In the context of the exponential model, Alroy [5] proposed a non-informative choice taking $\theta = \log 2 / s_{(n)}$, ensuring that $\mathrm{pr}(\tau > s_{(n)}) = 1/2$. The sense in which this is non-informative is unclear. On the technical side, the dependence of $\theta$ on $s_{(n)}$ violates the Bayesian principle that the prior distribution should be specified independently of the data used to update it. Briefly, Bayes' theorem in (2.2) relies on decomposing the joint pdf of $s$ and $\tau$ into the product of the conditional pdf of $s$ given $\tau$ and the unconditional pdf of $\tau$. This decomposition is not maintained if the latter is replaced by a function of $s$.

This proposal was modified in [6] by taking $\theta = \log 2 / T$, ensuring that $\mathrm{pr}(\tau > T) = 1/2$. The latter is a natural non-informative prior probability in the Bayesian version of testing for extinction, but ensuring it by fixing $\theta$ conflates the stochastic lifetime model with human uncertainty [14]. Indeed, specifying a single value for $\theta$ is equivalent to complete knowledge of the instantaneous extinction risk and is, in this sense, fully informative. For example, treating $\theta$ as known eliminates the possibility of learning about it from the sighting record. This is in contrast with the hierarchical specification outlined above in which the sighting record can be used to update $p(\theta)$. For an exponential prior distribution for $\tau$ with fixed parameter $\theta$, $\mathrm{pr}(\tau > T) = \exp(-\theta T)$ and the integral in the denominator of (2.4) involves incomplete gamma functions and is easily evaluated numerically.

A non-informative prior specification in the spirit of that described in [6] that is not connected to a scientific model of species lifetime but reflects pure human uncertainty is

$$\mathrm{pr}(\tau < T) = \tfrac{1}{2} \tag{3.5}$$

Biol. Lett. 12: 20160089

timing, provides no information about extinction time and it is natural to eliminate the parameter $\lambda$ by conditioning on $n$. It is a property of the Poisson process that, conditional on $n$, $s$ represents a realization of $n$ independent random variables uniformly distributed over the interval $(0, \tau)$. Bayesian inference about $\tau$ is based on its conditional or posterior distribution given $s$. Here and below, I will abuse notation by using the same symbol to denote a random variable and its realized value. By Bayes' theorem, the posterior probability density function (pdf) of $\tau$ is

Table 1. Prior and posterior extinction probabilities for the Caribbean monk seal for different specifications of the prior distribution of extinction time.

specification          prior probability    posterior probability
Pareto (0.01, 0.01)    0.09                 0.44
[5]                    0.85                 0.99
[6]                    0.50                 0.97
conditional uniform    0.50                 0.97

and

$$p(\tau \mid \tau < T) = \frac{1}{T}, \tag{3.6}$$

so that, given that extinction has occurred, all possible extinction times are equally likely. For this specification,

$$p(\tau \mid s) = \frac{(n-1)\,\tau^{-n}}{s_{(n)}^{-(n-1)} + (n-2)\,T^{-(n-1)}}, \qquad s_{(n)} < \tau < T. \tag{3.7}$$
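As a check on (3.7), the conditional uniform prior can be substituted into (2.4) directly. The intermediate steps below are a reconstruction from (2.4), (3.5) and (3.6), assuming $n > 1$; they do not appear in the original text.

```latex
\begin{align*}
p(\tau) &= \mathrm{pr}(\tau < T)\, p(\tau \mid \tau < T) = \frac{1}{2T}, \quad 0 < \tau < T,
  \qquad \mathrm{pr}(\tau > T) = \tfrac{1}{2}, \\
p(\tau \mid s)
  &= \frac{\tau^{-n}/(2T)}{\dfrac{1}{2T}\displaystyle\int_{s_{(n)}}^{T} u^{-n}\, du + \tfrac{1}{2}\, T^{-n}}
   = \frac{\tau^{-n}}{\dfrac{s_{(n)}^{-(n-1)} - T^{-(n-1)}}{n-1} + T^{-(n-1)}} \\
  &= \frac{(n-1)\,\tau^{-n}}{s_{(n)}^{-(n-1)} + (n-2)\,T^{-(n-1)}},
  \qquad s_{(n)} < \tau < T.
\end{align*}
```

Integrating this density over $(s_{(n)}, T)$ gives a posterior extinction probability of $\bigl(s_{(n)}^{-(n-1)} - T^{-(n-1)}\bigr) / \bigl(s_{(n)}^{-(n-1)} + (n-2)\,T^{-(n-1)}\bigr)$.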

Another choice for $p(\tau \mid \tau < T)$ is the truncated exponential

$$p(\tau \mid \theta, \tau < T) = \frac{\theta \exp(-\theta\tau)}{1 - \exp(-\theta T)}, \qquad 0 < \tau < T, \tag{3.8}$$

but, as with (3.1), this would require specifying a prior distribution for $\theta$.

4. Illustration

Since 1908, there have been seven confirmed sightings of the Caribbean monk seal. The latest of these was in 1952 and, to my knowledge, there has been none since. For this sighting record, $n = 7$, $s_{(n)} = 44$ and $T = 108$ (the year at writing being 2016). Here, I will focus on the posterior probability of extinction

$$\mathrm{pr}(\tau < T \mid s) = \int_0^T p(\tau \mid s)\, d\tau. \tag{4.1}$$
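The probability in (4.1) can be approximated with a few lines of numerical integration. The sketch below is an illustration rather than the computation used for the paper: the helper names, the use of Simpson's rule and the dictionary keys are assumptions. It covers the Pareto prior (3.4) with $\alpha = \beta = 0.01$, the exponential prior with $\theta = \log 2 / T$ as in [6], and the conditional uniform prior of (3.5) and (3.6). The specification of [5] is not attempted here.

```python
import math


def simpson(f, a, b, m=4000):
    """Composite Simpson's rule on [a, b] using m subintervals (m must be even)."""
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def posterior_extinction_prob(n, s_n, T, prior_pdf, prior_surv_T):
    """Posterior pr(tau < T | s): the pdf (2.4) integrated over (s_n, T).

    The likelihood tau**-n is rescaled by T**n; the factor cancels in (2.4).
    """
    weight = simpson(lambda tau: (T / tau) ** n * prior_pdf(tau), s_n, T)
    return weight / (weight + prior_surv_T)


# Sighting record quoted in the text: n = 7, s_(n) = 44, T = 108.
n, s_n, T = 7, 44.0, 108.0

# Pareto prior (3.4) with alpha = beta = 0.01.
alpha, beta = 0.01, 0.01
pareto = (
    lambda tau: alpha * beta ** alpha / (tau + beta) ** (alpha + 1),
    (beta / (beta + T)) ** alpha,
)

# Exponential prior with theta fixed at log 2 / T, as in [6].
theta = math.log(2.0) / T
exponential = (lambda tau: theta * math.exp(-theta * tau), math.exp(-theta * T))

# Conditional uniform prior: pr(tau < T) = 1/2 and p(tau | tau < T) = 1/T.
uniform = (lambda tau: 0.5 / T, 0.5)

priors = {"pareto": pareto, "exponential_6": exponential, "conditional_uniform": uniform}

results = {}
for name, (pdf, surv) in priors.items():
    prior_prob = 1.0 - surv  # prior probability of extinction by T
    post_prob = posterior_extinction_prob(n, s_n, T, pdf, surv)
    results[name] = (prior_prob, post_prob)
    print(f"{name:20s} prior = {prior_prob:.2f}  posterior = {post_prob:.2f}")
```

Changing `n` in the call to `posterior_extinction_prob` shows how the posterior responds to a longer sighting record with the same most recent sighting.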

The prior and posterior probabilities of extinction are reported in table 1 for four different specifications of $p(\tau)$ discussed in the previous section. In this case, as a consequence of the small number of sightings, the effect of the prior on the posterior dominates the effect of the data. If $s_{(n)}$ remains at 44 but $n$ is increased to 11 then, despite the substantial range of prior extinction probabilities, the posterior extinction probabilities are all at least 0.95. Although the relative influence of the data on the posterior will tend to increase with the number of sightings, the rate at which this occurs will depend on $s_{(n)}$ and on the prior. For the latter, note that the effect of increasing $n$ to 11 is much greater for the Pareto prior than for the other three specifications. It is also notable that the two cases for which the prior extinction probability is 0.5 have virtually identical posterior extinction probabilities. This underscores the fact that a seemingly innocuous representation of prior ignorance can have a strong influence on the posterior results when the number of sightings is small.

5. Discussion

The main message of this paper is that, in Bayesian inference about extinction, it is important to think carefully about the specification of the prior distribution of extinction time. Of the specifications considered here, only the Jeffreys prior has a theoretical justification. On the practical side, it cannot be used directly and an approximation is needed. The prior in [5] not only lacks a clear justification, but its dependence on the sighting record means that it does not produce a true posterior distribution for extinction time. At first glance, the prior in [6] seems like a reasonable representation of prior ignorance but, as discussed above, it is actually strongly informative. The conditional uniform prior avoids this by separating the probability of extinction from the distribution of extinction time conditional on extinction, but is unconnected to a scientific model of extinction risk.

As the results for the Caribbean monk seal with $n = 11$ suggest, the choice among these prior specifications can have little practical consequence as the number of sightings in the record increases. This underscores the premium on extending the sighting record, even to the extent of including sightings of questionable reliability, provided this is accounted for in the statistical model [15]. Of course, this is not always an option.

I have focused in this paper on prior specifications that are intended to be non-informative. This is appealing on the grounds of objectivity but, for a variety of reasons, there has been a move in the field of statistics away from non-informative priors. One good reason for this is that, in many cases, prior information is actually available. For example, in the case of the Caribbean monk seal, it is known that the reef fish that constituted its main prey were overfished [16]. Depending on the temporal pattern of this overfishing, this could militate against a prior distribution for extinction time that declines (or is flat) over the observation period. How information like this is encoded in a prior distribution is part of the bread and butter of applied Bayesian statistics.

Competing interests. I declare that I have no competing interests.

Funding. There are no funders to report.

Acknowledgements. The helpful comments of four anonymous reviewers are acknowledged with gratitude.

References

1. Solow A. 2005 Inferring extinction from a sighting record. Math. Biosci. 195, 47–55. (doi:10.1016/j.mbs.2005.02.001)
2. Rivadeneira MM, Hunt G, Roy K. 2009 The use of sighting records to infer extinctions: an evaluation of different methods. Ecology 90, 1291–1300. (doi:10.1890/08-0316.1)
3. Boakes EH, Rout TM, Collen B. 2015 Inferring species extinction: the use of sighting records. Methods Ecol. Evol. 6, 678–687. (doi:10.1111/2041-210X.12365)
4. Lee TE, McCarthy MA, Wintle BA, Bode M, Roberts DL, Burgman MA. 2014 Inferring extinctions from sighting records of variable reliability. J. Appl. Ecol. 51, 251–258. (doi:10.1111/1365-2664.12144)
5. Alroy J. 2014 A simple Bayesian method of inferring extinction. Paleobiology 40, 584–607. (doi:10.1666/13074)
6. Alroy J. 2015 Current extinction rates of reptiles and amphibians. Proc. Natl Acad. Sci. USA 112, 13 003–13 008. (doi:10.1073/pnas.1508681112)
7. Gelman A, Carlin JB, Stern HS, Rubin DB. 2003 Bayesian data analysis. Boca Raton, FL: Chapman & Hall/CRC.
8. Smith RL. 1985 Maximum likelihood estimation in a class of non-regular problems. Biometrika 72, 67–90. (doi:10.2307/2336336)
9. Solow AR. 2016 A simple Bayesian method for inferring extinction: comment. Ecology 97, 796–798. (doi:10.1890/15-0336.1)
10. Garthwaite PH, Kadane JB, O'Hagan A. 2005 Statistical methods for eliciting probability distributions. J. Am. Stat. Assoc. 100, 680–701. (doi:10.1198/016214505000000105)
11. French S. 2011 Aggregating expert judgment. Rev. R. Acad. Ciens. Ex. Fis. Nat. 105, 181–206. (doi:10.1007/s13398-011-0018-6)
12. Carlin BC, Louis TA. 2000 Bayes and empirical Bayes methods for data analysis. Boca Raton, FL: Chapman & Hall/CRC.
13. Kass RE, Wasserman L. 1996 The selection of prior distributions by formal rules. J. Am. Stat. Assoc. 91, 1343–1370. (doi:10.1080/01621459.1996.10477003)
14. Solow AR. 2016 On Bayesian inference about extinction. Proc. Natl Acad. Sci. USA 113, E1132. (doi:10.1073/pnas.1525317113)
15. Solow A, Beet A. 2014 On uncertain sightings and inference about extinction. Conserv. Biol. 28, 1119–1123. (doi:10.1111/cobi.12309)
16. McClenachan L, Cooper AB. 2008 Extinction rate, historical population structure, and ecological role of the Caribbean monk seal. Proc. R. Soc. B 275, 1351–1358. (doi:10.1098/rspb.2007.1757)
