Forensic Science International 248 (2015) 154–171



Quantifying the weight of fingerprint evidence through the spatial relationship, directions and types of minutiae observed on fingermarks

Cedric Neumann a,b,c,*, Christophe Champod d, Mina Yoo b, Thibault Genessay d, Glenn Langenburg e

a The South Dakota State University, Department of Mathematics and Statistics, Harding Hall, Brookings, SD 57007, United States
b The Pennsylvania State University, Department of Statistics, University Park, PA 16802, United States
c Two N's Forensics Inc., Brookings, SD 57006, United States
d School of Criminal Justice, Forensic Science Institute, University of Lausanne, Batochime, quartier Sorge, CH-1015 Lausanne-Dorigny, Switzerland
e Elite Forensic Services, LLC, Saint-Paul, MN 55117, United States

ARTICLE INFO

Article history: Received 19 May 2014; received in revised form 30 December 2014; accepted 7 January 2015; available online 16 January 2015

Keywords: Fingerprint evidence; Strength of evidence; Sub-population effect; Spatial relationship; Statistical model

ABSTRACT

This paper presents a statistical model for the quantification of the weight of fingerprint evidence. In contrast to previous models (generative and score-based models), our model proposes to estimate the probability distributions of spatial relationships, directions and types of minutiae observed on fingerprints for any given fingermark. Our model relies on an AFIS algorithm provided by 3M Cogent and on a dataset of more than 4,000,000 fingerprints to represent a sample from a relevant population of potential sources. The performance of our model was tested using several hundred minutiae configurations observed on a set of 565 fingermarks. In particular, the effects of various sub-populations of fingers (i.e., finger number, finger general pattern) on the expected evidential value of our test configurations were investigated. The performance of our model indicates that the spatial relationship between minutiae carries more evidential weight than their type or direction. Our results also indicate that the AFIS component of our model directly enables us to assign weight to fingerprint evidence without the need for the additional layer of complex statistical modeling involved in the estimation of the probability distributions of fingerprint features. In fact, it seems that the AFIS component is more sensitive to the sub-population effects than the other components of the model. Overall, the data generated during this research project support the idea that fingerprint evidence is a valuable forensic tool for the identification of individuals. © 2015 Elsevier Ireland Ltd. All rights reserved.

1. Introduction

The skin of the digits (fingers and toes), palms and soles of human beings is formed of papillary ridges, also known as friction ridges. Fingerprints have been used with considerable success over the past century to determine or verify the identity of individuals using finger impressions taken under controlled conditions, or from friction ridge impressions left inadvertently at crime scenes. In particular, fingerprint examiners are concerned with the determination of the identity of criminals through the examination of partial, potentially distorted and degraded friction ridge impressions recovered at crime scenes. In line with the European terminology, we will refer to crime-scene impressions using the term fingermarks, to control impressions from a known individual of interest using the term fingerprints, and to impressions from individuals in a given population using the term reference prints.

Currently, the fingerprint examiner community uses one general protocol to guide fingerprint examination. This protocol consists of 4 main stages, summarized by the acronym ACE-V (for Analysis, Comparison, Evaluation and Verification). Although this acronym is not always mentioned, this protocol is commonly described by the different professional bodies [1,2], in the relevant literature [3–5], and in US courts when examiners report fingerprint evidence [6–10].

* Corresponding author at: The South Dakota State University, Department of Mathematics and Statistics, Harding Hall, Brookings, SD 57007, United States. Tel.: +1 415 272 67 52. E-mail address: [email protected] (C. Neumann). http://dx.doi.org/10.1016/j.forsciint.2015.01.007 0379-0738/© 2015 Elsevier Ireland Ltd. All rights reserved.

C. Neumann et al. / Forensic Science International 248 (2015) 154–171

The practical implementation of this protocol may vary between agencies. However, fingerprint professionals, and scientific and legal scholars, generally accept that it aims at minimizing the risk of errors and provides a measure of quality assurance. That said, this protocol requires examiners to make a series of decisions after each stage, and the same scholars have stressed the need to develop quantifiable measures to support these decisions: from the initial decision that a fingermark is worth examining, through its level of (dis)agreement with a fingerprint, to the final quantification of its evidential value [11]. Several authors [see 4,5,12 for reviews] have argued that these decisions should be supported by a probabilistic framework, and possibly by the use of a statistical model enabling the quantification of fingerprint information, in a fashion similar to DNA data.

The aim of this research is to develop and test a model to help with the assignment of the weight of fingerprint evidence. While several models have already been proposed, none of them has reached the appropriate level of statistical rigor, or been tested extensively enough, to be used in casework. This paper proposes a model that is developed using a different concept from the two main approaches used so far, and presents some measures of its performance on different datasets.

The paper is structured as follows: it starts with a brief presentation of past modeling efforts to clarify the novel nature of the model proposed in this research. Then the developed model is presented, and is studied using various datasets. Finally, its performance is discussed.

2. Models applied to fingerprints: a short review

Several models have been proposed during the past century to quantify the weight of fingerprint evidence and provide support for the conclusions reached at the end of a fingerprint examination. Models pre-dating 2001 have been reviewed by Stoney [12].
More recent models were reviewed in [5,13,14]. These models can be classified in two groups: 1) so-called score-based models; 2) so-called generative models.

2.1. Score-based models

Contrary to DNA, friction ridge skin cannot be characterized by an easily definable and quantifiable set of features. Indeed, while DNA can be summarized, for forensic purposes, using alleles at given loci, which are easily measurable, friction ridge skin contains patterns with many different levels of detail that cannot be readily summarized by discrete variables. In addition, impressions from these patterns are affected by numerous factors (such as distortion, substrate, detection technique), which lower the reproducibility of their characteristics and increase the complexity of their modeling.

Several research projects have attempted to lower the complexity of representing the multi-dimensionality and heterogeneity in friction ridge patterns by measuring the similarity between pairs of impressions. Such similarity can typically be summarized by a univariate random variable (a similarity metric or score). Score-based models attempt to provide a measure of the weight of the fingerprint evidence, represented by the level of similarity between the fingermark and the fingerprint from a considered individual, by assigning the likelihood of that level of similarity under two mutually exclusive sets of circumstances.

Score-based statistical models have intrinsic limitations:

1) The integration of scores in the full statistical framework for quantifying the weight of forensic fingerprint evidence is not well understood, and still subject to debate [5,15].
2) By design (the evidence score has to be computed between the fingermark and the fingerprint from the considered individual), such models cannot be used to support decisions made at the end of the analysis phase (since the examiner is not supposed to have had access to the fingerprint during that phase).
3) Adding new features to an existing model requires the redevelopment and re-optimization of the scoring algorithm.

2.2. Feature-based models

Other researchers have attempted to model the underlying distributions of some of the features that can be observed on friction ridge skin impressions. After having investigated the structure of these distributions, and estimated their parameters, it is theoretically possible to randomly sample fingermarks and fingerprints from these distributions, and to assign the probability of observing any constellation of features detected on a fingermark.

In general, these models were developed on datasets that were too limited in size to account properly for the dependencies between the multiple highly dimensional variables used to describe friction ridge patterns, and to account for the variability between impressions from different fingers. It appears that these models do not fit the data well, in particular the spatial relationships between neighboring minutiae [14]. Furthermore, these models currently do not allow for conditioning the underlying distributions on the observations made on the fingerprint; thus they do not account for the level of similarity between the fingermark and the fingerprint. Most of these models rely on some heuristic to determine whether these two impressions are sufficiently similar or not. This limits the support that those models can provide during the comparison and evaluation phases of the examination process.

A third type of model emerged recently [16], where similarity measures are used to reduce the dimensionality of the problem by mapping all impressions from their original multi-dimensional space onto a single-dimensional space. This new type of model quantifies the evidence represented by the one-dimensional projection of the fingermark and the fingerprint in the new space, and not the evidence represented by the measure of similarity between them. This new type of model can be used to support the decisions made by examiners at the end of each of the first 3 stages of ACE-V [5].

However, the model presented in [16] uses an ad hoc "weighting function" for the Monte-Carlo estimation of the value of the probability density of the observations made on the fingermark in both numerator and denominator distributions. It has been rightfully argued [17,18] that this model is only a very ad hoc approximation of a likelihood ratio.

This paper proposes a novel approach for the quantification of the weight of fingerprint findings. In this approach, we attempt to reduce the dimensionality of the sets of variables used to describe minutiae configurations by using shape variables as proposed in [19]. The use of shape variables allows for (a) reducing the complexity of the problem, while accounting for the dependencies between fingerprint features, as in score-based models, and (b) providing a measure of the specificity of the crime scene mark without involving the fingerprint, as in generative models and in the model proposed in [16]. Overall, the direct modeling of fingerprint features through the use of shape variables enables a more statistically appropriate construction of the model when compared to [16] and other score-based models.

3. Development of the model

The general framework of the model, and the notation, are similar to the ones described in Neumann et al. [16] and Neumann et al. [20]. We denote the entire collection of observations made on the fingermark by the multi-dimensional quantity Y. We denote the observations made on corresponding properties on a set of 10 fingerprints from a given individual by X. The model uses Y and X to address the following propositions:

Hp: the fingermark and the set of fingerprints have both been left by Mr. X.
Hd: the set of 10 fingerprints has been provided by Mr. X, but the fingermark comes from another individual within a considered population.

Following Lindley [21], the objective is to evaluate the weight of the fingerprint evidence, using a likelihood ratio (LR), which we write here, after some simplifications [16], as:

$$\mathrm{LR} = \frac{p_{Y \mid X}(X, Y \mid H_p)}{p_{Y \mid X}(X, Y \mid H_d)} \tag{1}$$

In general, we note that it is not possible to calculate the LR described in Eq. (1), since we do not know the structure and the parameters of the numerator and denominator likelihood functions. Therefore, we resort to approximating this likelihood ratio by making some assumptions and simplifications. While we will use the term likelihood ratio and its corresponding acronym LR in the development presented below, it should remain clear to the reader that we mean approximate likelihood ratio.

In [16], it is explained that the number of minutiae k recorded on the fingermark defines the dimensionality of the problem. Following [16], we denote the vector of observations made on the fingermark by $y^{(k)}$ and on the fingerprint by $x^{(k)}$. When comparing a fingermark with a set of 10 fingerprints from a given individual (such as Mr. X), an examiner will attempt to select the subset of features $x^{(k)}$ of X that corresponds best to the observations $y^{(k)}$ made on the fingermark:

1) The examiner first selects the fingerprint with the ridge flow that best corresponds to the fingermark, taking into account the tolerances allowed by the mark due to finger pad distortion and other adverse factors.
2) Secondly, the examiner focuses on the general location within the ridge flow (i.e., core, delta, periphery) of the selected fingerprint, where the minutiae were observed on the fingermark (if known).
3) Thirdly, the examiner determines whether a set of features $x^{(k)}$ on the selected fingerprint could correspond to the set $y^{(k)}$ observed on the fingermark at the corresponding location within the ridge flow.
4) Finally, the examiner compares the details of the features between both impressions.

Mathematically, this process corresponds to the selection of a single k minutiae configuration, out of the possible $\sum_{i=1}^{10} \binom{n_i}{k}$ configurations on Mr. X's 10-fingerprint set (in this paper, we are not considering palms and other friction ridge areas, but the concepts underlying this model could be extended to these areas), such that its location, shape and other features are as similar as possible to the ones of the k minutiae configuration observed on the fingermark. We denote this configuration by $x_{\min}^{(k)}$. Eq. (1) can be rewritten as follows:

$$\mathrm{LR} = \frac{p_{Y \mid X_{\min}}(y^{(k)} \mid H_p)}{p_{Y}(y^{(k)} \mid H_d)} \tag{2}$$

At this point in the development of the model, it is critical to realize that the k minutiae on the fingermark, and the corresponding k minutiae on the selected fingerprint, are assumed to be paired by virtue of the comparison process carried out by the examiner: the ith minutia on the fingermark is associated with one and only one minutia on the selected fingerprint from Mr. X. This information is implied in the model.

The numerator of the model in Eq. (2) involves estimating the probability density of the k minutiae configuration observed on the fingermark using a probability density function describing the variability of the closest k minutiae on the selected finger of Mr. X. The denominator of the model in Eq. (2) involves estimating the probability density of the k minutiae configuration observed on the fingermark using a hierarchical model of the distributions of (a) the individuals in the relevant population and (b) the set of observations made on the closest k minutiae observed on the set of 10 fingers of each individual in the population defined in (a). We will return later to the selection of the closest k minutiae on fingers from individuals in a relevant reference population.

In order to simplify the model, we consider the possibility that, in practice, the friction ridge skin of a given individual may not present a k minutiae configuration that would be considered sufficiently similar to the configuration observed on the fingermark. While we realize that we have yet to define what constitutes sufficiently similar (this will be done later), and that we could theoretically always select the most similar k minutiae configuration for any individual (even if it is only remotely similar), we chose to introduce V as an indicator variable that takes value 1 if a considered impression from a given individual has a k minutiae configuration that is sufficiently similar to the one observed on the fingermark, and 0 otherwise. We include the additional information provided by V in Eq. (2) as follows:

$$\mathrm{LR} = \frac{p_{Y \mid X_{\min},V}(y^{(k)} \mid H_p, v=1)\, p_V(v=1 \mid H_p) + p_{Y \mid X_{\min},V}(y^{(k)} \mid H_p, v=0)\, p_V(v=0 \mid H_p)}{p_{Y \mid V}(y^{(k)} \mid H_d, v=1)\, p_V(v=1 \mid H_d) + p_{Y \mid V}(y^{(k)} \mid H_d, v=0)\, p_V(v=0 \mid H_d)} \tag{3}$$

Eq. (3) can be simplified by making the following assumptions:

1) $p_{Y \mid X_{\min},V}(y^{(k)} \mid H_p, v=0)$ and $p_V(v=1 \mid H_p)$ both tend to zero when all Mr. X's fingers show unexplainable differences with the k minutiae configuration observed on the fingermark. This typically happens when Mr. X is not the donor of the fingermark and the observations made on his fingers are not compatible with the ones made on the fingermark. This may also happen when the mark shows extreme distortion or degradation. At this point, we have LR = 0;
2) $p_V(v=1 \mid H_p)$ tends to 1 when Hp is true (i.e., Mr. X is truly the source) or when Mr. X, while not strictly being the true source, displays a sufficiently similar k minutiae configuration on one of his fingers. This assumption is not a very strong one, and in practice it may be possible to assign a probability to $p_V(v=1 \mid H_p)$ for any considered Mr. X;
3) $p_{Y \mid V}(y^{(k)} \mid H_d, v=0)$ tends to zero for individuals in the reference populations whose fingerprints have unexplainable differences with the k minutiae observed on the fingermark.

In the proposed model, the terms $p_{Y \mid X_{\min},V}(y^{(k)} \mid H_p, v=1)$ and $p_{Y \mid V}(y^{(k)} \mid H_d, v=1)$ are estimated by characterizing k minutiae configurations using three different variables: shape of configuration S, minutiae direction D and minutiae type T. Rewriting Eq. (3), we obtain:

$$\mathrm{LR} = \frac{p_{Y \mid X_{\min},V}(y_S^{(k)}, y_D^{(k)}, y_T^{(k)} \mid H_p, v=1)}{p_{Y \mid V}(y_S^{(k)}, y_D^{(k)}, y_T^{(k)} \mid H_d, v=1)} \times \frac{1}{p_V(v=1 \mid H_d)} \tag{4}$$
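The step from Eq. (3) to Eq. (4) can be spelled out as follows. This is only a sketch: it takes the three assumptions listed above at their limits (the text states them as tendencies, not equalities) and considers the case where Mr. X is not excluded, so that $p_V(v=0 \mid H_p)$ vanishes by assumption 2 and the $v=0$ term of the denominator vanishes by assumption 3.

```latex
\begin{align*}
\mathrm{LR}
&= \frac{p_{Y \mid X_{\min},V}(y^{(k)} \mid H_p, v=1)\, p_V(v=1 \mid H_p)
        + p_{Y \mid X_{\min},V}(y^{(k)} \mid H_p, v=0)\, p_V(v=0 \mid H_p)}
       {p_{Y \mid V}(y^{(k)} \mid H_d, v=1)\, p_V(v=1 \mid H_d)
        + p_{Y \mid V}(y^{(k)} \mid H_d, v=0)\, p_V(v=0 \mid H_d)} \\[4pt]
&\approx \frac{p_{Y \mid X_{\min},V}(y^{(k)} \mid H_p, v=1) \cdot 1 + 0}
       {p_{Y \mid V}(y^{(k)} \mid H_d, v=1)\, p_V(v=1 \mid H_d) + 0} \\[4pt]
&= \frac{p_{Y \mid X_{\min},V}(y^{(k)} \mid H_p, v=1)}
       {p_{Y \mid V}(y^{(k)} \mid H_d, v=1)} \times \frac{1}{p_V(v=1 \mid H_d)}
\end{align*}
```

Writing $y^{(k)}$ as the triplet $(y_S^{(k)}, y_D^{(k)}, y_T^{(k)})$ in the last line gives the form of Eq. (4).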

In Eq. (4), we consider that the shapes of minutiae configurations, and the types and directions of the minutiae are influenced by the general pattern of the prints and by the location of the configurations on the ridge flow. This dependency is captured by the variable V. Therefore, we make the assumption that within a particular location (e.g., core, delta or periphery) of a given pattern (e.g., whorl, loop, arch), configuration shapes, minutiae types and minutiae directions are independent of each other. Using this assumption, we obtain:

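$$\mathrm{LR} = \frac{p_{Y \mid X_{\min},V}(y_S^{(k)} \mid H_p, v=1)}{p_{Y \mid V}(y_S^{(k)} \mid H_d, v=1)} \times \frac{p_{Y \mid X_{\min},V}(y_D^{(k)} \mid H_p, v=1)}{p_{Y \mid V}(y_D^{(k)} \mid H_d, v=1)} \times \frac{p_{Y \mid X_{\min},V}(y_T^{(k)} \mid H_p, v=1)}{p_{Y \mid V}(y_T^{(k)} \mid H_d, v=1)} \times \frac{1}{p_V(v=1 \mid H_d)} \tag{5}$$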

Given V, our model has four conditionally independent components. The first component focuses on the shape of the fingermark configuration, the second component focuses on the directions of the minutiae in the fingermark configuration, the third component focuses on their types, and the last component includes information on the general pattern of the ridge flow. Note that the design of the model enables the consideration of additional fingerprint features, conditioned on V, without the need for changing the existing elements of the model. Thus, it is possible to consider other elements commonly used by latent print examiners, such as the presence of differences between the features observed on the trace and control prints, the presence/absence of scars, warts and creases, as well as the presence/absence of impressions from sweat pores on the prints, or the shape of the ridges.


To ease the description of the model, the first three components of the model are described in separate sections below. However, we first describe two algorithms for:

1) Extracting and quantifying the features observed on k minutiae configurations present on friction ridge impressions.
2) Finding the most similar (within tolerances) k minutiae configurations on any fingerprint or reference print based on the k minutiae configuration observed on a fingermark.

3.1. Feature extraction

The process of extracting features from friction ridge impressions is image dependent: minutiae locations and directions are measured relative to a coordinate system defined by the fingerprint image. Fig. 1 displays a set of 7 features on a fingermark and the corresponding features on a fingerprint. Fig. 1 also shows that the locations and directions of corresponding minutiae are different in the two images, and that it is not possible to build a statistical model relying directly on these measurements.

Fig. 1. Raw information extracted from minutiae location and direction, with indication of the image defined axes.

Following Neumann et al. [16], we propose to describe configurations of k minutiae as a set of k triangles, whose vertices are defined by pairs of consecutive minutiae and the virtual centroid of the k configuration. This design enables the capture of the spatial relationships between minutiae, provides some robustness to the distortion affecting impressions when finger pads are pressed against a surface, and allows for measuring variables with respect to the triangles, thus breaking their dependency on the images.

Fig. 2 illustrates how the considered variables are extracted from a given configuration. At first, the minutiae are annotated on the finger impression using markers indicating their locations, types and directions. This image dependent information is used to organize the minutiae around a virtual centroid, defined by the arithmetic mean of the spatial coordinates of the minutiae. This process creates a series of triangles, whose vertices are defined by pairs of consecutive minutiae and the centroid. The triangulation is rotationally independent: the minutiae will be organized in the same order, irrespective of the angle between the impression and the axes of the image. The triangulation also provides the capability to measure the considered variables according to the triangles, and thus to break their dependency on the images.

Fig. 2. Extraction of the variables considered by the model from the raw information available on the image of a finger impression. From left to right: (a) annotation of the minutiae on the fingerprint image distinguishing ridge endings (round) and bifurcations (square); (b) definition of the centroid and organization of the minutiae with respect to the centroid; (c) creation of the triangles; (d) extraction of shape variables for one triangle and (e) extraction of the type and direction variables of the minutiae for one triangle (the variables for all triangles are similarly extracted).

In this research project, we decided to characterize each configuration by the following variables:

S: The shape of each triangle in the configuration is described by two quantitative measurements: (a) the ratio between its area and perimeter (form factor), and (b) the ratio between the diameters of its circumcircle and incircle (aspect ratio). The shape of a fingermark configuration can be formally represented by $Y_S = [Y_{S,1}, \ldots, Y_{S,k}]$;

D: The direction of each minutia in the configuration is described by the angle between the direction of the minutia and an axis defined by the centroid and the minutia location (Fig. 2). The angle is measured counterclockwise from the axis to the minutia. The directions of the minutiae in a latent print configuration can be formally represented by $Y_D = [Y_{D,1}, \ldots, Y_{D,k}]$;

T: The type of each minutia in the configuration is described by a nominal variable, which can take the following values: RE for ridge ending minutiae; BI for bifurcation minutiae; UK for minutiae whose type is unknown. The types of the minutiae in a latent print configuration can be formally represented by $Y_T = [Y_{T,1}, \ldots, Y_{T,k}]$.
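The extraction of S, D and T described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the function names, the input tuple layout `(x, y, direction, type)`, the use of radians, and the assumption of a mathematical (y-up) axis orientation are all hypothetical choices made here.

```python
import math


def triangle_shape(a, b, c):
    """Return (form factor, aspect ratio) for the triangle with vertices a, b, c."""
    la, lb, lc = math.dist(b, c), math.dist(a, c), math.dist(a, b)
    perimeter = la + lb + lc
    s = perimeter / 2.0
    # Heron's formula; clamp at zero to absorb rounding on near-flat triangles.
    area = math.sqrt(max(s * (s - la) * (s - lb) * (s - lc), 0.0))
    if area == 0.0:
        raise ValueError("degenerate triangle")
    form_factor = area / perimeter                 # area over perimeter
    circum_diameter = la * lb * lc / (2.0 * area)  # 2R, with R = abc / (4A)
    in_diameter = 2.0 * area / s                   # 2r, with r = A / s
    return form_factor, circum_diameter / in_diameter


def extract_features(minutiae):
    """minutiae: list of (x, y, direction in radians, type code 'RE'/'BI'/'UK')."""
    k = len(minutiae)
    cx = sum(m[0] for m in minutiae) / k  # virtual centroid = mean of coordinates
    cy = sum(m[1] for m in minutiae) / k
    # Organize the minutiae counterclockwise around the centroid.
    ordered = sorted(minutiae, key=lambda m: math.atan2(m[1] - cy, m[0] - cx))
    shapes, directions, types = [], [], []
    for i, (x, y, theta, kind) in enumerate(ordered):
        nx, ny = ordered[(i + 1) % k][:2]  # next minutia, wrapping around
        shapes.append(triangle_shape((cx, cy), (x, y), (nx, ny)))
        axis = math.atan2(y - cy, x - cx)  # axis from centroid to minutia
        directions.append((theta - axis) % (2.0 * math.pi))
        types.append(kind)
    return shapes, directions, types


# Hypothetical toy configuration: four minutiae pointing away from the centroid.
square = [(1, 0, 0.0, "RE"), (0, 1, math.pi / 2, "BI"),
          (-1, 0, math.pi, "RE"), (0, -1, -math.pi / 2, "UK")]
shapes, directions, types = extract_features(square)
```

Because every quantity is measured relative to the centroid and the triangles, rotating or translating the whole configuration leaves the set of shape values and relative directions unchanged, which is the property the text relies on. The starting point of the ordering used here is arbitrary; only the cyclic order matters at this stage.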

3.2. Finding the most similar k configuration and estimating $p_V(v=1 \mid H_d)$

The numerator of our model assumes (in Eq. (2)) that a trained fingerprint examiner has selected the single most similar k minutiae configuration from all configurations observed on Mr. X's 10 fingerprints, following their comparison with the fingermark. In Eq. (4), we have also assumed that if Mr. X's configuration does not show significant discrepancies with the fingermark, $p_V(v=1 \mid H_p)$ would tend to 1. We note that a value for $p_V(v=1 \mid H_p)$ could be computed to remove this assumption (footnote 1).

By symmetry, the denominator needs to consider the distributions of the observation of the same variables on fingerprints from individuals in the relevant reference population. Contrary to DNA profiling, which relies on population genetics and well-defined variables (selected alleles), it is currently not possible to determine analytically the structure and parameters of the distribution of friction ridge features in a relevant population. The distributions of the various friction ridge features need to be estimated using a sample of fingerprints from the individuals in this population.

Since it is unrealistic to require a human examiner to determine the most similar k minutiae configuration on each set of reference prints in our sample, we used the fingermark-fingerprint matching algorithm of an Automatic Fingerprint Identification System (AFIS) provided by 3M Cogent as a proxy for the human-based comparison process described above. The matching algorithm is used to search a large dataset of reference prints from a sample of individuals, and select, for each person, the set of k minutiae that is most similar to $y^{(k)}$ in terms of general pattern, location on the ridge flow, and general appearance (i.e., shape). In practice, not all of the individuals in the reference dataset were found to have k minutiae that were sufficiently similar (as defined by the factory settings of 3M Cogent's matching algorithm, footnote 2) to the ones observed on the fingermark. This process enables us to easily estimate $p_V(v=1 \mid H_d)$ by simply counting the number of individuals retrieved by the system in relation to the number of prints in the database.

The reader will have realized that the process described above has a major implication: the approximation of the denominator in Eq. (5) is performed based on a single impression of each k minutiae configuration in our reference dataset. The denominator of our model does not account for the within-source variability (due, for example, to the distortion of finger pads) of the selected k minutiae for each individual in the reference dataset. This allows for considerable simplifications in the development of our model at the cost of some of its coherence. We believe that, for a large reference dataset, this process still allows for a reasonable approximation of the specificity of the k minutiae configuration observed on the fingermark.

1 Current technical limitations due to the set-up of our AFIS prevented us from doing it.

2 These settings are proprietary and not known to the authors. They influence the model as they condition which configurations from the reference sets are subsequently used by the model. However, this algorithm has been extensively tested by the National Institute of Standards and Technology (NIST) and appears to be very efficient at retrieving configurations that are similar in shape and location [22].
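The counting estimate of $p_V(v=1 \mid H_d)$ and its role in Eq. (5) can be illustrated with a minimal Python sketch. The function names and all numbers in the usage lines are hypothetical and chosen only for illustration; they are not results from the paper.

```python
import math


def estimate_p_v(n_retrieved, n_database):
    """Relative frequency of reference prints retrieved by the matcher (v = 1)."""
    if n_database <= 0 or not 0 <= n_retrieved <= n_database:
        raise ValueError("counts must satisfy 0 <= n_retrieved <= n_database")
    return n_retrieved / n_database


def combined_log10_lr(lr_shape, lr_direction, lr_type, p_v):
    """log10 of Eq. (5): product of the three component ratios divided by p_V."""
    return (math.log10(lr_shape) + math.log10(lr_direction)
            + math.log10(lr_type) - math.log10(p_v))


# Hypothetical numbers: 100,000 configurations retrieved from 4,000,000 prints.
p_v = estimate_p_v(100_000, 4_000_000)
log10_lr = combined_log10_lr(lr_shape=50.0, lr_direction=4.0, lr_type=2.0, p_v=p_v)
```

Working in log10 keeps the product of small and large factors numerically stable. Note that a smaller $p_V(v=1 \mid H_d)$ increases the LR, so the retrieval step itself contributes evidential weight.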


Table 1
Spearman rank correlation coefficients between the form factors measured on triangles (T1 to T12) from approximately 100,000 12 minutiae reference configurations paired with a single fingermark with 12 minutiae.

Shape  T1     T2     T3     T4     T5     T6     T7     T8     T9     T10    T11    T12
T1     1
T2     0.211  1
T3     0.021  0.278  1
T4     0.040  0.055  0.188  1
T5     0.042  0.008  0.008  0.308  1
T6     0.113  0.001  0.027  0.037  0.156  1
T7     0.028  0.074  0.004  0.020  0.033  0.326  1
T8     0.009  0.047  0.041  0.072  0.004  0.008  0.185  1
T9     0.021  0.001  0.035  0.041  0.023  0.000  0.054  0.194  1
T10    0.009  0.100  0.059  0.119  0.018  0.029  0.000  0.066  0.221  1
T11    0.007  0.014  0.052  0.051  0.019  0.033  0.020  0.067  0.016  0.119  1
T12    0.180  0.031  0.009  0.046  0.022  0.043  0.005  0.051  0.069  0.012  0.325  1
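For readers who want to reproduce this kind of table on their own data, a Spearman rank correlation can be computed as the Pearson correlation of the ranks. The following is a minimal pure-Python sketch with average ranks for ties; the function names are hypothetical and this is not the authors' code.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for idx in order[i:j + 1]:
            ranks[idx] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation between two equally long sequences."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def spearman_matrix(columns):
    """Lower-triangular matrix of coefficients, one row per column of data."""
    return [[spearman(columns[i], columns[j]) for j in range(i + 1)]
            for i in range(len(columns))]
```

Here each element of `columns` would hold the form factors of one triangle position (T1, T2, ...) across all reference configurations.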

3.3. Shape component of the model

From Eq. (5) and Section 3.1, we rewrite the shape element of the model as:

$$\mathrm{LR}_S = \frac{p_{Y_S \mid X_{\min},V}(y_S^{(k)} \mid H_p, v=1)}{p_{Y_S \mid V}(y_S^{(k)} \mid H_d, v=1)} = \frac{p_{Y_S \mid X_{\min},V}(y_{S,1}^{(k)}, \ldots, y_{S,k}^{(k)} \mid H_p, v=1)}{p_{Y_S \mid V}(y_{S,1}^{(k)}, \ldots, y_{S,k}^{(k)} \mid H_d, v=1)} \tag{6}$$

where $Y_{S,i}^{(k)}$ represents the shape measurements performed on the ith triangle in the k minutiae configuration. In order to simplify the modeling of the joint distributions in Eq. (6), we assume that the shape of triangle i is mostly influenced by its immediate neighbors. This assumption is reasonable as adjacent triangles share one side with each other, while non-adjacent triangles share only one vertex. Table 1 presents the Spearman rank correlation coefficients between the form factors measured on triangles from more than 100,000 12 minutiae configurations that have been paired with a single fingermark configuration. The coefficients show weak correlation between immediately adjacent triangles, and no correlation between non-adjacent triangles on the same configuration.

Removing dependencies between non-adjacent triangles forces us to select a first triangle in each configuration, and to assign a marginal probability to this triangle, rather than a joint probability. To set the first triangle, we remember that each $Y_{S,i}^{(k)}$ is a bi-dimensional variable containing the form factor and the aspect ratio of triangle i. The form factor and the aspect ratio are functionally independent and may capture the shape of a triangle in complementary ways. To select the first triangle, we decided to use the aspect ratio information of the k triangles in the configuration. We set $Y_{S,1}^{(k)} = \min_{1 \le i \le k} Y_{S,i}^{(k)}$ based on the aspect ratio variable of each triangle, and then register the remaining $k - 1$ triangles counterclockwise. Since the k minutiae in the fingermark are paired with the k minutiae in the fingerprint under consideration (by the human examiner) and the reference prints (by the AFIS matching algorithm), all triangles in these prints can be reordered according to $Y_S^{(k)}$. Once organized counterclockwise starting from the triangle with the smallest aspect ratio, the aspect ratio component from $Y_{S,i}^{(k)}$ is dropped and $Y_{S,i}^{(k)}$ is considered to be univariate and to only include the form factor of the triangle. In other words, $Y_{S,i}^{(k)}$ is a variable capturing the form factor of each triangle, and the representations $y_{S,i}^{(k)}$ of this variable are organized according to the aspect ratio of the triangles. Based on this development, Eq. (6) can be rewritten as:

$$\mathrm{LR}_S = \frac{p_{Y_S \mid X_{\min},V}(y_{S,1}^{(k)} \mid H_p, v=1)}{p_{Y_S \mid V}(y_{S,1}^{(k)} \mid H_d, v=1)} \times \frac{p_{Y_S \mid X_{\min},V}(y_{S,2}^{(k)} \mid y_{S,1}^{(k)}, H_p, v=1) \cdots p_{Y_S \mid X_{\min},V}(y_{S,k}^{(k)} \mid y_{S,k-1}^{(k)}, H_p, v=1)}{p_{Y_S \mid V}(y_{S,2}^{(k)} \mid y_{S,1}^{(k)}, H_d, v=1) \cdots p_{Y_S \mid V}(y_{S,k}^{(k)} \mid y_{S,k-1}^{(k)}, H_d, v=1)} \tag{7}$$
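The chain structure of Eq. (7) can be sketched in Python as a sum of log density ratios. This is a sketch under stated assumptions, not the authors' implementation: it uses normal densities fitted to pseudo-mark data for the numerator and Gaussian-kernel density estimates for the denominator (the choices the remainder of this section describes), while the bandwidth rule (1.06 σ n^(-1/5)), the function names and the toy data are hypothetical.

```python
import math
import random


def mean_var(values):
    n = len(values)
    m = sum(values) / n
    return m, sum((v - m) ** 2 for v in values) / (n - 1)


def normal_logpdf(x, mean, var):
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def log_num_conditional(prev, cur, y_prev, y_cur):
    """log p(y_cur | y_prev) under a bivariate normal fitted to paired samples."""
    m1, v1 = mean_var(prev)
    m2, v2 = mean_var(cur)
    cov = sum((a - m1) * (b - m2) for a, b in zip(prev, cur)) / (len(prev) - 1)
    cond_mean = m2 + cov / v1 * (y_prev - m1)
    cond_var = v2 - cov ** 2 / v1
    return normal_logpdf(y_cur, cond_mean, cond_var)


def bandwidth(values):
    # Assumed rule of thumb (normal reference); the paper does not state one.
    return 1.06 * math.sqrt(mean_var(values)[1]) * len(values) ** (-0.2)


def kernel(x, centre, h):
    return math.exp(-0.5 * ((x - centre) / h) ** 2) / (h * math.sqrt(2.0 * math.pi))


def log_den_marginal(sample, y):
    h = bandwidth(sample)
    return math.log(sum(kernel(y, x, h) for x in sample) / len(sample))


def log_den_conditional(prev, cur, y_prev, y_cur):
    """log of the KDE joint density divided by the KDE marginal of y_prev."""
    h1, h2 = bandwidth(prev), bandwidth(cur)
    joint = sum(kernel(y_prev, a, h1) * kernel(y_cur, b, h2) for a, b in zip(prev, cur))
    marginal = sum(kernel(y_prev, a, h1) for a in prev)
    return math.log(joint / marginal)


def log_lr_shape(y, pseudo_marks, reference):
    """Natural log of Eq. (7); rows of both datasets hold k ordered form factors."""
    num = list(zip(*pseudo_marks))  # columns: one per triangle position
    den = list(zip(*reference))
    m, v = mean_var(num[0])
    total = normal_logpdf(y[0], m, v) - log_den_marginal(den[0], y[0])
    for i in range(1, len(y)):
        total += log_num_conditional(num[i - 1], num[i], y[i - 1], y[i])
        total -= log_den_conditional(den[i - 1], den[i], y[i - 1], y[i])
    return total


# Hypothetical toy data: tight pseudo-marks around the mark, diffuse reference set.
rng = random.Random(0)
mark = [0.12, 0.15, 0.10, 0.14]
pseudo_marks = [[v + rng.gauss(0.0, 0.01) for v in mark] for _ in range(200)]
reference = [[rng.uniform(0.05, 0.25) for _ in mark] for _ in range(500)]
demo_log_lr = log_lr_shape(mark, pseudo_marks, reference)
```

In the toy data the pseudo-marks cluster tightly around the mark while the reference values are spread out, so the log ratio comes out positive. The rows are assumed to be already reordered so that the first column is the triangle with the smallest aspect ratio.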

The numerator of an LRS describes the probability of observing the configuration of k minutiae on the fingermark if it originates from the same finger as the selected fingerprint. Ideally, assigning this probability would require the source of the fingerprint to generate multiple pseudo-marks in various conditions of distortion and pressure, and to model the distribution of the shape of the considered k minutiae configuration across these pseudo-marks. In practice, this is unrealistic and we used the distortion model from Neumann et al. [16], based on Bookstein [23], to generate pseudomarks from the fingerprint. The denominator relates to the probability of observing the configurations of k minutiae on the fingermark in a relevant population defined by Hd. The dataset necessary to approximate the denominator is obtained as

Table 2
Spearman rank correlation coefficients between the directions of neighboring minutiae in 100,000 reference prints paired with a single fingermark with 12 minutiae.

Direction  T1     T2     T3     T4     T5     T6     T7     T8     T9     T10    T11    T12
T1         1
T2         0.022  1
T3         0.056  0.022  1
T4         0.003  0.004  0.006  1
T5         0.015  0.003  0.026  0.062  1
T6         0.020  0.049  0.048  0.167  0.101  1
T7         0.098  0.164  0.039  0.084  0.168  0.093  1
T8         0.020  0.128  0.064  0.003  0.015  0.015  0.030  1
T9         0.042  0.077  0.056  0.059  0.169  0.069  0.068  0.059  1
T10        0.012  0.053  0.015  0.002  0.127  0.016  0.038  0.065  0.122  1
T11        0.044  0.046  0.005  0.027  0.152  0.037  0.019  0.020  0.142  0.118  1
T12        0.083  0.000  0.016  0.069  0.018  0.063  0.070  0.082  0.047  0.085  0.049  1


described above, using the sorting power of our AFIS algorithm to find the appropriate configurations to consider under Hd.

In general, the histogram estimates for the distributions of the form factors of the triangles extracted from multiple impressions of a single k minutiae configuration (i.e., under Hp) are reasonably symmetrical and unimodal. However, the histogram estimates for the distributions of the form factors of triangles extracted from k minutiae configurations originating from different donors (i.e., under Hd) are moderately to highly skewed on their right or left tails. Therefore, we decided to model the numerator distributions using uni- and bivariate normal densities, and we chose not to impose any parametric assumption on the structure of the densities of the denominator distributions, learning the density functions from the data using kernel density estimation.

3.4. Direction element of the model

From Eq. (5) and Section 3.1, we rewrite the direction element of the model as:

$$
LR_D = \frac{p_{Y_D \mid X_{min},V}\left(y_D^{(k)} \mid H_p, v=1\right)}{p_{Y_D \mid V}\left(y_D^{(k)} \mid H_d, v=1\right)} = \frac{p_{Y_D \mid X_{min},V}\left(y_{D,1}^{(k)}, \ldots, y_{D,k}^{(k)} \mid H_p, v=1\right)}{p_{Y_D \mid V}\left(y_{D,1}^{(k)}, \ldots, y_{D,k}^{(k)} \mid H_d, v=1\right)} \tag{8}
$$

As explained previously, the direction of each minutia in a configuration is described with respect to an axis defined by the centroid of the configuration and the minutia itself. This allows for obtaining directional information on the minutiae in the configuration regardless of the rotation and location of the impression on the image. This transformation also reduces the dependency between the directions measured for neighboring minutiae. Table 2 shows the Spearman rank correlation coefficients between the minutiae directions of the same 12-minutiae configurations as in Table 1. It confirms that few neighboring minutiae have weakly correlated directions (as measured in our study) and that most show low to no correlation.
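For readers who wish to reproduce coefficients of the kind reported in Tables 1 and 2, the following is a minimal pure-Python sketch of the Spearman rank correlation, computed as the Pearson correlation of the ranks with ties receiving their average rank. The function names are illustrative and the sketch makes no claim about the software used in the study; an established statistics library would normally be preferred.

```python
import math
from typing import List, Sequence

def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman coefficient: Pearson correlation computed on the ranks.
    Assumes equal lengths and that neither input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)
```

Applied to the directions of minutia i and minutia j across all retrieved reference configurations, `spearman` would yield one cell of a matrix such as Table 2.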
By taking advantage of the low correlation between directions of neighboring minutiae, we make the assumption of independence between the minutiae directions (as specifically measured in our project) and obtain the following ratio:

$$
LR_D = \prod_{i=1}^{k} \frac{p_{Y_D \mid X_{min},V}\left(y_{D,i}^{(k)} \mid H_p, v=1\right)}{p_{Y_D \mid V}\left(y_{D,i}^{(k)} \mid H_d, v=1\right)} \tag{9}
$$

As explained previously, we used a distortion model to generate pseudo-marks for the estimation of the density function under Hp and our large reference dataset for the estimation of the density function under Hd. The histogram estimates for the density function of minutiae directions in the fingerprint show that they tend to be skewed to the right, while the estimates for the reference prints show multiple modes. We decided to approximate both distributions using non-parametric distributions based on von Mises kernels [24].

3.5. Type element of the model

From Eq. (5) and Section 3.1, the type element of the model can be rewritten as:

$$
LR_T = \frac{p_{Y_T \mid X_{min},V}\left(y_T^{(k)} \mid H_p, v=1\right)}{p_{Y_T \mid V}\left(y_T^{(k)} \mid H_d, v=1\right)} = \frac{p_{Y_T \mid X_{min},V}\left(y_{T,1}^{(k)}, \ldots, y_{T,k}^{(k)} \mid H_p, v=1\right)}{p_{Y_T \mid V}\left(y_{T,1}^{(k)}, \ldots, y_{T,k}^{(k)} \mid H_d, v=1\right)} \tag{10}
$$

In order to simplify the dimensionality of the probabilities in Eq. (10), we assume that minutiae types are influenced by the location of the minutiae within the pattern of the ridge flow (accounted for in V) but not by each other. Thus, given V, we can make the following simplification:

$$
LR_T = \prod_{i=1}^{k} \frac{p_{Y_T \mid X_{min},V}\left(y_{T,i}^{(k)} \mid H_p, v=1\right)}{p_{Y_T \mid V}\left(y_{T,i}^{(k)} \mid H_d, v=1\right)} \tag{11}
$$

We have previously defined minutia type as a nominal variable such that the ith minutia can take one of the following values: $Y_{T,i}^{(k)} \in \{RE, BI, UK\}$ (ridge ending, bifurcation, unknown). That said, the observation of the type of a minutia on a potentially distorted and degraded fingermark is not only conditioned by the true type of that minutia, but also by the ability of the examiner to correctly interpret the ridge flow. Therefore, we have for the numerator:

$$
\begin{aligned}
p_{Y_T \mid X_{min},V}\left(y_{T,i}^{(k)} \mid H_p, v=1\right) &= \sum_{j \in \{RE,BI,UK\}} p_{Y_T \mid X_{min},V}\left(y_{T,i}^{(k)} = j \mid H_p, v=1\right) \\
&= \sum_{l \in \{RE,BI\}} \sum_{j \in \{RE,BI,UK\}} p_{Y_T \mid X_{min},V}\left(y_{T,i}^{(k)} = j \mid x_{T,i}^{(k)} = l, H_p, v=1\right) p_{X_T \mid V}\left(x_{T,i}^{(k)} = l \mid H_p, v=1\right)
\end{aligned} \tag{12}
$$

Ideally, the $p_{X_T \mid V}(x_{T,i}^{(k)} = l \mid H_p, v=1)$ terms should be assigned by having the examiner annotate the type of the ith minutia on a series of pseudo-marks generated by the source of the fingerprint. Indeed, $p_{X_T \mid V}(x_{T,i}^{(k)} = l \mid H_p, v=1)$ could be developed in a similar fashion as $p_{Y_T \mid X_{min},V}(y_{T,i}^{(k)} = j \mid H_p, v=1)$ by conditioning on the true type of the minutiae observed on the friction ridge skin of the finger pad. However, for all intents and purposes of this project, we consider that there is no uncertainty affecting the determination of the type of a given minutia when observed on fingerprints or pseudo-traces. Thus, in our model, $p_{X_T \mid V}(x_{T,i}^{(k)} = l \mid H_p, v=1)$ takes values {0,1} depending on whether the lth type is observed by the examiner on the ith minutia of the k configuration present on the fingerprint.

The $p_{Y_T \mid X_{min},V}(y_{T,i}^{(k)} = j \mid x_{T,i}^{(k)} = l, H_p, v=1)$ terms take different values depending on the type observed on the fingermark for the ith minutia and on the type observed on the fingerprint for the corresponding minutia. A survey of a series of 82 minutiae [25], each annotated by more than 200 latent print examiners on 12 pairs of fingermarks and fingerprints, reveals that (for any i):

When the ith minutia on the fingermark is deemed to be a ridge ending:
$p_{Y_T \mid X_{min},V}(y_{T,i}^{(k)} = RE \mid x_{T,i}^{(k)} = RE, H_p, v=1) = 0.76$;
$p_{Y_T \mid X_{min},V}(y_{T,i}^{(k)} = RE \mid x_{T,i}^{(k)} = BI, H_p, v=1) = 0.16$;
all other terms equal 0;

When the ith minutia on the latent print is deemed to be a bifurcation:
$p_{Y_T \mid X_{min},V}(y_{T,i}^{(k)} = BI \mid x_{T,i}^{(k)} = RE, H_p, v=1) = 0.16$;
$p_{Y_T \mid X_{min},V}(y_{T,i}^{(k)} = BI \mid x_{T,i}^{(k)} = BI, H_p, v=1) = 0.75$;
all other terms equal 0;


Table 3
Number of reference prints by general pattern and finger number ("Other" includes amputated and missing fingers).

Finger #     1        2        3        4        5        6        7        8        9        10
Arch         6,643    24,552   13,691   3,539    1,577    11,262   26,451   17,992   5,018    2,454
Right loop   153,400  102,885  240,039  138,992  256,983  614      29,363   1,349    994      311
Left loop    714      33,382   1,625    1,667    504      180,496  113,208  219,008  157,587  255,813
Whorl        199,017  123,994  71,248   174,406  54,671   156,064  113,069  71,032   135,197  39,323
Other        45,661   120,622  78,832   86,831   91,700   56,999   123,344  96,054   106,639  107,534
Total        405,435  405,435  405,435  405,435  405,435  405,435  405,435  405,435  405,435  405,435

When the type of the ith minutia on the latent print is unknown:
$p_{Y_T \mid X_{min},V}(y_{T,i}^{(k)} = UK \mid x_{T,i}^{(k)} = RE, H_p, v=1) = 0.08$;
$p_{Y_T \mid X_{min},V}(y_{T,i}^{(k)} = UK \mid x_{T,i}^{(k)} = BI, H_p, v=1) = 0.09$;
all other terms equal 0.

We note that under Hp only one of the $p_{Y_T \mid X_{min},V}(y_{T,i}^{(k)} = j \mid x_{T,i}^{(k)} = l, H_p, v=1)\, p_{X_T \mid V}(x_{T,i}^{(k)} = l \mid H_p, v=1)$ terms is non-null, depending on the types observed on the ith pair of minutiae on the fingermark and fingerprint. Similarly for the denominator, we have:

$$
\begin{aligned}
p_{Y_T \mid V}\left(y_{T,i}^{(k)} \mid H_d, v=1\right) &= \sum_{j \in \{RE,BI,UK\}} p_{Y_T \mid V}\left(y_{T,i}^{(k)} = j \mid H_d, v=1\right) \\
&= \sum_{l \in \{RE,BI\}} \sum_{j \in \{RE,BI,UK\}} p_{Y_T \mid V}\left(y_{T,i}^{(k)} = j \mid H_d, v=1\right) p_{X_T \mid V}\left(l \mid H_d, v=1\right)
\end{aligned} \tag{13}
$$
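The type element can be illustrated with a short Python sketch for the case where the types on the mark and on the print have been observed. It encodes the survey values quoted in the text [25] and follows one reading of Eqs. (11)–(13): the numerator reduces to the single non-null term, and the denominator weights the same survey values by the type frequencies among the retrieved reference configurations. The function names and the example frequencies are hypothetical and are not taken from the paper.

```python
from typing import Dict, Sequence, Tuple

# P(type annotated on the mark = j | type on the print = l), survey values [25].
# Outer key: type on the print (l); inner key: type on the mark (j).
P_MARK_GIVEN_PRINT: Dict[str, Dict[str, float]] = {
    "RE": {"RE": 0.76, "BI": 0.16, "UK": 0.08},
    "BI": {"RE": 0.16, "BI": 0.75, "UK": 0.09},
}

def lr_type_single(mark_type: str, print_type: str,
                   ref_type_freq: Dict[str, float]) -> float:
    """LR for one minutia. `ref_type_freq` is a hypothetical estimate of
    p(l | Hd, v=1): the share of each type among the reference configurations."""
    # Under Hp the print type is taken as known, so one term survives.
    numerator = P_MARK_GIVEN_PRINT[print_type][mark_type]
    # Under Hd, average the same survey values over the reference type frequencies.
    denominator = sum(P_MARK_GIVEN_PRINT[l][mark_type] * ref_type_freq[l]
                      for l in ("RE", "BI"))
    return numerator / denominator

def lr_type(pairs: Sequence[Tuple[str, str]],
            ref_type_freqs: Sequence[Dict[str, float]]) -> float:
    """Product over the k minutiae, as in Eq. (11)."""
    lr = 1.0
    for (mark_type, print_type), freq in zip(pairs, ref_type_freqs):
        lr *= lr_type_single(mark_type, print_type, freq)
    return lr
```

With illustrative frequencies of 0.6 ridge endings and 0.4 bifurcations, a ridge ending seen on both mark and print gives 0.76 / (0.76 × 0.6 + 0.16 × 0.4) = 0.76 / 0.52, a modest contribution that is in line with the limited weight reported for the type component.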

The $p_{X_T \mid V}(l \mid H_d, v=1)$ terms can be assigned by using the distribution of the type of the ith minutia in all reference prints' k minutiae configurations retrieved by the matching algorithm as described in the previous sections.

The $p_{Y_T \mid V}(y_{T,i}^{(k)} = j \mid H_d, v=1)$ terms are assigned using the same values as for the numerator depending on whether a ridge ending, bifurcation or unknown type was observed on the ith minutia of the fingermark.

4. Datasets

The development of the model and the measures of its performance described in the next sections were conducted using multiple datasets that are described below.

4.1. Development dataset

The model was developed using 48, 45 and 33 configurations of, respectively, 4, 8 and 12 minutiae sampled from fingermarks and corresponding fingerprints obtained from archived casework [16]. The histogram estimates of the numerators of the different elements of the model were obtained by generating 2500 pseudo-marks from each fingerprint, using the distortion model from [16]. The dataset used for studying the histogram estimates of the denominators of the different elements of the model contained approximately 12,000 reference prints as in [16]. The minutiae on all fingermarks, fingerprints and reference prints were annotated or verified manually as described in [16].

4.2. Reference dataset

A reference dataset of approximately 4,000,000 reference prints from 405,435 anonymous donors was used in conjunction with our AFIS algorithm to support the assignment of the probability $p_V(v=1 \mid H_d)$. Each reference print is characterized by an identifier number, which is unique to the donor of the print, by its finger number (1–10) and by its general pattern (arch, whorl, loop, other; classification made automatically by the AFIS system) (Table 3). The minutiae on the 4,000,000 prints were extracted automatically using the minutiae detection algorithm of the 3M Cogent AFIS.

Table 4
Configurations used to test the model, presented by number of minutiae and region. Note that "All regions" means configurations spanning across multiple regions, while "core", "delta" or "periphery" implies that all minutiae were sampled from a given region.

# Minutiae    3    4    5    6    7    8    9    10   11   12
All regions   96   99   98   97   97   100  100  96   93   89
Core          151  170  159  144  125  97   72   60   47   33
Delta         61   70   66   57   61   29   25   24   17   14
Periphery     159  180  125  142  101  76   53   39   27   16

4.3. Test datasets

The performance of the model was tested using 565 fingermarks and their corresponding fingerprints: the first 364 fingermarks originate from casework and correspond to the data used to test the model in [16]; an additional 201 fingermarks, developed in casework-like conditions, and their corresponding prints were added to complete the test datasets. Different trained analysts manually annotated the minutiae on the 565 fingermarks and their corresponding fingerprints, in different batches, using PiAnoS4 [26]. Each minutia was paired between the latent and control prints using PiAnoS4's pairing feature. Configurations of 3–12 minutiae were sampled from different regions of the 565 fingermarks and used to test the model; their numbers are given in Table 4 (see footnote 3).

Two test datasets were constructed using the configurations listed in Table 4:

1) A dataset aimed at measuring the performance of the model under Hp. This dataset includes the fingermark configurations listed in Table 4 and the corresponding configurations taken from the paired fingerprints.
2) A dataset aimed at measuring the performance of the model under Hd. To test the model under the most difficult conditions, each fingermark configuration listed in Table 4 was searched against our reference dataset, using our AFIS matching algorithm, in order to retrieve the most similar k configuration out of 4,000,000 impressions. Thus, this dataset includes the fingermark configurations listed in Table 4 and their most similar (according to the specifications of the matching algorithm) counterparts sampled from the reference dataset.

Footnote 3: Note that more configurations than those listed in Table 4 were sampled; however, when searched in the reference dataset, some configurations were not associated with a sufficient number of reference configurations to estimate the density functions.

5. Results and discussion on model performance

Neumann et al. [16] provided data on the expected value of the weight of fingerprint evidence for configurations of 3–12 minutiae under Hp and Hd. This data was used to support the admissibility of fingerprint evidence in U.S. courts [27,28]. However, no data was generated to study the expected value of the weight of fingerprint evidence when the fingermark is thought to have originated from a particular finger, a finger with a particular ridge flow pattern, or from a particular region of the fingerprint. Therefore, the experiments reported below were designed not only to measure the global performance of the model under Hp and Hd as in [16], but also to study the differences between the expected weights of fingerprint evidence under different conditions. Four types of experiments were conducted:

1) Experiment #1: Measure of the global performance of the model under Hp and Hd. This experiment was designed to study the global performance of the model by assigning weights to same-source fingermark/fingerprint comparisons, and to different (but very similar)-sources comparisons, using minutiae configurations spanning indiscriminately across all regions of the test fingermarks (1st row of Table 4). During this experiment, we used the entire reference dataset.
2) Experiment #2: Expected weight of the evidence when the fingermark configuration originates from a specific region. This experiment was designed to study the differences in weight of evidence between configurations with the same number of minutiae, but sampled from different regions of the test fingermarks (2nd–4th rows of Table 4). During this experiment, we used the entire reference dataset.
3) Experiment #3: Expected weight of the evidence when the fingermark configuration is thought to have come from a finger with a specific general pattern. This experiment was designed to study the differences between the weight of evidence of a set of given fingermark configurations when there is information (typically from the fingermark itself) leading the examiner to believe that they have been left by fingers with a specific general pattern. During this experiment, a set of configurations spanning indiscriminately across all regions of the test fingermarks (1st row of Table 4) was used, and our reference dataset was conditioned on three different types of patterns (loop, arch, whorl).
4) Experiment #4: Expected weight of the evidence when the fingermark configuration is thought to have come from a finger with a specific finger number. This experiment was designed to study the differences between the weight of evidence of a set of given fingermark configurations when there is information (typically from the location of the mark) leading the examiner to believe that they originate from a specific finger number (thumb, fore finger, . . .). During this experiment, a set of configurations spanning indiscriminately across all regions of the test fingermarks (1st row of Table 4) was used, and our reference dataset was conditioned based on fingerprints located on thumbs vs. the other 8 fingers.

We were unable to study the behavior of the model when the test configurations are assumed to come from a specific region of the print, since it is currently not possible to constrain our AFIS matching algorithm to only search among cores or deltas of the reference prints. Nevertheless, since our 3M Cogent AFIS algorithm uses image information in addition to the minutiae information to perform its searches, it is reasonable to assume that test configurations containing cores or deltas (as in experiment #2) were primarily searched in areas including cores and deltas on the reference prints.

For each type of experiment, we differentiated the behavior of the modeled part (i.e., the shape (S), direction (D) and type (T) components) from the AFIS part (i.e., $p_V(v=1 \mid H_d)$) of the model.

Experiment #1. Overall performance of the model (shape, direction, type components), all regions

Fig. 3a–d depicts the estimates for the denominator of the three components (S, D and T) of the model, separately (a–c), and jointly (d). We observe that the contribution of the shape variable to the overall weight of evidence is much larger than that of the other ones. We also observe the lack of contribution of the direction component of the model. Overall, Fig. 3a–d shows that the model has the ability to assess the specificity of minutiae configurations; that this specificity increases with the number of minutiae; but that it also varies between configurations with a given number of minutiae, as these configurations have different shapes and combinations of minutiae types. These observations are similar to [16], but also to most of the previous studies listed in [5,14].

Fig. 4a–d presents estimates for the numerators of the three components (S, D and T) of the model, separately (a–c), and jointly (d). The figures on the left-hand side present the data obtained for fingermark and fingerprint configurations originating from the same source, while the figures on the right-hand side present data obtained for the same fingermark configurations when compared with fingerprints provided by different sources (as explained above). Fig. 4a–d shows that the expected probability of observing the features on a fingermark, based on potentially corresponding features observed on the fingerprint, decreases with the number of minutiae. This is not surprising as the increase in the number of minutiae induces increasing variability between multiple impressions of the same set of k minutiae due to pressure, distortion and other factors. When comparing the left (same source) to the right (different sources) columns of Fig. 4a–d, we see that the expected numerator probability decreases faster when fingermarks are compared to fingerprints originating from different sources. This effect is the result of the added discrimination introduced by the increasing number of features. Although it seems that there are not many differences between numerators calculated for same/different sources situations, we remind the reader that the fingerprint configurations used when Hd is assumed to be true were selected to be very similar to the fingermark configurations: we anticipate that the expected numerator probability would decrease even faster when the model is not tested in the most difficult conditions. The somewhat large ranges of values calculated for the numerator of the model can be explained by two elements: (1) the distortion model used in this project does not provide enough variability in the set of pseudo-marks generated from the fingerprints, and thus cannot compensate for medium to large distortion effects of fingermarks; (2) the model is affected by a lack of accuracy from the users annotating the test marks and prints in PiAnoS4.

Fig. 5a–d presents the estimates obtained for the overall LR of the three components (S, D and T) of the model, separately (a–c), and jointly (d). The figures on the left-hand-side column present the estimates obtained for fingermarks compared with fingerprints provided by the true source, while the figures on the


Fig. 3. (a) Estimates for the denominators of the shape component of the model—all regions. (b) Estimates for the denominators of the direction component of the model—all regions. (c) Estimates for the denominators of the type component of the model—all regions. (d) Estimates for the denominators of the model (S, D and T)—all regions.

right-hand-side column present data obtained for the same fingermarks when compared with fingerprints provided by different sources. Overall, we observe that the LRs calculated for pairs of fingermarks and fingerprints originating from the same source increase with the number of minutiae, while they remain centered around LR = 1 for pairs of fingermarks and fingerprints originating from different sources. These results are similar to the ones obtained by Neumann et al. [16] and for most models reported in [5,14]: (1) the expected value of the LR increases with the number of minutiae, (2) a range of LR values is observed for each number of minutiae, indicating that each configuration of minutiae needs to be considered on its own merits, (3) there is no clear cut-off point that would support the idea of a scientific basis for a numerical standard.

We observe a significant number of LRs calculated for pairs of fingermarks/fingerprints from different sources that are above 1 (Fig. 5d, right column), and that therefore would misleadingly provide support for the hypothesis that the fingermark originates from the same finger as the fingerprint (even though it is not true). Furthermore, we observe that a minority of these LRs have extremely high values; thus they are not only misleading, but they carry a significant weight in favor of the wrong hypothesis. Overall, this observation is not concerning. This result can be explained by the method used to select the putative sources' fingerprints when Hd is assumed to be true: by design, we searched our fingermarks in our reference dataset of 4,000,000 prints, and we treated the most similar reference fingerprints as the putative sources in order to provide a "worst case scenario" snapshot of the performance of the model. The selection was based on the 3M Cogent AFIS algorithm, and thus on the similarity between the shapes of fingermark and reference print configurations. We can observe, by comparing Fig. 5a–d (right column), that the overall LRs calculated when the fingermarks and fingerprints are not coming from the same source (Fig. 5d, right column) are mainly driven by the shape component of the LR, while


Fig. 4. (a) Estimates for the numerators of the shape component of the model, all regions. Left: same source fingerprints; right: different source fingerprints. (b) Estimates for the numerators of the direction component of the model, all regions. Left: same source fingerprints; right: different source fingerprints. (c) Estimates for the numerators of the type component of the model, all regions. Left: same source fingerprints; right: different source fingerprints. (d) Estimates for the numerators of the model (S, D and T), all regions. Left: same source fingerprints; right: different source fingerprints.


Fig. 5. (a) Estimates for the LRs of the shape component of the model—all regions. Left: Same source fingerprints; right: different source fingerprints. (b) Estimates for the LRs of the direction component of the model—all regions. Left: Same source fingerprints; right: different source fingerprints. (c) Estimates for the LRs of the type component of the model—all regions. Left: same source fingerprints; right: different source fingerprints. (d) Estimates for the LRs (S, D and T)—all regions. Left: Same source fingerprints; right: different source fingerprints.


its direction and type components tend to correctly provide support for Hd. The results presented in Fig. 5 support the idea that our model has good performance when the putative source has not been generated as a result of a fingerprint database search. By extension, our results confirm that conclusions of fingerprint examinations, when the suspect has been generated through a database search, need to rely on features that exhibit a greater degree of specificity (and of potentially higher quality) than when the suspect has been generated through investigative work.

Experiment #2. Expected value of the weight of fingerprint evidence (S, D and T components), by region

In general, we observe that the component-wise behavior of the model in this experiment is similar to the one reported above. This observation is valid under both Hp and Hd. The overall behavior of the model when considering the configurations sampled in three different regions of the test fingermarks (core, delta and peripheral regions) under Hp is presented in Fig. 6a–c. The similarity of the behavior of the model can be observed by comparing Fig. 6 to Figs. 3–5. Interestingly, it appears that the expected weight of the evidence, as calculated for the shape, direction and type components of the model, does not differ between regions of the friction ridge skin. This seems counterintuitive: we would expect configurations in core and delta regions to be less discriminative than configurations in the periphery of the prints (at least shape-wise), and therefore to have a lower weight of evidence. The explanation of this observation requires us to take a look at the results obtained for the AFIS part of the model, which we have not considered so far.

Experiments #1&2 – Part II. Overall performance of the model and expected value of the weight of evidence by region (AFIS component)

Fig. 6. (a) Estimates for the numerators of the model—shape, direction and type components. Same source—Left: core region; middle: delta region; right: peripheral region. (b) Estimates for the denominators of the model—shape, direction and type components. Same source—Left: core region; middle: delta region; right: peripheral region. (c) Estimates for the LRs of the model—shape, direction and type components. Same source—Left: core region; middle: delta region; right: peripheral region.


Fig. 7a–d presents the values for $p_V(v=1 \mid H_d)$ estimated for the core (a), delta (b) and peripheral regions (c), and for all regions together (d). Fig. 7 shows that as the number of minutiae increases in the configuration, the number of sufficiently similar reference configurations retrieved by the AFIS algorithm decreases. We also observe that this decrease in the number of retrieved configurations varies between the different patterns of the tested configurations sampled from our fingermarks. In particular, it appears that the AFIS algorithm retrieves more reference configurations when a query configuration originating from a delta region is searched, even if that configuration includes a large number of minutiae. We would have expected more differences between the results obtained for peripheral and core regions, since it would seem that the variability of configurations in the periphery of the pattern should be larger than in the core. Nevertheless, we note that cores usually do not include many minutiae (especially in the automatic encoding of AFIS) and that it is difficult to define configurations that are entirely in the core region of a pattern. Thus, there may have been a significant overlap between peripheral and core configurations.


With respect to the observation that the behavior of the shape, direction and type components of the model does not show any difference between the different regions (in relation to Fig. 6), we can now see that most of the difference is already captured by the AFIS part of the model. By design, the shape, direction and type components of the model focus only on the reference configurations that are found to be similar to the selected configuration on the fingermark. These components do not account for the difficulty of finding these similar reference configurations. This is done by the AFIS component. These results are confirmed in the next sections for finger numbers and finger friction ridge patterns (i.e., whorl, loop, arch). Critically, the results presented in Fig. 8 show that a suitably configured commercial AFIS can readily provide information on the weight of fingerprint evidence.

Experiments #3&4. Expected value of the weight of fingerprint evidence when there is information on the origin (general pattern or finger number) of the fingermark configuration (S, D, T and AFIS components)

Fig. 7. (a) Estimates for the AFIS component of the model—core region. (b) Estimates for the AFIS component of the model—delta region. (c) Estimates for the AFIS component of the model—peripheral region. (d) Estimates for the AFIS component of the model—all regions.


When information is available on the general pattern (arch, loop, whorl, . . .), and/or the finger number (thumb, fore finger, . . .) of the finger that has produced the fingermark, the shape, type, direction components of the model behave in a very similar fashion as discussed in relation to Figs. 3–6. The use of the different reference datasets does not allow us to observe any difference between the expected weight of the evidence calculated by the shape, direction and type components of the model (neither under Hp, nor Hd); and most of the effect of using different reference datasets is captured by the AFIS part of the model. Fig. 8a–f presents the effect of using information on the source finger being a thumb vs. another finger, and being an arch, a whorl, or a loop on the AFIS component of the model. Overall, Fig. 8 seems to indicate that the probability of observing any given minutiae configuration on thumbs is lower than on other fingers; and that the probability of observing any given minutiae configuration on a finger with an arch pattern is lower than on a finger with another pattern (i.e., loop and whorl). These observations seem to indicate that there is more variability between configurations located on thumbs and on arch patterns than between configurations located on other fingers/patterns. This may seem counterintuitive to any latent print examiner, whose experience is that it is more difficult to identify latent prints with arch patterns (due precisely to a much lower variability between configurations located on arch patterns). This seemingly counterintuitive result can partially be explained. The pV ðv ¼ 1jHd Þ in Fig. 8 were calculated as described in [20,29]. 
That is, only k minutiae configurations deemed similar to the fingermarks by our AFIS algorithm on the specific fingers/ general patterns were considered; however the number of these configurations was weighted by the total number of fingerprints in the reference dataset to reflect the actual probability of observing the considered finger/pattern combination in the population. This is justified by the choice of propositions for Hp, and Hd, which are person propositions and not finger propositions [20]. This manner of estimating pV ðv ¼ 1jHd Þ explains partially why Fig. 8 shows that the weights of configurations on thumbs are shifted when compared to the weights of the same configurations on the other fingers. For example, Table 3 shows that there are 17,905 thumbs and 95,274 other fingers with arch patterns out of 4,054,350 fingers, which corresponds to a ratio of 0.188 thumb for every other finger with arch pattern and to a base shift of 0.72 on the log10 scale in Fig. 8. Table 5 presents the log10 of the base shift for the various combinations of fingers/patterns used in this study. Fig. 8 provides the actual weight carried by any given configuration when it is thought to be on a particular finger or on a particular pattern: the weight presented in Fig. 8 are the weights that could be reported in casework if one was to present the weights of the test configurations in court. However, Table 5 is provided as a way to quantify how much of the weight of any particular configuration is due simply to the finger number/ pattern combination, and how much of that specificity is due to the variability between minutiae configurations on that finger/ pattern. We note the inconsistency of some of the results for configurations of 12 minutiae thought to originate from thumbs and other fingers on arch patterns: in these cases, a large number of test configurations were not associated with any reference configurations by the AFIS algorithm. 
Since it was not possible to approximate $p_V(v = 1 \mid H_d)$ for these configurations, their results are not shown in Fig. 8a and b; thus, the results shown for configurations of 11–12 minutiae in Fig. 8a and b are based on a very small number of test fingermark configurations, which decreases the accuracy of the estimated expected weight of evidence for these numbers of minutiae.
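The base shift arithmetic quoted above for arch patterns can be checked with a few lines of Python. This is only a sketch of the arithmetic: the two counts are the ones quoted from Table 3, while the function and variable names are introduced here for illustration and are not part of the model. The result, about 0.73 in absolute value, matches the thumb arch vs. other finger arch entry of Table 5.

```python
import math

# Counts quoted in the text from Table 3 of the paper.
THUMBS_WITH_ARCH = 17_905
OTHER_FINGERS_WITH_ARCH = 95_274


def log10_base_shift(count_a: int, count_b: int) -> float:
    """Return log10(count_b / count_a), the shift between two sub-populations.

    Hypothetical helper: it only reproduces the arithmetic in the text,
    not the estimation of the AFIS component itself.
    """
    return math.log10(count_b / count_a)


# Number of arch thumbs for every arch pattern found on another finger.
ratio = THUMBS_WITH_ARCH / OTHER_FINGERS_WITH_ARCH

# Shift on the log10 scale between the two arch sub-populations.
shift = log10_base_shift(THUMBS_WITH_ARCH, OTHER_FINGERS_WITH_ARCH)

print(f"arch thumbs per arch other finger: {ratio:.3f}")
print(f"log10 base shift: {shift:.3f}")
```

The same helper can be applied to any other pair of finger/pattern counts from Table 3 to compare against the corresponding entry of Table 5.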

6. General discussion and conclusion

In this paper, we present a statistical model for the quantification of the weight of forensic evidence. Our model is based on the likelihood ratio framework. According to this framework, the weight of a particular piece of fingerprint evidence is assigned by comparing (a) the likelihood of observing a given fingermark considering that it originates from a particular person and (b) the likelihood of observing that fingermark considering that it originates from a random individual in a relevant population. Contrary to the majority of models proposed in past years (since [30]), the model is not based on similarity measures, but directly attempts to assign probability distributions to fingerprint features.

The model has 3 components, which focus respectively on the spatial relationship, the direction and the type of minutiae that can be observed in any given fingermark configuration. These components are all conditioned on a fourth component, which is designed to mimic the fingerprint comparison process, in which an examiner selects a set of k features on the fingermark and then selects the k potentially corresponding features on a fingerprint out of the n possible features that can be observed on that fingerprint. In our model, an examiner selects the k minutiae on the fingerprint from the considered individual, while an AFIS matching algorithm provided by 3M Cogent, Inc. (Pasadena, CA, USA) performs this selection for a sample of individuals from the relevant population. In practice, any means of selecting the most similar k minutiae out of a set of n can be used. The sets of k minutiae configurations selected by the fingerprint examiners and by the AFIS matching algorithm are then used to support the estimation of the probability density of the spatial relationship, the direction and the type of the minutiae observed on the fingermark.
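As a compact reminder of the framework summarised at the start of this section, the likelihood ratio can be written as below. The symbol $E$, standing for the observations made on the fingermark, is shorthand introduced here for illustration; the expression is the generic form of the ratio and does not reproduce the component-wise factorisation of the model.

```latex
\[
\mathrm{LR} \;=\; \frac{\Pr(E \mid H_p)}{\Pr(E \mid H_d)},
\qquad
\log_{10} \mathrm{LR} \;=\; \log_{10} \Pr(E \mid H_p) \;-\; \log_{10} \Pr(E \mid H_d)
\]
```

Values of $\log_{10} \mathrm{LR}$ above 0 support Hp over Hd, and values below 0 support Hd over Hp.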
On the one hand, the development of the first 3 components requires a series of assumptions and simplifications in order to keep the statistical and computational complexity of the model at an acceptable level. On the other hand, the fourth component simply involves searching for the features observed on a considered fingermark in (a) the fingerprint of a given individual and (b) a large reference set of fingerprints. The performance of the model has been tested in different situations. In particular, the expected weight of evidence of a series of fingermark configurations has been calculated using different assumptions on their origin in terms of finger number or pattern of the ridge flow. The results presented above for all 4 components of the model show that the model works reasonably well, and in particular that the spatial relationship between minutiae observed on fingermarks carries more weight than the direction and the type of these minutiae. This effect is mainly due to the greater robustness of distance measurements between minutiae compared with direction and type assignments to the minutiae. Interestingly, the results show that the AFIS component of the model already captures a significant amount of the specificity of fingerprint features. In other words, our results show that, while the shape, direction and type components of the model enable the quantification of some of the weight of the fingerprint evidence, this weight is provided in addition to the weight already provided by the AFIS algorithm. In fact, it seems that the AFIS component of the model, by design, is more sensitive to the various sub-populations contained in the relevant population of alternative sources, and that it does not require as many assumptions and simplifications as the other components. Thus, this research agrees with Egli et al. [31] and confirms that a model for the quantification of the weight of fingerprint evidence can be designed directly based on an AFIS algorithm, without the need for the added layer of complexity induced by the modeling of fingerprint features. However, we need to reiterate the caveat made earlier in this paper: the use of AFIS scores in models aimed


Fig. 8. (a) Estimates for the AFIS component of the model—arch pattern on thumbs. (b) Estimates for the AFIS component of the model—arch pattern on other fingers. (c) Estimates for the AFIS component of the model—loop pattern on thumbs. (d) Estimates for the AFIS component of the model—loop pattern on other fingers. (e) Estimates for the AFIS component of the model—whorl pattern on thumbs. (f) Estimates for the AFIS component of the model—whorl pattern on other fingers.


Table 5. log10 base shift for the various combinations of fingers/patterns used in this study. This base shift represents the weight of evidence that is simply due to the finger/pattern combination as opposed to the weight of evidence due to differences in the distributions of configurations on the different finger/pattern.

| | Thumb, Arch | Thumb, Loop | Thumb, Whorl | Other fingers, Arch | Other fingers, Loop | Other fingers, Whorl |
|---|---|---|---|---|---|---|
| Thumb, Arch | 1.00 | 1.27 | 1.30 | 0.73 | 1.94 | 1.64 |
| Thumb, Loop | | 1.00 | 0.02 | 0.02 | 0.67 | 0.37 |
| Thumb, Whorl | | | 1.00 | 0.57 | 0.64 | 0.34 |
| Other finger, Arch | | | | 1.00 | 1.21 | 0.91 |
| Other finger, Loop | | | | | 1.00 | 0.30 |
| Other finger, Whorl | | | | | | 1.00 |

at quantifying the weight of fingerprint evidence is not well understood. Thus, it may be that future models should rely on AFIS technology, but not directly on AFIS scores. Our results also show that the model seems to generate a significant amount of misleading evidence when a fingermark is compared to a fingerprint from a different source, and that in some cases the misleading evidence strongly supports the hypothesis that these impressions originate from the same source. These results stem from our experimental design. Indeed, the results provided above concern comparisons between a set of fingermarks and the most similar fingerprints found in a dataset of more than 4,000,000 fingerprints. This provides an idea of the performance of the model in a "worst case scenario" environment. Nevertheless, some improvements in the model are needed to keep the rate of misleading evidence at an acceptable level, even in the harshest conditions.

Overall, this paper provides significant data on the expected probative value of fingermark configurations under different conditions and assumptions regarding sub-populations of potential donors. Taken as a whole, this provides information on the validity of the use of fingerprint evidence to discriminate between individuals in a relatively restricted geographical area. In particular, our results confirm the ones presented in [16]:
1) The expected value of the LR increases with the number of minutiae.
2) A range of LR values is observed for each number of minutiae, indicating that each configuration of minutiae needs to be considered on its own merits.
3) There is no clear cut-off point that would lend itself to the idea of a scientific basis for a numerical standard.
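The rate of misleading evidence discussed earlier in this section can be tallied from likelihood ratios computed under each proposition. The sketch below is a minimal illustration, assuming the common convention that an LR below 1 is misleading when the two impressions share a source and an LR above 1 is misleading when they do not. The function name and the LR values are hypothetical; they are not results from this study.

```python
from typing import Sequence, Tuple


def misleading_evidence_rates(
    lr_same_source: Sequence[float],
    lr_different_source: Sequence[float],
    threshold: float = 1.0,
) -> Tuple[float, float]:
    """Return the two rates of misleading evidence as proportions.

    First value: share of same-source comparisons whose LR falls below
    the threshold (evidence wrongly pointing away from the true source).
    Second value: share of different-source comparisons whose LR exceeds
    the threshold (evidence wrongly pointing to a common source).
    The threshold of 1 is an assumed convention, not taken from the paper.
    """
    if not lr_same_source or not lr_different_source:
        raise ValueError("both LR samples must be non-empty")
    below = sum(1 for lr in lr_same_source if lr < threshold)
    above = sum(1 for lr in lr_different_source if lr > threshold)
    return below / len(lr_same_source), above / len(lr_different_source)


# Hypothetical LR values, for illustration only.
same_source_lrs = [250.0, 12.0, 0.4, 3000.0]
different_source_lrs = [0.001, 0.2, 5.0, 0.03, 0.0007]

rate_same, rate_different = misleading_evidence_rates(
    same_source_lrs, different_source_lrs
)
print(f"misleading among same-source comparisons: {rate_same:.2f}")
print(f"misleading among different-source comparisons: {rate_different:.2f}")
```

In a design such as the one described above, where each fingermark is compared only to the most similar fingerprints in a very large dataset, the second rate would be expected to be higher than for randomly chosen different-source pairs.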
In addition, our results stress that different magnitudes of probative value are needed to reach a conclusion on the source of a mark when the potential source was identified through non-fingerprint evidence, compared to the situation where the potential source was identified through a search in a fingerprint database. In the latter case, a greater degree of specificity and quality in the selected fingerprint features will be necessary for the examiner to reach a conclusion. Our data can readily be used to answer the need for statistical data to support the admissibility of fingerprint evidence in court, as was done with the data presented in [16] on behalf of the State of Minnesota [27,28]. The need for statistical data to support the general scientific foundations of fingerprint evidence was expressed by many legal and scientific commentators and summarized in the recent report from the National Research Council of the American National Academy of Sciences [32]; however, we also believe that fingerprint statistical models could be used in any given case to quantify the evidential weight of any fingerprint comparison. Naturally, such case-specific usage of statistical models implies that the models have gone through extensive scientific and operational validation as described by SWGFAST [33]. Ideally, the rates of misleading evidence [30] of the models should be provided based on a large sample, the various assumptions supporting the models should be tested, and their general accuracy should be assessed. This latter point may be the most challenging. Notwithstanding the importance of the validation process, the use of statistical models in casework also needs to be framed within a larger context of operating procedures, workflow management, and operational benefits and limitations. Our model has not been validated in that sense.

Overall, the main issue faced by scientists developing statistical models for fingerprint evidence is related to the dimensionality and complexity of the variables used to capture friction ridge characteristics. Our attempt to build a model in the feature space required us to make a number of assumptions related to the independence of the features; attempts to build models based on AFIS scores are slowed down by the lack of understanding of the mapping of the feature space to the score space. We believe that the solution lies in building a model in the feature space, where the characteristics are summarized by some means, such as a score as in [16].

Acknowledgments

This research was funded in part by Grant 2010-DN-BX-K267 from the National Institute of Justice of the U.S. Department of Justice awarded to The Pennsylvania State University. This research was supported by data and technology provided by 3M Cogent, Inc. to Two N's Forensics, Inc. In particular, the authors wish to thank Ms. Teresa Wu, Dr. Xian Tang, Mr. Walt Seltz, Mr. Chris Kopcsak, Mr. Martin Kenner and the rest of the team at 3M Cogent for their help in setting up the research AFIS algorithm. Finally, the authors want to thank Ms. Haleigh Boswell, Ms. Chelsea Ellis and Mr. Jonathan Duffy for annotating the fingerprint images used to test the model.

References

[1] Scientific Working Group on Friction Ridge Analysis Study and Technology (SWGFAST), Standards for Examining Friction Ridge Impressions and Resulting Conclusions (Latent/Tenprint), version 2.0, 2013, http://www.swgfast.org/documents/examinations-conclusions/130427_Examinations-Conclusions_2.0.pdf (last verified April 29th 2013).
[2] Interpol European Expert Group on Fingerprint Identification II (IEEGFI II), Part 2: Detailing the Method Using Common Terminology and Through the Definition and Application of Shared Principles, Interpol, Lyon, 2004.
[3] D.R. Ashbaugh, in: V.J. Geberth (Ed.), Qualitative-Quantitative Friction Ridge Analysis: An Introduction to Basic and Advanced Ridgeology, CRC Press, Boca Raton, 1999.
[4] C. Champod, C.J. Lennard, P.A. Margot, M. Stoilovic, Fingerprints and other Ridge Skin Impressions, CRC Press, Boca Raton, 2004.
[5] C. Neumann, Statistics and probabilities as a means to support fingerprint examination, in: R. Ramotowski (Ed.), Lee and Gaensslen's Advances in Fingerprint Technology, 3rd ed., CRC Press, 2012, pp. 419–466.
[6] US v Llera Plaza, Acosta and Rodriguez, US District Court of the Eastern District of Pennsylvania, Criminal No. 98-362-10,11,12.
[7] C. Neumann, Evidence, fingerprint experts seventh circuit upholds the reliability of expert testimony regarding the source of a latent fingerprint, United States v. Harvard, 260 F.3d 597 (7th Cir. 2001), Harvard Law Review, vol. 115, CRC Press, 2002, pp. 2349–2356 no. 8.
[8] United States v Byron Mitchell, Court of Appeals for the Third Circuit, No. 02-2859 (April 29, 2004).
[9] State of Maryland v. Bryan Rose, The Circuit Court for the Baltimore County, K060545.
[10] State of Minnesota v. Jeremy Jason Hull, District Court, Seventh Judicial District, No. 48-CR-07-2336.
[11] Expert Working Group on Human Factors in Latent Print Analysis, Latent Print Examination and Human Factors: Improving the Practice through a Systems Approach, U.S. Department of Commerce, National Institute of Standards and Technology, Washington, DC, 2012, pp. 327–388.
[12] D. Stoney, Measurement of fingerprint individuality, in: H. Lee, R. Gaensslen (Eds.), Advances in Fingerprint Technology, 2nd ed., CRC Press, 2001, pp. 327–388.
[13] Scientific Working Group on Friction Ridge Analysis Study and Technology (SWGFAST), SWGFAST Response to the Research, Development, Testing & Evaluation Inter-Agency Working Group of the National Science and Technology Council, Committee on Science, Subcommittee on Forensic Science, 2011, http://swgfast.org/Resources/111117-ReplytoRDT&E-FINAL.pdf (last verified April 29th 2013).
[14] J. Abraham, C. Champod, C. Lennard, C. Roux, Modern statistical models for forensic fingerprint examinations: a critical review, For. Sci. Int. 232 (1–3) (2013) 131–150.
[15] A.B. Hepler, C.P. Saunders, L.J. Davis, J. Buscaglia, Score-based likelihood ratios for handwriting evidence, For. Sci. Int. 219 (1–3) (2012) 129–140.
[16] C. Neumann, I.W. Evett, J.E. Skerrett, Quantifying the weight of evidence from a forensic fingerprint comparison: a new paradigm, J. R. Stat. Soc. A 175 (2012) 371–415.
[17] H. Stern, Comments on Neumann C., Evett I., Skerrett J. (2012). Quantifying the weight of evidence from a forensic fingerprint comparison: a new paradigm, J. R. Stat. Soc. A 175 (2012) 371–415.
[18] J.B. Kadane, Comments on Neumann C., Evett I., Skerrett J. (2012). Quantifying the weight of evidence from a forensic fingerprint comparison: a new paradigm, J. R. Stat. Soc. A 175 (2012) 371–415.
[19] T. Hotz, A. Munk, Comments on Neumann C., Evett I., Skerrett J. (2012). Quantifying the weight of evidence from a forensic fingerprint comparison: a new paradigm, J. R. Stat. Soc. A 175 (2012) 371–415.
[20] C. Neumann, I.W. Evett, J.E. Skerrett, I. Mateos-Garcia, Quantitative assessment of evidential weight for a fingerprint comparison I. Generalisation to the comparison of a mark with set of ten prints from a suspect, For. Sci. Int. 207 (1–3) (2011) 101–105.
[21] D.V. Lindley, A problem in forensic science, Biometrika 64 (2) (1977) 207–213.
[22] M. Indovina, R.A. Hicklin, G.I. Kiebuzinski, ELFT-EFS Evaluation of Latent Fingerprint Technologies: Extended Feature Sets, National Institute of Standards and Technology Interagency Report #7775, 2011 (http://dx.doi.org/10.6028/NIST.IR.7859).
[23] F. Bookstein, Principal warps: thin-plate splines and the decomposition of deformations, IEEE Trans. Pattn. Anal. Mach. Intell. 11 (6) (1989) 567–585.
[24] K.V. Mardia, P. Jupp, Directional Statistics, 2nd ed., John Wiley and Sons Ltd., 2000.
[25] C. Geissbuehler, Looking for AFIS Close Non-Matches, Master's Thesis, School of Criminal Sciences, University of Lausanne, Switzerland, 2011.
[26] Picture Annotation Software 4 (PiAnoS), School of Criminal Sciences, University of Lausanne, Switzerland, 2012, https://ips-labs.unil.ch/pianos/index.html (last verified April 29th 2013).
[27] State v. Hull, No. 48-CR-07-2336, 2008 (Minn. D. Ct. Cty. of Mille Lacs).
[28] State v. Dixon, No. 27-CR-10-3378, 2011 (D. Ct. Cty. Hennepin, Minn.).
[29] C. Neumann, I.W. Evett, J.E. Skerrett, I. Mateos-Garcia, Quantitative assessment of evidential weight for a fingerprint comparison. Part II. A generalisation to take account of the general pattern, For. Sci. Int. 214 (1–3) (2012) 195–199.
[30] C. Neumann, C. Champod, R. Puch-Solis, N. Egli, A. Anthonioz, A. Bromage-Griffiths, Computation of likelihood ratios in fingerprint identification for configurations of any number of minutiae, J. For. Sci. 52 (1) (2007) 54–64.
[31] N.M. Egli, C. Champod, P. Margot, Evidence evaluation in fingerprint comparison and automated fingerprint identification systems: modelling within finger variability, For. Sci. Int. 167 (2–3) (2007) 189–195.
[32] National Research Council of the National Academy of Sciences, Strengthening Forensic Science in the United States: A Path Forward, Committee on Identifying the Needs of the Forensic Sciences Community; Committee on Applied and Theoretical Statistics, National Academies Press, 2009.
[33] Scientific Working Group on Friction Ridge Analysis Study and Technology (SWGFAST), Standard for the Validation and Performance Review of Friction Ridge Impression Development and Examination Techniques, version 2.0, 2012, http://swgfast.org/documents/validation/121124_Validation-Performance-Review_2.0.pdf (last verified December 14th 2014).
