
Entropy Measurement for Biometric Verification Systems

Meng-Hui Lim, Member, IEEE, and Pong C. Yuen, Senior Member, IEEE

Abstract—Biometric verification systems are designed to accept multiple similar biometric measurements per user due to inherent intrauser variations in the biometric data. This is important to preserve a reasonable acceptance rate of genuine queries and the overall feasibility of the recognition system. However, such acceptance of multiple similar measurements decreases the imposter’s difficulty of obtaining a system-acceptable measurement, thus resulting in a degraded security level. This deteriorated security needs to be measurable to provide truthful security assurance to the users. Entropy is a standard measure of security. However, the entropy formula is applicable only when there is a single acceptable possibility. In this paper, we develop an entropy-measuring model for biometric systems that accept multiple similar measurements per user. Based on the idea of guessing entropy, the proposed model quantifies biometric system security in terms of adversarial guessing effort for two practical attacks. Excellent agreement between analytic and experimental simulation-based measurement results on a synthetic and a benchmark face dataset justifies the correctness of our model and thus the feasibility of the proposed entropy-measuring approach.

Index Terms—Biometric system, entropy measurement, guessing, security.

I. INTRODUCTION

Widespread deployment of biometric verification systems in various applications has led to increasing concerns about the security of systems in authenticating users. Upholding biometric system security is crucial because system security has a direct effect on public acceptance of biometrics technology. To ensure minimum risk of identity theft and financial loss, biometric systems must be carefully designed to achieve low error rates and invulnerability to tampering. Two biometric samples of the same identity are rarely exactly the same due to imperfect image acquisition, changes in physiological/behavioural characteristics, ambient conditions, and user’s interaction with the sensor. To tolerate such biometric variations, similarity between a query biometric representation and the enrolled representation in database is typically measured and compared against a system decision threshold to determine whether a positive match is found.

Manuscript received September 12, 2014; revised December 31, 2014 and March 2, 2015; accepted April 13, 2015. This work was supported in part by the RGC General Research Funds under Grant 211612 and Grant 12201414, and in part by the Hong Kong Baptist University Science Faculty Research Grant. This paper was recommended by Associate Editor P. Bhattacharya. The authors are with the Department of Computer Science, Hong Kong Baptist University, Hong Kong (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TCYB.2015.2423271

In existing template protection schemes, such a decision threshold could be the error correcting capability of an error correcting code (e.g., fuzzy commitment [15] and fuzzy extractor [7]), or the similarity threshold of a one-way transformation (e.g., Biohash [33] and Biophasor [34]). Although this system tolerance could preserve a reasonable acceptance rate of genuine queries, it decreases the imposter’s difficulty of obtaining a system-acceptable measurement, thus resulting in a degraded security level or, equivalently, an increased false accept rate. On top of system tolerance, another factor that could affect system security is the occurrence probability of biometric representations, which can be inferred from the publicly estimable pairwise similarity distribution of biometric representations, often known as the “imposter distribution.” In the case of equal occurrence probability of all possible (discrete) biometric representations, the imposter distribution is a binomial distribution with 0.5 success probability. Hence, no hint is given on which representation is more likely to appear acceptable. An imposter distribution that deviates from such a binomial distribution leaks information about the randomness of the biometric representations, where the degree of deviation implies how nonrandom the biometric representations are. If the system decision threshold, which affects the system false acceptance rate (FAR) and rejection rate, is also known, an attack strategy can be designed to defeat the system effectively. In this paper, we will focus on binary biometric representation, as binary representation (e.g., binary finger [35], iris [28], palm [30], and face [5], [10], [19] representations) has been a common form of biometric representation for the application of error-correcting-code-based template protection schemes.

Ideal biometric system security can rarely be achieved in practice because a system seldom has a zero Hamming distance decision threshold to accept only the exact enrolled biometric representation of a user. To better tolerate intrauser biometric variations, the decision threshold is usually set to a nonzero value for a higher genuine acceptance rate, which also increases the adversarial chance of gaining a false accept and leads to a degradation in system security. Another reason why ideal biometric system security is hard to achieve is that the binary biometric features are usually neither uniform nor independent. First, uniform features can only be extracted when the interclass median (for single-bit discretization) or the population feature distribution (for multibit discretization) can be estimated accurately for equally probable quantization. However, the training set of a biometric system usually has a limited size, which could make it less representative in practice.

2168-2267 © 2015 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

IEEE TRANSACTIONS ON CYBERNETICS

In this case, the estimations from the training set can be imprecise, affecting the precision of quantization points and thus the extraction of uniform binary features. In addition, the uniformity of binary features can also be negatively affected when partial-code-based feature encoding [18] is used in multibit discretization for a better preservation of the discriminability of the extracted features. This is because only a nonuniform part of the codewords in a code is used for feature encoding. Second, biometric features are inherently correlated with one another. Many feature extraction methods such as deep learning [22] and monogenic binary coding [36] preserve such correlation to achieve an excellent recognition performance. Although holistic methods like principal component analysis and independent component analysis are able to extract uncorrelated and independent features in theory, respectively, these methods are often not adopted in practice because their recognition performance is unacceptable from an application point of view. In addition, the extracted features could still be correlated in practice [13], [20]. The entropy loss resulting from the set of nonuniform and nonindependent extracted features leads to an easier adversarial attack, causing system security to degrade. This deteriorated system security needs to be measured not only to provide truthful security assurance to biometric users, but also to allow appropriate comparisons of security across biometric systems whose extracted representations differ in quality and whose decision thresholds differ.

A. Previous Work

Entropy is a standard measure of security. In cryptanalysis, entropy is regarded as a standard measure of unpredictability of the cryptographic key. Maximum entropy is achieved when all combinations of cryptographic key are equiprobable. A common technique to measure entropy on a random variable X is to apply the standard formula [29]

$$H(X) = -\sum_{i} p_i \log_2 (p_i) \tag{1}$$

where $p_i$ denotes the occurrence probability of the ith possible outcome. This formula has been commonly used in quantifying information in the binary biometric representation [17]–[19], measuring security of multibiometric cryptosystem [11], and template protection [12], [16]. Alternatively, Daugman [6] measured discrimination entropy (combinatorial complexity) of iris bits from the degree of freedom of the estimated imposter distribution. However, these entropy measurements correspond to the case of single acceptable possibility.

Variants of entropy used in measuring biometric information include min-entropy [7], relative entropy [1], [31], [32], and distance entropy [9]. Min-entropy is the worst-case entropy that quantifies the unpredictability of a binary representation based on the largest occurrence probability of the representation. Relative entropy or Kullback–Leibler divergence is a measure of the biometric feature information that determines how discriminative a user’s feature distribution is with reference to the population distribution. Distance entropy is a measure that quantifies the hardness of obtaining a close approximation of the user’s biometric by leveraging imposter distribution. These interesting variants are again inappropriate for biometric system entropy measurement due to the lack of consideration of multiple acceptable possibilities.

To address the absence of system-entropy measurement, empirical entropy estimation has been explored, where the hardness of adversarial guessing is quantified. Biometric-relevant approaches include coverage effort [26], guessing distance [3], and minimum decoding complexity [24]. These three measures are individually defined as: 1) the number of guesses required for recovering a certain fraction of the fingerprint minutiae; 2) the number of guesses required to obtain the most likely acceptable element as prescribed by the population feature distribution; and 3) the minimum imposter decoding complexity of a secure sketch for a match, respectively. However, these measures have the following limitations: 1) coverage effort is specific to fingerprint minutiae template transform; 2) guessing distance requires full knowledge of feature distribution, which can be hard to estimate for binary features [37]; and 3) minimum decoding complexity is a security estimate based on an attack strategy that combines a “bounded” brute force attack (guessing with a subset of possibilities) and an FAR attack but it is not clear whether this estimate can precisely capture the adversarial advantage on the entropy loss based on such a bounded brute force attack.

B. Contributions

In this paper, we develop a generic entropy-measuring model for biometric systems. Our model is inspired by the guessing entropy [23], where the entropy is expressed in terms of adversarial guessing effort. We describe the effort as a function of adversarial probability of successful guessing at all possible trials. To estimate these probability terms, we analyze two generic cases: “sampling with replacement” and “sampling without replacement.” For each of these cases, we study an example attack [8], [25] on biometric systems based on a signal-tapping point shown in Fig. 1 and subsequently estimate the successful-guessing probability terms based on the best-predicted adversarial guessing behavior.

The significance of our contribution is twofold.
1) We formulate, for both sampling with and without replacement cases, the probability of adversarial success at all possible trials, which the expected number of trials and system entropy are based on.
2) For the sampling without replacement case, we present an effective guessing strategy to best estimate the adversarial success probability based on the predicted adversarial behavior at all trials in order to achieve a good estimation of the system entropy.

The structure of this paper is organized as follows. In the next section, an analogy of the problem is given. In Section III, the proposed approach of measuring multiple-acceptable-input-based system entropy for the two different sampling cases is described and elaborated. The accuracy of our formulations and the effectiveness of the proposed guessing strategy for the sampling without replacement case are justified experimentally in Section IV. Finally, the conclusion is drawn in Section V.


Fig. 1. Adversarial sampling cases: with replacement (point 1 attack by sensor-signal tapping) and without replacement (point 2 attack by discretizer-signal tapping).

II. ANALOGY

Imagine that a challenger is given a bag of N balls, where each ball is labeled with a unique number. Suppose the challenger is allowed to draw one ball at a time, such that: 1) the ball is returned to the bag after each draw; or 2) the ball is discarded after each draw. Cases 1) and 2) are known as the sampling with and without replacement cases, respectively. Given these cases, it is interesting to investigate how hard or unpredictable it is for the challenger to obtain the first success when there are m ≥ 1 acceptable possibilities. This hardness can often be expressed in terms of average number of trials or entropy. However, the measurement of entropy is not straightforward, as the entropy in (1) is only valid for the case of m = 1 but not m > 1. Fig. 2 shows the average trials to the challenger’s first success for $N = 2^{12}$ and 1 ≤ m ≤ 20. Because every ball has equal probability to be drawn, the average trials E[T], with T denoting the number of guesses, can be quantified as follows:

$$E[T] = \begin{cases} \dfrac{N}{m} & \text{for sampling with replacement} \quad (2) \\[6pt] \dfrac{N+1}{m+1} & \text{for sampling w/o replacement.} \quad (3) \end{cases}$$
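The two closed forms can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names are illustrative, and the without-replacement value is obtained by summing T times the probability of a first success at draw T directly, so that it can be compared with (N + 1)/(m + 1).

```python
from fractions import Fraction


def expected_trials_with_replacement(N: int, m: int) -> Fraction:
    """Closed form (2): every draw succeeds with probability m/N."""
    return Fraction(N, m)


def expected_trials_without_replacement(N: int, m: int) -> Fraction:
    """Exact E[T] by direct summation, for comparison with closed form (3)."""
    expected = Fraction(0)
    p_no_success_yet = Fraction(1)  # probability that draws 1..T-1 all failed
    for T in range(1, N - m + 2):   # at most N - m failures are possible
        balls_left = N - (T - 1)    # failed balls have been discarded
        p_hit = Fraction(m, balls_left)
        expected += T * p_no_success_yet * p_hit
        p_no_success_yet *= 1 - p_hit
    return expected


if __name__ == "__main__":
    N = 2 ** 6  # a small stand-in for the bag size, chosen only for speed
    for m in (1, 2, 5, 20):
        with_repl = expected_trials_with_replacement(N, m)
        without_repl = expected_trials_without_replacement(N, m)
        print(m, float(with_repl), float(without_repl), float(Fraction(N + 1, m + 1)))
```

Exact rational arithmetic is used so that the summed value can be compared with the closed form without rounding concerns.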

The proof for (3) can be found in the Appendix. From Fig. 2, it is observed that the average trials for both sampling with and without replacement cases drop steeply as m grows, in inverse proportion to m and m + 1, respectively, and that they approach each other at large m. This shows that the effort for obtaining an acceptable possibility falls rapidly as the number of acceptable possibilities increases uniformly. This ball-drawing problem can simply be converted to a guessing attack against a biometric verification system. Here, the bag represents the entire guess set, the challenger represents the adversary, the balls represent the guessing possibilities (biometric representations), the acceptable possibilities represent the system-acceptable biometric variations with respect to a genuine user, where the quantity is parameterized by the system decision threshold, and each draw represents a guessing trial. Whether the sampling is with or without replacement depends on the attack, specifically on whether the adversary has direct access to the guess set and is able to recognize and exclude the improbable guesses inferred from her past guesses. These attacks are

Fig. 2. Average trials to the challenger’s first success based on different number of acceptable possibilities for $N = 2^{12}$.

known, in the next section, as sensor and discretizer attacks, respectively. In the context of a guessing attack, the results in (2) and (3) demonstrate that exponentially less guessing effort is needed when the system decision threshold is increased to tolerate larger intrauser biometric variations. However, these results can only be applied when the extracted population biometric representations are equally probable or when there is no predictable structure of acceptable possibilities. In biometric recognition where biometric representations are seldom ideally equiprobable and where acceptable biometric variants fall within a fixed neighborhood of the enrolled biometric template, the quantification of average trials is much more challenging for two reasons. First, the acceptable representations could occur with different probabilities and the probability distribution of binary features is often difficult to estimate [37]. Second, because the acceptable biometric variants are neighbors of the enrolled biometric template, a guessing failure also reveals that the neighbors of the failed guess do not belong to the set of acceptable guesses. To address the issues of average-trial quantification in biometric recognition, we will analyze the sampling without replacement case in great detail. We select an attack based on discretizer-signal tapping to grant the adversary control over her guesses on binary representation. By leveraging the imposter distribution, we show how an effective guessing strategy can be designed to predict the adversarial behavior in such an attack. Then, by formulating the probability of adversarial

Fig. 3. Sampling with replacement case: attack by sensor-signal tapping (also known as sensor attack).

success at all possible trials based on such a guessing strategy, the expected number of trials and system entropy can thus be measured. We will describe this entropy-measuring approach in the next section.

III. OUR APPROACH

On the basis of guessing entropy [23], the biometric-system-entropy-measurement problem can be tackled by expressing entropy in terms of adversarial guessing effort, where the guessing effort is characterized by the average trials to the first success. By observing the relation between the guessing effort (the average number of trials needed to guess an acceptable binary representation) and the entropy, we break the entropy into probability components, where each probability component represents the probability of adversarial success at a guessing trial. By correctly formulating these probability components, we can then derive an entropy measure for system security. Hence, the main challenge lies in how we can correctly formulate the probability of adversarial success at all guessing trials to yield a good estimation of system entropy for two generic adversarial sampling cases.

System-Entropy-Measuring Model: It is known that for the sampling with replacement case, an n-bit entropy implies an average of $2^n$ guessing attempts for an adversary to obtain the correct guess, whereas for the sampling without replacement case, an n-bit entropy implies $(2^n + 1)/2$ brute force guessing attempts on average. Mathematically, given an entropy H(X) for a variable X with possible values $\{1, \ldots, 2^{H(X)}\}$, the average trials E[T] can be expressed by

$$E[T] = \begin{cases} 2^{H(X)} & \text{for sampling with replacement} \quad (4) \\[6pt] \dfrac{2^{H(X)} + 1}{2} & \text{for sampling w/o replacement.} \quad (5) \end{cases}$$

Rearranging (4) and (5), we obtain an expression for H(X) in terms of E[T]

$$H(X) = \begin{cases} \log_2 (E[T]) & \text{for sampling with replacement} \quad (6) \\[6pt] \log_2 (2E[T] - 1) & \text{for sampling w/o replacement.} \quad (7) \end{cases}$$

By definition, E[T] can alternatively be described by

$$E[T] = \sum_{T=1}^{T_{\max}} T \cdot P(X_{\mathrm{trial}} = T) \tag{8}$$

where $P(X_{\mathrm{trial}} = T)$ denotes the probability of taking T trials for the first adversarial success in guessing and $T_{\max}$ denotes the maximum number of trials, which depends on the guessing strategy. On the basis of (8), if we can formulate $P(X_{\mathrm{trial}} = T)$ precisely for all possible values of T, we can then formulate H(X) for a biometric system with multiple acceptable representations. To formulate $P(X_{\mathrm{trial}} = T)$, we need to identify the most probable adversarial behavior (most effective guessing strategy) for the different sampling cases.

Signal-Tapping Attacks: For each sampling case, we consider an attack based on a signal-tapping point [8], [25] in a generic biometric system. The sensor-signal-tapping-based attack is an instance of the sampling with replacement case, where the adversary submits a biometric signal of her choosing at tapping point 1 illustrated in Fig. 1 without being able to observe the discretization output that is also the input of the template protection scheme. This attack can occur when sensors are not embedded in the biometric systems. The discretizer-signal-tapping-based attack is an instance of the sampling without replacement case, where the adversary can gain access to the discretization output at tapping point 2 and can therefore intercept and modify a transmitted signal at the tapping point. This attack can occur when the discretized feature representation is transmitted to a remote template protector (e.g., over the Internet), where the adversary could snoop on the TCP/IP stack and alter certain packets [25]. We call these two attacks sensor and discretizer attacks in short.

For the sensor attack, the adversary could use a large set of biometric inputs to attempt one after another until a biometric input is accepted, as shown in Fig. 3.
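As a rough numerical sketch of how (6), (7), and (8) fit together, the snippet below evaluates (8) for two toy distributions of the first-success trial. The function names and the value `p_accept = 1/8` are illustrative assumptions only. The with-replacement example assumes that every trial succeeds independently with the same probability, which is the situation of an adversary who learns nothing from rejected inputs, and the infinite sum is truncated at a hypothetical `t_max`.

```python
import math


def expected_trials(p_first_success):
    """Equation (8): p_first_success[T - 1] holds P(X_trial = T) for T = 1..T_max."""
    return sum(T * p for T, p in enumerate(p_first_success, start=1))


def entropy_with_replacement(e_t):
    """Equation (6): H(X) = log2(E[T])."""
    return math.log2(e_t)


def entropy_without_replacement(e_t):
    """Equation (7): H(X) = log2(2 E[T] - 1)."""
    return math.log2(2 * e_t - 1)


def geometric_first_success(p_accept, t_max):
    """P(first success at trial T) when each trial succeeds independently
    with probability p_accept (an assumed, fixed per-trial acceptance rate)."""
    return [p_accept * (1 - p_accept) ** (T - 1) for T in range(1, t_max + 1)]


if __name__ == "__main__":
    # With replacement: one acceptable outcome out of 8, truncated at 2000 trials.
    e_with = expected_trials(geometric_first_success(1 / 8, 2000))
    print(e_with, entropy_with_replacement(e_with))

    # Without replacement: 8 equiprobable candidates, one acceptable,
    # so the first success is equally likely at each of the 8 trials.
    e_without = expected_trials([1 / 8] * 8)
    print(e_without, entropy_without_replacement(e_without))
```

Both toy cases describe a single acceptable outcome among eight equiprobable ones, so both conversions should report about 3 bits even though the average numbers of trials differ.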
In this attack, the system is viewed as a “black box” because the adversary could not observe the binary feature representation (discretization output) corresponding to her raw biometric input guess. Hence, during the guessing process, there is no information that can be learnt by the adversary from her past rejected inputs. As there is a possibility where a previously rejected binary representation generated from a new biometric input can be resubmitted to the system when different users’ biometric inputs are erroneously discretized to a common binary representation, this attack falls under the case of sampling with replacement on the population binary representations. For the discretizer attack, the adversary is able to observe the binary feature representation (discretization output) corresponding to a biometric input. With this, she could avoid resubmitting past unsuccessful binary representations

Fig. 4. Sampling without replacement case: attack by discretizer-signal tapping (also known as discretizer attack).

to the system. This attack therefore falls under the case of sampling without replacement on the population binary representations.

Before analyzing these attacks further, we make two assumptions. First, we assume the adversary has complete knowledge of the type of feature extractor, discretizer, and template protector/classifier and their parameters. With this knowledge, the adversary can estimate an imposter distribution from any locally collected dataset. Second, we assume that the adversary knows the system decision threshold. This allows the adversary to learn the radius of the neighborhood of guessing possibilities so that neighbors of past failed guesses can be excluded in the upcoming guessing attempts.

1) Sensor Attack: The knowledge of system components and parameters does not do the adversary any good in this attack. This is because the adversary cannot observe the binary representation produced by her biometric input. As a result, there is no better strategy than randomly guessing the biometric inputs in her database one after another. In this case, the quantification of $P(X_{\mathrm{trial}} = T)$ in (8) can be described by

$$P(X_{\mathrm{trial}} = T) = \mathrm{FAR} \cdot \prod_{t=1}^{T-1} (1 - \mathrm{FAR}) \quad \text{for } T \ge 1 \tag{9}$$

where FAR indicates the probability of the first adversarial success and $\prod_{t=1}^{T-1} (1 - \mathrm{FAR})$ indicates the past consecutive failures. By substituting (9) into (8) and (6), the average trials E[T] and the entropy H(X) can be derived, respectively.

2) Discretizer Attack: The knowledge of system components and parameters allows the adversary to estimate the imposter distribution and design a guessing strategy in this attack. By intercepting the transmitted signal between the feature extraction and template protection, the adversary could present a single biometric

input to the sensor and repeatedly modify the intercepted binary representation according to the estimated imposter distribution until a modified binary representation is accepted [9], as shown in Fig. 4(i). In this case, the adversary could access the entire guess set, where a certain subset is acceptable. As the adversary could modify the discretization output, the quantification of $P(X_{\mathrm{trial}} = T)$ depends very much on the strategy of modification (guessing). In the following, we present an effective guessing strategy to best predict the adversarial behavior in this attack for a good estimation of $P(X_{\mathrm{trial}} = T)$.

A. Predicting Adversarial Behavior for the Sampling Without Replacement Case

In the case of sampling without replacement, an adversary will not repeat any unsuccessful guess. Whenever a guessing attempt is unsuccessful, the incorrect binary guess is not replaced with another guess, but is modified according to the information obtained from the imposter distribution. Because the distance of the discretization output from the enrolled representation follows the imposter distribution, this allows the potentially nonbinomial imposter distribution to take effect in facilitating the adversarial attack.

Suppose that at the tth trial, an adversarial modification vector $X_{\mathrm{mod}}^{(t)}$ is applied on the discretization output $X_{\mathrm{bin}}$ through a function f to produce another (output) guess on the biometric representation $X_{\mathrm{out}}^{(t)} = f(X_{\mathrm{bin}}, X_{\mathrm{mod}}^{(t)})$, where $X_{\mathrm{out}}^{(t)}$, $X_{\mathrm{mod}}^{(t)}$, and $X_{\mathrm{bin}}$ are all n-bit binary strings. To ensure that $X_{\mathrm{out}}^{(t)}$ covers all possible values of the binary string, we adopt the simplest XOR operation for function f, such that $X_{\mathrm{out}}^{(t)} = f(X_{\mathrm{bin}}, X_{\mathrm{mod}}^{(t)}) = X_{\mathrm{bin}} \oplus X_{\mathrm{mod}}^{(t)}$.

We let $X_{\mathrm{bin}} = c_E(k_1)$ be the binary string that has a $k_1$-bit distance from the enrolled representation of the target user; $X_{\mathrm{mod}}^{(t)} = c_O(k_2)$ be the binary string that has a $k_2$-bit distance from an all-zero codeword; and $X_{\mathrm{out}}^{(t)} = c_E(k_3)$ be the binary string that has a $k_3$-bit distance from the enrolled representation of the target user. More specifically, $k_1$, $k_2$, and $k_3$ denote the initial Hamming distance of the obtained representation from the enrolled representation, the Hamming weight of the modification vector, and the final Hamming distance of the modified representation from the enrolled representation, respectively

$$X_{\mathrm{out}}^{(t)} = X_{\mathrm{bin}} \oplus X_{\mathrm{mod}}^{(t)} \implies c_E(k_3) = c_E(k_1) \oplus c_O(k_2). \tag{10}$$
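The relation in (10) can be illustrated with a small sketch. Everything named here is hypothetical: bit strings are stored as Python integers, `random_codeword` stands in for drawing some $c_O(\cdot)$ of a given Hamming weight, and the enrolled template is generated at random purely for illustration (in the attack it is unknown to the adversary).

```python
import random


def hamming_weight(x: int) -> int:
    """Number of bits set to 1."""
    return bin(x).count("1")


def random_codeword(n: int, weight: int, rng: random.Random) -> int:
    """Return an n-bit integer with exactly `weight` bits set, i.e. some c_O(weight)."""
    word = 0
    for pos in rng.sample(range(n), weight):
        word |= 1 << pos
    return word


def modified_distance(n: int, k1: int, k2: int, rng: random.Random) -> int:
    """Apply one XOR modification as in (10) and return the resulting k3."""
    enrolled = rng.getrandbits(n)                   # c_E(0), illustrative only
    x_bin = enrolled ^ random_codeword(n, k1, rng)  # c_E(k1): k1 bits from the template
    x_mod = random_codeword(n, k2, rng)             # c_O(k2): modification of weight k2
    x_out = x_bin ^ x_mod                           # equation (10)
    return hamming_weight(x_out ^ enrolled)         # k3: distance of the new guess


if __name__ == "__main__":
    rng = random.Random(0)
    n, k1, tau = 16, 5, 2  # hypothetical code length, initial distance, threshold
    for k2 in range(0, 8):
        k3 = modified_distance(n, k1, k2, rng)
        print(k2, k3, "accepted" if k3 <= tau else "rejected")
```

Whatever vectors are drawn, $k_3$ can never be smaller than $|k_1 - k_2|$, which is why knowing $k_1$ would tell the adversary which modification weights $k_2$ are worth trying.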

It is noted that given the initial discretization output $X_{\mathrm{bin}} = c_E(k_1)$, the value of $k_1$ is unknown because the enrolled template $c_E(0)$ is not known to the adversary. If in the tth trial, $c_O(k_2)$ is applied such that $c_E(k_1) \oplus c_O(k_2)$ produces a $c_E(k_3)$ with $k_3$ not greater than the system decision threshold τ, then the adversary is considered to have succeeded. The problem of guessing an acceptable codeword $c_E(k_3 \le \tau)$ can be cast as a problem of guessing $k_1$ because if the value of $k_1$ can be guessed correctly, the possibilities of modification vectors $c_O(k_2)$ that can be applied for achieving $c_E(k_3 \le \tau)$ can significantly be narrowed down, thus allowing a quicker guessing success. Since the imposter distribution $P_i$ characterizes the distribution of $c_E(k_1)$, a (practical) nonbinomial distribution would often reveal which of the $k_2$ values is more probable for a success. Based on this distribution, the adversary is able to gain an advantage and produce a modification vector $c_O(k_2)$ that would maximize the probability of getting an acceptable modified representation $c_E(k_3 \le \tau)$. With this, we present an effective guessing strategy to identify a modification vector $c_O(k_2)$ at each trial to maximize $P(X_{\mathrm{trial}} = T)$ based on $P_i$, as shown in Fig. 4(ii). By concatenating the modification vectors over all possible trials, we eventually obtain a guessing sequence of modification vectors that can be applied sequentially to obtain a quicker adversarial success for any nonbinomial imposter distribution.

Guessing Strategy: In each trial, the proposed guessing strategy determines the most probable $k_2$ and then assigns a random, not previously rejected modification vector $c_O(k_2)$ as the modification vector for that trial. The optimal $k_2$ value at trial T, namely $k_{2(\mathrm{opt})}^{(T)}$, can be sought as follows:

$$k_{2(\mathrm{opt})}^{(T)} = \arg\max_{k_2^{(T)}} P(X_{\mathrm{trial}} = T). \tag{11}$$

Our strategy maximizes P(Xtrial = T) by narrowing down the possible values that the discretization output cE (k1 ) can take. For each pair of k1 and k2 values, this strategy initially quantifies the maximum guessing failures based on the number of cO (k2 ) that does not result in cE (k3 ≤ τ ). As the guessing proceeds, this strategy keeps track of the number of rejected cO (k2 ) during the trials for each pair of k1 and k2 values. a) Infeasibility of k1 : If any of these numbers reaches the (precalculated) maximum guessing failures, the corresponding k1 value is then regarded infeasible in the computation of P(Xtrial = T) from that guessing trial onward. b) Infeasibility of k2 : If all numbers corresponding to a k2 value are zero, which implies that there are no cE (k1 ) codewords corresponding to any k2 -bit modification vector that would lead to a success (cE (k3 ≤ τ )), then the

Fig. 5. Proposed guessing strategy for the sampling without replacement case.

value of $k_2$ can be excluded from consideration in (11) for the subsequent trials. The algorithmic description of our strategy is given in Fig. 5.

Maximum Number of Guessing Failures $\alpha_{k_1 k_2}$: Given a pair of $k_1$ and $k_2$ values, for any $c_E(k_1)$, there is a fixed number $\gamma_{k_1 k_2}$ of $c_O(k_2)$ that lead to a $c_E(k_3 \le \tau)$ (⇒ system acceptance) and there are $\binom{n}{k_2} - \gamma_{k_1 k_2}$ of $c_O(k_2)$ that would lead to a $c_E(k_3 > \tau)$ (⇒ system rejection). If an adversary, who always guesses a different $c_O(k_2)$ for a $k_2$ value, is rejected more than $\binom{n}{k_2} - \gamma_{k_1 k_2}$ times, the implication is that the corresponding $k_1$ value cannot be the actual bit difference between the discretization output and the enrolled representation. Otherwise, one of the guesses would have led to an adversarial success ($c_E(k_3 \le \tau)$). Hence, to test against a $k_1$ value, there is a maximum number of guessing attempts, for each $k_2$ value, that an adversary is willing to make, which can be quantified as

$$\alpha_{k_1 k_2} = \binom{n}{k_2} - \gamma_{k_1 k_2} + 1 \ \text{attempts.} \tag{12}$$

The $\alpha_{k_1 k_2}$ value is nonzero when $k_2$ is within the range of $\max(0, k_1 - \tau)$ and $\min(k_1 + \tau, n)$ or, equivalently, when $k_1$ is within the range of $\max(0, k_2 - \tau)$ and $\min(k_2 + \tau, n)$. If the adversary fails after making such $\alpha_{k_1 k_2}$ attempts, the corresponding $k_1$ value will be excluded from consideration in the computation of $P(X_{\mathrm{trial}} = T)$ in the subsequent trials. Given a pair


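of $k_1$ and $k_2$ values, the number $\gamma_{k_1 k_2}$ of $c_O(k_2)$ with reference to any $c_E(k_1)$ that leads to a $c_E(k_3 \le \tau)$ can be expressed by

$$\gamma_{k_1 k_2} = \sum_{p = \max(0,\, k_2 - k_1)}^{\min\left(\left\lfloor \frac{\tau - k_1 + k_2}{2} \right\rfloor,\; n - k_1,\; k_2\right)} \binom{k_1}{k_2 - p} \binom{n - k_1}{p} \tag{13}$$

$$\phantom{\gamma_{k_1 k_2}} = \sum_{k_3 = |k_1 - k_2|}^{\min(\tau, \beta)} \binom{k_1}{(k_2 + k_1 - k_3)/2} \binom{n - k_1}{(k_3 + k_2 - k_1)/2}. \tag{14}$$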
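The count $\gamma_{k_1 k_2}$ in (13) and (14), and with it $\alpha_{k_1 k_2}$ in (12), can be cross-checked by exhaustive enumeration for small code lengths. The sketch below is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the upper limit of (13) is taken as an integer floor, only $k_3$ values that make both binomial arguments of (14) integers are summed, and β is taken as $n - |n - (k_1 + k_2)|$.

```python
from itertools import combinations
from math import comb


def gamma_by_p(n: int, k1: int, k2: int, tau: int) -> int:
    """Equation (13): sum over p, the altered bits that fall on zero-valued positions."""
    lower = max(0, k2 - k1)
    upper = min((tau - k1 + k2) // 2, n - k1, k2)
    return sum(comb(k1, k2 - p) * comb(n - k1, p) for p in range(lower, upper + 1))


def gamma_by_k3(n: int, k1: int, k2: int, tau: int) -> int:
    """Equation (14): the same count indexed by the resulting distance k3."""
    beta = n - abs(n - (k1 + k2))
    total = 0
    for k3 in range(abs(k1 - k2), min(tau, beta) + 1):
        if (k1 + k2 - k3) % 2:  # both binomial arguments must be integers
            continue
        total += comb(k1, (k2 + k1 - k3) // 2) * comb(n - k1, (k3 + k2 - k1) // 2)
    return total


def gamma_brute_force(n: int, k1: int, k2: int, tau: int) -> int:
    """Enumerate every weight-k2 vector against a fixed weight-k1 difference pattern."""
    ones = set(range(k1))  # positions where c_E(k1) differs from the template
    return sum(
        1
        for flipped in combinations(range(n), k2)
        if len(ones.symmetric_difference(flipped)) <= tau
    )


def alpha(n: int, k1: int, k2: int, tau: int) -> int:
    """Equation (12): maximum number of attempts spent on a (k1, k2) pair."""
    return comb(n, k2) - gamma_by_p(n, k1, k2, tau) + 1


if __name__ == "__main__":
    n, tau = 8, 1
    print(gamma_by_p(n, 3, 2, tau), gamma_brute_force(n, 3, 2, tau), alpha(n, 3, 2, tau))
```

Because the count depends only on the Hamming weight $k_1$, fixing the first $k_1$ positions as the differing bits in the brute-force check loses no generality.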

B. Quantifying P(Xtrial = T) in the Sampling Without Replacement Case 1) Probability of First Success at Tth Trial, P(Xtrial = T): Based on the proposed guessing strategy, the probability of first success at Tth trial, P(Xtrial = T) can be defined as (16), as shown at the bottom of the page, where the probability of getting a k3 -bit codeword cE (k3 ), given the values of k2 and k1 = k2 + k3 − 2p, can be expressed by

(14) where each (k2 + k1 − k3 )/2 and (k3 + k2 − k1 )/2 has to be an integer, and β = n − |n − (k1 + k2 )|. Proof: From (10), we obtain cE (k3 ) = cE (k1 ) ⊕ cO (k2 ) =⇒ cO (k3 ) ⊕ cE (0) = cO (k1 ) ⊕ cE (0) ⊕ cO (k2 ) =⇒ cO (k3 ) = cO (k1 ) ⊕ cO (k2 ). Given a cO (k1 ) codeword with k1 bits of value 1 (Hamming weight = k1 that is unknown to the adversary). Suppose that k1 ≥ k2 . If all k2 bits of alteration are made on the k1 bits of value 1 of cO (k1 ), then the Hamming weight of cO (k3 ) is minimal, that is k1 − k2 bits. In this case, there will be ( kk12) possible ways of allocating k2 altered bits to the k1 bits of value 1 of cO (k1 ). For the case of k2 ≥ k1 , if there are k1 out of k2 bits of alteration made on all k1 bits of value 1 of cO (k1 ), and the remaining k2 − k1 bits of alteration made on the n − k1 bits of value 0 of cO (k1 ), then the Hamming weight of cO (k3 ) is minimal, that is k2 − k1 bits; and there are 1 ) possible ways of allocating k2 − k1 altered bits to the ( kn−k 2 −k1 n − k1 bits of value 0 of cO (k1 ). In general, for p ≤ k2 , if only k2 − p bits of alteration are made on the k1 bits of value 1 of cO (k1 ) and p bits of alteration are made on the n − k1 bits of value 0 of cO (k1 ), then the Hamming weight of cO (k3 ) can be described by k3 = k1 − k2 + 2p bits.

(15)

1 ) possible ways of allocating In this case, there will be ( k2k−p k2 − p altered bits to the k1 bits of value 1 of cO (k1 ); and 1 ( n−k p ) possible ways of allocating p altered bits to the n − k1 1 1 bits of value 0 of cO (k1 ). Thus, ( k2k−p )( n−k p ) in (13) returns the total possibilities of bit alteration for a setting of k1 and k2 ; 1 1 )( (k3 +kn−k ) in (14) can be obtained by and ( (k2 +k1k−k 3 )/2 2 −k1 )/2 using (15). Given a pair of k1 and k2 values, when XORed with a cO (k1 ), there could be multiple numbers of cO (k2 ) that result in multiple cO (k3 ) with different value of k3 ≤ τ . Hence, these numbers are summed over the range specified by the limits that are derived from the constraints of the two combinations in (14) in order to yield the final result γk1 k2 . This completes the proof.
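The counting argument behind (12)–(15) can be checked by brute force for small n. The following Python sketch is illustrative only: the function names are hypothetical, and the possibly non-integer upper limit of the sum in (13) is assumed to round down. It compares both closed forms of $\gamma_{k_1 k_2}$ against an exhaustive enumeration of all weight-$k_2$ modification vectors.

```python
from itertools import combinations
from math import comb


def gamma_eq13(n, k1, k2, tau):
    """Form (13): sum over p, the number of altered bits that land on 0-bits."""
    lo = max(0, k2 - k1)
    hi = min((tau - k1 + k2) // 2, n - k1, k2)  # assumption: bound rounds down
    return sum(comb(k1, k2 - p) * comb(n - k1, p) for p in range(lo, hi + 1))


def gamma_eq14(n, k1, k2, tau):
    """Form (14): sum over the resulting weight k3 = k1 - k2 + 2p."""
    beta = n - abs(n - (k1 + k2))
    total = 0
    for k3 in range(abs(k1 - k2), min(tau, beta) + 1):
        if (k1 + k2 - k3) % 2 == 0:  # both binomial arguments must be integers
            total += comb(k1, (k2 + k1 - k3) // 2) * comb(n - k1, (k3 + k2 - k1) // 2)
    return total


def gamma_brute(n, k1, k2, tau):
    """Enumerate every weight-k2 vector; count XOR results of weight <= tau."""
    ones = set(range(k1))  # any fixed weight-k1 codeword gives the same count
    return sum(1 for pos in combinations(range(n), k2)
               if len(ones ^ set(pos)) <= tau)


def alpha_eq12(n, k1, k2, tau):
    """Maximum number of attempts per (k1, k2) pair, as in (12)."""
    return comb(n, k2) - gamma_eq13(n, k1, k2, tau) + 1


n, tau = 8, 2
for k1 in range(n + 1):
    for k2 in range(n + 1):
        assert (gamma_eq13(n, k1, k2, tau)
                == gamma_eq14(n, k1, k2, tau)
                == gamma_brute(n, k1, k2, tau))
print(gamma_eq13(8, 3, 2, 2), alpha_eq12(8, 3, 2, 2))  # prints: 3 26
```

The exhaustive check grows as $\binom{n}{k_2}$ per pair, so it is only practical for short representations; the closed forms are what a longer setting would rely on.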

$$P(X_{\mathrm{trial}} = T) = \sum_{k_3=0}^{\tau}\ \sum_{p=\max\left(0,\,k_2^{(T)}+k_3-n\right)}^{\min\left(k_2^{(T)},\,k_3\right)} \left[\prod_{t=1}^{T-1}\left(1 - P\left(k_3 \mid k_2^{(t)}, p, t\right)\right)\right] P\left(k_3 \mid k_2^{(T)}, p, T\right) \cdot P_i\left(k_2^{(T)} + k_3 - 2p\right) \cdot \phi \tag{16}$$

$$P(k_3 \mid k_2, p, t) = \frac{\binom{k_2+k_3-2p}{k_2-p}\binom{n-k_2-k_3+2p}{p}}{\binom{n}{k_2} - t} = \frac{\binom{k_1}{k_2-p}\binom{n-k_1}{p}}{\binom{n}{k_2} - t} = \frac{\#\ k_2\text{-bit modification vectors that result in a } k_3\text{-bit output}}{\#\ \text{remaining } k_2\text{-bit modification vectors}} \tag{17}$$

The infeasibility-checking function in (16) is given by

$$\phi = I\left[k_2^{(T)} + k_3 - 2p \notin \chi_1\right] \cdot I\left[k_2^{(T)} \notin \chi_2\right]. \tag{18}$$

The variable $I$ in (18) denotes the indicator function ($I[\text{true}] = 1$, $I[\text{false}] = 0$). The infeasibility-checking function $\phi = 1$ only when both $k_1$ and $k_2$ are feasible.

IV. EXPERIMENTS

A. Datasets and Experimental Settings

To evaluate the accuracy of the formulations in both adversarial sampling cases and the effectiveness of the proposed guessing strategy in the sampling without replacement case, several experiments have been carried out. Our evaluations were made along three dimensions: bit length $n$, decision threshold $\tau$, and imposter distribution $P_i(k_1)$. While $n$ and $\tau$ are system parameters that can be varied flexibly, different imposter distributions can only be obtained when different discretization schemes are employed. Hence, in our experiments, we consider two types of datasets, whose imposter distributions are defined synthetically and extracted from real biometric data, respectively.

1) Synthetic Dataset: This dataset contains imposter distributions that are modeled as normal distributions [6], [9] with a range of standard deviation values (signifying different uniformity of binary representations). The evaluated standard deviation values of the normalized imposter distribution range from 0.05 to 0.20.

2) Real Dataset: This dataset contains imposter distributions that are generated from a large subset of the FERET face dataset [27] (consisting of 3000 images


with 12 images per user for 250 users) using a collection of feature extractors and discretizers (if the extracted features are not represented in binary). The feature extractors include eigenfeature regularization and extraction (ERE) [14], eigenface extraction [4], and monogenic binary coding (MBC) with code maps for amplitude (MBC-A), orientation (MBC-O), and phase (MBC-P) [36], while the discretizers include the reliability-dependent bit allocator with equal-probable (RDBA + EP) discretizer [19] and the equal-width discretizer [21]. The procedure of generating an imposter distribution from the face dataset is as follows: a) train the feature extractor and discretizer using six randomly selected images per user; b) extract the mean binary representation for each user; and c) for non-user-specific feature extractors and discretizers, compute the Hamming distance between every possible pair of users' mean binary representations and generate the imposter distribution from these distances. For the RDBA + EP discretizer, where each setting is specific to a target user, take a (target) user's discretization setting and compute the Hamming distance between the enrolled binary representation of the target user and that of every other user (query template) in the face dataset. This procedure is repeated by taking every other user as the target user, and the imposter distribution is generated from the entire set of Hamming distances.

Our experiments can be divided into two parts. The first part validates the analytic expressions of P(Xtrial = T) for the sampling with replacement case [i.e., in (9)] based on the synthetic dataset, while the second part validates the analytic expressions of P(Xtrial = T) in (16) and evaluates the effectiveness of the proposed guessing strategy in providing a reliable estimation of system entropy for the sampling without replacement case based on the real dataset.
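Step c) of the procedure above, for the non-user-specific case, amounts to building a histogram of pairwise Hamming distances. A minimal Python sketch follows, assuming each user's mean binary representation is available as an equal-length bit string. The toy templates and the helper names are invented for illustration and are not taken from the FERET experiments. The cumulative mass at or below τ corresponds to the FAR used in Section IV-B.

```python
from collections import Counter
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def imposter_distribution(templates):
    """Return [P_i(0), ..., P_i(n)] from all pairwise Hamming distances."""
    n = len(templates[0])
    counts = Counter(hamming(a, b) for a, b in combinations(templates, 2))
    total = sum(counts.values())
    return [counts.get(k, 0) / total for k in range(n + 1)]


def false_accept_rate(p_i, tau):
    """FAR as the sum of P_i(k1) over k1 = 0, ..., tau."""
    return sum(p_i[: tau + 1])


# Hypothetical 4-bit mean binary representations for three users.
templates = ["0000", "0011", "1111"]
p_i = imposter_distribution(templates)
print(p_i)                        # mass only at distances 2 and 4
print(false_accept_rate(p_i, 2))  # share of imposter pairs within tau = 2
```

For a user-specific discretizer such as RDBA + EP, the pairing would instead run from each target user's enrolled representation to every other user's query template, as the text describes; the histogram step stays the same.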
The validations were performed by comparing the analytical estimations with the corresponding empirical values obtained from the guessing attack simulation, while the strategy evaluations were made by comparing the proposed strategy with the baseline brute force guessing strategy. The simulation results for the expected number of trials and the inferred entropy were averaged over a total of 1000 repeated simulation runs in both parts of the experiment. For the real-dataset-based experiments in the sampling without replacement case, the simulation was repeated four times per target user with different adversarially selected biometric inputs Xbin. These four inputs were randomly selected from the 249 nontarget users in the dataset.

B. Part I: Validation of Analytic Expressions for the Sampling With Replacement Case

In this part of the experiment, we validate the analytic expression of P(Xtrial = T) in (9) by comparing the expected trials in (8) and entropy in (6) with the corresponding simulation results of the sensor attack. The reported results for this part of the experiment are limited to results based


on the synthetic dataset because the repetitive adversarial guesses at tapping point 1 in Fig. 1 require the availability of an enormous number of biometric images that could produce all $2^n$ unique binary representations, so that correct guessing is always possible even though the target user could be represented by any binary string. Because we do not have such a large dataset in practice (e.g., with $2^{100}$ images for a 100-bit representation), we are unable to conduct these experiments on the real dataset for this part of the evaluations.

Fig. 6 illustrates the evaluation results for the P(Xtrial = T)-based analytic expressions: (i) expected number of trials E[T] using (8) and (9) and (ii) entropy H(X) in (6), (8), and (9) in terms of: (a) bit length n; (b) decision threshold τ; and (c) the spread of imposter distribution Pi. Generally, in Fig. 6(a)–(c), the analytical and experimental curves for both expected number of trials and entropy are well matched at all tested values of n, τ, and standard deviation of normalized imposter distributions σ, respectively. These observations validate the correctness of our formulations for the sensor attack.

To interpret the results further, it is observed that as the ratio n/τ increases [i.e., n increases under a fixed τ in Fig. 6(a) or τ decreases under a fixed n in Fig. 6(b)], the expected number of trials and the corresponding entropy increase. This is because n determines the cardinality of the set of guesses while τ determines the cardinality of the set of acceptable guesses. When the set of guesses grows (n increases) or when the set of acceptable guesses shrinks (τ decreases), obtaining a correct guess becomes more difficult. Hence, as n/τ increases, an adversary needs more trials to succeed in guessing, and higher system entropy results. In Fig. 6(b)(i), it is rather surprising to find that for a 100-bit binary representation with σ = 0.055 and system decision threshold ranging from 13 to 25 bits, the achievable system entropy is only 12–13 bits.

In Fig. 6(c), it is observed that the expected number of trials and entropy decrease with increasing σ. This is because the probabilities at both tails of the imposter distribution (at low and high values of k1) become higher when σ increases. As a result, the $\mathrm{FAR} = \sum_{k_1=0}^{\tau} P_i(k_1)$ increases accordingly, thus making the guessing attack easier. For a 100-bit binary representation and 15-bit system decision threshold with σ increased from 0.055 to 0.2, the actual system entropy drops below 5 bits.

C. Part II: Validations of Analytic Expressions and Guessing Strategy for the Sampling Without Replacement Case

Fig. 7 illustrates the evaluation results for the analytic expressions of (i) entropy H(X) in (7), (ii) expected number of trials E[T] in (8), and (iii) P(Xtrial = T) in (16) in terms of (a) bit length n; (b) decision threshold τ; and (c) imposter distribution extracted from different discretizers Pi. In Fig. 7[(a)–(c)](i)–(iii), we observe an excellent agreement between the analytical and the experimental curves of the proposed entropy-measuring model,

except for some trivial mismatch in subfigures (iii), which could be due to imprecise averaging over a limited number of experimental runs. These observations sufficiently justify the correctness of our formulations for the discretizer attack.

To evaluate the effectiveness of the proposed strategy, it is noticed from (i) and (ii) in Fig. 7(a) and (b) that the proposed guessing strategy is more effective than the brute force attack at most values of n and τ, as seen from the lower entropy and expected number of trials of our guessing strategy compared with the brute force attack. In Fig. 7(a)(i), for a 12-bit binary representation with a 5-bit decision threshold, less than 2-bit system entropy is achieved with the proposed guessing strategy, while in Fig. 7(b)(i), for an 18-bit binary representation with decision threshold greater than 7 bits, the entropy drops below 2 bits with the proposed strategy. In Fig. 7(c)(i) and (ii), the reduction of entropy and expected trials of the proposed guessing strategy is more evident for all the other evaluated feature extractions than for ERE + RDBA + EP. Among them, MBC-O and MBC-P suffer the two highest entropy losses owing to the highly correlated orientation and phase bits of the monogenic binary code, which cause the imposter distribution to deviate significantly from the binomial distribution. These distributions leak information that can be used to facilitate adversarial guessing using the proposed strategy. On the contrary, the reduction of entropy and expected trials is not so evident for ERE + RDBA + EP because it produces nearly equally probable binary representations. This makes the guessing attack harder (nearly as hard as the brute force attack) even when the precise imposter distribution is available. This explanation is further supported by our observation in Fig. 7(c)(iii), where the probability of adversarial success at the first trial P(Xtrial = 1) of ERE + RDBA + EP is much lower than that of the other feature extractions.

The maximum failing trials of both the baseline and the proposed strategy shown in (iv) in Fig. 7(a)–(c) may also reflect the effectiveness of the proposed strategy. Here, the strategy-specific Tmax represents the size of the guessing sequence of a strategy that covers all possibilities of the guesses, which can also be referred to as the worst case scenario of the strategy. For the brute force attack strategy, Tmax is computed by $T_{\max} = \text{total possibilities} - \text{total acceptable guesses} = 2^n - \sum_{t=0}^{\tau} \binom{n}{t}$, while for the proposed strategy, Tmax is obtained from the generated guessing sequence as shown in Fig. 4(ii). To a certain extent, the difference between the two curves could indicate the amount of entropy reduction in the proposed strategy in Fig. 7[(a) and (b)](iv). However, in Fig. 7(c)(iv), such an explanation does not seem valid.

Fig. 6. Sampling with replacement-based sensor attack: accuracy evaluation of the proposed measuring model in terms of (a) bit length n, τ = 15, k1 ∼ N(μ = 0.5, σ = 0.055); (b) decision threshold τ, n = 100, k1 ∼ N(μ = 0.5, σ = 0.055); and (c) imposter distribution Pi(k1), n = 100, τ = 15, k1 ∼ N(μ = 0.5, σ).

D. Discussions

Existing measures such as standard entropy and minimum entropy cannot be used to measure biometric system security because these measures can only be applied when a system accepts only a single binary representation per user, which is inconsistent with the typical multiple acceptable possibilities in a practical biometric system. The proposed entropy-measuring model provides a solution by expressing entropy in terms of adversarial guessing effort, thus taking into account the difficulty of an adversary in a guessing attack.
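For the brute force baseline with equiprobable guesses, the guessing effort has a closed form, which makes it a convenient sanity check. The Python sketch below is a hedged illustration with hypothetical function names: it evaluates the brute force Tmax expression from Section IV-C and checks the Appendix result E[T] = (N + 1)/(m + 1) against a direct summation of the first-success probabilities using exact rational arithmetic. It does not model the proposed guessing strategy or a non-uniform imposter distribution.

```python
from fractions import Fraction
from math import comb


def acceptable_guesses(n, tau):
    """Number of n-bit strings within Hamming distance tau of the target."""
    return sum(comb(n, t) for t in range(tau + 1))


def t_max_brute(n, tau):
    """Maximum failing trials of brute force: 2^n minus acceptable guesses."""
    return 2 ** n - acceptable_guesses(n, tau)


def p_first_success(N, m, T):
    """P(X_trial = T) without replacement: (m/N) * prod_{x=0}^{T-2} (N-m-x)/(N-1-x)."""
    p = Fraction(m, N)
    for x in range(T - 1):
        p *= Fraction(N - m - x, N - 1 - x)
    return p


def expected_trials(N, m):
    """E[T] by direct summation over T = 1, ..., N + 1 - m."""
    return sum(T * p_first_success(N, m, T) for T in range(1, N - m + 2))


# Small case: the direct sum matches the closed form (N + 1) / (m + 1).
assert expected_trials(10, 3) == Fraction(11, 4)

# Setting discussed in Section IV-C: n = 18, tau = 5.
n, tau = 18, 5
m, N = acceptable_guesses(n, tau), 2 ** n
print(m, t_max_brute(n, tau))      # prints: 12616 249528
print(float(Fraction(N + 1, m + 1)))  # closed-form E[T] for uniform guesses
```

The value 249528 is consistent with the guessing sequence size in the order of $10^5$ reported for n = 18 and τ = 5; the direct summation is only run on the small case because it is quadratic in N.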


Fig. 7. Sampling without replacement-based discretizer attack: evaluation results of the proposed entropy-measuring model in terms of (a) bit length, τ = 5, imposter distribution for ERE + RDBA + EP features and (b) decision threshold τ , n = 18, imposter distribution for ERE + RDBA + EP features.


Fig. 7. (Continued.) Sampling without replacement-based discretizer attack: evaluation results of the proposed entropy-measuring model in terms of (c) imposter distribution Pi (k1 ), n = 18, τ = 5. The “[Proposed]” and “[Brute Force]” in the legend of (i) and (ii) are referred to as the proposed guessing strategy and the brute force guessing strategy, respectively.

By predicting the adversarial behavior in a practical scenario, the difficulty of breaking a system (respectively, entropy) can be quantified for any system setting, including the case of multiple acceptable possibilities. Although we have shown experimentally that the system entropy can be measured accurately for both sampling with and without replacement cases with the proposed entropy-measuring model, a limitation of the model is the high complexity of estimating the probability of successful guessing at all trials based on the proposed guessing strategy for the sampling without replacement case. At this stage, these probability estimations can be impractically slow for long binary representations because they require quantification of the remaining possible modification vectors at every possible trial. In Section IV-C of the experimental evaluation, the evaluated lengths of binary representation (n ≤ 18) could be too small to be practical for current biometric systems. It is noticed in (iv) in Fig. 7(a)–(c) that a huge guessing sequence, in the order of $10^5$, has been generated at n = 18 and τ = 5. For the same setting, based on our CPU with the following specification: dual 4-core Intel Xeon X5570 2.93 GHz with 8 MB L3 cache and 32 GB memory, the time taken to generate a guessing sequence of the proposed guessing strategy is in the order of $10^3$ s. Hence, the computational complexity is an important issue to be solved in order for the measurement of system entropy to be practical with the proposed model.

V. CONCLUSION

In this paper, we have proposed an entropy-measuring model for biometric verification systems with multiple acceptable biometric measurements to quantify the system security in terms of adversarial guessing effort. We have analyzed two sampling cases (with and without replacement) for system entropy measurement. We have formulated the system entropy for two attacks based on these two cases, which is dependent on the probability of first correct adversarial guess of the biometric representation at all possible trials. As the system entropy measurement for the sampling without replacement case is reliant on the adversarial guessing strategy, we have


proposed an effective guessing strategy for a good estimation of the system entropy. We have verified the accuracy of our formulations experimentally using a synthetic and a benchmark face dataset and have justified the effectiveness of the proposed guessing strategy against the baseline brute force attack. For both sampling cases, the experimental results show satisfactory agreement between the experimental and analytical results for the evaluation of formulation accuracy. Regarding the effectiveness of the proposed guessing strategy, the results show a clear reduction in the expected number of trials and entropy relative to the brute force guessing strategy when the binary feature extractor does not produce equally probable outcomes. Because this model involves sequential calculation of the probability of adversarial success at all trials, it could be impractical when the number of possible binary representations is too large (e.g., over a billion). A future direction in this regard is to work toward an efficient estimation of these probabilities to yield a more practical system entropy measure.

APPENDIX

Given N equiprobable possibilities, among which m are acceptable, the proof for $E[T] = (N+1)/(m+1)$ in the sampling without replacement case is given as follows. Let the probability of taking T trials for the first adversarial success in guessing be defined as

$$P(X_{\mathrm{trial}} = T) = \begin{cases} \dfrac{m}{N} & \text{for } T = 1 \\[2ex] \dfrac{m}{N} \displaystyle\prod_{x=0}^{T-2} \dfrac{N-m-x}{N-1-x} & \text{for } T > 1 \end{cases}$$

where the components $m/N$ and $\prod_{x=0}^{T-2} (N-m-x)/(N-1-x)$ denote the probability of the first success and the probability of $T-1$ consecutive failures, respectively. With the total possible guessing trials $T_{\max} = N + 1 - m$ and the sum of partial fraction identity $\sum_{k=0}^{n-1} (x+k)!/k! = (x+n)!/\left((x+1)(n-1)!\right)$, we have the following:

$$\begin{aligned} E[T] &= \sum_{T=1}^{N+1-m} T \cdot P(X_{\mathrm{trial}} = T) \\ &= \frac{m}{N} + 2\,\frac{m}{N}\,\frac{N-m}{N-1} + 3\,\frac{m}{N}\,\frac{N-m}{N-1}\,\frac{N-m-1}{N-2} + \cdots + (N+1-m)\,\frac{m}{N}\,\frac{N-m}{N-1} \cdots \frac{1}{m} \\ &= \frac{N+1}{m+1}. \end{aligned}$$

This completes the proof.

REFERENCES

[1] A. Adler, R. Youmaran, and S. Loyka, “Towards a measure of biometric feature information,” Pattern Anal. Appl., vol. 12, no. 3, pp. 261–270, 2009.
[2] H. Al-Assam and S. Jassim, “Security evaluation of biometric keys,” Comput. Security, vol. 31, no. 2, pp. 151–163, 2012.
[3] L. Ballard, S. Kamara, and M. Reiter, “The practical subtleties of biometric key generation,” in Proc. 17th Conf. USENIX Security Symp., San Jose, CA, USA, 2008, pp. 61–74.

[4] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, “Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, no. 7, pp. 711–720, Jul. 1997.
[5] C. Chen, R. Veldhuis, T. Kevenaar, and A. Akkermans, “Biometric quantization through detection rate optimized bit allocation,” EURASIP J. Adv. Signal Process., vol. 2009, May 2009, Art. ID 784834.
[6] J. Daugman, “The importance of being random: Statistical principles of iris recognition,” Pattern Recognit., vol. 36, no. 2, pp. 279–291, 2003.
[7] Y. Dodis, R. Ostrovsky, L. Reyzin, and A. Smith, “Fuzzy extractors: How to generate strong keys from biometrics and other noisy data,” SIAM J. Comput., vol. 38, no. 1, pp. 97–139, 2008.
[8] M. Faundez-Zanuy, “On the vulnerability of biometric security systems,” IEEE Aerosp. Electron. Syst. Mag., vol. 19, no. 6, pp. 3–8, Jun. 2004.
[9] Y. C. Feng, P. C. Yuen, and M.-H. Lim, “Distance entropy as an information measure for binary biometric representation,” in Proc. 6th Chin. Conf. Biometr. Recognit. (CCBR), vol. 7701. Guangzhou, China, 2012, pp. 332–339.
[10] Y. C. Feng and P. C. Yuen, “Binary discriminant analysis for generating binary face template,” IEEE Trans. Inf. Forensics Security, vol. 7, no. 2, pp. 613–624, Apr. 2012.
[11] B. Fu, S. X. Yang, J. Li, and D. Hu, “Multibiometric cryptosystem: Model structure and performance analysis,” IEEE Trans. Inf. Forensics Security, vol. 4, no. 4, pp. 867–882, Dec. 2009.
[12] T. Ignatenko and F. M. J. Willems, “Information leakage in fuzzy commitment schemes,” IEEE Trans. Inf. Forensics Security, vol. 5, no. 2, pp. 337–348, Jun. 2010.
[13] M. Inki, “A model for analyzing dependencies between two ICA features in natural images,” in Proc. 5th Int. Conf. Independent Anal. Blind Signal Separat., vol. 3195. Granada, Spain, 2004, pp. 914–921.
[14] X. D. Jiang, B. Mandal, and A. Kot, “Eigenfeature regularization and extraction in face recognition,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 30, no. 3, pp. 383–394, Mar. 2008.
[15] A. Juels and M. Wattenberg, “A fuzzy commitment scheme,” in Proc. 6th ACM Conf. Comput. Commun. Security (CCS), Singapore, 1999, pp. 28–36.
[16] E. J. C. Kelkboom, J. Breebaart, I. Buhan, and R. N. J. Veldhuis, “Maximum key size and classification performance of fuzzy commitment for Gaussian modeled biometric sources,” IEEE Trans. Inf. Forensics Security, vol. 7, no. 4, pp. 1225–1241, Aug. 2012.
[17] M.-H. Lim, A. B. J. Teoh, and K.-A. Toh, “Biometric discretization via a dynamic detection rate-based bit allocation with genuine interval concealment,” IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 43, no. 3, pp. 843–857, Jun. 2013.
[18] M.-H. Lim and A. B. J. Teoh, “A novel class of encoding scheme for efficient biometric discretization: Linearly separable subcode,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 2, pp. 300–313, Feb. 2013.
[19] M.-H. Lim, A. B. J. Teoh, and K.-A. Toh, “An efficient dynamic reliability-dependent bit allocation for biometric discretization,” Pattern Recognit., vol. 45, no. 5, pp. 1960–1971, 2012.
[20] M.-H. Lim and A. B. J. Teoh, “An analytic performance estimation framework for multi-bits biometric discretization based on equal-probable quantization and linearly separable subcode encoding,” IEEE Trans. Inf. Forensics Security, vol. 7, no. 4, pp. 1242–1254, Aug. 2012.
[21] M.-H. Lim, A. B. J. Teoh, and K.-A. Toh, “An analysis on linearly separable subcode-based equal width discretization and its performance resemblances,” EURASIP J. Adv. Signal Process., vol. 2011, no. 82, pp. 1–14, 2011.
[22] P. Luo, X. Wang, and X. Tang, “A deep sum-product architecture for robust facial attributes analysis,” in Proc. Int. Conf. Comput. Vis., Sydney, NSW, Australia, 2013, pp. 2864–2871.
[23] J. L. Massey, “Guessing and entropy,” in Proc. IEEE Int. Symp. Inf. Theory, Trondheim, Norway, 1994, p. 204.
[24] A. Nagar, K. Nandakumar, and A. K. Jain, “Multibiometric cryptosystems based on feature level fusion,” IEEE Trans. Inf. Forensics Security, vol. 7, no. 1, pp. 255–268, Feb. 2012.
[25] N. Ratha, J. Connell, and R. Bolle, “An analysis of minutiae matching strength,” in Proc. 3rd Int. Conf. Audio Video Based Biometr. Pers. Authenticat., Halmstad, Sweden, 2001, pp. 223–228.
[26] A. Nagar and A. K. Jain, “On the security of non-invertible fingerprint template transforms,” in Proc. IEEE Workshop Inf. Forensics Security, London, U.K., 2009, pp. 81–85.
[27] P. J. Phillips, H. Moon, S. A. Rizvi, and P. J. Rauss, “The FERET evaluation methodology for face-recognition algorithms,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 10, pp. 1090–1104, Oct. 2000.
[28] C. Rathgeb and C. Busch, “Cancelable multi-biometrics: Mixing iris-codes based on adaptive bloom filters,” Comput. Security, vol. 42, pp. 1–12, May 2014.


[29] C. E. Shannon, “A mathematical theory of communication,” Bell Syst. Tech. J., vol. 27, no. 3, pp. 379–423, 1948.
[30] L. Shen, W. Wu, S. Jia, and W. Guo, “Coding 3D Gabor features for hyperspectral palmprint recognition,” in Proc. Int. Conf. Med. Biometr., Shenzhen, China, 2014, pp. 169–173.
[31] Y. Sutcu, H. T. Sencar, and N. Memon, “How to measure biometric information?” in Proc. 20th IEEE Int. Conf. Pattern Recognit., Istanbul, Turkey, 2010, pp. 1469–1472.
[32] K. Takahashi and T. Murakami, “A measure of information gained through biometric systems,” Image Vis. Comput., vol. 32, no. 12, pp. 1194–1203, 2014.
[33] A. B. J. Teoh, A. Goh, and D. C. L. Ngo, “Random multispace quantisation as an analytic mechanism for BioHashing of biometric and random identity inputs,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 12, pp. 1892–1901, Dec. 2006.
[34] A. B. J. Teoh, K.-A. Toh, and W. K. Yip, “2N discretisation of BioPhasor in cancellable biometrics,” in Proc. 2nd Int. Conf. Biometr., Seoul, Korea, Aug. 2007, pp. 435–444.
[35] A. Vij and A. Namboodiri, “Learning minutiae neighborhoods: A new binary representation for matching fingerprints,” in Proc. IEEE Comput. Vis. Pattern Recognit. Workshop, Columbus, OH, USA, 2014, pp. 64–69.
[36] M. Yang, L. Zhang, S. C. K. Shiu, and D. Zhang, “Monogenic binary coding: An efficient local feature extraction approach to face recognition,” IEEE Trans. Inf. Forensics Security, vol. 7, no. 6, pp. 1738–1751, Dec. 2012.
[37] X. Zhou, A. Kuijper, R. Veldhuis, and C. Busch, “Quantifying privacy and security of biometric fuzzy commitment,” in Proc. IEEE Int. Joint Conf. Biometr., Washington, DC, USA, 2011, pp. 1–8.

Meng-Hui Lim (M’13) received the Ph.D. degree from Yonsei University, Seoul, Korea, in 2012. He was with the Department of Computer Science, Hong Kong Baptist University, Hong Kong as a Post-Doctoral Research Fellow for a year, where he has been a Research Assistant Professor since 2013. His current research interests include pattern recognition, cryptography, and biometric security.


Pong C. Yuen (SM’11) received the B.Sc. (First Class Hons.) degree in electronic engineering from the City Polytechnic of Hong Kong, Hong Kong, in 1989 and the Ph.D. degree in electrical and electronic engineering from the University of Hong Kong, Hong Kong, in 1993. He was with the Hong Kong Baptist University, Hong Kong, in 1993, where he is currently a Professor and the Head of the Department of Computer Science. He was associated with the Laboratory of Imaging Science and Engineering, Department of Electrical Engineering, University of Sydney, Sydney, NSW, Australia. In 1998, he spent a six-month sabbatical leave in the University of Maryland Institute for Advanced Computer Studies, University of Maryland, College Park, MD, USA. From 2005 to 2006, he was a Visiting Professor with Graphics, Vision and Robotics Laboratory, INRIA Rhône-Alpes, Rhône-Alpes, France. He was the Director of Croucher Advanced Study Institute (ASI) on Biometric Authentication in 2004 and Croucher ASI on Biometric Security and Privacy in 2007. His current research interests include video surveillance, human face recognition, biometric security, and privacy.

Dr. Yuen was a recipient of the University Fellowship to visit the University of Sydney in 1996. He was actively involved in many international conferences as an Organizing Committee and/or Technical Program Committee Member. He was the track Co-Chair of the International Conference on Pattern Recognition in 2006 and the Program Co-Chair of the IEEE Fifth International Conference on Biometrics: Theory, Applications and Systems in 2012. He serves as an Advisory Board Member of the BTAS Conference and a Hong Kong Research Grant Council Engineering Panel Member. He is currently an Editorial Board Member of Pattern Recognition and an Associate Editor of the IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY and the SPIE Journal of Electronic Imaging.