1040-5488/14/9101-0076/0 VOL. 91, NO. 1, PP. 76–85 OPTOMETRY AND VISION SCIENCE Copyright © 2013 American Academy of Optometry

ORIGINAL ARTICLE

Pacific Acuity Test: Testability, Validity, and Interobserver Reliability

John P. Lowery*, John R. Hayes†, Megan Sis‡, Anna Griffith§, and Darcy Taylor§

ABSTRACT

Purpose. The Pacific Acuity Test (PAT) is a new vanishing optotype test designed to measure recognition visual acuities in preverbal children using a face and opposing oval figure in a forced-choice preferential looking format. This study evaluates the testability, validity, and interobserver reliability of the PAT.

Methods. Fifty-two subjects, aged 6 to 36 months, were tested by a primary observer to determine both recognition and resolution visual acuities using the PAT. Subjects were also tested using the Cardiff Acuity Test (CAT) to provide comparative resolution acuities. Two additional observers independently evaluated video-recorded subject responses for testability and interobserver reliability analysis. An independent grader determined acuity thresholds from each observer’s observations, and a logistic regression model was used for additional analysis of acuity thresholds, validity, and testability.

Results. Forty-seven of 52 subjects completed testing to obtain visual acuities with the PAT. Sixty-nine percent of subjects followed the desired forced-choice strategy to yield recognition acuities with the PAT. Testability for children younger than 18 months was 44%, whereas 96% of children 18 months and older responded to the recognition testing format. Testability for resolution acuity was 92% and 98% for the PAT and CAT, respectively. The mean difference between PAT recognition and CAT resolution acuity thresholds (PAT - CAT) was +0.11 logMAR (0.15 SD, p < 0.001). The observers were in agreement as determined by intraclass correlation coefficients of 0.90 for both PAT recognition and the CAT.

Conclusions. High testability and valid recognition acuity measures were achieved using the PAT with children by approximately 18 months of age. The recognition acuity thresholds obtained with the PAT were higher, particularly for younger subjects, than comparative resolution acuity thresholds found with both the PAT and CAT. Interobserver reliability was the same for the PAT and the CAT.

(Optom Vis Sci 2014;91:76–85)

Key Words: Pacific Acuity Test, Cardiff Acuity Test, recognition acuity, vanishing optotype, testability, interobserver reliability

*OD, MEd; †PhD; ‡OD, MS; §OD. Pacific University College of Optometry, Forest Grove, Oregon (all authors).

Measurement of visual acuity in young children and individuals with developmental disabilities is limited by attention, cognition, and verbal response capabilities. Most children are not able to respond to tests that require recognition of specific optotypes until ages 3 or 4 years. However, it is usually possible to obtain resolution visual acuity with preverbal children or disabled individuals using forced-choice preferential looking (FPL) acuity tests such as the Teller Acuity Cards (Stereo Optical, Chicago, IL) and Cardiff Acuity Test ([CAT] Peter Allen Associates, United Kingdom). These resolution tests use a stimulus figure composed of spatial frequency components that are of equal overall luminance with a neutral gray background, such that the stimulus can no longer be detected at the resolution threshold of the components. Young children will preferentially fixate on the novel stimulus if they are able to resolve the spatial frequency content of the stimulus against the gray background of the card. The Teller Acuity Cards were the first FPL test to become widely used in clinical practice after a well-documented history of normative and clinical research.1–3 The Teller Acuity Cards are particularly useful for evaluating vision in young infants and nonverbal individuals with disabilities.3–11 However, it is well established that resolution acuity obtained with this grating detection task is not equivalent to recognition acuities obtained with standard optotype tests. Clinical studies evaluating the Teller Acuity Cards have demonstrated that acuities derived from preferential looking response to grating stimuli overestimate acuity compared with recognition methods in children with visual impairment because of ocular disease and amblyopia.12–15 Use of more complex stimuli than homogeneous gratings within the context of a preferential looking format has been suggested as a means of improving detection of amblyopia.16 The CAT uses a vanishing optotype design that minimizes the gap between the resolution and the recognition threshold of acuity.

Optometry and Vision Science, Vol. 91, No. 1, January 2014

Copyright © American Academy of Optometry. Unauthorized reproduction of this article is prohibited.

Pacific Acuity Test Validity and Reliability – Lowery et al.

The CAT uses the same FPL strategy as the Teller Acuity Test except that the stimulus is a simple figure (house, boat, car, duck, etc.) and is positioned either on the top or bottom of each card. By observing whether a child looks up or down, a clinician can judge if the child is able to resolve the vanishing optotype figure on each card. The CAT has been shown to provide high testability and valid estimates of acuity, comparable to grating acuities in infants or nonverbal children with disabilities.17–20 However, the CAT may fail to demonstrate acuity deficits caused by refractive errors and mild amblyopia.20–22 There seems to be a clinically significant gap between the detection/resolution task of the CAT and a recognition task presented by standard optotypes, particularly for children with reduced visual acuity. To bridge the gap between the currently available FPL tests and recognition tests, we developed a new vanishing optotype test for measuring acuity in nonverbal children. The Pacific Acuity Test ([PAT] Good-Lite Corporation, Elgin, IL) uses the same stimulus line configuration as the CAT, two 0.25-cycle dark bands bordering a 0.5-cycle white band positioned on a gray background. However, each PAT card has two optotypes, a face of a young person and an oval of the same overall dimensions with interior ovals designed to match some of the detail of the face (Fig. 1). Each card contains the same figures; only the positions of the two figures vary between cards. When the targets presented are beyond the subject’s resolution capabilities, they “vanish” into the gray background. Near the resolution threshold, the patient must not only be able to detect the targets but must also discriminate between the features of the face versus those of the oval target to identify the position of the face by gaze alone or gaze and pointing.
The face target and opposing figure were designed based on a wealth of research showing that young children are perceptually tuned to recognize facial features and are particularly drawn to direct eye contact from early infancy.23–26 Finding a face is a motivating task for most young children, and this strategy has been used in many pediatric vision tests. Previous studies demonstrated that a vanishing optotype face pattern provided acuity results that were more precise with infants and agreed better with recognition acuities than square-wave gratings in verbal subjects with amblyopia.27,28

In the present study, we evaluated several questions related to the PAT in normal healthy children aged 6 to 36 months:

1. At what age can we expect children to actively respond to the test paradigm by pointing or demonstrating clear gaze preference such that clinicians could easily observe that the child is able to discriminate the face from the oval target?
2. Can the PAT be used to measure a detection/resolution threshold of acuity if a child does not respond by pointing or clear gaze preference for the face? Will an examiner be able to determine if a child is seeing any targets versus not seeing targets at all? (For clarity, we will refer to this acuity threshold as the “resolution” threshold.)
3. Is there a significant gap between the threshold for resolution of the targets and the threshold for recognition of the face from the oval target in young children? How does this gap vary by age?
4. How reliable are the observations of different clinicians in deriving thresholds of detection acuity versus thresholds of recognition acuity using this test?


METHODS

Fifty-two subjects, aged 6 to 36 months, underwent testing with both the PAT and the CAT to allow direct comparison between the two tests. The study protocol was approved by the Human Subjects Research Review Board of Pacific University and followed the ethical principles outlined in the Declaration of Helsinki; informed consent was obtained from parents of all subjects before participation. The CAT was chosen as a control because it is well normed and validated and uses the same psychophysical principles as the PAT. However, the CAT uses only one optotype per card and serves to provide a measure of resolution threshold in a simpler detection task paradigm when used in accordance with standard test instructions. Exclusion criteria included history or indication of developmental delay, neurological conditions, systemic illness, ocular disease, or strabismus. After acuity testing, a vision screening was completed as a public health service for the participants and to confirm subject eligibility for the study. The test subjects were seated on a parent’s lap facing the primary examiner, who presented the acuity cards at a 50-cm distance from the test subject’s eyes. In addition to fluorescent background room illumination, an adjustable incandescent lamp above and behind the subject was adjusted to provide specific illumination on the test card. Illuminance was measured at 290 to 310 lux at the card presentation position. Test order was alternated for every subject. A 20/200 demonstration card set was used at the beginning of each test session for training and orienting the child. The examiner encouraged the child to find the target by saying “Where is the little boy or baby?” for the PAT card and “Where is the car, boat, duck, and so on?” for the specific CAT card. Parents were asked what word the subject would likely be more familiar with to improve comprehension and testability.
FIGURE 1. Pacific Acuity Test.

Complete testing consisted of three card presentations at each of the following acuity levels down to two levels below acuity threshold: 20/200, 160, 127, 100, 80, 63, 50, 40, 30, and 20 (or 1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.18, and 0.0 logMAR). If the subject responded well (fixation response observable) to the 0.3 logMAR level at 50 cm, the same set of cards was randomized again and tested at 75 cm away (0.18 logMAR) and at 100 cm (0.0 logMAR). An independent data recorder randomized the card order within each acuity level for each subject and recorded all observations provided by the primary examiner. Testing for each subject started at 20/200 (1.0 logMAR), and cards were presented down to two levels past what the data recorder determined to be detection threshold (when the subject no longer appeared to see any targets). Because the primary examiner was blind to target positions on each card, the data recorder compared the examiner’s observations to the actual target positions on each card and tallied the number of consecutive “none” observations to make an ultimate decision of when to stop testing. The subject was encouraged to point to the target if capable; however, when testing younger subjects, the examiner relied purely on gaze preference. For each acuity card presentation, one of several possible observations was recorded based on the subject’s responses. For the PAT, the observers could record four possible observations: top, bottom, both top and bottom, or no fixation. It was expected that subjects would demonstrate many possible fixation patterns depending on how engaged they were with the test and ability to see the targets. If a subject was motivated to find the face target, it was expected that they would look back and forth between the targets for the majority of card presentation to identify the face with certainty. Some subjects may only indicate a preference by demonstrating a longer fixation time on the face or a single fixation if the face is seen on the initial glance. Subjects who were unable to see either target may indicate this with very brief or no fixation on specific target positions. Additional behavioral observations like facial expressions, gestures, or loss of interest were also used, particularly to indicate that the child was seeing the face or not seeing any targets. If the examiner felt that the subject demonstrated greater fixation on the top or bottom target, then the observation was recorded as “up” or “down.” If the subject fixated on both targets for about an equal length of time, “both” was recorded as the response. If the subject did not appear to see any targets at all, the examiner recorded “none.” For the CAT, there were three possible observations recorded: “up,” “down,” or “none.” Test sessions were recorded with a video camera to capture subject responses. These recordings were viewed without audio by two additional examiners to provide independent observations using the same criteria as the primary observer. All three of the examiners were aware of what acuity level was being tested (marked on the back of the cards) but were blind to the position of the targets on each card.
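The card levels above are quoted both as Snellen fractions and as logMAR values. A minimal sketch of that conversion follows, assuming the standard definition logMAR = log10(denominator / numerator) for a 20/x fraction; the function and constant names are illustrative and do not come from the study.

```python
import math

# Snellen denominators (20/x) of the card levels listed in the text.
SNELLEN_DENOMINATORS = [200, 160, 127, 100, 80, 63, 50, 40, 30, 20]


def snellen_to_logmar(denominator: float, numerator: float = 20.0) -> float:
    """Convert a Snellen fraction numerator/denominator to logMAR.

    The minimum angle of resolution (MAR, in arcminutes) is
    denominator / numerator, and logMAR is its base-10 logarithm.
    """
    return math.log10(denominator / numerator)


for d in SNELLEN_DENOMINATORS:
    print(f"20/{d}: {snellen_to_logmar(d):.2f} logMAR")
```

Rounded to two decimals, 20/30 gives 0.18 logMAR and 20/20 gives 0.00, matching the two finest levels quoted in the text; the remaining levels round to the 0.1-step values listed.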

All observations were independently recorded. For each presentation, the examiners also indicated their confidence in the observation based on the subject’s fixation response and other nonverbal behavior. A high confidence in the observation was assigned 2; a low confidence in the observation was assigned 1. For each card presentation with the PAT, the two video examiners also recorded if the subject appeared to be following the desired test strategy by looking at both targets before showing a preference (gaze or gaze/pointing) to the face.

Data Analysis

The data coordinator independently graded each observation by the examiners according to whether the observation matched or did not match the actual position of the face (correct responses) on the PAT or the figure on the CAT. From this, a clinical threshold was derived from each examiner as the last acuity level in which the child correctly responded to at least two of three cards. This grading yielded a recognition threshold for the PAT and a resolution threshold for the CAT for each subject. In addition, observations on the PAT were graded for a resolution threshold: the last level in which the subject appeared able to see any target, either face or oval, or both targets on at least two of three cards presented.
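The two-of-three grading rule can be sketched as follows. This is an illustration only: the text does not say how a passed level that follows a failed level was handled, so the sketch assumes the first failed level ends the descent. The function name and the example data are hypothetical.

```python
from typing import Dict, List, Optional


def clinical_threshold(correct_by_level: Dict[float, List[bool]],
                       min_correct: int = 2) -> Optional[float]:
    """Return the smallest logMAR level passed before the first failed level.

    correct_by_level maps a logMAR level to the graded outcomes for the
    cards shown at that level (True = observation matched target position).
    Returns None if even the largest level was failed.
    """
    threshold = None
    # Walk from the largest optotypes (highest logMAR) toward the smallest.
    for level in sorted(correct_by_level, reverse=True):
        if sum(correct_by_level[level]) >= min_correct:
            threshold = level
        else:
            break  # assumption: the first failed level ends the descent
    return threshold


# Hypothetical graded observations for one subject and one examiner.
example = {
    1.0: [True, True, True],
    0.9: [True, True, False],
    0.8: [True, False, True],
    0.7: [False, True, False],
    0.6: [False, False, False],
}
print(clinical_threshold(example))  # 0.8
```

The same function would serve for the PAT resolution threshold if each outcome were graded as "any target seen" rather than "face located".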


With two optotypes, determining the difference between a subject seeing both targets versus looking for targets but not actually seeing them requires more subtle observations of the child’s behavior beyond gaze. To add to the certainty of this resolution threshold, high confidence was required of observations that indicated the subject was looking equally at ‘‘both’’ targets or looking at the oval target if there was another observation of ‘‘none’’ recorded by the observer at the same acuity level. For each subject, mean acuity thresholds (resolution and recognition for the PAT and resolution for the CAT) were calculated from the independent observations of each of the three examiners. The thresholds from each examiner as determined from the independent grading analysis were compared to evaluate interobserver reliability via intraclass correlation coefficients (ICCs) calculated from a mixed-model analysis of variance. Observer confidence in observations was also compared between tests and observers to aid our testability analysis. For each subject who completed both tests (PAT and CAT), the proportion of high confidence observations (2) from each observer was determined, and from these, a mean proportion of the three observers was derived. These proportions were used to evaluate the difference between confidence values for the tests. Intraclass correlation coefficients were also used to evaluate the reproducibility in reported confidence between observers. Threshold acuity estimates were also derived from a weighted logistic regression model (IBM SPSS Version 20, Proc Logistic Regression). The confidence ratings provided the weights where more confident judgments were given more weight in the analysis. The logistic model was used to give an objective assessment of the acuity threshold for each subject using all available data. 
This allowed us to objectively model the subjective judgment of children’s preferential looking behavior where observer confidence plays a significant role in the final assessment. The logistic model was ln[p/(1 - p)] = a + b × logMAR and was computed separately for each subject using all values. In this equation, p was the probability of a correct response (seeing the target) and the b coefficient was the slope of the curve progressing from responses that were all correct to all incorrect. A steeper slope reflected a sharper distinction between positive and negative responses. The threshold of acuity was defined as the logMAR value at which the probability of a correct response equaled 0.5. Setting p = 0.5 makes the left side ln(1) = 0, so the solution of the above equation is logMAR = -a/b. Because the observations evaluated in each threshold estimate are correlated, we used logistic regression only to compute a summary statistic of the threshold, not to make inferences within an individual subject. The visual acuity range of logMAR values was 0 to 1. Subjects covered this entire spectrum. Values were added to anchor the ends of the scale. Ten additional logMAR values of 1 (20/200) scored as correct were added to each subject for each judge and 10 logMAR values of -0.3 (20/10) scored as incorrect were added to each subject for each judge to anchor the extreme values and allow the equations to all converge. The procedure of anchoring the performance with assumed values of correct and incorrect subject response at the extremes allowed the actual judgments to define the main slope of the function within the context of expected vision. Ten was an arbitrary number at each end and was chosen to be about one third of the average number of actual observations for each subject. A within-subjects analysis of variance was used to evaluate the equivalence of slopes and thresholds between the PAT and CAT strategies. It is known that acuity improves with age in our subject sample, so we tested the validity of the measures by evaluating the variance accounted for by the threshold regressed on age (R²). Differences between measures, both clinical and logistic models, were evaluated across levels of acuity using Bland-Altman charts. Planned comparisons between mean thresholds were illustrated in Fig. 7 in which nonoverlapping 84% confidence intervals are equivalent to least significant difference tests and reflect statistically significant differences at an unadjusted p < 0.05.29 This particular method allows for visual comparison of means rather than a complex table.
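The anchored, confidence-weighted logistic threshold described above can be sketched in plain Python. This is not the authors' SPSS procedure: it is a small Newton-method fit of ln[p/(1 - p)] = a + b × logMAR with the threshold taken as -a/b. The observations are hypothetical, and giving each anchor point a weight of 1 is an assumption, since the text does not state the anchor weights.

```python
import math
from typing import List, Tuple


def _sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_weighted_logistic(x: List[float], y: List[int], w: List[float],
                          max_iter: int = 100,
                          tol: float = 1e-8) -> Tuple[float, float]:
    """Fit ln[p/(1-p)] = a + b*x by weighted maximum likelihood."""

    def loglik(a: float, b: float) -> float:
        total = 0.0
        for xi, yi, wi in zip(x, y, w):
            p = min(max(_sigmoid(a + b * xi), 1e-12), 1.0 - 1e-12)
            total += wi * (yi * math.log(p) + (1 - yi) * math.log(1.0 - p))
        return total

    a = b = 0.0
    for _ in range(max_iter):
        ga = gb = haa = hab = hbb = 0.0
        for xi, yi, wi in zip(x, y, w):
            p = _sigmoid(a + b * xi)
            r = wi * (yi - p)          # weighted residual (gradient term)
            v = wi * p * (1.0 - p)     # weighted variance (Hessian term)
            ga += r
            gb += r * xi
            haa += v
            hab += v * xi
            hbb += v * xi * xi
        det = haa * hbb - hab * hab
        if det <= 0.0:
            break
        da = (hbb * ga - hab * gb) / det   # Newton step for the intercept
        db = (haa * gb - hab * ga) / det   # Newton step for the slope
        step, base = 1.0, loglik(a, b)
        # Halve the step while it would lower the log-likelihood.
        while step > 1e-6 and loglik(a + step * da, b + step * db) < base:
            step /= 2.0
        a += step * da
        b += step * db
        if abs(step * da) + abs(step * db) < tol:
            break
    return a, b


def anchored_threshold(levels: List[float], correct: List[int],
                       confidence: List[float],
                       n_anchor: int = 10) -> Tuple[float, float]:
    """Return (threshold, slope) after adding the anchor observations."""
    x = list(levels) + [1.0] * n_anchor + [-0.3] * n_anchor
    y = list(correct) + [1] * n_anchor + [0] * n_anchor
    # Assumption: each anchor observation carries a weight of 1.
    w = list(confidence) + [1.0] * (2 * n_anchor)
    a, b = fit_weighted_logistic(x, y, w)
    return -a / b, b


# Hypothetical data: three cards per level, confidence 1 (low) or 2 (high).
levels = [1.0] * 3 + [0.9] * 3 + [0.8] * 3 + [0.7] * 3 + [0.6] * 3 + [0.5] * 3 + [0.4] * 3
correct = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0]
confidence = [2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2]

thr, slope = anchored_threshold(levels, correct, confidence)
print(f"threshold = {thr:.2f} logMAR, slope = {slope:.1f}")
```

Note that the example responses overlap (one miss at 0.7, one hit at 0.5). If a subject's responses are perfectly separated by level, the maximum likelihood slope is unbounded even with the anchors in place, so a fit like this one stops at the iteration limit rather than at a true optimum.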

RESULTS

Fifty-two subjects reported for testing. Five children (all younger than 16 months) were uncooperative for completion of one or both acuity tests, leaving 47 subjects whose data could be used for threshold comparisons between tests and reliability analysis. In evaluating the children who were uncooperative for yielding threshold acuities, four of these five subjects were specifically uncooperative with the PAT but responded much better to the CAT. There was no relationship between test order and cooperation in this group. All three examiners agreed that these four children appeared to reach threshold on the CAT but not the PAT. Only one subject (aged 8 months) was specifically unresponsive to the CAT but was relatively cooperative for the PAT.

To evaluate testability for recognition acuity on the PAT, we divided the subjects into children who clearly looked at both targets for at least 50% of target presentations, indicating that they were actively engaged in the task of discriminating the face from the oval target, and children who responded less than 50% of the time in this manner, only looking at one target. Thirty-six of 47 subjects who completed testing responded to the PAT following the desired strategy. In addition, the subject who was specifically unresponsive to the CAT but was relatively cooperative for the PAT did not look at both targets on at least 50% of presentations. Therefore, the overall testability for recognition acuity with the PAT was 36 of 52 (69%) of all subjects for whom testing was attempted. There was a clear association between age and response pattern as shown in Fig. 2. Testability for children younger than 18 months was 44%, whereas 96% of children 18 months and older responded well to the recognition paradigm. Although many of the younger subjects did not respond well to the dual-optotype format of the PAT, it was still possible to obtain resolution acuity as long as the child was cooperative enough to complete the test.
FIGURE 2. Subject age distribution and testability for recognition acuity with the Pacific Acuity Test (testability based on subjects looking at both targets before indicating the position of the face with gaze and/or pointing on at least 50% of card presentations).

Forty-eight (92%) of 52 subjects were testable to yield resolution acuity with the PAT. In comparison, 51 (98%) of 52 subjects were testable to yield resolution acuity with the CAT. There were significant correlations of visual acuity (logMAR) thresholds with age across all three methods of measuring acuity thresholds for both clinical criteria (R² = 0.60, 0.69, 0.58 for PAT recognition, PAT resolution, and CAT, respectively; Fig. 3) and logistic model estimates (R² = 0.73, 0.71, 0.65 for PAT recognition, PAT resolution, and CAT, respectively; Fig. 4). There was no interaction between method of measuring acuity and age for the clinical criteria (F(2,90) = 1.502, p = 0.23), but there was a modest interaction in the logistic model threshold estimates (F(2,90) = 3.67, p = 0.03) as realized by a converging of the resolution and recognition with increasing age. The age-adjusted thresholds had a significant main effect for methods estimated clinically and by logistic model (F(2,90) = 73.62, p = 0.001 and F(2,90) = 27.66, p < 0.001, respectively). Fig. 5 demonstrates the difference between PAT recognition and CAT as a function of age. Although the mean difference between PAT recognition acuity and CAT (resolution) acuity thresholds was +0.11 logMAR (0.15 SD, p < 0.001), there was considerable variance with a range from -0.27 to +0.41 logMAR in the acuities derived from the clinical model. Fifty-one percent of subjects yielded thresholds that were one line or more greater for PAT recognition versus CAT. Three subjects (6%) yielded thresholds that were one or more lines lower for the PAT recognition acuity relative to the CAT. Fig. 6 compares clinical and logistic model acuities across age. Negative values reflect poorer acuity in the logistic model. The mean difference between the clinical and logistic models was -0.12 logMAR (0.13 SD, p < 0.001), 0.09 logMAR (0.04 SD, p < 0.001), and 0.02 logMAR (0.04 SD, p < 0.023) for PAT recognition, PAT resolution, and CAT, respectively. The logistic model determined significantly higher PAT recognition values at younger ages (<22 months) but was similar to the clinical model at older ages. There was no difference over age between the clinical and logistic model for PAT resolution and CAT acuities. Fig. 7 illustrates the differences between PAT recognition and the other methods in both clinical and logistic model estimates. The logistic model demonstrated higher CAT resolution visual acuity than PAT (paired t = 2.54, p = 0.015), whereas there was no difference between CAT and PAT resolution for the clinical model (paired t = 0.83, p = 0.41). The PAT measure was valid as its relationship with age was essentially the same as that of the CAT. The reliability analysis measured the consistency of threshold acuity values (logMAR) derived from the three observers while viewing the same patient responses.
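The paired comparisons above (mean difference, SD of the differences, paired t) are the quantities summarized in a Bland-Altman chart. A minimal sketch follows, using hypothetical thresholds rather than the study data; the 1.96 × SD limits of agreement are the conventional Bland-Altman choice and are not values reported in the text.

```python
import math
import statistics
from typing import Dict, List


def bland_altman(test_a: List[float], test_b: List[float]) -> Dict[str, float]:
    """Summarize paired differences (test_a - test_b) between two methods."""
    diffs = [a - b for a, b in zip(test_a, test_b)]
    bias = statistics.mean(diffs)        # mean difference
    sd = statistics.stdev(diffs)         # sample SD of the differences
    n = len(diffs)
    return {
        "bias": bias,
        "sd": sd,
        "lower_loa": bias - 1.96 * sd,   # conventional limits of agreement
        "upper_loa": bias + 1.96 * sd,
        "paired_t": bias / (sd / math.sqrt(n)),
    }


# Hypothetical logMAR thresholds for four subjects.
pat_recognition = [0.5, 0.4, 0.6, 0.3]
cat_resolution = [0.4, 0.4, 0.4, 0.2]

for name, value in bland_altman(pat_recognition, cat_resolution).items():
    print(f"{name}: {value:.3f}")
```

For plotting, each difference would be drawn against the mean of the two measurements for that subject, with horizontal lines at the bias and the two limits.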
A clinical threshold acuity and a logistic model estimate were computed for each subject and each observer as previously described. Analyses comparing estimates based on the clinical threshold values revealed that the three observers were in agreement as determined by ICCs of 0.90, 0.94, and 0.90 for PAT recognition, PAT resolution, and CAT, respectively. Comparison of the logistic model threshold values yielded ICCs of 0.95, 0.96, and 0.95 for PAT recognition, PAT resolution, and CAT, respectively. For reference, ICC values less than 0.4 are considered poor; 0.4 to 0.75, fair; and more than 0.75, excellent.30 The average range between acuity thresholds determined from the three observers using the clinical criteria was 0.12 logMAR and 0.09 logMAR for PAT recognition and CAT, respectively. For the PAT, 51% of recognition acuity thresholds between the three observers were within 0.1 logMAR (one line), 87% were within 0.2 logMAR (two lines), and 100% were within 0.3 logMAR (three lines, one octave). For the CAT, 72% of acuities between observers were within 0.1 logMAR, 89% were within 0.2 logMAR, and 94% were within 0.3 logMAR. Consistency over visits for the same patient was not measured.

The proportion of high confidence values (2) for observations was compared between the PAT and CAT to provide insight into the perceived ability of examiners to make judgments of gaze. We found no difference in overall confidence of observers between the two tests. The mean difference in high confidence proportions (PAT - CAT) was 0.00 (paired t test p = 0.98). To determine if observer confidence was dependent on age and testability, we divided subjects into two groups. For subjects aged 6 to 17 months (n = 22), the proportion of high confidence observations with the PAT and CAT tests was 0.70 and 0.69, respectively. For subjects aged 18 to 35 months (n = 25), the proportion of high confidence observations was 0.88 and 0.89 for PAT and CAT, respectively. The ICC for PAT confidence values between the three observers for all subjects was 0.83. The ICC for CAT confidence values between observers was 0.80.
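An intraclass correlation of the kind reported above can be computed from a subjects-by-observers table of thresholds. The text states only that the ICCs came from a mixed-model analysis of variance, so the specific form used here, the two-way mixed, single-rater consistency coefficient often written ICC(3,1), is an assumption, and the thresholds are hypothetical.

```python
from typing import List


def icc_consistency(ratings: List[List[float]]) -> float:
    """ICC(3,1): two-way mixed model, single rater, consistency.

    ratings[i][j] is the value given to subject i by observer j.
    ICC = (MS_subjects - MS_error) / (MS_subjects + (k - 1) * MS_error)
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of observers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_observers = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_subjects - ss_observers

    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)


# Hypothetical logMAR thresholds: five subjects, three observers.
thresholds = [
    [0.8, 0.8, 0.7],
    [0.6, 0.7, 0.6],
    [0.4, 0.4, 0.5],
    [0.3, 0.3, 0.3],
    [0.5, 0.4, 0.4],
]
print(f"ICC(3,1) = {icc_consistency(thresholds):.2f}")
```

A consistency form ignores any constant offset between observers; an absolute-agreement form would also penalize an observer who reads systematically higher or lower than the others.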

DISCUSSION

In this study, we found that 69% of subjects were able to clearly respond to the more complex task of discriminating between two vanishing optotype figures (face vs. opposing oval) to yield a recognition visual acuity with the PAT. Although the small sample size limits our ability to provide definitive testability norms, we see a clear age-related change in testability, with a transition from relatively low to high testability for recognition acuity from 16 to 22 months of age (Fig. 2). We also see higher acuity thresholds with PAT recognition compared with PAT or CAT resolution testing for children younger than 22 months than for those older than this age (Fig. 5). This disparity with younger subjects is especially notable for the logistic model (Figs. 4, 6) because the logistic threshold analysis takes into account responses across all acuity levels weighted with examiner confidence values. Observers were equally confident in judging gaze preference with either PAT or CAT for both the younger and older age groups, so observer confidence in the children’s gaze responses does not appear to be a significant factor. The relatively elevated recognition acuity threshold we see for many of the younger subjects is clearly associated with testability. Although there was a strong tendency for all subjects to show preference for the face in their fixation pattern, this preference was not consistent with many of the younger children. The detection task with a single optotype (CAT) requires less sustained attention and scanning. To obtain a recognition threshold with the PAT, the examiner may have to present each card for a longer period to observe the child’s cumulative gaze responses to the targets. Older children, who were actively engaged in finding the face target, were able to complete the forced-choice recognition process quickly and clearly indicate that they had found the face (most were willing to point). Younger children in our study would often make a quick glance at the card on initial presentation but would not sustain attention or look between targets to demonstrate that they had actually shown a preference for one target over the other.

FIGURE 3. Clinical model acuity comparison between Pacific recognition, Pacific resolution, and Cardiff (resolution) testing strategies.
After independent grading, we found that most of these ‘‘quick glance’’ responses were directed toward the

face rather than the oval target. However, we cannot be certain that a child is truly engaged in the recognition process unless we see clear evidence that the child is actively analyzing the details of each target before indicating a final choice. The challenge of testability when using the PAT paradigm for children younger than 18 months decreases the validity of the threshold values for many of these subjects. These findings underscore the importance of evaluating the reliability of patient responses to suprathreshold targets with the PAT to determine if testing for recognition acuity will be valid. Based on the age-adjusted mean thresholds (Fig. 7), the clinical threshold analysis revealed no significant difference between CAT and PAT resolution thresholds. The CAT and PAT resolution acuities found in the present study are also very comparable in value across age with CAT measures from a normative study.18 The variance we see between the PAT and CAT measures in the present study can be considered in light of the expected test-retest variability using vanishing optotypes in young children. In the CAT normative study noted above, 39 subjects were retested in a separate testing session. Mean test-retest difference was 0.1 logMAR or less in 74% of subjects and was 0.2 to 0.4 logMAR in 26% of subjects. In the present study, we see a very similar variance between PAT resolution thresholds and CAT where 53% of measures between the two tests differed by 0.1 logMAR or less and 47% of measures varied by 0.1 to 0.33 logMAR. This indicates that a 1 to 3 acuity level difference between measures may not be significant unless observed over repeated measures. Temporal variability in behavior in young preschoolers is a confounding factor for any comparison limited to two measures. 
Although there was very close agreement between the resolution thresholds of acuity determined by both the PAT and CAT, there may be greater uncertainty in determining this threshold (the point when the child was no longer able to see anything) with the

Optometry and Vision Science, Vol. 91, No. 1, January 2014

Copyright © American Academy of Optometry. Unauthorized reproduction of this article is prohibited.

Pacific Acuity Test Validity and Reliability, Lowery et al.

FIGURE 4. Logistic model acuity comparisons between Pacific recognition, Pacific resolution, and Cardiff (resolution) testing strategies.

dual-optotype format of the PAT. When the resolution threshold is reached with the CAT, the child typically searches for the target but will demonstrate inability to see the target by failing to hold fixation on any specific location. With the PAT, it may be possible to make the false assumption that the child is seeing both targets when they scan up and down looking for the face. This is why we required a high confidence for an observation of “both” to be considered valid toward the detection threshold when an observation of “none” was determined for any observation at the same acuity level. Without verbal confirmation, observers had to rely on additional behavioral cues of facial expression and fixation time to judge when the child had reached the detection limit with the PAT. In the experimental setting, in which many cards had been presented before reaching this point, observers were in a better position to make these judgments than they would be in a clinical setting where fewer cards would be presented before reaching threshold. To increase the validity of the acuity measure, we recommend presenting more than three cards per level at and just below threshold when using the PAT to determine the resolution threshold, or using a single-optotype format (face only) with children who are not able to respond appropriately to the dual-optotype format.

For most subjects, the PAT recognition paradigm yielded a higher threshold of acuity than the simpler detection task of the CAT. These results were expected given the additional challenge the recognition task entails. The CAT can be used as a recognition test if the acuity threshold is determined by asking the child to identify the figures.
Using the CAT in this manner has been found to yield acuity thresholds 0.1 to 0.2 logMAR higher than the detection task threshold obtained using the test in a preferential looking manner.18 Therefore, the results of this study are consistent with the expected outcome given the addition of a discrimination task using vanishing optotypes. The question that merits further study is how this challenge compares with the recognition challenge of symbols in an isolated or standard chart format. We know that the psychophysical basis of the vanishing

optotype may yield very different results depending on the nature of the specific vision condition. The figures used in both the CAT and PAT are quite large, subtending a retinal image size much larger than the fovea. However, near the threshold of acuity, it is only possible to resolve the specific portion of the figure around the foveal fixation point. Identifying the figure then becomes a matter of scanning to find recognizable features. Therefore, recognition requires attention, motivation, and feature analysis skills that are developmentally and perceptually dependent. However, we also know that children with visual deficits caused by amblyopia or pathology have more difficulty with attention to details, fixation stability, and feature analysis. Although there are psychophysical differences between the recognition acuity task with vanishing optotype figures and high-contrast optotypes, the results of this study indicate that the dual-optotype strategy of the

FIGURE 5. Comparison between Pacific recognition acuities and Cardiff resolution acuities as a function of age.


FIGURE 6. Comparison between acuities derived from the clinical and logistic models across age.

PAT yields acuity thresholds that are a step closer to what we would find with a standard optotype test. Further research is needed to compare the PAT against standard optotypes in children and adults with refractive or pathological visual deficits so that clinicians can interpret and apply the results with greater certainty.

The interobserver reliability found in this study for both PAT and CAT was high, but we caution readers against comparing our findings with those of previous FPL studies that used a test-retest strategy.31 In the present study, all three observers were analyzing the same responses for each subject, so the value of our interobserver reliability is only useful for comparison between the two tests evaluated. We found no difference between the interobserver reliability for the PAT (recognition) and the CAT. Even though the PAT recognition paradigm requires more complex


observations, the consistency between different observers in determining the threshold of acuity for each subject was essentially the same as the simpler detection paradigm of the CAT. There are some potential limitations of our experimental design. We were concerned that there may be a systematic bias in our experiment because of the different figures between the PAT and CAT. The CAT optotypes change with each acuity level, whereas the PAT uses the same optotypes across all levels. It is possible that children are more interested and maintain attention longer with one test or the other because of these differences. Several findings indicate that optotype design did not create a systematic bias. We did not see a significant difference between CAT and PAT resolution acuity thresholds across age, indicating that children were equally attentive to both tests when observers were only evaluating the presence or absence of gaze behavior. Also, observers demonstrated the same confidence in the ability to judge subject gaze behavior between the two tests. The only part of our analysis where we feel that optotype design may have introduced bias is testability. It is possible that the four subjects who completed the CAT but failed to complete the testing for the PAT were more interested in the CAT optotypes. Because these untestable subjects were all younger than 16 months, we feel that it is equally plausible that the dual-optotype paradigm of the PAT was confusing to these children and they were unsure what was being asked of them. Future research could investigate this further by comparing a single-optotype (face only) PAT presentation against CAT. 
Finally, there may have been some examiner bias affecting acuity thresholds in our study because examiners knew what acuity level was being tested. This applies especially to the second test administered, because the examiner already knew the approximate threshold of the previous test based on child behavior and where the data recorder stopped the testing. Test order was alternated to reduce the effect of test order bias in our overall analysis. There are also significant differences between the experimental protocol used in this study and the recommended clinical protocol

FIGURE 7. Age-adjusted mean threshold acuity estimates. Nonoverlapping 84% confidence limits are statistically significant at an unadjusted p < 0.05.


for each test. We presented all of the cards at each level for each subject to compare observations and results between observers without bias. However, this research methodology yielded a significantly longer testing time than what would be achieved using a clinical methodology. It was often difficult to maintain a child’s attention through the course of the 10- to 20-minute overall testing time for both tests. It is likely that testability to threshold would have improved, particularly for younger subjects, with fewer card presentations at acuity levels well above threshold. In this study, it was necessary to determine acuity thresholds by only one pass through the card levels to avoid bias for the video observers. In a clinical setting, additional cards would be presented around threshold to increase validity. Also, in a clinical setting, the observer would verify the actual target position after each response and adjust testing as needed. In our experiment, observers were blind to target positions. Therefore, observer confidence was based only on the child’s fixation and behavioral pattern. Observers found this more challenging than a clinical method where immediate feedback on target position is possible.
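The interobserver reliability discussed in this section is summarized in the paper by intraclass correlation coefficients. The text does not state which form of the coefficient was used, so the following Python sketch assumes the two-way random effects, absolute agreement, single-rater form often written ICC(2,1). The function name and the threshold values are hypothetical and are not study data.

```python
def icc2_1(ratings):
    """ICC(2,1) for a complete subjects-by-observers table.

    ratings: list of rows, one per subject, each row holding one
    threshold (for example in logMAR) per observer.
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of observers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                # between-subjects mean square
    msc = ss_cols / (k - 1)                # between-observers mean square
    mse = ss_err / ((n - 1) * (k - 1))     # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


if __name__ == "__main__":
    # Hypothetical logMAR thresholds for five children, three observers each.
    thresholds = [
        [0.3, 0.3, 0.4],
        [0.5, 0.6, 0.5],
        [0.2, 0.2, 0.2],
        [0.7, 0.6, 0.7],
        [0.4, 0.4, 0.3],
    ]
    print(round(icc2_1(thresholds), 2))
```

Because all three observers in the study scored the same recorded responses, a coefficient of this kind reflects agreement in scoring only and says nothing about test-retest variability, which is the caution the authors raise.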

CONCLUSIONS

In this study, we found a significant age-related improvement in testability with the PAT. Children seem to respond well to the forced-choice dual-optotype strategy, yielding valid recognition acuity with the PAT by approximately 18 months of age. It was possible to obtain resolution acuity with the dual-optotype design of the PAT starting at 6 months of age, but a single-optotype design is likely to be more efficient for obtaining resolution acuity in a clinical setting for children younger than approximately 18 months. The PAT provided very reliable results in reaching the same thresholds based on the same observations between the three different observers. There was no difference in observer confidence in observations of children’s fixation behavior between the single-optotype FPL test (CAT) and the dual-optotype FPL paradigm of the PAT. The PAT yielded a mean recognition threshold that was 0.1 logMAR (one Snellen line) higher than the resolution threshold of a single vanishing optotype test (the CAT). Further research is needed to evaluate the PAT against standard optotypes in children and adults with subnormal vision so that clinicians can use the test with confidence to detect and measure visual deficits in young children and individuals with disabilities.
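To make the 0.1 logMAR figure above concrete, the following Python sketch converts logMAR values to Snellen denominators using the standard definition logMAR = log10(MAR), with MAR in minutes of arc. The function names and the example acuity levels are illustrative assumptions and are not values from the study.

```python
def logmar_to_snellen_denominator(logmar: float, numerator: float = 20.0) -> float:
    """Snellen denominator for a given logMAR (numerator 20 ft by default)."""
    return numerator * 10 ** logmar


def mar_ratio(delta_logmar: float) -> float:
    """Factor by which the minimum angle of resolution grows
    when logMAR increases by delta_logmar."""
    return 10 ** delta_logmar


if __name__ == "__main__":
    # Hypothetical acuity levels; higher logMAR means poorer acuity.
    for level in (0.0, 0.3, 0.4, 1.0):
        denom = logmar_to_snellen_denominator(level)
        print(f"logMAR {level:.1f} is about 20/{denom:.0f}")
    # A 0.1 logMAR step enlarges the smallest resolvable detail by about 26%.
    print(round(mar_ratio(0.1), 3))
```

On a chart with logarithmic size progression, each line differs from the next by 0.1 log unit, which is the sense in which a 0.1 logMAR difference equals one line.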

ACKNOWLEDGMENTS

This study was made possible by a grant from the Kikuchi College Research Fund. None of the authors have any financial interest in any of the commercial products evaluated in this study. The first author designed and developed the Pacific Acuity Test in collaboration with Good-Lite Corporation.

Received March 11, 2013; accepted August 11, 2013.

REFERENCES

1. Dobson V, Teller DY. Visual acuity in human infants: a review and comparison of behavioral and electrophysiological studies. Vision Res 1978;18:1469–83.
2. Mayer DL, Arendt RE. Visual acuity assessment in infancy. In: Singer LT, Zeskind PS, eds. Biobehavioral Assessment of the Infant. New York, NY: Guilford Press; 2001:81–94.
3. Duckman R. Visual acuity in the young child. In: Duckman RH, ed. Visual Development, Diagnosis and Treatment of the Pediatric Patient. Philadelphia, PA: Lippincott Williams & Wilkins; 2006:34–42.
4. Hertz BG. Acuity card testing of retarded children. Behav Brain Res 1987;24:85–92.
5. Hertz BG, Rosenberg J. Acuity card testing of spastic children: preliminary results. J Pediatr Ophthalmol Strabismus 1988;25:139–44.
6. Chandna A, Karki C, Davis J, Doran RM. Preferential looking in the mentally handicapped. Eye (Lond) 1989;3(Pt. 6):833–9.
7. Adams RJ, Courage ML. Assessment of visual acuity in children with severe neurological impairments. J Pediatr Ophthalmol Strabismus 1990;27:185–9.
8. Mackie RT, McCulloch DL. Assessment of visual acuity in multiply handicapped children. Br J Ophthalmol 1995;79:290–6.
9. Geruschat DR. Using the acuity card procedure to assess visual-acuity in children with severe and multiple impairments. J Visual Impair Blin 1992;86:25–7.
10. Courage ML, Adams RJ, Reyno S, Kwa PG. Visual acuity in infants and children with Down syndrome. Dev Med Child Neurol 1994;36:586–93.
11. Van Splunder J, Stilma JS, Evenhuis HM. Visual performance in specific syndromes associated with intellectual disability. Eur J Ophthalmol 2003;13:566–74.
12. Stiers P, Vanderkelen R, Vandenbussche E. Optotype and grating visual acuity in patients with ocular and cerebral visual impairment. Invest Ophthalmol Vis Sci 2004;45:4333–9.
13. Kushner BJ, Lucchese NJ, Morton GV. Grating visual acuity with Teller cards compared with Snellen visual acuity in literate patients. Arch Ophthalmol 1995;113:485–93.
14. Dobson V, Quinn GE, Tung B, Palmer EA, Reynolds JD. Comparison of recognition and grating acuities in very-low-birth-weight children with and without retinal residua of retinopathy of prematurity. Cryotherapy for Retinopathy of Prematurity Cooperative Group. Invest Ophthalmol Vis Sci 1995;36:692–702.
15. Mayer DL, Fulton AB, Rodier D. Grating and recognition acuities of pediatric patients. Ophthalmology 1984;91:947–53.
16. Mayer DL. Acuity of amblyopic children for small field gratings and recognition stimuli. Invest Ophthalmol Vis Sci 1986;27:1148–53.
17. Adoh TO, Woodhouse JM, Oduwaiye KA. The Cardiff Test: a new visual acuity test for toddlers and children with intellectual impairment. A preliminary report. Optom Vis Sci 1992;69:427–32.
18. Adoh TO, Woodhouse JM. The Cardiff acuity test used for measuring visual acuity development in toddlers. Vision Res 1994;34:555–60.
19. Johnson C, Kran BS, Deng L, Mayer DL. Teller II and Cardiff Acuity testing in a school-age deafblind population. Optom Vis Sci 2009;86:188–95.
20. Sharma P, Bairagi D, Sachdeva MM, Kaur K, Khokhar S, Saxena R. Comparative evaluation of Teller and Cardiff acuity tests in normals and unilateral amblyopes in under-two-year-olds. Indian J Ophthalmol 2003;51:341–5.
21. Geer I, Westall CA. A comparison of tests to determine acuity deficits in children with amblyopia. Ophthalmic Physiol Opt 1996;16:367–74.
22. Howard C, Firth AY. Is the Cardiff Acuity Test effective in detecting refractive errors in children? Optom Vis Sci 2006;83:577–81.
23. Turati C, Simion F, Milani I, Umilta C. Newborns’ preference for faces: what is crucial? Dev Psychol 2002;38:875–82.


24. Farroni T, Johnson MH, Menon E, Zulian L, Faraguna D, Csibra G. Newborns’ preference for face-relevant stimuli: effects of contrast polarity. Proc Natl Acad Sci U S A 2005;102:17245–50.
25. Farroni T, Csibra G, Simion F, Johnson MH. Eye contact detection in humans from birth. Proc Natl Acad Sci U S A 2002;99:9602–5.
26. Frank MC, Vul E, Johnson SP. Development of infants’ attention to faces during the first year. Cognition 2009;110:160–70.
27. Harris SJ, Hansen RM, Fulton AB. Assessment of acuity in human infants using face and grating stimuli. Invest Ophthalmol Vis Sci 1984;25:782–6.
28. Harris SJ, Hansen RM, Fulton AB. Assessment of acuity of amblyopic subjects using face, grating, and recognition stimuli. Invest Ophthalmol Vis Sci 1986;27:1184–7.


29. Payton ME, Greenstone MH, Schenker N. Overlapping confidence intervals or standard error intervals: what do they mean in terms of statistical significance? J Insect Sci 2003;3:34.
30. Fleiss JL. The Design and Analysis of Clinical Experiments. New York, NY: Wiley; 1986.
31. Getz LM, Dobson V, Luna B, Mash C. Interobserver reliability of the Teller Acuity Card procedure in pediatric patients. Invest Ophthalmol Vis Sci 1996;37:180–7.

John P. Lowery
Pacific University College of Optometry
2043 College Way
Forest Grove, OR 97116
e-mail: [email protected]
