HHS Public Access Author manuscript

Lang Cogn Neurosci. Author manuscript; available in PMC 2017 January 01. Published in final edited form as: Lang Cogn Neurosci. 2016 ; 31(3): 404–424. doi:10.1080/23273798.2015.1105984.

The (in)dependence of articulation and lexical planning during isolated word production

Esteban Buz, University of Rochester, Department of Brain and Cognitive Sciences

T. Florian Jaeger, University of Rochester, Departments of Brain and Cognitive Sciences, Computer Science, and Linguistics

Abstract

The number of phonological neighbors to a word (PND) can affect its lexical planning and pronunciation. Similar parallel effects on planning and articulation have been observed for other lexical variables, such as a word’s contextual predictability. Such parallelism is frequently taken to indicate that effects on articulation are mediated by effects on the time course of lexical planning. We test this mediation assumption for PND and find it unsupported. In a picture naming experiment, we measure speech onset latencies (planning), word durations, and vowel dispersion (articulation). We find that PND predicts both latencies and durations. Further, latencies predict durations. However, the effects of PND and latency on duration are independent: parallel effects do not imply mediation. We discuss the consequences for accounts of lexical planning, articulation, and the link between them. In particular, our results suggest that ease of planning does not explain effects of PND on articulation.

Keywords: language production; lexical planning; articulation; neighborhood density; confusability

Introduction

Corresponding author: [email protected].

The link between the lexical planning of speech and articulation continues to play an important theoretical role in our understanding of speech production. Yet, its nature remains poorly understood. A priori, three aspects of speech production can be distinguished: the process of planning (e.g. lexical and phonological retrieval processes), the articulatory plan generated by planning processes (which depends not only on the process, but also the representations that it operates over), and the execution of the plan (i.e. articulation). Research in language production has largely focused on the relation between the first and the last aspect. An increasingly common theoretical position is that the process of planning is directly reflected in the articulatory plan and consequently in articulation and thus pronunciation (cf. Arnold & Watson, 2015; Goldrick, Vaughn, & Murphy, 2013; Kahn & Arnold, 2015; Kello, 2004; Kirov & Wilson, 2013; Watson, Buxó-Lugo, & Simmons, 2015). We will refer to this as the planning-drives-articulation hypothesis. Some accounts go further and propose that any systematic variation in articulation stems exclusively from variation in the course of planning and retrieving a word’s representation. For example, Kahn and Arnold (2012), aiming to account for effects of givenness on acoustic realization, propose that

“[…] acoustic reduction […] emerges from facilitation of the mechanisms of production. We hypothesize that reduction results from some combination of (1) activation of the conceptual and linguistic representations associated with a word, and (2) facilitation of any of the processes associated with generating an articulatory plan from a concept.” (Kahn and Arnold, 2012, p. 313)

According to this view, changes in a word’s pronunciation due to contextual givenness are thus assumed to wholly originate in facilitation of any of the representations or encoding processes involved in planning (see also Bard et al., 2000). Similar accounts have been proposed for changes in pronunciation due to priming (Balota, Boland, & Shields, 1989; Bell, Brenier, Gregory, Girand, & Jurafsky, 2009; Kello, 2004) and Stroop tasks (Kawamoto, Kello, Higareda, & Vu, 1999; Kello, 2004; Kello, Plaut, & MacWhinney, 2000). Similar assumptions about the link between lexical planning and articulation are increasingly accepted in psycholinguistic research (for citations, see Arnold, 2008; Balota et al., 1989; Bard et al., 2000; Bell et al., 2009; Kahn & Arnold, 2012, 2015; Lam & Watson, 2010; MacDonald, 2013; Watson et al., 2015).

Yet, despite the central role of the planning-drives-articulation hypothesis, direct tests of the hypothesis have largely been lacking. Previous evaluations of the hypothesis have relied on indirect evidence. Specifically, one common argument is based on evidence that some lexical or task properties affect both production planning and articulation in similar ways (e.g., Balota et al., 1989; Fox, Reilly, & Blumstein, 2015; Gahl, Yao, & Johnson, 2012; Kahn & Arnold, 2012; Kello, 2004; Kello et al., 2000). However, such parallel effects are insufficient to argue that effects on articulation are mediated through lexical planning (the central claim of the planning-drives-articulation view). Indeed, as we show below, parallel effects can arise in the absence of mediation. It is thus necessary to test the central prediction made by the planning-drives-articulation view: differences in lexical planning should be reflected in similar differences in articulation (and thus pronunciation), and possibly, all systematic variation in articulation should be mediated by, and reducible to, lexical planning.

The present work contributes to recent attempts to address this gap in the literature (Heller & Goldrick, 2014; Munson, 2007; Watson et al., 2015). We focus on a lexical property that has received much attention in the lexical planning and articulation literature, phonological neighborhood density (PND). Before we introduce the relevant literature on PND, we briefly elaborate on the type of account we aim to test and how we aim to test it. Specifically, there are two broad classes of accounts inspired by the planning-drives-articulation perspective. Competition accounts hold that increased competition during lexical planning leads to increased articulatory detail (Fox et al., 2015; Goldrick et al., 2013; Kirov & Wilson, 2013; see also Kello et al., 2000, for a cascading activation approach). Since competition is not a directly observable quantity, these accounts require further specification before they begin to make testable predictions about the planning-articulation link (for a more detailed critique of these accounts, see Jaeger & Buz, 2016). For example, it is sometimes argued that planning latencies are not necessarily a measure of the competition experienced during lexical planning (Damian, 2003; Mahon, Costa, Peterson, Vargas, & Caramazza, 2007, also Goldrick, p.c.). We thus postpone any further treatment of competition accounts to the Discussion. The second class of accounts holds that reduced production difficulty results in reduced pronunciations (Arnold & Watson, 2015; Bard et al., 2000; Bell et al., 2009; Kahn & Arnold, 2012; Watson et al., 2015). Such production ease accounts predict that (1) faster planning will result in less articulatory detail (Kahn & Arnold, 2015; see also Kirov & Wilson, 2013). A radical production ease account further predicts that (2) only production ease should systematically affect pronunciation. Figure 1 illustrates radical and moderate production ease accounts (Panel a and b) and contrasts them with the absence of production ease effects on articulation (Panel c). With these clarifications in mind we now turn to the literature on PND.

PND has received considerable attention in psycholinguistic research on both comprehension and production (for recent overviews, see, Chen & Mirman, 2012; Sadat, Martin, Costa, & Alario, 2014). Of interest here is that PND has been found to affect both the planning (Chen & Mirman, 2012; Heller & Goldrick, 2014; Sadat et al., 2014; Vitevitch, 2002; Vitevitch & Luce, 1999; Vitevitch & Sommers, 2003; Vitevitch & Stamer, 2006) and pronunciation of spoken words (Fox et al., 2015; Gahl et al., 2012; Munson, 2007; Munson & Solomon, 2004; see also Scarborough, 2010, 2012, 2013; Scarborough & Zellou, 2013; Wright, 2004; for a critique of some of these latter studies, see Gahl, 2015).1 For instance, one line of studies presented in Sadat et al. (2014) found that words with few phonological neighbors (low PND words) are planned more quickly than words with many phonological neighbors (high PND words; see also Vitevitch & Stamer, 2006). A separate line of studies found that low PND words are articulated with less detail than high PND words (Fox et al., 2015; Munson, 2007; Munson & Solomon, 2004; Scarborough, 2010, 2012, 2013; Scarborough & Zellou, 2013; Wright, 2004). This would seem to suggest a positive correlation between the amount of time required for lexical planning and the amount of detail provided during articulation, with faster planning resulting in less articulatory detail. Such a positive correlation has been taken as evidence for production ease accounts (Gahl et al., 2012; for similar arguments for the reduction of predictable or repeated instances of words, see, Arnold, 2008; Bard et al., 2000; Bell et al., 2009).

However, there are at least two problems with this argument. First, parallel effects of PND on planning and articulation are at best indirect evidence in favor of the planning-drives-articulation view. At worst, parallel effects can arise in the complete absence of mediation. Second, the empirical landscape is less clear than the above paragraph suggests. For example, some studies have found the converse relationship between PND and planning, that low PND words are planned more slowly (Munson, 2007; Vitevitch, 2002; Vitevitch & Luce, 1999). Other work has found the converse relationship between PND and pronunciation, that low PND words are pronounced with more detail (Gahl et al., 2012). An arguably bigger problem, however, is that almost all existing studies have focused on the role of PND in either planning or articulation. This means that arguments for or against specific claims about the planning-articulation link have relied on comparisons across studies that differ along many dimensions (e.g., some studies employed picture description, others employed reading tasks, some involve distinct languages, yet others were based on data from conversational speech).

1 Studies differ in how they calculate PND. Some calculate PND as the number of phonological neighbors that differ in only one segment from the target. Others sum the frequency of all neighbors (frequency-weighted PND, cf. Luce & Pisoni, 1998). Studies further differ in how edit distance is calculated (e.g., which operations of substitution, insertion, and deletion are considered) and in whether words that are morphologically related to the target are excluded when counting neighbors. We group these studies together and simply refer to their findings as PND effects.
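The PND definitions given in footnote 1 can be made concrete with a short Python sketch. It counts neighbors that differ from the target by exactly one segment (one substitution, insertion, or deletion) and, as one possible weighted variant, sums the log frequencies of those neighbors. The toy lexicon, its ARPAbet-style transcriptions, and its frequency counts are hypothetical and purely illustrative, as are the function names; whether a given study sums raw or log frequencies, and which edit operations it allows, varies across studies.

```python
import math


def is_neighbor(a, b):
    """True if segment tuples a and b differ by exactly one
    substitution, insertion, or deletion of a segment."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        # Same length: exactly one substituted segment.
        return sum(x != y for x, y in zip(a, b)) == 1
    if len(a) > len(b):
        a, b = b, a  # make a the shorter form
    # b has one extra segment: deleting it must yield a.
    return any(b[:i] + b[i + 1:] == a for i in range(len(b)))


def pnd(target, lexicon):
    """Unweighted PND: number of neighbors of target in the lexicon."""
    return sum(1 for word in lexicon if is_neighbor(target, word))


def log_freq_weighted_pnd(target, lexicon):
    """One weighted variant (an assumption): sum of neighbors' log frequencies."""
    return sum(math.log(freq) for word, freq in lexicon.items()
               if is_neighbor(target, word))


# Hypothetical toy lexicon: segment tuple -> made-up frequency count.
toy_lexicon = {
    ("K", "AA", "R"): 100,        # car
    ("JH", "AA", "R"): 10,        # jar
    ("B", "AA", "R"): 50,         # bar
    ("K", "AA", "R", "T"): 20,    # cart
    ("AA", "R"): 1000,            # are
    ("K", "AE", "T"): 80,         # cat
    ("S", "T", "AA", "R"): 30,    # star
}

car = ("K", "AA", "R")
jar = ("JH", "AA", "R")
print(pnd(car, toy_lexicon), pnd(jar, toy_lexicon))
print(log_freq_weighted_pnd(car, toy_lexicon))
```

In this toy lexicon, car has more neighbors than jar because cart is reachable from car by a single insertion but is two edits away from jar.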

In fact, we are aware of only a single study that investigated effect(s) of PND on planning and articulation under the same conditions (Munson, 2007; for related work, see Heller & Goldrick, 2014, who investigate noun density rather than PND). In a word reading study, Munson had speakers read aloud words either immediately upon presentation or with some delay. Munson also manipulated the lexical frequency and (frequency-weighted) PND of target words. Munson argued that any effect on articulation that is mediated through planning should be reduced in the delayed speech condition. This reduction was indeed observed for the effects of frequency on articulation, but not for the effects of PND on articulation: PND effects on articulation did not differ between the immediate and the delayed condition. Munson took this to argue that frequency, but not PND, effects on articulation are mediated through planning.

Munson further presented a regression analysis meant to directly test whether the effect of PND on articulation is mediated through lexical planning. Munson found that PND explained variation in pronunciation, even while effects of planning latency were simultaneously controlled for. These results are compatible with a link between production planning and articulation (prediction (1) above), but reject the radical production ease account (prediction (2) above). In particular, these results suggest that PND effects on articulation are not fully mediated through lexical planning but may stem from some other source.

However, since its publication, several potential problems have been identified with Munson’s study. First, the regression analysis Munson conducted collapsed over data from both the immediate and the delayed condition. It is possible that planning latencies in the delayed condition do not provide a good measure of the actual time course of lexical planning (the delay was always 1000 ms and thus predictable, potentially allowing advance planning). Collapsing over the immediate and delayed conditions may thus underestimate the effect of planning latencies on articulation, biasing Munson’s test against production ease accounts. Second, as has recently been discussed (Gahl, p.c.; Munson, p.c.), the stimuli used in Munson (2007) confounded PND with other phonological properties known to affect articulation (for a discussion, see Gahl, 2015). Third and finally, Munson’s analysis leaves open whether the effects of PND on articulation are at least partially mediated through lexical planning.

Consider Figure 2, which illustrates possible links between PND, lexical planning, and articulation under production ease accounts. (Note that Figure 2 is a simplification; in particular, different aspects of planning and articulation, reflected in different behavioral measures, might exhibit different dependencies.) Figure 2a illustrates the prediction under radical production ease accounts, where the influence of PND on articulation is fully mediated through lexical planning (i.e. if PND reduces lexical planning time, it also reduces pronunciation detail). The findings of Munson (2007), if confirmed, would argue against this account. Even if these findings are confirmed, however, this leaves open whether the influence of PND on articulation is completely independent of planning (see Figure 2c) or partially mediated through planning (see Figure 2b). Partial mediation would describe a moderate production ease account of PND effects on articulation. Complete independence, too, is compatible with moderate production ease accounts, but would imply that production ease does not contribute to PND effects on articulation.
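The contrast between full mediation and fully independent effects can be illustrated with a small simulation. The Python sketch below is not the analysis reported in this paper; it generates toy data under two assumed causal structures (all effect sizes, the sample size, the seed, and the function names are hypothetical) and fits an ordinary least-squares regression of duration on both PND and latency. Under full mediation the PND slope should shrink toward zero once latency is in the model, whereas under independent effects it should not, even though both scenarios produce parallel marginal effects of PND on latency and on duration.

```python
import random


def simulate(n, pnd_to_latency, latency_to_duration, pnd_to_duration, seed):
    """Generate toy trials under an assumed causal structure (arbitrary units)."""
    rng = random.Random(seed)
    pnd = [rng.gauss(0, 1) for _ in range(n)]
    latency = [pnd_to_latency * p + rng.gauss(0, 1) for p in pnd]
    duration = [
        latency_to_duration * lat + pnd_to_duration * p + rng.gauss(0, 1)
        for p, lat in zip(pnd, latency)
    ]
    return pnd, latency, duration


def ols_two_predictors(x, z, y):
    """Least-squares slopes (b_x, b_z) for y ~ x + z with an intercept.

    Centering the variables absorbs the intercept; the slopes then solve
    the 2x2 normal equations directly.
    """
    n = len(y)
    mx, mz, my = sum(x) / n, sum(z) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    szz = sum((b - mz) ** 2 for b in z)
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    szy = sum((b - mz) * (c - my) for b, c in zip(z, y))
    det = sxx * szz - sxz ** 2
    return (szz * sxy - sxz * szy) / det, (sxx * szy - sxz * sxy) / det


# Hypothetical effect sizes for the two scenarios.
scenarios = {
    "full mediation": dict(
        pnd_to_latency=0.5, latency_to_duration=0.5, pnd_to_duration=0.0),
    "independent effects": dict(
        pnd_to_latency=0.5, latency_to_duration=0.0, pnd_to_duration=0.5),
}

results = {}
for label, effects in scenarios.items():
    pnd, latency, duration = simulate(5000, seed=1, **effects)
    results[label] = ols_two_predictors(pnd, latency, duration)
    print(label, "PND slope: %.2f, latency slope: %.2f" % results[label])
```

A partial mediation scenario can be explored in the same way by setting both `latency_to_duration` and `pnd_to_duration` to non-zero values.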

In summary, while the study reported in Munson (2007) is of central importance to our understanding of the link between planning and articulation, it leaves open important questions. This motivates the current work. We investigate the effect of PND on lexical planning and articulation, as well as the link between them, while addressing the confounds that have been identified since the publication of Munson’s study. We use minimal pair stimuli that allow us to test for differences in PND while controlling, as much as possible, for differences in phonological form. Unlike in Munson (2007), all our data come from non-delayed productions. Additionally, we balance or control for additional variables that have been identified to affect planning or articulation since Munson (2007). We present a trial-level analysis to directly assess the link between planning and articulation (see Sadat et al., 2014, for a similar analysis of PND effects on planning). The current study also extends Munson’s in two other respects that facilitate comparison with other work on planning or articulation. First, Munson used a word reading task, whereas most work on the role of PND in lexical planning has relied on picture naming (see Sadat et al., 2014, and references therein). We thus employ a picture naming paradigm. Second, Munson measured vowel dispersion and vowel duration. This contrasts with the majority of studies on the planning-articulation link in other domains, which have focused on word durations (e.g., Arnold, 2008; Bard et al., 2000; Bell et al., 2009; Kahn & Arnold, 2015). We thus measure both vowel dispersion and word duration to facilitate comparison with both Munson (2007) and other work.

To anticipate the outcome of the current study, we find that effects of PND on articulation do not seem to be mediated through effects of PND on planning. This will lead us to discuss alternative explanations for the effect of PND on articulation, including explanations in terms of representational accounts (and, specifically, the production-perception loop of exemplar-based models, e.g., Pierrehumbert, 2002; Wedel, 2006) and accounts that allow articulation to be affected by communicative goals (e.g., Galati & Brennan, 2010; Lindblom, 1990; Schertz, 2013; Stent, Huffman, & Brennan, 2008).

Experiment

In a picture naming experiment, we investigate the effect of log-frequency-weighted PND on the lexical planning and articulation of the same words. Critical items consisted of minimal pairs (car-jar) which differed in log-frequency-weighted PND. This allows us to investigate PND effects when all but one segment of a word is held constant, thereby reducing the a priori expected differences in planning and articulatory measures due to differences in the phonological form.

Methods

Participants—36 University of Rochester undergraduates participated in the experiment. All were self-reported monolingual native English speakers. Participants were compensated $10.

Procedure—On each trial, participants were presented with a picture and had to name it. We instructed participants to name the pictures as quickly as possible. Figure 3a shows a schematic of a trial. Participants initiated each trial with a mouse click. Starting 250 ms later, a fixation cross was displayed at the center of the screen for 500 ms, and a beep tone was played for the first 250 ms. After the 500 ms had passed, the picture appeared at the center of the screen. All pictures were 420 by 420 pixels and were displayed on a screen with a resolution of 1680 by 1050 pixels, about 60 cm away from the participant. Participants ended the trial by clicking a mouse button. The experiment lasted no longer than 40 minutes.
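The trial timeline just described can be written out as simple arithmetic. The sketch below assumes that all times are measured from the trial-initiating mouse click, that the beep starts together with the fixation cross, and that the picture stays on screen until the trial-ending click; the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    name: str
    onset_ms: int
    offset_ms: Optional[int]  # None: lasts until the trial-ending click


def trial_timeline(click_ms: int = 0) -> List[Event]:
    """Event times for one trial, relative to the initiating mouse click."""
    fixation_on = click_ms + 250       # fixation starts 250 ms after the click
    fixation_off = fixation_on + 500   # fixation is shown for 500 ms
    beep_off = fixation_on + 250       # assumed: beep spans the first 250 ms of fixation
    return [
        Event("fixation", fixation_on, fixation_off),
        Event("beep", fixation_on, beep_off),
        Event("picture", fixation_off, None),
    ]


for event in trial_timeline():
    print(event)
```

Under these assumptions the picture appears 750 ms after the initiating click.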

Materials—Stimuli for the experiment consisted of 108 line drawings taken from the International Picture Naming Project (IPNP) database (Bates et al., 2003). For all pictures, IPNP norms identified the word we intended participants to produce as the dominant label (> 80% naming accuracy). Following Munson (2007) and Scarborough (2010, 2012), we binned targets into low vs. high log-frequency-weighted PND. The dominant labels for forty of these targets formed twenty monosyllabic minimal pairs (e.g., car-jar), such that one of the targets had greater log-frequency-weighted PND (e.g., car with a log-frequency-weighted PND of 51.08) and the other had lower log-frequency-weighted PND (e.g., jar with a log-frequency-weighted PND of 31.08).

Each minimal pair consisted of CVC, CCVC, or CVCC words. To control for possible compression effects, pairs shared a vowel and onset and coda complexity (i.e. targets in a pair either both had single segment codas or both had two-segment codas, following Scarborough, 2010). One pair was CCVC, three were CVCC, and the remainder (36) were CVC. Three pairs differed in the coda and the remainder (37) differed in the onset. Vowels were in the set /ɑ, æ, ɔ, aʊ, ɛ, eɪ, ɪ, oʊ, ʊ/. These minimal pairs constitute the target items for our study. A complete list of items is provided in Appendix B (Table B1). Our minimal pair design holds constant syllable, coda, and onset complexity, all of which could influence word durations and vowel dispersion. However, differences in consonant context, specifically voicing and manner, are known to affect vowel duration (House, 1961).

Chi-squared tests of independence show that, across the density groups, our stimulus pairs did not significantly differ in manner (plosive, nasal, fricative, lateral), place (bilabial, labial, labio-dental, alveolar, velar), or voicing (ps > 0.1). In addition, we balanced (log-transformed) frequency, average biphone log probability, number of alternative picture labels for paired pictures, and proportion of usage of the dominant label (no difference assessed by paired t-tests). The means (and standard deviations) of these measures by high vs. low log-frequency-weighted PND condition are provided in Table 1. We report log-frequency-weighted PND based on IPhOD2, a lexical database of 54,000 tokens of English from the SUBTLEXus corpus (Brysbaert & New, 2009). All results reported below replicated robustly when log-frequency-weighted PND was calculated based on CELEX (Baayen, Piepenbrock, & van Rijn, 1993), as in, for example, Scarborough (2010, 2012).

In addition to the 40 minimal pair pictures, there were 60 filler pictures. Fillers were pictures whose dominant labels were monosyllabic (9), bisyllabic (42), trisyllabic (7), or quadrisyllabic (2). Finally, eight pictures served as practice trials, following instructions and preceding the main session. Filler and practice labels were chosen so as not to be phonological neighbors of any of the critical or filler items.

Two lists were created by pseudo-randomly distributing the 20 minimal pairs across the 100 trials. Each participant saw both target words of a minimal pair item, with at least one, but no more than four, fillers appearing between pairs (fillers never intervened between the two members of a pair). The order of the two targets within a pair was counter-balanced across the two lists. Each list was seen by 18 participants. Every 25 trials, participants were prompted to take a break (breaks never intervened between neighbor-target pairs). Trials were automatically recorded from mouse click to mouse click. The experimenter sat silently in the recording booth with the participant.

Scoring—The first author transcribed all 100 target and filler pictures for each participant. One participant was removed because of low picture naming accuracy (72%). Naming accuracy of the remaining participants was high (88%). This left 35 participants for the analysis. We then checked whether participants’ productions corresponded to the intended picture labels. For two minimal pairs, one of the targets had very low intended label usage across participants (bark and rain,
