Accepted Manuscript

Title: Accelerometer-based automatic voice onset detection in speech mapping with navigated repetitive transcranial magnetic stimulation

Authors: Anne-Mari Vitikainen, Elina Mäkelä, Pantelis Lioumis, Veikko Jousmäki, Jyrki P. Mäkelä

PII: S0165-0270(15)00194-6
DOI: http://dx.doi.org/doi:10.1016/j.jneumeth.2015.05.015
Reference: NSM 7234

To appear in: Journal of Neuroscience Methods

Received date: 3-2-2015
Revised date: 19-5-2015
Accepted date: 21-5-2015

Please cite this article as: Vitikainen A-M, Mäkelä E, Lioumis P, Jousmäki V, Mäkelä JP, Accelerometer-based automatic voice onset detection in speech mapping with navigated repetitive transcranial magnetic stimulation, Journal of Neuroscience Methods (2015), http://dx.doi.org/10.1016/j.jneumeth.2015.05.015

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Authors: Anne-Mari Vitikainen (a, b), Lic.Phil.; Elina Mäkelä (a), B.Sc.; Pantelis Lioumis (a, c), Ph.D.; Veikko Jousmäki (d, e), Ph.D.; Jyrki P. Mäkelä (a), M.D., Ph.D.

a BioMag Laboratory, HUS Medical Imaging Center, University of Helsinki and Helsinki University Hospital, P.O. Box 340, FI-00029 Helsinki
b Department of Physics, University of Helsinki, P.O. Box 64, FI-00014 Helsinki
c Neuroscience Center, University of Helsinki, P.O. Box 56, FI-00014 Helsinki
d Department of Neuroscience and Biomedical Engineering, Aalto University School of Science, P.O. Box 15100, FI-00076 AALTO, Espoo, Finland
e Aalto NeuroImaging, Aalto University School of Science, P.O. Box 15100, FI-00076 AALTO, Espoo, Finland

Article type: Research article
Number of text pages: 20
Number of figures: 2
Number of tables: 1
Number of supplementary figures: 2
Number of supplementary tables: 3

Address for correspondence: Anne-Mari Vitikainen, BioMag Laboratory, HUS Medical Imaging Center, Helsinki University Hospital, P.O. Box 340, FI-00029 Helsinki. Tel: +358 5042 72020. Fax: +358 9471 74404. E-mail: [email protected]


Abstract

Background: The use of navigated repetitive transcranial magnetic stimulation (rTMS) in mapping of speech-related brain areas has recently been shown to be useful in the preoperative workflow of epilepsy and tumor patients. However, substantial inter- and intraobserver variability and non-optimal replicability of the rTMS results have been reported, and a need for additional development of the methodology is recognized. In TMS motor cortex mappings the evoked responses can be quantitatively monitored by electromyographic recordings; however, no such easily available setup exists for speech mappings.

New Method: We present an accelerometer-based setup for detection of vocalization-related larynx vibrations combined with an automatic routine for voice onset detection for rTMS speech mapping applying naming.

Comparison with Existing Method(s): The results produced by the automatic routine were compared with the manually reviewed video-recordings.

Results: The new method was applied in the routine navigated rTMS speech mapping for 12 consecutive patients during preoperative workup for epilepsy or tumor surgery. The automatic routine correctly detected 96% of the voice onsets, resulting in 96% sensitivity and 71% specificity. The majority (63%) of the misdetections were related to visible throat movements, extra voices before the response, or delayed naming of the previous stimuli. The no-response errors were correctly detected in 88% of events.

Conclusion: The proposed setup for automatic detection of voice onsets provides quantitative additional data for analysis of the rTMS-induced speech response modifications. The objectively defined speech response latencies increase the repeatability, reliability and stratification of the rTMS results.

Keywords: Presurgical planning, navigated TMS (nTMS), object naming, speech mapping

1. Introduction

Localization of speech-related brain areas by navigated repetitive transcranial magnetic stimulation (rTMS) during an object naming task has been suggested to be useful in planning of brain tumor and epilepsy surgery (1, 2). Use of navigation based on the individual's magnetic resonance imaging (MRI) with rTMS mapping enables the speech-related cortical sites to be transferred to the neuronavigation system (3) and to be used in surgical planning. Preoperative speech mapping by navigated rTMS may aid in objective risk-benefit assessment of the planned surgery, enable more precisely targeted smaller craniotomies, faster and safer intraoperative mapping, and safer surgeries for patients that cannot undergo awake craniotomy (2, 4). The rTMS speech mapping results have been compared to direct cortical stimulation (DCS) during awake craniotomy, implying that navigated TMS (nTMS) is remarkably sensitive but relatively non-specific in detecting the sites producing speech disturbance in DCS (4, 5). In preoperative nTMS mapping of motor cortex, shown to have a very good match with DCS (5, 9), the responses to nTMS are monitored by motor evoked potentials from the activated muscles. No such straightforward, easily recordable marker exists for detection of speech modifications induced by rTMS.

The navigated rTMS method was accepted by the US Food and Drug Administration (FDA) for presurgical speech mapping in 2012 (10), and, consequently, its use will probably expand in the near future. Recently, rTMS speech mapping results in patients with brain tumors and healthy subjects have suggested tumor-induced plasticity of speech representation areas (11, 12), and demonstrated differences in cortical areas related to object and action naming in healthy subjects (13). Thus, nTMS during an object naming task may have an impact on surgery planning and provide information about the organization of speech-related brain areas in general. Nevertheless, the intraobserver and interobserver comparisons of the nTMS speech mapping results show only limited replicability, and the currently used protocols need further development (14, 15). Additionally, the methodology is not completely standardized between the surgical groups applying it.

The no-response (speech arrest) errors are the most replicable results of nTMS speech mapping (11). However, speech disturbances such as semantic and phonological paraphasias, and performance errors during the pulse train are more difficult to separate quantitatively from the recorded videos. Particularly, the value of hesitations, delayed but not completely abolished responses induced by rTMS, is not clear, as their evaluation is quite subjective; interpretation of these errors has been considered as a possible reason for a high rate of false positive results of rTMS studies as compared with DCS (16).

Microphone recording of vocalization to detect the voice onsets objectively is hampered by ambient noise from TMS pulses and coil cooling (1). Electromyographic (EMG) signals from cricothyroid muscles have been used in combination with nTMS to monitor the effects of nTMS of the inferior frontal cortex on larynx muscles during object naming tasks (17). However, the insertion of the EMG wire electrodes into the cricothyroid muscle is invasive and requires skill (18).

Detection of larynx vibrations, coinciding with the fundamental frequency of the voice, with an accelerometer enables non-invasive follow-up of speech vocalizations (19). Accelerometers can accurately measure vocal activity (20-22). Compared with microphones, accelerometers are not sensitive to ambient environmental sounds and are therefore well suited for voice assessment (21).

In this study, we tested the feasibility of using an accelerometer to pick up the onsets of the vocalizations in navigated rTMS speech mapping.

2. Materials and Methods

2.1. Subjects

We made the accelerometer recordings as part of the rTMS speech mappings for twelve consecutive patients (4 females/8 males, age range 12–39 years) going through tumor or epilepsy surgery workup. Both hemispheres were stimulated with rTMS in eleven patients; patient #4 did not tolerate the right-hemisphere stimulation and only his left hemisphere was stimulated. For one patient, the data during the baseline was not recorded due to human error. For one patient the data was lost due to technical difficulties, for patient #7 the data of the left hemisphere stimulation was inaccessible, and for patient #6 the first part of the data of the left hemisphere stimulation was corrupted. The results of ten patients are presented. The study was approved by the local Ethical Committee.

2.2. Experimental design

The experiment started with an initial baseline session without rTMS to select the images that were correctly named and pronounced. In order to enable the offline comparison of the responses with and without the rTMS, the baseline image set was run two more times. During the second baseline session the rTMS coil was held near the patient's head in the navigation field of the rTMS system to ensure the trigger pulse output. The intensity of the baseline stimulation was held at 0 or 1 % of the maximum stimulator output, i.e., no rTMS stimulation was applied. Images that were not named, not named correctly, not named clearly, not articulated correctly, or named with delay or hesitation were removed from the image set after the first baseline round. Only the data from the second baseline round was used for the analysis and subsequent rTMS sessions. The images were displayed in random order. All sessions were video-recorded for offline analysis.

The rTMS measurements were carried out at the BioMag Laboratory using eXimia NBS 4.3 (Nexstim Ltd., Helsinki, Finland) and a commercial speech mapping module (NexSpeech, Nexstim Ltd., Helsinki, Finland). The navigation system estimates the strength of the maximum electric field at the stimulation location and overlays the estimated field strength on-line on the 3-D reconstruction of the individual's brain (23). Each stimulation site is tagged to the structural magnetic resonance (MR) images for subsequent analysis.

All rTMS stimulations were done with a figure-of-eight coil with a 70-mm outer diameter and a biphasic pulse shape. The resting motor threshold (MT) was determined from the abductor pollicis brevis muscle controlled by the hemisphere affected by the epilepsy or tumor, using the method of Rossini et al. (24). The rTMS intensity for the mapping was adjusted to produce a roughly equally strong electric field in all perisylvian cortical regions. If the stimulation caused intolerable discomfort to the subject, its intensity was lowered in 5-10 % decrements until tolerable. Thus, the stimulation intensity varied somewhat across subjects. The estimated induced electric field strength at the cortex was registered. The stimulations were done with 5-pulse rTMS trains at 5 or 7 Hz (1, 25).

In nine patients, the images to be named were a subset of color images out of a set of 84 images depicting everyday objects (1). In three patients, a selection of 92 images from a standardized image set (26) was used. The selection from the standardized set was chosen to represent items frequently used in Finnish everyday life, whose names are common in the Finnish language and that have only few synonyms. The subjects were asked to name the objects in Finnish as quickly and precisely as possible. The images were displayed for 700 ms on a computer screen with 2.0-3.0 s interstimulus intervals (ISI). The experiment started with a 2.5 s ISI. If needed, the ISI was adjusted according to the baseline performance of the patient. The rTMS trains started 300 ms after the image onset. The coil was hand-held and freely movable between the pulse trains. During the stimulation, the coil was moved between the pulse trains following a grid-like pattern so that the stimulated locations systematically covered a wide fronto-temporo-parietal cortical area. The orientation of the coil was adjusted to induce current primarily perpendicular to the fibers of the temporalis muscle to minimize muscle twitching, and secondarily perpendicular to the sulcus at the stimulation location. The cortical sites where rTMS produced naming errors were revisited to evaluate the repeatability of the effect.
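As a worked illustration of the stimulus timing described above, the pulse times within one train can be computed relative to the image onset. This is a minimal sketch in Python and not part of the original setup; the function name `pulse_times_ms` and its parameter names are hypothetical, and the pulses are assumed to be evenly spaced with the first pulse at the train onset.

```python
def pulse_times_ms(rate_hz, n_pulses=5, train_delay_ms=300.0):
    """Times (ms after image onset) of each pulse in one rTMS train.

    Assumes evenly spaced pulses, with the first pulse at the train onset.
    """
    interval_ms = 1000.0 / rate_hz
    return [train_delay_ms + i * interval_ms for i in range(n_pulses)]


IMAGE_DURATION_MS = 700.0  # image display time stated in the text

for rate in (5, 7):
    times = pulse_times_ms(rate)
    # Count the pulses delivered after the image has disappeared.
    after_offset = sum(t > IMAGE_DURATION_MS for t in times)
    print(f"{rate} Hz: last pulse at {times[-1]:.0f} ms, "
          f"{after_offset} pulse(s) after image offset")
```

Under these assumptions, a 5-pulse train spans 800 ms at 5 Hz and about 571 ms at 7 Hz, so the last pulses of a train fall after the 700 ms image display has ended.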

2.3. Manual review of the mapping data

The speech mappings were routinely reviewed offline from the video by a neuropsychologist with expertise in effects of DCS on speech. The categories available in the speech mapping module (no error, no-response error, performance error, semantic paraphasia, muscle stimulation and other) were applied in the analysis. Additionally, information about the subdivision of performance errors (e.g. delays, phonological paraphasias) was noted in the free comment field. The speech response latencies were not available in the speech mapping module.

2.4. Vibration recording

The subject's vocal activity, i.e. the fundamental frequency of the voice, was recorded during object naming with a three-axis accelerometer (ADXL330 iMEMS® Accelerometer, Analog Devices, Norwood, MA) attached to the skin on the left side of the subject's throat, onto the larynx site producing palpable vibrations during vocalization (Fig. 1). The analog accelerometer signals were connected to the EMG system of the stimulator with a custom-built interface. The recorded frequency band was 10–500 Hz and the sampling rate 3 kHz. A similar accelerometer has been used previously by Bourguignon and co-workers to detect the fundamental frequency of the reader's voice (19).

Fig. 1. The accelerometer attached to the throat.

2.5. Automated routine for voice onset detection

For the automated routine, the following files were collected: the tabular data of the speech mapping (a speech file containing the speech event related data from the commercial speech mapping module, including the name of the picture, speech exam identifications, and rTMS train sequence identifiers, converted to an Excel file), and the accelerometer and trigger signals (edf-files). The accelerometer data file corresponding to each of the speech exams (sessions) was identified and verified.

20

The accelerometer signals were high-pass filtered with Butterworth filter (4th order, cut-

21

off frequency of 80 Hz) to reduce low frequency interference and possible signal level

22

drifts while maintaining the characteristics of the voice (19, 22) ( Figure 2). The data was

23

first filtered in forward and then in reverse direction, preserving the waveform features

24

exactly at the same time point where they occur in the original signal (27). To enable a

Page 9 of 27

10 robust automatic onset calculation routine, the envelope (see Figure 2) of the filtered

2

signal was calculated using Hilbert transform. The signal envelope captures the varying

3

features of the signal generating the signal outline, and its analytic representation enables

4

fast calculations. This approach is commonly used in sound signal processing (28).
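The filtering and envelope steps described above can be sketched as follows. This is an illustrative Python version using NumPy and SciPy, which is an assumption of this sketch and not the authors' implementation; the function names and the synthetic test signal are hypothetical. Note that forward-backward filtering applies the filter twice, so whether the stated 4th order refers to a single pass or to the combined response is not specified in the text; the sketch uses a 4th-order design per pass.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

FS = 3000.0  # sampling rate in Hz, as stated in the text


def highpass_zero_phase(x, fs=FS, cutoff_hz=80.0, order=4):
    """Butterworth high-pass applied forward and backward (zero phase shift)."""
    sos = butter(order, cutoff_hz, btype="highpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)


def envelope(x):
    """Amplitude envelope: magnitude of the analytic (Hilbert) signal."""
    return np.abs(hilbert(x))


# Synthetic stand-in for an accelerometer trace (hypothetical values):
# a DC offset, a slow 2 Hz drift, and a 150 Hz "voiced" burst from 0.4 to 0.8 s.
t = np.arange(0.0, 1.0, 1.0 / FS)
voiced = (t > 0.4) & (t < 0.8)
signal = 0.5 + 0.2 * np.sin(2 * np.pi * 2 * t) + voiced * np.sin(2 * np.pi * 150 * t)

filtered = highpass_zero_phase(signal)
env = envelope(filtered)
```

In this synthetic case the offset and drift are removed by the high-pass filter, and the envelope is close to the burst amplitude during the voiced period and close to zero elsewhere.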

The recordings were split into several sessions (and thus several data files) controlled by the commercial program module. Some of the sessions may contain short but extremely intensive vibrations due to e.g. coughs, while the level of the vibrations related to silence and speech responses remains constant. In order to enable uniform processing of the baseline and subsequent rTMS sessions, the mean of the signal envelope during the whole session was calculated to represent the overall vibration level of the session. An active speech threshold was formed by multiplying this general signal level with an individually adjustable constant.

The first rough estimate of the speech periods was formed by taking into account the signal periods where the envelope of the signal amplitude exceeded the active speech threshold (29). This was done sample by sample (sampling rate 3000 Hz). At this stage, erroneous periods containing e.g. coughs, sighs, swallowing, or jaw movements were included in the analysis, in addition to the speech responses. To extract only the true speech response onsets, and not the shorter signals originating from non-desirable events, the envelope signal was then modified by applying a moving average filter (implemented as convolution with a rectangular unit pulse of 40 ms in length; Figure 2). After this, the first rising edge of the resulting signal between successive rTMS train onsets was determined as the speech response onset corresponding to the given rTMS train onset.
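The thresholding and onset search described above could look roughly like the sketch below. It is one possible reading of the text, written in Python with NumPy, not the authors' routine: the envelope is smoothed with a 40 ms rectangular window, compared against a threshold of a constant times the session mean, and the first upward crossing after each train onset is taken as the voice onset. The function name, the default constant `k=2.0`, and the synthetic envelope are hypothetical.

```python
import numpy as np

FS = 3000  # sampling rate in Hz, as stated in the text


def detect_voice_onsets(env, train_onsets, fs=FS, k=2.0, win_s=0.040):
    """Return one onset sample index (or None) per rTMS train onset.

    env          : signal envelope of a whole session
    train_onsets : sorted sample indices of the rTMS train onsets
    k            : hypothetical value of the adjustable threshold constant
    """
    env = np.asarray(env, dtype=float)
    threshold = k * env.mean()              # active speech threshold
    win = max(1, int(round(win_s * fs)))    # 40 ms -> 120 samples at 3 kHz
    smoothed = np.convolve(env, np.ones(win) / win, mode="same")
    above = smoothed > threshold
    # Rising edge: sample i is above the threshold and sample i-1 is not.
    rising = np.flatnonzero(above[1:] & ~above[:-1]) + 1
    bounds = list(train_onsets) + [len(env)]
    onsets = []
    for start, stop in zip(bounds[:-1], bounds[1:]):
        hits = rising[(rising >= start) & (rising < stop)]
        onsets.append(int(hits[0]) if hits.size else None)
    return onsets


# Synthetic envelope (hypothetical values): low background level, a 0.4 s
# speech-like burst starting at 1.0 s, and a 5 ms click-like artifact at 2.2 s.
env = np.full(3 * FS, 0.05)
env[3000:4200] = 1.0
env[6600:6615] = 1.0
train_onsets = [1500, 6000]  # train onsets at 0.5 s and 2.0 s

onsets = detect_voice_onsets(env, train_onsets)
latencies_ms = [None if o is None else 1000.0 * (o - t0) / FS
                for o, t0 in zip(onsets, train_onsets)]
```

In this example the brief artifact is averaged below the threshold, so the second train yields no onset, which is how a no-response event would appear. With the centered window used here, the detected crossing can fall slightly before the true burst start.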

The active speech threshold limit, the length of the moving average filter window, and the shape of the convolution kernel (rectangular unit pulse) were checked manually to be appropriate for our measurement setup with four randomly selected sessions (not included in the results).

Fig. 2. An example of the analysis steps of the automated routine and the corresponding audio waveform of a sample of data from patient #1. A) The original accelerometer signal, B) filtered signal, C) signal envelope and the active threshold (dashed line), D) resulting modified signal after the moving average filter; the rising edge of the signal after each rTMS train onset is determined as the speech response onset, E) the rTMS train trigger pulses, F) the corresponding audio waveform. Note the strong signals from the clicks induced by rTMS.

The trigger pulses of the rTMS stimuli, collected together with the accelerometer data, enabled the calculation of the onsets of the stimuli and the onset latencies of the vocalizations (Fig. 2). The responses for each image from the rTMS session were matched with the corresponding images in the baseline, and the voice onset time difference was calculated. To improve the usability of the routine, the rTMS train sequence numbers are shown in the overview figures (see Supplementary Figure A) and the image name in the response comparison figures. Finally, a list of the responses whose onset time difference compared to the baseline exceeded a chosen value (default 100 ms) was printed. The results were visually crosschecked from the recorded video and from the signals for erroneous onset detections.
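The comparison with the baseline described above can be illustrated with a short Python sketch. It is not the authors' code: the function name, the data layout, and the example latencies are hypothetical. Only the 227 ms difference for the image "pullo" echoes the example reported for patient #1; the absolute latencies are invented for illustration.

```python
def flag_delayed_responses(baseline_ms, rtms_trials, limit_ms=100):
    """List rTMS trials whose voice onset lags the baseline by more than limit_ms.

    baseline_ms : dict mapping image name -> baseline voice onset latency (ms)
    rtms_trials : list of (train_number, image_name, latency_ms or None)
    """
    flagged = []
    for train_no, image, latency in rtms_trials:
        base = baseline_ms.get(image)
        if base is None or latency is None:
            continue  # no detected onset in one of the conditions
        diff = latency - base
        if diff > limit_ms:
            flagged.append((train_no, image, diff))
    return flagged


# Hypothetical latencies for three images ("pullo" = bottle, as in the text).
baseline = {"pullo": 640, "kissa": 590, "auto": 610}
trials = [(1, "kissa", 650), (2, "pullo", 867), (3, "auto", None)]

flagged = flag_delayed_responses(baseline, trials)
for train_no, image, diff in flagged:
    print(f"train {train_no}: '{image}' delayed by {diff} ms")
```

Trials without a detected onset are skipped here on purpose, since they correspond to possible no-response events and would be handled separately.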

Fig. 3. An example of the response comparison of one object naming baseline-rTMS session pair from patient #1. The voice onset time is prolonged by 227 ms when rTMS is applied, and instead of naming the image correctly as “pullo” (bottle), the patient named it as “kokis” (coke) (semantic error). This is seen as a divergent signal shape. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article. The MATLAB figure is provided as Supplementary Figure B.)

2.6. Comparison of the data

The results produced by the automated routine were compared with the manually reviewed results from video with the following five components: a) the number of detected rTMS pulse trains compared with the number of the actually occurring rTMS trains, b) the number of the correctly detected rTMS pulse trains, c) the number of detected voice onsets, d) the number of the correctly detected voice onsets, and e) the number of no-response errors. The voice onset latency could not be directly compared, as it could not be defined precisely from the video recordings. However, we determined the delays of the responses scored as “delays” by our neurophysiologist.

The correctness was evaluated against the response performance observed from the videos for every response in which the correctness was in doubt. Misdetections of the rTMS train onset resulted from not detecting the trigger signal, or from detecting extra trigger signals. The reasons for voice onset misdetections were classified into four categories according to the underlying cause: I) throat movement related problems (swallows, jaw movements, muscle stimulation, grimaces, etc.); such activity was detected before the actual response, or movement was detected without any vocalization, II) extra voice and associated movements taking place before the actual response (such as 'hmm', 'eeeh', coughs, sighs, etc.), III) delayed naming of the previous image, and IV) other reasons. The reason for misdetection of the no-response errors was analyzed as well.

3. Results

The accelerometer signals were easily recorded with the EMG system of the TMS device, enabling on-line visualization of the signal during the stimulation and detailed off-line analysis.

The rTMS train sequences were detected correctly in 98 % of cases across all patients, with a sensitivity of 99 % and a specificity of 86 %. A detailed breakdown of the detection performance for each patient is given in Supplementary Table A. A confirmed rTMS train sequence was associated with 98 % of the shown images. The reason for not having an rTMS train sequence while the image was present was the movement of the stimulation coil out of the navigation field of the rTMS system. The sequence of images delivered by the speech module can only be interrupted manually and is not directly related to successful detection of the stimulator coil by the navigation system. Of the misdetections, 78 % were due to a lack of the trigger pulses during the images: either the pulse sequence was not delivered, or it was delivered only partially. The rest of the misdetections resulted from extra trigger signals of unknown origin in between the confirmed train sequences.

3.1. Voice onset detection

The latencies of the vocalizations during rTMS were increased by more than 100 ms in, on average, 26±13 trials (range 7-55 trials), and by more than 500 ms in, on average, 9±5 trials (range 2-19 trials) (for values with intermediate delays, see Table 2). Longer latency delays were less common than short ones in all patients.

The voice onset detection performance was evaluated as the proportion of correctly detected voice onsets out of all voice response onsets; details are given in Table 1. The sensitivity of the automated routine to correctly detect the voice onsets was 96 %, and the specificity 71 %. The majority of the misdetections were related to visible throat movements before the actual response (26 %), to extra voice before the response (24 %) or to other, e.g. trigger-related, problems (36 %). Delayed naming of the previous image was present in 13 % of the misdetections. A detailed categorization of the misdetections for each patient separately is provided as Supplementary Table B.
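For reference, sensitivity and specificity follow from the counts of a two-by-two confusion table. The Python sketch below uses the standard definitions; how individual detections were assigned to the four cells is not spelled out in this section, and the counts in the example are hypothetical values chosen only so that the ratios equal the reported percentages, not the study's data.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Standard definitions from confusion-table counts.

    sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts, not taken from the study.
sens, spec = sensitivity_specificity(tp=96, fn=4, tn=71, fp=29)
print(f"sensitivity {sens:.0%}, specificity {spec:.0%}")
```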


Table 1. The patient characteristics, stimulation parameters and response onset detection performance.

Patient no. | Age (y) | Sex (F/M) | MT (a) (% of MSO (b)) | Stimulation intensity range (% MT) | rTMS train (Hz) | Condition | Response onsets occurring (count) | Response onsets correctly detected (count) | Correctly detected (%)
1 | 31 | F | 50 | 100 to 90 | 5, 7 | Baseline | 42 | 42 | 99.5
  |    |   |    |           |      | rTMS LH (c) | 406 | 404 |
2 | 39 | F | 25 | 100 | 5, 7 | Baseline | 115 | 112 | 97.4
  |    |   |    |     |      | rTMS LH | 340 | 333 |
3 | 36 | M | 25 | 100 | 5, 7 | Baseline | 200 | 198 | 98.8
  |    |   |    |     |      | rTMS LH | 334 | 332 |
4 | 39 | M | 39 | 100 to 77 | 5, 7 | Baseline | 133 | 133 | 96.5
  |    |   |    |           |      | rTMS LH | 265 | 251 |
5 | 12 | M | 63 | 95 to 87 | 5 | Baseline | 114 | 101 | 85.9
  |    |   |    |          |   | rTMS LH | 111 | 96 |
6 | 17 | F | 38 | 100 | 5 | Baseline | 157 | 155 | 96.5
  |    |   |    |     |   | rTMS LH | 172 | 165 |
7 | 37 | M | 25 | 100 to 120 | 5 | Baseline | 172 | 164 | 94.5
  |    |   |    |            |   | rTMS RH (d) | 119 | 111 |
8 | 15 | M | 63 | 100 to 71 | 5 | Baseline | 97 | 97 | 90.3
  |    |   |    |           |   | rTMS LH | 206 | 182 |
9 | 17 | M | 62 | 97 to 89 | 5, 7 | Baseline | 78 | 78 | 98.3
  |    |   |    |          |      | rTMS LH | 292 | 285 |
10 | 36 | M |   | 100 | 5 | Baseline | 103 | 100 | 97.2
   |    |   |   |     |   | rTMS LH | 176 | 173 |
Grand average | | | | | | | | | 95.7

a motor threshold, b maximum stimulator output, c left hemisphere, d right hemisphere

Table 2. The response vocalizations with delays and increased latencies.

Columns 3-4: responses marked as delays. Columns 5-8: responses with increased latency.

Patient no. | Condition | Delayed (count) | Average delay (ms) | >100 ms (count) | >150 ms (count) | >200 ms (count) | >500 ms (count)
1 | rTMS LH (c) | 2 | 665 | 24 | 18 | 16 | 4
1 | rTMS RH (d) | 0 | - | 16 | 13 | 10 | -
2 | rTMS LH | 0 | - | 15 | 12 | 11 | 4
2 | rTMS RH | 1 | 281 | 15 | 12 | 9 | 4
3 | rTMS LH | 11 | 407 | 55 | 42 | 33 | 4
3 | rTMS RH | 6 | 434 | 41 | 32 | 26 | 8
4 | rTMS LH | 0 | - | 24 | 9 | 5 | -
5 | rTMS LH | 7 (9) (e) | 1175 | 34 | 33 | 32 | 15
5 | rTMS RH | 3 (4) (e) | 758 | 32 | 32 | 31 | 19
6 | rTMS LH | 2 | 884 | 45 | 37 | 35 | 17
6 | rTMS RH | 5 | 649 | 37 | 34 | 32 | 10
7 | rTMS RH | 0 | - | 8 | 5 | 5 | 4
8 | rTMS LH | 0 | - | 24 | 21 | 20 | 11
8 | rTMS RH | 0 | - | 7 | 4 | 4 | 2
9 | rTMS LH | 0 | - | 25 | 22 | 21 | 9
9 | rTMS RH | 0 | - | 11 | 8 | 6 | -
10 | rTMS LH | 0 | - | 27 | 25 | 21 | 7
10 | rTMS RH | 1 | 1221 | 23 | 20 | 19 | 12
All | Average ± SD | 2 ± 3 | 719 ± 329 | 26 ± 13 | 21 ± 12 | 19 ± 11 | 9 ± 5
All | Range | 0 - 11 | 281 - 1221 | 7 - 55 | 4 - 42 | 4 - 35 | 2 - 19

a motor threshold, b maximum stimulator output, c left hemisphere, d right hemisphere, e the average is calculated from those response onsets where the onset was available

3.2. Detection of no-response errors

The automated routine correctly detected 88 % of all the no-response errors, including the “no-response” events in the baseline sessions. Details of the detection of no-response errors for each patient are given in Supplementary Table C. As the accelerometer data also contains the first round of the images in the baseline session, and thus also the responses to the images named incorrectly, not named, not named clearly, not articulated correctly, or named with delay or hesitation, the following results are calculated separately for baseline and rTMS sessions. In baseline sessions the sensitivity and specificity were 100 %. In rTMS sessions the overall sensitivity was 82 % and specificity 100 %. The reasons for misdetection followed the same categories as in voice onset misdetections.

4. Discussion

The accelerometer-based method presented here measures the voice onset latency to specific image stimuli. The accelerometer recording produced high-quality signals and enabled automatic voice onset time detection. The recordings were collected from 12 consecutive patients who required rTMS speech mapping; thus they reflect the overall feasibility of the presented setup in a real clinical situation. The automated routine was compared to a manual review of the rTMS speech mapping videos, which is the present method to analyze the errors in the object naming task. We found that the presented method with the automated routine correctly identified 98 % of all presented rTMS train onsets and 96 % of the voice onsets. This suggests that the methodology could provide an additional reliable means to stratify the effects of rTMS in an object naming task for presurgical planning. Moreover, it could provide a preliminary indicator to detect the no-response errors and thus speed up the analysis of the videoed responses.

Our setup offers fast additional information to complement the behavioral data from video analysis of the naming performance (1). Short delays may pass unnoticed in video analysis of several hundred images and responses in several sessions, but they can be detected and promptly quantified by the accelerometer recording. Most importantly, our setup provides numerical values of naming latencies, therefore reducing subjectivity and increasing repeatability and reliability of the analysis. We are not aware of reports studying just noticeable differences in delays of naming. Healthy subjects reliably distinguish speech asynchrony if the acoustical signal leads lip opening by 80 ms or lags it by 140 ms (30); the minimum perceived differences in naming delay are probably in the same time range. The delays identified by visual scoring ranged from 300 to 1200 ms with an average of 700 ms. This variability may relate, in part, to the regularity of the patient performance generating a background baseline for response variability in visual analysis. Our setup, however, enables selection of any delay for a more precise scrutiny. The final clinical value of different delays can only be identified by comparison with the data obtained by DCS during awake craniotomy.

The automated routine may not recognize all stop consonants at the beginning of a word, as their signal amplitude is very small during the voice onset. The smaller thresholds required for their detection are not feasible, as spikes caused by TMS and other random disturbances would be classified as speech. However, this drawback is not important, as the latencies in the rTMS condition are compared to the baseline latencies of the same word, and the beginning of the word is usually lost in both conditions. Only loudness variation between the baseline and rTMS conditions may cause problems despite the use of a relative detection threshold: the automated routine may detect the onset only in the louder vocalization and cause an error in the comparison of the baseline and rTMS data. Therefore, visual evaluation of the automatically detected voice onsets is still important. Most observed erroneous latency detections were induced by coughing or sighing before the actual response. These artifacts resulted in too short, rather than abnormally long, latencies. In contrast, the true effects of rTMS caused delayed vocalization. Therefore, the design of our algorithm minimizes false positive findings.
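
The trade-off between a relative detection threshold and spike rejection can be sketched as follows. This is a simplified illustration, not the routine used in this study: the moving-average envelope, the minimum-duration rule, and the parameter values (`rel_threshold`, `window`, `min_run`) are assumptions chosen only to make the example self-contained.

```python
def moving_average_envelope(signal, window):
    """Rectify the signal and smooth it with a trailing moving average."""
    rectified = [abs(x) for x in signal]
    envelope = []
    running = 0.0
    for i, value in enumerate(rectified):
        running += value
        if i >= window:
            running -= rectified[i - window]
        envelope.append(running / min(i + 1, window))
    return envelope


def detect_onset(signal, fs_hz, rel_threshold=0.3, window=5, min_run=10):
    """Return the onset latency in ms, or None if no response is found.

    The threshold is relative: a fraction of the trial's own peak envelope.
    The envelope must stay above it for min_run consecutive samples,
    so that brief spikes are not classified as speech.
    """
    envelope = moving_average_envelope(signal, window)
    peak = max(envelope, default=0.0)
    if peak == 0.0:
        return None
    threshold = rel_threshold * peak
    run = 0
    for i, value in enumerate(envelope):
        run = run + 1 if value >= threshold else 0
        if run == min_run:
            onset_index = i - min_run + 1
            return 1000.0 * onset_index / fs_hz
    return None


# Synthetic 1 s trial sampled at 1 kHz (illustrative values only).
fs = 1000
trial = [0.0] * 1000
trial[100] = 3.0                      # brief spike, e.g. a stimulation artifact
for n in range(400, 800):             # sustained vibration from 400 ms onward
    trial[n] = 1.0 if n % 2 == 0 else -1.0

spike_only = [0.0] * 1000
spike_only[100] = 3.0                 # trial with an artifact but no response

print(detect_onset(trial, fs), detect_onset(spike_only, fs))
```

In this sketch, lowering `rel_threshold` or `min_run` would make weak word-initial sounds detectable, but it would also let short disturbances pass as speech, which is the compromise described above.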

The automated analysis successfully detected latencies for the naming of presented images. Although the patients rehearsed naming, some of them had particular difficulties in naming specific images fluently during rTMS. This may indicate that image-related factors, instead of rTMS-related ones, are the underlying reason for such variability. Our algorithm reliably identifies such images and enables straightforward comparison with the DCS data to study this question.
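
One way to screen for image-related difficulties is to group the detected latencies by image. The sketch below is an assumption about how such a summary could be organized; the function `summarize_by_image`, the trial tuple layout, and the example values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_image(trials, baseline_ms):
    """Group rTMS trials by image and report the mean delay relative to baseline.

    trials: iterable of (image, latency_ms) pairs; latency_ms is None
            for a no-response trial.
    baseline_ms: dict mapping image -> baseline latency (ms).
    Returns dict image -> (n_trials, n_no_response, mean_delay_ms or None).
    """
    grouped = defaultdict(list)
    for image, latency in trials:
        grouped[image].append(latency)

    summary = {}
    for image, latencies in grouped.items():
        delays = [lat - baseline_ms[image] for lat in latencies if lat is not None]
        n_missing = len(latencies) - len(delays)
        mean_delay = mean(delays) if delays else None
        summary[image] = (len(latencies), n_missing, mean_delay)
    return summary


# Hypothetical trials, for illustration only.
baseline = {"cat": 650, "spoon": 720}
trials = [("cat", 700), ("cat", 720),
          ("spoon", 1400), ("spoon", None), ("spoon", 1300)]
summary = summarize_by_image(trials, baseline)
print(summary)
```

An image that shows long delays or missing responses across many stimulation sites would be a candidate for an image-related, rather than site-related, effect.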

The analysis of the measured signals can be developed further. For example, a shape recognition algorithm (see Figure 3) could recognize rTMS-induced differences in pronunciation or word changes from the vowel-associated vibration pattern, in comparison with the pattern recorded during the baseline.
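
As a hedged illustration of what such a shape comparison might look like, the sketch below uses a normalized (Pearson) correlation between two envelope segments. This is one simple candidate technique chosen for the example, not a method proposed or tested in this study; the function name and the example segments are hypothetical.

```python
import math


def normalized_correlation(a, b):
    """Pearson correlation of two equal-length envelope segments.

    Returns a value in [-1, 1]; values near 1 indicate similar shapes
    regardless of overall amplitude (loudness) or offset.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("segments must have equal length of at least 2")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [x - mean_b for x in b]
    norm = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if norm == 0.0:
        return 0.0  # a flat segment carries no shape information
    return sum(x * y for x, y in zip(da, db)) / norm


# Hypothetical envelope segments, for illustration only.
baseline_shape = [0, 1, 3, 2, 1, 0]
louder_same_word = [2 * x for x in baseline_shape]   # same shape, louder
changed_word = [0, 3, 1, 0, 2, 1]                    # different shape

print(normalized_correlation(baseline_shape, louder_same_word))
print(normalized_correlation(baseline_shape, changed_word))
```

Because the measure is insensitive to overall amplitude, it would not be confounded by the loudness variation between conditions, although alignment of the two segments in time would still need to be handled separately.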

We lost data in 2 out of 12 patients due to errors in procedures related to data saving. Closer integration of the accelerometer analysis into the speech mapping module would probably avoid such errors. Similar recordings could, in principle, also be done with ordinary microphones. However, recordings with microphones during the rTMS speech mapping paradigm can be problematic due to the loud rTMS clicks and to other environmental sounds, such as those arising from the coil cooling system, present in the stimulation environment (see Figure 2).

Conclusion

In this study, we developed an accelerometer signal-based automated routine for voice onset detection from larynx vibrations to be used with navigated rTMS speech mapping. The automated routine was found feasible, and it accurately detects the rTMS stimulation train onsets, the corresponding vocalization onsets, and the no-response errors. This method produces numerical result tables indicating the latency of each response, thus adding reliability, repeatability, and objectivity to the rTMS speech mapping/object naming analysis.

Acknowledgements

This study was financially supported by a development grant from the HUS Medical Imaging Center. We thank Helge Kainulainen and Ronny Schreiber at the Department of Neuroscience and Biomedical Engineering, Aalto University School of Science, Finland, for technical support with the accelerometer.

References

1. Lioumis P, Zhdanov A, Mäkelä N, Lehtinen H, Wilenius J, Neuvonen T, et al. A novel approach for documenting naming errors induced by navigated transcranial magnetic stimulation. J Neurosci Methods. 2012;204(2):349-354.

2. Sollmann N, Picht T, Mäkelä JP, Meyer B, Ringel F, Krieg SM. Navigated transcranial magnetic stimulation for preoperative language mapping in a patient with a left frontoopercular glioblastoma. J Neurosurg. 2013;118(1):175-179.

3. Mäkelä T, Vitikainen AM, Laakso A, Mäkelä JP. Integrating nTMS Data into a Radiology Picture Archiving System. J Digit Imaging. 2015 Jan 24.

4. Picht T, Krieg SM, Sollmann N, Rösler J, Niraula B, Neuvonen T, et al. A comparison of language mapping by preoperative navigated transcranial magnetic stimulation and direct cortical stimulation during awake surgery. Neurosurgery. 2013;72(5):808-819.

5. Tarapore PE, Findlay AM, Honma SM, Mizuiri D, Houde JF, Berger MS, et al. Language mapping with navigated repetitive TMS: Proof of technique and validation. Neuroimage. 2013;82:260-272.

6. Picht T, Schmidt S, Brandt S, Frey D, Hannula H, Neuvonen T, et al. Preoperative functional mapping for rolandic brain tumor surgery: Comparison of navigated transcranial magnetic stimulation to direct cortical stimulation. Neurosurgery. 2011;69(3):581-588.

7. Forster M, Hattingen E, Senft C, Gasser T, Seifert V, Szelényi A. Navigated Transcranial Magnetic Stimulation and functional Magnetic Resonance Imaging - advanced adjuncts in preoperative planning for central region tumors. Neurosurgery. 2011;68(5):1317-1325.

8. Vitikainen AM, Salli E, Lioumis P, Mäkelä JP, Metsähonkala L. Applicability of nTMS in locating the motor cortical representation areas in patients with epilepsy. Acta Neurochirurgica. 2013;155(3):507-518.

9. Eldaief MC, Press DZ, Pascual-Leone A. Transcranial magnetic stimulation in neurology: A review of established and prospective applications. Neurol Clin Pract. 2013;3(6):519-526.

10. Krieg SM, Sollmann N, Hauck T, Ille S, Foerschler A, Meyer B, et al. Functional language shift to the right hemisphere in patients with language-eloquent brain tumors. PLoS One. 2013;8(9):e75403.

11. Rösler J, Niraula B, Strack V, Zdunczyk A, Schilt S, Savolainen P, et al. Language mapping in healthy volunteers and brain tumor patients with a novel navigated TMS system: Evidence of tumor-induced plasticity. Clin Neurophysiol. 2014;125(3):526-536.

12. Hernandez-Pavon JC, Mäkelä N, Lehtinen H, Lioumis P, Mäkelä JP. Effects of navigated TMS on object and action naming. Front Hum Neurosci. 2014;8:660.

13. Sollmann N, Hauck T, Hapfelmeier A, Meyer B, Ringel F, Krieg SM. Intra- and interobserver variability of language mapping by navigated transcranial magnetic brain stimulation. BMC Neurosci. 2013;14:150.

14. Krieg SM, Sollmann N, Hauck T, Ille S, Meyer B, Ringel F. Repeated mapping of cortical language sites by preoperative navigated transcranial magnetic stimulation compared to repeated intraoperative DCS mapping in awake craniotomy. BMC Neurosci. 2014;15:20.

15. Ille S, Sollmann N, Hauck T, Maurer S, Tanigawa N, Obermueller T, et al. Impairment of preoperative language mapping by lesion location: a functional magnetic resonance imaging, navigated transcranial magnetic stimulation, and direct cortical stimulation study. J Neurosurg. 2015 Apr 17:1-11. [Epub ahead of print].

16. Rogić M, Deletis V, Fernández-Conejero I. Inducing transient language disruptions by mapping of Broca's area with modified patterned repetitive transcranial magnetic stimulation protocol. J Neurosurg. 2014;120(5):1033-1041.

17. Deletis V, Fernández-Conejero I, Ulkatan S, Rogić M, Carbó EL, Hiltzik D. Methodology for intra-operative recording of the corticobulbar motor evoked potentials from cricothyroid muscles. Clin Neurophysiol. 2011;122(9):1883-1889.

18. Bourguignon M, De Tiège X, de Beeck MO, Ligot N, Paquier P, Van Bogaert P, et al. The pace of prosodic phrasing couples the listener's cortex to the reader's voice. Hum Brain Mapp. 2013;34(2):314-326.

19. Hillman RE, Heaton JT, Masaki A, Zeitels SM, Cheyne HA. Ambulatory monitoring of disordered voices. Ann Otol Rhinol Laryngol. 2006;115(11):795-801.

20. Lindstrom F, Ren K, Li H, Waye KP. Comparison of two methods of voice activity detection in field studies. J Speech Lang Hear Res. 2009;52(6):1658-1663.

21. Orlikoff RF. Vocal stability and vocal tract configuration: An acoustic and electroglottographic investigation. J Voice. 1995;9(2):173-181.

22. Ruohonen J, Karhu J. Navigated transcranial magnetic stimulation. Neurophysiol Clin. 2010;40(1):7-17.

23. Rossini PM, Barker AT, Berardelli A, Caramia MD, Caruso G, Cracco RQ, et al. Non-invasive electrical and magnetic stimulation of the brain, spinal cord and roots: Basic principles and procedures for routine clinical application. Report of an IFCN committee. Electroencephalogr Clin Neurophysiol. 1994;91(2):79-92.

24. Epstein CM, Lah JJ, Meador K, Weissman JD, Gaitan LE, Dihenia B. Optimum stimulus parameters for lateralized suppression of speech with magnetic brain stimulation. Neurology. 1996;47(6):1590-1593.

25. Brodeur MB, Dionne-Dostie E, Montreuil T, Lepage M. The Bank of Standardized Stimuli (BOSS), a new set of 480 normative photos of objects to be used as visual stimuli in cognitive research. PLoS One. 2010;5(5):e10773.

26. Oppenheim AV, Schafer RW, Buck JR. Discrete-Time Signal Processing. 2nd ed. Upper Saddle River, New Jersey: Prentice Hall; 1999.

27. Samjin C, Zhongwei J. Comparison of envelope extraction algorithms for cardiac sound signal segmentation. Expert Systems with Applications. 2008;34:1056-1069.

28. Myers S, Hansen BB. The origin of vowel length neutralization in final position: Evidence from Finnish speakers. Natural Language and Linguistic Theory. 2007;25(1):157-193.

29. Summerfield Q. Lipreading and audio-visual speech perception. Philos Trans R Soc Lond B Biol Sci. 1992;335(1273):71-78.

Highlights

• rTMS-induced modifications of naming are commonly analyzed from video recordings
• Detection of vocalization-related larynx vibrations via accelerometer is introduced
• Our setup offers additional information to the behavioral data
• Automated routine correctly detected 96% of the voice onsets
• The new method improves the repeatability and objectivity of rTMS language mappings
