Applied Ergonomics 45 (2014) 1196–1207


Towards successful user interaction with systems: Focusing on user-derived gestures for smart home systems

Eunjung Choi, Sunghyuk Kwon 1, Donghun Lee 1, Hogin Lee, Min K. Chung*
Dept. of Industrial and Management Engineering, Pohang University of Science & Technology (POSTECH), Republic of Korea

Article info

Article history: Received 18 March 2013; Accepted 20 February 2014

Keywords: Hand gesture; User-derived hand gestures; Gesture-command association

Abstract

Various studies that derived gesture commands from users have used the frequency ratio to select popular gestures among the users. However, the users select only one gesture from the limited number of gestures that they can imagine during an experiment, and thus the selected gesture may not always be the best gesture. Therefore, two experiments involving the same participants were conducted to identify whether the participants maintained their own gestures after observing other gestures. As a result, 66% of the top gestures differed between the two experiments. To verify the changed gestures between the two experiments, a third experiment involving another set of participants was then conducted; it showed that the selected gestures were similar to those from the second experiment. This finding implies that using frequency alone in the first step does not necessarily guarantee the popularity of the gestures.
© 2014 Elsevier Ltd and The Ergonomics Society. All rights reserved.

1. Introduction

Recently, many researchers have focused on developing more natural user interfaces that facilitate interaction between users and systems. More specifically, with advanced technologies, the trend has been geared towards developing more intuitive devices, and one such effort is the gesture-based interface, which recognizes physical movement without the help of a traditional device such as a mouse or a keyboard (Saffer, 2008). Previous studies on gesture-based interfaces have focused on either two-dimensional (2D) gestures on a touch screen controlled by a finger or a stylus pen, or three-dimensional (3D) motion-recognition systems accompanied by sensor gloves or handheld devices. More recently, the development of 3D gesture recognition technologies, such as Microsoft's Kinect, has allowed gesture-based interfaces to provide users with easier control of devices without any extra equipment (Bhuiyan and Picking, 2011; Kim et al., 2011). Consequently, recent research has begun to pay more attention to 3D free-hand gestures (Henze et al., 2010; Kim et al., 2011; Mauney et al., 2010).

* Corresponding author. San 31, Hyoja-dong, Nam-gu, Pohang, Gyeongbuk, Republic of Korea. Tel.: +82 54 279 2192; fax: +82 54 279 2920. E-mail addresses: [email protected] (E. Choi), [email protected] (S. Kwon), [email protected] (D. Lee), [email protected] (H. Lee), [email protected] (M.K. Chung).
1 Present address: Visual Display Division, Samsung Electronics Co. Ltd., #129, Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 443-742, Republic of Korea.
http://dx.doi.org/10.1016/j.apergo.2014.02.010
0003-6870/© 2014 Elsevier Ltd and The Ergonomics Society. All rights reserved.

Meanwhile, in earlier studies, owing to the limitations of gesture recognition technology, gesture interfaces could not recognize many postures or motions, and only a few gestures were suggested by designers or engineers based on their specialized knowledge (Buisine and Martin, 2007; O'Hagan et al., 2002; Sears and Arora, 2002). However, some of these gestures are difficult to discover and adopt because they are arbitrarily associated with commands (Yee, 2009). More recently, as image processing has become fast enough and the number of gesture-activated functions has increased, intuitiveness has become an important consideration in gesture design (Blackler et al., 2010; Lee et al., 2010; Lepinski et al., 2010; Park, 2012). Therefore, the issue has expanded to determining the gestures that afford more intuitive interaction with products. Lately, various studies have begun to focus on matching gestures with commands by involving users in the initial stage of gesture design so as to fully use the users' experience, i.e., a user-centered approach (Akers, 2006; Epps et al., 2006; Grandhi et al., 2010; Henze et al., 2010; Kuhnel et al., 2011; Lee et al., 2010; Mauney et al., 2010; Mitchell and Heap, 2011; Neßelrath et al., 2011; Nielsen et al., 2003, 2004; Stern et al., 2008; Wobbrock et al., 2009). These studies followed a similar procedure to suggest gestures for commands (Fig. 1).


Fig. 1. The procedure of the previous studies to select a top gesture for each command.

In this procedure, the frequency ratio is commonly used to eliminate awkward gestures and to select popular gestures among the users. More specifically, the users are asked to derive a gesture for each command. Once the gestures have been collected, similar gestures for each command are grouped together according to their physical shapes and motions. Then, the gesture with the highest frequency (the top gesture) is suggested as the final gesture for the command. Alternatively, some of the high-frequency gestures for each command are selected and evaluated in terms of subjective measures such as suitability, ease of memorization, and fatigue, and one of them is chosen as the final gesture for the command.

However, these steps can introduce several issues. Firstly, users have only a limited set of gesture candidates in mind, and they end up selecting one of them as the best gesture for a command during an experiment. Therefore, if they observe other gestures that they had not thought of during the experiment, they could change their selection. Secondly, if a user skilled in designing gestures is included in a participant group, a gesture that he or she derives could be the most suitable gesture for a command; yet the aforementioned procedure is likely to discard such meaningful gestures because of their low frequency. To date, there have been no studies on these issues. Thus, we formulated the following hypotheses.

- Users may change their selection after observing other gestures.
- A gesture derived by only a few users might be a better gesture.

To test these hypotheses, two experiments were conducted. In the first experiment, we followed the procedure of the previous studies: deriving gestures for each command from users and selecting the gesture with the highest frequency (the top gesture) for each command. In the second experiment, we asked the same users to select the most suitable gesture for each command after observing all of the user-derived gestures acquired in the first experiment, and a top gesture was again selected for each command. The top gestures from the two experiments were then compared to identify whether the top gestures of the first experiment were maintained in the second experiment, and whether gestures that only a few users derived in the first experiment became popular. Finally, in order to verify the changed gestures between the two experiments, a third experiment involving another set of participants was conducted.
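As an illustration of the frequency-based selection step summarized in Fig. 1, the following minimal Python sketch derives a top gesture per command from grouped elicitation records. The data layout and names are hypothetical and are not taken from the studies cited above.

from collections import Counter

# Each elicited gesture is recorded as (participant, command, gesture_group)
# after similar gestures have been grouped; the values below are invented.
elicited = [
    ("P01", "turn on TV", "press imaginary button"),
    ("P02", "turn on TV", "press imaginary button"),
    ("P03", "turn on TV", "mimic screen appearing"),
    # ... one record per participant and command
]

def top_gestures(records):
    """For each command, return the gesture group with the highest
    frequency ratio among participants (the 'top gesture')."""
    per_command = {}
    for _participant, command, group in records:
        per_command.setdefault(command, Counter())[group] += 1
    top = {}
    for command, counts in per_command.items():
        group, n = counts.most_common(1)[0]
        top[command] = (group, n / sum(counts.values()))
    return top

print(top_gestures(elicited))
# e.g. {'turn on TV': ('press imaginary button', 0.666...)}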

2. Methods

Two experiments were conducted to test the aforementioned hypotheses (see Fig. 2), and both were designed as within-subjects studies. The first experiment focused on the acquisition of hand gestures mapped to household structures and appliances, and also examined the participants' reasons for choosing particular gestures. The elicitation of gestures for each command from the users was based on the user-centered approach of the previous studies (Akers, 2006; Epps et al., 2006; Grandhi et al., 2010; Henze et al., 2010; Kuhnel et al., 2011; Lee et al., 2010; Mauney et al., 2010; Neßelrath et al., 2011; Nielsen et al., 2003, 2004; Wobbrock et al., 2009). The main issue in the second experiment was to identify whether the participants would change their own gestures after observing the gestures derived by the others in the first experiment. More specifically, the top gestures of the two experiments were compared after both experiments had been completed; thus, the same participants took part in both experiments. In the second experiment, the participants were asked to select the most suitable gesture for each command from a gesture list, which included most of the gestures that the users had derived in the first experiment.

2.1. Finding gesture commands

First, to select commands for various products, we collected commands used in smart-home systems. To find suitable gesture commands for smart-home appliances, previous studies targeting smart-home systems were reviewed, and a brainstorming session with four researchers was carried out. From the previous studies we collected the commands of a motion-recognized remote controller (Ouchi et al., 2005; Pan et al., 2010; Wilson and Shafer, 2003), a glove-type interface (Dipietro and Sabatini, 2008; Ng et al., 2011), and a touch-sensitive interactive system (Saffer, 2008; Seifried et al., 2009), as well as the commands used by Kuhnel et al. (2011) and Neßelrath et al. (2011). In this way, a total of 40 commands for smart-home appliances were collected. To select the target commands, a brainstorming session with four experts was carried out, in which all of the commands collected in the previous stage were used as the basic information. In addition to the commands from the previous studies, structures within the house that involve day-to-day interaction were also selected. As a result, a total of 38 commands for 11 products were selected: air conditioner, TV, audio player, phone, light, desk lamp, curtain(s), blind, door(s), window(s), and faucet (see Table 1).

2.2. Experiment 1

2.2.1. Participants
A total of thirty students at POSTECH voluntarily participated in the experiment, fifteen men and fifteen women. They were all right-handed, and no participant had musculoskeletal disorders in the arms or hands. The mean age was 23.2 years (SD: 2.89; range: 19–30). None of the participants had any previous experience with a 3D hand gesture-based interface such as Microsoft Kinect.


Fig. 2. Methods of this study.

2.2.2. Apparatus and environment of the experiment
To elicit gestures, the participants were given verbal commands. For instance, a command such as "raise the temperature of the air conditioner" was recorded and saved as an audio file, and the files were played in random order during the experiment. The commands were recorded so that every participant heard the instructions in the same tone. In addition, based on the theory of Kita (2000), we assumed that visible objects in the experimental environment that resembled the targets of this experiment could affect the gestures that the participants derived. For example, while deriving a gesture for "open the door", a participant who can see a hinged door (a visible object) may simply mimic the action performed in daily life. Hence, we covered all of the objects in a typical room, such as the window and the blind, with white wallpaper, and after the participants entered the room the door was covered with white wallpaper as well. Thus, the visual cues available to the participants were minimized during the experiment: the only objects they could see in the room were a desk, a chair, a camera, a laptop computer, some paper and a pencil. See Fig. 3 for the layout of the room.

Table 1
Target commands.

Product          Commands
Air conditioner  1. Turn on; 2. Turn off; 3. Raise the temperature; 4. Lower the temperature
TV               5. Turn on; 6. Turn off; 7. Increase the volume; 8. Decrease the volume; 9. Go to the next channel (channel up); 10. Go to the previous channel (channel down)
Audio player     11. Turn on; 12. Turn off; 13. Increase the volume; 14. Decrease the volume; 15. Go to the next track (track up); 16. Go to the previous track (track down)
Phone            17. Answer; 18. Hang up; 19. Increase the volume; 20. Decrease the volume
Light            21. Turn on; 22. Turn off
Desk lamp        23. Turn on; 24. Turn off; 25. Turn up the light; 26. Turn down the light
Curtain(s)       27. Open; 28. Close
Blind            29. Open; 30. Close
Door(s)          31. Open; 32. Close
Window(s)        33. Open; 34. Close
Faucet           35. Turn on; 36. Turn off; 37. Raise the temperature; 38. Lower the temperature

2.2.3. Procedures
Prior to the experiment, the participants were asked to imagine that they were in a smart home whose products could be controlled by their hand gestures. To make sure that the participants understood the thirty-eight target commands, a moderator gave them a detailed introduction based on Table 1. To distinguish meaningful gestures from other movements, the participants were told to keep their hands on the desk until a suitable gesture came to mind; once they had performed a gesture, they placed their hands on the desk again. In addition, they were instructed to focus on deriving the most natural and proper gesture for each command listed in Table 1. Repeating the same gesture for different commands was allowed. To familiarize themselves with deriving gestures, the participants were required to practice two commands prior to the experiment.

Fig. 3. Environment of the first experiment.


In the experiment, the participants derived gestures for the thirty-eight selected commands, which were presented in random order as verbal commands according to Table 1. Since the participants were asked to consider each command independently, they sometimes repeated the same gesture for different commands. There was no time limit for deriving a gesture. After deriving a gesture for a command, the participants rated its suitability (e.g., "The gesture I generated is a good match for the intended command") on a 7-point scale (1: strongly disagree to 7: strongly agree). Additionally, an in-depth interview was carried out to understand their reasons for selecting particular gestures. The participants were allowed to take a rest whenever they wanted, and the entire experiment was video-recorded for later analysis.

2.2.4. Grouping user-derived gestures
In order to group the user-derived gestures, a brainstorming session with four researchers was conducted. Two of the researchers were PhD candidates in Human Factors, and the other two held PhDs in Human Factors and were specialists in the design of user interfaces. During the session, they watched together all of the video recordings of the gestures that the users had derived for each command and assigned the gestures to groups. The user-derived gestures were categorized in three steps. In the first step, the gestures for each command were grouped when their shape was exactly the same. In the second step, the gestures were grouped on the basis of the similarity of their shape; gestures were grouped together only when all four researchers unanimously agreed that they were similar. In this step, the user interviews, which asked why the participants had defined each gesture (their mental model), helped the researchers make these decisions. Finally, the representative gesture for each group was determined by unanimous consensus of the four researchers based on the important elements of the gestures, such as the direction and the specific path that the participants mentioned in the experiment. If there was no consensus on the representative gesture within a group, all of the gestures in that group were shown to the participants again during the second experiment under the same group number (see Fig. 4), because presenting only one of them could bias the participants' selection of the most suitable gesture in the second experiment. The researchers therefore tried to maintain the form of the gestures as the users had derived them while determining the representative gesture for each group; thus, the representative gestures covered most of the gestures that the participants came up with in the first experiment. As a result, a total of 1140 gestures were narrowed down to 294 gesture groups.


2.3. Experiment 2

2.3.1. Participants
A total of twenty-eight students who had participated in the first experiment also took part in the second experiment; fifteen were men and thirteen were women. The mean age was 23.5 years (SD: 2.81). Two students could not participate for personal reasons.

2.3.2. Apparatus and environment of the experiment
The representative gesture of each gesture group was re-enacted by a performer, and these re-enacted gestures were video-recorded to build a gesture list for each command. Each gesture list was presented to the participants as video clips in a Microsoft Office PowerPoint 2007 presentation, with each slide showing one gesture group as categorized by the four experts after the first experiment. Fig. 4 shows part of a gesture list. As shown in Fig. 4, gestures classified in the same group were presented to the participants under the same answer number. Once the participants had finished watching the gesture list for each command in the slides, a video containing all of the gesture groups with their answer numbers in the caption was shown repeatedly in Windows Media Player, because the participants could forget the answer number of the gesture group they wanted to select while watching many slides. The gesture lists were shown in a room with a projector (SANYO XGA PLCMX500) and a projector screen (MR FM100), and the participants were given pencil and paper to write down their answer numbers while watching the video.

2.3.3. Procedures
There was a one-month gap between the first and the second experiment. Prior to the experiment, the participants were given a detailed introduction by a moderator, since they were not yet familiar with the apparatus. In the experiment, the video of the gesture list was presented to the participants, and they were asked to select the most natural and proper gesture for each command without considering the gesture that they had derived in the first experiment. As in the first experiment, the participants selected the gesture for each command independently, so the same gesture could be selected for different commands. Once a gesture had been selected for a command, its suitability was rated as in the first experiment.
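To make the comparisons reported in the next section concrete, the following minimal sketch shows one possible record layout for the data gathered in the two experiments; the field names and values are hypothetical and are not taken from the study.

# One record per participant and command: the gesture group derived in
# Experiment 1, the group selected in Experiment 2, and the 7-point
# suitability ratings given in each experiment (values invented).
records = [
    {"participant": "P01", "command": "turn on air conditioner",
     "exp1_group": 1, "exp1_suitability": 5,
     "exp2_group": 2, "exp2_suitability": 6},
    {"participant": "P02", "command": "turn on air conditioner",
     "exp1_group": 2, "exp1_suitability": 6,
     "exp2_group": 2, "exp2_suitability": 6},
    # ... remaining participants and commands
]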

Fig. 4. An example of part of a gesture list: the slide shows the representative gestures of one gesture group. The command is presented at the top of the slide, and the answer number and a description of the gestures are presented at the bottom of the slide.


2.4. Results

2.4.1. Change rate for user-derived gestures
To test the first hypothesis, a change rate was computed. Fig. 5 shows, for each command, the proportion of participants who changed their selection after observing the gestures in the gesture list. Although the gestures that the participants had derived in the first experiment were included in the gesture list, about 66% of the participants changed their selection. In particular, about 70% of the participants changed their gestures for the "turn on/off" commands of the air conditioner, as well as for most of the commands of the TV, audio player, desk lamp, and curtain(s). On the other hand, about 50% of the participants maintained their gestures for the "raise/lower the temperature" commands of the air conditioner, as well as for most of the commands for the blind, door(s), window(s), and faucet.

2.4.2. Top gesture for each command between the two experiments
To test hypotheses 1 and 2, the top gestures of the two experiments were compared. Although the same participants took part in both experiments, only 13 top gestures from the first experiment remained top gestures after the second experiment. More specifically, some of the gestures that only a few participants had derived in the first experiment were selected as top gestures in the second experiment, and some of the top gestures of the first experiment were not selected at all in the second experiment (see Fig. 6).

In the first experiment, gestures of pressing a button were selected as the top gestures for "turn on/off an air conditioner", "turn on/off a TV" and "turn on/off an audio player". However, the top gestures in the second experiment differed for each device. For "turn on the air conditioner", the gesture of fanning oneself with a hand, indicating the heat of the room, was selected in the second experiment with the highest frequency of 39.3%, whereas it appeared in less than 10% of the derivations in the first experiment. For "turn off the air conditioner", the gesture of rubbing one's own body, indicating the chill of the room, was selected with the highest frequency of 32.1%, whereas it appeared in only 6.7% of the first experiment. Moreover, the top gesture of pressing a button for "turn on the air conditioner" from the first experiment was selected by less than 7% of the participants in the second experiment. For "turn on a TV" and "turn off an audio player", the gesture imitating a screen turning on and the gesture of covering one's ears were selected as the top gestures in the second experiment, and they appeared in only 6.7% and 10% of the first experiment, respectively. On the other hand, the first-experiment top gesture of pressing a button for "turn off a TV" and for "turn on an audio player" was selected as the top gesture in the second experiment as well.

However, the gesture imitating a screen disappearing for "turn off a TV" and the finger-snap gesture for "turn on an audio player" were also selected as top gestures in the second experiment, although they had appeared in only 6.7% and 23.3% of the first experiment, respectively. For the commands "turn on/off a light", the gestures of finger snapping and hand clapping were selected as the top gestures in the second experiment; both of these gestures had also appeared with high frequency in the first experiment. However, the first-experiment top gesture, which imitated pressing a button, was not selected at all for either command in the second experiment.

Top gestures for "increase or decrease the volume" were similar across devices such as the TV, audio player and phone. The gestures of stretching or clutching the palm were selected as the top gestures for "increase or decrease the volume" of the TV, audio player and phone, although they had appeared in less than 14% of the first experiment. On the other hand, the gestures for changing TV channels or audio player tracks were closely related to horizontal motions. For "go to the next channel of a TV" and "go to the next track of an audio player" the gesture of moving a hand from right to left was selected, and for "go to the previous channel of a TV" the gesture of moving a hand from left to right was selected. For "go to the previous track of an audio player", however, the gestures indicating right and left were selected equally often as the top gestures.

For the commands related to the curtain(s), blind, door(s), window(s), and faucet, most of the participants imagined themselves physically using the objects as they derived the gestures, and the gestures were similar to the participants' actions in everyday life. In other words, most participants derived the gestures in relation to the physical shape of the objects, and most of these gestures were similar in both experiments.

2.4.3. Suitability scores: comparison of the first and second experiments
The average suitability was 4.99 (SD: 0.41) in the first experiment and 5.09 (SD: 0.43) in the second experiment. According to an ANOVA, the second experiment showed a slightly higher suitability than the first experiment (F = 4.72, p = 0.036). The average suitability of the top gestures was 5.10 (SD: 0.57) in the first experiment and 5.27 (SD: 0.53) in the second experiment; here the ANOVA showed a marginally higher suitability in the second experiment (F = 3.16, p = 0.084). Detailed results of the two experiments are presented in Appendix A. In particular, the suitability scores of the top gestures for "turn on/off an air conditioner", "turn off a TV or an audio player", "answer a phone" and "raise the temperature of a faucet" were more than 0.8 points higher in the second experiment than in the first experiment (see Fig. 7).
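Under the record layout sketched at the end of Section 2.3, the change rate and the per-command comparisons reported above could be computed roughly as follows. This is an illustrative sketch only; the inferential test reported above (ANOVA) is not reproduced here.

from collections import Counter
from statistics import mean

# Records as sketched in Section 2.3 (values invented for illustration).
records = [
    {"participant": "P01", "command": "turn on air conditioner",
     "exp1_group": 1, "exp1_suitability": 5, "exp2_group": 2, "exp2_suitability": 6},
    {"participant": "P02", "command": "turn on air conditioner",
     "exp1_group": 2, "exp1_suitability": 6, "exp2_group": 2, "exp2_suitability": 6},
]

def change_rate(rows):
    """Share of participants whose Experiment 2 selection differs from the
    gesture group they derived in Experiment 1 (cf. Fig. 5)."""
    return sum(r["exp1_group"] != r["exp2_group"] for r in rows) / len(rows)

def top_group(rows, key):
    """Most frequently chosen gesture group ('exp1_group' or 'exp2_group')."""
    return Counter(r[key] for r in rows).most_common(1)[0][0]

for command in sorted({r["command"] for r in records}):
    rows = [r for r in records if r["command"] == command]
    print(command,
          "change rate: {:.0%}".format(change_rate(rows)),
          "top gesture exp1/exp2: {}/{}".format(top_group(rows, "exp1_group"),
                                                top_group(rows, "exp2_group")),
          "mean suitability exp1/exp2: {:.2f}/{:.2f}".format(
              mean(r["exp1_suitability"] for r in rows),
              mean(r["exp2_suitability"] for r in rows)))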

Fig. 5. Change rate for user-derived gestures by the comparison of the two experiments (%).


Fig. 6. Examples of top gestures of the first experiment vs. the second experiment, with % of frequency indicated in brackets.


3. Verification of the changed gestures between the two experiments

To verify the changed gestures between the two experiments, we conducted a third experiment. In the third experiment, five products (curtain(s), blind, door(s), window(s), and faucet) were excluded from the target products, because most of the gestures for these products had not changed between the two experiments and most of them mimicked the participants' everyday actions. All of the procedures were exactly the same as in the second experiment, but a new group of participants took part.

3.1. Participants

A total of twenty-nine students at POSTECH who had not been involved in either of the previous experiments took part in the third experiment; sixteen were men and thirteen were women. The mean age was 22.5 years (SD: 3.6). Four of them had previous experience with Microsoft Kinect.

3.2. Results

For the "turn on/off" commands, a gesture of pressing a button was rarely selected by the participants, except for the TV.

Fig. 7. Comparison of suitability scores for each command in both experiments.


As in the second experiment, the top gestures for the "turn on/off" commands in the third experiment differed for each device, and most of them were similar to those of the second experiment, except for the TV. For "turn on/off the TV", the gestures imitating a screen turning on/off, which were frequently selected in the second experiment, were rarely selected in the third experiment; instead, the gesture of pressing a button on a remote controller was selected as the top gesture for both "turn on the TV" (34.5%) and "turn off the TV" (37.9%) in the third experiment.

For the "turn on/off" commands of the air conditioner and the audio player, and the "answer/hang up" commands of the phone, the top gestures in the third experiment were the same as in the second experiment: 31.0% of the participants preferred the gesture of fanning oneself with a hand for "turn on the air conditioner", and 24.1% preferred the gesture of rubbing one's own body for "turn off the air conditioner". For the audio player, 37.9% of the participants preferred the finger-snapping gesture for "turn on the audio player", and 17% preferred the gesture of covering one's ears for "turn off the audio player". For "answer/hang up the phone", participants preferred the gestures mimicking their everyday use of the phone (37.9% for "answer the phone" and 58.6% for "hang up the phone").

For the "turn on" command of the light and the "turn off" command of the desk lamp, the top gestures were also the same as in the second experiment: 44.8% of the participants preferred the finger-snapping gesture for "turn on the light", and 44.5% preferred the gesture of making a fist for "turn off the desk lamp". Although the top gestures for "turn off the light" and "turn on the desk lamp" differed from the second experiment, finger snapping for "turn off the light" and the gesture of opening the palm for "turn on the desk lamp" had been the second most preferred gestures in the second experiment; however, they were rarely derived in the first experiment.

In addition, the top gestures for "go to next/previous" for the TV and the audio player were similar to those of the second experiment: the gesture of moving a hand to the left for "go to the next channel/track" was selected by over 48.3% of the participants for both products, and the gesture of moving a hand to the right for "go to the previous channel/track" was selected by over 51.7% for both products. For the "increase/decrease" commands, the gestures of moving a hand upwards/downwards were selected as the top gestures regardless of the product. Although the gestures of stretching or clutching the palm, the top gestures in the second experiment, were the second most preferred gestures for the TV and the audio player (below 20.7%), and the gestures of turning a wrist clockwise/counterclockwise, the top gestures in the second experiment, were the second most preferred gestures for the desk lamp (17.2%), the gestures of moving a hand upwards/downwards were selected overwhelmingly often (over 44.8%). For the phone, however, the gestures of stretching or clutching the palm were also highly preferred (over 31.0%).

4. Discussion

4.1. Top gesture for each command: comparison among the three experiments

Although the gesture that each participant had defined in the first experiment was included in the gesture list, most of the participants changed their gestures in the second experiment.
As a result, most of the top gestures were not the same in the two experiments.

According to Mather and Johnson (2000), people have a tendency to retroactively ascribe positive attributes to an option they have chosen (choice-supportive bias). On this account, the participants should have tended to maintain their gestures in the second experiment, because they had chosen and derived a certain gesture for each command in the first experiment. However, most of the participants changed their selection in the second experiment, and the suitability scores in the second experiment were even higher than in the first. This result implies that the users selected a gesture from the limited number of gesture candidates that they could imagine during an experiment, and that the selected gestures may not always be the best gestures for them; if they observe other gestures that they had not thought of during the experiment, they may change their selection.

In addition, some of the gestures that only a few participants derived in the first experiment were selected as the top gestures in the second experiment and again in the third experiment. For the "on/off" commands in particular, some gestures derived by only a few participants in the first experiment were frequently selected as the top gestures in the second and the third experiments. This implies that gestures derived by only a few users may in fact be better gestures. In particular, the top gestures in the second and the third experiments took sounder forms that could represent each device properly. These gestures were different from the top gestures of the first experiment, which imitated controlling the physical buttons on the actual device or on its remote controller, with no regard to the device itself. In addition, the suitability scores of the top gestures in the last two experiments were higher than those of the top gestures in the first experiment.

Meanwhile, in the case of the TV, although the gestures imitating a screen turning on/off also take such a sound form, the participants in the third experiment rarely selected them. We believe the participants in the second experiment selected these gestures because they were attracted by their uniqueness in comparison to their own gestures, whereas the participants in the third experiment may have weighed both the simplicity and the soundness of the gestures: compared with the other gestures in the gesture list, the gestures imitating a screen are more complicated and require the use of two hands.

In the case of controlling the air conditioner or the audio player, gestures that depicted the users' environment or their needs were selected as the top gestures. Pavlovic et al. (1997) referred to these gesture types as communicative gestures. The gesture of fanning oneself for "turn on an air conditioner" and the gesture of rubbing one's own body for "turn off an air conditioner" were very similar to the actions a person would take to express being hot or cold, and the gesture of covering one's ears for "turn off an audio player" was very similar to the action a person naturally takes to indicate the loudness of the environment. Likewise, users seem to prefer a gesture that naturally reflects the action they would take in daily life when the meaning of the function is closely related to their physical condition.
In the case of the "next/previous" commands for the TV and the audio player, although the gesture of moving a hand to the left for "go to next" and the gesture of moving a hand to the right for "go to previous" were frequently selected in the second and the third experiments, the opposite direction for each command was also frequently selected (over 34.5%) in both experiments. In other words, the directions of right and left were mixed for the same command; the users did not seem to share one common concept of direction for these commands. Analysis of the interviews showed that some of the users perceived "next" and "previous" in terms of time and tended to match them with right and left, respectively. This outcome agrees with the claim of Tversky et al. (1991) that two consecutive planes in a spatial dimension are often mapped to a temporal dimension.


On the other hand, some users imagined themselves swiping through a touch screen to move to the next screen and thus moved their hands from right to left to convey "next", and vice versa. This issue has also been discussed in previous studies of touch screens (Johnson, 1995; Kwon et al., 2011).

These results show that not only awkward gestures but also meaningful gestures can be neglected when the method of the previous studies is followed, i.e., screening out gestures by their frequency in the initial stage (see Fig. 1). To resolve this issue, a sufficient number of gesture candidates should be provided to users before the final gesture for a command is selected. More specifically, gestures for the commands should be derived by users together with experts from design-related fields such as computer science, ergonomics and industrial design. The derived gestures should then be evaluated by a completely new group of users in terms of subjective measures such as suitability, ease of memorization and fatigue in order to select the most suitable gesture for each command. These considerations will be taken into account in our future research.

4.2. Implications of the top gestures with lower change rates

Although most of the top gestures from the first experiment were not maintained in the second and third experiments, the top gestures for some of the commands showed similar patterns across the three experiments. According to Shirai and Furui (1994), gestures often originate from the gesturer's mental models, which are relatively stable and distinct structures in the mind. In this sense, similar gesture patterns for similar commands may reveal the users' common mental models for those commands. In the results of this study, the users showed a concept of change in quantity for the "increase/decrease" commands, tending to match upward motion with an increase in quantity and downward motion with a decrease. For these vertical directions (up/down), other studies have suggested similar gesture patterns for "increase/decrease" commands (Henze et al., 2010; Kuhnel et al., 2011). This outcome also agrees with the claim of Hurtienne et al. (2008) that vertical motion relates to the concept of change in amount.

Top gestures for the structures within the house, such as the curtain(s), door(s), window(s), blind, and faucet, depicted the actions that the users would normally perform with these objects on a daily basis. Pavlovic et al. (1997) referred to these gesture types as manipulative gestures. Likewise, for objects that involve physical motion, the users tended to derive gestures related to the physical features of the objects. This outcome appears to agree with the law of affordance (Norman, 1999), which claims that the physical characteristics of objects may well affect the users' decisions when deriving gestures.


5. Conclusions

Many studies that derive gestures from users have commonly utilized the frequency ratio to eliminate awkward gestures or to select popular gestures for commands. However, on the basis of the three experiments in this study, we identified several limitations of the process suggested in the previous studies. More specifically, many users tended to change their selection after observing gestures derived by other users, and some of the top gestures of the first experiment were not selected at all in the second experiment and were rarely selected in the third experiment. This implies that frequency alone cannot guarantee the quality of the selected gestures, because in the first step users have only a limited set of gestures in mind. In addition, some of the gestures that only a few users derived in the first experiment were selected as top gestures in the second and the third experiments, which implies that some of the more significant gestures can be overlooked by following the previous method.

However, this study has not covered the process of designing a final gesture set for each command or considered the technical feasibility of the gestures, because its primary purpose was to investigate whether the frequency ratio is an adequate means of eliminating unnecessary gestures at the initial stage of deriving gestures. Therefore, in order to determine a final gesture set for each command, another user study involving a different set of participants from various age groups together with experts from various fields should be conducted, which will be one of our future studies. The results of this study suggest a good starting point for further research on designing gestures for commands.

Acknowledgements

This work was supported by a National Research Foundation of Korea (NRF) grant funded by the Korean government (MEST) (No. 2011-0015523). This work was also supported by the Global Ph.D. Fellowship program of the National Research Foundation of Korea, conducted from 2011 (No. 2012-056822).

Appendix A. Suitability scores for each command in the three experiments

Values are mean (SD) on the 7-point suitability scale. "Suitability" columns give the overall suitability and "Top" columns give the suitability for the top gesture. A dash indicates that the command was not included in the third experiment (see Section 3).

No. Command description                                        Exp. 1 Suitability  Exp. 1 Top   Exp. 2 Suitability  Exp. 2 Top   Exp. 3 Suitability  Exp. 3 Top
1   Turn on an air conditioner                                  4.72 (1.49)         4.67 (1.80)  5.19 (1.02)         5.47 (0.95)  5.34 (1.08)         5.44 (1.01)
2   Turn off an air conditioner                                 4.85 (1.30)         4.54 (1.82)  5.02 (1.17)         5.72 (0.91)  5.21 (1.08)         5.57 (1.13)
3   Raise the temperature of an air conditioner                 4.77 (1.28)         4.67 (1.09)  5.04 (0.87)         5.12 (0.88)  5.59 (0.98)         5.75 (0.97)
4   Lower the temperature of an air conditioner                 5.00 (1.36)         4.93 (1.24)  5.14 (0.83)         5.17 (0.88)  5.45 (1.09)         5.62 (1.07)
5   Turn on a TV                                                4.44 (1.25)         4.79 (1.58)  4.99 (1.19)         5.06 (0.94)  5.69 (0.93)         5.90 (1.10)
6   Turn off a TV                                               4.46 (1.47)         4.08 (1.80)  4.91 (1.22)         5.08 (1.11)  5.55 (0.91)         5.73 (1.01)
7   Increase the volume of a TV                                 5.26 (1.04)         5.62 (0.68)  5.25 (0.98)         5.79 (0.63)  5.69 (0.97)         5.85 (0.90)
8   Decrease the volume of a TV                                 5.16 (1.08)         5.47 (0.74)  5.24 (1.01)         5.22 (1.24)  5.59 (0.91)         5.81 (0.83)
9   Go to the next channel (channel up) of a TV                 5.05 (1.29)         5.00 (0.75)  4.96 (0.92)         4.95 (0.88)  5.55 (0.87)         6.07 (0.62)
10  Go to the previous channel (channel down) of a TV           4.78 (1.32)         4.88 (1.22)  4.98 (0.95)         5.18 (0.71)  5.55 (0.78)         5.93 (0.70)
11  Turn on an audio player                                     4.46 (1.38)         4.65 (1.41)  4.22 (1.21)         4.49 (1.62)  5.31 (1.14)         5.00 (1.18)
12  Turn off an audio player                                    4.59 (1.22)         4.58 (1.33)  4.54 (1.20)         5.48 (0.53)  5.34 (0.90)         5.40 (0.89)
13  Increase the volume of an audio player                      5.29 (0.95)         5.71 (0.69)  5.15 (1.08)         5.96 (0.66)  5.55 (1.15)         5.43 (1.40)
14  Decrease the volume of an audio player                      5.35 (0.96)         5.68 (0.75)  5.19 (1.25)         5.61 (1.17)  5.31 (1.26)         5.39 (1.20)
15  Go to the next track (track up) of an audio player          5.39 (1.03)         4.96 (1.15)  5.00 (1.15)         5.26 (0.78)  5.79 (0.68)         6.07 (0.59)
16  Go to the previous track (track down) of an audio player    4.93 (1.34)         4.72 (1.41)  5.06 (1.13)         5.26 (0.74)  5.69 (0.81)         6.06 (0.57)
17  Answer a phone                                              4.91 (1.44)         5.68 (1.23)  5.58 (1.21)         5.98 (0.83)  5.69 (1.00)         6.00 (1.10)
18  Hang up a phone                                             4.85 (1.30)         4.59 (0.91)  5.29 (1.19)         5.79 (0.82)  5.66 (1.17)         5.76 (1.20)
19  Increase the volume of a phone                              5.09 (0.97)         5.27 (0.99)  5.18 (0.79)         5.26 (0.60)  5.48 (1.02)         5.57 (1.09)
20  Decrease the volume of a phone                              5.12 (0.97)         5.29 (1.01)  5.13 (0.89)         5.21 (0.78)  5.62 (0.94)         5.69 (0.95)
21  Turn on a light                                             4.79 (1.34)         5.42 (0.80)  4.58 (1.39)         4.12 (1.05)  5.93 (0.88)         6.23 (0.73)
22  Turn off a light                                            4.80 (1.28)         5.24 (2.09)  4.69 (1.44)         3.91 (1.24)  5.72 (1.00)         5.75 (0.71)
23  Turn on a desk lamp                                         4.64 (1.41)         5.25 (1.65)  4.73 (1.34)         4.75 (1.56)  5.69 (0.81)         5.90 (0.88)
24  Turn off a desk lamp                                        4.24 (1.38)         4.57 (1.37)  4.79 (1.32)         4.82 (1.82)  5.69 (0.81)         5.85 (0.80)
25  Turn up the light of a desk lamp                            4.50 (1.13)         4.72 (0.81)  4.64 (0.96)         4.69 (0.70)  5.45 (1.12)         5.27 (0.96)
26  Turn down the light of a desk lamp                          4.62 (1.10)         4.20 (1.44)  4.43 (1.06)         4.67 (0.90)  5.41 (1.05)         5.38 (1.09)
27  Open a curtain(s)                                           5.95 (0.77)         6.00 (0.59)  6.34 (0.74)         6.30 (0.76)  -                   -
28  Close a curtain(s)                                          5.63 (1.14)         5.55 (1.05)  6.35 (0.71)         6.32 (0.73)  -                   -
29  Open a blind                                                5.31 (1.06)         5.34 (1.15)  5.58 (0.98)         5.74 (0.90)  -                   -
30  Close a blind                                               5.23 (1.25)         5.66 (0.86)  5.44 (1.06)         5.63 (0.97)  -                   -
31  Open a door(s)                                              5.55 (0.90)         5.76 (0.87)  5.29 (1.27)         5.46 (1.50)  -                   -
32  Close a door(s)                                             5.44 (1.20)         5.99 (1.02)  5.35 (1.24)         5.41 (1.44)  -                   -
33  Open a window(s)                                            5.52 (1.25)         5.66 (1.46)  5.34 (1.08)         5.42 (0.91)  -                   -
34  Close a window(s)                                           5.79 (0.91)         5.67 (1.19)  5.33 (1.08)         5.38 (0.88)  -                   -
35  Turn on a faucet                                            5.15 (1.21)         5.78 (1.15)  5.08 (1.27)         5.43 (0.99)  -                   -
36  Turn off a faucet                                           5.00 (1.27)         5.28 (1.10)  5.22 (1.20)         5.46 (0.97)  -                   -
37  Raise temperature of a faucet                               4.40 (1.53)         3.65 (1.57)  4.64 (1.24)         4.79 (1.62)  -                   -
38  Lower temperature of a faucet                               4.55 (1.33)         4.30 (1.38)  4.60 (1.35)         4.80 (1.62)  -                   -

References

Akers, D., 2006. Wizard of Oz for participatory design: inventing a gestural interface for 3D selection of neural pathway estimates. In: Proc. of the ACM Conf. on CHI '06, pp. 454-459.
Bhuiyan, M., Picking, R., 2011. A gesture controlled user interface for inclusive design and evaluative study of its usability. J. Softw. Eng. Appl. 4, 513-521.
Blackler, A., Popovic, V., Mahar, D., 2010. Investigating users' intuitive interaction with complex artifacts. Appl. Ergon. 41, 72-92.
Buisine, S., Martin, J.C., 2007. The effects of speech-gesture cooperation in animated agents' behavior in multimedia presentations. Interact. Comput. 19, 484-493.
Dipietro, L., Sabatini, A.M., 2008. A survey of glove-based systems and their applications. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 38 (4), 461-482.
Epps, J., Lichman, S., Wu, M., 2006. A study of hand shape use in tabletop gesture interaction. In: Proc. of the ACM Conf. on CHI '06, pp. 748-753.
Grandhi, S.A., Joue, G., Mittelberg, I., Jarke, M., 2010. Designing touchless gesture-based interfaces for human computer interaction: insights from coverbal gestures. In: Proc. of SIGHCI.
Henze, N., Locken, A., Boll, S., Hesselmann, T., Pielot, M., 2010. Free-hand gestures for music playback: deriving gestures with a user-centered process. In: Proc. of MUM Int. Conf. Mob. Ubiquitous Multimed., vol. 16, pp. 1-10.
Hurtienne, J., Weber, K., Blessing, L., 2008. Prior experience and intuitive use: image schemas in user centered design. In: Langdon, P., Clarkson, J., Robinson, P. (Eds.), Designing Inclusive Futures. Springer, London, pp. 107-116.
Johnson, J.A., 1995. A comparison of user interfaces for panning on a touch controlled display. In: Proc. of the ACM Conf. SIGCHI, pp. 218-225.
Kim, H.J., Jeoung, K.H., Kim, S.K., Han, T.D., 2011. Ambient wall: smart wall display interface which can be controlled by simple gesture for smart home. In: Proc. of SIGGRAPH Asia (SA '11), Hong Kong, China, pp. 1-2.
Kita, S., 2000. How representational gestures help speaking. In: McNeill, D. (Ed.), Language, Culture and Cognition 2: Language and Gesture. Cambridge University Press.
Kuhnel, C., Westermann, T., Hemmert, F., Kratz, S., Muller, A., Moller, S., 2011. I'm home: defining and evaluating a gesture set for smart-home control. Int. J. Hum. Comput. Stud., 693-704.
Kwon, S., Choi, E., Chung, M.K., 2011. Effect of control-to-display gain and movement direction of information spaces on the usability of navigation on small touch-screen interfaces using tap-n-drag. Int. J. Ind. Ergon. 41 (3), 322-330.
Lee, S., Kim, S., Jin, B., Choi, E., Kim, B., Jia, X., Kim, D., Lee, K., 2010. How users manipulate deformable displays as input devices. In: Proc. of the ACM Conf. on CHI '10, Atlanta, Georgia, USA, pp. 1647-1656.
Lepinski, G.J., Grossman, T., Fitzmaurice, G., 2010. The design and evaluation of multitouch marking menus. In: Proc. of the ACM Conf. on CHI '10, Atlanta, Georgia, USA, pp. 2233-2242.
Mather, M., Johnson, M.K., 2000. Choice-supportive source monitoring: do our decisions seem better to us as we age? Psychol. Aging 15 (4), 596-606.
Mauney, D., Howarth, J., Wirtanen, A., Capra, M., 2010. Cultural similarities and differences in user-defined gestures for touch screen user interfaces. In: Proc. of the ACM Conf. on CHI '10, Atlanta, Georgia, USA, pp. 4015-4020.
Mitchell, T., Heap, I., 2011. SoundGrasp: a gestural interface for the performance of live music. In: Proc. of the Int. Conf. on New Interfaces for Musical Expression.
Neßelrath, R., Lu, C., Schulz, C.H., Frey, J., Alexandersson, J., 2011. A gesture based system for context-sensitive interaction with smart homes. In: Wichert, R., Eberhardt, B. (Eds.), Advanced Technologies and Societal Change. Berlin, pp. 209-219.
Ng, W., Ng, C., Noordin, N.K., Ali, B.M., 2011. Gesture based automating household appliances. In: Interaction Techniques and Environments. Lecture Notes in Computer Science, Hum. Comput. Interact. 6762, 285-293.
Nielsen, M., Moeslund, T., Storring, T., Granum, E., 2003. A Procedure for Developing Intuitive and Ergonomic Gesture Interfaces for Man-Machine Interaction. Technical Report CVMT 03-01. Aalborg University, Aalborg.
Nielsen, M., Storring, T., Moeslund, T., Granum, E., 2004. A procedure for developing intuitive and ergonomic gesture interfaces for HCI. In: Gesture-Based Communication in Human-Computer Interaction, pp. 105-106.
Norman, D., 1999. Affordances and design. In: Proc. of the ACM Conf. (open discussion).
O'Hagan, R.G., Zelinsky, A., Rougeaux, S., 2002. Visual gesture interfaces for virtual environments. Interact. Comput. 14, 231-250.
Ouchi, K., Esaka, N., Tamura, Y., Hiraharam, M., Doi, M., 2005. Magic wand: an intuitive gesture remote control for home appliances. In: Proc. of the Int. Conf. on Active Media Technology, p. 274.
Pan, G., Wu, J., Zhang, D., Wu, Z., Yang, Y., Li, S., 2010. GeeAir: a universal multimodal remote control device for home appliances. Pers. Ubi. Comput. 14 (8), 723-735.
Park, W., 2012. A Multi-Touch Gesture Vocabulary Design Methodology for Mobile Devices. Ph.D. thesis. Division of Mechanical and Industrial Engineering, POSTECH, Pohang, Korea.
Pavlovic, V., Sharma, R., Huang, T.S., 1997. Visual interpretation of hand gestures for human-computer interaction: a review. IEEE TPAMI 19 (7), 677-695.
Saffer, D., 2008. Designing Gestural Interfaces: Touch Screens and Interactive Devices. O'Reilly Media, Sebastopol.
Sears, A., Arora, R., 2002. Data entry for mobile devices: an empirical comparison of novice performance with Jot and Graffiti. Interact. Comput. 14, 413-433.
Seifried, T., Haller, M., Scott, S.D., Perteneder, F., Rendl, C., Sakamoto, D., Inami, M., 2009. Cristal: a collaborative home media and device controller based on a multi-touch display. In: Proc. of the Int. Conf. on Interactive Tabletops and Surfaces, pp. 37-44.
Shirai, K., Furui, S., 1994. Special issue on spoken dialogue. Speech Comm. 15, 3-4.
Stern, H.I., Wachs, J.P., Edan, Y., 2008. Optimal consensus intuitive hand gesture vocabulary design. In: Proc. of IEEE Int. Conf. Semantic Comput., pp. 96-103.
Tversky, B., Kugelmass, S., Winter, A., 1991. Cross-cultural and developmental trends in graphic productions. Cognit. Psychol. 23 (4), 515-557.
Wilson, A., Shafer, S., 2003. XWand: UI for intelligent spaces. In: Proc. of the ACM Conf. on SIGCHI, vol. 1 (5), pp. 545-552.
Wobbrock, J.O., Morris, M.R., Wilson, A.D., 2009. User-defined gestures for surface computing. In: Proc. of the ACM Conf. on CHI '09, pp. 1083-1092.
Yee, W., 2009. Potential limitations of multi-touch gesture vocabulary: differentiation, adoption, fatigue. In: Proc. of the 13th Int. Conf. on Hum. Comput. Interact., pp. 291-300.
