Deetjen U, Powell JA. J Am Med Inform Assoc 2016;0:1–6. doi:10.1093/jamia/ocv190, Research and Applications Journal of the American Medical Informatics Association Advance Access published February 15, 2016

Informational and emotional elements in online support groups: a Bayesian approach to large-scale content analysis

RECEIVED 30 April 2015 REVISED 4 November 2015 ACCEPTED 7 November 2015

Ulrike Deetjen1, John A Powell2

ABSTRACT

Keywords: online support groups, informational support, emotional support, health, e-health, Internet

BACKGROUND AND SIGNIFICANCE

Online support groups are one of the major ways in which the Internet has fundamentally changed how people experience health and health care. They provide a platform for health discussions formerly restricted by time and place, enable individuals to connect with others in similar situations, and facilitate open, anonymous communication.1 Online support groups may positively impact how patients cope with health conditions,2 provide support for stigmatized conditions,3 and may empower patients to assume an active role in their health,4,5 even though robust evidence of actual health benefits is scant.6

Previous research has identified that both active and passive participation in online support groups fulfill a variety of functions in individuals’ lives, including information provision, tangible assistance, network support to emotional support, esteem support, recognition, sharing experiences, learning to tell one’s own story, and visualizing the disease.5,7,8 These forms of support broadly fall into 2 categories, informational and emotional support, both of which can have positive effects on well-being.9 Previous studies have consistently found that informational and emotional support make up the largest share of posts and occur with about similar frequencies in online support groups,10–13 although there are differences in the absolute numbers depending on whether subcategories (such as esteem support or instrumental support) were subsumed among emotional and informational posts, as is done in this research.

This categorization is not restricted to online settings; research over decades has identified that patients often have a combination of informational and emotional needs in relation to their health. These may vary based on demographics, individual preferences, and the nature and stage of illness.14–17 The assistance of online support groups can cater to these needs, as close communities may develop online and may provide a space for exchanging information and emotional aid.18 In particular, the ability to speak to larger groups may help with both aims, while reduced cues in written, computer-mediated, and asynchronous communication may foster emotional support provision.19 Weak ties also extend the reach of individuals for exchanging information even if no personal relationship exists.20 For health issues, both of these functions may come together, as shared health concerns help to develop a feeling of cohesiveness.21

For specific conditions, research found that the primary function of a group for irritable bowel syndrome (IBS) was providing information on interpreting symptoms, managing the illness, and interacting with health professionals.7 Satisfying both emotional and informational needs were key functions within an eating disorder group,11 a Huntington’s disease online support group,22 and a breast cancer forum.23,24 Comparative research on breast cancer versus prostate cancer concluded that women seek more emotional support, while men look for information,25 keeping in mind the different nature of these conditions, which lead to distinct needs and experiences regardless of gender. However, in support of the gender differences, earlier research comparing mailing lists on ovarian cancer with prostate cancer also found the same gender-specific communication patterns,26 while single-condition research on prostate cancer evidenced a focus on information seeking for evaluating treatment options and outcomes in addition to emotional support needs.16,27

Correspondence to Ulrike Deetjen, MSc, Oxford Internet Institute, University of Oxford, Oxford, UK; [email protected]. For numbered affiliations see end of article. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: [email protected].


Downloaded from http://jamia.oxfordjournals.org/ by guest on February 19, 2016


RESEARCH AND APPLICATIONS

Objective This research examines the extent to which informational and emotional elements are employed in online support forums for 14 purposively sampled chronic medical conditions and the factors that influence whether posts are of a more informational or emotional nature. Methods Large-scale qualitative data were obtained from Dailystrength.org. Based on a hand-coded training dataset, all posts were classified into informational or emotional using a Bayesian classification algorithm to generalize the findings. Posts that could not be classified with a probability of at least 75% were excluded. Results The overall tendency toward emotional posts differs by condition: mental health (depression, schizophrenia) and Alzheimer’s disease consist of more emotional posts, while informational posts relate more to nonterminal physical conditions (irritable bowel syndrome, diabetes, asthma). There is no gender difference across conditions, although prostate cancer forums are oriented toward informational support, whereas breast cancer forums rather feature emotional support. Across diseases, the best predictors for emotional content are lower age and a higher number of overall posts by the support group member. Discussion The results are in line with previous empirical research and unify empirical findings from single/2-condition research. Limitations include the analytical restriction to predefined categories (informational, emotional) through the chosen machine-learning approach. Conclusion Our findings provide an empirical foundation for building theory on informational versus emotional support across conditions, give insights for practitioners to better understand the role of online support groups for different patients, and show the usefulness of machine-learning approaches to analyze large-scale qualitative health data from online settings.


METHODS

The research presented in this paper uses RSS feed scraping to collect data and employs a machine-learning approach based on a Naïve Bayesian classifier to categorize the posts into more informational and emotional messages. Both data collection and analysis are made largely scalable, hence enabling insights based on large-scale online data. Our methodology operates within the pragmatic research paradigm and represents a quantitatively informed approach to qualitative data analysis as one potential path on the continuum of possibilities in content analysis.29 While the 2 main categories (emotional, informational) are predefined based on the existing literature,10–13 the features of informational and emotional posts are picked up as part of the classifier’s learning process, similar to qualitative research. “Counting” these features stands in contrast with detailed qualitative analysis, as it reduces the richness of the data.29 Our approach acknowledges the variety of patient experiences that previous research has found in online support groups,11,25,27 but seeks to take advantage of quantitative approaches to explore the larger context of how individuals interact in online support groups.

Dailystrength.org is a worldwide support group platform with forums that cut across all major conditions. Individuals have to sign up to be able to post to the forum, while passive participation (browsing and reading only) is possible without any login. Similar to many other forums, it offers both emotional support and information sharing, though with an added feature for emotional support by offering the opportunity to give online “hugs” to others30 as well as the exchange of messages. This research, however, will look only at the content of the messages exchanged in the discussion forums and analyze whether these are informationally or emotionally oriented.


In the first step of capturing the data, large-scale qualitative data were scraped from the website Dailystrength.org by RSS feed parsing using a Python script to access all responses to the threads for the 14 selected conditions. To ensure comparability across conditions and time, up to 500 threads between July 16, 2010 and July 16, 2014 were scraped for each condition, resulting in a total of 40 612 posts. The dataset was cleaned by converting all words to lowercase, removing all numbers and punctuation, and adding missing data. The names of the 14 conditions (and their abbreviations) were also removed from all posts, as they would create noise for the classifier that is specific to the dataset rather than providing informative features across conditions.31

The next step consisted of classifying the data into informational and emotional posts using a Naïve Bayesian classifier. This technique is a machine-learning approach in which a classifier is “trained” in what informational and emotional posts are and can then be applied to classify an arbitrarily large number of posts (40 612 in this research). For this, a subset of 2000 posts (50% emotional, 50% informational) was purposively selected from the overall dataset to obtain good instances of informational and emotional posts. It included posts from all 14 conditions, roughly reflecting the frequency of posts per condition in the overall dataset. These 2000 posts were hand-coded by one of the authors. To verify coding quality, a second author coded a 20% sample of 400 posts, randomly selected from the 2000, giving a Cohen’s kappa for inter-rater reliability of 0.965. Table 1 provides a few examples of informational and emotional posts. Out of these 2000 posts, 1500 were randomly selected for the training dataset for the classifier to learn to distinguish the posts. Only unigrams were used, thereby ignoring the position of single words and compound structures.
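The cleaning and classification steps described above can be sketched as follows. This is a minimal illustration, not the authors' script: the paper does not state which Naïve Bayes variant or library was used, so the multinomial model with add-one smoothing, the names `clean`, `UnigramNaiveBayes`, and `classify`, and the toy training posts are all assumptions. The 0.75 cutoff mirrors the study's rule of excluding posts that could not be classified with a probability of at least 75%.

```python
import math
import re
from collections import Counter


def clean(text, stop_terms=()):
    """Lowercase, strip digits and punctuation, drop condition names."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return [w for w in text.split() if w not in stop_terms]


class UnigramNaiveBayes:
    """Multinomial Naive Bayes over unigrams (assumed variant)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # add-one (Laplace) smoothing by default

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict_proba(self, tokens):
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for c in self.classes:
            denom = sum(self.word_counts[c].values()) + self.alpha * len(self.vocab)
            score = math.log(self.doc_counts[c] / total_docs)
            for w in tokens:
                if w in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[c][w] + self.alpha) / denom)
            log_scores[c] = score
        # Normalise in log space to avoid underflow on long posts
        top = max(log_scores.values())
        exp_scores = {c: math.exp(s - top) for c, s in log_scores.items()}
        norm = sum(exp_scores.values())
        return {c: v / norm for c, v in exp_scores.items()}


def classify(model, tokens, threshold=0.75):
    """Return the most probable label, or None if below the threshold."""
    probs = model.predict_proba(tokens)
    label, p = max(probs.items(), key=lambda kv: kv[1])
    return label if p >= threshold else None


# Hypothetical hand-coded training posts (illustration only)
toy_posts = [
    ("sending hugs and prayers", "emotional"),
    ("so sorry sending strength", "emotional"),
    ("taking the dose twice at night", "informational"),
    ("side effects started with the dose", "informational"),
]
model = UnigramNaiveBayes().fit(
    [clean(text) for text, _ in toy_posts],
    [label for _, label in toy_posts],
)
print(classify(model, clean("Hugs and prayers!")))
```

Returning `None` for low-confidence posts keeps the exclusion decision separate from the model itself, so the threshold can be varied without retraining.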
The 15 most informative features for emotional posts in order of decreasing weight were “prayers,” “sorry,” “hugs,” “glad,” “thoughts,” “deal,” “welcome,” “thank,” “god,” “loved,” “strength,” “alone,” “support,” “wonderful,” and “sending.” In contrast, informational posts mainly contained the words “effects,” “started,” “weight,” “blood,” “eating,” “drink,” “dose,” “night,” “recently,” “taking,” “side,” “using,” “twice,” and “meal.” No features were excluded from the classification, except for the condition names removed at the stage of data cleaning. The remaining 500 posts were used as a test dataset to evaluate the classifier. Recall for informational and emotional posts was 92% and 96%, respectively, whereas precision was 95% and 92%. This indicates that informational posts were mistaken for emotional posts more frequently, but a lower number of emotional posts was

Table 1: Examples of informational and emotional posts

| Informational posts | Emotional posts |
| --- | --- |
| “I’ve had the most success with [product]. You still have to make sure you’re drinking enough water or juice. Also vegetables. It makes a big difference. Let me know if it helps.” | “I’m sorry that you are so uncomfortable right now. We all have episodes where it just seems like Fibromyalgia rules no matter what.” |
| “I know the Prednisone can cause high numbers and you have to watch, have you mentioned this to your doctor? Do keep an eye on this and let them know if you go too high.” | “Glad to hear from you and know you are okay. Will be thinking of you on Thursday and pray that things will go alright. Keep us posted. Hugs.” |

Source: Dailystrength.org


Beyond this predominantly qualitative research focusing on 1 or 2 conditions to compare informational and emotional support, and studies on the patterns of activity in general,28 research has not yet looked at their coexistence across conditions. A reason for this may be the limits of manual data collection and processing. These restrictions mean that samples in previous research have been relatively small, usually only about 1000 posts,7,11,22,25 making datasets too small to meaningfully compare conditions. Research on online support groups has recently started to take advantage of computational approaches to large-scale data analysis,23,24 but has not yet used this to compare informational and emotional posts across health conditions. Insights into the patterns of support and the explanatory factors would be valuable for health practitioners who seek to promote patient-centered care and to harness patient engagement. It would also advance the theoretical understanding of how the characteristics of online support groups can cater to various patient needs. This research intends to promote this understanding by analyzing the extent to which informational and emotional messages are exchanged online across different conditions and what other factors determine emotional and informational exchange across conditions. To do so, this research draws on online support group posts across 14 common conditions: breast cancer, prostate cancer, lung cancer, depression, schizophrenia, Alzheimer’s disease, multiple sclerosis, cystic fibrosis, fibromyalgia, heart failure, diabetes type 2, IBS, asthma, and chronic obstructive pulmonary disease (COPD). These conditions were purposively sampled prior to any analysis taking place to cover a range of common chronic conditions with differing demographic coverage and a range of support needs. 
Beyond comparing these conditions, this research also regards age, gender, duration of membership, and number of posts (as a measure of size and activity), all of which may influence the dynamics of interaction in support groups.11,25,28


RESULTS

Of the 38 337 posts in our sample, 58% were classed as “emotional” (having mainly emotional support content) and 42% were classed as “informational” (with mainly informational support content). Across all posts for which gender is available, there is no difference between male and female contributions (b = 0.001; P = .91; N_female = 22 059 and N_male = 5382 out of 27 441). Emotional and informational posts vary with age, with older individuals tending to write more informational posts (b = 0.082; P = .00; N = 18 603), but with some exceptions in the age group 65–75 based on visual inspection of the cross-sectional data. Across conditions, emotional and informational posts vary considerably (see Table 2). While a multitude of factors may influence this difference (e.g., the size of the forum, offline contact of a group, other characteristics specific to the group), some trends exist that may be related to the characteristics of the disease itself. Groups for brain-related conditions receive a high number of emotional posts. This includes mental health conditions such as depression (87% emotional posts) and schizophrenia (75% emotional posts), and also neurodegenerative diseases such as Alzheimer’s disease (83% emotional posts), where the majority of online communication may be by caregivers. Informational posts tended to relate to nonterminal physical conditions such as IBS (71% informational posts), diabetes type 2 (65% informational posts), and asthma (62% informational posts). While there are no gender differences across all posts, discussions on

Table 2: Frequency of emotional and informational posts and demographic characteristics of support group members across conditions

| Condition | N | Emotional posts (%) | Informational posts (%) | Average age | Gender (% female) |
| --- | --- | --- | --- | --- | --- |
| Depression | 3647 | 87 | 13 | 39.3 | 64 |
| Alzheimer’s disease | 3799 | 83 | 17 | 49.8 | 81 |
| Schizophrenia | 3480 | 75 | 25 | 35.9 | 70 |
| Cystic fibrosis | 275 | 65 | 35 | 36.3 | 76 |
| Lung cancer | 923 | 63 | 37 | 50.8 | 81 |
| Breast cancer | 2738 | 59 | 41 | 48.0 | 96 |
| Fibromyalgia | 4508 | 58 | 42 | 45.7 | 92 |
| COPD | 4757 | 56 | 44 | 60.9 | 59 |
| Heart failure | 2683 | 48 | 52 | 48.3 | 64 |
| Multiple sclerosis | 3258 | 44 | 56 | 46.7 | 83 |
| Prostate cancer | 255 | 43 | 57 | 57.8 | 37 |
| Asthma | 2433 | 38 | 62 | 40.6 | 78 |
| Diabetes type 2 | 3161 | 35 | 65 | 50.2 | 76 |
| IBS | 2420 | 29 | 71 | 44.4 | 79 |
| Total | 38 337 | 58 | 42 | 45.9 | 77 |
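As a quick consistency check on Table 2, the overall share of emotional posts should equal the per-condition shares weighted by the number of posts. The sketch below is illustrative only: the row values are transcribed from the table, while the names `TABLE2` and `overall_emotional_share` are made up for this example. Because the per-condition percentages are already rounded, the result can only be expected to agree with the reported total to the nearest percentage point.

```python
# (condition, N, % emotional) transcribed from Table 2
TABLE2 = [
    ("Depression", 3647, 87),
    ("Alzheimer's disease", 3799, 83),
    ("Schizophrenia", 3480, 75),
    ("Cystic fibrosis", 275, 65),
    ("Lung cancer", 923, 63),
    ("Breast cancer", 2738, 59),
    ("Fibromyalgia", 4508, 58),
    ("COPD", 4757, 56),
    ("Heart failure", 2683, 48),
    ("Multiple sclerosis", 3258, 44),
    ("Prostate cancer", 255, 43),
    ("Asthma", 2433, 38),
    ("Diabetes type 2", 3161, 35),
    ("IBS", 2420, 29),
]


def overall_emotional_share(rows):
    """Return (total posts, N-weighted mean of % emotional)."""
    total = sum(n for _, n, _ in rows)
    weighted = sum(n * pct for _, n, pct in rows)
    return total, weighted / total


total_posts, emotional_share = overall_emotional_share(TABLE2)
print(total_posts, round(emotional_share, 1))
```

The row counts sum to the reported sample size of 38 337, and the weighted share rounds to the reported 58%.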


incorrectly classified as informational. The overall accuracy of the classifier was 94%. This was confirmed by 4-fold cross validation, repeating the random allocation of posts to the test and training dataset by partitioning the 2000 posts into 4 parts and iteratively using each as a test dataset (accuracy values obtained: 93.6%, 95.2%, 91.6%, 96.0%). Of course, the boundaries between both post types may indeed be fluid, as emotional support and informational support often go hand in hand.23,24 Therefore, the classification of a post only means that it was mainly informational or emotional, but not necessarily exclusively either. Then again, some posts may contain substantial elements of both or neither of the two. There were 2275 posts that would not fall into the emotional or informational category for one reason or another with a sufficient degree of certainty (probability of at least 75%) and were excluded from further analysis (leaving N = 38 337 out of 40 612).

For analyzing the data, the first part entailed cross-sectional analysis to find differences in the proportions of emotional and informational posts per condition. The second step consisted of fitting a hierarchical logistic model to predict informational or emotional posts, with variation at the level of conditions, individuals, and threads being adjusted for by including random slopes with fixed intercepts across all 3 levels of the multilevel model (using the statistical package R). In this structure, the model then examined the effect of 6 independent variables: the number of posts in the specific forum (at the level of conditions), the number of other posts (at the level of threads), the number of posts written (at the level of the individual), length of membership of Dailystrength.org, gender, and age (at the lowest level of the individual post, where the classification into informational and emotional posts was also measured).
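To make the relationship between recall, precision, and accuracy concrete, the sketch below computes them from a two-class confusion matrix. The counts are hypothetical: the paper reports only the resulting percentages, not the confusion matrix or the class balance of the 500-post test set, so the figures here were chosen merely to be roughly consistent with the reported values. The function name `binary_metrics` is likewise illustrative. The four fold accuracies are the ones reported for the cross validation.

```python
def binary_metrics(tp_info, fn_info, tp_emo, fn_emo):
    """Per-class recall and precision plus overall accuracy.

    fn_info: informational posts wrongly labelled emotional
    fn_emo:  emotional posts wrongly labelled informational
    """
    total = tp_info + fn_info + tp_emo + fn_emo
    return {
        "recall_info": tp_info / (tp_info + fn_info),
        "recall_emo": tp_emo / (tp_emo + fn_emo),
        # Posts predicted informational = correct ones + misfiled emotional ones
        "precision_info": tp_info / (tp_info + fn_emo),
        "precision_emo": tp_emo / (tp_emo + fn_info),
        "accuracy": (tp_info + tp_emo) / total,
    }


# Hypothetical counts for a 500-post test set (NOT taken from the paper)
metrics = binary_metrics(tp_info=230, fn_info=20, tp_emo=240, fn_emo=10)

# Fold accuracies (%) as reported for the 4-fold cross validation
fold_accuracies = [93.6, 95.2, 91.6, 96.0]
mean_accuracy = sum(fold_accuracies) / len(fold_accuracies)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print(f"mean 4-fold accuracy: {mean_accuracy:.1f}%")
```

With these assumed counts, informational posts are misfiled twice as often as emotional ones, which is the pattern a lower recall but higher precision for informational posts implies.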
Ethics approval was granted by the University of Oxford research ethics committee. This research employed only information publicly accessible on Dailystrength.org without login. As a consequence, the number of available cases for both gender (leaving N = 27 441 out of 38 337) and age (leaving N = 18 603 out of 38 337) was reduced, as these measures were included only if individuals disclosed this information publicly on their profile or if it could clearly be identified from the user name.

Table 3: Results from multilevel logistic regression

Outcome variable: Informational (1) or emotional (0) posts, controlling for random effects at the level of conditions, threads, and individuals

| Independent variables | Beta |
| --- | --- |
| Duration of membership | 0.030 |
| Gender | 0.051 |
| Age | 0.197*** |
| Number of posts per individual (at level of individuals) | 0.357*** |
| Number of posts in a thread (at level of threads) | 0.047 |
| Number of posts in forum (at level of conditions) | 0.124 |
| N | 18 603 |
| McFadden’s R² | 0.57 |
| Largest condition index | 3 |

Note: P-values ***< .001

breast cancer, where mostly women communicate, are more emotionally oriented (59% emotional posts), while prostate cancer groups, where mostly men communicate, rather exchange information (43% emotional posts).

For the second part of the analysis, the multilevel hierarchical logistic model predicts informational and emotional posts across all conditions (see Table 3), adjusting for the fact that not all posts were independent of each other. For example, posts were arranged in threads (which may be about topics that invite more informational and emotional responses) and were written by individuals in relation to different conditions, which featured different levels of informational and emotional posts (see Table 2). Table 3 shows that both age and the number of individual posts are significant predictors (P < .001) of a post being informationally or emotionally oriented. However, all other measures at different levels across the model do not significantly help to predict the nature of a post.

DISCUSSION


The results show that both informational and emotional posts exist, with a slight overall orientation toward emotional support. The general existence of both is based on an a priori assumption derived from previous research10–13 and embedded in this research through the initial manual classification of posts in the training dataset. The proportion of 58% of emotional posts in the overall dataset has emerged from the data itself, and confirms that both informational and emotional support are important elements of online support groups. One contribution of this research is to allow comparison of informational and emotional support across different health issues. The results confirm findings of previous research that focused on single conditions. Online support groups for mental health conditions (depression and schizophrenia in our study) have a high number of emotional posts.32 Similarly, contexts where mostly caregivers communicate, as in the case of an online hospice community for Alzheimer’s disease, have been found to have a high number of posts seeking and providing emotional support.33 In line with previous research,7,14 we also found that computer-mediated support for individuals suffering from IBS and diabetes largely consisted of informational support on interpreting symptoms and managing the condition. This research found no major gender differences across all posts. However, echoing previous findings,25,26 female discussions on breast/ovarian cancer are more emotional, while prostate cancer groups feature more informational support. 
A potential explanation may lie in the nature of single-gender vs mixed-gender groups (although some users of the forum may be family or friends of a different gender), as found in a meta-study on gender differences in online support groups.34 This meta-study concluded that females show more elements of emotional disclosure, congratulate each other for achievements, and provide each other with encouragement and support, whereas men were more likely to provide information about treatments, diagnosis, and symptoms. Interestingly, in mixed-gender discussions, these differences became less evident, perhaps explaining why gender differences are not present for other conditions (without strong gender incidence) in this research. It is worth highlighting that the prostate cancer group is considerably smaller and less active than the breast cancer group, with both fewer threads and fewer replies per thread. However, cystic fibrosis, which has a similar number of posts as prostate cancer, is more emotionally oriented than both prostate and breast cancer (65% emotional posts). In addition, in our sample the average age for breast cancer is 48 years, whereas that of the prostate cancer group is 58 years (see Table 2), so our finding that older people tend to write more informational posts also partially explains the difference.

The finding that higher age is associated with writing posts with more informational content may have several explanations. First, some of the conditions with a higher share of emotional posts (such as depression, schizophrenia, or cystic fibrosis; see Table 2) have a relatively high proportion of younger individuals in their forum populations, and the demographic prevalence of the condition may be enmeshed with the disease-related tendency to write informational or emotional posts. Second, older people are less likely to use the Internet for social purposes, especially in the sense of accessing more weak ties in social networks or discussion groups.35 This may explain why they use the Internet as a source of informational rather than emotional support. Third, older people suffering from chronic conditions may become “experts” in their condition, and may want to share their knowledge and experience more widely.36 Finally, it is also possible that the diagnosis of a severe and/or chronic illness in a younger person has more of an emotional impact,14 given that it is relatively more unexpected and has long-term implications.

The result that neither the number of posts in the thread nor the number of posts in a given forum significantly influenced the nature of posts shows simply that the amount of discussion in a certain context does not necessarily determine its content: Both informational and emotional support may be exchanged in discussions with several contributors. However, the association between the number of posts per individual and an increased number of emotionally oriented posts may indicate that those who contribute a lot tend to write more emotional posts, or that as people become more involved and have conversations, instead of simply providing information as a one-off interaction, they tend to provide more emotional support for each other. In addition, those who have written more emotional posts (rather than asking for factual information) may have received more emotional posts in exchange, thereby incentivizing them to remain part of the specific support group forum23,24 and perhaps write a larger number of posts. While we do not find any significant effects for group membership,24 our measure relates only to the time that an individual has been part of Dailystrength.org in general, not necessarily of the specific forum of the condition.
Our multilevel model explains about 57% of the variance in whether a post is informational or emotional (pseudo R2 for logistic regression, computed without random effects37). This is relatively high in social research, but also reflects the findings of previous research that factors not captured in this model may influence the nature of a post. A multitude of context-specific features of conversations may influence the dynamics of conversation and thereby the classification of posts. For example, it would be useful to consider the individuals’ offline context and demographic variables such as education or health literacy, which have been found to be associated with health and health experiences38,39 and may also influence how individuals make use of online support groups. A limitation of our research is that we do not take network structures into account, which would provide more insights into how the frequency of individuals’ interaction or their degree of closeness influences their communication.18,40 In addition, the nature of an individual’s post may influence its response, as positive or negative emotional and informational self-disclosure may be related to the type of support received.23,24 More generally, analyzing content in online support groups can only capture active participation, but individuals may also benefit from passive lurking on these forums.8 More research in offline settings may be needed to advance insights into the question of whether informationally and emotionally oriented posts in fact meet the informational and emotional needs of the online support group members. 
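For reference, the pseudo R² named in Table 3 (McFadden's) is conventionally defined from the log-likelihoods of the fitted model and of an intercept-only model. The symbols below are standard notation rather than the paper's own, and the formula is given as a general definition, not as a record of how the authors computed their value:

```latex
R^2_{\text{McFadden}} = 1 - \frac{\ln \hat{L}_{\text{model}}}{\ln \hat{L}_{\text{null}}}
```

Here $\hat{L}_{\text{model}}$ is the maximized likelihood of the fitted model and $\hat{L}_{\text{null}}$ that of the intercept-only model. A value of 0.57 therefore means the fitted model's log-likelihood is 43% of the null model's in magnitude, which is a likelihood-based analogue of explained variance rather than a literal share of variance.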
Although machine-learning approaches are not unprecedented in online support group analysis,23,24 our research shows how machinelearning enables asking new research questions of relevance to health research: comparing informational and emotional posts across several health conditions necessitates a larger dataset to ensure a sufficient number of posts per condition, which can hardly be achieved through

Deetjen U, Powell JA. J Am Med Inform Assoc 2016;0:1–6. doi:10.1093/jamia/ocv190, Research and Applications

Finally, from a methodological perspective, this research has shown the usefulness of machine-learning approaches to analyze large-scale qualitative health data in the online age. While manual qualitative data analysis will be more precise and reliable when looking at fine-grained differences in communication, machine-learning approaches can help to extend findings from qualitative research to larger amounts of data and thereby understand the underlying largescale patterns needed for making findings more generalizable and comparable across medical conditions, for example. We believe the research community can usefully harness machine-learning approaches to further explore the nature of interactions in online communities and how these vary by health condition. Future work could examine the longitudinal characteristics of social support in online settings, and could explore whether big data approaches can investigate the relationship between online support and subsequent health and social outcomes.

U.D. had the original idea, designed the study, undertook the data gathering and analysis, and drafted the manuscript. J.A.P. contributed to the design of the study, contributed to the interpretation of the findings, and revised the final manuscript.

FUNDING This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors. U.D. is funded by the Clarendon Fund, the Economic and Social Research Council, and Balliol College, Oxford.

COMPETING INTERESTS The authors have no competing interests to declare.

CONCLUSION

This research showed that across online support forums for 14 health conditions, both informationally and emotionally oriented posts exist to a similar degree, with a slight majority of emotional posts. The nature of posts varies considerably across conditions, with mental health conditions and contexts where mostly caregivers communicate featuring more emotional messages, while nonterminal physical conditions have more informational messages. Across all conditions, there is no difference in the nature of posts by gender, although forums for conditions that affect only 1 gender tend to reveal differences, where forums for conditions that predominantly affect females tend to have more emotional posts in comparison to their male-oriented forum counterparts. More generally, the nature of posts is primarily determined by age and the number of total posts from an individual, whereas neither length of membership, nor gender, nor the overall number of posts in the thread or the condition-specific forum showed significant effects.

The contribution of this research is 3-fold. On a theoretical level, this research helped to advance the understanding of how online support groups may reflect different patient needs across conditions. While previous research has demonstrated that patients have different needs and that informational and emotional elements coexist in online support groups based on researching single conditions, this research demonstrates how they differ across conditions and identifies factors that determine the relative balance of informational or emotional messages. On a practical level, these results provide insights for practitioners and health organizations to better understand the role of online support groups for different patients, which may help in advising patients on the value of online support.

ACKNOWLEDGEMENTS

The authors would like to thank Dr Rebecca Eynon and Dr Jonathan Bright for their comments and suggestions on earlier versions of this paper.

REFERENCES

1. Wright KB, Bell SB. Health-related support groups on the Internet: linking empirical findings to social support and computer-mediated communication theory. J Health Psychol. 2003;8(1):39–54.
2. Malik SH, Coulson NS. Coping with infertility online: an examination of self-help mechanisms in an online infertility support group. Patient Educ Couns. 2010;81(2):315–318.
3. Powell J, McCarthy N, Eysenbach G. Cross-sectional survey of users of Internet depression communities. BMC Psychiatry. 2003;3(1):19.
4. Barak A, Boniel-Nissim M, Suler J. Fostering empowerment in online support groups. Comp Hum Behav. 2008;24(5):1867–1883.
5. van Uden-Kraan CF, Drossaert CH, Taal E, Shaw BR, Seydel ER, van de Laar MA. Empowering processes and outcomes of participation in online support groups for patients with breast cancer, arthritis, or fibromyalgia. Qual Health Res. 2008;18(3):405–417.
6. Eysenbach G, Powell J, Englesakis M, Rizo C, Stern A. Health related virtual communities and electronic support groups: systematic review of the effects of online peer to peer interactions. BMJ. 2004;328(7449):1166.
7. Coulson NS. Receiving social support online: an analysis of a computer-mediated support group for individuals living with Irritable Bowel Syndrome. CyberPsychol Behav. 2005;8(6):580–584.
8. Ziebland S, Wyke S. Health and illness in a connected world: how might sharing experiences on the Internet affect people’s health? Milbank Q. 2012;90(2):219–249.
9. Tanis M. Online social support groups. In: Joinson A, ed. Oxford Handbook of Internet Psychology. Oxford: Oxford University Press; 2007:139–154.


Downloaded from http://jamia.oxfordjournals.org/ by guest on February 19, 2016

CONTRIBUTORS


hand-coding messages. The advantage of the chosen machine-learning approach is that the researcher only needs to select examples of informational and emotional posts; from these, the Bayesian classifier “learns” the features of each category, and the complete set of posts can then be categorized on that basis. Determining the distinctive features of categories may not be possible in most cases, and while data cleaning, adding missing data, and hand-coding the training and test datasets still require a substantial amount of effort, that effort is considerably lower than for classifying entire large-scale datasets, and the approach is transferable across contexts.23,41 A limitation of the Naïve Bayesian classifier based on unigrams is its “bag of words” approach, which ignores the structural position and conjunctions of specific features. However, this may be acceptable given the nature of the classification, as opposed to sentiment analysis, for example, where there are major differences between “good” and “not good.”42 The features extracted by the Naïve Bayesian classifier show how emotional posts contain expressions of affection, support, and empathy, as well as references to religious elements, whereas informational posts relate to specific activities, measures of time and frequency, or other health-related measures such as weight and blood. The lower recall but higher precision for informational posts may be explained by the fact that individuals may add expressions of support and affection even when mainly providing information in their post. Finally, while Dailystrength.org is not an unusual forum in terms of its design features and characteristics, this research to some extent focuses on the case of a forum in the English-speaking world dominated by contributions from the United States. At the same time, a large multicondition forum like Dailystrength.org allows several conditions to be compared while holding constant all other platform characteristics that may influence interaction online.
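To make the classification approach discussed above more concrete, the following is a minimal sketch of a unigram Naïve Bayes classifier with add-one smoothing, written in plain Python. It is not the authors' implementation: the class name `UnigramNaiveBayes`, the helper `precision_recall`, the tokenizer, and the six toy training posts are all hypothetical and chosen only for illustration. The sketch shows how a handful of labeled examples suffice to score unseen posts, why word order is ignored under a "bag of words" representation, and how precision and recall are computed for one category.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a post and split it into unigram tokens (bag of words)."""
    return re.findall(r"[a-z']+", text.lower())


class UnigramNaiveBayes:
    """Multinomial Naive Bayes over unigram counts with add-one smoothing."""

    def fit(self, posts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.class_counts = Counter(labels)      # label -> number of posts
        for post, label in zip(posts, labels):
            self.word_counts[label].update(tokenize(post))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def log_scores(self, post):
        """Return the unnormalized log posterior of each label for a post."""
        total_posts = sum(self.class_counts.values())
        scores = {}
        for label, n_posts in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_posts / total_posts)  # log prior
            for word in tokenize(post):
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return scores

    def predict(self, post):
        scores = self.log_scores(post)
        return max(scores, key=scores.get)


def precision_recall(true_labels, predicted_labels, positive):
    """Precision and recall for one category, treated as the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical hand-coded training examples (not taken from the study data).
train_posts = [
    "sending hugs and prayers, you are so strong",
    "so sorry you feel this way, thinking of you",
    "god bless you, stay strong, love and hugs",
    "I take 20 mg twice daily and check my blood sugar every morning",
    "my weight dropped after three weeks on the new dose",
    "the doctor ran a blood test and changed the dose last month",
]
train_labels = ["emotional"] * 3 + ["informational"] * 3

clf = UnigramNaiveBayes().fit(train_posts, train_labels)
```

With this toy data, `clf.predict("hugs and love to you")` is classified as emotional and `clf.predict("check your blood every week and track weight")` as informational. Because only word counts enter the score, "love and hugs" and "hugs and love" receive the same scores, which is the limitation noted above for cases such as "good" versus "not good". A classifier that labels an informational post as emotional because it also contains supportive phrases lowers recall for the informational category without affecting its precision, which is the pattern `precision_recall` makes visible.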


25. Seale C, Ziebland S, Charteris-Black J. Gender, cancer experience and Internet use: a comparative keyword analysis of interviews and online cancer support groups. Soc Sci Med. 2006;62(10):2577–2590.
26. Sullivan CF. Gendered cybersupport: a thematic analysis of two online cancer support groups. J Health Psychol. 2003;8(1):83–104.
27. Sillence E, Mo PK. Communicating health decisions: an analysis of messages posted to online prostate cancer forums. Health Expect. 2014;17(2):244–253.
28. Davison KP, Pennebaker JW, Dickerson SS. Who talks? The social psychology of illness support groups. Am Psychol. 2000;55(2):205.
29. Morgan DL. Qualitative content analysis: a guide to paths not taken. Qual Health Res. 1993;3(1):112–121.
30. Swan M. Emerging patient-driven health care models: an examination of health social networks, consumer personalized medicine and quantified self-tracking. Int J Environ Res Public Health. 2009;6(2):492–525.
31. Nadkarni PM, Ohno-Machado L, Chapman WW. Natural language processing: an introduction. JAMIA. 2011;18(5):544–551.
32. Griffiths KM, Calear AL, Banfield M, Tam A. Systematic review on Internet support groups (ISGs) and depression (2): what is known about depression ISGs? J Med Internet Res. 2009;11(3):e41.
33. Buis LR. Emotional and informational support messages in an online hospice support community. Comp Inform Nursing. 2008;26(6):358–367.
34. Mo PKH, Malik SH, Coulson NS. Gender differences in computer-mediated communication: a systematic literature review of online health-related support groups. Patient Educ Couns. 2009;75(1):16–24.
35. Dutton WH, Blank G, Groselj D. Cultures of the Internet: The Internet in Britain. Oxford Internet Survey 2013. Oxford: Internet Institute, University of Oxford; 2013.
36. Ziebland S. The importance of being expert: the quest for cancer information on the Internet. Soc Sci Med. 2004;59(9):1783–1793.
37. Hox J. Multilevel Analysis: Techniques and Applications. New York, NY and Hove: Routledge; 2010.
38. DeWalt DA, Berkman ND, Sheridan S, Lohr KN, Pignone MP. Literacy and health outcomes. J Gen Int Med. 2004;19(12):1228–1239.
39. Berkman ND, Sheridan SL, Donahue KE, Halpern DJ, Crotty K. Low health literacy and health outcomes: an updated systematic review. Ann Int Med. 2011;155(2):97–107.
40. Maloney-Krichmar D, Preece J. A multilevel analysis of sociability, usability, and community dynamics in an online health community. ACM Transactions on Computer-Human Interaction (TOCHI). 2005;12(2):201–232.
41. Sebastiani F. Machine learning in automated text categorization. ACM Computing Surveys (CSUR). 2002;34(1):1–47.
42. Go A, Bhayani R, Huang L. Twitter sentiment classification using distant supervision. CS224N Project Report, Stanford; 2009:1–12.

AUTHOR AFFILIATIONS

1 Oxford Internet Institute, University of Oxford, Oxford, UK

2 Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK


10. Ridings CM, Gefen D. Virtual community attraction: why people hang out online. J Comput Mediat Commun. 2004;10(1).
11. Eichhorn KC. Soliciting and providing social support over the Internet: an investigation of online eating disorder support groups. J Comput Mediat Commun. 2008;14(1):67–78.
12. Braithwaite DO, Waldron VR, Finn J. Communication of social support in computer-mediated groups for people with disabilities. Health Commun. 1999;11(2):123–151.
13. Finn J. An exploration of helping processes in an online self-help group focusing on issues of disability. Health Social Work. 1999;24(3):220–231.
14. Beeney LJ, Bakry AA, Dunn SM. Patient psychological and information needs when the diagnosis is diabetes. Patient Educ Couns. 1996;29(1):109–116.
15. Hallström I, Elander G. A comparison of patient needs as ranked by patients and nurses. Scand J Caring Sci. 2001;15(3):228–234.
16. Steginga SK, Occhipinti S, Dunn J, Gardiner RA, Heathcote P, Yaxley J. The supportive care needs of men with prostate cancer (2000). Psycho-Oncology. 2001;10(1):66–75.
17. Leydon GM, Boulton M, Moynihan C, et al. Cancer patients’ information needs and information seeking behaviour: in depth interview study. BMJ. 2000;320(7239):909–913.
18. Wellman B, Gulia M. Virtual communities as communities: net surfers don’t ride alone. In: Kollock P, Smith M, eds. Communities in Cyberspace. New York, NY: Routledge; 1999:167–194.
19. Walther JB. Computer-mediated communication: impersonal, interpersonal, and hyperpersonal interaction. Commun Res. 1996;23(1):3–43.
20. Constant D, Sproull L, Kiesler S. The kindness of strangers: the usefulness of electronic weak ties for technical advice. Organ Sci. 1996;7(2):119–135.
21. Frost JH, Massagli MP. Social uses of personal health information within PatientsLikeMe, an online patient community: what can happen when patients have access to one another’s data. J Med Internet Res. 2008;10(3).
22. Coulson NS, Buchanan H, Aubeeluck A. Social support in cyberspace: a content analysis of communication within a Huntington’s disease online support group. Patient Educ Couns. 2007;68(2):173–178.
23. Wang YC, Kraut RE, Levine JM. Eliciting and receiving online support: using computer-aided content analysis to examine the dynamics of online social support. J Med Internet Res. 2015;17(4):e99.
24. Wang YC, Kraut R, Levine JM. To stay or leave?: the relationship of emotional and informational support to commitment in online health support groups. In: Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work. ACM; 2012:833–842.
