This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Infodemiology, is properly cited. The complete bibliographic information, a link to the original publication on https://infodemiology.jmir.org/, as well as this copyright and license information must be included.
Unlike past pandemics, COVID-19 has been accompanied by an unprecedented surge in both peer-reviewed and preprint research publications, and important scientific conversations about it are widespread on online social networks, even among laypeople. This new phenomenon of scientific discourse is not well understood: we do not know the diffusion patterns of peer-reviewed publications vis-à-vis preprints or what makes them go viral.
This paper aimed to examine how the emotionality of messages about preprint and peer-reviewed publications shapes their diffusion through online social networks in order to inform health science communicators’ and policy makers’ decisions on how to promote reliable sharing of crucial pandemic science on social media.
We collected a large sample of Twitter discussions of early (January to May 2020) COVID-19 medical research outputs, which were tracked by Altmetric, in both preprint servers and peer-reviewed journals, and conducted statistical analyses to examine emotional valence, specific emotions, and the role of scientists as content creators in influencing the retweet rate.
Our large-scale analyses (n=243,567) revealed that scientific publication tweets with positive emotions were transmitted faster than those with negative emotions, especially for messages about preprints. Our results also showed that scientists’ participation in social media as content creators could accentuate the positive emotion effects on the sharing of peer-reviewed publications.
Clear communication of critical science is crucial in the nascent stage of a pandemic. By revealing the emotional dynamics in the social media sharing of COVID-19 scientific outputs, our study offers scientists and policy makers an avenue to shape the discussion and diffusion of emerging scientific publications through manipulation of the emotionality of tweets. Scientists could use emotional language to promote the diffusion of more reliable peer-reviewed articles, while avoiding using too much positive emotional language in social media messages about preprints if they think that it is too early to widely communicate the preprint (not peer reviewed) data to the public.
The COVID-19 pandemic has led to an unparalleled surge in global research publications on a single topic in documented history [
Communication of science to the public has traditionally relied on professionals (eg, journalists, scientists, and public health authorities) to meticulously translate scientific findings for public consumption [
Text-based emotions refer to the presence of fine-grained emotions, such as happy, sad, and angry, in human languages [
Given the plurality of the emotional dynamics in social media sharing, we aimed to first establish which of the 3 theoretical explanations mentioned above is most likely true in the context of social media sharing of COVID-19 scientific research. Although self-enhancement motivation has been established in the context of the interpersonal sharing of professionally mediated science communication [
Peer-reviewed journal publications and preprints differ in their scientific uncertainty in that there is a possibility that the results may be invalidated by subsequent studies [
To address the above gaps in our knowledge, we collected all Twitter discussions of nearly 10,000 early (January to May 2020) COVID-19 English research articles in the life science and biomedical fields in both peer-reviewed journals and preprint servers from Altmetric. Altmetric provides quantification of the attention received online for an individual research article. It is increasingly being used as a research metric for science evaluation [
What aspect of emotion (ie, positive valence, negative valence, or arousal) best explains the emotional dynamics in the social media sharing of COVID-19 scientific outputs?
Do the emotional dynamics of sharing have similar or divergent patterns between messages of preprint and peer-reviewed journal publications?
What are the emotional dynamics associated with the role of scientists as social media message creators in the sharing of COVID-19 science?
To answer our research questions, we collected data from several sources. First, we obtained COVID-19–related medical English peer-reviewed journal publications, published prior to mid-May 2020, from the MEDLINE database (accessed through PubMed), where we retrieved each publication’s unique digital object identifier (DOI). We then used the PubMed application programming interface (API) to further retrieve each publication’s detailed metadata (ie, journal, title, category, authors, abstract, etc). Second, we extracted the DOIs of preprint medical publications in the same period from bioRxiv and medRxiv. We further used the bioRxiv API to extract all detailed metadata of each preprint. At the time of data collection, there were 6552 articles available on MEDLINE and 3725 articles from bioRxiv and medRxiv together. Third, social media mentions of all articles from the MEDLINE database and preprint servers were collected from Altmetric, a London-based commercial company that tracks, analyzes, and collects the online activity around scholarly outputs from a selection of online sources, such as blogs, Twitter, Facebook, Google+, mainstream news outlets, and media. We used a research fetch API to query the Altmetric database using DOIs. Fourth, because of Twitter’s terms of use, Altmetric could only share the status ID of tweets through their API. We further retrieved the details of each tweet through a Twitter developer account using the REST API.
The Altmetric collection of tweets contains original tweets, retweets, quoted tweets, and replies. We used original tweets and their retweets, which yielded a raw sample of 268,003 original tweets created before June 1, 2020. We further removed tweets from nonhuman accounts (eg, organizational accounts or bots) through (1) manually checking and matching all official Twitter accounts of each publisher, journal, and preprint server, and (2) manually checking accounts with excessively high tweet volume (>200 tweets) in our data. This resulted in a final sample of 243,567 original tweets and 729,319 retweets. See
By focusing on the early months (January to May 2020) of the COVID-19 pandemic, we generated a large corpus of original tweets (n=243,567) for analysis. Accordingly, our data covered 8612 articles from 1161 peer-reviewed journals in the MEDLINE database and 2 preprint servers (ie, bioRxiv and medRxiv) in the life science and biomedical fields (see
Our raw tweet data contained many non-English tweets as Altmetric collected those tweets based on the presence of valid URLs to the DOI-referenced articles instead of text keywords. To process these data, we wrote and used a simple detect-then-translate program, using a Google Translate API, to translate all non-English tweet texts, user screen names, and user biographies (self-described text descriptions) to English. The translated tweet texts were then used to generate variables in this research. Specifically, to quantify the emotion in each tweet, we first used the previously validated Linguistic Inquiry and Word Count (LIWC) dictionaries [
As mentioned earlier, the discrete perspective is also a critical theoretical approach to investigate emotions [
Lastly, content sharing was measured by the number of retweets. Because our data covered a relatively long timespan (ie, 5 months), we counted the number of retweets within a fixed period (eg, the first 168 hours [a week]) after the time of the tweet to make the retweet count of different tweets comparable.
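The fixed-window retweet count can be illustrated with a minimal sketch. The function name `retweets_within_window`, the sample timestamps, and the inclusive treatment of the 168-hour boundary are assumptions made for illustration; they are not details reported in this study.

```python
from datetime import datetime, timedelta

# Fixed observation window: the first 168 hours (1 week) after the tweet.
WINDOW = timedelta(hours=168)


def retweets_within_window(tweet_time, retweet_times, window=WINDOW):
    """Count retweets posted no later than `window` after the original tweet.

    The boundary is treated as inclusive here; that choice is an assumption.
    """
    cutoff = tweet_time + window
    return sum(1 for t in retweet_times if tweet_time <= t <= cutoff)


# Hypothetical timestamps for one original tweet and its retweets.
original = datetime(2020, 3, 1, 12, 0)
retweets = [
    datetime(2020, 3, 1, 13, 0),  # 1 hour later: counted
    datetime(2020, 3, 8, 12, 0),  # exactly 168 hours later: counted
    datetime(2020, 3, 8, 12, 1),  # 1 minute past the window: excluded
    datetime(2020, 4, 1, 9, 0),   # weeks later: excluded
]
rt7d = retweets_within_window(original, retweets)
```

Counting within the same window for every tweet keeps early and late tweets comparable, because a tweet posted in January would otherwise have had months longer to accumulate retweets than one posted in May.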
Answering our third research question required us to identify scientists in related fields (ie, medical doctors or academic researchers in the life science and biomedical fields) among tweet message creators. Unfortunately, there was no reliable existing method for identifying the relevant scientists. To remain cost-effective and keep the research scope focused, we developed (and pilot tested) a 2-step classification approach that relied on keyword identification and heuristic rules. This rule-based algorithm extracted formal job titles (eg, clinician, doctor, physician, and surgeon) and related medical terms (eg, cardiology and gastroenterology) from the user screen name and text biography and then differentiated scientists from nonscientists. Manual verification coding yielded an F1 score of 95.5% for classification performance. We acknowledge that this method is imperfect, as it can lead to underidentification of scientists. We estimated 30%-50% underidentification through manual validation of our classification results on random samples (
We further included a wide range of previously established control variables that capture the characteristics of the users, referenced articles, and COVID-19 pandemic situation.
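The keyword-based identification of scientists described above can be sketched roughly as follows. This is only an illustration: the keyword sets contain just the examples named in the text, and the combining rule in the second step (flag a user if either kind of keyword appears) is an assumption, since the actual heuristic rules are not given here.

```python
import re

# Example keywords taken from the text; the full lists are not reproduced here.
JOB_TITLES = {"clinician", "doctor", "physician", "surgeon"}
MEDICAL_TERMS = {"cardiology", "gastroenterology"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def is_scientist(screen_name, biography):
    """Classify a user from the screen name and biography (illustrative only)."""
    tokens = tokenize(screen_name) | tokenize(biography)
    # Step 1: keyword identification.
    has_title = bool(tokens & JOB_TITLES)
    has_term = bool(tokens & MEDICAL_TERMS)
    # Step 2: heuristic rule (assumed): either signal is sufficient.
    return has_title or has_term


flag_a = is_scientist("jane_doe", "Physician and researcher in cardiology")
flag_b = is_scientist("newsfan", "Coffee, politics, and football")
```

A rule of this kind misses scientists whose biographies use none of the listed keywords, which is consistent with the underidentification acknowledged above.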
Example tweets of each specific emotion.
Emotion | Tweet examplesa |
Joy | “Some more good news - In this cohort of patients hospitalized for severe Covid-19 who were treated with compassionate-use [DRUG], clinical improvement was observed in [NUMBER] of [NUMBER] patients. #coronavirus #COVID-19” “Good news. Large, retrospective [JOURNAL] study of n=[NUMBER]. [DRUG] did not increase risk of severe #COVID19.” “Some clinical important found about 2019-nCoV from [JOURNAL]. I picked up some important info and translate it Here.” |
Anger | “Are you serious? The stranger this gets the more it screams bioweapon. #COVID19 coronavirus male infertility” “The more vitamin D the less mortality from Coronavirus! The skin produces vitamin D with the sun. So why should we be locked up inside?” “I don't expect politicians to know understand the detail of science. But you can't insult science when you don't like it and then suddenly insist on something that science can't give on demand.” |
Fear | “Horrific read about allocation of scarce medical resources with #COVID19 by [AUTHORS] in @[JOURNAL] - This is very sad and distressing.” “Severe COVID-19 complications: [SYMPTOM] may be observed in the acute phase in severe cases. Long-term [SYMPTOM] has been observed.” “Horrifying. Social distancing in [LOCATION] is almost next to impossible.” |
Sadness | “Reading this here left me with depression without enough meme.” “Sadly, this new covid fact will be totally ignored and causing so many lives.” “First time I see a political editorial at the [JOURNAL]. And it is about the disaster that is happening in [COUNTRY]. So sad.” |
Neutral | “Clinical Characteristics and Results of [TEST & SUBJECT] With COVID19.” “The present study provides ten key recommendations for the management of COVID-19 infections in [DISEASE GROUP]: #COVID19” “Here is the link of the last study on [DRUG]!” |
aThe URL has been removed.
Descriptions of all variables.
Variable | Description |
RT7D | # of retweets in the first 168 hours |
preprint | =1 if the tweet source is a preprint article |
peer | =1 if the tweet source is a peer-reviewed article |
letter | =1 if the tweet source is a journal opinion/letter piece |
scientist | =1 if the user is classified as a doctor or researcher in the life science and biomedical fields |
liwc_positive | # of positive emotion dictionary words identified by LIWCa 2015 |
liwc_negative | # of negative emotion dictionary words identified by LIWC 2015 |
emotion: joy | =1 if the tweet text is predicted to have a salient emotion of joy |
emotion: anger | =1 if the tweet text is predicted to have a salient emotion of anger |
emotion: fear | =1 if the tweet text is predicted to have a salient emotion of fear |
emotion: sadness | =1 if the tweet text is predicted to have a salient emotion of sadness |
emotion: neutral | =1 if the tweet text is predicted to have no specific emotion |
log_follower | (log) number of followers the user had |
verified | =1 if the user is a verified user |
length | # of words in the tweet text |
hashtags | # of hashtags used in the tweet |
mention | =1 if the tweet contains any mention of other users |
title_length | # of words in the title of the referenced article (preprint or journal) |
title_liwc_pos | # of positive emotion words in the title identified by LIWC 2015 |
title_liwc_neg | # of negative emotion words in the title identified by LIWC 2015 |
log_cov_tweet | (log) rolling 7-day total number of global coronavirus tweets |
log_cov_case | (log) rolling 7-day total number of global new confirmed COVID cases |
log_cov_fatality | (log) rolling 7-day total number of global new confirmed COVID fatalities |
aLIWC: Linguistic Inquiry and Word Count.
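The dictionary word counts behind liwc_positive and liwc_negative can be illustrated with a small sketch. The LIWC 2015 dictionaries are proprietary, so the two word lists below are hypothetical stand-ins; the only convention borrowed from LIWC is that an entry ending in an asterisk matches any word beginning with that stem.

```python
import re

# Hypothetical mini-dictionaries (NOT the LIWC 2015 word lists).
# An entry ending in "*" matches any word that starts with the stem.
POSITIVE = ["good", "hope*", "promis*", "improv*"]
NEGATIVE = ["sad*", "horrif*", "distress*", "severe"]


def compile_entries(entries):
    """Turn dictionary entries into one regular expression."""
    parts = []
    for entry in entries:
        if entry.endswith("*"):
            parts.append(re.escape(entry[:-1]) + r"\w*")
        else:
            parts.append(re.escape(entry))
    return re.compile("|".join(parts))


def count_matches(text, pattern):
    """Count the words in `text` that fully match a dictionary entry."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(1 for w in words if pattern.fullmatch(w))


pos_re = compile_entries(POSITIVE)
neg_re = compile_entries(NEGATIVE)

tweet_a = "Good news: clinical improvement was observed, a promising and hopeful result."
tweet_b = "This is very sad and distressing."
pos_a, neg_a = count_matches(tweet_a, pos_re), count_matches(tweet_a, neg_re)
pos_b, neg_b = count_matches(tweet_b, pos_re), count_matches(tweet_b, neg_re)
```

Because the counts are raw word tallies rather than proportions, longer tweets can score higher simply by containing more words, which is one reason tweet length appears among the control variables.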
Summary statistics of all variables.
Variable | Combined sample (N=243,567), mean (SD) | Preprint (N=47,570), mean (SD) | Peer-reviewed article (N=97,769), mean (SD) | Journal letter (N=98,228), mean (SD) |
RT7Da | 4.928 (85.873) | 6.351 (75.654) | 5.022 (87.606) | 4.145 (88.729) |
preprint | 0.195 (0.396) | N/Ab | N/A | N/A |
peer | 0.401 (0.490) | N/A | N/A | N/A |
letter | 0.403 (0.491) | N/A | N/A | N/A |
scientist | 0.183 (0.387) | 0.156 (0.363) | 0.179 (0.383) | 0.201 (0.401) |
liwc_positive | 0.324 (0.634) | 0.316 (0.619) | 0.300 (0.614) | 0.352 (0.661) |
liwc_negative | 0.206 (0.506) | 0.208 (0.498) | 0.191 (0.480) | 0.221 (0.535) |
emotion: joy | 0.245 (0.430) | 0.280 (0.449) | 0.248 (0.432) | 0.225 (0.417) |
emotion: anger | 0.050 (0.219) | 0.045 (0.207) | 0.034 (0.181) | 0.070 (0.255) |
emotion: fear | 0.410 (0.492) | 0.400 (0.490) | 0.416 (0.493) | 0.409 (0.492) |
emotion: sadness | 0.026 (0.159) | 0.021 (0.143) | 0.021 (0.145) | 0.033 (0.179) |
emotion: neutral | 0.269 (0.443) | 0.254 (0.435) | 0.281 (0.449) | 0.264 (0.441) |
log_follower | 6.367 (2.174) | 6.345 (2.266) | 6.329 (2.205) | 6.415 (2.096) |
verified | 0.039 (0.194) | 0.038 (0.191) | 0.039 (0.194) | 0.039 (0.194) |
length | 19.661 (12.893) | 21.477 (13.021) | 19.969 (12.807) | 18.475 (12.796) |
hashtags | 0.648 (1.387) | 0.647 (1.378) | 0.667 (1.428) | 0.630 (1.350) |
mention | 0.200 (0.400) | 0.176 (0.381) | 0.201 (0.401) | 0.211 (0.408) |
title_length | 11.060 (4.733) | 13.051 (5.063) | 12.511 (4.303) | 8.652 (3.859) |
title_liwc_pos | 0.101 (0.322) | 0.074 (0.280) | 0.090 (0.301) | 0.125 (0.359) |
title_liwc_neg | 0.087 (0.289) | 0.087 (0.290) | 0.103 (0.311) | 0.070 (0.262) |
log_cov_tweet | 15.829 (0.166) | 15.817 (0.138) | 15.831 (0.176) | 15.834 (0.168) |
log_cov_case | 12.507 (1.283) | 12.530 (1.300) | 12.504 (1.346) | 12.498 (1.209) |
log_cov_fatality | 9.690 (1.530) | 9.740 (1.562) | 9.681 (1.611) | 9.674 (1.428) |
aRT7D: number of retweets in the first 168 hours.
bN/A: not applicable.
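A few of the summary statistics can be cross-checked with simple arithmetic. The sketch below uses only numbers from the table above; the variable names are illustrative. It confirms that the subgroup sizes add up to the combined sample, that the SD of a 0/1 indicator follows from its mean, and that the retweet count is heavily overdispersed (a Poisson-distributed count would have a variance equal to its mean).

```python
import math

# Subgroup sizes from the summary table.
n_preprint, n_peer, n_letter = 47_570, 97_769, 98_228
n_total = n_preprint + n_peer + n_letter

# Mean of the preprint indicator is the preprint share of all tweets.
share_preprint = n_preprint / n_total
# For a 0/1 indicator with mean p, the SD is sqrt(p * (1 - p)).
sd_preprint = math.sqrt(share_preprint * (1 - share_preprint))

# Retweet count in the combined sample: mean and SD from the table.
mean_rt, sd_rt = 4.928, 85.873
variance_to_mean = sd_rt ** 2 / mean_rt  # equals 1 under a Poisson model
```

A variance-to-mean ratio in the thousands is in line with the use of negative binomial rather than Poisson models for the retweet count.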
To answer each of our research questions, we examined (1) the impacts of positive versus negative emotional language; (2) the impacts of specific emotions, such as joy, anger, fear, and sadness; and (3) the role of scientists as social media message creators in sharing about COVID-19 medical scientific papers through statistical analysis. We referred to the collective findings from answering these questions as the emotional dynamics in sharing COVID-19 science on social media. Because the distribution of the retweet count was highly skewed (see
Consistent with prior studies [
Lastly, to corroborate the findings from the statistical analysis of the effect of positive and negative emotion words in tweet text, we conducted exploratory analyses using word cloud plots. We created 4 text corpora along the emotion dimension (ie, positive vs negative) and tweet source dimension (ie, preprint vs peer reviewed). For example, if a
This paper uses only secondary public data from an authorized Twitter commercial data vendor in compliance with Twitter privacy policy. Apart from the public Twitter handle, our data do not contain any individual identifier.
We started with positive and negative emotional language. In the combined sample of all original tweets, our regression analysis (see
The above results suggested that the “positivity bias” was only prevalent and visible in tweets that mentioned COVID-19 preprints. To further check the findings’ robustness, we also analyzed the within-article effects following a past study on the interpersonal sharing of science to the public [
Our results implied that there were divergent patterns among these 3 subgroups. More specifically, the “positivity bias” was only present in tweets mentioning preprints, which predicted that one additional positive emotional word was associated with a 17.7% increase in the retweet rate (IRR 1.177, 95% CI 1.089-1.272;
Although the results of the statistical analyses implied the existence of a “positivity bias,” they could not explain why it exists. Hence, we sought to provide some exploratory insights. Using word cloud plots (
Negative binomial estimation results using the Linguistic Inquiry and Word Count emotional dictionary word counts in subgroups.
Variable | Model 1 (preprint; N=47,570)a,b | Model 2 (peer reviewed; N=97,769)a,b | Model 3 (journal letter; N=98,228)a,b | Model 4 (preprint; N=47,570)a,c | Model 5 (peer reviewed; N=97,769)a,c | Model 6 (journal letter; N=98,228)a,c |
IRRd | 1.177e | 1.048 | 1.043 | 1.084f | 1.048g | 1.029 |
SEh | 0.047 | 0.030 | 0.049 | 0.036 | 0.027 | 0.025 |
IRR | 0.980 | 1.033 | 1.041 | 1.031 | 1.030 | 1.032 |
SE | 0.052 | 0.047 | 0.056 | 0.040 | 0.043 | 0.032 |
IRR | 1.785e | 1.879e | 1.891e | 1.930e | 1.915e | 1.933e |
SE | 0.059 | 0.027 | 0.026 | 0.025 | 0.022 | 0.024 |
IRR | 2.040e | 1.865e | 1.465e | 2.003e | 2.032e | 1.822e |
SE | 0.283 | 0.223 | 0.202 | 0.196 | 0.210 | 0.275 |
IRR | 1.049e | 1.050e | 1.051e | 1.055e | 1.053e | 1.049e |
SE | 0.004 | 0.002 | 0.004 | 0.002 | 0.002 | 0.002 |
IRR | 1.042f | 1.037e | 1.012 | 1.064e | 1.032e | 1.018 |
SE | 0.017 | 0.013 | 0.016 | 0.015 | 0.011 | 0.018 |
IRR | 1.944e | 1.604e | 1.703e | 1.601e | 1.469e | 1.632e |
SE | 0.149 | 0.083 | 0.090 | 0.137 | 0.067 | 0.083 |
IRR | 0.992 | 0.979e | 1.018f | N/Ai | N/A | N/A |
SE | 0.007 | 0.006 | 0.009 | N/A | N/A | N/A |
IRR | 1.051 | 1.056 | 1.001 | N/A | N/A | N/A |
SE | 0.124 | 0.112 | 0.077 | N/A | N/A | N/A |
IRR | 0.914 | 1.083 | 1.007 | N/A | N/A | N/A |
SE | 0.082 | 0.092 | 0.077 | N/A | N/A | N/A |
IRR | 0.861 | 0.904 | 1.201 | 0.636f | 0.842 | 0.874 |
SE | 0.143 | 0.218 | 0.299 | 0.142 | 0.178 | 0.213 |
IRR | 0.864 | 0.778g | 0.846 | 0.717g | 0.762f | 0.728f |
SE | 0.155 | 0.100 | 0.146 | 0.144 | 0.095 | 0.091 |
IRR | 1.144 | 1.254f | 1.149 | 0.977 | 1.127 | 1.075 |
SE | 0.178 | 0.136 | 0.175 | 0.168 | 0.119 | 0.131 |
IRR | 4.596e | 4.369e | 4.172e | 3.593e | 3.711e | 3.367e |
SE | 0.151 | 0.120 | 0.143 | 0.165 | 0.119 | 0.113 |
IRR | 0.188 | 0.089 | 0.000f | N/A | N/A | N/A |
SE | 0.503 | 0.337 | 0.002 | N/A | N/A | N/A |
aDependent variable: retweets in the first 168 hours.
bNo fixed effect.
cFixed effect.
dIRR: incidence rate ratio.
e
f
g
hRobust standard error clustered by article.
iN/A: not applicable.
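To make the reported quantities easier to interpret, the sketch below converts an incidence rate ratio (IRR) into a percentage change in the retweet rate and rebuilds an approximate 95% CI from the IRR and its SE. It assumes that the tabulated SE is on the IRR scale (a delta-method SE) and that the interval is formed on the log scale; neither assumption is stated in the text, and the function name is illustrative. The inputs are the Model 1 values for the first row (IRR 1.177, SE 0.047).

```python
import math


def irr_confidence_interval(irr, se_irr, z=1.96):
    """Approximate CI for an IRR, built on the log scale.

    Assumes `se_irr` is a delta-method SE, so that
    SE(IRR) is approximately IRR * SE(log IRR).
    """
    log_irr = math.log(irr)
    se_log = se_irr / irr
    return math.exp(log_irr - z * se_log), math.exp(log_irr + z * se_log)


irr, se = 1.177, 0.047
lower, upper = irr_confidence_interval(irr, se)

# An IRR of 1.177 corresponds to a 17.7% higher expected retweet rate
# per one-unit increase in the predictor, other variables held constant.
pct_change = (irr - 1) * 100
```

Under these assumptions, the interval agrees with the reported 95% CI of 1.089-1.272 to within rounding of the inputs, which suggests that the tabulated SEs can be read as SEs of the IRR itself rather than of the underlying coefficient.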
Prediction of the retweet count for (A) preprints, (B) peer-reviewed articles, and (C) journal letters. Positive emotion Linguistic Inquiry and Word Count dictionary words in tweets about preprints predict the highest retweet count. Bands indicate the 95% CIs.
Word cloud plot of all positive/negative emotional words in tweets about preprints and peer-reviewed articles (word size indicates the term frequency-inverse document frequency weight). (A) positive–peer-reviewed articles; (B) positive–preprints; (C) negative–peer-reviewed articles; (D) negative–preprints.
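The word sizes in such a plot depend on how the term frequency-inverse document frequency (TF-IDF) weight is defined. The sketch below uses the plain variant, in which each of the 4 corpora is treated as one document and the weight is the relative term frequency multiplied by log(N/df). The exact variant used for the figure is not specified, and the toy tokens are hypothetical.

```python
import math
from collections import Counter


def tfidf(corpora):
    """Compute TF-IDF weights, treating each named corpus as one document.

    corpora: dict mapping corpus name -> list of word tokens.
    Returns: dict mapping corpus name -> {word: weight}.
    """
    n_docs = len(corpora)
    doc_freq = Counter()
    for tokens in corpora.values():
        doc_freq.update(set(tokens))  # count each word once per corpus
    weights = {}
    for name, tokens in corpora.items():
        counts = Counter(tokens)
        total = len(tokens)
        weights[name] = {
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
    return weights


# Hypothetical toy corpora, one per emotion-by-source cell.
corpora = {
    "positive_peer": ["good", "important", "great"],
    "positive_preprint": ["good", "promising", "promising"],
    "negative_peer": ["severe", "risk"],
    "negative_preprint": ["severe", "critical"],
}
weights = tfidf(corpora)
```

With this weighting, a word shared across corpora (here "good") is downweighted relative to a word that is frequent in only one corpus (here "promising"), so the plot emphasizes the words that distinguish each corpus.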
Next, we examined the impact of a specific emotion on retweet count. In this analysis, we used a machine learning approach that was developed for tweet text analysis [
The regression analysis on the combined sample (see
With further analysis, we again observed that the “positivity bias” was most prevalent in tweets mentioning preprints. In the combined sample (see
Prediction of retweet count according to emotion. Joy in tweets about preprints predicts the highest retweet count. Error bars indicate 95% CIs.
Negative binomial estimation results using a specific emotion in subgroups.
Variable | Model 1 (preprint; N=47,570)a,b | Model 2 (peer reviewed; N=97,769)a,b | Model 3 (journal letter; N=98,228)a,b | Model 4 (preprint; N=47,570)a,c | Model 5 (peer reviewed; N=97,769)a,c | Model 6 (journal letter; N=98,228)a,c |
IRRd | 1.503e | 1.186e | 1.100 | 1.317e | 1.117f | 1.110g |
SEh | 0.098 | 0.060 | 0.086 | 0.094 | 0.056 | 0.061 |
IRR | 0.777 | 0.843f | 0.835 | 0.809f | 0.883 | 0.968 |
SE | 0.140 | 0.065 | 0.095 | 0.067 | 0.067 | 0.103 |
IRR | 0.998 | 0.985 | 1.023 | 1.152 | 0.971 | 0.992 |
SE | 0.079 | 0.056 | 0.106 | 0.109 | 0.055 | 0.064 |
IRR | 0.590e | 1.440 | 0.810f | 0.619e | 1.128 | 0.905 |
SE | 0.104 | 0.393 | 0.078 | 0.091 | 0.196 | 0.084 |
IRR | 1.786e | 1.874e | 1.893e | 1.929e | 1.914e | 1.935e |
SE | 0.059 | 0.024 | 0.026 | 0.025 | 0.022 | 0.024 |
IRR | 2.149e | 1.892e | 1.485e | 2.029e | 2.030e | 1.821e |
SE | 0.315 | 0.222 | 0.212 | 0.203 | 0.208 | 0.273 |
IRR | 1.053e | 1.052e | 1.052e | 1.056e | 1.055e | 1.050e |
SE | 0.003 | 0.002 | 0.004 | 0.003 | 0.002 | 0.002 |
IRR | 1.037f | 1.038e | 1.010 | 1.063e | 1.032e | 1.017 |
SE | 0.017 | 0.013 | 0.017 | 0.014 | 0.011 | 0.018 |
IRR | 1.849e | 1.593e | 1.689e | 1.596e | 1.466e | 1.618e |
SE | 0.136 | 0.084 | 0.086 | 0.121 | 0.067 | 0.082 |
IRR | 0.994 | 0.979e | 1.015g | N/Ai | N/A | N/A |
SE | 0.006 | 0.006 | 0.009 | N/A | N/A | N/A |
IRR | 1.090 | 1.064 | 1.029 | N/A | N/A | N/A |
SE | 0.141 | 0.111 | 0.075 | N/A | N/A | N/A |
IRR | 0.934 | 1.096 | 1.047 | N/A | N/A | N/A |
SE | 0.077 | 0.099 | 0.072 | N/A | N/A | N/A |
IRR | 0.867 | 0.864 | 1.216 | 0.639f | 0.845 | 0.879 |
SE | 0.142 | 0.176 | 0.309 | 0.143 | 0.177 | 0.215 |
IRR | 0.851 | 0.791g | 0.853 | 0.723 | 0.759f | 0.728f |
SE | 0.152 | 0.100 | 0.142 | 0.145 | 0.094 | 0.091 |
IRR | 1.155 | 1.238f | 1.144 | 0.970 | 1.132 | 1.075 |
SE | 0.179 | 0.133 | 0.169 | 0.166 | 0.119 | 0.131 |
IRR | 4.536e | 4.356e | 4.167e | 3.574e | 3.708e | 3.366e |
SE | 0.158 | 0.116 | 0.142 | 0.164 | 0.118 | 0.113 |
IRR | 0.157 | 0.160 | 0.000f | N/A | N/A | N/A |
SE | 0.417 | 0.519 | 0.001 | N/A | N/A | N/A |
aDependent variable: retweets in the first 168 hours.
bNo fixed effect.
cFixed effect.
dIRR: incidence rate ratio.
e
f
g
hRobust standard error clustered by article.
iN/A: not applicable.
We compared the difference in the retweet rate between tweets from scientists and nonscientists. The distributional differences of specific emotions between scientists and nonscientists in each subgroup are reported in
These results highlighted that scientists’ participation could alter the emotional dynamics in the social media sharing of messages about peer-reviewed publications, as their expressed positive emotions (ie, joy) and high-arousal negative emotions (ie, anger and fear) could enhance sharing. In comparison, the lack of difference in the emotional dynamics between scientists’ tweets and nonscientists’ tweets about preprints may suggest that it is the emotion elicited by the messages about preprints, rather than who expressed it, that influences content sharing.
Negative binomial estimation results using interactions between the scientist indicator and the specific emotion indicators in subgroups.
Variable | Model 1 (preprint; N=47,570)a,b | Model 2 (peer reviewed; N=97,769)a,b | Model 3 (journal letter; N=98,228)a,b | Model 4 (preprint; N=47,570)a,c | Model 5 (peer reviewed; N=97,769)a,c | Model 6 (journal letter; N=98,228)a,c |
IRRd | 1.618e | 1.434e | 1.513e | 1.667e | 1.393e | 1.601e |
SEf | 0.145 | 0.095 | 0.176 | 0.143 | 0.085 | 0.134 |
IRR | 1.540e | 1.110g | 1.060 | 1.344e | 1.044 | 1.102 |
SE | 0.123 | 0.068 | 0.112 | 0.113 | 0.060 | 0.074 |
IRR | 0.762 | 0.751e | 0.811 | 0.819h | 0.772e | 0.975 |
SE | 0.154 | 0.063 | 0.116 | 0.079 | 0.060 | 0.129 |
IRR | 1.020 | 0.893g | 1.008 | 1.194 | 0.890g | 0.984 |
SE | 0.097 | 0.059 | 0.140 | 0.130 | 0.057 | 0.080 |
IRR | 0.618h | 1.516 | 0.783h | 0.623e | 1.057 | 0.883 |
SE | 0.126 | 0.490 | 0.085 | 0.107 | 0.224 | 0.092 |
IRR | 0.887 | 1.235h | 1.048 | 0.914 | 1.236h | 0.971 |
SE | 0.119 | 0.114 | 0.139 | 0.116 | 0.106 | 0.099 |
IRR | 1.194 | 1.767e | 1.258 | 1.106 | 1.913e | 1.034 |
SE | 0.333 | 0.304 | 0.222 | 0.231 | 0.369 | 0.150 |
IRR | 0.916 | 1.339e | 1.019 | 0.875 | 1.330e | 1.000 |
SE | 0.105 | 0.119 | 0.151 | 0.103 | 0.101 | 0.096 |
IRR | 0.869 | 0.789 | 1.168 | 1.127 | 1.372 | 1.137 |
SE | 0.317 | 0.281 | 0.213 | 0.376 | 0.374 | 0.187 |
IRR | 1.772e | 1.858e | 1.878e | 1.914e | 1.895e | 1.922e |
SE | 0.058 | 0.024 | 0.026 | 0.026 | 0.022 | 0.024 |
IRR | 2.091e | 1.733e | 1.482h | 1.986e | 1.929e | 1.800e |
SE | 0.315 | 0.210 | 0.228 | 0.207 | 0.207 | 0.284 |
IRR | 1.052e | 1.052e | 1.053e | 1.055e | 1.055e | 1.050e |
SE | 0.004 | 0.002 | 0.004 | 0.003 | 0.002 | 0.002 |
IRR | 1.038h | 1.035e | 1.011 | 1.066e | 1.030e | 1.020 |
SE | 0.017 | 0.012 | 0.017 | 0.015 | 0.011 | 0.018 |
IRR | 1.847e | 1.560e | 1.643e | 1.602e | 1.447e | 1.599e |
SE | 0.140 | 0.082 | 0.086 | 0.125 | 0.066 | 0.081 |
IRR | 0.993 | 0.978e | 1.011 | N/Ai | N/A | N/A |
SE | 0.006 | 0.006 | 0.008 | N/A | N/A | N/A |
IRR | 1.073 | 1.026 | 1.026 | N/A | N/A | N/A |
SE | 0.142 | 0.097 | 0.074 | N/A | N/A | N/A |
IRR | 0.922 | 1.103 | 1.034 | N/A | N/A | N/A |
SE | 0.075 | 0.097 | 0.071 | N/A | N/A | N/A |
IRR | 0.863 | 0.878 | 1.207 | 0.636h | 0.895 | 0.898 |
SE | 0.140 | 0.176 | 0.309 | 0.145 | 0.191 | 0.225 |
IRR | 0.848 | 0.826 | 0.863 | 0.719g | 0.759h | 0.740h |
SE | 0.153 | 0.106 | 0.147 | 0.144 | 0.097 | 0.091 |
IRR | 1.151 | 1.187 | 1.131 | 0.977 | 1.140 | 1.074 |
SE | 0.180 | 0.127 | 0.171 | 0.167 | 0.123 | 0.129 |
IRR | 4.489e | 4.250e | 4.098e | 3.539e | 3.624e | 3.308e |
SE | 0.161 | 0.116 | 0.149 | 0.166 | 0.116 | 0.116 |
IRR | 0.178 | 0.110 | 0.000h | N/A | N/A | N/A |
SE | 0.466 | 0.353 | 0.002 | N/A | N/A | N/A |
aDependent variable: retweets in the first 168 hours.
bNo fixed effect.
cFixed effect.
dIRR: incidence rate ratio.
e
fRobust standard error clustered by article.
g
h
iN/A: not applicable.
The retweet rate difference between tweets from scientists and nonscientists for (A) preprints, (B) peer-reviewed articles, and (C) journal letters. Error bars indicate 95% CIs.
The COVID-19 crisis may have already created a lasting change to the scientific communication process [
First, we observed a positivity bias. A positive-valence emotion, rather than a negative-valence emotion or high-arousal emotion, contributed to the sharing. Even though the pandemic has given COVID-19 research a heightened “news-like” status, the dissemination of this research on social media did not exhibit a pattern mimicking social media news. Instead, it implied that social media users’ sharing of COVID-19 science may be motivated by altruistic reasons or self-enhancement, which was consistent with previous studies on the sharing of science to the public in interpersonal communication settings [
Second, the “positivity bias” was more salient in messages about preprints than in messages about articles in peer-reviewed journals. What drives this observed difference in emotional dynamics, especially between tweets about preprints and peer-reviewed research? One possibility could be the nature of preprints, as preprints involve nonvetted findings. The peer-review process helps scrutinize and mitigate the scientific uncertainty of a scientific manuscript, and the process often leads to toned-down findings and conclusions [
Given the self-enhancement explanation behind the “positivity bias,” it is also possible that tweets about preprints possess higher self-enhancement potential. Findings in preprints may be perceived by social media users to have higher self-enhancement value because they may be perceived as more novel and impactful [
Finally, we showed that scientists’ participation in the social media sharing of COVID-19 science exhibited differential emotional dynamics in tweets about different scientific sources. Specifically, scientists played a moderating role in the sharing of social media messages about peer-reviewed research, as their expressed positive emotions (ie, joy) and high-arousal negative emotions (ie, anger and fear) further enhanced sharing. However, the same pattern was not observed in messages about preprints. Given that peer-reviewed journal research contains arguably much more reliable findings than preprints, the presence of enhancing and neutral effects of scientists’ emotions in tweets about peer-reviewed research and preprints, respectively, could imply a moderated emotional communication process by scientists on social media, selectively promoting more reliable findings. Therefore, our study highlights the instrumental role of scientists in moderating science communication to the public on social media, echoing recent calls for promoting more effective science communication from both the scientific community [
Our focus on studying the messages that explicitly referenced COVID-19 research (ie, with a valid URL reference), however, limited us from examining other messages that may have contained scientific research information but did not provide a valid reference. Lack of a valid reference or source ambiguity is a key factor leading to rumor mongering [
Notwithstanding these limitations, our study provides useful implications that add to the ongoing debate regarding the virtue and danger in the use of preprints in science communication to the public [
Information on data processing and cleaning.
Information on the method for classifying scientist Twitter users.
Information on robustness tests.
Negative binomial estimation results using Linguistic Inquiry and Word Count emotional word counts in the combined sample.
Top 15 positive or negative emotion word stems (Linguistic Inquiry and Word Count 2015) in each text corpus.
Negative binomial estimation results using specific emotion indicators in the combined sample.
Kolmogorov-Smirnov test statistics on the distribution of specific emotions between tweets from scientists and tweets from nonscientists in each subgroup.
API: application programming interface
DOI: digital object identifier
IRR: incidence rate ratio
LIWC: Linguistic Inquiry and Word Count
We sincerely thank Altmetric for providing us academic access to their database for collecting the data.
None declared.