This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Infodemiology, is properly cited. The complete bibliographic information, a link to the original publication on https://infodemiology.jmir.org/, as well as this copyright and license information must be included.
The COVID-19 pandemic has been accompanied by an
In this work, we aimed to develop a practical, structured approach to identify narratives in public online conversations on social media platforms where concerns or confusion exist or where narratives are gaining traction, thus providing actionable data to help the WHO prioritize its response efforts to address the COVID-19 infodemic.
We developed a taxonomy to filter global public conversations in English and French related to COVID-19 on social media into 5 categories with 35 subcategories. The taxonomy and its implementation were validated for retrieval precision and recall, and they were reviewed and adapted as language about the pandemic in online conversations changed over time. The aggregated data for each subcategory were analyzed on a weekly basis by volume, velocity, and presence of questions to detect signals of information voids with potential for confusion or where mis- or disinformation may thrive. A human analyst reviewed and identified potential information voids and sources of confusion, and quantitative data were used to provide insights on emerging narratives, influencers, and public reactions to COVID-19–related topics.
A COVID-19 public health social listening taxonomy was developed, validated, and applied to filter relevant content for more focused analysis. A weekly analysis of public online conversations since March 23, 2020, enabled quantification of shifting interests in public health–related topics concerning the pandemic, and the analysis demonstrated recurring voids of verified health information. This approach therefore focuses on the detection of infodemic signals to generate actionable insights to rapidly inform decision-making for a more targeted and adaptive response, including risk communication.
This approach has been successfully applied to identify and analyze infodemic signals, particularly information voids, to inform the COVID-19 pandemic response. More broadly, the results have demonstrated the importance of ongoing monitoring and analysis of public online conversations, as information voids frequently recur and narratives shift over time. The approach is being piloted in individual countries and WHO regions to generate localized insights and actions; meanwhile, a pilot of an artificial intelligence–based social listening platform is using this taxonomy to aggregate and compare online conversations across 20 countries. Beyond the COVID-19 pandemic, the taxonomy and methodology may be adapted for fast deployment in future public health events, and they could form the basis of a routine social listening program for health preparedness and response planning.
Since the beginning of the COVID-19 pandemic, digital communication and social networking have supported the rapid growth of real-time information sharing about the virus that causes COVID-19 (SARS-CoV-2) and the disease in the public domain and across borders. The breadth of conversation, diversity of sources, and polarity of opinions have sometimes resulted in excessive information, including false or misleading information, in digital and physical environments during an acute public health event; this can lead to confusion and risk-taking behaviors that can harm health, trust in health authorities, and the public health response [
To address this challenge, the World Health Organization (WHO) Information Network for Epidemics (EPI-WIN), in collaboration with digital research partners, developed a methodology for weekly analysis of digital social media data to identify, categorize, and understand the key concerns expressed in online conversations [
Previous research has explored the use of data produced and consumed on the web to inform public health officials, agencies, and policy—a concept known as
The design of interventions for infodemic response must account for an ecosystem where information flow online can cause public health harm offline. Metrics and frameworks related to digital information flows and online behavior are most useful to practitioners when they can be coupled with other online and offline sources of public health data that inform public health decision-making. The WHO has therefore expanded the concept of infodemiology into a multidisciplinary scientific field that amalgamates cross-disciplinary and mixed methods approaches designed to inform the health emergency response [
Health emergencies give rise to information overload, which has been shown to influence people’s risk perceptions and protective actions during health emergencies [
Health authorities therefore not only face the challenge of providing relevant, high-quality health information but also must provide it at the right time, in the right format, and with collaborative engagement of communities [
The rise of social media platforms has generated a readily available source of real-time data related to what people express and share in online communities. The 2009 H1N1 influenza pandemic was the first pandemic to occur in the era of social media and was one of the earliest outbreaks informed by analysis of online conversations and information-seeking behaviors. The previous pandemic offers a case study that evidences how online social listening has been used to follow rapidly evolving public sentiment, track actual disease activity, and monitor the emergence of misinformation [
The onset of the COVID-19 pandemic has exacerbated concerns about misinformation. Throughout the pandemic, there has been a demand for information; at first, this demand was for information about the origin of the virus, and now it is focused mainly on the response to the virus, particularly vaccination and wider public health and social measures. Similar to information voids [
Despite the influx of studies as to how information is being spread and shared in the era of COVID-19 and how information is influencing people’s health practices, major gaps in knowledge remain as to how best to monitor, understand, and respond to it [
Detecting viral misinformation narratives and information voids in real-time data is crucial to a rapid, comprehensive response by authorities for effective delivery of health information to populations during a health emergency, although this does not ensure that people will necessarily act in accordance with that health information. Previous research has evaluated the correction of misinformation and the role of individuals versus organizations in using real-time data [
Interventions need to address the different aspects of the information ecosystem that influence the spread and health impact of an infodemic. For platforms, content moderation policies, modification of content promotion algorithms, and designing for friction can discourage sharing of misinformation and unverified information [
Research is ongoing to assist policy makers in understanding public concerns and sentiment around the pandemic as well as in tracking information outbreaks and the emergence of misinformation. However, there is little to no empirical evidence on how this research can be used to develop practical tools for an outbreak response by public health authorities. More collaborations between researchers and public health practitioners are needed to fill this gap. As a contribution to the infodemic response toolbox, the taxonomy and methodology in this study offer a practical, structured approach for identifying information voids and narratives of concern that warrant attention and action. This approach has already provided actionable data to help the WHO focus its efforts for the COVID-19 pandemic response.
A social listening taxonomy for COVID-19 conversations was developed specifically for this analysis. It was designed to filter digital content referring to COVID-19 (and synonyms) for items of relevance in a public health context and to classify that content into categories. The taxonomy consisted of 35 keyword-based searches (one set of searches for each of two languages, namely English and French) which were grouped into 5 overarching topic categories representing thematic areas in which people were engaging, writing, or searching for information.
The 5 top-level categories and corresponding 35 subcategories of this social listening taxonomy for COVID-19 conversations were defined based on established epidemic management and public health practices during an outbreak of infectious disease [
Structure of the social listening taxonomy for COVID-19 conversations.
Each of these 5 categories were segmented into subcategory levels that are familiar to the epidemiologist’s investigation and management of the outbreak, resulting in a total of 35 taxonomy subcategory levels (
Each of the 35 taxonomy subcategories encompassed a list of topics that captured different aspects of that segment of the online conversation on COVID-19. Keywords for the 35 subcategory searches were generated based on expert knowledge from the WHO EPI-WIN team and translated into Boolean search strings to identify topic-related language for review of relevant social media posts and news content. The keywords of the taxonomy are available on request at the contact address listed in the
In addition to the taxonomy subcategory levels, the keyword-based Boolean search string was created to also identify posts containing a question; this enabled analysis of categories for which people were seeking information and, therefore, potential information voids. The question search string was designed to be paired with each of the 35 taxonomy Boolean search strings to identify posts referring to the topic and containing question words, verb-subject inversions, and auxiliaries.
Finally, the sum of the total volume of the social media conversation (on all topics) was estimated by monitoring the number of posts mentioning at least one of the most commonly used words in English (eg,
The analysis was based on the weekly aggregation of publicly available social media data in English and French using Meltwater Explore. Institutional Review Board review was not sought, as the analysis used large bodies of text written by humans on the internet and on some social media platforms. The analysis and resulting reports focused on the identification of conversation narratives and thematic questions instead of on individual statements and users.
The Meltwater social listening platform was configured to collect verbatim mentions of keywords associated with the 35 predefined taxonomy category Boolean searches from 9 open data sources and fora (Twitter, blog entries, Facebook, Reddit posts and comments, other unspecified message boards or fora, comments under news articles and blog entries, Instagram posts, product reviews, and YouTube video titles and comments). A total of 87.02% of the resulting analysis data set was sourced from Twitter. Blogs (5.34%) and, specifically, the Reddit platform (4.34%) were the next most prominent sources in the data set. These were followed by message boards (2.14%), comments under news articles (0.89%), online review websites (0.13%), Instagram (0.12%), and Facebook (0.03%).
For each of the 35 taxonomy subcategories, the global daily total volume of posts, and the volume of posts posing a question, were recorded on a weekly basis. Tracking changes in volume from week to week also enabled determination of the velocity for a given subcategory.
The methodology used to test and validate the retrieval and classification in this study used both retrieval precision and retrieval recall, which are related to how much retrieved data is relevant and how much relevant data is retrieved, respectively [
To test whether the taxonomy categories captured the intended information (retrieval precision), a random sample of content captured by each of the 35 Boolean searches was human-coded for relevance (10,500 posts in total) by a single reviewer, with a second reviewer validating the coding. The post was coded as either relevant to the search subcategory (1) or not relevant (0).
The aim of the coding was to determine the proportion of relevant (R; also, “true positive” [TP]) results as a percentage of the retrieved sample. The coders judged whether a post was relevant according to the intended definition of the specific subcategory search for which the post was returned. For example, if a post had been returned for the “The Illness – Confirmed Symptoms” search, the coder would check if the post referred to a confirmed symptom of COVID-19 (TP) or whether the matched keywords were mentioned in a different context (irrelevant [I]; also, false positive [FP]). For instance, if a post had been returned by the Boolean search for COVID-19 vaccines, did the post refer to COVID-19 vaccinations? If yes, the post was coded as a TP. If the post in question mentioned COVID-19, but the part of the post mentioning vaccines was about the influenza vaccine, the post would be coded as an FP.
The initial retrieval precision testing showed an average result of 82% for the 35 taxonomy subcategory searches. The retrieval precision rate was calculated as precision = [TP ÷ (TP + FP)] × 100%.
A total of 7 searches returned content below the target minimum retrieval precision rate of 70%, with a range of 42% to 100% (
To spot-check the coding for reliability, we deployed a second reviewer to analyze 10% of the posts (30 per taxonomy category search, 1500 in total). We calculated the Cohen kappa to determine intercoder reliability, which was found to be high (κ=0.81, observed agreement [po]=0.95, expected agreement [pe]=0.76).
A further test was performed to assess retrieval recall: whether content of relevance to the research aims failed to be retrieved by the taxonomy searches. To test this, a random sample of 1000 items of content, mentioning COVID-19 (and synonyms) but excluding the taxonomy category keywords (the “not retrieved” sample in
Results of retrieval precision testing and retesting with a sample size of 300 posts analyzed per subcategory.
Subcategory | Posts retrieved by the taxonomy category search human-coded as true positives and retrieval precision rate, n (%) |
The Cause – The Cause | 217 (72.3) |
The Cause – Further Spread – Stigma | 260 (86.7) |
The Cause – Further Spread – Immunity | 245 (81.7) |
The Illness – Confirmed Symptoms | 189a (63)/239b (79.7) |
The Illness – Other Discussed Symptoms | 141a (47)/218b (72.7) |
The Illness – Asymptomatic | 300 (100) |
The Illness – Presymptomatic | 300 (100) |
The Illness – Means of Transmission | 295 (98.3) |
The Illness – Protection From Transmission | 299 (99.7) |
The Illness – Underlying Conditions | 238 (79.3) |
The Illness – Demographics – Sex | 215 (71.7) |
The Illness – Demographics – Age | 215 (71.7) |
The Illness – Vulnerable People | 287 (95.7) |
The Illness – Vulnerable Communities | 269 (89.7) |
Treatment – Vaccines | 300 (100) |
Treatment – Current Treatment | 144a (48)/224b (74.7%) |
Treatment – Research & Development | 290 (96.7) |
Treatment – Nonproven Treatment (Nutrition) | 245 (81.7) |
Treatment – Myths | 126a (42)/221b (73.7) |
Interventions – Measures in Public Settings | 243 (81) |
Interventions – Testing | 280 (93.3) |
Interventions – Supportive Care – Equipment | 204a (68)/257b (85.7%) |
Interventions – Supportive Care – Health Care | 289 (96.3) |
Interventions – Personal Measures | 298 (99.3) |
Interventions – Reduction of Movement | 256 (85.3) |
Interventions – Protection | 276 (92) |
Interventions – Technology | 278 (92.7) |
Interventions – Travel | 250 (83.3) |
Interventions – Faith | 201a (67)/269b (89.7) |
Interventions – Unions and Industry | 223 (74.3) |
Interventions – The Environment | 183a (61)/290b (96.7) |
Interventions – Inequalities | 280 (93.3) |
Interventions – Civil Unrest | 280 (93.3) |
Information – Misinformation | 273 (91) |
Information – Statistics | 244 (81.3) |
aIndicates a taxonomy subcategory search that performed below minimum requirements and was subsequently updated and retested to yield better performance.
bNumber and percentage of posts in the sample coded as true positives in the retesting of the taxonomy subcategory search following the update.
Results of human coding of retrieved and unretrieved samples for calculation of retrieval recall and F-scores.
Sample | Coded relevant | Coded irrelevant | Total coded sample, n | ||
|
Samples, n | Description | Samples, n | Description |
|
Retrieveda | 875 | True positive | 125 | False positive | 1000a |
Not retrieved | 304 | False negative | 696 | True negative | 1000 |
Total | 1179 | N/Ab | 821 | N/A | 2000 |
aThe “retrieved” sample size was downweighted to equal the “not retrieved” sample size.
bN/A: not applicable.
The results of the coding of the “not retrieved” sample indicated the proportion of TN results as a proportion of the sample; 70% of content was judged not to be relevant to the research aims, and therefore it was deemed correct that this content was not retrieved by our taxonomy. From the data, we also calculated the retrieval recall rate as recall = [TP ÷ (TP + FN)] × 100%.
The overall retrieval recall rate was 74%. This coding process enabled identification of areas where the existing Boolean string could be expanded to include more relevant keywords to retrieve more relevant content, or where the taxonomy could be expanded to include new and emerging issues. From the content that was not retrieved but was judged to be of potential relevance to the research aims (false negatives, FNs), 3 topics were identified that will be added to the taxonomy in a pending update: mutations/variants of the COVID-19 virus; “long covid” (long-term symptoms of COVID-19); and the impact of the pandemic on mental health and well-being.
To validate the coding of the sample of “not retrieved” content for reliability, we deployed a second reviewer to analyze 10% of the posts (100 posts). We calculated the Cohen kappa to determine intercoder reliability, which was found to be high (κ=0.86, po=0.93, pe=0.50).
From the results of the coding of the retrieved and unretrieved samples, we calculated an F1 score and an F0.5 score with the following formulas:
The F1 score (harmonic mean of precision and recall) for the searches was 0.80, and the F0.5 score was 0.84. F1 and F0.5 scores range from 0 to 1, with 1 representing perfect performance. A higher F1 or F0.5 score is considered reasonable, with a score closer to 1 indicating stronger performance of a retrieval and classification approach. The inclusion of the F0.5 measure reflects the greater importance of retrieval precision in this study: given the vast number of potentially relevant pieces of content, it is more important to the aims of this project to correctly classify the retrieved posts than to collect every possibly relevant post. Therefore, we consider it a positive result to achieve a higher F0.5 score than F1 score. This is because in this study, it is more important that the results are not impacted by a high number of false positives and that the true positives are classified into the correct subcategory. The retrieval recall testing is also helpful because it enables identification of new or changing pandemic issues, such as new terminology being used that can be added to the taxonomy category search language over time.
Potential information voids were identified based on 3 parameters within the weekly data set: the volume (ie, how many social media items referred to topic X?), the velocity (ie, the rate of increase of the number of social media items that have engaged with topic X over the course of the past week), and the presence of questions about the topic. The volume was the sum of the online items that mentioned COVID-19 together with a keyword related to each tracked topic. Velocity was determined as the percentage increase of the volume of content items aggregated under each topic from week to week, where velocity = [(current week’s total number of mentions – previous week’s total number of mentions) ÷ (previous week’s total number of mentions)] × 100%.
Starting in late March 2020, weekly global analysis reports were produced that supplied the EPI-WIN team with early warnings of points of concern expressed in public comments by online users [
Each week, social media conversations were segmented based on levels of velocity and quantitatively examined for public engagement (eg, likes, shares, poll votes, reactions), hashtags, and most-used keywords and phrases. From this weekly quantitative data, up to 10 topics with high velocity and/or a large proportion of social media posts expressing a question, and/or with high levels of engagement, were identified as potential priority information voids or sources of confusion or concern.
The identified issues on social media were then further evaluated using engagement data and Google search trends to determine whether a significant number of online users had also been looking for information on these topics to help determine whether the information void was more widespread.
Each week, we used the quantitative analysis to identify up to 10 topics reflecting potential information voids and areas of concern. These topics were then examined in more detail via qualitative analysis to understand the context and identify where action may need to be taken in line with a sequential explanatory design approach [
This analysis prioritized the flagging of widespread confusion or frequently asked questions, the rapid amplification of misinformation, or ad hoc aspects of the conversation that were particularly relevant to public health, such as vaccine questioning ahead of and during a vaccination campaign.
The quantitative data were compiled in a web-based dashboard accessible to the emergency responders in the EPI-WIN team, and insights were discussed with EPI-WIN emergency responders on a weekly basis. The dashboard was updated weekly to allow investigation of short- and long-term trends in volumes, changes in velocity, and the volume of questions for each topic.
Weekly written reports outlined quantitative and qualitative findings about the 5 to 10 topics of concern, included visualizations from the dashboard, and summarized recommendations for action when needed [
Quantitative analysis of the volume changes indicated that the narratives and questions in the online conversations shifted as the pandemic evolved over the course of 2020 and into 2021 (
Most discussed topics by month and results of the pivoted data set by month, sorted by volume of social media mentions.
Year and month | Most discussed topic | Volume (millions of social media mentions) | |
|
|||
|
March | Interventions – Testing | 37 |
|
April | Interventions – Testing | 18 |
|
May | Interventions – Measures in Public Settings | 12 |
|
June | Interventions – Testing | 11 |
|
July | Interventions – Testing | 14 |
|
August | Interventions – Testing | 9 |
|
September | Interventions – Testing | 8 |
|
October | Interventions – Testing | 17 |
|
November | Interventions – Testing | 8 |
|
December | Treatment – Vaccines | 15 |
|
|||
|
January | Treatment – Vaccines | 15 |
|
February | Treatment – Vaccines | 12 |
|
March | Treatment – Vaccines | 15 |
|
April | Treatment – Vaccines | 15 |
At the same time, topics re-emerged periodically in terms of popularity. All 35 categories of topics that were tracked resumed a higher velocity throughout the reporting period for an average of 18 weeks combined (
Frequency of weekly velocity growth (number of weeks in which a topic experienced positive velocity) and average weekly increase rate (or decrease, when a negative value is returned) by topic.
Topic | Number of weeks in which a topic experienced positive velocity (increase of social media mentions since previous week) | Average weekly increase in number of social media mentions (%) |
Treatment – Myths | 26 | 31 |
The Illness – Demographics – Age | 24 | 8 |
Interventions – Reduction of Movement | 23 | 14 |
The Cause – The Cause | 23 | 7 |
Vaccines | 22 | 7 |
The Cause – Further Spread: Stigma | 22 | 10 |
Interventions – Faith | 21 | 3 |
The Illness – Other Discussed Symptoms | 21 | 52 |
Treatment – Current Treatment | 20 | 3 |
Interventions – Travel | 20 | 3 |
Interventions – The Environment | 20 | 7 |
The Illness – Confirmed Symptoms | 20 | 1 |
The Illness – Asymptomatic transmission | 19 | 6 |
The Illness – Means of Transmission | 19 | 10 |
Interventions – Measures in Public Settings | 18 | 4 |
The Cause – Further Spread: Immunity | 18 | 9 |
Treatment – Nonproven Treatment (Nutrition) | 18 | 4 |
Information – Statistics and Data | 18 | 4 |
Interventions – Technology | 18 | 5 |
Information – Misinformation | 17 | 5 |
The Illness – Vulnerable Communities | 17 | 4 |
Interventions – Testing | 17 | -1 |
Interventions – Supportive Care – Health Caren | 17 | 3 |
Interventions – Protection | 17 | -3 |
The Illness – Presymptomaticn | 16 | 40 |
The Illness – Underlying Conditionsn | 16 | 10 |
Interventions – Supportive Care – Equipment | 16 | 0 |
The Illness – Protection From Transmission | 16 | 0 |
Interventions – Personal Measures | 15 | -3 |
The Illness – Demographics – Sex | 15 | 8 |
Treatment – Research & Development | 15 | 8 |
Interventions – Unions and Industry | 15 | -1 |
Interventions – Inequalitiesn | 14 | -1 |
The Illness – Vulnerable Peoplen | 14 | 1 |
Interventions – Civil Unrest | 14 | 32 |
Analysis of the peaks in discussion of 2 of the leading recurring topics, “risk related to age demographics” and “the cause,” provided insight into how narratives around these topics were fueled by real-life events. The conversation on “risk related to age demographics” increased in velocity 24 times throughout the period studied. A total of 3 million public social media posts engaged with the topic: 64% of these posts were focused on children, whereas 30% focused on older people. The volume of conversation on children and COVID-19 risk increased above the yearly average for 133 days. Speculation about the severity of COVID-19 infection in children was raised consistently throughout the evaluation period, and it represented fertile ground for confusion and potential misinformation. Major triggers included news reports of child deaths (560,000 public posts discussed children and mortality), reports of symptoms observed in children in particular (300,000 public posts discussed children and symptoms), the debate over school reopenings, particularly with regard to transmission (818,000 public posts) and, most recently, COVID-19 immunization (656,000 public posts). In relation to this topic, doubts resurfaced repeatedly about the threat of COVID-19 to children; however, there was a diversity of narrative foci for these doubts, linked to changing events during the pandemic.
By contrast, public discussion on the possible origins of the pandemic (“the cause”) had a persistent narrative throughout the evaluation period. “The cause” of the epidemic was a focus of 3.26 million public social media posts throughout the period monitored. The size of the conversation was most prominent at the beginning of the pandemic and diminished as of June 2020, but with periodically recurring peaks in the number of posts. Conspiracy theories suggesting the artificial origin of the virus as a bioweapon were persistent in online discussion, and prominent influencers operating in the conspiracy theory space were often linked to resurgent peaks in public online discussion. The phrase “biological weapon” was mentioned 326,000 times in the public social media space (cf. 141 million mentions of COVID-19 vaccines in the same period). The rate of mentions decreased by 65% from Q2 to Q3 2020 (as it decreased to 34,000 mentions globally), but it surged to 110,000 in Q4 as the theory regained prominence in the public discourse, in part driven by the release of a preprint paper claiming that the virus was an “unrestricted bioweapon” [
The insights obtained in this study have afforded public health experts the opportunity for a more rapid and targeted assessment of a subsample of narratives across the English and French languages using public digital sources. These insights can be combined with others to better understand whether and how people are understanding public health and social measures and putting them into practice to protect themselves and their communities.
The application of this taxonomy to successive weekly online social listening analysis has resulted in a better understanding of the evolution and dynamics of high-velocity conversations about COVID-19 worldwide in English and French during the pandemic. The taxonomy also provides a quantifiable approach to support more adaptive and targeted planning and prioritization of health response activities. For example, monitoring and characterizing re-emerging topics can guide re-evaluation and updating of risk communication and community engagement initiatives to improve understandability and resonance, or highlight where adjustments in technical guidance, public health policy, and social measures may be needed. In addition, the fact that narratives discussed online often overlap across different categories reveals the breadth of this taxonomy, and this overlap enables emerging narratives and potential information voids to be picked up through velocity alerts raised in different elements of the taxonomy.
The testing process described in this article forms the basis of the taxonomy review and maintenance process. Updates to the taxonomy are also informed by observations from the weekly analysis and reporting of the data, and public health expert knowledge via WHO, the wider news agenda, and epidemic management context of the pandemic. The taxonomy has been updated twice since its creation in March 2020, with a third update forthcoming in 2021. The aim of the taxonomy updates is to ensure that important new and emerging topics are captured as the pandemic evolves (as in the examples of variants/mutations and “long covid” above) and that the taxonomy includes the latest language and terms being used by the public [
There is added value in using a common social listening taxonomy for integration of insights from a variety of data sources and research methods in online and offline communities. This can provide a more systematic way to integrate analysis of different data sources and facilitate complementarity of digital social listening data with other data such as knowledge, attitudes, and practices research to help uncover drivers of online discussion, and to support social listening in vulnerable or more marginalized communities, including those with limited access to online platforms.
A challenge of this analysis approach is the need for human analysts to continuously monitor and evolve the taxonomy in line with the developing narratives and emerging topics as well as the changing language used in discussions of the COVID-19 pandemic. Ideally, taxonomies would be tested, reviewed, and updated frequently, particularly when a new stage of the pandemic begins (eg, when the vaccine rollout started), as such events in the pandemic timeline can generate new topics of discussion and new terminology (eg, “Covid passports”). However, the benefit of more frequent updates is balanced by the need for comparability of data across time as well as by the fact that this analytical method needs to be rapidly reproducible, including in more resource-constrained environments, to have real, practical use week-on-week during the pandemic to inform the immediate needs of the health authority response, including risk communication and community engagement in any country context.
To help identify actionable insights, the weekly analysis was focused less on exact counts of mentions and more on relative changes, narratives, and topic signals to evaluate and contextualize infodemic signals. When rapidly identifying up to 10 information voids in large weekly data sets, absolute precision was less important than the early detection of an actionable signal to help trigger a timely response. For example, if there was a sudden rise in online narratives expressing concern over a treatment, coupled with other information available from the emergency response, the exact number of mentions was less important than signal detection, analysis, and recommendations for possible action. Despite this, more research is needed to refine and streamline the process for rapidly updating and publishing such taxonomies, especially in protracted epidemics, where shifts in concerns and conversations are bound to occur.
A key takeaway from the analysis that can be applied during the current pandemic is the frequent recurrence of topics of concern and its implications for communication. Public health authorities, governments, and nongovernmental organizations must be prepared to communicate repeatedly on the same issue, adapting frames, approaches, and content as public perceptions of issues and topics shift. Our analysis shows that areas of concern wax and wane, with confusion disappearing and re-emerging as new information comes to light or new events occur. Monitoring the changing narratives on a weekly basis and over time using a taxonomy, such as the one used in this study, can enable health authorities to assess longer-term trends and to be more nimble in adapting approaches to respond effectively to topics of concern and to counter misinformation. Further research can help to adapt these digital social listening approaches to provide metrics for evaluation of infodemic management interventions.
The taxonomy has been adapted, translated, and applied in a number of country-level studies in Mali, the Philippines, and Malaysia [
A pilot project by WHO EPI-WIN and research partners, Early Artificial intelligence–supported Response with Social Listening (EARS) [
There is an opportunity to apply the taxonomy and methodology described in this paper to detect information voids during future, as yet unknown, pandemics and other public health crises. The 5 top-level categories and some of the 35 sub-categories are relevant to social listening in any outbreak but would need to be adapted to the type of pathogen. If, for example, the HIV/AIDS epidemic had started in the digital, connected world of 2020 rather than in the 1980s, the online social listening taxonomy structure would have needed some adjustment to filter and segment public discourse related to the epidemic and identify information voids. For example, a “Demographics – Men Who Have Sex With Men” topic could be added under the category “the illness” to better hear questions and concerns from this particular demographic group. This approach could also include adjustments to subcategories under “the intervention” to remove irrelevant subcategories of “Reduction of Movement” and “Unions and Industry.” After such a taxonomy review and adjustment, the keywords used to capture content related to each category and subcategory would also need to be systematically reviewed to ensure they were appropriate to the narratives in relation to specific illness in question. For example, terms relating to injected drug use, sex between men, sex between a man and woman, and mother-to-child transmission could be added under “The Illness – Modes of Transmission”. Having a taxonomy structure and methodology already in place as a starting point would enable faster deployment of digital social listening activities in a future outbreak.
Interpretation of the analysis must account for the limitations of the data sources included in the content aggregator. During health emergencies, health authorities require surge support in social listening, response, and evaluation functions. Analysis services from a central analytics unit or from commercial or academic institutions need to be set up quickly to use a systematic approach to detect and understand people’s changing concerns, questions, and possible areas of confusion shared publicly online. The overhead in management of data from open sources can be high, and in settings where the social listening analytics capacity is not yet in place for routine analysis, content aggregators can be used to rapidly set up an analysis workflow. The media content aggregation platform used for this study offers firehose access to Twitter, ensuring a complete set of data for analysis, subject to privacy limitations. Other sources in the platform are either sampled from or limited to public posts only [
This research is global and is limited to two major languages (English and French). As a result, only major online narrative themes and information voids were identified, and the resulting interpretations may not be representative of trends and patterns that could be observed in digital communities for other languages. Moreover, in a global weekly analysis, smaller or more localized conversations may go undetected. One of the aims of this work is to apply and advance the methods to develop taxonomies that can be rapidly applied to any linguistic context for different geographies and public health events.
It has also been observed that the global English-language data set is prone to overrepresent the voice of social media in geographic regions or communities that are more digitally active than others. A key challenge in this study was the digital amplification of discourse pertaining to US politics, the elections, and the digital prominence of US civil society thereof [
Another limitation of this research is the start date of the project, March 23, 2020, which is several weeks after COVID-19 was declared a public health emergency of international concern; however, data prior to this date (back to January 2020) have been retrieved and stored for future analysis, ensuring that it is possible to analyze a longer timeline.
Adaptation and application of the taxonomy structure in future outbreaks must also take into account validation of information retrieval and recall. The test scores referenced in the taxonomy testing and validation section should be taken as estimates of the accuracy of the retrieval process by the taxonomy category searches, and function most effectively as a tool for identifying areas for improvement. A key limitation of the test results is that human coders can make errors [
This research focuses on the identification of potential information voids and sources of confusion in online social conversations to provide actionable insights for risk communication and community engagement and other health response activities. While it can provide insight into the opinions expressed online, integration with other analyses, including from listening to offline communities is needed. Applying this methodology globally has provided the added and needed insight, inspiring new ways of thinking and use of information in support of risk communication during health emergencies. Much of the value of the taxonomy we developed is in the capacity to rapidly deploy and provide ongoing insights about information voids during an outbreak, which then allows a health authority to take evidence-informed action and course-correct risk communication during an epidemic. The application of the taxonomy and methodology for social listening at regional, country, and subnational levels in the COVID-19 pandemic—which is already being tested—offers possibilities for more actionable insights that must increasingly support a localized response. Moreover, this method offers an approach for monitoring of concerns, questions, and information voids in future outbreaks, enabling a faster response by the health authorities in affected countries during the next acute health event.
The listing of keywords and search terms per taxonomy subcategory is available upon request by contacting enquiry@mediameasurement.com.
Early Artificial intelligence–supported Response with Social Listening
World Health Organization Information Network for Epidemics
false negative
false positive
expected agreement
observed agreement
first quarter
second quarter
third quarter
fourth quarter
true negative
true positive
World Health Organization
The authors would like to thank the EPI-WIN team members for their contributions to the ideation on this work, and Bernardo Mariano, WHO Director of Digital Health and Innovation, for supporting and advocating for this digital analytics innovation work. S Bezbaruah, S Briand, CC, JL, AM, TN, TDP, and FT are staff of the WHO. These authors are alone responsible for the views expressed in this paper, and these views do not represent those of their organization.
S Ball, PV, AW, and TZ are employed by a media monitoring company that provides a wide range of services to clients in media monitoring and listening, including the WHO. The work described in this paper was part of the contractual service to the WHO. The other authors have no conflicts to declare.