Since the COVID-19 outbreak began in the U.S., there has been an explosion of online symptom checkers and self-triage tools. With many commercial organizations, payers, and healthcare systems making such tools available, how does one filter the signal from the noise in terms of the evidence behind these symptom checkers? The intent of giving patients tools to check their own symptoms and to self-triage remotely is noble from a public health standpoint, offering one mechanism to flatten the curve and help keep the well at home while directing the potentially ill to appropriate healthcare settings. However, the evidence for their safety and accuracy is still largely lacking, which is equally concerning as a public health issue.
What does it mean for a COVID-19 symptom checker and self-triage tool to be evidence based? There are at least three criteria. The first is that the symptom checker uses the latest evidence (clinical guidelines, if you will) as to which symptoms have what degree of association with SARS-CoV-2 infection. For example, fever and respiratory symptoms have been recognized as hallmark symptoms of COVID-19, although they are certainly not exclusive to that illness. But until recently, gastrointestinal symptoms, for example, had not been widely considered part of the symptomatology.(1) A variety of other symptoms, such as alteration in the sense of taste or smell, have also recently become more widely accepted, and it has been postulated that relying on fever and respiratory symptoms alone likely leads to under-diagnosis of COVID-19.
Thus, evidence point #1 for symptom checkers and triage tools is asking those who build such tools how current and comprehensive the symptom assessment is, how often the model is updated according to new knowledge about symptomatology, and whose guidelines are used for symptom and risk factor inclusion. One COVID-19 symptom checker that uses the latest guidelines from the Centers for Disease Control and Prevention is the CDC’s own Coronavirus self-checker (2), which includes variables such as age, gender, geography, travel, contact history, and a host of typical and atypical symptoms.
Evidence point #2 is how such symptom checkers incorporate or “weight” the symptoms to arrive at triage decisions. To our knowledge, there remains insufficient evidence to guide those who build symptom checkers as to how much more or less gastrointestinal symptoms, for example, should be weighted in a triage decision regarding possible infection. Fortunately, for triage decisions (i.e., should a patient seek care?), certain symptoms such as shortness of breath may be more clearly associated with a need for care, while others, such as cough or alteration in taste or smell alone, may not.
Evidence point #3 is how accurate the triage output of symptom checkers is. How often does a symptom checker create false positives (i.e., send a patient to seek care or get testing when the care turns out to be unnecessary, or the test turns out to be negative), and false negatives (i.e., tell a patient that there is no need to present for care or be tested, when in fact the patient should have sought care, or did have infection)? For example, a 2015 study in the British Medical Journal (3), the first wide-scale study of accuracy for general purpose symptom checkers, found that symptom checkers had deficits in both diagnosis and triage. Measuring this accuracy, characterized by sensitivity, specificity, positive predictive value, and negative predictive value, is only possible when the triage tool result for a patient is coupled with a clinical system where care visits, clinical decisions, and/or diagnoses are documented and/or where testing is done, so that the symptom checker’s output as a screening tool can be compared to clinical outcomes. Unless a commercial symptom checker has a way to couple the output for a given patient with a source of truth such as a care encounter or a laboratory test, it is not possible for such tools to report accuracy with evidence. Furthermore, until or unless a perfect laboratory test is available (even lab tests for coronavirus have known false negative rates (4), particularly among patients with mild symptoms), we may not be able to know the true accuracy of symptom checkers.
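To make the four accuracy measures named above concrete, here is a minimal Python sketch that computes them from a two-by-two table of triage advice versus clinical outcome. The class name, function name, and the counts are hypothetical and chosen only for illustration; they do not come from any study cited here.

```python
from dataclasses import dataclass


@dataclass
class TriageCounts:
    """Two-by-two table of tool advice vs. clinical outcome (hypothetical names)."""
    true_pos: int   # tool advised care, and care was needed
    false_pos: int  # tool advised care, but care was not needed
    false_neg: int  # tool advised staying home, but care was needed
    true_neg: int   # tool advised staying home, and care was not needed


def screening_metrics(c: TriageCounts) -> dict:
    """Return the four standard screening-test measures as fractions."""
    return {
        "sensitivity": c.true_pos / (c.true_pos + c.false_neg),
        "specificity": c.true_neg / (c.true_neg + c.false_pos),
        "ppv": c.true_pos / (c.true_pos + c.false_pos),
        "npv": c.true_neg / (c.true_neg + c.false_neg),
    }


# Hypothetical counts, for illustration only.
counts = TriageCounts(true_pos=45, false_pos=30, false_neg=5, true_neg=120)
metrics = screening_metrics(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that every cell of the table requires a known clinical outcome, which is why a tool that is not linked to care encounters or test results cannot fill it in.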
Recently, however, a COVID-19 symptom checker and triage tool, the first known to be fully integrated with the patient’s electronic health record (EHR), was described in the Journal of the American Medical Informatics Association (5). The tool’s branching logic questions are driven from the EHR, and the answers are stored there as part of the medical record. Triage recommendations from the tool (e.g., recommendations for self-care at home, non-urgent care, urgent care, or emergent care) could be compared in the EHR, for patients who followed triage advice to seek care, to the clinical decisions made at the point of care. For example, of 16 patients who presented to the emergency department within 48 hours of using the tool, 14 had been correctly triaged to the emergent category by the tool (sensitivity of 87.5% [95% CI 61.7–98.5%] and specificity of 76.2% [95% CI 72.9–79.5%]). Evidence of the converse, however, is difficult to establish. Namely, if the tool advises self-care at home, the lack of a subsequent visit documented in the EHR does not necessarily mean that the triage advice was correct. Nevertheless, as perhaps one of the first such triage tools with this sort of evidence, it has since become part of the base content of the EHR vendor, available to other organizations also interested in using it.
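The width of the confidence interval around that sensitivity reflects how few emergency department cases there were. As a rough sketch, the code below recomputes the sensitivity from the 14 of 16 figure and derives an exact binomial (Clopper-Pearson) 95% interval using only the standard library. The choice of the Clopper-Pearson method is an assumption made here for illustration; the text above does not say which method the study used, and the function names are hypothetical.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple:
    """Exact two-sided binomial interval for k successes in n trials."""

    def solve(g) -> float:
        # g is increasing in p on (0, 1); find its root by bisection.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if g(mid) < 0:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else solve(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2
    )
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else solve(
        lambda p: alpha / 2 - binom_cdf(k, n, p)
    )
    return lower, upper


correct, total = 14, 16  # figures quoted in the text above
sensitivity = correct / total
ci_low, ci_high = clopper_pearson(correct, total)
print(f"sensitivity = {sensitivity:.1%}, 95% CI {ci_low:.1%} to {ci_high:.1%}")
```

Under this assumption the interval should land close to the reported 61.7% to 98.5%, which illustrates how much uncertainty remains when a sensitivity estimate rests on 16 patients.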
A final word of caution about symptom checkers and triage tools. These systems are, by definition, screening tools, with any potential diagnosis meant to be confirmed clinically or by a gold standard test. Like all screening tests, their charge is to have very high sensitivity, meaning that their false negative rate should be extremely low. So, if a patient is told that he or she does not need to seek care, there should be very few errors in that triage decision. Otherwise, such tools risk creating false reassurance, which could be dangerous from an individual health perspective as well as from a public health perspective. But there is usually a trade-off for having very low false negative rates. These tools often achieve low false negative rates at the expense of generating some false positives. If the net effect is that they help reduce surges of patients who don’t need care, then they may be accomplishing part of their mission. But if their rates of false positives are high enough that they direct patients to care who don’t need it, then they may actually contribute to the problem, stretching already burdened healthcare systems further, and adding more fuel to the healthcare crisis.
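The trade-off described above can be illustrated with a toy example. The sketch below assumes, purely for illustration, that a tool assigns each patient a numeric risk score and advises care when the score meets a cutoff; real symptom checkers may use branching logic instead. The scores, outcomes, and the function name are all hypothetical.

```python
# Hypothetical (risk_score, needed_care) pairs, for illustration only.
cases = [
    (0.95, True), (0.80, True), (0.65, True), (0.40, True),
    (0.70, False), (0.50, False), (0.30, False),
    (0.20, False), (0.10, False), (0.05, False),
]


def error_rates(threshold: float) -> tuple:
    """Return (false_negative_rate, false_positive_rate) at a given cutoff."""
    needed = [score for score, care in cases if care]
    not_needed = [score for score, care in cases if not care]
    # False negative: care was needed but the score fell below the cutoff.
    false_neg = sum(1 for score in needed if score < threshold)
    # False positive: care was not needed but the score met the cutoff.
    false_pos = sum(1 for score in not_needed if score >= threshold)
    return false_neg / len(needed), false_pos / len(not_needed)


for cutoff in (0.75, 0.60, 0.35):
    fnr, fpr = error_rates(cutoff)
    print(f"cutoff {cutoff:.2f}: false negative rate {fnr:.2f}, "
          f"false positive rate {fpr:.2f}")
```

In this toy data, lowering the cutoff removes the false negatives but sends more patients who did not need care to seek it, which is the balance a deployed tool has to strike.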
In summary, these tools have a fine line to walk, and with limited evidence behind many of them, it’s unclear whether they reduce or add to public health challenges. The two symptom checkers and triage tools highlighted here bring evidence to bear. There may be others. We encourage those who plan to deploy these tools to consider the three points of evidence we describe. Stay safe, and be well.
NODE.Health Foundation is a 501(c)(3) non-profit organization dedicated to education, validation and dissemination of evidence based digital medicine. As the largest professional association in digital medicine, NODE.Health empowers societies, executives and NODES from health systems, payers, life sciences, venture capital, startups and the public sector involved in healthcare digital transformation. NODE.Health does not endorse any specific products or services.
NODE.Health is pleased to cross post this article giving examples of evidence-based symptom checker and triage tools for public benefit. NODE.Health encourages its readers to be diligent with selecting such tools and understand the evidence. As more evidence comes out on the use of such tools for COVID-19, NODE.Health will keep its readers informed.