False positives, false negatives, sensitivity, specificity of COVID tests: what are we talking about?

Text updated on 2020-12-04

When testing for the coronavirus responsible for COVID-19 or another contagious disease, there are four types of people:

true positives (TP): infected individuals who test positive,
true negatives (TN): uninfected individuals who test negative,
false positives (FP): individuals who test positive although they are not infected,
false negatives (FN): individuals who test negative although they are infected.

For screening tests, the aim is to detect as many people as possible who could be infected with the SARS-CoV-2 coronavirus in order to isolate them and prevent them from infecting others. For example, one can imagine tests carried out at the entrance of a stadium for a football match. We want to ensure that infected people are identified. What we want to maximize, in this case, is the sensitivity, i.e., the proportion of true positives among all infected individuals (= TP/(TP+FN)). This is the probability that a test is positive for an infected person: the higher the probability, the more sensitive the test is. Sensitivity is not the only important parameter in managing the COVID-19 epidemic, however: the test must also be rapid and feasible on a large number of people, as explained below.

For diagnostic tests, the aim is to establish a diagnosis, i.e., to find out whether a person is infected with the SARS-CoV-2 coronavirus. It is then important that a positive result reliably indicates an infection, so we want to make sure that people who are not infected are correctly identified. What we want to maximize in this case is the specificity, i.e., the proportion of true negatives among all uninfected individuals (= TN/(TN+FP)). This is the probability that a test will be negative for a person who is not infected: the higher the probability, the more specific the test is.

To evaluate a test, both its specificity and its sensitivity must be considered. Taken separately, neither parameter is informative. For example, if the sensitivity is 100% and the specificity is 50%, all infected people will be detected as positive, but half of the people who are not infected will be mistakenly identified as positive (false positives). The ideal is to choose a test that optimizes both sensitivity and specificity.
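As an illustration, the two ratios defined above can be computed from the four counts. This is a minimal sketch: the function names and the counts are hypothetical and chosen only to reproduce the 100% / 50% example, not taken from any real test evaluation.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Proportion of infected individuals who test positive: TP / (TP + FN)."""
    if tp + fn == 0:
        raise ValueError("no infected individuals in the sample")
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Proportion of uninfected individuals who test negative: TN / (TN + FP)."""
    if tn + fp == 0:
        raise ValueError("no uninfected individuals in the sample")
    return tn / (tn + fp)


# Hypothetical counts (assumption, for illustration only):
# 100 infected people, all detected; 900 uninfected people, half flagged by mistake.
tp, fn = 100, 0
tn, fp = 450, 450

print(f"sensitivity = {sensitivity(tp, fn):.0%}")
print(f"specificity = {specificity(tn, fp):.0%}")
```

With these counts, no infected person is missed, yet 450 uninfected people receive a false-positive result, which is why neither figure can be read on its own.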

The concepts of sensitivity and specificity are used for dichotomous tests (yes/no, positive/negative), whereas many laboratory measurements give a continuous value. The threshold of a test (the value at which it is decided that a test becomes positive) influences its sensitivity and specificity. The value of this threshold depends strongly on the intended use of the test. The threshold can be changed to make a test more sensitive, but this is usually done at the expense of its specificity, and vice versa.
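The trade-off can be sketched on a toy data set. The measurements below are invented, and the example assumes that a higher value means a stronger signal and that a sample is called positive when its value is at or above the threshold; real assays may use the opposite convention.

```python
def sens_spec(values, infected, threshold):
    """Return (sensitivity, specificity) when a sample is called positive
    if its measured value is at or above the threshold."""
    calls = [v >= threshold for v in values]
    tp = sum(c and i for c, i in zip(calls, infected))
    fn = sum((not c) and i for c, i in zip(calls, infected))
    tn = sum((not c) and (not i) for c, i in zip(calls, infected))
    fp = sum(c and (not i) for c, i in zip(calls, infected))
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical continuous measurements (assumption, for illustration only).
values = [0.2, 0.4, 0.5, 0.7, 0.6, 0.8, 0.9, 1.1]
infected = [False, False, False, False, True, True, True, True]

# A low threshold catches every infected sample but flags uninfected ones;
# a high threshold does the reverse.
for threshold in (0.5, 0.65, 0.75):
    sens, spec = sens_spec(values, infected, threshold)
    print(f"threshold={threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Because the values of infected and uninfected samples overlap in this toy data set, no single threshold reaches 100% on both measures.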

In the case of COVID-19 tests, different reagents and protocols are used. There is no common standard across laboratories or countries for the threshold value at which a person is declared positive.

Depending on the intended use of the test, better sensitivity or better specificity will be sought. In France, the Haute Autorité de Santé (HAS) considers that diagnostic tests for COVID-19 must have a minimum specificity of 99% and a minimum sensitivity of 80%.

In the case of COVID-19, three types of screening/monitoring can be distinguished:

(1) use for epidemic surveillance and prevention. In this case, screening tests are carried out in order to trigger an alert procedure and reinforce safety measures when a certain threshold is exceeded. The aim is not necessarily to trace the infected individuals.
(2) mass screening. For example, regular testing in at-risk communities (typically nursing homes, university campuses, or food production centers).
(3) spot screening prior to a gathering. For example, before a family reunion or a cocktail party at the White House.

Cases (1) and (2) concern large-scale screening over a short period of time. In order to reach thousands or even millions of people, the best approach in view of current data seems to be test pooling and the use of saliva tests. See the questions "What approaches could accelerate large-scale screening?" and "Pooling tests ('pooling', 'pools'): why and for what purpose?"

In case (3), the main concern is to ensure that no infected person takes part in the gathering, even if this means excluding people with false-positive results. In this case, tests with high sensitivity will be preferred.

Sources

Article explaining the concepts of specificity and sensitivity with diagrams.

Loong, T. W. (2003). Understanding sensitivity and specificity with the right side of the brain. BMJ, 327(7417), 716-719.

Review article explaining screening tests and screening programs.

Guessous, I., Cornuz, J., Gaspoz, J. M., & Paccaud, F. (2010). Screening: principles and methods. Revue Médicale Suisse, 6(256), 1390-1394.

At the end of September 2020, the French High Authority for Health (Haute Autorité de Santé) issued an opinion in favour of the use of antigen tests on nasopharyngeal swabs, only in people who present symptoms of COVID-19 (fever, dry cough, loss of smell or taste, etc.) and provided the test meets the following performance criteria: sensitivity of at least 80% and specificity of at least 99%.

Opinion of the High Authority for Health of October 9, 2020.

New York Times article explaining that different threshold values are considered in different states in the United States and that this influences the estimation of the number of COVID-19 positive people.

Mandavilli, A. (2020). Your Coronavirus Test Is Positive. Maybe It Shouldn't Be. New York Times. Updated 17 Sept 2020.

Opinion of the French Society of Microbiology (SFM), dated late September 2020, proposing an algorithm and a cut-off value for COVID-19 tests in an attempt to harmonize test results.

https://www.sfm-microbiologie.org/wp-content/uploads/2020/10/Avis-SFM-valeur-Ct-excre%CC%81tion-virale-_-Version-Finale-07102020-V3.pdf

The Food and Drug Administration (FDA) in the United States distinguishes three types of tests for COVID-19: "surveillance", "screening", and "diagnostic testing".

FDA Website. COVID-19 Test Uses: FAQs on Testing for SARS-CoV-2. Last accessed Dec 1, 2020.

Modelling of different COVID-19 screening strategies depending on the frequency of testing (from daily to biweekly), the turnaround time for results (and, therefore, the delay between the test and the isolation of a positive person), and the sensitivity of the tests (minimum viral load for detection). The researchers conclude that regular virological testing and rapid results are more important for controlling the epidemic than highly sensitive tests. However, testing must be done very regularly (the model considers testing every day, every 3 days, or every week), which is costly and requires good logistics.

Larremore, D. B., Wilder, B., Lester, E., Shehata, S., Burke, J. M., Hay, J. A., ... & Parker, R. (2020). Test sensitivity is secondary to frequency and turnaround time for COVID-19 surveillance. medRxiv.

In China, in mid-October 2020, more than 10 million people were tested for COVID-19 in 5 days thanks to 4,090 test sites in Qingdao and surrounding suburbs. Each resident was contacted to perform the test. Registration information included ID card number, business or residential address, and telephone number. Nasopharyngeal swabs were obtained. In order to reduce processing time and save resources, a pooling approach was used, with each pool containing samples from 3 to 10 people (3 for contact cases of infected persons, 5 for inpatients or caregivers, and 10 for community members). If a pooled sample was positive, then an individual test was performed on each person in the pool.

Xing, Y., Wong, G. W., Ni, W., Hu, X., & Xing, Q. (2020). Rapid Response to an Outbreak in Qingdao, China. New England Journal of Medicine, e129.

Regular mass screening of students at Duke University in the United States: 10,265 students were tested a total of 68,913 times, and 84 tested positive, half of whom were asymptomatic. The tests were performed on pooled nasopharyngeal swabs.

Denny, T. N., Andrews, L., Bonsignori, M., Cavanaugh, K., Datto, M. B., Deckard, A., ... & Haase, S. B. (2020). Implementation of a pooled surveillance testing program for asymptomatic SARS-CoV-2 infections on a college campus-Duke University, Durham, North Carolina, August 2-October 11, 2020.