Thursday, February 4, 2021

Testing for Covid-19 and Bayes' Theorem: An Entry-Level Primer for Those Interested in Medical Testing Accuracy

I put together a little spreadsheet when Covid was first entering our awareness, in early 2020. Link to spreadsheet model (it may open in a Google app; if so, there will be an "open" button that allows it to be opened in Excel).

There was an oft-repeated claim that the tests, though slow to come out, were highly accurate--I think they were often said to be 99% accurate, which is very high. However, having been exposed to Bayesian revisions earlier in my career, I wanted to see what that claim really means. Thus the spreadsheet. I pulled it up again recently, and though the early Covid testing issues are behind us, I thought it might still be interesting to some people, so I updated it as a learning tool.

There are three inputs: the sensitivity of the test (its probability of detecting the disease among those known to have it, the value often quoted as "accuracy"); the specificity of the test (its probability of correctly reporting the absence of disease among those known not to have it); and the prevalence of the disease in the population.

What we're interested in is the likelihood of true and false positive test results, and of true and false negative results.
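To make the arithmetic concrete, here is a minimal Python sketch of the same calculation the spreadsheet performs (the function names are my own, purely for illustration). It fills in the four cells of the two-by-two outcome table as shares of the tested population, and then applies Bayes' theorem to get the chance that a positive result is a false positive:

```python
def result_rates(sensitivity: float, specificity: float, prevalence: float):
    """Shares of the whole tested population in each outcome cell,
    assuming test subjects are drawn at random from that population."""
    true_pos = prevalence * sensitivity               # sick, correctly flagged
    false_neg = prevalence * (1 - sensitivity)        # sick, missed
    false_pos = (1 - prevalence) * (1 - specificity)  # healthy, flagged anyway
    true_neg = (1 - prevalence) * specificity         # healthy, correctly cleared
    return true_pos, false_pos, true_neg, false_neg

def false_positive_share(sensitivity, specificity, prevalence):
    """Probability that a positive result is false, i.e. one minus the
    positive predictive value (Bayes' theorem)."""
    true_pos, false_pos, _, _ = result_rates(sensitivity, specificity, prevalence)
    return false_pos / (true_pos + false_pos)
```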

Now it won't surprise anyone that the sensitivity and specificity of the test matter greatly to the probability of false signals. It is very intuitive that a lower "accuracy" will give less robust results! (For simplicity here, we'll treat sensitivity and specificity as having the same value, though that often is not the case in the real world of medical testing; the attached model allows the values to differ.)

But the prevalence of a disease in the population presents an interesting dynamic. Let's say sensitivity and specificity are both a highly accurate 99%. If we're testing randomly selected persons drawn from the general population, then the population prevalence becomes very important. For Covid it might be only 1% or 2% at any given time, maybe less; that was perhaps the situation in the early days of testing. But if the prevalence is only 1%, then 50% of the positive tests will be false positives, and if it is 2%, then one-third will be false. To see why, imagine 10,000 people tested at 1% prevalence: the 100 who are infected produce 99 true positives, while the 9,900 who are healthy produce 99 false positives--equal numbers, so half of all positives are false. This means a positive result carries very little certainty; it isn't truly actionable.
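Running the sketch above reproduces these figures:

```python
for prevalence in (0.01, 0.02):
    share = false_positive_share(0.99, 0.99, prevalence)
    print(f"prevalence {prevalence:.0%}: {share:.1%} of positives are false")
# prevalence 1%: 50.0% of positives are false
# prevalence 2%: 33.1% of positives are false
```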

Let me mention that if sensitivity and specificity are lower, but still high at 95%, and the prevalence is 2%, then nearly three-quarters of the positive test results will be false (72.06%). When prevalence is low, one must be truly dubious of the information value of testing unless accuracy is very high indeed.
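The same sketch confirms that figure:

```python
print(f"{false_positive_share(0.95, 0.95, 0.02):.2%}")  # 72.06%
```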

It gets better if prevalence is higher. Many people test today in order to travel or for other general reasons, and can be considered random draws from the larger population with a low prevalence, just as above. But a large portion are testing because they have symptoms that concern them. Perhaps the prevalence in this portion of the testing population is higher--let's just be brave and baldly assert that it is 25%. That reduces the false-positive share of positive results to 13.64% for a 95% accurate test, and to 2.94% if the accuracy values are 99%. Certainly a test with less than 3% false positives is very informative.
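And with the asserted 25% prevalence, the sketch gives:

```python
for acc in (0.95, 0.99):
    share = false_positive_share(acc, acc, 0.25)
    print(f"accuracy {acc:.0%}: {share:.2%} of positives are false")
# accuracy 95%: 13.64% of positives are false
# accuracy 99%: 2.94% of positives are false
```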

Anyway, you can use the spreadsheet to try out different input assumptions and, if desired, to build intuition for why false positives are so much more common than we might naively expect.