DISCUSSION
-------------------------------------------------------------------------
December 30, 2009
A professor of medicine wrote:
"Well-known, well taught, still neglected. In many applications the true positive rate is only 1 %, meaning that usually false positives will far outnumber true positives. (...) The info content of criteria is best when the pre-criteria positivity estimate is close to 50 %. If we survey a population which has only 1 % RA in it, like the full population, then we shall have a lot of false positives and only a few true positives. Google 'false positives' and look for Wikipedia refs!"
Comment by B. Soric:
"False positive paradox" (from Wikipedia):
http://en.wikipedia.org/wiki/False_positive_paradox
"The false positive paradox is a situation where the incidence of a condition is lower than the false positive rate of a test, and therefore, when the test indicates that a condition exists, it is probable that the result is a false positive."
For example, if a subject has the disease, a medical test may be, say, 99% likely to correctly indicate that she does, and if a subject does not have the disease, it may be 99% likely to correctly indicate that she doesn't. Suppose that a disease occurs in 1 out of 10,000 people. The numbers from the Wikipedia example are here shown in a contingency table:
p = 0.99 ; p' = 0.99 ; 1-p' = 0.01

                 True positive    True negative          Sum
Test positive:   a =        99    b =     9,999       10,098 = S
Test negative:   c =         1    d =   989,901      989,902
Sum:             R =       100    N =   999,900    1,000,000 = M
1. If we suppose that all the values are known, we calculate:
Positive predictive value = a/(a+b) = 1-P = 99/10,098 = 0.0098
Proportion of false positives = b/(a+b) = P = 0.9902
2. But even if we know only p, p', M, and S, we can still calculate:
R = [S-(1-p')M] / (p+p'-1) = 100
N = M-R = 999,900
P = (1-p')N/S = 0.9902
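The formula for R under 2. follows by simple algebra from counting the test positives: the S test positives are the correctly detected fraction p of the R truly positive subjects plus the wrongly flagged fraction (1-p') of the N truly negative ones. A sketch of the derivation, using the symbols defined above:

```latex
\begin{align}
S &= pR + (1-p')N          && \text{(true positives detected + false positives)} \\
  &= pR + (1-p')(M-R)      && \text{(since } N = M-R \text{)} \\
S - (1-p')M &= R\,(p+p'-1) \\
R &= \frac{S-(1-p')M}{p+p'-1}
\end{align}
```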
So, the "false positive paradox" may be well known, but so far I have not found that the latter possibility of calculation (under 2.) has been published anywhere by anybody. It is very simple, but it seems to be very important for researchers!
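Both calculations above can be checked with a short script (a sketch; the function and variable names are my own, not part of the original text):

```python
def false_positive_share(p, p_prime, M, S):
    """Estimate R (truly positive), N (truly negative) and P (the
    proportion of false positives among the S test positives) from
    sensitivity p, specificity p', sample size M and number of test
    positives S, using the formulas under 2. above."""
    R = (S - (1 - p_prime) * M) / (p + p_prime - 1)
    N = M - R
    P = (1 - p_prime) * N / S
    return R, N, P

# Wikipedia example: p = p' = 0.99, M = 1,000,000, S = 10,098
R, N, P = false_positive_share(0.99, 0.99, 1_000_000, 10_098)
print(round(R), round(N), round(P, 4))   # 100 999900 0.9902
```

Note that P here agrees with the directly computed proportion b/(a+b) = 9,999/10,098 from the contingency table, even though the function never sees a, b, c, or d.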
-------------------------------------------------------------------------
December 31, 2009
Another professor wrote this:
The proportion (P) of false positives cannot be exactly calculated, even for very large samples, because sensitivity and specificity are known only approximately.
Comment by B. Soric:
That may be true, but unless we consider S/M or calculate P, we don't know at all whether P is too large or not.
Suppose, for example, that we apply the ACR criteria to 800 patients, whose diagnoses may even be totally unknown to us, and 417 of them are classified as RA. So we know: M = 800 ; S = 417 ; S/M = 0.521 ; p = 0.935 ; p' = 0.893, while R, N and P are still unknown. Without calculating P, or taking S/M into consideration, we have no idea of the percentage of wrong RA diagnoses (false positives); but we can calculate as follows:
R = [S-(1-p')M] / (p+p'-1) = 400 ; N = M-R = 400 ; and:
P = (1-p')N/S = 0.103
though this value of P is just an approximate or expected value. (The same value of P is found in Table 2 for S/M = 0.521; see the "Home" page of this web-site!) So we may expect, or hope, that a large part of those 417 patients indeed have rheumatoid arthritis.
On the contrary, if, say, only 120 of the 800 patients are classified as RA, we find R = 42 ; N = 758 ; P = 0.676. In the latter case we certainly cannot assume that most of those 120 patients indeed have RA.
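The two cases above can be reproduced with the same arithmetic (a sketch; the function name is my own):

```python
def false_positive_share(p, p_prime, M, S):
    # R = truly positive, N = truly negative, P = proportion of
    # false positives among the S test positives (formulas above)
    R = (S - (1 - p_prime) * M) / (p + p_prime - 1)
    N = M - R
    return R, N, (1 - p_prime) * N / S

# 417 of 800 classified as RA (p = 0.935, p' = 0.893):
R, N, P = false_positive_share(0.935, 0.893, 800, 417)
print(round(R), round(N), round(P, 3))   # 400 400 0.103

# only 120 of 800 classified as RA:
R, N, P = false_positive_share(0.935, 0.893, 800, 120)
print(round(R), round(N), round(P, 3))   # 42 758 0.676
```

The jump of P from about 10 % to about 68 % is driven entirely by S/M falling from 0.521 to 0.15, with p and p' unchanged, which is the point of the comment above.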
-------------------------------------------------------------------------