DISCUSSION
-------------------------------------------------------------------------
December 30, 2009
A professor of medicine wrote:
"Well-known, well taught, still neglected. In many applications the true positive rate is only 1 %, meaning that usually false positives will far outnumber true positives. (...) The info content of criteria is best when the pre-criteria positivity estimate is close to 50 %. If we survey a population which has only 1 % RA in it, like the full population, then we shall have a lot of false positives and only a few true positives. Google 'false positives' and look for Wikipedia refs!"
Comment by B. Soric:
"False positive paradox" (from Wikipedia):
http://en.wikipedia.org/wiki/False_positive_paradox
"The false positive paradox is a situation where the incidence of a condition is lower than the false positive rate of a test, and therefore, when the test indicates that a condition exists, it is probable that the result is a false positive."
For example, if a subject has the disease, a medical test may be, say, 99% likely to correctly indicate that she does, and if a subject does not have the disease, it may be 99% likely to correctly indicate that she doesn't. Suppose that a disease occurs in 1 out of 10,000 people. The numbers from the Wikipedia example are here shown in a contingency table:
p = 0.99 ; p' = 0.99 ; 1-p' = 0.01

                 True positive    True negative          Sum
Test positive:   a =        99    b =     9,999       10,098 = S
Test negative:   c =         1    d =   989,901      989,902
Sum:             R =       100    N =   999,900    1,000,000 = M
1. If we suppose that all the values are known, we calculate:
Positive predictive value = a/(a+b) = 1-P = 99/10,098 = 0.0098
Proportion of false positives = b/(a+b) = P = 0.9902
2. But even if we know only p, p', M, and S, we can still calculate:
R = [S-(1-p')M] / (p+p'-1) = 100
N = M-R = 999,900
P = (1-p')N/S = 0.9902
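The formula for R under 2. follows by simple algebra from counting the test positives: the S test positives are the correctly detected fraction p of the R truly positive subjects plus the wrongly flagged fraction (1-p') of the N truly negative ones. A sketch of the derivation, using the symbols defined above:

```latex
\begin{align}
S &= pR + (1-p')N          && \text{(true positives detected + false positives)} \\
  &= pR + (1-p')(M-R)      && \text{(since } N = M-R \text{)} \\
S - (1-p')M &= R\,(p+p'-1) \\
R &= \frac{S-(1-p')M}{p+p'-1}
\end{align}
```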
So, the "false positive paradox" may be well known, but so far I have not found that the latter possibility of calculation (under 2.) has been published anywhere by anybody. It is very simple, but it seems to be very important for researchers!
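Both calculations above can be checked with a short script (a sketch; the function and variable names are my own, not part of the original text):

```python
def false_positive_share(p, p_prime, M, S):
    """Estimate R (truly positive), N (truly negative) and P (the
    proportion of false positives among the S test positives) from
    sensitivity p, specificity p', sample size M and number of test
    positives S, using the formulas under 2. above."""
    R = (S - (1 - p_prime) * M) / (p + p_prime - 1)
    N = M - R
    P = (1 - p_prime) * N / S
    return R, N, P

# Wikipedia example: p = p' = 0.99, M = 1,000,000, S = 10,098
R, N, P = false_positive_share(0.99, 0.99, 1_000_000, 10_098)
print(round(R), round(N), round(P, 4))   # 100 999900 0.9902
```

Note that P here agrees with the directly computed proportion b/(a+b) = 9,999/10,098 from the contingency table, even though the function never sees a, b, c, or d.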
-------------------------------------------------------------------------
December 31, 2009
Another professor wrote this:
The proportion (P) of false positives cannot be exactly calculated, even for very large samples, because sensitivity and specificity are known only approximately.
Comment by B. Soric:
That may be true, but unless we consider S/M or calculate P, we don't know at all whether P is too large or not.
Suppose, for example, that we apply the ACR criteria to 800 patients, whose diagnoses may even be totally unknown to us, and 417 of them are classified as RA. So we know: M = 800 ; S = 417 ; S/M = 0.521 ; p = 0.935 ; p' = 0.893, while R, N and P are still unknown. Without calculating P, or taking S/M into consideration, we have no idea of the percentage of wrong RA diagnoses (false positives); but we can calculate as follows:
R = [S-(1-p')M] / (p+p'-1) = 400 ; N = M-R = 400 ; and:
P = (1-p')N/S = 0.103
though this value of P is just an approximate or expected value. (The same value of P is found in Table 2 for S/M = 0.521; see the "Home" page of this web-site!) So we may expect, or hope, that a large part of those 417 patients indeed have rheumatoid arthritis.
On the contrary, if, say, only 120 of the 800 patients are classified as RA, we find R = 42 ; N = 758 ; P = 0.676. In the latter case we certainly cannot assume that most of those 120 patients indeed have RA.
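The two cases above can be reproduced with the same arithmetic (a sketch; the function name is my own):

```python
def false_positive_share(p, p_prime, M, S):
    # R = truly positive, N = truly negative, P = proportion of
    # false positives among the S test positives (formulas above)
    R = (S - (1 - p_prime) * M) / (p + p_prime - 1)
    N = M - R
    return R, N, (1 - p_prime) * N / S

# 417 of 800 classified as RA (p = 0.935, p' = 0.893):
R, N, P = false_positive_share(0.935, 0.893, 800, 417)
print(round(R), round(N), round(P, 3))   # 400 400 0.103

# only 120 of 800 classified as RA:
R, N, P = false_positive_share(0.935, 0.893, 800, 120)
print(round(R), round(N), round(P, 3))   # 42 758 0.676
```

The jump of P from about 10 % to about 68 % is driven entirely by S/M falling from 0.521 to 0.15, with p and p' unchanged, which is the point of the comment above.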
-------------------------------------------------------------------------