How to Make Good Use of Diagnostic or Classification Criteria for Rheumatoid Arthritis or Other Diseases
(By Branko Soric)
December, 2009
The gist of the problem
It is usual to use a contingency table in order to show the
numbers of patients who have or do not have a certain disease (for example rheumatoid arthritis - briefly: RA) which is diagnosed
either truly or falsely:
                          True RA       True non-RA    Sum
Diagnosed as "RA":        a             b              a+b (=S)
Diagnosed as "non-RA":    c             d              c+d
Sum:                      a+c (=R)      b+d (=N)       a+b+c+d (=M)
S and M are known, while R and N are unknown. If the sensitivity and specificity of the criteria are known, we can find the values
of R and N, as well as the proportions of true and wrong "RA"-diagnoses (a/S and
b/S), by simply solving a system of two equations with two unknowns (see the appendix!).
Theoretically, in very large sets, and if sensitivity and specificity of a test are known exactly,
the value b/S = P (as well as a/S = 1-P) is completely determined by S/M.
Bayes' postulate
It seems that the (mis)understanding of how to use diagnostic or classification criteria
for rheumatoid arthritis (RA) was for some time based, as in the case of statistical inference,
on the erroneous Bayes' postulate (not to be confused
with the correct Bayes' theorem!)(1), which says that unknown probabilities (or proportions) may be supposed to
be equal to each other.
For example, if we suppose that in a large set of experiments there are as many true null hypotheses as true alternatives, we
may easily obtain a small proportion of false discoveries (among statistically significant results). Likewise, if the sensitivity
and specificity of the classification criteria for RA are near 0.9, and if we suppose that there are as many RA patients
as non-RA patients in a large group of rheumatic patients, then we may expect about ten percent of misclassifications.
(It should be kept in mind, though, that percentages obtained by applying the classification criteria for RA can only be approximate or expected
values. Much larger groups of patients would have to be studied in order to obtain more exact results.)
However, there are 1.3 million cases of RA among U.S. adults, and 46.4 million cases of all forms of arthritis.(2)
If the classification criteria (with sensitivity and specificity near 0.9) were applied to all these patients, we might easily
detect about one million true RA cases (from the RA group) and some 4.6 million false-RA cases (from the other group). As
a result, some 82 percent of those classified as RA would be misclassified! (Namely: 4.6/5.6 = 0.82.) Only the remaining 18
percent would indeed have RA.
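The arithmetic behind this estimate can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, the value 0.9 for both sensitivity and specificity is the rounded figure used in the text, and treating all 46.4 million arthritis cases as the comparison group is an assumption suggested by the rounded figure of 4.6 million.

```python
def misclassified_share(n_ra, n_non_ra, sensitivity, specificity):
    """Expected share of false positives among all cases classified as RA."""
    true_pos = sensitivity * n_ra                 # RA cases correctly classified as RA
    false_pos = (1.0 - specificity) * n_non_ra    # non-RA cases wrongly classified as RA
    return false_pos / (true_pos + false_pos)

# Counts in millions; 0.9 / 0.9 are rounded, assumed values.
share = misclassified_share(n_ra=1.3, n_non_ra=46.4,
                            sensitivity=0.9, specificity=0.9)
print(f"Share of misclassified 'RA' cases: {share:.2f}")
```

Without rounding the intermediate counts, the share comes out at roughly 0.8; the text's 0.82 follows from rounding the detected true RA cases down to one million. Either way, about four out of five "RA" classifications would be wrong.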
I have asked some of the leading
experts whether the classification criteria should be applied to the general population, or to
a population of patients coming to general practitioners' offices, or only to rheumatic patients, or only
to those who come to rheumatologists' offices, etc. I have received
this answer: "The criteria were designed for patients who were candidates for the
diagnosis of RA, and not for the general population". But who are the candidates? Could not almost
any patient with arthritis or arthralgia (or some other symptom(s)) be considered
a candidate?
Diagnosing or classifying?
The ARA diagnostic criteria for rheumatoid arthritis were proposed in 1956 and revised in 1958.
In 1987 the American College of Rheumatology (ACR) published a new revised set of classification criteria, where the
term “diagnostic” criteria was replaced by “classification”.(3)
According to some distinguished rheumatologists, "the criteria are used for classification in studies, not for individual diagnosis, which is an individual
clinical decision for each patient". -
"The ACR criteria were put together so that investigators doing clinical trials would agree as to whether or not a
patient had 'rheumatoid arthritis,' as defined by the criteria". - "The sole purpose of the ACR criteria was to standardize
on a patient set that could be entered into clinical trials". - Etc.
The 1987 criteria for the classification of rheumatoid arthritis (RA) were formulated
from a computerized analysis of 262 patients with RA and 262 control
subjects with rheumatic diseases other than RA (i.e. non-RA). The diagnoses had not originally been based on classification criteria; the criteria were
applied to these two groups afterwards. Let M denote
the known total number of patients (M = 524). Let
R be the (in general unknown) number of true RA cases, and let N be the (in general unknown) number of true non-RA cases, so that M = R+N = 524. Table 1 shows a part of the results, i.e. the numbers
of correctly and incorrectly classified cases, taken from the paper by Arnett et al.(3):
Table 1. (Classification tree method)

                               Overall      Classified correctly    Classified incorrectly
RA patients:                   262 = R      245 = pR                17 = (1-p)R
Non-RA (control subjects):     262 = N      234 = p'N               28 = (1-p')N

Sensitivity = 245 / 262 = 0.935 = p          [ M = R+N = 524 ]
Specificity = 234 / 262 = 0.893 = p'         [ S = pR+(1-p')N = 273 ]

Classified as RA:        S = 273 = 245 (true RA) + 28 (true non-RA)
Classified as non-RA:    251 = 17 (true RA) + 234 (true non-RA)
Total:                   M = 524 = 262 (true RA) + 262 (true non-RA)

(See also Table 4 in the appendix!)
So, there were as many true RA cases as true non-RA cases (262+262=524). Of
all the 524 patients, 273 were classified as RA, while 251 were classified as non-RA. The proportions of misclassifications
were not large: 28/273 = 0.103 among those classified as RA, and 17/251 = 0.068
in the other group (classified as non-RA).
However, if the numbers of true
RA and true non-RA cases are very different from each other, there can be many misclassifications. The unknown proportions of false classifications can be calculated from the observed proportion
of cases classified as RA (or non-RA) among
all the patients. (If we know S/M, we can calculate the unknown values and proportions).
Finding the proportion of wrongly classified cases
If p = 0.935 = sensitivity, and if p' = 0.893 = specificity, then the known number (S) of cases classified as RA is
S = pR+(1-p')N = 273, where N and R are not known, but we can calculate
them as follows:
A simple derivation (in the appendix!) yields: R = [S - 0.107×M] / 0.828 = 262,
and hence we can also calculate: N = M-R = 524-262 = 262.
The proportion (P) of misclassified cases among the S
cases that are classified as RA is: P = (1-p')N/[pR+(1-p')N] = (1-p')N/S ; the proportion (P*) of misclassified cases among the M-S cases that are classified as non-RA
is: P* = (1-p)R/(M-S).
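The calculation just described can be sketched in Python as follows. The function and variable names are illustrative only, and the sketch assumes that S and M-S are both nonzero and that p + p' is not equal to 1; exact fractions from Table 1 are used for p and p' instead of the rounded values 0.935 and 0.893.

```python
def solve_counts(S, M, p, p_prime):
    """Recover R, N, P and P* from the known S and M.

    p is the sensitivity, p_prime the specificity.
    Solves S = p*R + (1 - p_prime)*N together with M = R + N.
    """
    denom = p + p_prime - 1.0
    if denom == 0:
        raise ValueError("p + p' = 1: the criteria carry no information")
    R = (S - (1.0 - p_prime) * M) / denom   # true RA cases
    N = M - R                               # true non-RA cases
    P = (1.0 - p_prime) * N / S             # wrong among those classified as RA
    P_star = (1.0 - p) * R / (M - S)        # wrong among those classified as non-RA
    return R, N, P, P_star

# Data of Arnett et al. as quoted in Table 1.
R, N, P, P_star = solve_counts(S=273, M=524, p=245 / 262, p_prime=234 / 262)
print(f"R = {R:.1f}, N = {N:.1f}, P = {P:.3f}, P* = {P_star:.3f}")
```

With these inputs the sketch reproduces R = N = 262, P = 28/273 (about 0.103) and P* = 17/251 (about 0.068).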
Some
combinations of the values R and N are given in Table 2. For a given known
ratio S/M we can find in the table
the corresponding proportion of mistakes (P).
Table 2.

   M =    R +    N      S/M               (M-S)/M           P        P*
1000 = 1000 +    0      0.935 (= p)       0.065 (= 1-p)     0        1.000
1000 =  900 +  100      0.852             0.148             0.013    0.395
1000 =  800 +  200      0.769             0.231             0.028    0.225
1000 =  700 +  300      0.687             0.313             0.047    0.145
1000 =  600 +  400      0.604             0.396             0.071    0.098
1000 =  500 +  500      0.521             0.479             0.103    0.068
1000 =  400 +  600      0.438             0.562             0.147    0.046
1000 =  300 +  700      0.355             0.645             0.211    0.030
1000 =  200 +  800      0.273             0.727             0.314    0.018
1000 =  100 +  900      0.190             0.810             0.507    0.008
1000 =    0 + 1000      0.107 (= 1-p')    0.893 (= p')      1.000    0

(p = sensitivity = 0.935 ; p' = specificity = 0.893)
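For readers who wish to recompute Table 2, or to build a similar table for other values of sensitivity and specificity, the following Python sketch may help. The names are illustrative, and the last printed digit can differ slightly from the table, depending on whether intermediate values are rounded.

```python
p, p_prime = 0.935, 0.893   # sensitivity and specificity used in Table 2
M = 1000                    # total number of patients in each row

def table_row(R):
    """Return (S/M, (M-S)/M, P, P*) for R true RA cases among M patients."""
    N = M - R
    S = p * R + (1.0 - p_prime) * N      # expected number classified as RA
    P = (1.0 - p_prime) * N / S          # wrong among those classified as RA
    P_star = (1.0 - p) * R / (M - S)     # wrong among those classified as non-RA
    return S / M, (M - S) / M, P, P_star

rows = {R: table_row(R) for R in range(1000, -1, -100)}

for R, (s, not_s, P, P_star) in rows.items():
    print(f"{M} = {R:4d} + {M - R:4d}   {s:.3f}   {not_s:.3f}   {P:.3f}   {P_star:.3f}")
```

Changing `p` and `p_prime` at the top produces the corresponding table for any other set of criteria.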
Proportion of true RA diagnoses
The
diagnostic criteria from 1958 were sometimes used with the intention of diagnosing RA in individual cases, before the introduction
of the new classification criteria of 1987. Perhaps some of the latter criteria
might sometimes help to estimate the percentage of true RA diagnoses(?). For
example, let us consider a possibility of applying only three of the seven criteria given in Table 3.
Table 3.

Criterion                                Sensitivity (p)        Specificity (p')        1-p'
1. Morning stiffness                     91.2 %  p1 = 0.912     40.4 %  p'1 = 0.404     1-p'1 = 0.596
2. Arthritis of 3 or more joints         90.7 %  p2 = 0.907     84.0 %  p'2 = 0.840     1-p'2 = 0.160
3. Arthritis of hand joints              79.3 %  p3 = 0.793     84.0 %  p'3 = 0.840     1-p'3 = 0.160
4. Symmetric arthritis (any region)      94.3 %  p4 = 0.943     74.3 %  p'4 = 0.743     1-p'4 = 0.257
5. Rheumatoid nodules                    43.4 %  p5 = 0.434     97.7 %  p'5 = 0.977     1-p'5 = 0.023
6. Serum rheumatoid factor               80.4 %  p6 = 0.804     87.0 %  p'6 = 0.870     1-p'6 = 0.130
7. Radiographic changes (ARA)            77.2 %  p7 = 0.772     93.7 %  p'7 = 0.937     1-p'7 = 0.063

(Note: According to Arnett et al.(3), a patient is said to have rheumatoid
arthritis if he/she has satisfied at least 4 of the above 7 criteria.)
[ From the data given in Table 3 we find: p2×p5×p7 = 0.907×0.434×0.772 = 0.304 and: (1-p'2)(1-p'5)(1-p'7) = 0.16×0.023×0.063 = 0.00023 ].
Let us suppose that criteria No. 2, 5 and 7 are applied to a group of 36,000 patients (= M), a patient being diagnosed as RA only when all three are satisfied, and suppose
that the unknown number of RA patients in this group is only 1000 (= R). (So,
there are N = 35,000 non-RA patients in this group.) On the assumption (which is by no means certain!) that the findings
No. 2, 5 and 7 occur in individual patients independently of each other (i.e. in random combinations), only about 304 RA patients
will be detected (namely: p2×p5×p7 = 0.304 ; 1000×0.304 = 304), but we can expect more than 97 percent of
the RA diagnoses to be true. Namely, the expected number of wrong RA diagnoses is 35,000×0.00023 = 8 ; 8/(304+8) = 0.026, or 2.6 percent.
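The same calculation can be written as a short Python sketch. It rests entirely on the independence assumption stated above (which may well not hold in practice); the dictionary names are illustrative, and the values are those of criteria 2, 5 and 7 in Table 3.

```python
from math import prod

# Sensitivities and specificities of criteria 2, 5 and 7 (Table 3).
sens = {2: 0.907, 5: 0.434, 7: 0.772}
spec = {2: 0.840, 5: 0.977, 7: 0.937}

R, N = 1000, 35000   # assumed numbers of true RA and true non-RA patients

# ASSUMPTION: the three findings occur independently of each other,
# so the probability of all three together is the product.
p_all = prod(sens.values())                  # RA patient meets all three
fp_all = prod(1.0 - s for s in spec.values())  # non-RA patient meets all three

true_pos = R * p_all
false_pos = N * fp_all
wrong_share = false_pos / (true_pos + false_pos)

print(f"true RA diagnoses:  {true_pos:.0f}")
print(f"wrong RA diagnoses: {false_pos:.0f}")
print(f"share of wrong RA diagnoses: {wrong_share:.3f}")
```

Requiring all three criteria trades sensitivity (only about 30 percent of RA patients are found) for a very small proportion of false diagnoses.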
Other diseases
There are other diagnostic criteria that can be used for diagnosing (or classifying) various other diseases.(7)
Perhaps, in order to determine the achieved proportion of true diagnoses of such
a disease, it might be useful to reason in a similar way as above and use the observed proportion of diagnoses made on a large
number of patients (S/M).
Diagnoses and discoveries
The possibility of estimating the proportion of false diagnoses by means of the classification criteria for RA is
reminiscent of the statistical verification of the results of scientific experiments by calculating the "false discovery
rate" (FDR)(4) or by other similar methods.(5)
Namely, if we diagnose RA in a single case, or if we obtain a statistically significant result in a single experiment,
we may surmise that our diagnosis - or our "statistical discovery", respectively - may be true. However, in order to determine the probability of a mistake, we need a large set of patients, or a large
set of experiments, respectively.
If we know the ratio S/M, we can calculate the proportion P, as described above. Similarly,
if we know the ratio (r/n) of statistically significant results (r) in a large set of experiments (n), we can calculate
(for example) the expected maximal proportion of false discoveries (Qmax) from the following formula:
Qmax = [(n/r)-1] / [(1/0.05)-1], where 0.05 is the "5-percent level of statistical
significance".(5)(6) For example, if
r/n = 0.69 (i.e. n/r = 1.449), we find Qmax = 0.024, which shows that the maximally expected proportion of
false discoveries is less than 0.05, i.e. less than 5 percent.
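The formula for Qmax is easy to evaluate in code. The following Python sketch uses a hypothetical function name and treats the significance level as a parameter (alpha) with the default 0.05 used in the text.

```python
def q_max(r, n, alpha=0.05):
    """Expected maximal proportion of false discoveries.

    r: number of statistically significant results
    n: total number of experiments
    alpha: significance level (0.05 in the text)
    """
    if r <= 0:
        raise ValueError("at least one significant result is required")
    return (n / r - 1.0) / (1.0 / alpha - 1.0)

# Example from the text: r/n = 0.69
print(f"Qmax = {q_max(r=69, n=100):.3f}")
```

For r/n = 0.69 this gives about 0.024, in agreement with the worked example.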
________________________________________________________________________
Appendix
Known values: M, S, S/M, (M-S)/M = 1-(S/M), p = 0.935, p' = 0.893,
from which we calculate: R, N, P, P*

Two equations with two unknowns (R and N):
   S = pR + (1-p')N ;   M = R + N

   S = pR + (1-p')(M-R) = pR + (1-p')M - (1-p')R = (p+p'-1)R + (1-p')M
   S - (1-p')M = (p+p'-1)R ;   R = [S - (1-p')M] / (p+p'-1)

   1-p' = 0.107 ;   p+p'-1 = 0.828 ;   R = [S - 0.107×M] / 0.828 ;
   R/M = [(S/M) - 0.107] / 0.828 ;   N = M - R
   S/M = 0.828×R/M + 0.107

   P = (1-p')N / [pR + (1-p')N] = (1-p')N/S ;   P* = (1-p)R / (M-S)
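As a small supplement (not part of the original derivation), substituting N = M-R and the expression for R/M into P = (1-p')N/S gives P as a function of the ratio s = S/M alone: P = (1-p')(p - s) / [(p+p'-1)s]. The Python sketch below evaluates this closed form; the function name is hypothetical, and the formula is meaningful only for s between 1-p' and p.

```python
def p_from_ratio(s, p=0.935, p_prime=0.893):
    """Proportion P of wrong 'RA' classifications as a function of s = S/M.

    Follows from N/M = 1 - R/M = (p - s) / (p + p' - 1)
    and P = (1 - p') * (N/M) / s.
    """
    if not (1.0 - p_prime) <= s <= p:
        raise ValueError("s must lie between 1-p' and p")
    return (1.0 - p_prime) * (p - s) / ((p + p_prime - 1.0) * s)

for s in (0.852, 0.521, 0.190):
    print(f"S/M = {s:.3f}  ->  P = {p_from_ratio(s):.3f}")
```

This makes explicit that, for given sensitivity and specificity, P depends on nothing but S/M.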
-----------------------------------------------------------------------------------------------------
Table 4.

                          True RA           True non-RA       Total
Classified as RA:         a = 245           b = 28            a+b = 273 = S
Classified as non-RA:     c = 17            d = 234           c+d = 251
Total:                    a+c = 262 = R     b+d = 262 = N     a+b+c+d = 524 = M
Sensitivity = p = a/(a+c) ;   specificity = p' = d/(b+d)
Positive predictive value = a/(a+b) = 1-P ;   P = (1-p')N/S = b/(a+b)
P is the proportion of non-RA cases among the S cases that are classified as RA.
_______________________________________________________________________
References:
1. Lancelot Hogben: Mathematics in the Making, Rathbone Books Limited, London - Mladinska knjiga, Ljubljana, 1977, p. 269
2. Charles G. Helmick, David T. Felson, Reva C. Lawrence, et al.: Estimates of the Prevalence of Arthritis and Other Rheumatic Conditions in the United States, Arthritis & Rheumatism, Part 1, Vol. 58, No. 1, January 2008, pp. 15-25
http://www.rheumatology.org/press/prevalence-one.asp
3. Frank C. Arnett, Steven M. Edworthy, Daniel A. Bloch, et al.: The American Rheumatism Association 1987 Revised Criteria for the Classification of Rheumatoid Arthritis, Arthritis and Rheumatism, Vol. 31, No. 3, March 1988
http://www.rheumatology.org/publications/classification/ra/1987_revised_criteria_classification_ra.asp?aud=mem
4. Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B, 57, 289-300.
5. Sorić, B. (1989). Statistical 'Discoveries' and Effect-size Estimation, Journal of the American Statistical Association, Vol. 84, No. 406 (Theory and Methods), 608-610
6. Sorić, B. (2001). Statisticko zakljucivanje (Statistical Inference)
https://soric-b.tripod.com/statistickozakljucivanje/
https://soric-b.tripod.com/statisticalinference/
7. Some Internet addresses of papers dealing with diagnostic criteria for various diseases:
http://www.mult-sclerosis.org/DiagnosticCriteria.html
Diagnostic Criteria for Multiple Sclerosis
http://www.ncbi.nlm.nih.gov/pubmed/16717206
Revised diagnostic criteria for neuromyelitis optica
http://neuro.psychiatryonline.org/cgi/content/full/15/2/200
An Empirical Study of Different Diagnostic Criteria for Delirium Among Elderly Medical Inpatients
http://www.pwsausa.org/syndrome/Diagnos.htm
Diagnostic Criteria for Prader-Willi Syndrome
http://ncptsd.kattare.com/ncmain/ncdocs/fact_shts/fs_dsm_iv_tr.html
DSM-IV-TR criteria for PTSD - "In 2000, the American Psychiatric Association revised the PTSD diagnostic criteria in the fourth edition of its Diagnostic and Statistical Manual of Mental Disorders (DSM-IV-TR)".
http://www.nfinc.org/nf1.shtml
Diagnostic Criteria for NF-1 (Neurofibromatosis type 1)
http://jnnp.bmj.com/content/80/12/1364.abstract
Validity of diagnostic criteria for chronic inflammatory demyelinating polyneuropathy: a multicentre European study
http://www.amjmed.com/article/S0002-9343(09)00334-9/abstract
Diagnostic Criteria for Atrophic Rhinosinusitis
http://bloodjournal.hematologylibrary.org/cgi/content/full/112/2/231
The revised World Health Organization diagnostic criteria for polycythemia vera, essential thrombocytosis, and primary myelofibrosis: an alternative proposal
http://hht.org/medical-scientific/diagnostic-criteria-for-hht/
Diagnostic Criteria for HHT (Hereditary Hemorrhagic Telangiectasia)
http://www.medscape.com/viewarticle/412642_4
Diagnostic Criteria For Diabetes Mellitus
http://www.ingentaconnect.com/content/bsc/bjd/2008/00000158/00000004/art00014
Diagnostic criteria for atopic dermatitis: a systematic review
http://www.medicalcriteria.com/site/index.php?option=com_content&view=article&id=121%3Aneurocluster&catid=64%3Aneurology&Itemid=80&lang=en
ICHD-II Diagnostic Criteria for Cluster Headache
http://www.medicalnewstoday.com/articles/76581.php
New Diagnostic Criteria For Alzheimer's Disease
______________________________________________________________________-
December, 2009
Branko Sorić
(doctor of medicine, retired)
Vlaška 84
10000 Zagreb, Croatia
Fax: +385 1 4623 436 E-mail: branko.soric@zg.t-com.hr