Because I am not a mathematician or statistician, I cannot follow much of what has been written on methods for estimating proportions of false discoveries, or confidence intervals, in many papers published in the last twenty years. I thought that the main problem was the possibility that the actual proportions could be much smaller than the estimated proportions (in very large sets of p-values). However, I recently saw the paper by Meinshausen and Bühlmann (see below: Ref. 4.), who say:
(In the Abstract):
"The true amount of false discoveries, however, is very often much larger
than indicated by FDR. We propose the new Discoveries-at-Risk approach (DaR) (......) DaR approach offers both tighter control and more power than FDR when controlling at
low error rates."
(In the Introduction):
"A large proportion of these 'discoveries' might be due to falsely rejected null hypotheses, see e.g. Soric (1989)." (See below: Ref. 5.). The
authors mention Type 1 error rates in multiple testing situations, and say: "The
two most prominent examples are the family-wise error rate, see e.g. Holm (1979) or
Westfall and Young (1993), and the false discovery rate, introduced by Benjamini and Hochberg (1995)." (See below:
Ref. 1.) "(......) It has to
be noted, though, that FDR measures only the expected proportion of
falsely rejected hypotheses. FDR contains no information about the
variance or distribution of this quantity. (......) The variance can be quite high, in particular for dependent test statistics.
A serious, yet mostly ignored shortcoming of FDR is the high risk that the actual
proportion of falsely rejected hypotheses is much larger than suggested by FDR."
Let Q be the actual proportion of false discoveries among S significant results. So, according to Meinshausen and Bühlmann, there is a high risk that FDR < Q. But, unless I made mistakes when I tried to use the Benjamini-Hochberg method (see the page Qmg), it seems that FDR can easily be larger than Qmax (i.e. Qmax < FDR), while in a large set of p-values Q < Qmax. If we want to take random variation into account, we can calculate a "practically maximal possible" value of Qmax (see the page: Possible Qmax). Hence it seems that: Q < Qmax < FDR. In other words, couldn't FDR and Qmax more often overestimate than underestimate the true proportion of false discoveries, especially in large sets m and S? (But I may be wrong.)
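One way to check this kind of question for oneself is a small simulation. The sketch below (my own illustration, not anything from the papers cited here) generates m p-values, some from true nulls and some from alternatives, applies the standard Benjamini-Hochberg step-up procedure at level q, and then compares the nominal level q with the actual proportion Q of false discoveries among the rejections. All parameter values (m, the null fraction, the effect size of 3) are arbitrary assumptions chosen for illustration.

```python
# Simulation sketch: Benjamini-Hochberg at level q vs. the actual
# false-discovery proportion Q. Parameter values are illustrative only.
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(0)
m, pi0, q = 10_000, 0.8, 0.10          # assumed: total tests, null fraction, BH level
m0 = int(pi0 * m)                       # number of true null hypotheses

# Null p-values are Uniform(0,1); non-null p-values come from one-sided
# tests of z ~ N(3, 1), i.e. p = P(N(0,1) > z).
p_null = rng.uniform(size=m0)
z_alt = rng.normal(loc=3.0, size=m - m0)
p_alt = np.array([0.5 * erfc(z / sqrt(2)) for z in z_alt])
p = np.concatenate([p_null, p_alt])
is_null = np.concatenate([np.ones(m0, bool), np.zeros(m - m0, bool)])

# Benjamini-Hochberg step-up: reject the k smallest p-values, where k is
# the largest index with p_(k) <= k * q / m.
order = np.argsort(p)
below = p[order] <= q * np.arange(1, m + 1) / m
k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
rejected = order[:k]                    # the S declared "discoveries"

S = len(rejected)
Q = is_null[rejected].mean() if S > 0 else 0.0  # actual proportion of false discoveries
print(f"S = {S}, actual Q = {Q:.3f}, nominal BH level q = {q}")
```

Repeating this over many seeds would show how Q scatters around (and below, or occasionally above) the nominal level, which is exactly the variability question raised in the quoted passage.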
If a large number (m) of p-values is obtained from continuous distributions, in many independent experiments, and the distribution of these p-values is shown in a histogram, we can from it approximately estimate the number of true null hypotheses (m0) and the proportion of false discoveries among the S significant results: Q = F/S, where F is the estimated number of false discoveries among them. (I denote this estimate of Q as Qmg, which comes from "Q maximal graphical".) In large sets m and S, I suppose, Qmg might be rather close to the actual Q.
I have been told that there are many cases in which the separation between the alternative and the null distributions is not easy, and more sophisticated statistical methods can then do a much better job than graphical analysis of the histogram. Perhaps I am wrong, but even if the sets are not very large, wouldn't it be useful to calculate the "maximally possible" and the "minimally possible" values of Qmg (similarly to the above-mentioned case of Qmax; see REMARK 2 under the TABLE - Click here!)?
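My reading of the graphical idea can be sketched in a few lines of code (this is an assumption about the procedure, not necessarily the exact method described on the page Qmg): since p-values of true nulls are uniform on [0, 1], the flat right-hand part of the histogram estimates the density of nulls, so m0 can be estimated from the count of p-values above some cut-off; then F ≈ m0 × a is the expected number of false discoveries below alpha, and Qmg = F/S. The cut-off `flat_from` is chosen by eye from the histogram and is an assumption.

```python
# Sketch of a histogram-based (graphical) estimate of m0 and Qmg.
# The choice flat_from = 0.5 is an assumption, not a prescription.
import numpy as np

def qmg_from_pvalues(p, a=0.05, flat_from=0.5):
    """Estimate m0 and Qmg = F/S from a set of p-values.

    flat_from: left edge of the region assumed to contain (almost)
    only true-null p-values.
    """
    p = np.asarray(p)
    m = len(p)
    # Nulls are uniform, so scale the count in [flat_from, 1] up to [0, 1].
    m0_hat = np.sum(p >= flat_from) / (1.0 - flat_from)
    m0_hat = min(m0_hat, m)             # m0 cannot exceed m
    S = int(np.sum(p < a))              # declared discoveries
    F = m0_hat * a                      # expected false discoveries among them
    Qmg = F / S if S > 0 else 0.0
    return m0_hat, S, Qmg

# Illustrative data: 8000 uniform (null) p-values plus 2000 p-values
# skewed toward zero (non-null), drawn from a Beta(0.5, 10) distribution.
rng = np.random.default_rng(1)
p = np.concatenate([rng.uniform(size=8000),
                    rng.beta(0.5, 10.0, size=2000)])
m0_hat, S, Qmg = qmg_from_pvalues(p, a=0.05)
print(f"m0_hat = {m0_hat:.0f}, S = {S}, Qmg = {Qmg:.3f}")
```

In cases where the alternative distribution leaks noticeably into the flat region, this estimate of m0 is biased upward, which is one reason the more sophisticated methods mentioned above can do better.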
Efron, in his paper of 2008 (see below: Ref. 3.), paid attention to Benjamini and Hochberg's False Discovery Rate method and also to Benjamini and Yekutieli's False Coverage Rate (FCR) (see below: Ref. 1. and 2.). Efron says that the paper by Benjamini and Yekutieli (2005) is insightful and ingenious (mentioning, though, a case in which "the intervals are frighteningly wide").
Suppose that m experiments are made (where m is several thousand or more) and m p-values are obtained. We count the S significant results with p-values smaller than alpha (p < a). We draw a histogram showing the distribution of the m p-values, and we find the estimate Qmg of the expected proportion of false discoveries in S. We also take possible random variation into account and calculate the practically "maximal possible" value of this proportion (Qmg*) and the "minimal possible" value (Qmg**). Then we use another method (the best one available), which gives the best estimate (X) of the same proportion of false discoveries in the same set S, and we calculate the "maximal possible" value X* and the "minimal possible" value X**. I would like to know what the actual differences |Qmg-X|, |Qmg*-X*| and |Qmg**-X**| would be in a real set S. Has such a comparison been made, and where have the results been published?
However, if any method makes it possible to estimate Q precisely (or rather precisely) in a large enough set, then we can also rather precisely calculate the proportion of false confidence intervals (E) by inserting that estimate of Q into these formulae:

f = (S - QS) / [m - (QS/a0)]

E = [QS + (S - QS)×a/f] / S

where a is used instead of alpha; a0 is the significance level achieved in the S declared discoveries; a is the value used in defining 100×(1-a)-percent confidence intervals; m and S are also known. (See the page: E & Emin.) Could this method of calculating the proportion of false confidence intervals be interesting or useful in any way? Has that been published, and where?
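For concreteness, the two formulae can be transcribed directly into a short function. This is only a literal transcription under my reading of the symbols (Q = estimated proportion of false discoveries, m = number of p-values, S = number of declared discoveries, a0 = significance level achieved in the S discoveries, a = level of the 100×(1-a)% confidence intervals); the numerical inputs below are invented for illustration and come from no real data set.

```python
# Literal transcription of the formulae for f and E above.
def false_ci_proportion(Q, m, S, a0, a):
    """Return (f, E) where
       f = (S - Q*S) / (m - Q*S/a0)
       E = (Q*S + (S - Q*S) * a / f) / S
    """
    QS = Q * S
    f = (S - QS) / (m - QS / a0)
    E = (QS + (S - QS) * a / f) / S
    return f, E

# Purely illustrative numbers (assumptions, not real data):
f, E = false_ci_proportion(Q=0.10, m=10_000, S=500, a0=0.01, a=0.05)
print(f"f = {f:.4f}, E = {E:.4f}")
```

With these invented inputs the function gives f = 0.09 and E = 0.6; whether such values are plausible depends entirely on the quality of the estimate of Q that is inserted.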
I
beg you to help me correct my mistakes!
________________________
REFERENCES:
1. Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B, 57, 289-300.
2. Benjamini, Y. and Yekutieli, D. (2005). False discovery rate-adjusted multiple confidence intervals for selected parameters. Journal of the American Statistical Association, Vol. 100, No. 469.
3. Efron, B. (2008). Microarrays, empirical Bayes, and the two-groups model. Statist. Sci., 23, 1-47. With comments and a rejoinder by the author.
4. Meinshausen, N. and Bühlmann, P. (2003). Discoveries at Risk. Seminar für Statistik, ETH Zürich, Switzerland, May 9, 2003. http://en.scientificcommons.org/43322211
5. Sorić, B. (1989). Statistical "discoveries" and effect-size estimation. Journal of the American Statistical Association, Vol. 84, No. 406 (Theory and Methods), 608-610.