Because I am not a mathematician or statistician, I cannot follow much of what has been written on methods for estimating proportions of false discoveries, or confidence intervals, in many papers published in the last twenty years. I thought that the main problem was the possibility that the actual proportions could be much smaller than the estimated proportions (in very large sets of p-values). However, I recently saw the paper by Meinshausen and Bühlmann (see below: Ref. 4.), who say:
(In the Abstract):
"The true amount of false discoveries, however, is very often much larger
than indicated by FDR. We propose the new Discoveries-at-Risk approach (DaR) (......) DaR approach offers both tighter control and more power than FDR when controlling at
low error rates."
(In the Introduction):
"A large proportion of these 'discoveries' might be due to falsely rejected null hypotheses, see e.g. Soric (1989)." (See below: Ref. 5.). The
authors mention Type 1 error rates in multiple testing situations, and say: "The
two most prominent examples are the family-wise error rate, see e.g. Holm (1979) or
Westfall and Young (1993), and the false discovery rate, introduced by Benjamini and Hochberg (1995)." (See below:
Ref. 1.) "(......) It has to
be noted, though, that FDR measures only the expected proportion of
falsely rejected hypotheses. FDR contains no information about the
variance or distribution of this quantity. (......) The variance can be quite high, in particular for dependent test statistics.
A serious, yet mostly ignored shortcoming of FDR is the high risk that the actual
proportion of falsely rejected hypotheses is much larger than suggested by FDR."
Let Q be the actual proportion of false discoveries among S significant results. So, according to Meinshausen and Bühlmann, there is a high risk that FDR < Q. But, unless I made mistakes when I tried to use the Benjamini-Hochberg method (see the page Qmg), it seems that FDR can easily be larger than Qmax (i.e. Qmax < FDR), while in a large set of p-values Q < Qmax. If we want to take random variation into account, we can calculate a "practically maximal possible" value of Qmax (see the page: Possible Qmax). Hence it seems that: Q < Qmax < FDR. In other words, couldn't FDR and Qmax more often overestimate than underestimate the true proportion of false discoveries, especially in large sets m and S? (But I may be wrong.)
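One way to check this kind of question for oneself is a small simulation. The sketch below (my own illustration, not anything from the papers cited here) generates m p-values, some from true nulls and some from alternatives, applies the standard Benjamini-Hochberg step-up procedure at level q, and then compares the nominal level q with the actual proportion Q of false discoveries among the rejections. All parameter values (m, the null fraction, the effect size of 3) are arbitrary assumptions chosen for illustration.

```python
# Simulation sketch: Benjamini-Hochberg at level q vs. the actual
# false-discovery proportion Q. Parameter values are illustrative only.
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(0)
m, pi0, q = 10_000, 0.8, 0.10          # assumed: total tests, null fraction, BH level
m0 = int(pi0 * m)                       # number of true null hypotheses

# Null p-values are Uniform(0,1); non-null p-values come from one-sided
# tests of z ~ N(3, 1), i.e. p = P(N(0,1) > z).
p_null = rng.uniform(size=m0)
z_alt = rng.normal(loc=3.0, size=m - m0)
p_alt = np.array([0.5 * erfc(z / sqrt(2)) for z in z_alt])
p = np.concatenate([p_null, p_alt])
is_null = np.concatenate([np.ones(m0, bool), np.zeros(m - m0, bool)])

# Benjamini-Hochberg step-up: reject the k smallest p-values, where k is
# the largest index with p_(k) <= k * q / m.
order = np.argsort(p)
below = p[order] <= q * np.arange(1, m + 1) / m
k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
rejected = order[:k]                    # the S declared "discoveries"

S = len(rejected)
Q = is_null[rejected].mean() if S > 0 else 0.0  # actual proportion of false discoveries
print(f"S = {S}, actual Q = {Q:.3f}, nominal BH level q = {q}")
```

Repeating this over many seeds would show how Q scatters around (and below, or occasionally above) the nominal level, which is exactly the variability question raised in the quoted passage.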
If a large number (m) of p-values is obtained from continuous distributions, in many independent experiments, and the distribution of these p-values is shown in a histogram, we can from it approximately estimate the number of true null hypotheses (m0) and the proportion of false discoveries among the S significant results: Q = F/S, where F is the estimated number of false discoveries among them. (I denote this estimate of Q as Qmg, which comes from "Q maximal graphical".) In large sets m and S, I suppose, Qmg might be rather close to the actual Q.
I have been told that there are many cases in which the separation between the alternative and the null distributions is not easy, and more sophisticated statistical methods can then do a much better job than graphical analysis of the histogram. Perhaps I am wrong, but even if the sets are not very large, wouldn't it be useful to calculate the "maximally possible" and the "minimally possible" values of Qmg (similarly to the above-mentioned case of Qmax; see REMARK 2 under the TABLE - Click here!)?
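My reading of the graphical idea can be sketched in a few lines of code (this is an assumption about the procedure, not necessarily the exact method described on the page Qmg): since p-values of true nulls are uniform on [0, 1], the flat right-hand part of the histogram estimates the density of nulls, so m0 can be estimated from the count of p-values above some cut-off; then F ≈ m0 × a is the expected number of false discoveries below alpha, and Qmg = F/S. The cut-off `flat_from` is chosen by eye from the histogram and is an assumption.

```python
# Sketch of a histogram-based (graphical) estimate of m0 and Qmg.
# The choice flat_from = 0.5 is an assumption, not a prescription.
import numpy as np

def qmg_from_pvalues(p, a=0.05, flat_from=0.5):
    """Estimate m0 and Qmg = F/S from a set of p-values.

    flat_from: left edge of the region assumed to contain (almost)
    only true-null p-values.
    """
    p = np.asarray(p)
    m = len(p)
    # Nulls are uniform, so scale the count in [flat_from, 1] up to [0, 1].
    m0_hat = np.sum(p >= flat_from) / (1.0 - flat_from)
    m0_hat = min(m0_hat, m)             # m0 cannot exceed m
    S = int(np.sum(p < a))              # declared discoveries
    F = m0_hat * a                      # expected false discoveries among them
    Qmg = F / S if S > 0 else 0.0
    return m0_hat, S, Qmg

# Illustrative data: 8000 uniform (null) p-values plus 2000 p-values
# skewed toward zero (non-null), drawn from a Beta(0.5, 10) distribution.
rng = np.random.default_rng(1)
p = np.concatenate([rng.uniform(size=8000),
                    rng.beta(0.5, 10.0, size=2000)])
m0_hat, S, Qmg = qmg_from_pvalues(p, a=0.05)
print(f"m0_hat = {m0_hat:.0f}, S = {S}, Qmg = {Qmg:.3f}")
```

In cases where the alternative distribution leaks noticeably into the flat region, this estimate of m0 is biased upward, which is one reason the more sophisticated methods mentioned above can do better.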
Efron, in his paper of 2008 (see below: Ref. 3.), paid attention to Benjamini and Hochberg's False Discovery Rate method and also to Benjamini and Yekutieli's False Coverage Rate (FCR) (see below: Ref. 1. and 2.). Efron says that the paper by Benjamini and Yekutieli (2005) is insightful and ingenious (mentioning, though, a case in which "the intervals are frighteningly wide").
Suppose that m experiments are made (where m is several thousand or more) and m p-values are obtained. We count the S significant results with p-values smaller than alpha (p < a). We draw a histogram showing the distribution of the m p-values, and we find the estimate Qmg of the expected proportion of false discoveries in S. We also take possible random variation into account and calculate the practically "maximal possible" value of this proportion (Qmg*) and the "minimal possible" value (Qmg**). Then we use another method (the best one available), which gives the best estimate (X) of the same proportion of false discoveries in the same set S, and we calculate the "maximal possible" value X* and the "minimal possible" value X**. I would like to know what the actual differences |Qmg-X|, |Qmg*-X*| and |Qmg**-X**| would be in a real set S. Has such a comparison been made, and where have the results been published?
However, if any method makes it possible to estimate Q precisely (or rather precisely) in a large enough set, then we can also rather precisely calculate the proportion of false confidence intervals (E) by inserting that estimate of Q into these formulae:

f = (S - QS) / [m - (QS/a0)]

E = [QS + (S - QS)×a/f] / S

where a is used instead of alpha; a0 is the significance level achieved in the S declared discoveries; a is the value used in defining 100×(1-a)-percent confidence intervals; m and S are also known. (See the page: E & Emin.) Could this method of calculating the proportion of false confidence intervals be interesting or useful in any way? Has that been published, and where?
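For concreteness, the two formulae can be transcribed directly into a short function. This is only a literal transcription under my reading of the symbols (Q = estimated proportion of false discoveries, m = number of p-values, S = number of declared discoveries, a0 = significance level achieved in the S discoveries, a = level of the 100×(1-a)% confidence intervals); the numerical inputs below are invented for illustration and come from no real data set.

```python
# Literal transcription of the formulae for f and E above.
def false_ci_proportion(Q, m, S, a0, a):
    """Return (f, E) where
       f = (S - Q*S) / (m - Q*S/a0)
       E = (Q*S + (S - Q*S) * a / f) / S
    """
    QS = Q * S
    f = (S - QS) / (m - QS / a0)
    E = (QS + (S - QS) * a / f) / S
    return f, E

# Purely illustrative numbers (assumptions, not real data):
f, E = false_ci_proportion(Q=0.10, m=10_000, S=500, a0=0.01, a=0.05)
print(f"f = {f:.4f}, E = {E:.4f}")
```

With these invented inputs the function gives f = 0.09 and E = 0.6; whether such values are plausible depends entirely on the quality of the estimate of Q that is inserted.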
I
beg you to help me correct my mistakes!
________________________
REFERENCES:
1. Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B, 57, 289-300.
2. Benjamini, Y. and Yekutieli, D. (2005). False discovery rate-adjusted multiple confidence intervals for selected parameters. Journal of the American Statistical Association, Vol. 100, No. 469.
3. Efron, B. (2008). Microarrays, empirical Bayes, and the two-groups model. Statist. Sci., 23, 1-47. With comments and a rejoinder by the author.
4. Meinshausen, N. and Bühlmann, P. (2003). Discoveries at Risk. Seminar für Statistik, ETH Zürich, Switzerland, May 9, 2003. http://en.scientificcommons.org/43322211
5. Sorić, B. (1989). Statistical "discoveries" and effect-size estimation. Journal of the American Statistical Association, Vol. 84, No. 406 (Theory and Methods), 608-610.