Q 2009

E & Emin


Estimating the expected proportion of false confidence intervals


     (Symbols)           Significant    Not significant    Total
     -------------------------------------------------------------
     Null true                F             m0 - F           m0
     Alternative true         T             m1 - T           m1
     -------------------------------------------------------------
     Total                    S             m  - S           m

     

Symbols     ( a  is used instead of alpha)

m = large number of experiments (p-values)

m0 = number of true null hypotheses in  m

m1 = number of true alternative hypotheses in  m

                m = m0+m1 ;    m1 =  m1' +m1'' +m1'''....(etc.)

S = F+T = number of significant results (discoveries) at the level  a0  in  m

F = number of false discoveries in  S

T = number of true discoveries in  S

a0 = F/m0 = significance level, i.e. proportion of false discoveries in m0

a = probability that a 100(1-a)-percent confidence interval (in m0 or m1) is false, i.e. the proportion of false 100(1-a)-percent confidence intervals among  m  such intervals

f = T/m1 = proportion of true discoveries in m1   

                F = a0m0 ;       T = f m1 

Q = F/S = actual (or nearly-exactly estimated) proportion of false discoveries in S

Qmg  (= "Q-maximal-graphical")  =  estimate of Q obtained from a histogram

Qmax = estimate of Q obtained from known values  m,  S,  a  (as in my paper published in JASA, 1989; see below: Ref. 1.)

E = proportion of false  100(1-a)-percent  confidence intervals in S intervals
Emax  =  calculated largest expected value of E

Emin  =  calculated smallest expected value of E
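
As an illustration of how the symbols above fit together, the following Python sketch fills in the cells of the table from assumed values of m0, m1, a0 and f. The function name and the numbers are hypothetical and were chosen only to make the arithmetic easy to follow; in practice m0, m1 and f are not known.

```python
def table_cells(m0, m1, a0, f):
    """Expected cell counts of the 2x2 table (hypothetical helper)."""
    F = a0 * m0          # false discoveries among the true nulls
    T = f * m1           # true discoveries among the true alternatives
    S = F + T            # all significant results
    m = m0 + m1          # all experiments
    return {
        "F": F, "m0 - F": m0 - F,
        "T": T, "m1 - T": m1 - T,
        "S": S, "m - S": m - S,
        "Q": F / S,      # proportion of false discoveries in S
    }

# Assumed example values, not taken from any real data set.
cells = table_cells(m0=800, m1=200, a0=0.05, f=0.6)
for name, value in cells.items():
    print(f"{name:7s} = {value:.4g}")
```

With these assumed inputs, 40 of the 160 significant results are false discoveries, so Q = 0.25.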

                

      In a large known number (m) of experiments, a0  and  a  are known, and the number (S) of significant results (those with p<a0) is also known, because it can be counted.   If  Q  can be estimated nearly exactly, the proportion (E) of false confidence intervals in S can also be calculated nearly exactly:

                                                                                                  (See derivations below!)

E = [QS + (S-QS)a/f ] / S    ............. (1)

      In (1)  a  and  S  are known,  Q  is also known if it can be estimated with a satisfactory precision, and  f  can be calculated from the following formula:

f = (S-QS) / [m -(QS/a0)]   .............. (2)

where  S,  Q,  m  and  a0  are known. 

      Derivation of (2): 

Q = F/S ;   QS = F = m0a0 ;   m0 = QS/a0

m1f = T  = S-F  = S - QS 

f = T/m1 = (S-QS)/m1   where  m1= m-m0 = m - (QS/a0)

f = (S-QS) / [m - (QS/a0)] ;   this is formula (2). 

             

      Derivation of (1): 

     The  F  false discoveries (significant at the level  a0<a)  give  F  false 100(1-a)-percent confidence intervals, because each such interval excludes the true (null) value.  The  T  true discoveries have  T  100(1-a)-percent confidence intervals, and these are taken to include the  m1a  false confidence intervals expected among all  m1  true alternatives.   [Namely: m1a = (m1' + m1'' + m1''' + ...)a]

      T=m1f ;   Ta =m1fa ;    m1a = Ta/f

      E = [F+ Ta/f] / S   = [QS+ (S-QS)a/f] / S ;   this is formula (1). 

       We insert  f  from (2) into (1) and calculate E.
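
The calculation just described can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the function names and the numbers (m = 1000, S = 160, Q = 0.05, a = 0.05, a0 = 0.01) are hypothetical and do not come from any real data set.

```python
def power_f(m, S, Q, a0):
    """Formula (2): f = (S - QS) / [m - (QS/a0)]."""
    m1 = m - Q * S / a0              # m1 = m - m0, with m0 = QS/a0
    if m1 <= 0:
        raise ValueError("Q, S and a0 imply m0 >= m, so m1 is not positive")
    return (S - Q * S) / m1

def false_ci_proportion(m, S, Q, a, a0):
    """Formula (1): E = [QS + (S - QS)a/f] / S, with f taken from (2)."""
    f = power_f(m, S, Q, a0)
    return (Q * S + (S - Q * S) * a / f) / S

# Assumed example values.
f_hat = power_f(m=1000, S=160, Q=0.05, a0=0.01)
E_hat = false_ci_proportion(m=1000, S=160, Q=0.05, a=0.05, a0=0.01)
print(round(f_hat, 4), round(E_hat, 4))
```

For these inputs F = QS = 8, m0 = 800 and m1 = 200, so f = 152/200 = 0.76 and E = (8 + 200 x 0.05)/160 = 0.1125.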

             

      I beg everybody to let me know if anybody has published the above simple derivations, as well as the formulae (1) and (2) in any form, and where that has been published.  (There are so many published papers which I have never seen, and I don't know where to look or whom to ask).

      Also, I beg you to tell me what mistakes I have made!

            

 

      If we use  Qmg  as an estimate of  Q  in (1) and (2), the obtained values could, perhaps, be rather near to the actual values of  E  and  f.  

      If we insert the value  Qmax  into (1) and (2), we obtain  f = 1  and the corresponding low value of  E.
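
The statement above can be checked numerically. Setting f = 1 in formula (2) and solving for Q gives Q = a0(m - S) / [S(1 - a0)]. The sketch below assumes that this is the value meant by Qmax; the formula from Ref. 1 is not reproduced on this page, so that identification is an assumption. Function names and input numbers are hypothetical.

```python
def q_giving_full_power(m, S, a0):
    # Solve (2) for Q with f = 1:  S - Q*S = m - Q*S/a0
    return a0 * (m - S) / (S * (1.0 - a0))

def power_f(m, S, Q, a0):
    # Formula (2)
    return (S - Q * S) / (m - Q * S / a0)

def false_ci_proportion(m, S, Q, a, a0):
    # Formula (1), with f taken from (2)
    f = power_f(m, S, Q, a0)
    return (Q * S + (S - Q * S) * a / f) / S

# Assumed example values.
m, S, a, a0 = 1000, 160, 0.05, 0.01
q_top = q_giving_full_power(m, S, a0)
f_top = power_f(m, S, q_top, a0)                    # should be 1 up to rounding
E_low = false_ci_proportion(m, S, q_top, a, a0)     # E at the largest Q
E_other = false_ci_proportion(m, S, 0.02, a, a0)    # E at a smaller, assumed Q
print(q_top, f_top, E_low, E_other)
```

In this example the value of E obtained at the largest Q is smaller than the value obtained at the smaller assumed Q = 0.02, which agrees with the remark that inserting Qmax yields a low value of E.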

               

      In a special case, where a0 = a , we have: 

E = (F+ Ta/f) / S =  (m0a0 + Ta/f) / S =  (m0a + Ta/f) / S =

= (m0a  + m1fa/f) / S  = (m0a + m1a) / S  = a(m0+m1) / S = ma / S

 E = am / S  ............. (4)

      The latter formula (4) has been published in my paper (JASA; 1989) in this form:

E = an / r  > Qmax ;  where  n  and  r  stand for  m  and  S,  respectively  (see below: Ref. 1.).  In (4),  m,  S  and  a  are known, and, if  m  is very large, we can simply and almost exactly calculate the actual value (E) of the proportion of false confidence intervals in S.
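
As a numerical check of the special case, the following Python sketch evaluates E from formulae (1) and (2) with a0 = a for several assumed values of Q and compares each result with am/S from (4). The function name and the numbers are hypothetical; the point is only that Q drops out when a0 = a.

```python
def false_ci_proportion(m, S, Q, a, a0):
    f = (S - Q * S) / (m - Q * S / a0)         # formula (2)
    return (Q * S + (S - Q * S) * a / f) / S   # formula (1)

# Assumed example values.
m, S, a = 1000, 160, 0.05
shortcut = a * m / S                           # formula (4)

results = [false_ci_proportion(m, S, Q, a, a0=a) for Q in (0.05, 0.10, 0.25)]
for E in results:
    print(E, shortcut)
```

Each printed pair should agree up to floating-point rounding, whichever of the assumed Q values is used.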

_______________________________________________________

REFERENCE:

      1. Sorić, B. (1989). Statistical "discoveries" and effect-size estimation. J. Amer. Statist. Assoc., 84, 608-610. http://www.jstor.org/pss/2289950 

--------------------------------------------------------------------------

               

I beg to be notified about any

mistakes that may exist above!

branko.soric@zg.t-com.hr

           


----------------------------------------------

 

June - September, 2009

Branko Soric

 

 

 

 

 

 

 
