STP 526 POWER OF A HYPOTHESIS TEST Splus code for power/sample size analysis, written by Daniel F. Heitjan. To get Splus code for power (original version): http://www.cceb.upenn.edu/pages/heitjan/ It's in the ``powerf.q'' archive. Examples of power calculations are in the file powerex.q. You need functions pfnc, pchisqnc, glhpw, glhss. Note: the function fun.glhpw does not work if the X matrix is less than full rank. You can fix that, thanks to your knowledge of g-inverses! Locate the line hpop<-t(theta)%*%solve(cc%*%solve(t(x)%*%x)%*%t(cc))%*%theta and replace it by hpop<-t(theta)%*%solve(cc%*%ginv(t(x)%*%x)%*%t(cc))%*%theta (Note that you need to load the MASS package to be able to use ginv.) ************************************************************************** Null hypothesis H0: c %*% beta %*% u - theta0 = 0, in the model y = x %*% beta + e. glhpw: Power Function for Tests of the General Linear Hypothesis Use a noncentral F approximation to compute the power of a test of the general linear hypothesis. Call: fun.glhpw(alpha, x, bt1, sigma, cc, u, th0, test=1, tol=1e-08) Required arguments: alpha = type I error probability of the test. x = N-by-q design matrix. bt1 = q-by-p alternative hypothesis value of the regression coefficient. sigma = p-by-p variance matrix of a single observation. cc = a-by-q between-rows contrast matrix. u = p-by-b between-columns contrast matrix. th0 = a-by-b null value of cc%*%beta%*%u. Optional arguments: test = test for which power is desired. test=1 (the default) is the Wilks lambda test; test=2 is the Hotelling-Lawley trace test; test=3 is the Pillai-Bartlett trace test. See Muller and Peterson (1984) for details. tol = tolerance for computing the rank of x (using qr()). The default is 1.e-8. Return: the approximate power of a test of the general linear hypothesis cc%*%beta%*%u=th0 under the alternative beta=bt1. The model is y(Nxp)=x(Nxq)%*%beta(qxp)+e(Nxp), where x is the design, beta is the matrix of multivariate regression parameters, and e is the error matrix, whose rows are assumed to be independent draws from a multivariate normal with mean 0 and p-by-p variance matrix sigma. Reference: Muller, K.E. and Peterson, B.L. (1984). Practical methods for computing power in testing the multivariate general linear hypothesis. Computational Statistics and Data Analysis 2, 143--158. Note: If the computed approximate degrees of freedom are negative, a power of 0 is returned; this may mean that the proposed design has no degrees of freedom for error. *******************SAMPLE SIZE CALCULATION**************************** glhss Sample Size for Tests of the General Linear Hypothesis Compute the minimum sample size required to have a specified power for rejecting the null hypothesis in a test of the general linear hypothesis. Call: fun.glhss(alpha, power, x, bt1, sigma, cc, u, th0, test=1, ninit=1) Required arguments: alpha = type I error probability of the test. power = desired power of the test. x = N0-by-q design matrix. This matrix should represent the design for a sample of size 1. For example, if there are to be two groups, x should contain two rows, with the first row representing the design for group 1 and the second row the design for group 2. The sample size returned will then be the number per group. bt1 = q-by-p alternative hypothesis value of the regression coefficient. sigma = p-by-p variance matrix of a single observation. cc = a-by-q between-rows contrast matrix. u = p-by-b between-columns contrast matrix. th0 = a-by-b null value of cc%*%beta%*%u. Optional arguments: test = test to use. test=1 (the default) is the Wilks lambda test; test=2 is the Hotelling-Lawley trace test; test=3 is the Pillai-Bartlett trace test. See Muller and Peterson (1984) for details. ninit = number of replications of x to use as the starting sample size. The sample size algorithm is simple -- it starts at ninit and keeps adding or subtracting 1 (that is, 1 copy of x) until the desired power is achieved. The default is ninit=1. Return: list containing two elements -- n, the smallest number of copies of x that guarantee the desired power for the test described, and power, the power of the design with that sample size. REFERENCE Muller, K.E. and Peterson, B.L. (1984). Practical methods for computing power in testing the multivariate general linear hypothesis. Computational Statistics and Data Analysis 2, 143--158. NOTE The power function glhpw is an approximation, so there may be slight differences when compared with more exact methods. This function increases (or decreases) n one unit at a time, so well chosen starting values can save time. Some experimentation will often be helpful. *********************************Example 1******************************* ## Two-sample t test power. # H0: mu1 = mu2. # What is power if mu1-mu2 = 10, sigma=10, and n=8 in each treatment. x <- matrix(c(1,0,0,1),nrow=2,ncol=2) # x is design matrix for one rep. > x [,1] [,2] [1,] 1 0 [2,] 0 1 xmat <- matrix(x,nrow=16,ncol=2,byrow=T) # construct x matrix with desired sampsize. > xmat [,1] [,2] [1,] 1 0 [2,] 0 1 [3,] 1 0 [4,] 0 1 [5,] 1 0 [6,] 0 1 [7,] 1 0 [8,] 0 1 [9,] 1 0 [10,] 0 1 [11,] 1 0 [12,] 0 1 [13,] 1 0 [14,] 0 1 [15,] 1 0 [16,] 0 1 > # bt1 is value of parameters under alternative hypothesis. bt1 <- matrix(c(5,-5),nrow=2,ncol=1) # could also use value (10,0) for bt1 > bt1 [,1] [1,] 5 [2,] -5 cc <- matrix(c(1,-1),nrow=1,ncol=2) > cc [,1] [,2] [1,] 1 -1 sigma2 <- matrix(10^2,nrow=1,ncol=1) u <- matrix(1,nrow=1,ncol=1) theta0 <- matrix(0,nrow=1,ncol=1) > theta0 [,1] [1,] 0 # Find power!! fun.glhpw(.05,xmat,bt1,sigma2,cc,u,theta0) [1] 0.4612388 # Power against HA if 8 observations in each tmt. fun.glhss(0.05, .95, x, bt1, sigma2, cc, u, theta0,test=1,ninit=1) $n: [1] 27 # Says take 27 observations in each treatment. $power: [1] 0.9500773 # Can also do this with Russell Lenth's software at http://www.stat.uiowa.edu/~rlenth/Power/ Select one-way ANOVA, then set levels[treatment] = 2 n [Within] = 8 SD [Within] = 10 Method = t Alpha = .05 Detectable contrast = 10 Then we get Power = .4612 # Another alternative: PROC POWER in SAS (or use PROC GLMPOWER) # See power.sas for SAS code *********************************Example 2******************************* ## Try to recompute the ANOVA example on p. 717 of Kutner et al. (Kenyon foods) # H0: mu1 = mu2 = mu3 = mu4 # Find power when mu1=12.5, mu2=13, mu3=18, mu4=21 x <- matrix(c(1,1,rep(0,8), 0,0,1,1,1,0,0,0,0,0, 0,0,0,0,0,1,1,1,0,0, 0,0,0,0,0,0,0,0,1,1),nrow=10,ncol=4) # x is design matrix > x [,1] [,2] [,3] [,4] [1,] 1 0 0 0 [2,] 1 0 0 0 [3,] 0 1 0 0 [4,] 0 1 0 0 [5,] 0 1 0 0 [6,] 0 0 1 0 [7,] 0 0 1 0 [8,] 0 0 1 0 [9,] 0 0 0 1 [10,] 0 0 0 1 > qf(.95,3,6) # Quantile of F distribution [1] 4.757063 # Test: reject when F stat > 4.757063 # Now find what the alpha's are under this alternative hypothesis. # Need to find deviations from overall mean---consider unbalanced design # when doing so! meanvec <- matrix(c(12.5,13,18,21),ncol=1) > meanvec [,1] [1,] 12.5 [2,] 13.0 [3,] 18.0 [4,] 21.0 muhat <- mean(x %*% meanvec) > muhat [1] 16 bt0 <- meanvec - muhat > bt0 [,1] [1,] -3.5 [2,] -3.0 [3,] 2.0 [4,] 5.0 cmat <- t(matrix(c(1,-1,0,0,0,1,-1,0,0,0,1,-1),4,3)) > cmat [,1] [,2] [,3] [,4] [1,] 1 -1 0 0 [2,] 0 1 -1 0 [3,] 0 0 1 -1 theta0 <- matrix(c(0,0,0),nrow=3,ncol=1) > theta0 [,1] [1,] 0 [2,] 0 [3,] 0 # Suppose sigma = 2.5. sigma2 <- matrix(2.5^2,nrow=1,ncol=1) u <- matrix(1,nrow=1,ncol=1) # Find power!! The command is: fun.glhpw(.05,x,bt0,sigma2,cmat,u,theta0) [1] 0.7310757 # Note that NWK give power as about 0.72, reading from graphs. # Now, how many reps of this design are needed to obtain power 0.70 # against the specified alternative hypothesis? > fun.glhss(0.05, .95, x, bt0, sigma2, cmat, u, theta0,test=1,ninit=1) $n: [1] 2 $power: [1] 0.9970765 # Note: if using one rep of this design already gives power > requested power, # fun.glhss will dump! But it's free software. fun.glhss(0.05, .7, x, bt0, sigma2, cmat, u, theta0,test=1,ninit=1) Error in solve.default(t(x) %*% x) : Lapack routine dgesv: system is exactly singular