Tests of additional conditional moment restrictions

The primary focus of this article is the provision of tests for the validity of a set of conditional moment constraints, additional to those defining the maintained hypothesis, that are relevant for independent cross-sectional data contexts. The point of departure and principal contribution of the paper is the explicit and full incorporation of the conditional moment information defining the maintained hypothesis in the design of the test statistics. Thus, the approach mirrors that of the classical parametric likelihood setting by defining restricted tests in contradistinction to unrestricted tests that partially or completely fail to incorporate the maintained information in their formulation. The framework is quite general, allowing the parameters defining the additional and maintained conditional moment restrictions to differ and permitting the conditioning variates to differ likewise. GMM and generalised empirical likelihood test statistics are suggested. The asymptotic properties of the statistics are described under both the null hypothesis and a suitable sequence of local alternatives. An extensive set of simulation experiments explores the practical efficacy of the various test statistics in terms of empirical size and size-adjusted power, confirming the superiority of restricted over unrestricted tests. A number of restricted tests possess both sufficiently satisfactory empirical size and power characteristics to allow their recommendation for econometric practice.


Introduction
The primary focus of this article is the provision of tests, relevant for independent cross-sectional data, for the validity of a set of conditional moment constraints in addition to those defining the maintained hypothesis when a finite dimensional parameter vector is the object of inferential interest. Examples include moment conditional homoskedasticity and instrument validity. 1 The main point of departure and principal contribution of the paper is the explicit incorporation of the maintained conditional moment information in the formulation of the test statistics. Thus, our approach mirrors that of the classical parametric likelihood setting by defining restricted tests for these additional conditional moments in contradistinction to unrestricted tests that partially or completely fail to incorporate the maintained moment condition information in their design, with the advantage that the former dominate the latter tests in terms of asymptotic local power, cf. Aitchison (1962). Newey (1985), pp.242-244, and Eichenbaum et al. (1988), Appendix C, pp.74-76, formulate GMM tests of additional unconditional moment constraints fully utilising maintained moment information, gaining a similar local asymptotic power advantage over tests that fail to do so. The framework adopted in this paper is quite general, allowing the parameters defining the additional and maintained conditional moment restrictions to differ and permitting the conditioning variates to differ likewise. The paper also contributes a number of new theoretical results required to address the null and local alternative asymptotic distributions of the test statistics.
The approach taken in the paper exploits an equivalence between conditional moment constraints and a countably infinite number of unconditional restrictions noted elsewhere; see Chamberlain (1987).
Test statistics are consequently defined in terms of an appropriate set of additional infinite unconditional moment conditions. These tests adapt and generalise those of Donald et al. (2003), which approximate conditional moments by an appropriate finite set of unconditional moments. Tests for a finite number of unconditional moment restrictions, cf. inter alia Newey (1985), Eichenbaum et al. (1988) and Ruud (2000) for GMM, Hansen (1982), and Smith (1997, 2011) for generalized empirical likelihood (GEL), see also Kitamura and Stutzer (1997), Imbens et al. (1998) and Newey and Smith (2004), are well known not to be consistent against all alternatives implied by conditional moment conditions; see, e.g., Bierens (1990). GMM and GEL test statistics defined in Donald et al. (2003) circumvent this difficulty by allowing the number of unconditional moments to grow with sample size at an appropriate rate. 2 Likewise here both maintained and null hypothesis conditional moment constraints are approximated by corresponding sets of unconditional moment restrictions with the former a subset of the latter, both of whose dimensions grow with sample size at appropriate rates.
1 Instrument validity tests are the concern of the application in section 6 to a parametric specification of an Engel curve relationship discussed elsewhere in, e.g., Muellbauer (1976), Banks et al. (1997) and, more recently, Blundell and Horowitz (2007). See fn. 15 below.
2 Consistent tests of goodness of fit in regression models have received substantial attention in the literature. See, e.g., Eubank and Spiegelman (1990) for the nonlinear regression context. See also inter alia De Jong and Bierens (1994), Hong and White (1995) and Jayasuriya (1996).
Restricted GMM- and GEL-based test statistics for additional conditional moment restrictions, after location and scale standardization, are asymptotically equivalent and converge in distribution to a standard normal variate under the null hypothesis. Intuitively this result reflects the implicit infinite number of unconditional moments under test since standardised chi-square distributed statistics are asymptotically standard normally distributed when the statistic degrees of freedom diverges to infinity. A similar result is obtained for unrestricted statistics that partially or completely neglect the maintained conditional moment information although the limit standard normal variate differs. 3 Interestingly, unlike finite dimensional test statistics, efficient parameter estimation is no longer required for test implementation. Under a suitable sequence of local alternatives, restricted and unrestricted test statistics are asymptotically non-central standard normally distributed. The non-centrality parameter of the restricted statistics exceeds those of unrestricted statistics, thereby demonstrating the deficiency of these latter tests and mirroring the results for restricted tests in the classical parametric likelihood, Aitchison (1962), and unconditional moment condition, Newey (1985) and Eichenbaum et al. (1988), settings. The asymptotic local power results also indicate that one-sided tests of the additional conditional moment restrictions are apposite.
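The intuition that a standardised chi-square statistic is approximately standard normal when its degrees of freedom are large can be checked numerically. The following is a minimal sketch only; the function names, the choice of 400 degrees of freedom and the number of replications are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def standardise(stat: float, df: int) -> float:
    """Centre by the chi-square mean df and scale by its standard deviation sqrt(2 df)."""
    return (stat - df) / math.sqrt(2.0 * df)


def simulate_standardised(df: int, reps: int, seed: int = 0) -> list[float]:
    """Draw chi-square(df) variates and standardise each one (illustrative helper)."""
    rng = random.Random(seed)
    # A chi-square(df) variate is a Gamma variate with shape df/2 and scale 2.
    return [standardise(rng.gammavariate(df / 2.0, 2.0), df) for _ in range(reps)]


draws = simulate_standardised(df=400, reps=20000)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
# Share of draws above the one-sided 5% standard normal critical value.
share_above = sum(d > 1.645 for d in draws) / len(draws)
print(round(mean, 3), round(var, 3), round(share_above, 3))
```

The sample mean and variance of the standardised draws should be close to 0 and 1, while the upper tail frequency typically remains slightly above 0.05 because of the residual right skewness of the chi-square distribution.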
The paper is organized as follows. Section 2 provides some initial definitions, details the test problem and describes moment conditional homoskedasticity and instrument validity examples that are used throughout the paper. GMM and GEL restricted test statistics are then specified in section 3; an initial discussion presents the equivalence between conditional moment restrictions and an appropriately defined infinite set of unconditional moment constraints together with the assumptions that underpin the analysis in the paper. Section 4 provides the limiting distributions of these and unrestricted statistics under the null hypothesis of the additional conditional moment validity; the large sample independence of the restricted test statistics and GMM and GEL test statistics for the maintained hypothesis is shown, which thus permits the overall test size of a sequential test of the maintained and then additional conditional moment restrictions to be controlled. Section 5 considers the local asymptotic behaviour of the restricted and unrestricted test statistics, demonstrating the one-sided nature of the tests and the relative deficiency of the latter tests. Section 6 presents a set of simulation results on the size and power of the test statistics based on an application to a parametric specification of an Engel curve relationship. Section 7 concludes. Proofs of the results in the text and certain subsidiary lemmata are given in Appendix A and the Supplement to the paper.
The paper uses the generic subscript notation "m" and "a" to denote quantities associated with the maintained hypothesis and additional moment constraints. Conditional moment indicator vectors are denoted by $u(\cdot, \cdot)$ of generic dimension $J$, with parameter vector $\beta$ of dimension $p$ and associated parameter space $\mathcal{B}$; instrument vectors are denoted as $s$ with dimension $d$. The abbreviations a.s., for some $\beta_{a0} \in \mathcal{B}_a$. Here the moment indicator vector $u_a(z, \beta_a)$ denotes a $J_a$-vector of known functions of $z$ and the unknown $p_a$-vector of parameters $\beta_a$ with $\mathcal{B}_a$ the corresponding parameter space and $s_a$ an observable $d_a$-vector of instruments. Together the parameter vectors $\beta_{m0}$ and $\beta_{a0}$ constitute the objects of inferential interest. Note that $\beta_a$ may or may not be coincident with the maintained hypothesis parameter vector $\beta_m$. Likewise, the notation $s_a$ for the instrument vector defining the additional conditional moment constraints (2.2) explicitly permits circumstances in which the maintained instruments $s_m$ may or may not be strictly included in the additional instruments $s_a$ or vice-versa. 4

Test Problem
The maintained hypothesis is given by the conditional moment constraint $E[u_m(z, \beta_{m0}) \mid s_m] = 0$ (2.1) and is assumed to hold throughout. Nonparametric components are excluded from the moment indicator vector definitions. The theoretical analysis of the paper could in principle be extended to deal with such models; see, e.g., Pouzo (2009, 2012). for some $S_a$ with non-zero probability content.

Examples
Example 2.1 (Conditional Homoskedasticity). This example concerns the conditional homoskedasticity of the maintained conditional moment indicator vector $u_m(z, \beta_m)$; hence the maintained hypothesis and additional instrument vectors are identical, i.e., $s_m = s_a$. The additional conditional moment indicator is defined by . Therefore the null hypothesis may be expressed as Hansen et al. (1996), uses the inverse of the sample moment matrix $\sum_{i=1}^n s_{mi} s_{mi}' (y_i - \beta_m x_i)^2 / n$ as metric whereas, under conditional homoskedasticity, the LIML metric, i.e., the inverse of $\sigma^2_n(\beta_m)$
Example 2.2 (Instrument Validity). In this example both maintained and additional conditional moment indicators coincide, i.e., $u_m(z, \beta_m) = u_a(z, \beta_a)$ with $\beta_m = \beta_a$ and, thus, $J_a = J_m$. The issue here is the validity of the additional instrument vector $s_a$. The null hypothesis is therefore defined by

Approximating Conditional Moment Restrictions
Conditional moment constraints of the form (2.1) and (2.2) are equivalent to a countable number of unconditional moment restrictions under certain regularity conditions; see Chamberlain (1987). The following assumption, Assumption 1, p.58, of Donald et al. (2003), henceforth DIN, provides precise conditions. The discussion is initially framed for a generic vector of instruments $s$ and moment indicator vector $u(z, \beta)$.
Assumption 3.1. For all $K$, $E[q^K(s)' q^K(s)]$ is finite and for any $a(s)$ with $E[a(s)^2] < \infty$ there are $K$-vectors $\gamma_K$ such that, as $K \to \infty$, $E[(a(s) - q^K(s)'\gamma_K)^2] \to 0$.
Possible approximating functions which satisfy Assumption 3.1 are splines, power series and Fourier series. See inter alia DIN, Newey (1997) and Powell (1981) for further discussion.
DIN defines the unconditional moment indicator vector as $u(z, \beta) \otimes q^K(s)$. By considering the moment conditions $E[u(z, \beta_0) \otimes q^K(s)] = 0$, if $K$ approaches infinity at an appropriate rate, dependent on the sample size $n$ and the estimation method, EL, IV, GMM or GEL, DIN demonstrates that under certain conditions these estimators are consistent and achieve the semi-parametric efficiency lower bound. To do so, however, requires the imposition of a normalization condition on the approximating functions, DIN Assumption 2, p.59, which now follows. Let $S$ denote the support of the random vector $s$.
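Assumption 3.2. For each $K$ there is a constant scalar $\zeta(K)$ and matrix $B_K$ such that $\tilde{q}^K(s) = B_K q^K(s)$ for all $s \in S$, $\sup_{s \in S} \|\tilde{q}^K(s)\| \le \zeta(K)$, $E[\tilde{q}^K(s)\tilde{q}^K(s)']$ has smallest eigenvalue bounded away from zero uniformly in $K$ and $\sqrt{K} \le \zeta(K)$.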
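To make the construction of the unconditional moment indicator vector concrete, the following minimal sketch forms the Kronecker product of a residual vector with a vector of power series approximating functions and averages it over the sample. A scalar instrument, the power series basis and all function names are assumptions made purely for illustration; they are not the implementation used in the paper.

```python
def power_series(s: float, K: int) -> list[float]:
    """First K power-series approximating functions: 1, s, ..., s**(K-1)."""
    return [s ** k for k in range(K)]


def kron(u: list[float], q: list[float]) -> list[float]:
    """Kronecker product of two vectors, ordered as (u_1 q, u_2 q, ...)."""
    return [u_j * q_k for u_j in u for q_k in q]


def sample_moments(residuals: list[list[float]], instruments: list[float], K: int) -> list[float]:
    """Sample average of the Kronecker product of u_i and q^K(s_i) over i = 1, ..., n."""
    n = len(residuals)
    total = [0.0] * (len(residuals[0]) * K)
    for u_i, s_i in zip(residuals, instruments):
        for j, g in enumerate(kron(u_i, power_series(s_i, K))):
            total[j] += g
    return [t / n for t in total]


# Two observations with offsetting residuals at the same instrument value.
print(sample_moments([[1.0], [-1.0]], [2.0, 2.0], K=3))
```

The length of the resulting vector is the product of the residual dimension and K, which is why the number of unconditional moments grows as K increases with the sample size.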
Hence to formulate a test statistic appropriate for the null hypothesis (2.3) requires that its con-
6 To illustrate the construction of $q^K_m(s_m)$ and $q^{MK}_a(s_a)$ for polynomial approximating functions suppose $s_m$ and $s_a$ have $d_{am}$ elements in common. Let the approximating functions vector $q^K_m(s_m)$ for the maintained conditional moment restrictions (2.1) be a polynomial of order $k_m - 1$ which yields $K = k_m^{d_m}$. Thus $k_m$ could be chosen as $[K^{1/d_m}] + 1$ for given $K$. Similarly let the components of the vector of approximating functions $q^{MK}_a(s_a)$ for the additional conditional moment restrictions (2.2) corresponding to the $d_{am}$ elements in common between $s_m$ and $s_a$ be formed from a polynomial of order $k_a - 1$. Also suppose a polynomial of order $k_a$ excluding the constant term is used for those components corresponding to the $d_a - d_{am}$ unique elements in $s_a$. Then the dimension of the vector of approximating functions $q^{MK}_a(s_a)$ is $k_a^{d_{am}}((k_a + 1)^{d_a - d_{am}} - 1)$. Therefore the order of the dimension of $q^{MK}_a(s_a)$ is $k_a^{d_a}$. Examples: (a) ME: $d_{am} = 0$; thus $MK = (k_a + 1)^{d_a} - 1$, e.g., $d_a = 1$, $MK = k_a$. (b) JE: $d_{am} = d_m$; thus $MK = k_a^{d_m}((k_a + 1)^{d_a - d_m} - 1)$, e.g., $d_m = 1$, $d_a = 2$, $MK = k_a^2$. For the general case this suggests choosing $k_a = [(MK)^{1/d_a}] + 1$.

Basic Assumptions and Notation
Let $\beta$ denote the distinct elements of $\beta_m$ and $\beta_a$ with $\beta_0$ and the composite parameter space $\mathcal{B}$ defined similarly with $p$ the number of parameters comprising $\beta$. The vector $s$ collects the distinct elements of the maintained and additional instrument vectors $s_m$ and $s_a$. Also let $u(z, \beta)$ and $q^K(s)$ denote the non-redundant elements of $u_m(z, \beta_m)$ and $u_a(z, \beta_a)$ and $q^K_m(s_m)$ and $q^{MK}_a(s_a)$ respectively. It will be helpful to define a number of f.r.r. selection matrices $S^u_m$, $S^u_a$ and $S^q_m$, $S^q_a$; viz., $S^q_a$ are both f.r.r. selection matrices. Importantly for the theoretical analysis underpinning the results in the paper, the unconditional forms of moment indicator vectors corresponding to the maintained and null hypotheses, cf. (3.1) and (3.3), may be expressed as $S_m(u(z, \beta) \otimes q^K(s))$ and . The unconditional form of the moment indicator vector corresponding to the null hypothesis with that for the maintained hypothesis expressed as $S_m(u(z, \beta) \otimes q^K(s)) = u_m(z, \beta_m) \otimes q^K_m(s_m)$, $K \to \infty$.
with that for the maintained hypothesis given by Standard conditions are imposed to derive the limiting distributions of the test statistics discussed below; viz. Unlike DIN Assumption 6(b), p.67, it is unnecessary to impose $E[\sup_{\beta \in \mathcal{B}} \|u(z, \beta)\|^{\gamma}] < \infty$ for some $\gamma > 2$ for GEL; see Guggenberger and Smith (2005). 8
7 The row and column dimensions of the selection matrices $S^q_m$ and $S^q_a$ depend on $K$ but to avoid a burdensome notation this dependence is not made explicit.

Test Statistics
GMM statistics appropriate for tests of maintained and null hypotheses expressed unconditionally in (3.1) and (3.3) take the standard forms and $T^g_{GMM} = n\hat{g}(\hat{\beta})'\hat{\Omega}^{-1}\hat{g}(\hat{\beta})$, (3.5) where $\hat{\beta}_m$ denotes the subvector of $\hat{\beta}$ corresponding to $\beta_m$, where $p - p_m$ is the number of additional parameters in $\beta_a$ defining the additional conditional moment conditions (2.2) as compared with the maintained hypothesis (2.1) parameters $\beta_m$.
8 Supplement Lemma S.1 may be substituted for DIN Lemma A.10, p.82, rendering $\gamma = 2$ sufficient for the succeeding DIN lemmas and theorems concerned with GEL.
9 Nonsmooth moment indicators could be accommodated by appropriately modifying the theoretical analysis. See, e.g., Pouzo (2009, 2012) and Parente and Smith (2011).
Remark 3.4. For fixed and finite $K$, under suitable conditions, GMM, Newey (1985) and Eichenbaum et al. (1988), and GEL, Smith (2011), test statistics for the validity of additional moment restrictions, e.g., $T^{g_m}_{GMM}$ in J r (3.6) mimic those introduced to render chi-square random variates with large degrees of freedom approximately standard normally distributed.
A number of alternative test statistics to GMM-based procedures for a finite number of additional moment restrictions using GEL, Newey and Smith (2004) and Smith (1997, 2011), may be adapted for the framework considered here. As in DIN and Newey and Smith (2004) where $\lambda$ and $\lambda_m = S_m\lambda$ are the corresponding $(J_m + J_a M)K$- and $J_m K$-vectors of Lagrange multipliers associated with the unconditional moment constraints (3.1) and (3.3). Let $\rho_j(v) = \partial^j\rho(v)/\partial v^j$ and $\rho_j = \rho_j(0)$, $(j = 0, 1, 2, \ldots)$, where, without loss of generality, the normalisation $\rho_1 = \rho_2 = -1$ is imposed. 10
10 EL is GEL with $\rho(v) = \log(1 - v)$, Imbens (1997), Qin and Lawless (1994) and Smith (2000). ET is also GEL with
Let $\hat{\Lambda}^{g_m}_n(\beta_m) = \{\lambda_m : \lambda_m' g_{mi}(\beta_m) \in \mathcal{V}, i = 1, \ldots, n\}$ and $\hat{\Lambda}^g_n(\beta) = \{\lambda : \lambda' g_i(\beta) \in \mathcal{V}, i = 1, \ldots, n\}$. Given $\beta$, the respective Lagrange multiplier estimators for $\lambda_m$ and $\lambda$ are defined by The corresponding respective Lagrange multiplier estimators for $\lambda_m$ and $\lambda$ are then defined as $\hat{\lambda}_m = \hat{\lambda}_m(\hat{\beta}_m)$ and $\hat{\lambda} = \hat{\lambda}(\hat{\beta})$, cf. Assumption 3.3(c), Similarly to the restricted GMM statistic J r (3.6), a restricted form of GEL likelihood ratio (LR) statistic for testing the null hypothesis (2.3) against the maintained hypothesis (2.4) may be based on the difference of GEL criterion function (3.7) statistics; viz.
Restricted Lagrange multiplier, score and Wald-type statistics are defined respectively as 11 An additional assumption on the GEL function $\rho(\cdot)$ is required for statistics based on GEL as in DIN Assumption 6, p.67.

Asymptotic Null Distribution
The following theorem provides a statement of the limiting distribution of the restricted GMM statistic J r (3.6) under the null hypothesis H 0 (2.3).
The next result details the limiting properties of the restricted GEL-based statistics for the null hypothesis (2.3) and their relationship to that of the GMM statistic J r (3.6).
Then LR r , LM r , S r and W r converge in distribution to a standard normal random variate. Moreover all of these statistics are asymptotically equivalent to J r .
maintained hypothesis (2.1), cf. LR r (3.8), LM r (3.9) and S r (3.10); i.e., with the score form based on $T^g_{GMM}$ (3.5) By a similar analysis to that used to establish Theorems 4.1 and 4.2 the statistics LR u , LM u and S u converge in distribution to a standard normal random variate and are mutually asymptotically equivalent but not to the restricted statistics above. 13
Remark 4.2. Other forms of unrestricted statistics may also be defined that incorporate the maintained information (2.1) to a lesser extent than restricted statistics, e.g., a GMM statistic solely based on the additional conditional moment restrictions (2.2); viz.
where $T^{g_a}_{GMM} = n\hat{g}_a(\hat{\beta}_a)'\hat{\Omega}_a^{-1}\hat{g}_a(\hat{\beta}_a)$ with $\hat{\beta}_a$ the subvector of $\hat{\beta}$ corresponding to $\beta_a$, $\hat{g}_a(\beta_a) = \sum_{i=1}^n g_{ai}(\beta_a)/n$ and $\hat{\Omega}_a = \sum_{i=1}^n g_{ai}(\hat{\beta}_a) g_{ai}(\hat{\beta}_a)'/n$. GEL forms LR a , LM a and S a follow similarly; cf. (4.1), (4.2) and (4.3) respectively. The proofs of Theorems 4.1 and 4.2 may be adapted to demonstrate that these statistics each converge in distribution to a standard normal random variate and are mutually asymptotically equivalent but not to the restricted statistics or the unrestricted GEL class defined above.
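The quadratic form underlying these GMM statistics can be illustrated with a short sketch. It takes the moment indicators already evaluated at an estimate, forms their sample mean and the uncentred sample second moment matrix, and returns n times the quadratic form. The function names and the pure Python linear solver are assumptions chosen to keep the example self-contained; they are not the code used for the reported simulations.

```python
def solve(a: list[list[float]], b: list[float]) -> list[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(m):
        pivot = max(range(c, m), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(c + 1, m):
            factor = aug[r][c] / aug[c][c]
            for k in range(c, m + 1):
                aug[r][k] -= factor * aug[c][k]
    x = [0.0] * m
    for r in reversed(range(m)):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, m))
        x[r] = (aug[r][m] - tail) / aug[r][r]
    return x


def gmm_statistic(g: list[list[float]]) -> float:
    """n * g_bar' Omega_hat^{-1} g_bar, where row i of g holds the moment indicators for observation i."""
    n, m = len(g), len(g[0])
    g_bar = [sum(g_i[j] for g_i in g) / n for j in range(m)]
    omega = [[sum(g_i[j] * g_i[k] for g_i in g) / n for k in range(m)] for j in range(m)]
    return n * sum(g_j * x_j for g_j, x_j in zip(g_bar, solve(omega, g_bar)))


print(gmm_statistic([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]))
```

Solving the linear system rather than inverting the weighting matrix explicitly is a numerical convenience only; the sketch assumes the sample second moment matrix is nonsingular.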
This section concludes with an asymptotic independence result between the restricted GMM statistic J r for testing (2.3) and the corresponding statistic for testing the maintained hypothesis (2.1); viz. A similar result holds for the associated restricted GEL statistics LR r , LM r , S r and W r and their counterparts for testing (2.1) if the additional assumption $\zeta(K)^2 K^3/n \to 0$ is imposed.
Remark 4.3. The practical import of Theorem 4.3 is that the overall asymptotic size of the test sequence for (2.1) and (2.2) may be controlled, e.g., (a) test (2.1) using J m ; (b) given (2.1), test (2.2) using J r , with overall asymptotic test size $1 - (1 - \alpha_m)(1 - \alpha_a)$, where $\alpha_m$ and $\alpha_a$ are the respective asymptotic sizes of the individual tests in (a) and (b).
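As a worked illustration of the overall size formula in Remark 4.3, the sketch below evaluates it for two individual tests and also inverts it to find the common individual size that delivers a target overall size. The function names and the numerical values are illustrative assumptions.

```python
def overall_size(alpha_m: float, alpha_a: float) -> float:
    """Overall asymptotic size of the two-step sequence under asymptotic independence."""
    return 1.0 - (1.0 - alpha_m) * (1.0 - alpha_a)


def equal_split(alpha: float) -> float:
    """Common individual size for both steps that yields overall size alpha."""
    return 1.0 - (1.0 - alpha) ** 0.5


print(overall_size(0.05, 0.05))   # two individual tests each of size 0.05
print(equal_split(0.05))          # individual size needed for overall size 0.05
```

Two individual tests of size 0.05 give an overall size of 0.0975, slightly below the Bonferroni bound of 0.10.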
Remark 4.4. The asymptotic independence of J r and J m mirrors that of classical and unconditional moment GMM and GEL tests for a sequence of parametric restrictions; see Newey (1985) and Smith (2011). Indeed the unrestricted statistic J u is the sum of suitably rescaled restricted J r and maintained hypothesis J m statistics; cf. the decomposition of standard unrestricted classical or GMM and GEL statistics for parametric restrictions.

Asymptotic Local Power
This section considers the asymptotic distribution of the statistics of the previous sections under a suitable sequence of local alternatives. Critically, this discussion demonstrates the deficiency in terms of asymptotic local power of unrestricted tests which fail to fully incorporate the maintained conditional information (2.1) and thereby the superiority of restricted tests.
The set-up is similar to that in Eubank and Spiegelman (1990) and Hong and White (1995), see also Tripathi and Kitamura (2003), utilising local alternatives to the null hypothesis (2.3) of the form The asymptotic local alternative distributions of the statistics described above are obtained under the following assumption. If additionally Assumption 3.5 is satisfied and $\zeta(K)^2 K^3/n \to 0$, then LR r , LM r , S r and W r are asymptotically equivalent to J r .
Remark 5.2. Since $\mu_r \ge 0$, tests of the null hypothesis H 0 (2.3) based on these statistics should be one-sided. Although not discussed here, a similar analysis to that underpinning DIN Lemma 6.5, p.71, demonstrates the consistency of tests based on the statistics J r , LR r , LM r , S r and W r .
The following corollary to Theorem 5.1 details the limiting distribution of the standard forms of unrestricted statistics LR u (4.1), LM u (4.2) and S u (4.3) under the same local alternative sequence (5.1).
Corollary 5.1. Let Assumptions 3.1-3.4 and 5.1 hold and $\zeta(K)^2 K^2/n \to 0$. Then S u converges in distribution to a $N(\mu_u/\sqrt{2}, 1)$ random variate, where If additionally Assumption 3.5 is satisfied and $\zeta(K)^2 K^3/n \to 0$, then LR u , LM u are asymptotically equivalent to S u .
Remark 5.3. Since $\mu_r > \mu_u$, Corollary 5.1 demonstrates that for fixed $M$ restricted tests dominate the standard unrestricted tests in terms of asymptotic local power. Other unrestricted tests that partially or completely fail to incorporate the maintained conditional moment information (2.1) in their formulation are likewise relatively deficient. For example, using a similar analysis to that for Theorem 5.1, the GMM statistic J a (4.4) and associated GEL statistics LR a , LM a and S a may be shown to converge in distribution under the local alternatives sequence (5.1) to a $N(\mu_a/\sqrt{2}, 1)$ random variable, where $\mu_a$ = E[ (s) 0 S u0 a (S u a (s)S u0 a ) 1 S u a (s)]. Hence $\mu_r - \mu_a \ge 0$. Therefore, tests based on these and other unrestricted statistics are asymptotically less powerful relative to restricted tests.
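The power ranking implied by these limiting distributions can be made concrete with a small calculation. For a one-sided test based on a statistic that is asymptotically normal with unit variance and mean equal to the noncentrality parameter divided by the square root of two, asymptotic local power is one minus the standard normal distribution function evaluated at the critical value less that mean. The sketch below is illustrative only; the function name and the numerical noncentrality values are assumptions and do not come from the paper.

```python
from math import sqrt
from statistics import NormalDist


def local_power(mu: float, alpha: float = 0.05) -> float:
    """Asymptotic local power of a one-sided level-alpha test when the statistic is N(mu / sqrt(2), 1)."""
    std_normal = NormalDist()
    critical_value = std_normal.inv_cdf(1.0 - alpha)
    return 1.0 - std_normal.cdf(critical_value - mu / sqrt(2.0))


# Hypothetical noncentrality parameters for a restricted and an unrestricted test.
for label, mu in [("restricted", 3.0), ("unrestricted", 2.0), ("null", 0.0)]:
    print(label, round(local_power(mu), 3))
```

Because power is increasing in the noncentrality parameter, any test with the larger parameter has the higher asymptotic local power at every level, and a zero parameter returns the nominal level.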
Remark 5.4. Corollary 5.1 also shows that the difference in local asymptotic power between restricted and unrestricted tests declines with increasing $M$ since the noncentrality parameter $\mu_u$ would differ little from $\mu_r$ with consequential similar discriminatory power for both standard unrestricted and restricted tests for local departures from the null hypothesis H 0 (2.3). $M$, for given $n$ and $K$, will depend on the correlation between these extra elements and the conditional expectation $E[u(z, \beta_0) \mid s]$. If this correlation is zero or weak then, although not strictly speaking applicable here, an asymptotic local power analysis for the unconditional moment context would indicate that power should be expected to be diminished since test chi-square degrees of freedom will increase with $M$ but the noncentrality parameter will remain relatively unaltered. Cf. Newey (1985) section 3, pp.238-244, in particular, the discussion following Proposition 6, p.242. If this correlation is strong there will be a trade-off between increases in both degrees of freedom and noncentrality parameter with power potentially enhanced. Simulation evidence reported next in section 6 suggests that for a given sample size $n$ and fixed value of $K$ the correspondence between empirical and nominal test size deteriorates with increasing $M$; a similar deterioration is also observed for size-corrected empirical power but it should be emphasised against specific sets of alternatives.

Simulation Evidence
This section reports the results from a simulation study to assess the performance of some of the tests for ME and JE forms of instrument validity in the linear regression model, see Example 2.2, based on the GMM and GEL statistics developed in previous sections. To provide a realistic setting, the investigation is based on an application to a dataset where the issue of instrument validity is of some interest and importance.
Overall these experiments revealed that nominal size is approximated relatively more closely by the empirical size of (a) the non-standardised tests, see results for restricted tests are reported as they dominate the unrestricted forms in terms of empirical power reflecting their theoretical superiority; see Corollary 5.1. 14 All experiments concern a parametric specification for the Engel curve relationship between the expenditure share of leisure services $y$ and the logarithm of total expenditures $x$ and employ the same data as those in Blundell and Horowitz (2007). These data correspond to a subsample of the household-level observations from the British Family Expenditure Survey and consist of a sample of 1518 married couples with one or two children and an employed head of household. Since many parametric Engel curve specifications are often linear or quadratic in $x$, see, e.g., Muellbauer (1976) and Banks et al. (1997), the experimental basis here is the linear regression model The maintained instrument $s_m$ is the annual income from wages and salaries of the head of household.

Experimental Design
The parameter vector is estimated using the full data set by efficient two step (

Estimators
Efficient estimation methods examined include 2SGMM (gmm) with weight matrix computed as above, continuous updating (cue), empirical likelihood (el) and exponential tilting (et). The subscripts ma, me and je indicate estimation incorporating maintained, ME and JE restrictions respectively.
gmm, cue and et are computed using the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm of MATLAB. EL is more problematic because in some samples for particular BFGS EL estimates $\hat{\beta}_{EL}$ the convex hull condition $\|\sum_{i=1}^n \hat{\pi}^{EL}_i g(z_i, \hat{\beta}_{EL})\| < 10^{-4}$ may not be satisfied where the EL implied probabilities $\hat{\pi}^{EL}_i = 1/[n(1 + \hat{\lambda}_{EL}' g(z_i, \hat{\beta}_{EL}))]$, $(i = 1, \ldots, n)$, and the EL Lagrange multiplier $\hat{\lambda}_{EL} = \hat{\Omega}^{-1}\hat{g}(\hat{\beta}_{EL})$ with $\hat{\Omega} = \sum_{i=1}^n \hat{\pi}^{EL}_i g(z_i, \hat{\beta}_{EL}) g(z_i, \hat{\beta}_{EL})'$ and $\hat{g}(\beta) = \sum_{i=1}^n g(z_i, \beta)/n$; see Newey and Smith (2004) Theorem 2.3, p.224. Hence el is computed using the matElike MATLAB package with the optional Zipsolver package; see Zedlewski (2008). 18 In the case of non-convergence, el is computed employing BFGS applied to the EL dual problem with the Lagrange multiplier obtained using MATLAB code based on Owen (2001) eq. (12.3), p.235. 19 EL estimates obtained via this procedure are only considered to be valid solutions if the convex hull condition is satisfied, otherwise no solution in the convex hull is reported. Note, however, that in the test size and power results reported in sections 6.3 and 6.4 the EL estimates satisfied the convex hull condition in all replications. 20
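The convex hull check described above can be sketched as follows. Given moment indicators evaluated at an estimate and a candidate Lagrange multiplier, the sketch computes the EL implied probabilities and verifies that the probability-weighted moments are numerically zero. The function names, the use of the largest absolute component as the norm and the additional positivity check on the probabilities are assumptions made for illustration; the sketch also assumes that no denominator is exactly zero.

```python
def el_probabilities(g: list[list[float]], lam: list[float]) -> list[float]:
    """EL implied probabilities 1 / (n (1 + lam' g_i)) for i = 1, ..., n."""
    n = len(g)
    return [1.0 / (n * (1.0 + sum(l * g_ij for l, g_ij in zip(lam, g_i)))) for g_i in g]


def convex_hull_ok(g: list[list[float]], lam: list[float], tol: float = 1e-4) -> bool:
    """True if all implied probabilities are positive and the weighted moments are below tol."""
    pi = el_probabilities(g, lam)
    if any(p <= 0.0 for p in pi):
        return False
    m = len(g[0])
    weighted = [sum(p * g_i[j] for p, g_i in zip(pi, g)) for j in range(m)]
    # Assumption: the norm is taken as the largest absolute component.
    return max(abs(w) for w in weighted) < tol


# Two scalar moment indicators, -1 and 2; the multiplier 0.25 reweights them to mean zero.
print(el_probabilities([[-1.0], [2.0]], [0.25]))
print(convex_hull_ok([[-1.0], [2.0]], [0.25]))
```

In this toy example the implied probabilities are 2/3 and 1/3, which sum to one and set the weighted moment to zero, whereas a zero multiplier would leave the weighted moment at the sample mean of 0.5 and fail the check.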

Test Statistics
Restricted tests for ME $E[u \mid x] = 0$ and JE $E[u \mid s_m, x] = 0$ adopt the following notation. The superscripts m and j refer respectively to the ME or JE hypothesis under test with the subscripts cue, el, et referring to which GEL criterion is used to construct the test and, as above, denoting the efficient estimator(s) employed. E.g., the non-standardised restricted GEL LR-type statistic for JE based on EL criteria and estimators is denoted as $LR^j_{el} = 2n(P^g_{el}(\hat{\beta}_{elj}, \hat{\lambda}_{elj}) - P^{g_m}_{el}(\hat{\beta}_{elma}, \hat{\lambda}_{elma}))$, cf. (3.8). LR-type CUE statistics evaluated at null and the maintained hypothesis EL and ET estimators are also computed using the subscript cue(gel) to denote the use of the CUE criterion and GEL estimators, e.g., for JE, $LR^j_{cue(gel)} = 2n(P^g_{cue}(\hat{\beta}_{gelj}, \hat{\lambda}_{gelj}) - P^{g_m}_{cue}(\hat{\beta}_{gelma}, \hat{\lambda}_{gelma}))$. The non-standardised robustified score S and Wald W statistics, see fn. 11, evaluated at the corresponding efficient ma estimator are also examined.
Restricted ME and JE non-standardised test statistics are calibrated against chi-square distributions with $MK$ and $[(MK)^{1/2}]^2$ degrees of freedom respectively. 21
18 matElike, rather than solving the dual EL problem, solves the primal EL problem directly and is chosen as the default algorithm because it is faster on average than BFGS. Both BFGS and matElike solutions are identical if each converges to a solution in the convex hull.
19 el computation requires some care since the EL criterion involves the logarithm function which is undefined for negative arguments. This difficulty is avoided by replacing logarithms with a function that is logarithmic for arguments larger than a small positive constant and quadratic below that threshold. The code is available at http://www-stat.stanford.edu/~owen/empirical/
20 In a preliminary study the convex hull condition was found to be violated for values of $K$ and $M$ larger than those considered here. The adjusted EL estimator of Chen et al. (2008) offers an alternative to EL in such circumstances.
21 A number of asymptotically equivalent test statistics for the maintained hypothesis (2.1) were also investigated. The Durbin (1954)-Wu (1973)-Hausman (1978) test based on an auxiliary regression as described in Davidson and Mackinnon
GEL LM, score and Wald ME and JE test statistics require estimators of the variance matrix $\Omega = E[g(z, \beta_0) g(z, \beta_0)']$ and Jacobian $G = E[\partial g(z, \beta_0)/\partial\beta']$. The estimators considered for $\Omega$ and $G$ are $\hat{\Omega} = n^{-1}\sum_{i=1}^n g(z_i, \hat{\beta}_{gel}) g(z_i, \hat{\beta}_{gel})'$ and $\hat{G} = n^{-1}\sum_{i=1}^n \partial g(z_i, \hat{\beta}_{gel})/\partial\beta'$ where $\hat{\beta}_{gel}$ is the null hypothesis GEL estimator. Additional results are also presented for ME and JE LM tests based on the consistent , $(i = 1, \ldots, n)$. LM statistics based on $\tilde{\Omega}_k$ are denoted $\widetilde{LM}$.
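The modified logarithm mentioned in footnote 19 can be sketched as below. The version shown is one common choice: it equals the logarithm above a threshold and, below it, the quadratic that matches the logarithm's value and first two derivatives at the threshold. The exact quadratic and the threshold value used for the reported results are not stated in the text, so both are assumptions here, as is the function name.

```python
import math


def log_star(z: float, eps: float) -> float:
    """Logarithm for z >= eps; below eps, a quadratic matching log's value, slope and curvature at eps."""
    if z >= eps:
        return math.log(z)
    # Second-order Taylor expansion of log around eps, written in powers of z / eps.
    return math.log(eps) - 1.5 + 2.0 * z / eps - 0.5 * (z / eps) ** 2


eps = 0.1  # illustrative threshold only
print(log_star(1.0, eps), log_star(eps, eps), log_star(-1.0, eps))
```

The extension is finite for negative arguments, so an optimiser can step outside the region where the ordinary logarithm is defined and still be pushed back by the steeply decreasing quadratic.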

Choice of the Number of Instruments
Implementation of the above tests requires a choice of K to employ under the maintained hypothesis.
Because the Donald et al. (2009)

Empirical Size
The results on empirical size reported here correspond to a nominal asymptotic level of 0.05; those results for nominal levels 0.01 and 0.10 are qualitatively similar and are therefore omitted.
6.3.1 ME  Similar general conclusions to those for the ME tests above broadly follow. Interestingly, given M , sample size n and thus K, rejection frequencies are higher than those obtained for the ME hypothesis.

Summary
The empirical size of non-standardised tests more closely approximates nominal size than that of standardised tests. The use of efficient rather than root-n consistent estimators is recommended for test

Conclusions
The primary focus of this article has been concerned with the provision of tests for additional con- The simulation experiments undertaken to explore the efficacy of the various tests proposed in the paper indicate a number of restricted tests possess both sufficiently satisfactory empirical size and power characteristics to allow their recommendation for econometric practice.
The methods proposed in this paper are also relevant for short panel data models with independent cross sections and strictly exogenous instruments. The development of results pertinent for conditional moment constraints involving different instruments in different time periods is the subject of current research; cf. Holtz-Eakin et al. (1988), Arellano and Bond (1991) and Chamberlain (1992).
Finally to prove ng 0 n P ngn J a M K p 2J a M K d ! N (0; 1) it is rst established that ng 0 n (P n P n )g n p 2J a M K = o p (1) where P n = 1 n S 0 m ( mn ) 1 S m with n = E[g i;ng Therefore, noting n = n E[ g i;n g 0 i;n ], from eq.(5.1) Consequently, since Therefore ng 0 n (P n P n )g n p 2J a M K = o p (1) Note that 1=C min ( n ) max ( n ) C because j (A) (B)j kA Bk, j min ( n ) min ( n )j = o (1) and j max ( n ) max ( n )j = o (1). Similarly 1=C min ( mn ) max ( mn ) C: Supplement Lemma S.2 is now invoked to prove Again using c r E[(g 0 i;n ( n ) 1g i;n ) 2 ] 3E[(g 0 i;n ( n ) 1 g i;n ) 2 ] + 12E[(g 0 i;n ( n ) 1 g i;n ) 2 ] + 3E[( g 0 i;n ( n ) 1 g i;n ) 2 ]: Now, for n large enough, E[(g 0 i;n ( n ) 1 g i;n ) 2 ] CE[kg i;n k 4 ]. Since n;0 2 N for n large enough, by Lastly, E[( g 0 i;n ( n ) 1 g i;n ) 2 ] C(K=n 2 )E[k i k 4 kq i k 4 ] C (K) 2 K 2 =n 2 : Hence, E[(g 0 i;n ( n ) 1g i;n ) 2 ] = o p (K p n) as required. Likewise, E[(g 0 i;n S 0 m ( mn ) 1 S mgi;n ) 2 ] = o p (K p n).
Thirdly, P n n P n = P n . Therefore, The conclusion of the theorem for J r then follows.
The proof structure for the restricted GEL statistics LR r , LM r , S r and W r is similar to that for Theorem 4.2 demonstrating their mutual asymptotic equivalence to the GMM statistic J r under the local alternatives (5.1). The proofs for LM r , S r and W r are omitted for brevity.