It is likely that all complex behaviors and diseases result from interactions between genetic vulnerabilities and environmental factors. Accurately identifying such gene–environment interactions is of critical importance for genetic research on health and behavior. In a previous article, we proposed a set of models for testing alternative relationships between a phenotype (P) and a putative moderator (M) in twin studies. These include the traditional bivariate Cholesky model, an extension of that model that allows for interactions between M and the underlying influences on P, and a model in which M has a non-linear main effect on P. Here we use simulations to evaluate the type I error rates, power, and performance of the Bayesian Information Criterion (BIC) under a variety of data-generating mechanisms and sample sizes (n = 2,000 and n = 500 twin pairs). In testing the extension of the Cholesky model, false positive rates consistently fell short of the nominal type I error rates (\( \alpha = .10, .05, .01 \)). With adequate sample size (n = 2,000 pairs), the correct model had the lowest BIC value in nearly all simulated datasets. With lower sample sizes, models specifying non-linear main effects were more difficult to distinguish from models containing interaction effects. In addition, we illustrate our approach by examining possible interactions between birthweight and the genetic and environmental influences on child and adolescent anxiety using previously collected data. We found a significant interaction between birthweight and the genetic and environmental influences on anxiety. However, the interaction was accounted for by non-linear main effects of birthweight on anxiety, verifying that interaction effects need to be tested against alternative models.

Stata and Mplus scripts for data generation and model fitting are available from the first author at http://www.waisman.wisc.edu/twinresearch/researchers/vanhullecv.shtml
This study was funded by NIH Grant R21 MH086099 from the National Institute of Mental Health. Infrastructure support was provided by the Waisman Center via a core grant from the National Institute of Child Health and Human Development (P30 HD03352).
Appendix 1
Estimated sample size analysis
We followed the method recommended by Saunders et al. (2003). Assume \( \widehat{E}(\chi^{2}) = Q \) for N simulations, where \( \widehat{E}(\chi^{2}) \) is the mean LRT between \( H_{A} \) and \( H_{0} \) across simulations. Then let \( \hat{k} = \frac{Q - p}{N} \), where p is the degrees of freedom for the LRT. From Q and \( \hat{k} \), we can compute the \( \chi^{2} \) non-centrality parameter for a given sample size \( \bar{N} \) and thereby generate power. Reversing this process, given p and a target power \( (1 - \beta) \), we can, via the non-centrality parameter \( \lambda_{p,1 - \beta} \), compute the sample size \( \bar{N} \) needed to achieve power \( (1 - \beta) \). If we set \( \lambda_{p,1 - \beta} = \hat{k}\bar{N} \), then \( \bar{N} = \lambda_{p,1 - \beta} / \hat{k} \).
The non-centrality parameter was calculated using Stata’s npnchi2 function (StataCorp. 2009. Stata Statistical Software: Release 11. College Station, TX: StataCorp LP): \( \lambda_{p,1 - \beta} = \text{npnchi2}(p, \text{invchi2}(p, 1 - \alpha), 1 - (\beta/100)/Q) \).
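For readers without Stata, the calculation can be sketched in Python using only the standard library. This is a minimal illustration under stated assumptions, not the scripts used in the study: the function names are hypothetical, N is assumed to be the number of twin pairs in each simulated dataset, power is given as a proportion rather than a percentage, and the non-centrality parameter is found by bisection on a series approximation of the noncentral \( \chi^{2} \) distribution, which plays the role of npnchi2.

```python
import math


def reg_lower_gamma(a, x):
    """Regularized lower incomplete gamma P(a, x), summed from its power series."""
    if x <= 0.0:
        return 0.0
    term = math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
    total = term
    n = 1
    while term > 1e-16 * total and n < 10000:
        term *= x / (a + n)
        total += term
        n += 1
    return min(total, 1.0)


def chi2_cdf(x, df):
    """Central chi-square CDF."""
    return reg_lower_gamma(df / 2.0, x / 2.0)


def ncx2_cdf(x, df, lam):
    """Noncentral chi-square CDF as a Poisson(lam / 2) mixture of central CDFs."""
    if lam <= 0.0:
        return chi2_cdf(x, df)
    half = lam / 2.0
    total = 0.0
    for j in range(2000):
        weight = math.exp(-half + j * math.log(half) - math.lgamma(j + 1.0))
        total += weight * chi2_cdf(x, df + 2 * j)
        if j > half and weight < 1e-16:
            break
    return total


def bisect(f, lo, hi, tol=1e-10):
    """Root of f on [lo, hi], assuming f changes sign on the interval."""
    f_lo = f(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def critical_value(df, alpha):
    """Chi-square critical value, the analogue of invchi2(df, 1 - alpha)."""
    return bisect(lambda x: chi2_cdf(x, df) - (1.0 - alpha), 0.0, 10.0 * df + 50.0)


def noncentrality(df, alpha, power):
    """Non-centrality parameter that gives the target power at level alpha."""
    crit = critical_value(df, alpha)
    # The type II error rate is the noncentral CDF at the critical value.
    return bisect(lambda lam: ncx2_cdf(crit, df, lam) - (1.0 - power), 0.0, 200.0)


def required_n(mean_lrt, df, n_sim_pairs, alpha, power):
    """Sample size N-bar = lambda / k-hat, with k-hat = (Q - p) / N."""
    k_hat = (mean_lrt - df) / n_sim_pairs
    return noncentrality(df, alpha, power) / k_hat


if __name__ == "__main__":
    # Hypothetical inputs: mean LRT Q = 6.0 on p = 1 df from datasets of N = 2,000 pairs.
    print(round(required_n(6.0, 1, 2000, alpha=0.05, power=0.80)))
```

The mean LRT must exceed its degrees of freedom for \( \hat{k} \) to be positive; when Q is at or below p, the simulations show no detectable effect and no finite sample size is implied.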
Appendix 2
Appendix 3
Van Hulle, C.A., Lahey, B.B. & Rathouz, P.J. Operating Characteristics of Alternative Statistical Methods for Detecting Gene-by-Measured Environment Interaction in the Presence of Gene–Environment Correlation in Twin and Sibling Studies. Behav Genet 43, 71–84 (2013). https://doi.org/10.1007/s10519-012-9568-4