Model selection and structure specification in ultra-high dimensional generalised semi-varying coefficient models

Research output: Contribution to journal › Article › peer-review


In this paper, we study model selection and structure specification for generalised semi-varying coefficient models (GSVCMs), where the number of potential covariates is allowed to be larger than the sample size. We first propose a penalised likelihood method with the LASSO penalty function to obtain preliminary estimates of the functional coefficients. Then, using a quadratic approximation of the local log-likelihood function and the adaptive group LASSO penalty (or the local linear approximation of the group SCAD penalty), with weights built from the preliminary estimates of the functional coefficients, we introduce a novel penalised weighted least squares procedure that selects the significant covariates and identifies the constant coefficients among the coefficients of the selected covariates, thereby specifying the semiparametric modelling structure. The developed model selection and structure specification approach not only inherits many nice statistical properties from local maximum likelihood estimation and the nonconcave penalised likelihood method, but is also computationally attractive thanks to the algorithm proposed to implement it. Under some mild conditions, we establish asymptotic properties of the proposed model selection and estimation procedure, such as sparsity and the oracle property. We also conduct simulation studies to examine the finite sample performance of the proposed method, and finally apply the method to analyse a real data set, which leads to some interesting findings.
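To illustrate the second step described in the abstract, here is a minimal sketch of how an adaptive group LASSO penalty uses preliminary estimates to decide which coefficient groups are shrunk to zero. It is not the paper's algorithm: the function names, the weight formula (inverse group norm), the small constant `eps`, and the toy numbers are all assumptions made for illustration. Each "group" stands for one functional coefficient evaluated at a few grid points.

```python
import math


def group_norm(v):
    """Euclidean norm of one coefficient group."""
    return math.sqrt(sum(x * x for x in v))


def adaptive_weights(prelim, eps=1e-8):
    """Assumed weighting: inverse norm of each preliminary group estimate.

    Groups with a small preliminary estimate receive a large penalty weight.
    """
    return [1.0 / (group_norm(g) + eps) for g in prelim]


def group_soft_threshold(v, t):
    """Proximal operator of t * ||v||_2: shrinks the whole group at once."""
    n = group_norm(v)
    if n <= t:
        return [0.0] * len(v)  # the entire group is set to zero
    scale = 1.0 - t / n
    return [scale * x for x in v]


def adaptive_group_lasso_prox(groups, prelim, lam):
    """Apply group soft-thresholding with adaptive, per-group thresholds."""
    weights = adaptive_weights(prelim)
    return [group_soft_threshold(g, lam * w) for g, w in zip(groups, weights)]


# Toy preliminary estimates (hypothetical): one clearly non-zero coefficient
# function and one that is nearly zero everywhere on the grid.
prelim = [[2.0, 2.0], [0.01, 0.0]]
shrunk = adaptive_group_lasso_prox(prelim, prelim, lam=0.1)
print(shrunk)
```

In this toy run the first group is shrunk only slightly, while the second is set exactly to zero, which is the mechanism behind covariate selection. A similar group penalty applied to a coefficient's deviations from its average would flag coefficients that are effectively constant; how the paper sets this up is not stated in the abstract.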
Original language: English
Pages (from-to): 2676-2705
Number of pages: 30
Journal: Annals of Statistics
Issue number: 6
Early online date: 7 Oct 2015
Publication status: Published - 6 Dec 2015

Bibliographical note

© 2015, Institute of Mathematical Statistics. Uploaded in accordance with the publisher’s self-archiving policy. Further copying may not be permitted; contact the publisher for details.


  • Local maximum likelihood
  • Oracle estimation
  • SCAD
  • Sparsity
  • Ultra-high dimension
