Random effects regression trees for the analysis of INVALSI data

  • Giulia Vannucci
  • Anna Gottard
  • Leonardo Grilli
  • Carla Rampichini

Mixed or multilevel models exploit random effects to deal with hierarchical data, where statistical units are clustered in groups and cannot be assumed independent. Sometimes the assumption that the response depends linearly on a set of explanatory variables is not plausible, and model specification becomes a challenging task. Regression trees can help capture non-linear effects of the predictors. This method has been extended to clustered data by modelling the fixed effects with a decision tree while accounting for the random effects with a linear mixed model in a separate step (Hajjem et al., 2011; Sela & Simonoff, 2012). Random effects regression trees have been shown to be less sensitive to parametric assumptions and to provide improved predictive power compared with linear models with random effects and regression trees without random effects. We propose a new random effects model, called Tree embedded linear mixed model, in which the regression function is piecewise-linear, consisting of the sum of a tree component and a linear component. This model can deal with non-linear and interaction effects as well as cluster mean dependencies. The proposal is the mixed effects version of semi-linear regression trees (Vannucci, 2019; Vannucci & Gottard, 2019). The model is fitted by an iterative two-stage estimation procedure, in which the fixed and the random effects are jointly estimated. The proposed model allows the effect of a given predictor to be decomposed into within-cluster and between-cluster components. We show via a simulation study and an application to INVALSI data that these extensions improve the predictive performance of the model in the presence of quasi-linear relationships, avoiding overfitting and facilitating interpretability.
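
To make the idea of alternating between a tree for the fixed part and cluster-level random effects more concrete, the following is a minimal sketch in the spirit of the two-step approaches cited above. It is not the Tree embedded linear mixed model itself: the linear component is omitted, the tree is reduced to a single split (a stump), only a random intercept is considered, and the variance ratio `ratio` (residual variance over random-intercept variance) is assumed known rather than estimated. All function and variable names are hypothetical.

```python
from statistics import mean

def fit_stump(x, y):
    """Depth-1 regression tree: choose the threshold on x that minimises squared error.
    Assumes x contains at least two distinct values."""
    best = None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        ml, mr = mean(left), mean(right)
        sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    _, t, ml, mr = best
    return t, ml, mr

def predict_stump(stump, x):
    t, ml, mr = stump
    return [ml if xi <= t else mr for xi in x]

def fit_random_intercept_stump(x, y, cluster, ratio=1.0, n_iter=20):
    """Alternate between the tree and the random intercepts.
    ratio is an ASSUMED known value of sigma_e^2 / sigma_u^2."""
    u = {g: 0.0 for g in set(cluster)}
    for _ in range(n_iter):
        # Stage 1: fit the tree to the response net of the current random intercepts.
        adjusted = [yi - u[g] for yi, g in zip(y, cluster)]
        stump = fit_stump(x, adjusted)
        # Stage 2: update each intercept as a shrunken mean of the cluster residuals,
        # n_j / (n_j + ratio) * mean(residuals) = sum(residuals) / (n_j + ratio).
        fitted = predict_stump(stump, x)
        for g in u:
            res = [yi - fi for yi, fi, gi in zip(y, fitted, cluster) if gi == g]
            u[g] = sum(res) / (len(res) + ratio)
    return stump, u

# Toy data (invented for illustration): a step at x = 3.5 plus a cluster shift of +2 / -2.
x = [1, 2, 3, 4, 5, 6] * 2
cluster = ["A"] * 6 + ["B"] * 6
y = [2, 2, 2, 12, 12, 12, -2, -2, -2, 8, 8, 8]

stump, u = fit_random_intercept_stump(x, y, cluster)
print(stump, u)
```

On this toy data the stump recovers the step, and the two intercepts are shrunk towards zero relative to the true shifts, which is the usual behaviour of predicted random effects. A fuller version would grow a deeper tree, add the linear component, and estimate the variance components at each iteration.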

  • Keywords:
  • Regression trees,
  • Multilevel models,
  • Random effects,
  • Hierarchical data,

Giulia Vannucci

University of Florence, Italy - ORCID: 0000-0003-3569-6274

Anna Gottard

University of Florence, Italy - ORCID: 0000-0002-8246-4962

Leonardo Grilli

University of Florence, Italy - ORCID: 0000-0002-3886-7705

Carla Rampichini

University of Florence, Italy - ORCID: 0000-0002-8519-083X

  1. Arpino, B., Bacci, S., Grilli, L., Guetto, R., Rampichini, C. (2019). Issues in prior achievement adjustment for value added analysis: an application to Invalsi tests in Italian schools. In ASA Conference 2019. Statistics for Health and Well-Being. Book of Short Papers, pp. 17–20.
  2. Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J. (1984). Classification and Regression Trees. Wadsworth, Belmont, CA.
  3. Hajjem, A., Bellavance, F., Larocque, D. (2011). Mixed Effects Regression Trees for Clustered Data. Statistics and Probability Letters, 81, pp. 451–459. DOI: 10.1016/j.spl.2010.12.003
  4. Hajjem, A., Bellavance, F., Larocque, D. (2014). Mixed-effects Random Forest for Clustered Data. Journal of Statistical Computation and Simulation, 84, pp. 1313–1328. DOI: 10.1080/00949655.2012.741599
  5. Miller, P.J., McArtor, D.B., Lubke, G.H. (2017). METBOOST: Exploratory Regression Analysis with Hierarchically Clustered Data. arXiv:1702.03994v1 [stat.ML]
  6. Sela, R.J., Simonoff, J.S. (2012). RE-EM Trees: A Data Mining Approach for Longitudinal and Clustered Data. Machine Learning, 86, pp. 169–207. DOI: 10.1007/s10994-011-5258-3
  7. Snijders, T.A.B., Bosker, R.J. (2012). Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling (2nd ed.). SAGE Publications Ltd.
  8. Vannucci, G. (2019). Interpretable Semilinear Regression Trees. PhD Thesis, FLOrence REsearch repository.
  9. Wermuth, N., Cox, D.R. (1998). On Association Models defined over Independence Graphs. Bernoulli Society for Mathematical Statistics and Probability, 4, pp. 477–495. DOI: 10.2307/3318662
PDF
  • Publication year: 2021
  • Pages: 29-34

XML
  • Publication year: 2021

Chapter information

Chapter title

Random effects regression trees for the analysis of INVALSI data

Authors

Giulia Vannucci, Anna Gottard, Leonardo Grilli, Carla Rampichini

Language

English

DOI

10.36253/978-88-5518-304-8.07

Peer-reviewed work

Publication year

2021

Copyright

© 2021 Author(s)

License

CC BY 4.0

Metadata license

CC0 1.0

Bibliographic information

Book title

ASA 2021 Statistics and Information Systems for Policy Evaluation

Book subtitle

Book of short papers of the opening conference

Editors

Bruno Bertaccini, Luigi Fabbris, Alessandra Petrucci

Peer-reviewed work

Publication year

2021

Copyright

© 2021 Author(s)

License

CC BY 4.0

Metadata license

CC0 1.0

Publisher

Firenze University Press

DOI

10.36253/978-88-5518-304-8

eISBN (pdf)

978-88-5518-304-8

eISBN (xml)

978-88-5518-305-5

Series

Proceedings e report

Series ISSN

2704-601X

Series e-ISSN

2704-5846
