Publications

Research Interests

  • Item response theory, measurement, psychometrics
  • Causal inference, program evaluation, policy
  • Statistical methodology, computing, simulation
  • Research synthesis, meta-analysis, meta-science

Google Scholar Metrics

Last updated: 2026-06-03 14:06 UTC.

Metric      All   Since 2021
Citations   883   858
h-index     15    15
i10-index   19    19

View full profile

Publications

Notational conventions:

  • †: Student collaborators with whom I engaged extensively during the project.
  • ‡: Paper on which I acted as a senior or corresponding author.
  • *: Joint authorship.
  • ...: Lengthy authorship list (names omitted).
  • [...]: Links to the article, replication materials, and media coverage, where applicable.

Peer-Reviewed Journal Publications

Lead/Sole Author

  • Gilbert, J. B., Young, W. S., Himmelsbach, Z., Ulitzsch, E., and Domingue, B. W. (2026). Conditional dependencies between response time and item discrimination: An item-level meta-analysis. Educational and Psychological Measurement. doi, code
  • Gilbert, J. B. (2026). Explanatory item response models for continuous data: A tutorial in R. Behavior Research Methods, 58, 124. doi, code
  • Gilbert, J. B., Domingue, B. W., and Kim, J. S. (2026). Estimating causal effects on psychological networks using item response theory. Psychological Methods, 31(1), 110-135. doi, code
  • Gilbert, J. B., Soland, J. G., and Domingue, B. W. (2026). The sensitivity of value-added estimates to test scoring decisions. Educational Measurement: Issues and Practice, 45(1). doi, code, policy brief
  • Gilbert, J. B., and Miratrix, L. W. (2026). Multilevel metamodels: Enhancing inference, interpretability, and generalizability in Monte Carlo simulation studies. Multivariate Behavioral Research, 61(2), 287-310. doi, code
  • Gilbert, J. B., Himmelsbach, Z., Miratrix, L. W., Ho, A. D., and Domingue, B. W. (2025). Item-level heterogeneity in value added models: Implications for reliability, cross-study comparability, and effect sizes. Journal of Educational and Behavioral Statistics. doi, code
  • Gilbert, J. B., Himmelsbach, Z., Soland, J., Joshi, M., and Domingue, B. W. (2025). Estimating heterogeneous treatment effects with item-level outcome data: Insights from item response theory. Journal of Policy Analysis and Management, 44(4), 1417-1449. doi, code
  • Gilbert, J. B. (2025). Estimating treatment effects with the explanatory item response model. Journal of Research on Educational Effectiveness, 18(1), 166-184. doi, code
  • Gilbert, J. B. (2025). How measurement affects causal inference: Attenuation bias is (usually) more important than outcome scoring weights. Methodology, 21(2), 91-122. doi, code
  • Gilbert, J. B., Zhang, L., Ulitzsch, E., and Domingue, B. W. (2025). Polytomous explanatory item response models for item discrimination: Assessing negative-framing effects in social-emotional learning surveys. Behavior Research Methods, 57, 109. doi, code
  • Gilbert, J. B., Miratrix, L. W., Joshi, M., and Domingue, B. W. (2025). Disentangling person-dependent and item-dependent causal effects: Applications of item response theory to the estimation of treatment effect heterogeneity. Journal of Educational and Behavioral Statistics, 50(1), 72-101. doi, code
  • Gilbert, J. B., Kim, J. S., and Miratrix, L. W. (2024). Leveraging item parameter drift to assess transfer effects in vocabulary learning. Applied Measurement in Education, 37(3), 240-257. doi, code
  • Gilbert, J. B., Hieronymus, F., Eriksson, E., and Domingue, B. W. (2024). Item-level heterogeneous treatment effects of Selective Serotonin Reuptake Inhibitors (SSRIs) on depression: Implications for inference, generalizability, and identification. Epidemiologic Methods, 13(S2), 20240006, 1-17. doi, code
  • Gilbert, J. B. (2024). Modeling item-level heterogeneous treatment effects: A tutorial with the glmer function from the lme4 package in R. Behavior Research Methods, 56(5), 5055-5067. doi, code
  • Gilbert, J. B., Kim, J. S., and Miratrix, L. W. (2023). Modeling item-level heterogeneous treatment effects with the explanatory item response model: Leveraging large-scale online assessments to pinpoint the impact of educational interventions. Journal of Educational and Behavioral Statistics, 48(6), 889-913. doi, code

Co-authored (Education/Quantitative Methods)

  • Nalbandyan, R., Gilbert, J. B., Franco, V. R., and Domingue, B. W. (2026). Signposts on the path from nominal to ordinal scales: Moving from a discrete to a continuous view. Educational and Psychological Measurement. doi, code
  • Domingue, B. W., Braginsky, M., Caffrey-Maffei, L. A., Gilbert, J. B. … and Frank, M. C. (2025). An introduction to the Item Response Warehouse (IRW): A resource for enhancing data usage in psychometrics. Behavior Research Methods, 57, 276. doi
  • Relyea, J. E., Gilbert, J. B., … and Kim, J. S. (2025). Asset-based implementation of structured adaptations in an online third-grade content literacy intervention. Reading Research Quarterly, 60(4). doi, code, EWP Policy and Practice
  • Kim, J. S., Gilbert, J. B., … and Tvedt, J. N. (2024). Time to transfer: Long-term effects of a sustained and spiraled content literacy intervention in the elementary grades. Developmental Psychology, 60(7), 1279-1297. doi, code, Harvard Ed. Magazine, NBC News, Hechinger Report, MetaMetrics, American Educator, Minding the Gap, Harvard Magazine
  • Relyea, J. E., Rich, P., Kim, J. S., and Gilbert, J. B. (2023). The COVID-19 impact on reading achievement growth of Grade 3-5 students in a US urban school district: variation across student characteristics and instructional modalities. Reading and Writing, 36(2), 317-346. doi, Science Daily
  • Kim, J. S., Burkhauser, M. A., Relyea, J. E., Gilbert, J. B., … and McIntyre, J. (2023). A longitudinal randomized trial of a sustained content literacy intervention from first to second grade: Transfer effects on students’ reading comprehension. Journal of Educational Psychology, 115(1), 73-96. doi, code, supplement, Forbes, Education Week
  • Kim, J., Gilbert, J., Yu, Q., and Gale, C. (2021). Measures matter: A meta-analysis of the effects of educational apps on preschool to grade 3 children’s literacy and math skills. AERA Open, 7, 23328584211004183. doi, code, New York Times
  • Scripp, L., and Gilbert, J. (2016). Music plus music integration: A model for music education policy reform that reflects the evolution and success of arts integration practices in 21st century American public schools. Arts Education Policy Review, 117(4), 186-202. doi

Co-authored (Health Science/Biostatistics)

  • Berkowitz, S. T., Gilbert, J. B., … and Finn, A. P. (Accepted). Characterization of intravitreal anti-VEGF injection transitions across bevacizumab, biosimilar, and branded medications: IRIS (Intelligent Research in Sight) Registry study. Journal of VitreoRetinal Diseases.
  • Ghauri, S., Ross, C., Gilbert, J. B. … and Krzystolik, M. G. (2026). Timing and determinants of post-injection endophthalmitis after first-time anti-VEGF administration: A retrospective national study in the American Academy of Ophthalmology IRIS Registry. Ophthalmology Retina. doi
  • Hoyek, S., Gilbert, J. B. … and Patel, N. A. (2026). Intraocular pressure changes following vitrectomy with and without phacoemulsification: An IRIS Registry analysis. Canadian Journal of Ophthalmology, 61(2), 300-308. doi
  • Nestorova, T., Nestorov, I., Gilbert, J. B., and Howell, I. (2025). Does vibrato define genre or vice versa? A novel parametric approach to vibrato analysis. Journal of Voice. doi
  • Tainsh, L., Douglas, V. P., Gilbert, J. B. … and Lorch, A. (2025). Patient and practice level visual acuity prior to cataract surgery: An IRIS Registry (Intelligent Research in Sight) analysis. Clinical Ophthalmology, 19, 4975-4987. doi
  • Vu, D. M., Gilbert, J. B., … and Miller, J. W. (2025). Factors associated with gonioscopy coding before glaucoma procedures in the IRIS Registry. American Journal of Ophthalmology, 279, 253-263. doi
  • Zidan, A. A., Gilbert, J. B., … and Yin, J. (2025). Neurotrophic keratopathy in pediatric population: An IRIS Registry report. Ophthalmology, 132(9), 1063-1066. doi
  • Ross, C., Ghauri, S., Gilbert, J. B. … and Krzystolik, M. G. (2025). Intravitreal antibiotics versus early vitrectomy plus intravitreal antibiotics for post-injection endophthalmitis: An IRIS registry (Intelligent Research in Sight) analysis. Ophthalmology Retina, 9(3), 224-231. doi
  • Zidan, A. A., Gilbert, J. B. … and Yin, J. (2025). Nerve growth factor treatment for neurotrophic keratopathy in the IRIS registry. Ophthalmology, 132(3), 368-370. doi
  • Ha, S. K.*, Gilbert, J. B.*, Le, E., Ross, C., and Lorch, A. (2025). Impact of teleretinal screening program on diabetic retinopathy screening compliance rates in community health centers: a quasi-experimental study. BMC Health Services Research, 25, 318. doi, code

Conference Papers

  • Student, S. R., Gilbert, J. B., Eze, J., Young, W. S., and Domingue, B. W. (Accepted). Item-level heterogeneous treatment effects in instrumental variables regression. Proceedings of the International Meeting of the Psychometric Society. doi
  • Isley, C., Gilbert, J. B. … and Goel, S. (2026). Assessing the quality of AI-generated exams: A large-scale field study. Proceedings of the AAAI Conference on Artificial Intelligence, 40(45), 38626-38634. doi
  • Kakarla, S.†, Yanney, L.†, Gilbert, J. B.‡, and Domingue, B. W.‡ (2025). Exploring item-level heterogeneous treatment effects in educational interventions through machine learning techniques and item response models. 2024 IEEE MIT Undergraduate Research Technology Conference (URTC), Cambridge, MA, USA. doi

Book Chapters

  • Scripp, L., and Gilbert, J. (2019). Human development through music. In L. Scripp and B. Kaufman (Eds.), Music learning as youth development (pp. 8-39). Routledge. link

Working Papers

Manuscripts in Revision

  • Gilbert, J. B., and Kim, J. S. (2026). Mapping the mechanisms of interdisciplinary learning transfer from reading to math achievement: Evidence from a large-scale randomized controlled trial. Under revision in Developmental Psychology. doi, The Bell Ringer
  • Ashourizadeh, A.*, Gilbert, J. B.* … and Armstrong, G. W. (2026). Impact of cataract surgery on the risk of conversion from dry to neovascular age-related macular degeneration in the IRIS Registry. Under revision in Ophthalmology. doi
  • Burkhauser, M. A., Kim, J. S., Mosher, D., Scherer, E., Gilbert, J. B., Tvedt, J., and Grob, L. (2026). Empower the few to reach the many: An implementation strategy to support adoption, spread, and reform ownership of a first-grade content literacy program. Under revision in Scientific Studies of Reading.
  • Gilbert, J. B., and Soland, J. (2024). Mechanisms of effect size differences between researcher developed and independently developed outcomes: An item-level meta-analysis. Under revision in Multivariate Behavioral Research. doi
  • Armstrong-Carter, E., Gilbert, J. B., Silfverskiold, T., Clark, L. E., and Templin, T. (2024). Adolescents provide approximately $15 billion worth of informal family caregiving services to the United States economy. Under revision in Children and Youth Services Review. doi
  • Halpin, P. F., and Gilbert, J. B. (2024). Testing whether reported treatment effects are unduly influenced by item-level heterogeneity. Under revision in Journal of Research on Educational Effectiveness. doi

Manuscripts Under Review

  • Ashourizadeh, A.*, Gilbert, J. B.* … and Armstrong, G. W. (2025). Visual outcomes of cataract surgery in patients with age-related macular degeneration. Under review in Ophthalmology.
  • Gilbert, J. B., Kim, E. J., Himmelsbach, Z., Ulitzsch, E., and Zhang, L. (2026). Idiographic item response theory: Modeling person-specific differential item functioning in intensive longitudinal data. Under review in Multivariate Behavioral Research. doi
  • Whitcomb, N., Gilbert, J. B., Ross, C., Kearney, W., Lorch, A., and Lee, H. J. (2026). Comparative surgical outcomes and predictors of surgical intervention in recurrent corneal erosion: An IRIS Registry study. Under review in American Journal of Ophthalmology.
  • Soland, J., and Gilbert, J. B. (2026). Does socially desirable responding increase after an intervention? Implications for estimating treatment effects. Under review in Journal of Experimental Education. doi
  • Gilbert, J. B.*, Soland, J. G.*, and Young, W. S. (2026). Is the replication crisis a measurement crisis? Evidence from over 100 randomized trial outcomes. Under review in Nature. doi
  • Cho, S-J., and Gilbert, J. B. (2026). Explanatory item response models with random item parameters and their applications. (Book Chapter)
  • Veltri, G. A.*, and Gilbert, J. B.* (2026). Results from randomized controlled trials are highly sensitive to data preprocessing decisions: A multiverse analysis of 97 outcomes. Under review in Advances in Methods and Practice in Psychological Science. doi
  • Dahrouj, M., Awh, C., Bleicher, I. D., Gilbert, J. B. … and Singh, R. P. (2026). Preoperative vision as a predictor of surgical outcomes for idiopathic epiretinal membranes: an IRIS Registry study. Under review in Ophthalmology.
  • Gilbert, J. B., and Himmelsbach, Z. (2026). Why fadeout is (probably) worse than we think: Adjusting for correlated sampling error in meta-analyses of behavioral interventions. Under review in Psychological Methods. doi
  • Gilson, A., Vonsachang, H., Gilbert, J. B. … and Lindsay, J. L. (2025). Rates of laser peripheral iridotomy vs. lens extraction in the treatment of angle closure glaucoma: an IRIS registry (Intelligent Research in Sight) analysis. Under review in Ophthalmology.
  • Zhang, L., Liu, Y., Molenaar, D., Gilbert, J. B., Kanopka, K., and Domingue, B. W. (2025). Realistic simulation of item difficulties. Under review in Behavior Research Methods. doi

Manuscripts in Preparation

  • Himmelsbach, Z., and Gilbert, J. B. (2026). When within-site randomization targets the wrong policy estimand: Implementer concentration in multisite trials.
  • Hardy, M., Gilbert, J. B., and Domingue, B. W. (2026). Efficient detection of bad benchmark items with novel scalability coefficients. doi
  • Student, S. R., Eze, J., Gilbert, J. B., Young, W. S., and Domingue, B. W. (2025). Expanding psychology’s causal toolkit: Latent outcomes and binary treatments in instrumental variables regression.
  • Himmelsbach, Z., and Gilbert, J. B. (2025). The case for Bayesian estimation of the D-study.