Research Interests
- Item response theory, measurement, psychometrics
- Causal inference, program evaluation, policy
- Statistical methodology, computing, simulation
- Research synthesis, meta-analysis, meta-science
Google Scholar Metrics
Last updated: 2026-06-03 14:06 UTC.
| Citations | 883 | 858 |
| h-index | 15 | 15 |
| i10-index | 19 | 19 |
View full profile
Publications
Notational conventions:
†: Student collaborators with whom I engaged extensively during the project.
‡: Paper on which I acted as a senior or corresponding author.
*: Joint authorship.
...: Lengthy authorship list (names omitted).
[...]: Links to the article, replication materials, and media coverage, where applicable.
Peer-Reviewed Journal Publications
Lead/Sole Author
- Gilbert, J. B., Young, W. S., Himmelsbach, Z., Ulitzsch, E., and Domingue, B. W. (2026). Conditional dependencies between response time and item discrimination: An item-level meta-analysis. Educational and Psychological Measurement. doi, code
- Gilbert, J. B. (2026). Explanatory item response models for continuous data: A tutorial in R. Behavior Research Methods, 58, 124. doi, code
- Gilbert, J. B., Domingue, B. W., and Kim, J. S. (2026). Estimating causal effects on psychological networks using item response theory. Psychological Methods, 31(1), 110-135. doi, code
- Gilbert, J. B., Soland, J. G., and Domingue, B. W. (2026). The sensitivity of value-added estimates to test scoring decisions. Educational Measurement: Issues and Practice, 45(1). doi, code, policy brief
- Gilbert, J. B., and Miratrix, L. W. (2026). Multilevel metamodels: Enhancing inference, interpretability, and generalizability in Monte Carlo simulation studies. Multivariate Behavioral Research, 61(2), 287-310. doi, code
- Gilbert, J. B., Himmelsbach, Z., Miratrix, L. W., Ho, A. D., and Domingue, B. W. (2025). Item-level heterogeneity in value added models: Implications for reliability, cross-study comparability, and effect sizes. Journal of Educational and Behavioral Statistics. doi, code
- Gilbert, J. B., Himmelsbach, Z., Soland, J., Joshi, M., and Domingue, B. W. (2025). Estimating heterogeneous treatment effects with item-level outcome data: Insights from item response theory. Journal of Policy Analysis and Management, 44(4), 1417-1449. doi, code
- Gilbert, J. B. (2025). Estimating treatment effects with the explanatory item response model. Journal of Research on Educational Effectiveness, 18(1), 166-184. doi, code
- Gilbert, J. B. (2025). How measurement affects causal inference: Attenuation bias is (usually) more important than outcome scoring weights. Methodology, 21(2), 91-122. doi, code
- Gilbert, J. B., Zhang, L., Ulitzsch, E., and Domingue, B. W. (2025). Polytomous explanatory item response models for item discrimination: Assessing negative-framing effects in social-emotional learning surveys. Behavior Research Methods, 57, 109. doi, code
- Gilbert, J. B., Miratrix, L. W., Joshi, M., and Domingue, B. W. (2025). Disentangling person-dependent and item-dependent causal effects: Applications of item response theory to the estimation of treatment effect heterogeneity. Journal of Educational and Behavioral Statistics, 50(1), 72-101. doi, code
- Gilbert, J. B., Kim, J. S., and Miratrix, L. W. (2024). Leveraging item parameter drift to assess transfer effects in vocabulary learning. Applied Measurement in Education, 37(3), 240-257. doi, code
- Gilbert, J. B., Hieronymus, F., Eriksson, E., and Domingue, B. W. (2024). Item-level heterogeneous treatment effects of Selective Serotonin Reuptake Inhibitors (SSRIs) on depression: Implications for inference, generalizability, and identification. Epidemiologic Methods, 13(S2), 20240006, 1-17. doi, code
- Gilbert, J. B. (2024). Modeling item-level heterogeneous treatment effects: A tutorial with the glmer function from the lme4 package in R. Behavior Research Methods, 56(5), 5055-5067. doi, code
- Gilbert, J. B., Kim, J. S., and Miratrix, L. W. (2023). Modeling item-level heterogeneous treatment effects with the explanatory item response model: Leveraging large-scale online assessments to pinpoint the impact of educational interventions. Journal of Educational and Behavioral Statistics, 48(6), 889-913. doi, code
Co-authored (Education/Quantitative Methods)
- Nalbandyan, R., Gilbert, J. B., Franco, V. R., and Domingue, B. W. (2026). Signposts on the path from nominal to ordinal scales: Moving from a discrete to a continuous view. Educational and Psychological Measurement. doi, code
- Domingue, B. W., Braginsky, M., Caffrey-Maffei, L. A., Gilbert, J. B. … and Frank, M. C. (2025). An introduction to the Item Response Warehouse (IRW): A resource for enhancing data usage in psychometrics. Behavior Research Methods, 57, 276. doi
- Relyea, J. E., Gilbert, J. B., … and Kim, J. S. (2025). Asset-based implementation of structured adaptations in an online third-grade content literacy intervention. Reading Research Quarterly, 60(4). doi, code, EWP Policy and Practice
- Kim, J. S., Gilbert, J. B., … and Tvedt, J. N. (2024). Time to transfer: Long-term effects of a sustained and spiraled content literacy intervention in the elementary grades. Developmental Psychology, 60(7), 1279-1297. doi, code, Harvard Ed. Magazine, NBC News, Hechinger Report, MetaMetrics, American Educator, Minding the Gap, Harvard Magazine
- Relyea, J. E., Rich, P., Kim, J. S., and Gilbert, J. B. (2023). The COVID-19 impact on reading achievement growth of Grade 3-5 students in a US urban school district: variation across student characteristics and instructional modalities. Reading and Writing, 36(2), 317-346. doi, Science Daily
- Kim, J. S., Burkhauser, M. A., Relyea, J. E., Gilbert, J. B., … and McIntyre, J. (2023). A longitudinal randomized trial of a sustained content literacy intervention from first to second grade: Transfer effects on students’ reading comprehension. Journal of Educational Psychology, 115(1), 73-96. doi, code, supplement, Forbes, Education Week
- Kim, J., Gilbert, J., Yu, Q., and Gale, C. (2021). Measures matter: A meta-analysis of the effects of educational apps on preschool to grade 3 children’s literacy and math skills. AERA Open, 7, 23328584211004183. doi, code, New York Times
- Scripp, L., and Gilbert, J. (2016). Music plus music integration: A model for music education policy reform that reflects the evolution and success of arts integration practices in 21st century American public schools. Arts Education Policy Review, 117(4), 186-202. doi
Co-authored (Health Science/Biostatistics)
- Berkowitz, S. T., Gilbert, J. B., … and Finn, A. P. (Accepted). Characterization of intravitreal anti-VEGF injection transitions across bevacizumab, biosimilar, and branded medications: IRIS (Intelligent Research in Sight) Registry study. Journal of VitreoRetinal Diseases.
- Ghauri, S., Ross, C., Gilbert, J. B. … and Krzystolik, M. G. (2026). Timing and determinants of post-injection endophthalmitis after first-time anti-VEGF administration: A retrospective national study in the American Academy of Ophthalmology IRIS Registry. Ophthalmology Retina. doi
- Hoyek, S., Gilbert, J. B. … and Patel, N. A. (2026). Intraocular pressure changes following vitrectomy with and without phacoemulsification: An IRIS Registry analysis. Canadian Journal of Ophthalmology, 61(2), 300-308. doi
- Nestorova, T., Nestorov, I., Gilbert, J. B., and Howell, I. (2025). Does vibrato define genre or vice versa? A novel parametric approach to vibrato analysis. Journal of Voice. doi
- Tainsh, L., Douglas, V. P., Gilbert, J. B. … and Lorch, A. (2025). Patient and practice level visual acuity prior to cataract surgery: An IRIS Registry (Intelligent Research in Sight) analysis. Clinical Ophthalmology, 19, 4975-4987. doi
- Vu, D. M., Gilbert, J. B., … and Miller, J. W. (2025). Factors associated with gonioscopy coding before glaucoma procedures in the IRIS Registry. American Journal of Ophthalmology, 279, 253-263. doi
- Zidan, A. A., Gilbert, J. B., … and Yin, J. (2025). Neurotrophic keratopathy in pediatric population: An IRIS Registry report. Ophthalmology, 132(9), 1063-1066. doi
- Ross, C., Ghauri, S., Gilbert, J. B. … and Krzystolik, M. G. (2025). Intravitreal antibiotics versus early vitrectomy plus intravitreal antibiotics for post-injection endophthalmitis: An IRIS registry (Intelligent Research in Sight) analysis. Ophthalmology Retina, 9(3), 224-231. doi
- Zidan, A. A., Gilbert, J. B. … and Yin, J. (2025). Nerve growth factor treatment for neurotrophic keratopathy in the IRIS registry. Ophthalmology, 132(3), 368-370. doi
- Ha, S. K.*, Gilbert, J. B.*, Le, E., Ross, C., and Lorch, A. (2025). Impact of teleretinal screening program on diabetic retinopathy screening compliance rates in community health centers: a quasi-experimental study. BMC Health Services Research, 25, 318. doi, code
Conference Papers
- Student, S. R., Gilbert, J. B., Eze, J., Young, W. S., and Domingue, B. W. (Accepted). Item-level heterogeneous treatment effects in instrumental variables regression. Proceedings of the International Meeting of the Psychometric Society. doi
- Isley, C., Gilbert, J. B. … and Goel, S. (2026). Assessing the quality of AI-generated exams: A large-scale field study. Proceedings of the AAAI Conference on Artificial Intelligence, 40(45), 38626-38634. doi
- Kakarla, S.†, Yanney, L.†, Gilbert, J. B.‡, and Domingue, B. W.‡ (2025). Exploring item-level heterogeneous treatment effects in educational interventions through machine learning techniques and item response models. 2024 IEEE MIT Undergraduate Research Technology Conference (URTC), Cambridge, MA, USA. doi
Book Chapters
- Scripp, L., and Gilbert, J. (2019). Human development through music. In L. Scripp and B. Kaufman (Eds.), Music learning as youth development (pp. 8-39). Routledge. link
Working Papers
Manuscripts in Revision
- Gilbert, J. B., and Kim, J. S. (2026). Mapping the mechanisms of interdisciplinary learning transfer from reading to math achievement: Evidence from a large-scale randomized controlled trial. Under revision in Developmental Psychology. doi, The Bell Ringer
- Ashourizadeh, A.*, Gilbert, J. B.* … and Armstrong, G. W. (2026). Impact of cataract surgery on the risk of conversion from dry to neovascular age-related macular degeneration in the IRIS Registry. Under revision in Ophthalmology. doi
- Burkhauser, M. A., Kim, J. S., Mosher, D., Scherer, E., Gilbert, J. B., Tvedt, J., and Grob, L. (2026). Empower the few to reach the many: An implementation strategy to support adoption, spread, and reform ownership of a first-grade content literacy program. Under revision in Scientific Studies of Reading.
- Gilbert, J. B., and Soland, J. (2024). Mechanisms of effect size differences between researcher developed and independently developed outcomes: An item-level meta-analysis. Under revision in Multivariate Behavioral Research. doi
- Armstrong-Carter, E., Gilbert, J. B., Silfverskiold, T., Clark, L. E., and Templin, T. (2024). Adolescents provide approximately $15 billion worth of informal family caregiving services to the United States economy. Under revision in Children and Youth Services Review. doi
- Halpin, P. F., and Gilbert, J. B. (2024). Testing whether reported treatment effects are unduly influenced by item-level heterogeneity. Under revision in Journal of Research on Educational Effectiveness. doi
Manuscripts Under Review
- Ashourizadeh, A.*, Gilbert, J. B.* … and Armstrong, G. W. (2025). Visual outcomes of cataract surgery in patients with age-related macular degeneration. Under review in Ophthalmology.
- Gilbert, J. B., Kim, E. J., Himmelsbach, Z., Ulitzsch, E., and Zhang, L. (2026). Idiographic item response theory: Modeling person-specific differential item functioning in intensive longitudinal data. Under review in Multivariate Behavioral Research. doi
- Whitcomb, N., Gilbert, J. B., Ross, C., Kearney, W., Lorch, A., and Lee, H. J. (2026). Comparative surgical outcomes and predictors of surgical intervention in recurrent corneal erosion: An IRIS Registry study. Under review in American Journal of Ophthalmology.
- Soland, J., and Gilbert, J. B. (2026). Does socially desirable responding increase after an intervention? Implications for estimating treatment effects. Under review in Journal of Experimental Education. doi
- Gilbert, J. B.*, Soland, J. G.*, and Young, W. S. (2026). Is the replication crisis a measurement crisis? Evidence from over 100 randomized trial outcomes. Under review in Nature. doi
- Cho, S-J., and Gilbert, J. B. (2026). Explanatory item response models with random item parameters and their applications. (Book Chapter)
- Veltri, G. A.*, and Gilbert, J. B.* (2026). Results from randomized controlled trials are highly sensitive to data preprocessing decisions: A multiverse analysis of 97 outcomes. Under review in Advances in Methods and Practice in Psychological Science. doi
- Dahrouj, M., Awh, C., Bleicher, I. D., Gilbert, J. B. … and Singh, R. P. (2026). Preoperative vision as a predictor of surgical outcomes for idiopathic epiretinal membranes: an IRIS Registry study. Under review in Ophthalmology.
- Gilbert, J. B., and Himmelsbach, Z. (2026). Why fadeout is (probably) worse than we think: Adjusting for correlated sampling error in meta-analyses of behavioral interventions. Under review in Psychological Methods. doi
- Gilson, A., Vonsachang, H., Gilbert, J. B. … and Lindsay, J. L. (2025). Rates of laser peripheral iridotomy vs. lens extraction in the treatment of angle closure glaucoma: an IRIS registry (Intelligent Research in Sight) analysis. Under review in Ophthalmology.
- Zhang, L., Liu, Y., Molenaar, D., Gilbert, J. B., Kanopka, K., and Domingue, B. W. (2025). Realistic simulation of item difficulties. Under review in Behavior Research Methods. doi
Manuscripts in Preparation
- Himmelsbach, Z., and Gilbert, J. B. (2026). When within-site randomization targets the wrong policy estimand: Implementer concentration in multisite trials.
- Hardy, M., Gilbert, J. B., and Domingue, B. (2026). Efficient detection of bad benchmark items with novel scalability coefficients. doi
- Student, S. R., Eze, J., Gilbert, J. B., Young, W. S., and Domingue, B. W. (2025). Expanding psychology’s causal toolkit: Latent outcomes and binary treatments in instrumental variables regression.
- Himmelsbach, Z., and Gilbert, J. B. (2025). The case for Bayesian estimation of the D-study.
Popular Press
- Gilbert, J., and Kim, J. (2023). Do Educational Apps Actually Help Kids Learn? Education Week. link
- Scripp, L., and Gilbert, J. (2016). All that Matters is How Good it Sounds. VAN Magazine. link