This study introduces Plant Growth and Phenotyping Version 2 (PGP v2), an expanded dataset that substantially increases both the scale and the diversity of the initial release. It comprises approximately 52,000 multispectral images across four crop species. The dataset was collected in the Texas A&M AgriLife Precision Phenotyping Greenhouse and includes temporal sequences captured under standardized lighting, environmental controls, and imaging protocols. Each image is accompanied by metadata that includes the timestamp, plant genotype, growth stage, and environmental conditions. The dataset is divided into training and validation subsets to support reproducible development and evaluation of machine learning models.
F. Orvati Nia, J. Peeples, S. C. Murray, A. McFarland, T. Vann, S. Salehi, R. Hardin, D. D. Baltensperger, A. Ibrahim, J. A. Thomasson, H. Fadamiro, N. K. Subramanian, R. Roston, J. Ishimwe, D. Basak, N. Oladepo, and U. Vysyaraju.
"A Data-Driven Image Extraction and Analysis Pipeline for Plant Phenotyping in Controlled Environments."
bioRxiv, 2026.
BibTeX:
@article{orvati2026data,
title={A Data-Driven Image Extraction and Analysis Pipeline for Plant Phenotyping in Controlled Environments},
author={Orvati Nia, Fahimeh and Peeples, Joshua and Murray, Seth C and McFarland, Andrew and Vann, Troy and Salehi, Shima and Hardin, Robert and Baltensperger, David D and Ibrahim, Amir and Thomasson, J. Alex and Fadamiro, Henry and Subramanian, Nithya K and Roston, Rebecca and Ishimwe, Joslin and Basak, Diptadeep and Oladepo, Nazar and Vysyaraju, Uday},
journal={bioRxiv},
pages={2026--02},
year={2026},
publisher={Cold Spring Harbor Laboratory}
}