
A Data-Driven Image Extraction and Analysis Pipeline for Plant Phenotyping in Controlled Environments

¹Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, USA
²Department of Soil and Crop Sciences, Texas A&M University, College Station, TX, USA
³Texas A&M AgriLife Research, Texas A&M University, College Station, TX, USA
⁴Department of Biological and Agricultural Engineering, Texas A&M University, College Station, TX, USA
⁵Department of Agricultural and Biological Engineering, Mississippi State University, Mississippi State, MS, USA
⁶Department of Entomology, Texas A&M University, College Station, TX, USA
*Correspondence email: fahimehorvatinia@tamu.edu, jpeeples@tamu.edu
Abstract

Temporal imaging of plants in controlled environments helps scientists better understand growth and biological processes. However, analyzing large volumes of images has been limited by a lack of automated tools. Multispectral imagery captures additional information about plant pigments, structure, and stress beyond standard color images. We developed an automated analysis pipeline that identifies individual plants, tracks their growth over time, and measures traits such as height, area, shape, texture, and vegetation indices. Using artificial intelligence, the system efficiently processes thousands of images to provide consistent and repeatable measurements. By integrating engineering and plant biology, this work supports data-driven decisions for crop improvement and agricultural research.
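To make the trait measurements above more concrete, the following is a minimal sketch of how area, height, and one vegetation index might be computed for a single plant. It is an illustration, not the pipeline's actual implementation: the choice of NDVI as the index, the function names `ndvi` and `plant_traits`, and the assumption that a boolean plant mask and separate near-infrared and red bands are already available are all hypothetical. Measurements are in pixels, since no camera calibration is assumed.

```python
import numpy as np


def ndvi(nir: np.ndarray, red: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Per-pixel Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    # eps avoids division by zero where both bands are zero
    return (nir - red) / (nir + red + eps)


def plant_traits(nir: np.ndarray, red: np.ndarray, mask: np.ndarray) -> dict:
    """Summarize one plant from two bands and a boolean plant mask (hypothetical helper).

    Returns projected area and height in pixels, plus mean NDVI over plant pixels.
    """
    mask = mask.astype(bool)
    if not mask.any():
        return {"area_px": 0, "height_px": 0, "mean_ndvi": float("nan")}
    rows = np.where(mask.any(axis=1))[0]  # image rows that contain plant pixels
    return {
        "area_px": int(mask.sum()),
        "height_px": int(rows.max() - rows.min() + 1),
        "mean_ndvi": float(ndvi(nir, red)[mask].mean()),
    }


# Toy 4x4 image: a plant 3 rows tall and 2 columns wide with high NIR reflectance.
mask = np.zeros((4, 4), dtype=bool)
mask[1:4, 1:3] = True
nir = np.where(mask, 0.8, 0.2)
red = np.full((4, 4), 0.2)
traits = plant_traits(nir, red, mask)
print(traits)
```

Applying such a function to each image of the same plant in time order would yield a per-plant growth series. In practice the mask would come from a segmentation step, which this sketch does not cover.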

Project Organization and Workflow Integration
Figure 1: Integration of teams and responsibilities in the Texas A&M AgriLife Phenotyping Greenhouse project.
Comprehensive Phenotyping Pipeline
Figure 2: Comprehensive pipeline for multispectral plant phenotyping and feature analysis.
Plant Growth and Phenotyping Version 2 Dataset

This study introduces the Plant Growth and Phenotyping Version 2 dataset (PGP v2), which substantially expands the initial release in both scale and diversity. It comprises approximately 14,000 images of corn, 27,000 of cotton, 1,376 of rice, and 10,608 of sorghum, along with 1,840 manually annotated sorghum images for keypoint detection.

Dataset Samples
Corn (~14,000 images): four sample images
Cotton (~27,000 images): four sample images
Rice (~1,376 images): four sample images
Sorghum (~10,608 images): four sample images
Citation
Plain Text:
F. Orvati Nia, J. Peeples, S. C. Murray, A. McFarland, T. Vann, S. Salehi, R. Hardin, D. D. Baltensperger, A. Ibrahim, J. A. Thomasson, H. Fadamiro, N. K. Subramanian, N. Oladepo, and U. Vysyaraju. "A Data-Driven Image Extraction and Analysis Pipeline for Plant Phenotyping in Controlled Environments." bioRxiv, 2026.

BibTeX:
@article{orvati2026data,
  title={A Data-Driven Image Extraction and Analysis Pipeline for Plant Phenotyping in Controlled Environments},
  author={Orvati Nia, Fahimeh and Peeples, Joshua and Murray, Seth C and McFarland, Andrew and Vann, Troy and Salehi, Shima and Hardin, Robert and Baltensperger, David D and Ibrahim, Amir and Thomasson, J Alex and others},
  journal={bioRxiv},
  pages={2026--02},
  year={2026},
  publisher={Cold Spring Harbor Laboratory}
}