Feature Review
Genomic Prediction of Yield and Protein Traits in Soybean Using Machine Learning Models 


Legume Genomics and Genetics, 2025, Vol. 16, No. 2
Received: 20 Feb., 2025 Accepted: 06 Apr., 2025 Published: 27 Apr., 2025
Soybean is a globally significant food and plant-protein crop, and yield and protein content are the core target traits in its breeding. Both are complex quantitative traits shaped by genetic background, environment, and their interaction, which limits the efficiency of traditional phenotypic selection and genetic improvement. To improve breeding efficiency and prediction accuracy, this study explored the applicability and effectiveness of multiple machine learning algorithms for the genomic prediction of soybean yield and protein traits. Using genotype (SNP) and phenotype data from multiple soybean breeding populations, models including RR-BLUP, Support Vector Machine (SVM), Random Forest (RF), Gradient Boosting Machine (GBM), and Deep Neural Network (DNN) were constructed. These models were combined with feature selection methods such as Principal Component Analysis (PCA), LASSO, and Boruta, and their prediction accuracy and stability were systematically evaluated. The results show that nonlinear models (such as RF and GBM) generalize better for complex traits across multiple environments. A multi-trait joint prediction strategy further improved model performance on composite indicators such as protein yield. This study demonstrates the potential of machine learning techniques for the genomic prediction of complex quantitative traits, provides an efficient tool for assisted selection in soybean breeding, and lays a foundation for intelligent, high-throughput breeding decision-making systems.
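The evaluation described above (fit a model on SNP genotypes, then score its predictions by cross-validation) can be illustrated with a minimal sketch. This is not the study's pipeline: the data are simulated, the model is a plain ridge regression on marker effects (the estimator underlying RR-BLUP, here with a fixed rather than estimated penalty), and all function names, sample sizes, the heritability, and the penalty value are assumptions chosen for illustration. Prediction accuracy is taken as the Pearson correlation between predicted and observed phenotypes in held-out lines.

```python
import numpy as np


def simulate_snp_data(n_lines=300, n_snps=200, n_qtl=20, h2=0.5, seed=0):
    """Simulate 0/1/2 SNP genotypes and a phenotype with heritability h2.

    All sizes and parameters are hypothetical, for illustration only.
    """
    rng = np.random.default_rng(seed)
    maf = rng.uniform(0.1, 0.5, n_snps)            # allele frequency per SNP
    geno = rng.binomial(2, maf, size=(n_lines, n_snps)).astype(float)
    effects = np.zeros(n_snps)
    qtl = rng.choice(n_snps, n_qtl, replace=False)  # causal markers
    effects[qtl] = rng.normal(0.0, 1.0, n_qtl)
    genetic_value = geno @ effects
    noise_var = genetic_value.var() * (1.0 - h2) / h2
    pheno = genetic_value + rng.normal(0.0, np.sqrt(noise_var), n_lines)
    return geno, pheno


def ridge_fit_predict(x_train, y_train, x_test, lam):
    """Ridge regression on centered markers, solved in the dual (n x n) form."""
    mu_x = x_train.mean(axis=0)
    mu_y = y_train.mean()
    z = x_train - mu_x
    # beta = Z' (Z Z' + lam I)^-1 (y - mean), equivalent to the primal solution
    kernel = z @ z.T
    alpha = np.linalg.solve(kernel + lam * np.eye(len(y_train)), y_train - mu_y)
    beta = z.T @ alpha
    return mu_y + (x_test - mu_x) @ beta


def cross_validated_accuracy(geno, pheno, lam, k=5, seed=1):
    """Mean Pearson correlation between predictions and observations over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(pheno)), k)
    accuracies = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        pred = ridge_fit_predict(geno[train_idx], pheno[train_idx],
                                 geno[test_idx], lam)
        accuracies.append(np.corrcoef(pred, pheno[test_idx])[0, 1])
    return float(np.mean(accuracies))


if __name__ == "__main__":
    geno, pheno = simulate_snp_data()
    acc = cross_validated_accuracy(geno, pheno, lam=100.0)
    print(f"5-fold cross-validated prediction accuracy: {acc:.2f}")
```

The same cross-validation loop could wrap any of the other model families named in the abstract by swapping out `ridge_fit_predict`, which keeps comparisons between models on identical train/test splits.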
Authors: Xingde Wang, Tianxia Guo