Scientia Agricultura Sinica ›› 2025, Vol. 58 ›› Issue (15): 2960-2979. doi: 10.3864/j.issn.0578-1752.2025.15.003

• CROP GENETICS & BREEDING · GERMPLASM RESOURCES · MOLECULAR GENETICS •

Genomic Selection Method Based on G2PSE Stacking Ensemble

ZHUANG RunJie1,2, LIU HuiMing1,2, WANG ShiYu1,2, LÜ WanPing1,2, WEN YongXian1,2,*

  1 College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou 350002
    2 Institute of Statistics and Applications, Fujian Agriculture and Forestry University, Fuzhou 350002
  • Received: 2025-02-07  Accepted: 2025-05-16  Online: 2025-08-01  Published: 2025-07-30
  • Contact: WEN YongXian

Abstract:

【Objective】 Genomic selection (GS) is a core technology for predicting individual phenotypes or genetic values from genome-wide marker information, and it has important theoretical value and practical significance in agricultural breeding and genetic research. However, high-dimensional feature redundancy and the modeling of nonlinear relationships remain key challenges in genomic selection. A genotype-to-phenotype stacking ensemble (G2PSE) model is proposed to improve prediction accuracy and generalization ability and to provide an efficient solution for high-dimensional genomic data analysis. 【Method】 The G2PSE stacking ensemble framework was constructed by integrating ten-fold cross-validation, ensemble learning, feature selection (the least angle regression, LAR, algorithm), and feature enhancement strategies. The model employed random forest (RF), support vector regression (SVR), and gradient boosting regression (GBR) as base learners, with ordinary least squares regression (OLSR) as the meta-learner. The impact of alternative meta-learners, such as random forest, support vector regression, and neural networks, on model performance was also evaluated. The G2PSE model consisted of three core submodels: (1) the all-feature stacking ensemble (AFSE), which fully utilized all SNP features; (2) the LAR-feature stacking ensemble (LFSE), which reduced redundant information through feature selection to improve generalization; and (3) the LAR-feature enhanced stacking ensemble (LFESE), which combined feature selection with enhancement strategies to optimize prediction in high-dimensional settings. The performance of three feature enhancement variants (AFESE, HFESEⅠ, and HFESEⅡ) was also explored. Finally, the model was evaluated on multi-trait datasets of three species (wheat, soybean, and tilapia) and further assessed on an independent test set from the Pepper203 dataset to validate its robustness.
【Result】 The G2PSE model significantly outperformed traditional methods and single machine learning models on both metrics, the Pearson correlation coefficient (PCC) and the mean absolute error (MAE). Among the three core submodels, LFESE performed best by combining feature selection with enhancement strategies, LFSE reduced redundant information and enhanced generalization through feature selection, and AFSE had a clear advantage in comprehensively capturing global genotypic information. The three feature enhancement variants further confirmed that feature quality matters more than feature quantity for prediction performance. The experiments also showed that linear regression performed best among the candidate meta-learners, while the LFESE and LFSE submodels achieved a better balance of accuracy and computational efficiency. A reasonable feature selection threshold was also crucial for model performance: the optimal threshold was 10%-20% for low-dimensional datasets and 1% for high-dimensional datasets. Finally, the evaluation on the independent test set showed that the LFESE submodel had the best generalization ability. 【Conclusion】 The G2PSE model significantly improves genomic selection prediction performance through ensemble learning, feature selection, and feature enhancement strategies.

Key words: genomic selection, stacking ensemble, feature selection, feature enhancement, agricultural breeding
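The AFSE workflow described in the abstract (RF, SVR, and GBR base learners stacked under an OLSR meta-learner, with ten-fold cross-validation producing the meta-features) can be sketched with scikit-learn. The data, dimensions, and hyperparameters below are illustrative placeholders, not the authors' settings.

```python
# A minimal sketch of the AFSE base configuration: RF, SVR, and GBR base
# learners stacked under an OLS meta-learner, with ten-fold cross-validation
# producing the meta-features. Data and hyperparameters are illustrative.
import numpy as np
from sklearn.ensemble import (GradientBoostingRegressor, RandomForestRegressor,
                              StackingRegressor)
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))                               # stand-in SNP matrix
y = X[:, :5].sum(axis=1) + rng.normal(scale=0.5, size=200)   # toy phenotype

stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
        ("svr", SVR(kernel="rbf")),
        ("gbr", GradientBoostingRegressor(random_state=0)),
    ],
    final_estimator=LinearRegression(),  # OLSR meta-learner
    cv=10,                               # ten-fold CV for the meta-features
)
stack.fit(X, y)
pred = stack.predict(X)
```

scikit-learn's `StackingRegressor` fits the base learners on cross-validation folds, collects their out-of-fold predictions as new features, and trains the final estimator on those features, which mirrors the training-and-prediction workflow shown in Fig. 1A.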

Table 1

Relevant information of the experimental datasets

Datasets  Traits  Individuals  SNPs  Heritability (h2)  Data source
Wheat599 E1-GY 599 1279 0.832 https://github.com/AIBreeding/DNNGP?tab=readme-ov-file
E2-GY 599 1279 0.729
E3-GY 599 1279 0.689
E4-GY 599 1279 0.711
Wheat2000 TKW 2000 33709 0.833 https://github.com/cma2015/DeepGS
TW 2000 33709 0.754
GL 2000 33709 0.881
GW 2000 33709 0.848
GH 2000 33709 0.839
GP 2000 33709 0.625
SDS 2000 33709 0.681
PHT 2000 33709 0.434
Soy5014 HT 5014 4234 0.449 https://doi.org/10.1534/g3.116.032268
R8 5014 4234 0.558
YLD 5014 4234 0.485
Tilapia1125 HW 1125 32306 0.304 https://figshare.com/s/9b265a22b7e138c5a839
Pepper203 PHT 203 14922 0.610 https://bmcgenomdata.biomedcentral.com/articles/10.1186/s12863-023-01179-6
FT 203 14922 0.730

Fig. 1

Overall structure and submodel design of the G2PSE model. A: Training and prediction workflow of the G2PSE base model (AFSE), where ten-fold cross-validation is used to train three base learners (RF, SVR, and GBR), and their predictions serve as input features for the meta-learner (OLSR) to produce the final prediction; B: Frameworks of the three core submodels of G2PSE, namely AFSE, LFSE, and LFESE; C: Frameworks of the three G2PSE model variants based on different feature enhancement strategies: AFESE, HFESEⅠ, and HFESEⅡ. Xori: original SNP features; XLAR: key feature subset selected via the LAR method; X′: features input into the meta-learner (new features); Y: phenotypic values; D: original dataset; D′: meta dataset; Predict′: prediction outputs from the base learners; Final prediction: final prediction output

Fig. 2

Prediction performance of various models with full features

Fig. 3

Prediction performance of different models after feature selection

Fig. 4

Comparison of prediction performance among six submodels of the G2PSE model

Table 2

Multicollinearity check of the G2PSE model on the Soy5014 dataset

Traits  Model  Condition number
HT AFSE 14.823
LFSE 16.598
LFESE 490.901
AFESE 5.436×1016
HFESEⅠ 298.676
HFESEⅡ 5.281×1016
R8 AFSE 7.709
LFSE 8.837
LFESE 495.610
AFESE 5.506×1016
HFESEⅠ 372.634
HFESEⅡ 5.132×1016
YLD AFSE 8.353
LFSE 8.817
LFESE 286.771
AFESE 5.291×1016
HFESEⅠ 286.676
HFESEⅡ 5.489×1016
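Table 2's condition numbers diagnose multicollinearity among the features passed to the OLSR meta-learner: values near 10 indicate a well-conditioned matrix, while values around 10^16 indicate near-singular, almost identical columns. A minimal numpy sketch on synthetic matrices (not the paper's actual meta-features) illustrates the computation:

```python
# Illustrative multicollinearity check via the condition number: a matrix of
# nearly independent columns is well-conditioned, while three near-duplicate
# columns produce a huge condition number. Data are synthetic stand-ins.
import numpy as np

rng = np.random.default_rng(1)
meta_ok = rng.normal(size=(500, 3))                       # nearly independent columns
base = rng.normal(size=(500, 1))                          # one shared signal
meta_bad = base + rng.normal(scale=1e-8, size=(500, 3))   # three near-duplicate columns

print(np.linalg.cond(meta_ok))    # small: little collinearity
print(np.linalg.cond(meta_bad))   # huge: severe collinearity
```

`np.linalg.cond` defaults to the 2-norm (ratio of largest to smallest singular value), which also works for rectangular matrices like these.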

Fig. 5

Comparison of computational efficiency between the G2PSE model and other prediction models

Fig. 6

Impact of different meta-learners on the prediction performance of the AFSE model. AFSE-1 to AFSE-5 correspond to the meta-learners OLSR, RF, SVR, GBR, and FNN, respectively
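The meta-learner comparison behind Fig. 6 amounts to re-stacking the same base learners under different final estimators. The sketch below shows three of the five candidates with scikit-learn (data, scoring, and settings are illustrative placeholders, not the paper's protocol):

```python
# Sketch of the meta-learner swap in Fig. 6: identical base learners, varying
# final estimators (OLSR, RF, and a small feed-forward network via
# MLPRegressor). Data and hyperparameters are illustrative.
import numpy as np
from sklearn.ensemble import (GradientBoostingRegressor, RandomForestRegressor,
                              StackingRegressor)
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR

rng = np.random.default_rng(4)
X = rng.normal(size=(150, 30))
y = X[:, :3].sum(axis=1) + rng.normal(scale=0.5, size=150)

base = [("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
        ("svr", SVR()),
        ("gbr", GradientBoostingRegressor(random_state=0))]
metas = {"OLSR": LinearRegression(),
         "RF": RandomForestRegressor(random_state=0),
         "FNN": MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)}

results = {}
for name, meta in metas.items():
    stack = StackingRegressor(estimators=base, final_estimator=meta, cv=5)
    results[name] = cross_val_score(stack, X, y, cv=3, scoring="r2").mean()
    print(name, round(results[name], 3))
```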

Table 3

Impact of different LAR screening thresholds on the prediction performance of the G2PSE model on the Wheat599 dataset

Environment  Number of selected SNPs  LFSE model  LFESE model  HFESEⅠ model  HFESEⅡ model
Cell values: PCC (MAE)
E1-GY 13 0.511(0.691) 0.489 (0.701) 0.581 (0.641) 0.297 (1.132)
64 0.592 (0.634) 0.612 (0.616) 0.563 (0.655) 0.305 (1.112)
128 0.604 (0.631) 0.636 (0.606) 0.543 (0.683) 0.321 (1.095)
256 0.604 (0.625) 0.542 (0.670) 0.454 (1.251) 0.330 (1.074)
512 0.597 (0.629) 0.407 (1.631) 0.333(1.728) 0.354 (1.030)
767 0.596 (0.630) 0.129 (3.452) 0.079 (3.518) 0.353 (1.033)
E2-GY 13 0.429 (0.701) 0.467 (0.691) 0.518 (0.658) 0.339 (1.095)
64 0.592 (0.624) 0.621 (0.622) 0.614 (0.625) 0.360 (1.033)
128 0.607 (0.612) 0.698 (0.588) 0.681 (0.592) 0.373 (1.003)
256 0.598 (0.622) 0.662 (0.634) 0.630 (0.662) 0.388 (0.968)
512 0.564 (0.638) 0.521 (1.014) 0.409 (1.243) 0.431 (0.909)
767 0.536 (0.648) 0.287 (2.026) 0.204 (2.484) 0.489 (0.838)
E3-GY 13 0.512 (0.678) 0.493 (0.681) 0.455 (0.692) 0.276 (1.070)
64 0.512 (0.677) 0.498 (0.679) 0.501 (0.689) 0.274 (1.072)
128 0.555 (0.655) 0.594 (0.628) 0.551 (0.672) 0.287 (1.059)
256 0.511 (0.678) 0.496 (0.682) 0.523 (0.743) 0.280 (1.061)
512 0.510 (0.681) 0.499 (0.679) 0.285 (1.394) 0.282 (1.061)
767 0.510 (0.680) 0.496 (0.680) 0.132 (3.048) 0.279 (1.065)
E4-GY 13 0.457 (0.711) 0.477 (0.686) 0.514 (0.668) 0.291 (1.079)
64 0.604 (0.630) 0.602 (0.633) 0.592 (0.644) 0.311 (1.047)
128 0.639 (0.607) 0.653 (0.598) 0.648 (0.614) 0.319 (1.037)
256 0.628 (0.607) 0.708 (0.594) 0.668 (0.637) 0.313 (1.044)
512 0.586 (0.628) 0.434 (1.123) 0.403 (1.157) 0.323 (1.029)
767 0.560 (0.641) 0.243 (2.231) 0.230 (2.387) 0.340 (1.011)

Table 4

Impact of different LAR screening thresholds on the prediction performance of the G2PSE model on the Wheat2000 dataset

Traits  Number of selected SNPs  LFSE model  LFESE model  HFESEⅠ model  HFESEⅡ model
Cell values: PCC (MAE)
TKW 337 0.713 (0.562) 0.802 (0.467) 0.710 (0.572) 0.669 (0.589)
1685 0.710 (0.560) 0.750 (0.643) 0.522 (0.974) 0.667 (0.590)
3371 0.704 (0.567) 0.694 (0.648) 0.404 (1.099) 0.666 (0.591)
6742 0.704 (0.568) 0.689 (0.662) 0.373 (1.163) 0.666 (0.591)
13483 0.702 (0.569) 0.691 (0.660) 0.368 (1.171) 0.666 (0.591)
20225 0.700 (0.571) 0.724 (0.612) 0.365 (1.176) 0.666(0.591)
TW 337 0.674 (0.564) 0.778 (0.476) 0.709 (0.573) 0.602 (0.613)
1685 0.665 (0.572) 0.655 (0.822) 0.528 (0.966) 0.600 (0.614)
3371 0.661 (0.572) 0.635 (0.737) 0.374 (1.170) 0.599 (0.615)
6742 0.657 (0.577) 0.627 (0.731) 0.373 (1.162) 0.599 (0.615)
13483 0.658 (0.576) 0.627 (0.733) 0.368 (1.170) 0.599 (0.615)
20225 0.659 (0.576) 0.628 (0.731) 0.366 (1.175) 0.600 (0.615)
GL 337 0.772 (0.478) 0.839 (0.410) 0.754 (0.510) 0.741 (0.502)
1685 0.765 (0.486) 0.738 (0.648) 0.522 (0.947) 0.740 (0.504)
3371 0.763 (0.489) 0.699 (0.654) 0.478 (1.007) 0.739 (0.504)
6742 0.762 (0.490) 0.703 (0.643) 0.445 (1.031) 0.739 (0.504)
13483 0.764 (0.488) 0.704 (0.642) 0.442 (1.036) 0.739 (0.504)
20225 0.761 (0.490) 0.702 (0.644) 0.440 (1.039) 0.739 (0.504)
GW 337 0.750 (0.514) 0.812 (0.451) 0.712 (0.557) 0.732 (0.526)
1685 0.745 (0.518) 0.743 (0.654) 0.497 (1.010) 0.731 (0.526)
3371 0.740 (0.524) 0.746 (0.582) 0.420 (1.086) 0.731 (0.527)
6742 0.741 (0.523) 0.716 (0.634) 0.402 (1.136) 0.731 (0.527)
13483 0.741 (0.523) 0.714 (0.636) 0.401 (1.137) 0.731 (0.527)
20225 0.742 (0.523) 0.716 (0.634) 0.400 (1.137) 0.731 (0.527)
GH 337 0.687 (0.567) 0.787 (0.483) 0.677 (0.586) 0.682 (0.567)
1685 0.686 (0.566) 0.707 (0.705) 0.463 (1.043) 0.681 (0.568)
3371 0.691 (0.561) 0.695 (0.654) 0.370 (1.170) 0.680 (0.569)
6742 0.684 (0.566) 0.698 (0.645) 0.395 (1.125) 0.680 (0.569)
13483 0.686 (0.565) 0.698 (0.645) 0.392 (1.129) 0.680 (0.569)
20225 0.689 (0.563) 0.699 (0.644) 0.391 (1.132) 0.680 (0.569)
GP 337 0.626 (0.604) 0.746 (0.512) 0.627 (0.622) 0.515 (0.667)
1685 0.609 (0.618) 0.662 (0.800) 0.414 (1.175) 0.513 (0.668)
3371 0.604 (0.619) 0.616 (0.759) 0.358 (1.167) 0.512 (0.668)
6742 0.603 (0.619) 0.617 (0.757) 0.365 (1.165) 0.512 (0.668)
13483 0.603 (0.619) 0.618 (0.756) 0.357 (1.177) 0.512 (0.668)
20225 0.602 (0.621) 0.615 (0.760) 0.352 (1.185) 0.512 (0.668)
SDS 337 0.663 (0.599) 0.796 (0.477) 0.694 (0.575) 0.525 (0.694)
1685 0.656 (0.607) 0.763 (0.606) 0.580 (0.862) 0.523 (0.695)
3371 0.644 (0.621) 0.714 (0.626) 0.468 (0.979) 0.520 (0.697)
6742 0.644 (0.623) 0.713 (0.629) 0.488 (0.948) 0.518 (0.698)
13483 0.643 (0.622) 0.712 (0.631) 0.487 (0.949) 0.521 (0.696)
20225 0.644 (0.621) 0.707 (0.636) 0.440 (1.025) 0.520 (0.697)
PHT 337 0.501 (0.664) 0.697 (0.566) 0.553 (0.673) 0.279 (0.755)
1685 0.477 (0.672) 0.626 (0.844) 0.422 (1.131) 0.278 (0.755)
3371 0.449 (0.682) 0.520 (0.879) 0.334 (1.167) 0.275 (0.756)
6742 0.452 (0.682) 0.512 (0.894) 0.294 (1.230) 0.276 (0.756)
13483 0.449 (0.681) 0.515 (0.890) 0.293 (1.229) 0.275 (0.756)
20225 0.451 (0.683) 0.517 (0.883) 0.311 (1.186) 0.275 (0.756)
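Tables 3 and 4 vary the number of SNPs retained by LAR screening. The sketch below illustrates that screening step with scikit-learn's `Lars`, keeping the markers that enter the least-angle regression path first; the 10% fraction mirrors the threshold reported as optimal for low-dimensional datasets, and all data and dimensions are synthetic stand-ins.

```python
# Sketch of LAR-based SNP screening as used by the LFSE/LFESE submodels:
# retain the SNPs that enter the least-angle regression path first.
import numpy as np
from sklearn.linear_model import Lars

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 1000))                  # 1000 SNP stand-ins
y = X[:, :10].sum(axis=1) + rng.normal(size=300)  # toy phenotype

n_keep = int(0.10 * X.shape[1])                   # 10% screening threshold
lars = Lars(n_nonzero_coefs=n_keep).fit(X, y)
selected = np.flatnonzero(lars.coef_)             # indices of retained SNPs
X_lar = X[:, selected]                            # reduced genotype matrix
print(X_lar.shape)
```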

Table 5

Comparison of the generalization ability of G2PSE submodels on the PHT trait of the Pepper203 dataset

Model  Validation set (PCC, MAE)  Test set (PCC, MAE)  Absolute difference (PCC, MAE)
AFSE 0.639 0.566 0.520 0.580 0.119 0.014
LFSE 0.735 0.502 0.645 0.563 0.090 0.061
LFESE 0.693 0.544 0.734 0.517 0.041 0.027
AFESE 0.659 0.553 -0.023 0.690 0.682 0.137
HFESEⅠ 0.669 0.553 0.415 0.745 0.254 0.192
HFESEⅡ 0.659 0.554 0.527 0.625 0.132 0.071

Table 6

Comparison of the generalization ability of G2PSE submodels on the FT trait of the Pepper203 dataset

Model  Validation set (PCC, MAE)  Test set (PCC, MAE)  Absolute difference (PCC, MAE)
AFSE 0.785 0.468 0.705 0.508 0.080 0.040
LFSE 0.806 0.444 0.756 0.501 0.050 0.057
LFESE 0.753 0.502 0.718 0.561 0.035 0.059
AFESE 0.769 0.467 0.434 0.701 0.335 0.234
HFESEⅠ 0.736 0.564 0.657 0.624 0.079 0.060
HFESEⅡ 0.768 0.469 0.712 0.521 0.056 0.052
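The generalization comparison in Tables 5 and 6 rests on computing PCC and MAE on the validation and independent test splits and taking the absolute differences. A minimal numpy sketch with synthetic predictions:

```python
# Sketch of the generalization check behind Tables 5-6: PCC and MAE on a
# validation split and an independent test split, plus their absolute
# differences. All "predictions" here are synthetic placeholders.
import numpy as np

rng = np.random.default_rng(3)
y_val = rng.normal(size=100)
pred_val = y_val + rng.normal(scale=0.5, size=100)    # good validation fit
y_test = rng.normal(size=100)
pred_test = y_test + rng.normal(scale=0.8, size=100)  # weaker test fit

pcc_val = np.corrcoef(y_val, pred_val)[0, 1]
pcc_test = np.corrcoef(y_test, pred_test)[0, 1]
mae_val = np.abs(y_val - pred_val).mean()
mae_test = np.abs(y_test - pred_test).mean()
print(abs(pcc_val - pcc_test), abs(mae_val - mae_test))  # smaller gap = better generalization
```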