Timely and accurate forecasting of crop yields is critical for food management and trade. However, only limited research has explored how integrating crop phenotypic parameters (CPPs) with unmanned aerial vehicle (UAV) data across different phenological stages affects maize yield prediction. The extent to which multi-temporal data improve the accuracy and reliability of yield projections compared to mono-temporal data has yet to be systematically investigated. To balance accuracy against cost in crop yield estimation, this study proposed a structured framework for identifying the optimal phenological periods for summer maize yield prediction using UAV-based multispectral data. Three classical methods, namely custom mean decrease accuracy (C-MDA), optimal parameters-based geographical detector (OPGD), and grey relational analysis (GRA), were first used to rank and screen both the CPPs and the vegetation indices (VIs) derived from UAV-based information over six growth stages. Ridge regression models based on multi-temporal data combinations and on mono-temporal data were established separately, and their performance in yield prediction was compared to identify the optimal phenological stages and the corresponding key factors. Our results showed that C-MDA outperformed OPGD and GRA in factor screening and ranking. The green normalized difference vegetation index (GNDVI), normalized difference vegetation index (NDVI), and normalized difference red edge index (NDRE) emerged as the top-performing VIs, while the leaf area index (LAI) and aboveground biomass (AGB) proved to be the most effective CPPs. When predicting yield using only mono-temporal data, the dough stage delivered the highest predictive accuracy (R² = 0.871, RMSE = 0.407 t ha⁻¹), while the tasseling stage was the earliest stage that achieved yield estimates with acceptable precision (R² = 0.810, RMSE = 0.493 t ha⁻¹).
In contrast, integrating UAV data from different crop growth stages markedly enhanced the accuracy of yield estimation. The combination of data from the tasseling, silking, and dough stages is recommended as the best option (R² = 0.942, RMSE = 0.291 t ha⁻¹). These findings indicate that precise estimation of maize yields in smallholder fields may be attainable, and they offer both theoretical insights and practical benefits for the advancement of precision agriculture.
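The comparison of mono-temporal and multi-temporal ridge regression models described above can be illustrated with a minimal sketch. It is not the study's implementation: the data are synthetic, the three features per stage are placeholders for screened VIs and CPPs, the stage labels `stage_1` to `stage_3` are hypothetical stand-ins for the growth stages the abstract does not name, and the regularization strength, fold count, and helper names (`ridge_fit`, `cv_scores`, `rank_stage_combinations`) are all assumptions made for illustration.

```python
from itertools import combinations

import numpy as np


def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return coef, y_mean - x_mean @ coef


def cv_scores(X, y, alpha=1.0, k=5, seed=0):
    """k-fold cross-validated R^2 and RMSE from pooled out-of-fold predictions."""
    idx = np.random.default_rng(seed).permutation(len(y))
    pred = np.empty(len(y))
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)  # all plots not in the held-out fold
        coef, intercept = ridge_fit(X[train], y[train], alpha)
        pred[fold] = X[fold] @ coef + intercept
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot, np.sqrt(ss_res / len(y))


def rank_stage_combinations(features, y, max_size=3):
    """Score every combination of up to max_size stages, best R^2 first."""
    results = []
    for size in range(1, max_size + 1):
        for combo in combinations(features, size):
            # Multi-temporal model: stack the feature columns of each stage.
            X = np.hstack([features[s] for s in combo])
            r2, rmse = cv_scores(X, y)
            results.append((combo, r2, rmse))
    return sorted(results, key=lambda r: r[1], reverse=True)


# Synthetic demonstration data (NOT from the study): 120 plots, six stages,
# three standardized features per stage. Yield is made to depend on one
# feature from each of the last three stages so the search has a known answer.
rng = np.random.default_rng(42)
n_plots = 120
stages = ["stage_1", "stage_2", "stage_3", "tasseling", "silking", "dough"]
features = {s: rng.normal(size=(n_plots, 3)) for s in stages}
y = (
    features["tasseling"][:, 0]
    + features["silking"][:, 1]
    + features["dough"][:, 2]
    + rng.normal(scale=0.3, size=n_plots)
)

ranking = rank_stage_combinations(features, y)
best_combo, best_r2, best_rmse = ranking[0]
print(best_combo, round(best_r2, 3), round(best_rmse, 3))
```

Single-stage entries in `ranking` correspond to the mono-temporal models and multi-stage entries to the multi-temporal combinations, so the accuracy gain from adding a flight date can be read off directly and weighed against its cost. With real data, the features would typically be standardized before fitting, since the ridge penalty is sensitive to feature scale.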