Enhancing Student Performance and Retention Prediction across Different Languages Using Ensemble Learning Models
Abstract
This research develops ensemble models that predict student performance and retention across multiple languages, with the aim of improving educational analytics. A dataset of approximately 11,500 entries obtained from Kaggle was preprocessed by standardizing, normalizing, and balancing the data. The ensemble combines Logistic Regression, Decision Trees, and Gradient Boosting, and it outperforms conventional models such as Random Forests and Support Vector Machines, reaching 96% accuracy and also leading on precision, recall, and F1 score. The model performs well across varied linguistic contexts. These AI-based insights can support personalized learning tools and early identification systems, and they open the way for future work on educational forecasting that uses deep learning and accounts for broader socio-cultural factors.
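As an illustration of how three base learners might be combined and scored, the following minimal sketch assumes soft voting (averaging each model's predicted probability for the positive class) and a binary outcome such as retained versus not retained. The abstract does not state the combination rule, so soft voting is an assumption here, and the probability values, labels, and function names are hypothetical toy inputs rather than results from the study.

```python
from statistics import mean


def soft_vote(prob_lists, threshold=0.5):
    """Average per-model positive-class probabilities, then threshold.

    prob_lists holds one list of probabilities per base model, all of the
    same length (one probability per student). Soft voting is an assumed
    combination rule, not one confirmed by the paper.
    """
    averaged = [mean(per_student) for per_student in zip(*prob_lists)]
    return [1 if p >= threshold else 0 for p in averaged]


def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    # Guard against empty denominators so the sketch never divides by zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy probabilities standing in for the three base models.
logreg_probs = [0.9, 0.2, 0.4, 0.7, 0.6, 0.1]   # Logistic Regression
tree_probs = [1.0, 0.0, 1.0, 0.0, 0.0, 0.0]     # Decision Tree
gboost_probs = [0.8, 0.3, 0.6, 0.9, 0.4, 0.2]   # Gradient Boosting
y_true = [1, 0, 1, 1, 0, 0]                     # hypothetical labels

y_pred = soft_vote([logreg_probs, tree_probs, gboost_probs])
metrics = binary_metrics(y_true, y_pred)
print(y_pred, metrics)
```

In this toy example each base model misclassifies at least one student at a 0.5 threshold, while the averaged vote does not, which is the intuition behind combining heterogeneous learners. Other combination rules, such as hard voting or stacking, would fit the same description equally well.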