Introduction
Traditionally, data analysis in education has focused on what already happened, using past trends to inform future decisions. But by the time patterns emerge in the data, it may be too late to intervene effectively. As we push towards data-informed decision-making, predictive analytics is a powerful tool to anticipate student outcomes and proactively support student success. Rather than simply looking at what has happened in the past, predictive models use historical data to forecast future performance.
We have developed two models, one to predict CAPE outcomes, including predicted CAPE scores and predicted CAPE level, and one to predict attendance, including predicted end-of-year attendance rates, predicted chronic absenteeism status, and predicted truancy status. These insights can help school leaders and staff to identify students who may need additional support, allocate resources more effectively, and implement timely interventions.
While we cannot predict the future with certainty, predictive models offer a valuable glimpse into what might happen, allowing school leaders to make more intentional decisions.
What is predictive analytics and predictive modeling?
Predictive analytics is the process of using historical data to make informed predictions about future outcomes. A predictive model is a specific tool within that process: the mathematical formula that generates the predictions, such as estimating a student’s likelihood of reaching proficiency or of being at risk of chronic absenteeism.
What kind of model do we use?
We use a machine learning technique called gradient boosting, which builds predictions by combining many simple models called decision trees. Each decision tree learns from the mistakes of the last to improve the overall prediction. This step-by-step refinement makes gradient boosting powerful, especially when working with imperfect, nuanced data such as attendance or assessment scores.
Gradient boosting handles the challenges of education data in that it:
Captures subtle patterns that simpler models might overlook
Handles mixed data types from test scores to demographic information
Delivers high accuracy, often outperforming other models
Highlights key drivers of outcomes, helping educators understand which factors most influence student success
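To make the idea of trees learning from each other’s mistakes concrete, here is a minimal sketch of the boosting loop for a regression target such as a CAPE score. It uses scikit-learn decision trees; the learning rate, tree depth, and number of trees are illustrative assumptions, not our production configuration.

```python
# Minimal sketch of gradient boosting for a regression target (e.g., a CAPE score).
# Learning rate, depth, and tree count are illustrative, not production settings.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gradient_boosting(X, y, n_trees=100, learning_rate=0.1, max_depth=3):
    """Fit a sequence of shallow trees, each one modeling the residual errors
    (the 'mistakes') left by the trees before it."""
    baseline = y.mean()                  # start from a simple constant prediction
    prediction = np.full(len(y), baseline)
    trees = []
    for _ in range(n_trees):
        residuals = y - prediction       # what the current ensemble still gets wrong
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)           # the new tree learns to correct those mistakes
        prediction += learning_rate * tree.predict(X)
        trees.append(tree)
    return baseline, trees

def predict(baseline, trees, X, learning_rate=0.1):
    """Combine the baseline with every tree's correction to produce the final prediction."""
    return baseline + learning_rate * sum(t.predict(X) for t in trees)
```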
How does it work?
Just like a teacher learning from experience, the predictive models are trained using complete data from the previous school year. This allows the system to learn which outcomes occurred, what early warning signs were present, and which factors best predicted the final outcomes.
We gather comprehensive multi-year data across the EK12 network, including attendance, assessment scores, growth metrics, and demographic information. The system then analyzes thousands of student records to find patterns such as “Students with attendance below 85% in October often end up chronically absent” or “Students with declining assessment scores combined with attendance drops are at high risk of being chronically absent.”
From these patterns, we build targeted models to forecast:
CAPE score and predicted performance level (Level 3+ and Level 4+)
End-of-year attendance outcomes (ISA, chronic absenteeism, and truancy)
Each month, we use the models to generate updated predictions for the current school year in progress, giving educators a proactive tool to guide decisions, celebrate successes, and implement supports to ensure students are reaching their full potential.
The Process:
1. Train Models: We train the models on the previous school year’s complete data set.
2. Data Input: We receive current student data (attendance, assessments, etc.).
3. Prediction: Our models analyze this data and generate three types of predictions each for attendance and CAPE:
   Attendance
   - A specific attendance percentage prediction
   - A probability score for chronic absence risk
   - A probability score for truancy risk
   CAPE
   - A specific CAPE score prediction
   - A probability score for reaching Level 3+
   - A probability score for reaching Level 4+
4. Output: You receive monthly reports showing which students need attention (a simplified sketch of one monthly run follows below).
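For readers who want to see what one monthly run looks like in code, here is a simplified sketch of the attendance side of the process using scikit-learn’s gradient boosting estimators. The feature and column names (attendance_to_date, chronically_absent, etc.) are illustrative placeholders, not the production schema.

```python
# Simplified sketch of one monthly attendance run. Column names are placeholders.
from sklearn.ensemble import HistGradientBoostingClassifier, HistGradientBoostingRegressor

FEATURES = ["attendance_to_date", "prior_year_attendance_rate", "map_percentile", "grade_level"]

def monthly_attendance_run(last_year, current_year):
    """last_year: complete prior-year records with known end-of-year outcomes.
    current_year: partial-year records for the school year in progress."""
    # 1. Train Models on last year's complete data
    rate_model = HistGradientBoostingRegressor().fit(last_year[FEATURES], last_year["eoy_attendance_rate"])
    chronic_model = HistGradientBoostingClassifier().fit(last_year[FEATURES], last_year["chronically_absent"])
    truancy_model = HistGradientBoostingClassifier().fit(last_year[FEATURES], last_year["truant"])

    # 2-3. Data Input + Prediction: score this month's partial-year data
    report = current_year[["student_id"]].copy()
    report["pred_attendance_rate"] = rate_model.predict(current_year[FEATURES])
    report["chronic_risk"] = chronic_model.predict_proba(current_year[FEATURES])[:, 1]
    report["truancy_risk"] = truancy_model.predict_proba(current_year[FEATURES])[:, 1]

    # 4. Output: these columns feed the monthly report
    return report
```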
For the regression outputs (CAPE score and ISA prediction), we use an ensemble method: rather than relying on a single model, we combine predictions from multiple models, each trained with a different random configuration. For each student, we average the predictions from all models to get a final prediction and calculate the standard error (SE) of that average to estimate the model’s confidence. This approach allows us to:
Reduce random error: we smooth out noise or quirks from any one model
Produce more robust predictions: the final prediction is less likely to be thrown off by outliers or overfitting in any single model
Estimate uncertainty: the spread of predictions across models gives us a measure of how confident the model is for each student, so even before the final results are available, we can identify which predictions are more or less certain
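A minimal sketch of the ensemble idea, assuming scikit-learn and a small number of random seeds for illustration: each model is trained with a different seed, the predictions are averaged, and the spread across models is summarized as a standard error.

```python
# Sketch of the ensemble for the regression outputs (CAPE score, ISA).
# n_models and the estimator settings are illustrative assumptions.
import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor

def ensemble_predict(X_train, y_train, X_new, n_models=10):
    """Train several copies of the model with different random seeds,
    then average their predictions and report a standard error per student."""
    all_preds = []
    for seed in range(n_models):
        model = HistGradientBoostingRegressor(random_state=seed)
        model.fit(X_train, y_train)
        all_preds.append(model.predict(X_new))
    all_preds = np.vstack(all_preds)                    # shape: (n_models, n_students)

    final_prediction = all_preds.mean(axis=0)           # average across models
    standard_error = all_preds.std(axis=0, ddof=1) / np.sqrt(n_models)  # spread -> confidence
    return final_prediction, standard_error
```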
How does it help schools?
Early Identification: We can spot students who are likely to be chronically absent, truant, or below proficiency on CAPE before it happens, giving us time to intervene.
Targeted Support: Staff can focus their efforts on students who are most at risk, making interventions more effective.
Continuous Improvement: By retraining the model each year with new data, we keep improving our predictions and adapting to changing circumstances.
Understanding the Predictions
CAPE
CAPE Score Predictions
- What it shows: Expected CAPE score
- Example: “This student is predicted to score 737 on CAPE”
- How to use it: Monitor students predicted below 750; those below 725 may benefit from targeted academic support
Predicted Level 3+
- What it shows: Probability (0-100%) that a student will reach Level 3 or higher on CAPE
- Example: “This student has an 80% chance of reaching Level 3 or higher on CAPE”
- How to use it: Students below 30% should be prioritized for instructional support and progress monitoring
Predicted Level 4+
- What it shows: Probability (0-100%) that a student will reach Level 4 or higher on CAPE
- Example: “This student has a 25% chance of reaching Level 4 or higher on CAPE”
- How to use it: Students below 40% may benefit from enrichment strategies
Attendance
Attendance Rate Predictions
- What it shows: Expected end-of-year in-seat attendance rate
- Example: "This student is predicted to have 87% attendance by year-end"
- How to use it: Students below 90% need monitoring; below 85% need intervention
Predicted Chronic Status
- What it shows: Probability (0-100%) that a student will be chronically absent by end-of-year
- Example: "This student has a 75% chance of becoming chronically absent"
- How to use it: Students above 50% risk need immediate attention
Predicted Truant Status
- What it shows: Probability (0-100%) that a student will have 10+ unexcused absences
- Example: "This student has a 60% chance of becoming truant"
- How to use it: Students above 40% need family engagement support
How are students classified?
For attendance, there is no universally defined score threshold; instead, we rely on optimized probability thresholds. These are cut-off points calculated during model training to classify students as at risk for chronic absenteeism or truancy. Instead of choosing arbitrary values, the model tests a range of thresholds and evaluates each one using precision (how often its positive predictions are correct) and recall (how many true cases it identifies). The goal is to find the best balance: catching students who truly need support while minimizing false alarms.
For example, if the optimized threshold for chronic absenteeism is 0.58, a student with a 60% predicted probability of being chronically absent would be flagged as likely to be chronically absent. If the threshold were 0.70, that same student would not be flagged.
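Here is a hedged sketch of how such a threshold can be found, assuming scikit-learn’s precision_recall_curve and using the best F1 score as the balancing rule (one common choice; the production models may weight precision and recall differently).

```python
# Sketch of choosing an "optimized" probability threshold from precision and recall.
# Maximizing F1 is one common balancing rule, used here for illustration.
import numpy as np
from sklearn.metrics import precision_recall_curve

def find_optimized_threshold(y_true, y_prob):
    """Scan candidate thresholds and keep the one with the best
    precision/recall balance (highest F1)."""
    precision, recall, thresholds = precision_recall_curve(y_true, y_prob)
    f1 = 2 * precision * recall / (precision + recall + 1e-9)
    best = np.argmax(f1[:-1])          # last precision/recall pair has no threshold
    return thresholds[best]

# Usage idea (arrays are placeholders for last year's labels and probabilities):
# threshold = find_optimized_threshold(y_last_year, chronic_probs_last_year)
# flagged = chronic_risk_current_year >= threshold
```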
For CAPE, there are clearly defined cut-offs for proficiency levels (e.g., 725 for Level 3 and 750 for Level 4). At the student level, we use the predicted CAPE score to determine whether or not a student is likely to reach those benchmarks. In addition to the score-based cut-offs, the probability-based predictions, which estimate the likelihood that a student will reach Level 3+ or Level 4+, provide a more nuanced view of student performance, especially when scores are near the cut-off. At the school level, to account for variability and uncertainty in individual predictions, we calculate the percentage of students predicted to reach Level 3+ or Level 4+ twice: once using the predicted score cut-offs and once using the optimized probability thresholds. By averaging the two methods, we create a more balanced estimate of school-wide predicted performance.
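As an illustration of the school-level roll-up, the sketch below averages the share of students who clear the predicted-score cut-off with the share who clear the optimized probability threshold. The cut-off (725) and threshold (0.58) defaults simply reuse the examples from this article and are not fixed constants.

```python
# Sketch of the school-level roll-up for Level 3+. Default values are illustrative.
import numpy as np

def school_level_3_plus(pred_scores, level3_probs, score_cutoff=725, prob_threshold=0.58):
    pred_scores = np.asarray(pred_scores)
    level3_probs = np.asarray(level3_probs)
    pct_by_score = np.mean(pred_scores >= score_cutoff)    # share predicted at/above the Level 3 cut-off
    pct_by_prob = np.mean(level3_probs >= prob_threshold)  # share above the optimized probability threshold
    return (pct_by_score + pct_by_prob) / 2                # balanced school-wide estimate

# e.g., school_level_3_plus([730, 718, 741], [0.65, 0.40, 0.80]) -> roughly 0.67
```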
How does feature importance shift over time?
A feature is an individual measurable attribute used to create the prediction; features are the inputs of the model. For example, previous-year MAP percentiles can be used as a feature to help predict CAPE outcomes. Feature importance describes how much each feature, or input variable, contributes to predicting the outcome. It tells you which factors the model considered most influential when estimating the outcome. Because each target and month has its own model, the importance of features shifts as the year goes on. Across both models and targets, the following patterns emerge:
Attendance Predictions
Early Months (August-October): historical attendance patterns are most important
Mid Year (November-February): current year attendance trends emerge
Spring (March-June): recent attendance becomes most predictive
CAPE Predictions
Early Months (August-October): prior year scores are most important
Mid Year (November-February): current year performance takes over
Spring (March-June): recent test scores become most predictive
[See the feature heatmaps in the Appendix]
This illustrates the importance of early warning systems: by the time current-year data becomes most predictive, there is less time to support struggling students.
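For context, here is a rough sketch of how per-month feature importance can be computed and arranged for a heatmap, using scikit-learn’s permutation_importance. The month_models, X_by_month, and y_by_month inputs are hypothetical placeholders for the trained monthly models and their evaluation data.

```python
# Sketch of building a month-by-feature importance table for a heatmap.
# Inputs are hypothetical placeholders for the monthly models and evaluation data.
import pandas as pd
from sklearn.inspection import permutation_importance

def importance_by_month(month_models, X_by_month, y_by_month, feature_names):
    rows = {}
    for month, model in month_models.items():
        result = permutation_importance(
            model, X_by_month[month], y_by_month[month], n_repeats=5, random_state=0
        )
        rows[month] = result.importances_mean     # average importance of each feature
    # One row per month, one column per feature -> ready to plot as a heatmap
    return pd.DataFrame(rows, index=feature_names).T
```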
Why train and test the models using data from the previous school year?
Training the model on last year’s data (with features from two years ago) helps us make the most accurate and relevant predictions for this year’s students. It uses the freshest patterns and context, so we can support students before problems arise—even though we don’t yet know this year’s final outcomes.
Most relevant patterns/reflects current conditions: Using previous school year data to train the model reflects the latest attendance patterns, policies, and student behaviors. This makes it the best available “example” for what might happen this year.
Includes recent context: The base table includes features from the previous year and from two years prior (so when training on SY24-25 data, this includes SY23-24 and SY22-23), allowing the model to learn how both recent and older history affect current outcomes.
Avoids outdated information: Older years may reflect different conditions (e.g., the pandemic or policy changes); using the most recent year helps the model avoid learning from patterns that no longer apply.
Realistic for predictions: The model learns to predict end-of-year outcomes based on partial-year data, just as it will do for the current year.
Supports early intervention: Training on last year’s data helps us make predictions for students as the year unfolds, helping schools intervene early with students at risk.
How well do our new models work?
CAPE:
Test Score Predictions: can predict student test scores within about 15-18 points
Level 3/4 Achievement Level: correctly identifies 8-9 out of 10 students who will reach their target
False Alarms: only flags 1-2 students incorrectly for every 10 flagged
[Recall and precision of the new model]
The RMSE also improved over time for the new models. Because there is a separate model for each month, the RMSE decreases over time, meaning that as the months go on, the predictions get better and better. This is expected: as we get closer to the end of the school year, the final picture becomes clearer.
Attendance:
Attendance Rate Predictions: can predict attendance rates within 0.1% accuracy
Chronic Absenteeism/Truancy: correctly identifies 8-9 out of 10 students who will have attendance issues
False Alarms: only flags 1 student incorrectly for every 10 flagged
[Precision and recall of the new model]
[RMSE of the new model]
Similar to the CAPE models, the new attendance models’ accuracy also improves as the months go on, meaning that as the school year progresses, the models become more and more accurate.
As with any statistical technique, there is still uncertainty in the predictions; however, these models work well and can be trusted to help identify students who need academic or attendance support.
How well do our new models work compared to our previous models?
CAPE
The new CAPE model significantly outperformed the old model across all subjects, levels, and targets. For the binary targets, the largest gain is in recall, meaning that the new model catches significantly more students who will actually reach their targets. The most dramatic improvement is for Reached Level 4 in math and ELA, where ELA Level 4 recall improved by 21 percentage points and math Level 4 recall improved by 28 percentage points.
The new model also shows improvement in predicting CAPE scores. The RMSE (Root Mean Squared Error), which measures the typical size of the prediction error, was 20-24 points on average for the old model and 15-18 points on average for the new model. The MAE (Mean Absolute Error), which measures the average absolute error, also improved: the old model was off by 16-19 points on average, while the new model is off by 11-14 points on average.
Finally, the R-squared value also improved for the new model. R-squared measures the proportion of variance in test scores that the model explains; the higher the R-squared, the more accurate the predictions. The old model had an R-squared value of 61-65%, whereas the new model has a value of 81%, meaning the model can explain 81% of the score variance.
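For reference, the sketch below shows how these comparison metrics can be computed with scikit-learn; the scores passed in are purely illustrative, not real student data.

```python
# Sketch of the regression comparison metrics (RMSE, MAE, R-squared).
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

def score_regression_model(y_true, y_pred):
    """Summarize a score model the same way the comparison above does."""
    rmse = np.sqrt(mean_squared_error(y_true, y_pred))  # typical size of the error
    mae = mean_absolute_error(y_true, y_pred)           # average absolute error
    r2 = r2_score(y_true, y_pred)                       # share of variance explained
    return {"RMSE": rmse, "MAE": mae, "R2": r2}

# Illustrative numbers only:
print(score_regression_model(np.array([730, 745, 760]), np.array([725, 750, 755])))

# For the binary targets (Level 3+/4+, chronic absenteeism, truancy), AUC can be
# computed analogously with sklearn.metrics.roc_auc_score(y_true_binary, probabilities).
```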
Attendance
Comparing the performance of the old versus new attendance models shows that the new models are high performing, with a slight improvement in precision when predicting chronic absenteeism status. (The previous model did not predict truancy.) The new models gained some precision, which caused a slight drop in recall, meaning there are fewer false positives and the models are more reliable when they predict that a student will be chronically absent. Overall, the new models are better at distinguishing students who will and will not be chronically absent, with an AUC of 96.45% compared to 95.3% for the old model.
The new models also show a clear improvement in predicting attendance rate compared to the previous model. The RMSE for the old model was 0.19% on average, whereas the new model is off by only 0.10% on average. The MAE for the old model was 2.6% on average, while the new model is off by only 1.9%, meaning the predictions are 0.7 percentage points more accurate. Finally, for R-squared, the old model explained 82.5% of attendance rate variance, while the new model explains 89.5%, meaning the model explains 7 percentage points more of the variation in student attendance.
Takeaway
Predictive modeling equips educators with the ability to look ahead. By turning historical data into foresight, these models help schools make intentional, proactive decisions that support student growth, improve outcomes, and drive equity. This shifts the focus from what has happened to what can happen.
APPENDIX:
Feature Heatmaps
CAPE
ATTENDANCE