Upload files to "Figures"
BIN
Figures/Framework.png
Normal file
Binary file not shown. Size: 202 KiB
BIN
Figures/Model_complex_Opt.png
Normal file
Binary file not shown. Size: 858 KiB
23
Figures/README.md
Normal file
@@ -0,0 +1,23 @@
# Explanation-Aware Automated Machine Learning

This repository accompanies the research paper:

**“Multi-Objective Automated Machine Learning for Explainable Artificial Intelligence: Optimizing Predictive Accuracy and Shapley-Based Feature Stability.”**

In high-stakes domains such as agriculture, machine learning models must be not only accurate but also transparent and aligned with domain knowledge. This project presents a novel **multi-objective optimization framework** that jointly maximizes predictive performance and explanation stability. Specifically, we introduce a formal metric based on the **variance of SHapley Additive exPlanations (SHAP) values across cross-validation folds** and embed it directly into the model selection process.

Our approach uses the **Non-dominated Sorting Genetic Algorithm II (NSGA-II)** to evolve models that balance predictive accuracy with robust, semantically consistent explanations. When applied to potato yield prediction, the framework outperforms both **H2O.ai's Automatic Machine Learning platform** and traditional grid search, producing models that are both high-performing and interpretable.

---

## 🔍 Key Features

- Multi-objective optimization for predictive accuracy and explanation stability
- Shapley-based metric embedded into the model selection loop
- Implementation using NSGA-II for evolutionary search
- Reproducible case study in potato yield forecasting
- Baseline comparisons with grid search and H2O.ai’s platform
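
The two objectives listed above can be illustrated with a small, dependency-free sketch. It is not the repository's implementation: the aggregation used here (mean over features of the per-feature variance across folds) is an assumption about how a SHAP-variance metric might be computed, the function names `explanation_instability` and `pareto_front` are hypothetical, and all numbers are made up. The Pareto filter shows only the dominance comparison that NSGA-II builds on, not the full genetic algorithm.

```python
from statistics import mean, pvariance


def explanation_instability(fold_importances):
    """Mean over features of the variance across folds (lower = more stable).

    fold_importances holds one list per cross-validation fold; each inner
    list gives a per-feature importance, e.g. the mean absolute SHAP value.
    """
    per_feature = zip(*fold_importances)  # regroup: one tuple per feature
    return mean([pvariance(values) for values in per_feature])


def pareto_front(candidates):
    """Return names of candidates not dominated on (error, instability).

    A candidate is dominated if another one is at least as good on both
    objectives and strictly better on at least one. Lower is better.
    """
    front = []
    for name, err, inst in candidates:
        dominated = any(
            (e <= err and i <= inst) and (e < err or i < inst)
            for _, e, i in candidates
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical per-fold importances for three features over three folds.
stable = [[0.50, 0.30, 0.20], [0.52, 0.29, 0.19], [0.49, 0.31, 0.20]]
unstable = [[0.70, 0.10, 0.20], [0.20, 0.60, 0.20], [0.45, 0.35, 0.20]]

# (name, validation error, explanation instability); values are illustrative.
candidates = [
    ("model_a", 0.30, explanation_instability(stable)),
    ("model_b", 0.28, explanation_instability(unstable)),
    ("model_c", 0.35, explanation_instability(unstable)),
]

front = pareto_front(candidates)
```

In this toy setup `model_c` is dropped because `model_b` matches its instability with a lower error, while `model_a` and `model_b` both survive as a trade-off between accuracy and stability.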

---

## 📂 Repository Structure
42
Figures/background.txt
Normal file
@@ -0,0 +1,42 @@
https://gitlab.com/university-of-prince-edward-isalnd/explanation-aware-optimization-and-automl/-/tree/main/src?ref_type=heads
############################################################################################################################################################
Code File Structure

Shell scripts

h20_batch.sh ->
nsga_batch.sh ->
grid_search_batch.sh ->

############################################################################################################################################################
Code Changes:

- SHAP KernelExplainer
  Use shap.TreeExplainer on tree-based models instead.

- AutoML search size
  Reduce max_models or max_runtime_secs per fold, or pre-select the algorithms to search.

- Data transformations
  Cache intermediate NumPy arrays to skip repeated fit_transform calls in each fold.

- Parallel folds
  If the CPU has many cores, parallelize the K-fold loop with joblib.Parallel to make full use of them.
############################################################################################################################################################
Notes

- The Slurm headers request 4 cores per task and 10 GB of RAM.
  This is a conservative allocation, so the jobs do not need a cloud-computing environment to run.

- The three jobs run with a time limit of 11 hours. Considering average Compute Canada / AceNet servers (approx. 2.5 GHz CPUs),
  allocate a time limit of at least 5 hours to run on an Intel Core i5-13600KF system (assuming no hyperthreading and E-core processing).

- H2O AutoML supports GPU compute using CUDA libraries. A CUDA-accelerated GPU may yield performance gains for this computation.
BIN
Figures/features_heatmap.png
Normal file
Binary file not shown. Size: 833 KiB