Explanation-Aware Optimization and AutoML (DEAP + SHAP Stability)
This project implements an AutoML framework that uses DEAP’s NSGA-II for multi-objective optimization, balancing model accuracy and SHAP-based stability.
It supports both classification and regression datasets, loaded through OpenML and scikit-learn.
All results are tracked with MLflow.
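NSGA-II ranks candidates by Pareto dominance rather than by a single score. The sketch below illustrates only that dominance idea for the two objectives named above (an error to minimize and a stability to maximize). It is not the project's code or DEAP's implementation, and the candidate values are made up for illustration.

```python
from typing import List, Tuple

# Each candidate is (error, stability): error is minimized, stability is maximized.
Candidate = Tuple[float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Illustrative values only, not results from this project.
candidates = [(0.20, 0.90), (0.15, 0.70), (0.25, 0.95), (0.22, 0.80), (0.15, 0.60)]
front = pareto_front(candidates)
print(front)
```

Here (0.22, 0.80) drops out because (0.20, 0.90) is better on both objectives, and (0.15, 0.60) drops out because (0.15, 0.70) matches its error with higher stability. The remaining three candidates are genuine trade-offs.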
1. Environment Setup (macOS / Linux)
Create and activate a virtual environment, then install the pinned dependencies:
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip wheel setuptools
pip install \
numpy==1.26.4 \
pandas==1.5.3 \
scikit-learn==1.3.2 \
shap==0.45.0 \
deap==1.4.1 \
openml==0.14.2 \
mlflow==2.11.3 \
matplotlib==3.7.5
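To confirm that the active environment matches the pins above, a small check along these lines may help. It is a sketch using only the standard library; the helper name `find_mismatches` is hypothetical and not part of this project.

```python
from importlib import metadata

# Versions copied from the pip install command above.
PINNED = {
    "numpy": "1.26.4",
    "pandas": "1.5.3",
    "scikit-learn": "1.3.2",
    "shap": "0.45.0",
    "deap": "1.4.1",
    "openml": "0.14.2",
    "mlflow": "2.11.3",
    "matplotlib": "3.7.5",
}


def find_mismatches(pins, get_version=metadata.version):
    """Return {package: installed_version_or_None} for every package that differs from its pin."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            mismatches[name] = found
    return mismatches


if __name__ == "__main__":
    problems = find_mismatches(PINNED)
    for name, found in problems.items():
        print(f"{name}: expected {PINNED[name]}, found {found}")
    if not problems:
        print("All pinned packages match.")
```

Run it inside the activated virtual environment. Any line it prints names a package that is missing or installed at a different version.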
2. Running Experiments
Classification: Adult Dataset
python run_deap.py \
--dataset adult \
--generations 5 \
--pop-size 24 \
--cv-folds 3
Regression: California Housing Dataset
python run_deap.py \
--dataset cal_housing \
--generations 5 \
--pop-size 24 \
--cv-folds 3
Results are saved under:
runs/<dataset>/pareto_front.csv
3. Viewing Results in MLflow
mlflow ui --backend-store-uri ./mlruns --host 0.0.0.0 --port 5000
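Then open http://localhost:5000 in a browser. Note that --host 0.0.0.0 makes the UI reachable from other machines on the network; use --host 127.0.0.1 to keep it local.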
In the UI you can visualize both objectives:
MSE-like score (lower is better)
SHAP stability (higher is better)
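This README does not define how SHAP stability is computed. As one plausible reading, the sketch below scores stability as the mean pairwise cosine similarity between per-fold feature-importance vectors (for example, the mean absolute SHAP value per feature in each cross-validation fold). Both the metric and the function names are assumptions for illustration, not the project's actual definition.

```python
from itertools import combinations
from math import sqrt
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def stability(fold_importances: Sequence[Sequence[float]]) -> float:
    """Mean pairwise cosine similarity across folds (assumed metric; higher is more stable)."""
    pairs = list(combinations(fold_importances, 2))
    if not pairs:
        raise ValueError("need importance vectors from at least two folds")
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Illustrative per-fold importance vectors, one value per feature.
consistent = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 2.0, 3.0]]
inconsistent = [[1.0, 0.0], [0.0, 1.0]]
print(stability(consistent), stability(inconsistent))
```

Under this assumed definition, folds that rank features in the same proportions score close to 1, while folds that attribute importance to entirely different features score close to 0, which matches the "higher is better" direction stated above.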