This project implements an AutoML framework that uses DEAP’s NSGA-II for multi-objective optimization, balancing model accuracy and SHAP-based stability.
It supports both classification and regression datasets, loaded via OpenML and scikit-learn.
All results are tracked with MLflow.
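
To make the accuracy/stability trade-off concrete, here is a minimal pure-Python sketch of Pareto dominance over two objectives: an error score to minimize and a stability score to maximize. It is not DEAP's NSGA-II implementation and not this project's code; the names `dominates` and `pareto_front` and the sample numbers are illustrative assumptions.

```python
from typing import List, Tuple

# Each candidate is (error, stability): error is minimized, stability is maximized.
Candidate = Tuple[float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Illustrative (made-up) scores, not results from this project.
population = [(0.20, 0.90), (0.25, 0.95), (0.30, 0.80), (0.20, 0.85)]
front = pareto_front(population)
print(front)
```

NSGA-II ranks a population by repeatedly extracting such non-dominated fronts, so no single "best" model is returned; the output is a set of trade-offs.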

1. Environment Setup (macOS / Linux)

Create and activate a virtual environment, then install the pinned dependencies

python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip wheel setuptools

pip install \
  numpy==1.26.4 \
  pandas==1.5.3 \
  scikit-learn==1.3.2 \
  shap==0.45.0 \
  deap==1.4.1 \
  openml==0.14.2 \
  mlflow==2.11.3 \
  matplotlib==3.7.5
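
After installing, it can help to confirm that the environment matches the pins. The following is a small sketch using only the standard library; the helper name `find_mismatches` is an assumption for illustration, and the pins are copied from the command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Optional

# Pins copied from the pip install command above.
PINS = {
    "numpy": "1.26.4",
    "pandas": "1.5.3",
    "scikit-learn": "1.3.2",
    "shap": "0.45.0",
    "deap": "1.4.1",
    "openml": "0.14.2",
    "mlflow": "2.11.3",
    "matplotlib": "3.7.5",
}


def find_mismatches(pins: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Map each package that differs from its pin to the installed version (None if missing)."""
    mismatches: Dict[str, Optional[str]] = {}
    for name, wanted in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = installed
    return mismatches


# Report every package that is missing or at a different version.
for name, installed in find_mismatches(PINS).items():
    print(f"{name}: expected {PINS[name]}, found {installed or 'not installed'}")
```

If the loop prints nothing, every pinned package is installed at the expected version.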

2. Running Experiments

Classification: Adult Dataset

python run_deap.py \
  --dataset adult \
  --generations 5 \
  --pop-size 24 \
  --cv-folds 3

Regression: California Housing Dataset

python run_deap.py \
  --dataset cal_housing \
  --generations 5 \
  --pop-size 24 \
  --cv-folds 3
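
To run both experiments back to back, a small driver script is one option. This is a sketch, assuming it is launched from the directory that contains `run_deap.py`; the helper names `build_command` and `run_all` are hypothetical, and only the flags shown above are used.

```python
import subprocess
import sys
from typing import List


def build_command(dataset: str, generations: int = 5,
                  pop_size: int = 24, cv_folds: int = 3) -> List[str]:
    """Build the run_deap.py argument list from the flags shown above."""
    return [
        sys.executable, "run_deap.py",
        "--dataset", dataset,
        "--generations", str(generations),
        "--pop-size", str(pop_size),
        "--cv-folds", str(cv_folds),
    ]


def run_all(datasets: List[str]) -> None:
    """Run each experiment in turn; raises CalledProcessError on the first failure."""
    for dataset in datasets:
        subprocess.run(build_command(dataset), check=True)


# Print the commands without executing them.
for name in ["adult", "cal_housing"]:
    print(" ".join(build_command(name)))
```

Calling `run_all(["adult", "cal_housing"])` would then execute the two runs sequentially with the same settings as the commands above.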

Results are saved under:

runs/<dataset>/pareto_front.csv
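
The saved front can be inspected with a few lines of Python. The sketch below assumes that `<dataset>` in the path is the value passed to `--dataset`; the column names of `pareto_front.csv` are not documented here, so the code reads whatever header the file has. The helpers `load_pareto_front` and `describe` are hypothetical names.

```python
import csv
from pathlib import Path
from typing import Dict, List


def load_pareto_front(dataset: str, root: str = "runs") -> List[Dict[str, str]]:
    """Read <root>/<dataset>/pareto_front.csv into a list of row dictionaries."""
    path = Path(root) / dataset / "pareto_front.csv"
    with path.open(newline="") as handle:
        return list(csv.DictReader(handle))


def describe(rows: List[Dict[str, str]]) -> str:
    """Summarize the row count and the column names found in the file."""
    columns = list(rows[0].keys()) if rows else []
    return f"{len(rows)} rows, columns: {columns}"
```

For example, after the Adult run finishes, `print(describe(load_pareto_front("adult")))` would list how many solutions are on the front and which columns were written.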

3. Viewing Results in MLflow

mlflow ui --backend-store-uri ./mlruns --host 0.0.0.0 --port 5000

Then open: http://localhost:5000

You can visualize both objectives:

MSE-like score (lower is better)
SHAP stability (higher is better)
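
This README does not define how SHAP stability is computed. As one plausible reading, the sketch below scores stability as the mean pairwise cosine similarity between per-fold feature-importance vectors (for instance, mean absolute SHAP values per feature in each cross-validation fold). The metric, the function names, and the input layout are all assumptions for illustration, not the project's implementation.

```python
import math
from itertools import combinations
from typing import List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def stability_score(fold_importances: List[Sequence[float]]) -> float:
    """Mean pairwise cosine similarity across folds (assumed metric; higher is more stable)."""
    pairs = list(combinations(fold_importances, 2))
    if not pairs:
        raise ValueError("need importance vectors from at least two folds")
    return sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)
```

Under this definition, a model whose feature ranking barely changes between folds scores close to 1, which matches the "higher is better" direction stated above.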

Explanation-Aware Optimization and AutoML (DEAP + SHAP Stability)
