# Explanation-Aware Optimization and AutoML (DEAP + SHAP Stability)
This project implements an **AutoML framework** that uses **DEAP's NSGA-II** for multi-objective optimization, balancing **model accuracy** and **SHAP-based stability**.
It supports both **classification** and **regression** datasets loaded via OpenML and scikit-learn.
All results are tracked with **MLflow**.
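
To make "balancing" two objectives concrete, here is a minimal sketch of Pareto dominance, the comparison that NSGA-II's non-dominated sorting relies on. It is not this project's code: the `(error, stability)` tuple layout, the function names, and the sample values are assumptions made for illustration, and NSGA-II's crowding-distance step is left out.

```python
from typing import List, Tuple

# Assumed layout: (error, stability). Error is minimized, stability is maximized.
Candidate = Tuple[float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if `a` is no worse than `b` on both objectives
    and strictly better on at least one of them."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


if __name__ == "__main__":
    # Made-up (error, stability) pairs, for illustration only.
    models = [(0.10, 0.60), (0.12, 0.90), (0.15, 0.85), (0.10, 0.55)]
    print(pareto_front(models))
```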
---
## 1. Environment Setup (macOS / Linux)
### Create and activate a virtual environment
```bash
# Create an isolated environment and activate it
python3 -m venv .venv
source .venv/bin/activate
# Upgrade the packaging tools, then install the pinned dependencies
pip install --upgrade pip wheel setuptools
pip install \
  numpy==1.26.4 \
  pandas==1.5.3 \
  scikit-learn==1.3.2 \
  shap==0.45.0 \
  deap==1.4.1 \
  openml==0.14.2 \
  mlflow==2.11.3 \
  matplotlib==3.7.5
```
## 2. Running Experiments
### Classification: Adult Dataset
```bash
python run_deap.py \
--dataset adult \
--generations 5 \
--pop-size 24 \
--cv-folds 3
```
### Regression: California Housing Dataset
```bash
python run_deap.py \
--dataset cal_housing \
--generations 5 \
--pop-size 24 \
--cv-folds 3
```
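Both commands pass the same four flags. As a rough sketch of how such a command line could be parsed with `argparse`, consider the following; it is hypothetical and not taken from `run_deap.py`, and the help texts, the integer types, and the choice to make every flag required are assumptions.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the four flags used in the commands above."""
    parser = argparse.ArgumentParser(
        description="Hypothetical sketch of the run_deap.py command line"
    )
    parser.add_argument("--dataset", required=True,
                        help="dataset name, e.g. adult or cal_housing")
    parser.add_argument("--generations", type=int, required=True,
                        help="number of NSGA-II generations")
    parser.add_argument("--pop-size", type=int, required=True,
                        help="number of individuals per generation")
    parser.add_argument("--cv-folds", type=int, required=True,
                        help="number of cross-validation folds")
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse `argv`, or sys.argv[1:] when `argv` is None."""
    return build_parser().parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    # argparse turns --pop-size into args.pop_size and --cv-folds into args.cv_folds.
    print(args.dataset, args.generations, args.pop_size, args.cv_folds)
```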
Results are saved under:
```bash
runs/<dataset>/pareto_front.csv
```
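The saved front can be inspected with the standard library alone. The sketch below assumes the CSV has a header row and that the column passed as `sort_by` holds numeric values; this README does not list the actual column names, so the `error` column in the usage lines is a placeholder.

```python
import csv
from typing import Dict, List


def load_pareto_front(path: str, sort_by: str) -> List[Dict[str, str]]:
    """Read a CSV with a header row and return its rows sorted
    in ascending order of the numeric column `sort_by`."""
    with open(path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    return sorted(rows, key=lambda row: float(row[sort_by]))


if __name__ == "__main__":
    # "error" is a placeholder column name; check the CSV header first.
    for row in load_pareto_front("runs/adult/pareto_front.csv", sort_by="error"):
        print(row)
```

Values are returned as strings, exactly as `csv.DictReader` yields them, so convert them with `float()` before doing arithmetic.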
## 3. Viewing Results in MLflow
```bash
mlflow ui --backend-store-uri ./mlruns --host 0.0.0.0 --port 5000
```
Then open: http://localhost:5000
You can visualize:
- MSE-like score (lower is better)
- SHAP stability (higher is better)
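
This README does not say how the SHAP stability score is computed. One plausible definition, shown here purely as an assumption, is the mean pairwise cosine similarity between per-fold feature-importance vectors (for example, the mean absolute SHAP value of each feature in each cross-validation fold). The function names and sample numbers are hypothetical.

```python
import math
from itertools import combinations
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def importance_stability(fold_importances: Sequence[Sequence[float]]) -> float:
    """Average cosine similarity over every pair of folds.
    Higher means the folds agree more on which features matter."""
    pairs = list(combinations(fold_importances, 2))
    if not pairs:
        raise ValueError("need importance vectors from at least two folds")
    return sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)


if __name__ == "__main__":
    # Made-up per-fold importance vectors for three features.
    folds = [[0.5, 0.3, 0.2], [0.6, 0.3, 0.1], [0.4, 0.4, 0.2]]
    print(importance_stability(folds))
```

With non-negative importance vectors this score lies between 0 and 1, which matches the "higher is better" reading above.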