https://gitlab.com/university-of-prince-edward-isalnd/explanation-aware-optimization-and-automl/-/tree/main/src?ref_type=heads
############################################################################################################################################################
Code File Structure

Shell scripts

h20_batch.sh ->
nsga_batch.sh ->
grid_search_batch.sh ->
############################################################################################################################################################
Code Changes:

- SHAP KernelExplainer
  Use shap.TreeExplainer on tree-based models instead; it is typically much faster than the model-agnostic KernelExplainer
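A minimal sketch of the explainer swap, assuming a scikit-learn tree ensemble. The synthetic data, the RandomForestRegressor and the helper names (fit_toy_forest, tree_shap_values) are illustrative stand-ins, not the repository's own model or code.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def fit_toy_forest(n_samples=200, n_features=4, seed=0):
    """Fit a small random forest on synthetic data (stand-in for the real model)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_features))
    y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=n_samples)
    model = RandomForestRegressor(n_estimators=20, random_state=seed).fit(X, y)
    return model, X


def tree_shap_values(model, X):
    """Compute SHAP values with TreeExplainer rather than KernelExplainer."""
    import shap  # imported lazily so the rest of the module loads without shap

    explainer = shap.TreeExplainer(model)
    return explainer.shap_values(X)
```

Calling tree_shap_values(model, X[:10]) on the toy forest should give one attribution per sample and feature. TreeExplainer only applies to tree-based models, so any non-tree model in the search would still need a different explainer.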
- AutoML search size
  Reduce max_models or max_runtime_secs per fold, or pre-select algorithms (for example with H2O AutoML's include_algos)
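A sketch of a smaller search budget, assuming the AutoML in question is H2O AutoML (as h20_batch.sh suggests). The numeric limits, the algorithm list and the helper names are hypothetical and would need tuning against the real runs.

```python
def build_automl_settings(max_models=10, max_runtime_secs=600,
                          include_algos=("GBM", "DRF", "GLM"), seed=1):
    """Collect a reduced per-fold search budget (all values are assumptions)."""
    return {
        "max_models": max_models,              # cap on the number of base models
        "max_runtime_secs": max_runtime_secs,  # wall-clock cap for one AutoML run
        "include_algos": list(include_algos),  # restrict the search to these families
        "seed": seed,
    }


def train_automl(train_frame, target, settings):
    """Run H2O AutoML on an H2OFrame with the reduced budget."""
    from h2o.automl import H2OAutoML  # lazy import: needs h2o and a running cluster

    aml = H2OAutoML(**settings)
    aml.train(y=target, training_frame=train_frame)
    return aml
```

When both max_models and max_runtime_secs are set, the run stops at whichever limit is reached first, so lowering either one shortens each fold.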
- Data transformations
  Cache intermediate NumPy arrays to skip repeated fit_transform calls in each fold
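One way the caching could look, as a sketch only: the transformer is fitted once per fold and the resulting arrays are reused for every candidate model. StandardScaler stands in for whatever transformations the code actually applies, and cache_fold_arrays and evaluate are hypothetical helper names.

```python
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler


def cache_fold_arrays(X, y, n_splits=5, seed=0):
    """Fit the transformer once per fold and keep the transformed arrays."""
    cache = []
    splitter = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for train_idx, test_idx in splitter.split(X):
        scaler = StandardScaler()
        # Fit on the training part only so no test-fold statistics leak in.
        X_train = scaler.fit_transform(X[train_idx])
        X_test = scaler.transform(X[test_idx])
        cache.append((X_train, y[train_idx], X_test, y[test_idx]))
    return cache


def evaluate(model, cache):
    """Score one candidate on the cached folds without refitting the transformer."""
    scores = [
        clone(model).fit(X_train, y_train).score(X_test, y_test)
        for X_train, y_train, X_test, y_test in cache
    ]
    return float(np.mean(scores))
```

The cache here lives in memory; if it has to survive across separate batch jobs, the arrays could instead be written with np.save and reloaded with np.load.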
- Parallel folds
  If the CPU has many cores, parallelize the K-fold loop with joblib.Parallel to make full use of them
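A sketch of the parallel K-fold loop, assuming each fold can be fitted and scored independently. The helper names are hypothetical, any scikit-learn style estimator can be passed in, and the default n_jobs=4 simply mirrors the 4 cores per task in the Slurm headers.

```python
import numpy as np
from joblib import Parallel, delayed
from sklearn.base import clone
from sklearn.model_selection import KFold


def score_fold(model, X, y, train_idx, test_idx):
    """Fit a fresh copy of the model on one fold and return its test score."""
    fitted = clone(model).fit(X[train_idx], y[train_idx])
    return fitted.score(X[test_idx], y[test_idx])


def parallel_cv_scores(model, X, y, n_splits=5, n_jobs=4, seed=0):
    """Run the K-fold loop with one joblib task per fold."""
    splitter = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    return Parallel(n_jobs=n_jobs)(
        delayed(score_fold)(model, X, y, train_idx, test_idx)
        for train_idx, test_idx in splitter.split(X)
    )
```

If the model being fitted is itself multi-threaded, running folds in parallel can oversubscribe the cores, so n_jobs may need to be lowered in that case.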
############################################################################################################################################################
Notes
- The Slurm headers indicate that the programs should be run on a system with 4 cores per task and 10 GB of RAM.
  This is quite conservative, so the jobs do not need to be directed to a cloud-computing environment to run
- The three jobs run with a time limit of 11 hours. Given that average Compute Canada / AceNet servers use approx. 2.5 GHz CPUs,
  allocate a time limit of at least 5 hours when running on a 13600KF system (assuming neither hyperthreading nor E-cores are used)
- H2O AutoML supports GPU compute using CUDA libraries. A CUDA-accelerated GPU may yield performance gains for this computation