Training Dataset Preprocessing
Example commands for preprocessing training datasets. Running preprocessing once and training on the resulting static dataset cuts processing time, because the data does not have to be preprocessed again at every step.
Disclaimer: If you already have pre-trained model weights, skip to Inference Command and Code configuration.
Before running the commands, update a YAML file with the specific paths and parameters needed for training.
After setting up the YAML file, we can run our commands:
cd <path_to_SIT_FUSE>/src/sit_fuse/datasets/
# Can be run outside of the repo via command line or in a script as well
python3 sf_dataset.py -y ../config/<folder>/<yaml_file>
# E.g. replace <folder>/<yaml_file> with model/emas_fire_dbn_multi_layer_pl.yaml
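Since the preprocessing step can also be launched from a script, the following is a minimal sketch of one way to wrap the command above in Python. The function names `build_preprocess_command` and `run_preprocessing` are hypothetical helpers, not part of SIT_FUSE. The sketch assumes the repository layout implied by the `cd` command, uses only the `-y` flag shown above, and calls the current interpreter instead of a hard-coded `python3`.

```python
import subprocess
import sys
from pathlib import Path


def build_preprocess_command(repo_root, yaml_path):
    """Return the argument list for the sf_dataset.py call shown above.

    repo_root is the local SIT_FUSE checkout (an assumption about your setup).
    """
    script = Path(repo_root) / "src" / "sit_fuse" / "datasets" / "sf_dataset.py"
    # -y is the only flag used here because it is the only one shown above.
    return [sys.executable, str(script), "-y", str(yaml_path)]


def run_preprocessing(repo_root, yaml_path):
    """Run preprocessing once; raise if the config is missing or the script fails."""
    # Resolve to an absolute path because the working directory changes below.
    yaml_path = Path(yaml_path).resolve()
    if not yaml_path.is_file():
        raise FileNotFoundError(f"YAML config not found: {yaml_path}")
    cmd = build_preprocess_command(repo_root, yaml_path)
    # Assumption: running from the datasets directory mirrors the `cd` step above.
    datasets_dir = Path(repo_root) / "src" / "sit_fuse" / "datasets"
    subprocess.run(cmd, cwd=datasets_dir, check=True)
```

A call would look like `run_preprocessing("<path_to_SIT_FUSE>", "<path_to_yaml>")`, with both placeholders replaced by real paths. Checking for the YAML file first gives a clear error before any processing starts, and `check=True` surfaces a non-zero exit from the script as an exception.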
Workstreams that use classic DBNs, PCA-based encoding, or no encoder at all would use vector-based samples.