Consistency DANN¶
import rul_datasets
import rul_adapt
import pytorch_lightning as pl
import omegaconf
Reproduce original configurations¶
You can reproduce the original experiments by Siahpour et al. by using the get_consistency_dann
constructor function.
Known differences to the original paper are:

- the `consistency_factor` is set to 1.0 because its real value is not mentioned in the paper
- the raw vibration data of XJTU-SY is preprocessed by extracting the standard deviation of each window because the given architecture could not handle the raw data (see the sketch below)
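For illustration, here is a minimal sketch of that kind of window preprocessing. It is not the library's exact implementation, and the window shape is an assumption:

import numpy as np

def std_per_window(windows: np.ndarray) -> np.ndarray:
    """Reduce each raw vibration window to one feature per channel.

    Assumes windows are shaped (num_windows, channels, samples_per_window).
    """
    return windows.std(axis=-1)  # standard deviation over each window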
Additional kwargs for the trainers, e.g. `accelerator="gpu"` for training on a GPU, can be passed to the function as dictionaries. The first dictionary is used for the pre-training trainer and the second one for the main trainer.
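For example, a call that configures both trainers for a GPU might look like this (the epoch counts are placeholders, not values from the paper):

pre_training, main_training = rul_adapt.construct.get_consistency_dann(
    "cmapss", 3, 1,
    {"max_epochs": 10, "accelerator": "gpu"},  # kwargs for the pre-training trainer
    {"max_epochs": 10, "accelerator": "gpu"},  # kwargs for the main trainer
)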
pl.seed_everything(42, workers=True) # makes it reproducible
pre_training, main_training = rul_adapt.construct.get_consistency_dann(
"cmapss", 3, 1, {"max_epochs": 1}, {"max_epochs": 1}
)
Global seed set to 42
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
The function returns two tuples. The first contains everything needed for pre-training, the second everything needed for the main training.
pre_dm, pre_approach, pre_trainer = pre_training
pre_trainer.fit(pre_approach, pre_dm)
  | Name               | Type               | Params
----------------------------------------------------------
0 | train_loss         | MeanSquaredError   | 0
1 | val_loss           | MeanSquaredError   | 0
2 | test_loss          | MeanSquaredError   | 0
3 | evaluator          | AdaptionEvaluator  | 0
4 | _feature_extractor | CnnExtractor       | 3.3 K
5 | _regressor         | FullyConnectedHead | 221
----------------------------------------------------------
3.5 K     Trainable params
0         Non-trainable params
3.5 K     Total params
0.014     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=1` reached.
After pre-training, we can use the pre-trained networks to initialize the main training. The networks of the pre-training approach, i.e. `feature_extractor` and `regressor`, can be accessed as properties.
dm, approach, domain_disc, trainer = main_training
approach.set_model(pre_approach.feature_extractor, pre_approach.regressor, domain_disc)
trainer.fit(approach, dm)
trainer.test(approach, dm)
  | Name                     | Type                  | Params
-------------------------------------------------------------------
0 | train_source_loss        | MeanSquaredError      | 0
1 | consistency_loss         | ConsistencyLoss       | 0
2 | evaluator                | AdaptionEvaluator     | 0
3 | _feature_extractor       | CnnExtractor          | 3.3 K
4 | _regressor               | FullyConnectedHead    | 221
5 | dann_loss                | DomainAdversarialLoss | 21
6 | frozen_feature_extractor | CnnExtractor          | 3.3 K
-------------------------------------------------------------------
3.5 K     Trainable params
3.3 K     Non-trainable params
6.8 K     Total params
0.027     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=1` reached.
────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0            DataLoader 1
────────────────────────────────────────────────────────────────────
    test/source/rmse        83.57210540771484
    test/source/score          371325.28125
    test/target/rmse                                84.60678100585938
    test/target/score                                 342092.78125
────────────────────────────────────────────────────────────────────
[{'test/source/rmse/dataloader_idx_0': 83.57210540771484, 'test/source/score/dataloader_idx_0': 371325.28125}, {'test/target/rmse/dataloader_idx_1': 84.60678100585938, 'test/target/score/dataloader_idx_1': 342092.78125}]
If you only want to see the hyperparameters, you can use the `get_consistency_dann_config` function. This returns an `omegaconf.DictConfig` which you can modify. Afterwards, you can pass the config to `consistency_dann_from_config` to receive the training-ready approach.
cmapss_three2one_config = rul_adapt.construct.get_consistency_dann_config("cmapss", 3, 1)
print(omegaconf.OmegaConf.to_yaml(cmapss_three2one_config, resolve=True))
dm:
  source:
    _target_: rul_datasets.CmapssReader
    fd: 3
    window_size: 20
  target:
    fd: 1
    percent_broken: 1.0
  kwargs:
    batch_size: 128
feature_extractor:
  _convert_: all
  _target_: rul_adapt.model.CnnExtractor
  input_channels: 14
  units:
  - 32
  - 16
  - 1
  seq_len: 20
  fc_units: 20
  dropout: 0.5
  fc_dropout: 0.5
regressor:
  _convert_: all
  _target_: rul_adapt.model.FullyConnectedHead
  input_channels: 20
  act_func_on_last_layer: false
  units:
  - 10
  - 1
domain_disc:
  _convert_: all
  _target_: rul_adapt.model.FullyConnectedHead
  input_channels: 20
  act_func_on_last_layer: false
  units:
  - 1
consistency_pre:
  _target_: rul_adapt.approach.SupervisedApproach
  lr: 0.0001
  loss_type: rmse
  optim_type: sgd
consistency:
  _target_: rul_adapt.approach.ConsistencyApproach
  consistency_factor: 1.0
  max_epochs: 3000
  lr: 1.0e-05
  optim_type: sgd
trainer_pre:
  _target_: pytorch_lightning.Trainer
  max_epochs: 1000
trainer:
  _target_: pytorch_lightning.Trainer
  max_epochs: 3000
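As a quick sketch, you could, for instance, lower the batch size before constructing the approach. This assumes, as the text above suggests, that `consistency_dann_from_config` accepts the modified config directly and returns the same pre-training and main-training tuples as `get_consistency_dann`:

cmapss_three2one_config = rul_adapt.construct.get_consistency_dann_config("cmapss", 3, 1)
cmapss_three2one_config.dm.kwargs.batch_size = 64  # modify the config in place
pre_training, main_training = rul_adapt.construct.consistency_dann_from_config(
    cmapss_three2one_config
)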
Run your own experiments¶
You can use the Consistency DANN implementation to run your own experiments with different hyperparameters or on different datasets. Here we build an approach with an LSTM feature extractor.
source = rul_datasets.CmapssReader(3)
target = source.get_compatible(1, percent_broken=0.8)
pre_dm = rul_datasets.RulDataModule(source, batch_size=32)
dm = rul_datasets.DomainAdaptionDataModule(
    pre_dm, rul_datasets.RulDataModule(target, batch_size=32),
)
feature_extractor = rul_adapt.model.LstmExtractor(
    input_channels=14,
    units=[16],
    fc_units=8,
)
regressor = rul_adapt.model.FullyConnectedHead(
    input_channels=8,
    units=[8, 1],
    act_func_on_last_layer=False,
)
domain_disc = rul_adapt.model.FullyConnectedHead(
    input_channels=8,
    units=[8, 1],
    act_func_on_last_layer=False,
)

pre_approach = rul_adapt.approach.SupervisedApproach(
    lr=0.001, loss_type="rmse", optim_type="sgd"
)
pre_approach.set_model(feature_extractor, regressor)

pre_trainer = pl.Trainer(max_epochs=1)
pre_trainer.fit(pre_approach, pre_dm)

approach = rul_adapt.approach.ConsistencyApproach(
    consistency_factor=1.0, lr=0.001, max_epochs=1
)
approach.set_model(
    pre_approach.feature_extractor, pre_approach.regressor, domain_disc
)

trainer = pl.Trainer(max_epochs=1)
trainer.fit(approach, dm)
trainer.test(approach, dm)
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name               | Type               | Params
----------------------------------------------------------
0 | train_loss         | MeanSquaredError   | 0
1 | val_loss           | MeanSquaredError   | 0
2 | test_loss          | MeanSquaredError   | 0
3 | evaluator          | AdaptionEvaluator  | 0
4 | _feature_extractor | LstmExtractor      | 2.2 K
5 | _regressor         | FullyConnectedHead | 81
----------------------------------------------------------
2.3 K     Trainable params
0         Non-trainable params
2.3 K     Total params
0.009     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=1` reached.

GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name                     | Type                  | Params
-------------------------------------------------------------------
0 | train_source_loss        | MeanSquaredError      | 0
1 | consistency_loss         | ConsistencyLoss       | 0
2 | evaluator                | AdaptionEvaluator     | 0
3 | _feature_extractor       | LstmExtractor         | 2.2 K
4 | _regressor               | FullyConnectedHead    | 81
5 | dann_loss                | DomainAdversarialLoss | 81
6 | frozen_feature_extractor | LstmExtractor         | 2.2 K
-------------------------------------------------------------------
2.3 K     Trainable params
2.2 K     Non-trainable params
4.5 K     Total params
0.018     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=1` reached.
────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0            DataLoader 1
────────────────────────────────────────────────────────────────────
    test/source/rmse        18.09880828857422
    test/source/score       1549.2022705078125
    test/target/rmse                                22.494943618774414
    test/target/score                               814.8432006835938
────────────────────────────────────────────────────────────────────
[{'test/source/rmse/dataloader_idx_0': 18.09880828857422, 'test/source/score/dataloader_idx_0': 1549.2022705078125}, {'test/target/rmse/dataloader_idx_1': 22.494943618774414, 'test/target/score/dataloader_idx_1': 814.8432006835938}]
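After training, the approach can be used like any other LightningModule for inference. The following is a minimal sketch only; the channel-first input shape of (batch, channels, window_size) and the forward pass chaining feature extractor and regressor are assumptions, not confirmed by this example:

import torch

# Hypothetical batch: 16 windows, 14 channels, window size 30 (the CMAPSS
# reader's default); in practice, take real windows from the data module.
dummy_windows = torch.randn(16, 14, 30)

approach.eval()
with torch.no_grad():
    rul_preds = approach(dummy_windows)  # one RUL estimate per window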