Consistency DANN¶
import rul_datasets
import rul_adapt
import pytorch_lightning as pl
import omegaconf
Reproduce original configurations¶
You can reproduce the original experiments by Siahpour et al. by using the get_consistency_dann constructor function.
Known differences to the original paper are:
- the
consistency_factoris set to 1.0 because the real value is not mentioned in the paper - the raw vibration data of XJTU-SY is preprocessed by extracting the standard deviation from each window because the given architecture could not handle the raw data
Additional kwargs for the trainer, e.g. accelerator="gpu" for training on a GPU, can be passed to the function as a dictionary.
The first dictionary is used for the pre-training trainer and the second one for the main trainer.
pl.seed_everything(42, workers=True) # makes it reproducible
pre_training, main_training = rul_adapt.construct.get_consistency_dann(
"cmapss", 3, 1, {"max_epochs": 1}, {"max_epochs": 1}
)
Global seed set to 42 GPU available: False, used: False TPU available: False, using: 0 TPU cores IPU available: False, using: 0 IPUs HPU available: False, using: 0 HPUs GPU available: False, used: False TPU available: False, using: 0 TPU cores IPU available: False, using: 0 IPUs HPU available: False, using: 0 HPUs
The function returns two tuples. The first contains everything needed for pre-training, the second everything needed for the main training.
pre_dm, pre_approach, pre_trainer = pre_training
pre_trainer.fit(pre_approach, pre_dm)
| Name | Type | Params ---------------------------------------------------------- 0 | train_loss | MeanSquaredError | 0 1 | val_loss | MeanSquaredError | 0 2 | test_loss | MeanSquaredError | 0 3 | evaluator | AdaptionEvaluator | 0 4 | _feature_extractor | CnnExtractor | 3.3 K 5 | _regressor | FullyConnectedHead | 221 ---------------------------------------------------------- 3.5 K Trainable params 0 Non-trainable params 3.5 K Total params 0.014 Total estimated model params size (MB)
Sanity Checking: 0it [00:00, ?it/s]
Training: 0it [00:00, ?it/s]
Validation: 0it [00:00, ?it/s]
`Trainer.fit` stopped: `max_epochs=1` reached.
After pre-training, we can use the pre-trained networks to initialize the main training.
The networks of the pre-training approach, i.e. feature_extractor and regressor, can be accessed as properties.
dm, approach, domain_disc, trainer = main_training
approach.set_model(pre_approach.feature_extractor, pre_approach.regressor, domain_disc)
trainer.fit(approach, dm)
trainer.test(approach, dm)
| Name | Type | Params ------------------------------------------------------------------- 0 | train_source_loss | MeanSquaredError | 0 1 | consistency_loss | ConsistencyLoss | 0 2 | evaluator | AdaptionEvaluator | 0 3 | _feature_extractor | CnnExtractor | 3.3 K 4 | _regressor | FullyConnectedHead | 221 5 | dann_loss | DomainAdversarialLoss | 21 6 | frozen_feature_extractor | CnnExtractor | 3.3 K ------------------------------------------------------------------- 3.5 K Trainable params 3.3 K Non-trainable params 6.8 K Total params 0.027 Total estimated model params size (MB)
Sanity Checking: 0it [00:00, ?it/s]
Training: 0it [00:00, ?it/s]
Validation: 0it [00:00, ?it/s]
`Trainer.fit` stopped: `max_epochs=1` reached.
Testing: 0it [00:00, ?it/s]
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Test metric DataLoader 0 DataLoader 1
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
test/source/rmse 83.57210540771484
test/source/score 371325.28125
test/target/rmse 84.60678100585938
test/target/score 342092.78125
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
[{'test/source/rmse/dataloader_idx_0': 83.57210540771484,
'test/source/score/dataloader_idx_0': 371325.28125},
{'test/target/rmse/dataloader_idx_1': 84.60678100585938,
'test/target/score/dataloader_idx_1': 342092.78125}]
If you only want to see the hyperparameters, you can use the get_consistency_dann_config function.
This returns an omegaconf.DictConfig which you can modify.
Afterwards, you can pass the config to consistency_dann_from_config to receive the training-ready approach.
cmapss_three2one_config = rul_adapt.construct.get_consistency_dann_config("cmapss", 3, 1)
print(omegaconf.OmegaConf.to_yaml(cmapss_three2one_config, resolve=True))
dm:
source:
_target_: rul_datasets.CmapssReader
fd: 3
window_size: 20
target:
fd: 1
percent_broken: 1.0
kwargs:
batch_size: 128
feature_extractor:
_convert_: all
_target_: rul_adapt.model.CnnExtractor
input_channels: 14
units:
- 32
- 16
- 1
seq_len: 20
fc_units: 20
dropout: 0.5
fc_dropout: 0.5
regressor:
_convert_: all
_target_: rul_adapt.model.FullyConnectedHead
input_channels: 20
act_func_on_last_layer: false
units:
- 10
- 1
domain_disc:
_convert_: all
_target_: rul_adapt.model.FullyConnectedHead
input_channels: 20
act_func_on_last_layer: false
units:
- 1
consistency_pre:
_target_: rul_adapt.approach.SupervisedApproach
lr: 0.0001
loss_type: rmse
optim_type: sgd
consistency:
_target_: rul_adapt.approach.ConsistencyApproach
consistency_factor: 1.0
max_epochs: 3000
lr: 1.0e-05
optim_type: sgd
trainer_pre:
_target_: pytorch_lightning.Trainer
max_epochs: 1000
trainer:
_target_: pytorch_lightning.Trainer
max_epochs: 3000
Run your own experiments¶
You can use the Consistency DANN implementation to run your own experiments with different hyperparameters or on different datasets. Here we build an approach with an LSTM feature extractor.
source = rul_datasets.CmapssReader(3)
target = source.get_compatible(1, percent_broken=0.8)
pre_dm = rul_datasets.RulDataModule(source, batch_size=32)
dm = rul_datasets.DomainAdaptionDataModule(
pre_dm, rul_datasets.RulDataModule(target, batch_size=32),
)
feature_extractor = rul_adapt.model.LstmExtractor(
input_channels=14,
units=[16],
fc_units=8,
)
regressor = rul_adapt.model.FullyConnectedHead(
input_channels=8,
units=[8, 1],
act_func_on_last_layer=False,
)
domain_disc = rul_adapt.model.FullyConnectedHead(
input_channels=8,
units=[8, 1],
act_func_on_last_layer=False,
)
pre_approach = rul_adapt.approach.SupervisedApproach(
lr=0.001, loss_type="rmse", optim_type="sgd"
)
pre_approach.set_model(feature_extractor, regressor)
pre_trainer = pl.Trainer(max_epochs=1)
trainer.fit(pre_approach, pre_dm)
approach = rul_adapt.approach.ConsistencyApproach(
consistency_factor=1.0, lr=0.001, max_epochs=1
)
approach.set_model(
pre_approach.feature_extractor, pre_approach.regressor, domain_disc
)
trainer = pl.Trainer(max_epochs=1)
trainer.fit(approach, dm)
trainer.test(approach, dm)
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
/home/tilman/Programming/rul-adapt/.venv/lib/python3.8/site-packages/pytorch_lightning/callbacks/model_checkpoint.py:613: UserWarning: Checkpoint directory /home/tilman/Programming/rul-adapt/docs/examples/lightning_logs/version_27/checkpoints exists and is not empty.
rank_zero_warn(f"Checkpoint directory {dirpath} exists and is not empty.")
| Name | Type | Params
----------------------------------------------------------
0 | train_loss | MeanSquaredError | 0
1 | val_loss | MeanSquaredError | 0
2 | test_loss | MeanSquaredError | 0
3 | evaluator | AdaptionEvaluator | 0
4 | _feature_extractor | LstmExtractor | 2.2 K
5 | _regressor | FullyConnectedHead | 81
----------------------------------------------------------
2.3 K Trainable params
0 Non-trainable params
2.3 K Total params
0.009 Total estimated model params size (MB)
Sanity Checking: 0it [00:00, ?it/s]
`Trainer.fit` stopped: `max_epochs=1` reached. GPU available: False, used: False TPU available: False, using: 0 TPU cores IPU available: False, using: 0 IPUs HPU available: False, using: 0 HPUs | Name | Type | Params ------------------------------------------------------------------- 0 | train_source_loss | MeanSquaredError | 0 1 | consistency_loss | ConsistencyLoss | 0 2 | evaluator | AdaptionEvaluator | 0 3 | _feature_extractor | LstmExtractor | 2.2 K 4 | _regressor | FullyConnectedHead | 81 5 | dann_loss | DomainAdversarialLoss | 81 6 | frozen_feature_extractor | LstmExtractor | 2.2 K ------------------------------------------------------------------- 2.3 K Trainable params 2.2 K Non-trainable params 4.5 K Total params 0.018 Total estimated model params size (MB)
Sanity Checking: 0it [00:00, ?it/s]
Training: 0it [00:00, ?it/s]
Validation: 0it [00:00, ?it/s]
`Trainer.fit` stopped: `max_epochs=1` reached.
Testing: 0it [00:00, ?it/s]
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Test metric DataLoader 0 DataLoader 1
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
test/source/rmse 18.09880828857422
test/source/score 1549.2022705078125
test/target/rmse 22.494943618774414
test/target/score 814.8432006835938
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
[{'test/source/rmse/dataloader_idx_0': 18.09880828857422,
'test/source/score/dataloader_idx_0': 1549.2022705078125},
{'test/target/rmse/dataloader_idx_1': 22.494943618774414,
'test/target/score/dataloader_idx_1': 814.8432006835938}]