LSTM DANN¶
import rul_adapt
import rul_datasets
import pytorch_lightning as pl
import omegaconf
Reproduce original configurations¶
You can reproduce the original experiments of da Costa et al. by using the get_lstm_dann constructor function.
Known differences to the original are:
- a bigger validation split (20% instead of 10% of the training data).
In this example, we re-create the configuration for adapting CMAPSS FD003 to FD001. Additional kwargs for the trainer, e.g. accelerator="gpu" for training on a GPU, can be passed to this function, too.
pl.seed_everything(42, workers=True) # make reproducible
dm, dann, trainer = rul_adapt.construct.get_lstm_dann(3, 1, max_epochs=1)
Global seed set to 42
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
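For example, enabling GPU training is just another keyword argument passed through to the trainer (a sketch, assuming a CUDA-capable machine is available):
dm, dann, trainer = rul_adapt.construct.get_lstm_dann(3, 1, max_epochs=1, accelerator="gpu")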
The networks, feature_extractor, regressor, and domain_disc, can be accessed as properties of the dann object.
dann.feature_extractor
LstmExtractor(
  (_lstm_layers): _Rnn(
    (_layers): ModuleList(
      (0): LSTM(24, 64)
      (1): LSTM(64, 32)
    )
  )
  (_fc_layer): Sequential(
    (0): Dropout(p=0.3, inplace=False)
    (1): Linear(in_features=32, out_features=128, bias=True)
    (2): ReLU()
  )
)
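The regressor and the domain discriminator can be inspected the same way:
dann.regressor
dann.domain_disc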
Training is done in the usual PyTorch Lightning fashion. We used the trainer kwargs (max_epochs=1) to train for only one epoch for demonstration purposes.
trainer.fit(dann, dm)
trainer.test(ckpt_path="best", datamodule=dm) # loads the best model checkpoint
  | Name               | Type                  | Params
-------------------------------------------------------------
0 | train_source_loss  | MeanAbsoluteError     | 0
1 | evaluator          | AdaptionEvaluator     | 0
2 | _feature_extractor | LstmExtractor         | 39.8 K
3 | _regressor         | FullyConnectedHead    | 5.2 K
4 | dann_loss          | DomainAdversarialLoss | 5.2 K
-------------------------------------------------------------
50.2 K    Trainable params
0         Non-trainable params
50.2 K    Total params
0.201     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=1` reached.
Restoring states from the checkpoint path at /home/tilman/Programming/rul-adapt/docs/examples/lightning_logs/version_32/checkpoints/epoch=0-step=69.ckpt
Loaded model weights from checkpoint at /home/tilman/Programming/rul-adapt/docs/examples/lightning_logs/version_32/checkpoints/epoch=0-step=69.ckpt
────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0            DataLoader 1
────────────────────────────────────────────────────────────────────────
    test/source/rmse        20.155813217163086
    test/source/score       1689.973876953125
    test/target/rmse                                32.33406448364258
    test/target/score                               12900.6259765625
────────────────────────────────────────────────────────────────────────
[{'test/source/rmse/dataloader_idx_0': 20.155813217163086, 'test/source/score/dataloader_idx_0': 1689.973876953125}, {'test/target/rmse/dataloader_idx_1': 32.33406448364258, 'test/target/score/dataloader_idx_1': 12900.6259765625}]
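After training, predictions can be made by chaining the feature extractor and regressor properties. Below is a minimal inference sketch with random dummy data, assuming windows shaped [batch, channels, window] with the 24 channels and window size 30 of this CMAPSS setup:
import torch

dann.eval()  # switch to evaluation mode
with torch.no_grad():
    dummy_windows = torch.randn(8, 24, 30)  # stand-in for real CMAPSS windows
    rul_predictions = dann.regressor(dann.feature_extractor(dummy_windows))
print(rul_predictions.shape)  # torch.Size([8, 1])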
If you only want to see the hyperparameters, you can use the get_lstm_dann_config function. It returns an omegaconf.DictConfig which you can modify.
three2one_config = rul_adapt.construct.get_lstm_dann_config(3, 1)
print(omegaconf.OmegaConf.to_yaml(three2one_config, resolve=True))
dm:
  source:
    _target_: rul_datasets.CmapssReader
    fd: 3
    feature_select:
    - 0
    - 1
    - 2
    - 3
    - 4
    - 5
    - 6
    - 7
    - 8
    - 9
    - 10
    - 11
    - 12
    - 13
    - 14
    - 15
    - 16
    - 17
    - 18
    - 19
    - 20
    - 21
    - 22
    - 23
  target:
    fd: 1
    percent_broken: 1.0
  batch_size: 256
feature_extractor:
  _convert_: all
  _target_: rul_adapt.model.LstmExtractor
  input_channels: 24
  units:
  - 64
  - 32
  fc_units: 128
  dropout: 0.3
  fc_dropout: 0.3
regressor:
  _convert_: all
  _target_: rul_adapt.model.FullyConnectedHead
  input_channels: 128
  act_func_on_last_layer: false
  units:
  - 32
  - 32
  - 1
  dropout: 0.1
domain_disc:
  _convert_: all
  _target_: rul_adapt.model.FullyConnectedHead
  input_channels: 128
  act_func_on_last_layer: false
  units:
  - 32
  - 32
  - 1
  dropout: 0.1
dann:
  _target_: rul_adapt.approach.DannApproach
  scheduler_type: step
  scheduler_gamma: 0.1
  scheduler_step_size: 100
  dann_factor: 2.0
  lr: 0.01
  optim_weight_decay: 0.01
trainer:
  _target_: pytorch_lightning.Trainer
  max_epochs: 200
  gradient_clip_val: 1.0
  callbacks:
  - _target_: pytorch_lightning.callbacks.EarlyStopping
    monitor: val/target/rmse/dataloader_idx_1
    patience: 20
  - _target_: pytorch_lightning.callbacks.ModelCheckpoint
    save_top_k: 1
    monitor: val/target/rmse/dataloader_idx_1
    mode: min
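The _target_ keys follow Hydra's instantiation convention, so one way to use a modified config is to tweak a value and re-instantiate the corresponding object. A sketch, assuming hydra-core is installed:
from hydra.utils import instantiate

three2one_config.dann.lr = 0.005  # e.g., try a smaller learning rate
feature_extractor = instantiate(three2one_config.feature_extractor)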
Run your own experiments¶
You can use the LSTM DANN implementation to run your own experiments with different hyperparameters or on different datasets. Here, we build a smaller LSTM DANN version for CMAPSS. Note that CmapssReader selects 14 feature channels by default, which is why the feature extractor below uses input_channels=14.
source = rul_datasets.CmapssReader(3)  # CMAPSS FD003 as the labeled source domain
target = source.get_compatible(1, percent_broken=0.8)  # FD001 as unlabeled target; runs truncated to 80% of their lifetime
dm = rul_datasets.DomainAdaptionDataModule(
rul_datasets.RulDataModule(source, batch_size=32),
rul_datasets.RulDataModule(target, batch_size=32),
)
feature_extractor = rul_adapt.model.LstmExtractor(
input_channels=14,
units=[16],
fc_units=8,
)
regressor = rul_adapt.model.FullyConnectedHead(
input_channels=8,
units=[8, 1],
act_func_on_last_layer=False,
)
domain_disc = rul_adapt.model.FullyConnectedHead(
input_channels=8,
units=[8, 1],
act_func_on_last_layer=False,
)
dann = rul_adapt.approach.DannApproach(dann_factor=1.0, lr=0.001)  # dann_factor weights the adversarial loss
dann.set_model(feature_extractor, regressor, domain_disc)  # attach the three networks
trainer = pl.Trainer(max_epochs=1)
trainer.fit(dann, dm)
trainer.test(dann, dm)
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name               | Type                  | Params
-------------------------------------------------------------
0 | train_source_loss  | MeanAbsoluteError     | 0
1 | evaluator          | AdaptionEvaluator     | 0
2 | _feature_extractor | LstmExtractor         | 2.2 K
3 | _regressor         | FullyConnectedHead    | 81
4 | dann_loss          | DomainAdversarialLoss | 81
-------------------------------------------------------------
2.3 K     Trainable params
0         Non-trainable params
2.3 K     Total params
0.009     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=1` reached.
────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0            DataLoader 1
────────────────────────────────────────────────────────────────────────
    test/source/rmse        20.648313522338867
    test/source/score        876.435546875
    test/target/rmse                                21.399911880493164
    test/target/score                               1010.3373413085938
────────────────────────────────────────────────────────────────────────
[{'test/source/rmse/dataloader_idx_0': 20.648313522338867, 'test/source/score/dataloader_idx_0': 876.435546875}, {'test/target/rmse/dataloader_idx_1': 21.399911880493164, 'test/target/score/dataloader_idx_1': 1010.3373413085938}]
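The same setup can be pointed at a different dataset by swapping the reader. A sketch, assuming the FemtoReader from rul_datasets with the same reader interface (FEMTO provides 2 vibration channels, so input_channels of the feature extractor would need to change accordingly):
source = rul_datasets.FemtoReader(fd=1)  # FEMTO operating condition 1 as source
target = source.get_compatible(2, percent_broken=0.8)  # condition 2 as target
dm = rul_datasets.DomainAdaptionDataModule(
    rul_datasets.RulDataModule(source, batch_size=32),
    rul_datasets.RulDataModule(target, batch_size=32),
)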