Consistency DANN¶

In [1]:

Copied!





import rul_datasets
import rul_adapt
import pytorch_lightning as pl
import omegaconf
import rul_datasets
import rul_adapt
import pytorch_lightning as pl
import omegaconf

Reproduce original configurations¶

You can reproduce the original experiments by Siahpour et al. by using the get_consistency_dann constructor function. Known differences to the original paper are:

the consistency_factor is set to 1.0 because the real value is not mentioned in the paper
the raw vibration data of XJTU-SY is preprocessed by extracting the standard deviation from each window because the given architecture could not handle the raw data

Additional kwargs for the trainer, e.g. accelerator="gpu" for training on a GPU, can be passed to the function as a dictionary. The first dictionary is used for the pre-training trainer and the second one for the main trainer.

In [2]:

Copied!





pl.seed_everything(42, workers=True)  # makes it reproducible
pre_training, main_training = rul_adapt.construct.get_consistency_dann(
    "cmapss", 3, 1, {"max_epochs": 1}, {"max_epochs": 1}
)
pl.seed_everything(42, workers=True)  # makes it reproducible
pre_training, main_training = rul_adapt.construct.get_consistency_dann(
    "cmapss", 3, 1, {"max_epochs": 1}, {"max_epochs": 1}
)

Global seed set to 42
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

The function returns two tuples. The first contains everything needed for pre-training, the second everything needed for the main training.

In [3]:

Copied!

pre_dm, pre_approach, pre_trainer = pre_training
pre_trainer.fit(pre_approach, pre_dm)
pre_dm, pre_approach, pre_trainer = pre_training
pre_trainer.fit(pre_approach, pre_dm)

  | Name               | Type               | Params
----------------------------------------------------------
0 | train_loss         | MeanSquaredError   | 0     
1 | val_loss           | MeanSquaredError   | 0     
2 | test_loss          | MeanSquaredError   | 0     
3 | evaluator          | AdaptionEvaluator  | 0     
4 | _feature_extractor | CnnExtractor       | 3.3 K 
5 | _regressor         | FullyConnectedHead | 221   
----------------------------------------------------------
3.5 K     Trainable params
0         Non-trainable params
3.5 K     Total params
0.014     Total estimated model params size (MB)

Sanity Checking: 0it [00:00, ?it/s]

Training: 0it [00:00, ?it/s]

Validation: 0it [00:00, ?it/s]

`Trainer.fit` stopped: `max_epochs=1` reached.

After pre-training, we can use the pre-trained networks to initialize the main training. The networks of the pre-training approach, i.e. feature_extractor and regressor, can be accessed as properties.

In [4]:

Copied!





dm, approach, domain_disc, trainer = main_training
approach.set_model(pre_approach.feature_extractor, pre_approach.regressor, domain_disc)
trainer.fit(approach, dm)
trainer.test(approach, dm)
dm, approach, domain_disc, trainer = main_training
approach.set_model(pre_approach.feature_extractor, pre_approach.regressor, domain_disc)
trainer.fit(approach, dm)
trainer.test(approach, dm)

  | Name                     | Type                  | Params
-------------------------------------------------------------------
0 | train_source_loss        | MeanSquaredError      | 0     
1 | consistency_loss         | ConsistencyLoss       | 0     
2 | evaluator                | AdaptionEvaluator     | 0     
3 | _feature_extractor       | CnnExtractor          | 3.3 K 
4 | _regressor               | FullyConnectedHead    | 221   
5 | dann_loss                | DomainAdversarialLoss | 21    
6 | frozen_feature_extractor | CnnExtractor          | 3.3 K 
-------------------------------------------------------------------
3.5 K     Trainable params
3.3 K     Non-trainable params
6.8 K     Total params
0.027     Total estimated model params size (MB)

Sanity Checking: 0it [00:00, ?it/s]

Training: 0it [00:00, ?it/s]

Validation: 0it [00:00, ?it/s]

`Trainer.fit` stopped: `max_epochs=1` reached.

Testing: 0it [00:00, ?it/s]

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0             DataLoader 1
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
    test/source/rmse         83.57210540771484
    test/source/score          371325.28125
    test/target/rmse                                  84.60678100585938
    test/target/score                                   342092.78125
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Out[4]:

[{'test/source/rmse/dataloader_idx_0': 83.57210540771484,
  'test/source/score/dataloader_idx_0': 371325.28125},
 {'test/target/rmse/dataloader_idx_1': 84.60678100585938,
  'test/target/score/dataloader_idx_1': 342092.78125}]

If you only want to see the hyperparameters, you can use the get_consistency_dann_config function. This returns an omegaconf.DictConfig which you can modify. Afterwards, you can pass the config to consistency_dann_from_config to receive the training-ready approach.

In [5]:

Copied!

cmapss_three2one_config = rul_adapt.construct.get_consistency_dann_config("cmapss", 3, 1)
print(omegaconf.OmegaConf.to_yaml(cmapss_three2one_config, resolve=True))
cmapss_three2one_config = rul_adapt.construct.get_consistency_dann_config("cmapss", 3, 1)
print(omegaconf.OmegaConf.to_yaml(cmapss_three2one_config, resolve=True))

dm:
  source:
    _target_: rul_datasets.CmapssReader
    fd: 3
    window_size: 20
  target:
    fd: 1
    percent_broken: 1.0
  kwargs:
    batch_size: 128
feature_extractor:
  _convert_: all
  _target_: rul_adapt.model.CnnExtractor
  input_channels: 14
  units:
  - 32
  - 16
  - 1
  seq_len: 20
  fc_units: 20
  dropout: 0.5
  fc_dropout: 0.5
regressor:
  _convert_: all
  _target_: rul_adapt.model.FullyConnectedHead
  input_channels: 20
  act_func_on_last_layer: false
  units:
  - 10
  - 1
domain_disc:
  _convert_: all
  _target_: rul_adapt.model.FullyConnectedHead
  input_channels: 20
  act_func_on_last_layer: false
  units:
  - 1
consistency_pre:
  _target_: rul_adapt.approach.SupervisedApproach
  lr: 0.0001
  loss_type: rmse
  optim_type: sgd
consistency:
  _target_: rul_adapt.approach.ConsistencyApproach
  consistency_factor: 1.0
  max_epochs: 3000
  lr: 1.0e-05
  optim_type: sgd
trainer_pre:
  _target_: pytorch_lightning.Trainer
  max_epochs: 1000
trainer:
  _target_: pytorch_lightning.Trainer
  max_epochs: 3000

Run your own experiments¶

You can use the Consistency DANN implementation to run your own experiments with different hyperparameters or on different datasets. Here we build an approach with an LSTM feature extractor.

In [7]:

Copied!





source = rul_datasets.CmapssReader(3)
target = source.get_compatible(1, percent_broken=0.8)

pre_dm = rul_datasets.RulDataModule(source, batch_size=32)
dm = rul_datasets.DomainAdaptionDataModule(
    pre_dm, rul_datasets.RulDataModule(target, batch_size=32),
)

feature_extractor = rul_adapt.model.LstmExtractor(
    input_channels=14,
    units=[16],
    fc_units=8,
)
regressor = rul_adapt.model.FullyConnectedHead(
    input_channels=8,
    units=[8, 1],
    act_func_on_last_layer=False,
)
domain_disc = rul_adapt.model.FullyConnectedHead(
    input_channels=8,
    units=[8, 1],
    act_func_on_last_layer=False,
)

pre_approach = rul_adapt.approach.SupervisedApproach(
    lr=0.001, loss_type="rmse", optim_type="sgd"
)
pre_approach.set_model(feature_extractor, regressor)
pre_trainer = pl.Trainer(max_epochs=1)
trainer.fit(pre_approach, pre_dm)

approach = rul_adapt.approach.ConsistencyApproach(
    consistency_factor=1.0, lr=0.001, max_epochs=1
)
approach.set_model(
    pre_approach.feature_extractor, pre_approach.regressor, domain_disc
)
trainer = pl.Trainer(max_epochs=1)
trainer.fit(approach, dm)
trainer.test(approach, dm)
source = rul_datasets.CmapssReader(3)
target = source.get_compatible(1, percent_broken=0.8)

pre_dm = rul_datasets.RulDataModule(source, batch_size=32)
dm = rul_datasets.DomainAdaptionDataModule(
    pre_dm, rul_datasets.RulDataModule(target, batch_size=32),
)

feature_extractor = rul_adapt.model.LstmExtractor(
    input_channels=14,
    units=[16],
    fc_units=8,
)
regressor = rul_adapt.model.FullyConnectedHead(
    input_channels=8,
    units=[8, 1],
    act_func_on_last_layer=False,
)
domain_disc = rul_adapt.model.FullyConnectedHead(
    input_channels=8,
    units=[8, 1],
    act_func_on_last_layer=False,
)

pre_approach = rul_adapt.approach.SupervisedApproach(
    lr=0.001, loss_type="rmse", optim_type="sgd"
)
pre_approach.set_model(feature_extractor, regressor)
pre_trainer = pl.Trainer(max_epochs=1)
trainer.fit(pre_approach, pre_dm)

approach = rul_adapt.approach.ConsistencyApproach(
    consistency_factor=1.0, lr=0.001, max_epochs=1
)
approach.set_model(
    pre_approach.feature_extractor, pre_approach.regressor, domain_disc
)
trainer = pl.Trainer(max_epochs=1)
trainer.fit(approach, dm)
trainer.test(approach, dm)

GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
/home/tilman/Programming/rul-adapt/.venv/lib/python3.8/site-packages/pytorch_lightning/callbacks/model_checkpoint.py:613: UserWarning: Checkpoint directory /home/tilman/Programming/rul-adapt/docs/examples/lightning_logs/version_27/checkpoints exists and is not empty.
  rank_zero_warn(f"Checkpoint directory {dirpath} exists and is not empty.")

  | Name               | Type               | Params
----------------------------------------------------------
0 | train_loss         | MeanSquaredError   | 0     
1 | val_loss           | MeanSquaredError   | 0     
2 | test_loss          | MeanSquaredError   | 0     
3 | evaluator          | AdaptionEvaluator  | 0     
4 | _feature_extractor | LstmExtractor      | 2.2 K 
5 | _regressor         | FullyConnectedHead | 81    
----------------------------------------------------------
2.3 K     Trainable params
0         Non-trainable params
2.3 K     Total params
0.009     Total estimated model params size (MB)

Sanity Checking: 0it [00:00, ?it/s]

`Trainer.fit` stopped: `max_epochs=1` reached.
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name                     | Type                  | Params
-------------------------------------------------------------------
0 | train_source_loss        | MeanSquaredError      | 0     
1 | consistency_loss         | ConsistencyLoss       | 0     
2 | evaluator                | AdaptionEvaluator     | 0     
3 | _feature_extractor       | LstmExtractor         | 2.2 K 
4 | _regressor               | FullyConnectedHead    | 81    
5 | dann_loss                | DomainAdversarialLoss | 81    
6 | frozen_feature_extractor | LstmExtractor         | 2.2 K 
-------------------------------------------------------------------
2.3 K     Trainable params
2.2 K     Non-trainable params
4.5 K     Total params
0.018     Total estimated model params size (MB)

Sanity Checking: 0it [00:00, ?it/s]

Training: 0it [00:00, ?it/s]

Validation: 0it [00:00, ?it/s]

`Trainer.fit` stopped: `max_epochs=1` reached.

Testing: 0it [00:00, ?it/s]

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0             DataLoader 1
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
    test/source/rmse         18.09880828857422
    test/source/score       1549.2022705078125
    test/target/rmse                                 22.494943618774414
    test/target/score                                 814.8432006835938
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Out[7]:

[{'test/source/rmse/dataloader_idx_0': 18.09880828857422,
  'test/source/score/dataloader_idx_0': 1549.2022705078125},
 {'test/target/rmse/dataloader_idx_1': 22.494943618774414,
  'test/target/score/dataloader_idx_1': 814.8432006835938}]