latent_align
The latent space alignment approach uses several auxiliary losses to align the latent space of the source and target domain produced by a shared feature extractor:
- Healthy State Alignment: Pushes the healthy data of both domains into a single compact cluster
- Degradation Direction Alignment: Minimizes the angle between degraded data points with the healthy cluster as origin
- Degradation Level Alignment: Aligns the distance of degraded data points from the healthy cluster to the number of time steps in degradation
- Degradation Fusion: Uses an MMD loss to align the distribution of both domains
Which features are considered in the healthy state and which in degradation is either determined by taking the first few steps of each time series or by using a first-time-to-predict estimation. The first variant is used for CMAPSS, the second for XJTU-SY.
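The four losses listed above can be illustrated with a rough NumPy sketch. The formulations below (squared distance to the cluster centre, pairwise cosine similarity, normalized distances, and a linear-kernel MMD) are assumptions chosen for brevity; the exact definitions used by the library and by Zhang et al. may differ, and all function names are hypothetical.

```python
import numpy as np


def healthy_alignment(healthy):
    """Mean squared distance of the healthy embeddings to their centre."""
    center = healthy.mean(axis=0)
    return float(np.mean(np.sum((healthy - center) ** 2, axis=1)))


def direction_alignment(source, target, healthy):
    """One minus the mean pairwise cosine similarity, healthy centre as origin."""
    center = healthy.mean(axis=0)
    src = source - center
    tgt = target - center
    src = src / np.linalg.norm(src, axis=1, keepdims=True)
    tgt = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
    return float(1.0 - np.mean(src @ tgt.T))


def level_alignment(degraded, steps, healthy):
    """Mean absolute gap between normalized distance and normalized step count."""
    center = healthy.mean(axis=0)
    dist = np.linalg.norm(degraded - center, axis=1)
    return float(np.mean(np.abs(dist / dist.max() - steps / steps.max())))


def linear_mmd(source, target):
    """Squared distance between the domain means (MMD with a linear kernel)."""
    return float(np.sum((source.mean(axis=0) - target.mean(axis=0)) ** 2))


# Toy embeddings: healthy data sits at the origin, degraded data moves along x.
healthy = np.zeros((4, 2))
source = np.array([[1.0, 0.0], [2.0, 0.0]])
target = np.array([[3.0, 0.0], [4.0, 0.0]])
steps = np.array([1.0, 2.0])
```

In this toy setup the healthy cluster is perfectly compact and both domains degrade in the same direction, so only the fusion term is non-zero.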
The approach was introduced by Zhang et al. in 2021. For applying the approach on raw vibration data, i.e. XJTU-SY, it uses a windowing scheme and first-point-to-predict estimation introduced by Li et al. in 2020.
LatentAlignApproach
Bases: AdaptionApproach
The latent alignment approach introduces four latent space alignment losses to align the latent space of a shared feature extractor to both source and target domain.
Examples:
>>> from rul_adapt import model, approach
>>> feat_ex = model.CnnExtractor(1, [16, 16, 1], 10, fc_units=16)
>>> reg = model.FullyConnectedHead(16, [1])
>>> latent_align = approach.LatentAlignApproach(0.1, 0.1, 0.1, 0.1, lr=0.001)
>>> latent_align.set_model(feat_ex, reg)
__init__(alpha_healthy, alpha_direction, alpha_level, alpha_fusion, loss_type='mse', rul_score_mode='phm08', evaluate_degraded_only=False, labels_as_percentage=False, **optim_kwargs)
Create a new latent alignment approach.
Each of the alphas controls the influence of the respective loss on the training. Commonly they are all set to the same value.
For more information about the possible optimizer keyword arguments, see here.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
alpha_healthy | float | The influence of the healthy state alignment loss. | required |
alpha_direction | float | The influence of the degradation direction alignment loss. | required |
alpha_level | float | The influence of the degradation level regularization loss. | required |
alpha_fusion | float | The influence of the degradation fusion (MMD) loss. | required |
loss_type | Literal['mse', 'mae', 'rmse'] | The type of regression loss to use. | 'mse' |
rul_score_mode | Literal['phm08', 'phm12'] | The mode for the val and test RUL score, either 'phm08' or 'phm12'. | 'phm08' |
evaluate_degraded_only | bool | Whether to only evaluate the RUL score on degraded samples. | False |
labels_as_percentage | bool | Whether to multiply labels by 100 to get percentages. | False |
**optim_kwargs | Any | Keyword arguments for the optimizer, e.g. learning rate. | {} |
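To make the loss_type and rul_score_mode options more concrete, here is a minimal pure-Python sketch of the three regression losses and of the PHM08 scoring function as it is commonly defined (exponential penalty with constants 13 for early and 10 for late predictions). The function names are hypothetical, and the library may aggregate or normalize the score differently.

```python
import math


def regression_loss(predictions, targets, loss_type="mse"):
    """Compute the MSE, MAE or RMSE between two equally long sequences."""
    errors = [p - t for p, t in zip(predictions, targets)]
    if loss_type == "mae":
        return sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    if loss_type == "mse":
        return mse
    if loss_type == "rmse":
        return math.sqrt(mse)
    raise ValueError(f"Unknown loss type: {loss_type}")


def phm08_score(predictions, targets):
    """Sum of asymmetric exponential penalties; late predictions cost more."""
    score = 0.0
    for p, t in zip(predictions, targets):
        d = p - t  # positive means the RUL was overestimated (late prediction)
        score += (math.exp(-d / 13) - 1) if d < 0 else (math.exp(d / 10) - 1)
    return score
```

Overestimating the RUL by ten steps is penalized more heavily than underestimating it by ten steps, which is the point of the asymmetric score.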
configure_optimizers()
Configure an optimizer.
forward(features)
Predict the RUL values for a batch of input features.
test_step(batch, batch_idx, dataloader_idx)
Execute one test step.
The batch argument is a list of two tensors representing features and labels. A RUL prediction is made from the features and the test RMSE and RUL score are calculated. The metrics recorded for dataloader idx zero are assumed to be from the source domain and for dataloader idx one from the target domain. The metrics are written to the configured logger under the prefix "test".
Parameters:
Name | Type | Description | Default |
---|---|---|---|
batch | List[Tensor] | A list containing a feature and a label tensor. | required |
batch_idx | int | The index of the current batch. | required |
dataloader_idx | int | The index of the current dataloader (0: source, 1: target). | required |
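The mapping from dataloader index to domain can be pictured with a small pure-Python tracker. This is only a sketch: the class name and the metric key format (prefix/domain/rmse) are hypothetical and not taken from the library.

```python
import math

# Mapping stated in the text: dataloader 0 is the source, 1 the target domain.
DOMAINS = {0: "source", 1: "target"}


class RmseTracker:
    """Accumulate squared errors per domain and report one RMSE for each."""

    def __init__(self, prefix):
        self.prefix = prefix
        self._squared_error = {name: 0.0 for name in DOMAINS.values()}
        self._count = {name: 0 for name in DOMAINS.values()}

    def update(self, predictions, labels, dataloader_idx):
        domain = DOMAINS[dataloader_idx]
        for p, t in zip(predictions, labels):
            self._squared_error[domain] += (p - t) ** 2
            self._count[domain] += 1

    def compute(self):
        # Hypothetical key format; domains without samples are skipped.
        return {
            f"{self.prefix}/{domain}/rmse": math.sqrt(
                self._squared_error[domain] / self._count[domain]
            )
            for domain in DOMAINS.values()
            if self._count[domain] > 0
        }


tracker = RmseTracker("test")
tracker.update([1.0, 3.0], [0.0, 0.0], dataloader_idx=0)
tracker.update([2.0], [0.0], dataloader_idx=1)
metrics = tracker.compute()
```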
training_step(batch, batch_idx)
Execute one training step.
The batch contains the following tensors in order:
- The source domain features.
- The steps in degradation for the source features.
- The RUL labels for the source features.
- The target domain features.
- The steps in degradation for the target features.
- The healthy state features for both domains.
The easiest way to produce such a batch is using the LatentAlignDataModule.
The source, target and healthy features are passed through the feature extractor. Afterward, these high-level features are used to compute the alignment losses. The source domain RUL predictions are computed using the regressor and used to calculate the regression loss (MSE by default). The losses are then combined. Each individual loss and the combined loss are logged.
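The batch layout and the combination step can be sketched as follows. The text only says that each alpha controls the influence of its loss, so the plain weighted sum below is an assumption; the function name and the placeholder strings standing in for tensors are hypothetical.

```python
def combine_losses(
    regression,
    healthy,
    direction,
    level,
    fusion,
    alpha_healthy,
    alpha_direction,
    alpha_level,
    alpha_fusion,
):
    """Weighted sum of the regression loss and the four alignment losses."""
    return (
        regression
        + alpha_healthy * healthy
        + alpha_direction * direction
        + alpha_level * level
        + alpha_fusion * fusion
    )


# The six batch elements in the documented order (strings stand in for tensors).
batch = (
    "source_features",
    "source_degradation_steps",
    "source_rul_labels",
    "target_features",
    "target_degradation_steps",
    "healthy_features",
)
source, source_steps, source_labels, target, target_steps, healthy = batch

# With all alphas at 0.1, as in the class example.
total = combine_losses(1.0, 2.0, 3.0, 4.0, 5.0, 0.1, 0.1, 0.1, 0.1)
```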
Parameters:
Name | Type | Description | Default |
---|---|---|---|
batch | Tuple[Tensor, ...] | The batch of data. | required |
batch_idx | int | The index of the batch. | required |
Returns:
Type | Description |
---|---|
Tensor | The combined loss. |
validation_step(batch, batch_idx, dataloader_idx)
Execute one validation step.
The batch argument is a list of two tensors representing features and labels. A RUL prediction is made from the features and the validation RMSE and RUL score are calculated. The metrics recorded for dataloader idx zero are assumed to be from the source domain and for dataloader idx one from the target domain. The metrics are written to the configured logger under the prefix "val".
Parameters:
Name | Type | Description | Default |
---|---|---|---|
batch | List[Tensor] | A list containing a feature and a label tensor. | required |
batch_idx | int | The index of the current batch. | required |
dataloader_idx | int | The index of the current dataloader (0: source, 1: target). | required |
LatentAlignFttpApproach
Bases: AdaptionApproach
This first-point-to-predict estimation approach trains a GAN on healthy state bearing data. The discriminator can be used afterward to compute a health indicator for each bearing.
The feature extractor and regressor models are used as the discriminator. The regressor is not allowed to have an activation function on its last layer and needs to use only a single output neuron because BCEWithLogitsLoss is used. The generator receives noise with the shape [batch_size, 1, noise_dim]. The generator needs an output with enough elements so that it can be reshaped to the same shape as the real input data. The reshaping is done internally.
Both generator and discriminator are trained at once by using a Gradient Reversal Layer between them.
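For readers unfamiliar with the technique, a gradient reversal layer can be written in a few lines of PyTorch. This is a generic sketch and not the layer shipped with the library: it acts as the identity in the forward pass and negates the gradient in the backward pass, so a single optimizer step minimizes the loss for the discriminator while maximizing it for the generator.

```python
import torch


class GradientReversal(torch.autograd.Function):
    """Identity in the forward pass, sign-flipped gradient in the backward pass."""

    @staticmethod
    def forward(ctx, inputs):
        return inputs.view_as(inputs)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output.neg()


# The gradient of sum(x) with respect to x is one; the layer turns it into minus one.
x = torch.ones(3, requires_grad=True)
GradientReversal.apply(x).sum().backward()
```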
Examples:
>>> import torch
>>> from rul_adapt import model, approach
>>> feat_ex = model.CnnExtractor(1, [16, 16, 1], 10, fc_units=16)
>>> reg = model.FullyConnectedHead(16, [1])
>>> gen = model.CnnExtractor(1, [1], 10, padding=True)
>>> fttp_model = approach.LatentAlignFttpApproach(10, lr=1e-4)
>>> fttp_model.set_model(feat_ex, reg, gen)
>>> health_indicator = fttp_model(torch.randn(16, 1, 10)).std()
generator
property
The generator network.
__init__(noise_dim, **optim_kwargs)
Create a new FTTP estimation approach.
The generator is set by the set_model function together with the feature extractor and regressor.
For more information about the possible optimizer keyword arguments, see here.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
noise_dim | int | The size of the last dimension of the noise tensor. | required |
**optim_kwargs | Any | Keyword arguments for the optimizer, e.g. learning rate. | {} |
configure_optimizers()
Configure an optimizer for the generator and discriminator.
forward(inputs)
Predict the health indicator for the given inputs.
set_model(feature_extractor, regressor, generator=None, *args, **kwargs)
Set the feature extractor, regressor (forming the discriminator) and generator for this approach.
The regressor is not allowed to have an activation function on its last layer and needs to use only a single output neuron. The generator receives noise with the shape [batch_size, 1, noise_dim]. The generator needs an output with enough elements so that it can be reshaped to the same shape as the real input data. The reshaping is done internally.
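The shape requirement on the generator can be checked with a short NumPy sketch. The toy generator below is a hypothetical stand-in for a real network; it only demonstrates that the output needs as many elements per sample as the real input so that the reshape succeeds.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, noise_dim = 16, 10

# Real input data and noise with the documented shape [batch_size, 1, noise_dim].
real = rng.standard_normal((batch_size, 1, 10))
noise = rng.standard_normal((batch_size, 1, noise_dim))


def toy_generator(noise):
    """Hypothetical generator that returns a flat output of 10 values per sample."""
    return 0.5 * noise.reshape(len(noise), -1)


flat_output = toy_generator(noise)
# The element counts must match, otherwise the reshape below would fail.
assert flat_output.size == real.size
fake = flat_output.reshape(real.shape)
```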
Parameters:
Name | Type | Description | Default |
---|---|---|---|
feature_extractor | Module | The feature extraction network. | required |
regressor | Module | The regressor functioning as the head of the discriminator. | required |
generator | Optional[Module] | The generator network. | None |
training_step(batch)
Execute one training step.
The batch is a tuple of the features and the labels. The labels are ignored. A noise tensor is passed to the generator to generate fake features. The discriminator classifies if the features are real or fake and the binary cross entropy loss is calculated. Real features receive the label zero and the fake features one.
Both generator and discriminator are trained at once by using a Gradient Reversal Layer between them. At the end, the loss is logged.
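The label convention (real is zero, fake is one) and the loss on raw logits can be illustrated in pure Python. The sketch below uses the standard numerically stable form of binary cross entropy with logits; averaging over real and fake samples together is an assumption, and the function names are hypothetical.

```python
import math


def bce_with_logits(logit, label):
    """Stable binary cross entropy on a raw logit: max(x, 0) - x*y + log(1 + e^-|x|)."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def discriminator_loss(real_logits, fake_logits):
    """Average loss with real samples labelled zero and fake samples labelled one."""
    losses = [bce_with_logits(x, 0.0) for x in real_logits]
    losses += [bce_with_logits(x, 1.0) for x in fake_logits]
    return sum(losses) / len(losses)


# A confident and correct discriminator: negative logits for real, positive for fake.
good = discriminator_loss([-10.0], [10.0])
# A confident but wrong discriminator.
bad = discriminator_loss([10.0], [-10.0])
```

Because the loss works on logits, the regressor must not apply an activation function on its last layer, as noted above.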
Parameters:
Name | Type | Description | Default |
---|---|---|---|
batch | Tuple[Tensor, Tensor] | A tuple of feature and label tensors. | required |
Returns:
Type | Description |
---|---|
Tensor | The classification loss. |
extract_chunk_windows(features, window_size, chunk_size)
Extract chunk windows from the given features of shape [num_org_windows, org_window_size, num_features].
A chunk window is a window that consists of window_size chunks. Each original window is split into chunks of size chunk_size. A chunk window is then formed by concatenating chunks from the same position inside window_size consecutive original windows. Therefore, each original window is represented by org_window_size // chunk_size chunk windows. The original window size must therefore be divisible by the chunk size.
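The chunk windowing described above can be sketched in NumPy as follows. This is an illustrative re-implementation under assumptions, not the library's code: the function name is hypothetical, and both the output ordering (grouped by the last original window of each chunk window) and the output shape are guesses based on the description.

```python
import numpy as np


def chunk_windows_sketch(features, window_size, chunk_size):
    """Build chunk windows from [num_org_windows, org_window_size, num_features]."""
    num_org, org_size, num_features = features.shape
    if org_size % chunk_size != 0:
        raise ValueError("The original window size must be divisible by chunk_size.")
    chunks_per_window = org_size // chunk_size
    # Split every original window into consecutive chunks.
    chunks = features.reshape(num_org, chunks_per_window, chunk_size, num_features)

    windows = []
    for end in range(window_size - 1, num_org):
        start = end - window_size + 1
        for position in range(chunks_per_window):
            # Chunks at the same position from window_size consecutive windows.
            selected = chunks[start : end + 1, position]
            windows.append(selected.reshape(window_size * chunk_size, num_features))
    return np.stack(windows)


# Four original windows of length four with a single feature.
features = np.arange(16).reshape(4, 4, 1)
windows = chunk_windows_sketch(features, window_size=2, chunk_size=2)
```

Here each original window yields two chunks, and the three possible pairs of consecutive original windows produce six chunk windows of length four.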
Parameters:
Name | Type | Description | Default |
---|---|---|---|
features | ndarray | The features to extract the chunk windows from. | required |
window_size | int | The number of consecutive original windows to form a chunk window from. | required |
chunk_size | int | The size of the chunks to extract from the original windows. | required |
Returns:
Type | Description |
---|---|
ndarray | Chunk windows of shape |
get_first_time_to_predict(fttp_model, features, window_size, chunk_size, healthy_index, threshold_coefficient)
Get the first time step to predict for the given features.
The features are pre-processed via the extract_chunk_windows function and fed in batches to the fttp_model. Each batch consists of the chunk windows that end in the same original feature window. The health indicator for the original window is calculated as the standard deviation of the predictions of the fttp_model.
The first-time-to-predict is the first time step where the health indicator is larger than threshold_coefficient times the mean of the health indicator for the first healthy_index time steps. If the threshold is never exceeded, a RuntimeError is raised.
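The thresholding rule can be sketched in pure Python for an already computed health indicator. Two assumptions are made here: healthy_index counts entries of the health indicator, and the result is shifted by window_size - 1 to map back to the original window index. The function name is hypothetical.

```python
def first_time_to_predict_sketch(
    health_indicator, window_size, healthy_index, threshold_coefficient
):
    """Return the original window index where the health indicator first exceeds the threshold."""
    baseline = sum(health_indicator[:healthy_index]) / healthy_index
    threshold = threshold_coefficient * baseline
    for index, value in enumerate(health_indicator):
        if value > threshold:
            # The first indicator value belongs to original window window_size - 1.
            return index + window_size - 1
    raise RuntimeError("The health indicator never exceeds the threshold.")


indicator = [1.0, 1.0, 1.0, 1.2, 3.0, 5.0]
fttp = first_time_to_predict_sketch(
    indicator, window_size=2, healthy_index=3, threshold_coefficient=2.0
)
```

With a baseline of 1.0 and a coefficient of 2.0, the value 3.0 at indicator index 4 is the first to exceed the threshold, which corresponds to original window 5.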
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fttp_model | LatentAlignFttpApproach | The model to use for the health indicator calculation. | required |
features | ndarray | The features to calculate the first-time-to-predict for. | required |
window_size | int | The size of the chunk windows to extract. | required |
chunk_size | int | The size of the chunks for each chunk window to extract. | required |
healthy_index | int | The index of the last healthy time step. | required |
threshold_coefficient | float | The threshold coefficient for the health indicator. | required |
Returns:
Type | Description |
---|---|
int | The original window index of the first-time-to-predict. |
get_health_indicator(fttp_model, features, window_size, chunk_size)
Get the health indicator for the given features.
The features are pre-processed via the extract_chunk_windows function and fed in batches to the fttp_model. Each batch consists of the chunk windows that end in the same original feature window. The health indicator for the original window is calculated as the standard deviation of the predictions of the fttp_model.
The length of the returned health indicator array is shorter than the features array by window_size - 1, due to the chunk windowing. This means the first health indicator value belongs to the original window with the index window_size - 1.
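The per-window standard deviation can be sketched in NumPy, assuming the chunk windows are already grouped by the original window they end in. The predict callable is a hypothetical stand-in for the trained fttp_model, and ddof=1 is chosen to mirror the default of torch.Tensor.std, which is an assumption about the library's behaviour.

```python
import numpy as np


def health_indicator_sketch(chunk_windows, chunks_per_window, predict):
    """Standard deviation of the predictions for each group of chunk windows."""
    num_groups = len(chunk_windows) // chunks_per_window
    indicator = np.empty(num_groups)
    for group in range(num_groups):
        start = group * chunks_per_window
        batch = chunk_windows[start : start + chunks_per_window]
        # ddof=1 mirrors the default of torch.Tensor.std (assumption).
        indicator[group] = np.std(predict(batch), ddof=1)
    return indicator


def toy_predict(batch):
    """Hypothetical model stand-in: the mean of each chunk window."""
    return batch.mean(axis=(1, 2))


# Six chunk windows in three groups; only the last group is disturbed.
chunk_windows = np.zeros((6, 4, 1))
chunk_windows[4] = 1.0
indicator = health_indicator_sketch(chunk_windows, 2, toy_predict)
```

Groups whose chunk windows all receive the same prediction yield an indicator of zero, while disagreement between the predictions raises it.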
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fttp_model | Module | The model to use for the health indicator calculation. | required |
features | ndarray | The features to calculate the health indicator for. | required |
window_size | int | The size of the chunk windows to extract. | required |
chunk_size | int | The size of the chunks for each chunk window to extract. | required |
Returns:
Type | Description |
---|---|
ndarray | The health indicator for the original windows. |