xjtu_sy
The XJTU-SY Bearing dataset is a collection of run-to-failure experiments on bearings. Three different operation conditions were used, resulting in three sub-datasets. Each sub-dataset contains five runs without an official training/test split.
XjtuSyReader
Bases: AbstractReader
This reader represents the XJTU-SY Bearing dataset. Each of its three sub-datasets contains five runs. By default, the reader assigns the first two runs to the development split, the third to the validation split, and the remaining two to the test split. This run-to-split assignment can be overridden by setting run_split_dist.
The features are windows of two-channel acceleration data, standardized to zero mean and unit standard deviation. The scaler is fitted on the development data.
Examples:
Default splits:
>>> import rul_datasets
>>> fd1 = rul_datasets.reader.XjtuSyReader(fd=1)
>>> fd1.prepare_data()
>>> features, labels = fd1.load_split("dev")
>>> features[0].shape
(123, 32768, 2)
Custom splits:
>>> import rul_datasets
>>> splits = {"dev": [5], "val": [4], "test": [3]}
>>> fd1 = rul_datasets.reader.XjtuSyReader(fd=1, run_split_dist=splits)
>>> fd1.prepare_data()
>>> features, labels = fd1.load_split("dev")
>>> features[0].shape
(52, 32768, 2)
Set first-time-to-predict:
>>> import rul_datasets
>>> fttp = [10, 20, 30, 40, 50]
>>> fd1 = rul_datasets.reader.XjtuSyReader(fd=1, first_time_to_predict=fttp)
>>> fd1.prepare_data()
>>> features, labels = fd1.load_split("dev")
>>> labels[0][:15]
array([113., 113., 113., 113., 113., 113., 113., 113., 113., 113., 113.,
112., 111., 110., 109.])
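Because the scaler is fitted on the development split, the loaded development windows should be roughly standardized per channel. A minimal sanity-check sketch (the NumPy check is illustrative and not part of the reader's API):
>>> import numpy as np
>>> import rul_datasets
>>> fd1 = rul_datasets.reader.XjtuSyReader(fd=1)
>>> fd1.prepare_data()
>>> features, _ = fd1.load_split("dev")
>>> windows = np.concatenate(features)  # stack all runs: (num_windows, 32768, 2)
>>> channel_mean = windows.mean(axis=(0, 1))  # expected to be close to zero
>>> channel_std = windows.std(axis=(0, 1))  # expected to be close to one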
fds: List[int]
property
Indices of available sub-datasets.
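A short usage sketch (the three operating conditions are expected to map to indices 1 to 3):
>>> import rul_datasets
>>> fd1 = rul_datasets.reader.XjtuSyReader(fd=1)
>>> fd1.fds
[1, 2, 3]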
__init__(fd, window_size=None, max_rul=None, percent_broken=None, percent_fail_runs=None, truncate_val=False, run_split_dist=None, first_time_to_predict=None, norm_rul=False, truncate_degraded_only=False)
Create a new XJTU-SY reader for one of the sub-datasets. By default, the RUL values are not capped. The default window size is 32768.
Use first_time_to_predict to set an individual RUL inflection point for each run. It should be a list with an integer index for each run. The index is the time step after which the RUL declines; before that time step, it stays constant. The norm_rul argument can then be used to scale the RUL of each run between zero and one.
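For example, combining first_time_to_predict with norm_rul yields targets that stay at one until the inflection point and decline linearly afterwards (a sketch with arbitrary fttp values; the printed value assumes the RUL is scaled by its per-run maximum):
>>> import rul_datasets
>>> fttp = [10, 20, 30, 40, 50]
>>> fd1 = rul_datasets.reader.XjtuSyReader(
...     fd=1, first_time_to_predict=fttp, norm_rul=True
... )
>>> fd1.prepare_data()
>>> _, labels = fd1.load_split("dev")
>>> print(labels[0].max())  # normalized RUL peaks at one
1.0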
For more information about using readers, refer to the reader module page.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fd | int | Index of the selected sub-dataset | required |
window_size | Optional[int] | Size of the sliding window. Defaults to 32768. | None |
max_rul | Optional[int] | Maximum RUL value of targets. | None |
percent_broken | Optional[float] | The maximum relative degradation per time series. | None |
percent_fail_runs | Optional[Union[float, List[int]]] | The percentage or index list of available time series. | None |
truncate_val | bool | Truncate the validation data with percent_broken, too. | False |
run_split_dist | Optional[Dict[str, List[int]]] | Dictionary that assigns each run index to a split. | None |
first_time_to_predict | Optional[List[int]] | The time step for each time series before which RUL is constant. | None |
norm_rul | bool | Normalize RUL between zero and one. | False |
truncate_degraded_only | bool | Only truncate the degraded part of the data (< max RUL). | False |
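The truncation arguments limit how much run-to-failure data is visible during training. A hedged sketch that keeps half of the training runs and truncates each remaining run at 80% of its degradation (the values are arbitrary):
>>> import rul_datasets
>>> fd1 = rul_datasets.reader.XjtuSyReader(
...     fd=1, percent_broken=0.8, percent_fail_runs=0.5, truncate_val=True
... )
>>> fd1.prepare_data()
>>> features, labels = fd1.load_split("dev")  # truncated development data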
prepare_data()
Prepare the XJTU-SY dataset. This function needs to be called before using the dataset for the first time and before the first use of each custom split.
The dataset is downloaded from a custom mirror and extracted into the data root directory. The whole dataset is converted from CSV files to NPY files to speed up loading it from disk. Afterwards, a scaler is fitted on the development features. Previously completed steps are skipped.
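After preparation, the reader is typically wrapped in a data module for training. A minimal sketch (assuming rul_datasets.RulDataModule, as used elsewhere in this library):
>>> import rul_datasets
>>> fd1 = rul_datasets.reader.XjtuSyReader(fd=1)
>>> fd1.prepare_data()  # download, convert and fit the scaler; skipped if already done
>>> dm = rul_datasets.RulDataModule(fd1, batch_size=32)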