xjtu_sy

The XJTU-SY Bearing dataset is a collection of run-to-failure experiments on bearings. Three different operation conditions were used, resulting in three sub-datasets. Each sub-dataset contains five runs without an official training/test split.

XjtuSyReader

Bases: AbstractReader

This reader represents the XJTU-SY Bearing dataset. Each of its three sub-datasets contains five runs. By default, the reader assigns the first two runs to the development split, the third to the validation split, and the remaining two to the test split. This run-to-split assignment can be overridden by setting run_split_dist.

The features contain windows with two channels of acceleration data which are standardized to zero mean and one standard deviation. The scaler is fitted on the development data.
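The standardization step can be sketched as follows. This is a minimal illustration of fitting per-channel statistics on the development runs and applying them everywhere, not the library's internal code; `fit_scaler` and `scale` are illustrative names.

```python
import numpy as np

def fit_scaler(dev_features):
    # Stack all development windows and compute per-channel statistics.
    stacked = np.concatenate(dev_features, axis=0)  # (windows, window_size, channels)
    return stacked.mean(axis=(0, 1)), stacked.std(axis=(0, 1))

def scale(features, mean, std):
    # Standardize each run to zero mean and unit standard deviation per channel.
    return [(f - mean) / std for f in features]

# Two dummy runs of windowed two-channel acceleration data
rng = np.random.default_rng(42)
dev = [rng.normal(5.0, 2.0, (10, 64, 2)) for _ in range(2)]
mean, std = fit_scaler(dev)
scaled = scale(dev, mean, std)
```

Validation and test features would be scaled with the same `mean` and `std`, so no information from those splits leaks into the normalization.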

Examples:

Default splits:

>>> import rul_datasets
>>> fd1 = rul_datasets.reader.XjtuSyReader(fd=1)
>>> fd1.prepare_data()
>>> features, labels = fd1.load_split("dev")
>>> features[0].shape
(123, 32768, 2)

Custom splits:

>>> import rul_datasets
>>> splits = {"dev": [5], "val": [4], "test": [3]}
>>> fd1 = rul_datasets.reader.XjtuSyReader(fd=1, run_split_dist=splits)
>>> fd1.prepare_data()
>>> features, labels = fd1.load_split("dev")
>>> features[0].shape
(52, 32768, 2)

Set first-time-to-predict:

>>> import rul_datasets
>>> fttp = [10, 20, 30, 40, 50]
>>> fd1 = rul_datasets.reader.XjtuSyReader(fd=1, first_time_to_predict=fttp)
>>> fd1.prepare_data()
>>> features, labels = fd1.load_split("dev")
>>> labels[0][:15]
array([113., 113., 113., 113., 113., 113., 113., 113., 113., 113., 113.,
       112., 111., 110., 109.])

fds: List[int] property

Indices of available sub-datasets.

__init__(fd, window_size=None, max_rul=None, percent_broken=None, percent_fail_runs=None, truncate_val=False, run_split_dist=None, first_time_to_predict=None, norm_rul=False, truncate_degraded_only=False)

Create a new XJTU-SY reader for one of the sub-datasets. By default, the RUL values are not capped. The default window size is 32768.

Use first_time_to_predict to set an individual RUL inflection point for each run. It should be a list with an integer index for each run. The index is the time step after which RUL declines; before that time step, it stays constant. The norm_rul argument can then be used to scale the RUL of each run between zero and one.
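The resulting piecewise-linear RUL curve can be sketched with NumPy. The helper name `piecewise_rul` is illustrative, and the assumption that RUL counts down to zero at the last step is mine; the shape matches the doctest above, where fttp=10 yields eleven constant values of 113 followed by a linear decline.

```python
import numpy as np

def piecewise_rul(num_steps, fttp, norm_rul=False):
    # RUL counts down linearly to zero at the last time step (assumed offset).
    rul = np.arange(num_steps - 1, -1, -1, dtype=float)
    # Before the first time to predict, RUL is clipped to its value at
    # that step, modeling healthy operation with constant RUL.
    rul = np.minimum(rul, rul[fttp])
    if norm_rul:
        rul = rul / rul.max()  # scale the run between zero and one
    return rul

labels = piecewise_rul(124, fttp=10)  # constant at 113 for steps 0..10
```

With `norm_rul=True`, the same curve is divided by its maximum, so every run starts at one and declines toward zero after its inflection point.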

For more information about using readers, refer to the reader module page.

Parameters:

    fd (int, required): Index of the selected sub-dataset.
    window_size (Optional[int], default None): Size of the sliding window. Defaults to 32768.
    max_rul (Optional[int], default None): Maximum RUL value of targets.
    percent_broken (Optional[float], default None): The maximum relative degradation per time series.
    percent_fail_runs (Optional[Union[float, List[int]]], default None): The percentage or index list of available time series.
    truncate_val (bool, default False): Truncate the validation data with percent_broken, too.
    run_split_dist (Optional[Dict[str, List[int]]], default None): Dictionary that assigns run indices to each split.
    first_time_to_predict (Optional[List[int]], default None): The time step for each time series before which RUL is constant.
    norm_rul (bool, default False): Normalize RUL between zero and one.
    truncate_degraded_only (bool, default False): Only truncate the degraded part of the data (< max RUL).
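The interplay of percent_fail_runs and percent_broken can be sketched as below. This is a hypothetical illustration of the documented semantics, not the library's implementation: percent_fail_runs (as a float) limits how many runs are available, and percent_broken cuts off the end of each remaining run so the failure itself is never observed.

```python
import numpy as np

def truncate_runs(features, targets, percent_broken=None, percent_fail_runs=None):
    # Hypothetical sketch: a float percent_fail_runs keeps only the first
    # fraction of runs (an index list would select runs directly instead).
    if percent_fail_runs is not None:
        keep = int(np.ceil(percent_fail_runs * len(features)))
        features, targets = features[:keep], targets[:keep]
    # percent_broken truncates each run after the given fraction of its
    # lifetime, hiding the data closest to failure.
    if percent_broken is not None:
        cutoffs = [int(percent_broken * len(f)) for f in features]
        features = [f[:c] for f, c in zip(features, cutoffs)]
        targets = [t[:c] for t, c in zip(targets, cutoffs)]
    return features, targets

runs = [np.zeros((100, 64, 2)), np.zeros((80, 64, 2))]
ruls = [np.arange(99, -1, -1), np.arange(79, -1, -1)]
feats, targs = truncate_runs(runs, ruls, percent_broken=0.8)
```

After truncation, the lowest remaining RUL of each run is bounded away from zero, which is the point of percent_broken for semi-supervised settings.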

prepare_data()

Prepare the XJTU-SY dataset. This function needs to be called before using the dataset, and again for each custom split, for the first time.

The dataset is downloaded from a custom mirror and extracted into the data root directory. The whole dataset is converted from CSV files to NPY files to speed up loading it from disk. Afterwards, a scaler is fitted on the development features. Previously completed steps are skipped.
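The CSV-to-NPY conversion can be sketched as follows. File names and the CSV layout are illustrative, and skipping already-converted files mirrors the "previously completed steps are skipped" behavior described above.

```python
import os
import tempfile
import numpy as np

def convert_csv_to_npy(csv_path):
    # Load the CSV once, then store it in NumPy's binary format,
    # which loads much faster from disk on subsequent runs.
    npy_path = os.path.splitext(csv_path)[0] + ".npy"
    if os.path.exists(npy_path):
        return npy_path  # skip previously completed conversions
    data = np.loadtxt(csv_path, delimiter=",", skiprows=1)  # skip the header row
    np.save(npy_path, data)
    return npy_path

# Demo with a temporary CSV of two acceleration channels
with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "bearing.csv")
    np.savetxt(csv_path, np.random.rand(8, 2), delimiter=",",
               header="horizontal,vertical", comments="")
    loaded = np.load(convert_csv_to_npy(csv_path))
```

Binary NPY files avoid re-parsing text on every epoch, which matters here because each file holds a 32768-sample window per channel.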