The Prognostics Data Library


Who can register to use the Prognostics Data Library?
Anyone with a valid email address can register, as long as they haven't done so already.
Are there any restrictions on using the data from the Prognostics Data Library?
You may only use the PDL for the purposes of genuine research, teaching, or other purposes primarily intended to further technical developments in the field of prognostics. Please see the Terms of Use page for all the legal stuff or download a copy of the Terms here.
Can I change the files I've downloaded to use them in my modelling program?
Yes, you can make whatever changes you require.
How much does it cost to use the PDL?
Nothing. Using the data is completely free. So is uploading a dataset.
If using the PDL is free, how is the system funded?
The development and upkeep of the Prognostics Data Library is currently funded by the BHP Fellowship Grant, which was awarded to UWA's Professor Melinda Hodkiewicz in 2016. However, BHP has no direct or indirect involvement in the Database and has no access to any information beyond what is available to any registered user of the PDL.
Why do I have to register with the PDL to download data?
By registering as a user of the Prognostics Data Library, you agree to our Terms of Use. Also, we want to keep track of who is downloading data so that we can remind them to give us feedback later on. Users' ratings and feedback will be instrumental in helping others get the most out of the datasets.
I can download data elsewhere. What is so special about the Prognostics Data Library?
The Prognostics Data Library allows users to preview consistent and comprehensive meta-data about all its datasets. We also aim, wherever possible, to provide the actual datasets in a consistent, open source format, so that you don't need proprietary software to view and use the data.
Who can upload data to the Prognostics Data Library?
Anyone who legally owns a dataset can donate it to the PDL. However, you won't be able to upload it directly, as we'll have to check a few things first. If you are representing a company, we've got a legal agreement that will need to be signed as well.
Who owns the data?
The original data owner retains ownership of the data. However, once uploaded, the owner then provides usage rights to the Prognostics Data Library and its users. Full details are provided here.
I've spent lots of money and time collecting my data. Why should I just donate it to the PDL?
By donating your dataset to the PDL you can have the world's best prognostic modellers trying to solve your prognostic problems, at no cost to you. The PDL is basically crowd-sourcing your prognostic modelling for you. Instead of getting one solution from a single consultant or researcher, you can have lots of people working simultaneously on the problem for free.

Also, if you are a researcher who has collected this data for your own work and already published a resulting paper, users of this dataset will need to reference your work when they publish their own results.
How do I know the dataset is any good? Is it vetted prior to release?
We don't vet the datasets for prognostic modelling quality. We just check that each dataset is usable, has some useful meta-data, and is appropriately anonymised. We also encourage registered users to rate datasets they've used and provide comments, which will be available for everyone to read prior to downloading a dataset. So ultimately, the users will effectively vet the data.
If you aren't vetting the datasets, what do you do to them prior to publishing them on the site?
Firstly, we try to anonymise the datasets by removing any corporate identifiers. If the dataset is provided by a company directly and has not been previously published, we'll usually work extensively with them to ensure all sensitive and identifying terms are removed. For details of the process we use, check out this flowchart.

Secondly, we ensure that useful meta-data is provided as well. We believe datasets are only as good as their meta-data.

Finally, whenever possible, we'll convert the dataset into a consistent format that can be readily interpreted without any proprietary software (e.g. convert MATLAB files into text formats). If this is not feasible, we'll make sure that enough details are provided so that the dataset can be easily utilised. We may even provide scripts to read the dataset using generic open source tools such as R or Python.
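
As a rough illustration of why a plain text format is convenient, the sketch below reads a small comma-delimited sample using only Python's standard library. The column names and values are invented for this example and do not come from any PDL dataset; a real file would be opened with open() instead of the in-memory sample used here.

```python
import csv
import io

# Hypothetical sample standing in for a downloaded text-format dataset.
# Column names and values are invented for illustration only.
SAMPLE_TEXT = """asset_id,operating_hours,vibration_mm_s
A-001,120,2.1
A-001,240,2.4
A-002,100,1.8
"""


def read_dataset(handle):
    """Parse a comma-delimited text file into a list of dicts,
    converting the numeric columns to int and float."""
    rows = []
    for record in csv.DictReader(handle):
        rows.append({
            "asset_id": record["asset_id"],
            "operating_hours": int(record["operating_hours"]),
            "vibration_mm_s": float(record["vibration_mm_s"]),
        })
    return rows


# io.StringIO lets the sample string be read like an open file.
rows = read_dataset(io.StringIO(SAMPLE_TEXT))
print(len(rows), "rows read")
```

No proprietary software is involved at any step, which is the point of the conversion described above.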
If I donate a dataset, how will I know it won't be used by my competitors to improve their competitive advantage?
We can't guarantee this; after all, we want as many people as possible to use this data. The good news is that if a competitor uses your dataset, they won't know it's yours. They could just as easily be analysing their own products. Only the original data provider will know which dataset is theirs.
What do you mean by 'anonymise the data'?
Anonymising the datasets (and meta-data) involves removing any corporate identifiers (company names, manufacturers' names and model names, site names, etc.) and replacing them with randomly generated classifiers. Any information that may help identify a dataset owner or specific application is replaced with a generic term. For example, users will still know that a dataset refers to excavators used for iron-ore mining; they just won't know which brand of excavators, from which iron-ore mine, or belonging to which company. Even well-recognised codes that are specific to a particular company or site should be removed.

It is important to note that although we will do what we can to provide anonymity, the ultimate responsibility for removing all identifiers remains with the data provider.
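
For data providers who want a feel for what this looks like in practice, here is a minimal sketch of the idea, not the actual process used by the PDL. It assumes a dataset held as a list of records; the field names, the "ExampleCo" and "Site Alpha" values, and the make_anonymiser helper are all hypothetical. Each distinct identifier is swapped for a randomly generated code, the same identifier always gets the same code within one run, and generic terms such as the equipment type are left untouched.

```python
import secrets


def make_anonymiser(prefix):
    """Return a function that maps each distinct identifier to a random
    code, reusing the same code whenever the identifier appears again."""
    mapping = {}
    used = set()

    def anonymise(value):
        if value not in mapping:
            code = f"{prefix}-{secrets.token_hex(4)}"
            while code in used:  # regenerate on the (unlikely) collision
                code = f"{prefix}-{secrets.token_hex(4)}"
            used.add(code)
            mapping[value] = code
        return mapping[value]

    return anonymise


# Hypothetical records; names and fields are placeholders for illustration.
records = [
    {"company": "ExampleCo", "site": "Site Alpha",
     "equipment": "excavator", "commodity": "iron ore"},
    {"company": "ExampleCo", "site": "Site Beta",
     "equipment": "excavator", "commodity": "iron ore"},
]

anon_company = make_anonymiser("COMPANY")
anon_site = make_anonymiser("SITE")

# Replace only the identifying fields; keep the generic descriptors.
anonymised = [
    {**r, "company": anon_company(r["company"]), "site": anon_site(r["site"])}
    for r in records
]

for row in anonymised:
    print(row)
```

Because the codes are random, a second run produces different codes for the same identifiers, so the mapping is only meaningful within the files anonymised together.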
I'd like to donate some data but need to prepare it first. Are there any tools to help me?
Yes. The System Health Lab has developed several tools you can use to analyse your dataset (available from The System Health Lab) and we encourage you to make use of them before trying to upload any dataset to the PDL. The UWA Data Analysis Tool may be particularly useful.
How can I upload my dataset to the PDL?
We are currently working on a web-tool that will help with this process. In the meantime, please send us an email and we'll contact you.
Can I combine datasets into one mega dataset?
Usually not, because of the anonymisation process: the same terms in different files may refer to different conditions. For this reason, we try to merge data files that can be used together into one mega-dataset prior to anonymisation. If we know that one dataset can be used with another, we'll let you know this in the meta-data.
I can't seem to download some datasets. Your site just tells me to follow another link. Why is that?
Some data owners have only given us permission to host their meta-data as they want you to go to their website to download the actual data. We think that having the meta-data about as many different datasets in one place is still useful, so we've kept that information here. In these cases, we haven't been able to reformat the data either, so it may still be in a proprietary format.