An API to Protect Against Re-Identification

Version 1.2.0

Making More Data More Accessible, Safely

Our goal is to help people to make more informed decisions by making it easier to access key datasets, while maintaining individuals’ privacy.

Protari is prototype software that lets authorised users query confidential "unit record" datasets (e.g. covering individuals, households, or businesses). The query results are aggregated and confidentialized on the fly, and only this aggregated, confidentialized data is returned to the user, which reduces the risk of re-identification.
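The aggregate-then-confidentialize flow can be illustrated with a minimal Python sketch. This is not Protari's code or API: the sample records, the function names (`aggregate`, `suppress_small_counts`, `query`), and the threshold are all hypothetical, and small-count suppression is used only as a simple stand-in for whichever confidentialization algorithm a deployment is configured with.

```python
from collections import Counter
from typing import Callable, Dict, List, Mapping, Optional

# Hypothetical unit records. In a real deployment these never leave the server.
RECORDS: List[Dict[str, str]] = [
    {"region": "North"}, {"region": "North"}, {"region": "North"},
    {"region": "North"}, {"region": "South"}, {"region": "South"},
    {"region": "East"}, {"region": "East"}, {"region": "East"},
]

Counts = Mapping[str, int]
SafeCounts = Dict[str, Optional[int]]


def aggregate(records: List[Dict[str, str]], group_by: str) -> Counter:
    """Count the records for each value of the group_by field."""
    return Counter(record[group_by] for record in records)


def suppress_small_counts(counts: Counts, threshold: int = 3) -> SafeCounts:
    """Stand-in confidentialization step (an assumption, not Protari's method):
    hide any count below the threshold by replacing it with None."""
    return {key: (n if n >= threshold else None) for key, n in counts.items()}


def query(
    records: List[Dict[str, str]],
    group_by: str,
    confidentialize: Callable[[Counts], SafeCounts] = suppress_small_counts,
) -> SafeCounts:
    """Aggregate first, then confidentialize; only the result is returned."""
    return confidentialize(aggregate(records, group_by))


if __name__ == "__main__":
    print(query(RECORDS, "region"))
```

Passing the confidentialization step as a function is one way to keep the algorithm swappable; the caller only ever sees the output of that step, never the raw counts or the unit records.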

The software has been developed as an API, or "Application Programming Interface", which is a way of providing a standardized interface for a computer program to interact with the outside world.

We hope that making more data available through an API will allow a wide range of downstream applications to be built on it.

Please explore the links in the menu bar above to find out more about setting up the API with your own data, using the API, and helping to develop the code.

Caveats

Protari is undergoing user trials. It must not be used to provide unrestricted public access to confidential data.

Protari can be configured to use different perturbation and confidentialisation algorithms. The availability of an algorithm in Protari does not imply endorsement of that algorithm by Data61, CSIRO.

Context

Protari can only be part of the solution to the problem of re-identification. Data61 has collaborated with the Office of the Australian Information Commissioner to integrate the different perspectives on the topic of de-identification into a single comprehensible framework. The result is a practical and accessible guide to de-identification, for those who handle personal information and need to share or release it.

Funding

This work was funded by the National Innovation and Science Agenda.