Katharina Rasch: What every data scientist should know about data anonymization | PyData Berlin 2016


PyData Berlin 2016

There are numerous examples of data anonymization gone horribly wrong. The most prominent one might be the Netflix Prize, where researchers were able to uniquely identify users by combining Netflix user data with IMDb reviews. Let's learn from these mistakes and look at some of the measures you can take to better anonymize data before you share it with others.
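To make the idea of combining two datasets concrete, here is a minimal toy sketch of a linkage attack. It is not the researchers' actual method: all names, IDs, and ratings are made up for illustration, the helpers `matching_ids` and `link` are hypothetical, and the sketch assumes exact rating matches, which is a deliberate simplification.

```python
# Toy linkage attack: every record below is invented for illustration.

# "Anonymized" release: names replaced by opaque IDs, ratings kept.
anonymized_ratings = {
    "user_17": {"Movie A": 5, "Movie B": 2, "Movie C": 4, "Movie D": 1},
    "user_42": {"Movie A": 3, "Movie C": 4, "Movie E": 5},
    "user_99": {"Movie B": 2, "Movie D": 5, "Movie E": 1},
}

# Public side information: reviews posted under a real name.
public_reviews = {
    "Alice Example": {"Movie A": 5, "Movie C": 4, "Movie D": 1},
    "Bob Example": {"Movie B": 2, "Movie D": 5},
}


def matching_ids(known_ratings, release):
    """Return the release IDs whose ratings agree with every known rating."""
    return [
        user_id
        for user_id, ratings in release.items()
        if all(ratings.get(movie) == score for movie, score in known_ratings.items())
    ]


def link(public, release):
    """Map each public name to a release ID when exactly one candidate matches."""
    linked = {}
    for name, known in public.items():
        candidates = matching_ids(known, release)
        if len(candidates) == 1:  # a unique match means re-identification
            linked[name] = candidates[0]
    return linked


reidentified = link(public_reviews, anonymized_ratings)
print(reidentified)
```

Even in this tiny example, two or three known ratings are enough to single out one record, which is why removing names alone does not anonymize a dataset.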

Outline:

- Look at some of the examples where data anonymization was broken and identify what went wrong
- What is the state of the art for data anonymization and can you be sure to be safe if you follow it?
- How does anonymization affect the possibilities for data mining/machine learning on the data?

This talk is aimed at people who want to release open data or share sensitive data between departments.

Slides: https://github.com/krasch/presentatio...

00:00 Welcome!
00:10 Help us add time stamps or captions to this video! See the description for details.

Want to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...
