There is a view that nearly all data about any person is already on the Internet and that, with enough effort, it can be de-anonymized. This view limits the use of anonymized data for analytics and for training artificial intelligence models. The regulator holds the same position: a year ago, speaking at the St. Petersburg International Legal Forum, Deputy Head of Roskomnadzor Milos Wagner emphasized that
anonymized data is still personal data, because it characterizes a person even if it lacks direct identifiers.
On this view, depersonalized data cannot be used in business processes. But not everyone agrees. On June 19, the Big Data Association (BDA) and HFLabs presented the results of experiments on developing a risk methodology for personal data (PD) processing tasks.

According to Nikita Nazarov, technical director of HFLabs, with the right choice of depersonalization tools, applied to the proper degree, it is possible to ensure security while preserving the usefulness of the data for a specific task.
Nikita Nazarov, HFLabs:
– As a security criterion, we propose using a risk-based methodology, that is, assessing the risk of re-identification of a specific data set that is planned to be used in a business case.
According to him, the BDA and HFLabs modeled two cases in which anonymized data is used:
The first: a data set prepared for analysis falls into the hands of attackers, who compare it with information previously leaked to the Internet in an attempt to recover personal data.
The second: attackers combine information from different sources, searching the anonymized array for information about a specific person.
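Both cases above reduce to the same mechanism: joining an "anonymized" set with outside data on quasi-identifiers. A minimal sketch of such a linkage attack, with entirely hypothetical field names and records, might look like this:

```python
# Sketch of a linkage (re-identification) attack: an "anonymized" analytics
# set is joined with a previously leaked dataset on quasi-identifiers.
# All field names and records are hypothetical illustrations.

anonymized = [  # direct identifiers removed, quasi-identifiers kept
    {"birth_year": 1985, "zip": "101000", "diagnosis": "D1"},
    {"birth_year": 1990, "zip": "190000", "diagnosis": "D2"},
]

leaked = [  # an earlier leak that still contains names
    {"name": "Ivanov", "birth_year": 1985, "zip": "101000"},
]

def link(anon, leak):
    """Match records whose quasi-identifiers coincide."""
    matches = []
    for a in anon:
        for l in leak:
            if (a["birth_year"], a["zip"]) == (l["birth_year"], l["zip"]):
                matches.append({**l, **a})  # name is reattached to the sensitive attribute
    return matches

print(link(anonymized, leaked))
```

The quality of depersonalization then comes down to how hard it is to find such joins, which is exactly what a risk-based methodology tries to quantify.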
RISK, BUT STRICTLY BY CALCULATION
The model, whose purpose is to calculate the risks of processing sensitive data, including personal data, began to take shape two years ago, said BDA Executive Director Alexey Neiman.
According to him, testing of the risk model raised four questions, the first of which was: how are k-anonymity (a characteristic of the data set) and the risks of cyber attacks related?
Alexey Neiman, Big Data Association:
– The regulator drew attention to the second risk. They told us: guys, we understand everything – you take some data set inside your enterprise, and suddenly it leaks onto the Internet. The regulator reminds us that a great deal of data has already leaked onto the Internet, so the risks must be calculated on the assumption that attackers will use data that is already out there.
Therefore, he emphasized, the second question addressed during testing was:
How can the risks of information leakage be assessed effectively, taking external sources into account?
The researchers also wanted to find out whether there are simple methods for intersecting data sets without revealing sensitive identifiers, and how different types of protection affect the overall model of information leakage when working with multiple data sources.
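One simple (if imperfect) approach to intersecting sets without exposing identifiers is to compare one-way tokens instead of raw values. The sketch below uses a salted SHA-256 hash; the salt name and identifiers are hypothetical, and in practice plain hashing of low-entropy identifiers such as phone numbers is vulnerable to dictionary attacks, so real deployments lean toward private set intersection protocols:

```python
import hashlib

def blind(identifier, salt):
    """One-way token for an identifier; both parties must share the salt."""
    return hashlib.sha256((salt + identifier).encode()).hexdigest()

salt = "shared-secret-salt"  # hypothetical secret agreed out of band

# Each party tokenizes its own identifiers locally...
party_a = {blind(x, salt) for x in ["+79990000001", "+79990000002"]}
party_b = {blind(x, salt) for x in ["+79990000002", "+79990000003"]}

# ...and only tokens are compared, never the raw identifiers.
overlap = party_a & party_b
print(len(overlap))  # 1: one identifier is common to both sets
```

The design trade-off is exactly the one the researchers probe: the cheaper the protection mechanism, the more residual leakage it admits into the overall risk model.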