Anamike Ved, Author at TechGDPR

How to develop Artificial Intelligence that is GDPR-friendly

Anamike Ved — Thu, 28 Feb 2019 10:57:01 +0000

GDPR coming into effect coincides with the more widespread adoption of artificial intelligence as the technology becomes embedded in more and more enterprise applications. There is a palpable excitement around AI for its potential to revolutionize seemingly every facet of every industry. Studies reveal that 80% of executives believe AI boosts productivity. In the immediate future, execs are looking for AI to alleviate repetitive, menial tasks such as paperwork (82%), scheduling (79%) and timesheets (78%). By 2025, the artificial intelligence market is reported to surpass $100 billion.

Alongside the excitement, there are concerns. Among them, is how to address data privacy and the concern between data privacy and artificial intelligence is most pronounced in the General Data Protection Regulation (GDPR).

The GDPR is designed to protect the privacy of EU citizens and give them more control over their personal data. It aims to establish a new relationship between user and system – one where transparency and a standard of privacy are non-negotiable. Artificial Intelligence (AI) is a set of technologies or systems that allows computers to perform tasks involving a simulation of human intelligence including decision making or learning. In order to do so, the technology or system collects voluminous amounts of data (called Big Data) and namely personal data. AI (especially Machine Learning [ML] algorithms) and Big Data go hand in hand, which has led many to question whether it is possible to use AI while still protecting fundamental personal data protection rights as outlined in GDPR.

Applying the GDPR to machine learning and artificial intelligence

The GDPR–a sprawling piece of legislation–applies to artificial intelligence when it is under development with the help of personal data, and also when it is used to analyze or reach decisions about individuals. GDPR provisions that are squarely aimed at machine learning state “the data subject shall have the right not to be subject to a decision based solely on automated processing, including profiling, which produces legal effects concerning him or her or similarly significantly affects him or her.” (Article 22 and Recital 71). Also noteworthy are Articles 13 and 15 which state repeatedly that data subjects have a right to “meaningful information about the logic involved” and to “the significance and the envisaged consequences” of automated decision-making.

It is clear that the regulation expects the technologies like AI to be developed while taking into consideration the following principles:

fairness,
purpose limitation,
data minimisation,
transparency, and
the right to information.

The principles mentioned above are supposedly some of the major challenges facing AI to adapt to the new world of GDPR. The problem is because most of the machine learning decision-making systems are “black boxes” rather than old-style rule-based expert systems, and therefore fail to comply with the GDPR requirements of transparency, accountability, and putting the data subject in control.

Solutions and Recommendations to make Artificial Intelligence GDPR-friendly

Some data sets used to train AI systems have been found to contain inherent biases, which results in decisions that unfairly discriminate against certain individuals or groups. To become GDPR compliant, the design, development and use of AI should ensure that there are no unlawful biases or discrimination. Companies should invest in technical research to identify, address and mitigate biases.

One way to address bias in trained machine learning models is to build transparent models. Organizations should improve AI systems transparency by investing in scientific research on explainable artificial intelligence. They should also make their practices more transparent ensuring individuals are informed appropriately when they are interacting with AI and provide adequate information on the purpose and effects of AI systems.

With respect to data minimisation, the developers should start from carrying out research on possible solutions that use less training data, anonymisation techniques and only solutions that explain how systems process data and how they reach their conclusions.

There is need for privacy-friendly development and use of AI. AI should be designed and developed responsibly by applying the principles of privacy by design and privacy by default.

Organizations should conduct data protection impact assessment at the beginning of an AI project and document the process. A report by the Norwegian Data Protection Authority, “Artificial intelligence and privacy” suggests that the impact assessment should include the following as a minimum:

a systematic description of the process, its purpose, and which justified interest it protects;
an assessment of whether the process is necessary and proportional, given its purpose;
an assessment of the risk that processing involves for people’s rights, including the right to privacy; and
the identification of the measures selected for managing risk.

Tools and methods for good data protection in Artificial Intelligence

In addition to impact assessment and the documentation of the process to meet the requirements of transparency and accountability, the Norwegian Data Protection Authority report mentioned above includes tools and methods for good data protection in AI. These methods reportedly have not been evaluated in practice, but assessed according to their possible potential. The methods are divided into three categories:

Methods for reducing the need for training data.
Methods that uphold data protection without reducing the basic dataset.
Methods designed to avoid the black box issue.

1. Methods that can help to reduce the need for training data include:

Generative Adversarial Networks (GANs) have the potential to advance the power of neural networks and their ability to “think” in human ways. It might be an important step towards inventing a form of artificial intelligence that can mimic human behavior, make decisions and perform functions without having a lot of data.
Federated Learning is a privacy-friendly and flexible approach to machine learning in which data are not collected. In a nutshell, the parts of the algorithms that touch the data are moved to the users’ computers. Users collaboratively help to train a model by using their locally available data to compute model improvements. Instead of sharing their data, users then send only these abstract improvements back to the server.
Matrix Capsules are a new variant of neural networks, and require less data for learning than what is currently the norm for deep learning.

2. The field of cryptology offers some promising possibilities in the area of protecting privacy without reducing the data basis, including the following methods:

Differential privacy is the leading technique in computer science to allow for accurate data analysis with formal privacy guarantees. The mechanism used by differential privacy to protect privacy is to add noise to data purposefully (i.e. deliberate errors) so that even if it were possible to recover data about an individual, there would be no way to know whether that information was meaningful or nonsensical. One useful feature of this approach is that even though errors are deliberately introduced into the data, the errors roughly cancel each other out when the data is aggregated.
Homomorphic encryption can help to enforce GDPR compliance in AI solutions without necessarily constraining progress. It is a crypto system that allows computations to be performed on data whilst it is still encrypted, which means the confidentiality can be maintained without limiting the usage possibilities of the dataset.
Transfer Learning enables one to train Deep Neural Networks with comparatively little data. It is the reuse of a pre-trained model on a new problem. In other words, in transfer learning, an attempt is made to transfer as much knowledge as possible from the previous task, the model was trained on, to the new task at hand.

3. Methods for avoiding the black box issue include:

Explainable AI (XAI) plays an important role in achieving fairness, accountability and transparency in machine learning. It is based on the idea that all the automated decisions made should be explicable. In XAI, the artificial intelligence is programmed to describe its purpose, rationale and decision-making process in a way that can be understood by the average person.
Local Interpretable Model-Agnostic Explanations (LIME) provides an explanation of a decision after it has been made, which means it isn’t a transparent model from start to finish. Its strength lies in the fact that it is model-agnostic which means it can be applied to any model in order to produce explanations for its predictions.

The GDPR requires that technologies like AI and machine learning take privacy concerns into consideration as they are developed. With the GDPR, the road ahead will be bumpy for machine learning, but not impassable. The adoption of the measures and the methods discussed above can help to ensure that AI processes are in line with the regulation. These could also go a long way to achieving accountable AI programs that can explain their actions and reassure users that AI is worthy of their trust.

The post How to develop Artificial Intelligence that is GDPR-friendly appeared first on TechGDPR.

What the GDPR’s ‘Privacy By Design’ Really Means for Your Business

Anamike Ved — Fri, 31 Aug 2018 09:52:29 +0000

How, exactly, can privacy be designed? Companies concerned about Europe’s General Data Protection Regulation (GDPR) may or may not have already considered the curious concept of “privacy by design and privacy by default” — but consider it, they must. While it’s hardly the most charming regulatory text ever written, it’s implications are vast, and understanding it properly saves startups considerable time and money (and headaches) if they begin implementing a few key privacy procedures while they are still at earlier stages of product and procedural development. The legal nuts and bolts can be found in Article 25 of the GDPR, with this excerpt below clarifying the main requirements:

“In order to be able to demonstrate compliance with this Regulation, the controller should adopt internal policies and implement measures which meet in particular the principles of data protection by design and data protection by default. Such measures could consist, inter alia, of minimizing the processing of personal data, pseudonymising personal data as soon as possible, transparency with regard to the functions and processing of personal data, enabling the data subject to monitor the data processing, enabling the controller to create and improve security features.” (Recital 78)

Simply put, the GDPR expects companies and other organizations to implement technical and organizational measures at their earliest stages of design and at the earliest stages of their operations. They need to do this in a way that safeguards privacy and data protection principles right from the start (“data protection by design”). Such requirements are also, quite frankly, simple due diligence in the world of reliable data management. So, how does one actually “design” data protection for data subjects?

What is Privacy by Design?

Privacy by design is not a new concept. It is the philosophy proposed by Dr. Ann Cavoukian, the Information and Privacy Commissioner of Ontario in the 1990s. Ann Cavoukian is widely recognized as the primary creator of the privacy by design concept. She defines it as an approach to technology design that embeds privacy-enhancing measures into technology at the point of design and production, and sells to technology to consumers with strong default privacy settings. The foundational principles of “Privacy by Design” as suggested by Ann Cavoukian are:

Privacy by design is proactive, not reactive; it is preventative, not remedial. Privacy by design anticipates and protects privacy against negative and invasive effects of new products and technologies before they happen.
Privacy by design ensures privacy as the default, which means that personal data are automatically protected in any given IT system. If an individual does nothing, their privacy still remains intact. No action is required on the part of the individual to protect their privacy − it is built into the system, by default.
Privacy by design means that privacy is embedded into the design and the architecture of the IT system. It is not bolted on, after-the-fact. The result is that privacy becomes an essential component of the core functionality that is being delivered.

Privacy by design permits full functionality. When embedding privacy into a given technology, process, or system, it should be done in such a way that full functionality is not impaired, and to the greatest extent possible, that all requirements are optimized.
Privacy by design extends securely throughout the entire lifecycle of the data involved. Strong security measures are essential to privacy, from start to finish. Privacy must be continuously protected across the entire domain and throughout the life-cycle of the data in question. There should be no gaps in either protection or accountability. The “Security” principle has special relevance here because, at its essence, without strong security, there can be no privacy.
Privacy by design seeks to assure visibility and transparency, as they are essential to establishing accountability and trust.
Privacy by design is consciously designed around the interests and needs of individual users, who have the greatest vested interest in the management of their own personal data. The architects should keep the interests of the individual uppermost by offering such measures as strong privacy defaults, appropriate notice, and empowering user-friendly options. Keep it user-centric!

After the GDPR came into force on May 25th, 2018 many companies became tempted to regard the regulation as a compliance burden. However, GDPR is about reputation and not just regulation. The benefits of meeting the requirement for data protection by design, which is essentially the GDPR’s version of “privacy by design” go far beyond any legal compliance. Also, as stated earlier, much of it is standard housekeeping if you are already a company that prioritizes data security.

New Consumer Privacy Expectations

Studies have shown that data privacy is a consideration steadily more expected by the consumers. According to a survey conducted online by The Harris Poll on behalf of IBM between March 20-26th, 2018, 78% of U.S. respondents say that a company’s ability to keep their data private is “extremely important” and only 20% “completely trust” organizations they interact with to maintain the privacy of their data. This suggests that privacy breaches not only have significant financial implications but can also cause reputational damage. If consumers do not feel that their privacy is being protected, they will seek out other means of ensuring their privacy.

Embracing privacy from the design phase enables companies to protect customers’ data and enhance their business reputation. It enables trusted, long-term relationships with the existing customers and the opportunity to attract new ones. Irrespective of whether they are affected by the regulatory framework itself, companies should make privacy an integral part of their DNA and their offering for their existence and for their customers’ well being. This is good news for those working in any sector, including IoT (Internet of Things), machine learning, and blockchain.

The reality—that brand reputation and consumer trust are inextricably linked—is especially true in the IoT context. According to one estimate, the total number of connected IoT sensors and devices is set to exceed 50 billion by 2022, up from an estimated 21 billion in 2018. Consumers (or as the GDPR calls them, “data subjects”) want organizations to give them more control over their personal information as the Internet of Things (IoT) grows, and connected devices harvest even more of their data, according to research from the Economic Intelligence Unit (EIU). As more devices, platforms, and infrastructure connect to the Internet in real-time, the most successful industry participants will be those that regard Privacy by Design as an opportunity to demonstrate that they are worthy of consumers’ trust.

A recent report by O’Reilly outlines the current state of machine learning adoption in the enterprise and reveals that in order to keep pace with developing privacy needs, machine learning needs to evolve. “With the EU’s recent General Data Protection Regulation mandates, more companies will begin to implement privacy safeguards into their machine learning practices”, says the report. It further reveals that the GDPR pushes for “privacy by design,” and that more businesses are taking interest in privacy-preserving analytic methods. These methods include techniques like differential privacy, homomorphic encryption, federated learning, and more.

Such privacy-preserving applications not only help companies become GDPR complaint but also allow users to benefit from the security of blockchain, among other technologies. It’s worth noting that the popularity of new decentralized networks comes in large part from the expectation that they offer a means of protecting one’s identity. Ultimately, whatever the technology, taking early action to preserve personal privacy is a winner for both the parties, the companies and the users. The sooner you start, the easier it will be.

For more insights, follow TechGDPR on Twitter.

The post What the GDPR’s ‘Privacy By Design’ Really Means for Your Business appeared first on TechGDPR.