data scraping Archives - TechGDPR
https://techgdpr.com/blog/tag/data-scraping/

Data protection digest 18 May – 2 Jun 2024: decentralised clinical research, Meta’s new virtual assistant
https://techgdpr.com/blog/data-protection-digest-05062024-decentralised-clinical-research-meta-ai-training/
Wed, 05 Jun 2024 07:43:31 +0000

The post Data protection digest 18 May – 2 Jun 2024: decentralised clinical research, Meta’s new virtual assistant appeared first on TechGDPR.

In this issue: the personal data lifecycle in decentralised clinical research, Meta’s new AI chatbot, protections for organisations against data scraping, real-world examples of failed backup testing and a spreadsheet error, and much more.

Stay up to date! Sign up to receive our fortnightly digest via email.

Decentralised clinical research

To support sponsors in designing decentralised clinical research projects, the French data protection authority CNIL, together with other state agencies, has set up a pilot project running from January to September 2024. Twenty selected projects will receive targeted support and updated guidance covering the entire lifecycle of personal data processing:

  • Roles and responsibilities (oversight of incoming data);
  • Informed consent process (interviews, leaflets, signatures);
  • Delivery of investigational products (safety data, biological sample handling, home visits, etc.);
  • Data collection and management (defining and handling source data);
  • Trial monitoring (remote access).

In December 2022, the Commission published the European recommendations on decentralised clinical trials. They followed the COVID-19 pandemic, which highlighted the importance of digital tools and decentralised procedures in health research projects.

Meta’s AI virtual assistant under investigation in the EU

Norway’s data protection regulator reports that, as of June 26, posts and photos on Facebook and Instagram (often of a private nature) will be used to develop and improve Meta’s AI assistant service. This won’t include private messages to friends and family. Reportedly, Meta believes it does not need to ask for users’ consent because its interest in using the content outweighs the users’ interests and rights. The regulator has already received a complaint and started an investigation into the new practice, and it expects more complaints, both in Norway and elsewhere in Europe.

At the moment, individuals in Norway who wish to object can do so only through a dedicated form on Facebook and Instagram.

Protections against Data Scraping

The Italian data protection authority has issued non-mandatory guidance on how public and private entities, in their capacity as data controllers, can protect the personal data they publish online from web scraping. It particularly targets the indiscriminate collection of personal data on the internet by third parties for training generative AI models. Concrete measures (taking into account the latest technology and the costs of implementation, in particular for SMEs) may include:

  • creating reserved areas, accessible only upon registration, so that data is no longer publicly available;
  • including anti-scraping clauses in the terms of service of websites;
  • monitoring traffic to web pages, to identify any abnormal flows of incoming and outgoing data;
  • using the technological solutions made available by the same companies responsible for web scraping (e.g. editing the robots.txt file).
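The robots.txt measure can be illustrated with a short sketch. It uses Python's standard `urllib.robotparser` module to check how a set of rules would be interpreted by a crawler that honours them. The crawler name `ExampleAIBot`, the `/members/` path, and the `example.org` URLs are hypothetical placeholders, not names taken from the Italian guidance.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one (made-up) AI crawler entirely and keep
# every other crawler out of a registration-only area.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /members/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a compliant crawler with this user agent may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("ExampleAIBot", "https://example.org/blog/"),
        ("OtherBot", "https://example.org/blog/"),
        ("OtherBot", "https://example.org/members/profile"),
    ]:
        print(agent, url, is_allowed(agent, url))
```

Note that robots.txt only expresses a request: it has an effect only on crawlers that choose to respect it, which is why the guidance lists it alongside other measures rather than as a complete protection.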

Other official guidance

Data collection: Getting data collection right is key to your overall GDPR compliance, explains the Guernsey data protection authority: once you have understood and complied with the principles at the point of collection, the same principles apply throughout the lifecycle of the data. The authority also offers new guidance that applies regardless of the collection method (in-person interviews, emails, online forms, paper forms, video surveillance, social media activity, phone calls, etc.).

Dynamic data security: Data security measures must be viewed as a dynamic rather than a static obligation, according to the Guernsey regulator. In its latest statistical research, the agency found that the long-established trend continues: emails sent to the wrong person remain the most commonly reported breach. At the same time, the vast majority of breaches were still discovered by individuals, and not through system auditing or testing. The regulator asks for a deeper understanding of the potential associated harms, ranging from “loss of confidentiality” to “emotional distress”, so that the risk of such incidents can be properly assessed.


‘Manage GDPR’: The Spanish regulator AEPD has published a new version of its Manage GDPR tool (available in English). ‘Gestiona’ targets controllers and processors as well as data protection specialists. It allows records of processing activities (ROPA) with up to 500 processing activities to be managed in an integrated way and for different entities. It is now also possible to manage risk using privacy measures that the tool suggests for each identified risk factor. The tool runs in the user’s browser, with no application to install, and stores the information locally on the user’s device.

Legal processes

Anonymisation standard: The Quebec government has brought into force the Regulation respecting the anonymisation of personal information. It prescribes that once the purposes for which personal data was used have been achieved, organisations (including those in the private sector) have two choices: destroy the data, or anonymise it for use only for serious and legitimate purposes. It will largely apply from 2025.

UK Data Protection reform on hold: The Data Protection and Digital Information Bill has fallen ahead of a snap UK general election. As UK observers explain, any legislation that did not complete its passage by the end of the ‘wash-up’ on 24 May falls and will need to be reintroduced in the next Parliament. The draft bill was criticised for its flexibility on data sharing in trade and innovation and on state surveillance, which threatened the adequacy decision granted by the EU.

US Privacy and AI legislation: A number of privacy and AI bills moved forward through state legislatures this past month. They include the Maryland Age-Appropriate Design Code and other privacy acts, the Colorado Consumer Protections for AI Act, and the Vermont, Minnesota, and Kentucky Consumer Data Privacy Acts. California’s Bill on AI Accountability was read in the state Assembly, and a House of Representatives subcommittee advanced the American Privacy Rights Act Discussion Draft.

Worldcoin on pause in Spain

The Worldcoin project has committed to freezing its activity in Spain until the end of the year or until the final approval of its processing activities. The data protection authority of Bavaria, where the company has its main establishment in Europe, is progressing with the case and is expected to conclude soon with a final binding decision. Worldcoin uses iris scans for unique identification, with plans to expand towards wider adoption of a global currency on the blockchain, a TechTarget article explains. The iris structure is used to generate a unique identifying code that is saved on the Worldcoin decentralised blockchain to prevent others from replicating the code.

The biometric data is not stored by the scanning device, but is kept in the form of an anonymised ‘IrisHash’.

More enforcement decisions

Failed backup testing: The Danish data protection authority has issued criticism over the breakdown of NemID in 2022, during which up to 1.5 million users had problems logging in to major public services. The data controller followed its emergency procedure to restore operation from a backup solution. The backup turned out to be unavailable, and the last test of its viability had been carried out two years before the collapse. Such tests show whether recovery can be carried out with the existing guides and procedures, whether hardware, software, and data work together, and whether recovery can happen quickly enough, given that the consequences usually grow with time.

Spreadsheet error: In the UK, the Police Service of Northern Ireland is facing a £750,000 fine for failing to protect the personal information of its entire workforce. Personal information including the surname, initials, rank and role of all 9,483 serving officers and staff was included in a “hidden” tab of a spreadsheet published online in response to a freedom of information request. The error caused several officers to move house, cut themselves off from family members and completely alter their daily routines because of the tangible fear of a threat to life. The cause of the data breach was far from trivial: there were insufficient internal procedures and sign-off protocols for the safe disclosure of information.
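One possible pre-release check for this kind of error is sketched below. It assumes the file is in the .xlsx format, which is a ZIP archive whose `xl/workbook.xml` lists every sheet and marks hidden ones with a `state` attribute of `hidden` or `veryHidden`. The function name `hidden_sheets` is illustrative, and this is not a description of the procedure the police service or the regulator used.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# SpreadsheetML main namespace used in xl/workbook.xml.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def hidden_sheets(xlsx_bytes: bytes) -> list[str]:
    """Return the names of sheets marked hidden or veryHidden in an .xlsx file."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as archive:
        root = ET.fromstring(archive.read("xl/workbook.xml"))
    return [
        sheet.get("name", "")
        for sheet in root.findall("m:sheets/m:sheet", NS)
        # A missing state attribute means the sheet is visible.
        if sheet.get("state", "visible") != "visible"
    ]


if __name__ == "__main__":
    import sys

    # Usage: python check_hidden.py disclosure.xlsx  (file name is an example)
    with open(sys.argv[1], "rb") as handle:
        names = hidden_sheets(handle.read())
    print("Hidden sheets:", names if names else "none found")
```

A check like this covers hidden tabs only. Hidden rows and columns, cached pivot-table data, and document metadata would each need their own review before disclosure.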

Data security

US financial entities: If your business is covered by the FTC’s Gramm-Leach-Bliley Safeguards Rule, an amendment that requires covered companies to report certain data breaches is now in effect. It lists thirteen distinct company categories, including payday lenders, mortgage lenders, finance companies, mortgage brokers, account servicers, cheque cashers, wire transferors, collection agencies, tax preparation organisations, credit counsellors, and other financial consultants. According to the amendment, financial institutions must report to the FTC any security breach involving the personal data of at least 500 customers as soon as feasible, and no later than 30 days after discovery.
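The two thresholds mentioned above (at least 500 customers, no later than 30 days after discovery) can be turned into a simple reminder calculation. This is a minimal sketch based only on the figures quoted in this digest. The function name is made up, and the rule's other conditions, such as which types of incident count, are not modelled.

```python
from datetime import date, timedelta
from typing import Optional

CUSTOMER_THRESHOLD = 500      # "at least 500 customers", as quoted above
REPORTING_WINDOW_DAYS = 30    # "no later than 30 days after discovery"


def ftc_report_deadline(discovered_on: date, customers_affected: int) -> Optional[date]:
    """Return the latest reporting date, or None if the threshold is not met."""
    if customers_affected < CUSTOMER_THRESHOLD:
        return None
    return discovered_on + timedelta(days=REPORTING_WINDOW_DAYS)


if __name__ == "__main__":
    print(ftc_report_deadline(date(2024, 6, 1), 1200))
    print(ftc_report_deadline(date(2024, 6, 1), 120))
```

Because the rule says "as soon as feasible", the returned date is an outer limit rather than a target.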

Big Data

Microsoft vs schools: Microsoft’s 365 Education services violate children’s privacy by shifting responsibility onto school administrations, states the NOYB privacy advocacy group. Digital service providers like Microsoft tend to designate educational bodies as data controllers in their Terms and Conditions. In practice, however, the schools have no control over the applications, their design, or their data operations. To give just one example, they cannot satisfy data access requests from individuals because they don’t hold the necessary data.

Malware and data stealing: Law enforcement agencies in the US and EU announced massive operations against some of the most influential cybercrime platforms for delivering ransomware and data-stealing malware. They targeted droppers/loaders (custom-made programs designed to surreptitiously install malware onto a system), which are deployed through email attachments, hacked websites, or bundled with legitimate software. Droppers are typically used in the initial stages of a breach, and they allow cybercriminals to bypass security measures and deploy additional harmful programs.

ShinyHunters ransom: Meanwhile, Ticketmaster in the US was hit by a data hack that may affect 560m customers, the Guardian reports. Cybercrime group ShinyHunters reportedly demanded a £400,000 ransom to prevent the data from being sold. The unauthorised access was spotted in a third-party cloud database environment containing the company’s data. Earlier, Bank Santander also confirmed being hacked by the same group. ShinyHunters claimed it had the data of 30m customers and staff details, 6m account numbers and balances, and 28m credit card numbers, and is demanding a ransom of £1.6m.

Weekly digest April 18 – 24, 2022: business and human rights in the activities of tech companies
https://techgdpr.com/blog/weekly-digest-26042022-business-and-human-rights-in-the-activities-of-tech-companies/
Tue, 26 Apr 2022 06:38:16 +0000

The post Weekly digest April 18 – 24, 2022: business and human rights in the activities of tech companies appeared first on TechGDPR.

TechGDPR’s review of international data-related stories from press and analytical reports.

Official guidance: business and human rights in the activities of tech companies, relaxed covid measures, regulators’ annual analytics

Privacy International (PI) submitted its input to the forthcoming report by the UN High Commissioner for Human Rights on the practical application of the UN Guiding Principles on Business and Human Rights to the activities of technology companies. In summary, the PI report highlights the industry’s systemic lack of accountability, national authorities’ slow or nonexistent enforcement of privacy laws against its exploitative practices, and its relations with governments. Among other things, it:

  • asserts the need for tech companies to provide transparency over their technologies and to make their algorithms auditable, and for states to mandate such transparency when these technologies are used to deliver public functions; 
  • reasserts that contracts between public authorities and tech companies must point to redress mechanisms for complaints handling and enforcement of sanctions for abuses or violations of human rights;
  • calls for public authorities to conduct individual human rights risk and impact assessments, as well as data protection impact assessments, during any surveillance technology procurement process, in addition to companies conducting human rights due diligence, on any prospective state client’s end-use of their technology;
  • asserts that public authorities should not systematically use surveillance and data processing systems deployed for private purposes and/or data derived from these systems, etc.

As COVID-19 measures are relaxed across the UK, the ICO has set out some key things organisations need to consider around the use of personal information. Organisations should check government guidance for where they are based, as it varies between England, Northern Ireland, Scotland, and Wales. In general, organisations should ask themselves a few questions: a) How will continuing to collect extra personal information help keep our workplace safe? b) Do we still need the information previously collected? c) Could we achieve our desired result without collecting personal information? Data protection is also only one of a number of factors to consider when thinking about collecting this information. Organisations should also take into account:

  • employment law and their contracts with employees,
  • health and safety requirements, and
  • equalities and human rights, including privacy rights.

For further information, see the ICO’s earlier guidance on practical methods for destroying documents and on storage limitation.

Meanwhile, the EDPS published its Annual Report 2021. It highlights the EDPS’ achievements regarding EU institutions’ compliance with the data protection framework. The report also underscores the EDPS’ increasing role in advocating for the respect of privacy and data protection in EU legislation. The EDPS increased the use of its corrective powers (e.g. the decision to order Europol to delete datasets with no established links to criminal activity). The year was also unprecedented in terms of EDPS advice given to the EU legislator, with 88 opinions, including formal comments, issued in 2021, compared to 27 in 2020. The EDPS also continued its active participation in the EDPB’s work, and furthered its work on raising awareness about personal data breaches to assist EU institutions in preventing and handling them. You can consult the full report here.

For those who can read Hungarian, the country’s data protection regulator NAIH has similarly prepared a wrap-up of its activities in 2021. It looks at a) the authority’s experience over its first ten years, b) statistical characteristics of cases, c) tutorials for data protection officers, d) data-related procedures in law enforcement, national defence, and national security, e) important court decisions, f) data protection issues in business secrets, g) minors’ data protection, and much more.

Legal processes and redress: lawful data scraping, law firm nonliability for data breach

A decision by the US Ninth Circuit Court of Appeals offers an insight into the conflicting positions of Europe and America on data protection, and offers relief to data scrapers who feared a shutdown of their industry. In the case, business networking site LinkedIn sought to prevent hiQ Labs, a “people analytics” company, from taking data from LinkedIn for its own business purposes. It was successfully argued that the information was publicly available, so no criminal act had taken place. Another point raised was that finding in LinkedIn’s favour would mean big tech companies would have a monopoly on ‘big data’ in the future. The decision may mean problems ahead for key articles of the GDPR, as privacy policy, competition and criminal law are all pulling in different directions.

A federal jury in Kansas City cleared a law firm (Warden Grier) of liability to one of its clients (Hiscox Insurance) after the firm suffered a data breach, the Hogan Lovells blog reports. The plaintiff claimed that the defendant failed to meet its standard of care by not sufficiently analysing its breached server, leaving the plaintiff responsible for approximately $1.3 million in data analysis and related legal bills. Warden Grier’s counsel argued to the jury that Hiscox was confusing the roles of “service providers” and “data owners”. Warden Grier argued that it was a “service provider” under applicable data breach laws and industry norms, and that its role was therefore to provide Hiscox with access to the impacted data, which it had done. Read the full article here.

Data breaches: the leak of health data

The French regulator CNIL issued a €1.5 million fine against the company DEDALUS BIOLOGY after a massive data leak concerning nearly 500,000 people was publicly revealed. The surname, first name, social security number, name of the prescribing doctor and date of the examination, but also and above all medical information (HIV, cancers, genetic diseases, pregnancies, drug treatments followed by the patient, or even genetic data) of these people were disseminated on the internet. In its decision the CNIL stated:

  • As part of the migration from one software tool to another, requested by two laboratories using the services of DEDALUS BIOLOGY, the latter extracted a larger volume of data than required.
  • The company therefore processed data beyond the instructions given by the data controllers.

Many technical and organisational shortcomings in terms of security were upheld against the company in the context of the operations of migrating the software:

  • lack of specific procedure for data migration operations;
  • lack of encryption of personal data stored on the problematic server;
  • absence of automatic deletion of data after migration to the other software;
  • lack of authentication required from the Internet to access the public area of the server;
  • use of user accounts shared between several employees on the private zone of the server;
  • absence of supervision procedure and security alert escalation on the server.

The full decision in French can be read here.

Crypto-asset industry: EU crypto firms appeal against new draft rules

According to Reuters, more than 40 crypto business leaders have asked the EU not to require crypto firms to disclose transaction details, and to dial down attempts to bring to heel rapidly growing decentralised finance platforms (the draft legislation was explained in one of our previous digests). In a letter sent to EU finance ministers, crypto businesses asked policymakers to ensure their regulations did not go beyond rules already in place under the global Financial Action Task Force, which sets standards for combating money laundering. In their opinion, going further would reduce crypto holders’ privacy and safety. The letter also asked that the EU exclude decentralised projects, which include decentralised finance (DeFi), from the requirement to register as legal entities. It also said that certain decentralised “stablecoins” should not be subject to the wider MiCA regulation.

Artificial Intelligence: ISO new guide and EP recommendations on AI Act

The ISO published guidance for members of the governing body of an organisation to enable and govern the use of Artificial Intelligence, in order to ensure its effective, efficient, and acceptable use. The document also provides guidance to a wider community, including executive managers; external businesses or technical specialists, such as legal or accounting specialists, retail or industrial associations, professional bodies; public authorities and policymakers; internal and external service providers (including consultants); assessors and auditors. The guide is applicable:

  • to the governance of current and future uses of AI as well as the implications of such use for the organisation itself;
  • to any organisation, including public and private companies, government entities, and not-for-profit organisations;
  • to an organisation of any size irrespective of its dependence on data or information technologies.

Similarly, the European Parliament’s Committee on the Internal Market and Consumer Protection, and Committee on Civil Liberties, Justice and Home Affairs released a joint report with their recommendations for the proposed Artificial Intelligence Act. Proposed amendments from the committee include a ban on predictive policing, a public AI technology registration requirement and further alignment with the GDPR, IAPP News reports. Advocacy group ‘Access Now’ has already examined the recommendations from the committees. According to them, the draft report contains significant improvements for the protection of fundamental rights. These include the rights of people affected by AI systems to lodge a complaint or seek judicial remedies, for public authorities to register their use of high-risk AI systems in a public database, and numerous improvements to procedures and enforcement. At the same time, the recommendations “have missed an important opportunity to protect people’s rights by completely banning remote biometric identification in publicly accessible spaces.”

Big Tech: GPS data, Google’s “Deny All button”, Pegasus spyware, new Microsoft Purview

Data broker Otonomo is facing a California class-action lawsuit for allegedly secretly collecting and selling GPS data from 50 million vehicle owners worldwide, IAPP News reports. The company, originally founded in Israel, claims it has systems to protect customer privacy, but investigative journalists discovered in 2021 that Otonomo data could reveal customers’ home addresses, where they worked, and where they drove to. At that time, legal opinion was that the company could face problems down the road. The company has deals with several car manufacturers to include its systems onboard, but the lead plaintiff says he was never informed of this, nor was his consent sought.

Google is updating its cookie consent banner, beginning with YouTube France and with a rollout across Google’s services Europe-wide to follow. A few months ago the banner earned the search giant a hefty 150 million-euro fine from the French data regulator, the CNIL. The familiar ‘Accept All’ and ‘Customise’ buttons will be joined by a ‘Deny all’ button disabling cookies altogether. Multiple clicks over several pages were previously needed to opt out of tracking, in violation of the principle that opting out should be as simple for users as opting in.

More high-profile scrutiny of NSO group’s Pegasus spyware is on the way, as the European Parliament launched an inquiry committee into the Israeli company’s potential use of the software on EU member states’ governments, or its use by those governments. Pegasus software was last week reportedly discovered on UK government computer networks, infecting files even within the Prime Minister’s office, and in Spain, it was found infecting pro-Catalonian independence networks.

Microsoft has bundled its Azure Purview and Microsoft 365 Compliance data governance and risk management services into a new package with enhanced and new features to beef up data security and privacy. Christened Microsoft Purview, the new platform should simplify life for administrators, and the integration of functions allows for new capabilities Microsoft says it will extend with time. A key feature will allow admins to apply sensitivity labels to data consistently, across platforms and data types. Labels will now travel with data and be recognised by all services it extends to, says Microsoft.
