February 27, 2026

Data Scraping

By Tendem Team

Is Web Scraping Legal? 2026 Compliance Overview

Disclaimer: This article provides general educational information about the legal landscape surrounding web scraping. It does not constitute legal advice and should not be relied upon as such. Laws vary by jurisdiction and change frequently. Readers should consult with qualified legal counsel for guidance specific to their situation.

Overview

Web scraping - the automated extraction of data from websites - exists within a complex and evolving legal landscape. While certain court rulings have clarified some aspects of scraping legality, multiple legal frameworks intersect to create a nuanced picture that varies by jurisdiction, data type, and intended use.

The landmark hiQ Labs v. LinkedIn case established that accessing publicly available data does not necessarily violate the Computer Fraud and Abuse Act (CFAA) in the United States. The Ninth Circuit Court of Appeals ruled that scraping public information may not constitute "unauthorized access" under federal law. However, this ruling addresses only one legal framework among many that may apply to scraping activities.

This article provides an educational overview of the legal frameworks, court cases, and considerations relevant to web scraping activities. It is not exhaustive and does not address all potential legal issues.

Legal Frameworks That May Apply to Web Scraping

Multiple legal frameworks may be relevant to web scraping activities depending on the jurisdiction, type of data collected, and how that data is used.

| Legal Framework | Scope | Potential Relevance to Scraping |
| --- | --- | --- |
| CFAA (US) | Computer Fraud and Abuse Act - federal statute addressing unauthorized computer access | Courts have ruled on application to public vs. authenticated data access |
| GDPR (EU) | General Data Protection Regulation - EU data protection law | Establishes requirements for processing personal data of EU residents |
| CCPA (California) | California Consumer Privacy Act - state privacy law | Provides California residents rights over their personal information |
| DMCA (US) | Digital Millennium Copyright Act - addresses copyright and circumvention | Section 1201 addresses circumvention of technological protection measures |
| Copyright Law | Protection for creative and original works | Distinguishes between facts (not protected) and creative expression (protected) |
| Contract Law | Terms of Service as potentially binding agreements | ToS violations may create civil liability in certain circumstances |

Notable Court Cases

Several court cases have addressed web scraping legality. These cases provide insight into how courts have analyzed scraping activities, though each case involves specific facts and circumstances.

hiQ Labs v. LinkedIn (2017-2022)

LinkedIn sent cease-and-desist letters to hiQ Labs, a data analytics company that scraped publicly available LinkedIn profiles. hiQ sued and obtained a preliminary injunction. The Ninth Circuit upheld the injunction in 2019 and reaffirmed its decision in April 2022 following a Supreme Court remand.

Ruling at the preliminary injunction stage, the Ninth Circuit concluded that accessing publicly available data - information viewable without authentication - likely does not violate the CFAA's prohibition against unauthorized access. The litigation ended in December 2022 with a consent judgment that included a permanent injunction against hiQ, after the district court found in November 2022 that hiQ had breached LinkedIn's User Agreement.

Meta v. Bright Data (2024)

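Meta sued Bright Data for scraping content from its platforms, relying on contract-based theories. In January 2024, a federal district court in California granted summary judgment to Bright Data on the breach of contract claim, finding that Meta's terms did not bar scraping of publicly available data while logged out. The ruling indicates that contract claims against scrapers depend on whether the scraper is bound by the terms of service and on what those terms actually prohibit.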

Reddit v. Perplexity AI (2025 - Ongoing)

Reddit filed suit against Perplexity AI and several data collection service providers in late 2025. The complaint invokes DMCA Section 1201, alleging circumvention of technological measures including rate limits and anti-bot systems. This case involves legal theories related to AI training data collection. As of early 2026, the case remains pending.

KASPR - CNIL Fine (France, 2024)

In December 2024, the French data protection authority (CNIL) fined KASPR €240,000 for collecting contact details from LinkedIn profiles without a valid legal basis, including details of users who had chosen to restrict their visibility. The decision indicates that publicly visible data may still be subject to privacy regulations when it contains personal information.

Categories of Data

Legal analysis of web scraping often distinguishes between different categories of data based on accessibility, content type, and applicable regulations.

Publicly Accessible Data

Data that is viewable without authentication or payment may receive different legal treatment than protected content. The hiQ case addressed scraping of such publicly accessible data under the CFAA. However, public accessibility does not necessarily exempt data from other legal frameworks such as privacy regulations or copyright law.

Authenticated or Protected Content

Content behind login walls, paywalls, or other authentication mechanisms raises different legal considerations. Courts have generally treated unauthorized access to such content more seriously under computer fraud statutes.
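
The technical side of the public versus authenticated distinction can be sketched in a few lines. The example below is a rough illustration only: the function name and the labels it returns are hypothetical, and an HTTP status code is a technical signal, not a legal category. It simply records whether a response was served without credentials, served with credentials, or refused.

```python
from http import HTTPStatus


def classify_access(status_code: int, sent_credentials: bool) -> str:
    """Return a rough, hypothetical label for how a resource was served.

    This describes the HTTP exchange only; it says nothing about whether
    access was authorized in any legal sense.
    """
    # 401 and 403 signal that the server refused the request as sent.
    if status_code in (HTTPStatus.UNAUTHORIZED, HTTPStatus.FORBIDDEN):
        return "access-controlled"
    # A successful response is labeled by whether credentials were supplied.
    if 200 <= status_code < 300:
        return "authenticated" if sent_credentials else "public"
    # Redirects, server errors, and anything else are left unclassified.
    return "unknown"


public_label = classify_access(200, sent_credentials=False)
login_label = classify_access(200, sent_credentials=True)
refused_label = classify_access(403, sent_credentials=False)
```

A label such as "access-controlled" may be a useful prompt to stop and seek review, since the passage above notes that courts have generally treated access to protected content more seriously.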

Personal Data

Information that identifies or relates to individuals may be subject to privacy regulations regardless of its public visibility. GDPR, CCPA, and similar laws establish requirements for collecting and processing personal data that apply independently of how accessible that data may be on websites.

Copyrighted Content

Creative works including articles, images, and original written content may be protected by copyright law. Factual information is generally not copyrightable, but the expression of facts may be protected. The distinction between facts and creative expression is often fact-specific.

Privacy Regulation Considerations

Privacy laws may apply to web scraping activities that involve personal data, regardless of whether that data appears on public websites.

GDPR

The General Data Protection Regulation applies to processing of personal data of EU residents. GDPR requires a lawful basis for processing personal data, such as consent, legitimate interest, or other enumerated bases. The regulation also imposes requirements around transparency, data subject rights, and data security. Penalties for the most serious violations can reach €20 million or 4% of total worldwide annual turnover, whichever is higher.
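
As a worked illustration of the penalty ceiling, the sketch below computes the upper-tier maximum under GDPR Article 83(5), which is the higher of the fixed amount and the turnover-based amount. The function name and the turnover figures are hypothetical, and the result is only the statutory ceiling, not a prediction of any actual fine.

```python
GDPR_FIXED_CAP_EUR = 20_000_000   # fixed upper-tier amount in euros
GDPR_TURNOVER_PERCENT = 4         # percentage of worldwide annual turnover


def gdpr_upper_tier_cap(annual_turnover_eur: int) -> int:
    """Return the upper-tier maximum fine in euros for a given turnover."""
    # Integer arithmetic avoids floating-point rounding on large amounts.
    turnover_based = annual_turnover_eur * GDPR_TURNOVER_PERCENT // 100
    # The ceiling is whichever of the two amounts is higher.
    return max(GDPR_FIXED_CAP_EUR, turnover_based)


# Hypothetical organizations: 4% of EUR 100 million is EUR 4 million,
# so the fixed amount applies; 4% of EUR 1 billion is EUR 40 million.
smaller_org_cap = gdpr_upper_tier_cap(100_000_000)
larger_org_cap = gdpr_upper_tier_cap(1_000_000_000)
```

For smaller organizations the fixed amount sets the ceiling, while for large ones the turnover-based figure does.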

CCPA and US State Laws

The California Consumer Privacy Act and similar laws in other US states provide residents with rights over their personal information. These laws may impose obligations on organizations that collect personal data, including through automated means. Additional states including Virginia, Colorado, Connecticut, and Utah have enacted privacy legislation.

International Considerations

Privacy laws vary significantly by jurisdiction. Organizations engaged in cross-border data collection may be subject to multiple regulatory frameworks simultaneously. Canada's PIPEDA, Brazil's LGPD, and various Asian privacy regulations each establish distinct requirements.

Terms of Service Considerations

Many websites include Terms of Service that address automated data collection. The legal effect of these terms depends on various factors including how users are notified of and bound to the terms, the specific language of the restrictions, and applicable contract law principles.

Courts have reached different conclusions about the enforceability of ToS provisions against scrapers in different contexts. The hiQ case indicated that ToS violations alone may not give rise to CFAA liability, while contract-based claims remain a separate avenue whose outcome depends on the wording of the terms and on whether the scraper agreed to them, as the Meta v. Bright Data litigation illustrates. The legal effect of ToS provisions remains an evolving area of law.

Whether particular ToS provisions are enforceable against a particular party in a particular situation requires case-specific legal analysis.

Technical Standards: Robots.txt

Robots.txt is a technical standard that allows websites to communicate crawling preferences to automated systems. The file indicates which parts of a site the operator prefers not to be accessed by automated crawlers.

The legal significance of robots.txt is not entirely settled. Respecting robots.txt directives may be viewed as evidence of good faith, while ignoring them may be relevant to certain legal analyses. However, robots.txt is a technical convention rather than a legal mechanism, and its role in legal proceedings varies by jurisdiction and context.
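
To make the mechanics concrete, here is a minimal sketch of reading robots.txt directives with Python's standard `urllib.robotparser` module. The robots.txt content, the `example-bot` user agent, and the `example.com` URLs are all assumptions made for illustration, and the file is inlined so the example runs without network access. Passing this check shows only what the site operator has signaled, not whether a given collection activity is lawful.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the example runs offline.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# Ask whether the hypothetical crawler may fetch each path.
allowed_public = parser.can_fetch("example-bot", "https://example.com/public/page")
allowed_private = parser.can_fetch("example-bot", "https://example.com/private/page")

# Crawl-delay is a widely used but non-standard directive; None if absent.
delay_seconds = parser.crawl_delay("example-bot")
```

In a live setting the parser would typically load the file with `set_url()` and `read()` instead of an inline string.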

Evolving Legal Landscape

The legal framework surrounding web scraping continues to evolve. Several factors contribute to ongoing changes in this area.

AI Training and Data Collection

The growth of AI systems trained on web-scraped data has prompted new legal challenges and regulatory attention. Cases involving AI companies and content creators are working through courts, and outcomes may affect how scraping for AI training purposes is treated legally.

Expanding Privacy Regulation

Privacy laws continue to expand globally. New regulations and amendments to existing laws may affect the legal treatment of data collection activities. Organizations should monitor regulatory developments in jurisdictions where they operate or collect data.

Platform Responses

Website operators continue to develop technical measures to detect and prevent automated access, and some pursue legal action against scrapers. The interplay between technical measures and legal theories like DMCA circumvention claims represents an active area of litigation.

Summary

Web scraping legality involves multiple intersecting legal frameworks including computer fraud statutes, privacy regulations, copyright law, and contract law. Court decisions have provided some guidance, but the legal landscape continues to evolve.

Key factors in legal analysis typically include whether data is publicly accessible or requires authentication, whether personal data is involved and which privacy regulations apply, whether copyrighted content is collected and how it is used, what terms of service govern the target website, and the jurisdiction and applicable law.
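
The factors listed above can be recorded in a simple structure before a project is taken to counsel. The sketch below is a hypothetical intake checklist: the class, field names, and topic strings are illustrative assumptions, and the output is a list of questions to raise, not a legal conclusion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScrapingProfile:
    """Hypothetical summary of a planned scraping project."""
    requires_authentication: bool
    involves_personal_data: bool
    collects_copyrighted_content: bool
    site_has_terms_of_service: bool
    jurisdictions: tuple


def topics_for_counsel(profile: ScrapingProfile) -> list:
    """List the frameworks named in this article that the profile touches."""
    topics = []
    # Access type affects how computer fraud statutes may be analyzed.
    if profile.requires_authentication:
        topics.append("computer fraud statutes (authenticated access)")
    else:
        topics.append("computer fraud statutes (publicly accessible data)")
    if profile.involves_personal_data:
        topics.append("privacy regulations")
    if profile.collects_copyrighted_content:
        topics.append("copyright law")
    if profile.site_has_terms_of_service:
        topics.append("contract law (terms of service)")
    # Jurisdiction is always relevant, so it is always listed.
    topics.append("applicable law in: " + ", ".join(profile.jurisdictions))
    return topics


example_profile = ScrapingProfile(
    requires_authentication=False,
    involves_personal_data=True,
    collects_copyrighted_content=False,
    site_has_terms_of_service=True,
    jurisdictions=("US", "EU"),
)
example_topics = topics_for_counsel(example_profile)
```

Keeping the profile as structured data makes it easier to review the same questions consistently across projects.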

Organizations and individuals considering web scraping activities should consult with qualified legal counsel to evaluate the specific legal issues relevant to their situation.

Further Reading

- hiQ Labs, Inc. v. LinkedIn Corp., 938 F.3d 985 (9th Cir. 2019), reaffirmed on remand, 31 F.4th 1180 (9th Cir. 2022)

- Van Buren v. United States, 141 S. Ct. 1648 (2021)

- General Data Protection Regulation (EU) 2016/679

- California Consumer Privacy Act, Cal. Civ. Code § 1798.100 et seq.

- Computer Fraud and Abuse Act, 18 U.S.C. § 1030

- Digital Millennium Copyright Act, 17 U.S.C. § 1201

- CNIL: Scraping publicly accessible data (KASPR decision)

beta

Task in. Result out.

© Toloka AI BV. All rights reserved.

Terms

Privacy

Cookies

Manage cookies
