July 30, 2025
Artificial intelligence (AI) model training and data scraping are essential processes in the development of modern AI systems. AI model training involves using large datasets to teach machine learning algorithms to recognize patterns, make predictions, or generate new content. Data scraping refers to the automated extraction of information from websites or other digital sources, often to assemble the vast datasets required for effective AI training.
As these practices spread, questions about the legality of using third-party content, especially copyrighted works, have become increasingly important. In Thailand, the legal landscape for AI developers is shaped primarily by the Copyright Act, which presents unique challenges due to the absence of a fair-use exception. This article examines the copyright-related risks and legal uncertainties facing AI developers under Thailand’s current copyright law and practices, and offers strategic guidance for navigating this complex environment.
Copyright Risks in AI Scraping and Training
Unlike the copyright laws of some other jurisdictions, such as the United States, Thailand’s Copyright Act does not provide a broad fair use or fair dealing exception. This absence has significant consequences for AI developers:
No general defense for AI training: Any use of copyrighted material for AI model training is presumed to be infringing unless a specific, narrow statutory exception applies or explicit permission is obtained from the rights holder. There is no general legal basis for using copyrighted works in AI training without authorization.
Increased rights clearance burden: Developers must identify and secure licenses for every copyrighted work included in their training datasets. Given the scale and diversity of data required for effective AI models, this process can be both impractical and costly.
Legal ambiguity and litigation risk: The lack of clear statutory guidance or case law leaves developers in a legal gray area. There is no established precedent clarifying whether certain uses of copyrighted material for AI training might be tolerated or