June 15, 2026
The surge in AI development has created intense demand for large, high-quality training datasets. However, real-world data can be expensive to collect, difficult to access, and often subject to strict privacy and regulatory constraints. Synthetic data, which consists of artificially generated records that replicate the statistical properties of real-world data without reproducing any specific individual’s information, offers an appealing solution: datasets can be generated at scale without relying on identifiable personal information. It combines speed, cost efficiency, and regulatory compliance, making it a sensible alternative for organizations seeking to reduce risk while maintaining data utility. When properly anonymized, synthetic datasets may fall outside the scope of laws such as the EU’s General Data Protection Regulation (GDPR) or Thailand’s Personal Data Protection Act (PDPA), reducing compliance burdens while still supporting high-quality model training.

However, relying on synthetic data without rigorous legal due diligence could be a strategic mistake. Doing so replaces one set of known risks (scraping, direct privacy liability) with a new set of complex liabilities. The narrative that synthetic data is a “silver bullet” for privacy and IP compliance is misleading and potentially dangerous. While synthetic data addresses data scarcity, it also introduces new legal uncertainties. Legal counsel should anticipate downstream risks arising from compromised data sources: models trained on unlawfully obtained data may need to be decommissioned, even if their outputs appear lawful.

What is synthetic data?

Synthetic data refers to artificially generated information created using AI techniques such as deep learning and generative models. Instead of copying real records, it reproduces the statistical patterns and relationships found in the original dataset.
Synthetic data generally falls into three categories: Fully synthetic data – Entirely new data points generated from learned patterns. The model studies the structure of the original data and produces records that resemble real-world