AI Governance & Risk Management · Guide

AI and Intellectual Property Regulations: Copyright, Patents, and Trade Secrets

March 8, 2025 · 14 min read · Michael Lansdowne Hauge
For: Legal/Compliance, CTO/CIO, Consultant, CISO, IT Manager, CEO/Founder, CMO, CHRO, Head of Operations, Data Science/ML

Navigate IP law for AI systems. Understand copyright in training data and AI outputs, patentability of AI inventions, trade secret protection for models, and licensing frameworks for AI-generated content.


Key Takeaways

  1. Purely AI-generated works generally lack copyright protection; human authorship is required for protection.
  2. The legality of training AI on copyrighted works is unresolved and varies by jurisdiction and use case.
  3. AI cannot be named as an inventor; human inventors must contribute to conception for valid patents.
  4. Trade secret protection is central for safeguarding proprietary models, weights, and training data.
  5. Generative AI providers face growing transparency and IP compliance obligations, especially in the EU.
  6. AI users bear significant risk for infringing outputs and should implement review and governance controls.

The rapid proliferation of generative AI has collided with intellectual property frameworks designed for an era of exclusively human creators. The resulting legal uncertainty touches every dimension of IP law: copyright, patents, and trade secrets. Organizations deploying AI systems now face a landscape where fundamental questions remain unanswered. Can AI-generated works receive copyright protection when no human author exists? Does scraping copyrighted content for model training constitute fair use? Can an AI system be named as an inventor on a patent filing? And how does one protect proprietary AI models as trade secrets when those models are served through cloud APIs?

The stakes are enormous. The U.S. Copyright Office maintains that copyright requires human authorship and has rejected registrations for AI-generated works. Courts remain split on whether training AI on copyrighted materials qualifies as fair use, with major lawsuits against OpenAI, Stability AI, and others still unresolved. The U.S. Patent and Trademark Office refuses to recognize AI as an inventor but will grant patents for AI-implemented inventions where human inventors are identified. This guide maps the evolving IP landscape and provides practical strategies for protecting AI innovations while navigating deeply uncertain legal terrain.

Can AI-Generated Works Be Copyrighted?

The U.S. Copyright Office issued guidance in 2023 establishing a clear position: purely AI-generated works cannot receive copyright protection. The reasoning flows from a long-standing doctrinal requirement that copyright applies only to "original works of authorship," and "authorship" has been interpreted to require a human creator since the Supreme Court's 1884 decision in Burrow-Giles. Just as works produced by nature (the well-known monkey selfie case among them) lack copyright eligibility, works produced by a non-human AI system fall outside the statute's protective scope.

Several high-profile registration decisions have crystallized this principle. In the Zarya of the Dawn case (2023), the Copyright Office initially granted copyright for a graphic novel containing Midjourney-generated images, then rescinded protection for the AI-generated images themselves while preserving copyright only for the human-authored text and the author's creative arrangement. The AI-generated artwork Théâtre D'opéra Spatial, also created with Midjourney, was denied copyright registration in 2023. And A Recent Entrance to Paradise, a text generated with GPT-3, was denied registration in 2022.

The picture becomes more nuanced when humans collaborate meaningfully with AI tools. The Copyright Office recognizes copyright protection when a human exercises creative control over AI output and the human contribution is more than de minimis. Writing detailed prompts, selecting and arranging AI outputs, and substantively editing results can produce copyrightable work. An author who writes prompts, curates a thousand AI images, and arranges them into a graphic novel may hold copyright in the selection, arrangement, and human-authored text. By contrast, typing a simple prompt such as "create image of sunset over ocean" and accepting the first output produced by Midjourney is insufficient to establish copyright in the resulting image.

AI Training on Copyrighted Works: Fair Use?

At the center of the most consequential IP disputes in a generation lies a deceptively simple question. AI models are trained on billions of copyrighted works (books, articles, images, code) scraped from the internet without permission. Is this copyright infringement, or does it qualify as fair use?

The answer will emerge from a series of landmark lawsuits now working through the courts. In Authors Guild v. OpenAI, authors allege that GPT was trained on their books without license. Getty Images v. Stability AI claims that Stable Diffusion was trained on Getty's stock photo library without permission. The New York Times v. OpenAI and Microsoft argues that training on news articles infringes copyright and undermines licensing markets. A class action against GitHub Copilot alleges that training on open-source code violates license terms and constitutes infringement.

The plaintiffs' arguments center on four claims. First, training involves copying entire works into the training corpus, violating the reproduction right. Second, the trained model is allegedly a derivative work of the training corpus. Third, AI substitutes for human creators, depressing demand for original works and undermining licensing markets. Fourth, AI merely remixes training data without adding new meaning or purpose, making the use non-transformative.

Defendants counter on each point. They argue that the model learns statistical patterns rather than storing or reproducing specific works, making the use fundamentally transformative. They characterize training as intermediate copying for analysis, analogous to the Google Books scanning project and search engine indexing. They contend that AI outputs do not replace specific works but serve a different market and function. And they invoke the public benefit of advancing science, innovation, and access to information.

The fair use analysis under 17 USC Section 107 involves four factors, each presenting unresolved tensions. On the first factor (purpose and character of use), the commercial nature of AI companies' operations weighs against fair use, while the potentially transformative nature of pattern learning weighs in favor. Courts have not yet aligned on whether training is "transformative" in the legal sense. The second factor (nature of the copyrighted work) tilts more favorably when factual works like news and research are involved, compared to highly creative works like novels and art. Training corpora typically contain both. The third factor (amount used) weighs against fair use because training uses entire works, though defendants argue this is necessary for the transformative purpose of pattern learning. The fourth factor (effect on the market) depends heavily on whether outputs compete with or closely mimic specific works. Where outputs substitute for originals, this weighs against fair use. Where they serve distinct purposes, it supports fair use. The emergence of AI training licenses adds another dimension: as rightsholders develop these licensing markets, ignoring them may itself weigh against a fair use finding.

Courts may ultimately distinguish between factual and creative works, and between research and commercial applications. Regardless of doctrinal outcomes, significant settlement pressure may push AI companies toward broad licensing agreements.

The European Union has taken a different approach. The EU Copyright Directive creates a specific text and data mining (TDM) exception. Article 3 permits TDM for scientific research purposes. Article 4 allows commercial TDM unless rights holders explicitly opt out. This opt-out framework creates a fundamentally different landscape from the uncertain fair use analysis playing out in U.S. courts.

Even where the legality of training remains contested, a separate infringement risk arises from AI outputs themselves. Under the substantial similarity test, AI output infringes if it is substantially similar to a copyrighted work. Intent is irrelevant; unintentional copying can still constitute infringement. The fact that Stable Diffusion has generated images containing visible Getty watermarks, for instance, suggests close copying of training data.

The memorization problem compounds this risk. Large models sometimes memorize and reproduce training data verbatim or near-verbatim. Researchers have documented language models reproducing passages from books in their training sets, and image models can output near-identical copies of training images when prompted in certain ways.

Liability attaches at multiple points in the chain. AI users face direct liability if they create, publish, or commercialize infringing outputs. AI providers face potential liability under contributory or vicarious infringement theories. Available defenses include de minimis (the copying is too trivial to constitute infringement), fair use (the output is transformative and does not substitute for the original), and DMCA safe harbor (limited protection for providers hosting user content, contingent on notice-and-takedown compliance).

Licensing AI Training Data

A market for licensed AI training data is rapidly taking shape, driven by AI companies seeking to mitigate copyright risk. Industry projections put this market at billions of dollars by the mid-2020s as licensing agreements with publishers, stock photo services, and content creators become standard practice.

Content licensing deals represent the most prominent model. AI companies are licensing archives from publishers, news organizations, and stock photo libraries, with deals between OpenAI and major publishers among the most visible examples. Terms are typically confidential but often involve multimillion-dollar payments along with attribution or branding requirements.

Opt-out mechanisms offer an alternative framework. The robots.txt file allows website operators to instruct crawlers not to scrape their content, and some AI companies voluntarily honor it. Services like Spawning.ai allow creators to signal that their works should not be used for training. The legal status of these mechanisms remains uncertain, with violations more likely to implicate contract or computer access laws than copyright itself.
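The robots.txt mechanism can be checked programmatically. The sketch below uses Python's standard urllib.robotparser to test whether a crawler may fetch a page; GPTBot is OpenAI's documented training-crawler name, but the policy shown is an illustrative example of an opt-out, not a recommendation.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: opt the whole site out of one AI training
# crawler (GPTBot, OpenAI's documented bot name) while leaving the site
# open to all other crawlers. Bot names vary by vendor.
robots_txt = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

url = "https://example.com/articles/piece.html"
# The AI training crawler is blocked site-wide...
print(parser.can_fetch("GPTBot", url))    # False
# ...while other crawlers remain free to fetch the same page.
print(parser.can_fetch("OtherBot", url))  # True
```

As the article notes, honoring such directives is voluntary for crawlers; the check above only tells a compliant operator what the site has asked for.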

AI-friendly licenses add further nuance. Some Creative Commons licenses (such as CC BY) allow commercial AI training, while others (such as CC BY-NC-ND) restrict it. Permissive open-source licenses like MIT and Apache are generally viewed as compatible with AI training, but copyleft licenses like GPL raise unresolved derivative-work questions.

Micro-licensing platforms are experimenting with paying creators small amounts when their works are used for training. Artist-focused platforms and stock providers integrating AI tools have begun offering these arrangements, though challenges around attribution, tracking, and scalable payment infrastructure remain substantial.

AI and Patent Law

Can AI Be Named as Inventor?

The current answer across every major jurisdiction is no.

The question was tested definitively through the DABUS cases (2018 to 2023), in which Dr. Stephen Thaler filed patent applications naming his AI system "DABUS" as the sole inventor. The USPTO rejected the application, holding that "inventor" must be a natural person. The UK Intellectual Property Office and UK Supreme Court reached the same conclusion, finding that patent law contemplates only human inventors. The European Patent Office rejected the filing on the grounds that an inventor must have legal personality. In Australia, a Federal Court judge initially ruled that an AI could be named as inventor, but the Full Federal Court overturned that decision on appeal. South Africa granted a DABUS patent with minimal examination, a result that has proven uninfluential elsewhere.

The rationale is consistent across jurisdictions. Statutory language referring to "inventor," "individual," and "person" is interpreted as requiring a human being. Inventors must sign oaths and assign rights, and AI systems cannot hold or transfer legal rights. At a policy level, patent systems are designed to incentivize human innovation.

The practical consequence is significant. AI-assisted inventions must list the human inventors who contributed to the conception of the invention. Purely AI-generated inventions with no human conceptual contribution are currently unpatentable in most jurisdictions.

Patenting AI-Implemented Inventions

Where a human conceives the inventive concept and uses AI as a tool to implement or validate it, the resulting invention can be patentable, subject to the standard requirements of subject matter eligibility, novelty, non-obviousness, and utility.

AI patent applications face several distinctive challenges. Abstract idea rejections under the Alice/Mayo framework are common, as many AI claims are characterized as abstract mathematical algorithms or data processing. Applicants must demonstrate a concrete technical improvement or application, such as improved hardware performance, reduced latency, or better compression.

Enablement requirements under 35 USC Section 112 demand that the specification teach a person of ordinary skill in the art (POSITA) how to make and use the invention without undue experimentation. For AI inventions, this often means disclosing model architecture, training methodology, data characteristics, and performance metrics. Written description requirements add a further burden: applicants must demonstrate they possessed the claimed invention at the time of filing. Overly broad claims to "an AI system" without specific technical detail risk rejection.

The question of obviousness is also evolving. As AI tools become standard instruments in the inventor's toolkit, what qualifies as "obvious" to a POSITA may shift upward. Current doctrine still evaluates obviousness from a human perspective, but AI-assisted design may raise the bar over time.

Inventorship for AI-Assisted Inventions

Determining inventorship for AI-assisted inventions depends on the conception test: the inventor is the person who forms in their mind a definite and permanent idea of the complete and operative invention. When a human formulates the problem, defines the solution space, and interprets AI outputs to reach a specific inventive concept, that human qualifies as the inventor. Current law does not recognize AI as a co-inventor. Where AI contributes substantially to conception and the human contribution is minimal, there is a real risk that no valid human inventor exists, potentially rendering the invention unpatentable.

Organizations should document human contributions to conception meticulously, including problem framing, model design choices, and interpretation of results. Internal and external communications should avoid characterizing the AI as the "inventor."

Trade Secrets for AI Models

Why Trade Secret Instead of Patent?

Trade secret protection offers several compelling advantages for AI models. It requires no public disclosure of model architecture, weights, or training data. Protection can last indefinitely as long as secrecy is maintained. There is no examination or registration process, and the scope of protectable information is broad, covering data, processes, parameters, and more.

The disadvantages are equally significant. Trade secret law provides no protection against independent development or reverse engineering. Protection is permanently lost once the secret becomes public. And enforcement requires proving both misappropriation and that reasonable secrecy measures were in place.

What Qualifies as a Trade Secret?

Under the Uniform Trade Secrets Act (UTSA) and the Defend Trade Secrets Act (DTSA), trade secret protection requires three elements: the information must consist of a formula, pattern, compilation, program, device, method, technique, or process; it must derive independent economic value from not being generally known; and it must be subject to reasonable efforts to maintain secrecy.

For AI organizations, the range of protectable assets is substantial. Model architectures and custom layers, proprietary training datasets and curated corpora, hyperparameters and optimization strategies, training pipelines with preprocessing and augmentation techniques, and pre-trained weights with fine-tuned checkpoints all qualify as potential trade secrets.

Reasonable secrecy measures form the foundation of enforceability. These include role-based access controls with least-privilege permissions, NDAs and confidentiality clauses with employees, contractors, and partners, encryption of data and model artifacts at rest and in transit, logging and monitoring of access to sensitive systems, and watermarking or fingerprinting of models to trace leaks.
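As a minimal illustration of the fingerprinting idea, the sketch below hashes serialized weights so that a leaked checkpoint can be matched byte-for-byte against internal releases. The weight bytes are invented placeholders, and real watermarking schemes go further by embedding signals that survive fine-tuning; this sketch only detects exact copies.

```python
import hashlib

def fingerprint(weights: bytes) -> str:
    """SHA-256 digest of serialized model weights: a crude fingerprint
    for tracing a leaked checkpoint back to a specific internal release.
    Detects byte-identical copies only; any retraining breaks the hash."""
    return hashlib.sha256(weights).hexdigest()

# Placeholder byte strings standing in for serialized checkpoints.
internal = b"\x00\x01fake-serialized-weights\x02"
leaked   = b"\x00\x01fake-serialized-weights\x02"
modified = b"\x00\x01fake-serialized-weights-tuned"

print(fingerprint(leaked) == fingerprint(internal))    # True: exact copy traced to a release
print(fingerprint(modified) == fingerprint(internal))  # False: any change defeats the hash
```

Keeping a registry of such digests per release, alongside access logs, supports the "reasonable secrecy measures" showing that trade secret enforcement requires.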

Cloud-hosted AI presents particular challenges for maintaining trade secret status. API access exposes model behavior, enabling potential model extraction or membership inference attacks. Customers may infer aspects of training data from outputs. Mitigations include rate limiting, output filtering, and privacy-preserving training techniques.
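Rate limiting is the simplest of these mitigations to sketch. A minimal per-client token bucket, with illustrative capacity and refill values rather than recommended ones, might look like this:

```python
import time

class TokenBucket:
    """Minimal per-client token bucket, a common way to throttle API
    queries and slow down model-extraction attempts. The capacity and
    refill rate here are illustrative assumptions."""

    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# A client bursting far above the refill rate is cut off quickly:
bucket = TokenBucket(capacity=5, refill_per_sec=1.0)
results = [bucket.allow() for _ in range(10)]
print(results.count(True))  # 5 — only the initial burst is served
```

In production the bucket would be keyed per API credential and combined with the output filtering and privacy-preserving training techniques mentioned above.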

One strategic tension deserves particular attention: once an organization releases model weights or training data publicly, trade secret protection is permanently lost. Many organizations navigate this by open-sourcing older models or subsets of their technology stack while keeping frontier models and data pipelines protected as trade secrets.

Licensing AI-Generated Content

Who Owns AI-Generated Content?

Ownership of AI-generated content depends on the degree of human involvement. Under current U.S. guidance, purely AI-generated content without meaningful human authorship is not copyrightable. Such content effectively falls into the public domain, and anyone can copy or reuse it without restriction.

Where a human contributes original expression through detailed prompting, selection, arrangement, and editing, that human may hold copyright in those contributions. A designer who uses AI to generate many images, then heavily edits and composes them into a unique layout, can claim protection in the resulting work.

Contractual terms add another layer of complexity. Many AI providers assign or grant broad rights in outputs to users, even where the copyright status of those outputs is uncertain. Some providers reserve rights to use inputs and outputs for model improvement unless users explicitly opt out.

Commercial Use of AI-Generated Content

The absence of copyright protection for purely AI-generated content creates a fundamental commercial challenge: without copyright, there is no legal mechanism to prevent others from copying that content. This significantly weakens any IP-based competitive advantage for purely AI-generated assets such as marketing images or stock content.

At the same time, AI outputs may still infringe third-party rights if they are substantially similar to protected works. Users who publish or commercialize such outputs face potential infringement claims regardless of the output's own copyright status.

Organizations can mitigate these risks through several approaches. Adding substantial human creativity through editing, composition, and narrative strengthens copyright claims in the resulting work. Reviewing outputs for similarity to known works is particularly important in high-risk domains such as logos, characters, and code. Seeking contractual indemnities or warranties from enterprise AI vendors provides an additional layer of protection. IP insurance, including errors and omissions policies covering copyright and trademark claims, offers a financial backstop.

Licensing Models

AI model licenses span a broad spectrum. Open-source models released under MIT, Apache 2.0, or similar licenses allow broad commercial use, while copyleft licenses like GPL impose share-alike obligations. AI-specific licenses, including Responsible AI Licenses (RAIL), restrict harmful uses while permitting commercial deployment. Proprietary API access comes with contractual restrictions on use, redistribution, and benchmarking.

Content licensing follows its own patterns. Platforms now offer AI-generated images or videos with commercial licenses and IP warranties. Enterprise contracts may treat outputs as work-for-hire or grant exclusive licenses, subject to provider policies and the underlying uncertainty around copyright status.

Regulatory Developments

The U.S. Copyright Office has launched formal inquiries into three critical areas: the copyrightability of AI-generated works and human-AI collaborations, the legal treatment of training on copyrighted works, and liability for infringing AI outputs. Potential outcomes include new guidance from the Office, legislative proposals from Congress, and influential case law from the litigation already working through the federal courts.

EU AI Act and IP Intersections

The EU AI Act introduces transparency obligations that directly intersect with intellectual property law. Providers must disclose when content is AI-generated (Article 50), and providers of general-purpose AI models must publish sufficiently detailed summaries of the copyrighted content used in training (Article 53). These requirements are designed to help rightsholders identify unauthorized use and enforce their rights.

The Copyright Directive (DSM Directive, Article 17) adds further obligations. Platforms hosting user content can face direct liability for copyright infringement and must implement licensing, filtering, or takedown mechanisms. As AI-generated content proliferates across these platforms, similar obligations are likely to extend to AI content providers.

International Developments

Beyond the U.S. and EU, several jurisdictions are charting distinct paths. The United Kingdom considered a broad TDM exception applicable to any purpose but paused the reform after significant industry pushback. China's generative AI regulations require respect for IP rights and discourage training on unlicensed content, though enforcement mechanisms are still evolving. Japan has adopted a notably permissive stance: incidental copying for data analysis and AI training is generally allowed under Article 30-4, making the country an attractive jurisdiction for AI research and development.

Practical IP Strategies

For AI Developers

Training data compliance should begin with a preference for licensed, public domain, or clearly permissioned datasets. Where organizations rely on fair use or local TDM exceptions, documenting the legal analysis and risk assessments is essential. Honoring opt-out signals where feasible reduces both legal and reputational risk.

Output controls represent a critical second line of defense. Implementing filters to detect and block outputs that closely match known copyrighted works or contain watermarks helps prevent infringement. Periodic testing for memorization, with corresponding adjustments to training or safety layers, further reduces exposure.
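One simple way to approximate such a filter is n-gram overlap between a candidate output and a corpus of known works. The sketch below is an illustrative heuristic rather than a production matcher; the n-gram size, sample texts, and implied threshold are all assumptions.

```python
def ngram_overlap(candidate: str, reference: str, n: int = 8) -> float:
    """Fraction of the candidate's word n-grams that also appear in the
    reference. A high score flags near-verbatim reproduction for human
    review; the n-gram size is an illustrative choice."""
    def grams(text: str) -> set:
        words = text.lower().split()
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
    cand = grams(candidate)
    if not cand:
        return 0.0
    return len(cand & grams(reference)) / len(cand)

reference = ("it was the best of times it was the worst of times "
             "it was the age of wisdom")
verbatim = "it was the best of times it was the worst of times"
original = "the model produced an entirely different sentence about summer weather today"

print(ngram_overlap(verbatim, reference))  # 1.0: every 8-gram matches; block or escalate
print(ngram_overlap(original, reference))  # 0.0: no overlap; pass
```

The same check, run over a model's own training corpus instead of third-party works, doubles as a crude periodic memorization test.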

Terms of service should clearly address ownership of inputs and outputs, allocate risk and responsibility for infringing use (through user indemnities and limitations of liability), and establish acceptable-use policies with enforcement mechanisms.

Patent and trade secret strategy requires deliberate choices about what to protect and how. Core technical innovations that are difficult to keep secret (hardware designs, protocols) are often best protected through patents. Training data, model weights, and proprietary pipelines are better suited to trade secret protection. Defensive publications can prevent competitors from patenting widely used techniques.

For AI Users and Deployers

Understanding ownership and rights begins with a careful review of provider terms of service, focusing on ownership, license scope, and reuse rights. For custom AI solutions, negotiating work-for-hire arrangements or assignments may be strategically important.

Managing infringement risk requires establishing review processes for high-stakes outputs, particularly in branding, product design, and production code. Maintaining records of prompts, edits, and human contributions creates an evidentiary foundation for any future disputes.

Commercialization practices should include adding human creative input to AI outputs used in branding, content, and product features. Vendor selection criteria should incorporate IP warranties, indemnities, and the provider's overall compliance posture.

For Content Creators

Controlling the use of existing works requires proactive measures. Deploying robots.txt directives and registering with Do Not Train services signals preferences to AI companies. Choosing licenses (such as specific Creative Commons variants) that reflect a clear stance on AI training provides additional contractual protection.

Monitoring and enforcement tools include reverse image search and dataset search utilities for identifying potential misuse. DMCA takedowns and platform policies offer mechanisms for addressing AI-generated content that infringes existing works.

Monetizing participation in the AI ecosystem is also becoming viable. Micro-licensing platforms and collective bargaining initiatives offer new revenue channels. Content creators who control valuable archives or catalogs may find direct licensing deals with AI companies increasingly attractive.

Common Questions

Do I own the copyright in content I create with AI?

You may own copyright only in the parts of the work that reflect your own original expression. Simple prompting with no meaningful creative input typically does not create copyrightable authorship, but substantial prompting, editing, and arrangement can support a human authorship claim.

Is it legal to train AI models on copyrighted works?

The legality is unsettled and depends on jurisdiction and context. In the U.S., courts are still weighing whether training on scraped copyrighted content is fair use. In the EU and some other regions, specific text and data mining exceptions apply, often with opt-out rights for rightsholders.

Can an AI system be named as an inventor on a patent?

No. Major patent offices, including the USPTO, EPO, and UKIPO, require inventors to be natural persons. AI-assisted inventions must list human inventors who contributed to conception.

How can we protect our AI models as trade secrets?

Protect models and data through access controls, encryption, NDAs, monitoring, and clear confidentiality policies. Avoid releasing model weights or training data publicly, and document your reasonable efforts to maintain secrecy.

Who is liable when AI output infringes copyright?

The user who generates and distributes the infringing output is typically directly liable. The AI provider may face secondary liability if it knowingly facilitates infringement or profits from it while having the ability to control it.

EU Text and Data Mining Rules

The EU Copyright Directive allows text and data mining for scientific research and, with an opt-out mechanism, for commercial purposes. AI developers targeting EU data should implement processes to detect and honor rightsholder opt-outs.

Multi-billion dollar

Projected size of the AI training data licensing market by the mid-2020s

Source: Industry analyst projections cited in IP and AI market reports

"For frontier AI systems, the most valuable IP is often not the code but the combination of proprietary data, training pipelines, and model weights—assets that are usually best protected as trade secrets rather than patents."

AI and IP regulatory practitioners

References

  1. EU AI Act — Regulatory Framework for Artificial Intelligence. European Commission, 2024.
  2. AI Risk Management Framework (AI RMF 1.0). National Institute of Standards and Technology (NIST), 2023.
  3. OECD Principles on Artificial Intelligence. OECD, 2019.
  4. ASEAN Guide on AI Governance and Ethics. ASEAN Secretariat, 2024.
  5. Model AI Governance Framework (Second Edition). PDPC and IMDA Singapore, 2020.
  6. ISO/IEC 42001:2023 — Artificial Intelligence Management System. International Organization for Standardization, 2023.
  7. Personal Data Protection Act 2012. Personal Data Protection Commission Singapore, 2012.
Michael Lansdowne Hauge

Managing Partner · HRDF-Certified Trainer (Malaysia), Delivered Training for Big Four, MBB, and Fortune 500 Clients, 100+ Angel Investments (Seed–Series C), Dartmouth College, Economics & Asian Studies

Advises leadership teams across Southeast Asia on AI strategy, readiness, and implementation. HRDF-certified trainer with engagements for a Big Four accounting firm, a leading global management consulting firm, and the world's largest ERP software company.

AI Strategy · AI Governance · Executive AI Training · Digital Transformation · ASEAN Markets · AI Implementation · AI Readiness Assessments · Responsible AI · Prompt Engineering · AI Literacy Programs
