AI Governance & Risk Management · Guide · Advanced

AI and Intellectual Property Regulations: Copyright, Patents, and Trade Secrets

March 8, 2025 · 14 min read · Pertama Partners
For: CTO/CIO

Navigate IP law for AI systems. Understand copyright in training data and AI outputs, patentability of AI inventions, trade secret protection for models, and licensing frameworks for AI-generated content.


Key Takeaways

  1. Purely AI-generated works generally lack copyright protection; human authorship is required for protection.
  2. The legality of training AI on copyrighted works is unresolved and varies by jurisdiction and use case.
  3. AI cannot be named as an inventor; human inventors must contribute to conception for valid patents.
  4. Trade secret protection is central for safeguarding proprietary models, weights, and training data.
  5. Generative AI providers face growing transparency and IP compliance obligations, especially in the EU.
  6. AI users bear significant risk for infringing outputs and should implement review and governance controls.

Executive Summary: AI systems challenge foundational intellectual property doctrines designed for human creators. Can AI-generated works be copyrighted when no human author exists? Is scraping copyrighted content for AI training fair use? Can an AI be named as the inventor on a patent? How do you protect AI models as trade secrets while serving them from the cloud? The U.S. Copyright Office maintains that copyright requires human authorship and has refused to register purely AI-generated works. Courts have not yet settled whether training AI on copyrighted works constitutes fair use, and major lawsuits against OpenAI, Stability AI, and others remain unresolved. The USPTO refuses to recognize an AI as an inventor but grants patents for AI-implemented inventions with human inventors. This guide maps the evolving IP landscape for AI and offers practical strategies for protecting AI innovations while navigating uncertain legal terrain.

Can AI-Generated Works Be Copyrighted?

U.S. Copyright Office Position (2023 Guidance):
No copyright for purely AI-generated works. Copyright requires human authorship.

Human Authorship Requirement:

  • Copyright Act protects "original works of authorship"
  • "Authorship" implies human creator (Supreme Court: Burrow-Giles, 1884)
  • Works produced by nature or animals (e.g., the "monkey selfie" photographs) lack copyright
  • AI = non-human creator = no copyright

Rejection Cases:

  • Zarya of the Dawn (2023): The Copyright Office first registered a graphic novel containing Midjourney images, then cancelled that registration and reissued it to cover only the human-authored text and the selection and arrangement of images, excluding the AI-generated images themselves
  • Théâtre D'opéra Spatial (2023): AI-generated art created with Midjourney denied copyright registration
  • A Recent Entrance to Paradise (2022): AI-generated image that Stephen Thaler attributed to his "Creativity Machine" system denied copyright registration

Human-AI Collaboration: The Copyright Office recognizes copyright when:

  • A human exercises creative control over AI output
  • The human contribution is more than de minimis (trivial)

Sufficient Human Contribution:

  • Writing detailed prompts, selecting/arranging AI outputs, and editing results can be copyrightable
  • Example: An author writes prompts, curates 1,000 AI images, and arranges them into a graphic novel—the selection, arrangement, and human-authored text are protected

Insufficient Human Contribution:

  • Typing a simple prompt and accepting the first AI output is not enough
  • Example: "Create image of sunset over ocean" → Midjourney output = no copyright in the image

AI Training on Copyrighted Works: Fair Use?

The Core Issue: AI models are trained on billions of copyrighted works (books, articles, images, code) scraped from the internet without permission. Is this copyright infringement or fair use?

Pending Lawsuits (as of 2024):

  • Authors Guild v. OpenAI: Authors claim GPT was trained on books without a license
  • Getty Images v. Stability AI: Getty alleges Stable Diffusion was trained on its stock photos without permission
  • New York Times v. OpenAI & Microsoft: Claims that training on news articles infringes copyright and harms licensing markets
  • GitHub Copilot class action: Alleges training on open-source code violates licenses and infringes copyright

Plaintiffs' Arguments (Training = Infringement):

  1. Reproduction: Training involves copying entire works into the training corpus, violating the reproduction right
  2. Derivative works: The trained model is allegedly a derivative work of the training corpus
  3. Market harm: AI substitutes for human creators, depressing demand for original works and licensing markets
  4. Not transformative: AI merely remixes training data without adding new meaning or purpose

Defendants' Arguments (Training = Fair Use):

  1. Transformative use: The model learns patterns and statistics rather than storing or reproducing specific works
  2. Intermediate copying: Training involves temporary copies for analysis, analogous to Google Books and search indexing
  3. No market substitution: Outputs do not replace specific works but serve a different market and function
  4. Public benefit: AI advances science, innovation, and access to information, serving the public interest

Fair Use Factors (17 U.S.C. § 107):

Factor 1: Purpose and character of use

  • Commercial use weighs against fair use (AI companies profit from models)
  • Transformative use weighs in favor (pattern learning vs. expressive substitution)
  • Key question: Is training "transformative"? Courts are not yet aligned.

Factor 2: Nature of copyrighted work

  • Use of factual works (e.g., news, research) leans more toward fair use than highly creative works (e.g., novels, art)
  • Training corpora typically mix both factual and creative content

Factor 3: Amount and substantiality used

  • Training uses entire works, which weighs against fair use
  • Defendants argue this is necessary for the transformative purpose of pattern learning

Factor 4: Effect on the market

  • Substitution harm: If outputs compete with or closely mimic specific works or authors' styles, this weighs against fair use
  • Non-substitution: If outputs do not replace specific works, this supports fair use
  • Licensing markets: As rightsholders develop AI training licenses, ignoring those markets may weigh against fair use

Likely Outcome (Speculative):

  • Courts may distinguish between factual and creative works, and between research and commercial uses
  • Significant settlement pressure may push AI companies toward broad licensing deals regardless of ultimate doctrine

CALLOUT: INFO
Text and Data Mining Exception (EU): The EU Copyright Directive creates a specific text and data mining (TDM) exception. Article 3 allows TDM for scientific research. Article 4 allows commercial TDM unless rights holders opt out. AI companies must respect opt-outs, creating a different landscape than in the U.S.

When AI Output Infringes

Substantial Similarity Test:

  • AI output infringes if it is substantially similar to a copyrighted work
  • Intent is not required; unintentional copying can still infringe
  • Example: Stable Diffusion generating images with visible Getty watermarks suggests close copying

Memorization Problem:

  • Large models sometimes memorize and reproduce training data verbatim or near-verbatim
  • Studies have shown language models reproducing passages from books in their training sets
  • Image models can output near-identical copies of training images when prompted in certain ways
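
One rough way to test for the verbatim reproduction described above is to measure how many of an output's word n-grams also occur in a known reference text. The sketch below is a minimal illustration, not a production detector: the function names, the 5-word window, and the 0.5 threshold are all assumptions chosen for the example. A high score is a signal for human review, not a legal finding of substantial similarity.

```python
from typing import Set, Tuple


def word_ngrams(text: str, n: int) -> Set[Tuple[str, ...]]:
    """Return the set of lowercase word n-grams found in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_ratio(output: str, reference: str, n: int = 5) -> float:
    """Fraction of the output's n-grams that also appear in the reference."""
    out_grams = word_ngrams(output, n)
    if not out_grams:
        # Output shorter than n words: nothing to compare.
        return 0.0
    ref_grams = word_ngrams(reference, n)
    return len(out_grams & ref_grams) / len(out_grams)


def looks_memorized(output: str, reference: str, n: int = 5,
                    threshold: float = 0.5) -> bool:
    """Flag outputs whose n-gram overlap meets the (illustrative) threshold."""
    return overlap_ratio(output, reference, n) >= threshold
```

In practice the reference side would be an index over many protected works rather than a single string, and paraphrased or reformatted copies would slip past an exact n-gram match.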

Liability:

  • AI user: Directly liable if they create, publish, or commercialize infringing outputs
  • AI provider: Potentially liable under contributory or vicarious infringement theories

Defenses:

  • De minimis: Copying is so trivial that it does not rise to infringement
  • Fair use: Output is transformative and does not substitute for the original
  • DMCA safe harbor: Limited protection for providers hosting user content, contingent on notice-and-takedown and other requirements

Licensing AI Training Data

Emerging Business Models

  1. Content Licensing Deals

    • AI companies license archives from publishers, news organizations, and stock photo libraries
    • Examples include deals between OpenAI and major publishers, and between Getty Images and technology companies
    • Terms are typically confidential but often involve multimillion-dollar payments and attribution or branding requirements
  2. Opt-Out Mechanisms

    • robots.txt: A file on websites instructing crawlers not to scrape; some AI companies voluntarily honor it
    • Do Not Train registries: Services like Spawning.ai allow creators to signal that their works should not be used for training
    • Legal status is uncertain; violations may implicate contract or computer access laws more than copyright
  3. AI-Friendly Licenses

    • Creative Commons: Some CC licenses allow commercial AI training (e.g., CC BY), while others restrict it (e.g., CC BY-NC-ND)
    • Open-source code licenses: Permissive licenses (MIT, Apache) are generally viewed as compatible with AI training; copyleft licenses (GPL) raise unresolved derivative-work questions
  4. Micro-Licensing

    • Platforms experiment with paying creators small amounts when their works are used for training
    • Examples include artist-focused platforms and stock providers integrating AI tools
    • Challenges include attribution, tracking, and scalable payment infrastructure
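
For the robots.txt opt-out mentioned above, a crawler can check a site's rules before fetching a page. The sketch below uses Python's standard `urllib.robotparser` module. The user-agent name `ExampleAIBot` and the sample rules are hypothetical, and honoring robots.txt is shown here as a voluntary technical control, not as a statement of what any law requires.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one AI crawler, allows everyone else.
EXAMPLE_ROBOTS = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/articles/1"
    print(may_crawl(EXAMPLE_ROBOTS, "ExampleAIBot", page))
    print(may_crawl(EXAMPLE_ROBOTS, "SomeSearchBot", page))
```

A real crawler would download each site's robots.txt (for example with `RobotFileParser.set_url` and `read`) and cache the result rather than parse an inline string.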

STATISTIC
AI Licensing Market: The market for licensed AI training data is projected to reach billions of dollars by the mid-2020s, driven by AI companies seeking to mitigate copyright risk through licensing agreements with publishers, stock photo sites, and content creators.

AI and Patent Law

Can AI Be Named as Inventor?

Current Answer in Major Jurisdictions: No

DABUS Cases (2018–2023):

  • Dr. Stephen Thaler filed patent applications naming his AI system "DABUS" as the sole inventor
  • USPTO (U.S.): Rejected—"inventor" must be a natural person
  • UK IPO and UK Supreme Court: Rejected—patent law contemplates human inventors only
  • EPO (Europe): Rejected—an inventor must have legal personality
  • Australia: Initial acceptance overturned on appeal
  • South Africa: Granted a DABUS patent with minimal examination; not influential elsewhere

Rationale:

  • Statutory language ("inventor," "individual," "person") is interpreted as human
  • Inventors must sign oaths and assign rights; AI systems cannot hold or transfer legal rights
  • Policy: Patent systems are designed to incentivize human innovation

Consequence:

  • AI-assisted inventions must list human inventors who contributed to conception
  • Purely AI-generated inventions with no human conception are currently unpatentable in most jurisdictions

Patenting AI-Implemented Inventions

AI as a Tool (Human Inventor)

  • If a human conceives the inventive concept and uses AI as a tool to implement or validate it, the invention can be patentable (subject to standard requirements)

Patentability Requirements:

  1. Subject matter eligibility: Must not be a disembodied abstract idea, law of nature, or natural phenomenon
  2. Novelty: Not previously disclosed in prior art
  3. Non-obviousness: Not obvious to a person of ordinary skill in the art (POSITA)
  4. Utility: Specific, substantial, and credible utility

AI Patentability Challenges:

  • Abstract Idea Rejections (Alice/Mayo):

    • Many AI claims are characterized as abstract mathematical algorithms or data processing
    • Applicants must show a concrete technical improvement or application (e.g., improved hardware, reduced latency, better compression)
  • Enablement (35 U.S.C. § 112):

    • The specification must teach a POSITA how to make and use the invention without undue experimentation
    • For AI, this often requires disclosing model architecture, training methodology, data characteristics, and performance metrics
  • Written Description:

    • Applicants must show they possessed the claimed invention at filing
    • Overly broad claims to "an AI system" without specific technical detail risk rejection
  • Obviousness in an AI-Enabled World:

    • As AI tools become standard, what is "obvious" to a POSITA may evolve
    • Current doctrine still evaluates obviousness from a human perspective, but AI-assisted design may raise the bar over time

Inventorship for AI-Assisted Inventions

Who Is the Inventor?

Conception Test:

  • The inventor is the person who forms in their mind a definite and permanent idea of the complete and operative invention

AI as Tool:

  • If a human formulates the problem, defines the solution space, and interprets AI outputs to reach a specific inventive concept, that human is the inventor

AI as Co-Inventor?

  • Current law does not recognize AI as a co-inventor
  • If AI contributes substantially to conception and human contribution is minimal, there is a risk that no valid human inventor exists, potentially rendering the invention unpatentable

Practical Strategies:

  • Document human contributions to conception, including problem framing, model design choices, and interpretation of results
  • Avoid characterizing the AI as the "inventor" in internal or external communications

Trade Secrets for AI Models

Why Trade Secret Instead of Patent?

Advantages:

  • No public disclosure of model architecture, weights, or training data
  • Protection can last indefinitely as long as secrecy is maintained
  • No examination or registration process required
  • Can cover a broad range of information (data, processes, parameters)

Disadvantages:

  • No protection against independent development or reverse engineering
  • Protection is lost once the secret becomes public
  • Enforcement requires proving misappropriation and reasonable secrecy measures

What Qualifies as a Trade Secret?

Under the Uniform Trade Secrets Act (UTSA) and the federal Defend Trade Secrets Act (DTSA), trade secrets must:

  1. Consist of information (e.g., formula, pattern, compilation, program, device, method, technique, or process)
  2. Derive independent economic value from not being generally known
  3. Be subject to reasonable efforts to maintain secrecy

AI Trade Secret Assets:

  • Model architectures and custom layers
  • Proprietary training datasets and curated corpora
  • Hyperparameters and optimization strategies
  • Training pipelines, preprocessing, and augmentation techniques
  • Pre-trained weights and fine-tuned checkpoints

Reasonable Secrecy Measures:

  • Role-based access controls and least-privilege permissions
  • NDAs and confidentiality clauses with employees, contractors, and partners
  • Encryption of data and model artifacts at rest and in transit
  • Logging and monitoring of access to sensitive systems
  • Watermarking or fingerprinting of models to trace leaks
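
Two of the measures above, access logging and fingerprinting, can be combined in a simple audit record. The sketch below uses a SHA-256 content hash as the fingerprint. That is a simpler technique than model watermarking: it only identifies byte-identical copies of an artifact and will not trace a fine-tuned or re-encoded leak. The function and field names are assumptions made for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone


def fingerprint(artifact: bytes) -> str:
    """Return the SHA-256 hex digest of a model artifact's bytes."""
    return hashlib.sha256(artifact).hexdigest()


def access_record(user: str, artifact_name: str, artifact: bytes) -> str:
    """Build a JSON audit-log line recording who accessed which artifact."""
    record = {
        "user": user,
        "artifact": artifact_name,
        "sha256": fingerprint(artifact),
        "accessed_at": datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    weights = b"example weights payload"  # stand-in for a real checkpoint file
    print(access_record("alice", "model-v1.bin", weights))
```

Records like these also help document the "reasonable efforts" element if misappropriation ever has to be proven.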

Challenges for Cloud-Hosted AI:

  • API access exposes model behavior, enabling potential model extraction or membership inference attacks
  • Customers may infer aspects of training data from outputs
  • Mitigations include rate limiting, output filtering, and privacy-preserving training techniques
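
Rate limiting, the first mitigation listed, raises the cost of model extraction by capping how many queries a single client can send. A token bucket is one common way to implement it. The sketch below is a minimal single-process version; the class name, the parameters, and the injectable clock are assumptions for illustration, and a deployed API would normally keep one bucket per API key in shared storage.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow up to `capacity` burst requests, refilled at a steady rate."""

    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.last_refill = clock()

    def allow(self) -> bool:
        """Consume one token if available; return whether the request may proceed."""
        now = self.clock()
        elapsed = now - self.last_refill
        # Add tokens earned since the last call, never exceeding capacity.
        self.tokens = min(float(self.capacity),
                          self.tokens + elapsed * self.refill_per_second)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Rate limits slow extraction attempts but do not prevent them, which is why the text pairs them with output filtering and privacy-preserving training.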

KEY INSIGHT
Trade Secret vs. Open Source: Once you release model weights or training data publicly, trade secret protection is lost. Many organizations open-source older models or subsets of their stack while keeping frontier models and data pipelines as trade secrets.

Licensing AI-Generated Content

Who Owns AI-Generated Content?

Pure AI Output (Minimal Human Input):

  • Under current U.S. guidance, purely AI-generated content without meaningful human authorship is not copyrightable
  • Such content effectively falls into the public domain; anyone can copy or reuse it

Human-AI Collaboration (Substantial Human Input):

  • Where a human contributes original expression—through detailed prompting, selection, arrangement, and editing—the human may own copyright in those contributions
  • Example: A designer uses AI to generate many images, then heavily edits and composes them into a unique layout; the resulting work can be protected

Contractual Terms (Provider ToS):

  • Many AI providers assign or grant broad rights in outputs to users, even where copyright status is uncertain
  • Some providers reserve rights to use inputs and outputs to improve their models unless users opt out

Commercial Use of AI-Generated Content

No Copyright = No Exclusivity:

  • If content is not protected by copyright, you cannot prevent others from copying it
  • This weakens IP-based competitive advantage for purely AI-generated assets (e.g., stock marketing images)

Infringement Risk:

  • Outputs may still infringe third-party rights if they are substantially similar to protected works
  • Users who publish or commercialize such outputs can face infringement claims

Mitigation Strategies:

  1. Add substantial human creativity (editing, composition, narrative) to strengthen copyright claims
  2. Review outputs for similarity to known works, especially in high-risk domains (logos, characters, code)
  3. Seek contractual indemnities or warranties where available (e.g., from enterprise AI vendors)
  4. Consider IP insurance (e.g., E&O policies covering copyright and trademark claims)

Licensing Models

AI Model Licenses:

  • Open Source: Models released under MIT, Apache 2.0, or similar licenses allow broad commercial use; copyleft licenses (e.g., GPL) impose share-alike obligations
  • AI-Specific Licenses: Some models use Responsible AI Licenses (RAIL) that restrict harmful uses while allowing commercial deployment
  • Proprietary APIs: Access via paid APIs with contractual restrictions on use, redistribution, and benchmarking

Content Licenses:

  • Stock AI Outputs: Platforms offer AI-generated images or videos with commercial licenses and IP warranties
  • Custom Services: Enterprise contracts may treat outputs as work-for-hire or grant exclusive licenses, subject to provider policies

Regulatory Developments

The U.S. Copyright Office has launched inquiries into:

  1. Copyrightability of AI-generated works and human-AI collaborations
  2. Legal treatment of training on copyrighted works
  3. Liability for infringing AI outputs

Potential outcomes include new guidance, legislative proposals, and influential case law from ongoing litigation.

EU AI Act and IP Intersections

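Transparency Obligations (Articles 50 and 53):

  • Providers of generative AI systems must mark outputs so that they can be identified as AI-generated (Article 50)
  • Providers of general-purpose AI models must (Article 53):
    • Put in place a policy to comply with EU copyright law, including rightsholder opt-outs from text and data mining
    • Publish a sufficiently detailed summary of the content used for training
  • These obligations aim to help rightsholders identify unauthorized use and enforce their rights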

Copyright Directive (DSM Directive, Article 17):

  • Platforms hosting user content can be directly liable for copyright infringement
  • They must implement licensing, filtering, or takedown mechanisms
  • AI content platforms may face similar obligations as AI-generated content proliferates

International Developments

  • UK: Considered a broad TDM exception for any purpose but paused reforms after industry pushback
  • China: Generative AI regulations require respect for IP rights and discourage training on unlicensed content, though enforcement is evolving
  • Japan: Adopts a permissive stance; incidental copying for data analysis and AI training is generally allowed under Article 30-4, making it attractive for AI R&D

Practical IP Strategies

For AI Developers

  1. Training Data Compliance

    • Prefer licensed, public domain, or clearly permissioned datasets
    • If relying on fair use or local TDM exceptions, document legal analysis and risk assessments
    • Honor opt-out signals where feasible to reduce legal and reputational risk
  2. Output Controls

    • Implement filters to detect and block outputs that closely match known copyrighted works or contain watermarks
    • Periodically test for memorization and adjust training or safety layers accordingly
  3. Terms of Service and Policies

    • Clarify ownership of inputs and outputs
    • Allocate risk and responsibility for infringing use (e.g., user indemnities, limitations of liability)
    • Provide clear acceptable-use policies and enforcement mechanisms
  4. Patent and Trade Secret Strategy

    • Patent core technical innovations that are hard to keep secret (e.g., hardware, protocols)
    • Protect training data, weights, and proprietary pipelines as trade secrets
    • Use defensive publications to prevent competitors from patenting widely used techniques
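
The training data compliance step above can be partly automated by triaging dataset records by license before they reach a training corpus. The sketch below is illustrative only: the policy table is a hypothetical example that mirrors this guide's characterization of permissive, copyleft, and restrictive licenses, the record fields `id` and `license` are assumed, and the actual policy should come from counsel.

```python
from typing import Dict, Iterable, List

# Hypothetical policy table; not a legal determination for any license.
LICENSE_POLICY: Dict[str, str] = {
    "public-domain": "allow",
    "MIT": "allow",
    "Apache-2.0": "allow",
    "CC-BY-4.0": "allow",
    "GPL-3.0": "review",           # derivative-work questions are unresolved
    "CC-BY-NC-ND-4.0": "exclude",  # restricts commercial use and derivatives
}


def triage(records: Iterable[Dict[str, str]]) -> Dict[str, List[str]]:
    """Sort record ids into allow / review / exclude buckets by license."""
    buckets: Dict[str, List[str]] = {"allow": [], "review": [], "exclude": []}
    for record in records:
        # Unknown or missing licenses go to manual review, not silent inclusion.
        decision = LICENSE_POLICY.get(record.get("license", ""), "review")
        buckets[decision].append(record["id"])
    return buckets
```

Sending unknown licenses to review rather than to the allow bucket keeps the default conservative and leaves a record for the documented risk assessment.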

For AI Users and Deployers

  1. Understand Ownership and Rights

    • Review provider ToS for ownership, license scope, and reuse rights
    • For custom solutions, negotiate work-for-hire or assignment where strategic
  2. Manage Infringement Risk

    • Establish review processes for high-stakes outputs (e.g., branding, product designs, production code)
    • Maintain records of prompts, edits, and human contributions
  3. Commercialization Practices

    • Add human creative input to outputs used in branding, content, and product features
    • Consider vendor selection based on IP warranties, indemnities, and compliance posture
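
Keeping records of prompts, edits, and human contributions is easier with a consistent structure. The sketch below stores the prompt, the raw AI draft, and the final human-edited text, and estimates how much the text changed using the standard library's `difflib`. The class and field names are assumptions, and the changed fraction is only a rough character-level measure: it is not a test of copyrightable authorship.

```python
import difflib
from dataclasses import dataclass, field
from typing import List


@dataclass
class ContributionRecord:
    """One AI-assisted work item and the human edits applied to it."""
    prompt: str
    ai_draft: str
    final_text: str
    editor: str
    notes: List[str] = field(default_factory=list)

    def changed_fraction(self) -> float:
        """Return 1 minus the difflib similarity of draft and final text."""
        similarity = difflib.SequenceMatcher(
            None, self.ai_draft, self.final_text
        ).ratio()
        return 1.0 - similarity


if __name__ == "__main__":
    record = ContributionRecord(
        prompt="Draft a product tagline",
        ai_draft="Fast tools for modern teams",
        final_text="Quiet tools for teams that ship",
        editor="j.doe",
        notes=["Rewrote tone to match brand voice"],
    )
    print(round(record.changed_fraction(), 2))
```

The free-text notes matter as much as the number, because they capture the creative choices that a diff cannot show.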

For Content Creators

  1. Control Use of Your Works

    • Use robots.txt and Do Not Train registries to signal preferences
    • Choose licenses (e.g., specific CC variants) that reflect your stance on AI training
  2. Monitor and Enforce

    • Use reverse image search and dataset search tools to identify potential misuse
    • Leverage DMCA takedowns and platform policies when AI-generated content infringes your works
  3. Monetize Participation

    • Explore micro-licensing platforms and collective bargaining initiatives
    • Consider direct licensing deals if you control valuable archives or catalogs

Frequently Asked Questions

Do I own the copyright in content I create with AI?

It depends on the level of human creativity involved. If you provide only a simple prompt and accept the first output, U.S. guidance suggests there is no copyright in the output. If you contribute substantial original expression through detailed prompting, selection, editing, and arrangement, you may own copyright in those human contributions.

Is it legal to train AI on copyrighted works?

The legality is unsettled. Plaintiffs argue that scraping and training infringe reproduction and derivative-work rights; defendants argue that training is a transformative fair use or covered by TDM exceptions in some jurisdictions. Outcomes will likely depend on jurisdiction, type of content, and specific use case.

Can I patent an AI-generated invention?

You cannot list an AI system as an inventor under current U.S., UK, or European Patent Office practice. However, if a human conceived the inventive concept and used AI as a tool, that human can be named as inventor and seek patent protection, assuming other patentability requirements are met.

What's the difference between open-sourcing an AI model and open-sourcing code?

Open-sourcing code typically involves releasing source files under an open-source license. Open-sourcing an AI model may involve releasing code, model weights, and sometimes data. Once weights or data are publicly released, trade secret protection is lost, and license terms (including any copyleft obligations) govern downstream use.

If AI generates content similar to a copyrighted work, who's liable?

The user who prompts the AI and publishes or commercializes the output is typically the primary infringer. The AI provider may face secondary liability if it knew of and materially contributed to infringement or profited while having the ability to control it. Both may be sued, but plaintiffs often target providers with deeper pockets.

How do I protect my AI model as a trade secret?

Limit access to model artifacts and training data, use technical controls (encryption, access logs, watermarking), implement robust NDAs and confidentiality policies, and clearly label confidential materials. Regularly review security practices and have an incident response plan for suspected leaks or misappropriation.

What happens if I train AI on GPL-licensed code?

It is unclear whether a model trained on GPL code is a derivative work subject to GPL obligations. Some argue that training is transformative analysis; others argue that the model is derived from GPL-covered material. Because there is no definitive case law, training on GPL code carries legal risk and should be evaluated with specialized counsel.

Key Takeaways

  1. Purely AI-generated works generally lack copyright protection in the U.S.; meaningful human authorship is required.
  2. The legality of training AI on copyrighted works is unresolved and will be shaped by ongoing litigation and regional TDM rules.
  3. AI systems cannot be named as inventors; human inventors must contribute to conception for patent protection.
  4. Trade secret law is a primary mechanism for protecting proprietary models, weights, and training data.
  5. EU and other jurisdictions are imposing transparency and IP-respecting obligations on generative AI providers.
  6. Users of AI outputs bear significant infringement risk and should implement review and governance processes.
  7. Content creators can use technical, contractual, and legal tools to control or monetize AI training on their works.

EU Text and Data Mining Rules

The EU Copyright Directive allows text and data mining for scientific research and, with an opt-out mechanism, for commercial purposes. AI developers targeting EU data should implement processes to detect and honor rightsholder opt-outs.

Multi-billion dollar: projected size of the AI training data licensing market by the mid-2020s (source: industry analyst projections cited in IP and AI market reports)

"For frontier AI systems, the most valuable IP is often not the code but the combination of proprietary data, training pipelines, and model weights—assets that are usually best protected as trade secrets rather than patents."

AI and IP regulatory practitioners

References

  1. Copyright Registration Guidance: Works Containing Material Generated by Artificial Intelligence. U.S. Copyright Office (2023).
  2. Inventorship Guidance for AI-Assisted Inventions. U.S. Patent and Trademark Office (2024).
  3. EU Artificial Intelligence Act – Transparency Obligations for Generative AI. European Commission (2024).
  4. Fair Learning. Texas Law Review (2021).
  5. Artificial Intelligence's Fair Use Crisis. Columbia Journal of Law & the Arts (2017).