Classic OSS Licenses (Apache, MIT) vs Custom Model Licenses

AI Model Licensing Guide

Apache-2.0 and MIT are the gold standard of open-source licensing — legally clear, commercially friendly, and universally understood by enterprise legal teams. Custom AI model licences from Meta, Google, and NVIDIA are none of these things. This guide explains how the two categories differ, why copyleft has largely failed to reach LLM licensing, and which licence types enterprise clients actually prefer — and why.


Introduction — Why the Classic OSS Framework Does Not Fully Apply to AI Models

Apache-2.0 and MIT are the two most widely used open-source licences in commercial software development. Legal teams at Fortune 500 companies know exactly what they mean, procurement departments have standard approval workflows for them, and developers can use them in commercial products without a legal review. For AI model licences, none of that clarity currently exists — and the gap between a model labelled "open" and one that actually carries Apache-2.0 or MIT terms is larger than most product teams appreciate.

The reason is structural. Classic OSS licences were designed for source code — a defined artefact with well-understood legal characteristics. AI model weights are not source code in the traditional sense: they are the compressed output of a training process on data that is typically proprietary, scraped, or otherwise encumbered. Whether weights are even "software" for copyright and licensing purposes remains an open legal question in most jurisdictions. Custom model licences — from Meta, Google, and NVIDIA — sidestep this ambiguity by using contractual restrictions rather than copyright licensing as their primary enforcement mechanism.

📄

What classic OSS licences govern

Apache-2.0 and MIT license source code as a copyrighted work — a human-readable artefact with clear copyright ownership. The licence grants downstream users rights under copyright law that would otherwise be reserved.

🧠

What model licences must also govern

Model weights are not source code. Custom model licences use contractual terms (terms of use, acceptable use policies) that bind users through contract law — a distinct and less settled legal mechanism than copyright licensing.

⚖️

The enforcement gap

Copyright-based licences like Apache-2.0 are enforceable regardless of whether the user agreed to anything. Contract-based model licences require acceptance of terms — a condition custom licences attempt to satisfy through click-through or continued-use acceptance clauses.

Classic OSS assumptions (Apache-2.0 / MIT) vs AI model licence reality

Assumption: The licence is immutable — once published, terms cannot change.
Reality: Custom model licences (Gemma, Llama) reserve the right to update terms; continued use constitutes acceptance.

Assumption: No use-case restrictions — any lawful use is permitted.
Reality: Custom licences include Prohibited Use Policies, competitor restrictions, and training bans that go far beyond lawful-use-only.

Assumption: No termination mechanism — the licence cannot be revoked.
Reality: Custom licences typically reserve the right to terminate a licensee's rights on breach — a power Apache-2.0 and MIT do not grant licensors.

Assumption: No scale thresholds — large deployments are treated identically to small ones.
Reality: Llama 3's 700M MAU clause creates a scale-dependent licence that converts to "Meta's sole discretion" above the threshold.

Assumption: No flow-down obligations beyond attribution and licence text.
Reality: Custom licences (especially Gemma) require passing use restrictions to every downstream user of your product — a contractual obligation with no parallel in Apache-2.0 or MIT.

Assumption: Patent grant included (Apache-2.0) or implicit (MIT) for the licensed code.
Reality: Patent rights in model weights, training processes, and inference methods are separate from copyright. Custom licences typically do not include patent grants — leaving potential exposure for methods embedded in the model.

The AI model licence permissiveness spectrum (from most to least permissive)

1. Apache-2.0 / MIT: most permissive; no use restrictions.
2. Mistral (Apache-2.0): Apache-2.0 for weights; standard OSS.
3. Gemma ToU: PUP + flow-down; termination right.
4. Llama 3: MAU cap + competitor restriction + training ban.
5. RAIL / Custom: use restrictions built into every derivative.
⚖️

IP ownership context: Licence type determines your right to use a model — it does not determine who owns your fine-tuning data, inference outputs, or the IP structure of a product built on top of a model. For analysis of AI IP ownership in commercial and investment contexts, see AI IP Ownership — wcr.legal.

Section 1 — How Apache-2.0 and MIT Work: For Code and for AI Models

Apache-2.0 and MIT are called "permissive licences" because they impose minimal conditions on downstream use. Both allow commercial use, modification, redistribution, and sublicensing without requiring the downstream user to open-source their own code. The legal architecture underlying each licence is well-understood, widely litigated, and broadly consistent across jurisdictions. Understanding that architecture — and where it does and does not map to AI model weights — is the foundation for any comparison with custom model licences.

Apache-2.0

Apache License, Version 2.0

The most commercially complete permissive licence — includes an explicit patent grant and a patent termination clause that no other major permissive licence provides.

Mechanism: Copyright licence
Patent grant: Yes — explicit
Copyleft: None

Apache-2.0 grants users a perpetual, worldwide, royalty-free copyright licence to reproduce, prepare derivative works, publicly display, publicly perform, sublicense, and distribute the licensed work and derivatives. Critically, it also grants an explicit patent licence covering patents necessarily infringed by the contribution or the work itself — something MIT does not do explicitly.

The patent termination clause is Apache-2.0's most distinctive feature: if a downstream user initiates patent litigation claiming the licensed work infringes a patent, their patent licence under Apache-2.0 terminates automatically. This creates a deterrent against patent assertions by licensees and makes Apache-2.0 the preferred licence in patent-sensitive commercial environments.

Conditions: Preserve copyright notices. Include the Apache-2.0 licence text in any distribution. If modifications are made to the original, state that changes were made. Include a NOTICE file if one exists in the original.

Application to model weights: When applied to model weights (as Mistral does), Apache-2.0 grants the same broad permissions: commercial use, modification (fine-tuning), redistribution (open weights), and sublicensing. The patent grant, however, covers patents in the contribution — whether training methodologies and inference techniques embedded in weights are covered by the explicit patent grant depends on how broadly "contribution" is read, and has not been definitively settled in litigation.

MIT License

MIT License (Expat License)

The shortest and simplest major permissive licence — maximum freedom with minimal conditions. Universally recognised and the most commonly used licence in open-source software.

Mechanism: Copyright licence
Patent grant: None explicit
Copyleft: None

MIT grants permission to use, copy, modify, merge, publish, distribute, sublicense, and sell copies of the licensed software, and to permit recipients to do the same — subject only to the condition that the original copyright notice and permission text be included in all copies or substantial portions.

MIT does not contain an explicit patent grant. In jurisdictions where patent licences must be explicit, this creates a theoretical exposure: the copyright licence grants rights to the code, but does not explicitly license patents that may be practised by using it. In practice, this gap is rarely material for library code — but becomes more significant for AI model weights where training and inference methods may be patent-encumbered.

Why MIT dominates small projects: Its brevity (under 200 words) and minimal conditions make it frictionless for individual contributors. Enterprise legal teams generally understand MIT and have standard approval workflows for it — making it commercially easier to consume than more complex licences.

Application to model weights: MIT applies to model weights just as to code — but the absence of a patent grant and the single-condition simplicity mean it provides less legal clarity in patent-sensitive commercial contexts. For enterprise procurement teams, Apache-2.0 is typically preferred over MIT for AI model weights precisely because of the explicit patent grant.

BSD Licenses

BSD 2-Clause and 3-Clause Licenses

A family of permissive licences predating MIT and Apache — historically important, still widely encountered, and functionally similar to MIT in commercial practice.

Mechanism: Copyright licence
Patent grant: None explicit
Copyleft: None

BSD 2-Clause ("Simplified BSD") requires preservation of copyright notices and disclaimer text in source and binary distributions. BSD 3-Clause adds a non-endorsement clause: the licensee may not use the copyright holder's name to endorse or promote products derived from the software without written permission. BSD 4-Clause (the original) added an advertising clause that has been dropped in modern usage due to GPL incompatibility and commercial friction.

In commercial AI product contexts, BSD licences are less commonly encountered in model weights than Apache-2.0 or MIT. They provide equivalent commercial freedom but lack the explicit patent grant of Apache-2.0. The non-endorsement clause in BSD 3-Clause serves a similar function to the trademark restriction in custom model licences — but operates more narrowly and is derived from copyright rather than contract.

Permissive Licence Feature Matrix — Code vs Model Weight Application

What each major permissive licence provides for code and for AI model weights

Commercial use permitted: MIT Yes · Apache-2.0 Yes · BSD 3-Clause Yes
Fine-tuning / modification permitted: MIT Yes · Apache-2.0 Yes · BSD 3-Clause Yes
Redistribution of derivatives permitted: MIT Yes · Apache-2.0 Yes · BSD 3-Clause Yes
Explicit patent licence grant: MIT No · Apache-2.0 Yes · BSD 3-Clause No
Patent termination on litigation: MIT No · Apache-2.0 Yes · BSD 3-Clause No
Licence is irrevocable once granted: MIT Yes · Apache-2.0 Yes · BSD 3-Clause Yes
Prohibited use categories: MIT None · Apache-2.0 None · BSD 3-Clause None
Scale thresholds (MAU cap): MIT None · Apache-2.0 None · BSD 3-Clause None
Termination on breach: MIT No termination mechanism · Apache-2.0 Only on patent litigation · BSD 3-Clause No termination mechanism
Use-case coverage for AI outputs / inference: MIT Unsettled · Apache-2.0 Unsettled · BSD 3-Clause Unsettled
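The feature matrix above lends itself to a machine-readable form. The following is a minimal, hypothetical sketch of how a dependency-review script might encode it — the licence facts mirror the matrix, while the structure and function names are illustrative assumptions, not any standard tooling API.

```python
# Hypothetical sketch: the permissive-licence feature matrix as data, so a
# review script can flag patent-grant gaps mechanically. Values mirror the
# matrix above; names and structure are illustrative only.

PERMISSIVE_LICENCES = {
    "MIT":          {"patent_grant": False, "patent_termination": False, "irrevocable": True},
    "Apache-2.0":   {"patent_grant": True,  "patent_termination": True,  "irrevocable": True},
    "BSD-3-Clause": {"patent_grant": False, "patent_termination": False, "irrevocable": True},
}

def patent_review_needed(licence: str) -> bool:
    """Flag licences lacking an explicit patent grant -- a gap that matters
    more for model weights than for library code, per the discussion above."""
    return not PERMISSIVE_LICENCES[licence]["patent_grant"]

for name in PERMISSIVE_LICENCES:
    status = "review patents" if patent_review_needed(name) else "patent grant included"
    print(f"{name}: {status}")
```

In this sketch, only Apache-2.0 clears the patent check without review — consistent with its preference for model weights noted above.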

Section 2 — How Custom Model Licences Differ from Apache/MIT in Terms of Risk

Apache-2.0 and MIT share a defining characteristic: they impose no restrictions on use and cannot be revoked once granted. Custom model licences from Meta, Google, and NVIDIA share neither of these properties. The legal distance between an Apache-2.0 licence and a custom model licence is not a matter of degree — it is a structural difference that changes how the licence functions, how it can be enforced, and what risks it creates for products built on top of those models.

Three vendor licences currently dominate the commercial open-weight landscape: Meta's Llama 3 Community License, Google's Gemma Terms of Use, and NVIDIA's Open Model License. Each takes a different approach to use restrictions, but all three share characteristics that place them in a fundamentally different risk category from Apache-2.0 or MIT.

🦙

Meta — Llama 3 Community License

Scale-conditional commercial licence with competitor restriction and training ban

The Llama 3 licence permits commercial use for most companies, but embeds three restrictions that have no equivalent in Apache-2.0 or MIT. First, the 700 million monthly active users clause converts the licence from unconditional to conditional: above that threshold, continued commercial use requires a separate agreement with Meta on terms Meta sets unilaterally. Second, the competitor restriction bars use in products that directly compete with Meta's core businesses — including social networking, messaging, and AI assistant categories — regardless of user count. Third, the training ban prohibits using Llama 3 outputs or fine-tunes to train, distil, or improve any AI model that is not itself a Llama derivative.

From a commercial risk perspective, the critical issue with the 700M clause is not whether a given company will reach that threshold. It is that investors and acquirers evaluate the clause as an unquantified contingent liability during due diligence — a dependency on Meta's discretion for continued operation at scale that has no equivalent cost or mechanism in an Apache-2.0 product.

⚠ Scale threshold (700M MAU) ⚠ Competitor restriction ⚠ Training ban ⚠ No relicensing permitted ⚠ Derivatives: Llama licence only
💎

Google — Gemma Terms of Use

Prohibited Use Policy with flow-down obligation and unilateral termination right

The Gemma Terms of Use introduce two risk vectors not present in Llama 3 and entirely absent from Apache-2.0: a flow-down obligation and a unilateral termination right. The flow-down obligation requires that every product built on Gemma passes the Prohibited Use Policy restrictions to downstream users — meaning a B2B platform using Gemma must ensure its enterprise clients do not use the platform for any prohibited purpose. This creates a compliance monitoring obligation with no parallel in permissive open-source licensing.

The termination right gives Google authority to revoke a licensee's rights on breach — a power that Apache-2.0 and MIT simply do not grant licensors. Combined with Google's right to update the Terms of Use with continued use constituting acceptance, the Gemma licence creates a dynamic legal relationship rather than the static, irrevocable grant characteristic of true OSS licences. For enterprise software products built on Gemma, both features require contractual attention that would never arise with an Apache-2.0 model.

⚠ Flow-down obligation to users ⚠ Unilateral termination right ⚠ Updatable terms (continued use = acceptance) ⚠ Prohibited Use Policy (18 categories) ✓ No scale threshold
🖥

NVIDIA — Open Model License

Hardware-agnostic use permitted with output ownership retained by licensee

NVIDIA's Open Model License is the most permissive of the three major custom model licences, but it is still not Apache-2.0. It permits commercial use, allows deployment on non-NVIDIA hardware, and does not include the scale thresholds or competitor restrictions found in Llama 3. It explicitly confirms that inference outputs belong to the user — a provision that is useful to have in writing even if it should be the default position.

However, the licence retains use-case restrictions via an Acceptable Use Policy, prohibits using NVIDIA models to train competing foundation models, and reserves the right to modify terms. It also does not include a patent grant for methods embedded in or used by the model — leaving potential exposure for inference techniques or fine-tuning methods that touch NVIDIA's patent portfolio. For commercial deployments, the absence of a patent grant and the training restriction are the two terms that require legal review before treating NVIDIA's Open Model License as equivalent to Apache-2.0.

✓ No scale threshold ✓ Non-NVIDIA hardware permitted ⚠ No patent grant for model methods ⚠ Training restriction (competing models) ⚠ Updatable AUP

Risk Dimension Comparison — Apache-2.0 vs Custom Model Licences

The following matrix maps six risk dimensions across Apache-2.0 (the OSS benchmark), the Llama 3 Community License, the Gemma Terms of Use, and NVIDIA's Open Model License. All custom model licences carry risk characteristics that do not exist in Apache-2.0 or MIT.

Terms can be updated by licensor: Apache-2.0 No · Llama 3 Yes · Gemma ToU Yes · NVIDIA OML Yes
Licence can be revoked / terminated: Apache-2.0 No · Llama 3 Yes · Gemma ToU Yes · NVIDIA OML Limited
Use-case restrictions beyond legality: Apache-2.0 None · Llama 3 AUP + competitor · Gemma ToU PUP (18 categories) · NVIDIA OML AUP
Scale-conditional licence: Apache-2.0 No · Llama 3 700M MAU · Gemma ToU No · NVIDIA OML No
Flow-down obligation to downstream users: Apache-2.0 None · Llama 3 Partial · Gemma ToU Required · NVIDIA OML Partial
Patent grant for model methods: Apache-2.0 Included · Llama 3 Not included · Gemma ToU Not included · NVIDIA OML Not included
Training / distillation restriction: Apache-2.0 None · Llama 3 Non-Llama models · Gemma ToU Competing models · NVIDIA OML Competing models
Relicensing permitted: Apache-2.0 Yes · Llama 3 Llama licence only · Gemma ToU Limited · NVIDIA OML Limited
How custom licence risk escalates through the product stack
1. Model selection (development)

The choice of model determines which licence binds the product. With Apache-2.0, the licence analysis ends here. With a custom model licence, it is the beginning of a compliance chain.

2. Product architecture (internal risk)

Custom licences restrict what the product can do (competitor clause, AUP) and how it must be distributed (flow-down, relicensing restrictions). These are design constraints with no Apache-2.0 equivalent.

3. Customer contracts (downstream obligations)

Gemma's flow-down obligation means enterprise customer agreements must incorporate the Prohibited Use Policy. This creates sales friction and legal overhead not present with Apache-2.0 products.

4. Fundraising and M&A due diligence (investor and acquirer risk)

At Series B or acquisition, custom model licences are reviewed as contingent liabilities. Llama 3's 700M clause, Gemma's termination right, and NVIDIA's absent patent grant each require specific legal analysis and disclosure. Apache-2.0 requires none.

💡

Practical implication: The risk gap between Apache-2.0 and a custom model licence is not visible at deployment — it surfaces during fundraising, customer contract negotiations, and acquisition due diligence. Companies that model their IP stack as if a Llama 3 or Gemma licence were equivalent to Apache-2.0 will encounter friction at these stages that could have been avoided or priced in at the model selection stage.

Section 3 — Copyleft and Why It Is Mostly Absent from LLM Licences

Copyleft is the most powerful mechanism in open-source licensing. Under the GPL, any software that incorporates GPL-licensed code and is distributed externally must itself be released under the GPL — a requirement that makes proprietary products built on copyleft code commercially unviable. For AI models, copyleft has largely failed to take hold, and the reasons go beyond commercial preference: the standard copyleft mechanism does not cleanly map onto how model weights are created, distributed, or used.

How copyleft works in classic software licensing

The GPL principle

The GPL grants the right to use, modify, and distribute software — but only on the condition that any derivative work distributed externally is also released under the GPL. This "share-alike" obligation propagates through the code, ensuring that no commercial entity can build a proprietary product from GPL foundations and keep it closed.

Why it works for source code

GPL copyleft works because derivative works in software have a clear legal definition: a work that incorporates, links to, or is derived from GPL code. The boundary between the GPL work and the derivative is identifiable — the source code is either present or absent. This clarity makes the copyleft obligation both detectable and enforceable.

Three Barriers to GPL-Style Copyleft in AI Models

Even if a model developer wanted to apply true copyleft terms to their weights, three structural barriers make it technically, legally, and commercially problematic.

🔍

Barrier 1 — What constitutes a derivative work?

In software, a derivative is determinable by inspecting source code. For model weights, the question is unresolved. If a developer fine-tunes a GPL-equivalent model, are the resulting weights a derivative? Does the training dataset create a derivative? Does using a model's outputs to build a product create a derivative? None of these questions have clear legal answers, making copyleft propagation through weights legally unpredictable.

🛠

Barrier 2 — What is "source code" for a model?

The GPL requires that source code be made available for any distributed derivative. For software, source code is well-defined. For a model, the equivalent of source code is the full training process — the dataset, preprocessing pipeline, training code, and hyperparameters — not just the weights. Requiring disclosure of proprietary training data to comply with a model copyleft obligation is commercially unacceptable to almost every developer.

🌐

Barrier 3 — The API distribution gap

The GPL is triggered by distribution of a derivative work. Under the AGPL (Affero GPL), the trigger is extended to network use — a service that uses GPL software over a network must release its source. For AI models, most commercial use occurs via API, not through distributing weights. Even the AGPL's network use trigger does not clearly apply to a model running behind an API that was not built from GPL-licensed software.

⚗️

RAIL Licences — Copyleft's Nearest AI Equivalent

Responsible AI Licences (RAIL) as a hybrid approach to use restriction propagation

The Responsible AI License (RAIL) framework, developed by the BigScience community and used in licences like the CreativeML Open RAIL-M (Stable Diffusion), is the closest the AI licensing ecosystem has come to a copyleft analogue. RAIL licences do not attempt to reproduce GPL's source-disclosure requirement — they instead embed use restrictions directly into every downstream licence, requiring that anyone who distributes a RAIL-licensed model or derivative must pass the same use restrictions to their downstream recipients.

This "use restriction propagation" is structurally similar to copyleft: it creates obligations that travel with the model through the distribution chain. But it differs in two important ways. First, RAIL restrictions are use-based (prohibiting specific harmful applications) rather than openness-based (requiring open distribution). Second, RAIL licences still permit commercial use and proprietary products — they are closer to a non-commercial restriction on specific uses than to the full copyleft requirement to open-source derivatives.

What RAIL does (like copyleft)

Use restrictions propagate to every downstream recipient and derivative — no one in the distribution chain can strip them out.

What RAIL does not do (unlike copyleft)

Does not require open-sourcing derivatives. Proprietary fine-tunes and commercial products are permitted — only specific harmful uses are restricted.

Commercial implication

RAIL licences create the same flow-down monitoring obligation as Gemma's ToU: every downstream product must contractually pass the restrictions to its users.

Adoption in commercial LLMs

Major commercial LLM providers have not adopted RAIL licences. Meta, Google, and NVIDIA use custom AUP-based licences rather than RAIL, because RAIL's propagation requirement adds downstream compliance complexity.

Comparing Copyleft Mechanisms Across Licence Types

Mechanism / property comparison: GPL v3, AGPL v3, RAIL, and Custom AUP (Llama/Gemma)

Propagation trigger: GPL v3 Distribution · AGPL v3 Network use · RAIL Distribution · Custom AUP None
Requires open-sourcing derivatives: GPL v3 Yes · AGPL v3 Yes · RAIL No · Custom AUP No
Restricts commercial use: GPL v3 No · AGPL v3 No · RAIL Specific uses · Custom AUP Specific uses
Use restrictions flow to downstream users: GPL v3 N/A · AGPL v3 N/A · RAIL Yes (required) · Custom AUP Varies
Applicable to model weights (settled law): GPL v3 Unclear · AGPL v3 Unclear · RAIL Contract-based · Custom AUP Contract-based
Used by major commercial LLM providers: GPL v3 No · AGPL v3 No · RAIL No · Custom AUP Yes

The practical conclusion for product and legal teams is straightforward: copyleft as traditionally understood in software licensing does not currently operate in the commercial LLM space. GPL and AGPL are absent from the licences of all major open-weight models. RAIL licences exist but are uncommon in commercial deployments. The dominant licence architecture for open-weight models is custom AUP-based contractual restriction — which is neither as protective as copyleft nor as permissive as Apache-2.0, and occupies a distinct risk category that standard OSS compliance workflows are not designed to handle.

Section 4 — Which Licence Types Enterprise Clients Prefer — and Why

Enterprise procurement teams apply a fundamentally different standard to AI model licences than to other software dependencies. For SaaS tools and standard software libraries, Apache-2.0 and MIT are considered approved by default in most Fortune 500 organisations. Custom model licences — regardless of how "open" they are marketed — require individual legal review because the risk categories they introduce (updatable terms, termination rights, use restrictions, flow-down obligations) have no equivalent in the standard OSS review workflow.

What Enterprise Legal Teams Look for in Model Licences

When enterprise procurement or legal teams evaluate whether a product built on an open-weight model can be approved for internal or customer-facing use, four requirements consistently determine the outcome.

1. Immutability — terms that cannot change unilaterally

Enterprise contracts are built on the assumption that licence terms are fixed at the point of procurement. Updatable licences (Gemma, Llama 3, NVIDIA) require ongoing monitoring and create compliance exposure that cannot be addressed through standard software procurement workflows. Apache-2.0 and MIT cannot be updated by the licensor — a version-specific grant is permanent.

2. No use-case restrictions — or clearly scoped exceptions

Enterprise products are deployed across many contexts simultaneously. Use restrictions in custom model licences require an enterprise to audit every deployment against every prohibited use category — a review process with no equivalent for Apache-2.0 or MIT products. Restrictions must be clearly written and narrowly scoped to be acceptable; broad AUP categories like "any harmful application" create interpretive uncertainty.

3. No flow-down obligation — or clear mechanism for compliance

The Gemma flow-down requirement is the most commercially disruptive characteristic for B2B enterprise products. It requires contract amendments with every enterprise customer to pass through the Prohibited Use Policy — a sales friction that makes Gemma-based products more difficult to sell to regulated industry clients, financial institutions, and government buyers whose standard agreement forms do not accommodate third-party use restrictions.

4. Patent clarity — express grants or clear scope

Apache-2.0 includes an express patent grant for the licensed code. Custom model licences typically do not include patent grants for model methods, inference techniques, or training processes. For enterprise buyers in patent-heavy industries (financial services, healthcare, semiconductors), the absence of a patent grant leaves residual IP exposure that must be addressed through indemnity provisions — a negotiation with no Apache-2.0 equivalent.

Enterprise Licence Approval — Procurement Comparison Matrix

The following matrix maps how standard enterprise procurement workflows respond to four licence categories. "Standard approval" means the licence passes enterprise legal review without a dedicated IP or contracts team engagement.

Standard procurement approval (no legal review): Apache-2.0 / MIT Yes · Llama 3 Community No — legal review · Gemma ToU No — legal review · Proprietary API Depends on MSA
Immutable — terms cannot be updated by vendor: Apache-2.0 / MIT Yes · Llama 3 Community No · Gemma ToU No · Proprietary API Negotiable
No use-case restrictions beyond lawful use: Apache-2.0 / MIT Yes · Llama 3 Community No — AUP + competitor · Gemma ToU No — PUP (18 cats.) · Proprietary API Varies
No flow-down obligation to end customers: Apache-2.0 / MIT None · Llama 3 Community Limited · Gemma ToU Required · Proprietary API Typically none
Patent grant for model / inference methods: Apache-2.0 / MIT Included · Llama 3 Community Not included · Gemma ToU Not included · Proprietary API Indemnity only
Passes regulated industry compliance (HIPAA, FedRAMP, FSA): Apache-2.0 / MIT Typically yes · Llama 3 Community Case-by-case · Gemma ToU Case-by-case · Proprietary API Via MSA / BAA
M&A / investor due diligence — standard treatment: Apache-2.0 / MIT Clean · Llama 3 Community Flagged (700M clause) · Gemma ToU Flagged (termination) · Proprietary API Service continuity risk
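The approval logic implied by the matrix above can be expressed as a simple gate: a licence qualifies for the standard path only when none of the custom-licence risk dimensions is present. The sketch below is a hypothetical illustration — the field names and gating rule are assumptions drawn from this guide's framework, not any procurement team's actual tooling.

```python
# Hypothetical procurement-gate sketch based on the matrix above: a licence
# passes "standard approval" only if its terms are immutable, irrevocable,
# unrestricted, and carry no flow-down. Field names are illustrative.

from dataclasses import dataclass

@dataclass
class ModelLicence:
    name: str
    terms_updatable: bool
    revocable: bool
    use_restrictions: bool
    flow_down_required: bool

def standard_approval(lic: ModelLicence) -> bool:
    """True only when no custom-licence risk dimension is present, i.e. the
    licence behaves like Apache-2.0 / MIT in a procurement workflow."""
    return not (lic.terms_updatable or lic.revocable
                or lic.use_restrictions or lic.flow_down_required)

apache = ModelLicence("Apache-2.0", False, False, False, False)
gemma = ModelLicence("Gemma ToU", True, True, True, True)

print(standard_approval(apache))  # Apache-2.0 takes the standard path
print(standard_approval(gemma))   # Gemma is routed to legal review
```

Any single True flag routes the licence out of the standard path — mirroring how one non-standard term is enough to trigger individual legal review.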
When enterprise contracts accept custom model licence terms
Use-case restrictions are clearly scoped and the product does not approach prohibited categories

If an enterprise product is in a well-defined vertical (e.g., internal HR tooling, code completion, document summarisation) that is unambiguously outside all prohibited use categories, AUP restrictions can be accepted with a documented scope analysis rather than ongoing monitoring.

Flow-down obligations can be incorporated into standard customer terms without sales friction

For B2B products with standardised subscription agreements, incorporating Gemma-style use restrictions into ToS is operationally achievable — though it must be reviewed by legal and disclosed in enterprise MSA negotiations.

Scale thresholds are disclosed and managed as a known contingent item in investor materials

The Llama 3 700M clause does not automatically block enterprise approval — it must be disclosed in IP schedules during fundraising and M&A, with a documented model transition plan or a clear argument for why the threshold is not reachable within the product's intended scope.

The performance or cost advantage justifies the legal overhead vs. Apache-2.0 alternatives

For regulated sectors where Apache-2.0 alternatives (Mistral, certain Qwen releases) perform comparably, enterprise procurement teams will default to the simpler licence. Custom model licences are typically accepted only when the model provides a material capability advantage that cannot be replicated with a cleaner licence.

Enterprise Compliance Checklist — Before Deploying on a Custom Model Licence

Pre-deployment review items for enterprise and commercial products:

1. Identify the exact licence version — confirm which version of the licence was in effect when the model was downloaded; for Gemma and Llama 3, document the terms as they existed at that date.
2. Map all product use cases against the AUP / PUP — every current and roadmap use case should be reviewed against all prohibited use categories; document analysis and approvals.
3. Assess competitor restriction applicability — for Llama 3, confirm whether any product feature or market segment could be characterised as competing with Meta's core products (social, messaging, AI assistant).
4. Incorporate flow-down obligations into customer terms — if using Gemma, update customer-facing ToS and enterprise MSA templates to pass through the Prohibited Use Policy to end users.
5. Add scale threshold to commercial risk register — for Llama 3, document the 700M MAU clause in the company's IP risk register and include it in Series A+ investor disclosures and M&A schedules.
6. Confirm training and output restrictions — verify that model outputs are not being used to train, distil, or evaluate any model outside the permitted scope of the licence.
7. Set a licence change monitoring process — for updatable licences, establish a process for reviewing any updates to terms and assessing whether continued use constitutes acceptance of materially changed obligations.
8. Evaluate Apache-2.0 alternatives at the same capability level — document whether a Mistral or similarly licensed model provides comparable performance; this supports the decision rationale if the custom licence is later challenged in due diligence.
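The licence-change monitoring item in the checklist above can be implemented minimally by fingerprinting the licence text in effect at download time and comparing future copies against it. The sketch below is one possible approach under stated assumptions — the workflow and example strings are illustrative, not taken from any vendor's terms.

```python
# Hypothetical sketch of licence-change monitoring: record a hash of the
# licence text as downloaded, then compare the vendor's current text against
# it on each review cycle, so any unilateral update surfaces for legal review.

import hashlib

def licence_fingerprint(licence_text: str) -> str:
    """Stable fingerprint of the licence text in effect at download time."""
    return hashlib.sha256(licence_text.encode("utf-8")).hexdigest()

def licence_changed(recorded_fingerprint: str, current_text: str) -> bool:
    """True if the vendor's current licence text differs from the recorded
    version, i.e. continued use may constitute acceptance of changed terms."""
    return licence_fingerprint(current_text) != recorded_fingerprint

# Usage: record once at model download, re-check on a scheduled review.
baseline = licence_fingerprint("Example Terms of Use, version as downloaded")
unchanged = licence_changed(baseline, "Example Terms of Use, version as downloaded")
updated = licence_changed(baseline, "Example Terms of Use, updated")
print(unchanged, updated)
```

A mismatch does not itself decide anything — it simply flags that the terms moved, triggering the human review the checklist calls for.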
Conclusion

Apache-2.0 Remains the Benchmark — Custom Model Licences Require a Different Framework

Apache-2.0 and MIT are the most permissive and commercially reliable licences available for software — including AI models. Where they apply (most notably to Mistral models), they provide the full set of enterprise procurement benefits: immutable terms, no use restrictions, no flow-down obligations, a patent grant, and standard OSS compliance workflows.

Custom model licences from Meta, Google, and NVIDIA operate differently. They are contractual instruments rather than copyright licences, their terms can change, their use restrictions require ongoing compliance monitoring, and they introduce risk categories — scale thresholds, termination rights, training bans, flow-down obligations — that the standard enterprise OSS approval workflow was not designed to handle.

Copyleft, which might have provided a different but equally clear framework, has largely failed to reach the LLM licensing layer for structural reasons: the legal definition of a model derivative is unsettled, the equivalent of "source code" for a model involves proprietary training data, and most model use occurs via API rather than through distributable weight files.

The practical conclusion is that model licence type is a product architecture decision, not an afterthought for the legal team. Choosing an Apache-2.0 model where one exists at the required capability level eliminates an entire class of commercial and legal risk. Where a custom model licence is the right choice on performance or capability grounds, the checklist and framework in this guide provide the minimum required compliance infrastructure.


For analysis of how licence type interacts with AI IP ownership, fine-tuning rights, and investment structuring, see AI IP Ownership — wcr.legal.

Oleg Prosin is the Managing Partner at WCR Legal, focusing on international business structuring, regulatory frameworks for FinTech companies, digital assets, and licensing regimes across various jurisdictions. He works with founders and investment firms on compliance, operating models, and cross-border expansion strategies.