How AI Models Are Actually Licensed
The word "open" appears in the description of most leading AI models — yet almost none of them are open source in the legal sense. This guide explains what you are actually agreeing to when you deploy Llama 3, Gemma, or Mistral: three separate licence layers, a taxonomy of terms that mean very different things in law, and the commercial blind spots that regularly surface at due diligence and Series A.
Code, Data, and Model Weights — Three Distinct Licensing Objects
Why AI models create three separate IP layers and why each is governed by a different licence
"Open Source", "Open Weights", "Source-Available" — What Each Term Really Means
The OSI definition, where Llama 3, Gemma, and Mistral actually sit, and why the distinction matters for your business
The Founder's Blind Spots — Use Restrictions, MAU Caps, and Derivative Obligations
The commercial licence terms most builders miss: prohibited uses, the 700M MAU trigger, and downstream obligations on fine-tuned models
How to Read a Model Licence — A Practical Framework
A five-step licence-reading process, a comparison of the major licences in use, and a six-step compliance roadmap for commercial deployments
Introduction — The Open Model Licensing Problem
When a developer downloads Llama 3, Gemma 2, or Mistral and reads "open weights" in the documentation, a natural assumption follows: this works like open source software. Use it freely, modify it, deploy it commercially, build products on top of it. That assumption is wrong in ways that regularly surface during investor due diligence, enterprise procurement reviews, and regulatory examinations. The word "open" in AI model distribution means something materially different from what it means in the software licensing context — and the gap between those meanings carries real commercial and legal risk.
AI model licensing is not a single question. It is three separate questions asked simultaneously about three distinct components: the code that trains and runs the model, the data the model was trained on, and the model weights themselves. Each of these components may be licensed under a completely different framework — and the permissiveness of one does not extend to the others. A model whose code is released under Apache-2.0 (maximally permissive) may have weights subject to a custom licence that prohibits specific commercial applications or imposes obligations when you distribute a fine-tuned version.
Three separate IP layers
Unlike traditional software — where one licence governs the whole product — AI models create at least three independently licensed objects: code, training data, and weights. Each can have a different licence owner and different permitted uses.
Custom licences, not OSS standards
Major model providers have drafted bespoke model licences — not standard OSS templates. Meta's Llama Community License, Google's Gemma Terms of Use, and Stability AI's RAIL licence are all custom instruments with terms that have no direct parallel in Apache-2.0 or MIT.
Downstream obligation inheritance
Fine-tuning, distillation, and model merging can all create derivative works that inherit the base model's licence obligations. A product built on a RAIL-licensed model must comply with RAIL restrictions even after the model is modified — unlike traditional software where re-implementation starts clean.
Jurisdiction-specific enforcement risk
Model licences are typically governed by US law, but the activities they restrict may be regulated differently across jurisdictions. Using a model for "law enforcement purposes" may be prohibited by licence in a jurisdiction where that use is mandatory — creating a conflict that requires specific legal analysis.
Related: AI model licensing intersects directly with questions of who owns the outputs and underlying IP in an AI-assisted development context. For a full analysis of AI IP ownership structures for founders and investors, see our guide: AI IP Ownership — How to Structure It for Founders and Investors.
The four sections that follow address each dimension of AI model licensing in turn. Section 1 maps the three licensing layers. Section 2 defines the terminology that is routinely misused — including by major model providers in their own documentation. Section 3 identifies the specific commercial blind spots that create liability for founders. Section 4 provides a practical framework for reading any model licence and a licence comparison matrix for the major models in current use.
Section 1 — Code, Data, and Model Weights: Three Distinct Licensing Objects
The most important structural fact about AI model licensing is that a single deployed model involves three legally distinct objects — and each can be, and frequently is, licensed under a different framework. Treating them as one thing — as the README of most model repositories implicitly encourages you to do — is the root cause of almost every AI licensing error made in commercial product development.
The distinction matters in practice because the licence that governs the code tells you nothing about what you can do with the weights. The most commercially permissive code licence in existence (MIT) applied to inference code does not grant you any rights to the model weights if those weights are separately released under a restrictive custom licence. You need to check all three layers independently, every time, for every model you deploy.
Layer 1 — The Code
Training scripts, inference engines, fine-tuning pipelines, evaluation tools, and model-serving frameworks
The code layer comprises all software associated with the model: the training framework, data processing pipelines, the model architecture definition, inference server code, fine-tuning utilities, and any tooling distributed alongside the weights. This is standard software and is governed by standard software licences — most commonly Apache-2.0, MIT, or, for some research codebases, GPL variants.
This is the layer most developers understand and most README files describe prominently. A repository with "License: Apache-2.0" in its badge is telling you about the code — not about the weights. Developers who treat this as the complete licensing picture are reading roughly one-third of the relevant documents.
- Apache-2.0 and MIT applied to code grant broad commercial use rights, including modification, distribution, and sublicensing
- GPL-3.0 on training code creates copyleft obligations — any modifications to the training code that are distributed must also be GPL
- The code licence does NOT extend to model weights unless explicitly stated in the same licence document
- Most major open-weight codebases use permissive code licences (Apache-2.0 for Gemma and Mistral, MIT for the Llama 3 reference implementation), and in every case the permissiveness stops at the code layer
Layer 2 — The Training Data
Datasets used to pre-train, instruction-tune, and RLHF-align the model — and the rights (or lack of them) that flow from their use
The training data layer is the most opaque and legally contested dimension of AI model licensing. Most frontier model providers either do not disclose their training data composition in detail (OpenAI, Anthropic, Google's proprietary models) or disclose it only at a broad categorical level (Meta's Llama 3 was trained on "publicly available online data" — a description that encompasses trillions of web documents with an enormous range of individual licence terms).
The legal questions that flow from training data are still actively litigated. The central debate — whether training a model on copyrighted text or images constitutes copyright infringement — has not been definitively resolved in any major jurisdiction as of 2025. What is known is that training data rights do not automatically transfer to model users: if a model was trained on data subject to restrictions, those restrictions do not disappear because you downloaded the weights.
- Data licence compliance is the model provider's obligation — but downstream users may face claims where the provider's terms pass residual liability
- GDPR issues with training data (personal data in web scrapes) can affect operators who deploy models processing EU personal data — regulators are examining the chain
- Models fine-tuned on proprietary or customer data inherit a fourth licensing question: who owns the fine-tuned model and its outputs?
- Sector-specific deployment (legal, medical, financial) may trigger additional analysis of whether training data provenance creates domain-specific liability
Layer 3 — The Model Weights
The trained parameter sets — the actual commercial artefact — governed by custom model licences that diverge significantly from standard OSS frameworks
Model weights are the numerical parameter matrices that encode everything a neural network has learned from its training data. They are the thing you are actually deploying when you run an AI model in production — the code merely tells the runtime how to use them. Weights are a new category of IP-protected artefact that does not fit neatly into existing legal categories: they are not code (in the traditional sense), not databases, and not literary works, though they may attract protection under all three regimes in different jurisdictions.
The weights layer is where commercial licences diverge dramatically from what developers expect. Meta's Llama 3 weights are distributed under the Meta Llama 3 Community License — a custom instrument that prohibits specific use cases, caps users at 700 million monthly active users, and requires that derivative models (fine-tunes) are also subject to the same licence. Google's Gemma weights come under the Gemma Terms of Use, which similarly prohibits named categories of harmful use and requires compliance downstream. These are not open source licences. They are proprietary licences with open distribution of the artefact — a conceptually distinct category.
- Weights are the only layer where custom, bespoke AI licences are routinely used — creating an entirely new body of commercial licence terms with no case law history
- The legal basis for protecting weights varies by jurisdiction: trade secret (if not published), database rights (EU), copyright (some jurisdictions treat parameter matrices as expressive works), or simply contract through the licence terms
- Some weights are released under genuinely open licences (Mistral 7B v0.1 — Apache-2.0; some OLMo variants — Apache-2.0) — these are the exception, not the norm among frontier models
- Distributing a product that includes model weights requires you to comply with the weights licence's distribution provisions — which may include downstream pass-through obligations
Layer Combinations in Practice — Real Model Examples
The following table shows how the three licence layers combine for the most widely used open-weight models. Confirming these combinations is the minimum due diligence for any commercial deployment.
| Model | Code licence | Data disclosure | Weights licence | Commercial use? |
|---|---|---|---|---|
| Meta Llama 3 (8B / 70B / 405B) | MIT | Partial — "publicly available online data" | Meta Llama 3 Community License (custom) | Yes — with restrictions + 700M MAU cap |
| Google Gemma 2 (2B / 9B / 27B) | Apache-2.0 | Not disclosed in detail | Gemma Terms of Use (custom) | Yes — with prohibited use categories |
| Mistral 7B v0.1 | Apache-2.0 | Not publicly disclosed | Apache-2.0 (applied to weights) | Yes — genuinely permissive |
| Mistral Large / Medium | N/A — API only | Not disclosed | Mistral commercial API ToS | API access only — no weight distribution |
| Falcon 40B / 180B | Apache-2.0 | RefinedWeb dataset — partially described | TII Falcon License (custom, v2: Apache-2.0) | v1: restricted; v2 and later: Apache-2.0 |
| GPT-4 / Claude / Gemini Pro | Proprietary | Not disclosed | Proprietary — API access only | API ToS only — no weight access |
The independence principle: Each licensing layer is legally independent. A permissive code licence does not make the weights permissive. A restrictive data provenance does not automatically make the code or weights restricted. Each layer must be assessed on its own terms. When building a commercial product on any AI model, the weights licence is the most commercially significant document — and the one most commonly left unread.
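The three-layer mapping above lends itself to a machine-readable record that can be reviewed layer by layer. The sketch below, populated from the table in this section, is one possible shape for such a schedule; the record structure and the OSI shortlist are illustrative assumptions, not any standard format.

```python
# Sketch: a minimal per-model licence record covering all three layers,
# populated from the layer-combinations table above. The dataclass shape
# and the OSI_LICENCES shortlist are assumptions for illustration.
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelLicenceRecord:
    model: str
    code_licence: str
    weights_licence: str
    data_disclosure: str

SCHEDULE = [
    ModelLicenceRecord("Meta Llama 3 70B", "MIT",
                       "Meta Llama 3 Community License", "partial"),
    ModelLicenceRecord("Google Gemma 2 9B", "Apache-2.0",
                       "Gemma Terms of Use", "not disclosed in detail"),
    ModelLicenceRecord("Mistral 7B v0.1", "Apache-2.0",
                       "Apache-2.0", "not publicly disclosed"),
]

# Surface every model whose WEIGHTS are not under an OSI-approved licence,
# regardless of how permissive the code licence is.
OSI_LICENCES = {"Apache-2.0", "MIT"}
needs_review = [r.model for r in SCHEDULE
                if r.weights_licence not in OSI_LICENCES]
print(needs_review)  # → ['Meta Llama 3 70B', 'Google Gemma 2 9B']
```

The point of the exercise is the independence principle in code form: the filter runs on the weights field alone, because the code licence tells you nothing about the weights.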
Having established what is being licensed, Section 2 addresses the terminology problem: what "open source", "open weights", and "source-available" actually mean in the context of AI models, why most "open" models do not satisfy the technical definition of open source, and why that distinction has direct commercial consequences for founders.
Section 2 — "Open Source", "Open Weights", "Source-Available": What Each Term Really Means
The AI industry has borrowed the vocabulary of open source software and applied it in ways that do not correspond to the legal meaning those terms carry. A model described as "open" by its creator may be open in a narrow technical sense (the weights are downloadable) while being closed in the legally significant sense (not usable for all purposes, not freely distributable, not freely modifiable without downstream obligations). Understanding these distinctions is not pedantry — it is the difference between a compliant deployment and an inadvertent breach.
Open Source
In precise technical and legal usage, "open source" refers to software meeting the Open Source Initiative (OSI) definition: the source must be freely redistributable, modifiable, and usable for any purpose without discrimination against persons, groups, or fields of endeavour. Licences that restrict commercial use, impose field-of-use limitations, or require specific approval for certain applications do not qualify.
Standard OSI-approved licences include MIT, Apache-2.0, and GPL. Applied to model weights, these licences are permissive in the full OSI sense: commercial use, fine-tuning, redistribution, and product embedding are all permitted without use-case carve-outs.
Open Weights
A model is "open weights" when its trained parameter matrices are publicly available for download. This says nothing about what you can do with them. Open weights may be released under a fully permissive OSI licence, under a custom licence with significant commercial restrictions, or under a bespoke research licence with no commercial rights at all — and all three are described as "open" in common usage.
Open weights is a distribution characteristic, not a permission framework. That the weights are available to download means only that Meta or Google has not technically prevented access — not that access grants the legal rights that "open" implies in software engineering.
Source-Available
"Source-available" describes software where the source code is publicly readable but where the licence does not grant the rights required to qualify as open source. In the AI context, "source-available" typically describes situations where training code or model architecture code is published and inspectable — but either the weights are withheld, or the licence restricts what you can do with what you see.
Some providers publish training code for research credibility without releasing the trained weights. Others release weights under terms that would make the arrangement "available but not free". Knowing what you are looking at requires reading the actual licence — not the marketing copy on the model card.
The OSI test for AI models: To qualify as open source under the OSI definition, a model licence applied to weights must permit free use, redistribution, and modification for any purpose — including commercial use — without field-of-use restrictions. Most "open" AI model licences currently in use fail this test because they prohibit specific applications (weapons, surveillance, illegal content) or impose obligations on derived models that go beyond what OSI-approved licences require.
This is not necessarily a criticism of those licences — responsible-use clauses may be appropriate policy. But it does mean that the common description of Llama 3 or Gemma 2 as "open source" is technically incorrect. They are open-weight, source-available models with custom commercial licences. Legal and commercial analysis must reflect that reality.
Model Profiles — Where Llama 3, Gemma, and Mistral Actually Sit
Meta Llama 3 (8B · 70B · 405B)
Released April 2024 · Meta AI · Most downloaded open-weight model family
Llama 3 is the most widely adopted open-weight model family and the one most frequently mischaracterised as "open source" in startup pitches and technical documentation. The code (model architecture, inference implementation) is released under MIT. The weights are released under the Meta Llama 3 Community License — a custom document that is materially distinct from any OSI-approved licence.
Google Gemma 2 (2B · 9B · 27B)
Released June 2024 · Google DeepMind · Competitive performance at smaller parameter counts
Gemma 2 is Google's open-weight model series, released with the stated aim of enabling research and commercial applications. Like Llama 3, its code (including the Keras and JAX implementations) is released under Apache-2.0 — a permissive OSI licence. The weights are released under the Gemma Terms of Use — a separate custom document that imposes substantive restrictions not present in Apache-2.0.
Mistral 7B v0.1 (and v0.3)
Released September 2023 · Mistral AI · One of the few genuinely open source frontier-class models
Mistral 7B v0.1 is notable precisely because it is the exception: both the code and the weights were released under Apache-2.0, making it one of the few frontier-class models that meets the OSI definition of open source at the weights layer. Commercial use, fine-tuning, redistribution, and product embedding are all permitted without use-case restrictions or derivative licensing obligations beyond Apache-2.0's basic attribution requirement (preserve copyright notices).
| Model | Category | Weights publicly available? | OSI open source? | Commercial use? | Fine-tune freely? |
|---|---|---|---|---|---|
| Llama 3 (Meta) | Open weights | Yes | No | Yes — with restrictions | Yes — derivative inherits licence |
| Gemma 2 (Google) | Open weights | Yes | No | Yes — with restrictions | Yes — derivative inherits ToU |
| Mistral 7B v0.1/v0.3 | Open source | Yes | Yes — Apache-2.0 | Yes — unrestricted | Yes — Apache attribution only |
| Mistral Large / Medium | Proprietary API | No | No | API access under commercial ToS | No — no weight access |
| OLMo (Allen AI) | Open source | Yes | Yes — Apache-2.0 | Yes — unrestricted | Yes |
| BLOOM (BigScience) | Source-available | Yes | No — RAIL-M restrictions | Yes — with RAIL use restrictions | Yes — RAIL restrictions pass downstream |
Section 3 addresses the specific commercial restrictions that appear in the most widely deployed open-weight model licences — the use-case prohibitions, scale thresholds, and derivative-work obligations that founders most commonly overlook when building products on top of models like Llama 3 and Gemma.
Section 3 — The Founder's Blind Spots: Use Restrictions, MAU Caps, and Derivative Obligations
Most founders who use open-weight AI models have read the headline of the licence but not the body. The headline — "commercial use permitted" — is accurate as far as it goes. The body contains four categories of restriction that create real commercial risk: prohibited use categories, scale thresholds, derivative-work obligations, and attribution and naming requirements. Each of these is found in one or more of the major open-weight model licences currently used in production deployments.
Blind Spot 1 — Prohibited Use Categories
Specific applications explicitly excluded from permitted commercial use — regardless of the founder's intent or industry
Every major custom model licence contains a list of prohibited uses. These lists prohibit specific application categories regardless of the operator's commercial intent, the harm potential of the specific implementation, or the jurisdiction in which the operator is based. They are hard prohibitions, not guidelines — using a model for a prohibited purpose is a breach of licence regardless of how the product is structured.
The categories vary between licences, but converge around a common set: weapons development (broadly defined to include cybersecurity offensive tools in some licences), illegal content generation, surveillance and tracking without consent, and — in some formulations — any use intended to undermine "appropriate human oversight of AI systems." The last category is notably broad and its scope is not defined in most licences.
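One way to operationalise prohibited-use review is to keep the product's full capability inventory and the licence's prohibited categories as explicit lists and intersect them. The sketch below is a hypothetical illustration: the category tags are invented for the example, and the authoritative list is always the licence text itself.

```python
# Sketch: intersect a product's capability inventory with a licence's
# prohibited-use categories. All tags here are hypothetical labels;
# the licence text, not this list, is authoritative.

def prohibited_overlap(capabilities: set, prohibited: set) -> set:
    """Return the capabilities that fall into a prohibited category."""
    return capabilities & prohibited

# The inventory should cover what the product CAN do, including edge
# cases and integrations, not only the intended use.
capabilities = {"document-summarisation", "code-generation",
                "face-matching-in-video"}   # an integration edge case
prohibited = {"surveillance-without-consent", "face-matching-in-video",
              "weapons-development"}

print(sorted(prohibited_overlap(capabilities, prohibited)))
# → ['face-matching-in-video']
```

The usefulness of the exercise is in the inventory itself: a capability list that omits edge cases will pass this check and still breach the licence in production.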
Blind Spot 2 — Scale Thresholds and MAU Caps
Commercial conditions that change at defined user or revenue thresholds — the 700 million monthly active user trigger in Llama 3
The Meta Llama 3 Community License contains a scale threshold that is routinely absent from founders' commercial modelling. Section 1 of the licence provides that the licence is granted subject to the condition that, if "the monthly active users of the products or services made available by or for [the licensee] exceeds 700 million monthly active users in the preceding calendar month, [the licensee] must request a license from Meta, which Meta may grant to you in its sole discretion."
700 million MAU sounds like a threshold no startup needs to think about. But the commercial significance is different from what that number suggests. First, the threshold applies to the product or service as a whole — not just to AI feature usage — which means a consumer product with a large user base using Llama 3 for any feature crosses the threshold even if only a small percentage of users interact with the AI component. Second, the clause gives Meta sole discretion — which means the terms of the post-threshold licence are not specified in the document you agreed to. Third, in acquisition due diligence, any threshold that requires a third-party consent is a potential deal delay or complication regardless of its likelihood of triggering.
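To make the threshold concrete in a commercial risk register, the crossing point can be projected in a few lines. A minimal sketch, assuming simple compounding growth; the starting MAU and growth rate below are hypothetical, and only the 700M figure comes from the licence quoted above.

```python
# Sketch: estimate when a projected MAU curve would cross a licence
# scale threshold (here the Llama 3 Community License's 700M MAU cap).
# Growth inputs are illustrative assumptions, not real data.

LLAMA3_MAU_CAP = 700_000_000

def months_until_threshold(current_mau, monthly_growth,
                           cap=LLAMA3_MAU_CAP, horizon=120):
    """Months until MAU exceeds `cap` under compounding growth,
    or None if the cap is not crossed within `horizon` months."""
    mau = current_mau
    for month in range(1, horizon + 1):
        mau *= 1 + monthly_growth
        if mau > cap:
            return month
    return None

# Hypothetical: 5M MAU for the whole product, growing 10% month-on-month.
print(months_until_threshold(5_000_000, 0.10))
```

Note that, per the clause above, the relevant MAU figure is the product or service as a whole, so the projection should use total users, not AI-feature users.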
Blind Spot 3 — Derivative-Work and Fine-Tuning Obligations
What happens to the licence when you fine-tune, distil, adapt, or merge a model — and what obligations pass downstream to your product
Fine-tuning is how most commercial AI products are built — you take a pre-trained base model and adapt it to your specific domain, use case, or data. The question that founders consistently fail to ask before beginning this process is: what does the base model's licence say about fine-tuned derivatives? In traditional software, writing new code that uses an Apache-2.0 library does not make your code Apache-2.0. In AI, the equivalent is less clear — and custom model licences typically have explicit provisions that govern this.
Llama 3's Community License specifies that any "Llama Materials" redistributed — including fine-tuned variants — must carry the Llama 3 Community License. GPL-style licences applied to model weights require that derivatives also be GPL. RAIL-M (used with BLOOM and some Stable Diffusion variants) passes use restrictions downstream specifically — even if you remove all identifiable connection to the base model, the restrictions follow the derived weights.
Blind Spot 4 — Attribution Requirements and Trademark Restrictions
What you must say, display, and disclose to users — and what you cannot name your product
Attribution and naming requirements are the least commercially dangerous of the four blind spots — but they are the most commonly non-compliant in practice, because they require affirmative actions that development teams do not routinely build into their deployment workflows. Most model licences require some form of attribution; some require specific disclosures to end users; and almost all prohibit using the model's name, branding, or associated marks in product marketing without separate trademark authorisation.
The trademark restriction is particularly relevant for product naming. A product called "LlamaLegal" or "GemmaAssist" uses the model's name in a way that implies endorsement or official association — Meta and Google have not licensed these uses through the model licence. Startup founders regularly choose product names that incorporate model names without understanding that trademark rights are not included in open-weight distribution.
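A first-pass naming review of this kind can be automated. The sketch below flags candidate product names that embed a model mark; the mark list is an illustrative assumption, and a flagged name still needs human trademark review against the provider's brand guidelines.

```python
# Sketch: flag candidate product names that incorporate a model's mark.
# The mark list is illustrative; it substitutes for, not replaces,
# a proper trademark review.

MODEL_MARKS = ["llama", "gemma", "mistral", "falcon",
               "bloom", "gpt", "claude", "gemini"]

def flag_marks(product_name: str) -> list:
    """Return every model mark that appears inside the product name."""
    name = product_name.lower()
    return [mark for mark in MODEL_MARKS if mark in name]

print(flag_marks("LlamaLegal"))    # → ['llama']
print(flag_marks("AcmeDrafting"))  # → []
```

Running this over a shortlist of candidate names at the branding stage costs seconds; renaming a launched product after a trademark complaint costs considerably more.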
Restriction Matrix — Which Models Carry Which Obligations
The table below consolidates the four blind spots across the licences discussed in this section.
| Licence | Prohibited uses | Scale threshold | Derivative obligations | Attribution / naming |
|---|---|---|---|---|
| Meta Llama 3 Community License | Yes — named categories | 700M MAU cap | Fine-tuned variants must carry the Llama 3 licence | Attribution required; no trademark rights |
| Google Gemma Terms of Use | Yes — prohibited use policy by reference | None stated | Derivatives must comply with the Gemma ToU | Attribution required; no trademark rights |
| RAIL-M (BLOOM) | Yes — named prohibitions | None | Use restrictions pass downstream | Preserve notices |
| Apache-2.0 (Mistral 7B v0.1, OLMo) | None | None | Preserve notices only | Preserve copyright notices |
Due diligence disclosure point: Each of these four blind spots is a standard item in AI-focused due diligence conducted by investor legal counsel and enterprise procurement teams. A product built on Llama 3 without documented analysis of use restrictions, the MAU cap, derivative obligations, and attribution compliance will require remediation — or a model replacement — before institutional investment or enterprise contracts can close. Addressing this at the architecture stage costs hours. Addressing it at due diligence costs weeks and sometimes the deal itself.
Section 4 provides a practical framework for reading any model licence before deployment, and a comparison of the full licence terms for the six most widely used open-weight model licences currently in production.
Section 4 — How to Read a Model Licence: A Practical Framework
There is no universal method for reading a model licence because there is no standard format for writing one. Each major provider has drafted their own custom instrument — and each uses different language, different structure, and different concepts to describe similar restrictions. The five-step framework below provides a consistent analytical approach that works across any model licence, regardless of the provider's formatting preferences or the document's length.
Before applying the framework, confirm one preliminary: you are reading the right document. For most model repositories, there are at least two licence-relevant documents — the code licence (usually linked from the repository README) and the model or weights licence (sometimes a separate document linked from the model card, sometimes a file called LICENSE_MODEL or USE_POLICY.md). For products accessed via API, the commercial terms of service are the relevant document — the model's weights licence does not apply to API access. Apply the framework to each document independently.
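Locating the licence-relevant documents in a downloaded repository can be scripted as a first step. A minimal sketch, assuming the common filename patterns mentioned above; real repositories vary, so a manual check of the model card remains necessary.

```python
# Sketch: enumerate licence-relevant files in a downloaded model repo
# so each can be reviewed independently. The filename patterns are
# common conventions, not an exhaustive or guaranteed list.
from pathlib import Path

LICENCE_PATTERNS = {
    "LICENSE", "LICENSE.txt", "LICENSE.md",            # usually the code licence
    "LICENSE_MODEL", "MODEL_LICENSE", "USE_POLICY.md", # weights / use policy
    "NOTICE", "ACCEPTABLE_USE_POLICY.md",
}

def find_licence_documents(repo_dir: str) -> list:
    """Return every licence-relevant file under repo_dir, sorted by path."""
    root = Path(repo_dir)
    return sorted(p for p in root.rglob("*")
                  if p.is_file() and p.name in LICENCE_PATTERNS)
```

Finding two or more documents is the expected outcome, not an anomaly: it is the three-layer structure from Section 1 made visible on disk.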
Licence Comparison — The Six Frameworks You Will Encounter
The following table summarises the key commercial characteristics of the six model licence frameworks most commonly encountered in AI product development. It is a reference summary — full licence texts should be read before any commercial deployment decision.
| Licence | Applies to code? | Applies to weights? | Commercial use | Fine-tune / modify | Derivative licence | Use restrictions | Scale threshold |
|---|---|---|---|---|---|---|---|
| Apache-2.0 | Yes | Yes (if applied) | Unrestricted | Yes | Apache or compatible; preserve notices | None | None |
| MIT | Yes | Yes (if applied) | Unrestricted | Yes | Any; preserve MIT copyright notice | None | None |
| GPL-3.0 | Yes | Contested (if applied) | Yes | Yes | Must be GPL-3.0 (copyleft) | None | None |
| RAIL-M (BigScience) | Yes | Yes | Yes — with use restrictions | Yes | Must pass RAIL restrictions downstream | Yes — named prohibitions | None |
| Meta Llama 3 Community License | Code is MIT | Yes — custom | Yes — with prohibitions | Yes | Must remain Llama 3 licence; no relicensing | Yes — named + law compliance | 700M MAU cap |
| Google Gemma Terms of Use | Code is Apache-2.0 | Yes — custom | Yes — with prohibitions | Yes | Derivative must comply with Gemma ToU | Yes — prohibited use policy by reference | None stated |
Conclusion: Six Steps to Licence-Compliant AI Deployment
AI model licensing is an evolving, non-standard field with no single governing framework and no history of case law to guide interpretation. What it does have is a small number of documents — the actual weights licences of the models in production — that can be read, documented, and audited. The following six-step framework converts that reading into a defensible compliance position.
Map all three licence layers for every model in your stack
For each model: identify the code licence, the weights licence, and the training data disclosure. Store these as your AI IP schedule. Update it when you swap models or update versions — model versions can carry different licences.
Read the weights licence in full — not the README
The README describes the model. The weights licence governs what you can do with it. These are different documents and they say different things. Apply the five-step framework from Section 4 to every weights licence before deployment.
Check every use restriction against your product's full capability set
Do not assess restrictions against intended use only. Assess them against what your product can do — including edge cases, user-generated inputs, and integrations. A prohibited use is a prohibited use regardless of whether you intended to enable it.
Model scale thresholds against your growth projections
If your model has a scale threshold (currently: Llama 3's 700M MAU), include it in your commercial risk register from day one. Document that you have modelled the threshold and assessed the risk. This is what investors and acquirers will ask about at due diligence.
Review fine-tuning and derivative provisions before building your training pipeline
If you intend to fine-tune a model, read the derivative works provisions before you begin — not after. A fine-tuned model built on a licence-incompatible base is difficult to remediate. Switching models at the architecture stage is a design decision; switching at launch is a crisis.
Implement attribution compliance and naming review as part of product QA
Attribution requirements and naming restrictions are process obligations, not one-time decisions. Build them into your product QA checklist, your marketing review process, and your legal sign-off workflow. Include model licence compliance in every product update cycle that changes or upgrades model dependencies.