How AI Models Are Actually Licensed
The word "open" appears in the description of most leading AI models — yet almost none of them are open source in the legal sense. This guide explains what you are actually agreeing to when you deploy Llama 3, Gemma, or Mistral: three separate licence layers, a taxonomy of terms that mean very different things in law, and the commercial blind spots that regularly surface at due diligence and Series A.
Code, Data, and Model Weights — Three Distinct Licensing Objects
Why AI models create three separate IP layers and why each is governed by a different licence
"Open Source", "Open Weights", "Source-Available" — What Each Term Really Means
The OSI definition, where Llama 3, Gemma, and Mistral actually sit, and why the distinction matters for your business
The Founder's Blind Spots — Use Restrictions, MAU Caps, and Derivative Obligations
The commercial licence terms most builders miss: prohibited uses, the 700M MAU trigger, and downstream obligations on fine-tuned models
How to Read a Model Licence — A Practical Framework
A five-step licence-reading process, a comparison of the major licences in use, and a six-step compliance roadmap for commercial deployments
Introduction — The Open Model Licensing Problem
When a developer downloads Llama 3, Gemma 2, or Mistral and reads "open weights" in the documentation, a natural assumption follows: this works like open source software. Use it freely, modify it, deploy it commercially, build products on top of it. That assumption is wrong in ways that regularly surface during investor due diligence, enterprise procurement reviews, and regulatory examinations. The word "open" in AI model distribution means something materially different from what it means in the software licensing context — and the gap between those meanings carries real commercial and legal risk.
AI model licensing is not a single question. It is three separate questions asked simultaneously about three distinct components: the code that trains and runs the model, the data the model was trained on, and the model weights themselves. Each of these components may be licensed under a completely different framework — and the permissiveness of one does not extend to the others. A model whose code is released under Apache-2.0 (maximally permissive) may have weights subject to a custom licence that prohibits specific commercial applications or imposes obligations when you distribute a fine-tuned version.
Three separate IP layers
Unlike traditional software — where one licence governs the whole product — AI models create at least three independently licensed objects: code, training data, and weights. Each can have a different licence owner and different permitted uses.
Custom licences, not OSS standards
Major model providers have drafted bespoke model licences — not standard OSS templates. Meta's Llama Community License, Google's Gemma Terms of Use, and Stability AI's RAIL licence are all custom instruments with terms that have no direct parallel in Apache-2.0 or MIT.
Downstream obligation inheritance
Fine-tuning, distillation, and model merging can all create derivative works that inherit the base model's licence obligations. A product built on a RAIL-licensed model must comply with RAIL restrictions even after the model is modified — unlike traditional software where re-implementation starts clean.
Jurisdiction-specific enforcement risk
Model licences are typically governed by US law, but the activities they restrict may be regulated differently across jurisdictions. Using a model for "law enforcement purposes" may be prohibited by licence in a jurisdiction where that use is mandatory — creating a conflict that requires specific legal analysis.
Related: AI model licensing intersects directly with questions of who owns the outputs and underlying IP in an AI-assisted development context. For a full analysis of AI IP ownership structures for founders and investors, see our guide: AI IP Ownership — How to Structure It for Founders and Investors.
The four sections that follow address each dimension of AI model licensing in turn. Section 1 maps the three licensing layers. Section 2 defines the terminology that is routinely misused — including by major model providers in their own documentation. Section 3 identifies the specific commercial blind spots that create liability for founders. Section 4 provides a practical framework for reading any model licence and a licence comparison matrix for the major models in current use.
Section 1 — Code, Data, and Model Weights: Three Distinct Licensing Objects
The most important structural fact about AI model licensing is that a single deployed model involves three legally distinct objects — and each can be, and frequently is, licensed under a different framework. Treating them as one thing — as the README of most model repositories implicitly encourages you to do — is the root cause of almost every AI licensing error made in commercial product development.
The distinction matters in practice because the licence that governs the code tells you nothing about what you can do with the weights. The most commercially permissive code licence in existence (MIT) applied to inference code does not grant you any rights to the model weights if those weights are separately released under a restrictive custom licence. You need to check all three layers independently, every time, for every model you deploy.
Layer 1 — The Code
Training scripts, inference engines, fine-tuning pipelines, evaluation tools, and model-serving frameworks
The code layer comprises all software associated with the model: the training framework, data processing pipelines, the model architecture definition, inference server code, fine-tuning utilities, and any tooling distributed alongside the weights. This is standard software and is governed by standard software licences — most commonly Apache-2.0, MIT, or, for some research codebases, GPL variants.
This is the layer most developers understand and most README files describe prominently. A repository with "License: Apache-2.0" in its badge is telling you about the code — not about the weights. Developers who treat this as the complete licensing picture are reading roughly one-third of the relevant documents.
- Apache-2.0 and MIT applied to code grant broad commercial use rights, including modification, distribution, and sublicensing
- GPL-3.0 on training code creates copyleft obligations — any modifications to the training code that are distributed must also be GPL
- The code licence does NOT extend to model weights unless explicitly stated in the same licence document
- Most major open-weight codebases use permissive code licences (Apache-2.0 for Gemma and Mistral, MIT for the Llama 3 reference implementation), and in every case the permissiveness stops at the code layer
Layer 2 — The Training Data
Datasets used to pre-train, instruction-tune, and RLHF-align the model — and the rights (or lack of them) that flow from their use
The training data layer is the most opaque and legally contested dimension of AI model licensing. Most frontier model providers either do not disclose their training data composition in detail (OpenAI, Anthropic, Google's proprietary models) or disclose it only at a broad categorical level (Meta's Llama 3 was trained on "publicly available online data" — a description that encompasses trillions of web documents with an enormous range of individual licence terms).
The legal questions that flow from training data are still actively litigated. The central debate — whether training a model on copyrighted text or images constitutes copyright infringement — has not been definitively resolved in any major jurisdiction as of 2025. What is known is that training data rights do not automatically transfer to model users: if a model was trained on data subject to restrictions, those restrictions do not disappear because you downloaded the weights.
- Data licence compliance is the model provider's obligation — but downstream users may face claims where the provider's terms pass residual liability
- GDPR issues with training data (personal data in web scrapes) can affect operators who deploy models processing EU personal data — regulators are examining the chain
- Models fine-tuned on proprietary or customer data inherit a fourth licensing question: who owns the fine-tuned model and its outputs?
- Sector-specific deployment (legal, medical, financial) may trigger additional analysis of whether training data provenance creates domain-specific liability
Layer 3 — The Model Weights
The trained parameter sets — the actual commercial artefact — governed by custom model licences that diverge significantly from standard OSS frameworks
Model weights are the numerical parameter matrices that encode everything a neural network has learned from its training data. They are the thing you are actually deploying when you run an AI model in production — the code merely tells the runtime how to use them. Weights are a new category of IP-protected artefact that does not fit neatly into existing legal categories: they are not code (in the traditional sense), not databases, and not literary works, though they may attract protection under all three regimes in different jurisdictions.
The weights layer is where commercial licences diverge dramatically from what developers expect. Meta's Llama 3 weights are distributed under the Meta Llama 3 Community License — a custom instrument that prohibits specific use cases, caps users at 700 million monthly active users, and requires that derivative models (fine-tunes) are also subject to the same licence. Google's Gemma weights come under the Gemma Terms of Use, which similarly prohibits named categories of harmful use and requires compliance downstream. These are not open source licences. They are proprietary licences with open distribution of the artefact — a conceptually distinct category.
- Weights are the only layer where custom, bespoke AI licences are routinely used — creating an entirely new body of commercial licence terms with no case law history
- The legal basis for protecting weights varies by jurisdiction: trade secret (if not published), database rights (EU), copyright (some jurisdictions treat parameter matrices as expressive works), or simply contract through the licence terms
- Some weights are released under genuinely open licences (Mistral 7B v0.1 — Apache-2.0; some OLMo variants — Apache-2.0) — these are the exception, not the norm among frontier models
- Distributing a product that includes model weights requires you to comply with the weights licence's distribution provisions — which may include downstream pass-through obligations
Layer Combinations in Practice — Real Model Examples
The following table shows how the three licence layers combine for the most widely used open-weight models. Confirming these combinations is the minimum due diligence for any commercial deployment.
| Model | Code licence | Data disclosure | Weights licence | Commercial use? |
|---|---|---|---|---|
| Meta Llama 3 (8B / 70B / 405B) | MIT | Partial — "publicly available online data" | Meta Llama 3 Community License (custom) | Yes — with restrictions + 700M MAU cap |
| Google Gemma 2 (2B / 9B / 27B) | Apache-2.0 | Not disclosed in detail | Gemma Terms of Use (custom) | Yes — with prohibited use categories |
| Mistral 7B v0.1 | Apache-2.0 | Not publicly disclosed | Apache-2.0 (applied to weights) | Yes — genuinely permissive |
| Mistral Large / Medium | N/A — API only | Not disclosed | Mistral commercial API ToS | API access only — no weight distribution |
| Falcon 40B / 180B | Apache-2.0 | RefinedWeb dataset — partially described | TII Falcon License (custom, v2: Apache-2.0) | v1: restricted; v2 and later: Apache-2.0 |
| GPT-4 / Claude / Gemini Pro | Proprietary | Not disclosed | Proprietary — API access only | API ToS only — no weight access |
The independence principle: Each licensing layer is legally independent. A permissive code licence does not make the weights permissive. A restrictive data provenance does not automatically make the code or weights restricted. Each layer must be assessed on its own terms. When building a commercial product on any AI model, the weights licence is the most commercially significant document — and the one most commonly left unread.
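The three-layer mapping above lends itself to a machine-readable record that can be reviewed layer by layer. The sketch below, populated from the table in this section, is one possible shape for such a schedule; the record structure and the OSI shortlist are illustrative assumptions, not any standard format.

```python
# Sketch: a minimal per-model licence record covering all three layers,
# populated from the layer-combinations table above. The dataclass shape
# and the OSI_LICENCES shortlist are assumptions for illustration.
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelLicenceRecord:
    model: str
    code_licence: str
    weights_licence: str
    data_disclosure: str

SCHEDULE = [
    ModelLicenceRecord("Meta Llama 3 70B", "MIT",
                       "Meta Llama 3 Community License", "partial"),
    ModelLicenceRecord("Google Gemma 2 9B", "Apache-2.0",
                       "Gemma Terms of Use", "not disclosed in detail"),
    ModelLicenceRecord("Mistral 7B v0.1", "Apache-2.0",
                       "Apache-2.0", "not publicly disclosed"),
]

# Surface every model whose WEIGHTS are not under an OSI-approved licence,
# regardless of how permissive the code licence is.
OSI_LICENCES = {"Apache-2.0", "MIT"}
needs_review = [r.model for r in SCHEDULE
                if r.weights_licence not in OSI_LICENCES]
print(needs_review)  # → ['Meta Llama 3 70B', 'Google Gemma 2 9B']
```

The point of the exercise is the independence principle in code form: the filter runs on the weights field alone, because the code licence tells you nothing about the weights.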
Having established what is being licensed, Section 2 addresses the terminology problem: what "open source", "open weights", and "source-available" actually mean in the context of AI models, why most "open" models do not satisfy the technical definition of open source, and why that distinction has direct commercial consequences for founders.
Section 2 — "Open Source", "Open Weights", "Source-Available": What Each Term Really Means
The AI industry has borrowed the vocabulary of open source software and applied it in ways that do not correspond to the legal meaning those terms carry. A model described as "open" by its creator may be open in a narrow technical sense (the weights are downloadable) while being closed in the legally significant sense (not usable for all purposes, not freely distributable, not freely modifiable without downstream obligations). Understanding these distinctions is not pedantry — it is the difference between a compliant deployment and an inadvertent breach.
Open Source
In precise technical and legal usage, "open source" refers to software meeting the Open Source Initiative (OSI) definition: the source must be freely redistributable, modifiable, and usable for any purpose without discrimination against persons, groups, or fields of endeavour. Licences that restrict commercial use, impose field-of-use limitations, or require specific approval for certain applications do not qualify.
Standard OSI-approved licences include MIT, Apache-2.0, and GPL. Applied to model weights, these licences are permissive in the full OSI sense: commercial use, fine-tuning, redistribution, and product embedding are all permitted without use-case carve-outs.
Open Weights
A model is "open weights" when its trained parameter matrices are publicly available for download. This says nothing about what you can do with them. Open weights may be released under a fully permissive OSI licence, under a custom licence with significant commercial restrictions, or under a bespoke research licence with no commercial rights at all — and all three are described as "open" in common usage.
Open weights is a distribution characteristic, not a permission framework. That the weights are available to download means only that Meta or Google has not technically prevented access — not that access grants the legal rights that "open" implies in software engineering.
Source-Available
"Source-available" describes software where the source code is publicly readable but where the licence does not grant the rights required to qualify as open source. In the AI context, "source-available" typically describes situations where training code or model architecture code is published and inspectable — but either the weights are withheld, or the licence restricts what you can do with what you see.
Some providers publish training code for research credibility without releasing the trained weights. Others release weights under terms that would make the arrangement "available but not free". Knowing what you are looking at requires reading the actual licence — not the marketing copy on the model card.
The OSI test for AI models: To qualify as open source under the OSI definition, a model licence applied to weights must permit free use, redistribution, and modification for any purpose — including commercial use — without field-of-use restrictions. Most "open" AI model licences currently in use fail this test because they prohibit specific applications (weapons, surveillance, illegal content) or impose obligations on derived models that go beyond what OSI-approved licences require.
This is not necessarily a criticism of those licences — responsible-use clauses may be appropriate policy. But it does mean that the common description of Llama 3 or Gemma 2 as "open source" is technically incorrect. They are open-weight, source-available models with custom commercial licences. Legal and commercial analysis must reflect that reality.
Model Profiles — Where Llama 3, Gemma, and Mistral Actually Sit
Meta Llama 3 (8B · 70B · 405B)
Released April 2024 · Meta AI · Most downloaded open-weight model family
Llama 3 is the most widely adopted open-weight model family and the one most frequently mischaracterised as "open source" in startup pitches and technical documentation. The code (model architecture, inference implementation) is released under MIT. The weights are released under the Meta Llama 3 Community License — a custom document that is materially distinct from any OSI-approved licence.
Google Gemma 2 (2B · 9B · 27B)
Released June 2024 · Google DeepMind · Competitive performance at smaller parameter counts
Gemma 2 is Google's open-weight model series, released with the stated aim of enabling research and commercial applications. Like Llama 3, its code (including the Keras and JAX implementations) is released under Apache-2.0 — a permissive OSI licence. The weights are released under the Gemma Terms of Use — a separate custom document that imposes substantive restrictions not present in Apache-2.0.
Mistral 7B v0.1 (and v0.3)
Released September 2023 · Mistral AI · One of the few genuinely open source frontier-class models
Mistral 7B v0.1 is notable precisely because it is the exception: both the code and the weights were released under Apache-2.0, making it one of the few frontier-class models that meets the OSI definition of open source at the weights layer. Commercial use, fine-tuning, redistribution, and product embedding are all permitted without use-case restrictions or derivative licensing obligations beyond Apache-2.0's basic attribution requirement (preserve copyright notices).
| Model | Category | Weights publicly available? | OSI open source? | Commercial use? | Fine-tune freely? |
|---|---|---|---|---|---|
| Llama 3 (Meta) | Open weights | Yes | No | Yes — with restrictions | Yes — derivative inherits licence |
| Gemma 2 (Google) | Open weights | Yes | No | Yes — with restrictions | Yes — derivative inherits ToU |
| Mistral 7B v0.1/v0.3 | Open source | Yes | Yes — Apache-2.0 | Yes — unrestricted | Yes — Apache attribution only |
| Mistral Large / Medium | Proprietary API | No | No | API access under commercial ToS | No — no weight access |
| OLMo (Allen AI) | Open source | Yes | Yes — Apache-2.0 | Yes — unrestricted | Yes |
| BLOOM (BigScience) | Source-available | Yes | No — RAIL-M restrictions | Yes — with RAIL use restrictions | Yes — RAIL restrictions pass downstream |
Section 3 addresses the specific commercial restrictions that appear in the most widely deployed open-weight model licences — the use-case prohibitions, scale thresholds, and derivative-work obligations that founders most commonly overlook when building products on top of models like Llama 3 and Gemma.
Section 3 — The Founder's Blind Spots: Use Restrictions, MAU Caps, and Derivative Obligations
Most founders who use open-weight AI models have read the headline of the licence but not the body. The headline — "commercial use permitted" — is accurate as far as it goes. The body contains four categories of restriction that create real commercial risk: prohibited use categories, scale thresholds, derivative-work obligations, and attribution and naming requirements. Each of these is found in one or more of the major open-weight model licences currently used in production deployments.
Blind Spot 1 — Prohibited Use Categories
Specific applications explicitly excluded from permitted commercial use — regardless of the founder's intent or industry
Every major custom model licence contains a list of prohibited uses. These lists prohibit specific application categories regardless of the operator's commercial intent, the harm potential of the specific implementation, or the jurisdiction in which the operator is based. They are hard prohibitions, not guidelines — using a model for a prohibited purpose is a breach of licence regardless of how the product is structured.
The categories vary between licences, but converge around a common set: weapons development (broadly defined to include cybersecurity offensive tools in some licences), illegal content generation, surveillance and tracking without consent, and — in some formulations — any use intended to undermine "appropriate human oversight of AI systems." The last category is notably broad and its scope is not defined in most licences.
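One way to operationalise prohibited-use review is to keep the product's full capability inventory and the licence's prohibited categories as explicit lists and intersect them. The sketch below is a hypothetical illustration: the category tags are invented for the example, and the authoritative list is always the licence text itself.

```python
# Sketch: intersect a product's capability inventory with a licence's
# prohibited-use categories. All tags here are hypothetical labels;
# the licence text, not this list, is authoritative.

def prohibited_overlap(capabilities: set, prohibited: set) -> set:
    """Return the capabilities that fall into a prohibited category."""
    return capabilities & prohibited

# The inventory should cover what the product CAN do, including edge
# cases and integrations, not only the intended use.
capabilities = {"document-summarisation", "code-generation",
                "face-matching-in-video"}   # an integration edge case
prohibited = {"surveillance-without-consent", "face-matching-in-video",
              "weapons-development"}

print(sorted(prohibited_overlap(capabilities, prohibited)))
# → ['face-matching-in-video']
```

The usefulness of the exercise is in the inventory itself: a capability list that omits edge cases will pass this check and still breach the licence in production.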
Blind Spot 2 — Scale Thresholds and MAU Caps
Commercial conditions that change at defined user or revenue thresholds — the 700 million monthly active user trigger in Llama 3
The Meta Llama 3 Community License contains a scale threshold that is routinely absent from founders' commercial modelling. Section 1 of the licence provides that the licence is granted subject to the condition that, if "the monthly active users of the products or services made available by or for [the licensee] exceeds 700 million monthly active users in the preceding calendar month, [the licensee] must request a license from Meta, which Meta may grant to you in its sole discretion."
700 million MAU sounds like a threshold no startup needs to think about. But the commercial significance is different from what that number suggests. First, the threshold applies to the product or service as a whole — not just to AI feature usage — which means a consumer product with a large user base using Llama 3 for any feature crosses the threshold even if only a small percentage of users interact with the AI component. Second, the clause gives Meta sole discretion — which means the terms of the post-threshold licence are not specified in the document you agreed to. Third, in acquisition due diligence, any threshold that requires a third-party consent is a potential deal delay or complication regardless of its likelihood of triggering.
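To make the threshold concrete in a commercial risk register, the crossing point can be projected in a few lines. A minimal sketch, assuming simple compounding growth; the starting MAU and growth rate below are hypothetical, and only the 700M figure comes from the licence quoted above.

```python
# Sketch: estimate when a projected MAU curve would cross a licence
# scale threshold (here the Llama 3 Community License's 700M MAU cap).
# Growth inputs are illustrative assumptions, not real data.

LLAMA3_MAU_CAP = 700_000_000

def months_until_threshold(current_mau, monthly_growth,
                           cap=LLAMA3_MAU_CAP, horizon=120):
    """Months until MAU exceeds `cap` under compounding growth,
    or None if the cap is not crossed within `horizon` months."""
    mau = current_mau
    for month in range(1, horizon + 1):
        mau *= 1 + monthly_growth
        if mau > cap:
            return month
    return None

# Hypothetical: 5M MAU for the whole product, growing 10% month-on-month.
print(months_until_threshold(5_000_000, 0.10))
```

Note that, per the clause above, the relevant MAU figure is the product or service as a whole, so the projection should use total users, not AI-feature users.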
Blind Spot 3 — Derivative-Work and Fine-Tuning Obligations
What happens to the licence when you fine-tune, distil, adapt, or merge a model — and what obligations pass downstream to your product
Fine-tuning is how most commercial AI products are built — you take a pre-trained base model and adapt it to your specific domain, use case, or data. The question that founders consistently fail to ask before beginning this process is: what does the base model's licence say about fine-tuned derivatives? In traditional software, writing new code that uses an Apache-2.0 library does not make your code Apache-2.0. In AI, the equivalent is less clear — and custom model licences typically have explicit provisions that govern this.
Llama 3's Community License specifies that any "Llama Materials" redistributed — including fine-tuned variants — must carry the Llama 3 Community License. GPL-style licences applied to model weights require that derivatives also be GPL. RAIL-M (used with BLOOM and some Stable Diffusion variants) passes use restrictions downstream specifically — even if you remove all identifiable connection to the base model, the restrictions follow the derived weights.
Blind Spot 4 — Attribution Requirements and Trademark Restrictions
What you must say, display, and disclose to users — and what you cannot name your product
Attribution and naming requirements are the least commercially dangerous of the four blind spots — but they are the most commonly non-compliant in practice, because they require affirmative actions that development teams do not routinely build into their deployment workflows. Most model licences require some form of attribution; some require specific disclosures to end users; and almost all prohibit using the model's name, branding, or associated marks in product marketing without separate trademark authorisation.
The trademark restriction is particularly relevant for product naming. A product called "LlamaLegal" or "GemmaAssist" uses the model's name in a way that implies endorsement or official association — Meta and Google have not licensed these uses through the model licence. Startup founders regularly choose product names that incorporate model names without understanding that trademark rights are not included in open-weight distribution.
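A first-pass naming review of this kind can be automated. The sketch below flags candidate product names that embed a model mark; the mark list is an illustrative assumption, and a flagged name still needs human trademark review against the provider's brand guidelines.

```python
# Sketch: flag candidate product names that incorporate a model's mark.
# The mark list is illustrative; it substitutes for, not replaces,
# a proper trademark review.

MODEL_MARKS = ["llama", "gemma", "mistral", "falcon",
               "bloom", "gpt", "claude", "gemini"]

def flag_marks(product_name: str) -> list:
    """Return every model mark that appears inside the product name."""
    name = product_name.lower()
    return [mark for mark in MODEL_MARKS if mark in name]

print(flag_marks("LlamaLegal"))    # → ['llama']
print(flag_marks("AcmeDrafting"))  # → []
```

Running this over a shortlist of candidate names at the branding stage costs seconds; renaming a launched product after a trademark complaint costs considerably more.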
Restriction Matrix — Which Models Carry Which Obligations
The table below consolidates the four blind spots across the licences discussed in this section.
| Licence | Prohibited uses | Scale threshold | Derivative obligations | Attribution / naming |
|---|---|---|---|---|
| Meta Llama 3 Community License | Yes — named categories | 700M MAU cap | Fine-tuned variants must carry the Llama 3 licence | Attribution required; no trademark rights |
| Google Gemma Terms of Use | Yes — prohibited use policy by reference | None stated | Derivatives must comply with the Gemma ToU | Attribution required; no trademark rights |
| RAIL-M (BLOOM) | Yes — named prohibitions | None | Use restrictions pass downstream | Preserve notices |
| Apache-2.0 (Mistral 7B v0.1, OLMo) | None | None | Preserve notices only | Preserve copyright notices |
Due diligence disclosure point: Each of these four blind spots is a standard item in AI-focused due diligence conducted by investor legal counsel and enterprise procurement teams. A product built on Llama 3 without documented analysis of use restrictions, the MAU cap, derivative obligations, and attribution compliance will require remediation — or a model replacement — before institutional investment or enterprise contracts can close. Addressing this at the architecture stage costs hours. Addressing it at due diligence costs weeks and sometimes the deal itself.
Section 4 provides a practical framework for reading any model licence before deployment, and a comparison of the full licence terms for the six most widely used open-weight model licences currently in production.
Section 4 — How to Read a Model Licence: A Practical Framework
There is no universal method for reading a model licence because there is no standard format for writing one. Each major provider has drafted their own custom instrument — and each uses different language, different structure, and different concepts to describe similar restrictions. The five-step framework below provides a consistent analytical approach that works across any model licence, regardless of the provider's formatting preferences or the document's length.
Before applying the framework, confirm one preliminary: you are reading the right document. For most model repositories, there are at least two licence-relevant documents — the code licence (usually linked from the repository README) and the model or weights licence (sometimes a separate document linked from the model card, sometimes a file called LICENSE_MODEL or USE_POLICY.md). For products accessed via API, the commercial terms of service are the relevant document — the model's weights licence does not apply to API access. Apply the framework to each document independently.
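Locating the licence-relevant documents in a downloaded repository can be scripted as a first step. A minimal sketch, assuming the common filename patterns mentioned above; real repositories vary, so a manual check of the model card remains necessary.

```python
# Sketch: enumerate licence-relevant files in a downloaded model repo
# so each can be reviewed independently. The filename patterns are
# common conventions, not an exhaustive or guaranteed list.
from pathlib import Path

LICENCE_PATTERNS = {
    "LICENSE", "LICENSE.txt", "LICENSE.md",            # usually the code licence
    "LICENSE_MODEL", "MODEL_LICENSE", "USE_POLICY.md", # weights / use policy
    "NOTICE", "ACCEPTABLE_USE_POLICY.md",
}

def find_licence_documents(repo_dir: str) -> list:
    """Return every licence-relevant file under repo_dir, sorted by path."""
    root = Path(repo_dir)
    return sorted(p for p in root.rglob("*")
                  if p.is_file() and p.name in LICENCE_PATTERNS)
```

Finding two or more documents is the expected outcome, not an anomaly: it is the three-layer structure from Section 1 made visible on disk.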
Licence Comparison — The Six Frameworks You Will Encounter
The following table summarises the key commercial characteristics of the six model licence frameworks most commonly encountered in AI product development. It is a reference summary — full licence texts should be read before any commercial deployment decision.
| Licence | Applies to code? | Applies to weights? | Commercial use | Fine-tune / modify | Derivative licence | Use restrictions | Scale threshold |
|---|---|---|---|---|---|---|---|
| Apache-2.0 | Yes | Yes (if applied) | Unrestricted | Yes | Apache or compatible; preserve notices | None | None |
| MIT | Yes | Yes (if applied) | Unrestricted | Yes | Any; preserve MIT copyright notice | None | None |
| GPL-3.0 | Yes | Contested (if applied) | Yes | Yes | Must be GPL-3.0 (copyleft) | None | None |
| RAIL-M (BigScience) | Yes | Yes | Yes — with use restrictions | Yes | Must pass RAIL restrictions downstream | Yes — named prohibitions | None |
| Meta Llama 3 Community License | Code is MIT | Yes — custom | Yes — with prohibitions | Yes | Must remain Llama 3 licence; no relicensing | Yes — named + law compliance | 700M MAU cap |
| Google Gemma Terms of Use | Code is Apache-2.0 | Yes — custom | Yes — with prohibitions | Yes | Derivative must comply with Gemma ToU | Yes — prohibited use policy by reference | None stated |
Conclusion: Six Steps to Licence-Compliant AI Deployment
AI model licensing is an evolving, non-standard field with no single governing framework and no history of case law to guide interpretation. What it does have is a small number of documents — the actual weights licences of the models in production — that can be read, documented, and audited. The following six-step framework converts that reading into a defensible compliance position.
Map all three licence layers for every model in your stack
For each model: identify the code licence, the weights licence, and the training data disclosure. Store these as your AI IP schedule. Update it when you swap models or update versions — model versions can carry different licences.
Read the weights licence in full — not the README
The README describes the model. The weights licence governs what you can do with it. These are different documents and they say different things. Apply the five-step framework from Section 4 to every weights licence before deployment.
Check every use restriction against your product's full capability set
Do not assess restrictions against intended use only. Assess them against what your product can do — including edge cases, user-generated inputs, and integrations. A prohibited use is a prohibited use regardless of whether you intended to enable it.
Model scale thresholds against your growth projections
If your model has a scale threshold (currently: Llama 3's 700M MAU), include it in your commercial risk register from day one. Document that you have modelled the threshold and assessed the risk. This is what investors and acquirers will ask about at due diligence.
Review fine-tuning and derivative provisions before building your training pipeline
If you intend to fine-tune a model, read the derivative works provisions before you begin — not after. A fine-tuned model built on a licence-incompatible base is difficult to remediate. Switching models at the architecture stage is a design decision; switching at launch is a crisis.
Implement attribution compliance and naming review as part of product QA
Attribution requirements and naming restrictions are process obligations, not one-time decisions. Build them into your product QA checklist, your marketing review process, and your legal sign-off workflow. Include model licence compliance in every product update cycle that changes or upgrades model dependencies.