Can You Monetize on “Open” LLMs? A Practical Breakdown

AI Licensing · Commercial Use

Gemma, Llama 3, and Mistral are described as "open" — but open does not mean commercially unrestricted. The gap between what the licence permits and what a business wants to do is where legal exposure begins. This post maps the restrictions, clarifies the feature-vs-reselling line, and identifies the business models that work and those that do not.


Introduction — The Permission Gap in "Open" LLMs

The word "open" in open-source AI carries a specific technical meaning — the model weights are publicly available. It does not mean the model is free to commercialise in any way you choose. Every major open LLM — Llama 3, Gemma, Mistral's commercial releases, and NVIDIA's OML-licensed models — comes with a licence that defines what commercial use is permitted, what is restricted, and what is outright prohibited. The gap between what founders assume they can do and what the licence actually permits is where legal exposure silently accumulates.

This is not a theoretical concern. Licence restrictions on open LLMs have already produced measurable commercial consequences: products redesigned at scale after legal review, enterprise deals stalled over procurement questions about the underlying model, and startup valuations impacted by IP chain uncertainty at fundraising. Understanding the licence before building the revenue model — not after the product is in production — is the commercially rational approach.

Three Assumptions That Get Founders Into Trouble

⚠️
Common assumption: "Open weights mean I can build any commercial product"
Reality

Open weights grant access to the model — they do not grant unlimited commercial rights. Llama 3 prohibits certain use cases entirely, applies MAU thresholds above which a commercial licence is required, and restricts using the model to build products that compete with Meta. Gemma's licence similarly bans specific harm categories and usage types regardless of whether the weights are publicly available.

⚠️
Common assumption: "I'm building a SaaS product, so no model licence restrictions apply to me"
Reality

SaaS deployment does not insulate you from model licence restrictions — it changes how the restrictions apply. The question is not whether you are a SaaS product, but what your SaaS product does: whether it builds on the model to create distinct value, or whether the model itself is the product being sold. The licence restrictions run to the use case, not the delivery mechanism.

⚠️
Common assumption: "Usage-based pricing based on model calls is just standard SaaS pricing"
Reality

Pricing that is directly proportional to model API calls — with no meaningful value-added layer between the model and the customer — can constitute API reselling regardless of what you call it in your terms of service. Several model licences prohibit "selling access to" the model as a distinct commercial act. Whether your pricing structure crosses that line depends on whether there is genuine product value added, not on the pricing label you use.

How Licence Restrictions Flow Through the Product Stack

The obligation cascade — from model provider to end customer
Model provider licence (Llama 3 / Gemma / Mistral / Apache-2.0)

Sets the outer boundary: what use cases are permitted, what scale triggers additional obligations, what downstream uses are prohibited, and what the product layer must flow down to customers

AI product layer (your SaaS / API / application)

Must stay within the model licence boundary on use cases and must flow down required restrictions to customers in the product's own terms of service — the product cannot grant customers rights the model provider has not granted the product

Customer / end user

Bound by both the product's ToS and — through flow-down clauses — by the underlying model's acceptable use policy; customer activities that violate the model licence create upstream liability for the product layer even when the product's own terms do not explicitly address the behaviour

Four Variables That Determine Your Licence Exposure

1
Which model you are using

The licence type — Llama 3 Community Licence, Gemma ToU, Apache-2.0, NVIDIA OML, Mistral Research Licence vs commercial — determines the baseline restrictions. Apache-2.0 models carry the fewest restrictions; custom model licences carry the most. This is the single most important variable in the commercial risk assessment.

2
How you deploy it

Self-hosted open-weight models used internally create different obligations than models accessed via API or redistributed as part of a product. Deployment method affects sub-processor obligations, data processing agreements, and whether distribution-specific licence clauses are triggered.

3
What your revenue model is

Revenue models that monetise the model directly — per-token billing, inference API reselling, model-as-a-service — face higher licence scrutiny than revenue models that monetise a product outcome the model enables. The closer the pricing unit is to the model's own output unit, the more carefully the licence must be reviewed.

4
Who your customers are

Certain model licences restrict use in competitive contexts — serving customers who compete with the model provider, or enabling applications in sectors the model provider has identified as prohibited. Enterprise customers in regulated sectors also trigger additional obligations around data processing and model governance that affect the commercial relationship.

⚖️

Structuring context: Model licence choice affects not just product compliance but the IP structure of the company itself — including what is defensible at fundraising and how AI assets are valued in M&A. For the framework on how open model licence obligations interact with AI IP ownership and investment structuring, see AI IP Ownership — wcr.legal.

Section 1 — Monetization Restrictions Across Open LLMs

Every major open LLM applies commercial restrictions that fall into three categories: scale-based thresholds that trigger additional obligations above certain user counts, competitor bans that prohibit using the model to build products that compete with the model provider, and reselling prohibitions that restrict monetising the model or its weights directly. These restrictions are not identical across models — the specific terms, thresholds, and prohibitions vary materially between Llama 3, Gemma, NVIDIA's OML, and Apache-2.0 licensed models like Mistral 7B.

The Three Restriction Categories That Matter for Monetization

📊
Scale / MAU thresholds
Llama 3: products with more than 700 million monthly active users must obtain a separate commercial licence from Meta; below this threshold, the community licence applies
What triggers the threshold: MAU is counted across all products using the Llama 3 model; a company with multiple products using Llama 3 aggregates MAU across all of them for threshold purposes
Practical impact: at current scale levels, the 700M MAU threshold is not immediately relevant for most startups; however, the threshold creates a contractual dependency at a future point that must be disclosed in fundraising and M&A documentation
Gemma and Mistral: neither model applies a formal MAU threshold; scale-based restrictions under these licences operate through AUP category enforcement rather than user count triggers
🚫
Competitor / use-case bans
Llama 3: prohibits using the model to train or fine-tune another large language model that is intended to compete with Meta's AI products; this restricts foundation model development on top of Llama 3
Gemma: Google's terms prohibit use to "create content that may negatively affect Google" and restrict training competitive models; the competitor clause is broader in language, though less precisely defined
NVIDIA OML: explicitly prohibits use to develop competing AI software, frameworks, or hardware tools — one of the clearest competitor ban formulations among open-weight licences
Apache-2.0 (Mistral 7B): no competitor ban; Apache-2.0 does not permit discrimination against fields of endeavour, so competitive use is explicitly allowed under the licence terms
💰
Reselling / API monetization bans
Llama 3: the licence does not explicitly prohibit offering inference services, but does prohibit use in violation of Meta's usage policies; whether API reselling is permitted depends on whether it constitutes "use" under the licence or requires a separate agreement
Gemma: the Terms of Use prohibit "providing the model as a service or API to third parties" without Google's consent — this is one of the clearest API-reselling prohibitions in a major open-weight licence
NVIDIA OML: prohibits redistributing the model or its outputs as the primary basis of a commercial service; inference services that pass model outputs as the core product are restricted
Apache-2.0 (Mistral 7B): no reselling restriction; redistribution and commercial API deployment are explicitly permitted under Apache-2.0 terms

Licence Profile by Model — Commercial Restriction Overview

🦙
Llama 3
Meta — Llama 3 Community Licence
MAU threshold
700M — commercial licence required above
Competitor ban
Yes — competing LLM training prohibited
API reselling
Ambiguous — not explicitly prohibited, but usage policy governs
Fine-tune distribution
Permitted under community licence with attribution
SaaS product use
Permitted below MAU threshold
Licence revocability
Custom licence — revocable by Meta
💎
Gemma
Google — Gemma Terms of Use
MAU threshold
None — AUP-based restrictions only
Competitor ban
Yes — broad language covering Google-competing use
API reselling
Prohibited without Google's consent
Fine-tune distribution
Permitted; fine-tune inherits Gemma ToU
SaaS product use
Permitted for non-prohibited use cases
Licence revocability
Custom ToU — revocable by Google
🌬️
Mistral 7B
Mistral AI — Apache-2.0
MAU threshold
None — Apache-2.0 has no scale limit
Competitor ban
None — all fields of endeavour permitted
API reselling
Permitted — redistribution allowed under Apache-2.0
Fine-tune distribution
Permitted; Apache-2.0 applies to derivative works
SaaS product use
Permitted without restriction
Licence revocability
Irrevocable — Apache-2.0 grants permanent rights
NVIDIA Open Model Licence
NVIDIA — OML (custom)
MAU threshold
None stated — AUP and scope restrictions apply
Competitor ban
Yes — explicitly prohibits competing AI/software products
API reselling
Restricted — model output as primary product prohibited
Fine-tune distribution
Restricted — derivative works governed by OML
SaaS product use
Permitted for non-prohibited, non-competitive use cases
Licence revocability
Custom — NVIDIA retains change and termination rights

Restriction Comparison — Key Monetization Variables

MAU / scale threshold
🦙 Llama 3
700M MAU trigger
Commercial licence from Meta required above 700 million monthly active users; counts aggregate across all products using Llama 3
💎 Gemma
None
No formal MAU threshold; scale-based restrictions operate through AUP category enforcement rather than user count
🌬️ Mistral 7B (Apache-2.0)
None
Apache-2.0 applies no scale limits of any kind; the product can grow to any size without the licence terms changing
⚡ NVIDIA OML
None stated
No explicit MAU trigger; AUP restrictions and scope limitations apply regardless of scale
Competitor use ban
🦙 Llama 3
Banned
Using Llama 3 to train or fine-tune a competing LLM is explicitly prohibited; foundation model development on Llama 3 is restricted
💎 Gemma
Banned
Broad language prohibiting use that "negatively affects Google"; training competitive models and building Google-competing AI products is restricted
🌬️ Mistral 7B (Apache-2.0)
None
Apache-2.0 cannot discriminate against any field of endeavour; competing directly with Mistral AI using Mistral 7B is fully permitted
⚡ NVIDIA OML
Banned
One of the clearest competitor ban formulations: explicitly prohibits use to develop competing AI software, frameworks, or hardware tools
API / inference reselling
🦙 Llama 3
Ambiguous
Not explicitly prohibited; Meta's usage policy governs whether inference services constitute permitted use — material legal risk for pass-through API products
💎 Gemma
Prohibited
ToU explicitly bans "providing the model as a service or API to third parties" without Google's prior consent — the clearest API-reselling prohibition of any major open-weight licence
🌬️ Mistral 7B (Apache-2.0)
Permitted
Apache-2.0 explicitly permits redistribution in any form; building a commercial inference API on Mistral 7B is fully permitted without restriction
⚡ NVIDIA OML
Restricted
Prohibits redistributing the model or its outputs as the primary basis of a commercial service; inference products where model output is the core deliverable are banned
SaaS product deployment
🦙 Llama 3
Permitted
SaaS products using Llama 3 as internal infrastructure are permitted below the 700M MAU threshold, provided the use case does not fall within AUP-prohibited categories
💎 Gemma
Conditional
Permitted for non-prohibited use cases; AUP restrictions on harm categories apply; cannot build Google-competing SaaS products on Gemma
🌬️ Mistral 7B (Apache-2.0)
Fully permitted
No restrictions on SaaS deployment; any product category, any customer type, any scale — Apache-2.0 applies no deployment constraints
⚡ NVIDIA OML
Conditional
Permitted for non-competing, non-prohibited use cases; products must not build AI tooling that competes with NVIDIA's commercial offerings
Usage-based / per-token pricing
🦙 Llama 3
Conditional
Permitted where the pricing unit reflects genuine product value rather than raw model consumption; per-token billing that mirrors API reselling creates material licence risk
💎 Gemma
High risk
Per-token or per-call pricing on Gemma inference approaches the API reselling prohibition directly; high risk of falling within the ToU ban on providing Gemma as a service
🌬️ Mistral 7B (Apache-2.0)
Fully permitted
No pricing restrictions under Apache-2.0; per-token, per-call, or any other pricing unit is commercially permissible without licence risk
⚡ NVIDIA OML
Conditional
Depends on whether a meaningful product value-add layer exists; pricing that passes model output cost through to customers approaches the reselling restriction
Licence change risk
🦙 Llama 3
High
Custom licence — Meta retains the right to amend terms; products built on Llama 3 are exposed to future restriction changes without contractual protection
💎 Gemma
High
Custom ToU — Google can update terms unilaterally; continued use after an update constitutes acceptance of new terms under the Gemma Terms of Use
🌬️ Mistral 7B (Apache-2.0)
None
Apache-2.0 is irrevocable; rights granted cannot be withdrawn retroactively; products built on Mistral 7B carry no licence change risk
⚡ NVIDIA OML
High
Custom licence — NVIDIA retains change and termination rights; OML terms have already evolved across model generations

Section 2 — Selling Your Feature vs Reselling the Model

The single most important commercial distinction in open LLM licensing is the difference between building a product that uses a model and reselling the model itself. Licence restrictions on API reselling and model redistribution do not apply to every business that makes money using an open LLM — they apply to a specific type of monetization where the model, rather than the value the model enables, is the thing being sold. Understanding where the line falls is not a legal technicality. It is the difference between a product that is freely commercialisable and one that requires a separate agreement with the model provider.

The line is not always obvious in practice. A document summarisation tool that charges per document is clearly selling a product feature — the model is the engine, not the product. An inference API that exposes raw model completions and charges per token is clearly reselling model access. Between these two extremes sits a large grey zone: usage-based SaaS products, AI platform tooling, embedded AI APIs sold to developers, and multi-model routing services where the underlying model is material to the customer proposition.
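The two poles of that spectrum can be sketched in code. The sketch below is purely illustrative: `call_model` is a hypothetical stand-in for any open-LLM inference call, and `summarise_contract` is an invented example product function, not a real API.

```python
def call_model(prompt: str) -> str:
    """Hypothetical placeholder for an open-weight model inference call."""
    return f"[model output for: {prompt[:40]}]"


# Reselling-shaped: the customer's prompt goes to the model unchanged and the
# raw completion comes straight back. The product layer adds nothing; the
# model itself is what is being sold.
def passthrough_endpoint(customer_prompt: str) -> str:
    return call_model(customer_prompt)


# Feature-shaped: the product owns the prompt, injects domain context, and
# returns a structured product artifact rather than a raw completion.
def summarise_contract(document: str, clause_library: list[str]) -> dict:
    # Domain logic the model alone does not provide: clause matching.
    relevant = [c for c in clause_library if c.lower() in document.lower()]
    prompt = (
        "Summarise the following contract, flagging these clause types: "
        + ", ".join(relevant) + "\n\n" + document
    )
    raw = call_model(prompt)
    # The customer receives the product's output unit (a structured
    # summary billed per document), not the model's completion.
    return {
        "summary": raw,
        "flagged_clauses": relevant,
        "billing_unit": "per_document",
    }
```

The grey-zone products in between are the ones where the product layer exists but is thin enough that the economics still mirror the first function.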

Permitted vs Prohibited: The Core Commercial Distinction

Generally permitted — selling your feature

You add value on top of the model

Vertical SaaS: a product that uses a model to provide domain-specific output (legal drafting, code review, financial analysis) charges for the workflow outcome — the model is infrastructure, not the product
AI-augmented features: adding AI-powered features (smart search, auto-tagging, summarisation) to an existing product and charging for the overall product rather than for the AI feature specifically
Fine-tuned specialists: training a domain-specific model on proprietary data and selling the resulting capability as a product — provided the base model licence permits derivative distribution
Usage-based pricing on product outcomes: charging per document processed, per report generated, or per task completed — where the pricing unit is the product output, not the underlying model's compute unit
Internal tooling sold externally: AI tools built for internal use that are then packaged and sold to other companies — the product is the tool, its integration, and its workflow, not model access
Generally restricted — reselling the model

The model itself is what customers pay for

Raw inference API: an API endpoint that accepts prompts and returns model completions, with no meaningful transformation, context injection, or product logic between the customer and the model
Per-token billing that mirrors API costs: pricing that is directly proportional to model token consumption, passed through to customers with a margin — the economic structure of reselling even when the legal structure claims otherwise
Model-as-a-service with customer model choice: platforms that give customers a menu of underlying models (Llama 3, Gemma, Mistral) and bill for access — the model selection is the product value proposition
Weight redistribution: distributing model weights (with or without modification) as a commercial product, or charging for access to fine-tuned weights derived from restricted base models
Inference hosting services: hosting open-weight models and selling inference access to customers where the primary value is avoiding the customer's need to run the model themselves — compute + model access bundled

The Five-Question Test — Which Side of the Line Are You On?

Apply all five questions — risk accumulates with each "reselling" answer
1
What is the customer actually paying for — a product outcome, or model access?
Feature signal: customer pays for a completed task, a workflow outcome, or access to a product that uses AI to deliver its core function
Reselling signal: customer pays for the ability to send prompts and receive model responses — the model is the service being delivered
2
Does the product add substantial logic, data, or transformation between the customer and the model?
Feature signal: the product has its own prompt engineering, retrieval systems, output validation, domain logic, or post-processing that creates value the model alone does not provide
Reselling signal: the customer's input goes to the model with minimal transformation; the model's output comes back with minimal processing; the product layer is thin routing infrastructure
3
Is the pricing unit the model's compute unit (tokens, API calls) or the product's output unit?
Feature signal: pricing is per document, per report, per user seat, per month — units that reflect the product's value delivery rather than the model's resource consumption
Reselling signal: pricing is per token, per API call, or per completion — the same unit the model provider uses; the customer's cost scales directly with model API consumption
4
Would the customer notice or care if you swapped the underlying model?
Feature signal: the customer buys the product outcome; the underlying model is an implementation detail that the customer does not interact with directly and would not necessarily notice if changed
Reselling signal: the specific model (Llama 3, Gemma, Mistral) is material to the customer's purchase decision; customers select or are billed based on model choice; model identity is part of the product proposition
5
Does your product have meaningful standalone value if the underlying model is removed?
Feature signal: the product's data integrations, workflow logic, domain knowledge, UX, and customer relationships represent value that would survive a change of underlying model provider
Reselling signal: the product is essentially a wrapper around the model; without the model, there is no product; the value delivered to customers is entirely the model's capability, accessed through the product's interface
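As a rough triage aid, the five questions above can be treated as an accumulating score. The sketch below is an illustrative heuristic only, with invented question wording and thresholds; it is a self-assessment prompt, not a legal determination.

```python
# Each entry paraphrases one of the five questions; answering True means
# the "reselling signal" description fits better than the "feature signal".
QUESTIONS = [
    "Customer pays for model access rather than a product outcome",
    "Product layer is thin routing with minimal transformation",
    "Pricing unit is tokens/API calls rather than product outputs",
    "Specific model identity is material to the purchase decision",
    "Product has no standalone value without the model",
]


def reselling_risk(answers: list[bool]) -> str:
    """Risk accumulates with each 'reselling' answer (illustrative bands)."""
    score = sum(answers)
    if score == 0:
        return "feature side: low licence risk"
    if score <= 2:
        return "grey zone: review pricing structure and product layer"
    return "reselling side: high licence risk, obtain review before launch"
```

A product scoring in the grey zone is exactly the case where the usage-pricing analysis in the next section matters most.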

Usage-Based Pricing — When It Is Fine and When It Is Not

Generally safe
Outcome-based usage pricing
Per document processed — the billing unit is the product workflow unit, not the model's token unit
Per user seat — flat recurring billing for access to the product regardless of model consumption volume
Per result generated — billing for a completed output (report, analysis, classification) rather than the underlying compute
Tiered feature access — higher tiers unlock more sophisticated features, not more raw model calls; the pricing reflects product capability
Requires review
Hybrid usage pricing
Credit-based systems — where credits map closely to API calls or token consumption; the mapping may constitute implicit API reselling
Metered API billing — charging developers per API call to the product's own endpoint where those calls map 1:1 to underlying model calls
Volume discounts tied to token use — if pricing structure is transparently based on model token consumption even if labelled as product credits
Developer API products — where developers build their own products on your API and your value proposition is primarily model access with light orchestration
High licence risk
Model-mirroring pricing
Per-token billing — pricing directly denominated in model tokens, identical to how the model provider bills its own API customers
Per-completion billing — charging per raw model API response with a margin, no product logic between request and response
Model selection as premium tier — charging more for access to more capable models; the model identity is the product differentiator
Inference cost pass-through — billing customers based on your actual model API costs plus a fixed margin, transparently or implicitly mirroring the model provider's pricing
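The economic signature that separates the safe and high-risk ends of this spectrum is easy to show with arithmetic. All numbers below are hypothetical; the point is the shape of the pricing function, not the rates.

```python
MODEL_COST_PER_1K_TOKENS = 0.002  # hypothetical inference cost


def passthrough_price(tokens_used: int, margin: float = 0.3) -> float:
    """Reselling-shaped: customer price scales 1:1 with model consumption."""
    return round(tokens_used / 1000 * MODEL_COST_PER_1K_TOKENS * (1 + margin), 6)


def per_document_price(documents: int, price_per_doc: float = 1.50) -> float:
    """Feature-shaped: the pricing unit is the product's own output unit."""
    return round(documents * price_per_doc, 2)
```

Doubling the customer's token consumption exactly doubles the pass-through price, which is the model provider's own billing curve with a margin on top. Per-document pricing is decoupled from token consumption entirely, which is why it sits on the "generally safe" side regardless of how many tokens each document happens to burn.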

Section 3 — Business Models That Work Well on Open LLMs

Commercial success on the Gemma, Llama 3, and Mistral licence families is not theoretical — there are well-established archetypes of products that fit cleanly within the licence boundaries while generating defensible revenue. The pattern that runs through every commercially viable open-LLM business model is the same: the model is infrastructure, and the product is built on top. The revenue is derived from what the product does, not from access to the model itself.

The models best suited for commercial products without licence complications are Apache-2.0 licensed models — specifically Mistral 7B and its derivatives — because Apache-2.0 places no commercial restrictions on use, redistribution, or monetization. For products that need higher capability and can work within their licence restrictions, Llama 3 and Gemma are viable for most SaaS and vertical application use cases below the specific restriction thresholds. The key is matching the business model archetype to the model licence's actual constraints.

What Each Model Is Suited For Commercially

🌬️
Mistral 7B (Apache-2.0)
Lowest commercial restriction of any capable open LLM
Best for: any product where legal simplicity is commercially material — regulated sectors, enterprise sales with procurement scrutiny, products approaching fundraising or M&A where IP chain must be clean
Inference APIs: the only major open-weight model family where building an inference API business is commercially straightforward — Apache-2.0 permits redistribution without restriction
Developer tooling: tools sold to developers that surface the model's capability as part of the product — Apache-2.0 permits this without competitor-ban or reselling concerns
Capability trade-off: Mistral 7B is capable for many production tasks but trails Llama 3 70B or Gemma Pro in capability; products requiring frontier-level performance may need to accept the licence restrictions of higher-capability models
🦙
Llama 3
Community licence — restrictions apply above 700M MAU
Best for: vertically focused SaaS products that need strong general capability, are not competing with Meta's AI offerings, and have a clearly defined product layer between customers and the model
Fine-tuned vertical models: building domain-specific models by fine-tuning Llama 3 on proprietary data and offering the resulting capability as a product is straightforwardly permitted at most commercial scales
Enterprise applications: internal productivity tools, code assistants, knowledge management systems — products where the customer never interacts with the model directly and the product's workflow logic creates the value
Avoid: using Llama 3 to build a competing foundation model or AI platform; products with ambitions to reach the 700M MAU threshold without a Meta commercial licence agreement in place
💎
Gemma
Google ToU — API reselling explicitly restricted
Best for: applications in the Google ecosystem, products that pair Gemma with Google Cloud infrastructure, and verticals where Google's investment in safety fine-tuning is commercially relevant to customers
Embedded AI features: Gemma's efficient smaller variants (2B, 7B) are well-suited for on-device or edge deployment where the model runs locally within a product rather than as an external API
Research and education products: the ToU explicitly contemplates research and educational use; products in these verticals have a cleaner licence posture than general commercial products
Avoid: building an inference API or model-access service on Gemma; products whose revenue model could be characterised as providing Gemma as a service to third parties without Google consent

Four Business Model Archetypes That Work

🏢
Vertical SaaS with AI-powered workflow
The dominant commercially safe archetype
Structure: the product is a domain-specific workflow tool — legal contract review, medical documentation, financial analysis, code review — that uses an open LLM to automate or augment the core task; the model is never exposed to customers directly
Revenue model: per-seat SaaS subscription, per-document pricing, or per-organisation licensing — pricing tied to the product's workflow value, not to underlying model consumption
Preferred models: Llama 3 (capability), Mistral 7B (IP simplicity), Gemma (Google ecosystem); model is chosen for task fit and licence compatibility with the specific use case
Why it works: the product's domain knowledge, integrations, training data, and workflow logic create value that is clearly distinct from the underlying model capability; the licence restriction on reselling does not reach products with genuine value-add layers
🔬
Fine-tuned specialist model as a product
Proprietary data creates the moat
Structure: take an open-weight base model (ideally Mistral 7B on Apache-2.0 for maximum IP clarity), fine-tune it on proprietary domain data, and offer the resulting specialised model capability as a product to customers in that domain
Revenue model: subscription to the specialised AI capability (not to "the model"); per-outcome pricing for tasks the specialist model performs — the moat is the fine-tuning dataset, not the base model
IP considerations: fine-tuned weights derived from Apache-2.0 base models are commercially cleanest; fine-tunes on Llama 3 or Gemma inherit their licence restrictions and must not be distributed or monetised in ways those base licences prohibit
Why it works: the proprietary fine-tuning dataset is the commercial IP — customers pay for access to the specialised capability that dataset creates, not for the base model itself; clearly on the "feature" side of the reselling line
🖥️
AI-augmented internal tooling sold externally
Internal use case → commercial product
Structure: tools originally built for internal use — customer support automation, internal knowledge bases, code assistants, document management — packaged and sold to other companies facing the same operational problem
Revenue model: per-organisation licensing, per-seat subscription, or per-deployment pricing; the product is the packaged tool and its implementation, not the AI model
Compliance advantage: these products have organic product logic because they were built to solve a real internal problem; the value-add layer is genuine, which creates a robust answer to any "are you just reselling the model?" question
Why it works: the product was never designed as a model access layer; the model is infrastructure within a tool that has its own integrations, workflows, and institutional knowledge baked in
📱
Consumer or SMB applications with embedded AI
AI as a feature, not the product
Structure: consumer or SMB-facing products — writing tools, productivity apps, research assistants, creative tools — where AI capability is one feature among several and the brand, UX, and distribution are the commercial assets
Revenue model: freemium to paid subscription; premium features may include more AI usage (higher limits) but the pricing unit is the subscription tier, not the AI consumption unit
Model considerations: consumer products typically prioritise on-device capability (Gemma 2B efficient variant, Mistral 7B quantised) or cost efficiency; the Apache-2.0 licence advantage of Mistral 7B is commercially significant at consumer scale
Why it works: the product's brand, distribution, and user experience are the business; the AI model is one component of the product — the same way a database is infrastructure, not the product

Why Apache-2.0 Models Are the Default Choice for Maximum Commercial Freedom

Apache-2.0 commercial advantages vs custom model licences
What Apache-2.0 permits that custom licences restrict
Redistribution without restriction: model weights and derivatives can be redistributed commercially — including as part of products sold to customers — without approval from the model creator
All fields of endeavour: Apache-2.0 cannot discriminate against use cases; building a product that competes with the model creator is fully permitted — no competitor ban
Irrevocability: Apache-2.0 rights cannot be revoked retroactively; a product built on Mistral 7B today cannot have its licence pulled by Mistral AI tomorrow — the rights are permanent
No MAU threshold: there is no scale-based commercial licence trigger; a product can grow to any size without the licence terms changing or requiring a new agreement
Where Apache-2.0 models may require trade-offs
Capability ceiling: Apache-2.0 models currently available (Mistral 7B, early Falcon) trail the capability of Llama 3 70B or Gemma Pro on complex reasoning and instruction-following tasks; capability trade-offs must be assessed per use case
Safety fine-tuning: Apache-2.0 base models typically have less safety fine-tuning than Gemma or instruction-tuned Llama 3 variants; products in regulated or sensitive sectors may need to invest in safety fine-tuning on top of the Apache-2.0 base
Attribution requirement: Apache-2.0 requires attribution notices to be preserved in derivative works; for commercial products, this is a documentation obligation rather than a commercial restriction, but it must be handled correctly
Patent grant limits: Apache-2.0 includes a patent licence from contributors, but where model training involves third-party IP claims, its permissive terms do not resolve the underlying training-data copyright questions

Section 4 — Business Models That Hit Licence Walls

Not every business model that involves an open LLM is within the licence boundaries. Several commercially attractive models — inference-as-a-service, raw API reselling, usage-based pass-through pricing, and competitor-facing tooling — run directly into the restrictions that Gemma, Llama 3, and NVIDIA's OML apply to commercial use. Understanding which models hit these walls, and what the structural alternatives are, is commercially necessary before committing to a product and revenue architecture.

Four Scenarios Where Licence Walls Appear

🚧
Inference API / model-as-a-service
Offering raw model completions as the commercial product
Licence conflict: Gemma's ToU explicitly prohibits "providing the model as a service or API to third parties" without Google's consent. NVIDIA OML prohibits inference services where model output is the primary product. Llama 3's usage policy creates ambiguity for pass-through inference services, even if it does not use identical language. Building a hosted inference API on any of these models creates direct licence exposure.
Structural alternative: Use Apache-2.0 licensed models (Mistral 7B) for inference API products — Apache-2.0 explicitly permits this. For products that require higher-capability models, add a genuine product layer (retrieval, orchestration, domain fine-tuning) that transforms the inference product into a feature-selling product rather than a model-access product.
🚧
AI platform with model selection
Offering customers a choice of underlying open models
Licence conflict: A platform that offers customers a menu of open-weight models (Llama 3, Gemma, Mistral) and charges for access to their inference is providing model access as the product. Under Gemma's ToU, this constitutes providing the model as a service. Under NVIDIA OML, model selection as the product value proposition falls within the reselling restriction. The multi-model wrapper does not escape the individual model licence restrictions.
Structural alternative: Replace model selection as the product proposition with model orchestration as an internal capability. The platform's value should be in what it does with the models — routing, aggregation, quality scoring, domain adaptation — not in giving customers access to choose the model. Apache-2.0 models can be included in the platform without restriction.
🚧
Token-based pricing on model completions
Usage-based pricing mirroring the model provider's billing
Licence conflict: Pricing that is directly denominated in model tokens — or in a credit unit that maps transparently to model token consumption — creates the economic structure of API reselling even when the legal wrapper claims otherwise. Under Gemma's API reselling prohibition, this pricing structure is high-risk. For Llama 3, per-token pricing that creates a pass-through revenue model approaches the reselling line even if it does not cross it unambiguously.
Structural alternative: Reprice on product output units (documents, tasks, analyses) rather than model consumption units. If per-usage pricing is commercially necessary, ensure the pricing unit reflects genuine product value rather than underlying compute. Apache-2.0 models (Mistral 7B) permit per-token pricing without restriction — use them for products where this pricing structure is core to the business model.
🚧
Competitor-facing AI tooling
Using restricted models to build tools for model providers' competitors
Licence conflict: Llama 3 prohibits using the model to train other LLMs that would compete with Meta's AI products. Gemma prohibits use that "negatively affects Google." NVIDIA OML explicitly bans building competing AI software. A product that uses these models to build AI infrastructure — model training pipelines, AI development platforms, foundation model evaluation tools — sold to companies competing with these providers runs directly into the competitor ban clauses.
Structural alternative: Use Apache-2.0 models for any product sold into the AI development stack, or for any product whose customers include companies that compete with Llama, Gemma, or NVIDIA in AI. Apache-2.0's non-discrimination clause makes all fields of endeavour and all customer types permissible without the competitor ban risk.
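The repricing guidance in the token-based pricing scenario can be sketched in code. This is an illustrative sketch only — the function names, the per-document price, and the internal cost figure are hypothetical assumptions, not recommendations:

```python
# Illustrative sketch: contrasting the two pricing structures discussed above.
# All names, rates, and cost figures are hypothetical assumptions.

ASSUMED_TOKEN_COST_PER_1K = 0.0002  # hypothetical internal inference cost (USD)


def token_passthrough_price(tokens_used: int, markup: float = 1.5) -> float:
    """Pricing denominated in model tokens.

    The revenue unit maps transparently to model consumption -- the economic
    structure identified above as high-risk API reselling under restrictive
    licences such as Gemma's ToU.
    """
    return (tokens_used / 1000) * ASSUMED_TOKEN_COST_PER_1K * markup


def product_unit_price(documents_analysed: int,
                       price_per_document: float = 0.50) -> float:
    """Pricing denominated in product output units (documents analysed).

    The revenue unit reflects product value delivered, not underlying
    compute -- the structural alternative recommended above.
    """
    return documents_analysed * price_per_document
```

The point is structural: the second function's pricing unit is independent of the model's token consumption, so swapping the underlying model — or a change in its licence terms — does not alter the customer-facing revenue architecture.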

Licence Wall Severity by Business Model and Model

Pre-Launch Monetization Compliance Checklist

Identify model licence type — Apache-2.0, Llama 3 Community, Gemma ToU, or NVIDIA OML; confirm the commercial restrictions that apply to your specific version and variant
Map revenue model to licence — apply the five-question test; confirm pricing unit is a product output unit, not a model consumption unit, for any model with API reselling restrictions
Check competitor ban scope — confirm your product does not build AI infrastructure sold to model provider competitors; if it does, use Apache-2.0 models only
Verify AUP / use-case alignment — map all product use cases against the model's prohibited use categories; confirm no use case sits within or adjacent to a restriction category
Assess MAU trajectory for Llama 3 — if using Llama 3, model the timeline to 700M MAU and confirm a Meta commercial licence agreement is in the product roadmap if enterprise scale is planned
Review ToS flow-down requirements — confirm the product's terms of service incorporate required model licence restrictions and prohibit customers from using the product for model-prohibited purposes
Document the product value-add layer — for any non-Apache-2.0 model, create and maintain documentation that articulates why the product is a feature product rather than a model reselling product; this is the primary defence in any licence compliance review
Snapshot the licence at adoption — store a timestamped copy of the model licence, AUP, and any additional terms in effect at the point of first commercial deployment; this record is required for fundraising, M&A due diligence, and licence change disputes
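The final checklist item — snapshotting the licence at adoption — can be sketched as a small archiving routine. This is a minimal illustration, assuming a JSON record is an acceptable archive format; the field names and storage approach are hypothetical:

```python
# Illustrative sketch of the "snapshot the licence at adoption" checklist item.
# Field names and the JSON storage format are hypothetical assumptions.
import hashlib
import json
from datetime import datetime, timezone


def snapshot_licence(model_name: str, licence_text: str) -> str:
    """Return a timestamped, hash-verified JSON record of the licence text
    in effect at first commercial deployment.

    The SHA-256 digest lets a later reviewer (fundraising, M&A due diligence,
    or a licence change dispute) confirm the archived text is unaltered.
    """
    record = {
        "model": model_name,
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "licence_sha256": hashlib.sha256(
            licence_text.encode("utf-8")).hexdigest(),
        "licence_text": licence_text,
    }
    return json.dumps(record, indent=2)
```

Storing the returned record alongside the product's IP documentation — and repeating the snapshot whenever the model or its terms change — gives the timestamped evidence trail the checklist calls for.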
Conclusion

Open LLMs Are Commercially Viable — With the Right Architecture

The licence determines the business model, not the reverse. Building a revenue model first and reviewing the underlying model licence later is the structural error that creates legal exposure. The model licence must be assessed before the product architecture is committed — changing the business model after the product is in production is far more costly than choosing the right model at the start.
Apache-2.0 is the default for commercial freedom. Mistral 7B and other Apache-2.0 licensed models are the commercially unambiguous choice for products where legal simplicity is material — enterprise sales, fundraising, API-centric revenue models, or competitive use cases. The capability trade-off is real but manageable for a large category of production use cases.
Selling a feature is always safer than reselling the model. The single most reliable protection against licence wall exposure is building a product with genuine value-add between the customer and the model. The five-question test is the tool for diagnosing whether your current architecture is on the right side of the line — and for documenting why, when it matters.
Licence risk accumulates silently. Products built on non-compliant revenue models do not usually fail immediately — they create exposure that surfaces at the worst commercial moments: fundraising, M&A due diligence, enterprise procurement review, or a model provider audit. The cost of addressing licence risk proactively is a fraction of the cost of addressing it reactively.

The choice of which open LLM to build on is not just a technical decision — it is an IP structuring decision that affects the product's commercial architecture, enterprise sales posture, and investor-facing risk profile. Products that treat the model licence as infrastructure — as fundamental as their choice of cloud provider or database — avoid the compliance surprises that follow founders who treat it as a legal formality. For guidance on how model licence choice intersects with AI IP ownership and investment readiness, see AI IP Ownership — wcr.legal.

Oleg Prosin is the Managing Partner at WCR Legal, focusing on international business structuring, regulatory frameworks for FinTech companies, digital assets, and licensing regimes across various jurisdictions. He works with founders and investment firms on compliance, operating models, and cross-border expansion strategies.