Can You Monetize on “Open” LLMs? A Practical Breakdown

AI Licensing · Commercial Use

Gemma, Llama 3, and Mistral are described as "open" — but open does not mean commercially unrestricted. The gap between what the licence permits and what a business wants to do is where legal exposure begins. This post maps the restrictions, clarifies the feature-vs-reselling line, and identifies the business models that work and those that do not.


Introduction — The Permission Gap in "Open" LLMs

The word "open" in open-source AI carries a specific technical meaning — the model weights are publicly available. It does not mean the model is free to commercialise in any way you choose. Every major open LLM — Llama 3, Gemma, Mistral's commercial releases, and NVIDIA's OML-licensed models — comes with a licence that defines what commercial use is permitted, what is restricted, and what is outright prohibited. The gap between what founders assume they can do and what the licence actually permits is where legal exposure silently accumulates.

This is not a theoretical concern. Licence restrictions on open LLMs have already produced measurable commercial consequences: products redesigned at scale after legal review, enterprise deals stalled over procurement questions about the underlying model, and startup valuations impacted by IP chain uncertainty at fundraising. Understanding the licence before building the revenue model — not after the product is in production — is the commercially rational approach.

Three Assumptions That Get Founders Into Trouble

⚠️
Common assumption: "Open weights mean I can build any commercial product"
Reality

Open weights grant access to the model — they do not grant unlimited commercial rights. Llama 3 prohibits certain use cases entirely, applies MAU thresholds above which a commercial licence is required, and restricts using the model to build products that compete with Meta. Gemma's licence similarly bans specific harm categories and usage types regardless of whether the weights are publicly available.

⚠️
Common assumption: "I'm building a SaaS product, so no model licence restrictions apply to me"
Reality

SaaS deployment does not insulate you from model licence restrictions — it changes how the restrictions apply. The question is not whether you are a SaaS product, but what your SaaS product does: whether it builds on the model to create distinct value, or whether the model itself is the product being sold. The licence restrictions run to the use case, not the delivery mechanism.

⚠️
Common assumption: "Usage-based pricing based on model calls is just standard SaaS pricing"
Reality

Pricing that is directly proportional to model API calls — with no meaningful value-added layer between the model and the customer — can constitute API reselling regardless of what you call it in your terms of service. Several model licences prohibit "selling access to" the model as a distinct commercial act. Whether your pricing structure crosses that line depends on whether there is genuine product value added, not on the pricing label you use.

How Licence Restrictions Flow Through the Product Stack

The obligation cascade — from model provider to end customer
Model provider licence (Llama 3 / Gemma / Mistral / Apache-2.0)

Sets the outer boundary: what use cases are permitted, what scale triggers additional obligations, what downstream uses are prohibited, and what the product layer must flow down to customers

AI product layer (your SaaS / API / application)

Must stay within the model licence boundary on use cases and must flow down required restrictions to customers in the product's own terms of service — the product cannot grant customers rights the model provider has not granted the product

Customer / end user

Bound by both the product's ToS and — through flow-down clauses — by the underlying model's acceptable use policy; customer activities that violate the model licence create upstream liability for the product layer even when the product's own terms do not explicitly address the behaviour

Four Variables That Determine Your Licence Exposure

1
Which model you are using

The licence type — Llama 3 Community Licence, Gemma ToU, Apache-2.0, NVIDIA OML, Mistral Research Licence vs commercial — determines the baseline restrictions. Apache-2.0 models carry the fewest restrictions; custom model licences carry the most. This is the single most important variable in the commercial risk assessment.

2
How you deploy it

Self-hosted open-weight models used internally create different obligations than models accessed via API or redistributed as part of a product. Deployment method affects sub-processor obligations, data processing agreements, and whether distribution-specific licence clauses are triggered.

3
What your revenue model is

Revenue models that monetise the model directly — per-token billing, inference API reselling, model-as-a-service — face higher licence scrutiny than revenue models that monetise a product outcome the model enables. The closer the pricing unit is to the model's own output unit, the more carefully the licence must be reviewed.

4
Who your customers are

Certain model licences restrict use in competitive contexts — serving customers who compete with the model provider, or enabling applications in sectors the model provider has identified as prohibited. Enterprise customers in regulated sectors also trigger additional obligations around data processing and model governance that affect the commercial relationship.

⚖️

Structuring context: Model licence choice affects not just product compliance but the IP structure of the company itself — including what is defensible at fundraising and how AI assets are valued in M&A. For the framework on how open model licence obligations interact with AI IP ownership and investment structuring, see AI IP Ownership — wcr.legal.

Section 1 — Monetization Restrictions Across Open LLMs

Every major open LLM applies commercial restrictions that fall into three categories: scale-based thresholds that trigger additional obligations above certain user counts, competitor bans that prohibit using the model to build products that compete with the model provider, and reselling prohibitions that restrict monetising the model or its weights directly. These restrictions are not identical across models — the specific terms, thresholds, and prohibitions vary materially between Llama 3, Gemma, NVIDIA's OML, and Apache-2.0 licensed models like Mistral 7B.

The Three Restriction Categories That Matter for Monetization

📊
Scale / MAU thresholds
Llama 3: products with more than 700 million monthly active users must obtain a separate commercial licence from Meta; below this threshold, the community licence applies
What triggers the threshold: MAU is counted across all products using the Llama 3 model; a company with multiple products using Llama 3 aggregates MAU across all of them for threshold purposes
Practical impact: at current scale levels, the 700M MAU threshold is not immediately relevant for most startups; however, the threshold creates a contractual dependency at a future point that must be disclosed in fundraising and M&A documentation
Gemma and Mistral: neither model applies a formal MAU threshold; scale-based restrictions under these licences operate through AUP category enforcement rather than user count triggers
🚫
Competitor / use-case bans
Llama 3: prohibits using the model to train or fine-tune another large language model that is intended to compete with Meta's AI products; this restricts foundation model development on top of Llama 3
Gemma: Google's terms prohibit use to "create content that may negatively affect Google" and restrict training competitive models; the competitor clause is broader in language, though less precisely defined
NVIDIA OML: explicitly prohibits use to develop competing AI software, frameworks, or hardware tools — one of the clearest competitor ban formulations among open-weight licences
Apache-2.0 (Mistral 7B): no competitor ban; Apache-2.0 does not permit discrimination against fields of endeavour, so competitive use is explicitly allowed under the licence terms
💰
Reselling / API monetization bans
Llama 3: the licence does not explicitly prohibit offering inference services, but does prohibit use in violation of Meta's usage policies; whether API reselling is permitted depends on whether it constitutes "use" under the licence or requires a separate agreement
Gemma: the Terms of Use prohibit "providing the model as a service or API to third parties" without Google's consent — this is one of the clearest API-reselling prohibitions in a major open-weight licence
NVIDIA OML: prohibits redistributing the model or its outputs as the primary basis of a commercial service; inference services that pass model outputs as the core product are restricted
Apache-2.0 (Mistral 7B): no reselling restriction; redistribution and commercial API deployment are explicitly permitted under Apache-2.0 terms

Licence Profile by Model — Commercial Restriction Overview

🦙
Llama 3
Meta — Llama 3 Community Licence
MAU threshold
700M — commercial licence required above
Competitor ban
Yes — competing LLM training prohibited
API reselling
Ambiguous — not explicitly prohibited, but usage policy governs
Fine-tune distribution
Permitted under community licence with attribution
SaaS product use
Permitted below MAU threshold
Licence revocability
Custom licence — revocable by Meta
💎
Gemma
Google — Gemma Terms of Use
MAU threshold
None — AUP-based restrictions only
Competitor ban
Yes — broad language covering Google-competing use
API reselling
Prohibited without Google's consent
Fine-tune distribution
Permitted; fine-tune inherits Gemma ToU
SaaS product use
Permitted for non-prohibited use cases
Licence revocability
Custom ToU — revocable by Google
🌬️
Mistral 7B
Mistral AI — Apache-2.0
MAU threshold
None — Apache-2.0 has no scale limit
Competitor ban
None — all fields of endeavour permitted
API reselling
Permitted — redistribution allowed under Apache-2.0
Fine-tune distribution
Permitted; Apache-2.0 applies to derivative works
SaaS product use
Permitted without restriction
Licence revocability
Irrevocable — Apache-2.0 grants permanent rights
NVIDIA Open Model Licence
NVIDIA — OML (custom)
MAU threshold
None stated — AUP and scope restrictions apply
Competitor ban
Yes — explicitly prohibits competing AI/software products
API reselling
Restricted — model output as primary product prohibited
Fine-tune distribution
Restricted — derivative works governed by OML
SaaS product use
Permitted for non-prohibited, non-competitive use cases
Licence revocability
Custom — NVIDIA retains change and termination rights

Restriction Comparison — Key Monetization Variables

MAU / scale threshold
🦙 Llama 3
700M MAU trigger
Commercial licence from Meta required above 700 million monthly active users; counts aggregate across all products using Llama 3
💎 Gemma
None
No formal MAU threshold; scale-based restrictions operate through AUP category enforcement rather than user count
🌬️ Mistral 7B (Apache-2.0)
None
Apache-2.0 applies no scale limits of any kind; the product can grow to any size without the licence terms changing
⚡ NVIDIA OML
None stated
No explicit MAU trigger; AUP restrictions and scope limitations apply regardless of scale
Competitor use ban
🦙 Llama 3
Banned
Using Llama 3 to train or fine-tune a competing LLM is explicitly prohibited; foundation model development on Llama 3 is restricted
💎 Gemma
Banned
Broad language prohibiting use that "negatively affects Google"; training competitive models and building Google-competing AI products is restricted
🌬️ Mistral 7B (Apache-2.0)
None
Apache-2.0 cannot discriminate against any field of endeavour; competing directly with Mistral AI using Mistral 7B is fully permitted
⚡ NVIDIA OML
Banned
One of the clearest competitor ban formulations: explicitly prohibits use to develop competing AI software, frameworks, or hardware tools
API / inference reselling
🦙 Llama 3
Ambiguous
Not explicitly prohibited; Meta's usage policy governs whether inference services constitute permitted use — material legal risk for pass-through API products
💎 Gemma
Prohibited
ToU explicitly bans "providing the model as a service or API to third parties" without Google's prior consent — the clearest API-reselling prohibition of any major open-weight licence
🌬️ Mistral 7B (Apache-2.0)
Permitted
Apache-2.0 explicitly permits redistribution in any form; building a commercial inference API on Mistral 7B is fully permitted without restriction
⚡ NVIDIA OML
Restricted
Prohibits redistributing the model or its outputs as the primary basis of a commercial service; inference products where model output is the core deliverable are banned
SaaS product deployment
🦙 Llama 3
Permitted
SaaS products using Llama 3 as internal infrastructure are permitted below the 700M MAU threshold, provided the use case does not fall within AUP-prohibited categories
💎 Gemma
Conditional
Permitted for non-prohibited use cases; AUP restrictions on harm categories apply; cannot build Google-competing SaaS products on Gemma
🌬️ Mistral 7B (Apache-2.0)
Fully permitted
No restrictions on SaaS deployment; any product category, any customer type, any scale — Apache-2.0 applies no deployment constraints
⚡ NVIDIA OML
Conditional
Permitted for non-competing, non-prohibited use cases; products must not build AI tooling that competes with NVIDIA's commercial offerings
Usage-based / per-token pricing
🦙 Llama 3
Conditional
Permitted where the pricing unit reflects genuine product value rather than raw model consumption; per-token billing that mirrors API reselling creates material licence risk
💎 Gemma
High risk
Per-token or per-call pricing on Gemma inference approaches the API reselling prohibition directly; high risk of falling within the ToU ban on providing Gemma as a service
🌬️ Mistral 7B (Apache-2.0)
Fully permitted
No pricing restrictions under Apache-2.0; per-token, per-call, or any other pricing unit is commercially permissible without licence risk
⚡ NVIDIA OML
Conditional
Depends on whether a meaningful product value-add layer exists; pricing that passes model output cost through to customers approaches the reselling restriction
Licence change risk
🦙 Llama 3
High
Custom licence — Meta retains the right to amend terms; products built on Llama 3 are exposed to future restriction changes without contractual protection
💎 Gemma
High
Custom ToU — Google can update terms unilaterally; continued use after an update constitutes acceptance of new terms under the Gemma Terms of Use
🌬️ Mistral 7B (Apache-2.0)
None
Apache-2.0 is irrevocable; rights granted cannot be withdrawn retroactively; products built on Mistral 7B carry no licence change risk
⚡ NVIDIA OML
High
Custom licence — NVIDIA retains change and termination rights; OML terms have already evolved across model generations

Section 2 — Selling Your Feature vs Reselling the Model

The single most important commercial distinction in open LLM licensing is the difference between building a product that uses a model and reselling the model itself. Licence restrictions on API reselling and model redistribution do not apply to every business that makes money using an open LLM — they apply to a specific type of monetization where the model, rather than the value the model enables, is the thing being sold. Understanding where the line falls is not a legal technicality. It is the difference between a product that is freely commercialisable and one that requires a separate agreement with the model provider.

The line is not always obvious in practice. A document summarisation tool that charges per document is clearly selling a product feature — the model is the engine, not the product. An inference API that exposes raw model completions and charges per token is clearly reselling model access. Between these two extremes sits a large grey zone: usage-based SaaS products, AI platform tooling, embedded AI APIs sold to developers, and multi-model routing services where the underlying model is material to the customer proposition.
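The two poles of that spectrum can be sketched in code. The sketch below is purely illustrative: `call_model` is a hypothetical stand-in for any open-LLM inference call, and `summarise_contract` is an invented example product function, not a real API.

```python
def call_model(prompt: str) -> str:
    """Hypothetical placeholder for an open-weight model inference call."""
    return f"[model output for: {prompt[:40]}]"


# Reselling-shaped: the customer's prompt goes to the model unchanged and the
# raw completion comes straight back. The product layer adds nothing; the
# model itself is what is being sold.
def passthrough_endpoint(customer_prompt: str) -> str:
    return call_model(customer_prompt)


# Feature-shaped: the product owns the prompt, injects domain context, and
# returns a structured product artifact rather than a raw completion.
def summarise_contract(document: str, clause_library: list[str]) -> dict:
    # Domain logic the model alone does not provide: clause matching.
    relevant = [c for c in clause_library if c.lower() in document.lower()]
    prompt = (
        "Summarise the following contract, flagging these clause types: "
        + ", ".join(relevant) + "\n\n" + document
    )
    raw = call_model(prompt)
    # The customer receives the product's output unit (a structured
    # summary billed per document), not the model's completion.
    return {
        "summary": raw,
        "flagged_clauses": relevant,
        "billing_unit": "per_document",
    }
```

The grey-zone products in between are the ones where the product layer exists but is thin enough that the economics still mirror the first function.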

Permitted vs Prohibited: The Core Commercial Distinction

Generally permitted — selling your feature

You add value on top of the model

Vertical SaaS: a product that uses a model to provide domain-specific output (legal drafting, code review, financial analysis) charges for the workflow outcome — the model is infrastructure, not the product
AI-augmented features: adding AI-powered features (smart search, auto-tagging, summarisation) to an existing product and charging for the overall product rather than for the AI feature specifically
Fine-tuned specialists: training a domain-specific model on proprietary data and selling the resulting capability as a product — provided the base model licence permits derivative distribution
Usage-based pricing on product outcomes: charging per document processed, per report generated, or per task completed — where the pricing unit is the product output, not the underlying model's compute unit
Internal tooling sold externally: AI tools built for internal use that are then packaged and sold to other companies — the product is the tool, its integration, and its workflow, not model access
Generally restricted — reselling the model

The model itself is what customers pay for

Raw inference API: an API endpoint that accepts prompts and returns model completions, with no meaningful transformation, context injection, or product logic between the customer and the model
Per-token billing that mirrors API costs: pricing that is directly proportional to model token consumption, passed through to customers with a margin — the economic structure of reselling even when the legal structure claims otherwise
Model-as-a-service with customer model choice: platforms that give customers a menu of underlying models (Llama 3, Gemma, Mistral) and bill for access — the model selection is the product value proposition
Weight redistribution: distributing model weights (with or without modification) as a commercial product, or charging for access to fine-tuned weights derived from restricted base models
Inference hosting services: hosting open-weight models and selling inference access to customers where the primary value is avoiding the customer's need to run the model themselves — compute + model access bundled

The Five-Question Test — Which Side of the Line Are You On?

Apply all five questions — risk accumulates with each "reselling" answer
1
What is the customer actually paying for — a product outcome, or model access?
Feature signal: customer pays for a completed task, a workflow outcome, or access to a product that uses AI to deliver its core function
Reselling signal: customer pays for the ability to send prompts and receive model responses — the model is the service being delivered
2
Does the product add substantial logic, data, or transformation between the customer and the model?
Feature signal: the product has its own prompt engineering, retrieval systems, output validation, domain logic, or post-processing that creates value the model alone does not provide
Reselling signal: the customer's input goes to the model with minimal transformation; the model's output comes back with minimal processing; the product layer is thin routing infrastructure
3
Is the pricing unit the model's compute unit (tokens, API calls) or the product's output unit?
Feature signal: pricing is per document, per report, per user seat, per month — units that reflect the product's value delivery rather than the model's resource consumption
Reselling signal: pricing is per token, per API call, or per completion — the same unit the model provider uses; the customer's cost scales directly with model API consumption
4
Would the customer notice or care if you swapped the underlying model?
Feature signal: the customer buys the product outcome; the underlying model is an implementation detail that the customer does not interact with directly and would not necessarily notice if changed
Reselling signal: the specific model (Llama 3, Gemma, Mistral) is material to the customer's purchase decision; customers select or are billed based on model choice; model identity is part of the product proposition
5
Does your product have meaningful standalone value if the underlying model is removed?
Feature signal: the product's data integrations, workflow logic, domain knowledge, UX, and customer relationships represent value that would survive a change of underlying model provider
Reselling signal: the product is essentially a wrapper around the model; without the model, there is no product; the value delivered to customers is entirely the model's capability, accessed through the product's interface
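As a rough triage aid, the five questions above can be treated as an accumulating score. The sketch below is an illustrative heuristic only, with invented question wording and thresholds; it is a self-assessment prompt, not a legal determination.

```python
# Each entry paraphrases one of the five questions; answering True means
# the "reselling signal" description fits better than the "feature signal".
QUESTIONS = [
    "Customer pays for model access rather than a product outcome",
    "Product layer is thin routing with minimal transformation",
    "Pricing unit is tokens/API calls rather than product outputs",
    "Specific model identity is material to the purchase decision",
    "Product has no standalone value without the model",
]


def reselling_risk(answers: list[bool]) -> str:
    """Risk accumulates with each 'reselling' answer (illustrative bands)."""
    score = sum(answers)
    if score == 0:
        return "feature side: low licence risk"
    if score <= 2:
        return "grey zone: review pricing structure and product layer"
    return "reselling side: high licence risk, obtain review before launch"
```

A product scoring in the grey zone is exactly the case where the usage-pricing analysis in the next section matters most.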

Usage-Based Pricing — When It Is Fine and When It Is Not

Generally safe
Outcome-based usage pricing
Per document processed — the billing unit is the product workflow unit, not the model's token unit
Per user seat — flat recurring billing for access to the product regardless of model consumption volume
Per result generated — billing for a completed output (report, analysis, classification) rather than the underlying compute
Tiered feature access — higher tiers unlock more sophisticated features, not more raw model calls; the pricing reflects product capability
Requires review
Hybrid usage pricing
Credit-based systems — where credits map closely to API calls or token consumption; the mapping may constitute implicit API reselling
Metered API billing — charging developers per API call to the product's own endpoint where those calls map 1:1 to underlying model calls
Volume discounts tied to token use — if pricing structure is transparently based on model token consumption even if labelled as product credits
Developer API products — where developers build their own products on your API and your value proposition is primarily model access with light orchestration
High licence risk
Model-mirroring pricing
Per-token billing — pricing directly denominated in model tokens, identical to how the model provider bills its own API customers
Per-completion billing — charging per raw model API response with a margin, no product logic between request and response
Model selection as premium tier — charging more for access to more capable models; the model identity is the product differentiator
Inference cost pass-through — billing customers based on your actual model API costs plus a fixed margin, transparently or implicitly mirroring the model provider's pricing
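The economic signature that separates the safe and high-risk ends of this spectrum is easy to show with arithmetic. All numbers below are hypothetical; the point is the shape of the pricing function, not the rates.

```python
MODEL_COST_PER_1K_TOKENS = 0.002  # hypothetical inference cost


def passthrough_price(tokens_used: int, margin: float = 0.3) -> float:
    """Reselling-shaped: customer price scales 1:1 with model consumption."""
    return round(tokens_used / 1000 * MODEL_COST_PER_1K_TOKENS * (1 + margin), 6)


def per_document_price(documents: int, price_per_doc: float = 1.50) -> float:
    """Feature-shaped: the pricing unit is the product's own output unit."""
    return round(documents * price_per_doc, 2)
```

Doubling the customer's token consumption exactly doubles the pass-through price, which is the model provider's own billing curve with a margin on top. Per-document pricing is decoupled from token consumption entirely, which is why it sits on the "generally safe" side regardless of how many tokens each document happens to burn.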

Section 3 — Business Models That Work Well on Open LLMs

Commercial success on the Gemma, Llama 3, and Mistral licence families is not theoretical — there are well-established archetypes of products that fit cleanly within the licence boundaries while generating defensible revenue. The pattern that runs through every commercially viable open-LLM business model is the same: the model is infrastructure, and the product is built on top. The revenue is derived from what the product does, not from access to the model itself.

The models best suited for commercial products without licence complications are Apache-2.0 licensed models — specifically Mistral 7B and its derivatives — because Apache-2.0 places no commercial restrictions on use, redistribution, or monetization. For products that need higher capability and can work within their licence restrictions, Llama 3 and Gemma are viable for most SaaS and vertical application use cases below the specific restriction thresholds. The key is matching the business model archetype to the model licence's actual constraints.

What Each Model Is Suited For Commercially

🌬️
Mistral 7B (Apache-2.0)
Lowest commercial restriction of any capable open LLM
Best for: any product where legal simplicity is commercially material — regulated sectors, enterprise sales with procurement scrutiny, products approaching fundraising or M&A where IP chain must be clean
Inference APIs: the only major open-weight model family where building an inference API business is commercially straightforward — Apache-2.0 permits redistribution without restriction
Developer tooling: tools sold to developers that surface the model's capability as part of the product — Apache-2.0 permits this without competitor-ban or reselling concerns
Capability trade-off: Mistral 7B is capable for many production tasks but trails Llama 3 70B or Gemma Pro in capability; products requiring frontier-level performance may need to accept the licence restrictions of higher-capability models
🦙
Llama 3
Community licence — restrictions apply above 700M MAU
Best for: vertically focused SaaS products that need strong general capability, are not competing with Meta's AI offerings, and have a clearly defined product layer between customers and the model
Fine-tuned vertical models: building domain-specific models by fine-tuning Llama 3 on proprietary data and offering the resulting capability as a product is straightforwardly permitted at most commercial scales
Enterprise applications: internal productivity tools, code assistants, knowledge management systems — products where the customer never interacts with the model directly and the product's workflow logic creates the value
Avoid: using Llama 3 to build a competing foundation model or AI platform; products with ambitions to reach the 700M MAU threshold without a Meta commercial licence agreement in place
💎
Gemma
Google ToU — API reselling explicitly restricted
Best for: applications in the Google ecosystem, products that pair Gemma with Google Cloud infrastructure, and verticals where Google's investment in safety fine-tuning is commercially relevant to customers
Embedded AI features: Gemma's efficient smaller variants (2B, 7B) are well-suited for on-device or edge deployment where the model runs locally within a product rather than as an external API
Research and education products: the ToU explicitly contemplates research and educational use; products in these verticals have a cleaner licence posture than general commercial products
Avoid: building an inference API or model-access service on Gemma; products whose revenue model could be characterised as providing Gemma as a service to third parties without Google consent

Four Business Model Archetypes That Work

🏢
Vertical SaaS with AI-powered workflow
The dominant commercially safe archetype
Structure: the product is a domain-specific workflow tool — legal contract review, medical documentation, financial analysis, code review — that uses an open LLM to automate or augment the core task; the model is never exposed to customers directly
Revenue model: per-seat SaaS subscription, per-document pricing, or per-organisation licensing — pricing tied to the product's workflow value, not to underlying model consumption
Preferred models: Llama 3 (capability), Mistral 7B (IP simplicity), Gemma (Google ecosystem); model is chosen for task fit and licence compatibility with the specific use case
Why it works: the product's domain knowledge, integrations, training data, and workflow logic create value that is clearly distinct from the underlying model capability; the licence restriction on reselling does not reach products with genuine value-add layers
🔬
Fine-tuned specialist model as a product
Proprietary data creates the moat
Structure: take an open-weight base model (ideally Mistral 7B on Apache-2.0 for maximum IP clarity), fine-tune it on proprietary domain data, and offer the resulting specialised model capability as a product to customers in that domain
Revenue model: subscription to the specialised AI capability (not to "the model"); per-outcome pricing for tasks the specialist model performs — the moat is the fine-tuning dataset, not the base model
IP considerations: fine-tuned weights derived from Apache-2.0 base models are commercially cleanest; fine-tunes on Llama 3 or Gemma inherit their licence restrictions and must not be distributed or monetised in ways those base licences prohibit
Why it works: the proprietary fine-tuning dataset is the commercial IP — customers pay for access to the specialised capability that dataset creates, not for the base model itself; clearly on the "feature" side of the reselling line
🖥️
AI-augmented internal tooling sold externally
Internal use case → commercial product
Structure: tools originally built for internal use — customer support automation, internal knowledge bases, code assistants, document management — packaged and sold to other companies facing the same operational problem
Revenue model: per-organisation licensing, per-seat subscription, or per-deployment pricing; the product is the packaged tool and its implementation, not the AI model
Compliance advantage: these products have organic product logic because they were built to solve a real internal problem; the value-add layer is genuine, which creates a robust answer to any "are you just reselling the model?" question
Why it works: the product was never designed as a model access layer; the model is infrastructure within a tool that has its own integrations, workflows, and institutional knowledge baked in
📱
Consumer or SMB applications with embedded AI
AI as a feature, not the product
Structure: consumer or SMB-facing products — writing tools, productivity apps, research assistants, creative tools — where AI capability is one feature among several and the brand, UX, and distribution are the commercial assets
Revenue model: freemium to paid subscription; premium features may include more AI usage (higher limits) but the pricing unit is the subscription tier, not the AI consumption unit
Model considerations: consumer products typically prioritise on-device capability (Gemma 2B efficient variant, Mistral 7B quantised) or cost efficiency; the Apache-2.0 licence advantage of Mistral 7B is commercially significant at consumer scale
Why it works: the product's brand, distribution, and user experience are the business; the AI model is one component of the product — the same way a database is infrastructure, not the product

Why Apache-2.0 Models Are the Default Choice for Maximum Commercial Freedom

Apache-2.0 commercial advantages vs custom model licences
What Apache-2.0 permits that custom licences restrict
Redistribution without restriction: model weights and derivatives can be redistributed commercially — including as part of products sold to customers — without approval from the model creator
All fields of endeavour: Apache-2.0 cannot discriminate against use cases; building a product that competes with the model creator is fully permitted — no competitor ban
Irrevocability: Apache-2.0 rights cannot be revoked retroactively; a product built on Mistral 7B today cannot have its licence pulled by Mistral AI tomorrow — the rights are permanent
No MAU threshold: there is no scale-based commercial licence trigger; a product can grow to any size without the licence terms changing or requiring a new agreement
Where Apache-2.0 models may require trade-offs
Capability ceiling: Apache-2.0 models currently available (Mistral 7B, early Falcon) trail the capability of Llama 3 70B or Gemma Pro on complex reasoning and instruction-following tasks; capability trade-offs must be assessed per use case
Safety fine-tuning: Apache-2.0 base models typically have less safety fine-tuning than Gemma or instruction-tuned Llama 3 variants; products in regulated or sensitive sectors may need to invest in safety fine-tuning on top of the Apache-2.0 base
Attribution requirement: Apache-2.0 requires attribution notices to be preserved in derivative works; for commercial products, this is a documentation obligation rather than a commercial restriction, but it must be handled correctly
Patent grant limits: Apache-2.0 includes a patent licence from contributors, but where model training involves third-party IP claims, its permissive terms do not resolve the underlying training-data copyright questions

Section 4 — Business Models That Hit Licence Walls

Not every business model that involves an open LLM is within the licence boundaries. Several commercially attractive models — inference-as-a-service, raw API reselling, usage-based pass-through pricing, and competitor-facing tooling — run directly into the restrictions that Gemma, Llama 3, and NVIDIA's OML apply to commercial use. Understanding which models hit these walls, and what the structural alternatives are, is commercially necessary before committing to a product and revenue architecture.

Four Scenarios Where Licence Walls Appear

🚧
Inference API / model-as-a-service
Offering raw model completions as the commercial product
Licence conflict: Gemma's ToU explicitly prohibits "providing the model as a service or API to third parties" without Google's consent. NVIDIA OML prohibits inference services where model output is the primary product. Llama 3's usage policy creates ambiguity for pass-through inference services, even if it does not use identical language. Building a hosted inference API on any of these models creates direct licence exposure.
Structural alternative: Use Apache-2.0 licensed models (Mistral 7B) for inference API products — Apache-2.0 explicitly permits this. For products that require higher-capability models, add a genuine product layer (retrieval, orchestration, domain fine-tuning) that transforms the inference product into a feature-selling product rather than a model-access product.
🚧
AI platform with model selection
Offering customers a choice of underlying open models
Licence conflict: A platform that offers customers a menu of open-weight models (Llama 3, Gemma, Mistral) and charges for access to their inference is providing model access as the product. Under Gemma's ToU, this constitutes providing the model as a service. Under NVIDIA OML, model selection as the product value proposition falls within the reselling restriction. The multi-model wrapper does not escape the individual model licence restrictions.
Structural alternative: Replace model selection as the product proposition with model orchestration as an internal capability. The platform's value should be in what it does with the models — routing, aggregation, quality scoring, domain adaptation — not in giving customers access to choose the model. Apache-2.0 models can be included in the platform without restriction.
🚧
Token-based pricing on model completions
Usage-based pricing mirroring the model provider's billing
Licence conflict: Pricing that is directly denominated in model tokens — or in a credit unit that maps transparently to model token consumption — creates the economic structure of API reselling even when the legal wrapper claims otherwise. Under Gemma's API reselling prohibition, this pricing structure is high-risk. For Llama 3, per-token pricing that creates a pass-through revenue model approaches the reselling line even if it does not cross it unambiguously.
Structural alternative: Reprice on product output units (documents, tasks, analyses) rather than model consumption units. If per-usage pricing is commercially necessary, ensure the pricing unit reflects genuine product value rather than underlying compute. Apache-2.0 models (Mistral 7B) permit per-token pricing without restriction — use them for products where this pricing structure is core to the business model.
🚧
Competitor-facing AI tooling
Using restricted models to build tools for model providers' competitors
Licence conflict: Llama 3 prohibits using the model to train other LLMs that would compete with Meta's AI products. Gemma prohibits use that "negatively affects Google." NVIDIA OML explicitly bans building competing AI software. A product that uses these models to build AI infrastructure — model training pipelines, AI development platforms, foundation model evaluation tools — sold to companies competing with these providers runs directly into the competitor ban clauses.
Structural alternative: Use Apache-2.0 models for any product sold into the AI development stack, or for any product whose customers include companies that compete with Llama, Gemma, or NVIDIA in AI. Apache-2.0's non-discrimination clause makes all fields of endeavour and all customer types permissible without the competitor ban risk.
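The repricing guidance in the token-based pricing scenario can be sketched in code. This is an illustrative sketch only — the function names, the per-document price, and the internal cost figure are hypothetical assumptions, not recommendations:

```python
# Illustrative sketch: contrasting the two pricing structures discussed above.
# All names, rates, and cost figures are hypothetical assumptions.

ASSUMED_TOKEN_COST_PER_1K = 0.0002  # hypothetical internal inference cost (USD)


def token_passthrough_price(tokens_used: int, markup: float = 1.5) -> float:
    """Pricing denominated in model tokens.

    The revenue unit maps transparently to model consumption -- the economic
    structure identified above as high-risk API reselling under restrictive
    licences such as Gemma's ToU.
    """
    return (tokens_used / 1000) * ASSUMED_TOKEN_COST_PER_1K * markup


def product_unit_price(documents_analysed: int,
                       price_per_document: float = 0.50) -> float:
    """Pricing denominated in product output units (documents analysed).

    The revenue unit reflects product value delivered, not underlying
    compute -- the structural alternative recommended above.
    """
    return documents_analysed * price_per_document
```

The point is structural: the second function's pricing unit is independent of the model's token consumption, so swapping the underlying model — or a change in its licence terms — does not alter the customer-facing revenue architecture.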

Licence Wall Severity by Business Model and Model

Pre-Launch Monetization Compliance Checklist

Identify model licence type — Apache-2.0, Llama 3 Community, Gemma ToU, or NVIDIA OML; confirm the commercial restrictions that apply to your specific version and variant
Map revenue model to licence — apply the five-question test; confirm pricing unit is a product output unit, not a model consumption unit, for any model with API reselling restrictions
Check competitor ban scope — confirm your product does not build AI infrastructure sold to model provider competitors; if it does, use Apache-2.0 models only
Verify AUP / use-case alignment — map all product use cases against the model's prohibited use categories; confirm no use case sits within or adjacent to a restriction category
Assess MAU trajectory for Llama 3 — if using Llama 3, model the timeline to 700M MAU and confirm a Meta commercial licence agreement is in the product roadmap if enterprise scale is planned
Review ToS flow-down requirements — confirm the product's terms of service incorporate required model licence restrictions and prohibit customers from using the product for model-prohibited purposes
Document the product value-add layer — for any non-Apache-2.0 model, create and maintain documentation that articulates why the product is a feature product rather than a model reselling product; this is the primary defence in any licence compliance review
Snapshot the licence at adoption — store a timestamped copy of the model licence, AUP, and any additional terms in effect at the point of first commercial deployment; this record is required for fundraising, M&A due diligence, and licence change disputes
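The final checklist item — snapshotting the licence at adoption — can be sketched as a small archiving routine. This is a minimal illustration, assuming a JSON record is an acceptable archive format; the field names and storage approach are hypothetical:

```python
# Illustrative sketch of the "snapshot the licence at adoption" checklist item.
# Field names and the JSON storage format are hypothetical assumptions.
import hashlib
import json
from datetime import datetime, timezone


def snapshot_licence(model_name: str, licence_text: str) -> str:
    """Return a timestamped, hash-verified JSON record of the licence text
    in effect at first commercial deployment.

    The SHA-256 digest lets a later reviewer (fundraising, M&A due diligence,
    or a licence change dispute) confirm the archived text is unaltered.
    """
    record = {
        "model": model_name,
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "licence_sha256": hashlib.sha256(
            licence_text.encode("utf-8")).hexdigest(),
        "licence_text": licence_text,
    }
    return json.dumps(record, indent=2)
```

Storing the returned record alongside the product's IP documentation — and repeating the snapshot whenever the model or its terms change — gives the timestamped evidence trail the checklist calls for.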
Conclusion

Open LLMs Are Commercially Viable — With the Right Architecture

The licence determines the business model, not the reverse. Building a revenue model first and reviewing the underlying model licence later is the structural error that creates legal exposure. The model licence must be assessed before the product architecture is committed — changing the business model after the product is in production is far more costly than choosing the right model at the start.
Apache-2.0 is the default for commercial freedom. Mistral 7B and other Apache-2.0 licensed models are the commercially unambiguous choice for products where legal simplicity is material — enterprise sales, fundraising, API-centric revenue models, or competitive use cases. The capability trade-off is real but manageable for a large category of production use cases.
Selling a feature is always safer than reselling the model. The single most reliable protection against licence wall exposure is building a product with genuine value-add between the customer and the model. The five-question test is the tool for diagnosing whether your current architecture is on the right side of the line — and for documenting why, when it matters.
Licence risk accumulates silently. Products built on non-compliant revenue models do not usually fail immediately — they create exposure that surfaces at the worst commercial moments: fundraising, M&A due diligence, enterprise procurement review, or a model provider audit. The cost of addressing licence risk proactively is a fraction of the cost of addressing it reactively.

The choice of which open LLM to build on is not just a technical decision — it is an IP structuring decision that affects the product's commercial architecture, enterprise sales posture, and investor-facing risk profile. Products that treat the model licence as infrastructure — as fundamental as their choice of cloud provider or database — avoid the compliance surprises that follow founders who treat it as a legal formality. For guidance on how model licence choice intersects with AI IP ownership and investment readiness, see AI IP Ownership — wcr.legal.

Oleg Prosin is the Managing Partner at WCR Legal, focusing on international business structuring, regulatory frameworks for FinTech companies, digital assets, and licensing regimes across various jurisdictions. He works with founders and investment firms on compliance, operating models, and cross-border expansion strategies.