How to License Your AI Model: Options for Developers and Startups
The licence you attach to your AI model determines who can use it, how they can use it, whether you can build a business around it, and what legal obligations bind both you and your users. This guide maps every major licensing strategy — with the trade-offs, clause requirements, and monetization implications for each.
Contents:
- Introduction: Why Your Licensing Decision Is Your Most Important IP Choice (what a model licence controls, what it cannot do, and why the choice is often irreversible)
- 1. The Four Licensing Strategies: Trade-Offs and When to Use Each (fully open source, open-weight custom licence, dual-licence, and proprietary, mapped against business objectives)
- 2. Key Clauses Every AI Model Licence Must Contain (use restrictions, derivative control, output ownership, attribution, liability limits, and acceptable use)
- 3. Licensing Strategy and Revenue Models: How They Connect (API-as-a-service, commercial licence tiers, enterprise agreements, and fine-tuning revenue)
- 4. Protecting Your Model: IP, Trade Secrets, and Derivative Control (what can be protected, how to protect it, and how to limit what others can do with your model)
- Conclusion: A Six-Step Framework for Choosing Your Licensing Strategy (from business model to legal structure, a decision process for founders and developers)
Introduction: Why Your Licensing Decision Is Your Most Important IP Choice
When a developer or startup releases an AI model, the licence attached to that model determines the entire legal framework within which the model exists in the world. It controls who can use the model, what they can build with it, whether competitors can train on it, whether you can charge for it, and what happens when someone uses it to cause harm. It is, in many ways, the single most consequential intellectual property decision a developer can make — and one that is frequently approached as an afterthought.
Unlike a software product licence, which typically governs access to a running application, an AI model licence governs something more fundamental: the distribution of a trained artefact — the weights — that can be run, fine-tuned, merged, distilled, and redistributed independently of any application built on it. Once those weights are released, the licence choice is effectively permanent. You cannot retroactively revoke rights granted by an open-source licence from users who already have the weights.
Access and use rights
The licence defines who may obtain and run the model — whether access is unrestricted, limited to specific user categories, gated behind a commercial agreement, or reserved entirely for the developer. It also defines the permitted use cases and, critically, the prohibited uses that apply to all downstream users.
Derivative and redistribution rights
The licence governs whether users may fine-tune the model, create derivative versions, merge it with other models, or redistribute the original or modified weights — commercially or non-commercially. These provisions determine whether an open release will generate an ecosystem or a fork that competes with you.
Liability and warranty allocation
The licence allocates risk between the developer and users — typically disclaiming all warranties as to accuracy, fitness for purpose, and safety. It also defines the conditions under which the developer's liability is limited. These provisions interact directly with the contractual liability framework applicable to downstream deployments.
The liability provisions in a model licence interact directly with the contracts your downstream users will sign when they deploy your model in a business context. For a full analysis of how AI contracts should address risk and accountability at the deployment layer, see AI Risk & Liability: A Contractual Framework for Businesses — which covers the duties, indemnities, and governance mechanisms that downstream deployers of AI models must put in place.
- "We can just use MIT — it keeps things simple and gets us adoption"
- "Open source licensing doesn't affect our ability to raise investment"
- "We can always switch to a commercial licence later if we need to"
- "The licence doesn't matter until we have real users"
- "Apache 2.0 means anyone can use it but not compete with us"
- "A custom licence is too complicated and will put developers off"
- Permissive licences grant irrevocable rights — including commercial rights — to anyone who downloads the weights
- Fully open releases can make IP due diligence harder in later funding rounds
- Rights already granted under an open licence cannot be retroactively restricted
- The licence defines who your potential commercial customers are from day one
- Apache 2.0 explicitly allows competitors to use your model to build competing products
- Custom licences are standard across all major frontier model releases and are expected by enterprise buyers
An AI model licence is the legal instrument that governs the terms under which a trained AI model — including its weights, architecture, associated training code, and documentation — may be accessed, used, modified, and redistributed. It is distinct from both a software licence (which governs a running application) and a data licence (which governs training data). The model licence travels with the weights: any party that obtains the weights, regardless of the channel through which they obtained them, is bound by the terms of the licence under which they were released.
A model licence may take the form of an OSI-approved open-source licence (such as Apache 2.0 or MIT), a RAIL-framework licence (which adds use-restriction clauses to a permissive base), a custom commercial licence drafted specifically for the model, or a proprietary end-user licence agreement that restricts access to a hosted service only. Each form creates a materially different legal and commercial profile.
Irrevocability of open licences
OSI-approved licences grant rights that are irrevocable to recipients who obtained the model under those terms. If you release v1.0 under Apache 2.0, users who downloaded v1.0 retain their Apache 2.0 rights even if you move v2.0 to a commercial licence.
Community and ecosystem lock-in
Permissive releases create derivative ecosystems — fine-tuned versions, integrations, research builds — that are impossible to fully reclaim. Switching from open to proprietary after community adoption typically damages trust and adoption without fully solving the competitive exposure.
Investor and enterprise expectations
Institutional investors and enterprise procurement teams form their IP assessment of a product at the point of initial release. A model that was fully open-sourced early requires more diligence effort to establish that a defensible commercial position remains viable.
Regulatory classification
The EU AI Act's obligations for general-purpose AI model providers differ based on how the model is released. Open-source releases may benefit from reduced obligations — but only for genuinely open licences. Custom commercial licences do not qualify for the open-source exemption.
The remainder of this guide builds the framework for making the licensing decision well — beginning with a clear map of the available strategies and their trade-offs, proceeding through the specific clauses that every AI model licence must contain, and concluding with the revenue models and IP protection strategies that each licensing approach enables.
1. The four licensing strategies: trade-offs and when to use each
Every AI model licence falls into one of four strategic categories. The category determines the fundamental relationship between you and your model's users — whether that relationship is one of open community, conditioned access, commercial transaction, or exclusive service. Choosing a strategy without understanding its commercial and legal trade-offs is the single most common licensing mistake founders make.
Apache 2.0 / MIT — maximum adoption, minimum control
Releasing an AI model under a genuine OSI-approved licence — Apache 2.0 or MIT — grants any person or organisation unrestricted rights to use, modify, redistribute, and commercially exploit the model. Apache 2.0 additionally provides an express patent licence from contributors. There are no use-case restrictions, no commercial limitations, and no requirement to contribute back. Attribution is required but minimal. This is the strategy that maximises adoption and community contribution at the cost of commercial control. It is the right choice when the primary goal is ecosystem building, research influence, or positioning for standards adoption — and when you have a separate commercial strategy (hosted product, enterprise services, training data) that does not depend on licence exclusivity.
Good fit:
- Research organisations and academics
- Developer tools with hosted product revenue
- Standards-building / foundation work
- Models where community fine-tunes are the product

Avoid when:
- The model is your primary commercial asset
- Competitor use of the model is a material risk
- You need to control downstream use cases
- You expect to raise institutional investment
RAIL / Custom terms — open access with conditions
The open-weight custom licence makes the model weights publicly available while attaching conditions that OSI licences do not permit: use-case restrictions (RAIL framework), commercial thresholds requiring a paid licence above a defined scale, derivative licensing requirements, and attribution or branding obligations. This is the strategy used by Meta (Llama), Google (Gemma), Stability AI, and BigScience (BLOOM). It allows broad adoption while preserving some commercial and ethical control. The RAIL variant focuses on use-case restrictions; the custom commercial variant focuses on revenue and competitive protections. The trade-off is that custom licences are not OSI-approved, which creates some enterprise procurement friction and removes eligibility for the EU AI Act's open-source reduced-obligation regime.
Good fit:
- Startups seeking adoption and commercial leverage
- Models with ethical risk in specific use cases
- Hybrid open/commercial business models
- Frontier model labs building ecosystems

Avoid when:
- Full OSI compliance is a procurement requirement
- You need the EU AI Act open-source exemption
- Legal resources to draft and maintain custom terms are limited
Open for non-commercial + proprietary for commercial
The dual-licence strategy releases the model under two separate licences simultaneously: an open or research-only licence for non-commercial users (academics, individual developers, open-source projects) and a commercial licence for any use that generates revenue. This is common in database and developer tooling (MySQL, MongoDB), and is increasingly used by AI companies. It allows community building and research adoption while creating a clear commercial gate for revenue-generating deployments. The GPL or AGPL is often used for the open tier (because it forces commercial users to either open their own source or purchase a proprietary licence), though research-only licences are also used. The critical operational requirement is a CLA (Contributor Licence Agreement) from all contributors — without it, you cannot relicense their contributions under the commercial tier.
Good fit:
- Developer tools targeting both OSS and enterprise
- Models with clear non-commercial research value
- Teams with resources to manage two licence tracks
- Products where commercial users are identifiable

Avoid when:
- You cannot reliably identify or enforce against commercial use
- Contributors will resist signing a CLA
- The open tier would create unacceptable competitive exposure
No weights distributed — API access only
The fully proprietary strategy distributes no model weights. Users access the model exclusively through an API or hosted application. This is the model used by OpenAI (GPT-4), Anthropic (Claude), and Google (Gemini through the API). The model itself is not licensed to users — it is a service. The legal instrument is a Terms of Service / API Agreement, not a model licence. This strategy offers maximum commercial control, strongest IP protection, and no risk of competitive model redistribution. The trade-off is that it requires significant infrastructure, limits the addressable market to users willing to use a hosted service, and excludes air-gapped, data-residency, or offline deployment use cases entirely unless a separate on-premises licence is offered.
Good fit:
- Frontier model companies with scale infrastructure
- Models with significant proprietary training investment
- Products where inference control is a safety requirement
- SaaS products built on proprietary model capabilities

Avoid when:
- On-prem or data-residency markets are strategic
- Infrastructure cost is prohibitive at current scale
- Community adoption and fine-tuning are part of the strategy
| Strategy | Weights distributed | Commercial use | Derivative control | Competitive protection | Enterprise readiness | Best objective |
|---|---|---|---|---|---|---|
| Apache 2.0 / MIT | Yes — freely | Unrestricted | None | None | Highest | Ecosystem & adoption |
| Open-weight custom (RAIL / Llama-style) | Yes — with conditions | Permitted with conditions | Partial | Partial | Medium | Adoption + control balance |
| Dual-licence (OSS + commercial) | Yes — open tier | Commercial licence required | Strong (with CLA) | Strong | Medium | Community + commercial revenue |
| Proprietary / API-only | No | Service-based (ToS) | Maximum | Maximum | Medium (data concerns) | IP protection + revenue |
Most AI startups will operate with a hybrid of these strategies across their product portfolio — open-weight community models to drive adoption and build trust, with proprietary hosted models or commercial licence tiers for revenue-generating deployments. The section that follows addresses the specific clauses that every model licence — regardless of strategy — must contain to be commercially and legally effective.
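The trade-offs in the table above can be condensed into a toy decision sketch. This is an illustrative simplification only: the three flags and the rule ordering are my own reading of the comparison table, not legal guidance, and a real decision involves many more factors.

```python
# Illustrative only: a toy mapping from three business-model questions
# to one of the four licensing strategies described above. The flags
# and the rule ordering are a simplification for exposition.

def suggest_strategy(
    weights_must_ship: bool,       # do users need the weights locally (on-prem, offline)?
    model_is_core_asset: bool,     # is the model itself the primary commercial product?
    commercial_gate_needed: bool,  # do you need a paid tier above free use?
) -> str:
    if not weights_must_ship:
        # No distribution requirement: maximum control via a hosted service.
        return "proprietary / API-only"
    if model_is_core_asset and commercial_gate_needed:
        # Weights ship, but the model is the product: gate commercial use hard.
        return "dual-licence (open tier + commercial licence)"
    if commercial_gate_needed:
        # Weights ship and you want revenue leverage without two full tracks.
        return "open-weight custom licence (RAIL / threshold)"
    # Adoption is the goal and revenue comes from elsewhere.
    return "Apache 2.0 / MIT"
```

For example, a startup whose users demand on-prem deployment, whose model is its core asset, and which needs paid tiers would land on the dual-licence row of the table.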
2. Key clauses every AI model licence must contain
Whether you draft a bespoke custom licence or adapt an existing framework, seven categories of clause are essential for any AI model licence to be legally effective. Missing or inadequately drafted provisions in any of these areas create gaps that will be exploited — by competitors, by regulatory inquiry, or in downstream disputes — at the most inconvenient time.
The model language examples below are illustrative starting points. Any licence intended for commercial deployment should be reviewed and finalised by qualified legal counsel familiar with the applicable jurisdiction and the specific characteristics of the model.
Grant of rights — scope and limitations
The grant clause is the core of the licence. It must precisely specify what rights are being granted: the right to download and run the model weights, the right to fine-tune or modify them, the right to create derivative models, the right to distribute the model or derivatives, and the right to use the model or its outputs commercially. Each right should be separately enumerated — a broad grant of "use" without specifying these components is ambiguous and may not be enforced as the drafter intended. The grant should also specify the scope: is it worldwide, perpetual, royalty-free, and non-exclusive? For open models, yes. For commercial licences, specify the exact term, territory, and royalty or fee structure.
"Subject to the terms of this Licence, Licensor grants you a worldwide, non-exclusive, non-transferable, royalty-free licence to: (a) use the Model Weights for any lawful purpose; (b) create Derivative Works; and (c) distribute the Model Weights or Derivative Works, subject to the conditions set out in clauses [X] through [Y]."
Acceptable use and prohibited use list
The acceptable use clause defines what the model may and may not be used for. For models with significant potential for misuse, a specific prohibited use list — modelled on the RAIL framework — is the standard approach. The list must be drafted with enough specificity to be enforceable: broad prohibitions like "harmful use" are not actionable. Specific prohibitions — "use to generate disinformation targeting named individuals," "use in systems that make employment decisions without human review," "use to facilitate mass surveillance of individuals based on protected characteristics" — are. For commercial licences, the acceptable use policy also defines the scope of the commercial grant. Critically, if the prohibited use list is to bind downstream users, the licence must require any distributor to pass through the prohibited use list to their users.
"You may not use the Model Weights or any Derivative Work to: (i) generate disinformation or false content intended to deceive; (ii) enable autonomous lethal weapons systems; (iii) facilitate surveillance of individuals based on protected characteristics; or (iv) generate content that sexually exploits minors. You must include these restrictions in any licence or terms under which you distribute the Model Weights or any Derivative Work."
Derivative works — control over fine-tuning and redistribution
The derivative works clause governs what happens when users fine-tune, merge, distil, or otherwise modify the model. For custom licences, this is the most commercially significant clause after the grant and prohibited use provisions. Key questions: Must derivatives be distributed under the same licence terms (copyleft-style)? Must derivatives carry the same prohibited use list? Can users distribute derivatives commercially? Can users withhold fine-tuned weights and offer them only as a service? Must users notify the licensor of major derivative releases? The answers to these questions determine how much ecosystem control you retain after open release.
"Any Derivative Work that you distribute must be licensed under the same version of this Licence (or a later version designated by Licensor), must include the Prohibited Use List without modification or addition, and must clearly indicate that it is derived from the Model Weights and identify the modifications made."
Output ownership and IP in model outputs
The output ownership clause addresses who owns the intellectual property in content generated by the model. This is a legally complex area: in many jurisdictions, AI-generated content has uncertain copyright status. From a licensing perspective, the developer should clarify: (1) whether you claim any ownership over outputs generated using the model; (2) whether users may copyright their outputs; and (3) whether outputs may be used to train competing models. Some custom licences prohibit using model outputs to train other models (particularly to train models that compete with the licensor). This clause should be drafted with reference to the applicable law — claims of output ownership that are unenforceable in the user's jurisdiction create false expectations without legal effect.
"Licensor makes no claim of ownership over any output generated by the Model Weights. You are solely responsible for any output you generate and any use you make of it. You may not use outputs from the Model Weights to train, fine-tune, or otherwise improve any model that is designed to compete with Licensor's commercial model offerings."
Attribution and branding requirements
Attribution requirements serve two purposes: they give the developer credit for the work (legally meaningful for Apache 2.0 compliance and brand building), and they make downstream deployments traceable (operationally significant for model provenance and responsible use tracking). Attribution requirements should specify the form of attribution required (product documentation, about page, model card), the specific text required (e.g., "Powered by [Model Name]" or "Built on [Model Name]"), and whether attribution is required for hosted products or only for weight redistribution. Branding restrictions should also address what users may not claim: the licence should prohibit claiming that the licensor endorses the user's product, and restrict use of the licensor's trademarks beyond the required attribution.
"You must include the following notice in any product or service that uses the Model Weights or a Derivative Work: '[Model Name] is developed by [Company]. This product uses [Model Name] subject to the [Licence Name].' You may not use [Company]'s trademarks in a manner that implies endorsement of your product."
Disclaimer of warranties and limitation of liability
The warranty disclaimer and liability limitation are the clauses that protect the developer from claims arising from how others use the model. The disclaimer should cover: fitness for a particular purpose, accuracy or completeness of outputs, absence of bugs, security vulnerabilities, or biases, and compatibility with any specific use case. The limitation of liability should cap the developer's total liability to an appropriate amount — for a free/open model, zero or a nominal amount; for a commercial model, tied to the fees paid. Critical: these clauses must comply with the law of the jurisdiction governing the licence — blanket exclusions of all liability are unenforceable in many EU jurisdictions and under UK consumer law. The disclaimer should also make explicit that the developer is not liable for regulatory consequences arising from a user's deployment of the model.
"THE MODEL WEIGHTS ARE PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. LICENSOR EXPRESSLY DISCLAIMS ALL WARRANTIES AS TO THE ACCURACY, COMPLETENESS, OR FITNESS FOR PURPOSE OF OUTPUTS. IN NO EVENT SHALL LICENSOR BE LIABLE FOR ANY CONSEQUENTIAL, SPECIAL, OR INDIRECT DAMAGES ARISING FROM YOUR USE OF THE MODEL WEIGHTS."
Commercial scale threshold and upgrade obligation
For open-weight custom licences with commercial gates (Llama-style or Stability-style), the commercial threshold clause is the mechanism that converts the licence from free to paid at defined scale. The clause must specify: the threshold metric (monthly active users, annual revenue, or number of API calls), the exact threshold value, the notification obligation when the user approaches the threshold, the process for transitioning to a commercial licence, and the consequences of exceeding the threshold without transitioning. Ambiguity in the threshold definition — particularly for B2B products where "monthly active users" may refer to end users or to the buyer's organisation — is a common source of compliance uncertainty and should be addressed with specific definitions.
"If your product or service that uses the Model Weights has more than [X] Monthly Active Users in any calendar month, you must obtain a commercial licence from Licensor before the end of that month. 'Monthly Active Users' means individual persons who interact with your product or service in a given calendar month. You must notify Licensor at [contact] upon reaching [X × 0.75] Monthly Active Users."
| Clause | Apache 2.0 / MIT | Open-weight custom | Dual-licence | Proprietary (ToS) |
|---|---|---|---|---|
| Grant of rights | In licence text | Must be explicit | Two separate grants | In ToS / API agreement |
| Prohibited use list | N/A (OSI prohibits) | Required (RAIL) / recommended | Required | Required (AUP) |
| Derivative works clause | Governed by OSI terms | Required — define copyleft or not | Required — CLA needed too | N/A — no weights distributed |
| Output ownership | Optional | Recommended | Recommended | Required in ToS |
| Attribution requirements | Required (notice retention) | Required (specify form) | Required | Optional (branding) |
| Warranty disclaimer / liability limit | In OSI text (verify jurisdiction) | Required — jurisdiction-appropriate | Required for both tiers | Required in ToS |
| Commercial threshold clause | N/A | Required if commercial gate exists | Separate commercial licence instead | N/A — usage is the service |
Common drafting errors that undermine AI model licences
- Prohibited use lists that are aspirational rather than specific — "do not use for harmful purposes" is not enforceable without defining "harmful"
- Derivative works clauses that fail to address the SaaS carve-out — omitting whether fine-tuned models offered as a service (not distributed) trigger the pass-through requirement
- Liability disclaimers drafted for a US audience without adaptation for EU or UK jurisdictions where blanket exclusions are invalid
- Attribution clauses that require a specific URL — if the URL changes, compliance with the clause becomes impossible
- Commercial threshold clauses that define the threshold in MAUs without defining what "user" or "monthly active" means in a B2B context
- Output ownership clauses that claim copyright over AI-generated content in jurisdictions where that claim has no legal basis
- Licences that omit a governing law and jurisdiction clause — creating uncertainty about which legal system determines enforceability
3. Licensing strategy and revenue models: how they connect
The licensing strategy you choose directly determines which revenue models are available to you and which are foreclosed. A fully open-source model cannot generate licence revenue — but it can support a services, hosting, and enterprise customisation revenue model. A proprietary API model cannot serve air-gapped deployments — but it creates a high-margin recurring revenue stream. Understanding this relationship before committing to a licence is essential to building a financially viable AI business.
The four primary revenue models for AI model developers each have a different relationship to the licensing strategy. None of them are exclusive — most commercially successful AI companies operate two or more in parallel — but each has a primary licence type that enables it most cleanly.
API-as-a-service — inference revenue
You host the model and charge for API access — per token, per request, per minute of compute, or via a subscription. This is the revenue model of OpenAI, Anthropic, and Mistral's commercial API. It requires no weights to be distributed, is fully compatible with proprietary and custom licences, and generates predictable recurring revenue. The key legal structure is a Terms of Service plus an API agreement that governs permitted use, data handling, rate limits, and liability. For enterprise customers, this typically escalates to a commercial agreement with negotiated terms on data processing, security, and SLAs.
Best licence basis: Proprietary ToS / API agreement. No weights distributed. All inference control retained. Enterprise tier requires negotiated commercial agreement with DPA and security schedule.
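Per-token pricing is usually tiered, and the billing logic is worth modelling before the ToS is drafted. The tier boundaries and prices below are made-up example values for illustration.

```python
# Illustrative sketch of tiered per-token API pricing. Tier sizes and
# prices are made-up example values.

from decimal import Decimal

# (tokens included in tier, price per 1K tokens); None = unbounded final tier
TIERS = [
    (1_000_000, Decimal("0.50")),  # first 1M tokens at $0.50 / 1K
    (9_000_000, Decimal("0.30")),  # next 9M tokens at $0.30 / 1K
    (None,      Decimal("0.20")),  # everything beyond at $0.20 / 1K
]

def monthly_bill(tokens: int) -> Decimal:
    """Compute a month's inference bill in dollars across the tiers."""
    total = Decimal("0")
    remaining = tokens
    for size, price_per_1k in TIERS:
        used = remaining if size is None else min(remaining, size)
        total += Decimal(used) / 1000 * price_per_1k
        remaining -= used
        if remaining == 0:
            break
    return total
```

Under these example tiers, a customer consuming 2.5M tokens pays $500 for the first million plus $450 for the next 1.5M, i.e. $950. Using Decimal rather than floats avoids rounding drift in billing.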
Commercial licence tiers — paid rights above the free threshold
You release the model weights under a custom licence that is free up to a defined threshold (user count, revenue, or industry) and requires a paid commercial licence above it. This is the Llama model (700M MAU threshold) and the Stability AI model ($1M revenue threshold). The commercial licence tier is a separate bilateral agreement with specific terms negotiated for the customer's use case. This model works best when there is a large base of free users (creating market awareness and an integration ecosystem) and a smaller set of large commercial users who generate substantial revenue. The operational requirement is a process for identifying users who cross the threshold and converting them to paid customers.
Best licence basis: Open-weight custom licence with explicit commercial threshold clause. Commercial tier licence must be pre-drafted and readily executable. Sales process should include proactive outreach to qualifying users.
Enterprise licence agreements — negotiated custom terms
You offer the model under open or research-only terms for non-commercial users and negotiate separate enterprise agreements for business customers. Enterprise licences can include: the right to deploy the model in a private cloud, the right to fine-tune for a specific domain, the right to redistribute a fine-tuned version to the enterprise's own customers, data processing terms, SLA commitments, and indemnification against IP claims. This model generates high-value recurring revenue from a smaller number of customers. It works best for models with strong vertical domain capabilities — legal, medical, financial — where enterprise customers have clear willingness to pay for proprietary access and deployment flexibility.
Best licence basis: Dual-licence (open/research tier plus proprietary enterprise tier). Enterprise agreement template should be pre-drafted with modular schedules for different deployment scenarios (on-prem, private API, fine-tuning rights).
Training, fine-tuning, and model services revenue
You release the model openly to build community adoption, then generate revenue from services built around the model: domain fine-tuning services, training data curation, deployment infrastructure, model evaluation, and technical support. This is the strategy of many open-source AI tooling companies and research labs with a commercial arm. The model itself is not the commercial product — it is the trust-building mechanism that creates a market for the commercial services. The risk is that the services can also be replicated, so this model works best where the developer has proprietary data, proprietary infrastructure, or accumulated expertise that is not easily replicated from the open release alone.
Best licence basis: Apache 2.0 or open-weight custom for the base model. Service agreements for fine-tuning, training data, and deployment. Consider retaining proprietary rights to training data sets and evaluation methodologies as the commercial moat.
| Revenue model | Apache 2.0 | Open-weight custom | Dual-licence | Proprietary API | Primary risk |
|---|---|---|---|---|---|
| API inference fees | Possible (competitor can too) | Natural fit | Natural fit | Primary model | Infrastructure cost; competition |
| Commercial licence tiers | Not available | Primary model | Commercial tier | Negotiated enterprise add-on | Threshold enforcement; conversion rate |
| Enterprise agreements | Available (services only) | Available | Primary commercial tier | Primary enterprise model | Sales cycle length; legal complexity |
| Fine-tuning / training services | Natural fit | Natural fit | Natural fit | Possible (training API) | Replicability of service by users |
| On-prem deployment licence | Possible (no licence fee leverage) | Available | Enterprise tier | Requires separate on-prem offering | IP leakage; support burden |
| Vertical SaaS on model | Available (no exclusivity) | Available | Available | Natural fit | Model dependency; licence change risk |
The question institutional investors ask is not "which licence did you choose?" but "why does your licensing strategy create a defensible commercial position?" An Apache 2.0 release with no commercial strategy is a red flag. A custom licence with a clear commercial tier, a mapped path to enterprise conversion, and a demonstrated ability to generate revenue above the free threshold is a fundable business model.
The licensing strategy should be documented as part of the business model presentation — not as a legal formality but as a commercial strategy with a specific theory of how the licence structure creates and captures value. Investors who have seen open-source AI companies struggle to monetize will expect to understand exactly where the commercial gate is, how it will be enforced, and what the revenue conversion looks like at scale.
The revenue model is not just a financial planning question — it is an operational and legal one. A commercial threshold that cannot be enforced is not a gate; it is a courtesy that sophisticated commercial users will ignore. The licence clauses in Section 2 and the IP protections in Section 4 work together with the revenue model to create a commercial AI licensing programme that is both attractive to users and defensible to investors.
4. Protecting your model: IP, trade secrets, and derivative control
The licence governs what others are legally permitted to do with your model. IP protection governs what you can actually enforce when they do something they are not permitted to do. The two are related but distinct — and a licensing strategy that is not backed by the appropriate IP protection mechanisms is a set of restrictions that exists only on paper.
AI models present a novel IP protection challenge because they are not clearly protected by any single IP right. Copyright protects expression; patents protect methods; trade secrets protect confidential information. AI model weights — the numerical parameters that define the model — exist in a legal space where none of these protections applies cleanly. Building an effective IP protection strategy requires using all three in combination, supplemented by contractual restrictions that fill the gaps the property rights leave.
Copyright in model code and architecture
Copyright protects original literary and artistic works. AI model weights themselves are probably not protected by copyright in most jurisdictions — they are numerical parameters generated by an automated training process, not authored expression. However, the model's software code (training pipeline, inference code, architecture implementation), the model documentation, and the model card are protected by copyright. For developers releasing code alongside weights, the copyright in the code is the primary statutory IP protection. The licence governs how that code may be used; copyright provides the legal basis for enforcing the licence.
Trade secret protection for training data and methods
Trade secret law protects commercially valuable information that is kept confidential and subject to reasonable measures to maintain its secrecy. For proprietary AI models, the training data curation methodology, the RLHF reward model design, the hyperparameter configurations, the evaluation benchmarks, and the proprietary training dataset are all candidates for trade secret protection — provided they are not disclosed to the public. Trade secret protection is why API-only models (where the weights are never distributed) have a stronger overall IP position than open-weight models. For open-weight releases, trade secret protection applies to the elements that are not released — the training data, the fine-tuning methodology, and the undisclosed evaluation systems.
Patent protection for novel training methods
Patents protect novel, inventive, and industrially applicable methods. Specific AI training techniques, novel architectural innovations, and proprietary inference optimisations may be patentable — but the requirements are high: the method must be genuinely novel (not previously disclosed or published), it must involve an inventive step, and in many jurisdictions it must be implemented in a way that produces a technical effect (not merely a mathematical or abstract result). AI patents are litigated frequently in the US and increasingly in Europe. For startups, the patent filing cost and timeline (2–5 years to grant) must be weighed against the protection benefit. Publication of a research paper describing the method typically bars patent protection — European law requires absolute novelty, and even the US grace period for an inventor's own disclosure runs only twelve months — so this decision must be made before publication, not after.
Contractual derivative control and CLA requirements
The most operationally important form of IP protection for an AI model developer is contractual: the licence's derivative works clause, combined with a Contributor Licence Agreement (CLA) from all contributors, creates a contractual framework that governs what users and contributors may do with your model. The derivative works clause restricts how users may use fine-tuned or modified versions. The CLA ensures that you retain the right to relicence contributions — essential for dual-licence models and for any future commercial licensing of community contributions. Without a CLA, every contributor to your model may have a veto over future licence changes.
A Contributor Licence Agreement (CLA) is a legal instrument signed by each person who contributes code, data, or other material to your model repository. It grants you (the model developer) a broad licence to use, modify, and relicence the contributor's work — typically including the right to sublicence it under a proprietary commercial licence. Without a CLA, contributors retain their own copyright in their contributions, which means you may not be able to relicence the model commercially without their consent.
CLAs are standard practice for open-source projects that operate a dual-licence or commercial model. They should be implemented at the start of the project — retroactively obtaining CLAs from existing contributors is possible but operationally complex. Tools like CLA Assistant automate CLA signing for GitHub-hosted projects and maintain signed agreements as part of the repository's compliance record.
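As an illustration of the tooling, CLA Assistant is typically wired into a GitHub-hosted repository as a workflow that blocks pull requests from unsigned contributors. The sketch below assumes the community-maintained contributor-assistant action; the action name, version, and input parameters should be verified against that action's current documentation before use.

```yaml
# Hypothetical workflow sketch — verify the action name, version, and
# inputs against the contributor-assistant documentation before adopting.
name: CLA Assistant
on:
  issue_comment:
    types: [created]
  pull_request_target:
    types: [opened, synchronize]

jobs:
  cla-check:
    runs-on: ubuntu-latest
    steps:
      - uses: contributor-assistant/github-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          # Where signed agreements are stored in the repository
          path-to-signatures: signatures/cla.json
          # The CLA text contributors are asked to agree to
          path-to-document: https://github.com/your-org/your-model/blob/main/CLA.md
          branch: main
```

Each contributor signs once, usually by posting an agreement comment on their first pull request, and the signed record is committed to the repository — which is what makes the signatures part of the compliance record rather than a side agreement.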
| Protection mechanism | What it protects | Strength for AI models | Time to protection | Cost | Applicability |
|---|---|---|---|---|---|
| Copyright | Code, docs, model card | Medium (not weights) | Immediate | Low (registration optional) | All models |
| Trade secret | Training data, methods, configs | High (proprietary) | Immediate (if confidential) | Low–Medium (security measures) | Proprietary / partial open |
| Patent | Novel training methods, architectures | Medium (if granted) | 2–5 years | High (£20k–£100k+) | Novel methods only |
| Contractual (licence clauses) | Use, derivatives, redistribution | High (enforceable) | Immediate | Medium (legal drafting) | All models |
| CLA | Contribution relicensing rights | High (if in place) | Immediate (per contributor) | Low (tooling available) | Open / dual-licence models |
| Database rights (EU) | Training dataset structure | Medium (dataset) | Immediate | Low | EU jurisdictions only |
A six-step framework for choosing your AI model licensing strategy
The licensing decision for an AI model is the intersection of IP strategy, commercial strategy, and legal risk management. There is no universally correct answer — but there is a process that produces the right answer for your specific model, business, and market. Apply these six steps before committing to a licence, not after the first version is released.
1. Define your primary business objective
Ecosystem and adoption? Community research? Commercial revenue? Identify the one objective that takes precedence over the others — your licence should optimise for it.
2. Map your commercial risk exposure
Who could use your model to compete with you? What is the commercial impact if a competitor fine-tunes your open model and offers a competing product? If the answer is "significant," Apache 2.0 is not the right licence.
3. Select the licence category
Choose between the four strategies based on your objectives and risk exposure. Define the specific permissions, restrictions, and commercial thresholds before drafting begins.
4. Draft the seven essential clauses
Use the clause checklist from Section 2. Engage legal counsel to draft a jurisdiction-appropriate licence — particularly the warranty disclaimer and liability limitation, which are jurisdiction-sensitive.
5. Implement CLA and IP protections before first release
Set up CLA tooling before the first external contributor commits. Document your trade secret posture. Assess patent filing viability for novel methods before publishing research papers that would bar protection.
6. Build a licence review schedule
AI licensing law is evolving rapidly. Schedule annual licence reviews aligned to EU AI Act implementation milestones. Track the licence strategies of comparable models. Build your commercial threshold monitoring process before you need it.
The goal is a licence that reflects your actual business model, creates enforceable obligations that match your IP strategy, and can withstand due diligence from investors, enterprise customers, and regulators. That is achievable for any AI model developer — from an individual researcher to a well-funded startup — with the right analysis done at the right time. The licence you attach to your model is not a legal formality: it is the foundation of your commercial and IP position for as long as the model exists in the world.