A step-by-step framework for pricing an MCP server in 2026. Calculate your cost floor, pick the right model (per-call, subscription, tiered, freemium, outcome-based), benchmark against OpenAI, Anthropic, and Stripe, and avoid the psychological traps that lead to underpricing.
Pricing a traditional API is a solved problem. You have humans on one side, a SaaS dashboard or a webhook integration on the other, and a roughly predictable usage pattern. You can copy a competitor's pricing page, run an A/B test against sign-ups, and iterate over months.
Pricing an MCP server is different. The buyer is an AI agent — often stateless, often low-trust, and governed by a human budget the agent must stay under. The usage pattern is bursty: an agent may call your tool five times in one second while pursuing a goal, and then not at all for three days. The discovery path is automated: an agent sees your tool's price in a registry or in a payment/quote response and decides, in one round-trip, whether to use you or a competitor. Your price page has to be machine-readable, the units have to map cleanly onto how agents think about cost, and the entire pricing surface has to be stable enough that an agent's internal planner can trust it.
This lesson walks through the end-to-end pricing decision for an MCP tool: how to compute a defensible cost floor, how to pick between per-call, subscription, tiered, freemium, and outcome-based models, how to benchmark against the real ecosystem (OpenAI, Anthropic, Stripe), how to think about pricing psychology for a buyer that has no emotions, and how to use dynamic pricing responsibly. By the end, you should be able to ship a price, defend it in a founder conversation, and change it without breaking agent clients.
A price below your cost floor is a subsidy. A price barely above your cost floor is a margin trap — one AWS bill spike and your tool is unprofitable. The first move is to know your floor precisely.
For most MCP tools, the cost floor is the sum of four line items per call: model inference (if you wrap an LLM), infrastructure (compute, bandwidth, storage), third-party API fees (if you call someone else's paid service), and settlement (payment processing fees). Walk through each in order.
If your tool calls Claude, GPT, or any paid model under the hood, model cost usually dominates everything else. At current Anthropic API pricing, a Sonnet 4.6 call with 4K tokens in and 1K tokens out costs about (4 × $3 + 1 × $15) / 1000 = $0.027. The same shape on Haiku 4.5 costs about $0.009, and on Opus 4.7 (at $5 / $25 per MTok) about $0.045. If you're a Sonnet wrapper charging 5 cents per call, your inference margin is about 45%; at 3 cents, the sliver left over inference disappears as soon as infrastructure and settlement fees come in. Prompt caching (cache reads at $0.30 / MTok on Sonnet) and the batch API's flat 50% discount on asynchronous jobs can cut this floor further, but only if your workload tolerates the caching TTL or the batch latency.
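To make that arithmetic reproducible, here is a minimal TypeScript sketch using the per-MTok rates quoted in this lesson; the model labels and numbers are just the figures above, so substitute whatever your provider's pricing page says when you run your own calculation.

```typescript
// Per-MTok rates quoted in this section (USD); replace with current published pricing.
const RATES = {
  "sonnet-4.6": { inputPerMTok: 3, outputPerMTok: 15 },
  "haiku-4.5":  { inputPerMTok: 1, outputPerMTok: 5 },
  "opus-4.7":   { inputPerMTok: 5, outputPerMTok: 25 },
} as const;

type ModelName = keyof typeof RATES;

// Per-call inference cost for a given token shape.
function inferenceCostUsd(model: ModelName, inputTokens: number, outputTokens: number): number {
  const r = RATES[model];
  return (inputTokens / 1e6) * r.inputPerMTok + (outputTokens / 1e6) * r.outputPerMTok;
}

console.log(inferenceCostUsd("sonnet-4.6", 4_000, 1_000)); // ≈ 0.027
console.log(inferenceCostUsd("haiku-4.5", 4_000, 1_000));  // ≈ 0.009
console.log(inferenceCostUsd("opus-4.7", 4_000, 1_000));   // ≈ 0.045
```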
A lightweight Node.js handler on a serverless platform costs fractions of a cent per call at low volume. But if your tool does real work — a Playwright browser session, a Postgres query on a large table, a vector search over an index — infrastructure can easily match or exceed model cost. Measure per-call cost at your actual deployment, not from a napkin estimate. Instrument an invocation counter and divide your monthly bill by the month's call volume; the result is almost always higher than you expect.
If you call Google, OpenAI, a financial-data provider, or any paid upstream service, their per-call rate is part of your floor. Don't assume you can pass that cost through at-cost: payment processing fees come off the top, and you still need margin for your own work. The rule of thumb is to charge at least (upstream cost + 20%) + settlement fee + your margin.
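As a sketch, the rule of thumb reduces to a one-line formula; the inputs below are illustrative placeholders, not real rates.

```typescript
// Pass-through rule of thumb from the paragraph above: upstream cost plus a
// 20% buffer, plus the amortized settlement fee, plus your own margin.
function minimumPriceUsd(upstreamUsd: number, settlementUsd: number, marginUsd: number): number {
  return upstreamUsd * 1.2 + settlementUsd + marginUsd;
}

// A $0.10 upstream call, ~$0.002 amortized settlement, and a 2-cent margin:
console.log(minimumPriceUsd(0.10, 0.002, 0.02)); // ≈ 0.142; pricing below this erodes your own margin
```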
Stripe charges 2.9% + 30¢ for a US card and 0.8% for ACH direct debit (capped at $5 per transaction), per their pricing page. The 30-cent fixed fee is a catastrophe for sub-dollar charges: settled individually, a 5-cent call pays Stripe roughly six times what it earns. This is why MCP billing platforms settle in batches: consumers prepay into a balance, agents draw against it, and the platform batches those draws into larger Stripe charges so the 30¢ floor gets amortized across thousands of sub-cent operations. When you pick a billing layer, check whether it batches for you or leaves you to build the batching pipeline yourself.
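A quick sketch of why batching matters, assuming Stripe's published US card rate of 2.9% + 30¢ and a hypothetical $20 prepaid top-up; the figures are illustrative, not a billing integration.

```typescript
// Stripe's published US card rate: 2.9% + $0.30 per charge.
function stripeCardFee(chargeUsd: number): number {
  return chargeUsd * 0.029 + 0.30;
}

const pricePerCall = 0.05;

// Settled individually: the fee dwarfs the revenue.
const individualFee = stripeCardFee(pricePerCall); // ≈ $0.30 fee on a $0.05 charge

// Batched: a $20 top-up funds 400 five-cent calls, amortizing the fixed fee.
const topUp = 20;
const callsFunded = topUp / pricePerCall;              // 400
const feePerCall = stripeCardFee(topUp) / callsFunded; // ≈ $0.0022 per call

console.log({ individualFee, feePerCall });
```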
Add those four numbers and you have your cost floor per call. For a typical LLM-wrapper tool, it tends to land between $0.01 and $0.08. For a pure-compute tool with no third-party cost, it can be under $0.005. For a tool that calls a premium data provider at list price (Bloomberg, Clearbit), it can exceed $0.50. Know your number before you name your price.
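Putting the four line items together, here is a minimal sketch of the floor calculation; the numbers are placeholders for a Sonnet wrapper on serverless infrastructure with batched settlement, so plug in your own measurements.

```typescript
// The four per-call line items that make up the cost floor.
interface CostFloorInputs {
  inferenceUsd: number;  // per-call model cost (0 if you don't wrap an LLM)
  infraUsd: number;      // monthly infrastructure bill / monthly call volume
  upstreamUsd: number;   // third-party API fee per call
  settlementUsd: number; // amortized payment-processing fee per call
}

function costFloorUsd(c: CostFloorInputs): number {
  return c.inferenceUsd + c.infraUsd + c.upstreamUsd + c.settlementUsd;
}

// Illustrative example: Sonnet wrapper, serverless infra, batched settlement, no upstream fee.
const floor = costFloorUsd({
  inferenceUsd: 0.027,
  infraUsd: 0.002,
  upstreamUsd: 0,
  settlementUsd: 0.0022,
});
console.log(floor.toFixed(4)); // ≈ 0.0312; price below this and every call is a subsidy
```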
There is no universally correct pricing model for an MCP server. There are five models the ecosystem has converged on, each with a usage pattern that makes it shine.
One call, one charge, one price. This is the default for a reason: it maps exactly onto how agents consume tools. A reasoning loop picks up a tool, invokes it, pays, and moves on. No commitment, no minimum, no end-of-month surprise. See the per-call billing guide for the operational mechanics; the pricing move is simply to pick a price per invocation, expose it in the experimental.payment capability or your listing, and let agents self-select against their budget. Per-call is the right default for any tool where each invocation is a discrete unit of value and cost is roughly uniform across calls.
A monthly fee for unlimited access (or access up to a soft cap). Subscriptions work for tools where the caller can predict usage well — for example, a coding-assistant tool where the pattern is "developer uses this for 8 hours a day, five days a week." Subscriptions fail when usage is unpredictable or bursty, because the operator either overpays for unused capacity or hits a ceiling and can't complete the task. For MCP specifically, subscriptions are a hard sell: agents don't plan for calendar intervals, they plan for goals.
Different prices for different calls to the same tool. A search method might cost 1 cent; a deep_research method that runs a full Sonnet analysis might cost 25 cents. Tiered pricing is right when your tool exposes a range of operations with very different underlying costs, and the caller can reasonably predict which tier a given invocation needs. It's wrong when the cost is hidden behind a single method signature and the caller can't tell which tier they'll land in — agents that can't preview cost will avoid your tool entirely.
Some calls are free; others are paid. The SettleGrid SDK supports this natively: set costCents: 0 on introductory methods, positive costs on the real-work ones. Freemium is the right model when discovery matters more than revenue on the first call — when an agent needs to verify your tool works before committing. The MCP server free-tier guide walks through the configuration in detail. Freemium is wrong when every call has non-trivial cost and free usage will be abused by scraping or benchmarking bots; in that case a tiny paid floor (sub-cent, with a pre-funded balance requirement) is a better filter.
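As a configuration sketch covering both the tiered and freemium patterns: the only detail taken from this lesson is that the SettleGrid SDK prices methods via costCents, with 0 meaning free. The method names and the shape of the object are illustrative, not the SDK's actual API.

```typescript
// Hypothetical pricing configuration. costCents per method is the knob the
// lesson describes; everything else here is an illustrative shape, not a spec.
const pricing = {
  methods: {
    ping:          { costCents: 0 },  // free: lets an agent verify the tool before committing
    search:        { costCents: 1 },  // cheap tier: simple lookup
    deep_research: { costCents: 25 }, // expensive tier: full model-backed analysis
  },
};

export default pricing;
```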
Charge only when your tool succeeds against a concrete, machine-verifiable outcome. A compliance-check tool charges only when it finds a violation; a deduplication tool charges only on matches returned. Outcome-based is the hardest model to implement — it requires an unambiguous success signal that both sides trust — but it's the most agent-friendly option for tools with uncertain success rates. Agents prefer it because the worst case of a failed call is wasted latency, not wasted dollars.
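A sketch of the outcome-based pattern, assuming a billing layer that lets the handler decide the charge after the work completes. The chargeCents helper and the violation scanner are hypothetical; the pattern is what matters: bill only when the machine-verifiable success condition is met.

```typescript
interface ReviewResult {
  violations: string[];
}

async function reviewDocument(
  doc: string,
  scan: (doc: string) => Promise<ReviewResult>,      // the actual analysis
  chargeCents: (amount: number) => Promise<void>,    // hypothetical billing hook
): Promise<ReviewResult> {
  const result = await scan(doc);
  if (result.violations.length > 0) {
    await chargeCents(150); // $1.50 only when there are findings
  }
  // Clean reviews cost the caller nothing; the worst case is wasted latency.
  return result;
}
```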
For most first-time MCP monetizers, per-call is the right starting point. You can switch models later without breaking callers, as long as your billing layer makes switching easy — pricing changes that require redeploying the tool or reshaping the MCP interface are the ones that break agent clients.
Copying a competitor's price is lazy, but copying an entire pricing ecosystem is smart. Three benchmarks matter for most MCP tool developers: what LLM providers charge, what payment rails cost, and what tools in your specific category are earning.
If your tool wraps an LLM, your floor is set by the underlying model. Anthropic's current API pricing lists Claude Opus 4.7 at $5 / MTok input and $25 / MTok output, Sonnet 4.6 at $3 / $15, and Haiku 4.5 at $1 / $5. Prompt caching reads are about an order of magnitude cheaper (Sonnet reads at $0.30 / MTok), and the batch API offers a flat 50% discount on asynchronous jobs. OpenAI publishes comparable tiers for GPT-5, GPT-4o, and the o-series at openai.com/api/pricing. Check both pages at the time you're pricing — rates change every few months, and a stale number is worse than no number. The useful move isn't to copy inference rates; your customers aren't calling you to do raw inference, they're calling you for packaged work. The move is to be able to answer: "why does your tool cost more, or less, than running the model directly?"
Stripe is the floor for fiat payment processing: 2.9% + 30¢ per US card transaction, + 1.5% for international cards, 0.8% with a $5 cap for ACH direct debit, per Stripe's pricing page. That 30-cent per-transaction floor is why micropayments have historically been impossible — you can't charge 5 cents for a call when the processor takes 30 cents to handle the charge. The MCP ecosystem works around this with pre-funded balances and batched settlement; the MCP billing comparison breaks down how different platforms handle it. If you build billing yourself without batching, the Stripe floor forces your minimum practical per-call price up to about $1 before margins make sense — which is why almost no serious MCP monetization happens outside a platform that amortizes the fixed fee.
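To see where the "about $1" figure comes from, here is a small sketch assuming the 2.9% + 30¢ card rate and an arbitrary goal of keeping roughly 70% of each charge after processing; both figures are illustrative.

```typescript
// Net revenue after Stripe's 2.9% + $0.30 US card fee, with no batching.
function netAfterStripe(priceUsd: number): number {
  return priceUsd - (priceUsd * 0.029 + 0.30);
}

for (const price of [0.05, 0.25, 1.0, 1.25]) {
  const kept = netAfterStripe(price) / price;
  console.log(`$${price.toFixed(2)} per call -> keep ${(kept * 100).toFixed(0)}% after fees`);
}
// $0.05 -> keep -503% (you pay to be paid)
// $0.25 -> keep -23%
// $1.00 -> keep 67%
// $1.25 -> keep 73%; roughly where per-call margins start to make sense
```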
The going rate for an MCP tool in your specific category is the most predictive benchmark you have. From the distribution in the per-call pricing benchmarks table, data-enrichment tools typically price between $0.02 and $0.50 per call with a median near $0.08; web search sits lower, median near $0.03; code analysis runs between $0.05 and $1.00 with a median near $0.15. These aren't rules — they're the observed distribution of tools that are actually earning revenue today. If your tool is 3× the category median, you need a clear value story to support the premium. If it's 10× below, you're probably leaving money on the table — or your tool isn't as differentiated as you think it is.
Traditional pricing psychology — charm pricing ($9.99 vs $10), anchoring, loss aversion — assumes a human reader making a fast subconscious decision. AI agents don't have a subconscious. But they do have pricing behaviors that feel like psychology, and getting them wrong is expensive.
$0.05 is parsed, stored, and reasoned about more reliably than $0.049. Agent planners that compute expected cost over a multi-tool plan round and truncate; they don't appreciate the anchoring move, and some tokenize the extra digit as noise. Pick round numbers in your smallest practical unit and stop there.
An agent that needs to stay under a $5 budget strongly prefers a tool with a published per-call ceiling over a tool whose cost varies with input length (unless that input is trivially bounded). If your cost is variable, either publish the upper bound in your listing or offer a payment/quote method that returns an exact number before commitment. Both moves make you selectable by planners that otherwise skip variable-cost tools entirely.
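A hypothetical shape for such a quote response; the field names are invented for illustration, not a spec. The point is that a planner gets an exact or upper-bounded number before committing budget to the call.

```typescript
interface PriceQuote {
  toolName: string;
  maxCostCents: number;       // hard ceiling the charge will never exceed
  estimatedCostCents: number; // expected cost for this specific input
  expiresAt: string;          // how long the quoted price is honored (ISO 8601)
}

function quoteForInput(inputChars: number): PriceQuote {
  // Example: cost scales with input length but is capped, so the ceiling is publishable.
  const estimated = Math.min(5 + Math.ceil(inputChars / 10_000), 25);
  return {
    toolName: "deep_research",
    maxCostCents: 25,
    estimatedCostCents: estimated,
    expiresAt: new Date(Date.now() + 5 * 60_000).toISOString(),
  };
}

console.log(quoteForInput(50_000));
```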
An agent compares your 5-cent tool against a competitor's 6-cent tool not by price alone but by (price / success_rate). A 5-cent tool that fails 30% of the time is effectively 7 cents per successful call — worse than a 6-cent tool with a 100% success rate. This is why outcome-based pricing has a psychological edge: it guarantees a success-adjusted price that planners can reason about without knowing the failure rate.
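The comparison planners make fits in two lines; the numbers are the ones from the paragraph above.

```typescript
// Success-adjusted price: what a call actually costs per successful result.
function effectiveCostCents(priceCents: number, successRate: number): number {
  return priceCents / successRate;
}

console.log(effectiveCostCents(5, 0.7)); // ≈ 7.1; the nominally cheaper tool loses
console.log(effectiveCostCents(6, 1.0)); // 6
```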
Below a penny per call, agents are willing to call a tool they haven't verified — the downside is capped by the caller's balance. Above a dollar, most agents route through a higher-trust channel (a known vendor, a cached result, a human confirmation step). The tier-break is category-dependent but the pattern is universal: cheap tools can ride pure discovery; expensive tools have to earn trust separately. Price above that threshold only if you're also willing to do the trust-building work.
Agents don't feel shame when they pay $2.50, but the human operator reviewing agent bills at month-end does. The agent-for-the-human dynamic means your price must survive a post-hoc review by a skeptical finance person. If a reasonable reviewer would flinch at the number, expect pushback — and design for the reviewer, not only the agent.
Dynamic pricing is the move that separates mature monetization from entry-level. It also burns developers who deploy it too early. The simple test: dynamic pricing is worth the complexity if your tool has enough call volume that you can detect price-response signal within a week, and if your cost has enough variance that a single price can't cover all cases efficiently.
For most first-year MCP tools, that test fails. A tool with 500 calls a month doesn't have the statistical power to detect a 5-cent price-change effect with any confidence; the noise from normal usage variability dominates. Pick a single defensible price, run the tool for three to six months, and revisit.
When the test passes, three dynamic-pricing moves are worth considering.
Run an A/B test: 50% of traffic sees price A, 50% sees price B, for 48 to 72 hours. Measure total revenue (price × volume) for each variant. Most developers find their revenue-maximizing price is 20 to 50% higher than their initial guess — first prices tend to be pegged to cost floor plus a thin margin rather than to value delivered, and the market tolerates more than you think.
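A minimal sketch of the revenue comparison; the traffic split and measurement window come from the text above, and the call counts are made up for illustration.

```typescript
interface Variant {
  priceCents: number;
  calls: number; // calls observed during the test window
}

function revenueCents(v: Variant): number {
  return v.priceCents * v.calls;
}

const a: Variant = { priceCents: 2, calls: 10_000 };
const b: Variant = { priceCents: 5, calls: 8_000 };

console.log(revenueCents(a)); // 20,000 cents = $200
console.log(revenueCents(b)); // 40,000 cents = $400; higher price wins despite lower volume
```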
If your tool's cost varies predictably by time (infrastructure more expensive at peak cloud hours, or an upstream API with peak-tier pricing), reflect that in your price. Planners that care about cost will route around peak by deferring non-urgent work; planners that care about latency will pay the premium. Both are fine outcomes, and the price signal does the scheduling for you.
Different prices for different consumer types — free-tier for individual developers, premium-tier for enterprise agents with higher usage. This is tiered pricing applied to identity rather than method. It requires knowing who's calling, which in practice means an API key scheme that distinguishes consumer types.
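A sketch of the lookup, assuming your key-issuance flow can tag each key with a consumer tier; the tiers and prices here are illustrative.

```typescript
// Consumer-type tiers resolved from the API key at call time.
type ConsumerTier = "free" | "standard" | "enterprise";

// Illustrative per-call prices by tier (cents).
const PRICE_CENTS: Record<ConsumerTier, number> = {
  free: 0,       // individual developers evaluating the tool
  standard: 5,
  enterprise: 4, // negotiated volume rate for high-usage agents
};

function priceForCallCents(tier: ConsumerTier): number {
  return PRICE_CENTS[tier];
}

console.log(priceForCallCents("enterprise")); // 4
```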
Dynamic pricing that over-rotates destroys trust. If your price changes every hour in unpredictable ways, agent planners stop caching your quoted price and start avoiding your tool; or worse, human operators get flagged bills they can't explain. The rule is: dynamic pricing should be explainable in one sentence to a non-technical stakeholder. If you can't explain it, don't ship it.
Here are three illustrative pricing patterns drawn from common MCP-ecosystem dynamics. Each sketch is a composite teaching example rather than a specific named tool — the numbers and behaviors are the kind of thing that plays out in practice, not a claim about any one real deployment.
A sentiment-analysis tool that started at 2 cents. The developer's cost floor was about 1 cent (Haiku inference plus a few tenths of a cent for infrastructure). They launched at 2 cents per call, expecting thin margins but high volume. Adoption was decent but revenue was stuck at a ceiling. An A/B test comparing 2 cents vs 5 cents showed volume dropped by about 20% at 5 cents, but revenue grew by about 90%. They settled at 4 cents as the revenue-maximizing point and doubled their monthly earnings without changing the tool at all. Lesson: your first price is almost always too low.
A legal-document-review tool priced per outcome. Inference cost per review was ~$0.60 (Opus, long-context). A flat per-call price of $1 would have been viable, but the tool had a 25% "no violations found" outcome where the caller derived no actionable value. The developer went outcome-based instead: $1.50 on findings, $0 on clean reviews. Volume tripled (agents routed more work through the tool because the downside was zero), and revenue came out well ahead of the flat-$1 alternative even though only 75% of calls generated a charge. Lesson: aligning price with value delivered can grow the pie, not just redistribute it.
A geolocation enrichment tool that tried freemium. The first 100 calls per key were free, with additional calls at 1 cent. The tool got heavy use from evaluators and benchmarking bots that ran exactly 100 calls per key and never upgraded; paying users accounted for less than 15% of volume. Switching to a 0.5-cent-per-call model with no free tier cut "total calls" in half but increased paid revenue by 60%. Lesson: freemium is a discovery strategy, not a revenue strategy; if your tool is already discoverable through listings and SDK docs, freemium may be pure overhead. (The MCP billing comparison explores how different platforms support freemium and when it makes sense.)
Pricing an MCP tool is a repeatable exercise, not a one-off creative act. The playbook is: compute your cost floor from inference, infrastructure, upstream fees, and settlement; pick the model that matches your usage pattern (per-call by default); benchmark against the model providers, the payment rails, and your category's going rate; price for the agent planner and for the human who reviews its bills; and revisit the number once you have enough volume to measure price response.
You can browse how real tools in the ecosystem are pricing by visiting the SettleGrid shadow directory and sorting by category — the listed per-call prices are live data, not marketing numbers.
One final implementation note, independent of which billing platform you use: switching pricing models should be a configuration change, not a code change. If your billing layer forces you to redeploy the tool or reshape the MCP interface every time you want to try per-call vs tiered vs freemium, you will run fewer experiments than you should, and you will settle on a worse price. Whatever stack you pick — build it yourself, wrap a platform, or stay on fixed per-call forever — make pricing iteration cheap. Your first price is almost never your final price, and the developers who iterate fastest on pricing are the ones who earn the most over the first year.
SettleGrid lets you try any pricing model — per-call, tiered, freemium — with two lines of code, and switch without redeploying. Free tier is production-ready: 50K operations/month, 0% take rate on your first $1K/mo of revenue.