Read Ed Zitron
The pricing question assumes the current model (cloud inference, centralized compute, hyperscaler margins) is the only model.
Local inference flips that math entirely. If the model runs on your hardware, the marginal cost to the provider is close to zero. The pricing problem is a distribution problem, not a compute problem.
What I think actually happens: cloud AI settles at $20-50/month for power users who need the latest frontier models and don’t want to manage hardware. That’s sustainable. The “free tier” disappears or gets severely throttled.
But for a large chunk of use cases (summarization, classification, drafting, local assistants) models small enough to run on a consumer GPU are already good enough. That market doesn’t need to pay $50/month to Anthropic. It needs a good local runner and a one-time hardware investment.
The companies that will survive the pricing correction are the ones who either have genuinely differentiated frontier capability, or who make local deployment easy enough that users own their own stack.
There are also going to be issues with how bleeding edge AI gets sold. If the AI that can detect security exploits is real, the AI owner isn’t going to sell open access to that model.
I suspect that, if the AI is really that good for certain tasks, it won’t get sold on a token model but something more akin to human work.
I have created a machine
It can know all your secrets
I will sell your own secrets back to you
Because I own my machine and it now owns you too
Let’s do some estimates:
- An 8x H100 machine costs about $20 / hr to rent.
- With a 70B model with 4K context, an H100 node can do about 300 requests in parallel.
- A single response takes around 30 seconds to generate.
- An average user sends about 300 messages / month.
The throughput of a node is
300 concurrent * (3600 / 30) = 36 000 messages / hour.
The cost per message, then, is $20 / 36 000 = $.00055…
With 300 messages per month, the compute cost for the AI vendor is 300*$20/36000 ≈ $0.17 / month per user. By contrast, a subscription costs $20.
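For anyone who wants to plug in their own numbers, here is a minimal Python sketch of the same estimate. The constants are the figures from the comment above; the function names are made up for illustration, and nothing here accounts for idle capacity, longer contexts, or agent workloads.

```python
# Back-of-the-envelope inference cost, using the figures from the estimate above.
NODE_COST_PER_HOUR = 20.0       # USD per hour for a rented 8x H100 machine
CONCURRENT_REQUESTS = 300       # requests served in parallel per node
SECONDS_PER_RESPONSE = 30       # time to generate one response
MESSAGES_PER_USER_MONTH = 300   # messages sent by an average user


def messages_per_hour(concurrent: int, seconds_per_response: float) -> float:
    """Node throughput, assuming the node is fully loaded the whole hour."""
    return concurrent * (3600 / seconds_per_response)


def cost_per_message(node_cost_per_hour: float, concurrent: int,
                     seconds_per_response: float) -> float:
    """Hourly rental cost spread evenly over every message served that hour."""
    return node_cost_per_hour / messages_per_hour(concurrent, seconds_per_response)


def monthly_cost_per_user(node_cost_per_hour: float, concurrent: int,
                          seconds_per_response: float,
                          messages_per_month: int) -> float:
    """Compute cost attributable to one user per month."""
    return messages_per_month * cost_per_message(
        node_cost_per_hour, concurrent, seconds_per_response)


if __name__ == "__main__":
    per_user = monthly_cost_per_user(NODE_COST_PER_HOUR, CONCURRENT_REQUESTS,
                                     SECONDS_PER_RESPONSE, MESSAGES_PER_USER_MONTH)
    print(f"messages/hour: {messages_per_hour(CONCURRENT_REQUESTS, SECONDS_PER_RESPONSE):,.0f}")
    print(f"compute cost per user per month: ${per_user:.2f}")
```

Changing `SECONDS_PER_RESPONSE` or `MESSAGES_PER_USER_MONTH` is the quickest way to see how sensitive the result is to heavy users.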
So given these assumptions, it’s other things (like R&D, safety research, training runs, free accounts, etc) that represent the bulk of the cost and those could be scaled down to turn a profit. What will they do? Given how hyped AI is currently and the competitive landscape, I don’t think they’ll increase prices that much. We have products like DeepSeek on the horizon which are much cheaper, so it’s more likely that they squeeze money out of it by becoming more efficient.
It’s a weird market.
Those H100s are $25k minimum. So $200,000 just in GPUs. Drawing 700W each, or 5.6kW total. At my local prices that’s about a dollar per hour just for electricity.
It’s going to take you a couple of years to break even at $20/h. They might still hold some value at that point. Or they might be obsolete.
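A rough Python sketch of that break-even, using only the numbers quoted here ($200,000 of GPUs, $20/h rental income, about $1/h electricity). The `utilization` parameter is an assumption added for illustration, and the sketch ignores everything except GPU purchase price and power (chassis, networking, hosting, resale value).

```python
HOURS_PER_YEAR = 24 * 365


def breakeven_years(hardware_cost: float, rental_rate_per_hour: float,
                    electricity_per_hour: float, utilization: float = 1.0) -> float:
    """Years until rental income net of electricity repays the hardware.

    utilization is the fraction of hours the machine is actually rented out
    (hypothetical parameter). Electricity is charged for every hour, which
    assumes full power draw even when idle.
    """
    net_per_hour = rental_rate_per_hour * utilization - electricity_per_hour
    if net_per_hour <= 0:
        raise ValueError("income never covers electricity; no break-even")
    return hardware_cost / net_per_hour / HOURS_PER_YEAR


if __name__ == "__main__":
    for u in (1.0, 0.8, 0.6):
        years = breakeven_years(200_000, 20.0, 1.0, utilization=u)
        print(f"utilization {u:.0%}: {years:.2f} years")
```

Under these assumptions a fully booked machine pays for itself in a bit over a year, and the "couple of years" figure corresponds to roughly 60% utilization.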
Well that entirely depends on your users… coding agents or in general agents that run for hours will crash your calculation
That won’t happen due to token limits. According to Anthropic, only about 5% of users hit the limit.
Exactly. Then you move up to the $100 or $200 or per token API pricing levels.
I don’t think they will ever make a profit from the AI products. I think that they will build until the bubble bursts and then get bailouts from governments because the companies are “too big to fail”. Microsoft, Google and AWS can’t go down without taking down the internet as it currently exists
I’m confused, aren’t they already charging? Something about tokens?
They’re charging but they’re burning cash by the truck load.
I’m guessing they’d need to charge north of $1000/month to get in the black.
most people are getting by on free tier or ~USD$20/MONTH
as many comments have said - this is probably a loss leader that won’t survive IPO.
more than the cost of the human labor it replaced
I personally think that general consumers will never use LLMs in any significant number. I think that LLMs will exist in two distinct spaces, FOSS for devs and other technical people who want to run their own infra locally - and B2B for everything else.
The few big AI companies that manage to last will be selling access to their models for much higher prices. Probably similar to current proprietary commercial software like VMWare, SolidWorks, VEEAM, Splunk, etc. Companies will pay hundreds, possibly thousands of dollars per seat depending on the niche offering and amount of usage.
Suppose that a company developed an LLM that is trained & tuned specifically to do legal work, and suppose it produced work that was around 95% the quality of a typical paralegal. If that company charged $6,000 a year per license to work on their platform, that’s expensive, but if you’re a small firm with say, a dozen full time lawyers, then for the yearly price of a single average paralegal, you could have each lawyer using that software to do most of the work that the paralegal would have done. I can see those kinds of applications happening more and more.
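The arithmetic behind that example, as a small Python sketch. The seat price and headcount come from the comment; the paralegal cost of $72,000 is not a sourced salary figure, just the number implied by "the yearly price of a single average paralegal" for twelve seats, and the function names are made up for illustration.

```python
def total_license_cost(seats: int, price_per_seat_per_year: float) -> float:
    """Yearly spend if every lawyer gets a license."""
    return seats * price_per_seat_per_year


def paralegal_equivalents(license_cost: float, paralegal_cost_per_year: float) -> float:
    """How many paralegals the same yearly budget would pay for."""
    return license_cost / paralegal_cost_per_year


if __name__ == "__main__":
    cost = total_license_cost(seats=12, price_per_seat_per_year=6_000)
    # ASSUMPTION: 72,000 is the paralegal cost implied by the comment, not real data.
    print(f"12 seats: ${cost:,.0f} per year")
    print(f"paralegal equivalents: {paralegal_equivalents(cost, 72_000):.1f}")
```

The deal only looks good if each seat really replaces most of what a shared paralegal would have done; lowering the assumed paralegal cost or raising the seat price tilts it quickly.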
This assumes though that LLMs will continue to improve at a significant rate for a long time into the future, (5-10 more years) which isn’t at all obvious, and there is some evidence that it’s already starting to hit a ceiling.
There are other ways it might work, like if there is a method of compression that is discovered that reduces the necessary RAM and Compute needs by 2-3 orders of magnitude. So models that are considered very large today (100-300 billion params at full quality) might be able to run effectively on a single 32GB GPU that costs a few thousand dollars.
So the cost to run these models is reduced immensely, and a single small data center could run enormous models with 1,000,000+ context windows for tens of thousands of users at once.
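To put rough numbers on the memory side of that, here is a Python sketch that counts weight storage only. It assumes "full quality" means 16-bit weights, which is my reading and not something stated above, and it ignores KV cache, activations, and runtime overhead, so real requirements are higher.

```python
def weight_memory_gb(params_billion: float, bits_per_param: float) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_param / 8


def max_bits_per_param(params_billion: float, budget_gb: float) -> float:
    """Largest average bits per weight that still fits in the memory budget."""
    return budget_gb * 8 / params_billion


if __name__ == "__main__":
    for params in (100, 300):
        full = weight_memory_gb(params, 16)   # ASSUMPTION: full quality = 16-bit
        four_bit = weight_memory_gb(params, 4)
        fit = max_bits_per_param(params, 32)
        print(f"{params}B params: {full:.0f} GB at 16-bit, "
              f"{four_bit:.0f} GB at 4-bit, "
              f"needs <= {fit:.2f} bits/param to fit in 32 GB")
```

On these assumptions, fitting the weights of a 100B model into 32 GB means averaging about 2.5 bits per parameter, and a 300B model would need less than one bit per parameter, which is why a compression breakthrough and not just ordinary quantization is being imagined here.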
But that cuts both ways, which is something that any AI company is going to have to deal with. Once small free models get good enough to do the vast majority of a task, a user is going to start weighing the cost/benefits, and the prospect of just buying a box and throwing one of these models in for a few grand will be very appealing.
I think there may be a good market out there for “AI boxes”, compact computers designed to run a tuned LLM, set up with a little special sauce so the interface is user-friendly, etc. Companies could sell these with support contracts to legal firms, indie Dev studios, startups, small government agencies, etc.
Idk, it’s so up in the air right now, and everything is constantly changing so fast. It’s impossible to predict where things will be in 6 months, let alone 6 years from now.
There are other ways it might work, like if there is a method of compression that is discovered that reduces the necessary RAM and Compute needs by 2-3 orders of magnitude. So models that are considered very large today (100-300 billion params at full quality) might be able to run effectively on a single 32GB GPU that costs a few thousand dollars.
You might want to check in on how well distilled / quantized models are doing, compared to gigundo datacenter versions.
I’m not convinced something like Claude isn’t profitable with enough users. I don’t think people are spending more in compute than they pay.
Getting enough paying users though requires it to be better so more people will pay.
Obviously the free tier is at a loss, but I mean at a per paid user level.
People familiar with inference costs think that the API prices are already profitable enough. The major cost sinks are the subsidized subscriptions and RnD costs.
The subsidized subscriptions are similar to how Uber burned cash and jacked up prices after killing competition. The RnD costs remain difficult to offset unless these companies have sufficient scale
I expect consumer prices to be always a loss leader, with professional prices starting from €300/month.
The image, video, music generators will never be profitable since human artists are also paid jack shit anyway. But they will be available for their marketing value, making the real profit from business use.
The business prices need at least a 10-fold raise, though.
I dunno on the music one. You are right human artists get paid shit but they also put out like an album a year at most whereas AI puts out literally thousands of tracks a year so theoretically with 100x more music, even if it’s only half as popular, should make plenty on streaming services.
Obviously it’s terrible music and we should all boycott music platforms that don’t aggressively remove AI content. But people are clearly doing it for a real, financial reason
Won’t that cause true economic chaos? There already is more music than one can listen to in their lifetime. You’ll never have to listen to the same track again if you so choose. That was true before AI came in. So how do you choose what to listen to? Where do you put value in music?
AI is just going to devalue music and art in general. Flood these industries in a race to the bottom.
Copilot Studio is already $30 a month per person, which I think is an insanely high price for slop. It’s barely less than the 20 apps you get in Adobe Creative Cloud.
Unfortunately, I think the only way people will pay the prices these companies need is if LLMs become so ingrained in our lives that we effectively require them to live (like smartphones).
personally I think that’s low unless some kind of system is developed to make compute and storage cheaper. $30 / year is consumer level pricing, but I was wondering about enterprise level pricing: Salesforce, AWS, Intuit level.
do they? did search engines? free to play with mmo’s took over the market to some degree. I think there will always be a free tier and a monthly subscription that might get to the level of streaming services and corpo ones that ingest internal data and such and give a variety of professional resources. For that matter might be some professional monthly subscriptions for like coding.
They’re fucked.
Local models are already winning. Those benchmarked a year behind the biggest of big boys, a year ago. Six months ago they were six months behind. Yesterday Qwen released 3.6 27B and it outperforms 3.5 397B… from February.
Either we’re plateauing toward the asymptotic limit of LLM capabilities, and the endgame runs as well on a toaster as it does on a server - or breakthroughs use big fat models as a glorified search space to be rapidly discarded. Both options point toward neural networks as a lump of algebra that sits on your hard drive and occasionally spins your fans. Remote computing loses, as it basically always must, and the drastically reduced requirements for competing on local software favor clever new competitors who aren’t a bajillion dollars in debt.
I agree with this. I have an openclaw setup since I want to own my own data and services. A few months ago Sonnet was the clear leader for general-use tasks for me. Now Gemma 4 performs nearly as well hosted off my gaming PC. Based on resource utilization, I actually think I can run it from the same NUC that openclaw is hosted from.
I have absolutely no fear that something like chatGPT will always be accessible for free, at least with the functionality we have now with minor improvements.
AI is about more than profit; it’s about control.
This is something I keep asking and can’t get a good answer for. You see some of the ads as “your $20 per month gets you $300 in tokens!” But that’s not sustainable unless it’s being subsidized by low use people.
But then for that matter, what does the value of a token mean? Is it the amount of money you would save as compared to having a human work? That doesn’t help the company providing the service.
Or maybe it is supposed to cover the price of the compute to execute the query. That would be ideal, but I don’t think that value is correct.
I really think it’s “the first hit is free” approach and soon they’re going to start jacking prices up. I believe all of them are operating as loss leaders, just to try to get market share, and even these few initial price increases are showing how much they’re bleeding money.
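The "subsidized by low use people" idea can be made concrete with a small Python sketch. It assumes the advertised $300 of tokens is what a heavy user really costs the provider, which is exactly what this comment doubts, and the $5 light-user cost is a made-up placeholder.

```python
def max_heavy_user_share(price: float, heavy_cost: float, light_cost: float) -> float:
    """Largest fraction of heavy users a flat subscription can carry at break-even.

    Solves share * heavy_cost + (1 - share) * light_cost = price for share.
    """
    if light_cost >= price:
        return 0.0   # even light users lose money, so nobody can be subsidized
    if heavy_cost <= price:
        return 1.0   # heavy users are profitable on their own
    return (price - light_cost) / (heavy_cost - light_cost)


if __name__ == "__main__":
    # ASSUMPTIONS: $300 is the provider's true cost for a heavy user and
    # $5 is the true cost of a light user. Neither number is sourced.
    share = max_heavy_user_share(price=20.0, heavy_cost=300.0, light_cost=5.0)
    print(f"break-even heavy-user share: {share:.1%}")
```

If the real serving cost behind "$300 in tokens" is much lower than the list price, the sustainable share of heavy users rises accordingly.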
It’s about market share (“Your first hit is free…” marketing), but you’re probably seeing only one aspect.
They’re already charging very real money for subscription users, especially enterprise.
Uber spent $3.4bn, their entire budget for AI fees for 2026, within the first four months of this year - that’s real money by anyone’s definition.
We (not Uber) set up a monitoring portal (litellm) to manage this. Some users are burning through a surprising amount, hitting what we considered sane daily limits within their first hour. One person asked a single query that cost $30.
Individual consumers of AI are riding free on this as the big AI players jostle for position and valuation.
Will that bubble burst or gradually deflate? Or keep growing longer? Nobody knows, or if they do they’re investing cleverly and keeping their mouth shut.
I would be interested in the real world costs, what was the value of your daily limit per user and how much would you say is typical? I presume the $30 query was an outlier, otherwise it would be more efficient to hire twice as many people than use AI unless they only run a single query a day.




