AI Development general thread.

This compares the $20 Claude plan with the $20 Codex plan over the past 3 days. Claude has increased the limit to 88,000 from 44,000 per 5-hour window, and the weekly limit is also 50% higher.

We ran our own benchmarks and cost breakdowns. Copilot performed the best overall, but it makes sense why they’re moving to a token-based model.

1779136282399.png


You mostly care about compute value.

Max 5 is 200k per 5-hour window, and the session/sprint limit is 16.5.

I haven’t tried ChatGPT Pro ($100).

My Codex subscription ends on the 25th, and I had Claude last week. I’ll most likely keep both Claude ($20) and Codex ($20). I barely use 50% of the weekly usage on either. Now, with both, I’m barely touching 30%.

A coworker was complaining that Claude only allowed 4 prompts before hitting 100% usage on Max 5. He was vibe coding in most cases, had the context set to 1M, was using 600k+ tokens, and had 4 separate chats open. After explaining how billing/usage works, his Max 5 plan never reached the limit again. All we did was set Claude back to the default 200k context window and keep each chat around ~100k context. The quality improved as well. Glad we don’t work on the same project.
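A rough way to see why the context size matters so much, as a minimal sketch: it assumes every prompt resends the whole context as input and that usage is metered on that input. That is a simplification for illustration, not the documented billing formula. The function name is hypothetical, and caching and output tokens are ignored.

```python
def input_tokens_used(context_per_turn: int, turns: int, chats: int = 1) -> int:
    """Total input tokens if every turn resends `context_per_turn` tokens.

    Assumption: usage is counted on the full context sent with each prompt;
    caching discounts and output tokens are ignored.
    """
    return context_per_turn * turns * chats


# Numbers from the post: 600k+ context, 4 prompts, 4 chats open.
heavy = input_tokens_used(600_000, turns=4, chats=4)
# Same prompts and chats, but each chat kept around 100k context.
light = input_tokens_used(100_000, turns=4, chats=4)

print(f"heavy: {heavy:,} tokens")
print(f"light: {light:,} tokens")
print(f"ratio: {heavy / light:.0f}x")
```

Under those assumptions the same four prompts cost about six times less input once each chat is trimmed to around 100k.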
 
Your co-worker is basically running maxed out and complains he's hitting limits... lol.
 
That's the problem: stop using prompts and start using Claude skills. Tbh, only a n00b prompts Claude like a search engine or throws the book at it each time.
 
Anyone using this?

I re-created three projects, and compared to the originals the results are far better: fewer errors, smaller files, faster. I'm also not running out of credits as quickly as I used to.
 
Those token numbers seem pretty small. I mean, you say per session, but that looks minuscule in comparison to some of mine.
 

Code:
96.6M used · 1.84B in · 5.16M out

Code:
1.95B in 6.11M out

Code:
1.33B in 4M out

Those are the tokens on three projects I'm using it on. They also probably restart sessions.

But there's been a degradation in GPT models in the last week, so I ran a profanity sweep on my turns.

Code:
Updated with active sessions included (2026-03-15 to 2026-05-24).

  Active sessions (~/.codex/sessions) word counts:

  - ****ing: 346 (user 344, agent 2)
  - wtf: 248 (user 246, agent 2)
  - ****: 63 (user 60, agent 3)
  - ****: 61 (user 61, agent 0)
  - bullshit: 50 (user 50, agent 0)
  - crap: 21 (user 21, agent 0)
  - shitty: 15 (user 15, agent 0)
  - ****ed: 12 (user 12, agent 0)
  - ****: 9 (user 9, agent 0)
  - pissed: 1 (user 0, agent 1)

  Totals:

  - Active only: 826 hits across 744 messages
  - Archived only: 3 hits
  - Combined (dedup by filename): 829 hits across 747 messages

Tried to submit my stats to tokscale:

Code:
Error: Validation failed
    - Daily token total exceeds 10,000,000,000 on 2026-04-09: 20,735,354,861
    - Client codex on 2026-04-09: token total exceeds 10,000,000,000: 17,401,253,241

:D
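
A sweep like the one above could be sketched roughly as follows. This is not the script that produced those numbers. The function name and the `(role, text)` input shape are assumptions, and parsing the files under ~/.codex/sessions is left out because their format is not shown here.

```python
import re
from collections import Counter


def profanity_sweep(messages, words):
    """Count whole-word hits per role.

    `messages` is an iterable of (role, text) pairs (hypothetical shape).
    Returns (hits, flagged): hits maps word -> Counter of role -> count,
    flagged is the number of messages containing at least one hit.
    """
    patterns = {w: re.compile(rf"\b{re.escape(w)}\b", re.IGNORECASE) for w in words}
    hits = {w: Counter() for w in words}
    flagged = 0
    for role, text in messages:
        found = False
        for word, pattern in patterns.items():
            count = len(pattern.findall(text))
            if count:
                hits[word][role] += count
                found = True
        if found:
            flagged += 1
    return hits, flagged


# Made-up sample turns, only to show the output shape.
sample = [
    ("user", "wtf, this is crap. WTF again."),
    ("agent", "Understood."),
    ("user", "crap"),
]
hits, flagged = profanity_sweep(sample, ["wtf", "crap"])
for word, by_role in hits.items():
    total = sum(by_role.values())
    print(f"- {word}: {total} (user {by_role['user']}, agent {by_role['agent']})")
print(f"{flagged} messages with hits")
```

Counting per role is what makes the "user 344, agent 2" style split possible.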
 
Yeah, those numbers are similar to what I've seen, and there has definitely been some degradation, even in the CLI control: it ignores steering or simple stop commands and insists on carrying on with its own plan even when it's wrong.
 
So because the great Copilot migration is almost upon us, I have some nice ways to save tokenz, whilst still getting the benefit of AI.
OpenCode Go has been very interesting for a particular purpose, and that is running it effectively in a loop to do thousands of menial tasks.

In my case, we have a codebase that needs linting badly. So I have my eslint.js file and progressively turn on more rules. Every time I do that, I use a goal mode plugin for OpenCode that will basically loop until it is finished, then I instruct a cheapish Chinese model to go and progressively fix the problems the linter finds.
https://github.com/willytop8/OpenCode-goal-plugin


For example, atm I am doing some fairly advanced stuff with type inference. Currently using Mimo V2.5 pro.
1779711041420.png

Every time it fixes a few errors, it runs the linter, then the TypeScript compiler. I can then let it burn tokens whilst I do something else.
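
The loop described above can be sketched roughly like this. It is not how the goal plugin is implemented. `fix_until_clean`, `run_check` and `fix_step` are hypothetical names, and `fix_step` stands in for whatever asks the model to fix the next batch of errors.

```python
import subprocess
from typing import Callable, Sequence


def run_check(cmd: Sequence[str]) -> bool:
    """Run one check command; True means it exited with status 0."""
    return subprocess.run(cmd, capture_output=True).returncode == 0


def fix_until_clean(
    checks: Sequence[Sequence[str]],
    fix_step: Callable[[], None],
    max_rounds: int = 50,
    check: Callable[[Sequence[str]], bool] = run_check,
) -> int:
    """Call `fix_step` until every check passes; return the rounds used.

    Raises RuntimeError if the checks still fail after `max_rounds` fixes.
    """
    rounds = 0
    while not all(check(cmd) for cmd in checks):
        if rounds >= max_rounds:
            raise RuntimeError(f"checks still failing after {max_rounds} rounds")
        fix_step()
        rounds += 1
    return rounds


# Simulated run: three "errors", one fixed per round, no real commands executed.
remaining = {"errors": 3}


def fake_check(cmd: Sequence[str]) -> bool:
    return remaining["errors"] == 0


def fake_fix() -> None:
    remaining["errors"] -= 1


rounds = fix_until_clean([["lint"], ["typecheck"]], fake_fix, check=fake_check)
print(f"clean after {rounds} rounds")
```

In a real project the checks might be something like `["npx", "eslint", "."]` followed by `["npx", "tsc", "--noEmit"]`, assuming both tools are installed locally. The `max_rounds` cap is there so an unattended run cannot burn tokens forever.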
 
Yours I get. But why is your agent saying ****ing, wtf, **** and pissed - and how did you get it to do that?
 
I think he's tracking his reply. I tell you, agent stress toys where you can squeeze, smack and it gives "tactile" feedback to your agent, will be the next big thing 😅
 
Oh, I've been there alright. The worst one was when I went on an expletive-ridden rant for three paragraphs, and the agent responded with "Understood." Fcking things are growing a cheek and learning to troll...
 
You joke, but it certainly picks up on a user's tone. Some non-devs ask their Codex a question, and if I ask the same thing it responds totally differently.

And I'm nearly sure it messes with them often :ROFL:
 
Just codex with 3 $20 accounts (2 paid and 1 free) albeit this is since I installed codex-lb almost 2 weeks ago.
1779722312548.png

If I count OpenCode and Claude :P

Edit: Since I reinstalled everything on this machine about 6 months ago.
1779723846811.png

Detailed usage :
1779723784272.png

Using https://github.com/junhoyeo/tokscale
 
Yeah, I'm using my swearing at it as a metric for model degradation. Last week was mad.
 
Watch out for codex-lb: the round-robin hop has triggered account cancellations, and it happened to a few of mine. I landed up building my own load balancer.

This is my codeburn stats

1779731063146.png

That's since around Feb.
 
Codex goal result:

Goal marked complete. Final tracked usage: 20,555,416 tokens over about 18.0 hours of elapsed goal time.
 