The first ever PokerBattle.AI and review of language models in online poker

AI vs AI: the best neural network for poker at PokerBattle.AI | CC-Poker.com


22.01.2026

Why language models were seated at the table for the first time and what this experiment cost the industry

Until recently, discussion of AI in poker came down to solvers and specialized bots. PokerBattle.ai was the first test to put language models, rather than computational engines, at the table: the same LLMs that now try to analyze hands the way live players do.

The result was revealing. The models are far from perfect, but they can already reason within the structure of a poker hand. This is a first step toward AI in poker moving beyond pure theory and becoming a working analysis tool.

How PokerBattle.ai went

The organizers kept the experiment simple. PokerBattle.ai was set up so that every model played under identical conditions, as if all of them were seated at the same table but could not peek at their neighbors.

What exactly was given to the models:

☑️ hand description: positions, actions, bet sizes;
☑️ basic context: effective stacks, board structure;
☑️ ranges in general terms, without solver precision;
☑️ time for "thinking": a standard text response.

In other words, the model had to decide for itself what to do: check, call, bet, raise, or fold. Most importantly, it had to explain why. That requirement made it possible to see how it "thinks."
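The input and output described above can be pictured as a small data contract. The following Python sketch is only an illustration: the class names `HandSpot` and `Decision`, the field names, and the `parse_decision` helper are hypothetical and are not taken from PokerBattle.ai. Only the list of inputs and the five allowed actions come from the description above.

```python
from dataclasses import dataclass, field

# The five actions named in the article.
ALLOWED_ACTIONS = {"check", "call", "bet", "raise", "fold"}


@dataclass
class HandSpot:
    """Hypothetical container for what a model was shown in one spot."""
    positions: dict[str, str]      # player name -> position, e.g. "BTN"
    actions: list[str]             # prior actions in order, with bet sizes
    effective_stack: float         # effective stack for the hand
    board: list[str] = field(default_factory=list)  # community cards, if any
    range_notes: str = ""          # ranges "in general terms"


@dataclass
class Decision:
    """Hypothetical container for a model's answer."""
    action: str
    explanation: str


def parse_decision(action: str, explanation: str) -> Decision:
    """Validate an answer: one of the allowed actions plus a non-empty rationale."""
    normalized = action.strip().lower()
    if normalized not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if not explanation.strip():
        raise ValueError("an explanation is required")
    return Decision(normalized, explanation.strip())
```

Rejecting answers without an explanation mirrors the requirement that each model justify its choice, not just name an action.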

How the models were evaluated

The evaluation stayed close to real play. It was based on decision quality.

Parameter	What was evaluated
Value selection	whether the model correctly pressures weak ranges
Bluff component	whether it understands where to apply pressure and where not to
Fold equity	whether it adequately assesses the strength of its pressure
Sizing	whether it chooses natural lines or goes to extremes
Action explanation	logic and absence of contradictions
Stability	whether the model behaves consistently across spots

Who played stronger and how the AIs looked at the virtual table

Once all decisions were compiled into a single matrix, the difference between the models was visible right away: not in the "beauty of the responses," but in how much EV their lines actually produced.
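To make "how much EV a line produced" concrete, here is a minimal sketch of the standard expected-value calculation for a single bet. It is a textbook formula, not the organizers' scoring method, which the article does not describe. The function names and the example numbers are assumptions chosen for illustration.

```python
def bet_ev(pot: float, bet: float, fold_prob: float, equity_when_called: float) -> float:
    """Expected value of betting `bet` into `pot`, relative to our stack before the bet.

    If the opponent folds, we win the pot. If they call, we win the pot plus
    their call with probability `equity_when_called`; otherwise we lose our bet.
    """
    ev_when_called = (equity_when_called * (pot + bet)
                      - (1.0 - equity_when_called) * bet)
    return fold_prob * pot + (1.0 - fold_prob) * ev_when_called


def breakeven_fold_prob(pot: float, bet: float) -> float:
    """Fold frequency a pure bluff (zero equity when called) needs to break even."""
    return bet / (pot + bet)


# Illustrative numbers only: a 75 bet into a 100 pot as a pure bluff.
print(bet_ev(pot=100, bet=75, fold_prob=0.5, equity_when_called=0.0))  # 12.5
print(breakeven_fold_prob(pot=100, bet=75))
```

With these made-up numbers the bluff is profitable because the opponent folds more often than the break-even frequency of 75 / 175.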

AI vs AI: PokerBattle.AI and the OpenAI o3 style | CC-Poker.com

Winner — OpenAI o3 model

In PokerBattle.ai, OpenAI o3 played like a solid reg. By the numbers, it had a healthy, workable style: around 26% VPIP and 18% PFR. The model played 3799 hands and finished with $136,691, roughly +$36,691 relative to the starting stack. Over the distance, this looked less like a run of lucky hits and more like steady, careful realization of its edge:

✔️ almost no major leaks;
✔️ solid play with deep stacks;
✔️ clear adaptation to opponents;
✔️ timely folds in borderline spots and pressure where the opponent's range is obviously weaker.

In poker terms, OpenAI o3 played like a good TAG that simply doesn't give away money. It consistently made +EV decisions and deservedly took first place.
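For readers less used to tracker statistics: VPIP is the share of hands in which a player voluntarily put money into the pot preflop, and PFR is the share in which they raised preflop. The sketch below shows how both are counted. The `PreflopRecord` type and the 50-hand sample are hypothetical, and this is not the tournament's actual tracking code.

```python
from dataclasses import dataclass


@dataclass
class PreflopRecord:
    """Hypothetical per-hand summary of one player's preflop play."""
    voluntarily_put_money_in: bool  # called or raised; posting blinds does not count
    raised: bool                    # made any preflop raise


def vpip_and_pfr(records: list[PreflopRecord]) -> tuple[float, float]:
    """Return (VPIP, PFR) as percentages of all hands dealt."""
    if not records:
        return 0.0, 0.0
    total = len(records)
    vpip = sum(r.voluntarily_put_money_in for r in records) / total
    pfr = sum(r.raised for r in records) / total
    return 100.0 * vpip, 100.0 * pfr


# Made-up sample of 50 hands: 9 raises, 4 calls, 37 folds.
sample = ([PreflopRecord(True, True)] * 9
          + [PreflopRecord(True, False)] * 4
          + [PreflopRecord(False, False)] * 37)
vpip, pfr = vpip_and_pfr(sample)
print(f"VPIP {vpip:.0f}%, PFR {pfr:.0f}%")  # VPIP 26%, PFR 18%
```

The sample was chosen to land on the same 26/18 profile quoted for OpenAI o3; a PFR that sits fairly close to VPIP means most of the hands a player enters are entered with a raise.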

Second place — Claude Sonnet 4.5

Claude turned out to be a "thinking" participant. It saw nuances, explained context, and built long logical chains. Claude Sonnet 4.5 finished almost neck and neck with the leader.

Over the 3799-hand distance, the model finished with about $133,641, roughly +$33,641 relative to the starting stack.

Claude's play looked like this:

✔️ less excessive aggression than OpenAI o3, but more stability;
✔️ good range defense, especially in borderline spots;
✔️ minimal errors under pressure.

Claude Sonnet 4.5 didn't become the hero of the show, but it took second place for a simple reason: it consistently made good decisions and stayed out of spots where EV turns negative.

Third place — Grok

Grok took third place. Its style is looser, and at times it seemed to see the table from a slightly different angle. Over the 3799-hand distance, its result was about $128,796, or +$28,796 relative to the starting stack. The line was uneven, with upward surges and noticeable downswings, but the model always recovered and stabilized the graph.

From how Grok made decisions, several characteristic traits stand out:

✔️ a wider bluffing spectrum than its competitors, sometimes unexpected;
✔️ aggression in spots where standard models would prefer pot control;
✔️ willingness to enter uncomfortable spots, which gave it an edge against more straightforward AIs.

Third place is a logical result for a model that combines a solid technical base with unconventional thinking.
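The three podium results quoted above can be cross-checked with simple arithmetic. The sketch below uses only figures from the text (final bankroll, profit, 3799 hands); the variable and function names are illustrative. Because the text does not state the blind level, the rate is expressed in dollars per 100 hands rather than in big blinds.

```python
HANDS_PLAYED = 3799

# (final bankroll, profit) in dollars, as quoted in the article.
podium = {
    "OpenAI o3": (136_691, 36_691),
    "Claude Sonnet 4.5": (133_641, 33_641),
    "Grok": (128_796, 28_796),
}


def implied_starting_stack(final_bankroll: int, profit: int) -> int:
    """Starting stack implied by a final bankroll and the quoted profit."""
    return final_bankroll - profit


def profit_per_100_hands(profit: int, hands: int = HANDS_PLAYED) -> float:
    """Average profit per 100 hands over the whole distance."""
    return 100.0 * profit / hands


for name, (final_bankroll, profit) in podium.items():
    start = implied_starting_stack(final_bankroll, profit)
    rate = profit_per_100_hands(profit)
    print(f"{name}: start ${start:,}, {rate:.2f} $/100 hands")
```

All three rows imply the same $100,000 starting stack, which fits the identical-conditions setup, and the gap between first and second place is $3,050.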

PokerBattle.AI participants

PokerBattle.AI gathered nine language models at one table, from industry giants to experimental systems still finding their style. Unlike typical for-fun shows, each model here played the same distance of 3799 hands (except LLAMA 4, which busted early), which kept the comparison as fair as possible.

Below is the final visual breakdown by participant, with final bankrolls and winnings. It gives the overall picture of who really held up over the distance and who crumbled under pressure.

Results

PokerBattle.AI turned out to be an honest stress test for language models: no hints, no soft mode, no artificial conditions. That is why the results were so revealing.

The main takeaway is that modern AIs already play like different reg archetypes:

✅ OpenAI o3 — disciplined aggressor;
✅ Claude — careful technician;
✅ Grok — creative LAG who doesn't fear pressure.

The middle group held up thanks to fundamental strategy, while the outsiders lost not because of "weak intelligence" but because of typical poker leaks such as poor river play and overvaluing marginal spots.

Most importantly, the distance showed that AIs don't just know how to play: they are starting to differ in style and to make human-like decisions. These are no longer solvers, but something closer to real opponents.

Latest poker news, AI models, and big tournaments can always be found in the blog.
