A plain-language log of what's been built, what works, what doesn't, and what's next.
Last updated: June 2026
What ThinAI Is
ThinAI is a system that learns to play arbitrary turn-based games from a natural-language description of the rules. It runs entirely on a single laptop — no cloud APIs, no pre-trained models, no GPU clusters. The system parses game rules into a structured format, generates evaluation features automatically, and learns through self-play.
The core question: can classical AI techniques (search, evaluation learning, structured representation) match or exceed what neural approaches do for game learning — at a fraction of the compute?
Architecture
Natural language parser — converts plain English game descriptions into a Game Description Language (GDL) JSON format
Game engine — loads any GDL spec and handles legal moves, state transitions, and win/draw detection
Auto-feature discovery — analyzes GDL structure to generate evaluation features without game-specific knowledge
Minimax search with alpha-beta pruning and selective deepening — at depth 1-2, considers all moves; at depth 3+, focuses on the top 8 most promising moves (like a human ignoring obviously bad options)
Sampling-based search — for card games with hidden information, samples possible opponent hands and evaluates moves across multiple possible worlds
Adaptive effort allocation — adjusts search depth per position based on branching factor (how many choices are available) and a node budget
Self-play training — learns feature weights through win/loss reinforcement over ~40 games, similar to how a human improves by playing practice matches
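The adaptive effort allocation step above can be pictured as choosing the deepest search whose full tree still fits the node budget. This is a minimal sketch under assumptions: the function name `pick_depth`, the `max_depth` cap, and the `b^d` node estimate are illustrative and not taken from the engine.

```python
def pick_depth(branching_factor: int, node_budget: int, max_depth: int = 6) -> int:
    """Return the deepest depth d (at least 1) with branching_factor**d <= node_budget.

    Hypothetical helper: it estimates a full-width tree as b**d nodes and ignores
    any savings from alpha-beta pruning or selective deepening.
    """
    if branching_factor <= 1:
        return max_depth  # forced moves are nearly free, so search as deep as allowed
    depth = 1
    while depth < max_depth and branching_factor ** (depth + 1) <= node_budget:
        depth += 1
    return depth


# Budgets from Key Numbers: 2,500 nodes for board games, 1,500 for card games.
print(pick_depth(7, 2500))   # Connect Four, 7 columns
print(pick_depth(7, 2000))   # the same position under a 2,000-node budget
print(pick_depth(30, 1500))  # a wide card-game position
```

With 7 legal moves, a 2,500-node budget allows depth 4 (7^4 = 2,401), while a 2,000-node budget allows only depth 3.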
Phase Status
| Phase | Description | Status |
| --- | --- | --- |
| 0 | Foundations: GDL design, stack, initial games | done |
| 1 | Core learning: minimax, features, self-play training | |
| | Polish: chess support, opponent modeling in training, partnership games | planned |
Key Numbers
22 playable games across 14 categories
418+ automated tests (including targeted regression tests for C4 and Checkers)
~12,000 lines of engine code
~2,700-word Scrabble dictionary
40 training games to reach competence (board games)
< 2 seconds per AI move (typically under 1 second)
1,500 / 2,500 max nodes per move (card / board games)
350+ game objects recognized by the parser (58 with SVG icons)
0 GPUs required
Recent Changes (June 2026)
Cribbage (new game): Full 2-player Cribbage — deal 6, discard 2 to crib, cut starter (his heels), pegging toward 31 with pairs/runs/15s/go, hand scoring (15s, pairs, runs with multiplicity, flush, nobs), crib scoring, dealer alternation, race to 121. Custom UI with peg count display, play area, Go button, and phase-aware prompts.
Generic card scoring module: New reusable engine/gdl/card_scoring.py with primitives for any card game: find_subsets_summing_to() (card subsets hitting a target value), count_rank_pairs(), count_rank_run_points() (cross-suit runs with duplicate multiplicity), count_flush(), and sequential play scoring (sequential_play_pairs/runs()). Configurable rank values (Cribbage A=1 vs standard A=14).
Cross-suit runs: melds.py find_runs() gains same_suit and rank_values parameters (backward compatible). Enables run detection across suits for Cribbage and similar games.
Score-racing features: Auto-detected for any game with score state variables — score_racing_lead, score_racing_progress, is_dealer. Benefits Cribbage, Canasta, and future score-target games without game-specific code.
Training regression fixes: Two regressions from the training stability commit (f06cb35) fixed:
Connect Four depth: Board game node budget restored to 2500 (was 2000). C4 needs depth 4 to see 3-in-a-row blocking threats (7^4 = 2,401 nodes).
Checkers nosedive: Games with hand-crafted features (Checkers, Mancala, Reversi) restored to independent ReasonerOpponent instead of snapshot-of-self. Snapshot opponents caused a death spiral — learner fights frozen copy, degrades weights on every loss, never recovers.
Learning rate decay: Loss decay restored to 0.995 (was 0.92). The aggressive 8% decay killed learning capacity by game 30.
New regression tests: Targeted tests for C4 (search depth ≥ 4, training improves) and Checkers (uses hand-crafted features, no nosedive, correct opponent type) to prevent recurrence.
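To make the card scoring primitives concrete, here is a rough sketch of how a subset-sum helper can score Cribbage fifteens. Only the name `find_subsets_summing_to` comes from the entry above; the signature, the `CRIBBAGE_VALUES` table, and `score_fifteens` are assumptions for illustration, not the contents of engine/gdl/card_scoring.py.

```python
from itertools import combinations

# Cribbage card values: ace = 1, face cards = 10 (an assumed mapping for this sketch).
CRIBBAGE_VALUES = {"A": 1, "J": 10, "Q": 10, "K": 10, **{str(n): n for n in range(2, 11)}}


def find_subsets_summing_to(ranks, target, values=CRIBBAGE_VALUES):
    """Return every combination of card positions whose values add up to target."""
    hits = []
    for size in range(1, len(ranks) + 1):
        for combo in combinations(range(len(ranks)), size):
            if sum(values[ranks[i]] for i in combo) == target:
                hits.append(combo)
    return hits


def score_fifteens(ranks):
    """Two points per distinct subset of cards totalling 15."""
    return 2 * len(find_subsets_summing_to(ranks, 15))


hand = ["5", "5", "5", "J", "5"]  # four fives, with a jack as the fifth card
print(score_fifteens(hand))
```

Working on card positions rather than ranks keeps duplicate ranks distinct, so the four fives count as four separate cards. The hand above contains eight fifteens (four 5+J pairs and four triples of fives) for 16 points.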
Changes (May 27, 2026)
Canasta (new game): Full 2-player Canasta with double deck (108 cards), multi-phase turns (draw → meld → discard), wild cards (2s, Jokers), discard pile pickup, canasta bonuses (7+ card meld: natural 500 pts, mixed 300 pts), and going out detection. Custom UI with table melds, meld action buttons showing actual cards, and discard prompts.
Generic meld system: Reusable module (engine/gdl/melds.py) with find_sets(), find_runs(), best_melds(), deadwood_value(), and can_add_to_meld(). Configurable wild cards, minimum meld size, and scoring bonuses. Any card game mentioning "melds" or "sets" can use it.
Multi-phase turns: Generic turn phase system — games define a phase list (e.g. ["draw", "meld", "discard"]) and phases advance automatically. Phase resets to the first phase when the turn passes to the next player.
Double deck support: New "double_deck" deck type (2×52 + 4 jokers = 108 cards) in the card system.
Strategy hints: Users can type plain-English strategy advice before training (e.g. "control the center", "save high cards"). Keywords are matched to feature priors, giving the AI a head start on what matters.
Spades (new game): 2-player trick-taking with bidding and spades-always-trump. Shares UI with Wizard.
Deep copy fix: State copying now properly handles nested lists (list-of-lists like melds), preventing card duplication during AI search.
Untrained AI speed: Card games without trained weights now use depth 1 instead of 4, making untrained AI respond instantly instead of getting stuck on large branching factors.
413 tests all passing.
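The multi-phase turn rule described above (advance through a phase list, then reset when the turn passes) fits in a few lines. This is a sketch only: the `TurnState` class and its field names are hypothetical, not the engine's real data structure.

```python
from dataclasses import dataclass


@dataclass
class TurnState:
    """Tracks whose turn it is and which phase of that turn is active."""
    phases: list
    num_players: int = 2
    player: int = 0
    phase_index: int = 0

    @property
    def phase(self) -> str:
        return self.phases[self.phase_index]

    def advance(self) -> None:
        """Move to the next phase; after the last one, pass the turn and reset."""
        self.phase_index += 1
        if self.phase_index == len(self.phases):
            self.phase_index = 0
            self.player = (self.player + 1) % self.num_players


turn = TurnState(phases=["draw", "meld", "discard"])
for _ in range(4):
    print(turn.player, turn.phase)
    turn.advance()
```

The loop prints the three Canasta-style phases for player 0 and then the draw phase for player 1.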
Changes (May 23, 2026)
Visual training replay: After training, watch sampled game replays with the actual game board UI — pieces moving, cards being played. Play/Pause, speed slider, Prev/Next game navigation. 5 games sampled evenly across training to show the AI's progression from clumsy to competent.
Training stability: Fixed weight corruption spiral where consecutive losses degraded the AI's evaluation function. Learning rate now decays 8% per loss (was 0.5%) with per-update weight clamping. Feature-based opponents use graduated difficulty (random → self-snapshot) instead of fixed depth-3 that caused nosedives. 8 new regression tests verify no game's late win rate collapses to 0%.
Stop training: New "Stop Training" button lets users halt training mid-run while keeping all results accumulated so far.
GameBoardRenderer refactor: Extracted unified board routing component used by both play mode and training replay — replays now look identical to actual gameplay for all 20 games.
Click-to-move UI: Novel movement/capture games support click-to-select-piece → click-destination in the grid board with movable/selected/destination highlighting.
Card special powers: Parser detects wild cards, Skip, Reverse, Draw Two keywords and routes to appropriate engine. Uno deck auto-detected from description.
413 tests including 8 training quality regression tests across all major games.
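The two decay settings named in the training stability entry compound very differently over a run of losses. The sketch below only does the arithmetic; the function names, the initial rate of 1.0, and the clamp limit are assumptions, not values from the trainer.

```python
def learning_rate_after(losses: int, decay: float, initial: float = 1.0) -> float:
    """Learning rate left after a number of losses, shrinking by `decay` each time."""
    return initial * decay ** losses


def clamp(weight: float, limit: float = 1.0) -> float:
    """Keep a weight inside [-limit, limit]; the limit here is a made-up value."""
    return max(-limit, min(limit, weight))


for losses in (10, 20, 30):
    gentle = learning_rate_after(losses, 0.995)     # 0.5% decay per loss
    aggressive = learning_rate_after(losses, 0.92)  # 8% decay per loss
    print(f"{losses} losses: {gentle:.3f} vs {aggressive:.3f}")
```

After 30 losses the 0.5% setting keeps roughly 86% of the initial learning rate, while the 8% setting keeps roughly 8%.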
Changes (May 21, 2026)
Wizard (new game): Trick-taking card game with trump suit and bidding. Multi-round with cumulative scoring.
Simplified Scrabble (new game): Word/tile placement on a 9×9 board with bonus squares and a ~2,700-word dictionary. Click-to-place UI.
Generic movement engine: Novel games can now describe movement + capture mechanics: diagonal, orthogonal, or all-direction movement with jump capture. Forward-only movement supported. Piece promotion (reaching back row → becomes king with backward movement). Mandatory capture enforced (jumps override regular moves).
Novel game pipeline expansion: Seven novel game types now work end-to-end:
Grid placement (N-in-a-row, territory)
Gravity/column drop (Connect Four-style)
Diagonal capture with promotion (Checkers-like)
Forward movement to goal (race on grid)
Flanking/flipping with pass-when-stuck (Reversi-like)
Card comparing with escalating stakes
Dice race with bumping
Adaptive training: Novel games graduate from random opponent to self-snapshot at game 10. L2 feature discovery runs mid-training to find correlated features. Escalating-stakes card games get a "card conservation" feature with strong prior.
Clarification system wired: Parser clarification answers now actually modify the saved game definition via API. Movement games prompt for piece setup (5 layout options). Clarification choices preserved when renaming games.
Click-to-move UI: Novel movement/capture games now support click-to-select-piece → click-destination in the grid board. Movable pieces highlighted, valid destinations shown on selection.
Card special powers: Parser detects wild cards ("eights are wild"), Skip, Reverse, Draw Two/Four keywords. Routes to Uno engine when action cards detected, Crazy Eights otherwise. Uno-style deck auto-detected from "four colors numbered 0-9".
Extra turn rules: "Take another turn", "go again", "bonus turn" keywords detected and generate conditional turn rules.
Training transparency: Training results show the training partner's strategy. Dice games train at depth 1. Novel games can reach depth 3 with time guard.
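The keyword-driven routing in the card special powers and extra turn entries can be approximated with a few regular expressions. The keyword lists come from those entries; the function names and return values are hypothetical and do not reflect the parser's real interface.

```python
import re

# Keywords taken from the entries above; everything else here is illustrative.
ACTION_CARD_PATTERNS = [r"\bskip\b", r"\breverse\b", r"\bdraw (two|four)\b"]
EXTRA_TURN_PATTERNS = [r"\btake another turn\b", r"\bgo again\b", r"\bbonus turn\b"]


def _mentions(text: str, patterns) -> bool:
    """True when any pattern matches the lower-cased text."""
    lowered = text.lower()
    return any(re.search(p, lowered) for p in patterns)


def pick_card_engine(description: str) -> str:
    """Route to the Uno engine when action cards appear, Crazy Eights otherwise."""
    return "uno" if _mentions(description, ACTION_CARD_PATTERNS) else "crazy_eights"


def has_extra_turn_rule(description: str) -> bool:
    """Detect the extra-turn phrases listed in the changelog."""
    return _mentions(description, EXTRA_TURN_PATTERNS)


print(pick_card_engine("Match color or number. Skip and Reverse cards change the order."))
print(pick_card_engine("Eights are wild. Match suit or rank."))
print(has_extra_turn_rule("If you roll a six, go again."))
```

Plain keyword matching like this is easy to fool (for example, "skip" used in a non-card sense), which is one reason a clarification step after parsing is useful.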
Known limitations in novel games
Card action effects (Skip, Reverse, Draw Two) work via built-in Uno engine, not generically expressible for fully custom card effects
Extra turn triggers detected but the specific conditions that set the flag (e.g., "landing in store") require game-specific code
Capture games default to filling all squares — dark-square-only placement requires "dark squares" in the description
Games Implemented (22)
| Game | Type | Training Quality | Notes |
| --- | --- | --- | --- |
| Reversi | Flanking | strong | Learns corner strategy, mobility. Consistently good play. |
| Connect Four | Placement | strong | Hand-crafted features. Blocks threats, plays center. |
| Checkers | Movement | strong | Captures, kings, advancement. Solid play. |
| Hex | Territory | decent | Connection progress + center control. Developing. |
| Backgammon | Race + Dice | decent | Pip advantage, blot safety. Depth-1 training (dice variance). |
| Mancala | Sowing | strong | Store lead, captures, extra turns. Reliable. |
| Hearts | Trick-taking | decent | Follows suit, avoids points. Reasonable play. |
| Wizard | Trick-taking + Bidding | decent | Trump + bidding. Learns to bid based on hand strength. |
What Works
Learning from scratch: Games like Reversi, Mancala, Go Fish, and Checkers show clear improvement from zero knowledge to competent play in 20-40 training games.
Feature auto-discovery: Auto-generated features for line games, capture games (piece_advantage), territory (count_pieces), and escalating-stakes card games (card_conservation). L2 correlation discovery finds additional features mid-training.
Luck detection: Pure-luck games (War, Chutes & Ladders) are automatically identified and flagged — no false "mastery" claims.
Game variety: 22 built-in games across 14 categories, plus novel games parsed from English descriptions.
Novel game pipeline: Describe a game in English → parse → auto-generate features → train → play. Works for grid placement, gravity drop, movement/capture, flanking, territory, card games, dice races, and tile placement.
Movement and capture: Generic movement engine handles diagonal, orthogonal, and all-direction movement with jump capture, mandatory capture, and piece promotion (kinging).
Interpretable AI: Every decision can be traced to feature weights. Commentary explains moves. Training partner strength is disclosed.
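One way to picture how a decision is traced to feature weights: score a position as a weighted sum of features, then report the feature whose weighted change was largest. This sketch assumes a linear evaluation; the function names and the `mobility` weight are made up, while the feature names and the 0.5 and 0.2 starting weights echo values mentioned elsewhere in this log.

```python
def evaluate(features: dict, weights: dict) -> float:
    """Linear evaluation: the weighted sum of feature values."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def explain_move(before: dict, after: dict, weights: dict) -> str:
    """Name the feature whose weighted change contributed most to the move."""
    deltas = {
        name: weights.get(name, 0.0) * (after.get(name, 0.0) - before.get(name, 0.0))
        for name in set(before) | set(after)
    }
    best = max(deltas, key=deltas.get)
    return f"Best for {best} ({deltas[best]:+.2f})"


weights = {"center_control": 0.2, "line_threats": 0.5, "mobility": 0.1}
before = {"center_control": 1, "line_threats": 0, "mobility": 6}
after = {"center_control": 2, "line_threats": 1, "mobility": 5}
print(explain_move(before, after, weights))
```

Here the move adds one line threat (worth +0.50), which outweighs the gain in center control (+0.20) and the small mobility loss, so the commentary names `line_threats`.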
What Needs Work
Complex dice: Two independent dice (Backgammon: move one piece by die1 AND another by die2), doubles giving extra moves — not supported for novel games.
Draw-then-play turns: Multi-phase turns are now supported for built-in games (Canasta) but not yet auto-detected by the parser for novel card games.
Meld detection for novel games: The generic meld system exists (find_sets, find_runs) but the parser doesn't yet auto-wire it for novel game descriptions mentioning "melds" or "sets".
Novel game strategy depth: Novel games graduate from random to self-snapshot opponent, but strategy discovery remains shallow for complex games.
Custom conditional effects: "When you land on a red space, draw a card" — arbitrary effect triggers not yet parseable beyond built-in patterns (Skip, Reverse, Draw Two).
Changes (May 20, 2026)
Auto-discovery replaces hand-crafted features: Tic-Tac-Toe, Connect Four, and Chutes & Ladders no longer use hand-coded evaluation features. Instead, the system auto-generates features from the game's rules (line_threats, center_control, position_lead, etc.) and learns their importance through self-play. This means these games show real learning curves — starting weak and improving — rather than being pre-programmed to play well.
Smarter auto-features: For line-win games (N-in-a-row), the system now skips noisy features like edge_presence and corner_presence that don't help, keeping only relevant ones (center_control, longest_line, line_threats, mobility). Reduces noise-to-signal ratio during learning.
Progressive search depth: Training now starts at depth 1 and deepens every 5 games, while the opponent stays at full depth (the "adult"). This mirrors how a kid learns — start with simple thinking, gradually look further ahead. The training curve shows both weight learning and depth progression.
Luck detection: The system now identifies pure-luck games (like Chutes & Ladders and War) using two checks — L0 analyzes the rules for absence of player decisions, L1 checks post-training for flat weights and ~50% win rate. Luck games show a notice instead of misleading self-assessment or mastery claims.
Race game support: Track/race games with forward/backward choice and opponent bumping. AI evaluates positions by distance to finish. Track board UI with winding path, directional arrows, and roll/choice buttons.
Parser reference: New modal overlay documenting all 12 supported game categories, recognized piece vocabulary (~60 words), colors, cosmetics, and tips for writing game descriptions.
UI improvements: Game list split into "Trained" / "Not yet trained" groups. Training dashboard properly resets when switching games. "Back to Games" returns to menu. Server restart script (restart.sh).
Stronger auto-priors: Line features start at 0.5 (was 0.15) and center control at 0.2 (was 0.05), giving the learner a better starting point for line-win games.
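The progressive search depth schedule (start at depth 1, deepen every 5 games) reduces to a one-line formula. The cap of depth 4 and the function name are assumptions made for this sketch.

```python
def learner_depth(game_index: int, max_depth: int = 4, games_per_step: int = 5) -> int:
    """Search depth for the learner in a given training game (0-based index)."""
    return min(max_depth, 1 + game_index // games_per_step)


# Depth at the start of each 5-game block in a 40-game run.
schedule = [learner_depth(g) for g in range(0, 40, 5)]
print(schedule)
```

Under these assumptions the learner reaches its full depth by game 15 and spends the rest of the run refining weights at that depth.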
Changes (May 19, 2026)
Selective deepening: At search depth 3+, only the top 8 most promising moves (scored by quick eval) are explored. This lets the AI look 3-4 moves ahead on large boards without exponential blowup — like a human focusing on "interesting" moves. All 20 AI competency tests pass.
AI commentary: After each AI move, a brief explanation appears — "Blocked opponent's 3-in-a-row threat", "Took a central position for control", "Best for center control (improved)". Generated from feature deltas and pattern detection.
Game object library: 238 words recognized across 9 categories (pieces, chess, military, nature, characters, resources, symbols, board, cards). 58 SVG icons for visual rendering. Games described with "wizards" or "knights" show actual icons on the board.
Cosmetics system: Parser detects piece colors ("red and blue stones"), board styles ("wooden board", "checkerboard"), and applies them. Custom games render with the described visual theme.
Clarification questions: After parsing, the system detects missing details (no draw condition? no board size? one color for both players?) and asks the user before training.
Game naming: Users can name their custom games before training. Names display in the game selector and training dashboard.
Parser expansion: Now handles 10 game categories — added movement/capture, sowing/mancala, and expanded trick-taking, flanking, and matching/shedding detection. 26 parser tests.
Novel game training: Games without hand-crafted features now train against a random opponent (was default_eval, which was too strong for a learner starting from scratch). 10W 0L vs random on novel 6x6 grid game.
Generic track board: Novel track/race games auto-render with a winding numbered path.
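Selective deepening can be sketched as ordinary alpha-beta search that trims the move list at deeper plies. This is not the engine's code: the function signature is invented, "depth 3+" is read here as "the third ply from the root onward", and the demo game is a simple take-away game (remove 1 to 10 stones, last stone wins) chosen because its best move is easy to verify by hand.

```python
import math

TOP_K = 8          # moves kept per node once selective deepening applies
SELECTIVE_PLY = 3  # assumption: "depth 3+" means the third ply from the root onward


def search(state, depth, ply, alpha, beta, maximizing, moves_fn, apply_fn, eval_fn):
    """Alpha-beta minimax that keeps only the TOP_K best-looking moves at deep plies."""
    moves = moves_fn(state)
    if depth == 0 or not moves:
        return eval_fn(state), None

    # Order children by the quick evaluation so the most promising come first.
    children = [(m, apply_fn(state, m)) for m in moves]
    children.sort(key=lambda mc: eval_fn(mc[1]), reverse=maximizing)
    if ply >= SELECTIVE_PLY:
        children = children[:TOP_K]

    best_move = None
    best = -math.inf if maximizing else math.inf
    for move, child in children:
        score, _ = search(child, depth - 1, ply + 1, alpha, beta,
                          not maximizing, moves_fn, apply_fn, eval_fn)
        if maximizing and score > best:
            best, best_move = score, move
            alpha = max(alpha, best)
        elif not maximizing and score < best:
            best, best_move = score, move
            beta = min(beta, best)
        if alpha >= beta:
            break  # the other side would never allow this line
    return best, best_move


# Demo game: state is (stones_left, max_to_move); whoever takes the last stone wins.
def nim_moves(state):
    stones, _ = state
    return list(range(1, min(10, stones) + 1))


def nim_apply(state, take):
    stones, max_to_move = state
    return (stones - take, not max_to_move)


def nim_eval(state):
    stones, max_to_move = state
    if stones > 0:
        return 0  # undecided position
    return -1 if max_to_move else 1  # the player who just moved took the last stone


score, move = search((15, True), 4, 1, -math.inf, math.inf, True,
                     nim_moves, nim_apply, nim_eval)
print(score, move)
```

From 15 stones the search picks the move that leaves 11, after which every reply can be answered by taking the rest. Trimming to the top 8 is safe in this toy game because the quick evaluation sorts winning replies first; in a real game the trim trades some accuracy for depth.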
Changes (May 18, 2026)
3 new games: Hex (territory, hex grid with SVG board), Backgammon (race+dice, click-to-move with SVG triangles and pip dice), Hearts (trick-taking, first of its kind in the system)
Auto-feature discovery: Two-level system for novel games — Level 1 infers features from GDL rule structure (e.g., "track game with bear-off → pip count matters"), Level 2 discovers features via win/loss correlation from play data. Outperforms hand-crafted features on Backgammon (12W vs 7W) and Hex (16W vs 13W).
Node budget reduced: Board games 5000→2500, card games 1500. All 20 AI competency tests still pass. Moving toward the goal of smart evaluation over brute-force search.
Hand-crafted features for Gin Rummy (deadwood, melds, near-knock), Poker (hand strength), Backgammon (pip count, blots, home progress), and Hex (connection progress, blocking).
UI refactoring: Extracted 14 board components from App.jsx (2300→1100 lines). Custom card game fallback. Gin Rummy shows AI's hand and melds at game over. Backgammon click-to-move with highlighted destinations.
Backgammon training fix: Move limit raised from 200→500 for dice games (was causing false draws). Proper features instead of generic auto-features.
Game-over card reveal: All hidden cards revealed at end of game across all card games.
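Level 2 discovery, as described in the auto-feature discovery entry, looks for features whose values track wins and losses. A plain Pearson correlation is one way to do that; the sketch below assumes that approach, and the feature names, sample data, and 0.5 threshold are all invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def discover_features(samples, outcomes, threshold=0.5):
    """Keep candidate features whose values track wins (+1) and losses (-1)."""
    names = samples[0].keys()
    scores = {n: pearson([s[n] for s in samples], outcomes) for n in names}
    return {n: round(r, 2) for n, r in scores.items() if abs(r) >= threshold}


# Made-up feature values from six finished games, with the learner's result.
samples = [
    {"pip_lead": 12, "pieces_on_edge": 3},
    {"pip_lead": 8, "pieces_on_edge": 1},
    {"pip_lead": -5, "pieces_on_edge": 3},
    {"pip_lead": 15, "pieces_on_edge": 1},
    {"pip_lead": -9, "pieces_on_edge": 1},
    {"pip_lead": -2, "pieces_on_edge": 3},
]
outcomes = [1, 1, -1, 1, -1, -1]
found = discover_features(samples, outcomes)
print(found)
```

In this toy data `pip_lead` correlates strongly with winning and is kept, while `pieces_on_edge` falls below the threshold and is dropped.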
Game Categories and Coverage
Turn-based games fall into several structural categories. ThinAI's architecture can handle some natively and others with extensions. Here's where we are and where we're headed:
| Category | Examples | Status | Coverage |
| --- | --- | --- | --- |
| Placement: place pieces on a grid | Tic-Tac-Toe, Connect Four, Go, Gomoku | works | Handles any N-in-a-row or territory game |
| Flipping/Flanking: capture by surrounding | Reversi, Othello | works | Fully supported |
| Movement/Capture: move pieces, capture opponents | Checkers, Chess, Shogi | partial | Checkers works; chess-level complexity is a stretch goal |
| Sowing/Mancala: distribute tokens around a track | Mancala, Oware, Kalah | works | Standard mancala variants supported |
| Nim-like: remove items from piles | Nim, Wythoff's game, Sprouts | works | Any take-away game |
| Matching/Shedding: match cards, empty your hand | Crazy Eights, Uno, Rummy | works | Color/rank matching, action cards, melds |
| Collecting/Melding: gather sets, lay melds | Go Fish, Canasta, Rummy | works | Set detection, meld system with wilds, multi-phase turns, canasta bonuses |
| Comparing: compare hands for best rank | Poker, Blackjack, War | works | Hand ranking, hit/stand, draw/discard |
| Trick-taking: play cards to win tricks | Hearts, Wizard, Spades, Bridge | works | Hearts, Wizard, Spades (Wizard and Spades with trump and bidding). Follow suit, trick resolution. |
| Auction/Bidding: bid resources for advantage | Bridge (bidding), Power Grid | partial | Wizard has per-round bidding. Full auction mechanics planned. |
| Race: move pieces to finish line | Chutes & Ladders, Backgammon, Parcheesi | works | Backgammon, custom race games with forward/backward choice and bumping, luck detection |
| Territory: control areas of the board | Go, Hex, Blokus | partial | Hex works (7×7, connect sides). Go-level complexity is a stretch goal |
Of the ~50 most commonly played tabletop/card games worldwide, ThinAI can represent roughly 85% with its existing game types. The remaining ~15% require partnership dynamics, real-time mechanics, or negotiation.
Technical Decisions
Why not neural networks?
Neural approaches like AlphaZero (DeepMind's game-playing system) and MuZero (its successor that learns without knowing the rules) need millions of training games and significant compute per game. They learn one game at a time and don't transfer knowledge between games. ThinAI's structured approach learns from rules, not just from playing, and transfers features across games via a shared game description format. The tradeoff: less raw playing strength, but vastly more sample-efficient and interpretable.
Why minimax instead of MCTS?
Minimax (a search algorithm that evaluates moves by assuming the opponent plays optimally) with alpha-beta pruning (a technique to skip branches that can't affect the outcome) is simpler, more predictable, and works well with learned evaluation functions. MCTS (Monte Carlo Tree Search — a probabilistic approach that samples random playouts) would be better for very high branching factor games like Go, but none of our current games require it. The evaluation function is the learning target — depth is just how far ahead we look.
Why JSON for GDL?
Machine-parseable, human-readable, and web-friendly. GDL (Game Description Language) is a formal way to describe game rules so a computer can understand them. Our version uses JSON: the parser converts English to JSON, the engine loads JSON, the frontend displays JSON. One format throughout. The tradeoff: JSON can't express truly arbitrary game logic — complex conditions need built-in functions.
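To give a feel for the one-format-throughout idea, here is what a small JSON game spec and a loader could look like. Every field name in this spec is invented for illustration; ThinAI's actual GDL schema is not shown in this log and may differ.

```python
import json

# Hypothetical spec: every field name below is invented for illustration.
TIC_TAC_TOE = """
{
  "name": "Tic-Tac-Toe",
  "board": {"type": "grid", "rows": 3, "cols": 3},
  "players": ["X", "O"],
  "turn": {"action": "place", "target": "empty_cell"},
  "win": {"type": "n_in_a_row", "n": 3},
  "draw": {"type": "board_full"}
}
"""

REQUIRED_KEYS = {"name", "board", "players", "turn", "win"}


def load_spec(text: str) -> dict:
    """Parse a JSON game spec and check that the required sections are present."""
    spec = json.loads(text)
    missing = REQUIRED_KEYS - spec.keys()
    if missing:
        raise ValueError(f"spec is missing: {sorted(missing)}")
    return spec


spec = load_spec(TIC_TAC_TOE)
print(spec["win"])
```

A declarative win condition such as `n_in_a_row` also shows the tradeoff noted above: the JSON names a built-in function instead of expressing the logic itself.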