Chess OCR: From Data to Deployment

Building a chess position recognition system from data collection to browser deployment for my module project at CAS AML Bern.

Introduction

Recognizing an entire board in one shot would require extensive training data. The simpler approach: split the board into 64 squares, classify each square independently, then reconstruct the position (see the sketch after the assumptions below). The output includes Lichess links for editing and analysis.

Chess recognition pipeline: board splitting and square classification

Assumptions:

  • Single board per image, no perspective distortion
  • Standard orientation: white squares at top-left and bottom-right corners
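
Under these assumptions, splitting and reconstruction are straightforward. A minimal sketch of both steps plus the Lichess link, with illustrative function names and label strings (not the project's actual code):

```python
import numpy as np
from PIL import Image

PIECE_TO_FEN = {  # classifier label -> FEN symbol ("" = empty square)
    "empty": "", "wK": "K", "wQ": "Q", "wR": "R", "wB": "B", "wN": "N", "wP": "P",
    "bK": "k", "bQ": "q", "bR": "r", "bB": "b", "bN": "n", "bP": "p",
}

def split_board(img: Image.Image) -> list[np.ndarray]:
    """Cut a border-free board image into 64 square crops, a8..h1 order."""
    arr = np.asarray(img)
    h, w = arr.shape[0] // 8, arr.shape[1] // 8
    return [arr[r * h:(r + 1) * h, c * w:(c + 1) * w]
            for r in range(8) for c in range(8)]

def labels_to_fen(labels: list[str]) -> str:
    """Turn 64 per-square labels (a8..h1) into a FEN piece-placement string."""
    rows = []
    for r in range(8):
        row, run = "", 0
        for c in range(8):
            sym = PIECE_TO_FEN[labels[r * 8 + c]]
            if sym == "":
                run += 1  # count consecutive empty squares
            else:
                row += (str(run) if run else "") + sym
                run = 0
        rows.append(row + (str(run) if run else ""))
    return "/".join(rows)

fen = labels_to_fen(["empty"] * 64)  # -> "8/8/8/8/8/8/8/8"
print(f"https://lichess.org/editor/{fen}")
```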

Dataset Collection

Ready-made chess image datasets are hard to find, probably because of copyright, so I built my own for private educational use. I took about 70 board images, removed the borders, split each into 64 squares, and labeled the pieces. To get more training data, I applied simple augmentations such as scaling, shifting, and flipping (sketched below).
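
Those augmentations might look roughly like this with torchvision (an assumption; the parameter ranges are illustrative):

```python
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomAffine(degrees=0,
                            translate=(0.05, 0.05),  # small shifts
                            scale=(0.9, 1.1)),       # mild rescaling
    transforms.RandomHorizontalFlip(p=0.5),          # mirror piece shapes
    transforms.ToTensor(),
])
```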

Chess board labeling process

Key challenges:

  • Class imbalance: Pawns appear far more frequently than kings or queens (a loss-weighting mitigation is sketched below)
  • Positional bias: White kings typically occupy dark squares while black kings occupy light squares
  • Style variation: Piece designs vary across books and publication periods, making it hard to cover every style

Different chess piece styles across various board diagrams
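
One common mitigation for the class imbalance is to weight the loss inversely to class frequency, as referenced in the list above. A sketch assuming a PyTorch training setup, with placeholder labels:

```python
import torch
import torch.nn as nn

# Placeholder standing in for the integer class ids (0..12) of the training set.
train_labels = torch.randint(0, 13, (1000,))

counts = torch.bincount(train_labels, minlength=13).float()
weights = counts.sum() / (13 * counts.clamp(min=1))  # inverse-frequency weights
criterion = nn.CrossEntropyLoss(weight=weights)      # rare classes cost more
```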

Notebook: Data Preprocessing

Training

Representation Learning

For the CAS AML program, I trained both an autoencoder and SimCLR model to compare unsupervised learning approaches. Both used a simple CNN—faster to train, easier to visualize, and smaller to deploy than pretrained networks.

SimCLR separated pieces better in the embedding space, making it the better choice for classification.
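
At its core, SimCLR trains the encoder with the NT-Xent contrastive loss over two augmented views of each square. A minimal PyTorch sketch (illustrative, not the project's exact implementation):

```python
import torch
import torch.nn.functional as F

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    """NT-Xent loss; z1, z2 are projected embeddings of two views of a batch."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)   # (2N, d), unit length
    sim = z @ z.t() / temperature                 # pairwise cosine similarities
    n = z1.shape[0]
    mask = torch.eye(2 * n, dtype=torch.bool)
    sim.masked_fill_(mask, float("-inf"))         # exclude self-similarity
    # the positive for view i is its counterpart in the other half of the batch
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)])
    return F.cross_entropy(sim, targets)

# usage: loss = nt_xent(proj(encoder(view1)), proj(encoder(view2)))
```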

Autoencoder t-SNE projection

SimCLR t-SNE projection, showing better piece separation

Classification

I took the SimCLR encoder and added a classification head on top (13 classes: 6 white pieces, 6 black pieces, and empty). I then trained three models: one on white squares, one on black squares, and one on both combined.

Two approaches are available in the demo's dropdown: split uses the separate per-color models; single uses one model for everything. Split usually performs better.
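
Structurally, each of those models is just the SimCLR encoder with a small linear head on top. A sketch with illustrative names (embed_dim is whatever the encoder outputs):

```python
import torch.nn as nn

class SquareClassifier(nn.Module):
    def __init__(self, encoder: nn.Module, embed_dim: int, n_classes: int = 13):
        super().__init__()
        self.encoder = encoder                    # pretrained SimCLR backbone
        self.head = nn.Linear(embed_dim, n_classes)

    def forward(self, x):
        return self.head(self.encoder(x))

# "split" mode: one SquareClassifier for light squares, one for dark squares.
# "single" mode: a single SquareClassifier trained on both.
```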

Notebooks: Representation Learning · Classification

Deployment

The goal: no-cost solution, no accounts, no infrastructure. I tested three deployment options:

Railway

✗ Short trial period before requiring payment

Render

✓ Free tier available
✗ Cold starts >50 seconds after 15 minutes of inactivity

ONNX Runtime Web + Pyodide

✓ Runs entirely in browser
✓ No server costs
✓ Instant response, no cold starts
✗ Larger initial download

I chose browser deployment. Pyodide handles preprocessing (border removal, square extraction), ONNX Runtime Web runs the CNN models.
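
Getting the models into ONNX Runtime Web means exporting them to ONNX first. A minimal PyTorch sketch; the file name, the 32×32 input size, and the placeholder model are all assumptions:

```python
import torch
import torch.nn as nn

# Placeholder standing in for a trained square classifier.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 13)).eval()

dummy = torch.randn(1, 3, 32, 32)  # one square crop, NCHW
torch.onnx.export(
    model, dummy, "square_classifier.onnx",
    input_names=["square"], output_names=["logits"],
    dynamic_axes={"square": {0: "batch"}},  # allow batching all 64 squares
)
```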

Try It!


Conclusion

This article walked through a complete ML pipeline, from collecting data to running models in your browser. Try uploading different chess diagrams and you will probably find some that fail, especially boards with unusual piece styles or low-quality scans.

Some ideas for making it better:

  • Generate synthetic data to cover more piece styles
  • Try transfer learning with larger pretrained models
  • Improve board detection with a pretrained model like YOLO or Mask R-CNN
  • Check for illegal positions (missing kings, pawns on the back rank, etc.)