
mauricett/FishBrain


FishBrain

Project: training small but powerful NNs on Stockfish data.

   

FishBrain v1 - THE BIG REBOOT (YES!!!)

The goal of this project is to explore small NN architectures that excel at chess. For comparison, Stockfish uses an NN architecture that sacrifices model capacity for extreme speed, while Leela uses a powerful but slow NN that is infeasible to train on consumer hardware.

    With FishBrain, I explore the middle ground - fast NNs with no architectural sacrifices but small model size. I believe this is a viable approach to create a competitive chess engine at home, once tree search is also implemented. I hope to release a write-up about FishBrain's architecture and training soon.

FishBrain v0

I finished the first NN (see v0_legacy) in 2024 and it works okay. The NN is much smaller than DeepMind's and achieves a Blitz Elo of about 1800. Some of the code is "research quality" and it lacks an interface for users. Maybe just wait for the next version.

Dataset

I am in the process of updating and reworking the dataset. It's going to be 2.5x bigger and in a friendlier format.

    Old dataset: HuggingFace dataset. The data is extracted from the lichess.org open database and contains all games from 2023 for which Stockfish evaluations were available. It's easy to use with the HuggingFace dataloader, but I'm unhappy with the dependency on zstd. It will be much better in the future.

Future directions?

  • Leela has produced enormous amounts of data of very high quality. Ideally, I want to extract as much of this as I can into a deduplicated dataset of FEN positions.
  • FishBrain should be an MoE (mixture of experts), because it consumes little memory due to its small size. This means we can scale it up to an MoE virtually for free.
  • Quantization: maybe FP4 for acceleration on Nvidia 50xx series?
  • Pruning: train a very deep model and prune it in depth?
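
As a rough illustration of the first item (a deduplicated dataset of FEN positions), here is a minimal sketch in Python. It is not FishBrain code. The helper names `position_key` and `deduplicate_fens` are hypothetical, and it assumes that two FEN strings count as the same position when their first four fields match (piece placement, side to move, castling rights, en passant square), ignoring the halfmove clock and fullmove number.

```python
def position_key(fen: str) -> str:
    """Return the FEN without its halfmove clock and fullmove number.

    Assumption: positions that differ only in move counters are duplicates.
    """
    fields = fen.split()
    if len(fields) < 4:
        raise ValueError(f"not a FEN string: {fen!r}")
    # Keep piece placement, side to move, castling rights, en passant square.
    return " ".join(fields[:4])


def deduplicate_fens(fens):
    """Yield the first FEN seen for each distinct position key, in input order."""
    seen = set()
    for fen in fens:
        key = position_key(fen)
        if key not in seen:
            seen.add(key)
            yield fen


if __name__ == "__main__":
    sample = [
        "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",
        # Same position as above, only the move counters differ.
        "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 4 3",
        "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1",
    ]
    for fen in deduplicate_fens(sample):
        print(fen)
```

Dropping the counters merges positions reached by different move orders, which is probably what you want for evaluation data. If the halfmove clock matters for your use case (fifty-move rule), keep the fifth field in the key. For a very large corpus, the in-memory `set` would likely need to be replaced by hashing to disk or sharding.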
