%load_ext autoreload
%autoreload 2
1. Motivation
For automated insights, we need a way to let the computer learn about our metrics. We need something to compare against our metrics, so I figured I start with stockfish, a very popular chess engine. Python chess also interfaces with stockfish so I won't need another dependency, either.
Stockfish is a small download and with the binaries having few dependencies they should be ready to go right after extracting the archive – no extra installation required. The engine should then help us by generating our training data.
!mkdir -p "tools"
!ls
Assuming this notebook runs on a Linux box, we'll get the Linux binary from the official website. We're not building high-end desktop software here (does such thing even exist?) so hard-coding the download URL for just the Linux version is OK. It might feel wrong, but the best advice to counter your (very valid) intuition is that we have to focus on our goal: automated insights. That is, don't spend time over-engineering the basic, non-data-sciency stuff in your pipeline unless your pipeline is ready to run in production and earn money for you. There is a reason why we use Python for all of this, so quick'n'dirty – to a certain degree – is the way to go.
We need to set a fake browser user-agent for our download request, otherwise we get 403'd. Normally that's a sign you're doing something wrong at extra costs to someone else (in this case ISP hosting fees for the stockfish communiy). At 1.7M of download size, which is probably smaller than a typical project's landing page these days, I don't feel guilty at all.
We still need to extract the downloaded Zip archive. Also, we need to make the stockfish binary executable. There are 3 binaries available in this version, we use what I guess is the default one. More quick'n'dirty hardcoding that I expect to break sooner than later.
stockfish = extract(download())
Check if we can run stockfish now, but let's not get stuck in stockfish's command prompt. So we send 'quit' to it immediately, which will still return the version of the binary and its authors.
!{stockfish} quit
The next step is to hook up the engine with Python chess, as described in their documentation: https://python-chess.readthedocs.io/en/latest/engine.html
PGN can be described as a hierarchical document model due to its ability to store variations next to the mainline. Python chess uses a recursive nodes structure to model PGN; GameNode::add_main_variation(move: chess.Move)->GameNode
processes a move, adds it as a child to the current node and returns the child. If we only process the mainline moves, the structure effectively represents a linked list. We could also just use a chess.Board instance to replay the engine results back to Python chess. Since our data module handles PGN however, I thought I go the extra step here. The way engine, game and board interact feels fragile to me but that is perhaps because board isn't just a plain chess position viewer. Instead, it has game/engine logic of its own.
Computer chess is extremely technical and moves will pile up. Even with short time limits a chess engine vs chess engine game can take a couple seconds or even minutes. Too short of a time limit and computer chess turns into garbage. Notice that it's not the engine that's slow, it's how we interact with it and process results.
%time game = playGame(stockfish)
display(game.mainline_moves())
Let's look at the engine's game through our UI.
import cheviz.ui
games_list = [cheviz.data.fromGame(game)]
cheviz.ui.showGameUi(games_list)
We use the same display wrap trick as in the 02_ui.ipynb
notebook.
from IPython.display import display, HTML
CSS = """
.output {
flex-direction: row;
flex-wrap: wrap;
}
"""
HTML('<style>{}</style>'.format(CSS))