YGO Agent is a project aimed at mastering the popular trading card game Yu-Gi-Oh! through deep learning. Based on a high-performance game environment (ygoenv), this project leverages reinforcement learning and large language models to develop advanced AI agents (ygoai) that aim to match or surpass human expert play. YGO Agent provides researchers and players with a platform for exploring AI in complex, strategic game environments.
[Discord](https://discord.gg/EqWYj4G4Ys)
## News🔥
- July 2, 2024: We have a discord channel for discussion now! We are also working with [neos-ts](https://github.com/DarkNeos/neos-ts) to implement human-AI battle.
- April 18, 2024: We have fully switched to JAX for training and evaluation. Check the evaluation sections for more details and try the new JAX-trained agents.
- April 14, 2024: LSTM has been implemented and well tested. See `scripts/jax/ppo.py` for more details.
- April 7, 2024: We have switched to JAX for training and evaluation due to its better performance and flexibility. The scripts are in the `scripts/jax` directory. The documentation is in progress. PyTorch scripts are still available in the `scripts` directory, but they are not maintained.
## Table of Contents
- [Subprojects](#subprojects)
  - [ygoenv](#ygoenv)
  - [ygoai](#ygoai)
- [Installation](#installation)
  - [Building from source](#building-from-source)
  - [Troubleshooting](#troubleshooting)
- [Evaluation](#evaluation)
  - [Obtain a trained agent](#obtain-a-trained-agent)
  - [Play against the agent](#play-against-the-agent)
  - [Battle between two agents](#battle-between-two-agents)
- [Training](#training)
  - [Single GPU Training](#single-gpu-training)
  - [Distributed Training](#distributed-training)
- [Roadmap](#roadmap)
  - [Environment](#environment)
  - [Training](#training-1)
  - [Inference](#inference)
...
...
## Subprojects
### ygoenv
`ygoenv` is a high-performance game environment for Yu-Gi-Oh!, initially inspired by [yugioh-ai](https://github.com/melvinzhang/yugioh-ai) and [yugioh-game](https://github.com/tspivey/yugioh-game), and now implemented on top of [envpool](https://github.com/sail-sg/envpool) and [ygopro-core](https://github.com/Fluorohydride/ygopro-core). It provides a standard Gym interface for reinforcement learning.
### ygoai
`ygoai` is a set of AI agents for playing Yu-Gi-Oh! It aims to achieve superhuman performance like AlphaGo and AlphaZero, with or without human knowledge. Currently, we focus on using reinforcement learning to train the agents.
## Installation
Pre-built binaries are available for Ubuntu 22.04 or newer. If you're using them, follow the installation instructions below. Otherwise, please build from source following [Building from source](#building-from-source).
1. Install JAX and other dependencies:
```bash
# Install JAX (CPU version)
pip install -U "jax<=0.4.28"
# Or with CUDA support
pip install -U "jax[cuda12]<=0.4.28"
# Install other dependencies
pip install flax distrax chex
```
2. Clone the repository and install the pre-built binary (Ubuntu 22.04 or newer):
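A minimal sketch of the clone step (the exact binary installation depends on the release assets, which this sketch leaves out):
```bash
git clone https://github.com/sbl1996/ygo-agent.git
cd ygo-agent
```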
3. Run the following command to test the environment:
```bash
cd scripts
python -u eval.py --env-id "YGOPro-v1" --deck ../assets/deck/ --num_episodes 32 --strategy random --lang chinese --num_envs 16
```
If you see episode logs, the environment is working correctly. For more usage examples, see the [Evaluation](#evaluation) section.
### Building from source
If you can't use the pre-built binary or prefer to build from source, follow these instructions. Note: These instructions are tested on Ubuntu 22.04 and may not work on other platforms.
#### Additional Prerequisites
- gcc 10+ or clang 11+
- CMake 3.12+
- [xmake](https://xmake.io/#/getting_started)
- jax 0.4.25+, flax 0.8.2+, distrax 0.1.5+ (CUDA is optional)
After that, you can build with the following commands:
```bash
make dev
```
After building, you can run the following command to test the environment. If you see episode logs, it means the environment is working. Try more usage in the next section!
```bash
cd scripts
python -u eval.py --env-id "YGOPro-v1" --deck ../assets/deck/ --num_episodes 32 --strategy random --lang chinese --num_envs 16
```
### Troubleshooting
#### Package version not found by xmake
Delete the `repositories`, `cache`, and `packages` directories under `~/.xmake`, then run `xmake f -y -c` again.
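For example, as a direct translation of the step above (adjust the paths if your xmake home directory differs):
```bash
# Remove xmake's cached repositories, build cache, and downloaded packages,
# then force a clean reconfigure
rm -rf ~/.xmake/repositories ~/.xmake/cache ~/.xmake/packages
xmake f -y -c
```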
#### Package installation fails with xmake
If xmake fails to install required libraries automatically (e.g., `glog` and `gflags`), install them manually (e.g., `apt install`) and add them to the search path (`$LD_LIBRARY_PATH` or others).
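For instance, on Ubuntu (the package names below are the stock Ubuntu ones; verify they match whatever xmake failed to install):
```bash
# Install glog and gflags from the system package manager
sudo apt install libgoogle-glog-dev libgflags-dev
# Make sure the libraries are on the search path
export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH
```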
#### GLIBC and GLIBCXX version conflict
This is usually because the `libstdc++` from `$CONDA_PREFIX` is older than the system one: xmake compiles libraries against the system `libstdc++`, while your programs load the one from `$CONDA_PREFIX`. If so, delete the old `libstdc++` from `$CONDA_PREFIX` (back it up first) and create a soft link to the system one.
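A minimal sketch of that fix, assuming a conda environment on x86-64 Ubuntu (the system library path may differ on your machine):
```bash
cd "$CONDA_PREFIX/lib"
# Back up the conda-provided libstdc++ before replacing it
mv libstdc++.so.6 libstdc++.so.6.bak
# Point the environment at the system libstdc++ instead
ln -s /usr/lib/x86_64-linux-gnu/libstdc++.so.6 libstdc++.so.6
```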
#### Other issues
Open a new terminal and try again. If issues persist, join our [Discord channel](https://discord.gg/EqWYj4G4Ys) for help.
## Evaluation
### Obtain a trained agent
We provide trained agents in the [releases](https://github.com/sbl1996/ygo-agent/releases/tag/v0.1). Check the Flax checkpoint files named `{commit_hash}_{exp_id}_{step}.flax_model` and download the latest one to your local machine. The following usage assumes you have one.
If you are not on the `stable` branch or encounter other runtime issues, try switching to the `commit_hash` commit before using the agent. You may need to rebuild the project after switching:
```bash
xmake f -c
xmake b -r ygopro_ygoenv
```
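For example, following that naming scheme, the checkpoint `350c29a_7565_6700M.flax_model` used below was trained at commit `350c29a`, so you would check out that commit and then rebuild as above:
```bash
# Commit hash taken from the checkpoint file name
git checkout 350c29a
```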
### Play against the agent
We can use `eval.py` to play against the trained agent with a MUD-like interface in the terminal. We add `--xla_device cpu` to run the agent on the CPU.
```bash
python -u eval.py --deck ../assets/deck --lang chinese --xla_device cpu --checkpoint checkpoints/350c29a_7565_6700M.flax_model --play
```
Enter `quit` to exit the game. Run `python eval.py --help` for more options; for example, `--player 0` makes the agent play as the first player, and `--deck1 TenyiSword` forces the first player to use the TenyiSword deck.
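Putting those options together (the checkpoint path is the one from above; this particular combination is only an illustration):
```bash
# Agent plays as the first player with the TenyiSword deck, running on CPU
python -u eval.py --deck ../assets/deck --lang chinese --xla_device cpu \
  --checkpoint checkpoints/350c29a_7565_6700M.flax_model --play \
  --player 0 --deck1 TenyiSword
```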
We can also play against the agent with any YGOPro client now. TODO: document the setup.
### Battle between two agents
We can use `battle.py` to let two agents play against each other and find out which one is better. Adding `--xla_device cpu` forces JAX to run on CPU.
We can set `--record` to generate `.yrp` replay files in the `replay` directory. The `.yrp` files can be replayed in YGOPro-compatible clients (YGOPro, YGOPro2, KoishiPro, MDPro). Change `--seed` to generate different games.
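A sketch of such a matchup (the `--checkpoint1`/`--checkpoint2` flag names and the checkpoint file names are assumptions, not confirmed options; check `python battle.py --help` for the actual interface):
```bash
# Pit two checkpoints against each other on CPU and record replays
python -u battle.py --deck ../assets/deck --xla_device cpu \
  --checkpoint1 checkpoints/agent_a.flax_model \
  --checkpoint2 checkpoints/agent_b.flax_model \
  --record --seed 42
```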
Training an agent requires a lot of computational resources, typically 8x 4090 GPUs and a 128-core CPU for a few days. We don't recommend training the agent on your local machine. Reducing the number of decks used for training may lower the required resources.
...
...