Commit c0689b01 authored by sbl1996@126.com

Update README

parent 350c29a8
YGO Agent is a project to create a Yu-Gi-Oh! AI using deep learning (LLMs, RL).
## News
- April 18, 2024: We have fully switched to JAX for training and evaluation. Check the evaluation sections for more details and try the new JAX-trained agents.
- April 14, 2024: LSTM has been implemented and well tested. See `scripts/jax/ppo.py` for more details.
- April 7, 2024: We have switched to JAX for training and evaluation due to its better performance and flexibility. The scripts are in the `scripts/jax` directory. The documentation is in progress. PyTorch scripts are still available in the `scripts` directory, but they are no longer maintained.
- [ygoenv](#ygoenv)
- [ygoai](#ygoai)
- [Building](#building)
- [Common Issues](#common-issues)
- [Evaluation](#evaluation)
- [Obtain a trained agent](#obtain-a-trained-agent)
- [Play against the agent](#play-against-the-agent)
- [Battle between two agents](#battle-between-two-agents)
- [Training (Deprecated, to be updated)](#training-deprecated-to-be-updated)
- [Single GPU Training](#single-gpu-training)
- [Distributed Training](#distributed-training)
- [Plan](#plan)
- [Training](#training-1)
- [Inference](#inference)
- [Documentation](#documentation)
- [Sponsors](#sponsors)
- [Related Projects](#related-projects)
To build the project, you need to install the following prerequisites first:
- gcc 10+ or clang 11+
- CMake 3.12+
- [xmake](https://xmake.io/#/getting_started)
- jax 0.4.25+, flax 0.8.2+, distrax 0.1.5+ (CUDA is optional)
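
If you manage Python dependencies with pip, a minimal install matching the pins above might look like this sketch (GPU wheels need the extra steps from the official JAX installation guide):

```bash
# A sketch assuming pip and a CPU-only setup; for CUDA wheels,
# follow the official JAX installation guide instead.
pip install "jax>=0.4.25" "flax>=0.8.2" "distrax>=0.1.5"
```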
After that, you can build with the following commands:
```bash
git clone https://github.com/sbl1996/ygo-agent.git
cd ygo-agent
git checkout stable # switch to the stable branch
xmake f -y
make
```
After building, you can run the following command to test the environment. If you see episode logs, it means the environment is working. Try more usage in the next section!
```bash
cd scripts
python -u eval.py --env-id "YGOPro-v0" --deck ../assets/deck/ --num_episodes 32 --strategy random --lang chinese --num_envs 16
```
### Common Issues
#### Package version not found by xmake
Delete the `repositories`, `cache`, and `packages` directories under `~/.xmake`, then run `xmake f -y` again.
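
For example, the reset can be done in one go (a sketch of the steps above):

```bash
# Clear xmake's cached package state, then reconfigure.
rm -rf ~/.xmake/repositories ~/.xmake/cache ~/.xmake/packages
xmake f -y
```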
#### Package installation failed with xmake
Sometimes xmake may fail to automatically install the required libraries (e.g., `glog` and `gflags`). You can install them manually (e.g., with `apt install`) and put them in the search path (`$LD_LIBRARY_PATH` or similar); xmake will then find them.
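
For example, on Debian/Ubuntu (the package names here are an assumption; adjust them for your distribution):

```bash
# Install glog and gflags system-wide so xmake can find them.
sudo apt install libgoogle-glog-dev libgflags-dev
# Make sure the library directory is on the search path.
export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH
xmake f -y
```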
#### GLIBC and GLIBCXX version conflict
Usually this is because the `libstdc++` from `$CONDA_PREFIX` is older than the system one: xmake compiles the libraries against the system `libstdc++`, while you run programs with the `$CONDA_PREFIX` copy. If so, you can delete the old `libstdc++` from `$CONDA_PREFIX` (back it up first) and make a soft link to the system one.
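
A sketch of that workaround, assuming a typical x86_64 Linux layout (verify the paths on your system first):

```bash
cd $CONDA_PREFIX/lib
mv libstdc++.so.6 libstdc++.so.6.bak                           # back up the conda copy
ln -s /usr/lib/x86_64-linux-gnu/libstdc++.so.6 libstdc++.so.6  # link to the system one
```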
#### Other xmake issues
Open a new terminal and try again. God bless you.
## Evaluation
### Obtain a trained agent
We provide trained agents in the [releases](https://github.com/sbl1996/ygo-agent/releases/tag/v0.1). Check the Flax checkpoint files named `{commit_hash}_{exp_id}_{step}.flax_model` and download the latest one to your local machine. The following usage assumes you have it.
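
For example, to fetch the checkpoint used in the commands below (the download URL pattern is an assumption based on GitHub release conventions; check the releases page for the exact asset name):

```bash
mkdir -p checkpoints
wget -P checkpoints https://github.com/sbl1996/ygo-agent/releases/download/v0.1/350c29a_7565_6700M.flax_model
```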
If you are not on the `stable` branch or encounter other runtime issues, try switching to the corresponding `commit_hash` commit before using the agent. You may need to rebuild the project after switching:
```bash
xmake f -c               # clear the cached build configuration
xmake b -r ygopro_ygoenv # force-rebuild the ygopro_ygoenv target
```
### Play against the agent
We can use `eval.py` to play against the trained agent with a MUD-like interface in the terminal. We add `--xla_device cpu` to run the agent on the CPU.
```bash
python -u eval.py --deck ../assets/deck --lang chinese --xla_device cpu --checkpoint checkpoints/350c29a_7565_6700M.flax_model --play
```
Enter `quit` to exit the game. Run `python eval.py --help` for more options; for example, `--player 0` makes the agent play as the first player, and `--deck1 TenyiSword` forces the first player to use the TenyiSword deck.
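
For instance, the options can be combined like this (a sketch assuming the flags behave as described above):

```bash
python -u eval.py --deck ../assets/deck --lang chinese --xla_device cpu \
  --checkpoint checkpoints/350c29a_7565_6700M.flax_model --play \
  --player 0 --deck1 TenyiSword
```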
### Battle between two agents
We can use `battle.py` to let two agents play against each other and find out which one is better.
```bash
python -u battle.py --deck ../assets/deck --checkpoint1 checkpoints/350c29a_7565_6700M.flax_model --checkpoint2 checkpoints/350c29a_1166_6200M.flax_model --num-episodes 32 --num_envs 8 --seed 0
```
We can set `--record` to write `.yrp` replay files to the `replay` directory. The `.yrp` files can be replayed in YGOPro-compatible clients (YGOPro, YGOPro2, KoishiPro, MDPro). Change `--seed` to generate different games.
```bash
python -u battle.py --deck ../assets/deck --checkpoint1 checkpoints/350c29a_7565_6700M.flax_model --checkpoint2 checkpoints/350c29a_1166_6200M.flax_model --num-episodes 16 --record --seed 0
```
## Training (Deprecated, to be updated)
Training an agent requires a lot of computational resources, typically 8x 4090 GPUs and a 128-core CPU for a few days. We don't recommend training the agent on your local machine. Reducing the number of decks used for training may reduce the computational resources required.
### Training
- League training (AlphaStar, ROA-Star)
- Nash equilibrium training (OSFP, DeepNash)
- Individual agents for the first and second player
- Centralized critic with full observation
### Inference
- Support for play in YGOPro
### Documentation
- JAX training
## Sponsors