Run and train OpenChatKit on an 80GB A100 GPU
OpenChatKit provides an open-source framework to train general-purpose chatbots. It includes a pre-trained 20B parameter language model as a good starting point.
First, we will prepare the Conda environment. Start by requesting an interactive shell on a compute node.
srun -N1 -c8 -p batch --pty bash
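Before going further, it can help to confirm that the shell really is running inside the Slurm allocation and not on a login node. The following is a minimal sketch, not part of OpenChatKit: the helper name `in_slurm_job` is hypothetical, and it assumes only that Slurm exports `SLURM_JOB_ID` inside a job, which it normally does.

```shell
# Hypothetical helper: succeeds only when the current shell is inside a Slurm job.
# Assumption: Slurm sets SLURM_JOB_ID for every job step, including `srun --pty`.
in_slurm_job() {
    [ -n "${SLURM_JOB_ID:-}" ]
}
```

For example, `in_slurm_job && echo "on a compute node" || echo "still on the login node"` prints which case applies.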
Run the following commands inside the interactive shell.
# the pre-trained 20B model takes 40GB of space, so we use the scratch folder
cd $SCRATCH
# load the required modules, then check out the kit
module load Anaconda3/2022.05 GCCcore git git-lfs
git clone https://github.com/togethercomputer/OpenChatKit.git
cd OpenChatKit
git lfs install
conda config --set channel_priority flexible
# create the Conda environment based on the provided environment.yml
# it may take over an hour to resolve and install all Python dependencies
conda env create --name OpenChatKit -f environment.yml python=3.10.9
# verify it is created
conda env list
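Reading the output of `conda env list` by eye works, but in a script it may be more convenient to test for the environment by name. The following is a minimal sketch under stated assumptions: the helper `env_exists` is hypothetical (not a Conda command), and it assumes the usual `conda env list` layout, where comment lines start with `#` and the environment name is the first column.

```shell
# Hypothetical helper: reads `conda env list` output on stdin and succeeds
# if an environment with the given name appears in the first column.
env_exists() {
    awk -v name="$1" '
        !/^#/ && $1 == name { found = 1 }   # skip comment lines, match the name exactly
        END { exit !found }                 # exit 0 if found, 1 otherwise
    '
}
```

For example, `conda env list | env_exists OpenChatKit && echo "environment is ready"` prints the message only if the environment was created.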