Running the Vicuna-13B/7B Chatbot with FastChat
Introduction
The Vicuna-13B chatbot is an open-source conversational AI model created by fine-tuning LLaMA on user-shared conversations collected from ShareGPT. According to its authors' preliminary GPT-4-based evaluation, it reaches more than 90% of the quality of OpenAI ChatGPT and Google Bard while outperforming LLaMA and Stanford Alpaca in more than 90% of cases. This IT tutorial will guide you through setting up the environment and running the Vicuna-13B chatbot using the FastChat inference software.
Model and Software References:
- Vicuna-13B/7B Blog: https://lmsys.org/blog/2023-03-30-vicuna/
- FastChat GitHub Repository: https://github.com/lm-sys/FastChat
Installation and Setup
# Create conda environment
# conda create -n [env_name]
conda create -n chatbotDemo
# Activate the environment
# source activate [env_name]
source activate chatbotDemo
# Install pip and the FastChat package
conda install pip
pip3 install fschat
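Before moving on, you can confirm that the installation succeeded. A minimal check (the import line assumes the package exposes a __version__ attribute, as recent FastChat releases do):
# Verify that the fschat package is installed and importable
pip3 show fschat
python3 -c "import fastchat; print(fastchat.__version__)"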
Running the Chatbot
After creating the conda environment, you can activate it at any time by running:
# source activate [env_name]
source activate chatbotDemo
Single GPU Case
To run the chatbot on a GPU, execute the following commands. Vicuna-13B requires around 28 GB of GPU memory; Vicuna-7B requires around 14 GB.
# Request 4 cores, 50 GB RAM, and one 3g.40gb GPU slice with an interactive shell
srun -p gpu --gpus 3g.40gb:1 -c 4 --mem 50000 --pty bash
source activate chatbotDemo
python3 -m fastchat.serve.cli --model-path /pfss/toolkit/vicuna-13b --style rich
# Or run the smaller Vicuna-7B model instead
python3 -m fastchat.serve.cli --model-path /pfss/toolkit/vicuna-7b --style rich
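If the GPU you are allocated has less memory than the model needs, FastChat supports 8-bit compression, which roughly halves the memory footprint at a small cost in model quality. A minimal sketch using the --load-8bit flag documented in the FastChat README:
# Run Vicuna-13B with 8-bit compression to roughly halve GPU memory use
python3 -m fastchat.serve.cli --model-path /pfss/toolkit/vicuna-13b --style rich --load-8bit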
CPU-Only Case
If you prefer to run the chatbot on CPUs only (Vicuna-13B requires around 60 GB of CPU memory), follow these steps:
# Request 4 cores and 70 GB RAM with an interactive shell
srun -p batch -c 4 --mem 70000 --pty bash
source activate chatbotDemo
# The --device cpu flag is required; FastChat defaults to GPU inference
python3 -m fastchat.serve.cli --model-path /pfss/toolkit/vicuna-13b --style rich --device cpu
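Beyond the interactive CLI, FastChat can also expose the model through an OpenAI-compatible REST API, which is useful for scripted access. A minimal sketch of the three-process setup described in the FastChat README; the host, port, and the registered model name "vicuna-13b" are assumptions that depend on your deployment:
# Terminal 1: start the controller that coordinates model workers
python3 -m fastchat.serve.controller
# Terminal 2: start a model worker that loads Vicuna-13B and registers with the controller
python3 -m fastchat.serve.model_worker --model-path /pfss/toolkit/vicuna-13b
# Terminal 3: start the OpenAI-compatible API server
python3 -m fastchat.serve.openai_api_server --host localhost --port 8000
# Query the API; the model name is typically derived from the model path
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "vicuna-13b", "messages": [{"role": "user", "content": "Hello!"}]}'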
Conclusion
By following these steps, you can set up and run the Vicuna-13B chatbot using the FastChat inference software. Feel free to explore fine-tuning the model and evaluating the chatbot using the resources available on the Vicuna blog (ref: https://lmsys.org/blog/2023-03-30-vicuna/).