Run nemo-megatron-gpt-5B model with NVIDIA NeMo

Introduction

NVIDIA NeMo is a powerful toolkit designed for researchers working on various conversational AI tasks, including automatic speech recognition (ASR), text-to-speech synthesis (TTS), large language models (LLMs), and natural language processing (NLP). It aims to facilitate the reuse of existing code and pretrained models while enabling the creation of new conversational AI models. In this tutorial, we will explore NeMo's capabilities and learn how to use the Megatron-GPT 5B language model for language modeling tasks.

Model and Software References:

Launch Jupyter Lab Job

Create a Jupyter Lab job with the following specifications:

  • CPU Cores: 4
  • Memory: 64 GB
  • GPU: 3g.40gb

Open your web browser and navigate to the Jupyter Lab web interface.

In the Jupyter Lab menu, open the Terminal.

Enabling the NeMo Container Kernel in Jupyter Lab

Execute the following commands in the Terminal:

cd $HOME
mkdir -p .local/share/jupyter/kernels/ngc.nemo.22.07
echo '
{
 "language": "python",
 "argv": ["/usr/bin/singularity",
   "exec",
   "--nv",
   "-B",
   "/run/user:/run/user",
   "/pfss/containers/ngc.nemo.22.07.sif",
   "python",
   "-m",
   "ipykernel",
   "-f",
   "{connection_file}"
 ],
 "display_name": "nemo22.07"
}
' > .local/share/jupyter/kernels/ngc.nemo.22.07/kernel.json

After writing the kernel.json file, refresh your browser by pressing F5. "nemo22.07" should now appear under the Notebook section of the Jupyter Lab Launcher.
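
If the kernel does not appear, a malformed kernel.json is one likely cause. The following is a minimal sketch, not part of the original tutorial, that checks the file for the keys used above. The helper name check_kernel_spec is hypothetical, and the path assumes the directory created by the commands above.

```python
import json
from pathlib import Path

# Path written by the commands above; adjust it if you used another directory.
KERNEL_JSON = Path.home() / ".local/share/jupyter/kernels/ngc.nemo.22.07/kernel.json"


def check_kernel_spec(path):
    """Return a list of problems found in a kernel.json file (empty if none)."""
    try:
        spec = json.loads(Path(path).read_text())
    except (OSError, ValueError) as err:
        # Covers a missing or unreadable file as well as invalid JSON.
        return [f"cannot read {path}: {err}"]
    if not isinstance(spec, dict):
        return ["top-level JSON value is not an object"]
    problems = []
    for key in ("argv", "display_name", "language"):
        if key not in spec:
            problems.append(f"missing key: {key}")
    argv = spec.get("argv", [])
    # Jupyter substitutes this placeholder with the connection file path.
    if not isinstance(argv, list) or "{connection_file}" not in argv:
        problems.append("argv lacks the {connection_file} placeholder")
    return problems


# Usage (run in the Terminal with python3, or in any notebook cell):
# print(check_kernel_spec(KERNEL_JSON) or "kernel.json looks fine")
```

An empty list only means the file is well-formed JSON with the expected keys; it does not confirm that the container image path is valid.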

Launch eval server

Execute the following commands in the Terminal:

# set the TMPDIR environment variable
export TMPDIR=/pfss/scratch02/appcara/nlp/tmp

# start the eval server with the nemo-megatron-gpt-5B model inside the NeMo container (port 5556)
singularity run --nv /pfss/containers/ngc.nemo.22.07.sif \
  python /pfss/scratch02/appcara/nlp/NeMo/examples/nlp/language_modeling/megatron_gpt_eval.py \
  gpt_model_file=/pfss/scratch02/appcara/nlp/nemo_gpt5B_fp16_tp1.nemo \
  server=true \
  tensor_model_parallel_size=1 \
  trainer.devices=1 \
  port=5556
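
Loading a 5B-parameter model can take a while, so requests sent too early are likely to fail with a connection error. The sketch below is an assumption, not part of the original tutorial: it polls the port given in the command above until something accepts TCP connections. The helper name wait_for_port and the timeout values are illustrative.

```python
import socket
import time


def wait_for_port(host="localhost", port=5556, timeout=600.0, interval=5.0):
    """Poll until a TCP connection to host:port succeeds; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means some process is listening on the port.
            with socket.create_connection((host, port), timeout=2.0):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Usage (from a second Terminal or a notebook cell):
# print("server reachable" if wait_for_port() else "server not reachable yet")
```

An open port only shows that a process is listening on 5556; it does not by itself prove that the process is the eval server.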

Send prompts to the model

Copy the example notebook:

# copy the jupyter example file into your home folder
cp $SCRATCH_APPCARA/nlp/nemo-megatron-gpt-template.ipynb $HOME

In the "File Browser" section of Jupyter Lab, locate the copied file and open it. Then switch the notebook's kernel to nemo22.07.

In the notebook, edit the "sentences" entry so that it contains the prompts you want to send to the model, then run the notebook.
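
The contents of the template notebook are not reproduced here, so the following is only a sketch of what such a request might look like. It assumes the eval server accepts a JSON PUT on /generate with the fields shown in the public nemo-megatron-gpt-5B model card; the helper names build_payload and generate are hypothetical, and the sampling values are examples rather than recommendations.

```python
import json
import urllib.request

PORT = 5556  # must match the port= argument used when starting the eval server


def build_payload(sentences, tokens_to_generate=50):
    """Build the request body; field names are assumed from the model card."""
    return {
        "sentences": list(sentences),
        "tokens_to_generate": tokens_to_generate,
        "temperature": 1.0,
        "add_BOS": True,
        "top_k": 0,
        "top_p": 0.9,
        "greedy": False,
        "all_probs": False,
        "repetition_penalty": 1.2,
        "min_tokens_to_generate": 2,
    }


def generate(sentences, host="localhost", port=PORT):
    """Send the prompts to the eval server and return the generated sentences."""
    request = urllib.request.Request(
        f"http://{host}:{port}/generate",
        data=json.dumps(build_payload(sentences)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["sentences"]


# Usage (requires the eval server to be running):
# print(generate(["Tell me an interesting fact about space travel."]))
```

Because "sentences" is a list, several prompts can be sent in one request.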
