Running Large Language Models on CPU

Large language models (LLMs) can run on a CPU, but performance depends on the size of the model and the complexity of the task. Models with billions of parameters, such as GPT-3, are usually run on specialized hardware such as GPUs or TPUs to achieve faster performance.

Running a large language model on a CPU requires a powerful processor, enough memory to hold the model weights, and an efficient implementation of the model code. Even then, both training and inference are slower on a CPU than on specialized hardware.

That being said, smaller language models can be run efficiently on CPU and can still perform well for a range of tasks such as text classification and language translation.
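
To get a feel for what "sufficient memory" means, the weights alone need roughly parameters × bits per weight / 8 bytes. The sketch below turns that into a small shell helper. The name estimate_model_gib is made up for this example, and the result is only a lower bound: quantized files also store scaling factors, and the runtime needs extra working memory on top.

```shell
# Rough lower bound on the RAM needed just to hold the weights:
#   parameters * bits_per_weight / 8 bytes, printed in GiB.
# estimate_model_gib is a hypothetical helper, not part of any project below.
estimate_model_gib() {
    # $1 = parameters in billions, $2 = bits per weight
    LC_ALL=C awk -v p="$1" -v b="$2" \
        'BEGIN { printf "%.1f\n", p * 1e9 * b / 8 / (1024 ^ 3) }'
}

estimate_model_gib 7 4     # 7B model, 4-bit weights   -> 3.3
estimate_model_gib 30 4    # 30B model, 4-bit weights  -> 14.0
estimate_model_gib 7 16    # 7B model, 16-bit weights  -> 13.0
```

This is why the 4-bit quantized models used in this tutorial are practical on an ordinary PC while the same models at 16 bits per weight are much harder to fit.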

This tutorial walks through installing several large language model (LLM) projects, with detailed steps so that you can install and experiment with them on your own machine. The projects we will be installing are:

  • Alpaca.cpp
  • llama.cpp
  • GPT4All
  • GPT4All-J

1. Alpaca.cpp

Alpaca.cpp from antimatter15 is a C++ project that lets us run a fast ChatGPT-like model locally on a PC. We will compile it from source, so first install Git, wget, and the build tools if they are not already on your system, then clone the Git repository that contains the code:

$ sudo apt install git wget build-essential
$ git clone https://github.com/antimatter15/alpaca.cpp.git
$ cd alpaca.cpp/

Compile the binary from the source code:

$ make 

Download ggml-alpaca-7b-q4.bin and place it in the same folder as the chat executable:

7B model:
$ wget https://huggingface.co/Sosaka/Alpaca-native-4bit-ggml/resolve/main/ggml-alpaca-7b-q4.bin

We can also download and use the larger Alpaca 30B model to get better results:

$ wget https://huggingface.co/Pi3141/alpaca-lora-30B-ggml/resolve/main/ggml-model-q4_0.bin

However, it's important to keep in mind that using a larger model also requires more computational resources. Therefore, it may not always be feasible depending on the available hardware.
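
One way to check feasibility before loading a model is to compare the size of the downloaded file with the memory currently available. The following is a minimal sketch, assuming a Linux system that exposes MemAvailable in /proc/meminfo. The helper names fits_in_memory and check_model are hypothetical, and the comparison is only approximate because the program needs working memory beyond the weights.

```shell
# fits_in_memory SIZE_BYTES AVAILABLE_KIB
# Succeeds when SIZE_BYTES is smaller than the available memory.
fits_in_memory() {
    [ "$1" -lt $(( $2 * 1024 )) ]
}

# check_model FILE
# Compares the size of FILE with MemAvailable from /proc/meminfo (Linux only).
check_model() {
    [ -f "$1" ] || { echo "$1: no such file" >&2; return 2; }
    size=$(( $(wc -c < "$1") ))
    avail=$(awk '/^MemAvailable:/ { print $2 }' /proc/meminfo 2>/dev/null)
    [ -n "$avail" ] || { echo "cannot read MemAvailable" >&2; return 2; }
    if fits_in_memory "$size" "$avail"; then
        echo "$1: should fit in available memory"
    else
        echo "$1: larger than available memory, expect swapping or a failed load"
        return 1
    fi
}

# Example (run in the directory that holds the model):
#   check_model ggml-model-q4_0.bin
```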

By default, the chat utility looks for a model named ggml-alpaca-7b-q4.bin in the directory from which it is started. We can change the path to the model with the -m (--model) parameter:

Run:
$ ./chat --model ggml-alpaca-7b-q4.bin
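
If the path passed to --model does not point to an existing file, chat fails while loading the model. A small wrapper can catch this earlier and print a clearer message. This is only a sketch: start_chat is a hypothetical helper name, and it assumes the chat binary sits in the current directory.

```shell
# start_chat [MODEL]
# Hypothetical wrapper: checks that the model file is readable before
# launching the chat binary from the current directory.
start_chat() {
    model=${1:-ggml-alpaca-7b-q4.bin}
    if [ ! -r "$model" ]; then
        echo "model file not found: $model" >&2
        return 1
    fi
    ./chat --model "$model"
}

# Example:
#   start_chat                        # default 7B model
#   start_chat ggml-model-q4_0.bin    # 30B model downloaded above
```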

chat also uses 4 threads for computation by default. To use all CPU threads, start chat as follows (Figure 1):

$ ./chat --model ggml-alpaca-7b-q4.bin --threads $(lscpu | grep "^CPU(s)" | awk '{print $2}')

Figure 1 - Running the 7B Alpaca Model Using Alpaca.cpp
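
The lscpu pipeline above prints the number of logical CPUs. As an alternative sketch, the helpers below get the same number from nproc (GNU coreutils) and also count physical cores, which can be worth trying as a thread count on CPUs with hyper-threading. The names cpu_threads and physical_cores are made up for this example, and whether fewer threads is actually faster on your machine is something to measure rather than assume.

```shell
# cpu_threads: number of logical CPUs, falling back to 4
# (the chat default) when it cannot be determined.
cpu_threads() {
    n=$(nproc 2>/dev/null) || n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=4
    echo "$n"
}

# physical_cores: number of unique (core, socket) pairs reported by lscpu,
# falling back to cpu_threads when lscpu is not available.
physical_cores() {
    c=$(lscpu -p=CORE,SOCKET 2>/dev/null | grep -v '^#' | sort -u | wc -l)
    if [ "$c" -ge 1 ]; then
        echo "$c"
    else
        cpu_threads
    fi
}

# Example:
#   ./chat --model ggml-alpaca-7b-q4.bin --threads "$(physical_cores)"
```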

I have also included an answer generated by the 7B Alpaca model in response to the given prompt:

> write an article about ancient Romans. First give me a outline which consist of headline, teaser and several subheadings. Complete all subheadings.

[Image generated by Stable Diffusion 2.1]

Headline: Ancient Roman Culture

Teaser: The Romans were one of the greatest civilizations in history, with an impressive legacy that still influences our world today.

Subheading I: Introduction to Rome and its People

The ancient city-state of Rome was founded as a small community on the banks of the Tiber River around 753 BC by legendary figures such as Romulus and Remus, two brothers who were said to have built the foundations for one of history’s greatest empires. The Roman people quickly established themselves through their strength in battle against neighboring tribes, eventually expanding Rome's influence over a large portion of Europe, Africa and the Middle East by 27 BC when Augustus became Emperor.

Subheading II: Language & Law

The Romans were known for having an impressive legal system which was based on their highly developed language - Latin. This language evolved into many modern languages such as French, Spanish or Italian due to its widespread use in the Roman Empire and beyond. The law of Rome also had a lasting impact with concepts like “civil law” still being used today.

[Image generated by Stable Diffusion 2.1]

Subheading III: Religion & Arts

The Romans were known for their impressive architecture which can be seen throughout Europe, such as the Colosseum in Italy or the Pantheon temple in Paris. Roman religion was also highly influential with gods like Jupiter and Juno being adopted by many European countries after Rome’s fall around 476 AD. The Romans were known for their impressive sculptures, paintings and literature which are still studied today as classic works of art such as Virgil's Aeneid or Livy's History Of Rome.

Subheading IV: Conclusion & Legacy

The Roman Empire was one of the greatest civilizations in history with an impressive legacy that is still felt to this day, from their language and law being used around the world today to their incredible architecture which can be seen throughout Europe. The Romans left a lasting impact on our modern society through their strength in battle, advanced legal system and influential religion as well as artistic works of classic literature that are still studied today.

The screenshot below shows the 30B model running in Alpaca.cpp.

Figure 2 - Running the 30B Alpaca Model Using Alpaca.cpp

Figure 3 displays the answer generated by the Alpaca.cpp program using the 30B Alpaca model.

> what is the best weapon we can use to fight ancient daemons? Give me some examples in more details.

Figure 3 - Running the 30B Alpaca Model with Alpaca.cpp

2. llama.cpp

To run a wider range of models on a CPU, consider llama.cpp instead of Alpaca.cpp. It supports a broader set of models, including:

  • LLaMA
  • Alpaca
  • GPT4All
  • Chinese LLaMA / Alpaca
  • Vigogne (French)
  • Vicuna
  • Koala

We will use llama.cpp to run both the Alpaca 7B model and the GPT4All model. First, clone the Git repository and compile the application from source:

$ git clone https://github.com/ggerganov/llama.cpp
$ cd llama.cpp

$ make

2.1 llama.cpp Using Alpaca 7B Model

If you already downloaded the Alpaca 7B model in Section 1, copy it into the models/ directory and skip this step. Otherwise, download it into models/ now:

$ wget -P models/ https://huggingface.co/Sosaka/Alpaca-native-4bit-ggml/resolve/main/ggml-alpaca-7b-q4.bin

After downloading the model, we can run it in instruct mode:

$ ./main -m models/ggml-alpaca-7b-q4.bin -ins -t $(lscpu | grep "^CPU(s)" | awk '{print $2}') --color

2.2 llama.cpp Using GPT4All Model

Download the gpt4all-lora-quantized-ggml.bin model:

$ wget -P models/ https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-quantized-ggml.bin

After downloading the model, we can run it in instruct mode:

$ ./main -m models/gpt4all-lora-quantized-ggml.bin -t $(lscpu | grep "^CPU(s)" | awk '{print $2}') -ins --color

We will ask the same question about ancient Romans as we did with the Alpaca 7B model. Unlike the answer generated by the Alpaca model, this answer does not fully follow the instructions, since it lacks distinct sections for a headline, teaser, and subheadings.

> write an article about ancient Romans. First give me a outline which consist of headline, teaser and several subheadings. Complete all subheadings.

The Ancient Roman Empire was one of the most influential civilizations in history, with its influence still felt today. The empire ruled by Caesar Augustus and later on expanded to cover much of Europe and North Africa.

The ancient Romans were known for their highly developed culture, including art, architecture, and literature. The Roman Empire was established during the 6th century BC in Italy, which is currently considered modern-day Italy. It started with a small city state called Rome that evolved into one of the most powerful civilizations in history. The Romans were known for their military prowess and expansionist policies, which helped them gain control over much of Europe and North Africa.

The Roman Empire was ruled by Caesar Augustus, who is considered to be one of the most influential figures in ancient Rome. He transformed the empire into a highly developed civilization, with his policies contributing greatly to the Roman Empire's success. The Romans also had an advanced culture, which included art, architecture, and literature.

The Romans were known for their excellent artwork, including sculpture, mosaics, and paintings that are still admired today. They produced some of the most iconic buildings in history, such as the Pantheon, Colosseum, and Forum of Augustus. The Romans also had a rich literary tradition, which included works by Virgil, Horace, and Ovid.

In conclusion, the Roman Empire was one of the most influential civilizations in history, with its influence still felt today. Its success is due to a combination of military prowess, cultural development, and expansionist policies that led to its expansion across Europe and North Africa. The Romans are known for their outstanding artwork and architecture, as well as an impressive literary tradition that included works by some of the greatest poets in history.
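
To repeat this kind of comparison without retyping the prompt, the same text can be passed to each model non-interactively. The function below is a sketch under a few assumptions: compare_models is a hypothetical helper, it uses the -p (prompt) and -n (number of tokens to generate) options of the llama.cpp main binary, and the limit of 512 tokens is an arbitrary choice. Without -ins the model sees the text as a plain prompt to continue, so the answers may differ from the interactive sessions shown here.

```shell
# compare_models BIN PROMPT MODEL...
# Runs BIN once per model with the same prompt and prints a header
# before each answer so the outputs are easy to tell apart.
compare_models() {
    bin=$1
    prompt=$2
    shift 2
    for model in "$@"; do
        echo "=== $model ==="
        "$bin" -m "$model" -p "$prompt" -n 512
    done
}

# Example (run from the llama.cpp directory):
#   compare_models ./main "write an article about ancient Romans." \
#       models/ggml-alpaca-7b-q4.bin models/gpt4all-lora-quantized-ggml.bin
```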

3. GPT4All and GPT4All-J

GPT4All is a demo, data, and code release from nomic-ai for training open-source assistant-style large language models based on GPT-J and LLaMA.

Clone the Git repository, which includes prebuilt chat binaries in its chat/ directory:

$ git clone https://github.com/nomic-ai/gpt4all
$ cd gpt4all

3.1 GPT4All

GPT4All functions similarly to Alpaca and is based on the LLaMA 7B model. The team fine-tuned LLaMA 7B and trained the final model on 437,605 post-processed assistant-style prompts.

Download the unfiltered model, which had all refusal-to-answer responses removed from its training data and therefore does not know how to say “I don’t know” or refuse to answer:

$ wget https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-unfiltered-quantized.bin

Note: If you prefer the original filtered model, you can download it as follows:

$ wget https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-filtered-quantized.bin
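
Model files are several gigabytes, so an interrupted download can leave a truncated file that later fails to load. If the distributor publishes a checksum, it is worth comparing it before running the model. The helper below is a generic sketch: verify_sha256 is a hypothetical name, and this tutorial does not provide checksum values, so the expected value must come from wherever you obtained the model.

```shell
# verify_sha256 FILE EXPECTED_HEX
# Succeeds when the SHA-256 digest of FILE equals EXPECTED_HEX.
verify_sha256() {
    actual=$(sha256sum "$1" | awk '{ print $1 }')
    if [ "$actual" = "$2" ]; then
        echo "$1: checksum OK"
    else
        echo "$1: checksum mismatch (got $actual)" >&2
        return 1
    fi
}

# Example (replace <expected> with the published digest):
#   verify_sha256 gpt4all-lora-unfiltered-quantized.bin <expected>
```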

Run the model:

$ ./chat/gpt4all-lora-quantized-linux-x86 -m gpt4all-lora-unfiltered-quantized.bin -t $(lscpu | grep "^CPU(s)" | awk '{print $2}') -i

> write an article about ancient Romans. First give me a outline which consist of headline, teaser and several subheadings. Complete all subheadings.

The Ancient Romans were one of history's most influential civilizations that shaped the world we live in today through their innovations such as roads, aqueducts, bridges, bathhouses, sewers and public utilities like water supply systems which are still used to this day. The Roman Empire was a vast empire with territories stretching from Britain across Europe into Africa and Asia Minor (modern Turkey).

The Romans were known for their military prowess as they conquered much of the world around them, but also had great artistic achievements such as architecture, sculpture, literature, poetry and music. The Roman Empire was ruled by emperors who wielded absolute power over all aspects of life in Rome from religion to politics to entertainment. The Romans were known for their love of luxury items like jewelry made with precious stones such as emeralds, rubies or sapphires and also loved exotic animals like lions tigers and elephants which they kept in huge public displays called amphi-theatres.

The Romans were known for their love of food too especially meat dishes like roasted pork belly with honey, stuffed peacocks or even whole boars cooked over an open fire pit. The Ancient Roman Empire was a complex and fascinating civilization that shaped the world we live in today through its innovations as well as artistic achievements which are still admired by historians around the globe to this day!

3.2 GPT4All-J

GPT4All-J differs from GPT4All in that it is fine-tuned from the GPT-J model rather than LLaMA. The primary advantage of building on GPT-J is that, unlike GPT4All, GPT4All-J is licensed under the Apache-2 license, which permits commercial use of the model.

Clone the repositories and build ggml, whose gpt-j example binary is used to run the model:

$ git clone --recurse-submodules https://github.com/nomic-ai/gpt4all-chat
$ cd gpt4all-chat
$ git clone https://github.com/ggerganov/ggml
$ cd ggml
$ mkdir build
$ cd build/
$ cmake ..
$ cmake --build . --parallel

Download the GPT4All-J model into the build directory:

$ wget https://gpt4all.io/models/ggml-gpt4all-j-v1.2-jazzy.bin

Run the model with the same prompt as before:

$ ./bin/gpt-j -m ggml-gpt4all-j-v1.2-jazzy.bin -p "write an article about ancient Romans. First give me a outline which consist of headline, teaser and several subheadings. Complete all subheadings." -t $(lscpu | grep "^CPU(s)" | awk '{print $2}')

The ancient Romans were a fascinating group of people who left a lasting impact on history. They were known for their exceptional architecture, sophisticated society, and unwavering commitment to justice. The Roman Empire was one of the most powerful and prosperous societies of its time, with an economy that was fueled by trade and commerce.

The Romans were also known for their innovative engineering, with many of their buildings and structures still standing today. From the grand temples of the Forum to the simple yet elegant aqueducts, the Romans were a people of great skill and creativity. Their legacy can still be seen and felt throughout the world, from the grandeur of ancient Rome to the many museums and monuments that still stand today.

End.

