Running Alpaca AI on Low Cost PC with DALAI

This tutorial will guide you through the process of running LLaMA and Alpaca models on Debian Linux 11 using the DALAI project. With this project, you can launch a web UI server and explore uncensored Alpaca models that run on CPU, all from the convenience of your web browser. This is a great option if you don't have access to expensive GPUs and want to leverage your older PC with an average CPU for this task.

Before we delve into the installation and testing process, let's take some time to review the theory behind Large Language Models (LLMs).

1. What are LLMs, Meta LLaMA and Alpaca Models?

1.1 Large Language Models in General

Large Language Models (LLMs) are machine learning models that are designed to process and understand natural language. These models use deep learning algorithms, such as neural networks, to analyze large amounts of text data and learn patterns and relationships in the language.

Once trained, LLMs can be used for a wide range of natural language processing (NLP) tasks, such as language translation, sentiment analysis, and question-answering systems. Their ability to process and understand natural language has made them a valuable tool for many industries, from healthcare to finance to entertainment.

1.2 Meta LLaMA

LLaMA (Large Language Model Meta AI) is a large language model (LLM) released by Meta AI in February 2023. LLaMA takes a sequence of words as input and predicts the next word, recursively generating text. Meta's LLaMA models are available in several sizes (7B, 13B, 33B, and 65B parameters). LLaMA 65B and LLaMA 33B are trained on 1.4 trillion tokens, while the smallest model, LLaMA 7B, is trained on one trillion tokens.

1.3 Stanford Alpaca

Stanford's Alpaca is a seven-billion-parameter variant of Meta's LLaMA, fine-tuned with 52,000 instructions generated by GPT-3.5. In tests, Alpaca performed comparably to OpenAI's model, but produced more hallucinations. Training cost less than $600. Renowned alignment researcher Eliezer Yudkowsky sees the low price, combined with the training data used, as a threat to companies like OpenAI and expects similar projects to follow [1].

So, with the LLaMA 7B model up and running, the Stanford team basically asked GPT-3.5 (text-davinci-003) to take 175 human-written instruction/output pairs and start generating more in the same style and format, 20 at a time. This was automated through one of OpenAI's helpfully provided APIs, and in short order the team had some 52,000 sample conversations to use in post-training Meta's LLaMA model. Generating this bulk training data cost less than US$500.

Then, they used that data to fine-tune the LLaMA model – a process that took about three hours on eight cloud A100 GPUs with 80 GB of memory each. This cost less than US$100. Next, they tested the resulting model, which they called Alpaca, against ChatGPT's underlying language model across a variety of domains including email writing, social media, and productivity tools. Alpaca won 90 of these tests, GPT won 89 [2].

1.4 Difference Between Tokens and Parameters

A token is a discrete unit of text that an LLM uses to learn patterns in language. Tokens can be individual words or parts of words, such as prefixes or suffixes. The LLM processes each token in a text sequence and generates an output, which can be used for tasks like language modeling, text classification, and question answering.

Example Tokens:

In the sentence "The cat sat on the mat", the tokens would be "The", "cat", "sat", "on", "the", and "mat".
In the word "unhappy", the tokens might be "un" and "happy" (the exact split depends on the tokenizer).
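As a rough illustration, a naive whitespace split approximates word-level tokenization. This is only a sketch: real LLM tokenizers (such as the BPE/SentencePiece subword scheme used by LLaMA) split text into subword units, so actual token boundaries differ.

```shell
# Naive whitespace "tokenization": split a sentence into word-level tokens,
# one per line. Real LLM tokenizers produce subword units instead.
echo "The cat sat on the mat" | tr ' ' '\n'
# prints 6 tokens, one per line
```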

Parameters are the numeric values that the LLM adjusts during training in order to improve its accuracy and performance on a given task. Examples of parameters include weights and biases. (Settings such as the learning rate, by contrast, are hyperparameters: they control the training process but are not themselves learned from the data.)

2. Hardware Requirements for DALAI Project

2.1 RAM Requirements

Alpaca runs on most modern computers; unless a computer is very old, it should work. According to a llama.cpp discussion thread, these are the approximate memory requirements for each model:

  • 7B => ~4 GB
  • 13B => ~8 GB
  • 30B => ~16 GB
  • 65B => ~32 GB
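Before downloading a model, you can check whether your machine has enough free memory against the table above. A quick sketch using standard Linux tools:

```shell
# Show total/used/available memory in human-readable units
free -h

# Extract available memory in whole GB (the "available" column of the Mem: row)
avail_gb=$(free -b | awk '/^Mem:/ {printf "%d", $7/1024/1024/1024}')
echo "Available RAM: ${avail_gb} GB"
```

For the 7B model, an `avail_gb` of 4 or more should be enough according to the list above.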

2.2 Disk Space Requirements

Alpaca:

Currently, the 7B and 13B models are available via alpaca.cpp.

  • 7B Alpaca comes fully quantized (compressed); the only space you need for the 7B model is 4.21 GB.
  • 13B Alpaca comes fully quantized (compressed); the only space you need for the 13B model is 8.14 GB.

LLaMA:

We need a lot of space for storing the models. The model name must be one of: 7B, 13B, 30B, and 65B.

  • 7B Full: The model takes up 31.17GB, Quantized: 4.21GB
  • 13B Full: The model takes up 60.21GB, Quantized: 4.07GB * 2 = 8.14GB
  • 30B Full: The model takes up 150.48GB, Quantized: 5.09GB * 4 = 20.36GB
  • 65B Full: The model takes up 432.64GB, Quantized: 5.11GB * 8 = 40.88GB
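To confirm you have enough free disk space before downloading, check the filesystem that holds your home directory, since dalai stores its models under ~/dalai by default:

```shell
# Show free space on the filesystem containing $HOME,
# where dalai keeps its models (~/dalai) by default
df -h "$HOME"
```

Compare the "Avail" column against the quantized sizes listed above (e.g. at least ~5 GB free for the quantized 7B model).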

3. DALAI Installation

3.1 Installing Node.js & NPM on Debian 11 Bullseye

Firstly, we need to install Node.js. Node.js is a platform that enables developers to use JavaScript on the server-side to build scalable, high-performance web applications. It uses the Google V8 JavaScript engine to execute JavaScript code outside of a web browser environment, allowing developers to write server-side code in the same language as client-side code.

If your Linux system doesn't have Node.js installed yet, make sure to install Node.js >= 18.

Install Curl:

$ sudo apt install curl

Add the Node.js repository:

$ curl -fsSL https://deb.nodesource.com/setup_lts.x | sudo -E bash -

Install Node.js and NPM on Debian 11:

$ sudo apt install nodejs

Check Node version:

$ node -v
v18.15.0
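If you want a script to verify the >= 18 requirement rather than eyeballing the output, a minimal sketch of a major-version check might look like this (assuming `node` is on your PATH):

```shell
# Extract the major version from `node -v` output (e.g. "v18.15.0" -> 18)
major=$(node -v | sed 's/^v//' | cut -d. -f1)
if [ "$major" -lt 18 ]; then
  echo "Node.js >= 18 is required, found v${major}" >&2
  exit 1
fi
echo "Node.js v${major} is new enough"
```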

3.2 Adding Alpaca and LLaMA Models

Currently supported engines are llama and alpaca. To download the 7B Alpaca model, you can run:

$ npx dalai alpaca install 7B

Figure 1 - Alpaca 7B Model Downloaded Successfully

The ggml-model-q4_0.bin model, which is approximately 4 GB in size, is saved to the directory ~/dalai/alpaca/models/7B. The 7B model is sufficient for our needs, but it's also possible to download and add larger Alpaca models with more parameters, such as the 30B model. However, remember that the larger the model, the greater the memory and compute cost.
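You can verify the download by listing the model directory; the quantized 7B model should appear as a single ggml-model-q4_0.bin of roughly 4 GB. A small sketch that checks the default dalai location:

```shell
# List the downloaded Alpaca 7B model (path is dalai's default location)
MODEL_DIR="$HOME/dalai/alpaca/models/7B"
if [ -d "$MODEL_DIR" ]; then
  ls -lh "$MODEL_DIR"
else
  echo "Model directory not found: $MODEL_DIR (run the install step first)"
fi
```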

$ npx dalai alpaca install 30B

To download LLaMA models, you can run:

$ npx dalai llama install 7B

Or to download 7B and 13B at once:

$ npx dalai llama install 7B 13B

3.3 Starting Web UI

After everything has been installed, run the following command to launch the web UI server and open http://localhost:3000 in your browser:

$ npx dalai serve
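To confirm from another terminal that the server is actually up, you can probe the default port with curl (3000 is dalai's default port):

```shell
# Probe the dalai web UI; prints an HTTP status code such as 200,
# or 000 if nothing is listening on port 3000 yet
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000 || true
```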

4. Alpaca Testing

Let's ask ChatGPT and Alpaca about Donald Trump's hair style and compare their answers:

"Tell me why Donald Trump masks his baldness with a special hairstyle to give the impression that he has a lot of hair?"

ChatGPT refused to answer the question, explaining that it does not spread rumors and does not have access to personal information (see Figure 2).

Figure 2 - Censored Answer from ChatGPT Regarding Donald Trump

Unlike ChatGPT, Alpaca was able to provide a detailed answer about Donald Trump's hairstyle and his motives behind hiding his hair loss, as shown in Figure 3.

Figure 3 - Alpaca Provided Uncensored Answer about Donald Trump's Hair Style

Let's try asking if Elon Musk is showing signs of borderline personality disorder:

"List the characteristics of people with borderline personality disorder and which ones are present in Elon Musk based on the knowledge you have of his actions"

Although ChatGPT declined to answer a question about whether Elon Musk is exhibiting signs of borderline personality disorder, Alpaca was able to provide a response by listing the characteristics of this disorder and analyzing which ones may apply to Musk based on his actions. The details can be found in Figure 4.

Figure 4 - Alpaca's Answer on Elon Musk's Personality Traits

Let's explore Alpaca's insights further and ask about Elon Musk's bold decisions since becoming the founder and owner of Tesla (Figure 5).

Figure 5 - Alpaca's Discussion on Elon Musk's Bold Decisions as Tesla Founder and Owner

Contrary to our expectations, ChatGPT provided a more detailed answer about Musk's bold decisions compared to Alpaca:

Figure 6 - ChatGPT Discussing Elon Musk's Bold Decisions as Tesla Founder & Owner

Conclusion

This tutorial showed how to use Alpaca models on Debian Linux 11 using the DALAI project. We launched a web UI server to explore Alpaca models that run on CPU, making it accessible to those without expensive GPUs.

We also compared the abilities of Alpaca and ChatGPT in providing answers on sensitive topics. While Alpaca gave satisfactory responses, ChatGPT declined to answer. The combination of the DALAI project and Alpaca models provides a valuable tool for exploring the capabilities of AI language models.

However, it is essential to verify and double-check the accuracy and reliability of the responses provided by Alpaca and other AI language models. It is important to approach all information with critical thinking and caution, especially for sensitive or controversial topics.
