In the previous tutorial "Cracking WPA/WPA2 Pre-shared Key Using GPU", we showed how to crack WPA/WPA2 pre-shared keys (PSKs) using the Hashcat tool with GPU acceleration. In this tutorial, we will focus on optimizing the PSK cracking process for NVIDIA GPUs using CUDA. We will also provide instructions on how to install and use both the legacy and current versions of Hashcat to crack WPA2 PSKs.
For brute-force attacks, CUDA is advantageous because it exposes the massive parallelism of the GPU. GPUs have thousands of cores, which makes them well suited to brute-force attacks, where the same calculation is repeated for every candidate solution. CUDA distributes this workload across the cores, significantly accelerating the process compared to traditional CPU-based approaches.
1. CUDA versus OpenCL
Before we delve into the installation process, let's explore why we will use the CUDA toolkit to speed up brute-force attacks.
1.1 CUDA
CUDA (Compute Unified Device Architecture) is a parallel computing platform and programming model developed by NVIDIA. CUDA runs on Windows and Linux (macOS support ended with CUDA 10.2), but only on NVIDIA hardware. If your hardware supports CUDA, Hashcat will pick it, because CUDA is faster than OpenCL (see thread). If your compute device does not support CUDA, Hashcat will fall back to the OpenCL backend.
1.2 OpenCL
Open Computing Language (OpenCL) serves as an independent, open standard for cross-platform parallel programming. OpenCL applications can run on almost any operating system, and on most types of hardware, including NVIDIA, AMD, FPGAs and ASICs. A study that directly compared CUDA programs with OpenCL on NVIDIA GPUs showed that CUDA was 30% faster than OpenCL [1].
1.3 Do I Need to Install the NVIDIA Proprietary Driver to Use OpenCL on NVIDIA?
OpenCL is a framework for parallel programming that allows developers to write code that can be executed on various types of processors, including GPUs. However, to utilize OpenCL specifically with NVIDIA GPUs, we need to have the appropriate driver installed. The NVIDIA proprietary driver provides the necessary components and optimizations for OpenCL to work effectively on NVIDIA hardware.
2. Used Hardware/Software and Files for Download:
In this section, I'll be sharing the hardware and software specifications of the Legion 5 laptop that was utilized in the test. Additionally, I'll provide all the necessary files that can be used with both legacy and modern versions of Hashcat to assess the compatibility of your hardware and software for hash brute-forcing.
Hardware:
- Laptop Legion 5 Pro 16IAH7H (for generating AI images with NVIDIA GPU)
- GPU: NVIDIA GeForce RTX 3070 Ti, 150 W, 8 GB VRAM
- CPU: 12th Gen Intel(R) Core(TM) i9-12900H
- RAM: 32 GB
Software:
- OS Debian 11
- NVIDIA Driver 470.182.03 from Debian 11 repository
- CUDA Version 11.4 from Debian 11 repository
- Hashcat 6.1.1 from Debian 11 repository
- Hashcat 6.2.6 from GitHub
Downloads:
- Pcap file output_file-01.7z with 4-way handshake (ESSID: HackMeIfYouCan, passphrase: submarine)
- Binary hash file output_file-01.hccapx.zip for legacy Hashcat up to 6.2.3
- Hash file output_file-01.hc22000.zip for Hashcat 6.0.0 and later
3. Installing Hashcat
Before we get started with the CUDA installation, let's show you how to install the legacy Hashcat from the repository and the modern Hashcat from Git. This will help us compare the performance of Hashcat when using OpenCL and CUDA, and see which one has a better hash rate.
3.1 Legacy Hashcat 6.1.1
To install the legacy version of Hashcat (version 6.1.1) from the Debian 11 repository, enter the command below. Hashcat versions up to 6.2.3 support the 2500 - WPA-EAPOL-PBKDF2 hash mode and use the binary hash format hccapx (as shown in Figure 1).
Note: Hccapx file contains 13 attributes such as EAPOL, EAPOL length, ESSID, BSSID (MAC of AP) [2].
$ sudo apt install hashcat
Figure 1 - The Hashcat version 6.1.1 Installed from Debian 11 Repository in /usr/bin
In the Downloads section, you will find two files: "output_file-01.7z," which contains the captured pcap file with the 4-way handshake, and "output_file-01.hccapx.zip," which is the corresponding hash format file. The "output_file-01.hccapx.zip" file has been created using cap2hccapx, and we have previously provided instructions on how to install and use cap2hccapx in the Cracking WPA/WPA2 Pre-shared Key Using GPU tutorial. Since those instructions have been covered, we will now proceed to share the hccapx file. This file is in the hash format and can be directly used with Hashcat 6.1.1.
3.2 Hashcat 6.2.6
If you are using Hashcat version 6.2.4 or later, please note that the plugins 2500/2501 and 16800/16801 are outdated and no longer functional. As a result, you cannot use the hash mode 2500 or the hccapx hash format.
In the Downloads section, you will find the file "output_file-01.hc22000.zip" specifically prepared for Hashcat version 6.0.0 and later. This file has been generated by converting the captured EAPOL messages (4-way handshake) to the hash mode 22000, which is known as WPA-PBKDF2-PMKID+EAPOL. You can use this file directly with Hashcat 6.2.4 and later to proceed with the passphrase cracking process.
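The PBKDF2 in the hash mode name refers to the key derivation that WPA2-PSK uses: the pairwise master key (PMK) is PBKDF2-HMAC-SHA1 over the passphrase, salted with the ESSID, with 4096 iterations and a 256-bit output. Every password candidate has to go through this derivation, which is why the hash rate is in kH/s rather than GH/s. The sketch below shows the derivation for the sample network using only the Python standard library; the function name derive_pmk is ours and is not part of Hashcat.

```python
import hashlib


def derive_pmk(passphrase: str, essid: str) -> bytes:
    """Derive the 256-bit WPA2 pairwise master key (PMK).

    WPA2-PSK defines the PMK as PBKDF2-HMAC-SHA1 with the ESSID as salt,
    4096 iterations and a 32-byte output.
    """
    return hashlib.pbkdf2_hmac(
        "sha1",
        passphrase.encode("utf-8"),
        essid.encode("utf-8"),
        4096,
        dklen=32,
    )


# ESSID and passphrase of the sample capture from the Downloads section.
pmk = derive_pmk("submarine", "HackMeIfYouCan")
print(pmk.hex())
```

A cracker repeats this derivation for each candidate and then checks the result against the MIC or PMKID stored in the hash line.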
To demonstrate the process of cracking a passphrase using the latest version of Hashcat (version 6.2.6), you can follow these steps to download it from the repository and compile it:
$ git clone https://github.com/hashcat/hashcat.git
$ cd hashcat/
$ make
$ sudo make install
Figure 2 - The Hashcat Version 6.2.6 from Github
To install the hcxtools package, which provides hcxpcapngtool, from the Debian 11 repository, you can use the following command:
$ sudo apt install hcxtools
To convert the captured EAPOL messages (4-way handshake) to the hash format recognized by Hashcat 6.0.0 and later versions, you can use the following command:
$ hcxpcapngtool -o output_file-01.hc22000 output_file-01.cap
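To check what the converter produced, it helps to know that an hc22000 file is plain text with one hash per line. As documented on the Hashcat wiki, the fields are separated by asterisks: WPA, type (01 for PMKID, 02 for EAPOL), PMKID or MIC, AP MAC, client MAC, hex-encoded ESSID, and for EAPOL lines the AP nonce, the EAPOL frame and a message pair byte. The following is a minimal sketch of a parser for the leading fields. The record class, the function name and the sample line are made up for illustration; the sample is synthetic, not taken from the downloadable file.

```python
from dataclasses import dataclass


@dataclass
class Hc22000Record:
    kind: str          # "01" = PMKID, "02" = EAPOL (4-way handshake)
    pmkid_or_mic: str  # hex string
    mac_ap: str        # BSSID, hex without separators
    mac_client: str    # station MAC, hex without separators
    essid: str         # decoded from the hex-encoded field


def parse_hc22000_line(line: str) -> Hc22000Record:
    """Split one hc22000 line into its leading fields."""
    fields = line.strip().split("*")
    if len(fields) < 6 or fields[0] != "WPA":
        raise ValueError("not an hc22000 line")
    essid = bytes.fromhex(fields[5]).decode("utf-8", errors="replace")
    return Hc22000Record(fields[1], fields[2], fields[3], fields[4], essid)


# Synthetic example line (all values are placeholders except the ESSID).
sample = "*".join([
    "WPA", "02", "00" * 16, "aabbccddeeff", "112233445566",
    "HackMeIfYouCan".encode("utf-8").hex(), "00" * 32, "00" * 4, "02",
])
record = parse_hc22000_line(sample)
print(record.kind, record.mac_ap, record.essid)
```

Running the parser over the real output_file-01.hc22000 should show the ESSID HackMeIfYouCan if the conversion succeeded.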
4. NVIDIA Driver Installation and OpenCL Benchmark
In this section, we will proceed with the installation of the NVIDIA proprietary driver from the Debian 11 repository.
First, add the contrib and non-free repositories to /etc/apt/sources.list.
Figure 3 - Contrib and Non-free Repositories Added to /etc/apt/sources.list
Next, update the local package database by fetching information about available packages from the repositories specified in /etc/apt/sources.list.
$ sudo apt update
Then install the NVIDIA proprietary driver, version 470.182.03, from the Debian 11 repository:
$ sudo apt install nvidia-driver
Next, let's benchmark hash mode 2500 on the NVIDIA card using OpenCL. To start the benchmark, use the command below.
$ /usr/bin/hashcat -m 2500 -b
Figure 4 - Hashcat 6.1.1 Benchmark for Hash Mode 2500 and NVIDIA GPU when OpenCL is Used
The hashrate for the NVIDIA GeForce RTX 3070 Ti GPU (Device 1) when using OpenCL with Hashcat 6.1.1 is 547,700 hashes per second, or 547.7 kH/s. The Intel CPU (Device 2) was not used in the benchmark, but we will show you how to use both the GPU and CPU if needed.
Note: The hashrate for Hashcat 6.2.6 with hash mode 22000 is about the same. The command for the benchmark is:
$ /usr/local/bin/hashcat -m 22000 -b
5. CUDA Toolkit Installation
In this section, we will install the CUDA toolkit from the Debian 11 repository. In order to run a CUDA application, the system needs to have a CUDA-enabled GPU and an NVIDIA display driver that is compatible with the CUDA toolkit. Since we will install both the NVIDIA driver and CUDA from the Debian 11 repository, we will meet this requirement.
Figure 5 - Minimum Required Driver Version for CUDA Minor Version Compatibility [2]
According to the CUDA Application Compatibility Support Matrix, CUDA up to version 11.8 can be used with NVIDIA 470.182.03. This means that you can install CUDA 11.8 on your system with the NVIDIA 470.182.03 driver and it will work correctly.
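The compatibility check itself is just a version comparison. The sketch below compares an installed driver version with a minimum version. The value 450.80.02 is the Linux minimum that NVIDIA's compatibility documentation lists for CUDA 11.x minor version compatibility; treat it as an assumption and verify it against the current table (Figure 5). The function names are ours.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted driver version such as '470.182.03' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


# Assumed minimum Linux driver for CUDA 11.x minor version compatibility.
# Check NVIDIA's compatibility table before relying on this value.
MIN_DRIVER_CUDA_11_X = "450.80.02"


def driver_is_compatible(installed: str, minimum: str = MIN_DRIVER_CUDA_11_X) -> bool:
    """Return True if the installed driver is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


# Driver version installed from the Debian 11 repository in this tutorial.
print(driver_is_compatible("470.182.03"))  # True
```

The installed driver version can be read with nvidia-smi and passed to driver_is_compatible.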
$ sudo apt install nvidia-cuda-toolkit
The following packages are installed along with nvidia-cuda-toolkit package.
We successfully installed CUDA, so Hashcat now prefers it over OpenCL. The hashcat benchmark for CUDA API (Device 1) is 536 kH/s (Figure 6). OpenCL (Device 2, an NVIDIA GPU) and Device 4 (an Intel CPU) were skipped.
$ /usr/bin/hashcat -m 2500 -b
Figure 6 - Hashcat 6.1.1 Benchmark for Hash Mode 2500 and NVIDIA GPU (Device 1) when CUDA is Used
The CUDA speed of 536 kH/s is slightly lower than the 547.7 kH/s achieved with OpenCL. While some benchmarks favor OpenCL for raw speed, CUDA is generally the preferred backend, because speed is not the only factor. OpenCL can be trickier to work with on NVIDIA cards due to inconsistencies.
For instance, the OpenCL runtime has memory limitations that can affect loading large hash lists or running certain attacks. NVIDIA limits a single allocation to 1/4 of total device memory, which forces workarounds. CUDA doesn't have this restriction, making it a smoother choice [3].
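To put the two benchmark figures in perspective, the short calculation below expresses the gap as a percentage of the OpenCL result. The numbers are the ones reported above for Hashcat 6.1.1 and hash mode 2500; the function name is ours.

```python
def relative_gap_percent(reference: float, other: float) -> float:
    """Return how much lower `other` is than `reference`, in percent."""
    return (reference - other) / reference * 100.0


opencl_khs = 547.7  # Hashcat 6.1.1, hash mode 2500, OpenCL backend
cuda_khs = 536.0    # Hashcat 6.1.1, hash mode 2500, CUDA backend

gap = relative_gap_percent(opencl_khs, cuda_khs)
print(f"CUDA is {gap:.1f}% slower than OpenCL in this benchmark")
```

A gap of roughly two percent is small enough that run-to-run variation, GPU temperature and clock behavior could plausibly account for a good part of it.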
Note: The Hashcat 6.2.6 hashrate is 541.6 kH/s with hash mode 22000. In that case, the command for the benchmark is:
$ /usr/local/bin/hashcat -m 22000 -b
6. GPU and CPU Combination for Cracking WPA2 Passphrase
During our testing, Hashcat used only the GPU for benchmarking the hash modes selected with -m 2500 or -m 22000. This makes sense: GPUs are highly suitable for brute-force attacks because of their massive parallelism, and password cracking tools generally run substantially faster on a GPU than on a CPU alone.
However, what if someone desires to exclusively utilize a CPU or combine the computational power of both the GPU and CPU? Hashcat provides the flexibility to designate the cracking device using the -D switch.
# | Device Type
===+=============
1 | CPU
2 | GPU
3 | FPGA, DSP, Co-Processor
For instance, if the intention is to solely use the CPU in the benchmark, the command would be as follows:
$ /usr/bin/hashcat -m 2500 -b -D 1
On the other hand, if a combination of both the GPU and CPU is preferred, the command would be:
$ /usr/bin/hashcat -m 2500 -b -D 1,2
Now, let's run the Hashcat benchmark for hash mode 22000 using both the NVIDIA GPU (CUDA) and the CPU, as shown in Figure 7.
$ /usr/local/bin/hashcat -m 22000 -b -D 1,2
Figure 7 - Hashcat 6.2.6 - Combined CPU and GPU Benchmark for Hash Mode 22000
Note:
While it holds true that GPUs can drastically expedite password cracking in comparison to CPUs, the speed of cracking is also influenced by factors such as the specific hash algorithm in use, the complexity of the passwords, and the particular model of GPU employed. Certain algorithms possess a greater resilience against brute-force attacks due to their design, and passwords with heightened complexity can pose increased challenges for cracking attempts due to their intricate nature.
7. Cracking WPA2 Passphrase with GPU and CPU
Now, we are ready to advance with the hash cracking process in the output_file-01.hc22000 file, utilizing both GPU and CPU resources. Execute the following command to initiate the cracking procedure:
$ /usr/local/bin/hashcat -m 22000 -a 3 output_file-01.hc22000 ?l?l?l?l?l?l?l?l -D 1,2 -w 3
This command uses hash mode 22000 (-m 22000) and the mask (brute-force) attack mode (-a 3) to crack the hash in the output_file-01.hc22000 file. The mask ?l?l?l?l?l?l?l?l tells Hashcat to generate eight-character candidates in which every position is a lowercase letter (a-z). The -D 1,2 flag selects device types 1 (CPU) and 2 (GPU), so both are used. The -w 3 flag selects the High workload profile, which raises throughput at the cost of desktop responsiveness and power consumption.
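The size of the search space follows directly from the mask: multiply the size of the character set at each position. The sketch below does this for Hashcat's built-in charsets ?l, ?u, ?d, ?s and ?a. It is a simplified illustration, not Hashcat's own mask parser: custom charsets (?1 to ?4), ?h, ?H and ?b are left out, and the function name is ours.

```python
# Sizes of Hashcat's built-in charsets: lowercase, uppercase, digits,
# special characters, and all printable ASCII (?l?u?d?s combined).
BUILTIN_CHARSET_SIZES = {"l": 26, "u": 26, "d": 10, "s": 33, "a": 95}


def mask_keyspace(mask: str) -> int:
    """Return the number of candidates a mask generates."""
    total = 1
    i = 0
    while i < len(mask):
        if mask[i] == "?":
            if i + 1 >= len(mask):
                raise ValueError("dangling '?' at end of mask")
            token = mask[i + 1]
            if token == "?":
                size = 1  # "??" stands for a literal question mark
            elif token in BUILTIN_CHARSET_SIZES:
                size = BUILTIN_CHARSET_SIZES[token]
            else:
                raise ValueError(f"unsupported charset ?{token}")
            total *= size
            i += 2
        else:
            i += 1  # a literal character contributes exactly one choice
    return total


keyspace = mask_keyspace("?l?l?l?l?l?l?l?l")
print(f"{keyspace:,}")  # 208,827,064,576
```

Replacing a single ?l with ?a in the mask multiplies the candidate count by 95/26, which shows how quickly mixed character sets increase the work.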
Figure 8 - Hashcat 6.2.6 - Cracking Hash Mode 22000 with Combined CPU and GPU
The combined effective hash rate is 550.9 kH/s: 530.9 kH/s from the NVIDIA GeForce RTX 3070 Ti using CUDA and 19,889 H/s (about 19.9 kH/s) from the CPU. The GPU is operating at 99% utilization with a temperature of 76°C, while the CPU is running at 81% utilization with a temperature of 100°C. Assuming the passphrase consists of 8 lowercase letters, exhausting the whole keyspace would take an estimated 4 days and 9 hours. At the time of the screenshot, Hashcat had tested 587,008 password candidates out of a total of 208,827,064,576 (26^8).
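The estimate can be reproduced by dividing the number of candidates by the combined hash rate. The sketch below uses the figures reported above; the function name is ours, and the result is the worst case in which the passphrase is the last candidate tested.

```python
from typing import Tuple


def estimate_runtime(keyspace: int, hashes_per_second: float) -> Tuple[int, int]:
    """Return the time to exhaust the keyspace as (days, hours), rounded down."""
    total_seconds = int(keyspace / hashes_per_second)
    days, remainder = divmod(total_seconds, 86_400)
    hours = remainder // 3_600
    return days, hours


total_candidates = 208_827_064_576  # 26^8, eight lowercase letters
combined_rate = 550_900             # 550.9 kH/s, GPU and CPU together

days, hours = estimate_runtime(total_candidates, combined_rate)
print(f"{days} days {hours} hours")  # 4 days 9 hours
```

On average the passphrase is found after about half of the keyspace has been searched, so the expected time is roughly half of this worst case.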
Conclusion
To wrap things up, this tutorial has guided us through optimizing passphrase cracking for NVIDIA GPUs using CUDA. We've seen that GPUs, especially with CUDA, are great for cracking passwords due to their parallel processing power. We've covered Hashcat installation, CUDA benefits, and even looked at Hashcat's flexibility.
Remember, while GPUs speed up cracking, other factors matter too, like the hash algorithm and password complexity. Armed with these insights, you're all set to crack hashes using both GPU and CPU. The combined hash rate of 550.9 kH/s (530.9 kH/s from the NVIDIA GeForce RTX 3070 Ti and about 19.9 kH/s from the CPU) shows that the GPU does nearly all of the work. This tutorial equips you with the skills to tackle hash cracking effectively and efficiently.