Benchmarking and stress testing are sometimes necessary to optimize system performance and remove bottlenecks caused by hardware or software. On a Linux system, this can be done easily with a few basic command line tools.
There are many all-in-one dedicated benchmarking tools with a pretty GUI available for Linux. Here we focus on simple command line tools only, and we are going to test the items listed below.
- CPU stress testing and benchmarking.
- Hard drive I/O performance testing.
- Network performance and speed testing.
- OpenSSL performance testing.
- RAM speed testing.
Have a look at this guide for detailed GPU benchmarking and stress testing.
1. CPU Benchmark and stress testing
Prime number calculation test within a given range with sysbench,
sysbench --test=cpu --num-threads=4 --cpu-max-prime=9999 run
This command calculates prime numbers up to 9999 using 4 threads; change the number 9999 if you wish.
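On sysbench 1.0 and later, the --test and --num-threads options are deprecated: the test name is passed as a plain argument and the thread option is called --threads. The following is a minimal sketch of the same test in that newer syntax, assuming sysbench 1.0+ and assuming that one thread per CPU reported by nproc is what you want. The variable names are only illustrative.

```shell
# Assumption: sysbench >= 1.0, where the test name is a positional argument.
threads=$(nproc)     # number of CPUs available to this process
max_prime=9999       # same upper limit as in the command above

if command -v sysbench >/dev/null 2>&1; then
    sysbench cpu --threads="$threads" --cpu-max-prime="$max_prime" run
else
    echo "sysbench is not installed"
fi
```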
Integer calculation performance test with one-line command
time $(i=0; while (( i < 9999999 )); do (( i ++ )); done)
This returns the time required to count through the integers from 0 to 9999999.
real 0m47.354s
user 0m47.328s
sys 0m0.000s
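A single run can be distorted by whatever else the system is doing, so it may help to repeat the loop a few times and compare the numbers. The sketch below wraps the same counting loop in a function and times each run in milliseconds. It assumes bash and GNU date (for the %N nanoseconds format); the function names count_loop and time_loop are made up for this example.

```shell
#!/usr/bin/env bash
# count_loop N: count from 0 to N in pure bash (the same work as the one-liner).
count_loop() {
    local i=0 limit=$1
    while (( i < limit )); do (( i += 1 )); done
}

# time_loop RUNS N: run count_loop RUNS times and print the elapsed time of each run.
time_loop() {
    local runs=$1 limit=$2 start end r
    for (( r = 1; r <= runs; r++ )); do
        start=$(date +%s%N)      # nanoseconds since the epoch (GNU date)
        count_loop "$limit"
        end=$(date +%s%N)
        echo "run $r: $(( (end - start) / 1000000 )) ms"
    done
}

# Quick demonstration with a small limit; use 9999999 for the full test.
time_loop 3 100000
```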
Single threaded CPU stress testing
md5sum /dev/urandom
The /dev/urandom file is a special device file that returns random bytes endlessly, so calculating the MD5 hash of /dev/urandom keeps one CPU core busy until you stop the command with Ctrl+C.
Multi threaded CPU stress testing
This command imposes a multi-threaded load on the CPU for 5 minutes:
stress --cpu 4 --timeout 300s
Change the --cpu 4 value according to your system; if you have a quad-core CPU with 8 threads, use --cpu 8.
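If the stress tool is not installed, a comparable load can be approximated with the md5sum trick from the previous step, started once per CPU and stopped by timeout. This is only a sketch under those assumptions, not a replacement for stress; the function name stress_all_cores is hypothetical, and nproc and timeout come from GNU coreutils.

```shell
#!/usr/bin/env bash
# stress_all_cores SECONDS [WORKERS]: start WORKERS md5sum processes
# (default: one per CPU reported by nproc) and stop them after SECONDS.
stress_all_cores() {
    local seconds=${1:-300} workers=${2:-$(nproc)} n=0
    while [ "$n" -lt "$workers" ]; do
        # timeout kills each worker once the time limit is reached
        timeout "$seconds" md5sum /dev/urandom >/dev/null &
        n=$((n + 1))
    done
    wait    # block until every worker has been stopped
}
```

Calling stress_all_cores 300 loads every CPU for five minutes, and stress_all_cores 60 2 runs two workers for one minute.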
2. Hard drive I/O performance testing
Hard drive I/O performance has a great impact on Linux system performance and user experience. Simply put, a faster drive means a better user experience, and in many everyday workloads drive I/O speed matters more than CPU or RAM speed.
HDD raw read speed test
A very simple raw hard drive read speed test can be done with cat and pipebench. This command requires root privileges, so run it as root or use sudo.
cat /dev/sda3 | pipebench -q > /dev/null
HDD file write speed test with dd
dd bs=16k count=102400 oflag=direct if=/dev/zero of=test_data
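This creates a file of about 1.7 GB (1600 MiB) containing only zeros in the current directory and gives you an overview of your HDD write speed. The oflag=direct option bypasses the page cache, so the figure reflects the drive rather than RAM.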
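The size of the test file follows directly from the bs and count values, and the same arithmetic lets you cross-check the speed that dd prints. A small worked example is shown below; the 12.5 second duration is a made-up figure used only to show the calculation.

```shell
bs=$((16 * 1024))        # bs=16k means 16 KiB = 16384 bytes per block
count=102400             # number of blocks written
bytes=$((bs * count))

echo "$bytes bytes"                  # 1677721600 bytes
echo "$((bytes / 1000000)) MB"       # 1677 MB (decimal megabytes, as dd reports)
echo "$((bytes / 1048576)) MiB"      # 1600 MiB

# Throughput for a hypothetical run that took 12.5 seconds:
awk -v b="$bytes" -v s=12.5 'BEGIN { printf "%.1f MB/s\n", b / s / 1e6 }'   # 134.2 MB/s
```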
HDD file read speed test with dd
Now we will use the test_data file created in the above step to determine HDD read speed.
dd bs=16K count=102400 iflag=direct if=test_data of=/dev/null
This command reads the test_data file and throws the data into /dev/null, a special device file that works like a black hole. You can delete test_data afterwards.
Direct drive read speed test with hdparm
sudo hdparm -t /dev/sda
This test reports the timed buffered disk read speed, that is, sequential reads from the drive without any prior caching, and the result can be taken as roughly the fastest sustained read speed of the disk.
Note: on Linux, native filesystems (ext4, btrfs, xfs etc.) usually perform better than third-party filesystems such as Windows NTFS or Mac HFS+.
3. Network performance and speed testing
Network interface throughput measurement and benchmarking is very common for networking devices like routers, bridges, hubs etc. Everyone wants the best speed out of their network, so it's necessary to benchmark and optimize the network.
Simple network throughput measurement with iperf
Before going further, we have to make some assumptions about the network setup.
While writing this, my network was in the 192.168.72.0/24 range. The server IP address was 192.168.72.1 and the client PC was 192.168.1.4. Modify these settings according to your network setup.
- First, start an iperf server on the server machine:
iperf -s -p 8000
This starts iperf in server mode, listening on port 8000 on all available network interfaces.
- Now connect to that server with iperf as a client; just find the correct server IP address and run the command below:
iperf -c 192.168.72.1 -p 8000
This gives you an overview of your network's TCP throughput.
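If the client appears to hang or fails to connect, it can help to check that the server port is reachable before starting the measurement. The sketch below uses bash's built-in /dev/tcp redirection for that check. It assumes bash and the coreutils timeout command; the function names port_open and iperf_client_checked are made up for this example.

```shell
#!/usr/bin/env bash
# port_open HOST PORT: succeed if a TCP connection can be opened within 2 seconds.
port_open() {
    # /dev/tcp/HOST/PORT is handled by bash itself, not by the filesystem
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# iperf_client_checked HOST PORT: run the iperf client only if the port answers.
iperf_client_checked() {
    if port_open "$1" "$2"; then
        iperf -c "$1" -p "$2"
    else
        echo "no server reachable at $1:$2" >&2
        return 1
    fi
}
```

With the addresses used above, the call would be iperf_client_checked 192.168.72.1 8000.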
4. Network speed test with ncmeter
Ncmeter is a simple bash shell script which uses netcat-traditional to measure the data throughput of a network. This script is part of the netcat-traditional package in Debian based distros, located in the /usr/share/doc/netcat-traditional/examples/contrib folder. Copy that script to your home directory, make it executable with chmod +x and then run the script.
cp /usr/share/doc/netcat-traditional/examples/contrib/ncmeter ~/
chmod +x ~/ncmeter
cd ~ && ./ncmeter --help # show available options
Start the ncmeter in server mode
./ncmeter
which listens on port 23457 on all available network interfaces.
On the client machine (or in a new terminal or tab, if you are testing on the same machine), run the following command.
./ncmeter 192.168.72.1 256M
Replace the IP 192.168.72.1 with your ncmeter server's IP.
5. OpenSSL benchmark testing
This test basically measures how fast the CPU can run cryptographic algorithms such as the MD5 and SHA-1 hashes, the AES cipher and RSA signatures. The OpenSSL benchmark is also important for networking devices like routers, bridges, hubs etc.
OpenSSL benchmarking can be done easily with the command below.
openssl speed des des-ede3 dsa2048 hmac idea-cbc md5
This performance varies greatly if a hardware cryptographic accelerator is present in the CPU; for more detail, see the openssl man page and the OpenSSL website.
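One common form of such acceleration is the AES instruction set, which Linux reports as the aes flag in /proc/cpuinfo. The sketch below checks for that flag; the function name has_aes_flag is made up for this example. The two commented openssl commands are a suggestion for comparing the default and the EVP code paths. How much they differ depends on the OpenSSL version, so treat that comparison as an assumption to verify on your own system.

```shell
# has_aes_flag: succeed if the CPU feature list in /proc/cpuinfo contains "aes".
has_aes_flag() {
    grep -qw aes /proc/cpuinfo
}

if has_aes_flag; then
    echo "CPU advertises AES instructions"
else
    echo "no AES flag found in /proc/cpuinfo"
fi

# Suggested comparison (each run takes several seconds):
# openssl speed aes-128-cbc
# openssl speed -evp aes-128-cbc
```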
6. RAM speed testing
STREAM is the de facto industry standard tool for testing RAM speed. However, it's not in Ubuntu's repository, so you have to compile it yourself to do the test.
I'm not going into the details of how to compile software on Linux with command line tools. Here are just the commands for any Debian based system, with short comments on each.
- Install some essential tools for binary compiling.
sudo apt-get install git g++ build-essential pkg-config
- Clone the STREAM repository.
git clone https://github.com/jeffhammond/STREAM.git
- Go to the directory and compile.
cd STREAM/ && make all
- Test the C version of the application.
./stream_c.exe
Well, that's it. In my test, the average speed was above 10 GB/s with all 4 CPU threads in use.
I don't have a system with modern 1600 MHz DDR3L or 2400 MHz DDR4 RAM; if you have one, please run this test and share your experience.
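If make fails because the Makefile asks for a compiler version that is not installed, or if you want a different array size, the C benchmark can also be compiled directly. The sketch below assumes the stream.c from the repository above, which reads its array size from the STREAM_ARRAY_SIZE macro, and it assumes the system gcc supports OpenMP. The helper stream_total_mib and the size of 50000000 elements are only illustrative.

```shell
# stream_total_mib ELEMENTS: memory needed by STREAM's three arrays of
# 8-byte elements, in whole MiB (rounded down).
stream_total_mib() {
    echo $(( $1 * 8 * 3 / 1048576 ))
}

size=50000000    # hypothetical array size; choose one much larger than your CPU cache
echo "array size $size needs about $(stream_total_mib "$size") MiB"

# Build only the C benchmark with the system gcc, if we are inside the STREAM checkout.
if [ -f stream.c ]; then
    gcc -O2 -fopenmp -DSTREAM_ARRAY_SIZE="$size" stream.c -o stream_c.exe
fi
```

Run this from inside the STREAM directory and then start ./stream_c.exe as before.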
RAM speed testing with dd
The dd Unix tool isn't really meant for testing RAM speed, and such a test is hardly necessary, but it can be considered an experiment.
tmpfs is a RAM-based, very fast filesystem, something like a ramdisk, so a read/write speed test on a tmpfs-mounted folder gives a rough idea of RAM speed. Let's have a look at the commands below.
mkdir RAM_test
sudo mount tmpfs -t tmpfs RAM_test/ # mount the tmpfs filesystem
cd RAM_test
dd if=/dev/zero of=data_tmp bs=1M count=512 # write to RAM test
dd if=data_tmp of=/dev/null bs=1M count=512
Look at the result! It's incredibly fast! I achieved around 2.8 GB/s write speed and 3.8 GB/s read speed with a 4 GB 1333 MHz DDR3 RAM module. You can find out more about Linux memory usage here.
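The same experiment can be tried without root on systems where /dev/shm is already a tmpfs mount, which is common but is an assumption you should verify, for example with df -T /dev/shm. The sketch below wraps both dd runs in a function that removes the test file afterwards; the function name ram_dd_test is made up for this example.

```shell
#!/usr/bin/env bash
# ram_dd_test [DIR] [MIB]: write MIB mebibytes of zeros to DIR/data_tmp,
# read them back, print dd's summary lines and remove the file again.
ram_dd_test() {
    local dir=${1:-/dev/shm} mib=${2:-512}
    local file="$dir/data_tmp"
    echo "write:"
    dd if=/dev/zero of="$file" bs=1M count="$mib" 2>&1 | tail -n 1
    echo "read:"
    dd if="$file" of=/dev/null bs=1M 2>&1 | tail -n 1
    rm -f "$file"    # free the RAM used by the test file
}
```

Calling ram_dd_test /dev/shm 512 repeats the test above. If you used the mounted RAM_test folder instead, leave it and release the memory with sudo umount RAM_test when you are done.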
Conclusion
So, that's it: simple benchmark tests in Linux that give you a rough overview of your system performance. These methods should work on a wide range of UNIX or GNU/Linux distributions.
Benchmark results can vary from run to run, though, as system performance depends on many things, such as currently running applications, background tasks and even ambient temperature.
If you have any further questions, feel free to ask, and don't forget to share this tutorial with your friends.
Invict says
When I enter the command 'cd STREAM/ && make all', I get this error output:
gcc-4.9 -O2 -fopenmp -c -o mysecond.o mysecond.c
make: gcc-4.9: Command not found
: recipe for target 'mysecond.o' failed
make: *** [mysecond.o] Error 127
Any idea how to resolve this?
Arnab Satapathi says
Gcc-4.9 is not available any more in the standard repo.
You need to compile with latest GCC version, gcc-7.
Ned says
I read the comments in stream.c. And figured out how to change the array size. Here is the output with Array size = 100000000. On a Ryzen 9 3900x, running Debian 9.12 on a MSI X470 Gaming + with ddr4-3600.
~/STREAM$ ./stream_c.exe
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 100000000 (elements), Offset = 0 (elements)
Memory per array = 762.9 MiB (= 0.7 GiB).
Total memory required = 2288.8 MiB (= 2.2 GiB).
Each kernel will be executed 10 times.
The *best* time for each kernel (excluding the first iteration)
will be used to compute the reported bandwidth.
-------------------------------------------------------------
Number of Threads requested = 24
Number of Threads counted = 24
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 66498 microseconds.
(= 66498 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Best Rate MB/s Avg time Min time Max time
Copy: 16481.3 0.097214 0.097080 0.097514
Scale: 16465.6 0.097327 0.097172 0.097490
Add: 18419.3 0.130490 0.130298 0.130987
Triad: 18581.5 0.129334 0.129161 0.129577
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
Ned says
Hi, Thanks for the test tools.
Tested ram on a Ryzen 9 3900x ddr4-3200 on a MSI X470 Gaming +
The ddr4 is set to run at 3200 in the bios, Debian 9.12 amd64.
~/STREAM$ ./stream_c.exe
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 10000000 (elements), Offset = 0 (elements)
Memory per array = 76.3 MiB (= 0.1 GiB).
Total memory required = 228.9 MiB (= 0.2 GiB).
Each kernel will be executed 10 times.
The *best* time for each kernel (excluding the first iteration)
will be used to compute the reported bandwidth.
-------------------------------------------------------------
Number of Threads requested = 24
Number of Threads counted = 24
-------------------------------------------------------------
Your clock granularity/precision appears to be 2 microseconds.
Each test below will take on the order of 6789 microseconds.
(= 3394 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Best Rate MB/s Avg time Min time Max time
Copy: 17670.0 0.009254 0.009055 0.009376
Scale: 16344.9 0.010832 0.009789 0.014405
Add: 17564.4 0.013784 0.013664 0.014412
Triad: 18512.8 0.013166 0.012964 0.013749
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
BTW, how do I change the array size? I read the stream website but couldn't find any command line options.
josegut85 says
Thanks for share.
time is a good tool.
Menard says
I tried the ncmeter method but I don't understand how it works: what IP address should this data be sent to? Where?
Dony says
Why not iperf3? It's more accurate.
In addition, there are a lot of test servers available for iperf3: https://iperf.cc
Kay B. says
Hi! dd is a nice tool, but for RAM benchmarks it's not the right thing. It uses a single core and goes 100% on that (obvious bottleneck). You might consider, that your "1333 MHz DDR3 RAM module" should give around 10GB/s. You can use "STREAM", it's specially made for RAM testing and also a command line tool. Another thing is memtest86+. But that's a boot image and I guess it just reads out the type of RAM and calculates the theoretically reachable speed (which can be handy, too).
Regards!
Arnab Satapathi says
Thanks for mentioning the tool STREAM I'll definitely test it.
Nasher_87(ARG) says
Hello, thank you very much, what do you think of these values? It is an AMD APU A6-7400K, 4GB Kingston ddr3 1866mhz, disk WD 160gb sata2 and disk wd sata3 500gb.
CPU speed: events per second: 2191.62
General statistics:
total time: 10.0009s
total number of events: 21924
Latency (ms):
min: 0.72
avg: 0.91
max: 13.64
95th percentile: 1.67
sum: 19969.94
Threads fairness:
events (avg/stddev): 10962.0000/448.00
execution time (avg/stddev): 9.9850/0.00
dd if=/dev/zero of=data_tmp bs=1M count=512 # write to RAM test
512+0 records in
512+0 records out
536870912 bytes (537 MB, 512 MiB) copied, 0,774664 s, 693 MB/s
dd if=data_tmp of=/dev/null bs=1M count=512
512+0 records in
512+0 records out
536870912 bytes (537 MB, 512 MiB) copied, 0,31687 s, 1,7 GB/s
sudo cat /dev/sda1 | pipebench -q > /dev/null
Summary:
Piped 22.83 GB in 00h02m56.58s: 132.43 MB/second
sudo cat /dev/sda2 | pipebench -q > /dev/null
Summary:
Piped 1024.00 B in 00h00m00.20s: 4.77 kB/second
sudo cat /dev/sda3 | pipebench -q > /dev/null
Summary:
Piped 9.64 GB in 00h01m28.28s: 111.90 MB/second
sudo cat /dev/sda4 | pipebench -q > /dev/null
Summary:
Piped 2.01 GB in 00h00m32.15s: 64.15 MB/second
sudo cat /dev/sda5 | pipebench -q > /dev/null
Summary:
Piped 8.17 GB in 00h01m14.76s: 111.93 MB/second
sudo cat /dev/sdb1 | pipebench -q > /dev/null
Summary:
Piped 1.86 GB in 00h00m29.06s: 65.58 MB/second
sudo cat /dev/sdb3 | pipebench -q > /dev/null
Summary:
Piped 1024.00 B in 00h00m00.30s: 3.29 kB/second
sudo cat /dev/sdb5 | pipebench -q > /dev/null
Summary:
Piped 1.94 GB in 00h00m45.50s: 43.79 MB/second
sudo cat /dev/sdb2 | pipebench -q > /dev/null
Summary:
Piped 1.92 GB in 00h00m30.26s: 65.20 MB/second
Arnab Satapathi says
Well, a lot of benchmark results.
Hard drive read/write results are within the acceptable range.
Don't know why,but seems like the RAM is running a bit slow on AMD based systems. Of course, take it with a pinch of salt.
tron says
I tried this on my Core i5 4690k OC to 4x 4,6Ghz with DDR3 Corsair 1866Mhz CL 9 2T RAM:
tron@debian:~/RAM_test$ dd if=/dev/zero of=data_tmp bs=1M count=512
512+0 records in
512+0 records out
536870912 bytes (537 MB, 512 MiB) copied, 0,142725 s, 3,8 GB/s
tron@debian:~/RAM_test$ dd if=data_tmp of=/dev/null bs=1M count=512
512+0 records in
512+0 records out
536870912 bytes (537 MB, 512 MiB) copied, 0,0605748 s, 8,9 GB/s
Ash Ward says
Hi
Very useful thanks.
I got these results on my old Lenovo T410 with 8GB 1300M ram
[root@localhost RAM_test]# dd if=data_tmp of=/dev/null bs=1M count=512
512+0 records in
512+0 records out
536870912 bytes (537 MB, 512 MiB) copied, 0.152887 s, 3.5 GB/s
Karl says
Throughput is also going to depend on your CPU. This is with an older thinkpad W530 i7 @3.7ghz with DDR3 12800/1600mhz memory.
W530:~/RAM_test$ dd if=/dev/zero of=data_tmp bs=1M count=512 # write to RAM test
512+0 records in
512+0 records out
536870912 bytes (537 MB, 512 MiB) copied, 0.121758 s, 4.4 GB/s
W530:~/RAM_test$
W530:~/RAM_test$ dd if=data_tmp of=/dev/null bs=1M count=512
512+0 records in
512+0 records out
536870912 bytes (537 MB, 512 MiB) copied, 0.0795917 s, 6.7 GB/s
Carlos Ijalba says
Great Article Arnab, extremely useful!
I'm glad to let you know that I have a brand new system with 2400MHz DDR4 RAM and the results of the RAM tests are the following (I'm on a virtual HW, not on RAW machine, mind):
# dd if=/dev/zero of=data_tmp bs=1M count=512
512+0 records in
512+0 records out
536870912 bytes (537 MB) copied, 0.19393 s, 2.8 GB/s
# dd if=data_tmp of=/dev/null bs=1M count=512
512+0 records in
512+0 records out
536870912 bytes (537 MB) copied, 0.105065 s, 5.1 GB/s
Arnab Satapathi says
And great feedback too, thanks Carlos for sharing your experience.
Seems like Read speed improved with DDR4 RAM modules.
Anand Rai says
Thanks for the valuable information. Just wanted to know that if any other option to test the Network adapter, where we should not have to install any tool.