CPU Performance Analysis in Linux


The CPU is critical in servers used mainly for applications and databases, and it is often a source of performance bottlenecks. However, high CPU utilization does not always mean that the CPU is doing useful work; it may be waiting on another subsystem. When you do performance analysis, always look at the system as a whole and inspect every subsystem, because a problem in one can cascade into the others.

Useful commands for performance analysis

uptime

Uptime gives a one-line display of the following information:

- Current time
- How long the system has been running
- How many users are currently logged on
- System load averages for the past 1, 5, and 15 minutes

The system load average is the average number of processes that are in either a runnable or an uninterruptible state. A process in a runnable state is either using the CPU or waiting to use it. A process in an uninterruptible state is waiting for I/O, for example waiting for a disk. The averages are taken over the three time intervals. Load averages are not normalized for the number of CPUs in a system, so a load average of 1 means a single-CPU system was loaded the whole time, while on a four-CPU system it means the system was idle 75% of the time.
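
Because load averages are not normalized, it can help to divide them by the CPU count before judging them. The following is a minimal sketch, assuming a Linux system that exposes /proc/loadavg and has the coreutils nproc command installed. The function name load_per_cpu is made up for this example and is not a standard command.

```shell
#!/bin/sh
# load_per_cpu LOAD NCPUS
# Divide a load average by the number of CPUs and print the result
# with two decimals. (Hypothetical helper, not a standard command.)
load_per_cpu() {
    awk -v load="$1" -v ncpus="$2" 'BEGIN { printf "%.2f\n", load / ncpus }'
}

# Live usage: the first field of /proc/loadavg is the 1-minute average.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r one five fifteen rest < /proc/loadavg
    echo "1-minute load per CPU: $(load_per_cpu "$one" "$(nproc)")"
fi
```

For the case described above, load_per_cpu 1.00 4 prints 0.25, which matches a four-CPU system that was idle 75% of the time.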

(Source: "man uptime".)

top

The top command displays CPU utilization and the processes that may be causing a problem. It shows actual process activity. By default, it sorts the process list by CPU usage in descending order, so the most CPU-intensive tasks on the server appear first, and it refreshes the list every three seconds. You can also sort the processes by any of the other columns in the table.

An interesting section of the top command output is the header portion, which shows the CPU statistics. Pay attention to the line that shows CPU state percentages based on the interval since the last refresh. That line contains the following labeled information:

Note: Where two labels are shown below, those for more recent kernel versions are shown first.

- us, user    : time running un-niced user processes
- sy, system  : time running kernel processes
- ni, nice    : time running niced user processes
- wa, IO-wait : time waiting for I/O completion
- hi          : time spent servicing hardware interrupts
- si          : time spent servicing software interrupts
- st          : time stolen from this VM by the hypervisor

(Source: "man top".)

If the CPU appears busy 100% of the time but more than, say, 70% of that time is spent in the "wa" state, the likely cause of the issue is an I/O problem rather than the CPU itself. High "hi" and "si" values are also indicators of intensive I/O activity.
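
The same state percentages can be derived outside of top from the aggregate "cpu" line in /proc/stat, whose counters after the label are user, nice, system, idle, iowait, irq, softirq, and steal. The sketch below is one possible way to do this, assuming a Linux /proc filesystem and awk. The function name cpu_states and its output layout are invented for this example and are not part of top.

```shell
#!/bin/sh
# cpu_states "PREV_LINE" "CURR_LINE"
# Takes two samples of the aggregate "cpu" line from /proc/stat and prints
# the share of time spent in each state between them, as percentages.
# Field order after the "cpu" label:
#   user nice system idle iowait irq softirq steal
cpu_states() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= 9; i++) prev[i] = $i; next }
        {
            total = 0
            for (i = 2; i <= 9; i++) { d[i] = $i - prev[i]; total += d[i] }
            if (total == 0) total = 1   # avoid division by zero
            printf "us %.1f ni %.1f sy %.1f id %.1f wa %.1f hi %.1f si %.1f st %.1f\n",
                100*d[2]/total, 100*d[3]/total, 100*d[4]/total, 100*d[5]/total,
                100*d[6]/total, 100*d[7]/total, 100*d[8]/total, 100*d[9]/total
        }'
}

# Live usage: sample twice, one second apart.
if [ -r /proc/stat ]; then
    prev=$(grep '^cpu ' /proc/stat)
    sleep 1
    curr=$(grep '^cpu ' /proc/stat)
    cpu_states "$prev" "$curr"
fi
```

A "wa" figure that dominates the output, as in the 70% example above, points toward storage or another I/O path rather than toward the processor.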

By default, top displays a single summary line of CPU statistics averaged across all CPUs on symmetric multiprocessing (SMP) systems. To see the statistics per CPU (core), press "1". Doing this turns a line like this:

%Cpu(s):  8.8 us,  4.7 sy,  0.0 ni, 86.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st

into this:

%Cpu0  :  4.3 us,  6.6 sy,  0.0 ni, 89.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  : 20.6 us,  3.5 sy,  0.0 ni, 75.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :  7.9 us,  2.2 sy,  0.0 ni, 89.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :  4.2 us,  3.5 sy,  0.0 ni, 92.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st

ps

The ps command lists running processes. The example below prints the ten processes currently consuming the most CPU, preceded by a header line:

ps -eo pcpu,pid,user,args --sort=-%cpu | head -n 11

Cache flushes in SMP systems

In SMP environments, there is a concept called CPU affinity, which lets you bind processes to certain CPUs. CPU affinity makes better use of the CPU cache because it keeps a process on the same CPU instead of letting it move between processors. If a process hops between CPUs, the cache of the new CPU must be flushed. When several processes do this, many flushes occur, which in turn makes each individual process take longer to finish. This unbound-process scenario is difficult to detect, because the CPU load appears balanced and does not necessarily peak. Use the taskset command to bind processes to CPUs.
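
As an illustration of binding with taskset, the sketch below wraps its -c (CPU list) and -p (existing PID) options in three small functions. It assumes the util-linux taskset is installed. The function names, the PID 1234, and the CPU numbers in the comments are placeholders chosen for this example.

```shell
#!/bin/sh
# show_affinity [PID]
# Print the CPU list a process is allowed to run on (default: this shell).
show_affinity() {
    taskset -cp "${1:-$$}"
}

# pin_process PID CPULIST
# Bind an already running process to a CPU list such as "0" or "0,2-3".
pin_process() {
    taskset -cp "$2" "$1"
}

# run_pinned CPULIST COMMAND [ARGS...]
# Start a new command that is bound to the given CPU list from the start.
run_pinned() {
    cpus=$1
    shift
    taskset -c "$cpus" "$@"
}

# Example calls (placeholders, not run here):
#   show_affinity 1234
#   pin_process 1234 2
#   run_pinned 0,1 ./my_server
```

Changing the affinity of a process you do not own normally requires elevated privileges, and the CPU numbers you pass must exist on the machine.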

Tuning

Always check whether the CPU is really causing the performance issue and not one of the other subsystems. If you have isolated the processor as the origin of the bottleneck, take the following actions to improve performance:

- Use the command ps -ef to ensure that no unnecessary applications are running in the background.
- If you find that there are programs running in the background, terminate them and use cron (through the crontab command) to schedule them to run during off-peak hours.
- Identify critical and CPU-intensive processes by using the top command and change their priority using the renice command.
- Depending on whether the application is designed to take advantage of multiple processors, it might be better to scale up (bigger CPUs) than to scale out (more CPUs). For example, a single-threaded application would scale better with a faster CPU than with more CPUs.
- Make sure you are using the latest drivers and firmware for your hardware and software. Outdated drivers or firmware can increase the load they place on the CPU.
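
The renice and cron steps above could look roughly like the following. This is a sketch under assumptions: the function name deprioritize, the default value of 10, and the script path in the crontab comment are all hypothetical. Depending on the renice implementation, the value given to -n may be treated as an absolute priority or as an increment, so check the man page on your system.

```shell
#!/bin/sh
# deprioritize PID [NICENESS]
# Raise the nice value of a running process so that it yields CPU time to
# more important work. The default of 10 is an arbitrary illustrative value.
deprioritize() {
    renice -n "${2:-10}" -p "$1"
}

# Example call (1234 is a placeholder PID taken from top or ps):
#   deprioritize 1234 15

# Example crontab entry (add it with "crontab -e") that runs a hypothetical
# batch job at 02:00 every day, already at reduced priority:
#   0 2 * * * nice -n 10 /usr/local/bin/nightly-report.sh
```

Unprivileged users can usually only raise the nice value of their own processes; lowering it, which gives a process higher priority, normally requires root.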

More on this at http://www.redbooks.ibm.com/redpapers/pdfs/redp4285.pdf
