I’ve been working on a large system monitoring project for the past month and have set up a centralized monitoring solution that tracks over 800 servers using Nagios.
As part of that, we established multiple trending reports and taught the network operations center (NOC) support staff how to run various Linux tools for server monitoring. (Most of the NOC staff at this company are Microsoft-centric, with limited exposure to Linux.)
These commands should be well known to anyone doing Linux system administration. If you manage Linux servers and aren’t familiar with any of the commands on this list, you should spend some time playing with their various options. Knowing what these tools can do and how to use them can be very useful when troubleshooting a system issue.
The commands we covered are:
top – provides a dynamic real-time view of running processes. By default, it displays the most CPU-intensive tasks running on the server and refreshes the list every few seconds (three seconds in current procps-ng versions).
vmstat – reports information about processes, memory, paging, block I/O, traps and CPU activity
w – displays who is online (logged in) and reports what they are doing
uptime – reports how long the system has been running
ps – displays running processes
free – displays total, free and used physical and swap memory
iostat – reports statistics on I/O
sar – collects and reports on system activity
mpstat – per-processor statistics
pmap – process memory usage
netstat – network statistics
ss – socket statistics (a faster alternative to netstat)
iptraf – IP LAN monitor (real-time network statistics)
tcpdump – command line packet dump utility for network analysis
strace – system calls trace – useful for debugging
nmap – much more than just a port scanner: it also handles host discovery and service and OS detection
cacti – web-based monitoring tool
ntop – Network Top – displays the top network users
htop – an enhanced, interactive version of top
vnstat – console-based network traffic monitor
wireshark – the best protocol analyzer around
nagios – open source enterprise system monitor. Look for several articles coming soon on the use of Nagios.
dstat – combines the output from vmstat, iostat, ifstat, netstat and other tools
powertop – estimates the power consumption of applications based on how much time the CPU spends in low-power states vs. turbo modes. Requires ACPI.
whowatch – basically shows who is logged in and what they are doing in real time (similar to ‘w’ but it continuously updates)
dtrace – DTrace can be used to get a global overview of a running system, such as the amount of memory, CPU time, filesystem and network resources used by the active processes. It can also provide much more fine-grained information, such as a log of the arguments with which a specific function is being called, or a list of the processes accessing a specific file.
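Several of the simpler tools above (uptime, free) largely present numbers that the kernel already exposes under /proc. As a rough illustration of where that data comes from, here is a minimal Python sketch that parses text in the format of /proc/uptime, /proc/loadavg and /proc/meminfo. The function names are hypothetical and made up for this example; this is not how the real tools are implemented, only an approximation of what they report.

```python
import os


def parse_uptime(text):
    """Return seconds since boot from /proc/uptime-style text.

    The first field is the uptime in seconds; the second is idle time.
    """
    return float(text.split()[0])


def format_uptime(seconds):
    """Format a number of seconds as days, hours and minutes."""
    days, rem = divmod(int(seconds), 86400)
    hours, rem = divmod(rem, 3600)
    minutes = rem // 60
    return f"{days}d {hours}h {minutes}m"


def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages from /proc/loadavg-style text."""
    one, five, fifteen = (float(x) for x in text.split()[:3])
    return one, five, fifteen


def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to their integer values.

    Most fields are reported in kB; the unit suffix is dropped here.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def read_text(path):
    """Hypothetical helper: read a small text file in one go."""
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__":
    # Only meaningful on Linux, where these /proc files exist.
    if os.path.exists("/proc/uptime"):
        print("up", format_uptime(parse_uptime(read_text("/proc/uptime"))))
        print("load averages", parse_loadavg(read_text("/proc/loadavg")))
        mem = parse_meminfo(read_text("/proc/meminfo"))
        print("MemTotal kB:", mem.get("MemTotal"))
        print("MemFree kB:", mem.get("MemFree"))
```

Running the script on a Linux box and comparing its output with uptime and free is a quick way to get a feel for what those commands are summarizing. For anything beyond a quick check, the real tools (or sar for history) are the better choice.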