Monday, October 25, 2010

System Monitoring Commands

I’ve been working on a large system monitoring project for the past month and have set up a centralized monitoring solution that tracks over 800 servers using Nagios.

As part of that, we established multiple trending reports and taught the network operations center (NOC) support staff how to run various Linux tools for server monitoring. (Most of the NOC staff at this company are Microsoft-centric, with limited exposure to Linux.)

These commands should be well known to anyone doing Linux system administration. If you manage Linux servers and aren’t familiar with some of the commands on this list, spend some time playing with their various options. Knowing what these tools can do and how to use them can be very useful when troubleshooting a system issue.

The commands we covered are:

top – provides a dynamic real-time view of running processes. By default, it displays the most CPU-intensive tasks running on the server and refreshes the list every few seconds (three seconds by default in the procps version of top).

vmstat – reports information about processes, memory, paging, block I/O, traps and CPU activity

w – displays who is online (logged in) and reports what they are doing

uptime – reports how long the system has been running

ps – displays running processes

free – displays the total amount of free and used physical and swap memory

iostat – reports statistics on I/O

sar – collects and reports on system activity

mpstat – reports per-processor statistics

pmap – reports the memory map of a process

netstat – network statistics (connections, routing tables and interface counters)

ss – socket statistics

iptraf – IP LAN monitor (real-time network statistics)

tcpdump – command line packet dump utility for network analysis

strace – system calls trace – useful for debugging

nmap – much, much more than just a port scanner

cacti – web based monitoring tool

ntop – Network Top – displays the top network users

htop – enhanced version of top

vnstat – console based network traffic monitor

wireshark – the best protocol analyzer around

nagios – open source enterprise system monitor. Look for several articles on using Nagios coming soon.

dstat – combines the output from vmstat, iostat, ifstat, netstat and other tools

powertop – monitors the power consumption of applications, based on how much time the CPU spends in low-power states versus turbo modes. Requires ACPI.

whowatch – shows who is logged in and what they are doing in real time (similar to ‘w’, but it updates continuously)

dtrace – DTrace can be used to get a global overview of a running system, such as the amount of memory, CPU time, filesystem and network resources used by the active processes. It can also provide much more fine-grained information, such as a log of the arguments with which a specific function is being called, or a list of the processes accessing a specific file.
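
For anyone who wants to see where some of these numbers come from: on Linux, uptime and free get their figures from files under /proc (/proc/uptime, /proc/loadavg and /proc/meminfo). Below is a minimal Python sketch that parses those files directly. The function names are my own, made up for illustration, and this is only a rough approximation of what the real tools print, not a replacement for them.

```python
from pathlib import Path


def parse_uptime(text):
    """Return seconds since boot from /proc/uptime-style text.

    The first field is the uptime in seconds; the second (ignored here)
    is the accumulated idle time.
    """
    return float(text.split()[0])


def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages from /proc/loadavg-style text."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def parse_meminfo(text):
    """Return a dict of field name -> integer value from /proc/meminfo-style text.

    Most values are in kB; a few fields (such as HugePages_Total) are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            fields[name.strip()] = int(parts[0])
    return fields


def format_uptime(seconds):
    """Format a number of seconds as days, hours and minutes."""
    days, rem = divmod(int(seconds), 86400)
    hours, rem = divmod(rem, 3600)
    minutes = rem // 60
    return f"{days}d {hours}h {minutes}m"


if __name__ == "__main__":
    # Only read the files that exist, so the script is harmless on non-Linux systems.
    uptime_file = Path("/proc/uptime")
    loadavg_file = Path("/proc/loadavg")
    meminfo_file = Path("/proc/meminfo")

    if uptime_file.exists():
        print("up", format_uptime(parse_uptime(uptime_file.read_text())))
    if loadavg_file.exists():
        print("load average:", parse_loadavg(loadavg_file.read_text()))
    if meminfo_file.exists():
        mem = parse_meminfo(meminfo_file.read_text())
        print("MemTotal kB:", mem.get("MemTotal"), "MemFree kB:", mem.get("MemFree"))
```

Running it on a Linux box and comparing the output with uptime and free is a quick way to show newer staff that these tools are mostly friendly front ends to /proc.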
