Monday, October 25, 2010

System Monitoring Commands

I’ve been working on a large system monitoring project for the past month and have set up a centralized monitoring solution that tracks over 800 servers using Nagios.

As part of that, we established multiple trending reports and taught the network operations center (NOC) support staff how to run various Linux server monitoring tools. (Most of the NOC staff at this company are Microsoft-centric, with limited exposure to Linux.)

These commands should be well known to anyone doing Linux system administration. If you manage Linux servers and aren’t familiar with any of the commands on this list, you should spend some time playing with their various options. Knowing what they can do and how to use them can be very useful when troubleshooting a system issue.

The commands we covered are:

top – provides a dynamic real-time view of running processes. By default, it displays the most CPU-intensive tasks running on the server and updates the list every three seconds.

vmstat – reports information about processes, memory, paging, IO, traps and CPU activity

w – displays who is online (logged in) and reports what they are doing

uptime – reports how long the system has been running

ps – displays running processes

free – displays total free and used physical and swap memory

iostat – reports statistics on I/O

sar – collects and reports on system activity

mpstat – reports per-processor statistics

pmap – reports process memory usage

netstat – network statistics

ss – socket statistics

iptraf – IP Lan Monitor (real time network statistics)

tcpdump – command line packet dump utility for network analysis

strace – system calls trace – useful for debugging

nmap – much, much more than just a port scanner

cacti – web based monitoring tool

ntop – Network Top – displays the top network users

htop – enhanced version of top

vnstat – console based network traffic monitor

wireshark – the best protocol analyzer around

nagios – open source enterprise system monitor. Look for several articles coming soon on the use of Nagios.

dstat – combines the output from vmstat, iostat, ifstat, netstat and other tools

powertop – monitors the power consumption of applications based on how much time the CPU spends in low-power states vs. turbo modes; requires ACPI

whowatch – basically shows who is logged in and what they are doing in real time (similar to ‘w’ but it continuously updates)

dtrace – DTrace can be used to get a global overview of a running system, such as the amount of memory, CPU time, filesystem and network resources used by the active processes. It can also provide much more fine-grained information, such as a log of the arguments with which a specific function is being called, or a list of the processes accessing a specific file.
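
Not every distribution installs all of these by default, so before working through the list it can help to see which tools a given server actually has. Here is a minimal sketch of that check. The function name check_tools is my own invention; it relies only on the shell's "command -v" builtin.

```shell
#!/bin/sh
# check_tools: report which of the named commands are on the PATH.
# The function name is illustrative; "command -v" is a POSIX shell builtin.
check_tools() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
        fi
    done
}

# Check a handful of the tools from the list above.
check_tools top vmstat iostat sar mpstat pmap netstat ss tcpdump strace
```

Anything reported as missing can usually be installed from your distribution's package repository.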

Review your Log Data

Read your logs using logwatch or logcheck. These tools make your log reading life easier. You get detailed reporting on unusual items in syslog via email.

Wednesday, October 20, 2010

Keep user accessible data on separate disk partitions

Separating operating system files from user files may result in a more secure system. Ideally, the following filesystems should be mounted on separate partitions:

  • /usr
  • /home
  • /var and /var/tmp
  • /tmp

I also suggest separate partitions for Apache and FTP server roots. Edit the /etc/fstab file and make sure you add the following mount options:

  1. noexec - Do not allow direct execution of binaries on this partition (scripts can still be run by passing them to an interpreter, e.g. sh script.sh).
  2. nodev - Do not interpret character or block special devices on this partition (prevents use of device files such as zero, sda etc).
  3. nosuid - Do not honor SUID/SGID bits on this partition (prevents setuid/setgid executables from running with elevated privileges).

Sample /etc/fstab entry to limit user access on /dev/sda5 (www server root directory):

/dev/sda5  /srv/www/htdocs          ext3    defaults,nosuid,nodev,noexec 1 2
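
To audit an existing server, something like the following sketch can report which of the three options are missing for a given mount point. The function name missing_mount_opts is hypothetical, and it assumes the standard fstab layout where the mount point is the second column and the options are the fourth.

```shell
#!/bin/sh
# missing_mount_opts FSTAB MOUNTPOINT
# Print each of nosuid, nodev, noexec that is absent from the options
# field (4th column) of the fstab entry for MOUNTPOINT.
# Prints nothing if all three are present or the mount point is not listed.
missing_mount_opts() {
    awk -v mp="$2" '
        $1 !~ /^#/ && $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) have[opts[i]] = 1
            split("nosuid nodev noexec", want, " ")
            for (i = 1; i <= 3; i++)
                if (!(want[i] in have)) print want[i]
        }
    ' "$1"
}

# Example against the real file:
# missing_mount_opts /etc/fstab /tmp
```

Note that this only reads the fstab file. It does not tell you how a filesystem is currently mounted.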

Tuesday, October 12, 2010

Establish password aging policies

The chage command changes the number of days between password changes and the date of the last password change. This information is used by the system to determine when a user must change his/her password. The /etc/login.defs file defines the site-specific configuration for the shadow password suite including password aging configuration. To disable password aging, enter:


chage -M 99999 userName

To get password expiration information, enter:

chage -l userName

You can also manually specify the information in the /etc/shadow file, which has the following fields:

{userName}:{password}:{lastpasswdchanged}:{Minimum_days}:{Maximum_days}:{Warn}:{Inactive}:{Expire}:



Note that the “Expire” date, like the last-password-change field, is stored as the number of days (not seconds) since Jan 1, 1970.
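
Since day counts are hard to read at a glance, here is a small sketch that converts one of these values into a calendar date. The function name shadow_days_to_date is my own, and it assumes GNU date, which accepts an "@seconds" epoch timestamp.

```shell
#!/bin/sh
# shadow_days_to_date DAYS
# Convert a day count from /etc/shadow (days since 1970-01-01) to YYYY-MM-DD.
# Assumes GNU date, which accepts "@SECONDS" as an epoch timestamp.
shadow_days_to_date() {
    date -u -d "@$(( $1 * 86400 ))" +%Y-%m-%d
}

shadow_days_to_date 14907   # prints 2010-10-25
```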


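The chage command is usually easier than manually editing the /etc/shadow file. For example, to set a maximum password age of 60 days, a minimum of 7 days between changes, and a 7-day warning period:

chage -M 60 -m 7 -W 7 <accountname>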

Lock accounts after failed login attempts

You can use the faillog command to set login failure limits and to display a list of failed login attempts.

To reset the failure counter and unlock an account, you can use:

faillog -r -u <accountname>

You can also use the passwd command to lock or unlock accounts manually:

passwd -l <accountname>

passwd -u <accountname>
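
To confirm that a lock took effect, you can look at the account's password field. This sketch assumes the common Linux behavior where passwd -l prefixes the stored hash with "!". The function name is_locked is hypothetical, and reading /etc/shadow requires root.

```shell
#!/bin/sh
# is_locked SHADOW_FILE USER
# Exit status 0 if USER's password field begins with "!", which is how
# passwd -l marks a locked password on typical Linux systems (assumption);
# exit status 1 if the account is not locked or not listed.
is_locked() {
    awk -F: -v u="$2" '$1 == u && $2 ~ /^!/ { found = 1 } END { exit !found }' "$1"
}

# Example (as root):
# is_locked /etc/shadow someuser && echo "someuser is locked"
```

If you would rather not parse the file yourself, passwd -S <accountname> prints the account status directly.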

Sunday, October 3, 2010

Disable unnecessary services

You should periodically review which services are running and disable any that are no longer needed. One way to check is to use the following command (note that it lists the services configured to start in runlevel 3):

chkconfig --list | grep '3:on'

If you see any services you need to stop and disable, you can use these commands:

service <servicename> stop

chkconfig <servicename> off

The first one stops the service; the second one removes it from the list of services that start when you enter a runlevel (such as at system startup).
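
If you only want the service names, for example to compare them against a list of services you expect, a small filter along these lines may help. The function name services_on is my own, and it assumes the usual chkconfig --list layout: a service name followed by level:state columns.

```shell
#!/bin/sh
# services_on LEVEL
# Read "chkconfig --list" output on stdin and print the name of every
# service that is switched on for runlevel LEVEL.
services_on() {
    awk -v want="$1:on" '{
        for (i = 2; i <= NF; i++)
            if ($i == want) { print $1; break }
    }'
}

# Typical use on a system that has chkconfig:
# chkconfig --list | services_on 3
```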

Check what ports are listening

You can check your server’s listening ports with the following command:

netstat -tulpn

or

nmap -sT -O <hostname>

If any aren’t needed, you should consider shutting down that service or blocking access to the port with iptables.
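
To boil the netstat output down to a short list you can review or diff over time, a filter like this sketch may be handy. The function name listening_ports is hypothetical, and it assumes the usual netstat column layout where the local address is the fourth field.

```shell
#!/bin/sh
# listening_ports
# Read "netstat -tulpn" output on stdin and print one "proto/port" line
# for each listening socket, sorted and de-duplicated.
listening_ports() {
    awk '$1 ~ /^(tcp|udp)/ {
        n = split($4, parts, ":")   # the port follows the last colon
        proto = $1
        sub(/6$/, "", proto)        # fold tcp6/udp6 into tcp/udp
        print proto "/" parts[n]
    }' | sort -u
}

# Typical use (as root, so that all sockets are visible):
# netstat -tulpn | listening_ports
```

Saving the output to a file after each review makes it easy to spot a newly opened port later with diff.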