Perl and the Practical System Administrator

by
Randy Appleton
rappleto@nmu.edu

Introduction

System administrators are very busy people.  Sometimes a bit of common sense programming can go a long way towards eliminating some of the repetitive tasks associated with system administration.  Perl can be used to automate many recurring system administrative tasks.  AccountCheck is an example perl script that automatically checks for many problems that can occur when user accounts are made and deleted without every necessary step being completed.  Partially created or deleted accounts can be a security hazard, acting as hidden doors to let intruders enter a system.  Most systems have such hidden doors.  This perl script checks for the common types of errors that can occur: accounts with root privileges, missing passwords or shadow entries, disabled shells, missing or shared home directories, files owned by someone else, accounts that have gone unused, and accounts that use too much disk space.

Problem Description

Generally speaking, system administrators are busy people.  In my case, I'm often asked by students to create or delete accounts just before I have to leave to teach.  Although I know how to do these things well, like all other humans I will sometimes make mistakes.  If a system administrator successfully creates and deletes nineteen accounts out of twenty, he gets 5% wrong.  Create and delete one hundred accounts per year for a few years, and the number of errors becomes large.  Even worse, some accounts become dormant.  No one told the system administrator the account is no longer needed, but the person who wanted it has since left.  In an academic environment, we tend to have many such 'orphan accounts'.  Every error results in an incorrect entry in /etc/passwd or /etc/shadow, or an unneeded home directory or mail spool file.  These things waste disk space, and can act as a door for intruders to enter.

Utilities such as 'adduser' and 'userdel' help.  Nevertheless, these errors and orphan accounts still occur.

Upgrading the system does not help.  Typically, we upgrade by installing a distribution onto a new system, and then copying over the old /etc/passwd, /etc/shadow, and home directories.  This just copies the old errors onto the new system.

Searching for these problems manually is possible, but annoying and time consuming.  Better is to have an automated tool to search for these problems.  Perl is a language designed for such tasks, and AccountCheck is a perl script representing just such a tool.

Perl Program

Perl is the perfect language for checking accounts.  The tasks needed to check accounts include parsing the password and shadow files, and checking for the existence of various directories and other files.  These are the tasks perl was designed to do.

Although one might assume the first step is to read /etc/passwd, in practice it works better to read /etc/shadow first.  Once /etc/shadow has been read, all of the checks can be done with one pass through /etc/passwd.

The first test is to make sure the file /etc/shadow exists, since not all distributions use shadow passwords.  Existence is checked using the -f operator in line 19.  If it does exist, /etc/shadow is opened in line 25.  The loop starting at line 26 means "While there are lines left in the file, read the next line and assign it to $line.  When the file has been completely read, exit the loop to line 32."  Each entry in /etc/shadow consists of a user name, a password, and some other fields, all separated by colons.  This data is parsed at line 29, and the password for a user is stored in the associative array 'password' at line 30.  The password for 'joe' is stored in $password{"joe"}.  Associative arrays (arrays where the index can be a string) are a wonderful tool, and will make this perl script much easier to write.
  1:#################################################################
  2:# Change the parameters here
  3:#################################################################
  4:$LOWUID=100;             # don't look at accounts with uid below this
  5:                         # they're special system accounts
  6:$HOME="/home";           # where are the home dirs
  7:$PASS="/etc/passwd";     # where is the password file
  8:$SHADOW="/etc/shadow";   # where is the password shadow file
  9:$MAIL="/var/spool/mail"; # where are the mail spool files
 10:$OLD=30;                 # how many days of no activity before an account is old
 11:$BIGACCOUNT=10000;       # How many KB of disk space before an account is too big
 12:$MAXDEPTH=0000;          # Max depth of directories to search for files.
 13:
 14:#################################################################
 15:# Nothing to change below here
 16:#################################################################                                                          
 17:#
 18:# Read /etc/shadow if it exists.
 19:if (! -f $SHADOW) {
 20:        print "No shadow password file.  Testing without it.\n";
 21:        $shadow = 0;
 22:}
 23:else {
 24:        $shadow = 1;
 25:        open(SH, $SHADOW) || die "Unable to open password shadow file $SHADOW\n";
 26:        while ($line = <SH>) {
 27:               chomp($line);
 28:               local($junk); # declare $junk so that -w does not warn about it
 29:               ($name, $password, $junk) = split(":", $line);
 30:               $password{$name} = $password;
 31:       }
 32:}
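
The same idea can be tried out without touching the real /etc/shadow.  What follows is only a minimal sketch and not part of AccountCheck: the sub name parse_shadow and the sample entries (including the fake password hash) are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not in AccountCheck): build a hash that maps
# user name to password field from shadow-style lines.
sub parse_shadow {
    my @lines = @_;
    my %password;
    foreach my $line (@lines) {
        chomp($line);
        # Only the first two fields matter here; the rest are ignored.
        my ($name, $pass) = split(/:/, $line);
        $password{$name} = $pass;
    }
    return %password;
}

# Made-up sample entries in /etc/shadow format.
my %password = parse_shadow(
    'joe:$1$abcdefgh$notarealhash:11000:0:99999:7:::',
    'ann:*:11000:0:99999:7:::',
);

foreach my $name (sort keys %password) {
    print "$name => $password{$name}\n";
}
```

Because the index is the user name itself, looking up one user later is a single hash access rather than a search through the file.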

The next step is to open and read the password file.  We don't have to check for the existence of /etc/passwd, since all distributions have this file.  The password file is opened at line 35, and the while loop at line 36 reads each entry.  Note how the while loop at line 36 looks just like the while loop at line 26.  Program fragments that look the same and have similar purposes are called program phrases, and serve the same purpose in programming languages as clichés do in spoken languages.  They provide a concise and easily repeated way of expressing a common idea.

One problem with parsing the password file is that sometimes the account includes contact information, and other times it does not.  In other words, sometimes the /etc/passwd entry looks like ftp:x:14:50:FTP User:/home/ftp:/bin/bash, and other times it looks like ftp:x:14:50::/home/ftp:/bin/bash.  Perl's split function keeps an empty field in the middle of a line, so both of these forms put the home entry in field #6 and the shell entry in field #7.  An entry that leaves the contact field out altogether (six fields instead of seven), as can happen when /etc/passwd is edited by hand, is different: the home and shell entries arrive one field earlier and $shell comes back empty.  The if statement starting at line 43 corrects such entries, setting $home and $shell as needed.
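
How split treats these entries is easy to check directly.  The sketch below is not part of AccountCheck; the sub name passwd_fields is hypothetical, and the sample lines are the two ftp entries quoted above plus a made-up six-field entry.

```perl
use strict;
use warnings;

# Hypothetical helper: split one /etc/passwd line into its fields.
sub passwd_fields {
    my ($line) = @_;
    chomp($line);
    return split(/:/, $line);
}

my @samples = (
    'ftp:x:14:50:FTP User:/home/ftp:/bin/bash',  # contact field filled in
    'ftp:x:14:50::/home/ftp:/bin/bash',          # contact field empty
    'ftp:x:14:50:/home/ftp:/bin/bash',           # contact field left out (made up)
);

foreach my $line (@samples) {
    my @f = passwd_fields($line);
    my $count = scalar(@f);
    # Field #6 is $f[5]; field #7 is $f[6] and may be missing.
    my $shell = defined($f[6]) ? $f[6] : "(none)";
    print "$count fields, field 6 = $f[5], field 7 = $shell\n";
}
```

On these sample lines split returns seven fields for both ftp entries, since an empty field in the middle of a line is kept; only the six-field entry comes up one short.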

Finally, the name of the user is recorded in the associative array %users.  If $users{"scott"} is equal to one, the user "scott" exists in the password file.  Otherwise, he does not.  This will be used later to ensure that all mail spool files have an associated user.
 
 33:# Open and read in the password file
 34:local($name, $password, $uid, $gid, $longname, $home, $shell);
 35:open(PW, $PASS) || die "Unable to open password file $PASS\n";
 36:while ($line = <PW>) {
 37:       chomp($line);
 38:       $shell="";
 39:
 40:       # split the password entry, and correct if they
 41:       # don't have a longname
 42:       ($name, $password, $uid, $gid, $longname, $home, $shell) = split(":", $line);
 43:        if ($shell eq "") {
 44:                $shell=$home;       # the shell arrived one field early
 45:                $home=$longname;    # and so did the home directory
 46:        }
 47:        $users{$name} = 1;
Now that the data has been read, the various checks must be performed.  We check that the user ID belongs to an ordinary user.  Generally, user IDs below 100 are for general system purposes (ftp, the web, etc.) and should not be checked for errors.  If the account uid is too low, perl's 'next' command restarts the while loop at line 36, reading a new line from /etc/passwd.  User ID and group ID 0 are special.  Any account with that user or group ID has full root privileges, even if it's not named root.  We had a hacker break into one of our machines, and create accounts with innocent sounding names but with a user ID of zero.  This hacker did substantial damage.  Now we run this script, and all such accounts are flagged for inspection.
 
 48:        # Check some obvious stuff (uid 0 first, before low uids are skipped)
 49:        if ($uid == 0 || $gid == 0) {
 50:                print "    $name has uid $uid and gid $gid!\n";
 51:        }
 52:        if ($uid < $LOWUID) {
 53:                next;
 54:        }
 55:        print "Checking $name\n";
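
The test for root-equivalent accounts can also be tried on its own.  This is a minimal sketch, assuming passwd-style lines are handed in as strings; the sub name root_equivalents and the sample accounts are made up and do not appear in AccountCheck.

```perl
use strict;
use warnings;

# Hypothetical helper: return the names of accounts whose uid or gid is 0.
sub root_equivalents {
    my @lines = @_;
    my @flagged;
    foreach my $line (@lines) {
        chomp($line);
        my ($name, $pass, $uid, $gid) = split(/:/, $line);
        push(@flagged, $name) if ($uid == 0 || $gid == 0);
    }
    return @flagged;
}

my @flagged = root_equivalents(
    'root:x:0:0:root:/root:/bin/bash',
    'backup:x:0:0::/home/backup:/bin/bash',          # made-up intruder account
    'randy:x:500:500:Randy:/home/randy:/bin/bash',
);
print "uid or gid 0: @flagged\n";
```

Note that root itself shows up in the list, so a human still has to decide which of the flagged names are legitimate.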
The password should be checked.  If there is no password at all, then anyone can log in using that account.  This is an open door to everyone, and is checked in line 56.  If the shadow password file was located, then there should be an entry for this user in /etc/shadow.  Line 59 prints an error if shadow passwords were found ($shadow == 1, set in line 24) and the particular user lacks an entry (!defined($password{$name}), set in line 30).  The "!" operator means "not".

Interestingly, even in systems with shadow passwords, not all accounts will have shadow passwords.  Accounts made by hand, and accounts made before the system was upgraded to use shadow passwords, will store the password in /etc/passwd and not /etc/shadow.  Even changing passwords will not cause the new password to be stored in /etc/shadow.  Password entries must be moved from /etc/passwd to /etc/shadow manually.
 
  56:        if ($password eq "") {
 57:                print "    $name has no password!\n";
 58:        }
 59:        if ($shadow == 1 && !defined($password{$name})) {
 60:                print "    $name has no entry in shadow password file $SHADOW!\n";
 61:        }
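
The two password tests can be gathered into one small routine.  This is only a sketch under my own assumptions: the sub name password_problems, its argument order, and the sample data are invented here and are not taken from AccountCheck.

```perl
use strict;
use warnings;

# Hypothetical helper: return the password-related complaints for one
# account.  $shadow is 1 when a shadow file was read, and $shadow_pw is
# a reference to the hash built from it.
sub password_problems {
    my ($name, $passwd_field, $shadow, $shadow_pw) = @_;
    my @problems;
    push(@problems, "$name has no password!") if $passwd_field eq "";
    if ($shadow == 1 && !defined($shadow_pw->{$name})) {
        push(@problems, "$name has no entry in shadow password file!");
    }
    return @problems;
}

my %shadow_pw = (joe => '*');     # made-up shadow data
print "$_\n" foreach password_problems('joe', 'x', 1, \%shadow_pw);
print "$_\n" foreach password_problems('ann', '',  1, \%shadow_pw);
```

Here joe produces no complaints, while ann is reported twice: once for the empty password field and once for the missing shadow entry.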

Next, the shell should be checked.  If the shell is not one of the approved shells (checked at line 62), then an error is printed.  Often the shell is not on the approved list because the sysadmin disabled the account.  Here we disable accounts by changing the shell to /bin/DISABLED.  When the user tries to log in, the login process will attempt to run the nonexistent /bin/DISABLED.  When that fails, the user will be logged back out immediately.  Setting the shell to /bin/DISABLED is sometimes more desirable than just deleting the account because it documents right in the password file that the account did exist, and has been disabled on purpose.
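
When the list of approved shells grows, a chain of 'ne' tests gets long.  One alternative, shown here only as a sketch and not as what AccountCheck does, is to keep the approved shells in an associative array.  The names %approved and shell_ok are hypothetical; the three shells are the ones tested in the listing.

```perl
use strict;
use warnings;

# Approved shells, taken from the test in the listing.
my %approved = map { $_ => 1 } ('/bin/bash', '/bin/csh', '/bin/tcsh');

# Hypothetical helper: returns 1 if the shell is on the approved list.
sub shell_ok {
    my ($shell) = @_;
    return exists($approved{$shell}) ? 1 : 0;
}

foreach my $shell ('/bin/bash', '/bin/DISABLED') {
    print "$shell is ", (shell_ok($shell) ? "approved" : "not approved"), "\n";
}
```

Adding another shell then means adding one string to the list rather than another clause to the if statement.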

The home directory should also exist.  If the directory does not exist (checked with the -d operator, line 66) then an error message is printed.  If it does exist, then we check to see if the user shares a home directory with someone else.  Someone is already using this directory if the variable $homes{$home} has a value.  For example, if $homes{"/home/randy"} has been set, then no one else should use the home directory "/home/randy".  At my school, both our secretaries use the same home directory, since they work on the same shared things.  However, such sharing is often an error, and should be flagged for human inspection.
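
The shared-directory test depends only on the order in which accounts are seen, so it can be sketched without a real file system.  The sub name shared_homes and the sample accounts below are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: given [name, home] pairs in passwd order, report
# every user whose home directory was already claimed by an earlier user.
sub shared_homes {
    my @pairs = @_;
    my (%homes, @reports);
    foreach my $pair (@pairs) {
        my ($name, $home) = @$pair;
        if (defined($homes{$home})) {
            push(@reports, "$name shares a home dir with $homes{$home}");
        } else {
            $homes{$home} = $name;   # the first user seen claims the directory
        }
    }
    return @reports;
}

my @reports = shared_homes(
    ['randy', '/home/randy'],
    ['pat',   '/home/office'],   # made-up accounts
    ['chris', '/home/office'],
);
print "$_\n" foreach @reports;
```

Only the second and later users of a directory are reported; the first one to appear in the password file is treated as its owner.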
 
 62:        if ($shell ne "/bin/bash" && $shell ne "/bin/csh" && 
 63:                $shell ne "/bin/tcsh") {
 64:                print "    $name is disabled with shell $shell\n";
 65:        }
 66:        if (! -d $home) {
 67:                print "    $name has home dir $home which is not really a directory.\n";
 68:        }        else {
 69:                  if (defined($homes{$home})) {
 70:                          print "    $name shares a home dir with $homes{$home}.\n";
 71:                  }

At this point, it has been established that the account has a valid home directory.  It is also important to establish that the account is in use.  One indication of being used is that files are being changed.  Although there are occasions when an account is used read-only, they are rare.  The next segment of code attempts to find a file in the home directory tree of the user that has been modified recently.  Of course, what's recent to one person might be the distant past to another.  We deleted an account after 120 days of inactivity.  Later that same day the user attempted to make her first modification in the last four months, and was surprised her account had been deleted.

The code to examine the home directory looking for recently modified files starts at line 73.  It can only be reached if the tests for home directory existence (line 66) and uniqueness (line 69) have been successfully passed.  The next bit of code will check every file in the user's account, and therefore is a convenient place to also check two other problems.  The user should be the owner of every file in his account, and the total space used by the account should be reasonable.  Although these things could be checked for in separate passes, it is more efficient to check all at once.

Although perl can certainly read a directory using built-in perl commands (like opendir, readdir, and stat), it is often easier to read a directory hierarchy using find.  Line 83 runs the find process, which produces a line of output for every file to be checked.  Each line looks like this:  /home/randy/comdex-letter.html-XQC-4-XQC-randy-XQC-949629416.  The filename is '/home/randy/comdex-letter.html', the size is 4K, the owner is 'randy', and the last modified time is 949,629,416 seconds since Jan 1, 1970.  The '-XQC-' strings are markers separating the fields, using the assumption that no filename has a '-XQC-' within it.  Line 85 reads information from find, and lines 88 and 89 parse it into $filename, $size, $owner, and $date.
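
The parsing step can be tried on the sample line without running find at all.  The following is a minimal sketch; the sub name parse_find_line and the idea of returning the fields as a hash are my own choices for illustration, not something AccountCheck does.

```perl
use strict;
use warnings;

# Hypothetical helper: split one line of the find output into named fields.
sub parse_find_line {
    my ($line) = @_;
    chomp($line);
    my ($filename, $size, $owner, $date) = split(/-XQC-/, $line);
    return (filename => $filename, size => $size,
            owner    => $owner,    date => $date);
}

# The sample line from the text, with the newline find would add.
my %info = parse_find_line(
    "/home/randy/comdex-letter.html-XQC-4-XQC-randy-XQC-949629416\n");

print "$info{filename}: $info{size} KB, owner $info{owner}, ",
      "timestamp ", scalar(localtime($info{date})), "\n";
```

The call to localtime turns the seconds count into a readable date in the local time zone, which is handy when checking the output by eye.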
 72:           else {
 73:                       #
 74:                        # Scan the users home directory looking for 
 75:                        # files owned by someone else, and checking
 76:                        # the amount of space used.
 77:                        #
 78: 
 79:                        $homes{$home} = $name;
 80:                        # Checks the total space used by a subdir, and the modification time of
 81:                        # the newest file in that dir.
 82:                        $latest = $filesize = $badowner = 0;
 83:                        open(FIND, "find $home -maxdepth $MAXDEPTH -printf \"%p-XQC-%k-XQC-%u-XQC-%T@\n\" |") 
 84:                                || die "Unable to open find";
 85:                        while ($line = <FIND>) {
 86:                                # get a file and its info
 87:                                #
 88:                                chomp($line);
 89:                                ($filename, $size, $owner, $date) = split("-XQC-", $line);
Now that the output of find has been parsed, the checking can begin.  The if statement at line 90 records the most recent modified time.  After all files have been checked, the most recent time will be compared against the current time (line 104) and an error printed if this most recently modified file was modified too long ago, since that would mean the account has not been used in a long time.

The owner of the file is compared to the owner of the account in lines 93-100.  In an earlier version of this script, every mismatched file was printed, which can be a long list.  This version prints an initial message on the first mismatch, and then prints "has more" on the second error, and ignores any further errors.  The variable $badowner controls this.  Initially, the variable is set to zero (line 82).  If the first mismatched file is detected (line 93), an error is printed (line 94) and $badowner is set to one (line 95).  If a second mismatched file is later found (line 97), an error is printed (line 98), and $badowner is set to two (line 99).  From then on no more errors of this type will be printed.  Such use of a variable to control the number of errors printed is common, and such variables are sometimes called 'state variables'.
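
The state variable technique is easy to try in isolation.  This is a minimal sketch, assuming the files arrive as [filename, owner] pairs; the sub name owner_messages and the sample files are made up.  It chains the two tests with elsif so that a single file cannot trigger both messages.

```perl
use strict;
use warnings;

# Hypothetical helper: report the first mismatched file in full, print a
# single "has more" note for the second, and stay silent after that.
sub owner_messages {
    my ($name, @files) = @_;      # each file is [filename, owner]
    my $badowner = 0;             # state: 0 = none seen, 1 = one, 2 = several
    my @messages;
    foreach my $file (@files) {
        my ($filename, $owner) = @$file;
        next if $owner eq $name;
        if ($badowner == 0) {
            push(@messages, "$name has a file ($filename) owned by $owner.");
            $badowner = 1;
        } elsif ($badowner == 1) {
            push(@messages, "$name has more files not owned by $name.");
            $badowner = 2;
        }
    }
    return @messages;
}

my @messages = owner_messages('randy',
    ['/home/randy/a', 'randy'],
    ['/home/randy/b', 'root'],    # made-up file names
    ['/home/randy/c', 'root'],
    ['/home/randy/d', 'root'],
);
print "    $_\n" foreach @messages;
```

Three mismatched files go in, but only two lines come out, which is the whole point of the state variable.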
 90:                                if ($latest < $date) {
 91:                                        $latest = $date;
 92:                                }
 93:                                if ($name ne $owner && $badowner == 0) {
 94:                                        print "    $name has a file ($filename) owned by $owner.\n";
 95:                                        $badowner = 1;
 96:                                }
 97:                                elsif ($name ne $owner && $badowner == 1) {
 98:                                        print "    $name has more files not owned by $name.\n";
 99:                                        $badowner = 2;
100:                                }
101:                                $filesize += $size;
The closing brace at line 102 terminates the while loop reading from find.  At this point, information from all files within the account has been gathered.  There are just two more checks needed.  The modified time of the most recently modified file is compared to the current time at line 103.  The difference is divided by 24 (hours per day), by 60 (minutes per hour) and by 60 again (seconds per minute).  Although one could simply divide by 86,400 (which is 60 * 60 * 24), coding the divide in this way helps to self-document the code, and leaves clues for later programmers to help understand what's happening and why.  The check at line 107 compares the total file size against the allowed limit, and complains if the account stores too much data.
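
The age calculation can be checked with known numbers.  This is a small sketch; the sub name days_since is hypothetical, and the timestamp is the sample one from the find output shown earlier.

```perl
use strict;
use warnings;

# Hypothetical helper: days elapsed between two timestamps, both given
# in seconds since Jan 1, 1970.
sub days_since {
    my ($then, $now) = @_;
    return ($now - $then) / 60 / 60 / 24;   # seconds -> minutes -> hours -> days
}

my $latest = 949629416;                     # sample timestamp from the find output
my $old    = days_since($latest, time());
printf("Last change was %.1f days ago\n", $old);
```

The result is usually a fraction, so printf with %.1f keeps the report readable.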
 
102:                        }
103:                        $old = (time() - $latest) / 24 / 60 / 60;
104:                        if ($old > $OLD) {
105:                                print "    $name has not changed a file in $old days.\n";
106:                        }
107:                        if ($filesize > $BIGACCOUNT) {
108:                                print "    $name uses $filesize KB of disk space\n";
109:                        }
110:                }
111:        }
112:}

Conclusion

There are two categories of things to learn from AccountCheck.  First are several perl programming techniques, like leaving self-documenting code (line 103), the use of state variables to help control output (lines 93-100) and the use of external programs to make perl tasks easier (line 83).  The second is some system administration advice.  Perl can be both very useful and very easy to use for catching system administration errors.  Performing these checks by hand could take forever, and once the tool is written the same tool can be used many times to catch errors.  Developing programming skills can help any system administrator save time, and run a tighter system.

URLs

The source code can be found at http://euclid.nmu.edu/~randy/Survey.html.  One of the best sites for perl can be found at http://www.perl.org.