Systems Administration

Monitoring IIS Servers

Systems Administration usually comes with a lot of monitoring. But what exactly are we monitoring and why are we monitoring? Are we just gathering raw data that sits on spindles in a PC underneath your desk?

I’m on the topic of monitoring web servers running IIS. There are a couple of questions I had to ask myself before blindly selecting counters on a whim to fulfill my need to monitor.

  • How will I know it’s time to upgrade my web server? And how do I know what kind of hardware to purchase?
  • Do I have a baseline of what the load looks like on a regular day?
  • What resources are targeted on a web server?

Those are just some of the questions. To get an idea of what I’m dealing with I’ll give a few statistics.

  • Small time, Mission Critical. Our site is relatively small but is crucial for keeping the lights on.
  • Public vs Private. We have a public facing site and a membership site.
  • 30k. There are about 30k unique visitors a month and about 60k total visits a month.
  • Windows. We’re a Windows shop so we’re using IIS and SQL.

So what am I looking to monitor on my web servers?

% Processor Time

The processor counter that shows the percentage of time the processor is busy. Ideally this should stay below 70 to 75% for a web server. If the counter is consistently above that range, you should plan to upgrade.
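A single spike over the line is not a reason to buy hardware; what matters is how much of the time the counter sits above it. Here is a minimal sketch in Python, assuming the samples have already been collected as percentages. The helper name `fraction_above` is hypothetical, and the 75% default is just the figure mentioned above, not an official limit.

```python
from typing import Sequence


def fraction_above(samples: Sequence[float], limit: float = 75.0) -> float:
    """Return the fraction of samples strictly above the limit (0.0 if empty)."""
    if not samples:
        return 0.0
    return sum(1 for s in samples if s > limit) / len(samples)


# Made-up % Processor Time readings, one per sampling interval.
cpu = [40.0, 82.5, 91.0, 60.0, 77.0, 30.0, 88.0, 55.0]
print(f"{fraction_above(cpu):.0%} of samples above 75%")
```

If the fraction stays high over a normal business day, that is a much stronger upgrade signal than one busy minute.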

Memory available bytes

Memory counter which displays the amount of physical memory still available. The higher the better. A value that is low or keeps dropping means the server is running short on memory.

% Disk Time

The hard disk counter monitors the percentage of time that the disk is busy with read/write activity. The lower the better. If you’re looking at high % disk time then you may want to monitor the disk queue to ensure that requests are not waiting to be processed.

I asked a question on Server Fault about whether to monitor the Logical Disk or the Physical Disk counters. The responses are interesting, but they all point back to what you’re trying to achieve by monitoring the disks.

Requests Succeeded

ASP.NET counter which displays the number of requests that executed successfully, meaning they returned a status code of 200. There are other outcomes you could monitor as well, such as 404s, 500s, etc.

Application Restarts

ASP.NET counter which displays the aggregate number of restarts for all ASP.NET applications. There are many reasons why an application would restart and to figure out why you’ll have to hit up the logs.
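Because the counter is a running total, the interesting number is how much it grew between two polls. Here is a minimal sketch with a hypothetical `new_restarts` helper. Treating a drop in the value as a counter reset (for example after the web service itself restarts) is my assumption.

```python
def new_restarts(previous: int, current: int) -> int:
    """Restarts seen since the last poll of a cumulative counter."""
    if current < previous:
        # Assumed: the counter started again from zero, so everything is new.
        return current
    return current - previous


readings = [3, 3, 5, 1, 1]  # made-up successive readings
deltas = [new_restarts(a, b) for a, b in zip(readings, readings[1:])]
print(deltas)
```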

Get Requests/sec

A Web Service counter that displays the rate, per second, at which HTTP GET requests are being made.
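To collect the counters listed above, one option on Windows is the built-in `typeperf` tool. The sketch below only builds the command line. The counter paths are my assumption of how these counters are usually addressed (instance names such as `_Total` and `__Total__` may differ on your server), and `build_typeperf_command` is a hypothetical helper.

```python
from typing import List

# Assumed counter paths; verify them on the server with `typeperf -q`.
COUNTERS = [
    r"\Processor(_Total)\% Processor Time",
    r"\Memory\Available Bytes",
    r"\PhysicalDisk(_Total)\% Disk Time",
    r"\ASP.NET Applications(__Total__)\Requests Succeeded",
    r"\ASP.NET\Application Restarts",
    r"\Web Service(_Total)\Get Requests/sec",
]


def build_typeperf_command(counters: List[str], interval_s: int,
                           samples: int, out_file: str) -> List[str]:
    """Build a typeperf argument list: sample every interval_s seconds,
    stop after `samples` samples, and write the results as CSV."""
    return ["typeperf", *counters,
            "-si", str(interval_s),
            "-sc", str(samples),
            "-f", "CSV",
            "-o", out_file]


# 240 samples at 15-second intervals covers one hour.
cmd = build_typeperf_command(COUNTERS, interval_s=15, samples=240,
                             out_file="iis_baseline.csv")
print(cmd)
```

On the web server itself the list could be handed to `subprocess.run(cmd, check=True)`. Running it over a regular day is one way to get the baseline asked about earlier.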

So far this is my running list. Consider it a work-in-progress as I am still finding things that need to be monitored. The next step would be determining thresholds for these counters. At what value does a counter hit a warning threshold, and at what value a critical one? For example, is 30% available memory an OK, warning or critical status?
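One way to pin the threshold question down is to write the thresholds as data and classify each reading against them. Here is a minimal sketch, assuming a "lower is worse" counter such as available memory expressed as a percentage of total RAM. The 30% and 10% cut-offs and the function names are placeholders for illustration, not recommendations.

```python
def classify_low(value: float, warning: float, critical: float) -> str:
    """Classify a reading where lower values are worse (e.g. available memory %)."""
    if critical > warning:
        raise ValueError("critical must not exceed warning for a lower-is-worse counter")
    if value <= critical:
        return "critical"
    if value <= warning:
        return "warning"
    return "ok"


def available_percent(available_bytes: int, total_bytes: int) -> float:
    """Convert an Available Bytes reading into a percentage of installed RAM."""
    return 100.0 * available_bytes / total_bytes


GIB = 1024 ** 3
pct = available_percent(2 * GIB, 8 * GIB)  # 2 GiB free out of 8 GiB
print(pct, classify_low(pct, warning=30.0, critical=10.0))
```

A "higher is worse" counter such as % Processor Time would need the comparisons flipped.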

Once the thresholds are nailed down, what do we do if we reach a warning or critical status? Do notifications get emailed to systems administrators? Do we restart a service?
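The response side can be sketched the same way: map each status to a list of actions and keep the actions themselves as plain functions. Everything here is hypothetical. The function names are made up, and the email and restart stubs only record what they would do; a real version would send mail or restart a service instead.

```python
from typing import Callable, Dict, List

log: List[str] = []  # stands in for real side effects


def notify_admins(counter: str, status: str) -> None:
    log.append(f"email admins: {counter} is {status}")


def restart_service(counter: str, status: str) -> None:
    log.append(f"restart service because {counter} is {status}")


Action = Callable[[str, str], None]

# Which actions run for which status; "ok" deliberately does nothing.
ACTIONS: Dict[str, List[Action]] = {
    "ok": [],
    "warning": [notify_admins],
    "critical": [notify_admins, restart_service],
}


def respond(counter: str, status: str) -> None:
    """Run every action registered for the given status, in order."""
    for action in ACTIONS[status]:
        action(counter, status)


respond("% Processor Time", "warning")
respond("Available Bytes", "critical")
print("\n".join(log))
```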

I’ll continue this conversation on another post. What counters are you using to monitor your IIS web servers?
