Sunday, 28 February 2010

In-band monitoring for HTTP: protect the customer from error pages

Today's topic is in-band monitoring.
  1. The problem
  2. The cause
  3. The solution
1. The problem
"404 - Page/Object Not Found", "500 - Server Error". Sound familiar? We receive these errors daily. Sometimes we receive them without noticing - ever noticed an image missing after a page loads?

These are all examples of wasting precious customer time. Customers with high expectations!


2. The cause
Web Applications are made up of both Static and Dynamic elements. Static: a basic web server request to get 'logo.gif'. Dynamic: a request to a database to list all the transactions where username is 'Bob'.

Typically, a 404 is received when a static object is requested from a web server but the object does not exist. Maybe it wasn't copied to the server? Was it accidentally deleted? Is there a permission problem? Does the server have a bad disk? There are a lot of reasons why this can happen, but all have the same result: a poor customer experience.

A 500 Server Error is usually (I did say usually - not always) returned from the Application Server and could be the result of a hung process, a poorly coded database timeout, an over-utilised server, a code bug... the list goes on. Before I get flamed by some Application Developers: the code they write is complex, and the developers themselves are often under significant pressure from the business to deploy new functionality to remain competitive. Rapid time to market = rushed code.


3. The solution
In-band monitoring. Why assume that every response from the Web Server or Application Server is a response that you are willing to share with your customer? A 500 Server Error is a valid response, after all! With an in-band monitoring solution you have the option of rejecting a bad response and retrying the request with a different server.

How does this sound in real time? If Server A returns a 500 or a 404, try again with Server B. The customer doesn't need to know.
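
To make the retry idea concrete, here is a minimal sketch in Python. It is not a real load balancer, just the decision logic: inspect each response in-band and move on to the next server when the status is one we would rather not show the customer. The names fetch_with_failover, BAD_STATUSES, fake_send, "server-a" and "server-b" are all made up for illustration, and the transport is passed in as a plain function so the sketch runs without a network.

```python
from typing import Callable, Iterable, Optional, Tuple

Response = Tuple[int, bytes]  # (HTTP status code, body)

# Assumption: the statuses we are not willing to pass on to the customer.
BAD_STATUSES = {404, 500}


def fetch_with_failover(
    backends: Iterable[str],
    path: str,
    send: Callable[[str, str], Response],
) -> Response:
    """Try each backend in order; return the first acceptable response.

    If every backend answers with a bad status, the last response is
    returned so the caller still gets something to work with.
    """
    last: Optional[Response] = None
    for backend in backends:
        last = send(backend, path)
        if last[0] not in BAD_STATUSES:
            return last  # good response: hand it to the customer
    if last is None:
        raise ValueError("no backends configured")
    return last


# Hypothetical transport: server-a is broken, every other server is healthy.
def fake_send(backend: str, path: str) -> Response:
    if backend == "server-a":
        return (500, b"Server Error")
    return (200, b"<html>ok</html>")


status, body = fetch_with_failover(["server-a", "server-b"], "/account", fake_send)
print(status)  # 200
```

One design point to keep in mind: silently replaying a request is only safe when the request can be repeated without side effects. A GET for 'logo.gif' usually can be; a request that submits a payment may not be.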

Friday, 26 February 2010

What is Maximum Uptime?

For those unfamiliar, it has nothing to do with how long I can ride a unicycle. Nor has it anything to do with when I first learned to ski. Maximum Uptime is what we should all strive for in the arena of Internet Service Delivery.

We live in a time when almost everything can be done on-line: paying bills, ordering food, buying a car and many other things. But the more society adopts on-line services, the greater the expectation that these services are always on. The web site must be up, the Point of Sale terminal must be able to authorise transactions, the email confirmation must be received. The ready accessibility of information has further fuelled the growth in expectations. Customers are fickle. They won't stand for delays. "Back in 5 mins" is NOT acceptable.

The solution is to implement systems based on Maximum Uptime.