Category Archives: SLA

INTERNAL AND EXTERNAL SLAs

A very important point to remember at all times is that your Internal SLA needs to be more aggressive than the one you are offering to your customers.
It sounds self-evident, doesn’t it? Yet there is no end of organizations that I’ve dealt with where customers are offered a 4hr SLA on a 24/7 basis and the engineers who can actually fix the problem are either unavailable until the next business day or NOT even on call!!!
Let me state this once again and very clearly so that there is NO CONFUSION … If you are offering your customers an SLA of ‘X Hours’ and your Engineering (or Development or Project Management, etc.) team is only offering you an SLA of ‘X + Y Hours’ … YOU WILL LOSE MONEY and YOU WILL LOSE CUSTOMERS!!!
It is imperative that your internal SLA be better than the one you are offering to your customers, and you need to ensure that your Sales team and Senior Management are both on board with this.
Remember, also, that this must go all the way up the chain. Say your Engineering team has agreed to an internal SLA of ‘X – Y Hours’ (woohoo!! That will solve 80% of your problems) but the Development team is only offering them an SLA of ‘Z’ (assume ‘Z’ is a multiple of ‘X + Y’). For the 20% of customers and problems that cannot be solved by your Tier 2 group (the Engineering team in this example), you are still going to be in trouble.
The question now becomes: how much are you and your company willing to invest in protecting yourself from that 20%? Just like everything else, there are things you can and cannot do, and you need to decide what investment will give you the best “bang for your buck”.
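If it helps to see the rule written down, here is a minimal sketch of the check described above: every team in the chain should commit to fewer hours than the customer was promised. The team names and hour values are made up for illustration, and `SlaLink` and `find_sla_gaps` are hypothetical helpers, not part of any tool.

```python
from dataclasses import dataclass


@dataclass
class SlaLink:
    """One team in the support chain and the hours it commits to (illustrative)."""
    team: str
    hours: float


def find_sla_gaps(customer_sla_hours, internal_chain):
    """Return the teams whose internal SLA is NOT tighter than the customer SLA."""
    return [link.team for link in internal_chain
            if link.hours >= customer_sla_hours]


# Assumed numbers: customers are promised 4 hours ('X').
chain = [
    SlaLink("Engineering (Tier 2)", 3),   # 'X - Y': tighter than the customer SLA
    SlaLink("Development", 16),           # 'Z': a multiple of the customer SLA
]

print(find_sla_gaps(4, chain))  # ['Development']
```

Any team that shows up in that list is a place where the promise made to the customer cannot be kept when the issue reaches that team.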

WHAT IS A HELPDESK?

OK, to start with, it’s not a desk that helps people! A help desk is a team of individuals (generally support staff) that provides solutions and resolutions to customers experiencing problems. Generally working at the 1st tier of the support model, they are responsible for Incident reporting and resolution rather than Problem Management (I shall discuss those terms in greater depth below).


What is an Incident?
Simply put, an Incident is anything related to customer contact. Incidents related to customers can be anything really: Information requests, Account Updates and Issue reporting are all examples of Incidents. Incidents can also be reported through a variety of different methods, including the phone (probably the most common), email (a close 2nd) and even chat. Automated monitoring tools can also generate incidents, and I will discuss those types of incidents in greater depth in later posts.


All of these different Incidents coming from different sources get routed to your Incident Management tool. For smaller teams, this could be something as simple as a spreadsheet, but in larger organizations either in-house custom-built applications or enterprise-level tools prevail.



Incident Management (in a nutshell)
Your helpdesk is responsible for reviewing the information in each of these incidents and checking if there is an appropriate solution already available to the customer. For example, where the customer wishes to update their Account Information, the helpdesk would look at the Incident, obtain the correct new information and (assuming that all appropriate security questions had been reviewed) log into the customer’s account and update the information. Once the information had been updated, they would inform the customer and then close the Incident. This is probably one of the simpler examples of an Incident from start to finish.


If the customer is reporting a problem or an issue, the Helpdesk staff are responsible for updating the Incident with all the relevant details as supplied by the customer. If the customer’s issue matches a known fix, they are able to inform the customer of that fix or supply it. If that is not the case, they would need to escalate the issue to the Problem Management team. The simplest way to think of the Incident Management (Helpdesk/Tier1) team and the issues they resolve is that if a “band-aid” exists they can apply it. If more drastic attention is required they will need to call the Doctor!
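The “band-aid or Doctor” decision can be sketched in a few lines. This is only an illustration under assumptions: the `KNOWN_FIXES` table and the keyword matching are hypothetical stand-ins for whatever knowledge base your Incident Management tool actually provides.

```python
# Hypothetical knowledge base: keyword -> known fix ("band-aid").
KNOWN_FIXES = {
    "password reset": "Send the self-service reset link.",
    "invoice copy": "Re-send the invoice from the billing portal.",
}


def triage(incident_summary):
    """Return (action, detail): apply a known fix, or escalate to Problem Management."""
    summary = incident_summary.lower()
    for keyword, fix in KNOWN_FIXES.items():
        if keyword in summary:
            return ("resolve", fix)
    # No band-aid exists, so call the Doctor.
    return ("escalate", "Problem Management")


print(triage("Customer needs a Password Reset"))
print(triage("Service unreachable since this morning"))
```

A real helpdesk would match on far more than a keyword, but the shape of the decision is the same: look for a known fix first, escalate only when there isn’t one.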



Problem Management
Problem Management is where the interesting work really happens. Incident Management, due to its repetitive nature, can get tedious and is definitely a drain on the more skilled staff in your organization. If you have people like that, think about moving them into Problem Management if you have such a team, or create one if you don’t! Problem Management is more in-depth. More often than not, a single Problem is the cause of multiple Incidents from multiple customers, so you want your best people at this level. Generally, you would consider this Tier 2 or Tier 3 from an escalation and staffing perspective and, depending on your product or service, you would have some very technically oriented people there. Their goal is not just to provide a band-aid, but rather to find out why the problem happened in the first place and fix it. Ideally, they should fix it in such a way as to ensure that it doesn’t happen again!!



KPIs
Now each of these teams would have different metrics in place. Obviously, your Tier1 team (Incident Management/Customer Service/Helpdesk) needs to get back to the customer in a timely manner. Their goal, as already mentioned, is to fix it, fix it fast and move on. A band-aid will not always reattach the finger though, so it’s up to the Tier2 team to ensure that the surgery goes smoothly, which obviously takes a lot more time as you don’t want the surgeon doing a shoddy job!




Response Time – So with that analogy in mind … you want to have an aggressive goal set for your Helpdesk. Try to work with the 80/20 rule: 80% of incidents responded to in 20 seconds, if you have the resources. Otherwise, maybe 20 minutes? Or 20 hours (that’s less than 1 day, so it might still be good, especially if you’re doing email support)? Or 20 days? Well, that’s probably not really worthwhile, but hopefully you get the point. You want to set a specific goal for measuring how quickly your customers are getting a response.
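To make the 80/20 goal measurable, something like the following would do. It is a rough sketch: the function name, the default target of 20 seconds and the sample response times are all assumptions for illustration.

```python
def response_goal_met(response_seconds, target_seconds=20, target_share=0.80):
    """Return (share_within_target, goal_met) for a list of response times in seconds."""
    if not response_seconds:
        return (0.0, False)
    within = sum(1 for t in response_seconds if t <= target_seconds)
    share = within / len(response_seconds)
    return (share, share >= target_share)


# Made-up sample: 8 of these 10 incidents were answered within 20 seconds.
sample = [5, 12, 18, 19, 20, 25, 8, 15, 40, 10]
share, met = response_goal_met(sample)
print(share, met)  # 0.8 True
```

Swap the target for 20 minutes or 20 hours as your resources allow; the point is that the goal is a specific number you can check every week.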



Resolve Time – notice that I have separated these out. As much as you’d like to be able to resolve 100% of issues at that first contact, it’s not always going to be possible. However, you can have another measurement in place that tracks this, which is the Resolve Time (sometimes called MTTR, or Mean Time to Repair). The goal here is also to get that band-aid on as quickly as possible, so you need to ensure that your Incident Management system has some sort of knowledge base which helps your staff find the solution to commonly raised issues and questions. If they have the answer every time, then 100% resolution at 1st contact is achievable! If not, however, it gets a bit more complicated, because all of a sudden your Incident Management team becomes the customer and the team they go to is the Problem Management team. Guess what? They have a different measurement for Response Time and Resolve Time too!
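Both numbers mentioned above, the mean Resolve Time and the share of incidents resolved at 1st contact, fall out of the same incident records. Here is a minimal sketch, assuming each incident is stored with an opened time, a resolved time and a count of customer contacts; the field names are hypothetical.

```python
from datetime import datetime


def resolve_metrics(incidents):
    """Return (mean resolve time in hours, first-contact resolution rate).

    Each incident is a dict with 'opened' and 'resolved' datetimes and
    'contacts', the number of customer contacts it took to resolve.
    """
    hours = [(i["resolved"] - i["opened"]).total_seconds() / 3600
             for i in incidents]
    mean_hours = sum(hours) / len(hours)
    first_contact = sum(1 for i in incidents if i["contacts"] == 1) / len(incidents)
    return mean_hours, first_contact


# Made-up sample data.
incidents = [
    {"opened": datetime(2024, 1, 8, 9, 0), "resolved": datetime(2024, 1, 8, 10, 0), "contacts": 1},
    {"opened": datetime(2024, 1, 8, 9, 0), "resolved": datetime(2024, 1, 8, 11, 0), "contacts": 1},
    {"opened": datetime(2024, 1, 8, 9, 0), "resolved": datetime(2024, 1, 8, 15, 0), "contacts": 3},
]
mean_hours, fcr = resolve_metrics(incidents)
print(mean_hours)  # 3.0
```

Tracking the two side by side shows whether a slow mean is caused by a few escalated issues or by a knowledge base that is missing the common answers.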


Problem Management Response Time – as previously mentioned, these are generally your more senior staff, and as much as you’d like them to be available 24/7, unless you have an extremely large organization this is fairly unlikely. So you are going to have to build or determine some relevant response times based on their availability. In addition, as these escalated issues are generally ones that cannot easily be resolved, your resolution time is going to be extended also. Pick some appropriate intervals that meet your customers’ SLAs. Your main goal for this team (in addition to resolving the problem, of course) is communication, communication, communication!!! They must inform your customer-facing agents what the issue is, what they are doing to resolve it and when they expect to have it resolved. If they cannot provide an estimated resolution time, they MUST provide your Tier1 team with an estimated update time.

Technical Support and Tiered Support Levels

In Customer Service and Technical Support it is all about getting that client issue to the right person (based on skills and language) as quickly as possible and ensuring that you meet or exceed your SLA. Now this can be accomplished through a variety of different methods and depending upon the size of your contact center, you should ensure that you explore some or all of them.

Training 

Probably the most important criterion is training. You need to ensure that you have explored the requirements and needs of your customers fully and that, based on these needs, the majority of your agents have the requisite skills to resolve their issues and assist them.

Determining Their Needs

If you do not know what your clients need then this is absolutely the first area of concern. You need to conduct surveys, analyse your historical incidents and contacts, and determine from that what your clients are going to be asking. You will find that there is a significant amount of repetition in client inquiries, and if you are in a business with a growing customer base you will see this repetition play out most frequently with new accounts. Once you know what they are going to be asking, you can put a training plan into place to ensure that you plug those holes. The sooner you are able to do this, the more satisfied your customers will be.

Tiered Support Model 

As important as training is, you are not going to be able to have all of your staff at the same level. This is actually not a bad thing, as the questions and queries that you will be receiving will also be at differing levels of complexity. By putting in place a plan that allows you to tier your teams based on their skills, not only are you being more efficient with your resources, but you are also building an escalation model and a promotion path into your support organization.

Tiered Support ensures that your training dollars are best spent where they are most useful and also allows you to offer your customers an increased level of service in various different fashions. 

Tiering Your Customers

As you might recall from my previous posts on the 80/20 rule, you are best served by distributing your clients based on their “value” to your business. As much as you might like to treat all clients the same, the unfortunate fact is that they are not! You will often find that 20% of your customers are responsible for 80% of your issues and also (perhaps more importantly!) that 20% of your clients are responsible for 80% of your revenue. Unfortunately, these two “circles” do not always overlap, and it is absolutely key that you determine which of your clients fit into which circle.
Once you have made that determination, however, things become much clearer and easier to handle. By putting your customers into tiers, you are able to offer the ones with higher value to your business a different path to the support that they need than the one your other customers use.
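One way to find the two “circles” is to rank customers by revenue and by issue count and see who makes up the first 80% of each. The sketch below does just that; the customer names, figures and the `top_contributors` helper are invented for illustration, and your own tiering criteria may well be richer than this.

```python
def top_contributors(values, share_percent=80):
    """Return the smallest set of keys, largest first, that reaches share_percent of the total."""
    total = sum(values.values())
    picked, running = set(), 0
    for name, value in sorted(values.items(), key=lambda kv: kv[1], reverse=True):
        if running * 100 >= share_percent * total:
            break
        picked.add(name)
        running += value
    return picked


# Made-up figures per customer.
revenue = {"A": 500, "B": 300, "C": 100, "D": 60, "E": 40}
issues = {"A": 5, "B": 10, "C": 60, "D": 20, "E": 5}

revenue_circle = top_contributors(revenue)   # who pays for most of the business
issue_circle = top_contributors(issues)      # who generates most of the incidents
print(sorted(revenue_circle), sorted(issue_circle), sorted(revenue_circle & issue_circle))
```

In this made-up data the two circles do not overlap at all, which is exactly the situation where a separate support path for the high-revenue customers pays off.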

What do you do when your Company is constantly having Outages?

It’s been a while for me, but there was a period of my life where I was working for a company that was in a constant state of outage.  They had a mix of services, and over the course of 2 years I was flown across the country and around the world apologizing for the (lack of) services that my company provided.  While I love traveling and accumulating Air Miles, this was not my idea of a trip, as you can imagine.

So what did I do right and wrong?  Well, I got very good at apologizing and groveling, and it helped me write my policy on dealing with Irate Customers.  While not exactly ideal, I definitely learned a lot from this experience and I definitely made a positive impact on my company’s bottom line.  How?

Well, simply put, the Customers stayed! As you can well imagine, when a business and its service are being impacted by a 3rd party, the natural inclination of anyone is to pull the service and move to another vendor.  When SLAs are constantly being missed and, month on month, services are not improving, this is even more likely.

Now – it’s easy to say that I “saved” the customers … but how?

Communication 

It’s easier to say than to do, especially when you don’t have any news or, even worse, when you have bad news (you expected it to be fixed in 1 day and it’s going to take 1 week!).  As I mentioned earlier, I quickly became skilled at speaking to Customers face-to-face, which happened with quite a few of our Tier1 customers.  I also became skilled at sending out mass emails, posting on message boards and forums, and making phone calls.  Setting a timeline for an update by any or all of these methods and then ensuring that I met that timeline was key.

Now communication is actually a two-way thing.  Speaking to the customers is great, but what if you don’t have anything to tell them?  Support and Helpdesk teams and Management frequently get the “short end” of the stick, without any updates from Engineering and Programming teams.  More often than not, these internal teams have no concept of the impact that the service interruptions are having on the customers.  It’s your job to persuade them that the customers MATTER and that the reason you and they have a job at your company is the money that your customers are paying!!  Customers will take their business away eventually if you don’t tell them what is going on.
OK, assuming that you’re talking to your customers and your other internal teams are talking to you … what’s next?  Well, you need to ensure that your company is actually doing something to fix the problem!!  The company that I mentioned with constant outages?  Well, the outages were all in different services … each time one thing was fixed, another in a different product was impacted.

From my point of view, it was 2 years of hell, but no single customer was impacted for that total amount of time.  It is still extremely draining on your staff, though, so how do you fix this?  Well, Quality Control is useful.  Make sure that any new product launches are properly tested and tested and tested again before being released into a live environment.  Try to get your staff to break it if possible while it’s in the testing phase.  Make sure your documentation, release notes, and training material are complete and accurate.

Ensure that Senior Management gets involved at the appropriate intervals based on your Escalation Matrix so that they are aware of the impact to the Customers … DO NOT be afraid of escalating.  If you are ON CALL 24/7 so are they!  The money will be released when the phone rings at 2am!

The Curse of the “berry”


Are you invaluable?  How about irreplaceable?  Will the world stop turning if you don’t pick up the phone or answer that email? No?

OK, so why are you ignoring your family (or friends or yourself??) to pick up the phone?  It’s very easy for companies to take advantage of employees, and even more so of managers who feel a personal responsibility for the performance of the team and department.  Now I’m not talking about those of you who get paid for being “on call” (unfortunately, I’ve found that Managers rarely get compensated for this) but rather the ones who don’t.

Companies need to understand and realize that employees’ lives and health are at stake and, for some of you (you know who you are), their family lives also.  Staff need time away from work and away from the stresses of the job, if for no other reason than to recharge their batteries for the next day.  In addition, if staff members are constantly contacted outside of regular business hours, then staffing and hiring need to be looked at and examined.

Management needs to create and have in place a proper escalation plan for customers, of course, and a Manager should be included in there at the appropriate level.  However, a Manager should not be the FINAL point of escalation, and if Customers matter (which all companies state, but few actually show), Senior Management should also form part of that plan and, in addition, perhaps appropriate out-of-hours coverage should be put into place!



I hope that this gives you the ammunition that you need in your discussions with Senior Management. If you need any help or have further suggestions, please feel free to contact me using the form on the right side of the page.