Incident Management
An incident is a disruption of normal service that affects users and the business. The goal of Incident Management is to restore IT services to their normal state as soon as possible, through workarounds or solutions, so that the disruption does not affect the business.
An incident is an event that is not part of standard operation; it is an event that you don’t want to happen but that eventually does. In simple words, Incident Management is a process to manage disruptions in critical IT services and restore them as soon as possible.
It may sound like a sugar-coated, sophisticated trouble-ticketing system. However, Incident Management tells you how to implement an IT Helpdesk that understands and works to meet business priorities.
Incident Management outlines the need to have a process to restore services. The Service Desk function is the glue that binds the Service Support modules together, gives the user a Single Point of Contact, and ensures that IT Services stay focused on business.
• Record Basic User Details
• Determine whether the user is reporting an outage or asking for a new service
• If the user is asking for a new service: New Service Request
o Train your helpdesk analysts to get back to users who ask for new services
o Train them to record details of requests with urgency and priority
o Train the helpdesk team to look for new service plans and milestones
o Train them on where they should look for answers to FAQs
• If the user is reporting an outage or disruption: Incident
o Determine whether it is an Incident or not with basic diagnosis
o Check whether you can help with a resolution from the knowledge base
o Assign Incident to Specialist Support Group
o Work closely with Specialist Support Group to provide resolution to the user
o Close the incident with user confirmation
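The triage steps above can be sketched in code. This is only a minimal illustration under stated assumptions, not a prescribed implementation: the names `Call`, `triage` and `KNOWLEDGE_BASE`, the status strings, and the sample knowledge-base entry are all hypothetical and do not come from any particular helpdesk product.

```python
from dataclasses import dataclass, field

# Hypothetical knowledge base: maps a symptom phrase to a documented resolution.
KNOWLEDGE_BASE = {
    "password expired": "Reset the password through the self-service portal.",
}


@dataclass
class Call:
    user: str              # basic user details
    description: str       # what the user reported
    is_new_service: bool   # True if the user is asking for a new service
    log: list = field(default_factory=list)


def triage(call: Call) -> str:
    """Route a call as the list above describes and return its status."""
    call.log.append(f"Recorded details for {call.user}")

    if call.is_new_service:
        call.log.append("Logged as New Service Request")
        return "service_request"

    # Outage or disruption: try the knowledge base first.
    text = call.description.lower()
    for symptom, resolution in KNOWLEDGE_BASE.items():
        if symptom in text:
            call.log.append(f"Resolved from knowledge base: {resolution}")
            return "resolved_pending_user_confirmation"

    # No known resolution: hand over to a specialist support group.
    call.log.append("Assigned to Specialist Support Group")
    return "assigned_to_specialists"
```

A call resolved from the knowledge base stays pending rather than closed, because the incident is closed only once the user confirms the resolution.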
Here’s a sample Incident Management workflow. Consider this a basic format and make changes wherever required.
Problem Management
The goal of Problem Management is to find the root cause of incidents and reduce the impact on business. Problem Management is a proactive approach that prevents recurrence of incidents.
Problem Management brings strategy to your helpdesk; it helps you move from firefighting mode to a proactive mode. In simple words, the disruptions faced by users are mostly different instances of the same problem. When you find and eliminate the root cause of the Incidents, you also prevent future incidents.
Record the Problem and Match with Known Error Database
A problem can be raised directly or by combining one or more Incidents. Once the problem is recorded, the problem technicians check whether it has been reported before and whether there is a known workaround or solution.
Problems that have Workaround/Solution: Known Error
If the reported problem has a workaround or solution, it is a Known Error. The Helpdesk technician can get back to the user with the workaround/solution. The technician needs to note that the problem has occurred and increase the problem count to measure the frequency of the problem.
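The lookup and the problem count can be sketched as a small in-memory Known Error Database. This is a rough illustration only: the class and method names are hypothetical, and matching on an exact, case-insensitive symptom signature is an assumption made to keep the example short. A real helpdesk tool would likely match on richer criteria.

```python
from dataclasses import dataclass


@dataclass
class KnownError:
    workaround: str
    occurrences: int = 0   # how many times the problem has been reported


class KnownErrorDatabase:
    def __init__(self):
        self._errors = {}   # symptom signature (lowercase) -> KnownError

    def add(self, signature: str, workaround: str) -> None:
        """Record a problem that has a workaround or solution."""
        self._errors[signature.lower()] = KnownError(workaround)

    def match(self, signature: str):
        """Return the workaround for a known error and count the occurrence.

        Returns None when the problem has not been recorded before.
        """
        error = self._errors.get(signature.lower())
        if error is None:
            return None
        error.occurrences += 1
        return error.workaround

    def frequency(self, signature: str) -> int:
        """Return how often a known error has been matched so far."""
        error = self._errors.get(signature.lower())
        return error.occurrences if error else 0
```

Counting inside `match` means every lookup that finds a Known Error also updates its frequency, so the technician cannot forget to record the occurrence.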
Classify the Problem to Determine the Right Priority
It is important to classify the problem by:
• Category, Sub Category and Item
• Business impact and urgency
The classification helps technicians to determine the priority of the problem.
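One common way to turn impact and urgency into a priority is a lookup matrix. The sketch below assumes a three-level scale and priority labels P1 to P5; both the scale and the specific mapping are illustrative assumptions, not something this document prescribes, and each organization would define its own.

```python
# Hypothetical rating scale for both business impact and urgency.
LEVELS = ("high", "medium", "low")

# Hypothetical matrix: (impact, urgency) -> priority label.
PRIORITY_MATRIX = {
    ("high", "high"): "P1",
    ("high", "medium"): "P2",
    ("medium", "high"): "P2",
    ("high", "low"): "P3",
    ("medium", "medium"): "P3",
    ("low", "high"): "P3",
    ("medium", "low"): "P4",
    ("low", "medium"): "P4",
    ("low", "low"): "P5",
}


def priority(impact: str, urgency: str) -> str:
    """Look up the priority for a business impact and an urgency rating."""
    key = (impact.lower(), urgency.lower())
    if key not in PRIORITY_MATRIX:
        raise ValueError(f"impact and urgency must each be one of {LEVELS}")
    return PRIORITY_MATRIX[key]
```

Keeping the mapping in a table rather than in nested conditions makes it easy to review and adjust when business priorities change.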
Analyze the Problem to Determine the Root Cause
When the problem is classified, it gives the problem technicians a clear picture of where they should start. Depending on whether the problem is in the user’s machine, in the proxy server or in the firewall, technicians may use various tools to diagnose and resolve it. The technician records all symptoms and the root cause along with a workaround or solution.
Provide Resolution or Initiate a Request for Change
Technicians can get back to users if there is a resolution readily available. If the problem requires a few changes in the system, they can provide a workaround and initiate a Request for Change.
For example: a group of users is not able to access the Internet, and the root cause is the firewall. Technicians can provide users with a workaround to access the Internet and initiate a Change Request to replace the firewall to prevent Internet unavailability in the future.
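The decision between answering the user directly and raising a Request for Change can be sketched as follows. The `Problem` fields and the `next_actions` function are hypothetical names chosen for illustration, and the "backup gateway" workaround in the usage example is invented; the source does not say what the workaround is.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Problem:
    summary: str
    root_cause: str
    workaround: Optional[str] = None   # known workaround or solution, if any
    needs_change: bool = False         # True if the fix requires a system change


def next_actions(problem: Problem) -> list:
    """List the follow-up actions for an analyzed problem."""
    actions = []
    if problem.workaround:
        actions.append(f"Send workaround to users: {problem.workaround}")
    if problem.needs_change:
        actions.append(f"Raise Request for Change: {problem.root_cause}")
    if not actions:
        actions.append("Continue root cause analysis")
    return actions


# Usage, modeled on the firewall example (the workaround text is invented).
firewall_problem = Problem(
    summary="Users cannot access the Internet",
    root_cause="firewall",
    workaround="Use the backup gateway",
    needs_change=True,
)
print(next_actions(firewall_problem))
```

In the firewall case both actions apply: users receive the workaround immediately, while the Request for Change addresses the root cause.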
Closing the Problem
Although the problem technicians close the problem, it is the responsibility of helpdesk engineers or frontline support staff to update users about all the activities. When users have a single point of contact, they don’t have to repeat themselves to different technicians. Also, the frontline staff who logged the call ensure that the solution meets the users’ needs exactly.