Incident response is a structured process organizations use to identify and deal with cybersecurity incidents. The response has several stages: preparation; detection and analysis; containment, eradication, and recovery; and post-incident analysis and learning.
This post is a shorter summary of the official NIST documentation (https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-61r2.pdf).
Establishing an incident response capability should include the following actions:
- Creating an incident response policy and plan
- Developing procedures for performing incident handling and reporting
- Setting guidelines for communicating with outside parties regarding incidents
- Selecting a team structure and staffing model
- Establishing relationships and lines of communication between the incident response team and other groups, both internal (e.g., legal department) and external (e.g., law enforcement agencies)
- Determining what services the incident response team should provide
- Staffing and training the incident response team
Organizations should reduce the frequency of incidents by effectively securing networks, systems, and applications.
Preventing problems is often less costly and more effective than reacting to them after they occur. Thus, incident prevention is an important complement to an incident response capability. Incident handling can be performed more effectively if organizations complement their incident response capability with adequate resources to actively maintain the security of networks, systems, and applications. This includes training IT staff on complying with the organization’s security standards and making users aware of policies and procedures regarding appropriate use of networks, systems, and applications.
Organizations should document their guidelines for interactions with other organizations regarding incidents.
During incident handling, the organization will need to communicate with outside parties, such as other incident response teams, law enforcement, the media, vendors, and victim organizations. Because these communications often need to occur quickly, organizations should predetermine communication guidelines so that only the appropriate information is shared with the right parties.
Organizations should be generally prepared to handle any incident but should focus on being prepared to handle incidents that use common attack vectors.
Incidents can occur in countless ways, so it is infeasible to develop step-by-step instructions for handling every incident. The NIST publication defines several types of incidents, based on common attack vectors. Different types of incidents merit different response strategies.
What is the difference between an attack vector, attack surface and data breach?
- Attack vector: A method or way an attacker can gain unauthorized access to a network or computer system.
- Attack surface: The total number of attack vectors an attacker can use to manipulate a network or computer system or extract data.
- Data breach: Any security incident where sensitive, protected, or confidential data is accessed or stolen by an unauthorized party.
The attack vectors are:
- External/Removable Media: An attack executed from removable media (e.g., flash drive, CD) or a peripheral device.
- Attrition: An attack that employs brute force methods to compromise, degrade, or destroy systems, networks, or services.
- Web: An attack executed from a website or web-based application.
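- Email: An attack executed via an email message or attachment.
- Impersonation: An attack involving replacement of something benign with something malicious (e.g., spoofing, man-in-the-middle attacks, rogue wireless access points, SQL injection).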
- Improper Usage: Any incident resulting from violation of an organization’s acceptable usage policies by an authorized user, excluding the above categories.
- Loss or Theft of Equipment: The loss or theft of a computing device or media used by the organization, such as a laptop or smartphone.
- Other: An attack that does not fit into any of the other categories.
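Because different attack vectors merit different response strategies, a team might record the vector on every incident and use it to pick a starting playbook. The Python sketch below is one hypothetical way to do that. The enum mirrors the vector categories in SP 800-61r2 (including Impersonation), while the playbook file names and the `playbook_for` helper are made up for illustration and are not part of the NIST guidance.

```python
from enum import Enum


class AttackVector(Enum):
    """Common attack vector categories from NIST SP 800-61r2."""
    REMOVABLE_MEDIA = "External/Removable Media"
    ATTRITION = "Attrition"
    WEB = "Web"
    EMAIL = "Email"
    IMPERSONATION = "Impersonation"
    IMPROPER_USAGE = "Improper Usage"
    EQUIPMENT_LOSS = "Loss or Theft of Equipment"
    OTHER = "Other"


# Hypothetical playbook names; a real team would point at its own SOPs.
PLAYBOOKS = {
    AttackVector.REMOVABLE_MEDIA: "sop-removable-media.md",
    AttackVector.ATTRITION: "sop-ddos-and-brute-force.md",
    AttackVector.WEB: "sop-web-compromise.md",
    AttackVector.EMAIL: "sop-phishing.md",
    AttackVector.EQUIPMENT_LOSS: "sop-lost-device.md",
}

# Used for any vector without a dedicated playbook (an assumption of this sketch).
GENERIC_PLAYBOOK = "sop-generic-incident.md"


def playbook_for(vector: AttackVector) -> str:
    """Return the playbook for a vector, falling back to the generic one."""
    return PLAYBOOKS.get(vector, GENERIC_PLAYBOOK)
```

Keeping a generic fallback reflects the point above: it is infeasible to write step-by-step instructions for every incident, so only the common vectors get their own playbook.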
Organizations should emphasize the importance of incident detection and analysis throughout the organization.
Organizations should establish logging standards and procedures to ensure that adequate information is collected by logs and security software and that the data is reviewed regularly.
Automation is needed to perform an initial analysis of the data and select events of interest for human review. Event correlation software can be of great value in automating the analysis process. However, the effectiveness of the process depends on the quality of the data that goes into it.
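As a rough illustration of automated initial analysis, the Python sketch below selects sources with repeated failed logins in a short time window so that a human can review them. The event format, the `login_failed` type name, the threshold, and the window are all assumptions made for this example; real input would come from the organization's own logs or security software.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical, already-parsed log events (assumed format).
EVENTS = [
    {"time": datetime(2024, 1, 1, 9, 0), "source": "10.0.0.5", "type": "login_failed"},
    {"time": datetime(2024, 1, 1, 9, 1), "source": "10.0.0.5", "type": "login_failed"},
    {"time": datetime(2024, 1, 1, 9, 2), "source": "10.0.0.5", "type": "login_failed"},
    {"time": datetime(2024, 1, 1, 9, 0), "source": "10.0.0.9", "type": "login_failed"},
    {"time": datetime(2024, 1, 1, 10, 0), "source": "10.0.0.9", "type": "login_failed"},
    {"time": datetime(2024, 1, 1, 11, 0), "source": "10.0.0.9", "type": "login_failed"},
    {"time": datetime(2024, 1, 1, 9, 3), "source": "10.0.0.7", "type": "login_ok"},
]


def events_of_interest(events, threshold=3, window=timedelta(minutes=5)):
    """Return sources with at least `threshold` failed logins inside any `window`."""
    if threshold < 1:
        raise ValueError("threshold must be at least 1")

    # Group failed-login timestamps by source address.
    by_source = defaultdict(list)
    for event in events:
        if event["type"] == "login_failed":
            by_source[event["source"]].append(event["time"])

    flagged = []
    for source, times in by_source.items():
        times.sort()
        # Check every run of `threshold` consecutive failures.
        for i in range(len(times) - threshold + 1):
            if times[i + threshold - 1] - times[i] <= window:
                flagged.append(source)
                break
    return sorted(flagged)


if __name__ == "__main__":
    print(events_of_interest(EVENTS))  # ['10.0.0.5']
```

Only `10.0.0.5` is flagged: `10.0.0.9` also has three failures, but they are spread over two hours. The result is only as good as the logged data, which is why the logging standards mentioned above matter.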
Organizations should create written guidelines for prioritizing incidents.
Incidents should be prioritized based on the relevant factors, such as:
- the functional impact of the incident (the current and likely future negative impact on business functions)
- the information impact of the incident (effect on the confidentiality, integrity, and availability of the organization’s information)
- the recoverability from the incident (the time and types of resources that must be spent on recovering from the incident)
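Written prioritization guidelines can be as simple as a small scoring rule over these three factors. The Python sketch below shows one possible shape for such a rule. The 0 to 3 rating scale, the summing rule, and the thresholds are assumptions made for illustration; they are not taken from the NIST guide, and each organization would choose its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IncidentRating:
    """Each factor is rated 0 (none) to 3 (severe); the scale is an assumption."""
    functional_impact: int
    information_impact: int
    recoverability: int  # higher means harder or slower to recover


def priority(rating: IncidentRating) -> str:
    """Map the three factor ratings to a priority label using made-up thresholds."""
    factors = (
        rating.functional_impact,
        rating.information_impact,
        rating.recoverability,
    )
    for value in factors:
        if not 0 <= value <= 3:
            raise ValueError("ratings must be between 0 and 3")

    score = sum(factors)
    if score >= 7:
        return "critical"
    if score >= 4:
        return "high"
    if score >= 2:
        return "medium"
    return "low"
```

For example, `priority(IncidentRating(3, 3, 2))` gives `"critical"`, while `priority(IncidentRating(1, 0, 0))` gives `"low"`. A plain sum treats the factors as equally important; a team that cares most about information impact could weight that factor instead.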
Organizations should use the lessons learned process to gain value from incidents.
After a major incident has been handled, the organization should hold a lessons learned meeting to review the effectiveness of the incident handling process and identify necessary improvements to existing security controls and practices.
Organizing a Computer Security Incident Response Capability
- One of the first considerations should be to create an organization-specific definition of the term “incident” so that the scope of the term is clear.
- The organization should decide what services the incident response team should provide, consider which team structures and models can provide those services, and select and implement one or more incident response teams.
- Incident response plan, policy, and procedure creation is an important part of establishing a team, so that incident response is performed effectively, efficiently, and consistently.
The plan, policies, and procedures should reflect the team’s interactions with other teams within the organization as well as with outside parties, such as law enforcement, the media, and other incident response organizations.
Events and Incidents
An event is any observable occurrence in a system or network. Events include
- a user connecting to a file share
- a server receiving a request for a web page
- a user sending email
- a firewall blocking a connection attempt.
Adverse events are events with a negative consequence, such as
- system crashes
- packet floods
- unauthorized use of system privileges
- unauthorized access to sensitive data
- execution of malware that destroys data.
A computer security incident is a violation or imminent threat of violation of computer security policies, acceptable use policies, or standard security practices. Examples of incidents are:
- An attacker commands a botnet to send high volumes of connection requests to a web server, causing it to crash.
- Users are tricked into opening a “quarterly report” sent via email that is actually malware; running the tool has infected their computers and established connections with an external host.
- An attacker obtains sensitive data and threatens that the details will be released publicly if the organization does not pay a designated sum of money.
- A user provides or exposes sensitive information to others through peer-to-peer file sharing services.
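The three definitions above nest in a way that is easy to lose in prose, so here is a small Python sketch of the distinction. The `Observation` type, its field names, and the `classify` helper are hypothetical names chosen for this example; in practice, deciding whether something violates policy is a judgment made by an analyst, not a boolean flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """An observable occurrence in a system or network (hypothetical model)."""
    description: str
    negative_consequence: bool = False
    violates_policy: bool = False      # violates security or acceptable use policy
    imminent_violation: bool = False   # credible threat that a violation is about to occur


def classify(obs: Observation) -> str:
    """Label an observation as an event, an adverse event, or an incident."""
    if obs.violates_policy or obs.imminent_violation:
        return "incident"
    if obs.negative_consequence:
        return "adverse event"
    return "event"
```

Under this sketch, a user connecting to a file share is an `"event"`, a system crash with no policy violation is an `"adverse event"`, and a botnet flood that crashes a web server is an `"incident"`.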
Benefits from Incident Response
- Supports responding to incidents systematically (i.e., following a consistent incident handling methodology) so that the appropriate actions are taken.
- Helps personnel minimize loss or theft of information and disruption of services caused by incidents.
- Makes it possible to use information gained during incident handling to better prepare for handling future incidents and to provide stronger protection for systems and data.
- Helps in dealing properly with legal issues that may arise during incidents.
Incident Response Policy, Plan, and Procedure Creation
Policy governing incident response is highly individualized to the organization. However, most policies include the same key elements:
- Statement of management commitment
- Purpose and objectives of the policy
- Scope of the policy (to whom and what it applies and under what circumstances)
- Definition of computer security incidents and related terms
- Organizational structure and definition of roles, responsibilities, and levels of authority; should include the authority of the incident response team to confiscate or disconnect equipment and to monitor suspicious activity, the requirements for reporting certain types of incidents, the requirements and guidelines for external communications and information sharing (e.g., what can be shared with whom, when, and over what channels), and the handoff and escalation points in the incident management process
- Prioritization or severity ratings of incidents
- Performance measures
- Reporting and contact forms.
Organizations should have a formal, focused, and coordinated approach to responding to incidents, including an incident response plan that provides the roadmap for implementing the incident response capability. Each organization needs a plan that meets its unique requirements, which relate to the organization’s mission, size, structure, and functions. The plan should lay out the necessary resources and management support. The incident response plan should include the following elements:
- Strategies and goals
- Senior management approval
- Organizational approach to incident response
- How the incident response team will communicate with the rest of the organization and with other organizations
- Metrics for measuring the incident response capability and its effectiveness
- Roadmap for maturing the incident response capability
- How the program fits into the overall organization.
The organization’s mission, strategies, and goals for incident response should help in determining the structure of its incident response capability.
Once an organization develops a plan and gains management approval, the organization should implement the plan and review it at least annually to ensure the organization is following the roadmap for maturing the capability and fulfilling its goals for incident response.
Procedures should be based on the incident response policy and plan. Standard operating procedures (SOPs) are a delineation of the specific technical processes, techniques, checklists, and forms used by the incident response team.
SOPs should be tested to validate their accuracy and usefulness, then distributed to all team members. Training should be provided for SOP users; the SOP documents can be used as an instructional tool.
Sharing Information With Outside Parties
Organizations often need to communicate with outside parties regarding an incident, and they should do so whenever appropriate. Examples include:
- contacting law enforcement
- fielding media inquiries
- seeking external expertise
- discussing the incident with other involved parties, such as Internet service providers (ISPs), the vendor of vulnerable software, or other incident response teams.
The incident response team should discuss information sharing with the organization’s public affairs office, legal department, and management before an incident occurs to establish policies and procedures regarding information sharing.
The team should document all contacts and communications with outside parties for liability and evidentiary purposes.
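One lightweight way to document outside contacts is a structured, append-only log with one entry per communication. The Python sketch below shows a possible record format. The field names, the `IR-0001` style incident ID, and the JSON Lines layout are assumptions for illustration; the guidance only says that contacts should be documented, not how.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class ContactRecord:
    """One communication with an outside party (hypothetical fields)."""
    incident_id: str
    party: str         # e.g. "law enforcement", "ISP", "media"
    contact_name: str
    channel: str       # e.g. "phone", "email"
    summary: str
    recorded_by: str
    timestamp: str     # ISO 8601, UTC


def new_record(incident_id, party, contact_name, channel, summary,
               recorded_by, now=None):
    """Build a record, stamping it with the current UTC time unless `now` is given."""
    now = now or datetime.now(timezone.utc)
    return ContactRecord(incident_id, party, contact_name, channel,
                         summary, recorded_by, now.isoformat())


def to_log_line(record: ContactRecord) -> str:
    """Serialize a record as one JSON line, ready to append to a log file."""
    return json.dumps(asdict(record), sort_keys=True) + "\n"


if __name__ == "__main__":
    record = new_record("IR-0001", "ISP", "J. Doe", "phone",
                        "Asked for upstream filtering", "analyst1")
    print(to_log_line(record), end="")
```

Appending lines rather than editing earlier ones keeps the history intact, which suits the evidentiary purpose. Storing the log where handlers cannot silently alter it would be a sensible next step, though that is outside this sketch.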
The following sections provide guidelines on communicating with several types of outside parties.
For discussing incidents with the media, organizations often find it beneficial to designate a single point of contact (POC) and at least one backup contact. The following actions help prepare these designated contacts:
- Conduct training sessions on interacting with the media regarding incidents, which should include the importance of not revealing sensitive information, such as technical details of countermeasures that could assist other attackers, and the positive aspects of communicating important information to the public fully and effectively.
- Establish procedures to brief media contacts on the issues and sensitivities regarding a particular incident before discussing it with the media.
- Maintain a statement of the current status of the incident so that communications with the media are consistent and up-to-date.
- Remind all staff of the general procedures for handling media inquiries.
One reason that many security-related incidents do not result in convictions is that some organizations do not properly contact law enforcement. Several levels of law enforcement are available to investigate incidents, for example:
- Federal investigatory agencies (e.g., the Federal Bureau of Investigation [FBI] and the U.S. Secret Service)
- district attorney offices
- state law enforcement
- local law enforcement
Law enforcement should be contacted through designated individuals in a manner consistent with the requirements of the law and the organization’s procedures.
The person designated to be the primary POC should be familiar with the reporting procedures for all relevant law enforcement agencies and well prepared to recommend which agency, if any, should be contacted.
Incident Reporting Organizations
FISMA requires Federal agencies to report incidents to the United States Computer Emergency Readiness Team (US-CERT), which is a government-wide incident response organization that assists Federal civilian agencies in their incident handling efforts.
Each agency must designate a primary and secondary POC with US-CERT and report all incidents consistent with the agency’s incident response policy. Organizations should create a policy that states who is designated to report incidents and how the incidents should be reported.
Requirements, categories, and timeframes for reporting incidents to US-CERT are on the US-CERT website.
Other Outside Parties
An organization may want to discuss incidents with other groups, including those listed below. When reaching out to these external parties, an organization may want to work through US-CERT or its ISAC, as a “trusted introducer” to broker the relationship.
- Organization’s ISP. An organization may need assistance from its ISP in blocking a major network-based attack or tracing its origin.
- Owners of Attacking Addresses. If attacks are originating from an external organization’s IP address space, incident handlers may want to talk to the designated security contacts for the organization to alert them to the activity or to ask them to collect evidence. It is highly recommended to coordinate such communications with US-CERT or an ISAC.
- Software Vendors. Incident handlers may want to speak to a software vendor about suspicious activity. This contact could include questions regarding the significance of certain log entries or known false positives for certain intrusion detection signatures, where minimal information regarding the incident may need to be revealed.
- Other Incident Response Teams. An organization may experience an incident that is similar to ones handled by other teams; proactively sharing information can facilitate more effective and efficient incident handling.
- Affected External Parties. An incident may affect external parties directly; for example, an attacker may gain access to sensitive information regarding them, such as credit card information. In some jurisdictions, organizations are required to notify all parties that are affected by such an incident.
Incident Response Team Structure
An incident response team should be available for anyone who discovers or suspects that an incident involving the organization has occurred. One or more team members, depending on the magnitude of the incident and availability of personnel, will then handle the incident.
The incident handlers analyze the incident data, determine the impact of the incident, and act appropriately to limit the damage and restore normal services.
The incident response team’s success depends on the participation and cooperation of individuals throughout the organization.
Possible structures for an incident response team include:
- Central Incident Response Team. A single incident response team handles incidents throughout the organization. This model is effective for small organizations and for organizations with minimal geographic diversity in terms of computing resources.
- Distributed Incident Response Teams. The organization has multiple incident response teams, each responsible for a particular logical or physical segment of the organization. This model is effective for large organizations. The teams should be part of a single coordinated entity so that the incident response process is consistent across the organization and information is shared among teams. This is particularly important because multiple teams may see components of the same incident or may handle similar incidents.
- Coordinating Team. An incident response team provides advice to other teams without having authority over those teams.
Incident response team staffing models
- Employees (Internal). The organization performs all of its incident response work, with limited technical and administrative support from contractors.
- Partially Outsourced. The organization outsources portions of its incident response work. Although incident response duties can be divided among the organization and one or more outsourcers in many ways, a few arrangements have become commonplace:
  - The most prevalent arrangement is for the organization to outsource 24-hours-a-day, 7-days-a-week (24/7) monitoring of intrusion detection sensors, firewalls, and other security devices to an offsite managed security services provider (MSSP).
  - Some organizations perform basic incident response work in-house and call on contractors to assist with handling incidents, particularly those that are more serious or widespread.
- Fully Outsourced. The organization completely outsources its incident response work, typically to an onsite contractor. This model is most likely to be used when the organization needs a full-time, onsite incident response team but does not have enough available, qualified employees. It is assumed that the organization will have employees supervising and overseeing the outsourcer’s work.
Team Model Selection considerations
- The Need for 24/7 Availability. This typically means that incident handlers can be contacted by phone, but it can also mean that an onsite presence is required. Real-time availability is the best for incident response because the longer an incident lasts, the more potential there is for damage and loss.
- Full-Time Versus Part-Time Team Members. Organizations with limited funding, staffing, or incident response needs may have only part-time incident response team members, serving as more of a virtual incident response team. In this case, the incident response team can be thought of as a volunteer fire department. When an emergency occurs, the team members are contacted rapidly, and those who can assist do so. An existing group such as the IT help desk can act as a first POC for incident reporting.
- Employee Morale. Incident response work is very stressful, as are the on-call responsibilities of most team members. This combination makes it easy for incident response team members to become overly stressed. Many organizations will also struggle to find willing, available, experienced, and properly skilled people to participate, particularly in 24-hour support.
- Cost. Cost is a major factor, especially if employees are required to be onsite 24/7. Organizations may fail to include incident response-specific costs in budgets, such as sufficient funding for training and maintaining skills. Because the incident response team works with so many facets of IT, its members need much broader knowledge than most IT staff members.
- Staff Expertise. Incident handling requires specialized knowledge and experience in several technical areas; the breadth and depth of knowledge required varies based on the severity of the organization’s risks. Outsourcers may possess deeper knowledge of intrusion detection, forensics, vulnerabilities, exploits, and other aspects of security than employees of the organization.
When considering outsourcing, organizations should keep these issues in mind:
- Current and Future Quality of Work. Organizations should consider not only the current quality (breadth and depth) of the outsourcer’s work, but also efforts to ensure the quality of future work.
- Division of Responsibilities. Organizations are often unwilling to give an outsourcer authority to make operational decisions for the environment (e.g., disconnecting a web server). It is important to document the appropriate actions for these decision points.
- Sensitive Information Revealed to the Contractor. Dividing incident response responsibilities and restricting access to sensitive information can limit what the contractor sees.
- Lack of Organization-Specific Knowledge. Accurate analysis and prioritization of incidents depend on specific knowledge of the organization’s environment. The organization should provide the outsourcer with regularly updated documents that define what incidents it is concerned about, which resources are critical, and what the level of response should be under various sets of circumstances. The organization should also report all changes and updates made to its IT infrastructure, network configuration, and systems.
- Lack of Correlation. Correlation among multiple data sources is very important. If the intrusion detection system records an attempted attack against a web server, but the outsourcer has no access to the server’s logs, it may be unable to determine whether the attack was successful. To be efficient, the outsourcer will require remote administrative access to critical systems and security device logs over a secure channel.
- Handling Incidents at Multiple Locations. Effective incident response work often requires a physical presence at the organization’s facilities. If the outsourcer is offsite, consider where the outsourcer is located, how quickly it can have an incident response team at any facility, and how much this will cost.
- Maintaining Incident Response Skills In-House. Organizations that completely outsource incident response should strive to maintain basic incident response skills in-house. Situations may arise in which the outsourcer is unavailable, so the organization should be prepared to perform its own incident handling.
Incident Response Personnel
- A single employee, with one or more designated alternates, should be in charge of incident response.
- In a fully outsourced model, this person oversees and evaluates the outsourcer’s work. All other models generally have a team manager and one or more deputies who assume authority in the absence of the team manager.
- The managers typically perform a variety of tasks, including acting as a liaison with upper management and other teams and organizations, defusing crisis situations, and ensuring that the team has the necessary personnel, resources, and skills.
- Some teams also have a technical lead—a person with strong technical skills and incident response experience who assumes oversight of and final responsibility for the quality of the team’s technical work.
- Members of the incident response team should have excellent technical skills, such as system administration, network administration, programming, technical support, or intrusion detection. Every team member should have good problem solving skills and critical thinking abilities.
- Teamwork skills are of fundamental importance because cooperation and coordination are necessary for successful incident response.
- Every team member should also have good communication skills. Speaking skills are important because the team will interact with a wide variety of people, and writing skills are important when team members are preparing advisories and procedures.
It is important to provide team members with opportunities for learning and growth. Suggestions for building and maintaining skills are as follows:
- Budget enough funding to maintain, enhance, and expand proficiency in technical areas and security disciplines, as well as less technical topics such as the legal aspects of incident response. This should include sending staff to conferences and encouraging or otherwise incentivizing participation in conferences, ensuring the availability of technical references that promote deeper technical understanding, and occasionally bringing in outside experts (contractors) with deep technical knowledge in needed areas as funding permits.
- Give team members opportunities to perform other tasks, such as creating educational materials, conducting security awareness workshops, and performing research.
- Consider rotating staff members in and out of the incident response team, and participate in exchanges in which team members temporarily trade places with others (e.g., network administrators) to gain new technical skills.
- Maintain sufficient staffing so that team members can have uninterrupted time off work (e.g., vacations).
- Create a mentoring program to enable senior technical staff to help less experienced staff learn incident handling.
- Develop incident handling scenarios and have the team members discuss how they would handle them.
Dependencies within Organizations
It is important to identify other groups within the organization that may need to participate in incident handling so that their cooperation can be solicited before it is needed. Every incident response team relies on the expertise, judgment, and abilities of others, including:
- Management. Management establishes incident response policy, budget, and staffing. Ultimately, management is held responsible for coordinating incident response among various stakeholders, minimizing damage, and reporting to Congress, OMB, the Government Accountability Office (GAO), and other parties.
- Information Assurance. Information security staff members may be needed during certain stages of incident handling (prevention, containment, eradication, and recovery)—for example, to alter network security controls (e.g., firewall rulesets).
- IT Support. IT technical experts (e.g., system and network administrators) not only have the needed skills to assist but also usually have the best understanding of the technology they manage on a daily basis. This understanding can ensure that the appropriate actions are taken for the affected system, such as whether to disconnect an attacked system.
- Legal Department. Legal experts should review incident response plans, policies, and procedures to ensure their compliance with law and Federal guidance, including the right to privacy. In addition, the guidance of the general counsel or legal department should be sought if there is reason to believe that an incident may have legal ramifications, including evidence collection, prosecution of a suspect, or a lawsuit, or if there may be a need for a memorandum of understanding (MOU) or other binding agreements involving liability limitations for information sharing.
- Public Affairs and Media Relations. Depending on the nature and impact of an incident, a need may exist to inform the media and, by extension, the public.
- Human Resources. If an employee is suspected of causing an incident, the human resources department may be involved—for example, in assisting with disciplinary proceedings.
- Business Continuity Planning. Organizations should ensure that incident response policies and procedures and business continuity processes are in sync. Computer security incidents undermine the business resilience of an organization. Business continuity planning professionals should be made aware of incidents and their impacts so they can fine-tune business impact assessments, risk assessments, and continuity of operations plans.
- Physical Security and Facilities Management. Some computer security incidents occur through breaches of physical security or involve coordinated logical and physical attacks. The incident response team also may need access to facilities during incident handling—for example, to acquire a compromised workstation from a locked office.
Incident Response Team Services
The main focus of an incident response team is performing incident response, but it is fairly rare for a team to perform incident response only. The following are examples of other services a team might offer:
- Intrusion Detection. The first tier of an incident response team often assumes responsibility for intrusion detection. The team generally benefits because it should be poised to analyze incidents more quickly and accurately, based on the knowledge it gains of intrusion detection technologies.
- Advisory Distribution. A team may issue advisories within the organization regarding new vulnerabilities and threats. Advisories are often most necessary when new threats are emerging, such as a high-profile social or political event (e.g., celebrity wedding) that attackers are likely to leverage in their social engineering. Only one group within the organization should distribute computer security advisories to avoid duplicated effort and conflicting information.
- Education and Awareness. Education and awareness are resource multipliers—the more the users and technical staff know about detecting, reporting, and responding to incidents, the less drain there should be on the incident response team. This information can be communicated through many means: workshops, websites, newsletters, posters, and even stickers on monitors and laptops.
- Information Sharing. Incident response teams often participate in information sharing groups, such as ISACs or regional partnerships. Accordingly, incident response teams often manage the organization’s incident information sharing efforts, such as aggregating information related to incidents and effectively sharing that information with other organizations, as well as ensuring that pertinent information is shared within the enterprise.
Recommendations
- Establish a formal incident response capability. Organizations should be prepared to respond quickly and effectively when computer security defenses are breached. FISMA requires Federal agencies to establish incident response capabilities.
- Create an incident response policy. The incident response policy is the foundation of the incident response program. It defines which events are considered incidents, establishes the organizational structure for incident response, defines roles and responsibilities, and lists the requirements for reporting incidents, among other items.
- Develop an incident response plan based on the incident response policy. The incident response plan provides a roadmap for implementing an incident response program based on the organization’s policy. The plan indicates both short- and long-term goals for the program, including metrics for measuring the program. The incident response plan should also indicate how often incident handlers should be trained and the requirements for incident handlers.
- Develop incident response procedures. The incident response procedures provide detailed steps for responding to an incident. The procedures should cover all the phases of the incident response process. The procedures should be based on the incident response policy and plan.
- Establish policies and procedures regarding incident-related information sharing. The organization should communicate appropriate incident details with outside parties, such as the media, law enforcement agencies, and incident reporting organizations. The incident response team should discuss this with the organization’s public affairs office, legal department, and management to establish policies and procedures regarding information sharing. The team should comply with existing organization policy on interacting with the media and other outside parties.
- Provide pertinent information on incidents to the appropriate organization. Federal civilian agencies are required to report incidents to US-CERT; other organizations can contact US-CERT and/or their ISAC. Reporting is beneficial because US-CERT and the ISACs use the reported data to provide information to the reporting parties regarding new threats and incident trends.
- Consider the relevant factors when selecting an incident response team model. Organizations should carefully weigh the advantages and disadvantages of each possible team structure model and staffing model in the context of the organization’s needs and available resources.
- Select people with appropriate skills for the incident response team. The credibility and proficiency of the team depend to a large extent on the technical skills and critical thinking abilities of its members. Critical technical skills include system administration, network administration, programming, technical support, and intrusion detection. Teamwork and communications skills are also needed for effective incident handling. Necessary training should be provided to all team members.
- Identify other groups within the organization that may need to participate in incident handling. Every incident response team relies on the expertise, judgment, and abilities of other teams, including management, information assurance, IT support, legal, public affairs, and facilities management.
- Determine which services the team should offer. Although the main focus of the team is incident response, most teams perform additional functions. Examples include monitoring intrusion detection sensors, distributing security advisories, and educating users on security.