graphgrc

Incident Response Process

Process for responding to security incidents from detection through resolution.

Roles and Responsibilities

Incident Commander (IC): Leads response, coordinates teams, makes decisions (Security Team Lead)
Security Team: Investigation, containment, forensics
Infrastructure Team: System remediation, infrastructure changes
Engineering Team: Application-level remediation, code changes
Legal: Regulatory compliance, external notifications
Executive Team: Business decisions, customer communication

Prerequisites

Incident response plan approved and documented
On-call rotation defined for security team
Communication channels established (Slack #security-incidents, PagerDuty)
Incident response runbooks and playbooks available

Process Steps

Step 1: Detection and Reporting

Incident is detected via automated alert or reported by user/third party.

Detection sources:

SIEM alerts (GuardDuty, CloudWatch)
User reports (phishing, suspicious activity)
Vendor notification
Penetration test findings
Security researcher disclosure

Actions:

Create incident ticket
Page on-call security engineer for Sev 1/2
Post to #security-incidents Slack channel

Owner: Anyone (reporter), Security Team (triage) Duration: Immediate

Step 2: Initial Assessment

On-call security engineer performs initial triage and assigns severity.

Assessment criteria:

Data confidentiality impact (is data compromised?)
System integrity impact (is code/config modified?)
Availability impact (is service down?)
Scope (one user vs. all customers?)

Severity assignment: See incident-response-standard.md

Actions:

Assign incident commander (IC)
Escalate to security team lead if Sev 1/2
Notify infrastructure/engineering if needed

Owner: Security Team Duration: 15 minutes for Sev 1/2, 1 hour for Sev 3/4

Step 3: Containment

IC coordinates immediate containment to prevent further damage.

Containment actions (examples):

Isolate compromised EC2 instance (security group rules to deny all)
Revoke compromised credentials (IAM user, API keys)
Block attacker IP at WAF/CloudFront
Disable compromised user accounts
Take snapshot of affected systems before remediation (forensics)

Goal: Stop the bleeding, preserve evidence

Owner: Security Team, Infrastructure Team Duration: 15-30 minutes for Sev 1, 1-2 hours for Sev 2/3

Step 4: Investigation

Security team investigates root cause and determines full scope.

Investigation tasks:

Review logs (CloudTrail, application logs, VPC Flow Logs)
Identify entry vector (how did attacker get in?)
Determine timeline (when did breach start?)
Assess data accessed (query audit logs for data access)
Check for persistence mechanisms (backdoors, additional compromised accounts)

Tools: Log analysis (CloudWatch Insights, Athena), forensics (disk imaging if needed)

Owner: Security Team Duration: Hours to days depending on complexity

Step 5: Eradication

Remove attacker access and remediate root cause.

Eradication actions:

Patch vulnerability that was exploited
Remove malware/backdoors
Reset all potentially compromised credentials
Apply security controls to prevent recurrence
Rebuild compromised systems from clean images if needed

Owner: Security Team, Infrastructure Team, Engineering Team Duration: Hours to days

Step 6: Recovery

Restore systems to normal operation.

Recovery actions:

Bring systems back online
Verify no attacker access remains
Monitor closely for signs of re-compromise
Restore data from backups if needed

Owner: Infrastructure Team Duration: Minutes to hours

Step 7: Communication

Notify stakeholders based on severity and impact.

Internal communication:

Update #security-incidents channel with regular status updates
Notify executive team for Sev 1/2
Brief all-hands if customer-impacting

External communication:

Customer notification if data breach (follow data-breach-response-process.md)
GDPR breach notification within 72 hours if required
Regulatory notifications (SOC 2 auditor, cyber insurance)

Owner: Incident Commander, Legal, Executive Team Duration: Varies (GDPR 72 hours deadline)

Step 8: Post-Incident Review

Conduct blameless retrospective within 7 days of incident closure.

Review topics:

What happened (timeline)
What went well
What could be improved
Root cause analysis
Action items to prevent recurrence

Deliverable: Post-incident report with lessons learned

Owner: Security Team Duration: 1-2 hour meeting, written report within 7 days

Step 9: Follow-up Actions

Implement improvements identified in post-incident review.

Examples:

Deploy additional monitoring/alerting
Update runbooks and playbooks
Security training for teams
Architecture changes
Policy updates

Owner: Security Team, Engineering Team Duration: Weeks (tracked in backlog)

Validation and Evidence

Incident ticket with full timeline
Investigation notes and log exports
Remediation actions documented
Communication records (emails, Slack messages)
Post-incident report
Retain all evidence for 7 years

Severity-Specific Response Times

Sev 1: Immediate response, page on-call, IC assigned within 15 min
Sev 2: Response within 1 hour, IC assigned within 2 hours
Sev 3: Response within 4 hours, next business day for IC
Sev 4: Response within 24 hours

References

Related controls: OPS-03
Related standards: incident-response-standard.md, logging-monitoring-standard.md
Related process: data-breach-response-process.md
NIST SP 800-61r2: Computer Security Incident Handling Guide

Control Mapping

OPS-03: Incident Response ^[9-step incident response: detection, assessment, containment, investigation, eradication, recovery, communication, PIR, follow-up]
INF-03: Logging & Monitoring ^[CloudTrail, CloudWatch, and GuardDuty used for detection and investigation]
DAT-04: Data Privacy (GDPR Compliance) ^[GDPR breach notification within 72 hours, legal team involvement for external communication]

Referenced By

This section is automatically generated by make generate-backlinks. Do not edit manually.

This site is open source. Improve this page.