Security Incident Response: Complete Guide
When things go wrong. Learn how to detect, contain, and recover from security incidents.
🎯 What You'll Learn
- Understand the incident response lifecycle
- Learn detection and containment strategies
- Know how to investigate incidents
- Recover and learn from incidents
- Build an incident response plan
When Things Go Wrong
Security incidents happen to everyone: breaches, malware, data leaks. What matters is how you respond.
Fast, effective response minimizes damage. Panic and improvisation make it worse.
The Incident Response Lifecycle
Phase 1: Preparation
Before incidents happen:
Build Your Team
| Role | Responsibility |
|---|---|
| Incident Commander | Decisions, communication |
| Technical Lead | Investigation, containment |
| Communications | Internal/external messaging |
| Legal | Compliance, notification |
Create Runbooks
Pre-written playbooks for common scenarios:
- Malware infection
- Data breach
- DDoS attack
- Account compromise
Prepare Tools
- Log aggregation (ELK, Splunk)
- Forensic tools
- Communication channels (out-of-band)
- Contact lists
Phase 2: Detection
Recognize that an incident is occurring:
Detection Sources
| Source | Example |
|---|---|
| Monitoring alerts | Unusual login patterns |
| User reports | “My computer is acting weird” |
| External notification | Researcher, customer, attacker |
| Log analysis | Failed auth spikes |
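As a rough illustration of the "failed auth spikes" row, the sketch below counts failed SSH logins per source IP in a log file. It assumes OpenSSH-style lines of the form `Failed password for ... from <ip> port ...`; the function name `failed_logins_by_ip` and the default threshold of 5 are hypothetical choices, not part of any standard tool.

```shell
#!/usr/bin/env bash
# Sketch: list source IPs with at least THRESHOLD failed SSH logins.
# Assumes OpenSSH-style "Failed password ... from <ip> port ..." lines.
# failed_logins_by_ip and the default threshold are illustrative only.

failed_logins_by_ip() {
  local logfile="$1" threshold="${2:-5}"
  awk -v t="$threshold" '
    /Failed password/ {
      # Scan from the end so a username of "from" cannot confuse the match;
      # the source IP is the field right after the last "from".
      for (i = NF - 1; i >= 1; i--) {
        if ($i == "from") { count[$(i + 1)]++; break }
      }
    }
    END {
      for (ip in count) if (count[ip] >= t) print count[ip], ip
    }' "$logfile" | sort -rn
}
```

On a Debian-style system, `failed_logins_by_ip /var/log/auth.log 10` would print one `count ip` line per address with ten or more failures, highest first. A real deployment would normally run this kind of query in the log aggregation tool rather than on the host itself.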
Triage Questions
- What is happening?
- When did it start?
- What systems are affected?
- Is it ongoing?
- What’s the potential impact?
Phase 3: Containment
Stop the bleeding:
Short-Term Containment
| Action | Purpose |
|---|---|
| Isolate system | Prevent lateral movement |
| Block IPs | Stop ongoing attack |
| Disable accounts | Prevent access |
| Preserve evidence | Don’t destroy logs |
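# Example: isolate a host with iptables (run from the console; this also cuts your own SSH session)
# -I inserts at the top of the chain; -A would append after any existing ACCEPT rules
iptables -I INPUT 1 -j DROP
iptables -I OUTPUT 1 -j DROP
# Or isolate at the network level:
# move the host's switch port to a quarantine VLAN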
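The "Preserve evidence" row can be made concrete with a small sketch: copy the relevant logs somewhere safe and record their hashes before anything else changes them. This assumes GNU coreutils (`sha256sum`); the function names `preserve_evidence` and `verify_evidence` and the `SHA256SUMS` manifest name are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: copy files into an evidence directory and record SHA-256 hashes.
# Assumes GNU coreutils. Function and manifest names are illustrative.

# preserve_evidence DEST FILE...
preserve_evidence() {
  local dest="$1"; shift
  local names=() f
  [ "$#" -ge 1 ] || return 1            # nothing to preserve
  mkdir -p -- "$dest" || return 1
  for f in "$@"; do
    cp -p -- "$f" "$dest/" || return 1  # -p keeps timestamps and permissions
    names+=("$(basename -- "$f")")
  done
  # Hash by basename so the manifest stays valid if the directory is moved.
  ( cd "$dest" && sha256sum -- "${names[@]}" > SHA256SUMS )
}

# verify_evidence DEST: succeeds only if every file still matches its hash.
verify_evidence() {
  ( cd "$1" && sha256sum --status -c SHA256SUMS )
}
```

Copying to storage outside the affected system, and hashing immediately, gives the investigation a way to show later that the logs were not altered after collection.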
Long-Term Containment
Keep business running while you investigate:
- Temporary workarounds
- Clean systems in parallel
- Monitor for re-infection
Phase 4: Eradication
Remove the threat completely:
Find Root Cause
- How did they get in?
- What did they access?
- What did they leave behind?
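One way to start answering "what did they leave behind?" is to list files modified since the suspected time of compromise. The sketch below assumes GNU `find` (for `-newermt`); the function name `files_changed_since` is hypothetical. Modification times can be forged by an attacker, so treat the output as a lead, not proof.

```shell
#!/usr/bin/env bash
# Sketch: list regular files under DIR modified after a given timestamp.
# Assumes GNU find. files_changed_since is an illustrative name.

# files_changed_since DIR TIMESTAMP   (e.g. "2024-06-01 00:00")
files_changed_since() {
  local dir="$1" since="$2"
  # -xdev stays on one filesystem; unreadable paths are silently skipped.
  find "$dir" -xdev -type f -newermt "$since" -print 2>/dev/null | sort
}
```

Running it against directories such as web roots, home directories, and scheduled-task locations narrows the search before a deeper forensic review.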
Clean Up
- Remove malware
- Close vulnerabilities
- Reset compromised credentials
- Patch exploited systems
Phase 5: Recovery
Return to normal operations:
Restore Safely
1. Verify system is clean
2. Restore from a known-good backup (one taken before the earliest sign of compromise)
3. Monitor closely after restoration
4. Gradual return to production
Validation
- Systems functioning correctly
- Security controls in place
- No signs of persistent access
Phase 6: Lessons Learned
Every incident is a learning opportunity.
Post-Mortem Meeting
Hold it within 1-2 weeks of the incident and cover:
- What happened (timeline)
- What went well
- What could improve
- Action items
Document Everything
# Incident Report: [Title]
## Summary
Brief description of what happened.
## Timeline
- HH:MM - Detection
- HH:MM - Containment began
- HH:MM - Root cause identified
- HH:MM - Systems restored
## Impact
- Systems affected
- Data exposed
- Duration
- Cost
## Root Cause
How the incident occurred.
## Response Actions
What was done to contain and eradicate.
## Lessons Learned
What to improve.
## Action Items
- [ ] Item 1 (Owner, Due date)
- [ ] Item 2 (Owner, Due date)
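The Timeline and Duration fields in the template are easier to fill in consistently if the gaps are computed rather than estimated. A minimal sketch, assuming GNU `date` (for `-d`) and timestamps recorded in UTC; the function name `elapsed_between` and the example timestamps are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print the time between two timestamps as "Xd Yh Zm".
# Assumes GNU date and UTC timestamps. elapsed_between is an illustrative name.

# elapsed_between START END
elapsed_between() {
  local start end secs
  start=$(date -u -d "$1" +%s) || return 1
  end=$(date -u -d "$2" +%s) || return 1
  secs=$(( end - start ))
  [ "$secs" -ge 0 ] || return 1   # END must not be earlier than START
  printf '%dd %dh %dm\n' \
    $(( secs / 86400 )) $(( secs % 86400 / 3600 )) $(( secs % 3600 / 60 ))
}
```

For example, `elapsed_between "2024-03-01 09:00" "2024-03-04 10:30"` prints `3d 1h 30m`, which could go straight into the Duration line as the time from initial access to detection.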
Practice Exercises
Exercise 1: Triage (Beginner)
An employee reports: “I can’t access my email and there are weird files on my desktop.”
What are your first 3 questions?
Answer
- When did you first notice this?
- Did you click any links or open attachments recently?
- Are your coworkers experiencing the same issue?
Exercise 2: Containment (Intermediate)
You’ve confirmed malware on a developer’s laptop that has SSH access to production servers.
What containment actions do you take?
Answer
- Isolate the laptop from network
- Revoke developer’s SSH keys
- Check production access logs
- Force password reset on affected accounts
- Monitor for unusual production activity
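As a sketch of the "revoke SSH keys" step, the function below removes every line containing a given public-key blob from an `authorized_keys` file and keeps a backup for the investigation. The function name `revoke_ssh_key` and the `.pre-revoke` backup suffix are hypothetical, and it assumes keys are managed in plain `authorized_keys` files rather than through a certificate authority or directory service.

```shell
#!/usr/bin/env bash
# Sketch: remove a public key from an authorized_keys file, keeping a backup.
# revoke_ssh_key and the .pre-revoke suffix are illustrative names.

# revoke_ssh_key KEYFILE KEY_BLOB
revoke_ssh_key() {
  local keyfile="$1" key_blob="$2"
  local backup="${keyfile}.pre-revoke"
  [ -n "$key_blob" ] || return 1          # an empty pattern would match every line
  cp -p -- "$keyfile" "$backup" || return 1
  # Keep only lines that do not contain the key blob (fixed-string match).
  # grep exits 1 when no lines remain, which is an acceptable outcome here.
  grep -vF -- "$key_blob" "$backup" > "$keyfile" || [ $? -eq 1 ]
}
```

Removing the key blocks new logins with it but does not end sessions that are already open, so it belongs alongside the log review and monitoring steps above, not in place of them.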
Exercise 3: Post-Mortem (Advanced)
Write a brief post-mortem for this scenario:
- Attacker gained access via phishing
- Had access for 3 days before detection
- Exfiltrated customer database
Knowledge Check
1. What are the six phases of incident response?
2. Why preserve evidence during containment?
3. What’s the difference between short-term and long-term containment?
4. Why do a post-mortem?
5. What’s the first thing you should do when you detect an incident?
Answers
1. Preparation, Detection, Containment, Eradication, Recovery, Lessons Learned.
2. For investigation and potential legal action. Destroying evidence makes it impossible to understand what happened.
3. Short-term = immediate isolation. Long-term = temporary workarounds while you investigate and clean up properly.
4. Learn and improve. Understand what happened, what worked, what didn’t, so you’re better prepared next time.
5. Don’t panic. Then assess the scope and impact before taking containment actions.
Summary
| Phase | Goal |
|---|---|
| Preparation | Be ready before incidents |
| Detection | Recognize incidents quickly |
| Containment | Stop the damage |
| Eradication | Remove the threat |
| Recovery | Return to normal |
| Lessons Learned | Improve for next time |
What’s Next?
🎯 Continue learning:
- Security Logging - Detection foundation
- Zero Trust - Reduce blast radius
You now know how to respond when security fails. 🚨