· PathShield Team · Tutorials · 9 min read
How to Build a Security Incident Response Plan for Your Startup
Create an effective security incident response plan tailored for startups. Learn the essential components, tools, and processes to handle security incidents quickly and minimize damage.
When a security incident strikes, the difference between a minor hiccup and a company-ending breach often comes down to preparation. Yet many startups operate without an incident response plan, hoping they’ll never need one. This guide helps you build a practical, actionable incident response plan that fits your startup’s resources and needs.
Why Your Startup Needs an Incident Response Plan
Security incidents are not a matter of “if” but “when.” Consider these commonly cited industry figures, which vary by report and year:
- 43% of cyberattacks target small businesses
- Average time to identify a breach: 197 days
- Average cost for startups: $200,000+ per incident
- 60% of small companies go out of business within 6 months of a breach
An incident response plan:
- Reduces response time from hours to minutes
- Minimizes damage through quick containment
- Preserves evidence for investigation
- Maintains customer trust through transparent communication
- Meets compliance requirements for SOC 2, ISO 27001, etc.
The 6 Phases of Incident Response
1. Preparation (Before an Incident)
This is where you build your foundation:
# incident-response-team.yaml
team:
  incident_commander:
    primary: "CTO"
    backup: "Lead Engineer"
    responsibilities:
      - Overall incident coordination
      - External communication decisions
      - Resource allocation
  technical_lead:
    primary: "Senior DevOps Engineer"
    backup: "Security Engineer"
    responsibilities:
      - Technical investigation
      - System isolation and remediation
      - Evidence collection
  communications_lead:
    primary: "CEO"
    backup: "Head of Customer Success"
    responsibilities:
      - Customer communication
      - Stakeholder updates
      - PR coordination
Essential Preparation Checklist:
- Define incident severity levels
- Create team contact list with phone numbers
- Set up secure communication channel (Signal, Slack private channel)
- Document system inventory and criticality
- Establish evidence collection procedures
- Create incident response runbooks
- Prepare communication templates
- Regular team training (quarterly)
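The first checklist item, defining severity levels, is easier to apply consistently when the levels live in code that alerts and runbooks can share. The following is a minimal sketch: the level names, the response-time targets in `POLICIES`, and the triage questions in `classify` are illustrative assumptions, not values prescribed by this guide.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass(frozen=True)
class SeverityPolicy:
    respond_within_minutes: int  # hypothetical target, tune to your team
    page_on_call: bool           # whether to wake someone up


# Hypothetical policy table mapping each level to its expected response
POLICIES = {
    Severity.CRITICAL: SeverityPolicy(15, True),
    Severity.HIGH: SeverityPolicy(60, True),
    Severity.MEDIUM: SeverityPolicy(240, False),
    Severity.LOW: SeverityPolicy(1440, False),
}


def classify(confirmed: bool, customer_data_exposed: bool,
             production_affected: bool) -> Severity:
    """Map three yes/no triage questions to a severity level."""
    if not confirmed:
        return Severity.LOW       # unverified report, investigate first
    if customer_data_exposed:
        return Severity.CRITICAL
    if production_affected:
        return Severity.HIGH
    return Severity.MEDIUM


level = classify(confirmed=True, customer_data_exposed=False, production_affected=True)
print(level.name, POLICIES[level])
```

Keeping the questions to simple yes/no answers lets whoever receives the first alert assign a level without debate.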
2. Detection and Analysis
Early detection is critical. Set up monitoring for:
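# security_alerts.py
import json
from datetime import datetime, timezone

import boto3


class SecurityAlertSystem:
    def __init__(self):
        self.sns = boto3.client('sns')
        # Event types grouped by the severity they map to
        self.severity_thresholds = {
            'critical': ['root_login', 'data_exfiltration', 'privilege_escalation'],
            'high': ['failed_auth_spike', 'unusual_api_calls', 'config_changes'],
            'medium': ['new_user_created', 'permission_changes', 'unusual_location']
        }

    def determine_severity(self, event):
        # Return the first severity whose list contains the event type
        for severity, event_types in self.severity_thresholds.items():
            if event['type'] in event_types:
                return severity
        return 'low'

    def identify_affected_systems(self, event):
        # Placeholder: assumes the event carries an optional 'resources' list
        return event.get('resources', [])

    def get_response_actions(self, event_type):
        # Placeholder: point responders at the matching runbook
        return [f'Follow the runbook for {event_type}']

    def analyze_event(self, event):
        severity = self.determine_severity(event)
        if severity in ['critical', 'high']:
            self.trigger_incident_response(event, severity)

    def trigger_incident_response(self, event, severity):
        alert = {
            'severity': severity,
            'timestamp': datetime.now(timezone.utc).isoformat(),
            'event_type': event['type'],
            'details': event['details'],
            'affected_systems': self.identify_affected_systems(event),
            'recommended_actions': self.get_response_actions(event['type'])
        }
        # Alert the team (replace the ARN with your own SNS topic)
        self.sns.publish(
            TopicArn='arn:aws:sns:us-east-1:123456789012:security-incidents',
            Subject=f'[{severity.upper()}] Security Incident Detected',
            Message=json.dumps(alert, indent=2)
        )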
Key Detection Sources:
- CloudTrail logs (AWS API calls)
- Application logs
- Network flow logs
- Container runtime monitoring
- User behavior analytics
- Third-party security tools
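To make the first detection source concrete, the sketch below scans the body of a CloudTrail log file for successful root console logins. It relies on the `Records`, `userIdentity.type`, `eventName`, and `responseElements.ConsoleLogin` fields that CloudTrail writes; the function name `find_root_logins` and the shape of the returned dictionaries are assumptions made for illustration.

```python
import json


def find_root_logins(cloudtrail_log: str) -> list[dict]:
    """Return successful root console logins found in a CloudTrail log file body."""
    records = json.loads(cloudtrail_log).get("Records", [])
    hits = []
    for record in records:
        is_root = record.get("userIdentity", {}).get("type") == "Root"
        is_login = record.get("eventName") == "ConsoleLogin"
        # responseElements can be null for many event types
        succeeded = (record.get("responseElements") or {}).get("ConsoleLogin") == "Success"
        if is_root and is_login and succeeded:
            hits.append({
                "type": "root_login",
                "time": record.get("eventTime"),
                "source_ip": record.get("sourceIPAddress"),
            })
    return hits


# Hand-written sample data, not real log output
sample = json.dumps({"Records": [
    {"eventName": "ConsoleLogin", "eventTime": "2025-04-27T14:30:00Z",
     "sourceIPAddress": "203.0.113.10",
     "userIdentity": {"type": "Root"},
     "responseElements": {"ConsoleLogin": "Success"}},
    {"eventName": "DescribeInstances",
     "userIdentity": {"type": "IAMUser"},
     "responseElements": None},
]})
print(find_root_logins(sample))
```

Each hit uses the `root_login` event type, so it could be passed to an alerting class like the one above.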
3. Containment
Quick containment prevents spread:
#!/bin/bash
# containment.sh - Emergency containment script
# Usage: ./containment.sh <instance-id> [compromised-access-key-id]
set -euo pipefail

INSTANCE_ID="$1"
ACCESS_KEY_ID="${2:-}"
# Placeholder: use the ID of your pre-created isolation security group
SECURITY_GROUP_ID="sg-emergency-isolation"

echo "[$(date)] Starting containment for instance: $INSTANCE_ID"

# 1. Isolate the instance by replacing its security groups
aws ec2 modify-instance-attribute \
    --instance-id "$INSTANCE_ID" \
    --groups "$SECURITY_GROUP_ID"

# 2. Create snapshot for forensics (first attached volume only)
VOLUME_ID=$(aws ec2 describe-instances \
    --instance-ids "$INSTANCE_ID" \
    --query 'Reservations[0].Instances[0].BlockDeviceMappings[0].Ebs.VolumeId' \
    --output text)
aws ec2 create-snapshot \
    --volume-id "$VOLUME_ID" \
    --description "Incident snapshot - $(date)"

# 3. Disable IAM access key if compromised
#    (add --user-name if the key belongs to a user other than the caller)
if [ -n "$ACCESS_KEY_ID" ]; then
    aws iam update-access-key \
        --access-key-id "$ACCESS_KEY_ID" \
        --status Inactive
fi

echo "[$(date)] Containment completed"
Containment Strategies:
- Network isolation: Move to isolated security group
- Account suspension: Disable compromised accounts
- Access revocation: Rotate credentials, revoke tokens
- Service shutdown: Stop affected services if necessary
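Under pressure it helps to see the four strategies as an ordered plan before anyone runs a command. The sketch below only builds a human-readable list and executes nothing; `IncidentScope`, `plan_containment`, and the sample identifiers are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class IncidentScope:
    hosts: list[str] = field(default_factory=list)          # compromised instance IDs
    accounts: list[str] = field(default_factory=list)       # compromised user accounts
    credentials: list[str] = field(default_factory=list)    # leaked keys or tokens
    stop_services: list[str] = field(default_factory=list)  # services unsafe to keep running


def plan_containment(scope: IncidentScope) -> list[str]:
    """Return an ordered containment plan for review (nothing is executed)."""
    plan = []
    plan += [f"Network isolation: move {h} to the isolation security group" for h in scope.hosts]
    plan += [f"Account suspension: disable {a}" for a in scope.accounts]
    plan += [f"Access revocation: rotate or revoke {c}" for c in scope.credentials]
    plan += [f"Service shutdown: stop {s}" for s in scope.stop_services]
    return plan


scope = IncidentScope(hosts=["i-0abc123"], credentials=["deploy-api-key"])
for step in plan_containment(scope):
    print(step)
```

Having the incident commander approve the printed plan keeps a record of what was decided and in what order.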
4. Eradication
Remove the threat completely:
# eradication_checklist.py
import os
from datetime import datetime, timezone


class EradicationProcedure:
    def __init__(self, incident_type):
        self.incident_type = incident_type
        self.actions_taken = []

    def malware_eradication(self):
        steps = [
            "Identify all infected systems",
            "Isolate infected systems from network",
            "Run anti-malware scans",
            "Rebuild from clean images if necessary",
            "Update all security patches",
            "Change all potentially compromised credentials"
        ]
        return self.execute_steps(steps)

    def compromised_credentials_eradication(self):
        steps = [
            "Identify all systems accessed with compromised credentials",
            "Force password reset for affected accounts",
            "Revoke all active sessions",
            "Review and remove unauthorized access",
            "Enable MFA if not already enabled",
            "Audit recent actions by compromised accounts"
        ]
        return self.execute_steps(steps)

    def execute_steps(self, steps):
        for step in steps:
            print(f"[ ] {step}")
            # Record who worked through each step and when, for the audit trail
            self.actions_taken.append({
                'step': step,
                'timestamp': datetime.now(timezone.utc).isoformat(),
                'completed_by': os.environ.get('USER')
            })
        return self.actions_taken
5. Recovery
Restore normal operations:
# recovery-runbook.yaml
recovery_procedures:
  service_restoration:
    - step: "Verify threat elimination"
      validation: "Security scan results clean"
    - step: "Restore from backups if needed"
      validation: "Data integrity verified"
    - step: "Apply all security patches"
      validation: "Vulnerability scan passed"
    - step: "Gradually restore network access"
      validation: "No suspicious activity for 24 hours"
    - step: "Monitor closely for 72 hours"
      validation: "All metrics within normal range"
  validation_checks:
    - name: "Security scan"
      command: "trivy image ${IMAGE_NAME}"
      expected: "0 vulnerabilities"
    - name: "Access review"
      command: "aws iam get-account-authorization-details"
      expected: "No unauthorized changes"
    - name: "Log analysis"
      command: "python analyze_logs.py --last-24h"
      expected: "No anomalies detected"
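The runbook pairs every step with a validation, which implies that a later step should not start until the earlier one has passed. One way to enforce that is a small runner that stops at the first failed validation. This is a sketch only: `run_recovery` is a hypothetical helper, and the lambdas stand in for real scans and log analysis.

```python
from typing import Callable, Optional

# A step is a name plus a zero-argument check that returns True on success
Step = tuple[str, Callable[[], bool]]


def run_recovery(steps: list[Step]) -> tuple[list[str], Optional[str]]:
    """Run steps in order; return (completed step names, first failed step or None)."""
    completed = []
    for name, validate in steps:
        if not validate():
            return completed, name  # stop here, later steps are not attempted
        completed.append(name)
    return completed, None


# Stand-in checks; in practice these would wrap scanner or monitoring results
steps = [
    ("Verify threat elimination", lambda: True),
    ("Restore from backups if needed", lambda: False),
    ("Apply all security patches", lambda: True),
]
completed, failed = run_recovery(steps)
print("Completed:", completed)
print("Blocked at:", failed)
```

Stopping early keeps a partially recovered system from being reconnected to the network before its validation has passed.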
6. Lessons Learned
Turn incidents into improvements:
# Incident Post-Mortem Template
## Incident Summary
- **Incident ID:** INC-2025-001
- **Date/Time:** 2025-04-27 14:30 UTC
- **Duration:** 2 hours 15 minutes
- **Severity:** High
- **Impact:** 15% of users experienced authentication failures
## Timeline
- 14:30 - Unusual spike in failed login attempts detected
- 14:35 - Security alert triggered
- 14:40 - Incident response team assembled
- 14:45 - Attacker IP addresses identified and blocked
- 15:00 - Root cause identified: exposed API key in public repo
- 15:30 - API key rotated, affected systems secured
- 16:45 - All systems verified secure, monitoring enhanced
## Root Cause Analysis
**What happened:**
Developer accidentally committed AWS credentials to public GitHub repo
**Why it happened:**
- No pre-commit hooks to detect secrets
- Security training gap on credential management
- Lack of automated secret scanning
## Action Items
- [ ] Implement git-secrets pre-commit hooks (Due: May 1)
- [ ] Mandatory security training for all developers (Due: May 15)
- [ ] Deploy automated secret scanning in CI/CD (Due: May 7)
- [ ] Rotate all static credentials to IAM roles (Due: May 30)
## What Went Well
- Rapid detection (5 minutes from first attempt)
- Quick team assembly and response
- Clear communication throughout incident
- No customer data was accessed
## What Could Be Improved
- Faster credential rotation process needed
- Better documentation of API key locations
- Automated containment for credential exposures
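The root cause in this sample post-mortem, AWS credentials committed to a public repository, is the kind of mistake a pre-commit check can catch. Dedicated tools such as git-secrets cover many more patterns; the sketch below only looks for strings shaped like AWS access key IDs (the `AKIA` and `ASIA` prefixes followed by 16 uppercase letters or digits). The function name and sample text are illustrative.

```python
import re

# AWS access key IDs: a known prefix followed by 16 uppercase letters or digits
AWS_ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")


def find_aws_key_ids(text: str) -> list[tuple[int, str]]:
    """Return (line number, matched key ID) for each suspected AWS access key ID."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in AWS_ACCESS_KEY_ID.finditer(line):
            findings.append((lineno, match.group()))
    return findings


# AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation
staged = 'region = "us-east-1"\naws_access_key_id = "AKIAIOSFODNN7EXAMPLE"\n'
print(find_aws_key_ids(staged))  # [(2, 'AKIAIOSFODNN7EXAMPLE')]
```

A hook built on this idea would exit with a non-zero status when the list is non-empty, which blocks the commit.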
Building Your Incident Response Toolkit
Essential Tools for Startups
Free/Open Source:
- TheHive - Incident response platform
# docker-compose.yml for TheHive
version: '3'
services:
  thehive:
    image: thehiveproject/thehive:latest
    ports:
      - "9000:9000"
    environment:
      - TH_CONFIG_FILE=/etc/thehive/application.conf
- DFIR ORC - Collection of forensic tools
# Collect system artifacts
dfir-orc.exe /out:C:\incident\artifacts /jobs:10
- GRR Rapid Response - Remote incident response
# Deploy GRR agent for incident investigation
# (illustrative pseudocode; these calls are not GRR's actual Python API)
grr_client = grr.deploy_agent(target_host)
grr_client.collect_artifacts(['BrowserHistory', 'LoginEvents'])
Automation Scripts
Incident Detection Dashboard:
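# incident_dashboard.py
from flask import Flask, jsonify
import boto3

app = Flask(__name__)


def determine_severity(alarm):
    # Placeholder rule: alarms with "Critical" in the name are critical
    return 'critical' if 'Critical' in alarm['AlarmName'] else 'high'


def get_guardduty_detector_id(guardduty):
    # Returns None when GuardDuty is not enabled in this region
    detector_ids = guardduty.list_detectors()['DetectorIds']
    return detector_ids[0] if detector_ids else None


@app.route('/api/incidents/active')
def get_active_incidents():
    incidents = []

    # Check CloudWatch alarms
    cloudwatch = boto3.client('cloudwatch')
    alarms = cloudwatch.describe_alarms(StateValue='ALARM')
    for alarm in alarms['MetricAlarms']:
        if 'Security' in alarm['AlarmName']:
            incidents.append({
                'type': 'cloudwatch_alarm',
                'name': alarm['AlarmName'],
                'description': alarm.get('AlarmDescription', ''),
                'severity': determine_severity(alarm),
                'timestamp': alarm['StateUpdatedTimestamp']
            })

    # Check GuardDuty findings (unarchived, severity 4 or higher)
    guardduty = boto3.client('guardduty')
    detector_id = get_guardduty_detector_id(guardduty)
    if detector_id:
        finding_ids = guardduty.list_findings(
            DetectorId=detector_id,
            FindingCriteria={
                'Criterion': {
                    'service.archived': {'Eq': ['false']},
                    'severity': {'Gte': 4}
                }
            }
        )['FindingIds']
        if finding_ids:
            # list_findings returns IDs only; fetch the details separately
            details = guardduty.get_findings(
                DetectorId=detector_id, FindingIds=finding_ids)
            for finding in details['Findings']:
                incidents.append({
                    'type': 'guardduty_finding',
                    'name': finding['Title'],
                    'description': finding.get('Description', ''),
                    'severity': finding['Severity'],
                    'timestamp': finding['UpdatedAt']
                })

    return jsonify(incidents)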
Communication Templates
Customer Notification Template
Subject: Important Security Update
Dear [Customer Name],
We are writing to inform you of a security incident that may have affected your account.
**What Happened:**
[Brief, clear description of the incident]
**When:**
[Date and time range]
**What Information Was Involved:**
[Specific data types potentially affected]
**What We Are Doing:**
- Immediately contained the incident
- Conducted thorough investigation
- Implemented additional security measures
- [Other specific actions]
**What You Should Do:**
- Change your password as a precaution
- Review your recent account activity
- Enable two-factor authentication
- [Other specific recommendations]
We take the security of your data seriously and apologize for any concern this may cause. If you have questions, please contact our security team at security@company.com.
Sincerely,
[Your Security Team]
Internal Status Update Template
🚨 **Incident Status Update** 🚨
**Incident ID:** INC-2025-042
**Current Status:** Containment Phase
**Severity:** High
**Start Time:** 2025-04-27 14:30 UTC
**Current Situation:**
- Suspicious activity detected on production servers
- 3 instances isolated for investigation
- No evidence of data exfiltration
**Actions Completed:**
✅ Incident team assembled
✅ Affected systems identified
✅ Network isolation implemented
✅ Forensic snapshots created
**Next Steps:**
- Complete malware analysis (ETA: 30 min)
- Begin eradication procedures
- Prepare customer communication
**Team Assignments:**
- Tech Lead: System analysis
- Comms: Draft customer notice
- Legal: Review compliance requirements
Next update in 30 minutes or sooner if status changes.
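Status updates stay consistent when they are generated from structured fields rather than typed from scratch each time. The sketch below renders a plain-text version of the template above and computes the next update time; `render_status_update` and its parameters are hypothetical names, and the 30-minute default mirrors the cadence in the template.

```python
from datetime import datetime, timedelta, timezone


def render_status_update(incident_id, status, severity, situation,
                         completed, next_steps, now, interval_minutes=30):
    """Build a plain-text status update; `now` is expected to be a UTC datetime."""
    next_update = now + timedelta(minutes=interval_minutes)
    lines = [
        f"Incident ID: {incident_id}",
        f"Current Status: {status}",
        f"Severity: {severity}",
        "Current Situation:",
        *(f"- {item}" for item in situation),
        "Actions Completed:",
        *(f"- {item}" for item in completed),
        "Next Steps:",
        *(f"- {item}" for item in next_steps),
        f"Next update by {next_update:%H:%M} UTC or sooner if status changes.",
    ]
    return "\n".join(lines)


print(render_status_update(
    "INC-2025-042", "Containment Phase", "High",
    situation=["3 instances isolated for investigation"],
    completed=["Incident team assembled", "Forensic snapshots created"],
    next_steps=["Begin eradication procedures"],
    now=datetime(2025, 4, 27, 15, 0, tzinfo=timezone.utc),
))
```

Stating an absolute time for the next update spares readers from working out when "30 minutes" started.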
Testing Your Incident Response Plan
Tabletop Exercises
Run quarterly scenarios:
# tabletop_scenarios.py
scenarios = [
    {
        "name": "Ransomware Attack",
        "description": "Encryption detected on file server",
        "injects": [
            "Backup system also encrypted",
            "Ransom note demands Bitcoin",
            "Media inquiry received"
        ]
    },
    {
        "name": "Data Breach",
        "description": "Customer database exposed on internet",
        "injects": [
            "Posted on hacking forum",
            "Includes payment information",
            "Regulatory notification required"
        ]
    },
    {
        "name": "Insider Threat",
        "description": "Departing employee downloading large amounts of data",
        "injects": [
            "Employee has admin access",
            "Downloading customer lists",
            "Headed to competitor"
        ]
    }
]
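Injects work best when the facilitator reveals them on a schedule instead of all at once. As a rough sketch, a scenario in the format above can be turned into a timed agenda; `build_agenda` and the 15-minute spacing are assumptions, not part of any standard exercise format.

```python
def build_agenda(scenario: dict, minutes_per_inject: int = 15) -> list[str]:
    """Turn a scenario into a timed facilitator agenda, one line per reveal."""
    agenda = [f"T+0 min: {scenario['description']}"]
    for i, inject in enumerate(scenario["injects"], start=1):
        agenda.append(f"T+{i * minutes_per_inject} min: {inject}")
    return agenda


# Same shape as the entries in tabletop_scenarios.py
ransomware = {
    "name": "Ransomware Attack",
    "description": "Encryption detected on file server",
    "injects": [
        "Backup system also encrypted",
        "Ransom note demands Bitcoin",
        "Media inquiry received",
    ],
}
for line in build_agenda(ransomware):
    print(line)
```

Spacing the injects gives the team time to commit to a decision before the situation changes, which is where gaps in the plan tend to surface.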
Purple Team Exercises
Combine red team (attack) with blue team (defense):
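#!/bin/bash
# purple_team_exercise.sh
# Timing is recorded manually here; swap in your alerting system's timestamps if available
echo "Starting Purple Team Exercise"
START_TIME=$(date +%s)

# Red Team Action
echo "[RED TEAM] Simulating credential theft..."
# (Safe simulation code here)

# Blue Team Detection
echo "[BLUE TEAM] Monitoring for suspicious activity..."
# Check if detection systems catch the activity
read -r -p "Press Enter when the alert fires: "
DETECTED_AT=$(date +%s)
read -r -p "Press Enter when the response is complete: "
RESPONDED_AT=$(date +%s)

# Measure metrics
DETECTION_TIME=$((DETECTED_AT - START_TIME))
RESPONSE_TIME=$((RESPONDED_AT - DETECTED_AT))

echo "Exercise Results:"
echo "Detection Time: $DETECTION_TIME seconds"
echo "Response Time: $RESPONSE_TIME seconds"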
Common Mistakes to Avoid
1. Not Testing the Plan
Problem: Beautiful plan that fails in reality
Solution: Regular drills and exercises
2. Unclear Roles
Problem: Everyone (or no one) takes charge
Solution: Clear RACI matrix for all roles
3. Poor Communication
Problem: Stakeholders learn about breach from news
Solution: Pre-drafted templates and notification trees
4. Insufficient Logging
Problem: Can’t investigate due to missing logs
Solution: Comprehensive logging strategy
5. No Legal/PR Involvement
Problem: Making situation worse with poor messaging
Solution: Include legal/PR in planning and exercises
Metrics for Success
Track these KPIs:
- Mean Time to Detect (MTTD): < 1 hour
- Mean Time to Respond (MTTR): < 4 hours
- Mean Time to Contain (MTTC): < 6 hours
- Mean Time to Recover (also abbreviated MTTR, so spell it out in reports): < 24 hours
- False Positive Rate: < 10%
- Exercise Participation: > 90%
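The time-based KPIs are averages over per-incident timestamps, so they can be computed directly from an incident log. In the sketch below, the first record reuses the timeline from the sample post-mortem and the second is invented; the field names, and the choice to measure containment from detection and recovery from the start of the incident, are assumptions you should align with your own definitions.

```python
from datetime import datetime
from statistics import mean


def mean_hours(incidents, start_key, end_key):
    """Average elapsed hours between two timestamps across all incidents."""
    return mean(
        (inc[end_key] - inc[start_key]).total_seconds() / 3600
        for inc in incidents
    )


incidents = [
    {"started": datetime(2025, 4, 27, 14, 30), "detected": datetime(2025, 4, 27, 14, 35),
     "contained": datetime(2025, 4, 27, 15, 30), "recovered": datetime(2025, 4, 27, 16, 45)},
    # Hypothetical second incident
    {"started": datetime(2025, 5, 3, 9, 0), "detected": datetime(2025, 5, 3, 9, 55),
     "contained": datetime(2025, 5, 3, 12, 0), "recovered": datetime(2025, 5, 3, 18, 0)},
]

mttd = mean_hours(incidents, "started", "detected")
mttc = mean_hours(incidents, "detected", "contained")
recover = mean_hours(incidents, "started", "recovered")
print(f"MTTD: {mttd:.2f} h, MTTC: {mttc:.2f} h, Mean Time to Recover: {recover:.2f} h")
```

Whichever start and end points you choose, keep them fixed between reports so the trend stays comparable.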
Compliance Considerations
Different frameworks require different response capabilities:
- SOC 2: Documented procedures, evidence of execution
- ISO 27001: Regular testing, continuous improvement
- GDPR: Breach notification to the supervisory authority within 72 hours of becoming aware
- CCPA: Consumer notification requirements
- PCI DSS: Specific forensic requirements
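Of these requirements, the GDPR deadline is the one most worth tracking with a clock, because the 72 hours run from the moment the organization becomes aware of the breach. A minimal sketch follows; the function names are illustrative, and whether a given incident triggers the obligation is a question for legal counsel.

```python
from datetime import datetime, timedelta, timezone

GDPR_NOTIFICATION_WINDOW = timedelta(hours=72)


def gdpr_deadline(became_aware_at: datetime) -> datetime:
    """Latest time to notify the supervisory authority."""
    return became_aware_at + GDPR_NOTIFICATION_WINDOW


def hours_remaining(became_aware_at: datetime, now: datetime) -> float:
    """Hours left before the deadline (negative once it has passed)."""
    return (gdpr_deadline(became_aware_at) - now).total_seconds() / 3600


aware = datetime(2025, 4, 27, 14, 35, tzinfo=timezone.utc)
print("Notify by:", gdpr_deadline(aware).isoformat())
print("Hours left after one day:", hours_remaining(aware, aware + timedelta(hours=24)))
```

Recording the "became aware" timestamp in the incident ticket at detection time avoids reconstructing it later.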
Building an Incident Response Culture
- Blameless Post-Mortems: Focus on system improvements, not finger-pointing
- Regular Training: Monthly security awareness, quarterly IR drills
- Clear Escalation: Everyone knows when and how to escalate
- Continuous Improvement: Every incident makes you stronger
Conclusion
A security incident response plan isn’t about if you’ll need it, but when. Start simple, test regularly, and improve continuously. Remember: a basic plan executed well beats a perfect plan that sits on a shelf.
Your incident response plan is a living document. It should grow with your company, adapt to new threats, and improve with each exercise and real incident.
Next Steps:
- Download and customize the incident response template
- Schedule your first tabletop exercise
- Set up basic security monitoring
- Train your team on their roles
When an incident strikes, you’ll be ready. Your customers, investors, and team will thank you for the preparation.