I watched a nice presentation about how Cloudflare protects itself against DoS attacks. Most of us cannot do exactly what they do, but some of the tips were general enough to be used on a typical web front-end server.
I took notes from this presentation and present them here. With Marek's agreement I have also reposted all the examples (in a form that is easier to copy and paste).
How to prepare against ACK/FIN/RST/X-mas floods
Use a conntrack rule:
iptables -A INPUT --dst 1.2.3.4 -m conntrack --ctstate INVALID -j DROP
This only works with the tcp_loose setting disabled (it is enabled by default), so it needs an additional sysctl:
sysctl -w net.netfilter.nf_conntrack_tcp_loose=0
How to prepare against SYN floods
A SYN flood is a hard case: conntrack makes performance worse here, because it has to validate state for every single new packet.
The only way to get around this is to enable syncookies:
sysctl -w net.ipv4.tcp_syncookies=1
sysctl -w net.ipv4.tcp_timestamps=1
Enabling syncookies causes the loss of some useful connection information, such as:
- window scaling factor
- ECN bit (Explicit Congestion Notification)
That is why we also enable the tcp_timestamps option: a few bits of the timestamp field are used to store some of this information.
This may still not be efficient enough, but kernel 4.4 is expected to bring an update to how syncookies are served that should make them a few times faster than in older kernels.
Related docs: https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt
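To confirm that both settings are really active, the values can be read back from procfs. This is only a minimal sketch: the check_sysctl helper and its optional third argument (an alternative procfs root, handy for testing) are my own invention, not something from the presentation.

```shell
#!/bin/sh
# check_sysctl KEY EXPECTED [ROOT]
# Compares a sysctl value read from procfs with the expected one.
# The helper name and the ROOT argument (default /proc/sys) are hypothetical.
check_sysctl() {
    key=$1
    expected=$2
    root=${3:-/proc/sys}
    # net.ipv4.tcp_syncookies -> net/ipv4/tcp_syncookies
    path=$root/$(printf '%s' "$key" | tr . /)
    if [ ! -r "$path" ]; then
        echo "$key: not readable"
        return 2
    fi
    actual=$(cat "$path")
    if [ "$actual" = "$expected" ]; then
        echo "$key: ok ($actual)"
    else
        echo "$key: is $actual, expected $expected"
        return 1
    fi
}

# Example (on a live system):
#   check_sysctl net.ipv4.tcp_syncookies 1
#   check_sysctl net.ipv4.tcp_timestamps 1
```

The function returns 0 when the value matches, 1 when it differs and 2 when the key cannot be read, so it can be used in a script before any rules are loaded.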
How to prepare against a botnet attack
Symptoms:
- concurrent connection count going up
- many sockets in orphaned state
- many sockets in time wait state
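The orphaned and time wait counts can be read from /proc/net/sockstat. A minimal sketch, assuming the usual "TCP: inuse N orphan N tw N alloc N mem N" line format; the tcp_sockstat name and its optional file argument are hypothetical and only there for illustration.

```shell
#!/bin/sh
# tcp_sockstat [FILE]
# Prints the number of orphaned and TIME_WAIT TCP sockets.
# The function name and the FILE argument (default /proc/net/sockstat)
# are hypothetical helpers, not part of the presentation.
tcp_sockstat() {
    file=${1:-/proc/net/sockstat}
    # The TCP line looks like: "TCP: inuse 5 orphan 0 tw 2 alloc 8 mem 1"
    awk '$1 == "TCP:" {
        for (i = 2; i < NF; i += 2) {
            if ($i == "orphan") orphan = $(i + 1)
            if ($i == "tw") tw = $(i + 1)
        }
        printf "orphan=%s tw=%s\n", orphan, tw
    }' "$file"
}

# Example (on a live system):
#   tcp_sockstat
```

Running it in a loop (for example once per second) makes it easy to see whether either counter keeps climbing.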
Solutions:
- Enable the connlimit feature on conntrack to limit the number of concurrent connections to our service
- Use hashlimits to rate limit SYN packets per IP
- Use ipsets to efficiently block many IP/subnet addresses
  - manual blacklisting
  - feed IP blacklist from HTTP server logs
  - supports subnets, timeouts
  - automatic blacklisting with hashlimits
- Disable HTTP keep-alives to make this attack look more like a SYN flood
This may still not work against a DDoS, because a huge number of bots will not let you block them efficiently enough.
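Feeding a blacklist from HTTP server logs could look roughly like this. It is a sketch under assumptions: the client address is taken to be the first field of each log line (as in the common and combined access log formats), and the blacklist_from_log name, the threshold and the default set name "blacklist" are made up for the example. The function only prints ipset commands, it does not run them.

```shell
#!/bin/sh
# blacklist_from_log LOGFILE THRESHOLD [SETNAME]
# Prints one "ipset add" command for every client IP that appears more than
# THRESHOLD times in LOGFILE. Assumes the client address is the first
# whitespace-separated field of each line.
# The function name and the default set name "blacklist" are hypothetical.
blacklist_from_log() {
    log=$1
    threshold=$2
    set_name=${3:-blacklist}
    awk '{ print $1 }' "$log" | sort | uniq -c |
        awk -v t="$threshold" -v s="$set_name" \
            '$1 > t { printf "ipset -exist add %s %s\n", s, $2 }'
}

# Example: review the output first, then pipe it to sh as root.
#   blacklist_from_log /var/log/nginx/access.log 1000
```

Printing instead of executing keeps a human in the loop, which matters when a wrong threshold could block legitimate clients.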
Some exciting system tweaks and examples from this presentation
I hope to find some time to merge them into a template or script that is much easier to use, but first I have to play with these rules a little and test which ones are most useful.
NIC: Discard with flow steering
ethtool -N eth3 flow-type udp4 dst-ip 129.168.254.30 dst-port 53 action -1
Flow steering for priority
ethtool -X eth3 weight 0 1 1 1 1 1 1 1 1 1 1
ethtool -N eth3 flow-type tcp4 dst-port 22 action 0
SYN backlog size
sysctl -w net.core.somaxconn=65535
sysctl -w net.ipv4.tcp_max_syn_backlog=65535
The value is rounded up to the next power of two (65536 in this case).
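The rounding itself is simple arithmetic. The sketch below assumes that "next power of two" means the smallest power of two that is not below the configured value; it only mirrors the arithmetic and is not taken from kernel code. The next_pow2 name is hypothetical.

```shell
#!/bin/sh
# next_pow2 N: print the smallest power of two that is >= N.
# Hypothetical helper, used only to illustrate the rounding.
next_pow2() {
    n=$1
    p=1
    while [ "$p" -lt "$n" ]; do
        p=$((p * 2))
    done
    echo "$p"
}

next_pow2 65535   # prints 65536
```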
SYN backlog decay
sysctl -w net.ipv4.tcp_synack_retries=1
L7 connection count
sysctl -w net.ipv4.tcp_max_orphans=262144
sysctl -w net.ipv4.tcp_orphan_retries=1
sysctl -w net.ipv4.tcp_max_tw_buckets=360000
sysctl -w net.ipv4.tcp_tw_reuse=1
sysctl -w net.ipv4.tcp_fin_timeout=5
L3: u32
iptables -A INPUT \
--dst 1.2.3.4 \
-p udp -m udp --dport 53 \
-m u32 --u32 "6&0xFF=0x6 && 4&0x1FFF=0 && 0>>22&0x3C@4=0x29" \
-j DROP
L4: Conntrack
iptables -t raw -A PREROUTING \
-i eth2 \
--dst 1.2.3.4 \
-j ACCEPT
iptables -t raw -A PREROUTING \
-i eth2 \
-j NOTRACK
iptables -A INPUT \
--dst 1.2.3.4 \
-m conntrack --ctstate INVALID \
-j DROP
Tuning conntrack
sysctl -w net.netfilter.nf_conntrack_tcp_loose=0
sysctl -w net.netfilter.nf_conntrack_helper=0
sysctl -w net.nf_conntrack_max=2000000
echo 2500000 > /sys/module/nf_conntrack/parameters/hashsize
More info about conntrack sysctl options: https://www.kernel.org/doc/Documentation/networking/nf_conntrack-sysctl.txt
L7: Connlimit
iptables -t raw -A PREROUTING \
-i eth2 \
--dst 1.2.3.4 \
-j ACCEPT
iptables -A INPUT \
--dst 1.2.3.4 \
-p tcp -m tcp --dport 80 \
--tcp-flags FIN,SYN,RST,ACK SYN \
-m connlimit \
--connlimit-above 10 \
--connlimit-mask 32 \
--connlimit-saddr \
-j DROP
L7: ipset for blacklisting
ipset -exist create ta_d335c5 hash:net family inet
ipset add ta_d335c5 192.168.0.0/16
ipset add ta_d335c5 10.0.0.0/8
iptables -A INPUT \
-m set --match-set ta_d335c5 src \
-j DROP
L7: being evil - TARPIT
iptables -A INPUT \
-m set --match-set ta_d335c5 src \
-j TARPIT
The TARPIT target imitates a successful connection for the client (a bot in this case) but never responds to its queries. It costs the bot far more resources and time to time out and drop the connection than DROP or REJECT would.
L7: hashlimit for rate limiting
iptables -A INPUT \
--dst 1.2.3.4 -p tcp -m tcp --dport 80 \
--tcp-flags FIN,SYN,RST,PSH,ACK,URG SYN \
-m hashlimit \
--hashlimit-above 123/sec \
--hashlimit-burst 5 \
--hashlimit-mode srcip \
--hashlimit-srcmask 24 \
--hashlimit-name 341654b1d4af9bf \
-j DROP
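With --hashlimit-mode srcip and --hashlimit-srcmask 24, all source addresses in the same /24 are grouped together and share one limit. To see which addresses end up in the same bucket, the network address can be computed like this. The ipv4_network helper is hypothetical and the address is just a documentation example.

```shell
#!/bin/sh
# ipv4_network ADDR PREFIX: print the network that ADDR falls into.
# Hypothetical helper, only meant to show which sources share one
# hashlimit bucket for a given --hashlimit-srcmask value.
ipv4_network() {
    ip=$1
    prefix=$2
    # Split the dotted quad into its four octets.
    a=${ip%%.*}; rest=${ip#*.}
    b=${rest%%.*}; rest=${rest#*.}
    c=${rest%%.*}; d=${rest#*.}
    addr=$(( (a << 24) | (b << 16) | (c << 8) | d ))
    # 4294967295 is 0xFFFFFFFF; keep only the top PREFIX bits.
    mask=$(( (4294967295 << (32 - prefix)) & 4294967295 ))
    net=$(( addr & mask ))
    echo "$(( (net >> 24) & 255 )).$(( (net >> 16) & 255 )).$(( (net >> 8) & 255 )).$(( net & 255 ))/$prefix"
}

ipv4_network 203.0.113.77 24   # prints 203.0.113.0/24
```

Two clients whose addresses give the same output share a bucket, so a low rate combined with a short mask can also throttle innocent neighbours of an attacker.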
L7: auto-blacklist
ipset -exist create blacklist hash:net timeout 60
iptables -A INPUT \
--dst 1.2.3.4 \
-m set --match-set blacklist src \
-j DROP
iptables -A INPUT \
--dst 1.2.3.4 -p tcp -m tcp --dport 80 \
--tcp-flags FIN,SYN,RST,PSH,ACK,URG SYN \
-m hashlimit \
--hashlimit-above 100/sec \
--hashlimit-mode srcip \
--hashlimit-srcmask 24 \
--hashlimit-name hl_blacklist \
-j SET --add-set blacklist src
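As a first step towards a reusable template, the auto-blacklist rules can be wrapped in a function that takes the changing values as parameters. This is only a sketch: the emit_auto_blacklist name and its four parameters are hypothetical, and the function prints the commands instead of running them so they can be reviewed first.

```shell
#!/bin/sh
# emit_auto_blacklist DST PORT RATE TIMEOUT
# Prints (does not run) the auto-blacklist commands for the given
# destination address, TCP port, per-second SYN rate and ipset timeout.
# The function and its parameters are hypothetical; the set name and the
# hashlimit name are the ones used in the example above.
emit_auto_blacklist() {
    dst=$1
    port=$2
    rate=$3
    timeout=$4
    cat <<EOF
ipset -exist create blacklist hash:net timeout $timeout
iptables -A INPUT --dst $dst -m set --match-set blacklist src -j DROP
iptables -A INPUT --dst $dst -p tcp -m tcp --dport $port --tcp-flags FIN,SYN,RST,PSH,ACK,URG SYN -m hashlimit --hashlimit-above $rate/sec --hashlimit-mode srcip --hashlimit-srcmask 24 --hashlimit-name hl_blacklist -j SET --add-set blacklist src
EOF
}

# Example: print the rules for 1.2.3.4:80, 100 SYN/sec, 60 second timeout.
#   emit_auto_blacklist 1.2.3.4 80 100 60
```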
L7+: payload in TCP - string
iptables -A INPUT \
--dst 1.2.3.4 \
-p tcp --dport 80 \
-m string --algo bm \
--hex-string 486f737777777777... \
--from 231 --to 300 \
-j DROP
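The hex digits for a payload pattern can be produced with standard tools. A small sketch, where to_hex is a hypothetical helper name and 'Host' is just a sample input:

```shell
#!/bin/sh
# to_hex STRING: print the bytes of STRING as lowercase hex digits.
# Hypothetical helper built from printf, od and tr.
to_hex() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
    echo
}

to_hex 'Host'   # prints 486f7374
```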
For more information and explanations, watch the presentation itself. The slides, with additional examples, are here:
https://speakerdeck.com/majek04/lessons-from-defending-the-indefensible