Is it worth running fail2ban?

October 12, 2012

Part of the standard security advice for anyone running a machine with an SSH daemon which is open to the world is to install the fail2ban software to block brute-force attacks.

In Informatics we use it to monitor various log files for login failures. When more than a certain number of failures are seen from a single source address within a short period of time, we deny access to that address for a while. This is done using basic TCP wrappers rules (i.e. hosts.deny). Since we do this on all hosts which have holes in the firewall that allow incoming SSH connections, it’s not too easy to tell exactly how much good this is doing. The question is: without these blocks, would the attackers go away after a few failures anyway?
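The blocking rule described above (too many failures from one address within a short period leads to a temporary block) can be sketched in a few lines of Python. This is only an illustration of the idea, not how fail2ban itself is implemented; the class name and the three threshold values are assumptions chosen for the example and are not our real settings.

```python
from collections import defaultdict, deque

# Hypothetical thresholds, chosen only for illustration.
MAX_FAILURES = 5        # failures tolerated within the window
WINDOW_SECONDS = 600    # length of the sliding window
BAN_SECONDS = 3600      # how long a source address stays blocked


class FailureTracker:
    """Track login failures per source address and decide when to block."""

    def __init__(self):
        self.failures = defaultdict(deque)  # address -> recent failure times
        self.banned_until = {}              # address -> time the block expires

    def is_banned(self, address, now):
        """Return True while a block on this address is still active."""
        return self.banned_until.get(address, 0) > now

    def record_failure(self, address, now):
        """Record one failure; return True if it triggers a new block."""
        if self.is_banned(address, now):
            return False
        times = self.failures[address]
        times.append(now)
        # Forget failures that have fallen out of the sliding window.
        while times and times[0] <= now - WINDOW_SECONDS:
            times.popleft()
        if len(times) > MAX_FAILURES:
            self.banned_until[address] = now + BAN_SECONDS
            times.clear()
            return True
        return False


# Six failures in six seconds from one (documentation) address.
tracker = FailureTracker()
outcomes = [tracker.record_failure("192.0.2.1", second) for second in range(6)]
print(outcomes)
```

In a real deployment the "block" step would add the address to hosts.deny (or a firewall rule) and remove it again once the block expires.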

Recently we had an opportunity to see exactly what happens when you open SSH to the world on a machine for the first time and do not run fail2ban. At about 10:50 on 4th October a new firewall hole was opened to allow incoming SSH connections to a machine. At 15:12 we saw the first login failure in the logs; by the end of that day we had 478 login failures. Here are the stats for that day and the following days:

| Day | Failure Count (new host) | Total Failures (all hosts) |
|---|---|---|
| Thursday 4th October | 478 | 2048 |
| Friday 5th October | 2015 | 3510 |
| Saturday 6th October | 36 | 1473 |
| Sunday 7th October | 1323 | 2810 |
| Monday 8th October | 100 | 1702 |
| Tuesday 9th October | 36542 | 38296 |
| Wednesday 10th October | 20093 | 21714 |
| Thursday 11th October | 3455 | 5033 |

We regularly monitor the failure counts for all our hosts, so the sudden increase in failures, by an order of magnitude, set the alarm bells ringing fairly quickly.

An interesting question is whether all these failures are coming from single hosts or a wide range of addresses, i.e. are the attacks coming from botnets? Here are the counts for each distinct source address on the two peak days.

Tuesday 9th October:

  1. 30415
  2. 3211
  3. 2434
  4. 320
  5. 87
  6. 48
  7. 21
  8. 4
  9. 2

Wednesday 10th October:

  1. 9510
  2. 6852
  3. 3171
  4. 488
  5. 50
  6. 12
  7. 10

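So, the attacks are coming in large numbers from just a few specific machines: on each of the two peak days the three busiest source addresses account for more than 97% of the failures.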
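A per-source breakdown like this can be produced from the sshd log with a short script. The sketch below assumes the usual OpenSSH "Failed password for ... from ... port ..." message; the function name, the host name and the sample log lines are made up for illustration (the addresses come from the reserved documentation ranges), and other failure messages, such as those for key-based logins, are ignored. It also tallies the user name on each failed attempt, since that comes almost for free.

```python
import re
from collections import Counter

# Matches the usual OpenSSH password-failure message, for example:
#   Failed password for invalid user test from 192.0.2.10 port 51240 ssh2
FAILURE_RE = re.compile(
    r"Failed password for (?:invalid user )?(?P<user>\S+) "
    r"from (?P<source>\S+) port \d+"
)


def count_failures(lines):
    """Return (failures per source address, failures per user name)."""
    by_source = Counter()
    by_user = Counter()
    for line in lines:
        match = FAILURE_RE.search(line)
        if match:
            by_source[match.group("source")] += 1
            by_user[match.group("user")] += 1
    return by_source, by_user


# Made-up sample lines; real input would be read from the sshd log file.
sample = [
    "Oct  9 03:12:01 examplehost sshd[1234]: Failed password for root from 192.0.2.10 port 51234 ssh2",
    "Oct  9 03:12:03 examplehost sshd[1234]: Failed password for invalid user test from 192.0.2.10 port 51240 ssh2",
    "Oct  9 03:12:07 examplehost sshd[1301]: Failed password for root from 198.51.100.7 port 40022 ssh2",
    "Oct  9 03:12:09 examplehost sshd[1302]: Accepted publickey for alice from 203.0.113.5 port 50000 ssh2",
]

by_source, by_user = count_failures(sample)
for source, count in by_source.most_common():
    print(source, count)
```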

It’s also interesting to look at the top user names which all these attacks are trying to compromise. Here are all the user names with more than 150 attempts.

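| userid | count |
|---|---|
| root | 29065 |
| test | 1054 |
| oracle | 887 |
| nagios | 743 |
| admin | 634 |
| user | 449 |
| mysql | 398 |
| guest | 384 |
| postgres | 329 |
| www | 318 |
| testuser | 298 |
| temp | 273 |
| backup | 256 |
| support | 251 |
| tomcat | 239 |
| web | 234 |
| ftpuser | 234 |
| mythtv | 188 |
| webmaster | 185 |
| teste | 169 |
| apache | 159 |
| bin | 155 |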

This demonstrates two particular issues.

Firstly, you should never allow root SSH logins: 45% of all login failures were for the root account. With OpenSSH you should always set the PermitRootLogin option to no.

Secondly, most attacks were against “system” accounts. All of the accounts with more than 150 failures are ones which are not used by real live users; they belong to daemons, system utilities or testing. To avoid any of these accounts being compromised you should restrict login access to a group which contains only real users. This is done using the AllowGroups option in OpenSSH.
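Both settings live in the sshd_config file, and it is easy to check for them automatically. The following is a minimal sketch of such a check, not a complete sshd_config parser: it relies on keywords being case-insensitive and on the first value given for a keyword being the one used, stops at the first Match block, and ignores Include directives. The function names and the group name sshusers are assumptions made up for the example.

```python
def parse_sshd_config(text):
    """Return {lowercased keyword: argument string} for the global section."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        keyword = parts[0].lower()
        if keyword == "match":
            break  # conditional blocks are out of scope for this sketch
        value = parts[1].strip() if len(parts) > 1 else ""
        # sshd uses the first value it sees for each keyword.
        settings.setdefault(keyword, value)
    return settings


def check_hardening(text):
    """Return a list of problems found; an empty list means both are set."""
    settings = parse_sshd_config(text)
    problems = []
    if settings.get("permitrootlogin", "").lower() != "no":
        problems.append("PermitRootLogin is not set to no")
    if not settings.get("allowgroups"):
        problems.append("AllowGroups is not set")
    return problems


# A fragment with both recommendations applied.
EXAMPLE_CONFIG = """\
# Refuse direct root logins.
PermitRootLogin no
# 'sshusers' is a hypothetical group containing only real users.
AllowGroups sshusers
"""

print(check_hardening(EXAMPLE_CONFIG))
print(check_hardening("PermitRootLogin yes\n"))
```

The first call should report no problems, while the second should flag both settings.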

This clearly shows that running fail2ban does result in a big reduction in the number of attacks we see each day. With fewer opportunities to attempt to log in, the chances of successfully cracking a password by brute force are seriously reduced. Also, a few simple tweaks to the OpenSSH daemon configuration, which will not affect the experience of normal users, result in a great improvement in security.