Some hours ago I had some server problems: a PHP script on a quite busy server was running into timeouts, so the number of Apache server processes had reached the limit and no requests were served (all of them were busy running the PHP script that was waiting for a timeout - great).
As the solution was quite obvious, I first changed the PHP script and hoped to see the number of Apache processes go down. After some minutes of looking at the output of watch 'ps ax|grep apache2|wc -l' I realised that something was wrong. I still had hundreds of Apache processes running and no new connections were accepted.
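A side note on the counting command: ps ax|grep apache2 also matches the grep itself, and under watch it can match the shell running the pipeline too, so the number is somewhat inflated. Here is a minimal sketch of a stricter count, assuming the processes are named exactly apache2. The helper name count_exact is made up for illustration.

```shell
#!/bin/sh
# count_exact NAME: count the lines on stdin that are exactly NAME.
# Hypothetical helper; feed it one command name per line.
count_exact() {
    # -c count matches, -x match whole lines only, -F treat NAME as a fixed string
    grep -c -x -F -- "$1"
}

# Usage (not run here): ps -e -o comm= | count_exact apache2
```

Because ps -e -o comm= prints only the executable name, the grep and the watch shell no longer count as matches.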
Killing Apache was my next step, but after all processes had disappeared and I tried to restart Apache, an error showed up saying that it could not bind to ports 80 and 443. Why was that?
I checked with netstat --inet -n --listen and saw that something was still listening on those ports. A little bit of Google research turned up a great tip: lsof can show not only open files but also open ports and which programs are keeping them open.
Running lsof -i|grep http and lsof -i|grep https showed that several Apache processes (which didn't exist according to ps) were indeed still listening and keeping the ports open.
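A small sketch of a more targeted check than grepping for http: it reduces lsof output to the command name, PID and address of every listener on the given ports. The function name listeners_on_ports is an assumption for illustration, and it expects numeric output as produced by lsof -nP.

```shell
#!/bin/sh
# listeners_on_ports PORTS: filter lsof output on stdin.
# PORTS is a regex alternation of port numbers, e.g. '80|443'.
# Prints "COMMAND PID ADDRESS" for each matching listening socket.
listeners_on_ports() {
    awk -v ports="$1" '
        NR > 1 && $NF == "(LISTEN)" {
            addr = $(NF - 1)              # e.g. *:80 or 127.0.0.1:443
            n = split(addr, parts, ":")   # the port is the last ":" field
            if (parts[n] ~ ("^(" ports ")$")) print $1, $2, addr
        }'
}

# Usage (not run here, usually needs root to see other users' sockets):
#   lsof -nP -iTCP -sTCP:LISTEN | listeners_on_ports '80|443'
```

Reading from stdin keeps the filter separate from lsof itself, so it can be tried on saved output as well.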
The next thing I did was try to kill the connections. tcpkill was my choice for that. It is part of the dsniff package and can kill TCP connections selected by tcpdump filter expressions.
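Running tcpkill -i eth0 -9 port 80 or port 443 produced a lot of log messages but did not close the ports, which is not surprising, since tcpkill only resets established connections and cannot close a listening socket. At that point I was seriously considering a reboot ;-)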
Finally I took another look at the output of lsof and realised that at the beginning of each line it said "sendmail". Why should sendmail be listening on port 80? Especially as sendmail is not running as a service on that machine. The strange explanation was that some error mails sent from my scripts had started sendmail processes, and those processes never finished. Killing all sendmail processes the hard way (killall -9 sendmail) closed the ports, and a restart of Apache was possible.
Now I just wonder why killing Apache did not kill the sendmail processes, which seemed to be child processes of Apache.
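One plausible explanation, not verified on that server: on Unix a child process inherits its parent's open file descriptors, killing the parent does not kill the child, and a socket stays open as long as any process still holds a descriptor for it. The sketch below shows that general behaviour with an ordinary file standing in for the listening socket. The function name spawn_orphan_holding_fd and the use of descriptor 3 are made up for illustration.

```shell
#!/bin/sh
# spawn_orphan_holding_fd FILE: start a short-lived parent that opens FILE
# on descriptor 3, launches a background child, and exits immediately.
# Prints the PID of the child, which outlives its parent.
spawn_orphan_holding_fd() {
    sh -c '
        exec 3>"$1"                     # parent opens descriptor 3
        sleep 30 >/dev/null 2>&1 &      # child inherits descriptor 3
        echo $!                         # report the child PID; parent then exits
    ' sh "$1"
}

# Usage (not run here, /proc is Linux-specific):
#   pid=$(spawn_orphan_holding_fd /tmp/demo.lock)
#   ls -l "/proc/$pid/fd/3"    # still points at /tmp/demo.lock
```

If the sendmail processes inherited Apache's listening sockets in the same way, they would have kept ports 80 and 443 bound until they were killed themselves.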
