Today one of our production machines (in Amazon EC2) went down, and I couldn't reach the instance because SSH on port 22 was unavailable: the connection was refused. At first I had no clue what to do, but after some time I was able to bring the instance back up (ok, short and sweet).
Though I got it working with a workaround, I wanted to make sure what I did was correct, so I asked the experts on stackoverflow.com. I got good advice and suggestions, and I hope I am all set for a proper fix. BTW, the answer to my question also taught me something new: auditd, which can audit changes on the disk and is a must-have tool for any system admin.
Okay, the reason I wanted to write this post is to show what I did to fix /dev/urandom and how I got the server up and running. Let me go straight to it.
The rest of this post assumes a working knowledge of Amazon AWS; I will write some posts about it very soon.
Luckily (I knew I would face such a problem someday) the production server was an EBS-backed instance, which allowed me to stop and start it. (I will explain what I did point by point.)
- I stopped the production machine.
- Detached the root disk (it is connected as /dev/sda1).
- Created a micro-instance in the same availability zone as that volume.
- Attached the root volume to the new micro-instance.
- Checked the logs (messages/syslog).
- Found a problem with SSH (since I badly wanted to SSH), which gave me a hint about the SSH PRNG (yes, the pseudorandom number generator).
- That helped me trace the actual problem: the /dev/urandom device file was missing.
- So I wrote a quick shell script that creates the file if it is missing (the script is at the end of this post).
- Added the script to an existing init script, since init scripts are run at system boot.
- Detached the volume from the micro-instance and re-attached it to the production machine.
- Started the production machine. Voila!
- I can SSH now without any problem.
- Do not forget to terminate the micro-instance.
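The volume shuffle in the steps above could be sketched with the AWS CLI roughly as follows. This is only a sketch under assumptions, not necessarily the tooling I used: the instance and volume IDs are placeholders, /dev/sdf is an assumed device name for the rescue instance, and by default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Placeholder IDs (hypothetical): replace with your own before running for real.
PROD_ID="${PROD_ID:-i-PRODUCTION}"    # the broken production instance
RESCUE_ID="${RESCUE_ID:-i-RESCUE}"    # the temporary micro-instance, same zone
ROOT_VOL="${ROOT_VOL:-vol-ROOT}"      # root EBS volume of the production instance
DRY_RUN="${DRY_RUN:-1}"               # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Detach the root volume and attach it to instance $1 as device $2.
move_root_volume() {
    run aws ec2 detach-volume --volume-id "$ROOT_VOL"
    run aws ec2 wait volume-available --volume-ids "$ROOT_VOL"
    run aws ec2 attach-volume --volume-id "$ROOT_VOL" --instance-id "$1" --device "$2"
}

# Phase 1: stop production and hand its root disk to the rescue instance.
to_rescue() {
    run aws ec2 stop-instances --instance-ids "$PROD_ID"
    run aws ec2 wait instance-stopped --instance-ids "$PROD_ID"
    move_root_volume "$RESCUE_ID" /dev/sdf   # /dev/sdf is an assumed device name
}

# Phase 2: after fixing and unmounting the volume, give it back and clean up.
to_production() {
    move_root_volume "$PROD_ID" /dev/sda1
    run aws ec2 start-instances --instance-ids "$PROD_ID"
    run aws ec2 terminate-instances --instance-ids "$RESCUE_ID"
}

case "${1:-plan}" in
    to-rescue)     to_rescue ;;
    to-production) to_production ;;
    *)             DRY_RUN=1; to_rescue; to_production ;;  # just print the plan
esac
```

The two phases are kept separate on purpose: the log reading and the actual fix happen by hand on the rescue instance in between, and the volume should be unmounted there before the second phase detaches it.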
# Recreate /dev/urandom if the character device is missing, then start SSH.
if [ ! -c /dev/urandom ]; then
    cd /dev
    /sbin/MAKEDEV urandom
    /etc/init.d/ssh start
fi
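On a system where /sbin/MAKEDEV is not installed, the same idea can be sketched with mknod instead. This is an alternative I did not use on the server itself: the function name ensure_urandom is made up for illustration, and the device numbers (character device, major 1, minor 9) are the standard Linux ones for /dev/urandom. Creating the node requires root.

```shell
#!/bin/sh
# ensure_urandom (hypothetical helper): create the urandom device node if missing.
# $1 is the device path; it defaults to /dev/urandom.
ensure_urandom() {
    dev="${1:-/dev/urandom}"

    # -c is true only for an existing character device.
    if [ -c "$dev" ]; then
        echo "$dev already present"
        return 0
    fi

    # On Linux, urandom is character device major 1, minor 9.
    # Mode 666 lets every user read from (and write to) it.
    mknod -m 666 "$dev" c 1 9 && echo "$dev created"
}

ensure_urandom
```

Note that mknod fails if something other than a character device already sits at that path, so a stray regular file there would have to be removed by hand first.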