Running forever with the Raspberry Pi Hardware Watchdog

At Diode we have deployed a couple of long-running Raspberry Pis equipped with cameras and sensors that report into our network 24/7. All of them are under uptime monitoring so that we can keep track of network availability. Every time there is a software problem, we want to know about it. For that we use a mix of external and internal tools.

Diode.io status page

Eventually we ran into a problem that was not related to our software. From time to time one of our Raspberry Pis freezes in the field, because of either a kernel or a hardware issue. In that case nothing more can be done in software: you can’t connect to the Pi, you can’t ping it, and there is no way to send it a restart command to bring it back to normal operation. When debugging these events you might find indications of such a freeze in the /var/log/kern.log file. Below is an example of a freeze that first shows a kernel exception and then just garbage: ^@^@^@^@^@

Kernel log at freeze

That’s the point at which the Pi froze. Only manually powering it down and up again brought it back to life.

Enter Watchdog

But the Pis are very resourceful tools, and one of their under-documented features is a built-in hardware watchdog. Once enabled, this little hardware timer watches for signs of life from the system and automatically power cycles the Raspberry Pi when it gets stuck.

So if you’re running your Raspberry Pi as a remote sensor that is reachable from anywhere in the world through the Diode network, we recommend enabling this hardware watchdog on your devices as well.

It takes just a few steps, run directly in a terminal on your Pi:

1) Enable the hardware watchdog on your Pi and reboot

sudo su
echo 'dtparam=watchdog=on' >> /boot/config.txt
reboot

After this reboot the hardware watchdog device will be visible to the system. The next steps install the software side that communicates with the watchdog.
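Before moving on, it can be worth confirming that the device node actually showed up. The following is a minimal sketch; the function name check_watchdog_device is our own invention for illustration, and it assumes the device appears at /dev/watchdog, the same path used in the configuration further down.

```shell
# Hypothetical helper: report whether a watchdog character device exists.
# The default path /dev/watchdog matches the watchdog-device setting used later.
check_watchdog_device() {
    wd_dev=${1:-/dev/watchdog}
    if [ -c "$wd_dev" ]; then
        echo "found: $wd_dev"
    else
        echo "missing: $wd_dev"
        return 1
    fi
}

# Only tests for the device node; it does not open the device.
check_watchdog_device || echo "hint: check dtparam=watchdog=on in /boot/config.txt and reboot"
```

If the device is reported as missing, the dtparam line from step 1 is the first thing to double-check.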

2) Install the watchdog system service

sudo apt-get update
sudo apt-get install watchdog

3) Configure the watchdog service

sudo su
echo 'watchdog-device = /dev/watchdog' >> /etc/watchdog.conf
echo 'watchdog-timeout = 15' >> /etc/watchdog.conf
echo 'max-load-1 = 24' >> /etc/watchdog.conf
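Because the commands above append with >>, running them a second time leaves duplicate lines in /etc/watchdog.conf. If you script this setup, a small helper that replaces an existing key instead of appending may be handy. This is only a sketch: set_conf is a hypothetical function of ours, not something shipped with the watchdog package, and the demonstration runs against a scratch file rather than the real configuration.

```shell
# Hypothetical helper (not part of the watchdog package): write "key = value"
# into a config file, replacing any earlier assignment of the same key.
set_conf() {
    conf_key=$1
    conf_value=$2
    conf_file=$3
    conf_tmp=$(mktemp) || return 1
    # Keep every line except existing (uncommented) assignments of this key.
    grep -v "^[[:space:]]*${conf_key}[[:space:]]*=" "$conf_file" > "$conf_tmp" 2>/dev/null || true
    # Append the new assignment, then copy back so the file's permissions are kept.
    printf '%s = %s\n' "$conf_key" "$conf_value" >> "$conf_tmp"
    cat "$conf_tmp" > "$conf_file"
    rm -f "$conf_tmp"
}

# Demonstration on a scratch file: two calls still leave one line for the key.
scratch=$(mktemp)
set_conf watchdog-timeout 15 "$scratch"
set_conf watchdog-timeout 15 "$scratch"
cat "$scratch"
rm -f "$scratch"
```

On the Pi you would point it at the real file from a root shell, for example set_conf watchdog-timeout 15 /etc/watchdog.conf.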

4) Enable the service

sudo systemctl enable watchdog
sudo systemctl start watchdog
sudo systemctl status watchdog

If everything worked, the last command should show output similar to this:

Watchdog service running

Now, the next time your Raspberry Pi freezes, the hardware watchdog will restart it automatically after 15 seconds.

If you want to test this, you can try running a fork bomb in your shell:

sudo bash -c ':(){ :|:& };:'

WARNING: Running this code will render your Raspberry Pi inaccessible until it’s reset by the watchdog.
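Once the Pi is reachable again, one simple way to convince yourself that it really restarted is to look at how long it has been up. The sketch below reads /proc/uptime, which is standard on Linux; the function name uptime_seconds is ours and purely illustrative.

```shell
# Hypothetical helper: print whole seconds since boot.
# /proc/uptime holds "<seconds since boot> <idle seconds>"; keep the integer part.
uptime_seconds() {
    up_src=${1:-/proc/uptime}
    read -r up_value _ < "$up_src" || return 1
    echo "${up_value%%.*}"
}

# A small number right after the test suggests the watchdog did its job.
uptime_seconds || echo "could not read uptime"
```

A value of a minute or two right after the fork bomb test suggests the watchdog reset the board.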

If you have any trouble with your Pi or with running Diode on it, feel free to reach out to us on Telegram and ask your questions there.

And if you want to learn more about Diode be sure to check out the Diode FAQs.

Update: July 21st

In some cases our Raspberry Pi Zero W would crash its WiFi driver without going down completely. This wouldn’t trigger the watchdog, because the device is still running, just not communicating anymore. To handle this case we added one more line to the watchdog.conf configuration file, like this:

sudo su
echo 'interface = wlan0' >> /etc/watchdog.conf

With this additional configuration line, the watchdog will also power cycle the Raspberry Pi when the WiFi interface wlan0 gets into trouble.
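As far as we understand the watchdog.conf documentation, the interface option makes the daemon check the named interface for traffic. To look at a similar counter by hand, you can read the received-bytes column for wlan0 from /proc/net/dev. This is a rough sketch rather than what the daemon itself runs, and rx_bytes is a hypothetical helper name.

```shell
# Hypothetical helper: print the received-bytes counter of one interface.
# Reads the /proc/net/dev layout: "  wlan0: <rx bytes> <rx packets> ...".
rx_bytes() {
    rx_iface=$1
    rx_src=${2:-/proc/net/dev}
    awk -v ifc="$rx_iface" '
        { line = $0; sub(/^[ \t]+/, "", line) }
        index(line, ifc ":") == 1 {
            sub(/^[^:]*:[ \t]*/, "", line)   # drop the "name:" prefix
            split(line, f, /[ \t]+/)
            print f[1]                        # first column is RX bytes
            found = 1
        }
        END { exit !found }                   # non-zero if the interface is absent
    ' "$rx_src"
}

# Run this twice a few seconds apart; a counter that never moves hints at a stuck link.
rx_bytes wlan0 || echo "wlan0 not found"
```

Comparing two readings taken a few seconds apart gives a quick picture of whether the interface is still receiving anything.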

Use your Raspberry Pi to:

1) Host a decentralized website
2) Replace Dropbox / Google Drive
3) Be a private chat server