Free host metrics, graphs, and alerting on DigitalOcean

Host metrics

What are host metrics? They are measurements taken from the (virtual) servers running your application, as opposed to application metrics such as requests per second or heap size. Here are the most common host metrics we care about:

CPU
Load
Memory
Disk I/O
Disk Usage
Bandwidth
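
To make these concrete, here is a minimal Python sketch that reads a few of them directly from a Linux host using only the standard library. It is not how the DigitalOcean agent collects its data; the function name host_snapshot and the keys it returns are made up for illustration.

```python
import os
import shutil


def host_snapshot(path="/"):
    """Collect a few host-level readings (hypothetical helper, Unix only)."""
    # Load averages over the last 1, 5, and 15 minutes.
    load_1m, load_5m, load_15m = os.getloadavg()
    # Total, used, and free bytes for the filesystem containing `path`.
    disk = shutil.disk_usage(path)
    return {
        "cpu_count": os.cpu_count(),
        "load_1m": load_1m,
        "load_5m": load_5m,
        "load_15m": load_15m,
        "disk_used_percent": round(100 * disk.used / disk.total, 1),
    }


if __name__ == "__main__":
    for name, value in host_snapshot().items():
        print(f"{name}: {value}")
```

A load average that stays above cpu_count for a long time is usually a sign the host is overloaded, which is why load and CPU tend to be watched together.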

If the DigitalOcean host agent is installed, you will be able to see graphs of all of these over the last 1 hour, 6 hours, 24 hours, 7 days, and 14 days. Here's what they look like:

[Figure: DO Agent]

Luckily, these metrics are completely free of charge. The DigitalOcean agent does have to be installed on every Droplet (virtual server), though, and running it takes a small share of your resources. Installing the agent is just a tick box when provisioning a new server, so it doesn't require any real work. Just don't forget to tick the box. You can check that the agent is running, and how much it uses, with ps:

$ ps aux | grep do-agent
do-agent    1832  0.0  0.0   4784   384 ?        Ss   Mar10   0:00 /bin/bash ./start.sh
do-agent    1961  0.0  0.8 206464 16520 ?        S    Mar10   0:01 Xvfb :99 -screen 0 1024x768x16 -nolisten tcp -nolisten unix
do-agent    1962  0.0  0.0   2644   128 ?        S    Mar10   0:00 dumb-init -- node ./build/index.js
do-agent    1966  0.7  5.2 11379780 105168 ?     Ssl  Mar10 650:42 node ./build/index.js
do-agent  267239  0.0  0.7 1238588 14704 ?       Ssl  Mar27  10:35 /opt/digitalocean/bin/do-agent --syslog

Monitoring alerts

Perhaps even better than the graphs themselves are alerts. They tell you in time that something is wrong and let you prevent many disasters. However, you won't find the setting on the Droplet page; you need to open Monitoring from the left menu.

Once it is open, you can review your existing resource alerts or create new ones.

Here's an example of an alert:

CPU utilization (metric) is above (rule) 70% (threshold) for 10 minutes (duration).
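
The four parts of that sentence map naturally onto a small data structure. The Python sketch below is one possible reading of such a rule, assuming one sample per minute and requiring every sample in the window to exceed the threshold. DigitalOcean's own evaluation may differ (for example, it may average over the window); AlertRule and is_firing are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class AlertRule:
    metric: str            # e.g. "CPU utilization"
    threshold: float       # e.g. 70 for 70%
    duration_minutes: int  # how long the condition must hold (must be > 0)


def is_firing(rule, samples):
    """Return True if the rule is violated for the whole duration.

    `samples` holds one reading per minute, oldest first (an assumption).
    """
    if rule.duration_minutes <= 0 or len(samples) < rule.duration_minutes:
        return False  # not enough data to cover the full window
    window = samples[-rule.duration_minutes:]
    return all(value > rule.threshold for value in window)


if __name__ == "__main__":
    rule = AlertRule("CPU utilization", threshold=70, duration_minutes=10)
    print(is_firing(rule, [90] * 10))        # sustained load
    print(is_firing(rule, [90] * 9 + [50]))  # load dropped in the last minute
```

Requiring the condition to hold for the full duration is what keeps short spikes from triggering a notification.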

You can choose from these metrics:

CPU utilization
1 minute load average
5 minute load average
15 minute load average
Disk Read I/O
Disk Write I/O
Public outbound bandwidth
Public inbound bandwidth
Private outbound bandwidth
Private inbound bandwidth

Next, select the specific hosts to watch, or simply apply the alert to all of them. Super handy.

Finally, choose email or Slack for the notifications, and you are done!
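
If you would rather script this than click through the UI, the same alert can likely be created through DigitalOcean's API. The Python sketch below only builds the request body and sends nothing. The field names and values (type, compare, window, entities, alerts) follow my understanding of the alert policy endpoint, POST /v2/monitoring/alerts, and should be checked against the current API reference; build_alert_policy, the email address, and the Droplet ID are placeholders.

```python
import json


def build_alert_policy(emails, droplet_ids, threshold=70, window="10m"):
    """Build a request body for a CPU alert policy (hypothetical helper).

    Field names are assumptions based on the DigitalOcean monitoring API;
    verify them before use.
    """
    return {
        "type": "v1/insights/droplet/cpu",   # metric: CPU utilization
        "compare": "GreaterThan",            # rule: is above
        "value": threshold,                  # threshold, in percent
        "window": window,                    # duration
        "entities": list(droplet_ids),       # Droplet IDs to watch
        "tags": [],
        "alerts": {"email": list(emails), "slack": []},
        "description": f"CPU above {threshold}% for {window}",
        "enabled": True,
    }


if __name__ == "__main__":
    body = build_alert_policy(["ops@example.com"], ["123456"])
    print(json.dumps(body, indent=2))
```

Sending it would be a POST with a bearer token in the Authorization header, which is left out here on purpose.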

Although simpler than proper commercial tools, these metrics and alerts are likely all you need before you go big.