Monitoring & Alerting
This section describes, at a high level, how to use the platform's monitoring and alerting tools to monitor and alert on the services that run on the platform.
For more detail, see the platform monitoring and alerting section.
Tools
- Google Monitoring metrics (Prometheus)
- StatsD for metrics via the OTel Collector
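As an illustration of the StatsD path, here is a minimal sketch of emitting a metric from a service. It assumes the collector accepts plain StatsD lines over UDP on `127.0.0.1:8125` (the conventional StatsD port). The address, the metric name `checkout.requests`, and the helper names `format_statsd` and `send_metric` are hypothetical, so check the platform's actual collector endpoint before relying on it.

```python
import socket

# Metric types in the basic StatsD line protocol: counter, gauge, timer.
SUPPORTED_TYPES = {"c", "g", "ms"}


def format_statsd(name: str, value: float, metric_type: str = "c") -> str:
    """Build a plain StatsD line of the form <name>:<value>|<type>."""
    if metric_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported metric type: {metric_type}")
    return f"{name}:{value}|{metric_type}"


def send_metric(line: str, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send one StatsD line over UDP (fire-and-forget, no delivery guarantee).

    The default host and port are assumptions, not the platform's
    confirmed collector address.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


if __name__ == "__main__":
    # Hypothetical metric name: count one handled request.
    send_metric(format_statsd("checkout.requests", 1))
```

UDP keeps the call cheap and non-blocking for the service, at the cost of silently dropping metrics if the collector is unreachable.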
How to write alerts and monitor your services
The monitoring-tf repo contains the Terraform resources that define monitoring and alerting for the platform.
To add alerts and monitoring for your service, create a pull request against that repo.
In future we will move towards a more developer-friendly approach to monitoring and alerting on your services. Until then, use the platform monitoring and alerting tools described here.