Presented by:

Jason Yee

Datadog

Jason is a technical evangelist at Datadog, where he works to inspire developers and ops engineers with the power of metrics and monitoring. Previously, he was the community manager for DevOps & Performance at O’Reilly Media and a software engineer at MongoDB. He’s currently exploring the world while living as a nomad and would love to hear about the part of the world that you call home.

At Datadog, we handle trillions of data points per day from the thousands of customers that rely on us to monitor their applications and infrastructure. In this session, I’ll share not only how we’ve scaled PostgreSQL to handle the deluge of data, but also how we’ve made our PostgreSQL systems more resilient.

I’ll also discuss which metrics to watch and how troubleshooting based on those metrics will help you solve problems more quickly. In this session, we will look at a framework for your metrics and how to use it to find solutions to the issues that come up.

We will cover the three types of monitoring data; what to collect; what should trigger an alert (avoiding an alert storm and pager fatigue); and how to follow the resources to find the root causes of problems.

The focus of this session is not tool-specific, so attendees will leave with strategies and frameworks they can implement in their environments today, regardless of the platforms and tools they use.

Date:
2018 April 18 11:00 EDT
Duration:
20 min
Room:
Liberty I
Conference:
PostgresConf US 2018
Language:
English
Track:
Use Cases
Difficulty:
Easy