Presented by:

Edf6a0cfcf7b61e04b91c2794b430ef9

David Kohn (david.kohn@moat.com)

Data Engineer at Moat Inc.

David is a Data Engineer focusing mostly on Postgres at Moat Inc. How he came to do data engineering at an advertising analytics company from battery engineering and electrochemistry startups is a story too long for a short bio, but you can ask him if you'd like (it even involves PGConf!)

No video of the event yet, sorry!

At Moat we add ~500 million rows each day to just a few tables in our data warehouse. Previously, we used Postgres for data for the last month (which takes ~15TB of disk), and we love its ability to handle concurrency and its cost compared to the commercial column-stores that we use for our historical data. By using a combination of composite-types, arrays and Postgres' TOAST mechanism we were able to build a custom data warehouse that combines the best aspects of Postgres and a commercial column-store database, tailored specifically for our needs without modifying any of the Postgres source. Our new store uses ~1TB of disk/month, has sped our ETL considerably (we're now compute rather than I/O bound) and will allow us to migrate our full historical data warehouse to a single Postgres instance. This talk will cover: - An overview of our data problems and why we chose to go this route - A deep dive into the different Postgres mechanisms (TOAST, arrays, composite types, SRF's) we use to make our solution work, how we put them together, and how others might apply similar techniques to their use-cases - Strengths and weaknesses of the approach, lessons learned - Upcoming features in Postgres that are going to make this approach even more attractive - Future work/things that aren't upcoming (yet!?) that would really help us out

Date:
2017 March 31 10:00
Duration:
50 min
Room:
Liberty III
Conference:
PGConf US 2017
Language:
Track:
Use Cases
Difficulty:
Medium