Distributing big astronomical catalogues with Greenplum
Presented by:
Pilar de Teodoro
Pilar de Teodoro is a physicist and a database expert and administrator working for the European Space Agency Science Data Centre, located in Madrid, Spain. She has worked at ESA for more than 10 years, using different database systems. Before that, she was a consultant at Oracle.
When scaling up resources is no longer an option, scaling out becomes necessary. At the ESA Science Data Centre (ESDC) we expect the archive data stored in our databases to grow by about 50 TB in 2 years. The technology currently in use, vanilla PostgreSQL, will not be enough. To fulfil the user requirements of the different missions with such large amounts of data, distributed databases will be necessary. After testing other flavours of distributed PostgreSQL, such as Citusdata and Postgres-XL, we investigated the commercial parallel DBMS Greenplum. This talk describes a number of tests performed with large ESA astronomical catalogues, such as the Gaia (1.6B rows) and Euclid (2.7B rows) catalogues, to check the feasibility of the solution.
- Date:
- 2019 March 19 11:30 EDT
- Duration:
- 20 min
- Room:
- Grammercy
- Conference:
- Postgres Conference
- Language:
- Track:
- Greenplum Summit
- Difficulty:
- Easy