There’s no need to re-invent the wheel! Dozens of people have already tried…and succeeded. This talk is a categorized and illustrated overview of the most popular and/or useful PostgreSQL-specific scripts, utilities and whole toolsets that DBAs should be aware of for solving daily tasks. Topics include performance monitoring, log management/analysis, and identifying/fixing the most common administration problems around general performance metrics, tuning, locking, indexing and bloat, leaving out high-availability topics. Covered are venerable oldies from wiki.postgresql.org as well as my newer favourites from GitHub.
Create your own replication table(s) from Oracle to PostgreSQL.
## Broadening Your Database Capability with Extensions ##
The ability to extend PostgreSQL’s feature-rich environment to meet additional use case requirements is a compelling reason to incorporate PostgreSQL into your infrastructure. This presentation looks into the interesting area of extending capability through the use of extensions.
We look at what extensions are and how they work, and the potential benefits and issues associated with using extensions. We then look at some simple examples of how to write an extension (in C and SQL), install it and use it.
You will learn about PGXS, PostgreSQL’s build infrastructure for extensions, should you consider authoring and distributing extensions one day.
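To give a flavour of what the talk covers, here is a minimal sketch of an SQL-only extension built with PGXS. The extension name `hello_ext` and its function are hypothetical examples, not from the talk itself:

```
# Makefile — a minimal PGXS build file
EXTENSION = hello_ext
DATA = hello_ext--1.0.sql
PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)

-- hello_ext.control — extension metadata read by CREATE EXTENSION
default_version = '1.0'
relocatable = true
comment = 'toy example extension'

-- hello_ext--1.0.sql — the objects installed with the extension
CREATE FUNCTION hello() RETURNS text
  AS $$ SELECT 'hello, world' $$ LANGUAGE sql;
```

After `make install`, the extension would be loaded into a database with `CREATE EXTENSION hello_ext;`.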
The presentation will then finish off by looking at some of the extensions that are already available for PostgreSQL, why they could be useful to you and where you can find them.
Kubernetes is a solid leader among different cloud orchestration engines and its adoption rate is growing on a daily basis. Naturally people want to run both their applications and databases on the same infrastructure.
There are a lot of ways to deploy and run PostgreSQL on Kubernetes, but most of them are not cloud-native. Around one year ago Zalando started to run an HA setup of PostgreSQL on Kubernetes, managed by Patroni. Those experiments were quite successful and produced a Helm chart for Patroni. That chart was useful, albeit with a single problem: Patroni depended on etcd, ZooKeeper or Consul.
Few people look forward to deploying two applications instead of one and supporting them later on. In this talk I would like to introduce Kubernetes-native Patroni. I will explain how Patroni uses the Kubernetes API to run leader election and store the cluster state. I’m going to live-demo a deployment of an HA PostgreSQL cluster on Minikube and share our own experience of running more than 130 clusters on Kubernetes.
Patroni is a Python open-source project developed by Zalando in cooperation with other contributors on GitHub: https://github.com/zalando/patroni
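As a rough illustration of what a Kubernetes-native setup looks like, the relevant part of a Patroni configuration might resemble the sketch below. The values are hypothetical placeholders; exact keys and defaults should be checked against the Patroni documentation:

```
# patroni.yml fragment — using Kubernetes itself as the DCS,
# so no separate etcd/ZooKeeper/Consul deployment is required
scope: demo-cluster
kubernetes:
  namespace: default
  labels:
    application: patroni
    cluster-name: demo-cluster
  # store leader/config state in Endpoints objects
  use_endpoints: true
```

With this configuration, Patroni performs leader election by updating annotations on Kubernetes objects instead of writing to an external key-value store.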
I will be sharing my learnings from an integration project implemented for one of the top insurance firms in APAC. The project involved the exchange of XML messages between the insurance firm and its partner bank. Initially the project was implemented using an ETL tool with PostgreSQL as the source database. Later we modified the solution to use the XML features of PostgreSQL.
These XML features of PostgreSQL helped us off-load the complex ETL processing to the database. I will be sharing the improvement achieved by off-loading XML creation from the ETL process to PostgreSQL, which helped us achieve better response times for XML generation.
In this session, I will walk through the various PostgreSQL XML functions available and how they can be used for generating complex nested XML documents from relational objects. I will cover some simple use cases and examples of XML functions like –
I will be sharing how the features offered by PostgreSQL also helped us keep the code maintainable, compared to building the same using a programming language such as Java or an ETL solution.
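To illustrate the kind of nested XML generation the talk describes, here is a hedged sketch using PostgreSQL’s built-in `xmlelement`, `xmlattributes`, `xmlforest` and `xmlagg` functions. The `orders` and `order_items` tables are hypothetical examples, not from the project:

```sql
-- build one <order> element per row, with nested <item> children
SELECT xmlelement(
         name order,
         xmlattributes(o.id AS id),
         xmlforest(o.customer, o.total),
         (SELECT xmlagg(xmlelement(name item, xmlforest(i.sku, i.qty)))
            FROM order_items i
           WHERE i.order_id = o.id))
FROM orders o;
```

Because the aggregation and nesting happen inside the database, the ETL layer only has to ship the finished document rather than assemble it row by row.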
This talk was presented at Percona Live Europe 2017. However, we did not have enough time to test against more scenarios. We will be giving an updated talk with more comprehensive tests and numbers. We hope to run it against Citus and MongoRocks as well to provide a comprehensive comparison.
Your PostgreSQL database is one of the most important pieces of your architecture – yet the level of introspection available in Postgres is often hard to work with. It’s easy to get very detailed information, but what should you really watch out for, report on and alert on?
In this talk we’ll discuss how query performance statistics can be made accessible to application developers, critical entries one should monitor in the PostgreSQL log files, how to collect EXPLAIN plans at scale, how to watch over autovacuum and VACUUM operations, and how to flag issues based on schema statistics.
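As a small taste of the query-statistics side, a common starting point is the `pg_stat_statements` extension. The query below is a sketch using the column names from PostgreSQL 13 and later (older versions use `total_time` instead of `total_exec_time`):

```sql
-- assumes: CREATE EXTENSION pg_stat_statements;
-- and shared_preload_libraries = 'pg_stat_statements'
SELECT query,
       calls,
       total_exec_time / calls AS mean_ms,
       rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```

Ranking by cumulative execution time rather than call count surfaces the queries where an optimization buys the most.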
We’ll also talk a bit about monitoring multi-server setups, first going into high availability and read standbys, then logical replication, and finally reviewing what monitoring looks like for sharded databases like Citus.
The talk will primarily describe free/open-source tools and statistics views readily available from within Postgres.
PostgreSQL and reliability go hand-in-hand – but your data is only truly
safe with a solid and trusted backup system in place, and no matter how
good your application is, it’s useless if it can’t talk to your database.
In this talk we’ll demonstrate how to set up a reliable replication
cluster using open source tools closely associated with the PostgreSQL
project. The talk will cover the following areas:
Ian Barwick has worked for 2ndQuadrant since 2014, and as well as making various
contributions to PostgreSQL itself, is the lead `repmgr` developer. He lives in
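For orientation, bootstrapping a two-node cluster with `repmgr` typically follows a sequence like the sketch below. Hostnames and paths are hypothetical placeholders; consult the `repmgr` documentation for the full procedure:

```
# on the primary: register it with repmgr
repmgr -f /etc/repmgr.conf primary register

# on the standby: clone the primary's data directory, start, register
repmgr -h primary-host -U repmgr -d repmgr -f /etc/repmgr.conf standby clone
pg_ctl -D /var/lib/postgresql/data start
repmgr -f /etc/repmgr.conf standby register

# verify the cluster state from any node
repmgr -f /etc/repmgr.conf cluster show
```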
It is a usual approach in data science to dump the database and build models on outdated information. In some cases this is unacceptable. So in this session we will discuss how PostgreSQL can help, using a real project as an example.
First of all, we will discuss the problem data scientists face in their jobs when they want to use PostgreSQL. This problem is unstructured data, which has to be stored and accessed somehow. In relational databases this can be a challenging task, but the latest SQL standard offers a solution: PostgreSQL already provides the special type “jsonb”, which can solve the problem.
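A brief sketch of how `jsonb` lets unstructured documents live alongside relational columns (table and field names are illustrative only):

```sql
CREATE TABLE events (
    id      bigserial PRIMARY KEY,
    payload jsonb NOT NULL
);

INSERT INTO events (payload)
VALUES ('{"user": 42, "action": "click", "tags": ["ui", "button"]}');

-- extract a field as text, filter by containment
SELECT payload->>'action'
FROM events
WHERE payload @> '{"user": 42}';

-- a GIN index makes containment queries on jsonb efficient
CREATE INDEX ON events USING gin (payload);
```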
The next theme of the session is “real-time data processing”. Nowadays PostgreSQL can handle lots of different connections and data changes. In order to work with up-to-date data, it is necessary to react instantly to any changes. In this part, we will use stored functions and features like “transform” and “plpython”.
Stored functions are really powerful tools, but sometimes it is hard to implement the needed feature in them. So the next block of the session will be dedicated to plpython and using data science models, with plpython + scikit-learn as an example.
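One way such a combination can look is sketched below: a PL/Python function that loads a pre-trained scikit-learn model and scores rows in-database. The function name, model path, and feature layout are hypothetical; `SD` is PL/Python’s per-session dictionary for caching state between calls:

```sql
CREATE EXTENSION IF NOT EXISTS plpython3u;

CREATE FUNCTION predict_score(features float8[])
RETURNS float8 AS $$
    import pickle
    # cache the deserialized model in the session dictionary
    if 'model' not in SD:
        with open('/var/lib/postgresql/model.pkl', 'rb') as f:
            SD['model'] = pickle.load(f)
    return float(SD['model'].predict([features])[0])
$$ LANGUAGE plpython3u;

-- score live rows without exporting data from the database
-- SELECT id, predict_score(ARRAY[f1, f2, f3]) FROM observations;
```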
Pgpool-II has been around to complement PostgreSQL for over a decade and provides many features like connection pooling, failover, query caching, load balancing, and HA. High availability (HA) is critical to most enterprise applications: clients need the ability to automatically reconnect to a secondary node when the master node goes down.
This is where Pgpool-II’s watchdog feature comes in: the watchdog is the core feature of Pgpool-II that provides HA by eliminating the single point of failure (SPOF). The watchdog has been around for a while, but it went through a major overhaul and enhancements in recent releases. This talk aims to explain the watchdog feature and the recent enhancements that went into it, and to describe how it can be used to provide PostgreSQL HA and automatic failover.
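As a rough sketch, enabling the watchdog involves a handful of `pgpool.conf` settings along the lines below. Hostnames and addresses are placeholders, and parameter names have changed between Pgpool-II versions, so the release-specific documentation should be consulted:

```
# pgpool.conf fragment — watchdog on this node
use_watchdog = on
wd_hostname = 'pgpool-node1'
wd_port = 9000

# virtual IP that clients connect to; moves on failover
delegate_IP = '192.168.1.100'

# heartbeat target: the peer watchdog node
heartbeat_destination0 = 'pgpool-node2'
```

With the watchdog active, the surviving Pgpool-II node takes over the delegate IP when its peer fails, so clients reconnect without configuration changes.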
There is a rising trend of enterprise deployments shifting to cloud-based environments, and Pgpool-II can be used in the cloud without any issues. In this talk we will also give some ideas of how Pgpool-II can be used to provide PostgreSQL HA in cloud environments.
Finally, we will summarise the major features that have been added in the most recent major release of Pgpool-II and what’s in the pipeline for the next major release.
Aiven PostgreSQL is available in five different public cloud providers’ infrastructure in more than 60 regions around the world, including 18 in APAC. This has given us a unique opportunity to benchmark and compare performance of similar configurations in different environments.
We’ll share our benchmark methods and results, comparing various PostgreSQL configurations and workloads across different clouds.
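For context, a common baseline for this kind of cross-cloud comparison is `pgbench`, PostgreSQL’s bundled benchmarking tool; whether Aiven’s methodology uses it is not stated in the abstract. A typical invocation looks like:

```
# initialize a benchmark database at scale factor 100 (~1.5 GB)
pgbench -i -s 100 benchdb

# run the default TPC-B-like workload:
# 16 client connections, 4 worker threads, for 300 seconds
pgbench -c 16 -j 4 -T 300 benchdb
```

Holding the scale factor, client count, and duration constant across providers is what makes the resulting transactions-per-second figures comparable.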
While physical replication in PostgreSQL is quite robust, it doesn’t fit well in the picture when:
The answer to these use cases is logical replication.
This talk will discuss and cover these use cases followed by a logical replication demo.
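The core setup that such a demo revolves around is small; a sketch with placeholder names (`mypub`, `mysub`, a hypothetical `accounts` table) looks like this, using the built-in logical replication available since PostgreSQL 10:

```sql
-- on the publisher (requires wal_level = logical)
CREATE PUBLICATION mypub FOR TABLE accounts;

-- on the subscriber: an identically-defined table must already exist
CREATE SUBSCRIPTION mysub
    CONNECTION 'host=publisher dbname=appdb user=repuser'
    PUBLICATION mypub;
```

Unlike physical replication, this replicates row changes for the chosen tables only, so the subscriber can run a different major version or carry additional tables and indexes.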
Bringing the provability and immutability of blockchain to the performance and efficiency of a traditional DBMS.
Blockchain technology has several unique properties, including provability and immutability. Every blockchain transaction is signed by its author, and it can be verified by any blockchain network member. Also, once data is stored in a blockchain, it can’t be altered in the future. Many databases operating on traditional DBMSs would also benefit from the provability and immutability properties. However, including all transaction data in a public blockchain is very expensive.
Credereum is a platform which allows creating and maintaining databases whose contents and history are provable and immutable, without sacrificing the performance and efficiency of a traditional DBMS. Thanks to Credereum, a database owner can prove the validity of query results, while users can verify them. The database owner doesn’t have to reveal the whole database contents or the full history of transactions to provide proof of database query results; therefore, a Credereum database may contain private, sensitive information. Credereum utilizes bleeding-edge technologies including, but not limited to, decentralized cloud and a public blockchain with sharding. Credereum is an emerging technology for trusted and private databases.
We will explain why PostgreSQL is a suitable database for Credereum and what we need to develop in Postgres to support signed transactions and cryptographic storage.