planet postgresql

Subscribe to the Planet PostgreSQL feed
Planet PostgreSQL
Updated: 1 hour 57 minutes ago

Julien Rouhaud: PoWA 4: changes in powa-archivist!

2019-06-05 (Wed) 23:26:17

This article is part of the PoWA 4 beta series, and describes the changes done in powa-archivist.

For more information about this v4, you can consult the general introduction article.

Quick overview

First of all, you have to know that there is no upgrade path from v3 to v4, so a DROP EXTENSION powa is required if you were already using PoWA on any of your servers. This is because this v4 involved a lot of changes in the SQL part of the extension, making it the most significant change in the PoWA suite for this new version. Looking at the amount of changes at the time I’m writing this article, I get:

                        |   14 +
     powa--4.0.0dev.sql | 2075 +++++++++++++++++++++-------
     powa.c             |   44 +-
     3 files changed, 1629 insertions(+), 504 deletions(-)

The lack of an upgrade path shouldn’t be a problem in practice, though. PoWA is a performance tool, so it’s intended to hold high-precision data with a very limited history. If you’re looking for a general monitoring solution keeping months of counters, PoWA is definitely not the tool you need.

Configuring the list of remote servers

Concerning the features themselves, the first small change is that powa-archivist does not require the background worker to be active anymore, as it won’t be used in a remote setup. That means that a PostgreSQL restart is not needed anymore to install PoWA. Obviously, a restart is still required if you want to use the local setup, using the background worker, or if you want to install additional extensions that themselves require a restart.

Then, as PoWA needs some configuration (frequency of snapshots, data retention and so on), some new tables are added to be able to configure all of that. The new powa_servers table stores the configuration for all the remote instances whose data should be stored on this instance. This local PoWA instance is called a repository server (and should typically be dedicated to storing PoWA data), as opposed to the remote instances, which are the instances you want to monitor. The conte
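The excerpt cuts off here, but to make the configuration tables concrete, a registration call on the repository server might look like the following. This is a hedged sketch: the powa_register_server function name and its parameters are assumptions based on the description above, not verified against the PoWA 4 API.

```sql
-- Hypothetical: register a remote instance to be monitored, with its
-- snapshot frequency and retention; the row ends up in powa_servers.
SELECT powa_register_server(
    hostname  => 'remote1.example.com',
    port      => 5432,
    alias     => 'remote1',
    frequency => 300,                  -- snapshot every 5 minutes
    retention => '1 day'::interval     -- keep a limited history only
);
```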

Category: postgresql

Hans-Juergen Schoenig: Tech preview: Improving COPY and bulkloading in PostgreSQL 12

2019-06-05 (Wed) 22:03:47

If you are relying heavily on the PostgreSQL COPY command to load data into PostgreSQL quickly, PostgreSQL 12 might offer a feature which is most likely very beneficial to you. Bulk loading is an important operation, and every improvement in this area is certainly going to help the many people out there who want to import data into PostgreSQL as fast as possible.

COPY: Loading and unloading data as fast as possible

When taking a closer look at the syntax of the COPY command in PostgreSQL 12 you will quickly see two things:

• \h will now point to the correct page in the documentation
• COPY now supports a WHERE condition

Here is the complete syntax overview:

db12=# \h COPY
Command:     COPY
Description: copy data between a file and a table
Syntax:
COPY table_name [ ( column_name [, ...] ) ]
    FROM { 'filename' | PROGRAM 'command' | STDIN }
    [ [ WITH ] ( option [, ...] ) ]
    [ WHERE condition ]

COPY { table_name [ ( column_name [, ...] ) ] | ( query ) }
    TO { 'filename' | PROGRAM 'command' | STDOUT }
    [ [ WITH ] ( option [, ...] ) ]

where option can be one of:
    FORMAT format_name
    FREEZE [ boolean ]
    DELIMITER 'delimiter_character'
    NULL 'null_string'
    HEADER [ boolean ]
    QUOTE 'quote_character'
    ESCAPE 'escape_character'
    FORCE_QUOTE { ( column_name [, ...] ) | * }
    FORCE_NOT_NULL ( column_name [, ...] )
    FORCE_NULL ( column_name [, ...] )
    ENCODING 'encoding_name'

URL:

While having a link to the documentation around is certainly beneficial, the WHERE condition added in PostgreSQL 12 might be even more important. What is the purpose of this new feature? So far it was only possible to import a file completely. However, in some cases this has been a problem: more often than not, people only wanted to load a subset of the data, and had to write a ton of code to filter it before the import, or after the data had already been written into the database.
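To make the difference concrete, here is a small, hypothetical example (the table name, file path and columns are invented; only the WHERE clause itself comes from the PostgreSQL 12 syntax above):

```sql
CREATE TABLE t_sales (id int, amount numeric, region text);

-- Only rows matching the condition are loaded; everything else in the
-- file is read and discarded, with no staging table or pre-filtering.
COPY t_sales FROM '/tmp/sales.csv' WITH (FORMAT csv)
    WHERE region = 'EU' AND amount > 0;
```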

COPY … WHERE: Applying filters while importing data

In PostgreSQL data can be filtered

Category: postgresql

Venkata Nagothi: How to Optimize PostgreSQL Logical Replication

2019-06-05 (Wed) 18:48:00

Logical Replication, or pglogical, is a table-level, WAL-based replication mechanism which replicates the data of specific tables between two PostgreSQL instances. There seems to be confusion between “pglogical” and “Logical Replication”. Both of them provide the same kind of replication mechanism, with some differences in features and capabilities. Logical Replication was introduced in PostgreSQL 10 as an in-built feature, unlike pglogical, which is an extension. “pglogical”, with ongoing continuous development, remains the only option for implementing Logical Replication in environments using PostgreSQL versions prior to 10. Eventually, all the features of pglogical will be part of Logical Replication. In other words, pglogical (the extension) became Logical Replication (the in-built feature). The basic advantage of Logical Replication is that it does not need any extensions to be installed or created, which is in turn beneficial to those environments wherein installing extensions is restricted.


This blog will focus on optimizing Logical Replication. That means the optimization tips and techniques highlighted in this blog will apply to both pglogical and Logical Replication.

Logical Replication is WAL-based replication, the first of its kind. As a DBA, I find it a much more reliable and performant replication mechanism compared to trigger-based replication solutions. The changes made to the tables that are part of pglogical replication are replicated in real time via WAL records, which makes it highly efficient and uncomplicated. All of the other replication mechanisms on the market are trigger-based, which can pose performance and maintenance challenges. With Logical Replication, the dependency on trigger-based replication is almost gone.

There are other blogs which explain how to configure Logical Replication in quite some detail.
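For reference, the in-built flavor boils down to a publication on the source and a subscription on the target; the names and connection string below are placeholders, not taken from any particular setup:

```sql
-- On the publisher (source) instance:
CREATE PUBLICATION sales_pub FOR TABLE sales, customers;

-- On the subscriber (target) instance; this performs the initial copy
-- and then streams changes decoded from the publisher's WAL:
CREATE SUBSCRIPTION sales_sub
    CONNECTION 'host=source.example.com dbname=shop user=repuser'
    PUBLICATION sales_pub;
```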


Category: postgresql

Michael Paquier: Postgres 12 highlight - Table Access Methods and blackholes

2019-06-04 (Tue) 14:16:34

Postgres is very nice when it comes to being extended with custom plugins, with many sets of facilities available, like:

After a heavy refactoring of the code, Postgres 12 ships with basic infrastructure for table access methods, which allows customizing how table data is stored and accessed. By default, all tables in PostgreSQL use the historical heap, which works on a page-based method (8kB pages in segment files of 1GB, at default sizes), with full tuple versions stored. This means, in simple words, that even updating one attribute of a tuple requires storing a full new version of it, which also makes the work related to vacuum and autovacuum more costly. The goal of this post is not to discuss that, though, and there is documentation on the matter, so please feel free to refer to it.

Table access methods are really cool, because they basically allow plugging into Postgres an equivalent of MySQL’s storage engines, making it possible to implement things like columnar storage, an area where heap is weak. What can be done with them falls roughly into two categories:

  • Access methods going through the storage manager of Postgres, which make use of the existing shared buffer layer, with the existing page format. This has two advantages: backups and checksums are normally, and mostly, automatically supported.
  • Access methods not going through Postgres, which have the advantage of not relying on Postgres shared buffers (the page format can be a problem as well), making it possible to rely fully on the OS cache. Note that it is then up to you to add support for checksumming, backups, and such.

Access methods invite comparison with foreign data wrappers, but the reliability is much different: one big point is that they are fully transactional with the backend they work with, which is usually a big deal for applications, and they have transparent DDL and command su
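The excerpt is truncated, but the user-facing surface of the feature is small. Assuming an access method such as the blackhole_am hinted at in the title is installed, choosing it is a matter of the new USING clause (everything else below is stock PostgreSQL 12):

```sql
-- List the table access methods known to the server; PostgreSQL 12
-- ships with "heap" as the only built-in one.
SELECT amname FROM pg_am WHERE amtype = 't';

-- Create a table with a custom access method (assumed installed).
CREATE TABLE blackhole_tab (id int) USING blackhole_am;

-- Omitting USING falls back to default_table_access_method ("heap").
CREATE TABLE heap_tab (id int);
```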

Category: postgresql

Wim Bertels: 2019

2019-06-03 (Mon) 21:47:00
A round up of the first

Stefan F. already wrote a nice blog about the conference.

Most of the presentations, a couple of recordings and a few photos are available on the website.

Using the well-known and broadly spread technique of inductive reasoning, we came to the conclusion that this first conference was a success, as was the artwork. No animals or elephants were hurt during this event.

The statistics are:

  • 53 participants, with a low standard deviation regarding attendance time

    • depending on the session, an extra 5 to 30 students attended as well

  • 11 speakers

  • 5 sponsors

This conference wouldn’t have been possible without the help of volunteers.
To conclude, a big thank you to all the speakers, sponsors and attendants.
Without them, a conference is just like a tea party.

Category: postgresql

Regina Obe: PostGIS 3.0.0alpha2

2019-06-02 (Sun) 09:00:00

The PostGIS development team is pleased to release PostGIS 3.0.0alpha2.

This release works with PostgreSQL 9.5-12beta1 and GEOS >= 3.6

Best served with PostgreSQL 12beta1.

Continue Reading by clicking title hyperlink ..
Category: postgresql

elein mustain: Beautiful things, strings.

2019-06-01 (Sat) 08:45:09
This blog today is going to talk about strings: how they are stored, how they are input, and lots of examples of how to use string operators and functions in order to manipulate them. Strings, strings, strings. What we are not going to cover is regular expressions, although we will use them. The Fine Manual […]
Category: postgresql

Magnus Hagander: Nordic PGDay 2020 - Date and location

2019-06-01 (Sat) 06:44:22

We're happy to announce that Nordic PGDay 2020 will be held in Helsinki, Finland, on March 24th, 2020, at the Hilton Helsinki Strand. Join us for a full day of PostgreSQL content!

For now, mark your calendars -- registrations and call for papers will open in the fall!

Category: postgresql

Bruce Momjian: Updated Sharding Presentation

2019-06-01 (Sat) 01:00:00

I presented my sharding talk today at PGCon in Ottawa. The slides have been updated to more clearly show what has been accomplished toward the goal of built-in sharding, and what remains to be done. The talk was well attended. I also attended a breakfast meeting this morning about sharding.

Category: postgresql

Álvaro Hernández: What PostgreSQL Ibiza is and why you should attend

2019-05-31 (Fri) 16:49:11
PostgreSQL for Thinkers

Please check PostgreSQL Ibiza’s conference website. If you haven’t done so, please also take two minutes to watch our conference video. Now if you think that PostgreSQL Ibiza is just another PostgreSQL conference, only held on the beach (which is not that bad anyway!), then keep reading. Because it is much more than that. PostgreSQL Ibiza is a new, disruptive PostgreSQL conference. We are all in a rush. I have been to many dozens of PostgreSQL conferences in the last decade.
Category: postgresql

Luca Ferrari: Normalize to save space

2019-05-31 (Fri) 09:00:00

It is no surprise at all: a normalized database requires less space on disk than a non-normalized one.

Normalize to save space

Sometimes you get a database that Just Works (tm), but whose data is not normalized. I’m not a big fan of data normalization; I mean, it surely matters, but I don’t tend to over-normalize data ahead of design. However, one of my databases was growing more and more because of a table with a few pieces of repeated extra information.
Of course a normalized database gives you some disk space back at the cost of joins during query execution, but having a decent server and a small join table is enough to sleep at night!
Let’s see what we are talking about:

mydb=# select pg_size_pretty( pg_database_size( 'mydb' ) );
 pg_size_pretty
----------------
 13 GB
(1 row)

Ok, 13 GB is not something scary; let’s say it is a fair database to work on (please note the size is reported after a full VACUUM). In this database, I have a table root that handles a lot of data from hardware sensors; the table is, of course, partitioned on a time-based scale. One thing the table was storing was the sensor name, a text string repeated over and over in the child tables too. While this was not a problem in the beginning, it wasted more and more space over time.

Shame on me!

Let’s go normalize the table!
Normalizing a table is quite straightforward, and I’m not interested in sharing the details here. Let’s say this was quite easy because my...
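For readers who do want the details, the shape of the change is the classic one; the table and column names below are invented for illustration and are not from the author’s actual schema:

```sql
-- Store each sensor name once, instead of repeating the text string
-- on every measurement row (and in every partition).
CREATE TABLE sensor (
    pk   serial PRIMARY KEY,
    name text NOT NULL UNIQUE
);

ALTER TABLE root ADD COLUMN sensor_pk int REFERENCES sensor (pk);

INSERT INTO sensor (name)
    SELECT DISTINCT sensor_name FROM root;

UPDATE root r
   SET sensor_pk = s.pk
  FROM sensor s
 WHERE s.name = r.sensor_name;

ALTER TABLE root DROP COLUMN sensor_name;
```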

Category: postgresql

Dimitri Fontaine: Introducing pg_auto_failover: A high availability and automated failover Postgres extension

2019-05-30 (Thu) 23:02:00

As part of the Citus team (Citus scales out Postgres horizontally, but that’s not all we work on), I’ve been working on pg_auto_failover for quite some time now, and I’m excited that we have now introduced pg_auto_failover as open source, to give you automated failover and high availability!

When designing pg_auto_failover, our goal was this: to provide an easy-to-set-up Business Continuity solution for Postgres that implements fault tolerance of any one node in the system. The documentation chapter about the pg_auto_failover architecture includes the following:

It is important to understand that pg_auto_failover is optimized for Business Continuity. In the event of losing a single node, pg_auto_failover is capable of continuing the PostgreSQL service, and prevents any data loss when doing so, thanks to PostgreSQL Synchronous Replication.

Introduction to pg_auto_failover

The pg_auto_failover solution for Postgres is meant to provide an easy-to-set-up and reliable automated failover solution. It includes software-driven decision making for when to implement failover in production.

The most important part of any automated failover system is the decision-making policy, and we have a whole documentation chapter online about pg_auto_failover fault tolerance mechanisms.

When using pg_auto_failover, multiple active agents are deployed to keep track of your production Postgres setup properties:

  • the monitor, a Postgres database itself equipped with the pg_auto_failover extension, registers and checks the health of the active Postgres nodes.

  • each Postgres node that is registered in the pg_auto_failover monitor must also run a local agent, the pg_autoctl run service.

  • each Postgres service that is managed has two Postgres nodes set up together in the same group. A single monitor setup may manage as many Postgres groups as needed.

With such a deployment, the monitor connects to every registered node on a regular schedule (20s by default) and registers success or failure in its pgautofailover.node table.
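The excerpt stops before any setup commands, so as a rough sketch only (the paths and connection string are placeholders; consult the pg_auto_failover documentation for the real invocations):

```shell
# Create and start the monitor: a Postgres instance carrying the
# pg_auto_failover extension.
pg_autoctl create monitor --pgdata /var/lib/pgsql/monitor

# On each of the two Postgres nodes of a group, register against the
# monitor; the first becomes primary, the second a secondary.
pg_autoctl create postgres \
    --pgdata /var/lib/pgsql/node \
    --monitor postgres://autoctl_node@monitor.example.com/pg_auto_failover

# Keep the local agent running so the monitor can orchestrate failover.
pg_autoctl run --pgdata /var/lib/pgsql/node
```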

In additio

Category: postgresql

Dave Page: Avoiding Gmail's confidential mode

2019-05-30 (Thu) 06:11:00
So this is one of the very few (maybe the first?) blog entries I've written that aren't directly related to PostgreSQL. However, it does affect how I (and others) may work on the project.

Last night I received an email from Google about my personal G Suite account, which I use for all my day-to-day email, most of which relates to work on pgAdmin and PostgreSQL. Google was proudly announcing the rollout of their new Gmail Confidential Mode update. If you've not come across this yet, essentially it allows users to send emails that can be deleted or expired after a certain amount of time, optionally require SMS verification to open, and prevent printing (but not screenshots, of course), forwarding, downloading and so on.

When using the Gmail web interface, this all works fairly seamlessly, and I can see why some people would want it if that's all they use. However, like many people, I also use other clients, for example via IMAP. In that case, instead of the original email, Gmail sends a placeholder email containing a link that allows you to log in to Google and view the message online (assuming the SMS verification passes and the message hasn't been deleted or expired, of course). That's going to be quite inconvenient for me; besides which, I really don't want anyone to be able to control access to emails they've sent me after I've received them.

There's another problem affecting PostgreSQL's mailing lists, however. How long will it be until someone sends such a message to one of the PostgreSQL lists, where it will do nothing but clutter up the archives and annoy other users (who won't be able to read the message anyway, as they won't be able to log in to Google as the list address, whatever it was)?

Fixing the PostgreSQL mail servers

After some discussion with some of the PostgreSQL sysadmin team, we discovered that Gmail adds a header to the messages that have confidential mode enabled (X-Gm-Locker: <token>). This is easy for us to [...]
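As a sketch of the detection side only (the X-Gm-Locker header name comes from the post; the filtering policy itself is whatever the list servers implement):

```python
from email import message_from_string

def is_confidential(raw_message: str) -> bool:
    """Return True if the message carries Gmail's confidential-mode header."""
    msg = message_from_string(raw_message)
    return msg.get("X-Gm-Locker") is not None

# Two toy messages: one flagged by Gmail, one ordinary.
confidential = "X-Gm-Locker: sometoken\nSubject: hi\n\nplaceholder body\n"
normal = "Subject: hi\n\nregular body\n"

print(is_confidential(confidential))  # True
print(is_confidential(normal))        # False
```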
Category: postgresql

elein mustain: Swoop de Dupe

2019-05-30 (Thu) 01:31:36
The problem: duplicate rows. Once upon a time, there was a database person who knew about referential integrity and its importance when creating database tables. The interesting part is that referential integrity, including primary keys, enables keeping with the standard of not having duplicate rows. However, they just wanted a private little table on their […]
Category: postgresql

Luca Ferrari: PostgreSQL is almost the best (according to Stack Overflow Survey)

2019-05-29 (Wed) 09:00:00

Stack Overflow 2019 Survey results are available, and PostgreSQL is almost leading the database field.

PostgreSQL is almost the best (according to Stack Overflow Survey)

According to the 2019 survey by Stack Overflow, available here, PostgreSQL is the second top database, slightly ahead of Microsoft SQL Server and clearly ahead of Oracle. This is true both for the community and professional users who took the survey.

PostgreSQL is keeping its high position year after year, which means the database is growing as a professional choice. In particular, in professional users’ opinion PostgreSQL is more used, while MySQL and MS SQL lose some points.

Category: postgresql

Dave Cramer: Deploying Active-Active PostgreSQL on Kubernetes

2019-05-29 (Wed) 05:06:15

Kubernetes is a very popular container orchestration framework. I’ll show you how to get Symmetric-DS working on a single Kubernetes instance.

I had previously explored how to build an active-active PostgreSQL cluster using Symmetric-DS.  The steps are essentially the same on Kubernetes:

  1. Start 2 PostgreSQL pods
  2. Create a user and a database on each pod/instance
  3. Start the primary symmetric-ds pod
  4. Add the symmetric-ds triggers and routes
  5. Open registration
  6. Start the secondary symmetric-ds pod
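A rough sketch of the first two steps with plain kubectl (the image tag, pod names and credentials are placeholders; the original guide presumably uses proper manifests):

```shell
# Step 1: start two PostgreSQL pods.
kubectl run pg-primary   --image=postgres:11 --env=POSTGRES_PASSWORD=secret
kubectl run pg-secondary --image=postgres:11 --env=POSTGRES_PASSWORD=secret

# Step 2: create the SymmetricDS user and database on each pod.
for pod in pg-primary pg-secondary; do
  kubectl exec "$pod" -- psql -U postgres \
    -c "CREATE USER symmetricds PASSWORD 'secret'" \
    -c "CREATE DATABASE symmetricds OWNER symmetricds"
done
```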

However, there are some interesting nuances I discovered while building out this setup, which I will discuss below. Before diving into the details, I'd also like to credit my colleagues Patrick McLaughlin and Simon Nielly as co-authors for helping me to put together this guide.

Anyway, let's explore how to deploy an active-active PostgreSQL cluster on Kubernetes.

Category: postgresql

Umair Shahid: Postgres is the coolest database – Reason #4: It is extendable

2019-05-28 (Tue) 17:50:10
PostgreSQL is packed with features. What may not be part of the core is available as extensions. What are extensions – you ask? PostgreSQL exposes APIs that are designed to easily allow external programs to load into the database and function just like core features. So if you find that you need a feature in […]
Category: postgresql

Paolo Melchiorre: Upgrading PostgreSQL from version 10 to 11 on Ubuntu 19.04 (Disco Dingo)

2019-05-28 (Tue) 07:00:00

A how-to guide for upgrading PostgreSQL from version 10 to 11 on Ubuntu, after upgrading the OS from version 18.10 to 19.04.

Category: postgresql

Paul Ramsey: Parallel PostGIS and PgSQL 12

2019-05-28 (Tue) 01:00:00

For the last couple years I have been testing out the ever-improving support for parallel query processing in PostgreSQL, particularly in conjunction with the PostGIS spatial extension. Spatial queries tend to be CPU-bound, so applying parallel processing is frequently a big win for us.

Initially, the results were pretty bad.

  • With PostgreSQL 10, it was possible to force some parallel queries by jimmying with global cost parameters, but nothing would execute in parallel out of the box.
  • With PostgreSQL 11, we got support for parallel aggregates, and those tended to parallelize in PostGIS right out of the box. However, parallel scans still required some manual alterations to PostGIS function costs, and parallel joins were basically impossible to force no matter what knobs you turned.

With PostgreSQL 12 and PostGIS 3, all that has changed. All standard query types now readily parallelize using our default costings. That means parallel execution of:

  • Parallel sequence scans,
  • Parallel aggregates, and
  • Parallel joins!!

PostgreSQL 12 and PostGIS 3 have finally cracked the parallel spatial query execution problem, and all major queries execute in parallel without extraordinary interventions.

What Changed

With PostgreSQL 11, most parallelization worked, but only at much higher function costs than we could apply to PostGIS functions. With higher PostGIS function costs, other parts of PostGIS stopped working, so we were stuck in a Catch-22: improve costing and break common queries, or leave things working with non-parallel behaviour.

For PostgreSQL 12, the core team (in particular Tom Lane) provided us with a sophisticated new way to add spatial index functionality to our key functions. With that improvement in place, we were able to globally increase our function costs without breaking existing queries. That in turn has signalled the parallel query planning algorithms in PostgreSQL to parallelize spatial queries more aggressively.
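To check the effect on your own workload, the planner’s inputs and the resulting plan are easy to inspect; the parcels table below is a placeholder:

```sql
-- Function cost and parallel safety are what the planner weighs; in
-- PostGIS 3 the spatial functions ship with costs high enough to
-- trigger parallel plans.
SELECT proname, procost, proparallel
  FROM pg_proc
 WHERE proname = 'st_intersects';

-- Confirm that a spatial join now produces a parallel plan.
EXPLAIN (COSTS OFF)
SELECT count(*)
  FROM parcels a
  JOIN parcels b ON ST_Intersects(a.geom, b.geom);
```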


In order to run these tests yourself, you will need:

  • PostgreSQL 12
  • PostGIS 3
Category: postgresql

Chris Travers: Table Inheritance: What's it Good For?

2019-05-26 (Sun) 17:03:00
Table inheritance is one of the most misunderstood -- and powerful -- features of PostgreSQL.  With it, certain kinds of hard problems become easy.  While many folks who have been bitten by table inheritance tend to avoid the feature, this blog post is intended to provide a framework for reasoning about when table inheritance is actually the right tool for the job.

Table inheritance is, to be sure, a power tool, and thus something to use only when it brings an overall reduction in complexity to the design. Moreover, the current documentation doesn't provide much guidance regarding what the tool actually helps with and what the performance costs are, and because inheritance sits orthogonally to relational design, working this out individually is very difficult.

This blog post covers uses of table inheritance which simplify overall database design and are not addressed by declarative partitioning, because they are used in areas other than table partitioning.

Table Inheritance Explained

PostgreSQL provides the ability for tables to exist in an inheritance directed acyclic graph. Columns provided by parent tables are merged, by name and type, into the child table. Altering a parent table to add a column thus cascades the operation to all child tables, though if any child table has a column with the same name and a different type, the operation will fail.

Inheritance, Tables, and Types

Every table in PostgreSQL has a corresponding composite type, and any table can be implicitly cast to any parent table. This is transitive. Combined with tuple processing functions, this gives you a number of very powerful ways of working with data at various different levels of scale.
Indexes and foreign keys are not inherited. Check constraints are inherited unless set to NO INHERIT.

Inheritance and Querying

When a table is queried, by default all child tables are also queried and their results appended to the result. Because of exclusion constraint processing, this takes out an ACCESS SHARE lock on all child tables at pl[...]
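A minimal example of the mechanics described above, using the classic cities/capitals pair from the PostgreSQL documentation:

```sql
CREATE TABLE cities (
    name       text,
    population int
);

-- capitals inherits name and population from its parent, and adds
-- a column of its own.
CREATE TABLE capitals (
    state char(2)
) INHERITS (cities);

-- Adding a column to the parent cascades to every child table.
ALTER TABLE cities ADD COLUMN elevation int;

-- Check constraints are inherited unless marked NO INHERIT;
-- indexes and foreign keys are not inherited.
ALTER TABLE cities ADD CONSTRAINT population_nonneg
    CHECK (population >= 0);

-- Querying the parent also scans children; ONLY restricts to the parent.
SELECT name FROM cities;
SELECT name FROM ONLY cities;
```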
Category: postgresql