planet postgresql

Subscribe to planet postgresql のフィード
Planet PostgreSQL
更新: 15分 32秒 前

Álvaro Herrera: PostgreSQL 11: Patch Reviewers for Partitioning Patches

2018-09-28(金) 15:06:42

We seldom credit patch reviewers. I decided to pay a little homage to those silent heroes for a few of them: here’s the list of people who were credited as having reviewed the patches mentioned in my previous article for PostgreSQL 11. The number in front is the number of times they were credited as reviewers.

7 Amit Langote, Robert Haas 5 Dilip Kumar, Jesper Pedersen, Rajkumar Raghuwanshi 4 Peter Eisentraut 3 Amul Sul, David Rowley, Rafia Sabih, Simon Riggs, Thomas Munro 2 Antonin Houska, Ashutosh Bapat, Kyotaro Horiguchi 1 Álvaro Herrera, Amit Kapila, Amit Khandekar, Etsuro Fujita, Jaime Casanova, Keith Fiske, Konstantin Knizhnik, Pascal Legrand, Pavan Deolasee, Rajkumar Raghuanshi, Rushabh Lathia, Sven Kunze, Thom Brown, Yugo Nagata
カテゴリー: postgresql

Gary Sahota: Fun with SQL: Unions in Postgres

2018-09-28(金) 02:40:00

Before joining the Citus Data team to work on the Postgres extension that transforms Postgres into a distributed database, I checked out the Citus Data blog to learn a bit more. And found all sorts of material for Postgres beginners as well as power users, with one of my favorites being the “Fun with SQL” series. After I joined, I jumped at the chance to write a Fun with SQL post, and the only question was what to write about. I chose unions.

In SQL, the UNION and UNION ALL operators help take multiple tables and combine the results into a single table of all matching columns. This operator is extremely useful if you want the results to come back as a single set of records.

You can use UNION to enhance your SQL queries. In past posts we’ve also covered topics like common functions in Postgres, window functions in Postgres and CTEs (common table expressions) in Postgres.

When do you Union in Postgres?

The UNION operator is often used when multiple tables have the same structure but have been split for some reason (usually for performance considerations or archival purposes).

Before we begin, some basic requirements you should know are:

  • All individual queries must have the same number of columns and compatible data types.
  • Mismatched queries will not execute.
  • Validating consistency when building your query is important, so there are no surprises after.
  • By default, UNION only returns distinct values.

One example might be movies where you have a different table for each year of releases. Or if you have different types of users and email address, such as within a CRM like Copper where you have leads and contacts. Your leads table may look like:

leads | FirstName | LastName | Email | ----------------+---------------+------------------------ | Jennifer | Matthews | | | Tom | Jones | | | Daniel | Farina | | contacts | FirstName | LastName | Email | ---------[...]
カテゴリー: postgresql

Magnus Hagander: Nordic PGDay 2019 - Location announced and call for papers open

2018-09-27(木) 21:33:11

Nordic PGDay 2019 will be returning to Copenhagen, Denmark on March 19, 2019. This year, the conference is held right on the waterfront in the Copenhagen Marriott Hotel.

Our call for papers is now open, accepting proposals until the end of the year. We are looking for speakers both from the Nordic region, the rest of Europe, and the whole world. Technical details, case studies, good ideas or bad ideas -- all are good ideas for topics. All speakers get free entrance, so it's also a good excuse to come visit Denmark!

Registration and call for sponsors will open shortly!

カテゴリー: postgresql

Joshua Otwell: Understanding Check Constraints in PostgreSQL

2018-09-27(木) 18:43:48

Managing data is a big challenge. As our world turns, data continues to be widespread, abundant, and intensive. Therefore, we must take measures to handle the influx.

Validating every single piece of data 'by hand' around the clock is simply impractical. What a fantastic dream. But, after all, it is just that. A dream. Bad data is bad data. No matter how you slice it or dice it (pun intended). It is a problem from the onset, leading to even more issues.

Modern databases handle much of the heavy lifting for us. Many provide built-in solutions to assist in managing this particular area of data.

A sure way to control the data entered into a table's column is with a data type. Need a column with decimal numbers, having a total digit count of 4, with 2 of those after the decimal?

Sure thing! No problem at all.

NUMERIC(4,2), a viable option, is guarding that column like a watchdog. Can character text values slip in there? Not a snowball's chance.

PostgreSQL offers a multitude of data types. Chances are, one already exists to satisfy your need(s). If not, you can create your own. (See: PostgreSQL CREATE TYPE)

Yet, data types alone are not enough. You cannot assure the most specific requirements are covered and conform to such broad structuring. Compliance rules and some sort of 'standard' are typically required when designing a schema.

Suppose in that same NUMERIC(4,2) column, you only want values greater than 25.25 but less than 74.33? In the event, value 88.22 is stored, the data type is not at fault. By allowing 4 total digits, with 2 at most after the decimal, it is doing its job. Lay the blame elsewhere.

How do we win on this front when it comes to controlling the data allowed in our database? Data consistency is of utmost priority and is integral for any sound data solution. On the (off) chance you controlled the collected data from the onset of its origination source, consistency would likely be less of an issue.

But, a perfect world only exists (maybe) in one of those many fantasy novels I love to read.


カテゴリー: postgresql

Christophe Pettus: Find the value of all database sequences

2018-09-27(木) 03:06:42

Upon occasion, you want to get the current value of all the sequences in the database. For example, you may have fields that are integer rather than bigint, and you’re concerned how close you are to overflowing one of them (since sequences are bigint and will happily crash through the size of a 32-bit integer).

Unfortunately, currval() requires that you have actually accessed the sequence value using nextval() in the current session… but you don’t want to increment the value, just test it!

Here’s a cheap and cheerful PL/pgSQL function that returns all of the current sequence values:

CREATE OR REPLACE FUNCTION sequence_values() RETURNS TABLE(name text, value bigint) AS $sequence_values$ DECLARE nsp_name TEXT; seq_name TEXT; BEGIN FOR nsp_name, seq_name IN SELECT nspname::text, relname::text FROM pg_class JOIN pg_namespace ON pg_class.relnamespace = pg_namespace.oid WHERE relkind='S' LOOP RETURN QUERY EXECUTE 'SELECT ''' || nsp_name || '.' || seq_name || '''::text, last_value FROM "' || nsp_name || '"."' || seq_name || '"'; END LOOP; END; $sequence_values$ LANGUAGE plpgsql;
カテゴリー: postgresql

Dan Langille: Adding and populating CONFLICTS

2018-09-26(水) 21:49:51
This post serves two goals: document the CONFLICTS implementation within FreshPorts show how you can use triggers to update a secondary table The FreeBSD ports tree / package collection has a method for resolving conflicts. This feature allows a port maintainer to specify what, if any, packages conflict with a given port. The reasons why [...]
カテゴリー: postgresql

Dimitri Fontaine: A history lesson on SQL joins (in Postgres)

2018-09-26(水) 03:41:00

Our beloved Structured Query Language may be the lingua franca for relational databases—but like many languages, SQL is in a state of constant evolution. The first releases of SQL didn’t even have a notation for joins. At the time, SQL only supported inner joins.

Cross Joins and Where Filters

As a result, back in early eighties, the only way to express a join condition between tables would be in the WHERE clause.

select,, campaigns.monthly_budget from companies, campaigns where = campaigns.company_id and company_id = 5;

In that query, the PostgreSQL planner realizes that we are doing an INNER JOIN in between the companies table and the campaigns table, even though what we are actually writing is a CROSS JOIN in between the tables. In PostgreSQL documentation chapter Table Expressions we can read more about the CROSS JOIN operation:

For every possible combination of rows from T1 and T2 (i.e., a Cartesian product), the joined table will contain a row consisting of all columns in T1 followed by all columns in T2. If the tables have N and M rows respectively, the joined table will have N * M rows.

FROM T1 CROSS JOIN T2 is equivalent to FROM T1 INNER JOIN T2 ON TRUE (see below). It is also equivalent to FROM T1, T2.

In my opinion, it is so rare that what you want to achieve actually is a cross join that cross joins should fail your SQL unit testing. After all, code should always express the intentions of the programmer, and if your intention is not to implement a CROSS JOIN then using a comma separated list of tables in your FROM clause looks deadly suspicious.

Inner And Outer Joins in SQL

The previous SQL query can be written properly with an inner join syntax:

select,, campaigns.monthly_budget from companies inner join campaigns on = campaigns.company_id where company_id = 5;

Lucky for us, the PostgreSQL planner is smart enough to realize that both queries are specifying the same res

カテゴリー: postgresql

Pavel Trukhanov: USE, RED and real world PgBouncer monitoring

2018-09-26(水) 02:57:45

Brendan Gregg’s USE (Utilization, Saturation, Errors) method for monitoring is quite known. There are even some monitoring dashboard templates shared on the Internet. There’s also Tom Wilkie’s RED (Rate, Errors, Durations) method, which is suggested to be better suited to monitor microservices than USE.

We, at, recently updated our PgBouncer monitoring plugin and while doing that we’ve tried to comb everything and we used USE and RED as frameworks to do so.

Why we needed both and not just stuck with USE, as it is more commonly known? To answer that we need to understand their applicability first. While they are know, I don’t think they are widely systematically applied in practice of covering IT-systems with monitoring.


Using Brendan Gregg’s own words:

For every resource, check utilization, saturation, and errors.

Where resource is all and any physical server functional component (CPUs, disks, busses, …). But also some software resources as well, or software imposed limits/resource controls (containers, cgroups, etc).

Utilization: the average time that the resource was busy servicing work. So CPU utilization or disk IO utilisation of 90% means that it is idle, not doing work only 10% of the time, and busy 90% of time. But also, for such resources as memory, where one can’t apply the idea of “non idle percentage of time”, one could measure the proportion of a resource that is used.

Anyways, 100% utilization means no more “work” can be accepted, either at all, i.e. when memory is full, it is full, you can’t do anything about it. Or it’s 100% utilized only now, at the moment (as with CPU), and new work could be put into a waiting list, queue or something. And these two scenarios are covered by the corresponding remaining two USE metrics:

Saturation: the degree to which the resource has extra work which it can’t service, often queued.

Errors: the count of error events, i.e. such as “resource is busy”, “Cannot allocate memory”, “Not enough space”. While those usually do not produce performance impact

カテゴリー: postgresql

Pavan Deolasee: [Video] Power of Indexing in PostgreSQL

2018-09-26(水) 02:27:06

The video of my presentation below walks you through ‘Indexing in PostgreSQL’ – a key component to database performance.

This presentation covers the following topics:

  • Various ways to access data in PostgreSQL
  • Different types of indexes supported by PostgreSQL
  • Internals of BTree and BRIN indexes
  • Overview of GIN and GiST indexes
  • How to find missing indexes
  • How to find unused indexes
カテゴリー: postgresql

Jobin Augustine: PostgreSQL Backup Strategy for an Enterprise-Grade Environment

2018-09-26(水) 02:13:16

In this post we cover the methods used to achieve an enterprise-grade backup strategy for the PostgreSQL® cluster. In setting up our demonstration system, we use pg_basebackup and continuous archiving. The size of the database and your database environment—if it is on colo or cloud—matters. It is very important to understand how we can ensure minimalistic or no data loss at the time of disasters using our preferred backup tools.

As discussed in the introductory blog post, in our webinar of October 10, 2018 we highlight important aspects an enterprise should consider for their PostgreSQL environments. In our last blog post we looked at security.

Backing up…

The ability to recover a database to a certain point in time is always the ultimate aim of a backup strategy. Without a recoverable backup, you have no backup! So there are always two parts to a backup strategy: backup AND recovery. The backup tool you use to achieve your aims may vary. Let’s discuss some of the options.


The community version of PostgreSQL comes with a trusted backup tool called pg_basebackup. This utility can help you take a consistent online binary backup that is very fast, can be used for point-in-time-recovery, and also for setting up slaves/replicas.

Archiving and continuous archiving

Postgres does not flush data pages to disk upon every transaction commit. To ensure that PostgreSQL achieves durability and endures crash recovery, changes are written to transactions logs (a.k.a. WALs, Write-Ahead Logs) stored on disk.

WALs in PostgreSQL are similar to transaction log files in the InnoDB storage engine for MySQL. However, WAL files are recycled according to the values set to the parameters wal_keep_segments and max_wal_size. Hence, if WAL files are not copied to a safe location—such as a backup server or another file system—it won’t be possible to achieve point-in-time-recovery (PITR).

In order to archive WAL segments to a safe location, the parameter archive_mode must be set to ‘ON’ and we must pass an appropriate shell comma

カテゴリー: postgresql

Joshua Drake: PostgresConf 2019: Back on the island and call for papers

2018-09-26(水) 02:04:00
We are pleased to announce that PostgresConf 2019 will be held at the Sheraton Times Square March 18th - 22nd, 2019. 
The call for papers is now open!
Following on the success of PostgresConf 2018, 2019 will include five days with added training and partner summits.

March 18th and 19th will have immersive training. Instructors are encouraged to submit half and full day material for consideration. The preferred topics are centered around Postgres but anything People, Postgres, or Data related will be considered.

Monday, March 18th through Friday, March 22nd will host several partner summits, including popular and upcoming topics within the Postgres community and the annually hosted Regulated Industry Summit. Break out sessions will be held from Wednesday - Friday.

Important Dates:
  • Call for Papers Open: 09/12/2018 
  • Call for Papers Close: 01/11/2019 
  • Confirmation/Acceptance: 01/18/2019
Can't wait until March? Join us at our West Coast event, PostgresConf Silicon Valley, October 15th and 16th, 2018 at the Hilton San Jose.
About PostgresConf:
PostgresConf is a global nonprofit conference series with a focus on growing community through increased awareness and education of Postgres and related technologies. PostgresConf is known for its highly attended national conference with the mission of:

カテゴリー: postgresql

Vasilis Ventirozos: Logical Replication in aws RDS

2018-09-25(火) 22:51:00
Recently Amazon annnounced that postgres 10 supports logical replication syntax and that it can work as either a publisher or a subscriber. The announcement can be found here.

In order to test it out i made 2 10.4 RDS instances. Set them up to be in the same security,
subnet and parameter groups so i ended up with something like this :

user : vasilis password : Ap4ssw0rd db instance : lrnode1 & 2 dbname : lrtest port : 5432
only thing i had to change in order to enable logical replication is to change rds.enable_logical_replication to 1 in my parameter group.  
After verifying connectivity
psql -h -U vasilis -d lrtest -W psql -h -U vasilis -d lrtest -W
i created a dummy table on my publisher and subscriber (lrnode1 and 2)
create table lr_test_tbl (     id int primary key,     name text );
on lrnode1 (publisher) i created a publication CREATE PUBLICATION lr_pub FOR TABLE lr_test_tbl ;

on lrnode2 and as vasilis i created the subscription CREATE SUBSCRIPTION lr_sub CONNECTION '  dbname=lrtest user=vasilis password=Ap4ssw0rd' PUBLICATION lr_pub ;
There only 3 things you need to know, you need to adjust your subnet / security groups allowing instances to see each other, you have to change the RDS.enable_logical_replication parameter  and that the user you are supposed to use for this is the user you set when you created the instance.   That was it, pretty fast and simple.
Vasilis Ventirozos OmniTI Computer Consulting Inc.
カテゴリー: postgresql

Hubert 'depesz' Lubaczewski: pg_terminator released

2018-09-25(火) 19:04:07
I just released first version of pg_terminator. It's a tool that is supposed to be run on PostgreSQL db server, monitor a database, and cancel or terminate offending queries/connections. It can be used, for example to: kill leftover psql sessions that are not doing anything cancel too long queries on production servers kill connections that […]
カテゴリー: postgresql

Douglas Hunley: Monitoring pgBackRest with tail_n_mail

2018-09-24(月) 23:58:22

After a lively discussion at work today about monitoring tools and uses cases, I decided to see if I could use tail_n_mail, which I already use to monitor my PostgreSQL logs, to monitor my pgBackRest logs. It turns out that it can, and can do so fairly trivially.

For reference, our .tail_n_mail.conf looks like this:

## Config file for the tail_n_mail program ## This file is automatically updated LOG_LINE_PREFIX: %t P%p EMAIL: MAILSUBJECT: HOST pgBackRest errors NUMBER INCLUDE: WARN: INCLUDE: ERROR: INCLUDE: FATAL: INCLUDE: PANIC: FILE1: /var/log/pgbackrest/pgbackrest.log

You would need to change the EMAIL and FILE1 and the rest should just work.

To actually invoke the monitoring, simply add a cron entry that looks like:

* * * * * /path/to/tail_n_mail ~/.tailnmail.conf --pgmode=0

(Obviously, you should adjust the periodicity. I highly doubt you need to check every minute.)

Once you’ve saved the crontab, you should be all good. If you doubt the setup, you can add --mailzero to the invocation to get an email even if everything is OK.

See? Easy.

カテゴリー: postgresql

Luca Ferrari: pgenv get configuration!

2018-09-23(日) 22:01:43

I have already written about a very useful and powerful small pearl by theory: pgenv. Now the tool does support for user configuration!

pgenv get configuration!

I spent some time implementing a very rudimental approach to configuration for pgenv. The idea was simple: since the program is a single Bash script, the configuration can be done using a single file to source variables in.

But before this was possible, I had to do a little refactoring over here and there in order to make all the commands behave smooth across the configuration. And at least, it seems to work, with some parts that can be improved and implemented better (as always it is!). However, I designed from scratch to support every single version of PostgreSQL, that means configuration could be different depending on the specific version you are running. This allows, for example, to set particular flags for ancient versions, without having to get crazy when switching to more recent ones.

Now pgenv supports a config command that, in turn, support for several subcommands:

  • write to store the configuration in an hidden file named after the PostgreSQL version (e.g., .pgenv.10.5.conf);
  • edit just to launch you $EDITOR to manipulate the configuration;
  • delete to remove a configuration file;
  • show to dump the configuration.

The idea is simple: each time a new PostgreSQL version is built, a configuration file is created for such instance. You can then customize the file in order to make pgenv behave differently for that particular version of PostgreSQL. As an example, you can set different languages (e.g., PL/Perl) or different startup/stop modes. If the configuration file for a particular version is not found, a global configuration is loaded. If neither that is...

カテゴリー: postgresql

Regina Obe: PostGIS 2.5.0

2018-09-23(日) 09:00:00

The PostGIS development team is pleased to release PostGIS 2.5.0.

Although this release will work for PostgreSQL 9.4 and above, to take full advantage of what PostGIS 2.5 offers, you should be running PostgreSQL 11beta4+ and GEOS 3.7.0 which were released recently.

Best served with PostgreSQL 11 beta4 and pgRouting 2.6.1.

WARNING: If compiling with PostgreSQL+JIT, LLVM >= 6 is required Supported PostgreSQL versions for this release are: PostgreSQL 9.4 - PostgreSQL 12 (in development) GEOS >= 3.5


Continue Reading by clicking title hyperlink ..
カテゴリー: postgresql

Chris Travers: PostgreSQL at 20TB and Beyond Talk (PGConf Russia 2018)

2018-09-22(土) 00:48:00
It came out a while ago but I haven't promoted it much yet.

This is the recorded version of the PostgreSQL at 20TB and Beyond talk.  It covers a large, 500TB analytics pipeline and how we manage data.

For those wondering how well PostgreSQL actually scales, this talk is worth watching.
カテゴリー: postgresql

Chris Travers: Thoughts on the Code of Conduct Controversy

2018-09-22(土) 00:37:00
My overall perspective here is that the PostgreSQL community needs a code of conduct, and one which allows the committee to act in some cases for off-infrastructure activity, but that the current code of conduct has some problems which could have been fixed if efforts had been better taken ensure that feedback was gathered when it was actionable.

This piece discusses what I feel was done poorly but also what was done well and why, despite a few significant missteps, I think PostgreSQL as a project is headed in the right direction in this area.

But a second important point here is to defend the importance of a code of conduct to dissenters here, explain why we need one, and why the scope needs to extend where it needs to extend to, and why we should not be overly worried about this going in a very bad direction.  The reason for this direction is that in part I found myself defending the need for a code of conduct to folks I collaborate with in Europe and the context had less to do with PostgreSQL than with the Linux kernel.  But the projects in this regard are far more different than they are similar.
Major Complaint:  Feedback Could Have Been Handled Better (Maybe Next Time)
In early May there was discussion about the formation of a code of conduct committee, in which I argued (successfully) that it was extremely important that the committee be geographically and culturally diverse so as to avoid one country's politics being unintentionally internationalized through a code of conduct.  This was accepted and as I will go into below this is the single most important protection we have against misuse of the code of conduct to push political agendas on the community.  However after this discussion there was no further solicitation for feedback until mid-September.
In Mid-September, the Code of Conduct plan was submitted to the list.  In the new code of conduct was a surprising amendment which had been made the previous month, expanding the code of conduct to all interactions between community members unless anot[...]
カテゴリー: postgresql

Shaun M. Thomas: On Rocks and Sand

2018-09-22(土) 00:00:15

When working with database capacity planning, there are a lot of variables to consider, and Postgres is no different in this regard. One of the elements which requires management is storage. However, there’s an aspect of storage that escapes inspection almost without exception, and it’s hiding in the shadows between the columns themselves.

Alignment Basics

In most low-level computer languages like C, in which the venerable Postgres is written, data types are addressed by their maximum size, regardless of how large they actually are. Thus a standard 32-bit integer which can store a value of just over 2-billion, must be read as a whole unit. This means even the value of 0 requires 4 bytes of storage.

Further, Postgres is designed such that its own internal natural alignment is 8 bytes, meaning consecutive fixed-length columns of differing size must be padded with empty bytes in some cases. We can see that with this example:

SELECT pg_column_size(row()) AS empty, pg_column_size(row(0::SMALLINT)) AS byte2, pg_column_size(row(0::BIGINT)) AS byte8, pg_column_size(row(0::SMALLINT, 0::BIGINT)) AS byte16; empty | byte2 | byte8 | byte16 -------+-------+-------+-------- 24 | 26 | 32 | 40

This suggests that an empty Postgres row requires 24 bytes of various header elements, a SMALLINT is 2 bytes, a BIGINT is 8 bytes, and combining them is… 16 bytes? That’s no mistake; Postgres is padding the smaller column to match the size of the following column for alignment purposes. Instead of 2 + 8 = 10, our math becomes 8 + 8 = 16.

Intensity Intervals

By itself, this may not necessarily be a problem. But consider a contrived ordering system with this table:

CREATE TABLE user_order ( is_shipped BOOLEAN NOT NULL DEFAULT false, user_id BIGINT NOT NULL, order_total NUMERIC NOT NULL, order_dt TIMESTAMPTZ NOT NULL, order_type SMALLINT NOT NULL, ship_dt TIMESTAMPTZ, item_ct INT NOT NULL, ship_cost NUMERIC, receive_dt TIMESTAMPTZ, tracking_cd [...]
カテゴリー: postgresql

Avinash Kumar: Securing PostgreSQL as an Enterprise-Grade Environment

2018-09-21(金) 22:50:08

In this post, we review how you can build an enhanced and secure PostgreSQL database environment using community software. We look at the features that are available in PostgreSQL that, when implemented, provide improved security.

As discussed in the introductory blog post of this series, in our webinar of October 10, 2018 we highlight important aspects an enterprise should consider for their PostgreSQL environments. This series of blogs addressing particular aspects of the enterprise-grade postgres environment complements the webinar. This post addresses security.

Authentication Layer Client connections to PostgreSQL Server using host based authentication

PostgreSQL uses a host based authentication file (pg_hba.conf) to authorize incoming connections. This file contains entries with a combination of 5 categories: type, database, user, address, and method. A client is allowed to connect to a database only when the combination of username, database and the hostname of the client matches an entry in the pg_hba.conf file.

Consider the following entry in pg_hba.conf file :

# TYPE DATABASE USER ADDRESS METHOD host percona pguser md5

This entry says that connections from server are only allowed from user pguser and only to the database percona. The method md5 forces password authentication.

The order of the entries in the pg_hba.conf file matters. If you have an entry that rejects connections from a given server followed by another that allows connections from it, the first entry in the order is considered. So, in this case, the connection is rejected.

This is the first layer of protection in authentication. If this criteria is not satisfied in this Access Control List (ACL), PostgreSQL will discard the request without considering even the server authentication.

Server Authentication

Historically, PostgreSQL uses MD5 digest as a password hash by default. The problem with pure MD5 hashing is that this function will always return the same hash for a given password, which renders a MD5 digest

カテゴリー: postgresql