planet postgresql

Subscribe to the Planet PostgreSQL feed
Planet PostgreSQL
Updated: 2 hours 58 minutes ago

Alexander Sosna: HowTo: Central and semantic logging for PostgreSQL

2019-04-02 (Tue) 18:00:00
Today it is no longer necessary to argue why central logging makes sense or is even necessary. Most medium-sized companies now have a central logging system or are just introducing one. Once the infrastructure has been created, it must be used sensibly and efficiently! Especially as an...
Category: postgresql

Tatsuo Ishii: Statement level load balancing

2019-04-02 (Tue) 15:53:00
In the previous article I wrote about one of the new features of the upcoming Pgpool-II 4.1.
This time I would like to introduce the "statement level load balancing" feature of 4.1.

Pgpool-II can distribute read queries among PostgreSQL backend nodes. This allows you to design a scale-out cluster using PostgreSQL. The particular database node used for distributing read queries is determined at the session level: when a client connects to Pgpool-II. This is so-called "session level load balancing". For example, if a client connects to Pgpool-II and the load balance node is node 1 (we assume that this is a streaming replication standby node), then any read query will be distributed between the primary (master) node and the load balance node (in this case node 1, the standby node). The distribution ratio is determined by the "backend weight" parameters in the Pgpool-II configuration file (usually named "pgpool.conf"), typically "backend_weight0" and "backend_weight1", corresponding to node 0 and node 1 respectively.
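As a sketch, the relevant pgpool.conf fragment for a two-node setup might look like this (hostnames are illustrative):

```
backend_hostname0 = 'pg-primary.example.com'   # node 0: primary
backend_port0 = 5432
backend_weight0 = 1

backend_hostname1 = 'pg-standby.example.com'   # node 1: streaming replication standby
backend_port1 = 5432
backend_weight1 = 1    # equal weights: read queries split roughly 50/50

load_balance_mode = on
```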

This works well as long as clients connect to Pgpool-II, issue some queries, and disconnect, since the next time a client connects to Pgpool-II, a different load balance node may be chosen according to the backend weight parameters.

However, if your client already has a connection pooling feature, session level load balancing might be a problem, since the selection of the load balance node is performed only once, when the pooled connection from the client to Pgpool-II is created.

The statement level load balancing feature was created to solve this problem. Unlike session level load balancing, the load balance node is determined when a new query is issued. The new parameter for this is "statement_level_load_balance". If it is set to on, the feature is enabled (the parameter can be changed by reloading pgpool.conf).
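A minimal pgpool.conf fragment enabling the feature, as a sketch:

```
statement_level_load_balance = on
```

Since the parameter is reloadable, it can be applied to a running Pgpool-II with a configuration reload, without a restart.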

At first "select_cnt" is 0, which means no SELECTs have been issued.

test=# show pool_nodes;

node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | last_st[...]
Category: postgresql

Koichi Suzuki: Postgres-XL and global MVCC

2019-04-02 (Tue) 11:58:21
Back to the PG

I’m very excited to become a 2ndQuadrant member. I was involved in PostgreSQL activities in the NTT group (a leading Japanese ICT company, see here and here), including log shipping replication and PostgreSQL scale-out solutions such as Postgres-XC and Postgres-XL.

At NTT I had several chances to work very closely with 2ndQuadrant. After three years of involvement in deep learning and accelerator usage in various applications, I’m now back in the PostgreSQL world. I’m still very interested in PostgreSQL scale-out solutions and in applying PostgreSQL to large-scale analytic workloads as well as stream analytics.

Here, I’d like to begin with a discussion of a scaled-out parallel distributed database with full transaction capabilities, including atomic visibility.

ACID properties, distributed databases and atomic visibility

Traditionally, providing the full ACID properties is the responsibility of each database server. In scale-out solutions, the database consists of multiple servers, and the two-phase commit protocol (2PC) is used to provide ACID properties when updating multiple servers.

Although 2PC provides write consistency among multiple servers, we should note that it does not provide atomic visibility, which ensures that a global transaction's updates become visible to all other transactions at the same time. In the two-phase commit protocol, each server receives "COMMIT" at a different time, and this can make partial transaction updates visible to others.
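The mechanics can be sketched with the two-phase commit commands PostgreSQL itself exposes (the accounts table and the transaction identifier are illustrative; max_prepared_transactions must be set above zero):

```sql
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
-- phase 1: the transaction is prepared; it survives a crash
-- but its changes are not yet visible to others
PREPARE TRANSACTION 'txn-42';

-- phase 2: a coordinator later commits (or rolls back) each participant
COMMIT PREPARED 'txn-42';
```

Between the two phases, another server participating in the same global transaction may already have committed, which is exactly the window in which partial updates become visible.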

How Postgres-XL works, global transaction management

Postgres-XL provides atomic visibility of global transactions, and this is essentially what the GTM (global transaction manager) does. The GTM shares the snapshot among all the transactions so that no such partial COMMIT is visible until all the COMMITs have succeeded and the GTM updates the snapshot. Note that the GTM-proxy is used to reduce interaction between each server and the GTM by copying the same snapshot to different local transactions.

This is very similar to what standalone PostgreSQL does with MVCC. It is somewhat centr

Category: postgresql

Sebastian Insausti: How to Deploy Highly Available PostgreSQL with Single Endpoint for WordPress

2019-04-01 (Mon) 23:00:27

WordPress is open source software you can use to create your website, blog, or application. There are many designs and features/plugins to add to your WordPress installation. WordPress is free software; however, there are many commercial plugins to improve it depending on your requirements.

WordPress makes it easy for you to manage your content and it’s really flexible. Create drafts, schedule publication, and look at your post revisions. Make your content public or private, and secure posts and pages with a password.

To run WordPress you should have at least PHP version 5.2.4, MySQL version 5.0 (or MariaDB), and Apache or Nginx. Some of these versions have reached EOL and may expose your site to security vulnerabilities, so you should install the latest versions available for your environment.

As we can see, WordPress currently only supports the MySQL and MariaDB database engines. WPPG is a plugin based on the PG4WP plugin that makes it possible to install and use WordPress with a PostgreSQL database as a backend. It works by replacing calls to MySQL-specific functions with generic calls that map them to other database functions, and by rewriting SQL queries on the fly when needed.

For this blog, we’ll install one application server with WordPress 5.1.1 and HAProxy 1.5.18 on the same host, and two PostgreSQL 11 database nodes (master-standby). The operating system on all nodes will be CentOS 7. To deploy the databases and load balancer we’ll use the ClusterControl system.

This is a basic environment. You can improve it by adding more high availability features as you can see here. So, let’s start.

Database Deployment

First, we need to install our PostgreSQL database. For this, we’ll assume you have ClusterControl installed.

To perform a deployment from ClusterControl, simply select the option

Category: postgresql

Regina Obe: SQL Server on Linux

2019-04-01 (Mon) 22:31:00

Today is April 1st. Having no thoughts on Fools' jokes for today, I dug up one of our old April Fools, and it was pretty scary how the joke is just about true now. Yes, SQL Server now really does run on Linux and is on its 2017th edition, but it's still poor competition for PostgreSQL.

A goody from our old joke archives

CatchMe - Microsoft SQL Server for Unix and Linux
Category: postgresql

Devrim GÜNDÜZ: End of the naming game: The PostgreSQL project changes its name

2019-04-01 (Mon) 18:36:00
I started using PostgreSQL around September 1998. The first problem I had was pronouncing it, and even putting the right capital letters in the right places.

Was it PostGreSQL? PostgresSQL? PoStGreSQL? PostgreySQL?

Recently Craig also mentioned the same problem.

Starting today, the PostgreSQL Global Development Group (abbreviated as PGDG) announced that the project will be written as PostgresQL. This will (hopefully) solve the problems, and will also help us to drop the "QL" in 2024. Starting with v12, all packages will also "provide" postgresXY as the package name, for a smooth change in 2024. Meanwhile, as of today, the project will accept "Postgre" as an alias for those who did not want to learn the name of the software they are using. I heard rumours that people also say "Orac" or "SQ Serv" or "MyS", so those projects will now be free to drop the SQL in their names, too.

Thanks to everyone who made this change happen. This was a real blocker for the community, and it will also help newbies in the PostgreSQL Facebook group -- they will now be free to use "Postgre" from now on.
Category: postgresql

Raghavendra Rao: Install PL/Java 1.5.2 in PostgreSQL 11

2019-04-01 (Mon) 08:09:14
PostgreSQL 11 includes several procedural languages in the base distribution: PL/pgSQL, PL/Tcl, PL/Perl, and PL/Python. In addition, there are a number of procedural languages that are developed and maintained outside the core PostgreSQL distribution, like PL/Java (Java), PL/Lua (Lua), PL/R (R), PL/sh (Unix shell), and PL/v8 (JavaScript). In this post, we are going to see...
Category: postgresql

Andrew Dunstan: Where and when you need a root.crt file

2019-03-30 (Sat) 04:09:50

This is something people seem to get confused about quite often. A root.crt file is used to validate a TLS (a.k.a. SSL) certificate presented by the other end of a connection. It is usually the public certificate of the Certificate Authority (CA) that signed the presented certificate, and is used to validate that signature. If a non-root CA was used to sign the other end’s TLS certificate, the root.crt file must contain at least the root of the CA chain, plus enough other elements of the chain to connect the root to the signing CA.

In the simple and most common case, where client certificates are not being used, only the client needs a root.crt file, to validate the server’s TLS certificate when using the 'verify-ca' or 'verify-full' SSL mode. The server doesn’t need, and can’t use, a root.crt file when client certificates are not being used.

On the other hand, if you are using client certificates, the server will also need a root.crt file to validate them. There is no requirement that the same root.crt be used for both sides. It would be perfectly possible for the server’s certificate to be signed by one CA and the client certificates by another.

If more than one CA is used in a certain context, i.e. if the client connects to servers with certificates signed by more than one CA, or if the server accepts connections from clients with certificates signed by more than one CA, then the certificates of all the CAs can be placed in the root.crt file, one after the other. The connection will succeed as long as one of the certificates (or certificate chains) in the file is that of the relevant signing authority.
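As an illustration (hostnames and file names are made up), the client side of the common case looks like this:

```
# client: validate the server's certificate against root.crt
# ('verify-full' also checks that the hostname matches the certificate)
psql "host=db.example.com dbname=app sslmode=verify-full sslrootcert=root.crt"

# trusting more than one CA: concatenate their certificates into one file
cat corp-ca.crt vendor-ca.crt > root.crt
```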

Category: postgresql

Craig Kerstiens: A health checkup playbook for your Postgres database

2019-03-30 (Sat) 01:59:00

I talk with a lot of folks that set their database up, start working with it, and then are surprised by issues that suddenly crop up out of nowhere. The reality is, so many don’t want to have to be a DBA; instead you would rather build features and just have the database work. But the truth is that a database is a living, breathing thing. As the data itself changes, the right way to query it changes too. Making sure your database is healthy and performing at its maximum level doesn’t require a constant giant overhaul. In fact you can probably view it much as you approach personal health: regular check-ups allow you to make small but important adjustments without having to make dramatic, life-altering changes to keep you on the right path.

After years of running and managing literally millions of Postgres databases, here’s my breakdown of what your regular Postgres health check should look like. Consider running it on a monthly basis to make small tweaks and adjustments and avoid drastic changes.

Cache rules everything around me

For many applications, not all of the data is accessed all the time. Instead, certain datasets are accessed for some period of time, and then the data you’re accessing changes. Postgres is in fact quite good at keeping frequently accessed data in memory.

Your cache hit ratio tells you how often your data is served from memory vs. having to go to disk. Serving from memory is orders of magnitude faster than going to disk, so the more you can keep in memory the better. Of course you could provision an instance with as much memory as you have data, but you don’t necessarily have to. Instead, watching your cache hit ratio and ensuring it stays around 99% is a good target for proper performance.

You can monitor your cache hit ratio with:

SELECT sum(heap_blks_read) AS heap_read,
       sum(heap_blks_hit) AS heap_hit,
       sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) AS ratio
FROM pg_statio_user_tables;

Be careful of dead tuples

Under the cov

Category: postgresql

Hubert 'depesz' Lubaczewski: Waiting for PostgreSQL 12 – REINDEX CONCURRENTLY

2019-03-30 (Sat) 00:00:02
On 29th of March 2019, Peter Eisentraut committed patch: REINDEX CONCURRENTLY. This adds the CONCURRENTLY option to the REINDEX command. A REINDEX CONCURRENTLY on a specific index creates a new index (like CREATE INDEX CONCURRENTLY), then renames the old index away and the new index in place and adjusts the dependencies, and then drops …
Category: postgresql

Ibrar Ahmed: PostgreSQL: Access ClickHouse, One of the Fastest Column DBMSs, With clickhousedb_fdw

2019-03-29 (Fri) 23:01:12

Database management systems are meant to house data but, occasionally, they may need to talk with another DBMS, for example to access an external server hosting a different DBMS. With heterogeneous environments becoming more and more common, a bridge between the servers is established. We call this bridge a “Foreign Data Wrapper” (FDW). PostgreSQL completed its support for SQL/MED (SQL Management of External Data) with release 9.3 in 2013. A foreign data wrapper is a shared library that is loaded by a PostgreSQL server. It enables the creation of foreign tables in PostgreSQL that act as proxies for another data source.

When you query a foreign table, Postgres passes the request to the associated foreign data wrapper. The FDW creates the connection and retrieves or updates the data in the external data store. Since the PostgreSQL planner is involved in this process as well, it may perform certain operations, like aggregates or joins, on the data retrieved from the data source. I cover some of these later in this post.
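Every FDW follows the same SQL/MED pattern; as a sketch with the in-core postgres_fdw (server, user, and table names are illustrative):

```sql
CREATE EXTENSION postgres_fdw;

-- describe the external server
CREATE SERVER remote_pg
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'remote.example.com', dbname 'sales');

-- map the local role to credentials on the remote side
CREATE USER MAPPING FOR CURRENT_USER
    SERVER remote_pg
    OPTIONS (user 'reporting', password 'secret');

-- a local proxy for the remote table
CREATE FOREIGN TABLE orders_remote (
    id    int,
    total numeric
) SERVER remote_pg OPTIONS (schema_name 'public', table_name 'orders');

-- queries against the foreign table are forwarded through the FDW
SELECT sum(total) FROM orders_remote;
```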

ClickHouse Database

ClickHouse is an open source column-based database management system which claims to be 100–1,000x faster than traditional approaches, capable of processing more than a billion rows in less than a second.


clickhousedb_fdw is an open source project, GPLv2 licensed, from Percona.

It is an FDW for ClickHouse that allows you to SELECT from, and INSERT INTO, a ClickHouse database from within a PostgreSQL v11 server.

The FDW supports advanced features like aggregate pushdown and join pushdown. These significantly improve performance by utilizing the remote server’s resources for these resource-intensive operations.

If you would like to follow this post and try the FDW between Postgres and ClickHouse, you can download and set up the ontime dataset for ClickHouse. After following the instructions, test that you have the desired data. The ClickHouse

Category: postgresql

Andreas 'ads' Scherbaum: PostgreSQL Europe Community User Group Recognition Guidelines

2019-03-29 (Fri) 21:00:00

Over the past months, a great number of PostgreSQL user groups and meetups have shown up all over Europe. It’s good to see that interest in PostgreSQL is growing!

Some of the user groups approached the PostgreSQL Europe board and asked for support: mostly for swag, but also for sending speakers, or other kinds of support. We are happy to help!

In order to handle all of these requests, the PostgreSQL Europe board created a set of guidelines for user group meetings. The current version can be found on the PostgreSQL Europe website, under “Community” and then “Community User Group Recognition Guidelines”. User groups which approach the PostgreSQL Europe board for support are expected to comply with these guidelines. Every user group is self-certified under these guidelines. If you have reason to believe that the self-certified status of a user group is not correct, please contact the PostgreSQL Europe board under “Contact”.

Category: postgresql

Daniel Vérité: Text search: a custom dictionary to avoid long words

2019-03-28 (Thu) 21:51:02

Full text search is based on transforming the initial text into a tsvector. For example:

test=> select to_tsvector('english', 'This text is being processed.');
     to_tsvector
----------------------
 'process':5 'text':2

This result is a sorted list of lexemes, with their relative positions in the initial text, obtained by this process:

Raw text => Parser => Dictionaries (configurable) => tsvector

When there is enough data, we tend to index these vectors with a GIN or GiST index to speed up text search queries.
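For example (the table and column names are illustrative):

```sql
-- index the computed tsvector so full text queries can use it
CREATE INDEX articles_fts_idx ON articles
    USING GIN (to_tsvector('english', body));

-- a query of this exact shape can then be served by the index
SELECT id
FROM articles
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'process');
```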

In SQL we can inspect the intermediate results of this process with the ts_debug function:

test=> select * from ts_debug('english', 'This text is being processed.');
   alias   |   description   |   token   |  dictionaries  |  dictionary  |  lexemes
-----------+-----------------+-----------+----------------+--------------+-----------
 asciiword | Word, all ASCII | This      | {english_stem} | english_stem | {}
 blank     | Space symbols   |           | {}             |              |
 asciiword | Word, all ASCII | text      | {english_stem} | english_stem | {text}
 blank     | Space symbols   |           | {}             |              |
 asciiword | Word, all ASCII | is        | {english_stem} | english_stem | {}
 blank     | Space symbols   |           | {}             |              |
 asciiword | Word, all ASCII | being     | {english_stem} | english_stem | {}
 blank     | Space symbols   |           | {}             |              |
 asciiword | Word, all ASCII | processed | {english_stem} | english_stem | {process}
 blank     | Space symbols   | .         | {}             |              |

The parser breaks the text down into tokens (token column), each token being associated with a type (alias and description columns). Then, depending on their types, these tokens are submitted as input to the dictionaries mapped to those types, which may produce one lexeme, several, or zero to eliminate the term from the output vector.

In the above example, spaces and punctuation are eliminated be

Category: postgresql