Feed Aggregator

Pierre-Emmanuel André: Setup a PostgreSQL cluster with repmgr and pgbouncer

planet postgresql - 2018-12-22(Sat) 09:32:06

Recently I had to set up a PostgreSQL cluster, and one of the prerequisites was to use repmgr.

In this post, I will explain the work I did and how to set up this kind of cluster.

Categories: postgresql

Avinash Kumar: Backup and Restore a PostgreSQL Cluster With Multiple Tablespaces Using pg_basebackup

planet postgresql - 2018-12-22(Sat) 02:35:59

pg_basebackup is a widely used PostgreSQL backup tool that allows us to take an ONLINE and CONSISTENT file system level backup. These backups can be used for point-in-time recovery or to set up a slave/standby. You may want to refer to our previous blog posts, PostgreSQL Backup Strategy, Streaming Replication in PostgreSQL and Faster PITR in PostgreSQL, where we describe how we used pg_basebackup for different purposes. In this post, I’ll demonstrate the steps to restore a backup taken using pg_basebackup when we have many tablespaces that store databases or their underlying objects.

A simple backup can be taken using the following syntax.

Tar and Compressed Format

$ pg_basebackup -h localhost -p 5432 -U postgres -D /backupdir/latest_backup -Ft -z -Xs -P

Plain Format

$ pg_basebackup -h localhost -p 5432 -U postgres -D /backupdir/latest_backup -Fp -Xs -P

Using the tar and compressed format is advantageous when you wish to use less disk space to back up and store all tablespaces, the data directory and WAL segments, with everything in just one directory (the target directory for the backup).

A plain format, on the other hand, stores a copy of the data directory as is, in the target directory. When you have one or more non-default tablespaces, the tablespaces are stored in separate directories. These are usually the same as the original locations, unless you use --tablespace-mapping=olddir=newdir to modify the destination for storing the tablespaces backup.

PostgreSQL supports the concept of tablespaces. In simple words, a tablespace helps us maintain multiple locations to scatter databases or their objects. In this way, we can distribute the IO and balance the load across multiple disks.
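As a concrete flavor of this, a tablespace is created with a location on another disk, and objects are then placed in it; a minimal sketch with hypothetical paths and names:

-- the directory must already exist and be owned by the postgres OS user
CREATE TABLESPACE tblspc_disk1 LOCATION '/disk1/pg_tblspc';
CREATE TABLE orders (id bigint, total numeric) TABLESPACE tblspc_disk1;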

To understand what happens when we backup a PostgreSQL cluster that contains multiple tablespaces, let’s consider the following example. We’ll take these steps:

  • Create two tablespaces in an existing master-slave replication setup.
  • Take a backup and see what is inside the backup directory.
  • Restore the backup.
  • Conclude our findings.
Create 2 tablespaces and take a backup (ta[...]
Categories: postgresql

Bruce Momjian: Compiled PL/pgSQL?

planet postgresql - 2018-12-22(Sat) 01:12:18

PL/pgSQL has good alignment with SQL. When first executed in a session, the PL/pgSQL source is compiled to an abstract syntax tree which is then executed every time the PL/pgSQL function is executed in the session. Other languages have different compile behavior:

  • PL/Perl compiles to bytecode on first call, similar to PL/pgSQL
  • PL/Python loads pyc bytecode files
  • SPI loads machine-language instruction files (object files)

This email thread covers some of the details. Keep in mind that most server-side functions spend the majority of their time running SQL queries, so the method of compilation is often insignificant.

Categories: postgresql

Hans-Juergen Schoenig: A PostgreSQL story about “NULL IS NULL = NULL AND NOT NULL”

planet postgresql - 2018-12-21(Fri) 17:59:22

After years of software development, some might still wonder: What is a NULL value? What does it really mean and what is its purpose? The general rule is: NULL basically means “undefined”. Many books state that NULL means “empty” but I think that is not the ideal way to see things: If your wallet is empty, your financial situation is perfectly defined (= you are broke). But “undefined” is different. It means that we don’t know the value. If we don’t know how much cash you got, you might still be a millionaire. So using the word “unknown” to describe NULL in SQL is really better than using the word “empty”, which can be pretty misleading in my judgement.

NULL values in PostgreSQL: Basic rules

First of all: NULL is a super useful thing in SQL and people should be aware of the details associated with it. Before digging deeper into NULL it is necessary to take a look at the most basic rules. The following example shows a mistake commonly made by many developers:

test=# SELECT 10 = NULL;
 ?column?
----------

(1 row)

Many people assume that the output of this query is actually “false”, which is not correct. The result is NULL. Why is that? Suppose you got 10 bucks in your left pocket and nobody knows how much cash you got in your right pocket. Is the amount of cash in your pockets the same? We don’t know. It might very well be so, but we simply don’t know. Thus the result of this query has to be NULL.

Let us try something else:

test=# SELECT NULL = NULL;
 ?column?
----------

(1 row)

The same is true for this query. The result has to be NULL. We don’t know how much cash is in your left pocket and we got no idea how much there is in your right pocket. Is it identical? Again: We have absolutely no idea – the result is undefined.

To figure out whether a value is actually NULL we have to use the following syntax:

test=# SELECT NULL IS NULL;
 ?column?
----------
 t
(1 row)

In this case the result is true because “IS” actually checks whether the value is indeed NULL. Consequently, the next query is going to return false:

test=[...]
Categories: postgresql

Michael Paquier: Postgres 12 highlight - wal_sender_timeout now user-settable

planet postgresql - 2018-12-21(Fri) 14:04:27

The following commit has reached PostgreSQL 12, which brings more flexibility in managing replication with standbys distributed geographically:

commit: db361db2fce7491303f49243f652c75c084f5a19
author: Michael Paquier <michael@paquier.xyz>
date: Sat, 22 Sep 2018 15:23:59 +0900

Make GUC wal_sender_timeout user-settable

Being able to use a value that can be changed on a connection basis is
useful with clusters distributed geographically, and makes failure
detection more flexible. A note is added in the documentation about the
use of "options" in primary_conninfo, which can be hard to grasp for
newcomers with the need of two single quotes when listing a set of
parameters.

Author: Tsunakawa Takayuki
Reviewed-by: Masahiko Sawada, Michael Paquier
Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1FAAD3AE@G01JPEXMBYT05

For some deployments, it matters to be able to change wal_sender_timeout depending on the standby and its latency to the primary (or to another standby when dealing with a cascading instance). For example, a shorter timeout for a standby close to its primary allows faster problem detection and failover, while a longer timeout can be helpful for a standby in a remote location to judge its health correctly. In Postgres 11 and older versions, ever since wal_sender_timeout was introduced in 9.1, this parameter could only be set at server level, being marked as PGC_SIGHUP in its GUC properties. Changing the value of this parameter does not need an instance restart, and the new value can be reloaded by all the connected sessions, including WAL senders.
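Setting it per connection happens through the “options” part of primary_conninfo, which is the documentation note the commit mentions; a minimal sketch with a hypothetical host and user, showing the doubled single quotes (the timeout is in milliseconds):

primary_conninfo = 'host=primary.example.com user=replicator options=''-c wal_sender_timeout=5000'''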

The thread related to the above commit also discussed whether this parameter should be changed into a backend-level parameter, which would have the following properties:

  • Reload does not work on it: once the parameter is set at connection time, it can never be changed afterwards.
  • Changing this parameter at server level makes all new connections use the new value.
  • Role-level configuration is not possible.

Still, for default values, it is a h

[...]
Categories: postgresql

Ibrar Ahmed: Benchmark PostgreSQL With Linux HugePages

planet postgresql - 2018-12-21(Fri) 03:13:13

The Linux kernel provides a wide range of configuration options that can affect performance. It’s all about getting the right configuration for your application and workload. Just like any other database, PostgreSQL relies on the Linux kernel to be optimally configured. Poorly configured parameters can result in poor performance. Therefore, it is important that you benchmark database performance after each tuning session to avoid performance degradation. In one of my previous posts, Tune Linux Kernel Parameters For PostgreSQL Optimization, I described some of the most useful Linux kernel parameters and how they may help you improve database performance. Now I am going to share my benchmark results after configuring Linux HugePages with different PostgreSQL workloads. I have performed a comprehensive set of benchmarks for many different PostgreSQL load sizes and different numbers of concurrent clients.

Benchmark Machine
  • Supermicro server:
    • Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz
    • 2 sockets / 28 cores / 56 threads
    • Memory: 256GB of RAM
    • Storage: SAMSUNG SM863 1.9TB Enterprise SSD
    • Filesystem: ext4/xfs
  • OS: Ubuntu 16.04.4, kernel 4.13.0-36-generic
  • PostgreSQL: version 11
Linux Kernel Settings

I have used default kernel settings without any optimization/tuning except for disabling Transparent HugePages. Transparent HugePages is enabled by default and allocates page sizes that may not be recommended for database usage. Databases generally need fixed-size HugePages, which Transparent HugePages does not provide. Hence, disabling this feature and defaulting to classic HugePages is always recommended.
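A minimal sketch of that setup, assuming 2 MB HugePages and the 64GB shared_buffers used below; the page count here is a hypothetical value that should be checked against HugePages_Total in /proc/meminfo:

# disable Transparent HugePages for the running kernel
$ echo never > /sys/kernel/mm/transparent_hugepage/enabled
# reserve classic HugePages: 64GB / 2MB = 32768 pages, plus some headroom
$ sysctl -w vm.nr_hugepages=33000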

PostgreSQL Settings

I have used consistent PostgreSQL settings for all the benchmarks in order to record different PostgreSQL workloads with different settings of Linux HugePages. Here are the PostgreSQL settings used for all benchmarks:

shared_buffers = '64GB'
work_mem = '1GB'
random_page_cost = '1'
maintenance_work_mem = '2GB'
synchronous_commit = 'on'
seq_page_cost = '1'
max_wal_size = '100GB'
checkpoint_time[...]
Categories: postgresql

Luca Ferrari: PGVersion: a class to manage PostgreSQL Version (strings) within a Perl 6 Program

planet postgresql - 2018-12-20(Thu) 09:00:00

While writing a program in Perl 6 I had the need to correctly parse and analyze different PostgreSQL version strings. I wrote a simple and minimal class to that aim, and refactored it so it can escape into the wild.


As you probably already know, PostgreSQL has changed its versioning number scheme from a major.major.minor approach to a concise major.minor one. Both are simple enough to be evaluated with a regular expression, but I found myself writing the same logic over and over, so I decided to write a minimal class to do the job for me and provide several pieces of information.
Oh, and this is Perl 6 (that I’m still learning!).
The class is named Fluca1978::Utils::PostgreSQL::PGVersion and is released as it is under the BSD Licence.

Quick, show me something!

Ok, here is how it works:

use Fluca1978::Utils::PostgreSQL::PGVersion;

for <10.1 11beta1 11.1 9.6.5 6.11> {
    my $v = PGVersion.new: :version-string( $_ );
    say "PostgreSQL version is $v";
    say "or for short { $v.gist }";
    say "and if you want a detailed version:\n{ $v.Str( True ) }";
    say "URL to download: { $v.http-download-url }";
    say '~~~~' x 10;
}

The above simple loop provides the following output:

% perl6 -Ilib usage-example-pgversion.pl
PostgreSQL version is v10.1
or for short 10.1
and if you want a detailed version:
10.1 (Major: 10,...
Categories: postgresql

Ajay Kulkarni: How we are building a self-sustaining open-source business in the cloud era

planet postgresql - 2018-12-20(Thu) 03:31:25

Today, we are announcing that we have started developing new open-code (or source-available) features that will be made available under the (also new) Timescale License (TSL). You will be able to follow the development of these features on our GitHub.

Some of these new features (“community features”) will be free (i.e., available at no charge) for all users except for the <0.0001% who just offer a hosted “database-as-a-service” version of TimescaleDB. Other features will require a commercial relationship with TimescaleDB to unlock and use for everyone (“enterprise features”). And to be clear, we will also continue to invest in the core of TimescaleDB, which will remain open-source under the Apache 2 license.

Going forward, all new features will be categorized into one of the three following buckets: open-source, community, and enterprise.

Software licenses like the TSL are the new reality for open-source businesses like Timescale. This is because the migration of software workloads to the cloud has changed the open-source software industry. Public clouds currently dominate the space, enabling some of them to treat open-source communities and businesses as free R&D for their own proprietary services.

While this behavior (also known as “strip mining”) is legally permissible within the boundaries of classic open-source licenses, we believe it is not good for the open-source community because it is not sustainable. It makes it impossible to build the independent businesses required to support large-scale open-source projects (without being tied to the whims of a large corporate sponsor with varied interests). We ultimately believe that a self-sustaining business builds better open-source software.

(The new Confluent Community License is another example of an open-source business enacting a new license to counteract strip mining by some of the public clouds.)

In this post we explain what we are doing to address this challenge. We also explain what this might mean for you, an existing or potential TimescaleDB user.

W[...]
Categories: postgresql

Bruce Momjian: WAL Directory Location

planet postgresql - 2018-12-20(Thu) 01:45:01

You might have noticed that initdb has a -X/--waldir option to relocate the write-ahead log (pg_wal) directory. This allows the WAL I/O traffic to be on a device different from the default data directory. (This is similar to the use of tablespaces.) If you wish to move the pg_wal (or older pg_xlog) directory to a different device after running initdb, you can simply shut down the database server, move pg_wal to a new location, and create a symbolic link from the data directory to the new pg_wal location.
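A minimal sketch of that procedure, with hypothetical paths:

$ pg_ctl stop -D /var/lib/postgresql/data
$ mv /var/lib/postgresql/data/pg_wal /mnt/waldisk/pg_wal
$ ln -s /mnt/waldisk/pg_wal /var/lib/postgresql/data/pg_wal
$ pg_ctl start -D /var/lib/postgresql/data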

Categories: postgresql

Bruce Momjian: Windows and Shared Buffers

planet postgresql - 2018-12-18(Tue) 02:30:01

Perhaps one overlooked change in Postgres 10 was the removal of the recommendation of smaller shared_buffers on Windows. This email thread discussed its removal. So, if you have been minimizing the size of shared_buffers on Windows, you can stop now.

Categories: postgresql

Sebastian Insausti: PostgreSQL on the Rise: 2018 Postgres Findings & 2019 Trends

planet postgresql - 2018-12-17(Mon) 22:21:47

A strong community coupled with an increasingly stable and feature-rich platform are making PostgreSQL a strong contender for replacing the “big boys” of enterprise data. We here at Severalnines have also seen this trend and the growth of popularity of PostgreSQL. Below we will look at the future of PostgreSQL and why its growth is booming!

PostgreSQL Market Rise

According to DB-Engines, PostgreSQL is the number four top database in use today and the number two most used open source database (behind MySQL). What’s interesting about this ranking, however, is the year-over-year score change, which shows MySQL dropping by triple digits while PostgreSQL is increasing by more than 75 points.

Source: DB-Engines. Database Rankings December 2018

DB-Engines also declared PostgreSQL the DBMS of the Year for 2017; the 2018 winner has not yet been announced.

The 2018 Stack Overflow survey found PostgreSQL ranked number three among the top used database technologies, behind MySQL and SQL Server. This is particularly interesting considering that Stack Overflow is primarily a site for developers, and in spite of this PostgreSQL is still preferred over MongoDB (which is a favorite amongst that audience).

Source: Stack Overflow 2018 User Survey

In an article titled “Why Oracle's missteps have led to PostgreSQL's 'moment' in the database market”, author Matt Asay said: “Postgres is cool in part because of how great a database it is, and partly because it's the clearest like-for-like alternative to Oracle, which no one seems to want to use if they have any choice in the matter. No wonder, then, that out of all databases in existence, Postgres outperformed them all in terms of rising popularity.”

This sentiment was echoed by Tony Baer in his article “Has the time finally come for PostgreSQL?”, who said “it has become the go-to open source platform for third parties seeking to deliver their own relational database products. That is directly attributable to the conservative nature of the open source project that has prioritized stability

[...]
Categories: postgresql

Michael Paquier: Postgres 12 highlight - Controlling SSL protocol

planet postgresql - 2018-12-15(Sat) 10:46:34

The following commit has happened in Postgres 12, adding a feature which allows one to control, and potentially enforce, the protocol that SSL connections can use when connecting to the server:

commit: e73e67c719593c1c16139cc6c516d8379f22f182
author: Peter Eisentraut <peter_e@gmx.net>
date: Tue, 20 Nov 2018 21:49:01 +0100

Add settings to control SSL/TLS protocol version

For example:

ssl_min_protocol_version = 'TLSv1.1'
ssl_max_protocol_version = 'TLSv1.2'

Reviewed-by: Steve Singer <steve@ssinger.info>
Discussion: https://www.postgresql.org/message-id/flat/1822da87-b862-041a-9fc2-d0310c3da173@2ndquadrant.com

As mentioned in the commit message, this commit introduces two new GUC parameters:

  • ssl_min_protocol_version, to control the minimum version allowed for the communication protocol.
  • ssl_max_protocol_version, to control the maximum version allowed for the communication protocol.

Those can take different values, which differ depending on what the version of OpenSSL that PostgreSQL is compiled with is able to support, going from TLS 1.0 to 1.3: TLSv1, TLSv1.1, TLSv1.2, TLSv1.3. An empty string can also be used for the maximum, meaning that anything is supported, which gives more flexibility for upgrades. Note that within the allowed range, the newest protocol will be the one used by default.
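For example, a minimal postgresql.conf sketch that enforces TLS 1.2 or newer while leaving the maximum open:

ssl_min_protocol_version = 'TLSv1.2'
ssl_max_protocol_version = ''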

Personally, I find the possibility to enforce that quite useful, as up to Postgres 11 the backend has automatically taken the newest protocol available, with SSLv2 and SSLv3 disabled by being hardcoded in the code. However, sometimes requirements pop up demanding that at least a given TLS protocol be enforced. Such things would not matter for most users, but for some large organizations it sometimes makes sense to enforce some control. This is also useful for testing a protocol when doing development on a specific patch, which can happen when working on things like SSL-specific things for authentication. Another area where this can be useful is if a flaw is found in a specific protocol to mak

[...]
Categories: postgresql

SunshinePHP 2019

php.net - 2018-12-15(Sat) 09:00:01
Categories: php

Will Leinweber: \watch ing Star Wars in Postgres

planet postgresql - 2018-12-15(Sat) 04:54:00

I recently had the honor of speaking at the last Keep Ruby Weird. A good part of the talk dealt with Postgres, and since Citus is a Postgres company I figured sharing those parts on the blog would be a good idea. If you’d like to see it in talk form, or you’d also like to know how to watch movies rendered as emojis in your terminal, I encourage you to watch the talk.

So, obviously we want to watch Star Wars in psql. The first step though is getting all of the frames into a format that makes sense. http://asciimation.co.nz by Simon Jansen is an ASCII version of Star Wars, and is perfect for our needs. Simon hand drew all of the frames and it looks like this:

==== Help me, o o~~ Obi-Wan Kenobi! _\O /_ ___ / \ / \ /() \ //| | |\\ _|_____|_ // | | |// ,@ | | === | | // | | // /=- |_| O |_|(' |===(| || || O || | || | || ||__*__|| (_)(_) --~~--- |~ \___/ ~| |_||_| | | /=\ /=\ /=\ |_||_| _________|_____|______[_]_[_]_[_]____/__][__\______________________

We can get all of the data using curl as long as we get it using gzip, and then we can extract the frames out of the single, very long javascript line that starts with var film =. The format is a number representing how many times this particular frame should be repeated, followed by fourteen lines separated by newline characters.

raw = File.readlines("site.html")
  .find {|line| line.start_with? "var film"}
  .encode('UTF-8','binary',invalid: :replace,undef: :replace,replace: '')
  .gsub(%q{\'},"'")[12..-16]
  .split('\n')
  .each_slice(14)

Once we have that, we can store it in a Postgres table:

CREATE TABLE film (
  i serial PRIMARY KEY,
  count int NOT NULL,
  frame text NOT NULL
);

require 'sequel'
db = Sequel.connect("postgre[...]
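The \watch in the title is psql’s meta-command that re-runs the current query buffer at a fixed interval; a sketch of the playback idea, not the talk’s actual player, advancing one frame per execution with a sequence:

-- a sequence tracks the playback position; the uncorrelated subquery is
-- evaluated once per execution, so each re-run fetches the next frame
CREATE SEQUENCE frame_pos;
SELECT frame FROM film WHERE i = (SELECT nextval('frame_pos'));
\watch 0.1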
Categories: postgresql

Bruce Momjian: Micro-Optimizations

planet postgresql - 2018-12-14(Fri) 23:30:02

This email thread discusses the tradeoffs of adding optimization improvements that affect only a small percentage of queries, i.e., micro-optimizations. The big sticking point is that we don't know if the time required to check for these optimizations is worth it. For short-running queries, it probably isn't, but for long-running queries, it probably is. The problem is that we don't know the final cost of the query until the end of the optimization stage, which makes it impossible to decide if checks for micro-optimizations are worthwhile during optimization.

During the email discussion, optimizing X = X clauses was considered to be a win for all queries, so it was applied. The optimization to convert OR queries to use UNION is still being considered. Figuring out a way to estimate the cost before optimization starts was also recently discussed.
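A sketch of that OR-to-UNION rewrite, with a hypothetical table; the transformation needs care around duplicate rows, which is part of why it is still under discussion:

-- original form: the OR across different columns often forces a sequential scan
SELECT * FROM t WHERE a = 1 OR b = 2;
-- rewritten form: each branch can use its own index
SELECT * FROM t WHERE a = 1
UNION
SELECT * FROM t WHERE b = 2;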

Categories: postgresql

Bruce Momjian: Optimizer Hints

planet postgresql - 2018-12-13(Thu) 00:30:01

You might know that Postgres has optimizer hints listed as a feature we do not want to implement. I have covered this topic in the past.

This presentation from Tatsuro Yamada covers the trade-offs of optimizer hints in more detail than I have ever seen before. Starting on slide 29, Yamada-san explains the number-one reason to use optimizer hints — as a short-term fix for inefficient query plans. He uses pg_hint_plan (produced by NTT) to fix inefficient plans in his use case. He then goes on to consider possible non-hint solutions to inefficient plans, such as recording data distributions found during execution for use in the optimization of later queries.
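For a flavor of the approach, pg_hint_plan reads hints from a special comment placed before the query, parsed by the extension rather than by core Postgres; hypothetical table and index names:

/*+ IndexScan(orders orders_customer_idx) */
SELECT * FROM orders WHERE customer_id = 42;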

Most interesting to me was his reproduction of the optimizer hints discussion on the Postgres wiki, including his analysis of how pg_hint_plan fits those criteria. There are certainly environments where optimizer hints are helpful, and it seems Yamada-san has found one. The reason the community does not plan to support hints is that it is considered likely that optimizer hints would cause more problems for users than they solve. While Postgres has some crude manual optimizer controls, it would certainly be good if Postgres could come up with additional solutions that further minimize the occurrence of significantly inefficient plans.

Categories: postgresql

Luca Ferrari: PostgreSQL 11 Server Side Programming - Now Available!

planet postgresql - 2018-12-12(Wed) 09:00:00

A quick start guide on implementing and deploying code to PostgreSQL.


Near the end of November, Packt published the book PostgreSQL 11 Server Side Programming Quick Start Guide, authored by me.



This post has the only aim of describing the book contents and the reasons behind the choices related to it.

Following a consolidated tradition, Packt is producing more and more books on PostgreSQL and related technologies, and this is the first one that covers aspects of the freshly released PostgreSQL 11 version.

Nevertheless, this does not mean that the book is only for PostgreSQL 11 users and administrators: it covers topics and concepts, and provides examples, that can be used as-is or ported to older versions of PostgreSQL, as well as probably to newer ones. In fact, while the book’s code examples have been tested against a PostgreSQL 11 cluster, only the examples related to the new PROCEDURE object, introduced by PostgreSQL 11, are strongly tied to that version.
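As a taste of that PostgreSQL 11 novelty, a minimal sketch with hypothetical names; unlike a function, a procedure is invoked with CALL and may control transactions:

CREATE PROCEDURE log_event(msg text)
LANGUAGE plpgsql
AS $$
BEGIN
  -- assumes an event_log(message text) table exists
  INSERT INTO event_log(message) VALUES (msg);
  COMMIT;  -- allowed in a procedure, not in a function
END;
$$;

CALL log_event('hello');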

This book is a Quick Start Guide, and therefore it has a very practical approach to a limited scope, and only that. The book assumes you are able to install, manage and run a PostgreSQL 11 cluster, that you know how to connect, and that you can handle basic SQL statements. Basic knowledge of general programming is also required.

The book consists of 10 chapters, each focusing on a particular aspect of developing and deploying code within a PostgreSQL cluster. The main programming language used in the book is PL/pgSQL, the default procedural language for PostgreSQL; however several examples are...

Categories: postgresql

Achilleas Mantzios: Using PostgreSQL Logical Replication to Maintain an Always Up-to-Date Read/Write TEST Server

planet postgresql - 2018-12-12(Wed) 02:53:42

In this blog entry we’ll talk about logical replication in PostgreSQL: its use cases, general information on the status of this technology, and, as a special use case, how to set up a subscriber (replica) node of the primary server to function as the database server for the testing environment, along with the challenges met.

Intro

Logical replication, officially introduced in PostgreSQL 10, is the latest replication technology offered by the PostgreSQL community. Logical replication is a continuation of the legacy of physical replication, with which it shares a lot of ideas and code. Like physical replication, logical replication uses the WAL, but it records logical changes that are independent of the version or specific architecture. The PostgreSQL community has gone a long way in order to be able to provide logical replication in the core offering.

Types of replication and History of PostgreSQL replication

The types of replication in databases can be classified as follows:

  • Physical (AKA binary) replication
    • Operating system level (vSphere replication)
    • File system level (DRBD)
    • Database level (WAL based)
  • Logical replication (Database level)
    • Trigger based (DBMirror, Slony)
    • Middleware (pgpool)
    • WAL based (pglogical, Logical Replication)

The roadmap that brings to today’s WAL based logical replication was:

  • 2001: DBMirror (trigger based)
  • 2004: Slony1 (trigger based), pgpool (middleware)
  • 2005: PITR (WAL based) introduced in PostgreSQL 8.0
  • 2006: Warm standby in PostgreSQL 8.2
  • 2010: Physical streaming replication, hot standby in PostgreSQL 9.0
  • 2011: Synchronous streaming replication in PostgreSQL 9.1
  • 2012: Cascading streaming replication in PostgreSQL 9.2
  • 2013: Background workers in PostgreSQL 9.3
  • 2014: Logical decoding API, replication slots. (The foundations for Logical Replication) in PostgreSQL 9.4
  • 2015: 2ndQuadrant introduces pglogical, the ancestor of Logical Replication
  • 2017: Logical Replication in core PostgreSQL 10!
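The end point of that roadmap, in-core logical replication, boils down to a publication on the primary and a subscription on the replica; a minimal sketch with hypothetical names:

-- on the publisher (primary):
CREATE PUBLICATION testpub FOR TABLE accounts;

-- on the subscriber (the would-be TEST server):
CREATE SUBSCRIPTION testsub
  CONNECTION 'host=primary dbname=app user=repuser'
  PUBLICATION testpub;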

As we can see lots of technologies collaborated to make Logical Replication a r

[...]
Categories: postgresql

Joshua Drake: PostgresConf 2019: Early bird tickets now available and CFP still open

planet postgresql - 2018-12-12(Wed) 02:20:00
We are pleased to announce that Early Bird tickets to Postgres Conference 2019 are now available. Whether you are seeking a world class big data and analytics learning opportunity from the Greenplum Summit, a deep dive from the Google Cloud Summit, Regulated Industry Information from the Regulated Industry track, or a dizzying amount of learning opportunities from over 100 breakout sessions, PostgresConf 2019 is the show not to miss! Oh, and did we mention there will be training options as well?

Register here: https://postgresconf.org/conferences/2019

Call For Papers is still open! Submit your presentation at the above link.
 

PostgresConf would be impossible without the generous support of our sponsors including: EnterpriseDB, Pivotal, Google, Microsoft and 2ndQuadrant.
Thanks for all of the support and we look forward to seeing you in March!

The Postgres Conference Organizer Team

Categories: postgresql

Laurenz Albe: Be prepared for prepared transactions

planet postgresql - 2018-12-11(Tue) 18:00:45

Prepared transactions are disabled in PostgreSQL by default, since the parameter max_prepared_transactions has the default value 0.

You don’t need prepared transactions in most cases. However, they can cause nasty problems, so I think that everybody who runs a PostgreSQL database should understand them.

To illustrate these problems, I’ll show you how to use prepared transactions to get a PostgreSQL database into an inaccessible state.

 

What are prepared transactions?

Normally, a database transaction that spans multiple statements is ended with COMMIT or ROLLBACK. With prepared transactions, another step is added:

  1. BEGIN or START TRANSACTION: starts a transaction as usual.
  2. PREPARE TRANSACTION 'name': prepares the transaction for commit or rollback and assigns a name to it.
  3. { COMMIT | ROLLBACK } PREPARED 'name': commits or rolls back a previously prepared transaction.
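A minimal sketch of the complete sequence, with a hypothetical table and transaction name; note that this requires max_prepared_transactions to be set to a value greater than 0:

BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
PREPARE TRANSACTION 'tx_example';
-- the transaction is now persisted; from this or any later session:
COMMIT PREPARED 'tx_example';   -- or: ROLLBACK PREPARED 'tx_example'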

 

The PREPARE TRANSACTION step performs all actions that may fail during COMMIT. That way, both COMMIT PREPARED and ROLLBACK PREPARED are guaranteed to succeed once a transaction is prepared. Moreover, PREPARE TRANSACTION persists the still open transaction, so that it will survive a crash or server restart.

Once a transaction is prepared, it is complete. Subsequent SQL statements belong to different transactions. You cannot do anything with a prepared transaction except COMMIT PREPARED and ROLLBACK PREPARED.

 

What is the use of prepared transactions?

Prepared transactions are used to implement “distributed transactions”.
Distributed transactions are transactions that affect more than one data source.
The protocol is as follows:

  1. Start a transaction on all data sources involved in the transaction.
  2. Modify data in the data sources. If there is a problem, ROLLBACK all involved transactions.
  3. Once you are done, PREPARE all involved transactions.
  4. If the PREPARE step fails in any of the transactions, issue ROLLBACK PREPARED everywhere.
  5. If the PREPARE step succeeds everywhere, COMMIT PREPARED all involved transactions.

 

This so-called “two-phase commit protocol”

[...]
Categories: postgresql

Pages