Planet PostgreSQL

Jobin Augustine: pgBackRest – A Great Backup Solution and a Wonderful Year of Growth

2019-05-10 (Fri) 23:14:20

pgBackRest addresses many of the must-have features that you’ll want to look for in a PostgreSQL backup solution. I have been a great fan of the pgBackRest project for quite some time, and it keeps getting better. Historically it was written in Perl, and over the last year the project has been making steady progress converting it to native C code. At the time of writing, the latest version is 2.13, and there is still a dependency on a long list of Perl libraries. If you’ve never tried pgBackRest, now is a great time to do so. This post should help you set up a simple backup with a local backup repository.

Installation

The pgBackRest project packages are maintained in the PGDG repository. If you have already added the PGDG repository to your package manager, installation is a breeze.

On RHEL/CentOS/OEL:

$ sudo yum install pgbackrest

On Ubuntu/Debian

$ sudo apt install pgbackrest

This will fetch all the required Perl libraries too.

pgbackrest is now a native executable (version 2):

$ file /usr/bin/pgbackrest
/usr/bin/pgbackrest: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=5e3f6123d02e0013b53f6568f99409378d43ad89, not stripped

Some of the other changes DBAs should keep in mind are:

  1. The thread-max option is no longer valid – use process-max instead.
  2. The archive-max-mb option is no longer valid and has been replaced with the archive-push-queue-max option, which has different semantics.
  3. The backup-user option is deprecated in favor of the new repo-host-user option, and its default has changed from backrest to pgbackrest.
  4. The configuration file has moved from /etc/pgbackrest.conf to /etc/pgbackrest/pgbackrest.conf.

Building from source

We may want to build pgBackRest from source, depending on our environment and version. Building pgBackRest from source on Debian/Ubuntu is already covered in the official documentation. Below I’ve provided the steps to follow for the Red Hat family.

Get the tarball of the latest release:

curl -LO https://github.com/pgbackrest/pgback[...]
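
Once the packages are installed, pointing PostgreSQL at pgBackRest is mostly a matter of archive_command. A minimal sketch, assuming a stanza named demo has already been configured in /etc/pgbackrest/pgbackrest.conf:

ALTER SYSTEM SET archive_mode = 'on';
ALTER SYSTEM SET archive_command = 'pgbackrest --stanza=demo archive-push %p';
-- archive_command is picked up on reload; changing archive_mode needs a server restart
SELECT pg_reload_conf();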
Category: postgresql

Laurenz Albe: PostgreSQL v12 new feature: optimizer support for functions

2019-05-10 (Fri) 17:00:18
© Laurenz Albe 2019

 

PostgreSQL commit 74dfe58a5927b22c744b29534e67bfdd203ac028 has added “support functions”. This exciting new functionality allows the optimizer some insight into functions. This article will discuss how this improves query planning in PostgreSQL v12. If you are willing to write C code, you can also use this functionality for your own functions.

Functions as “black boxes”

Up to now, the PostgreSQL optimizer couldn’t really do a lot about functions. No matter how much it knew about the arguments of a function, it didn’t have the faintest clue about the function result. This also applied to built-in functions: no information about them was “wired into” the optimizer.

Let’s look at a simple example:

EXPLAIN SELECT * FROM unnest(ARRAY[1,2,3]);

                         QUERY PLAN
-------------------------------------------------------------
 Function Scan on unnest  (cost=0.00..1.00 rows=100 width=4)
(1 row)

PostgreSQL knows exactly that the array contains three elements. Still, it has no clue how many rows unnest will return, so it estimates an arbitrary 100 result rows. If this function invocation is part of a bigger SQL statement, the wrong result count can lead to a bad plan. The most common problem is that PostgreSQL will select bad join strategies based on wrong cardinality estimates. If you have ever waited for a nested loop join to finish that got 10000 instead of 10 rows in the outer relation, you know what I’m talking about.

There is the option to specify COST and ROWS on a function to improve the estimates, but you can only specify constants there, which is often not good enough.
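
For example, a set-returning function whose result size is known up front can at least be declared with a constant estimate (the function and numbers below are purely illustrative):

CREATE FUNCTION first_three() RETURNS SETOF integer
    LANGUAGE sql
    AS 'SELECT * FROM generate_series(1, 3)'
    ROWS 3 COST 1;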

There were many other ways in which optimizer support for functions was lacking. This situation has been improved with support functions.

Support function syntax

The CREATE FUNCTION statement has been extended like this:

CREATE FUNCTION name (...) RETURNS ... SUPPORT supportfunction AS ...

This way a function gets a “support function” that knows about the function and can

[...]
Category: postgresql

Ibrar Ahmed: Improving OLAP Workload Performance for PostgreSQL with ClickHouse Database

2019-05-10 (Fri) 03:00:55

No database management system is optimized for every workload. Database systems are designed for specific loads, and thereby give better performance for that workload. Similarly, some kinds of queries work better on some database systems and worse on others. This is the era of specialization, where a product is designed for a specific requirement or a specific set of requirements. We cannot achieve everything performance-wise from a single database system. PostgreSQL is one of the finest object-relational databases and performs really well for OLTP workloads, but I have observed that its performance is not as good as some other database systems for OLAP workloads. ClickHouse is one example that outperforms PostgreSQL for OLAP.

ClickHouse is an open source, column-based database management system which claims to be 100–1,000x faster than traditional approaches, capable of processing more than a billion rows in less than a second.

Foreign Data Wrapper (Clickhousedb_fdw)

If we have a different kind of workload that PostgreSQL is not optimized for, what is the solution? Fortunately, PostgreSQL provides a mechanism to handle such workloads while still using it: you create a data proxy within PostgreSQL that internally calls and fetches data from a different database system. This is called a Foreign Data Wrapper (FDW), and it is based on SQL/MED. Percona provides a foreign data wrapper for ClickHouse, clickhousedb_fdw, which is available in Percona’s GitHub repositories.
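
On the SQL side this follows the usual FDW pattern; a rough sketch (the OPTIONS values below are placeholders, and the exact option names accepted by clickhousedb_fdw are listed in the project README):

CREATE EXTENSION clickhousedb_fdw;

CREATE SERVER clickhouse_svr
    FOREIGN DATA WRAPPER clickhousedb_fdw
    OPTIONS (dbname 'test_database');   -- connection options: placeholders, see the FDW docs

CREATE USER MAPPING FOR CURRENT_USER SERVER clickhouse_svr;

CREATE FOREIGN TABLE ontime (
    "Year"    integer,
    "Carrier" text
) SERVER clickhouse_svr
  OPTIONS (table_name 'ontime');        -- option name is an assumption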

Benchmark Machine
  • Supermicro server:
    • Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz
    • 2 sockets / 28 cores / 56 threads
    • Memory: 256GB of RAM
    • Storage: Samsung  SM863 1.9TB Enterprise SSD
    • Filesystem: ext4/xfs
  • OS: Linux smblade01 4.15.0-42-generic #45~16.04.1-Ubuntu
  • PostgreSQL: version 11
Benchmark Workload

Ontime (On Time Reporting Carrier On-Time Performance) is an openly available dataset I have used for the benchmark. It has a table size of 85GB with 109 columns of different types, and its designed queries more

[...]
Category: postgresql

Sebastian Insausti: How to Deploy PostgreSQL to a Docker Container Using ClusterControl

2019-05-09 (Thu) 18:55:18

Docker has become the most common tool to create, deploy, and run applications by using containers. It allows us to package up an application with all of the parts it needs, such as libraries and other dependencies, and ship it all out as one package. Docker can be thought of as a virtual machine, but instead of creating a whole virtual operating system, it allows applications to use the same Linux kernel as the system they're running on, and only requires applications to be shipped with things not already running on the host computer. This gives a significant performance boost and reduces the size of the application.

In this blog, we’ll see how we can easily deploy a PostgreSQL setup via Docker, and how to turn our setup into a primary/standby replication setup by using ClusterControl.

How to Deploy PostgreSQL with Docker

First, let’s see how to deploy PostgreSQL with Docker manually by using a PostgreSQL Docker Image.

The image is available on Docker Hub and you can find it from the command line:

$ docker search postgres
NAME       DESCRIPTION                                     STARS   OFFICIAL   AUTOMATED
postgres   The PostgreSQL object-relational database sy…   6519    [OK]

We’ll take the first result. The Official one. So, we need to pull the image:

$ docker pull postgres

And run the node containers, mapping a local port to the database port inside each container:

$ docker run -d --name node1 -p 6551:5432 postgres
$ docker run -d --name node2 -p 6552:5432 postgres
$ docker run -d --name node3 -p 6553:5432 postgres

After running these commands, you should have this Docker environment created:

$ docker ps
CONTAINER ID   IMAGE      COMMAND                  CREATED         STATUS   PORTS   NAMES
51038dbe21f8   postgres   "docker-entrypoint.s…"   About an hour[...]
Category: postgresql

Luca Ferrari: The role of a role within another role

2019-05-09 (Thu) 09:00:00

A recursive title for a kind of recursive topic: what does it really mean to have a role within another one? This article tries to lay out some basic knowledge about it.

The role of a role within another role

After reading the excellent article by Hans-Jürgen Schönig about roles, I decided to provide my own view of users, groups and the more abstract role concept.

The word role

First of all, the word role has little to do with PostgreSQL: it is a word used in the SQL standard, so don’t blame our favourite database for using the same word to express different concepts like user and group.

Roles: are they users or groups?

The wrong part of the question is the or: roles are both users and groups. Period. A role is a stereotype, an abstraction for a collection of permissions to do some stuff. Now, often a collection of permissions is granted to a single user, and therefore a role smells like a user account, but in my opinion this is just a coincidence. And in fact, as in the best system administration tradition, when you have to assign a collection of permissions to more than one user you need a group; roles can therefore also smell like a group.
Remember: roles are collections of permissions; what makes them smell like a group or a user is just the way you use them. If you use a role for a single user, then it is fine to think of the role as a user account. If you use the role for more than one user, then it is fine to think of the role as a group.
Now, if you think this is trivial and simple, consider that a role can smell...
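
To make the distinction concrete, a minimal sketch (role names are made up):

CREATE ROLE accounting NOLOGIN;               -- used by several people: smells like a group
CREATE ROLE alice LOGIN PASSWORD 'secret';    -- used by one person: smells like a user
GRANT accounting TO alice;                    -- alice now carries accounting's permissions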

Category: postgresql

Hans-Juergen Schoenig: PostgreSQL: Using CREATE USER with caution

2019-05-08 (Wed) 15:45:45

PostgreSQL offers powerful means to manage users / roles and enables administrators to implement everything from simple to really complex security concepts. However, if the PostgreSQL security machinery is not used wisely, things might become a bit rough.

This fairly short post will try to shed some light on this topic.

The golden rule: Distinguish between users and roles

The most important thing you have to remember is the following: you cannot drop a user as long as permissions, objects, policies, tablespaces, and so on are still assigned to it. Here is an example:

test=# CREATE TABLE a (aid int);
CREATE TABLE
test=# CREATE USER joe;
CREATE ROLE
test=# GRANT SELECT ON a TO joe;
GRANT

As you can see “joe” has a single permission and there is already no way to kill the user without revoking the permission first:

test=# DROP USER joe;
ERROR:  role "joe" cannot be dropped because some objects depend on it
DETAIL:  privileges for table a

Note that there is no such thing as “DROP USER … CASCADE” – it does not exist. The reason is that users are created at the instance level. A user can therefore have rights in potentially dozens of PostgreSQL databases. If you drop a user, you cannot just blindly remove objects from other databases. It is therefore necessary to revoke all permissions before a user can be removed. That can be a real issue if your deployments grow in size.
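
In the example above it is enough to revoke the single grant before dropping the user; for a user owning many objects, REASSIGN OWNED and DROP OWNED, run in every database involved, are the usual shortcut. A sketch of the simple case:

test=# REVOKE ALL ON a FROM joe;
REVOKE
test=# DROP USER joe;
DROP ROLE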

Using roles to abstract tasks

One thing we have seen over the years is: Tasks tend to exist longer than staff. Even after hiring and firing cleaning staff for your office 5 times the task is still the same: Somebody is going to clean your office twice a week. It can therefore make sense to abstract the tasks performed by “cleaning_staff” in a role, which is then assigned to individual people.

How can one implement this kind of abstraction?

test=# CREATE ROLE cleaning_staff NOLOGIN;
CREATE ROLE
test=# GRANT SELECT ON a TO cleaning_staff;
GRANT
test=# GRANT cleaning_staff TO joe;
GRANT ROLE

First we create a role called “cl

[...]
Category: postgresql

Andreas Scherbaum: Google Summer of Code 2019 - PostgreSQL participates with 5 projects

2019-05-08 (Wed) 05:24:00

Andreas 'ads' Scherbaum

For the 13th year, the PostgreSQL Project is participating in Google Summer of Code (GSoC). This project is a great opportunity to let students learn about Open Source projects, and help them deliver new features. It is also a chance to engage the students beyond just one summer, and grow them into active contributors.

In GSoC, students first learn about the Open Source organization, and either pick a summer project from the list provided by the org, or submit their own idea for review. After a “community bonding” period, the students have time to implement their idea, under supervision of mentors from the Open Source organization. There is also an incentive: first, Google pays the students for their work on improving Open Source projects. And second, having a completed GSoC project in a CV is well recognized.

Continue reading "Google Summer of Code 2019 - PostgreSQL participates with 5 projects"
Category: postgresql

Craig Kerstiens: Introducing Hyperscale (Citus) on Azure Database for PostgreSQL

2019-05-06 (Mon) 22:00:00

For roughly ten years now, I’ve had the pleasure of running and managing databases for people. In the early stages of building an application you move quickly, adding new tables and columns to your Postgres database to support new functionality. You move quickly, but you don’t worry too much because things are fast and responsive–largely because your data is small. Over time your application grows and matures. Your data model stabilizes, and you start to spend more time tuning and tweaking to ensure performance and stability stay where they need to. Eventually you get to the point where you miss the days of maintaining a small database, because life was easier then. Indexes were created quickly, joins were fast, count(*) didn’t bring your database to a screeching halt, and vacuum was not a regular part of your lunchtime conversation. As you continue to tweak and optimize the system, you know you need a plan for the future and know how you’re going to continue to scale.

Now in Preview: Introducing Hyperscale (Citus) on Azure Database for PostgreSQL

With Hyperscale (Citus) on Azure Database for PostgreSQL, we help many of those worries fade away. I am super excited to announce that Citus is now available on Microsoft Azure, as a new deployment option on the Azure Database for PostgreSQL called Hyperscale (Citus).

Hyperscale (Citus) scales out your data across multiple physical nodes, with the underlying data being sharded into much smaller bits. The same database sharding principles that work for Facebook and Google are baked right into the database. But, unlike traditional sharded systems, your application doesn’t have to learn how to shard the data. With Azure Database for PostgreSQL, Hyperscale (Citus) takes Postgres, the open source relational database, and extends it with low-level internal hooks.
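
The developer-facing side of that sharding is the Citus extension's API: once connected to the coordinator (where the extension is already present), distributing a table takes a single call. Table and column names below are illustrative:

CREATE TABLE events (
    tenant_id bigint,
    event_id  bigint,
    payload   jsonb
);

SELECT create_distributed_table('events', 'tenant_id');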

This means you can go back to building new features and functionality, without having to deal with a massive database that is a pain to maintain. When you provision a Hyperscale (Citus) server group, you’ll have a c

[...]
Category: postgresql

Josh Williams: LinuxFest Northwest 2019

2019-05-03 (Fri) 09:00:00

I’m sitting in an airport, writing this in an attempt to stay awake. My flight is scheduled to depart at 11:59 PM, or 2:59 AM in the destination time zone which I’m still used to. This is the first red eye flight I’ve attempted, and I’m wondering why I’ve done this to myself.

I have dedicated a good portion of my life to free, open source software. I’ll occasionally travel to conferences, sitting on long flights and spending those valuable weekends in talks about email encryption and chat bots. I’ve also done this to myself. But even with all this I have zero regrets.

This little retrospective comes courtesy of my experience at LinuxFest Northwest this last weekend in Bellingham, Washington.

Specifically I think it was some of the talks, painting things in broad strokes, that did it. I attended Jon “maddog” Hall’s beard-growing Fifty Years of Unix, and later sat in on the Q&A, which was a bit less technical than expected. So I didn’t ask about the “2038 problem.” But that’s okay.

I felt a little guilty, on one hand, doing these general interest sessions instead of something on a much more specific topic, like ZFS, which would have arguably had a more direct benefit. On the other hand, doing those general interest talks helps me stay grounded, I suppose, helps me keep perspective.

I did attend some more specialized talks, naturally. LFNW was a packed conference, often times there were a number of discussions I would have liked to attend happening at the same time. I’m hoping recordings will become available, or at least slides or other notes will appear. Some of the other talks I attended included, in no particular order:

  • Audio Production on Linux
    Like many other End Pointers, I dabble in a little bit of music. Unlike those other End Pointers, I’ve got no talent for it. Still, I try, and so I listened in on this one to find out a little more about how Jack works. I also caught wind of PipeWire, a project that’s aiming to supplant both PulseAudio and Jack. Neat!

  • Using GIS in Postgres
    I’ve got a couple PostGIS-

[...]
Category: postgresql

Sebastian Insausti: How to Use pgBackRest to Backup PostgreSQL and TimescaleDB

2019-05-03 (Fri) 05:48:23

Your data is probably the most valuable asset in the company, so you should have a Disaster Recovery Plan (DRP) to prevent data loss in the event of an accident or hardware failure. A backup is the simplest form of DR. It might not always be enough to guarantee an acceptable Recovery Point Objective (RPO), but it is a good first approach. Also, you should define a Recovery Time Objective (RTO) according to your company requirements. There are many ways to reach the RTO value; which one you choose depends on the company's goals.

In this blog, we’ll see how to use pgBackRest for backing up PostgreSQL and TimescaleDB and how to use one of the most important features of this backup tool, the combination of Full, Incremental and Differential backups, to minimize downtime.

What is pgBackRest?

There are different types of backups for databases:

  • Logical: The backup is stored in a human-readable format like SQL.
  • Physical: The backup contains binary data.
  • Full/Incremental/Differential: The definition of these three types of backups is implicit in the name. A full backup is a full copy of all your data. An incremental backup only backs up the data that has changed since the previous backup, and a differential backup only contains the data that has changed since the last full backup executed. Incremental and differential backups were introduced as a way to decrease the amount of time and disk space needed to perform a full backup.

pgBackRest is an open source backup tool that creates physical backups with some improvements compared to the classic pg_basebackup tool. We can use pgBackRest to perform an initial database copy for Streaming Replication by using an existing backup, or we can use the delta option to rebuild an old standby server.

Some of the most important pgBackRest features are:

  • Parallel Backup & Restore
  • Local or Remote Operation
  • Full, Incremental and Differential Backups
  • Backup Rotation and Archive Expiration
  • Backup Integrity check
  • Backup Resume
  • Delta Restore
  • Encryption

Now, let’s see how we can use pgBackRe

[...]
Category: postgresql

Ernst-Georg Schmid: Not all CASTs are created equal?

2019-05-03 (Fri) 01:32:00
Can somebody explain this?

PostgreSQL 11.2.

The documentation says:

"A type cast specifies a conversion from one data type to another. PostgreSQL accepts two equivalent syntaxes for type casts:

CAST ( expression AS type )

expression::type

The CAST syntax conforms to SQL; the syntax with :: is historical PostgreSQL usage."

But when I test the lower limits of PostgreSQL's integer types, strange things happen.

select cast(-9223372036854775808 as bigint);
select cast(-2147483648 as integer);
select cast(-32768 as smallint);

All OK.

select -9223372036854775808::bigint;
select -2147483648::integer;
select -32768::smallint;

All fail with SQL Error [22003]: ERROR: out of range

But:

select -9223372036854775807::bigint;
select -2147483647::integer;
select -32767::smallint;

All OK.

???
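
A likely explanation, not part of the original post: the :: cast binds more tightly than the unary minus, so -32768::smallint is parsed as -(32768::smallint), and 32768 already overflows smallint before the sign is applied. Parenthesizing the literal restores the behaviour of CAST:

select (-32768)::smallint;        -- OK
select cast(-32768 as smallint);  -- OK, equivalent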
Category: postgresql

Pavel Golub: 1-to-1 relationship in PostgreSQL for real

2019-05-02 (Thu) 17:00:32

Years ago

Years ago I wrote a post describing how to implement a 1-to-1 relationship in PostgreSQL. The trick was simple and obvious:

CREATE TABLE UserProfiles (
    UProfileID BIGSERIAL PRIMARY KEY,
    ...
);

CREATE TABLE Users (
    UID BIGSERIAL PRIMARY KEY,
    UProfileID int8 NOT NULL,
    ...
    UNIQUE(UProfileID),
    FOREIGN KEY(UProfileID) REFERENCES UserProfiles(UProfileID)
);

You put a unique constraint on the referencing column and you’re fine. But then one of the readers noticed that this is a 1-to-(0..1) relationship, not a true 1-to-1. And he was absolutely correct.

Keep it simple stupid!

A lot of time has passed, and now we can do this trick much more simply using modern features of PostgreSQL. Let’s check:

BEGIN;

CREATE TABLE uProfiles (
    uid     int8 PRIMARY KEY,
    payload jsonb NOT NULL
);

CREATE TABLE Users (
    uid   int8 PRIMARY KEY,
    uname text NOT NULL,
    FOREIGN KEY (uid) REFERENCES uProfiles (uid)
);

ALTER TABLE uProfiles
    ADD FOREIGN KEY (uid) REFERENCES Users (uid);

INSERT INTO Users VALUES (1, 'Pavlo Golub');
INSERT INTO uProfiles VALUES (1, '{}');

COMMIT;

Things are obvious. We create two tables that reference each other using the same columns in both directions.
Moreover, in such a model both our foreign keys are automatically indexed, since they are also the primary keys!
Seems legit, but executing this script will produce an error:

SQL Error [23503]: ERROR: insert or update on table "users" violates foreign key constraint "users_uid_fkey"
  Detail: Key (uid)=(1) is not present in table "uprofiles".

Oops. And that was the pitfall preventing the easy solution years ago, at the time of my first post.

What about now?

But now we have DEFERRABLE constraints:

This controls whether the constraint can be deferred. A constraint that is not deferrable will be checked immediately after every command. Checking of constraints that are deferrable can be postponed until the end of the transaction (using the SET CONSTRAINTS command). NOT DEFERRABLE is the default. Currently, only UNIQUE, PRIMARY KEY, EXC

[...]
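
The direction the post is heading is presumably something like the following sketch, with both foreign keys declared DEFERRABLE INITIALLY DEFERRED so that they are only checked at COMMIT, after both rows exist:

BEGIN;

CREATE TABLE uProfiles (
    uid     int8 PRIMARY KEY,
    payload jsonb NOT NULL
);

CREATE TABLE Users (
    uid   int8 PRIMARY KEY,
    uname text NOT NULL,
    FOREIGN KEY (uid) REFERENCES uProfiles (uid)
        DEFERRABLE INITIALLY DEFERRED
);

ALTER TABLE uProfiles
    ADD FOREIGN KEY (uid) REFERENCES Users (uid)
    DEFERRABLE INITIALLY DEFERRED;

INSERT INTO Users VALUES (1, 'Pavlo Golub');
INSERT INTO uProfiles VALUES (1, '{}');

COMMIT;  -- both constraints are verified here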
Category: postgresql

Pavel Stehule: pspg on Solaris

2019-05-02 (Thu) 16:54:00
I fixed some issues and pspg can now be used on Solaris too. I found some issues on the Solaris side with UTF-8 support, but they affect only a subset of characters. Because of these issues, don't use psql's Unicode borders.
Category: postgresql

Ibrar Ahmed: Benchmark ClickHouse Database and clickhousedb_fdw

2019-05-02 (Thu) 01:37:29

In this research, I wanted to see what kind of performance improvements could be gained by using a ClickHouse data source rather than PostgreSQL. Assuming that I would see performance advantages from using ClickHouse, would those advantages be retained if I accessed ClickHouse from within PostgreSQL using a foreign data wrapper (FDW)? The FDW in question is clickhousedb_fdw – an open source project from Percona!

The database environments under scrutiny are PostgreSQL v11, clickhousedb_fdw and a ClickHouse database. Ultimately, from within PostgreSQL v11, we are going to issue various SQL queries routed through our clickhousedb_fdw to the ClickHouse database. Then we’ll see how the FDW performance compares with those same queries executed in native PostgreSQL and native ClickHouse.

Clickhouse Database

ClickHouse is an open source, column-based database management system which can be 100 to 1,000 times faster than traditional database approaches, capable of processing more than a billion rows in less than a second.

Clickhousedb_fdw

clickhousedb_fdw, a ClickHouse database foreign data wrapper, or FDW, is an open source project from Percona. Here’s a link for the GitHub project repository:

https://github.com/Percona-Lab/clickhousedb_fdw

I wrote a blog in March which tells you more about our FDW: https://www.percona.com/blog/2019/03/29/postgresql-access-clickhouse-one-of-the-fastest-column-dbmss-with-clickhousedb_fdw/

As you’ll see, this provides an FDW for ClickHouse that allows you to SELECT from, and INSERT INTO, a ClickHouse database from within a PostgreSQL v11 server.

The FDW supports advanced features such as aggregate pushdown and joins pushdown. These significantly improve performance by utilizing the remote server’s resources for these resource intensive operations.
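
For instance, assuming an ontime foreign table has been defined through the FDW (the name is illustrative), an aggregate like the one below can be evaluated on the ClickHouse side instead of pulling every row into PostgreSQL:

SELECT "Carrier", count(*)
FROM ontime
GROUP BY "Carrier"
ORDER BY count(*) DESC;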

Benchmark environment
  • Supermicro server:
    • Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz
    • 2 sockets / 28 cores / 56 threads
    • Memory: 256GB of RAM
    • Storage: Samsung  SM863 1.9TB Enterprise SSD
    • Filesystem: ext4/xfs
  • OS: Linux
[...]
Category: postgresql

Hubert 'depesz' Lubaczewski: Waiting for PostgreSQL 12 – Allow VACUUM to be run with index cleanup disabled.

2019-05-01 (Wed) 10:02:06
On 4th of April 2019, Robert Haas committed patch: Allow VACUUM to be run with index cleanup disabled. This commit adds a new reloption, vacuum_index_cleanup, which controls whether index cleanup is performed for a particular relation by default. It also adds a new option to the VACUUM command, INDEX_CLEANUP, which can be used … Continue reading "Waiting for PostgreSQL 12 – Allow VACUUM to be run with index cleanup disabled."
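
In practice this gives you both a per-table default and a per-run switch; a short sketch (the table name is made up):

-- per-table default via the new reloption
ALTER TABLE big_table SET (vacuum_index_cleanup = false);

-- or per run, via the new VACUUM option
VACUUM (INDEX_CLEANUP FALSE, VERBOSE) big_table;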
Category: postgresql

Hubert 'depesz' Lubaczewski: Waiting for PostgreSQL 12 – Add SETTINGS option to EXPLAIN, to print modified settings.

2019-05-01 (Wed) 03:22:52
On 3rd of April 2019, Tomas Vondra committed patch: Add SETTINGS option to EXPLAIN, to print modified settings. Query planning is affected by a number of configuration options, and it may be crucial to know which of those options were set to non-default values. With this patch you can say EXPLAIN (SETTINGS ON) … Continue reading "Waiting for PostgreSQL 12 – Add SETTINGS option to EXPLAIN, to print modified settings."
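
A quick sketch of what that looks like (the query and the modified setting are arbitrary):

SET work_mem = '256MB';
EXPLAIN (SETTINGS ON) SELECT * FROM pg_class ORDER BY relname;
-- the plan output now includes a "Settings:" line listing work_mem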
Category: postgresql

Eric Hanson: "Aquameta Revisited" on FLOSS Weekly

2019-04-30 (Tue) 20:50:08

Eric talks about Aquameta 0.2 and the advantages of migrating the web stack into PostgreSQL on This Week in Tech's FLOSS Weekly.

Category: postgresql

Markus Winand: A Close Look at the Index Include Clause

2019-04-30 (Tue) 09:00:00

Some databases—namely Microsoft SQL Server, IBM Db2, and also PostgreSQL since release 11—offer an include clause in the create index statement. The introduction of this feature to PostgreSQL is the trigger for this long overdue explanation of the include clause.
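
For reference, the PostgreSQL 11 syntax looks like this (table and column names are invented):

CREATE INDEX idx_sales_customer
    ON sales (customer_id)
    INCLUDE (amount);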

Before going into the details, let’s start with a short recap on how (non-clustered) B-tree indexes work and what the all-mighty index-only scan is.

Recap: B-tree Indexes

To understand the include clause, you must first understand that using an index affects up to three layers of data structures:

  • The B-tree

  • The doubly linked list at the leaf node level of the B-tree

  • The table

The first two structures together form an index so they could be combined into a single item, i.e. the “B-tree index”. I prefer to keep them separate as they serve different needs and have a different impact on performance. Moreover, explaining the include clause requires making this distinction.

In the general case, the database software starts traversing the B-tree to find the first matching entry at the leaf node level (1). It then follows the doubly linked list until it has found all matching entries (2) and finally it fetches each of those matching entries from the table (3). Actually, the last two steps can be interleaved, but that is not relevant for understanding the general concept.

The following formulas give you a rough idea of how many read operations each of these steps needs. The sum of these three components is the total effort of an index access.

  • The B-tree: log100(<rows in table>), often less than 5

  • The doubly linked list: <rows read from index> / 100

  • The table: <rows read from table>

When loading a few rows, the B-tree makes the greatest contribution to the overall effort. As soon as you need to fetch just a handful of rows from the table, this step takes the lead. In either case—few or many rows—the doubly linked list is usually a minor factor because it stores rows with similar values next to each other so that a single read operation can fetch

[...]
Category: postgresql

Sebastian Insausti: An Overview of Streaming Replication for TimescaleDB

2019-04-29 (Mon) 18:48:00

Nowadays, replication is a given in a high availability and fault tolerant environment for pretty much any database technology that you’re using. It is a topic that we have seen over and over again, but that never gets old.

If you’re using TimescaleDB, the most common type of replication is streaming replication, but how does it work?

In this blog, we are going to review some concepts related to replication and we’ll focus on streaming replication for TimescaleDB, which is a functionality inherited from the underlying PostgreSQL engine. Then, we’ll see how ClusterControl can help us to configure it.

So, streaming replication is based on shipping the WAL records and having them applied to the standby server. So, first, let’s see what WAL is.

WAL

Write Ahead Log (WAL) is a standard method for ensuring data integrity; it is automatically enabled by default.

The WALs are the REDO logs in TimescaleDB. But, what are the REDO logs?

REDO logs contain all changes that were made in the database and they are used by replication, recovery, online backup and point in time recovery (PITR). Any changes that have not been applied to the data pages can be redone from the REDO logs.

Using WAL results in a significantly reduced number of disk writes, because only the log file needs to be flushed to disk to guarantee that a transaction is committed, rather than every data file changed by the transaction.

A WAL record will specify, bit by bit, the changes made to the data. Each WAL record will be appended into a WAL file. The insert position is a Log Sequence Number (LSN) that is a byte offset into the logs, increasing with each new record.
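
Both of these are easy to observe from SQL; a small sketch, assuming the PostgreSQL 10+ naming that TimescaleDB builds on:

-- current WAL write position on the primary (an LSN)
SELECT pg_current_wal_lsn();

-- per-standby replication state and how far each one has replayed
SELECT client_addr, state, sent_lsn, replay_lsn
FROM pg_stat_replication;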

The WALs are stored in the pg_wal directory, under the data directory. These files have a default size of 16MB (the size can be changed by altering the --with-wal-segsize configure option when building the server). They have a unique incremental name, in the following format: "00000001 00000000 00000000".

The number of WAL files contained in pg_wal will depend on the value assigned to the min_

[...]
Category: postgresql

Jobin Augustine: Watch Webinar: Upgrading / Migrating Your Legacy PostgreSQL to Newer PostgreSQL Versions

2019-04-26 (Fri) 22:00:10

Please join Percona’s PostgreSQL Support Technical Lead, Avinash Vallarapu and Senior Support Engineers, Fernando Laudares, Jobin Augustine and Nickolay Ihalainen as they demonstrate the methods to upgrade a legacy version of PostgreSQL to a newer version using built-in as well as open source tools.

Watch the Recorded Webinar

To start, this webinar opens with a list of solutions that are built into PostgreSQL to help us upgrade a legacy version of PostgreSQL with minimal downtime. Next, the advantages of choosing such methods will be discussed. You will then see a list of prerequisites for each solution, reducing the scope of possible mistakes. It’s important to minimize downtime when upgrading from an older version of the PostgreSQL server. Therefore, we will present 3 open source solutions that will help us either minimize or completely avoid downtime.

Additionally, we will be publishing a series of 5 blog posts that will help us understand the solutions available to perform a PostgreSQL upgrade. Our presentation will show the full process of upgrading a set of PostgreSQL servers to the latest available version. Furthermore, we’ll show the pros and cons of each of the methods we employed.

Topics covered in this webinar include the following:

1. PostgreSQL upgrade using pg_dump/pg_restore (with downtime)
2. PostgreSQL upgrade using pg_dumpall (with downtime)
3. Continuous replication from a legacy PostgreSQL version to a newer version using Slony.
4. Replication between major PostgreSQL versions using Logical Replication
5. Fast upgrade of legacy PostgreSQL with minimum downtime.

We will walk you through and demonstrate methods that you may find useful in your database environment. We will witness how simple and quick it is to perform the upgrade using these methods.

Category: postgresql
