Planet PostgreSQL
Updated: 1 hour 47 minutes ago

Stefan Fercot: One RPM to rule them all…

2019-04-17 (Wed) 09:00:00

As of 15 April 2019, there is only one repository RPM per distro, and it includes repository information for all available PostgreSQL releases.

This change, announced by Devrim on the pgsql-pkg-yum mailing list, has some impacts.

Announcement

The announcement from Devrim may be found here.

  • Instead of having separate repo RPMs per PostgreSQL major version, we now have one single repo RPM that supports all supported PostgreSQL releases. The new packages obsolete the current ones.

  • The repo RPM version has been bumped to 42. Hopefully that will be the end of the “The repo RPM is 10-4, how can I find 10-7 repo rpm, so that I can install PostgreSQL 10.7?” type questions.

  • The “latest” suffix has been added to all repo RPMs.

Installation

Let’s see some impacts of those changes on CentOS 7.

As usual, go to https://www.postgresql.org/download/linux/redhat/ and choose the version (11), the platform (CentOS 7) and the architecture (x86_64) you want to install.

Today, you still get the link to the pgdg-centos11-11-2 rpm.

Let’s install it:

# yum install https://download.postgresql.org/pub/repos/yum/11/redhat/rhel-7-x86_64/pgdg-centos11-11-2.noarch.rpm
Loaded plugins: fastestmirror
pgdg-centos11-11-2.noarch.rpm
Examining /var/tmp/yum-root-5eSWGp/pgdg-centos11-11-2.noarch.rpm: pgdg-redhat-repo-42.0-4.noarch
Marking /var/tmp/yum-root-5eSWGp/pgdg-centos11-11-2.noarch.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package pgdg-redhat-repo.noarch 0:42.0-4 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

========================================================================================================
 Package                 Arch        Version       Repository                      Size
========================================================================================================
Installing:
 pgdg-redhat-repo        noarch      42.0-4        /pgdg-centos11-11-2.noarch      6.8 k

Transaction Summary
======================[...]
Category: postgresql

Luca Ferrari: An article about pgenv

2019-04-17 (Wed) 09:00:00

A few months ago I worked to improve the great pgenv tool by theory. Today, I try to spread the word in the hope this tool can grow a little more!

An article about pgenv tl;dr

I proposed a talk about pgenv, a Bash tool to manage several PostgreSQL instances on the same local machine. My talk was rejected, and I hate to waste what I had already prepared, so I decided to transform my talk into an article, which was quickly accepted by the Haikin9 Devops Issue!



I should have written about this a couple of months ago, but I did not have time to.
My hope is that pgenv gets more and more users, so that it can grow and become a widely used tool someday. Quite frankly, I don't see this happening while it remains in Bash, for reasons of both portability and flexibility, and I suspect Perl is a much better fit for a more flexible implementation. However, who knows? Gathering users is also a way to gather contributors and therefore bring new ideas to this small but very useful project.

In the meantime, if you have time and will, try testing the build-from-git patch, which allows you to build and manage a development version of our beloved database.

Category: postgresql

Avinash Kumar: Fast Upgrade of Legacy PostgreSQL with Minimum Downtime Using pg_upgrade

2019-04-13 (Sat) 00:44:44

When you need to upgrade your PostgreSQL databases, there are a number of options available to you. In this post we’ll take a look at how you can upgrade PostgreSQL versions using pg_upgrade, a built-in tool that allows in-place upgrade of your software. Using pg_upgrade allows you, potentially, to minimize your downtime, an essential consideration for many organizations. It also allows you to perform a postgres upgrade with very minimal effort.

In our previous posts, we discussed various methods and tools that can help us perform a PostgreSQL upgrade – (1) pg_dumpall, (2) pg_dump and pg_restore with pg_dumpall, (3) logical replication and pglogical, and (4) slony. Methods 1 and 2 can involve additional downtime compared to the approaches taken in 3 and 4. Whilst performing an upgrade using logical replication or slony may be time consuming and require a lot of monitoring, it can be worth it if you can minimize downtime. If you have large databases that are busy with a lot of transactions, you may be better served using logical replication or slony.

This post is the fifth of our Upgrading or Migrating Your Legacy PostgreSQL to Newer PostgreSQL Versions series. These posts lead up to a live webinar, where we’ll be exploring different methods available to upgrade your PostgreSQL databases. If it’s beyond the live webinar date when you read this, you’ll find the recording at that same link.

pg_upgrade

pg_upgrade (formerly pg_migrator, until PostgreSQL 8.4) is a built-in tool that helps in upgrading a legacy PostgreSQL server to a newer version without the need for a dump and restore. The oldest version from which you can upgrade your PostgreSQL using pg_upgrade is 8.4.x. It is capable of performing faster upgrades by taking into consideration that system tables are the ones that undergo the most change between two major versions. The internal data storage format is less often affected.

In fact, in one of our tests we were able to perform an upgrade of a 2 TB database server from PostgreSQL 9.6.5 to 11.1 in less than

[...]
Category: postgresql

John Naylor: Optimizing storage of small tables in PostgreSQL 12

2019-04-10 (Wed) 14:33:17
The problem

If your database has a large number of small tables, you likely have a lot of wasted space. To demonstrate this, let’s create a table with a single record:

create table foo (str text);
insert into foo values ('a');
VACUUM foo;

Now let’s find out the path of the file containing our data, relative to the data directory:

select pg_relation_filepath('foo');

 pg_relation_filepath
----------------------
 base/16384/16405
(1 row)

(For more information on PostgreSQL files and directories, see Craig Ringer’s article on the topic.)

Notice what happens when we drop to the system command line within the data directory and list the files, adding a wild card to the path above:

$ ls base/16384/16405*
base/16384/16405  base/16384/16405_fsm  base/16384/16405_vm

The file named 16405 is the heap of our table, but what about the others? These are auxiliary files, called “relation forks” that contain additional information to help PostgreSQL access and maintain the table.

  • 16405_fsm is the free space map. Its job is to know which pages in the table have space available for inserting records.
  • 16405_vm is the visibility map. Its job is to know which heap pages may need VACUUM-ing or freezing, and also which heap pages must be visited during index-only scans.

There are other files associated with this table, but to find them we have to use a query:

select pg_relation_filepath(c.reltoastrelid) as toast_table_path,
       pg_relation_filepath(i.indexrelid) as toast_index_path
from pg_class c
left outer join pg_index i on c.reltoastrelid = i.indrelid
where c.relname = 'foo';

 toast_table_path | toast_index_path
------------------+------------------
 base/16384/16408 | base/16384/16410
(1 row)

This gives us the path of the toast table for our table, as well as that of the toast table’s index. If we insert any records with large enough strings, they will be compressed and stored here.
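
To see that in action, here is a quick sketch (the threshold and resulting size depend on your configuration): insert one large, poorly compressible value and check the toast table's size.

begin;
insert into foo
select string_agg(md5(i::text), '') from generate_series(1, 10000) i;

select pg_size_pretty(pg_relation_size(c.reltoastrelid)) as toast_size
from pg_class c
where c.relname = 'foo';

rollback;  -- side experiment only, so the single-record example stays intact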

How much disk space do we need to store our one record?

\x
select pg_relation_size(c.oid, 'main') as heap_size,
       pg_relation_size(c.oid, 'fsm') as fsm_size, [...]
Category: postgresql

Nickolay Ihalainen: Continuous Replication From a Legacy PostgreSQL Version to a Newer Version Using Slony

2019-04-10 (Wed) 00:44:40

Native streaming replication in PostgreSQL works only between servers running the same major version. We discussed logical replication in our previous blog post. In that post, we saw how logical replication could help us set up migration between two different PostgreSQL versions. However, logical replication works only for the currently supported versions of PostgreSQL, for example between PostgreSQL 9.4 and PostgreSQL 11. So what about the legacy versions that are older than 9.4? Slony-I could help us meet this replication requirement.

Replication between different PostgreSQL versions with Slony-I is useful for migration from legacy database installations to the latest available version. So what is Slony and how does it work?

This post is the fourth of our Upgrading or Migrating Your Legacy PostgreSQL to Newer PostgreSQL Versions series where we’ll be exploring different methods available to upgrade your PostgreSQL databases.

Slony

Slony is an application-level logical replication implementation for PostgreSQL. Rather, we could say that it is an external replication tool that requires a separate installation and configuration. Slony has been around for a long time. The latest version supports PostgreSQL versions from 8.4 and up to 11.

The main goal of replication is to ship changes from one database server to another. To better understand the architecture, you should know terms such as Slon, Events and Slonik in Slony-I.

An aside: Slony means elephants in Russian, and elephants are indeed reputed to have a great memory. A slightly angry, but nevertheless pretty elephant, “Slonik”, looks at you from the PostgreSQL logo image.

Slon

Slon is a daemon that runs on each PostgreSQL node in Slony-I replication. These daemons are used for processing configuration and replication events for each PostgreSQL server. Each PostgreSQL server is called a “node”. All nodes together form a Slony “cluster”.

The "publisher node" is the source for replicated changes, while "subscriber" nodes receive and apply changes from

[...]
Category: postgresql

Michael Paquier: Postgres 12 highlight - pg_checksums

2019-04-09 (Tue) 19:14:37

pg_checksums is a renaming of the tool called pg_verify_checksums, which was introduced in Postgres 11. Version 12 introduces new options and possibilities which explain the renaming, as the tool has become much more multi-purpose.

First, it is now possible to enable and disable checksums for an offline cluster:

commit: ed308d78379008b2cebca30a986f97f992ee6122
author: Michael Paquier <michael@paquier.xyz>
date: Sat, 23 Mar 2019 08:12:55 +0900

Add options to enable and disable checksums in pg_checksums

An offline cluster can now work with more modes in pg_checksums:
- --enable enables checksums in a cluster, updating all blocks with a correct checksum, and updating the control file at the end.
- --disable disables checksums in a cluster, updating only the control file.
- --check is an extra option able to verify checksums for a cluster, and the default used if no mode is specified.

When running --enable or --disable, the data folder gets fsync'd for durability, and then it is followed by a control file update and flush to keep the operation consistent should the tool be interrupted, killed or the host unplugged. If no mode is specified in the options, then --check is used for compatibility with older versions of pg_checksums (named pg_verify_checksums in v11 where it was introduced).

Author: Michael Banck, Michael Paquier
Reviewed-by: Fabien Coelho, Magnus Hagander, Sergei Kornilov
Discussion: https://postgr.es/m/20181221201616.GD4974@nighthawk.caipicrew.dd-dns.de

Here is how it works. The tool now supports three modes in total:

  • --check, the default if nothing is specified and what pg_verify_checksums was already able to do. This mode scans all the relation file blocks, reporting any mismatch.
  • --enable, which enables data checksums. This rewrites all the relation file blocks, and finishes the operation by updating the control file. Note that this can take time depending on the size of the instance, and that the tool has no parallel mode.
  • --disable, which disables data checksums by only updat
[...]
Category: postgresql

Andrew Dunstan: Buildfarm RSS feed

2019-04-09 (Tue) 07:38:23

If you’ve visited almost any web page on the PostgreSQL Build Farm server in the last few days you might have noticed that it is sporting a new RSS feed of status changes. This is similar to the information on the buildfarm-status-green-chgs mailing list, except that it carries all status changes, not just those to and from green. This new feature fulfills a long-outstanding request.

Category: postgresql

Umair Shahid: Postgres is the coolest database – Reason #1: Developers love it!

2019-04-08 (Mon) 18:50:41

PostgreSQL has been my livelihood since 2004 – so I am naturally biased in its favor. I think it is the coolest piece of software on the planet, and I am not alone.

DB-Engines

See those 2 badges up there? That’s 2 years in a row. DB-Engines monitors a total of 343 databases and their trends. For each of 2017 and 2018, DB-Engines states that PostgreSQL gained more popularity in their rankings than any other database – hence the award.

This is no small feat. DB-Engines has a complex methodology of calculating popularity including search engine queries, technical discussions on popular forums, related job offers, and social media trends. Coming out on top 2 years in a row shows how much technologists love Postgres.

Stack Overflow

You Google for the solution to a problem you are facing. Chances are the first hit you get will be a link from Stack Overflow. Millions of registered users interact hundreds of thousands of times every day, answering questions, sharing knowledge, and making technology so much easier to use for fellow enthusiasts.

Stack Overflow runs a survey each year asking the developer community about their favorite technologies. Want to guess what I am driving at?

That’s the result of over 100,000 technology enthusiasts from across the world coming together and loving PostgreSQL.

Hacker News

Hacker News, run by Y Combinator, one of the top US accelerators since 2005, is another forum that brings together technology enthusiasts from all around the world.

Do I really need to label the blue line? ;-)

The graph above shows trends over the past 8 years and compares popularity between MySQL, SQL Server, MongoDB, and our rising star, PostgreSQL. And just look at how it is rising!

Postgres is definitely the coolest database out there, and this is only reason #1 :-)

Stay tuned for more!

Category: postgresql

Shaun M. Thomas: I am Developer! (And You Can Too!)

2019-04-06 (Sat) 01:00:45

A while back, 2ndQuadrant notified a few of us that we should get more involved in Postgres Development in some capacity. Being as I’ve essentially fallen off the map in corresponding with the mailing lists in general, it would be a good way to get back into the habit.

But wait! Don’t we want more community involvement in general? Of course we do! So now is also a great opportunity to share my journey in the hopes others follow suit. Let’s go!

In the Beginning

So how does one start contributing? Do I have to know C? Must I poke around in the internals for weeks to form an understanding, however so tenuous, of the Black Magic that animates Postgres? Perhaps I should chant incantations to summon some dark entity to grant otherworldly powers necessary to comprehend the mind-bleedingly high-level conversations regularly churning within the deeply foreboding confines of the Hackers mailing list.

No.

God no.

If those were the prerequisites for getting involved, there would be approximately one person pushing Postgres forward, and his name is Tom Lane. Everyone else would be too terrified to even approach the process. I certainly count myself among those more timid individuals.

Instead, let’s start somewhere simple, and with something small. Sometimes all it takes to make a difference is to piggyback on the coattails of someone else who knows what they’re doing. If we’re too inexperienced to submit a patch, maybe we can review one instead.

Getting the Party Started (In Here)

Postgres development marches forward steadily in a form of punctuated equilibrium. Every few months, we throw a bunch of patches against the wall, and see what sticks. That cadence is currently marshaled by a master list of past and present commit fests going back to late 2014. It’s hardly an exhaustive resource dating back to antiquity, but it doesn’t need to be.

All we really need is the most recent iteration that’s in progress. At the time of this writing, that’s 2019-03.

Decisions, Decisions

Now for what might just be the most difficult portion

[...]
Category: postgresql

Ibrar Ahmed: Writing PostgreSQL Extensions is Fun – C Language

2019-04-05 (Fri) 20:58:57

PostgreSQL is a powerful open source relational database management system. It extends the SQL language with additional features. A DBMS is not only defined by its performance and out-of-the-box features, but also by its ability to support bespoke/additional user-specific functionality. Some of these functionalities may be in the form of database constructs or modules, like stored procedures or functions, but their scope is generally limited to the functionality being exposed by the DBMS. For instance, how would you write a custom query-analyzing application that resides within your DBMS?

To support such options, PostgreSQL provides a pluggable architecture that allows you to install extensions. Extensions may consist of a configuration (control) file, a combination of SQL files, and dynamically loadable libraries.
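
To make that concrete, here is a rough sketch of the non-C pieces of a trivial extension (the name my_ext and its contents are hypothetical, not taken from the article):

my_ext.control:
    comment = 'minimal example extension'
    default_version = '1.0'
    relocatable = true

my_ext--1.0.sql:
    CREATE FUNCTION my_add(int, int) RETURNS int
        AS 'SELECT $1 + $2;'
        LANGUAGE SQL IMMUTABLE;

Once built and installed, it is loaded into a database with CREATE EXTENSION my_ext;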

This means you can write your own code as per the defined guidelines of an extension and plug it into a PostgreSQL instance without changing the actual PostgreSQL code tree. An extension, by its very definition, extends what PostgreSQL can do, but more than that, it gives you the ability to interact with external entities. These external entities can be other database management systems like ClickHouse, Mongo or HDFS (normally these are called foreign data wrappers), or other interpreters or compilers (thus allowing us to write database functions in another language like Java, Python, Perl or TCL, etc.). Another potential use case for an extension is code obfuscation, which allows you to protect your super secret code from prying eyes.

Build your own

To build your own extension, you don't need the complete PostgreSQL code base. You can build and install an extension using an installed PostgreSQL (it may require you to install a devel RPM or Debian package). Details about extensions can be found in PostgreSQL's official documentation[1]. There are many extensions available for different features in the contrib directory of the PostgreSQL source. Other than the contrib directory, people are also writing extensions readily

[...]
Category: postgresql

Oleg Bartunov: Partial TOAST decompression for jsonb

2019-04-05 (Fri) 19:12:53
Inspired by the commit adding support for partial TOAST decompression:
"When asked for a slice of a TOAST entry, decompress enough to return the slice instead of decompressing the entire object."

Nikita Glukhov and I made a quick experiment to see how jsonb could benefit from this commit. The idea is simple: store short (more valuable) values before long ones. Currently, access time is independent of the key, but with support for partial decompression we can benefit keys stored near the front.

Since jsonb stores values of keys in sorted (by key) order, we generate values depending on key name.

{ "key1": "aaaa", /* 4 ^ 1 */ "key2": "aaaaaaaaaaaaaaaa", /* 4 ^ 2 = 16 */ ... "key10": "aaa ... aaa" /* 4 ^ 10 = 1M */ } create table t(jb jsonb); insert into t select ( select jsonb_object_agg('key' || i, repeat('a', pow(4, i)::int)) from generate_series(1,10) i ) from generate_series(1, 1000);

We applied partial decompression to the '->' operator and tested performance with this simple query:
select jb->'key1' from t;

The result is as expected: access time depends on the key:

  key1-key5   key7    key8     key9     key10
  10 ms       48 ms   152 ms   548 ms   2037 ms

Access time for the non-optimized operator '->>' is the same for all keys and is roughly 2000 ms.

So, this is what we can get for now. Ideally, we want the access time for all keys to equal the time of accessing the first (fastest) key; currently we have the opposite.

I hope TOAST will be improved so that we could decompress any slice using a data-type-specific algorithm.
Category: postgresql

Kaarel Moppel: Secure PostgreSQL – a reminder on various attack surfaces

2019-04-05 (Fri) 17:00:16

Needless to say, security is a topic that nobody in the wider IT industry can ignore nowadays, with a constant flow of reports on data breaches of various scales. Most such cases don't result from direct attacks against databases, though, but rather from targeting application / API / web server problems, as database servers are usually just not directly exposed to the Internet. And even if they were, PostgreSQL installations at least have a good starting point there, as the default settings are sane enough to listen only to local connections, preventing the most embarrassing kinds of headlines. But of course PostgreSQL setups can also be compromised in many ways, so here's a short listing of possible threats and some simple suggestions to mitigate them where possible.

Unnecessary PostgreSQL superuser access

This, my friends, is the #1 point of danger in my opinion, so it's an appropriate time to repeat the classics: "with great power comes great responsibility". Superuser accounts are of course needed for maintenance, so they cannot really be disabled... but mostly they are vastly overused, even for simple things like normal schema evolution, which can be (mostly) perfectly managed with standard "owner privileges". But OK, about the threats: there's the obvious one of pumping out or dropping all table data. But the most important thing to remember is that superusers can execute arbitrary commands on the OS level, under the PostgreSQL process owner's privileges. The easiest way to do that is to use the relatively unknown PROGRAM variation of the COPY command, which is also nicely documented as such. Some other methods to achieve the same are quite hidden and sneaky:

  • Achieving physical access to the machine with COPY PROGRAM

This attack assumes that the server has been set up for passwordless communication / data transfer over SSH using default settings (which is very common). The trick itself is disturbingly easy: just create a table with a single text field and execute something like this:

krl@postgres=# create table ssh(dat[...]
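
For context, the documented PROGRAM form of COPY that makes this possible looks roughly like the following generic sketch (not the author's exact commands; the table and shell commands are illustrative only):

COPY some_table FROM PROGRAM 'id';  -- runs "id" as the PostgreSQL OS user and loads its output
COPY (SELECT 'payload') TO PROGRAM 'cat >> /tmp/file_owned_by_postgres';
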
Category: postgresql

Andrew Dunstan: PostgreSQL Buildfarm Client Release 10

2019-04-05 (Fri) 06:50:34

Announcing Release 10 of the PostgreSQL Buildfarm client

Principal feature: support for non-standard repositories:

  • support multi-element branch names, such as “dev/featurename” or “bug/ticket_number/branchname”
  • provide a get_branches() method in SCM module
  • support regular expression branches of interest. This is matched against the list of available branches
  • prune branches when doing git fetch.

This feature and some server side changes will be explored in detail in my presentation at pgCon in Ottawa next month. The feature doesn’t affect owners of animals in our normal public Build Farm. However, the items below are of use to them.

Other features/ behaviour changes:

  • support for testing cross version upgrade extended back to 9.2
  • support for core Postgres changes:
    • extended support for USE_MODULE_DB
    • new extra_float_digits regime
    • removal of user table oid support
    • removal of abstime and friends
    • changed log file locations
  • don’t search for valgrind messages unless valgrind is configured
  • make detection of when NO_TEMP_INSTALL is allowed more bulletproof

There are also various minor bug fixes and code improvements.

The release can be downloaded from https://github.com/PGBuildFarm/client-code/archive/REL_10.tar.gz or https://buildfarm.postgresql.org/downloads/latest-client.tgz

Category: postgresql

Avinash Kumar: Replication Between PostgreSQL Versions Using Logical Replication

2019-04-05 (Fri) 02:21:11

When you need to upgrade PostgreSQL, there are a variety of approaches that you can use. If you need to avoid application downtime, then not all of the options to upgrade Postgres are suitable. When avoiding downtime is essential, you might consider using replication as a means of upgrade, and depending on your scenario, you can choose to approach this task using either logical replication or physical (streaming) replication. Here, we take a look at the difference between logical and physical replication in PostgreSQL. Then we explore how to accomplish an upgrade using logical replication in more detail, and by doing so, avoid application downtime. In a subsequent article, we'll investigate physical replication.

We have already discussed a few methods available to perform PostgreSQL upgrades in our previous blog posts – PostgreSQL Upgrade Using pg_dumpall and PostgreSQL Upgrade Using pg_dump/pg_restore – as part of our Upgrading or Migrating Your Legacy PostgreSQL to Newer PostgreSQL Versions series. However, both of these methods involve downtime for the application.

Types of logical replication

Here we’ll cover two types of replication you could implement:

  1. Replication between PostgreSQL 10 and 11 using built-in logical replication (sketched just after this list).
  2. Replication from PostgreSQL 9.4 (or any version below PG 11) to PostgreSQL 11 using an extension named pglogical.
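
For the built-in case (option 1), the core of the setup is a publication on the source server and a subscription on the target. A minimal sketch, assuming a hypothetical table t1 and connection string:

-- on the PostgreSQL 10 (publisher) side
CREATE PUBLICATION upgrade_pub FOR TABLE t1;

-- on the PostgreSQL 11 (subscriber) side, with the same table definition already in place
CREATE SUBSCRIPTION upgrade_sub
    CONNECTION 'host=old-server dbname=appdb user=repuser'
    PUBLICATION upgrade_pub;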

We might opt to implement replication as a method of upgrade to minimize downtime for the application. Once all the data to date has been replicated to another PostgreSQL server on the latest version, you can simply switch your application to the new server with minimal downtime, though of course this depends on the complexity of your application stack.

Logical replication in PostgreSQL allows users to perform selective replication of a few tables and to open a standby for writes, whereas physical replication in PostgreSQL is block-level replication. In that case, each database in the master is replicated to a standby, and the standby is not open for writes. Going forw

[...]
Category: postgresql

Craig Kerstiens: Postgres and superuser access

2019-04-05 (Fri) 01:56:00

A few days ago a CVE was announced for Postgres. To say this CVE is a bit overblown is an understatement. The first thing to know is you're likely completely safe. If you run on a managed service provider you are not going to be affected by this, and if you're managing your own Postgres database, chances are you are equally safe. This CVE received a note from Tom Lane on the pgsql-announce mailing list in response to it getting a broad amount of awareness and attention.

But, we thought this might be a good time to talk about a few principles and concepts that underlie how Postgres works.

For quick context, the CVE states that the copy program within Postgres, when run as superuser, will allow arbitrary code execution. The copy program within Postgres is a great tool for bulk loading of data. The utility essentially loads data to/from disk.
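
For readers who haven't used it, ordinary server-side COPY looks roughly like this (a sketch; the table and file names are hypothetical):

COPY my_table FROM '/path/to/data.csv' WITH (FORMAT csv);
COPY my_table TO '/path/to/export.csv' WITH (FORMAT csv);

Note that even this plain file form already requires elevated privileges on the server.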

Shifting to the Postgres superuser... Postgres is an extremely powerful database, well, really it is more of a data platform. As superuser you can perform all types of powerful operations. Those can include things that touch the underlying system itself, or enabling extensions which allow for low-level system access. These types of operations should always be treated with care and caution.

In Tom’s email to the mailing list he states:

We encourage all users of PostgreSQL to follow the best practice that is to never grant superuser access to remote or otherwise untrusted users. This is a standard security operating procedure that is followed in system administration and extends to database administration as well.

If you’re handing out superuser access to your database or connecting to your application with a superuser role, consider changing that immediately.

Superuser essentially has the full power of the system user it is running as. Postgres does not allow running itself as the root user for this very reason. We actually commonly receive the request to grant super user access for customers running on Citus Cloud. Despite it being commonly requested it is not something we s

[...]
Category: postgresql

Laurenz Albe: count(*) made fast

2019-04-03 (Wed) 17:00:28
© Laurenz Albe 2019

 

It is a frequent complaint that count(*) is so slow on PostgreSQL.

In this article I want to explore the options you have to get your result as fast as possible.

Why is count(*) so slow?

Most people have no trouble understanding that the following is slow:

SELECT count(*) FROM /* complicated query */;

After all, it is a complicated query, and PostgreSQL has to calculate the result before it knows how many rows it will contain.

But many people are appalled if the following is slow:

SELECT count(*) FROM large_table;

Yet if you think again, the above still holds true: PostgreSQL has to calculate the result set before it can count it. Since there is no “magical row count” stored in a table (like it is in MySQL’s MyISAM), the only way to count the rows is to go through them.

So count(*) will normally perform a sequential scan of the table, which can be quite expensive.
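
You can see this in the execution plan (a sketch; the exact plan shape depends on your version and settings):

EXPLAIN SELECT count(*) FROM large_table;
-- typically an Aggregate node on top of a (possibly parallel) Seq Scan,
-- i.e. the whole table is read just to produce the count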

Is the “*” in count(*) the problem?

The “*” in SELECT * FROM ... is expanded to all columns. Consequently, many people think that using count(*) is inefficient and that it should be written as count(id) or count(1) instead. But the “*” in count(*) is quite different: it just means “row” and is not expanded at all.

Writing count(1) is the same as count(*), but count(id) is something different: It will only count the rows where id IS NOT NULL, since most aggregates ignore NULL values.
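
A quick illustration of that difference on a throwaway table:

CREATE TABLE count_demo (id integer);
INSERT INTO count_demo VALUES (1), (2), (NULL);
SELECT count(*), count(1), count(id) FROM count_demo;
-- returns 3, 3, 2: only count(id) skips the row where id IS NULL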

So there is nothing to be gained by avoiding the “*”.

Using an index only scan

It is tempting to scan a small index rather than the whole table to count the number of rows.
However, this is not so simple in PostgreSQL because of its multi-version concurrency control strategy. Each row version (“tuple”) contains the information about which database snapshots it is visible to. But this information is not (redundantly) stored in the indexes. So it usually isn't enough to count the entries in an index, because PostgreSQL has to visit the table entry (“heap tuple”) to make sure an index entry is visible.

To mitigate this problem, PostgreSQL has introduced the visibility ma

[...]
Category: postgresql

Magnus Hagander: When a vulnerability is not a vulnerability

2019-04-03 (Wed) 04:39:50

Recently, references to a "new PostgreSQL vulnerability" have been circulating on social media (and maybe elsewhere). It's even got its own CVE entry. The origin appears to be a blog post from Trustwave.

So is this actually a vulnerability? (Hint: it's not) Let's see:

Category: postgresql

Doug Hunley: Enhancing Your PostgreSQL 10 Security with the CIS Benchmark

2019-04-02 (Tue) 23:59:37

Crunchy Data has recently announced an update to the CIS PostgreSQL Benchmark by the Center for Internet Security, a nonprofit organization that provides publications around standards and best practices for securing technology systems. This newly published CIS PostgreSQL 10 Benchmark joins the existing CIS Benchmarks for PostgreSQL 9.5 and 9.6 while continuing to build upon Crunchy Data's efforts with the PostgreSQL Security Technical Implementation Guide (PostgreSQL STIG).

What is a CIS Benchmark?

As mentioned in an earlier blog post, a CIS Benchmark is a set of guidelines and best practices for securely configuring a target system. The benchmark contains a series of recommendations that help test the security of the system: some of the recommendations are "scored" (where a top score of 100 is the best), while others are provided to establish best practices for security.

Category: postgresql

Dave Conlin: Postgres indexes for absolute beginners

2019-04-02 (Tue) 22:45:00

Indexes are really important for Postgres performance, but they’re often misunderstood and misapplied. This post aims to give you a good grounding in indexes to avoid a lot of beginner mistakes.

Step one: understand what you want to achieve

Because indexes are such a powerful tool, a new index is often viewed as “the answer” to whatever performance problems people are experiencing. Wading straight in and creating an index for every sequential scan in sight is the simplest thing to do, but indexes have costs as well as benefits.

Not only do indexes take up memory, they raise the cost of writing to the table in question. Any speed-up an index may provide for reads isn’t free — it’s offset by more work to keep the index up to date when the data in the table change. So an unused index isn’t just useless — it’s actively harmful to your database’s performance.

First, take the time to understand which bits of your query are running slowly (use the query plan), make a hypothesis as to why they’re slow, and then validate that hypothesis by attempting to speed them up.
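
For example, you might inspect the plan first and only then test an index (a sketch; the table orders and column customer_id are hypothetical):

EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;
-- a Seq Scan that filters away most rows suggests an index is worth trying
CREATE INDEX ON orders (customer_id);
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;
-- re-check: if the index helps, the plan switches to an Index or Bitmap Scan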

In order to understand when the answer might be an index, it’s important to understand the difference between sequential scans and index scans in Postgres.

Sequential scans

Sequential scans are the simplest, most obvious way of reading data from a table. Postgres jumps to the first block of memory (“page”) that holds rows for the table in question and reads in all the data, row by row, page by page, and passes it on.

Sequential scans can get a bit of a bad rap. One of the things we often hear from people when we ask them about their current performance analysis is “the first thing I do is look for sequential scans”.

It’s true that using an index on a table can make a big difference to query performance, but it’s also true that if your query just needs to get all of the data from a table in an unordered mass of rows, then things aren’t going to get any more efficient than just reading those rows in directly from consecutive pages.

Index scans

An index is jus

[...]
Category: postgresql
