Planet PostgreSQL

Bruce Momjian: Postgres 12 Features Presentation

2019-06-23 (Sun) 19:00:01

Now that I have given a presentation about Postgres 12 features in Ibiza, I have made my slides available online.

Category: postgresql

Avinash Kumar: Hypothetical Indexes in PostgreSQL

2019-06-22 (Sat) 03:02:17

At recent conferences, we have received a curious question from users who have used Oracle in the past and are now using PostgreSQL: “Do we have hypothetical indexes in PostgreSQL?”. The answer to that question is YES. The true meaning of hypothetical is imaginary but not real: we can create indexes that are visible to the PostgreSQL optimizer but don’t exist in reality. Let’s look at some detailed examples.

How are hypothetical indexes helpful?

In most relational databases, the execution time of an SQL statement depends on the cost of its execution plan. The optimizer chooses the plan with the lowest cost, and proceeds to the execute and fetch phases using that plan. One of the easiest ways to optimize a well-written SQL statement is with appropriate indexes suitable for that query. An index may reduce the number of pages fetched from disk and may live in the cache due to its size (<<< table size). So, indexing is always a low-hanging fruit for admins and developers wishing to tune an SQL statement.

But often developers wish to see whether an index can really be helpful. The only option available is to create the index and check whether the queries are able to use it with a reduced cost. Creating the index may be fine in a test environment, or on a table that is not huge in size, but if for some reason you wish to see whether an index can help an SQL statement without actually creating it, then you can try this extension to create hypothetical indexes.
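
Once the extension (covered below) is installed, the workflow looks roughly like this (a minimal sketch of my own; the orders table and column are hypothetical):

CREATE EXTENSION hypopg;
SELECT * FROM hypopg_create_index('CREATE INDEX ON orders (customer_id)');
-- hypothetical indexes are visible only to plain EXPLAIN in the current session:
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
SELECT hypopg_reset();  -- discard all hypothetical indexes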

PostgreSQL Extension for hypothetical indexes

In order to achieve the functionality of creating imaginary indexes in PostgreSQL, we need to use an extension named hypopg. This extension is made available in the PGDG repository from PostgreSQL 9.2 onwards and works up to the latest active release, PostgreSQL 11.3. To install it, you may just use yum or apt depending on your Linux distro, or compile it from source.

Installing hypopg

In RedHat/CentOS

# yum install hypopg

For Debian/Ubuntu, you must make sure you have the development p

[...]
Category: postgresql

Vasilis Ventirozos: CVE-2019-10164 Who's affected and how to protect your systems.

2019-06-22 (Sat) 02:01:00

Yesterday, the 20th of June, the Postgres community released minor version updates for all supported versions (11.4, 10.9, 9.6.14, 9.5.18, 9.4.23, 12 Beta 2).
As with any minor version, it is recommended to upgrade and keep your database at the latest minor version. But this release is a bit more important
than others because it includes a fix for a recently discovered CVE (CVE-2019-10164).
From the release notes:

"This release is made outside of the normal update release schedule as the security
vulnerability was determined to be critical enough to distribute the fix as quickly as possible.
Users who are running PostgreSQL 10, PostgreSQL 11, or the PostgreSQL 12 beta should upgrade as soon as possible."

In this post I'll talk about this CVE: what it's all about, who is affected, and how to protect your systems against it.

What is this all about? An authenticated database user could change their own password to a specially crafted one that could crash your database server or,
in the worst case, make the operating system user who owns the postgres process execute arbitrary code. This is a bug in the SCRAM authentication mechanism,
which is why it only affects Postgres versions 10 and above. In reality, on some operating systems, for example Ubuntu, the default compiler options
should prevent code execution (see here), and since an attack can only come from a user who is already authenticated, such a user with enough privileges
could still come up with an SQL statement that could DoS or even bring the service down.
Of course, this doesn't mean that you shouldn't upgrade if you trust your users; you definitely should.
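
After upgrading, a quick sanity check that each server actually runs a patched release (a minimal sketch of my own; the exact version string varies by branch):

SHOW server_version;  -- e.g. should report 11.4 or later on the 11 branch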

The latest pg versions also fix a second attack in libpq, a library that is used extensively by almost all common Postgres clients,
psql, psycopg2, and pgAdmin 4 included. In this attack, a user connecting from a machine that runs software using libpq,
say psycopg2 or psql, could crash the client or run arbitrary code by exploiting the SCRAM authentication process.
In this case the exploitation will happen on the[...]
Category: postgresql

Michael Paquier: Postgres 12 highlight - SQL/JSON path

2019-06-21 (Fri) 12:58:34

Postgres ships in-core data types for JSON with specific functions and operators (json since 9.2, and jsonb, which is a binary representation, since 9.4). The upcoming Postgres 12 is becoming more compliant with the SQL specifications by introducing the SQL/JSON path language, added mainly by the following commit:

commit: 72b6460336e86ad5cafd3426af6013c7d8457367
author: Alexander Korotkov <akorotkov@postgresql.org>
date: Sat, 16 Mar 2019 12:15:37 +0300

Partial implementation of SQL/JSON path language

SQL 2016 standards among other things contains set of SQL/JSON features
for JSON processing inside of relational database. The core of SQL/JSON
is JSON path language, allowing access parts of JSON documents and make
computations over them. This commit implements partial support JSON path
language as separate datatype called "jsonpath". The implementation is
partial because it's lacking datetime support and suppression of numeric
errors. Missing features will be added later by separate commits.

Support of SQL/JSON features requires implementation of separate nodes,
and it will be considered in subsequent patches. This commit includes
following set of plain functions, allowing to execute jsonpath over
jsonb values:

* jsonb_path_exists(jsonb, jsonpath[, jsonb, bool]),
* jsonb_path_match(jsonb, jsonpath[, jsonb, bool]),
* jsonb_path_query(jsonb, jsonpath[, jsonb, bool]),
* jsonb_path_query_array(jsonb, jsonpath[, jsonb, bool]),
* jsonb_path_query_first(jsonb, jsonpath[, jsonb, bool]).

This commit also implements "jsonb @? jsonpath" and "jsonb @@ jsonpath",
which are wrappers over jsonpath_exists(jsonb, jsonpath) and
jsonpath_predicate(jsonb, jsonpath) correspondingly. These operators
will have an index support (implemented in subsequent patches).

Catversion bumped, to add new functions and operators.

Code was written by Nikita Glukhov and Teodor Sigaev, revised by me.
Documentation was written by Oleg Bartunov and Liudmila Mantrova.
The work was inspired by Oleg Bartunov.

Discussion: https://postgr.es/m/[...]
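
These functions and operators can be tried out directly; a quick sketch against a throwaway jsonb value (my own example, not from the post):

SELECT jsonb_path_query('{"a": [1, 2, 3, 4, 5]}', '$.a[*] ? (@ >= 3)');
-- returns each matching element: 3, 4, 5
SELECT '{"a": [1, 2]}'::jsonb @? '$.a[*] ? (@ > 1)';
-- returns true: at least one element matches
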
Category: postgresql

Jonathan Katz: Explaining CVE-2019-10164 + PostgreSQL Security Best Practices

2019-06-21 (Fri) 11:30:49

The PostgreSQL Global Development Group published an out-of-cycle update release for all supported versions to provide a fix for the CVE-2019-10164 vulnerability. This vulnerability only affects people running PostgreSQL 10, 11, or the 12 beta, and it is effectively remediated by simply upgrading all of your PostgreSQL installations to the latest versions.

What follows is some more insight into what this vulnerability is, the impact it can have in your environment, how to ensure you have patched all of your systems, and some PostgreSQL security best practices that could help mitigate the impact of this kind of vulnerability.

Category: postgresql

Hubert 'depesz' Lubaczewski: Changes on explain.depesz.com

2019-06-18 (Tue) 20:54:00
I recently got two bug reports:

- plans with “COSTS OFF” do not parse, and error out (bug report by Marc Dean Jr)
- WorkTable Scan is not properly parsed (bug report by Ivan Vergiliev)

Additionally, I was kinda upset because plans that include trigger calls did not display properly. All of this has been fixed today. First, I fixed [...]
Category: postgresql

Granthana Biswas: Install PostgreSQL 9.6 with Transparent Data Encryption

2019-06-18 (Tue) 17:00:14
Cluster encryption can be used if the DBA cannot, or does not want to, rely on the file system for confidentiality. If this feature is enabled, PostgreSQL encrypts data (both relations and write-ahead log) when writing to disk, and decrypts it when reading. The encryption is transparent, so applications see no difference between encrypted and unencrypted clusters.

PostgreSQL 9.6 with TDE on Ubuntu

In this blog, we go through the basic steps used to install PostgreSQL 9.6 with Transparent Data Encryption (TDE) on Ubuntu. You can download the patch for this here.

Create the data directory for PostgreSQL. As an example, I am creating it at the default location:

sudo mkdir -p /usr/local/pgsql/data
sudo chown postgres:postgres /usr/local/pgsql/data

Install the libraries for readline, bison, flex, openssl, zlib and crypto:

sudo apt-get install libreadline8 libreadline-dev zlibc zlib1g-dev bison flex libssl-dev openssl

Run configure from the source code directory:

You can choose where all files will be installed by passing --prefix. The default is /usr/local/pgsql, which I am using here as an example. Make sure you enable OpenSSL by passing --with-openssl:

sudo ./configure --prefix=/usr/local/pgsql --with-openssl
sudo make

The above two commands should run without errors. Now we are ready to install:

sudo make install

We can now proceed to initialize the cluster. For that, let’s switch to the postgres user:

sudo su - postgres

As a good practice, let's add the PostgreSQL binaries to PATH:

export PATH=$PATH:/usr/local/pgsql/bin

To create an encrypted cluster, pass the -K option to the initdb utility. For example:

initdb -D /usr/local/pgsql/data -K/usr/local/pgsql/keypass

Here /usr/local/pgsql/keypass is an executable file that returns either an encryption key or an encryption password with the appropriate prefix. In this case, we are supplying an encryption password of 8-16 characters through a simple executable file that outputs:

encryption_password=<8-16_passphrase>

$ chmod 755 /usr/local/pgsql/keypass
$ c

[...]
Category: postgresql

pgCMH - Columbus, OH: What’s new in pgBouncer

2019-06-17 (Mon) 13:00:00

The June meeting will be held at 18:00 EST on Tues, the 25th. Once again, we will be holding the meeting in the community space at CoverMyMeds. Please RSVP on MeetUp so we have an idea on the amount of food needed.

What

CoverMyMeds’ very own CJ will be presenting this month, telling us about what’s new and improved in pgBouncer, as well as how to get it up and running. Discussion will include real-life examples from its use at CMM. pgBouncer is the lightweight connection pooler for PostgreSQL.

Where

CoverMyMeds has graciously agreed to validate your parking if you use their garage so please park there:

You can safely ignore any sign saying to not park in the garage as long as it’s after 17:30 when you arrive.

Park in any space that is not marked ‘24 hour reserved’.

Once parked, take the elevator/stairs to the 3rd floor to reach the Miranova lobby. Once in the lobby, the elevator bank is in the back (West side) of the building. Take a left and walk down the hall until you see the elevator bank on your right. Grab an elevator up to the 11th floor. (If the elevator won’t let you pick the 11th floor, contact Doug or CJ (info below)). Once you exit the elevator, look to your left and right; one side will have visible cubicles, the other won’t. Head to the side without cubicles. You’re now in the community space:

The kitchen is to your right (grab yourself a drink) and the meeting will be held to your left. Walk down the room towards the stage.

If you have any issues or questions with parking or the elevators, feel free to text/call Doug at +1.614.316.5079 or CJ at +1.740.407.7043

Category: postgresql

Avinash Kumar: Bloom Indexes in PostgreSQL

2019-06-15 (Sat) 04:29:56

There is a wide variety of indexes available in PostgreSQL. While most are common to almost all databases, some types of indexes are more specific to PostgreSQL. For example, GIN indexes are helpful to speed up the search for element values within documents. GIN and GiST indexes can both be used to make full-text searches faster, whereas BRIN indexes are more useful when dealing with large tables, as they only store summary information for ranges of pages. We will look at these indexes in more detail in future blog posts. For now, I would like to talk about another of the special indexes, one that can speed up searches on a table that has a huge number of columns and is massive in size. And that is called a bloom index.
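
To make this concrete, here is a minimal sketch of creating one (the bloom extension ships with PostgreSQL's contrib; the table and columns are hypothetical):

CREATE EXTENSION bloom;
CREATE INDEX blm_idx ON orders USING bloom (customer_id, product_id, region_id);
-- the index can then serve equality searches on any subset of these columns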

In order to understand the bloom index better, let’s first understand the bloom filter data structure. I will try to keep the description as short as I can, so that we can discuss more about how to create this index and when it will be useful.

Most readers will know that an array in computer science is a data structure consisting of a collection of elements, while a bit, or binary digit, is the smallest unit of data, represented as either 0 or 1. A bloom filter is a bit array of m bits that are all initially set to 0.

A bit array is an array that stores a certain number of bits (0s and 1s). It is one of the most space-efficient data structures for testing whether an element is in a set or not.
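
As a reference point (a textbook property of bloom filters, not from this excerpt): with m bits, k hash functions, and n inserted elements, the false-positive probability is approximately

p \approx \left(1 - e^{-kn/m}\right)^{k}

so m and k can be tuned toward a target error rate.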

Why use bloom filters?

Let’s consider some alternatives, such as a list data structure and hash tables. With a list, we need to iterate through each element to search for a specific one. We can also maintain a hash table where each element in the list is hashed, and then check whether the hash of the element we are searching for matches a hash in the table. But checking through all the hashes may cost a higher order of magnitude than expected, and if there is a hash collision, linear probing is done, which may be time-consuming. When we

[...]
Category: postgresql

Luca Ferrari: A recursive CTE to get information about partitions

2019-06-12 (Wed) 09:00:00

I was wondering about writing a function that provides a quick status about partitioning. But wait, PostgreSQL has recursive CTEs!

A recursive CTE to get information about partitions

I’m used to partitioning; it allows me to quickly and precisely split data across different tables. PostgreSQL 10 introduced native partitioning, and since then I’ve been using native partitioning over inheritance whenever possible.
But how to get a quick overview of the partition status? I mean, knowing which partition is growing the most?
At first I was thinking of writing a function for the task, and quickly found myself iterating recursively over pg_inherits, the table that links partitions to their parents. But the keyword here is recursively: PostgreSQL provides recursive Common Table Expressions, and a quick search revealed I was right: it is possible to do it with a single CTE. Taking inspiration from this mailing list message, here is a simple CTE to get the partition status (you can find it in my GitHub repository):

WITH RECURSIVE inheritance_tree AS (
    SELECT c.oid            AS table_oid
         , c.relname        AS table_name
         , NULL::text       AS table_parent_name
         , c.relispartition AS is_partition
    FROM pg_class c
    JOIN pg_namespace n ON n.oid = c.relnamespace
    WHERE c.relkind = 'p'
      AND c.relispartition = false
    UNION ALL
    SELECT inh.inhrelid     AS table_oid
         , c.relname        AS table_name
         , ...
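
As a side note of my own (not from the post): on PostgreSQL 12 and later, a similar overview can presumably be had without a CTE via the pg_partition_tree() function:

SELECT relid::regclass, parentrelid::regclass, isleaf, level
FROM pg_partition_tree('parent_table');  -- 'parent_table' is a hypothetical partitioned table
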
Category: postgresql

Jeff McCormick: What's New in Crunchy PostgreSQL Operator 4.0

2019-06-12 (Wed) 00:27:48

Crunchy Data is pleased to release PostgreSQL Operator 4.0.

Category: postgresql

Hans-Juergen Schoenig: Tech preview: How PostgreSQL 12 handles prepared plans

2019-06-11 (Tue) 17:00:50

PostgreSQL 12 is just around the corner, and therefore we already want to present some of the new features we like. One important new feature gives users and devops the chance to control the behavior of the PostgreSQL optimizer. Prepared plans are always a major concern (people moving from Oracle seem to be especially concerned), and therefore it makes sense to discuss the way plans are handled in PostgreSQL 12.

Firing up a PostgreSQL test database

To start I will create a simple table consisting of just two fields:

db12=# CREATE TABLE t_sample (id serial, name text);
CREATE TABLE

Then some data is loaded:

db12=# INSERT INTO t_sample (name) SELECT 'hans' FROM generate_series(1, 1000000);
INSERT 0 1000000
db12=# INSERT INTO t_sample (name) SELECT 'paul' FROM generate_series(1, 2);
INSERT 0 2

Note that 1 million names are identical (“hans”) and just two people are called “paul”. The distribution of data is therefore quite special, which has a major impact as you will see later in this post.

To show how plans can change depending on the setting, an index on “name” is defined as shown in the next listing:

db12=# CREATE INDEX idx_name ON t_sample (name);
CREATE INDEX

The PostgreSQL query optimizer at work

Let us run a simple query and see what happens:

db12=# explain SELECT count(*) FROM t_sample WHERE name = 'hans';
                                      QUERY PLAN
------------------------------------------------------------------
 Finalize Aggregate  (cost=12656.23..12656.24 rows=1 width=8)
   ->  Gather  (cost=12656.01..12656.22 rows=2 width=8)
         Workers Planned: 2
         ->  Partial Aggregate  (cost=11656.01..11656.02 rows=1 width=8)
               ->  Parallel Seq Scan on t_sample  (cost=0.00..10614.34 rows=416668 width=0)
                     Filter: (name = 'hans'::text)
(6 rows)

In this case PostgreSQL decided to ignore the index and go for a sequential scan. It has even noticed that the table is already quite large and opted for a parallel query. Still, what we see is a sequential scan. All data in the table has to be

[...]
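
The excerpt cuts off here. For context (my own addition, hedged): the optimizer control the post builds toward is presumably PostgreSQL 12's new plan_cache_mode setting, which governs whether prepared statements use generic or custom plans. A minimal sketch under that assumption:

PREPARE count_by_name (text) AS
    SELECT count(*) FROM t_sample WHERE name = $1;
SET plan_cache_mode = 'force_custom_plan';  -- or 'force_generic_plan' / 'auto' (the default)
EXECUTE count_by_name('paul');
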
Category: postgresql

Luca Ferrari: FizzBuzz (in both plpgsql and SQL)

2019-06-11 (Tue) 09:00:00
Category: postgresql

Luca Ferrari: Checking the sequences status on a single pass

2019-06-11 (Tue) 09:00:00

It is quite simple to wrap a couple of queries in a function to have a glance at all the sequences and their cycling status.

Checking the sequences status on a single pass

The catalog pg_sequence keeps track of the definition of each sequence, including its increment value and boundaries. Combined with pg_class and a few other functions, it is possible to create a very simple administrative function to keep track of the overall sequence status.

I’ve created a seq_check() function that provides an output as follows:

testdb=# select * from seq_check() ORDER BY remaining;
        seq_name        | current_value |    lim     | remaining
------------------------+---------------+------------+------------
 public.persona_pk_seq  |       5000000 | 2147483647 |     214248
 public.root_pk_seq     |         50000 | 2147483647 | 2147433647
 public.students_pk_seq |             7 | 2147483647 | 2147483640
(3 rows)

As you can see, the function provides the current value of the sequence, the maximum value (limit), and how many values the sequence can still provide before it overflows or cycles. For example, persona_pk_seq has only 214248 values left to provide. Combined with its current value of 5000000, this hints that the sequence probably has a too-large increment.

The code of the function is as follows:

CREATE OR REPLACE FUNCTION seq_check()
RETURNS TABLE( seq_name text, current_value bigint, lim...
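
The function body is truncated above; for a rough idea of the kind of query such a function can wrap, here is a minimal sketch of my own based on the pg_sequences view (PostgreSQL 10+), not the author's exact code:

SELECT schemaname || '.' || sequencename       AS seq_name
     , last_value                              AS current_value
     , max_value                               AS lim
     , (max_value - last_value) / increment_by AS remaining
FROM pg_sequences;
-- note: last_value is NULL for sequences that have never been used
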
Category: postgresql

Paul Ramsey: Parallel PostGIS and PgSQL 12 (2)

2019-06-08 (Sat) 01:00:00

In my last post I demonstrated that PostgreSQL 12 with PostGIS 3 will provide, for the first time, automagical parallelization of many common spatial queries.

This is huge news, as it opens up the possibility of extracting more performance from modern server hardware. Commenters on the post immediately began conjuring images of 32-core machines reducing their query times to milliseconds.

So, the next question is: how much more performance can we expect?

To investigate, I acquired a 16-core machine on AWS (m5d.4xlarge), and installed the current development snapshots of PostgreSQL and PostGIS, the code that will become versions 12 and 3 respectively when released in the fall.

How Many Workers?

The number of workers assigned to a query is determined by PostgreSQL: the system looks at a given query, and the size of the relations to be processed, and assigns workers proportional to the log of the relation size.
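
Concretely (my own illustration, assuming the default thresholds): the planner adds one worker each time the relation size triples past min_parallel_table_scan_size.

SHOW min_parallel_table_scan_size;  -- 8MB by default
-- roughly: ~8MB -> 1 worker, ~24MB -> 2, ~72MB -> 3, ~216MB -> 4, ...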

For parallel plans, the “explain” output of PostgreSQL will include a count of the number of workers planned and assigned. That count is exclusive of the leader process, and the leader process actually does work outside of its duties in coordinating the query, so the number of CPUs actually working is more than the num_workers, but slightly less than num_workers+1. For these graphs, we’ll assume the leader fully participates in the work, and that the number of CPUs in play is num_workers+1.

Forcing Workers

PostgreSQL’s automatic calculation of the number of workers could be a blocker to performing analysis of parallel performance, but fortunately there is a workaround.

Tables support a “storage parameter” called parallel_workers. When a relation with parallel_workers set participates in a parallel plan, that value overrides the automatically calculated number of workers.

ALTER TABLE pd SET (parallel_workers = 8);
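
Note that the assigned workers are still capped by the global parallelism settings; a hedged sketch of the limits typically involved (these are standard GUCs, though the exact values used here are in the truncated setup section below):

SET max_parallel_workers_per_gather = 8;  -- per-query cap
SET max_parallel_workers = 16;            -- cluster-wide cap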

In order to generate my data, I re-ran my queries, upping the number of parallel_workers on my tables for each run.

Setup

Before running the tests, I set all the global limits o

[...]
Category: postgresql

Bruce Momjian: Exploring Postgres Tips and Tricks

2019-06-07 (Fri) 07:00:01

I did a webinar two weeks ago titled, "Exploring Postgres Tips and Tricks." The slides are now online, as well as a video recording. I wasn't happy with the transition I used from the PDF to the blog entries, but now know how to improve that next time.

I think I might do more of these by expanding on some of the topics I covered, like psql and monitoring. Also, a new video is available of the sharding presentation I mentioned previously.

Category: postgresql

Álvaro Hernández: PostgreSQL Ibiza: 2 weeks to go

2019-06-07 (Fri) 00:37:11
A five-day PostgreSQL networking experience, embedding a 2-day conference. Just 2 weeks to go for PostgreSQL Ibiza: the new, innovative PostgreSQL conference that happens 50m away from the beach. The conference for thinkers, for networking, for partnering. The conference to be at. But a conference is nothing without great content. And after receiving more than 71 talk submissions, and after the hard work the Committee has done to select the talks, PostgreSQL Ibiza will have top-notch talks from top-notch international speakers.
Category: postgresql

Robert Treat: The Lost Art of plpgsql

2019-06-06 (Thu) 14:12:00

One of the big features talked about when PostgreSQL 11 was released was the new stored procedure implementation. This gave Postgres a more standard procedure interface compared to the previous use of functions. It is particularly useful for folks doing database migrations who may have been using the standard CALL syntax vs. Postgres' traditional SELECT function(); syntax. So it struck me as odd earlier this year when I noticed that, despite the hoopla, a year later there was almost nothing in the way of presentations and blog posts on either the new stored procedure functionality or the use of plpgsql in general.
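
For readers who haven't seen the new interface, a minimal sketch of the difference (my own example, not from the talk):

-- PostgreSQL 11+: a stored procedure, invoked with the standard CALL syntax
CREATE PROCEDURE log_event(msg text)
LANGUAGE plpgsql
AS $$
BEGIN
    RAISE NOTICE 'event: %', msg;
END;
$$;

CALL log_event('hello');
-- a function, by contrast, would be invoked as SELECT some_function('hello');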

And so I got the idea that maybe I would write such a talk and present it at PGCon; a nod to the past and the many years I've spent working with plpgsql in a variety of roles. The committee liked the idea (disclosure: I am on the PGCon committee, but didn't advocate for myself) and so this talk was born. For a first-time talk I think it turned out well, though it could definitely use some polish; but I'm happy that it did help spark some conversation and has actually given me a few items worth following up on, hopefully in future blog posts.

Video should be available in a few weeks, but for now, I've gone ahead and uploaded the slides on slideshare.

Category: postgresql

Robert Haas: The Contributors Team

2019-06-06 (Thu) 02:37:00
Recently, the PostgreSQL project spun up a "contributors" team, whose mission is to ensure that the PostgreSQL contributors list is up-to-date and fair. The contributors page has a note which says "To suggest additions to the list, please email contributors@postgresql.org."  The current members of the team are Dave Page, Stephen Frost, Vik Fearing, and me.

Category: postgresql

Mark Wong: PDXPUG June Meetup: Accessing Postgres with Java

2019-06-06 (Thu) 01:06:21

When: 6-8pm Thursday June 20, 2019
Where: PSU Business Accelerator (Parking is open after 5pm.)
Who: Will McLean

Following the presentations on accessing Postgres from Python and Scala, I will lead a discussion on accessing Postgres with Java. I'll start with a JDBC tutorial and finish by adding data access to a Spring Boot webapp.

I have twenty years' experience in e-commerce applications, the last eight here in Portland, mostly at Nike. For the last few years everything has been moving to Amazon RDS Postgres; that's a trend PDXPUG can get behind! I am currently working for Navis on CRM applications for the hospitality industry.

Category: postgresql
