Dan Langille: PostgreSQL – convert a user to a role

planet postgresql - 2018-08-10 (Fri) 07:03:46

Users and Roles in PostgreSQL are very similar. When I set up the FreshPorts database back in mid-2000, I was using PostgreSQL 7.0.3 (that’s my best guess based on my blog entry). I suspect roles were not available then and were introduced with PostgreSQL 8. I am positive someone will correct me if that’s wrong.

I now have a need to convert a user into a role, then add users to that role. Let’s see what happens.

I’m doing this on my development server, so there’s no concurrent access issue. I’ll just turn stuff off (disable the webserver, the scripts, etc).

Creating the new users

begin;
ALTER ROLE www NOLOGIN;
CREATE USER www_dev WITH LOGIN PASSWORD '[redacted]' IN ROLE www;
CREATE USER www_beta WITH LOGIN PASSWORD '[redacted]' IN ROLE www;

That went well, so I issued a COMMIT.

The two new users will have the same permissions as the original user.
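
To confirm the new memberships, the system catalogs can be queried directly. A quick sketch (this verification step is mine, not from the original post):

SELECT r.rolname AS role, m.rolname AS member
  FROM pg_auth_members am
  JOIN pg_roles r ON r.oid = am.roleid
  JOIN pg_roles m ON m.oid = am.member
 WHERE r.rolname = 'www';

Both www_dev and www_beta should show up as members of www.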

Changing the login

The login credentials will need to be changed. This is my update:

# $db = pg_connect("host=pg02.example.org dbname=freshports user=www password=oldpassword sslmode=require");
$db = pg_connect("host=pg02.example.org dbname=freshports user=www_beta password=newpassword sslmode=require");

Access rights

I also updated pg_hba.conf for this server.

#hostssl freshports www      10.0.0.1/32 md5
hostssl  freshports www_beta 10.0.0.1/32 md5

After changing pg_hba.conf, you have to tell PostgreSQL about it. This is the FreeBSD command for that:

sudo service postgresql reload

It just worked

I was impressed with how straightforward this was. https://beta.freshports.org/ came right up.

I have three other users to convert to roles but if it’s as easy as the above, I should be finished in time for dinner.

Category: postgresql

Álvaro Herrera: Talk slides: Partitioning Improvements in PostgreSQL 11

planet postgresql - 2018-08-10 (Fri) 05:58:45

I spent a couple of days in São Paulo, Brazil last week, for the top-notch PGConf.Brazil 2018 experience. This year I gave a talk about improvements in the declarative partitioning area in the upcoming PostgreSQL 11 release — a huge step forward from what PostgreSQL 10 offers. We have some new features, some DDL handling enhancements, and some performance improvements, all worth checking out.

I’m told that the organization is going to publish video recordings at some point; for the time being, here are my talk slides.

I’m very happy that they invited me to talk once again in Brazil. I had a great time there, even if they won’t allow me to give my talk in Spanish! Like every time I go there, I regret it once it’s time to come home, because it’s so easy to feel at home with the Brazilian gang. Next time, I promise I’ll make time for Sunday’s churrasco.

For our amusement, they posted a teaser video to share the community conference spirit.

Category: postgresql

Christophe Pettus: Three Steps to pg_rewind Happiness

planet postgresql - 2018-08-10 (Fri) 02:06:52

pg_rewind is a utility included with PostgreSQL since 9.5. It’s used to “rewind” a server so that it can be attached as a secondary to a primary. The server being rewound could be the former primary of the new primary, or a secondary that was a peer of the new primary.

In pg_rewind terminology, and in this post, the “source” server is the new primary that the old server is going to be attached to, and the “target” is the server that will be attached to the source as a secondary.

Step One: Have a WAL Archive

While pg_rewind does not require that you have a WAL archive, you should have one. pg_rewind works by “backing up” the target server to a state before the last shared checkpoint of the two servers. Then, when the target starts up, it uses WAL information to replay itself to the appropriate point at which it can connect as a streaming replica to the source. To do that, it needs the WAL information from the rewind point onwards. Since the source had no reason to “know” that it would be used as a primary, it may not have enough WAL information in its pg_xlog / pg_wal directory to bring the target up to date. If it doesn’t, you are back to rebuilding the new secondary, the exact situation that pg_rewind is meant to avoid.

Thus, make sure you have a WAL archive that the target can consult as it is coming up.
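
Before relying on the archive, it is worth confirming that archiving is actually configured on the source; a minimal check (my suggestion, not something the post prescribes):

SHOW archive_mode;
SHOW archive_command;

archive_mode should be on, and archive_command should point at whatever ships WAL segments to your archive.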

Step Two: Properly Promote the Source Server

The source server, which will be the new primary, needs to be properly promoted. Use the pg_ctl promote option, or the trigger_file option in recovery.conf so that the source promotes itself, and starts a new timeline. Don’t just shut the source down, remove recovery.conf, and bring it back up! That doesn’t create a new timeline, and the source won’t have the appropriate divergence point from the target for pg_rewind to consult.
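
If you want to verify that the promotion really did start a new timeline, you can ask the source directly. A hedged sketch, using a function available since PostgreSQL 9.6 (this check is my suggestion, not part of the original post):

SELECT timeline_id FROM pg_control_checkpoint();

Run it before and after the promotion; the value should have incremented.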

Step Three: Wait for the Forced Checkpoint to Complete

When a secondary is promoted to being a primary, it starts a forced checkpoint when it exits recovery mode. This checkpoint is a “fast” checkpoint, but it can still take a while, depending

[...]
Category: postgresql

Craig Kerstiens: Fun with SQL: Common Table Expressions for more readable queries

planet postgresql - 2018-08-09 (Thu) 23:44:00

This week we’re continuing our fun with SQL series. In past posts we’ve looked at generate_series, window functions, and recursive CTEs. This week we’re going to take a step backward and look at standard CTEs (common table expressions) within Postgres.

Admittedly SQL isn’t always the most friendly language to read. It’s a little more friendly to write, but even still not as naturally readable as something like Python. Despite its shortcomings there, it is the lingua franca when it comes to data: SQL is the language and API that began with relational databases, and now even non-traditional databases are aiming to imitate it with their own SQL-like thing. With CTEs, though, our SQL, even queries hundreds of lines long, can become readable to someone without detailed knowledge of the application.

CTEs (common table expressions), often referred to as with clauses/queries, are essentially views that are valid during the course of a transaction. They can reference earlier CTEs within that same transaction or query, essentially giving you separate building blocks on which to compose your queries. It is of note that CTEs are an optimization boundary, so in some cases they may have worse performance than their non-CTE alternatives. Even still, they’re incredibly useful for readability and should be considered when constructing large complex queries. Let’s dig in with an example.
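
Before the full CRM example below, here is a minimal sketch of what that chaining looks like; the orders table and its columns are hypothetical, invented purely for illustration:

WITH monthly_totals AS (
    SELECT customer_id,
           date_trunc('month', created_at) AS month,
           sum(total) AS month_total
    FROM orders
    GROUP BY 1, 2
), big_months AS (
    -- a later CTE can reference an earlier one by name
    SELECT customer_id, month
    FROM monthly_totals
    WHERE month_total > 1000
)
SELECT customer_id, count(*) AS months_over_1000
FROM big_months
GROUP BY customer_id;

Each block reads top to bottom like a named building block, which is exactly what makes long queries approachable.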

We’re going to assume a basic CRM schema where we have organizations, that have contacts which are tied to accounts, and those accounts have opportunities. In this CRM world we want to create a report that has all opportunities that were opened between 30 and 60 days ago, and have a contact that was contacted within the last 30 days, grouped by the sales rep owner. The goal is to see that our sales reps are actively chasing the deals that they said exist.

Query for opportunities created 1-2 months ago

First we’re going to construct our query that gives us all opportunities opened in that range:

WITH opp_list AS ( SELECT opportunities.id as o[...]
Category: postgresql

Viorel Tabara: An Overview of Amazon RDS & Aurora Offerings for PostgreSQL

planet postgresql - 2018-08-09 (Thu) 20:15:49

AWS PostgreSQL services fall under the RDS umbrella, which is Amazon’s DBaaS offering for all known database engines.

Managed database services offer certain advantages that are appealing to the customer seeking independence from infrastructure maintenance, and highly available configurations. As always, there isn’t a one size fits all solution. The currently available options are highlighted below:

Aurora PostgreSQL

The Amazon Aurora FAQ page provides important details that need to be considered before diving into the product. For example, we learn that the storage layer is virtualized and sits on a proprietary virtualized storage system backed up by SSD.

Pricing

In terms of pricing, it must be noted that Aurora PostgreSQL is not available in the AWS Free Tier.

Compatibility

The same FAQ page makes it clear that Amazon doesn’t claim 100% PostgreSQL compatibility. Most (my emphasis) of the applications will be fine, e.g. the AWS PostgreSQL flavor is wire-compatible with PostgreSQL 9.6. As a result, the Wireshark PostgreSQL Dissector will work just fine.

Performance

Performance is also linked to the instance type; for example, the maximum number of connections is by default configured based on the instance size.
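
You can always inspect the effective limit on a running instance yourself; checking this particular setting is my suggestion, not the article’s:

SHOW max_connections;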

Also important when it comes to compatibility is the page size, which has been kept at 8 KiB, the PostgreSQL default. Speaking of pages, it’s worth quoting the FAQ: “Unlike traditional database engines Amazon Aurora never pushes modified database pages to the storage layer, resulting in further IO consumption savings.” This is made possible because Amazon changed the way the page cache is managed, allowing it to remain in memory in case of database failure. This feature also lets the database restart following a crash much faster than the traditional method of replaying the logs.

According to the

[...]
Category: postgresql

10.5

postgresql.org - 2018-08-09 (Thu) 09:00:00
10.5 is the latest release in the 10 series.
Category: postgresql

9.6.10

postgresql.org - 2018-08-09 (Thu) 09:00:00
9.6.10 is the latest release in the 9.6 series.
Category: postgresql

9.5.14

postgresql.org - 2018-08-09 (Thu) 09:00:00
9.5.14 is the latest release in the 9.5 series.
Category: postgresql

9.4.19

postgresql.org - 2018-08-09 (Thu) 09:00:00
9.4.19 is the latest release in the 9.4 series.
Category: postgresql

9.3.24

postgresql.org - 2018-08-09 (Thu) 09:00:00
9.3.24 is the latest release in the 9.3 series.
Category: postgresql

Christophe Pettus: Does anyone really know what time it is?

planet postgresql - 2018-08-08 (Wed) 09:58:03

PostgreSQL has a variety of ways of telling time: now(), statement_timestamp(), and clock_timestamp(). Each has a different sense of when “now” is:

  • now() is the time at the start of the transaction; it never changes while the current transaction is open.
  • statement_timestamp() is the time that the current statement started running. It changes from statement to statement, but is constant within a statement (which means it is constant within a PL/pgSQL function).
  • clock_timestamp() changes each time it is called, regardless of context.

Each has its uses:

  • For a predicate in a WHERE clause, you want either now() or statement_timestamp(). These work properly with indexes, because they are constant within the execution of a statement.
  • If you need the time to update within a single transaction, use statement_timestamp(); otherwise, use now().
  • Generally, you only use clock_timestamp() inside a procedural-language function, so you can get the actual current time at each call; the sketch below shows all three side by side.
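
A minimal demonstration of all three, runnable in psql (the two-second sleep is arbitrary):

BEGIN;
SELECT now(), statement_timestamp(), clock_timestamp();
SELECT pg_sleep(2);
-- still inside the same transaction:
SELECT now(), statement_timestamp(), clock_timestamp();
COMMIT;

In the second SELECT, now() returns the same value as in the first, while statement_timestamp() and clock_timestamp() have both advanced.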
Category: postgresql

Douglas Hunley: pgBouncer and auth pass-thru

planet postgresql - 2018-08-07 (Tue) 21:53:28

I’ve noticed several individuals inquiring lately about pgBouncer and how they can avoid putting all users and their passwords in its auth_file. After the most recent such inquiry (hi Richard!) I decided I’d write this post to hopefully make it clearer how to use ‘pass-thru auth’ and avoid maintaining your users and their passwords in an external file. So let’s see what this takes, shall we?

First, install pgBouncer as per your OS (yum, apt, brew, etc):

doug@ReturnOfTheMac ~> brew install pgbouncer
Updating Homebrew...
==> Auto-updated Homebrew!
Updated 1 tap (homebrew/core).
==> Updated Formulae
<snip>
==> Downloading https://homebrew.bintray.com/bottles/pgbouncer-1.8.1.high_sierra
######################################################################## 100.0%
==> Pouring pgbouncer-1.8.1.high_sierra.bottle.tar.gz
==> Caveats
The config file: /usr/local/etc/pgbouncer.ini
is in the "ini" format and you will need to edit it for your particular setup.
See: https://pgbouncer.github.io/config.html
The auth_file option should point to the /usr/local/etc/userlist.txt file
which can be populated by the /usr/local/opt/pgbouncer/bin/mkauth.py script.
To have launchd start pgbouncer now and restart at login:
  brew services start pgbouncer
Or, if you do not want/need a background service you can just run:
  pgbouncer -q /usr/local/etc/pgbouncer.ini
==> Summary
🍺 /usr/local/Cellar/pgbouncer/1.8.1: 17 files, 399.9KB

Great, so now we have pgBouncer installed.

To make life easier on ourselves, we’re going to temporarily enable trusted local socket connections in our pg_hba.conf:

# TYPE  DATABASE  USER  ADDRESS  METHOD
# "local" is for Unix domain socket connections only
local   all       all            trust

Right now, this is the only line in my pg_hba.conf. Let’s SIGHUP the postmaster so it takes effect:

doug@ReturnOfTheMac ~> pg_ctl -D $PGDATA reload
server signaled

And test it:

doug@ReturnOfTheMac ~> unset PGPASSWORD ; psql -U doug -d doug -c "select now();"
┌────────────────────────[...]
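
The excerpt is cut off here, before the pass-thru configuration itself. For context, the usual shape of the trick is a SECURITY DEFINER lookup function in the target database that pgBouncer’s auth_query can call; a hedged sketch, with the schema and function names chosen purely for illustration:

CREATE SCHEMA IF NOT EXISTS pgbouncer;

CREATE OR REPLACE FUNCTION pgbouncer.user_lookup(
    IN  i_username text,
    OUT uname      text,
    OUT phash      text
) RETURNS record AS $$
BEGIN
    -- pg_shadow is only readable by superusers, hence SECURITY DEFINER
    SELECT usename, passwd
      INTO uname, phash
      FROM pg_catalog.pg_shadow
     WHERE usename = i_username;
    RETURN;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;

pgbouncer.ini would then point auth_user at a dedicated role and set auth_query to call this function, so user passwords never need to live in userlist.txt.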
Category: postgresql

Krzysztof Książek: Monitoring your Databases with ClusterControl

planet postgresql - 2018-08-07 (Tue) 19:42:29

Observability is a critical piece of the operations puzzle: you have to be able to tell the state of your system based on data. Ideally, this data will be available from a single location. Having multiple applications, each handling separate pieces of data, is a direct route to serious trouble. When issues start, you have to be able to tell what is going on quickly, rather than trying to analyze and merge reports from multiple sources.

ClusterControl, among other features, provides users with one single point from which to track the health of their databases. In this blog post, we will show some of the observability features in ClusterControl.

Overview Tab

The Overview section is one place to track the state of a cluster, including all the cluster nodes as well as any load balancers.

It provides easy access to multiple pre-defined dashboards which show the most important information for the given type of cluster. ClusterControl supports different open source datastores, and different graphs are displayed based on the vendor. ClusterControl also gives you the option to create your own custom dashboards:

One key feature is that graphs are aggregated across all cluster nodes. This makes it easier to track the state of the whole cluster. If you want to check graphs from each of the nodes, you can easily do that:

By ticking “Show Servers”, all nodes in the cluster will be shown separately allowing you to drill down into each one of them.

Nodes Tab

If you would like to check a particular node in more detail, you can do so from the “Nodes” tab.

Here you can find metrics related to a given host: CPU, disk, network, memory. All the important bits of data that define how a given server behaves and how loaded it is.

The Nodes tab also gives you an option to check the database metrics for a given node:

All of those graphs are customizable, and you can easily add more of them:

The Nodes tab also contains metrics related to nodes other than databases. For example, for ProxySQL, ClusterControl provides an extensive list of graphs to t

[...]
Category: postgresql

Luca Ferrari: pgxnclient and beta version

planet postgresql - 2018-08-07 (Tue) 09:00:00

pgxnclient is a wonderful cpan-like tool for the PGXN extension network. Unluckily, the client cannot handle PostgreSQL beta versions, so I submitted a really small patch to fix the issue.

pgxnclient and beta version

If you, like me, are addicted to terminal mode, you surely love a tool like pgxnclient that allows you to install extensions into PostgreSQL from the command line, much like cpan (and friends) does for Perl.

A few days ago, I ran into a problem: the load command does not work against a PostgreSQL 11 beta 2 server. At first I reported it with a ticket (https://github.com/dvarrazzo/pgxnclient/issues/29), but then curiosity hit me and I decided to take a look at the very well written source code.

Warning: I’m not a Python developer, or better, I’m a Python-idiot! This means the work I’ve done, even if it seems to work, could be totally wrong, so reviews are welcome.

First I got to the regular expression used to parse a version() output:

m = re.match(r'\S+\s+(\d+)\.(\d+)(?:\.(\d+))?', data)

where data is the output of SELECT version();. Now, this works great for a version like 9.6.5 or 10.3, but does not work for 11beta2. Therefore, I decided to implement a two-level regular expression check: first search for two or three numbers, and if that fails, search for two numbers separated by the beta text string.

m = re.match(r'\S+\s+(\d+)\.(\d+)(?:\.(\d+))?', data)
if m is None:
    m = re.match(r'\S+\s+(\d+)beta(\d+)', data)
    is_beta...
Category: postgresql

Avinash Kumar: Basic Understanding of Bloat and VACUUM in PostgreSQL

planet postgresql - 2018-08-06 (Mon) 22:46:06

The implementation of MVCC (Multi-Version Concurrency Control) in PostgreSQL is different and special compared with other RDBMSs. MVCC in PostgreSQL controls which tuples are visible to transactions via versioning.

What is versioning in PostgreSQL?

Let’s consider the case of an Oracle or a MySQL database. What happens when you perform a DELETE or an UPDATE of a row? You see an UNDO record maintained in a global UNDO segment. This UNDO segment contains the past image of a row, to help the database achieve consistency (the “C” in A.C.I.D). For example, if there is an old transaction that depends on the row that got deleted, the row may still be visible to it because the past image is still maintained in the UNDO. If you are an Oracle DBA reading this blog post, you may quickly recollect the error ORA-01555: snapshot too old. What this error means is: you may have a smaller undo_retention, or an UNDO segment not big enough to retain all the past images (versions) needed by existing or old transactions.

You may not have to worry about that with PostgreSQL.

Then how does PostgreSQL manage UNDO ?

In simple terms, PostgreSQL maintains both the past image and the latest image of a row in its own table; that is, UNDO is maintained within each table. And this is done through versioning. Now we may get a hint that every row of a PostgreSQL table has a version number, and that is absolutely correct. In order to understand how these versions are maintained within each table, you should understand the hidden columns of a table (especially xmin) in PostgreSQL.

Understanding the Hidden Columns of a Table

When you describe a table, you only see the columns you have added, as in the following log.

percona=# \d scott.employee
                 Table "scott.employee"
  Column  |          Type          | Collation | Nullable | Default
----------+------------------------+-----------+----------+------------------------------------------------
 emp_id   | integer [...]
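
The excerpt is truncated here, but you can see the hidden columns by naming them explicitly; a minimal sketch using a throwaway table invented for this illustration:

CREATE TABLE t (id int, val text);
INSERT INTO t VALUES (1, 'first');
SELECT xmin, xmax, ctid, * FROM t;

UPDATE t SET val = 'second' WHERE id = 1;
-- the surviving row version now carries a new xmin (and a new ctid):
SELECT xmin, xmax, ctid, * FROM t;

Every UPDATE writes a new row version stamped with the updating transaction's ID in xmin; the old version stays in the table (as bloat) until VACUUM cleans it up.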
Category: postgresql

brian davis: Cost of a Join

planet postgresql - 2018-08-05 (Sun) 14:00:00

How expensive is a join?

It depends! It depends on what the join criteria are, what indexes are present, how big the tables are, whether the relations are cached, what hardware is being used, what configuration parameters are set, whether statistics are up to date, and what other activity is happening on the system, to name a few things.

But we can still try to get a feel for what a couple of simple scenarios look like and see what happens when:

  1. The number of tables being joined increases
  2. The number of rows in those tables increases
  3. Indexes are present / not present

My inspiration is a question that comes up over and over again when designing the schema for an upcoming feature. For example, if we have an existing table of products and we want to add the notion of a "status" for each product: "Active", "Discontinued", "Recalled", etc. Do we:

  1. Add a status_id column to the product table and reference a new status table
  2. Add a status_id column to the product table and let the app define the mapping of what each status_id is
  3. Add a status column of type text to the product table

I usually argue for the first option. Arguments for options two and three usually revolve around two points: concerns over the performance of the join, and developer ergonomics. The developer ergonomics is a matter of taste, but we can certainly look at the join performance.
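
For concreteness, option one looks something like the following; the table and column names are invented for this sketch:

CREATE TABLE status (
    id   integer PRIMARY KEY,
    name text NOT NULL UNIQUE
);

CREATE TABLE product (
    id        serial PRIMARY KEY,
    name      text NOT NULL,
    status_id integer NOT NULL REFERENCES status (id)
);

-- the equi-join a report would typically perform:
SELECT product.name, status.name AS status
FROM product
JOIN status ON status.id = product.status_id;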

We'll go nuts and really put PostgreSQL to the test. For some straightforward equi-joins - the kind a lookup table would be doing - let's see what happens to performance when the number of joins goes through the roof. How many is too many?

Here's how we'll be creating our tables for testing. Scroll down to the charts if you just want to see the results.

DROP FUNCTION IF EXISTS create_tables(integer, integer, boolean);
CREATE FUNCTION create_tables(num_tables integer, num_rows integer, create_indexes boolean)
RETURNS void AS $function_text$
BEGIN

-- There's no table before the first one, so this one's a little different. Create it here instead of in our loop.
DROP TABLE IF EXIST[...]
Category: postgresql

Ozgun Erdogan: Citus 7.5: The right way to scale SaaS apps

planet postgresql - 2018-08-04 (Sat) 03:18:00

One of the primary challenges with scaling SaaS applications is the database. While you can easily scale your application by adding more servers, scaling your database is a way harder problem. This is particularly true if your application benefits from relational database features, such as transactions, table joins, and database constraints.

At Citus, we make scaling your database easy. Over the past year, we added support for distributed transactions, made Rails and Django integration seamless, and expanded on our SQL support. We also documented approaches to scaling your SaaS database to thousands of customers.

Today, we’re excited to announce the latest release of our distributed database—Citus 7.5. With this release, we’re adding key features that make scaling your SaaS / multi-tenant database easier. If you’re into bulleted lists, these features include the following.

What’s New in Citus 7.5
  • Foreign key constraints from distributed to reference tables
  • SELECT .. FOR UPDATE
  • Per-customer (per-tenant) query statistics (aka Landlord)
  • Advanced Security: Row and column level access controls
  • Native integration with PostgreSQL extensions: HLL and TopN

To try these new features, you can download Citus packages on your local machine or create a Citus distributed database cluster on Citus Cloud. Or, keep reading to learn more about 7.5.

Example SaaS Application

Let’s write an ad analytics app which companies can use to view, change, and analyze their ads. Such an application has the characteristics of a typical SaaS / multi-tenant application. Data from different tenants is stored in a central database, and each tenant has an isolated view of their data.

First, you create two distributed tables using the standard PostgreSQL syntax.

CREATE TABLE companies (
    id        bigint NOT NULL,
    name      text NOT NULL,
    image_url text,
    CONSTRAINT pk_company_id PRIMARY KEY (id)
);

CREATE TABLE campaigns (
    id         bigint NOT NULL,
    company_id bigint NOT NULL,
    country_id int NOT NULL,
    name       text NOT NULL,
    cost_mo[...]
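
The excerpt is truncated before the distribution step, but in Citus these tables would typically be distributed with create_distributed_table. A sketch, assuming company_id as the tenant key as in Citus's multi-tenant documentation:

SELECT create_distributed_table('companies', 'id');
SELECT create_distributed_table('campaigns', 'company_id');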
Category: postgresql

Brian Fehrle: Understanding And Reading the PostgreSQL System Catalog

planet postgresql - 2018-08-02 (Thu) 16:01:09

Managing databases is no small task, and it can easily be frustrating without knowing what’s happening under the covers. Whether you are trying to find out if new indexes are helpful, the transaction count on a database at a certain time, or who’s connected to the database at any given time, data that allows administrators to truly know how their databases are performing is king. Luckily, with PostgreSQL, the data for all of this is available in the PostgreSQL system catalog.

The PostgreSQL System Catalog is a schema with tables and views that contain metadata about all the other objects inside the database and more. With it, we can discover when various operations happen, how tables or indexes are accessed, and even whether or not the database system is reading information from memory or needing to fetch data from disk.

Here we will give an overview of the system catalog, highlight how to read it, and show how to pull useful information from it. Some of the metadata is straightforward, and other pieces take a bit of digesting to generate really useful information. Either way, PostgreSQL gives us a great platform to build whatever information we need about the database itself.

The PostgreSQL Catalog

PostgreSQL stores the metadata information about the database and cluster in the schema ‘pg_catalog’. This information is partially used by PostgreSQL itself to keep track of things, but it is also presented so that external people / processes can understand the inside of the databases too.

The PostgreSQL Catalog has a pretty solid rule: look, don’t touch. While PostgreSQL stores all this information in tables like any other application would, the data in the tables is fully managed by PostgreSQL itself, and should not be modified unless there is an absolute emergency, and even then a rebuild is likely in order afterwards.
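
As a first concrete taste, the “who’s connected at any given time” question from the introduction can be answered with a single query against the pg_stat_activity view; the particular columns selected here are my choice:

SELECT pid, usename, datname, client_addr, state, query_start
FROM pg_stat_activity
WHERE state IS NOT NULL;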

We will go over a few useful catalog tables, how to read the data, and clever things we can do with the data itself. There are quite a few tables in the catalog that we won’t go over, but all information for the

[...]
Category: postgresql

David Wheeler: pgenv

planet postgresql - 2018-08-02 (Thu) 13:31:03

For years, I’ve managed multiple versions of PostgreSQL by regularly editing and running a simple script that builds each major version from source and installs it in /usr/local. I would shut down the current version, remove the symlink to /usr/local/pgsql, symlink the one I wanted, and start it up again.

This is a pain in the ass.

Recently I wiped my work computer (because reasons) and started reinstalling all my usual tools. PostgreSQL, I decided, no longer needs to run as the postgres user from /usr/local. What would be much nicer, when it came time to test pgTAP against all supported versions of Postgres, would be to use a tool like plenv or rbenv to do all the work for me.

So I wrote pgenv. To use it, clone it into ~/.pgenv (or wherever you want) and add its bin directories to your $PATH environment variable:

git clone https://github.com/theory/pgenv.git
echo 'export PATH="$HOME/.pgenv/bin:$HOME/.pgenv/pgsql/bin:$PATH"' >> ~/.bash_profile

Then you’re ready to go:

pgenv build 10.4

A few minutes later, it’s there:

$ pgenv versions
pgsql-10.4

Let’s use it:

$ pgenv use 10.4
The files belonging to this database system will be owned by user "david".
This user must also own the server process.
# (initdb output elided)
waiting for server to start.... done
server started
PostgreSQL 10.4 started

Now connect:

$ psql -U postgres
psql (10.4)
Type "help" for help.

postgres=#

Easy. Each version you install – as far back as 8.0 – has the default super user postgres for compatibility with the usual system-installed version. It also builds all contrib modules, including PL/Perl using /usr/bin/perl.

With this little app in place, I quickly built all the versions I need. Check it out:

$ pgenv versions
   pgsql-10.3
 * pgsql-10.4
   pgsql-11beta2
   pgsql-8.0.26
   pgsql-8.1.23
   pgsql-8.2.23
   pgsql-8.3.23
   pgsql-8.4.22
   pgsql-9.0.19
   pgsql-9.1.24
   pgsql-9.2.24
   pgsql-9.3.23
   pgsql-9.4.18
   pgsql-9.5.13
   pgsql-9.6.9

Other commands include start, stop, and restart, which

[...]
Category: postgresql
