Brian Fehrle: Dynamic Monitoring of PostgreSQL Instances Using pg_top

planet postgresql - 2018-10-08(Mon) 18:57:00

Monitoring PostgreSQL can, at times, be like trying to wrangle cattle in a thunderstorm. Applications connect and issue queries so quickly that it’s hard to see what’s going on, or even to get a good overview of the performance of the system beyond the typical developer complaint of ‘things are slow, help!’.

In previous articles, we’ve discussed how to Get to the Source when PostgreSQL is acting slow, but when the source is specifically queries, basic-level monitoring may not be enough to assess what’s going on in a live, active environment.

Enter pg_top, a PostgreSQL-specific program to monitor real-time activity in a database, as well as view basic information about the database host itself. Much like the linux command ‘top’, running it brings the user into a live interactive display of database activity on the host, refreshing automatically at set intervals.


Installing pg_top can be done in the generally expected ways: via a package manager or from source. The most recent version as of this article is 3.7.0.

Package Managers

Depending on the distribution of linux in question, search for pgtop or pg_top in the package manager; it’s likely available in some form for the version of PostgreSQL installed on the system.

Red Hat based distros:

# sudo yum install pg_top

Debian based distros:

# sudo apt-get install pgtop

Source

If desired, pg_top can be installed via source from the PostgreSQL git repository. This will provide any version desired, even newer builds that are not yet in the official releases.


Once installed, pg_top provides a very accurate real-time view into the database it is monitoring; running ‘pg_top’ from the command line launches the interactive PostgreSQL monitoring tool.

The tool itself can help shed light on all processes currently connected to the database.

Running pg_top

Launching pg_top works much like the unix / linux ‘top’ command itself, with connection information for the database added.

To run pg_top on a local database host:

pg_top -h localhost[...]
Category: postgresql

Pavel Stehule: Pager for data

planet postgresql - 2018-10-08(Mon) 17:03:00
Any Unix-like system has a great feature: a pager. The history of pagers is pretty old; originally they were invoked manually. The output of a program can be redirected to another program with the pipe operator.

ls -la | more

The simplest pager is more. It can scroll in only one direction, and is usually the default pager on Unix systems.

A much more powerful pager is less. It has many advantages: it can scroll in both directions, offers strong navigation inside the result, can display long lines, and allows horizontal scrolling. This pager is essential for comfortable use of the psql console.

export PAGER="less -S"
psql postgres
postgres=>select * from pg_class;

It is good to know some less commands:
  • g - move to document begin,
  • G - move to document end,
  • / - search string,
  • n - next occurrence of string,
  • N - previous occurrence of string,
  • q - quit.
less is a very rich pager with special support for displaying man pages, but that feature cannot be used for database data. For tabular data there is a special pager, pspg. This pager can freeze the first n columns and the first m rows. It knows almost all of less's keyboard commands, and adds some new ones (based on mcedit key bindings).
  • Alt k - new bookmark
  • Alt i - previous bookmark
  • Alt j - next bookmark
  • Alt n - show row numbers
  • Alt l - go to line
  • Ctrl Home - move to document begin
  • Ctrl End - move to document end
  • F9 - menu
  • F10 - quit
The usage of pspg is the same:

export PAGER=pspg
psql postgres

This pager is available from the PostgreSQL community repository or from git.

Avinash Kumar: PostgreSQL Extensions for an Enterprise-Grade System

planet postgresql - 2018-10-06(Sat) 01:43:42

In this current series of blog posts we have been discussing various relevant aspects when building an enterprise-grade PostgreSQL setup, such as security, back up strategy, high availability, and different methods to scale PostgreSQL. In this blog post, we’ll get to review some of the most popular open source extensions for PostgreSQL, used to expand its capabilities and address specific needs. We’ll cover some of them during a demo in our upcoming webinar on October 10.

Expanding with PostgreSQL Extensions

PostgreSQL is one of the world’s most feature-rich and advanced open source RDBMSs. Its features are not limited to those released by the community through major/minor releases. There are hundreds of additional features developed using PostgreSQL’s extension capabilities, which cater to the needs of specific users. Some of these extensions are very popular and useful for building an enterprise-grade PostgreSQL environment. We previously blogged about a couple of FDW extensions (mysql_fdw and postgres_fdw) which allow PostgreSQL databases to talk to remote homogeneous/heterogeneous databases like PostgreSQL, MySQL, MongoDB, etc. We will now cover a few other extensions that can expand your PostgreSQL server's capabilities.


The pg_stat_statements module provides a means for tracking execution statistics of all SQL statements executed by a server. The statistics gathered by the module are made available via a view named pg_stat_statements. This extension must be installed in each of the databases you want to track, and like many of the extensions in this list, it is available in the contrib package from the PostgreSQL PGDG repository.
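To make this concrete, a minimal sketch of setting it up and querying it (the query assumes a PostgreSQL 10/11-era install, where the view exposes total_time; some column names changed in later releases):

```sql
-- in postgresql.conf (requires a server restart):
shared_preload_libraries = 'pg_stat_statements'

-- then, in each database you want to track:
CREATE EXTENSION pg_stat_statements;

-- top five statements by cumulative execution time:
SELECT query, calls, total_time, rows
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 5;
```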


Tables in PostgreSQL may end up with fragmentation and bloat due to the specific MVCC implementation in PostgreSQL, or simply due to a high number of rows being naturally removed. This could lead to not only unused space being held inside the table but also to sub-optimal execution of SQL statements. pg_repack is the most popul


Sebastian Insausti: A New Way to Personalize Your PostgreSQL Monitoring with Prometheus

planet postgresql - 2018-10-05(Fri) 17:54:06

Monitoring is one of the fundamental tasks in any system. It can help us detect problems and take action, or simply know the current state of our systems. Visual displays can make us more effective, as we can more easily detect performance problems.

In this blog, we will see how to use SCUMM to monitor our PostgreSQL databases and what metrics we can use for this task. We’ll also go through the dashboards available, so you can easily figure out what’s really happening with your PostgreSQL instances.

What is SCUMM?

First of all, let’s see what SCUMM (Severalnines CMON Unified Monitoring and Management) is.

It’s a new agent-based solution with agents installed on the database nodes.

The SCUMM Agents are Prometheus exporters which export metrics from services like PostgreSQL as Prometheus metrics.

A Prometheus server is used to scrape and store time series data from the SCUMM Agents.

Prometheus is an open-source system monitoring and alerting toolkit originally built at SoundCloud. It is now a standalone open source project and maintained independently.

Prometheus is designed for reliability, to be the system you go to during an outage to allow you to quickly diagnose problems.

How to use SCUMM?

When using ClusterControl, when we select a cluster, we can see an overview of our databases, as well as some basic metrics that can be used to identify a problem. In the below dashboard, we can see a master-slave setup with one master and 2 slaves, with HAProxy and Keepalived.

ClusterControl Overview

If we go to the “Dashboards” option, we can see a message like the following.

ClusterControl Dashboards Disabled

To use this feature, we must enable the agent mentioned above. For this, we only have to press on the "Enable Agent Based Monitoring" button in this section.

ClusterControl Enable Agent Based Monitoring

To enable our agent, we must specify the host where we will install our Prometheus server, which, as we can see in the example, can be our ClusterControl server.

We must also specify:

  • Scrape Interval

Niksa Jakovljevic: How to manage Prometheus high-availability with PostgreSQL + TimescaleDB

planet postgresql - 2018-10-05(Fri) 04:14:20

Prometheus has become one of the most commonly used open-source monitoring and alerting systems. What makes Prometheus so great is that it solves monitoring problems in a simple and effective way. Yet there are still challenges with operating Prometheus, including long-term storage and managing high-availability.

In a previous post, we describe how and why to use PostgreSQL + TimescaleDB to enable long-term storage for Prometheus (as well as to simplify operations and add query power/flexibility).

Uniting SQL and NoSQL for Monitoring: Why PostgreSQL is the ultimate data store for Prometheus

In this post, we describe how PostgreSQL + TimescaleDB can also help with managing high-availability.

There are several reasons why to add high-availability (HA) to Prometheus deployments:

  • Avoiding data loss and a single point of failure
  • Avoiding missing any critical alerts generated by an Alertmanager
  • Replicating Prometheus data (although the data will not be fully consistent between replicas — more on this later)

Setting up HA for Prometheus requires running two or more identically configured Prometheus servers in parallel. Yet this simple setup does come with limitations.

One problem is that it can be hard to keep data in sync. Often parallel Prometheus instances do not have identical data. There are a few reasons for this. One is that scrape intervals may differ, since each instance has its own clock. Another is that, in case of instance failures, either of the Prometheus nodes may miss some data, which means that in the long run neither Prometheus instance would have a complete data set. In other words, there would be no single source of truth.

Another problem is with remote storage. When running HA Prometheus with remote storage, one can decide to just have a separate remote storage for each Prometheus instance. But this approach, while valid, does introduce other weaknesses: again, no ground truth; the complexity of operating multiple remote storage instances; and hardware costs.
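To make the gap problem concrete, here is a toy sketch in Python (not TimescaleDB's actual deduplication code; the replica samples are invented) that merges the sample streams of two HA replicas, letting each fill the other's missed scrapes:

```python
def merge_replicas(primary, secondary):
    """Merge two {timestamp: value} sample maps from parallel
    Prometheus replicas, preferring the primary when both scraped
    the same timestamp and filling its gaps from the secondary."""
    merged = dict(secondary)   # start with the secondary's samples
    merged.update(primary)     # primary wins on overlapping timestamps
    return dict(sorted(merged.items()))

# Replica A missed the scrape at t=30; replica B missed t=20.
replica_a = {10: 0.5, 20: 0.7, 40: 0.6}
replica_b = {10: 0.5, 30: 0.9, 40: 0.6}

combined = merge_replicas(replica_a, replica_b)
print(combined)  # {10: 0.5, 20: 0.7, 30: 0.9, 40: 0.6}
```

Neither replica alone holds the full series, but the merged view does, which is essentially what a shared remote-storage layer provides.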



Lukas Fittl: New in Postgres 11: Monitoring JIT performance, Auto Prewarm & Stored Procedures

planet postgresql - 2018-10-04(Thu) 21:00:00
Everyone’s favorite database, PostgreSQL, has a new release coming out soon: Postgres 11. In this post we take a look at some of the new features that are part of the release, and in particular review the things you may need to monitor, or can utilize to increase your application and query performance. Just-In-Time compilation (JIT) in Postgres 11: Just-In-Time compilation (JIT) for query execution was added in Postgres 11. It's not going to be enabled for queries by default, similar to parallel…

Bruce Momjian: Moving Tablespaces

planet postgresql - 2018-10-04(Thu) 00:00:01

Tablespaces are designed to allow Postgres clusters to be spread across multiple storage devices. CREATE TABLESPACE creates a symbolic link in the pg_tblspc directory in the cluster's data directory pointing to the newly-created tablespace directory.

Unfortunately, though there is a command to move tables and indexes between tablespaces, there is no command to move tablespaces to different directories. However, since Postgres 9.2, the process of moving tablespaces is quite simple:

  1. Record the OID of the tablespace you want to move
  2. Shut down the Postgres cluster
  3. Move the tablespace directory, either within the same file system or to a different file system
  4. Update the OID symbolic link that represents the moved tablespace to the new tablespace directory location
  5. Restart the server

Here's an example of moving a tablespace:

Continue Reading »
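A sketch of how those five steps might look on the command line (the data directory, mount points, and the OID 16385 are hypothetical; substitute your own cluster's values):

```shell
# 1. Record the tablespace OID before shutting down (in psql):
#    SELECT oid, spcname FROM pg_tablespace;
#    (suppose it reports OID 16385 for tablespace "ts1")

pg_ctl stop -D /var/lib/pgsql/data        # 2. shut down the cluster
mv /mnt/olddisk/ts1 /mnt/newdisk/ts1      # 3. move the tablespace directory
ln -sfn /mnt/newdisk/ts1 /var/lib/pgsql/data/pg_tblspc/16385   # 4. repoint the symlink
pg_ctl start -D /var/lib/pgsql/data       # 5. restart the server
```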


Liaqat Andrabi: Webinar : New Features in PostgreSQL 11 [Follow Up]

planet postgresql - 2018-10-03(Wed) 21:44:24

PostgreSQL 11, the next major release of the world’s most advanced open source database, is just around the corner. The new release of PostgreSQL will include enhancements in partitioning, parallelism, SQL stored procedures and much more.

To give PostgreSQL enthusiasts a deeper look into the upcoming release, 2ndQuadrant hosted a Webinar discussing the new features in PostgreSQL 11. The webinar was presented by Peter Eisentraut, Core Team Member and Major PostgreSQL Contributor.

If you weren’t able to make it to the live session, you can now view the recording here.

Questions that Peter couldn’t respond to during the live webinar have been answered below.

Q: Could you use a custom hash function for partitioning? (or is this in future plans)

A: This is currently not planned.

Q: As far as I understand, there are still no global indexes on partitioned tables. Am I right?

A: Right, these are only local indexes that get automatically added to new partitions and can be managed more simply. Global indexes would be a separate feature that is not currently on the horizon.

Q: What about commit in function? Is this available as well?

A: Commit and rollback are only possible in procedures.

Q: Will JIT-enabled builds be available through PGDG?

A: Yes, community produced Debian and RPM packages will have LLVM support. It might be in a separate sub-package. But some older operating systems don’t have a recent enough LLVM version, so in that case it won’t be available.

Q: Does JIT inlining work with user-defined operators?

A: Yes. The extension module needs to be built in a certain way to install the associated LLVM bitcode. But the normal extension build system takes care of that.

Q: Are JIT compiled queries cached? So they can be reused either by the same session or another session.

A: No, the result of JIT compilation is not cached. This was considered but was not deemed worthwhile at this time.

Q: What about TDE in PostgreSQL 11?

A: TDE is not in PostgreSQL 11.

Q: Can we expect data at rest encryption enhancements in Postgre


Jobin Augustine: Scaling PostgreSQL using Connection Poolers and Load Balancers for an Enterprise Grade environment

planet postgresql - 2018-10-03(Wed) 01:15:30

In the previous blog posts in this series we’ve covered some of the essential aspects of an Enterprise-grade solution: security, high availability, and backups. Another important aspect is the scalability of the solution: as our application grows how do we accommodate an increase in traffic while maintaining the quality of the service (response time)? The answer to this question depends on the nature of the workload at play but it is often shaped around:

(a) improving its efficiency and
(b) increasing the resources available.

Why connection pooling?

When it comes to improving the efficiency of a database workload, one of the first places we start looking is the list of slow queries; if the most popular ones can be optimized to run faster, then we can easily gain some overall performance back. Arguably, we may look next at the number and frequency of client connections: is the workload composed of a high number of very frequent but short-lived connections? Or are client connections of a more moderate number, tending to stick around for longer?

If we consider the first scenario further (a high number of short-lived connections), where each connection spawns a new OS process, the server may hit a practical limit on the number of transactions (or connections) it can manage per second, given the hardware available and the workload being processed. Remember that PostgreSQL is process-based, as opposed to thread-based, and spawning a new process is itself an expensive operation in terms of resources, both CPU and memory.

A possible remedy for this would be the use of a connection pooler, acting as a mediator between the application and the database. The connection pooler keeps a number of connections permanently opened with the database, and receives and manages all incoming requests from clients itself, allowing them to temporarily use one of the connections it already has established with PostgreSQL. This removes the burden of creating a new process each time a client establishes a connection with PostgreSQL, and allows it
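The idea can be sketched in a few lines of Python (a toy pool, not PgBouncer or any real pooler; the "connections" here are stand-in objects), showing that serving many short-lived clients never creates more than the pooled number of connections:

```python
import queue

class SimplePool:
    """A toy connection pool: pre-open `size` connections and reuse them."""
    def __init__(self, size, connect):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())   # connections created once, up front

    def acquire(self):
        return self._idle.get()         # blocks if all connections are busy

    def release(self, conn):
        self._idle.put(conn)            # return the connection for reuse

# Count how many "connections" are actually opened.
opened = []
pool = SimplePool(5, connect=lambda: opened.append(1) or object())

for _ in range(100):                    # 100 client requests...
    conn = pool.acquire()
    # ... run a query on conn here ...
    pool.release(conn)

print(len(opened))  # 5 -- only the pooled connections were ever created
```

One hundred requests were served by only five long-lived connections, which is exactly the burden-shifting the paragraph above describes.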


Oleg Bartunov: Bloom index for bigint

planet postgresql - 2018-10-03(Wed) 01:14:17
By default, a bloom index works for int4 and text, but other types with a hash function and an equality operator can be supported. Just use the operator class interface; for example, for type bigint:

OPERATOR 1 =(bigint, bigint),
FUNCTION 1 hashint8(bigint);

Now you can build a bloom index on the bigint data type.
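Put together as a complete statement, the operator class might look like the following (the class name bigint_ops and the test table are my own choices for illustration; hashint8 is PostgreSQL's built-in hash function for bigint):

```sql
CREATE EXTENSION bloom;

CREATE OPERATOR CLASS bigint_ops
DEFAULT FOR TYPE bigint USING bloom AS
    OPERATOR 1 =(bigint, bigint),
    FUNCTION 1 hashint8(bigint);

CREATE TABLE t (a bigint);
CREATE INDEX t_bloom_idx ON t USING bloom (a);
```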

Alexey Lesovsky: pgCenter - stress free Postgres stats.

planet postgresql - 2018-10-02(Tue) 23:16:00
pgCenter has seen a new release with several significant changes. If you used pgCenter before, you might have noticed that I stopped issuing updates for a while. This is because I was working on this new release, with several new features and overall improvements that make the tool simpler to use and add functionality to make your life easier when working with Postgres stats.

This new version has been completely rewritten using Golang.

Why Go? To me, Golang is much simpler than pure C, but, at the same time, it is an amazing and powerful language. I had a few requests that I was aiming to address in this release and frankly, it was much easier to implement these using Golang.

With this release, pgCenter has become quite a comprehensive toolbox for working with Postgres stats.

In this 0.5.0 version I’ve added two new key tools: one for recording stats into history files and one for building reports based on those files. People who are familiar with pgCenter often asked me about these features, and now they are available and ready to use.

Additional features
There are also a few additional minor changes that were made in this version of pgCenter:
  • pgCenter’s functionality is now split into sub-commands, in the style of Git or Perf. pgCenter's new functions have different aims and are distributed across several sub-commands with different execution options and parameters.
  • pgCenter uses lib/pq library for handling the connections to Postgres, and one of the advantages is the built-in support of .pgpass file and environment variables. And the cherry on the cake is that you can override connections settings using PGOPTIONS variable.
pgCenter top changes
As mentioned earlier, pgCenter is split into several sub-commands; “top” is now a dedicated sub-command, and in the new version it has seen some improvements:
  • Activity statistics are shown by default. In previous versions, database statistics were the default, but the majority of pgCenter users often switched to activity stats, so in this version act

Hubert 'depesz' Lubaczewski: Foreign Key to partitioned table

planet postgresql - 2018-10-02(Tue) 18:02:48
One of the long standing limitations of partitions is that you can't have foreign keys pointing to them. Let's see if I can make it possible to have some kind of constraint that would do the same thing as fkey. First, obviously, we need partitioned table with some data: =$ CREATE TABLE USERS ( id […]

Don Seiler: Sequence Caching: Oracle vs. PostgreSQL

planet postgresql - 2018-10-02(Tue) 14:00:00

Many RDBMSes use sequence caching to pre-generate sequence values in advance and store them in memory, allowing them to return new values quickly. If you are doing a lot of inserts that each call the sequence to get the next value, sequence caching is a good thing. Having to wait for the sequence to generate a value on every call could slow things down greatly.

When I made the move from Oracle to PostgreSQL, I noticed some interesting differences in sequence caching behavior and wanted to share them here for those that may be curious about them.

All examples below were conducted in Oracle (12c Enterprise Edition) and PostgreSQL 9.6.9 (via PGDG Yum Repository).

Cache Value Sharing

In Oracle, when a sequence cache is generated, all sessions access the same cache. However in PostgreSQL, each session gets its own cache. We demonstrate this with a couple of quick-and-easy examples below.


First let’s create our sequence, using a cache size of 25.

SQL> CREATE SEQUENCE dts_seq
  2  INCREMENT BY 1
  3  CACHE 25;

Sequence created.

Now I can open a couple of sessions and call nextval on that sequence to increment it from two different sessions:

-- First session
SQL> SELECT dts_seq.nextval FROM dual;
   NEXTVAL
----------
         1
SQL> SELECT dts_seq.nextval FROM dual;
   NEXTVAL
----------
         2
SQL> SELECT dts_seq.nextval FROM dual;
   NEXTVAL
----------
         3

-- Second session
SQL> SELECT dts_seq.nextval FROM dual;
   NEXTVAL
----------
         4
SQL> SELECT dts_seq.nextval FROM dual;
   NEXTVAL
----------
         5
SQL> SELECT dts_seq.nextval FROM dual;
   NEXTVAL
----------
         6

It’s even more clear when alternating between the two sessions:

-- Session A
SQL> SELECT dts_seq.nextval FROM dual;
   NEXTVAL
----------
       151

-- Session B
SQL> SELECT dts_seq.nextval FROM dual;
   NEXTVAL
----------
       152

-- Session A
SQL> SELECT dts_seq.nextval FROM dual;
   NEXTVAL
----------
       153

-- Session B
SQL> SELECT dts_seq.nextval FROM dual[...]
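The PostgreSQL behavior described above is easy to simulate. The following Python sketch (a simulation of the allocation pattern only, not real database code) models per-session caching with CACHE 25, where each session that exhausts its private cache grabs the next block of values:

```python
class Sequence:
    """Simulates a sequence with CACHE n: each session that exhausts
    its private cache is handed the next block of n values."""
    def __init__(self, cache=25):
        self.cache = cache
        self.next_block_start = 1

    def allocate_block(self):
        start = self.next_block_start
        self.next_block_start += self.cache
        return iter(range(start, start + self.cache))

class Session:
    def __init__(self, seq):
        self.seq = seq
        self.cache = iter(())           # private cache, initially empty

    def nextval(self):
        try:
            return next(self.cache)
        except StopIteration:           # cache exhausted: grab a new block
            self.cache = self.seq.allocate_block()
            return next(self.cache)

seq = Sequence(cache=25)
a, b = Session(seq), Session(seq)
print(a.nextval(), b.nextval(), a.nextval())  # 1 26 2 -- not 1 2 3
```

Session b starts at 26 because session a's first call reserved the whole 1-25 block for itself, which is exactly the gap pattern PostgreSQL shows.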

Bruce Momjian: Switchover/Failover and Session Migration

planet postgresql - 2018-10-02(Tue) 07:45:01

I have already covered switchover and failover. In the case of failover, the old primary is offline so there are no options for migrating clients from the old primary to the new primary. If there is a switchover, there are options for client migration. Assuming you are using streaming replication, only one server can accept writes. Therefore, for client migration, you can either:

  • Force all clients on the old primary to exit, then promote the new primary
  • Wait for all clients on the old primary to exit, then promote the new primary

If you choose force, it will disrupt applications; they must be designed to handle disconnections and potentially reconfigure their sessions, e.g., session variables, cursors, open transactions. If you choose wait, how do you handle clients that want to connect while you are waiting for the existing clients to disconnect? Minor upgrades, which require a restart of the database server, have similar issues.

The only clean solution is to use multi-master replication so new clients can connect to the new primary while waiting for old-primary clients to finish and disconnect. However, it is operationally expensive to support multi-master just to minimize switchover disruption.


Keith Fiske: Per-Table Autovacuum Tuning

planet postgresql - 2018-10-02(Tue) 03:43:06

A pattern that drives my blog posts definitely seems to be the frequency of client questions. And that is definitely the case here again. Vacuum tuning to manage bloat and transaction ID wraparound on production systems has been a hot topic, and lately this has even been getting down to tuning autovacuum on an individual-table basis. I’ve already discussed bloat pretty extensively in previous posts. While I’d like to get into the details of transaction ID wraparound, that really isn’t the focus of this post, so I’ll defer you to the documentation.

One setting I will discuss first though is autovacuum_freeze_max_age. Basically when any table’s max XID value reaches this, a more aggressive “emergency” autovacuum is kicked off. If many tables hit this at the same time (a common occurrence with data-warehouses that have many large, sparsely written tables), that can kick off some pretty high and long lasting IO caused by these autovacuums. I highly recommend increasing autovacuum_freeze_max_age from the default value of 200 million to 1 billion. However, I don’t recommend doing this unless you have some monitoring in place to watch for tables reaching both autovacuum_freeze_max_age and wraparound, the latter of which can shut your database down for some pretty extensive downtime.

So ideally, we want autovacuum running often enough on its own so you never encounter any of those bad situations above. The next thing I’d recommend tuning on any database before getting down to the table-level methods, no matter its size or traffic, are the default settings that control when autovacuum initially kicks in.

autovacuum_analyze_scale_factor = .10
autovacuum_analyze_threshold = 50
autovacuum_vacuum_scale_factor = .20
autovacuum_vacuum_threshold = 50

In my opinion, the defaults on these are not ideal for any situation. The scales are too high and the thresholds are too low. The scale factor settings are percentage values that say, “When this percentage of the table’s rows have changed (updates/deletes), run vacuum or ana
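The trigger point for these settings is documented as threshold + scale_factor * reltuples dead rows. A quick calculation (plain Python) shows how the default scale factor delays vacuum as tables grow:

```python
def vacuum_trigger(reltuples, scale_factor=0.20, threshold=50):
    """Dead-tuple count at which autovacuum kicks in, per the
    documented formula: threshold + scale_factor * reltuples."""
    return threshold + scale_factor * reltuples

for rows in (10_000, 1_000_000, 100_000_000):
    print(f"{rows:>11,} rows -> vacuum after {vacuum_trigger(rows):>12,.0f} dead rows")
```

A 100-million-row table must accumulate over 20 million dead rows before autovacuum fires with the defaults, which is why per-table overrides matter for large tables.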


Andreas 'ads' Scherbaum: PGConf.EU 2018 - Speaker Interviews

planet postgresql - 2018-10-01(Mon) 19:00:00

As in previous years, we are conducting interviews with our speakers ahead of this year's PostgreSQL Conference Europe.

The first interviews are already online: 2018 - Speaker Interviews

If you are speaking in Lisbon, and haven't already submitted your answers, please check your mailbox. There should be an email asking you if you want to participate.


Paul Ramsey: PostGIS Code Sprint 2018 #2

planet postgresql - 2018-10-01(Mon) 17:00:00

An important topic of conversation this sprint was what kinds of core PostgreSQL features might make PostGIS better in the future?

Parallel GIST Scan

The PostGIS spatial index is built using the PostgreSQL GIST index infrastructure, so anything that makes GIST scans faster is a win for us. This would be a big win for folks with large tables (and thus deep trees) and who run scans that return a lot of indexed tuples.

Faster GIST Index Building

B-Tree index builds are accelerated by pre-sorting the inputs; could the same trick be used in building GIST indexes? Again, for large tables, GIST index building is slower than B-Tree, and “faster” is the #1 feature all existing users want.

Multi-Threading in Functions

This isn’t a feature request, so much as a request for clarification and assurance: PostGIS calls out to other libraries, like GEOS, and it’s possible we could make some of our algorithms there faster via parallel processing. If we do our parallel processing within a function call, so the PostgreSQL thread of execution isn’t doing anything until we return, is it OK for us to do threading? We use pthreads, or maybe OpenMP.

Compressed Datum Slicing

“Detoast datum slice” doesn’t actually get around to the slicing step until after the datum is decompressed, which can make some queries quite slow. We already try to read boxes from the headers of our objects, and for large objects that means decompressing the whole thing: it would be nice to only decompress the first few bytes. I have an ugly patch I will be testing to try and get committed.

Forced Inlining

A problem we have with PostgreSQL right now is that we cannot effectively cost our functions due to the current inlining behaviour on our wrapper functions like ST_Intersects(). When raised on the list, the hackers came to a tentative conclusion that improving the value caching behaviour would be a good way to avoid having inlining in incorrect places. It sounded like subtle and difficult work, and nobody jumped to it.

We propose leaving the current inlining
