planet postgresql

Subscribe to planet postgresql のフィード
Planet PostgreSQL
更新: 2時間 27分 前

Bruce Momjian: "Get Off My Lawn"

2018-10-17(水) 23:45:01

As you might know, Postgres is old-school in its communication methods, relying on email for the bulk of its development discussion, bug reports, and general assistance. That's the way it was done in 1996, and we are still doing it today. Of course, some projects have moved on to github or Slack.

As a project, we try to take advantage of new technologies while retaining technologies that are still optimal, and email is arguably one of them. Email hasn't changed much since 1996, but email clients have. Previously all email was composed using a text-editor-like interface, which allowed for complex formatting and editing. New email tools, like Gmail, provide a more simplified interface, especially on mobile devices. This simplified interface is great for composing emails while commuting on a train, but less than ideal for communicating complex topics to thousands of people.

This email captures the requirements of communicating technical points to thousands of people, and the need for fine-grained composition. Basically, when you are emailing thousands of people, taking time to make the communication clear is worthwhile. However, this email explains the reality that many email tools are more tailored for effortless communication on devices with limited features.

Continue Reading »

カテゴリー: postgresql

Craig Kerstiens: Commenting your Postgres database

2018-10-17(水) 23:36:00

At Citus whether it’s looking at our own data or helping a customer debug a query I end up writing a lot of SQL. When I do write SQL I do my best to make sure it’s readable in case others need to come along and understand or modify, but admittedly I do have some bad habits from time to time such as using implicit joins. Regardless of my bad habits I still try to make my SQL and database as easy to understand for someone not already familiar with it. One of the biggest tools for that is comments.

Even early on in learning to program we take advantage of comments to explain and describe what our code is doing, even in times when it seems obvious. I see this less commonly in SQL and databases, which is a shame because data is just as valuable so making it easier to reason and work with seems logical. Postgres has a few great mechanisms you can start leveraging when it comes to commenting so you can better document things.

Inline commenting your queries

The place I most use comments is in larger queries I write. Yes, [common table expressions] can be an optimization fence, but they also allow you to create building blocks within your SQL making a query easier to understand. Though you shouldn’t stop there. Within SQL you can have a line comment when you preface it with --. You can do this at the start of a line, or at the end with what follows becoming a comment. We can see this in action on our query example from our earlier CTE blog post:

-- Getting a list of all opportunities opened longer than 30 days ago, but earlier than 60 days ago WITH opp_list AS ( SELECT opportunities.id as opportunity_id, account_id, accounts.name as account_name, opportunities.amount as opportunity_amount, opportunities.created_at as opportunity_created FROM opportunities, accounts WHERE opportunities.created_at <= now() - '30 days'::interval AND opportunities.created_at >= now() - '60 days'::interval AND opportunities.account_id = accounts.id ), -- Get a list of all contacts we[...]
カテゴリー: postgresql

Nickolay Ihalainen: PostgreSQL locking, Part 1: Row Locks

2018-10-16(火) 23:26:00

An understanding of PostgreSQL locking is important to build scalable applications and avoid downtime. Modern computers and servers have many CPU cores and it’s possible to execute multiple queries in parallel. Databases containing many consistent structures with changes made by queries or background processes running in parallel could crash a database or even corrupt data. Thus we need the ability to prevent access from concurrent processes, while changing shared memory structures or rows. One thread updates the structure while all others wait (exclusive lock), or multiple threads read the structure and all writes wait. The side effect of waits is a locking contention and server resources waste. Thus it’s important to understand why waits happen and what locks are involved. In this article, I review PostgreSQL row level locking.

In follow up posts, I will investigate table-level locks and latches protecting internal database structures.

Row locks – an overview

PostgreSQL has many locks at different abstraction levels. The most important locks for applications are related to MVCC implementation – row level locking. In second place – locks appearing during maintenance tasks (during backups/database migrations schema changes) – table level locking. It’s also possible—but rare—to see waits on low level PostgreSQL locks. More often there is a high CPU usage, with many concurrent queries running, but overall server performance reduced in comparison with normal number of queries running in parallel.

Example environment

To follow along, you need a PostgreSQL server with a single-column table containing several rows:

postgres=# CREATE TABLE locktest (c INT); CREATE TABLE postgres=# INSERT INTO locktest VALUES (1), (2); INSERT 0 2 Row locks

Scenario: two concurrent transactions are trying to select a row for update.

PostgreSQL uses row-level locking in this case. Row level locking is tightly integrated with MVCC implementation, and uses hidden xmin and xmax fields.

xmin  and xmax  store the transaction id. All statement[...]
カテゴリー: postgresql

Kaarel Moppel: Ideas for scaling PostgreSQL to multi-terabyte and beyond

2018-10-16(火) 17:00:27

After a recent chat with a DBA, making his first steps towards migrating a big database away from Oracle to Postgres, I thought there must be just too few articles on PostgreSQL features and “hacks”, aiding in achieving scalability to squeeze the last out of the hardware to safely accomodate some decent multi-terabyte size databases. Especially as there are quite a few options out there, I was very surprised that there was so much fear that Postgres is somehow very limited in the scaling matters. Well maybe it was indeed once so (I started with Postgres in 2011), but in year 2018 things are pretty solid actually – so please do read on for some ideas how to juggle terabytes with ease.

Standard Postgres facilities

If you don’t like to sweat too much and do some pioneering then the safest way to scale of course would be to stick with proven out-of-the-box features of Postgres – so first I’d recommend to take a look at the following keywords with some short explanations and maybe it’s all that you need.

  • Light-weight / special purpose indexes

For a complex OLTP system, supporting hundreds of freaky queries, it is very common that the indexes actually take much more disk space than the table files holding the data. To improve on that (especially for indexes that are used infrequently) one can reduce the index sizes drastically with appropriate use of partial, BRIN, GIN or even a bit experimental BLOOM indexes. In total there are 7 different index types supported…but mostly people only know about and use the default B-tree – a big mistake in a multi-TB setting!

Partial indexes allow indexing only a subset of the data – for example in a sales system we might not be interested in fast access to orders in status “FINISHED” (some nightly reports deal with that usually and they can take their time), so why should we index such rows?

GIN, the most know non-default index type perhaps, has been actually around for ages (full-text search) and in short is perfect for indexing columns where there are lot of repeating values – th

[...]
カテゴリー: postgresql

Tim Colles: Hold the exact(ish) “CREATE VIEW” definition in the PostgreSQL System Catalog

2018-10-16(火) 00:15:36

It would be rather nice if the exact SQL definition (well almost, see below) supplied with the “create view” statement could be held in the PostgreSQL System Catalog, rather than the re-written optimised version that is currently held. This has come up before, see:

http://www.postgresql-archive.org/idea-storing-view-source-in-system-catalogs-td1987401.html
http://www.postgresql-archive.org/Preserving-the-source-code-of-views-td5775163.html

Back then this idea got stomped on for good reasons but I am not suggesting quite the same.

We would prefer PostgreSQL to store view definitions in the system catalog *AFTER* column expansion (unlike in the discussion above) but *BEFORE* internal parser rewrites. So, currently for example:

CREATE TABLE bar ( x INTEGER, y TEXT );

CREATE VIEW foo(x, y) AS
SELECT * FROM bar WHERE y IN ( ‘a’, ‘b’ ) AND x IS NOT DISTINCT FROM 0;

SELECT view_definition FROM information_schema.views
WHERE table_name = ‘foo’;

SELECT bar.x, bar.y
FROM bar
WHERE ((bar.y = ANY (ARRAY[‘a’::text, ‘b’::text])) AND (NOT (bar.x IS DISTINCT FROM 0)));
(1 row)

The column expansion and corresponding rewrite is mandatory at the time of creation (see below), but the other changes in the definition are not. They make comparing the current definition of a view as it is in a database against an externally maintained definition difficult, which is what we in particular want to be able to do.

The SQL standard (at least the ancient version I have access to) for “information_schema” seems rather open to interpretation on this issue. It says that the view definition column:

“contains a representation of the view descriptors”

What do we take “representation” to be – presumably a completely mangled internally optimised “blob” would be compliant (but otherwise useless). The spirit of the statement might be more reasonably taken to be “a human legible representation”. In that case how much rewrite should actually be allowed – to me the rewritten definition above is already getting less legible than the original, but may

[...]
カテゴリー: postgresql

Bruce Momjian: Three-Year Cycle

2018-10-15(月) 23:45:02

In the early years of Postgres's open source development, we focused on features that could be accomplished in a few weekends. Within a few years, we had completed many of those, and were challenged trying to accomplish big project with a mostly volunteer workforce. Soon, however, large companies started to sponsor developers' time and we launched into big feature development again.

Currently, the calendar is our only real challenge. We have major releases every year, but many features take multiple years to fully implement. We have seen this multi-year process with:

  • Windows port
  • Point-in-time recovery
  • JSON
  • Streaming replication

and are in the process of completing even more:

  • Parallelism
  • Partitioning
  • JIT
  • Sharding

Continue Reading »

カテゴリー: postgresql

Pavel Trukhanov: PgBouncer monitoring improvements in recent versions

2018-10-15(月) 22:48:47
PgBouncer monitoring improvements in recent versions

As I wrote in my previous article “USE, RED and real world PgBouncer monitoring” there are some nice commands in PgBouncer’s admin interface that allow to collect stats how things going and spot problems, if you know where to look.

This post is about new stats added in these commands in new PgBouncer versions.

So as you know, SHOW STATS shows cumulative stats for each proxied DB:

Since PgBouncer version 1.8 there’s a couple of new columns in its output.

First one — total_xact_time — total number of microseconds spent by pgbouncer when connected to PostgreSQL in a transaction, either idle in transaction or executing queries.

This will allow us to chart db pool utilization in terms of time spent in transactions and compare it to the query time utilization:

We see two totally different situations — while database is used for serving queries only 5 to 25 % of the time, PgBouncer connections around 8:00 am spend up to 70% of time in transactions!

But this total_xact_time is useful in one more very important way.

There’s a known anti-pattern in Postgres usage, is when your application opens up a transaction, makes a query and then starts doing something else, for example some CPU-heavy calculation on that result or query to some other resource/service/database, while transaction keeps hangging. Later this app will probably return back to this transaction and might, for example, update something and commit it. The bad thing in that case is that there’s a corresponding Postgres backend process, that sits there doing nothing, while transaction is idling. And Postgres backends are somewhat expensive.

Your app should avoid such behavior.

This idle in transaction state might be monitored in the Postgres itself — there’s state column in pg_stat_activity system view. But pg_stat_activity provides us only with a snapshot of current states, that leads to possible false negative errors in reporting occurrences of such cases. Using PgBouncer's stats we can calculate a percenta

[...]
カテゴリー: postgresql

pgCMH - Columbus, OH: Joining the herd

2018-10-15(月) 13:00:00

The Oct meeting will be held at 18:00 EST on Tues, the 23rd. Once again, we will be holding the meeting in the community space at CoverMyMeds. Please RSVP on MeetUp so we have an idea on the amount of food needed.

What

Our very own Douglas will be presenting this month. He’s going to tell us all about how you can join the PostgreSQL community and contribute the growth and success of PostgreSQL.

Where

CoverMyMeds has graciously agreed to validate your parking if you use their garage so please park there:

You can safely ignore any sign saying to not park in the garage as long as it’s after 17:30 when you arrive.

Park in any space that is not marked ‘24 hour reserved’.

Once parked, take the elevator/stairs to the 3rd floor to reach the Miranova lobby. Once in the lobby, the elevator bank is in the back (West side) of the building. Take a left and walk down the hall until you see the elevator bank on your right. Grab an elevator up to the 11th floor. (If the elevator won’t let you pick the 11th floor, contact Doug or CJ (info below)). Once you exit the elevator, look to your left and right; one side will have visible cubicles, the other won’t. Head to the side without cubicles. You’re now in the community space:

The kitchen is to your right (grab yourself a drink) and the meeting will be held to your left. Walk down the room towards the stage.

If you have any issues or questions with parking or the elevators, feel free to text/call Doug at +1.614.316.5079 or CJ at +1.740.407.7043

カテゴリー: postgresql

Bruce Momjian: Installing PL/v8

2018-10-13(土) 00:00:02

PL/v8 is the JavaScript server-side language for Postgres. It has been available for several years, but a change in the way Google packages the v8 languages has made it burdensome for packagers to build PL/v8 packages.

Therefore, few package managers still distribute PL/v8. This is disappointing since it undercuts some of our NoSQL story. We previously supported JSON storage and a JSON-specific server-side language. This second option is effectively no longer available, and those users who are using PL/v8 will need to find alternatives. This highlights the risk of software significantly relying on other software that it does not control and cannot maintain itself.

カテゴリー: postgresql

Ibrar Ahmed: Track PostgreSQL Row Changes Using Public/Private Key Signing

2018-10-12(金) 22:43:31

Authorisations and encryption/decryption within a database system establish the basic guidelines in protecting your database by guarding against malicious structural or data changes.

What are authorisations?

Authorisations are the access privileges that mainly control what a user can and cannot do on the database server for one or more databases. So consider this to be like granting a key to unlock specific doors. Think of this as more like your five star hotel smart card. It allows you access all facilities that are meant for you, but doesn’t let you open every door. Whereas, privileged staff have master keys which let them open any door.

Similarly, in the database world, granting permissions secures the system by allowing specific actions by specific users or user groups, yet it allows database administrator to perform whatever action(s) on the database he/she wishes. PostgreSQL provides user management where you can can create users, and grant and revoke their privileges.

Encryption

Encryption, decryption can protect your data, obfuscate schema structure and help hide code from prying eyes. Encryption/decryption hides the valuable information and ensures that there are no mischievous changes in the code or data that may be considered harmful. In almost all cases, data encryption and decryption happens on the database server. This is more like hiding your stuff somewhere in your room so that nobody can see it, but also making your stuff difficult to access.

PostgreSQL also provides encryption using pgcrypto (PostgreSQL extension). There are some cases where you don’t want to hide the data, but don’t want people to update it either. You can revoke the privileges to modify the data.

Data modifications

But what if an admin user modifies the data? How you can identify that data is changed? If somebody changes the data and you don’t know about, then it is more dangerous than you losing your data, as you are relying on data which may no longer be valid.

Logs in database systems allow us to track back changes and “po

[...]
カテゴリー: postgresql

Shaun M. Thomas: PG Phriday: Studying Stored Procedures in Postgres 11

2018-10-12(金) 16:00:47
Studying Stored Procs in Postgres 11

With Postgres 11 looming on the near horizon, it’s only appropriate to check out a recent beta and kick the tires a few times. Whether it’s improvements in parallelism, partitions, stored procedures, JIT functionality, or any number of elements in the release page, there’s a lot to investigate.

It just so happens that I ran across a fortuitous event on Twitter when deciding on an appropriate topic. Behold!

Wasting XIDs has never been this fun

Wait! No! That’s not what stored procedures are for!

I felt so good like anything was possible

When confronted with such a blatant corruption of such a cool new feature, it’s only natural to question the wisdom of doing so. It is, after all, not a great idea to programatically consume transaction IDs. I said as much and moved on with life, certain the worst was over.

Then this happened.

Magnus helpfully recommends proceeding

Now, Magnus is infamous for two things: his technical acumen, and giddy malevolence. His advocacy of a stored procedure named “waste_xid” only proved nobody anywhere should ever run this anywhere, lest they immolate whatever system hosted the database instance.

But hey, VMs are cheap; let’s break things. How else can we learn the limits of our new toys, but by virtually atomizing them?

I hit cruise control and rubbed my eyes

Before we saddle our poor innocent Postgres 11 installation with an inadvisable stored procedure designed specifically to underhandedly reap its transaction lifespan, we should probably make the routine as evil as possible.

One thing stands out immediately: calling EXECUTE is unnecessary overhead. According to the information function documentation, txid_current will assign a new transaction ID if there isn’t one already. Since the stored procedure is constantly committing, that’s extremely handy. And since this is Pl/pgSQL, we can use direct assignment instead.

Our new procedure looks something like this:

CREATE OR REPLACE PROCEDURE waste_xid(cnt int) AS $$ DECLARE i INT; x BIGINT; B[...]
カテゴリー: postgresql

Pavel Stehule: New functionality of plpgsql_extension

2018-10-11(木) 23:40:00
I am working on new release of plpgsql_check extension - https://github.com/okbob/plpgsql_check.

Interesting new function is possibility to return list of used relation and functions. With these information is easy to generate dependency graph:

postgres=# \sf fx
CREATE OR REPLACE FUNCTION public.fx()
RETURNS void
LANGUAGE plpgsql
AS $function$
begin
perform upper(((plus(a) + 200) * 100)::text) from xx;
end;
$function$
postgres=# select * from plpgsql_show_dependency_tb('fx()');
┌──────────┬───────┬────────┬──────┬───────────┐
│ type │ oid │ schema │ name │ params │
╞══════════╪═══════╪════════╪══════╪═══════════╡
│ FUNCTION │ 18310 │ public │ plus │ (integer) │
│ RELATION │ 24576 │ public │ xx │ │
└──────────┴───────┴────────┴──────┴───────────┘
(2 rows)
カテゴリー: postgresql

Bruce Momjian: Multi-Host Libpq

2018-10-11(木) 03:45:01

Libpq is used by many client interface languages to communicate with the Postgres server. One new feature in Postgres 10 is the ability to specify multiple servers for connection attempts. Specifically, it allows the connection string to contain multiple sets of host, hostaddr, and port values. These are tried until one connects.

NoSQL solutions have used this method of multi-host connectivity for a while, so it is good Postgres can now do it too. It doesn't have all the features of a separate connection pooler, but it doesn't have the administrative or performance overhead of a separate connection pooler either, so it certainly fits a need for some environments.

カテゴリー: postgresql

Jonathan Katz: Won’t You Be My Neighbor? Quickly Finding Who is Nearby

2018-10-10(水) 23:01:54

Many applications these days want us to know how close we are to things:

  • What are the three closest coffee shops to my current location?
  • Which is the nearest airport to the office?
  • What are the two closest subway stops to the restaurant?

and countless more examples.

Another way of asking these questions is to say “who are my nearest neighbors to me?” This maps to a classic algorithmic problem: efficiently finding the K-nearest neighbors (or K-NN), where K is a constant. For example, the first question would be a 3-NN problem as we are trying to find the 3 closest coffee shops.

(If you are interested in learning more about K-NN problems in general, I highly recommend looking at how you can solve this using n-dimensional Voronoi diagrams, a wonderful data structure developed in the field of computational geometry.)

How can we use PostgreSQL to help us quickly find our closest neighbors? Let’s explore.

カテゴリー: postgresql

Laurenz Albe: Are your foreign keys indexed?

2018-10-10(水) 17:01:23
© Laurenz Albe 2018

 

Foreign key constraints are an important tool to keep your database consistent while also documenting relationships between tables.

A fact that is often ignored is that foreign keys need proper indexing to perform well.

This article will explain that and show you how to search for missing indexes.

Index at the target of a foreign key

In the following, I will call the table on which the foreign key constraint is defined the source table and the referenced table the target table.

The referenced columns in the target table must have a primary key or unique constraint. Such constraints are implemented with unique indexes in PostgreSQL. Consequently, the target side of a foreign key is automatically indexed.

This is required so that there is always a well-defined row to which the foreign key points. The index also comes handy if you want to find the row in the target table that matches a row in the source table.

Index at the source of a foreign key

In contrast to the above, PostgreSQL requires no index at the source of a foreign key.

However, such an index is quite useful for finding all source rows that reference a target row. The typical cases where you need that are:

1. You perform a join between the two tables where you explicitly search for the source rows referencing one or a few target rows. If there is an index on the columns at the source, PostgreSQL can use an efficient nested loop join.

This is well known and pretty obvious.

2. You delete rows or update key columns in the target table.

Then PostgreSQL has to check if the foreign key constraint is still satisfied. It does so by searching if there are rows in the source table that would become orphaned by the data modification. Without an index, this requires a sequential scan of the source table.

An example

Let’s build a source and a target table:

-- to make the plans look simpler SET max_parallel_workers_per_gather = 0; -- to speed up CREATE INDEX SET maintenance_work_mem = '512MB'; CREATE TABLE target ( t_id integer NOT NULL, [...]
カテゴリー: postgresql

Andrew Dunstan: Managing Freezing in PostgreSQL

2018-10-10(水) 16:26:18

Postgres contains a moving event horizon, which is in effect about 2 billion transactions ahead of or behind the current transaction id. Transactions more than 2 billion ahead of or more than 2 billion behind the current transaction id are considered to be in the future, and will thus be invisible to current transactions.

Postgres avoids this catastrophic data loss by specially marking old rows so that no matter where they are in relation to the current transaction id they will be visible.

Freezing is this process of marking old live tuples (i.e. database rows) so that they don’t get run over by the moving event horizon that would otherwise make them appear to be in the future. This is in contrast to vacuuming, which is the freeing up of space consumed by old dead tuples that are no longer visible to any transaction.

Both processes are managed by vacuum.

There are a number of settings that govern how freezing is done.

First, vacuum_freeze_min_age governs whether or not a tuple will be frozen while vacuum is already looking at a page to see if it has dead tuples that can be cleaned up. Tuples older than vacuum_freeze_min_age will be frozen in this case. Setting this low means that there will be less work to do later on, but at the possible cost of extra effort both in CPU and IO or WAL activity. Generally you probably want this set to at least a few hours worth of transactions. Let’s say you’re expecting to do up to 2000 transactions per second as a sustained rate. 2000 TPS is 7.2m transactions per hour. Thus a fairly aggressive setting for this case might be say 20m. The default setting is 50m. Similarly for vacuum_multixact_freeze_min_age. Note that the transaction_id and multixid counters are independent – you need to keep track of both of them.

Second, there are vacuum_freeze_table_age and vacuum_multixact_freeze_table_age. These settings govern when autovacuum will not just look at pages that might have dead rows, but any page that might have unfrozen rows. The defaults for these settings are 150m. If you hav

[...]
カテゴリー: postgresql

Fernando Laudares Camargos: PostgreSQL Monitoring: Set Up an Enterprise-Grade Server (and Sign Up for Webinar Weds 10/10…)

2018-10-10(水) 01:23:44

This is the last post in our series on building an enterprise-grade PostgreSQL set up using open source tools, and we’ll be covering monitoring.

The previous posts in this series discussed aspects such as security, backup strategy, high availability, connection pooling and load balancing, extensions, and detailed logging in PostgreSQL. Tomorrow, Wednesday, October 10 at 10AM EST, we will be reviewing these topics together, and showcasing then in practice in a webinar format: we hope you can join us!

Register Now

 

Monitoring databases

The importance of monitoring the activity and health of production systems is unquestionable. When it comes to the database, with its high number of customizable settings, the ability to track its various metrics (status counters and gauges) allows for the maintenance of a historical record of its performance over time. This can be used for capacity planningtroubleshooting and validation.

When it comes to capacity planning, a monitoring solution is a helpful tool to help you assess how the current setup is faring. At the same time, it can help predict future needs based on trends, such as the increase of active connections, queries, and CPU usage. For example, an increase in CPU usage might be due to a genuine increase in workload, but it could also be a sign of unoptimized queries growing in popularity. In which case, comparing CPU with disk access might provide a more complete view of what is going on.

Being able to easily correlate data like this helps you to catch minor issues and to plan accordingly, sometimes allowing you to avoid an easier but more costly solution of scaling up to mitigate problems like this. But having the right monitoring solution is really invaluable when it comes to investigative work and root cause analysis. Trying to understand a problem that has already taken place is a rather complicated, and often unenviable, task unless you established a continuous, watchful eye on the set up for the whole time.

Finally, a monitoring solution can help you valida

[...]
カテゴリー: postgresql

Bruce Momjian: Trigger Me Writable

2018-10-09(火) 23:30:01

Postgres support for hot standby servers allows read-only queries to be run on standby servers, but how are read-only sessions handled when promoting a standby server to primary? After a standby is promoted to primary, new connections are read/write, but existing connections also change to read/write:

SELECT pg_is_in_recovery(); pg_is_in_recovery ------------------- t SHOW transaction_read_only; transaction_read_only ----------------------- on \! touch /u/pg/data2/primary.trigger -- wait five seconds for the trigger file to be detected CREATE TABLE test (x INTEGER); SELECT pg_is_in_recovery(); pg_is_in_recovery ------------------- f SHOW transaction_read_only; transaction_read_only ----------------------- off

Continue Reading »

カテゴリー: postgresql

Fernando Laudares Camargos: Detailed Logging for Enterprise-Grade PostreSQL

2018-10-08(月) 22:13:21

In this penultimate post from our series on building an enterprise-grade PostgreSQL environment we cover the parameters we have enabled to configure detailed logging in the demo setup we will showcase in our upcoming webinar.

Detailed logging in PostgreSQL and log analyzer

Like other RDBMS, PostgreSQL allows you to maintain a log of activities and error messages. Until PostgreSQL 9.6, PostgreSQL log files were generated in pg_log directory (inside the data directory) by default. Since PostgreSQL 10, pg_log has been renamed to simply log. However, this directory can be modified to a different location by modifying the parameter log_directory.

Unlike MySQL, PostgreSQL writes the error and activity log to the same log file thus it may grow to several GBs when detailed logging is enabled. In these cases, logging becomes IO-intensive thus it is recommended to store log files in a different storage to the one hosting the data directory.

Parameters to enable detailed logging

Here’s a list of parameters used to customize logging in PostgreSQL. All of them need to be modified in the postgresql.conf or postgresql.auto.conf files.

logging_collector: in order to log any activity in PostgreSQL this parameter must be enabled. The backend process responsible for logging database activity is called logger, it gets started when logging_collector is set to ON. Changing this parameter requires a PostgreSQL restart.

log_min_duration_statement: this parameter is used primarily to set a time threshold: queries running longer than such should be logged (as “slow queries”). Setting it to -1 disables logging of statements. Setting it to 0 enables the logging of every statement running in the database, regardless of its duration. The time unit should follow the actual value, for example: 250ms,  250s, 250min, 1h. Changing this parameter does not require a PostgreSQL restart – a simple reload of the configuration is enough.reload but not a restart. For example:

log_min_duration_statement = 5s   logs every statement running for 5 seconds o[...]
カテゴリー: postgresql

Brian Fehrle: Dynamic Monitoring of PostgreSQL Instances Using pg_top

2018-10-08(月) 18:57:00

Monitoring PostgreSQL can, at times, be like trying to wrangle cattle in a thunderstorm. Applications connect and issue queries so quickly, it’s hard to see what’s going on or even get a good overview of the performance of the system other than the typical developer complaining ‘things are slow, help!’ kind of requests.

In previous articles, we’ve discussed how to Get to the Source when PostgreSQL is acting slow, but when the source is specifically queries, basic level monitoring may not be enough for assessing what’s going on in an active live environment.

Enter pg_top, a PostgreSQL specific program to monitor real time activity in a database, as well as view basic information for the database host itself. Much like the linux command ‘top’, running it brings the user into a live interactive display of database activity on the host, refreshing automatically in intervals.

Installation

Installing pg_top can be done in the generally expected ways: package managers and source install. The most recent version as of this article is 3.7.0.

Package Managers

Based on the distribution of linux in question, search for pgtop or pg_top in the package manager, it’s likely available in some aspect for the installed version of PostgreSQL on the system.

Red Hat based distros:

# sudo yum install pg_top

Gentoo based distros:

# sudo apt-get install pgtop Source

If desired, pg_top can be installed via source from the PostgreSQL git repository. This will provide any version desired, even newer builds that are not yet in the official releases.

Features

Once installed, pg_top works as a very accurate real time view into the database it is monitoring and using the command line to run ‘pg_top’ will launch the interactive PostgreSQL monitoring tool.

The tool itself can help shed light on all processes currently connected to the database.

Running pg_top

Launching pg_top is the same as the unix / linux style ‘top’ command itself, along with connection information to the database.

To run pg_top on a local database host:

pg_top -h localhost[...]
カテゴリー: postgresql

ページ