Feed Aggregator

Bruce Momjian: Client Row Access Control

planet postgresql - 2018-09-06 (Thu) 02:30:01

Usually the database administrator controls who can access database data. However, it is possible for clients to completely control who can access data they add to the database, with the help of openssl.

First, let's create RSA keys for three users from the command line. We first create an RSA public/private key pair for each user in their home subdirectory and then make a copy of their RSA public key in the shared directory /u/postgres/keys:

# # must be run as the root user
# cd /u/postgres/keys
# for USER in user1 user2 user3
> do    mkdir ~"$USER"/.pgkey
>       chown -R "$USER" ~"$USER"/.pgkey
>       chmod 0700 ~"$USER"/.pgkey
>       openssl genpkey -algorithm RSA -out ~"$USER"/.pgkey/rsa.key
>       chmod 0600 ~"$USER"/.pgkey/*
>       openssl pkey -in ~"$USER"/.pgkey/rsa.key -pubout -out "$USER".pub
> done

Continue Reading »

Category: postgresql

Dave Page: Why do we install as root?

planet postgresql - 2018-09-05 (Wed) 20:04:00
A couple of common questions I hear from customers (usually long-time users of a particular database from Redwood) via our guys in the field are “why do we install our software as root?” and “why do we run services as postgres?”. The simple TL;DR answer is “for security”. For a detailed explanation, read on…
A basic principle when securing a software installation is “install with maximum privilege requirements and run with minimal”. In practice this equates to having the software installed, and its binaries/executables etc. owned, by the root user, whilst the services themselves are actually run under a minimally privileged (and ideally dedicated) service user account, typically postgres in a PostgreSQL installation. Data files, and any other files that need to be modified by the software in normal operation, are also owned by the service user account.
Let’s look at the running software first. Postgres (which will in fact refuse to run as root) is a server process which is often running on a network port that is accessible from other nodes on the network. Of course, we should limit access as much as possible to only those nodes that need access, using both a firewall (even simple iptables rules will work) and Postgres’ pg_hba.conf access control file, but even with those measures in place, it’s possible that a determined attacker (let’s call him Zero Cool) can still gain access to the port the database server is running on.
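As an aside, and purely as an illustration rather than anything from the post, a minimal pg_hba.conf sketch of that kind of restriction might look like this (the subnet, database and role names are made up):

# pg_hba.conf: allow only the application subnet over TLS, reject everything else
hostssl  appdb   appuser   10.0.1.0/24   scram-sha-256
host     all     all       0.0.0.0/0     reject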
Once our arch-nemesis Zero Cool has access to the database server port, he needs a way to escalate his attack. This may involve exploiting an unknown security issue in Postgres itself (as with any software, we hope there are none, but we’re kidding ourselves if we think it’s totally secure), or it may be that he’s used other techniques, such as social engineering, to learn a user’s credentials.
If Zero gains “regular” access to Postgres, then he will be subject to any security measures (access control lists, RLS policies etc) that limit the scope of what the user account he’s used can ac[...]
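The excerpt is cut off here. As a reminder of what such measures look like in practice, a minimal row-level security sketch (table, column and role names are made up, not taken from the post):

-- Each account only sees its own rows; RLS applies to ordinary users, not superusers.
CREATE TABLE orders (id serial PRIMARY KEY, owner name NOT NULL, total numeric);
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
CREATE POLICY orders_owner_only ON orders
    USING (owner = current_user);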
Category: postgresql

Christophe Pettus: CHAR: What is it good for?

planet postgresql - 2018-09-05 (Wed) 04:00:33

In addition to the familiar text types VARCHAR and TEXT, PostgreSQL has a type CHAR. It’s little used… and that’s for a reason. It has some very unusual behaviors, which can be quite a surprise if you are not expecting them.

First, CHAR is a fixed-width type. When character data is stored in it, it’s padded out with spaces if it is not full length:

xof=# create table chars (c char(20));
CREATE TABLE
xof=# insert into chars values('x');
INSERT 0 1
xof=# select * from chars;
          c
----------------------
 x
(1 row)

OK, that’s reasonable, right? But what is going on here?

xof=# select length(c) from chars;
 length
--------
      1
(1 row)

xof=# select substring(c from 8 for 1) = ' '::char(1) from chars;
 ?column?
----------
 t
(1 row)

xof=# select substring(c from 8 for 1) = ' '::varchar(1) from chars;
 ?column?
----------
 f
(1 row)

xof=# select length(substring(c from 8 for 1)) from chars;
 length
--------
      0
(1 row)

xof=# select c || 'y' from chars;
 ?column?
----------
 xy
(1 row)

CHAR, when actually used, first trims off all trailing spaces, then applies the operation. It is trying to simulate a variable-length type, for historic reasons. This can be quite surprising, since a supposedly fixed-length type suddenly starts behaving as if it were variable. Unless you are terribly nostalgic for punched cards, CHAR is generally not what you want.

Is there ever a time to use CHAR? Not really. If you have a single-character enumeration that can never be either '' (the empty string) or ' ' (a single space), it might be more logical to store it as CHAR(1) rather than VARCHAR, but any space savings will be minimal and highly dependent on the alignment of the surrounding items.

And for n > 1, just use VARCHAR… or TEXT. (Remember that in PostgreSQL, VARCHAR and TEXT are stored the same way.)

Category: postgresql

Craig Kerstiens: 12 Factor: Dev/prod parity for your database

planet postgresql - 2018-09-05 (Wed) 03:29:00

The twelve-factor app changed the way we build SaaS applications. Explicit dependency management, separating config from code, scaling out your app concurrently—these design principles took us from giant J2EE apps to apps that scale predictably on the web. One of these 12 factors has long stood out as a challenge when it comes to databases: dev/prod parity. Sure, you can run the exact same version of your database, and have a sandbox copy, but testing and staging with production data… that’s a different story.

Dev/Prod parity is easy until it’s not

Running the same version of your database in development as in production should be a no-brainer. Just do it. If you use Postgres 10 in production, make sure to use the same version of Postgres in dev. For Postgres at least, you’re usually pretty safe on point releases, so between 10.1 and, say, 10.3 you may not have to worry too much, but at least keep the major version of the database the same between dev and prod.

Now that we have the easy part out of the way, the complexity starts to come with the data. The version of your database is one piece, but how your data interacts is equally key. If your perfectly pristine development data set doesn’t attempt to create hotspots in your data, violate constraints, or in general push the boundaries of what’s allowed, then there is a good chance you’ll run into issues when you deploy to production.

Remember that time you added an index to speed things up? Everything tested fine on your local dev box. Even in staging, against the 10 GB sample DB, it worked fine. You deployed, the index ran as part of your deploy script, and 30 minutes in you were madly figuring out how to cancel things as so many people showed up at your desk asking why the system was down.
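The rest of the post covers the remedies; one common Postgres practice for exactly this scenario, sketched here rather than quoted from the article (the table and index names are made up), is to build the index without blocking writes:

-- CONCURRENTLY builds the index without taking a write lock;
-- it cannot run inside a transaction block, so keep it out of transactional deploy scripts.
CREATE INDEX CONCURRENTLY idx_orders_created_at ON orders (created_at);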

Let’s start with database migrations, and indexes

We’re going to come back to dev/prod parity in a minute and how you can safely test operations against a production dataset. But first, let’s tackle two practices that you should put in place immediately to sav

[...]
Category: postgresql

Joshua Drake: PostgresConf Silicon Valley: Schedule now available

planet postgresql - 2018-09-05 (Wed) 02:23:00
PostgresConf Silicon Valley is being held October 15th-16th at the Hilton San Jose and the schedule is now available.

The two-day event received over 80 submissions! A lot of the usual and legendary speakers are present, but we were pleasantly surprised to find that so many new (to the conference) speakers submitted. It shows that the mission of People, Postgres, Data is growing at an amplified rate. The presentations are all of the "best-in-class quality" our attendees have come to expect from PostgresConf events.



Whether your needs are Big Data, Google Cloud, AWS RDS, GDPR Compliance, or you just have a burning desire to learn more about Kubernetes, PostgresConf Silicon Valley has you covered!
We also have two fantastic training opportunities which are the first of their kind. Join us on October 15th-16th, and remember: all roads lead to PostgresConf 2019 (more news on that soon)!


Category: postgresql

Paul Ramsey: Moving on to CrunchyData

planet postgresql - 2018-09-05 (Wed) 00:16:11

Today is my first day with my new employer Crunchy Data. Haven’t heard of them? I’m not surprised: outside of the world of PostgreSQL, they are not particularly well known, yet.

I’m leaving behind a pretty awesome gig at CARTO, and some fabulous co-workers. Why do such a thing?

While CARTO is turning in constant growth and finding good traction with some core spatial intelligence use cases, the path to success is leading them into solving problems of increasing specificity. Logistics optimization, siting, market evaluation.

Moving to Crunchy Data means transitioning from being the database guy (boring!) in a geospatial intelligence company, to being the geospatial guy (super cool!) in a database company. Without changing anything about myself, I get to be the most interesting guy in the room! What could be better than that?

Crunchy Data has quietly assembled an exceptionally deep team of PostgreSQL community members: Tom Lane, Stephen Frost, Joe Conway, Peter Geoghegan, Dave Cramer, David Steele, and Jonathan Katz are all names that will be familiar to followers of the PostgreSQL mailing lists.

They’ve also quietly assembled expertise in key areas of interest to large enterprises: security deployment details (STIGs, RLS, Common Criteria); Kubernetes and PaaS deployments; and now (ta da!) geospatial.

Why does this matter? Because the database world is at a technological inflection point.

Core enterprise systems change very infrequently, and only under pressure from multiple sources. The last major inflection point was around the early 2000s, when the fleet of enterprise proprietary UNIX systems came under pressure from multiple sources:

  • The RISC architecture began to fall noticeably behind x86 and particularly x86-64.
  • Pricing on RISC systems began to diverge sharply from x86 systems.
  • A compatible UNIX operating system (Linux) was available on the alternative architecture.
  • A credible support company (Red Hat) was available and speaking the language of the enterprise.

The timeline of the Linux tidal wave was (extrem

[...]
Category: postgresql

Marriya Malik: PostgreSQL installer by 2ndQuadrant – now supports OmniDB!

planet postgresql - 2018-09-04 (Tue) 21:30:33

PostgreSQL installer, or PGInstaller, is a user-friendly, graphical installation and configuration tool for PostgreSQL. With just a few clicks you can install PostgreSQL – versions 9.5, 9.6, 10 and 11 (beta) – on Windows, Linux and macOS.

The latest PGInstaller release includes support for OmniDB – an interactive and user-friendly database management tool to manage multiple databases in a unified workspace.

Using only the utilities bundled with PostgreSQL, the sole means of connecting to the database is psql. Psql works via the command line, which can be fairly tricky for new users, especially if they are migrating from another database and are not used to the interface. PGInstaller makes the connection process easier with OmniDB.

PGInstaller comes bundled with OmniDB, as an optional package that can be selected during the installation process, shown in the screenshot below (this screen will show in the third step of the PGInstaller installation process). To view the complete process, please go through this step-by-step installation guide.

Once the installation process is complete, launch OmniDB. This step varies across platforms and can be viewed in the “README” file that opens up right after the installation. I will be demonstrating the process on Windows.

  • Go to the location:
C:\Program Files\2ndQuadrant\PostgreSQL\OmniDB
  • Double click on “omnidb-app.exe” to start OmniDB

You should see the screen below

OmniDB allows users to manage multiple databases (PostgreSQL, Oracle, MySQL, MariaDB, etc) in a unified workspace with a user-friendly and fast-performing interface. In order to start managing your PostgreSQL database, you need to first establish a connection with it. Without any connection, your interface should look like the following:

To establish a connection, go to “Connections” in the top navigation bar and click on “New connection”

Enter the Server on which your database is hosted, along with the Port, Database and Username, and click on “Save data”

Before you start playing around with your database, you’d need to

[...]
Category: postgresql

Kaarel Moppel: Next feature release for the pgwatch2 monitoring tool

planet postgresql - 2018-09-04 (Tue) 16:01:10

With summer fading away it’s time to get busy again. Over the last couple of weeks I’ve taken time to work on our Open Source PostgreSQL monitoring tool pgwatch2 and, partly on request from a couple of bigger organizations using the tool on a larger scale, added another batch of useful management / scaling features plus some more minor enhancements from the Github queue. By the way, this is already the 4th “Feature Pack” in one and a half years, so after having implemented the features below we consider the software “ripe”, with no important features missing. We’re also glad that quite a few people have given their feedback recently, helping to improve the software even further and thus hopefully providing more value to the PostgreSQL community. Read on for a more detailed overview of the most important features in this v1.4.0 update.

Getting friendly with Ansible & Co

Similar to the last update, we have tried to make pgwatch2 easier to deploy on a larger scale. This time there is nothing new on the containerization front, but we have made repeatable, configuration-based deployments possible! Meaning one can add config files with connect strings, metric selections / intervals and the metric definitions themselves to some version control / configuration management / application deployment system and easily deploy the metrics collector to each required DB node, pushing metrics directly to InfluxDB or Graphite. This also works better for firewalled environments.

Performance

The previously supported centrally managed metrics gatherer / configuration database approach works as before, but for cases when the number of servers gets too large (hundreds and above) to be handled by one central gatherer without lag, one can now add a logical grouping label to the monitored hosts and then deploy separate gatherers for subset(s) of hosts based on that label. There are also other performance changes, like batching of metric storage requests and connection pooling, helping to i

[...]
Category: postgresql

PHP Day Curitiba 2018

php.net - 2018-09-04 (Tue) 06:49:30
Category: php

Pavel Trukhanov: PostgreSQL: why and how WAL bloats

planet postgresql - 2018-09-04 (Tue) 00:12:57
Today’s post is about the real life of PG’s Write-Ahead Log (WAL).

An almost short introduction

Any changes to a PostgreSQL database are first of all saved in the Write-Ahead Log, so they will never get lost. Only after that are the actual changes made to the data in memory pages (in the so-called buffer cache), and these pages are marked dirty, meaning they need to be synced to disk later.

For that there’s a Checkpoint process, run periodically, that dumps all the ‘dirty’ pages to disk. It also saves the position in the WAL (called the REDO point) up to which all changes have been synchronized.

So if a Postgres DB crashes, it will restore its state by sequentially replaying the WAL records from the REDO point. All the WAL records before this point are therefore useless for recovery, but they might still be needed for replication purposes or for Point In Time Recovery.

From this description a Super-Engineer might’ve figured out all the ways it can go wrong in real life :-) But in reality one usually does this in a reactive way: one needs to stumble upon a problem first.

WAL bloats #1

Our monitoring agent for every instance of Postgres will find WAL files and collect their number and total size.

Here’s a case of some strange x6 growth of WAL size and segment count:

What could that be?

WAL segments are considered unneeded, and can be removed, after a Checkpoint is made. This is why we check checkpoints first. Postgres has a special system view called pg_stat_bgwriter that has info on checkpoints:

  • checkpoints_timed: a counter of checkpoints triggered because the time elapsed since the previous checkpoint exceeded the checkpoint_timeout setting. These are so-called scheduled checkpoints.
  • checkpoints_req: a counter of checkpoints run because the amount of un-checkpointed WAL grew beyond the max_wal_size setting. These are requested checkpoints.
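A quick way to read those two counters from psql, shown here only as an illustration (the view and its columns are standard PostgreSQL):

SELECT checkpoints_timed, checkpoints_req
  FROM pg_stat_bgwriter;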

So let’s see:

We see that after 21 Aug checkpoints ceased to run. Though we would love to know the exact reason it’s so, we can’t ¯\_(ツ)_/¯

But as one might remember, Postgres is known to be prone to unexpected behavior due to long la

[...]
Category: postgresql

Pavel Stehule: New features for pspg

planet postgresql - 2018-09-02 (Sun) 17:20:00
I wrote some (I hope useful) features for pspg:

  • possibility to show/hide row numbers
  • possibility to hide/show row cursor

Download the 1.6.0 release from GitHub.
Category: postgresql

Hans-Juergen Schoenig: PostgreSQL: Parallel CREATE INDEX for better performance

planet postgresql - 2018-09-02 (Sun) 17:00:00

PostgreSQL 11 will shortly be released, and it is therefore time to take a look at one of the most important new features it provides: the ability to create indexes in parallel. For many years various commercial database vendors have already offered this feature, and we are glad that PostgreSQL has now joined this elite club of multi-core index creation, which will dramatically improve the usability of large database deployments in the future.

Creating large tables in PostgreSQL

Since version 11, PostgreSQL supports classical “stored procedures”. The beauty is that a procedure can run more than one transaction, which is ideal if you want to generate huge amounts of random data. When you call generate_series to generate 1 million rows, PostgreSQL has to keep this data in memory, so generating hundreds of millions of random rows using more than one transaction can be really useful to reduce the memory footprint. Here is how it works:

CREATE TABLE t_demo (data numeric);

CREATE OR REPLACE PROCEDURE insert_data(buckets integer)
LANGUAGE plpgsql
AS $$
   DECLARE
      i int;
   BEGIN
      i := 0;
      WHILE i < buckets
      LOOP
         INSERT INTO t_demo SELECT random()
            FROM generate_series(1, 1000000);
         i := i + 1;
         RAISE NOTICE 'inserted % buckets', i;
         COMMIT;
      END LOOP;
      RETURN;
   END;
$$;

CALL insert_data(500);

This tiny bit of code loads 500 million random numeric values, which should be enough to demonstrate how CREATE INDEX can be improved in PostgreSQL 11. In our example 500 million rows translate to roughly 21 GB of data:

test=# \d+
                    List of relations
 Schema |  Name  | Type  | Owner | Size  | Description
--------+--------+-------+-------+-------+-------------
 public | t_demo | table | hs    | 21 GB |
(1 row)

The reason why I went for numeric is that numeric causes the most overhead of all number data types. Creating a numeric index is a lot more costly than indexing, say, int4 or int8. The goal is to see how much CP

[...]
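The excerpt stops here. For context, and purely as a sketch rather than a quote from the post, parallel index builds in PostgreSQL 11 are governed by the max_parallel_maintenance_workers setting (the index name below is made up; the table comes from the example above):

SET max_parallel_maintenance_workers TO 4;
CREATE INDEX idx_t_demo_data ON t_demo (data);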
Category: postgresql

Bruce Momjian: Cryptographically Authenticated Rows

planet postgresql - 2018-09-01 (Sat) 08:15:01

When storing data in the database, there is an assumption that you have to trust the database administrator to not modify data in the database. While this is generally true, it is possible to detect changes (but not removal) of database rows.

To illustrate this, let's first create a table:

CREATE TABLE secure_demo (
        id SERIAL,
        car_type TEXT,
        license TEXT,
        activity TEXT,
        event_timestamp TIMESTAMP WITH TIME ZONE,
        username NAME,
        hmac BYTEA);
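The post explains behind the link how the hmac column gets filled; purely as an illustration of the idea (not necessarily how the post does it), pgcrypto's hmac() could sign a concatenation of some of the row's values with a key that only the client knows:

CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- Illustrative only: the key never leaves the client, so the DBA cannot recompute the HMAC.
INSERT INTO secure_demo (car_type, license, activity, event_timestamp, username, hmac)
VALUES ('sedan', 'ABC123', 'parking', now(), 'user1',
        hmac('sedan|ABC123|parking|user1', 'client-secret-key', 'sha256'));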

Continue Reading »

Category: postgresql

Ibrar Ahmed: Tuning PostgreSQL Database Parameters to Optimize Performance

planet postgresql - 2018-09-01 (Sat) 00:38:27

Out of the box, the default PostgreSQL configuration is not tuned for any particular workload. Default values are set to ensure that PostgreSQL runs everywhere, with the fewest resources it can consume, and without exposing any vulnerabilities. It has default settings for all of the database parameters. It is primarily the responsibility of the database administrator or developer to tune PostgreSQL according to their system’s workload. In this blog, we will establish basic guidelines for setting PostgreSQL database parameters to improve database performance according to workload.

Bear in mind that while optimizing PostgreSQL server configuration improves performance, a database developer must also be diligent when writing queries for the application. If queries perform full table scans where an index could be used or perform heavy joins or expensive aggregate operations, then the system can still perform poorly even if the database parameters are tuned. It is important to pay attention to performance when writing database queries.

Nevertheless, database parameters are very important too, so let’s take a look at the eight that have the greatest potential to improve performance.

PostgreSQL’s Tuneable Parameters

shared_buffers

PostgreSQL uses its own buffer and also uses kernel buffered IO. That means data is stored in memory twice: first in the PostgreSQL buffer and then in the kernel buffer. Unlike other databases, PostgreSQL does not provide direct IO. This is called double buffering. The PostgreSQL buffer is called shared_buffers, and it is the most effective tunable parameter for most operating systems. This parameter sets how much dedicated memory PostgreSQL will use for its cache.

The default value of shared_buffers is set very low and you will not get much benefit from it. It’s set low because certain machines and operating systems do not support higher values. But on most modern machines, you need to increase this value for optimal performance.
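For instance (the 4GB figure is only an illustrative value for a machine with around 16 GB of RAM; changing shared_buffers requires a server restart):

ALTER SYSTEM SET shared_buffers = '4GB';  -- written to postgresql.auto.conf
-- restart PostgreSQL (e.g. pg_ctl restart) for the new value to take effect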

The recommended value is 25% of your total machine RAM. You should try

[...]
Category: postgresql

Bulgaria PHP Conference 2016

php.net - 2018-08-31 (Fri) 21:10:09
Category: php

Sebastian Insausti: How to Monitor PostgreSQL using Nagios

planet postgresql - 2018-08-31 (Fri) 18:33:28
Introduction

Regardless of the database technology, it is necessary to have a monitoring setup, both to detect problems and take action, and simply to know the current state of our systems.

For this purpose there are several tools, paid and free. In this blog we will focus on one in particular: Nagios Core.

What is Nagios Core?

Nagios Core is an Open Source system for monitoring hosts, networks and services. It allows you to configure alerts and has different states for them. It allows the implementation of plugins developed by the community, and even lets us configure our own monitoring scripts.

How to Install Nagios?

The official documentation shows us how to install Nagios Core on CentOS or Ubuntu systems.

Let's see an example of the necessary steps for the installation on CentOS 7.

Packages required

[root@Nagios ~]# yum install -y wget httpd php gcc glibc glibc-common gd gd-devel make net-snmp unzip

Download Nagios Core, Nagios Plugins and NRPE

[root@Nagios ~]# wget https://assets.nagios.com/downloads/nagioscore/releases/nagios-4.4.2.tar.gz
[root@Nagios ~]# wget http://nagios-plugins.org/download/nagios-plugins-2.2.1.tar.gz
[root@Nagios ~]# wget https://github.com/NagiosEnterprises/nrpe/releases/download/nrpe-3.2.1/nrpe-3.2.1.tar.gz

Add Nagios User and Group

[root@Nagios ~]# useradd nagios
[root@Nagios ~]# groupadd nagcmd
[root@Nagios ~]# usermod -a -G nagcmd nagios
[root@Nagios ~]# usermod -a -G nagios,nagcmd apache

Nagios Installation

[root@Nagios ~]# tar zxvf nagios-4.4.2.tar.gz
[root@Nagios ~]# cd nagios-4.4.2
[root@Nagios nagios-4.4.2]# ./configure --with-command-group=nagcmd
[root@Nagios nagios-4.4.2]# make all
[root@Nagios nagios-4.4.2]# make install
[root@Nagios nagios-4.4.2]# make install-init
[root@Nagios nagios-4.4.2]# make install-config
[root@Nagios nagios-4.4.2]# make install-commandmode
[root@Nagios nagios-4.4.2]# make install-webconf
[root@Nagios nagios-4.4.2]# cp -R contrib/eventhandlers/ /usr/local/nagios/libexec/
[root@Nagios nagios-4.4.2]# chown -R nagios:nagios /usr/local/nagios/libexec/eve[...]
Category: postgresql

PHP 7.3.0.beta3 Released

php.net - 2018-08-31 (Fri) 00:37:34
Category: php

Shaun M. Thomas: [Video] Data Integration with PostgreSQL

planet postgresql - 2018-08-30 (Thu) 15:02:01

Just in case you missed the live broadcast, the video of my presentation below covers various topics around integration of PostgreSQL with other data sources and database technologies.

This presentation covers the following topics:

  • What is a Foreign Data Wrapper?
  • How to query MySQL, a flat file, a Python script, a REST interface and a different Postgres node
  • Perform all of the above simultaneously
  • Take snapshots of this data for fast access
  • Tweak remote systems for better performance
  • Package as an API for distribution
  • Stream to Kafka
  • Write data to… MongoDB!?
  • What does all of this have in common?

It’s an exciting topic, and I hope more developers and admins begin to see Postgres as the global integration system it really is.
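As a small taste of the foreign data wrapper machinery the talk revolves around, here is a minimal postgres_fdw sketch (postgres_fdw ships with PostgreSQL; the server, schema, table and credential names are made up):

CREATE EXTENSION postgres_fdw;
CREATE SERVER other_pg FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'replica.example.com', dbname 'sales');
CREATE USER MAPPING FOR CURRENT_USER SERVER other_pg
    OPTIONS (user 'report', password 'secret');
CREATE SCHEMA remote;
IMPORT FOREIGN SCHEMA public FROM SERVER other_pg INTO remote;
SELECT count(*) FROM remote.orders;  -- executed against the remote node, results returned locally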

Category: postgresql

Luca Ferrari: Managing Multiple PostgreSQL Installations with pgenv

planet postgresql - 2018-08-30 (Thu) 09:00:00

pgenv is a shell script that allows you to quickly manage multiple PostgreSQL installations within the same host. It is somewhat reminiscent of perlbrew (for Perl 5) and similar systems. In this post I briefly show how to use pgenv and explain the changes I made to it.

Managing Multiple PostgreSQL Installations with pgenv

pgenv is another pearl from theory. It is a single bash script that allows you to download, build, start and stop (as well as nuke) several PostgreSQL installations within the same host.
It is worth noting that pgenv is not, at least for now, an enterprise-level PostgreSQL management tool; rather, it is an easy way to keep test instances clean and organized. It can be very useful for keeping several clusters around on which to do experiments, testing, and so on.

I first discovered pgenv reading this blog post by David, and I thought it was cool to have a single script to help me manage several environments. I must be honest, this is not the first tool of this kind I have seen for PostgreSQL, but somehow it caught my attention. I then cloned the repository and started using it. And since I’m curious, I read the source code. Well, ehm, bash? OK, it is not my favourite shell anymore, but surely it can speed up development while shortening the code with respect to more portable shells.

pgenv works with a command-oriented interface: as in git or other developer-oriented tools, you specify a command (e.g., build) and optionally a specific PostgreSQL version to apply the command to. pgenv works on a single cluster at a time, by linking and unlinking the specific instance...

Category: postgresql

Dimitri Fontaine: Preventing SQL Injections

planet postgresql - 2018-08-30 (Thu) 08:17:00

An SQL Injection is a security breach, one made famous by the Exploits of a Mom xkcd comic episode in which we read about little Bobby Tables:

PostgreSQL implements a protocol level facility to send the static SQL query text separately from its dynamic arguments. An SQL injection happens when the database server is mistakenly led to consider a dynamic argument of a query as part of the query text. Sending those parts as separate entities over the protocol means that SQL injection is no longer possible.
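The same separation can be illustrated at the SQL level with a prepared statement (the table and the malicious string below are made up, in the spirit of the comic):

PREPARE find_student(text) AS
    SELECT * FROM students WHERE name = $1;
-- The argument is only ever treated as a value, never parsed as SQL text:
EXECUTE find_student('Robert''); DROP TABLE students;--');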

Category: postgresql
