フィードアグリゲーター

Federico Campoli: Happy birthday pg_chameleon

planet postgresql - 2018-08-23(木) 09:00:00
Today is two years since I started working on pg_chameleon. Back in 2016 this commit changed the project’s license from GPL v2 to the 2 clause BSD and the project’s scope, which became a MySQL to PostgreSQL replica system. Since then I learned a lot of lessons, made several mistakes and worked out solutions which resulted in a decent tool for bridging two different universes. Writing a replica system is a very complex task.
カテゴリー: postgresql

Magnus Hagander: Updates about upcoming conferences

planet postgresql - 2018-08-22(水) 21:39:29

Summer vacation times are over. Well, for some of us at least, clearly some are still lucky enough to be off, which is showing itself a bit (see below). But as both conference organisation and participation throughout the rest of the year is starting to be clear, I figure it's time to share some updates around different ones.

Postgres Open SV

First of all - if you haven't already, don't forget to register for Postgres Open SV in San Francisco in two weeks time! Registration for the main US West Coast/California PostgreSQL community conference will close soon, so don't miss your chance. I'm looking forward to meeting many old and new community members there.

PostgreSQL Conference Europe

Next up after Postgres Open will be pgconf.eu, the main European PostgreSQL community conference of 2018. The planning for this years conference is at full speed, but unfortunately we are slightly behind. In particular, we were supposed to be notifying all speakers today if they were accepted or not, and unfortunately our program committee are a bit behind schedule on this one. We had over 200 submissions this year which makes their work even bigger than usual. But speakers will be receiving their notification over the upcoming couple of days.

Hopefully once all speakers have been able to confirm their attendance, we will also have a schedule out soon. Until then, you can always look at the list of accepted talks so far. This list is dynamically updated as speakers get approved and confirm their talks.

We have already sold approximately half of the tickets that we have available this year, so if you want to be sure to get your spot, we strongly recommend that you register as soon as you can! And if you want to attend the training sessions, you should hurry even more as some are almost sold out!

PGConf.ASIA

Work on the program committee of PGConf.ASIA has also been going on over the late summer, and is mostly done! The schedule is not quite ready yet, but expected out shortly. You can look forward to a very interesting lineup of spe

[...]
カテゴリー: postgresql

Pablo González Doval: jOOQ: Stream processing PostgreSQL query results

planet postgresql - 2018-08-22(水) 18:42:54
When Java 8 came out, it proved a great effort to cope with the great demand of stream processing. Streams and lambdas allow you to process the data in a list asynchronously, so that you don’t have to wait until the last element of step 1 is calculated in order to start step 2 (apart from the evident code cleanness). With great amounts of data, this reduces a very painful bottleneck, so why wouldn’t we want this huge data-related improvement in our ORMs?
カテゴリー: postgresql

Cascadia PHP

php.net - 2018-08-22(水) 07:29:36
カテゴリー: php

Joshua Drake: PostgresConf Silicon Valley: Early bird tickets now available

planet postgresql - 2018-08-22(水) 01:34:00
Silicon Valley Postgres and PostgresConf Silicon Valley are happy to announce that early bird tickets are now available! You can purchase them here. You will want to hurry as early bird discounted tickets are only available until September 1st, 2018.

PostgresConf Silicon Valley takes place on October 15th and 16th, 2018 at the Hilton San Jose. The schedule is not finalized. However, we have a comprehensive list of content from technical leaders in the Postgres ecosystem we are reviewing. Here is a brief list of some (but not all!) of the topics to be presented:
  • Postgres 11
  • Data integrity at scale
  • Distributed data and GPDR
  • Massively Parallel Postgres for Analytics
  • Data intersection between roles
  • RDS tips and tricks
  • I didn’t know Postgres could do that!
  • Performant time-series data management analytics
The Silicon Valley schedule is expected to be published this week and we are looking forward to seeing you all in October!

PostgresConf is organized by an all volunteer team of fantastic individuals and is backed by a 501c3 non-profit. The Silicon Valley conference provides the most cost effective avenue to learn all about Postgres and the Postgres ecosystem. We invite everyone to enjoy our expertly delivered presentations, good food, the hallway track and the opportunity to learn, network and appreciate everything that is Postgres.

カテゴリー: postgresql

Jobin Augustine: Foreign Data Wrappers in PostgreSQL and a closer look at postgres_fdw

planet postgresql - 2018-08-22(水) 00:44:37

There are a few features in PostgreSQL that are very compelling, and that I rarely see in other RDBMSs. Some of these features are the driving force behind the growing popularity of PostgreSQL. This blog post is about one of my favourite features: FDW (Foreign Data Wrapper). As the name indicates, this feature allows a PostgreSQL database to treat tables in a remote PostgreSQL database as locally available tables.

The history of FDW began when SQL/MED came out as part of the ANSI SQL standard specification in 2003. MED stands for “Management of External Data”. By definition, “external data” is the data that the DBMS is able to access but does not manage. There are two parts for this specification:

  1. Foreign Table : this is about how to access external data sources and present them as relational tables.
  2. Datalink : this extends the functionality of database systems to include control over external files without the need to store their contents directly in the database, such as LOBs. A column of a table could directly refer a file.

PostgreSQL’s FDW capabilities addresses foreign tables only. It was introduced in PostgreSQL 9.1 and has been receiving improvements ever since.

Today there are a variety of FDWs which allow PostgreSQL to talk to most of the data sources we can think of. However, most FDWs are independent open source projects implemented as Postgres Extensions, and not officially supported by the PostgreSQL Global Development Group.

postgres_fdw

In this blog post we will take a closer look at the postgres_fdw which can be considered as the “reference implementation” for other FDW development efforts, and showcases its capabilities. This is the one FDW which comes with PostgreSQL source as a contrib extension module. The only other FDW which is part of PostgreSQL source tree is file_fdw.

Let’s look into postgres_fdw with a use case. In many organizations, there could be multiple systems catering to different functionalities/departments. For example, while an HR database may be holding the employee informa

[...]
カテゴリー: postgresql

Hans-Juergen Schoenig: 3 ways to detect slow queries in PostgreSQL

planet postgresql - 2018-08-21(火) 17:00:34

When digging into PostgreSQL performance it is always good to know, which option one has to spot performance problems and to figure out, what is really going on on a server. Finding slow queries and performance weakspots is therefore exactly what this post is all about.

There are many ways to approach performance problems. However, three methods have proven to really useful to quickly assess a problem. Here are my top three suggestions to handle bad performance:

  • Make use of the slow query log
  • Checking execution plans with auto_explain
  • Relying on aggregate information in pg_stat_statements
Analyzing PostgreSQL performance and finding bottlenecks

Each method has its own advantages and disadvantages, which will be discussed in this document

Making use of the PostgreSQL slow query log

A more traditional way to attack slow queries is to make use of PostgreSQL’s slow query log. The idea is: If a query takes longer than a certain amount of time, a line will be sent to the log. This way slow queries can easily be spotted so that developers and administrators can quickly react and know where to look.

In a default configuration the slow query log is not active. Therefore it is necessary to turn it on. You have version choices: If you want to turn the slow query log on globally, you can change postgresql.conf:

log_min_duration_statement = 5000

If you set log_min_duration_statement in postgresql.conf to 5000, PostgreSQL will consider queries, which take longer than 5 seconds to be slow queries and send them to the logfile. If you change this line in postgresql.conf there is no need for a server restart. A “reload” will be enough:

postgres=# SELECT pg_reload_conf(); pg_reload_conf ---------------- t (1 row)

You can do that using an init script or simply by calling the SQL function shown above.

If you change postgresql.conf the change will be done for the entire instance, which might be too much. In many cases you want to be a lot more precise. Therefore it can make sense to make the change only for a certain us

[...]
カテゴリー: postgresql

Venkata Nagothi: Using Barman to Backup PostgreSQL - An Overview

planet postgresql - 2018-08-21(火) 07:28:32

Database backups play an imperative role in designing an effective disaster recovery strategy for production databases. Database Administrators and Architects must continuously work towards designing an optimal and effective backup strategy for real-time mission critical databases and further ensure Disaster Recovery SLAs are satisfied. As per my experience, this is not easy and can take from days to weeks to achieve an impeccable backup strategy. It is just not writing a good script to backup databases and make sure it works. There are several factors to consider, let us take a look at them:

  • Database size: Database size plays in important role when designing backup strategies. In-fact, this is one of the core factors which defines
    • Time taken by the backup
    • The load on the infrastructure components like Disk, Network, CPU etc.
    • Amount of backup storage required and the costs involved
    • If the databases are hosted on cloud, then, the backup storage costs rely on the amount of storage required
    • Also, database size impacts the RTO
  • Infrastructure: Backup strategy heavily relies on infrastructure of the databases. The backup procedure would be different for databases hosted on a physical server in an on-prem data-centre as compared to those hosted on cloud.
  • Backup Location: Where are the backups going? Generally, the backups will be placed at a remote location, for instance on tape or cloud specific storage like AWS S3.
  • Backup Tool: Identify an optimal tool to perform online database backup which potentially ensures consistent backup has been taken.

A good database backup strategy must ensure RTO (Recovery Time Objective) and RPO (Recovery Point Objective) are met which in-turn help achieve Disaster Recovery objective. File-system level backups can be performed on PostgreSQL Databases in several ways. In this blog, my focus will be on a tool called Barman which is popularly used to perform PostgreSQL Database Backups.

Barman (backup and recovery manager) is an Python based open-source tool developed by developers at

[...]
カテゴリー: postgresql

Regina Obe: PostGIS 2.5.0rc1

planet postgresql - 2018-08-19(日) 09:00:00

The PostGIS development team is pleased to release PostGIS 2.5.0rc1.

This release is a work in progress. Remaining time will be focused on bug fixes and documentation until PostGIS 2.5.0 release. Although this release will work for PostgreSQL 9.4 and above, to take full advantage of what PostGIS 2.5 offers, you should be running PostgreSQL 11beta3+ and GEOS 3.7.0rc1 which were released recently.

Best served with PostgreSQL 11beta3 which was recently released.

Changes since PostGIS 2.5.0beta2 release are as follows:

  • 4146, Fix compilation error against Postgres 12 (Raúl Marín).
  • 4147, 4148, Honor SOURCEDATEEPOCH when present (Christoph Berg).

View all closed tickets for 2.5.0.

After installing the binaries or after running pg_upgrade, make sure to do:

ALTER EXTENSION postgis UPDATE;

— if you use the other extensions packaged with postgis — make sure to upgrade those as well

ALTER EXTENSION postgis_sfcgal UPDATE; ALTER EXTENSION postgis_topology UPDATE; ALTER EXTENSION postgis_tiger_geocoder UPDATE;

If you use legacy.sql or legacy_minimal.sql, make sure to rerun the version packaged with these releases.

2.5.0rc1

カテゴリー: postgresql

Avinash Kumar: Replication from Percona Server for MySQL to PostgreSQL using pg_chameleon

planet postgresql - 2018-08-18(土) 00:38:18

Replication is one of the well-known features that allows us to build an identical copy of a database. It is supported in almost every RDBMS. The advantages of replication may be huge, especially HA (High Availability) and load balancing. But what if we need to build replication between 2 heterogeneous databases like MySQL and PostgreSQL? Can we continuously replicate changes from a MySQL database to a PostgreSQL database? The answer to this question is pg_chameleon.

For replicating continuous changes, pg_chameleon uses the mysql-replication library to pull the row images from MySQL, which are transformed into a jsonb object. A pl/pgsql function in postgres decodes the jsonb and replays the changes into the postgres database. In order to setup this type of replication, your mysql binlog_format must be “ROW”.

A few points you should know before setting up this tool :

  1. Tables that need to be replicated must have a primary key.
  2. Works for PostgreSQL versions > 9.5 and MySQL > 5.5
  3. binlog_format must be ROW in order to setup this replication.
  4. Python version must be > 3.3

When you initialize the replication, pg_chameleon pulls the data from MySQL using the CSV format in slices, to prevent memory overload. This data is flushed to postgres using the COPY command. If COPY fails, it tries INSERT, which may be slow. If INSERT fails, then the row is discarded.

To replicate changes from mysql, pg_chameleon mimics the behavior of a mysql slave. It creates the schema in postgres, performs the initial data load, connects to MySQL replication protocol, stores the row images into a table in postgres. Now, the respective functions in postgres decode those rows and apply the changes. This is similar to storing relay logs in postgres tables and applying them to a postgres schema. You do not have to create a postgres schema using any DDLs. This tool automatically does that for the tables configured for replication. If you need to specifically convert any types, you can specify this in the configuration file.

The following is just an

[...]
カテゴリー: postgresql

PHP 7.1.21 Released

php.net - 2018-08-17(金) 23:52:00
カテゴリー: php

Craig Kerstiens: How Citus real-time executor parallelizes Postgres queries

planet postgresql - 2018-08-17(金) 23:49:00

Citus has multiple different excutors which each behaving differently to support a wide array of use cases. For many the notion distributed SQL seems like it has to be a complicated one, but the principles of it aren’t rocket science. Here we’re going to look at a few examples of how Citus takes standard SQL and transforms it to operate in a distributed form so it can be parallelized. The result is that you can see speed up of 100x or more in query performance over a single node database.

How do we know something is distributed vs. single shard?

Before we get to how the real-time executor works it is worth a refresher on Citus executors in general.

When Citus receives query we first look to see if it has the shard key also known as distribution column as part of the where clause. If you’re sharding a multi-tenant app such as a CRM application you may have an org_id which you’re always limiting queries on. In that case as long as org_id is part of your where clause we know it’s targetting a single shard, and thus use the router executor. If that’s not being used we split the query up and send it to all shards in parallel across nodes.

As a quick refresher a shard within Citus is another table. If you a table events and you want to distribute that, you could create 32 shards, this means we could easily scale up to 32 nodes. If you’re starting with 2 nodes, then each node contains 32 shards. This means that each node will receive 16 queries at once, and if it has 16 cores available all that work will be done in parallel resulting in a 2 node x 16 core, or rather 32x speed up as opposed to executing on a single core.

For our examples later on we’re going to create only 4 shards to simplify them a bit, but things scale nearly linearly the more shards and cooresponding cores you add.

Writing in SQL, thinking in MapReduce

Citus’ support for real-time analytics is a workload that people used Citus for since our early days, thanks to our advanced query parallelization. The result result is you are able to express things

[...]
カテゴリー: postgresql

Joshua Drake: Postgres Silicon Valley: Patch review workshop summary

planet postgresql - 2018-08-17(金) 07:27:00

On August 15th, Silicon Valley Postgres held their first Postgresql.org Patch Review workshop. The workshop was hosted by PostgresConf partner Pivotal. Attendees’ feedback indicate that we have found a great way to show how companies within the community can benefit Postgresql.org, Postgres User groups, and help us fulfill our mission:

There were 20 attendees, including Melanie Plageman of Pivotal presenting/coordinating “Postgres Patch Review,” and Postgresql.org -Hacker Jeff Davis as hands on help. This was the first workshop of its kind for Silicon Valley Postgres and with the feedback received, we are planning to plan similar events at least twice a year to provide an opportunity that is currently not available elsewhere. (Sneak peek: there may be one at PostgresConf Silicon Valley on October 15th and 16th, which is right around the corner!)
Eight patches were reviewed, with three including specific feedback to the patch authors: We’re excited to be a part of this path of Postgres contributor building. Hopefully it will be a model for other meetups and community partners to follow in continuing to contribute to the Most Advanced Open Source Database ecosystem.
Silicon Valley Postgres is the fastest growing Postgres user group within North America. It was initiated just 12 months ago and will reach 700 members likely before Monday. The group exists to help all facets of Postgres users and the Postgres ecosystem, including, but not limited to, PostgreSQL, Greenplum, Aurora Postgres, AgensGraph, Yugabyte, ZomboDB, TimescaleDB.

Silicon Valley Postgres is also working with PostgresConf Silicon Valley to bring a high quality and cost effective education solution to the users of Postgres within the Silicon Valley. Early Bird tickets are now on sale and available. You will want to hurry as the Early Bird price ends on September 1st[...]
カテゴリー: postgresql

PHP 7.2.9 Released

php.net - 2018-08-17(金) 04:14:25
カテゴリー: php

php[world] 2018

php.net - 2018-08-16(木) 23:31:44
カテゴリー: php

PHP 7.3.0.beta2 Released

php.net - 2018-08-16(木) 21:11:38
カテゴリー: php

Haroon .: PostgreSQL and IoT Data Localization, Integration, and Write Scalability

planet postgresql - 2018-08-16(木) 21:09:30

In my previous post we looked at various partitioning techniques in PostgreSQL for efficient IoT data management. We do understand that the basic objective behind time based partitions is to achieve better performance, especially in IoT environments, where active data is usually the most recent data. New data is usually append only and it can grow pretty quickly depending on the frequency of the data points.

Some might argue on why to have multiple write nodes (as would be inherently needed in a BDR cluster) when a single node can effectively handle incoming IoT data utilizing features such as time based partitioning. Gartner estimated 8.4 billion connected devices in 2017, and it expects that this number will grow to over 20 billion by 2020. The scale at which connected devices are operating, it becomes imperative to introduce additional write nodes into the cluster at some point just to handle the sheer number of devices connecting to the databases.

In a conventional Master-Standby setup, adding additional write nodes isn’t trivial. There are techniques one might use to introduce additional write nodes (as a separate cluster), but those techniques introduce many complexities which time critical IoT applications cannot afford. As an example, consider a city’s smart grid station that needs to run analytics on past data to make adjustments for power for a specific time of the year.

A couple of key concepts with respect to data in an IoT environment are:

  1. Data localization
  2. Local and global analytics

Edge devices write to their local data store. Data security regulations might dictate data localization in many cases and therefore the ability to store data locally is important. Other important reasons include reducing read/write latency. However, it is equally important to be able to run analytics on the reported data to make various decisions.

Enterprises are increasingly operating at global levels and the ability to handle geographically distributed data is becoming increasingly important. This is where PostgreSQ

[...]
カテゴリー: postgresql

Sebastian Insausti: Live Webinar: An Introduction to Performance Monitoring for PostgreSQL - August 21st 2018

planet postgresql - 2018-08-15(水) 19:30:01

There’s a bit less than a week to go before I broadcast live with this webinar on monitoring PostgreSQL for performance.

My plan is to cover some of the main ins and outs of the PostgreSQL monitoring and performance world and I’m also planning to share some tips and tricks on how to use ClusterControl to monitor PostgreSQL for performance.

The webinar aims to address some of the following questions:

  • PostgreSQL offers many metrics through various status overviews and commands, but which ones really matter to us users?
  • How do we trend and alert on them?
  • What is the meaning behind the metrics?
  • And what are some of the most common causes for performance problems in production?

To operate PostgreSQL efficiently, you need to have insight into database performance and make sure it is at optimal levels.

I’ll discuss this and more in ordinary, plain DBA language.

We’ll have a look at some of the tools available for PostgreSQL monitoring and trending; and I’ll show you how to leverage ClusterControl’s PostgreSQL metrics, dashboards, custom alerting and other features to track and optimize the performance of your system.

Date, Time & Registration Europe/MEA/APAC

Tuesday, August 21st at 09:00 BST / 10:00 CEST (Germany, France, Sweden)

Register Now

North America/LatAm

Tuesday, August 21st at 09:00 Pacific Time (US) / 12:00 Eastern Time (US)

Register Now

Agenda
  • PostgreSQL architecture overview
  • Performance problems in production
    • Common causes
  • Key PostgreSQL metrics and their meaning
  • Tuning for performance
  • Performance monitoring tools
  • Impact of monitoring on performance
  • How to use ClusterControl to identify performance issues
    • Demo
Speaker

Sebastian Insausti has loved technology since his childhood, when he did his first computer course (Windows 3.11). And from that moment he was decided on what his profession would be. He has since built up experience with MySQL, PostgreSQL, HAProxy, WAF (ModSecurity), Linux (RedHat, CentOS, OL, Ubuntu server), Monitoring (Nagios), Networking and Virtualization (VMWare, Proxmox,

[...]
カテゴリー: postgresql

Mark Wong: PDXPUG: August Meetup

planet postgresql - 2018-08-15(水) 12:25:40

When: 6-8pm Thursday August 16, 2018
Where: iovation
Who: Mark Wong
What: Sneak peek at stored procedures

Stored procedure support is coming to PostgreSQL 11.  Come hear about what a stored procedure is and how it differs from the existing user-defined functions.

Mark leads the 2ndQuadrant performance practice as a Performance Consultant for English Speaking Territories, based out of Oregon in the USA. He is a long time Contributor to PostgreSQL, co-organizer of the Portland PostgreSQL User Group, and serves as a Director and Treasurer
for the United States PostgreSQL Association.


If you have a job posting or event you would like me to announce at the meeting, please send it along. The deadline for inclusion is 5pm the day before the meeting.

Our meeting will be held at iovation, on the 3rd floor of the US Bancorp Tower at 111 SW 5th (5th & Oak). It’s right on the Green & Yellow Max lines. Underground bike parking is available in the parking garage; outdoors all around the block in the usual spots. No bikes in the office, sorry! For access to the 3rd floor of the plaza, please either take the lobby stairs to the third floor or take the plaza elevator (near Subway and Rabbit’s Cafe) to the third floor. There will be signs directing you to the meeting room. All attendess must check in at the iovation front desk.

See you there!

カテゴリー: postgresql

Sebastian Insausti: Performance Monitoring & Auditing PostgreSQL - Top Resources

planet postgresql - 2018-08-14(火) 19:35:42

This month we are spending some time talking about the importance of database monitoring and auditing of your database environments. Over the past year (and in some cases even longer!) we have posted many technical resources to help you monitor and audit your PostgreSQL setups and we wanted to consolidate them into one blog for you to bookmark.

And make sure to join us on August 21st for our free webinar “An Introduction to Performance Monitoring for PostgreSQL” to learn how to operate PostgreSQL efficiently and running at optimal levels.

Most Popular PostgreSQL Monitoring Resources A Performance Cheat Sheet for PostgreSQL

Performance tuning is not trivial, but you can go a long way with a few basic guidelines. In this blog, we will discuss how you analyze the workload of the database, and then review a list of important configuration parameters to improve the performance of PostgreSQL.

Read More

Key Things to Monitor in PostgreSQL - Analyzing Your Workload

This blog provides an overview of key things you should know when monitoring your PostgreSQL database and its workload.

Read More

PostgreSQL Audit Logging Best Practices

This blog provides an introduction to audit logging, why you should do it, and specific tips for doing it on your PostgreSQL database environment.

Read More

How to Decode the PostgreSQL Error Logs

This blog provides the basic information you need to know to understand the PostgreSQL error logs and what to do when you find errors.

Read More

The Best Alert and Notification Tools for PostgreSQL

This blog highlights the top alert and notification systems available for PostgreSQL database deployments.

Read More

Monitoring your Databases with ClusterControl

Observability is critical piece of the operations puzzle - you have to be able to tell the state of your systems based on trending data presented in the form of graphs and alerts. Ideally, this data will be available from one single location. This blog explains how ClusterControl can work as a monitoring hub for all your database systems.

R

[...]
カテゴリー: postgresql

ページ