Joshua Otwell: Care To Know Clauses: All About SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY, and LIMIT

planet postgresql - 2018-11-12(月) 22:47:01

SQL is a language of databases and PostgreSQL is our chosen one. Oftentimes, storing data is but one facet of the process. Typically, in any data-centered endeavor, you will: view and read data, take action or implement changes on the data, garner decision-making information (analytics), or manipulate the stored data in some form or fashion.

SQL is composed of a combination of keywords, commands, and clauses. SQL seems simple. Just a few 'easy' commands here and there. No big deal, right?

But, there is more to SQL than meets the eye. SQL can trip you up on those 'easy' queries.

One challenge (that I must routinely revisit) is understanding that SQL execution order is definitely different from that of its syntax.

In this blog post, I visit, at a high-level, the major SQL clauses as they apply to PostgreSQL. There are many dialects of SQL but PostgreSQL’'s interpretation is the focus here. (Some characteristics of each clause very well may apply to other SQL dialects.)

SQL clauses form the foundation for basic, often-used commands and queries. That being said, advanced queries and examples utilizing Window Functions, CTE's, Derived Tables, etc will not be covered in this post.

As we will see, not all clauses are created equal. Yet, they do operate in tandem, providing query results seamlessly (or not).

Allow me to clarify...

I will periodically make mention of an execution order throughout the blog post as it applies to many of the clauses. But, this is generalized.

To my understanding, more often than not, the optimizer chooses and decides the best query plan for execution.

SELECT - The 'picky' Clause Used to Query the Database

SELECT is one busy clause. It is everywhere. Used more than all the other clauses. Certain clauses you may not need at all. Not so much the case with SELECT, for it is a mandatory clause.

The SELECT clause is typically used for querying the database, containing (at a basic level):

  1. A SELECT list - The columns of data you want.
  2. the source data set(s) - named in the FROM clause. Tables, Vie
カテゴリー: postgresql

Adrien Nayrat: PostgreSQL and heap-only-tuples updates - part 1

planet postgresql - 2018-11-12(月) 16:00:00
Here is a series of articles that will focus on a new feature in version 11. During the development of this version, a feature caught my attention. It can be found in releases notes : https://www.postgresql.org/docs/11/static/release-11.html Allow heap-only-tuple (HOT) updates for expression indexes when the values of the expressions are unchanged (Konstantin Knizhnik) I admit that this is not very explicit and this feature requires some knowledge about how postgres works, that I will try to explain through several articles:
カテゴリー: postgresql

Dimitri Fontaine: Preventing SQL Injections

planet postgresql - 2018-11-10(土) 23:40:01

An SQL Injection is a security breach, one made famous by the Exploits of a Mom xkcd comic episode in which we read about little Bobby Tables:

PostgreSQL implements a protocol level facility to send the static SQL query text separately from its dynamic arguments. An SQL injection happens when the database server is mistakenly led to consider a dynamic argument of a query as part of the query text. Sending those parts as separate entities over the protocol means that SQL injection is no longer possible.

カテゴリー: postgresql

Jan Karremans: Why document databases are old news…

planet postgresql - 2018-11-09(金) 21:57:55

We’re going to store data the way it’s stored naturally in the brain.

This is a phrase being heard more often today. This blog post is inspired by a short rant by Babak Tourani (@2ndhalf_oracle) and myself had on Twitter today. How cool is that!! This phrase is used by companies like MongoDB or Graph Database vendors to explain why they choose to store information / data in an unstructured format. It is new, it is cool, hip and happening. Al the new compute power and storage techniques enable doing this. How cool is that!! Well, it is… for the specific use-cases that can benefit from such techniques. Thinking of analytical challenges, where individual bits of information basically have no meaning. If you are analyzing a big bunch of captured data, which is coming from a single source like a machine, or a click-stream or social media, for instance, one single record basically has no meaning. If that is the case, and it is really not very interesting if you have and retain all individual bits of information, but you are interested in “the bigger picture”, these solutions can really help you! How cool is it, actually? If it comes to the other situations where you want to store and process information… where you do care about the individual records (I mean, who wants to repopulate their shopping cart on a web-shop 3 times before all the items stick in the cart) there are some historical things that you should be aware of. Back in the day when computers were invented, all information on computers was stored “the way it’s stored naturally in the brain”. Back in the day when computers were invented, all we had were documents to store information. This new cool hip and happening tech is, if anything, not new at all… Sure, things changed over the last 30 years and with all the new compute power and storage techniques, the frayed ends of data processing have significantly improved. This makes the executing of data analysis, as described above, actually so much better!! Really, we can do things to data, using these co[...]
カテゴリー: postgresql

Magnus Hagander: PGConf.EU 2019 - Dates and location!

planet postgresql - 2018-11-09(金) 18:31:37

It's been over 10 years since PostgreSQL Europe got started in Prato, just outside Florence, and it's time to return to our roots! PostgreSQL Conference Europe 2019 will be held in Milan, Italy, at the Milan Marriott Hotel, on October 15-18, 2019.

More details will be shared as things progress and we are not yet ready to open for sponsorship, call for papers or registrations, but it's time to mark your calendars and block out the week!

Follow us on twitter at @pgconfeu for notifications of when news are posted, check our website or subscribe to our RSS feed for the latest news!

We had only one correct guess in our "guess the location" contest at the closing session of this years conference. This attendee will be contacted personally with information about how to claim their free ticket for next year.

カテゴリー: postgresql

11.1, 10.6, 9.6.11, 9.5.15, 9.4.20, 9.3.25 リリース (2018-11-08)

www.postgresql.jp news - 2018-11-09(金) 11:46:58
11.1, 10.6, 9.6.11, 9.5.15, 9.4.20, 9.3.25 リリース (2018-11-08) harukat 2018/11/09 (金) - 11:46
カテゴリー: postgresql

PHP 7.1.24 Released

php.net - 2018-11-09(金) 00:28:08
カテゴリー: php

PHP 7.2.12 Released

php.net - 2018-11-08(木) 19:28:15
カテゴリー: php

PHP 7.3.0RC5 Released

php.net - 2018-11-08(木) 19:11:26
カテゴリー: php


postgresql.org - 2018-11-08(木) 09:00:00
11.1 is the latest release in the 11 series.
カテゴリー: postgresql


postgresql.org - 2018-11-08(木) 09:00:00
10.6 is the latest release in the 10 series.
カテゴリー: postgresql


postgresql.org - 2018-11-08(木) 09:00:00
9.6.11 is the latest release in the 9.6 series.
カテゴリー: postgresql


postgresql.org - 2018-11-08(木) 09:00:00
9.5.15 is the latest release in the 9.5 series.
カテゴリー: postgresql


postgresql.org - 2018-11-08(木) 09:00:00
9.4.20 is the latest release in the 9.4 series.
カテゴリー: postgresql


postgresql.org - 2018-11-08(木) 09:00:00
9.3.25 is the latest release in the 9.3 series. This version is unsupported!
カテゴリー: postgresql

Hernan Resnizky: Machine Learning in PostgreSQL Part 1: Kmeans clustering

planet postgresql - 2018-11-07(水) 18:00:03
Machine Learning in 10 Lines

Every person that reads newspapers, magazines or any other media of general interest has at least a basic idea of what Machine Learning is. And this is not only a fashion, Machine Learning is already part of our everyday life and will be much more in the future: from personalized advertisement on the Internet to robot dentists or autonomous cars, Machine Learning seems to be some kind of super power capable of everything.


But, what is Machine Learning really? It is mainly a set of statistical algorithms that, based on existing data, are capable of deriving insights out of them. These algorithms are basically divided into two families, supervised and unsupervised learning. In supervised learning, the objective is to perform some kind of prediction, such as, for example, if an e-mail message is spam or not (classification), how many beers will be sold next week in a supermarket (regression), etc. Unsupervised Learning, on the contrary, focuses on answering the question how are my cases divided in groups? What these algorithms do (each of them with their particularities) is to bring similar items as close as possible and keep items that differ  from each other as far as possible.

The popularisation of Machine Learning revolutionized the way we do business. Regardless if you are talking of a 10 or 10,000 employees company, if you do not make use of your data to make decisions, you are definitely running behind your competitors.


Machine Learning without leaving the Database

Relational Databases are definitely the most essential tools when it comes to data persistence. Although there are other alternatives which could be suitable for certain purposes, there is probably no company with at least a minimal IT Infrastructure that doesn’t have a database.

So if every company has a database, it contains data that is worth using. This means that every company has the opportunity to improve its decision-making process with minimal effort through the use of machine learning. However, he drawbac

カテゴリー: postgresql

William Ivanski: OmniDB debugger for PostgreSQL 11

planet postgresql - 2018-11-06(火) 22:58:04

PostgreSQL 11 was released recently, with exciting new features. One of them is the ability to write SQL procedures that can perform full transaction management, enabling developers to create more advanced server-side applications. SQL procedures can be created using the CREATE PROCEDURE command and executed using the CALL command. Since OmniDB 2.3.0 it is possible to debug PostgreSQL PL/pgSQL functions. Support to PostgreSQL 11 functions and procedures was added in OmniDB 2.11.0.

Last week we released OmniDB 2.12.0 with nice new features and a new revamped visual, so I’m going to show you how OmniDB 2.12.0 can debug PostgreSQL 11 procedures.

First of all, if you have not done that already, download and install a binary PostgreSQL library called omnidb_plugin and enable it in PostgreSQL’s config file. The debugger also uses a special schema with special tables to control the whole debugging process. This can be manually created or with an extension. For more details on the installation, please refer to the instructions. You can also refer to the documentation about the debugger.

Creating some tables in OmniDB

For our tests, let’s create 2 simple tables, foo and bar. Let’s do that using the OmniDB Console Tab:


Creating a procedure with transaction management

Note that OmniDB has a Procedures node in the tree view. Right-click on it, then click on Create Procedure. It will open a Query Tab with a SQL template showing basic SQL syntax to create a procedure.

If you want to know more about procedures, you can read online documentation without leaving OmniDB. Simple click on Procedures -> Doc: Procedures and a browser tab will be open for you already pointing to the documentation page:

Now let’s go back to the Create Procedure tab and change the code to actually create a procedure, like this:

CREATE OR REPLACE PROCEDURE public.prc_test ( p INTEGER ) LANGUAGE plpgsql AS $procedure$ BEGIN F[...]
カテゴリー: postgresql

Magnus Hagander: Tracking foreign keys throughout a schema

planet postgresql - 2018-11-05(月) 22:44:41

I recently ran into the need with a customer to track the usage of a specific key throughout the schema. Basically, "what are all the tables and columns referencing this key, directly or indirectly". Luckily, with a little bit of catalog query, that's not hard:

WITH RECURSIVE what (tbl) AS ( VALUES ('public.tt') ), t (oid, key, constrid) AS ( SELECT tbl::regclass::oid, conkey, NULL::oid FROM what INNER JOIN pg_constraint ON (contype='p' AND conrelid=tbl::regclass) UNION ALL SELECT conrelid, conkey, c.oid FROM pg_constraint c INNER JOIN t ON (c.confrelid=t.oid AND c.confkey=t.key) WHERE contype='f' ) SELECT nspname, relname, key, ARRAY( SELECT attname FROM pg_attribute a WHERE a.attrelid=t.oid AND attnum=ANY(key) ) FROM t INNER JOIN pg_class cl ON cl.oid=t.oid INNER JOIN pg_namespace n ON n.oid=cl.relnamespace

The output can be similar to:

nspname | relname | key | array ---------+---------+-----+------- public | tt | {1} | {ttt} public | foo1 | {1} | {a} public | foo2 | {3} | {z}

for a single column key (tt being the table with the primary key in, and the foo1 and foo2 tables referencing it directly or through the other one), or:

nspname | relname | key | array ---------+---------+-------+------- public | m1 | {1,2} | {a,b} public | m2 | {1,2} | {a,b}

for a multi-column foreign key.

In this particular use-case, it was an efficient way to track down key usage where naming standards for using the key had not always been followed. And of course, we also found a couple of cases where the column had the correct name but lacked the actual FOREIGN KEY definition, but that was done by just looking at the column names.

カテゴリー: postgresql

Bruce Momjian: Submitting Talks to Conferences

planet postgresql - 2018-11-05(月) 19:00:02

Having attended many conferences, I have a few suggestions on how to submit successful conference talks. First, determine the type of conference. Then, try to submit talks that match the conference type; possible topics include:

  • New Postgres features
  • User cast studies
  • Internals
  • New workloads
  • Performance
  • Application development

Of course, only some of these topics match specific types of conferences.

Second, submit multiple talks. It is very possible that someone better known than you, or someone with a better abstract, will also submit to the conference. By submitting more than one topic, you increase your chances of submitting something unique and interesting.

Continue Reading »

カテゴリー: postgresql

Laurenz Albe: Killed index tuples

planet postgresql - 2018-11-05(月) 18:00:16
© Laurenz Albe 2018


Since I only recently learned about the concept of “killed index tuples”, I thought there might be some others who are not yet familiar with this interesting PostgreSQL concept.

This may give you an explanation the next time you encounter wildly varying execution times for the same execution plan of the same PostgreSQL query.

Before we look more closely at the index, let’s review the life cycle of a table row version (“heap tuple”).

Life, death and visibility in the table heap

It is widely known that the visibility of heap tuples is determined by the system columns xmin and xmax (though there is more to xmax than meets the eye). A heap tuple is “dead” if its xmax is less than the xmin of all active transactions.

Now xmin and xmax are only valid if the respective transactions have been marked committed in the “commit log”. Consequently, any transaction that needs to know if it can see a tuple has to consult the commit log. To save future readers that extra work, the first one that consults the commit log will save the information in the tuple’s “hint bits”.

Dead tuples are eventually reclaimed by VACUUM.

This is all fairly well known, but how is the situation with index entries?

Life, death and visibility in the index

To avoid redundancy and to keep index tuples small, the visibility information is not stored in the index.
The status of an index tuple is determined by the heap tuple it points to, and both are removed by VACUUM at the same time.

As a consequence, an index scan has to inspect the heap tuple to determine if it can “see” an entry. This is the case even if all the columns needed are in the index tuple itself. Even worse, this “heap access” will result in random I/O, which is not very efficient on spinning disks.

This makes index scans in PostgreSQL more expensive than in other database management systems that use a different architecture. To mitigate that, several features have been introduced over the years:

  • PostgreSQL 8.1 introduced the “bitmp index scan”. This scan method fi
カテゴリー: postgresql