Feed Aggregator

Hans-Juergen Schoenig: Implementing “AS OF”-queries in PostgreSQL

planet postgresql - 2019-02-05 (Tue) 23:01:09

Over the years many people have asked for “timetravel” or “AS OF”-queries in PostgreSQL. Oracle has provided this kind of functionality for quite some time already. However, in the PostgreSQL world “AS OF timestamp” is not directly available. The question now is: How can we implement this vital functionality in user land and mimic Oracle functionality?

Implementing “AS OF” and timetravel in user land

Let us suppose we want to version a simple table consisting of just three columns: id, some_data1 and some_data2. To do this we first have to install the btree_gist module, which adds some valuable operators we will need to manage time travel. The table storing the data will need an additional column to handle the validity of a row. Fortunately PostgreSQL supports “range types”, which allow you to store ranges in an easy and efficient way. Here is how it works:

CREATE EXTENSION IF NOT EXISTS btree_gist;

CREATE TABLE t_object
(
    id          int8,
    valid       tstzrange,
    some_data1  text,
    some_data2  text,
    EXCLUDE USING gist (id WITH =, valid WITH &&)
);

Mind the last line here: “EXCLUDE USING gist” will ensure that if the “id” is identical, the period (“valid”) must not overlap. The idea is to ensure that the same “id” only has one entry at a time. PostgreSQL will automatically create a GiST index on that column. The feature is called an “exclusion constraint”. If you are looking for more information about this feature, consider checking out the official documentation (https://www.postgresql.org/docs/current/ddl-constraints.html).
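
For example, a quick way to see the constraint in action (my own illustration, not part of the original post) is to insert two rows for the same id with overlapping validity ranges:

INSERT INTO t_object VALUES (1, '[2019-01-01,2019-02-01)', 'foo', 'bar');   -- succeeds
INSERT INTO t_object VALUES (1, '[2019-01-15,2019-03-01)', 'foo', 'baz');
-- the second insert fails with "conflicting key value violates exclusion constraint",
-- because both validity ranges for id = 1 overlap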

If you want to filter on some_data1 and some_data2 consider creating indexes. Remember, missing indexes are in many cases the root cause of bad performance:

CREATE INDEX idx_some_index1 ON t_object (some_data1);
CREATE INDEX idx_some_index2 ON t_object (some_data2);

By creating a view, it should be super easy to extract data from the underlying tables:

CREATE VIEW t_object_recent AS SELECT id, some_data1, some_data2 FROM t[...]
Category: postgresql

Avinash Kumar: Use pg_repack to Rebuild PostgreSQL Database Objects Online

planet postgresql - 2019-02-05 (Tue) 10:14:49

In this blog post, we’ll look at how to use pg_repack to rebuild PostgreSQL database objects online.

We’ve seen a lot of questions regarding the options available in PostgreSQL for rebuilding a table online. We created this blog post to explain the pg_repack extension, available in PostgreSQL for this requirement. pg_repack is a well-known extension that was created and is maintained as an open source project by several authors.

There are three main reasons why you need to use pg_repack in a PostgreSQL server:
  1. Reclaim free space from a table to disk, after deleting a huge chunk of records
  2. Rebuild a table to re-order the records and shrink/pack them into a smaller number of pages. This may let a query fetch just one page (or < n pages) instead of n pages from disk. In other words, less IO and more performance.
  3. Reclaim free space from a table that has grown in size with a lot of bloat due to improper autovacuum settings.

You might have already read our previous articles that explained what bloat is, and discussed the internals of autovacuum. After reading these articles, you can see there is an autovacuum background process that removes dead tuples from a table and allows the space to be re-used by future updates/inserts on that table. Over a period of time, tables that receive the most updates or deletes may accumulate a lot of bloated space due to poorly tuned autovacuum settings. This leads to slow-performing queries on these tables. Rebuilding the table is the best way to avoid this.
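
For reference, a typical invocation looks roughly like this (a minimal sketch based on pg_repack's documented options; mydb and t_bloated are placeholder names):

-- install the extension in the database you want to repack
CREATE EXTENSION pg_repack;
-- then, from the shell, rebuild a single table online:
--   pg_repack --dbname=mydb --table=public.t_bloated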

Why is just autovacuum not enough for tables with bloat?

We have discussed several parameters that change the behavior of an autovacuum process in this blog post. There cannot be more than autovacuum_max_workers autovacuum processes running in a database cluster at a time. At the same time, due to untuned autovacuum settings and no manual vacuuming of the database as a weekly or monthly job, many tables can be skipped by autovacuum. We have discussed in this post that the default autovacuum settings run autova[...]
Category: postgresql

Christophe Pettus: “Breaking PostgreSQL at Scale” at FOSDEM 2019

planet postgresql - 2019-02-05 (Tue) 06:00:49

The slides for my talk, “Breaking PostgreSQL at Scale” at FOSDEM 2019 are available.

Category: postgresql

Bruce Momjian: Permission Letters

planet postgresql - 2019-02-05 (Tue) 05:00:01

If you have looked at Postgres object permissions in the past, I bet you were confused. I get confused, and I have been at this for a long time.

The way permissions are stored in Postgres is patterned after the long directory listing of Unix-like operating systems, e.g., ls -l. Just like directory listings, the Postgres system stores permissions using single-letter indicators. r is used for read (SELECT) in both systems, while w is used for write permission in ls, and UPDATE in Postgres. The other nine letters used by Postgres don't correspond to any directory listing permission letters, e.g., d is DELETE permission. The full list of Postgres permission letters is in the GRANT documentation page; the other letters are:

D -- TRUNCATE
x -- REFERENCES
t -- TRIGGER
X -- EXECUTE
U -- USAGE
C -- CREATE
c -- CONNECT
T -- TEMPORARY
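
To see these letters in practice, here is a small illustration (my own example, not from the post; the table and role names are made up):

GRANT SELECT, UPDATE ON t_example TO alice;
-- \dp t_example in psql then shows an ACL entry such as:
--   alice=rw/postgres
-- i.e. r (SELECT) and w (UPDATE), granted by the postgres role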

Continue Reading »

Category: postgresql

Amit Jain: An Overview of the Index Changes in PostgreSQL 11

planet postgresql - 2019-02-05 (Tue) 02:23:33

The right application of indexes can make queries blazing fast.

Indexes use pointers to access data pages in a speedy fashion.

Major changes happened to indexes in PostgreSQL 11; many long-awaited patches have been released.

Let's have a look at some of the great features of this release.

Parallel B-TREE Index Builds

PostgreSQL 11 introduced an infrastructure patch to enable parallel index creation.

For now, it can only be used with B-Tree indexes.

Building a parallel B-Tree index is two to three times faster than doing the same thing without parallel workers (a serial build).

In PostgreSQL 11 parallel index creation is on by default.

There are two important parameters:

  • max_parallel_workers - Sets the maximum number of workers that the system can support for parallel queries.
  • max_parallel_maintenance_workers - Controls the maximum number of worker processes which can be used to CREATE INDEX.

Let's check it with an example:

severalnines=# CREATE TABLE test_btree AS SELECT generate_series(1,100000000) AS id;
SELECT 100000000
severalnines=# SET maintenance_work_mem = '1GB';
severalnines=# \timing
severalnines=# CREATE INDEX q ON test_btree (id);
TIME: 25294.185 ms (00:25.294)

Let's try it with 8-way parallel work:

severalnines=# SET maintenance_work_mem = '2GB';
severalnines=# SET max_parallel_workers = 16;
severalnines=# SET max_parallel_maintenance_workers = 8;
severalnines=# \timing
severalnines=# CREATE INDEX q1 ON test_btree (id);
TIME: 11001.240 ms (00:11.001)

We can see the performance difference with the parallel workers: more than 60% faster with just a small change. maintenance_work_mem can also be increased to get more performance.

ALTER TABLE also helps to increase parallel workers. The syntax below can be used to increase parallel workers, along with max_parallel_maintenance_workers. This bypasses the cost model completely.

ALTER TABLE test_btree SET (parallel_workers = 24);

Tip: RESET to the default once the index build is completed to prevent adverse query plans.
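
For example (my own snippet, assuming the test_btree table from above):

ALTER TABLE test_btree RESET (parallel_workers);
RESET max_parallel_maintenance_workers;
RESET maintenance_work_mem;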

CREATE INDEX with the CONCU

[...]
Category: postgresql

Brandur Leach: SortSupport: Sorting in Postgres at Speed

planet postgresql - 2019-02-05 (Tue) 01:56:52

Most often, there’s a trade off involved in optimizing software. The cost of better performance is the opportunity cost of the time that it took to write the optimization, and the additional cost of maintenance for code that becomes more complex and more difficult to understand.

Many projects prioritize product development over improving runtime speed. Time is spent building new things instead of making existing things faster. Code is kept simpler and easier to understand so that adding new features and fixing bugs stays easy, even as particular people rotate in and out and institutional knowledge is lost.

But that’s certainly not the case in all domains. Game code is often an interesting read because it comes from an industry where speed is a competitive advantage, and it’s common practice to optimize liberally even at some cost to modularity and maintainability. One technique for that is to inline code in critical sections even to the point of absurdity. CryEngine, open-sourced a few years ago, has a few examples of this, with “tick” functions like this one that are 800+ lines long with 14 levels of indentation.

Another common place to find optimizations is in databases. While games optimize because they have to, databases optimize because they’re an example of software that’s extremely leveraged – if there’s a way to make running select queries or building indexes 10% faster, it’s not an improvement that affects just a couple users, it’s one that’ll potentially invigorate millions of installations around the world. That’s enough of an advantage that the enhancement is very often worth it, even if the price is a challenging implementation or some additional code complexity.

Postgres contains a wide breadth of optimizations, and happily they’ve been written conscientiously so that the source code stays readable. The one that we’ll look at today is SortSupport, a technique for localizing the information needed to compare data into places where it can be accessed very quickly, thereby making sorting data much fas

[...]
Category: postgresql

Hubert 'depesz' Lubaczewski: Converting list of integers into list of ranges

planet postgresql - 2019-02-05 (Tue) 00:13:19
Yesterday someone on irc asked: i've a query that returns sequential numbers with gaps (generate_series + join) and my question is: can is somehow construct ranges out of the returned values? sort of range_agg or something? There was no further discussion, aside from me saying sure you can. not trivial task, but possible. you'd need … Continue reading "Converting list of integers into list of ranges"
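
One common way to build such ranges (a generic gaps-and-islands sketch, not necessarily the solution described in the full post; my_numbers and its column i are made-up names) is to group consecutive values by subtracting a row number:

SELECT min(i) AS range_start, max(i) AS range_end
FROM (
    SELECT i, i - row_number() OVER (ORDER BY i) AS grp
    FROM my_numbers          -- placeholder table with one integer column i
) s
GROUP BY grp
ORDER BY range_start;        -- consecutive values share the same grp, so each group is one range
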
Category: postgresql

Paul Ramsey: Dr. JTS comes to Crunchy

planet postgresql - 2019-02-04 (Mon) 22:00:00

Today’s an exciting day in the Victoria office of Crunchy Data: our local staff count goes from one to two, as Martin Davis joins the company!

This is kind of a big deal, because this year Martin and I will be spending much of our time on the core computational geometry library that powers PostGIS, the GEOS library, and the JTS library from which it derives its structure.

Why is that a big deal? Because GEOS, JTS and other language ports provide the computational geometry algorithms underneath most of the open source geospatial ecosystem – so improvements in our core libraries ripple out to help a huge swathe of other software.

JTS came first, initially as a project of the British Columbia government. GEOS is a C++ port of JTS. There are also JavaScript and .NET ports (JSTS and NTS).

Each of those libraries has developed a rich downline of other libraries and projects that depend on them. On the desktop, on the web, in the middleware, JTS and GEOS power all of it.

So we know that work on JTS and GEOS on our side is going to benefit far more than just PostGIS.

I’ve already spent a decent amount of time on bringing the GEOS library up to date with the changes in JTS over the past few months, and trying to fulfill the “maintainer” role, merging pull requests and closing some outstanding tickets.

As Martin starts adding to JTS, I now feel more confident in my ability to bring those changes into the C++ world of GEOS as they land.

Without pre-judging what will get first priority, topics of overlay robustness, predicate performance, and geometry cleaning are near the top of our list.

Our spatial customers at Crunchy process a lot of geometry, so ensuring that PostGIS (GEOS) operations are robust and high performance is a big win for PostgreSQL and for our customers as well.

Category: postgresql

Andreas Scherbaum: How long will a 64 bit Transaction-ID last in PostgreSQL?

planet postgresql - 2019-02-03 (Sun) 22:34:00

Andreas 'ads' Scherbaum

At FOSDEM someone asked how long 64 bit Transaction-IDs will last.

To refresh: PostgreSQL is currently using 32 bits for the TXID, which is good for around 4 billion transactions:

fosdem=# SELECT 2^32;
  ?column?
------------
 4294967296
(1 row)

That will not last very long if you have a busy database doing many writes over the day. MVCC keeps the new and old versions of a row in the table, and the TXID will increase with every transaction. At some point the 4 billion transactions are reached, the TXID will wrap around, and start again at the beginning. The way transactions work in PostgreSQL, suddenly all data in your database would become invisible. No one wants that!

To limit this problem, PostgreSQL has a number of mechanisms in place:

  • PostgreSQL splits the transaction ID space in half: 2 billion in the past are visible, 2 billion in the future are not visible - all visible rows must live in the 2 billion in the past, at all times.
  • Old, deleted row versions are eventually removed by VACUUM (or autovacuum), and the XID is no longer used.
  • Old row versions which are still live are marked as "frozen" in the table and assigned a special XID - the previously used XID is no longer needed. The problem here is that every single table in every database must be vacuumed before the 2 billion threshold is reached.
  • PostgreSQL uses lazy XIDs, where a "real" transaction id is only assigned if the transaction changes something on disk - if a transaction is read only, and does not change anything, no transaction id is consumed.
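
To answer the question in the title with some back-of-the-envelope arithmetic (my own calculation, not quoted from the article): even at a sustained one million write transactions per second, a 64 bit counter would take on the order of half a million years to exhaust.

SELECT 2^64;                                 -- roughly 1.8 * 10^19 transaction ids
SELECT 2^64 / 1000000 / 86400 / 365.25;      -- at 1M tx/s: roughly 584,000 years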

 

 

Continue reading "How long will a 64 bit Transaction-ID last in PostgreSQL?"
Category: postgresql

Hubert 'depesz' Lubaczewski: Waiting for PostgreSQL 12 – Allow COPY FROM to filter data using WHERE conditions

planet postgresql - 2019-02-03 (Sun) 17:47:51
On 19th of January 2019, Tomas Vondra committed patch: Allow COPY FROM to filter data using WHERE conditions   Extends the COPY FROM command with a WHERE condition, which allows doing various types of filtering while importing the data (random sampling, condition on a data column, etc.). Until now such filtering required either preprocessing of … Continue reading "Waiting for PostgreSQL 12 – Allow COPY FROM to filter data using WHERE conditions"
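
The committed syntax looks roughly like this (a minimal sketch with made-up table, column and file names):

COPY people (id, name, age)
    FROM '/tmp/people.csv' WITH (FORMAT csv)
    WHERE age > 21;           -- rows not matching the condition are simply skipped
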
Category: postgresql

James Coleman: PostgreSQL at Scale: Database Schema Changes Without Downtime

planet postgresql - 2019-02-02 (Sat) 08:01:28

Braintree Payments uses PostgreSQL as its primary datastore. We rely heavily on the data safety and consistency guarantees a traditional relational database offers us, but these guarantees come with certain operational difficulties. To make things even more interesting, we allow zero scheduled functional downtime for our main payments processing services.

Several years ago we published a blog post detailing some of the things we had learned about how to safely run DDL (data definition language) operations without interrupting our production API traffic.

Since that time PostgreSQL has gone through quite a few major upgrade cycles — several of which have added improved support for concurrent DDL. We’ve also further refined our processes. Given how much has changed, we figured it was time for a blog post redux.

In this post we’ll address the following topics:

First, some basics

For all code and database changes, we require that:

  • Live code and schemas be forward-compatible with updated code and schemas: this allows us to roll out deploys gradually across a fleet of application servers and database clusters.
  • New code and schemas be backward-compatible with live code and schemas: this allows us to roll back any change to the previous version in the event of unexpected errors.

For all DDL operations we require that:

  • Any exclusive locks acquired on tables or indexes be held for at most ~2 seconds.
  • Rollback strategies do not involve reverting the database schema to its previous version.
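
A common way to enforce a lock-acquisition budget like the one above (a generic sketch, not necessarily Braintree's exact tooling; the table and column are placeholders) is to set lock_timeout before the DDL, so the statement gives up instead of queueing behind long-running transactions:

SET lock_timeout = '2s';
ALTER TABLE payments ADD COLUMN note text;
-- if the ACCESS EXCLUSIVE lock cannot be acquired within 2 seconds,
-- the statement errors out and can simply be retried later
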
Transactionality

PostgreSQL supports transactional DDL. In most cases, you can execute multiple DDL statements inside an explicit database transaction and take an “all or nothing” approach to a set of changes. However, running multiple DDL statements inside a transaction has one serious downside: if you alter multiple objects, you’ll need to acquire exclusive locks on all of those objects in a single

[...]
Category: postgresql

JestPHP Conference - February 22nd 2019, Mesa Arizona

php.net - 2019-02-02 (Sat) 03:59:28
Category: php

DDD Europe notes - Day 2

planet PHP - 2019-02-01 (Fri) 20:50:00
Cyrille Martraire: Domain modeling towards First Principles

This was a talk from the first day, but it required some more processing before writing about it. Cyrille is one of my favorite speakers. He's fast, funny and brings a lot of interesting topics to the table. So many that it's sometimes hard to keep following his train of thought, and writing down some notes at the same time.

A central concept from his talk was what he called the waterline between IT and the business. In a traditional scenario, developers get provided with "work" on a case-by-case basis. They don't learn about the general idea or plan, or even the vision, goal or need that's behind the "work item". They just have to "implement" it. It leads to badly designed code. But it also leads to the wrong solutions being delivered. If only developers could have talked with the people who actually have the problem for which they build the solution. Maybe there's another problem behind it, or maybe the business has provided the developer with a solution, instead of a problem. To raise the waterline means to get more involved with the customers and users, to understand their problems, and work together on a solution. Make sure you get involved.

When looking for the right solutions, investigate the problems and use the following heuristic: "Consider the intensional alternative". Cyrille considers "intensional" a pedantic word and likes it. It's opposed to "extensional". Extensional corresponds to enumerating the results. Intensional means defining the predicate which will produce those results. Looking for intensional solutions means you'll end up with a better understanding of the problem. And also an automatable one.

While collecting these intensional aspects of existing business problems, as a developer you will be building a theory about that business. Cyrille warns against the illusion that a domain is simply a lot of information with if-s on top. Business is messy, so we shouldn't be obsessed with rules. Some parts of a domain allow theorizing, some don't.

This nicely aligns with my own experience, trying to understand a business domain. Domain experts often aren't aware of the inconsistencies, or impossible rules they come up with. Once you bring in some logic, some rules seem reasonable but just aren't feasible. These may be areas where there's manual intervention in an otherwise automated system.

Another interesting concept Cyrille brought up was that of "skeuomorphism". He noticed that software solutions seem to continue building on technology from the past. Software systems often look like they're rebuilding a paper bureaucracy, but now it's digital. You can't really be "disruptive" if you don't think about radically different ways of solving some problem for the user. This was a bit of a shock, because I realized that I often use "paper metaphors" to find out a possible solution for a design issue. Then again, maybe there's room for both - a paper metaphor as a design tool, yet a non-paper bureaucracy inspired software solution.

Maaret Pyhäjärvi: Breaking Illusions with Testing

Identifying herself as a "feedback fairy", Maaret talks about testing, in a very broad sense. It doesn't only cover automated testing, but also exploratory testing, and what you're doing with them. Tests don't break code, they break the illusions you have about it. The way Maaret explained testing, I immediately became scared of all the illusions I have about my code and that I should start breaking them.

I like how Maaret used the term "heuristics", like more DDD-ers do, to share with us how she and other testers do their work. It seems it's very common to share heuristics in the testing community. Very interesting! I don't think programmers share them as freely as they should.

Testing an application starts with realizing: you don't know much. So you have to create some sort of a map of what's there (functions, data, the platform, etc.). Start to discover, and never be bored. Always look for the interesting things, and "poke it until it pops".

In general, what you'll find is all kinds of illusions (they could be called "assumptions", I think). In testing an application ("it has high code coverage, so it's tested") but also in the general activity of developing software ("if we have this new feature, we'll earn more money").

Thinking in terms of illusions, I think most development teams are very good at them. We have all kinds of ideas about our code, our way of working, the application, our teammates, the company, etc. But they aren't really tested. It'll be smart to challenge them.

Also, looking at how others approach their problems is very useful. There will be a different perspective that you

Truncated by Planet PHP, read more at the original (another 826 bytes)

Category: php

Michael Paquier: Two-phase commit and temporary objects

planet postgresql - 2019-02-01 (Fri) 17:01:39

A couple of weeks ago a bug popped up on the community mailing lists about the use of temporary objects in two-phase commit. After discussions, the result is the following commit:

commit: c5660e0aa52d5df27accd8e5e97295cf0e64f7d4
author: Michael Paquier <michael@paquier.xyz>
date: Fri, 18 Jan 2019 09:21:44 +0900

Restrict the use of temporary namespace in two-phase transactions

Attempting to use a temporary table within a two-phase transaction is forbidden for ages. However, there have been uncovered grounds for a couple of other object types and commands which work on temporary objects with two-phase commit. In short, trying to create, lock or drop an object on a temporary schema should not be authorized within a two-phase transaction, as it would cause its state to create dependencies with other sessions, causing all sorts of side effects with the existing session or other sessions spawned later on trying to use the same temporary schema name.

Regression tests are added to cover all the grounds found, the original report mentioned function creation, but monitoring closer there are many other patterns with LOCK, DROP or CREATE EXTENSION which are involved. One of the symptoms resulting in combining both is that the session which used the temporary schema is not able to shut down completely, waiting for being able to drop the temporary schema, something that it cannot complete because of the two-phase transaction involved with temporary objects. In this case the client is able to disconnect but the session remains alive on the backend-side, potentially blocking connection backend slots from being used. Other problems reported could also involve server crashes.

This is back-patched down to v10, which is where 9b013dc has introduced MyXactFlags, something that this patch relies on.

Reported-by: Alexey Bashtanov
Author: Michael Paquier
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/5d910e2e-0db8-ec06-dd5f-baec420513c3@imap.cc
Backpatch-through: 10
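
To see the class of problem being closed here, a minimal reproduction might look like this (my own sketch; it assumes max_prepared_transactions is set to a non-zero value so that two-phase commit is available):

BEGIN;
CREATE TEMP TABLE t_tmp (a int);
PREPARE TRANSACTION 'uses_temp_schema';
-- fails with an error along the lines of:
-- ERROR:  cannot PREPARE a transaction that has operated on temporary objects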

In PostgreSQL, temporary objects are as

[...]
Category: postgresql

Blast from the Past

planet PHP - 2019-02-01 (Fri) 16:00:00
Category: php

CoderCruise 2019 - Call for Speakers

php.net - 2019-02-01 (Fri) 07:01:27
Category: php

DDD Europe notes - Day 1

planet PHP - 2019-02-01 (Fri) 04:00:00
Eric Evans: Keynote ("Language in Context")

Starting out with the basics (words have meaning within a context; when we make the boundary of this context explicit we end up with a bounded context), Eric discussed two main topics: the Big Ball of Mud, and bounded contexts in the context (no pun intended) of microservices.

Legacy applications are always framed in a negative way, as if it's something to get away from. Personally, I've come to enjoy them a lot. However, there will always be the urge to work around legacy code. The Bubble Context (PDF) can be a great way of creating a new model that works well next to the already existing models. To keep a safe buffer between the new and the old model, you could build an Anti-Corruption Layer (ACL). A fun thing Eric mentioned is that the buffer works in two directions. The ACL also allows the old model to keep functioning without being disturbed by all the new things that are going on in the Bubble Context.

Given that a bounded context may or may not align with an actual subdomain, it may very well be that a legacy context is actually a Big Ball of Mud, with unified models, and simply just a lot of domain knowledge written between the many spaghetti lines that are in there. However, even though it is a mess, and it's getting harder to work with it every day, it may still be what you could call a "Mature Productive Context". The question is: is it still aligned with business views? If it is, we could improve at least maintainability and the cost of change by performing local refactorings. If it isn't, it'll be very hard to change anything. If the basic assumptions of the model change, rework will be very costly.

As a matter of fact, for a project I'm currently working on, we're looking into a module (or context), which requires some heavy refactoring, because it has become technically very hard to work with it. However, the business is still quite happy about it, and it's quite central to its processes.

An important Domain-Driven approach which can be used in the area of legacy code is where you analyze the different subdomains, and find out which ones are generic, and which ones are "core" to the business. As an example, in the aforementioned project there are actually two candidates for context-level improvements. One is related to Sales (which is the heart of this financial application), and one is related to Addressbook records (which is very much supportive to the Sales part). One could say it's even generic, in the sense that an off the shelf solution might be preferable. We wouldn't want to spend a lot of design or development effort there either.

Eric mentioned the term "Quaint Context" as a suitable name for a context that one would consider "legacy". It uses outdated technology probably, and has become hard to maintain. It won't be possible to make big changes there (as mentioned, because these basic assumptions can't easily be changed), so another good name could be "Patch-by-Patch Context".

With a microservice architecture, another option to deal with legacy contexts becomes what Eric calls the "Exposed Legacy Asset" (yet another nice term!). This will be a legacy application which starts to adapt to the microservices environment by producing messages that will be useful for actual microservices in that environment. For instance, database triggers could be used to produce events. The external events themselves don't have to be as low-level as the internal events that caused them.

Eric touches on several other interesting aspects of microservice architecture, but I wanted to briefly mention some other relevant ideas here. Eric looked back at 15 years of Domain-Driven Design and proposed that by now we maybe need a definition of DDD itself. He doesn't want DDD to be just a club, but asks for intellectual honesty. If you try to apply DDD and somehow it fails, you should share this story. If you're skeptical about some aspect of DDD, talk about it. I like how it boils down to focusing on the core domain, exploring models together, and speaking a ubiquitous language in an explicitly bounded context. Nice!

Rebecca Wirfs-Brock: Growing Your Design Heuristics Toolkit

This one was a workshop with limited access, so I was lucky I could attend it. Rebecca had spoken in a previous edition of the conference about heuristics, which triggered my interest in the idea. The workshop was about the process behind it. It had some interesting pointers, like a PDF about the concept and a book by Billy Vaughn Koen: Discussion of the Method. Definitely things to check ou

Truncated by Planet PHP, read more at the original (another 2340 bytes)

Category: php

Keith Fiske: Managing Transaction ID Exhaustion (Wraparound) in PostgreSQL

planet postgresql - 2019-01-31 (Thu) 21:14:00

One of the most critical topics to understand when administering a PostgreSQL database is the concept of transaction IDs (TXID) and that they can be exhausted if not monitored properly. However, this blog post isn't going to go into the details of what TXID exhaustion actually is. The Routine Vacuuming section of the documentation is probably one of the most important to read and understand, so I will refer you there. What this blog post is going to cover is an easy way to monitor for it and what can be done to prevent it ever being a problem.
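
For a quick look at how close each database is to the wraparound threshold, a query along these lines is commonly used (a generic example, not necessarily the exact monitoring query from the post):

SELECT datname,
       age(datfrozenxid) AS xid_age,
       round(age(datfrozenxid) / 2000000000.0 * 100, 2) AS pct_towards_wraparound
FROM pg_database
ORDER BY xid_age DESC;   -- the hard limit is roughly 2 billion, so keep xid_age well below it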

Category: postgresql

Using img2lambda to publish your Serverless PHP layer

planet PHP - 2019-01-31 (Thu) 20:02:00

This interesting tweet by Clare Liguori came to my attention last week:

This new img2lambda tool will take the layers of a Docker container and convert them to AWS layers for use in Lambda.

I poked around with Clare's example and updated my lambda-php project in order to understand how it works. I also rewrote my runtime's bootstrap to make it clearer.

The clever thing from my point of view is that you can build your PHP runtime layer locally using Docker and then publish the layers to AWS for use in your Lambda functions. This means you can now use the layer in different projects and have them all reference the same PHP runtime.

The magic of Docker

This is all done with the magic of Docker. Specifically, we create a Dockerfile that creates two containers:

  • A container to build the PHP binary
  • A container containing the layers that img2lambda will upload to AWS

Simplified, it looks like this:

# Build PHP in the Lambda container
FROM amazonlinux:2017.03.1.20170812 as builder

# PHP version to create
ARG ver="7.3.1"

RUN sed -i 's;^releasever.*;releasever=2017.03;;' /etc/yum.conf && \
    yum clean all && \
    yum install -y autoconf bison gcc gcc-c++ make libcurl-devel {etc}

RUN curl -sL https://github.com/php/php-src/archive/php-${ver}.tar.gz | tar -xvz && \
    cd php-src-php-${ver} && \
    ./buildconf --force && \
    ./configure --prefix=/opt/php/ --with-openssl --with-curl --{etc} && \
    make install && \
    /opt/php/bin/php -v

# copy php binary into /runtime/bin/php
RUN mkdir -p /runtime/bin && \
    cp /opt/php/bin/php /runtime/bin/php

# copy bootstrap files into /runtime
COPY src/* /runtime/

# Create runtime container for use with img2lambda
FROM lambci/lambda:provided as runtime
COPY --from=builder /runtime /opt/

The first part downloads and compiles the PHP binary and puts it into /runtime/bin/php and also copies in the bootstrap files required to make the layer act as a Lambda runtime.

The second part simply copies all the files in /runtime in the first container into a new container. This new container contains a single layer, which is our PHP runtime.

To build it we do:

$ docker build -t lambda-php-runtime .

The tag name can be anything.

Create the layer in AWS

To create the AWS layer we use img2lambda:

$ img2lambda -i lambda-php-runtime:latest -r eu-west-2 -n lambda-php73

This tool will find the layer in the runtime container and upload it to eu-west-2 and then store the identifier in output/layers.json, which looks like this:

[ "arn:aws:lambda:eu-west-2:66...06:layer:lambda-php73-sha256-e3c4...2e618b:1" ]

Using in Serverless Framework

You can take the ARN number and use it for your actions in serverless.yml like this:

functions:
  hello:
    handler: handler.hello
    layers:
      - "arn:aws:lambda:eu-west-2:66...06:layer:lambda-php73-sha256-e3c4...2e618b:1"

This is the way that we can re-use our PHP runtime in all our actions across multiple projects.

Serverless also allows you to import data from JSON files into your serverless.yml file, so you can also do:

functions:
  hello:
    handler: handler.hello  # {file name}.{function name}. In this case: hello() in handler.php
    layers: ${file(../php-runtime/output/layers.json)}

This has the advantage that when you re-build and upload PHP runtime layer, your project picks it up, which could also be convenient.

Testing locally

One of the benefits of building the runtime in Docker is that we can run it locally for testing our Lambda function. As our runtime container inherits from lambci/lambda:provided, we get an environment that looks very much like Lambda itself.

To test locally, we use another container that inherits from our runtime one and copies in the PHP files for our function. The Dockerfile looks something like this:

FROM lambda-php-runtime as function
COPY handler.php /var/task/src/handler.php

We create our cont

Truncated by Planet PHP, read more at the original (another 1321 bytes)

Category: php

Bruce Momjian: Limiting Superuser Activity

planet postgresql - 2019-01-31 (Thu) 02:15:01

This interesting email thread explores the question of how much you can prevent or detect unauthorized database superuser activity. The main conclusions from the thread are:

  • It is impossible to restrict database administrator access without hindering their ability to perform their jobs
  • Monitoring superuser activity is the most reasonable way to detect and hopefully discourage unauthorized activity
  • Monitoring includes:
    • Assign a separate account to each administrator for auditing purposes; do not use generic/shared accounts
    • Use an auditing tool to record database activity, e.g., pgAudit
    • Use syslog to send database logs to a computer not under database administrators' control
    • Record all shell command activity in a similar way
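
As a rough sketch of what the logging side of such monitoring can look like (illustrative settings only; pgAudit must be installed, and shared_preload_libraries changes require a server restart):

-- in postgresql.conf:
--   shared_preload_libraries = 'pgaudit'
--   pgaudit.log = 'ddl, role'          -- audit DDL and role changes, adjust as needed
--   log_destination = 'syslog'         -- ship logs to a host the DBAs do not control
-- then, in each database to be audited:
CREATE EXTENSION pgaudit;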

There is also a helpful summary email.

Category: postgresql

Pages