Significant Symbols

planet PHP - 2019-04-23 (Tue) 17:27:00
London, UK Tuesday, April 23rd 2019, 09:27 BST

Last week a person on the #php IRC channel on freenode mentioned that he had problems loading some extensions with his self-compiled PHP binary. For example, trying to activate the timezonedb PECL extension failed with:

sapi/cli/php: symbol lookup error: /usr/local/php/extensions/debug-non-zts-20180731/timezonedb.so: undefined symbol: php_date_set_tzdb

Which is odd, as the php_date_set_tzdb is a symbol that PHP has made available since Date/Time support was added. I asked the user to check whether his PHP binary exported the symbol by using nm, and the answer was:

$ nm sapi/cli/php | grep php_date_set_tzdb
000000000018bc75 t php_date_set_tzdb

The lowercase letter t marks a symbol in the text (code) section that is local only: the symbol was not made available for shared libraries to use. In other words, extensions that make use of the symbol, such as the timezonedb extension, cannot find it, and hence fail to load.
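As a rough illustration of how that nm output can be read, here is a minimal Python sketch. The function name classify_nm_line is made up for this example, and the rule it applies (lowercase type letter means local, uppercase means global, with u, v and w as special global cases) is a simplification of what the nm manual describes.

```python
# Type letters that are lowercase but still denote special global symbols.
SPECIAL_GLOBAL = {"u", "v", "w"}


def classify_nm_line(line):
    """Return (name, type_letter, scope) for one line of `nm` output.

    Hypothetical helper: it only looks at the last two columns, so it
    also works for lines that have no address column.
    """
    parts = line.split()
    sym_type, name = parts[-2], parts[-1]
    if sym_type.isupper() or sym_type in SPECIAL_GLOBAL:
        scope = "global"
    else:
        scope = "local"
    return name, sym_type, scope


if __name__ == "__main__":
    print(classify_nm_line("000000000018bc75 t php_date_set_tzdb"))
```

With the line from the report, the sketch classifies php_date_set_tzdb as local, which matches the failure seen when loading timezonedb.so.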

In PHP, the php_date_set_tzdb function is defined with the PHPAPI prefix, which explicitly should mark the symbol as a global symbol, so that shared libraries can find it:

PHPAPI void php_date_set_tzdb(timelib_tzdb *tzdb)

The PHPAPI macro is used, because on Windows it is required to explicitly make symbols available:

#ifdef PHP_WIN32
# define PHPAPI __declspec(dllexport)

On Linux (with GCC) symbols are made available unless marked differently (through for example the static keyword).

When looking into this, we discovered that his PHP binary had no exported symbols at all.

After doing a bit more research, I found that more recent GCC versions support a specific compiler flag that changes the default behaviour of symbol visibility: -fvisibility=hidden. In recent versions of PHP, we enable this flag if it is supported by the installed GCC version through a check in the configure system:

dnl Mark symbols hidden by default if the compiler (for example, gcc >= 4)
dnl supports it. This can help reduce the binary size and startup time.
AX_CHECK_COMPILE_FLAG([-fvisibility=hidden], [CFLAGS="$CFLAGS -fvisibility=hidden"])

As this makes all symbols hidden by default, the same commit also made sure that when the PHPAPI moniker is used, we set the visibility of these specific symbols back to visible:

# if defined(__GNUC__) && __GNUC__ >= 4
#  define PHPAPI __attribute__ ((visibility("default")))
# else
#  define PHPAPI
# endif

When the original reporter saw this, he mentioned that he was using an older GCC version, 3.4, and that he could see the -fvisibility=hidden flag when running make, just like here:

… cc -Iext/date/lib … -fvisibility=hidden … -c ext/date/php_date.c -o ext/date/php_date.lo

Because his GCC supported the -fvisibility=hidden flag, the check in the configure script enabled this feature, but because his GCC version was older than version 4, the counteracting __attribute__ ((visibility("default"))) was not set for symbols that are explicitly marked with the PHPAPI specifier. This means that no symbols were made available for shared PHP extensions to use.
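The interaction between the two checks can be modelled in a few lines. The following Python sketch is only a truth-table model of the logic described above, not code from PHP's build system; the function name and parameters are invented for illustration.

```python
def phpapi_symbols_exported(flag_accepted, gcc_major):
    """Model whether PHPAPI symbols stay visible to shared extensions.

    flag_accepted: the configure check found that the compiler accepts
                   -fvisibility=hidden, so everything is hidden by default.
    gcc_major:     value of __GNUC__; the PHPAPI macro only expands to the
                   visibility("default") attribute when this is >= 4.
    """
    hidden_by_default = flag_accepted
    phpapi_restores_visibility = gcc_major >= 4
    return (not hidden_by_default) or phpapi_restores_visibility


if __name__ == "__main__":
    for accepted, major in [(True, 3), (True, 8), (False, 3)]:
        print(accepted, major, phpapi_symbols_exported(accepted, major))
```

Only the combination from the report (flag accepted, major version below 4) yields no exported symbols; the two checks disagree because one tests for the feature and the other tests for a version number.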

The user created a bug report for this issue, but as GCC 3.4 is a really old version, it seems unlikely that this issue will get fixed, unless somebody contributes a patch. In the end, it was quite a fun detective story to get to the bottom of this!

Category: php

How to validate the email address format in PHP?

planet PHP - 2019-04-23 (Tue) 00:00:00

Receiving email messages via your website or web application is an important feature and often the only way to get in contact with your customers. If you look back, how often have you got an email message from a potential customer with an invalid or wrong email address? Sure you can use advanced regular expression […]

Originally published by Web Development Blog

Category: php

Michael Paquier: Postgres 12 highlight - REINDEX CONCURRENTLY

planet postgresql - 2019-04-22 (Mon) 15:52:09

A lot of work has been put into making Postgres 12 an excellent release, and among the features introduced there is one that found its way into the tree after first being proposed to the community at the end of 2012. Here is the commit that introduced it:

commit: 5dc92b844e680c54a7ecd68de0ba53c949c3d605
author: Peter Eisentraut <peter@eisentraut.org>
date: Fri, 29 Mar 2019 08:25:20 +0100

REINDEX CONCURRENTLY

This adds the CONCURRENTLY option to the REINDEX command. A REINDEX CONCURRENTLY on a specific index creates a new index (like CREATE INDEX CONCURRENTLY), then renames the old index away and the new index in place and adjusts the dependencies, and then drops the old index (like DROP INDEX CONCURRENTLY). The REINDEX command also has the capability to run its other variants (TABLE, DATABASE) with the CONCURRENTLY option (but not SYSTEM).

The reindexdb command gets the --concurrently option.

Author: Michael Paquier, Andreas Karlsson, Peter Eisentraut
Reviewed-by: Andres Freund, Fujii Masao, Jim Nasby, Sergei Kornilov
Discussion: https://www.postgresql.org/message-id/flat/60052986-956b-4478-45ed-8bd119e9b9cf%402ndquadrant.com#74948a1044c56c5e817a5050f554ddee
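To make the variants named in the commit message concrete, here is a small Python sketch that builds the corresponding statements. The helper names are hypothetical, the set of kinds is limited to the ones the commit message mentions, and the keyword order (CONCURRENTLY after the object kind) is assumed from the PostgreSQL 12 REINDEX synopsis.

```python
# Object kinds mentioned in the commit message; SYSTEM cannot be combined
# with CONCURRENTLY.
KINDS = {"INDEX", "TABLE", "DATABASE", "SYSTEM"}


def quote_ident(name):
    """Quote an SQL identifier by doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def reindex_statement(kind, name, concurrently=True):
    """Build a REINDEX statement (hypothetical helper, for illustration)."""
    kind = kind.upper()
    if kind not in KINDS:
        raise ValueError("unsupported REINDEX kind: " + kind)
    if concurrently and kind == "SYSTEM":
        raise ValueError("REINDEX SYSTEM cannot be run CONCURRENTLY")
    option = " CONCURRENTLY" if concurrently else ""
    return "REINDEX " + kind + option + " " + quote_ident(name) + ";"


if __name__ == "__main__":
    print(reindex_statement("index", "my_index"))
    print(reindex_statement("table", "my_table"))
```

The identifiers my_index and my_table are placeholders; quoting them makes the names case-sensitive, which may or may not be what a real script wants.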

As the documentation points out, REINDEX needs to take an exclusive lock on the indexed relation, meaning that for the whole duration of the operation, queries on it cannot run and have to wait for the REINDEX to finish. REINDEX can be very handy when an index is corrupted, or when an index needs to be rebuilt because of bloat. The longer the operation takes, the longer a production instance is unavailable, which is bad for any deployment, so maintenance windows become mandatory. There is a community tool called pg_reorg, which happens to be used by Instagram, that aims to reduce the impact of a REINDEX at the cost of extra resources, by using a trigger-based method to replay tuple changes while an index is rebuilt in parallel with the existing one. Later this

Category: postgresql

Raghavendra Rao: Fixing up a corrupted TOAST table

planet postgresql - 2019-04-22 (Mon) 13:30:25
Today, while taking a logical backup (pg_dump) of a table in a database cluster (PG 9.4), we saw a TOAST table error. That error indicates TOAST table corruption. To fix this, we don’t need any special software; all we have to do is follow the instructions repeatedly suggested by Postgres community folks on the community channel....
Category: postgresql

A PHP Compiler, aka The FFI Rabbit Hole

planet PHP - 2019-04-22 (Mon) 13:00:00

It’s no secret that I’m into building toy compilers and programming languages. Today I’m introducing something that’s not a toy (I hope). Today, I’m introducing php-compiler (among many other projects). My hope is that these projects will grow from experimental status into fully production ready systems.

JIT? AOT? VM? What The Heck?

Since I’m going to be talking a lot about compilers and components in this post, I figure it’s good to start with a primer on how they work, and how the different types behave.

Types of Compilers

Let’s start by talking about the 3 main categories of how programs are executed. (There are definitely some blurred lines here, and you’ll hear people using these labels to refer to multiple different things, but for the purposes of this post):

  • Interpreted: The vast majority of dynamic languages use a Virtual Machine of some sort. PHP, Python (CPython), Ruby, and many others may be interpreted using a Virtual Machine.

    A VM - at its most abstract level - is a giant switch statement inside of a loop. The language parses and compiles the source code into a form of Intermediary Representation often called Opcodes or ByteCode.

    The prime advantage of a VM is that it’s simpler to build for dynamic languages, and removes the “waiting for code to compile” step.

  • Compiled: The vast majority of what we think of as static languages are “Ahead Of Time” (AOT) Compiled directly to native machine code. C, Go, Rust, and many many others use an AOT compiler.

    AOT basically means that the full compilation process happens as a whole, ahead of when you want to run the code. So you compile it, and then some time later you can execute it.

    The prime advantage of AOT compilation is that it can generate very efficient code. The (prime) downside is that it can take a long time to compile code.

  • Just In Time (JIT): JIT is a relatively recently popularized method to get the best of both worlds (VM and AOT). Lua, Java, JavaScript, Python (via PyPy), HHVM, PHP 8, and many others use a JIT compiler.

    A JIT is basically just a combination of a VM and an AOT compiler. Instead of compiling the full program at once, it instead runs the code on a Virtual Machine for a while. It does this for two reasons: to figure out which parts of the code are “hot” (and hence most useful to be in machine code), and to collect some runtime information about the code (what types are commonly used, etc). Then, it pauses execution for a moment to compile just that small bit of code to machine code before resuming execution. A JIT runtime will bounce back and forth between interpreted code and native compiled code.

    The prime advantage of JIT compilation is that it balances the fast deployment cycle of a VM with the potential for AOT-like performance for some use-cases. But it is also insanely complicated since you’re building 2 full compilers, and an interface between them.

Another way of saying this, is that an Interpreter runs code, whereas an AOT compiler generates machine code which then the Computer runs. And a JIT compiler runs the code but every once in a while translates some of the running code into machine code, and then executes it.
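The "giant switch statement inside of a loop" description of a VM can be sketched in a few lines of Python. This is a toy stack machine with made-up opcodes (PUSH, ADD, MUL, HALT), not PHP's actual opcode set, and it uses an if/elif chain where a C implementation would use a switch.

```python
def run(program):
    """Execute a list of (opcode, argument) pairs on a tiny stack VM."""
    stack = []
    pc = 0  # program counter: index of the next instruction
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        # The dispatch below plays the role of the switch statement.
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif op == "HALT":
            break
        else:
            raise ValueError("unknown opcode: " + op)
    return stack[-1]


if __name__ == "__main__":
    # Opcodes for the expression (2 + 3) * 4
    program = [
        ("PUSH", 2), ("PUSH", 3), ("ADD", None),
        ("PUSH", 4), ("MUL", None), ("HALT", None),
    ]
    print(run(program))
```

An AOT compiler would instead translate the same opcode list into machine code once, and a JIT would start in a loop like this one and only translate the instruction sequences it observes running often.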

Some more definitions

I just used the word “Compiler” a lot (along with a ton of other words), but each of these words has many different meanings, so it’s worth talking a bit about that:

  • Compiler: The meaning of “Compiler” changes depending on what you’re talking about:

    When you’re talking about building language runtimes (aka: compilers), a Compiler is a program that translates code from one language into another with different semantics (there’s a conversion step, it isn’t just a representation). It could be from PHP to Opcode, it could be from C to an Intermediary Representation. It could be from Assembly to Machine Code, it could be from a regular expression to machine code. Yes, PHP 7.0 includes a compiler to compile from PHP source code to Opcodes.

    When you’re talking about using language runtimes (aka: compilers), a Compiler is usually implied to be a specific set of programs that convert the original source code into machine code. It’s worth noting that a “Compiler” (like gcc for example) is normally made up of several smaller compilers th

Truncated by Planet PHP, read more at the original (another 77750 bytes)

Category: php

Shaun M. Thomas: PG Phriday: Around the World in Two Billion Transactions

planet postgresql - 2019-04-20 (Sat) 02:00:21

Transaction IDs (XID) have been something of a thorn in Postgres’ side since the dawn of time. On one hand, they’re necessary to differentiate tuple visibility between past, present, and concurrent transactions. On the other hand, the counter that stores it is only 32-bits, meaning it’s possible to eventually overflow without some kind of intervention. […]
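Why a 32-bit counter leads to the "two billion" figure in the title can be illustrated with a small Python sketch of circular comparison. It assumes XIDs are compared modulo 2^32, with the difference interpreted as a signed 32-bit number, and it ignores the special reserved XIDs; the function name is made up for this example.

```python
XID_MODULUS = 2 ** 32
HALF_RANGE = 2 ** 31  # roughly two billion


def xid_precedes(a, b):
    """Return True if XID a counts as older than XID b.

    The difference is taken modulo 2**32 and read as a signed 32-bit
    value: a "negative" difference means a is in b's past.
    """
    diff = (a - b) % XID_MODULUS
    return diff >= HALF_RANGE


if __name__ == "__main__":
    print(xid_precedes(100, 200))           # ordinary case
    print(xid_precedes(2 ** 32 - 10, 5))    # counter wrapped around
    print(xid_precedes(5, 5 + HALF_RANGE))  # exactly half the range apart
```

Under this scheme every XID sees about two billion values as its past and two billion as its future, which is why old rows need some kind of intervention before the counter advances that far.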

The post PG Phriday: Around the World in Two Billion Transactions appeared first on 2ndQuadrant | PostgreSQL.

Category: postgresql

Ernst-Georg Schmid: The Hare and the Hedgehog. Muscle, brain - or both?

planet postgresql - 2019-04-19 (Fri) 07:22:00
In the famous fairy tale the hedgehog wins the race against the hare because he uses his brain to outwit the much faster hare: Brain beats muscle. But is that always the case? And what if we combine the two virtues?

The case at hand: Screening large sets of molecules for chemical similarity.

Since (sub)graph isomorphism searching faces some mathematical challenges because of its non-polynomial complexity - even if you can use a specialized index, like pgchem::tigress does - fast similarity searching based on binary fingerprints has gained popularity in recent years.
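To show why binary fingerprints make similarity searching cheap, here is a minimal Python sketch of the Tanimoto coefficient, a measure commonly used for this purpose. The excerpt does not say which measure the author uses, so treat the choice as an assumption; fingerprints are represented as plain Python integers used as bit sets.

```python
def popcount(bits):
    """Number of set bits, i.e. the cardinality of the fingerprint."""
    return bin(bits).count("1")


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as integers.

    Shared set bits divided by bits set in either fingerprint. Two empty
    fingerprints are scored 0.0 here, which is an arbitrary convention.
    """
    union = popcount(fp_a | fp_b)
    if union == 0:
        return 0.0
    return popcount(fp_a & fp_b) / union


if __name__ == "__main__":
    print(tanimoto(0b1100, 0b1010))
    print(tanimoto(0b1111, 0b1111))
```

The whole comparison is two bitwise operations and two bit counts, with no graph matching involved.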

I was tasked with evaluating a solution to the problem of similarity screening large sets of molecules with PostgreSQL where the fingerprints are generated externally, e.g. with the CDK.

This is what I came up with...

Preparing the Racetrack
CREATE TABLE cdk.externalfp (
id int4 NOT NULL,
smiles varchar NOT NULL,
pubchemfp varbit NULL,
"cardinality" int4 NULL,
CONSTRAINT externalfp_pk PRIMARY KEY (id)
);

Above is the table definition of the final table. The cardinality column will not be used for now, but since it is calculated by the fingerprint generator anyway, keeping it will save some work later. If you want to copy my example code 1:1, please use a database named chemistry and a schema named cdk.

First we need to load some data into the table. I used the free NCISMA99 dataset from the National Cancer Institute, containing 249081 chemical structures in SMILES notation.

COPY cdk.externalfp (id, smiles) FROM '/tmp/NCISMA99';

And a few seconds later you should have 249081 rows in the table. Now we need to generate the fingerprints. The generator code is here; additionally, you need the CDK 2.2 and a PostgreSQL JDBC driver. After changing the code to reflect your JDBC URL, you are good to go.

Running the FingerprintGenerator should show no errors and takes about 30 minutes on my Core i5 Linux notebook. The fingerprint used is the PubChem fingerprint as described here.
Now we can put an index on the cardinality column[...]
Category: postgresql

Tomáš Votruba Blog: Pattern Refactoring

phpdeveloper.org - 2019-04-19 (Fri) 04:00:02

In the Removing Static - There and Back Again post we tried looking at anti-patterns in legacy code from a new point of view. It can be static in your code, it can be the active record pattern you needed for fast bootstrapping of your idea, it can be moving from code in controllers to a command bus.


Category: php

Blog entries :: phly, boy, phly: From Zend Framework To The Laminas Project

phpdeveloper.org - 2019-04-19 (Fri) 04:00:02

Ten years ago this month, I was involved in a couple of huge changes for Zend Framework. First, I helped spearhead integration of the JavaScript library Dojo Toolkit into Zend Framework, and finalized the work that month. I'd worked closely with the two developers who had been leading that proj...

Category: php

Technical Thoughts, Tutorials, and Musings: PSR-14: Example - plugin registration

phpdeveloper.org - 2019-04-19 (Fri) 04:00:02

PSR-14: Example - plugin registration

In Content Management Systems and similar highly-configurable applications, a common pattern is to have a registration mechanism of some sort. That is, some part of the system asks other parts of the system "give me a list of your Things!", and t...
Category: php

Derick Rethans: PHP Internals News: Episode 6: PHP Quality Assurance

phpdeveloper.org - 2019-04-19 (Fri) 04:00:02
London, UK Thursday, April 18th 2019, 09:06 BST
In this sixth episode of "PHP Internals News" we talk to Remi Collet (Twitter, Website, GitHub, Donate) about the work that he does through...
Category: php

Voices of the ElePHPant: Interview with Andrew Caya

phpdeveloper.org - 2019-04-19 (Fri) 04:00:01


Show Notes: Linux for PHP, PHP Continuous Learning, Light MVC Framework


The post Interview with Andrew Caya appeared first on Voices of the ElePHPant.

Category: php

Pavel Stehule: new release of plpgsql_check - possibility to check SQL injection issue

planet postgresql - 2019-04-19 (Fri) 03:22:00
Yesterday I released the next version of plpgsql_check.

With this release, a developer can check for some well-known patterns of SQL injection vulnerabilities. Stored procedure code in native languages like PL/SQL, T-SQL or PL/pgSQL is secure, and there is no risk of SQL injection until dynamic SQL is used (the EXECUTE command in PL/pgSQL). Safe programming requires sanitizing all string variables, for which anybody can use the functions quote_literal, quote_ident or format. This check can be slow, so it has to be enabled explicitly by setting the security_warnings parameter:

CREATE OR REPLACE FUNCTION public.foo1(a text)
RETURNS text
LANGUAGE plpgsql
AS $function$
DECLARE result text;
BEGIN
-- secure
EXECUTE 'SELECT $1' INTO result USING a;
-- secure
EXECUTE 'SELECT ' || quote_literal(a) INTO result;
-- secure
EXECUTE format('SELECT %L', a) INTO result;
-- unsecure
EXECUTE 'SELECT ''' || a || '''' INTO result;
-- unsecure
EXECUTE format(e'SELECT \'%s\'', a) INTO result;
RETURN result;
END;
$function$

postgres=# select * from plpgsql_check_function('foo1');
│ plpgsql_check_function │
(0 rows)

postgres=# select * from plpgsql_check_function('foo1', security_warnings => true);
│ plpgsql_check_function │
│ security:00000:11:EXECUTE:text type variable is not sanitized │
│ Query: SELECT 'SELECT ''' || a || '''' │
│ -- ^ │
│ Detail: The EXECUTE expression is SQL injection vulnerable. │
│ Hint: Use quote_ident, quote_literal or format function to secure variable. │
│ security:00000:13:EXECUTE:text type variable is not sanitized │
│ Query: SELECT format(e'SELECT \'%s\'', a) [...]
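The difference between the patterns flagged above and the quoted variants can be reproduced outside the database. The Python sketch below uses a simplified stand-in for quote_literal (it only doubles single quotes and ignores backslash handling), and both the helper names and the users table in the payload are hypothetical.

```python
def quote_literal_sketch(value):
    """Simplified stand-in for quote_literal: wrap the value in single
    quotes and double any embedded single quote."""
    return "'" + value.replace("'", "''") + "'"


def build_unsafe(a):
    # Mirrors: 'SELECT ''' || a || ''''
    return "SELECT '" + a + "'"


def build_safe(a):
    # Mirrors: 'SELECT ' || quote_literal(a)
    return "SELECT " + quote_literal_sketch(a)


if __name__ == "__main__":
    payload = "x'; DROP TABLE users; --"
    print(build_unsafe(payload))
    print(build_safe(payload))
```

In the unsafe string the quote inside the payload closes the literal early, so the rest of the payload is parsed as SQL; in the safe string the doubled quote keeps the whole payload inside one literal.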
Category: postgresql

Julien Rouhaud: New in pg12: Statistics on checksums errors

planet postgresql - 2019-04-18 (Thu) 20:02:26
Data checksums

Added in PostgreSQL 9.3, data checksums can help to detect data corruption happening on the storage side.

Checksums are only enabled if the instance was set up using initdb --data-checksums (which isn’t the default behavior), or if they were activated afterwards with the new pg_checksums tool, also added in PostgreSQL 12.

When enabled, checksums are written each time a block is written to disk, and verified each time a block is read from disk (or from the operating system cache). If the checksum verification fails, an error is reported in the logs. If the block was read by a backend, the query will obviously fail, but if the block was read by a BASE_BACKUP operation (such as pg_basebackup), the command will continue its processing. While data checksums will only catch a subset of possible problems, they still have some value, especially if you don’t trust the reliability of your storage.
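The write-then-verify cycle described above can be sketched in Python. This is only an illustration of the principle: it uses zlib.crc32 as a stand-in checksum and a made-up block layout (4 checksum bytes followed by the payload), neither of which reflects PostgreSQL's actual page format or checksum algorithm.

```python
import zlib


class ChecksumError(Exception):
    """Raised when a block's stored checksum does not match its payload."""


def write_block(payload):
    """Return the bytes that would go to disk: checksum, then payload."""
    checksum = zlib.crc32(payload)
    return checksum.to_bytes(4, "big") + payload


def read_block(block):
    """Verify the checksum on every read and return the payload."""
    stored = int.from_bytes(block[:4], "big")
    payload = block[4:]
    if zlib.crc32(payload) != stored:
        raise ChecksumError("checksum verification failed")
    return payload


if __name__ == "__main__":
    block = write_block(b"some tuple data")
    print(read_block(block))
    # Flip one bit in the last byte to simulate storage corruption.
    corrupted = block[:-1] + bytes([block[-1] ^ 0x01])
    try:
        read_block(corrupted)
    except ChecksumError as exc:
        print("detected:", exc)
```

A backend-style caller would let the exception abort the query, while a backup-style caller could log it and carry on, which matches the two behaviours described in the paragraph above.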

Up to PostgreSQL 11, any checksum validation error could only be found by looking into the logs, which clearly isn’t convenient if you want to monitor such errors.

New counters available in pg_stat_database

To make checksum errors easier to monitor, and help users to react as soon as such a problem occurs, PostgreSQL 12 adds new counters in the pg_stat_database view:

commit 6b9e875f7286d8535bff7955e5aa3602e188e436
Author: Magnus Hagander <magnus@hagander.net>
Date: Sat Mar 9 10:45:17 2019 -0800

Track block level checksum failures in pg_stat_database

This adds a column that counts how many checksum failures have occurred on files belonging to a specific database. Both checksum failures during normal backend processing and those created when a base backup detects a checksum failure are counted.

Author: Magnus Hagander
Reviewed by: Julien Rouhaud


commit 77bd49adba4711b4497e7e39a5ec3a9812cbd52a
Author: Magnus Hagander <magnus@hagander.net>
Date: Fri Apr 12 14:04:50 2019 +0200

Show shared object statistics in pg_stat_database

This adds a row to the pg_stat_database view with datoid 0 and datname NULL fo[...]
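A monitoring job built on these counters mostly has to compare two snapshots. The Python sketch below assumes each snapshot is a dictionary mapping datname to the per-database failure counter, for example filled from a query on pg_stat_database; the column name checksum_failures is taken from the PostgreSQL 12 documentation rather than from this excerpt, and the function name is invented.

```python
def new_checksum_failures(previous, current):
    """Return {datname: increase} for databases whose counter grew.

    The key None stands for the shared-objects row, whose datname is NULL.
    Databases missing from the previous snapshot are treated as starting
    from zero.
    """
    increases = {}
    for datname, count in current.items():
        delta = count - previous.get(datname, 0)
        if delta > 0:
            increases[datname] = delta
    return increases


if __name__ == "__main__":
    before = {"app": 0, "postgres": 0, None: 0}
    after = {"app": 2, "postgres": 0, None: 0, "reporting": 1}
    print(new_checksum_failures(before, after))
```

Alerting on the increase rather than on the absolute value avoids repeating an alarm for failures that were already reported, although a statistics reset would need separate handling.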
Category: postgresql

Interview with Andrew Caya

planet PHP - 2019-04-18 (Thu) 20:00:00

@andrewscaya Show Notes


The post Interview with Andrew Caya appeared first on Voices of the ElePHPant.

Category: php

Hubert 'depesz' Lubaczewski: Waiting for PostgreSQL 12 – Log all statements from a sample of transactions

planet postgresql - 2019-04-18 (Thu) 17:01:31
On 3rd of April 2019, Alvaro Herrera committed patch:
Log all statements from a sample of transactions
This is useful to obtain a view of the different transaction types in an application, regardless of the durations of the statements each runs.
Author: Adrien Nayrat
Commit message makes it pretty clear, so let's see … Continue reading "Waiting for PostgreSQL 12 – Log all statements from a sample of transactions"
Category: postgresql

Hubert 'depesz' Lubaczewski: Waiting for PostgreSQL 12 – Report progress of CREATE INDEX operations

planet postgresql - 2019-04-18 (Thu) 10:15:52
On 2nd of April 2019, Alvaro Herrera committed patch:
Report progress of CREATE INDEX operations
This uses the progress reporting infrastructure added by , adding support for CREATE INDEX and CREATE INDEX CONCURRENTLY.
There are two pieces to this: one is index-AM-agnostic, and the other is AM-specific. The latter is fairly elaborate … Continue reading "Waiting for PostgreSQL 12 – Report progress of CREATE INDEX operations"
Category: postgresql

PSR-14: Example - plugin registration

planet PHP - 2019-04-18 (Thu) 08:08:00

In Content Management Systems and similar highly-configurable applications, a common pattern is to have a registration mechanism of some sort. That is, some part of the system asks other parts of the system "give me a list of your Things!", and then modules/extensions/plugins (whatever the system calls them) can incrementally build up that list of Things, which the caller then does something with. Those Things can be defined by the extension, or they can be defined by user-configuration and turned into a Thing definition by the module. Both are valid and useful, and can be mixed and matched.

This pattern lends itself very well to an Event system like PSR-14, and in fact the "give me a list of Things" pattern was one of the explicit use cases the Working Group considered. Today let's look at how one could easily implement such a mechanism.
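The "give me a list of your Things" pattern can be sketched independently of any framework. The following is a Python analogue, not the PSR-14 PHP interfaces: the class and method names (CollectThings, Dispatcher, listen, dispatch) are invented for illustration, and the only behaviour borrowed from the pattern is that listeners receive an event object, mutate it, and the caller gets it back.

```python
class CollectThings:
    """Event object that listeners fill with Thing definitions."""

    def __init__(self):
        self._things = {}

    def add(self, name, definition):
        self._things[name] = definition

    def things(self):
        # Return a copy so callers cannot mutate the event's state.
        return dict(self._things)


class Dispatcher:
    """Minimal dispatcher: listeners are registered per event class."""

    def __init__(self):
        self._listeners = {}

    def listen(self, event_type, listener):
        self._listeners.setdefault(event_type, []).append(listener)

    def dispatch(self, event):
        for listener in self._listeners.get(type(event), []):
            listener(event)
        return event


if __name__ == "__main__":
    dispatcher = Dispatcher()
    # Two hypothetical plugins, each contributing one Thing.
    dispatcher.listen(CollectThings, lambda e: e.add("blog", {"label": "Blog post"}))
    dispatcher.listen(CollectThings, lambda e: e.add("page", {"label": "Static page"}))
    print(dispatcher.dispatch(CollectThings()).things())
```

Whether a Thing comes from code or from user configuration makes no difference to the caller, since both kinds of listener end up calling add on the same event object.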

Continue reading this post on SteemIt.

Larry 17 April 2019 - 6:08pm
Category: php