Friday, 28 September 2012

PostgreSQL Conference Europe 2012 - 3 weeks to go!

There are less than four weeks until this years PostgreSQL Conference Europe, to be held at the Corinthia Hotel, Prague, organised by PostgreSQL Europe. This years event is shaping up to be our best ever, with a higher rate of attendee registrations than we've ever seen before, a bumper crop of talks from PostgreSQL users and contributors from around the world, and a keynote presentation from well known industry veteran, renowned author on all things SQL, and one of the original authors of the SQL89 and SQL92 standards, Joe Celko.

For more information on the event, visit the conference website.

This years conference will start with a day of training sessions (available at extra cost), places on which must be pre-booked when you register:
  • Joe Celko - A day of SQL with Celko: How often does an opportunity to learn SQL from the master come along? Not very often! Not to be missed if you're new to SQL or want to polish up your skills.
  • Bruce Momjian, Devrim Gunduz (EnterpriseDB) - Mastering PostgreSQL Administration: Hone your mastery of PostgreSQL, learning from Bruce, one of the project founders, and long term contributor and RPM maintainer, Devrim. This day long session starts with the basics, moving into more advanced administration and management topics.
  • Greg Smith, Peter Geoghegan (2nd Quadrant) - PostgreSQL Performance Training: Renowned PostgreSQL performance expert Greg and developer Peter spend half a day showing you how to get the best performance from your database servers.
  • Dimitri Fontaine, Simon Riggs (2nd Quadrant) - PostgreSQL Replication Training: PostgreSQL developer Dimitri and committer and author of features such as hot standby and point in time recovery, Simon, spend half a day discussing the streaming and synchronous replication, as well as failover and monitoring options available to PostgreSQL users.
  • Pavel Stehule - Implementace uložených procedur v PostgreSQL: Veteran contributor to PostgreSQL's stored functions languages Pavel shows you how to write stored procedures in PostgreSQL. Half day, presented in Czech.
  • Tomas Vondra (GoodData) - Čtení exekučních plánů: Long time contributor Tomas teaches the finer points of reading query (EXPLAIN) plans. Half day, presented in Czech.
Following the training sessions, we have three days of talks in three different rooms, with something for everyone - check out the conference schedule for details - and as always there will be social events hosted by our sponsors, and the hallway track to meet and chat with other PostgreSQL users and developers.

Convinced? Of course you are. Head over to the registration page and book your place now!

Monday, 26 March 2012

PostgreSQL Developer Meeting 2012

For the last few years I've organised a by-invitation meeting of the most active developers in the PostgreSQL community the day before PGCon's sessions start in Ottawa. This is undoubtedly the best time of the year to do this as the vast majority of PostgreSQL hackers attend this event - many only attend local events for the rest of the year.

As you can imagine, figuring out how to keep the meeting productive is not an easy task. Opening it up to anyone to attend is not really an option unfortunately, as we would be unlikely to be able to provide a suitable conference room for large numbers of people due to the cost of the space (which really needs to be comfortable as we'll be sitting around a table for nine hours or so) and the food. Budget aside; having too many people in attendance makes it very difficult to have a productive meeting, a problem we believe we had last year when we had our highest number of attendees to date (around 30) and what many felt was our least productive meeting to date.

This year, the Core Team spent quite some time discussing the upcoming meeting and came to the conclusion that we needed to do two things to make the meeting a success again:

1) We need to spend more time fine-tuning the agenda.

We try to leave the agenda up to the attendees - after all, it is their meeting. However, there is a tendency to include items that haven't really been thought through in enough detail in advance to make good discussion points. This year we're working to ensure the agenda contains only items that have had some amount of thought given to them already, but are yet to be resolved. Ideally, we suspect many of these topics will be things that have been discussed on the mailing lists already, but haven't come to any sort of conclusion. That happens from time to time, and face to face discussion often helps reach consensus on the way forward. What we don't want to discuss are the ideas that we all have after a couple of drinks at the pub, scribbled on the back of a beer mat, added to the agenda and then forgotten about. Those ideas are best saved for the evenings of the conference (with more beer)!

2) We need to reduce the number of attendees at the meeting.

This was the really hard one. We decided that we should reduce the number of attendees down from last years 30 to maybe 20 - 25. Two of those people are Josh Berkus and myself who are primarily there to take notes and chair the meeting respectively. The rest of the attendees come from a variety of backgrounds typically - committers, active developers, future (hopefully) active developers, past developers and senior staff from Postgres-friendly companies. This year we decided that the criteria for invitations would be "has this person been a significant contributor to PostgreSQL 9.2"? We've applied that test to all the proposed attendees, and whilst many people who received invitations are amongst the normal crowd, unfortunately it does mean that some people who have attended in the past were not sent invitations this year. If you're one of those people, please accept my apologies - hopefully this blog post will clarify why that is the case.

Of course, it is possible we have overlooked inviting people who have made significant contributions to the upcoming 9.2 release. If you're one of those people, please let me know if you would like to attend.

Regardless of whether or not you'll be at the developer meeting dear readers, I do hope to see you in Ottawa in May.

Wednesday, 21 December 2011

Updated PostgreSQL Download Infrastructure

Back at the tail end of November, the PostgreSQL sysadmin/web team successfully migrated the PostgreSQL website to a new platform, based on PostgreSQL, Django, Lighttpd, and Varnish, to replace the old somewhat complex and messy PHP platform used in the past. Functionally and visually the website is almost identical to what it was, though the behind-the-scenes management interface is now vastly improved, as is the "Your Account" section which now offers users much more control over their submissions like news and events etc.

One other change that went largely unnoticed however was in the downloads section of the website, specifically the FTP area. This is a web interface over the content on the PostgreSQL FTP site, ftp.postgresql.org, that gives users a nice way to browse the files and directories on the site. On the old website, when the user clicked to download a file they would then be taken to a page of flags where they could select a mirror site to download the file from. The mirrors were all third party servers over which we had no control, aside from an automated system to ensure we only listed those which had content which was no more than 48 hours out of date. This arrangement made sense years ago when bandwidth was more precious, however with the bandwidth available to us these days it's really just clunky and inconvenient for users to have to choose one of 75 flags to reach a server that may not be entirely up to date.

To resolve this, as the new website infrastructure went live we also pushed a new download infrastructure into production. What you'll see now is direct links to files on download servers we run ourselves, from the website. This gives us a number of advantages:

  • The user has a much slicker experience when downloading, both in terms of the workflow, and often the speed of downloads (because some of the old mirrors were much faster than others).
  • We can push out files to the download servers in minutes, rather than days.
  • We can collect meaningful statistics to help us understand what users are downloading.
  • We can automatically (and invisibly) disable download servers in the event of problems, within minutes.
So what does this mean to the end user? Well, for a user coming from the website downloads take fewer clicks and avoid the "flags" page which could be daunting and potentially confusing. For those using the FTP site directly, there are a number of changes:
  • The site can be accessed at ftp://ftp.postgresql.org/, rather than using one of the individual mirror hostnames we used in the past.
  • Because there are no third party mirrors, there are no inconsistent paths to the content (one mirror in the past may have had content under /pub/mirrors/postgresql/ whilst another may have used /u/postgres/).
  • We can now offer downloads over HTTP using the same paths as FTP - you can use http://ftp.postgresql.org/pub/... or ftp://ftp.postgresql.org/pub/...
  • We've also opened up RSYNC access to all users, where previously only registered mirror sites could rsync the downloads from us: rsync://ftp.postgresql.org::pgsql-ftp.
Because there are lots of links to files on the old mirror network on the web, we've left the old mirror hostnames in place for the time being (though they are no longer being monitored), however they will be phased out over time.

Happy downloading!

Tuesday, 11 October 2011

To upgrade or not to upgrade? That is the question.

On a fairly regular basis the Postgres community hear from users who are complaining of bugs in old versions of Postgres - they'll post a bug report or a request for help on one of the mailing lists saying something along the lines of having run into an issue with PostgreSQL 8.4.2 for example, and immediately be met with suggestions to upgrade to the latest version because there have been 37 bug fixes and 5 security issues resolved since 8.4.2 was released.

Generally this happens with one of two types of users. There are those that just don't bother to upgrade (who we can do little about), and those that don't upgrade because they are concerned that changes to PostgreSQL will break their application. This latter class of user is sometimes also restricted by what they can do by corporate policies in their workplace.

The Postgres developers are mindful of this issue and have practices in place to allow users to safely upgrade without significant risk of behavioural changes breaking their applications. The practice is really quite simple; minor upgrades of PostgreSQL never include new features.

What does this mean in practice? Well, PostgreSQL version numbers are in 3 parts, X.Y.Z (some packages such as the EnterpriseDB installers also add a build suffix):

X.Y: This is the major version number of the server, for example, 8.4, 9.0 or 9.1. New major versions may include new functionality, require upgrades to the database files on disk and generally require thorough testing of applications prior to deployment.

Z: This is the minor version number. This number is increased for bug fixes releases, also known as "point releases". These releases never include new features; only carefully applied bug fixes. Point releases are fully compatible with previous point releases of the same major version and should require minimal testing prior to deployment on existing installations.

Returning to our opening example, the user in this case is not being told to upgrade to 9.1.1 - the latest and greatest release at the time of writing, complete with a myriad of new features and changes from 8.4.2 - but to 8.4.9, the latest point release in the 8.4 series, which is functionally identical to 8.4.2. This numbering scheme and the processes behind it are specifically designed to allow users to safely and easily upgrade their database servers to minimise the number of known bugs in the software; in fact the PostgreSQL developers consider not upgrading to the latest point release to be riskier than upgrading.

So next time you're "certifying" your application with PostgreSQL, aim to certify it with a specific major version of the server, and avoid getting into a situation where you prevent yourself from updating to the latest point release as doing so can cause more problems than it can solve, and if you're a sysadmin or DBA rest assured that point releases won't introduce functionality changes and should be welcomed and installed as soon as possible!

Wednesday, 5 October 2011

StackBuilder Package Updates


If you've ever used one of the PostgreSQL installers for v8.2 or above, either the old Windows MSI installer or the newer "one click" installers that also support Linux and Mac, you'll probably have come across StackBuilder. For those that haven't or those that never found the time, StackBuilder was introduced with the PostgreSQL 8.2 installer to allow us to distribute different components of PostgreSQL independently of the server itself. Originally in 8.0/8.1, the installer included lots of additional products, such as the ODBC, JDBC, OLEDB and .NET drivers, Slony and PostGIS.  As you can imagine, this proved near impossible to maintain as we needed to try to coordinate the release of products from multiple independently run projects.

StackBuilder was the solution to this. The installers were cut down to essentially include just the PostgreSQL server, pgAdmin and StackBuilder, which allowed us to provide all the other components on independent release schedules. It also gave us a vehicle to encourage adoption of PostgreSQL by other Open Source projects, by including packages for them as well, so with a few mouse clicks a user could be up and running with a "stack" like Drupal, Apache, PHP and PostgreSQL on their OS of choice. We made a conscious decision to include software other than well known Open Source products in the StackBuilder catalog as well, with the aim of giving the user as much choice of product to use with PostgreSQL as possible - as a result we have products in the catalog from multiple vendors and projects, as well as the "PostgreSQL family" projects. If you're interested in having a PostgreSQL related product included in the catalog, please contact me to discuss how we can make that happen.

So, with the refresher course out of the way, todays blog post is prompted by the latest round of updates that I've been adding to the catalog - we've got 42 updates to the Open Source packages, as well as 18 new packages, including for the first time, pgBouncer, pgMemcache and Drupal 7.

Updates:

  • Apache/PHP 2.2.20-5.3.8: Linux32, Linux64, Mac, Win32
  • mediaWiki 1.17.0: Linux32, Linux64, Mac, Win32
  • pgJDBC 9.0-801: Linux32, Linux64, Mac, Win32
  • phpBB 9.0-801: Linux32, Linux64, Mac, Win32
  • PostGIS 1.3.6 for PG 8.3: Linux32, Linux64, Mac
  • PostGIS 1.4.2 for PG 8.4: Linux32, Linux64, Mac
  • PostGIS 1.5.3 for PG 9.0: Linux32, Linux64, Mac
  • psqlODBC 09.00.0310: Linux32, Linux64, Mac, Win32
  • Slony 1.2.22 for PG 8.3: Linux32, Linux64, Mac
  • Slony 2.0.7 for PG 8.4: Linux32, Linux64, Mac
  • Slony 2.0.7 for PG 9.0: Linux32, Linux64, Mac
  • Npgsql 2.0.11: Linux32, Linux64, Mac, Win32

New releases:

  • Drupal 7:  Linux32, Linux64, Mac, Win32
  • pgBouncer:  Linux32, Linux64
  • pgMemcache 2.0.1 for PG 9.0: Linux32, Linux64, Mac
  • pgMemcache 2.0.1 for PG 9.1: Linux32, Linux64, Mac
  • PostGIS 1.5.3 for PG 9.1: Linux32, Linux64, Mac
  • Slony 2.0.7 for PG 9.1: Linux32, Linux64, Mac

Note that the PostGIS community maintain the PostGIS installers for Windows which will be released as soon as they're available. Windows updates for Slony are still in development due to an issue found in QA. Mac and Windows builds of pgBouncer are on their way.

To download and install any of these packages, just run StackBuilder - if you don't have it already you can get it with the PostgreSQL Installers.

Enjoy :-)

Tuesday, 4 October 2011

PostgreSQL Conference Europe: Are you ready?

PostgreSQL Conference Europe 2011 starts 2 weeks from today in the beautiful city of Amsterdam in the Netherlands. This is the fourth annual conference hosted by PostgreSQL Europe, following on from extremely successful events in Prato (Italy), Paris and Stuttgart, and is aimed at developers, DBAs, technologists and decision makers either using, or considering using the world's most advanced Open Source database.

This year we have four days on the schedule, with a kick-off day of training sessions hosted by respected PostgreSQL developers such as Greg Smith, Bruce Momjian, Magnus Hagander, Guillaume Lelarge and more. Topics will cover performance tuning, application development, database administration, replication & high availability and geospatial. The training sessions are available on their own, or as part of a regular conference attendance at additional - but very reasonable - cost.

We had a record number of talk proposals submitted this year but we've resisted the urge to host even more sessions in parallel - in fact we've reduced the number of parallel sessions to three as we all know how frustrating it can be when more than one that you want to see are at the same time. Instead we've extended the conference by a day to accomodate over 40 different sessions, which has the added bonus of allowing an additional night of social activities - always a great way to discuss the latest technologies, trends and ideas with other Postgres users over a beer or two.

We've got a great range of topics for this year, covering new features in PostgreSQL 9.1 and beyond, developing applications, running Postgres in the cloud, hacking PostgreSQL internals, tools and add-on products and managing large databases, presented by a wide cross-section of users and developers, including a number of this year's Google Summer of Code students who will talk about their work. You can view the complete schedule on the conference website.

Our opening keynote this year will be presented by Ram Mohan, EVP and CTO of Afilias who manage the .info, .org and .mobi top level domains using Postgres. Ram will be discussing the business decisions and strategy around their use of PostgreSQL. Our closing keynote will be presented by Ed Boyajian, President and CEO of EnterpriseDB who will discuss PostgreSQL's role in the post-Oracle era.

So, if you haven't done so already, head on over to the website to register as an attendee to avoid missing out on what promises to be an outstanding conference in an fantastic location. See you in Amsterdam!

Monday, 1 August 2011

PostgreSQL 9.1, meet MySQL

So having spent the last few months with my head buried deep in a project at work, I finally managed to get back to my previous hacking on SQL/MED at the weekend after the kids went away for a week (good $DEITY it's quiet here)! Within a couple of hours, I had my half-baked Foreign Data Wrapper for MySQL up and running, and am now able to create foreign table objects in PostgreSQL 9.1 that map either directly to tables in a MySQL database, or to queries run on the MySQL server.

Here's an example:

-- Create the required functions for the FDW.
CREATE FUNCTION mysql_fdw_handler()
RETURNS fdw_handler
AS '$libdir/mysql_fdw'
LANGUAGE C STRICT;

CREATE FUNCTION mysql_fdw_validator(text[], oid)
RETURNS void
AS '$libdir/mysql_fdw'
LANGUAGE C STRICT;

-- Create the data wrapper or "transport".
CREATE FOREIGN DATA WRAPPER mysql_fdw
HANDLER mysql_fdw_handler
VALIDATOR mysql_fdw_validator;

-- Create the foreign server, a pointer to the MySQL server.
CREATE SERVER mysql_svr
FOREIGN DATA WRAPPER mysql_fdw
OPTIONS (address '127.0.0.1', port '3306');

-- Create one or more foreign tables on the MySQL server. The first of
-- these maps to a remote table, whilst the second uses an SQL query.
CREATE FOREIGN TABLE employees (
id integer,
name text,
address text)
SERVER mysql_svr
OPTIONS (table 'hr.employees');

CREATE FOREIGN TABLE ex_staff (
id integer,
name text,
address text)
SERVER mysql_svr
OPTIONS (query 'SELECT * FROM hr.employees WHERE date_left IS NOT NULL');


-- Create a user mapping to tell the FDW the username/password to
-- use to connect to MySQL, for PUBLIC. This could be done on a per-
-- role basis.
CREATE USER MAPPING FOR PUBLIC
SERVER mysql
OPTIONS (username 'dpage', password '');

So let's try it out. Here's the test table in MySQL:

mysql> SELECT * FROM employees;
+----+-------------+----------------------------+------------+
| id | name | address | date_left |
+----+-------------+----------------------------+------------+
| 1 | Dave Page | 27 High Street, Georgetown | NULL |
| 2 | Fred Bloggs | 46 Mill Road, Klasborough | NULL |
| 3 | Fred Bloggs | 24 The Wharf, Westhampton | 2010-05-23 |
+----+-------------+----------------------------+------------+
3 rows in set (0.00 sec)

And here we are in PostgreSQL 9.1 beta 3:

postgres=# SELECT * FROM employees;
id | name | address
----+-------------+----------------------------
1 | Dave Page | 27 High Street, Georgetown
2 | Fred Bloggs | 46 Mill Road, Klasborough
3 | Fred Bloggs | 24 The Wharf, Westhampton
(3 rows)

postgres=# SELECT * FROM ex_staff;
id | name | address
----+-------------+---------------------------
3 | Fred Bloggs | 24 The Wharf, Westhampton
(1 row)

For the curious, here's what the EXPLAIN output looks like:

postgres=# EXPLAIN SELECT * FROM employees;
QUERY PLAN
----------------------------------------------------------------
Foreign Scan on employees (cost=10.00..13.00 rows=3 width=68)
Local server startup cost: 10
MySQL query: SELECT * FROM hr.employees
(3 rows)

Pretty neat huh? There are a couple of limitations in the current implementation:
  • No attempt is currently made to push down quals (WHERE clauses) to the MySQL server, so every row MySQL finds is returned to PostgreSQL and filtered there. There's no defined API for this in PostgreSQL yet, and it's not immediately clear how to build something more complex than the simple example I used in my Redis FDW that would be required for a remote relational database. That said, you can build WHERE clauses into the foreign table definition of course.
  • The MySQL C API doesn't seem to offer a simple way to either randomly access a result set, or at least reset the cursor to the first row, unless you copy the entire resultset to the client (PostgreSQL in this case). Because we need to be able to return to the first row if PostgreSQL calls the Rescan function, we therefore currently copy the entire resultset, rather than reading it from the server, on demand.
Aside from minor tweaks, this is probably about as far as I'll take this little project for now. I'll be talking about it at both Postgres Open 2011 in Chicago in September, and PGBR 2011 in São Paulo in November - hopefully I'll see you there.

The MySQL FDW source code is available on Github, and uses the PostgreSQL licence.