<?xml version="1.0" encoding="UTF-8" ?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" version="2.0"><channel><title>Stephen Frost | CrunchyData Blog</title>
<atom:link href="https://www.crunchydata.com/blog/author/stephen-frost/rss.xml" rel="self" type="application/rss+xml" />
<link>https://www.crunchydata.com/blog/author/stephen-frost</link>
<image><url>https://www.crunchydata.com/build/_assets/default.png-W4XGD4DB.webp</url>
<title>Stephen Frost | CrunchyData Blog</title>
<link>https://www.crunchydata.com/blog/author/stephen-frost</link>
<width>256</width>
<height>256</height></image>
<description>PostgreSQL experts from Crunchy Data share advice, performance tips, and guides on successfully running PostgreSQL and Kubernetes solutions</description>
<language>en-us</language>
<pubDate>Tue, 27 Apr 2021 05:00:00 EDT</pubDate>
<dc:date>2021-04-27T09:00:00.000Z</dc:date>
<dc:language>en-us</dc:language>
<sy:updatePeriod>hourly</sy:updatePeriod>
<sy:updateFrequency>1</sy:updateFrequency>
<item><title><![CDATA[ Choice of Table Column Types and Order When Migrating to PostgreSQL ]]></title>
<link>https://www.crunchydata.com/blog/choice-of-table-column-types-and-order-when-migrating-to-postgresql</link>
<description><![CDATA[ An underappreciated element of PostgreSQL performance can be the data types chosen and their organization in tables. For sites that are always looking for that incremental performance improvement, managing the exact layout and utilization of every byte of a row (also known as a tuple) can be worthwhile. ]]></description>
<content:encoded><![CDATA[ <p><em>Contributing author <a href=/blog/author/david-youatt>David Youatt</a></em><p>An underappreciated element of PostgreSQL <a href=/blog/optimize-postgresql-server-performance>performance</a> can be the <a href=/blog/back-to-basics-with-postgresql-data-types>data types</a> chosen and their organization in tables. For sites that are always looking for that incremental performance improvement, managing the exact layout and utilization of every byte of a row (also known as a tuple) can be worthwhile. This is an important consideration for databases that are <a href=/blog/migrating-from-oracle-to-postgresql-questions-and-considerations>migrating</a> from other databases to PostgreSQL, as the data types available in PostgreSQL, and how they are laid out, differ from those of many other platforms.<h2 id=when-to-use-the-numericdecimal-data-type-vs-other-numeric-types-in-postgresql><a href=#when-to-use-the-numericdecimal-data-type-vs-other-numeric-types-in-postgresql>When to use the NUMERIC/DECIMAL Data Type Vs. Other Numeric Types in PostgreSQL</a></h2><p>What is important, when trying to squeeze out every bit of performance, is optimizing the organization of your data and minimizing overhead. Much of this is done by the PostgreSQL developers and the compiler, but there are things you can do to improve performance, including choosing the right column types and the order of columns in table definitions.<h2 id=variable-length><a href=#variable-length>Variable length</a></h2><p>The NUMERIC type (the same as DECIMAL) makes sense for things like money because PG stores the value exactly, including fractional digits, and it can precisely represent a larger range of numbers than integer or bigint can. Computations are exact, with no rounding, but the storage format is base-10000 and, importantly, variable in size. 
There are several options for storing numbers in most databases, including PostgreSQL. The <a href=https://www.postgresql.org/docs/current/datatype-numeric.html#DATATYPE-NUMERIC-DECIMAL>PostgreSQL documentation</a> is the authoritative source (or look at the source code) for types to represent numbers.<p>On-disk storage of NUMERIC is actually base-10000, not base-10. This means that there are actually 4 base-10 digits per base-10000 digit and each base-10000 digit takes up 2 bytes. The reason that the variable storage size matters is that variable-length data adds an additional header—1 byte when the variable-length data is less than 127 bytes, 4 bytes otherwise—and even just comparing two numeric values against each other is much more expensive than doing the same for integers or bigints.<p>So, how does this work in practice? Storage of the vast majority of numerics will be less than 127 bytes, in which case you have:<ul><li>1 byte for the length (assuming less than 127 bytes)<li>2 bytes for the numeric header<li>2 bytes for each base-10000 digit (for up to 4 base-10 digits)</ul><p>and therefore, numerics tend to require between 5 and 11 bytes to store.<p>We can compare that against the storage required for integer and bigints using the function pg_catalog.pg_column_size():<pre><code class=language-pgsql>=> select
  c1 as numeric, pg_column_size(c1) as numeric_size,
  c2 as int, pg_column_size(c2) as int_size,
  c3 as bigint, pg_column_size(c3) as bigint_size
from t1;

     numeric      | numeric_size |    int     | int_size |      bigint      | bigint_size
------------------+--------------+------------+----------+------------------+-------------
                1 |            5 |          1 |        4 |                1 |           8
               12 |            5 |         12 |        4 |               12 |           8
              123 |            5 |        123 |        4 |              123 |           8
             1234 |            5 |       1234 |        4 |             1234 |           8
            12345 |            7 |      12345 |        4 |            12345 |           8
           123456 |            7 |     123456 |        4 |           123456 |           8
          1234567 |            7 |    1234567 |        4 |          1234567 |           8
         12345678 |            7 |   12345678 |        4 |         12345678 |           8
        123456789 |            9 |  123456789 |        4 |        123456789 |           8
       1234567890 |            9 | 1234567890 |        4 |       1234567890 |           8
      12345678901 |            9 |            |          |      12345678901 |           8
     123456789012 |            9 |            |          |     123456789012 |           8
    1234567890123 |           11 |            |          |    1234567890123 |           8
   12345678901234 |           11 |            |          |   12345678901234 |           8
  123456789012345 |           11 |            |          |  123456789012345 |           8
 1234567890123456 |           11 |            |          | 1234567890123456 |           8
(16 rows)
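-- (A sketch, not shown in the original post.) The table t1 above could have
-- been created and populated along these lines, leaving c2 NULL once the
-- value exceeds the integer range:
--   create table t1 (c1 numeric, c2 integer, c3 bigint);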
</code></pre><p>Looking at this, we can see that 'integer' is always going to be smaller (and faster!) to use than 'numeric', and 'bigint' will be smaller once you get up into the hundreds of millions (and it'll also be faster, of course).<h2 id=alternative-types-with-fixed-length><a href=#alternative-types-with-fixed-length>Alternative Types with Fixed Length</a></h2><p>Unfortunately, people often use the NUMERIC/DECIMAL type when migrating from other databases to PostgreSQL, even though the actual values in a given column are integers (because it’s a primary key, for example). You’ll get far better performance and typically less space used by using either INTEGER or BIGINT in those cases.<p>PG also supports float or double types (REAL and DOUBLE PRECISION, hello FORTRAN), which may be appropriate if exact precision isn’t required (such as with measurements). Of course, PostgreSQL supports the standard SQL type syntax of float(n) where n = 1..24 maps to REAL and n = 25..53 maps to DOUBLE PRECISION, and just float means DOUBLE PRECISION. If you've ever had to be aware of the details of IEEE 754, or know about the related exponent and fraction bits, those ranges will look familiar.<p>Using a fixed-width data type will likely be more efficient, and can require less space than NUMERIC (though it does depend on the exact values being stored).<h2 id=type-size-alignment-and-order><a href=#type-size-alignment-and-order>Type Size, Alignment, and Order</a></h2><p>If you’re worried about how much space is needed to store your data on disk, and you probably should be, the column order also matters. When doing binary I/O, PG accesses the binary data in the row directly; it doesn’t serialize the data. PostgreSQL doesn’t reorder columns, compress across columns (though an individual value may be compressed), or otherwise attempt to avoid wasted space. 
That’s left up to the database designer to consider and decide on.<p>One aspect of this direct mapping from disk to memory is that memory access alignment must be respected, at a minimum for performance but sometimes also for function, depending on the architecture. This means that data types must be stored at certain offsets in memory, which can introduce alignment “holes”. To ensure you don’t introduce alignment holes, you should order the columns in your table definition with the largest fixed-width columns first, followed by smaller fixed-width columns, and then variable-length fields at the end.<p>For example, if you have an integer column and a bigint column, you would want to create the table with the columns in this order: bigint, integer.<p>If you create it as integer, bigint, then you’ll end up with a 4-byte alignment hole (just completely dead and wasted space) between the integer and the bigint.<p>For example:<pre><code class=language-txt>0    4    8   12   16   20   24   28   32
+----+----+----+----+----+----+----+----+
|int4| W  |  bigint |int4|  W |  bigint | ...
+----+----+----+----+----+----+----+----+
</code></pre><p>Where "W" is wasted space, because the bigint will be naturally aligned on an 8-byte address. By reordering the column definitions, you can avoid the wasted space, in memory and for binary storage:<pre><code class=language-txt>0    4    8   12   16   20   24   28   32
+----+----+----+----+----+----+----+----+
| bigint  |  bigint |int4|int4| ...
+----+----+----+----+----+----+----+----+
</code></pre><p>In just this simple example, you have saved 8 bytes of memory and storage: 8 out of 32 bytes, or 25%. Imagine if you have wide rows with many columns, and a large table with many rows.<p>PostgreSQL will tell you what size a type is and how it will be aligned with this query:<pre><code class=language-pgsql>SELECT typname,typbyval,typlen,typalign FROM pg_catalog.pg_type ORDER BY 3 DESC,1;
</code></pre><p>Here are the first few lines. Note that a typlen of -1 indicates a variable-length type. Note that several of the wider types are geometric types, and that uuid is as long as two bigints, which you should keep in mind if you are considering using UUIDs.<pre><code class=language-pgsql> typname                               | typbyval | typlen | typalign
---------------------------------------+----------+--------+----------
 name                                  | f        |     64 | c
 sql_identifier                        | f        |     64 | c
 box                                   | f        |     32 | d
 lseg                                  | f        |     32 | d
 circle                                | f        |     24 | d
 line                                  | f        |     24 | d
 interval                              | f        |     16 | d
 point                                 | f        |     16 | d
 uuid                                  | f        |     16 | c
 aclitem                               | f        |     12 | i
 timetz                                | f        |     12 | d
 float8                                | t        |      8 | d
 int8                                  | t        |      8 | d
 internal                              | t        |      8 | d
 macaddr8                              | f        |      8 | i
 money                                 | t        |      8 | d
</code></pre><p>and for NUMERIC:<pre><code class=language-pgsql> typname | typbyval | typlen | typalign
---------+----------+--------+----------
 numeric | f        |     -1 | i
(1 row)
</code></pre><h2 id=why><a href=#why>Why?</a></h2><p>Because when PostgreSQL does binary I/O to storage, it uses the in-memory storage layout. It doesn't pack and unpack individual items (columns) to minimize size.<h2 id=but-why><a href=#but-why>But Why?</a></h2><p>CPUs want data to be "naturally aligned", that is, stored on address boundaries matching the size of the data. For example, the natural alignment for a byte is on a 1-byte address boundary, for a short integer on a 2-byte address, for a 32-bit integer (and float) on a 4-byte address, and for a 64-bit integer (and double) on an 8-byte boundary. RISC CPUs like ARM and MIPS strictly enforce this. If your data is not naturally aligned, you get a runtime error. Intel architectures will adjust misaligned data at runtime, but at a high performance cost. Fortunately, this is mostly controlled by the compiler when the PostgreSQL source code is compiled, though overly clever programmers can cause misalignments.<p>In general, the PostgreSQL source code plus the compiler decide memory alignment for you, but they do not change the order in which you define things. You can help by using types with fixed length (not DECIMAL or NUMERIC), and if you can, by declaring your table's columns in order from largest fixed size to smallest fixed size, followed by variable-sized data like NUMERIC/DECIMAL. 
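<p>To see the effect of column order from SQL (a sketch, not from the original post: the table names here are hypothetical, and pg_column_size() of a whole row includes tuple-header overhead on top of the column data):

```pgsql
-- Same four columns, two different declaration orders
CREATE TABLE t_bad  (a integer, b bigint, c integer, d bigint);
CREATE TABLE t_good (b bigint, d bigint, a integer, c integer);
INSERT INTO t_bad  VALUES (1, 2, 3, 4);
INSERT INTO t_good VALUES (2, 4, 1, 3);
-- t_bad pays for two 4-byte alignment holes (one before each bigint),
-- so its row should come out about 8 bytes larger:
SELECT pg_column_size(t_bad.*)  AS bad_row_size  FROM t_bad;
SELECT pg_column_size(t_good.*) AS good_row_size FROM t_good;
```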
Note that ordering your columns like this not only reduces memory use and storage requirements, but can also improve performance through better hardware cache and TLB use, though that's getting off into the weeds.<p>Note that standard fixed-length types like int2, integer (int4), bigint (int8), REAL (float4), and DOUBLE PRECISION (float8) will use native CPU types and instructions, while operations on arbitrary-precision types like NUMERIC will be implemented partly in software.<h2 id=summary><a href=#summary>Summary</a></h2><p>The short story is that you can help your performance cause, storage-wise, memory-wise, and CPU-wise by:<ul><li>using NUMERIC/DECIMAL only when you really need it, like counting money<li>choosing alternative types like INTEGER, BIGINT, REAL, DOUBLE PRECISION when you don't<li>declaring your tables' columns from largest fixed size to smallest fixed size, followed by variable-length types like NUMERIC/DECIMAL</ul> ]]></content:encoded>
<category><![CDATA[ Production Postgres ]]></category>
<author><![CDATA[ Stephen.Frost@crunchydata.com (Stephen Frost) ]]></author>
<dc:creator><![CDATA[ Stephen Frost ]]></dc:creator>
<guid isPermaLink="false">https://blog.crunchydata.com/blog/choice-of-table-column-types-and-order-when-migrating-to-postgresql</guid>
<pubDate>Tue, 27 Apr 2021 05:00:00 EDT</pubDate>
<dc:date>2021-04-27T09:00:00.000Z</dc:date>
<atom:updated>2021-04-27T09:00:00.000Z</atom:updated></item>
<item><title><![CDATA[ How to setup Windows Active Directory with PostgreSQL GSSAPI Kerberos Authentication ]]></title>
<link>https://www.crunchydata.com/blog/windows-active-directory-postgresql-gssapi-kerberos-authentication</link>
<description><![CDATA[ PostgreSQL provides many authentication methods to allow you to pick the one that makes the most sense for your environment. This guide will show you how to use your Windows Active Directory to authenticate to PostgreSQL via GSSAPI Kerberos authentication. ]]></description>
<content:encoded><![CDATA[ <p><a href=https://postgresql.org>PostgreSQL</a> provides a bevy of <a href=https://www.postgresql.org/docs/current/client-authentication.html>authentication methods</a> to allow you to pick the one that makes the most sense for your environment. One implementation that I have found customers wanting is to use Windows <a href=https://en.wikipedia.org/wiki/Active_Directory>Active Directory</a> with PostgreSQL's <a href=https://www.postgresql.org/docs/current/gssapi-auth.html>GSSAPI</a> authentication interface using <a href=https://en.wikipedia.org/wiki/Kerberos_(protocol)>Kerberos</a>. I've put together this guide to help you take advantage of this setup in your own environment.<h2 id=setting-up-windows-active-directory><a href=#setting-up-windows-active-directory>Setting up Windows Active Directory</a></h2><p>The first step on the Windows Active Directory side is to create a regular user account. The password can be anything but shouldn't expire, and the account needs to be unique in the environment. In this instance, we'll use pg1postgres.<p>Once the user account exists, we have to create a mapping between that user account and the service principal and create a keytab file. These steps can be combined using the Windows ktpass command, like so:<pre><code class=language-shell>ktpass /out pg1.keytab /princ postgres/pg1.domain.local@DOMAIN.LOCAL /mapuser pg1postgres /crypto AES256-SHA1 +rndpass /target DOMAIN.LOCAL -ptype KRB5_NT_PRINCIPAL
</code></pre><p>This should create a pg1.keytab file which has to then be copied to the PostgreSQL server on Ubuntu.<p>Lastly, in the Windows system, go into the User account, under Properties for the pg1postgres user, on the 'Account' tab, be sure to check the box that says "This account supports Kerberos AES 256 bit encryption."<h2 id=setting-up-postgresql-on-ubuntu><a href=#setting-up-postgresql-on-ubuntu>Setting up PostgreSQL on Ubuntu</a></h2><p>On the Ubuntu PostgreSQL server, move the pg1.keytab file into /etc/postgresql/, change the ownership to be postgres:postgres and the file mode to be 600.<p>On both the client and servers, the krb5-user package should be installed. In an Active Directory environment, that's likely all that will be required since the rest of the information is available in DNS.<p>In postgresql.conf, configure krb_server_keyfile to point to the keytab file, like so:<pre><code class=language-ini>krb_server_keyfile = '/etc/postgresql/pg1.keytab'
</code></pre><p>In pg_hba.conf, configure the appropriate rows to use the <a href=https://www.postgresql.org/docs/11/gssapi-auth.html>gss authentication mechanism</a>, like so:<pre><code class=language-text>host all all 0.0.0.0/0 gss
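# (A sketch, not part of the original setup; the map name "ad" is hypothetical.)
# To let Kerberos principals log in under short role names instead, attach a
# user-name map to the gss line:
#   host all all 0.0.0.0/0 gss map=ad
# and define the map in pg_ident.conf with a regex capture:
#   ad   /^(.*)@DOMAIN\.LOCAL$   \1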
</code></pre><p>Once these steps are done, PostgreSQL is ready to accept Kerberos (aka GSSAPI) based authentication from clients.<h2 id=creating-kerberos-users-in-postgresql><a href=#creating-kerberos-users-in-postgresql>Creating Kerberos users in PostgreSQL</a></h2><p>When Kerberos / GSSAPI authentication is used, the "authentication system" user authenticated to PostgreSQL will be user@DOMAIN. In our example, this will be <code>sfrost@DOMAIN.LOCAL</code>. In order for a user to authenticate with Kerberos and log in, that user needs to exist in PostgreSQL, or a mapping needs to exist to map to a user in PostgreSQL. For instance, here is what things look like without a mapping:<p>As a user who can create roles, run:<pre><code class=language-pgsql>postgres=# create user "sfrost@DOMAIN.LOCAL"; CREATE ROLE
</code></pre><p>Then, to log in using Kerberos as that user, run psql like so:<p>(if you do not have a ticket already, run: <code>kinit sfrost@DOMAIN.LOCAL</code>)<pre><code class=language-shell>psql -U sfrost@DOMAIN.LOCAL -h pg1.domain.local -d postgres
</code></pre><p>Note that in Kerberos, a user is always logging into a server and we have to specify what that server is- in this case "-h pg1.domain.local" is telling psql that we want to log into the pg1.domain.local server, even though that's actually the local system. Further, psql, by default, will try to log into PostgreSQL using the current Unix username, which is "sfrost" in this case, but there is no "sfrost" PostgreSQL user, so we have to use <code>-U sfrost@DOMAIN.LOCAL</code> to tell psql to use that username to log in. Alternatively, the user created in PostgreSQL could be "sfrost" and a mapping created to allow the Kerberos user <code>sfrost@DOMAIN.LOCAL</code> to log in as that user. See the PostgreSQL documentation of pg_ident.conf for details.<h2 id=requirements-for-kerberos-authentication><a href=#requirements-for-kerberos-authentication>Requirements for Kerberos Authentication</a></h2><p>There's a number of things which Kerberos depends on for proper authentication:<ul><li>Reverse DNS must be set up and returning the correct result. 
This is what Kerberos uses to find the service in Active Directory.<li>The clocks on all of the systems need to be reasonably close to each other (within about 5 minutes)<li>The reverse DNS result for the IP that the server is answering on needs to match the service principal used in the ktpass command.<li>If running psql on Windows, it may be necessary to deal with case differences- specifically, the service principal might have to be specified to psql in the connection string, or created in active directory as <code>POSTGRES/pg1.domain.local@DOMAIN.LOCAL</code> instead (though psql on unix systems would then have to use the connection string option).<li>There are additional complications when it comes to integration with Web servers, such as when running pgAdmin4 as an independent web server, because the web server needs to be configured to authenticate with its own service account using SPNEGO, and that service account needs to be configured in Active Directory to allow delegation, and the web client needs to be able to authenticate and delegate authentication to the web server to allow it to log into PostgreSQL. (This would be good to have a dedicated article on).<li>The PostgreSQL server needs to live in the Active Directory domain and in its DNS, or in a domain with a cross-realm trust with the AD server. For a larger environment, with many PostgreSQL servers, it may make sense to have a Unix-based KDC, such as the MIT KDC, and then have a cross-realm trust between the Active Directory environment and the Unix/PostgreSQL environment.</ul> ]]></content:encoded>
<category><![CDATA[ Production Postgres ]]></category>
<author><![CDATA[ Stephen.Frost@crunchydata.com (Stephen Frost) ]]></author>
<dc:creator><![CDATA[ Stephen Frost ]]></dc:creator>
<guid isPermaLink="false">https://blog.crunchydata.com/blog/windows-active-directory-postgresql-gssapi-kerberos-authentication</guid>
<pubDate>Mon, 04 Mar 2019 04:00:00 EST</pubDate>
<dc:date>2019-03-04T09:00:00.000Z</dc:date>
<atom:updated>2019-03-04T09:00:00.000Z</atom:updated></item>
<item><title><![CDATA[ A Committer's Preview of PGConf.EU 2016 - Part 3 ]]></title>
<link>https://www.crunchydata.com/blog/a-committers-preview-of-pgconf.eu-2016-part-3</link>
<description><![CDATA[ Part 3, Stephen Frost, major committer to PostgreSQL, wraps up the PGConf.EU 2016 conference. ]]></description>
<content:encoded><![CDATA[ <p>Today, I am wrapping up my preview of next week's <a href=https://2016.pgconf.eu/>PGConf.EU conference</a>. I'm really excited about all of the excellent topics and speakers that we get to choose from! Once again, here's the full <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/>Schedule</a>.<p>On Friday morning, I have to recommend <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/365-arthur-zakirov/><em>Arthur Zakirov and</em></a> <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/164-oleg-bartunov/><em>Oleg Bartunov</em></a>’s talk on <strong><a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1362-better-full-text-search-in-postgresql/>Better Full Text Search in PostgreSQL</a></strong>; even if you don’t use any Full Text Search today, they will be discussing the new RUM indexing capability that is currently being worked on.  If that isn’t your thing, then definitely check out <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/352-jan-wieck/><em>Jan Wieck</em></a>’s talk on <strong><a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1324-peeking-into-the-black-hole-called-plpgsql-the-new-pl-profiler/>Peeking into the black hole called PL/pgSQL - The new PL profiler</a></strong>.<p>After the coffee break is <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/351-marcin-cieslak/><em>Marcin Cieślak</em></a>’s talk, which pits <strong><a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1323-battleground-keeping-postgresql-port-alive-in-a-mysql-dominated-environment-mediawiki/>PostgreSQL against MySQL for the MediaWiki project</a></strong> and looks to be an extremely interesting talk, though I must admit that <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/185-vik-fearing/><em>Vik</em></a>’s example-driven examination of how useful <a 
href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1406-how-did-we-live-without-lateral/><strong>LATERAL</strong></a> is looks like a great talk for anyone who is not familiar with LATERAL.  Next, Crunchy Data's own <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/274-david-steele/><em>David Steele</em></a> is back and talking about <strong><a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1369-audit-logging-for-postgresql/>Audit Logging for PostgreSQL</a></strong>, a subject which is near and dear to my heart and one that I am hopeful we will make progress in with PG10 and beyond.<p>Last, but certainly not least, is <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/318-stefanie-janine-stolting/><em>Stefanie Janine Stölting</em></a>’s talk titled <strong><a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1321-one-database-to-rule-em-all/>One Database to Rule 'em All</a></strong>, which I have to get behind because of the reference to the One Ring and Tolkien’s masterpiece, though also because PostgreSQL’s ability to be the central repository of huge amounts of federated data is just plain amazing, and if you aren’t familiar with Foreign Data Wrappers, you need to get familiar with them.<p>The final set of talks on Friday also all look very interesting and I plan to attend them, prior to heading out to the bar!<p>Thanks! ]]></content:encoded>
<author><![CDATA[ Stephen.Frost@crunchydata.com (Stephen Frost) ]]></author>
<dc:creator><![CDATA[ Stephen Frost ]]></dc:creator>
<guid isPermaLink="false">https://blog.crunchydata.com/blog/a-committers-preview-of-pgconf.eu-2016-part-3</guid>
<pubDate>Thu, 27 Oct 2016 05:00:00 EDT</pubDate>
<dc:date>2016-10-27T09:00:00.000Z</dc:date>
<atom:updated>2016-10-27T09:00:00.000Z</atom:updated></item>
<item><title><![CDATA[ A Committer's Preview of PGConf.EU 2016 - Part 2 ]]></title>
<link>https://www.crunchydata.com/blog/a-committers-preview-of-pgconf.eu-2016-part-2</link>
<description><![CDATA[ Part 2, Stephen Frost, major committer to PostgreSQL, presents a preview of talks coming up at PGConf.EU 2016 where you can learn the latest PostgreSQL Support Tricks. ]]></description>
<content:encoded><![CDATA[ <p>Today, I will continue with my preview of the exciting talks at the upcoming <a href=https://2016.pgconf.eu/>PGConf.EU</a> conference. In <a href=/blog/a-committers-preview-of-pgconf.eu-2016-part-1>Part 1</a>, I discussed the talks that will happen on Wednesday. Today, I want to dive into the Thursday sessions.<p>Starting early on Thursday morning: if you haven’t tracked all the fantastic progress we’ve made with PostgreSQL 9.6, then definitely go to <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/1-magnus-hagander/><em>Magnus Hagander</em></a>’s talk on <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1259-whats-new-in-postgresql-96/><strong>What's New in PostgreSQL 9.6</strong></a>.  If all of that is old news to you and you’re looking at PG10 with the other developers, then the talk to be at is <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/279-anastasia-lubennikova/><em>Anastasia Lubennikova and</em></a> <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/363-konstantin/><em>Konstantin</em></a>’s talk on <strong><a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1356-page-level-compression-and-encryption-in-postgres/>Page Level Compression and Encryption in Postgres</a></strong>, which is an absolutely amazing and fantastic direction for PostgreSQL to be going in, and I’m quite excited about it.<p>Following that, be sure to come to my talk on <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1398-understanding-postgresql-query-plans-aka-explain/><strong>Understanding PostgreSQL Query Plans</strong></a>. 
;)<p>After the morning coffee break <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/227-giuseppe-broccolo/><em>Giuseppe Broccolo</em></a> and <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/132-julien-rouhaud/><em>Julien Rouhaud</em></a> will be talking about <strong><a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1358-extend-brin-support-to-postgis-block-range-indexing-on-geospatial-data/>Extend BRIN Support to PostGIS: Block Range INdexing on Geospatial Data</a></strong>, which looks like a fantastic capability and an excellent way to have very small but extremely fast indexes on geospatial data.  This is high on my list of talks to check out as we discover more and more cases where we must answer queries based on both time ranges and spatial areas at the same time very quickly.  Following that is a very interesting talk by <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/368-dennis-butterstein/><em>Dennis Butterstein</em></a> known as <strong><a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1390-firing-the-interpreter-a-case-study-of-llvm-based-expression-compilation-just-in-time/>Firing the Interpreter. A Case Study of LLVM-based Expression Compilation - Just in Time</a></strong>, which looks like a very interesting way to compile your queries down to a lower level for execution instead of the current approach which works with Executor nodes to implement the query.  
It's another deep internals talk, but one which shows a great deal of promise for almost-free order-of-magnitude performance improvements in PostgreSQL.<p>After lunch, I’ll have to go to <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/17-simon-riggs/><em>Simon Riggs</em></a>’ <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1394-hot-other-update-optimizations/><strong><code>HOT</code> &amp; Other <code>UPDATE</code> Optimizations</strong></a>, as the mailing lists have been abuzz about WARM and true secondary indexes, and this looks like a very interesting opportunity to extend PostgreSQL in directions we have not reached out to yet.  If internals are not your thing, then both of the other talks look great- <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1319-securing-postgresql/><strong>Securing PostgreSQL</strong></a> is something every DBA needs to know how to do, and if you feel comfortable there, then a nice break would be to check out the <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1350-a-billion-rows-pet-project-on-a-desktop-hardware/><strong>Billion Rows Pet Project</strong></a>.<p>Lastly on Thursday, I would have to recommend <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/57-cedric-villemain/><em>Cédric Villemain</em></a>’s talk on <strong><a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1327-transactions-across-multiple-datastores/>Transactions Across Multiple Datastores</a></strong>.  
Perhaps you don’t need to deal with that today (though I find that unlikely), but you’ll definitely have to deal with that in the future and it is extremely important to understand the techniques and best practices for scaling across multiple datastores.<p>Also, don’t miss the <strong><a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1418-lightning-talks/>Lightning Talks</a></strong> - <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/78-harald-armin-massa/><em>Harald</em></a> does a fantastic job with them and it is a ton of fun.<p>Check back tomorrow for my Part 3 of the PGConf.EU preview in which I discuss Friday's sessions! ]]></content:encoded>
<author><![CDATA[ Stephen.Frost@crunchydata.com (Stephen Frost) ]]></author>
<dc:creator><![CDATA[ Stephen Frost ]]></dc:creator>
<guid isPermalink="false">https://blog.crunchydata.com/blog/a-committers-preview-of-pgconf.eu-2016-part-2</guid>
<pubDate>Wed, 26 Oct 2016 05:00:00 EDT</pubDate>
<dc:date>2016-10-26T09:00:00.000Z</dc:date>
<atom:updated>2016-10-26T09:00:00.000Z</atom:updated></item>
<item><title><![CDATA[ A Committer's Preview of PGConf.EU 2016 - Part 1 ]]></title>
<link>https://www.crunchydata.com/blog/a-committers-preview-of-pgconf.eu-2016-part-1</link>
<description><![CDATA[ Part 1, Stephen Frost, major committer to PostgreSQL, gives an overview of upcoming PGConf.EU 2016 ]]></description>
<content:encoded><![CDATA[ <p>Only one week left until PGConf.EU in Tallinn, Estonia!<p>Next week will be PGConf.EU’s 8th conference. The event has traveled to many different parts of Europe, lately on a north-easterly trajectory: Madrid two years ago, Vienna last year, and now Tallinn, Estonia, with <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/>another fantastic line-up of talks</a>.<p>Here are the talks which I am most interested in this year.  A warning for the unwary: I’m a PostgreSQL Committer, so I tend to be quite developer-heavy when it comes to my talk selections.<p>First off, if you are interested in training, and can manage to find a slot, <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/39-hans-jurgen-schonig/><em>Hans-Jürgen Schönig</em></a>’s <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1291-detecting-performance-problems-and-fixing-them/><strong>Detecting Performance Problems and Fixing Them</strong></a> looks to be a fantastic all-day training class.<p>On to the regular talks. In the first slot on Wednesday are three great talks, my first pick going to <em><a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/274-david-steele/>David Steele</a></em> (full disclosure: he also works for Crunchy Data) for his talk on <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1370-reviewing-postgresql-patches-for-fun-and-profit/><strong>Reviewing PostgreSQL Patches for Fun and Profit</strong></a>.  As Commitfest Manager for PostgreSQL 9.6, David ran what is generally accepted as the toughest Commitfest of any PostgreSQL release- the last one.  If you are interested in helping to get new features into PostgreSQL, this is the talk to be at.  
A close second, and more in line with your typical DBA’s role, is <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/77-christophe-pettus/><em>Christophe Pettus</em></a>’s talk <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1320-unclogging-the-vacuum/><strong>Unclogging the VACUUM</strong></a>.  Almost any PostgreSQL DBA will agree that understanding VACUUM is key to getting the best out of PostgreSQL (and, perhaps more importantly, to avoiding the worst).<p>In the next slot is <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/296-stella-nisenbaum/><em>Stella Nisenbaum</em></a>’s <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1313-becoming-a-sql-guru/><strong>Becoming a SQL Guru</strong></a>, which I strongly encourage anyone not already deeply familiar with <dfn>Common Table Expressions</dfn> (<abbr>CTEs</abbr>), lateral joins, window functions, or other advanced SQL to attend.  SQL is the language you use to talk to the database- learn to speak it at a college level instead of an early-secondary-school level.  On the flip side is <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/210-petr-jelinek/><em>Petr Jelinek</em></a>’s fantastic talk on <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1376-pglogical-the-logical-replication-for-postgresql/><strong>PGLogical</strong></a>, where he covers the latest and greatest in logical replication, an excellent alternative to binary replication for many use-cases.<p>Following lunch on Wednesday is <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/64-alexander-korotkov/><em>Alexander Korotkov</em></a>’s talk on <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1345-the-future-is-csn/><strong>The Future is CSN</strong></a>, a vision I tend to agree with and a talk I hope to attend, though I caution that it is a very developer- and internals-oriented talk. 
 If you are not familiar with transaction IDs, snapshots, XMINs, XMAXs, the ProcArrayLock, and other internals, this might not be the talk for you.  If you aren’t up for the hacking talk, then perhaps check out <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/34-gianni-ciolli/>Gianni</a>’s <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1373-advanced-sql-with-postgresql/><strong>Advanced SQL</strong></a> talk, or take a break and listen to <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/334-federico-campoli/>Federico</a> talk about <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1273-life-on-a-rollercoaster-backup-and-recovery-with-large-databases/><strong>Large-scale Backup Challenges</strong></a>.<p>In the next slot, my pick is <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/284-andreas-seltenreich/><em>Andreas Seltenreich</em></a>’s talk on <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1364-hunting-postgresql-bugs-with-sqlsmith/><strong>Hunting PostgreSQL Bugs with SQLSmith</strong></a>.  I’m always interested in finding new ways to test PostgreSQL and make sure that we are producing correct results.  
For the last two slots of the day, I’m looking at <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/5-jehan-guillaume-ioguix-de-rorthais/><em>Jehan-Guillaume (ioguix) de Rorthais</em></a>’s talk on <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1363-paf-auto-failover-and-more/><strong>PAF: Auto Failover and More</strong></a>, followed by <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/369-micha-gutkowski/><em>Michał Gutkowski</em></a> and <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/374-rafal-hawrylak/><em>Rafal Hawrylak</em></a>’s talk on <strong>PostgreSQL in TomTom</strong>, though <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/307-emre-hasegeli/><em>Emre Hasegeli</em></a>’s talk on <strong><a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1274-managing-thousands-of-database-servers/>Managing thousands of Database Servers</a></strong> also looks fantastic, and if you haven’t followed all the developments on <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/session/1360-parallel-query-in-postgresql/><strong>Parallel Query</strong></a>, definitely check out <a href=https://www.postgresql.eu/events/schedule/pgconfeu2016/speaker/159-amit-kapila/><em>Amit Kapila</em></a>’s talk on it during that slot.<p>That pretty much wraps up Wednesday for us. Check back here for my next post, which will give you a preview of my favorite talks on Thursday! ]]></content:encoded>
<author><![CDATA[ Stephen.Frost@crunchydata.com (Stephen Frost) ]]></author>
<dc:creator><![CDATA[ Stephen Frost ]]></dc:creator>
<guid isPermalink="false">https://blog.crunchydata.com/blog/a-committers-preview-of-pgconf.eu-2016-part-1</guid>
<pubDate>Tue, 25 Oct 2016 05:00:00 EDT</pubDate>
<dc:date>2016-10-25T09:00:00.000Z</dc:date>
<atom:updated>2016-10-25T09:00:00.000Z</atom:updated></item></channel></rss>