<?xml version="1.0" encoding="UTF-8" ?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" version="2.0"><channel><title>Jesse Soyland | CrunchyData Blog</title>
<atom:link href="https://www.crunchydata.com/blog/author/jesse-soyland/rss.xml" rel="self" type="application/rss+xml" />
<link>https://www.crunchydata.com/blog/author/jesse-soyland</link>
<image><url>https://www.crunchydata.com/build/_assets/jesse-soyland.png-PMFLXXQO.webp</url>
<title>Jesse Soyland | CrunchyData Blog</title>
<link>https://www.crunchydata.com/blog/author/jesse-soyland</link>
<width>713</width>
<height>717</height></image>
<description>PostgreSQL experts from Crunchy Data share advice, performance tips, and guides on successfully running PostgreSQL and Kubernetes solutions</description>
<language>en-us</language>
<pubDate>Fri, 03 Oct 2025 11:00:00 EDT</pubDate>
<dc:date>2025-10-03T15:00:00.000Z</dc:date>
<dc:language>en-us</dc:language>
<sy:updatePeriod>hourly</sy:updatePeriod>
<sy:updateFrequency>1</sy:updateFrequency>
<item><title><![CDATA[ Postgres Migrations Using Logical Replication ]]></title>
<link>https://www.crunchydata.com/blog/postgres-migrations-using-logical-replication</link>
<description><![CDATA[ Instructions and tips for using logical replication to migrate Postgres to a new platform or host. ]]></description>
<content:encoded><![CDATA[ <p>Moving a Postgres database isn’t a small task. Typically for Postgres users this is one of the biggest projects you’ll undertake. If you’re migrating for a new Postgres major version or moving to an entirely new platform or host, you have a couple of options:<ul><li><p><strong><em>Using pg_dump and pg_restore</em></strong>: pg_dump is a very reliable way to collect an entire database and restore it to a new place. This includes the entire schema, all tables, and special database elements. If you’re migrating a small database, say 50 to 150GB, this is probably the easiest way to do it. On modern hardware a dump and restore using this method can be done in less than an hour.<li><p><strong><em>Using WAL</em></strong>: For folks that have a WAL-based backup system like pgBackRest or WAL-G/E, you can do a major Postgres migration by running a full base backup and streaming that WAL to your new host. Once you’re ready to do a cutover to the new database, you have an exact copy already standing by. This is a great way for larger databases, in the terabyte range, to do a major migration with minimal downtime.</ul><p>But what if your database is too big for a dump and restore, you can’t take the downtime, and you don’t have access to the WAL (i.e. you're on a host like RDS that doesn't share it)? There’s a third option:<ul><li><strong><em>Logical migration</em></strong>: Using Postgres logical replication you can set up a database copy at a new location. While WAL has everything, logical replication just captures data and doesn't migrate schema, indexes, sequences, and a couple of other fiddly bits.
But with a few tricks in this post, you can capture everything for a full migration using logical replication.</ul><p>The architecture of logical replication is straightforward, see our intro post on <a href=https://www.crunchydata.com/blog/data-to-go-postgres-logical-replication>Data To Go: Postgres Logical Replication</a> if you’re brand new to the topic. Your existing database will be the <code>publisher</code>, and the receiving database will be the <code>subscriber</code>. In the initial load, all data is copied from the publisher to the subscriber. Following the initial data copy, any transactions made on the publisher side are sent to the subscriber.<h2 id=step-1-migrate-schema><a href=#step-1-migrate-schema>Step 1: Migrate schema</a></h2><p>Logical replication only replicates data changes (<code>INSERT</code>, <code>UPDATE</code>, <code>DELETE</code>), so you must ensure that the target database has the correct schema beforehand. To get a schema-only dump of your source and apply to your database, run something like:<pre><code>pg_dump -Fc -s $SOURCE_DB_URI | pg_restore --no-acl --no-owner -d $TARGET_DB_URI
</code></pre><p>If your migration process is proceeding while application development continues, you must make sure to update the receiving database's schema as you make any schema changes on your source database.<h2 id=step-2-publisher-current-host-set-up><a href=#step-2-publisher-current-host-set-up>Step 2: Publisher (current host) set up</a></h2><p>Logical replication is enabled via the <code>wal_level</code> setting. Some managed Postgres services may have a different way to turn this on.<pre><code>wal_level = logical
</code></pre><h4 id=slot-configuration><a href=#slot-configuration>Slot configuration</a></h4><p>Review the replication slot settings to make sure there are sufficient resources. For very large replication projects, the defaults may need to be changed.<ul><li><code>max_replication_slots</code><li><code>max_wal_senders</code><li><code>max_logical_replication_workers</code><li><code>max_worker_processes</code><li><code>max_sync_workers_per_subscription</code></ul><p>For details on how these parameters should be set, see the PostgreSQL chapter on <a href=https://www.postgresql.org/docs/current/logical-replication-config.html>logical replication configuration settings</a>.<h4 id=networking><a href=#networking>Networking</a></h4><p>Make sure that the network/firewall for your old database permits connections from your new database.<h4 id=replication-user-for-the-new-subscriber><a href=#replication-user-for-the-new-subscriber>Replication user for the new subscriber</a></h4><p>You can create a specific user for this purpose that has the <code>REPLICATION</code> role attribute. Also make sure that the new role has read access to the tables being replicated.<pre><code class=language-sql>CREATE ROLE elizabeth WITH REPLICATION LOGIN PASSWORD 'my_password';
GRANT SELECT ON ALL TABLES IN SCHEMA public TO elizabeth;
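
-- Optional, and an assumption about your setup: if new tables might be
-- created on the source before cutover, also grant read access on future
-- tables in the public schema automatically:
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO elizabeth;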
</code></pre><h4 id=find-tables-without-primary-keys-or-unique-indexes><a href=#find-tables-without-primary-keys-or-unique-indexes>Find tables without primary keys or UNIQUE indexes</a></h4><p>For logical replication, Postgres needs a way to uniquely identify rows to be updated/deleted. For tables with primary keys, that key is used, so first identify any tables that lack primary keys:<pre><code class=language-sql>select tab.table_schema,
       tab.table_name
from information_schema.tables tab
left join information_schema.table_constraints tco
          on tab.table_schema = tco.table_schema
          and tab.table_name = tco.table_name
          and tco.constraint_type = 'PRIMARY KEY'
where tab.table_type = 'BASE TABLE'
      and tab.table_schema not in ('pg_catalog', 'information_schema')
      and tco.constraint_name is null
order by table_schema,
         table_name;
</code></pre><p>For tables without primary keys, any <code>UNIQUE</code> index can be used:<pre><code class=language-sql>ALTER TABLE tablename REPLICA IDENTITY USING INDEX idx_some_unique_index;
</code></pre><p>If there are no existing <code>UNIQUE</code> indexes, one can be created, or the table can be set with <code>REPLICA IDENTITY FULL</code> - in which case it treats each row as its own "key":<pre><code class=language-sql>ALTER TABLE tablename REPLICA IDENTITY FULL;
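
-- Note: REPLICA IDENTITY FULL ships the entire old row with every UPDATE and
-- DELETE, so expect extra replication overhead on large, busy tables. You can
-- verify a table's setting via pg_class.relreplident (d=default, i=index,
-- f=full, n=nothing):
SELECT relname, relreplident FROM pg_class WHERE relname = 'tablename';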
</code></pre><p>Next create a publication, which is a grouping of tables you intend to replicate. In most cases you will create a publication FOR ALL TABLES:<pre><code class=language-sql>CREATE PUBLICATION bridge_migration FOR ALL TABLES;
</code></pre><p>Check that your tables are ready for publication; all of the tables should be listed here:<pre><code class=language-sql>SELECT * FROM pg_publication_tables;
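
-- As a rough sanity check (assuming you are replicating every schema), the
-- row count here should match the number of ordinary tables in the database:
SELECT count(*) FROM pg_publication_tables WHERE pubname = 'bridge_migration';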
</code></pre><h2 id=step-3-subscriber-new-host-settings><a href=#step-3-subscriber-new-host-settings>Step 3: Subscriber (new host) settings</a></h2><p>On the new host side, create a subscription to each publication to begin receiving the published data. Using the connection details for your old host and the login you created in Step 2, set up a subscription to that replicated data:<pre><code class=language-sql>CREATE SUBSCRIPTION bridge_migration CONNECTION 'host={host} port=5432 dbname={database} user={login} password={password}' PUBLICATION bridge_migration;
</code></pre><p>Creating the subscription in this way will create a replication slot on the publisher and begin copying data from tables specified in the publication. A separate temporary slot will be created for each table for the duration of its initial data synchronization copy.<p>You can limit how many tables are synchronized at once with the <code>max_sync_workers_per_subscription</code> setting.<h2 id=step-4-monitor-the-initial-copy><a href=#step-4-monitor-the-initial-copy>Step 4: Monitor the initial copy</a></h2><p>You likely want to monitor this initial copy. The <code>pg_stat_subscription</code> table will show data on the subscriber end of the transaction:<pre><code>select * from pg_stat_subscription;

-[ RECORD 1 ]---------+------------------------------
subid                 | 27183
subname               | bridge_migration
worker_type           | table synchronization
pid                   | 1197139
leader_pid            |
relid                 | 26721
received_lsn          |
last_msg_send_time    | 2025-09-26 15:54:45.095215+00
last_msg_receipt_time | 2025-09-26 15:54:45.095215+00
latest_end_lsn        |
latest_end_time       | 2025-09-26 15:54:45.095215+00
-[ RECORD 2 ]---------+------------------------------
subid                 | 27183
subname               | bridge_migration
worker_type           | apply
pid                   | 47075
leader_pid            |
relid                 |
received_lsn          | 4E32/7092F6F8
last_msg_send_time    | 2025-09-26 15:55:11.020012+00
last_msg_receipt_time | 2025-09-26 15:55:11.021989+00
latest_end_lsn        | 4E32/7092F3E0
latest_end_time       | 2025-09-26 15:55:10.843251+00
</code></pre><p>You can also look at the pg_subscription_rel view to see the synchronization state of each table with <code>select * from pg_subscription_rel;</code>.<p>Here, the <code>state_code</code> can tell you about each object:<ul><li>i - initialize<li>d - data is being copied<li>f - finished table copy<li>s - synchronized<li>r - ready (normal replication)</ul><p>Because of table bloat and other factors with internal table statistics, you won't be able to compare table sizes between the hosts. You can, however, run <code>SELECT count(*)</code> on both sides to compare row counts.<h2 id=step-5-testing-and-cutover><a href=#step-5-testing-and-cutover>Step 5: Testing and cutover</a></h2><p>Now you can begin testing your application against the new database. Once you have confirmed that all the data is present, you can do a migration cutover. This will require stopping transactions on the original host, fixing your sequences, and pointing your application to the new database.<h2 id=step-6-fix-sequences><a href=#step-6-fix-sequences>Step 6: Fix sequences</a></h2><p>While logical replication will copy over all the data from the source, it doesn't update any of the sequences. For this reason, we recommend that you update your sequences post-cutover, before you begin production operations. The best approach to fixing your sequences is to simply generate setval commands for all sequences in your source database(s), which you can do with this query:<pre><code class=language-sql>SELECT
    'SELECT setval(' || quote_literal(quote_ident(n.nspname) || '.' || quote_ident(c.relname)) || ', ' || s.last_value || ');'
FROM
    pg_class c
    JOIN pg_namespace n ON n.oid = c.relnamespace
    JOIN pg_sequences s ON s.schemaname = n.nspname
        AND s.sequencename = c.relname
WHERE
    c.relkind = 'S';
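
-- Tip: in psql, replacing the final ';' with the \gexec meta-command runs
-- the generated setval statements immediately instead of saving them first.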
</code></pre><p>The resulting statements can be executed on the new host to synchronize all sequences.<h2 id=final-thoughts><a href=#final-thoughts>Final thoughts</a></h2><p>Logical replication is a safe and effective migration strategy. Data consistency for replicated tables is ensured as long as the subscriber's schema is identical and replication is one-way with no conflicting writes on the subscriber.<p>We help folks with migrations to <a href=https://www.crunchydata.com/products/crunchy-bridge>Crunchy Bridge</a> every day. With Postgres you have a lot of choices for no-downtime or low-downtime platform changes. Contact us to find out more about the right plan for your project. ]]></content:encoded>
<author><![CDATA[ Jesse.Soyland@crunchydata.com (Jesse Soyland) ]]></author>
<dc:creator><![CDATA[ Jesse Soyland ]]></dc:creator>
<guid isPermaLink="false">c484718ed2d8926bd5974d9d3a4ccdcf1ed84865b374e476b0971a6a8f813618</guid>
<pubDate>Fri, 03 Oct 2025 11:00:00 EDT</pubDate>
<dc:date>2025-10-03T15:00:00.000Z</dc:date>
<atom:updated>2025-10-03T15:00:00.000Z</atom:updated></item>
<item><title><![CDATA[ Postgres Troubleshooting - DiskFull ERROR could not resize shared memory segment ]]></title>
<link>https://www.crunchydata.com/blog/postgres-troubleshooting-diskfull-error-could-not-resize-shared-memory-segment</link>
<description><![CDATA[ Jesse has some tips if you see the dreaded full disk memory segment error. He goes through the most likely causes and fixes. ]]></description>
<content:encoded><![CDATA[ <p>There are a couple of super common Postgres errors you’re likely to encounter while using this database, especially with an application or ORM. One is the <strong>PG::DiskFull: ERROR:</strong> <strong>could not resize shared memory segment.</strong> It will look something like this:<pre><code class=language-sql>"PG::DiskFull: ERROR: could not resize shared memory segment "/PostgreSQL.938232807" to 55334241 bytes: No space left on device"
</code></pre><h3 id=dont-panic><a href=#dont-panic>Don’t panic</a></h3><p>We see a good number of support tickets from customers on this topic. If you see this error pass by in your logs, don’t worry. Seriously. There’s no immediate reason to panic from a single one of these errors.<p>If you’re seeing them regularly or all the time, or you’re curious about how these errors are generated, let’s continue through some troubleshooting.<h2 id=you-arent-really-out-of-disk><a href=#you-arent-really-out-of-disk>You aren’t really out of disk</a></h2><p>When the error states "no space left on device," it's not talking about the entire disk, but rather the shared memory device at that exact moment. Segments are created there when a backend allocates shared memory for things like hashes, sorts, etc. <a href=https://www.crunchydata.com/blog/parallel-queries-in-postgres>Parallel workers</a> will also allocate shared memory. When there is not sufficient shared memory remaining, the statement terminates with this sort of error.<p>The ‘disk full’ part of this error message is a bit of a red herring. This is an error that you'll see when your Postgres instance fails to allocate more memory in support of a query. It is not a real disk-full message. Sometimes modest memory-consuming queries that execute very slowly end up tipping you past the available memory. Other times a single memory-intensive query grabs a huge chunk of memory all at once and causes the issue.<p>Why don’t these spill out to temp, like normal large queries do? You probably just went over the total memory allocation. <code>work_mem</code> is allocated for each query node that needs it, rather than once per query or session, meaning that a session can potentially consume many multiples of <code>work_mem</code>. For example, if <code>max_parallel_workers</code> is 8 and <code>work_mem</code> is 384MB, it's possible to use up to 3,072MB of memory even with a single parallel hash join.
If your query plan has 5 query nodes that each allocate <code>work_mem</code> (i.e. sorts / hash operations) and four parallel workers, you could be using (384MB x 5 query nodes x 4 workers) = 7,680MB, or roughly 7.7GB, of memory. If you only have 7.5GB available, that’s not going to work.<h2 id=to-the-logs-we-go><a href=#to-the-logs-we-go>To the logs we go</a></h2><p>To see what’s going on with these errors, let’s get into the logs and see how often we’re seeing them. Search your logs for the resize memory errors:<pre><code>$ grep -iR "could not resize shared memory" * | sed 's/.log.*//' | uniq -c
  1597 postgresql-Fri
   587 postgresql-Mon
   325 postgresql-Sat
  1223 postgresql-Sun
  1395 postgresql-Thu
</code></pre><p>You can also look for the specific process ID mentioned in the OOM error. For this one, it’s <code>5883275</code>.<pre><code class=language-sql>Aug 08 16:34:31 4qd4kp2ot5bwlmdnp7566v4owy postgres[5883275]: [36-1] [5883275][client backend][17/20137143][0] [user=application,db=postgres,app=/rails/bin/rails] ERROR:  could not resize shared memory segment "/PostgreSQL.2449246800" to 33554432 bytes: No space left on device
</code></pre><p>To track the error back to its origin, search your logs for that process ID. You might see very long queries broken up into smaller sequence numbers, like 42-1, 42-2, and 42-3 in this example:<pre><code>Aug 08 16:34:31 4qd4kp2ot5bwlmdnp7566v4owy postgres[5883275]: [42-1] [5883275][client backend][17/20137143][0] [user=application,db=postgres,app=/rails/bin/rails] ERROR:  could not resize shared memory segment "/PostgreSQL.2551246800" to 5883275 bytes: No space left on device

Aug 08 16:34:31 4qd4kp2ot5bwlmdnp7566v4owy postgres[5883275]: [42-2] [5883275][client backend][17/20137143][0] [user=application,db=postgres,app=/rails/bin/rails] STATEMENT: SELECT COUNT(*)
FROM trucks t
JOIN truck_locations tl ON t.truck_id = tl.truck_id
JOIN jobs j ON tl.location_id = j.location_id
JOIN job_hiring_locations_trucks_join jhltj ON j.job_id = jhltj.job_id AND t.truck_id = jhltj.truck_id
JOIN drivers d ON j.driver_id = d.driver_id
JOIN driver_certifications dc ON d.driver_id = dc.driver_id
JOIN certifications c ON dc.certification_id = c.certification_id

Aug 08 16:34:31 4qd4kp2ot5bwlmdnp7566v4owy postgres[5883275]: [42-3] "JOIN maintenance_records mr ON t.truck_id = mr.truck_id
JOIN maintenance_types mt ON mr.maintenance_type_id = mt.maintenance_type_id
JOIN job_status js ON j.status_id = js.status_id
JOIN locations l ON tl.location_id = l.location_id
JOIN job_types jt ON j.job_type_id = jt.job_type_id
JOIN job_priorities jp ON j.priority_id = jp.priority_id
JOIN fuel_records fr ON t.truck_id = fr.truck_id
JOIN fuel_stations fs ON fr.fuel_station_id = fs.fuel_station_id
</code></pre><p>Look for patterns in the logs: start with individual error examples and look at the events right before the OOM errors. Are you seeing the same query? Maybe large sorts, or large <code>JOIN</code> operations? Are you seeing background job processes, e.g. Sidekiq, cron, etc.? Large analytics-type queries? Those could be large or misconfigured.<h2 id=common-fixes-for-could-not-resize-shared-memory-segment><a href=#common-fixes-for-could-not-resize-shared-memory-segment>Common Fixes for <code>Could Not Resize Shared Memory Segment</code></a></h2><h3 id=decrease-reliance-on-hash-tables-and-add-indexes><a href=#decrease-reliance-on-hash-tables-and-add-indexes><strong>Decrease reliance on hash tables and add indexes</strong></a></h3><p>From what I’ve seen in the wild, hash tables seem to be the main culprit for these types of errors, so that’s a good place to start. Hash joins are used for very large joins across tables, and Postgres will create an in-memory hash table to store some of the data. Systems with large amounts of memory or larger <code>work_mem</code> settings can favor hash joins over other join methods like nested loops or merges if the data being joined is small enough to fit into <code>work_mem</code> but large enough that (or indexed so that) a nested loop is inefficient.<p>You can see which strategy the query planner is using by looking at the query’s <code>EXPLAIN</code> plan, e.g.:<pre><code class=language-sql>EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*)
FROM trucks t

Finalize Aggregate  (cost=238.12..238.13 rows=1 width=8) (actual time=5.276..5.276 rows=1 loops=1)
  Buffers: shared hit=29
  ->  Gather  (cost=238.01..238.12 rows=2 width=8) (actual time=5.236..5.272 rows=3 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=29
        ->  Partial Aggregate  (cost=238.01..238.02 rows=1 width=8) (actual time=5.226..5.227 rows=1 loops=3)
              Buffers: shared hit=29
              ->  HashAggregate  (cost=238.00..238.01 rows=1 width=4) (actual time=5.213..5.217 rows=3 loops=3)
                    Group Key: trucks.id
                    Buffers: shared hit=29
                    ->  Hash Join  (cost=37.75..236.75 rows=500 width=4) (actual time=0.605..4.879 rows=70 loops=3)
                          Hash Cond: (truck_locations.job_id = trucks.id)
                          Buffers: shared hit=29
                          ->  Seq Scan on truck_locations  (cost=0.00..18.20 rows=820 width=8) (actual time=0.010..0.054 rows=10 loops=3)
                                Buffers: shared hit=3
                          ->  Hash  (cost=27.25..27.25 rows=820 width=4) (actual time=0.575..0.576 rows=10 loops=3)
                                Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                Buffers: shared hit=26
Planning Time: 0.256 ms
Execution Time: 36.562 ms
</code></pre><p>Since the datasets being joined are fairly large, it may be possible to nudge the planner toward merge joins versus hash joins by adding indexes on the join keys of both of the tables. The join keys themselves are already indexed but since there are additional criteria in the queries for filters and other uses, including those columns in indexes can be beneficial.<p>A good rule of thumb is that if a query has a WHERE filter on column A and joins to another table via column B, a <a href=https://www.crunchydata.com/blog/postgres-indexes-for-newbies#multicolumn-b-tree-indexes>multicolumn index</a> on (A, B) will help by reducing the amount of data being joined.<h3 id=decreasing-work_mem><a href=#decreasing-work_mem>Decreasing work_mem</a></h3><p>It is possible that your work_mem is set too generously and you’re allowing too much memory per worker.<h3 id=decreasing-max_parallel_workers><a href=#decreasing-max_parallel_workers>Decreasing max_parallel_workers</a></h3><p>You may want to peek at the settings you have for <a href=https://www.crunchydata.com/blog/parallel-queries-in-postgres#tuning-postgres-parallel-queries>parallel workers</a>. 
If you have a high work_mem setting, lots of parallel workers, and hash joins, you may be over-allocating resources.<h3 id=dig-into-the-queries><a href=#dig-into-the-queries>Dig into the queries</a></h3><p>In a lot of cases, working through a specific query to make it more performant might be the place to go for fixing your OOM issues.<ul><li>Adding <code>WHERE</code> clauses or <code>LIMIT</code>s to <code>SELECT *</code> queries can be a good starting place.<li>Creating <a href=https://www.crunchydata.com/blog/postgres-subquery-powertools-subqueries-ctes-materialized-views-window-functions-and-lateral#what-is-a-materialized-view>views or materialized views</a> to store table join data could help your database as well.</ul><h3 id=add-more-memory><a href=#add-more-memory>Add more memory</a></h3><p>After you’ve added indexes and done what you can with individual queries, if you continue to see these errors, you might need to add more memory to your machine.<h2 id=quick-summary><a href=#quick-summary>Quick summary</a></h2><ul><li><strong>ERROR: could not resize shared memory segment</strong> is probably just a single query or operation that took up all your memory.<li>If you just have one of these, it’s no big deal. If you have a lot of them, it’s still no big deal. This is Postgres; everything’s fixable. There are some simple things you can do to optimize and add indexes before you upgrade your instance to larger memory.<li>Look in your logs for the queries or processes giving the OOM errors.<li>Your #1 place to look is for queries joining large tables where the data being processed fits within <code>work_mem</code>. Adding indexes to strategically limit the amount of data being processed can help.</ul> ]]></content:encoded>
<category><![CDATA[ Production Postgres ]]></category>
<author><![CDATA[ Jesse.Soyland@crunchydata.com (Jesse Soyland) ]]></author>
<dc:creator><![CDATA[ Jesse Soyland ]]></dc:creator>
<guid isPermaLink="false">22dedaec5a29ef58d3ddaa05e2d92e8491e5d14559343ebeecb6655d94c1be74</guid>
<pubDate>Fri, 09 Aug 2024 08:00:00 EDT</pubDate>
<dc:date>2024-08-09T12:00:00.000Z</dc:date>
<atom:updated>2024-08-09T12:00:00.000Z</atom:updated></item>
<item><title><![CDATA[ One PID to Lock Them All: Finding the Source of the Lock in Postgres ]]></title>
<link>https://www.crunchydata.com/blog/one-pid-to-lock-them-all-finding-the-source-of-the-lock-in-postgres</link>
<description><![CDATA[ One process can lock your Postgres database, dominating all will, blocking other processes and queries. Jesse shows you how to find that one process that’s ruling them all. Once you’ve grabbed this lock and held it close to your chest, he’ll help you on your quest to cast it into the depths of Mt Doom. ]]></description>
<content:encoded><![CDATA[ <p>On the Customer Success Engineering team at <a href=https://www.crunchydata.com/products/crunchy-bridge>Crunchy Bridge</a>, we run across customers with lock issues on their Postgres database from time to time. Locks can have a cascading effect on queries. If one process is locking a table, then a query can be waiting on the process before it, and the process before that one. Major lock issues can quickly take down an entire production Postgres instance or application.<p>In this post let’s look at why locks happen, and more importantly how to get to the bottom of a lock issue and the one process blocking everything else. That one process that blocks them all! Once you find the source of the lock, I’ll give you the options for terminating the process that created all your troubles in the first place.<h2 id=finding-the-source-of-the-lock><a href=#finding-the-source-of-the-lock>Finding the source of the lock</a></h2><p>Often you won’t immediately know that you have a lock issue. If something is off, queries aren’t returning, or your application is slow, finding statements blocked by locks is a great place to start.<h3 id=1-find-processes-that-are-waiting><a href=#1-find-processes-that-are-waiting>1. Find processes that are waiting</a></h3><p>Take a look at the <code>pg_stat_activity</code> view for processes that are <code>active</code> but have a <code>wait_event</code> or <code>wait_event_type</code> that are non-NULL:<pre><code class=language-sql>SELECT
  pid,
  datname,
  usename,
  application_name,
  client_addr,
  client_port,
  to_char (now (), 'YYYY-MM-DD HH24:MI:SS') as now,
  to_char (now () - xact_start, 'DD HH24:MI:SS MS') as xact_time,
  to_char (now () - query_start, 'DD HH24:MI:SS MS') as query_time,
  state,
  to_char (now () - state_change, 'DD HH24:MI:SS MS') as state_time,
  wait_event,
  wait_event_type,
  left (query, 40)
FROM
  pg_stat_activity
WHERE
  state != 'idle'
  and pid != pg_backend_pid ()
ORDER BY
  query_time desc;
</code></pre><p>If a connection is active and waiting on a lock, then the <code>wait_event</code> and <code>wait_event_type</code> columns will be non-NULL. If that's the case (and it stays that way after a couple of runs of the query to ensure that you didn't just catch a short lock wait), record that affected PID. Here is a very simple example where I ran an update in a transaction, then in a different session added a column to the same table. The <code>ALTER TABLE</code> in this case will not proceed until the transaction from the prior thread has been committed or rolled back. Here are the results - note the PID 295998 that is "active" but has wait_event=relation and wait_event_type=Lock<pre><code class=language-text>  pid   | datname  | usename  | application_name |   client_addr   | client_port |         now         |    xact_time    |   query_time    |        state        |   state_time    | wait_event | wait_event_type |                   left
--------+----------+----------+------------------+-----------------+-------------+---------------------+-----------------+-----------------+---------------------+-----------------+------------+-----------------+------------------------------------------
 295995 | postgres | postgres | psql             | 149.42.105.253 |       49327 | 2023-11-09 20:41:10 | 00 00:02:11 535 | 00 00:02:01 755 | idle in transaction | 00 00:02:01 755 | ClientRead | Client          | RELEASE pg_psql_temporary_savepoint
 295998 | postgres | postgres | psql             | 149.42.105.253 |       49344 | 2023-11-09 20:41:10 | 00 00:01:55 550 | 00 00:01:01 138 | active              | 00 00:01:01 138 | relation   | Lock            | alter table sampledata add column data02
(2 rows)
</code></pre><h3 id=2-find-which-pid-is-locking-the-table><a href=#2-find-which-pid-is-locking-the-table>2. Find which PID is locking the table</a></h3><p>Now we know that the PID (295998) is awaiting a lock on a relation (table), but we don’t know what process currently holds the lock on which it is waiting. To find it, we start by querying <code>pg_locks</code> using the ID of the awaiting process:<pre><code class=language-sql>SELECT
  *
FROM
  pg_locks
WHERE
  pid = 295998
  AND granted IS NOT true;
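
-- Shortcut: pg_blocking_pids() returns the PIDs blocking a given backend,
-- which can save walking pg_locks by hand:
SELECT pg_blocking_pids(295998);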
</code></pre><p>Here’s the result of that query:<pre><code class=language-text>locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |  pid   |        mode         | granted | fastpath |          waitstart
----------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+--------+---------------------+---------+----------+------------------------------
 relation |        5 |    16501 |      |       |            |               |         |       |          | 6/6743             | 295998 | AccessExclusiveLock | f       | f        | 2023-11-09 20:40:08.98843+00
(1 row)
</code></pre><p>The <code>locktype</code> column shows which of the other columns describe what Postgres is waiting on. In this example, <code>locktype</code> is <code>relation</code>, so we look to the <code>relation</code> column to see the OID of the relation (16501) where the blocking process has an active lock.<h3 id=3-find-the-process-with-the-existing-lock><a href=#3-find-the-process-with-the-existing-lock>3. Find the process with the existing lock</a></h3><p>Now that we know which object is locked, we can once again query <code>pg_locks</code> using the relation OID to see what is holding the current lock(s):<pre><code class=language-sql>SELECT
  *
FROM
  pg_locks
WHERE
  relation = 16501
  AND granted IS true;
</code></pre><p>Here is the result:<pre><code class=language-text>locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |  pid   |       mode       | granted | fastpath | waitstart
----------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+--------+------------------+---------+----------+-----------
 relation |        5 |    16501 |      |       |            |               |         |       |          | 3/243227           | 295995 | RowExclusiveLock | t       | f        |
(1 row)
</code></pre><p>This shows that PID 295995 is the process holding the lock.<h3 id=4-find-what-that-blocking-process-is-doing><a href=#4-find-what-that-blocking-process-is-doing>4. Find what that blocking process is doing</a></h3><p>Now that we know which process has been granted the lock, we can go back to <code>pg_stat_activity</code> to see what that PID is doing:<pre><code class=language-sql>SELECT
  pid,
  state,
  wait_event,
  wait_event_type,
  left (query, 40)
FROM
  pg_stat_activity
WHERE
  pid = 295995;
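
-- Once you've confirmed the blocker, you can end it with Postgres built-ins:
-- pg_cancel_backend() cancels its current query, while pg_terminate_backend()
-- ends the whole session. Uncomment to use:
-- SELECT pg_cancel_backend(295995);
-- SELECT pg_terminate_backend(295995);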
</code></pre><p>Here is the result:<pre><code class=language-sql>pid   |        state        | wait_event | wait_event_type |                left
--------+---------------------+------------+-----------------+-------------------------------------
 295995 | idle in transaction | ClientRead | Client          | RELEASE pg_psql_temporary_savepoint
</code></pre><p>The last column shows the last statement executed by that session, which in this case was the <a href=https://www.postgresql.org/docs/current/sql-release-savepoint.html>savepoint release</a> after an update, but in most cases it will show an active transaction.<h2 id=one-lock-to-rule-them-all><a href=#one-lock-to-rule-them-all>One lock to rule them all</a></h2><p>The above statements are pretty straightforward once you know what you are looking for, but they can also be combined into a single statement for a general blocking / blocked query. The <a href=https://wiki.postgresql.org/wiki/Lock_Monitoring>Postgres wiki</a> has some good combined versions.<p>Oftentimes you might find that the blocked statement is blocked by another (and another, and another still…). In those cases, it is still possible to trace all the way up to the One PID that blocks all the rest, but that can be an arduous, unexpected journey. For those cases, a colleague here at Crunchy Data, Brian Pace, wrote a query that helps to show locks waiting on other locks, rolling up to the PID holding the initial lock:<pre><code class=language-sql>WITH sos AS (
	SELECT array_cat(array_agg(pid),
           array_agg((pg_blocking_pids(pid))[array_length(pg_blocking_pids(pid),1)])) pids
	FROM pg_locks
	WHERE NOT granted
)
SELECT a.pid, a.usename, a.datname, a.state,
	   a.wait_event_type || ': ' || a.wait_event AS wait_event,
       current_timestamp-a.state_change time_in_state,
       current_timestamp-a.xact_start time_in_xact,
       l.relation::regclass relname,
       l.locktype, l.mode, l.page, l.tuple,
       pg_blocking_pids(l.pid) blocking_pids,
       (pg_blocking_pids(l.pid))[array_length(pg_blocking_pids(l.pid),1)] last_session,
       coalesce((pg_blocking_pids(l.pid))[1]||'.'||coalesce(case when locktype='transactionid' then 1 else array_length(pg_blocking_pids(l.pid),1)+1 end,0),a.pid||'.0') lock_depth,
       a.query
FROM pg_stat_activity a
     JOIN sos s on (a.pid = any(s.pids))
     LEFT OUTER JOIN pg_locks l on (a.pid = l.pid and not l.granted)
ORDER BY lock_depth;
</code></pre><p>Example output from that statement:<pre><code class=language-text>pid   |   usename   | datname  |        state        |     wait_event      |  time_in_state  |  time_in_xact   |  relname   |   locktype    |        mode         | page | tuple |     blocking_pids      | last_session | lock_depth |                       query
--------+-------------+----------+---------------------+---------------------+-----------------+-----------------+------------+---------------+---------------------+------+-------+------------------------+--------------+------------+----------------------------------------------------
 879401 | application | postgres | idle in transaction | Client: ClientRead  | 00:29:53.512147 | 00:30:01.31748  |            |               |                     |      |       |                        |              | 879401.0   | select * from sampledata where id=101 for update;
 880275 | application | postgres | active              | Lock: transactionid | 00:01:00.342763 | 00:01:00.459375 |            | transactionid | ShareLock           |      |       | {879401}               |       879401 | 879401.1   | update sampledata set data = 'abc' where id = 101;
 880204 | application | postgres | active              | Lock: relation      | 00:00:29.722705 | 00:00:29.722707 | sampledata | relation      | AccessExclusiveLock |      |       | {879401,880275,879488} |       879488 | 879401.4   | alter table sampledata add column data03 integer;
 880187 | application | postgres | active              | Lock: relation      | 00:00:03.580716 | 00:00:03.580718 | sampledata | relation      | RowExclusiveLock    |      |       | {880204}               |       880204 | 880204.2   | update sampledata set data = 'abc' where id = 103;
 879527 | application | postgres | active              | Lock: relation      | 00:00:14.974433 | 00:28:32.80346  | sampledata | relation      | RowExclusiveLock    |      |       | {880204}               |       880204 | 880204.2   | update sampledata set data = 'abc' where id = 102;
 879488 | application | postgres | active              | Lock: tuple         | 00:00:41.35361  | 00:00:41.47118  | sampledata | tuple         | ExclusiveLock       |    2 |    21 | {880275}               |       880275 | 880275.2   | update sampledata set data = 'def' where id = 101;
(6 rows)
</code></pre><p>In this manufactured example we have:<p>879401 - the “idle in transaction” PID - This is a <code>SELECT... FOR UPDATE</code> within a transaction. Its <code>blocking_pids</code> field is blank because it’s not blocked by any other process. This is the process in this example that is blocking everything else.<p>880275 - Attempting to update the same <code>id=101</code> - It’s blocked until the <code>FOR UPDATE</code> is completed.<p>879488 - Again attempting to update the same <code>id=101</code> - It can’t execute until the process blocking <em>it</em> completes. It’s waiting on 880275 since it came in afterwards. If 880275 is canceled, it will just roll up to the next blocker, 879401.<p>880204 - Here we added an <code>ALTER TABLE</code> - since it takes an access exclusive lock, note that its <code>blocking_pids</code> shows all three of the prior statements - it won’t be able to execute until each of those is out of the way.<p>879527 - Blocked by the <code>ALTER TABLE</code>, since that statement requires an AccessExclusiveLock. Note that it’s still blocked, even though it’s a different row (<code>id=102</code>).<p>880187 - Blocked also by the <code>ALTER TABLE</code>. They are at the same <code>lock_depth</code> since they are both blocked by the same thing, but not by each other.<h2 id=ending-the-process-holding-the-lock><a href=#ending-the-process-holding-the-lock>Ending the process holding the lock</a></h2><p>Ok, now we’ve found the PID at the top of the tree, that one lock holding the key to the rest of our locks. Fortunately, as Postgres wizards, we do possess the craft to unmake the lock.<h3 id=commit><a href=#commit>Commit</a></h3><p>If the statement is showing as <code>idle in transaction</code>, it is possible that you have a non-committed transaction open that started with a <code>BEGIN</code> statement. In that case you can commit with:<pre><code class=language-sql>COMMIT;
</code></pre><h3 id=rollback><a href=#rollback>Rollback</a></h3><p>You may have performed some unintended updates, or run into an error. In that case you can abort the transaction and roll back any changes already made with:<pre><code class=language-sql>ROLLBACK;
</code></pre><h3 id=cancel-the-pid><a href=#cancel-the-pid>Cancel the PID</a></h3><p>If this wasn't a transaction you initiated, in most cases you can cancel the running query with:<pre><code class=language-sql>SELECT pg_cancel_backend(PID);
</code></pre><h3 id=terminate-the-backend-connection-and-process><a href=#terminate-the-backend-connection-and-process>Terminate the backend connection and process</a></h3><p>If the cancel statement above doesn’t work, you can cast the lock back into the fiery chasm from whence it came by executing a terminate back end statement. This will end the process and its associated database connection.<pre><code class=language-sql>SELECT pg_terminate_backend(PID);
</code></pre><h2 id=why-did-postgres-lock><a href=#why-did-postgres-lock>Why did Postgres lock?</a></h2><p>Postgres’ multi-version concurrency control (MVCC) system is incredibly advanced and by and large lets you query, update, and insert rows without locking tables. There are two main kinds of locks:<ul><li><em>Shared locks</em> - the resource can be accessed by more than one backend/session at the same time<li><em>Exclusive locks</em> - the resource can only be accessed by a single backend/session at a time</ul><p>The lock type that generally gets us into trouble and blocks other queries and processes is the exclusive lock. If you want an overview, see David’s post, <a href=https://www.crunchydata.com/blog/postgres-locking-when-is-it-concerning>Postgres Locking: When Is It Concerning</a>? There are probably hundreds of ways to put an exclusive lock on a table, but these are the most common ones we see with our customers.<p><strong>Alter Table</strong><p>By far the most common cause I see of an exclusive and detrimental lock is an <code>ALTER TABLE</code> command, which can be issued to the database directly or in some cases via the application’s ORM while running migrations. The <code>ALTER TABLE</code> itself takes an <code>ACCESS EXCLUSIVE</code> lock (see <a href=https://www.postgresql.org/docs/current/sql-altertable.html>ALTER TABLE docs</a>) which pretty much blocks every other process on that table.<p><strong>ORM framework</strong><p>ORM frameworks can hide circular dependencies that produce deadlocks. 
An error on the application side, where other operations fail while executing within the same transaction scope, can leave locks held and result in future transactions taking a long time to complete.<p><strong>Create index</strong><p>Creating indexes can lock tables if you’re not using <code>CREATE INDEX CONCURRENTLY</code>.<p><strong>Vacuum</strong><p><code>VACUUM FULL</code> will take out an <code>ACCESS EXCLUSIVE</code> lock against a table, so it should be used only in rare cases.<p><strong>Other</strong><p>The Postgres documentation has a table showing the <a href=https://www.postgresql.org/docs/current/explicit-locking.html#TABLE-LOCK-COMPATIBILITY>different lock modes</a>, how they might block each other, and some examples of statement types that result in those locks.<h2 id=getting-proactive-about-locks><a href=#getting-proactive-about-locks>Getting proactive about locks</a></h2><p>Let’s look at a few tips for managing locking in the future.<h3 id=logging-lock_waits><a href=#logging-lock_waits>Logging lock_waits</a></h3><p>You can log any time a query waits on a lock longer than <code>deadlock_timeout</code> (one second by default) by turning on <code>log_lock_waits</code>. Lock waits in your logs can be a good indicator that processes are being contentious. There is virtually no overhead in enabling this and it’s very safe for production databases. This is set to “on” by default on Crunchy Bridge clusters:<pre><code class=language-sql>log_lock_waits = on
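-- This is a postgresql.conf setting; to change it at runtime instead:
--   ALTER SYSTEM SET log_lock_waits = 'on';
--   SELECT pg_reload_conf();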
</code></pre><h3 id=set-a-lock-timeout><a href=#set-a-lock-timeout>Set a lock timeout</a></h3><p>We generally recommend clients set a <code>lock_timeout</code> within a session so that it will cancel the transaction and relinquish any locks it was holding after a certain period of time. This helps to prevent other processes from getting caught up behind them in a chain.<pre><code class=language-sql>ALTER SYSTEM SET lock_timeout = '10s';
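-- ALTER SYSTEM applies server-wide (after a SELECT pg_reload_conf();).
-- To set it for only the current session, as suggested above:
SET lock_timeout = '10s';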
</code></pre><h2 id=summary><a href=#summary>Summary</a></h2><ul><li>Find processes waiting on locks in <code>pg_stat_activity</code> by looking for processes that are <code>active</code> but have a <code>wait_event</code> or <code>wait_event_type</code> that is non-NULL.<li>Use this query to find the source of the lock (<strong><em>seriously save this query somewhere, you might need it someday</em></strong>).</ul><pre><code class=language-sql>WITH sos AS (
	SELECT array_cat(array_agg(pid),
           array_agg((pg_blocking_pids(pid))[array_length(pg_blocking_pids(pid),1)])) pids
	FROM pg_locks
	WHERE NOT granted
)
SELECT a.pid, a.usename, a.datname, a.state,
	   a.wait_event_type || ': ' || a.wait_event AS wait_event,
       current_timestamp-a.state_change time_in_state,
       current_timestamp-a.xact_start time_in_xact,
       l.relation::regclass relname,
       l.locktype, l.mode, l.page, l.tuple,
       pg_blocking_pids(l.pid) blocking_pids,
       (pg_blocking_pids(l.pid))[array_length(pg_blocking_pids(l.pid),1)] last_session,
       coalesce((pg_blocking_pids(l.pid))[1]||'.'||coalesce(case when locktype='transactionid' then 1 else array_length(pg_blocking_pids(l.pid),1)+1 end,0),a.pid||'.0') lock_depth,
       a.query
FROM pg_stat_activity a
     JOIN sos s on (a.pid = any(s.pids))
     LEFT OUTER JOIN pg_locks l on (a.pid = l.pid and not l.granted)
ORDER BY lock_depth;
</code></pre><ul><li>End the lock by canceling the pid or issuing a <code>COMMIT</code> or <code>ROLLBACK</code> of the process that’s holding the lock and blocking the other processes<li>Be careful with <code>ALTER TABLE</code> commands, <code>CREATE INDEX</code> without the <code>CONCURRENTLY</code> option, or runaway processes from your ORM that may be holding exclusive locks and blocking general database processing.<li>Setting a <code>lock_timeout</code> can be a good idea, and logging lock waits is a good proactive way to keep track of ongoing problems.</ul><p>Thanks to my colleague <a href=https://www.crunchydata.com/blog/author/brian-pace>Brian Pace</a> for the great cascading locks query. ]]></content:encoded>
<category><![CDATA[ Production Postgres ]]></category>
<author><![CDATA[ Jesse.Soyland@crunchydata.com (Jesse Soyland) ]]></author>
<dc:creator><![CDATA[ Jesse Soyland ]]></dc:creator>
<guid isPermalink="false">b08a19715438b0cdf8443bb6c79ba718c9db5a1dd09d38e8b1ad5bc5bd5962c0</guid>
<pubDate>Thu, 18 Jan 2024 14:00:00 EST</pubDate>
<dc:date>2024-01-18T19:00:00.000Z</dc:date>
<atom:updated>2024-01-18T19:00:00.000Z</atom:updated></item>
<item><title><![CDATA[ The Integer at the End of the Universe: Integer Overflow in Postgres ]]></title>
<link>https://www.crunchydata.com/blog/the-integer-at-the-end-of-the-universe-integer-overflow-in-postgres</link>
<description><![CDATA[ Integer overflow can happen when a sequence exceeds the limits of its integer data type. Jesse has a query to help you spot it and recommendations for a short term and a long term fix. ]]></description>
<content:encoded><![CDATA[ <p>Integer overflow occurs when a computer program tries to store an integer but the value being stored exceeds the maximum value that can be represented by the data type being used to store it. We have helped a few <a href=https://www.crunchydata.com/>Crunchy Data</a> clients navigate this recently and wanted to write up some notes.<p>In Postgres, there are three integer types:<ul><li><code>smallint</code> - A 2-byte integer, -32768 to 32767<li><code>integer</code> - A 4-byte integer, -2147483648 to 2147483647<li><code>bigint</code> - An 8-byte integer, -9223372036854775808 to +9223372036854775807</ul><p>It is not uncommon to use a 4-byte integer as a primary key when defining a new table. This can cause problems if the value to be represented is more than 4 bytes can hold. If a sequence’s limit is reached, you might see an error in your logs that looks like this:<pre><code class=language-text>ERROR:  nextval: reached maximum value of sequence "test_id_seq" (2147483647)
</code></pre><p><strong>Don’t Panic!</strong> We have some helpful and intelligible PostgreSQL solutions.<h3 id=how-do-you-know-if-you-are-close-to-overflowing-an-integer><a href=#how-do-you-know-if-you-are-close-to-overflowing-an-integer>How do you know if you are close to overflowing an integer?</a></h3><p>The following query will identify any auto-incrementing columns, the SEQUENCE object each one owns, the data types of the column and the SEQUENCE object, and what percent of the sequence or column data type’s range the sequence value has consumed:<pre><code class=language-pgsql>SELECT
    seqs.relname AS sequence,
    format_type(s.seqtypid, NULL) sequence_datatype,
    CONCAT(tbls.relname, '.', attrs.attname) AS owned_by,
    format_type(attrs.atttypid, atttypmod) AS column_datatype,
    pg_sequence_last_value(seqs.oid::regclass) AS last_sequence_value,
    TO_CHAR((
        CASE WHEN format_type(s.seqtypid, NULL) = 'smallint' THEN
            (pg_sequence_last_value(seqs.relname::regclass) / 32767::float)
        WHEN format_type(s.seqtypid, NULL) = 'integer' THEN
            (pg_sequence_last_value(seqs.relname::regclass) / 2147483647::float)
        WHEN format_type(s.seqtypid, NULL) = 'bigint' THEN
            (pg_sequence_last_value(seqs.relname::regclass) / 9223372036854775807::float)
        END) * 100, 'fm9999999999999999999990D00%') AS sequence_percent,
    TO_CHAR((
        CASE WHEN format_type(attrs.atttypid, NULL) = 'smallint' THEN
            (pg_sequence_last_value(seqs.relname::regclass) / 32767::float)
        WHEN format_type(attrs.atttypid, NULL) = 'integer' THEN
            (pg_sequence_last_value(seqs.relname::regclass) / 2147483647::float)
        WHEN format_type(attrs.atttypid, NULL) = 'bigint' THEN
            (pg_sequence_last_value(seqs.relname::regclass) / 9223372036854775807::float)
        END) * 100, 'fm9999999999999999999990D00%') AS column_percent
FROM
    pg_depend d
    JOIN pg_class AS seqs ON seqs.relkind = 'S'
        AND seqs.oid = d.objid
    JOIN pg_class AS tbls ON tbls.relkind = 'r'
        AND tbls.oid = d.refobjid
    JOIN pg_attribute AS attrs ON attrs.attrelid = d.refobjid
        AND attrs.attnum = d.refobjsubid
    JOIN pg_sequence s ON s.seqrelid = seqs.oid
WHERE
    d.deptype = 'a'
    AND d.classid = 1259;
</code></pre><p>To show this query in action, let me set up a test table with an <code>integer</code> primary key, where the sequence has been artificially advanced to 2 Billion:<pre><code class=language-pgsql>postgres=# create table test(id serial primary key, value integer);
CREATE TABLE
postgres=# select setval('test_id_seq', 2000000000);
   setval
------------
 2000000000
(1 row)

postgres=# \d test
                            Table "public.test"
 Column |  Type   | Collation | Nullable |             Default
--------+---------+-----------+----------+----------------------------------
 id     | integer |           | not null | nextval('test_id_seq'::regclass)
 value  | integer |           |          |
Indexes:
    "test_pkey" PRIMARY KEY, btree (id)
</code></pre><p>Now when running the query above to find the integer overflow percent, I can see that the data types for both the column and the sequence are <code>integer</code>, and since the sequence’s next value is 2 Billion, it is 93% through the acceptable range:<pre><code class=language-pgsql>sequence   | sequence_datatype | owned_by | column_datatype | last_sequence_value | sequence_percent | column_percent
-------------+-------------------+----------+-----------------+---------------------+------------------+----------------
 test_id_seq | integer           | test.id  | integer         |          2000000001 | 93.13%           | 93.13%
(1 row)
</code></pre><h3 id=changing-to-negative-number-sequencing><a href=#changing-to-negative-number-sequencing>Changing to negative number sequencing</a></h3><p>Since the <code>integer</code> types in Postgres include negative numbers, a simple way to deal with integer overflow is to flip to sequencing with negative numbers. This can be done by giving the sequence a new start value of <code>-1</code> and converting to a descending sequence by giving it a negative <code>INCREMENT</code> value:<pre><code class=language-pgsql>alter sequence test_id_seq no minvalue start with -1 increment -1 restart;
</code></pre><p>If the purpose of the generated key is purely to create uniqueness, negative values are perfectly acceptable, but in some application frameworks or other use cases negative numbers may be undesirable or not work at all. In those cases we can change the field type entirely.<p>Keep in mind that the data type will need to be changed for any fields that reference this ID as well, or else they will also be out of bounds. Also any foreign key constraints will need to be dropped and reapplied after the both fields’ types have been updated.<p><strong>Benefits of the negative number approach:</strong><ul><li>No change to the column structure<li>Very fast: just change the sequence start number</ul><p><strong>Drawbacks:</strong><ul><li>Negative numbers might not work with your application framework<li>You only buy yourself double the amount of IDs. You could be in this situation again soon</ul><p>In general, this is a buy you some time approach and seen as a short term fix.<h3 id=changing-to-bigint><a href=#changing-to-bigint>Changing to <code>bigint</code></a></h3><p>The more complete fix to your sequence exhaustion is changing to the <code>bigint</code> data type.<p>In order to change the field type of the above <code>test</code> table, we will first create a new ID of type <code>bigint</code> that will eventually replace the current <code>id</code>, and create a unique constraint on it:<pre><code class=language-pgsql>alter table test add column id_new bigint;
CREATE UNIQUE INDEX CONCURRENTLY test_id_new ON test (id_new);
</code></pre><p>The new column will also need a new sequence of type <code>bigint</code>. The sequence needs to start at some point after the latest value that had been recorded.<pre><code class=language-pgsql>CREATE SEQUENCE test_id_new_seq START 2147483776 AS bigint;
ALTER TABLE test ALTER COLUMN id_new SET DEFAULT nextval ('test_id_new_seq');
alter sequence test_id_new_seq owned by test.id_new;
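-- Note: the start value 2147483776 was chosen to be safely above the
-- old integer sequence's maximum possible value (2147483647)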
</code></pre><p>Now new values can be added to the table, but there are two different sequences being incremented - the old and the new:<pre><code class=language-pgsql>postgres=# select * from test;
     id     | value |   id_new
------------+-------+------------
 2000000007 |       |
 2000000008 |       |
 2000000009 |       |
 2000000010 |       |
 2000000011 |       | 2147483776
 2000000012 |       | 2147483777
 2000000013 |       | 2147483778
 2000000014 |       | 2147483779

</code></pre><p>In a single transaction, we will drop the old ID constraint and default, rename columns, and add an invalid “not null” constraint on the new ID column:<pre><code class=language-pgsql>BEGIN;
ALTER TABLE test DROP CONSTRAINT test_pkey;
ALTER TABLE test ALTER COLUMN id DROP DEFAULT;
ALTER TABLE test RENAME COLUMN id TO id_old;
ALTER TABLE test RENAME COLUMN id_new TO id;
ALTER TABLE test ALTER COLUMN id_old DROP NOT NULL;
ALTER TABLE test ADD CONSTRAINT id_not_null CHECK (id IS NOT NULL) NOT VALID;
COMMIT;
</code></pre><p>Now new IDs are being added to the table. Thanks to the <code>NOT NULL</code> constraint on <code>id</code>, new NULL values cannot be added, but since it is also <code>NOT VALID</code> the existing NULL values are permitted. In order to make <code>id</code> back into a primary key, the <code>id_old</code> data must be backfilled into <code>id</code> so that the constraint can be made valid. This can be done in batches, e.g.:<pre><code class=language-pgsql>WITH unset_values AS (
    SELECT
        id_old
    FROM
        test
    WHERE
        id IS NULL
    LIMIT 1000)
UPDATE
    test
SET
    id = unset_values.id_old
FROM
    unset_values
WHERE
    unset_values.id_old = test.id_old;
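-- Run this batch repeatedly until it reports UPDATE 0, meaning every
-- existing row has been backfilled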
</code></pre><p>Once all rows have been backfilled, the <code>NOT NULL</code> constraint can be validated, the UNIQUE index on <code>id</code> can be converted to a primary key, and finally the standalone <code>NOT NULL</code> constraint can be dropped:<pre><code class=language-pgsql>ALTER TABLE test VALIDATE CONSTRAINT id_not_null;
ALTER TABLE test ADD CONSTRAINT test_pkey PRIMARY KEY USING INDEX test_id_new;
ALTER TABLE test DROP CONSTRAINT id_not_null;
</code></pre><p>At any point now the 4-byte <code>id_old</code> column can be dropped, as the bigint has taken its place:<pre><code class=language-pgsql>postgres=# ALTER table test drop column id_old;
ALTER TABLE
postgres=# \d test
                              Table "public.test"
 Column |  Type   | Collation | Nullable |               Default
--------+---------+-----------+----------+--------------------------------------
 value  | integer |           |          |
 id     | bigint  |           | not null | nextval('test_id_new_seq'::regclass)
Indexes:
    "test_pkey" PRIMARY KEY, btree (id)
</code></pre><p>The new 8-byte bigint id should be sufficient for a very, very, <strong>very</strong> long time:<pre><code class=language-pgsql>sequence     | sequence_datatype | owned_by | column_datatype | last_sequence_value | sequence_percent | column_percent
-----------------+-------------------+----------+-----------------+---------------------+------------------+----------------
 test_id_new_seq | bigint            | test.id  | bigint          |          2147483788 | 0.00%            | 0.00%
</code></pre><p><strong>Benefits of the <code>bigint</code> approach:</strong><ul><li>This is a long term fix and you won't have to worry about running out of sequence numbers for a very long time.</ul><p><strong>Drawbacks:</strong><ul><li>You probably need to update a lot of other things to larger integers<li>Takes coordination with the entire database. In our experience, this is a large project.</ul><h3 id=serial-types><a href=#serial-types><code>SERIAL</code> types</a></h3><p>In Postgres, the <code>SERIAL</code> data types (<code>smallserial</code>, <code>serial</code>, and <code>bigserial</code>) are shortcuts for creating auto-incrementing identifier columns whose values are assigned the next value from a Postgres SEQUENCE object.<p>Creating a column of type <code>SERIAL</code> will default it to type <code>integer</code>, simultaneously creating an integer sequence object owned by the specified table column and making its nextval() the default value for the column.<p>For new tables, consider using <code>BIGSERIAL</code>.<h3 id=summary><a href=#summary>Summary</a></h3><ul><li>You can check with a query if you’re running out of sequence numbers.<li>Changing to negative numbers can be a short term fix.<li>Changing to <code>bigint</code> is the recommended long term fix.<li>When you are setting up a new database that’s likely to have a lot of data in it, look at <code>BIGSERIAL</code> instead of <code>SERIAL</code>.</ul><p>Integer overflow may appear at a glance to be insanely complicated. I have written this to keep Postgres DBAs and intergalactic travelers from panicking. ]]></content:encoded>
<category><![CDATA[ Production Postgres ]]></category>
<author><![CDATA[ Jesse.Soyland@crunchydata.com (Jesse Soyland) ]]></author>
<dc:creator><![CDATA[ Jesse Soyland ]]></dc:creator>
<guid isPermalink="false">8dba0794d227f4924cc226b508ef99a9fd78a460cba2e32468454bcb7c3b4b49</guid>
<pubDate>Fri, 03 Mar 2023 11:00:00 EST</pubDate>
<dc:date>2023-03-03T16:00:00.000Z</dc:date>
<atom:updated>2023-03-03T16:00:00.000Z</atom:updated></item>
<item><title><![CDATA[ Postgres Migration Pitstop: Collations ]]></title>
<link>https://www.crunchydata.com/blog/postgres-migration-pitstop-collations</link>
<description><![CDATA[ Checking on your collations is a must have stop on your migration path. You might just run a quick check and be on your way or you might need to add a few more steps to your cutover plans. ]]></description>
<content:encoded><![CDATA[ <p>At Crunchy Data we spend a lot of time helping customers migrate their databases. Migrating Postgres tends to be a very straightforward process. Yet there can still be a few gotchas that can catch you off-guard if you are not prepared to deal with them. From some recent experiences with customers migrating to <a href=https://www.crunchydata.com/products/crunchy-bridge>Crunchy Bridge</a> we found most customers had not considered the underlying collations. These customers ran a risk of data corruption by not handling collation review and updates as part of their migration. A mismatched glibc is one of those details that could actually be a big gotcha and quite the headache if you are unaware of it - so we wanted to cover a few quick details.<h2 id=why-should-i-care-about-mismatched-glibc><a href=#why-should-i-care-about-mismatched-glibc>Why should I care about mismatched <code>glibc</code>?</a></h2><p><strong>Using mismatched <code>glibc</code> versions can have a risk of:</strong><ul><li>Missing data when you query it<li>Inconsistent sorting between versions<li>Undetected unique constraint violations</ul><p>These can all result in data corruption issues. For example, if you have a unique constraint on email addresses and the sort order differs across versions, you may end up with two accounts for the same user. You may get empty results when you query. 
Reconciling data corruption may be simple if it is a single record, but the longer it lives, the bigger the cleanup becomes, and it can result in weeks of pain.<p><strong>We’ve seen differences between glibc versions when:</strong><ul><li>Using <a href=https://docs.crunchybridge.com/how-to/migrate/#replica-using-postgres-tooling-wal-e-wal-g-pgbackrest>physical replication to migrate databases</a> (e.g., <code>wal-e</code>, <code>wal-g</code>, <code>pgbackrest</code>) from one host to a new one.<li>Restoring a binary backup (e.g., <code>pg_basebackup</code>) on a system with different OS configuration.<li>Upgrading the Linux distribution to a new major release while keeping the PostgreSQL data directory. In this case, the glibc version may have changed but your underlying data did not.</ul><p>Not all types of migrations or replication are affected by this inconsistency. Situations where the data is transported in a logical (not binary) way are quite safe, including:<ul><li>Backups and restore processes using <code>pg_dump</code>, since these use logical data only<li>Logical replication, which uses only a data copy and not the physical copy</ul><h2 id=how-the-sorting-works><a href=#how-the-sorting-works>How the sorting works</a></h2><p>For a very simple but practical example, on glibc versions older than 2.28 we can run this query and see how data sorts.<pre><code class=language-pgsql>old-glibc::DATABASE=> SELECT * FROM (values ('a'), ('$a'), ('a$'), ('b'), ('$b'), ('b$'), ('A'), ('B')) AS l(x) ORDER BY x;
 x
----
 a
 $a
 a$
 A
 b
 $b
 b$
 B
(8 rows)
</code></pre><p>Then run the same on a newer version:<pre><code class=language-pgsql>new-glibc::DATABASE=> SELECT * FROM (values ('a'), ('$a'), ('a$'), ('b'), ('$b'), ('b$'), ('A'), ('B')) AS l(x) ORDER BY x;
 x
----
 $a
 $b
 a
 A
 a$
 b
 B
 b$
(8 rows)
</code></pre><p>Thanks to <a href=https://www.twitter.com/DanielVerite>@DanielVerite</a> for a great example in his write-up on <a href=https://postgresql.verite.pro/blog/2018/08/27/glibc-upgrade.html>glibc and Postgres</a>. Let’s dig in a bit more though.<h2 id=what-is-glibc><a href=#what-is-glibc>What is glibc?</a></h2><p>Libc is the main C library used by the Linux system. Many Linux programs, including Postgres, use the glibc implementation. It is used to provide many fundamental software operations and is used inside Postgres to do things like sorting text or comparing data when creating indexes.<p>A <a href=https://sourceware.org/legacy-ml/libc-alpha/2018-08/msg00003.html>major update</a> released with <code>glibc 2.28</code> in 2018 brought localization and collation information into compliance with the 2016 Edition 4 ISO 14651 standards. With the update, indexes that were created with a prior version of the collations potentially exhibit corruption when being read by a system using the updated collations. If there is a mis-match the indexes must be rebuilt to avoid issues.<h2 id=what-collations-are-you-using><a href=#what-collations-are-you-using>What collations are you using?</a></h2><p>You can find the data collation your databases are using via the <code>datcollate</code> field of <code>pg_database</code>.<pre><code class=language-pgsql>SELECT datname, datcollate FROM pg_database;
  datname  | datcollate
-----------+-------------
 postgres  | en_US.UTF-8
 demo      | en_US.UTF-8
 template1 | en_US.UTF-8
 template0 | en_US.UTF-8
(4 rows)
</code></pre><p>And to check the collation versions recorded for your <code>glibc</code>-provided collations (this query is environment dependent):<pre><code class=language-pgsql>select collname, collversion from pg_collation where collprovider = 'c';
     collname     | collversion
------------------+-------------
 C                |
 POSIX            |
 ucs_basic        |
 C.utf8           |
 en_AG            | 2.28
 en_AU            | 2.28
 en_AU.utf8       | 2.28
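</code></pre><p>To narrow this down to collations whose recorded version no longer matches what the operating system provides, you can compare <code>collversion</code> against <code>pg_collation_actual_version()</code> (available since Postgres 10). A sketch; any rows returned point at collations whose dependent indexes deserve a closer look:<pre><code class=language-pgsql>SELECT collname,
       collversion AS recorded_version,
       pg_collation_actual_version(oid) AS os_version
FROM pg_collation
WHERE collprovider = 'c'
  AND collversion IS DISTINCT FROM pg_collation_actual_version(oid);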
</code></pre><h2 id=how-do-i-fix-it><a href=#how-do-i-fix-it>How do I fix it?</a></h2><h3 id=fix-during-a-migration><a href=#fix-during-a-migration>Fix during a migration</a></h3><p>Since this issue arises when binary data moves between operating systems with different <code>glibc</code> versions, it generally surfaces during a migration. Migrating via logical replication or a logical backup (i.e., <code>pg_dump</code>) eliminates the issue, since any affected indexes are recreated at restore time. Switching to a logical restore may therefore be worth considering.<p>For large databases, in excess of 100GB, logical backup migrations can take longer than desirable. In those cases, a WAL-based migration followed by rebuilding the affected indexes is generally the method we prefer to minimize downtime and engineering effort.<h3 id=on-a-live-database><a href=#on-a-live-database>On a live database</a></h3><p>If you think you might have a collation issue post-migration, the <code>amcheck</code> extension helps identify any index inconsistencies:<pre><code class=language-pgsql>SELECT bt_index_check(index => c.oid, heapallindexed => true),
               c.relname,
               c.relpages
FROM pg_index i
JOIN pg_opclass op ON i.indclass[0] = op.oid
JOIN pg_am am ON op.opcmethod = am.oid
JOIN pg_class c ON i.indexrelid = c.oid
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE am.amname = 'btree' AND n.nspname = 'pg_catalog'
-- Don't check temp tables, which may be from another session:
AND c.relpersistence != 't'
-- Function may throw an error when this is omitted:
AND c.relkind = 'i' AND i.indisready AND i.indisvalid
ORDER BY c.relpages;

bt_index_check |                    relname                    | relpages
----------------+-----------------------------------------------+----------
                | pg_publication_pubname_index                  |        1
                | pg_largeobject_loid_pn_index                  |        1
                | pg_largeobject_metadata_oid_index             |        1
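</code></pre><p>When the check does raise an error for an index, rebuilding that index brings it in line with the current collations. A sketch, using a hypothetical index name:<pre><code class=language-pgsql>-- Rebuild the affected index without blocking writes (Postgres 12+)
REINDEX INDEX CONCURRENTLY some_schema.some_index;

-- Optionally record the new collation version so mismatch warnings stop
ALTER COLLATION "en_US.utf8" REFRESH VERSION;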
</code></pre><p>If <code>bt_index_check</code> completes without raising errors, the checked indexes are consistent and no <code>REINDEX</code> is required for the new collations. If it reports problems for an index, you’ll likely need to <code>REINDEX</code> it.<p>Side note: running <code>amcheck</code> can be somewhat resource intensive, both in terms of I/O and time. Since any corruption is preserved in a binary copy of the database, consider running the check on a physical replica for a large or critical database so you don’t disrupt production workloads.<h3 id=reindex><a href=#reindex>Reindex</a></h3><p>If you’ve found an issue with the above steps, you’ll need to run <code>REINDEX</code> or <code>REINDEX CONCURRENTLY</code>. (Note: if you are using Postgres 14, we recommend <a href=https://www.postgresql.org/docs/current/release-14-4.html>14.4 or higher</a>, which fixes a <code>REINDEX CONCURRENTLY</code> bug that could itself introduce corruption.)<h2 id=have-questions><a href=#have-questions>Have questions?</a></h2><p>Data migrations can often be straightforward, but you want to ask the right questions rather than assume things will just work. We hope you’ve found this helpful as you consider a migration; if you have additional questions, please <a href=https://www.crunchydata.com/contact>reach out</a>, as we may be able to help.<p>Co-authored with <a href=https://www.crunchydata.com/blog/author/elizabeth-christensen>Elizabeth Christensen</a>, <a href=https://www.crunchydata.com/blog/author/david-christensen>David Christensen</a>, and <a href=https://www.crunchydata.com/blog/author/craig-kerstiens>Craig Kerstiens</a> ]]></content:encoded>
<category><![CDATA[ Production Postgres ]]></category>
<author><![CDATA[ Jesse.Soyland@crunchydata.com (Jesse Soyland) ]]></author>
<dc:creator><![CDATA[ Jesse Soyland ]]></dc:creator>
<guid isPermalink="false">9d6460e76806e6785fced6e1943a8f04593d3e94af5c21b34772d19c8b3dfddd</guid>
<pubDate>Fri, 02 Sep 2022 11:00:00 EDT</pubDate>
<dc:date>2022-09-02T15:00:00.000Z</dc:date>
<atom:updated>2022-09-02T15:00:00.000Z</atom:updated></item></channel></rss>