<?xml version="1.0" encoding="UTF-8" ?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" version="2.0"><channel><title>Crunchy Data | CrunchyData Blog</title>
<atom:link href="https://www.crunchydata.com/blog/author/crunchy-data/rss.xml" rel="self" type="application/rss+xml" />
<link>https://www.crunchydata.com/blog/author/crunchy-data</link>
<image><url>https://www.crunchydata.com/build/_assets/default.png-W4XGD4DB.webp</url>
<title>Crunchy Data | CrunchyData Blog</title>
<link>https://www.crunchydata.com/blog/author/crunchy-data</link>
<width>256</width>
<height>256</height></image>
<description>PostgreSQL experts from Crunchy Data share advice, performance tips, and guides on successfully running PostgreSQL and Kubernetes solutions</description>
<language>en-us</language>
<pubDate>Thu, 07 Jan 2021 04:00:00 EST</pubDate>
<dc:date>2021-01-07T09:00:00.000Z</dc:date>
<dc:language>en-us</dc:language>
<sy:updatePeriod>hourly</sy:updatePeriod>
<sy:updateFrequency>1</sy:updateFrequency>
<item><title><![CDATA[ Crunchy Data PostgreSQL Security Technical Implementation Guide Now Available ]]></title>
<link>https://www.crunchydata.com/blog/crunchy-data-postgresql-security-technical-implementation-guide-now-available</link>
<description><![CDATA[ This new guide is the result of ongoing collaboration with DISA and provides
security guidance for PostgreSQL 9.6 through 12 ]]></description>
<content:encoded><![CDATA[ <p><em>This new guide is the result of ongoing collaboration with DISA and provides security guidance for</em> <em>PostgreSQL 9.6 through 12</em><p>Charleston, S.C. (January 6, 2021) - <a href=https://www.crunchydata.com/>Crunchy Data</a> — the leading provider of trusted open source PostgreSQL — is pleased to announce the release of the <a href=https://www.crunchydata.com/files/stig/PGSQL-STIG-v1r1.pdf>Crunchy Data PostgreSQL <dfn>Security Technical Implementation Guide</dfn></a> (<abbr>STIG</abbr>) by the United States <dfn>Defense Information Systems Agency</dfn> (<abbr>DISA</abbr>). In 2017, Crunchy Data collaborated with DISA to publish the initial version of the PostgreSQL STIG, representing the first published STIG for an open source database.<p>The Crunchy Data PostgreSQL STIG provides guidance for the secure deployment and configuration of Crunchy Certified PostgreSQL in adherence to the United States Department of Defense security requirements guidelines. Enterprises can refer to this comprehensive guide for PostgreSQL security best practices as they consider open source PostgreSQL as an alternative to proprietary database software.<p>“We are proud to work in partnership with DISA to provide this updated security guidance for PostgreSQL and believe that it is yet another validation of the comprehensive security functionality of PostgreSQL,” said Crunchy Data President Paul Laurence. “Crunchy Data is committed to continue bringing the extraordinary cost effectiveness of open source PostgreSQL technology to the U.S. Defense community and to all database users who need to manage their information reliably, securely and efficiently.”<p>This new PostgreSQL STIG includes updated guidance for PostgreSQL 9.6 through 12, including how to use SCRAM authentication, a new logging location for PostgreSQL 10+, and usage of built-in defined roles added in newer Postgres releases. 
The Crunchy Data PostgreSQL STIG also provides expanded information regarding the use of <dfn>Federal Information Processing Standard</dfn> (<abbr>FIPS</abbr>)-compliant operating systems.<p><a href=https://www.crunchydata.com/products/crunchy-certified-postgresql>Crunchy Certified PostgreSQL</a>, Crunchy Data’s trusted 100% open source PostgreSQL distribution, enables Crunchy Data PostgreSQL STIG compliance by providing trusted PostgreSQL along with the requisite security-enhancing audit logging extensions and Crunchy Data's enterprise support. To ensure that Crunchy Certified PostgreSQL represents the most trusted enterprise PostgreSQL distribution, Crunchy Certified PostgreSQL has received <a href=https://www.commoncriteriaportal.org/>Common Criteria <dfn>Evaluation Assurance Level</dfn></a> (<abbr>EAL</abbr>) 2+ certification, an international standard for computer security certification. Crunchy Certified PostgreSQL is the first commercially available open source relational database management system to receive Common Criteria certification.<h2 id=about-disa-stig><a href=#about-disa-stig>About DISA STIG</a></h2><p>STIGs are the configuration standards for DoD <dfn>Information Assurance</dfn> (<abbr>IA</abbr>) and IA-enabled devices/systems. Since 1998, DISA has played a critical role in enhancing the security posture of DoD's security systems by providing the STIGs. The STIGs contain technical guidance to "lock down" information systems/software that might otherwise be vulnerable to a malicious computer attack.<h2 id=about-crunchy-data><a href=#about-crunchy-data>About Crunchy Data</a></h2><p>Crunchy Data is the leading provider of trusted open source and enterprise PostgreSQL technology, support and training.
Crunchy Data offers Crunchy Certified PostgreSQL, the most advanced open source RDBMS on the market, Crunchy PostgreSQL for Kubernetes, the leading solution for deploying Kubernetes native Postgres, and the recently launched <a href=https://crunchybridge.com/login/>Crunchy Bridge</a>, a fully managed cloud Postgres service that gives enterprises the ultimate choice in Postgres management and provides the ability to modernize infrastructure as needed. Learn more at <a href=https://www.crunchydata.com/>www.crunchydata.com</a>. ]]></content:encoded>
<author><![CDATA[ Crunchy.Data@crunchydata.com (Crunchy Data) ]]></author>
<dc:creator><![CDATA[ Crunchy Data ]]></dc:creator>
<guid isPermaLink="false">https://blog.crunchydata.com/blog/crunchy-data-postgresql-security-technical-implementation-guide-now-available</guid>
<pubDate>Thu, 07 Jan 2021 04:00:00 EST</pubDate>
<dc:date>2021-01-07T09:00:00.000Z</dc:date>
<atom:updated>2021-01-07T09:00:00.000Z</atom:updated></item>
<item><title><![CDATA[ Virtual PostGIS Day 2020 is Nov. 19 ]]></title>
<link>https://www.crunchydata.com/blog/postgis-day-2020</link>
<description><![CDATA[ We're putting together an awesome PostGIS Day virtual conference on Thursday, Nov 19th. Last year we hosted our first PostGIS Day in-person in St. Louis and although we can't gather in the same way this year, going virtual allows us to give even more talks! ]]></description>
<content:encoded><![CDATA[ <p>Authored by Steve Pousty<p>Today we are going to talk a little bit about <a href=https://en.wikipedia.org/wiki/Spatial_database>spatial databases</a> and a virtual event Crunchy Data is hosting with several friends and community members. We're putting together an awesome <a href=https://postgisday.rocks/>PostGIS Day</a> virtual conference on Thursday, Nov 19th. Last year, we hosted our first PostGIS Day in-person in St. Louis and although we can't gather in the same way this year, going virtual allows us to give even more talks! Registration is now open, so be sure to sign up and log into the presentations throughout the day.<h3 id=history-of-postgis-day><a href=#history-of-postgis-day>History of PostGIS Day</a></h3><p>I have been doing spatial data analysis since 1989 using software called <a href=https://www.nationalgeographic.org/encyclopedia/geographic-information-system-gis/><dfn>Geographic Information Systems</dfn></a> (<a href=https://en.wikipedia.org/wiki/Geographic_information_system><abbr>GIS</abbr></a>) and I fell in love with the field. I mean c’mon, computers <strong>and</strong> maps?! This was two legs of my golden triangle (the third we can save for another day).<p>The GIS field has quite a large community and has grown tremendously over time. If you think about it, so much data collected and so many <a href=https://www.linkedin.com/pulse/5-types-questions-gis-can-answer-willy-simons/>questions</a> people ask have a spatial component. In the early days of the field, all the software needed to do this had some sort of graphical interface, though it usually had a command line.
Over time, as operating systems became GUI-driven, GIS practitioners moved to desktop software to do their analysis.<p>Now every year, the GIS community comes together to celebrate all the traditional ways they use desktop computer software to fiddle with maps and locations, and they call it <a href=https://en.wikipedia.org/wiki/GIS_Day>GIS Day</a>.<p>And on the day after GIS Day, a second group gathers to celebrate the new and innovative ways to integrate maps and locations and spatial analysis into the wider world of technology, using spatial SQL in the database, and we call that <a href=https://www.xyht.com/spatial-itgis/what-is-postgis-day/>PostGIS Day</a>.<h2 id=postgis-day-its-what-comes-next><a href=#postgis-day-its-what-comes-next>PostGIS day: it's what comes next</a></h2><h3 id=what-we-have-in-store-for-you><a href=#what-we-have-in-store-for-you>What we have in store for YOU!</a></h3><p>With conferences going virtual this year, it was much easier for us to put together a variety of speakers and topics. Here's what the day is going to look like:<ul><li>To kick off the day, we have two introductory sessions for those new to PostGIS. Then come talks about PostGIS with the desktop GIS tool QGIS (<a href=https://en.wikipedia.org/wiki/Free_and_open-source_software>FOSS</a> as well). Following all that cool graphical stuff, we then show you all the analytic glory you can have by doing work right in PostGIS.<li>Right after that, we have two great talks on the future of PostGIS. We will hear about some of the greatness coming in the next release, PostGIS 3.1. Following that, there will be a talk on some of the low-level greatness coming to the <a href=https://trac.osgeo.org/geos/>spatial engine</a> used by PostGIS.<li>The next five talks are all applications of PostGIS. We will have talks on things such as using PostGIS for augmented reality, as well as using pgRouting for nautical routing.
We will see PostGIS being used in local government as well as in data science pipelines.<li>Then we will move to the use of PostGIS in web-based applications. There will be specialized application servers and applications to manage space and time for cultural resources.<li>Finally, we will end the day with two very large organizations talking about the power of PostGIS for their research and operations.</ul><h2 id=why-attend><a href=#why-attend>Why Attend?</a></h2><p>Even if you haven't used PostGIS or spatial data before, the PostGIS Day presentations will serve as a great introduction and expand your world of possibilities. If you are already familiar with PostGIS or spatial data, it will give you a lot to enjoy. I am so confident that you will enjoy it that I will promise you twice your money back if you don’t enjoy the day (it’s free to attend, see what I did there?).<p>What are you waiting for? Go <a href=https://info.crunchydata.com/en/postgis-day-2020-crunchy-data>sign up</a>! ]]></content:encoded>
<category><![CDATA[ Spatial ]]></category>
<author><![CDATA[ Crunchy.Data@crunchydata.com (Crunchy Data) ]]></author>
<dc:creator><![CDATA[ Crunchy Data ]]></dc:creator>
<guid isPermaLink="false">https://blog.crunchydata.com/blog/postgis-day-2020</guid>
<pubDate>Tue, 10 Nov 2020 04:00:00 EST</pubDate>
<dc:date>2020-11-10T09:00:00.000Z</dc:date>
<atom:updated>2020-11-10T09:00:00.000Z</atom:updated></item>
<item><title><![CDATA[ Deploy High Availability PostgreSQL Clusters on Kubernetes by Example ]]></title>
<link>https://www.crunchydata.com/blog/deploy-high-availability-postgresql-on-kubernetes</link>
<description><![CDATA[ Deploy a high-availability PostgreSQL cluster on Kubernetes by example and learn how it works. ]]></description>
<content:encoded><![CDATA[ <p>One of the great things about PostgreSQL is its reliability: it is very stable and typically “just works.” However, certain things can happen in the environment in which PostgreSQL is deployed that can affect its uptime, such as:<ul><li>The database storage disk fails or some other hardware failure occurs<li>The network on which the database resides becomes unreachable<li>The host operating system becomes unstable and crashes<li>A key database file becomes corrupted<li>A data center is lost</ul><p>There are also downtime events that are part of normal operations, such as performing a minor upgrade, security patching of the operating system, a hardware upgrade, or other maintenance.<p>Fortunately, the Crunchy PostgreSQL Operator is prepared for this.<p>Crunchy Data recently released version 4.2 of the open source PostgreSQL Operator for Kubernetes. Among the various enhancements included within this release is the introduction of distributed consensus based <dfn>high availability</dfn> (<abbr>HA</abbr>) for PostgreSQL clusters by using the Patroni high availability framework.<p>What does this mean for running high availability PostgreSQL clusters in Kubernetes, how does it work, and how do you create a high availability PostgreSQL cluster by example?
Read on to find out!<p><img alt="image showing kubernetes control plane with Crunchy Postgres replicas and backups" loading=lazy src=https://2283855.fs1.hubspotusercontent-na1.net/hubfs/2283855/postgresql-ha-overview.png><h2 id=the-crunchy-postgresql-operator-high-availability-fundamentals><a href=#the-crunchy-postgresql-operator-high-availability-fundamentals>The Crunchy PostgreSQL Operator High Availability Fundamentals</a></h2><p>To make the PostgreSQL clusters deployed by the PostgreSQL Operator resilient to the types of downtime events that affect availability, the Crunchy PostgreSQL Operator leverages the <dfn>distributed consensus store</dfn> (<abbr>DCS</abbr>) that backs Kubernetes to determine if the primary PostgreSQL database is in an unhealthy state. The PostgreSQL instances communicate amongst themselves via the Kubernetes DCS to determine which one is the current primary and whether they need to fail over to a new primary.<p>This is the key to how the PostgreSQL Operator provides high availability: it delegates the management of HA to the PostgreSQL clusters themselves! This ensures that the PostgreSQL Operator is not a single point of failure for the availability of any of the PostgreSQL clusters that it manages, as the PostgreSQL Operator only maintains the definitions of what should be in the cluster (e.g. how many instances in the cluster, etc.). This is similar to the outline of the Raft algorithm, which describes how to reach consensus on which instance is the leader (or primary) in a cluster.<p>(A quick aside: the Raft algorithm (“Reliable, Replicated, Redundant, Fault-Tolerant”) was developed for systems that have one “leader” (i.e. a primary) and one-to-many followers (i.e. replicas) to provide the same fault tolerance and safety as the Paxos algorithm while being easier to implement. Given that PostgreSQL runs as one primary and however many replicas you want, it is certainly appropriate to use Raft.
PostgreSQL clusters managed by the PostgreSQL Operator, via Patroni, leverage Raft properties of the Kubernetes DCS so that you can run a smaller number of PostgreSQL instances (i.e. 2) and still have distributed consensus!)<p>For the PostgreSQL cluster group to achieve distributed consensus on who the primary (or leader) is, each PostgreSQL cluster leverages the distributed etcd key-value store that is bundled with Kubernetes. After the PostgreSQL cluster elects a leader, the new primary places a lock in the distributed store to indicate that it is the leader. The "lock" is how the primary PostgreSQL instance provides its heartbeat: it periodically updates the lock, and so long as the replicas see the update within the allowable automated failover time, they will continue to follow the current primary.<p>For the “log replication” portion defined in the Raft algorithm, the primary instance replicates changes to each replica based on the rules set up during provisioning. Each replica keeps track of how far along in the recovery process it is using a “<dfn>log sequence number</dfn>” (<abbr>LSN</abbr>), a built-in PostgreSQL representation of a position in the transaction log that indicates how much of the log has been replayed on each replica. For the purposes of high availability, there are two LSNs that need to be considered: the LSN for the last log received by the replica, and the LSN for the changes replayed by the replica.
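<p>As a rough illustration (assuming PostgreSQL 10 or later; earlier releases use the older pg_last_xlog_* function names), these two positions can be queried directly on a replica:<pre><code class=language-sql>-- Run on a replica: the last WAL location received from the primary,
-- and the last location actually replayed, respectively
SELECT pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn();
</code></pre>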
These LSNs can be compared amongst the replicas to determine which one has received and replayed the most changes, and this comparison is an important part of the automated failover process.<p>For PostgreSQL clusters that leverage “synchronous replication,” a transaction is not considered complete until the changes from that transaction have been confirmed by all replicas that are subscribed to the primary.<h2 id=determining-when-to-failover-and-how-it-works><a href=#determining-when-to-failover-and-how-it-works>Determining When to Failover, And How It Works</a></h2><p>As mentioned above, the PostgreSQL replicas periodically check in on the lock to see if it has been updated by the primary within the allowable time. If a replica believes that the primary is unavailable, it becomes what is called a "candidate" according to the Raft algorithm and initiates an "election." It then votes for itself as the new primary. A candidate must receive a majority of votes in a cluster in order to be elected as the new primary. The replicas try to promote the PostgreSQL instance that is both available and has the highest LSN value on the latest timeline.<p>This system protects against a replica promoting itself when the primary is actually still available. If a replica believes that the primary is down and starts an election, but the primary is actually not down, the replica will not receive enough votes to become the new primary and will go back to following and replaying the changes from the primary.<p>Once an election is decided, the winning replica is immediately promoted to be a primary and takes a new lock in the Kubernetes consensus store. If the new primary has not finished replaying all of its transaction logs, it must do so in order to reach the desired state based on the LSN.
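<p>To make the leader lock concrete, here is a rough sketch of what it can look like in the Kubernetes DCS; the exact object name and annotations depend on how Patroni is configured, so treat this as illustrative only:<pre><code class=language-yaml># Hypothetical leader lock stored as a Kubernetes ConfigMap,
# e.g. viewed with: kubectl get configmap hippo-leader -o yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: hippo-leader
  annotations:
    leader: hippo-sekv-5f88dcbc5b-748zl    # Pod currently holding the primary role
    renewTime: "2020-01-21T09:00:05Z"      # periodically refreshed heartbeat
</code></pre>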
Once the logs are finished being replayed, the primary is able to accept new queries.<p>At this point, any existing replicas are updated to follow the new primary.<p>Notice that not once did I say anything about the PostgreSQL Operator here. This is the beauty of this high availability method: the PostgreSQL Operator allows for the administration of the PostgreSQL clusters and can set their overall structure (e.g. have 3 PostgreSQL instances in a cluster), but it does not manage their availability. This ensures that the PostgreSQL Operator is not a single point of failure!<h2 id=automatic-healing-of-the-failed-primary><a href=#automatic-healing-of-the-failed-primary>Automatic Healing of the Failed Primary</a></h2><p>One of the most important pieces of this kind of failover is being able to bring the old primary back into the fold as one of the replicas. For very large databases, this can be a challenge if you have to reinitialize the database from scratch. Fortunately, the PostgreSQL Operator provides a way for the failed primary to automatically heal!<p>When the old primary tries to become available again, it realizes that it has been deposed as the leader and must be healed. It leverages the pgBackRest repository that is deployed alongside the PostgreSQL cluster and uses the “delta restore” feature, which does an in-place update of all of the missing files from the current primary. This is much more efficient than reprovisioning the failed instance from scratch, and works well for very large databases! When the delta restore is done, the instance is considered healed and is ready to follow the new primary.<h2 id=less-talk-more-examples><a href=#less-talk-more-examples>Less Talk, More Examples</a></h2><p>Now that you understand how this all works, let's look at an example!<p>For the 4.2 release, we tested a variety of scenarios that would trigger a failover, from network splits (my favorite one to test) to critical file removal to the primary pod disappearing.
For the purposes of this exercise, we will try out the last case as it is a very easy experiment to run.<p>First, I have gone ahead and deployed the PostgreSQL Operator to a Kubernetes cluster. I have set up my cluster to use the <dfn>PostgreSQL Operator client</dfn>, aka <abbr>pgo</abbr>.<p>Let's create a high availability PostgreSQL cluster with two replicas using the pgo create cluster command:<pre><code class=language-shell>pgo create cluster hippo --replica-count=2
</code></pre><p>Notice I don't add any extra flags: high availability is enabled by default in the PostgreSQL Operator starting with version 4.2. If you want to disable high availability, you must use the --disable-autofail flag.<p>(Also note that you may need to explicitly pass in the Kubernetes Namespace with the -n flag. I set the PGO_NAMESPACE environment variable to automatically use the Namespace.)<p>Give the cluster a few minutes to get started. At some point, all of your PostgreSQL instances should be available, which you can test with the pgo test command:<pre><code class=language-shell>pgo test hippo
</code></pre><pre><code class=language-yaml>cluster : hippo
	Services
		primary (10.96.130.226:5432): UP
		replica (10.96.142.11:5432): UP
	Instances
		primary (hippo-9d5fb67c9-6svhm): UP
		replica (hippo-dupt-775c5fc66-fc7vh): UP
		replica (hippo-sekv-5f88dcbc5b-748zl): UP
</code></pre><p>Under the Instances section, you can see the names of the Kubernetes Pods that comprise the entirety of the PostgreSQL cluster. Take note of the name of the primary Pod for this cluster, which in this example is hippo-9d5fb67c9-6svhm. Let's have this Pod meet with an unfortunate accident (note: you may need to add a Namespace to this command with the -n flag):<pre><code class=language-shell>kubectl delete pods hippo-9d5fb67c9-6svhm
</code></pre><p>When demonstrating automatic failover with this method, you may notice that your kubectl command hangs for a few moments. This is due to Kubernetes making its updates after the failover event is detected.<p>Wait a few moments, and run pgo test again to see what happens:<pre><code class=language-shell>pgo test hippo
</code></pre><pre><code class=language-yaml>cluster : hippo
	Services
		primary (10.96.130.226:5432): UP
		replica (10.96.142.11:5432): UP
	Instances
		replica (hippo-9d5fb67c9-bkrht): UP
		replica (hippo-dupt-775c5fc66-fc7vh): UP
		primary (hippo-sekv-5f88dcbc5b-748zl): UP
</code></pre><p>Wow! Not only did a new primary PostgreSQL instance get elected, but we were able to automatically heal the old primary and turn it into a replica. Granted, this may be far more impressive if we had some more data in the database and I demonstrated the continuity of the availability, but this is already a long article. ;-)<h2 id=conclusion--further-reading><a href=#conclusion--further-reading>Conclusion &#38; Further Reading</a></h2><p>Using Kubernetes to run high availability PostgreSQL clusters is no easy task: while the fundamental building blocks are available to create this kind of environment, it does require some smarts and automation behind it. Fortunately, the PostgreSQL Operator provides that automation for you.<p>If you want to understand more about how the PostgreSQL Operator creates high availability PostgreSQL environments, I encourage you to read the high availability architecture section in our documentation.<p>I also encourage you to deploy the PostgreSQL Operator and try creating your own downtime scenarios to see what happens. We'd love to iron out any edge cases that may occur (though some may be too far out on the edge, and at that point you'd still need manual intervention. But hey, we'd like to try to automate it, and maybe you could propose a patch to automate it!), but most importantly, we'd love to understand how you deploy high availability PostgreSQL! ]]></content:encoded>
<category><![CDATA[ Kubernetes ]]></category>
<author><![CDATA[ Crunchy.Data@crunchydata.com (Crunchy Data) ]]></author>
<dc:creator><![CDATA[ Crunchy Data ]]></dc:creator>
<guid isPermaLink="false">https://blog.crunchydata.com/blog/deploy-high-availability-postgresql-on-kubernetes</guid>
<pubDate>Tue, 21 Jan 2020 04:00:00 EST</pubDate>
<dc:date>2020-01-21T09:00:00.000Z</dc:date>
<atom:updated>2020-01-21T09:00:00.000Z</atom:updated></item>
<item><title><![CDATA[ Using Kubernetes Deployments for Running PostgreSQL ]]></title>
<link>https://www.crunchydata.com/blog/using-kubernetes-deployments-for-running-postgresql</link>
<description><![CDATA[ The Crunchy PostgreSQL Operator helps to automate many typical database administration tasks at scale and leverages Kubernetes Deployments to provide flexibility in building environments that feature high availability, configurable resource management, and seamless upgrades. ]]></description>
<content:encoded><![CDATA[ <p>Running PostgreSQL databases in containerized environments is <a href=https://blog.g2crowd.com/blog/containerization/best-apps-images-repositories-docker-hub-2017/>more popular than ever</a> and is moving beyond running only in local, development environments and into <a href=https://blog.openshift.com/leveraging-the-crunchy-postgresql/>large scale production environments</a>. To answer the need to orchestrate complex database workloads, the Crunchy Data team created the <a href=https://github.com/CrunchyData/postgres-operator>PostgreSQL Operator</a> to automate many typical database administrator tasks at scale:<ul><li>Provisioning new PostgreSQL clusters<li>Scaling up replicas<li>Setting up and managing disaster recovery, high availability, and monitoring<li>Allocating resources (memory, CPU, etc.) and suggesting nodes for databases to run on<li>Mass applying user policies<li>Performing major/minor upgrades</ul><p>and more. In order to keep all of its PostgreSQL databases up and running, the PostgreSQL Operator uses <a href=https://kubernetes.io/docs/concepts/workloads/controllers/deployment/>Kubernetes Deployments</a>, which provide an API to manage replicated applications. In order to understand why this is, we first need to understand running stateful applications with Kubernetes.<h2 id=running-stateful-applications-with-kubernetes><a href=#running-stateful-applications-with-kubernetes>Running Stateful Applications With Kubernetes</a></h2><p><img alt="crunchy-postgresql-operator-architecture-2-Operator Architecture-arch" loading=lazy src=https://cdn2.hubspot.net/hubfs/2283855/crunchy-postgresql-operator-architecture-2-Operator%20Architecture-arch.png><p>Kubernetes began as a project that focused on managing the compute workload of container-based applications and did not worry about storage management.
As the project has matured, the Kubernetes community has incorporated the building blocks that make it possible to run stateful workloads, such as databases, in a Kubernetes environment. Tools such as the <a href=https://coreos.com/operators/>Operator Framework</a> have allowed developers to capture the various nuances of managing complex stateful workloads, e.g. a relational database, and take full advantage of Kubernetes' ability to schedule and manage container runtimes.<p>When it comes to managing stateful applications, Kubernetes offers a few different solutions. We are going to focus on using <a href=https://kubernetes.io/docs/concepts/workloads/controllers/deployment/>Deployments</a>. A Deployment manages pods based on a particular container image and operates in a replicated Kubernetes environment, i.e. it uses a <a href=https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/>ReplicaSet</a> and aims to have a given number of pods running at any time.<h2 id=why-run-postgresql-with-kubernetes-deployments><a href=#why-run-postgresql-with-kubernetes-deployments>Why Run PostgreSQL With Kubernetes Deployments?</a></h2><p><img alt=crunchy-postgresql-operator-architecture-2 loading=lazy src=https://cdn2.hubspot.net/hubfs/2283855/crunchy-postgresql-operator-architecture-2.png><p>Running your clusters with Kubernetes Deployments gives you flexibility in how you operate your cluster and other advantages, including:<h3 id=using-different-storage-classes><a href=#using-different-storage-classes>Using Different Storage Classes</a></h3><p>Often in database environments, you will have databases running on different types of disks.
For example, you may want your primary databases to run on fast disks, replica databases to run on medium-speed disks, and development databases to run on slow disks.<p>The PostgreSQL Operator accounts for this by allowing you to create different <a href=https://kubernetes.io/docs/concepts/storage/storage-classes/>storage classes</a> for <a href=https://kubernetes.io/docs/concepts/storage/persistent-volumes/><dfn>persistent volumes</dfn></a> (<abbr>PVs</abbr>). When you provision a new database using the PostgreSQL Operator, whether it's a primary or a replica, you can specify which storage class the new persistent volume should use.<p>Additionally, utilizing Kubernetes Deployments allows you to use different storage engines within the same deployment. For instance, you may want to run your primary instances on preallocated NFS persistent volumes but use dynamically provisioned storage classes, backed by an engine like Gluster, for your replicas.<h3 id=choosing-your-operating-environment><a href=#choosing-your-operating-environment>Choosing Your Operating Environment</a></h3><p>In Kubernetes, a <a href=https://kubernetes.io/docs/concepts/architecture/nodes/>node</a> is a machine (physical or virtual) that is responsible for executing workloads. For our purposes, a node is where the PostgreSQL database software runs.<p>Often a production requirement of running PostgreSQL (let alone any database) is to select the hardware for your database system to operate on. For instance, you typically do not want your primary and replica instances to run on the same physical machine, as any problems with the machine could impact your database system's availability.
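<p>In plain Kubernetes terms, a rule such as "avoid placing two instances of the same database cluster on one node" can be sketched with Pod anti-affinity. This is a hand-written illustration (the label name is hypothetical), not the manifest the PostgreSQL Operator actually generates:<pre><code class=language-yaml># Fragment of a Pod spec: prefer to schedule this Pod away from other
# Pods carrying the same (hypothetical) pg-cluster label
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            pg-cluster: mycluster
        topologyKey: kubernetes.io/hostname
</code></pre>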
Similarly, you may want to have your primary instances run on nodes that have better hardware than your replicas.<p>The PostgreSQL Operator uses <a href=https://kubernetes.io/docs/concepts/configuration/assign-pod-node/>node labels</a> in Kubernetes to create “node affinity” rules, such as “do not deploy a primary and replica to the same node.” Users can specify node label rules on each part of a PostgreSQL Deployment (primary and replicas) that influence which node a database workload is scheduled on. This enables fine-tuned placement of a PostgreSQL database cluster across Kubernetes nodes.<p>Additionally, the PostgreSQL Operator also allows you to <a href=https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/>specify what CPU and memory resources</a> to allocate to each primary or replica instance deployed, giving you finer-grained control over how many resources a database can use.<h3 id=selective-postgresql-version-upgrades><a href=#selective-postgresql-version-upgrades>Selective PostgreSQL Version Upgrades</a></h3><p>A favorite pastime of every PostgreSQL administrator is planning upgrades and ensuring that they can minimize downtime for all affected users. There are many <a href=https://www.postgresql.org/docs/current/static/upgrading.html>different strategies for upgrading</a>, including some special ones for <a href=https://bricklen.github.io/2018-03-27-Postgres-10-upgrade/>very large clusters</a>, all of which involve different levels of outages. A common theme for upgrading multiple clusters involves using “rolling upgrades,” or selectively upgrading each cluster.
In addition to often being easier to manage, this gives the database administrator the ability to roll back an individual cluster should the upgrade fail.<p>The PostgreSQL Operator comes with the ability to perform selective upgrades across all of the PostgreSQL clusters it is managing, using a combination of Kubernetes selectors and individual PostgreSQL images.<h2 id=conclusion><a href=#conclusion>Conclusion</a></h2><p>There are other ways to run PostgreSQL clusters in a Kubernetes environment (in fact, we provide some examples in the <a href=https://github.com/CrunchyData/crunchy-containers>Crunchy Container Suite</a>). The architecture decision to use Deployments in the Crunchy PostgreSQL Operator is specifically geared around flexibility and high availability: you can choose different nodes, resources, and storage solutions for your databases based upon your operating requirements.<p>The Crunchy PostgreSQL Operator is actively being developed - we recently <a href=https://github.com/CrunchyData/postgres-operator/releases/tag/3.1>released version 3.1</a> - to help you take full advantage of running your PostgreSQL workloads on Kubernetes and ensure you can run your PostgreSQL clusters with all your good database administrator habits. ]]></content:encoded>
<category><![CDATA[ Kubernetes ]]></category>
<author><![CDATA[ Crunchy.Data@crunchydata.com (Crunchy Data) ]]></author>
<dc:creator><![CDATA[ Crunchy Data ]]></dc:creator>
<guid isPermaLink="false">https://blog.crunchydata.com/blog/using-kubernetes-deployments-for-running-postgresql</guid>
<pubDate>Mon, 30 Jul 2018 05:00:00 EDT</pubDate>
<dc:date>2018-07-30T09:00:00.000Z</dc:date>
<atom:updated>2018-07-30T09:00:00.000Z</atom:updated></item></channel></rss>