Rise and Fall for an expected feature in PostgreSQL – Transparent Data Encryption

Enterprise PostgreSQL Solutions

TDE is an important feature missing from PostgreSQL, it is also a crucial tick in the box which people are deciding to move to PostgreSQL from other DBMS systems. The community has been trying to add this feature in PG however it has taken too long because of lack of attention from serious community members for reaching consensus on basic spec and design.

In previous blogs, we have talked about the concept of TDE and its significance. Here we summarize the discussion and development status of TDE in PostgreSQL.

Discussion in the community

TDE has been discussed by the community since 2016. Although there are still many controversies, at least the following facts have been confirmed in the first version:

1.Threat model:

    Although everyone’s perception of “security” is different, the same thing is that everyone thinks that the most important thing is to clarify the security risk model of TDE before really discussing how to implement it. At present, it can be considered that the security threat of TDE in PG is mainly the risk of directly stealing the persistent files in the database. Therefore, although it does not seem so secure, encryption operations and keys can exist in memory.

2.Granularity of encryption:

    In the first version of TDE, only cluster level encryption is implemented. The reason for this is that it can meet the security requirements, and secondly, it is relatively simple to implement.

    Although both MySQL and Oracle DB have adopted the tablespace level encryption, there is a dispute about the development of tablespace level encryption in PG. It should be pointed out that in MySQL/Oracle, the tables and indexes under each tablespace are clustered into one file, but this is not the case in PG. Therefore, in PG, the tablespace level and the table level are actually similar, which can be attributed to the attempt to selectively partially encrypt the data to reduce the performance impact. However, after distinguishing the tables that need to be encrypted and those that do not need to be encrypted, we also need to distinguish their WAL record, which brings some complexity.

    As for the encryption of a column, most people agree that it is more suitable to implement the method of trigger + view.

3.Encryption method:

    For files that contain user data and need to be encrypted, there is a detailed list in the wiki. It can be divided into: table and index files, WAL files, temporary files, etc.

    Table files and index files have a formatted storage structure. They can be encrypted paged by page as the buffer manager writes the encrypted data to disk. Each page of the table file has a fixed size (8K by default), and the LSN and page number of the header can be filled as IV, which can be used for AES encryption in CTR mode. Of course, using LSN and page number as IV padding means that they should not be encrypted. By the way, the CRC information in the header is also recommended not to be encrypted, so that we can conduct integrity check without decryption.

    Because the hard disk brushing methods of WAL and table files are different, there are some differences between the encryption method and table file. The common point is that LSN is also used as IV. Considering that in order to avoid the same risk caused by the combination of key + IV and table encryption, WAL uses different encryption keys from the table file. At the same time, it also meets the requirements that different nodes can use different table file encryption keys when using the same WAL encryption key in the case of master and standby.

    Temporary files have not been discussed too much. It is preliminarily considered that these files may need to generate random IV for encryption algorithm and storage.

Current development progress

After discussion, the community decided that in the first version, the development of KMS should be completed first, and the relevant discussion in this mailing list. So far (Early December 2020), the latest patch of KMS released by Sawada-san is V13 version. After that, Bruce made some changes to the passphrase, added some support for front-end tools and released the patch. We can learn how to use it from documents and mailing lists:

1.Compilation phrase:

    KMS is available only when configured –with-openssl.

2.Cluster initialization phrase:

    The –cluster-passphrase-command(-c) option is added to initdb to specify how to get the passphrase(such as read from a file). Bruce gives an example of manual input:

$ initdb -R -c '/u/postgres/tmp/pass_fd.sh "Enter password:" %R'
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
Key management system is enabled.
fixing permissions on existing directory /u/pgsql/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... America/New_York
creating configuration files ... ok
running bootstrap script ...
--> Enter password:ok
performing post-bootstrap initialization ...
--> Enter password:ok
syncing data to disk ... ok
initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
Success. You can now start the database server using:
  pg_ctl -D /u/pgsql/data -l logfile start

The input passphrase will derive the KEK(key encryption key) and HMAC. They are used to wrap the randomly generated DEKs(data encryption key) in the initialization phase and store them to the hard disk.

3.Service startup phrase:

$ pg_ctl -R -l /u/pg/server.log start
waiting for server to start...
--> Enter password: done
server started

When you start the server with the ‘-R’ option, you are asked to provide passphrase again. It then generates KEK and HMAC again, and computes DEK’s HMAC before unwrap DEK to verify whether the passphrase is correct.

4.Server running:

    In the process of server running, the data encryption key will be kept in memory, which is convenient for encryption operation

Controversy and personal thinking

Although the development has started as planned and KMS module is planned to be launched in PG13 (now it has been postponed to version 14), there are still some doubts…

1.PostgreSQL Kernel Modification:

    There is a misconception that TDE is only required by a small number of users therefore it shouldn’t modify the kernel code too much. They suggested that maybe we should consider using hook function or other ways. However, before the extension is loaded, some initialization data files have been generated in the initialization phase of the database instance, which needs to be encrypted.

    Obviously, if we encrypt existing files after startup, it will increase a lot of uncertainty. What we can do now is to implement TDE related functions in a separate source code file as far as possible to reduce the modification of existing code logic.

2.The introduction of encryption algorithm:

    Based on the same reason as #1, how to select encryption algorithm in the initialization phase is also a concern. At present, patch can only be used in the case of enable-openssl. People wonder if there are some methods that can use the algorithm provided by extension such as pgcrypto for encryption of key or data.

    What is certain is that we will never add encryption algorithms to the PG kernel, so if we consider that we can not use the extension loading mode, maybe we could consider loading the standard encryption interface(such as pkcs#11) as a dynamic library? This proposal also applies to the issue of how HSM is supported…

3.Key management or encryption functions provided to users:

    Now we have KMS in the kernel, we should use it to manage the key for encryption extensions like pgcrypto, but how? And as mentioned in #1, we should not prevent people who are not interested in TDE from using encryption extension normally.

    Because our threat model is only for rest files, we should only consider that the key should not appear in the server log. This is still an open topic in the community, and I think we can think more about it later.

Anyway, TDE is a complex and huge project, and the best way is to complete this function is reach a consensus in a small functional scope and then gradually promote it. I hope anyone who is interested in it can pay attention to the relevant information and put forward their own opinions.