What is a Table Access Method?
A table access method is the interface between the PostgreSQL core and data storage management. Since PostgreSQL 12, it has been possible to define your own custom table access method that stores data in a custom format by implementing more than 45 callback functions of the interface API. Implementing all of them is generally a difficult task, because you are essentially defining your own storage engine, which has to cooperate with the PostgreSQL core to support:
- sequential scan
- parallel scan
- index fetch
- query estimate
- insert, update, delete, truncate
- table creation, vacuum, vacuum full
OrioleDB, for example, is a custom storage engine for PostgreSQL that provides a more modern way to store data.
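One way to picture the callback interface described above is as a struct of function pointers that the core calls without knowing how rows are stored. The sketch below is a simplified, self-contained model and not PostgreSQL code: `ToyTable`, `ToyTableAm`, and every `toy_*` and `core_*` name are hypothetical, and only three callbacks are shown. The real callback table is the `TableAmRoutine` struct, which has many more entries.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_ROWS 8

/* Hypothetical storage: a fixed array of ints stands in for a table. */
typedef struct ToyTable {
    int    rows[TOY_MAX_ROWS];
    size_t nrows;
} ToyTable;

/* Hypothetical callback table: the "core" only ever calls through these. */
typedef struct ToyTableAm {
    /* returns 0 on success, -1 if the table is full */
    int  (*insert)(ToyTable *t, int value);
    /* returns 1 and fills *out if a row was found, 0 at end of scan */
    int  (*scan_next)(const ToyTable *t, size_t *pos, int *out);
    void (*truncate)(ToyTable *t);
} ToyTableAm;

static int toy_insert(ToyTable *t, int value)
{
    if (t->nrows >= TOY_MAX_ROWS)
        return -1;
    t->rows[t->nrows++] = value;
    return 0;
}

static int toy_scan_next(const ToyTable *t, size_t *pos, int *out)
{
    if (*pos >= t->nrows)
        return 0;
    *out = t->rows[(*pos)++];
    return 1;
}

static void toy_truncate(ToyTable *t)
{
    t->nrows = 0;
}

/* One concrete "access method": a filled-in callback table. */
static const ToyTableAm toy_am = { toy_insert, toy_scan_next, toy_truncate };

/* "Core" code: sums every row using only the callbacks. */
static int core_sum_all(const ToyTableAm *am, const ToyTable *t)
{
    size_t pos = 0;
    int    value;
    int    sum = 0;

    assert(am != NULL && t != NULL);
    while (am->scan_next(t, &pos, &value))
        sum += value;
    return sum;
}
```

Because `core_sum_all` only goes through the function pointers, a different storage layout could be swapped in by supplying another `ToyTableAm` without touching the caller, which is the same idea a custom access method relies on.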
Heap is the default table access method, and the only one that ships with PostgreSQL itself. The table access method interface is defined in src/include/access/tableam.h, and the heap access method provides the built-in implementation of it.
There are different interface callback functions for different purposes. If you examine them closely, you will notice that the functions involving data retrieval (scan) and data modification (insert, update, etc.) are invoked by the PostgreSQL core with an input data structure called a Tuple Table Slot (TTS).
Regardless of whether you are using the heap access method or defining your own, you will receive the data in TTS format. It is the access method’s responsibility to understand this structure and convert it into the format (for example, a heap tuple) that is physically stored on disk.
Likewise, when PostgreSQL requests data from the access method, the access method is responsible for converting the stored data format back to TTS.
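The two directions of conversion can be illustrated with a toy round trip. This is a minimal sketch under stated assumptions, not how heap stores rows: `ToySlot`, `ToyStoredRow`, and both functions are hypothetical, every column is a 64-bit integer, and the "on-disk" layout (one bitmap byte where a set bit means NULL, followed by the non-null values) is invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_NCOLS 3

/* Hypothetical slot: the in-memory row format the "executor" understands. */
typedef struct ToySlot {
    int64_t values[TOY_NCOLS];
    bool    isnull[TOY_NCOLS];
} ToySlot;

/* Hypothetical stored row: one NULL-bitmap byte, then non-null values only. */
typedef struct ToyStoredRow {
    uint8_t bytes[1 + TOY_NCOLS * sizeof(int64_t)];
    size_t  len;
} ToyStoredRow;

/* Insert path: slot -> storage format. */
static void toy_slot_to_stored(const ToySlot *slot, ToyStoredRow *row)
{
    size_t off = 1;

    row->bytes[0] = 0;
    for (int i = 0; i < TOY_NCOLS; i++) {
        if (slot->isnull[i]) {
            row->bytes[0] |= (uint8_t)(1u << i);   /* mark column i as NULL */
            continue;
        }
        memcpy(row->bytes + off, &slot->values[i], sizeof(int64_t));
        off += sizeof(int64_t);
    }
    row->len = off;
}

/* Scan path: storage format -> slot. */
static void toy_stored_to_slot(const ToyStoredRow *row, ToySlot *slot)
{
    size_t off = 1;

    for (int i = 0; i < TOY_NCOLS; i++) {
        slot->isnull[i] = ((row->bytes[0] >> i) & 1u) != 0;
        slot->values[i] = 0;
        if (!slot->isnull[i]) {
            memcpy(&slot->values[i], row->bytes + off, sizeof(int64_t));
            off += sizeof(int64_t);
        }
    }
    assert(off == row->len);   /* every stored byte was consumed */
}
```

The caller on either side only ever sees `ToySlot`; the packed layout stays private to the two conversion functions, which is the division of responsibility the paragraphs above describe.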
What is Tuple Table Slot (TTS)?
It is basically the format understood by PostgreSQL (Executor module specifically).
- TTS is an internal data structure that holds a single row of data, including column values.
- It is a basic building block of statement (query) processing.
- It is used to store rows returned by queries, and also to store rows to be inserted or updated.
- It is the common data format between the executor and the table access method.
- Its life cycle follows that of query processing.
- The TTS operation callback functions tell PostgreSQL how to convert a TTS to a heap tuple or to other tuple data formats.
- The structure is defined in src/include/executor/tuptable.h
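The role of the TTS operation callbacks can be sketched as a small table of function pointers attached to each slot. The model below is illustrative only: `MiniSlot`, `MiniSlotOps`, and the two made-up formats ("minimal" and "heap-like") are assumptions for the example, not the layouts PostgreSQL uses. In PostgreSQL the corresponding callback struct is `TupleTableSlotOps`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MINI_NCOLS 2

struct MiniSlot;

/* Hypothetical per-slot-type callbacks, chosen when the slot is created. */
typedef struct MiniSlotOps {
    const char *name;
    /* Write the slot's row into buf in this slot type's tuple format;
     * returns the number of bytes written. */
    size_t (*copy_tuple)(const struct MiniSlot *slot, unsigned char *buf);
} MiniSlotOps;

typedef struct MiniSlot {
    const MiniSlotOps *ops;          /* how to convert this slot */
    int values[MINI_NCOLS];          /* column values */
} MiniSlot;

/* "Minimal" format: column values only. */
static size_t mini_copy_minimal(const MiniSlot *slot, unsigned char *buf)
{
    memcpy(buf, slot->values, sizeof(slot->values));
    return sizeof(slot->values);
}

/* "Heap-like" format: a one-int header (column count) before the values. */
static size_t mini_copy_heaplike(const MiniSlot *slot, unsigned char *buf)
{
    int ncols = MINI_NCOLS;

    memcpy(buf, &ncols, sizeof(ncols));
    memcpy(buf + sizeof(ncols), slot->values, sizeof(slot->values));
    return sizeof(ncols) + sizeof(slot->values);
}

static const MiniSlotOps mini_ops_minimal  = { "minimal",   mini_copy_minimal };
static const MiniSlotOps mini_ops_heaplike = { "heap-like", mini_copy_heaplike };

/* Callers convert a slot without knowing which format it produces. */
static size_t mini_slot_copy(const MiniSlot *slot, unsigned char *buf)
{
    assert(slot != NULL && slot->ops != NULL);
    return slot->ops->copy_tuple(slot, buf);
}
```

Two slots holding the same values but pointing at different `ops` produce different byte layouts from the same `mini_slot_copy` call, so the conversion logic travels with the slot rather than with the caller.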
What is Heap Tuple?
It is basically the format stored on disk.
- Row Representation: Heap tuples are the physical storage representation of rows in a PostgreSQL table. Each heap tuple contains the actual data values for each column in a row.
- Visibility Information: Heap tuples include metadata to track their visibility, such as hint bit flags, etc. This is crucial for PostgreSQL’s Multi-Version Concurrency Control (MVCC) system, allowing transactions to work with consistent snapshots of the data.
- Support for Updates: When a row is updated, PostgreSQL marks the old heap tuple as “dead” (assigns it an xmax value) and creates a new version of the tuple with the updated data. This versioning system supports data consistency.
- Indexed for Efficiency: A heap tuple contains a ctid, which represents the physical location of that heap tuple (which page number and at what offset). An index normally contains a tid, which lets it look up a heap tuple directly without scanning the entire table to find a match.
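The update and lookup behaviour described in the list above can be modelled in a few lines. This is a deliberately simplified sketch: all `Ver*` names are hypothetical, there is a single page, and the visibility rule assumes every transaction has committed and that transaction IDs only grow. The `xmin` field (the creating transaction) is added here as an assumption to make the example work. PostgreSQL's real visibility checks are considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VER_MAX 8

/* Hypothetical physical location: page number plus offset within the page. */
typedef struct VerTid {
    uint32_t page;
    uint16_t offset;
} VerTid;

/* Hypothetical row version: visibility metadata followed by user data. */
typedef struct VerTuple {
    uint32_t xmin;   /* transaction that created this version */
    uint32_t xmax;   /* transaction that replaced it; 0 means still live */
    VerTid   ctid;   /* own location, or location of the newer version */
    int      data;   /* stand-in for the user's column values */
} VerTuple;

/* A single "page"; array index i corresponds to (page 0, offset i + 1). */
typedef struct VerPage {
    VerTuple tuples[VER_MAX];
    size_t   ntuples;
} VerPage;

static VerTid ver_insert(VerPage *p, uint32_t xid, int data)
{
    VerTid tid;

    assert(p->ntuples < VER_MAX);
    tid.page = 0;
    tid.offset = (uint16_t)(p->ntuples + 1);
    p->tuples[p->ntuples] =
        (VerTuple){ .xmin = xid, .xmax = 0, .ctid = tid, .data = data };
    p->ntuples++;
    return tid;
}

/* Update: append a new version, then stamp the old one with xmax. */
static VerTid ver_update(VerPage *p, VerTid old, uint32_t xid, int data)
{
    VerTid    newtid = ver_insert(p, xid, data);
    VerTuple *o = &p->tuples[old.offset - 1];

    o->xmax = xid;
    o->ctid = newtid;   /* old version points at its successor */
    return newtid;
}

/* Simplified visibility rule (assumes all transactions committed). */
static bool ver_visible(const VerTuple *t, uint32_t snapshot_xid)
{
    return t->xmin <= snapshot_xid &&
           (t->xmax == 0 || t->xmax > snapshot_xid);
}

/* Index-style lookup: go straight to a tuple by tid, no scan needed. */
static const VerTuple *ver_fetch(const VerPage *p, VerTid tid)
{
    assert(tid.page == 0 && tid.offset >= 1 && tid.offset <= p->ntuples);
    return &p->tuples[tid.offset - 1];
}
```

In this model, a reader whose snapshot predates the update still sees the old version, while a later reader sees only the new one, and both reach their tuple by tid without walking the page.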
Tuple Table Slot vs Heap Tuple
Tuple Table Slot
- data format used by the executor module in the PostgreSQL kernel
- used internally; it includes:
  - total number of columns
  - description of columns
  - datum array
  - NULL array
Heap Tuple
- data format used by the heap access method
- the format to be stored on disk; it includes:
  - visibility information
  - physical location
  - actual user data
The access method is like a bridge sitting in the middle, coordinating the instructions between the PostgreSQL core and the actual data storage.
This blog intends to give a brief overview of the table access method API and to describe how it coordinates between the Tuple Table Slot and heap tuple data formats. The table access method is a huge architectural topic, and we will gradually explore the different API calls in subsequent blogs.
Cary is a Senior Software Developer at HighGo Software Canada with 8 years of industrial experience developing innovative software solutions in C/C++ in the field of smart grid & metering prior to joining HighGo. He earned a bachelor's degree in Electrical Engineering from the University of British Columbia (UBC) in Vancouver in 2012 and has extensive hands-on experience in technologies such as: Advanced Networking, Network & Data security, Smart Metering Innovations, deployment management with Docker, Software Engineering Lifecycle, scalability, authentication, cryptography, PostgreSQL & non-relational database, web services, firewalls, embedded systems, RTOS, ARM, PKI, Cisco equipment, functional and Architecture Design.