1. Overview
Sometimes you may need to manage large objects, such as CLOB, BLOB, and BFILE, in PostgreSQL. There are two ways to deal with large objects in PostgreSQL: one is to use existing data types, i.e. bytea for binary large objects and text for character-based large objects; the other is to use pg_largeobject. This blog explains how to use pg_largeobject.
2. pg_largeobject
pg_largeobject is one of the large-object solutions provided by PostgreSQL. It allows stream-style access to user data stored in a special large-object structure, which is useful when working with data values that are too large to manipulate conveniently as a whole.
pg_largeobject is a system table used to store the actual large objects. Each large object also has an entry in the system table pg_largeobject_metadata. Large objects managed through pg_largeobject can be created, modified, and deleted using a read/write API, and each large object can hold up to 4TB of data.
The other solution, based on the existing data types bytea and text, is built on top of TOAST tables, which limits a large object's size to 1GB.
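For comparison, here is a minimal sketch of the bytea-based approach (the table and column names are hypothetical):

```sql
-- A hypothetical table storing binary content inline via a bytea column.
-- Values are TOASTed automatically behind the scenes; each value is
-- limited to 1GB and is read or written as a whole, with no seeking.
CREATE TABLE blob_store (
    id      serial PRIMARY KEY,
    content bytea
);

INSERT INTO blob_store (content) VALUES ('\x48656c6c6f');
SELECT id, octet_length(content) FROM blob_store;
```

This is simpler when values are small and always accessed whole; pg_largeobject is the better fit when you need streaming or in-place modification.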
Here are the schemas of the two system tables designed for pg_largeobject.
postgres=# \d+ pg_largeobject;
Table "pg_catalog.pg_largeobject"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+----------+-------------+--------------+-------------
loid | oid | | not null | | plain | | |
pageno | integer | | not null | | plain | | |
data | bytea | | not null | | extended | | |
Indexes:
"pg_largeobject_loid_pn_index" PRIMARY KEY, btree (loid, pageno)
Access method: heap
postgres=# \d+ pg_largeobject_metadata;
Table "pg_catalog.pg_largeobject_metadata"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
----------+-----------+-----------+----------+---------+----------+-------------+--------------+-------------
oid | oid | | not null | | plain | | |
lomowner | oid | | not null | | plain | | |
lomacl | aclitem[] | | | | extended | | |
Indexes:
"pg_largeobject_metadata_oid_index" PRIMARY KEY, btree (oid)
Access method: heap
3. pg_largeobject interfaces
PostgreSQL provides a client interface for accessing large objects through the libpq library. The interface resembles the Unix file-system interface, with analogues of create, open, read, write, lseek, etc.
Note: when using this interface to manipulate a large object, the calls must happen within an SQL transaction block, since large-object file descriptors are only valid for the duration of the transaction in which they were opened. Below are all the server-side functions related to pg_largeobject (the pattern also matches a few unrelated functions such as log, log10, lower, lower_inc, and lower_inf).
postgres=# \dfS lo*
List of functions
Schema | Name | Result data type | Argument data types | Type
------------+---------------+------------------+---------------------------+------
pg_catalog | lo_close | integer | integer | func
pg_catalog | lo_creat | oid | integer | func
pg_catalog | lo_create | oid | oid | func
pg_catalog | lo_export | integer | oid, text | func
pg_catalog | lo_from_bytea | oid | oid, bytea | func
pg_catalog | lo_get | bytea | oid | func
pg_catalog | lo_get | bytea | oid, bigint, integer | func
pg_catalog | lo_import | oid | text | func
pg_catalog | lo_import | oid | text, oid | func
pg_catalog | lo_lseek | integer | integer, integer, integer | func
pg_catalog | lo_lseek64 | bigint | integer, bigint, integer | func
pg_catalog | lo_open | integer | oid, integer | func
pg_catalog | lo_put | void | oid, bigint, bytea | func
pg_catalog | lo_tell | integer | integer | func
pg_catalog | lo_tell64 | bigint | integer | func
pg_catalog | lo_truncate | integer | integer, integer | func
pg_catalog | lo_truncate64 | integer | integer, bigint | func
pg_catalog | lo_unlink | integer | oid | func
pg_catalog | log | double precision | double precision | func
pg_catalog | log | numeric | numeric | func
pg_catalog | log | numeric | numeric, numeric | func
pg_catalog | log10 | double precision | double precision | func
pg_catalog | log10 | numeric | numeric | func
pg_catalog | loread | bytea | integer, integer | func
pg_catalog | lower | anyelement | anymultirange | func
pg_catalog | lower | anyelement | anyrange | func
pg_catalog | lower | text | text | func
pg_catalog | lower_inc | boolean | anymultirange | func
pg_catalog | lower_inc | boolean | anyrange | func
pg_catalog | lower_inf | boolean | anymultirange | func
pg_catalog | lower_inf | boolean | anyrange | func
pg_catalog | lowrite | integer | integer, bytea | func
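As a sketch of the transaction-block requirement, the following reads from the large object imported in the examples below (OID 16385). It assumes lo_open returns descriptor 0, which is the case for the first object opened in a transaction; x'40000' is INV_READ, explained later.

```sql
-- Descriptors returned by lo_open are only valid inside the
-- enclosing transaction, so wrap the calls in BEGIN ... COMMIT.
BEGIN;
SELECT lo_open(16385, x'40000'::int);  -- open read-only; returns descriptor 0
SELECT loread(0, 32);                  -- read up to 32 bytes via descriptor 0
SELECT lo_close(0);                    -- close the descriptor
COMMIT;
```

Running loread outside a transaction block fails, because the descriptor is gone as soon as the implicit transaction around lo_open commits.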
4. Examples
To better understand how to use pg_largeobject, here are some examples.
- create a large object
postgres=# select lo_create(0);
lo_create
-----------
16384
(1 row)
postgres=# select * from pg_largeobject_metadata;
oid | lomowner | lomacl
-------+----------+--------
16384 | 10 |
(1 row)
postgres=# select * from pg_largeobject;
loid | pageno | data
------+--------+------
(0 rows)
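Besides lo_create(0), which asks the system to pick an unused OID, lo_creat and lo_from_bytea offer alternative ways to create a large object (a sketch; the returned OIDs will vary per installation):

```sql
SELECT lo_creat(-1);                       -- the integer argument is unused; returns a fresh OID
SELECT lo_from_bytea(0, 'initial bytes');  -- 0 means "choose an unused OID"; creates with content
SELECT * FROM pg_largeobject_metadata;
```

lo_from_bytea is convenient when the initial content is already at hand, since it creates and writes in one call.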
- import a large object
create a simple text file,
$ echo "this is a test on pg_largeobject." > /tmp/lo_test.txt
Import the text file into pg_largeobject,
postgres=# select lo_import('/tmp/lo_test.txt');
lo_import
-----------
16385
(1 row)
postgres=# select * from pg_largeobject_metadata;
oid | lomowner | lomacl
-------+----------+--------
16384 | 10 |
16385 | 10 |
(2 rows)
Display the content after it has been imported into pg_largeobject:
postgres=# set bytea_output = 'escape';
SET
postgres=# select * from pg_largeobject;
loid | pageno | data
-------+--------+---------------------------------------
16385 | 0 | this is a test on pg_largeobject.\012
(1 row)
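For simple whole-value access, the server-side convenience functions lo_get and lo_put work without explicit open/seek/close (a sketch against the object imported above):

```sql
SELECT lo_get(16385);        -- the whole content as bytea
SELECT lo_get(16385, 0, 7);  -- 7 bytes starting at offset 0
```

lo_put(oid, offset, bytea) is the writing counterpart, overwriting bytes at the given offset in a single call.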
- manipulate a large object
One big advantage of using pg_largeobject is that it allows you to modify a large object in place. Here is an example that rewrites the tail of the large object imported above.
begin;
select lo_open(16385, x'60000'::int);       -- open read/write; returns descriptor 0
select lo_lseek(0, 32, 0);                  -- seek descriptor 0 to offset 32 (whence 0 = SEEK_SET)
select lowrite(0, ', + large object.\012'); -- overwrite from offset 32 onward
commit;
postgres=# select * from pg_largeobject;
loid | pageno | data
-------+--------+-------------------------------------------------------
16385 | 0 | this is a test on pg_largeobject, + large object.\012
(1 row)
Here, x'60000'::int sets the large object access mode to INV_WRITE | INV_READ, and the 0 passed to lo_lseek and lowrite is the descriptor returned by lo_open in this transaction. The access modes are defined in the header file libpq/libpq-fs.h as below:
#define INV_WRITE 0x00020000
#define INV_READ 0x00040000
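You can verify the mode arithmetic directly: 0x20000 | 0x40000 = 0x60000, i.e. 393216 in decimal.

```sql
SELECT x'20000'::int | x'40000'::int AS inv_write_or_read,
       x'60000'::int AS mode;   -- both columns show 393216
```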
- export a large object to an external file
Besides accessing a large object inside PostgreSQL, you can also export it to an external file.
postgres=# select lo_export(16385, '/tmp/lo_test_new.txt');
Then you can check the content as a normal file.
$ cat /tmp/lo_test_new.txt
this is a test on pg_largeobject, + large object.
- wrap the operations into a function
You can also wrap these server-side functions to customize access to large objects. Here is a simple example using PL/pgSQL.
DROP FUNCTION IF EXISTS my_lo_append;
CREATE OR REPLACE FUNCTION my_lo_append(oid, bytea)
RETURNS oid AS $$
DECLARE
fd integer;
bytes integer;
BEGIN
fd := lo_open($1, x'60000'::int);  -- open for read/write
bytes := lo_lseek(fd, 0, 2);       -- seek to end of object (whence 2 = SEEK_END)
bytes := lowrite(fd, $2);          -- append the new data
PERFORM lo_close(fd);
RETURN $1;
END;
$$ LANGUAGE plpgsql STRICT;
Then call the customized function to append some data:
postgres=# select my_lo_append(16385::oid, 'abc123'::bytea);
my_lo_append
--------------
16385
(1 row)
postgres=# set bytea_output = 'escape';
SET
postgres=# select * from pg_largeobject;
loid | pageno | data
-------+--------+---------------------------------------------
16385 | 0 | this is a test on pg_largeobject.\012abc123
(1 row)
postgres=# select lo_export(16385, '/tmp/lo_test_new.txt');
lo_export
-----------
1
(1 row)
After exporting, you can check the manipulated large object as a normal file.
$ cat /tmp/lo_test_new.txt
this is a test on pg_largeobject.
abc123
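Finally, large objects are not removed automatically when you are done with them; lo_unlink deletes the object and its metadata entry (a sketch using the OID from the examples above):

```sql
SELECT lo_unlink(16385);               -- returns 1 on success
SELECT * FROM pg_largeobject_metadata; -- the entry for 16385 is gone
```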
5. Summary
In this blog, I explained how to use pg_largeobject to deal with large objects in PostgreSQL and provided some simple examples. I hope it helps.
A software developer specializing in C/C++ programming, with experience in hardware, firmware, software, databases, networking, and system architecture. Currently working at HighGo Software Inc. as a senior PostgreSQL architect.