Introduction
In your PostgreSQL extension development based on the C language, you may come across a need to work with structured data like JSON. Naturally, you would probably introduce a third-party JSON parsers such as cJSON or libjannson in your extension. While they are powerful, easy to use and offering many features, it may be unnecessary if we weren’t fully utilizing the advanced features of these libraries. Many times, we just want to read a particular value from JSON or simply iterating through it. PostgreSQL is already equipped with sufficient capabilities to work with JSON data, though they may not be as straightforward as third-party libraries. You could potentially save the costs of introducing third-party JSON library if you could fully grasp what PostgreSQL already has. In this blog, I will show you how to use PostgreSQL’s jsonb APIs to parse, fetch and iterate through a JSON structure.
Parse and Fetch
To use PostgreSQL’s jsonb APIs, you need to include its header in your C extension:
#include "utils/jsonb.h"
Now, we are ready to work with jsonb. Let’s assume we have a char *
that points to a complete JSON structure content. We need to convert it to a Jsonb *
before we can do anything with it.
/* myjson points to a complete JSON content */
void jsonb_example(const char * myjson)
{
Datum jsonb_datum;
Jsonb * jb;
/* we first convert char * to datum representation */
jsonb_datum = DirectFunctionCall1(jsonb_in, CStringGetDatum(myjson));
/* then, we convert it to Jsonb * */
jb = DatumGetJsonbP(jsonb_datum);
}
Let’s say our JSON looks like this:
{
"version": "1.0",
"payload": {
"name": "exampleapp",
"ts_ms": 1720811216000,
"db": "postgresql",
"table": "mytable",
"schema": "myschema"
},
"queries": [
{
"query": "select * from mytable"
},
{
"query": "update mytable set a = 1"
}
]
}
to get the value of db
under the payload
group, we can utilize jsonb’s jsonb_get_element
() routine with this prototype:
Datum jsonb_get_element(Jsonb *jb, Datum *path, int npath,
bool *isnull, bool as_text);
This routine takes a jsonb pointer, which we created earlier that represents the entire example JSON message. It also takes an array of Datum
and npath
in which we have to provide to the function that represents the path
to a JSON element. Please note that this path does not have to lead all the way to a scalar value; it can stop at another inner group, or an array depending on your use case. It also takes a isnull bool pointer
that the function will set to false if the element cannot be found. Lastly, the as_text
boolean value tells the function whether to return the result as Text Datum or Jsonb Datum. My preference is to set it to false so it would return as jsonb datum so I can further operate on it. It is easy to convert it to string representation as well (via stringinfo structure). See below example.
/* myjson points to a complete JSON content */ void jsonb_example(const char * myjson) { Datum jsonb_datum; Jsonb * jb; /* variables needed for fetching element */ Datum datum_elems[2]; Datum res; int numpath = 2; bool isnull; StringInfoData strinfo; /* we first convert char * to datum representation */ jsonb_datum = DirectFunctionCall1(jsonb_in, CStringGetDatum(myjson)); /* then, we convert it to Jsonb * */ jb = DatumGetJsonbP(jsonb_datum); /* prepare element paths to fetch, from outer to inner */ initStringInfo(&strinfo); datum_elems[0] = CStringGetTextDatum("payload"); datum_elems[1] = CStringGetTextDatum("db"); /* fetch it */ res = jsonb_get_element(jb, datum_elems, numPaths, &isnull, false); if (isnull) { /* write NULL if element does not exist */ resetStringInfo(&strinfoo); appendStringInfoString(&strinfoo, "NULL"); } else { Jsonb *resjb = DatumGetJsonbP(res); resetStringInfo(strinfoout); JsonbToCString(&strinfo, &resjb->root, VARSIZE(resjb)); } /* strinfo contains the value of the element at this point. Print it */ elog(WARNING, "data = %s", strinfo.data); }
Now, if we want to fetch a particular value from an array at a particular index. For example, the value of “query” under the array “queries” at index 1 (update mytable set a = 1). We just need to modify the datum_elems
that describes this path. We could simply put a number (as string) in the datum_elems
right after an array element to tell the function that we want to fetch a particular index. See below example:
/* myjson points to a complete JSON content */ void jsonb_example(const char * myjson) { Datum jsonb_datum; Jsonb * jb; /* variables needed for fetching element */ Datum datum_elems[3]; Datum res; int numpath = 3; bool isnull; StringInfoData strinfo; /* we first convert char * to datum representation */ jsonb_datum = DirectFunctionCall1(jsonb_in, CStringGetDatum(myjson)); /* then, we convert it to Jsonb * */ jb = DatumGetJsonbP(jsonb_datum); /* prepare element paths to fetch, from outer to inner */ initStringInfo(&strinfo); datum_elems[0] = CStringGetTextDatum("queries"); datum_elems[1] = CStringGetTextDatum("1"); datum_elems[2] = CStringGetTextDatum("query"); /* fetch it */ res = jsonb_get_element(jb, datum_elems, numPaths, &isnull, false); if (isnull) { /* write NULL if element does not exist */ resetStringInfo(&strinfoo); appendStringInfoString(&strinfoo, "NULL"); } else { Jsonb *resjb = DatumGetJsonbP(res); resetStringInfo(strinfoout); JsonbToCString(&strinfo, &resjb->root, VARSIZE(resjb)); } /* strinfo contains the value of the element at this point. Print it */ elog(WARNING, "data = %s", strinfo.data); }
As you can see, fetch a particular element is pretty straight forward. We just have to prepare the right datum_elems
array that describe a path to a value and everything else stays the same. We could easily write a helper function to make this process easier by automatically creating the datum_elems
from a single string that separates each hierarchy with a dot (ex: “payload.name”, “queries.0.query”…etc).
void getPathElementString(Jsonb * jb, char * path, StringInfoData strinfoout) { Datum * datum_elems = NULL; char * str_elems = NULL, * p = path; int numPaths = 0, curr = 0; char * pathcopy = pstrdup(path); Datum res; bool isnull; if (!strinfoout) { elog(WARNING, "strinfo is null"); return -1; } /* Count the number of elements in the path */ if (strstr(pathcopy, ".")) { while (*p != '\0') { if (*p == '.') { numPaths++; } p++; } numPaths++; /* Add the last one */ } else { numPaths = 1; } datum_elems = palloc0(sizeof(Datum) * numPaths); /* Parse the path into elements */ if (strstr(pathcopy, ".")) { str_elems= strtok(pathcopy, "."); if (str_elems) { datum_elems[curr] = CStringGetTextDatum(str_elems); curr++; while ((str_elems = strtok(NULL, "."))) { datum_elems[curr] = CStringGetTextDatum(str_elems); curr++; } } } else { /* only one level, just use pathcopy*/ datum_elems[curr] = CStringGetTextDatum(pathcopy); } /* Get the element from JSONB */ res = jsonb_get_element(jb, datum_elems, numPaths, &isnull, false); if (isnull) { resetStringInfo(strinfoout); appendStringInfoString(strinfoout, "NULL"); } else { Jsonb *resjb = DatumGetJsonbP(res); resetStringInfo(strinfoout); JsonbToCString(strinfoout, &resjb->root, VARSIZE(resjb)); } pfree(datum_elems); pfree(pathcopy); } /* myjson points to a complete JSON content */ void jsonb_example(const char * myjson) { Datum jsonb_datum; Jsonb * jb; StringInfoData strinfo; /* we first convert char * to datum representation */ jsonb_datum = DirectFunctionCall1(jsonb_in, CStringGetDatum(myjson)); /* then, we convert it to Jsonb * */ jb = DatumGetJsonbP(jsonb_datum); initStringInfo(&strinfo); getPathElementString(jb, "payload.db", &strinfo); elog(WARNING, "payload.db= %s", strinfo.data); getPathElementString(jb, "queries.0.query", &strinfo); elog(WARNING, "queries.0.query= %s", strinfo.data); }
Iterate through Entire json Structure
Now, we know how to fetch a particular value from JSON given that you know what you are looking for. There are times that we have to iterate through the entire JSON to construct an internal data structure for some purpose. In this case, we can utilize Jsonb’s iterate routines. The example code below will create an jsonb iterator and then attempt to iterate through every elements within it. It will indicate when the iterator is about to dive into a group or array and when it is about to exit from a group or array. You will have a chance to save the key
and value
values as desired. The values read by jsonb could be expressed in different data types such as string, binary, numeric…etc. The example below try to convert them to string (except for binary) for printing.
/* myjson points to a complete JSON content */ void jsonb_iterate_example(const char * myjson) { Datum jsonb_datum; Jsonb * jb; /* iterator related */ JsonbIterator *it; JsonbValue v; JsonbIteratorToken r; char * key = NULL; char * value = NULL; /* we first convert char * to datum representation */ jsonb_datum = DirectFunctionCall1(jsonb_in, CStringGetDatum(myjson)); /* then, we convert it to Jsonb * */ jb = DatumGetJsonbP(jsonb_datum); it = JsonbIteratorInit(&jb->root); while ((r = JsonbIteratorNext(&it, &v, false)) != WJB_DONE) { switch (r) { case WJB_BEGIN_OBJECT: elog(WARNING, "begin group --------------------"); break; case WJB_END_OBJECT: elog(WARNING, "end group --------------------"); break; case WJB_BEGIN_ARRAY: elog(WARNING, "begin array --------------------"); break; case WJB_END_ARRAY: elog(WARNING, "end array --------------------"); break; case WJB_KEY: key = pnstrdup(v.val.string.val, v.val.string.len); elog(WARNING, "key: %s", key); break; case WJB_VALUE: case WJB_ELEM: switch (v.type) { case jbvNull: elog(WARNING, "value: NULL"); break; case jbvString: value = pnstrdup(v.val.string.val, v.val.string.len); elog(WARNING, "string value: %s", value); break; case jbvNumeric: { value = DatumGetCString(DirectFunctionCall1(numeric_out, PointerGetDatum(v.val.numeric))); elog(WARNING, "numeric value: %s", value); break; } case jbvBool: elog(WARNING, "boolean value: %s", v.val.boolean ? "true" : "false"); if (v.val.boolean) value = pnstrdup("true", strlen("true")); else value = pnstrdup("false", strlen("false")); break; case jbvBinary: elog(WARNING, "binary value"); break; default: elog(WARNING, "unknown value type: %d", v.type); break; } break; default: elog(WARNING, "Unknown token: %d", r); break; } if (key != NULL && value != NULL) { pfree(key); pfree(value); key = NULL; value = NULL; } }
Summary
Jsonb APIs in PostgreSQL can do much more than a simple fetch and iteration. For example, to push additional values to an existing jsonb structure. Today we mainly focus on fetch and iteration which in my opinion are the most common use cases when dealing with JSON. I hope the code examples shared here could help you and prevent you from introducing third-party json parsers to your extension development which may be an overkill.
Cary is a Senior Software Developer in HighGo Software Canada with 8 years of industrial experience developing innovative software solutions in C/C++ in the field of smart grid & metering prior to joining HighGo. He holds a bachelor degree in Electrical Engineering from University of British Columnbia (UBC) in Vancouver in 2012 and has extensive hands-on experience in technologies such as: Advanced Networking, Network & Data security, Smart Metering Innovations, deployment management with Docker, Software Engineering Lifecycle, scalability, authentication, cryptography, PostgreSQL & non-relational database, web services, firewalls, embedded systems, RTOS, ARM, PKI, Cisco equipment, functional and Architecture Design.
Recent Comments