This blog is aimed at beginners trying to learn the basics of PostgreSQL and Python but already have some experience under their belt. For this tutorial, we will assume you have PostgreSQL correctly installed on Ubuntu. All of these steps were done using PostgreSQL 16 (development version) and Python 3.11.4 on Ubuntu 23.04. We’ll go over an example of setting up a PostgreSQL server and how we can interact with this database using Python.
For our example, we’ll be using Psycopg2, a very convenient Python library that wraps libpq for fast and efficient database manipulation. We can install it using pip like so:
$ pip install psycopg2
Before we start using Python, we need a database to work with. Let’s create a simple one:
$ mkdir data/ $ initdb -D data/ $ pg_ctl -D data/ -l logfile start $ createuser -h localhost -p 5432 -s postgres
With our libraries installed and our database running, we can start scripting!
The first thing we need to do when working on a database is to get an active connection that we can send queries over. Psycopg2 handles this very easily with its
import psycopg2 conn = psycopg2.connect( database="postgres", host="localhost", port="5432", user="postgres", password="" )
Let’s analyze this code snippet:
- Line 1: Our import statement, telling Python that we want to use the
psycopg2library so we can call the functions it has defined.
- Line 3: The
connect()function mentioned prior, this returns a connection object we call
conn. If successful we can use this object to get a
cursorobject which allows us to interact with the database.
- Line 4 – 8: The
database: The database name, we’ve used the default
postgreswhich is present automatically on any regular install.
host: The hostname that our database is running on, in other situations this could be an IP address or a name that resolves to one. In our situation we simply use
port: The port number our database is running on, if we don’t change any configuration parameters then Postgres will default to 5432. If you do configure a different port you need to match it here.
user: The user we want to connect as. This user should have the correct privileges for the queries you want to execute. In our setup steps, we created this user as one is not created by default.
password: The password for the user we are connecting as. If you define a password for your user you need to include it here. In our case, we did not define a password so this parameter can be left blank or excluded entirely.
If everything goes smoothly then we should get a valid connection object that we can use to create a cursor. This is critical as a cursor allows us to run queries. We can get this cursor like so:
cursor = conn.cursor()
Yes, it’s that simple! Now let’s get running some queries.
With our cursor object ready we can send SQL queries directly to the database.
cursor.execute("CREATE TABLE t1 (a int, b int)") cursor.execute("INSERT INTO t1 values (1,2)") cursor.execute("INSERT INTO t1 values (4,5)") cursor.execute("INSERT INTO t1 values (6,7)") cursor.execute("INSERT INTO t1 values (7,8)")
The first query we run creates a table called
t1, while the next four insert data into the table we just created. Great! But do we really have to write out each query?
Let’s look into the
executemany() function. This function will help us to reuse code by introducing variables into the query string. This is how it works:
nums = [(1,2),(3,4),(5,6),(7,8)] cursor.executemany("INSERT INTO t1 values (%s,%s)",nums)
executemany() function takes two parameters, the first is the query we want to execute with variables denoted with a percent sign (%), the second is a list (or tuple) of tuples that denotes the data that will fill those variables. This can make executing large queries much easier to write out.
So, now we’ve inserted all this data, but how do we get it out? This is where fetching comes into play.
A query that returns rows (such as a SELECT statement) needs a way for those rows to be read. Psycopg2 has 3 different functions to go about doing this, we’ll go over each in some detail.
The most basic function is
fetchone(), as the name suggests this returns a single row from an executed query.
nums = [(1,2),(3,4),(5,6),(7,8)] cursor.executemany("INSERT INTO t1 values (%s,%s)",nums) cursor.execute("select * from t1") print(cursor.fetchone())
The output of this will be
If we were to run multiple
fetchone() functions, each call would return the next row as the previous is consumed.
If we want to read every single row returned by a query then we can use the
fetchall() function. Using our previous example:
nums = [(1,2),(3,4),(5,6),(7,8)] cursor.executemany("INSERT INTO t1 values (%s,%s)",nums) cursor.execute("select * from t1") print(cursor.fetchall())
This would output a list of all the row values as tuples.
[(1, 2), (3, 4), (5, 6), (7, 8)]
Our last fetch function,
fetchmany(), is used to get a specified amount of rows returned from a query. This function takes a single parameter, the number of rows we want to fetch.
nums = [(1,2),(3,4),(5,6),(7,8)] cursor.executemany("INSERT INTO t1 values (%s,%s)",nums) cursor.execute("select * from t1") print(cursor.fetchmany(size=2))
This will output the first two rows’ return values as a list of tuples.
[(1, 2), (3, 4)]
If the value of
size is larger than the amount of rows returned, then this function simply returns all the rows even if there are fewer than specified.
In this blog, we went over an example of how to interact with a PostgreSQL database using nothing but Python. We first went over creating a database and how to connect to it using the Psycopg2 library. Then we looked at the various functions Psycopg2 has to both execute queries on the database and retrieve results from it. With this library, developing frontend applications in Python with a backend PostgreSQL database is much easier than it once was.
Tristen received his Bachelor of Applied Science in Computer Engineering from the University of British Columbia in May 2023. He joined HighGo Software Inc. as a Software Engineer fresh out of university and is very excited for his engineering journey to begin. His main interests include Machine Learning, Embedded Systems, and Database Management Systems. With experience in C/C++ and advanced relational databases, Tristen hopes to contribute significantly to the PostgreSQL community as he continues to learn and grow his expertise.