PostgreSQL comes with support for the SQL99 standard Array Types. Used correctly, these can result in simpler schemas and faster queries.
PostgreSQL arrays also translate well into Go. Read on to learn more about array types and the cool things you can do with it, and how to work with them in Go.
Array Types in SQL
In Postgres, you can have a single column contain a bunch of values of the same type, quite similar to arrays in programming languages. Here is how they are represented literally:
In Postgres syntax, the part '{1,2,3}'
is a plain string, ::
is the
Postgres typecast operator, and int[]
is the type (an array of integers). The
expression '{1,2,3}'::int[]
reads as “typecast the string {1,2,3}
into an
array of integers”.
The string notation of curly brackets and comma-separated values is an array value constructor recognized by Postgres.
There is also an alernate syntax, which is less explicit about the element type:
You can have arrays of any basic type (like boolean, timestamp, text, and so on). Multi-dimensional arrays are also possible, although we don’t cover them in this blog post.
Here is how you can have an array as a column type in a table:
Like any other column type, the array column type can also be nullable or not, and can have a default value, or not. For example, if you want the default value to be an empty array (array with no elements):
To insert a row into this table, you can use a plain string as the value for tag:
Postgres will cast the string to the column type, letting you omit the explicit cast.
Using Go
The standard Postgres database/sql driver for Go is lib/pq. (There are others, like pgx and go-pg, which we don’t cover here.) Here’s is how you’ll connect to a database using lib/pq:
To insert a row that contains an array value, use the pq.Array
function like
this:
To read a Postgres array value into a Go slice, use:
Note that in lib/pq, only slices of certain Go types may be passed to pq.Array(). For more information, see the docs.
Selecting
Let’s see some examples of how to use arrays for reading back information from our blog posts table. Here are the contents of the table which we’re working with:
Get all posts and their tag count
The array_length
function gives the number of elements in the array. This
function can work with multi-dimensional arrays also, so we need so specify the
dimension number also (always 1 for one-dimensional arrays):
You can also use the cardinality
function to get the same result in this case:
Get all posts tagged “postgres”
The ‘<@’ operator can be used to check for array membership. It is used like A
<@ B
where A and B are arrays of the same type, and returns true
if all
elements of array A are present in array B. We can use this to get a list of all
posts that are tagged “postgres”:
The left hand side of the operator has to be an array, so we construct an array with the single element “postgres”.
Doing this from Go goes like this:
Get all posts tagged “postgres” and “go”
If we specify more than one element for the left hand side array, all those elements need to be present in the right hand side for the operator to report a match. This means that if we need to list all blogs that are tagged both postgres and go, we can do:
The corresponding change to Go code is simply:
Get all posts tagged either “postgres” or “go”
So how do we do an “OR” then? We can use the &&
operator, which when used as
A && B
returns true when A and B have one or more elements in common.
The Go code changes only in the query:
Get all unique tags
The function unnest
unrolls an array into rows. This is best explained with an
example:
As you can see, each tag from each row in “posts” expands into its own row,
duplicating the other columns (“title” here) as required. You can use unnest
to get only the unique tags:
And since we’re here, let’s see one more trick: rolling up those rows back into an array:
Get tag cloud
Can we get the information required to build a tag cloud? A tag cloud needs each tag, and the number of posts that are tagged with it. In SQL terms, we basically need to do a count over a grouping by tag. We can do this using the previous query as a CTE:
Updates
Let’s see some ways to modify these array values.
Mark all existing posts as legacy
The array_append
function returns a new array with the given elements appended
to it. We can use this to add another tag to each of our post:
Rename tags
You can also search and replace elements in an array using the array_replace
function. If we wanted to rename the tag “go” to “golang” in all posts, we
can do:
Note that this updated all 4 rows, even though only 2 rows contained “go”. The other rows were touched, even though their values did not change. This was because we didn’t have a WHERE clause for the UPDATE. Let’s update only the required ones:
That updates only 2 rows, as required.
Other Array Tips and Tricks
Here are a couple more things about arrays that’s useful to know:
NULL values in an array
Array elements can be NULL, and there is no way to declare an array as containing only “not null” elements. It’s a bit of a pain to work with NULLs in arrays, and is best avoided.
From Go, here is how to read arrays that may contain NULLs:
You’ll have to check each element of tags
to see if it is .Valid
while
processing the result.
Using “ANY” instead of “IN”
Here are two ways of selecting posts written by any of a specific set of authors:
Assuming that the set of authors is known only at run-time (say selected from the UI from a list), how would you form this query in Go?
With the first query, you’d have to escape each value and form the SQL string, which would be inelegant and error-prone. The second query, however, can be written like this:
Array elements as foreign keys
Having an array of primary keys that reference another table would actually be pretty helpful in some cases, but Postgres does not support it. You can store the keys as array elements, but you’ll have to “manually” update the array when the referenced rows in the slave table are deleted.
Array Aggregates
The array_agg
function can be used in aggregate queries to create arrays out
of aggregated input. See the docs for more info.
Multi-dimensional Arrays
Arrays can have more than one dimension. See this page for an overview, and check out other functions here that would be useful in working with them.
Indexing Arrays
Indexes can be created using array expressions. Not all operators (like ‘<@’, ‘&&’ etc) are support by all index types. See here for more info.
Monitoring Your PostgreSQL servers
Since you’re here, you might be interested in our server and database monitoring product OpsDash. It’s written entirely in Golang and can be used to monitor servers, services, app metrics, and databases, including PostgreSQL.
With OpsDash, you can quickly start monitoring your PostgreSQL servers, and get instant insight into key performance and health metrics including WAL archiving and streaming replication.
Additionally, each OpsDash Smart Agent includes the industry-standard statsd interface (and even a graphite interface) to easily report custom metrics.
New Here?
OpsDash is a server monitoring, service monitoring, and database monitoring solution for monitoring Docker, MySQL, PostgreSQL, MongoDB, memcache, Redis, Apache, Nginx, Elasticsearch and more. It provides intelligent, customizable dashboards and rule-based alerting via email, HipChat, Slack, PagerDuty, OpsGenie, VictorOps and Webhooks. Send in your custom metrics with StatsD and Graphite interfaces built into each agent.