When writing server side projects in Go, at some point you will also need to test against a database. Let's take a look at different ways of using Postgres with different performance characteristics. The final approach shows how you can set up a clean database in 20ms (there are a few caveats).
We're not going to cover the "how should you use a real database in your tests" debate. At some point you'll need to test your database layer, so, we'll cover those cases.
If you have searched a bit on how to set up a clean test environment, you've probably come across github.com/ory/dockertest package. There's also testcontainers for setting up containers. Alternatively, you could even invoke docker as a command and use that. Whichever your poison, the approach will look similar. We'll use dockertest for our examples.
Usually, the first thing you do is set up something to act as the client. With dockertest it means creating a dockertest.Pool. And we need to set it up in our TestMain:
var dockerPool *dockertest.Pool
func TestMain(m *testing.M) {
var err error
pool, err = dockertest.NewPool("")
if err != nil {
fmt.Fprintln(os.Stderr, err)
os.Exit(1)
}
// Set a time for our retries. A lower value probably makes more sense.
pool.MaxWait = 120 * time.Second
code := m.Run()
os.Exit(code)
}
var dockerPool *dockertest.Pool
func TestMain(m *testing.M) {
var err error
pool, err = dockertest.NewPool("")
if err != nil {
fmt.Fprintln(os.Stderr, err)
os.Exit(1)
}
// Set a time for our retries. A lower value probably makes more sense.
pool.MaxWait = 120 * time.Second
code := m.Run()
os.Exit(code)
}
If we are writing tests, then using a specific helper is going to be very convenient.
func TestCreateTable(t *testing.T) {
ctx := context.Background()
WithDatabase(ctx, t, func(t *testing.TB, db *pgx.Conn) {
_, err := db.Exec(ctx, `
CREATE TABLE accounts ( user_id serial PRIMARY KEY );
`)
if err != nil {
t.Fatal(err)
}
})
}
func WithDatabase[TB testing.TB](ctx context.Context, tb TB, test func(t TB, db *pgx.Conn)) {
// < snip >
}
func TestCreateTable(t *testing.T) {
ctx := context.Background()
WithDatabase(ctx, t, func(t *testing.TB, db *pgx.Conn) {
_, err := db.Exec(ctx, `
CREATE TABLE accounts ( user_id serial PRIMARY KEY );
`)
if err != nil {
t.Fatal(err)
}
})
}
func WithDatabase[TB testing.TB](ctx context.Context, tb TB, test func(t TB, db *pgx.Conn)) {
// < snip >
}
This approach creates a docker image and calls test callback whenever it's ready.
The callback based approach is especially helpful if you need to test with multiple backends such as Cockroach and Postgres. In your own codebase you probably would return the data layer interface rather than *pgx.Conn directly. For example:
func TestCreateTable(t *testing.T) {
ctx := context.Background()
db := NewDatabase(ctx, t)
_, err := db.Exec(ctx, `
CREATE TABLE accounts ( user_id serial PRIMARY KEY );
A single table migration isn't indicative of a proper database layer, but it's sufficient for seeing the best-case scenario. Adding more tables didn't seem to affect things that much.
Let's get back on track and see how you can implement the first approach. It's should be trivial to convert one to the other:
func WithDatabase[TB testing.TB](ctx context.Context, tb TB, test func(t TB, db *pgx.Conn)) {
// First we need to specify the image we wish to use.
In most cases, creating a new postgres instance per test isn't necessary. It'll be entirely sufficient to have a database per test. If we have SUPERUSER permissions in postgres we can create them dynamically.
To contrast with the previous approach, let's use a locally installed Postgres instance. This can be helpful, if you want to run tests against a remote database or want to avoid the container startup time.
var pgaddr = flag.String("database", os.Getenv("DATABASE_URL"), "database address")
var pgaddr = flag.String("database", os.Getenv("DATABASE_URL"), "database address")
Let's rewrite the function to create a new database per test:
func WithDatabase[TB testing.TB](ctx context.Context, tb TB, test func(t TB, db *pgx.Conn)) {
if *pgaddr == "" {
tb.Skip("-database flag not defined")
}
dbaddr := *pgaddr
// We need to create a unique database name so that our parallel tests don't clash.
But, 90ms is still a lot of time per single test. There's one lesser-known approach we discovered in Storj. It's possible to use a schema to create an isolated namespace that can be dropped together.
Creating a new schema is as straightforward as executing CREATE SCHEMA example; and dropping DROP SCHEMA example CASCADE;. When connecting to the database it's possible to add a connection string parameter ?search\_path=example to execute all queries by default in that schema.
Of course, if you use schemas for other purposes in your system, then this approach may complicate the rest of your code. Similarly, schemas are not as isolated as separate databases.
Now that the disclaimer is out of the way, let's take a look at some code:
func WithSchema[TB testing.TB](ctx context.Context, tb TB, test func(t TB, db *pgx.Conn)) {
if *pgaddr == "" {
tb.Skip("-database flag not defined")
}
dbaddr := *pgaddr
// We need to create a unique schema name so that our parallel tests don't clash.
There's one important flag that you can adjust in Postgres to make it run faster... of course, this should only be used for testing. It's disabling fsync.
The final results of the comparison look like:
Environment
Test
fsync
Time
Windows Threadripper 2950X
Container
on
2.86s ± 6%
Windows Threadripper 2950X
Container
off
2.82s ± 4%
Windows Threadripper 2950X
Database
on
136ms ± 12%
Windows Threadripper 2950X
Database
off
105ms ± 30%
Windows Threadripper 2950X
Schema
on
26.7ms ± 3%
Windows Threadripper 2950X
Schema
off
20.5ms ± 5%
MacOS M1 Pro
Container
on
1.63s ± 16%
MacOS M1 Pro
Container
off
1.64s ± 13%
MacOS M1 Pro
Database
on
136ms ± 12%
MacOS M1 Pro
Database
off
105ms ± 30%
MacOS M1 Pro
Schema
on
19.7ms ± 20%
MacOS M1 Pro
Schema
off
18.5ms ± 31%
Linux Xeon Gold 6226R
Container
on
2.24s ± 10%
Linux Xeon Gold 6226R
Container
off
1.97s ± 10%
Linux Xeon Gold 6226R
Database
on
135ms ± 10%
Linux Xeon Gold 6226R
Database
off
74.2ms ± 10%
Linux Xeon Gold 6226R
Schema
on
29.2ms ± 16%
Linux Xeon Gold 6226R
Schema
off
15.3ms ± 15%
All the tests were run in a container that didn't have persistent disk mounted. The fsync=off would probably have a bigger impact with an actual disk.
So for the conclusion, we looked at three different approaches to creating a clean Postgres environment. The approaches aren't completely equivalent, but use the fastest one that you can.