Featured image of post Calculating the Distance Between Coordinates with DrizzleORM and SQLite

Calculating the Distance Between Coordinates with DrizzleORM and SQLite

Addressing a self imposed performance issue through the power of SQL

The Solution

Apply the Haversine formula using DrizzleORM’s magical sql operator and ensure your SQLite database has installed the math extension (sqlean is a fantastic option for this!), if you’re using Turso, they provide this, check out the Doing it in SQLite section for more info!

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
// Some parts are left out for simplicity.
import { stations as stationsSchema } from '@/lib/db/schema';
import { sql, lte } from 'drizzle-orm';

const lat = -33.8930404;
const long = 151.2765367;
const radius = 10; // 10km.

const stations = await db.query.stations.findMany({
    extras: {
        distance: sql<number>`
        6371 * acos(
            cos(radians(${lat}))
            * cos(radians(${stationsSchema.latitude}))
            * cos(radians(${stationsSchema.longitude}) - radians(${long}))
            + sin(radians(${long}))
            * sin(radians(${stationsSchema.latitude}))
            )
        `.as('distance')
    },
    where: (lte(sql`distance`, radius)),
});

You can find the full code here: https://github.com/troypoulter/fuelpulse/blob/dadd1855fac98fc60198b43d4fe6d5a19ebe1e06/app/search/page.tsx#L37-L49.

The Plot

I’ve been recently working on Fuel Pulse, an Australian fuel price tracking website, and I needed to find all the fuel stations near you. When I started, I did it the simple way of calculating it at the API layer, I didn’t know how to work this out at the DB layer yet, as I was trying out some new things: Drizzle ORM and SQLite.

Here’s a look at a small sample of the performance gains I got when shifting this calculation in the DB!

Before and After Performance

Doing it the simple but inefficient way

You can see in my pull request troypoulter/fuelpulse - #6 perf: apply Haversine formula at the DB layer instead of API layer what the code used to look like, here’s a breakdown:

  1. Retrieve a list of all stations from the database.
  2. Map through each station, calculate the distance using the Haversine formula and filter the results.
1
2
3
4
5
6
7
const stations = await db.query.stations.findMany();

const stationsWithDistance = stations.map(station => ({
    ...station,
    distance: parseFloat(haversineDistance(parsedLat!, parsedLong!, station.latitude, station.longitude).toFixed(1))
}))
    .filter(station => station.distance <= parsedRadius);

Now, the issue wasn’t whether it took a long time to run the haversineDistance function over each station once we got them all back (that was fast). The issue was this required returning all the stations, which at the time of writing is 2,854!

This made that DB call take a lot longer than it really needed to be, coupled with wanting to also return prices for each station, I had to make a second query! You can see just how long this was taking.

Calculating distance at the API layer

But, if I could apply the formula when I make the initial DB query, I would only return the stations that are within the distance, and, get the pricing data, all at once!

I didn’t do this initially as I wanted to focus on building the core website, but after I did the initial release, I came back to this and did more research and I found the path forward!

Doing it better

The answer rests with Drizzle’s magical sql operator, where I can mix in raw SQL alongside the type safe queries. It ended up being really neat and simple to apply! I’ve included the price part as well to show the full query in all its beauty!

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
const stations = db.query.stations.findMany({
    extras: {
        distance: sql<number>`
        6371 * acos(
            cos(radians(${parsedLat}))
            * cos(radians(${stationsSchema.latitude}))
            * cos(radians(${stationsSchema.longitude}) - radians(${parsedLong}))
            + sin(radians(${parsedLat}))
            * sin(radians(${stationsSchema.latitude}))
            )
        `.as('distance')
    },
    where: (lte(sql`distance`, parsedRadius)),
    with: {
        prices: {
            orderBy: (prices, { desc }) => [desc(prices.id)],
            limit: 10
        }
    },
});

You can see in the pull request how much code I was able to remove and simplify the operations, which brought the improved speeds!

Calculating distance at the DB layer

Doing it in SQLite

This part is what actually took longer to work out, as SQLite, unlike other databases, doesn’t contain the required math functions to apply the Haversine formula by default - you need to load them in.

Say hello to sqlean, a fantastic open-source set of SQLite extensions that includes many useful functions, like math!

Now, luckily, I use Turso to host my SQLite database, and they make it really simple to create a database with these extensions installed.

1
turso db create [database-name] --enable-extensions

If you wanted to do it yourself, you’d need to follow the sqlean guide on installing it for your sqlite which I haven’t tried yet but plan to in the future!

Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer’s view in any way.
Built with Hugo
Theme Stack designed by Jimmy