From Matomo to Plausible
Posted on 2025-03-09 in Blog
For several years, I hosted my own Matomo instance to track traffic on my blog and projects while respecting the privacy of my users. It worked well, but it was the only service I had running on MariaDB; everything else runs on PostgreSQL. Over the years, I also had several issues with MariaDB and file corruption, mostly after unexpected server shutdowns. Most of the time I managed to recover the data, but I also lost some of it, and sometimes MariaDB simply couldn’t start, so Matomo was down for a while. As for Matomo itself, it is very feature complete, but it is resource intensive (and thus a bit slow on my server). I also always found the interface a bit messy.
Since I’m in the process of cleaning up my servers, I wondered if I could find an alternative. Ideally, I want free software I can self-host, with the data in my PostgreSQL instance and the project itself running in a container (so no barrier regarding the runtime language). I had no objection to paying a small fee for a good service if needed: I can afford it, and it could make my life much easier. Whatever I chose, I wanted something GDPR compliant that respects the privacy of my users.
After some research, I found three possible options:
- Matomo, which I don’t want to self-host any more. Their paid plan starts at 22€/month, which I find expensive given my very small usage. On the other hand, it’s feature complete, privacy focused, and well known.
- Umami, an MIT-licensed project written in TypeScript. It supports multiple databases, including PostgreSQL, and is developed by a US company. It has a free plan that would suit my needs; the next tier starts at $20/month. They don’t rely on cookies to track users, but I couldn’t find more details in their documentation about how they comply with the GDPR or how they respect users’ privacy. I also failed to find where the data is hosted for their Cloud plan.
- Plausible, an AGPLv3 project written in Elixir. Of the many projects I looked at, it’s the only one I had heard of beforehand. While it uses PostgreSQL, it also requires ClickHouse (a column-oriented database designed for analytics). Running the full stack with docker compose worked well, but it’s probably not the easiest one to self-host. Their cheapest plan (which fits my needs) starts at 9€/month, which is acceptable for me. It is developed by a European company, and their cloud service is hosted in Europe by Hetzner, a German hosting provider. Furthermore, their documentation is very clear and complete about how they respect the GDPR and users’ privacy. It’s not the most feature complete, but it has just what I need, and it is also efficient and simple to use. Their tracking script is also very lightweight.
I started with the trial period to test the tool easily. I ran it in parallel with my self-hosted Matomo to spot any differences. Since they don’t track returning users the same way, there were some differences in the stats. What’s important is that they were close enough to be considered similar in both tools.
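If you want to compare the numbers programmatically instead of eyeballing the two dashboards, Plausible exposes a Stats API. The snippet below is only a sketch: it assumes an API key created in the Plausible settings, and it uses my domain as a stand-in for the site_id.

import requests

# Minimal sketch against Plausible's Stats API (v1).
# Assumptions: PLAUSIBLE_API_KEY holds a key created in the Plausible
# settings, and SITE_ID matches the domain registered there.
PLAUSIBLE_API_KEY = ""
SITE_ID = "jujens.eu"

response = requests.get(
    "https://plausible.io/api/v1/stats/aggregate",
    params={
        "site_id": SITE_ID,
        "period": "month",
        "date": "2025-02-01",  # any day of the month you want stats for
        "metrics": "visitors,pageviews",
    },
    headers={"Authorization": f"Bearer {PLAUSIBLE_API_KEY}"},
)
response.raise_for_status()
# Prints something like {'visitors': {'value': ...}, 'pageviews': {'value': ...}}
print(response.json()["results"])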
Since I was happy with the new tool, I went for a paid plan to make my life easier. No regrets so far. I mostly look at the stats once a month through their monthly email report anyway.
Before shutting down Matomo completely, I wanted to export the data I had in it. To do this, I used the small Python script below. It took me a bit of time to write and required some trial and error on my end, so I want to share it in case you are doing something similar. I didn’t bother importing the data into Plausible: most of it is old, and I only want to keep it as an archive.
import requests
from datetime import date
from dateutil.relativedelta import relativedelta

# Replace with your values
api_url = ""  # In my case: https://piwik.jujens.eu/index.php
auth_token = ""
start_date = date(2015, 1, 1)
end_date = date(2024, 12, 31)

# That's the tricky part. Matomo has an API, and I thought I'd be able to do
# GET requests to it to get my data. It failed. I tried the POST API with a
# dict and it failed again.
# In the end, the solution I found is to split the suggested GET parameters,
# put them in a dict and do POST requests to the API.
dd = "module=API&format=CSV&idSite=1&period=month&date=2024-12-01&method=Actions.getPageTitles&expanded=1&translateColumnNames=1&language=en&token_auth=XXXXX&filter_limit=-1".split("&")
data = {key: value for key, value in (d.split("=") for d in dd)}
# Override the XXXXX placeholder from the sample URL with the real token.
data["token_auth"] = auth_token

# Export one CSV file per month.
d = start_date
while d < end_date:
    data["date"] = d.isoformat()
    with open(f"page_views_{d.isoformat()}.csv", "wb") as f:
        f.write(requests.post(api_url, data=data).content)
    d += relativedelta(months=1)
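If you want a quick sanity check that the export worked before shutting Matomo down for good, a small sketch like the following lists each file with its line count. It only counts raw lines, so it avoids assuming anything about the encoding of Matomo’s CSV export; an empty or missing month stands out immediately.

import glob

# Quick sanity check: every monthly export should exist and be non-empty.
# The glob pattern matches the file names written by the script above.
for path in sorted(glob.glob("page_views_*.csv")):
    with open(path, "rb") as f:
        line_count = len(f.read().splitlines())
    print(f"{path}: {line_count} lines")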