LEdoian's Blog

Running Pleroma IPv6-pure

My Fediverse instance, https://pleroma.ledoian.cz, now runs IPv6-pure. [1] Even in 2026, this was not painless, so I want to share here what I needed to do to make it run.

Note that I migrated my instance from a dual-stack deployment to pure IPv6, for extra fun (not really, read on).

Shoutout to Alyx, who did the same with Mastodon some time back, though she set up a whole new instance, unlike me.

I'd also like to thank Vojta for helping me test federation and debug issues.

Why

My instance had been running on a cheap Hetzner VPS since 2019-08-26, together with other services. I probably did various weird stuff to the instance that I don't remember, but the next interesting day in this story is 2025-01-02, when I ran out of disk space and it became unsustainable to keep running Pleroma on that server. So something like this landed in my nginx config:

# hack 2025-01-02 (ENOSPC and need to safely migrate DB probably…)
return 503 "Pleroma not available for a few days/weeks";

My solution was to find “free hardware with power backup” (a used laptop) and “free residential colocation space” (a place where I could connect that laptop to the internet) and run Pleroma there. However, I didn't want to pay for more addresses, so I decided to start running it on IPv6, as I could just use an address from the /64 Hetzner gives each VPS for free [2]. Buying an IPv4 address would cost me about as much as the VPS itself.

This was not a top priority and I encountered some issues, so those “few days/weeks” turned into 389 days, and on 2026-01-26 I was able to make a post again. But let me not get ahead of myself.

Pretty much the rest of this post talks about what did not work :-) Most of that is IPv6-related; the rest is selfhosting adventures and the consequences of the instance having been down for over a year – I'll separate those into two parts of the post.

IPv6 issues

Nothing works! Namely:

NAT64 is not an option

This is mostly a precaution. While client-server software could just use NAT64 to reach the legacy internet (IPv4-land), I cannot, because I also need to be able to receive connections. I think I technically could use NAT64 for outgoing connections and (another) reverse proxy for incoming connections, but that seemed brittle and would increase the load on my VPS again. So I decided not to do that. Until I did…

Non-issue: Debian does support IPv6

Praise them! I might write another blogpost about why I did that, but the instance runs in a VM that has never seen the legacy internet. I don't remember having any issues with debian-installer or anything else related to the system.

Pleroma Git repository does not do IPv6

Oh bother. At one point I had a setup where I would sync the repo to the dual-stack VPS and then clone that to the instance itself, but…

Hex.pm does not do IPv6

I decided to scrap the magic git setup and use NAT64 just for syncing repositories. The initial idea was to switch between two states: either Pleroma is shut down, the firewall only allows outgoing connections and I use DNS64 nameservers, or I use regular nameservers, a two-way firewall, and Pleroma is running.

I later realised that DNS resolving is a userspace concern, so I have a resolv.conf pointing at DNS64 resolvers (thanks, Kasper, for nat64.net) that I just bind-mount over the regular one in a mount namespace. Roughly:

# new user + mount namespace, so the bind mount is only visible in here
unshare -rm -- sh -c '
        mount --bind utils/resolv.conf /etc/resolv.conf
        do_stuff
        '

This fixed hex.pm, git.pleroma.social, Github and maybe some other legacy hosts.
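For completeness, the bind-mounted file is just an ordinary resolv.conf pointing at public DNS64 resolvers; a sketch with a placeholder address (the real ones are listed at nat64.net):

# utils/resolv.conf — only used inside the namespace
# placeholder address; put actual DNS64 resolvers (e.g. nat64.net's) here
nameserver 2001:db8:64::1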

Rebar3 does not do IPv6

Like, at all, not even with NAT64. Turns out, Erlang's httpc defaults to legacy-only operation.

Naturally, the failure was obscured somewhere in mix's logs (mix itself can do IPv6, even though it only uses an IPv4 repository), which just said that rebar3 had failed for some package. Debugging was ordinary: isolating the issue into a small example (a tiny Erlang package depending on the failing package), reading the source code to understand what was happening, reproducing it in an interactive shell and reading the docs.
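The interactive-shell reproduction boiled down to something like this (a sketch from memory; the hostname is illustrative and the real rebar3 code path is more involved):

erl -noshell -eval '
    {ok, _} = application:ensure_all_started(inets),
    {ok, _} = application:ensure_all_started(ssl),
    %% default ipfamily is inet, so on an IPv6-pure host this fails
    io:format("~p~n", [httpc:request("https://repo.hex.pm/")]),
    %% switching to IPv6-first (with legacy fallback) makes it work
    ok = httpc:set_options([{ipfamily, inet6fb4}]),
    io:format("~p~n", [httpc:request("https://repo.hex.pm/")]),
    init:stop().'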

I patched that. I had no experience with Erlang before, but what was I going to do anyway :-D

And since I am using Debian, I did not want to wait until the new rebar3 makes it into the system repositories, so I think I repackaged it.

Also, mix for some reason brings its own rebar3, which was not yet patched. Workaround: use the system (repackaged) rebar3 to download the dependency for another project; mix's rebar3 then takes it from the local cache.

I cannot get rid of IPv4 properly

A small skill issue: I accidentally blackholed IPv4 communication instead of rejecting it. This led to software timing out instead of falling forward(?) to IPv6 right away, making most of mix deps run very slowly. It was a bit hard to debug, because a lot of software uses IPv6 by default, so the issue often does not occur.

Also, according to my notes I even managed to kill IPv4 communication with localhost, which is still not a good idea.
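In hindsight, the firewall should have looked roughly like this (an nftables sketch with a made-up table name): reject instead of drop, and leave loopback alone.

nft add table ip v4off
nft add chain ip v4off out '{ type filter hook output priority 0; policy accept; }'
# keep IPv4 loopback working…
nft add rule ip v4off out oifname "lo" accept
# …and *reject* (not drop) the rest, so software gets an immediate error
# and falls back to IPv6 instead of waiting for a timeout
nft add rule ip v4off out reject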

Vix has a download step as a part of compilation

… and they download from Github, of all places. I think both are bad decisions, but I digress. And I think it also does not do IPv6 even with NAT64, otherwise I would not have noticed.

I forced Vix to compile NIFs (Erlang's Native Implemented Functions) instead of downloading pre-compiled ones by setting VIX_COMPILATION_MODE=PLATFORM_PROVIDED_LIBVIPS, and I also started running dependency compilation in unshare -cn -- mix deps.compile, because I think having internet access at that point breaks some security assumptions (e.g. it sidesteps the expectations encoded in mix.lock).
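Put together, the dependency build step looked roughly like this (assuming the libvips development package is installed; the package name is Debian's):

apt install libvips-dev   # system libvips to build the NIFs against
VIX_COMPILATION_MODE=PLATFORM_PROVIDED_LIBVIPS unshare -cn -- mix deps.compile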

IPv6-first library does not do IPv6

Pleroma would run at this point, I think, but it couldn't make requests to the outside. The culprit is Hackney, which claims to be an “IPv6 first” HTTP client library. And it is, but only since version 1.22; Pleroma used to pin version 1.18.

This has been fixed in the develop branch of Pleroma by ~~MR 4412~~ PR 7789, which I backported on top of my Pleroma 2.10.

In the meantime, Pleroma migrated from GitLab to Forgejo, breaking links and MR/PR history in the process. I can provide you with the patch, ask me for commit 883b3a660c33c2cd51328a82ee94dee312111cef [3]. (Or just wait until Pleroma 2.11 is released :-))
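If you run into this on a release older than 2.11 and don't want to carry the whole patch, I suspect a dependency override in mix.exs would also do the trick, though I haven't verified that it is equivalent to the MR:

defp deps do
  [
    # …existing Pleroma deps…
    # force a Hackney version that actually prefers IPv6 (unverified sketch)
    {:hackney, "~> 1.22", override: true}
  ]
end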

I also noticed that even for one-way interactions, federation needs to work in both directions: when Alice sends Bob a post, Bob might try to fetch Alice's display name, so connections happen in both directions.

Instances don't do IPv6

You would expect this one. Some instances don't have AAAA DNS records. Some use a CDN for media that is not reachable over IPv6. As Alyx notes, some don't seem to make outbound connections over IPv6. I think I can interact with those, e.g. by favouriting posts or looking them up from my instance, but I never get new statuses pushed from them.

Those can be instances behind reverse proxies (incl. Cloudflare) that don't have IPv6 connectivity, stable Pleroma instances with old Hackney (see above), maybe even instances that run in a Docker container – afaik one needs to explicitly enable IPv6 in Docker.
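(For reference, Docker's IPv6 support is indeed off by default; enabling it is roughly this in /etc/docker/daemon.json, with a placeholder prefix, plus whatever routing/NDP setup the deployment needs:)

{
  "ipv6": true,
  "fixed-cidr-v6": "fd00:d0c::/64"
}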

I think those instances tend to be unable to load my profile picture. Another symptom might be an old bio being cached (usually the one that says I am 26 years old), but that could also be explained by exponential backoff, since my instance has been down for a long time.

Because I have followers on legacy servers, this also means that my instance tends to cry about unreachable addresses occasionally. And I only get replies to posts on legacy instances, which is a bit confusing.

I don't do IPv6 internet

This might be the funniest part of this post: while I did all this because I only had IPv6 addresses free for use, I don't have an IPv6 connection from home myself. So at this point, I have a working instance that I cannot reach :-D

Well, that is not true, obviously (I would also be unable to manage it). Initial solutions for this were SSH's ProxyJump and SOCKS proxying through dual-stack hosts.

But I do have an IPv6 network at home, it just uses ULAs. And nobody says that you cannot connect from a ULA to a GUA. Nobody should say you need NAT for that either. It's all just a matter of routing tables. So between my home network and the Hetzner VPS there is a WireGuard tunnel (over IPv4, unfortunately) with static routes to the prefixes at both ends.
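The relevant part of the WireGuard config is just AllowedIPs doing double duty as the routing table; a sketch of the home end with placeholder keys and prefixes:

# /etc/wireguard/wg0.conf (home side), brought up with wg-quick
[Interface]
PrivateKey = <home private key>
# ULA used on the home network (placeholder)
Address = fd00:db8::1/64

[Peer]
PublicKey = <VPS public key>
# IPv4 endpoint, unfortunately
Endpoint = vps.example.net:51820
# the Hetzner /64 where the instance lives (placeholder prefix);
# wg-quick also installs the route for this prefix
AllowedIPs = 2001:db8:1234::/64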

And this is silly, because I am using an “IPv4-mostly” web browser to connect to an IPv6-pure instance, which means that some posts I can only interact with through a client for my instance, while other posts I can see in my browser but cannot interact with.

Also, something seems unhappy with ULAs, because the instance log claims it doesn't get the X-Forwarded-For header from my reverse proxy. I think that is a lie and something makes assumptions about what a “proper” IPv6 address looks like, but I haven't dug into that yet.

UPDATE 20:10: The ULA warning is a misconfiguration/odd default.

Some SSH don't try IPv6 first

Also, I learned that the command ssh dual-stack-host.example is not consistent about which address family it tries first. I use hosts with Arch, postmarketOS and Debian, and while Debian uses the IPv6 address in my experience, Arch and postmarketOS default to IPv4.

I haven't yet looked into that. (And the only correct solution to this is to deploy IPv6-pure /j)

UPDATE 20:10: My bad, getaddrinfo(3) returns addresses in a legacy-first order on my IPv4-mostly network. Solved by adding the same label for all my prefixes in gai.conf(5) according to RFC 3484.
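For reference, the gai.conf change was along these lines (placeholder prefixes; beware that glibc stops using its built-in label table as soon as any label line is present, so the default entries from gai.conf(5) need to be restated as well):

# /etc/gai.conf (excerpt) — restate the default label table first (see
# gai.conf(5)), then give my own prefixes a common label so that the
# "prefer matching label" rule of RFC 3484 favours the IPv6 destinations:
label fd00:db8::/48       99
label 2001:db8:1234::/64  99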

Other issues

Having an instance be down for over a year has its own perks:

Upgrading and migrating

In the meantime, a new Debian came out, with new Elixir and OTP, and several new versions of Pleroma were released. Stuff pretty much worked out of the box, but sometimes I got stuck on an issue with a weird error message and suspected that it might have been caused by one of those upgrades. Therefore I spent a nontrivial amount of time trying to restore various versions of the stack.

An annoying part of restoring the database from a backup is that reindexing it takes about a day for some reason. So stuff has been moving slowly.

And even then the database apparently remained quite slow, but I only solved that at the very end, so I'll tell you about it at the end of this blogpost :-)

Backfilling and being revived

When I got my instance running again, stuff did not work as well as I expected. The main issue is that I don't get posts from many instances. On the other hand, my database started hogging resources, leading to various timeouts.

I don't have all the answers, and possibly never will. But my guess is: as the instance became able to connect outside again, it tried to sync with a lot of servers (all the background jobs were a year overdue), and at the same time, other instances had learned that my instance returns 503's, so the reasonable thing for them to do is to retry with exponential backoff. Which might mean that I will not get statuses from some accounts for the next year. This feels like a big waiting game; I have no idea if/when stuff will work.

And as said above, there are some instances that are set up in a way which doesn't allow bidirectional federation over IPv6. I don't think I can distinguish them from those that still deem my instance dead. (E.g. mastodon.social is also one of the servers I don't receive statuses from (yet?).)

To aid with this, I started tracking which accounts have successfully posted on my main timeline, in a random text file. I also try to “nudge” various instances to start sending me stuff again, e.g. by favouriting random posts hosted there.

Fediverse evolution

Apparently, time went on on Fedi. Some accounts were migrated, some instances stopped existing (e.g. botsin.space), &c. This means that I might have to go through all my accounts and check that they work. The issues with those accounts and their backlog are tracked in the same file where I track who sends me statuses. Likewise, the file contains the list of followed accounts on legacy-only instances.

Infrastructure modernisation fails

A small one, but: I wanted Pleroma to listen on a Unix socket instead of an IP socket, following the Pleroma and Cowboy documentation. That just plain does not work. I don't know exactly why, but Pleroma seems to launch the server twice, and the second attempt fails because the socket file already exists.

The error message is unfortunately about Ecto not being able to find its repository or something :-/
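For reference, the configuration I was trying follows the documented pattern, something like this in the endpoint config (the socket path is mine, illustrative):

# config/prod.secret.exs (Elixir) — listen on a Unix socket instead of a port
config :pleroma, Pleroma.Web.Endpoint,
  http: [ip: {:local, "/run/pleroma/pleroma.sock"}, port: 0]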

Another issue was when I tried to solve the database problems by tweaking various limits in AdminFE. Sometimes, trying to reset a limit to its default saves a value in the database that Pleroma then cannot parse. The stored values are in Erlang's binary term format, readable with :erlang.binary_to_term/1.

Various small details

Searching my own statuses does not work. I have no idea why.

UPDATE 20:10: It's a known bug.

The laptop's hardware is not too stable, as I probably peeled off an antistatic foil that should not have been peeled. The instance seems to be running fine, except when the laptop resets itself. I'll find a replacement, one day, maybe…

If you want to read my real-time cry about the deployment (and support the current IP version), here is my thread with various observations.

The slow database

One of the last issues was that Vojta did not get my statuses. From his logs, it turned out that some endpoints tend to time out after 15 seconds. We iterated a bit, because at first the pinned posts endpoint was failing, and when I solved that, Vojta managed to trigger a similar timeout on another endpoint, still leading his instance to not get my statuses.

After trying to tweak PostgreSQL randomly to make it show me the failing request (note that log_parameter_max_length_on_error = -1 and similar settings need to be put in /etc/postgresql/17/main/postgresql.conf, even though psql would happily autocomplete them in a SET command), it turned out that the particular SELECT used a wrong index (visible in EXPLAIN ANALYZE SELECT …), leading to the slowness [4].

The solution was to ANALYZE table;: if I understand it correctly, PostgreSQL was missing the statistics needed to pick the right index. That might have been caused by the migration, as iirc this kind of data is not contained in the (text) database dump.
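Condensed, the debugging and the fix looked roughly like this (placeholder table and column names, as in footnote [4]):

-- settings like this go into /etc/postgresql/17/main/postgresql.conf,
-- a SET from psql is not enough:
--   log_parameter_max_length_on_error = -1
-- then, in psql, on the slow query:
EXPLAIN ANALYZE SELECT a, b ->> 'c' FROM t WHERE … ORDER BY a ASC NULLS LAST LIMIT 5;
ANALYZE t;   -- refresh the statistics the planner needs to pick the right index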

This probably solved most of the other database issues as well [5] and now my instance runs like a breeze.


[1]I hate that the term “IPv6-only” can also mean “a network that gives you IPv6 and NAT64 for reaching the legacy internet”, because then there is no simple word for describing “a network which does not care about legacy at all”.
[2]Lucky thing is, Hetzner actually routes the range, so I did not need to do weird hacks to convince the VPS provider that the address actually exists. I heard this is needed with some providers (e.g. they try neighbour discovery to reach the address).
[3]I am probably legally required to comply, as Pleroma is provided under AGPL-3 :-)
[4]The slow query has the form SELECT a, b ->> 'c' FROM t WHERE (( b ->> 'd' = 'e' AND ( b ->> 'f' ? 'g'::text OR b ->> 'h' ? 'i'::text )) AND b ->> 'j' <> 'k' ) ORDER BY a ASC NULLS LAST LIMIT 5; and it turned out that without the ORDER BY or LIMIT it is fast.
[5]One problem I had since forever was that when I post, for a while I get 500 for everything. I always blamed slow hardware not being able to handle the load spike, but now it runs very smoothly, so it was probably the database all along.