Recreating My Email Server with Claude

TL;DR: I recently used Claude to successfully recreate my email server's setup in NixOS with enhanced security and monitoring. You can find my setup on GitHub and deploy it yourself if you'd like.

2020: Getting Started

In 2020, I set up a personal email server on an EC2 instance, following Gilles Chehade's excellent guide. The OpenSMTPD + Rspamd + Dovecot setup ended up being rock solid, and ran for 6 years without issue—despite the paltry resources of its t3.nano host (2 vCPUs and 512MB of RAM). Looking back, I'm actually really impressed at how boring running this setup ended up being.

That's not to say that the setup was perfect. I used acme.sh to obtain Let's Encrypt certificates, and while it did a great job of issuing them, I never quite got automatic renewal working. As a result, I ended up logging in every 3 months or so to renew the certificates manually, and, of course, this was always at an inconvenient time.

Early on, I also had some downtime due to running out of disk space. This was easy to resolve and never happened again, but I didn't have any observability in place so it actually took me several days of not receiving any mail to even realize there was a problem.

I was also worried that I didn't have a reliable way of recreating the setup. Everything was done ad hoc on the node itself, so if the node was lost it would take me around a week to bring things back up. Since I'd started using this server as my primary email, that would be pretty inconvenient.

Shortly after going live, I also received an email from Amazon saying that the AMI I had built the server on was deprecated. This turned out not to be an issue, as I kept running the node for 5 more years after receiving it, but it definitely nagged at the back of my mind.

Ultimately, while this setup was Good Enough™, I knew I wanted to rework it down the line.

2022: Take Two

After two years of manual certificate refreshes and the disk full incident, I decided to try and fix things. Of course, as a hobby project, improving the existing server didn't feel very satisfying—I wanted to learn a new technology and also create a reproducible setup.

I decided the technology I wanted to use was Kubernetes. I intended to put each of the services into its own container and then deploy to a managed Kubernetes cluster somewhere.

Since my laptop was running NixOS, I decided to use nix to generate the containers. This went relatively smoothly, and I got the basic components containerized and deployed into a local k3s cluster. Then, I worked on a Python script to test this cluster and make sure everything worked before attempting to go live.

That's when everything started to go wrong. I started running into various obscure issues with the Python test's interactions with the cluster, with how exactly nix was building the containers, and more. In each of these cases, I ran into knowledge gaps: even when I had ideas about how to proceed with debugging, actually finding the right incantation to act on them was challenging and time-consuming.

Ultimately, I never quite got the core email setup recreated (let alone setting up certificate renewal or monitoring) and I ended up abandoning this effort after a few weeks. It turns out logging in every few months to renew the certs just wasn't that big of a deal.

2026: Take Three—This Time with AI

Now, in 2026, after over 5 years of running the service without incident, I got the hankering to try and overhaul things again. Like in 2022, just fixing the existing server wouldn't be satisfying—I wanted to fully recreate the server using a new technology. This time, that technology was AI.

I downloaded Claude Code and, after forking over $20, I presented the high level goal: recreate the email server in a reproducible fashion while adding a monitoring solution and ensuring the certs would automatically renew.

Right off the bat, I was impressed with Claude's understanding of the problem and the questions it asked for clarification. When I presented the 2022 architecture, Claude immediately pointed out how expensive a managed Kubernetes cluster would be. I'm now actually grateful that I never finished the 2022 implementation, as I can't imagine how frustrating it would have been to get as far as deploying it only to realize that I couldn't afford a managed Kubernetes cluster.

Ultimately, we decided to land somewhere between the first and second setups. I would stick with the existing OpenSMTPD + Rspamd + Dovecot services, and, like in 2022, we would containerize them. This time, though, to keep everything fitting on a t3.nano, we'd just use systemd to manage the containers.

We would continue to use nix to create the containers, and we'd install NixOS on the server. We'd use Terraform to manage provisioning the t3.nano and then nixos-rebuild to deploy the locally defined NixOS config.

After the initial high level discussion, Claude rapidly spat out a massive amount of code. Unfortunately, it didn't actually work. I quickly realized that expecting Claude to magically one-shot a full project was fanciful.

Changing course, I then decided to have Claude build one step at a time, adding tests for each step and not moving forward until things were working. This approach proved highly successful! Whereas in 2022 setting up tests proved to be the downfall of the project, this time it's what kicked things into gear. Claude was able to easily set up NixOS VM tests that validated each component separately, and then create integration tests to validate the components' interactions.

What was particularly satisfying was that, this time, when I had an idea for how to write a test, Claude was actually able to implement it. Claude's breadth of knowledge with various command line tools and with the ins and outs of nix was remarkable. This was the first time I've ever worked with nix and not once been stuck fighting nix itself.

That's not to say that the entire thing was smooth sailing. In the middle of the development process, I ran into a particularly nasty issue once I moved the containers from the host network to a bridge network. Whenever I'd restart my podman-opensmtpd service (and underlying container), I'd no longer be able to reach it from the host network through the published port. Directly reaching out to the container via its bridge network IP would continue to work, though.
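The symptom described above (published port unreachable, bridge address still reachable) can be probed with a few lines of Python. This is only an illustrative sketch and not code from the project; the function name `can_connect` and the addresses mentioned afterwards are hypothetical.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection performs the TCP handshake; the socket is
        # closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

For example, `can_connect("127.0.0.1", 25)` would probe a published SMTP port on the host, while `can_connect("10.88.0.2", 25)` would probe a container directly on its bridge network. Both addresses are placeholders to be replaced with real values. Running the two probes before and after a service restart would show whether only the published port has stopped responding.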

In the logs, we saw this error message:

Error: netavark: setns: IO error: Invalid argument (os error 22)

Claude wasn't able to get far debugging this on its own (it mostly wanted to just switch back to using the host network).

However, I eventually had the idea to use strace to better understand the failing setns syscall. I've hardly ever used strace myself, so this is not normally a tool I reach for—but Claude was able to immediately generate the appropriate commands (and knew to switch to auditctl when strace couldn't find the relevant syscall directly).

Claude initially struggled to interpret the strace output, but once I pasted in the man pages for the relevant syscalls, it was able to quickly interpret the large auditctl log files and carry out the rest of the investigation.

Ultimately, we were able to determine that systemd was running the service's ExecStart and ExecStop in different mount namespaces, which prevented podman in ExecStop from seeing the network namespace it had set up in ExecStart. As a result, podman could not clean up the virtual network, and the routing rules for the now-defunct container were left in place. Those leftover rules shadowed the rules for the new container, making the new container inaccessible.
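On Linux, one way to see whether two processes share a mount namespace is to compare the symlink targets under `/proc/<pid>/ns/mnt`. The following is a minimal sketch of that check, not the procedure actually used during the investigation; the helper names and the `proc_root` parameter are assumptions introduced for illustration.

```python
import os


def namespace_id(pid, kind="mnt", proc_root="/proc"):
    """Return the namespace identifier for a process, e.g. 'mnt:[4026531840]'.

    Reading this link for a process owned by another user generally
    requires elevated privileges.
    """
    return os.readlink(os.path.join(proc_root, str(pid), "ns", kind))


def same_namespace(pid_a, pid_b, kind="mnt", proc_root="/proc"):
    """Return True if both processes are in the same namespace of this kind."""
    return namespace_id(pid_a, kind, proc_root) == namespace_id(pid_b, kind, proc_root)
```

If the process started by ExecStart and the one started by ExecStop report different `mnt` identifiers, mounts created by one (such as a bind-mounted network namespace) may not be visible to the other.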

After noting that several systemd settings like PrivateTmp cause systemd to create a mount namespace for ExecStart/ExecStop, we were able to resolve the issue by removing these settings.
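To spot such settings in a rendered unit file (for instance the output of `systemctl cat`), a small scanner along these lines could help. It is a sketch under stated assumptions: apart from PrivateTmp, which is the only setting named above, the directive list is an assumed and non-exhaustive selection from the systemd.exec documentation, and the function name is hypothetical. It does not account for drop-in overrides or directives that are reset later in the file.

```python
# Assumed, non-exhaustive list of systemd.exec directives that make systemd
# set up a private mount namespace for the service's processes.
MOUNT_NS_DIRECTIVES = {
    "PrivateTmp",
    "PrivateDevices",
    "ProtectSystem",
    "ProtectHome",
    "ReadOnlyPaths",
    "ReadWritePaths",
    "InaccessiblePaths",
}

# Values that leave a directive switched off (or empty).
DISABLED_VALUES = {"", "no", "false", "off", "0"}


def find_mount_ns_directives(unit_text):
    """Return (directive, value) pairs that are enabled in the unit text."""
    hits = []
    for raw in unit_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and section headers such as [Service].
        if not line or line.startswith(("#", ";", "[")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key in MOUNT_NS_DIRECTIVES and value.lower() not in DISABLED_VALUES:
            hits.append((key, value))
    return hits
```

Any directive the scanner reports is a candidate to review when a service's ExecStop needs to see mounts created during ExecStart.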

This is one of those obscure show-stopper bugs that occasionally show up and derail hobby projects like this. Dealing with similar bugs is what ultimately ended my 2022 rewrite. This time, though, I was able to resolve the issue. Although Claude wasn't able to debug the issue entirely on its own, its ability to implement the high-level strategies I thought up (like "can we get more information about this failing syscall") made solving this bug much, much easier than if I had been debugging by myself. This is where Claude really shone, and it is why, this time, I completed the migration that I'd failed at in the past.

Ultimately, I was able to fully implement everything I wanted to in the new codebase and I have now completely migrated mail.jbrot.com over to the new infrastructure!

If you want to take a look at the code Claude and I wrote, I've published it on GitHub. You should be able to spin up your own mail server with just a few clicks using it! Feel free to open an issue on the repo if anything doesn't work or is unclear, or share any thoughts directly by sending me an email.

Conclusion

Ultimately, working with Claude was a very positive experience—although I did have a bit of sticker shock when I needed to upgrade to the Max plan. With Claude I was able to successfully complete the project I started back in 2022 faster and to a higher quality than I ever expected.

Claude was definitely not a panacea, but it was like the world's best junior dev. Claude was able to execute quickly and effectively on every well-scoped task I presented, including things I knew were possible but didn't have enough knowledge to complete efficiently myself.

With nix, Claude really shone. In the past, my attempts to use nix have been mired in endless frustration. Whenever I'd try to do something new, I'd have to spend hours hunting for the magic incantation to achieve something that was obviously possible in theory but incredibly finicky in practice. With Claude, however, I encountered none of this: I would describe to Claude what I wanted to do and it would almost immediately generate the right code. And whenever there was a syntax issue, Claude was easily able to fix it.

I also really appreciated being able to throw away the code Claude created. When working with a real dev, throwing away a week's worth of work after an approach proves subpar is incredibly painful. Claude, however, is happy to generate and throw away code over and over, which makes it much easier to explore different possible solutions and find the one that works best.

I was also surprised at how fun using Claude was. I always thought that I loved writing code itself and that handing that over to AI would be soul-sucking. However, I found that it was actually invigorating! I was able to do many refactors that, when writing code out myself, I'd consider not worth the effort.

In conclusion, I was very pleasantly surprised with how useful and effective Claude was and I'm very excited to continue using AI to build new projects better and faster!