Years ago, when I first registered breakall.org, I used the now-defunct Microsoft Custom Domains to manage the domain and receive email at scott@breakall.org. This involved creating an MX record for breakall.org pointing to a Microsoft mail server. Later, when I decided to build my own email server, I created a new top-priority MX record for breakall.org pointing at my own server, but I left the Microsoft MX record as a secondary. Functionally, this meant that for the last four years, outlook.com has been a safety net for email that might otherwise be lost if it was sent while my server was down for any reason. Overall, I’ve had pretty good uptime, but there have been a few… incidents along the way that took it down for a while. So I would occasionally check the Microsoft inbox to see if any emails had fallen into the safety net, and there was rarely anything in there.
Another key element to this story is that in the continuous fight against spam, I learned about and enabled greylisting on my primary server. Greylisting works by responding to the first delivery attempt from an external SMTP server with a temporary failure, a 451 message: “Recipient address rejected: Greylisting in effect, please come back later”. Normally, the sending SMTP server will just wait a while (about 10 minutes is typical), then try again, at which point the message is accepted and processed as usual. This cuts down on a lot of spam because spam-sending servers don’t typically bother to try again. (A side effect is that incoming mail can be delayed while the external server waits to retry.)
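To make the mechanism concrete, here is a minimal sketch of the greylisting decision in Python. It is not how my mail server (or any particular one) implements it; the `Greylist` class, the 5-minute minimum delay, and the choice to key on the (client IP, sender, recipient) triplet are assumptions for illustration.

```python
import time


class Greylist:
    """Toy greylist: defer the first attempt, accept a sufficiently late retry."""

    def __init__(self, min_delay=300.0, clock=time.time):
        self.min_delay = min_delay  # assumed: seconds a sender must wait before retrying
        self.clock = clock          # injectable clock so the logic can be tested
        self.first_seen = {}        # triplet -> time of first delivery attempt

    def check(self, client_ip, sender, recipient):
        """Return an (SMTP code, text) pair for one delivery attempt."""
        key = (client_ip, sender, recipient)
        now = self.clock()
        first = self.first_seen.get(key)
        if first is None:
            # First contact: remember it and ask the sender to come back later.
            self.first_seen[key] = now
            return 451, "Greylisting in effect, please come back later"
        if now - first < self.min_delay:
            # Retried too quickly: still deferred.
            return 451, "Greylisting in effect, please come back later"
        # A patient retry: let the message through.
        return 250, "OK"
```

A sender that never comes back simply stays in `first_seen` without ever getting a 250, which is the whole point: the spam is never accepted.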
Over the last month, I’ve noticed more emails slipping into the Microsoft inbox. I hadn’t had any issues with uptime on my own email server (that I knew of anyway), and I was receiving a normal amount of email in the primary inbox, so I began to investigate.
What I found in mail.log on the primary server was that certain senders would try once, get deferred by the greylisting, and then just never try again.
Apparently, more “valid” mass mailers (e.g. mandrillapp, elexio) have recently changed their behavior: when the primary defers them, they check for a secondary MX record and immediately contact the secondary server to deliver the email, rather than retrying the primary after a delay.
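The sender-side behavior I think I was seeing can be sketched like this. It is only my guess at the logic, not anything taken from those mailers; the `deliver` function, the `try_host` callback, and the hostnames are hypothetical.

```python
def deliver(mx_records, try_host):
    """Try each MX host in order of preference (lowest number first).

    mx_records: list of (preference, hostname) tuples.
    try_host:   callable taking a hostname and returning an SMTP reply code.
    Returns the hostname that accepted the message, or None if every host
    deferred or refused it (in which case the sender has to retry later).
    """
    for _preference, host in sorted(mx_records):
        code = try_host(host)
        if 200 <= code < 300:
            return host  # accepted, so stop here
        # Deferred (4xx) or refused: fall through to the next MX, if any.
    return None


# Hypothetical setup: a greylisting primary and a secondary that accepts everything.
replies = {"mail.primary.example": 451, "mx.backup.example": 250}
with_secondary = [(10, "mail.primary.example"), (20, "mx.backup.example")]
without_secondary = [(10, "mail.primary.example")]

accepted_by = deliver(with_secondary, replies.get)
accepted_without = deliver(without_secondary, replies.get)
```

With both records in place, the greylisted message lands on the secondary straight away. With only the primary listed, nothing accepts it on the first pass, and the sender's only option is to come back later.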
So the fix was to remove the secondary MX record, leaving the senders no choice but to retry the primary in order to deliver the email.
Of course this means my safety net is gone.
The next day, my primary inbox already had some emails from senders whose mail had been showing up in the secondary, so the fix seems to have worked. Time will tell if some other unintended consequence presents itself.