LowEndBox - Cheap VPS, Hosting and Dedicated Server Deals

What Happens When Your Site's DNS Goes Down?

Hello!

As many of you know, I am delighted to enjoy the awesome services of Porkbun for domain registration and DNS resolution. Despite Porkbun’s great work, some of my domains experienced approximately 9.5 hours of DNS downtime on September 6, 2022, beginning at approximately 01:49 UTC.

As discussed below, the uptime monitoring service from Hetrix Tools provided helpful information for the analysis.

I am glad that the incident occurred, because it gave me opportunities to learn more about DNS, about incident reporting, and about DNS failure prevention.

The DDoS Attack

Porkbun’s DNS servers seem to have suffered a Distributed Denial of Service (DDoS) attack. The attack prevented normal connections by domain name to the MetalVPS website and to the MetalVPS servers, because their domain names could not be resolved. Connection via numerical IP address remained possible throughout for anyone who had the addresses directly available.

Besides MetalVPS, some, but not all, of my other domains at Porkbun were affected. In addition, both my MetalVPS and personal email services were down.

Start Time Notification

Hetrix Tools sent a notification email at 6:48 pm MST on September 5. The email reported that their website monitors in San Francisco, Singapore, and Warsaw timed out trying to connect to metalvps.com. The website monitor email reported, “Noticed at: 2022-09-06 01:48:58 (UTC+00:00).”

At 6:55 pm MST, Hetrix sent another email reporting timeouts for ping monitors in New York, San Francisco, Dallas, and Tokyo. The ping monitor email reported, “Noticed at: 2022-09-06 01:55:42 (UTC+00:00).”

MST is UTC-7, so the times agree: 6:48 pm MST on September 5 is 01:48 UTC on September 6. The incident started just before 7 pm on September 5, my local time.
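The conversion can be checked with a short Python sketch. It assumes MST is a fixed UTC-7 offset with no daylight saving, and the function name utc_to_mst is made up for this illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: MST is treated as a fixed UTC-7 offset (no daylight saving).
MST = timezone(timedelta(hours=-7), name="MST")


def utc_to_mst(utc_text: str) -> datetime:
    """Parse a 'YYYY-MM-DD HH:MM:SS' UTC timestamp and convert it to MST."""
    utc_time = datetime.strptime(utc_text, "%Y-%m-%d %H:%M:%S")
    return utc_time.replace(tzinfo=timezone.utc).astimezone(MST)


# "Noticed at" timestamps quoted in the two Hetrix Tools emails.
website_noticed = utc_to_mst("2022-09-06 01:48:58")
ping_noticed = utc_to_mst("2022-09-06 01:55:42")

print("website monitor:", website_noticed.strftime("%Y-%m-%d %H:%M"), "MST")
print("ping monitor:   ", ping_noticed.strftime("%Y-%m-%d %H:%M"), "MST")
```

Both results fall on the evening of September 5 local time, matching the arrival times of the emails.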

Here is a screenshot of Hetrix Tools’ graph of the Darkstar ping outage as seen at 10:53 am MST on September 6. Besides the red-colored downtime from the DNS outage, two short, yellow maintenance outages are also visible. The maintenance outages are kernel reboots, which happen because Darkstar runs Slackware64-current, a rolling release.

[Screenshot: Hetrix Tools graph of the Darkstar ping outage]

Effect On Website And Servers

While the DNS outage was in progress, nobody could reach the metalvps.com website by name with a web browser, because the DNS servers failed to answer requests to translate the domain name into the required numerical IP address. For the same reason, connecting to MetalVPS servers via ssh worked only with the numerical IP address, not with the domain name. Anybody who did not have the numerical IP address handy couldn’t connect to the website or to the servers.
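For readers who script their connections, here is a minimal Python sketch of that idea: try to resolve the name, and fall back to a recorded IP address if DNS fails. The function name resolve_or_fallback and its parameters are hypothetical, and the resolver argument exists only so the behavior can be exercised without a real outage.

```python
import socket


def resolve_or_fallback(hostname, fallback_ip=None, resolver=socket.getaddrinfo):
    """Return an IP address for hostname, or fallback_ip if DNS resolution fails."""
    try:
        results = resolver(hostname, None)
    except socket.gaierror:
        # DNS could not translate the name; use the recorded address, if any.
        return fallback_ip
    if not results:
        return fallback_ip
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return results[0][4][0]
```

Calling resolve_or_fallback("metalvps.com", fallback_ip=...) with an address you noted beforehand returns that address during an outage like this one. The fallback only helps if the address was written down ahead of time and has not changed since.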

Effect On Email

When the incident began, it was already past 5:00 pm Pacific Time, the end of the work day, when Porkbun’s support team can get some rest.

I waited about two and a half hours to see whether the issue would resolve. When it continued, I prepared an email to Porkbun’s awesome support.

[Screenshot: email reporting the DNS outage]

I was surprised that I couldn’t send the email. From my location in Mexico, I wrote the email in Migadu’s webmail interface. But I couldn’t send. Instead, I got the SocketReadException error 523. Error 523 means “origin unreachable,” so it seems like Migadu webmail was checking for and couldn’t verify my MX records because DNS was down.

It seems obvious that receiving emails would be impossible during a DNS outage. This is because the sending Message Transfer Agent (MTA) wouldn’t be able to look up the MX record naming the receiving mail server, which it needs before it can find that server’s IP address. But I was surprised that I couldn’t upload an email for sending and at least have the email queued. Of course, the email probably would have gone through if I had sent it from an email service independent of all my domains.
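The lookup chain a sending MTA depends on can be modeled with a toy Python sketch. This is not a real DNS client: the dictionaries stand in for DNS answers, the names and addresses are hypothetical documentation values, and real MX records also carry priorities that are ignored here.

```python
def find_mail_server_ip(domain, mx_records, address_records):
    """Toy model: follow domain -> MX host -> IP address, or return None."""
    mail_host = mx_records.get(domain)
    if mail_host is None:
        # No MX answer (for example, the domain's nameservers are unreachable).
        return None
    return address_records.get(mail_host)


# Hypothetical data standing in for DNS answers.
address_records = {"mx.example.net": "192.0.2.25"}
normal_mx = {"example.com": "mx.example.net"}
outage_mx = {}  # the domain's nameservers return nothing

print(find_mail_server_ip("example.com", normal_mx, address_records))
print(find_mail_server_ip("example.com", outage_mx, address_records))
```

In the outage case the first step already fails, so the sender never learns which server to contact, even though that server itself may be running normally.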

End Time Notification

After trying and failing to email, I decided to sleep. The next morning, I woke up to find Hetrix’s notifications that service was back up.

On September 6, 2022 at 4:24 am MST, Hetrix emailed that the MetalVPS.com website monitor had “Noticed at: 2022-09-06 11:24:52 (UTC+00:00)” that the site was back up. The downtime was reported in that email as “8 hr 59 min.”

One minute later, at 4:25 am MST, Hetrix emailed that Darkstar’s ping monitor was back up. Notification time was given as “2022-09-06 11:25:47 (UTC+00:00),” and downtime was reported as “9 hr 30 min.”

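The ping monitor’s figure matches its timestamps: 01:55:42 to 11:25:47 UTC is about 9 hr 30 min. The website monitor’s timestamps span about 9 hr 36 min, so its reported 8 hr 59 min differs by more than rounding would explain. The website monitor may measure downtime from a different starting point than its “Noticed at” time.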
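As a cross-check on the reported durations, the span between each monitor’s “Noticed at” timestamps can be computed directly. This Python sketch uses only the timestamps quoted in the emails; the helper name elapsed is made up for this illustration.

```python
from datetime import datetime

FORMAT = "%Y-%m-%d %H:%M:%S"


def elapsed(down_text, up_text):
    """Return the time between a 'down' and an 'up' notification timestamp."""
    return datetime.strptime(up_text, FORMAT) - datetime.strptime(down_text, FORMAT)


# "Noticed at" timestamps (UTC) from the down and up emails for each monitor.
website_elapsed = elapsed("2022-09-06 01:48:58", "2022-09-06 11:24:52")
ping_elapsed = elapsed("2022-09-06 01:55:42", "2022-09-06 11:25:47")

print("website monitor:", website_elapsed)
print("ping monitor:   ", ping_elapsed)
```

These are spans between notifications, which need not equal the downtime a monitor reports if it counts from a different moment.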

Talking With Porkbun

Delighted that the outage had ended, I wrote to Porkbun inquiring about what had happened. Porkbun replied with great news about their efforts to stop DDoS attacks:

We experienced a DDoS attack on our nameservers which affected a portion of our customers, your domains clearly included. Unfortunately, our systems did not scale up correctly to stave off the attack [ . . . ]

To clarify what you were seeing, we’ve been in the process of migrating our nameserver infrastructure to Cloudflare specifically to combat this kind of issue, [ . . . ] thus any domains you had that we haven’t automatically migrated over would have been affected. We expect to complete migration of all domains in the next couple of weeks, and do want to stress that there isn’t a need to do any manual changes at this time, nor should the migration have any ill effect on its own. And once we complete the migration, downtime due to attacks like this should be a thing of the past.

What worked well?

Fortunately, I had Hetrix monitoring set up by IP address as well as by hostname. This happened by accident, not forethought. More specifically, I had set Darkstar’s monitoring by hostname on IPv4 and by IP address on IPv6. The fact that the IPv6 monitor stayed up helped me figure out that the downtime was probably due to DNS resolution and not to some problem with Darkstar itself. That evidence was not conclusive, though: an IPv4 network outage with IPv6 still working would have looked the same.
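The reasoning behind comparing a hostname monitor with an IP address monitor can be written out as a small Python decision function. This is only a sketch of the logic; the function name diagnose and its wording are hypothetical and are not part of Hetrix Tools.

```python
def diagnose(up_by_hostname: bool, up_by_ip: bool, same_address_family: bool) -> str:
    """Suggest a likely cause from one hostname monitor and one IP monitor."""
    if up_by_hostname:
        return "no outage seen by hostname"
    if not up_by_ip:
        # Both checks fail: the host or network is suspect, DNS is not ruled out.
        return "host or network problem (DNS not ruled out)"
    if same_address_family:
        # Same protocol, only the name-based check fails.
        return "DNS resolution is the likely cause"
    # The monitors use different protocols (e.g. IPv4 vs IPv6), so a failure
    # of one protocol's network would look the same as a DNS failure.
    return "DNS, or a network problem on the hostname monitor's address family"


# The setup described above: hostname on IPv4 down, IP address on IPv6 up.
print(diagnose(up_by_hostname=False, up_by_ip=True, same_address_family=False))
```

Monitoring the same address family both by hostname and by IP address removes the ambiguity in the last case.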

Of course, working with Porkbun is always a pleasure! Everybody has problems. But working through problems with Porkbun is better than with anybody else I know. Porkbun always has been super nice to me when I have asked for help. They have more than earned our extra patience and consideration as they work to improve their defenses against DDoS attacks.

What needs improvement?

I changed

and

from private to public. I want to add IPv4 and IPv6 ping monitoring reports for MetalVPS.com.

Special Offer

Please remember: Porkbun remains great, even while services are down! Everybody has downtime once in a while, but very few, if any, can match the friendly kindness and great support which every Porkbun customer receives.

If you want to experience wonderful service and top-notch support (in my opinion the best available anywhere), please visit Porkbun and use the coupon code LOWENDBOX22 for $1 off one new registration, available for new and existing customers!

— @Not_Oles


