This is the latest update (original at the bottom):
Subject: [#MAU-767967]: Network outages – still happening
As a follow-up, I would like to share with you a postmortem of the
routing event you experienced this morning.
At about 6:30AM we received a preliminary alarm of a failure on one of
our customer gateway routers. The event was logged and the system was
enabled for failover to the redundant processing card on that gateway
router. A short time later, the routing fabric of the gateway failed
and caused a disruption in network service. Our technician brought the
router back up and ran a diagnostic, which ran clear. At that time, the
system was brought back online and routing resumed to your equipment.
After running stably for about 30 minutes, the system and its backup
failed again. Senior technicians implemented a hard-failure plan and
brought up a “warm spare” gateway router and began loading the
configuration. Network service was restored individually to each
customer affected by the outage as the configuration was checked in and
loaded.
I should note that while you are one of several customers that were
affected by this incident, this was not a full-scale routing outage.
Our network architecture makes extensive use of sandboxing in order to
not put all eggs in one basket. Nevertheless, I understand that while
not everybody was affected, YOU were affected, and I do apologize for
the incident.
At this time, our warm spare has been placed into production and is
functioning normally. We have brought in a cold spare and activated it
into standby so that redundancy is still in place. We anticipate that
we will replace the affected gateway router with new hardware. When
that happens, we will migrate your routing onto the new system, place
the warm spare back into standby and pull power from the cold spare.
All this will be seamless and will go unnoticed from a connectivity
standpoint.
Please be assured that your VIRTBIZ team will continue to review this
incident so that we can further improve our service to you.
I hope that you have a pleasant remainder of your weekend.
Original posts and updates:
We are experiencing a network issue between Cogent and VIRTBIZ this morning. It first occurred shortly after 6am and lasted approximately 2 minutes. By the time I started to look into it and contact the DC, it was back up.
It is occurring again now. This time I was already on the servers looking into a spam report from a VPS account. Checks showed the link between Cogent and VIRTBIZ to be down. It came back up to the point where I could send VIRTBIZ a message (I'm sure they already knew, but just in case…), and I am waiting to hear back. It came back up at 7:50. Actually, the VIRTBIZ support site came up about 10 minutes before that.
At 8:01, it’s out again… Still waiting on an answer from VB…
As you can see from the traceroute below, it’s having a problem finding a route.
[root@gt24-1 ~]# traceroute support.virtbiz.com
traceroute to support.virtbiz.com (208.77.216.244), 30 hops max, 40 byte packets
1 208.75.228.193 (208.75.228.193) 0.482 ms 0.809 ms 0.952 ms
2 tulip-core-2-ge3-8.tshost.com (208.75.224.5) 0.326 ms 0.318 ms 0.335 ms
3 core-1-gi7-2.tshost.com (208.75.224.13) 0.314 ms 0.344 ms 0.380 ms
4 te8-3.mpd01.atl01.atlas.cogentco.com (38.104.182.45) 0.310 ms 0.274 ms 0.287 ms
5 te0-0-0-6.mpd21.iah01.atlas.cogentco.com (154.54.28.254) 14.704 ms 14.816 ms te0-2-0-1.mpd21.iah01.atlas.cogentco.com (154.54.2.146) 14.697 ms
6 te2-1.mpd01.dfw01.atlas.cogentco.com (154.54.5.133) 20.387 ms 20.471 ms te3-4.mpd01.dfw01.atlas.cogentco.com (154.54.25.93) 20.560 ms
7 vl3834.na01.b000868-0.dfw01.atlas.cogentco.com (38.112.35.54) 21.105 ms 21.519 ms vl3534.na01.b000868-0.dfw01.atlas.cogentco.com (66.250.13.178) 20.402 ms
8 38.107.227.210 (38.107.227.210) 20.385 ms 20.646 ms 20.818 ms
9 * * *
10 * * *
11 * * *
12 * * *
13 * * *
14 * * *
15 ge8-0.brdr2.dal1.virtbiz.com (64.125.196.45) 25.759 ms 25.751 ms 25.755 ms
16 * * *
17 ge8-0.brdr2.dal1.virtbiz.com (64.125.196.45) 25.950 ms 25.892 ms 25.902 ms
18 * * *
19 ge8-0.brdr2.dal1.virtbiz.com (64.125.196.45) 26.025 ms 25.996 ms 26.005 ms
20 * * *
21 ge8-0.brdr2.dal1.virtbiz.com (64.125.196.45) 26.172 ms 26.126 ms 26.668 ms
22 * * *
23 ge8-0.brdr2.dal1.virtbiz.com (64.125.196.45) 26.741 ms 26.330 ms 26.374 ms
24 * * *
25 * * *
26 * * *
27 * * *
28 * * *
29 * * *
30 * * *
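For anyone who wants to summarize a trace like the one above without reading it line by line, here is a minimal Python sketch. It assumes the classic Linux traceroute text layout shown above (hop number first, responding IP in parentheses, `* * *` for lost probes). The function name `summarize_traceroute` and the sample variable are made up for illustration. A router answering at several different hop numbers, as ge8-0.brdr2.dal1.virtbiz.com does above, is consistent with packets bouncing around rather than reaching the destination, but on its own it does not prove a routing loop.

```python
import re

# Matches an IPv4 address in parentheses, as traceroute prints it after a hostname.
IP_IN_PARENS = re.compile(r"\((\d{1,3}(?:\.\d{1,3}){3})\)")

def summarize_traceroute(text):
    """Return (timed_out_hops, repeated_ips) for classic traceroute text output.

    Hypothetical helper for illustration; not part of any traceroute tool.
    """
    timed_out = []    # hop numbers where every probe was lost ("* * *")
    hops_by_ip = {}   # responding IP -> hop numbers at which it answered
    for line in text.splitlines():
        parts = line.split()
        if not parts or not parts[0].isdigit():
            continue  # skip the shell prompt, the header line, and blanks
        hop = int(parts[0])
        ips = IP_IN_PARENS.findall(line)
        if not ips and all(p == "*" for p in parts[1:]):
            timed_out.append(hop)
        for ip in set(ips):
            hops_by_ip.setdefault(ip, []).append(hop)
    # Keep only the IPs that answered at more than one hop number.
    repeated = {ip: hops for ip, hops in hops_by_ip.items() if len(hops) > 1}
    return timed_out, repeated

# A few lines copied from the trace above.
sample = """\
traceroute to support.virtbiz.com (208.77.216.244), 30 hops max, 40 byte packets
8 38.107.227.210 (38.107.227.210) 20.385 ms 20.646 ms 20.818 ms
9 * * *
15 ge8-0.brdr2.dal1.virtbiz.com (64.125.196.45) 25.759 ms 25.751 ms 25.755 ms
16 * * *
17 ge8-0.brdr2.dal1.virtbiz.com (64.125.196.45) 25.950 ms 25.892 ms 25.902 ms
"""

timed_out, repeated = summarize_traceroute(sample)
print(timed_out)  # [9, 16]
print(repeated)   # {'64.125.196.45': [15, 17]}
```

Run against the full trace, this would list the silent hops and show 64.125.196.45 answering at hops 15, 17, 19, 21 and 23, with the destination never reached within the 30-hop limit.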
As of 9am, we are back up at the moment… Just received this from VB:
We have been having an issue with one of our routers. Our technicians are correcting the issue and services should be fully restored shortly.
I’ll close this ticket now. If we can be of further service, just respond to this message.
Thank you
Jack B. – VIRTBIZ Internet Support