Analysis of outage posts in the NANOG and Outages mailing lists

Understanding the frequency and causes of network failures is a critical first step toward evaluating and improving the resiliency of the Internet. Unfortunately, rigorous macroscopic measurements of Internet resiliency are not presently available. To improve our understanding of network outages, we manually examined the posts made between Jan 2010 and Oct 2012 to the NANOG [1] and Outages [2] mailing lists, which network operators frequently use to report problems. We tried to analyze as many events as possible in which the posts clearly mentioned the words "outage", "hijacking", "route leak", "unreachable" or "down". We excluded events that turned out to be caused by a problem in the reporter's own network.

In total, we identified 361 outages and were able to classify the causes of 134 of them (about 37%). Based on this sample, the main causes of outages are: 1) fiber cuts, which account for 35.8% of the classified incidents, 2) hardware failures, which account for 12.7%, 3) route leaks or hijacking attacks, which account for 11.2%, 4) power outages, which account for 5.2%, and 5) denial-of-service attacks, which account for 4.5%.

This work was conducted with Mentari Djatmiko from NICTA, Australia. It is a preliminary study, and further work is required to assess the statistical significance of our findings.

[1] North American Network Operators' Group (NANOG) mailing list. http://www.nanog.org/mailinglist/
[2] Outages mailing list. http://puck.nether.net/mailman/listinfo/outages
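The posts in this study were examined manually. As an illustration only, the keyword-based selection described in the abstract could be approximated by a pre-filter such as the sketch below. The function names, the tolerance for a plural "s", and the case-insensitive matching are assumptions made for this example and are not part of the study's method.

```python
import re

# Keywords named in the abstract.
KEYWORDS = ("outage", "hijacking", "route leak", "unreachable", "down")

# Whole-word, case-insensitive match; an optional trailing "s" is tolerated
# (an assumption of this sketch, so that "outages" also matches).
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in KEYWORDS) + r")s?\b",
    re.IGNORECASE,
)


def matched_keywords(text):
    """Return the sorted, de-duplicated keywords found in a post."""
    return sorted({m.group(1).lower() for m in _PATTERN.finditer(text)})


def candidate_posts(posts):
    """Keep only posts that mention at least one keyword."""
    return [p for p in posts if matched_keywords(p)]
```

A filter of this kind only narrows the set of posts to read. Deciding whether a post describes a real outage, and what caused it, still requires the manual inspection the abstract describes.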



  • Xenofontas Dimitropoulos and Mentari Djatmiko