Cascading failures, the downside of “Eat your own dogfood”

You may have heard that there was a pretty substantial outage in Amazon AWS cloud services on November 25, 2020. The summary of what happened can be found at https://aws.amazon.com/message/11201/ and is emblematic of how a simple maintenance action can lead to layer upon layer of unanticipated effects. Amazon is an amazing company, the AWS service …

Security Vulnerabilities: Discover, React, Deploy – in 72 Hours

There is a particularly nasty new vulnerability in a common piece of enterprise networking gear, from the industry leader in the class of device called a “load balancer” (https://www.wired.com/story/f5-big-ip-networking-vulnerability/). The vendor is a well-known, generally well-respected supplier, and reacted quickly and effectively to the exploit, producing a patch quickly and correctly advising …

Trials and Tribulations with serverless in AWS

So I needed to create a simple database backend with an API for a mobile app. It’s a small, simple database, with several classical join operations. Perfect for MySQL. It has an unusual characteristic: it’s only going to be used once or twice a month for a few days. It’s a problem management system for …

Getting enough network redundancy at all PSAPs in NG9-1-1

I often hear complaints that in many PSAPs, there isn’t any way to get enough redundancy of network connections to get a reliable ESInet. Often these are rural PSAPs, or larger ones where network diversity wasn’t even considered when siting the facility. I’m here to tell you that I think you CAN get a fair …

Analysis of CenturyLink Dec 2018 outage: Transport Operator/Supplier Diversity is Critical

The FCC has released its report on the December 27, 2018 outage at CenturyLink that affected 9-1-1 service. The problem was a packet storm in a management network that controlled a major part of CenturyLink’s optical network. The packets kept multiplying and congesting the system, and since it was in the management network, it was …