Lady Gaga breaks Amazon.com, and what you should learn from it

Lady Gaga just released her latest album. Say what you want about her music, but Lady Gaga is a genius when it comes to marketing. Rather than bellyache about how digital music distribution and piracy are decreasing revenues for the music industry, Gaga embraced the technology and took a new approach.

To launch her new album, the pop icon offered the entire album for just $0.99 on Amazon.com. The result -- record numbers of the album were sold. However, nobody planned for just how successful this idea would be, and as a result Amazon's networks couldn't handle the load created by eager fans all over the world trying to purchase and download the new tunes.

As a network engineer or system administrator, you should take a lesson or two from this. Most of us spend a lot of time planning for failures. We invest in disaster recovery and preparedness plans, we build fault tolerance and reliability into all of our designs, and we implement network management systems to detect and notify us of performance issues and outages. However, very few of us plan for copious amounts of success, and in many cases it's these successes that lead us to failure.

For example, I recently worked on a project to help drive additional web traffic and engagement to a web-based system. The goal was to get a lot more people to check out the site and click through some of the pages there. To help motivate people, we decided to give away some iPads. Guess what? We got so much new traffic that some of the systems crashed.

I see this all the time. Companies launch new intranet sites and don't think about how successful they might be, or do the necessary capacity planning, until it's too late. Marketing teams design new ad campaigns to drive folks into e-commerce sites, but nobody involves the IT teams. When the campaigns succeed, systems and networks start failing.

When it comes to avoiding situations like this, keep the following things in mind.

First, work to ensure that your IT department is well connected to the other parts of the company so that you can find out about these types of things before it's too late. This is where a good CIO can really come in handy.

Second, be sure that you know how much additional capacity your networks and systems have, and what will happen if that capacity is overloaded.
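To make "how much additional capacity" concrete, here is a minimal Python sketch of a headroom calculation. The function names and the example numbers are made up for illustration; the units (requests per second, Mbps, sessions) are whatever you already measure, as long as capacity and peak load use the same one.

```python
def headroom(capacity, peak_load):
    """Spare capacity as a fraction of total capacity (hypothetical helper)."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return (capacity - peak_load) / capacity


def max_surge_multiplier(capacity, peak_load):
    """How many times today's peak the system could absorb before saturating."""
    if peak_load <= 0:
        raise ValueError("peak_load must be positive")
    return capacity / peak_load


if __name__ == "__main__":
    # Illustrative numbers only: a link rated for 1000 units that peaks at 600.
    print(f"headroom: {headroom(1000, 600):.0%}")
    print(f"surge the system can absorb: {max_surge_multiplier(1000, 600):.2f}x")
```

If a planned promotion is expected to multiply traffic by more than the surge figure, that is the cue to talk to the people planning it before launch day.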

Third, think about what happened to me personally this week. I started doing CrossFit and hired a personal trainer. After the first session the trainer sat me down for a chat. She said that I did great, and that I definitely gave it my all, but that I might want to keep a little in reserve going forward. Sure enough, I should have listened to her, as I've been so sore I can't get up out of my chair ever since. I even had to cancel my next session. I didn't plan for success and that led to failure.

With your IT resources, be sure that you also keep a little in reserve. Ensure that, even if things are overwhelmed, your system updates, management protocols, and control mechanisms are able to function. This is why you always make your most important and latency-sensitive traffic QoS priority number 2, because priority number 1 is for the traffic that is required to keep the systems up and running.
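The priority idea can be sketched as a toy strict-priority queue in Python. This is only an illustration of the ordering, not how any particular router or vendor implements QoS; the class name and the packet labels are invented for the example.

```python
import heapq
from itertools import count


class StrictPriorityQueue:
    """Toy scheduler: lower priority number is always served first."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker keeps FIFO order within a priority

    def enqueue(self, priority, packet):
        heapq.heappush(self._heap, (priority, next(self._seq), packet))

    def dequeue(self):
        """Return the next packet to send, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


if __name__ == "__main__":
    q = StrictPriorityQueue()
    q.enqueue(3, "bulk-download")
    q.enqueue(2, "voice-call")       # most important business traffic
    q.enqueue(1, "routing-update")   # traffic that keeps the network itself running
    while (pkt := q.dequeue()) is not None:
        print(pkt)
```

Even though the routing update arrives last, it leaves first, so the control traffic still gets through when everything else is backed up.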

If you've been in a situation where unplanned success led to failure, please post a comment and share your story here. We can all learn from failures - our own as well as those of others in the community.  Amazon's leading the way.

Flame on...
Josh
Follow me on Twitter

Josh Stephens is Head Geek and VP of Technology at SolarWinds, an IT management software company based in Austin, Texas. He shares network management best practices on SolarWinds’ GeekSpeak and thwack. Follow Josh on Twitter @sw_headgeek and SolarWinds @solarwinds_inc.