April 20, 2007

Perhaps They Should Have Tested More - Research In Motion

Some of the headlines:
  • RIM: 'Series' of Errors Caused Outage
  • Software glitch caused BlackBerry outage 
  • RIM traces BlackBerry outage to poorly tested software update
  • RIM chalks up blackout to "insufficient" testing
  • Software glitch KOd BlackBerry 
  • Software update brought down BlackBerry e-mail: RIM
  • Better optimisation caused RIM's Crackberries to crack
  • Blackberry outage exposes weakness for RIM
  • BlackBerry Shutdown Sends Danger Signal for RIM
  • BlackBerry Outage Caused By Untested New Feature

Last Tuesday night, millions of BlackBerry users in the Western Hemisphere experienced a twelve-hour service outage, leaving them without wireless e-mail access.

In a statement issued on Thursday, April 19th, Research In Motion wrote:
"RIM has determined that the incident was triggered by the introduction of a new, non-critical system routine that was designed to provide better optimization of the system's cache. The system routine was expected to be non-impacting with respect to the real-time operation of the BlackBerry infrastructure, but the pre-testing of the system routine proved to be insufficient.

The new system routine produced an unexpected impact and triggered a compounding series of interaction errors between the system's operational database and cache."

Some analysts believe that RIM's testing procedures were inadequate.
"Jack Gold, an analyst at J. Gold Associates LLC, said RIM "sure could have" kept customers better informed this week. "Nothing's worse than not knowing what's happening when a problem occurs."

Gold said he thought that the final explanation was a "bit confusing," adding, "It sounds like they tried to run something on their production servers without it being totally ready.... This is something they should be better at. Playing with software on a production environment is always risky. This time it bit them.""

The failure may lead some to consider alternatives. Already, some enterprise users are questioning the centralized network architecture of the BlackBerry:
"BlackBerry is essentially an outsourced environment. Whenever there's an outage, this determination has to go on whether it's us or them. Once it leaves our site, we have no insight as to what goes on in their network."

Many companies and individuals have become dependent on the wireless e-mail service, to the point that the devices are often called "Crackberrys".
"The BlackBerry blackout was grueling to many — and revealed just how professionally and emotionally dependent so many people had become on their pocket-size electronic lifelines."

"After a massive outage of BlackBerry mobile devices across the globe Wednesday prevented millions from checking the e-mail on the go, workers and technology junkies faced an unsettling truth.

"This outage was more than just a little inconvenient. This was on par with (saying) my computer was down or my phone was down," said Michael Gartenberg, an analyst with Jupiter Research. "People couldn't work the way they were accustomed to working.""

[Update 09/08/07]

Another "software glitch".  Another outage.