I use Amazon Web Services here at John Carroll every day. We serve media from it, use it as a content delivery network, and store backups there, which I’ve blogged about here.

Every night, we back up our web files and databases and store them in Amazon S3. We keep a few days’ worth of backups there, and it’s proven itself as an easy way to get files when we’ve needed them.

As a marketing and communications shop, we also need to store large video, photo and audio projects once they’re completed. Storage costs can get very high very quickly for large projects.

Today, Amazon announced a new storage and archiving product, called Glacier. From Amazon:

Amazon Glacier is an extremely low-cost storage service that provides secure and durable storage for data archiving and backup. In order to keep costs low, Amazon Glacier is optimized for data that is infrequently accessed and for which retrieval times of several hours are suitable. With Amazon Glacier, customers can reliably store large or small amounts of data for as little as $0.01 per gigabyte per month, a significant savings compared to on-premises solutions.

$0.01 per GB per month? That’s crazy. Compared to the regular price of $0.125/GB per month for standard S3 storage, there’s potential for serious cost savings there.
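To put rough numbers on that, here is a minimal sketch of the comparison. The two prices are the ones quoted in this post; the 500 GB archive size is a made-up figure for illustration, and the sketch ignores retrieval and transfer fees.

```python
from decimal import Decimal

# Per-GB monthly prices as quoted in this post; check current pricing before relying on them.
S3_PRICE_PER_GB = Decimal("0.125")
GLACIER_PRICE_PER_GB = Decimal("0.01")


def monthly_cost(size_gb, price_per_gb):
    """Return the monthly storage cost in dollars for size_gb gigabytes."""
    return Decimal(size_gb) * price_per_gb


# Hypothetical archive size, chosen only for illustration.
archive_gb = 500
s3_cost = monthly_cost(archive_gb, S3_PRICE_PER_GB)
glacier_cost = monthly_cost(archive_gb, GLACIER_PRICE_PER_GB)

print(f"S3: ${s3_cost:.2f}/month, Glacier: ${glacier_cost:.2f}/month")
```

At those prices, the same 500 GB costs $62.50 a month in S3 and $5.00 a month in Glacier.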

Something users need to be aware of is that, unlike S3’s immediate access, Glacier needs to go find your files, and it may take several hours to get your whole project back. For large projects that you don’t need immediate access to, it sounds like a good service.

You can add data in a programmatic way using the Glacier API, or their import/export service where you can physically send a hard drive to Amazon for them to ingest. While it’s not live yet, I’m interested in the promised ability to set a life-cycle on data stored in S3 and having it automatically migrate to Glacier when a certain amount of time has passed.
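Since the life-cycle feature isn’t live as I write this, the following is only a sketch of what such a rule could look like. It assumes the dictionary shape that the boto3 library’s `put_bucket_lifecycle_configuration` call accepts; the `video-projects/` prefix, the 90-day window and the helper name are all hypothetical. The code only builds and prints the rule, it doesn’t talk to AWS.

```python
import json


def glacier_lifecycle_rule(prefix, days):
    """Build a rule that moves objects under `prefix` to Glacier after `days` days."""
    return {
        "ID": f"archive-{prefix.strip('/')}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
    }


# Hypothetical prefix and age threshold, for illustration only.
config = {"Rules": [glacier_lifecycle_rule("video-projects/", 90)]}
print(json.dumps(config, indent=2))
```

With boto3 installed and credentials configured, a dictionary like this would be passed as the `LifecycleConfiguration` argument, along with a `Bucket` name.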

More about Glacier is available on the Amazon Web Services site.

I’m sure you’ve either read about the Amazon Web Services outage over the weekend or visited a site that uses their infrastructure, such as Quora or Foursquare.

One part of their servers-on-demand product had issues – specifically their Elastic Block Store (EBS) product in one of their availability zones. Many servers use it for persistent storage, something the AWS EC2 product doesn’t offer by default. With these volumes being flaky, throwing errors or being offline, many sites were in trouble.

The services that we use the most here at John Carroll, the Simple Storage Service (S3) and the CloudFront content delivery network, were not affected, thankfully, so I could enjoy the holiday weekend. I would have liked to play some online games on my PS3, but as you’ll see below, that too was offline.

So what are some takeaways I see coming out of this outage?

First, don’t put all your eggs in one basket. SmugMug CEO Don MacAskill posted a very detailed blog post about the Amazon outage and how and why his company’s servers there weren’t affected. He says:

All of our services in AWS are spread across multiple Availability Zones (AZs). We’d use 4 if we could, but one of our AZs is capacity constrained, so we’re mostly spread across three. (I say “one of our” because your “us-east-1b” is likely different from my “us-east-1b” – every customer is assigned to different AZs and the names don’t match up). When one AZ has a hiccup, we simple use the other AZs. Often this is a graceful, but there can be hiccups – there are certainly tradeoffs.
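The idea in that quote can be sketched in a few lines. This is not SmugMug’s actual setup, just a toy illustration of spreading servers round-robin across zones and reassigning them when one zone is unhealthy; the zone labels and server names are hypothetical.

```python
from itertools import cycle


def assign_servers(servers, zones, unhealthy=()):
    """Spread servers round-robin across the zones that are currently healthy."""
    healthy = [zone for zone in zones if zone not in unhealthy]
    if not healthy:
        raise RuntimeError("no healthy availability zones left")
    return dict(zip(servers, cycle(healthy)))


zones = ["zone-a", "zone-b", "zone-c"]       # hypothetical zone labels
servers = ["web1", "web2", "web3", "web4"]   # hypothetical server names

normal = assign_servers(servers, zones)
degraded = assign_servers(servers, zones, unhealthy={"zone-b"})

print(normal)
print(degraded)
```

When `zone-b` has a hiccup, every server lands in one of the two remaining zones instead of the site going dark.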

Second, if you are going to leverage the cloud for services, and you should, you must have a backup plan or set of protocols for what to do if it hits the fan.

For example, if S3 did go down, our WordPress CMS would be affected, as we store user-uploaded assets in S3. To remedy that, we keep a local copy on our server, so our assets stay available to our site visitors. If S3 goes down, we can make a change to a plugin configuration and our assets will still be available. When S3 comes back online, we’d flip the switch and go back to serving things from the cloud.
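Our switch lives in a WordPress plugin setting, but the logic behind it is simple enough to sketch. The base URLs and the `asset_url` helper below are hypothetical stand-ins, not our real configuration.

```python
from urllib.parse import urljoin

# Hypothetical base URLs; substitute your own bucket and upload path.
S3_BASE = "https://example-bucket.s3.amazonaws.com/uploads/"
LOCAL_BASE = "https://www.example.edu/wp-content/uploads/"


def asset_url(path, use_s3=True):
    """Return the public URL for an uploaded asset, from S3 or the local copy."""
    base = S3_BASE if use_s3 else LOCAL_BASE
    return urljoin(base, path.lstrip("/"))


# Normal operation: serve from the cloud.
print(asset_url("2011/04/logo.png"))
# During an S3 outage: flip the switch and serve the local copy.
print(asset_url("2011/04/logo.png", use_s3=False))
```

This only works because the same relative path exists in both places, which is why we keep the local copy in sync.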

Third, have a communication plan ready and keep users updated during the day.

The only place I found official information on the outage was the AWS Service Health Dashboard, which is fine; that’s where it should be. In addition, many sites (Quora and Reddit come to mind) put up their own pages saying they were being affected by the outage.

If you have a blog, use it. Same goes for Twitter and Facebook. Amazon, even though the info was hidden, was good with updating exactly what was going on and where they were in the process of getting services back online. For example:

Apr 24, 5:05 AM PDT: As detailed in previous updates, the vast majority of affected EBS volumes have been restored by this point, and we are working through a more time-consuming recovery process for remaining volumes. We have made steady progress on this front over the past few hours. If your volume is among those recently recovered, it should be accessible and usable without additional action.

Good information that’s updated often is important to help keep customers in the loop. Compare that to Sony, whose PlayStation Network has been offline since last Wednesday. Their updates have been nebulous, at best. On April 21, they posted on their official blog:

While we are investigating the cause of the Network outage, we wanted to alert you that it may be a full day or two before we’re able to get the service completely back up and running.

The last update given by the company, on April 23, said this:

We sincerely regret that PlayStation Network and Qriocity services have been suspended, and we are working around the clock to bring them both back online. Our efforts to resolve this matter involve re-building our system to further strengthen our network infrastructure. Though this task is time-consuming, we decided it was worth the time necessary to provide the system with additional security.

We thank you for your patience to date and ask for a little more while we move towards completion of this project. We will continue to give you updates as they become available.

And then, silence. It’s now Monday morning in the US and the service is not online and the current status/ETA for being online hasn’t been updated since Saturday. IGN has more on Sony’s PR response to this outage.

That type of communication wouldn’t work on our campuses. Part of your planning must be a communications plan for who is responsible for keeping a certain audience up to date on the status of services.

My colleagues at Allegheny are doing it right this morning. They had a power outage over the weekend and took to their intranet to update the campus community, on a Sunday.

Am I going to stop using Amazon’s cloud services over this outage? No, definitely not. Is this going to make Amazon improve the service? Yes. Is this a sucky way to do it? Of course.

I’ll be updating this post with feedback from other higher ed web and marketing folks. Andrew Careaga has some interesting thoughts on the outage looking at it through a lens of education.

I’ve been getting a lot of questions lately from people who are interested in using the cloud at their institutions but aren’t sure what sort of things they could be doing there. I’ve been thinking about this, and here are a few ways you can integrate Amazon’s S3 or Rackspace’s CloudFiles product into your web workflow.

1. Videos and Podcasts

Putting your videos and podcast audio in the cloud is a no-brainer. In fact, it’s the first thing we did in S3 a few years ago. Not every video we produce is meant for YouTube, and sometimes you want a really nice, high-definition embed on your site. That’s what services like Amazon’s S3 were made for.

Why? If you’ve got a large number of video files and audio files living on the same server as your college’s website, you could be potentially taking away cycles and bandwidth from your site. What happens when one of your videos goes viral? It has the potential to slow down your site and negatively impact the experience of your site’s visitors who are there to find out more about you.

Here’s an example of a video we produced that we hosted in S3. The HD video downloads quickly and it put no strain on my campus server.

2. WordPress Media

I’m a huge fan of WordPress. We’re rolling it out to departments across our campus. They will want to upload PDF files, images, and lots of other content to go along with their blog posts and pages. That content can pile up pretty quickly, so why not put it in the cloud? You can do it manually or use the Amazon S3 Plugin for WordPress, which will allow you to upload media to S3 and have it displayed in your blog. It will also create thumbnails of your images and upload them as well.

I’ve been using this plugin on this blog for a while and it’s worked really well.

3. PDF Files

PDF files can get big and if your site is like mine, they are spread out all over the place. Our PDF files range in size from 50k for forms generated from Word to 11 or 12 MB for our athletics media guides.

I’ve been trying to put all our PDF files in a central spot in our S3 account so they are easy to find and update when needed, and so they stay available during peak times of use, such as right now, when students are downloading and completing forms before coming back to campus in a few weeks.

4. CSS and JavaScript

Since your site’s CSS and JavaScript files will get cached after the first visit, why not serve them from S3? It’s fast, seamless to the end user, and since they are cached, you’ll barely notice they aren’t coming from your web server.
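How long browsers hold on to those files depends on the headers stored with each object, so it’s worth setting them when you upload. Here’s a minimal sketch of picking those headers; the one-year lifetime, the file name and the `static_asset_headers` helper are assumptions for illustration, and the code doesn’t perform the upload itself.

```python
import mimetypes

ONE_YEAR = 365 * 24 * 60 * 60  # cache lifetime in seconds (an assumed value)


def static_asset_headers(filename, max_age=ONE_YEAR):
    """Return HTTP headers to store alongside a static file."""
    content_type, _ = mimetypes.guess_type(filename)
    return {
        "Content-Type": content_type or "application/octet-stream",
        "Cache-Control": f"public, max-age={max_age}",
    }


headers = static_asset_headers("site.css")
print(headers)
```

A long lifetime like this works best when file names change with each release (for example `site.v2.css`), so visitors aren’t stuck with a stale stylesheet.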

5. Images

Maybe S3 isn’t the perfect spot from which to host every image on your website, but it’s a great spot to host galleries and large hi-res versions of your photos, or to serve as a backup spot for your image collection. I keep a bunch of critical PSD files in S3 so that they’re safe if my hard drives and other backups fail. Obsessive? Maybe, but having lost critical data in a hard drive crash a few years ago, I’m much more obsessive about backups and having redundant copies of things.

So there you go – five easy, quick things you can start to do in the cloud file storage platform of your choice, be it Amazon or Rackspace.