Martin MC Brown

Computing From the Front Lines

Amazon extends data handling

Over the festive break there were quite a few announcements related to Amazon's EC2 service.

The first is the news that Amazon's SimpleDB has gone through some changes in preparation for the full rollout. The service has moved out of private beta and into an unlimited public beta, giving access to SimpleDB to anybody who wants it. Amazon have opened up access to allow 1GB of storage and 25 hours of usage for free, and you can transfer as much data between EC2 and SimpleDB as you like.

Obviously this is all pretty useless unless you can use it, but the APIs are really cool. Data is stored in domains, and you can have 100 domains with 10GB of data each. Reading and writing data is flexible: information is stored as items, each carrying a simple set of named attributes. With no rigid structure for the data, you can create and retrieve information with varying levels of complexity and depth. Think XML, but without the complexity of the DTD and parsing behind it.
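To make the attribute model a little more concrete, here is a minimal in-memory sketch in Python. It is not the SimpleDB API: the `Domain` class, its `put`, `get` and `find` methods, and the sample items are all hypothetical names invented for illustration. The only behaviour it tries to mirror is that items in a domain need not share a structure, that an attribute can hold more than one value, and that values are kept as strings.

```python
class Domain:
    """Toy stand-in for a schemaless attribute store (hypothetical, not the SimpleDB API)."""

    def __init__(self, name):
        self.name = name
        # item name -> attribute name -> set of string values
        self._items = {}

    def put(self, item_name, **attributes):
        """Add values to an item; existing values for an attribute are kept."""
        item = self._items.setdefault(item_name, {})
        for attr, value in attributes.items():
            # Accept a single value or a collection of values.
            values = value if isinstance(value, (list, tuple, set)) else [value]
            item.setdefault(attr, set()).update(str(v) for v in values)

    def get(self, item_name):
        """Return an item's attributes as a dict of sorted value lists."""
        item = self._items.get(item_name, {})
        return {attr: sorted(vals) for attr, vals in item.items()}

    def find(self, attr, value):
        """Return the names of items whose attribute `attr` contains `value`."""
        return sorted(
            name for name, item in self._items.items()
            if str(value) in item.get(attr, set())
        )


# Two items in the same domain with different sets of attributes.
products = Domain("products")
products.put("item-001", title="Desk lamp", colour=["black", "white"])
products.put("item-002", title="Notebook", pages=120)
# Adding a value later does not require changing any schema.
products.put("item-001", colour="red")
```

Nothing here declares in advance which attributes an item may have, which is the point of the model: `item-001` has a multi-valued `colour` and `item-002` has `pages` instead.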

Ultimately they are planning a more traditional SQL-style interface, but for now most languages and applications work really well with the attribute model. The main limitation I can see is in heavily analytical environments, where the ability to pull out entire sequences, or collate them inside the database rather than in the client, would be a major convenience.

Related to this, Amazon have announced their Public Data Sets. It's hard to improve on the outline on that page, so:

"Public Data Sets on AWS provides a centralized repository of public data sets that can be seamlessly integrated into AWS cloud-based applications. AWS is hosting the public data sets at no charge for the community, and like all AWS services, users pay only for the compute and storage they use for their own applications. An initial list of data sets is already available, and more will be added soon."

The potential here is huge: massive sets of data that the public can query and examine to extract their own reports. For example, the public data sets include a variety of US census data. With access to this through EC2 you could create your own reports and cross-sections of information, using the free public data and the power of various EC2 instances, for relatively little outlay.

It's not clear how much information will be made available in the public data sets, but if they carry on in a similar vein I see a lot of potential for companies and interested individuals to produce some interesting analyses.

Finally, Amazon have added a new payment model to the S3 environment. Called Requester Pays, it means that you can store information within S3 and then have the person accessing it pay the cost of actually retrieving the information.

This is sort of related to the public data sets, except that you as a provider still pay to store the information, while the cost of requesting and transferring it falls on everybody who wants to use it. That sets up some more interesting potential.

Let's say that as a company I have some interesting statistical information on the sales of a particular product. I could make that available to other people so that they could create their own reports from that information, structured however they want. My only requirement is to upload the data and make it available.

This could have a huge impact on the business model of companies that normally format and produce reports from that data. Creating a differently styled report for every user that wants one could be a nightmare. But making the raw data available and allowing each user to create their own reports could considerably reduce the expense to me of providing the information. On really large data sets the savings could be enormous.
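A rough way to see where the savings come from is to split a month's bill between the data owner and the people downloading. The sketch below is only an illustration: the `monthly_costs` function is a hypothetical helper, the per-GB rates are made-up placeholders rather than Amazon's prices, and per-request charges are ignored entirely. It assumes the owner always pays for storage and that Requester Pays moves the transfer cost onto the requesters.

```python
def monthly_costs(stored_gb, downloads, gb_per_download,
                  storage_rate, transfer_rate, requester_pays):
    """Split one month's charges between the data owner and the requesters.

    Rates are cost per GB. All figures passed in are assumptions for
    illustration, not real pricing.
    """
    storage = stored_gb * storage_rate
    transfer = downloads * gb_per_download * transfer_rate
    if requester_pays:
        # Owner keeps the storage bill; requesters cover what they download.
        return {"owner": storage, "requesters": transfer}
    # Default model: the owner pays for storage and for every download.
    return {"owner": storage + transfer, "requesters": 0.0}


# Hypothetical scenario: 50GB of sales data, fetched in full 200 times a month.
scenario = dict(stored_gb=50, downloads=200, gb_per_download=50,
                storage_rate=0.10, transfer_rate=0.20)

owner_pays = monthly_costs(requester_pays=False, **scenario)
requester_pays = monthly_costs(requester_pays=True, **scenario)

print("Owner pays model:     ", owner_pays)
print("Requester Pays model: ", requester_pays)
```

With these placeholder numbers the owner's bill is dominated by transfer, not storage, so shifting transfer to the requesters removes most of the owner's cost, and the effect grows with the size of the data set and the number of people using it.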
