Based on this great blog post by Tim McCormack, I managed to write some scripts that back up files to Amazon S3. The files are encrypted with GnuPG and uploaded to S3 with duplicity, a Python-based tool that uses the rsync algorithm to produce space-efficient incremental backups.
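For the curious, the heart of such a script boils down to a single duplicity invocation. Here is a minimal sketch in Python; the bucket name, source path, GnuPG key ID, and credential values are placeholders, not the actual details of my setup:

```python
#!/usr/bin/env python
"""Nightly duplicity backup to S3 -- a minimal sketch.

Assumes duplicity and GnuPG are installed. duplicity reads the GnuPG
passphrase and AWS credentials from environment variables.
"""
import os
import subprocess

env = os.environ.copy()
env["AWS_ACCESS_KEY_ID"] = "YOUR_ACCESS_KEY"      # placeholder
env["AWS_SECRET_ACCESS_KEY"] = "YOUR_SECRET_KEY"  # placeholder
env["PASSPHRASE"] = "your-gpg-passphrase"         # placeholder

# Incremental, GnuPG-encrypted backup of a local directory to a
# (hypothetical) S3 bucket. duplicity only uploads what has changed.
subprocess.check_call(
    [
        "duplicity",
        "--encrypt-key", "DEADBEEF",  # hypothetical GnuPG key ID
        "/home/me/documents",
        "s3+http://my-backup-bucket/documents",
    ],
    env=env,
)
```

Run it again the next night and duplicity uploads only an encrypted delta, which is what makes S3 a cheap backup target.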
Amazon Web Services today announced a new AWS Import/Export feature, a potentially huge step forward for data portability when using the Amazon cloud computing infrastructure.
Amazon Elastic Block Store (EBS) enables a single Amazon EC2 instance to attach one or more highly available, highly reliable storage volumes of up to 1 TB each. Once attached, applications on the instance can read from and write to the Amazon EBS volume much like a local disk drive. With Amazon EBS, an Amazon EC2 instance can now be terminated without losing the data that resides on the volume. One common use case is running a relational database inside an Amazon EC2 instance while keeping its data on an Amazon EBS volume.
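To make the create-and-attach workflow concrete, here is a minimal sketch using the Python boto library; the availability zone, volume size, and instance ID are placeholders:

```python
"""Create and attach an EBS volume with boto -- a minimal sketch.

Assumes boto is installed and AWS credentials are available in the
environment; the zone and instance ID below are hypothetical.
"""
import time
import boto

conn = boto.connect_ec2()  # picks up AWS keys from the environment

# Create a 100 GB volume in the same availability zone as the instance.
volume = conn.create_volume(100, "us-east-1a")

# Wait until the new volume is ready to be attached.
while volume.status != "available":
    time.sleep(5)
    volume.update()

# Attach it to a (hypothetical) running instance as /dev/sdf; inside
# the instance it can then be formatted and mounted like any disk.
conn.attach_volume(volume.id, "i-12345678", "/dev/sdf")
```

Because the volume outlives the instance, you can later detach it and attach it to a fresh instance, which is exactly the property the database use case relies on.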
On Amazon EC2 you can run many of the proven IBM platform technologies with which you’re already familiar, including IBM DB2, IBM Informix, IBM Lotus Forms Turbo, IBM Lotus Web Content Management, IBM Mashup Center, IBM WebSphere Application Server, IBM WebSphere sMash, IBM WebSphere Portal Server, WebSphere eXtreme, and InfoSphere DataStage/QualityStage with its corresponding Windows client.
Amazon Elastic MapReduce is a web service that enables businesses, researchers, data analysts, and developers to easily and cost-effectively process vast amounts of data. It utilizes a hosted Hadoop framework running on the web-scale infrastructure of Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3).
Using Amazon Elastic MapReduce, you can instantly provision as much or as little capacity as you like to perform data-intensive tasks for applications such as web indexing, data mining, log file analysis, machine learning, financial analysis, scientific simulation, and bioinformatics research. Amazon Elastic MapReduce lets you focus on crunching or analyzing your data without having to worry about the time-consuming set-up, management, or tuning of Hadoop clusters or the compute capacity on which they sit.
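To give a feel for how little setup is involved, here is a minimal sketch that launches a job flow through the Python boto library; the bucket paths, script names, and cluster size are hypothetical:

```python
"""Launch an Elastic MapReduce job flow with boto -- a minimal sketch.

Assumes boto (with its EMR module) is installed and AWS credentials
are available in the environment; all S3 paths are placeholders.
"""
from boto.emr.connection import EmrConnection
from boto.emr.step import StreamingStep

conn = EmrConnection()  # picks up AWS keys from the environment

# A Hadoop Streaming step: the mapper and reducer scripts, the input
# logs, and the output location all live in (hypothetical) S3 buckets.
step = StreamingStep(
    name="Log analysis",
    mapper="s3n://my-bucket/scripts/mapper.py",
    reducer="s3n://my-bucket/scripts/reducer.py",
    input="s3n://my-bucket/logs/",
    output="s3n://my-bucket/output/",
)

# Provision a small Hadoop cluster, run the step, and write the
# job flow's logs back to S3.
jobflow_id = conn.run_jobflow(
    name="My first job flow",
    log_uri="s3n://my-bucket/jobflow-logs",
    steps=[step],
    num_instances=4,
)
print(jobflow_id)
```

Everything else, from provisioning the EC2 instances to configuring Hadoop on them, is handled by the service; you pay only while the job flow runs.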