Wednesday, April 14, 2010

AWS for Startups: what's up there in the Cloud

Sounds great, doesn't it - computing services that don't need any infrastructure on your end, almost instantaneous delivery of whatever server and storage capacity you need, and a suite of just-right software, middleware, virtualization, security, and management tools? The "cloud" label is placed on almost every Internet-based application, adding to the large crowd of cloud computing vendors.

In a previous article, we talked about new and established players offering cloud services. This one is about cloud computing pioneer Amazon and the AWS Start-Up Event in Silicon Valley (hashtag: #awsstartup_sv). See also presentation slides from their 2009, 2008 and 2007 start-up events on

The April 14th event was held at the Plug and Play Tech Center.

Dr. Werner Vogels, Amazon CTO, gave an overview of cloud computing. Here are a few notes taken from his talk:

Design for automation is important, but things that are out of your control should not be automated. Examples are human interfaces and delete operations.
Every task should be decomposed into simpler pieces.
An ordering pipeline, for example, consists of cart - order - process - store - archive components that need storage and other functionality. If you decompose the pipeline into individual pieces, you will find that most of that storage is needed for particular tasks.
The good thing about key-value stores is that we know how to scale them. We can't do it with relational databases.
Database administration can be monotonous, stressful and expensive. It should be more automated and scalable. We need DBA as a service.
Developers should decide how to implement services to obey regulations, failure jurisdictions, break transparency, avoid performance failures.
Need evolution, not revolution.
A Virtual Private Cloud with secure VPN connection; a number of devices to have traffic routed through - such as spam, etc
Design with Security in Mind. For example, anonymous access time limit; other limits defined in scripts
Let your customer benefit.
During the past 2 years Amazon reduced pricing 6 times. Amount of bandwidth increased by orders of magnitude more than Amazon itself consumes (see this older slide showing the history of usage growth).
Amazon customers built great applications on top of AWS.
E.g., Cloudmmo - instance cloud, or cloud middleware to reduce infrastructure costs, technology risk and development time
Innovate for your customers
For example, innovate on cloud pricing models: on-demand instances; reserved instances, spot instances
Stax - elastic cloud app for J2EE
Heroku - Ruby platform; maps Ruby apps to Amazon instances; SQL DB toolkit to build new cloud-based applications
Tibco's software helps to innovate by connecting applications and data in a service-oriented architecture, providing intelligence tools to make smarter decisions. They have templates that can be launched to cloud.
RightScale® provides a fully automated management platform for Amazon EC2 cloud deployments.
Synteractive - consulting and automation solutions provider, incl. CloudAdvantage

Also Netflix, Pfizer, eHarmony, Lilly, Malbec..

Pfizer uses EC2; eHarmony uses MapReduce to find better matches.
Playfish - a fast growing social games company

Intuit - massive spiked testing with SOASTA's cloud test
Everyone files taxes at the last possible moment, so Intuit needed AWS to scale
Lots of companies use clouds for testing and simulations. Intuit simulates submitting tax forms -- if too many at a time
Ribbit; SimpleGeo; Siemens; Twilio
Dr. Vogels' contact is werner at amazon, or find him on Twitter: @Werner
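A side note on the ordering pipeline from Dr. Vogels' talk: below is a minimal sketch of the idea, assuming each stage (cart, order, process, store, archive) keeps its own small key-value store keyed by an order ID. Only the stage names come from the talk; the class, the function and the sample order are hypothetical, and plain dictionaries stand in for a real key-value service.

```python
from dataclasses import dataclass, field

# Stage names come from the talk notes; everything else is illustrative.
STAGES = ["cart", "order", "process", "store", "archive"]


@dataclass
class Stage:
    """One pipeline step with its own private key-value store."""
    name: str
    kv: dict = field(default_factory=dict)  # stand-in for a key-value service

    def handle(self, order_id: str, payload: dict) -> dict:
        # Each stage keeps its own record, keyed by order ID.
        record = {**payload, "last_stage": self.name}
        self.kv[order_id] = record
        return record


def run_pipeline(order_id: str, payload: dict, stages: list) -> dict:
    """Pass one order through every stage in sequence."""
    for stage in stages:
        payload = stage.handle(order_id, payload)
    return payload


stages = [Stage(name) for name in STAGES]
result = run_pipeline("order-1", {"items": ["book"]}, stages)
print(result["last_stage"])  # prints "archive"
```

Since every stage only ever looks up data by order ID, each store could in principle be partitioned and scaled separately from the others, which is the property of key-value stores the talk pointed to.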

AWS Customer Presentations included VC-backed startups
@tubemogul, @rberger, @satyar73, @zynga
Adam Rose (adam at founded the online video analytics startup Illumenix, which later merged with TubeMogul. The company helps in video delivery using IP and the Internet - deploying uploads and providing analytics, real-time viewership and engagement tracking.
See also useful tips from Nicolas of TubeMogul.
Robert Berger of Runa presented his company's experience with Elastic Compute Cloud, Opscode Chef and HBase, along with a few other useful AWS tips. The font on his slides was a bit too large, at least for those in the front row. Guy Kawasaki says that the optimal font size is about half the age of the oldest person in the audience. Centenarians would certainly be happy with this presentation.
A few notes I took were about Javascript for every page, every consumer, one or more AJAX calls; step function, Physical Layer; load every time a new merchant is added..
The company uses Opscode Chef for their tasks, helping to treat "Hardware" as Software, which took under 5000 lines of Ruby code
We are living in "interesting" times with Amazon. The biggest problem is managing the complexity of all the moving parts.
It's impossible to manage horizontal stacks - that's the reason to use Opscode Chef
There are lots of learning curves to climb
Useful monitoring is hard but not critical
Satya Ramachandran of Jovian Data talked about the reasons to move to the cloud. He discussed HBase on AWS and how it may be dangerous, especially because of the Hadoop namenode SPOF: if it goes down, expect big problems
EC2 can surprise you if you deploy multiple versions of horizontally scalable code
JovianData is a platform as a service to optimize analytics of large data.
NoSQL does not solve application provision challenges
billions of impressions
10 users run 400 reports; 40% of them in hundred milliseconds
JovianData helps to avoid expensive data processing and to reduce disk IO - by using their proprietary partitioning scheme and by materializing expensive groups (usage-based automatic view materialization).
3 main concerns:
Capital expenditure - long approval times
Over Provisioning - cluster up but nobody is using it
Application Isolation - one person runs unique report, another side section, how to isolate them to not let them run into one another
Biggest mistake - take entire image; 15 terabyte system monthly cost $30K
Role Based Clusters - example - double click services; data cleansing and creating models. All model creation at night, then hibernated -- Hibernate Model
just kill 20 to 30 nodes; duplicate selectively then kill the nodes
ex: campaign manager needs to run lots of reports/ Classic Tera Scel won't be able to provision in minutes
Satya's advice was - if you are not getting a 10x performance improvement, do not move to the cloud.
Aurametrix' comments - and if you do not need a 10x or 100x improvement in performance, don't go to the cloud either. AWS is still a bit pricey for webapps and early-stage startups. None of the presenting startups fit the bill. A traditional VPS is a better solution if you only have hundreds or thousands of users or beta-testers. GAE is great for experimenting. Back to Satya's talk:
Dynamic Provisioning; using EC2 should be able to reduce performance by 100
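The cost figures in these notes lend themselves to a rough back-of-the-envelope calculation. The sketch below takes the $30K monthly cost for a 15 terabyte system from the talk. The 30-day month and the 8 hours of nightly use are assumptions added purely for illustration, as is the simplification that cost scales linearly with running hours.

```python
# Figures from the talk notes.
MONTHLY_COST_USD = 30_000
DATA_TB = 15

# Assumptions for illustration only (not from the talk).
HOURS_PER_MONTH = 30 * 24      # 30-day month
ACTIVE_HOURS_PER_DAY = 8       # e.g. model creation runs only at night


def cost_per_tb(monthly_cost: float, terabytes: float) -> float:
    """Monthly cost per terabyte of data."""
    return monthly_cost / terabytes


def hibernated_cost(monthly_cost: float, active_hours_per_day: float) -> float:
    """Monthly cost if the cluster is billed only while it is running,
    assuming cost scales linearly with running hours."""
    return monthly_cost * active_hours_per_day / 24


per_tb = cost_per_tb(MONTHLY_COST_USD, DATA_TB)
hourly = MONTHLY_COST_USD / HOURS_PER_MONTH
nightly_only = hibernated_cost(MONTHLY_COST_USD, ACTIVE_HOURS_PER_DAY)

print(f"per TB per month: ${per_tb:,.0f}")        # $2,000
print(f"always-on hourly: ${hourly:,.2f}")        # $41.67
print(f"8 h/day only:     ${nightly_only:,.0f}")  # $10,000
```

Under these assumptions, a cluster that is hibernated outside its nightly window would cost about a third of the always-on figure, which is the kind of saving the over-provisioning and Hibernate Model notes hint at.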

Jayme Cox talked about why Zynga chose AWS.
Zynga is connecting the world through games. It hosts 5 of the top 10 Facebook games
they use S3 - simple storage
Jayme highly recommends RightScale
Scaling services without scaling Sys Admin Team
plan for success - could have 100K users
Horizontal scaling important

The 4 presenting startups also participated in


  1. Thanks for covering the event in such detail. I am Satya Ramachandran. We highlighted two things in the JovianDATA presentation:
    1. the cost savings.
    2. performance advantage.

    Cloud is great for performance, but because of the hourly cost it forces you to be very frugal with your infrastructure.