Sounds great, doesn't it - computing services that don't need any infrastructure on your end, almost instantaneous delivery of whatever server and storage capacity you need, and a suite of just-right software, middleware, virtualization, security, and management tools? The "cloud" label is placed on almost every Internet-based application, adding to the large crowd of cloud computing vendors.
In a previous article, we talked about new and established players offering cloud services. This one is about cloud computing pioneer Amazon and its AWS Start-Up Event in Silicon Valley. See also presentation slides from their 2009, 2008 and 2007 start-up events on slideshare.net.
The April 14th event was held at Plug and Play Tech Center.
Dr. Werner Vogels, Amazon CTO, gave an overview of cloud computing. Here are a few notes taken from his talk:
Design for Automation is important, but things that are out of your control should not be automated. Examples are Human interfaces and Delete operations.
Every task should be decomposed into simpler form.
An ordering pipeline, for example, consists of cart - order - process - store - archive components that need storage and other functionality. If you decompose the pipeline into individual pieces, you will find that most of that storage is needed for particular tasks.
The good thing about key-value stores is that we know how to scale them. We can't do that with relational databases.
Database administration can be monotonous, stressful and expensive. It should be more automated and scalable. We need DBA as a service.
Developers should decide how to implement services to obey regulations, failure jurisdictions, break transparency, avoid performance failures.
Need evolution not revolutions.
A Virtual Private Cloud with secure VPN connection; a number of devices to have traffic routed through - such as spam, etc
Design with Security in Mind. For example, anonymous access time limit; other limits defined in scripts
Let your customer benefit.
During the past 2 years Amazon reduced pricing 6 times. Amount of bandwidth increased by orders of magnitude more than Amazon itself consumes (see this older slide showing the history of usage growth).
Amazon customers built great applications on top of AWS.
E.g., Cloudmmo - instance cloud, or cloud middleware to reduce infrastructure costs, technology risk and development time
Innovate for your customers
For example, innovate on cloud pricing models: on-demand instances; reserved instances, spot instances
Stax - elastic cloud app for J2EE
Heroku - ruby platform. maps ruby apps to amazon instances; SQL DB
Salesforce.com toolkit to build new cloud-based applications
Tibco's software helps to innovate by connecting applications and data in a service-oriented architecture, providing intelligence tools to make smarter decisions.
They have templates that can be launched to cloud.
RightScale® provides a fully automated management platform for Amazon EC2 cloud deployments.
Also Netflix, Pfizer, eHarmony, Lilly, Malbec..
Pfizer uses EC2; eHarmony uses MapReduce to find better matches.
Playfish - a fast growing social games company
Intuit - massive spiked testing with SOASTA's cloud test
Everyone files taxes at the last possible moment, so Intuit needed AWS to scale
Lots of companies use clouds for testing and simulations. Intuit simulates submitting tax forms -- if too many at a time
Ribbit; SimpleGeo; Siemens; twilio
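Returning to Dr. Vogels' remark that we know how to scale key-value stores: one reason is that each key can be routed to a node on its own, without consulting any other key. The sketch below is only an illustration of that idea, not Amazon's implementation. The `ShardedStore` class and the node names are hypothetical, and plain hash-modulo placement stands in for the more elaborate schemes real systems use.

```python
import hashlib


class ShardedStore:
    """Toy key-value store that spreads keys across several in-memory nodes."""

    def __init__(self, node_names):
        # Each "node" is just a dict here; in practice it would be a server.
        self.names = sorted(node_names)
        self.nodes = {name: {} for name in self.names}

    def _node_for(self, key):
        # A stable hash of the key alone picks the node, so no cross-key
        # coordination is needed to locate the data.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.names[int(digest, 16) % len(self.names)]

    def put(self, key, value):
        self.nodes[self._node_for(key)][key] = value

    def get(self, key):
        return self.nodes[self._node_for(key)].get(key)
```

Adding capacity in this toy model means adding node names. A real store would also have to move existing keys when the node set changes, which the sketch leaves out.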
Dr. Vogels can be reached at werner at amazon or on Twitter.
AWS Customer Presentations included four VC-backed startups.
Adam Rose (adam at tubemogul.com) founded the online video analytics startup Illumenix, which later merged with TubeMogul. The company helps in video delivery using IP and the Internet - deploying uploads and providing analytics, real-time viewership and engagement tracking.
Robert Berger of Runa presented his company's experience with Elastic Compute Cloud, Opscode Chef and HBase, along with a few other useful AWS tips. The font on his slides was a bit too large, at least for those in the front row.
Guy Kawasaki says that the optimal font size should be about half the age of the oldest person in the audience. Centenarians would certainly be happy with this presentation.
A few notes I took were about Javascript for every page, every consumer, one or more AJAX calls; step function, Physical Layer; load every time a new merchant is added..
The company uses Opscode Chef for their tasks, helping to treat "Hardware" as Software, which took under 5000 lines of Ruby code.
We are living in "interesting" times with Amazon. The biggest problem is managing the complexity of all the moving parts.
It's impossible to manage horizontal stacks by hand - that is why they use Opscode Chef.
There are lots of learning curves to climb.
Useful monitoring is hard but not critical
Satya Ramachandran of Jovian Data talked about the reasons to move to the cloud. He discussed HBase on AWS and how it may be dangerous, especially because the Hadoop namenode is a single point of failure (SPOF). If it goes down, expect big problems.
EC2 can surprise you if you deploy multiple versions of horizontally scalable code.
JovianData is a platform as a service to optimize analytics of large data.
NoSQL does not solve application provision challenges
billions of impressions
10 users run 400 reports; 40% of them in hundred milliseconds
JovianData helps to avoid expensive data processing and reduce disk IO - by using their proprietary partitioning scheme and by materializing expensive groups (usage-based automatic view materialization).
3 main concerns:
Capital expenditure - long approval times
Over Provisioning - cluster up but nobody is using it
Application Isolation - one person runs unique report, another side section, how to isolate them to not let them run into one another
Biggest mistake - take entire image; 15 terabyte system monthly cost $30K
Role Based Clusters - example - double click services; data cleansing and creating models. All model creation happens at night, then the cluster is hibernated -- Hibernate Model
just kill 20 to 30 nodes; duplicate selectively then kill the nodes
ex: campaign manager needs to run lots of reports/ Classic Tera Scel won't be able to provision in minutes
Satya's advice was - If you are not getting 10x performance improvement - do not move in the clouds.
Aurametrix' comments - and if you do not need 10 or 100x improvement in performance, don't go in the cloud either. AWS is still a hundreds or thousands of users or beta-testers. GAE is great for experimenting. Back to Satya's talk:
Dynamic Provisioning; using EC2 should be able to reduce performance by 100
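The usage-based automatic view materialization mentioned in Satya's talk can be pictured as a cache that starts storing an expensive aggregate only after it has been requested often enough. What follows is a hedged sketch of that general idea, not JovianData's proprietary scheme. The class name, the threshold and the row layout are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


class UsageBasedMaterializer:
    """Store a group-by result once it has been requested often enough."""

    def __init__(self, rows, threshold=2):
        self.rows = rows            # list of dicts standing in for raw data
        self.threshold = threshold  # requests needed before a result is kept
        self.hits = Counter()
        self.views = {}             # (column, measure) -> precomputed totals
        self.scans = 0              # number of full scans performed so far

    def _scan(self, column, measure):
        self.scans += 1
        totals = defaultdict(int)
        for row in self.rows:
            totals[row[column]] += row[measure]
        return dict(totals)

    def total_by(self, column, measure):
        key = (column, measure)
        if key in self.views:
            return self.views[key]      # answered without touching raw rows
        self.hits[key] += 1
        result = self._scan(column, measure)
        if self.hits[key] >= self.threshold:
            self.views[key] = result    # materialize the popular grouping
        return result
```

With a threshold of 2, the first two identical reports each scan the raw rows and every later one is served from the stored view. A production system would also need to refresh or invalidate views when the underlying data changes.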
Jayme Cox talked about why Zynga chose AWS.
Zynga is connecting the world through games. It's hosting 5 of the top 10 Facebook games
they use S3 - simple storage
Jayme Highly recommends RightScale
Scaling services without scaling Sys Admin Team
plan for success - could have 100K users
Horizontal scaling important
The 4 presenting startups also participated in a panel discussion.
The first question was about overall experience with AWS.
1 (Satya): - experience's been pretty good
2 (Adam) - mixed feelings - no free customer support, hard to get the right person listening to you if .. It would cost you 600 dollars check if this session was Amazon customer support -- - worse than with Microsoft.
3 (Robert) - did not expect any support, now that in production; couple of glitches, hard to tell; looks to me Amazon is stepping up, town executives; Alex is great; I would not mind paying per support if it is instance based.. sometimes you are in a crisis and need support.
4 (Jayme) - found AWS to be extremely responsive to feature requests. Good at taking feedback.
Another Q - can you imagine your company in the pre-cloud era
4 - it's a complete nightmare. life is easier now
3 - my background is in infrastructure building, hard to imagine
2 - we are still somewhat split, not managing everything ourselves, are somewhere in the middle. We do not have this huge massive traffic yet, but if your app fits this bill you need power
1 - completely different; 100 nodes, 200 nodes;
Questions from the Audience
- What monitoring services do you each use?
1 - wanted to build on commodity clusters; used RightScale to some extent; but mostly own
2 - same as previous panelist. Used Ganglia and Nagios - but mostly self-built technology
3 - ScoutApp and Pingdom, outsourced monitoring, nice way to quickly get some monitoring. On the path of implementing Nagios.. Some Ganglia, use other stuff too.
4 - also use Pingdom for external monitoring; Nagios for learning; extensively put own code in the application. Use CloudWatch.
3 - using EBS store
2 - same, data pipeline that does all the data crunching
1 - moved from Hadoop - as it is too primitive..
A reminder to the reader - among the many open-source and commercial monitoring tools, Nagios is used for alerting and notification, SNORT for intrusion detection, and Ganglia, Cacti and Scout are used for performance monitoring.
Q - what steps have you taken to reduce the risk of IP loss. Are you running any intellectual property staff? What assurances do you have to make to the board of directors or VCs to reduce the cloud-associated risk.
As there was a delay in response from the panelists, the Moderator stepped up and talked about user agreement policies - whatever data you put on Amazon is your data.
No, said the asker from the audience - VCs want to have better assurances...
3 - I do not care if someone downloads everything we have, we won't lose anything anyway.
Moderator - I've been in the field talking to customers; on overall VCs like AWS, they like when you spend on cloud computing versus hardware and operations. What assurances as a VC do you want?.. encrypting the data before storing in the cloud; just encrypt all of the transit data; follow all steps to secure data. Check some of the best practices.
4 - FarmVille - Zynga - our clusters may be far away from that .. user is the slowest point in that chain
Question from the Audience - how large are your cloud-administration teams
4 - a handful of people
3 - I am the team, looking for someone; another 4 people are in the development team
2 - 1.5 people
Two more Amazon talks after the break, at 3:45pm
Jinesh often helps developers on 1:1 basis in implementing their own ideas using Amazon’s services, so his talks had many useful tips. His updated slides will be available online soon.
Jinesh, who can be contacted at jvaria at amazon or on Twitter, wants to know how you are designing your architectures. He will help with Cloud Best Practices and a Complete Buffet of Amazon Services. Notes from his talk:
Most Applications NEED:
1. Compute
Amazon EC2 instances -- on-demand, reserved, spot
Physical infrastructure - Geographical Regions, Availability Zones, Edge Locations - mix and match
Amazon Virtual Private Cloud
on-demand provisioning, scalability in minutes; efficiency of experts
no contracts or long-term commitments
building scalable apps -- :
increased resources - proportionate increase in performance
handles heterogeneity
1. Design for failure - as W. Vogels says - everything fails, all the time - assume it and design backwards. App should continue to function even if the underlying physical hardware fails or is removed or replaced
load balancer - how to replace?
-- use elastic IP addresses for consistent and re-mappable routes
-- multiple AZs (Availability Zones)
-- real-time monitoring (Amazon CloudWatch)
-- Amazon Elastic Block Store (EBS) for persistent file systems
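The design-for-failure points above (re-mappable elastic IP addresses, multiple Availability Zones, monitoring) can be tied together in a small decision function. This is a sketch under stated assumptions, not an AWS API. The function name and the instance dictionary layout are hypothetical, and the rule of preferring a replacement in a different zone is one possible policy chosen for illustration.

```python
def remap_elastic_ip(current, instances):
    """Return the id of the instance the address should point to.

    `instances` maps instance id -> {"zone": str, "healthy": bool}.
    """
    if instances[current]["healthy"]:
        return current  # nothing to do while the current target is healthy

    healthy = [i for i, info in sorted(instances.items()) if info["healthy"]]
    if not healthy:
        raise RuntimeError("no healthy instance available")

    # Assumed policy: prefer a replacement outside the failed instance's
    # zone, in case the whole zone is affected.
    failed_zone = instances[current]["zone"]
    for candidate in healthy:
        if instances[candidate]["zone"] != failed_zone:
            return candidate
    return healthy[0]
```

In a real deployment the health flags would come from monitoring data, and the returned id would be passed to whatever call re-associates the address.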
2. Build Loosely Coupled Systems
Independent components
design everything as black box
de-coupling for Hybrid models
Load-balance clusters
Use Amazon SQS as Buffers
Tight vs Loose coupling using queues
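The loose-coupling advice can be illustrated with a producer and a consumer that only ever talk through a queue. In this minimal sketch Python's in-process `queue.Queue` stands in for a hosted queue service such as Amazon SQS. The function name and the upper-casing "work" are placeholders, not part of any AWS API.

```python
import queue
import threading


def run_pipeline(orders):
    """Push orders through a buffer to an independent consumer thread."""
    buffer = queue.Queue()  # stand-in for a hosted message queue
    processed = []

    def consumer():
        while True:
            order = buffer.get()
            if order is None:  # sentinel: the producer has finished
                break
            processed.append(order.upper())  # placeholder for real work

    worker = threading.Thread(target=consumer)
    worker.start()
    for order in orders:  # the producer never calls the consumer directly
        buffer.put(order)
    buffer.put(None)
    worker.join()
    return processed
```

Because the two sides share only the queue, either one can be slowed down, restarted or replaced without changing the other, which is the point of the black-box design mentioned above.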
3. Implement Elasticity
Very cloud-specific
Bootstrapping instances
Your role is a database node.. don't assume health or fixed location of components; use designs resilient to reboot and re-launch. Bootstrap your instances
Use auto-scaling - free
Use elastic load Balancing on multiple layers
Use configurations in SimpleDB to bootstrap instance
Automate Everything
Cloud is good for standardized tasks --
Java Stack, RoR stack - you can pick and choose. It is important how you design Amazon Machine Images. -- approaches to design AMIs:
-- 1 inventory of fully baked AMIs - Frozen Pizza Model
-- 2 Golden AMI approach - bundle when you need it
-- 3. more control,
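The bootstrapping idea from the elasticity notes (an instance learns its role at launch and pulls its configuration from a shared store) might look roughly like this. It is a sketch only: a plain dictionary stands in for configuration kept in SimpleDB, and the role names, package names and ports are made-up examples.

```python
# Stand-in for configuration items kept in a shared store such as SimpleDB.
CONFIG_STORE = {
    "web": {"packages": ["nginx"], "port": 80},
    "database": {"packages": ["mysql-server"], "port": 3306},
}


def bootstrap(role, config_store=CONFIG_STORE):
    """Return the ordered setup steps for a freshly launched instance."""
    try:
        config = config_store[role]
    except KeyError:
        raise ValueError(f"unknown role: {role}") from None
    steps = [f"install {package}" for package in config["packages"]]
    steps.append(f"open port {config['port']}")
    return steps
```

Since nothing about the instance is fixed in advance, the same generic image can be relaunched after a failure and told again that its role is a database node.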
4. Build security in every layer --
cloud - losing a little bit of physical control
5. Don't fear constraints
IOPS - multiple reads - spread read load across multiple synchronized slaves; write load -> master
adding caching layers, to get throughput
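The read/write split described under "Don't fear constraints" can be sketched as a small router: writes go to the master, reads rotate over the replicas. This is an illustrative toy, assuming in-memory dictionaries and immediate replication. The class and attribute names are hypothetical.

```python
import itertools


class ReplicatedDatabase:
    """Send writes to one master and spread reads over its replicas."""

    def __init__(self, replica_count):
        self.master = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._next_replica = itertools.cycle(range(replica_count))
        self.reads_per_replica = [0] * replica_count

    def write(self, key, value):
        self.master[key] = value
        for replica in self.replicas:  # immediate replication, for simplicity
            replica[key] = value

    def read(self, key):
        index = next(self._next_replica)  # round-robin over the replicas
        self.reads_per_replica[index] += 1
        return self.replicas[index].get(key)
```

Real replication is usually asynchronous, so a read may briefly return stale data. The sketch ignores that to keep the routing logic visible.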
6. Think Parallel
decompose into simplest jobs, load balancer or massive distr tasks - hadoop -
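"Think Parallel" comes down to splitting work into independent pieces and combining the partial results. The following sketch shows that shape with Python's standard `concurrent.futures` module. The word-count task and the function names are just an example chosen for brevity, and a Hadoop job would distribute the same pattern across machines rather than threads.

```python
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """The simplest possible job: count the words in one piece of text."""
    return len(chunk.split())


def parallel_word_count(chunks, workers=4):
    """Run the per-chunk job concurrently, then combine the partial counts."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    return sum(partial_counts)
```

The jobs share no state, so the number of workers can be raised or lowered without touching the job itself.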
7. leverage storage -- which storage to use when -- detailed in the white paper - table --
S3. EC2. EBS. RDS - relational can't be heavy weight with lots of images. Put images and blobs on S3; metadata on SimpleDB
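The advice to keep images and blobs in S3 and their metadata in SimpleDB amounts to storing a small searchable record that points at a large opaque object. Here is a minimal sketch of that split, with two dictionaries standing in for the two services. The function names and the metadata fields are assumptions for illustration.

```python
import hashlib

blob_store = {}      # stand-in for an object store such as an S3 bucket
metadata_store = {}  # stand-in for a metadata store such as SimpleDB


def save_image(name, data, owner):
    """Store the bytes under a content hash and keep a small record of them."""
    key = hashlib.sha1(data).hexdigest()
    blob_store[key] = data
    metadata_store[name] = {"blob_key": key, "owner": owner, "size": len(data)}
    return key


def load_image(name):
    """Look up the record first, then fetch the bytes it points to."""
    return blob_store[metadata_store[name]["blob_key"]]
```

Queries such as "all images owned by alice" then touch only the small records, and the database never has to carry the image bytes.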
Scale Up or Down on Demand
failover - purposely terminated 2-3 instances to see if it maintains min threshold
app with lower dependences, once you build and moved it in the cloud, you'll have more confidence.. Start small, experiment with architectures, through out ..
@simon @junman
Steve Riley, an AWS Evangelist, talked on Security in the AWS Cloud
How many don't worry about security? 3 raised hands were more than Steve expected
How many believe that they won't have to worry about security in the near future?
Cloud security scares people, oxymoron
internal logs - customer accidentally destroyed their DB - Amazon was able to recover by looking at HTML logs
Amazon helps to take care of information. Amazon uses Xen with a couple of modifications - security functionality comes from it. Security group - defines what traffic is allowed into computer
security group vs firewall. Code implementing security group - worked with 3rd party penetration .. highly qualified eyeballs
if you store confidential info - encrypt it. Don't store keys with Amazon either if afraid. For regulatory reasons you may need to build a 3-tier architecture: web - application - database
HTTP-SSH-management from vendor. Amazon doesn't provide 3 tiers
WebSG - inbound traffic; AppSG - inbound traffic from corpnet for management purposes; DBSG - Service is a disposable horsepower. If you use IP addresses, do you need to change security groups - no
configure routing in your corporate network - no direct internet access; available in US East only
Mechanism to enable outbound internet - non trivial feature to turn on - will be added
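The three-tier security group layout from Steve's talk (WebSG, AppSG, DBSG) can be modelled as data plus one lookup function. This is a simplified sketch, not the EC2 API. The port numbers and the exact rules are assumptions, and only the group names come from the talk.

```python
# Each group lists the inbound traffic it accepts. Sources are either a
# named network or another security group, never a fixed instance address.
SECURITY_GROUPS = {
    "WebSG": [
        {"port": 80, "source": "internet"},
        {"port": 443, "source": "internet"},
    ],
    "AppSG": [
        {"port": 8080, "source": "WebSG"},
        {"port": 22, "source": "corpnet"},  # management access only
    ],
    "DBSG": [
        {"port": 3306, "source": "AppSG"},
    ],
}


def is_allowed(group, port, source, groups=SECURITY_GROUPS):
    """Return True if `group` accepts inbound traffic on `port` from `source`."""
    return any(
        rule["port"] == port and rule["source"] == source
        for rule in groups[group]
    )
```

Because rules name groups rather than addresses, instances can be replaced and receive new IP addresses without any rule having to change.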
S3 - objects stored in buckets; key used as the name for the object in the bucket.
Permissions on bucket do not inherit - need to explicitly assign
Encrypting information - fashionable to bash encryption - not a good solution - if they do not have the key - but what if you lose your key? Encrypting is good with a well understood process to handle the keys. internally or third party - Red Scale for ex
(Read, Write, Full)
Create File on EBS - encrypt it and add a secret access key -
AWS Sign-In with Authentication Device
SOx - easy, outcome based - if control objectives are meeting the requirements
HIPAA more of a challenge - current deployments, how-to whitepaper derived from a project in production, anonymized
SAS 70 Type II certification
--------------
Concluding comments were on how different customers are leveraging AWS and on other AWS events, including yesterday's SF event, where Adobe, Autodesk and other customers were talking about the enterprise side of things.
thestartupdigest.com - helped with the event.
AWS Whitepapers
- Overview of Amazon Web Services, Posted 2009-12-08
- The Economics of the AWS Cloud vs. Owned IT Infrastructure, Posted 2009-12-07
- Overview of Security Processes, Posted 2009-06-01
- AWS Security Best Practices, Posted 2010-01-01
- Architecting for the Cloud: Best Practices, Posted 2010-01-01
- Cloud Architectures, Posted 2007-07-01
- Extend Your IT Infrastructure with Amazon Virtual Private Cloud, Posted 2010-01-01
- Creating HIPAA-Compliant Medical Data Applications With AWS, Posted 2009-04-01
Overall, it was an interesting event illustrating why “sky-is-the-limit” and cloud computing is a fully-fledged industry.