VMware has rebranded VMware vCloud Hybrid Service as VMware vCloud Air, a name change meant to reflect VMware’s innovation and its delivery of value-added as-a-service solutions on top of its hybrid cloud infrastructure. The platform itself, built on VMware vSphere, will remain the same secure IaaS hybrid cloud it has always been.
For customers, this new and improved program increases cloud service flexibility and provides tremendous choice of global cloud services. Customers also have their choice of local cloud Service Providers (for data sovereignty) from either VMware vCloud Air or our network of Service Provider partners.
VMware will list all VMware vCloud Air Network Service Provider partners on its public portal (link below) and run awareness and demand generation campaigns promoting its partner ecosystem as well as vCloud Air. VMware Service Provider partners with a valid contract via a local aggregator will automatically be positioned in the vCloud Air Network as members of the vCloud Air Network Program (formerly VSPP). Please visit http://www.vmware.com for additional updated news on the vCloud Air product.
Facebook has ditched RAID and replication for its near line storage, using distributed erasure coding to isolate what it calls “warm BLOBs” instead.
Minor translation required:
BLOB: Binary Large OBject. Facebook users’ photos, videos, etc.
Warm: data that has to be kept and is accessed at a lower rate than hot data, but more often than archived or cold data. Typically, it’s more than a week old. Hot BLOBs, of course, are accessed more frequently.
Erasure coding: the adding of calculated parity values (Reed-Solomon codes) to a string of bytes, such that the string can be recovered if an error deletes or distorts part of it. Typically more efficient than RAID at protecting data, as it uses less space.
The key details are that the new “[f4] storage system uses Reed-Solomon coding and lays blocks out on different racks to ensure resilience to disk, machine, and rack failures within a single data center. It uses XOR coding in the wide-area to ensure resilience to data center failures.”
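To make the XOR idea concrete, here is a minimal sketch of single-parity recovery. It is an illustration only, not Facebook's code: the block contents, the `xor_blocks` helper, and the assumed layout (two data blocks in two data centers, their XOR in a third) are all hypothetical. It also shows plain XOR parity rather than Reed-Solomon coding, which tolerates more simultaneous losses.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equal-length byte strings (hypothetical helper)."""
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) walks the blocks column by column; each column is XORed together.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Assumed layout: two blocks held in two different data centers.
block_a = b"warm blob A"
block_b = b"warm blob B"

# A third data center stores only the XOR of the two blocks.
parity = xor_blocks([block_a, block_b])

# If the data center holding block_a is lost, XOR the survivors to rebuild it.
recovered_a = xor_blocks([block_b, parity])
assert recovered_a == block_a
```

The same call rebuilds `block_b` from `block_a` and `parity`, so any one of the three sites can be lost without losing data.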
Facebook’s special problem is that it has three main types of user data, each with associated metadata, and all three need huge amounts of storage. Its main and most-accessed datasets are the recent postings on a user’s timeline, those less than a week old. These get accessed a lot by the user’s “Friends”.
It uses its Haystack storage system for this data, which uses triple replication to protect the data and make sure it can always be accessed and accessed quickly, with as near to a single disk access as possible (once the metadata calculations have been run).
As this data ages, it is accessed less often, cooling from hot to warm, and yet still requires fast access when it is actually called upon. Trouble is, the damn stuff just keeps on growing. For example, at the end of January this year, Facebook was storing more than 400 billion photos.
Facebook engineers have set up a new storage system, f4, to store this set of warm BLOBs.
A paper by the engineers explains: “f4 is a new system that lowers the effective-replication-factor of warm BLOBs while remaining fault-tolerant and able to support the lower throughput demands.”
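A rough way to read “effective-replication-factor” is physical bytes stored divided by logical bytes stored. The sketch below compares triple replication with an erasure code under that reading. The function names and the 10-data-plus-4-parity split are illustrative assumptions, not figures from the text above, and the calculation ignores any additional cross-data-center copies.

```python
def replication_factor(copies):
    """Physical-to-logical storage ratio when every block is stored `copies` times."""
    return float(copies)


def erasure_coding_factor(data_blocks, parity_blocks):
    """Physical-to-logical storage ratio for a stripe of data plus parity blocks."""
    if data_blocks <= 0 or parity_blocks < 0:
        raise ValueError("need at least one data block and no negative parity count")
    return (data_blocks + parity_blocks) / data_blocks


# Triple replication, as described for Haystack.
triple = replication_factor(3)

# Assumed example stripe: 10 data blocks protected by 4 parity blocks.
coded = erasure_coding_factor(10, 4)

# Fraction of physical storage saved by the coded layout for the same logical data.
saving = 1 - coded / triple
print(f"replication: {triple:.1f}x, erasure coded: {coded:.1f}x, saving: {saving:.0%}")
```

Under these assumed parameters the coded layout needs well under half the raw capacity of three full copies, while still surviving the loss of several blocks in a stripe.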
Facebook’s engineers say:
“[f4] uses Reed-Solomon coding and lays blocks out on different racks to ensure resilience to disk, machine, and rack failures within a single data center. It uses XOR coding in the wide-area to ensure resilience to data center failures. f4 has been running in production at Facebook for over 19 months. f4 currently stores over 65PB of logical data and saves over 53PB of storage.”
The question is: will cloud replication methods drive the storage industry and the large companies that currently dominate it? Or will the industry step up and provide something that meets the future requirements of most companies?
Facebook’s network bandwidth, and the cloud industry’s, is far greater than the average company’s, but average companies want and need similar reliability. Disk drive AFR (annualized failure rate) has improved dramatically over the years, but the hard error rate has not changed in almost 10 years. And the silent data corruption rate of the channels connecting the disk drives has not changed in even longer. Storage companies, Facebook just issued you a challenge.
With its own storage development, and the way it uses that storage within its environment, Facebook is way ahead of the storage companies.