In the physical world, whether a rental is charged by usage or by the hour depends on two factors:
- Whether the owner has a fair chance of renting the resource to someone else while it sits idle (which favors the usage model).
- Whether the rented item stays reserved for the renter, and so cannot be used by anyone else, even when the renter is not using it (which favors the hourly model).
For example, when you rent a car, nobody else can use it while it is idle; it simply sits in its parking spot. In that case, it makes perfect sense to rent the car by the hour or by the day. Now consider the autonomous car of the future. At a moment’s notice, it meets you at your current location and drives you to your destination, and then you’re done with it. The next time you need a car, you follow the same procedure. In this scenario, you only use the car when you need it; the rest of the time, it can be used by someone else. Since the actual cost of your drive depends on mileage much more than on time, it is a good candidate for the charge-by-usage (i.e. distance/fuel/engine cycle) cost model.
Usage-Cycle-Based Cost Model
In the PaaS world, your application typically does not run all of the time. It wakes up to perform certain tasks, then goes back to sleep. While it is asleep, it uses no CPU and very little memory, which allows other applications to use the same resource. This is how application servers, web servers, and shared databases work. It therefore makes more sense to price these types of resources according to the ‘unit of work’ performed. Most PaaS providers charge per cycle or per task, since these are the simplest and most common ways to measure usage.
Google’s App Engine provides a good example of cycle charges. Suppose you have two instances, one with ‘x’ memory and the other with ‘2x’. If you are charged per cycle, a cycle on the ‘2x’ instance will obviously be more expensive than a cycle on the ‘x’ instance. If you launch your application and it does nothing, you pay nothing, because you are paying per cycle. Meanwhile, however, the idle application still occupies memory that cannot be used by other instances. Another example is Heroku’s approach, which resembles cycle usage in that it charges for work performed: you define a worker and are charged for the tasks that worker completes.
The IaaS Case
In the case of IaaS, charging for usage (i.e. per CPU cycle) comes with quite a few challenges. One is managing a single resource among multiple users and deciding when and how the resource is free to be used. Another is cost justification. It is hard for vendors to justify costs when customers have little transparency into what a specific unit consumed, much as with electricity and water bills. This in turn makes charges difficult to predict. For example, Amazon had a NoSQL database service called SimpleDB that charged for the CPU cycles needed to process requests and for the number of 1KB units read or written. This pricing model proved to be very problematic. Customers had no way of predicting their expected usage and charges. Additionally, programming mistakes that caused higher CPU utilization or higher network throughput would result in a (much) higher bill, and would only come to light a month after the fact.
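The predictability problem can be illustrated with a small sketch. All rates and usage figures below are hypothetical placeholders chosen for illustration, not actual SimpleDB or EC2 prices, and the function names are made up for this example.

```python
# Hypothetical rates, chosen only for illustration (not real provider prices).
PRICE_PER_CPU_HOUR = 0.14       # usage-based: per CPU-hour actually consumed
PRICE_PER_INSTANCE_HOUR = 0.10  # time-based: per hour the instance is allocated
HOURS_IN_MONTH = 720


def usage_based_bill(cpu_hours_consumed: float) -> float:
    """The bill grows with whatever the code actually consumed."""
    return cpu_hours_consumed * PRICE_PER_CPU_HOUR


def time_based_bill(instances: int) -> float:
    """The bill depends only on how long the instances were allocated."""
    return instances * HOURS_IN_MONTH * PRICE_PER_INSTANCE_HOUR


expected_cpu_hours = 100  # what the team planned for (assumed figure)
buggy_cpu_hours = 1500    # e.g. a retry loop that spins on every request (assumed figure)

print(f"usage-based, expected: ${usage_based_bill(expected_cpu_hours):.2f}")
print(f"usage-based, with bug: ${usage_based_bill(buggy_cpu_hours):.2f}")
print(f"time-based, 1 instance: ${time_based_bill(1):.2f}")
```

Under these assumed numbers the usage-based bill scales with the bug, while the time-based bill stays the same no matter how busy the CPU was.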
The Time-Based Cost Model
The common IaaS compute cost model is time-based. When you rent an IaaS VM, your cloud provider allocates a fixed amount of resources to you, regardless of whether or not you utilize them. This is how Amazon EC2 instances work. Amazon specifically allocates compute and memory capacity for an instance so that it is available to you at all times.
The current AWS pricing model takes resource types into account, including the different instance families, such as general purpose, CPU intensive, and memory intensive. Some resources are also priced per unit. For example, disk storage is charged per GB per month, bandwidth is priced according to the amount transferred, and I/O is priced per request. Looking at these charges, you can see that network, I/O, and storage are charged by capacity or “usage cycles”, whereas CPU and RAM are charged per hour that the instance is allocated, and not by the number of CPU processing units actually used.
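A minimal sketch of how such a mixed bill adds up, assuming made-up rates. The `Rates` class, the `monthly_bill` function, and every price below are hypothetical and only illustrate the split between time-based and usage-based line items.

```python
from dataclasses import dataclass


# All rates are hypothetical placeholders, not actual AWS prices.
@dataclass
class Rates:
    instance_hour: float     # time-based: per hour the instance is allocated
    storage_gb_month: float  # usage-based: per GB stored per month
    transfer_gb: float       # usage-based: per GB of bandwidth
    io_request: float        # usage-based: per I/O request


def monthly_bill(rates, hours, storage_gb, transfer_gb, io_requests):
    """Return an itemized bill mixing time-based and usage-based charges."""
    items = {
        "compute": hours * rates.instance_hour,           # independent of CPU load
        "storage": storage_gb * rates.storage_gb_month,
        "bandwidth": transfer_gb * rates.transfer_gb,
        "io": io_requests * rates.io_request,
    }
    items["total"] = sum(items.values())
    return items


rates = Rates(instance_hour=0.10, storage_gb_month=0.05,
              transfer_gb=0.09, io_request=0.0000001)
bill = monthly_bill(rates, hours=720, storage_gb=100,
                    transfer_gb=50, io_requests=2_000_000)
for name, amount in bill.items():
    print(f"{name:>10}: ${amount:.2f}")
```

Note that the compute line depends only on the hours allocated, while the other three lines move with actual consumption.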
Simplicity and Capacity Management
First and foremost, compared with the usage-based model, charging by the hour is much simpler for vendors, since they don’t need to manage a shared resource. What’s more, the time-based model supports simpler capacity management and better resource utilization.
In contrast to the usage-based model, AWS compute costs are predictable because each machine (VM/instance type) has a set cost per hour, which makes it possible to track and forecast trends and to spot anomalies. Customers typically prefer to be able to predict their spend, and hourly charges make that possible, resulting in a set monthly expenditure. Payment differs slightly with reserved instances, where a one-time fee buys a lower hourly rate for CPU and RAM, but predictability is still preserved.
Microsoft Azure does something similar, although its approach seems to be even less scalable than that of the other two giants (i.e., Amazon and Google). Azure offers a significant discount if you commit to a number of hours that your application will run, with no restriction on whether those hours are used concurrently or not. In theory, if you purchase 720 hours, you can run one instance for a whole month, or you can run 720 instances for one hour. For the provider, these two options make a world of difference. If you run one instance for the whole month, the provider simply needs to allocate a single server. Conversely, if you run 720 instances, many more resources need to be allocated, even though each is used for only one hour in the month. That is why most providers, like Google and Amazon, encourage consumers to use machines in a more predictable way, hence the promotion of AWS reserved instances and Google sustained use discounts.
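The difference between the two ways of spending 720 hours can be sketched as follows. The function name is hypothetical, and the sketch assumes that all instances in a scenario run at the same time.

```python
HOURS_IN_MONTH = 720


def committed_hours_profile(instances: int, hours_each: int):
    """Return (billable hours, peak servers needed, monthly utilization).

    Assumes all instances run concurrently, so the provider must have
    `instances` servers available at the peak.
    """
    billable_hours = instances * hours_each
    peak_servers = instances
    # Share of the month during which the provisioned servers are busy.
    utilization = billable_hours / (peak_servers * HOURS_IN_MONTH)
    return billable_hours, peak_servers, utilization


steady = committed_hours_profile(instances=1, hours_each=720)
burst = committed_hours_profile(instances=720, hours_each=1)

print("steady:", steady)
print("burst: ", burst)
```

Both scenarios bill the same 720 hours, but the burst scenario needs 720 servers at its peak and keeps them busy for only a small fraction of the month.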
Amazon rates instance capacity in EC2 Compute Units (ECUs); one ECU is roughly equivalent to a 1.0-1.2 GHz 2007 Opteron or Xeon processor. If 1 CPU is approximately equal to 1 billion cycles per second, a per-cycle price can be derived by dividing the hourly price by the number of cycles the CPU delivers in an hour. So despite the challenges noted above, everything can be measured, and it is feasible to assign every unit a cost that can be charged over time.
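One way to read this arithmetic is as a per-cycle price derived from an hourly price. The sketch below uses the approximation from the text (1 CPU is about 1 billion cycles per second); the hourly price and the function name are hypothetical.

```python
CYCLES_PER_SECOND = 1_000_000_000  # the text's approximation for one CPU
SECONDS_PER_HOUR = 3600


def cost_per_billion_cycles(price_per_hour: float) -> float:
    """Spread an hourly price evenly over the cycles available in that hour."""
    cycles_per_hour = CYCLES_PER_SECOND * SECONDS_PER_HOUR
    cost_per_cycle = price_per_hour / cycles_per_hour
    return cost_per_cycle * 1_000_000_000


# Hypothetical hourly price of $0.10 for a single-CPU instance.
print(f"${cost_per_billion_cycles(0.10):.8f} per billion cycles")
```

This assumes every cycle in the hour is sold; the figure is a lower bound on what a vendor would have to charge per cycle to match the hourly price.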
Although cloud efficiency has a promising future, most servers in the cloud are still under-utilized, so vendors cannot make money by charging per cycle. Even when server utilization falls below 10%, IaaS providers still need to cover all of their costs, including the return on their hardware investment. Payment per usage cycle may work as a marketing gimmick, but it does not appeal to IaaS vendors.
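A rough sketch of why low utilization makes cycle pricing unattractive to a vendor. The hourly price is hypothetical, and the sketch assumes the per-cycle rate is simply the hourly price spread over a fully used hour.

```python
def monthly_revenue(price_per_hour: float, hours: float, utilization: float):
    """Return (time-based revenue, cycle-based revenue) for one server.

    Assumes the per-cycle rate equals the hourly price divided by the
    cycles in a fully used hour, so cycle-based revenue scales linearly
    with utilization.
    """
    time_based = price_per_hour * hours
    cycle_based = time_based * utilization
    return time_based, cycle_based


# Hypothetical: $0.10 per hour, a 720-hour month, 10% utilization.
time_based, cycle_based = monthly_revenue(0.10, 720, 0.10)
print(f"time-based:  ${time_based:.2f}")
print(f"cycle-based: ${cycle_based:.2f}")
```

Under these assumptions, the same server brings in a tenth of the revenue when billed per cycle, while the vendor’s hardware costs stay the same.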