My previous post mentioned that we are building an NFV platform, which has two components: a management system and a cloud node. The management system manages one or many cloud nodes on top of the service provider's network. The platform has been designed to be self-maintaining and highly automated.
As such, it ought to supply all the data needed, both for itself and for the cloud operator, to:
- Measure and store everything: every event and usage statistic of the IaaS/applications/network (both current and historical)
- Meter for billing purposes
- Make smart IaaS or application provisioning decisions, such as deployment, lifecycle management, and the operation of elasticity rules
- Plan the required capacity of cloud and underlying I/S resources
Trying to solve the question of how to meter everything in a distributed cloud, I started looking at the OpenStack Ceilometer project as an interesting solution to explore. Up until the OpenStack Grizzly release, this project was designed and built for billing, so you could hear eNovance (an OpenStack Gold Member) VP of Products Nick Barcet lecturing that "someone forgot to prepare billing for OpenStack, so we started with metering". However, during the OpenStack Portland summit in 2013, Nick and Free Software consultant Julien Danjou gave an architecture lecture presenting Ceilometer as "metering everything in the cloud".
Ceilometer seems to have all the building blocks of a robust metering project for the cloud. In building the NFV platform we are looking at this solution from the viewpoint of multiple OpenStack deployments. Why? Because the NFV platform should be able to manage 1-N cloud nodes, and some Tier-1 customers are talking numbers in the tens or even hundreds of such cloudlet elements, spread all over a country.
I won't dive into the reasoning behind distribution in this post, but in general remember that we are talking about a graph of nodes and edges (SP networks) that comprise a cloud. Given the expected number of nodes, using OpenStack means managing multiple OpenStack deployments, one per cloud node. Though distributed cloud concepts have matured a bit in OpenStack (e.g., the cells concept), the project is still rather young in that domain.
Looking at Ceilometer's design, with its message bus, Big Data store support, and UDP publishing functionality, it may be ready to be challenged with a distributed monitoring architecture.
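To make the UDP publishing idea concrete, here is a minimal, dependency-free sketch of the pattern: a meter sample serialized and sent as a single fire-and-forget datagram to a collector. This is only an illustration of the mechanism, not Ceilometer's actual code; as I understand it, Ceilometer's real UDP publisher serializes samples with msgpack, while this sketch uses JSON and a hypothetical sample shape to stay self-contained.

```python
import json
import socket

# Illustrative sketch only: Ceilometer's real UDP publisher uses msgpack;
# json is used here so the example runs with the standard library alone.

def publish_sample(sample, host="127.0.0.1", port=4952):
    """Serialize one meter sample and send it as one UDP datagram."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(json.dumps(sample).encode("utf-8"), (host, port))
    sock.close()

def bind_collector(port=0):
    """Bind a UDP socket on the collector side; port 0 picks a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", port))
    return sock

def recv_sample(sock):
    """Receive one datagram and decode it back into a sample dict."""
    data, _addr = sock.recvfrom(65535)
    return json.loads(data.decode("utf-8"))
```

The appeal for a distributed setup is exactly this fire-and-forget property: a publisher on a remote cloud node never blocks on the management system, at the cost of possible datagram loss.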
So here are my questions:
1) Can this bus span multiple OpenStack deployments, and how?
2) What would be the right publishing method? The management system should be able to retrieve the data for different purposes (presentation and calculations, whether on-demand, real-time, or offline), so a caching mechanism should obviously be available. Publishing UDP-wrapped information to multiple targets burdens the single management system and the networks. How can we aggregate the data into a single logical system?
3) Would a possible solution be for each OpenStack cloud node to have its own Ceilometer instance, in charge of publishing the data inside the cloud node and storing it in an external Big Data container?
4) Another possible solution is a single Ceilometer instance per OpenStack deployment on each cloud node, with a distributed Big Data model keeping a footprint on each cloud node for performance.
5) What did the developers of this interesting project have in mind regarding external systems pushing data into the Ceilometer bus?
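To make question 2 more tangible, here is a sketch of what "a single logical system" could look like on the management side: one collector that receives UDP-published samples from many cloud nodes and buckets them per originating node, serving as the caching layer for on-demand, real-time, or offline queries. Everything here is hypothetical: the field names (`source`, `counter_name`, `counter_volume`) follow the general shape of Ceilometer samples but are assumptions, not its exact wire format, and a real collector would persist to a Big Data store rather than memory.

```python
import json
import socket
from collections import defaultdict

class SampleAggregator:
    """Hypothetical single logical collector for many cloud nodes."""

    def __init__(self, port=0):
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self.sock.bind(("127.0.0.1", port))
        self.port = self.sock.getsockname()[1]
        # One in-memory bucket of samples per originating cloud node;
        # a real system would write these to a Big Data store instead.
        self.by_node = defaultdict(list)

    def poll_one(self):
        """Receive one datagram and file it under its source cloud node."""
        data, _addr = self.sock.recvfrom(65535)
        sample = json.loads(data.decode("utf-8"))
        self.by_node[sample.get("source", "unknown")].append(sample)

    def latest(self, node, counter_name):
        """On-demand lookup: most recent value of a counter for a node."""
        for sample in reversed(self.by_node[node]):
            if sample.get("counter_name") == counter_name:
                return sample.get("counter_volume")
        return None
```

The per-node bucketing is the point: whether the samples come from one Ceilometer per cloud node (question 3) or per OpenStack deployment (question 4), the management system sees one queryable view keyed by source.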
The Havana blueprints are also very interesting, including capacity planning, adaptive scheduler algorithms based on Ceilometer metrics, and more. Also note that Ceilometer's outputs can be used to drive analytics, provisioning decisions, and more.
I will continue researching this issue and would appreciate any thoughts you might have. All in all, I would be interested in learning and sharing more about using Ceilometer in our distributed cloud platform's metering framework.
Just drop me a line on this blog post.
Till next time