An open source system for automating deployment, scaling, and operations of applications.

Friday, May 19, 2017

Kubernetes: a monitoring guide

Today’s post is by Jean-Mathieu Saponaro, Research & Analytics Engineer at Datadog, discussing what Kubernetes changes for monitoring, and how you can prepare to properly monitor a containerized infrastructure orchestrated by Kubernetes.

Container technologies are taking the infrastructure world by storm. While containers solve or simplify infrastructure management processes, they also introduce significant complexity in terms of orchestration. That’s where Kubernetes comes to our rescue. Just like a conductor directs an orchestra, Kubernetes oversees our ensemble of containers—starting, stopping, creating, and destroying them automatically to keep our applications humming along.

Kubernetes makes managing a containerized infrastructure much easier by creating levels of abstractions such as pods and services. We no longer have to worry about where applications are running or if they have enough resources to work properly. But that doesn’t change the fact that, in order to ensure good performance, we need to monitor our applications, the containers running them, and Kubernetes itself.

Rethinking monitoring for the Kubernetes era

Just as containers have completely transformed how we think about running services on virtual machines, Kubernetes has changed the way we interact with containers. The good news is that with proper monitoring, the abstraction levels inherent to Kubernetes provide a comprehensive view of your infrastructure, even if the containers and applications are constantly moving. But Kubernetes monitoring requires us to rethink and reorient our strategies, since it differs from monitoring traditional hosts such as VMs or physical machines in several ways.

Tags and labels become essential
With containers and their orchestration completely managed by Kubernetes, labels are now the only way we have to interact with pods and containers. That’s why they are absolutely crucial for monitoring since all metrics and events will be sliced and diced using labels across the different layers of your infrastructure. Defining your labels with a logical and easy-to-understand schema is essential so your metrics will be as useful as possible.

There are now more components to monitor
In traditional, host-centric infrastructure, we were used to monitoring only two layers: applications and the hosts running them. Now with containers in the middle and Kubernetes itself needing to be monitored, there are four different components to monitor and collect metrics from: hosts, containers, applications, and Kubernetes itself.

Applications are constantly moving
Kubernetes schedules applications dynamically based on scheduling policy, so you don’t always know where applications are running. But they still need to be monitored. That’s why using a monitoring system or tool with service discovery is a must. It will automatically adapt metric collection to moving containers so applications can be continuously monitored without interruption.

Be prepared for distributed clusters
Kubernetes has the ability to distribute containerized applications across multiple data centers and potentially different cloud providers. That means metrics must be collected and aggregated among all these different sources. 
For more details about all these new monitoring challenges inherent to Kubernetes and how to overcome them, we recently published an in-depth Kubernetes monitoring guide. Part 1 of the series covers how to adapt your monitoring strategies to the Kubernetes era.

Metrics to monitor

Whether you use Heapster data or a monitoring tool integrating with Kubernetes and its different APIs, there are several key types of metrics that need to be closely tracked:
  • Running pods and their deployments
  • Usual resource metrics such as CPU, memory usage, and disk I/O
  • Container-native metrics
  • Application metrics, for which a service discovery feature in your monitoring tool is essential
All these metrics should be aggregated using Kubernetes labels and correlated with events from Kubernetes and container technologies.
Part 2 of our series on Kubernetes monitoring guides you through all the data that needs to be collected and tracked.
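
To make label-based aggregation concrete, here is a minimal Python sketch that sums a resource metric across pods grouped by a label value. The sample data, the label keys, and the `aggregate_by_label` helper are hypothetical and are not taken from Heapster or any particular monitoring tool.

```python
from collections import defaultdict

# Hypothetical per-pod samples: Kubernetes-style labels plus a CPU reading in millicores.
SAMPLES = [
    {"pod": "web-1", "labels": {"app": "web", "tier": "frontend"}, "cpu_millicores": 120},
    {"pod": "web-2", "labels": {"app": "web", "tier": "frontend"}, "cpu_millicores": 80},
    {"pod": "db-1", "labels": {"app": "db", "tier": "backend"}, "cpu_millicores": 300},
]


def aggregate_by_label(samples, label_key, metric):
    """Sum `metric` across samples, grouped by the value of `label_key`."""
    totals = defaultdict(int)
    for sample in samples:
        # Pods missing the label are grouped under "<unlabeled>" so they stay visible.
        group = sample["labels"].get(label_key, "<unlabeled>")
        totals[group] += sample[metric]
    return dict(totals)


if __name__ == "__main__":
    print(aggregate_by_label(SAMPLES, "app", "cpu_millicores"))
```

The same samples can be sliced by any other label key, such as `tier`, without changing how they were collected, which is why a consistent labeling schema matters.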

Collecting these metrics

Whether you want to track these key performance metrics by combining Heapster, a storage backend, and a graphing tool, or by integrating a monitoring tool with the different components of your infrastructure, Part 3, about Kubernetes metric collection, has you covered.

Anchors aweigh!

Using Kubernetes drastically simplifies container management. But it requires us to rethink our monitoring strategies on several fronts, and to make sure all the key metrics from the different components are properly collected, aggregated, and tracked. We hope our monitoring guide will help you to effectively monitor your Kubernetes clusters. Feedback and suggestions are more than welcome.

--Jean-Mathieu Saponaro, Research & Analytics Engineer, Datadog

  • Get involved with the Kubernetes project on GitHub 
  • Post questions (or answer questions) on Stack Overflow 
  • Connect with the community on Slack
  • Follow us on Twitter @Kubernetesio for latest updates

Thursday, May 18, 2017

Kargo Ansible Playbooks foster Collaborative Kubernetes Ops

Today’s guest post is by Rob Hirschfeld, co-founder of the open infrastructure automation project Digital Rebar and co-chair of SIG Cluster Ops.

Why Kargo?

Making Kubernetes operationally strong is a widely held priority and I track many deployment efforts around the project. The incubated Kargo project is of particular interest for me because it uses the popular Ansible toolset to build robust, upgradable clusters on both cloud and physical targets. I believe using tools familiar to operators grows our community.

We’re excited to see the breadth of platforms enabled by Kargo and how well it handles a wide range of options like integrating Ceph for StatefulSet persistence and Helm for easier application uploads. Those additions have allowed us to fully integrate the OpenStack Helm charts (demo video).

By working with the upstream source instead of creating different install scripts, we get the benefits of a larger community. This requires some extra development effort; however, we believe helping share operational practices makes the whole community stronger. That was also the motivation behind SIG Cluster Ops.

With Kargo delivering robust installs, we can focus on broader operational concerns.

For example, we can now drive parallel deployments, so it’s possible to fully exercise the options enabled by Kargo simultaneously for development and testing.  

That’s helpful for running build-test-destroy cycles on coordinated Kubernetes installs across CentOS, Red Hat and Ubuntu as part of an automation pipeline. We can also set up a full classroom environment from a single command using Digital Rebar’s providers, tenants and cluster definition JSON.

Let’s explore the classroom example:

First, we define a student cluster in JSON like the snippet below:

{
  "attribs": {
    "k8s-version": "v1.6.0",
    "k8s-kube_network_plugin": "calico",
    "k8s-docker_version": "1.12"
  },
  "name": "cluster01",
  "tenant": "cluster01",
  "public_keys": {
    "cluster01": "ssh-rsa AAAAB....."
  },
  "provider": {
    "name": "google-provider"
  },
  "nodes": [
    {
      "roles": [ "etcd", "k8s-addons", "k8s-master" ],
      "count": 1
    },
    {
      "roles": [ "k8s-worker" ],
      "count": 3
    }
  ]
}

Then we run the Digital Rebar workloads reference script which inspects the deployment files to pull out key information.  Basically, it automates the following steps:

rebar provider create {"name":"google-provider", [secret stuff]}
rebar tenants create {"name":"cluster01"}
rebar deployments create [contents from cluster01 file]

The deployments create command will automatically request nodes from the provider. Since we’re using tenants and SSH key additions, each student only gets access to their own cluster. When we’re done, adding the --destroy flag will reverse the process for the nodes and deployments but leave the providers and tenants.
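
As a rough illustration of what "inspecting the deployment file to pull out key information" could look like, the Python sketch below parses a cluster definition and summarizes it. It assumes the definition is a single JSON object with a `nodes` array of role groups; the `summarize_cluster` helper is hypothetical and is not part of Digital Rebar or Kargo.

```python
import json

# Cluster definition modeled on the classroom example; the SSH key stays elided as in the post.
CLUSTER_JSON = """
{
  "attribs": {
    "k8s-version": "v1.6.0",
    "k8s-kube_network_plugin": "calico",
    "k8s-docker_version": "1.12"
  },
  "name": "cluster01",
  "tenant": "cluster01",
  "public_keys": {"cluster01": "ssh-rsa AAAAB....."},
  "provider": {"name": "google-provider"},
  "nodes": [
    {"roles": ["etcd", "k8s-addons", "k8s-master"], "count": 1},
    {"roles": ["k8s-worker"], "count": 3}
  ]
}
"""


def summarize_cluster(definition):
    """Extract the provider, tenant, and node counts a deployment script would need."""
    nodes_per_role = {}
    for group in definition["nodes"]:
        for role in group["roles"]:
            nodes_per_role[role] = nodes_per_role.get(role, 0) + group["count"]
    return {
        "name": definition["name"],
        "tenant": definition["tenant"],
        "provider": definition["provider"]["name"],
        "total_nodes": sum(group["count"] for group in definition["nodes"]),
        "nodes_per_role": nodes_per_role,
    }


if __name__ == "__main__":
    print(summarize_cluster(json.loads(CLUSTER_JSON)))
```

Parsing the file before handing it to any tooling also catches malformed JSON early, before nodes are requested from the provider.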

We are invested in operational scripts like this example using Kargo and Digital Rebar because if we cannot manage variation in a consistent way then we’re doomed to operational fragmentation.  

I am excited to see and be part of the community progress towards enterprise-ready Kubernetes operations on both cloud and on-premises. That means I am seeing reasonable patterns emerge with sharable/reusable automation. I strongly recommend watching (or better, collaborating in) these efforts if you are deploying Kubernetes even at experimental scale. Being part of the community requires more upfront effort but returns dividends as you get the benefits of shared experience and improvement.

When deploying at scale, how do you set up a system to be both repeatable and multi-platform without compromising scale or security?

With Kargo and Digital Rebar as a repeatable base, extensions get much faster and easier. Even better, using upstream directly allows improvements to be quickly cycled back into upstream. That means we’re closer to building a community focused on the operational side of Kubernetes with an SRE mindset.

If this is interesting, please engage with us in the Cluster Ops SIG, Kargo or Digital Rebar communities. 

-- Rob Hirschfeld, co-founder of RackN and co-chair of the Cluster Ops SIG


Dancing at the Lip of a Volcano: The Kubernetes Security Process - Explained

Editor's note: Today’s post is by Jess Frazelle of Google and Brandon Philips of CoreOS about the Kubernetes security disclosures and response policy. 

Software running on servers underpins ever growing amounts of the world's commerce, communications, and physical infrastructure. And nearly all of these systems are connected to the internet, which means vital security updates must be applied rapidly. As software developers and IT professionals, we often find ourselves dancing on the edge of a volcano: we may either fall into magma-induced oblivion from a security vulnerability exploited before we can fix it, or we may slide off the side of the mountain because of an inadequate process to address security vulnerabilities.

The Kubernetes community believes that we can help teams restore their footing on this volcano with a foundation built on Kubernetes. And the bedrock of this foundation requires a process for quickly acknowledging, patching, and releasing security updates to an ever growing community of Kubernetes users. 

With over 1,200 contributors and over a million lines of code, each release of Kubernetes is a massive undertaking staffed by brave volunteer release managers. These normal releases are fully transparent and the process happens in public. However, security releases must be handled differently to keep potential attackers in the dark until a fix is made available to users.

We drew inspiration from other open source projects in order to create the Kubernetes security release process. Unlike a regularly scheduled release, a security release must be delivered on an accelerated schedule, and we created the Product Security Team to handle this process.

This team quickly selects a lead to coordinate work and manage communication with the people who disclosed the vulnerability and with the Kubernetes community. The security release process also documents ways to measure vulnerability severity using the Common Vulnerability Scoring System (CVSS) Version 3.0 Calculator. This calculation helps inform decisions on release cadence in the face of holidays or limited developer bandwidth. By making severity criteria transparent, we are able to better set expectations and hit critical timelines during an incident, where we strive to:
  • Respond to the person or team who reported the vulnerability and staff a development team responsible for a fix within 24 hours
  • Disclose a forthcoming fix to users within 7 days of disclosure
  • Provide advance notice to vendors within 14 days of disclosure
  • Release a fix within 21 days of disclosure
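
The target windows above translate directly into calendar deadlines. The Python sketch below computes them from a report time; the `response_deadlines` helper and the milestone names are hypothetical, and measuring all four windows from the same timestamp is a simplifying assumption.

```python
from datetime import datetime, timedelta

# Target windows from the list above, all measured from the time the report arrives.
TARGETS = [
    ("respond and staff a fix team", timedelta(hours=24)),
    ("disclose forthcoming fix to users", timedelta(days=7)),
    ("give advance notice to vendors", timedelta(days=14)),
    ("release the fix", timedelta(days=21)),
]


def response_deadlines(reported_at):
    """Return (milestone, deadline) pairs for a vulnerability reported at `reported_at`."""
    return [(milestone, reported_at + window) for milestone, window in TARGETS]


if __name__ == "__main__":
    for milestone, deadline in response_deadlines(datetime(2017, 5, 1, 9, 0)):
        print(f"{deadline:%Y-%m-%d %H:%M}  {milestone}")
```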

As we continue to harden Kubernetes, the security release process will help ensure that Kubernetes remains a secure platform for internet scale computing. If you are interested in learning more about the security release process please watch the presentation from KubeCon Europe 2017 on YouTube and follow along with the slides. If you are interested in learning more about authentication and authorization in Kubernetes, along with the Kubernetes cluster security model, consider joining Kubernetes SIG Auth. We also hope to see you at security related presentations and panels at the next Kubernetes community event: CoreOS Fest 2017 in San Francisco on May 31 and June 1.

As a thank you to the Kubernetes community, a special 25 percent discount to CoreOS Fest is available using k8s25code or via this special 25 percent off link to register today for CoreOS Fest 2017. 

--Brandon Philips of CoreOS and Jess Frazelle of Google

  • Post questions (or answer questions) on Stack Overflow
  • Join the community portal for advocates on K8sPort
  • Follow us on Twitter @Kubernetesio for latest updates
  • Connect with the community on Slack
  • Get involved with the Kubernetes project on GitHub

Friday, April 21, 2017

How Bitmovin is Doing Multi-Stage Canary Deployments with Kubernetes in the Cloud and On-Prem

Editor's Note: Today’s post is by Daniel Hoelbling-Inzko, Infrastructure Architect at Bitmovin, a company that provides services that transcode digital video and audio to streaming formats, sharing insights about their use of Kubernetes.

Running a large scale video encoding infrastructure on multiple public clouds is tough. At Bitmovin, we have been doing it successfully for the last few years, but from an engineering perspective, it’s neither been enjoyable nor particularly fun. 

So obviously, one of the main things that really sold us on using Kubernetes was its common abstraction over the different supported cloud providers and the well-thought-out programming interface it provides. More importantly, the Kubernetes project did not settle for the lowest common denominator approach. Instead, they added the necessary abstract concepts that are required and useful to run containerized workloads in a cloud and then did all the hard work to map these concepts to the different cloud providers and their offerings.

The great stability, speed and operational reliability we saw in our early tests in mid-2016 made the migration to Kubernetes a no-brainer.

And it didn’t hurt that the vision for scale the Kubernetes project has been pursuing is closely aligned with our own goals as a company. Aiming for >1,000 node clusters might be a lofty goal, but for a fast growing video company like ours, having your infrastructure aim to support future growth is essential. Also, after initial brainstorming for our new infrastructure, we immediately knew that we would be running a huge number of containers, and a system with the express goal of working at global scale was the perfect fit for us. Now with the recent Kubernetes 1.6 release and its support for 5,000 node clusters, we feel even more validated in our choice of a container orchestration system.

During the testing and migration phase of getting our infrastructure running on Kubernetes, we got quite familiar with the Kubernetes API and the whole ecosystem around it. So when we were looking at expanding our cloud video encoding offering for customers to use in their own datacenters or cloud environments, we quickly decided to leverage Kubernetes as our ubiquitous cloud operating system to base the solution on.

Just a few months later this effort has become our newest service offering: Bitmovin Managed On-Premise encoding. Since all Kubernetes clusters share the same API, adapting our cloud encoding service to also run on Kubernetes enabled us to deploy into our customers’ datacenters, regardless of the hardware infrastructure running underneath. With great tools from the community like kube-up, and turnkey solutions like Google Container Engine, anyone can easily provision a new Kubernetes cluster, either within their own infrastructure or in their own cloud accounts.

To give us the maximum flexibility for customers that deploy to bare metal and might not have any custom cloud integrations for Kubernetes yet, we decided to base our solution solely on facilities that are available in any Kubernetes install and don’t require any integration into the surrounding infrastructure (it will even run inside Minikube!). We don’t rely on Services of type LoadBalancer, primarily because enterprise IT is usually reluctant to open up ports to the open internet - and not every bare metal Kubernetes install supports externally provisioned load balancers out of the box. To avoid these issues, we deploy a BitmovinAgent that runs inside the Cluster and polls our API for new encoding jobs without requiring any network setup. This agent then uses the locally available Kubernetes credentials to start up new deployments that run the encoders on the available hardware through the Kubernetes API.

Even without having a full cloud integration available, the consistent scheduling, health checking and monitoring we get from using the Kubernetes API really enabled us to focus on making the encoder work inside a container rather than spending precious engineering resources on integrating a bunch of different hypervisors, machine provisioners and monitoring systems.

Multi-Stage Canary Deployments

Our first encounters with the Kubernetes API were not for the On-Premise encoding product. Building our containerized encoding workflow on Kubernetes was rather a decision we made after seeing how incredibly easy and powerful the Kubernetes platform proved during development and rollout of our Bitmovin API infrastructure. We migrated to Kubernetes around four months ago and it has enabled us to provide rapid development iterations to our service while meeting our requirements of downtime-free deployments and a stable development to production pipeline. To achieve this we came up with an architecture that runs almost a thousand containers and meets the following requirements we had laid out on day one:

  1. Zero downtime deployments for our customers
  2. Continuous deployment to production on each git mainline push
  3. High stability of deployed services for customers

Obviously, #2 and #3 are at odds with each other: if each merged feature gets deployed to production right away, how can we ensure these releases are bug-free and don’t have adverse side effects for our customers?

To resolve this contradiction, we came up with a four-stage canary pipeline for each microservice where we simultaneously deploy to production and keep changes away from customers until the new build has proven to work reliably and correctly in the production environment.

Once a new build is pushed, we deploy it to an internal stage that’s only accessible for our internal tests and the integration test suite. Once the internal test suite passes, QA reports no issues, and we don’t detect any abnormal behavior, we push the new build to our free stage. This means that 5% of our free users would get randomly assigned to this new build. After some time in this stage the build gets promoted to the next stage that gets 5% of our paid users routed to it. Only once the build has successfully passed all 3 of these hurdles, does it get deployed to the production tier, where it will receive all traffic from our remaining users as well as our enterprise customers, which are not part of the paid bucket and never see their traffic routed to a canary track.
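
The promotion rule described above can be sketched as a tiny state machine in Python. The stage names follow the text; the `next_stage` helper and the single `checks_passed` flag (standing in for tests, QA sign-off, and anomaly detection) are hypothetical simplifications, not Bitmovin's actual tooling.

```python
# Stages in promotion order, as described in the text.
STAGES = ["internal", "free", "paid", "production"]


def next_stage(current, checks_passed):
    """Return the stage a build moves to next.

    A build stays where it is if its checks failed or it is already in production.
    """
    index = STAGES.index(current)
    if not checks_passed or index == len(STAGES) - 1:
        return current
    return STAGES[index + 1]


if __name__ == "__main__":
    stage = "internal"
    # Walk a healthy build through every promotion step.
    while stage != "production":
        stage = next_stage(stage, checks_passed=True)
        print("promoted to", stage)
```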

This setup makes us a pretty big Kubernetes installation by default, since all of our canary tiers are available at a minimum replication of 2. Since we are currently deploying around 30 microservices (and growing) to our clusters, it adds up to a minimum of 10 pods per service (8 application pods + minimum 2 HAProxy pods that do the canary routing). In reality, our preferred standard configuration usually runs 2 internal, 4 free, 4 others and 10 production pods alongside 4 HAProxy pods, totalling around 700 pods. This also means that we are running at least 150 services that provide a static ClusterIP to their underlying microservice canary tier.
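
The arithmetic behind these figures can be checked with a short Python sketch. The constants come from the paragraph above; the `pod_budget` helper is hypothetical, and the Service count assumes one ClusterIP Service per canary tier plus one for the HAProxy pods of each microservice, which the post does not state explicitly.

```python
# Figures quoted in the post.
MICROSERVICES = 30
CANARY_TIERS = ("internal", "free", "others", "production")
MIN_REPLICAS_PER_TIER = 2
MIN_HAPROXY_PODS = 2
PREFERRED_PODS = {"internal": 2, "free": 4, "others": 4, "production": 10, "haproxy": 4}


def pod_budget():
    """Reproduce the pod and Service counts from the figures above."""
    min_per_service = len(CANARY_TIERS) * MIN_REPLICAS_PER_TIER + MIN_HAPROXY_PODS
    preferred_per_service = sum(PREFERRED_PODS.values())
    return {
        "min_pods_per_service": min_per_service,
        "min_pods_total": MICROSERVICES * min_per_service,
        "preferred_pods_per_service": preferred_per_service,
        "preferred_pods_total": MICROSERVICES * preferred_per_service,
        # Assumption: one ClusterIP Service per canary tier plus one for the HAProxy pods.
        "cluster_ip_services": MICROSERVICES * (len(CANARY_TIERS) + 1),
    }


if __name__ == "__main__":
    for name, value in pod_budget().items():
        print(f"{name}: {value}")
```

With the preferred configuration this gives 24 pods per microservice and 720 in total, consistent with the "around 700" quoted above.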

A typical deployment looks like this:

Services (ClusterIP)

An example service definition for the production track has the following label selectors:

apiVersion: v1
kind: Service
metadata:
  name: account-service-production
  labels:
    app: account-service-production
    tier: service
    lb: private
spec:
  ports:
  - port: 8080
    name: http
    targetPort: 8080
    protocol: TCP
  selector:
    app: account-service
    tier: service
    track: production

In front of the Kubernetes services, load balancing the different canary versions of the service, lives a small cluster of HAProxy pods that get their haproxy.conf from a Kubernetes ConfigMap that looks something like this:

frontend http-in
 bind *:80
 log local2 debug

 acl traffic_internal    hdr(X-Traffic-Group) -m str -i INTERNAL
 acl traffic_free        hdr(X-Traffic-Group) -m str -i FREE
 acl traffic_enterprise  hdr(X-Traffic-Group) -m str -i ENTERPRISE

 use_backend internal   if traffic_internal
 use_backend canary     if traffic_free
 use_backend enterprise if traffic_enterprise

 default_backend paid

backend internal
 balance roundrobin
 server internal-lb        user-resource-service-internal:8080   resolvers dns check inter 2000
backend canary
 balance roundrobin
 server canary-lb          user-resource-service-canary:8080     resolvers dns check inter 2000 weight 5
 server production-lb      user-resource-service-production:8080 resolvers dns check inter 2000 weight 95
backend paid
 balance roundrobin
 server canary-paid-lb     user-resource-service-paid:8080       resolvers dns check inter 2000 weight 5
 server production-lb      user-resource-service-production:8080 resolvers dns check inter 2000 weight 95
backend enterprise
 balance roundrobin
 server production-lb      user-resource-service-production:8080 resolvers dns check inter 2000 weight 100

Each HAProxy will inspect a header that gets assigned by our API-Gateway called X-Traffic-Group that determines which bucket of customers this request belongs to. Based on that, a decision is made to hit either a canary deployment or the production deployment.
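
To show how that routing decision plays out, the Python sketch below mimics the configuration above: the header selects a backend, and the backend's server weights decide between a canary track and production. It is an approximation for illustration only. HAProxy's weighted roundrobin is deterministic, whereas this sketch uses a weighted random choice, and the `choose_backend` and `choose_track` helpers are hypothetical names.

```python
import random

# Backends and server weights mirroring the haproxy.conf above.
BACKENDS = {
    "internal": [("internal", 100)],
    "canary": [("canary", 5), ("production", 95)],
    "paid": [("paid", 5), ("production", 95)],
    "enterprise": [("production", 100)],
}

# ACL matches on X-Traffic-Group; anything unmatched falls through to default_backend paid.
GROUP_TO_BACKEND = {"INTERNAL": "internal", "FREE": "canary", "ENTERPRISE": "enterprise"}


def choose_backend(headers):
    """Pick the backend for a request, matching the header value case-insensitively."""
    group = headers.get("X-Traffic-Group", "").upper()
    return GROUP_TO_BACKEND.get(group, "paid")


def choose_track(headers, rng):
    """Pick the track that serves the request, weighted like the backend's server lines."""
    servers = BACKENDS[choose_backend(headers)]
    tracks = [track for track, _ in servers]
    weights = [weight for _, weight in servers]
    return rng.choices(tracks, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(1)
    print(choose_track({"X-Traffic-Group": "FREE"}, rng))
```

Note that enterprise requests only ever reach the production track, while a request without a recognized header is treated as paid traffic.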

Obviously, at this scale, kubectl (while still our main day-to-day tool to work on the cluster) doesn’t really give us a good overview of whether everything is actually running as it’s supposed to and what might be over- or under-replicated.

Since we do blue/green deployments, we sometimes forget to shut down the old version after the new one comes up, so some services might be running over-replicated, and finding these issues in a soup of 25 deployments listed in kubectl is not trivial, to say the least.
So having a container orchestrator like Kubernetes that’s very API driven was really a godsend for us, as it allowed us to write tools that take care of that.

We built tools that either run directly off kubectl (e.g., bash scripts) or interact directly with the API and understand our special architecture to give us a quick overview of the system. These tools were mostly built in Go using the client-go library.

One of these tools is worth highlighting, as it’s basically our only way to really see service health at a glance. It goes through all our Kubernetes services that have the tier: service selector and checks if the accompanying HAProxy deployment is available and all pods are running with 4 replicas. It also checks if the 4 services behind the HAProxys (internal, free, others and production) have at least 2 endpoints running. If any of these conditions are not met, we immediately get a notification in Slack and by email.
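
The checks this tool performs can be sketched on plain data. The real tool was written in Go against the Kubernetes API; the Python below is only an illustration of the rules, and the `ServiceSnapshot` type and `find_problems` helper are hypothetical. Treating any HAProxy replica count other than 4 as a problem is an interpretation of "running with 4 replicas".

```python
from dataclasses import dataclass

TRACKS = ("internal", "free", "others", "production")
EXPECTED_HAPROXY_REPLICAS = 4
MIN_ENDPOINTS_PER_TRACK = 2


@dataclass
class ServiceSnapshot:
    """Hypothetical, already-fetched view of one microservice's state."""
    name: str
    haproxy_ready_replicas: int
    ready_endpoints: dict  # track name -> number of ready endpoints


def find_problems(snapshot):
    """Return human-readable problems; an empty list means the service looks healthy."""
    problems = []
    if snapshot.haproxy_ready_replicas != EXPECTED_HAPROXY_REPLICAS:
        problems.append(
            f"{snapshot.name}: HAProxy has {snapshot.haproxy_ready_replicas} ready replicas, "
            f"expected {EXPECTED_HAPROXY_REPLICAS}"
        )
    for track in TRACKS:
        ready = snapshot.ready_endpoints.get(track, 0)
        if ready < MIN_ENDPOINTS_PER_TRACK:
            problems.append(
                f"{snapshot.name}: track '{track}' has {ready} endpoints, "
                f"needs at least {MIN_ENDPOINTS_PER_TRACK}"
            )
    return problems


if __name__ == "__main__":
    degraded = ServiceSnapshot(
        "account-service", 3, {"internal": 2, "free": 1, "others": 4, "production": 10}
    )
    for problem in find_problems(degraded):
        print(problem)  # a real tool would send these to Slack and email instead
```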

Managing this many pods with our previous orchestrator proved very unreliable and the overlay network frequently caused issues. Not so with Kubernetes - even doubling our current workload for test purposes worked flawlessly and in general, the cluster has been working like clockwork ever since we installed it.

Another advantage of switching over to Kubernetes was the availability of the Kubernetes resource specifications, in addition to the API (which we used to write some internal tools for deployment). This enabled us to have a Git repo with all our Kubernetes specifications, where each track is generated off a common template and only contains placeholders for variable things like the canary track and the names.

All changes to the cluster have to go through tools that modify these resource specifications and are checked into git automatically, so whenever we see issues, we can trace what changes the infrastructure went through over time!
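
Generating each track from a common template could look roughly like the Python sketch below, which fills placeholders for the service name and canary track. The template layout follows the standard Kubernetes Service schema and the labels shown earlier; the placeholder names, the track list, and the `render_tracks` helper are hypothetical, not Bitmovin's actual templates.

```python
from string import Template

# Common Service template; only the app name and the track vary between renderings.
SERVICE_TEMPLATE = Template("""\
apiVersion: v1
kind: Service
metadata:
  name: ${app}-${track}
  labels:
    app: ${app}-${track}
    tier: service
    lb: private
spec:
  ports:
  - port: 8080
    name: http
    targetPort: 8080
    protocol: TCP
  selector:
    app: ${app}
    tier: service
    track: ${track}
""")

TRACKS = ("internal", "free", "others", "production")


def render_tracks(app, tracks=TRACKS):
    """Render one Service specification per canary track from the shared template."""
    return {track: SERVICE_TEMPLATE.substitute(app=app, track=track) for track in tracks}


if __name__ == "__main__":
    print(render_tracks("account-service")["production"])
```

Writing each rendered specification to a file in the repository gives every cluster change a reviewable diff.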

To summarize this post - by migrating our infrastructure to Kubernetes, Bitmovin is able to have:
  • Zero downtime deployments, allowing our customers to encode 24/7 without interruption
  • Fast development to production cycles, enabling us to ship new features faster
  • Multiple levels of quality assurance and high confidence in production deployments
  • Ubiquitous abstractions across cloud architectures and on-premise deployments
  • Stable and reliable health-checking and scheduling of services
  • Custom tooling around our infrastructure to check and validate the system
  • History of deployments (resource specifications in git + custom tooling)

We want to thank the Kubernetes community for the incredible job they have done with the project. The velocity at which the project moves is just breathtaking! Maintaining such a high level of quality and robustness in such a diverse environment is really astonishing. 

--Daniel Hoelbling-Inzko, Infrastructure Architect, Bitmovin

  • Post questions (or answer questions) on Stack Overflow
  • Join the community portal for advocates on K8sPort
  • Get involved with the Kubernetes project on GitHub
  • Follow us on Twitter @Kubernetesio for latest updates
  • Connect with the community on Slack
  • Download Kubernetes