An open source system for automating deployment, scaling, and operations of applications.

Monday, July 25, 2016

Why OpenStack's embrace of Kubernetes is great for both communities

Today, Mirantis, the leading contributor to OpenStack, announced that it will re-write its private cloud platform to use Kubernetes as its underlying orchestration engine. We think this is a great step forward for both the OpenStack and Kubernetes communities. With Kubernetes under the hood, OpenStack users will benefit from the tremendous efficiency, manageability and resiliency that Kubernetes brings to the table, while positioning their applications to use more cloud-native patterns. The Kubernetes community, meanwhile, can feel confident in their choice of orchestration framework, while gaining the ability to manage both container- and VM-based applications from a single platform.

The Path to Cloud Native

Google spent over ten years developing, applying and refining the principles of cloud native computing. A cloud-native application is:

  • Container-packaged. Applications are composed of hermetically sealed, reusable units across diverse environments;
  • Dynamically scheduled, for increased infrastructure efficiency and decreased operational overhead; and 
  • Microservices-based. Loosely coupled components significantly increase the overall agility, resilience and maintainability of applications.

These principles have enabled us to build the largest, most efficient, most powerful cloud infrastructure in the world, which anyone can access via Google Cloud Platform. They are the same principles responsible for the recent surge in popularity of Linux containers. Two years ago, we open-sourced Kubernetes to spur adoption of containers and scalable, microservices-based applications, and the recently released Kubernetes version 1.3 introduces a number of features to bridge enterprise and cloud native workloads. We expect that adoption of cloud-native principles will drive the same benefits within the OpenStack community, as well as smoothing the path between OpenStack and the public cloud providers that embrace them.

Making OpenStack better

We hear from enterprise customers that they want to move towards cloud-native infrastructure and application patterns. Thus, it is hardly surprising that OpenStack would also move in this direction [1], with large OpenStack users such as eBay and GoDaddy adopting Kubernetes as key components of their stack. Kubernetes and cloud-native patterns will improve OpenStack lifecycle management by enabling rolling updates, versioning, and canary deployments of new components and features. In addition, OpenStack users will benefit from self-healing infrastructure, making OpenStack easier to manage and more resilient to the failure of core services and individual compute nodes. Finally, OpenStack users will realize the developer and resource efficiencies that come with a container-based infrastructure.

OpenStack is a great tool for Kubernetes users

Conversely, incorporating Kubernetes into OpenStack will give Kubernetes users access to a robust framework for deploying and managing applications built on virtual machines. As users move to the cloud-native model, they will be faced with the challenge of managing hybrid application architectures that contain some mix of virtual machines and Linux containers. The combination of Kubernetes and OpenStack means that they can do so on the same platform using a common set of tools.

We are excited by the ever increasing momentum of the cloud-native movement as embodied by Kubernetes and related projects, and look forward to working with Mirantis, its partner Intel, and others within the OpenStack community to brings the benefits of cloud-native to their applications and infrastructure.


--Martin Buhr, Product Manager, Strategic Initiatives, Google

[1] Check out the announcement of Kubernetes-OpenStack Special Interest Group here, and a great talk about OpenStack on Kubernetes by CoreOS CEO Alex Polvi at the most recent OpenStack summit here.


Thursday, July 21, 2016

The Bet on Kubernetes, a Red Hat Perspective

Editor’s note: Today’s guest post is from a Kubernetes contributor Clayton Coleman, Architect on OpenShift at Red Hat, sharing their adoption of the project from its beginnings.

Two years ago, Red Hat made a big bet on Kubernetes. We bet on a simple idea: that an open source community is the best place to build the future of application orchestration, and that only an open source community could successfully integrate the diverse range of capabilities necessary to succeed. As a Red Hatter, that idea is not far-fetched - we’ve seen it successfully applied in many communities, but we’ve also seen it fail, especially when a broad reach is not supported by solid foundations. On the one year anniversary of Kubernetes 1.0, two years after the first open-source commit to the Kubernetes project, it’s worth asking the question:

Was Kubernetes the right bet?

The success of software is measured by the successes of its users - whether that software enables for them new opportunities or efficiencies. In that regard, Kubernetes has succeeded beyond our wildest dreams. We know of hundreds of real production deployments of Kubernetes, in the enterprise through Red Hat’s multi-tenant enabled OpenShift distribution, on Google Container Engine (GKE), in heavily customized versions run by some of the world's largest software companies, and through the education, entertainment, startup, and do-it-yourself communities. Those deployers report improved time to delivery, standardized application lifecycles, improved resource utilization, and more resilient and robust applications. And that’s just from customers or contributors to the community - I would not be surprised if there were now thousands of installations of Kubernetes managing tens of thousands of real applications out in the wild.

I believe that reach to be a validation of the vision underlying Kubernetes: to build a platform for all applications by providing tools for each of the core patterns in distributed computing. Those patterns:

  • simple replicated web software
  • distributed load balancing and service discovery
  • immutable images run in containers
  • co-location of related software into pods
  • simplified consumption of network attached storage
  • flexible and powerful resource scheduling
  • running batch and scheduled jobs alongside service workloads
  • managing and maintaining clustered software like databases and message queues


Allow developers and operators to move to the next scale of abstraction, just like they have enabled Google and others in the tech ecosystem to scale to datacenter computers and beyond. From Kubernetes 1.0 to 1.3 we have continually improved the power and flexibility of the platform while ALSO improving performance, scalability, reliability, and usability. The explosion of integrations and tools that run on top of Kubernetes further validates core architectural decisions to be composable, to expose open and flexible APIs, and to deliberately limit the core platform and encourage extension.

Today Kubernetes has one of the largest and most vibrant communities in the open source ecosystem, with almost a thousand contributors, one of the highest human-generated commit rates of any single-repository project on GitHub, over a thousand projects based around Kubernetes, and correspondingly active Stack Overflow and Slack channels. Red Hat is proud to be part of this ecosystem as the largest contributor to Kubernetes after Google, and every day more companies and individuals join us. The idea of Kubernetes found fertile ground, and you, the community, provided the excitement and commitment that made it grow.

So, did we bet correctly? For all the reasons above, and hundreds more: Yes.

What’s next?

Happy as we are with the success of Kubernetes, this is no time to rest! While there are many more features and improvements we want to build into Kubernetes, I think there is a general consensus that we want to focus on the only long term goal that matters - a healthy, successful, and thriving technical community around Kubernetes. As John F. Kennedy probably said: 

> Ask not what your community can do for you, but what you can do for your community

In a recent post to the kubernetes-dev list, Brian Grant laid out a great set of near term goals - goals that help grow the community, refine how we execute, and enable future expansion. In each of the Kubernetes Special Interest Groups we are trying to build sustainable teams that can execute across companies and communities, and we are actively working to ensure each of these SIGs is able to contribute, coordinate, and deliver across a diverse range of interests under one vision for the project.

Of special interest to us is the story of extension - how the core of Kubernetes can become the beating heart of the datacenter operating system, and enable even more patterns for application management to build on top of Kubernetes, not just into it. Work done in the 1.2 and 1.3 releases around third party APIs, API discovery, flexible scheduler policy, external authorization and authentication (beyond those built into Kubernetes) is just the start. When someone has a need, we want them to easily find a solution, and we also want it to be easy for others to consume and contribute to that solution. Likewise, the best way to prove ideas is to prototype them against real needs and to iterate against real problems, which should be easy and natural.

By Kubernetes’ second birthday, I hope to reflect back on a long year of refinement, user success, and community participation. It has been a privilege and an honor to contribute to Kubernetes, and it still feels like we are just getting started. Thank you, and I hope you come along for the ride!

-- Clayton Coleman, Contributor and Architect on Kubernetes and OpenShift at Red Hat. Follow him on Twitter and GitHub: @smarterclayton

Happy Birthday Kubernetes. Oh, the places you’ll go!

Editor’s note, Today’s guest post is from an independent Kubernetes contributor, Justin Santa Barbara, sharing his reflection on growth of the project from inception to its future.

Dear K8s,

It’s hard to believe you’re only one - you’ve grown up so fast. On the occasion of your first birthday, I thought I would write a little note about why I was so excited when you were born, why I feel fortunate to be part of the group that is raising you, and why I’m eager to watch you continue to grow up!

--Justin

You started with an excellent foundation - good declarative functionality, built around a solid API with a well defined schema and the machinery so that we could evolve going forwards. And sure enough, over your first year you grew so fast: autoscaling, HTTP load-balancing support (Ingress), support for persistent workloads including clustered databases (PetSets). You’ve made friends with more clouds (welcome Azure & OpenStack to the family), and even started to span zones and clusters (Federation). And these are just some of the most visible changes - there’s so much happening inside that brain of yours!

I think it’s wonderful you’ve remained so open in all that you do - you seem to write down everything on Github - for better or worse. I think we’ve all learned a lot about that on the way, like the perils of having engineers make scaling statements that are then weighed against claims made without quite the same framework of precision and rigor. But I’m proud that you chose not to lower your standards, but rose to the challenge and just ran faster instead - it might not be the most realistic approach, but it is the only way to move mountains!

And yet, somehow, you’ve managed to avoid a lot of the common dead-ends that other open source software has fallen into, particularly as those projects got bigger and the developers end up working on it more than they use it directly. How did you do that? There’s a probably-apocryphal story of an employee at IBM that makes a huge mistake, and is summoned to meet with the big boss, expecting to be fired, only to be told “We just spent several million dollars training you. Why would we want to fire you?”. Despite all the investment google is pouring into you (along with Redhat and others), I sometimes wonder if the mistakes we are avoiding could be worth even more. There is a very open development process, yet there’s also an “oracle” that will sometimes course-correct by telling us what happens two years down the road if we make a particular design decision. This is a parent you should probably listen to!

And so although you’re only a year old, you really have an old soul. I’m just one of the many people raising you, but it’s a wonderful learning experience for me to be able to work with the people that have built these incredible systems and have all this domain knowledge. Yet because we started from scratch (rather than taking the existing Borg code) we’re at the same level and can still have genuine discussions about how to raise you. Well, at least as close to the same level as we could ever be, but it’s to their credit that they are all far too nice ever to mention it!

If I would pick just two of the wise decisions those brilliant people made:

  • Labels & selectors give us declarative “pointers”, so we can say “why” we want things, rather than listing the things directly. It’s the secret to how you can scale to great heights; not by naming each step, but saying “a thousand more steps just like that first one”.
  • Controllers are state-synchronizers: we specify the goals, and your controllers will indefatigably work to bring the system to that state. They work through that strongly-typed API foundation, and are used throughout the code, so Kubernetes is more of a set of a hundred small programs than one big one. It’s not enough to scale to thousands of nodes technically; the project also has to scale to thousands of developers and features; and controllers help us get there.

And so on we will go! We’ll be replacing those controllers and building on more, and the API-foundation lets us build anything we can express in that way - with most things just a label or annotation away! But your thoughts will not be defined by language: with third party resources you can express anything you choose. Now we can build Kubernetes without building in Kubernetes, creating things that feel as much a part of Kubernetes as anything else. Many of the recent additions, like ingress, DNS integration, autoscaling and network policies were done or could be done in this way. Eventually it will be hard to imagine you before these things, but tomorrow’s standard functionality can start today, with no obstacles or gatekeeper, maybe even for an audience of one.

So I’m looking forward to seeing more and more growth happen further and further from the core of Kubernetes. We had to work our way through those phases; starting with things that needed to happen in the kernel of Kubernetes - like replacing replication controllers with deployments. Now we’re starting to build things that don’t require core changes. But we’re still still talking about infrastructure separately from applications. It’s what comes next that gets really interesting: when we start building applications that rely on the Kubernetes APIs. We’ve always had the Cassandra example that uses the Kubernetes API to self-assemble, but we haven’t really even started to explore this more widely yet. In the same way that the S3 APIs changed how we build things that remember, I think the k8s APIs are going to change how we build things that think.

So I’m looking forward to your second birthday: I can try to predict what you’ll look like then, but I know you’ll surpass even the most audacious things I can imagine. Oh, the places you’ll go!


-- Justin Santa Barbara, Independent Kubernetes Contributor

A Very Happy Birthday Kubernetes

Last year at OSCON, I got to reconnect with a bunch of friends and see what they have been working on. That turned out to be the Kubernetes 1.0 launch event. Even that day, it was clear the project was supported by a broad community -- a group that showed an ambitious vision for distributed computing. 

Today, on the first anniversary of the Kubernetes 1.0 launch, it’s amazing to see what a community of dedicated individuals can do. Kubernauts have collectively put in 237 person years of coding effort since launch to bring forward our most recent release 1.3. However the community is much more than simply coding effort. It is made up of people -- individuals that have given their expertise and energy to make this project flourish. With more than 830 diverse contributors, from independents to the largest companies in the world, it’s their work that makes Kubernetes stand out. Here are stories from a couple early contributors reflecting back on the project:


The community is also more than online GitHub and Slack conversation; year one saw the launch of KubeCon, the Kubernetes user conference, which started as a grassroot effort that brought together 1,000 individuals between two events in San Francisco and London. The advocacy continues with users globally. There are more than 130 Meetup groups that mention Kubernetes, many of which are helping celebrate Kubernetes’ birthday. To join the celebration, participate at one of the 20 #k8sbday parties worldwide: Austin, Bangalore, Beijing, Boston, Cape Town, Charlotte, Cologne, Geneva, Karlsruhe, Kisumu, Montreal, Portland, Raleigh, Research Triangle, San Francisco, Seattle, Singapore, SF Bay Area, or Washington DC.


The Kubernetes community continues to work to make our project more welcoming and open to our kollaborators. This spring, Kubernetes and KubeCon moved to the Cloud Native Compute Foundation (CNCF), a Linux Foundation Project, to accelerate the collaborative vision outlined only a year ago at OSCON …. lifting a glass to another great year.



-- Sarah Novotny, Kubernetes Community Wonk

Monday, July 18, 2016

Bringing End-to-End Kubernetes Testing to Azure (Part 2)

Editor’s Note: Today’s guest post is Part II from a series by Travis Newhouse, Chief Architect at AppFormix, writing about their contributions to Kubernetes.

Historically, Kubernetes testing has been hosted by Google, running e2e tests on Google Compute Engine (GCE) and Google Container Engine (GKE). In fact, the gating checks for the submit-queue are a subset of tests executed on these test platforms. Federated testing aims to expand test coverage by enabling organizations to host test jobs for a variety of platforms and contribute test results to benefit the Kubernetes project. Members of the Kubernetes test team at Google and SIG-Testing have created a Kubernetes test history dashboard that publishes the results from all federated test jobs (including those hosted by Google).

In this blog post, we describe extending the e2e test jobs for Azure, and show how to contribute a federated test to the Kubernetes project.

END-TO-END INTEGRATION TESTS FOR AZURE

After successfully implementing “development distro” scripts to automate deployment of Kubernetes on Azure, our next goal was to run e2e integration tests and share the results with the Kubernetes community.

We automated our workflow for executing e2e tests of Kubernetes on Azure by defining a nightly job in our private Jenkins server. Figure 2 shows the workflow that uses kube-up.sh to deploy Kubernetes on Ubuntu virtual machines running in Azure, then executes the e2e tests. On completion of the tests, the job uploads the test results and logs to a Google Cloud Storage directory, in a format that can be processed by the scripts that produce the test history dashboard. Our Jenkins job uses the hack/jenkins/e2e-runner.sh and hack/jenkins/upload-to-gcs.sh scripts to produce the results in the correct format.

Kubernetes on Azure - Flow Chart - New Page.png
Figure 2 - Nightly test job workflow

HOW TO CONTRIBUTE AN E2E TEST


Throughout our work to create the Azure e2e test job, we have collaborated with members of SIG-Testing to find a way to publish the results to the Kubernetes community. The results of this collaboration are documentation and a streamlined process to contribute results from a federated test job. The steps to contribute e2e test results can be summarized in 4 steps.

  1. Create a Google Cloud Storage bucket in which to publish the results.
  2. Define an automated job to run the e2e tests. By setting a few environment variables, hack/jenkins/e2e-runner.sh deploys Kubernetes binaries and executes the tests.
  3. Upload the results using hack/jenkins/upload-to-gcs.sh.
  4. Incorporate the results into the test history dashboard by submitting a pull-request with modifications to a few files in kubernetes/test-infra.

The federated tests documentation describes these steps in more detail. The scripts to run e2e tests and upload results simplifies the work to contribute a new federated test job. The specific steps to set up an automated test job and an appropriate environment in which to deploy Kubernetes are left to the reader’s preferences. For organizations using Jenkins, the jenkins-job-builder configurations for GCE and GKE tests may provide helpful examples.

RETROSPECTIVE

The e2e tests on Azure have been running for several weeks now. During this period, we have found two issues in Kubernetes. Weixu Zhuang immediately published fixes that have been merged into the Kubernetes master branch.

The first issue happened when we wanted to bring up the Kubernetes cluster using SaltStack on Azure using Ubuntu VMs. A commit (07d7cfd3) modified the OpenVPN certificate generation script to use a variable that was only initialized by scripts in the cluster/ubuntu. Strict checking on existence of parameters by the certificate generation script caused other platforms that use the script to fail (e.g. our changes to support Azure). We submitted a pull-request that fixed the issue by initializing the variable with a default value to make the certificate generation scripts more robust across all platform types.

The second pull-request cleaned up an unused import in the Daemonset unit test file. The import statement broke the unit tests with golang 1.4. Our nightly Jenkins job helped us find this error and we promptly pushed a fix for it.

CONCLUSION AND FUTURE WORK

The addition of a nightly e2e test job for Kubernetes on Azure has helped to define the process to contribute a federated test to the Kubernetes project. During the course of the work, we also saw the immediate benefit of expanding test coverage to more platforms when our Azure test job identified compatibility issues.

We want to thank Aaron Crickenberger, Erick Fejta, Joe Finney, and Ryan Hutchinson for their help to incorporate the results of our Azure e2e tests into the Kubernetes test history. If you’d like to get involved with testing to create a stable, high quality releases of Kubernetes, join us in the Kubernetes Testing SIG (sig-testing).


--Travis Newhouse, Chief Architect at AppFormix



Sunday, July 17, 2016

Update on Kubernetes for Windows Server Containers

Today's post is written by Jitendra Bhurat, Product Manager at Apprenda, and Cesar Wong, Principal Software Engineer at Red Hat; describing the progress made to bring Kubernetes on Windows Server. 

Large organizations have significant investments and long-term roadmaps for both Linux and Windows Server. With Microsoft adopting Docker in Windows Server 2016, organizations are looking for a simplified means to orchestrating containers in both their Linux and Windows environments. 

As the adoption and contribution curve of Kubernetes is unmatched, there has been a lot of interest in making Kubernetes compatible with the Microsoft ecosystem. At Apprenda, we have been building .NET distributed systems for the better part of a decade and kicked off the porting of Kubernetes to Windows in June.

Our current goal is to develop a minimally viable proof of concept (POC) of Kubernetes on Windows Server 2016 so that the community can learn about the pitfalls that will need to be overcome for future production environments. As the community drives this project to its first milestone, a number of lessons have been learned, work is ongoing, and we are currently investigating a number of networking options on Windows Server.

GETTING STARTED

In June, Apprenda and Kismatic (acquired by Apprenda), formed the Kubernetes Windows SIG team, and along with Red Hat, formed a partnership to create a Windows version of the Kubernetes Kubelet and Kube-Proxy, the two key components for operating Kubernetes on Windows Server. For a minimum viable POC, the initial work was divided into the following logical areas to help facilitate project management:


Focus Area
Architectural Notes
Status
Part of POC
Container Runtime
Expanding Kubernetes container runtimes to support Windows Server 2016 Docker containers
Work for POC complete
Yes
cAdvisor
Resource usage and performance characteristics in Kubernetes for Docker containers. Other architectural features in Kubernetes can run without this component.  
Research will start after POC milestone
No
Pod Architecture
Fundamental unit of container bundling in Kubernetes. Current area of research ongoing on ultimate architecture in Windows Server.   
Work for POC close to complete and will be finished when networking work is completed.
Yes
Networking and Kube-Proxy
Networking and communications among components (e.g. services to pod, pod to pod, etc.).
Still active area of research for POC. For some parts of Kubernetes there are no direct parallels in Windows Server - e.g. IPTables. Currently investigating Open vSwitch for container networking.
Yes
OOM Score
Frees up memory by sacrificing processes when all else fails. There is no direct comparison for OOM in Windows Server.
Research will start after POC milestone
No

Cesar Wong from Red Hat offered to contribute to Container Runtime and Pod Architecture sections and Apprenda has been working on Kubelet Integration and Kube-Proxy. 

It is important to mention that for Windows Server environments, the Kubernetes control plane (API server, schedulers, etcd, etc) would continue to run on Linux and there would not be a Windows-only version of Kubernetes. Given the vast majority of organizations are running both Linux and Windows Server instances, this requirement is not a technical roadblock to adoption. For example, vSphere has a similar requirement

Container Runtime 

The current Kubelet implementation for Linux relies on an infrastructure container per pod to hold on to an IP address while other containers may be killed/restarted. This requires that containers be able to share their network stack (--net=container:id). In Docker for Windows, it is not possible to share the network stack across containers. It is also not possible to set a container’s DNS servers. In the container runtime POC for Windows, we created a new container runtime that removed the requirement of an infrastructure container. It also uses the IP of the first container in the pod as the pod’s IP address. With some limitations, however, it was possible to stand up pods with Windows-specific containers. 

It is worth noting that a refactor of the container runtime in the Kubelet is under way and the POC code will need to be updated to reflect this new runtime architecture.

Pod Architecture

Pods in Kubernetes can include multiple containers, all sharing the same network, PID and volume namespace. Since Windows containers cannot share namespaces, multiple containers inside a pod are more isolated from each other. Furthermore, each container gets its own IP address, while the pod will only expose a single IP address for all containers. 

At least initially, this means that Windows pods in Kubernetes should be limited to a single container. This also reflects the fact that containers in Windows are more monolithic than their Linux counterparts, including more of the underlying operating system pieces, including a service manager and other processes.
Kubelet Integration

The goal here was to have Kubelet running on Windows Server Container and be capable of accepting requests/commands from the Kubernetes API Server running on Linux. The Kubelet code base is, unfortunately, closely coupled with the Linux OS. For example, even for trivial things like finding the hostname, the Kubelet code assumes Linux as the operating system. Thanks to the versatility of Kubernetes, many of the OS dependencies (like hostname)  were fixable using command line flags to provide a default value.  

Thus far, we have been able to disjoin parts of the Kubelet from its dependencies which allow it to run on Windows with the proper abstraction layers. While the current status of the Kubelet is good enough for a POC, there is more work that needs to be done to get it into a state of general availability. For one, instead of using flags and environmental variables, it would be best to have code changes upstream. There are also a number of bugs that need fixing. For example, we encountered a Golang bug where moving a directory fails in Windows and had to provide an alternate implementation.  
Networking / Kube-Proxy
Organizations using containers need the ability to deploy multiple containers and pods on a single host, have them share an IP address and be able to easily talk to other containers and pods on that same host. To accomplish this design goal, we conducted a lot of research on Windows Container Networking, as this held the keys to both the Kubernetes pod and service level networking. 

We looked in-depth at the different networking modes supported by Windows Containers and found L2 Bridge Networking mode to be the most appropriate in our case, as it allowed inter-container communication across different hosts. Working closely with the Microsoft networking team, we were able to identify and resolve issues we were having setting up the L2 Bridge networking mode in our environment using the TP5 release of Windows Server 2016.  
As we dug deep into networking, two options were clear for the Kube-Proxy implementation:
  • Implement Kube-Proxy natively on Windows  
  • Run a Linux version of the Kube-Proxy on a Hyper-V VM using the same bridge as the other containers running on Windows and have the Kube-Proxy forward requests to the other containers and have the Windows host forward requests to the proxy

For the POC, we decided to run Kube-Proxy on a Hyper-V Linux virtual machine and configure L2 Bridge mode networking with a private subnet. This implementation would enable us to forward traffic from the Kube-Proxy to containers running on Windows. Unfortunately, it did not work as it did in theory. 

On further investigation, with the help of Microsoft, we determined that for this to work, we would have to configure the L2 Bridge networking mode with an externally accessible gateway and on the same subnet as the container host. Such a requirement goes against the networking isolation boundary that the pod currently enjoys because each container on that host and other hosts can communicate with each other and all the other container hosts. This means any container can talk to any other container regardless of pod membership.  
We are currently looking at using Open vSwitch (OVS) to configure overlay networking to overcome the issue described above. Cloudbase, which is also a member of the Kubernetes Windows SIG, is actively involved in this effort. Their team has successfully implemented OVS in Windows Server and their work is promising for this effort. The community is also currently engaged with the Microsoft lead on Windows Server networking to find an alternative in case this route does not pan out. 

As we continue to make progress on the POC, we welcome ideas from the community to help us advance this vision. You can connect with us in the following ways:

--Jitendra Bhurat, Product Manager at Apprenda. Container Runtime and Pod Architecture sections contributed by Cesar Wong, Principal Software Engineer at Red Hat

Friday, July 15, 2016

Steering an Automation Platform at Wercker with Kubernetes

Editor’s note: today’s guest post is by Andy Smith, the CTO of Wercker, sharing how Kubernetes helps them save time and speed up development.  

At Wercker we run millions of containers that execute our users’ CI/CD jobs. The vast majority of them are ephemeral and only last as long as builds, tests and deploys take to run, the rest are ephemeral, too -- aren't we all --, but tend to last a bit longer and run our infrastructure. As we are running many containers across many nodes, we were in need of a highly scalable scheduler that would make our lives easier, and as such, decided to implement Kubernetes.

Wercker is a container-centric automation platform that helps developers build, test and deploy their applications. We support any number of pipelines, ranging from building code, testing API-contracts between microservices, to pushing containers to registries, and deploying to schedulers. All of these pipeline jobs run inside Docker containers and each artifact can be a Docker container.

And of course we use Wercker to build Wercker, and deploy itself onto Kubernetes!

Overview

Because we are a platform for running multi-service cloud-native code we've made many design decisions around isolation. On the base level we use CoreOS and cloud-init to bootstrap a cluster of heterogeneous nodes which I have named Patricians, Peasants, as well as controller nodes that don't have a cool name and are just called Controllers. Maybe we should switch to Constables.

k8s-architecture.jpg


Patrician nodes are where the bulk of our infrastructure runs. These nodes have appropriate network interfaces to communicate with our backend services as well as be routable by various load balancers. This is where our logging is aggregated and sent off to logging services, our many microservices for reporting and processing the results of job runs, and our many microservices for handling API calls.

On the other end of the spectrum are the Peasant nodes where the public jobs are run. Public jobs consist of worker pods reading from a job queue and dynamically generating new runner pods to handle execution of the job. The job itself is an incarnation of our open source CLI tool, the same one you can run on your laptop with Docker installed. These nodes have very limited access to the rest of the infrastructure and the containers the jobs themselves run in are even further isolated.

Controllers are controllers, I bet ours look exactly the same as yours.

Dynamic Pods
Our heaviest use of the Kubernetes API is definitely our system of creating dynamic pods to serve as the runtime environment for our actual job execution. After pulling job descriptions from the queue we define a new pod containing all the relevant environment for checking out code, managing a cache, executing a job and uploading artifacts. We launch the pod, monitor its progress, and destroy it when the job is done.

Ingresses
In order to provide a backend for HTTP API calls and allow self-registration of handlers we make use of the Ingress system in Kubernetes. It wasn't the clearest thing to set up, but reading through enough of the nginx example eventually got us to a good spot where it is easy to connect services to the frontend.

Upcoming Features in 1.3

While we generally treat all of our pods and containers as ephemeral and expect rapid restarts on failures, we are looking forward to Pet Sets and Init Containers as ways to optimize some of our processes. We are also pleased with official support for Minikube coming along as it improves our local testing and development. 

Conclusion

Kubernetes saves us the non-trivial task of managing many, many containers across many nodes. It provides a robust API and tooling for introspecting these containers, and it includes much built in support for logging, metrics, monitoring and debugging. Service discovery and networking alone saves us so much time and speeds development immensely.
Cheers to you Kubernetes, keep up the good work :)

-- Andy Smith, CTO, Wercker

Dashboard - Full Featured Web Interface for Kubernetes


Editor’s note: this post is part of a series of in-depth articles on what's new in Kubernetes 1.3

Kubernetes Dashboard is a project that aims to bring a general purpose monitoring and operational web interface to the Kubernetes world. Three months ago we released the first production ready version, and since then the dashboard has made massive improvements. In a single UI, you’re able to perform majority of possible interactions with your Kubernetes clusters without ever leaving your browser. This blog post breaks down new features introduced in the latest release and outlines the roadmap for the future. 

Full-Featured Dashboard

Thanks to a large number of contributions from the community and project members, we were able to deliver many new features for Kubernetes 1.3 release. We have been carefully listening to all the great feedback we have received from our users (see the summary infographics) and addressed the highest priority requests and pain points.

The Dashboard UI now handles all workload resources. This means that no matter what workload type you run, it is visible in the web interface and you can do operational changes on it. For example, you can modify your stateful MySQL installation with Pet Sets, do a rolling update of your web server with Deployments or install cluster monitoring with DaemonSets. 


Home screen that shows all workloads running in a cluster.

In addition to viewing resources, you can create, edit, update, and delete them. This feature enables many use cases. For example, you can kill a failed Pod, do a rolling update on a Deployment, or just organize your resources. You can also export and import YAML configuration files of your cloud apps and store them in a version control system.


YAML resource editor and exporter.

The release includes a beta view of cluster nodes for administration and operational use cases. The UI lists all nodes in the cluster to allow for overview analysis and quick screening for problematic nodes. The details view shows all information about the node and links to pods running on it.


Node view that lists its details and Pods running on it.

There are also many smaller scope new features that the we shipped with the release, namely: support for namespaced resources, internationalization, performance improvements, and many bug fixes (find out more in the release notes). All these improvements result in a better and simpler user experience of the product.

Future Work

The team has ambitious plans for the future spanning across multiple use cases. We are also open to all feature requests, which you can post on our issue tracker.

Here is a list of our focus areas for the following months:
  • Handle more Kubernetes resources - To show all resources that a cluster user may potentially interact with. Once done, Dashboard can act as a complete replacement for CLI. 
  • Monitoring and troubleshooting - To add resource usage statistics/graphs to the objects shown in Dashboard. This focus area will allow for actionable debugging and troubleshooting of cloud applications.
  • Security, auth and logging in - Make Dashboard accessible from networks external to a Cluster and work with custom authentication systems.

Connect With Us

We would love to talk with you and hear your feedback!


-- Piotr Bryk, Software Engineer, Google

Thursday, July 14, 2016

Citrix + Kubernetes = A Home Run

Editor’s note: today’s guest post is by Mikko Disini, a Director of Product Management at Citrix Systems, sharing their collaboration experience on a Kubernetes integration. 

Technical collaboration is like sports. If you work together as a team, you can go down the homestretch and pull through for a win. That’s our experience with the Google Cloud Platform team.

Recently, we approached Google Cloud Platform (GCP) to collaborate on behalf of Citrix customers and the broader enterprise market looking to migrate workloads. This migration required including the NetScaler Docker load balancer, CPX, into Kubernetes nodes and resolving any issues with getting traffic into the CPX proxies.  

Why NetScaler and Kubernetes?

  1. Citrix customers want the same Layer 4 to Layer 7 capabilities from NetScaler that they have on-prem as they move to the cloud as they begin deploying their container and microservices architecture with Kubernetes 
  2. Kubernetes provides a proven infrastructure for running containers and VMs with automated workload delivery
  3. NetScaler CPX provides Layer 4 to Layer 7 services and highly efficient telemetry data to a logging and analytics platform, NetScaler Management and Analytics System

I wish all our experiences working together with a technical partner were as good as working with GCP. We had a list of issues to enable our use cases and were able to collaborate swiftly on a solution. To resolve these, GCP team offered in depth technical assistance, working with Citrix such that NetScaler CPX can spin up and take over as a client-side proxy running on each host. 

Next, NetScaler CPX needed to be inserted in the data path of GCP ingress load balancer so that NetScaler CPX can spread traffic to front end web servers. The NetScaler team made modifications so that NetScaler CPX listens to API server events and configures itself to create a VIP, IP table rules and server rules to take ingress traffic and load balance across front end applications. Google Cloud Platform team provided feedback and assistance to verify modifications made to overcome the technical hurdles. Done!

NetScaler CPX use case is supported in Kubernetes 1.3. Citrix customers and the broader enterprise market will have the opportunity to leverage NetScaler with Kubernetes, thereby lowering the friction to move workloads to the cloud. 

You can learn more about NetScaler CPX here.


 -- Mikko Disini, Director of Product Management - NetScaler, Citrix Systems

Cross Cluster Services - Achieving Higher Availability for your Kubernetes Applications


Editor’s note: this post is part of a series of in-depth articles on what's new in Kubernetes 1.3


As Kubernetes users scale their production deployments we’ve heard a clear desire to deploy services across zone, region, cluster and cloud boundaries. Services that span clusters provide geographic distribution, enable hybrid and multi-cloud scenarios and improve the level of high availability beyond single cluster multi-zone deployments. Customers who want their services to span one or more (possibly remote) clusters, need them to be reachable in a consistent manner from both within and outside their clusters.

In Kubernetes 1.3, our goal was to minimize the friction points and reduce the management/operational overhead associated with deploying a service with geographic distribution to multiple clusters. This post explains how to do this. 

Note: Though the examples used here leverage Google Container Engine (GKE) to provision Kubernetes clusters, they work anywhere you want to deploy Kubernetes.

Let’s get started. The first step is to create is to create Kubernetes clusters into 4 Google Cloud Platform (GCP) regions using GKE.

  • asia-east1-b
  • europe-west1-b
  • us-east1-b
  • us-central1-b

Let’s run the following commands to build the clusters:


gcloud container clusters create gce-asia-east1 \
 --scopes cloud-platform \
 --zone asia-east1-b
gcloud container clusters create gce-europe-west1 \
 --scopes cloud-platform \
 --zone=europe-west1-b
gcloud container clusters create gce-us-east1 \
 --scopes cloud-platform \
 --zone=us-east1-b
gcloud container clusters create gce-us-central1 \
 --scopes cloud-platform \
 --zone=us-central1-b

Let’s verify the clusters are created:

gcloud container clusters list
NAME              ZONE            MASTER_VERSION  MASTER_IP       NUM_NODES  STATUS
gce-asia-east1    asia-east1-b    1.2.4           104.XXX.XXX.XXX 3          RUNNING
gce-europe-west1  europe-west1-b  1.2.4           130.XXX.XX.XX   3          RUNNING
gce-us-central1   us-central1-b   1.2.4           104.XXX.XXX.XX  3          RUNNING
gce-us-east1      us-east1-b      1.2.4           104.XXX.XX.XXX  3          RUNNING


The next step is to bootstrap the clusters and deploy the federation control plane on one of the clusters that has been provisioned. If you’d like to follow along, refer to Kelsey Hightower’s tutorial which walks through the steps involved.

Federated Services

Federated Services are directed to the Federation API endpoint and specify the desired properties of your service.

Once created, the Federated Service automatically:
  1. creates matching Kubernetes Services in every cluster underlying your cluster federation,
  2. monitors the health of those service "shards" (and the clusters in which they reside), and
  3. manages a set of DNS records in a public DNS provider (like Google Cloud DNS, or AWS Route 53), thus ensuring that clients of your federated service can seamlessly locate an appropriate healthy service endpoint at all times, even in the event of cluster, availability zone or regional outages.
Clients inside your federated Kubernetes clusters (i.e. Pods) will automatically find the local shard of the federated service in their cluster if it exists and is healthy, or the closest healthy shard in a different cluster if it does not.

Federations of Kubernetes Clusters can include clusters running in different cloud providers (e.g. GCP, AWS), and on-premise (e.g. on OpenStack). All you need to do is create your clusters in the appropriate cloud providers and/or locations, and register each cluster's API endpoint and credentials with your Federation API Server.

In our example, we have clusters created in 4 regions along with a federated control plane API deployed in one of our clusters, that we’ll be using to provision our service. See diagram below for visual representation.


Creating a Federated Service

Let’s list out all the clusters in our federation:

kubectl --context=federation-cluster get clusters
NAME               STATUS    VERSION   AGE
gce-asia-east1     Ready               1m
gce-europe-west1   Ready               57s
gce-us-central1    Ready               47s
gce-us-east1       Ready               34s

Let’s create a federated service object:


kubectl --context=federation-cluster create -f services/nginx.yaml

The '--context=federation-cluster' flag tells kubectl to submit the request to the Federation API endpoint, with the appropriate credentials. The federated service will automatically create and maintain matching Kubernetes services in all of the clusters underlying your federation.

You can verify this by checking in each of the underlying clusters, for example:


kubectl --context=gce-asia-east1a get svc nginx
NAME      CLUSTER-IP     EXTERNAL-IP      PORT(S)   AGE
nginx     10.63.250.98   104.199.136.89   80/TCP    9m

The above assumes that you have a context named 'gce-asia-east1a' configured in your client for your cluster in that zone. The name and namespace of the underlying services will automatically match those of the federated service that you created above.

The status of your Federated Service will automatically reflect the real-time status of the underlying Kubernetes services, for example:


kubectl --context=federation-cluster describe services nginx
Name:                   nginx
Namespace:              default
Labels:                 run=nginx
Selector:               run=nginx
Type:                   LoadBalancer
IP:         
LoadBalancer Ingress:   104.XXX.XX.XXX, 104.XXX.XX.XXX, 104.XXX.XX.XXX, 104.XXX.XXX.XX
Port:                   http    80/TCP
Endpoints:              <none>
Session Affinity:       None
No events.

The 'LoadBalancer Ingress' addresses of your federated service corresponds with the 'LoadBalancer Ingress' addresses of all of the underlying Kubernetes services. For inter-cluster and inter-cloud-provider networking between service shards to work correctly, your services need to have an externally visible IP address. Service Type: Loadbalancer is typically used here.

Note also what we have not yet provisioned any backend Pods to receive the network traffic directed to these addresses (i.e. 'Service Endpoints'), so the federated service does not yet consider these to be healthy service shards, and has accordingly not yet added their addresses to the DNS records for this federated service.

Adding Backend Pods

To render the underlying service shards healthy, we need to add backend Pods behind them. This is currently done directly against the API endpoints of the underlying clusters (although in future the Federation server will be able to do all this for you with a single command, to save you the trouble). For example, to create backend Pods in our underlying clusters:


for CLUSTER in asia-east1-a europe-west1-a us-east1-a us-central1-a
do
kubectl --context=$CLUSTER run nginx --image=nginx:1.11.1-alpine --port=80
done

Verifying Public DNS Records

Once the Pods have successfully started and begun listening for connections, Kubernetes in each cluster (via automatic health checks) will report them as healthy endpoints of the service in that cluster. The cluster federation will in turn consider each of these service 'shards' to be healthy, and place them in serving by automatically configuring corresponding public DNS records. You can use your preferred interface to your configured DNS provider to verify this. For example, if your Federation is configured to use Google Cloud DNS, and a managed DNS domain 'example.com':


$ gcloud dns managed-zones describe example-dot-com
creationTime: '2016-06-26T18:18:39.229Z'
description: Example domain for Kubernetes Cluster Federation
dnsName: example.com.
id: '3229332181334243121'
kind: dns#managedZone
name: example-dot-com
nameServers:
- ns-cloud-a1.googledomains.com.
- ns-cloud-a2.googledomains.com.
- ns-cloud-a3.googledomains.com.
- ns-cloud-a4.googledomains.com.
$ gcloud dns record-sets list --zone example-dot-com
NAME                                                                                                 TYPE      TTL     DATA
example.com.                                                                                       NS     21600  ns-cloud-e1.googledomains.com., ns-cloud-e2.googledomains.com.
example.com.                                                                                      SOA     21600 ns-cloud-e1.googledomains.com. cloud-dns-hostmaster.google.com. 1 21600 3600 1209600 300
nginx.mynamespace.myfederation.svc.example.com.                            A     180     104.XXX.XXX.XXX, 130.XXX.XX.XXX, 104.XXX.XX.XXX, 104.XXX.XXX.XX
nginx.mynamespace.myfederation.svc.us-central1-a.example.com.     A     180     104.XXX.XXX.XXX
nginx.mynamespace.myfederation.svc.us-central1.example.com.
nginx.mynamespace.myfederation.svc.us-central1.example.com.         A    180     104.XXX.XXX.XXX, 104.XXX.XXX.XXX, 104.XXX.XXX.XXX
nginx.mynamespace.myfederation.svc.asia-east1-a.example.com.       A    180     130.XXX.XX.XXX
nginx.mynamespace.myfederation.svc.asia-east1.example.com.
nginx.mynamespace.myfederation.svc.asia-east1.example.com.           A    180     130.XXX.XX.XXX, 130.XXX.XX.XXX
nginx.mynamespace.myfederation.svc.europe-west1.example.com.  CNAME    180   nginx.mynamespace.myfederation.svc.example.com.
... etc.

Note: If your Federation is configured to use AWS Route53, you can use one of the equivalent AWS tools, for example:


$aws route53 list-hosted-zones
and
$aws route53 list-resource-record-sets --hosted-zone-id Z3ECL0L9QLOVBX

Whatever DNS provider you use, any DNS query tool (for example 'dig' or 'nslookup') will of course also allow you to see the records created by the Federation for you.

Discovering a Federated Service from pods Inside your Federated Clusters

By default, Kubernetes clusters come preconfigured with a cluster-local DNS server ('KubeDNS'), as well as an intelligently constructed DNS search path which together ensure that DNS queries like "myservice", "myservice.mynamespace", "bobsservice.othernamespace" etc issued by your software running inside Pods are automatically expanded and resolved correctly to the appropriate service IP of services running in the local cluster.

With the introduction of Federated Services and Cross-Cluster Service Discovery, this concept is extended to cover Kubernetes services running in any other cluster across your Cluster Federation, globally. To take advantage of this extended range, you use a slightly different DNS name (e.g. myservice.mynamespace.myfederation) to resolve federated services. Using a different DNS name also avoids having your existing applications accidentally traversing cross-zone or cross-region networks and you incurring perhaps unwanted network charges or latency, without you explicitly opting in to this behavior.

So, using our NGINX example service above, and the federated service DNS name form just described, let's consider an example: A Pod in a cluster in the us-central1-a availability zone needs to contact our NGINX service. Rather than use the service's traditional cluster-local DNS name ("nginx.mynamespace", which is automatically expanded to"nginx.mynamespace.svc.cluster.local") it can now use the service's Federated DNS name, which is"nginx.mynamespace.myfederation". This will be automatically expanded and resolved to the closest healthy shard of my NGINX service, wherever in the world that may be. If a healthy shard exists in the local cluster, that service's cluster-local (typically 10.x.y.z) IP address will be returned (by the cluster-local KubeDNS). This is exactly equivalent to non-federated service resolution.

If the service does not exist in the local cluster (or it exists but has no healthy backend pods), the DNS query is automatically expanded to `"nginx.mynamespace.myfederation.svc.us-central1-a.example.com". Behind the scenes, this is finding the external IP of one of the shards closest to my availability zone. This expansion is performed automatically by KubeDNS, which returns the associated CNAME record. This results in a traversal of the hierarchy of DNS records in the above example, and ends up at one of the external IP's of the Federated Service in the local us-central1 region.

It is also possible to target service shards in availability zones and regions other than the ones local to a Pod by specifying the appropriate DNS names explicitly, and not relying on automatic DNS expansion. For example, "nginx.mynamespace.myfederation.svc.europe-west1.example.com" will resolve to all of the currently healthy service shards in Europe, even if the Pod issuing the lookup is located in the U.S., and irrespective of whether or not there are healthy shards of the service in the U.S. This is useful for remote monitoring and other similar applications.

Discovering a Federated Service from Other Clients Outside your Federated Clusters

For external clients, automatic DNS expansion described is no longer possible. External clients need to specify one of the fully qualified DNS names of the federated service, be that a zonal, regional or global name. For convenience reasons, it is often a good idea to manually configure additional static CNAME records in your service, for example:


eu.nginx.acme.com        CNAME nginx.mynamespace.myfederation.svc.europe-west1.example.com.
us.nginx.acme.com        CNAME nginx.mynamespace.myfederation.svc.us-central1.example.com.
nginx.acme.com             CNAME nginx.mynamespace.myfederation.svc.example.com.

That way your clients can always use the short form on the left, and always be automatically routed to the closest healthy shard on their home continent. All of the required failover is handled for you automatically by Kubernetes Cluster Federation. 

Handling Failures of Backend Pods and Whole Clusters

Standard Kubernetes service cluster-IP's already ensure that non-responsive individual Pod endpoints are automatically taken out of service with low latency. The Kubernetes cluster federation system automatically monitors the health of clusters and the endpoints behind all of the shards of your federated service, taking shards in and out of service as required. Due to the latency inherent in DNS caching (the cache timeout, or TTL for federated service DNS records is configured to 3 minutes, by default, but can be adjusted), it may take up to that long for all clients to completely fail over to an alternative cluster in in the case of catastrophic failure. However, given the number of discrete IP addresses which can be returned for each regional service endpoint (see e.g. us-central1 above, which has three alternatives) many clients will fail over automatically to one of the alternative IP's in less time than that given appropriate configuration.

Community

We'd love to hear feedback on Kubernetes Cross Cluster Services. To join the community:

Please give Cross Cluster Services a try, and let us know how it goes!


-- Quinton Hoole, Engineering Lead, Google and Allan Naim, Product Manager, Google