
Tuesday, August 16, 2016

Kubernetes Namespaces: use cases and insights


“Who's on first, What's on second, I Don't Know's on third” 
Who's on First? by Abbott and Costello

Introduction

Kubernetes is built around a number of concepts, many of which are manifested as “objects” in its RESTful API (often called “resources” or “kinds”). One of these concepts is the Namespace. In Kubernetes, Namespaces are the way to partition a single Kubernetes cluster into multiple virtual clusters. In this post we’ll highlight examples of how our customers are using Namespaces.

But first, a metaphor: Namespaces are like human family names. A family name, e.g. Wong, identifies a family unit. Within the Wong family, one of its members, e.g. Sam Wong, is readily identified as just “Sam” by the family. Outside of the family, and to avoid “Which Sam?” problems, Sam would usually be referred to as “Sam Wong”, perhaps even “Sam Wong from San Francisco”.  

Namespaces are a logical partitioning capability that enables one Kubernetes cluster to be used by multiple users, teams of users, or a single user with multiple applications without concern for undesired interaction. Each user, team of users, or application may exist within its own Namespace, isolated from every other user of the cluster and operating as if it were the sole user of the cluster. (Furthermore, Resource Quotas provide the ability to allocate a subset of a Kubernetes cluster’s resources to a Namespace.)
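To make this concrete, a Namespace can be created with a single kubectl command or declared in a manifest. This is only a sketch; the name team-wong below is illustrative, echoing the family-name metaphor:

kubectl create namespace team-wong

Or, equivalently, using a configuration file:

apiVersion: v1
kind: Namespace
metadata:
  name: team-wong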

For all but the most trivial uses of Kubernetes, you will benefit by using Namespaces. In this post, we’ll cover the most common ways that we’ve seen Kubernetes users on Google Cloud Platform use Namespaces, but our list is not exhaustive and we’d be interested to learn other examples from you.

Use-cases covered
  1. Roles and Responsibilities in an enterprise for namespaces
  2. Partitioning landscapes: dev vs. test vs. prod
  3. Customer partitioning for non-multi-tenant scenarios
  4. When not to use namespaces
Use-case #1: Roles and Responsibilities in an Enterprise

A typical enterprise contains multiple business/technology entities that operate independently of each other with some form of overarching layer of controls managed by the enterprise itself. Operating Kubernetes clusters in such an environment can be done effectively when the roles and responsibilities pertaining to Kubernetes are defined.

Below are a few recommended roles and their responsibilities that can make managing Kubernetes clusters in a large scale organization easier.

  1. Designer/Architect role: This role will define the overall namespace strategy, taking into account product/location/team/cost-center and determining how best to map these to Kubernetes Namespaces. Investing in such a role prevents namespace proliferation and “snowflake” Namespaces.
  2. Admin role: This role has admin access to all Kubernetes clusters. Admins can create/delete clusters and add/remove nodes to scale them. This role is responsible for patching, securing and maintaining the clusters, as well as implementing quotas between the different entities in the organization (a quota sketch follows this list). The Kubernetes Admin is responsible for implementing the namespace strategy defined by the Designer/Architect.
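As an illustration of the quota responsibility above, a ResourceQuota object caps what a single Namespace may consume. This is a hedged sketch; the namespace name and limits are illustrative only:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-wong-quota
  namespace: team-wong
spec:
  hard:
    pods: "20"
    requests.cpu: "10"
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi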

These two roles, and the actual developers using the clusters, will also receive support and feedback from the enterprise security and network teams on issues such as security isolation requirements and how namespaces fit that model, or assistance with networking subnets and load-balancer setup.

Anti-patterns
  1. Isolated Kubernetes usage “islands” without centralized control: Without an initial investment in establishing a centralized control structure around Kubernetes management, there is a risk of ending up with a “mushroom farm” topology, i.e. no defined size/shape/structure of clusters within the org. The result is an environment that is difficult to manage, carries higher risk, and costs more due to underutilization of resources.
  2. Old-world IT controls choking usage and innovation: A common tendency is to try to transpose existing on-premises controls/procedures onto new dynamic frameworks. This weighs down the agile nature of these frameworks and nullifies the benefits of rapid, dynamic deployments.
  3. Omni-cluster: Delaying the effort of creating the structure/mechanism for namespace management can result in one large omni-cluster that is hard to peel back into smaller usage groups. 
Use-case #2: Using Namespaces to partition development landscapes

Software development teams customarily partition their development pipelines into discrete units. These units take various forms and use various labels but will tend to result in a discrete dev environment, a testing|QA environment, possibly a staging environment and finally a production environment. The resulting layouts are ideally suited to Kubernetes Namespaces. Each environment or stage in the pipeline becomes a unique namespace.

The above works well because each namespace can be templated and mirrored to the next environment in the dev cycle, e.g. dev -> qa -> prod. The fact that each namespace is logically discrete allows the development teams to work within an isolated “development” namespace. DevOps (the closest role at Google is called Site Reliability Engineering, “SRE”) will be responsible for migrating code through the pipeline and ensuring that appropriate teams are assigned to each environment. Ultimately, DevOps is solely responsible for the final, production environment where the solution is delivered to the end-users.

A major benefit of applying namespaces to the development cycle is that the naming of software components (e.g. micro-services/endpoints) can be maintained without collision across the different environments. This is due to the isolation of the Kubernetes namespaces: serviceX in dev would be referred to as such across all the other namespaces, but, if necessary, could be uniquely referenced using its fully qualified name serviceX.development.mycluster.com in the development namespace of mycluster.com.
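As a sketch of how this plays out in practice, the pipeline namespaces can be created up front and the same templated manifest deployed into each one. The file name servicex.yml is hypothetical, and the DNS form shown uses the default cluster suffix (svc.cluster.local), which may differ from the mycluster.com example above:

kubectl create namespace development
kubectl create namespace qa
kubectl create namespace production

kubectl create -f servicex.yml --namespace=development
kubectl create -f servicex.yml --namespace=qa
kubectl create -f servicex.yml --namespace=production

Within the cluster, the service deployed in development can then be reached from other namespaces as servicex.development.svc.cluster.local (Service names themselves must be lowercase).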

Anti-patterns
  1. Abusing the namespace benefit, resulting in unnecessary environments in the development pipeline. So, if you don’t do staging deployments, don’t create a “staging” namespace.
  2. Overcrowding namespaces, e.g. having all of your development projects in one huge “development” namespace. Since namespaces exist to partition, use them to partition by project as well. Because Namespaces are flat, you may want something like projectA-dev and projectA-prod as projectA’s namespaces.
Use-case #3: Partitioning of your Customers

If you are, for example, a consulting company that wishes to manage separate applications for each of your customers, the partitioning provided by Namespaces aligns well. You could create a separate Namespace for each customer, customer project or customer business unit to keep these distinct while not needing to worry about reusing the same names for resources across projects.

An important consideration here is that Kubernetes does not currently provide a mechanism to enforce access controls across namespaces and so we recommend that you do not expose applications developed using this approach externally.

Anti-patterns
  1. Multi-tenant applications don’t need the additional complexity of Kubernetes namespaces since the application is already enforcing this partitioning.
  2. Inconsistent mapping of customers to namespaces. For example, if you win business with a global corporation, you may initially consider a single namespace for the whole enterprise, not taking into account that the customer may prefer further partitioning, e.g. BigCorp Accounting and BigCorp Engineering. In this case, each of the customer’s departments may warrant its own namespace.
When Not to use Namespaces

In some circumstances Kubernetes Namespaces will not provide the isolation that you need. This may be due to geographical, billing or security factors. For all the benefits of the logical partitioning of namespaces, there is currently no ability to enforce the partitioning. Any user or resource in a Kubernetes cluster may access any other resource in the cluster regardless of namespace. So, if you need to protect or isolate resources, the ultimate namespace is a separate Kubernetes cluster against which you may apply your regular security|ACL controls.

Another time when you may consider not using namespaces is when you wish to reflect a geographically distributed deployment. If you wish to deploy close to US, EU and Asia customers, a Kubernetes cluster deployed locally in each region is recommended.

When fine-grained billing is required perhaps to chargeback by cost-center or by customer, the recommendation is to leave the billing to your infrastructure provider. For example, in Google Cloud Platform (GCP), you could use a separate GCP Project or Billing Account and deploy a Kubernetes cluster to a specific-customer’s project(s).

In situations where confidentiality or compliance require complete opaqueness between customers, a Kubernetes cluster per customer/workload will provide the desired level of isolation. Once again, you should delegate the partitioning of resources to your provider.

Work is underway to provide (a) ACLs on Kubernetes Namespaces to enforce security, and (b) Kubernetes Cluster Federation. Both mechanisms will address the reasons for running separate Kubernetes clusters in these cases.

An easy-to-grasp anti-pattern for Kubernetes namespaces is versioning. You should not use Namespaces as a way to disambiguate versions of your Kubernetes resources. Support for versioning is present in containers and container registries as well as in the Kubernetes Deployment resource. Multiple versions should coexist by using the Kubernetes container model, which also provides for automatic migration between versions with Deployments. Furthermore, version-scoped namespaces would cause massive proliferation of namespaces within a cluster, making it hard to manage.
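For instance, rather than a namespace per version, a Deployment can carry the version in the container image tag and roll between versions in place. A minimal sketch (API version as of Kubernetes 1.3; the image name and tag are illustrative):

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: servicex
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: servicex
    spec:
      containers:
      - name: servicex
        image: gcr.io/my-project/servicex:1.2.0

Rolling out version 1.3.0 is then a matter of updating the image tag and re-applying the manifest, rather than creating another namespace.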

Caveat Gubernator

You may wish to, but you cannot create a hierarchy of namespaces. Namespaces cannot be nested within one another. You can’t, for example, create my-team.my-org as a namespace but could perhaps have team-org.

Namespaces are easy to create and use, but it’s also easy to inadvertently deploy code into the wrong namespace. Good DevOps hygiene suggests documenting and automating processes where possible, and this will help. The other way to avoid using the wrong namespace is to set a kubectl context.
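A quick sketch of pinning kubectl to a namespace via a context (the context, cluster and user names below are illustrative):

kubectl config set-context dev-context --cluster=mycluster --user=developer --namespace=development
kubectl config use-context dev-context

Subsequent kubectl commands then default to the development namespace unless --namespace is given explicitly.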

As mentioned previously, Kubernetes does not (currently) provide a mechanism to enforce security across Namespaces. You should only use Namespaces within trusted domains (e.g. internal use) and not use Namespaces when you need to be able to guarantee that a user of the Kubernetes cluster or one of its resources is unable to access any of the other Namespaces’ resources. This enhanced security functionality is being discussed in the Kubernetes Special Interest Group for Authentication and Authorization; get involved at SIG-Auth.

--Mike Altarace & Daz Wilkin, Strategic Customer Engineers, Google Cloud Platform

Monday, August 15, 2016

SIG Apps: build apps for and operate them in Kubernetes

Editor’s note: This post is by the Kubernetes SIG-Apps team sharing how they focus on the developer and devops experience of running applications in Kubernetes.

Kubernetes is an incredible manager for containerized applications. Because of this, numerous companies have started to run their applications in Kubernetes.

Kubernetes Special Interest Groups (SIGs) have been around to support the community of developers and operators since around the 1.0 release. People organized around networking, storage, scaling and other operational areas.

As Kubernetes took off, so did the need for tools, best practices, and discussions around building and operating cloud native applications. To fill that need the Kubernetes SIG Apps came into existence.

SIG Apps is a place where companies and individuals can:

  • see and share demos of the tools being built to enable app operators
  • learn about and discuss needs of app operators
  • organize around efforts to improve the experience

Since the inception of SIG Apps we’ve had demos of projects like KubeFuse, KPM, and StackSmith. We’ve also conducted a survey of those operating apps in Kubernetes.

From the survey results we’ve learned a number of things including:

  • That 81% of respondents want some form of autoscaling
  • To store secret information, 47% of respondents use built-in Secrets. At rest these are not currently encrypted. (If you want to help add encryption there is an issue for that.)
  • The questions with the most responses had to do with third-party tools and debugging
  • For third-party tools to manage applications there were no clear winners; a wide variety of practices are in use
  • An overall complaint about a lack of useful documentation. (Help contribute to the docs here.)
  • There’s a lot of data. Many of the responses were optional, so we were surprised that 93% of all questions across all candidates were filled in. If you want to look at the data yourself it’s available online.

When it comes to application operation there’s still a lot to be figured out and shared. If you've got opinions about running apps, tooling to make the experience better, or just want to lurk and learn about what's going on, please come join us.



--Matt Farina, Principal Engineer, Hewlett Packard Enterprise

Create a Couchbase cluster using Kubernetes

Editor’s note: today’s guest post is by Arun Gupta, Vice President Developer Relations at Couchbase, showing how to setup a Couchbase cluster with Kubernetes.  

Couchbase Server is an open source, distributed NoSQL document-oriented database. It exposes a fast key-value store with managed cache for submillisecond data operations, purpose-built indexers for fast queries and a query engine for executing SQL queries. For mobile and Internet of Things (IoT) environments, Couchbase Lite runs native on-device and manages sync to Couchbase Server.

Couchbase Server 4.5 was recently announced, bringing many new features, including production-certified support for Docker. Couchbase is supported on a wide variety of orchestration frameworks for Docker containers, such as Kubernetes, Docker Swarm and Mesos; for full details visit this page.

This blog post will explain how to create a Couchbase cluster using Kubernetes. This setup is tested using Kubernetes 1.3.3, Amazon Web Services, and Couchbase 4.5 Enterprise Edition.

Like all good things, this post is standing on the shoulders of giants. The design pattern used in this blog was defined in a Friday afternoon hack with @saturnism. A working version of the configuration files was contributed by @r_schmiddy.

Couchbase Cluster

A cluster of Couchbase Servers is typically deployed on commodity servers. Couchbase Server has a peer-to-peer topology where all the nodes are equal and communicate with each other on demand. There is no concept of master nodes, slave nodes, config nodes, name nodes, head nodes, etc., and all the software loaded on each node is identical. This allows nodes to be added or removed without considering their “type”. This model works particularly well with cloud infrastructure in general. For Kubernetes, this means that we can use the exact same container image for all Couchbase nodes.

A typical Couchbase cluster creation process looks like:
  • Start Couchbase: Start n Couchbase servers
  • Create cluster: Pick any server, and add all other servers to it to create the cluster
  • Rebalance cluster: Rebalance the cluster so that data is distributed across the cluster

In order to automate using Kubernetes, the cluster creation is split into a “master” and “worker” Replication Controller (RC). 

The master RC has only one replica and is also published as a Service. This provides a single reference point to start the cluster creation. By default, services are visible only from inside the cluster, so this service is also exposed via an external load balancer. This allows the Couchbase Web Console to be accessible from outside the cluster.

The worker RC uses the exact same image as the master RC. This keeps the cluster homogeneous, which makes it easy to scale.



Configuration files used in this blog are available here. Let’s create the Kubernetes resources to create the Couchbase cluster.


Create Couchbase “master” Replication Controller

Couchbase master RC can be created using the following configuration file:
apiVersion: v1
kind: ReplicationController
metadata:
 name: couchbase-master-rc
spec:
 replicas: 1
 selector:
   app: couchbase-master-pod
 template:
   metadata:
     labels:
       app: couchbase-master-pod
   spec:
     containers:
     - name: couchbase-master
       image: arungupta/couchbase:k8s
       env:
         - name: TYPE
           value: MASTER
       ports:
       - containerPort: 8091
---
apiVersion: v1
kind: Service
metadata:
 name: couchbase-master-service
 labels:
   app: couchbase-master-service
spec:
 ports:
   - port: 8091
 selector:
   app: couchbase-master-pod
 type: LoadBalancer

This configuration file creates a couchbase-master-rc Replication Controller. This RC has one replica of the pod created using the arungupta/couchbase:k8s image. This image is created using the Dockerfile here. This Dockerfile uses a configuration script to configure the base Couchbase Docker image. First, it uses the Couchbase REST API to set up the memory quota; set up the index, data and query services; configure security credentials; and load a sample data bucket. Then it invokes the appropriate Couchbase CLI commands either to add the Couchbase node to the cluster, or to add the node and rebalance the cluster (a rough sketch of such a script follows the list below). This is based upon three environment variables:
  • TYPE: Defines whether the joining pod is worker or master
  • AUTO_REBALANCE: Defines whether the cluster needs to be rebalanced
  • COUCHBASE_MASTER: Name of the master service
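The actual script is part of the image referenced above; as a rough, simplified sketch of what such a script could do with the Couchbase REST API and couchbase-cli (endpoints and flags as documented for Couchbase 4.x; credentials and quotas are illustrative), consider:

#!/bin/bash
# Configure the local node: memory quotas, services, credentials, sample bucket
curl -s -X POST http://127.0.0.1:8091/pools/default -d memoryQuota=300 -d indexMemoryQuota=300
curl -s -X POST http://127.0.0.1:8091/node/controller/setupServices -d services=kv%2Cn1ql%2Cindex
curl -s -X POST http://127.0.0.1:8091/settings/web -d port=8091 -d username=Administrator -d password=password
curl -s -u Administrator:password -X POST http://127.0.0.1:8091/sampleBuckets/install -d '["travel-sample"]'

if [ "$TYPE" = "WORKER" ]; then
  # Join this node to the cluster fronted by the master service
  couchbase-cli server-add -c $COUCHBASE_MASTER:8091 -u Administrator -p password \
    --server-add=$(hostname -i):8091 \
    --server-add-username=Administrator --server-add-password=password

  if [ "$AUTO_REBALANCE" = "true" ]; then
    couchbase-cli rebalance -c $COUCHBASE_MASTER:8091 -u Administrator -p password
  fi
fi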

For this first configuration file, the TYPE environment variable is set to MASTER and so no additional configuration is done on the Couchbase image.

Let’s create and verify the artifacts.

Create Couchbase master RC:

kubectl create -f cluster-master.yml
replicationcontroller "couchbase-master-rc" created
service "couchbase-master-service" created

List all the services:

kubectl get svc
NAME                       CLUSTER-IP    EXTERNAL-IP   PORT(S)    AGE
couchbase-master-service   10.0.57.201                 8091/TCP   30s
kubernetes                 10.0.0.1      <none>        443/TCP    5h

Output shows that couchbase-master-service is created.

Get all the pods:
kubectl get po
NAME                        READY     STATUS    RESTARTS   AGE
couchbase-master-rc-97mu5   1/1       Running   0          1m

A pod is created using the Docker image specified in the configuration file.
Check the RC:

kubectl get rc
NAME                  DESIRED   CURRENT   AGE
couchbase-master-rc   1         1         1m

It shows that the desired and current number of pods in the RC are matching.

Describe the service:

kubectl describe svc couchbase-master-service
Name: couchbase-master-service
Namespace: default
Labels: app=couchbase-master-service
Selector: app=couchbase-master-pod
Type: LoadBalancer
IP: 10.0.57.201
LoadBalancer Ingress: a94f1f286590c11e68e100283628cd6c-1110696566.us-west-2.elb.amazonaws.com
Port: <unset> 8091/TCP
NodePort: <unset> 30019/TCP
Endpoints: 10.244.2.3:8091
Session Affinity: None
Events:
 FirstSeen LastSeen Count From SubobjectPath Type Reason Message
 --------- -------- ----- ---- ------------- -------- ------ -------
 2m 2m 1 {service-controller } Normal CreatingLoadBalancer Creating load balancer
 2m 2m 1 {service-controller } Normal CreatedLoadBalancer Created load balancer

Among other details, the address shown next to LoadBalancer Ingress is relevant for us. This address is used to access the Couchbase Web Console.
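If you prefer to retrieve this address from the command line, one option (assuming your kubectl supports jsonpath output) is:

kubectl get svc couchbase-master-service -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'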

Wait ~3 minutes for the load balancer to be ready to receive requests. The Couchbase Web Console is then accessible at <ip>:8091 and looks like:


The image used in the configuration file is configured with the username Administrator and the password password. Enter the credentials to see the console:


Click on Server Nodes to see how many Couchbase nodes are part of the cluster. As expected, it shows only one node:

Click on Data Buckets to see a sample bucket that was created as part of the image:

This shows the travel-sample bucket is created and has 31,591 JSON documents.

Create Couchbase “worker” Replication Controller
Now, let’s create a worker replication controller. It can be created using the configuration file:
apiVersion: v1
kind: ReplicationController
metadata:
 name: couchbase-worker-rc
spec:
 replicas: 1
 selector:
   app: couchbase-worker-pod
 template:
   metadata:
     labels:
       app: couchbase-worker-pod
   spec:
     containers:
     - name: couchbase-worker
       image: arungupta/couchbase:k8s
       env:
         - name: TYPE
           value: "WORKER"
         - name: COUCHBASE_MASTER
           value: "couchbase-master-service"
         - name: AUTO_REBALANCE
           value: "false"
       ports:
       - containerPort: 8091

This RC also creates a single replica of Couchbase using the same arungupta/couchbase:k8s image. The key differences here are:
  • TYPE environment variable is set to WORKER. This causes a worker Couchbase node to be added to the cluster.
  • COUCHBASE_MASTER environment variable is passed the value couchbase-master-service. This uses the service discovery mechanism built into Kubernetes so that the worker and the master pods can communicate.
  • AUTO_REBALANCE environment variable is set to false. This ensures that the node is only added to the cluster but the cluster itself is not rebalanced. Rebalancing is required to redistribute data across the multiple nodes of the cluster. This is the recommended approach, as multiple nodes can be added first and the cluster can then be manually rebalanced using the Web Console.
Let’s create a worker:

kubectl create -f cluster-worker.yml
replicationcontroller "couchbase-worker-rc" created



Check the RC:

kubectl get rc
NAME                  DESIRED   CURRENT   AGE
couchbase-master-rc   1         1         6m
couchbase-worker-rc   1         1         22s


A new couchbase-worker-rc is created where the desired and the current number of instances are matching.

Get all pods:
kubectl get po
NAME                        READY     STATUS    RESTARTS   AGE
couchbase-master-rc-97mu5   1/1       Running   0          6m
couchbase-worker-rc-4ik02   1/1       Running   0          46s

An additional pod is now created. Each pod’s name is prefixed with the corresponding RC’s name. For example, a worker pod is prefixed with couchbase-worker-rc.

The Couchbase Web Console gets updated to show that a new Couchbase node has been added. This is evident from the red circle with the number 1 on the Pending Rebalance tab.


Clicking on the tab shows the IP address of the node that needs to be rebalanced:


Scale Couchbase cluster

Now, let’s scale the Couchbase cluster by scaling the replicas for worker RC:

kubectl scale rc couchbase-worker-rc --replicas=3
replicationcontroller "couchbase-worker-rc" scaled


Updated state of RC shows that 3 worker pods have been created:

kubectl get rc
NAME                  DESIRED   CURRENT   AGE
couchbase-master-rc   1         1         8m
couchbase-worker-rc   3         3         2m


This can be verified again by getting the list of pods:
kubectl get po
NAME                        READY     STATUS    RESTARTS   AGE
couchbase-master-rc-97mu5   1/1       Running   0          8m
couchbase-worker-rc-4ik02   1/1       Running   0          2m
couchbase-worker-rc-jfykx   1/1       Running   0          53s
couchbase-worker-rc-v8vdw   1/1       Running   0          53s

The Pending Rebalance tab of the Couchbase Web Console shows that 3 servers have now been added to the cluster and need to be rebalanced.

Rebalance Couchbase Cluster
Finally, click on the Rebalance button to rebalance the cluster. A message window showing the current state of the rebalance is displayed:


Once all the nodes are rebalanced, your Couchbase cluster is ready to serve requests:


In addition to creating a cluster, Couchbase Server supports a range of high availability and disaster recovery (HA/DR) strategies. Most HA/DR strategies rely on a multi-pronged approach of maximizing availability, increasing redundancy within and across data centers, and performing regular backups.

Now that your Couchbase cluster is ready, you can run your first sample application.

For further information check out the Couchbase Developer Portal and Forums, or see questions on Stack Overflow.  



--Arun Gupta, Vice President Developer Relations at Couchbase