Ever since it emerged out of the halls of Google five years ago, Kubernetes has quickly become one of the hot technologies of the decade. Simply put, Kubernetes is now the undisputed platform of choice for composing and running applications comprised of microservices – small, independently deployable services that run in containers and work together to function as a larger application that can be ported across various types of infrastructure.
Kubernetes is an orchestration tool, which in this case means it enables developers to view, coordinate, and manage containerized workloads and services with the goal of running resilient distributed systems. According to the latest figures from the Cloud Native Computing Foundation (CNCF), published in August 2018, 40 percent of respondents from enterprise companies (over 5000 firms) are already running Kubernetes in production.
While that’s good progress for the open source project, it’s important to note that the vast majority of these organizations are running only a handful of applications with Kubernetes as they get to grips with the technology. But the direction of travel is clear: Container-based microservices applications are the future and Kubernetes is their platform. That’s why the big three cloud providers have all launched managed versions of Kubernetes – and Cisco, HPE, IBM/Red Hat, Microsoft, VMware/Pivotal, and others have incorporated Kubernetes into their core software offerings.
Kubernetes is enabling enterprises of all sizes to improve their developer velocity, nimbly deploy and scale applications, and modernize their technology stacks. For example, the online retailer Ocado, which has been delivering fresh groceries to UK households since 2000, has built its own technology platform to manage logistics and warehouses. In 2017, the company decided to start migrating its Docker containers to Kubernetes, taking its first application into production in the summer of 2017 on its own private cloud.
The big benefits of this shift for Ocado and others have been much quicker time-to-market and more efficient use of computing resources. At the same time, Kubernetes adopters also tend to cite the same drawback: The learning curve is steep, and although the technology makes life easier for developers in the long run, it doesn’t make life less complex.
Here are some examples of large global companies running Kubernetes in production, how they got there, and what they have learned along the way.
Bloomberg reaps the benefits of early adoption
Financial data specialist Bloomberg turned to Kubernetes in 2015, when the tool was still in alpha, before moving into production in 2017 once the necessary continuous integration, monitoring, and testing was proved out.
Bloomberg processes hundreds of billions of financial data points every day, with 14,000 different applications powering its ubiquitous Terminal product alone. The IT organization wanted to boost the speed at which it could bring new applications and services to users and free up developers from operational tasks.
After assessing various orchestration platforms, such as Cloud Foundry, Mesosphere Marathon, and various Docker offerings, Bloomberg opted for Kubernetes because it “had a good foundation and it was clear they were confronting the right problems. You could see a vision and roadmap as to how it would evolve that were aligned with what we were thinking,” explains Andrey Rybka, head of compute infrastructure in the Office of the CTO at Bloomberg.
Over time Bloomberg has worked on a homegrown platform-as-a-service layer on top of Kubernetes to give developers the right level of abstraction to work effectively with the technology. This self-service web portal is essentially a command-line interface and REST API which integrates with a Git-based version control system, CI build system, and central artifact repository.
One of the key goals for Bloomberg was to make better use of existing hardware investments using the autoscaling capabilities of Kubernetes, along with the ability to self-provision and flex virtual compute, networking, and storage without having to issue tickets. “With Kubernetes, we’re able to very efficiently use our hardware to the point where we can get close to 90 to 95 percent utilization rates” at times of peak demand, Rybka said as part of a CNCF case study. Much of that efficiency comes from the ability to constrain resources for a given workload, so it doesn’t starve other workloads.
As is the case with most enterprises adopting Kubernetes in production, the main challenges arose around the use of YAML to write manifests, which specify how Kubernetes allocates resources. “These are powerful concepts in Kubernetes that require a steep learning curve,” Rybka said.
As Steven Bower, Bloomberg’s data and analytics infrastructure lead, put it: “Kubernetes makes a lot of things easier but not necessarily simpler.”
As a result, Bloomberg started with basic manifests, limited to a small subset of criteria from which developers could scale up their usage as they got more comfortable with the technology, as well as running plenty of internal training programs.
“We have a lot of existing infrastructure and there is zero chance that will miraculously move to Kubernetes off big iron [mainframes],” he said. Instead the orchestration platform is being targeted at web-based applications and net-new systems. In the data and analytics Infrastructure team, where Bower works, the initial approach was to stand up a new data science compute platform for the machine learning engineers to run complex workloads using tools like Spark and TensorFlow.
As his parting piece of advice, Rybka talked about the importance of building expertise. “You really have to have an expert team that is in touch with upstream Kubernetes and the CNCF and the whole ecosystem to have that in-house knowledge. You can’t just rely on a vendor and need to understand all the complexities around this,” he said.
News UK taps Kubernetes to scale on demand
The UK arm of media giant News Corp has been dabbling with Kubernetes since 2017, moving from their own custom Kubernetes clusters to the managed Elastic Kubernetes Service (EKS) from Amazon Web Services in 2018. This makes up part of a stack that also includes a bunch of AWS services, including Elastic Container Service, the Fargate compute engine, AWS Batch, and Elastic Beanstalk.
The first in-production application to be moved into this managed Kubernetes environment was a legacy Java system for access control and user login. Once the environment proved robust enough, the organization began steadily identifying and migrating other applications.
Speaking at monitoring specialist New Relic’s London Futurestack event earlier this year, Marcin Cuber, a former cloud devops engineer at News UK, said that “operationally, this simplifies what we have to maintain and monitor. On top of that we have EKS in its own isolated VPC, allowing us to specify our own security groups and network access control lists.”
The key goal for News UK was to better be able to scale up its environment around breaking news events and unpredictable reader volumes. “If there is breaking news, for example, we want every reader to be able to gather real-time updates worldwide and of course, to have a flawless experience,” Cuber said.
Where Kubernetes differs from VM autoscaling comes down to speed. “VMs take long to spin up and when there is a spike of traffic, it is not fast enough to bring new capacity into the AutoScalingGroup,” Cuber said. “Docker containers running in Kubernetes are smaller and lightweight, therefore allowing us to scale in a matter of a few seconds rather than minutes.”
Cuber also had some advice for any organizations looking to adopt Docker and Kubernetes. First was to make your Docker images as small as possible and to focus on running stateless applications with Kubernetes. “This will improve your scalability and portability,” he said.
Next is to run health checks for your applications and to use YAML to deploy anything. “This way you can utilize temporary credentials that will expire soon after your deployment and you never have to worry about static located credentials,” he added.
News UK also wanted to cut costs by pairing EKS clusters with AWS spot instances – where AWS sells spare compute capacity at a discount rate but can also reclaim that capacity at any time.
“There’s a huge advantage of using spot instances; we are making around 70 percent savings compared to on-demand pricing,” Cuber said. As a way to circumvent the issue of nodes being taken away, the engineers set up an AWS Lambda function that detects the termination signal from AWS and automatically drains the nodes due to be affected.
Amadeus drinks the Kubernetes Kool-Aid
Spanish travel tech giant Amadeus has been working with Kubernetes as far back as version 0.7 five years ago. In the ensuing two years the company was keen to see things like monitoring, alerting, and the wider ecosystem mature before committing any business-critical applications to Kubernetes. The company now feels it made the right bet.
Amadeus is one of the big three global distribution systems that enable travel agents and metasearch engines like Expedia and Kayak to sell flight, hotel room, and rental car bookings. Late in 2016 the organization started to move its first application – for airline availability – to Kubernetes in production, hand in hand with Red Hat’s OpenShift platform. The plan was actually to move a hotel reservation application first, but as that project bloated, the airline availability application, which was built for Linux and needed to be moved to the public cloud to better serve its airline clients’ growing demands for lower latency, made it to production faster.
“The good thing we had from the start is all our apps are on Linux, so they are container-friendly from the start,” Etienne said. “Of course they were monolithic, but it was really more about how to move existing apps to containers and then Kubernetes, so the position was pretty straightforward.”
Shifting to Kubernetes fit with a broader business goal for Amadeus to shift from on-premises deployments to the public cloud, predominantly with its partner Google Cloud, so that it could better scale to meet seasonal demand and cut down on over-provisioning infrastructure costs.
In terms of challenges, Amadeus is a strong engineering organization, so once some training had been completed the technical challenges paled into insignificance compared to the cultural shift that tools like Kubernetes required from the organization.
“One of the main challenges is shifting mindset in terms of what it means for developers,” Etienne said. “They used to think about the machine the application runs on and now you forget about the machine and everything is configuration driven with YAML files everywhere.”
“Everyone was already getting ready for containers, so the biggest shift was operating apps in an agnostic way,” he added.
The overall goal for Amadeus is to move all production workloads to run on a single operating model with Kubernetes, and the organization is around 10 to 15 percent of the way there so far. “As with any strategy, if we reach that goal, it is too early to say,” Sebastien Pellise, director of platform solution management at Amadeus said.
Another, softer benefit of adopting tools like Kubernetes is with recruiting and retaining talent, because “working on these type of things is so much more sexy to advanced engineers than working on a mainframe,” said Dietmar Fauser, former SVP of technology platforms and engineering at Amadeus, in an interview earlier this year.
Gearing up for a Kubernetes future
One of the more interesting aspects of these various case studies is their consistency. Regardless of industry – be it financial services, media, retail, or technology – organizations of all sizes are grappling with a sea change in the way software is built and deployed in small, discrete, loosely coupled chunks of functionality.
There are also consistencies among challenges and benefits. All of these organizations are compelled to enact sometimes painful cultural change and face significant recruitment challenges as they compete for talent with the likes of Google and Facebook. All of these organizations are also starting to speed up their development cycles, reduce costs and downtime, and deliver more value more frequently for their customers.
At this point, it’s not an exaggeration to say that any organization that fails to get up to speed with containers and Kubernetes will struggle to keep up in our new, accelerated, software-driven world.