
Q&A: Patrick Ohly Talks About Chopping Wood and Kubernetes Logs

Chris_Norman

Like being a lumberjack, contributing to open source projects takes a lot of effort. But if you're diligent, consistent and keep your tools sharp, you end up with great logs, which in turn can help you debug your code much more easily. One person who exemplifies this approach is Patrick Ohly. 

The Cloud Native Computing Foundation* recently announced the 2022 Community Awards, and Ohly was recognized in the “Chop Wood, Carry Water” category for his dedication and tireless work behind the scenes. He was also recognized in the Kubernetes* Contributor Awards for 2022 for his contributions to three Special Interest Groups – Architecture, Instrumentation and Testing – mainly around improvements to Kubernetes logging capabilities.

We asked him about his lumberjack skills, what he’s been contributing to Kubernetes and how his work helps Intel’s customers. 

Q: You’ve been working on open source with Intel for over a decade, working on Linux* distributions like Moblin*, MeeGo* and Tizen* and then on embedded projects, specifically Yocto*.  What inspires you about open source that makes you want to work in this environment? 

A: First, of course, is the availability of the source code: that makes it possible to study how things work and change them as needed. Nothing is hidden, so one gets to see the good, the bad and the ugly. The good parts are what one can learn from when starting in a new field and the others provide ideas for improvements.

Then there’s also the community around an open source project: that is, an opportunity to work with experts from all over the world. Later, after having gained some experience, it's possible to help other developers and users.

Q: In 2018, you switched from a role as Senior Software Engineer to Cloud Software Architect, focusing on enabling additional hardware in Kubernetes and various upstream enhancements. 
How does working on Kubernetes compare with previous roles?  Was it an easy transition to make?

A: Some of the underlying technology (Linux*) and tools (makefiles, scripting) were still the same. Other parts were new (Golang, containers) but that wasn’t hard to pick up. The community was a lot larger and more diverse. Thankfully, I had the chance to attend KubeCon and could connect with fellow developers not just via online chat or conference calls, but also in person. Overall, it was an easy transition.

Q: What’s the secret to being a successful contributor to an open source project? 

A: Technical skills help, but it’s not just about that. It’s also a lot about working well with others, explaining what you want to do, why a problem is worth spending time on (in particular, the time of people who will have to review and maintain the code you write!) and why a proposed solution is good. Good communication is crucial.

Q: Tell me about the work that you did that was recently recognized by the various SIGs that you participate in.  What sorts of improvements have you been making to the code organization in Kubernetes architecture?  Specifically, you were recognized for updates to klog, the Kubernetes logging library. 
What are the limitations with the previous Kubernetes logging methods, and how will the changes you are making result in a more structured logging across the Kubernetes code base? 

A: The big architectural change was the introduction of contextual logging. Structured logging had already added JavaScript Object Notation (JSON) as an easier-to-parse output format, but all log calls were still going through one global logger instance. Injecting additional values into all log entries for a certain call chain, for example a request ID, was not possible in a consistent manner. When running unit tests in parallel, their log output would all get mixed up.
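
To make that concrete, here is a minimal sketch of structured logging with klog's InfoS/ErrorS calls; the pod and node names are made up for illustration, and JSON output is something Kubernetes components select separately:

```go
package main

import (
	"flag"

	"k8s.io/klog/v2"
)

func main() {
	// Traditional klog setup; Kubernetes components can additionally render
	// these entries as JSON via their JSON logging format.
	klog.InitFlags(nil)
	flag.Parse()
	defer klog.Flush()

	// Structured logging: a constant message plus key/value pairs instead of
	// a formatted string, which makes the output easy to parse.
	klog.InfoS("Pod scheduled", "pod", "default/nginx-7d9f8", "node", "worker-1")

	// Every call still goes through the one global logger, so a per-request
	// value such as a request ID cannot be injected consistently; that is
	// the limitation contextual logging addresses.
	klog.ErrorS(nil, "Scheduling failed", "pod", "default/nginx-7d9f8", "reason", "insufficient cpu")
}
```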

Contextual logging is primarily a coding convention that defines how to pass a logger instance around. Klog provides parts of the API and one implementation. Because this affects a lot of code and other developers, SIG Architecture was involved. Later, the Golang community started a similar initiative that will lead to a language standard for this kind of logging. My expectation is that Kubernetes can easily migrate to that once it is available.
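As a rough illustration of that convention (the function name, pod name and requestID value below are hypothetical), the caller attaches values to a logger once and passes it along through the context:

```go
package main

import (
	"context"

	"k8s.io/klog/v2"
)

// processPod never touches the global logger: it retrieves whatever logger
// the caller stored in the context, including any values attached upstream.
func processPod(ctx context.Context, pod string) {
	logger := klog.FromContext(ctx)
	logger.Info("Processing", "pod", pod)
}

func main() {
	// Attach a request-scoped value once...
	logger := klog.Background().WithValues("requestID", "42")
	ctx := klog.NewContext(context.Background(), logger)

	// ...and every log entry made further down the call chain carries it,
	// without processPod having to know about request IDs at all.
	processPod(ctx, "default/nginx-7d9f8")
}
```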

Q: What benefits will customers see from this structured logging? 

A: For example, in kube-scheduler, the pod that is being scheduled will soon be listed in every log entry related to it. That will make it more reliable to find information for specific problems.

[For more, check out an article he wrote on the difference between structured and contextual logging.]

Q: In addition to architecting the solution, you were also heavily involved in the instrumentation and testing, and the migration of the end-to-end (E2E) test suite to Ginkgo version 2. Were there any major challenges that you had to overcome?

A: Ginkgo is used to define and run end-to-end tests. Compared to Go's built-in unit testing, it is better suited to running very large test suites. But version 1 didn’t have good support for handling aborted tests and cleaning up afterwards, which is important when a test creates, for example, a persistent volume in the cloud that continues to cost money if it doesn’t get deleted.
Version 2 is addressing that. But while some parts are similar, it’s still different enough that we had to adapt various parts of the Kubernetes E2E suite and tools that consume the test results. Kubernetes depends on automated testing, so such changes must be made carefully because when testing is broken, all work gets blocked.
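
As a sketch of what that cleanup support looks like, Ginkgo v2's DeferCleanup registers code that runs even when a spec is interrupted or aborted; the cloud-volume helpers below are stand-ins rather than the real Kubernetes E2E helpers:

```go
package storage_test

import (
	"testing"

	"github.com/onsi/ginkgo/v2"
	"github.com/onsi/gomega"
)

func TestStorage(t *testing.T) {
	gomega.RegisterFailHandler(ginkgo.Fail)
	ginkgo.RunSpecs(t, "Storage Suite")
}

var _ = ginkgo.Describe("volume provisioning", func() {
	ginkgo.It("creates and removes a volume", func() {
		volumeID := createCloudVolume()

		// New in Ginkgo v2: DeferCleanup runs even when the spec is
		// interrupted or aborted, so the expensive cloud volume does not
		// get left behind.
		ginkgo.DeferCleanup(func() {
			deleteCloudVolume(volumeID)
		})

		gomega.Expect(volumeID).NotTo(gomega.BeEmpty())
	})
})

// Stand-ins for real cloud API calls, only here to keep the sketch self-contained.
func createCloudVolume() string   { return "vol-123" }
func deleteCloudVolume(id string) {}
```

Keeping the cleanup registration right next to the code that created the resource is what makes aborted runs stop leaking volumes.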

Q: How do you find the time for all these activities?  Do you want to give a shout out to any colleagues who have helped? 

A: Logging and testing are something I do on the side; they are not my main responsibility in Kubernetes. I’ve worked on them when I was blocked elsewhere because reviewers needed more time, and during my spare time.

Most of the work so far was done with developers from the community, but recently also developers at Intel took an interest. I look forward to the conversion of kubelet to contextual logging that Feruzjon Muyassarov started working on.

Q: Are there any other things that you are working on? Is there anything we can look forward to in upcoming releases?

A: Absolutely! I'm driving a feature called “dynamic resource allocation”, a new API for allocating resources like GPUs or field-programmable gate arrays (FPGAs). This is changing core Kubernetes concepts around scheduling and how to extend Kubernetes, so it has received a lot of scrutiny. We are now at the point where the implementation is ready for review and inclusion in Kubernetes 1.26 as a new alpha feature.

Q: What bleeding edge technologies are you most excited about for the future (not just related to Kubernetes)? 

A: I’m very curious what other areas are going to pick up container technology. My first job at Intel was on high performance computing and MPI, and even there, people are thinking about Kubernetes. Other non-traditional usages of Kubernetes are edge computing and batch processing.

About the author

Chris Norman is an Open Source Advocate who has promoted the use of open source ecosystems for over a decade.  You can find him as pixelgeek on Twitter, IRC and GitHub.  

Photo by Abby Savage on Unsplash