KubeCon + CloudNativeCon India 2024: Full Schedule

In-person
11-12 December

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon India 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: This schedule is displayed in India Standard Time (UTC+5:30). The schedule is subject to change and session seating is available on a first-come, first-served basis.

9:55am IST

Keynote: And the Service Is Now Part of the Mesh! A Case Study of Service Mesh Adoption at Flipkart - I V Prasad Reddy & Purushotham Malipedda, Flipkart

Wednesday December 11, 2024 9:55am - 10:10am IST

Auditorium

An Istio-based service mesh offers many features, but realising them requires different levels of adoption: adopting the mesh for a single service, for all services in a call graph, or for all services in the organisation each unlocks a different set of features. At Flipkart, a leading e-commerce service in India with ~6K services and a scale of 15 million RPS, practical constraints, scale challenges, and failed attempts at traditional onboarding pushed us to arrive at a zero-touch, operator-powered, phased onboarding model. This approach expedited our public cloud bursting and disaster recovery solutions and unlocked features such as the service graph and securing east-west traffic. Services already onboarded to the mesh can leverage mesh features with minimal, incremental changes, which increased usage. In this talk, we would like to share our journey of adopting service mesh, the challenges we faced, and how adoption progressed with the operator model.

Speakers

Purushotham Malipedda

Platform Engineering at Flipkart, Flipkart Internet Pvt Ltd.

Standardising non-functional concerns of microservices with a service mesh platform.

I V Prasad Reddy

Mr, Flipkart

I. V. Prasad Reddy is a Software Architect working with Flipkart. He leads the Network Services team within the Platform Engineering group, which manages Load Balancing, Service Discovery, and Network Isolation. He is passionate about platforms and has recently been working on modernising the...

Keynote Sessions, Connectivity

Content Experience Level Intermediate

11:30am IST

Harnessing Emerging Data-Plane Technologies for Next-Gen Load Balancers - Shatakshi Mishra & Abed Mohammad Kamaluddin, Marvell Technology

Wednesday December 11, 2024 11:30am - 12:05pm IST

201

Cloud Load Balancers (CLBs) have evolved significantly, now featuring advanced capabilities such as proxying, SSL/TLS termination, and auto-scaling. The underlying data-path technologies have advanced from traditional Linux networking to include eBPF, DPDK, VPP, and P4. Each of these technologies offers unique advantages and trade-offs: eBPF excels in real-time processing but lacks some comprehensive features, DPDK provides high performance with reduced transparency, VPP offers modular packet handling with strong Linux compatibility, and P4-based solutions, while still evolving, promise enhanced flexibility and performance. This session will delve into these data-path technologies, examining their maturity, limitations, and suitability for modern data centers. As specialized hardware units become increasingly integral to large data centers, we will highlight how these units effectively leverage data-path technologies to perform numerous tasks, thereby enhancing overall efficiency.

Speakers

Abed Mohammad Kamaluddin

Director Accelerator Solutions, Marvell

Abed Mohammad Kamaluddin serves as the Director of Accelerator Solutions at Marvell Technology. Leading a software team, he focuses on Packet Accelerators, Transport and Network Security stacks, VPP software, and DPU solutions, with a particular emphasis on 5G, Security, and Networking...

Shatakshi Mishra

Senior Software Engineer, Marvell Technology

Shatakshi Mishra is a Senior Software Engineer at Marvell Technology with 3+ years of experience. She is skilled in P4 Programming Language, Kubernetes Orchestration, and Cloud Native Technologies, dedicated to providing innovative solutions and leveraging cutting-edge technologi...

Connectivity

Content Experience Level Intermediate

11:30am IST

Configuring Object Store for Vector Database Applications - Jiffin Tony Thottan, IBM

Wednesday December 11, 2024 11:30am - 12:05pm IST

B202B

Vector databases like Milvus and LanceDB are revolutionizing similarity search and AI workloads. However, their performance in cloud-native environments depends heavily on optimized storage configurations. This session will delve into configuring Ceph's RADOS Gateway (RGW) using Rook for this workload and provide a sample demo of how to run these applications with RGW.

Speakers

Jiffin Tony Thottan

Backend Engineer, IBM

Jiffin Tony Thottan is part of the IBM Storage Team, working as a Backend Engineer on Ceph. He was initially part of the NFS team and contributed to the GlusterFS and NFS-Ganesha projects. He has given presentations about his work at various conferences like FOSDEM, Storage Developer Conference...

Data Processing + Storage

Content Experience Level Intermediate

12:20pm IST

From Bottleneck to Breakthrough: EarnIn's Network Evolution with Linkerd - Kush Trivedi, EarnIn & Somnath Chakraborty, EarnIn

Wednesday December 11, 2024 12:20pm - 12:55pm IST

201

As EarnIn, one of the leading fintech companies, rapidly scaled in the number of microservices and user base, we recognized the need to ensure our network layer could keep pace with our scale. We initially relied on an enterprise service mesh, which eventually became a bottleneck for our applications as well as our operators, so we explored the service meshes available on the market, ultimately choosing Linkerd. This talk will delve into our decision-making process, focusing on why we chose Linkerd and how it has transformed our network and application architectures. We'll cover the crucial aspects of Linkerd that enhanced our system's performance, robustness, and security, all while simplifying operations and ensuring reliability, so our operators can rest easy. Join us to learn how Linkerd can elevate your service mesh strategy to meet the demands of substantial growth while keeping your architecture simple.

Speakers

Kush Trivedi

Sr. Platform Engineer, EarnIn

I am working with EarnIn, where I am leading the service-mesh & observability paradigm as well as the cloud-native adoption effort for the Kubernetes use-case and product revenue fit, as well as working on the platform infrastructure to provide one-touch onboarding of the applications on...

Somnath Chakraborty

Somnath Chakraborty, EarnIn

Engineering Manager at EarnIn, leading the India Platform team based in Bengaluru. I have worked extensively on infrastructure and platform management, AWS and Azure cloud technologies, DevOps, and automation.

Connectivity

Content Experience Level Intermediate

12:20pm IST

Data Protection Considerations for Elastic Cloud-Native Applications - Pankaj Ahire, Veritas Technologies

Wednesday December 11, 2024 12:20pm - 12:55pm IST

B202B

In this comprehensive session, we will delve into the intricate details of data protection in the context of cloud-native applications. We will start by examining the automation of application discovery. This includes a special focus on application-consistent backups, particularly crucial for distributed applications utilizing multiple persistent volumes. We will then navigate the considerations for temporary storage and compute needs during backup and restore operations. The discussion will extend to the impact on production applications' compute and I/O performance. As a highlight, the session will explore strategies for detecting and protecting applications from ransomware attacks. The final segment of the presentation will cover application recovery and thereby mobility of applications across different Kubernetes platform distributions. Each aspect is designed to prepare participants for a future where cloud-native data protection is efficient, resilient, and cost-effective.

Speakers

Pankaj Ahire

Mr. Pankaj, Veritas Technologies

Pankaj is a technical lead building data protection capabilities for Kubernetes, hypervisors, OpenStack, and file systems in Veritas NetBackup. He is one of the key members of NetBackup engineering that laid the foundation for Kubernetes protection. He has overall more than two decades...

Data Processing + Storage

Content Experience Level Intermediate

12:20pm IST

Side Effects - Lessons Learnt While Building Traffic Platforms Serving 1.5 Million+ TPS Using K8s - Sumit Mathur & Sushanth Kamath A, Intuit

Wednesday December 11, 2024 12:20pm - 12:55pm IST

B101A

At Intuit, we have ~2000 production services that serve North-South and East-West traffic using API Gateway and Service Mesh. These services span ~350 k8s clusters and are managed by 10,000+ engineers. This talk walks you through multiple design iterations conducted by the Traffic team at Intuit in building the platforms that serve 1.5 Million+ TPS. Topics covered include lessons learnt while: 1. Designing cost-effective, low-latency, highly scalable, multi-tenant API Gateway and Service Mesh platforms. 2. Using a non-blocking model to externalise authorization using an OPA-based sidecar. 3. Distributing traffic configurations that are self-serviced by 10,000+ engineers to the Traffic runtime platforms. 4. Designing rate limiting solutions with low latency and low error.

Speakers

Sushanth Kamath A

Leads Traffic team at Intuit India, Senior Staff Engineer at Intuit, Intuit

I work as a Senior Staff Engineer at Intuit, leading the traffic team (API Gateway and Service Mesh) at IDC. I have worked on the API Gateway ecosystem for 7+ years and on Service Mesh for 4+ years. I am very passionate about building software that runs at scale!

Sumit Mathur

Group Engineering Manager of Traffic Team at Intuit, Intuit India Product Development Centre Private Limited

I am Sumit Mathur with 15 years of experience in platform and cloud technologies. I am an Engineering leader at Intuit and lead the API Gateway, Service Mesh, Configuration as Service and In Session Customer experience team.

Platform Engineering

Content Experience Level Intermediate

2:55pm IST

Multi-Node Finetuning LLMs on Kubernetes: A Practitioner’s Guide - Ashish Kamra & Boaz Ben Shabat, Red Hat

Wednesday December 11, 2024 2:55pm - 3:30pm IST

Auditorium

Large Language Model (LLM) finetuning on enterprise private data has emerged as an important strategy for enhancing model performance on specific downstream tasks. This process, however, demands substantial compute resources and presents some unique challenges in Kubernetes environments. This session offers a practical, step-by-step guide to implementing multi-node LLM finetuning on Kubernetes clusters with GPUs, utilizing PyTorch FSDP and the Kubeflow training operator. We'll cover preparing a Kubernetes cluster for LLM finetuning; optimizing cluster, system, and network configurations; and comparing the performance of various network topologies, including pod networking, secondary networks, and GPU Direct RDMA over Ethernet, for peak performance. By the end of this session, the audience will have a comprehensive understanding of the intricacies involved in multi-node LLM finetuning on Kubernetes, empowering them to introduce the same in their own production Kubernetes environments.

Speakers

Ashish Kamra

Senior Manager, Red Hat

Dr. Ashish Kamra is an accomplished engineering leader with over 15 years of experience managing high-performing teams in AI, machine learning, and cloud computing. He joined Red Hat in March 2017, where he currently serves as the Senior Manager of AI Performance at Red Hat. In this...

Boaz Ben Shabat

Senior AI Performance Engineer, Red Hat

I am an Engineer with extensive experience in optimizing large-scale, high-performance computing environments. My expertise includes network architecture, system performance tuning, and cloud infrastructure. I excel in solving complex technical challenges and improving efficiency...

AI + ML

Content Experience Level Intermediate

2:55pm IST

Is GitOps a Broken Experience - Where Is the Ops in GitOps? - Rajalakshmi Kamath V, Walmart Global Tech

Wednesday December 11, 2024 2:55pm - 3:30pm IST

305

Imagine the k8s resource model only had “spec” and not “status”. How operationally challenging would it have been to infer this critical information through out-of-band approaches? While GitOps frameworks work very well to translate a pre-defined intended state, they fail to provide visibility into the actual state of the system. This entails a need to build tools to improve infrastructure observability. GitOps also fails to address immediate operational control and recovery in the event of outages, and so offers a broken experience. GitOps has been the primary developer experience for Walmart, running over 200k workloads and 10k deployments per day on 1000s of k8s clusters across public and private cloud. This talk highlights our learnings from building various ops tooling that provides the deployment and running state of workloads to complement our GitOps-based deployment pipelines.

Speakers

Rajalakshmi Kamath V

Staff Software Engineer, Walmart Global Tech

Rajalakshmi Kamath is a Staff Software Engineer at Walmart Global Tech focussed on core cloud native functionalities that power Walmart's container platform. She has a knack for building new solutions and is a prolific contributor for building tools for the Walmart Cloud Native Platform...

Operations + Performance

Content Experience Level Intermediate

3:45pm IST

Optimizing 5G Networks: Deploying AI/ML Workloads with the AIMLFW of O-RAN SC - Subhash Kumar Singh, Samsung

Wednesday December 11, 2024 3:45pm - 4:20pm IST

Auditorium

This session will explore the AI/ML Framework (AIMLFW) within the O-RAN Software Community (O-RAN SC), designed for dynamic and efficient 5G network management.

Key Topics:
- Introduction to O-RAN SC and AIMLFW: Overview of O-RAN’s architecture and mission.
- AI/ML Use Cases in O-RAN: Real-world applications like traffic prediction and anomaly detection, supported by AIMLFW’s scalable platform.
- Architecture and Components of AIMLFW:
  * Kubeflow for Model Training
  * KServe for Model Deployment
  * O-RAN Specification for AI/ML Workload Deployment
  * Core ML Lifecycle Components
- Challenges and Solutions in AI/ML Deployment: Addressing common challenges in distributed 5G environments.
- Future Directions and Community Collaboration: Potential integration with Flyte and MLflow for enhanced AI/ML workflow management.

Speakers

Subhash Kumar Singh

Mr., Samsung

Subhash Kumar Singh is a Senior Chief Engineer at Samsung, where he leads the AI/ML Framework (AIMLFW) project within the O-RAN Software Community (SC). Over the years, Subhash has been actively involved in several prominent open-source communities. His extensive experience in these...

AI + ML

Content Experience Level Intermediate

3:45pm IST

The Importance of Designing Single-Resource Controllers - Feny Mehta & Francesco Ilario, Red Hat

Wednesday December 11, 2024 3:45pm - 4:20pm IST

201

While anyone working with Kubernetes knows about Controllers and Custom Resources, most don't know that there are many ways of designing a Controller, but just one good practice. That practice is "single resource": your controller should only manage a single resource, instead of having a single controller for multiple resources or multiple controllers for a single resource. This talk guides you through the implementation of single-resource Controllers, showcases examples of a well-designed controller, and illustrates the good practice of designing single-resource controllers by comparing it with discouraged alternative designs. The talk focuses on practical insights, on the design's pros and cons, and on how its implementation affects the efficiency of Operators. After the talk, attendees will be able to design and implement a single-resource controller and will have a deeper understanding of good practices for designing a Kubernetes controller.
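
As a rough illustration of the "single resource" rule described in this abstract, the sketch below models a one-to-one mapping between resource kinds and controllers in plain Python. It is not taken from the talk and does not use any Kubernetes client library; `Resource`, `ControllerRegistry`, and `reconcile_database` are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Resource:
    """Toy stand-in for a custom resource: desired spec plus observed status."""
    kind: str
    name: str
    spec: dict
    status: dict = field(default_factory=dict)


class ControllerRegistry:
    """Hypothetical registry that allows exactly one controller per kind
    and exactly one kind per controller."""

    def __init__(self) -> None:
        self._by_kind: Dict[str, Callable[[Resource], None]] = {}

    def register(self, kind: str, reconcile: Callable[[Resource], None]) -> None:
        if kind in self._by_kind:
            # Multiple controllers for a single resource: rejected.
            raise ValueError(f"kind {kind!r} already has a controller")
        if reconcile in self._by_kind.values():
            # A single controller for multiple resources: rejected.
            raise ValueError("controller already manages another kind")
        self._by_kind[kind] = reconcile

    def dispatch(self, resource: Resource) -> None:
        # Every event for a kind goes to that kind's only controller.
        self._by_kind[resource.kind](resource)


def reconcile_database(db: Resource) -> None:
    """Toy reconcile step: copy the desired replica count into status."""
    db.status["replicas"] = db.spec["replicas"]
    db.status["ready"] = True


if __name__ == "__main__":
    registry = ControllerRegistry()
    registry.register("Database", reconcile_database)
    db = Resource(kind="Database", name="orders", spec={"replicas": 3})
    registry.dispatch(db)
    print(db.status)
```

In this toy model, both discouraged designs named in the abstract fail at registration time rather than at runtime, which is one way to make the rule hard to break by accident.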

Speakers

Feny Mehta

Software Engineer, Red Hat

Feny is a Kubernetes and Golang enthusiast. She has been in the software field for about 7 years, but just 3 years in the vast field of Kubernetes.

Francesco Ilario

Principal Software Engineer, Red Hat

I'm a Software Engineer passionate about Open Source, Linux, Go, and Kubernetes.

Cloud Native Novice

Content Experience Level Intermediate

4:50pm IST

PepsiCo’s Smart Edge Computing Delivers Anomaly Detection & Proactive Problem Solving to Boost Sales - Praseed Naduvath & Amit Mele, PepsiCo

Wednesday December 11, 2024 4:50pm - 5:25pm IST

Auditorium

In the era of digital transformation, PepsiCo is leading the way in integrating edge computing to ensure real-time data processing across its network. Utilizing lightweight Kubernetes solutions like K3s and RKE2, PepsiCo has built a platform that boosts computational capabilities at edge locations. Supported by Rancher and Longhorn, this platform enables efficient microservices deployment, providing the agility needed to meet dynamic market demands. A key component is the deployment of advanced ML models for camera and video inference, which needs substantial GPU resources. PepsiCo employs cutting-edge GPU sharing techniques to optimize these costly assets, improving performance and scalability while reducing costs. Join us to explore PepsiCo's edge computing strategy, its use of lightweight Kubernetes, and innovative GPU sharing techniques. Learn how PepsiCo is harnessing edge computing to drive operational excellence and sales growth and to maintain a competitive edge.

Speakers

Amit Mele

Amit M, PepsiCo

I currently hold the position of Deputy Director of Integration Engineering at PepsiCo. With 17 years of experience, I specialize in platform engineering and application development. With certifications in CKA, CKS, K3S, and Edge Architect, I’ve spent 6 years in platform strategy...

Praseed Naduvath

Praseed Naduvath, PepsiCo

Praseed Naduvath is a techno-manager with over 18 years in IT, specializing in cloud infrastructure, container orchestration, and service mesh technologies. A Certified Kubernetes Administrator and Security Specialist, he excels in managing and securing complex Kubernetes environments...

AI + ML

Content Experience Level Intermediate

4:50pm IST

Scalable ML Inferencing Pipeline Using K8s - Smitha Jayaram & Vinod Eswaraprasad, NVIDIA

Wednesday December 11, 2024 4:50pm - 5:25pm IST

305

Inference engines are used to generate predictions or deduce new information based on certain rules and data. With the rise in the number of applications that benefit from inferencing, there is a definite need to build inferencing services that are robust, performant (latency-optimized), and able to scale seamlessly on demand. In this session, we will discuss a proven set of procedures and guidelines for building and managing inference pipelines on Kubernetes. We will discuss details of the underlying hardware (GPU/CPU/memory) and the K8s configuration requirements of some of the well-known inference engines. We will demonstrate how robust and fault-tolerant pipelines for LLM and RAG can be built using basic K8s constructs like operators, StatefulSets, and persistent volumes, along with enabling automated monitoring, triaging, and remediation of failed hardware and software components.

Speakers

Smitha Jayaram

Principal Software Engineer, NVIDIA

Smitha is a principal software engineer at NVIDIA focusing on software solutions for building scalable, cloud native GPU computing infrastructure. She has been working in the area of building scalable storage solutions, and cloud native platform design for the last 24 years. In the...

Vinod Eswaraprasad

Solution Engineering, NVIDIA

Vinod is a principal software engineer at NVIDIA focusing on software solutions for building scalable GPU computing infrastructure. He has been working in the area of building fault-tolerant, scalable, and distributed platform architecture and design for the last 26 years. In the current...

AI_dev Sessions, MLOps + GenOps + DataOps

Content Experience Level Intermediate

4:50pm IST

Reimagining Kubernetes Pods: Nested Containers with CRI-O - Sohan Kunkerkar, Red Hat Inc

Wednesday December 11, 2024 4:50pm - 5:25pm IST

B202B

With user namespaces reaching beta in Kubernetes and new developments in CRI-O, we’re closer to making nested containers within pods more flexible and powerful. Traditionally limited by masked /proc and restricted user namespaces, this approach now offers capabilities similar to Podman. In this talk, we will explore how Kubernetes’ security features—privileged mode, rootless containers, and network isolation—can enable running containers inside pods. We’ll examine the support matrix for various configurations and discuss upcoming work to bring VM-like flexibility to Kubernetes pods for more secure and dynamic container orchestration.

Speakers

Sohan Kunkerkar

Senior Software Engineer, Red Hat Inc

Sohan Kunkerkar is a Senior Software Engineer at Red Hat, bringing expertise in distributed systems, backend engineering, and containers. His active contributions extend to CRI-O, a container runtime engine, and various sub-projects within the Kubernetes SIG Node community. Sohan...

Emerging + Advanced

Content Experience Level Intermediate

4:50pm IST

Using Kubernetes for Operating Legacy, the Way to Happy Legacy Developers - Soren Davidsen, Bankdata.dk

Wednesday December 11, 2024 4:50pm - 5:25pm IST

B101A

How do we build a self-service platform that allows developers to interact with a legacy application through Kubernetes? In most other parts of the organisation, developers seek to operate on Kubernetes using the now-familiar concepts of GitOps and cloud-native ways of deploying their applications. When Kubernetes is a one-stop shop for deployment, the cognitive load for platform and infrastructure can be minimized, giving better focus to implementing business needs. In this talk, Bankdata.dk will share their experience building a Kubernetes API-driven self-service platform for operating three of their most critical legacy systems on Kubernetes, through ArgoCD and GitOps. The platform builds on the Java operator framework and wraps operations on a portal server also running in Kubernetes. Bankdata.dk delivers all IT to 9 Danish banks, including the 2nd and 3rd largest.

Speakers

Soren Davidsen

Software developer, Bankdata.dk

Soren is a software developer at Bankdata's platform team, where his focus is developer experience and automation, apart from ensuring operational robustness for the member banks. Outside of work, time is spent with family, reading hard sci-fi, and maintaining/contributing to...

Platform Engineering

Content Experience Level Intermediate

5:40pm IST

Effortless Clustering: Rethinking ClusterAPI with Systemd-Sysext - Sayan Chowdhury, Microsoft

Wednesday December 11, 2024 5:40pm - 6:15pm IST

B202B

Through the years, ClusterAPI has evolved into an indispensable tool, streamlining the lifecycle management of Kubernetes clusters across multiple infrastructure providers. The current approach adds a layer of complexity at the image-building stage, presenting users with a multitude of options. But what if we challenge this conventional approach? This presentation introduces a paradigm shift in ClusterAPI image building, leveraging systemd-sysext and image composability. Join me in this talk as we explore how this innovative approach could help cope with the never-ending matrix of Kubernetes versions and Distro images, significantly enhancing usability for users managing their workloads.

Speakers

Sayan Chowdhury

Senior Software Engineer, Microsoft

Sayan is a Linux Software Engineer at Microsoft and a maintainer of Flatcar Container Linux. As a Release Manager, he works to maintain and build Flatcar. With a strong passion for open source, Sayan has been involved in other communities, namely Python, Fedora and Mozilla. Sayan...

Emerging + Advanced

Content Experience Level Intermediate

5:40pm IST

Faster Deployments at PepsiCo with Self-Service Continuous Delivery Using the App of Apps Pattern - Chaitanya G & Prasanti Kadiyala, PepsiCo

Wednesday December 11, 2024 5:40pm - 6:15pm IST

B101A

At PepsiCo, we are committed to enhancing efficiency and developer experience by continuously reviewing our processes and systems. As part of streamlining our deployment workflows, we transitioned to the GitOps methodology and adopted ArgoCD, a powerful GitOps tool with a feature-rich web UI that facilitates intuitive management of deployments, rollbacks, and application health. By leveraging ArgoCD's App of Apps pattern, we could define a bootstrapper Application, enabling the automation of application deployment. This approach allowed us to achieve self-service Continuous Delivery, empowering teams to independently manage their deployments while maintaining centralized control and visibility. Join us as we share some of the challenges, including managing dependencies, configuring environments, and ensuring consistency across deployments. Additionally, we will highlight the strategies and best practices, including the implementation of the App of Apps pattern and the integration of Helm.

Speakers

Chaitanya G

Chaitanya G, PepsiCo

With 15 years of extensive experience in the IT industry, I serve as the DevSecOps lead at PepsiCo, spearheading the holistic development and implementation of DevOps, GitOps, and Agile strategies within the organization.

Prasanti Kadiyala

PepsiCo, Sr Analyst, PepsiCo

A seasoned DevOps engineer with over 10 years of experience in the IT industry, with deep understanding of cloud computing, containerization, and infrastructure as code, with a particular focus on Kubernetes and GitOps. Started my career as a .NET developer, where I gained a solid...

Platform Engineering

Content Experience Level Intermediate

11:30am IST

⚡ Lightning Talk: Safer Cluster Upgrades with Mixed Version Proxy - Richa Banker, Google

Thursday December 12, 2024 11:30am - 11:35am IST

Auditorium

Upgrading Kubernetes clusters often presents numerous challenges, including potential downtime, compatibility issues, and the complexity of managing multiple versions. The Mixed Version Proxy feature introduced in Kubernetes 1.28 aims to mitigate these challenges. This talk will delve into the technical intricacies of the Mixed Version Proxy, exploring its design and implementation. We will then highlight the substantial benefits it offers for cluster upgrades, such as minimizing downtime and enhancing overall reliability. Attendees will gain practical knowledge, possibly through a demonstration, of enabling and utilizing the Mixed Version Proxy. Finally, we will provide insights into the future roadmap for this feature, including upcoming beta releases and enhancements.

Speakers

Richa Banker

Software Engineer, Google

Currently a software engineer at Google. Exploring and contributing to OSS Kubernetes on the side.

⚡ Lightning Talks, Operations + Performance

Content Experience Level Intermediate

11:30am IST

Modernizing Network Automation: Embracing Cloud-Native Excellence for Scalable Automation Platforms - Diana Appanna & Srivatsa Srinivasa, Nokia

Thursday December 12, 2024 11:30am - 12:05pm IST

B101A

Traditionally, Network Management Systems (NMS) are very important and complex systems designed to help network operators, from large service providers to small enterprises, handle new service deployment, device lifecycle management, and fault management. However, their monolithic architecture combined all functions into one system, causing scalability issues and inefficiencies. The siloed design also impeded network management by isolating functions and restricting real-time monitoring. To address these constraints, we transitioned to a cloud-native architecture. We leveraged a comprehensive suite of open-source and CNCF technologies, including service mesh (Istio), container orchestration (Kubernetes), observability stack (Prometheus, Grafana, OpenSearch, Fluent Bit), and ingress gateway solutions (MetalLB, NGINX). This transformation resolved legacy issues while future-proofing our customers' networks, ensuring they stay agile, efficient, and prepared for future challenges.

Speakers

Diana Appanna

Senior Technical Specialist, Nokia

Passionate Software Designer with over a decade of experience in developing and optimizing network management systems, adept at implementing FCAPS (Fault, Configuration, Accounting, Performance, and Security) principles and cloud-native design.

Srivatsa Srinivasa

Mr, Nokia

With over 15 years of experience in the networking industry, Srivatsa is a seasoned expert known for deep understanding of cloud native architecture and its transformative impact on modern technology landscapes. Throughout his career, Srivatsa has been at the forefront of pioneering...

Cloud Native Experience

Content Experience Level Intermediate

11:30am IST

A Debugging Journey from CoreDNS to Coreutils - Akhil Mohan & Humble Devassy Chirammal, VMware by Broadcom

Thursday December 12, 2024 11:30am - 12:05pm IST

201

It all started when CoreDNS pods entered a crash loop backoff state soon after the cluster was deployed. The only change was the unprivileged execution required by the latest CoreDNS image. The issue occurred only on worker nodes, not on control plane nodes, and refreshing the image by deleting and pulling it again resolved the issue. It happened in one OS distribution flavor but not in another. We investigated Docker build, setcap, and libc, and found that the required capability was missing on nodes during the issue. The stack was complex, involving FIPS/CGO, the build system, runtime configuration, and binary packaging in OVA. Debugging included the CoreDNS binary, containerd, runc, SELinux, AppArmor, Photon OS, and the kernel, with lots of learning that could be useful for developers, admins, or cluster operators debugging unprivileged pod execution. Finally, we discovered something interesting in the bootstrapping scripts. Let's debug.

Speakers

Akhil Mohan

Software Engineer, VMware by Broadcom

Akhil works as a Software Engineer at VMware by Broadcom. An active contributor to projects in cloud native and container ecosystem. Akhil is a maintainer of containerd, and the kubernetes publishing-bot sub project. He works mostly on container runtimes and kubernetes sig-node a...

Humble Devassy Chirammal

R&D Engineer, VMware by Broadcom

Humble is the Tech Lead for Kubernetes Distribution Engineering at VMware by Broadcom. He has more than 19 years of experience, including extensive work in OpenShift engineering, KVM virtualisation, etc. He led SDS-based products while working at Red Hat. He is one of the...

Operations + Performance

Content Experience Level Intermediate

11:45am IST

⚡ Lightning Talk: From Creation to Termination: Your Pod's Digital Footprint - Subha Seenivasan, Google

Thursday December 12, 2024 11:45am - 11:50am IST

Auditorium

Kubernetes relies on the kubelet and containerd to manage the lifecycle of pods. These components provide a wealth of log data, but the sheer volume and complexity can be overwhelming, particularly for those new to the Kubernetes ecosystem. This talk dissects a pod's journey from birth to termination, pinpointing key events like volume mounts, health checks, and the dreaded teardown. Listeners will learn to filter, search, and interpret logs like a pro, turning raw data into actionable insights. Whether you're troubleshooting a misbehaving pod or simply curious about its inner workings, this talk aims to equip you with the skills to navigate the Kubernetes log labyrinth with confidence.

Speakers

Subha Seenivasan

Technical Solutions Specialist, GKE, Google

Tech enthusiast, 5+ years in container orchestration (Docker, Kubernetes). Currently at Google, specializing in GKE & GDC for VMware/Bare metal on the Technical Solutions Engineering team.

⚡ Lightning Talks, Operations + Performance

Content Experience Level Intermediate

11:50am IST

⚡ Lightning Talk: Enhancing Hyperparameter Optimization with Advanced Parameter Distributions - Shashank Mittal, Kubeflow

Thursday December 12, 2024 11:50am - 11:55am IST

Auditorium

In this session, Shashank Mittal, a Google Summer of Code (GSoC) 2024 contributor for Kubeflow, will discuss his work on enhancing the Katib component's Experiment APIs to support a wide range of parameter distributions for hyperparameter optimization. Katib is an integral part of the Kubeflow ecosystem that enables scalable and efficient hyperparameter tuning in machine learning workflows. The session will focus on the technical challenges and solutions involved in adding support for distributions such as uniform, log-uniform, normal, and lognormal, among others. Attendees will gain insights into the process of integrating these distributions into Katib's APIs and suggestion services like Optuna and Hyperopt. This talk is designed for developers and data scientists interested in MLOps and cloud-native machine learning infrastructure, highlighting the importance of advanced hyperparameter optimization techniques in improving model performance.

Speakers

Shashank Mittal

Shashank Mittal, Kubeflow

Shashank Mittal is a B.Tech student at IIT BHU and a 2024 Google Summer of Code contributor to Kubeflow. He specializes in MLOps and DevOps, focusing on enhancing hyperparameter optimization capabilities in cloud-native environments. His work in Kubeflow's Katib involves integrating...

⚡ Lightning Talks, AI + ML

Content Experience Level Intermediate

12:20pm IST

Cell-Based Kubernetes - The Secret to Scalable, Repeatable and Resilient Cloud Architecture - Shweta Vohra, Booking.com & Saiyam Pathak, Loft Labs

Thursday December 12, 2024 12:20pm - 12:55pm IST

Auditorium

Cell-based architecture offers concentrated, self-contained functionalities but can be challenging to implement with Kubernetes clusters. However, integrating this architecture with Kubernetes brings significant benefits like enhanced scalability, resiliency, and resource management. This session will demystify the process, focusing on secure inter-cell communication, API gateway integration, and building resilience using Kubernetes-native features. I’ll demonstrate how to use Istio for secure inter-cell communication, ensuring efficient interaction between services while maintaining strict security. This includes configuring traffic management rules and leveraging Istio’s mutual TLS for secure communications with minimal complexity. We’ll also explore enforcing separation of concerns and governance for multi-tenancy and privacy. Attendees will gain actionable insights into leveraging cell-based architecture on Kubernetes, transforming complex setups into robust, scalable solutions.

Speakers

Shweta Vohra

Lead Architect, Booking.com

Shweta Vohra is an Architect, Author, and Inventor with over 20 years of experience in the software industry. Her expertise spans from complex embedded systems design to hybrid cloud-native solutions, and most recently, the creation of data and machine learning platforms. She is the...

Saiyam Pathak

Principal Developer Advocate, Loft Labs

Saiyam is working as Principal Developer Advocate at Loft Labs. He is the founder of Kubesimplify and BuildSafe. Previously at Civo, Walmart Labs, Oracle, and HP, Saiyam has worked on many facets of Kubernetes. When not coding, Saiyam contributes to the community by writing blogs and...

SDLC (Software Development Lifecycle)

Content Experience Level Intermediate

3:45pm IST

Trust in Green: Towards a Cloud Native Approach for Building Sustainable and Reliable Enterprise AI - Vincent Caldeira, Red Hat

Thursday December 12, 2024 3:45pm - 4:20pm IST

Auditorium

As organisations increasingly integrate AI solutions, the demand for environmentally sustainable practices within this space has never been more critical. This presentation delves into the collaborative effort between the Cloud Native Computing Foundation (CNCF) AI WG and the TAG Environmental Sustainability to define a repeatable design approach aimed at fostering sustainable AI in cloud-native environments. Our discussion will outline the crucial considerations in such an approach, including efficient management of compute resources, storage optimisation, and advanced networking solutions. Attendees will gain insights into the lifecycle of AI/ML deployments, from inception through operation, emphasising resilience, scalability, and resource efficiency. By highlighting innovative "green" strategies, this session will provide actionable best practices and recommendations, alongside a forward-looking perspective on future trends and research directions in sustainable AI.

Speakers

Vincent Caldeira

CTO APAC, Red Hat

Vincent Caldeira, CTO of Red Hat in APAC, is responsible for strategic partnerships and technology strategy. Named a top CTO in APAC in 2023, he has 20+ years in IT, excelling in technology transformation in finance. An authority in open source, cloud computing, and digital transformation...

AI + ML

Content Experience Level Intermediate

3:45pm IST

Keeping the Lights on: Zero Downtime Application Upgrades in Kubernetes - Shashank Pai & Sagar Jadhav, InfraCloud Technologies

Thursday December 12, 2024 3:45pm - 4:20pm IST

201

Imagine adding all your favorite products to your shopping cart, only to have the page reload and, poof, everything’s gone. Frustrating, right? For mission-critical apps, seamless upgrades in Kubernetes demand a solid grasp of how to tackle these challenges effectively. This session equips you with practical strategies for performing zero-downtime upgrades in Kubernetes. We will cover the role of readiness and liveness probes, explore rolling update strategies and Blue/Green deployments, and compare their benefits. We will also discuss handling SIGTERM signals, managing sidecar containers, and the importance of PodDisruptionBudgets and TopologySpreadConstraints. Finally, we will examine how Priority Classes and Pod Eviction impact stability during Kubernetes upgrades. By the end, you'll be equipped to confidently handle upgrades without downtime, ensuring your applications stay online and reliable, even during the most challenging updates.

Speakers

Sagar Jadhav

Senior Product Engineer, InfraCloud Technologies Pvt. Ltd.

Software Engineering has been Sagar Jadhav's field of expertise for more than a decade. He has experience with Docker, Kubernetes, Terraform, Go, NodeJS, and other tools and programming languages. He is currently working in the space of infrastructure provisioning...

Shashank Pai

Senior SRE, InfraCloud

Shashank is a Senior SRE at InfraCloud, bringing extensive experience in managing and scaling cloud-native applications. His expertise lies in Kubernetes, CI/CD automation, and infrastructure as code. He is passionate about...

Operations + Performance

Content Experience Level Intermediate

3:45pm IST

Fuzzing for Stability: Uncovering and Mitigating Helm's CVE - Jakub Ciolek, AlphaSense

Thursday December 12, 2024 3:45pm - 4:20pm IST

B202B

Join this talk to uncover the story of the high-severity CVE-2024-26147 (CVSS 7.5) discovered in Helm and understand the role of fuzzing in maintaining the ecosystem’s integrity. Through this demonstration, you'll see firsthand the systematic approach used to identify the vulnerability, which caused Helm to panic when faced with missing YAML metadata. The issue made it possible to crash Helm SDK-based clients over the network and, additionally, to brick local Helm client installations. We'll dive into the specific tools and techniques that were instrumental in detecting the issue, focusing on their applicability to your daily work. This session is designed not just to share a discovery but to foster a community-wide commitment to proactive security practices. Learn how these insights can be applied to strengthen the security and reliability of your Kubernetes deployments, ensuring a safer environment for all users of the ecosystem.

Speakers

Jakub Ciolek

Senior Tech Lead - Cloud Platform, AlphaSense

Jakub Ciolek is a seasoned Senior Tech Lead at AlphaSense, focused on Kubernetes and open-source innovation. He has made notable contributions to the Go compiler and identified key vulnerabilities in Helm and Argo CD. He is dedicated to driving forward secure, scalable solutions in...

Security

Content Experience Level Intermediate

4:50pm IST

Streamlining Machine Learning Operations with GitOps - Kunal Kushwaha, Civo

Thursday December 12, 2024 4:50pm - 5:25pm IST

201

As AI becomes more popular, it's bringing some new headaches when it comes to managing models, from building them to getting them into production. The usual DevOps tools don't always cut it, especially when you need everything to be consistent and handle large datasets without a hitch. As ML projects grow, trying to manage everything manually just gets messy and prone to mistakes. In this talk, I'll break down how GitOps can make life easier by automating and simplifying ML operations. With Git as the go-to source for everything, GitOps keeps your ML models consistent and reproducible across different environments. I'll share some real-world examples, talk about the challenges, and give you practical tips on how to bring GitOps into your ML workflow. The goal is to help you deploy models faster and with fewer headaches.

Speakers

Kunal Kushwaha

Field CTO, Civo

Building a better future through technology and innovation.

AI_dev Sessions, Foundations + Frameworks + Tools for Machine Learning

Content Experience Level Intermediate

4:50pm IST

Scaling Beyond Autoscaling: How We Grew Our Kubernetes Clusters - Bhavin Gandhi & Ruturaj Kadikar, InfraCloud Technologies

Thursday December 12, 2024 4:50pm - 5:25pm IST

305

Picture this: Your startup just landed a big client, and as your business starts to grow, so does your Kubernetes cluster. The number of microservices and the traffic they serve have started increasing. You configure auto-scaling of Nodes and Pods. Do you think that is enough, or do you need to look into other areas? What other components need scaling? What strategies should be devised? How do we verify that scaling will not cause any impact? In this session, we answer all these questions and come up with a blueprint for scaling a Kubernetes cluster, drawn from our experience of scaling our production cluster from merely 20-25 nodes to around 250 nodes with more than 5000 pods. We will touch on the issues we faced and how we resolved them, which should form a good prescription for those who want to scale their Kubernetes infrastructure.

Speakers

Bhavin Gandhi

Staff SRE, InfraCloud Technologies

Bhavin is working with InfraCloud Technologies. His main areas of interest are Free/Libre and Open Source Software, DevSecOps, containers and Kubernetes. He has been part of cloud native transformation journeys for various companies. You can check what he is up to on his personal website...

Ruturaj Kadikar

Senior Engineer, Site Reliability, InfraCloud Technologies

I like to apply site reliability principles to facilitate business growth through the seamless and reliable performance of systems and infrastructure. I have built private clouds with OpenStack, built Kubernetes on premises & scaled Kubernetes on cloud for enterprises of various...

Operations + Performance

Content Experience Level Intermediate

4:50pm IST

SPIFFE as a Glue for Large Scale Telco Deployments: A Nephio Perspective - Rahul Jadhav, AccuKnox

Thursday December 12, 2024 4:50pm - 5:25pm IST

B202B

Emerging telco trends such as ORAN and advanced 5G core demand a disaggregated architecture for scaling. Kubernetes-based deployments are becoming the norm, and much of the open CNCF/LF tooling is playing a major role. This talk covers the challenges that the Nephio (www.nephio.org) SIG-Security team faced in streamlining security operations across multi-cluster, multi-region, multi-vendor deployments. It looks at specific use cases where the Nephio management cluster needs to interact securely with regional/edge clusters for control plane needs, and explains why and how the Nephio security team envisaged SPIFFE as a foundational layer to bind multiple regions together. It also highlights a particular problem in ORAN deployments, where the SMO (Service Management and Orchestration) has to interact securely with the IMS (Infrastructure Management Service) to create infrastructure, and the role SPIFFE played there.

Speakers

Rahul Jadhav

Cofounder, CTO, AccuKnox

An avid coder and systems engineer working on solutions involving security and performance of cloud-native tech. Contributed to several open source projects, including the Linux kernel, and worked closely with IETF standards (such as ROLL, 6lo, LWIG) and the Linux Foundation. Taken several projects...

Security

Content Experience Level Intermediate

5:40pm IST

Scaling Private LLM Model Services with Kserve and Modelcar OCI: A Real-World Implementation - Mayuresh Krishna, initializ

Thursday December 12, 2024 5:40pm - 6:15pm IST

Auditorium

Deploying large language models (LLMs) is inherently complex, challenging, and expensive. This case study demonstrates how Kubernetes, specifically Kserve with the Modelcar OCI storage backend, simplifies the deployment and management of private LLM services. First, we explore how Kserve enables efficient and scalable model serving within a Kubernetes environment, allowing seamless integration and optimized GPU utilization. Second, we delve into how Modelcar OCI artifacts streamline artifact delivery beyond container images, reducing duplicate storage usage, increasing download speeds, and minimizing governance overhead. The session will cover implementation details, benefits, best practices, and lessons learned. Walk away knowing how to leverage Kubernetes, Kserve, and OCI artifacts to enhance your MLOps journey, achieving significant efficiency gains and overcoming common challenges in deploying and scaling private LLM services.

Speakers

MK

CTO & Co-Founder, initializ

Mayuresh Krishna is the CTO and Co-Founder of initializ.ai, where he drives product engineering, building AI models and private AI services. He has previously worked at VMware Tanzu as a Solution Engineering Leader & Pivotal Software as a Senior Platform Architect.

AI + ML

Content Experience Level Intermediate

5:40pm IST

Unleashing Observability: Enabling Distributed Tracing in gRPC with OpenTelemetry - Purnesh Dixit & Sourabh Singh, Google

Thursday December 12, 2024 5:40pm - 6:15pm IST

201

gRPC is often chosen for its high-performance communication, making it crucial to minimize latency. Distributed tracing helps identify and diagnose latency bottlenecks in gRPC calls, ensuring that the performance benefits of gRPC are fully realized. Setting up tracing from scratch for gRPC applications is complex and incurs significant maintenance overhead. Moreover, there is not much literature on how to do it properly, resulting in bespoke and suboptimal solutions. In this talk, Purnesh Dixit from the gRPC team will talk about the best practices for utilizing distributed tracing effectively in gRPC environments by configuring the new OpenTelemetry plug-in, created by the gRPC team at Google.

Speakers

Purnesh Dixit

Purnesh Dixit (gRPC Team, Google), Google

Purnesh is a software engineer on the gRPC team at Google. He is a contributor to the OpenTelemetry support for distributed tracing in grpc-go. He is also one of the maintainers of the grpc-go open source library.

Sourabh Singh

Google

Observability

Content Experience Level Intermediate

5:40pm IST

Expedia Group's GitOps Revolution: Extensive Scalability Testing on ArgoCD for 30K+ Applications - Shivani Mehrotra, Expedia Group & Mohit Kumar, Coforge Limited

Thursday December 12, 2024 5:40pm - 6:15pm IST

B202B

Expedia Group's journey to implement GitOps with ArgoCD is a story of innovation, scalability, and overcoming challenges. Our GitOps journey involved migrating from KubeFed to ArgoCD, focusing on extensive scalability testing across hundreds of virtual clusters, set up using the open source tool vcluster. We proactively identified potential challenges and prepared comprehensive test cases tailored to different application flavors. We created three types of applications for testing, ranging from 15 to 30 resources each and including CRDs and jobs; small applications contained 15 resources and large applications contained 30. We experimented with multiple test scenarios, using permutations and combinations of applications tested on 300 vclusters, scaling from approximately 1,000 applications to 30,000+ across these clusters. We concluded this initiative by determining optimal settings for various tunable parameters in the ArgoCD controllers.

Speakers

Mohit Kumar

Coforge, Senior DevOps Engineer, Coforge Limited

Mohit, Senior DevOps Engineer at Coforge, specializes in GitOps and DevOps methodologies with a focus on Kubernetes orchestration and cloud infrastructure. His expertise ensures high availability and scalability across global platforms. Committed to the forefront of technology, Mohit...

Shivani Mehrotra

Shivani Mehrotra, Expedia Group, SDE-II, Expedia Group

Shivani, SDE-II at Expedia Group, is a platform engineer specializing in building robust systems. Passionate about innovation, Shivani thrives on challenges, delivering impactful results in her role. Outside of work, Shivani enjoys exploring new technologies and staying at the forefront...

Platform Engineering

Content Experience Level Intermediate

5:40pm IST

Ensuring Reliability of Production Ready Kubernetes Operators Using Envtest - Aniruddha Basak, Independent & Rayan Das, OneTrust LLC

Thursday December 12, 2024 5:40pm - 6:15pm IST

305

We need to keep many principles & techniques in mind while writing a Kubernetes operator, especially one used in production. Projects like kubebuilder go a long way to help in building operators. However, most talks focus on how to write operators. In this talk, you will learn how testing can help you create Kubernetes operators that are reliable and resilient to failure, and give you the confidence to deploy them to production clusters. Often, writing unit tests is not enough for an operator that manages infrastructure. Introducing e2e tests will help, but writing and running e2e tests is time-consuming. This session will share how you can use envtest to simulate an environment very similar to a real one and run the tests against it. In addition, you’ll learn what to keep in mind while writing the tests: for example, envtest doesn’t support garbage collection, so you must be very careful when creating objects and delete them after a test suite is done.

Speakers

Rayan Das

Senior Site Reliability Engineer, OneTrust LLC

As a Senior Site Reliability Engineer, I devote my expertise to working on the infrastructure of OneTrust Privacy Software. Within the Kubernetes community, I've served as the SIG-Release Enhancement Shadow for Kubernetes v1.29, and I applied to be a release shadow for v1.31 as well. Beyond...

Aniruddha Basak

Cloud Software Engineer, Independent

Aniruddha Basak is a Cloud Software Engineer based in Germany. He’s currently a maintainer of the Cluster API providers for Hetzner and Hivelocity, and builds solutions for Kubernetes tooling and Cluster Management. In his free time, he loves contributing to upstream Kubernetes...

SDLC (Software Development Lifecycle)

Content Experience Level Intermediate