Name: Scalable ML Inferencing Pipeline Using K8s - Smitha Jayaram & Vinod Eswaraprasad, NVIDIA
Start: 2024-12-11T16:50:00+0530
End: 2024-12-11T17:25:00+0530

In-person
11-12 December


Wednesday December 11, 2024 4:50pm - 5:25pm IST

Room 5

Inference engines generate predictions or deduce new information from rules and data. With the rise in the number of applications that benefit from inferencing, there is a definite need to build inferencing services that are robust, performant (latency-optimized), and able to scale seamlessly on demand. In this session, we will discuss a proven set of procedures and guidelines for building and managing inference pipelines on Kubernetes. We will cover the underlying hardware (GPU, CPU, memory) and K8s configuration requirements of some well-known inference engines. We will demonstrate how robust, fault-tolerant pipelines for LLM and RAG can be built using basic K8s constructs such as operators, StatefulSets, and persistent volumes, along with automated monitoring, triaging, and remediation of failed hardware and software components.
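As an illustration of the kind of K8s constructs the abstract mentions, here is a minimal sketch (not the speakers' implementation) that builds a StatefulSet manifest for an inference server with a GPU limit and a per-replica persistent volume claim. The function name, image, port, `/health` probe path, and volume size are hypothetical placeholders; `nvidia.com/gpu` is the extended resource name exposed by the NVIDIA device plugin.

```python
import json


def build_inference_statefulset(name, image, replicas=2, gpus_per_pod=1,
                                model_cache_size="50Gi"):
    """Return a StatefulSet manifest (as a dict) for a hypothetical inference server."""
    labels = {"app": name}
    container = {
        "name": "server",
        "image": image,
        # Port 8000 and the /health path are assumptions for illustration.
        "ports": [{"containerPort": 8000, "name": "http"}],
        # GPUs are requested via the extended resource, set under limits.
        "resources": {"limits": {"nvidia.com/gpu": gpus_per_pod}},
        # A readiness probe keeps traffic away from replicas still loading a model.
        "readinessProbe": {
            "httpGet": {"path": "/health", "port": "http"},
            "periodSeconds": 10,
        },
        "volumeMounts": [{"name": "model-cache", "mountPath": "/models"}],
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "serviceName": name,
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
            # Each replica gets its own PersistentVolumeClaim, so downloaded
            # model weights survive pod restarts and rescheduling.
            "volumeClaimTemplates": [{
                "metadata": {"name": "model-cache"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": model_cache_size}},
                },
            }],
        },
    }


if __name__ == "__main__":
    manifest = build_inference_statefulset(
        "llm-inference", "example.com/inference-server:latest")
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be piped to `kubectl apply -f -`, since kubectl accepts JSON manifests as well as YAML. A StatefulSet is used here because its volume claim templates give each replica a stable, dedicated volume.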

Speakers

Smitha Jayaram

Principal Software Engineer, NVIDIA

Smitha is a principal software engineer at NVIDIA focusing on software solutions for building scalable, cloud native GPU computing infrastructure. She has been working on scalable storage solutions and cloud native platform design for the last 24 years. In the...

Vinod Eswaraprasad

Solution Engineering, NVIDIA

Vinod is a principal software engineer at NVIDIA focusing on software solutions for building scalable GPU computing infrastructure. He has been working on fault-tolerant, scalable, distributed platform architecture and design for the last 26 years. In the current...


AI_dev Sessions, MLOps + GenOps + DataOps

Content Experience Level: Intermediate

KubeCon + CloudNativeCon India 2024
