Attending this event?
In-person
11-12 December
Learn More and Register to Attend

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon India 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: This schedule is automatically displayed in India Standard Time (UTC+5:30). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date." The schedule is subject to change, and session seating is available on a first-come, first-served basis.
Wednesday December 11, 2024 4:50pm - 5:25pm IST
Inference engines are used to generate predictions or deduce new information based on certain rules and data. With the rise in the number of applications that benefit from inferencing, there is a definite need to build inferencing services that are robust, performant (latency-optimized), and able to scale seamlessly on demand. In this session, we will discuss a proven set of procedures and guidelines for building and managing inference pipelines on Kubernetes. We will discuss details of the underlying hardware (GPU/CPU/memory) and the K8s configuration requirements of some well-known inference engines. We will demonstrate how robust, fault-tolerant pipelines for LLM and RAG can be built using basic K8s constructs such as operators, StatefulSets, and persistent volumes, along with enabling automated monitoring, triaging, and remediation of failed hardware and software components.
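As a rough illustration of the StatefulSet and persistent volume constructs named in the abstract, the sketch below builds a manifest for a GPU-backed inference server in Python. It is not taken from the session. The function name, image, serving port, `/health` endpoint, mount path, and cache size are all hypothetical placeholders; only the `apps/v1` StatefulSet schema and the `nvidia.com/gpu` resource name (advertised by the NVIDIA device plugin) are standard.

```python
import json


def build_inference_statefulset(name, image, replicas=2, gpus_per_pod=1,
                                cache_size="100Gi"):
    """Return a StatefulSet manifest (as a dict) for a GPU inference server.

    All argument values are illustrative assumptions, not recommendations.
    """
    labels = {"app": name}
    container = {
        "name": "server",
        "image": image,  # hypothetical image supplied by the caller
        "ports": [{"containerPort": 8000}],  # assumed serving port
        "resources": {
            # GPU resource exposed by the NVIDIA device plugin
            "limits": {"nvidia.com/gpu": gpus_per_pod},
        },
        # Model weights live on a per-pod persistent volume
        "volumeMounts": [{"name": "model-cache", "mountPath": "/models"}],
        "readinessProbe": {
            # assumed health endpoint; real engines differ
            "httpGet": {"path": "/health", "port": 8000},
            "periodSeconds": 10,
        },
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,  # headless Service expected under this name
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
            # One PersistentVolumeClaim is created per replica
            "volumeClaimTemplates": [{
                "metadata": {"name": "model-cache"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": cache_size}},
                },
            }],
        },
    }


if __name__ == "__main__":
    manifest = build_inference_statefulset(
        "llm-inference", "example.com/llm-server:latest")
    print(json.dumps(manifest, indent=2))
```

Because `kubectl` accepts JSON as well as YAML, the printed output could be piped to `kubectl apply -f -` on a cluster that has GPU nodes and a default storage class. A StatefulSet is used here so that each replica keeps a stable identity and its own volume across restarts, which avoids re-downloading model weights after a pod is rescheduled.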
Speakers
Smitha Jayaram

Principal Software Engineer, NVIDIA
Smitha is a principal software engineer at NVIDIA focusing on software solutions for building scalable, cloud native GPU computing infrastructure. She has been working on scalable storage solutions and cloud native platform design for the last 24 years. In the...
Vinod Eswaraprasad

Solution Engineering, NVIDIA
Vinod is a principal software engineer at NVIDIA focusing on software solutions for building scalable GPU computing infrastructure. He has been working on fault-tolerant, scalable, and distributed platform architecture and design for the last 26 years. In the current...
Room 5
