
July 16, 2026 | 3:00 PM ET / 12:00 PM PT

How to deploy your own LLM and take it to production with Fuzzball

Running your own language model sounds straightforward until you see what it actually requires: compute provisioning, GPU allocation, model downloads, service wiring, storage configuration, and authentication, with no guarantee the model will stay running once it is deployed. Most teams look at that list and reach for a commercial AI service instead. The teams that do build it themselves spend months on infrastructure work before a single model reaches production.

Fuzzball removes that overhead by capturing AI model deployment as a reusable, templated workflow. In this live demo, the CIQ team shows exactly what it looks like to go from the Fuzzball workflow catalog to a running LLM on infrastructure your team owns and controls, without writing a single line of infrastructure configuration. The session uses NVIDIA DGX Spark as the demo environment and covers the full path from first deployment through production inference, including how to choose the right inference backend for your workload, swap models without rebuilding your stack, and carry the same workflow definition into any compute environment when your project grows.

Join us July 16 at 3:00 PM ET / 12:00 PM PT.


Together, the speakers will cover

  • How Fuzzball's workflow catalog deploys a complete LLM stack, including inference backend and chat interface, through a single form submission
  • How to swap models without making infrastructure changes, treating the model as a parameter rather than an architectural decision
  • How the same workflow definition runs on-premises, on DGX Spark, in the cloud, or across any environment where Fuzzball runs
  • How sovereign and private AI workloads stay on infrastructure you own and control, with your data never leaving your environment

Attendees will leave with

  • A clear, step-by-step understanding of how to deploy and run an LLM on infrastructure they control using Fuzzball
  • Practical guidance on choosing and scaling an inference backend from a single team's use to production-scale serving
  • A live view of how the same Fuzzball workflow extends from a single GPU system to larger infrastructure without a rebuild

Speakers

Moderator

Hope Lynch, Director of Product Marketing, CIQ

Panelists

Wolfgang Resch, Research Computing Engineer, CIQ

David Godlove, Technical Product Writer, CIQ


Agenda preview

  • Why running your own LLM is harder than it looks and what teams get wrong
  • Live demo: from Fuzzball workflow catalog to running LLM on NVIDIA DGX Spark
  • Swapping models without rebuilding your stack
  • How the same workflow definition runs anywhere Fuzzball runs
  • Sovereign AI: your model, your data, your infrastructure
  • Live Q&A with the CIQ team