Back to hands-on classes
Coming upIntermediateOnline

Learn how to use local LLMs on Ollama and Groq

Run powerful models on your own hardware or ultra-fast Groq inference engines.

Schedule

July 25 - August 1, 2026

Duration

2 weeks • 2hrs/wk

Project

Hands-on capstone

Detailed Curriculum

4 practical sections built around live exercises.

01

Local model setup with Ollama

Install, run, and manage open models locally.

Topics covered

  • Ollama installation
  • Model pulls and runtime commands
  • Prompting local models
  • Hardware expectations

Hands-on lab

Run a local model and compare output, latency, and memory use across model sizes.

02

Customization and performance

Tune local models for specific tasks and constraints.

Topics covered

  • Modelfiles
  • System prompts
  • Quantization basics
  • RAM, VRAM, and disk tradeoffs

Hands-on lab

Create a custom Ollama model profile for a focused assistant.

03

Groq and hybrid inference

Use Groq when speed matters and understand when cloud inference is the better path.

Topics covered

  • Groq API basics
  • Latency testing
  • Fallback routing
  • Cost and privacy tradeoffs

Hands-on lab

Build a small script that can switch between local Ollama and Groq-backed inference.

04

Private AI app patterns

Design apps that keep sensitive workflows close to your machine or private infrastructure.

Topics covered

  • Offline-friendly workflows
  • Local RAG options
  • Data privacy decisions
  • Deployment boundaries

Hands-on lab

Design a private assistant for one sensitive document or business workflow.

What You Get Out Of It

Concrete capabilities you should leave with.

Run and customize local models with Ollama

Understand quantization and hardware limits

Use Groq for fast hosted inference

Design privacy-first AI workflows