Alexios Bluff Mara × Illinois State University
Research Collaboration · Cardinal & Code

Gemma 4 Good archive

Gemma was the local-first test bed.

The Gemma 4 Good work taught us where local inference is genuinely useful: small teams, private data, owned hardware, and demos where the marginal cost matters. It also taught us where a public website should not depend on a desktop GPU being online.

What Stays

Local inference

Worth keeping

Gemma on owned hardware is still a strong story for privacy, experimentation, and predictable costs. It belongs behind well-defined tools and reproducible runbooks, not vague live-demo promises.

Mercury

Use Hermes directly

The fork produced useful integration work, but carrying a long-lived fork is the wrong maintenance burden right now. Keep adapters small and upstream what is generally useful.

Cortex

Publish artifacts first

Cortex should have a static Cloudflare-backed gallery of videos, thumbnails, metadata, and recordings. The live upload app can remain a lab mode that runs only while the PC is available.

Cloudflare-First Shape

Code lives in Git and deploys to Cloudflare Pages. Media lives in Cloudflare R2 and is referenced by a checked-in manifest. Dynamic Cortex compute stays local until we have a better cloud GPU story. Until then, every public page should still be useful when Seratonin is off.
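
As a rough illustration of the manifest idea, here is a minimal Python sketch of how a build step might turn a checked-in manifest into public media URLs. Everything specific in it is an assumption made for illustration: the manifest fields (`id`, `title`, `video`, `thumbnail`), the `MEDIA_BASE_URL` domain, and the helper names are hypothetical and do not describe the project's actual layout.

```python
import json
from dataclasses import dataclass
from typing import List
from urllib.parse import quote

# Hypothetical public base URL for the R2 bucket (assumption, not from the project).
MEDIA_BASE_URL = "https://media.example.com"

# Hypothetical manifest, as it might be checked into Git next to the site code.
MANIFEST_JSON = """
{
  "items": [
    {
      "id": "demo-001",
      "title": "Example recording",
      "video": "cortex/demo-001/video.mp4",
      "thumbnail": "cortex/demo-001/thumb.jpg"
    }
  ]
}
"""

REQUIRED_FIELDS = {"id", "title", "video", "thumbnail"}


@dataclass(frozen=True)
class GalleryItem:
    id: str
    title: str
    video_url: str
    thumbnail_url: str


def media_url(key: str, base_url: str = MEDIA_BASE_URL) -> str:
    """Join a bucket object key onto the base URL, percent-escaping unsafe characters."""
    return f"{base_url.rstrip('/')}/{quote(key.lstrip('/'))}"


def load_gallery(manifest_text: str, base_url: str = MEDIA_BASE_URL) -> List[GalleryItem]:
    """Parse the manifest and resolve every media key to a static URL.

    Raises ValueError if an entry is missing a required field, so a broken
    manifest fails at build time instead of producing a half-empty gallery.
    """
    manifest = json.loads(manifest_text)
    items = []
    for entry in manifest["items"]:
        missing = REQUIRED_FIELDS - entry.keys()
        if missing:
            raise ValueError(f"manifest entry missing fields: {sorted(missing)}")
        items.append(
            GalleryItem(
                id=entry["id"],
                title=entry["title"],
                video_url=media_url(entry["video"], base_url),
                thumbnail_url=media_url(entry["thumbnail"], base_url),
            )
        )
    return items


if __name__ == "__main__":
    for item in load_gallery(MANIFEST_JSON):
        print(item.id, item.video_url)
```

Because the gallery is derived only from the manifest and static object URLs, nothing in this path needs the local GPU machine to be online.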