Fleet versions
The Fleet Versions page tells you exactly which agent build is running on every host, broken down by OS. It’s the page you open during a release rollout to watch the new version propagate, and the page you open during an incident if you suspect a bad build needs to be halted across the fleet.
What you get
A single-page dashboard plus one full-fleet matrix table.
Header. “Fleet Versions” plus a small subtitle showing the total
active host count (N active hosts). The page auto-refreshes every
60 seconds.
Rollout strip. A horizontal card with the rollout summary:
- Current version — the version Mimir is currently promoting to agents.
- Rollout — the percentage of hosts authorized to receive the new version at their next update check. The agent update controller enforces this cap server-side; a new version never goes to more hosts than the rollout percentage permits.
- Updated — N / M count showing how many hosts have actually picked up the current version vs. the total fleet.
- FIM stale — a red counter that appears only when one or more hosts have a file-integrity-monitor heartbeat older than the threshold. Zero hosts → the field is hidden.
- An Export CSV download link that pulls the whole matrix as a spreadsheet you can hand to leadership or paste into a ticket.
- A red Revoke + Freeze button on the right.
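The Updated and FIM stale fields can be read as two simple counts over the host list. The sketch below is illustrative only: the `Host` record, its field names, and the `fim_threshold` parameter are assumptions, since this page does not document Mimir's data model or the actual threshold value.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Host:
    # Hypothetical host record; field names are assumptions.
    name: str
    version: str
    last_fim_heartbeat: datetime


def rollout_summary(hosts, current_version, now, fim_threshold):
    """Compute the 'Updated' N / M string and the 'FIM stale' count."""
    updated = sum(1 for h in hosts if h.version == current_version)
    stale = sum(1 for h in hosts if now - h.last_fim_heartbeat > fim_threshold)
    return {
        "updated": f"{updated} / {len(hosts)}",
        # The strip hides the field when no host is stale, modeled here as None.
        "fim_stale": stale if stale > 0 else None,
    }
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the calculation easy to check against a known snapshot.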
Version matrix table. One row per distinct agent version
currently in the fleet. The columns are the OS families Mimir
tracks — Linux, Windows, macOS, BSD, Unknown —
plus a Total column. Cells with zero hosts render as — so the
ones with non-zero counts stand out.
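As a rough model of how the matrix is shaped, the sketch below counts hosts per (version, OS family) pair and renders zero cells as a dash. The input format (a list of `(version, os_family)` pairs) and the rule that unrecognized families fall into Unknown are assumptions made for illustration, not documented Mimir behavior.

```python
from collections import Counter

# Column order as listed on this page.
OS_FAMILIES = ["Linux", "Windows", "macOS", "BSD", "Unknown"]


def build_matrix(hosts):
    """Return one row per version: [version, <one cell per OS family>, total]."""
    counts = Counter()
    for version, os_family in hosts:
        # Assumption: anything outside the tracked families is counted as Unknown.
        family = os_family if os_family in OS_FAMILIES else "Unknown"
        counts[(version, family)] += 1

    rows = []
    # Plain string sort; a real implementation might sort by semantic version.
    for version in sorted({v for v, _ in counts}):
        cells = [counts[(version, f)] for f in OS_FAMILIES]
        rendered = [str(c) if c else "—" for c in cells]  # zero cells render as a dash
        rows.append([version] + rendered + [str(sum(cells))])
    return rows
```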
Click any row to jump to the Hosts list filtered to that exact
version (/hosts?q=version:<version>). It’s the fastest way to
answer “which 12 boxes are still on the old build?”
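If you want to build the same filtered link from a script or a runbook, a minimal sketch follows. The `/hosts?q=version:<version>` shape comes from this page; the percent-encoding of unusual characters in the version string is an assumption about what the Hosts list accepts.

```python
from urllib.parse import quote


def hosts_filter_url(version: str) -> str:
    """Build the Hosts-list link for one exact agent version."""
    # Keep ':' readable; percent-encode anything else that is unsafe in a query.
    return "/hosts?q=" + quote(f"version:{version}", safe=":")
```

For an ordinary version string such as `1.4.2`, the result is simply `/hosts?q=version:1.4.2`.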
Why use it
Three patterns:
- Rollout monitoring. You promoted a new agent build to 25%. Open this page, watch the rollout percentage and the Updated count climb. The 60-second auto-refresh keeps the numbers live without manual reloads.
- Stragglers. Most of the fleet has the latest version but a handful haven’t picked it up. Click the old version’s row and the Hosts list shows you exactly which ones — usually offline, stale, or pinned to an unsupported OS.
- Emergency stop. A new build is crashing agents on a subset of hosts. Use Revoke + Freeze (see below) to halt the rollout immediately while you investigate.
How to use it
Watch a rollout
- Open Fleet Versions from the left nav.
- Read the Rollout strip: current version, current percentage, and the N / M count.
- Let the page auto-refresh (every 60 seconds). The matrix table shifts as agents pick up the new build: old-version rows shrink, the new-version row grows.
- If you want a snapshot for a ticket or for leadership, click Export CSV.
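Once you have an exported snapshot, a few lines of scripting can turn it into a single adoption figure for a ticket. The sketch below assumes the CSV has a `Version` column and a `Total` column; the real export's header names are not documented on this page, so adjust them to match your file.

```python
import csv
import io


def share_on_version(csv_text: str, version: str) -> float:
    """Fraction of the fleet on `version`, based on the Total column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Column names "Version" and "Total" are assumptions about the export.
    fleet = sum(int(r["Total"]) for r in rows)
    on_version = sum(int(r["Total"]) for r in rows if r["Version"] == version)
    return on_version / fleet if fleet else 0.0
```

Only the Total column is read, so the result does not depend on how zero-count OS cells are written in the export.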
Triage stragglers
- In the matrix, click the row for the version you’re trying to move agents off of.
- The Hosts list opens filtered to that version. Sort by Last seen to separate “offline for weeks” from “online but stuck.”
- For online stragglers, check the host detail page for any update-related errors in the timeline.
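The "offline for weeks" versus "online but stuck" split above amounts to partitioning hosts by how long ago they were last seen. A minimal sketch, assuming you have `(name, last_seen)` pairs from the filtered Hosts list and pick your own `offline_after` cutoff (neither is a Mimir-defined format or value):

```python
from datetime import datetime, timedelta


def split_stragglers(hosts, now, offline_after):
    """Partition (name, last_seen) pairs into online-but-stuck and offline."""
    # Most recently seen first, mirroring a sort on the Last seen column.
    ordered = sorted(hosts, key=lambda h: h[1], reverse=True)
    online = [name for name, seen in ordered if now - seen <= offline_after]
    offline = [name for name, seen in ordered if now - seen > offline_after]
    return online, offline
```

Hosts in the first list are the ones worth checking for update-related errors; hosts in the second will not update until they come back.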
Revoke + Freeze (emergency stop)
The Revoke + Freeze button is the emergency-stop control. It does two things at once:
- Freezes the rollout to 0% so no further hosts pick up the current version.
- Adds the current version’s SHA to the revocation list — agents that haven’t installed yet will refuse to install it on their next update check.
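The two effects can be pictured as a pair of state changes that are always applied together. This is a toy model, not Mimir's implementation: the `RolloutState` class, its fields, and the `agent_may_install` check are hypothetical names introduced only to illustrate the behavior described above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RolloutState:
    # Hypothetical server-side state; names are assumptions.
    current_sha: Optional[str] = None
    rollout_percent: int = 0
    revoked_shas: set = field(default_factory=set)


def revoke_and_freeze(state: RolloutState) -> None:
    """Apply both effects of the emergency stop in one step."""
    if state.current_sha is None:
        raise ValueError("no promoted version to revoke")
    state.rollout_percent = 0                    # 1. freeze the rollout
    state.revoked_shas.add(state.current_sha)    # 2. revoke the current build


def agent_may_install(state: RolloutState, sha: str) -> bool:
    """Simplified update check: frozen rollouts and revoked builds are refused."""
    # Ignores per-host rollout eligibility, which this sketch does not model.
    return state.rollout_percent > 0 and sha not in state.revoked_shas
```

In this model the revoked SHA stays refused even if the rollout percentage is later raised again, which is why recovery means promoting a release rather than just lifting the freeze.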
The confirmation modal is explicit:
This will immediately freeze rollout to 0% and add <version>’s SHA to the revocation list. All agents will stop receiving updates until you re-promote a release. Are you sure?
Click Revoke + Freeze to proceed. A toast confirms the version was revoked and the rollout frozen; the page refreshes and the rollout percentage drops to 0%.
Use this when:
- A new agent build is causing crashes or data loss across the fleet.
- A signed release has been disclosed as compromised.
- You need to halt all updates while you investigate without touching agent-side config.
This is not the right tool for routine rollback (“the new build is fine, I just want to revert to the old one”). For that, promote the prior release through your normal channel; Revoke + Freeze is specifically the disable-everything-until-an-operator-acts control.
After a revoke, no agent installs any new version until an admin re-promotes a release through the operator-side rollout process. The “Are you sure?” prompt isn’t being dramatic — the fleet really does stop receiving updates.
Permissions
Listing is gated by withAnyAuth — any signed-in user can read
the matrix and export the CSV. The Revoke + Freeze action
requires the appropriate operator role and is intended for
incident-response use. The button itself doesn’t render an
admin-only hint in the UI because the back-end gate is the
load-bearing check; if a non-admin clicks it, the call returns an
error and a toast surfaces the failure.
Troubleshooting
The Revoke + Freeze button is greyed out. No promoted
version exists yet — the rollout strip’s “Current version” field
will show —. There’s nothing to revoke. This is the expected
state on a fresh deployment that hasn’t promoted its first
release.
The Updated count says 50 / 1000 and isn’t moving. Either
the rollout cap is low (a 5% cap on a 1,000-host fleet authorizes
only 50 hosts; raise the cap in operator-side rollout config), or
the fleet is checking in less often than you’d expect (the agent
update interval is on the order of an hour, so propagation
isn’t instant).
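To tell the two causes apart quickly, compare the Updated count with the ceiling the cap implies. The helper below is a back-of-the-envelope sketch; rounding the ceiling down is an assumption, since this page does not say how the update controller rounds.

```python
def max_hosts_at_cap(fleet_size: int, rollout_percent: int) -> int:
    """Upper bound on updated hosts under a percentage cap (rounded down)."""
    return fleet_size * rollout_percent // 100


def stalled_at_cap(updated: int, fleet_size: int, rollout_percent: int) -> bool:
    """True when the Updated count has already reached the cap's ceiling."""
    return updated >= max_hosts_at_cap(fleet_size, rollout_percent)
```

With a 1,000-host fleet and a 5% cap the ceiling is 50, so a reading of 50 / 1000 means the cap is the limit. The same reading under a 25% cap points to slow check-ins instead.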
A version row I don’t recognize is in the matrix. Something checked in with a build string Mimir hadn’t seen before. Click the row, look at which hosts have it — usually a single test box or a stuck rollback from a previous run. Decommission or update the host if it’s not supposed to be on that version.
The FIM stale counter is red even though the agents look healthy. The file-integrity-monitor heartbeat is a separate signal from the regular agent heartbeat — an agent can be online and reporting pack data while its FIM subsystem has stopped sending its own pulse. Open the host detail page for one of the affected hosts and look at the timeline for FIM-specific events.
Where to next
- Hosts — the rows on this page are versions; the hosts behind them live on the Hosts list and their per-host detail page.
- Fleet intelligence — if a rollout produces an unexpected wave of change events, the intelligence page is where they cluster.