Home

Kubernetes knows what's running. Kollect makes it a record. Declare what matters in a few CRs
and get a durable, always-current inventory wherever your platform needs it — a Git history you
can diff, a database your portal can query, an event stream your automation can react to. Start
with one Git repo; grow to multi-tenant fan-out across teams without rebuilding anything.
Record the hero demo locally: DEMO-GIF-GUIDE.md.
Git-simple to start · platform-grade to grow — kollect.dev/v1alpha1 · event-driven · CRD-native · fleet-ready
What Kollect does
Kubernetes is the source of truth for what is running; it is a poor system of record for stakeholder inventory. Kollect maintains a read model — live state captured once, then served from export data:
Scope and Target select resources by GVK and namespace; Profile extracts the attributes that matter (CEL or JSONPath); Inventory rolls up matching objects, debounces churn, and exports snapshots to pluggable sinks (Git, object stores, databases, event streams). Every backend sees the same aggregated rows; sinks are interchangeable projections.
Inventory is configuration, not code — owned per team in its own namespace.
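The select, extract, and roll-up steps described above can be sketched in a few lines of Python. This is an illustration only, not Kollect's implementation: the function names are hypothetical, the sample objects are made up, and a plain dotted path stands in for the CEL or JSONPath expressions a real Profile would use.

```python
from typing import Any

def get_path(obj: dict[str, Any], dotted: str) -> Any:
    """Follow a dotted path such as 'spec.replicas'; return None if a key is missing."""
    current: Any = obj
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current

def select(objects: list[dict[str, Any]], api_version: str, kind: str,
           namespace: str) -> list[dict[str, Any]]:
    """Keep only objects of the given group/version, kind, and namespace."""
    return [
        o for o in objects
        if o.get("apiVersion") == api_version
        and o.get("kind") == kind
        and get_path(o, "metadata.namespace") == namespace
    ]

def extract_row(obj: dict[str, Any], attributes: dict[str, str]) -> dict[str, Any]:
    """Build one inventory row: identity columns plus the declared attributes."""
    row = {
        "namespace": get_path(obj, "metadata.namespace"),
        "name": get_path(obj, "metadata.name"),
    }
    for column, path in attributes.items():
        row[column] = get_path(obj, path)
    return row

def build_inventory(objects: list[dict[str, Any]], api_version: str, kind: str,
                    namespace: str, attributes: dict[str, str]) -> list[dict[str, Any]]:
    """Roll matching objects up into rows in a stable order."""
    rows = [extract_row(o, attributes)
            for o in select(objects, api_version, kind, namespace)]
    return sorted(rows, key=lambda r: (r["namespace"], r["name"]))

# Made-up cluster state for the example.
EXAMPLE_OBJECTS = [
    {"apiVersion": "apps/v1", "kind": "Deployment",
     "metadata": {"namespace": "team-a", "name": "web", "labels": {"owner": "payments"}},
     "spec": {"replicas": 3}},
    {"apiVersion": "apps/v1", "kind": "Deployment",
     "metadata": {"namespace": "team-b", "name": "api"},
     "spec": {"replicas": 2}},
    {"apiVersion": "v1", "kind": "Service",
     "metadata": {"namespace": "team-a", "name": "web"}, "spec": {}},
]

rows = build_inventory(
    EXAMPLE_OBJECTS, "apps/v1", "Deployment", "team-a",
    {"replicas": "spec.replicas", "owner": "metadata.labels.owner"},
)
```

Here `rows` holds a single row for the `team-a/web` Deployment. Sorting by namespace and name keeps the output stable, which matters when the same rows are later committed to Git and diffed.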
Pre-beta
APIs and defaults may change until the first release candidate. See the roadmap for current status.
Why Kollect?
CRD-native
Declare profiles, sinks, targets, and inventory in Kubernetes; GitOps-friendly from day one.
Multi-tenant
KollectScope gates which teams and namespaces can export to which sinks.
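As a rough mental model of that gating, the sketch below checks an export request against a set of allowed namespaces, kinds, and sinks. The `ScopeRules` and `ExportRequest` types and their field names are hypothetical, chosen for illustration; they are not the KollectScope schema, which is documented in the CR reference.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ScopeRules:
    """Hypothetical guardrails for one team namespace."""
    namespaces: frozenset
    kinds: frozenset
    sinks: frozenset

@dataclass(frozen=True)
class ExportRequest:
    """One attempted export: which namespace, which kind, to which sink."""
    namespace: str
    kind: str
    sink: str

def denials(rules: ScopeRules, request: ExportRequest) -> list[str]:
    """Return every reason the request is refused; an empty list means allowed."""
    reasons = []
    if request.namespace not in rules.namespaces:
        reasons.append(f"namespace {request.namespace!r} is not allowed")
    if request.kind not in rules.kinds:
        reasons.append(f"kind {request.kind!r} is not allowed")
    if request.sink not in rules.sinks:
        reasons.append(f"sink {request.sink!r} is not allowed")
    return reasons

def is_allowed(rules: ScopeRules, request: ExportRequest) -> bool:
    """A request passes only when all three checks pass."""
    return not denials(rules, request)

# Made-up rules: team-a may export Deployments to its own Git sink only.
team_a_rules = ScopeRules(
    namespaces=frozenset({"team-a"}),
    kinds=frozenset({"Deployment"}),
    sinks=frozenset({"team-a-git"}),
)
```

Collecting all denial reasons rather than stopping at the first makes a refused export easier to debug, for example when a request is outside both the allowed namespaces and the allowed sinks.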
How it works

The in-memory snapshot per inventory is canonical; every sink is a projection of it — no single backend is privileged. Sink roles (snapshot store, relational store, event emitter) are documented in ADR-0401; reconciliation detail in Architecture and Data flows.
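To make "every sink is a projection" concrete, here is a minimal sketch that derives three outputs from one list of rows: a sorted JSON document (snapshot store), ordered tuples (relational store), and change events (event emitter). All function names are hypothetical, and the diff-based event shape is an assumption made for illustration; the actual sink contracts are described in ADR-0401.

```python
import json
from typing import Any

Row = dict[str, Any]

def row_key(row: Row) -> str:
    """Identity of a row within one inventory."""
    return f'{row["namespace"]}/{row["name"]}'

def to_snapshot_document(rows: list[Row]) -> str:
    """Snapshot-store projection: deterministic JSON that diffs cleanly in Git."""
    return json.dumps(sorted(rows, key=row_key), sort_keys=True, indent=2)

def to_table_rows(rows: list[Row], columns: list[str]) -> list[tuple]:
    """Relational projection: one tuple per row, in a fixed column order."""
    return [tuple(row.get(c) for c in columns) for row in sorted(rows, key=row_key)]

def to_events(previous: list[Row], current: list[Row]) -> list[dict[str, str]]:
    """Event projection: what changed between two snapshots."""
    old = {row_key(r): r for r in previous}
    new = {row_key(r): r for r in current}
    events = []
    for key in sorted(old.keys() | new.keys()):
        if key not in old:
            events.append({"type": "added", "key": key})
        elif key not in new:
            events.append({"type": "removed", "key": key})
        elif old[key] != new[key]:
            events.append({"type": "changed", "key": key})
    return events

# Two made-up snapshots of the same inventory.
previous_rows = [{"namespace": "team-a", "name": "web", "replicas": 3}]
current_rows = [
    {"namespace": "team-a", "name": "web", "replicas": 5},
    {"namespace": "team-a", "name": "api", "replicas": 2},
]

document = to_snapshot_document(current_rows)
table = to_table_rows(current_rows, ["namespace", "name", "replicas"])
events = to_events(previous_rows, current_rows)
```

All three functions read the same rows and none of them feeds another, which is the sense in which no backend is privileged.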
Supported & planned sinks
| Family CRD | spec.type | Status |
|---|---|---|
| KollectSnapshotSink | git, gitlab, s3 | Core (production-ready) |
| KollectSnapshotSink | gcs | Beta (shipped, maturing) |
| KollectDatabaseSink | postgres | Core |
| KollectDatabaseSink | mongodb, bigquery | Beta (bigquery v0.7.x hardening) |
| KollectEventSink | kafka, nats | Beta (nats v0.7.x hardening) |
| KollectSnapshotSink | azureblob | Planned |
| Object-store sinks | Parquet layout | Planned (on S3/GCS) |
Release timing and deferred backends: Roadmap — Supported & planned sinks.
The resource model
A pipeline is just a handful of Kubernetes resources: config you declare (KollectProfile,
KollectScope, and the family sinks KollectSnapshotSink, KollectDatabaseSink, and
KollectEventSink) and objects the operator reconciles (KollectTarget, KollectInventory).
Cluster-scoped KollectCluster* variants add cross-namespace rollup.
flowchart LR
K8s(["Kubernetes API"]):::api
subgraph declare["You declare — static config"]
direction TB
Profile["<b>KollectProfile</b><br/>what to extract"]
Scope["<b>KollectScope</b><br/>guardrails"]
Snap["<b>KollectSnapshotSink</b><br/>snapshot store"]
Db["<b>KollectDatabaseSink</b><br/>relational SoR"]
Ev["<b>KollectEventSink</b><br/>event emitter"]
end
subgraph run["Operator reconciles"]
direction TB
Target["<b>KollectTarget</b><br/>what to watch"]
Inv["<b>KollectInventory</b><br/>aggregate · debounce · export"]
end
subgraph out["Sink projections — choose any"]
direction TB
SnapOut["Git · GitLab · S3 · GCS<br/><i>snapshot store</i>"]
Rel["Postgres · MongoDB<br/><i>relational SoR</i>"]
EvtOut["Kafka<br/><i>event emitter</i>"]
end
K8s -- "informer per GVK" --> Target
Profile --> Target
Target --> Inv
Scope -. gates .-> Target
Scope -. gates .-> Inv
Inv --> Snap
Inv --> Db
Inv --> Ev
Snap --> SnapOut
Db --> Rel
Ev --> EvtOut
classDef api fill:#1F2937,stroke:#6B7280,color:#fff;
classDef config fill:#326CE5,stroke:#1b3a8c,color:#fff;
classDef work fill:#18B6A3,stroke:#0e6f63,color:#fff;
classDef proj fill:#7FB3FF,stroke:#326CE5,color:#081A4B;
class Profile,Scope,Snap,Db,Ev config;
class Target,Inv work;
class SnapOut,Rel,EvtOut proj;
| Kind | You set | Role |
|---|---|---|
| KollectProfile | GVK + CEL / JSONPath attributes | What to extract from each object |
| KollectTarget | selectors + profileRef | What to watch and collect |
| KollectInventory | family sink refs + cadence | Aggregate, debounce, and export |
| KollectSnapshotSink | type + endpoint + secretRef | Snapshot store (Git, GitLab, S3, GCS) |
| KollectDatabaseSink | type + credentials | Relational SoR (Postgres, MongoDB) |
| KollectEventSink | type + brokers | Event emitter (Kafka) |
| KollectScope | allowed GVKs / namespaces / sinks | Guardrails for the team namespace |
Full fields: CR reference · model rationale: ADR-0201.
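The "debounce" part of KollectInventory's role can be pictured with a small trailing-edge debouncer: a burst of object changes leads to one export once things have been quiet for a while. This is a sketch under assumptions, not the operator's code; the `Debouncer` class, its `quiet_seconds` parameter, and the trailing-edge policy itself are hypothetical, and time is passed in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Debouncer:
    """Collapse bursts of changes into one export after a quiet period."""
    quiet_seconds: float
    _last_change: Optional[float] = field(default=None, init=False)

    def notify(self, now: float) -> None:
        """Record that a watched object changed at time `now`."""
        self._last_change = now

    def should_export(self, now: float) -> bool:
        """True once, when changes are pending and the quiet period has elapsed."""
        if self._last_change is None:
            return False  # nothing pending
        if now - self._last_change < self.quiet_seconds:
            return False  # still churning
        self._last_change = None  # consume the pending change
        return True

# A rollout touches the same Deployment three times within four seconds.
debouncer = Debouncer(quiet_seconds=5.0)
for t in (0.0, 2.0, 4.0):
    debouncer.notify(t)

during_burst = debouncer.should_export(6.0)   # only 2s since the last change
after_quiet = debouncer.should_export(9.0)    # 5s of quiet: export once
repeat_check = debouncer.should_export(10.0)  # nothing new since the export
```

With these inputs the three changes produce a single export rather than three, which is the effect debouncing is meant to have on sinks such as a Git history.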
Performance
Kollect is built for large single clusters and multi-cluster fleets, with honest, tested targets (ADR-0603) — 10,000+ rows validated in nightly load tests, 100,000-row design target per cluster, and fleet fan-in with no hub merge tier. Tuning knobs are catalogued in the performance guide.
Documentation map
| Section | Start here |
|---|---|
| Getting started | Quick start · Development setup · Examples |
| Core concepts | CRD model · CR reference · Multi-cluster fleet |
| Operator manual | Install & ops · Upgrading · Helm values |
| Performance & ops | Performance tuning · Scaling & fleet · Best practices · Troubleshooting |
| Background | Prerequisites & basics · Architecture (package graph) · Data flows |
| Reference | Custom resources · FAQ · ADRs · RFCs |
| Contributing | Roadmap · Planned features · ADR/RFC process · Release process |
Try an example
- Deployment inventory → Git / Postgres / Kafka — the end-to-end walkthrough
- Postgres state store (relational SoR)
- NATS event sink
- Helm release inventory (Argo primary; Flux secondary)
- Live demo inventory exported to Git — see real output
Go deeper
- Platform decisions — the locked design summary
- Sink taxonomy: state vs stream — why no backend is privileged
- Read-only UI console (frozen preview) — early adopter SPA; program frozen until v0.7.x+
- Roadmap — build-order phases and current status