Understand the basics

This page collects the prerequisites to review before you install Kollect or apply the sample custom resources. You do not need to be a Kubernetes expert, but comfort with the concepts below makes QUICKSTART.md and the examples much easier to follow.

Who this page is for

Platform engineers evaluating Kollect on kind, SREs wiring sinks for a tenant namespace, and contributors reading ARCHITECTURE.md. If you already run CRD-based operators daily, skip to Quick start.

Kubernetes fundamentals

Kollect is a namespaced operator — you install one controller deployment (typically via Helm) and declare inventory pipelines as Custom Resources in tenant namespaces.

| Concept | Why it matters for Kollect |
| --- | --- |
| Pods, Deployments, Namespaces | Sample targets watch Deployment objects; multitenant installs scope by namespace |
| RBAC (Role, ClusterRole, ServiceAccount) | The manager needs list/watch on the GVKs your profiles select; SubjectAccessReview (SAR) checks degrade gracefully |
| Secrets | Git, Postgres, Kafka, and NATS sinks read credentials from spec.secretRef |
| Labels and annotations | kollect.dev/watch opt-in/opt-out (ADR-0205) |
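
As a minimal sketch of the RBAC row above, the manifest below grants list/watch on Deployments to a manager ServiceAccount. The names kollect-read-deployments, kollect-manager, and kollect-system are hypothetical, and the Helm chart most likely ships its own RBAC, so read this as an illustration of the permission shape rather than the chart's actual manifest.

```yaml
# Hypothetical names throughout: the real chart may use a different
# ServiceAccount, namespace, and role layout.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kollect-read-deployments
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["list", "watch"]   # what an informer needs for this GVK
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kollect-read-deployments
subjects:
  - kind: ServiceAccount
    name: kollect-manager
    namespace: kollect-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kollect-read-deployments
```

For an install scoped to a single tenant namespace, a Role and RoleBinding in that namespace would express the same rule without cluster-wide reach.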

Useful primers:

Custom resources (CRDs)

Kollect extends the API with nine kollect.dev/v1alpha1 kinds. You declare what to collect (KollectProfile), where to export (KollectSink), which resources (KollectTarget), and how to roll up (KollectInventory). Cluster-scoped kinds (KollectCluster*) add cross-namespace rollup on platform clusters.

| Term | Meaning |
| --- | --- |
| CRD | CustomResourceDefinition: the schema registered with the API server |
| CR | A concrete custom resource instance (your YAML) |
| Reconciler | Controller loop that drives CR .status toward .spec |
| Webhook | Admission validation before create/update (CEL paths, sink enum, scope rules) |
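
To make the Reconciler row concrete, here is a sketch of how a custom resource separates desired state (.spec, which you write) from observed state (.status, which the controller writes). Only the apiVersion, the kind, and the attribute fields come from this page. The metadata names, the placement of attributes directly under spec, and the condition type and reason are assumptions for illustration; the conditions list follows the standard Kubernetes condition shape.

```yaml
apiVersion: kollect.dev/v1alpha1
kind: KollectProfile
metadata:
  name: deployment-basics        # hypothetical name
  namespace: team-a              # hypothetical tenant namespace
spec:                            # desired state: you write this
  attributes:                    # assumed to sit directly under spec
    - name: image
      path: "{.spec.template.spec.containers[0].image}"
status:                          # observed state: the reconciler writes this
  conditions:
    - type: Ready                # assumed condition type
      status: "True"
      reason: Reconciled         # assumed reason
      lastTransitionTime: "2025-01-01T00:00:00Z"
```

You normally apply only the metadata and spec; the status block is what you would expect to see when reading the object back after reconciliation.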

Start here after this page:

[Figure: vertical K-shaped funnel diagram showing Kubernetes resources filtered by Scope and Target, attributes extracted by Profile, aggregated into Inventory rows, and exported to sinks.]

Pre-beta API

Fields and status conditions may change until beta. Check ROADMAP.md before production rollout.

CEL and JSONPath extraction

KollectProfile defines named attributes — field paths evaluated against each watched object. Cluster targets reference a namespaced KollectProfile by name + namespace (ADR-0208); there is no KollectClusterProfile kind. Kollect supports:

  • JSONPath: kubectl-style ({.metadata.name}) or $-prefixed paths; use [*] to select all array elements (ADR-0302)
  • CEL: cel:-prefixed expressions for computed values and filters

Example attribute rows:

attributes:
  - name: image
    path: "{.spec.template.spec.containers[0].image}"
  - name: ready
    path: 'cel:has(object.status.conditions) && object.status.conditions.exists(c, c.type == "Available" && c.status == "True")'

Walkthrough with expected output: examples/deployment-inventory.md.
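
The three path styles described above can be sketched side by side. The attribute names and the team label are hypothetical, and how a multi-valued [*] result is rendered in the inventory is not covered on this page, so check the walkthrough before relying on a particular output shape.

```yaml
attributes:
  # kubectl-style JSONPath; [*] selects every container, not only the first
  - name: images
    path: "{.spec.template.spec.containers[*].image}"
  # $-prefixed JSONPath
  - name: replicas
    path: "$.spec.replicas"
  # CEL: computed value with a fallback when the (hypothetical) label is absent
  - name: team
    path: 'cel:has(object.metadata.labels) && "team" in object.metadata.labels ? object.metadata.labels["team"] : "unassigned"'
```

The CEL row guards each lookup before indexing into the labels map, mirroring the has(...) guard in the ready attribute above, so objects without labels do not cause an evaluation error.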

External references:

Event-driven informers

Kollect does not poll the API server on a schedule. For each profile GVK it registers a dynamic informer, a shared watch that emits add/update/delete events. Target controllers extract attributes whenever an object changes; inventory controllers debounce those changes and export to sinks (ADR-0301, DATA-FLOWS.md).

Watch scope

On large clusters, choose watchNamespaces, KollectScope, and profile GVKs deliberately, because each profile GVK registers its own informer. See PERFORMANCE.md and examples/multi-tenant-watch-namespaces.md.

Export sinks and GitOps context

The in-memory inventory snapshot is canonical; sinks are projections classified by role (ADR-0401):

[Figure: central in-memory inventory snapshot with equal projection arrows to Git, object store, Postgres, NATS, and Kafka sinks, grouped by snapshot, relational, and event roles.]

| Role | Shipped spec.type values | Typical use |
| --- | --- | --- |
| Snapshot store | git, gitlab, s3, gcs | Auditable JSON history; S3/GCS format: parquet for analytics (ADR-0401) |
| Relational SoR | postgres | Queryable tables; delete reconciliation removes stale rows |
| Event emitter | nats, kafka | Change streams for automation and downstream consumers |

Git/GitLab exports produce commits that portals and compliance workflows can diff. Postgres holds queryable state and NATS/Kafka emit events, so a common pairing is Postgres plus NATS in sinkRefs. See examples/postgres-state-store.md and examples/nats-event-sink.md.
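
A Postgres plus NATS pairing might look roughly like the sketch below. The kinds, spec.type values, spec.secretRef, and sinkRefs are named on this page; the object names, the namespace, and the exact shape of secretRef and sinkRefs are assumptions, and connection settings are left out entirely. The linked examples are the place to confirm the real fields.

```yaml
# Sketch only: every field marked "assumed" is not confirmed by this page.
apiVersion: kollect.dev/v1alpha1
kind: KollectSink
metadata:
  name: inventory-db               # hypothetical
  namespace: team-a                # hypothetical
spec:
  type: postgres                   # relational role
  secretRef:
    name: inventory-db-credentials # assumed: names a Secret in this namespace
---
apiVersion: kollect.dev/v1alpha1
kind: KollectSink
metadata:
  name: inventory-events           # hypothetical
  namespace: team-a
spec:
  type: nats                       # event emitter role
  secretRef:
    name: nats-credentials         # assumed shape, as above
---
apiVersion: kollect.dev/v1alpha1
kind: KollectInventory
metadata:
  name: deployments                # hypothetical
  namespace: team-a
spec:
  sinkRefs:                        # assumed: a list of sink names
    - name: inventory-db
    - name: inventory-events
```

Because both sinks are projections of the same in-memory snapshot, neither depends on the other: the Postgres sink answers "what exists now", while the NATS sink tells consumers "what just changed".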

GitOps-friendly, not a GitOps engine

Kollect exports inventory to Git or other backends; it does not replace Argo CD or Flux. For Helm release inventory, see examples/helm-release-inventory.md.

Multi-cluster (optional)

Single-cluster installs are fully supported. Multi-cluster fleet mode runs one operator per cluster and partitions shared sinks via spec.cluster and optional spec.pathTemplate (ADR-0501, ADR-0407, examples/multi-cluster-fleet.md). KollectClusterTarget and KollectClusterInventory controllers reconcile cluster-scoped rollup and export to namespaced sinks.
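
As a rough illustration of partitioning a shared sink, each cluster's operator could be given a sink that differs only in spec.cluster. The cluster identifiers, object names, and namespace below are hypothetical, repository settings are omitted, and spec.pathTemplate is left out because its template syntax is not described on this page.

```yaml
# Applied on the first cluster (hypothetical identifier prod-eu-1)
apiVersion: kollect.dev/v1alpha1
kind: KollectSink
metadata:
  name: fleet-git
  namespace: platform
spec:
  type: git
  cluster: prod-eu-1    # partitions this cluster's exports in the shared sink
---
# Applied on the second cluster (hypothetical identifier prod-us-1)
apiVersion: kollect.dev/v1alpha1
kind: KollectSink
metadata:
  name: fleet-git
  namespace: platform
spec:
  type: git
  cluster: prod-us-1
```

Keeping the two manifests otherwise identical makes it easier to template them per cluster; see ADR-0501 and examples/multi-cluster-fleet.md for the supported layout.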

Next steps

| Goal | Page |
| --- | --- |
| Install on kind and apply samples | QUICKSTART.md |
| Build, test, and debug locally | DEVELOPMENT.md |
| Architecture and CRD relationships | ARCHITECTURE.md |
| Locked design decisions | PLATFORM-DECISIONS.md |
| Scenario walkthroughs | examples/README.md |