Best practices¶

Platform-oriented guidance for designing Kollect scope, sinks, and multi-cluster topology. For install defaults, see Operator manual.

Assumptions

Read Understand the basics and Platform decisions before changing production values. Sink role taxonomy: ADR-0401.

Pre-beta API

v1alpha1 fields may change until the first release candidate. Check ROADMAP before fleet rollout.

Scope design¶

Per-team install (recommended default)¶

Install one operator per tenant boundary with namespaced RBAC and a restricted informer cache (ADR-0203, ADR-0703 (archived)):

tenantMode: true
watchNamespaces:
  - team-a
mode: single
featureGates:
  inventoryHttp:
    enabled: false

Place KollectProfile, KollectSink, KollectTarget, and KollectInventory in the team namespace. Portal read paths should use Postgres or Kafka sink export — not the optional HTTP inventory API in production.

Same-namespace references¶

Namespaced pipeline objects must reference peers in the same namespace:

Field	Must resolve in
`KollectTarget.spec.profileRef`	Target namespace
`KollectInventory.spec.sinkRefs`	Inventory namespace
`KollectConnectionTest.spec.sinkRef`	Test namespace

Cluster-wide rollup uses KollectClusterInventory with spec.sinkNamespace instead.

Watch labels and `KollectScope`¶

Exclude platform namespaces with Target excludedNamespaces or Scope deniedNamespaces — no Namespace patch RBAC (ADR-0207).
Trivy HIGH-only collection: resourceRules with label match OR CEL matchPolicy — see examples/kollecttarget_trivy-high.yaml.
Run platform targets with watchMode: All; let teams opt out noisy namespaces via kollect.dev/namespace-watch: disabled (ANNOTATIONS-LABELS.md).
Use watchMode: OptIn only in shared clusters where most tenants should be ignored by default.
Enforce policy with KollectScope — violations set Degraded=True and block export (hard degrade, not silent skip).

Scope vs Helm watchNamespaces

Helm watchNamespaces limits what the operator informer cache sees. KollectScope limits what a tenant may configure. Use both for defense in depth.

Sink roles (ADR-0401)¶

Classify sinks by role, not vendor. The in-memory snapshot per KollectInventory is the canonical artifact; every sink is a projection of it.

Role	Backends	Answers	Deletes
Snapshot store	Git, S3/GCS Parquet, HTTP	Current state, written whole each cycle	Free — absent from snapshot = deleted
Relational SoR	Postgres	Queryable current state, SQL joins for portals	Requires delete reconciliation
Event emitter	NATS JetStream (lean default), Kafka/Redpanda	Change stream for downstream integration	Tombstone (consumer-owned)

Choosing a sink¶

Need	Prefer
Audit trail, GitOps-friendly history	Git snapshot
Queryable inventory without running a DB	S3/GCS Parquet snapshot
Rich relational portal, BI joins	Postgres SoR
Event-driven integrations, fan-out	NATS or Kafka emitter

Not a sink type

Prometheus metrics come from the operator /metrics endpoint only — not a KollectSink type (ADR-0601).

Postgres and event emitters¶

Postgres must delete rows (or tombstone) for resources no longer in the snapshot — upsert-only drifts stale (ADR-0401).
Set spec.cluster on sinks in multi-cluster installs so the backend primary key merges rows across clusters.
Tune per-sink exportMinInterval on structured sinkRefs[] before adding more backends — portal Postgres at 30s + Git audit at 1h is the default sample (ADR-0413). See Performance tuning.

Connection testing¶

Keep spec.connectionTest: false in Git-managed manifests. Probe on demand with the kollect.dev/test-connection annotation or a KollectConnectionTest CR (Connection test example).

Multi-cluster fleet (shared sink)¶

Default multi-cluster path: each cluster runs the same single-mode operator and exports to a shared sink (Postgres, Git, Kafka, NATS) with spec.cluster set (or {cluster} in pathTemplate). The backend primary key merges rows across clusters — no aggregation tier inside the operator (ADR-0501, ADR-0401).

flowchart LR
  subgraph clusters [Per-cluster operators]
    S1[Kollect cluster A]
    S2[Kollect cluster B]
  end
  subgraph backend [Shared backend]
    PG[(Snapshot · Database · Event sinks)]
  end
  S1 -->|export + cluster id| PG
  S2 -->|export + cluster id| PG

Walkthrough: Multi-cluster fleet.

Operational checklist¶

Area	Practice
Upgrades	Two-step: `kubectl apply -f dist/install-crds.yaml`, then `helm upgrade` — never delete CRDs
Secrets	Credentials in Kubernetes Secrets only; restrict operator SA Secret access
HTTP API	Keep `featureGates.inventoryHttp.enabled: false` unless debugging
Observability	Watch `ConnectionVerified`, `SinkReachable`, `Synced`; alert on `Degraded=True`
Performance	Restrict `watchNamespaces`, narrow target selectors, raise `exportMinInterval` under load