This guide walks you through installing CloudPrem on any Kubernetes cluster, using PostgreSQL for metadata storage and MinIO for S3-compatible object storage.
This setup is ideal for environments where you manage your own infrastructure or don’t use a major cloud provider’s managed services.
Prerequisites
Before you begin, confirm you have:
kubectl installed and configured to access your Kubernetes cluster
kubectl version --client
Helm 3.x installed
helm version
A Kubernetes cluster (v1.25 or higher) up and running
kubectl get nodes
A Datadog account with the CloudPrem feature enabled
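The checks above can be combined into one preflight script. The following is a minimal sketch, not part of CloudPrem: the function names `require_cmd`, `version_at_least`, and `preflight` are hypothetical, and the script assumes that the kubelet version reported by each node is a reasonable stand-in for the cluster version. It does not check the Helm major version.

```shell
# require_cmd NAME: succeeds when NAME is found on PATH.
require_cmd() {
  command -v "$1" >/dev/null 2>&1 || { echo "missing required command: $1" >&2; return 1; }
}

# version_at_least HAVE WANT: succeeds when HAVE >= WANT.
# Arguments look like "v1.29.3" or "1.25"; only major and minor are compared.
version_at_least() {
  have=${1#v}; want=${2#v}
  have_major=${have%%.*}; rest=${have#*.}; have_minor=${rest%%.*}
  want_major=${want%%.*}; rest=${want#*.}; want_minor=${rest%%.*}
  [ "$have_major" -gt "$want_major" ] ||
    { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# preflight: checks the tools and the node versions; requires cluster access.
preflight() {
  require_cmd kubectl || return 1
  require_cmd helm || return 1
  # One kubelet version per node, for example "v1.29.3".
  versions=$(kubectl get nodes -o jsonpath='{.items[*].status.nodeInfo.kubeletVersion}') || return 1
  [ -n "$versions" ] || { echo "no nodes found" >&2; return 1; }
  for v in $versions; do
    version_at_least "$v" 1.25 || { echo "node kubelet $v is older than v1.25" >&2; return 1; }
  done
  echo "preflight checks passed"
}
```

After sourcing the script, run `preflight` from a shell that has access to the cluster.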
To verify that the cluster can reach your MinIO bucket, run a temporary MinIO client pod:
kubectl run minio-client \
  --rm -it \
  --image=minio/mc:latest \
  --command -- bash -c "mc alias set myminio <MINIO_ENDPOINT> <ACCESS_KEY> <SECRET_KEY> && mc ls myminio/<BUCKET_NAME>"
If successful, the command lists the contents of your MinIO bucket.
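The example values file in this guide refers to three existing Kubernetes Secrets by name: `datadog-secret` (key `api-key`), `cloudprem-metastore-uri` (key `QW_METASTORE_URI`), and `cloudprem-minio-credentials`. The following is a minimal sketch of creating them with `kubectl create secret generic`. The function names, the namespace argument, and the environment variable names are hypothetical. The key names used for the MinIO credentials (`AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`) are an assumption based on the S3-compatible storage backend, so confirm them against the chart before relying on them.

```shell
# metastore_uri USER PASSWORD HOST PORT DATABASE
# Prints a connection string in the format expected under QW_METASTORE_URI.
# The password is inserted as-is; URL-encode it first if it contains
# characters such as '@' or '/'.
metastore_uri() {
  printf 'postgresql://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}

# create_cloudprem_secrets NAMESPACE
# Assumes these variables are set (all names are hypothetical):
#   DD_API_KEY, PG_USER, PG_PASSWORD, PG_HOST, PG_PORT, PG_DATABASE,
#   MINIO_ACCESS_KEY, MINIO_SECRET_KEY
create_cloudprem_secrets() {
  ns=$1
  kubectl create secret generic datadog-secret -n "$ns" \
    --from-literal=api-key="$DD_API_KEY" || return 1
  kubectl create secret generic cloudprem-metastore-uri -n "$ns" \
    --from-literal=QW_METASTORE_URI="$(metastore_uri "$PG_USER" "$PG_PASSWORD" "$PG_HOST" "$PG_PORT" "$PG_DATABASE")" || return 1
  # Assumption: AWS-style key names for the MinIO credentials.
  kubectl create secret generic cloudprem-minio-credentials -n "$ns" \
    --from-literal=AWS_ACCESS_KEY_ID="$MINIO_ACCESS_KEY" \
    --from-literal=AWS_SECRET_ACCESS_KEY="$MINIO_SECRET_KEY" || return 1
}
```

Keeping the connection string in a small helper makes it possible to inspect the value before it is stored, for example with `metastore_uri "$PG_USER" "$PG_PASSWORD" "$PG_HOST" "$PG_PORT" "$PG_DATABASE"`.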
Create a datadog-values.yaml file to override the default values with your custom configuration. This is where you define environment-specific settings such as the service account, ingress setup, resource requests and limits, and more.
Any parameters not explicitly overridden in datadog-values.yaml fall back to the defaults defined in the chart’s values.yaml.
# Show default values
helm show values datadog/cloudprem
The following is an example datadog-values.yaml file with overrides for a vanilla Kubernetes setup with MinIO:
datadog-values.yaml
# Datadog configuration
datadog:
  # The Datadog site (https://docs.datadoghq.com/getting_started/site/) to connect to. Defaults to `datadoghq.com`.
  # site: datadoghq.com
  # The name of the existing Secret containing the Datadog API key. The secret key name must be `api-key`.
  apiKeyExistingSecret: datadog-secret

# Environment variables
# The MinIO credentials are mounted from the Kubernetes secret.
# Any environment variables defined here are available to all pods in the deployment.
environment:
  AWS_REGION: us-east-1

# Service account configuration
serviceAccount:
  create: true
  name: cloudprem

# CloudPrem node configuration
config:
  # The root URI where index data is stored. This should be an S3-compatible path pointing to your MinIO bucket.
  # All indexes created in CloudPrem are stored under this location.
  default_index_root_uri: s3://<BUCKET_NAME>/indexes
  storage:
    s3:
      endpoint: <MINIO_ENDPOINT>
      # force_path_style_access must be true for MinIO.
      force_path_style_access: true

# Metastore configuration
# The metastore is responsible for storing and managing index metadata.
# It requires a PostgreSQL database connection string to be provided by a Kubernetes secret.
# The secret should contain a key named `QW_METASTORE_URI` with a value in the format:
# postgresql://<username>:<password>@<host>:<port>/<database>
#
# The metastore connection string is mounted into the pods using extraEnvFrom to reference the secret.
metastore:
  extraEnvFrom:
    - secretRef:
        name: cloudprem-metastore-uri
    - secretRef:
        name: cloudprem-minio-credentials

# Indexer configuration
# The indexer is responsible for processing and indexing incoming data. It receives data from various sources
# (for example, Datadog Agents, log collectors) and transforms it into searchable files called "splits"
# stored in MinIO.
#
# The indexer is horizontally scalable - you can increase `replicaCount` to handle higher indexing throughput.
# Resource requests and limits should be tuned based on your indexing workload.
#
# The default values are suitable for moderate indexing loads of up to 20 MB/s per indexer pod.
indexer:
  replicaCount: 2
  extraEnvFrom:
    - secretRef:
        name: cloudprem-minio-credentials
  resources:
    requests:
      cpu: "4"
      memory: "8Gi"
    limits:
      cpu: "4"
      memory: "8Gi"

# Searcher configuration
# The searcher is responsible for executing search queries against the indexed data stored in MinIO.
# It handles search requests from Datadog's query service and returns matching results.
#
# The searcher is horizontally scalable - you can increase `replicaCount` to handle more concurrent searches.
# Resource requirements for searchers are highly workload-dependent and should be determined empirically.
# Key factors that impact searcher performance include:
# - Query complexity (for example, number of terms, use of wildcards or regex)
# - Query concurrency (number of simultaneous searches)
# - Amount of data scanned per query
# - Data access patterns (cache hit rates)
#
# Memory is particularly important for searchers as they cache frequently accessed index data in memory.
searcher:
  replicaCount: 2
  extraEnvFrom:
    - secretRef:
        name: cloudprem-minio-credentials
  resources:
    requests:
      cpu: "4"
      memory: "16Gi"
    limits:
      cpu: "4"
      memory: "16Gi"

# Control plane configuration
controlPlane:
  extraEnvFrom:
    - secretRef:
        name: cloudprem-minio-credentials

# Janitor configuration
janitor:
  extraEnvFrom:
    - secretRef:
        name: cloudprem-minio-credentials
Replace the following placeholders with your actual values:
<BUCKET_NAME>: The name of your MinIO bucket (for example, cloudprem)
<MINIO_ENDPOINT>: The MinIO endpoint URL (for example, http://minio.minio.svc.cluster.local:9000)
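If you prefer to keep the example file as a template and fill in the placeholders from a script, a substitution along the following lines may work. This is a minimal sketch: the function name `fill_placeholders` and the output file name are hypothetical, and it assumes that the bucket name and endpoint contain no `|` or `&` characters, which `sed` would treat specially here.

```shell
# fill_placeholders FILE BUCKET ENDPOINT
# Prints FILE with <BUCKET_NAME> and <MINIO_ENDPOINT> replaced.
# '|' is used as the sed delimiter so the slashes in the endpoint URL
# need no escaping.
fill_placeholders() {
  sed -e "s|<BUCKET_NAME>|$2|g" -e "s|<MINIO_ENDPOINT>|$3|g" "$1"
}

# Example usage (the output file name is hypothetical):
#   fill_placeholders datadog-values.yaml cloudprem \
#     http://minio.minio.svc.cluster.local:9000 > datadog-values.filled.yaml
```

Writing to a separate file leaves the template untouched, so the same template can be reused for another bucket or endpoint.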