Configuration Explanation
Note: This document only explains the important parameters. Some parameters, due to their similar or identical functions, are not elaborated upon here.
All non-third-party services can set <service>.name to change the service name and the domain name exposed by the ingress (except for the portal's main domain).
All non-third-party services can also define most Deployment- and StatefulSet-related attributes, such as labels, annotations, replicas, serviceAccount, environments, resources, volumeMounts, livenessProbe, readinessProbe, startupProbe, lifecycle, stdin, tty, volumes, nodeSelector, tolerations, affinity, securityContext, etc.
This document explains the important configuration items in the CSGHub Helm Chart, including global configuration, core service configuration, built-in components, and the priority of each parameter. It is intended for readers who deploy, operate, or build on top of the chart.
1. Global Parameters
1.1 Release edition
global:
  edition: "ce"  # or "ee"
Specifies the edition to deploy: Community ("ce") or Enterprise ("ee"). The edition affects the image tag, the enabled features, and the dependencies.
1.2 Global Ingress
global:
  ingress:
    domain: "example.com"
    useTop: false
    tls:
      enabled: false
      secretName: "<kubernetes tls secret name>"
    service:
      type: "LoadBalancer"  # or "NodePort"
Explanation:
- domain
  The base domain used by the CSGHub system. All services derive their final access domains from it.
- useTop
  Whether to use the top-level domain directly.
  - true: The main CSGHub service uses the top-level domain directly (e.g., example.com).
  - false: Services use subdomains (default csghub.example.com).
- tls
  - enabled
    - true: Enables HTTPS encrypted access; a secretName must be provided.
    - false: HTTPS encrypted access is not enabled.
- service
  - type
    Specifies how the csghub Ingress is exposed externally. Valid values are LoadBalancer and NodePort.
    Note: This value must be specified at deployment time. Changing the Service type after deployment will affect access.
    This field is referenced via a YAML anchor (&type), so it also affects ingress-nginx.controller.service.type.
Priority:
global.ingress < ingress < <service>.ingress
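As a rough illustration of this priority chain, the sketch below disables TLS globally and enables it for the Portal only. It assumes that portal.ingress accepts the same tls keys as global.ingress; the Secret name portal-tls is hypothetical.

```yaml
# Sketch only: the key layout under portal.ingress is assumed to mirror global.ingress.
global:
  ingress:
    domain: "example.com"
    tls:
      enabled: false            # lowest priority: default for every service
portal:
  ingress:
    tls:
      enabled: true             # highest priority: applies to the Portal only
      secretName: "portal-tls"  # hypothetical Kubernetes TLS Secret name
```

With values like these, the Portal would be expected to serve HTTPS while the other services keep the global setting.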
1.3 Global Image
global:
  image:
    registry: "opencsg-registry.cn-beijing.cr.aliyuncs.com"
    tag: "v1.11.0"
    pullPolicy: "IfNotPresent"
    pullSecrets:
      - acr-pull-secret
- registry
  Overrides the image registry for every image in the Helm chart. Changing it is generally not recommended: by default each image is pulled from its original registry (for example, docker.io). For deployments in China, it can be set to opencsg-registry.cn-beijing.cr.aliyuncs.com.
- tag
  Defines the version of the csghub image. If the image namespace is opencsghq, the template automatically appends identifiers such as the edition, depending on whether the tag is compliant. For example, the tag in the example above is rendered as v1.11.0-ce or v1.11.0-ee.
- pullPolicy
  The image pull policy.
- pullSecrets
  The pull secrets used to pull images from a private image registry.
Priority:
global.image < image < <service>.image
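The following sketch shows how the three levels might be combined. The per-service override under portal.image is an assumption based on the priority rule above, not a confirmed layout.

```yaml
# Sketch only: illustrates global.image < image < <service>.image.
global:
  edition: "ce"
  image:
    registry: "opencsg-registry.cn-beijing.cr.aliyuncs.com"
    tag: "v1.11.0"              # rendered as v1.11.0-ce for opencsghq images, per the tag rule above
    pullPolicy: "IfNotPresent"
portal:
  image:
    pullPolicy: "Always"        # assumed per-service override; wins over global.image and image
```

Only the keys that differ need to be repeated at the more specific level; everything else is inherited.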
1.4 Global Persistence
global:
  persistence:
    storageClass: "hostpath"
    accessModes: ["ReadWriteOnce"]
    size: "10Gi"
- storageClass
  The default storage class used by all StatefulSets.
- accessModes
  The default access modes used by all StatefulSets.
- size
  The default storage volume size used by all StatefulSets when creating PVCs.
Priority:
global.persistence < <service>.persistence
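A minimal sketch of a per-service override, assuming the built-in postgresql service exposes a persistence block of the same shape; the 50Gi value is only an example.

```yaml
# Sketch only: postgresql.persistence is assumed to follow the <service>.persistence pattern.
global:
  persistence:
    storageClass: "hostpath"
    accessModes: ["ReadWriteOnce"]
    size: "10Gi"                # default for every StatefulSet
postgresql:
  persistence:
    size: "50Gi"                # overrides only the size for this service
```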
1.5 External Configuration for PostgreSQL, Redis, Mongo, Object Storage, Registry, etc.
Each component supports:
<service>:
  enabled: true  # or false
  external: {}
- enabled
  - true: Enables the built-in service component; in this case, the external configuration does not take effect.
- external
  When enabled is set to false, the connection information for the corresponding external service component is set via external.
Priority:
global.service < <service>.service
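As an illustration, the sketch below switches PostgreSQL to an external instance. The field names under external are an assumption, inferred from the portal.postgresql example in section 3.4 (which is described as having the same fields plus database); the host name and user are hypothetical.

```yaml
# Sketch only: field names under external are inferred, not confirmed.
global:
  postgresql:
    enabled: false                        # disable the built-in component
    external:
      host: "pg.example.internal"         # hypothetical host
      port: "5432"
      user: "csghub"                      # hypothetical user
      password: "<postgresql password>"
      timezone: "Etc/UTC"
      sslmode: "prefer"
```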
1.6 Global ChartContext
global:
  chartContext:
    isBuiltIn: true
- isBuiltIn
  The default value is true. It indicates whether a chart is deployed independently or bundled with the main csghub service, which allows the dataflow, runner, and csgship charts to integrate seamlessly.
2. CSGHub Core Configuration
2.1 Image (Core service image)
image:
  registry: "opencsg-registry.cn-beijing.cr.aliyuncs.com"
  tag: "v1.11.0"
  pullPolicy: "IfNotPresent"
  pullSecrets:
    - acr-pull-secret
This parameter functions similarly to global.image, but its scope is limited to services started using the csghub-server/csghub-portal image.
Priority:
global.image < image (here) < <service>.image
2.2 Logging
logging:
  level: "info"  # or "warning", "debug", "error"
Sets the log level for all services started from the csghub-server image, giving global control over log levels.
3. Portal
3.1 Image
portal:
  image:
    repository: "opencsghq/csghub-portal"
Other configurations can be ignored; they are inherited from global.image and image.
Priority:
global.image < image < portal.image
3.2 Ingress
Without going into details, it functions the same as global.ingress, except that it cannot declare ingress.service.type. All other parameters take priority over global.ingress.
3.3 Docs
portal:
  docs:
    domain: "docs.example.com"
or
portal:
  docs:
    host: "192.168.18.19"
    port: 8003
This configuration is used to link the CSGHub document center to an externally deployed document instance (CSGHub does not have a built-in document center).
Currently, two configuration methods are provided (choose one):
- domain
Specifies the domain name of the deployed external document center instance.
- host and port
If no domain name is configured, you can directly specify the host and port of the document center instance.
3.4 PostgreSQL
portal:
  postgresql:
    host: "<postgresql host>"
    port: "<postgresql port>"
    database: "<postgresql csghub portal database>"
    user: "<postgresql user>"
    password: "<postgresql password>"
    timezone: "Etc/UTC"
    sslmode: "prefer"
This parameter defines the database connection information for the Portal. Compared to global.postgresql.external, it includes a database parameter. The database cannot be specified globally, because having all components share the same database is discouraged and the Helm chart has not been adapted for it.
Standard parameter settings are not detailed here.
Priority:
global.postgresql.external < portal.postgresql
3.5 ObjectStore
portal:
  objectStore:
    endpoint: "<object store endpoint>"
    accessKey: "<object store access key>"
    secretKey: "<object store secret key>"
    bucket: "<object store public bucket>"
    region: "<object store region>"
    secure: "<object store tls>"
    encrypt: "<object store server encrypt>"
    pathStyle: "<object store path style>"
This parameter defines the object storage connection information for the Portal. Compared to global.objectStore.external, it has an additional bucket parameter. The bucket cannot be specified globally, because having all components share the same bucket is not recommended and the Helm chart has not been adapted for it.
Priority:
global.objectStore.external < portal.objectStore
4. Server
4.1 gitlabShell
server:
  gitlabShell:
    sshPort: 22
This defines the port number for the SSH service when cloning using git over ssh. The default port is 22 in LoadBalancer mode and 30022 in NodePort mode. Modifying this port is generally not recommended, as it involves adjusting the Ingress Controller's TCP exposure rules.
4.2 multiSync
server:
  multiSync:
    enabled: true
    proxy: "<proxy to connect internet>"
- enabled
  Defaults to true, indicating that multi-source synchronization is enabled.
- proxy
  Defaults to nil. Specifies the network proxy used to reach the Internet during multi-source synchronization.
4.3 SwaggerAPI
server:
  swaggerAPI:
    enabled: false
- enabled
  The default value is false, which disables the Swagger API helper instance.
5. RProxy
rproxy:
  coredns:
    enabled: true
    image:
      repository: "coredns/coredns"
      tag: "1.11.1"
  nginx:
    enabled: true
    image:
      repository: "nginx"
      tag: "latest"
This section is not explained in detail. CoreDNS and Nginx were components used in versions prior to v1.12.0 to assist rproxy in traffic forwarding. Starting with v1.12.0, these two components are deprecated and no longer used.
6. Notifier
6.1 SMTP
notifier:
  smtp:
    host: "<smtp host>"
    port: "<smtp port>"
    username: "<smtp username>"
    password: "<smtp password>"
Configures the mail server used by the notifier.
6.2 FeiShu
notifier:
  feiShu:
    appId: "<feishu app id>"
    appSecret: "<feishu app secret>"
Configures the notifier to send notifications to Feishu (Lark).
7. Runner (Chart built-in)
This configuration is passed directly to the Runner sub-chart.
- region
  Default: region-0. Identifies the cluster where the runner resides, e.g., "cn-north". The format and naming rules can be customized.
- interval
  Default: 60 seconds. The interval at which the Runner reports information to CSGHub.
- namespace
  Default: spaces. The Kubernetes namespace used for deploying inference, fine-tuning, and application spaces.
- autoConfigure
  Default: true. Specifies whether to automatically configure dependent components such as Knative Serving, Argo Workflow, and LeaderWorkerSet. These components are essential for inference, fine-tuning, model evaluation, application spaces, and MCP.
- mergingNamespace
  Default: disable. By default, with autoConfigure enabled, each type of component is created in its own Kubernetes namespace. This parameter allows some of those namespaces to be merged.
  - disable: Do not perform any namespace merging.
  - multi: Merge namespaces appropriately.
  - single: Merge all resources into a single namespace (not recommended).
- kymlMode
  Default: create. Controls how resources created by autoConfigure are maintained.
  - create: Create only. Skip if the resource already exists.
  - update: Update resources using Apply mode.
  - replace: Force replacement of resources by deleting and then recreating them.
- userPublicDomain
  Default: true. Specifies the method for accessing inference, fine-tuning, application space, and other instances.
  - true: Uses a separate domain name.
  - false: Uses subPath access, which may restrict the use of application space, MCP, and other features.
- pipIndexUrl
  Default: https://pypi.tuna.tsinghua.edu.cn/simple/. Defines the PyPI source used when building the application space image.
- extraBuildArgs
  Default: nil. Specifies additional arguments to pass when building images with Kaniko.
- modelRegistry
  Default: nil. Specifies the container image registry from which images for a specified architecture are pulled when starting an inference instance. OpenCSG ACR is used by default.
- knative.serving.domain
  Default: example.com. Defines the default internal domain name for exposing the ksvc service. No DNS resolution configuration is required; it is used only for internal routing.
- rbac
  - create
    Default: true. Specifies whether to create the Kubernetes permissions the runner needs to create and manage related resources.
- logcollector
  - enabled
    Default: false. Specifies whether to enable the logcollector service. Enable it if you want to retain ksvc instance logs for the past 7 days.
  - loki.address
    Default: nil. Defines the address of the loki service used for storing logs. If not set, the csghub loki instance is used by default.
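The parameters above might be combined as in the following sketch. It assumes the keys sit directly under a top-level runner key and nest as the dotted names suggest; the exact layout is not confirmed here, and the loki address is hypothetical.

```yaml
# Sketch only: nesting under "runner" is assumed from the parameter names above.
runner:
  region: "cn-north"
  namespace: "spaces"
  autoConfigure: true
  mergingNamespace: "multi"     # one of: disable, multi, single
  kymlMode: "update"            # one of: create, update, replace
  knative:
    serving:
      domain: "example.com"
  rbac:
    create: true
  logcollector:
    enabled: true
    loki:
      address: "http://loki.example.internal:3100"  # hypothetical; omit to use the csghub loki instance
```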
8. Dataflow (Chart built-in)
A data processing tool. The Dataflow Helm Chart can be deployed independently or bundled with the CSGHub Helm Chart (by setting .Values.global.chartContext.isBuiltIn: true). The chart includes the following components:
- dataflow
- label studio
- Celery worker
- PostgreSQL (disabled by default when bundled)
- Redis (disabled by default when bundled)
- MongoDB
- Ingress-nginx (disabled by default when bundled)
- Prometheus (disabled by default when bundled)
Enabled via dataflow.enabled. It is installed bundled with CSGHub Helm Chart by default and requires no additional configuration. Currently, customizable settings are as follows:
dataflow:
  enabled: true
  dataflow:
    image: {}
    postgresql: {}
    redis: {}
    mongo: {}
    persistence: {}
  labelStudio:
    image: {}
    postgresql: {}
    persistence: {}
All parameter definition rules are the same as those described above.
9. CSGShip (Chart built-in)
Backend service for the AI coding assistant. The CSGShip Helm Chart can be deployed independently or bundled with the CSGHub Helm Chart (by setting .Values.global.chartContext.isBuiltIn: true). The chart contains the following components:
- agentic
- billing
- casdoor
- frontend
- megalinter-server
- megalinter-worker
- postgresql
- redis
- secscan
- web
All components require no special or additional configuration.
10. Other Services
Besides the csghub-server service, the following derivative services are based on the same image:
- accounting
- user
- dataviewer
- mirror
- temporalWorker
- gateway
Their configuration parameters for image, PostgreSQL, Redis, etc. are very similar, and by default they inherit all parameters from csghub-server. Service-specific custom parameters are mostly passed through each service's own environments setting.
11. Third-party Built-In Components
11.1 PostgreSQL
11.1.1 Databases
postgresql:
  databases:
    - "csghub_casdoor"
    - "csghub_temporal"
    - "csghub_server"
    - "csghub_portal"
    - "csghub_dataflow"
    - "csghub_label_studio"
    - "csghub_csgship"
Defines the databases created during database initialization. It only takes effect at initialization time.
11.1.2 Parameters
postgresql:
  parameters:
    max_connections: 200
    # ......
This parameter customizes database parameters. By default, the database starts with all parameters at their default values; use this setting to tune them.
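For illustration, a slightly longer sketch follows. shared_buffers and work_mem are standard PostgreSQL settings, but whether the chart passes arbitrary keys through is an assumption, and the values are placeholders rather than tuning advice.

```yaml
# Sketch only: assumes postgresql.parameters accepts any PostgreSQL setting name.
postgresql:
  parameters:
    max_connections: 200
    shared_buffers: "512MB"     # placeholder value
    work_mem: "16MB"            # placeholder value
```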
11.1.3 Other Configuration
The details are not elaborated here, as they are all general configurations.
11.2 Redis
11.2.1 requirePass
redis:
  requirePass: false
This parameter is not enabled by default. It can be enabled if csgship is not running. Currently, csgship does not support password verification for Redis.
11.3 MinIO
11.3.1 Console
minio:
  console:
    enabled: true
    service:
      port: 9001
      protocol: "TCP"
Defines whether the MinIO console UI is enabled, along with its service port and protocol.
11.3.2 Region
minio:
  region: "cn-north-1"
Defines the default region for MinIO.
11.3.3 Buckets
minio:
  buckets:
    - name: "csghub-registry"
      policy: "none"  # Access policy: none, download, public
    - name: "csghub-billing"
      policy: "none"
    - name: "csghub-server"
      policy: "none"
    - name: "csghub-portal"
      policy: "none"
    - name: "csghub-portal-public"
      policy: "download"
    - name: "csghub-runner"
      policy: "none"
Defines the buckets to be created. Unlike postgresql.databases, this parameter can be modified after startup; the chart always checks whether each bucket has been created.
- name
  The bucket name.
- policy
  The bucket access policy.
  - none: Default value, i.e., private.
  - download: Allows read-only access.
  - public: Allows public read and write access.
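A sketch of adding one more bucket after the initial deployment; the bucket name my-extra-bucket is hypothetical. Because Helm replaces list values wholesale instead of merging them, an override of minio.buckets would need to repeat the default entries that should be kept.

```yaml
# Sketch only: repeat the defaults you still need, then append the new bucket.
minio:
  buckets:
    - name: "csghub-server"
      policy: "none"
    - name: "csghub-portal-public"
      policy: "download"
    # ... remaining default buckets go here ...
    - name: "my-extra-bucket"   # hypothetical additional bucket
      policy: "download"        # read-only access
```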
11.3.4 Other Configuration
The rest are all standard configurations, which will not be elaborated here.
11.4 Registry
Unless there are any special configuration requirements, they will not be elaborated here.
11.5 Gitaly
11.5.1 Storage
gitaly:
  storage: "default"
The default name for the gitaly storage.
11.5.2 Other Configuration
The rest are all standard configurations, which will not be elaborated here.
11.6 GitlabShell
11.6.1 RBAC
gitlabShell:
  rbac:
    create: true
Whether to create RBAC permissions. They are mainly used to create the SSH key pair that verifies git over SSH operations.
11.6.2 Other Configuration
The rest are all standard configurations, which will not be elaborated here.
11.7 Nats
The rest are all standard configurations, which will not be elaborated here.
11.8 Casdoor
The rest are all standard configurations, which will not be elaborated here.
11.9 Temporal
11.9.1 Console
temporal:
  enabled: false
Whether to enable the Temporal console UI. It is disabled by default. Because of the OAuth settings, if the UI is enabled while the Casdoor service is not ready, the entire Temporal service will fail to function. Before enabling it, make sure your Casdoor service is ready (i.e., accessible via Ingress).
11.9.2 Other Configuration
Unless there are any special configuration requirements, they will not be elaborated here.
12. Third-party Dependencies
The following components are for illustrative purposes only. In actual use, they generally do not need to be modified; the default configuration is sufficient.
12.1 ingress-nginx
The default ingress controller. It can be enabled or disabled via ingress-nginx.enabled.
Please do not modify the default configuration, as this may cause service access errors.
12.2 fluentd
Originally the default log collection tool. It is now disabled by default and may be removed later.
12.3 loki
Log storage and query engine. Due to low usage intensity, a minimal deployment was adopted here.
12.3.1 Ingress
loki:
  ingress:
    enabled: false
    basicAuth: {}
      # username: ""
      # password: ""
If loki and the logcollector in the runner chart are not in the same instance, you need to enable the loki ingress. Note that logcollector does not currently support basicAuth authentication.
12.4 Tempo
A trace collection tool. Enabling it significantly impacts performance, so it is not recommended unless there is a specific need.
12.5 Prometheus
This is used to collect background analysis data for inference, fine-tuning, and other instances; it is disabled by default.