Architecture Design
1. Overview
CSGHub is an open-source, trusted Large Language Model (LLM) asset management platform. Its architecture is designed with private deployment as the core objective, striving to provide users with a complete suite of asset management capabilities consistent with Hugging Face. It enables full lifecycle governance for LLM native assets, including models, datasets, and code. The system adopts a microservice architecture, offering excellent scalability that supports a smooth evolution from lightweight single-machine Docker deployments to large-scale Kubernetes clusters, adapting to various scales and deployment scenarios.
2. Logical Architecture and Components
CSGHub utilizes a standardized microservice architecture where core components have clear responsibilities and work in synergy.
- Docker Deployment: All components run in different processes within the same container, simplifying deployment and management.
- Kubernetes Deployment: Components run as independent Pods, achieving component isolation, elastic scaling, and high-availability deployment.
2.1 Core Business & Access Layer
- Portal & Server: The primary entry point for the platform. It provides the Web UI and core business logic APIs, managing metadata for assets like models and datasets.
- User & Casdoor: A complete identity management system handling registration, login, permission allocation, and multi-tenant OAuth authentication to ensure secure access.
- Nginx & RProxy: Manages all traffic ingress and dynamic routing. RProxy specifically handles dynamic load requests for "Space" applications, ensuring precise forwarding and load balancing.
- Notifier: A unified notification service integrating email, webhooks, and system messages to push critical events (task completion, asset updates, alerts) to users.
- DataViewer: An online dataset preview tool supporting content parsing and visual display for various formats, helping users quickly understand dataset details.
2.2 AI Computing & Orchestration Layer
This layer is responsible for resource allocation, AI task execution, and backend support for code assistants.
- AI Gateway: The unified entry point for AI services, integrating inference request routing, rate limiting, billing statistics, and security controls.
- CSGShip: The backend service for code assistants, providing support for the CodeSouler IDE plugin.
- Runner (Critical Component): A distributed task executor (successor to Space Builder) responsible for compute-intensive tasks like Space app building, model fine-tuning, and general task execution.
- Dataflow: A data pipeline service focused on cleaning, transforming, and formatting large-scale datasets to support model training and inference.
- Temporal & Worker: The "brain" of asynchronous task management, managing state machines for long-running tasks like resource synchronization and image building to ensure stability and error recovery.
- Accounting: A resource billing system that tracks computing usage, storage occupancy, and API calls for resource control and cost accounting.
2.3 Asset Storage & Acceleration Layer
This layer handles persistence, versioning, and high-speed transfer of large files.
- xNet (Core Acceleration): An intelligent acceleration engine designed for large files (LFS, model weights). It optimizes transmission paths and caching to solve the pain point of slow asset transfers.
- Gitaly & Gitlab-shell: A high-performance Git storage backend providing version control and SSH access for models, code, and datasets.
- Mirroring Service: Consists of the `mirror_repo` and `mirror_lfs` modules, which synchronize assets between domestic and international repositories.
- Object Storage (MinIO) & Registry: The physical storage foundation. MinIO stores model files and datasets, while the Registry manages container images for Spaces.
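The split between Git storage (Gitaly) and object storage (MinIO) hinges on Git LFS: the repository itself holds only a small pointer file, while the large blob it references lives in object storage. A minimal sketch of parsing such a pointer, per the public Git LFS pointer format (the helper name and sample hash are illustrative, not part of CSGHub):

```python
# Parse a Git LFS pointer file. The Git repo stores only this small
# text stub; the blob it references is kept in object storage
# (MinIO in CSGHub's case). Helper name is illustrative.

def parse_lfs_pointer(text: str) -> dict:
    """Return the key/value fields of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:4d7a214614ab2935c943f9e0ff69d22eadbb8f32b1258daaa5e2ca24d17e2393
size 12345
"""

info = parse_lfs_pointer(pointer)
print(info["size"])  # size of the real blob in bytes, not the pointer
print(info["oid"])   # content address used as the object-storage key
```

Because the pointer is tiny, normal Git operations stay fast regardless of model size; only an explicit LFS fetch touches the object store, which is the path an acceleration layer like xNet can optimize.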
2.4 Infrastructure Layer
- Databases: Includes PostgreSQL (storing metadata like user info and task configs) and Redis (handling caching and session management).
- NATS: A high-performance event bus facilitating asynchronous communication and decoupling between microservices.
- Observability: Integrates Prometheus (metrics) and Loki (centralized logging) for real-time monitoring and troubleshooting.
3. Deployment Methods
CSGHub offers multiple deployment options to meet different business needs and environmental constraints:
- Docker Compose: An "all-in-one" single-image solution. Best for rapid local onboarding, product demos, and developer debugging. It features a minimal delivery process but limited scalability.
- Kubernetes (Helm): A standardized distributed deployment for production environments. It supports Pod-level elastic scaling, high availability, and enterprise-grade stability.
- Air-gap Deployment (Coming Soon): Designed for high-security environments without internet access (e.g., finance or government). It uses pre-downloaded image tarballs and internal private registries.
- Quick Install: An automated script based on K3s. It sets up a lightweight Kubernetes environment and initializes CSGHub in one click, ideal for single-machine environments requiring K8s orchestration.
4. Network Access and Port Specifications
4.1 Docker Compose (Multiple Exposed Ports)
Since all services run in a single container/namespace, multiple ports are mapped to the host:
- Main Entry: Port 80 (Nginx) for Web and API access.
- Git SSH: Port 2222 (Git over SSH) to avoid conflict with the host's default SSH (Port 22).
- Identity: Port 8000 (Casdoor) for authentication and SSO.
- Code Assistant: Ports 8001 (Frontend) and 8002 (API) for CSGShip.
- Object Storage: Ports 9000 (API) and 9001 (Console) for MinIO management.
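The mappings above can be summarized as a simple table keyed by host port; the sketch below (the dict is illustrative, the authoritative source is the project's docker-compose file) also verifies the rationale for remapping Git SSH, namely that no mapped port collides with the host's own SSH daemon:

```python
# Host port mappings used by the Docker Compose deployment, as listed
# above. Illustrative only; see the project's docker-compose file for
# the authoritative values.
DOCKER_COMPOSE_PORTS = {
    80:   "Nginx (Web & API entry)",
    2222: "Git over SSH",
    8000: "Casdoor (identity / SSO)",
    8001: "CSGShip frontend",
    8002: "CSGShip API",
    9000: "MinIO API",
    9001: "MinIO console",
}

HOST_SSH_PORT = 22

def port_conflicts(mapped: dict, reserved: set) -> set:
    """Return mapped host ports that collide with reserved host ports."""
    return set(mapped) & reserved

# Git SSH is deliberately remapped to 2222, so host port 22 stays free:
print(port_conflicts(DOCKER_COMPOSE_PORTS, {HOST_SSH_PORT}))  # set()
```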
4.2 Kubernetes / Air-gap / Quick Install (80/443 Convergence)
In these modes, network access is unified behind an Ingress controller or Envoy Gateway:
- Unified Entry: All functions (Web, API, Auth, Inference) are accessed via the standard ports 80 (HTTP) and 443 (HTTPS).
- Standard Git Access: SSH operations typically use the standard Port 22 via a LoadBalancer, aligning with standard user habits.
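The practical consequence of the two port schemes is the shape of the Git remote URL a user clones from. A sketch under stated assumptions (host name and repo path are placeholders; the port logic mirrors the modes above):

```python
# Build the SSH remote URL for a Git clone, depending on deployment
# mode. Host and repo path are placeholders.

def git_ssh_remote(host: str, repo_path: str, port: int = 22) -> str:
    """Return an SSH Git remote URL for the given host port."""
    if port == 22:
        # Standard port: the common scp-like syntax works.
        return f"git@{host}:{repo_path}.git"
    # A non-standard port (e.g. Docker Compose's 2222) requires the
    # explicit ssh:// form, since scp-like syntax cannot carry a port.
    return f"ssh://git@{host}:{port}/{repo_path}.git"

# Docker Compose deployment (Git SSH remapped to 2222):
print(git_ssh_remote("csghub.example.com", "models/org/demo", 2222))
# Kubernetes deployment (LoadBalancer exposes standard port 22):
print(git_ssh_remote("csghub.example.com", "models/org/demo"))
```

Keeping port 22 in the Kubernetes modes means users can copy remote URLs in the familiar `git@host:path` form, which is the "standard user habits" point made above.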