CSGHub-Lite Introduction

CSGHub-Lite is a lightweight tool designed for desktop environments, enabling users to easily run Large Language Models (LLMs) locally. It serves as the personal desktop version of the CSGHub platform, integrating model download, local inference, interactive chat, and an OpenAI-compatible REST API.
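Since the REST API is OpenAI-compatible, a client talks to it by POSTing a standard chat-completions payload. The sketch below only builds that payload; the base URL, port, path, and model id are assumptions for illustration, not values confirmed by this document — use whatever address CSGHub-Lite reports when it starts.

```python
import json

# Hypothetical local endpoint -- host, port, and path are assumptions;
# check the address CSGHub-Lite prints when its API server starts.
BASE_URL = "http://localhost:8080/v1/chat/completions"

def build_chat_request(model, messages, stream=True):
    """Assemble an OpenAI-style chat-completions request body."""
    return {
        "model": model,       # name of a locally downloaded model
        "messages": messages, # [{"role": ..., "content": ...}, ...]
        "stream": stream,     # True -> tokens arrive incrementally
    }

payload = build_chat_request(
    "Qwen2.5-0.5B-Instruct",  # placeholder model id, not a required name
    [{"role": "user", "content": "Hello!"}],
)
body = json.dumps(payload)  # this JSON would be POSTed to BASE_URL
```

Any OpenAI-compatible SDK or a plain HTTP client can send this body; only the base URL needs to point at the local server instead of a cloud endpoint.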

Core Features

  • One-Command Start — Use the csghub-lite run command to automatically download, load, and enter the chat interface.
  • Model Keep-Alive — After you exit the chat, the model stays loaded in memory for 5 minutes by default, so the next session reconnects instantly.
  • Auto-Start Server — The background API server starts automatically with the command; no manual setup is required.
  • Multiple Model Sources — Supports downloading models from the official CSGHub platform (hub.opencsg.com) or private deployment environments.
  • Local Inference Capability — Powered by llama.cpp, it supports the GGUF format and can automatically convert SafeTensors models.
  • Streaming Interactive Chat — Provides a smooth command-line chat experience with streaming output support.
  • Compatible Interfaces — Provides REST APIs compliant with OpenAI and Ollama specifications.
  • Cross-Platform Support — Compatible with macOS, Linux, and Windows.
  • Resume Downloads — Interrupted downloads can resume from the last position, saving time and bandwidth.
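The keep-alive behaviour above can be pictured as a deadline that each use of the model pushes forward. This is a hypothetical sketch of that pattern, not CSGHub-Lite's actual implementation; the class name and clock injection are illustrative choices.

```python
import time

KEEP_ALIVE_SECONDS = 5 * 60  # the 5-minute default described above

class ModelSlot:
    """Keep a loaded model in memory until a sliding keep-alive deadline
    expires. Hypothetical illustration of the keep-alive idea; the clock
    is injectable so the expiry logic can be exercised deterministically."""

    def __init__(self, keep_alive=KEEP_ALIVE_SECONDS, clock=time.monotonic):
        self._model = None
        self._deadline = 0.0
        self._keep_alive = keep_alive
        self._clock = clock

    def touch(self, model):
        """Load (or reuse) a model and start the keep-alive window."""
        self._model = model
        self._deadline = self._clock() + self._keep_alive

    def get(self):
        """Return the model if still alive, refreshing the window;
        otherwise drop it so it would be reloaded on demand."""
        if self._model is not None and self._clock() < self._deadline:
            self._deadline = self._clock() + self._keep_alive
            return self._model
        self._model = None  # expired: memory would be freed here
        return None
```

Every chat session that reuses the model slides the deadline forward, so an active user never pays the reload cost, while an idle model is released after the window elapses.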

Supported Model Formats

| Format | Download Support | Inference Support |
| --- | --- | --- |
| GGUF | Yes | Yes (via llama.cpp) |
| SafeTensors | Yes | Yes (auto-converted to GGUF) |
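The two formats are easy to tell apart on disk: every GGUF file begins with the four-byte ASCII magic `GGUF`. A minimal sniffer (the helper name and demo files are illustrative, not part of CSGHub-Lite):

```python
import os
import tempfile

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file

def looks_like_gguf(path):
    """Cheap format sniff: compare the file's first four bytes to the magic."""
    with open(path, "rb") as f:
        return f.read(4) == GGUF_MAGIC

# Demo with throwaway files standing in for real model downloads.
with tempfile.NamedTemporaryFile(delete=False, suffix=".gguf") as tmp:
    tmp.write(GGUF_MAGIC + b"\x00" * 16)  # fake header tail
is_gguf = looks_like_gguf(tmp.name)
os.unlink(tmp.name)

with tempfile.NamedTemporaryFile(delete=False) as other:
    other.write(b"not a gguf header")
is_other = looks_like_gguf(other.name)
os.unlink(other.name)
```

A check like this explains how a tool can decide whether a downloaded SafeTensors model needs conversion before llama.cpp can load it.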