Chproxy, is an http proxy and load balancer for ClickHouse database. It provides the following features:
- May proxy requests to multiple distinct
ClickHouseclusters depending on the input user. For instance, requests from
appserveruser may go to
stats-rawcluster, while requests from
reportserveruser may go to
- May map input users to per-cluster users. This prevents from exposing real usernames and passwords used in
ClickHouseclusters. Additionally this allows mapping multiple distinct input users to a single
- May accept incoming requests via HTTP and HTTPS.
- May limit HTTP and HTTPS access by IP/IP-mask lists.
- May limit per-user access by IP/IP-mask lists.
- May limit per-user query duration. Timed out or canceled queries are forcibly killed
via KILL QUERY.
- May limit per-user requests rate.
- May limit per-user number of concurrent requests.
- All the limits may be independently set for each input user and for each per-cluster user.
- May delay request execution until it fits per-user limits.
- Per-user response caching may be configured.
- Response caches have built-in protection against thundering herd problem aka
- Evenly spreads requests among replicas and nodes using
- Monitors node health and prevents from sending requests to unhealthy nodes.
- Supports automatic HTTPS certificate issuing and renewal via Let’s Encrypt.
- May proxy requests to each configured cluster via either HTTP or HTTPS.
- Prepends User-Agent request header with remote/local address and in/out usernames before proxying it to
ClickHouse, so this info may be queried from system.query_log.http_user_agent.
- Exposes various useful metrics in prometheus text format.
- Configuration may be updated without restart - just send
- Easy to manage and run - just pass config file path to a single
- Easy to configure:
server: http: listen_addr: ":9090" allowed_networks: ["127.0.0.0/24"] users: - name: "default" to_cluster: "default" to_user: "default" # by default each cluster has `default` user which can be overridden by section `users` clusters: - name: "default" nodes: ["127.0.0.1:8123"]
How to install
Building from source
Chproxy is written in Go. The easiest way to install it from sources is:
go get -u github.com/Vertamedia/chproxy
If you don't have Go installed on your system - follow this guide.
Why it was created
max_execution_timemay be exceeded due to the current implementation deficiencies.
max_concurrent_queriesworks only on a per-node basis. There is no way to limit the number of concurrent queries on a cluster if queries are spread across cluster nodes.
Such "leaky" limits may lead to high resource usage on all the cluster nodes. After facing this problem we had to maintain two distinct http proxies in front of our
ClickHouse cluster - one for spreading
INSERTs among cluster nodes and another one for sending
SELECTs to a dedicated node where limits may be enforced somehow. This was fragile and inconvenient to manage, so
chproxy has been created :)
INSERTs among cluster shards
INSERTs are sent from app servers located in a limited number of subnetworks.
INSERTs from other subnetworks must be denied.
INSERTs may be routed to a distributed table on a single node. But this increases resource usage (CPU and network) on the node comparing to other nodes, since it must parse each row to be inserted and route it to the corresponding node (shard).
It would be better to spread
INSERTs among available shards and to route them directly to per-shard tables instead of distributed tables. The routing logic may be embedded either directly into applications generating
INSERTs or may be moved to a proxy. Proxy approach is better since it allows re-configuring
ClickHouse cluster without modification of application configs and without application downtime. Multiple identical proxies may be started on distinct servers for scalability and availability purposes.
The following minimal
chproxy config may be used for this use case:
server: http: listen_addr: ":9090" # Networks with application servers. allowed_networks: ["10.10.1.0/24"] users: - name: "insert" to_cluster: "stats-raw" to_user: "default" clusters: - name: "stats-raw" # Requests are spread in `round-robin` + `least-loaded` fashion among nodes. # Unreachable and unhealthy nodes are skipped. nodes: [ "10.10.10.1:8123", "10.10.10.2:8123", "10.10.10.3:8123", "10.10.10.4:8123" ]
SELECTs from reporting apps among cluster nodes
Reporting apps usually generate various customer reports from
SELECT query results.
The load generated by such
ClickHouse cluster may vary depending
on the number of online customers and on the generated report types. It is obvious
that the load must be limited in order to prevent cluster overload.
SELECTs may be routed to a distributed table on a single node. But this increases resource usage (RAM, CPU and network) on the node comparing to other nodes, since it must do final aggregation, sorting and filtering for the data obtained from cluster nodes (shards).
It would be better to create identical distributed tables on each shard and spread
SELECTs among all the available shards.
The following minimal
chproxy config may be used for this use case:
server: http: listen_addr: ":9090" # Networks with reporting servers. allowed_networks: ["10.10.2.0/24"] users: - name: "report" to_cluster: "stats-aggregate" to_user: "readonly" max_concurrent_queries: 6 max_execution_time: 1m clusters: - name: "stats-aggregate" nodes: [ "10.10.20.1:8123", "10.10.20.2:8123" ] users: - name: "readonly" password: "****"
Authorize users by passwords via HTTPS
Suppose you need to access
ClickHouse cluster from anywhere by username/password.
This may be used for building graphs from ClickHouse-grafana or tabix.
It is bad idea to transfer unencrypted password and data over untrusted networks.
So HTTPS must be used for accessing the cluster in such cases.
chproxy config may be used for this use case:
server: https: listen_addr: ":443" autocert: cache_dir: "certs_dir" users: - name: "web" password: "****" to_cluster: "stats-raw" to_user: "web" max_concurrent_queries: 2 max_execution_time: 30s requests_per_minute: 10 deny_http: true # Allow `CORS` requests for `tabix`. allow_cors: true # Enable requests queueing - `chproxy` will queue up to `max_queue_size` # of incoming requests for up to `max_queue_time` until they stop exceeding # the current limits. # This allows gracefully handling request bursts when more than # `max_concurrent_queries` concurrent requests arrive. max_queue_size: 40 max_queue_time: 25s # Enable response caching. See cache config below. cache: "shortterm" clusters: - name: "stats-raw" nodes: [ "10.10.10.1:8123", "10.10.10.2:8123", "10.10.10.3:8123", "10.10.10.4:8123" ] users: - name: "web" password: "****" caches: - name: "shortterm" dir: "/path/to/cache/dir" max_size: 150Mb # Cached responses will expire in 130s. expire: 130s
All the above configs combined
All the above cases may be combined in a single
server: http: listen_addr: ":9090" allowed_networks: ["10.10.1.0/24","10.10.2.0/24"] https: listen_addr: ":443" autocert: cache_dir: "certs_dir" users: - name: "insert" allowed_networks: ["10.10.1.0/24"] to_cluster: "stats-raw" to_user: "default" - name: "report" allowed_networks: ["10.10.2.0/24"] to_cluster: "stats-aggregate" to_user: "readonly" max_concurrent_queries: 6 max_execution_time: 1m - name: "web" password: "****" to_cluster: "stats-raw" to_user: "web" max_concurrent_queries: 2 max_execution_time: 30s requests_per_minute: 10 deny_http: true allow_cors: true max_queue_size: 40 max_queue_time: 25s cache: "shortterm" clusters: - name: "stats-aggregate" nodes: [ "10.10.20.1:8123", "10.10.20.2:8123" ] users: - name: "readonly" password: "****" - name: "stats-raw" nodes: [ "10.10.10.1:8123", "10.10.10.2:8123", "10.10.10.3:8123", "10.10.10.4:8123" ] users: - name: "default" - name: "web" password: "****" caches: - name: "shortterm" dir: "/path/to/cache/dir" max_size: 150Mb expire: 130s
There are two types of users:
in-users (in global section) and
out-users (in cluster section).
This means all requests will be matched to
in-users and if all checks are Ok - will be matched to
with overriding credentials.
Suppose we have one ClickHouse user
read-only permissions and
max_concurrent_queries: 4 limit.
There are two distinct applications
reading from ClickHouse. We may create two distinct
to_user: "web" and
max_concurrent_queries: 2 each in order to avoid situation when a single application exhausts all the 4-request limit on the
out-users are independent.
Chproxy can be configured with multiple
cluster must have a name and either a list of nodes
or a list of replicas with nodes. See cluster-config for details.
Requests to each cluster are balanced among replicas and nodes using
The node priority is automatically decreased for a short interval if recent requests to it were unsuccessful.
This means that the
chproxy will choose the next least loaded healthy node among least loaded replica
for every new request.
Additionally each node is periodically checked for availability. Unavailable nodes are automatically excluded from the cluster until they become available again. This allows performing node maintenance without removing unavailable nodes from the cluster config.
Chproxy automatically kills queries exceeding
max_execution_time limit. By default
chproxy tries to kill such queries
default user. The user may be overriden with kill_query_user.
cluster's users section isn't specified, then
default user is used with no limits.
Chproxy may be configured to cache responses. It is possible to create multiple
cache-configs with various settings.
Response caching is enabled by assigning cache name to user. Multiple users may share the same cache.
SELECT responses are cached.
Caching is disabled for request with
no_cache=1 in query string.
Optional cache namespace may be passed in query string as
cache_namespace=aaaa. This allows caching
distinct responses for the identical query under distinct cache namespaces. Additionally,
an instant cache flush may be built on top of cache namespaces - just switch to new namespace in order
to flush the cache.
Chproxy removes all the query params from input requests (except the user's params and listed here)
before proxying them to
ClickHouse nodes. This prevents from unsafe overriding
Be careful when configuring limits, allowed networks, passwords etc.
chproxy tries detecting the most obvious configuration errors such as
allowed_networks: ["0.0.0.0/0"] or sending passwords via unencrypted HTTP.
hack_me_please: true may be used for disabling all the security-related checks during config validation (if you are feelying lucky :) ).
Example of full configuration:
# Whether to print debug logs. # # By default debug logs are disabled. log_debug: true # Whether to ignore security checks during config parsing. # # By default security checks are enabled. hack_me_please: true # Optional response cache configs. # # Multiple distinct caches with different settings may be configured. caches: # Cache name, which may be passed into `cache` option on the `user` level. # # Multiple users may share the same cache. - name: "longterm" # Path to directory where cached responses will be stored. dir: "/path/to/longterm/cachedir" # Maximum cache size. # `Kb`, `Mb`, `Gb` and `Tb` suffixes may be used. max_size: 100Gb # Expiration time for cached responses. expire: 1h # When multiple requests with identical query simultaneously hit `chproxy` # and there is no cached response for the query, then only a single # request will be proxied to clickhouse. Other requests will wait # for the cached response during this grace duration. # This is known as protection from `thundering herd` problem. # # By default `grace_time` is 5s. Negative value disables the protection # from `thundering herd` problem. grace_time: 20s - name: "shortterm" dir: "/path/to/shortterm/cachedir" max_size: 100Mb expire: 10s # Optional network lists, might be used as values for `allowed_networks`. network_groups: - name: "office" # Each item may contain either IP or IP subnet mask. networks: ["127.0.0.0/24", "10.10.0.1"] - name: "reporting-apps" networks: ["10.10.10.0/24"] # Optional lists of query params to send with each proxied request to ClickHouse. # These lists may be used for overriding ClickHouse settings on a per-user basis. param_groups: # Group name, which may be passed into `params` option on the `user` level. - name: "cron-job" # List of key-value params to send params: - key: "max_memory_usage" value: "40000000000" - key: "max_bytes_before_external_group_by" value: "20000000000" - name: "web" params: - key: "max_memory_usage" value: "5000000000" - key: "max_columns_to_read" value: "30" - key: "max_execution_time" value: "30" # Settings for `chproxy` input interfaces. server: # Configs for input http interface. # The interface works only if this section is present. http: # TCP address to listen to for http. # May be in the form IP:port . IP part is optional. listen_addr: ":9090" # List of allowed networks or network_groups. # Each item may contain IP address, IP subnet mask or a name # from `network_groups`. # By default requests are accepted from all the IPs. allowed_networks: ["office", "reporting-apps", "22.214.171.124"] # ReadTimeout is the maximum duration for proxy to reading the entire # request, including the body. # Default value is 1m read_timeout: 5m # WriteTimeout is the maximum duration for proxy before timing out writes of the response. # Default is largest MaxExecutionTime + MaxQueueTime value from Users or Clusters write_timeout: 10m # IdleTimeout is the maximum amount of time for proxy to wait for the next request. # Default is 10m idle_timeout: 20m # Configs for input https interface. # The interface works only if this section is present. https: # TCP address to listen to for https. listen_addr: ":443" # Paths to TLS cert and key files. # cert_file: "cert_file" # key_file: "key_file" # Letsencrypt config. # Certificates are automatically issued and renewed if this section # is present. # There is no need in cert_file and key_file if this section is present. # Autocert requires application to listen on :80 port for certificate generation autocert: # Path to the directory where autocert certs are cached. cache_dir: "certs_dir" # The list of host names proxy is allowed to respond to. # See https://godoc.org/golang.org/x/crypto/acme/autocert#HostPolicy allowed_hosts: ["example.com"] # Metrics in prometheus format are exposed on the `/metrics` path. # Access to `/metrics` endpoint may be restricted in this section. # By default access to `/metrics` is unrestricted. metrics: allowed_networks: ["office"] # Configs for input users. users: # Name and password are used to authorize access via BasicAuth or # via `user`/`password` query params. # Password is optional. By default empty password is used. - name: "web" password: "****" # Requests from the user are routed to this cluster. to_cluster: "first cluster" # Input user is substituted by the given output user from `to_cluster` # before proxying the request. to_user: "web" # Whether to deny input requests over HTTP. deny_http: true # Whether to allow `CORS` requests like `tabix` does. # By default `CORS` requests are denied for security reasons. allow_cors: true # Requests per minute limit for the given input user. # # By default there is no per-minute limit. requests_per_minute: 4 # Response cache config name to use. # # By default responses aren't cached. cache: "longterm" # An optional group of params to send to ClickHouse with each proxied request. # These params may be set in param_groups block. # # By default no additional params are sent to ClickHouse. params: "web" # The maximum number of requests that may wait for their chance # to be executed because they cannot run now due to the current limits. # # This option may be useful for handling request bursts from `tabix` # or `clickhouse-grafana`. # # By default all the requests are immediately executed without # waiting in the queue. max_queue_size: 100 # The maximum duration the queued requests may wait for their chance # to be executed. # This option makes sense only if max_queue_size is set. # By default requests wait for up to 10 seconds in the queue. max_queue_time: 35s - name: "default" to_cluster: "second cluster" to_user: "default" allowed_networks: ["office", "126.96.36.199/24"] # The maximum number of concurrently running queries for the user. # # By default there is no limit on the number of concurrently # running queries. max_concurrent_queries: 4 # The maximum query duration for the user. # Timed out queries are forcibly killed via `KILL QUERY`. # # By default there is no limit on the query duration. max_execution_time: 1m # Whether to deny input requests over HTTPS. deny_https: true # Configs for ClickHouse clusters. clusters: # The cluster name is used in `to_cluster`. - name: "first cluster" # Protocol to use for communicating with cluster nodes. # Currently supported values are `http` or `https`. # By default `http` is used. scheme: "http" # Cluster node addresses. # Requests are evenly distributed among them. nodes: ["127.0.0.1:8123", "shard2:8123"] # Each cluster node is checked for availability using this interval. # By default each node is checked for every 5 seconds. heartbeat_interval: 1m # Timed out queries are killed using this user. # By default `default` user is used. kill_query_user: name: "default" password: "***" # Configuration for cluster users. users: # The user name is used in `to_user`. - name: "web" password: "password" max_concurrent_queries: 4 max_execution_time: 1m - name: "second cluster" scheme: "https" # The cluster may contain multiple replicas instead of flat nodes. # # Chproxy selects the least loaded node among the least loaded replicas. replicas: - name: "replica1" nodes: ["127.0.1.1:8443", "127.0.1.2:8443"] - name: "replica2" nodes: ["127.0.2.1:8443", "127.0.2.2:8443"] users: - name: "default" max_concurrent_queries: 4 max_execution_time: 1m - name: "web" max_concurrent_queries: 4 max_execution_time: 10s requests_per_minute: 10 max_queue_size: 50 max_queue_time: 70s allowed_networks: ["office"]
Full specification is located here
Metrics are exposed in prometheus text format at
|status_codes_total||Counter||Distribution by response status codes||
|request_sum_total||Counter||The number of processed requests||
|request_success_total||Counter||The number of successfully proxied requests||
|concurrent_limit_excess_total||Counter||The number of rejected requests due to max_concurrent_queries limit||
|host_penalties_total||Counter||The number of given penalties by host||
|host_health||Gauge||Health state of hosts by clusters||
|concurrent_queries||Gauge||The number of concurrent queries at the moment||
|request_queue_size||Gauge||Request queue size at the moment||
|user_queue_overflow_total||Counter||The number of overflows for per-user request queues||
|cluster_user_queue_overflow_total||Counter||The number of overflows for per-cluster_user request queues||
|request_body_bytes_total||Counter||The amount of bytes read from request bodies||
|response_body_bytes_total||Counter||The amount of bytes written to response bodies||
|cache_hits_total||Counter||The amount of cache hits||
|cache_miss_total||Counter||The amount of cache misses||
|cache_size||Gauge||Size of each cache||
|cache_items||Gauge||The number of items in each cache||
|request_duration_seconds||Summary||Request duration. Includes possible queue wait time||
|proxied_response_duration_seconds||Summary||Duration for responses proxied from clickhouse||
|cached_response_duration_seconds||Summary||Duration for cached responses. Includes the duration for sending response to client||
|canceled_request_total||Counter||The number of requests canceled by remote client||
|killed_request_total||Counter||The number of requests killed by proxy||
|timeout_request_total||Counter||The number of timed out requests||
|config_last_reload_successful||Gauge||Whether the last configuration reload attempt was successful|
|config_last_reload_success_timestamp_seconds||Gauge||Timestamp of the last successful configuration reload|
|bad_requests_total||Counter||The number of unsupported requests|