Sunday, January 27, 2019

The current state of Graphite and its ecosystem

I opened an issue on the Graphite project to discuss its future and found that my (obviously opinionated) review of the current state of the ecosystem had grown quite long, so I decided to publish it separately, below.

1. Original Graphite https://graphiteapp.org
Language: Python
Storage: Whisper (Python)
Clustering capabilities: medium
Plus points:
* The de facto standard implementation.
* Still widely deployed and packaged by many distributions.
* Still works great for small to medium installations.
* Graphite-web is still the most complete implementation of the Graphite render protocol; most 3rd-party storage implementations still use it as their render engine.
Minus points:
* Whisper storage: no compression, 12 bytes per point, very IO-intensive.
* Python: scales vertically only by spawning more instances, which makes scaling the relay and carbon components quite hard.
* The current graphite-web clustering protocol is much better than the one in Graphite 0.9.x, but it still does not work very well for big and/or volatile clusters.
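
To give a feeling for what 12 bytes per point means in practice, here is a rough sketch that estimates the size of a single Whisper file from its retention schema. It assumes the file layout as I know it from whisper.py (a 16-byte metadata header, 12 bytes per archive description, 12 bytes per point, everything pre-allocated); the function name and the example schema are made up for illustration.

```python
# Whisper on-disk sizes in bytes, assumed from the struct formats in whisper.py.
METADATA_SIZE = 16      # aggregation type, max retention, xFilesFactor, archive count
ARCHIVE_INFO_SIZE = 12  # offset, seconds per point, number of points
POINT_SIZE = 12         # 4-byte timestamp + 8-byte double value


def whisper_file_size(archives):
    """Return the estimated size in bytes of one Whisper file.

    `archives` is a list of (seconds_per_point, retention_seconds) pairs,
    e.g. the schema "10s:1d,1m:30d" becomes [(10, 86400), (60, 30 * 86400)].
    """
    header = METADATA_SIZE + ARCHIVE_INFO_SIZE * len(archives)
    points = sum(retention // step for step, retention in archives)
    return header + points * POINT_SIZE


# Hypothetical schema: 10s resolution for 1 day, then 1m resolution for 30 days.
schema = [(10, 86400), (60, 30 * 86400)]
print(f"{whisper_file_size(schema)} bytes per metric")
```

Multiply the result by the number of metrics to get a lower bound for the disk space of an installation; replication and filesystem overhead come on top of that.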

2. Go-Graphite stack https://github.com/go-graphite/
Go-graphite is an effort to consolidate the Go reimplementations of various Graphite components that were developed by Booking.com and other companies.
Language: Go / C
Storage: Whisper (Go)
Clustering capabilities: strong
Plus points:
* Go produces a single binary per component, which is easy to deploy and scales vertically.
* The new clustering protocol ("carbonserver") works much better in big clusters (Booking.com probably has the biggest Graphite cluster in the world, and it is based on this setup).
Minus points:
* Scattered components and development.
* The project has no relay of its own written in Go yet; users have to use 3rd-party relays, e.g. carbon-c-relay or carbon-relay-ng.
* The project has no storage component of its own and uses lomik's go-carbon, which currently has "carbonserver" built in.
* Carbonapi (a graphite-web reimplementation) is not fully compatible with graphite-web and is currently split into 2 separate forks: a community fork and a Booking.com fork.
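
Since carbonapi reimplements graphite-web, the same render request can in principle be sent to either of them; whether the answers match depends on which functions the target uses. A minimal sketch of building such a request with the standard render parameters (target, from, until, format) follows. The host and the metric name are hypothetical, and the helper function is mine, not part of any of these projects.

```python
from urllib.parse import urlencode


def render_url(base_url, targets, time_from="-1h", until="now"):
    """Build a Graphite render API URL that asks for JSON output."""
    # Each target becomes its own "target=" query parameter.
    params = [("target", t) for t in targets]
    params += [("from", time_from), ("until", until), ("format", "json")]
    return f"{base_url.rstrip('/')}/render?{urlencode(params)}"


# Hypothetical host and metric name, for illustration only.
url = render_url("http://graphite.example.com", ["servers.web01.cpu.user"])
print(url)
```

Pointing the same URL path and query at a graphite-web and a carbonapi instance and comparing the JSON is a cheap way to check compatibility for the functions you actually use.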

3. Clickhouse stack
Clickhouse is an analytic database open-sourced by Yandex. During its internal development it was used as Graphite storage, so it has good built-in implementations of some Graphite parts (like aggregation). Yandex also open-sourced its internal Java-based implementation of the Graphite-compatible render part, named "Graphouse", but lomik's Go reimplementations of the components, carbon-clickhouse and graphite-clickhouse, are currently much more popular. Please note that this stack contains no rendering components and uses Graphite-web or carbonapi for the actual rendering.
Language: Go
Storage: Clickhouse (C++)
Clustering capabilities: strong
Plus points:
* Very good storage: low IO requirements, good compression (typically 2-4 bytes per point).
* Can be used in small, medium and large installations: storage is scalable (although there is no re-sharding, so extending a cluster is a bit like moving whisper files), and the other components are stateless Go binaries.
Minus points:
* Depends on Clickhouse's Graphite support, which is not the main purpose of Clickhouse, so in theory it could be removed or left undeveloped in future versions (but currently it is still there).
* Users need to experiment with different storage schemas.
* Extending a big Clickhouse cluster can currently be painful (probably less painful than with whisper; I just mean it may not be as smooth as with, e.g., a Cassandra cluster).

"Yuuge" (Trump-voice) projects
There are currently 2 projects that were initially developed for big and very big Graphite installations: "Metrictank" and "Biggraphite".

4. Metrictank https://github.com/grafana/metrictank
Developed by Grafana Labs to support the Grafana Cloud and WorldPing projects. A multitenant project aimed at big installations. I'm currently implementing an MT cluster at my job, so I'll describe it in a separate article.
Language: Go
Storage: Cassandra (Java) / BigTable (Google-cloud)
Clustering capabilities: strong
Plus points:
* Designed for scalability: all components are scalable, it uses Kafka as a bus for metric transport and clustering, and a SWIM cluster for cache nodes.
* Uses a strong caching layer to offload permanent storage, keeping N hours of data in a RAM cache for compression/deduplication.
* Reimplements some render functions in Go, with a proper fallback to Graphite-web.
* Designed to run in containers (e.g. in Kubernetes).
* Good compression ratio for storage (also around 2-4 bytes per point).
Minus points:
* Cache nodes are quite RAM-hungry and can go OOM (so they need a big RAM overhead), especially during cluster start. Cache storage is quite inefficient compared to permanent storage: 20-30 bytes per point (which is quite logical, since the cache should be fast rather than compact).
* Quite a complex system; you need to experiment with different deployment/setup strategies (well, that's probably true for every big and loaded storage).
* Not really useful in small installations (better to pick go-carbon or Clickhouse stack)

5. Biggraphite https://github.com/criteo/biggraphite
Designed by Criteo for its own Graphite installation. It uses Cassandra to extend storage but reuses the other components of the Python stack.
Language: Python
Storage: Cassandra (Java) / Elasticsearch (Java)
Clustering capabilities: strong
Plus points:
* Scalable solution (you still need to scale the Python carbon instances, though).
Minus points:
* Big storage overhead (16-24 bytes per point)
* Not really useful in small installations (better to pick go-carbon or Clickhouse stack)
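
To put the bytes-per-point figures quoted above side by side, here is a back-of-the-envelope sketch that turns them into disk space for a single resolution. The numbers are just the ranges from this article; the workload (1 million metrics, one point per minute, 30 days) is a hypothetical example, and the estimate ignores replication, rollup archives, indexes and filesystem overhead.

```python
# Bytes per point as (low, high), taken from the ranges quoted in this article.
BYTES_PER_POINT = {
    "Whisper": (12, 12),
    "Clickhouse": (2, 4),
    "Metrictank": (2, 4),
    "Biggraphite": (16, 24),
}


def storage_range_gib(metrics, interval_seconds, retention_days, bpp_range):
    """Return (low, high) storage estimates in GiB for one resolution."""
    points = metrics * (retention_days * 86400 // interval_seconds)
    low, high = bpp_range
    return points * low / 2**30, points * high / 2**30


# Hypothetical workload: 1 million metrics, one point per minute, kept for 30 days.
for name, bpp in BYTES_PER_POINT.items():
    low, high = storage_range_gib(1_000_000, 60, 30, bpp)
    print(f"{name:12s} {low:8.1f} - {high:8.1f} GiB")
```

Treat the output as an order-of-magnitude comparison only; real compression depends heavily on the data and on the storage schema.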

So, as I have mentioned many times before, IMO Graphite is currently not just a single project but rather a whole ecosystem of projects, developed at different times by different developers for different purposes. Not all of these projects are compatible with all features of the original project, but a user can (and should) pick one option or another, considering their own use case, requirements, and implementation.

I'm planning to write separately about the Metrictank and Clickhouse stacks soon.
