Unable to query Graphite beyond 6 hours

Problem description

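I'm trying to feed in a series of infrequent (<1/minute) metrics and query them over more than the past few hours. Unfortunately, despite trying the usual tricks I could find on Google, I can't see anything beyond 6 hours. What is wrong with my configuration? Here are the files I use to set up the environment: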

./storage-aggregation.conf

[min]
pattern = \.lower$
xFilesFactor = 0
aggregationMethod = min

[max]
pattern = \.upper(_\d+)?$
xFilesFactor = 0
aggregationMethod = max

[sum]
pattern = \.sum$
xFilesFactor = 0
aggregationMethod = sum

[count]
pattern = \.count$
xFilesFactor = 0
aggregationMethod = sum

[count_legacy]
pattern = ^stats_counts.*
xFilesFactor = 0
aggregationMethod = sum

[default_average]
pattern = .*
xFilesFactor = 0
aggregationMethod = average

./docker-compose.yml

version: '3.3'
services:
  graphite:
    image: graphiteapp/graphite-statsd
    container_name: 'graphite'
    ports:
      - '2003:2003'
    volumes:
      - ./persistence/graphite/storage:/opt/graphite/storage
      - ./storage-aggregation.conf:/opt/graphite/conf/storage-aggregation.conf
      - ./storage-schemas.conf:/opt/graphite/conf/storage-schemas.conf

  grafana:
    build: './grafana'
    ports:
      - '3000:3000'
    links:
      - graphite

./storage-schemas.conf

[carbon]
pattern = ^carbon\.
retentions = 10s:6h,1m:90d

[default_1min_for_1day]
pattern = .*
retentions = 10s:1800d,1m:1800d,10m:1800d

./grafana/provisioning/datasources/all.yml

datasources:
- name: 'graphite'
  type: 'graphite'
  access: 'proxy'
  org_id: 1
  url: 'http://graphite:8080'
  is_default: true
  version: 1
  editable: true

./grafana/provisioning/dashboards/all.yml

- name: 'default'
  org_id: 1
  folder: ''
  type: 'file'
  options:
    folder: '/var/lib/grafana/dashboards'

./grafana/Dockerfile

FROM grafana/grafana:7.0.0
ADD ./provisioning /etc/grafana/provisioning
ADD ./config.ini /etc/grafana/config.ini
ADD ./dashboards /var/lib/grafana/dashboards
USER 0
RUN chmod a+w /var/lib/grafana -R /etc/grafana/config.ini
USER 472

./grafana/config.ini

[paths]
provisioning = /etc/grafana/provisioning

[server]
enable_gzip = true

[users]
default_theme = light

The dashboard is pretty much the default. What am I missing here?

Tags: docker, grafana, graphite

Solution


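Your retention specifies a raw interval of 10 seconds, but you are sending data less than once per minute. This means the raw archive will look like 0s, <value>; 10s, null; 20s, null; 30s, null; 40s, null; 50s, null; 60s, <value maybe, but could also be null>.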
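To make the sparse raw archive concrete, here is a minimal Python sketch of how incoming points land in 10-second slots. The sample timestamps and values are hypothetical, and `fill_slots` is an illustrative helper, not part of Whisper; it only mimics the way a timestamp is aligned down to the start of its slot.

```python
STEP = 10  # seconds per point in the raw archive (the "10s" in "10s:...")

def fill_slots(samples, start, end, step=STEP):
    """Return a sorted list of (slot_start, value_or_None) covering [start, end)."""
    slots = {t: None for t in range(start, end, step)}
    for timestamp, value in samples:
        slot = timestamp - (timestamp % step)  # align down to the slot boundary
        if slot in slots:
            slots[slot] = value
    return sorted(slots.items())

# Hypothetical metric sent less than once per minute: (timestamp in seconds, value)
samples = [(3, 42.0), (95, 17.0)]

slots = fill_slots(samples, start=0, end=120)
non_null = sum(1 for _, value in slots if value is not None)

for slot_start, value in slots:
    print(f"{slot_start:>4}s  {value}")
print(f"{non_null} of {len(slots)} slots hold a value")
```

With one point every minute or so, only about one slot in six (or fewer) is ever filled, which is what the rollup step then has to work with.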

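XFF (xFilesFactor) is the fraction of slots in the finer archive that must be non-null for a rollup to produce a value. Rolling up to 1 minute covers 6 raw slots, and you have only 1 non-null value among them, so any XFF above 1/6 rolls up to null. Your storage-aggregation.conf sets XFF to 0, which should accept a single value, but that file is only applied when a Whisper file is created; metrics created before the change keep the XFF they were created with.

You should consider changing your raw retention to an interval longer than 10 seconds. If you want a value to propagate even when most slots are null, keep XFF low: 0.1, for example, lets the next aggregation accept a value if at least 10% of the values in the finer interval are known.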
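The rollup rule can be sketched in a few lines of Python. This is a simplified model based on how the Graphite documentation describes xFilesFactor (the minimum fraction of non-null points required), not Whisper's actual code; `rollup` and the sample minute are hypothetical.

```python
def rollup(points, xff, method="average"):
    """Aggregate one coarse slot from its finer points, or return None.

    points: values from the finer archive, with None for empty slots.
    xff:    minimum fraction of non-null points needed to produce a value.
    """
    known = [p for p in points if p is not None]
    if not known:
        return None  # nothing to aggregate, whatever xff is
    if len(known) / len(points) < xff:
        return None  # too few known points for this xff
    if method == "average":
        return sum(known) / len(known)
    if method == "sum":
        return sum(known)
    if method == "min":
        return min(known)
    if method == "max":
        return max(known)
    raise ValueError(f"unsupported method: {method}")

# One minute of 10-second slots with a single value in it (1/6 known).
minute = [42.0, None, None, None, None, None]

for xff in (0.0, 0.1, 0.5, 1.0):
    print(f"xff={xff}: {rollup(minute, xff)}")
```

In this model the single value survives for any XFF up to 1/6 and is dropped for anything higher, so the choice of XFF should reflect how sparse the finer archive is expected to be.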

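Finally, your 10s:1800d,1m:1800d,10m:1800d setting doesn't make sense, because the lower-resolution archives will never be used (they all cover 1800d). If you really want 1800d of raw data you can just use 10s:1800d, but that will still result in huge, unwieldy files. I'd suggest a more reasonable schedule (low interval = short retention, higher interval = longer retention; the total size of your Whisper files is the sum of retention/interval over all aggregation levels, and Graphite always picks the first retention that covers the query period), combined with an XFF value that matches your expectations of how rollups should handle nulls.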
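As a rough illustration of the size and archive-selection arithmetic, the Python sketch below compares the schedule from the question with a hypothetical alternative (`1m:30d,10m:1800d` is only an example, not a recommendation for any particular workload). It assumes 12 bytes per stored point, which matches Whisper's timestamp-plus-double layout, and ignores the small file header; the helper names are illustrative.

```python
UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
BYTES_PER_POINT = 12  # 4-byte timestamp + 8-byte double; header ignored

def parse_retentions(spec):
    """Parse 'precision:duration' pairs into (step, retention) tuples in seconds."""
    archives = []
    for part in spec.split(","):
        precision, duration = part.strip().split(":")
        step = int(precision[:-1]) * UNITS[precision[-1]]
        retention = int(duration[:-1]) * UNITS[duration[-1]]
        archives.append((step, retention))
    return archives

def total_points(archives):
    """Sum of retention/interval over all archives."""
    return sum(retention // step for step, retention in archives)

def pick_archive(archives, query_seconds):
    """Step of the first archive whose retention covers the query window."""
    for step, retention in archives:
        if retention >= query_seconds:
            return step
    return archives[-1][0]  # nothing covers it: fall back to the coarsest archive

current = parse_retentions("10s:1800d,1m:1800d,10m:1800d")  # from the question
example = parse_retentions("1m:30d,10m:1800d")              # hypothetical alternative

for name, archives in (("current", current), ("example", example)):
    size_mib = total_points(archives) * BYTES_PER_POINT / 2**20
    print(f"{name}: {total_points(archives)} points, about {size_mib:.1f} MiB per metric")
    print(f"  12h query uses the {pick_archive(archives, 12 * 3600)}s archive")
    print(f"  90d query uses the {pick_archive(archives, 90 * 86400)}s archive")
```

Under these assumptions, every query against the schedule from the question is served by the 10-second archive, while the tiered example answers short queries from the 1-minute archive and long ones from the 10-minute archive at a small fraction of the disk space.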

