Environment information
Kubernetes on GKE
Grafana Loki deployed as a Helm chart
Chart: https://github.com/grafana/loki/tree/main/production/helm/loki
Version: 5.8.9
Grafana Loki is deployed along with the Kube Prometheus Stack
Alerts are processed by Alertmanager (part of the Kube Prometheus Stack)
Configuration
This is the content of the values.yaml for the Grafana Loki chart:
loki:
  auth_enabled: false
  storage:
    type: gcs
    .....
    ...
    ..
  schemaConfig:
    configs:
      - from: 2023-01-01
        store: boltdb-shipper
        object_store: gcs
        schema: v11
        index:
          period: 24h
          prefix: ........
        chunks:
          period: 24h
  # Alerts configuration - THE MOST IMPORTANT PART (1)
  rulerConfig:
    wal:
      # /var/loki is mounted as a PVC
      dir: /var/loki/ruler-wal
    storage:
      type: local
      local:
        directory: /rules
    rule_path: /tmp/scratch
    # Internal address of Alertmanager
    alertmanager_url: http://kube-prometheus-stack-alertmanager:9093
    ring:
      kvstore:
        store: inmemory
    enable_api: true
    enable_alertmanager_v2: true
    # Send the results of the recording rules to Prometheus via remote write
    remote_write:
      enabled: true
      client:
        # Internal address of Prometheus
        url: http://kube-prometheus-stack-prometheus:9090/api/v1/write
# ---
# Service Account used for Workload Identity to get access
# to the bucket used as storage.
serviceAccount:
  create: true
  name: loki-sa
  annotations:
    iam.gke.io/gcp-service-account: ....
# THE MOST IMPORTANT PART (2)
backend:
  # Alerts configuration: mount the rules ConfigMap into the ruler's
  # local rule storage.
  extraVolumes:
    - name: loki-rules
      configMap:
        name: loki-rules
    - name: loki-rules-scratch
      emptyDir: {}
  extraVolumeMounts:
    - name: loki-rules
      # `fake` is the tenant ID Loki uses when auth_enabled is false
      mountPath: /rules/fake
    - name: loki-rules-scratch
      mountPath: /tmp/scratch
With the above configuration, the Loki ruler evaluates the rules, sends the resulting alerts to Alertmanager, and pushes the results of the recording rules to Prometheus via remote write.
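Note that Prometheus only accepts data on /api/v1/write when its remote write receiver is enabled. A minimal sketch of the corresponding Kube Prometheus Stack values, assuming a chart version whose operator exposes the enableRemoteWriteReceiver field:

# values.yaml for the kube-prometheus-stack chart (not the Loki chart)
prometheus:
  prometheusSpec:
    # Starts Prometheus with --web.enable-remote-write-receiver so the
    # Loki ruler can push recording rule results to /api/v1/write.
    enableRemoteWriteReceiver: true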
The following ConfigMap configures alerts and recording rules for Loki:
apiVersion: v1
kind: ConfigMap
metadata:
  name: loki-rules
  namespace: <the same namespace as Loki>
data:
  # Recording rule just as an example
  recording-rules.yaml: |-
    groups:
      - name: my_app
        interval: 5m
        rules:
          - record: loki:my_app:logs:count:1h
            expr: |
              count_over_time({app="my-app", container="my-container"} [1h])
  alert-rules.yaml: |-
    groups:
      - name: MyFirstGroup
        rules:
          - alert: ExampleAlert1
            expr: |
              absent_over_time({app="my-app"} [40m])
            for: 10m
            labels:
              severity: error
            annotations:
              summary: My app has stopped streaming logs.
              description: My app has not sent any logs for 40 minutes.
          - alert: ExampleAlert2
            expr: |
              sum by(app) (count_over_time({app="my-app"} | json | severity = `ERROR` | __error__="" [5m])) > 2
            for: 1s
            labels:
              severity: warning
            annotations:
              summary: More than 2 errors have occurred in the my-app logs in the last 5 minutes.
      - name: MySecondGroup
        rules:
          - alert: ExampleAlert3
            expr: ......
            for: ....
            labels:
              severity: warning
            annotations:
              summary: ...
              description: ....
The above alerting rules will show up in the alert rules list in Grafana, and any alerts they fire are delivered to Alertmanager like regular Prometheus alerts.
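Because the alerts arrive in Alertmanager carrying the labels defined above, they can be routed with the existing Kube Prometheus Stack Alertmanager configuration. A minimal, hypothetical sketch (the receiver name and route are placeholders, adapt them to your setup):

# values.yaml for the kube-prometheus-stack chart - hypothetical routing sketch
alertmanager:
  config:
    route:
      receiver: "default"
      routes:
        # Route Loki-fired alerts on their severity label,
        # exactly like alerts coming from Prometheus.
        - receiver: "default"
          matchers:
            - severity =~ "warning|error"
    receivers:
      - name: "default"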