kube-prometheus-stack Interview Questions

Category: Observability | Subcategory: metrics | Questions: 30
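1 What is the core architecture of kube-prometheus-stack?

Answer:

kube-prometheus-stack is a Helm chart that bundles Prometheus Operator, Prometheus, Alertmanager, Grafana, Node Exporter, and related components into a complete monitoring solution for Kubernetes clusters.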

Architecture components:

kube-prometheus-stack (Helm Chart)
    ├── Prometheus Operator      → CRD controller
    ├── Prometheus               → server instances
    ├── Alertmanager             → alert management cluster
    ├── Grafana                  → visualization dashboards
    ├── Node Exporter            → node metrics
    ├── Kube State Metrics       → K8s object state
    ├── Prometheus Adapter       → custom/HPA metrics (separate chart)
    └── Exporters                → other exporters

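Core CRDs:

| CRD | API version | Purpose |
|---|---|---|
| Prometheus | monitoring.coreos.com/v1 | Prometheus instances |
| Alertmanager | monitoring.coreos.com/v1 | Alertmanager instances |
| ServiceMonitor | monitoring.coreos.com/v1 | Scraping metrics through Services |
| PodMonitor | monitoring.coreos.com/v1 | Scraping metrics from Pods |
| PrometheusRule | monitoring.coreos.com/v1 | Alerting and recording rules |
| Probe | monitoring.coreos.com/v1 | Blackbox probes |
| AlertmanagerConfig | monitoring.coreos.com/v1alpha1 | Alert routing configuration |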
2 Which metrics does kube-prometheus-stack scrape by default?

Answer:

kube-prometheus-stack ships with several default scrape jobs that cover the core components of a Kubernetes cluster.

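Default jobs:

| Job name | Scrape target | What the metrics cover |
|---|---|---|
| kubernetes-apiservers | API Server | Request latency, error rate, resource versions |
| kubernetes-nodes | Nodes | Node CPU / memory / disk |
| kubernetes-cadvisor | kubelet cAdvisor | Container CPU / memory / network / disk |
| kubernetes-service-endpoints | Service endpoints | Application metrics |
| kubernetes-pods | Pods marked by annotation | Custom application metrics |
| kubelet | kubelet metrics | Pod status, container operations |
| kube-state-metrics | KSM service | State of Deployments, Pods, Nodes, and other objects |
| node-exporter | Nodes | Node hardware, OS, filesystem |
| prometheus-operator | The Operator itself | Operator health |
| prometheus | Prometheus itself | Scrape / storage / query performance |
| alertmanager | Alertmanager | Alert processing state |
| grafana | Grafana | Grafana health |
| windows-exporter | Windows nodes | Windows node metrics |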
3 How does ServiceMonitor work?

Answer:

ServiceMonitor is a core CRD of kube-prometheus-stack. It defines how metrics are scraped from the backends of a Service.

Workflow:

```mermaid
graph TD
    SM["ServiceMonitor CRD"] -->|"selector matches Service"| SVC["Service (Selector)"]
    SVC -->|"label matches Pod"| POD["Pod (metrics endpoint)"]
    POD -->|"/metrics"| PROM["Prometheus Server"]
```

**ServiceMonitor definition:**

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: myapp-monitor
  namespace: monitoring
spec:
  # Select the Services to monitor
  selector:
    matchLabels:
      app: myapp
  # Scrape endpoints
  endpoints:
    - port: http-metrics      # Service port name
      interval: 15s
      path: /metrics
      scheme: http
      scrapeTimeout: 10s
      # Filter and rewrite target labels
      relabelings:
        - sourceLabels: [__meta_kubernetes_pod_node_name]
          targetLabel: node
      # Drop unwanted metrics
      metricRelabelings:
        - sourceLabels: [__name__]
          regex: "go_.*"
          action: drop
  # Namespaces the Services live in
  namespaceSelector:
    matchNames:
      - default
      - production
```

Association conditions:

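ServiceMonitor → Service
  ServiceMonitor.spec.selector matches Service.metadata.labels
  ServiceMonitor.spec.namespaceSelector matches the Service's namespace
  ServiceMonitor.spec.endpoints[].port matches a Service port name

Service → Pod
  Service.spec.selector matches Pod.metadata.labels
  The Service port's targetPort resolves to a container port on the Pod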
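
The association conditions above can be sketched as a small matching function. This is a simplified model rather than the Operator's actual code: the `Service` class, the `monitor` dictionary layout, and the function names are assumptions made for illustration, and only `matchLabels`-style selectors are modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    # Hypothetical, minimal stand-in for a Kubernetes Service object.
    name: str
    namespace: str
    labels: dict = field(default_factory=dict)
    port_names: list = field(default_factory=list)


def match_labels(selector, labels):
    """matchLabels semantics: every selector pair must be present on the object."""
    return all(labels.get(key) == value for key, value in selector.items())


def service_monitor_selects(monitor, service):
    """Return True when the Service would yield scrape targets for the monitor."""
    namespace_ok = service.namespace in monitor["namespaces"]
    labels_ok = match_labels(monitor["match_labels"], service.labels)
    port_ok = monitor["port"] in service.port_names
    return namespace_ok and labels_ok and port_ok


monitor = {
    "match_labels": {"app": "myapp"},
    "namespaces": ["default", "production"],
    "port": "http-metrics",
}
in_scope = Service("myapp", "production", {"app": "myapp", "tier": "web"}, ["http-metrics"])
wrong_namespace = Service("myapp", "staging", {"app": "myapp"}, ["http-metrics"])
wrong_port = Service("myapp", "default", {"app": "myapp"}, ["http"])

print(service_monitor_selects(monitor, in_scope))
print(service_monitor_selects(monitor, wrong_namespace))
print(service_monitor_selects(monitor, wrong_port))
```

Extra labels on the Service do not prevent a match. A namespace outside the selector or a missing port name does, and these are the usual reasons a target fails to appear.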
4 What is the difference between PodMonitor and ServiceMonitor?

Answer:

PodMonitor scrapes Pods directly without going through a Service; ServiceMonitor discovers the backing Pods through a Service.

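Comparison:

| Dimension | ServiceMonitor | PodMonitor |
|---|---|---|
| Discovery entry point | Endpoints of a Service | Pods directly |
| Load balancing | None: every endpoint behind the Service is scraped individually | None: every Pod is scraped individually |
| Target selection | Service labels | Pod labels |
| Typical workloads | Deployment-style workloads | DaemonSet / StatefulSet |
| Port discovery | Service port name | Pod container port name |
| Network policy | Prometheus must reach the Pod IPs behind the Service | Prometheus must reach the Pod IPs |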

PodMonitor example:

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: myapp-podmonitor
spec:
  selector:
    matchLabels:
      app: myapp-daemon
  podMetricsEndpoints:
    - port: metrics
      interval: 10s
      path: /metrics
  namespaceSelector:
    any: true

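Selection guidance:

ServiceMonitor:
  - Deployment or ReplicaSet (multiple replicas)
  - A Service already fronts the workload
  - The Service exposes a named metrics port

PodMonitor:
  - DaemonSet (one Pod per node)
  - Headless Service scenarios
  - StatefulSet (each Pod scraped independently)
  - Precise per-Pod metrics are required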
5 How does PrometheusRule manage alerting rules in kube-prometheus-stack?

Answer:

The PrometheusRule CRD manages Prometheus alerting and recording rules as Kubernetes resources and supports dynamic updates.

PrometheusRule structure:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: node-alerts
  labels:
    role: alert-rules
    prometheus: k8s
spec:
  groups:
    # Alerting rule group
    - name: node-alerts
      interval: 30s
      rules:
        - alert: NodeCPUUsageHigh
          expr: (100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)) > 80
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Node {{ $labels.instance }} CPU usage > 80%"

        - alert: NodeMemoryUsageHigh
          expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 90
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "Node {{ $labels.instance }} memory > 90%"

    # Recording rule group
    - name: node-recording
      interval: 1m
      rules:
        - record: node:node_cpu_utilization:ratio
          expr: (1 - avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])))

Associating rules with Prometheus:

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s
spec:
  ruleSelector:
    matchLabels:
      role: alert-rules
      prometheus: k8s

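Hot-reloading rules:

Rule update flow:
  1. Create or update a PrometheusRule CR
  2. Prometheus Operator detects the change
  3. The rules are reloaded into Prometheus dynamically
  4. The Prometheus server does not need a restart

Verify that rules are loaded:
  kubectl get prometheusrule -n monitoring
  kubectl exec -n monitoring prometheus-k8s-0 -- wget -qO- http://localhost:9090/api/v1/rules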
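
To make the `NodeCPUUsageHigh` expression concrete, the arithmetic can be reproduced by hand. The sketch below is a simplified model: it computes a plain per-second increase per core, whereas the real `rate()` also handles counter resets and extrapolation. The sample counter values and the function name are made up for illustration.

```python
def cpu_usage_percent(idle_start, idle_end, window_seconds):
    """Mirror 100 - avg(rate(idle_counter[window])) * 100 for one instance."""
    # Per-core idle rate: idle seconds accumulated per wall-clock second.
    rates = [(end - start) / window_seconds for start, end in zip(idle_start, idle_end)]
    avg_idle = sum(rates) / len(rates)
    return 100 - avg_idle * 100


# Two cores observed over a 5 minute (300 s) window.
idle_at_t0 = [1000.0, 2000.0]
idle_at_t1 = [1030.0, 2060.0]  # core 0 idled 30 s, core 1 idled 60 s

usage = cpu_usage_percent(idle_at_t0, idle_at_t1, 300)
fires = usage > 80  # threshold used by the rule above
print(round(usage, 2), fires)
```

With an average idle fraction of 0.15 the instance is about 85% busy, so the condition holds. The alert itself fires only after the condition has stayed true for the `for: 5m` duration.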
6 What is the role of the AlertmanagerConfig CRD?

Answer:

The AlertmanagerConfig CRD lets you manage Alertmanager routes, receivers, and inhibition rules declaratively in Kubernetes.

AlertmanagerConfig definition:

apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: myapp-alerts
  namespace: monitoring
  labels:
    alertmanagerConfig: myapp
spec:
  route:
    groupBy: ['namespace']
    groupWait: 30s
    groupInterval: 5m
    repeatInterval: 12h
    receiver: 'slack-notifications'
    routes:
      - matchers:
          - name: severity
            value: critical
        receiver: 'pagerduty-critical'
  receivers:
    - name: 'slack-notifications'
      slackConfigs:
        - apiURL:
            name: slack-webhook
            key: url
          channel: '#alerts'
          title: '{{ template "slack.title" . }}'
          text: '{{ template "slack.text" . }}'
    - name: 'pagerduty-critical'
      pagerDutyConfigs:
        - routingKey:
            name: pagerduty-key
            key: routing_key
          severity: critical
  inhibitRules:
    - sourceMatch:
        - name: severity
          value: critical
      targetMatch:
        - name: severity
          value: warning
      equal: ['namespace']

Associating with Alertmanager:

apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: main
spec:
  alertmanagerConfigSelector:
    matchLabels:
      alertmanagerConfig: myapp
  alertmanagerConfigNamespaceSelector:
    matchNames:
      - monitoring

Supported notification types:

- Slack
- PagerDuty
- Email (SMTP)
- OpsGenie
- VictorOps
- WeChat
- Telegram
- Discord
- Webhook
- Pushover
- SNS
7 How are Grafana dashboards managed in kube-prometheus-stack?

Answer:

kube-prometheus-stack ships with many preconfigured Grafana dashboards and also supports adding custom dashboards through ConfigMaps.

Built-in dashboards:

| Dashboard | Purpose |
|---|---|
| Kubernetes / API Server | API Server performance |
| Kubernetes / Nodes | Node resources |
| Kubernetes / Pods | Pod status and resources |
| Kubernetes / Deployments | Deployment resources |
| Kubernetes / StatefulSets | StatefulSet monitoring |
| Kubernetes / Kubelet | Kubelet health |
| Kubernetes / Networking | Network traffic and policies |
| Kubernetes / Persistent Volumes | Storage volumes |
| Node Exporter / Nodes | Full Node Exporter metrics |
| Node Exporter / USE Method | USE method dashboard |
| Prometheus / Overview | Prometheus self-monitoring |
| Prometheus / Remote Write | Remote Write status |
| Alertmanager / Overview | Alert processing state |

Custom dashboard:

apiVersion: v1
kind: ConfigMap
metadata:
  name: custom-dashboard
  namespace: monitoring
  labels:
    grafana_dashboard: "1"
data:
  custom-dashboard.json: |
    {
      "title": "Custom App Dashboard",
      "panels": [...]
    }    

Dashboard sidecar configuration:

grafana:
  sidecar:
    dashboards:
      enabled: true
      label: grafana_dashboard
      labelValue: "1"
      searchNamespace: ALL
      folderAnnotation: grafana_folder
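8 What is the role of Prometheus Adapter in kube-prometheus-stack?

Answer:

Prometheus Adapter exposes Prometheus metrics through the Kubernetes Custom Metrics API so that HPA and Vertical Pod Autoscaler can scale on custom metrics. It is not a dependency of the kube-prometheus-stack chart and is installed from its own chart (prometheus-community/prometheus-adapter).

Architecture:

```mermaid
graph TD
    PD["Pod / Deployment"] -->|"/metrics"| PROM["Prometheus"]
    PROM -->|"PromQL query"| ADAPTER["Prometheus Adapter"]
    ADAPTER -->|"Custom Metrics API"| API["K8s API Server"]
    API --> HPA["HPA (Horizontal Pod Autoscaler)"]
    API --> VPA["VPA (Vertical Pod Autoscaler)"]
```

**Configuration example:**

```yaml
# prometheus-adapter configuration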
prometheus-adapter:
  prometheus:
    url: http://prometheus-operated:9090
    port: 9090
  rules:
    default: false
    custom:
      - seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
        resources:
          overrides:
            namespace: {resource: "namespace"}
            pod: {resource: "pod"}
        name:
          matches: "^(.*)_total$"
          as: "${1}_per_second"
        metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
```

HPA usage:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: 100
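
The `name.matches` / `name.as` pair in the adapter rule is what turns `http_requests_total` into the `http_requests_per_second` metric the HPA refers to. The renaming step can be approximated in Python as below. This is a sketch of the idea only: the adapter itself uses Go regular expressions, and the helper name `adapter_metric_name` is hypothetical.

```python
import re


def adapter_metric_name(series, matches, as_template):
    """Rename a series the way a name.matches / name.as rule would."""
    found = re.match(matches, series)
    if found is None:
        return None  # the rule does not apply to this series
    # Convert Go-style ${1} group references to Python's \g<1> form.
    python_template = re.sub(r"\$\{(\d+)\}", r"\\g<\1>", as_template)
    return found.expand(python_template)


exposed = adapter_metric_name("http_requests_total", r"^(.*)_total$", "${1}_per_second")
skipped = adapter_metric_name("http_request_duration_seconds", r"^(.*)_total$", "${1}_per_second")
print(exposed, skipped)
```

If the HPA asks for a metric name that no rule produces, the Custom Metrics API has nothing to return, so checking this mapping is a reasonable first step when an HPA reports an unknown metric.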
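9 How does kube-prometheus-stack manage resource configuration for Prometheus instances?

Answer:

kube-prometheus-stack controls the resource allocation and storage of Prometheus instances in detail through the spec fields of the Prometheus CRD.

Prometheus CRD resource configuration: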

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s
spec:
  # Number of replicas
  replicas: 2

  # Container resources
  resources:
    requests:
      memory: 4Gi
      cpu: 2
    limits:
      memory: 8Gi

  # Storage
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: fast-ssd
        accessModes: [ReadWriteOnce]
        resources:
          requests:
            storage: 100Gi

  # TSDB settings
  retention: 15d
  retentionSize: 80GB
  walCompression: true

  # Query settings
  query:
    maxConcurrency: 50
    maxSamples: 50000000
    timeout: 2m

  # Scrape settings
  scrapeInterval: 30s
  scrapeTimeout: 10s
  evaluationInterval: 30s

  # Other settings
  externalLabels:
    cluster: production-us-east
  externalUrl: https://prometheus.example.com

  # Rule selection
  ruleSelector:
    matchLabels:
      role: alert-rules
  ruleNamespaceSelector:
    matchNames:
      - monitoring

  # ServiceMonitor selection
  serviceMonitorSelector:
    matchLabels:
      app: monitored
  serviceMonitorNamespaceSelector: {}

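Resource estimates:

| Cluster size | Nodes | Pods | Prometheus resources | Storage |
|---|---|---|---|---|
| Small | < 10 | < 500 | 2 cores / 4 GB | 50 GB |
| Medium | 10-50 | 500-2000 | 4 cores / 8 GB | 200 GB |
| Large | 50-200 | 2000-10000 | 8 cores / 16 GB | 1 TB |
| Very large | > 200 | > 10000 | 16 cores / 32 GB | 2 TB+ |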
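
The storage column can be sanity-checked with the rule of thumb from the Prometheus documentation: disk space is roughly retention time × ingested samples per second × bytes per sample, with about 1 to 2 bytes per sample after compression. The sketch below applies that formula. The series count and the 2-byte figure are assumptions chosen for illustration, not measurements.

```python
def needed_disk_bytes(retention_days, active_series, scrape_interval_seconds, bytes_per_sample=2.0):
    """Estimate TSDB disk usage from retention and ingestion rate."""
    samples_per_second = active_series / scrape_interval_seconds
    retention_seconds = retention_days * 86400
    return retention_seconds * samples_per_second * bytes_per_sample


# Assumed workload: 3 million active series scraped every 30 s, kept for 15 days.
estimate = needed_disk_bytes(retention_days=15, active_series=3_000_000, scrape_interval_seconds=30)
print(f"{estimate / 1e9:.1f} GB")
```

An estimate like this is only a starting point. The WAL and compaction need additional headroom, which is why `retentionSize` is normally set below the PVC size.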
10 How is NetworkPolicy configured for kube-prometheus-stack?

Answer:

kube-prometheus-stack uses NetworkPolicy to control network access between monitoring components and keep them isolated.

Default network policy:

# Allow Prometheus to reach metrics endpoints in all namespaces
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-prometheus-scraping
spec:
  podSelector: {}
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring
          podSelector:
            matchLabels:
              app.kubernetes.io/name: prometheus
      ports:
        - port: 9090
        - port: 10250   # kubelet
        - port: 10255   # kubelet-readonly

Inter-component communication:

Prometheus → ServiceMonitor targets (all namespaces)
Prometheus → Alertmanager (monitoring)
Grafana → Prometheus (monitoring)
Grafana → Alertmanager (monitoring)
Alertmanager → Webhook/Slack (external)

Recommended policies:
  1. Allow all traffic within the monitoring namespace
  2. Allow egress from Prometheus only to the target namespaces
  3. Allow ingress to the monitoring stack only through Grafana
  4. Block external access to the Prometheus API
11 How does kube-prometheus-stack integrate Kube State Metrics?

Answer:

kube-prometheus-stack bundles Kube State Metrics (KSM) by default to collect state metrics for Kubernetes objects.

Metrics collected by KSM:

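| Resource | Core metrics | Purpose |
|---|---|---|
| Node | kube_node_status_condition, kube_node_status_capacity | Node health, resource capacity |
| Pod | kube_pod_status_phase, kube_pod_container_* | Pod running state, restart counts |
| Deployment | kube_deployment_* | Desired and available replicas |
| StatefulSet | kube_statefulset_* | Ready replicas |
| DaemonSet | kube_daemonset_* | Desired / ready / scheduled replicas |
| Service | kube_service_* | Service inventory |
| Namespace | kube_namespace_* | Namespace status |
| PersistentVolume | kube_persistentvolume_*, kube_persistentvolumeclaim_* | Volume status and capacity |
| Endpoint | kube_endpoint_* | Endpoint status |
| HorizontalPodAutoscaler | kube_horizontalpodautoscaler_* | Current / desired HPA replicas |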

Configuration:

kube-state-metrics:
  enabled: true
  collectors:
    - deployments
    - pods
    - nodes
    - statefulsets
    - daemonsets
    - persistentvolumeclaims
    - persistentvolumes
  metricLabelsAllowlist:
    - pods=[*]
    - nodes=[*]
    - deployments=[app,version]
  namespace: monitoring
  resources:
    limits:
      memory: 512Mi
      cpu: 200m

Key alerting rules:

# Alerts based on KSM metrics
- alert: KubeDeploymentReplicasMismatch
  expr: (kube_deployment_spec_replicas - kube_deployment_status_replicas_available) > 0
  for: 10m
  labels:
    severity: warning

- alert: KubePodCrashLooping
  expr: rate(kube_pod_container_status_restarts_total[5m]) > 0
  for: 5m
  labels:
    severity: critical
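12 What multi-cluster monitoring options does kube-prometheus-stack offer?

Answer:

kube-prometheus-stack supports several multi-cluster monitoring architectures, ranging from federation to a global view.

Option 1: Prometheus federation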

```mermaid
graph TD
    A["Cluster A: Prometheus-A"] -->|"/federate"| GLOBAL["Global Prometheus"]
    B["Cluster B: Prometheus-B"] -->|"/federate"| GLOBAL
    GLOBAL --> GRAFANA["Grafana (Global)"]
```

**Option 2: global aggregation with Thanos/VictoriaMetrics**

```mermaid
graph TD
    A["Cluster A: kube-prometheus-stack"] -->|"Remote Write"| GLOBAL["Global Thanos / VictoriaMetrics"]
    B["Cluster B: kube-prometheus-stack"] -->|"Remote Write"| GLOBAL
    GLOBAL --> GRAFANA["Grafana (Global)"]
```

**Multi-cluster configuration:**

```yaml
# Cluster A
prometheus:
  prometheusSpec:
    externalLabels:
      cluster: cluster-a
      environment: production
    remoteWrite:
      - url: http://thanos-receiver:19291/api/v1/receive

# Cluster B
prometheus:
  prometheusSpec:
    externalLabels:
      cluster: cluster-b
      environment: production
    remoteWrite:
      - url: http://thanos-receiver:19291/api/v1/receive
```

Unified multi-cluster view in Grafana:

grafana:
  datasources:
    datasources.yaml:
      apiVersion: 1
      datasources:
        - name: Thanos
          type: prometheus
          url: http://thanos-query:9090
          access: proxy
          isDefault: true
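
The `cluster` external label is what keeps series from different clusters apart once they land in the same global store. The toy model below illustrates why: it keys each series by its full label set, the way a time series database identifies a series. The function name and sample data are hypothetical.

```python
def merge_series(samples_by_cluster, external_label=None):
    """Merge per-cluster samples into one store keyed by the full label set."""
    merged = {}
    for cluster, samples in samples_by_cluster.items():
        for labels, value in samples:
            identity = dict(labels)
            if external_label is not None:
                # Attach the external label before the sample leaves the cluster.
                identity[external_label] = cluster
            merged[frozenset(identity.items())] = value
    return merged


samples = {
    "cluster-a": [({"__name__": "up", "job": "node-exporter"}, 1.0)],
    "cluster-b": [({"__name__": "up", "job": "node-exporter"}, 0.0)],
}

without_label = merge_series(samples)
with_label = merge_series(samples, external_label="cluster")
print(len(without_label), len(with_label))
```

Without the label the two clusters collapse into a single series and one value overwrites the other. With it, each cluster keeps its own series and Grafana can filter on `cluster`.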
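13 Which performance metrics does kube-prometheus-stack expose about the monitoring stack itself?

Answer:

kube-prometheus-stack monitors the performance of Prometheus itself. The key metrics fall into three areas: scraping, storage, and querying.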

Scrape performance:

# Actual interval between scrapes
prometheus_target_interval_length_seconds

# Scrape failure rate
rate(prometheus_target_scrapes_total{job="prometheus"}[5m])

# Scrape latency
prometheus_target_scrape_duration_seconds

# Scrapes rejected for exceeding the sample limit
rate(prometheus_target_scrapes_exceeded_sample_limit_total[5m])

Storage performance:

# Number of series in the TSDB Head
prometheus_tsdb_head_series

# Number of loaded blocks
prometheus_tsdb_blocks_loaded

# WAL write rate
rate(prometheus_tsdb_wal_written_bytes_total[5m])

# Storage size
prometheus_tsdb_storage_blocks_bytes

Query performance:

# Query latency
prometheus_engine_query_duration_seconds

# Maximum number of concurrent queries
prometheus_engine_queries_concurrent_max

# Query queue
prometheus_engine_query_queue_length

# Failed queries
rate(prometheus_engine_queries_failed_total[5m])

Operator performance:

# Reconcile errors
prometheus_operator_reconcile_errors_total

# Reconcile latency
prometheus_operator_reconcile_duration_seconds

# Reconcile count
prometheus_operator_reconcile_operations_total
14 How do you configure Grafana data sources in kube-prometheus-stack?

Answer:

kube-prometheus-stack provisions Grafana data sources through Helm values or a ConfigMap.

Default data sources:

grafana:
  datasources:
    datasources.yaml:
      apiVersion: 1
      datasources:
        - name: Prometheus
          type: prometheus
          url: http://prometheus-operated:9090
          access: proxy
          isDefault: true
          editable: false
        - name: Alertmanager
          type: alertmanager
          url: http://alertmanager-operated:9093
          access: proxy
          isDefault: false
          editable: false

Adding extra data sources:

grafana:
  additionalDataSources:
    - name: Loki
      type: loki
      url: http://loki:3100
      access: proxy
    - name: Tempo
      type: tempo
      url: http://tempo:3200
      access: proxy
    - name: Jaeger
      type: jaeger
      url: http://jaeger-query:16686
      access: proxy

Managing sensitive values through a Secret:

apiVersion: v1
kind: Secret
metadata:
  name: grafana-datasources
  namespace: monitoring
stringData:
  datasources.yaml: |
    apiVersion: 1
    datasources:
      - name: CloudWatch
        type: cloudwatch
        jsonData:
          authType: keys
          defaultRegion: us-east-1
        secureJsonData:
          accessKey: <access-key>
          secretKey: <secret-key>    
15 How do you add extra scrape configs to Prometheus in kube-prometheus-stack?

Answer:

For scrape scenarios that the Prometheus Operator CRDs cannot cover, kube-prometheus-stack supports additional scrape configs.

Configuration:

prometheus:
  prometheusSpec:
    additionalScrapeConfigs:
      - job_name: 'kube-controller-manager'
        kubernetes_sd_configs:
          - role: endpoints
        relabel_configs:
          - source_labels: [__meta_kubernetes_endpoint_port_name]
            action: keep
            regex: "http-metrics"
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: namespace
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

      - job_name: 'etcd'
        kubernetes_sd_configs:
          - role: endpoints
        scheme: https
        tls_config:
          ca_file: /etc/kubernetes/pki/etcd/ca.crt
          cert_file: /etc/kubernetes/pki/etcd/server.crt
          key_file: /etc/kubernetes/pki/etcd/server.key
        relabel_configs:
          - source_labels: [__meta_kubernetes_endpoint_port_name]
            action: keep
            regex: "etcd"

Managing them through a Secret:

apiVersion: v1
kind: Secret
metadata:
  name: additional-scrape-configs
  namespace: monitoring
stringData:
  prometheus-additional.yaml: |
    - job_name: 'external-service'
      static_configs:
        - targets: ['external-service.example.com:9090']
      basic_auth:
        username: 'admin'
        password: 'password'    

prometheus:
  prometheusSpec:
    additionalScrapeConfigsSecret:
      enabled: true
      name: additional-scrape-configs
      key: prometheus-additional.yaml
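16 What is the upgrade strategy for kube-prometheus-stack, and what should you watch out for?

Answer:

Upgrading kube-prometheus-stack involves CRD changes, configuration migration, and component version updates, so it should follow a defined procedure.

Upgrade procedure: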

1. Back up the current configuration
   helm get values prometheus-stack > backup.yaml

2. Update the Helm repo
   helm repo update

3. Check for CRD changes
   kubectl get crd | grep monitoring.coreos.com

4. Upgrade the CRDs (some versions require a manual step)
   kubectl apply -f https://...

5. Upgrade the chart
   helm upgrade prometheus-stack prometheus-community/kube-prometheus-stack -f values.yaml

6. Verify the upgrade
   kubectl get pods -n monitoring
   kubectl get prometheus -n monitoring

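Upgrade considerations:

CRD compatibility:
  - Check for API version changes (v1alpha1 → v1)
  - Some CRDs must be applied manually
  - Helm does not upgrade CRDs automatically

Grafana version:
  - Watch for Grafana major-version upgrades
  - Plugin compatibility
  - Dashboard index changes

Prometheus version:
  - PromQL syntax changes
  - Remote Write protocol version
  - Alerting rule compatibility

Data retention:
  - An upgrade does not discard existing data
  - Taking a snapshot backup before upgrading is still recommended
  - Check that the storage configuration is correct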
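17 How compatible is kube-prometheus-stack with Pod Security Policy?

Answer:

The components of kube-prometheus-stack run with different security contexts, so PSP/PSA environments need specific configuration.

Component security requirements: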

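| Component | Security context | Notes |
|---|---|---|
| Prometheus | runAsUser: 1000 | Writes to the storage volume |
| Alertmanager | runAsUser: 1000 | Stores data |
| Grafana | runAsUser: 472 | Dashboard persistence |
| Node Exporter | hostPID, hostNetwork | Node metrics collection |
| Kube State Metrics | non-root | Read-only API access |
| Prometheus Operator | non-root | Creates Pods |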

Pod Security Admission configuration:

# Labels on the monitoring namespace
metadata:
  name: monitoring
  labels:
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/audit: privileged
    pod-security.kubernetes.io/warn: baseline

# Or use SecurityContextConstraints (OpenShift)
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: prometheus-scc
allowHostPID: true
allowHostNetwork: true
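18 How is Kubernetes event monitoring implemented with kube-prometheus-stack?

Answer:

kube-prometheus-stack monitors Kubernetes events through kube-state-metrics and an event exporter.

Event collection approaches: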

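| Approach | Tool | Type | Description |
|---|---|---|---|
| Events to metrics | kube-state-metrics | Metrics | Event counts and state |
| Events to logs | Event exporter | Logs | Detailed event collection |
| Events to alerts | PrometheusRule | Alerts | Event-based alerting |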

Event alerting rules:

- alert: KubePodCrashLooping
  expr: rate(kube_pod_container_status_restarts_total[10m]) > 0
  for: 5m
  labels:
    severity: critical

- alert: KubeNodeNotReady
  expr: kube_node_status_condition{condition="Ready",status="true"} == 0
  for: 5m
  labels:
    severity: critical

- alert: KubePersistentVolumeFillingUp
  expr: (kubelet_volume_stats_available_bytes / kubelet_volume_stats_capacity_bytes) < 0.1
  for: 5m
  labels:
    severity: critical

Event exporter deployment:

events-exporter:
  enabled: true
  config:
    exporters:
      - type: prometheus
    routes:
      - match:
          - severity: Warning
            type: BackOff
          - severity: Warning
            type: Failed
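19 How do you configure alert notification templates in kube-prometheus-stack?

Answer:

kube-prometheus-stack customizes the format of alert notifications through the Alertmanager configuration and its template system.

Built-in templates: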

alertmanager:
  config:
    global:
      slack_api_url: '<webhook>'
    route:
      receiver: 'default'
    receivers:
      - name: 'default'
        slack_configs:
          - channel: '#alerts'
            title: '{{ template "slack.title" . }}'
            text: '{{ template "slack.text" . }}'
    templates:
      - '/etc/alertmanager/config/template_*.tmpl'

Custom templates:

alertmanager:
  templateFiles:
    custom.tmpl: |
      {{ define "slack.title" }}
        [{{ .Status | toUpper }}] {{ .GroupLabels.alertname }}
      {{ end }}

      {{ define "slack.text" }}
        *Alert details*
        > Alertmanager: {{ .ExternalURL }}
        > Alert name: {{ .GroupLabels.alertname }}
        > Severity: {{ .CommonLabels.severity }}
        > Alerts:
        {{ range .Alerts }}
          > Started at: {{ .StartsAt.Format "2006-01-02 15:04:05" }}
          > {{ .Annotations.summary }}
          > {{ .Annotations.description }}
        {{ end }}
      {{ end }}

      {{ define "email.subject" }}
        [{{ .Status | toUpper }}] {{ .GroupLabels.alertname }} - {{ .GroupLabels.severity }}
      {{ end }}

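Template variables:

| Variable | Description |
|---|---|
| `{{ .Status }}` | firing / resolved |
| `{{ .Alerts }}` | List of alerts |
| `{{ .GroupLabels }}` | Grouping labels |
| `{{ .CommonLabels }}` | Labels common to all alerts in the group |
| `{{ .ExternalURL }}` | Alertmanager external URL |
| `{{ .StartsAt }}` | Start time (per alert, inside `range .Alerts`) |
| `{{ .EndsAt }}` | End time (per alert, inside `range .Alerts`) |
| `{{ .Annotations }}` | Annotations (per alert, inside `range .Alerts`) |
| `{{ .Labels }}` | Labels (per alert, inside `range .Alerts`) |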
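20 How is persistent storage configured in kube-prometheus-stack?

Answer:

Each kube-prometheus-stack component uses a storage configuration suited to the data it holds.

Component storage requirements: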

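| Component | Needs persistence | Storage type | Notes |
|---|---|---|---|
| Prometheus | Yes | PVC (SSD) | TSDB data, IOPS-sensitive |
| Alertmanager | Yes | PVC | Silences and notification state |
| Grafana | Recommended | PVC | Dashboard persistence |
| Node Exporter | No | - | Stateless |
| KSM | No | - | Stateless |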

Prometheus storage configuration:

prometheus:
  prometheusSpec:
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: premium-rwo
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 200Gi
    # Retention limits (time and size)
    retention: 15d
    retentionSize: 170GB
    # WAL settings
    walCompression: true

Grafana storage configuration:

grafana:
  persistence:
    enabled: true
    storageClassName: standard-rwo
    accessModes: ["ReadWriteOnce"]
    size: 10Gi
  sidecar:
    dashboards:
      enabled: true
      label: grafana_dashboard

Alertmanager storage configuration:

alertmanager:
  alertmanagerSpec:
    storage:
      volumeClaimTemplate:
        spec:
          storageClassName: standard-rwo
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 10Gi
21 How do you customize Prometheus startup parameters in kube-prometheus-stack?

Answer:

kube-prometheus-stack passes native Prometheus startup parameters through the prometheusSpec fields.

Configuration:

prometheus:
  prometheusSpec:
    # Basic parameters
    retention: 30d
    retentionSize: 100GB
    scrapeInterval: 30s
    evaluationInterval: 30s

    # External labels
    externalLabels:
      cluster: production
      region: us-east-1

    # Remote write
    remoteWrite:
      - url: http://victoriametrics:8428/api/v1/write
        queueConfig:
          capacity: 10000
          maxSamplesPerSend: 1000
          batchSendDeadline: 5s

    # Remote read
    remoteRead:
      - url: http://victoriametrics:8428/api/v1/read

    # Query parameters
    query:
      maxConcurrency: 50
      maxSamples: 50000000
      timeout: 5m

    # Feature flags (snapshot the in-memory Head on shutdown for faster restarts)
    enableFeatures:
      - memory-snapshot-on-shutdown

    # TSDB parameters
    tsdb:
      outOfOrderTimeWindow: 30s
      enableExemplarStorage: true
      exemplarsRetention: 7d

Passing flags via additionalArgs:

prometheus:
  prometheusSpec:
    additionalArgs:
      - name: storage.tsdb.retention.size
        value: "100GB"
      - name: web.enable-lifecycle
        value: "true"
      - name: web.external-url
        value: "https://prometheus.example.com"
22 How does kube-prometheus-stack manage the persistent Prometheus WAL?

Answer:

kube-prometheus-stack supports persisting the WAL so that data survives a Prometheus restart.

WAL configuration:

prometheus:
  prometheusSpec:
    # WAL compression (enabled by default)
    walCompression: true

    # Storage spec
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: ssd
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 100Gi
      # Dedicated WAL storage (optional)
      walVolumeClaimTemplate:
        spec:
          storageClassName: fast-ssd
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 20Gi

WAL monitoring:

# WAL write rate
rate(prometheus_tsdb_wal_written_bytes_total[5m])

# Index of the WAL segment currently being written
prometheus_tsdb_wal_segment_current

# WAL truncation duration
prometheus_tsdb_wal_truncate_duration_seconds

# WAL corruption events
prometheus_tsdb_wal_corruptions_total

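WAL recovery scenario:

Abnormal shutdown → Prometheus restarts:
  1. Scan the WAL directory
  2. Replay data that has not yet been compacted into blocks into the Head
  3. Discard corrupted WAL segments
  4. Rebuild the in-memory time series
  5. Resume scraping normally

Recovery performance:
  Replaying 1 hour of WAL ≈ a few minutes
  Depends on WAL size and series count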
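
The recovery steps above can be pictured with a toy replay loop. This is a deliberately simplified model and not the Prometheus implementation: segments are plain Python lists, a record is a `(series, timestamp, value)` tuple, and `None` stands in for a corrupted record.

```python
def replay_wal(segments):
    """Replay records segment by segment and stop at the first corruption."""
    head = {}
    for index, segment in enumerate(segments):
        for record in segment:
            if record is None:
                # Everything from the corruption onward is discarded.
                return head, index
            series, timestamp, value = record
            head.setdefault(series, []).append((timestamp, value))
    return head, None


segments = [
    [("up", 1, 1.0), ("up", 2, 1.0)],
    [("up", 3, 1.0), None, ("up", 4, 1.0)],
    [("up", 5, 1.0)],
]

head, corrupted_segment = replay_wal(segments)
print(head["up"], corrupted_segment)
```

Samples written before the corruption are restored into the Head and later ones are lost, which is why `prometheus_tsdb_wal_corruptions_total` is worth alerting on.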
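23 How do you monitor etcd with kube-prometheus-stack?

Answer:

kube-prometheus-stack scrapes etcd metrics through an additional scrape config, which requires etcd certificate authentication.

etcd metrics ports:

etcd default metrics port: 2381 (v3.5+)
etcd secure metrics port: 2382

Scrape configuration: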

prometheus:
  prometheusSpec:
    # Mounts the Secret at /etc/prometheus/secrets/etcd-certs
    secrets:
      - etcd-certs
    additionalScrapeConfigs:
      - job_name: 'etcd'
        kubernetes_sd_configs:
          - role: endpoints
        scheme: https
        tls_config:
          ca_file: /etc/prometheus/secrets/etcd-certs/ca.crt
          cert_file: /etc/prometheus/secrets/etcd-certs/server.crt
          key_file: /etc/prometheus/secrets/etcd-certs/server.key
          insecure_skip_verify: true
        relabel_configs:
          - source_labels: [__meta_kubernetes_endpoint_port_name]
            action: keep
            regex: "etcd|etcd-metrics|metrics"
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: namespace
          - source_labels: [__meta_kubernetes_pod_name]
            action: replace
            target_label: pod

# Certificate Secret
apiVersion: v1
kind: Secret
metadata:
  name: etcd-certs
  namespace: monitoring
type: Opaque
stringData:
  ca.crt: <etcd-ca-cert>
  server.crt: <etcd-server-cert>
  server.key: <etcd-server-key>

etcd alerting rules:

- alert: EtcdLeaderChanges
  expr: rate(etcd_server_leader_changes_seen_total[5m]) > 0
  for: 5m
  labels:
    severity: critical

- alert: EtcdHighFsyncDurations
  expr: histogram_quantile(0.99, rate(etcd_disk_wal_fsync_duration_seconds_bucket[5m])) > 1
  for: 5m
  labels:
    severity: critical

- alert: EtcdDbSizeHigh
  expr: etcd_mvcc_db_total_size_in_bytes / 1024 / 1024 > 1024
  for: 5m
  labels:
    severity: warning
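
The `EtcdHighFsyncDurations` rule relies on `histogram_quantile`, which estimates a quantile by linear interpolation inside the bucket that contains the target rank. The sketch below reproduces that idea on static cumulative counts. It is a simplified illustration: the bucket values are invented, and in the real rule the counts come from `rate()` over the bucket counters.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate quantile q from (upper_bound, cumulative_count) pairs.

    The list must include the +Inf bucket. Intended for 0 < q <= 1.
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    index = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    upper, count = buckets[index]
    if math.isinf(upper):
        # The quantile falls in the +Inf bucket: report the highest finite bound.
        return buckets[-2][0]
    lower, previous = (0.0, 0.0) if index == 0 else buckets[index - 1]
    if count == previous:
        return upper
    return lower + (upper - lower) * (rank - previous) / (count - previous)


# Invented fsync latency distribution: 100 observations in total.
fsync_buckets = [(0.5, 90), (1.0, 98), (2.0, 100), (math.inf, 100)]
p99 = histogram_quantile(0.99, fsync_buckets)
print(p99, p99 > 1)
```

Because the result is interpolated, its precision depends on the bucket layout: a p99 of 1.5 s here only says the value lies somewhere between the 1 s and 2 s bounds.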
24 How do you customize the Grafana configuration file in kube-prometheus-stack?

Answer:

kube-prometheus-stack customizes the Grafana configuration through Helm values and ConfigMaps.

grafana.ini configuration:

grafana:
  grafana.ini:
    server:
      root_url: https://grafana.example.com
      domain: grafana.example.com
    auth:
      disable_login_form: false
    auth.ldap:
      enabled: true
      config_file: /etc/grafana/ldap.toml
    auth.generic_oauth:
      enabled: true
      client_id: grafana
      client_secret: <secret>
      auth_url: https://auth.example.com/oauth/authorize
      token_url: https://auth.example.com/oauth/token
      api_url: https://auth.example.com/api/userinfo
    security:
      admin_user: admin
      admin_password: <strong-password>
    smtp:
      enabled: true
      host: smtp.example.com:587
      user: grafana@example.com
      password: <password>
      from_address: grafana@example.com
    log:
      mode: console
      level: info
    analytics:
      reporting_enabled: false

LDAP configuration:

grafana:
  ldap:
    enabled: true
    config: |
      [[servers]]
      host = "ldap.example.com"
      port = 389
      use_ssl = false
      start_tls = true
      bind_dn = "cn=admin,dc=example,dc=com"
      bind_password = <password>
      search_filter = "(sAMAccountName=%s)"
      search_base_dns = ["dc=example,dc=com"]      

Plugin configuration:

grafana:
  plugins:
    - grafana-piechart-panel
    - grafana-worldmap-panel
    - grafana-clock-panel
    - natel-discrete-panel
25 How do you configure HPA based on Prometheus metrics with kube-prometheus-stack?

Answer:

kube-prometheus-stack relies on Prometheus Adapter to expose Prometheus metrics through the Kubernetes Custom Metrics API for HPA to consume.

End-to-end path:

Pod → Prometheus → Prometheus Adapter → K8s API Server → HPA

Prometheus Adapter configuration:

prometheus-adapter:
  enabled: true
  prometheus:
    url: http://prometheus-operated:9090
    port: 9090
  rules:
    default: false
    custom:
      - seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
        resources:
          overrides:
            namespace: {resource: "namespace"}
            pod: {resource: "pod"}
        name:
          matches: "^(.*)_total$"
          as: "${1}_per_second"
        metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'

      - seriesQuery: 'redis_up{namespace!="",pod!=""}'
        resources:
          overrides:
            namespace: {resource: "namespace"}
            pod: {resource: "pod"}
        name:
          as: "redis_up"
        metricsQuery: 'avg(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'

HPA configuration:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: 100
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
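
How the HPA turns the metric into a replica count follows the formula in the Kubernetes documentation: `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`, skipped when the ratio is within a tolerance (0.1 by default) and clamped to the min/max bounds. A minimal sketch of that calculation for a single metric, with an illustrative function name and sample values:

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas, max_replicas, tolerance=0.1):
    """Single-metric HPA calculation with tolerance and min/max clamping."""
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to the target: no scaling
    proposed = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, proposed))


# Target is 100 requests/s per Pod, bounds are 2..10 as in the HPA above.
scale_up = desired_replicas(2, 250, 100, 2, 10)    # 2 * 2.5 = 5
steady = desired_replicas(2, 105, 100, 2, 10)      # within tolerance
capped = desired_replicas(2, 2000, 100, 2, 10)     # 40 proposed, capped at 10
print(scale_up, steady, capped)
```

When several metrics are listed, as with the request rate and CPU above, the HPA evaluates each one and uses the largest resulting replica count.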
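26 How do you achieve zero-downtime upgrades with kube-prometheus-stack?

Answer:

kube-prometheus-stack achieves zero-downtime upgrades through multiple replicas, rolling updates, and persistent data.

Prerequisites for zero downtime: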

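| Condition | Configuration | Notes |
|---|---|---|
| Multiple Prometheus replicas | replicas: 2+ | One replica upgrades while the other keeps serving |
| Data persistence | storageSpec | Data survives restarts |
| Multiple Grafana replicas | replicas: 2+ | Requires shared storage or an external database |
| Alertmanager cluster | cluster | Alert deduplication and state synchronization |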

Prometheus rolling update:

prometheus:
  prometheusSpec:
    replicas: 2
    # Update strategy
    podMetadata:
      annotations:
        prometheus.io/should_be_updated: "true"
    # Anti-affinity
    affinity:
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                  - key: app.kubernetes.io/name
                    operator: In
                    values:
                      - prometheus
              topologyKey: kubernetes.io/hostname

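During the upgrade:

1. prometheus-0 is upgraded
   - prometheus-1 keeps serving
   - Scraping pauses on prometheus-0
   - After the restart it recovers from persistent storage

2. prometheus-0 is back
   - prometheus-1 starts upgrading
   - prometheus-0 serves normally

3. Verification
   - Check the state of scrape targets
   - Confirm that alerting rules are loaded
   - Verify data continuity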
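27 How do you monitor cluster certificate expiry with kube-prometheus-stack?

Answer:

kube-prometheus-stack monitors certificate validity through blackbox-exporter and certificate metrics.

Certificate expiry detection approaches: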

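| Approach | Collector | Metric |
|---|---|---|
| TLS certificates | blackbox exporter | probe_ssl_earliest_cert_expiry |
| K8s certificates | kube-state-metrics | kube_secret_metadata_* |
| Custom | textfile collector | Collected by a custom script |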

Blackbox certificate monitoring:

apiVersion: monitoring.coreos.com/v1
kind: Probe
metadata:
  name: tls-certificates
spec:
  module: tls_connect
  prober:
    url: blackbox-exporter:9115
  targets:
    staticConfig:
      static:
        - https://api.example.com:6443
        - https://grafana.example.com:443
        - https://prometheus.example.com:9090

Certificate alerting rules:

- alert: KubernetesCertificateExpirySoon
  expr: avg by (endpoint) (probe_ssl_earliest_cert_expiry - time()) < 30 * 86400
  for: 1h
  labels:
    severity: warning
  annotations:
    summary: "Certificate expires in {{ $value | humanizeDuration }}"

- alert: KubernetesCertificateExpiring
  expr: avg by (endpoint) (probe_ssl_earliest_cert_expiry - time()) < 7 * 86400
  for: 1h
  labels:
    severity: critical
  annotations:
    summary: "Certificate expires in {{ $value | humanizeDuration }}"
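
Both rules compare the remaining lifetime, `probe_ssl_earliest_cert_expiry - time()`, against a threshold in seconds. The sketch below mirrors that comparison with fixed timestamps so the result is reproducible. The function name is hypothetical, and unlike the two rules, which both fire once fewer than 7 days remain, it reports only the highest severity.

```python
DAY = 86400  # seconds


def certificate_severity(not_after_unix, now_unix):
    """Classify a certificate by remaining lifetime, as the two rules do."""
    remaining = not_after_unix - now_unix
    if remaining < 7 * DAY:
        return "critical"
    if remaining < 30 * DAY:
        return "warning"
    return None


now = 1_700_000_000  # fixed reference time for the example
print(certificate_severity(now + 3 * DAY, now))
print(certificate_severity(now + 20 * DAY, now))
print(certificate_severity(now + 90 * DAY, now))
```

A certificate that has already expired yields a negative remaining time and is still classified as critical, which matches the behaviour of the PromQL comparison.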
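28 How do you configure alert inhibition and silences in kube-prometheus-stack?

Answer:

kube-prometheus-stack uses Alertmanager inhibition rules and silences to reduce alert storms.

Inhibition rule configuration: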

alertmanager:
  config:
    inhibit_rules:
      # Cluster-level failures inhibit node-level alerts
      - source_match:
          severity: "critical"
          alertname: "KubeNodeNotReady"
        target_match:
          severity: "warning"
        equal: ["cluster"]

      # Node failures inhibit Pod-level alerts
      - source_match:
          alertname: "KubeNodeNotReady"
        target_match:
          alertname: "KubePodNotReady"
        equal: ["node"]

      # Higher severity inhibits lower severity
      - source_match:
          severity: "critical"
        target_match:
          severity: "info"
        equal: ["namespace", "cluster"]

  # Configure through the AlertmanagerConfig CRD
  alertmanagerSpec:
    alertmanagerConfigSelector:
      matchLabels:
        app: myapp

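Inhibition logic:

If an alert matching source_match is firing,
and an alert matching target_match has the same values for the labels listed in equal,
then the target alert is inhibited (no notification is sent).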
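
The inhibition logic above can be written out as a short function. This is a simplified sketch that supports only exact label matching: alerts are plain label dictionaries, the rule layout and function names are assumptions, and a label missing on both sides is treated as equal.

```python
def matches(matcher, alert):
    """Exact-match semantics: every matcher label must equal the alert's label."""
    return all(alert.get(name) == value for name, value in matcher.items())


def is_inhibited(target, firing_alerts, rule):
    """Return True when some firing source alert inhibits the target alert."""
    if not matches(rule["target_match"], target):
        return False
    for source in firing_alerts:
        if source is target:
            continue  # an alert does not inhibit itself
        same_equal_labels = all(source.get(l) == target.get(l) for l in rule["equal"])
        if matches(rule["source_match"], source) and same_equal_labels:
            return True
    return False


rule = {
    "source_match": {"alertname": "KubeNodeNotReady"},
    "target_match": {"alertname": "KubePodNotReady"},
    "equal": ["node"],
}
node_down = {"alertname": "KubeNodeNotReady", "node": "node-1"}
pod_same_node = {"alertname": "KubePodNotReady", "node": "node-1"}
pod_other_node = {"alertname": "KubePodNotReady", "node": "node-2"}
firing = [node_down, pod_same_node, pod_other_node]

print(is_inhibited(pod_same_node, firing, rule))
print(is_inhibited(pod_other_node, firing, rule))
```

Only the Pod alert on the failed node is suppressed. The `equal` list is what keeps an unrelated node's Pod alerts flowing.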

Silence management:

# Create a silence (2 hours)
amtool silence add \
  --alertmanager.url=http://alertmanager:9093 \
  --author="admin" \
  --comment="maintenance window" \
  --duration=2h \
  alertname="NodeCPUUsageHigh"

# Silences are removed automatically after they expire
# List active silences
amtool silence query --alertmanager.url=http://alertmanager:9093
29 How do you implement RBAC permission isolation in kube-prometheus-stack?

Answer:

kube-prometheus-stack uses Kubernetes RBAC to control which teams can access monitoring data.

RBAC model:

ClusterRole: prometheus-viewer
  - get /api/v1/query
  - list /api/v1/targets
  - Read-only access

ClusterRole: prometheus-admin
  - POST /api/v1/admin/tsdb/snapshot
  - POST /api/v1/admin/tsdb/delete_series
  - Administrative access

Read-only role:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-viewer
rules:
  - nonResourceURLs:
      - /api/v1/query
      - /api/v1/query_range
      - /api/v1/labels
      - /api/v1/series
      - /api/v1/targets
      - /api/v1/rules
      - /api/v1/alerts
    verbs:
      - get
      - list

Grafana permission isolation:

grafana:
  grafana.ini:
    auth.proxy:
      enabled: true
      header_name: X-WEBAUTH-USER
      header_property: username
      sync_ttl: 60m
    auth:
      oauth_auto_login: true

  # LDAP group to organization role mapping
  ldap:
    config: |
      [[servers]]
      ...
      [[servers.group_mappings]]
      group_dn = "cn=devops,ou=groups,dc=example,dc=com"
      org_role = "Admin"
      [[servers.group_mappings]]
      group_dn = "cn=viewer,ou=groups,dc=example,dc=com"
      org_role = "Viewer"      

Namespace-level access:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: namespace-monitor
  namespace: myapp
rules:
  - apiGroups: [""]
    resources: ["services", "pods", "endpoints"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["monitoring.coreos.com"]
    resources: ["servicemonitors", "podmonitors"]
    verbs: ["get", "list"]
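30 What are common troubleshooting methods for kube-prometheus-stack?

Answer:

Troubleshooting kube-prometheus-stack covers four areas: Operator state, scrape targets, storage, and queries.

Troubleshooting procedure: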

1. Check the Operator
   kubectl get pods -n monitoring | grep operator
   kubectl logs -n monitoring prometheus-operator-xxx

2. Check the Prometheus instance
   kubectl get prometheus -n monitoring
   kubectl describe prometheus k8s -n monitoring

3. Check scrape targets
   kubectl port-forward -n monitoring prometheus-k8s-0 9090
   # Open /targets to see the scrape state

4. Check rule loading
   # Open /rules
   # Open /api/v1/rules

5. Check Alertmanager
   kubectl get alertmanager -n monitoring
   kubectl port-forward -n monitoring alertmanager-main-0 9093
   # Open /#/status

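Common problems:

| Symptom | Likely cause | How to investigate |
|---|---|---|
| Target missing or up == 0 | ServiceMonitor labels do not match | kubectl describe servicemonitor |
| Missing metrics | Relabeling filtered the metrics out | Check metric_relabel_configs |
| Alert does not fire | Rules are not loaded | Check /api/v1/rules |
| No data in Grafana | Data source misconfigured | Check the Grafana datasource |
| Prometheus OOM | High cardinality | Check prometheus_tsdb_head_series |
| Adapter not responding | PromQL query timeout | Check the adapter logs |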
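
For the "missing metrics" row, it helps to remember that Prometheus anchors relabeling regexes on both ends, so a `drop` rule removes a metric only when the whole name matches. The sketch below imitates that with Python's `fullmatch`. It is an approximation: Prometheus uses RE2 syntax rather than Python's `re`, and the metric names are examples.

```python
import re


def apply_drop(metric_names, regex):
    """Keep the metrics a fully anchored `action: drop` rule would not remove."""
    pattern = re.compile(regex)
    return [name for name in metric_names if pattern.fullmatch(name) is None]


scraped = ["go_goroutines", "go_gc_duration_seconds", "http_requests_total", "cargo_loaded_total"]

kept = apply_drop(scraped, r"go_.*")
kept_without_wildcard = apply_drop(scraped, r"go_")
print(kept)
print(kept_without_wildcard)
```

The pattern `go_.*` removes only names that start with `go_`, while a bare `go_` removes nothing because no metric name is exactly `go_`. An overly broad pattern such as `.*go.*` would also remove `cargo_loaded_total`, which is a common cause of metrics disappearing unexpectedly.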

Diagnostic commands:

# List all CRD instances
kubectl get prometheus,alertmanager,servicemonitor,podmonitor,prometheusrule,probe -A

# Inspect the ServiceMonitor that produces the targets
kubectl describe servicemonitor -n monitoring myapp

# Test rule loading
kubectl exec -n monitoring prometheus-k8s-0 -- wget -qO- http://localhost:9090/api/v1/rules

# Check alert state
kubectl exec -n monitoring alertmanager-main-0 -- amtool alert query --alertmanager.url=http://localhost:9093