kops 'protectKernelDefaults' flag and 'EventRateLimit' admission plugin not working

Problem description

I am trying to implement some of the CIS security benchmark recommendations for self-hosted Kubernetes on AWS, running Kubernetes version 1.21.4, via kOps (1.21.0).

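However, when I set protectKernelDefaults: true in the kubelet config and enable the EventRateLimit admission plugin in the kube-apiserver config, the k8s cluster fails to come up. I am creating a new cluster with these settings, not updating an existing one.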

The kOps cluster YAML I am trying to use is:

apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  name: k8s.sample.com
spec:
  cloudLabels:
    team_number: "0"
    environment: "dev"
  api:
    loadBalancer:
      type: Internal
      additionalSecurityGroups:
        - sg-id
    crossZoneLoadBalancing: false
    dns: { }
  authorization:
    rbac: { }
  channel: stable
  cloudProvider: aws
  configBase: s3://state-data/k8s.sample.com
  etcdClusters:
    - cpuRequest: 200m
      etcdMembers:
        - encryptedVolume: true
          instanceGroup: master-eu-west-3a
          name: a
      memoryRequest: 100Mi
      name: main
      env:
        - name: ETCD_MANAGER_HOURLY_BACKUPS_RETENTION
          value: 2d
        - name: ETCD_MANAGER_DAILY_BACKUPS_RETENTION
          value: 1m
        - name: ETCD_LISTEN_METRICS_URLS
          value: http://0.0.0.0:8081
        - name: ETCD_METRICS
          value: basic
    - cpuRequest: 100m
      etcdMembers:
        - encryptedVolume: true
          instanceGroup: master-eu-west-3a
          name: a
      memoryRequest: 100Mi
      name: events
      env:
        - name: ETCD_MANAGER_HOURLY_BACKUPS_RETENTION
          value: 2d
        - name: ETCD_MANAGER_DAILY_BACKUPS_RETENTION
          value: 1m
        - name: ETCD_LISTEN_METRICS_URLS
          value: http://0.0.0.0:8081
        - name: ETCD_METRICS
          value: basic
  iam:
    allowContainerRegistry: true
    legacy: false
  kubeControllerManager:
    enableProfiling: false
    logFormat: json
  kubeScheduler:
    logFormat: json
    enableProfiling: false
  kubelet:
    anonymousAuth: false
    logFormat: json
    protectKernelDefaults: true
    tlsCipherSuites: [ TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_GCM_SHA256 ]
  kubeAPIServer:
    auditLogMaxAge: 7
    auditLogMaxBackups: 1
    auditLogMaxSize: 25
    auditLogPath: /var/log/kube-apiserver-audit.log
    auditPolicyFile: /srv/kubernetes/audit/policy-config.yaml
    enableProfiling: false
    logFormat: json
    enableAdmissionPlugins:
      - NamespaceLifecycle
      - LimitRanger
      - ServiceAccount
      - PersistentVolumeLabel
      - DefaultStorageClass
      - DefaultTolerationSeconds
      - MutatingAdmissionWebhook
      - ValidatingAdmissionWebhook
      - NodeRestriction
      - ResourceQuota
      - AlwaysPullImages
      - EventRateLimit
      - SecurityContextDeny
  fileAssets:
    - name: audit-policy-config
      path: /srv/kubernetes/audit/policy-config.yaml
      roles:
        - Master
      content: |
        apiVersion: audit.k8s.io/v1
        kind: Policy
        rules:
        - level: Metadata
  kubernetesVersion: 1.21.4
  masterPublicName: api.k8s.sample.com
  networkID: vpc-id
  sshKeyName: node_key
  networking:
    calico:
      crossSubnet: true
  nonMasqueradeCIDR: 100.64.0.0/10
  subnets:
    - id: subnet-id1
      name: sn_nodes_1
      type: Private
      zone: eu-west-3a
    - id: subnet-id2
      name: sn_nodes_2
      type: Private
      zone: eu-west-3a
    - id: subnet-id3
      name: sn_utility_1
      type: Utility
      zone: eu-west-3a
    - id: subnet-id4
      name: sn_utility_2
      type: Utility
      zone: eu-west-3a
  topology:
    dns:
      type: Private
    masters: private
    nodes: private
  additionalPolicies:
    node: |
      [
        {
          "Effect": "Allow",
          "Action": [
            "kms:CreateGrant",
            "kms:Decrypt",
            "kms:DescribeKey",
            "kms:Encrypt",
            "kms:GenerateDataKey*",
            "kms:ReEncrypt*"
          ],
          "Resource": [
            "arn:aws:kms:region:xxxx:key/s3access"
          ]
        }
      ]
    master: |
      [
        {
          "Effect": "Allow",
          "Action": [
            "kms:CreateGrant",
            "kms:Decrypt",
            "kms:DescribeKey",
            "kms:Encrypt",
            "kms:GenerateDataKey*",
            "kms:ReEncrypt*"
          ],
          "Resource": [
            "arn:aws:kms:region:xxxx:key/s3access"
          ]
        }
      ]

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: k8s.sample.com
  name: master-eu-west-3a
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20210720
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-eu-west-3a
  role: Master
  subnets:
    - sn_nodes_1
    - sn_nodes_2
  detailedInstanceMonitoring: false
  additionalSecurityGroups:
    - sg-id

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: k8s.sample.com
  name: nodes-eu-west-3a
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20210720
  machineType: t3.large
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: nodes-eu-west-3a
  role: Node
  subnets:
    - sn_nodes_1
    - sn_nodes_2
  detailedInstanceMonitoring: false
  additionalSecurityGroups:
    - sg-id

** Note: I have changed some of the values above to remove specific details **

I have also tried the protectKernelDefaults and EventRateLimit settings separately and tried to bring up the cluster. It does not work in either case.

When I try with protectKernelDefaults, ssh into the master node and check the /var/log directory, kube-scheduler.log, kube-proxy.log, kube-controller-manager.log and kube-apiserver.log are all empty.

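When I try with EventRateLimit, ssh into the master node and check the /var/log directory, the API server fails to start and all the other log files show failures saying they cannot connect to the API server. kube-apiserver.log contains the following: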

Log file created at: 2021/08/23 05:35:51
Running on machine: ip-10-100-120-9
Binary: Built with gc go1.16.7 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
Log file created at: 2021/08/23 05:35:54
Running on machine: ip-10-100-120-9
Binary: Built with gc go1.16.7 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
Log file created at: 2021/08/23 05:36:11
Running on machine: ip-10-100-120-9
Binary: Built with gc go1.16.7 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
Log file created at: 2021/08/23 05:36:32
Running on machine: ip-10-100-120-9
Binary: Built with gc go1.16.7 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
I0823 05:36:32.654990       1 flags.go:59] FLAG: --add-dir-header="false"
Log file created at: 2021/08/23 05:37:15
Running on machine: ip-10-100-120-9
Binary: Built with gc go1.16.7 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
Log file created at: 2021/08/23 05:38:44
Running on machine: ip-10-100-120-9
Binary: Built with gc go1.16.7 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
Log file created at: 2021/08/23 05:41:35
Running on machine: ip-10-100-120-9
Binary: Built with gc go1.16.7 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
Log file created at: 2021/08/23 05:46:47
Running on machine: ip-10-100-120-9
Binary: Built with gc go1.16.7 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
Log file created at: 2021/08/23 05:51:57
Running on machine: ip-10-100-120-9
Binary: Built with gc go1.16.7 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
Log file created at: 2021/08/23 05:56:59
Running on machine: ip-10-100-120-9
Binary: Built with gc go1.16.7 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg

Any pointers as to what is happening would be helpful. Thanks in advance.

Tags: amazon-web-services, kubernetes, kops

Solution


The issue with protectKernelDefaults is a bug in kOps. The installation does not set the sysctl settings that the kubelet expects.
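To illustrate which sysctl settings are involved: with protectKernelDefaults enabled, the kubelet refuses to start if certain kernel parameters differ from the values it expects, instead of changing them itself. The fragment below is only a sketch of setting them through the kOps `sysctlParameters` field in the cluster spec. Whether this works around the bug in kOps 1.21.0 is an assumption that has not been verified here, and the listed values are the ones the kubelet is understood to check, so confirm them against your kubelet version.

```yaml
# Sketch only: merge into the existing Cluster spec.
# Assumption: pre-setting these values lets the kubelet pass its
# protectKernelDefaults check on the chosen image.
spec:
  kubelet:
    protectKernelDefaults: true
  sysctlParameters:
    - vm.overcommit_memory=1
    - vm.panic_on_oom=0
    - kernel.panic=10
    - kernel.panic_on_oops=1
    - kernel.keys.root_maxkeys=1000000
    - kernel.keys.root_maxbytes=25000000
```

If the kubelet still fails, its own log on the node (for example via journalctl for the kubelet unit) should name the sysctl whose value does not match.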

The issue with the admission controller is simply a missing admission controller configuration file.
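As a rough sketch of what supplying that file could look like: the EventRateLimit plugin needs a configuration passed to kube-apiserver through its admission control config file. The fragment below reuses the `fileAssets` pattern from the question to place such a file on the master and points `admissionControlConfigFile` at it. The file path, the asset name, and the `qps` and `burst` numbers are illustrative assumptions, and the path must be one that is visible inside the kube-apiserver container.

```yaml
# Sketch only: merge into the existing Cluster spec.
# Path, asset name, and limit values are hypothetical examples.
spec:
  kubeAPIServer:
    admissionControlConfigFile: /srv/kubernetes/admission/admission-config.yaml
  fileAssets:
    - name: admission-config
      path: /srv/kubernetes/admission/admission-config.yaml
      roles:
        - Master
      content: |
        apiVersion: apiserver.config.k8s.io/v1
        kind: AdmissionConfiguration
        plugins:
          - name: EventRateLimit
            configuration:
              apiVersion: eventratelimit.admission.k8s.io/v1alpha1
              kind: Configuration
              limits:
                # One server-wide bucket; tune to your event volume.
                - type: Server
                  qps: 50
                  burst: 100
```

The plugin configuration is inlined under `configuration` here so that only one file has to be shipped; a separate file referenced by `path` is the other common layout.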

