Hosting Kubernetes on AWS EKS is fairly involved. The steps below walk through creating an EKS setup that covers most common use cases.
1 Create an IAM user (as the root user)
Create an IAM user in AWS; grant it only as much permission as it needs.
In the AWS Management Console, click "Add users".
Keep the defaults on the remaining pages. Finally, keep the downloaded CSV file safe: it contains the Access Key and Secret Access Key that the AWS CLI will need.
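If you prefer the command line, the same user can be created with the AWS CLI. A minimal sketch, assuming the user name TestEKSUser used later in this guide (run it with credentials allowed to administer IAM):

# Create the IAM user and an access key pair for programmatic access
aws iam create-user --user-name TestEKSUser
# The output contains AccessKeyId and SecretAccessKey;
# store them safely, they are shown only once
aws iam create-access-key --user-name TestEKSUser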
2 Create policies and roles (as the root user)
2.1 Create the EKS cluster role
Create a role for the EKS cluster, "testEKSClusterRole", with a single attached policy: AmazonEKSClusterPolicy.
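The console wizard sets the trust policy for you; if you script it instead, a sketch like the following creates an equivalent role (the file name eks-trust.json is an assumption):

# Trust policy that lets the EKS service assume the role
cat > eks-trust.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "eks.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
aws iam create-role --role-name testEKSClusterRole --assume-role-policy-document file://eks-trust.json
aws iam attach-role-policy --role-name testEKSClusterRole --policy-arn arn:aws:iam::aws:policy/AmazonEKSClusterPolicy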
2.2 Create the cluster node group role
Create the role "testEKSNodeRole" with the following policies (a CLI sketch follows the list):
AmazonEKSWorkerNodePolicy
AmazonEC2ContainerRegistryReadOnly
AmazonEKS_CNI_Policy
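A sketch of the same role from the CLI; the trust policy lets EC2 instances, i.e. the worker nodes, assume it (ec2-trust.json is an assumed file name):

cat > ec2-trust.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
aws iam create-role --role-name testEKSNodeRole --assume-role-policy-document file://ec2-trust.json
# Attach the three managed policies listed above
for p in AmazonEKSWorkerNodePolicy AmazonEC2ContainerRegistryReadOnly AmazonEKS_CNI_Policy; do
  aws iam attach-role-policy --role-name testEKSNodeRole --policy-arn arn:aws:iam::aws:policy/$p
done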
2.3 Grant the IAM user permissions
The user needs the following four permissions. Alternatively, create a user group, grant the permissions to the group, and add the user to it.
Attach the managed policies "AmazonEC2FullAccess" and "AmazonVPCReadOnlyAccess".
Add a custom policy "TestEKSPolicy" with the following content (replace the account ID 675892200046 with your own):
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": "eks:*", "Resource": "*" }, { "Action": [ "ssm:GetParameter", "ssm:GetParameters" ], "Resource": [ "arn:aws:ssm:*:675892200046:parameter/aws/*", "arn:aws:ssm:*::parameter/aws/*" ], "Effect": "Allow" }, { "Action": [ "kms:CreateGrant", "kms:DescribeKey" ], "Resource": "*", "Effect": "Allow" }, { "Action": [ "logs:PutRetentionPolicy" ], "Resource": "*", "Effect": "Allow" } ] } |
Add a custom policy "IamLimitedAccess" with the following content:
{ "Version": "2012-10-17", "Statement": [ { "Sid": "VisualEditor0", "Effect": "Allow", "Action": "iam:CreateServiceLinkedRole", "Resource": "*", "Condition": { "StringEquals": { "iam:AWSServiceName": [ "eks.amazonaws.com", "eks-nodegroup.amazonaws.com", "eks-fargate.amazonaws.com" ] } } }, { "Sid": "VisualEditor1", "Effect": "Allow", "Action": [ "iam:CreateInstanceProfile", "iam:TagRole", "iam:RemoveRoleFromInstanceProfile", "iam:DeletePolicy", "iam:CreateRole", "iam:AttachRolePolicy", "iam:PutRolePolicy", "iam:AddRoleToInstanceProfile", "iam:ListInstanceProfilesForRole", "iam:PassRole", "iam:DetachRolePolicy", "iam:DeleteRolePolicy", "iam:ListAttachedRolePolicies", "iam:DeleteOpenIDConnectProvider", "iam:DeleteInstanceProfile", "iam:GetRole", "iam:GetInstanceProfile", "iam:GetPolicy", "iam:DeleteRole", "iam:ListInstanceProfiles", "iam:CreateOpenIDConnectProvider", "iam:CreatePolicy", "iam:ListPolicyVersions", "iam:GetOpenIDConnectProvider", "iam:TagOpenIDConnectProvider", "iam:GetRolePolicy" ], "Resource": [ "arn:aws:iam::675892200046:role/testEKSNodeRole", "arn:aws:iam::675892200046:role/testEKSClusterRole", "arn:aws:iam::675892200046:role/aws-service-role/eks-nodegroup.amazonaws.com/AWSServiceRoleForAmazonEKSNodegroup", "arn:aws:iam::675892200046:instance-profile/*", "arn:aws:iam::675892200046:policy/*", "arn:aws:iam::675892200046:oidc-provider/*" ] }, { "Sid": "VisualEditor2", "Effect": "Allow", "Action": "iam:GetRole", "Resource": "arn:aws:iam::675892200046:role/*" }, { "Sid": "VisualEditor3", "Effect": "Allow", "Action": "iam:ListRoles", "Resource": "*" } ] } |
3 Create the EKS cluster (as the IAM user)
3.1 Create the EKS cluster control plane
On the EKS product page, click "Create cluster".
If you don't see the role in the "Cluster service role" drop-down list, go back and check step 2.
Under "Subnets", three subnets are sufficient.
For "Cluster endpoint access", "Public" is fine for this walkthrough; for production, choose "Private".
For "Networking add-ons", the defaults are fine.
3.2 Add worker nodes to the cluster
Once the cluster status is "Active", click "Add node group" on the Compute tab to create the worker nodes.
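Both steps can also be scripted. A sketch with the AWS CLI; the subnet IDs, node group name, and instance type are placeholders for illustration:

# Create the control plane (it takes a while to become Active)
aws eks create-cluster \
  --name TestEKSCluster \
  --role-arn arn:aws:iam::675892200046:role/testEKSClusterRole \
  --resources-vpc-config subnetIds=subnet-aaa,subnet-bbb,subnet-ccc,endpointPublicAccess=true
# Then add a managed node group backed by the node role from step 2.2
aws eks create-nodegroup \
  --cluster-name TestEKSCluster \
  --nodegroup-name test-nodegroup \
  --node-role arn:aws:iam::675892200046:role/testEKSNodeRole \
  --subnets subnet-aaa subnet-bbb subnet-ccc \
  --instance-types t3.medium \
  --scaling-config minSize=1,maxSize=3,desiredSize=1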
4 Set up the AWS CLI and kubectl (as the IAM user)
4.1 Configure the AWS CLI
After installing the AWS CLI, run "aws configure" to set up the IAM account created in step 1:
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" unzip awscliv2.zip sudo ./aws/install aws configure [awscli@bogon ~]$ aws sts get-caller-identity { "UserId": "AIDAZ2XSQQJXKNKFI4YDF", "Account": "675892200046", "Arn": "arn:aws:iam::675892200046:user/TestEKSUser" } |
4.2 Configure kubectl
[awscli@bogon ~]$ aws eks --region us-east-1 update-kubeconfig --name TestEKSCluster
Updated context arn:aws:eks:us-east-1:675892200046:cluster/TestEKSCluster in /home/awscli/.kube/config
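A quick sanity check that kubectl can now talk to the cluster:

# Both should answer without authentication errors
kubectl get nodes
kubectl get pods --all-namespaces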
5 Set up EFS storage for EKS
5.1 Create the policy for accessing EFS (as the root user)
Create a custom policy "TestEKSAccessEFSPolicy":
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "elasticfilesystem:DescribeAccessPoints", "elasticfilesystem:DescribeFileSystems" ], "Resource": "*" }, { "Effect": "Allow", "Action": [ "elasticfilesystem:CreateAccessPoint" ], "Resource": "*", "Condition": { "StringLike": { "aws:RequestTag/efs.csi.aws.com/cluster": "true" } } }, { "Effect": "Allow", "Action": "elasticfilesystem:DeleteAccessPoint", "Resource": "*", "Condition": { "StringEquals": { "aws:ResourceTag/efs.csi.aws.com/cluster": "true" } } } ] } |
5.2 Create the role for accessing EFS (as the root user)
Create the role "TestEKSAccessEFSRole" and attach the policy "TestEKSAccessEFSPolicy".
Under "Trust relationships", edit the content as follows.
Replace "oidc.eks.us-east-1.amazonaws.com/id/98F61019E9B399FA9B7A43A19B56DF14" with your EKS cluster's "OpenID Connect provider URL".
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": { "Federated": "arn:aws:iam::675892200046:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/98F61019E9B399FA9B7A43A19B56DF14" }, "Action": "sts:AssumeRoleWithWebIdentity", "Condition": { "StringEquals": { "oidc.eks.us-east-1.amazonaws.com/id/98F61019E9B399FA9B7A43A19B56DF14:sub": "system:serviceaccount:kube-system:efs-csi-controller-sa" } } } ] } |
5.3 Create an identity provider for OpenID Connect (as the root user)
Fill in the provider URL and the audience "sts.amazonaws.com", click "Get thumbprint", then click "Add provider".
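The issuer URL can be read from the cluster, and the provider can also be registered from the CLI. A sketch; the thumbprint value below is a placeholder (the console's "Get thumbprint" computes the real one):

# Look up the cluster's OIDC issuer URL
aws eks describe-cluster --name TestEKSCluster --query "cluster.identity.oidc.issuer" --output text
# Register it as an identity provider (replace the URL and thumbprint)
aws iam create-open-id-connect-provider \
  --url https://oidc.eks.us-east-1.amazonaws.com/id/98F61019E9B399FA9B7A43A19B56DF14 \
  --client-id-list sts.amazonaws.com \
  --thumbprint-list 0000000000000000000000000000000000000000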
5.4 Create the service account in EKS (as the IAM user)
Create a file "efs-service-account.yaml" with the following content, then run "kubectl apply -f efs-service-account.yaml" to create the account. Remember to change the account ID.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: efs-csi-controller-sa
  namespace: kube-system
  labels:
    app.kubernetes.io/name: aws-efs-csi-driver
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::675892200046:role/TestEKSAccessEFSRole
5.5 Install the EFS CSI driver (as the IAM user)
Run the following command to generate the driver's installation manifest, driver.yaml:

kubectl kustomize "github.com/kubernetes-sigs/aws-efs-csi-driver/deploy/kubernetes/overlays/stable/ecr?ref=release-1.3" > driver.yaml

The service account was already created above, so the "efs-csi-controller-sa" section can be removed from driver.yaml.
Then run "kubectl apply -f driver.yaml" to install the CSI driver.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: efs-csi-controller-sa
  namespace: kube-system
  labels:
    app.kubernetes.io/name: aws-efs-csi-driver
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::675892200046:role/TestEKSAccessEFSRole
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app.kubernetes.io/name: aws-efs-csi-driver
  name: efs-csi-node-sa
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/name: aws-efs-csi-driver
  name: efs-csi-external-provisioner-role
rules:
  - apiGroups:
      - ""
    resources:
      - persistentvolumes
    verbs:
      - get
      - list
      - watch
      - create
      - delete
  - apiGroups:
      - ""
    resources:
      - persistentvolumeclaims
    verbs:
      - get
      - list
      - watch
      - update
  - apiGroups:
      - storage.k8s.io
    resources:
      - storageclasses
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - list
      - watch
      - create
      - patch
  - apiGroups:
      - storage.k8s.io
    resources:
      - csinodes
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs:
      - get
      - watch
      - list
      - delete
      - update
      - create
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app.kubernetes.io/name: aws-efs-csi-driver
  name: efs-csi-provisioner-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: efs-csi-external-provisioner-role
subjects:
  - kind: ServiceAccount
    name: efs-csi-controller-sa
    namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/name: aws-efs-csi-driver
  name: efs-csi-controller
  namespace: kube-system
spec:
  replicas: 2
  selector:
    matchLabels:
      app: efs-csi-controller
      app.kubernetes.io/instance: kustomize
      app.kubernetes.io/name: aws-efs-csi-driver
  template:
    metadata:
      labels:
        app: efs-csi-controller
        app.kubernetes.io/instance: kustomize
        app.kubernetes.io/name: aws-efs-csi-driver
    spec:
      containers:
        - args:
            - --endpoint=$(CSI_ENDPOINT)
            - --logtostderr
            - --v=2
            - --delete-access-point-root-dir=false
          env:
            - name: CSI_ENDPOINT
              value: unix:///var/lib/csi/sockets/pluginproxy/csi.sock
          image: 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/aws-efs-csi-driver:v1.3.8
          imagePullPolicy: IfNotPresent
          livenessProbe:
            failureThreshold: 5
            httpGet:
              path: /healthz
              port: healthz
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 3
          name: efs-plugin
          ports:
            - containerPort: 9909
              name: healthz
              protocol: TCP
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /var/lib/csi/sockets/pluginproxy/
              name: socket-dir
        - args:
            - --csi-address=$(ADDRESS)
            - --v=2
            - --feature-gates=Topology=true
            - --extra-create-metadata
            - --leader-election
          env:
            - name: ADDRESS
              value: /var/lib/csi/sockets/pluginproxy/csi.sock
          image: 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/csi-provisioner:v2.1.1
          imagePullPolicy: IfNotPresent
          name: csi-provisioner
          volumeMounts:
            - mountPath: /var/lib/csi/sockets/pluginproxy/
              name: socket-dir
        - args:
            - --csi-address=/csi/csi.sock
            - --health-port=9909
          image: 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/livenessprobe:v2.2.0
          imagePullPolicy: IfNotPresent
          name: liveness-probe
          volumeMounts:
            - mountPath: /csi
              name: socket-dir
      hostNetwork: true
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-cluster-critical
      serviceAccountName: efs-csi-controller-sa
      volumes:
        - emptyDir: {}
          name: socket-dir
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app.kubernetes.io/name: aws-efs-csi-driver
  name: efs-csi-node
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: efs-csi-node
      app.kubernetes.io/instance: kustomize
      app.kubernetes.io/name: aws-efs-csi-driver
  template:
    metadata:
      labels:
        app: efs-csi-node
        app.kubernetes.io/instance: kustomize
        app.kubernetes.io/name: aws-efs-csi-driver
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: eks.amazonaws.com/compute-type
                    operator: NotIn
                    values:
                      - fargate
      containers:
        - args:
            - --endpoint=$(CSI_ENDPOINT)
            - --logtostderr
            - --v=2
          env:
            - name: CSI_ENDPOINT
              value: unix:/csi/csi.sock
          image: 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/aws-efs-csi-driver:v1.3.8
          imagePullPolicy: IfNotPresent
          livenessProbe:
            failureThreshold: 5
            httpGet:
              path: /healthz
              port: healthz
            initialDelaySeconds: 10
            periodSeconds: 2
            timeoutSeconds: 3
          name: efs-plugin
          ports:
            - containerPort: 9809
              name: healthz
              protocol: TCP
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /var/lib/kubelet
              mountPropagation: Bidirectional
              name: kubelet-dir
            - mountPath: /csi
              name: plugin-dir
            - mountPath: /var/run/efs
              name: efs-state-dir
            - mountPath: /var/amazon/efs
              name: efs-utils-config
            - mountPath: /etc/amazon/efs-legacy
              name: efs-utils-config-legacy
        - args:
            - --csi-address=$(ADDRESS)
            - --kubelet-registration-path=$(DRIVER_REG_SOCK_PATH)
            - --v=2
          env:
            - name: ADDRESS
              value: /csi/csi.sock
            - name: DRIVER_REG_SOCK_PATH
              value: /var/lib/kubelet/plugins/efs.csi.aws.com/csi.sock
            - name: KUBE_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          image: 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/csi-node-driver-registrar:v2.1.0
          imagePullPolicy: IfNotPresent
          name: csi-driver-registrar
          volumeMounts:
            - mountPath: /csi
              name: plugin-dir
            - mountPath: /registration
              name: registration-dir
        - args:
            - --csi-address=/csi/csi.sock
            - --health-port=9809
            - --v=2
          image: 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/livenessprobe:v2.2.0
          imagePullPolicy: IfNotPresent
          name: liveness-probe
          volumeMounts:
            - mountPath: /csi
              name: plugin-dir
      dnsPolicy: ClusterFirst
      hostNetwork: true
      nodeSelector:
        beta.kubernetes.io/os: linux
      priorityClassName: system-node-critical
      serviceAccountName: efs-csi-node-sa
      tolerations:
        - operator: Exists
      volumes:
        - hostPath:
            path: /var/lib/kubelet
            type: Directory
          name: kubelet-dir
        - hostPath:
            path: /var/lib/kubelet/plugins/efs.csi.aws.com/
            type: DirectoryOrCreate
          name: plugin-dir
        - hostPath:
            path: /var/lib/kubelet/plugins_registry/
            type: Directory
          name: registration-dir
        - hostPath:
            path: /var/run/efs
            type: DirectoryOrCreate
          name: efs-state-dir
        - hostPath:
            path: /var/amazon/efs
            type: DirectoryOrCreate
          name: efs-utils-config
        - hostPath:
            path: /etc/amazon/efs
            type: DirectoryOrCreate
          name: efs-utils-config-legacy
---
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  annotations:
    helm.sh/hook: pre-install, pre-upgrade
    helm.sh/hook-delete-policy: before-hook-creation
    helm.sh/resource-policy: keep
  name: efs.csi.aws.com
spec:
  attachRequired: false
After a while, the "efs-csi-controller-*" pods should be ready.
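You can watch them using the driver's app label from the manifest above:

kubectl get pods -n kube-system -l app.kubernetes.io/name=aws-efs-csi-driver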
5.6 Create the EFS file system (as the root user)
In the Amazon EFS console, click "Create file system":
Choose "Standard" as the storage class so that nodes in every Availability Zone can access it.
After creation, wait for "Network" to become available, then click "Manage" to add the cluster security group.
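The same file system can be created from the CLI; a sketch with placeholder subnet and security group IDs (you need one mount target per subnet the worker nodes live in, and the security group must allow NFS, port 2049, from the cluster):

aws efs create-file-system --performance-mode generalPurpose --tags Key=Name,Value=TestEKSEFS
# One mount target per subnet used by the node group
aws efs create-mount-target --file-system-id fs-04470c1ed1eab275c --subnet-id subnet-aaa --security-groups sg-cluster
# Open NFS from the cluster security group to itself if needed
aws ec2 authorize-security-group-ingress --group-id sg-cluster --protocol tcp --port 2049 --source-group sg-cluster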
5.7 Create the storage class in Kubernetes (as the IAM user)
Create "storageclass.yaml" with the following content and run "kubectl apply -f storageclass.yaml" to create it.
Remember to change "fileSystemId" to your own file system's ID, which you can look up in the EFS console.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap
  fileSystemId: fs-04470c1ed1eab275c
  directoryPerms: "700"
  gidRangeStart: "1000" # optional
  gidRangeEnd: "2000" # optional
  basePath: "/dynamic_provisioning" # optional
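To confirm that dynamic provisioning works, you can create a throwaway PVC against efs-sc; a minimal sketch (the claim name is illustrative):

# The PVC should go to Bound once the CSI driver creates an EFS access point
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-test-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
EOF
kubectl get pvc efs-test-claim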
6 Deploy Jenkins as a test (as the IAM user)
6.1 Deploy Jenkins
Note that the storage class is set to efs-sc.
helm repo add jenkinsci https://charts.jenkins.io/
helm install my-jenkins jenkinsci/jenkins --version 4.1.17 --set persistence.storageClass=efs-sc
6.2 Verify the result
Once Jenkins has started, you can use port forwarding for temporary access.
[awscli@bogon ~]$ kubectl port-forward svc/my-jenkins --address=0.0.0.0 8081:8080
Forwarding from 0.0.0.0:8081 -> 8080
Handling connection for 8081
Handling connection for 8081
Handling connection for 8081
Handling connection for 8081
Handling connection for 8081
Handling connection for 8081
Handling connection for 8081
Handling connection for 8081
Handling connection for 8081
Handling connection for 8081
Handling connection for 8081
Handling connection for 8081
Handling connection for 8081
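To log in you need the admin password. The chart's post-install notes print the exact command; with chart 4.x it is typically read from the release secret, for example:

kubectl get secret my-jenkins -o jsonpath="{.data.jenkins-admin-password}" | base64 --decode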
7 Cluster autoscaling
7.1 Create an autoscaling policy for EKS
First create a policy, "TestEKSClusterAutoScalePolicy":
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "autoscaling:DescribeAutoScalingGroups", "autoscaling:DescribeAutoScalingInstances", "autoscaling:DescribeLaunchConfigurations", "autoscaling:DescribeTags", "autoscaling:SetDesiredCapacity", "autoscaling:TerminateInstanceInAutoScalingGroup", "ec2:DescribeLaunchTemplateVersions", "ec2:DescribeInstanceTypes" ], "Resource": "*" } ] } |
7.2 Create an autoscaling role for EKS
Next create a role, "TestEKSClusterAutoScaleRole", attach the policy "TestEKSClusterAutoScalePolicy" created above, and set its "trusted entities" as follows.
Note that the OpenID Connect provider ID must be taken from your own EKS cluster's details page.
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": { "Federated": "arn:aws:iam::675892200046:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/BEDCA5446D2676BB0A51B7BECFB36773" }, "Action": "sts:AssumeRoleWithWebIdentity", "Condition": { "StringEquals": { "oidc.eks.us-east-1.amazonaws.com/id/BEDCA5446D2676BB0A51B7BECFB36773:sub": "system:serviceaccount:kube-system:cluster-autoscaler" } } } ] } |
7.3 Deploy the cluster autoscaler
Fetch the deployment manifest with the following command.
wget https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
Then edit "cluster-autoscaler-autodiscover.yaml":
1 Add the annotation "eks.amazonaws.com/role-arn" to the ServiceAccount "cluster-autoscaler"; see the manifest below.
2 In the Deployment "cluster-autoscaler", change the "--node-group-auto-discovery" argument so that it references your cluster name (here TestEKSCluster).
3 Add two arguments (--balance-similar-node-groups and --skip-nodes-with-system-pods=false) on the lines directly after the argument changed in step 2.
4 Add the annotation "cluster-autoscaler.kubernetes.io/safe-to-evict: "false"" below the line "prometheus.io/port: '8085'".
5 Find the matching image version at https://github.com/kubernetes/autoscaler/releases; it has to match your EKS Kubernetes version.
6 Finally, run "kubectl apply -f cluster-autoscaler-autodiscover.yaml" to deploy everything.
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::675892200046:role/TestEKSClusterAutoScaleRole
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["events", "endpoints"]
    verbs: ["create", "patch"]
  - apiGroups: [""]
    resources: ["pods/eviction"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["pods/status"]
    verbs: ["update"]
  - apiGroups: [""]
    resources: ["endpoints"]
    resourceNames: ["cluster-autoscaler"]
    verbs: ["get", "update"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["watch", "list", "get", "update"]
  - apiGroups: [""]
    resources:
      - "namespaces"
      - "pods"
      - "services"
      - "replicationcontrollers"
      - "persistentvolumeclaims"
      - "persistentvolumes"
    verbs: ["watch", "list", "get"]
  - apiGroups: ["extensions"]
    resources: ["replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["policy"]
    resources: ["poddisruptionbudgets"]
    verbs: ["watch", "list"]
  - apiGroups: ["apps"]
    resources: ["statefulsets", "replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses", "csinodes", "csidrivers", "csistoragecapacities"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["batch", "extensions"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch", "patch"]
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["create"]
  - apiGroups: ["coordination.k8s.io"]
    resourceNames: ["cluster-autoscaler"]
    resources: ["leases"]
    verbs: ["get", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["create", "list", "watch"]
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
    verbs: ["delete", "get", "update", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '8085'
        cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
    spec:
      priorityClassName: system-cluster-critical
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
        fsGroup: 65534
      serviceAccountName: cluster-autoscaler
      containers:
        - image: k8s.gcr.io/autoscaling/cluster-autoscaler:v1.23.0
          name: cluster-autoscaler
          resources:
            limits:
              cpu: 100m
              memory: 600Mi
            requests:
              cpu: 100m
              memory: 600Mi
          command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --expander=least-waste
            - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/TestEKSCluster
            - --balance-similar-node-groups
            - --skip-nodes-with-system-pods=false
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs/ca-certificates.crt # /etc/ssl/certs/ca-bundle.crt for Amazon Linux Worker Nodes
              readOnly: true
          imagePullPolicy: "Always"
      volumes:
        - name: ssl-certs
          hostPath:
            path: "/etc/ssl/certs/ca-bundle.crt"
7.4 Deploy the metrics server
With the metrics server we can collect pod metrics, which is the foundation of the HPA (Horizontal Pod Autoscaler). Apply the following manifest with "kubectl apply".
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
rules:
  - apiGroups:
      - metrics.k8s.io
    resources:
      - pods
      - nodes
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
rules:
  - apiGroups:
      - ""
    resources:
      - pods
      - nodes
      - nodes/stats
      - namespaces
      - configmaps
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
  - kind: ServiceAccount
    name: metrics-server
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
  - kind: ServiceAccount
    name: metrics-server
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
  - kind: ServiceAccount
    name: metrics-server
    namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  ports:
    - name: https
      port: 443
      protocol: TCP
      targetPort: https
  selector:
    k8s-app: metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
        - args:
            - --cert-dir=/tmp
            - --secure-port=4443
            - --kubelet-insecure-tls
            - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
            - --kubelet-use-node-status-port
            - --metric-resolution=15s
          image: k8s.gcr.io/metrics-server/metrics-server:v0.5.2
          imagePullPolicy: IfNotPresent
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /livez
              port: https
              scheme: HTTPS
            periodSeconds: 10
          name: metrics-server
          ports:
            - containerPort: 4443
              name: https
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /readyz
              port: https
              scheme: HTTPS
            initialDelaySeconds: 20
            periodSeconds: 10
          resources:
            requests:
              cpu: 100m
              memory: 200Mi
          securityContext:
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            runAsUser: 1000
          volumeMounts:
            - mountPath: /tmp
              name: tmp-dir
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      volumes:
        - emptyDir: {}
          name: tmp-dir
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100
7.5 Test cluster scaling
Deploy an nginx instance to test with:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
          ports:
            - containerPort: 80
At this point there is only one node.
[awscli@bogon ~]$ kubectl top node
NAME                            CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
ip-172-31-17-148.ec2.internal   52m          2%     635Mi           19%
[awscli@bogon ~]$ kubectl top pods --all-namespaces
NAMESPACE     NAME                                  CPU(cores)   MEMORY(bytes)
default       nginx-deployment-9456bbbf9-qlpcb      0m           2Mi
kube-system   aws-node-m6xjs                        3m           34Mi
kube-system   cluster-autoscaler-5c4d9b6d4c-k2csm   2m           22Mi
kube-system   coredns-d5b9bfc4-4bvnn                1m           12Mi
kube-system   coredns-d5b9bfc4-z2ppq                1m           12Mi
kube-system   kube-proxy-x55c8                      1m           10Mi
kube-system   metrics-server-84cd7b5645-prh6c       3m           16Mi

Now scale the test deployment created above to 30 replicas. Because the current node does not have enough capacity, after a while a new node (ip-172-31-91-231.ec2.internal) is started and joins the cluster.

[awscli@bogon ~]$ kubectl top node
NAME                            CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
ip-172-31-17-148.ec2.internal   66m          3%     726Mi           21%
ip-172-31-91-231.ec2.internal   774m         40%    569Mi           17%
[awscli@bogon ~]$ kubectl top pods --all-namespaces
NAMESPACE     NAME                                  CPU(cores)   MEMORY(bytes)
default       nginx-deployment-9456bbbf9-2tgpl      0m           2Mi
default       nginx-deployment-9456bbbf9-5jdsm      0m           2Mi
default       nginx-deployment-9456bbbf9-5vt9l      2m           2Mi
default       nginx-deployment-9456bbbf9-8ldm7      0m           2Mi
default       nginx-deployment-9456bbbf9-9m499      0m           2Mi
default       nginx-deployment-9456bbbf9-cpmqs      0m           2Mi
default       nginx-deployment-9456bbbf9-d6p4k      2m           2Mi
default       nginx-deployment-9456bbbf9-f2z87      2m           2Mi
default       nginx-deployment-9456bbbf9-f8w2f      0m           2Mi
default       nginx-deployment-9456bbbf9-fwjg4      0m           2Mi
default       nginx-deployment-9456bbbf9-kfmv8      0m           2Mi
default       nginx-deployment-9456bbbf9-knn2t      0m           2Mi
default       nginx-deployment-9456bbbf9-mq5sv      0m           2Mi
default       nginx-deployment-9456bbbf9-plh7h      0m           2Mi
default       nginx-deployment-9456bbbf9-qlpcb      0m           2Mi
default       nginx-deployment-9456bbbf9-tz22s      0m           2Mi
default       nginx-deployment-9456bbbf9-v6ccx      0m           2Mi
default       nginx-deployment-9456bbbf9-v9rc8      0m           2Mi
default       nginx-deployment-9456bbbf9-vwsfr      0m           2Mi
default       nginx-deployment-9456bbbf9-x2jnb      0m           2Mi
default       nginx-deployment-9456bbbf9-xhllv      0m           2Mi
default       nginx-deployment-9456bbbf9-z7hhr      0m           2Mi
default       nginx-deployment-9456bbbf9-zj7qc      0m           2Mi
default       nginx-deployment-9456bbbf9-zqptw      0m           2Mi
kube-system   aws-node-f4kf4                        2m           35Mi
kube-system   aws-node-m6xjs                        3m           35Mi
kube-system   cluster-autoscaler-5c4d9b6d4c-k2csm   3m           26Mi
kube-system   coredns-d5b9bfc4-4bvnn                1m           12Mi
kube-system   coredns-d5b9bfc4-z2ppq                1m           12Mi
kube-system   kube-proxy-qqrw9                      1m           10Mi
kube-system   kube-proxy-x55c8                      1m           10Mi
kube-system   metrics-server-84cd7b5645-prh6c       4m           16Mi
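The scale-out above can be triggered with a single command, for example:

kubectl scale deployment nginx-deployment --replicas=30
# When the replicas are scaled back down, the autoscaler will
# eventually drain and remove the extra node again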
8 Accessing ECR from EKS
Because we can't modify the docker configuration on the nodes of an EKS managed node group, we can't use our own Harbor registry unless we have proper certificates. Using AWS ECR avoids this trouble altogether.
8.1 Create a repository
First create an inline policy, "TestEKSonECRPolicy"; with it in place you can create a docker repository, obtain a login token, and push images.
{ "Version": "2012-10-17", "Statement": [ { "Sid": "VisualEditor0", "Effect": "Allow", "Action": [ "ecr:CreateRepository", "ecr:GetDownloadUrlForLayer", "ecr:DescribeRegistry", "ecr:GetAuthorizationToken", "ecr:UploadLayerPart", "ecr:ListImages", "ecr:DeleteRepository", "ecr:PutImage", "ecr:UntagResource", "ecr:BatchGetImage", "ecr:CompleteLayerUpload", "ecr:DescribeImages", "ecr:TagResource", "ecr:DescribeRepositories", "ecr:InitiateLayerUpload", "ecr:BatchCheckLayerAvailability" ], "Resource": "*" } ] } |
8.2 Pull images in EKS
Make sure your EKS node role has the policy AmazonEC2ContainerRegistryReadOnly attached.
8.3 Push images to ECR
Run the following command to obtain a login token (change the ECR endpoint to your own).
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 675892200046.dkr.ecr.us-east-1.amazonaws.com
Create a secret in the Jenkins namespace from the config.json written by docker login (usually ~/.docker/config.json); pipelines in Jenkins can then use docker to push images to ECR.
kubectl create secret generic awsecr --from-file=.dockerconfigjson=config.json --type=kubernetes.io/dockerconfigjson -n jenkins
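Besides being mounted into pipeline pods, a kubernetes.io/dockerconfigjson secret like this can also serve as an imagePullSecret. A minimal sketch (the pod and image names are illustrative):

kubectl apply -n jenkins -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: ecr-pull-test
spec:
  containers:
    - name: app
      image: 675892200046.dkr.ecr.us-east-1.amazonaws.com/test/myapp:latest
  imagePullSecrets:
    - name: awsecr
EOF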
This work is licensed under a Creative Commons Attribution 4.0 International License.