Debugging Apollo Autonomous Driving Code with VS Code


Digging into the Apollo codebase is an excellent way to learn autonomous driving. Many cutting-edge topics, such as image recognition, LiDAR, multi-sensor fusion, and path planning, can be studied end to end directly in the code. Stepping through the code in a debugger deepens understanding far more than just reading it. This article describes how to debug the Apollo code.


Download the Code

The host system is Ubuntu 18.04.
Clone the source from https://gitee.com/ApolloAuto/apollo.git.
In this article the source directory is assumed to be ~/apollo.
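For reference, a minimal sketch of fetching the source (the Gitee mirror and the ~/apollo target come from above; --depth 1 is optional and only speeds up the download):

# Clone the Apollo source into ~/apollo
git clone --depth 1 https://gitee.com/ApolloAuto/apollo.git ~/apollo
cd ~/apollo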

Install Docker
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo   "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
 
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io
systemctl daemon-reload
systemctl restart docker
Install the Wireless NIC Driver (Optional)

Ubuntu does not ship a driver for the author's wireless card, so the driver has to be installed manually.
Download the driver: https://codeload.github.com/gnab/rtl8812au/zip/refs/heads/master

# Run these inside the unpacked rtl8812au source directory
sudo make dkms_install
echo 8812au | sudo tee -a /etc/modules
sudo insmod 8812au.ko
Install the NVIDIA Driver

The author's GPU is a GTX 1060. Apollo requires an NVIDIA GPU; without one, most modules cannot be built or run.
The following script installs the GPU driver. The CUDA installation path may differ on your system.

sudo apt-get install linux-headers-$(uname -r)
distribution=$(. /etc/os-release;echo $ID$VERSION_ID | sed -e 's/\.//g')
wget https://developer.download.nvidia.com/compute/cuda/repos/$distribution/x86_64/cuda-$distribution.pin
sudo mv cuda-$distribution.pin /etc/apt/preferences.d/cuda-repository-pin-600
# The repository signing key ID may change; see https://developer.nvidia.com/blog/updating-the-cuda-linux-gpg-repository-key/
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/$distribution/x86_64/7fa2af80.pub
echo "deb http://developer.download.nvidia.com/compute/cuda/repos/$distribution/x86_64 /" | sudo tee /etc/apt/sources.list.d/cuda.list
sudo apt-get update
sudo apt-get -y install cuda-drivers
export PATH=$PATH:/usr/local/cuda-11.2/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.2/lib64:/usr/local/cuda/cuda/lib64

After installation, the following command shows the GPU information. Make sure CUDA has been installed correctly.

nvidia-smi
Disable Automatic Kernel Updates to Avoid Reinstalling the Driver

1. Check the running kernel

derek@ubuntu:~$ uname -a
Linux ubuntu 5.4.0-150-generic #167~18.04.1-Ubuntu SMP Wed May 24 00:51:42 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

2. List the currently installed kernel packages and put them on hold

derek@ubuntu:~$ dpkg --get-selections|grep linux|grep 150
linux-headers-5.4.0-150-generic			install
linux-hwe-5.4-headers-5.4.0-150			install
linux-image-5.4.0-150-generic			install
linux-modules-5.4.0-150-generic			install
linux-modules-extra-5.4.0-150-generic		install
 
derek@ubuntu:~$ sudo apt-mark hold linux-headers-5.4.0-150-generic
linux-headers-5.4.0-150-generic set on hold.
derek@ubuntu:~$ sudo apt-mark hold linux-hwe-5.4-headers-5.4.0-150
linux-hwe-5.4-headers-5.4.0-150 set on hold.
derek@ubuntu:~$ sudo apt-mark hold linux-image-5.4.0-150-generic
linux-image-5.4.0-150-generic set on hold.
derek@ubuntu:~$ sudo apt-mark hold linux-modules-5.4.0-150-generic
linux-modules-5.4.0-150-generic set on hold.
derek@ubuntu:~$ sudo apt-mark hold linux-modules-extra-5.4.0-150-generic
linux-modules-extra-5.4.0-150-generic set on hold
 
derek@ubuntu:~$ dpkg --get-selections|grep linux|grep 150
linux-headers-5.4.0-150-generic			hold
linux-hwe-5.4-headers-5.4.0-150			hold
linux-image-5.4.0-150-generic			hold
linux-modules-5.4.0-150-generic			hold
linux-modules-extra-5.4.0-150-generic		hold

3. Change the automatic software update settings
Two items need to be changed so that updates are no longer applied automatically.
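If you prefer the command line, a rough equivalent (an assumption; the original changes these items through the GUI) is to switch off APT's periodic updates:

# Disable the automatic package-list refresh and unattended upgrades
sudo tee /etc/apt/apt.conf.d/20auto-upgrades > /dev/null <<'EOF'
APT::Periodic::Update-Package-Lists "0";
APT::Periodic::Unattended-Upgrade "0";
EOF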

Install NVIDIA Docker

Apollo runs inside a pre-built Docker environment, which keeps the development and runtime environments consistent across machines.

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)    && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -    && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker

After installation, run the sample container below to verify the setup.

sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
Start the Apollo Project

In the source directory ~/apollo, run the following commands to enter the Apollo Docker container.
The scripts mount the source directories into the Docker environment as volumes.

./docker/scripts/dev_start.sh 
./docker/scripts/dev_into.sh

To enter again later, just start the existing container:

docker ps -a
docker start 671567b64765
./docker/scripts/dev_into.sh
Debug a Lane Detection Test Program

Create a Bazel build file at
~/apollo/modules/perception/camera/test/BUILD
with the following content:

load("@rules_cc//cc:defs.bzl", "cc_library", "cc_test")
load("//tools:cpplint.bzl", "cpplint")
 
package(default_visibility = ["//visibility:public"])
 
cc_test(
    name = "camera_lib_lane_detector_darkscnn_lane_detector_test",
    size = "medium",
    srcs = ["camera_lib_lane_detector_darkscnn_lane_detector_test.cc"],
    deps = [
        "//cyber",
        "//modules/perception/base",
        "//modules/perception/camera/lib/lane/detector/darkSCNN:darkSCNN_lane_detector",
        "//modules/perception/common/io:io_util",
        "@com_google_googletest//:gtest_main",
        "@opencv//:core",
    ],
)
 
cpplint()

Now build Apollo with debug symbols:

./apollo.sh build_dbg

Start a GDB server inside the container so that VS Code outside the container can debug remotely:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/libtorch_gpu/lib/
apt update
apt install gdbserver
gdbserver 127.0.0.1:2222 bazel-bin/modules/perception/camera/test/camera_lib_lane_detector_darkscnn_lane_detector_test

Launch VS Code, install the C/C++ extension, open the Debug view, and add the following configuration to debug the program just started inside Docker (adjust the program path to match your machine):

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "gdb Remote camera_lib_lane_postprocessor_darkscnn_lane_postprocessor_test",
            "type": "cppdbg",
            "request": "launch",
            "program": "~/apollo/.cache/bazel/540135163923dd7d5820f3ee4b306b32/execroot/apollo/bazel-out/k8-dbg/bin/modules/perception/camera/test/camera_lib_lane_detector_darkscnn_lane_detector_test",
            "args": ["myarg1", "myarg2", "myarg3"],
            "stopAtEntry": true,
            "environment": [],
            "externalConsole": false,
            "MIMode": "gdb",
            "miDebuggerPath": "gdb",
            "miDebuggerArgs": "gdb",
            "linux": {
                "MIMode": "gdb",
                "miDebuggerPath": "/usr/bin/gdb",
                "miDebuggerServerAddress": "127.0.0.1:2222",
            },
            "logging": {
                "moduleLoad": false,
                "engineLogging": false,
                "trace": false
            },
            "setupCommands": [
                {
                    "description": "Enable pretty-printing for gdb",
                    "text": "-enable-pretty-printing",
                    "ignoreFailures": true
                }
            ],
            "cwd": "${workspaceFolder}",
        }
    ]
}
Using a VS Code Dev Container on Windows

On Windows, debugging with a VS Code Dev Container is also quite convenient: the Dev Container makes working on Windows feel as if you were working inside the container itself.

Create the file .devcontainer/Dockerfile in the project root:

FROM registry.baidubce.com/apolloauto/apollo:dev-x86_64-18.04-20210914_1336
 
ENV SHELL /bin/bash

Create the file .devcontainer/devcontainer.json in the project root:

{
    "name": "Apollo Dev Container",
    "build": {
      "dockerfile": "./Dockerfile"
    },
    "remoteUser": "root",
    //docker run -v apollo_map_volume-sunnyvale_big_loop_root:/apollo/modules/map/data/sunnyvale_big_loop --rm registry.baidubce.com/apolloauto/apollo:map_volume-sunnyvale_big_loop-latest true
    //docker run -v apollo_map_volume-sunnyvale_loop-latest:/apollo/modules/map/data/sunnyvale_loop --rm registry.baidubce.com/apolloauto/apollo:map_volume-sunnyvale_loop-latest true
    //docker run -v apollo_audio_volume_root:/apollo/modules/audio/data/ --rm registry.baidubce.com/apolloauto/apollo:data_volume-audio_model-x86_64-latest true
    "runArgs": [
        "--privileged",
        "--net=host",
        "-v", "apollo_map_volume-sunnyvale_big_loop_root:/apollo/modules/map/data/sunnyvale_big_loop",
        "-v", "apollo_map_volume-sunnyvale_loop-latest:/apollo/modules/map/data/sunnyvale_loop",
        "-v", "apollo_audio_volume_root:/apollo/modules/audio/data/",
        "--env", "CROSS_PLATFORM=0",
        "--env", "USER=root",
        "--env", "DOCKER_USER=root",
        "--env", "DOCKER_USER_ID=0",
        "--env", "DOCKER_GRP=0",
        "--env", "DOCKER_GRP_ID=0",
        "--env", "DOCKER_IMG=apolloauto/apollo:dev-x86_64-18.04-20210914_1336",
        "--env", "USE_GPU_HOST=false", // Set to 'true' or 'false' depending on the script
        "--env", "NVIDIA_VISIBLE_DEVICES=all",
        "--env", "NVIDIA_DRIVER_CAPABILITIES=compute,video,graphics,utility",
        "--workdir", "/apollo",
        "--add-host", "in-dev-docker:127.0.0.1",
        "--add-host", "localhost:127.0.0.1",
        "--hostname", "in-dev-docker"
    ],
    "workspaceMount": "type=bind,source=${localWorkspaceFolder},target=/apollo",
    "customizations": {
      "vscode": {
        "settings.json": {
          "terminal.integrated.profiles.linux": { "bash": { "path": "/bin/bash" } }
        }
      }
    }
}

Finally, install the VS Code Dev Containers extension and you can use "Open in Container".


A Simple Way to Debug the TensorFlow C Library with VS Code


Write a Test Program

Write the test program TensorflowTest.cpp and put it in the tensorflow/c directory.

//============================================================================
// Name        : TensorflowTest.cpp
// Author      : 
// Version     :
// Copyright   : Your copyright notice
// Description : Hello World in C++, Ansi-style
//============================================================================
 
#include <iostream>
#include "c_api.h"
#include "c_api_experimental.h"
#include "c_test_util.h"
 
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <memory>
#include <vector>
#include <string.h>
 
using namespace std;
 
int main() {
	cout << "!!!Hello World!!!" << endl; // prints !!!Hello World!!!
	cout << "Hello from TensorFlow C library version" << TF_Version() << endl;
 
	TF_Status* s = TF_NewStatus();
	TF_Graph* graph = TF_NewGraph();
 
	// Construct the graph: A + 2 + B
	TF_Operation* a = Placeholder(graph, s, "A");
	cout << TF_Message(s);
 
	TF_Operation* b = Placeholder(graph, s, "B");
	cout << TF_Message(s);
 
	TF_Operation* one = ScalarConst(1, graph, s, "kone");
	cout << TF_Message(s);
 
	TF_Operation* two = ScalarConst(2, graph, s, "ktwo");
	cout << TF_Message(s);
 
	TF_Operation* three = Add(one, two, graph, s, "kthree");
	cout << TF_Message(s);
 
	TF_Operation* plus2 = Add(a, two, graph, s, "plus2");
	cout << TF_Message(s);
 
	TF_Operation* plusB = Add(plus2, b, graph, s, "plusB");
	cout << TF_Message(s);
 
	TF_Operation* plusC = Add(plusB, three, graph, s, "plusC");
	cout << TF_Message(s);
 
	// Setup a session and a partial run handle.  The partial run will allow
	// computation of A + 2 + B in two phases (calls to TF_SessionPRun):
	// 1. Feed A and get (A+2)
	// 2. Feed B and get (A+2)+B
	TF_SessionOptions* opts = TF_NewSessionOptions();
	TF_EnableXLACompilation(opts, true);
	TF_Session* sess = TF_NewSession(graph, opts, s);
	TF_DeleteSessionOptions(opts);
 
	TF_Output feeds[] = { TF_Output { a, 0 }, TF_Output { b, 0 } };
	TF_Output fetches[] = { TF_Output { plus2, 0 }, TF_Output { plusB, 0 }, TF_Output { plusC, 0 }  };
 
	const char* handle = nullptr;
	TF_SessionPRunSetup(sess, feeds, TF_ARRAYSIZE(feeds), fetches,
			TF_ARRAYSIZE(fetches), NULL, 0, &handle, s);
	cout << TF_Message(s);
 
	// Feed A and fetch A + 2.
	TF_Output feeds1[] = { TF_Output { a, 0 }, TF_Output { b, 0 } };
	TF_Output fetches1[] = { TF_Output { plus2, 0 }, TF_Output { plusB, 0 }, TF_Output { plusC, 0 } };
	TF_Tensor* feedValues1[] = { Int32Tensor(1), Int32Tensor(3) };
	TF_Tensor* fetchValues1[3];
	TF_SessionPRun(sess, handle, feeds1, feedValues1, 2, fetches1, fetchValues1,
			3, NULL, 0, s);
	cout << TF_Message(s);
	cout << *(static_cast<int*>(TF_TensorData(fetchValues1[0]))) << endl;
	cout << *(static_cast<int*>(TF_TensorData(fetchValues1[1]))) << endl;
	cout << *(static_cast<int*>(TF_TensorData(fetchValues1[2]))) << endl;
 
	// Clean up.
	TF_DeletePRunHandle(handle);
	TF_DeleteSession(sess, s);
	cout << TF_Message(s);
	TF_DeleteGraph(graph);
	TF_DeleteStatus(s);
	return 0;
}
Add a Bazel Build Target

Edit the tensorflow/c/BUILD file. Replace the line

load("//tensorflow:tensorflow.bzl", "tf_cuda_cc_test")

with

load("//tensorflow:tensorflow.bzl", "tf_cuda_cc_test", "tf_cc_binary")

Then add the following target:

tf_cc_binary(
    name = "TensorflowTest",
    testonly = 1,
    srcs = [
        "TensorflowTest.cpp",
    ],
    deps = [
        ":c_api_experimental",
        ":c_test_util",
        ":c_api",
    ],
)
Build the Test Program
bazel build -s --config=dbg --config=noaws --config=nogcp --config=nohdfs --config=nonccl --config=xla //tensorflow/c:TensorflowTest --verbose_failures
VSCode Debug

Launch VS Code and open the TensorFlow source root directory.
Install the C/C++ extension.

Configure a debug launch and choose gdb from the pop-up menu.
Then edit the program parameter; after that you can set breakpoints in the code and debug.
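As a quick sanity check before editing the launch configuration, verify that the binary exists and carries debug symbols (the path follows Bazel's default bazel-bin layout for the target defined above; adjust it to your checkout):

# The tf_cc_binary target above is built to bazel-bin/tensorflow/c/TensorflowTest
ls -l bazel-bin/tensorflow/c/TensorflowTest
file bazel-bin/tensorflow/c/TensorflowTest   # should report "with debug_info, not stripped"

Point the launch configuration's program field at this path, set a breakpoint in TensorflowTest.cpp, and start debugging.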

Setting Up EFK (Elasticsearch, Fluentd, Kibana) on Kubernetes


In the previous article we set up a single-node Kubernetes environment. Now we build an EFK (Elasticsearch, Fluentd, Kibana) log collection stack on top of it.

Install Elasticsearch

Elasticsearch needs to be installed with HTTPS enabled.

kubectl create namespace efk
 
cat <<EOF > es_extracfg.yaml
  xpack:
    security:
      enabled: "true"
      authc:
        api_key:
          enabled: "true"
EOF
 
 
helm upgrade --install my-elasticsearch bitnami/elasticsearch -n efk --set security.enabled=true --set security.elasticPassword=YourPassword --set security.tls.autoGenerated=true --set-file extraConfig=es_extracfg.yaml

The StatefulSets "my-elasticsearch-coordinating-only" and "my-elasticsearch-master" need to be modified as follows, otherwise data transfer does not succeed:

          resources:
            requests:
              cpu: 25m
              memory: 512Mi
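One way to apply the change (a sketch; the StatefulSet names come from the Helm release above, so verify them first with kubectl get statefulset -n efk):

# Raise the requests under spec.template.spec.containers[].resources.requests
kubectl -n efk edit statefulset my-elasticsearch-master
kubectl -n efk edit statefulset my-elasticsearch-coordinating-only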
Install Kibana

When installing Kibana, specify the address of the Elasticsearch server; the password is the Elasticsearch password. The connection here must use the HTTPS endpoint.

helm upgrade --install my-kibana bitnami/kibana -n efk --set elasticsearch.hosts[0]=my-elasticsearch-coordinating-only --set elasticsearch.port=9200 --set elasticsearch.security.auth.enabled=true --set elasticsearch.security.auth.kibanaPassword=YourPassword --set elasticsearch.security.tls.enabled=true --set elasticsearch.security.tls.verificationMode=none
Install Fluentd

Manually add the following ConfigMap; the @type elasticsearch output forwards the logs to the Elasticsearch server.

kind: ConfigMap
apiVersion: v1
metadata:
  name: elasticsearch-output
  namespace: efk
data:
  fluentd.conf: |
    # Prometheus Exporter Plugin
    # input plugin that exports metrics
    <source>
      @type prometheus
      port 24231
    </source>
 
    # input plugin that collects metrics from MonitorAgent
    <source>
      @type prometheus_monitor
      <labels>
        host ${hostname}
      </labels>
    </source>
 
    # input plugin that collects metrics for output plugin
    <source>
      @type prometheus_output_monitor
      <labels>
        host ${hostname}
      </labels>
    </source>
 
    # Ignore fluentd own events
    <match fluent.**>
      @type null
    </match>
 
    # TCP input to receive logs from the forwarders
    <source>
      @type forward
      bind 0.0.0.0
      port 24224
    </source>
 
    # HTTP input for the liveness and readiness probes
    <source>
      @type http
      bind 0.0.0.0
      port 9880
    </source>
 
    # Throw the healthcheck to the standard output instead of forwarding it
    <match fluentd.healthcheck>
      @type stdout
    </match>
 
    # Send the logs to the standard output
    <match **>
      @type elasticsearch
      include_tag_key true
      scheme https
      host my-elasticsearch-coordinating-only
      port 9200
      user elastic
      password YourPassword
      ssl_verify false
      logstash_format true
      logstash_prefix k8s 
      request_timeout 30s
 
      <buffer>
        @type file
        path /opt/bitnami/fluentd/logs/buffers/logs.buffer
        flush_thread_count 2
        flush_interval 5s
      </buffer>
    </match>

Then install Fluentd, pointing the aggregator at this ConfigMap:

helm upgrade --install my-fluentd bitnami/fluentd -n efk --set aggregator.configMap=elasticsearch-output
The Kibana Console

After adding an index pattern you can browse the logs.
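Before creating the index pattern, you can confirm that Fluentd is actually writing to Elasticsearch by listing the indices (a sketch; the service name, password, and the k8s prefix follow the values used above, so verify them in your cluster):

# Port-forward the coordinating service and list the Fluentd indices
kubectl -n efk port-forward svc/my-elasticsearch-coordinating-only 9200:9200 &
sleep 2   # give the port-forward a moment to establish
curl -k -u elastic:YourPassword "https://localhost:9200/_cat/indices/k8s-*?v"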



Kubernetes GitLab CI/CD and the Istio Service Mesh


In the previous article we set up a single-node Kubernetes environment. Now we build a GitLab-based CI/CD environment on top of it and demonstrate it with a demo application running on the Istio service mesh.

Install the Istio Service Mesh

Install Istio following https://istio.io/latest/docs/setup/getting-started/:

export https_proxy=http://192.168.0.105:8070
export http_proxy=http://192.168.0.105:8070
curl -L https://istio.io/downloadIstio | sh -
export https_proxy=
export http_proxy=
cd istio-1.12.1
export PATH=$PWD/bin:$PATH
istioctl install --set profile=demo -y
 
#✔ Istio core installed
#✔ Istiod installed
#✔ Egress gateways installed
#✔ Ingress gateways installed
#✔ Installation complete
#Making this installation the default for injection and validation.
#
#Thank you for installing Istio 1.12.  Please take a few minutes to tell us about your install/upgrade experience!  https://forms.gle/FegQbc9UvePd4Z9z7

After the installation completes, install the addons:

kubectl apply -f samples/addons
kubectl rollout status deployment/kiali -n istio-system

Change the kiali Service to type LoadBalancer:

kind: Service
apiVersion: v1
metadata:
  name: kiali
  namespace: istio-system
spec:
  type: LoadBalancer
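The same change can be made non-interactively (a sketch using kubectl patch; equivalent to editing the Service spec above):

kubectl -n istio-system patch service kiali -p '{"spec":{"type":"LoadBalancer"}}'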
Install an NFS Server

GitLab needs persistent volumes, so we give the Kubernetes cluster an NFS server as its storage backend.
Install the NFS server on the server side:

sudo yum install nfs-utils -y
sudo systemctl start nfs-server.service
sudo systemctl enable nfs-server.service
sudo systemctl status nfs-server.service
sudo cat /proc/fs/nfsd/versions
 
sudo mkdir /data
chmod +w /data
sudo mkdir -p /srv/nfs4/data
sudo mount --bind /data /srv/nfs4/data
 
sudo cp -p /etc/fstab /etc/fstab.bak$(date '+%Y%m%d%H%M%S')
sudo echo "/data    /srv/nfs4/data    none    bind    0    0" >> /etc/fstab
sudo mount -a
 
sudo echo "/srv/nfs4    192.168.0.0/24(rw,sync,no_subtree_check,crossmnt,fsid=0)" >> /etc/exports
sudo echo "/srv/nfs4/data    192.168.0.0/24(rw,sync,no_subtree_check,no_root_squash)" >> /etc/exports
sudo exportfs -ra
sudo exportfs -v
sudo systemctl restart nfs-server.service

Test the NFS service from a client:

sudo yum install nfs-utils -y
sudo mkdir /root/data
sudo mount -t nfs -o vers=4 192.168.0.180:/data /root/data
cd /root/data
echo "test nfs write" >> test.txt

Install Helm and the NFS subdir external provisioner:

export https_proxy=http://192.168.0.105:8070
export http_proxy=http://192.168.0.105:8070
wget https://get.helm.sh/helm-v3.7.2-linux-amd64.tar.gz
export https_proxy=
export http_proxy=
 
tar -xvf helm-v3.7.2-linux-amd64.tar.gz
cd /root/linux-amd64
export PATH=$PWD:$PATH
 
helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
export https_proxy=http://192.168.0.105:8070
export http_proxy=http://192.168.0.105:8070
helm fetch nfs-subdir-external-provisioner/nfs-subdir-external-provisioner
export https_proxy=
export http_proxy=
helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner-4.0.14.tgz \
    --set nfs.server=192.168.0.180 \
    --set nfs.path=/data

Edit the StorageClass nfs-client and add storageclass.kubernetes.io/is-default-class: 'true' so that it becomes the default storage provisioner:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: nfs-client
  annotations:
    storageclass.kubernetes.io/is-default-class: 'true'
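Equivalently, the annotation can be added with a one-line patch (a sketch; same effect as editing the StorageClass by hand):

kubectl patch storageclass nfs-client -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'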
Build the CI/CD Environment
Install GitLab

Install GitLab into the Kubernetes cluster.
Note that the Docker proxy configuration from the previous article matters a great deal here.

helm repo add gitlab http://charts.gitlab.io/
helm repo update
kubectl create namespace mygitlab
helm upgrade --install my-gitlab gitlab/gitlab --version 5.6.0 --namespace mygitlab --set global.hosts.https=false --set global.ingress.tls.enabled=false --set global.ingress.configureCertmanager=false --set global.kas.enabled=true --set global.edition=ce

After the installation completes, the GitLab components come up.

Configure Local Domain Names

Query the external address of the ingresses, for example 192.168.0.192:

[root@k8s-master data]# kubectl get ingress --all-namespaces
NAMESPACE   NAME                           CLASS             HOSTS                  ADDRESS         PORTS   AGE
mygitlab    my-gitlab-kas                  my-gitlab-nginx   kas.example.com        192.168.0.192   80      44m
mygitlab    my-gitlab-minio                my-gitlab-nginx   minio.example.com      192.168.0.192   80      44m
mygitlab    my-gitlab-registry             my-gitlab-nginx   registry.example.com   192.168.0.192   80      44m
mygitlab    my-gitlab-webservice-default   my-gitlab-nginx   gitlab.example.com     192.168.0.192   80      44m

Add the custom hostnames to /etc/hosts:

vi /etc/hosts
192.168.0.192       minio.example.com
192.168.0.192       registry.example.com
192.168.0.192       gitlab.example.com
192.168.0.192       kas.example.com

Next, modify the coredns ConfigMap so that DNS inside the cluster can reach GitLab. Note that the IP address 192.168.0.192 must be changed to your own GitLab LoadBalancer address.
Otherwise pods such as my-gitlab-gitlab-runner-*-* cannot reach gitlab.example.com.

kind: ConfigMap
apiVersion: v1
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        ......
        hosts {
          192.168.0.192  minio.example.com
          192.168.0.192  registry.example.com
          192.168.0.192  gitlab.example.com
          fallthrough
        }
        ......
    }
Fix the Docker proxy settings, on both the master and the worker nodes

Append -H tcp://0.0.0.0:2375 to the Docker daemon so that my-gitlab-gitlab-runner-*-* can call Docker:

vi /usr/lib/systemd/system/docker.service
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock -H tcp://0.0.0.0:2375

Make sure http-proxy.conf contains example.com in NO_PROXY:

vi /etc/systemd/system/docker.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://192.168.0.105:8070"
Environment="HTTPS_PROXY=http://192.168.0.105:8070"
Environment="NO_PROXY=localhost,127.0.0.1,example.com"

Modify daemon.json and add "insecure-registries": ["registry.example.com"]:

vi /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2",
  "insecure-registries": ["registry.example.com"]
}
 
systemctl daemon-reload
systemctl restart docker
Allow gitlab-runner to access the non-HTTPS object storage

Next, modify the my-gitlab-gitlab-runner ConfigMap, changing Insecure = false to Insecure = true.

kind: ConfigMap
apiVersion: v1
metadata:
  name: my-gitlab-gitlab-runner
  namespace: mygitlab
data:
  config.template.toml: |
    [[runners]]
      [runners.cache]
        [runners.cache.s3]
          ServerAddress = "minio.example.com"
          BucketName = "runner-cache"
          BucketLocation = "us-east-1"
          Insecure = true
Make sure gitlab-runner works

Restart my-gitlab-gitlab-runner-*-*, then check its logs; once you see "Registering runner... succeeded", it is working:

ERROR: Registering runner... failed                 runner=CiOHA0SP status=couldn't execute POST against http://gitlab.example.com/api/v4/runners: Post http://gitlab.example.com/api/v4/runners: dial tcp: lookup gitlab.example.com on 10.96.0.10:53: no such host
PANIC: Failed to register the runner. You may be having network problems. 
Registration attempt 6 of 30
Runtime platform                                    arch=amd64 os=linux pid=82 revision=5316d4ac version=14.6.0
WARNING: Running in user-mode.                     
WARNING: The user-mode requires you to manually start builds processing: 
WARNING: $ gitlab-runner run                       
WARNING: Use sudo for system-mode:                 
WARNING: $ sudo gitlab-runner...                   
 
Registering runner... succeeded                     runner=CiOHA0SP
Log in to GitLab, change the password, and upload an SSH public key

Retrieve the GitLab root password with the following command, then log in at http://gitlab.example.com/users/sign_in:

kubectl get secret my-gitlab-gitlab-initial-root-password -n mygitlab  -o jsonpath='{.data.password}' | base64 --decode

Generate an SSH key for the GitLab root user with the following commands and add the content of id_rsa.pub at http://gitlab.example.com/-/profile/keys, after which git can be used over SSH:

[root@k8s-master k8s]# ssh-keygen -t rsa -b 2048 -C "mygitlab"
 
[root@k8s-master k8s]# cat /root/.ssh/id_rsa.pub 
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC+tr8cgRitUKHzoIReyPYYsoywtCvn8TLFMC2BjyI3kKWia4zajWkOFQpJwe9eaSlwO3GkqVdpfZ34O+y0caUWfwaw1+inZIlRvx7X6yGmMha27VSmfzj6dfd6TzH2B5KaBUg21nFBYaXaYwLAT0jX8BQ+/QXl8gi33NmH06ctIdVPl9dBkNBvr9rzRMYQnoFtJppKHnN8S/9XnhEJFN3lEvajka+j5VgeOuzLNUs7NvWd9+cbSWNakJulOSK/WSUdzT2oWpY6YP+amAByOIa5Nl2XSRpZ2/oVWG0KsXBHSgwhIlu6WK5GzTVSxRRdQNjSyqNTeuPmsh6WC1alWPGl mygitlab
Increase the maximum allowed upload size for JAR files

Deploy the Service Mesh Demo Application
Create the Namespace and Enable Istio Injection

Create the namespace bookstore-servicemesh in Kubernetes:

kubectl create namespace bookstore-servicemesh
kubectl label namespace bookstore-servicemesh istio-injection=enabled

Bind the ServiceAccount to cluster-admin so that it has cluster administration rights; Helm will use this later:

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: mygitlab-admin-role-default
subjects:
  - kind: ServiceAccount
    name: default
    namespace: mygitlab
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
The CI/CD Process

Download the bookstore source code (adapted from https://github.com/fenixsoft/servicemesh_arch_istio.git), unpack it, and push it to git@gitlab.example.com:root/bookstore.git.

Wait for the GitLab CI/CD pipeline to run; the application is then built and deployed automatically into the namespace bookstore-servicemesh.
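A sketch of the push step (the remote URL follows the repository named above; the branch name depends on your GitLab default-branch setting):

cd bookstore
git init && git add . && git commit -m "import bookstore demo"
git remote add origin git@gitlab.example.com:root/bookstore.git
git push -u origin master   # or main, depending on the default branch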

Give book-admin permission to pull from the private registry

In the namespace bookstore-servicemesh, create a pull secret for the private Docker registry registry.example.com:

kubectl create secret docker-registry docker-gitlab --docker-server=registry.example.com --docker-username=root --docker-password=yougitlabpassword -n bookstore-servicemesh

Associate this secret with the ServiceAccount book-admin:

kind: ServiceAccount
apiVersion: v1
metadata:
  name: book-admin
  namespace: bookstore-servicemesh
......
imagePullSecrets:
  - name: docker-gitlab

After the deployment succeeds, the bookstore pods are all running.

Testing

The Istio ingress gateway address can be obtained from the istio-ingressgateway Service.
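For example (a sketch; the demo profile installs istio-ingressgateway as a LoadBalancer Service, and MetalLB from the earlier article assigns it an external IP):

kubectl -n istio-system get service istio-ingressgateway
# The EXTERNAL-IP column is the address to use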

The application can then be accessed through that address.

The service topology graph can be viewed in Kiali.


Setting Up a Single-Node Kubernetes Environment


This time we set up a single-node Kubernetes environment on CentOS 8 for day-to-day development. Compared with setting up a highly available Kubernetes cluster: the OS is upgraded to CentOS 8, the control plane has a single node, and there is only one worker node.

System Plan

OS: CentOS-8.4.2105-x86_64
Network: master node 192.168.0.180; worker node 192.168.0.181
Kubernetes: 1.23.1
kubeadm: 1.23.1
Docker: 20.10.9

Preparation

Most of the software used later has to be downloaded from outside mainland China, so it is convenient to rent a pay-as-you-go Linux host in Alibaba Cloud's Hong Kong region, with a public IP of, say, 47.52.220.100.
Then start a proxy server with pproxy:

pip3 install pproxy
pproxy -l http://0.0.0.0:8070 -r ssh://47.52.220.100/#root:password -v

The proxy server is now available on port 8070.

Build the Base Server Image

Install all the required software on it; later servers are simply copied from this base image.

Install CentOS 8 in a Virtual Machine

The network adapter must use bridged mode.
After installation, set the IP address manually; do not use DHCP.

Pre-checks and Configuration

1. Turn off the firewall; configuring firewall rules for all the required ports is too cumbersome.
2. Disable SELinux.
3. Make sure the MAC address and product_uuid are unique on every node.
4. Disable swap; the kubelet requires swap to be disabled to work properly.
5. Enable IP forwarding.

#!/bin/bash
 
echo "###############################################"
echo "Please ensure your OS is CentOS8 64 bits"
echo "Please ensure your machine has full network connection and internet access"
echo "Please ensure run this script with root user"
 
# Check hostname, Mac addr and product_uuid
echo "###############################################"
echo "Please check hostname as below:"
uname -a
# Set hostname if want
#hostnamectl set-hostname k8s-master
 
echo "###############################################"
echo "Please check Mac addr and product_uuid as below:"
ip link
cat /sys/class/dmi/id/product_uuid
 
echo "###############################################"
echo "Please check default route:"
ip route show
 
# Stop firewalld
echo "###############################################"
echo "Stop firewalld"
sudo systemctl stop firewalld
sudo systemctl disable firewalld
 
# Disable SELinux
echo "###############################################"
echo "Disable SELinux"
sudo getenforce
 
sudo setenforce 0
sudo cp -p /etc/selinux/config /etc/selinux/config.bak$(date '+%Y%m%d%H%M%S')
sudo sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
 
sudo getenforce
 
# Turn off Swap
echo "###############################################"
echo "Turn off Swap"
free -m
sudo cat /proc/swaps
 
sudo swapoff -a
 
sudo cp -p /etc/fstab /etc/fstab.bak$(date '+%Y%m%d%H%M%S')
sudo sed -i "s/\/dev\/mapper\/rhel-swap/\#\/dev\/mapper\/rhel-swap/g" /etc/fstab
sudo sed -i "s/\/dev\/mapper\/centos-swap/\#\/dev\/mapper\/centos-swap/g" /etc/fstab
sudo sed -i "s/\/dev\/mapper\/cl-swap/\#\/dev\/mapper\/cl-swap/g" /etc/fstab
sudo mount -a
 
free -m
sudo cat /proc/swaps
 
# Setup iptables (routing)
echo "###############################################"
echo "Setup iptables (routing)"
sudo cat <<EOF >  /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-arptables = 1
net.ipv4.ip_forward = 1
EOF
 
sudo sysctl --system
iptables -P FORWARD ACCEPT
 
# Check ports
echo "###############################################"
echo "Check API server port(s)"
netstat -nlp | grep "8080\|6443"
 
echo "Check ETCD port(s)"
netstat -nlp | grep "2379\|2380"
 
echo "Check port(s): kublet, kube-scheduler, kube-controller-manager"
netstat -nlp | grep "10250\|10251\|10252"
Install Docker

Uninstall any old Docker packages and install the version we need.

#!/bin/bash
 
set -e
 
# Uninstall installed docker
sudo yum remove -y docker \
                  docker-client \
                  docker-client-latest \
                  docker-common \
                  docker-latest \
                  docker-latest-logrotate \
                  docker-logrotate \
                  docker-selinux \
                  docker-engine-selinux \
                  docker-engine \
                  runc
# If you need set proxy, append one line
#vi /etc/yum.conf
#proxy=http://192.168.0.105:8070
 
 
# Set up repository
sudo yum install -y yum-utils device-mapper-persistent-data lvm2
 
# Use Aliyun Docker
sudo yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
 
# Install a validated docker version
sudo yum install -y docker-ce-20.10.9 docker-ce-cli-20.10.9 containerd.io-1.4.12
 
# Setup Docker daemon https://kubernetes.io/zh/docs/setup/production-environment/container-runtimes/#docker
mkdir -p /etc/docker
 
sudo cat <<EOF | sudo tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF
 
sudo mkdir -p /etc/systemd/system/docker.service.d
 
# Run Docker as systemd service
sudo systemctl daemon-reload
sudo systemctl enable docker
sudo systemctl start docker
 
# Check Docker version
docker version
Install Kubernetes
#!/bin/bash
 
set -e
 
sudo cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
 
yum clean all
yum makecache -y
yum repolist all
 
setenforce 0
 
sudo yum install -y kubelet-1.23.1 kubeadm-1.23.1 kubectl-1.23.1 --disableexcludes=kubernetes
 
# Check installed Kubernetes packages
sudo yum list installed | grep kube
 
sudo systemctl daemon-reload
sudo systemctl enable kubelet
sudo systemctl start kubelet
Pre-pull the Docker Images

This covers the Kubernetes images and the Calico network plugin images.

mkdir -p /etc/systemd/system/docker.service.d
vi /etc/systemd/system/docker.service.d/http-proxy.conf
# Add the following configuration
[Service]
Environment="HTTP_PROXY=http://192.168.0.105:8070" "HTTPS_PROXY=http://192.168.0.105:8070" "NO_PROXY=localhost,127.0.0.1,registry.example.com"
 
# Reload the configuration and restart the Docker service
systemctl daemon-reload
systemctl restart docker
#!/bin/bash
 
# Run `kubeadm config images list` to check required images
# Check version in https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init/
# Search "Running kubeadm without an internet connection"
# For running kubeadm without an internet connection you have to pre-pull the required master images for the version of choice:
KUBE_VERSION=v1.23.1
KUBE_PAUSE_VERSION=3.6
ETCD_VERSION=3.5.1-0
CORE_DNS_VERSION=1.8.6
 
# In Kubernetes 1.12 and later, the k8s.gcr.io/kube-*, k8s.gcr.io/etcd and k8s.gcr.io/pause images don’t require an -${ARCH} suffix
images=(kube-proxy:${KUBE_VERSION}
kube-scheduler:${KUBE_VERSION}
kube-controller-manager:${KUBE_VERSION}
kube-apiserver:${KUBE_VERSION}
pause:${KUBE_PAUSE_VERSION}
etcd:${ETCD_VERSION})
 
for imageName in ${images[@]} ; do
  docker pull k8s.gcr.io/$imageName
done
docker pull coredns/coredns:${CORE_DNS_VERSION}
 
docker images
 
docker pull calico/cni:v3.21.2
docker pull calico/pod2daemon-flexvol:v3.21.2
docker pull calico/node:v3.21.2
docker pull calico/kube-controllers:v3.21.2
 
docker images | grep calico
Configure the Master Node

Make a copy of the base image above, name it k8s-master, and change the hostname.

echo "192.168.0.180    k8s-master" >> /etc/hosts
echo "192.168.0.181    k8s-worker" >> /etc/hosts
Initialize the Cluster
#!/bin/bash
 
set -e
 
# Reset firstly if ran kubeadm init before
kubeadm reset -f
 
# kubeadm init with calico network
CONTROL_PLANE_ENDPOINT="192.168.0.180:6443"
 
kubeadm init \
  --kubernetes-version=v1.23.1 \
  --control-plane-endpoint=${CONTROL_PLANE_ENDPOINT} \
  --service-cidr=10.96.0.0/16 \
  --pod-network-cidr=10.244.0.0/16 \
  --upload-certs
 
# Make kubectl works
mkdir -p $HOME/.kube
sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
 
cp -p $HOME/.bash_profile $HOME/.bash_profile.bak$(date '+%Y%m%d%H%M%S')
echo "export KUBECONFIG=$HOME/.kube/config" >> $HOME/.bash_profile
source $HOME/.bash_profile
 
# Get cluster information
kubectl cluster-info

Record the kubeadm join command printed by the script above; it will be needed later.
If you lose the kubeadm join command, run kubeadm token create --print-join-command to obtain it again, and kubeadm init phase upload-certs --upload-certs to obtain a new certificate key.

Of course, the cluster is not working yet: you will find coredns still Pending, because no CNI plugin has been installed.

[root@k8s-master ~]# kubectl get pods --all-namespaces
NAMESPACE     NAME                                 READY   STATUS    RESTARTS   AGE
kube-system   coredns-64897985d-p46b8              0/1     Pending   0          4m12s
kube-system   coredns-64897985d-tbdxl              0/1     Pending   0          4m12s
kube-system   etcd-k8s-master                      1/1     Running   1          4m27s
kube-system   kube-apiserver-k8s-master            1/1     Running   1          4m26s
kube-system   kube-controller-manager-k8s-master   1/1     Running   1          4m27s
kube-system   kube-proxy-dwj6v                     1/1     Running   0          52s
kube-system   kube-proxy-nszmz                     1/1     Running   0          4m13s
kube-system   kube-scheduler-k8s-master            1/1     Running   1          4m26s
Install the Network Plugin
#!/bin/bash
 
set -e
 
wget -O calico.yaml https://docs.projectcalico.org/v3.21/manifests/calico.yaml
 
kubectl apply -f calico.yaml
 
# Wait a while to let network takes effect
sleep 30
 
# Check daemonset
kubectl get ds -n kube-system -l k8s-app=calico-node
 
# Check pod status and ready
kubectl get pods -n kube-system -l k8s-app=calico-node
 
# Check apiservice status
kubectl get apiservice v1.crd.projectcalico.org -o yaml

Now all the pods are in a healthy state.

NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE     IP               NODE         NOMINATED NODE   READINESS GATES
kube-system   calico-kube-controllers-647d84984b-gmlhv   1/1     Running   0          67s     10.244.254.131   k8s-worker              
kube-system   calico-node-bj8nn                          1/1     Running   0          67s     192.168.0.181    k8s-worker              
kube-system   calico-node-m77mk                          1/1     Running   0          67s     192.168.0.180    k8s-master              
kube-system   coredns-64897985d-p46b8                    1/1     Running   0          12m     10.244.254.130   k8s-worker              
kube-system   coredns-64897985d-tbdxl                    1/1     Running   0          12m     10.244.254.129   k8s-worker              
kube-system   etcd-k8s-master                            1/1     Running   1          12m     192.168.0.180    k8s-master              
kube-system   kube-apiserver-k8s-master                  1/1     Running   1          12m     192.168.0.180    k8s-master              
kube-system   kube-controller-manager-k8s-master         1/1     Running   1          12m     192.168.0.180    k8s-master              
kube-system   kube-proxy-dwj6v                           1/1     Running   0          8m49s   192.168.0.181    k8s-worker              
kube-system   kube-proxy-nszmz                           1/1     Running   0          12m     192.168.0.180    k8s-master              
kube-system   kube-scheduler-k8s-master                  1/1     Running   1          12m     192.168.0.180    k8s-master              
Install MetalLB as the Cluster Load-Balancer Provider

https://metallb.universe.tf/installation/

Modify the strictARP value:
kubectl edit configmap -n kube-system kube-proxy

apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  strictARP: true
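For the change to take effect, restart the kube-proxy pods after saving the edit (a sketch):

kubectl -n kube-system rollout restart daemonset kube-proxy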

Download and install MetalLB:

export https_proxy=http://192.168.0.105:8070
export http_proxy=http://192.168.0.105:8070
wget https://raw.githubusercontent.com/metallb/metallb/v0.11.0/manifests/namespace.yaml
wget https://raw.githubusercontent.com/metallb/metallb/v0.11.0/manifests/metallb.yaml
 
export https_proxy=
export http_proxy=
kubectl apply -f namespace.yaml
kubectl apply -f metallb.yaml

Create a file lb.yaml as follows:

apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.0.190-192.168.0.250

Then apply it so that the MetalLB configuration takes effect:
kubectl apply -f lb.yaml

Configure the Worker Node

Make a copy of the base image above, name it k8s-worker, and change the hostname; remember to change the MAC address and IP address as well.

echo "192.168.0.180    k8s-master" >> /etc/hosts
echo "192.168.0.181    k8s-worker" >> /etc/hosts
Join the Worker Node to the Cluster

Run the "worker node" join command printed in the kubeadm init output.
If you lose the kubeadm join command, run kubeadm token create --print-join-command to obtain it again, and kubeadm init phase upload-certs --upload-certs to obtain a new certificate key.

kubeadm join 192.168.0.180:6443 --token srmce8.eonpa2amiwek1x0n \
	--discovery-token-ca-cert-hash sha256:048c067f64ded80547d5c6acf2f9feda45d62c2fb02c7ab6da29d52b28eee1bb
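Once the join completes, verify from the master node that both nodes are registered (the node names follow the hostnames configured above):

kubectl get nodes -o wide   # k8s-master and k8s-worker should both become Ready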
Install the Dashboard
export https_proxy=http://192.168.0.105:8070
export http_proxy=http://192.168.0.105:8070
wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.2.0/aio/deploy/recommended.yaml
 
export https_proxy=
export http_proxy=
kubectl apply -f recommended.yaml

Create the file dashboard-adminuser.yaml and apply it to add an administrator:

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
 
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard

Then run the following commands and copy the printed token to log in to the Dashboard:

kubectl apply -f dashboard-adminuser.yaml
kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep admin-user | awk '{print $1}')

Change the Service type from ClusterIP to LoadBalancer with the following command:

kubectl edit service -n kubernetes-dashboard kubernetes-dashboard

Query the kubernetes-dashboard Service; it now has an external IP address:

[root@k8s-master ~]# kubectl get service --all-namespaces
NAMESPACE              NAME                        TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                  AGE
default                kubernetes                  ClusterIP      10.96.0.1       <none>          443/TCP                  45m
kube-system            kube-dns                    ClusterIP      10.96.0.10      <none>          53/UDP,53/TCP,9153/TCP   45m
kubernetes-dashboard   dashboard-metrics-scraper   ClusterIP      10.96.46.101    <none>          8000/TCP                 11m
kubernetes-dashboard   kubernetes-dashboard        LoadBalancer   10.96.155.101   192.168.0.190   443:31019/TCP            11m

The Dashboard is now accessible at https://192.168.0.190

Install Metrics Server
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  - nodes/stats
  - namespaces
  - configmaps
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    k8s-app: metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        image: k8s.gcr.io/metrics-server/metrics-server:v0.5.2
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /livez
            port: https
            scheme: HTTPS
          periodSeconds: 10
        name: metrics-server
        ports:
        - containerPort: 4443
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: https
            scheme: HTTPS
          initialDelaySeconds: 20
          periodSeconds: 10
        resources:
          requests:
            cpu: 100m
            memory: 200Mi
        securityContext:
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        volumeMounts:
        - mountPath: /tmp
          name: tmp-dir
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      volumes:
      - emptyDir: {}
        name: tmp-dir
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100
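Assuming the manifest above is saved as metrics-server.yaml (the filename is an assumption), apply it and verify that metrics start flowing:

kubectl apply -f metrics-server.yaml
kubectl -n kube-system rollout status deployment/metrics-server
kubectl top nodes   # shows CPU/memory usage once the server is ready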
