Kubernetes Installation from Binaries

2022-12-09

This guide mainly follows https://github.com/opsnull/follow-me-install-kubernetes-cluster and uses flannel and Docker.

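System information

| Role | OS | CPU cores | Memory | Hostname | IP | Installed components |
| --- | --- | --- | --- | --- | --- | --- |
| master | 18.04.1-Ubuntu | 4 | 8G | master | 192.168.0.107 | kubectl, kube-apiserver, kube-controller-manager, kube-scheduler, etcd, flanneld |
| slave | 18.04.1-Ubuntu | 4 | 4G | slave | 192.168.0.114 | docker, flanneld, kubelet, kube-proxy, coredns |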

Kubernetes and Docker versions

| Software | Version |
| --- | --- |
| k8s | 1.17.2 |
| etcd | v3.3.18 |
| coredns | 1.6.6 (Docker image) |
| flannel | v0.11.0 |
| docker | 18.09 |

Pre-installation preparation (run on both the master and the slave node)

    Disable swap

    sudo swapoff -a
    sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab

    Configure the package sources

    Add a file named system.list under /etc/apt/sources.list.d/ with the following content

    deb http://mirrors.aliyun.com/ubuntu/ bionic main restricted
    deb http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted
    deb http://mirrors.aliyun.com/ubuntu/ bionic universe
    deb http://mirrors.aliyun.com/ubuntu/ bionic-updates universe
    deb http://mirrors.aliyun.com/ubuntu/ bionic multiverse
    deb http://mirrors.aliyun.com/ubuntu/ bionic-updates multiverse
    deb http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse

    Then run

    sudo apt-get update
    
    

    Create the working directories

    mkdir -p /opt/k8s/{bin,work} /etc/{kubernetes,etcd}/cert
    
    

    Prepend /opt/k8s/bin to $PATH

    echo 'PATH=/opt/k8s/bin:$PATH' >>/root/.bashrc
    source /root/.bashrc

    Install the SSH service and allow root login

    apt install openssh-server
    
    # Edit /etc/ssh/sshd_config: below the line "#PermitRootLogin prohibit-password" add "PermitRootLogin yes", then restart the SSH service
    
    systemctl restart ssh.service
    
    

    Install the required tools

    apt install -y ipvsadm ipset curl jq
    
    

    Set up host names (add them to /etc/hosts)

    cat >> /etc/hosts <<EOF
    192.168.0.107 master
    192.168.0.114 slave
    EOF

    Set up SSH trust between the nodes (run on the master node only)

    ssh-keygen -t rsa
    ssh-copy-id root@192.168.0.114

Create the CA root certificate and key (run on the master node)

    Install the cfssl toolset

    cd /opt/k8s/work
    
    wget https://github.com/cloudflare/cfssl/releases/download/v1.4.1/cfssl_1.4.1_linux_amd64
    cp cfssl_1.4.1_linux_amd64 /opt/k8s/bin/cfssl
    wget https://github.com/cloudflare/cfssl/releases/download/v1.4.1/cfssljson_1.4.1_linux_amd64
    cp cfssljson_1.4.1_linux_amd64 /opt/k8s/bin/cfssljson
    wget https://github.com/cloudflare/cfssl/releases/download/v1.4.1/cfssl-certinfo_1.4.1_linux_amd64
    cp cfssl-certinfo_1.4.1_linux_amd64 /opt/k8s/bin/cfssl-certinfo
    chmod +x /opt/k8s/bin/*

    Create the CA configuration file

    cd /opt/k8s/work
    cat > ca-config.json <<EOF
    {
    "signing": {
    "default": {
    "expiry": "87600h"
    },
    "profiles": {
    "kubernetes": {
    "usages": [
    "signing",
    "key encipherment",
    "server auth",
    "client auth"
    ],
    "expiry": "87600h"
    }
    }
    }
    }
    EOF

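    signing: the certificate can be used to sign other certificates (the generated ca.pem has CA=TRUE);
    server auth: a client can use this certificate to verify the certificate presented by a server;
    client auth: a server can use this certificate to verify the certificate presented by a client;
    expiry: "87600h": the certificate is valid for 10 years.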
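
The expiry value is given in hours, so the 10-year figure can be checked with simple arithmetic. The snippet below is only an illustrative sketch; `expiry_hours_to_days` is a hypothetical helper and not part of cfssl.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of cfssl): convert an expiry such as "87600h"
# into whole days.
expiry_hours_to_days() {
  local hours="${1%h}"   # strip the trailing unit
  echo $(( hours / 24 ))
}

days="$(expiry_hours_to_days "87600h")"
echo "87600h = ${days} days, roughly $(( days / 365 )) years"
```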

    Create the certificate signing request (CSR) file

    cd /opt/k8s/work
    cat > ca-csr.json <<EOF
    {
    "CN": "kubernetes",
    "key": {
    "algo": "rsa",
    "size": 2048
    },
    "names": [
    {
    "C": "CN",
    "ST": "NanJing",
    "L": "NanJing",
    "O": "k8s",
    "OU": "system"
    }
    ],
    "ca": {
    "expiry": "87600h"
    }
    }
    EOF

    Generate the certificate

    cd /opt/k8s/work
    cfssl gencert -initca ca-csr.json | cfssljson -bare ca
    ls ca*

    Install the certificates

    cd /opt/k8s/work
    
    cp ca*.pem ca-config.json /etc/kubernetes/cert
    
    # Distribute to the slave node
    export node_ip=192.168.0.114
    scp ca*.pem ca-config.json root@${node_ip}:/etc/kubernetes/cert/

Deploy etcd (run on the master node)

    Download etcd

    cd /opt/k8s/work
    wget https://github.com/etcd-io/etcd/releases/download/v3.3.18/etcd-v3.3.18-linux-amd64.tar.gz
    tar -xvf etcd-v3.3.18-linux-amd64.tar.gz

    Install etcd

    cd /opt/k8s/work
    
    cp etcd-v3.3.18-linux-amd64/etcd* /opt/k8s/bin/
    chmod +x /opt/k8s/bin/*

    Create the etcd certificate and private key

      Create the certificate signing request (CSR) file


      cd /opt/k8s/work
      cat > etcd-csr.json <<EOF
      {
      "CN": "etcd",
      "hosts": [
      "127.0.0.1",
      "192.168.0.107"
      ],
      "key": {
      "algo": "rsa",
      "size": 2048
      },
      "names": [
      {
      "C": "CN",
      "ST": "NanJing",
      "L": "NanJing",
      "O": "k8s",
      "OU": "system"
      }
      ]
      }
      EOF

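      The hosts field lists the IPs of the etcd nodes that are authorized to use this certificate.

      Generate the certificate and private key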

      cd /opt/k8s/work
      cfssl gencert -ca=/opt/k8s/work/ca.pem \
      -ca-key=/opt/k8s/work/ca-key.pem \
      -config=/opt/k8s/work/ca-config.json \
      -profile=kubernetes etcd-csr.json | cfssljson -bare etcd
      ls etcd*pem

      Install the certificates

      cd /opt/k8s/work
      cp etcd*.pem /etc/etcd/cert/

    Create the etcd systemd unit file

    cat > /etc/systemd/system/etcd.service <<EOF
    [Unit]
    Description=Etcd Server
    After=network.target
    After=network-online.target
    Wants=network-online.target
    Documentation=https://github.com/coreos

    [Service]
    Type=notify
    WorkingDirectory=/data/k8s/etcd/data
    ExecStart=/opt/k8s/bin/etcd \\
    --data-dir=/etc/etcd/cfg/etcd \\
    --name=etcd-chengf \\
    --cert-file=/etc/etcd/cert/etcd.pem \\
    --key-file=/etc/etcd/cert/etcd-key.pem \\
    --trusted-ca-file=/etc/kubernetes/cert/ca.pem \\
    --peer-cert-file=/etc/etcd/cert/etcd.pem \\
    --peer-key-file=/etc/etcd/cert/etcd-key.pem \\
    --peer-trusted-ca-file=/etc/kubernetes/cert/ca.pem \\
    --peer-client-cert-auth \\
    --client-cert-auth \\
    --listen-peer-urls=https://192.168.0.107:2380 \\
    --initial-advertise-peer-urls=https://192.168.0.107:2380 \\
    --listen-client-urls=https://192.168.0.107:2379,http://127.0.0.1:2379 \\
    --advertise-client-urls=https://192.168.0.107:2379 \\
    --initial-cluster-token=etcd-cluster-0 \\
    --initial-cluster=etcd-chengf=https://192.168.0.107:2380 \\
    --initial-cluster-state=new \\
    --auto-compaction-mode=periodic \\
    --auto-compaction-retention=1 \\
    --max-request-bytes=33554432 \\
    --quota-backend-bytes=6442450944 \\
    --heartbeat-interval=250 \\
    --election-timeout=2000
    Restart=on-failure
    RestartSec=5
    LimitNOFILE=65536

    [Install]
    WantedBy=multi-user.target
    EOF

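    WorkingDirectory, --data-dir: the working directory and the data directory; the directory must be created before the service is started;
    --name: the node name; when --initial-cluster-state is new, the value of --name must appear in the --initial-cluster list;
    --cert-file, --key-file: the certificate and private key that the etcd server uses when communicating with clients;
    --trusted-ca-file: the CA certificate that signed the client certificates, used to verify them;
    --peer-cert-file, --peer-key-file: the certificate and private key that etcd uses when communicating with peers;
    --peer-trusted-ca-file: the CA certificate that signed the peer certificates, used to verify them.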
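
The constraint on --name can be checked before the service is started. The following is a minimal sketch and not part of the original procedure; `name_in_initial_cluster` is a hypothetical helper that tests whether a node name appears as a key in an --initial-cluster style string.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from etcd): succeed when the given node name
# appears as a key in a comma-separated value of the form
# name1=url1,name2=url2.
name_in_initial_cluster() {
  local name="$1" cluster="$2" entry
  local IFS=','
  for entry in $cluster; do
    # Everything before the first '=' is the member name.
    if [ "${entry%%=*}" = "$name" ]; then
      return 0
    fi
  done
  return 1
}

# Values taken from the unit file above.
if name_in_initial_cluster "etcd-chengf" "etcd-chengf=https://192.168.0.107:2380"; then
  echo "name is listed in --initial-cluster"
else
  echo "name is missing from --initial-cluster"
fi
```

With more than one member, the second argument would simply contain several comma-separated name=URL pairs.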

    Create the etcd data directory

    mkdir -p /data/k8s/etcd/data

    Start the etcd service

    systemctl daemon-reload && systemctl enable etcd && systemctl start etcd
    
    

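    Check the startup result

    systemctl status etcd|grep Active

    Make sure the status is active (running); otherwise check the logs to find the cause.

    If something goes wrong, inspect the logs with:

    journalctl -u etcd

    Verify the service health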

    export ETCD_ENDPOINTS=https://192.168.0.107:2379
    
    etcdctl \
    --endpoints=${ETCD_ENDPOINTS} \
    --ca-file=/etc/kubernetes/cert/ca.pem \
    --cert-file=/etc/etcd/cert/etcd.pem \
    --key-file=/etc/etcd/cert/etcd-key.pem cluster-health
    etcdctl \
    --endpoints=${ETCD_ENDPOINTS} \
    --ca-file=/etc/kubernetes/cert/ca.pem \
    --cert-file=/etc/etcd/cert/etcd.pem \
    --key-file=/etc/etcd/cert/etcd-key.pem member list

    Output

    root@master:/opt/k8s/work# etcdctl --endpoints=${ETCD_ENDPOINTS} --ca-file=/etc/kubernetes/cert/ca.pem --cert-file=/etc/etcd/cert/etcd.pem --key-file=/etc/etcd/cert/etcd-key.pem cluster-health
    member c0d3b56a9878e38f is healthy: got healthy result from https://192.168.0.107:2379
    cluster is healthy
    root@master:/opt/k8s/work# etcdctl --endpoints=${ETCD_ENDPOINTS} --ca-file=/etc/kubernetes/cert/ca.pem --cert-file=/etc/etcd/cert/etcd.pem --key-file=/etc/etcd/cert/etcd-key.pem member list
    c0d3b56a9878e38f: name=etcd-chengf peerURLs=https://192.168.0.107:2380 clientURLs=https://192.168.0.107:2379 isLeader=true

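Deploy the flannel network (run on the master node)

The kubelet component of Kubernetes depends on the Docker service, and Docker relies on flannel to configure the IP address of the docker0 bridge, so the flannel network component has to be installed first.

flannel uses vxlan to create a Pod network through which the nodes can reach each other. It uses UDP port 8472, which must be opened (for example on public clouds such as AWS).

When flanneld starts for the first time, it reads the configured Pod network from etcd, allocates an unused subnet to the local node, and then creates the flannel.1 network interface (the name may differ, for example flannel1).

flannel writes the Pod subnet allocated to it into /run/flannel/docker. Docker later uses the environment variables in this file to configure the docker0 bridge, so that all Pod containers on the node get their IPs from this subnet.

    Download and install the flanneld binaries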


    cd /opt/k8s/work
    mkdir flannel
    wget https://github.com/coreos/flannel/releases/download/v0.11.0/flannel-v0.11.0-linux-amd64.tar.gz
    tar -xzvf flannel-v0.11.0-linux-amd64.tar.gz -C flannel
    cp flannel/{flanneld,mk-docker-opts.sh} /opt/k8s/bin/
    export node_ip=192.168.0.114
    scp flannel/{flanneld,mk-docker-opts.sh} root@${node_ip}:/opt/k8s/bin/

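    Create the flanneld certificate and private key

    flanneld reads and writes subnet allocation data in the etcd cluster, and the etcd cluster has mutual x509 certificate authentication enabled, so a certificate and private key have to be generated for flanneld.

      Create the certificate signing request (CSR)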

      cd /opt/k8s/work
      cat > flanneld-csr.json <<EOF
      {
      "CN": "flanneld",
      "hosts": [],
      "key": {
      "algo": "rsa",
      "size": 2048
      },
      "names": [
      {
      "C": "CN",
      "ST": "NanJing",
      "L": "NanJing",
      "O": "k8s",
      "OU": "system"
      }
      ]
      }
      EOF

      Generate the certificate and private key

      cfssl gencert -ca=/opt/k8s/work/ca.pem \
      -ca-key=/opt/k8s/work/ca-key.pem \
      -config=/opt/k8s/work/ca-config.json \
      -profile=kubernetes flanneld-csr.json | cfssljson -bare flanneld
      ls flanneld*pem

      Distribute the generated certificate and private key to all nodes

      cd /opt/k8s/work
      mkdir -p /etc/flanneld/cert
      cp flanneld*.pem /etc/flanneld/cert
      export node_ip=192.168.0.114
      ssh root@${node_ip} "mkdir -p /etc/flanneld/cert"
      scp flanneld*.pem root@${node_ip}:/etc/flanneld/cert

    Write the cluster Pod network configuration to etcd

    cd /opt/k8s/work
    
    export FLANNEL_ETCD_PREFIX="/kubernetes/network"
    export ETCD_ENDPOINTS="https://192.168.0.107:2379"
    etcdctl \
    --endpoints=${ETCD_ENDPOINTS} \
    --ca-file=/opt/k8s/work/ca.pem \
    --cert-file=/opt/k8s/work/flanneld.pem \
    --key-file=/opt/k8s/work/flanneld-key.pem \
    mk ${FLANNEL_ETCD_PREFIX}/config '{"Network":"172.30.0.0/16", "SubnetLen": 24, "Backend": {"Type": "vxlan"}}'

    The prefix length of the Pod network given in Network (for example /16) must be smaller than the value of SubnetLen (for example 24).
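
The rule above can be verified before the configuration is written to etcd. This is a rough sketch under the assumption that the values are available as shell variables; `check_flannel_subnet_len` is a hypothetical helper and not part of flannel or etcdctl.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of flannel): check that the prefix length of
# the cluster network is smaller than the per-node SubnetLen.
check_flannel_subnet_len() {
  local network="$1" subnet_len="$2"
  local prefix="${network##*/}"   # text after the last '/', e.g. 16
  [ "$prefix" -lt "$subnet_len" ]
}

# Values from the etcdctl command above: a /16 network split into /24 subnets.
network="172.30.0.0/16"
subnet_len=24
if check_flannel_subnet_len "$network" "$subnet_len"; then
  # Pure arithmetic: a /16 range contains 2^(24-16) blocks of size /24.
  echo "ok: the range contains $(( 1 << (subnet_len - ${network##*/}) )) blocks of size /${subnet_len}"
else
  echo "invalid: the Network prefix must be smaller than SubnetLen"
fi
```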

    Create the flanneld systemd unit file


    cd /opt/k8s/work
    export FLANNEL_ETCD_PREFIX="/kubernetes/network"
    export ETCD_ENDPOINTS="https://192.168.0.107:2379"
    cat > flanneld.service << EOF
    [Unit]
    Description=Flanneld overlay address etcd agent
    After=network.target
    After=network-online.target
    Wants=network-online.target
    After=etcd.service
    Before=docker.service

    [Service]
    Type=notify
    ExecStart=/opt/k8s/bin/flanneld \\
    -etcd-cafile=/etc/kubernetes/cert/ca.pem \\
    -etcd-certfile=/etc/flanneld/cert/flanneld.pem \\
    -etcd-keyfile=/etc/flanneld/cert/flanneld-key.pem \\
    -etcd-endpoints=${ETCD_ENDPOINTS} \\
    -etcd-prefix=${FLANNEL_ETCD_PREFIX} \\
    -ip-masq
    ExecStartPost=/opt/k8s/bin/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker
    Restart=always
    RestartSec=5
    StartLimitInterval=0

    [Install]
    WantedBy=multi-user.target
    RequiredBy=docker.service
    EOF

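    The mk-docker-opts.sh script writes the Pod subnet allocated to flanneld into the file given by -d, here /run/flannel/docker. When Docker starts later, it uses the environment variables in this file to configure the docker0 bridge. The -k option sets the name of the combined variable in the generated file, which the Docker unit file uses later;
    flanneld talks to the other nodes through the interface that carries the system default route. On nodes with several network interfaces (for example a private and a public one), the -iface option selects the interface to use;
    -ip-masq: flanneld sets up SNAT rules for traffic that leaves the Pod network and sets the --ip-masq variable passed to Docker (in /run/flannel/docker) to false, so that Docker no longer creates its own SNAT rules. When Docker's --ip-masq is true, the SNAT rules it creates are rather heavy-handed: every request that originates from a Pod on the node and leaves through an interface other than docker0 is SNATed. Requests to Pods on other nodes then carry the IP of the flannel.1 interface as their source, and the destination Pod cannot see the real source Pod IP. The SNAT rules created by flanneld are milder and only apply to requests that go outside the Pod network.

    Distribute the flanneld unit file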

    cd /opt/k8s/work
    
    cp flanneld.service /etc/systemd/system/
    
    export node_ip=192.168.0.114
    scp flanneld.service root@${node_ip}:/etc/systemd/system/

    Start the flanneld service

    systemctl daemon-reload && systemctl enable flanneld && systemctl restart flanneld
    
    ssh root@${node_ip} "systemctl daemon-reload && systemctl enable flanneld && systemctl restart flanneld"
    
    

    Check the startup result

    systemctl status flanneld|grep Active
    
    export node_ip=192.168.0.114
    ssh root@${node_ip} "systemctl status flanneld|grep Active"

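    Make sure the status is active (running); otherwise check the logs to find the cause.

    If something goes wrong, inspect the logs with:

    journalctl -u flanneld

    Check the Pod network configuration assigned to the flanneld instances

    export FLANNEL_ETCD_PREFIX="/kubernetes/network"
    export ETCD_ENDPOINTS="https://192.168.0.107:2379"
    etcdctl \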
    --endpoints=${ETCD_ENDPOINTS} \
    --ca-file=/etc/kubernetes/cert/ca.pem \
    --cert-file=/etc/flanneld/cert/flanneld.pem \
    --key-file=/etc/flanneld/cert/flanneld-key.pem \
    get ${FLANNEL_ETCD_PREFIX}/config

    Output

    {"Network":"172.30.0.0/16", "SubnetLen": 24, "Backend": {"Type": "vxlan"}}

    List the Pod subnets that have been allocated

    export FLANNEL_ETCD_PREFIX="/kubernetes/network"
    export ETCD_ENDPOINTS="https://192.168.0.107:2379"
    etcdctl \
    --endpoints=${ETCD_ENDPOINTS} \
    --ca-file=/etc/kubernetes/cert/ca.pem \
    --cert-file=/etc/flanneld/cert/flanneld.pem \
    --key-file=/etc/flanneld/cert/flanneld-key.pem \
    ls ${FLANNEL_ETCD_PREFIX}/subnets

    Output

    /kubernetes/network/subnets/172.30.22.0-24
    /kubernetes/network/subnets/172.30.78.0-24

    Check the flannel network information on the node

    root@master:/opt/k8s/work# ip addr show
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
    valid_lft forever preferred_lft forever
    2: enp2s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
    link/ether 04:92:26:13:92:2b brd ff:ff:ff:ff:ff:ff
    3: wlp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether d0:c5:d3:57:73:01 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.107/24 brd 192.168.0.255 scope global dynamic noprefixroute wlp3s0
    valid_lft 6385sec preferred_lft 6385sec
    inet6 fe80::1fda:e90a:207a:67e4/64 scope link noprefixroute
    valid_lft forever preferred_lft forever
    4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
    link/ether 12:cb:66:43:de:36 brd ff:ff:ff:ff:ff:ff
    inet 172.30.22.0/32 scope global flannel.1
    valid_lft forever preferred_lft forever
    inet6 fe80::10cb:66ff:fe43:de36/64 scope link
    valid_lft forever preferred_lft forever
    root@master:/opt/k8s/work# ip route show |grep flannel.1
    172.30.78.0/24 via 172.30.78.0 dev flannel.1 onlink

    Verify that the nodes can reach each other over the Pod network

    root@master:/opt/k8s/work# ip addr show flannel.1 |grep -w inet
    inet 172.30.22.0/32 scope global flannel.1
    root@master:/opt/k8s/work# ssh 192.168.0.114 "/sbin/ip addr show flannel.1|grep -w inet"
    inet 172.30.78.0/32 scope global flannel.1
    root@master:/opt/k8s/work# ping -c 1 172.30.78.0
    PING 172.30.78.0 (172.30.78.0) 56(84) bytes of data.
    64 bytes from 172.30.78.0: icmp_seq=1 ttl=64 time=80.7 ms

    --- 172.30.78.0 ping statistics ---
    1 packets transmitted, 1 received, 0% packet loss, time 0ms
    rtt min/avg/max/mdev = 80.707/80.707/80.707/0.000 ms
    root@master:/opt/k8s/work# ssh 192.168.0.114 "ping -c 1 172.30.22.0"
    PING 172.30.22.0 (172.30.22.0) 56(84) bytes of data.
    64 bytes from 172.30.22.0: icmp_seq=1 ttl=64 time=4.09 ms

    --- 172.30.22.0 ping statistics ---
    1 packets transmitted, 1 received, 0% packet loss, time 0ms
    rtt min/avg/max/mdev = 4.094/4.094/4.094/0.000 ms

    Generated files

    root@master:/opt/k8s/work# cat /run/flannel/subnet.env
    FLANNEL_NETWORK=172.30.0.0/16
    FLANNEL_SUBNET=172.30.22.1/24
    FLANNEL_MTU=1450
    FLANNEL_IPMASQ=true
    root@master:/opt/k8s/work# cat /run/flannel/docker
    DOCKER_OPT_BIP="--bip=172.30.22.1/24"
    DOCKER_OPT_IPMASQ="--ip-masq=false"
    DOCKER_OPT_MTU="--mtu=1450"
    DOCKER_NETWORK_OPTIONS=" --bip=172.30.22.1/24 --ip-masq=false --mtu=1450"
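
The /run/flannel/docker file shown above is a plain list of KEY="value" assignments, which is why it can serve as an environment file for the Docker unit. As a rough sketch, the snippet below copies that content into a temporary file and sources it in bash to show which arguments dockerd would receive. systemd parses the real file itself, so sourcing it in a shell is only an approximation, and the temporary file is an assumption made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: reproduce the file content shown above in a temporary file
# instead of touching the real /run/flannel/docker.
sample_file="$(mktemp)"
cat > "$sample_file" <<'EOF'
DOCKER_OPT_BIP="--bip=172.30.22.1/24"
DOCKER_OPT_IPMASQ="--ip-masq=false"
DOCKER_OPT_MTU="--mtu=1450"
DOCKER_NETWORK_OPTIONS=" --bip=172.30.22.1/24 --ip-masq=false --mtu=1450"
EOF

# The file consists of shell-compatible assignments, so it can be sourced.
. "$sample_file"

# Split the combined variable into separate arguments, one per line.
for opt in $DOCKER_NETWORK_OPTIONS; do
  echo "dockerd argument: $opt"
done

rm -f "$sample_file"
```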

Deploy the Docker service (run on the master node)

    Download and distribute the Docker binaries

    cd /opt/k8s/work
    wget https://download.docker.com/linux/static/stable/x86_64/docker-18.09.6.tgz
    tar -xvf docker-18.09.6.tgz

    Distribute the binaries to all worker nodes

    cd /opt/k8s/work
    export node_ip=192.168.0.114
    scp docker/* root@${node_ip}:/opt/k8s/bin/
    ssh root@${node_ip} "chmod +x /opt/k8s/bin/*"

    Create the Docker systemd unit file

    cd /opt/k8s/work
    cat > docker.service <<"EOF"
    [Unit]
    Description=Docker Application Container Engine
    Documentation=http://docs.docker.io

    [Service]
    WorkingDirectory=/data/k8s/docker
    Environment="PATH=/opt/k8s/bin:/bin:/sbin:/usr/bin:/usr/sbin"
    EnvironmentFile=-/run/flannel/docker
    ExecStart=/opt/k8s/bin/dockerd $DOCKER_NETWORK_OPTIONS
    ExecReload=/bin/kill -s HUP $MAINPID
    Restart=on-failure
    RestartSec=5
    LimitNOFILE=infinity
    LimitNPROC=infinity
    LimitCORE=infinity
    Delegate=yes
    KillMode=process

    [Install]
    WantedBy=multi-user.target
    EOF

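    The EOF delimiter is wrapped in double quotes so that bash does not expand variables in the document, such as $DOCKER_NETWORK_OPTIONS (these environment variables are substituted by systemd);

    dockerd calls other docker binaries at run time, such as docker-proxy, so the directory that contains the docker binaries has to be added to the PATH environment variable;

    flanneld writes the network configuration to /run/flannel/docker when it starts; before dockerd starts, the DOCKER_NETWORK_OPTIONS variable is read from this file and used to set the subnet of the docker0 bridge;

    Since version 1.13, Docker may set the default policy of the iptables FORWARD chain to DROP, which makes pinging Pod IPs on other nodes fail. If that happens, set the policy to ACCEPT manually: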

    export node_ip=192.168.0.114
    ssh root@${node_ip} "/sbin/iptables -P FORWARD ACCEPT"
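
The note above about the quoted EOF delimiter can be tried out in isolation. The sketch below writes the same ExecStart line twice into temporary files, once with an unquoted and once with a quoted delimiter; the file names and the variable value are assumptions made for the demonstration.

```shell
#!/usr/bin/env bash
# A variable that bash would expand inside an unquoted here-document.
DOCKER_NETWORK_OPTIONS="--bip=172.30.22.1/24"
tmp_dir="$(mktemp -d)"

# Unquoted delimiter: bash substitutes the variable while writing the file.
cat > "$tmp_dir/unquoted.service" <<EOF
ExecStart=/opt/k8s/bin/dockerd $DOCKER_NETWORK_OPTIONS
EOF

# Quoted delimiter: the text is kept literally, so the $... reference
# survives and is left for systemd to resolve.
cat > "$tmp_dir/quoted.service" <<"EOF"
ExecStart=/opt/k8s/bin/dockerd $DOCKER_NETWORK_OPTIONS
EOF

unquoted="$(cat "$tmp_dir/unquoted.service")"
quoted="$(cat "$tmp_dir/quoted.service")"
echo "unquoted: $unquoted"
echo "quoted:   $quoted"

rm -rf "$tmp_dir"
```

This is also why the unit files written with an unquoted EOF elsewhere in this guide use a doubled backslash for line continuations: bash turns the doubled backslash into a single one while writing the file.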

    Distribute the docker.service file to all worker machines:

    cd /opt/k8s/work
    export node_ip=192.168.0.114
    scp docker.service root@${node_ip}:/etc/systemd/system/

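    Configure and distribute the Docker configuration file

    Use registry mirrors located in China to speed up image pulls and raise the number of concurrent downloads (dockerd must be restarted for the change to take effect):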

    cd /opt/k8s/work
    cat > docker-daemon.json <<EOF
    {
    "registry-mirrors": ["https://docker.mirrors.ustc.edu.cn","https://hub-mirror.c.163.com"],
    "max-concurrent-downloads": 20,
    "live-restore": true,
    "max-concurrent-uploads": 10,
    "data-root": "/data/k8s/docker/data",
    "log-opts": {
    "max-size": "100m",
    "max-file": "5"
    }
    }
    EOF

    Distribute the Docker configuration file to all worker nodes:

    cd /opt/k8s/work
    
    export node_ip=192.168.0.114
    ssh root@${node_ip} "mkdir -p /etc/docker/ /data/k8s/docker/data"
    scp docker-daemon.json root@${node_ip}:/etc/docker/daemon.json

    Start the Docker service

    export node_ip=192.168.0.114
    ssh root@${node_ip} "systemctl daemon-reload && systemctl enable docker && systemctl restart docker"

    Check the service status

    export node_ip=192.168.0.114
    ssh root@${node_ip} "systemctl status docker|grep Active"

    Make sure the status is active (running); otherwise check the logs to find the cause.

    If something goes wrong, inspect the logs with:

    journalctl -u docker

    Check the docker0 bridge

    export node_ip=192.168.0.114
    ssh root@${node_ip} "/sbin/ip addr show flannel.1 && /sbin/ip addr show docker0"

    Confirm that on each worker node the docker0 bridge and the flannel.1 interface have IPs in the same subnet.

    Output

    export node_ip=192.168.0.114
    root@master:/opt/k8s/work# ssh root@${node_ip} "/sbin/ip addr show flannel.1 && /sbin/ip addr show docker0"
    4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
    link/ether f2:fc:0f:7e:98:e4 brd ff:ff:ff:ff:ff:ff
    inet 172.30.78.0/32 scope global flannel.1
    valid_lft forever preferred_lft forever
    inet6 fe80::f0fc:fff:fe7e:98e4/64 scope link
    valid_lft forever preferred_lft forever
    5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:fd:1f:8f:d8 brd ff:ff:ff:ff:ff:ff
    inet 172.30.78.1/24 brd 172.30.78.255 scope global docker0
    valid_lft forever preferred_lft forever
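
Whether the two interfaces share a subnet can also be checked numerically rather than by eye. The following is a minimal sketch using only bash arithmetic; `ip_to_int` and `ip_in_cidr` are hypothetical helpers, and the addresses are the ones from the output above.

```shell
#!/usr/bin/env bash
# Convert a dotted IPv4 address into a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Hypothetical helper: succeed when the address in $1 lies inside the network
# given as address/prefix in $2, for example 172.30.78.1/24.
ip_in_cidr() {
  local ip="$1" cidr="$2"
  local prefix="${cidr##*/}"
  local mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  local ip_int net_int
  ip_int="$(ip_to_int "$ip")"
  net_int="$(ip_to_int "${cidr%%/*}")"
  [ $(( ip_int & mask )) -eq $(( net_int & mask )) ]
}

# flannel.1 has 172.30.78.0 and docker0 has 172.30.78.1/24 on the slave node.
if ip_in_cidr "172.30.78.0" "172.30.78.1/24"; then
  echo "flannel.1 and docker0 share a subnet"
else
  echo "flannel.1 and docker0 are in different subnets"
fi
```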

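    Note: if the services were installed in the wrong order or the machine environment is complicated, and the docker service was installed before flanneld, the docker0 bridge and the flannel.1 interface on the worker node may end up in different subnets. In that case stop the docker service, delete the docker0 interface by hand, and start the docker service again: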

    systemctl stop docker
    ip link delete docker0
    systemctl start docker

    Show the Docker status information

    root@slave:/opt/k8s/work# docker info
    Containers: 0
    Running: 0
    Paused: 0
    Stopped: 0
    Images: 0
    Server Version: 18.09.6
    Storage Driver: overlay2
    Backing Filesystem: extfs
    Supports d_type: true
    Native Overlay Diff: true
    Logging Driver: json-file
    Cgroup Driver: cgroupfs
    Plugins:
    Volume: local
    Network: bridge host macvlan null overlay
    Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
    Swarm: inactive
    Runtimes: runc
    Default Runtime: runc
    Init Binary: docker-init
    containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
    runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
    init version: fec3683
    Security Options:
    apparmor
    seccomp
    Profile: default
    Kernel Version: 5.0.0-23-generic
    Operating System: Ubuntu 18.04.3 LTS
    OSType: linux
    Architecture: x86_64
    CPUs: 4
    Total Memory: 3.741GiB
    Name: slave
    ID: IDMG:7A6F:UNTP:IWVM:ZBK5:VHJ4:STC5:UXZX:HQT6:UUNE:YDOC:I27L
    Docker Root Dir: /data/k8s/docker/data
    Debug Mode (client): false
    Debug Mode (server): false
    Registry: https://index.docker.io/v1/
    Labels:
    Experimental: false
    Insecure Registries:
    127.0.0.0/8
    Registry Mirrors:
    https://docker.mirrors.ustc.edu.cn/
    https://hub-mirror.c.163.com/
    Live Restore Enabled: true
    Product License: Community Engine

    WARNING: No swap limit support

Deploy the master node (run on the master node)

    Download the release binaries

    cd /opt/k8s/work
    
    wget https://dl.k8s.io/v1.17.2/kubernetes-server-linux-amd64.tar.gz # currently cannot be downloaded directly from mainland China; a proxy is required
    tar -xzvf kubernetes-server-linux-amd64.tar.gz

    Install the Kubernetes binaries

    cd /opt/k8s/work
    cp kubernetes/server/bin/{apiextensions-apiserver,kubeadm,kube-apiserver,kube-controller-manager,kubectl,kubelet,kube-proxy,kube-scheduler,mounter} /opt/k8s/bin/
    # Distribute kubelet and kube-proxy to the worker node
    export node_ip=192.168.0.114
    scp kubernetes/server/bin/{kubelet,kube-proxy} root@${node_ip}:/opt/k8s/bin/

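Configure kubectl

kubectl communicates with kube-apiserver securely over HTTPS, and kube-apiserver authenticates and authorizes the certificate contained in each kubectl request.

kubectl is used to manage the cluster later on, so an admin certificate with the highest privileges is created here.

    Create the admin certificate and private key

      Create the certificate signing request (CSR) file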


      cd /opt/k8s/work
      cat > admin-csr.json <<EOF
      {
      "CN": "admin",
      "hosts": [],
      "key": {
      "algo": "rsa",
      "size": 2048
      },
      "names": [
      {
      "C": "CN",
      "ST": "NanJing",
      "L": "NanJing",
      "O": "system:masters",
      "OU": "system"
      }
      ]
      }
      EOF

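      O: system:masters: when kube-apiserver receives a client request that uses this certificate, it adds the group identity system:masters to the request;
      The predefined ClusterRoleBinding cluster-admin binds the Group system:masters to the ClusterRole cluster-admin, which grants the highest privileges needed to operate the cluster;
      The certificate is only used by kubectl as a client certificate, so the hosts field is empty.

      Generate the certificate and private key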

      cd /opt/k8s/work
      cfssl gencert -ca=/opt/k8s/work/ca.pem \
      -ca-key=/opt/k8s/work/ca-key.pem \
      -config=/opt/k8s/work/ca-config.json \
      -profile=kubernetes admin-csr.json | cfssljson -bare admin
      ls admin*

      Install the certificates

      cd /opt/k8s/work
      cp admin*.pem /etc/kubernetes/cert

    Create the kubeconfig file

    cd /opt/k8s/work
    
    export KUBE_APISERVER=https://192.168.0.107:6443
    
    # Set the cluster parameters
    kubectl config set-cluster kubernetes \
    --certificate-authority=/etc/kubernetes/cert/ca.pem \
    --embed-certs=true \
    --server=${KUBE_APISERVER} \
    --kubeconfig=kubectl.kubeconfig
    # Set the client credentials
    kubectl config set-credentials admin \
    --client-certificate=/etc/kubernetes/cert/admin.pem \
    --client-key=/etc/kubernetes/cert/admin-key.pem \
    --embed-certs=true \
    --kubeconfig=kubectl.kubeconfig
    # Set the context parameters
    kubectl config set-context kubernetes \
    --cluster=kubernetes \
    --user=admin \
    --kubeconfig=kubectl.kubeconfig
    # Set the default context
    kubectl config use-context kubernetes --kubeconfig=kubectl.kubeconfig

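    --certificate-authority: the root certificate used to verify the kube-apiserver certificate;
    --client-certificate, --client-key: the admin certificate and private key generated above, used for HTTPS communication with kube-apiserver;
    --embed-certs=true: embeds the contents of ca.pem and admin.pem into the generated kubectl.kubeconfig file;
    --server: the address of kube-apiserver.

    Distribute the kubeconfig file (any other user who wants to access Kubernetes also needs a copy of this file in that user's home directory)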

    cd /opt/k8s/work
    mkdir -p ~/.kube
    cp kubectl.kubeconfig ~/.kube/config

    Enable kubectl auto-completion

    root@master:/opt/k8s/work# apt install -y bash-completion
    root@master:/opt/k8s/work# locate bash_completion
    /usr/share/bash-completion/bash_completion
    root@master:/opt/k8s/work# source /usr/share/bash-completion/bash_completion
    root@master:/opt/k8s/work# source <(kubectl completion bash)
    root@master:/opt/k8s/work# echo 'source <(kubectl completion bash)' >>~/.bashrc

Configure kube-apiserver

    Create the kubernetes-api certificate and private key

      Create the certificate signing request (CSR) file


      cd /opt/k8s/work
      cat > kubernetes-csr.json <<EOF
      {
      "CN": "kubernetes-api",
      "hosts": [
      "127.0.0.1",
      "192.168.0.107",
      "10.254.0.1",
      "kubernetes",
      "kubernetes.default",
      "kubernetes.default.svc",
      "kubernetes.default.svc.cluster",
      "kubernetes.default.svc.cluster.local."
      ],
      "key": {
      "algo": "rsa",
      "size": 2048
      },
      "names": [
      {
      "C": "CN",
      "ST": "NanJing",
      "L": "NanJing",
      "O": "k8s",
      "OU": "system"
      }
      ]
      }
      EOF

      Generate the certificate and private key

      cd /opt/k8s/work
      cfssl gencert -ca=/opt/k8s/work/ca.pem \
      -ca-key=/opt/k8s/work/ca-key.pem \
      -config=/opt/k8s/work/ca-config.json \
      -profile=kubernetes kubernetes-csr.json | cfssljson -bare kubernetes
      ls kubernetes*

      Install the certificates

      cd /opt/k8s/work
      cp kubernetes*.pem /etc/kubernetes/cert/

    Create the kube-apiserver systemd unit file

    export ETCD_ENDPOINTS="https://192.168.0.107:2379"
    export SERVICE_CIDR="10.254.0.0/16"
    export NODE_PORT_RANGE=80-60000
    cat > /etc/systemd/system/kube-apiserver.service <<EOF
    [Unit]
    Description=Kubernetes API Server
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes
    After=network.target

    [Service]
    WorkingDirectory=/data/k8s/k8s/kube-apiserver
    ExecStart=/opt/k8s/bin/kube-apiserver \\
    --advertise-address=192.168.0.107 \\
    --etcd-cafile=/etc/kubernetes/cert/ca.pem \\
    --etcd-certfile=/etc/kubernetes/cert/kubernetes.pem \\
    --etcd-keyfile=/etc/kubernetes/cert/kubernetes-key.pem \\
    --etcd-servers=${ETCD_ENDPOINTS} \\
    --bind-address=192.168.0.107 \\
    --secure-port=6443 \\
    --tls-cert-file=/etc/kubernetes/cert/kubernetes.pem \\
    --tls-private-key-file=/etc/kubernetes/cert/kubernetes-key.pem \\
    --audit-log-maxage=15 \\
    --audit-log-maxbackup=3 \\
    --audit-log-maxsize=100 \\
    --audit-log-truncate-enabled \\
    --audit-log-path=/data/k8s/k8s/kube-apiserver/audit.log \\
    --profiling \\
    --anonymous-auth=false \\
    --client-ca-file=/etc/kubernetes/cert/ca.pem \\
    --enable-bootstrap-token-auth \\
    --service-account-key-file=/etc/kubernetes/cert/ca-key.pem \\
    --authorization-mode=Node,RBAC \\
    --runtime-config=api/all=true \\
    --allow-privileged=true \\
    --event-ttl=168h \\
    --kubelet-certificate-authority=/etc/kubernetes/cert/ca.pem \\
    --kubelet-client-certificate=/etc/kubernetes/cert/kubernetes.pem \\
    --kubelet-client-key=/etc/kubernetes/cert/kubernetes-key.pem \\
    --kubelet-https=true \\
    --kubelet-timeout=10s \\
    --service-cluster-ip-range=${SERVICE_CIDR} \\
    --service-node-port-range=${NODE_PORT_RANGE} \\
    --logtostderr=true \\
    --v=2
    Restart=on-failure
    RestartSec=10
    Type=notify
    LimitNOFILE=65536

    [Install]
    WantedBy=multi-user.target
    EOF

    Create the kube-apiserver working directory

    mkdir -p /data/k8s/k8s/kube-apiserver

    Start the kube-apiserver service

    systemctl daemon-reload && systemctl enable kube-apiserver && systemctl restart kube-apiserver

    Check the startup result

    systemctl status kube-apiserver |grep Active

    Make sure the status is active (running); otherwise check the logs to find the cause.

    If something goes wrong, inspect the logs with:

    journalctl -u kube-apiserver

    Check the kube-apiserver status

    root@master:/opt/k8s/work# kubectl cluster-info
    Kubernetes master is running at https://192.168.0.107:6443

    To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
    root@master:/opt/k8s/work# kubectl get all --all-namespaces
    NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
    default service/kubernetes ClusterIP 10.254.0.1 <none> 443/TCP 2m30s
    root@master:/opt/k8s/work# kubectl get componentstatuses
    NAME STATUS MESSAGE ERROR
    scheduler Unhealthy Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
    controller-manager Unhealthy Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: connect: connection refused
    etcd-0 Healthy {"health":"true"}

Configure kube-controller-manager

    Create the kube-controller-manager certificate and private key

      Create the certificate signing request (CSR) file

      cd /opt/k8s/work
      cat > kube-controller-manager-csr.json <<EOF
      {
      "CN": "system:kube-controller-manager",
      "key": {
      "algo": "rsa",
      "size": 2048
      },
      "hosts": [
      "127.0.0.1",
      "192.168.0.107"
      ],
      "names": [
      {
      "C": "CN",
      "ST": "NanJing",
      "L": "NanJing",
      "O": "system:kube-controller-manager",
      "OU": "system"
      }
      ]
      }
      EOF

      Both CN and O are system:kube-controller-manager; the built-in ClusterRoleBinding system:kube-controller-manager grants kube-controller-manager the permissions it needs to work.

      Generate the certificate and private key

      cd /opt/k8s/work
      cfssl gencert -ca=/opt/k8s/work/ca.pem \
      -ca-key=/opt/k8s/work/ca-key.pem \
      -config=/opt/k8s/work/ca-config.json \
      -profile=kubernetes kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager
      ls kube-controller-manager*pem

      Install the certificates

      cd /opt/k8s/work
      cp kube-controller-manager*.pem /etc/kubernetes/cert/

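    Create the kubeconfig file

    kube-controller-manager uses this file to access the apiserver. It provides the apiserver address, the embedded CA certificate, the kube-controller-manager certificate and related information.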

    cd /opt/k8s/work
    export KUBE_APISERVER=https://192.168.0.107:6443
    kubectl config set-cluster kubernetes \
    --certificate-authority=/opt/k8s/work/ca.pem \
    --embed-certs=true \
    --server="${KUBE_APISERVER}" \
    --kubeconfig=kube-controller-manager.kubeconfig
    kubectl config set-credentials system:kube-controller-manager \
    --client-certificate=kube-controller-manager.pem \
    --client-key=kube-controller-manager-key.pem \
    --embed-certs=true \
    --kubeconfig=kube-controller-manager.kubeconfig
    kubectl config set-context system:kube-controller-manager \
    --cluster=kubernetes \
    --user=system:kube-controller-manager \
    --kubeconfig=kube-controller-manager.kubeconfig
    kubectl config use-context system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig

    Distribute the kubeconfig file

    cd /opt/k8s/work
    cp kube-controller-manager.kubeconfig /etc/kubernetes/kube-controller-manager.kubeconfig

    Create the kube-controller-manager systemd unit file

    export SERVICE_CIDR="10.254.0.0/16"
    
    cat > /etc/systemd/system/kube-controller-manager.service <<EOF
    [Unit]
    Description=Kubernetes Controller Manager
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes

    [Service]
    WorkingDirectory=/data/k8s/k8s/kube-controller-manager
    ExecStart=/opt/k8s/bin/kube-controller-manager \\
    --profiling \\
    --cluster-name=kubernetes \\
    --kube-api-qps=1000 \\
    --kube-api-burst=2000 \\
    --leader-elect \\
    --use-service-account-credentials \\
    --concurrent-service-syncs=2 \\
    --bind-address=192.168.0.107 \\
    --secure-port=10252 \\
    --tls-cert-file=/etc/kubernetes/cert/kube-controller-manager.pem \\
    --tls-private-key-file=/etc/kubernetes/cert/kube-controller-manager-key.pem \\
    --port=0 \\
    --authentication-kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \\
    --client-ca-file=/etc/kubernetes/cert/ca.pem \\
    --authorization-kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \\
    --cluster-signing-cert-file=/etc/kubernetes/cert/ca.pem \\
    --cluster-signing-key-file=/etc/kubernetes/cert/ca-key.pem \\
    --experimental-cluster-signing-duration=87600h \\
    --horizontal-pod-autoscaler-sync-period=10s \\
    --concurrent-deployment-syncs=10 \\
    --concurrent-gc-syncs=30 \\
    --node-cidr-mask-size=24 \\
    --service-cluster-ip-range=${SERVICE_CIDR} \\
    --pod-eviction-timeout=6m \\
    --terminated-pod-gc-threshold=10000 \\
    --root-ca-file=/etc/kubernetes/cert/ca.pem \\
    --service-account-private-key-file=/etc/kubernetes/cert/ca-key.pem \\
    --kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \\
    --logtostderr=true \\
    --v=2
    Restart=on-failure
    RestartSec=5

    [Install]
    WantedBy=multi-user.target
    EOF

    Create the kube-controller-manager working directory

    mkdir -p /data/k8s/k8s/kube-controller-manager

    Start the kube-controller-manager service

    systemctl daemon-reload && systemctl enable kube-controller-manager && systemctl restart kube-controller-manager

    Check the startup result

    systemctl status kube-controller-manager  |grep Active

    The status should be active (running). If it is not, inspect the logs to find the cause:

    journalctl -u kube-controller-manager
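
    A service can take a few seconds to settle after `systemctl restart`, so a single status check may catch it mid-start. The sketch below polls until the unit reports active. The function name `wait_active` and the `IS_ACTIVE_CMD` override are illustrative and not part of the original procedure; the override exists only so the loop can be tried without systemd.

```shell
# Poll a systemd unit until it reports "active" or the attempts run out.
# IS_ACTIVE_CMD is a hypothetical override (default: systemctl is-active).
wait_active() {
  local unit="$1"
  local attempts="${2:-10}"
  local delay="${3:-1}"
  local i=1
  local state
  while [ "$i" -le "$attempts" ]; do
    state="$(${IS_ACTIVE_CMD:-systemctl is-active} "$unit" 2>/dev/null)"
    if [ "$state" = "active" ]; then
      echo "$unit is active"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "$unit did not become active; check: journalctl -u $unit" >&2
  return 1
}
```

    For example, `wait_active kube-controller-manager 10 1` checks once per second for up to ten seconds.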
    
    

    Check that kube-controller-manager is running

    root@master:/opt/k8s/work# kubectl get endpoints kube-controller-manager --namespace=kube-system  -o yaml
    apiVersion: v1
    kind: Endpoints
    metadata:
      annotations:
        control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"master_6e2dfb91-8eaa-42d0-ba83-be669b99801f","leaseDurationSeconds":15,"acquireTime":"2020-02-09T13:37:08Z","renewTime":"2020-02-09T13:38:02Z","leaderTransitions":0}'
      creationTimestamp: "2020-02-09T13:37:08Z"
      name: kube-controller-manager
      namespace: kube-system
      resourceVersion: "888"
      selfLink: /api/v1/namespaces/kube-system/endpoints/kube-controller-manager
      uid: 5aa2c4a1-5ded-4870-900e-63dfd212c912

    root@master:/opt/k8s/work# curl -s --cacert /opt/k8s/work/ca.pem --cert /opt/k8s/work/admin.pem --key /opt/k8s/work/admin-key.pem https://192.168.0.107:10252/healthz
    ok
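
    The leader annotation in the output above is a JSON string, and its `holderIdentity` field names the instance that currently holds the lock. A small sketch that pulls that field out with sed, so it also works where jq is not installed. The helper name `leader_of` is made up for illustration, and the pattern assumes the flat JSON layout shown above.

```shell
# Print the holderIdentity value from a leader-election annotation string.
# Assumes the flat JSON layout shown in the kubectl output above.
leader_of() {
  printf '%s\n' "$1" | sed -n 's/.*"holderIdentity":"\([^"]*\)".*/\1/p'
}

annotation='{"holderIdentity":"master_6e2dfb91-8eaa-42d0-ba83-be669b99801f","leaseDurationSeconds":15,"acquireTime":"2020-02-09T13:37:08Z","renewTime":"2020-02-09T13:38:02Z","leaderTransitions":0}'
leader_of "$annotation"
# prints: master_6e2dfb91-8eaa-42d0-ba83-be669b99801f
```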

Configure kube-scheduler

    Create the kube-scheduler certificate and private key

      Create the certificate signing request file

      cd /opt/k8s/work
      cat > kube-scheduler-csr.json <<EOF
      {
      "CN": "system:kube-scheduler",
      "key": {
      "algo": "rsa",
      "size": 2048
      },
      "hosts": [
      "127.0.0.1",
      "192.168.0.107"
      ],
      "names": [
      {
      "C": "CN",
      "ST": "NanJing",
      "L": "NanJing",
      "O": "system:kube-scheduler",
      "OU": "system"
      }
      ]
      }
      EOF

      Both CN and O are system:kube-scheduler. The built-in ClusterRoleBinding system:kube-scheduler grants kube-scheduler the permissions it needs to work.

      Generate the certificate and private key

      cd /opt/k8s/work
      cfssl gencert -ca=/opt/k8s/work/ca.pem \
      -ca-key=/opt/k8s/work/ca-key.pem \
      -config=/opt/k8s/work/ca-config.json \
      -profile=kubernetes kube-scheduler-csr.json | cfssljson -bare kube-scheduler
      ls kube-scheduler*pem

      Install the certificate

      cd /opt/k8s/work
      cp kube-scheduler*.pem /etc/kubernetes/cert/

    Create the kubeconfig file

    kube-scheduler uses this file to access apiserver. It provides the apiserver address, the embedded CA certificate, and the kube-scheduler certificate.

    cd /opt/k8s/work
    export KUBE_APISERVER=https://192.168.0.107:6443

    kubectl config set-cluster kubernetes \
      --certificate-authority=/opt/k8s/work/ca.pem \
      --embed-certs=true \
      --server="${KUBE_APISERVER}" \
      --kubeconfig=kube-scheduler.kubeconfig

    kubectl config set-credentials system:kube-scheduler \
      --client-certificate=kube-scheduler.pem \
      --client-key=kube-scheduler-key.pem \
      --embed-certs=true \
      --kubeconfig=kube-scheduler.kubeconfig

    kubectl config set-context system:kube-scheduler \
      --cluster=kubernetes \
      --user=system:kube-scheduler \
      --kubeconfig=kube-scheduler.kubeconfig

    kubectl config use-context system:kube-scheduler --kubeconfig=kube-scheduler.kubeconfig
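
    The four `kubectl config` calls are the same for kube-controller-manager and kube-scheduler apart from the component name, so they can be wrapped in one function. This is a sketch rather than part of the original procedure: `make_kubeconfig` and the `KUBECTL` override are hypothetical names, and it assumes the `<component>.pem` / `<component>-key.pem` naming used in /opt/k8s/work.

```shell
# Build <component>.kubeconfig for a control-plane component.
# KUBECTL is a hypothetical override; set KUBECTL=echo to preview the calls.
make_kubeconfig() {
  local name="$1"
  local server="${2:-https://192.168.0.107:6443}"
  local ca="${3:-/opt/k8s/work/ca.pem}"
  local kc="${name}.kubeconfig"
  local user="system:${name}"

  ${KUBECTL:-kubectl} config set-cluster kubernetes \
    --certificate-authority="$ca" --embed-certs=true \
    --server="$server" --kubeconfig="$kc" &&
  ${KUBECTL:-kubectl} config set-credentials "$user" \
    --client-certificate="${name}.pem" --client-key="${name}-key.pem" \
    --embed-certs=true --kubeconfig="$kc" &&
  ${KUBECTL:-kubectl} config set-context "$user" \
    --cluster=kubernetes --user="$user" --kubeconfig="$kc" &&
  ${KUBECTL:-kubectl} config use-context "$user" --kubeconfig="$kc"
}
```

    Run from /opt/k8s/work, `make_kubeconfig kube-scheduler` would produce kube-scheduler.kubeconfig; with `KUBECTL=echo` set first, it only prints the commands it would run.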

    Distribute the kubeconfig

    cd /opt/k8s/work
    cp kube-scheduler.kubeconfig /etc/kubernetes/kube-scheduler.kubeconfig

    Create the kube-scheduler configuration file

    cd /opt/k8s/work
    cat >kube-scheduler.yaml <<EOF
    apiVersion: kubescheduler.config.k8s.io/v1alpha1
    kind: KubeSchedulerConfiguration
    bindTimeoutSeconds: 600
    clientConnection:
      burst: 200
      kubeconfig: "/etc/kubernetes/kube-scheduler.kubeconfig"
      qps: 100
    enableContentionProfiling: false
    enableProfiling: true
    hardPodAffinitySymmetricWeight: 1
    healthzBindAddress: 192.168.0.107:10251
    leaderElection:
      leaderElect: true
    metricsBindAddress: 192.168.0.107:10251
    EOF

    cp kube-scheduler.yaml /etc/kubernetes/kube-scheduler.yaml

    Create the kube-scheduler systemd unit file

    cat > /etc/systemd/system/kube-scheduler.service <<EOF
    [Unit]
    Description=Kubernetes Scheduler
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes

    [Service]
    WorkingDirectory=/data/k8s/k8s/kube-scheduler
    ExecStart=/opt/k8s/bin/kube-scheduler \\
    --config=/etc/kubernetes/kube-scheduler.yaml \\
    --bind-address=192.168.0.107 \\
    --secure-port=10259 \\
    --port=0 \\
    --tls-cert-file=/etc/kubernetes/cert/kube-scheduler.pem \\
    --tls-private-key-file=/etc/kubernetes/cert/kube-scheduler-key.pem \\
    --authentication-kubeconfig=/etc/kubernetes/kube-scheduler.kubeconfig \\
    --client-ca-file=/etc/kubernetes/cert/ca.pem \\
    --authorization-kubeconfig=/etc/kubernetes/kube-scheduler.kubeconfig \\
    --logtostderr=true \\
    --v=2
    Restart=always
    RestartSec=5
    StartLimitInterval=0

    [Install]
    WantedBy=multi-user.target
    EOF

    Create the kube-scheduler working directory

    mkdir -p /data/k8s/k8s/kube-scheduler

    Start the kube-scheduler service

    systemctl daemon-reload && systemctl enable kube-scheduler && systemctl restart kube-scheduler

    Check the startup result

    systemctl status kube-scheduler  |grep Active

    The status should be active (running). If it is not, inspect the logs to find the cause:

    journalctl -u kube-scheduler
    
    

    Check that kube-scheduler is running

    root@master:/opt/k8s/work# kubectl get endpoints kube-scheduler --namespace=kube-system  -o yaml
    apiVersion: v1
    kind: Endpoints
    metadata:
      annotations:
        control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"master_383054c4-58d8-4c24-a766-551a92492219","leaseDurationSeconds":15,"acquireTime":"2020-02-10T02:17:40Z","renewTime":"2020-02-10T02:18:09Z","leaderTransitions":0}'
      creationTimestamp: "2020-02-10T02:17:41Z"
      name: kube-scheduler
      namespace: kube-system
      resourceVersion: "50203"
      selfLink: /api/v1/namespaces/kube-system/endpoints/kube-scheduler
      uid: 39821272-40a1-4b3a-95bd-a4f09af09231

    root@master:/opt/k8s/work# curl -s --cacert /opt/k8s/work/ca.pem --cert /opt/k8s/work/admin.pem --key /opt/k8s/work/admin-key.pem https://192.168.0.107:10259/healthz
    ok

    root@master:/opt/k8s/work# curl http://192.168.0.107:10251/healthz
    ok

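Deploy the worker node (run on the master node)

Configure kubelet

kubelet runs on every worker node. It receives requests from kube-apiserver, manages Pod containers, and executes interactive commands such as exec, run, and logs.

On startup kubelet automatically registers the node with kube-apiserver, and its built-in cAdvisor collects and monitors the node's resource usage.

For security, this deployment disables kubelet's insecure HTTP port and authenticates and authorizes every request, rejecting unauthorized access (for example, requests from apiserver or heapster).

    Create the kubelet bootstrap kubeconfig file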


    cd /opt/k8s/work

    export KUBE_APISERVER=https://192.168.0.107:6443
    export node_name=slave

    export BOOTSTRAP_TOKEN=$(kubeadm token create \
      --description kubelet-bootstrap-token \
      --groups system:bootstrappers:${node_name} \
      --kubeconfig ~/.kube/config)

    # Set cluster parameters
    kubectl config set-cluster kubernetes \
      --certificate-authority=/etc/kubernetes/cert/ca.pem \
      --embed-certs=true \
      --server=${KUBE_APISERVER} \
      --kubeconfig=kubelet-bootstrap.kubeconfig

    # Set client authentication parameters
    kubectl config set-credentials kubelet-bootstrap \
      --token=${BOOTSTRAP_TOKEN} \
      --kubeconfig=kubelet-bootstrap.kubeconfig

    # Set context parameters
    kubectl config set-context default \
      --cluster=kubernetes \
      --user=kubelet-bootstrap \
      --kubeconfig=kubelet-bootstrap.kubeconfig

    # Set the default context
    kubectl config use-context default --kubeconfig=kubelet-bootstrap.kubeconfig

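    What is written to the kubeconfig is a token. After bootstrapping finishes, kube-controller-manager creates the client and server certificates for kubelet.
    After kube-apiserver receives kubelet's bootstrap token, it sets the request's user to system:bootstrap:<Token ID> and its group to system:bootstrappers. A ClusterRoleBinding is set up for this group later.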
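
    A bootstrap token has the form `<token-id>.<token-secret>`, and only the ID part appears in the user name that kube-apiserver assigns. The sketch below splits a token and prints the resulting user name. The function name `bootstrap_user` is illustrative, and the sample token is the one that appears in this document's later `kubeadm token list` output.

```shell
# Derive the authenticated user name from a bootstrap token ("<id>.<secret>").
bootstrap_user() {
  local token="$1"
  local id="${token%%.*}"      # text before the first dot
  local secret="${token#*.}"   # text after the first dot
  # Token IDs are 6 characters and secrets 16, both drawn from [a-z0-9].
  case "$id" in
    [a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9]) ;;
    *) echo "malformed token id" >&2; return 1 ;;
  esac
  [ "${#secret}" -eq 16 ] || { echo "malformed token secret" >&2; return 1; }
  echo "system:bootstrap:${id}"
}

bootstrap_user "5t989l.rweut7kedj7ifl1a"
# prints: system:bootstrap:5t989l
```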

    Distribute the bootstrap kubeconfig file to all worker nodes

    cd /opt/k8s/work
    export node_ip=192.168.0.114
    scp kubelet-bootstrap.kubeconfig root@${node_ip}:/etc/kubernetes/kubelet-bootstrap.kubeconfig

    Create and distribute the kubelet configuration file

    Since v1.10, some kubelet parameters must be set in a configuration file; kubelet --help indicates which ones.

    cd /opt/k8s/work
    
    export CLUSTER_CIDR="172.30.0.0/16"
    export NODE_IP=192.168.0.114
    export CLUSTER_DNS_SVC_IP="10.254.0.2"

    cat > kubelet-config.yaml <<EOF
    kind: KubeletConfiguration
    apiVersion: kubelet.config.k8s.io/v1beta1
    address: ${NODE_IP}
    staticPodPath: "/etc/kubernetes/manifests"
    syncFrequency: 1m
    fileCheckFrequency: 20s
    httpCheckFrequency: 20s
    staticPodURL: ""
    port: 10250
    readOnlyPort: 0
    rotateCertificates: true
    serverTLSBootstrap: true
    authentication:
      anonymous:
        enabled: false
      webhook:
        enabled: true
      x509:
        clientCAFile: "/etc/kubernetes/cert/ca.pem"
    authorization:
      mode: Webhook
    registryPullQPS: 0
    registryBurst: 20
    eventRecordQPS: 0
    eventBurst: 20
    enableDebuggingHandlers: true
    enableContentionProfiling: true
    healthzPort: 10248
    healthzBindAddress: ${NODE_IP}
    clusterDomain: "cluster.local"
    clusterDNS:
      - "${CLUSTER_DNS_SVC_IP}"
    nodeStatusUpdateFrequency: 10s
    nodeStatusReportFrequency: 1m
    imageMinimumGCAge: 2m
    imageGCHighThresholdPercent: 85
    imageGCLowThresholdPercent: 80
    volumeStatsAggPeriod: 1m
    kubeletCgroups: ""
    systemCgroups: ""
    cgroupRoot: ""
    cgroupsPerQOS: true
    cgroupDriver: cgroupfs
    runtimeRequestTimeout: 10m
    hairpinMode: promiscuous-bridge
    maxPods: 220
    podCIDR: "${CLUSTER_CIDR}"
    podPidsLimit: -1
    resolvConf: /run/systemd/resolve/resolv.conf
    maxOpenFiles: 1000000
    kubeAPIQPS: 1000
    kubeAPIBurst: 2000
    serializeImagePulls: false
    evictionHard:
      memory.available: "100Mi"
      nodefs.available: "10%"
      nodefs.inodesFree: "5%"
      imagefs.available: "15%"
    evictionSoft: {}
    enableControllerAttachDetach: true
    failSwapOn: true
    containerLogMaxSize: 20Mi
    containerLogMaxFiles: 10
    systemReserved: {}
    kubeReserved: {}
    systemReservedCgroup: ""
    kubeReservedCgroup: ""
    enforceNodeAllocatable: ["pods"]
    EOF

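    address: the address on which kubelet's secure port (HTTPS, 10250) listens. It must not be 127.0.0.1, otherwise kube-apiserver, heapster, and other clients cannot call the kubelet API;
    readOnlyPort=0: disables the read-only port (default 10255); equivalent to leaving it unset;
    authentication.anonymous.enabled: set to false so that anonymous access to port 10250 is not allowed;
    authentication.x509.clientCAFile: the CA certificate that signs client certificates; enables HTTPS client certificate authentication;
    authentication.webhook.enabled=true: enables HTTPS bearer token authentication;

    Requests (from kube-apiserver or any other client) that pass neither x509 certificate nor webhook authentication are rejected with Unauthorized;
    authorization.mode=Webhook: kubelet uses the SubjectAccessReview API to ask kube-apiserver whether a given user or group has permission to operate on a resource (RBAC);
    featureGates.RotateKubeletClientCertificate, featureGates.RotateKubeletServerCertificate: rotate certificates automatically; the certificate lifetime is determined by kube-controller-manager's --experimental-cluster-signing-duration flag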
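
    Before distributing the file, the security-related settings described above can be checked mechanically. A minimal sketch using grep and awk: `check_kubelet_config` is a hypothetical helper, and its line-oriented matching assumes the file keeps the layout generated above rather than arbitrary YAML.

```shell
# Verify that a kubelet config keeps the hardened settings described above.
check_kubelet_config() {
  local file="$1"
  local rc=0
  grep -Eq '^readOnlyPort:[[:space:]]*0[[:space:]]*$' "$file" ||
    { echo "readOnlyPort is not 0" >&2; rc=1; }
  # The first "enabled:" line after "anonymous:" must be false.
  awk '/^[[:space:]]*anonymous:/ { seen = 1; next }
       seen && /enabled:/ { ok = ($NF == "false"); exit }
       END { exit (ok ? 0 : 1) }' "$file" ||
    { echo "anonymous authentication is not disabled" >&2; rc=1; }
  grep -Eq '^[[:space:]]*mode:[[:space:]]*Webhook[[:space:]]*$' "$file" ||
    { echo "authorization mode is not Webhook" >&2; rc=1; }
  grep -Eq '^address:[[:space:]]*127\.0\.0\.1[[:space:]]*$' "$file" &&
    { echo "address must not be 127.0.0.1" >&2; rc=1; }
  return "$rc"
}
```

    `check_kubelet_config kubelet-config.yaml` returns 0 when all four checks pass and prints one line per failed check otherwise.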

    Distribute the kubelet configuration file to each node

    cd /opt/k8s/work
    export node_ip=192.168.0.114
    scp kubelet-config.yaml root@${node_ip}:/etc/kubernetes/kubelet-config.yaml

    Create and distribute the kubelet systemd unit file

    cd /opt/k8s/work
    export K8S_DIR=/data/k8s/k8s
    export NODE_NAME=slave
    cat > kubelet.service <<EOF
    [Unit]
    Description=Kubernetes Kubelet
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes
    After=docker.service
    Requires=docker.service

    [Service]
    WorkingDirectory=${K8S_DIR}/kubelet
    ExecStart=/opt/k8s/bin/kubelet \\
    --bootstrap-kubeconfig=/etc/kubernetes/kubelet-bootstrap.kubeconfig \\
    --cert-dir=/etc/kubernetes/cert \\
    --root-dir=${K8S_DIR}/kubelet \\
    --kubeconfig=/etc/kubernetes/kubelet.kubeconfig \\
    --config=/etc/kubernetes/kubelet-config.yaml \\
    --hostname-override=${NODE_NAME} \\
    --image-pull-progress-deadline=15m \\
    --volume-plugin-dir=${K8S_DIR}/kubelet/kubelet-plugins/volume/exec/ \\
    --logtostderr=true \\
    --v=2
    Restart=always
    RestartSec=5
    StartLimitInterval=0

    [Install]
    WantedBy=multi-user.target
    EOF

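    If --hostname-override is set, kube-proxy must be given the same value, otherwise it will fail to find the Node;
    --bootstrap-kubeconfig: points to the bootstrap kubeconfig file; kubelet uses the user name and token in that file to send the TLS Bootstrapping request to kube-apiserver;
    After Kubernetes approves the kubelet's CSR, the certificate and private key files are created in the --cert-dir directory and then written into the --kubeconfig file

    Distribute the kubelet service file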

    cd /opt/k8s/work
    export node_ip=192.168.0.114
    scp kubelet.service root@${node_ip}:/etc/systemd/system/kubelet.service

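    Grant kube-apiserver access to the kubelet API

    When commands such as kubectl exec, run, and logs are executed, apiserver forwards the request to kubelet's HTTPS port. The RBAC rule below authorizes the user name (CN: kubernetes-api) of the certificate used by apiserver (kubernetes.pem) to access the kubelet API: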

    kubectl create clusterrolebinding kube-apiserver:kubelet-apis --clusterrole=system:kubelet-api-admin --user kubernetes-api
    
    

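    Bootstrap Token Auth and granting permissions

    On startup kubelet checks whether the file named by --kubeconfig exists. If it does not, kubelet uses the kubeconfig given by --bootstrap-kubeconfig to send a certificate signing request (CSR) to kube-apiserver.

    When kube-apiserver receives the CSR, it authenticates the token in the request. On success it sets the request's user to system:bootstrap:<Token ID> and its group to system:bootstrappers. This process is called Bootstrap Token Auth.

    By default this user and group have no permission to create CSRs, so create a clusterrolebinding that binds the group system:bootstrappers to the clusterrole system:node-bootstrapper: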

    kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --group=system:bootstrappers
    
    

    Start the kubelet service

    export K8S_DIR=/data/k8s/k8s
    
    export node_ip=192.168.0.114
    ssh root@${node_ip} "mkdir -p ${K8S_DIR}/kubelet/kubelet-plugins/volume/exec/"
    ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kubelet && systemctl restart kubelet"

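    After kubelet starts, it uses --bootstrap-kubeconfig to send a CSR to kube-apiserver. Once the CSR is approved, kube-controller-manager issues the TLS client certificate for kubelet, and the certificate, private key, and --kubeconfig file are created.

    Note: kube-controller-manager must be configured with --cluster-signing-cert-file and --cluster-signing-key-file, otherwise it will not create certificates and private keys for TLS Bootstrap.

    Problems encountered

      After kubelet started, kubectl get csr returned nothing, and the kubelet log showed an error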

      journalctl -u kubelet -a |grep -A 2 'certificate_manager.go' 
      
      Failed while requesting a signed certificate from the master: cannot create certificate signing request: Unauthorized 
      
      

      Check the kube-apiserver service log

      root@master:/opt/k8s/work# journalctl -eu kube-apiserver
      
      Unable to authenticate the request due to an error: invalid bearer token

      Cause: the following option was missing from the kube-apiserver service unit file

      --enable-bootstrap-token-auth \\

      Adding it and restarting kube-apiserver resolved the problem

      kubelet kept generating new CSRs after startup, even after they were approved manually

      Cause: the kube-controller-manager service had stopped; restarting it resolved the problem

      After kubelet runs into problems, delete /etc/kubernetes/kubelet.kubeconfig and /etc/kubernetes/cert/kubelet-client-current*.pem on the affected node, then restart kubelet
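
      The cleanup described above can be wrapped in a small function so that the same files are removed every time. This is a sketch: `reset_kubelet_bootstrap` is a made-up name, and the optional root prefix argument exists only so the function can be tried against a scratch directory. On the real node it would run with no argument and be followed by `systemctl restart kubelet`.

```shell
# Remove the kubeconfig and client certificate that kubelet generated
# during TLS bootstrapping, so the next start bootstraps from scratch.
# $1 is an optional root prefix (hypothetical, for dry runs in a temp dir).
reset_kubelet_bootstrap() {
  local root="${1:-}"
  rm -f "${root}/etc/kubernetes/kubelet.kubeconfig"
  rm -f "${root}"/etc/kubernetes/cert/kubelet-client-current*.pem
}
```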

    Check the kubelet CSRs

    root@master:/opt/k8s/work# kubectl get csr
    NAME AGE REQUESTOR CONDITION
    csr-kl5mg 49s system:bootstrap:5t989l Pending
    csr-mrmkf 2m1s system:bootstrap:5t989l Pending
    csr-ql68g 13s system:bootstrap:5t989l Pending
    csr-rvl2v 84s system:bootstrap:5t989l Pending

    Until they are approved manually, new CSRs keep being added

    Approve the CSRs manually

    root@master:/opt/k8s/work# kubectl get csr | grep Pending | awk '{print $1}' | xargs kubectl certificate approve
    certificatesigningrequest.certificates.k8s.io/csr-kl5mg approved
    certificatesigningrequest.certificates.k8s.io/csr-mrmkf approved
    certificatesigningrequest.certificates.k8s.io/csr-ql68g approved
    certificatesigningrequest.certificates.k8s.io/csr-rvl2v approved

    root@master:/opt/k8s/work# kubectl get csr | grep Pending | awk '{print $1}' | xargs kubectl certificate approve
    certificatesigningrequest.certificates.k8s.io/csr-f4smx approved
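
    `grep Pending` matches the word anywhere on the line. A slightly stricter variant, sketched below, selects rows by the last column (CONDITION) and skips the header. `pending_csrs` is an illustrative name; it reads `kubectl get csr` output on stdin, and the sample input is adapted from the listing above.

```shell
# Print the names of CSRs whose CONDITION column is exactly "Pending".
pending_csrs() {
  awk 'NR > 1 && $NF == "Pending" { print $1 }'
}

pending_csrs <<'EOF'
NAME AGE REQUESTOR CONDITION
csr-kl5mg 49s system:bootstrap:5t989l Pending
csr-mrmkf 2m1s system:bootstrap:5t989l Pending
csr-ql68g 13s system:bootstrap:5t989l Approved,Issued
EOF
# prints csr-kl5mg and csr-mrmkf, one per line
```

    On the master this could be used as `kubectl get csr | pending_csrs | xargs kubectl certificate approve`.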

    Check the node

    root@master:/opt/k8s/work# kubectl get nodes
    NAME STATUS ROLES AGE VERSION
    slave Ready <none> 10m v1.17.2

    Check the kubelet service status

    export node_ip=192.168.0.114
    root@master:/opt/k8s/work# ssh root@${node_ip} "systemctl status kubelet.service"
    ● kubelet.service - Kubernetes Kubelet
    Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Active: active (running) since Mon 2020-02-10 22:48:41 CST; 12min ago
    Docs: https://github.com/GoogleCloudPlatform/kubernetes
    Main PID: 15529 (kubelet)
    Tasks: 19 (limit: 4541)
    CGroup: /system.slice/kubelet.service
    └─15529 /opt/k8s/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/kubelet-bootstrap.kubeconfig --cert-dir=/etc/kubernetes/cert --root-dir=/data/k8s/k8s/kubelet --kubeconfig=/etc/kubernetes/kubelet.kubeconfig --config=/etc/kubernetes/kubelet-config.yaml --hostname-override=slave --image-pull-progress-deadline=15m --volume-plugin-dir=/data/k8s/k8s/kubelet/kubelet-plugins/volume/exec/ --logtostderr=true --v=2

    2月 10 22:49:04 slave kubelet[15529]: I0210 22:49:04.846285 15529 kubelet_node_status.go:73] Successfully registered node slave
    2月 10 22:49:04 slave kubelet[15529]: I0210 22:49:04.930745 15529 certificate_manager.go:402] Rotating certificates
    2月 10 22:49:14 slave kubelet[15529]: I0210 22:49:14.966351 15529 kubelet_node_status.go:486] Recording NodeReady event message for node slave
    2月 10 22:49:29 slave kubelet[15529]: I0210 22:49:29.580410 15529 certificate_manager.go:531] Certificate expiration is 2030-02-06 04:19:00 +0000 UTC, rotation deadline is 2029-01-21 13:08:18.850930128 +0000 UTC
    2月 10 22:49:29 slave kubelet[15529]: I0210 22:49:29.580484 15529 certificate_manager.go:281] Waiting 78430h18m49.270459727s for next certificate rotation
    2月 10 22:49:30 slave kubelet[15529]: I0210 22:49:30.580981 15529 certificate_manager.go:531] Certificate expiration is 2030-02-06 04:19:00 +0000 UTC, rotation deadline is 2027-07-14 16:09:26.990162158 +0000 UTC
    2月 10 22:49:30 slave kubelet[15529]: I0210 22:49:30.581096 15529 certificate_manager.go:281] Waiting 65065h19m56.409078053s for next certificate rotation
    2月 10 22:53:44 slave kubelet[15529]: I0210 22:53:44.911705 15529 kubelet.go:1312] Image garbage collection succeeded
    2月 10 22:53:45 slave kubelet[15529]: I0210 22:53:45.053792 15529 container_manager_linux.go:469] [ContainerManager]: Discovered runtime cgroups name: /system.slice/docker.service
    2月 10 22:58:45 slave kubelet[15529]: I0210 22:58:45.054225 15529 container_manager_linux.go:469] [ContainerManager]: Discovered runtime cgroups name: /system.slice/docker.servic

Configure the kube-proxy component

    Create the kube-proxy certificate and private key

      Create the certificate signing request file

      cd /opt/k8s/work
      cat > kube-proxy-csr.json <<EOF
      {
      "CN": "system:kube-proxy",
      "key": {
      "algo": "rsa",
      "size": 2048
      },
      "names": [
      {
      "C": "CN",
      "ST": "NanJing",
      "L": "NanJing",
      "O": "system:kube-proxy",
      "OU": "system"
      }
      ]
      }
      EOF

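      CN: sets the User of this certificate to system:kube-proxy;
      The predefined ClusterRoleBinding system:node-proxier binds the User system:kube-proxy to the ClusterRole system:node-proxier, which grants permission to call the proxy-related APIs of kube-apiserver.

      Generate the certificate and private key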

      cd /opt/k8s/work
      cfssl gencert -ca=/opt/k8s/work/ca.pem \
      -ca-key=/opt/k8s/work/ca-key.pem \
      -config=/opt/k8s/work/ca-config.json \
      -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy
      ls kube-proxy*pem

      Install the certificate

      cd /opt/k8s/work
      export node_ip=192.168.0.114
      scp kube-proxy*.pem root@${node_ip}:/etc/kubernetes/cert/

    Create the kubeconfig file

    kube-proxy uses this file to access apiserver. It provides the apiserver address, the embedded CA certificate, and the kube-proxy certificate.

    cd /opt/k8s/work
    
    export KUBE_APISERVER=https://192.168.0.107:6443
    
    kubectl config set-cluster kubernetes \
      --certificate-authority=/opt/k8s/work/ca.pem \
      --embed-certs=true \
      --server=${KUBE_APISERVER} \
      --kubeconfig=kube-proxy.kubeconfig

    kubectl config set-credentials kube-proxy \
      --client-certificate=kube-proxy.pem \
      --client-key=kube-proxy-key.pem \
      --embed-certs=true \
      --kubeconfig=kube-proxy.kubeconfig

    kubectl config set-context default \
      --cluster=kubernetes \
      --user=kube-proxy \
      --kubeconfig=kube-proxy.kubeconfig

    kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig

    Distribute the kubeconfig

    cd /opt/k8s/work
    export node_ip=192.168.0.114
    scp kube-proxy.kubeconfig root@${node_ip}:/etc/kubernetes/kube-proxy.kubeconfig

    Create the kube-proxy configuration file

    cd /opt/k8s/work
    
    export CLUSTER_CIDR="172.30.0.0/16"
    
    export NODE_IP=192.168.0.114
    
    export NODE_NAME=slave
    
    cat > kube-proxy-config.yaml <<EOF
    kind: KubeProxyConfiguration
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    clientConnection:
      burst: 200
      kubeconfig: "/etc/kubernetes/kube-proxy.kubeconfig"
      qps: 100
    bindAddress: ${NODE_IP}
    healthzBindAddress: ${NODE_IP}:10256
    metricsBindAddress: ${NODE_IP}:10249
    enableProfiling: true
    clusterCIDR: ${CLUSTER_CIDR}
    hostnameOverride: ${NODE_NAME}
    mode: "ipvs"
    portRange: ""
    iptables:
      masqueradeAll: false
    ipvs:
      scheduler: rr
      excludeCIDRs: []
    EOF

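    bindAddress: the listen address;
    clientConnection.kubeconfig: the kubeconfig file used to connect to apiserver;
    clusterCIDR: kube-proxy uses --cluster-cidr to tell cluster-internal traffic from external traffic; only when --cluster-cidr or --masquerade-all is specified does kube-proxy SNAT requests to Service IPs;
    hostnameOverride: must match the value used by kubelet, otherwise kube-proxy will not find the Node after startup and will create no ipvs rules;
    mode: use ipvs mode;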
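
    Because a mismatch between the kubelet and kube-proxy hostname settings fails quietly (kube-proxy simply creates no rules), it can be worth comparing them before starting the service. A sketch with the hypothetical helper name `check_hostname_override`; it assumes the unit-file and config-file layouts generated in this document.

```shell
# Compare kubelet's --hostname-override with kube-proxy's hostnameOverride.
# $1: kubelet.service file, $2: kube-proxy-config.yaml file.
check_hostname_override() {
  local kubelet_unit="$1"
  local proxy_config="$2"
  local a
  local b
  a="$(sed -n 's/.*--hostname-override=\([^ \\]*\).*/\1/p' "$kubelet_unit" | head -n 1)"
  b="$(sed -n 's/^hostnameOverride:[[:space:]]*//p' "$proxy_config" | head -n 1)"
  if [ -n "$a" ] && [ "$a" = "$b" ]; then
    echo "match: $a"
  else
    echo "mismatch: kubelet='$a' kube-proxy='$b'" >&2
    return 1
  fi
}
```

    From /opt/k8s/work, `check_hostname_override kubelet.service kube-proxy-config.yaml` compares the two generated files.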

    Distribute the kube-proxy configuration file

    cd /opt/k8s/work
    export node_ip=192.168.0.114
    scp kube-proxy-config.yaml root@${node_ip}:/etc/kubernetes/kube-proxy-config.yaml

    Create the kube-proxy systemd unit file

    cd /opt/k8s/work
    export K8S_DIR=/data/k8s/k8s

    cat > kube-proxy.service <<EOF
    [Unit]
    Description=Kubernetes Kube-Proxy Server
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes
    After=network.target

    [Service]
    WorkingDirectory=${K8S_DIR}/kube-proxy
    ExecStart=/opt/k8s/bin/kube-proxy \\
    --config=/etc/kubernetes/kube-proxy-config.yaml \\
    --logtostderr=true \\
    --v=2
    Restart=on-failure
    RestartSec=5
    LimitNOFILE=65536

    [Install]
    WantedBy=multi-user.target
    EOF

    Distribute the kube-proxy systemd unit file:

    export node_ip=192.168.0.114
    scp kube-proxy.service root@${node_ip}:/etc/systemd/system/

    Start the kube-proxy service

    export node_ip=192.168.0.114
    export K8S_DIR=/data/k8s/k8s
    ssh root@${node_ip} "mkdir -p ${K8S_DIR}/kube-proxy"
    ssh root@${node_ip} "modprobe ip_vs_rr"
    ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kube-proxy && systemctl restart kube-proxy"

    Check the startup result

    export node_ip=192.168.0.114
    ssh root@${node_ip} "systemctl status kube-proxy |grep Active"

    The status should be active (running). If it is not, inspect the logs to find the cause:

    journalctl -u kube-proxy
    
    

    Check the listening ports and ipvs rules


    root@slave:~# netstat -lnpt|grep kube-prox
    tcp 0 0 192.168.0.114:10256 0.0.0.0:* LISTEN 23078/kube-proxy
    tcp 0 0 192.168.0.114:10249 0.0.0.0:* LISTEN 23078/kube-proxy
    root@slave:~# ipvsadm -ln
    IP Virtual Server version 1.2.1 (size=4096)
    Prot LocalAddress:Port Scheduler Flags
    -> RemoteAddress:Port Forward Weight ActiveConn InActConn
    TCP 10.254.0.1:443 rr
    -> 192.168.0.107:6443 Masq 1 0 0
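
    To confirm that a particular Service IP has at least one real server behind it, the `ipvsadm -ln` listing can be filtered. The sketch below reads that listing on stdin and prints the backends of one virtual service. `ipvs_backends` is an illustrative name, and the parsing assumes the column layout shown above.

```shell
# Print the real servers listed under one virtual service in `ipvsadm -ln` output.
# $1 is the virtual address, e.g. 10.254.0.1:443.
ipvs_backends() {
  awk -v vip="$1" '
    $1 == "TCP" || $1 == "UDP" { inside = ($2 == vip); next }
    inside && $1 == "->" && $2 != "RemoteAddress:Port" { print $2 }
  '
}

ipvs_backends 10.254.0.1:443 <<'EOF'
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.254.0.1:443 rr
  -> 192.168.0.107:6443 Masq 1 0 0
EOF
# prints: 192.168.0.107:6443
```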

Verify cluster functionality (run on the master node)

Use an nginx Service and Deployment to verify that the cluster works

    Create the manifest file

    mkdir /opt/k8s/yml
    
    cd /opt/k8s/yml
    
    cat > nginx.yml << EOF
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      type: NodePort
      selector:
        app: nginx
      ports:
      - name: http
        port: 80
        targetPort: 80
        nodePort: 8080
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment
    spec:
      selector:
        matchLabels:
          app: nginx
      replicas: 1
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1.9.1
            ports:
            - containerPort: 80
    EOF

    Start the service

    kubectl create -f nginx.yml
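
    The manifest requests nodePort 8080, which lies outside Kubernetes' default NodePort range of 30000-32767. Creation therefore succeeds only if kube-apiserver was started with a wider `--service-node-port-range`; that setting is not visible in this section, so treat it as an assumption. The sketch below is a plain range check with the illustrative name `port_in_range`.

```shell
# Return 0 if port $1 lies inside range $2, written as "min-max".
port_in_range() {
  local port="$1"
  local min="${2%-*}"
  local max="${2#*-}"
  [ "$port" -ge "$min" ] && [ "$port" -le "$max" ]
}

port_in_range 8080 30000-32767 || echo "8080 is outside the default NodePort range"
# prints: 8080 is outside the default NodePort range
```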
    
    

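    The first start needs to pull the k8s.gcr.io/pause:3.1 image, which cannot be downloaded directly from within China, so the service fails to start. Work around it as follows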

    docker pull kubeimage/pause:3.1
    docker tag kubeimage/pause:3.1 k8s.gcr.io/pause:3.1

    Check that the service has started


    root@master:/opt/k8s/yml# kubectl get service -o wide
    NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
    kubernetes ClusterIP 10.254.0.1 <none> 443/TCP 41h <none>
    nginx NodePort 10.254.8.25 <none> 80:8080/TCP 30m app=nginx
    root@master:/opt/k8s/yml# kubectl get pod -o wide
    NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
    nginx-deployment-56f8998dbc-955gf 1/1 Running 0 30m 172.30.78.2 slave <none> <none>
    root@master:/opt/k8s/yml# curl http://192.168.0.114:8080
    <!DOCTYPE html>
    <html>
    <head>
    <title>Welcome to nginx!</title>
    <style>
    body {
    width: 35em;
    margin: 0 auto;
    font-family: Tahoma, Verdana, Arial, sans-serif;
    }
    </style>
    </head>
    <body>
    <h1>Welcome to nginx!</h1>
    <p>If you see this page, the nginx web server is successfully installed and
    working. Further configuration is required.</p> <p>For online documentation and support please refer to
    <a href="http://nginx.org/">nginx.org</a>.<br/>
    Commercial support is available at
    <a href="http://nginx.com/">nginx.com</a>.</p> <p><em>Thank you for using nginx.</em></p>
    </body>
    </html>

Deploy the coredns add-on (run on the master node)

    Download and configure coredns

    cd /opt/k8s/work
    git clone https://github.com/coredns/deployment.git
    mv deployment coredns

    Start coredns

    cd /opt/k8s/work/coredns/kubernetes
    
    export CLUSTER_DNS_SVC_IP="10.254.0.2"
    export CLUSTER_DNS_DOMAIN="cluster.local"

    ./deploy.sh -i ${CLUSTER_DNS_SVC_IP} -d ${CLUSTER_DNS_DOMAIN} | kubectl apply -f -

    Problem encountered

    After coredns starts, its status is CrashLoopBackOff

```
root@master:/opt/k8s/work/coredns/kubernetes# kubectl get pod -n kube-system -l k8s-app=kube-dns
NAME READY STATUS RESTARTS AGE
coredns-76b74f549-99bxd 0/1 CrashLoopBackOff 5 4m45s
```

The log of the coredns pod shows the following error

```
root@master:/opt/k8s/work/coredns/kubernetes# kubectl -n kube-system logs coredns-76b74f549-99bxd
.:53
[INFO] plugin/reload: Running configuration MD5 = 8b19e11d5b2a72fb8e63383b064116a1
CoreDNS-1.6.6
linux/amd64, go1.13.5, 6a7a75e
[FATAL] plugin/loop: Loop (127.0.0.1:60429 -> :53) detected for zone ".", see https://coredns.io/plugins/loop#troubleshooting. Query: "HINFO 6292641803451309721.7599235642583168995."
```

Following the hint, the page https://coredns.io/plugins/loop#troubleshooting says:

> When a CoreDNS Pod deployed in Kubernetes detects a loop, the CoreDNS Pod will start to “CrashLoopBackOff”. This is because Kubernetes will try to restart the Pod every time CoreDNS detects the loop and exits.
>
> A common cause of forwarding loops in Kubernetes clusters is an interaction with a local DNS cache on the host node (e.g. systemd-resolved). For example, in certain configurations systemd-resolved will put the loopback address 127.0.0.53 as a nameserver into /etc/resolv.conf. Kubernetes (via kubelet) by default will pass this /etc/resolv.conf file to all Pods using the default dnsPolicy rendering them unable to make DNS lookups (this includes CoreDNS Pods). CoreDNS uses this /etc/resolv.conf as a list of upstreams to forward requests to. Since it contains a loopback address, CoreDNS ends up forwarding requests to itself.
>
> There are many ways to work around this issue, some are listed here:
>
> * Add the following to your kubelet config yaml: resolvConf: <path-to-your-real-resolv-conf-file> (or via command line flag --resolv-conf deprecated in 1.10). Your “real” resolv.conf is the one that contains the actual IPs of your upstream servers, and no local/loopback address. This flag tells kubelet to pass an alternate resolv.conf to Pods. For systems using systemd-resolved, /run/systemd/resolve/resolv.conf is typically the location of the “real” resolv.conf, although this can be different depending on your distribution.
> * Disable the local DNS cache on host nodes, and restore /etc/resolv.conf to the original.
> * A quick and dirty fix is to edit your Corefile, replacing forward . /etc/resolv.conf with the IP address of your upstream DNS, for example forward . 8.8.8.8. But this only fixes the issue for CoreDNS, kubelet will continue to forward the invalid resolv.conf to all default dnsPolicy Pods, leaving them unable to resolve DNS.

Following the first workaround, set resolvConf in the kubelet configuration file kubelet-config.yaml to /run/systemd/resolve/resolv.conf. The relevant fragment is

```
...
podPidsLimit: -1
resolvConf: /run/systemd/resolve/resolv.conf
maxOpenFiles: 1000000
...
```

Restart the kubelet service

```
systemctl daemon-reload
systemctl restart kubelet
```

Then redeploy coredns

```
root@master:/opt/k8s/work/coredns/kubernetes# ./deploy.sh -i ${CLUSTER_DNS_SVC_IP} -d ${CLUSTER_DNS_DOMAIN} | kubectl apply -f -
serviceaccount/coredns created
clusterrole.rbac.authorization.k8s.io/system:coredns created
clusterrolebinding.rbac.authorization.k8s.io/system:coredns created
configmap/coredns created
deployment.apps/coredns created
service/kube-dns created

root@master:/opt/k8s/work/coredns/kubernetes# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-76b74f549-j5t9c 1/1 Running 0 12s

root@master:/opt/k8s/work/coredns/kubernetes# kubectl get all -n kube-system -l k8s-app=kube-dns
NAME READY STATUS RESTARTS AGE
pod/coredns-76b74f549-j5t9c 1/1 Running 0 2m8s

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kube-dns ClusterIP 10.254.0.2 <none> 53/UDP,53/TCP,9153/TCP 2m8s

NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/coredns 1/1 1 1 2m8s

NAME DESIRED CURRENT READY AGE
replicaset.apps/coredns-76b74f549 1 1 1 2m8s
```
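
The forwarding loop comes from a loopback nameserver in the resolv.conf that kubelet hands to Pods, so a quick check of the candidate file can catch the problem before coredns is deployed. A minimal sketch, with the illustrative name `has_loopback_ns`; it only looks for `nameserver` lines that point at 127.x.x.x or ::1.

```shell
# Succeed (exit 0) if the given resolv.conf-style file lists a loopback nameserver.
has_loopback_ns() {
  grep -Eq '^[[:space:]]*nameserver[[:space:]]+(127\.|::1)' "$1"
}
```

On a node, `has_loopback_ns /etc/resolv.conf && echo "loopback nameserver found"` flags the situation described on the CoreDNS troubleshooting page, and the same check on /run/systemd/resolve/resolv.conf should find nothing.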

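    Start a busybox pod together with the nginx service from the previous section on verifying cluster functionality, then access the nginx service by its service name from inside busybox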

    cd /opt/k8s/yml
    cat > busybox.yml << EOF
    apiVersion: v1
    kind: Pod
    metadata:
      name: busybox
    spec:
      containers:
      - name: busybox
        image: busybox
        command:
        - sleep
        - "3600"
    EOF

    kubectl create -f busybox.yml
    kubectl create -f nginx.yml

    Enter the busybox pod and access nginx

    root@master:/opt/k8s/yml# kubectl exec -it busybox  sh
    / # cat /etc/resolv.conf
    nameserver 10.254.0.2
    search default.svc.cluster.local svc.cluster.local cluster.local
    options ndots:5

    / # nslookup www.baidu.com
    Server: 10.254.0.2
    Address: 10.254.0.2:53

    Non-authoritative answer:
    www.baidu.com canonical name = www.a.shifen.com
    Name: www.a.shifen.com
    Address: 183.232.231.174
    Name: www.a.shifen.com
    Address: 183.232.231.172

    / # nslookup kubernetes
    Server: 10.254.0.2
    Address: 10.254.0.2:53

    Name: kubernetes.default.svc.cluster.local
    Address: 10.254.0.1

    / # nslookup nginx
    Server: 10.254.0.2
    Address: 10.254.0.2:53

    Name: nginx.default.svc.cluster.local
    Address: 10.254.19.32

    / # ping -c 1 nginx
    PING nginx (10.254.19.32): 56 data bytes
    64 bytes from 10.254.19.32: seq=0 ttl=64 time=0.155 ms

    --- nginx ping statistics ---
    1 packets transmitted, 1 packets received, 0% packet loss
    round-trip min/avg/max = 0.155/0.155/0.155 ms
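
    The short name `nginx` resolves because the resolver appends each `search` domain from /etc/resolv.conf in turn; with `ndots:5`, a name with fewer than five dots is normally tried against the search list first. The sketch lists the candidate names in that order. `search_candidates` is a hypothetical helper that only mimics the expansion and performs no DNS query.

```shell
# List the names a resolver would try for a short name, given the search domains.
# $1 is the short name; the remaining arguments are the search domains in order.
search_candidates() {
  local name="$1"
  shift
  local domain
  for domain in "$@"; do
    echo "${name}.${domain}"
  done
  echo "$name"
}

search_candidates nginx default.svc.cluster.local svc.cluster.local cluster.local
# prints:
#   nginx.default.svc.cluster.local
#   nginx.svc.cluster.local
#   nginx.cluster.local
#   nginx
```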

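Add a node (run on master)

Add the node

Resources are limited, so here we add the master node itself to the cluster. A new machine would first need these steps from this document: pre-installation preparation, distributing the CA certificates to the machine, and deploying the flannel network.

    Pre-installation preparation (already done on the master node)

    Distribute the CA certificates to the machine (already done on the master node)

    Deploy the flannel network (already done on the master node)

    Install the docker service

    Install the kubelet service

    Follow the same steps used earlier to add the slave node. If the earlier kubelet-bootstrap.kubeconfig is reused as-is, the node cannot join, because the token in it is valid for only one day. Once the token has expired, kube-apiserver logs the following error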

    2月 12 11:01:01 master kube-apiserver[5018]: E0212 11:01:01.640497    5018 authentication.go:104] Unable to authenticate the request due to an error: invalid bearer token 
    
    

    List the tokens

    root@master:/opt/k8s/work# kubeadm token list --kubeconfig ~/.kube/config
    TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
    5t989l.rweut7kedj7ifl1a <invalid> 2020-02-11T18:19:41+08:00 authentication,signing kubelet-bootstrap-token system:bootstrappers:slave

    In this case, regenerate kubelet-bootstrap.yml by following the kubelet installation steps used for the slave node
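
For orientation, regenerating the bootstrap file boils down to creating a fresh token and writing it into a new kubeconfig. The sketch below is not a copy of the earlier slave-node steps. The API server URL `https://192.168.0.107:6443`, the CA path `/etc/kubernetes/cert/ca.pem`, the group `system:bootstrappers:master`, and the helper names `run` and `write_bootstrap_kubeconfig` are all assumptions for illustration; adjust them to match your own setup.

```shell
# run: execute a command, or only print it when DRY_RUN=1 (handy for review).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# write_bootstrap_kubeconfig TOKEN OUTFILE
# Writes a bootstrap kubeconfig that authenticates with TOKEN.
write_bootstrap_kubeconfig() {
  local token="$1" out="$2"
  local server="${KUBE_APISERVER:-https://192.168.0.107:6443}"  # assumed URL
  local ca="${CA_CERT:-/etc/kubernetes/cert/ca.pem}"            # assumed path
  run kubectl config set-cluster kubernetes \
    --certificate-authority="$ca" --embed-certs=true \
    --server="$server" --kubeconfig="$out"
  run kubectl config set-credentials kubelet-bootstrap \
    --token="$token" --kubeconfig="$out"
  run kubectl config set-context default \
    --cluster=kubernetes --user=kubelet-bootstrap --kubeconfig="$out"
  run kubectl config use-context default --kubeconfig="$out"
}

# Usage on the master (a new token is valid for 24h by default):
#   token=$(kubeadm token create --description kubelet-bootstrap-token \
#             --groups system:bootstrappers:master --kubeconfig ~/.kube/config)
#   write_bootstrap_kubeconfig "$token" kubelet-bootstrap.yml
```

Setting `DRY_RUN=1` prints the four kubectl commands without running them, which makes it easy to compare them with the commands used earlier for the slave node.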

    After approving the CSR, check the node status

    root@master:/opt/k8s/work# kubectl get nodes
    NAME     STATUS   ROLES    AGE   VERSION
    master   Ready    <none>   21s   v1.17.2
    slave    Ready    <none>   36h   v1.17.2
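
The approval step mentioned above can be sketched as follows. This assumes that the last column of `kubectl get csr` is CONDITION; the function name `pending_csrs` is hypothetical, and `xargs -r` is a GNU extension.

```shell
# pending_csrs: reads `kubectl get csr` output on stdin and prints the
# names of requests whose last column (CONDITION) is exactly "Pending".
pending_csrs() {
  awk 'NR > 1 && $NF == "Pending" { print $1 }'
}

# Usage:
#   kubectl get csr | pending_csrs | xargs -r kubectl certificate approve
#   kubectl get nodes
```

Approving every pending request at once is convenient on a two-node lab cluster like this one. On a shared cluster it would be safer to review each request before approving it.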

    Install the kube-proxy service

Re-verify the cluster

root@master:/opt/k8s/yml# kubectl create -f nginx.yml
service/nginx created
deployment.apps/nginx-deployment created
root@master:/opt/k8s/yml# kubectl get pod -o wide
NAME                                READY   STATUS    RESTARTS   AGE   IP            NODE     NOMINATED NODE   READINESS GATES
nginx-deployment-56f8998dbc-6b6qm   1/1     Running   0          87s   172.30.22.2   master   <none>           <none>
root@master:/opt/k8s/yml# kubectl create -f busybox.yml
pod/busybox created
root@master:/opt/k8s/yml# kubectl get pod -o wide
NAME                                READY   STATUS    RESTARTS   AGE     IP            NODE     NOMINATED NODE   READINESS GATES
busybox                             1/1     Running   0          102s    172.30.22.3   master   <none>           <none>
nginx-deployment-56f8998dbc-6b6qm   1/1     Running   0          3m20s   172.30.22.2   master   <none>           <none>
root@master:/opt/k8s/yml# curl http://192.168.0.107:8080
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
body {
width: 35em;
margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif;
}
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
root@master:/opt/k8s/yml# curl http://192.168.0.114:8080
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
body {
width: 35em;
margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif;
}
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>

As shown, a request to port 8080 on any node in the cluster correctly reaches the backend nginx service.
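
The same check can be repeated for every node in one go. This is a minimal sketch using the two node IPs and port 8080 from this document; the helper names `http_code` and `check_nodeport` are hypothetical.

```shell
# http_code URL: prints the HTTP status code for URL ("000" if no response).
http_code() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1"
}

# check_nodeport PORT IP...: requests http://IP:PORT on every given node,
# prints one "IP ok" or "IP FAIL (code)" line per node, and returns
# non-zero if any node did not answer with HTTP 200.
check_nodeport() {
  local port="$1" rc=0 ip code
  shift
  for ip in "$@"; do
    code=$(http_code "http://${ip}:${port}")
    if [ "$code" = "200" ]; then
      echo "${ip} ok"
    else
      echo "${ip} FAIL (${code})"
      rc=1
    fi
  done
  return "$rc"
}

# Usage with the two nodes from this document:
#   check_nodeport 8080 192.168.0.107 192.168.0.114
```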
