Introduction

What is Kubernetes

Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications.

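The name Kubernetes comes from Greek and means "helmsman" or "pilot"; it is also the root of the words "governor" and "cybernetics". k8s is an abbreviation formed by replacing the eight characters "ubernete" with the digit 8.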
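
The abbreviation rule is simple enough to sketch in a few lines of POSIX shell. This is only an illustration of the naming convention; the function name numeronym is made up for this example.

```shell
#!/bin/sh
# Build a numeronym: first letter + number of letters in between + last letter.
# "numeronym" is a hypothetical helper name used only for this illustration.
numeronym() {
    word=$1
    # Words of three characters or fewer are left unchanged.
    if [ "${#word}" -le 3 ]; then
        printf '%s\n' "$word"
        return
    fi
    first=${word%"${word#?}"}    # keep only the first character
    last=${word#"${word%?}"}     # keep only the last character
    printf '%s%d%s\n' "$first" "$(( ${#word} - 2 ))" "$last"
}

numeronym kubernetes
```

The same rule gives other familiar abbreviations such as i18n for internationalization.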

The Kubernetes origin story

Chinese version: http://www.sohu.com/a/108369576_465914 (this appears to be a repost, but I could not find any other link)
Original: https://cloudplatform.googleblog.com/2016/07/from-Google-to-the-world-the-Kubernetes-origin-story.html

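Managed Kubernetes support on the three major overseas cloud platforms:

  • GCE: supported in all regions
  • AWS: supported in only two regions (US East (N. Virginia) and US West (Oregon))
  • Azure: supported regions unknown (I have never used it, so I will duck out of that one)

I started by creating a three-node k8s cluster on GCE, but a ready-made managed cluster is not much help for learning, so I decided to build one from scratch instead.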

Environment

Two g1-small instances on GCE

g1-small:
1 vCPU, 1.7 GB RAM
OS: Debian 9 (stretch)
Disk: 20 GB

Internal IPs
10.140.0.3 master
10.140.0.4 node

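Deployment tool: kubespray

kubespray is a tool for automating the deployment of Kubernetes clusters, built on Ansible. Because Ansible playbooks (similar to scripts) are transparent about what they do, installing a k8s cluster with kubespray stays closer to a native installation than using the kubeadm installer.

GitHub project: https://github.com/kubernetes-incubator/kubespray

Supported distributions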

  • Container Linux by CoreOS
  • Debian Jessie, Stretch, Wheezy
  • Ubuntu 16.04
  • CentOS/RHEL 7
  • Fedora/CentOS Atomic
  • openSUSE Leap 42.3/Tumbleweed

Required dependencies

Install git

Install pip

Use pip to install the core tool, ansible, along with the other dependencies

pip install ansible netaddr Jinja2
pip install --upgrade Jinja2

Clone the repository and install any dependencies not covered above

cd
git clone https://github.com/kubernetes-incubator/kubespray
cd kubespray
pip install -r requirements.txt
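
Before moving on, it can help to confirm that the tools installed above are actually on the PATH. The following is a minimal sketch; the helper name missing_tools is hypothetical, and the tool list simply mirrors the steps above.

```shell
#!/bin/sh
# Print the name of every command in the argument list that is not on PATH.
# "missing_tools" is a hypothetical helper name, not part of kubespray.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tool list taken from the steps above.
missing=$(missing_tools git pip ansible-playbook)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All required tools found"
fi
```

If anything is reported as missing, rerun the corresponding install step before starting the playbook.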

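Passwordless SSH access

On GCE machines the default /etc/ssh/sshd_config does not allow root login, so remember to change that from Cloud Shell.

On the master machine

Generate a key pair (for example with ssh-keygen), pressing Enter three times to accept the defaults
Make sure the master can ssh to itself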

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

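On the node machine

Copy the contents of the master's id_rsa.pub into ~/.ssh/authorized_keys, and remember to edit /etc/ssh/sshd_config

Alternatively, add the SSH public key in the GCE console; Google will automatically add it to the root account's ~/.ssh/authorized_keys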
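
Appending with >> adds a duplicate line every time it is rerun. As a rough sketch, the append can be made idempotent by checking for the key first. The function name add_key and the demo key are made up for illustration; the demo writes only to a throwaway directory, not to ~/.ssh.

```shell
#!/bin/sh
# Append a public key to an authorized_keys file only if it is not already there.
# Usage: add_key <public-key-file> <authorized_keys-file>
# "add_key" is a hypothetical helper name used for this sketch.
add_key() {
    pub=$1
    auth=$2
    mkdir -p "$(dirname "$auth")"
    touch "$auth"
    chmod 600 "$auth"
    # -q: quiet, -x: match the whole line, -F: treat the key as a fixed string
    if ! grep -qxF -e "$(cat "$pub")" "$auth"; then
        cat "$pub" >> "$auth"
    fi
}

# Demo on throwaway files so nothing under ~/.ssh is touched.
demo=$(mktemp -d)
echo "ssh-rsa AAAAexample root@kube-master" > "$demo/id_rsa.pub"
add_key "$demo/id_rsa.pub" "$demo/authorized_keys"
add_key "$demo/id_rsa.pub" "$demo/authorized_keys"   # second call changes nothing
wc -l < "$demo/authorized_keys"
rm -rf "$demo"
```

On a real machine the call would look like add_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys.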

Configure the Ansible inventory

cd kubespray
vim inventory/inventory
With the comments removed, the contents are as follows

node1 ansible_ssh_host=10.140.0.3 ansible_user=root
node2 ansible_ssh_host=10.140.0.4 ansible_user=root

[kube-master]
node1

[etcd]
node1

[kube-node]
node2

[k8s-cluster:children]
kube-master
kube-node
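
A typo in a group member only shows up once the playbook is already running, so a quick consistency check can save a retry. The sketch below assumes the layout shown above, with host definitions placed before the first [group] header; the function name undefined_hosts is hypothetical.

```shell
#!/bin/sh
# Print every host that is listed in a group but never defined at the top
# of the inventory. Assumes host definitions come before the first [group].
# "undefined_hosts" is a hypothetical helper name used for this sketch.
undefined_hosts() {
    awk '
        /^[ \t]*(#|$)/ { next }                      # skip comments and blank lines
        /^\[/ { section = $0; next }                 # remember the current [group]
        section == "" { defined[$1] = 1; next }      # host definition lines
        section ~ /:(children|vars)\]$/ { next }     # members are groups or variables
        !($1 in defined) { print $1 }
    ' "$1"
}

# Demo: the inventory from this post, written to a temporary file.
inv=$(mktemp)
cat > "$inv" <<'EOF'
node1 ansible_ssh_host=10.140.0.3 ansible_user=root
node2 ansible_ssh_host=10.140.0.4 ansible_user=root

[kube-master]
node1

[etcd]
node1

[kube-node]
node2

[k8s-cluster:children]
kube-master
kube-node
EOF
undefined_hosts "$inv"    # prints nothing when every group member is defined
rm -f "$inv"
```

Inventories that define hosts under an [all] section would need the script adjusted accordingly.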

Start the installation

Bring up the cluster

cd kubespray
ansible-playbook -i inventory/inventory cluster.yml -b -v --private-key=~/.ssh/id_rsa

To add more nodes later, use the following command

ansible-playbook -i inventory/inventory scale.yml  -b -v --private-key=~/.ssh/id_rsa

Troubleshooting

ERROR! no action detected in task

The full error is as follows

root@kube-master:~/kubespray# ansible-playbook -i inventory/inventory cluster.yml -b -v --private-key=~/.ssh/id_rsa
Using /root/kubespray/ansible.cfg as config file
ERROR! no action detected in task. This often indicates a misspelled module name, or incorrect module path.

The error appears to have been in '/root/kubespray/roles/vault/handlers/main.yml': line 44, column 3, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:


- name: unseal vault
^ here

The dependencies were probably not fully installed. Following the official guide, running pip install -r requirements.txt fixes it.

Failed to create bus connection: No such file or directory

The full error is as follows

TASK [bootstrap-os : Assign inventory name to unconfigured hostnames (non-CoreOS and Tumbleweed)] *************************************
Tuesday 31 July 2018 08:41:53 +0000 (0:00:01.989) 0:00:06.384 **********
fatal: [node1]: FAILED! => {"changed": false, "msg": "Command failed rc=1, out=, err=Failed to create bus connection: No such file or directory\n"}
fatal: [node2]: FAILED! => {"changed": false, "msg": "Command failed rc=1, out=, err=Failed to create bus connection: No such file or directory\n"}

NO MORE HOSTS LEFT ********************************************************************************************************************
to retry, use: --limit @/root/kubespray/cluster.retry

PLAY RECAP ****************************************************************************************************************************
localhost : ok=2 changed=0 unreachable=0 failed=0
node1 : ok=5 changed=0 unreachable=0 failed=1
node2 : ok=5 changed=0 unreachable=0 failed=1

Related issues
https://github.com/ansible/ansible/issues/25543
https://github.com/plone/ansible-playbook/issues/108

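This problem probably only affects Debian-based systems.
Installing the missing package with apt (dbus, judging by the issues above) fixes it.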

ansible_swaptotal_mb == 0 when preinstall

fatal: [master]: FAILED! => {
"assertion": "ansible_swaptotal_mb == 0",
"changed": false,
"evaluated_to": false
}

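GCE VMs have no swap, so this did not come up there, but I ran into it when installing on my local network.

Reference: https://github.com/kubernetes-incubator/kubespray/issues/2031#issuecomment-349894969

Run on the master server

Turn off swap on every server by running the following

To disable swap permanently,
edit the swap-related mount entries in /etc/fstab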
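
The exact commands are not shown above, so what follows is only a sketch under assumptions. On most Linux systems, swapoff -a (run as root) turns swap off for the running system, and commenting out the swap line in /etc/fstab keeps it off after a reboot. The function name comment_out_swap is made up for this example, and the demo works on a temporary file rather than the real /etc/fstab.

```shell
#!/bin/sh
# Assumed command for turning swap off immediately (needs root, not run here):
#   swapoff -a

# Comment out every active swap entry in an fstab-style file.
# Usage: comment_out_swap <fstab-file>   (result goes to stdout)
# "comment_out_swap" is a hypothetical helper name used for this sketch.
comment_out_swap() {
    # Skip lines that are already comments; prefix "# " to lines that
    # contain the word "swap" as a whitespace-separated field.
    sed -E '/^[[:space:]]*#/!s/^(.*[[:space:]]swap[[:space:]].*)$/# \1/' "$1"
}

# Demo on a throwaway file with made-up entries.
fstab=$(mktemp)
cat > "$fstab" <<'EOF'
UUID=abcd / ext4 errors=remount-ro 0 1
/swapfile none swap sw 0 0
EOF
comment_out_swap "$fstab"
rm -f "$fstab"
```

After reviewing the output, the result can be written back to /etc/fstab (keeping a backup copy first is a sensible precaution).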

During installation

RAM: 260M/1.66G
CPU: 3%
CPU usage occasionally spikes to 100%

Installation complete

PLAY RECAP ****************************************************************************************************************************
localhost : ok=2 changed=0 unreachable=0 failed=0
node1 : ok=324 changed=102 unreachable=0 failed=0
node2 : ok=275 changed=84 unreachable=0 failed=0

Tuesday 31 July 2018 09:20:28 +0000 (0:00:00.217) 0:12:48.471 **********
===============================================================================
kubernetes/preinstall : Install packages requirements ------------------------------------------------------------------------- 72.36s
download : container_download | Download containers if pull is required or told to always pull (all nodes) -------------------- 52.10s
download : container_download | Download containers if pull is required or told to always pull (all nodes) -------------------- 38.50s
docker : ensure docker packages are installed --------------------------------------------------------------------------------- 32.12s
kubernetes/master : Master | wait for the apiserver to be running ------------------------------------------------------------- 27.15s
download : container_download | Download containers if pull is required or told to always pull (all nodes) -------------------- 18.61s
kubernetes-apps/ansible : Kubernetes Apps | Start Resources ------------------------------------------------------------------- 11.08s
download : container_download | Download containers if pull is required or told to always pull (all nodes) -------------------- 10.68s
docker : Docker | pause while Docker restarts --------------------------------------------------------------------------------- 10.13s
etcd : wait for etcd up -------------------------------------------------------------------------------------------------------- 9.57s
kubernetes/preinstall : Update package management cache (APT) ------------------------------------------------------------------ 9.35s
kubernetes-apps/network_plugin/calico : Start Calico resources ----------------------------------------------------------------- 8.48s
kubernetes/node : install | Copy kubelet from hyperkube container -------------------------------------------------------------- 7.92s
kubernetes-apps/ansible : Kubernetes Apps | Lay Down KubeDNS Template ---------------------------------------------------------- 7.52s
kubernetes/master : Copy kubectl from hyperkube container ---------------------------------------------------------------------- 7.15s
download : container_download | Download containers if pull is required or told to always pull (all nodes) --------------------- 6.61s
download : Download items ------------------------------------------------------------------------------------------------------ 5.73s
etcd : Configure | Check if etcd cluster is healthy ---------------------------------------------------------------------------- 5.64s
download : container_download | Download containers if pull is required or told to always pull (all nodes) --------------------- 5.57s
download : container_download | Download containers if pull is required or told to always pull (all nodes) --------------------- 5.44s

Total time: 0:12:48.471

master received 400M of traffic
node received 500M of traffic

master RAM usage: 668M, CPU: 10-30%
node RAM usage: 310M, CPU: 3-4%

master disk
/dev/sda1 20G 3.7G 16G 20% /
node disk
/dev/sda1 20G 3.5G 16G 19% /

List the pods

root@kube-master:~/kubespray# kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
calico-node-d6bl9 1/1 Running 0 3d
calico-node-mw85c 1/1 Running 0 3d
kube-apiserver-node1 1/1 Running 0 3d
kube-controller-manager-node1 1/1 Running 0 3d
kube-dns-7bd4d5fbb6-dq2r6 3/3 Running 0 3d
kube-dns-7bd4d5fbb6-pggh9 3/3 Running 0 3d
kube-proxy-node1 1/1 Running 0 3d
kube-proxy-node2 1/1 Running 0 3d
kube-scheduler-node1 1/1 Running 0 3d
kubedns-autoscaler-679b8b455-f24b5 1/1 Running 0 3d
kubernetes-dashboard-55fdfd74b4-9qplr 1/1 Running 0 3d
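
Scanning the STATUS column by eye works for a dozen pods; for a scripted check, a small filter over the same output is one option. This sketch assumes the column layout shown above (NAME, READY, STATUS, ...), and the function name not_running is hypothetical.

```shell
#!/bin/sh
# Read `kubectl get po` output on stdin and print the name of every pod
# whose STATUS column is not "Running". Prints nothing when all pods are up.
# "not_running" is a hypothetical helper name used for this sketch.
not_running() {
    awk 'NR > 1 && $3 != "Running" { print $1 }'
}

# Demo on a shortened copy of the listing above, plus one made-up failing pod.
not_running <<'EOF'
NAME READY STATUS RESTARTS AGE
calico-node-d6bl9 1/1 Running 0 3d
kube-apiserver-node1 1/1 Running 0 3d
example-broken-pod 0/1 CrashLoopBackOff 7 3d
EOF
```

Against a live cluster the call would be kubectl get po -n kube-system | not_running.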

With that, k8s is installed.

Check the version on the master and the node
kubectl version

root@kube-master:~# kubectl version
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.4", GitCommit:"5ca598b4ba5abb89bb773071ce452e33fb66339d", GitTreeState:"clean", BuildDate:"2018-06-06T08:00:59Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.4", GitCommit:"5ca598b4ba5abb89bb773071ce452e33fb66339d", GitTreeState:"clean", BuildDate:"2018-06-06T08:00:59Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

Kubernetes version: 1.10.4
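
The version strings are buried in long lines, so here is a small sketch for pulling out just the GitVersion fields. It assumes the version.Info{...} output format shown above, which other kubectl releases may not share; the function name git_versions is hypothetical.

```shell
#!/bin/sh
# Read `kubectl version` output on stdin and print "Client <ver>" and
# "Server <ver>" using the GitVersion field of each line.
# "git_versions" is a hypothetical helper name used for this sketch.
git_versions() {
    sed -nE 's/^(Client|Server) Version:.*GitVersion:"([^"]*)".*/\1 \2/p'
}

# Demo on shortened copies of the two lines above.
git_versions <<'EOF'
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.4", GitTreeState:"clean"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.4", GitTreeState:"clean"}
EOF
```

Against a live cluster the call would be kubectl version | git_versions.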