官方网站:http://www.kubernetes.io
官方文档:https://kubernetes.io/zh/docs/home/
Kubernetes是一种容器编排机制,用于容器化应用程序的部署、扩展以及管理,目标是让部署容器化应用变得简单高效。
发展经历
Infrastructure as a Service ,简称IAAS,基础设施即服务,代表阿里云、亚马逊云
Platform as a Service ,简称PAAS,平台即服务,代表有:
Apache Mesos 开源分布式资源管理框架,用于构建资源池
DockerSwarm,轻量级容器资源管理器,功能太少
Kubernetes,领航者,功能全面、稳定,由Google支持,由其内部的Borg系统使用Go语言重新演化而来
Software as a Service ,简称SAAS,软件即服务,代表:Office 365、腾讯文档无需安装,直接在浏览器使用
集群架构和组件
Borg(博格)架构
BorgMaster(集群主控),主要负责请求分发。客户可以通过borgcfg配置文件、command-line tools命令行、web browser浏览器三种方式对集群进行调度管理
Scheduler ,调度器,由客户端发起的调度将会交给该组件解析,并将任务存储在Google Paxos键值对数据库中
Borglet 则会从Paxos中循环读取数据,找到自己需要做的任务进行处理,负责提供计算能力与服务
Kubernetes架构
Master
Scheduler 调度器:负责为任务挑选节点,也就是将任务分散到不同的Node中,并将调度结果交给ApiServer,ApiServer再将数据写入到etcd(kv数据库)
Controller Manager :处理集群中常规后台任务,一个资源对应一个控制器,而ControllerManager就是负责管理这些控制器的。例如Deployment、Service
api server :Kubernetes API,集群的统一入口(kubectl, web UI, scheduler, etcd, replication controller),各组件协调者,以RESTfulAPI提供接口服务,所有对象资源的增删改查和监听操作都交给APIServer处理后再提交给Etcd存储。
etcd :分布式键值存储系统,用于保存集群状态数据,比如Pod、Service等对象信息
go语言编写的可信赖的分布式的键值存储服务,用于存储关键数据,协助分布式集群正常运转
v2版本只能存储在内存中,v3则会持久化。注意,在k8s v1.11版本中,v2版本已被弃用
Node
真正提供计算能力与服务的组件,负责运行Pod和容器
kubelet 组件,Master在Node节点上的Agent,管理本机运行容器的生命周期,比如创建容器、Pod挂载数据卷、下载secret、获取容器和节点状态等工作。kubelet将每个Pod转换成一组容器。容器之间的差异通过CRI(容器运行时接口)屏蔽。
kube-proxy ,负责将规则写入iptables或者IPVS实现映射访问,在Node节点上实现Pod网络代理,维护网络规则和四层负载均衡工作。
集群调用者
Dashboard :给K8s集群提供一个B/S结构的访问体系
CoreDNS :可以为集群中的SVC创建一个A记录(域名IP对应关系解析)
FEDERATION :提供一个可以跨集群多k8s统一管理的功能
PROMETHEUS :普罗米修斯,提供K8s监控能力
部署Kubernetes
生产环境部署k8s主要的两种方式:
kubeadm,Kubeadm是一个工具,提供kubeadm init和kubeadm join,用于快速部署Kubernetes集群。
部署地址:https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm/
优点:快速、方便
二进制,从官方下载发行版的二进制包,手动部署每个组件,组成Kubernetes集群。
下载地址:https://github.com/kubernetes/kubernetes/releases
优点:可以对部署过程更好的理解
服务器硬件配置推荐:每台机器至少2核CPU、2GB内存(kubeadm的最低要求),生产环境按实际负载适当提高。
网络组件的作用
部署网络组件的目的是打通Pod到Pod之间网络、Node与Pod之间网络,使集群中的数据包可以任意传输,形成一个扁平化网络。
目前,主流的网络组件有 Flannel、Calico 等。
CNI(Container Network Interface,容器网络接口)是k8s对接这些第三方网络组件的标准接口。
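例如以 Calico 为例(仅为示意,假设已提前下载好与集群版本匹配的官方 manifest 文件 calico.yaml):
kubectl apply -f calico.yaml
kubectl get pods -n kube-system | grep calico # 确认网络组件Pod正常运行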
配置命令补全
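以 bash 为例(示意,假设系统已安装 bash-completion 包):
source <(kubectl completion bash) # 当前会话生效
echo 'source <(kubectl completion bash)' >> ~/.bashrc # 写入配置文件,永久生效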
连接Kubernetes集群
kubeconfig配置文件
文件位于:~/.kube/config
kubectl 使用kubeconfig认证文件连接到k8s集群,我们可以使用kubectl config 指令生成kubeconfig文件;kubeconfig文件主要记录了下面几个部分的信息:
集群信息:
clusters:
- cluster:
certificate-authority-data: xxxxx 集群证书
server: https://192.168.41.10:6443 集群地址,master节点
name: kubernetes 集群名称
上下文信息(已经连接的所有集群信息,比如用户,生产环境有可能有多个集群):
contexts:
- context:
cluster: kubernetes
user: kubernetes-admin
name: kubernetes-admin@kubernetes
当前上下文信息(当前选择的哪个集群)
current-context: kubernetes-admin@kubernetes
客户端证书信息
users:
- name: kubernetes-admin
user:
client-certificate-data: xxxx 证书
client-key-data: xxx key
指定kubeconfig执行命令:
kubectl --kubeconfig=config get nodes
不加该参数时默认从家目录下的 ~/.kube/config 读取,因此可以把配置文件移动到家目录来缩短命令:
mv config ~/.kube
kubectl get nodes
将配置文件分发给其他机器,就可以连接k8s集群。
连接多个k8s集群
合并配置,context方式,切换context
我的mac上docker-desktop的kubeconfig:
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJeU1ERXhOREF4TkRjek1Wb1hEVE15TURFeE1qQXhORGN6TVZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBT2VRCmYzeVRYWGlybTIrMDR0NDhZcm5NYUd1ZVNZaWpPaXBkNkIxdE5HdldyQXdqS0xMK2loWG5UcEsvSUF0d0VDVTUKRHE2aFJZMGV0WlNUYXFWbXMxUnREb1BUMU5uR3djR2VXRDFyWUh5akpPeDZQTSt5K3RzcVE2NDFhTVd2c3FZTQpqdDZucmMwUjB2ZU1IY2ZuUWtBYnFLZWNlNFNEbEg2UHlndmx0d0RkOUtyb0tVbDRmRmNGY0Z1dmxzK3hWQ2dFCityMU9wMkp3WE5udXB6U2cwYTZlNm9abmNTL3UvVjhFWkg0RHJxMUN3ZnNUaFlsSlB3Z0NGTnR4SHJpV0xrQWcKb3BCanRmaVZYRDI1c2h6UzQ1N0x3NThUTEFzeTc4bEJaVFZ3TmxwQ1VkSjdxYVJWc3ZwWTQ3VThScEFwQllZYQpmNHQvdEhsSXkxQ0JqRWRad3AwQ0F3RUFBYU5aTUZjd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0hRWURWUjBPQkJZRUZIMFYwZ2JtNmt6YkJrRWFNM2s4U1JEd09jQWxNQlVHQTFVZEVRUU8KTUF5Q0NtdDFZbVZ5Ym1WMFpYTXdEUVlKS29aSWh2Y05BUUVMQlFBRGdnRUJBRVFTUzJGU0NJZ0V1akZ1amlQWQpWWVU1QTNMRHcvbVBmSUZOWlZtNUhOYlUzZHdxRmVVRUxTaHAzdXpiM0JHbFprVDlRWDBSOEh5VVByWGtrSlJrCk9DYXlBdEg4WWxkTnZVMFFrYlFKNWVRWW1WYU9GK2dxLzRBVWdkRU1FWWdVd2c1TldxQkExSXdHV2VaZk9DM3MKTWRraU1FSG8zYjRBYjd1TDJibHo4aUc3U2NkTi9KRTJTM2FXQ2JjQWVEYitWRzdpS1pGc1JJM0dzMEhNelFiZQpyK3VVdnNHdG5SMnlSZnZrbWpCRTR4VzA0THpVb0tHY1RmcDcrSy9NZ2cyYXQyckVFUnUwK2tCaVFMd2pqWDRjCjRBbi80VUh2aHh6TXhqTkxNQml2bHlDcmg3OW9QUitZMEg5SHpxNmsvVEtuaGs3Ymh4bkpYYVYzbjVCRVNvWDUKMFNrPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
server: https://kubernetes.docker.internal:6443
name: docker-desktop
contexts:
- context:
cluster: docker-desktop
user: docker-desktop
name: docker-desktop
current-context: docker-desktop
kind: Config
preferences: {}
users:
- name: docker-desktop
user:
client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1xxx
client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNxxx
我的windows虚拟机上的kubeconfig:
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUM1ekNDQWMrZ0F3SUJBZxxxx
server: https://192.168.41.10:6443
name: kubernetes
contexts:
- context:
cluster: kubernetes
user: kubernetes-admin
name: kubernetes-admin@kubernetes
current-context: kubernetes-admin@kubernetes
kind: Config
preferences: {}
users:
- name: kubernetes-admin
user:
client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURJVENDQWdtZxxxx
client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcEFJQkFBS0NBUUxxx
我想让我的mac上的kubectl可以同时访问这两个集群,将windows虚拟机上的kubeconfig拷贝到mac,并修改名称:
$ ll
-rw-r--r-- 1 yangsx staff 5.6K 1 14 10:06 config-mac
-rw-r--r-- 1 yangsx staff 5.5K 1 14 09:59 config-win
配置 KUBECONFIG 环境变量,是 kubectl 工具支持的变量,变量内容是冒号分隔的 kubernetes config 认证文件路径,以此来合并多个kubeconfig文件:
KUBECONFIG=config-mac:config-win kubectl config view --flatten > $HOME/.kube/config
查看合并后的文件 cat ~/.kube/config :
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZxxxx
server: https://kubernetes.docker.internal:6443
name: docker-desktop
- cluster:
certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUM1ekNDQWMrZ0F3SUJBZ0lCQURBTkJxxx
server: https://192.168.41.10:6443
name: kubernetes
contexts:
- context:
cluster: docker-desktop
user: docker-desktop
name: docker-desktop
- context:
cluster: kubernetes
user: kubernetes-admin
name: kubernetes-admin@kubernetes
current-context: docker-desktop
kind: Config
preferences: {}
users:
- name: docker-desktop
user:
client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURRakNDQWlxZ0Fxxxx
client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFb2dJQkFBS0NBUUVBxxxx
- name: kubernetes-admin
user:
client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURJVENDQWdtZ0F3SUJBxxxx
client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcEFJQkFBS0NBUUVBMXJBUxxx
注意,不要重名
切换context :
使用命令更改context:
# 查看当前的context
kubectl config current-context
docker-desktop
# 查看所有context
kubectl config get-contexts
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
* docker-desktop docker-desktop docker-desktop
kubernetes-admin@kubernetes kubernetes kubernetes-admin
# 切换context
kubectl config use-context kubernetes-admin@kubernetes
Switched to context "kubernetes-admin@kubernetes".
切换config文件(不推荐)
复制 export KUBECONFIG=$HOME/.kube/rancher-config
直接指定配置文件(不推荐)
#切换到生产集群
kubectl get pod --kubeconfig=/root/.kube/aliyun_prod-config
#切换到生产idc集群
kubectl get pod --kubeconfig=/root/.kube/vnet_prod-config
#切换到测试环境
kubectl get pod --kubeconfig=/root/.kube/bjcs_test-config
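为了少敲参数,也可以为常用集群配置 shell 别名(仅是个人使用习惯上的一个示意,路径沿用上面的例子):
alias kprod='kubectl --kubeconfig=/root/.kube/aliyun_prod-config'
alias ktest='kubectl --kubeconfig=/root/.kube/bjcs_test-config'
kprod get pod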
集群监控
查看资源集群状态
查看master组件状态:
kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health":"true"}
这里可以看到组件的状态都是健康的
查看node节点状态:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master Ready control-plane,master 5h37m v1.21.0
k8s-node1 Ready <none> 5h26m v1.21.0
k8s-node2 Ready <none> 5h26m v1.21.0
AGE:所有的资源都有该属性,代表存活时间
STATUS:节点的状态,如果显示NotReady说明kubelet组件有异常
查看集群信息:
kubectl cluster-info
Kubernetes control plane is running at https://192.168.41.10:6443 # ApiServer的地址,也就是集群代理,连接k8s要连接apiserver这个地址
CoreDNS is running at https://192.168.41.10:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
# 如果要查看更详细的集群详细信息,可以使用命令:
kubectl cluster-info dump
# 将会以json格式的方式返回集群的详细信息
查看api资源信息:
kubectl api-resources
名称 短名称 对应的API版本 是否属于命名空间 资源的类型
NAME SHORTNAMES APIVERSION NAMESPACED KIND
bindings v1 true Binding
componentstatuses cs v1 false ComponentStatus
....
pods po v1 true Pod
...
services svc v1 true Service
查看资源列表:
kubectl get 资源类型(上面的API resource)
kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-6b9fbfff44-pstmt 1/1 Running 0 5h35m
calico-node-clq4g 1/1 Running 0 5h35m
更详细的展示
kubectl get pod -n kube-system -o wide
以yaml格式展示
kubectl get pod -n kube-system -o yaml
查看资源的详细信息:
kubectl describe 资源类型(上面的API resource) 资源名称
查看集群资源利用率
查看Node资源消耗:
kubectl top node <node_name>
查看Pod资源消耗:
kubectl top pod <pod_name>
这个过程是:
kubectl命令 => apiserver => metrics-server(pod) => 所有节点kubelet(cadvisor指标接口) => 所有资源利用率
如果提示 error: Metrics API not available,则需要安装 metrics-server,官方地址:https://github.com/kubernetes-sigs/metrics-server 。
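安装方式示意(假设节点可以访问 GitHub;如果 kubelet 使用自签证书,通常还需要在 metrics-server 的启动参数中增加 --kubelet-insecure-tls):
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
kubectl get pod -n kube-system | grep metrics-server # 确认metrics-server已运行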
安装完毕后再执行命令如下:
kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
k8s-master 589m 29% 1209Mi 63%
k8s-node1 106m 10% 504Mi 57%
k8s-node2 160m 16% 503Mi 57%
管理组件日志
k8s系统组件日志
systemd守护进程管理的组件,比如kubelet(查看方式见下方示例)
Pod部署的组件:
kubectl logs kube-proxy-btz4p -n kube-system
kubectl logs -f kube-proxy-btz4p -n kube-system # 实时查看
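对于 systemd 管理的组件(比如上面提到的 kubelet),则使用 journalctl 查看:
journalctl -u kubelet # 查看kubelet日志
journalctl -u kubelet -f # 实时查看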
k8s集群内部部署的应用程序日志
标准输出:容器的标准输出将会输出在宿主机的 /var/lib/docker/containers/<container-id>-json.log 下
日志文件:比如java日志框架打印到某个指定的文件,可以将应用程序内部的日志路径挂载出来,在宿主机查看;或者使用 kubectl exec -it <Pod名称> -- bash 进入容器内部直接查看文件。
收集k8s日志的思路
针对标准输出:以DaemonSet方式在每个Node上部署一个日志收集程序,采集 /var/lib/docker/containers/ 目录下的所有日志
针对容器中的日志文件:在Pod中增加一个容器运行日志采集器,使用emptyDir共享日志目录让日志采集器读取到日志文件
部署应用
命令方式
使用Deployment控制器部署镜像
kubectl create deployment web --image=nginx
kubectl get deployment,pods
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/web 1/1 1 1 18s
NAME READY STATUS RESTARTS AGE
pod/web-96d5df5c8-4dlx9 1/1 Running 0 18s
使用service发布pod
# --target-port 容器中的应用程序端口,比如nginx就是默认为80
# --port kubernetes中的服务之间访问的端口,可以通过集群ip加上这个端口进行服务访问
# --type=NodePort 使用NodePort方式暴露给集群外部,将会对集群服务8080端口映射在每一台的一个随机固定端口上,可以通过任意一台Node节点的IP+31052访问
kubectl expose deployment web --type=NodePort --target-port=80 --port=8080 --name=web
kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
web NodePort 10.102.83.203 <none> 8080:31052/TCP 7m2s
访问任意 NodeIP:31052 即可访问Nginx服务。
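例如用 curl 验证(示意,以 k8s-node1 的 IP 为例):
curl http://192.168.41.11:31052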
卸载:kubectl delete deployment web 和 kubectl delete svc web
yaml方式
编写yaml资源文件
# api版本
apiVersion: apps/v1
# 资源类型
kind: Deployment
# 控制器的元数据,一般使用 项目名(project) + 应用程序名(app)
metadata:
name: web
labels:
app: web
# 指定控制器对象的详细信息
spec:
# 副本数3
replicas: 3
# Deployment控制器对象控制的Pod模板
template:
metadata:
name: web
labels:
app: web
# Pod的详细信息
spec:
# Pod的容器信息
containers:
- name: web
image: nginx
imagePullPolicy: IfNotPresent
# Pod的重启策略
restartPolicy: Always
# 通过选择器指定控制器对象控制的Pod,通过labels用来指定控制器要控制的资源,这里是控制名称为web的pod
selector:
matchLabels:
app: web
---
# 再创建一个svc
apiVersion: v1
kind: Service
metadata:
name: web
spec:
selector: # 标签选择器,选择的一定是标签,这里是Pod的标签
app: web
ports:
- name: http
port: 8080
targetPort: 80
protocol: TCP
type: NodePort
kubectl apply -f 上面的yaml文件
kubectl apply -f web.yaml
deployment.apps/web created
kubectl get pods,deployments
NAME READY STATUS RESTARTS AGE
pod/web-849df489b4-97tpv 1/1 Running 0 3m3s
pod/web-849df489b4-crg4q 1/1 Running 0 3m3s
pod/web-849df489b4-ms5ds 1/1 Running 0 3m3s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/web 3/3 3 3 3m3s
yangsx@mac ~/temp kubectl get pods,deployments,services
NAME READY STATUS RESTARTS AGE
pod/web-849df489b4-97tpv 1/1 Running 0 3m14s
pod/web-849df489b4-crg4q 1/1 Running 0 3m14s
pod/web-849df489b4-ms5ds 1/1 Running 0 3m14s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/web 3/3 3 3 3m14s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 11h
service/web NodePort 10.103.77.17 <none> 8080:32654/TCP 24s
快速生成yaml
kubectl create deployment web2 --image=nginx --replicas=3 --dry-run=client -o yaml
apiVersion: apps/v1
kind: Deployment
metadata:
creationTimestamp: null # 可删除
labels:
app: web2
name: web2
spec:
replicas: 3
selector:
matchLabels:
app: web2
strategy: {}
template:
metadata:
creationTimestamp: null # 可删除
labels:
app: web2
spec:
containers:
- image: nginx
name: nginx
resources: {}
status: {} # 可删除
根据已有资源生成yaml
kubectl get deploy web -o yaml > deployment3.yaml
查看资源的API
查看所有的API资源
kubectl api-resources
资源名称 缩写 版本 命名空间级别 资源类型
NAME SHORTNAMES APIVERSION NAMESPACED KIND
bindings v1 true Binding
componentstatuses cs v1 false ComponentStatus
configmaps cm v1 true ConfigMap
endpoints ep v1 true Endpoints
events ev v1 true Event
limitranges limits v1 true LimitRange
namespaces ns v1 false Namespace
nodes no v1 false Node
persistentvolumeclaims pvc v1 true PersistentVolumeClaim
persistentvolumes pv v1 false PersistentVolume
pods po v1 true Pod
podtemplates v1 true PodTemplate
replicationcontrollers rc v1 true ReplicationController
resourcequotas quota v1 true ResourceQuota
secrets v1 true Secret
serviceaccounts sa v1 true ServiceAccount
services svc v1 true Service
mutatingwebhookconfigurations admissionregistration.k8s.io/v1 false MutatingWebhookConfiguration
validatingwebhookconfigurations admissionregistration.k8s.io/v1 false ValidatingWebhookConfiguration
customresourcedefinitions crd,crds apiextensions.k8s.io/v1 false CustomResourceDefinition
apiservices apiregistration.k8s.io/v1 false APIService
controllerrevisions apps/v1 true ControllerRevision
daemonsets ds apps/v1 true DaemonSet
deployments deploy apps/v1 true Deployment
replicasets rs apps/v1 true ReplicaSet
statefulsets sts apps/v1 true StatefulSet
tokenreviews authentication.k8s.io/v1 false TokenReview
localsubjectaccessreviews authorization.k8s.io/v1 true LocalSubjectAccessReview
selfsubjectaccessreviews authorization.k8s.io/v1 false SelfSubjectAccessReview
selfsubjectrulesreviews authorization.k8s.io/v1 false SelfSubjectRulesReview
subjectaccessreviews authorization.k8s.io/v1 false SubjectAccessReview
horizontalpodautoscalers hpa autoscaling/v1 true HorizontalPodAutoscaler
cronjobs cj batch/v1 true CronJob
jobs batch/v1 true Job
certificatesigningrequests csr certificates.k8s.io/v1 false CertificateSigningRequest
leases coordination.k8s.io/v1 true Lease
bgpconfigurations crd.projectcalico.org/v1 false BGPConfiguration
bgppeers crd.projectcalico.org/v1 false BGPPeer
blockaffinities crd.projectcalico.org/v1 false BlockAffinity
caliconodestatuses crd.projectcalico.org/v1 false CalicoNodeStatus
clusterinformations crd.projectcalico.org/v1 false ClusterInformation
felixconfigurations crd.projectcalico.org/v1 false FelixConfiguration
globalnetworkpolicies crd.projectcalico.org/v1 false GlobalNetworkPolicy
globalnetworksets crd.projectcalico.org/v1 false GlobalNetworkSet
hostendpoints crd.projectcalico.org/v1 false HostEndpoint
ipamblocks crd.projectcalico.org/v1 false IPAMBlock
ipamconfigs crd.projectcalico.org/v1 false IPAMConfig
ipamhandles crd.projectcalico.org/v1 false IPAMHandle
ippools crd.projectcalico.org/v1 false IPPool
ipreservations crd.projectcalico.org/v1 false IPReservation
kubecontrollersconfigurations crd.projectcalico.org/v1 false KubeControllersConfiguration
networkpolicies crd.projectcalico.org/v1 true NetworkPolicy
networksets crd.projectcalico.org/v1 true NetworkSet
endpointslices discovery.k8s.io/v1 true EndpointSlice
events ev events.k8s.io/v1 true Event
ingresses ing extensions/v1beta1 true Ingress
flowschemas flowcontrol.apiserver.k8s.io/v1beta1 false FlowSchema
prioritylevelconfigurations flowcontrol.apiserver.k8s.io/v1beta1 false PriorityLevelConfiguration
nodes metrics.k8s.io/v1beta1 false NodeMetrics
pods metrics.k8s.io/v1beta1 true PodMetrics
ingressclasses networking.k8s.io/v1 false IngressClass
ingresses ing networking.k8s.io/v1 true Ingress
networkpolicies netpol networking.k8s.io/v1 true NetworkPolicy
runtimeclasses node.k8s.io/v1 false RuntimeClass
poddisruptionbudgets pdb policy/v1 true PodDisruptionBudget
podsecuritypolicies psp policy/v1beta1 false PodSecurityPolicy
clusterrolebindings rbac.authorization.k8s.io/v1 false ClusterRoleBinding
clusterroles rbac.authorization.k8s.io/v1 false ClusterRole
rolebindings rbac.authorization.k8s.io/v1 true RoleBinding
roles rbac.authorization.k8s.io/v1 true Role
priorityclasses pc scheduling.k8s.io/v1 false PriorityClass
csidrivers storage.k8s.io/v1 false CSIDriver
csinodes storage.k8s.io/v1 false CSINode
csistoragecapacities storage.k8s.io/v1beta1 true CSIStorageCapacity
storageclasses sc storage.k8s.io/v1 false StorageClass
volumeattachments storage.k8s.io/v1 false VolumeAttachment
查看所有的API的版本
kubectl api-versions
admissionregistration.k8s.io/v1
admissionregistration.k8s.io/v1beta1
apiextensions.k8s.io/v1
apiextensions.k8s.io/v1beta1
apiregistration.k8s.io/v1
apiregistration.k8s.io/v1beta1
apps/v1
authentication.k8s.io/v1
authentication.k8s.io/v1beta1
authorization.k8s.io/v1
authorization.k8s.io/v1beta1
autoscaling/v1
autoscaling/v2beta1
autoscaling/v2beta2
batch/v1
batch/v1beta1
certificates.k8s.io/v1
certificates.k8s.io/v1beta1
coordination.k8s.io/v1
coordination.k8s.io/v1beta1
crd.projectcalico.org/v1
discovery.k8s.io/v1
discovery.k8s.io/v1beta1
events.k8s.io/v1
events.k8s.io/v1beta1
extensions/v1beta1
flowcontrol.apiserver.k8s.io/v1beta1
metrics.k8s.io/v1beta1
networking.k8s.io/v1
networking.k8s.io/v1beta1
node.k8s.io/v1
node.k8s.io/v1beta1
policy/v1
policy/v1beta1
rbac.authorization.k8s.io/v1
rbac.authorization.k8s.io/v1beta1
scheduling.k8s.io/v1
scheduling.k8s.io/v1beta1
storage.k8s.io/v1
storage.k8s.io/v1beta1
v1
查看某个API的二级字段
kubectl explain <资源名>
kubectl explain pod
KIND: Pod
VERSION: v1
DESCRIPTION:
Pod is a collection of containers that can run on a host. This resource is
created by clients and scheduled onto hosts.
FIELDS:
apiVersion <string>
APIVersion defines the versioned schema of this representation of an
object. Servers should convert recognized schemas to the latest internal
value, and may reject unrecognized values. More info:
https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
kind <string>
Kind is a string value representing the REST resource this object
represents. Servers may infer this from the endpoint the client submits
requests to. Cannot be updated. In CamelCase. More info:
https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
metadata <Object>
Standard object's metadata. More info:
https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata
spec <Object>
Specification of the desired behavior of the pod. More info:
https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status
status <Object>
Most recently observed status of the pod. This data may not be up to date.
Populated by the system. Read-only. More info:
https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-statu
查看某个API的所有级别的字段
kubectl explain svc --recursive
KIND: Service
VERSION: v1
DESCRIPTION:
Service is a named abstraction of software service (for example, mysql)
consisting of local port (for example 3306) that the proxy listens on, and
the selector that determines which pods will answer requests sent through
the proxy.
FIELDS:
apiVersion <string>
kind <string>
metadata <Object>
annotations <map[string]string>
clusterName <string>
creationTimestamp <string>
deletionGracePeriodSeconds <integer>
deletionTimestamp <string>
finalizers <[]string>
generateName <string>
generation <integer>
labels <map[string]string>
managedFields <[]Object>
apiVersion <string>
fieldsType <string>
fieldsV1 <map[string]>
manager <string>
operation <string>
time <string>
name <string>
namespace <string>
ownerReferences <[]Object>
apiVersion <string>
blockOwnerDeletion <boolean>
controller <boolean>
kind <string>
name <string>
uid <string>
resourceVersion <string>
selfLink <string>
uid <string>
spec <Object>
allocateLoadBalancerNodePorts <boolean>
clusterIP <string>
clusterIPs <[]string>
externalIPs <[]string>
externalName <string>
externalTrafficPolicy <string>
healthCheckNodePort <integer>
internalTrafficPolicy <string>
ipFamilies <[]string>
ipFamilyPolicy <string>
loadBalancerClass <string>
loadBalancerIP <string>
loadBalancerSourceRanges <[]string>
ports <[]Object>
appProtocol <string>
name <string>
nodePort <integer>
port <integer>
protocol <string>
targetPort <string>
publishNotReadyAddresses <boolean>
selector <map[string]string>
sessionAffinity <string>
sessionAffinityConfig <Object>
clientIP <Object>
timeoutSeconds <integer>
topologyKeys <[]string>
type <string>
status <Object>
conditions <[]Object>
lastTransitionTime <string>
message <string>
observedGeneration <integer>
reason <string>
status <string>
type <string>
loadBalancer <Object>
ingress <[]Object>
hostname <string>
ip <string>
ports <[]Object>
error <string>
port <integer>
protocol <string>
查看某个API具体的某个字段的下一级字段
kubectl explain svc.spec.ports
KIND: Service
VERSION: v1
RESOURCE: ports <[]Object>
DESCRIPTION:
The list of ports that are exposed by this service. More info:
https://kubernetes.io/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies
ServicePort contains information on service's port.
FIELDS:
name <string>
The name of this port within the service. This must be a DNS_LABEL. All
ports within a ServiceSpec must have unique names. This maps to the 'Name'
field in EndpointPort objects. Optional if only one ServicePort is defined
on this service.
nodePort <integer>
The port on each node on which this service is exposed when type=NodePort
or LoadBalancer. Usually assigned by the system. If specified, it will be
allocated to the service if unused or else creation of the service will
fail. Default is to auto-allocate a port if the ServiceType of this Service
requires one. More info:
https://kubernetes.io/docs/concepts/services-networking/service/#type-nodeport
port <integer> -required-
The port that will be exposed by this service.
protocol <string>
The IP protocol for this port. Supports "TCP", "UDP", and "SCTP". Default
is TCP.
targetPort <string>
Number or name of the port to access on the pods targeted by the service.
Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME. If
this is a string, it will be looked up as a named port in the target Pod's
container ports. If this is not specified, the value of the 'port' field is
used (an identity map). This field is ignored for services with
clusterIP=None, and should be omitted or set equal to the 'port' field.
More info:
https://kubernetes.io/docs/concepts/services-networking/service/#defining-a-service
资源
资源 :在k8s中,所有的内容都被抽象为资源,当资源被实例化后,成为对象。
集群资源的分类
名称空间(namespace)级别的资源
工作负载资源(workload)
ReplicationController,v1.11被废弃
服务发现/负载均衡资源(ServiceDiscovery LoadBalance)
特殊类型存储卷
DownwardAPI(将外部环境中的信息输出给容器)
Deployment
Deployment等控制器的最主要目的就是为了方便管理k8s中的Pod。Deployment是最常见的工作负载控制器,是k8s的一个更高层次的抽象对象,负责部署、管理Pod。与之类似的控制器还有DaemonSet、StatefulSet等。
主要功能:管理Pod和ReplicaSet,支持滚动升级和回滚,支持水平扩容和缩容。
应用场景:网站、API、微服务等无状态应用的发布与升级。
管理应用生命周期
Deployment可以针对应用生命周期进行管理:
部署
kubectl apply -f xxx.yaml
kubectl create deployment web --image=nginx:1.15 --replicas=3
升级
kubectl apply -f xxx.yaml
kubectl set image deployment/web nginx=nginx:1.17
kubectl set image deployment/web nginx=nginx:1.17 --record=true # 升级并记录命令
kubectl edit deployment/web # 使用系统编辑器打开
滚动升级 :k8s对Pod升级的默认策略,通过使用新版本Pod逐步更新旧版本Pod,实现零停机发布,用户无感知
发布策略 主要有:蓝绿、灰度(金丝雀、A/B测试、冒烟测试)、滚动。(停机瀑布式升级已经过时)
部署升级的过程 :
部署:ReplicaSet控制副本数为3
升级:
新创建一个ReplicaSet,设置副本数为1,并将旧的RS的副本数缩容为2
将新的RS副本数设置为2,并将旧的RS的副本数缩容为1
将新的RS副本数设置为3,并将旧的RS的副本数缩容为0
kubectl describe deploy web 命令可以查看时间记录,从而观察这个过程
kubectl get rs 可以查看已经创建的rs,一个rs对应一个升级版本
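升级过程中还可以使用 rollout status 观察滚动升级的进度(示意),该命令会阻塞直到升级完成或超时:
kubectl rollout status deployment/web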
水平扩容缩容
修改yaml文件的replicas值,再apply
kubectl scale deployment web --replicas=10
扩容/缩容操作实际就是控制ReplicaSet的副本数
回滚(不常用)
kubectl rollout history deployment web # 查看web的更新历史
版本编号 版本记录(上面--record=true,会记录升级时的命令,鸡肋没什么用)
REVISION CHANGE-CAUSE
1 <none>
2 <none>
3 <none>
4 kubectl set image deployment/web nginx=nginx:1.17 --record=true
# 回滚到上个版本
kubectl rollout undo deployment web
# 回滚到指定的版本
kubectl rollout undo deployment web --to-revision=2
回滚是调用ReplicaSet重新部署某个版本(每个版本都有对应的一个ReplicaSet,可以查看rs的信息确认对应的版本号以及所做的改动)。
项目下线
kubectl delete deploy/web
kubectl delete svc/web
ReplicaSet
用途:
Pod副本数量管理,不断对比当前Pod数量和期望Pod数量
Deployment每次发布都会创建一个ReplicaSet作为记录,用于实现回滚
kubectl get rs # 查看RS记录
kubectl rollout history deployment web # 版本对应RS记录
Pod
最小封装集合(豌豆荚,容器荚),一个Pod中可以封装一个或多个容器,也是k8s管理的最小单位。有些服务之间关联性较强,需要共享网络环境、存储环境等。如果使用标准容器则很难完成操作,所以k8s在容器外部增加了一个Pod的概念。
特点:
边车模式设计(侧面可以带人的三轮摩托):
通过在Pod中定义专门容器来执行业务容器需要的辅助工作
可以将辅助功能同主业务容器解耦,实现独立发布和能力重用
Pod对象管理命令
注意:一般没有人直接创建Pod,通常通过Deployment等控制器来管理Pod
创建Pod:
kubectl apply -f pod.yaml # yaml中的kind资源类型为Pod
kubectl run nginx --image=nginx
查看Pod:
kubectl describe pod <pod_name> # 查看资源的详细信息
查看日志:
kubectl logs <pod_name> [-c 容器名称]
kubectl logs <pod_name> [-c 容器名称] -f # 实时查看
进入容器终端:
kubectl exec -it <pod_name> [-c 容器名称] -- bash
删除Pod:
kubectl delete pod <pod_name>
Pod的状态
Pending 挂起:Pod已经被k8s系统接受,但是有一个或者多个容器尚未创建。这段时间包括:调度Pod、镜像下载等
Running 运行中:Pod已经被绑定在某个Node上,Pod中的所有容器都已经被创建。至少有一个容器处于运行状态,或者正在处于启动或者重启状态
Succeeded 成功:Pod中的所有容器都被成功终止,并且不会再重启
Failed 失败:Pod中至少有一个容器是失败终止的(容器以非0状态退出或者被系统终止)
Unknown 未知:因为某些原因无法取得Pod状态,通常是因为与Pod所在主机通讯失败
创建Pod的流程
Kubernetes基于list-watch机制的控制器架构,实现组件间交互的解耦。其他组件监控自己负责的资源,当这些资源发生变化时,kube-apiserver会通知这些组件,这个过程类似发布与订阅。
执行命令创建Pod(或是由ControllerManager发送,比如Deployment控制器创建的),命令行将会通过api发送到APIServer,并将创建pod的配置信息提交给ETCD键值对存储系统
Scheduler检测到未绑定节点的Pod,将根据自身算法选择一个合适的节点,并给这个pod打一个标记,比如:nodeName=node1;然后响应给apiserver,并写入到etcd中
kubelet通过apiserver发现有分配到自己节点的新pod,于是调用CRI创建容器,随后将容器状态上报给apiserver,然后写入etcd
kubectl get 请求apiserver 获取当前命名空间pod列表状态,apiserver从etcd直接读取
Pod的生命周期
创建pod成功后:
初始化基础容器(pause容器)以完成Pod内网络存储的共享
在pause容器就绪后,依次启动一个或者多个init容器,多个init容器链式执行
如果前一个执行完毕,且没有错误,就会执行下一个init容器
如果init容器启动失败,且Pod的重启策略为Always,那么Pod将会不断地重启
作用:在容器创建前,使用初始化容器完成一些工具以及数据的初始化,以防止MainC不安全或者冗余
作用:Init容器使用LinuxNamespace,所以相对于应用程序具有不同的文件视图,他们具有访问Secret的权限,MainC不具备
并发启动所有MainC(业务容器)
对每个主容器进行readiness/liveness 就绪/存活检测
就绪检测之后,容器会变为Ready就绪状态,就绪之后才会开放端口,暴露服务
存活检测跟随整个容器生命,会不间断的对容器的生存情况进行检测,如果不存活,则会根据Pod的重启策略,对Pod进行重启
验证:Pod内多容器网络、资源共享机制
kubectl create deployment test-pod --dry-run=client --image=nginx -o yaml > test-pod.yaml
vi test-pod.yaml
修改yaml资源清单如下:
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: test-pod
name: test-pod
spec:
replicas: 1
selector:
matchLabels:
app: test-pod
strategy: {}
template:
metadata:
labels:
app: test-pod
spec:
containers:
- image: nginx
name: nginx
resources: {}
volumeMounts: # 挂载数据卷到容器
- name: html # 挂载html数据卷到容器的/usr/share/nginx/html目录下
mountPath: /usr/share/nginx/html
# 增加一个busybox容器
- image: busybox
name: bs
# 容器启动后执行sleep命令保证其不会退出
command: ["/bin/sh", "-c", "sleep 12h"]
volumeMounts:
- name: html # 挂载html数据卷到容器的/data目录下
mountPath: /data
# 定义一个数据卷,用于存储nginx的html资源
volumes:
- name: html # 数据卷名称为html
emptyDir: {} # 数据卷类型为空目录
执行:
$ kubectl apply -f test-pod.yaml
# 查看创建的资源
$ kubectl get deploy,pod -o wide | grep test-pod
deployment.apps/test-pod 1/1 1 1 4m39s nginx,bs nginx,busybox app=test-pod
pod/test-pod-5dcdc754cd-2m8kw 2/2 Running 0 4m39s 10.244.36.95 k8s-node1 <none> <none>
验证网络以及资源共享:
# 进入busybox容器,在共享目录下添加文件index.html
$ kubectl exec -it test-pod-5dcdc754cd-2m8kw -c bs -- sh
/ # cd /data/
/ # echo "<h1>HelloPod</h1>" > /data/index.html
# 在busybox内访问localhost:80端口
/ # wget localhost:80
Connecting to localhost:80 (127.0.0.1:80)
saving to 'index.html'
index.html 100% |***********************************| 18 0:00:00 ETA
'index.html' saved
/ # cat index.html
<h1>HelloPod</h1>
验证:每个Pod都会初始化一个pause容器
[root@k8s-node1 ~]# docker ps | grep nginx
9fd7b718d6f7 nginx "/docker-entrypoint.…" 3 minutes ago Up 3 minutes k8s_nginx_nginx_default_9cd01c38-71ee-4743-b9ab-1dde7ab05bc3_0
6ca9803b29bc registry.aliyuncs.com/google_containers/pause:3.4.1 "/pause" 3 minutes ago Up 3 minutes k8s_POD_nginx_default_9cd01c38-71ee-4743-b9ab-1dde7ab05bc3_0
名称为nginx的容器共有两个,其中一个为pause容器。
环境变量
创建Pod时,可以为其下的容器设置环境变量。
应用场景:
容器内应用程序通过用户自定义变量改变应用程序默认行为
环境变量定义方式:直接自定义变量值(value)、从Pod属性中获取(valueFrom.fieldRef),也可以引用ConfigMap、Secret中的数据。
测试:
apiVersion: v1
kind: Pod
metadata:
labels:
run: test-pod
name: test-pod
spec:
containers:
- image: busybox
name: test-pod
command: ["sh", "-c", "sleep 12h"]
resources: {}
env:
# 变量值从pod属性中获取
- name: MY_NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: MY_POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: MY_POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: MY_POD_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
- name: ABC
value: "123456"
dnsPolicy: ClusterFirst
restartPolicy: Always
创建上述pod:
kubectl apply -f testpod.yaml
查看环境变量配置是否生效:
kubectl describe pod test-pod
Name: test-pod
Namespace: default
Priority: 0
Node: k8s-node1/192.168.41.11
Start Time: Fri, 14 Jan 2022 15:37:28 +0800
Labels: run=test-pod
Annotations: cni.projectcalico.org/containerID: 6ec8706992387019b175f4aa38489789379dd5470d2a9e4f9e752cc0c804d32b
cni.projectcalico.org/podIP: 10.244.36.103/32
cni.projectcalico.org/podIPs: 10.244.36.103/32
Status: Running
IP: 10.244.36.103
IPs:
IP: 10.244.36.103
Containers:
test-pod:
Container ID: docker://30fb7d4bebe0a4fe50cfd752e6f79599f79b59b36240ebced54effeeab634a6e
Image: busybox
Image ID: docker-pullable://busybox@sha256:5acba83a746c7608ed544dc1533b87c737a0b0fb730301639a0179f9344b1678
Port: <none>
Host Port: <none>
Command:
sh
-c
sleep 12h
State: Running
Started: Fri, 14 Jan 2022 15:37:35 +0800
Ready: True
Restart Count: 0
####################### 环境变量 ######################
Environment:
MY_NODE_NAME: (v1:spec.nodeName)
MY_POD_NAME: test-pod (v1:metadata.name)
MY_POD_NAMESPACE: default (v1:metadata.namespace)
MY_POD_IP: (v1:status.podIP)
ABC: 123456
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-nl9t2 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-nl9t2:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2d17h default-scheduler Successfully assigned default/test-pod to k8s-node1
Normal Pulling 2d17h kubelet Pulling image "busybox"
Normal Pulled 2d17h kubelet Successfully pulled image "busybox" in 3.323362543s
Normal Created 2d17h kubelet Created container test-pod
Normal Started 2d17h kubelet Started container test-pod
进入容器输出环境变量:
kubectl exec -it test-pod -- sh
/ # echo $MY_NODE_NAME,$MY_POD_NAME,$MY_POD_NAMESPACE,$MY_POD_IP,$ABC
k8s-node1,test-pod,default,10.244.36.103,123456
init容器/初始化容器
初始化容器用于初始化工作,执行完毕就结束,可以理解为一次性任务
应用场景:
环境检查:确保应用容器依赖的服务启动后再启动应用容器
示例:下载并初始化配置
# Pod 初始化容器
apiVersion: v1
kind: Pod
metadata:
name: tomcat-initc
labels:
app: tomcat-initc
spec:
# 定义卷
volumes:
- name: tomcat-initc-volume
emptyDir: {}
initContainers:
# 为tomcat初始化数据的初始化容器
- name: init-html
image: busybox:latest
imagePullPolicy: IfNotPresent
# 命令可以去下载war包
command: ['sh', '-c','mkdir -p /usr/local/tomcat/webapps && echo ''<h1>你好</h1>'' >> /usr/local/tomcat/webapps/index.html']
volumeMounts: # 将tomcat-initc-volume 挂载到init容器下,并初始化此卷的内容,准备配置以及数据
- mountPath: /usr/local/tomcat/webapps/
name: tomcat-initc-volume
containers:
- name: tomcat-initc
image: tomcat
imagePullPolicy: IfNotPresent
volumeMounts: # 将tomcat-initc-volume 挂载到webapp目录下
- mountPath: /usr/local/tomcat/webapps/
name: tomcat-initc-volume
restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
name: tomcat-initc-svc
spec:
selector:
app: tomcat-initc
ports:
- port: 8080
type: NodePort
重启策略 restartPolicy
共有三种重启策略,分别用在不同的场景:
Always :当容器终止退出后,总是重新启动容器,是Pod的默认策略。适合需要持续运行提供服务的程序,比如nginx、redis、javaapp
OnFailure :当容器异常退出(退出状态码非0)时,才重启容器。适合需要周期性运行的程序,比如数据库备份、巡检
Never :当容器退出后,从不重启容器。适合一次性运行的程序,比如计算程序、数据离线处理程序
设置Pod的重启策略:
apiVersion: apps/v1
kind: Deployment
metadata:
name: web
labels:
app: web
spec:
replicas: 1
template:
metadata:
name: web
labels:
app: web
spec:
containers:
- name: web
image: nginx
imagePullPolicy: IfNotPresent
restartPolicy: Always # 指定重启策略
selector:
matchLabels:
app: web
健康检查
健康检查分为三个阶段,位于Init容器运行成功之后:
startupProbe :启动检查,检查成功才由存活检查接手,用于保护慢启动容器(某些容器启动过程时间长,通过启动检查可以排除环境问题,防止长时间启动最后因环境失败的情况)
livenessProbe :存活检查,检查失败将杀死容器,并根据Pod的restartPolicy来操作
readinessProbe :就绪检查,检查服务是否正常运行,比如项目是否启动成功。如果检查失败,Kubernetes会把Pod从service endpoints中剔除
其中每种检查都支持以下三种检查方法:
httpGet :发送HTTP请求,返回200-400范围状态码为成功
tcpSocket :发起TCP Socket连接,建立成功即为通过
exec :在容器内执行指定命令,命令退出状态码为0即为通过
示例:健康检查-端口探测(http)
部署一个deployment,它的资源清单nginx.yaml如下:
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: nginx
name: nginx
spec:
replicas: 1
selector:
matchLabels:
app: nginx
strategy: {}
template:
metadata:
creationTimestamp: null
labels:
app: nginx
spec:
containers:
- image: nginx
name: nginx
ports:
- containerPort: 80
resources: {}
# 容器存活检查
livenessProbe:
httpGet:
port: 80
path: /
# 启动容器多少秒开始进行存活检查
initialDelaySeconds: 20
# 以后每间隔多少秒检查一次
periodSeconds: 10
# 容器就绪检查
readinessProbe:
httpGet:
port: 80
path: /
initialDelaySeconds: 30
periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
name: nginx
labels:
app: nginx
spec:
type: NodePort
ports:
- port: 80
name: web
targetPort: 80
selector: # 关联pod
app: nginx
查看部署情况:
kubectl get deploy,pod,svc
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx 0/1 1 0 8s
NAME READY STATUS RESTARTS AGE
pod/nginx-5fbd849686-bgvmq 0/1 ContainerCreating 0 8s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 20h
service/nginx NodePort 10.100.240.34 <none> 80:30063/TCP 8s
查看pod内容器的详细信息,确认livenessProbe和readinessProbe配置是否成功
kubectl describe pod nginx-5fbd849686-bgvmq
Name: nginx-5fbd849686-bgvmq
Namespace: default
Priority: 0
Node: k8s-node1/192.168.41.11
Start Time: Fri, 14 Jan 2022 10:30:15 +0800
Labels: app=nginx
pod-template-hash=5fbd849686
Annotations: cni.projectcalico.org/containerID: 6a97b48184cd15527d99cc965f3e7f3445619a084df0e0856005646f5d046ef7
cni.projectcalico.org/podIP: 10.244.36.102/32
cni.projectcalico.org/podIPs: 10.244.36.102/32
Status: Running
IP: 10.244.36.102
IPs:
IP: 10.244.36.102
Controlled By: ReplicaSet/nginx-5fbd849686
Containers:
nginx:
Container ID: docker://1b801e6715c6cb6b3714a383d10071fd84292bb6980dc9278a3eb77819ffb8b4
Image: nginx
Image ID: docker-pullable://nginx@sha256:0d17b565c37bcbd895e9d92315a05c1c3c9a29f762b011a10c54a66cd53c9b31
Port: 80/TCP
Host Port: 0/TCP
State: Running
Started: Fri, 14 Jan 2022 10:30:25 +0800
Ready: False
Restart Count: 0
######### 在这里
Liveness: http-get http://:80/ delay=20s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:80/ delay=30s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-cspnd (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-api-access-cspnd:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2d12h default-scheduler Successfully assigned default/nginx-5fbd849686-bgvmq to k8s-node1
Warning FailedMount 2d12h kubelet MountVolume.SetUp failed for volume "kube-api-access-cspnd" : failed to sync configmap cache: timed out waiting for the condition
Normal Pulling 2d12h kubelet Pulling image "nginx"
Normal Pulled 2d12h kubelet Successfully pulled image "nginx" in 5.460878686s
Normal Created 2d12h kubelet Created container nginx
Normal Started 2d12h kubelet Started container nginx
验证存活检查与就绪检查是否每隔一段时间发送一次请求
# 查看nginx日志
kubectl logs nginx-5fbd849686-bgvmq
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: info: Getting the checksum of /etc/nginx/conf.d/default.conf
10-listen-on-ipv6-by-default.sh: info: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf
/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh
/docker-entrypoint.sh: Configuration complete; ready for start up
2022/01/14 02:30:25 [notice] 1#1: using the "epoll" event method
2022/01/14 02:30:25 [notice] 1#1: nginx/1.21.5
2022/01/14 02:30:25 [notice] 1#1: built by gcc 10.2.1 20210110 (Debian 10.2.1-6)
2022/01/14 02:30:25 [notice] 1#1: OS: Linux 3.10.0-957.el7.x86_64
2022/01/14 02:30:25 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2022/01/14 02:30:25 [notice] 1#1: start worker processes
2022/01/14 02:30:25 [notice] 1#1: start worker process 31 # 启动时间
192.168.41.11 - - [14/Jan/2022:02:30:45 +0000] "GET / HTTP/1.1" 200 615 "-" "kube-probe/1.21" "-" # 20s后触发第一次存活检测
192.168.41.11 - - [14/Jan/2022:02:30:55 +0000] "GET / HTTP/1.1" 200 615 "-" "kube-probe/1.21" "-" # 30s后触发第一次就绪检测
192.168.41.11 - - [14/Jan/2022:02:30:55 +0000] "GET / HTTP/1.1" 200 615 "-" "kube-probe/1.21" "-" # 第二次存活检测
# 下面每十秒发生一次存活检测、一次就绪检测
192.168.41.11 - - [14/Jan/2022:02:31:05 +0000] "GET / HTTP/1.1" 200 615 "-" "kube-probe/1.21" "-"
192.168.41.11 - - [14/Jan/2022:02:31:05 +0000] "GET / HTTP/1.1" 200 615 "-" "kube-probe/1.21" "-"
192.168.41.11 - - [14/Jan/2022:02:31:15 +0000] "GET / HTTP/1.1" 200 615 "-" "kube-probe/1.21" "-"
192.168.41.11 - - [14/Jan/2022:02:31:15 +0000] "GET / HTTP/1.1" 200 615 "-" "kube-probe/1.21" "-"
192.168.41.11 - - [14/Jan/2022:02:31:25 +0000] "GET / HTTP/1.1" 200 615 "-" "kube-probe/1.21" "-"
192.168.41.11 - - [14/Jan/2022:02:31:25 +0000] "GET / HTTP/1.1" 200 615 "-" "kube-probe/1.21" "-"
192.168.41.11 - - [14/Jan/2022:02:31:35 +0000] "GET / HTTP/1.1" 200 615 "-" "kube-probe/1.21" "-"
192.168.41.11 - - [14/Jan/2022:02:31:35 +0000] "GET / HTTP/1.1" 200 615 "-" "kube-probe/1.21" "-"
192.168.41.11 - - [14/Jan/2022:02:31:45 +0000] "GET / HTTP/1.1" 200 615 "-" "kube-probe/1.21" "-"
192.168.41.11 - - [14/Jan/2022:02:31:45 +0000] "GET / HTTP/1.1" 200 615 "-" "kube-probe/1.21" "-"
192.168.41.11 - - [14/Jan/2022:02:31:55 +0000] "GET / HTTP/1.1" 200 615 "-" "kube-probe/1.21" "-"
192.168.41.11 - - [14/Jan/2022:02:31:55 +0000] "GET / HTTP/1.1" 200 615 "-" "kube-probe/1.21" "-"
验证存活检查失败,是否将杀死容器,并根据Pod的restartPolicy来操作
# 进入容器删除index.html 让httpGet访问结果为404 模拟存活检查失败:
kubectl exec -it nginx-5fbd849686-bgvmq -- /bin/sh
$ rm -rf /usr/share/nginx/html/index.html
$ exit
再次查看pod信息:
kubectl describe pod nginx-5fbd849686-bgvmq
Name: nginx-5fbd849686-bgvmq
Namespace: default
Priority: 0
Node: k8s-node1/192.168.41.11
Start Time: Fri, 14 Jan 2022 10:30:15 +0800
Labels: app=nginx
pod-template-hash=5fbd849686
Annotations: cni.projectcalico.org/containerID: 6a97b48184cd15527d99cc965f3e7f3445619a084df0e0856005646f5d046ef7
cni.projectcalico.org/podIP: 10.244.36.102/32
cni.projectcalico.org/podIPs: 10.244.36.102/32
Status: Running
IP: 10.244.36.102
IPs:
IP: 10.244.36.102
Controlled By: ReplicaSet/nginx-5fbd849686
Containers:
nginx:
Container ID: docker://5d3c4ca052837601b5ff1bb9f43a6dc1df04f6b4a4c05726edd7a0a928a55577
Image: nginx
Image ID: docker-pullable://nginx@sha256:0d17b565c37bcbd895e9d92315a05c1c3c9a29f762b011a10c54a66cd53c9b31
Port: 80/TCP
Host Port: 0/TCP
State: Running
Started: Fri, 14 Jan 2022 10:41:03 +0800
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Fri, 14 Jan 2022 10:30:25 +0800
Finished: Fri, 14 Jan 2022 10:40:55 +0800
Ready: False
Restart Count: 1
Liveness: http-get http://:80/ delay=20s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:80/ delay=30s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-cspnd (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-api-access-cspnd:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2d12h default-scheduler Successfully assigned default/nginx-5fbd849686-bgvmq to k8s-node1
Warning FailedMount 2d12h kubelet MountVolume.SetUp failed for volume "kube-api-access-cspnd" : failed to sync configmap cache: timed out waiting for the condition
Normal Pulled 2d12h kubelet Successfully pulled image "nginx" in 5.460878686s
############### 就绪检测失败、存活检测失败
Warning Unhealthy 2d12h (x3 over 2d12h) kubelet Readiness probe failed: HTTP probe failed with statuscode: 403
Warning Unhealthy 2d12h (x3 over 2d12h) kubelet Liveness probe failed: HTTP probe failed with statuscode: 403
############### 容器即将重启
Normal Killing 2d12h kubelet Container nginx failed liveness probe, will be restarted
Normal Pulling 2d12h (x2 over 2d12h) kubelet Pulling image "nginx"
Normal Created 2d12h (x2 over 2d12h) kubelet Created container nginx
Normal Started 2d12h (x2 over 2d12h) kubelet Started container nginx
Normal Pulled 2d12h kubelet Successfully pulled image "nginx" in 6.968038775s
重启次数:
kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-5fbd849686-bgvmq 1/1 Running 1 12m # 变为1
验证Kubernetes是否会把Pod从service endpoints中剔除
# 进入容器删除index.html 让httpGet访问结果为404 模拟存活检查失败:
kubectl exec -it nginx-5fbd849686-bgvmq -- /bin/sh
$ rm -rf /usr/share/nginx/html/index.html
$ exit
在执行上述命令之前,使用如下命令持续监测endpoint的状态,可以看到如下的结果:
kubectl get ep -w
NAME ENDPOINTS AGE
kubernetes 192.168.41.10:6443 20h
nginx 10.244.36.102:80 16m
nginx 16m # 就绪检测失败,移除svc关联的endpoint
nginx 10.244.36.102:80 17m # nginx服务恢复,重新添加
静态Pod
特点:由所在节点的kubelet直接管理,不经过apiserver和调度器;对应的yaml文件删除后,kubelet也会随之删除该Pod。
应用场景:kubeadm部署集群时,master上的etcd、kube-apiserver、kube-controller-manager、kube-scheduler等组件就是以静态Pod方式运行的。
在kubelet配置文件中启用静态Pod参数:
vi /var/lib/kubelet/config.yaml
...
staticPodPath: /etc/kubernetes/manifests
...
[root@k8s-master ~]# cd /etc/kubernetes/manifests
[root@k8s-master manifests]# ll
总用量 16
-rw------- 1 root root 2220 1月 13 14:06 etcd.yaml
-rw------- 1 root root 3330 1月 13 14:06 kube-apiserver.yaml
-rw------- 1 root root 2828 1月 13 19:27 kube-controller-manager.yaml
-rw------- 1 root root 1414 1月 13 19:26 kube-scheduler.yaml
将要部署的pod yaml放在该目录下,kubelet会自动创建对应的静态Pod;从这个目录移除yaml文件,静态Pod也会被自动移除。
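一个最小的静态Pod示意(文件名 static-web.yaml 为假设,在开启了 staticPodPath 的节点上执行):
cat <<EOF > /etc/kubernetes/manifests/static-web.yaml
apiVersion: v1
kind: Pod
metadata:
  name: static-web
spec:
  containers:
  - name: web
    image: nginx
EOF
kubectl get pod # 静态Pod的名称会自动带上节点名后缀,删除上面的yaml文件即可移除该Pod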
DaemonSet
功能:在每一个Node上运行一个Pod,新加入集群的Node也会自动运行一个Pod。
应用场景:网络插件的Agent、日志收集Agent(如上文提到的DaemonSet方式日志采集)、监控Agent等。
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: nginx-daemonset
labels:
app: daemonset-pod
spec:
selector:
matchLabels:
app: daemonset-pod
template:
metadata:
name: daemonset-pod
labels:
app: daemonset-pod
spec:
containers:
- name: daemonset-container
image: nginx:1.21.1
# kubectl get DaemonSet 查询所有的DaemonSet
# 或者使用 kubectl get ds
查看调度失败原因,kubectl describe pod <NAME>
Service
Service的引入主要解决Pod的动态变化(IP每次部署都不同),并提供统一的访问入口:
服务发现 :防止Pod失联,找到提供同一个服务的Pod
负载均衡 :定义一组Pod的访问策略,并可以避免将流量发送到不可达的Pod上
是集群内服务的代理节点。
Pod和Service的关系
Service通过iptables或者ipvs为一组Pod提供负载均衡的能力
定义与创建Service
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-nginx
spec:
selector:
matchLabels:
app: my-nginx
template:
metadata:
labels:
app: my-nginx
spec:
containers:
- name: my-nginx
image: nginx
ports:
- containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
name: my-nginx
labels:
app: my-nginx
spec:
ports:
- port: 80
protocol: TCP
name: http
selector:
app: my-nginx
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: my-nginx
annotations:
kubernetes.io/ingress.class: "nginx"
spec:
rules:
- host: nginx.test.com # 将域名映射到 my-nginx 服务
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: my-nginx # 将所有请求发送到 my-nginx 服务的 80 端口
port:
number: 80
创建svc:
kubectl apply -f svc.yaml
查看已经创建的svc:
kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
my-nginx ClusterIP 10.97.1.73 <none> 80/TCP 61s
type字段
常见的Service类型有三种:
ClusterIP ,默认值,分配一个IP地址,即VIP,只能在集群内部访问
spec:
ports:
# service以80端口暴露服务
- port: 80
name: web
targetPort: 80 # 将pod的80端口服务提供给service
selector: # 关联pod
app: nginx
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
nginx ClusterIP 10.97.1.73 <none> 80/TCP 61s
类型是ClusterIP, 集群IP是 10.97.1.73
NodePort ,在每个节点上启用一个端口来暴露服务,可以让其通过任意node+端口来进行外部访问;同时也像ClusterIP一样分配一个集群内部IP供集群内部访问
spec:
type: NodePort
ports:
# service以80端口暴露服务
- port: 80
name: web
targetPort: 80 # 将pod的80端口服务提供给service
nodePort: 30001 # 端口范围在 30000 - 32767 之间,如果不写,默认会随机分配一个端口
selector: # 关联pod
app: nginx
kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
nginx NodePort 10.97.1.73 <none> 80:30001/TCP 21m
LoadBalancer ,包含NodePort的全部功能。除此以外,k8s会请求底层云平台(比如阿里云、腾讯云、AWS等)创建负载均衡器,将每个Node([NodeIp]:[NodePort]) 作为后端添加进去
service负载均衡实现机制
Service 底层实现主要有iptables和 ipvs两种网络模式,决定你如何转发流量。
service DNS名称解析
CoreDNS是一个DNS服务器,k8s默认采用pod的方式部署在集群中。CoreDNS服务监视KubernetesAPI,为每一个Service创建DNS A记录用于域名解析。其格式为 <service-name>.<namespace-name>.svc.cluster.local
CoreDNS Yaml文件可以参考: https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/coredns
验证:当创建service时会自动添加一个DNS记录:
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: web
name: web
spec:
replicas: 1
selector:
matchLabels:
app: web
strategy: {}
template:
metadata:
labels:
app: web
spec:
containers:
- image: nginx
name: nginx
resources: {}
- image: busybox:1.28.3
name: busybox
resources: {}
command:
- "sh"
- "-c"
- "sleep 12h"
---
apiVersion: v1
kind: Service
metadata:
name: web
spec:
selector:
app: web
ports:
- protocol: TCP
port: 80
targetPort: 80
创建Deployment以及service,进入容器,测试DNS:
$ kubectl get deployment,pod,svc -owide
NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
deployment.apps/web 1/1 1 1 20m nginx,busybox nginx,busybox:1.28.3 app=web
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/web-7dfb85867c-nf5p5 2/2 Running 0 2m5s 10.1.0.26 docker-desktop <none> <none>
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
# service/kubernetes 是Kubernetes默认创建的service,用于让集群中的pod访问apiserver
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 7d4h <none>
service/web ClusterIP 10.102.2.103 <none> 80/TCP 20m app=web
kubectl exec -it pod/web-7dfb85867c-nf5p5 -c busybox -- sh
/ # nslookup web.default.svc.cluster.local
Server: 10.96.0.10 # DNS服务地址,即kube-system命名空间下kube-dns服务的ClusterIP
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: web.default.svc.cluster.local
Address 1: 10.102.2.103 web.default.svc.cluster.local # 域名解析记录,正好是service的IP
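同一命名空间内的 Pod 还可以省略后缀,直接使用服务名解析(依赖 Pod 内的 DNS 搜索域配置),效果与上面的完整域名一致:
/ # nslookup web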
Ingress
既然有了NodePort,为什么还需要Ingress?
NodePort 是基于iptables/ipvs实现的负载均衡器,它是四层转发
四层转发是指在传输层基于IP和Port的转发方式,这种方式的转发不能满足类似域名分流、重定向之类的需求
所以引入的Ingress七层转发(应用层),他可以针对HTTP等应用层协议的内容转发,可以满足的场景更多
Ingress :k8s中的一个抽象资源,用于给管理员提供一个暴露应用的入口定义方法
Ingress Controller :负责流量路由,根据Ingress生成具体的路由规则,并对Pod进行负载均衡
外部用户通过Ingress Controller访问服务,由Ingress规则决定访问哪个Service
IngressController内包含一个Service,也可以通过NodePort暴露端口,让用户访问
然后将流量直接转发到对应的Pod上(注意 :只通过Service找到对应的Pod,实际发送并不经过Service,这样更高效)
IngressController是社区提供的一种接口,其下面有很多具体的实现,比如 Nginx、Kong等
部署ingress-nginx
从github中下载yaml配置
curl -o deploy.yaml https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.1.1/deploy/static/provider/cloud/deploy.yaml
ingress-nginx相关镜像位于google镜像仓库中,国内网络无法访问;可以从docker hub上寻找相关镜像,修改yaml中的相关镜像地址
修改用于暴露Ingress-Nginx-Controller的Service的端口暴露方式(ingress controller是pod,负责动态生成nginx配置):
apiVersion: v1
kind: Service
metadata:
annotations:
labels:
helm.sh/chart: ingress-nginx-4.0.10
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/version: 1.1.0
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/component: controller
name: ingress-nginx-controller
namespace: ingress-nginx
spec:
type: NodePort
ports:
- name: http
port: 80
nodePort: 30080
protocol: TCP
targetPort: http
appProtocol: http
- name: https
port: 443
protocol: TCP
nodePort: 30443
targetPort: https
appProtocol: https
selector:
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/component: controller
执行部署
kubectl apply -f deploy.yaml
kubernetes里命名空间删不掉的问题
如果某个命名空间(此例里是ingress-nginx)迟迟删除不掉,状态一直是Terminating,然后在此命名空间里重新创建资源时报如下错误:
Error from server (Forbidden): error when creating "nginx-controller.yaml": roles.rbac.authorization.k8s.io "ingress-nginx-admission" is forbidden: unable to create new content in namespace ingress-nginx because it is being terminated
解决方案:
在第一个终端里执行 kubectl proxy,将apiserver代理到本地的8001端口
在第二个终端里执行:kubectl get namespace ingress-nginx -o json > xx.json,然后编辑xx.json,将spec.finalizers置为空数组[]
最后执行:
curl -k -H "Content-Type: application/json" -X PUT --data-binary @xx.json \
http://127.0.0.1:8001/api/v1/namespaces/ingress-nginx/finalize
创建ingress规则(HTTP)
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-nginx
spec:
selector:
matchLabels:
app: my-nginx
template:
metadata:
labels:
app: my-nginx
spec:
containers:
- name: my-nginx
image: nginx
ports:
- containerPort: 80 # 注意要指定端口否则ingress无法正常通过pod提供服务
---
apiVersion: v1
kind: Service
metadata:
name: my-nginx
labels:
app: my-nginx
spec:
ports:
- port: 80
protocol: TCP
name: http
selector:
app: my-nginx
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: my-nginx
annotations:
kubernetes.io/ingress.class: "nginx"
spec:
rules:
- host: nginx.test.com # 将域名映射到 my-nginx 服务
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: my-nginx # 将所有请求发送到 my-nginx 服务的 80 端口
port:
number: 80
查看ingress规则:
$ kubectl get ingress
NAME CLASS HOSTS ADDRESS PORTS AGE
ingress-host-bar nginx hello.chenby.cn,demo.chenby.cn 80 6s
配置hosts,访问这些地址即可。
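在配置 hosts 之前,也可以先用 curl 携带 Host 头通过 NodePort 验证转发是否生效(示意,假设 ingress-nginx-controller 的 NodePort 为前面修改的 30080,域名为上面规则中的 nginx.test.com):
curl -H "Host: nginx.test.com" http://192.168.41.11:30080/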
创建ingress规则(HTTPS)
准备域名证书文件(使用阿里云免费证书,或者使用openssl/cfssl创建自签证书)
将证书文件保存到k8s Secret中
kubectl create secret tls ingress-yangsx95-com --cert=7182207_ingress.yangsx95.com.pem --key=7182207_ingress.yangsx95.com.key
kubectl get secret
NAME TYPE DATA AGE
default-token-2wnpx kubernetes.io/service-account-token 3 75m
ingress-yangsx95-com kubernetes.io/tls 2 28s
使用Ingress规则配置tls
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: ingress-yangsx95-com
spec:
ingressClassName: nginx # 指定ingress的实现是ingress-nginx
tls:
- hosts:
- ingress.yangsx95.com
secretName: ingress-yangsx95-com
rules:
- host: "ingress.yangsx95.com"
http:
paths:
- pathType: Prefix
path: "/"
backend:
service:
name: my-nginx
port:
number: 80
配置hosts文件,访问: https://ingress.yangsx95.com:30443/
工作原理
Ingress Controller通过与k8s API交互,动态感知集群中Ingress规则的变化,然后读取它,按照自定义的规则(规则就是写明了哪个域名对应哪个service)生成一段nginx配置,应用到它所管理的nginx服务并重新加载生效。以此来达到负载均衡配置热更新的效果。
工作流程:
域名+端口 -> Ingress Controller -> Pod
StatefulSet(部署有状态应用)
有状态和无状态
Deployment控制器的设计原则:管理的所有Pod一模一样,提供同一个服务,也不考虑在哪台Node运行,可随意扩容缩容。这种应用称为无状态应用,比如web应用程序就是无状态应用。
在实际场景中,并不能满足所有的应用,尤其是分布式应用程序,一般会部署多个实例。不同于web服务这类无状态应用,这些实例之间往往有依赖关系,例如:主从关系、主备关系,这种应用称为有状态应用,比如MySQL集群、etcd集群。
StatefulSet就是为了解决部署有状态应用而出现的控制器:
给Pod分配一个唯一稳定的网络标志符(主机名、唯一域名):使用Headless Service来维护网络的身份
# 这是一个Headless Service
apiVersion: v1
kind: Service
metadata:
name: nginx
labels:
app: nginx
spec:
ports:
- port: 80
name: web
clusterIP: None # 指定Service不分配ClusterIP,因为无状态应用是通过Service的统一IP来暴露服务的,使用Service的ClusterIP无法区分集群内的多个Pod的不同的角色,故这里指定Service的ClusterIP为None
selector:
app: nginx
---
# 这是一个StatefulSet的一部分
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: web
spec:
serviceName: "nginx" # 使用serviceName关联无头服务,主要是根据无头服务找到Service关联的那一组Pod
....
稳定唯一的持久的存储(唯一的PV和PVC):StatefulSet的存储卷使用VolumeClaimTemplate(卷申请模板)创建,StatefulSet会为每个Pod分配并创建一个对应的PVC,再由PVC绑定各自的PersistentVolume
# 创建StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: web
spec:
serviceName: "nginx" # 指定HeadlessService
replicas: 2 # 副本数 2
selector: # 与Pod进行绑定
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.21.1
ports:
- containerPort: 80
name: web
volumeMounts:
- name: www
mountPath: /usr/share/nginx/html
volumeClaimTemplates: # 指定卷申请模板
- metadata:
name: www
spec:
accessModes: [ "ReadWriteOnce" ] # 访问模式,只可以被一个容器访问
resources:
requests:
storage: 1Gi
StatefulSet三要素:Headless Service(提供稳定的网络标识)、StatefulSet控制器(有序部署的Pod)、volumeClaimTemplates(为每个Pod提供稳定独立的存储)。
部署StatefulSet
# 创建HeadlessService用于发布StatefulSet中Pod的IP和Port
apiVersion: v1
kind: Service
metadata:
name: nginx
labels:
app: nginx
spec:
ports:
- port: 80
name: web
clusterIP: None # 标志此Service为HeadlessService,
selector:
app: nginx
---
# 创建StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: web
spec:
serviceName: "nginx" # 指定HeadlessService
replicas: 2 # 副本数 2
selector: # 与Pod进行绑定
matchLabels:
app: nginx
template: # 定义Pod模板
metadata:
labels:
app: nginx
spec:
containers: # 定义容器
- name: nginx
image: nginx:1.21.1
ports:
- containerPort: 80
name: web
volumeMounts: # 挂载卷,name指定卷名称,mountPath指定要挂载的容器路径
- name: www
mountPath: /usr/share/nginx/html
volumeClaimTemplates: # 定义申领卷,动态方式
- metadata:
name: www
spec:
accessModes: [ "ReadWriteOnce" ] # 访问模式,只可以被一个容器访问
resources:
requests:
storage: 1Gi # 申领1g空间
# kubectl get pods -w -l app=nginx 查看StatefulSet的Pod的创建情况
# 参数-w表示watch实时监控 -l表示labels表示根据标签过滤资源
# 顺序创建:StatefulSet拥有多个副本时,会按照顺序创建,web-0处于Running或者Ready状态,web-1才会启动
# 稳定网络标志:使用kubectl exec 循环获取hostname for i in 0 1; do kubectl exec "web-$i" -- sh -c 'hostname'; done
# 稳定的存储:获取web0以及web1的pvc kubectl get pvc -l app=nginx
# 扩容副本为5:kubectl scale sts web --replicas=5
# 缩容副本为3:kubectl patch sts web -p '{"spec":{"replicas":3}}'
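结合 Headless Service,每个 Pod 还会获得一个稳定的 DNS 记录,格式为 <pod名称>.<service名称>.<命名空间>.svc.cluster.local,可以在集群内的容器中验证(示意):
/ # nslookup web-0.nginx
/ # nslookup web-1.nginx.default.svc.cluster.local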
调度
配置Pod的资源限制
apiVersion: v1
kind: Pod
metadata:
labels:
run: testpod
name: testpod
spec:
containers:
- image: nginx
name: testpod
resources:
# 容器最大资源限制 limit
limits:
cpu: 200m
memory: 100Mi
# 容器使用的最小资源请求 request
requests:
cpu: 200m
memory: 100Mi
dnsPolicy: ClusterFirst
restartPolicy: Always
cpu的单位为毫核(m)或者为浮点数字,比如500m = 0.5,1000m = 1
request :应用程序启动时需要的资源数量,调度器会寻找符合request要求的节点,如果没有满足要求的节点,Pod就会一直处于Pending状态
limit :应用程序运行时最多占用的资源数量,这个值对调度机制并不起决定性的作用,只要request的值满足,就会部署容器
limit 可以防止应用程序假死或者超负荷运行导致主机崩溃的情况,可以更合理地控制资源
request 的值设置得过大会造成资源浪费,被request分配的资源,不管应用程序有没有使用,其他容器都无法再分配使用它们了
如何配置这几个值的大小:
request的值根据应用程序启动并正常提供服务时,大约占用的资源量决定
limit的值不建议超过宿主机实际物理配置的20%,剩余空间用来保证物理机的正常运行
limit的值可以根据request配置:limit不能小于request,request一般比limit小20%~30%
查看pod的资源限制:
kubectl describe pod testpod
Name: testpod
Namespace: default
Priority: 0
Node: docker-desktop/192.168.65.4
Start Time: Tue, 18 Jan 2022 15:11:23 +0800
Labels: run=testpod
Annotations: <none>
Status: Running
IP: 10.1.0.10
IPs:
IP: 10.1.0.10
Containers:
testpod:
Container ID: docker://421614a5c6a4d1de9c472dfccdee9480f28de61e1b2e343162df92dc3097cb87
Image: nginx
Image ID: docker-pullable://nginx@sha256:0d17b565c37bcbd895e9d92315a05c1c3c9a29f762b011a10c54a66cd53c9b31
Port: <none>
Host Port: <none>
State: Running
Started: Tue, 18 Jan 2022 15:12:04 +0800
Ready: True
Restart Count: 0
######################## 容器限制
Limits:
cpu: 200m
memory: 100Mi
Requests:
cpu: 200m
memory: 100Mi
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-t2fl5 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-t2fl5:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Guaranteed
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m43s default-scheduler Successfully assigned default/testpod to docker-desktop
Normal Pulling 2m42s kubelet Pulling image "nginx"
Normal Pulled 2m3s kubelet Successfully pulled image "nginx" in 38.832099142s
Normal Created 2m2s kubelet Created container testpod
Normal Started 2m2s kubelet Started container testpod
查看Node信息,看Node上运行的容器的资源限制情况与Node本身的资源情况 :
kubectl describe node docker-desktop
Name: docker-desktop
Roles: control-plane,master
Labels: beta.kubernetes.io/arch=arm64
beta.kubernetes.io/os=linux
kubernetes.io/arch=arm64
kubernetes.io/hostname=docker-desktop
kubernetes.io/os=linux
node-role.kubernetes.io/control-plane=
node-role.kubernetes.io/master=
node.kubernetes.io/exclude-from-external-load-balancers=
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Fri, 14 Jan 2022 09:47:41 +0800
Taints: <none>
Unschedulable: false
Lease:
HolderIdentity: docker-desktop
AcquireTime: <unset>
RenewTime: Tue, 18 Jan 2022 15:18:07 +0800
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
MemoryPressure False Tue, 18 Jan 2022 15:17:46 +0800 Fri, 14 Jan 2022 09:47:41 +0800 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Tue, 18 Jan 2022 15:17:46 +0800 Fri, 14 Jan 2022 09:47:41 +0800 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Tue, 18 Jan 2022 15:17:46 +0800 Fri, 14 Jan 2022 09:47:41 +0800 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Tue, 18 Jan 2022 15:17:46 +0800 Fri, 14 Jan 2022 09:48:12 +0800 KubeletReady kubelet is posting ready status
Addresses:
InternalIP: 192.168.65.4
Hostname: docker-desktop
Capacity:
cpu: 4
ephemeral-storage: 61255492Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
hugepages-32Mi: 0
hugepages-64Ki: 0
memory: 8126780Ki
pods: 110
###### 该节点总共可分配的资源信息
Allocatable:
cpu: 4
ephemeral-storage: 56453061334
hugepages-1Gi: 0
hugepages-2Mi: 0
hugepages-32Mi: 0
hugepages-64Ki: 0
memory: 8024380Ki
pods: 110
System Info:
Machine ID: 72066838-b7d0-4811-9f4d-a82203068bec
System UUID: 72066838-b7d0-4811-9f4d-a82203068bec
Boot ID: 71c72b3c-3da4-41bc-ae54-8db53c078f15
Kernel Version: 5.10.76-linuxkit
OS Image: Docker Desktop
Operating System: linux
Architecture: arm64
Container Runtime Version: docker://20.10.11
Kubelet Version: v1.22.4
Kube-Proxy Version: v1.22.4
Non-terminated Pods: (10 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
###################################### cpu和内存资源限制的配置
default testpod 200m (5%) 200m (5%) 100Mi (1%) 100Mi (1%) 6m46s
kube-system coredns-78fcd69978-2tb8q 100m (2%) 0 (0%) 70Mi (0%) 170Mi (2%) 4d5h
kube-system coredns-78fcd69978-8bqxk 100m (2%) 0 (0%) 70Mi (0%) 170Mi (2%) 4d5h
kube-system etcd-docker-desktop 100m (2%) 0 (0%) 100Mi (1%) 0 (0%) 4d5h
kube-system kube-apiserver-docker-desktop 250m (6%) 0 (0%) 0 (0%) 0 (0%) 4d5h
kube-system kube-controller-manager-docker-desktop 200m (5%) 0 (0%) 0 (0%) 0 (0%) 4d5h
kube-system kube-proxy-vblmk 0 (0%) 0 (0%) 0 (0%) 0 (0%) 4d5h
kube-system kube-scheduler-docker-desktop 100m (2%) 0 (0%) 0 (0%) 0 (0%) 4d5h
kube-system storage-provisioner 0 (0%) 0 (0%) 0 (0%) 0 (0%) 4d5h
kube-system vpnkit-controller 0 (0%) 0 (0%) 0 (0%) 0 (0%) 4d5h
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 1050m (26%) 200m (5%)
memory 340Mi (4%) 440Mi (5%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
hugepages-32Mi 0 (0%) 0 (0%)
hugepages-64Ki 0 (0%) 0 (0%)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Starting 15m kube-proxy
Normal Starting 16m kubelet Starting kubelet.
Normal NodeHasSufficientMemory 16m (x8 over 16m) kubelet Node docker-desktop status is now: NodeHasSufficientMemory
Normal NodeHasNoDiskPressure 16m (x8 over 16m) kubelet Node docker-desktop status is now: NodeHasNoDiskPressure
Normal NodeHasSufficientPID 16m (x7 over 16m) kubelet Node docker-desktop status is now: NodeHasSufficientPID
Normal NodeAllocatableEnforced 16m kubelet Updated Node Allocatable limit across pods
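如果集群中已部署 metrics-server(这是一个假设的前提,默认不一定安装),还可以用 kubectl top 查看节点和Pod的实时资源使用情况:
复制 kubectl top node
kubectl top pod testpod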
将Pod分配给指定节点
nodeName
指定节点名称,用于将Pod调度到指定的Node上,不经过调度器。所有污点、节点亲和都将会失效。
复制 apiVersion: v1
kind: Pod
metadata:
name: nginx
spec:
containers:
- name: nginx
image: nginx
nodeName: kube-01
nodeSelector
用于将Pod调度到匹配Label的Node上,如果没有匹配的标签,调度会失败
作用:约束Pod只能被调度到带有指定标签的节点上
应用场景:确保Pod被分配到具有特定硬件或属性(例如SSD硬盘)的节点上
示例,确保pod被分配到具有ssd硬盘的节点上:
给含有ssd的node,设置一个标签:
复制 kubectl label node k8s-node1 disktype=ssd
查看node的标签信息
复制 kubectl get node k8s-node1 --show-labels
NAME STATUS ROLES AGE VERSION LABELS
k8s-node1 Ready <none> 5d3h v1.21.0 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,disktype=ssd,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-node1,kubernetes.io/os=linux
创建含有nodeSelector的Pod
复制 apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
run: pod1
name: pod1
spec:
containers:
- image: nginx
name: pod1
resources: {}
dnsPolicy: ClusterFirst
restartPolicy: Always
nodeSelector:
disktype: "ssd"
验证,Pod确实被调度到了k8s-node1上
复制 kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod1 0/1 ContainerCreating 0 7s <none> k8s-node1 <none> <none>
如果不再需要该标签,可以移除标签:
复制 kubectl label node k8s-node1 disktype-
node/k8s-node1 labeled
kubectl get node k8s-node1 --show-labels
NAME STATUS ROLES AGE VERSION LABELS
k8s-node1 Ready <none> 5d3h v1.21.0 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-node1,kubernetes.io/os=linux
nodeAffinity
节点亲和类似于nodeSelector,可以根据节点上的标签来约束Pod可以调度在哪些节点上。相比于nodeSelector:
匹配有更多的逻辑组合,不只是字符串的完全相等,支持的操作符有:In、NotIn、Exists、DoesNotExist、Gt、Lt
调度分为软策略与硬策略:
硬(required):必须满足,如果不满足则调度失败
软(preferred):尽量满足,如果不满足也继续调度,满足则调度到目标
参考官方文档:将 Pod 分配给节点 | Kubernetes
示例:
复制 apiVersion: v1
kind: Pod
metadata:
name: with-node-affinity
spec:
affinity:
# 节点亲和
nodeAffinity:
# 硬亲和
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
# node必须包含key kubernetes.io/e2e-az-name,且值在 e2e-az1,e2e-az2数组中
- key: kubernetes.io/e2e-az-name
operator: In
values:
- e2e-az1
- e2e-az2
# 软亲和
preferredDuringSchedulingIgnoredDuringExecution:
# 权重为1
- weight: 1
preference:
# node最好包含key another-node-label-key,且值为 another-node-label-value
matchExpressions:
- key: another-node-label-key
operator: In
values:
- another-node-label-value
containers:
- name: with-node-affinity
image: k8s.gcr.io/pause:2.0
其中,权重值weight的范围为 1~100,权重值越大,这条亲和规则优先级就越高,调度器就会优先选择满足该规则的节点
污点和污点容忍
Taints :污点,避免Pod调度到特定的Node
Tolerations :污点容忍,允许Pod调度到持有Taints的Node上
应用场景:
保证master节点安全,在master节点含有污点,防止pod在master节点运行
专用节点:根据业务将Node分组管理,希望在默认情况下不调度该节点,只有配置了污点容忍才允许分配
配备特殊硬件:部分Node配有SSD硬盘、GPU等特殊硬件,希望在默认情况下不调度该节点,只有配置了污点容忍才允许分配
查看master节点的污点:
复制 kubectl describe node k8s-master
Name: k8s-master
Roles: control-plane,master
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=k8s-master
kubernetes.io/os=linux
node-role.kubernetes.io/control-plane=
node-role.kubernetes.io/master=
node.kubernetes.io/exclude-from-external-load-balancers=
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
node.alpha.kubernetes.io/ttl: 0
projectcalico.org/IPv4Address: 192.168.41.10/24
projectcalico.org/IPv4IPIPTunnelAddr: 10.244.235.192
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Thu, 13 Jan 2022 13:56:51 +0800
#################### 不允许调度的污点
Taints: node-role.kubernetes.io/master:NoSchedule
Unschedulable: false
Lease:
HolderIdentity: k8s-master
AcquireTime: <unset>
RenewTime: Tue, 18 Jan 2022 18:55:44 +0800
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Tue, 18 Jan 2022 17:26:05 +0800 Tue, 18 Jan 2022 17:26:05 +0800 CalicoIsUp Calico is running on this node
MemoryPressure False Tue, 18 Jan 2022 18:51:21 +0800 Thu, 13 Jan 2022 13:56:47 +0800 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Tue, 18 Jan 2022 18:51:21 +0800 Thu, 13 Jan 2022 13:56:47 +0800 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Tue, 18 Jan 2022 18:51:21 +0800 Thu, 13 Jan 2022 13:56:47 +0800 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Tue, 18 Jan 2022 18:51:21 +0800 Thu, 13 Jan 2022 19:27:20 +0800 KubeletReady kubelet is posting ready status
Addresses:
InternalIP: 192.168.41.10
Hostname: k8s-master
Capacity:
cpu: 2
ephemeral-storage: 17394Mi
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 1863252Ki
pods: 110
Allocatable:
cpu: 2
ephemeral-storage: 16415037823
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 1760852Ki
pods: 110
System Info:
Machine ID: 752d054974304aa8a04e23779cc60c55
System UUID: 8CF74D56-3C99-7C12-13A9-B2530762D312
Boot ID: 47c1f93c-a422-4c0e-ae7b-959a42d92cbb
Kernel Version: 3.10.0-957.el7.x86_64
OS Image: CentOS Linux 7 (Core)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://20.10.12
Kubelet Version: v1.21.0
Kube-Proxy Version: v1.21.0
PodCIDR: 10.244.0.0/24
PodCIDRs: 10.244.0.0/24
Non-terminated Pods: (8 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
kube-system calico-kube-controllers-6b9fbfff44-pstmt 0 (0%) 0 (0%) 0 (0%) 0 (0%) 5d6h
kube-system calico-node-slrpz 250m (12%) 0 (0%) 0 (0%) 0 (0%) 5d6h
kube-system etcd-k8s-master 100m (5%) 0 (0%) 100Mi (5%) 0 (0%) 5d6h
kube-system kube-apiserver-k8s-master 250m (12%) 0 (0%) 0 (0%) 0 (0%) 5d6h
kube-system kube-controller-manager-k8s-master 200m (10%) 0 (0%) 0 (0%) 0 (0%) 5d
kube-system kube-proxy-65mlq 0 (0%) 0 (0%) 0 (0%) 0 (0%) 5d6h
kube-system kube-scheduler-k8s-master 100m (5%) 0 (0%) 0 (0%) 0 (0%) 5d
kubernetes-dashboard dashboard-metrics-scraper-5594697f48-zrzc5 0 (0%) 0 (0%) 0 (0%) 0 (0%) 5d4h
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 900m (45%) 0 (0%)
memory 100Mi (5%) 0 (0%)
ephemeral-storage 100Mi (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events: <none>
使用污点和污点容忍 :
给节点添加污点:
复制 kubectl taint node [node] key=value:<effect>
kubectl taint node k8s-node1 gpu=yes:NoSchedule
effect的取值:
- NoSchedule: 一定不能被调度
- PreferNoSchedule: 不配置污点容忍也有可能被调度,只是尽量保证不调度
- NoExecute: 不仅不会调度,还会驱逐Node上已有的Pod
验证是否正常添加:
复制 kubectl describe node k8s-node1 | grep Taint
Taints: gpu=yes:NoSchedule
配置污点容忍(Pod可以容忍带有gpu污点的节点):
复制 apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
run: pod1
name: pod1
spec:
containers:
- image: nginx
name: pod1
resources: {}
dnsPolicy: ClusterFirst
restartPolicy: Always
# 容忍gpu=yes:NoSchedule的污点
tolerations:
- key: "gpu"
value: "yes"
effect: NoSchedule
operator: Equal
删除污点(在后面增加一个减号):
复制 kubectl taint node k8s-node1 gpu:NoSchedule-
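补充:tolerations 中的 operator 除了 Equal,还可以取 Exists,表示只要节点存在指定 key 的污点即可容忍,不比较 value。下面是一个示意片段(假设节点带有 key 为 gpu 的污点):
复制 tolerations:
- key: "gpu"
  operator: Exists
  effect: NoSchedule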
存储
容器中的文件是在磁盘中临时存放的,这给在容器中运行比较重要的应用程序带来如下问题:
当容器升级或者崩溃时,kubelet会重建容器,容器内的文件会丢失;此外,同一个Pod中的多个容器也经常需要共享文件
所以Kubernetes需要数据卷(Volume),常用的数据卷有:emptyDir(临时数据卷)、hostPath(节点数据卷)、NFS等网络数据卷
所有支持的卷类型,可以参考:卷 | Kubernetes
emptyDir 临时数据卷
是一个临时的存储卷,与Pod的生命周期绑定在一起,如果Pod删除了卷也会被删除。主要用于Pod中的多个容器之间数据共享。
复制 apiVersion: v1
kind: Pod
metadata:
name: test-pd
spec:
containers:
- image: k8s.gcr.io/test-webserver
name: test-container
volumeMounts:
- mountPath: /cache
name: cache-volume # 挂载卷
# 定义卷
volumes:
- name: cache-volume
emptyDir: {}
emptyDir实际上是位于宿主机上的一个目录,Pod内的各个容器共享这个宿主机目录。它的位置在:/var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~empty-dir/<卷名> 中。
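下面是一个多容器通过 emptyDir 共享数据的示意(镜像与命令仅作演示):
复制 apiVersion: v1
kind: Pod
metadata:
  name: shared-data
spec:
  containers:
  # 写入方容器,向共享目录写入文件
  - name: writer
    image: busybox
    command: ["/bin/sh", "-c", "echo hello > /data/msg && sleep 12h"]
    volumeMounts:
    - mountPath: /data
      name: shared
  # 读取方容器,可以在 /data 下看到 writer 写入的文件
  - name: reader
    image: busybox
    command: ["/bin/sh", "-c", "sleep 12h"]
    volumeMounts:
    - mountPath: /data
      name: shared
  volumes:
  - name: shared
    emptyDir: {}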
hostPath 节点数据卷
挂载node的文件系统,也就是pod所在的节点上的文件或者目录到pod中的容器。主要应用在Pod中的容器需要访问宿主机的文件的情况,比如DaemonSet。
复制 apiVersion: v1
kind: Pod
metadata:
name: test-pd
spec:
containers:
- image: k8s.gcr.io/test-webserver
name: test-container
volumeMounts:
- mountPath: /test-pd
name: test-volume
volumes:
- name: test-volume
hostPath:
# 宿主上目录位置
path: /data
# 此字段为可选
type: Directory # 如果是文件则是File
注意:当因为某些情况pod被调度到其他节点上时,节点数据卷是不会被迁移过去的。
不安全,不建议使用,建议使用共享存储
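补充:hostPath 的 type 除了 Directory 和 File,还支持 DirectoryOrCreate、FileOrCreate、Socket 等取值。下面是一个示意片段(路径仅作演示):
复制 volumes:
- name: log-volume
  hostPath:
    path: /var/log/app
    # 目录不存在时会自动创建
    type: DirectoryOrCreate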
NFS 网络数据卷
使用nfs网络数据卷共享存储:
NFS服务端一般是集群外的一台主机,而NFS客户端一般是需要使用共享存储的节点。
部署NFS
centos下准备NFS环境:
复制 # 安装NFS(每个需要共享数据的节点都要安装)
yum install nfs-utils
# 选择需要作为nfs服务端的服务器,编辑nfs exports配置文件
vi /etc/exports
/ifs/kubernetes *(rw,no_root_squash)
# 共享目录为 /ifs/kubernetes
# * 代表可以连接的nfs客户端的网段,这里是任意网段
# rw 代表可读写
# no_root_squash: 登入 NFS 主机使用分享目录的使用者,如果是 root 的话,那么对于这个分享的目录来说,他就具有 root 的权限!
# root_squash:在登入 NFS 主机使用分享之目录的使用者如果是 root 时,那么这个使用者的权限将被压缩成为匿名使用者,通常他的 UID 与 GID 都会变成 nobody 那个身份
# 创建共享目录
mkdir -p /ifs/kubernetes
# 启动nfs server
systemctl start nfs
systemctl enable nfs
ubuntu下准备NFS环境:
复制 # 安装NFS(每个需要共享数据的节点都要安装)
sudo apt update
sudo apt install nfs-kernel-server
# 查看nfs版本启用状态,-2代表2版本禁用
sudo cat /proc/fs/nfsd/versions
# -2 +3 +4 +4.1 +4.2
# 选择需要作为nfs服务端的服务器,编辑nfs exports配置文件
vi /etc/exports
/ifs/kubernetes *(rw,no_root_squash)
# 创建共享目录
mkdir -p /ifs/kubernetes
# 启动nfs server
sudo /etc/init.d/nfs-kernel-server restart
测试NFS
在任意一个节点上执行:
复制 # 将nfs服务端192.168.121.10的远程目录/ifs/kubernetes挂载到本地/mnt/目录
mount -t nfs 192.168.121.10:/ifs/kubernetes /mnt/
# 创建文件查看是否同步
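# 例如(文件名仅作演示):在挂载了/mnt的节点上创建文件
touch /mnt/test-nfs.txt
# 到NFS服务端查看共享目录,应该能看到同名文件
ls /ifs/kubernetes/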
使用NFS网络数据卷
复制 apiVersion: v1
kind: Pod
metadata:
name: test-busybox
spec:
containers:
- image: busybox
name: busybox
volumeMounts:
- mountPath: /root
name: bsroot
command: ["/bin/sh", "-c", "sleep 12h"]
volumes:
- name: bsroot
nfs:
# nfs的远程地址
server: 192.168.121.10
# 共享的nfs的路径
path: /ifs/kubernetes
进入容器查看nfs情况:
复制 kubectl exec -it test-busybox -- sh
PV和PVC
PersistentVolume(PV) :对存储资源创建和使用的抽象,使得存储作为集群中的资源管理
PersistentVolumeClaim(PVC) :让用户不需要关心具体的Volume实现细节
Pod申请PVC作为卷来使用,Kubernetes通过PVC查找绑定的PV,并Mount给Pod。
pvc与pv是一对一的关系,一块存储只能给一个pvc使用
pvc会向上匹配第一个符合要求的pv,如果满足不了,pod处于pending
使用pv和pvc:静态供给
定义数据卷(PV):
复制 apiVersion: v1
kind: PersistentVolume
metadata:
name: my-pv
spec:
capacity:
storage: 5Gi
accessModes:
- ReadWriteOnce # 读写权限,但同时只能被一个节点写入
# 数据提供来源 nfs
nfs:
path: "/ifs/kubernetes"
server: 192.168.31.63
定义卷需求,Pod 使用 PersistentVolumeClaim 来请求物理存储
复制 apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: my-pvc
spec:
accessModes:
- ReadWriteOnce
resources: # 需求资源大小:3gb
requests:
storage: 3G
容器应用使用卷需求
复制 apiVersion: v1
kind: Pod
metadata:
name: my-pod
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80
volumeMounts: # 挂载www卷
- mountPath: "/usr/share/nginx/html"
name: www
volumes: # 定义www卷,卷使用my-pvc卷需求对象完成
- name: www
persistentVolumeClaim:
claimName: my-pvc # 使用上面定义的my-pvc卷需求
pv的访问模式
AccessMode是用来对PV进行访问模式的设置, 用于描述用户应用对存储资源的访问权限,包含以下几种:
ReadWriteOnce:拥有读写权限,但是只能被单个节点挂载
ReadOnlyMany:只读权限,可以被多个节点挂载
ReadWriteMany:读写权限,可以被多个节点挂载
pv的回收策略
Retain:当将pvc删除时,pv进入Released状态,这个状态下保留数据,需要管理员手动清理数据。手动创建的PV默认是该策略,推荐使用
Recycle:清除pv中的数据,效果相当于执行命令rm -rf /共享目录/*(新版本中该策略已被废弃)
Delete:删除PV及其关联的后端存储资源,动态供给的PV默认是该策略(在PV中显式指定回收策略的示意见下方)
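下面是一个在PV中显式指定回收策略的示意片段(名称、地址等均为演示值):
复制 apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-retain
spec:
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteMany
  # 回收策略:Retain / Delete / Recycle(已废弃)
  persistentVolumeReclaimPolicy: Retain
  nfs:
    path: "/ifs/kubernetes"
    server: 192.168.121.10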
pv的状态
一个pv的生命周期中,可能会处于四种不同的状态:
Available:可用状态,还未被任何PVC绑定
Bound:已绑定,表示该PV已经被某个PVC绑定
Released:已释放,表示PVC被删除,但是资源还未被集群重新声明
Failed:失败,表示该PV的自动回收失败
Storage Class
StorageClass是存储类,是对一类存储资源的分类,不同的StorageClass可能代表不同的存储服务质量等级或者备份策略,比如固态硬盘与机械硬盘,定时备份与不做备份。
官方文档:https://kubernetes.io/zh/docs/concepts/storage/storage-classes/
创建一个StorageClass:
复制 apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: standard
# 指定存储制备器,参考官方文档支持的存储制备器
# 这里使用aws云存储
provisioner: kubernetes.io/aws-ebs
# aws云存储所需要的参数,需要参考aws的文档
parameters:
type: gp2
# StorageClass对应的PV的回收策略:Delete(默认)、Retain
reclaimPolicy: Retain
# 允许卷扩展:允许用户通过编辑相应的 PVC 对象来调整卷大小
allowVolumeExpansion: true
mountOptions:
- debug
# 指定卷绑定和动态制备 应该发生在什么时候
# Immediate 模式表示一旦创建了 PersistentVolumeClaim 也就完成了卷绑定和动态制备
# WaitForFirstConsumer 模式,直到使用该 PersistentVolumeClaim 的 Pod 被创建才完成了卷绑定和动态制备
volumeBindingMode: Immediate
pv动态供给
允许按需创建PV,不需要运维人员每次手动添加,大大降低了维护成本。pv的动态供给主要由StorageClass对象实现。
动态卷供应 | Kubernetes
PVC存储请求被创建,将会由对应的StorageClass自动创建一个PV。StorageClass是存储类,是对一类存储资源的分类与抽象。
NFS
Kubernetes 不包含内部 NFS 驱动。你需要使用外部驱动(Provisioner)为 NFS 创建 StorageClass,例如 nfs-subdir-external-provisioner。
以subdir为例:
默认的pv的回收策略为Delete,可以在class.yaml文件中进行更改
修改deployment.yaml,更改镜像地址,否则无法下载镜像
修改deployment.yaml,更改NFS服务端信息,IP以及PATH(注意有两个地方)
查看已经创建的StorageClass:
复制 kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
managed-nfs-storage k8s-sigs.io/nfs-subdir-external-provisioner Delete Immediate false 5m14s
使用NFS动态供给:
复制 # 定义卷需求
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: my-pvc
spec:
storageClassName: "managed-nfs-storage"
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 3G
---
apiVersion: v1
kind: Pod
metadata:
name: my-pod
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80
volumeMounts:
- mountPath: "/usr/share/nginx/html"
name: www
volumes:
- name: www
persistentVolumeClaim:
claimName: my-pvc
创建上述PVC,并查看PV,应该有自动创建的PV:
复制 kubectl get pvc,pv
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/my-pvc Bound pvc-36c44a7c-0518-4738-be2d-d2432194c7c1 3G RWO managed-nfs-storage 6h52m
ConfigMap
ConfigMap用于应用程序的配置存储,Secret则用于存储敏感数据。ConfigMap共有两种创建方式:
创建后,其数据将会存储在ETCD中。相应的,Pod也可以通过两种不同的方式获取ConfigMap中的数据到应用程序中:
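命令方式创建的简单示意(键值与文件名仅作演示,--from-file 假设当前目录存在 game.properties 文件):
复制 # 从字面量创建
kubectl create configmap game-demo --from-literal=player_initial_lives=3
# 从文件创建
kubectl create configmap game-config --from-file=game.properties
# 查看生成的ConfigMap
kubectl get configmap game-demo -o yaml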
创建ConfigMap
复制 # 定义ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
name: game-demo
data:
# 类属性键;每一个键都映射到一个简单的值
player_initial_lives: "3"
ui_properties_file_name: "user-interface.properties"
# 类文件键
game.properties: |
enemy.types=aliens,monsters
player.maximum-lives=5
user-interface.properties: |
color.good=purple
color.bad=yellow
allow.textmode=true
binaryData:
token: "5aWl5Yip57uZ" #echo -n '奥利给' | base64
使用ConfigMap
复制 apiVersion: v1
kind: Pod
metadata:
name: configmap-demo-pod
spec:
containers:
- name: demo
image: alpine
command: ["sleep", "3600"]
# 定义环境变量
env:
- name: PLAYER_INITIAL_LIVES
valueFrom: # 指定环境部变量值来源为configMap
configMapKeyRef:
name: game-demo # 指定ConfigMap的名称
key: player_initial_lives # 指定需要从ConfigMap中取出值得键
- name: UI_PROPERTIES_FILE_NAME
valueFrom:
configMapKeyRef:
name: game-demo
key: ui_properties_file_name
# 挂载卷,config卷是一个ConfigMap对象
volumeMounts:
- name: config
mountPath: "/config"
readOnly: true
# 定义配置卷
volumes:
- name: config
configMap:
name: game-demo # 提供数据的ConfigMap名称
items: # ConfigMap的一组键,与容器文件名的映射
- key: "game.properties"
path: "game.properties"
- key: "user-interface.properties"
path: "user-interface.properties"
Secret
与ConfigMap类似,区别在于Secret主要存储敏感数据,所有数据都要经过base64编码(不加密)。
Secret的创建命令 kubectl create secret 支持创建三种类型的Secret(命令示例见下方):
docker-registry:存储镜像仓库认证信息
generic:存储用户名、密码等通用的键值数据
tls:存储证书和私钥
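命令方式创建的示意(用户名、密码、仓库地址与证书文件名均为演示值):
复制 # generic:从字面量创建通用Secret
kubectl create secret generic mysecret --from-literal=username=root --from-literal=password=root123
# docker-registry:创建镜像仓库认证信息
kubectl create secret docker-registry regcred --docker-server=registry.example.com --docker-username=admin --docker-password=123456
# tls:从证书文件创建(假设当前目录存在 tls.crt 和 tls.key)
kubectl create secret tls my-tls --cert=tls.crt --key=tls.key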
创建secret:
复制 apiVersion: v1
kind: Secret
metadata:
name: mysecret
type: Opaque
data: # data代表私密数据,需要以base64的方式填入
username: cm9vdA==
password: cm9vdDEyMw==
使用secret(环境变量方式):
复制 apiVersion: v1
# 以环境变量方式导入secret到Pod
kind: Pod
metadata:
name: use-secret-env
spec:
containers:
- name: use-secret-env
imagePullPolicy: IfNotPresent
image: nginx
env:
- name: SECRET_USERNAME
valueFrom:
secretKeyRef:
name: mysecret
key: username
- name: SECRET_PASSWORD
valueFrom:
secretKeyRef:
name: mysecret
key: password
使用secret(volume挂载):
复制 apiVersion: v1
# 以volume方式挂载secret到Pod
kind: Pod
metadata:
name: use-secret-volume
spec:
containers:
- name: use-secret-env
imagePullPolicy: IfNotPresent
image: nginx
volumeMounts:
- name: "foo"
mountPath: "/etc/foo" # 挂载完毕后,会在此目录下看到两个名称为username和password的文件,文件内容就是具体的secret的值
readOnly: true
volumes:
- name: foo
secret:
secretName: mysecret
安全
k8s安全框架主要由下面3个阶段进行控制,每个阶段都支持插件方式,通过API Server配置来启用插件:
kubectl 发送指令到API Server依次经过这三个步骤进行安全控制,通过后才后继续进行后续的操作。
Authentication 认证
k8s API Server提供三种客户端身份认证:
HTTPS证书认证:基于CA证书签名的数字证书认证(kubeconfig,kubectl就是使用这种方式)
HTTP Token认证:通过一个Token来识别用户(ServiceAccount,一般提供给程序使用,但也可以提供给kubectl)
HTTP Basic认证:用户名 + 密码认证(1.19版本废弃)
Authorization 授权
基于RBAC完成授权工作。RBAC根据API请求属性,决定允许还是拒绝。
角色
Role:授权特定命名空间的访问权限
ClusterRole:授权所有命名空间的(也就是整个集群)访问权限
角色绑定
RoleBinding:将角色绑定到主体,只在所属命名空间内生效
ClusterRoleBinding:将集群角色绑定到主体
主体(subject)可以是User(用户)、Group(用户组)或ServiceAccount(服务账号),Role/ClusterRole 通过 RoleBinding/ClusterRoleBinding 绑定到这些主体上。
Admission Control 准入控制
Admission Control实际上是一个准入控制器插件列表,发送到 API Server的请求都要经过这个列表中每个准入控制插件的检查,检查不通过则拒绝请求。
启用一个准入控制器:
复制 kube-apiserver --enable-admission-plugins=NamespaceLifecycle,LimitRanger ...
关闭一个准入控制器:
复制 kube-apiserver --disable-admission-plugins=PodNodeSelector,AlwaysDeny ...
查看默认启用:
复制 # 在 kube-apiserver-k8s-master 这个pod中执行命令 kube-apiserver -h ,查看Admission启用的插件
kubectl exec kube-apiserver-k8s-master -n kube-system -- kube-apiserver -h | grep enable-admission-plugins
示例:配置一个新的kubectl集群客户端
大致步骤:用集群根证书签发客户端证书 → 基于客户端证书生成kubeconfig配置文件 → 创建RBAC角色与绑定,为该用户授权
证书链的意思是有一个证书机构A,A生成证书B,B也可以生成证书C,那么A是根证书。操作系统预先安装的一些根证书,都是国际上很有权威的证书机构,比如 verisign 、 ENTRUST 这些公司。
这里k8s集群的根证书位于/etc/kubernetes/pki/ca.crt,可以根据这个根证书签发(下发)子证书。类似地,一条完整的证书链可能同时包含根证书(如 ca.cer)和站点证书(如 nginx.cn.cer)。
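可以先用 openssl 查看集群根证书的基本信息(仅作检查示意):
复制 openssl x509 -in /etc/kubernetes/pki/ca.crt -noout -subject -issuer -dates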
创建脚本cert.sh,用于生成证书:
复制 # 创建证书配置文件
cat > ca-config.json <<EOF
{
"signing": {
"default": {
"expiry": "87600h"
},
"profiles": {
"kubernetes": {
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
],
"expiry": "87600h"
}
}
}
}
EOF
# 创建证书请求文件
cat > yangsx-csr.json <<EOF
{
"CN": "yangsx",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "k8s",
"OU": "System"
}
]
}
EOF
# 使用cfssl生成客户端证书
# -ca 指定根证书 -ca-key指定根证书私钥 -config指定证书配置文件
cfssl gencert -ca=/etc/kubernetes/pki/ca.crt -ca-key=/etc/kubernetes/pki/ca.key -config=ca-config.json -profile=kubernetes yangsx-csr.json | cfssljson -bare yangsx
执行脚本将会生成:
复制 yangsx-key.pem 证书私钥
yangsx.pem 证书
再创建脚本kubeconfig.sh,使用此脚本创建kubeconfig:
复制 # 添加集群信息到配置文件
# 这里可以修改集群名称、apiserver地址、配置文件名称等
kubectl config set-cluster kubernetes \
--certificate-authority=/etc/kubernetes/pki/ca.crt \
--embed-certs=true \
--server=https://192.168.121.10:6443 \
--kubeconfig=yangsx.kubeconfig
# 添加用户以及客户端认证认证信息到配置文件
kubectl config set-credentials yangsx \
--client-key=yangsx-key.pem \
--client-certificate=yangsx.pem \
--embed-certs=true \
--kubeconfig=yangsx.kubeconfig
# 设置默认上下文
kubectl config set-context kubernetes \
--cluster=kubernetes \
--user=yangsx \
--kubeconfig=yangsx.kubeconfig
# 设置使用配置
kubectl config use-context kubernetes --kubeconfig=yangsx.kubeconfig
执行完毕后将会生成yangsx.kubeconfig配置文件,然后将文件下发给某个用户,配置给kubectl即可使用。
在未给yangsx这个用户授权之前,做任何操作都无法通过API Server的授权:
复制 kubectl get pods --kubeconfig=yangsx.kubeconfig
Error from server (Forbidden): pods is forbidden: User "yangsx" cannot list resource "pods" in API group "" in the namespace "default"
我们需要通过创建rbac资源,给指定的用户赋予权限,创建rbac.yaml:
复制 kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
namespace: default
name: pod-reader
rules:
- apiGroups: [""] # 资源组,可以通过 kubectl api-resources 命令查看,其第三列就代表apiGroup,空字符串代表核心组
resources: ["pods"] # 核心组下的名称为pods的资源
verbs: ["get", "watch", "list"] # 对pod可进行的操作
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: read-pods
namespace: default
subjects:
- kind: User # 指定角色要绑定的主体的类型为用户
name: yangsx # 指定用户名称,需要与客户端证书中的CN一致
apiGroup: rbac.authorization.k8s.io
roleRef: # 指定将pod-reader这个角色绑定给用户
kind: Role
name: pod-reader
apiGroup: rbac.authorization.k8s.io
使用集群管理员(kubernetes-admin)用户创建上述资源清单中的资源:
复制 kubectl apply -f rbac.yaml
这样就给用户yangsx分配了权限:
复制 root@k8s-master:~# kubectl get pods --kubeconfig=./yangsx.kubeconfig
NAME READY STATUS RESTARTS AGE
my-pod 1/1 Running 0 25h
可以查看role、rolebinding的创建情况:
复制 root@k8s-master:~# kubectl get role,rolebinding
NAME CREATED AT
role.rbac.authorization.k8s.io/pod-reader 2022-02-03T14:33:44Z
NAME ROLE AGE
rolebinding.rbac.authorization.k8s.io/read-pods Role/pod-reader 2m29s
示例:为一个ServiceAccount分配一个只能创建deployment、daemonset、statefulset的权限
ServiceAccount一般提供给程序使用,但也可以给kubectl使用。
实现方式一,通过命令创建:
复制 # 创建集群角色
kubectl create clusterrole deployment-clusterrole --verb=create --resource=deployments,daemonsets,statefulsets
# 创建服务账号
kubectl create serviceaccount cicd-token -n app-team1
# 将服务账号绑定角色
kubectl create rolebinding cicd-token --serviceaccount=app-team1:cicd-token --clusterrole=deployment-clusterrole -n app-team1
# 测试服务账号权限
kubectl --as=system:serviceaccount:app-team1:cicd-token get pods -n app-team1
实现方式二,通过yaml创建:
复制 apiVersion: v1
kind: ServiceAccount
metadata:
name: cicd-token
namespace: app-team1
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: deployment-clusterrole
rules:
- apiGroups: ["apps"]
resources: ["deployments","daemonsets","statefulsets"]
verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: cicd-token
namespace: app-team1
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: deployment-clusterrole
subjects:
- kind: ServiceAccount
name: cicd-token
namespace: app-team1
网络策略
默认情况下,Kubernetes 集群网络没有任何网络限制,Pod 可以与任何其他Pod 通信,在某些场景下就需要进行网络控制,减少网络攻击面,提高安全性,这就会用到网络策略。网络策略(Network Policy):是一个K8s资源,用于限制Pod出入流量,提供Pod级别和Namespace级别网络访问控制。
网络策略的应用场景(偏重多租户下):
应用程序间的访问控制,例如项目A不能访问项目B的Pod
网络策略的工作流程:
Policy Controller监控网络策略,同步并通知节点上的程序
节点上DaemonSet运行的程序从etcd获取Policy,调用本地Iptables规则
案例:拒绝其他命名空间Pod访问
需求:test命名空间下所有pod可以互相访问,也可以访问其他命名空间Pod,但其他命名空间不能访问test命名空间Pod 。
复制 apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: deny-all-namespaces
namespace: test # 指定命名空间
spec:
podSelector: {} # 网络策略应用的目标pod,未配置代表所有pod
policyTypes: # 策略类型,指定策略用于入站(Ingress)、出站(Egress)流量
- Ingress
ingress:
- from: # 指定入站的白名单
- podSelector: {} # 空选择器,匹配本命名空间(test)内所有的pod
测试:
复制 kubectl run busybox --image=busybox -n test -- sleep 12h
kubectl run web --image=nginx -n test
# 同命名空间pod可访问测试
kubectl exec busybox -n test -- ping <同命名空间pod IP>
# 非test命名空间pod不可访问test命名空间测试
kubectl exec busybox -- ping <test命名空间pod IP>
案例:同一个命名空间下应用之间限制访问
需求:将test命名空间携带run=web标签的Pod隔离,只允许携带run=client1标签的Pod访问80端口。
复制 apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: app-to-app
namespace: test
spec:
podSelector:
matchLabels:
run: web # test命名空间携带run=web标签的Pod
policyTypes:
- Ingress
ingress:
- from: # 指定白名单 只允许携带run=client1标签的Pod
- podSelector:
matchLabels:
run: client1
ports:
- protocol: TCP # 访问80端口
port: 80
测试:
复制 kubectl run web --image=nginx -n test
kubectl run client1 --image=busybox -n test -- sleep 12h
# 可以访问
kubectl exec client1 -n test -- wget <test命名空间pod IP>
# 不能访问
kubectl exec busybox -- wget <test命名空间pod IP>
案例:只允许指定命名空间中的应用访问
需求:只允许dev命名空间中的Pod访问test命名空间中的pod 80端口
复制 apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-port-from-namespace
namespace: test
spec:
podSelector: {}
policyTypes:
- Ingress
ingress:
- from: # 白名单,dev命名空间的pod
- namespaceSelector:
matchLabels:
name: dev
ports:
- protocol: TCP
port: 80
测试:
复制 # 命名空间打标签:
kubectl label namespace dev name=dev
kubectl run busybox --image=busybox -n dev -- sleep 12h
# 可以访问
kubectl exec busybox -n dev -- wget <test命名空间pod IP>
# 不可以访问
kubectl exec busybox -- wget <test命名空间pod IP>