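Offline installation, deployment, and testing of k8s + Harbor: the complete process
Environment:
| IP | Host name | Role |
| 172.16.131.83 | k8s-master | Master (management) node |
| 172.16.131.84 | k8s-node1 | Worker node 1 |
| 172.16.131.85 | k8s-node2 | Worker node 2 |
| 172.16.131.86 | k8s-node3 | Worker node 3 |
| 172.16.131.87 | registry-harbor | Image registry |
| 172.16.131.88 | k8s-zhongzhuan | Internet-connected relay host |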
I. Pre-deployment preparation:
Run on the k8s cluster nodes
1. Add the cluster host names to /etc/hosts on the k8s nodes:
cp /etc/hosts /etc/hosts_`date +%y%m%d`
echo "172.16.131.83 k8s-master
172.16.131.84 k8s-node1
172.16.131.85 k8s-node2
172.16.131.86 k8s-node3
172.16.131.87 registry-harbor" >> /etc/hosts
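Appending with echo adds the same entries again every time the step is repeated. A minimal sketch of an idempotent variant follows; the helper name add_host_entry is hypothetical, and the example writes to a scratch file rather than /etc/hosts.

```shell
#!/usr/bin/env bash
# Append "ip name" to a hosts file only if that name is not already mapped.
# add_host_entry is a hypothetical helper, not part of the original procedure.
add_host_entry() {
  local hosts_file="$1" ip="$2" name="$3"
  # Look for the name as a whole field on any non-comment line.
  if ! awk -v n="$name" '
        !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$hosts_file"; then
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
  fi
}

hosts_file="$(mktemp)"   # point this at /etc/hosts on a real node
for entry in "172.16.131.83 k8s-master" "172.16.131.84 k8s-node1" \
             "172.16.131.85 k8s-node2" "172.16.131.86 k8s-node3" \
             "172.16.131.87 registry-harbor"; do
  # Intentional word splitting: each entry is "ip name".
  add_host_entry "$hosts_file" $entry
done
# A repeated call must not create a duplicate line.
add_host_entry "$hosts_file" 172.16.131.83 k8s-master
cat "$hosts_file"
```

Running the loop twice leaves the file unchanged, which makes the step safe to repeat on a node that was already prepared.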
2. Configure kernel parameters:
echo "fs.file-max = 6815744
kernel.sem = 10000 10240000 10000 1024
kernel.shmmni = 4096
kernel.shmall = 1073741824
kernel.shmmax = 751619276800
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 16777216
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.wmem_default = 16777216
fs.aio-max-nr = 6194304
vm.dirty_ratio=20
vm.dirty_background_ratio=3
vm.dirty_writeback_centisecs=100
vm.dirty_expire_centisecs=500
vm.min_free_kbytes=524288
net.core.netdev_max_backlog = 30000
net.core.netdev_budget = 600
#vm.nr_hugepages =
net.ipv4.conf.all.rp_filter = 2
net.ipv4.conf.default.rp_filter = 2
net.ipv4.ipfrag_time = 60
net.ipv4.ipfrag_low_thresh = 6291456
net.ipv4.ipfrag_high_thresh = 8388608
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
vm.swappiness=0" >> /etc/sysctl.conf && sysctl -p
3. Configure user resource limits:
cp /etc/security/limits.conf /etc/security/limits_`date +"%Y%m%d_%H%M%S"`.conf
echo "* soft nproc 655350
* hard nproc 655350
* soft nofile 655360
* hard nofile 655360
* soft stack 102400
* hard stack 327680
* soft memlock -1
* hard memlock -1" >> /etc/security/limits.conf
4. Disable the firewall:
systemctl stop firewalld
systemctl disable firewalld
5. Disable SELinux:
setenforce 0
sed -i "s/^SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
6. Disable transparent huge pages:
[ -f /sys/kernel/mm/transparent_hugepage/enabled ] && echo never > /sys/kernel/mm/transparent_hugepage/enabled
[ -f /sys/kernel/mm/redhat_transparent_hugepage/enabled ] && echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
grep transparent_hugepage /etc/rc.d/rc.local 1>/dev/null || echo '[ -f /sys/kernel/mm/transparent_hugepage/enabled ] && echo never > /sys/kernel/mm/transparent_hugepage/enabled' >> /etc/rc.local
grep redhat_transparent_hugepage /etc/rc.d/rc.local 1>/dev/null || echo '[ -f /sys/kernel/mm/redhat_transparent_hugepage/enabled ] && echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled' >> /etc/rc.local
[ -x /etc/rc.d/rc.local ] || chmod +x /etc/rc.d/rc.local
7. Disable swap
swapoff -a
sed -i 's/.*swap.*/#&/' /etc/fstab
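The sed expression in step 7 prefixes every line containing "swap" with another "#" each time it runs. As a hedged alternative, the sketch below comments out only lines that have swap as a separate field and skips lines that are already commented. The function name disable_swap_entries is an assumption, and the demo works on a scratch copy rather than /etc/fstab.

```shell
#!/usr/bin/env bash
# Comment out swap entries in an fstab-style file, leaving commented lines alone.
# disable_swap_entries is a hypothetical helper name.
disable_swap_entries() {
  local fstab="$1"
  # -i.bak keeps a backup next to the file and works with GNU and BSD sed.
  sed -i.bak -E '/^[[:space:]]*#/!{/[[:space:]]swap[[:space:]]/s/^/#/;}' "$fstab"
}

fstab="$(mktemp)"   # stand-in for /etc/fstab
printf '%s\n' \
  '/dev/mapper/rhel-root / xfs defaults 0 0' \
  '/dev/mapper/rhel-swap swap swap defaults 0 0' \
  '#/dev/sdb1 swap swap defaults 0 0' > "$fstab"

disable_swap_entries "$fstab"
disable_swap_entries "$fstab"   # second run changes nothing
cat "$fstab"
```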
8. Configure SSH (see the appendix for the contents of sshUserSetup.sh)
sh sshUserSetup.sh -user root -hosts "k8s-master k8s-node1 k8s-node2 k8s-node3"
9. Synchronize clocks (the other nodes sync from the master):
On the master: vi /etc/ntp.conf
#server 0.rhel.pool.ntp.org iburst
#server 1.rhel.pool.ntp.org iburst
#server 2.rhel.pool.ntp.org iburst
#server 3.rhel.pool.ntp.org iburst
server 127.127.1.0
fudge 127.127.1.0 stratum 10
On the other machines: crontab -e
*/2 * * * * /usr/sbin/ntpdate 172.16.131.83
Check that the clocks agree:
date && ssh k8s-node1 date && ssh k8s-node2 date && ssh k8s-node3 date
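II. Install the container runtime, Docker
Run on the internet-connected relay machine:
1. Install the required packages (the local repository is sufficient)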
yum install -y yum-utils device-mapper-persistent-data lvm2 wget
2. Install EPEL (requires a CentOS 7 repository)
Fetch the Aliyun CentOS 7 repo file:
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
3. Edit CentOS-Base.repo and replace every $releasever in the file with the version number 7:
vi /etc/yum.repos.d/CentOS-Base.repo
:%s/$releasever/7/g
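The same substitution can be done without opening an editor, which is handy when the step has to be scripted. The sketch below is one way to do it with sed; the function name pin_releasever and the sample repo content are assumptions made for the demonstration.

```shell
#!/usr/bin/env bash
# Replace the literal string $releasever in a repo file with a fixed release.
# pin_releasever is a hypothetical helper name.
pin_releasever() {
  local repo_file="$1" release="${2:-7}"
  # The dollar sign is escaped so sed treats it as a literal character.
  sed -i.bak 's/\$releasever/'"$release"'/g' "$repo_file"
}

repo="$(mktemp)"   # stand-in for /etc/yum.repos.d/CentOS-Base.repo
printf '%s\n' '[base]' \
  'baseurl=http://mirrors.aliyun.com/centos/$releasever/os/$basearch/' > "$repo"

pin_releasever "$repo" 7
cat "$repo"
```

The same call applies to /etc/yum.repos.d/docker-ce.repo if the 404 error described later in this section appears.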
4. Clean and rebuild the yum cache:
yum clean all && yum makecache fast
5. Install epel-release.noarch:
yum install -y epel-release.noarch
6. Add the Docker repository
yum-config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo
or
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
7. Enable the yum repository
yum-config-manager --enable docker-ce-nightly
(To list the Docker versions available for installation: yum list docker-ce --showduplicates | sort -r)
Note: listing the installable Docker versions may fail with an error like this:
https://mirrors.aliyun.com/docker-ce/linux/centos/7Server/x86_64/stable/repodata/7cc100684a6630e5382cf07c92483acecdff60eb94243af9acb95654c2913d70-primary.sqlite.bz2: [Errno 14] HTTPS Error 404 - Not Found
Trying other mirror.
The main cause is that $releasever in the repository configuration cannot be resolved. In that case, do the following:
vi /etc/yum.repos.d/docker-ce.repo
:%s/$releasever/7/g
8. Clean and rebuild the yum cache:
yum clean all && yum makecache fast
9. Download the packages for the chosen Docker version:
mkdir -p /app/soft/docker
cd /app/soft/docker
yumdownloader --resolve docker-ce-23.0.1
10. Create the archive:
cd /app/soft
tar -cvzf docker_v23.0.1_offline_pkg.tar.gz docker
11. Send docker_v23.0.1_offline_pkg.tar.gz to the offline machines
scp -rp docker_v23.0.1_offline_pkg.tar.gz 172.16.131.83:/app/soft/
scp -rp docker_v23.0.1_offline_pkg.tar.gz 172.16.131.84:/app/soft/
scp -rp docker_v23.0.1_offline_pkg.tar.gz 172.16.131.85:/app/soft/
scp -rp docker_v23.0.1_offline_pkg.tar.gz 172.16.131.86:/app/soft/
scp -rp docker_v23.0.1_offline_pkg.tar.gz 172.16.131.87:/app/soft/
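Copying the same archive to several hosts is easier to keep consistent with a loop over a host list. The following is only a sketch: the function name distribute and the DRY_RUN switch are hypothetical, and by default the commands are printed rather than executed.

```shell
#!/usr/bin/env bash
# Print (dry run) or execute one scp per target host.
# distribute and DRY_RUN are hypothetical names for this sketch;
# DRY_RUN defaults to 1 so nothing is copied by accident.
distribute() {
  local pkg="$1" dest="$2"
  shift 2
  local host
  for host in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "scp -p $pkg $host:$dest"
    else
      scp -p "$pkg" "$host:$dest"
    fi
  done
}

hosts="172.16.131.83 172.16.131.84 172.16.131.85 172.16.131.86 172.16.131.87"
# Intentional word splitting of the host list.
distribute /app/soft/docker_v23.0.1_offline_pkg.tar.gz /app/soft/ $hosts
```

Set DRY_RUN=0 to perform the copies. The same function can be reused for the docker-compose and Kubernetes archives by changing the first argument.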
Run on the k8s cluster nodes
1. Extract the offline package:
tar -xvzf /app/soft/docker_v23.0.1_offline_pkg.tar.gz -C /app/soft/
2. Install Docker:
cd /app/soft/docker
yum install *.rpm
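III. Install the private registry, Harbor
Run on the internet-connected relay machine:
1. Download and configure the EPEL repository
wget -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
2. Download docker-compose
Check the available versions: yum list docker-compose --showduplicates | sort -r
Create the directory: mkdir -p /app/soft/docker-compose
cd /app/soft/docker-compose
Download the chosen version: yumdownloader --resolve docker-compose-1.18.0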
3. Archive the docker-compose packages:
cd /app/soft
tar -cvzf docker-compose_offline_pkg_v1.18.0.tar.gz docker-compose
4. Send docker-compose_offline_pkg_v1.18.0.tar.gz to the offline machines
scp -rp docker-compose_offline_pkg_v1.18.0.tar.gz 172.16.131.83:/app/soft/
scp -rp docker-compose_offline_pkg_v1.18.0.tar.gz 172.16.131.84:/app/soft/
scp -rp docker-compose_offline_pkg_v1.18.0.tar.gz 172.16.131.85:/app/soft/
scp -rp docker-compose_offline_pkg_v1.18.0.tar.gz 172.16.131.86:/app/soft/
scp -rp docker-compose_offline_pkg_v1.18.0.tar.gz 172.16.131.87:/app/soft/
5. Extract the package on the offline machines:
tar -xvzf /app/soft/docker-compose_offline_pkg_v1.18.0.tar.gz -C /app/soft/
6. Install docker-compose on the offline machines:
cd /app/soft/docker-compose
yum install *.rpm
7. Download the Harbor offline installer (on the internet-connected relay machine)
curl -LO https://github.com/goharbor/harbor/releases/download/v2.7.1/harbor-offline-installer-v2.7.1.tgz
Alternatively, download it manually from GitHub and upload it.
8. Transfer the offline installer to the registry-harbor host and extract it
scp -rp /app/soft/harbor-offline-installer-v2.7.1.tgz 172.16.131.87:/app/soft/
tar -xvzf /app/soft/harbor-offline-installer-v2.7.1.tgz -C /app/
9. Edit the YAML configuration file as needed
cp harbor.yml.tmpl harbor.yml
vi harbor.yml
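The main changes are:
1. hostname
2. port
3. Comment out https and its related settings
4. harbor_admin_password
5. password in the database section
6. data_volume
7. location in the log section
Adjust everything else to your own environment. The resulting harbor.yml looks like this: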
# Configuration file of Harbor

# The IP address or hostname to access admin UI and registry service.
# DO NOT use localhost or 127.0.0.1, because Harbor needs to be accessed by external clients.
hostname: 172.16.131.87

# http related config
http:
  # port for http, default is 80. If https enabled, this port will redirect to https port
  port: 1088

# https related config
#https:
#  # https port for harbor, default is 443
#  port: 443
#  # The path of cert and key files for nginx
#  certificate: /your/certificate/path
#  private_key: /your/private/key/path

# # Uncomment following will enable tls communication between all harbor components
# internal_tls:
#   # set enabled to true means internal tls is enabled
#   enabled: true
#   # put your cert and key files on dir
#   dir: /etc/harbor/tls/internal

# Uncomment external_url if you want to enable external proxy
# And when it enabled the hostname will no longer used
# external_url: https://reg.mydomain.com:8433

# The initial password of Harbor admin
# It only works in first time to install harbor
# Remember Change the admin password from UI after launching Harbor.
harbor_admin_password: Harbor@1234

# Harbor DB configuration
database:
  # The password for the root user of Harbor DB. Change this before any production use.
  password: Harbor@1234
  # The maximum number of connections in the idle connection pool. If it <=0, no idle connections are retained.
  max_idle_conns: 100
  # The maximum number of open connections to the database. If it <= 0, then there is no limit on the number of open connections.
  # Note: the default number of connections is 1024 for postgres of harbor.
  max_open_conns: 900
  # The maximum amount of time a connection may be reused. Expired connections may be closed lazily before reuse. If it <= 0, connections are not closed due to a connection's age.
  # The value is a duration string. A duration string is a possibly signed sequence of decimal numbers, each with optional fraction and a unit suffix, such as "300ms", "-1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".
  conn_max_lifetime: 5m
  # The maximum amount of time a connection may be idle. Expired connections may be closed lazily before reuse. If it <= 0, connections are not closed due to a connection's idle time.
  # The value is a duration string. A duration string is a possibly signed sequence of decimal numbers, each with optional fraction and a unit suffix, such as "300ms", "-1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".
  conn_max_idle_time: 0

# The default data volume
data_volume: /app/data

# Harbor Storage settings by default is using /data dir on local filesystem
# Uncomment storage_service setting If you want to using external storage
# storage_service:
#   # ca_bundle is the path to the custom root ca certificate, which will be injected into the truststore
#   # of registry's and chart repository's containers. This is usually needed when the user hosts a internal storage with self signed certificate.
#   ca_bundle:

#   # storage backend, default is filesystem, options include filesystem, azure, gcs, s3, swift and oss
#   # for more info about this configuration please refer https://docs.docker.com/registry/configuration/
#   filesystem:
#     maxthreads: 100
#   # set disable to true when you want to disable registry redirect
#   redirect:
#     disabled: false

# Trivy configuration
#
# Trivy DB contains vulnerability information from NVD, Red Hat, and many other upstream vulnerability databases.
# It is downloaded by Trivy from the GitHub release page https://github.com/aquasecurity/trivy-db/releases and cached
# in the local file system. In addition, the database contains the update timestamp so Trivy can detect whether it
# should download a newer version from the Internet or use the cached one. Currently, the database is updated every
# 12 hours and published as a new release to GitHub.
trivy:
  # ignoreUnfixed The flag to display only fixed vulnerabilities
  ignore_unfixed: false
  # skipUpdate The flag to enable or disable Trivy DB downloads from GitHub
  #
  # You might want to enable this flag in test or CI/CD environments to avoid GitHub rate limiting issues.
  # If the flag is enabled you have to download the `trivy-offline.tar.gz` archive manually, extract `trivy.db` and
  # `metadata.json` files and mount them in the `/home/scanner/.cache/trivy/db` path.
  skip_update: false
  #
  # The offline_scan option prevents Trivy from sending API requests to identify dependencies.
  # Scanning JAR files and pom.xml may require Internet access for better detection, but this option tries to avoid it.
  # For example, the offline mode will not try to resolve transitive dependencies in pom.xml when the dependency doesn't
  # exist in the local repositories. It means a number of detected vulnerabilities might be fewer in offline mode.
  # It would work if all the dependencies are in local.
  # This option doesn’t affect DB download. You need to specify "skip-update" as well as "offline-scan" in an air-gapped environment.
  offline_scan: false
  #
  # Comma-separated list of what security issues to detect. Possible values are `vuln`, `config` and `secret`. Defaults to `vuln`.
  security_check: vuln
  #
  # insecure The flag to skip verifying registry certificate
  insecure: false
  # github_token The GitHub access token to download Trivy DB
  #
  # Anonymous downloads from GitHub are subject to the limit of 60 requests per hour. Normally such rate limit is enough
  # for production operations. If, for any reason, it's not enough, you could increase the rate limit to 5000
  # requests per hour by specifying the GitHub access token. For more details on GitHub rate limiting please consult
  # https://developer.github.com/v3/#rate-limiting
  #
  # You can create a GitHub token by following the instructions in
  # https://help.github.com/en/github/authenticating-to-github/creating-a-personal-access-token-for-the-command-line
  #
  # github_token: xxx

jobservice:
  # Maximum number of job workers in job service
  max_job_workers: 10

notification:
  # Maximum retry count for webhook job
  webhook_job_max_retry: 10

chart:
  # Change the value of absolute_url to enabled can enable absolute url in chart
  absolute_url: disabled

# Log configurations
log:
  # options are debug, info, warning, error, fatal
  level: info
  # configs for logs in local storage
  local:
    # Log files are rotated log_rotate_count times before being removed. If count is 0, old versions are removed rather than rotated.
    rotate_count: 50
    # Log files are rotated only if they grow bigger than log_rotate_size bytes. If size is followed by k, the size is assumed to be in kilobytes.
    # If the M is used, the size is in megabytes, and if G is used, the size is in gigabytes. So size 100, size 100k, size 100M and size 100G
    # are all valid.
    rotate_size: 200M
    # The directory on your host that store log
    location: /app/harbor/log

  # Uncomment following lines to enable external syslog endpoint.
  # external_endpoint:
  #   # protocol used to transmit log to external endpoint, options is tcp or udp
  #   protocol: tcp
  #   # The host of external endpoint
  #   host: localhost
  #   # Port of external endpoint
  #   port: 5140

#This attribute is for migrator to detect the version of the .cfg file, DO NOT MODIFY!
_version: 2.7.0

# Uncomment external_database if using external database.
# external_database:
#   harbor:
#     host: harbor_db_host
#     port: harbor_db_port
#     db_name: harbor_db_name
#     username: harbor_db_username
#     password: harbor_db_password
#     ssl_mode: disable
#     max_idle_conns: 2
#     max_open_conns: 0
#   notary_signer:
#     host: notary_signer_db_host
#     port: notary_signer_db_port
#     db_name: notary_signer_db_name
#     username: notary_signer_db_username
#     password: notary_signer_db_password
#     ssl_mode: disable
#   notary_server:
#     host: notary_server_db_host
#     port: notary_server_db_port
#     db_name: notary_server_db_name
#     username: notary_server_db_username
#     password: notary_server_db_password
#     ssl_mode: disable

# Uncomment external_redis if using external Redis server
# external_redis:
#   # support redis, redis+sentinel
#   # host for redis: :
#   # host for redis+sentinel:
#   # :,:,:
#   host: redis:6379
#   password:
#   # sentinel_master_set must be set to support redis+sentinel
#   #sentinel_master_set:
#   # db_index 0 is for core, it's unchangeable
#   registry_db_index: 1
#   jobservice_db_index: 2
#   chartmuseum_db_index: 3
#   trivy_db_index: 5
#   idle_timeout_seconds: 30

# Uncomment uaa for trusting the certificate of uaa instance that is hosted via self-signed cert.
# uaa:
#   ca_file: /path/to/ca

# Global proxy
# Config http proxy for components, e.g. http://my.proxy.com:3128
# Components doesn't need to connect to each others via http proxy.
# Remove component from `components` array if want disable proxy
# for it. If you want use proxy for replication, MUST enable proxy
# for core and jobservice, and set `http_proxy` and `https_proxy`.
# Add domain to the `no_proxy` field, when you want disable proxy
# for some special registry.
proxy:
  http_proxy:
  https_proxy:
  no_proxy:
  components:
    - core
    - jobservice
    - trivy

# metric:
#   enabled: false
#   port: 9090
#   path: /metrics

# Trace related config
# only can enable one trace provider(jaeger or otel) at the same time,
# and when using jaeger as provider, can only enable it with agent mode or collector mode.
# if using jaeger collector mode, uncomment endpoint and uncomment username, password if needed
# if using jaeger agetn mode uncomment agent_host and agent_port
# trace:
#   enabled: true
#   # set sample_rate to 1 if you wanna sampling 100% of trace data; set 0.5 if you wanna sampling 50% of trace data, and so forth
#   sample_rate: 1
#   # # namespace used to differenciate different harbor services
#   # namespace:
#   # # attributes is a key value dict contains user defined attributes used to initialize trace provider
#   # attributes:
#   #   application: harbor
#   # # jaeger should be 1.26 or newer.
#   # jaeger:
#   #   endpoint: http://hostname:14268/api/traces
#   #   username:
#   #   password:
#   #   agent_host: hostname
#   #   # export trace data by jaeger.thrift in compact mode
#   #   agent_port: 6831
#   # otel:
#   #   endpoint: hostname:4318
#   #   url_path: /v1/traces
#   #   compression: false
#   #   insecure: true
#   #   timeout: 10s

# enable purge _upload directories
upload_purging:
  enabled: true
  # remove files in _upload directories which exist for a period of time, default is one week.
  age: 168h
  # the interval of the purge operations
  interval: 24h
  dryrun: false

# cache layer configurations
# If this feature enabled, harbor will cache the resource
# `project/project_metadata/repository/artifact/manifest` in the redis
# which can especially help to improve the performance of high concurrent
# manifest pulling.
# NOTICE
# If you are deploying Harbor in HA mode, make sure that all the harbor
# instances have the same behaviour, all with caching enabled or disabled,
# otherwise it can lead to potential data inconsistency.
cache:
  # not enabled by default
  enabled: false
  # keep cache for one day by default
  expire_hours: 24
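Some of the edits listed in step 9 (hostname, http port, commenting out the https block) can also be scripted instead of made by hand in vi. The sketch below shows one possible approach with sed. The function name configure_harbor is hypothetical, and the sample file is a trimmed stand-in for harbor.yml.tmpl, so check the patterns against the real template before relying on them.

```shell
#!/usr/bin/env bash
# Apply three of the step-9 edits to a harbor.yml copied from the template.
# configure_harbor is a hypothetical helper name.
configure_harbor() {
  local yml="$1" host="$2" port="$3"
  sed -i.bak -E \
    -e "s/^hostname:.*/hostname: ${host}/" \
    -e "/^http:/,/^[[:space:]]+port:/ s/^([[:space:]]+port:).*/\1 ${port}/" \
    -e '/^https:/,/^[[:space:]]+private_key:/ s/^/#/' \
    "$yml"
}

yml="$(mktemp)"   # trimmed stand-in for harbor.yml
cat > "$yml" <<'EOF'
hostname: reg.mydomain.com
http:
  # port for http, default is 80. If https enabled, this port will redirect to https port
  port: 80
https:
  # https port for harbor, default is 443
  port: 443
  certificate: /your/certificate/path
  private_key: /your/private/key/path
harbor_admin_password: Harbor12345
EOF

configure_harbor "$yml" 172.16.131.87 1088
cat "$yml"
```

The passwords, data_volume, and log location are left out on purpose: they are site-specific, and editing them by hand keeps secrets out of shell history.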
10. Install Harbor:
cd /app/harbor
./install.sh
11. Point the container registry on every server to the internal Harbor, and switch Docker's cgroup driver to systemd:
cat > /etc/docker/daemon.json <<EOF
[...]
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
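For step 11, the body of daemon.json is not shown in full. As a sketch only, a file that meets the two stated goals (trusting the HTTP-only Harbor at 172.16.131.87:1088 and using the systemd cgroup driver) could be written as below. The keys insecure-registries and exec-opts are standard Docker daemon options, but whether the original file contained exactly these entries is an assumption, and write_daemon_json is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Write a daemon.json that trusts an HTTP-only registry and uses the systemd cgroup driver.
# write_daemon_json is a hypothetical helper; the exact original file content is assumed.
write_daemon_json() {
  local target="$1" registry="$2"
  cat > "$target" <<EOF
{
  "insecure-registries": ["${registry}"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
}

target="$(mktemp)"   # use /etc/docker/daemon.json on a real node
write_daemon_json "$target" "172.16.131.87:1088"
cat "$target"
# On a real node, follow with: systemctl daemon-reload && systemctl restart docker
```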
2. Rebuild the yum cache
yum clean all && yum makecache
3. List the available versions of kubelet, kubeadm, and kubectl
yum list kubelet --showduplicates | sort -r
yum list kubeadm --showduplicates | sort -r
yum list kubectl --showduplicates | sort -r
4. Download the kubeadm-related packages
yumdownloader kubelet-1.25.6 --resolve --destdir=/app/soft/kubernetes/kubelet
yumdownloader kubeadm-1.25.6 --resolve --destdir=/app/soft/kubernetes/kubeadm
yumdownloader kubectl-1.25.6 --resolve --destdir=/app/soft/kubernetes/kubectl
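Note:
I deliberately downloaded a slightly older version. After I first installed the latest release, 1.26.3, kubelet would not start. The cause turned out to be that cri-dockerd still only supported the CRI v1alpha2 API, while kubelet removed v1alpha2 support in 1.26, so with a container runtime that does not support CRI v1, kubelet cannot register the node. Likewise, Kubernetes 1.26 no longer supports containerd 1.5 and earlier; to keep using containerd you need containerd 1.6.0 or later. In short, Kubernetes 1.26 was incompatible with the CRI of the Docker runtime I had deployed, and kubelet failed to start. According to what I read, moving containerd.io to 1.6 or later should be enough, but in my deployment containerd.io was already 1.6.20 and kubelet still would not start, so I switched to an older Kubernetes version.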
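The version cut-off described in this note can be checked before downloading packages. The sketch below only encodes the rule stated above (kubelet 1.26 and later requires a runtime that speaks CRI v1). The function name requires_cri_v1 is hypothetical, and the check says nothing about other compatibility constraints.

```shell
#!/usr/bin/env bash
# Return success when the given Kubernetes version is 1.26 or newer,
# i.e. when kubelet no longer accepts a CRI v1alpha2-only runtime.
# requires_cri_v1 is a hypothetical helper name.
requires_cri_v1() {
  local v="${1#v}"            # accept both "1.26.3" and "v1.26.3"
  local major="${v%%.*}"
  local rest="${v#*.}"
  local minor="${rest%%.*}"
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 26 ]; }
}

for version in 1.25.6 1.26.3; do
  if requires_cri_v1 "$version"; then
    echo "$version: the container runtime must support CRI v1"
  else
    echo "$version: predates the removal of CRI v1alpha2"
  fi
done
```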
5. After the download, move the kubectl-1.26.3 and kubelet-1.26.3 packages that were pulled into the kubeadm folder out of the way, then archive the remaining packages
cd /app/soft/kubernetes/kubeadm/
mv *kubectl-1.26.3*.rpm *kubelet-1.26*.rpm ../../
cd /app/soft
tar -cvzf kubeadm_1.25.6_offline_install_pkg.tar.gz kubernetes
6. Transfer the archive to all offline nodes:
scp -rp /app/soft/kubeadm_1.25.6_offline_install_pkg.tar.gz 172.16.131.83:/app/soft/
scp -rp /app/soft/kubeadm_1.25.6_offline_install_pkg.tar.gz 172.16.131.84:/app/soft/
scp -rp /app/soft/kubeadm_1.25.6_offline_install_pkg.tar.gz 172.16.131.85:/app/soft/
scp -rp /app/soft/kubeadm_1.25.6_offline_install_pkg.tar.gz 172.16.131.86:/app/soft/
On the Kubernetes cluster:
1. On every machine, extract and install kubelet, kubectl, and kubeadm
tar -xvzf /app/soft/kubeadm_1.25.6_offline_install_pkg.tar.gz -C /app/
cd /app/kubernetes/kubelet/
yum install -y *.rpm
cd /app/kubernetes/kubectl/
yum install -y *.rpm
cd /app/kubernetes/kubeadm/
yum install -y *.rpm
2. Start the kubelet service
systemctl start kubelet && systemctl enable kubelet && systemctl status kubelet
On the internet-connected relay machine:
1. Pull the k8s images:
mkdir -p /app/soft/k8s_images
docker pull mirrorgcrio/kube-proxy:v1.23.6
docker pull mirrorgcrio/kube-scheduler:v1.23.6
docker pull mirrorgcrio/kube-controller-manager:v1.23.6
docker pull mirrorgcrio/kube-apiserver:v1.23.6
docker pull mirrorgcrio/coredns:1.7.0
docker pull mirrorgcrio/etcd:3.4.9
docker pull mirrorgcrio/pause:latest
docker pull registry:latest
docker pull quay.io/coreos/flannel:v0.15.1-amd64
docker pull nginx
2. Save the images to tar files:
docker save mirrorgcrio/kube-proxy:v1.23.6 -o /app/soft/k8s_images/kube-proxy_v1.23.6.tar
docker save mirrorgcrio/kube-scheduler:v1.23.6 -o /app/soft/k8s_images/kube-scheduler_v1.23.6.tar
docker save mirrorgcrio/kube-controller-manager:v1.23.6 -o /app/soft/k8s_images/kube-controller-manager_v1.23.6.tar
docker save mirrorgcrio/kube-apiserver:v1.23.6 -o /app/soft/k8s_images/kube-apiserver_v1.23.6.tar
docker save registry:latest -o /app/soft/k8s_images/registry.tar
docker save quay.io/coreos/flannel:v0.15.1-amd64 -o /app/soft/k8s_images/flannel_v0.15.1.tar
docker save mirrorgcrio/coredns:1.7.0 -o /app/soft/k8s_images/coredns.tar
docker save mirrorgcrio/etcd:3.4.9 -o /app/soft/k8s_images/etcd.tar
docker save mirrorgcrio/pause:latest -o /app/soft/k8s_images/pause.tar
docker save nginx:latest -o /app/soft/k8s_images/nginx.tar
3. Compress the saved images and transfer them to the k8s master node
tar -cvzf /app/soft/k8s_images.tar.gz -C /app/soft k8s_images
scp -rp /app/soft/k8s_images.tar.gz 172.16.131.83:/app/soft
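The pull and save steps repeat the same image list twice, which makes it easy for the two lists to drift apart. A possible way to drive both from one list is sketched below. The helper image_tar_name and its naming scheme (replacing "/" and ":" with "_") are assumptions and differ from the hand-picked file names used above. The loop only prints the docker commands instead of running them.

```shell
#!/usr/bin/env bash
# Derive a tar file name from an image reference: "/" and ":" become "_".
# image_tar_name is a hypothetical helper with an assumed naming scheme.
image_tar_name() {
  echo "$1" | tr '/:' '__'
}

images="mirrorgcrio/kube-proxy:v1.23.6
mirrorgcrio/kube-scheduler:v1.23.6
mirrorgcrio/kube-controller-manager:v1.23.6
mirrorgcrio/kube-apiserver:v1.23.6
mirrorgcrio/coredns:1.7.0
mirrorgcrio/etcd:3.4.9
mirrorgcrio/pause:latest
registry:latest
quay.io/coreos/flannel:v0.15.1-amd64
nginx:latest"

out_dir=/app/soft/k8s_images
for image in $images; do
  # Printed only; remove the echo to actually pull and save.
  echo "docker pull $image"
  echo "docker save $image -o $out_dir/$(image_tar_name "$image").tar"
done
```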
On the master node of the k8s cluster:
1. Extract the images
tar -xvzf /app/soft/k8s_images.tar.gz -C /app/soft
2. Load the images
cd /app/soft/k8s_images
for i in `ls`
do
docker load -i $i
done
3. Re-tag the images for the private registry:
Generate the docker tag commands:
docker images|awk '{print "docker tag " $1 ":" $2 " 172.16.131.87:1088/kubernetes-deploy/" $1 ":" $2}'|sed 1d
Then run the generated commands:
docker tag registry:latest 172.16.131.87:1088/kubernetes-deploy/registry:latest
docker tag nginx:latest 172.16.131.87:1088/kubernetes-deploy/nginx:latest
docker tag mirrorgcrio/kube-apiserver:v1.23.6 172.16.131.87:1088/kubernetes-deploy/mirrorgcrio/kube-apiserver:v1.23.6
docker tag mirrorgcrio/kube-controller-manager:v1.23.6 172.16.131.87:1088/kubernetes-deploy/mirrorgcrio/kube-controller-manager:v1.23.6
docker tag mirrorgcrio/kube-proxy:v1.23.6 172.16.131.87:1088/kubernetes-deploy/mirrorgcrio/kube-proxy:v1.23.6
docker tag mirrorgcrio/kube-scheduler:v1.23.6 172.16.131.87:1088/kubernetes-deploy/mirrorgcrio/kube-scheduler:v1.23.6
docker tag quay.io/coreos/flannel:v0.15.1-amd64 172.16.131.87:1088/kubernetes-deploy/quay.io/coreos/flannel:v0.15.1-amd64
docker tag mirrorgcrio/etcd:3.4.9 172.16.131.87:1088/kubernetes-deploy/mirrorgcrio/etcd:3.4.9
docker tag mirrorgcrio/coredns:1.7.0 172.16.131.87:1088/kubernetes-deploy/mirrorgcrio/coredns:1.7.0
docker tag mirrorgcrio/pause:latest 172.16.131.87:1088/kubernetes-deploy/mirrorgcrio/pause:latest
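One detail worth checking in the re-tagging step: with --image-repository, kubeadm is expected to look for images such as REPOSITORY/kube-apiserver:TAG, without the upstream namespace in the path. The tags above keep the mirrorgcrio/ prefix, which may be why pulls fail later. The sketch below maps a source image to a target reference with the namespace dropped. The helper name harbor_ref is hypothetical, and the flat layout is an assumption to verify with kubeadm config images list.

```shell
#!/usr/bin/env bash
# Map an upstream image reference to a reference under the Harbor project,
# keeping only the final "name:tag" component.
# harbor_ref is a hypothetical helper; the flat layout is an assumption.
harbor_ref() {
  local src="$1"
  local registry="${2:-172.16.131.87:1088}"
  local project="${3:-kubernetes-deploy}"
  local name_tag="${src##*/}"   # strip everything up to the last "/"
  echo "${registry}/${project}/${name_tag}"
}

for image in mirrorgcrio/kube-apiserver:v1.23.6 \
             quay.io/coreos/flannel:v0.15.1-amd64 \
             nginx:latest; do
  # Printed only; remove the echo to actually tag.
  echo "docker tag $image $(harbor_ref "$image")"
done
```

If the upstream namespace should be preserved instead, as in the commands above, replace the parameter expansion with the full source reference.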
4. Log in to the private registry on every node, then push the newly tagged images to the registry from the master node:
docker login 172.16.131.87:1088
Username: admin
Password:
WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store
Login Succeeded
Generate the docker push commands:
docker images|grep "172.16.131.87"|awk '{print "docker push " $1 ":" $2}'
Then run the generated commands:
docker push 172.16.131.87:1088/kubernetes-deploy/registry:latest
docker push 172.16.131.87:1088/kubernetes-deploy/nginx:latest
docker push 172.16.131.87:1088/kubernetes-deploy/mirrorgcrio/kube-apiserver:v1.23.6
docker push 172.16.131.87:1088/kubernetes-deploy/mirrorgcrio/kube-controller-manager:v1.23.6
docker push 172.16.131.87:1088/kubernetes-deploy/mirrorgcrio/kube-proxy:v1.23.6
docker push 172.16.131.87:1088/kubernetes-deploy/mirrorgcrio/kube-scheduler:v1.23.6
docker push 172.16.131.87:1088/kubernetes-deploy/quay.io/coreos/flannel:v0.15.1-amd64
docker push 172.16.131.87:1088/kubernetes-deploy/mirrorgcrio/etcd:3.4.9
docker push 172.16.131.87:1088/kubernetes-deploy/mirrorgcrio/coredns:1.7.0
docker push 172.16.131.87:1088/kubernetes-deploy/mirrorgcrio/pause:latest
5. Initialize the Kubernetes cluster on the master node:
kubeadm init --kubernetes-version=1.23.6 --apiserver-advertise-address=172.16.131.83 --image-repository 172.16.131.87:1088/kubernetes-deploy --service-cidr=10.1.0.0/16 --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.23.6
[preflight] Running pre-flight checks
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 23.0.1. Latest validated version: 18.09
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.1.0.1 172.16.131.83]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master localhost] and IPs [172.16.131.83 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master localhost] and IPs [172.16.131.83 127.0.0.1 ::1]
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 17.002750 seconds
[upload-config] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.14" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --experimental-upload-certs
[mark-control-plane] Marking the node k8s-master as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node k8s-master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: f93xna.7kr79tn4z6fmzf23
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.16.131.83:6443 --token f93xna.7kr79tn4z6fmzf23 \
    --discovery-token-ca-cert-hash sha256:40dba9e45ffce1c08d415c44a962974f9081c1ae9e74e922a2410d4e0ebac590
6. Set up cluster access as the prompt instructs
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
7. Configure the flannel (or Calico) network so that containers on different hosts can communicate:
On the internet-connected relay machine, download the flannel YAML manifest:
curl -O https://raw.githubusercontent.com/coreos/flannel/a70459be0084506e4ec919aa1c114638878db11b/Documentation/kube-flannel.yml
Apply the flannel network on the master:
kubectl apply -f /apps/flannel/kube-flannel.yml
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.extensions/kube-flannel-ds-amd64 created
daemonset.extensions/kube-flannel-ds-arm64 created
daemonset.extensions/kube-flannel-ds-arm created
daemonset.extensions/kube-flannel-ds-ppc64le created
daemonset.extensions/kube-flannel-ds-s390x created
8. Following the prompt above, run the join command on the other nodes to add them to the Kubernetes cluster:
kubeadm join 172.16.131.83:6443 --token f93xna.7kr79tn4z6fmzf23 --discovery-token-ca-cert-hash sha256:40dba9e45ffce1c08d415c44a962974f9081c1ae9e74e922a2410d4e0ebac590
Alternatively, generate the join command on the master node and copy it to the other nodes to run:
kubeadm token create --print-join-command
kubeadm join 172.16.131.83:6443 --token r7oaex.qgqvdqvlyuubt5aw --discovery-token-ca-cert-hash sha256:40dba9e45ffce1c08d415c44a962974f9081c1ae9e74e922a2410d4e0ebac590
9. Output like the following on a node means it has joined the cluster. Nodes added to the Kubernetes cluster later join the same way:
[preflight] Running pre-flight checks
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 23.0.1. Latest validated version: 18.09
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.14" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
10. Check the cluster status:
kubectl get nodes
NAME         STATUS   ROLES    AGE   VERSION
k8s-master   Ready    master   1h    v1.23.6
k8s-node1    Ready    <none>   2h    v1.23.6
k8s-node2    Ready    <none>   1h    v1.23.6
k8s-node3    Ready    <none>   1h    v1.23.6
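Note:
After the deployment finished, the worker nodes stayed in the NotReady state for a long time:
NAME         STATUS     ROLES    AGE   VERSION
k8s-master   Ready      master   1h    v1.23.6
k8s-node1    NotReady   <none>   2h    v1.23.6
k8s-node2    NotReady   <none>   1h    v1.23.6
k8s-node3    NotReady   <none>   1h    v1.23.6
The cluster is not healthy in this state, so it needs troubleshooting. Run the following command on a k8s node to follow the kubelet log:
journalctl -u kubelet -f
The log showed two errors:
1) The first error:
k8s-node1 kubelet[27242]: I1014 11:17:29.409068 27242 cni.go:239] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
Oct 14 11:17:29 k8s-node1 kubelet[27242]: E1014 11:17:29.996079 27242 kubelet.go:2332] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"
2) The second error: images could not be pulled from the 172.16.131.87:1088/kubernetes-deploy repository because authentication failed.
Fix for problem 1: the other nodes are missing the CNI configuration files, so copy the network configuration from the master node to them:
scp -rp /etc/cni k8s-node1:/etc/
scp -rp /etc/cni k8s-node2:/etc/
scp -rp /etc/cni k8s-node3:/etc/
All nodes then report Ready, which means the Kubernetes cluster is in the correct state.
Problem 2 occurred because the kubernetes-deploy project was set to private when the registry was built and the k8s images were uploaded, so the images could not be downloaded. The simplest fix is to set the project to public in Harbor (how to pull from a private project will be discussed later).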
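When checking node health repeatedly during troubleshooting, it helps to extract only the nodes that are not Ready. The filter below is a small sketch that reads kubectl get nodes style output from standard input. The function name not_ready_nodes is hypothetical, and the sample text stands in for real kubectl output.

```shell
#!/usr/bin/env bash
# Print the names of nodes whose STATUS column is anything other than "Ready".
# Reads "kubectl get nodes" style output on stdin; not_ready_nodes is a hypothetical helper.
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Sample text standing in for real kubectl output.
sample='NAME STATUS ROLES AGE VERSION
k8s-master Ready master 1h v1.23.6
k8s-node1 NotReady <none> 2h v1.23.6
k8s-node2 Ready <none> 1h v1.23.6
k8s-node3 NotReady <none> 1h v1.23.6'

printf '%s\n' "$sample" | not_ready_nodes
# On the master node: kubectl get nodes | not_ready_nodes
```

A combined status such as Ready,SchedulingDisabled is also reported by this filter, since it compares the whole STATUS field.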
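At this point, the offline kubeadm-based installation and deployment of k8s on Red Hat 7 is complete. The next step is to deploy an nginx to verify that the whole cluster works.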
#!/bin/sh
# Nitin Jerath - Aug 2005
#Usage sshUserSetup.sh -user [ -hosts \"\" | -hostfile ] [ -advanced ] [ -verify] [ -exverify ] [ -logfile ] [-confirm] [-shared] [-help] [-usePassphrase] [-noPromptPassphrase]
#eg. sshUserSetup.sh -hosts "host1 host2" -user njerath -advanced
#This script is used to setup SSH connectivity from the host on which it is
# run to the specified remote hosts. After this script is run, the user can use
# SSH to run commands on the remote hosts or copy files between the local host
# and the remote hosts without being prompted for passwords or confirmations.
# The list of remote hosts and the user name on the remote host is specified as
# a command line parameter to the script. Note that in case the user on the
# remote host has its home directory NFS mounted or shared across the remote
# hosts, this script should be used with -shared option.
#Specifying the -advanced option on the command line would result in SSH
# connectivity being setup among the remote hosts which means that SSH can be
# used to run commands on one remote host from the other remote host or copy
# files between the remote hosts without being prompted for passwords or
# confirmations.
#Please note that the script would remove write permissions on the remote hosts
#for the user home directory and ~/.ssh directory for "group" and "others". This
# is an SSH requirement. The user would be explicitly informed about this by teh script and prompted to continue. In case the user presses no, the script would exit. In case the user does not want to be prompted, he can use -confirm option.
# As a part of the setup, the script would use SSH to create files within ~/.ssh
# directory of the remote node and to setup the requisite permissions. The
#script also uses SCP to copy the local host public key to the remote hosts so
# that the remote hosts trust the local host for SSH. At the time, the script
#performs these steps, SSH connectivity has not been completely setup hence
# the script would prompt the user for the remote host password.
#For each remote host, for remote users with non-shared homes this would be
# done once for SSH and once for SCP. If the number of remote hosts are x, the
# user would be prompted 2x times for passwords. For remote users with shared
# homes, the user would be prompted only twice, once each for SCP and SSH.
#For security reasons, the script does not save passwords and reuse it. Also,
# for security reasons, the script does not accept passwords redirected from a
#file. The user has to key in the confirmations and passwords at the prompts.
#The -verify option means that the user just wants to verify whether SSH has
#been set up. In this case, the script would not setup SSH but would only check
# whether SSH connectivity has been setup from the local host to the remote
# hosts. The script would run the date command on each remote host using SSH. In
# case the user is prompted for a password or sees a warning message for a
#particular host, it means SSH connectivity has not been setup correctly for
# that host.
#In case the -verify option is not specified, the script would setup SSH and
#then do the verification as well.
#In case the user speciies the -exverify option, an exhaustive verification would be done. In that case, the following would be checked:
# 1. SSH connectivity from local host to all remote hosts.
# 2. SSH connectivity from each remote host to itself and other remote hosts.

#echo Parsing command line arguments
numargs=$#

ADVANCED=false
HOSTNAME=`hostname`
CONFIRM=no
SHARED=false
i=1
USR=$USER

if test -z "$TEMP"
then
  TEMP=/tmp
fi

IDENTITY=id_rsa
LOGFILE=$TEMP/sshUserSetup_`date +%F-%H-%M-%S`.log
VERIFY=false
EXHAUSTIVE_VERIFY=false
HELP=false
PASSPHRASE=no
RERUN_SSHKEYGEN=no
NO_PROMPT_PASSPHRASE=no

while [ $i -le $numargs ]
do
  j=$1
  if [ $j = "-hosts" ]
  then
    HOSTS=$2
    shift 1
    i=`expr $i + 1`
  fi
  if [ $j = "-user" ]
  then
    USR=$2
    shift 1
    i=`expr $i + 1`
  fi
  if [ $j = "-logfile" ]
  then
    LOGFILE=$2
    shift 1
    i=`expr $i + 1`
  fi
  if [ $j = "-confirm" ]
  then
    CONFIRM=yes
  fi
  if [ $j = "-hostfile" ]
  then
    CLUSTER_CONFIGURATION_FILE=$2
    shift 1
    i=`expr $i + 1`
  fi
  if [ $j = "-usePassphrase" ]
  then
    PASSPHRASE=yes
  fi
  if [ $j = "-noPromptPassphrase" ]
  then
    NO_PROMPT_PASSPHRASE=yes
  fi
  if [ $j = "-shared" ]
  then
    SHARED=true
  fi
  if [ $j = "-exverify" ]
  then
    EXHAUSTIVE_VERIFY=true
  fi
  if [ $j = "-verify" ]
  then
    VERIFY=true
  fi
  if [ $j = "-advanced" ]
  then
    ADVANCED=true
  fi
  if [ $j = "-help" ]
  then
    HELP=true
  fi
  i=`expr $i + 1`
  shift 1
done

if [ $HELP = "true" ]
then
  echo "Usage $0 -user [ -hosts \"\" | -hostfile ] [ -advanced ] [ -verify] [ -exverify ] [ -logfile ] [-confirm] [-shared] [-help] [-usePassphrase] [-noPromptPassphrase]"
  echo "This script is used to setup SSH connectivity from the host on which it is run to the specified remote hosts. After this script is run, the user can use SSH to run commands on the remote hosts or copy files between the local host and the remote hosts without being prompted for passwords or confirmations. The list of remote hosts and the user name on the remote host is specified as a command line parameter to the script. "
  echo "-user : User on remote hosts. "
  echo "-hosts : Space separated remote hosts list. "
  echo "-hostfile : The user can specify the host names either through the -hosts option or by specifying the absolute path of a cluster configuration file. 
A sample host file contents are below: "echoecho " stacg30 stacg30int 10.1.0.0 stacg30v -"echo " stacg34 stacg34int 10.1.0.1 stacg34v -"echoecho " The first column in each row of the host file will be used as the host name."echoecho "-usePassphrase : The user wants to set up passphrase to encrypt the private key on the local host. "echo "-noPromptPassphrase : The user does not want to be prompted for passphrase related questions. This is for users who want the default behavior to be followed."echo "-shared : In case the user on the remote host has its home directory NFS mounted or shared across the remote hosts, this script should be used with -shared option. "echo " It is possible for the user to determine whether a user's home directory is shared or non-shared. Let us say we want to determine that user user1's home directory is shared across hosts A, B and C."echo " Follow the following steps:"echo " 1. On host A, touch ~user1/checkSharedHome.tmp"echo " 2. On hosts B and C, ls -al ~user1/checkSharedHome.tmp"echo " 3. If the file is present on hosts B and C in ~user1 directory and"echo " is identical on all hosts A, B, C, it means that the user's home "echo " directory is shared."echo " 4. On host A, rm -f ~user1/checkSharedHome.tmp"echo " In case the user accidentally passes -shared option for non-shared homes or viceversa,SSH connectivity would only be set up for a subset of the hosts. The user would have to re-run the setyp script with the correct option to rectify this problem."echo "-advanced : Specifying the -advanced option on the command line would result in SSH connectivity being setup among the remote hosts which means that SSH can be used to run commands on one remote host from the other remote host or copy files between the remote hosts without being prompted for passwords or confirmations."echo "-confirm: The script would remove write permissions on the remote hosts for the user home directory and ~/.ssh directory for "group" and "others". 
This is an SSH requirement. The user would be explicitly informed about this by the script and prompted to continue. In case the user presses no, the script would exit. In case the user does not want to be prompted, he can use -confirm option."echo "As a part of the setup, the script would use SSH to create files within ~/.ssh directory of the remote node and to setup the requisite permissions. The script also uses SCP to copy the local host public key to the remote hosts so that the remote hosts trust the local host for SSH. At the time, the script performs these steps, SSH connectivity has not been completely setup hence the script would prompt the user for the remote host password. "echo "For each remote host, for remote users with non-shared homes this would be done once for SSH and once for SCP. If the number of remote hosts are x, the user would be prompted 2x times for passwords. For remote users with shared homes, the user would be prompted only twice, once each for SCP and SSH. For security reasons, the script does not save passwords and reuse it. Also, for security reasons, the script does not accept passwords redirected from a file. The user has to key in the confirmations and passwords at the prompts. "echo "-verify : -verify option means that the user just wants to verify whether SSH has been set up. In this case, the script would not setup SSH but would only check whether SSH connectivity has been setup from the local host to the remote hosts. The script would run the date command on each remote host using SSH. In case the user is prompted for a password or sees a warning message for a particular host, it means SSH connectivity has not been setup correctly for that host. In case the -verify option is not specified, the script would setup SSH and then do the verification as well. "echo "-exverify : In case the user speciies the -exverify option, an exhaustive verification for all hosts would be done. 
In that case, the following would be checked: "echo " 1. SSH connectivity from local host to all remote hosts. "echo " 2. SSH connectivity from each remote host to itself and other remote hosts. "echo The -exverify option can be used in conjunction with the -verify option as well to do an exhaustive verification once the setup has been done.echo "Taking some examples: Let us say local host is Z, remote hosts are A,B and C. Local user is njerath. Remote users are racqa(non-shared), aime(shared)."echo "$0 -user racqa -hosts "A B C" -advanced -exverify -confirm"echo "Script would set up connectivity from Z -> A, Z -> B, Z -> C, A -> A, A -> B, A -> C, B -> A, B -> B, B -> C, C -> A, C -> B, C -> C."echo "Since user has given -exverify option, all these scenario would be verified too."echoecho "Now the user runs : $0 -user racqa -hosts "A B C" -verify"echo "Since -verify option is given, no SSH setup would be done, only verification of existing setup. Also, since -exverify or -advanced options are not given, script would only verify connectivity from Z -> A, Z -> B, Z -> C"echo "Now the user runs : $0 -user racqa -hosts "A B C" -verify -advanced"echo "Since -verify option is given, no SSH setup would be done, only verification of existing setup. Also, since -advanced options is given, script would verify connectivity from Z -> A, Z -> B, Z -> C, A-> A, A->B, A->C, A->D"echo "Now the user runs:"echo "$0 -user aime -hosts "A B C" -confirm -shared"echo "Script would set up connectivity between Z->A, Z->B, Z->C only since advanced option is not given."echo "All these scenarios would be verified too."exitfiif test -z "$HOSTS"thenif test -n "$CLUSTER_CONFIGURATION_FILE" && test -f "$CLUSTER_CONFIGURATION_FILE"thenHOSTS=`awk '$1 !~ /^#/ { str = str " " $1 } END { print str }' $CLUSTER_CONFIGURATION_FILE`elif ! 
test -f "$CLUSTER_CONFIGURATION_FILE"thenecho "Please specify a valid and existing cluster configuration file."fifiif test -z "$HOSTS" || test -z $USRthenecho "Either user name or host information is missing"echo "Usage $0 -user [ -hosts \\\\"\\\\" | -hostfile ] [ -advanced ] [ -verify] [ -exverify ] [ -logfile ] [-confirm] [-shared] [-help] [-usePassphrase] [-noPromptPassphrase]"exit 1fiif [ -d $LOGFILE ]; thenecho $LOGFILE is a directory, setting logfile to $LOGFILE/ssh.logLOGFILE=$LOGFILE/ssh.logfiecho The output of this script is also logged into $LOGFILE | tee -a $LOGFILEif [ `echo $?` != 0 ]; thenecho Error writing to the logfile $LOGFILE, Exitingexit 1fiecho Hosts are $HOSTS | tee -a $LOGFILEecho user is $USR | tee -a $LOGFILESSH="/usr/bin/ssh"SCP="/usr/bin/scp"SSH_KEYGEN="/usr/bin/ssh-keygen"calculateOS(){platform=`uname -s`case "$platform"in"SunOS") os=solaris;;"Linux") os=linux;;"HP-UX") os=hpunix;;"AIX") os=aix;;*) echo "Sorry, $platform is not currently supported." | tee -a $LOGFILEexit 1;;esacecho "Platform:- $platform " | tee -a $LOGFILE}calculateOSBITS=1024ENCR="rsa"deadhosts=""alivehosts=""if [ $platform = "Linux" ]thenPING="/bin/ping"elsePING="/usr/sbin/ping"fi#bug 9044791if [ -n "$SSH_PATH" ]; thenSSH=$SSH_PATHfiif [ -n "$SCP_PATH" ]; thenSCP=$SCP_PATHfiif [ -n "$SSH_KEYGEN_PATH" ]; thenSSH_KEYGEN=$SSH_KEYGEN_PATHfiif [ -n "$PING_PATH" ]; thenPING=$PING_PATHfiPATH_ERROR=0if test ! -x $SSH ; thenecho "ssh not found at $SSH. Please set the variable SSH_PATH to the correct location of ssh and retry."PATH_ERROR=1fiif test ! -x $SCP ; thenecho "scp not found at $SCP. Please set the variable SCP_PATH to the correct location of scp and retry."PATH_ERROR=1fiif test ! -x $SSH_KEYGEN ; thenecho "ssh-keygen not found at $SSH_KEYGEN. Please set the variable SSH_KEYGEN_PATH to the correct location of ssh-keygen and retry."PATH_ERROR=1fiif test ! -x $PING ; thenecho "ping not found at $PING. 
Please set the variable PING_PATH to the correct location of ping and retry."PATH_ERROR=1fiif [ $PATH_ERROR = 1 ]; thenecho "ERROR: one or more of the required binaries not found, exiting"exit 1fi#9044791 endecho Checking if the remote hosts are reachable | tee -a $LOGFILEfor host in $HOSTSdoif [ $platform = "SunOS" ]; then$PING -s $host 5 5elif [ $platform = "HP-UX" ]; then$PING $host -n 5 -m 5else$PING -c 5 -w 5 $hostfiexitcode=`echo $?`if [ $exitcode = 0 ]thenalivehosts="$alivehosts $host"elsedeadhosts="$deadhosts $host"fidoneif test -z "$deadhosts"thenecho Remote host reachability check succeeded. | tee -a $LOGFILEecho The following hosts are reachable: $alivehosts. | tee -a $LOGFILEecho The following hosts are not reachable: $deadhosts. | tee -a $LOGFILEecho All hosts are reachable. Proceeding further... | tee -a $LOGFILEelseecho Remote host reachability check failed. | tee -a $LOGFILEecho The following hosts are reachable: $alivehosts. | tee -a $LOGFILEecho The following hosts are not reachable: $deadhosts. | tee -a $LOGFILEecho Please ensure that all the hosts are up and re-run the script. | tee -a $LOGFILEecho Exiting now... | tee -a $LOGFILEexit 1fifirsthost=`echo $HOSTS | awk '{print $1}; END { }'`echo firsthost $firsthostnumhosts=`echo $HOSTS | awk '{ }; END {print NF}'`echo numhosts $numhostsif [ $VERIFY = "true" ]thenecho Since user has specified -verify option, SSH setup would not be done. Only, existing SSH setup would be verified. | tee -a $LOGFILEcontinueelseecho The script will setup SSH connectivity from the host ''`hostname`'' to all | tee -a $LOGFILEecho the remote hosts. After the script is executed, the user can use SSH to run | tee -a $LOGFILEecho commands on the remote hosts or copy files between this host ''`hostname`'' | tee -a $LOGFILEecho and the remote hosts without being prompted for passwords or confirmations. 
| tee -a $LOGFILEecho | tee -a $LOGFILEecho NOTE 1: | tee -a $LOGFILEecho As part of the setup procedure, this script will use 'ssh' and 'scp' to copy | tee -a $LOGFILEecho files between the local host and the remote hosts. Since the script does not | tee -a $LOGFILEecho store passwords, you may be prompted for the passwords during the execution of | tee -a $LOGFILEecho the script whenever 'ssh' or 'scp' is invoked. | tee -a $LOGFILEecho | tee -a $LOGFILEecho NOTE 2: | tee -a $LOGFILEecho "AS PER SSH REQUIREMENTS, THIS SCRIPT WILL SECURE THE USER HOME DIRECTORY" | tee -a $LOGFILEecho AND THE .ssh DIRECTORY BY REVOKING GROUP AND WORLD WRITE PRIVILEGES TO THESE | tee -a $LOGFILEecho "directories." | tee -a $LOGFILEecho | tee -a $LOGFILEecho "Do you want to continue and let the script make the above mentioned changes (yes/no)?" | tee -a $LOGFILEif [ "$CONFIRM" = "no" ]thenread CONFIRMelseecho "Confirmation provided on the command line" | tee -a $LOGFILEfiecho | tee -a $LOGFILEecho The user chose ''$CONFIRM'' | tee -a $LOGFILEif [ -z "$CONFIRM" -o "$CONFIRM" != "yes" -a "$CONFIRM" != "no" ]thenecho "You haven't specified proper input. Please enter 'yes' or 'no'. Exiting...."exit 0fiif [ "$CONFIRM" = "no" ]thenecho "SSH setup is not done." | tee -a $LOGFILEexit 1elseif [ $NO_PROMPT_PASSPHRASE = "yes" ]thenecho "User chose to skip passphrase related questions." | tee -a $LOGFILEelseif [ $SHARED = "true" ]then hostcount=`expr ${numhosts} + 1` PASSPHRASE_PROMPT=`expr 2 \\* $hostcount`else PASSPHRASE_PROMPT=`expr 2 \\* ${numhosts}`fiecho "Please specify if you want to specify a passphrase for the private key this script will create for the local host. Passphrase is used to encrypt the private key and makes SSH much more secure. Type 'yes' or 'no' and then press enter. In case you press 'yes', you would need to enter the passphrase whenever the script executes ssh or scp. 
$PASSPHRASE " | tee -a $LOGFILEecho "The estimated number of times the user would be prompted for a passphrase is $PASSPHRASE_PROMPT. In addition, if the private-public files are also newly created, the user would have to specify the passphrase on one additional occasion. " | tee -a $LOGFILEecho "Enter 'yes' or 'no'." | tee -a $LOGFILEif [ "$PASSPHRASE" = "no" ]thenread PASSPHRASEelseecho "Confirmation provided on the command line" | tee -a $LOGFILEfiecho | tee -a $LOGFILEecho The user chose ''$PASSPHRASE'' | tee -a $LOGFILEif [ -z "$PASSPHRASE" -o "$PASSPHRASE" != "yes" -a "$PASSPHRASE" != "no" ]thenecho "You haven't specified whether to use Passphrase or not. Please specify 'yes' or 'no'. Exiting..."exit 0fiif [ "$PASSPHRASE" = "yes" ]thenRERUN_SSHKEYGEN="yes"#Checking for existence of ${IDENTITY} fileif test -f $HOME/.ssh/${IDENTITY}.pub && test -f $HOME/.ssh/${IDENTITY}then echo "The files containing the client public and private keys already exist on the local host. The current private key may or may not have a passphrase associated with it. In case you remember the passphrase and do not want to re-run ssh-keygen, press 'no' and enter. If you press 'no', the script will not attempt to create any new public/private key pairs. If you press 'yes', the script will remove the old private/public key files existing and create new ones prompting the user to enter the passphrase. If you enter 'yes', any previous SSH user setups would be reset. If you press 'change', the script will associate a new passphrase with the old keys." | tee -a $LOGFILE echo "Press 'yes', 'no' or 'change'" | tee -a $LOGFILEread RERUN_SSHKEYGENecho The user chose ''$RERUN_SSHKEYGEN'' | tee -a $LOGFILE if [ -z "$RERUN_SSHKEYGEN" -o "$RERUN_SSHKEYGEN" != "yes" -a "$RERUN_SSHKEYGEN" != "no" -a "$RERUN_SSHKEYGEN" != "change" ] then echo "You haven't specified whether to re-run 'ssh-keygen' or not. Please enter 'yes' , 'no' or 'change'. Exiting..." 
exit 0; fifielseif test -f $HOME/.ssh/${IDENTITY}.pub && test -f $HOME/.ssh/${IDENTITY}thenecho "The files containing the client public and private keys already exist on the local host. The current private key may have a passphrase associated with it. In case you find using passphrase inconvenient(although it is more secure), you can change to it empty through this script. Press 'change' if you want the script to change the passphrase for you. Press 'no' if you want to use your old passphrase, if you had one."read RERUN_SSHKEYGENecho The user chose ''$RERUN_SSHKEYGEN'' | tee -a $LOGFILE if [ -z "$RERUN_SSHKEYGEN" -o "$RERUN_SSHKEYGEN" != "yes" -a "$RERUN_SSHKEYGEN" != "no" -a "$RERUN_SSHKEYGEN" != "change" ] then echo "You haven't specified whether to re-run 'ssh-keygen' or not. Please enter 'yes' , 'no' or 'change'. Exiting..." exit 0 fifififiecho Creating .ssh directory on local host, if not present already | tee -a $LOGFILEmkdir -p $HOME/.ssh | tee -a $LOGFILEecho Creating authorized_keys file on local host | tee -a $LOGFILEtouch $HOME/.ssh/authorized_keys | tee -a $LOGFILEecho Changing permissions on authorized_keys to 644 on local host | tee -a $LOGFILEchmod 644 $HOME/.ssh/authorized_keys | tee -a $LOGFILEmv -f $HOME/.ssh/authorized_keys $HOME/.ssh/authorized_keys.tmp | tee -a $LOGFILEecho Creating known_hosts file on local host | tee -a $LOGFILEtouch $HOME/.ssh/known_hosts | tee -a $LOGFILEecho Changing permissions on known_hosts to 644 on local host | tee -a $LOGFILEchmod 644 $HOME/.ssh/known_hosts | tee -a $LOGFILEmv -f $HOME/.ssh/known_hosts $HOME/.ssh/known_hosts.tmp | tee -a $LOGFILEecho Creating config file on local host | tee -a $LOGFILEecho If a config file exists already at $HOME/.ssh/config, it would be backed up to $HOME/.ssh/config.backup.echo "Host *" > $HOME/.ssh/config.tmp | tee -a $LOGFILEecho "ForwardX11 no" >> $HOME/.ssh/config.tmp | tee -a $LOGFILEif test -f $HOME/.ssh/configthencp -f $HOME/.ssh/config $HOME/.ssh/config.backupfimv -f 
$HOME/.ssh/config.tmp $HOME/.ssh/config | tee -a $LOGFILEchmod 644 $HOME/.ssh/configif [ "$RERUN_SSHKEYGEN" = "yes" ]thenecho Removing old private/public keys on local host | tee -a $LOGFILErm -f $HOME/.ssh/${IDENTITY} | tee -a $LOGFILErm -f $HOME/.ssh/${IDENTITY}.pub | tee -a $LOGFILEecho Running SSH keygen on local host | tee -a $LOGFILE$SSH_KEYGEN -t $ENCR -b $BITS -f $HOME/.ssh/${IDENTITY} | tee -a $LOGFILEelif [ "$RERUN_SSHKEYGEN" = "change" ]thenecho Running SSH Keygen on local host to change the passphrase associated with the existing private key | tee -a $LOGFILE$SSH_KEYGEN -p -t $ENCR -b $BITS -f $HOME/.ssh/${IDENTITY} | tee -a $LOGFILEelif test -f $HOME/.ssh/${IDENTITY}.pub && test -f $HOME/.ssh/${IDENTITY}thencontinueelseecho Removing old private/public keys on local host | tee -a $LOGFILErm -f $HOME/.ssh/${IDENTITY} | tee -a $LOGFILErm -f $HOME/.ssh/${IDENTITY}.pub | tee -a $LOGFILEecho Running SSH keygen on local host with empty passphrase | tee -a $LOGFILE$SSH_KEYGEN -t $ENCR -b $BITS -f $HOME/.ssh/${IDENTITY} -N '' | tee -a $LOGFILEfiif [ $SHARED = "true" ]thenif [ $USER = $USR ]then#No remote operations requiredecho Remote user is same as local user | tee -a $LOGFILEREMOTEHOSTS=""chmod og-w $HOME $HOME/.ssh | tee -a $LOGFILEelseREMOTEHOSTS="${firsthost}"fielseREMOTEHOSTS="$HOSTS"fifor host in $REMOTEHOSTSdoecho Creating .ssh directory and setting permissions on remote host $host | tee -a $LOGFILEecho "THE SCRIPT WOULD ALSO BE REVOKING WRITE PERMISSIONS FOR "group" AND "others" ON THE HOME DIRECTORY FOR $USR. THIS IS AN SSH REQUIREMENT." | tee -a $LOGFILEecho The script would create ~$USR/.ssh/config file on remote host $host. If a config file exists already at ~$USR/.ssh/config, it would be backed up to ~$USR/.ssh/config.backup. | tee -a $LOGFILEecho The user may be prompted for a password here since the script would be running SSH on host $host. 
| tee -a $LOGFILE$SSH -o StrictHostKeyChecking=no -x -l $USR $host "/bin/sh -c \\\\" mkdir -p .ssh ; chmod og-w . .ssh; touch .ssh/authorized_keys .ssh/known_hosts; chmod 644 .ssh/authorized_keys .ssh/known_hosts; cp .ssh/authorized_keys .ssh/authorized_keys.tmp ; cp .ssh/known_hosts .ssh/known_hosts.tmp; echo \\\\\\\\"Host *\\\\\\\\" > .ssh/config.tmp; echo \\\\\\\\"ForwardX11 no\\\\\\\\" >> .ssh/config.tmp; if test -f .ssh/config ; then cp -f .ssh/config .ssh/config.backup; fi ; mv -f .ssh/config.tmp .ssh/config\\\\"" | tee -a $LOGFILEecho Done with creating .ssh directory and setting permissions on remote host $host. | tee -a $LOGFILEdonefor host in $REMOTEHOSTSdoecho Copying local host public key to the remote host $host | tee -a $LOGFILEecho The user may be prompted for a password or passphrase here since the script would be using SCP for host $host. | tee -a $LOGFILE$SCP $HOME/.ssh/${IDENTITY}.pub $USR@$host:.ssh/authorized_keys | tee -a $LOGFILEecho Done copying local host public key to the remote host $host | tee -a $LOGFILEdonecat $HOME/.ssh/${IDENTITY}.pub >> $HOME/.ssh/authorized_keys | tee -a $LOGFILEfor host in $HOSTSdoif [ "$ADVANCED" = "true" ]thenecho Creating keys on remote host $host if they do not exist already. This is required to setup SSH on host $host. 
| tee -a $LOGFILEif [ "$SHARED" = "true" ]thenIDENTITY_FILE_NAME=${IDENTITY}_$hostCOALESCE_IDENTITY_FILES_COMMAND="cat .ssh/${IDENTITY_FILE_NAME}.pub >> .ssh/authorized_keys"elseIDENTITY_FILE_NAME=${IDENTITY}fi$SSH -o StrictHostKeyChecking=no -x -l $USR $host " /bin/sh -c \\\\"if test -f .ssh/${IDENTITY_FILE_NAME}.pub && test -f .ssh/${IDENTITY_FILE_NAME}; then echo; else rm -f .ssh/${IDENTITY_FILE_NAME} ; rm -f .ssh/${IDENTITY_FILE_NAME}.pub ; $SSH_KEYGEN -t $ENCR -b $BITS -f .ssh/${IDENTITY_FILE_NAME} -N '' ; fi; ${COALESCE_IDENTITY_FILES_COMMAND} \\\\"" | tee -a $LOGFILEelse#At least get the host keys from all hosts for shared case - advanced option not setif test $SHARED = "true" && test $ADVANCED = "false"thenif [ "$PASSPHRASE" = "yes" ]then echo "The script will fetch the host keys from all hosts. The user may be prompted for a passphrase here in case the private key has been encrypted with a passphrase." | tee -a $LOGFILEfi$SSH -o StrictHostKeyChecking=no -x -l $USR $host "/bin/sh -c true"fifidonefor host in $REMOTEHOSTSdoif test $ADVANCED = "true" && test $SHARED = "false"then$SCP $USR@$host:.ssh/${IDENTITY}.pub $HOME/.ssh/${IDENTITY}.pub.$host | tee -a $LOGFILEcat $HOME/.ssh/${IDENTITY}.pub.$host >> $HOME/.ssh/authorized_keys | tee -a $LOGFILErm -f $HOME/.ssh/${IDENTITY}.pub.$host | tee -a $LOGFILEfidonefor host in $REMOTEHOSTSdoif [ "$ADVANCED" = "true" ]thenif [ "$SHARED" != "true" ]thenecho Updating authorized_keys file on remote host $host | tee -a $LOGFILE$SCP $HOME/.ssh/authorized_keys $USR@$host:.ssh/authorized_keys | tee -a $LOGFILEfiecho Updating known_hosts file on remote host $host | tee -a $LOGFILE$SCP $HOME/.ssh/known_hosts $USR@$host:.ssh/known_hosts | tee -a $LOGFILEfiif [ "$PASSPHRASE" = "yes" ]then echo "The script will run SSH on the remote machine $host. The user may be prompted for a passphrase here in case the private key has been encrypted with a passphrase." 
| tee -a $LOGFILEfi$SSH -x -l $USR $host "/bin/sh -c \\\\"cat .ssh/authorized_keys.tmp >> .ssh/authorized_keys; cat .ssh/known_hosts.tmp >> .ssh/known_hosts; rm -f .ssh/known_hosts.tmp .ssh/authorized_keys.tmp\\\\"" | tee -a $LOGFILEdonecat $HOME/.ssh/known_hosts.tmp >> $HOME/.ssh/known_hosts | tee -a $LOGFILEcat $HOME/.ssh/authorized_keys.tmp >> $HOME/.ssh/authorized_keys | tee -a $LOGFILE#Added chmod to fix BUG NO 5238814chmod 644 $HOME/.ssh/authorized_keys#Fix for BUG NO 5157782chmod 644 $HOME/.ssh/configrm -f $HOME/.ssh/known_hosts.tmp $HOME/.ssh/authorized_keys.tmp | tee -a $LOGFILEecho SSH setup is complete. | tee -a $LOGFILEfifiecho | tee -a $LOGFILEecho ------------------------------------------------------------------------ | tee -a $LOGFILEecho Verifying SSH setup | tee -a $LOGFILEecho =================== | tee -a $LOGFILEecho The script will now run the 'date' command on the remote nodes using ssh | tee -a $LOGFILEecho to verify if ssh is setup correctly. IF THE SETUP IS CORRECTLY SETUP, | tee -a $LOGFILEecho THERE SHOULD BE NO OUTPUT OTHER THAN THE DATE AND SSH SHOULD NOT ASK FOR | tee -a $LOGFILEecho PASSWORDS. If you see any output other than date or are prompted for the | tee -a $LOGFILEecho password, ssh is not setup correctly and you will need to resolve the | tee -a $LOGFILEecho issue and set up ssh again. | tee -a $LOGFILEecho The possible causes for failure could be: | tee -a $LOGFILEecho 1. The server settings in /etc/ssh/sshd_config file do not allow ssh | tee -a $LOGFILEecho for user $USR. | tee -a $LOGFILEecho 2. The server may have disabled public key based authentication.echo 3. The client public key on the server may be outdated.echo 4. ~$USR or ~$USR/.ssh on the remote host may not be owned by $USR. | tee -a $LOGFILEecho 5. User may not have passed -shared option for shared remote users or | tee -a $LOGFILEecho may be passing the -shared option for non-shared remote users. | tee -a $LOGFILEecho 6. 
If there is output in addition to the date, but no password is asked, | tee -a $LOGFILEecho it may be a security alert shown as part of company policy. Append the | tee -a $LOGFILEecho "additional text to the /sysman/prov/resources/ignoreMessages.txt file." | tee -a $LOGFILEecho ------------------------------------------------------------------------ | tee -a $LOGFILE#read -t 30 dummyfor host in $HOSTSdoecho --$host:-- | tee -a $LOGFILEecho Running $SSH -x -l $USR $host date to verify SSH connectivity has been setup from local host to $host. | tee -a $LOGFILEecho "IF YOU SEE ANY OTHER OUTPUT BESIDES THE OUTPUT OF THE DATE COMMAND OR IF YOU ARE PROMPTED FOR A PASSWORD HERE, IT MEANS SSH SETUP HAS NOT BEEN SUCCESSFUL. Please note that being prompted for a passphrase may be OK but being prompted for a password is ERROR." | tee -a $LOGFILEif [ "$PASSPHRASE" = "yes" ]thenecho "The script will run SSH on the remote machine $host. The user may be prompted for a passphrase here in case the private key has been encrypted with a passphrase." | tee -a $LOGFILEfi$SSH -l $USR $host "/bin/sh -c date" | tee -a $LOGFILEecho ------------------------------------------------------------------------ | tee -a $LOGFILEdoneif [ "$EXHAUSTIVE_VERIFY" = "true" ]thenfor clienthost in $HOSTSdoif [ "$SHARED" = "true" ]thenREMOTESSH="$SSH -i .ssh/${IDENTITY}_${clienthost}"elseREMOTESSH=$SSHfifor serverhost in $HOSTSdoecho ------------------------------------------------------------------------ | tee -a $LOGFILEecho Verifying SSH connectivity has been setup from $clienthost to $serverhost | tee -a $LOGFILEecho ------------------------------------------------------------------------ | tee -a $LOGFILEecho "IF YOU SEE ANY OTHER OUTPUT BESIDES THE OUTPUT OF THE DATE COMMAND OR IF YOU ARE PROMPTED FOR A PASSWORD HERE, IT MEANS SSH SETUP HAS NOT BEEN SUCCESSFUL." 
| tee -a $LOGFILE$SSH -l $USR $clienthost "$REMOTESSH $serverhost \\\\"/bin/sh -c date\\\\"" | tee -a $LOGFILEecho ------------------------------------------------------------------------ | tee -a $LOGFILEdoneecho -Verification from $clienthost complete- | tee -a $LOGFILEdoneelseif [ "$ADVANCED" = "true" ]thenif [ "$SHARED" = "true" ]thenREMOTESSH="$SSH -i .ssh/${IDENTITY}_${firsthost}"elseREMOTESSH=$SSHfifor host in $HOSTSdoecho ------------------------------------------------------------------------ | tee -a $LOGFILEecho Verifying SSH connectivity has been setup from $firsthost to $host | tee -a $LOGFILEecho "IF YOU SEE ANY OTHER OUTPUT BESIDES THE OUTPUT OF THE DATE COMMAND OR IF YOU ARE PROMPTED FOR A PASSWORD HERE, IT MEANS SSH SETUP HAS NOT BEEN SUCCESSFUL." | tee -a $LOGFILE$SSH -l $USR $firsthost "$REMOTESSH $host \\\\"/bin/sh -c date\\\\"" | tee -a $LOGFILEecho ------------------------------------------------------------------------ | tee -a $LOGFILEdoneecho -Verification from $clienthost complete- | tee -a $LOGFILEfifiecho "SSH verification complete." | tee -a $LOGFILE
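As a usage illustration for the script above, the sketch below assembles an invocation for the four k8s nodes listed in the environment table. The script path, the use of the root user, the choice of hosts and the `RUN` switch are assumptions made for this example; only the option names come from the script's own usage text. By default it just prints the command line so it can be reviewed first.

```shell
#!/bin/sh
# Sketch only: SETUP_SCRIPT, SSH_USER and RUN are hypothetical names for illustration.
SETUP_SCRIPT="${SETUP_SCRIPT:-./sshUserSetup.sh}"   # assumed location of the script
SSH_USER="${SSH_USER:-root}"                        # assumed remote user
CLUSTER_HOSTS="k8s-master k8s-node1 k8s-node2 k8s-node3"

# Print the command line that would be executed.
build_setup_command() {
  printf '%s -user %s -hosts "%s" -advanced -exverify -confirm -noPromptPassphrase\n' \
    "$SETUP_SCRIPT" "$SSH_USER" "$CLUSTER_HOSTS"
}

build_setup_command

# Run for real only on request (RUN=yes); remote passwords are still prompted for.
if [ "${RUN:-no}" = "yes" ]; then
  sh "$SETUP_SCRIPT" -user "$SSH_USER" -hosts "$CLUSTER_HOSTS" \
    -advanced -exverify -confirm -noPromptPassphrase
fi
```

Here `-advanced` sets up trust among the remote hosts as well, `-exverify` checks every host-to-host pair afterwards, and `-confirm` with `-noPromptPassphrase` skips the interactive confirmation and passphrase questions.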