
Troubleshooting a failed client configuration deployment in CM (Cloudera Manager)

Hadoop 麦童

Background

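I recently took over maintenance of a Hadoop cluster. The servers are old and crash frequently, and machines that cannot be repaired have to be removed from the cluster and replaced with new ones. After a new machine joins the cluster, the configuration has to be deployed again, and until now I had never paid attention to how many servers the deployment actually succeeded on. This time, while deploying the configuration after adding machines, I found that the configuration update had failed on one machine.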

Investigating the failed client configuration deployment

In the log view of the failed deployment command, switch from stdout to stderr:

Can't open /run/cloudera-scm-agent/process/ccdeploy_hadoop-conf_etchadoopconf.cloudera.yarn_-7727416719664341322/yarn-conf/hive-env.sh: No such file or directory.
++ dirname /etc/hadoop/conf.cloudera.yarn
+ ROOT_DIR_NAME=/etc/hadoop
+ '[' '!' -e /etc/hadoop ']'
+ for SPECIAL_FILE in '$DEST_PATH/{taskcontroller.cfg,container-executor.cfg}'
+ '[' -e /etc/hadoop/conf.cloudera.yarn/taskcontroller.cfg ']'
+ for SPECIAL_FILE in '$DEST_PATH/{taskcontroller.cfg,container-executor.cfg}'
+ '[' -e /etc/hadoop/conf.cloudera.yarn/container-executor.cfg ']'
++ basename /etc/hadoop/conf
+ LINK_BASENAME=conf
+ [[ -d conf ]]
+ '[' -n '' ']'
+ DEPLOYED_FILE_USER=root
+ rm -rf /etc/hadoop/conf.cloudera.yarn
+ cp -a /run/cloudera-scm-agent/process/ccdeploy_hadoop-conf_etchadoopconf.cloudera.yarn_-7727416719664341322/yarn-conf /etc/hadoop/conf.cloudera.yarn
+ chown root /etc/hadoop/conf.cloudera.yarn
+ chmod -R ugo+r /etc/hadoop/conf.cloudera.yarn
+ '[' -e /etc/hadoop/conf.cloudera.yarn/topology.py ']'
+ chmod +x /etc/hadoop/conf.cloudera.yarn/topology.py
+ /usr/sbin/update-alternatives --install /etc/hadoop/conf hadoop-conf /etc/hadoop/conf.cloudera.yarn 92
/var/lib/alternatives/hadoop-conf empty!

The stderr log shows that the YARN configuration files are copied from /run/cloudera-scm-agent/process/ccdeploy_hadoop-conf_etchadoopconf.cloudera.yarn_-7727416719664341322/yarn-conf to /etc/hadoop/conf.cloudera.yarn, where the roles use them. Go into the process directory:

[root@prefix.company-inc.com ~]$ ls -al /run/cloudera-scm-agent/process/ccdeploy_hadoop-conf_etchadoopconf.cloudera.yarn_-7727416719664341322/
总用量 4
drwxr-xr-x  4 root root 100 6月  24 15:53 .
drwxr-x--x 10 root root 200 6月  24 15:53 ..
-rw-r--r--  1 root root  20 6月  24 15:53 __cloudera_metadata__
drwxr-x--x  2 root root 120 6月  24 15:52 logs
drwxr-x--x  2 root root 260 6月  24 15:52 yarn-conf
[root@prefix.company-inc.com ccdeploy_hadoop-conf_etchadoopconf.cloudera.yarn_-7727416719664341322]$ ls -alht logs/
总用量 56K
drwxr-xr-x 4 root root 100 6月  24 15:55 ..
-rw-r--r-- 1 root root 21K 6月  24 15:55 stderr.log
-rw-r--r-- 1 root root 550 6月  24 15:55 stdout.log
drwxr-x--x 2 root root 120 6月  24 15:55 .
-rw-r----- 1 root root 21K 6月  24 15:55 stderr.log.bak
-rw-r----- 1 root root 550 6月  24 15:55 stdout.log.bak

In stderr.log, the last line of the deployment output is the message /var/lib/alternatives/hadoop-conf empty!

+ for SPECIAL_FILE in '$DEST_PATH/{taskcontroller.cfg,container-executor.cfg}'
+ '[' -e /etc/hadoop/conf.cloudera.yarn/container-executor.cfg ']'
++ basename /etc/hadoop/conf
+ LINK_BASENAME=conf
+ [[ -d conf ]]
+ '[' -n '' ']'
+ DEPLOYED_FILE_USER=root
+ rm -rf /etc/hadoop/conf.cloudera.yarn
+ cp -a /run/cloudera-scm-agent/process/ccdeploy_hadoop-conf_etchadoopconf.cloudera.yarn_-7727416719664341322/yarn-conf /etc/hadoop/conf.cloudera.yarn
+ chown root /etc/hadoop/conf.cloudera.yarn
+ chmod -R ugo+r /etc/hadoop/conf.cloudera.yarn
+ '[' -e /etc/hadoop/conf.cloudera.yarn/topology.py ']'
+ chmod +x /etc/hadoop/conf.cloudera.yarn/topology.py
+ /usr/sbin/update-alternatives --install /etc/hadoop/conf hadoop-conf /etc/hadoop/conf.cloudera.yarn 92
/var/lib/alternatives/hadoop-conf empty!
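
When more than one process directory has to be checked for this symptom, the search can be scripted. The following is a minimal sketch and not part of Cloudera Manager. The function name find_empty_alternatives_errors is made up for illustration, and it assumes that each command keeps its log at <process dir>/logs/stderr.log, as in the directory listing above.

```shell
# Hypothetical helper (not a Cloudera tool): print every stderr.log under a
# Cloudera agent process root that contains the "hadoop-conf empty!" message.
# $1 = process root; defaults to the path seen in this article.
find_empty_alternatives_errors() {
    process_root="${1:-/run/cloudera-scm-agent/process}"
    # -F: match a fixed string, -l: print only file names,
    # -s: stay quiet about missing or unreadable files.
    grep -F -l -s '/var/lib/alternatives/hadoop-conf empty!' \
        "$process_root"/*/logs/stderr.log
}
```

Called without arguments on an affected host, it lists the stderr.log files of the deployments that hit the error, provided the process directories still exist.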

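Comparing the failed host's stderr with the log below, taken from a server where the client configuration was deployed successfully, confirms that the deployment failed because the file /var/lib/alternatives/hadoop-conf is empty. On the healthy server, update-alternatives --install is followed by update-alternatives --auto hadoop-conf, whereas on the failed server the script stops at the empty! message.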

+ for SPECIAL_FILE in '$DEST_PATH/{taskcontroller.cfg,container-executor.cfg}'
+ '[' -e /etc/hadoop/conf.cloudera.yarn/taskcontroller.cfg ']'
+ for SPECIAL_FILE in '$DEST_PATH/{taskcontroller.cfg,container-executor.cfg}'
+ '[' -e /etc/hadoop/conf.cloudera.yarn/container-executor.cfg ']'
++ basename /etc/hadoop/conf
+ LINK_BASENAME=conf
+ [[ -d conf ]]
+ '[' -n '' ']'
+ DEPLOYED_FILE_USER=root
+ rm -rf /etc/hadoop/conf.cloudera.yarn
+ cp -a /run/cloudera-scm-agent/process/ccdeploy_hadoop-conf_etchadoopconf.cloudera.yarn_-7727416719664341322/yarn-conf /etc/hadoop/conf.cloudera.yarn
+ chown root /etc/hadoop/conf.cloudera.yarn
+ chmod -R ugo+r /etc/hadoop/conf.cloudera.yarn
+ '[' -e /etc/hadoop/conf.cloudera.yarn/topology.py ']'
+ chmod +x /etc/hadoop/conf.cloudera.yarn/topology.py
+ /usr/sbin/update-alternatives --install /etc/hadoop/conf hadoop-conf /etc/hadoop/conf.cloudera.yarn 92
+ /usr/sbin/update-alternatives --auto hadoop-conf
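
Since the two hosts differ only in the state of /var/lib/alternatives/hadoop-conf, a quick check of that file can confirm the diagnosis before anything is changed. This is a rough sketch under stated assumptions: the function name alternatives_file_ok is hypothetical, and it treats a file as healthy if it is non-empty and its first line is auto or manual. The working host's file in this article starts with auto; accepting manual as well is an assumption about the file format.

```shell
# Hypothetical helper: succeed only if the alternatives state file looks usable.
# $1 = state file; defaults to the path from this article.
alternatives_file_ok() {
    state_file="${1:-/var/lib/alternatives/hadoop-conf}"
    # -s is true only when the file exists and is larger than zero bytes.
    [ -s "$state_file" ] || return 1
    # Take the first line and strip whitespace such as trailing blanks.
    mode=$(head -n 1 "$state_file" | tr -d '[:space:]')
    case "$mode" in
        auto|manual) return 0 ;;   # assumed valid mode lines
        *)           return 1 ;;
    esac
}
```

For example, `alternatives_file_ok || echo "hadoop-conf state file is empty or malformed"` would report the condition seen on the failed host.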

Fixing the failed client configuration deployment

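Look at the file /var/lib/alternatives/hadoop-conf on a server where the client configuration was deployed successfully, and copy its contents to the same file on the server where the deployment failed: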

[root@prefix.company-inc.com ~]$  more /var/lib/alternatives/hadoop-conf 
auto 
/etc/hadoop/conf 

/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/etc/hadoop/conf.empty 
10 
/etc/hadoop/conf.cloudera.hdfs 
90 
/etc/hadoop/conf.cloudera.yarn 
92
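
The copy step can be wrapped so that the old file is kept for later inspection. This is only a sketch: restore_alternatives_file is a hypothetical name, and it assumes the healthy host's file has already been fetched to a local path by whatever means is convenient.

```shell
# Hypothetical helper: install a reference copy of the alternatives state file
# and keep a backup of whatever was there before.
# $1 = reference file taken from a healthy host
# $2 = target state file; defaults to the path from this article
restore_alternatives_file() {
    reference="$1"
    target="${2:-/var/lib/alternatives/hadoop-conf}"
    # Refuse to install a missing or empty reference file.
    if [ ! -s "$reference" ]; then
        echo "reference file is missing or empty: $reference" >&2
        return 1
    fi
    # Keep the old (possibly empty) file next to the target.
    if [ -e "$target" ]; then
        cp -p "$target" "$target.bak" || return 1
    fi
    cp "$reference" "$target"
}
```

The copied file names concrete paths such as the CDH parcel directory, so this presumably only makes sense between hosts that run the same CDH version and the same services.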

Once that is done, deploy the client configuration again.

 

Please credit the source when reposting: 麦童博客 » Troubleshooting a failed client configuration deployment in CM (Cloudera Manager)
