# Setting Up a Hadoop Environment with Docker
### 1. Install a CentOS 7.9 virtual machine

Install CentOS 7.9 with VMware.

### 2. Install Docker on CentOS 7.9

Update the system packages to the latest versions, then download and install Docker:

```shell
yum -y update
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo http://download.docker.com/linux/centos/docker-ce.repo
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum list docker-ce --showduplicates | sort -r
yum -y install docker-ce-20.10.23
systemctl status docker
systemctl start docker
systemctl enable docker
```

#### 2.1 Install tools

Download the offline installation files for the required tools.

### 3. Build the Docker images and containers

Build a CentOS image with SSH enabled by building from a Dockerfile:

```shell
docker pull centos:7.9.2009
vi dockerfile
docker build -t="centos7-ssh" .
```

The `dockerfile`:

```shell
########################
FROM centos
MAINTAINER hadoop
RUN cd /etc/yum.repos.d/
RUN sed -i 's/mirrorlist/#mirrorlist/g' /etc/yum.repos.d/CentOS-*
RUN sed -i 's|#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g' /etc/yum.repos.d/CentOS-*
RUN yum makecache
RUN yum update -y
RUN yum install -y openssh-server sudo
RUN sed -i 's/UsePAM yes/UsePAM no/g' /etc/ssh/sshd_config
RUN yum install -y openssh-clients
RUN echo "root:a123456" | chpasswd
RUN echo "root ALL=(ALL) ALL" >> /etc/sudoers
RUN ssh-keygen -t dsa -f /etc/ssh/ssh_host_dsa_key
RUN ssh-keygen -t rsa -f /etc/ssh/ssh_host_rsa_key
RUN mkdir /var/run/sshd
EXPOSE 22
CMD ["/usr/sbin/sshd", "-D"]
########################
```

### 4. Upload the JDK, Hadoop, and Hive packages

Use a tool such as WinSCP to upload the JDK, Hadoop, and Hive packages to a directory on the host (the same directory as the dockerfile).

Extract the packages:

```shell
tar -zxvf jdk-8u361-linux-x64.tar.gz
tar -zxvf hadoop-2.10.1.tar.gz
tar -zxvf apache-hive-1.2.2-bin.tar.gz
```

### 5. Build a CentOS image with the JDK

Build the Hadoop image on top of the SSH-enabled CentOS image.

First, set the existing dockerfile aside:

```shell
mv dockerfile dockerfile.ssh
```

The new dockerfile:

```shell
FROM centos7-ssh

COPY jdk1.8.0_361 /usr/local/jdk
ENV JAVA_HOME /usr/local/jdk
ENV PATH $JAVA_HOME/bin:$PATH

COPY hadoop-2.10.1 /usr/local/hadoop
ENV HADOOP_HOME /usr/local/hadoop
ENV PATH $HADOOP_HOME/bin:$PATH

COPY apache-hive-1.2.2-bin /usr/local/hive
ENV HIVE_HOME /usr/local/hive
ENV PATH $HIVE_HOME/bin:$PATH
```

Run `docker build -t="hadoop" .` to build the hadoop image.
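Before moving on, it can be worth a quick sanity check that the new image really contains the JDK and Hadoop. This check is not part of the original steps; it is a minimal sketch that only assumes the `hadoop` image built above and uses throwaway containers:

```shell
# List the images built so far
docker images | grep -E "centos7-ssh|hadoop"

# Start throwaway containers from the hadoop image and print the bundled versions
docker run --rm hadoop java -version
docker run --rm hadoop hadoop version
```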
### 6. Set up the Docker bridge network and create the containers

```shell
# Create a Docker bridge network named hadoop
docker network create hadoop
# List the Docker networks
docker network ls
```

Create the containers attached to the hadoop bridge network:

```shell
docker run -itd --network hadoop --name hadoop1 -p 50070:50070 -p 8088:8088 -p 9001:9001 hadoop
docker run -itd --network hadoop --name hadoop2 hadoop
docker run -itd --network hadoop --name hadoop3 hadoop

# Open the required ports in the firewall
sudo firewall-cmd --permanent --add-port=8088/tcp
sudo firewall-cmd --permanent --add-port=9001/tcp
sudo firewall-cmd --permanent --add-port=50070/tcp
sudo firewall-cmd --reload
```

You can inspect the bridge network to see how it is being used:

```shell
docker network inspect hadoop

[root@hadoop hadoop]# docker network inspect hadoop
[
    {
        "Name": "hadoop",
        "Id": "efa5a8e88c3dd2b6bb6ff213909ba6bbb514b5c94a073edb57a0e398187806dd",
        "Created": "2023-03-21T16:18:00.54457425+08:00",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "172.18.0.0/16",
                    "Gateway": "172.18.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "1c7fb49bcf5af2bc879da5bf09851b990408a0e6b5bdae7285ca66a8014cee06": {
                "Name": "hadoop3",
                "EndpointID": "1769afacbcd07cc21f64f2234b668c0a543880be486607bc0a2b4b62e8050d33",
                "MacAddress": "02:42:ac:12:00:04",
                "IPv4Address": "172.18.0.4/16",
                "IPv6Address": ""
            },
            "213f8c13ebe8d72c6da578b0c3889fb29baddf0a80cca75aa8b0318da661b7c9": {
                "Name": "hadoop1",
                "EndpointID": "65665d880bbf22f547829de932fa68a0f3c47ed1177080295b12d562b4ad38f3",
                "MacAddress": "02:42:ac:12:00:02",
                "IPv4Address": "172.18.0.2/16",
                "IPv6Address": ""
            },
            "97f10e9e19861c3fd57e1e8bb1234a282d5774b4186658baca6b581ec131f2cf": {
                "Name": "hadoop2",
                "EndpointID": "bb6296d3b25c72f77a83a790ba6cb772a9083f2a289beaad39128e88ca8995f9",
                "MacAddress": "02:42:ac:12:00:03",
                "IPv4Address": "172.18.0.3/16",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {}
    }
]
```

Record each container's IP address and name; they are needed later in the Hadoop configuration:

```shell
172.18.0.2 hadoop1
172.18.0.3 hadoop2
172.18.0.4 hadoop3
```

### 7. Log in to the containers and set up passwordless SSH

Open three terminals and enter the three containers:

```shell
docker exec -it hadoop1 bash
docker exec -it hadoop2 bash
docker exec -it hadoop3 bash
```

In each container, edit /etc/hosts and add the following entries:

```shell
172.18.0.2 hadoop1
172.18.0.3 hadoop2
172.18.0.4 hadoop3
```

Save and exit. You can ping the hostnames from each of the three terminals to confirm they resolve and respond.

Set up passwordless SSH. Run the following command in each of the three terminals, pressing Enter at every prompt until the key files are generated:

```shell
ssh-keygen
```

Non-interactive version (to be verified):

```shell
ssh-keygen -t rsa -N '' -f /root/.ssh/id_rsa -q
```

| Parameter | Purpose |
| --- | --- |
| -t | key type |
| -N | passphrase |
| -f | key file name |
| -q | quiet mode |

In each of the three terminals, copy the local key to the other machines:

```shell
ssh-copy-id -i /root/.ssh/id_rsa -p 22 root@hadoop1
ssh-copy-id -i /root/.ssh/id_rsa -p 22 root@hadoop2
ssh-copy-id -i /root/.ssh/id_rsa -p 22 root@hadoop3
```

When prompted with `Are you sure you want to continue connecting (yes/no/[fingerprint])`, type `yes`, then enter the root password (the password set in the Dockerfile for the SSH image in step 3).

Once that is done, verify from each terminal that passwordless login works; a scripted check is also sketched below:

```shell
ssh hadoop1
ssh hadoop2
ssh hadoop3
```
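A minimal verification sketch, not part of the original steps: run it inside any one of the containers. `BatchMode=yes` makes ssh fail immediately instead of prompting, so a missing key shows up as an error rather than a password prompt:

```shell
# Check name resolution and key-based login to every node (run inside a container)
for h in hadoop1 hadoop2 hadoop3; do
  ping -c 1 "$h" > /dev/null && echo "$h is reachable"
  ssh -o BatchMode=yes "$h" hostname
done
```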
### 8. Edit the Hadoop configuration files

On hadoop1 (the master):

Create the directories needed by HDFS:

```shell
mkdir /home/hadoop
mkdir /home/hadoop/tmp /home/hadoop/hdfs_name /home/hadoop/hdfs_data
```

Go to the `/usr/local/hadoop/etc/hadoop` directory and edit `hadoop-env.sh`, `core-site.xml`, `mapred-site.xml`, `hdfs-site.xml`, and `yarn-site.xml`. If a file does not exist, check whether a corresponding `$file.xml.template` file exists and copy it to the expected file name.

In `hadoop-env.sh`, the key settings are:

```shell
export JAVA_HOME=${JAVA_HOME}
export HADOOP_HOME=${HADOOP_HOME}
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
```

core-site.xml

```xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop1:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/home/hadoop/tmp</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131702</value>
    </property>
</configuration>
```

mapred-site.xml

```xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop1:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop1:19888</value>
    </property>
</configuration>
```

hdfs-site.xml

```xml
<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/hadoop/hdfs_name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/hadoop/hdfs_data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.http-address</name>
        <value>hadoop1:9001</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop2:9002</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
</configuration>
```

yarn-site.xml

```xml
<configuration>
    <!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.auxservices.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>hadoop1:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>hadoop1:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>hadoop1:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>hadoop1:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>hadoop1:8088</value>
    </property>
</configuration>
```

Edit the slaves file (called workers in some versions). Delete the existing localhost entry and add:

```shell
hadoop2
hadoop3
```

Append the following environment variables to /etc/profile, then run `source /etc/profile` so they take effect immediately:

```shell
export JAVA_HOME=/usr/local/jdk
export PATH=$JAVA_HOME/bin:$PATH
export HADOOP_HOME=/usr/local/hadoop
export PATH=$HADOOP_HOME/bin:$PATH
export PATH=$HADOOP_HOME/sbin:$PATH
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
```

Set directory permissions:

```shell
chmod -R 777 /usr/local/hadoop
chmod -R 777 /usr/local/jdk
```

Copy the modified files to the other two machines:

```shell
scp -r $HADOOP_HOME/etc/hadoop hadoop2:$HADOOP_HOME/etc/
scp -r $HADOOP_HOME/etc/hadoop hadoop3:$HADOOP_HOME/etc/
scp -r /home/hadoop hadoop2:/home/
scp -r /home/hadoop hadoop3:/home/
scp -r /etc/profile hadoop2:/etc/
scp -r /etc/profile hadoop3:/etc/
```
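As a follow-up, it can help to confirm that the configuration and environment settings actually reached the other two nodes. This is a minimal check sketch, not part of the original steps, run from hadoop1 after the scp commands above:

```shell
# Confirm the Hadoop configs and /etc/profile changes arrived on hadoop2 and hadoop3
for h in hadoop2 hadoop3; do
  echo "== $h =="
  ssh "$h" "ls /usr/local/hadoop/etc/hadoop/core-site.xml && grep HADOOP_HOME /etc/profile"
done
```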
张文海, April 14, 2023, 17:17