Reference Documentation
- Official documentation 01: https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/yarn/
- Official documentation 02: https://nightlies.apache.org/flink/flink-docs-release-1.12/deployment/resource-providers/yarn.html
- CSDN article: https://blog.csdn.net/qq_31454379/article/details/110440037
Download Locations
- Flink Shaded 10.0 source package: https://archive.apache.org/dist/flink/flink-shaded-10.0/flink-shaded-10.0-src.tgz
- Flink 1.10.2 source package: https://archive.apache.org/dist/flink/flink-1.10.2/flink-1.10.2-src.tgz
- Flink 1.10.2 binary package: https://archive.apache.org/dist/flink/flink-1.10.2/flink-1.10.2-bin-scala_2.12.tgz
- Flink Parcel GitHub project: https://github.com/pkeropen/flink-parcel.git
Adjust the Maven Configuration File
# Back up the original file
cp /data/maven/apache-maven-3.6.3/conf/settings.xml /data/maven/apache-maven-3.6.3/conf/settings.xml.orig
# Add mirror entries
# Insert the following before the "</mirrors>" tag on line 159
<!-- Flink source build -->
<mirror>
<id>alimaven</id>
<mirrorOf>central</mirrorOf>
<name>aliyun maven</name>
<url>http://maven.aliyun.com/nexus/content/repositories/central/</url>
</mirror>
<mirror>
<id>alimaven-public</id>
<name>aliyun maven</name>
<url>http://maven.aliyun.com/nexus/content/groups/public/</url>
<mirrorOf>central</mirrorOf>
</mirror>
<mirror>
<id>central</id>
<name>Maven Repository Switchboard</name>
<url>http://repo1.maven.org/maven2/</url>
<mirrorOf>central</mirrorOf>
</mirror>
<mirror>
<id>repo2</id>
<mirrorOf>central</mirrorOf>
<name>Human Readable Name for this Mirror.</name>
<url>http://repo2.maven.org/maven2/</url>
</mirror>
<mirror>
<id>ibiblio</id>
<mirrorOf>central</mirrorOf>
<name>Human Readable Name for this Mirror.</name>
<url>http://mirrors.ibiblio.org/pub/mirrors/maven2/</url>
</mirror>
<mirror>
<id>jboss-public-repository-group</id>
<mirrorOf>central</mirrorOf>
<name>JBoss Public Repository Group</name>
<url>http://repository.jboss.org/nexus/content/groups/public</url>
</mirror>
<mirror>
<id>google-maven-central</id>
<name>Google Maven Central</name>
<url>https://maven-central.storage.googleapis.com</url>
<mirrorOf>central</mirrorOf>
</mirror>
<!-- Mirror of the central repository in China -->
<mirror>
<id>maven.net.cn</id>
<name>one of the central mirrors in China</name>
<url>http://maven.net.cn/content/groups/public/</url>
<mirrorOf>central</mirrorOf>
</mirror>
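The "line 159" instruction depends on an unmodified settings.xml. As a rough alternative, the sketch below inserts a snippet file before the first line containing a marker such as `</mirrors>`, independent of line numbers. The function name `insert_before_marker` and the snippet file are hypothetical, and the sketch assumes the marker does not also appear earlier in the file (for example inside a comment).

```shell
# Insert the contents of a snippet file before the first line that contains a marker.
# Usage: insert_before_marker <target-file> <marker> <snippet-file>
# (hypothetical helper, not part of Maven or of the original guide)
insert_before_marker() {
    awk -v marker="$2" -v snippet="$3" '
        # On the first line containing the marker, emit the snippet lines first.
        !done && index($0, marker) {
            while ((getline line < snippet) > 0) print line
            close(snippet)
            done = 1
        }
        # Every original line is printed unchanged.
        { print }
    ' "$1" > "$1.new" && mv "$1.new" "$1"
}
```

For example, with the mirror entries saved in a file named mirrors-snippet.xml (an assumed name), `insert_before_marker /data/maven/apache-maven-3.6.3/conf/settings.xml "</mirrors>" mirrors-snippet.xml` would place them inside the mirrors section. Take the backup shown above first.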
Build Flink
1. Create the service directory
mkdir -p /data/flink
2. Download the packages
wget https://archive.apache.org/dist/flink/flink-shaded-10.0/flink-shaded-10.0-src.tgz -P /data/flink
wget https://archive.apache.org/dist/flink/flink-1.10.2/flink-1.10.2-bin-scala_2.12.tgz -P /data/flink
3. Build Flink Shaded
# Extract the Flink Shaded archive
tar -xzf /data/flink/flink-shaded-10.0-src.tgz -C /data/flink
# Back up the original configuration file
cp /data/flink/flink-shaded-10.0/pom.xml /data/flink/flink-shaded-10.0/pom.xml.orig
# Edit the configuration file
# Insert the following before the "</profiles>" tag on line 170
<profile>
<id>vendor-repos</id>
<activation>
<property>
<name>vendor-repos</name>
</property>
</activation>
<!-- Add vendor maven repositories -->
<repositories>
<!-- Cloudera -->
<repository>
<id>cloudera-releases</id>
<url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
<!-- Hortonworks -->
<repository>
<id>HDPReleases</id>
<name>HDP Releases</name>
<url>https://repo.hortonworks.com/content/repositories/releases/</url>
<snapshots><enabled>false</enabled></snapshots>
<releases><enabled>true</enabled></releases>
</repository>
<repository>
<id>HortonworksJettyHadoop</id>
<name>HDP Jetty</name>
<url>https://repo.hortonworks.com/content/repositories/jetty-hadoop</url>
<snapshots><enabled>false</enabled></snapshots>
<releases><enabled>true</enabled></releases>
</repository>
<!-- MapR -->
<repository>
<id>mapr-releases</id>
<url>https://repository.mapr.com/maven/</url>
<snapshots><enabled>false</enabled></snapshots>
<releases><enabled>true</enabled></releases>
</repository>
</repositories>
</profile>
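# Build Flink Shaded
# - clean: remove files generated by previous builds (deletes the target directory)
# - install: install the built artifacts into the local Maven repository so other projects can reference them
# - -DskipTests: skip running tests during the build
# - -Pvendor-repos: activate the vendor-repos Maven profile
# - -Dhadoop.version=3.0.0-cdh6.3.2: build against Hadoop 3.0.0-cdh6.3.2
# - -Dscala-2.12: set the scala-2.12 property to select Scala 2.12
# - -Drat.skip=true: skip the Release Audit Tool (RAT) check, which verifies compliance with Apache license requirements
# - -T10C: build in parallel with 10 threads per CPU core (the C suffix multiplies the number by the core count)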
cd /data/flink/flink-shaded-10.0/ && mvn clean install -DskipTests -Pvendor-repos -Dhadoop.version=3.0.0-cdh6.3.2 -Dscala-2.12 -Drat.skip=true -T10C
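After the build finishes, it can help to confirm that the Hadoop uber jar was actually produced before moving on, since a later step copies it into the parcel. The following is a minimal sketch; the function name `find_uber_jar` is hypothetical, and it assumes the jar lands in a `target` directory with the `flink-shaded-hadoop-2-uber-` prefix, as the jar path used later in this guide suggests.

```shell
# Print the path of the first flink-shaded-hadoop-2-uber jar found under a build tree.
# Matching on the full base name skips the "original-..." jar left by the shade plugin.
# (hypothetical helper; prints nothing if no jar is found)
find_uber_jar() {
    find "$1" -type f -path '*/target/*' -name 'flink-shaded-hadoop-2-uber-*.jar' \
        | sort | head -n 1
}
```

Running `find_uber_jar /data/flink/flink-shaded-10.0` should print one jar path if the build succeeded; empty output suggests the build did not complete or was run with different options.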
Build the Parcel Package
# Download the flink-parcel project
git clone https://github.com/pkeropen/flink-parcel.git /data/flink/flink-parcel
# Copy flink-1.10.2-bin-scala_2.12.tgz into the project directory
cp /data/flink/flink-1.10.2-bin-scala_2.12.tgz /data/flink/flink-parcel/
# Back up the original configuration file
cp /data/flink/flink-parcel/flink-parcel.properties /data/flink/flink-parcel/flink-parcel.properties.orig
# Edit the configuration file
cat > /data/flink/flink-parcel/flink-parcel.properties << EOF
# Flink download URL
FLINK_URL=https://mirrors.tuna.tsinghua.edu.cn/apache/flink/flink-1.10.2/flink-1.10.2-bin-scala_2.12.tgz
# Flink version
FLINK_VERSION=1.10.2
# Extension version
EXTENS_VERSION=BIN-SCALA_2.12
# OS major version (CentOS in this example)
OS_VERSION=7
# CDH minor version range
CDH_MIN_FULL=5.2
CDH_MAX_FULL=6.3.3
# CDH major version range
CDH_MIN=5
CDH_MAX=6
EOF
# Make build.sh executable
cd /data/flink/flink-parcel && chmod +x build.sh
# Build the Flink parcel
sh build.sh parcel
# Generate the CSD files
# On YARN
sh build.sh csd_on_yarn
# Standalone
sh build.sh csd_standalone
# Check that the required files were generated
ll /data/flink/flink-parcel
-rwxr-xr-x 1 root root 5863 Nov 27 14:50 build.sh
drwxr-xr-x 6 root root 142 Nov 27 15:03 cm_ext
drwxr-xr-x 4 root root 29 Nov 27 15:31 FLINK-1.10.2-BIN-SCALA_2.12
drwxr-xr-x 2 root root 123 Nov 27 15:31 FLINK-1.10.2-BIN-SCALA_2.12_build
-rw-r--r-- 1 root root 280626150 Nov 27 14:52 flink-1.10.2-bin-scala_2.12.tgz
-rw-r--r-- 1 root root 7737 Nov 27 15:40 FLINK-1.10.2.jar
drwxr-xr-x 5 root root 53 Nov 27 15:40 flink_csd_build
drwxr-xr-x 5 root root 53 Nov 27 14:50 flink-csd-on-yarn-src
drwxr-xr-x 5 root root 53 Nov 27 14:50 flink-csd-standalone-src
-rw-r--r-- 1 root root 8260 Nov 27 15:40 FLINK_ON_YARN-1.10.2.jar
-rw-r--r-- 1 root root 350 Nov 27 14:55 flink-parcel.properties
-rw-r--r-- 1 root root 346 Nov 27 14:53 flink-parcel.properties.orig
drwxr-xr-x 3 root root 85 Nov 27 14:50 flink-parcel-src
-rw-r--r-- 1 root root 11357 Nov 27 14:50 LICENSE
-rw-r--r-- 1 root root 4334 Nov 27 14:50 README.md
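The generated names in the listing appear to be assembled from the values in flink-parcel.properties. The sketch below derives the expected names from that file so a build can be checked against them. The function name `parcel_names` is hypothetical, and the naming pattern (including the `-el` suffix built from OS_VERSION) is inferred from the file names shown in this guide, not from build.sh itself.

```shell
# Print the directory and parcel names expected for a given properties file.
# Assumes the file is plain KEY=VALUE shell syntax and is given with a path
# containing a slash, so that "." does not search PATH.
# (hypothetical helper; the naming pattern is an inference from this guide)
parcel_names() {
    (
        # Load the properties in a subshell so they do not leak into the caller.
        . "$1"
        base="FLINK-${FLINK_VERSION}-${EXTENS_VERSION}"
        echo "${base}"
        echo "${base}_build"
        echo "${base}-el${OS_VERSION}.parcel"
    )
}
```

With the properties written above, `parcel_names /data/flink/flink-parcel/flink-parcel.properties` would list `FLINK-1.10.2-BIN-SCALA_2.12`, its `_build` directory, and the `-el7.parcel` file name.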
Configure the Flink Parcel
1. Node configuration
# Copy the CSD files to /opt/cloudera/csd on the cloudera-scm-server node
scp FLINK-1.10.2.jar FLINK_ON_YARN-1.10.2.jar root@cloudera-scm-server:/opt/cloudera/csd
# Serve the Flink parcel files through httpd
ln -s /data/flink/flink-parcel/FLINK-1.10.2-BIN-SCALA_2.12_build /var/www/html/flink1.10.2
# List the published Flink parcel files
ll /var/www/html/flink1.10.2/
-rw-r--r-- 1 root root 280629521 Nov 27 15:47 FLINK-1.10.2-BIN-SCALA_2.12-el7.parcel
-rw-r--r-- 1 root root 41 Nov 27 15:47 FLINK-1.10.2-BIN-SCALA_2.12-el7.parcel.sha
-rw-r--r-- 1 root root 583 Nov 27 15:47 manifest.json
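# Back up the original file
cp /etc/httpd/conf/httpd.conf /etc/httpd/conf/httpd.conf.orig
# Edit the configuration file
# Change line 284 to the following
AddType application/x-gzip .gz .tgz .parcel
# Restart the service to apply the change
systemctl restart httpd
# Check that the files are being served
curl http://${httpd_server_ip}/flink1.10.2/
2. In the CM Web UI, open the Parcel configuration and add http://${httpd_server_ip}/flink1.10.2
3. The Flink parcel is then detected on the Parcels page
4. Download => Distribute => Activate the parcel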
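If the download step fails in Cloudera Manager, one thing worth ruling out is a mismatch between the parcel and its `.sha` file. The sketch below compares the two. The function name `check_parcel_sha` is hypothetical, and it assumes the `.sha` file holds a SHA-1 hex digest, which the 41-byte file size in the listing above is consistent with.

```shell
# Return success if <parcel>.sha matches the SHA-1 digest of <parcel>.
# Usage: check_parcel_sha <path-to-parcel>
# (hypothetical helper; assumes the .sha file contains a SHA-1 hex digest)
check_parcel_sha() {
    # Strip the trailing newline and any other whitespace from the stored digest.
    expected=$(tr -d '[:space:]' < "$1.sha")
    actual=$(sha1sum "$1" | awk '{print $1}')
    [ "$expected" = "$actual" ]
}
```

For example: `check_parcel_sha /var/www/html/flink1.10.2/FLINK-1.10.2-BIN-SCALA_2.12-el7.parcel && echo "sha ok"`.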
Deploy the Flink Service
1. Restart the cloudera-scm-server service
systemctl restart cloudera-scm-server
2. Copy the Flink Shaded jar to the required path
# Repeat this on every cloudera-scm-agent node
cp /data/flink/flink-shaded-10.0/flink-shaded-hadoop-2-parent/flink-shaded-hadoop-2-uber/target/flink-shaded-hadoop-2-uber-3.0.0-cdh6.3.2-10.0.jar /opt/cloudera/parcels/FLINK/lib/flink/lib/
3. Complete the Flink deployment through the wizard (if Kerberos is not configured, clear the two Kerberos settings)
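Because the jar has to be present on every agent node, the copy in step 2 is usually repeated across hosts. The sketch below only prints one `scp` command per host so the list can be reviewed before anything runs. The function name `print_copy_commands` and the host names in the example are hypothetical, and it assumes root SSH access as in the earlier `scp` step.

```shell
# Print (not run) one scp command per agent host.
# Usage: print_copy_commands <jar> <destination-dir> <host> [<host> ...]
# (hypothetical helper; host names are supplied by the caller)
print_copy_commands() {
    jar="$1"
    dest="$2"
    shift 2
    for host in "$@"; do
        echo "scp $jar root@$host:$dest"
    done
}
```

For example, `print_copy_commands "$JAR" /opt/cloudera/parcels/FLINK/lib/flink/lib/ agent01 agent02` prints two commands, where `JAR` is assumed to hold the uber jar path from step 2. Piping the output to `sh` would execute them.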
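Verify the Flink Service
1. In the YARN applications list, confirm that a long-running application named "Flink session cluster" exists
2. From that application's details, follow the link to the Flink Dashboard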
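The first check can also be done from a shell on a node with the YARN client installed. The sketch below filters application-list output read from standard input for the session name. The function name `has_flink_session` is hypothetical, and it assumes the application name appears verbatim in the output of `yarn application -list`.

```shell
# Succeed if the application list on stdin contains a Flink session cluster.
# (hypothetical helper; reads YARN CLI output from standard input)
has_flink_session() {
    grep -q 'Flink session cluster'
}
```

For example: `yarn application -list -appStates RUNNING | has_flink_session && echo "session found"`.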