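The official prebuilt packages have incomplete support for many of the Native Libraries extensions, so Hadoop has to be rebuilt from source. This article walks through the build on Debian 11 (amd64).
Procedure
The base environment is a minimal installation of Debian 11 (amd64). You can manually switch to a mirror in mainland China; see the Tsinghua (TUNA) mirror instructions for details. Before building, it is recommended to fully upgrade the system and reboot.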
deb https://mirrors.tuna.tsinghua.edu.cn/debian/ bullseye main
deb-src https://mirrors.tuna.tsinghua.edu.cn/debian/ bullseye main
#deb https://security.debian.org/debian-security bullseye-security main
#deb-src https://security.debian.org/debian-security bullseye-security main
deb https://mirrors.tuna.tsinghua.edu.cn/debian-security bullseye-security main
deb-src https://mirrors.tuna.tsinghua.edu.cn/debian-security bullseye-security main
# bullseye-updates, to get updates before a point release is made;
# see https://www.debian.org/doc/manuals/debian-reference/ch02.en.html#_updates_and_backports
deb https://mirrors.tuna.tsinghua.edu.cn/debian/ bullseye-updates main
deb-src https://mirrors.tuna.tsinghua.edu.cn/debian/ bullseye-updates main
# This system was installed using small removable media
# (e.g. netinst, live or single CD). The matching "deb cdrom"
# entries were disabled at the end of the installation process.
# For information about how to configure apt package sources,
# see the sources.list(5) manual.
After the upgrade, check the system version (if the command is not found, install lsb-release manually):
$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 11 (bullseye)
Release: 11
Codename: bullseye
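Environment
Oracle JDK
Hadoop and its related components are written in Java, so install the Java environment first. OpenJDK should work in theory, but to be safe this article uses the JDK package downloaded from the Oracle website.
## Assuming the downloaded version is 8u341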
sudo tar xf jdk-8u341-linux-x64.tar.gz -C /opt/
echo -e 'export JAVA_HOME="/opt/jdk1.8.0_341"\nexport PATH=$JAVA_HOME/bin:$PATH' | sudo tee /etc/profile.d/maven.sh
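After installation, check the version (log in again or source the profile script first so that the new variables take effect):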
$ java -version
java version "1.8.0_341"
Java(TM) SE Runtime Environment (build 1.8.0_341-b10)
Java HotSpot(TM) 64-Bit Server VM (build 25.341-b10, mixed mode)
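The remaining steps assume that the java found first on PATH is the JDK 8 installed above. As a minimal sketch, the helper below (java_major_version is an illustrative name, not part of any tool) extracts the major version from `java -version` output so that a script can verify this before starting a long build.

```shell
# java_major_version: read `java -version` output on stdin and print the
# major version (8 for "1.8.0_341", 17 for "17.0.2").
java_major_version() {
    sed -n 's/.*version "\([0-9][0-9._]*\)".*/\1/p' \
        | head -n 1 \
        | awk -F'[._]' '{ if ($1 == 1) print $2; else print $1 }'
}

# Example with the version string shown above.
echo 'java version "1.8.0_341"' | java_major_version   # prints 8
```

On a real machine the input would come from `java -version 2>&1 | java_major_version`, because java writes its version banner to standard error.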
Maven
Next, deploy Maven to provide the Java build tooling.
wget https://archive.apache.org/dist/maven/maven-3/3.8.6/binaries/apache-maven-3.8.6-bin.tar.gz
sudo tar xf apache-maven-3.8.6-bin.tar.gz -C /opt/
Add the global environment variables:
cat <<"EOF" | sudo tee -a /etc/profile.d/maven.sh
export M2_HOME="/opt/apache-maven-3.8.6"
export MAVEN_HOME="/opt/apache-maven-3.8.6"
export PATH="$M2_HOME/bin:$PATH"
EOF
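Quoting matters in this profile script: the heredoc delimiter is quoted, so the lines are written literally and $M2_HOME is only expanded later, when the file is sourced at login. The PATH line therefore needs double quotes. A minimal sketch to check this without touching /etc/profile.d/: it writes the same lines for the Maven 3.8.6 directory to a temporary file (an illustrative stand-in) and sources it in a subshell.

```shell
# Write the profile fragment to a temporary file
# (a stand-in for /etc/profile.d/maven.sh).
profile_tmp="$(mktemp)"
cat <<"EOF" > "$profile_tmp"
export M2_HOME="/opt/apache-maven-3.8.6"
export MAVEN_HOME="/opt/apache-maven-3.8.6"
export PATH="$M2_HOME/bin:$PATH"
EOF

# Source it in a subshell and capture the first PATH entry.
first_path_entry="$( . "$profile_tmp"; printf '%s\n' "${PATH%%:*}" )"
echo "$first_path_entry"   # prints /opt/apache-maven-3.8.6/bin
rm -f "$profile_tmp"
```

If the first entry printed is a literal `$M2_HOME/bin`, the PATH line was written with single quotes and mvn will not be found.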
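System Dependencies
Next, prepare the build environment for the Native Libraries. The official instructions use Ubuntu as the reference build environment; the package names are the same on Debian, so install them as listed.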
sudo apt -y install build-essential autoconf automake libtool zlib1g-dev pkg-config libssl-dev libsasl2-dev
sudo apt -y install g++-9 gcc-9
The cmake version in the official Debian repository is too old to meet the requirement, so build and install it manually:
wget https://cmake.org/files/v3.21/cmake-3.21.7.tar.gz
tar xf cmake-3.21.7.tar.gz
cd cmake-3.21.7/
./bootstrap
make -j$(nproc)
sudo make install
Then install the native dependencies:
sudo apt -y install libbz2-dev \
libfuse-dev \
libprotobuf-dev \
libsasl2-dev \
libssl-dev \
libzstd-dev \
libsnappy-dev \
zlib1g-dev
Build Tools
Next, install the build tools:
## Protocol Buffers 3.7.1 (required to build native code)
curl -L -s -S https://github.com/protocolbuffers/protobuf/releases/download/v3.7.1/protobuf-java-3.7.1.tar.gz -o protobuf-3.7.1.tar.gz
mkdir protobuf-3.7-src
tar xf protobuf-3.7.1.tar.gz --strip-components 1 -C protobuf-3.7-src && cd protobuf-3.7-src
./configure
make -j$(nproc)
sudo make install
## Boost 1.72.0
wget https://sourceforge.net/projects/boost/files/boost/1.72.0/boost_1_72_0.tar.bz2/download -O boost_1_72_0.tar.bz2
tar xf boost_1_72_0.tar.bz2
cd boost_1_72_0/
./bootstrap.sh --prefix=/usr/
./b2 --without-python
sudo ./b2 --without-python install
After installation, check the versions of the installed components:
$ cmake --version
cmake version 3.21.7
CMake suite maintained and supported by Kitware (kitware.com/cmake).
$ protoc --version
libprotoc 3.7.1
Tip: if a command fails with an error, refresh the shared library cache first with sudo ldconfig.
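The version checks above can also be scripted. The sketch below assumes GNU `sort -V` is available (it is on Debian); version_ge and extract_version are illustrative helper names, and the minimum cmake version used here is a placeholder to be replaced with the value from BUILDING.txt in the Hadoop source tree.

```shell
# version_ge A B: succeeds when version A >= version B (uses GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# extract_version: print the first dotted version number found on stdin.
extract_version() {
    grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1
}

# Example using the outputs shown above; on a real machine use
# `cmake --version | extract_version` and `protoc --version | extract_version`.
cmake_ver="$(echo 'cmake version 3.21.7' | extract_version)"
protoc_ver="$(echo 'libprotoc 3.7.1' | extract_version)"
required_cmake="3.19"   # assumed minimum, adjust to the value in BUILDING.txt

version_ge "$cmake_ver" "$required_cmake" && echo "cmake $cmake_ver OK"
[ "$protoc_ver" = "3.7.1" ] && echo "protoc $protoc_ver OK"
```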
Build
Download the Hadoop 3.3.4 source package, extract it, and start the build:
wget https://dlcdn.apache.org/hadoop/common/hadoop-3.3.4/hadoop-3.3.4-src.tar.gz
tar xf hadoop-3.3.4-src.tar.gz
cd hadoop-3.3.4-src/
mvn clean package -Pdist,native -DskipTests -Dtar -Dmaven.javadoc.skip=true -X
The following output indicates that the build succeeded:
[INFO] No site descriptor found: nothing to attach.
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Apache Hadoop Main 3.3.4:
[INFO]
[INFO] Apache Hadoop Main ................................. SUCCESS [02:01 min]
[INFO] Apache Hadoop Build Tools .......................... SUCCESS [01:01 min]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [ 37.537 s]
[INFO] Apache Hadoop Annotations .......................... SUCCESS [ 16.711 s]
[INFO] Apache Hadoop Assemblies ........................... SUCCESS [ 0.047 s]
[INFO] Apache Hadoop Project Dist POM ..................... SUCCESS [ 50.656 s]
[INFO] Apache Hadoop Maven Plugins ........................ SUCCESS [01:33 min]
[INFO] Apache Hadoop MiniKDC .............................. SUCCESS [ 35.818 s]
[INFO] Apache Hadoop Auth ................................. SUCCESS [02:48 min]
[INFO] Apache Hadoop Auth Examples ........................ SUCCESS [ 5.573 s]
[INFO] Apache Hadoop Common ............................... SUCCESS [02:42 min]
[INFO] Apache Hadoop NFS .................................. SUCCESS [ 7.504 s]
[INFO] Apache Hadoop KMS .................................. SUCCESS [ 8.095 s]
[INFO] Apache Hadoop Registry ............................. SUCCESS [ 5.720 s]
[INFO] Apache Hadoop Common Project ....................... SUCCESS [ 0.019 s]
[INFO] Apache Hadoop HDFS Client .......................... SUCCESS [01:41 min]
[INFO] Apache Hadoop HDFS ................................. SUCCESS [ 57.229 s]
[INFO] Apache Hadoop HDFS Native Client ................... SUCCESS [01:12 min]
[INFO] Apache Hadoop HttpFS ............................... SUCCESS [ 7.278 s]
[INFO] Apache Hadoop HDFS-NFS ............................. SUCCESS [ 1.293 s]
[INFO] Apache Hadoop HDFS-RBF ............................. SUCCESS [ 14.657 s]
[INFO] Apache Hadoop HDFS Project ......................... SUCCESS [ 0.028 s]
[INFO] Apache Hadoop YARN ................................. SUCCESS [ 0.045 s]
[INFO] Apache Hadoop YARN API ............................. SUCCESS [ 21.863 s]
[INFO] Apache Hadoop YARN Common .......................... SUCCESS [ 41.097 s]
[INFO] Apache Hadoop YARN Server .......................... SUCCESS [ 0.025 s]
[INFO] Apache Hadoop YARN Server Common ................... SUCCESS [ 17.717 s]
[INFO] Apache Hadoop YARN NodeManager ..................... SUCCESS [ 47.475 s]
[INFO] Apache Hadoop YARN Web Proxy ....................... SUCCESS [ 4.568 s]
[INFO] Apache Hadoop YARN ApplicationHistoryService ....... SUCCESS [ 7.874 s]
[INFO] Apache Hadoop YARN Timeline Service ................ SUCCESS [ 3.912 s]
[INFO] Apache Hadoop YARN ResourceManager ................. SUCCESS [ 9.536 s]
[INFO] Apache Hadoop YARN Server Tests .................... SUCCESS [ 0.583 s]
[INFO] Apache Hadoop YARN Client .......................... SUCCESS [ 5.542 s]
[INFO] Apache Hadoop YARN SharedCacheManager .............. SUCCESS [ 1.138 s]
[INFO] Apache Hadoop YARN Timeline Plugin Storage ......... SUCCESS [ 1.066 s]
[INFO] Apache Hadoop YARN TimelineService HBase Backend ... SUCCESS [ 0.036 s]
[INFO] Apache Hadoop YARN TimelineService HBase Common .... SUCCESS [ 20.700 s]
[INFO] Apache Hadoop YARN TimelineService HBase Client .... SUCCESS [ 42.930 s]
[INFO] Apache Hadoop YARN TimelineService HBase Servers ... SUCCESS [ 0.018 s]
[INFO] Apache Hadoop YARN TimelineService HBase Server 1.2 SUCCESS [ 1.809 s]
[INFO] Apache Hadoop YARN TimelineService HBase tests ..... SUCCESS [ 41.403 s]
[INFO] Apache Hadoop YARN Router .......................... SUCCESS [ 1.524 s]
[INFO] Apache Hadoop YARN TimelineService DocumentStore ... SUCCESS [ 41.767 s]
[INFO] Apache Hadoop YARN Applications .................... SUCCESS [ 0.068 s]
[INFO] Apache Hadoop YARN DistributedShell ................ SUCCESS [ 1.118 s]
[INFO] Apache Hadoop YARN Unmanaged Am Launcher ........... SUCCESS [ 0.819 s]
[INFO] Apache Hadoop MapReduce Client ..................... SUCCESS [ 0.194 s]
[INFO] Apache Hadoop MapReduce Core ....................... SUCCESS [ 5.284 s]
[INFO] Apache Hadoop MapReduce Common ..................... SUCCESS [ 2.601 s]
[INFO] Apache Hadoop MapReduce Shuffle .................... SUCCESS [ 1.258 s]
[INFO] Apache Hadoop MapReduce App ........................ SUCCESS [ 5.154 s]
[INFO] Apache Hadoop MapReduce HistoryServer .............. SUCCESS [ 1.950 s]
[INFO] Apache Hadoop MapReduce JobClient .................. SUCCESS [ 2.218 s]
[INFO] Apache Hadoop Mini-Cluster ......................... SUCCESS [ 0.494 s]
[INFO] Apache Hadoop YARN Services ........................ SUCCESS [ 0.096 s]
[INFO] Apache Hadoop YARN Services Core ................... SUCCESS [ 6.723 s]
[INFO] Apache Hadoop YARN Services API .................... SUCCESS [ 0.769 s]
[INFO] Apache Hadoop YARN Application Catalog ............. SUCCESS [ 0.082 s]
[INFO] Apache Hadoop YARN Application Catalog Webapp ...... SUCCESS [05:20 min]
[INFO] Apache Hadoop YARN Application Catalog Docker Image SUCCESS [ 0.023 s]
[INFO] Apache Hadoop YARN Application MaWo ................ SUCCESS [ 0.082 s]
[INFO] Apache Hadoop YARN Application MaWo Core ........... SUCCESS [ 1.033 s]
[INFO] Apache Hadoop YARN Site ............................ SUCCESS [ 0.017 s]
[INFO] Apache Hadoop YARN Registry ........................ SUCCESS [ 0.241 s]
[INFO] Apache Hadoop YARN UI .............................. SUCCESS [ 0.064 s]
[INFO] Apache Hadoop YARN CSI ............................. SUCCESS [ 37.246 s]
[INFO] Apache Hadoop YARN Project ......................... SUCCESS [ 7.038 s]
[INFO] Apache Hadoop MapReduce HistoryServer Plugins ...... SUCCESS [ 0.807 s]
[INFO] Apache Hadoop MapReduce NativeTask ................. SUCCESS [ 15.014 s]
[INFO] Apache Hadoop MapReduce Uploader ................... SUCCESS [ 0.840 s]
[INFO] Apache Hadoop MapReduce Examples ................... SUCCESS [ 1.622 s]
[INFO] Apache Hadoop MapReduce ............................ SUCCESS [ 2.737 s]
[INFO] Apache Hadoop MapReduce Streaming .................. SUCCESS [ 6.104 s]
[INFO] Apache Hadoop Distributed Copy ..................... SUCCESS [ 1.699 s]
[INFO] Apache Hadoop Client Aggregator .................... SUCCESS [ 0.994 s]
[INFO] Apache Hadoop Dynamometer Workload Simulator ....... SUCCESS [ 1.170 s]
[INFO] Apache Hadoop Dynamometer Cluster Simulator ........ SUCCESS [ 1.490 s]
[INFO] Apache Hadoop Dynamometer Block Listing Generator .. SUCCESS [ 0.930 s]
[INFO] Apache Hadoop Dynamometer Dist ..................... SUCCESS [ 3.411 s]
[INFO] Apache Hadoop Dynamometer .......................... SUCCESS [ 0.083 s]
[INFO] Apache Hadoop Archives ............................. SUCCESS [ 0.887 s]
[INFO] Apache Hadoop Archive Logs ......................... SUCCESS [ 1.075 s]
[INFO] Apache Hadoop Rumen ................................ SUCCESS [ 1.840 s]
[INFO] Apache Hadoop Gridmix .............................. SUCCESS [ 1.439 s]
[INFO] Apache Hadoop Data Join ............................ SUCCESS [ 1.114 s]
[INFO] Apache Hadoop Extras ............................... SUCCESS [ 1.082 s]
[INFO] Apache Hadoop Pipes ................................ SUCCESS [ 1.907 s]
[INFO] Apache Hadoop OpenStack support .................... SUCCESS [ 1.298 s]
[INFO] Apache Hadoop Amazon Web Services support .......... SUCCESS [ 33.725 s]
[INFO] Apache Hadoop Kafka Library support ................ SUCCESS [ 7.707 s]
[INFO] Apache Hadoop Azure support ........................ SUCCESS [ 8.595 s]
[INFO] Apache Hadoop Aliyun OSS support ................... SUCCESS [ 15.269 s]
[INFO] Apache Hadoop Scheduler Load Simulator ............. SUCCESS [ 1.651 s]
[INFO] Apache Hadoop Resource Estimator Service ........... SUCCESS [ 9.306 s]
[INFO] Apache Hadoop Azure Data Lake support .............. SUCCESS [ 13.255 s]
[INFO] Apache Hadoop Image Generation Tool ................ SUCCESS [ 1.232 s]
[INFO] Apache Hadoop Tools Dist ........................... SUCCESS [ 9.532 s]
[INFO] Apache Hadoop Tools ................................ SUCCESS [ 0.017 s]
[INFO] Apache Hadoop Client API ........................... SUCCESS [01:16 min]
[INFO] Apache Hadoop Client Runtime ....................... SUCCESS [01:02 min]
[INFO] Apache Hadoop Client Packaging Invariants .......... SUCCESS [ 3.033 s]
[INFO] Apache Hadoop Client Test Minicluster .............. SUCCESS [01:46 min]
[INFO] Apache Hadoop Client Packaging Invariants for Test . SUCCESS [ 0.076 s]
[INFO] Apache Hadoop Client Packaging Integration Tests ... SUCCESS [ 5.577 s]
[INFO] Apache Hadoop Distribution ......................... SUCCESS [ 26.695 s]
[INFO] Apache Hadoop Client Modules ....................... SUCCESS [ 0.040 s]
[INFO] Apache Hadoop Cloud Storage ........................ SUCCESS [ 0.171 s]
[INFO] Apache Hadoop Tencent COS Support .................. SUCCESS [ 5.319 s]
[INFO] Apache Hadoop Cloud Storage Project ................ SUCCESS [ 0.014 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 38:03 min
[INFO] Finished at: 2022-12-24T10:47:07+08:00
[INFO] ------------------------------------------------------------------------
The generated package is in the hadoop-dist/target/ directory; hadoop-3.3.4.tar.gz is the final build artifact.
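A small helper can confirm that the artifact exists before it is copied to other machines. This is a sketch only: find_dist_tarball is an illustrative name, and it assumes the tarball follows the hadoop-*.tar.gz naming described above.

```shell
# find_dist_tarball SRC_DIR: print the path of the distribution tarball under
# SRC_DIR/hadoop-dist/target, or return non-zero when none is found.
find_dist_tarball() {
    local src_dir="$1"
    local tarball
    tarball="$(find "$src_dir/hadoop-dist/target" -maxdepth 1 \
        -name 'hadoop-*.tar.gz' 2>/dev/null | head -n 1)"
    [ -n "$tarball" ] && printf '%s\n' "$tarball"
}

# Example: run from the directory that contains hadoop-3.3.4-src/.
find_dist_tarball hadoop-3.3.4-src || echo "no tarball found, check the build log"
```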
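Optional
ISA-L Support
ISA-L (Intelligent Storage Acceleration Library) is a storage acceleration library developed by Intel that can improve HDFS performance. It can be built on both ARMv8 (aarch64) and AMD64 (x86_64).
The build detects this component automatically and enables support for it, so install the ISA-L library with the steps below before building Hadoop; the resulting build then supports ISA-L natively.
## Install dependencies
sudo apt -y install nasm help2man libtool
## Clone the source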
git clone https://github.com/intel/isa-l
cd isa-l/
./autogen.sh
./configure --prefix=/usr --libdir=/usr/lib
make
sudo make install
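PMDK Support
PMDK (Persistent Memory Development Kit): Hadoop uses the PMDK user-space libraries to read and write data, which reduces user/kernel mode switches and file system overhead and improves the read/write performance of the cluster.
Unlike the other extensions, PMDK support is not compiled in by default even when the required libraries are detected on the system. Install the dependencies below, then rebuild with an extra parameter to enable it.
On amd64 (x86_64), the packages can be installed with the distribution's package manager:
## Runtime dependencies only (install on the deployment machines)
sudo apt -y install libpmem1 librpmem1 libpmemblk1 libpmemlog1 libpmemobj1 libpmempool1
## Development packages only (install on the build machine)
sudo apt -y install libpmem-dev librpmem-dev libpmemblk-dev libpmemlog-dev libpmemobj-dev libpmempool-dev
On arm64 (aarch64) there are no prebuilt packages, so build it manually:
## Install build dependencies
sudo apt install -y autoconf automake pkg-config libglib2.0-dev libfabric-dev pandoc libncurses5-dev
## Optional dependencies
sudo apt install -y libfabric-dev libndctl-dev libdaxctl-dev
## Clone the source
git clone https://github.com/pmem/pmdk
cd pmdk/
## Build
make
sudo make install prefix=/usr
Then build Hadoop with the following command: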
mvn clean package -Pdist,native -DskipTests -Dtar -Dmaven.javadoc.skip=true -Drequire.pmdk -X
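Because -Drequire.pmdk makes the build fail when libpmem is missing, a build script may want to add the flag only when the development header is present. The sketch below assumes that libpmem-dev installs libpmem.h under /usr/include; native_build_args is an illustrative helper name.

```shell
# native_build_args INCLUDE_DIR: print Maven arguments for a native build,
# adding -Drequire.pmdk only when libpmem.h exists under INCLUDE_DIR.
native_build_args() {
    include_dir="${1:-/usr/include}"
    args="clean package -Pdist,native -DskipTests -Dtar -Dmaven.javadoc.skip=true"
    if [ -f "$include_dir/libpmem.h" ]; then
        args="$args -Drequire.pmdk"
    fi
    printf '%s\n' "$args"
}

# Show the command that would be run; remove `echo` to actually start the build.
# The unquoted substitution is intentional so the arguments are split into words.
echo mvn $(native_build_args /usr/include)
```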
After the build, deploy the new package and run the check again:
$ hadoop checknative
2022-12-25 02:10:45,702 INFO bzip2.Bzip2Factory: Successfully loaded & initialized native-bzip2 library system-native
2022-12-25 02:10:45,703 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
2023-01-03 02:10:45,729 INFO nativeio.NativeIO: The native code was built with PMDK support, and PMDK libs were loaded successfully.
Native library checking:
hadoop: true /opt/hadoop-3.3.4/lib/native/libhadoop.so.1.0.0
zlib: true /lib/x86_64-linux-gnu/libz.so.1
zstd : true /lib/x86_64-linux-gnu/libzstd.so.1
bzip2: true /lib/x86_64-linux-gnu/libbz2.so.1
openssl: true /lib/x86_64-linux-gnu/libcrypto.so.1.1
ISA-L: true /lib/libisal.so.2
PMDK: true /usr/lib/x86_64-linux-gnu/libpmem.so.1.0.0
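FAQ
a) Some dependencies fail to download
- Switch to the Alibaba Cloud (Yunxiao) Maven mirror repository.
- Set up your own proxy, then configure Maven to use it with the following command.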
export MAVEN_OPTS="-DproxyHost=127.0.0.1 -DproxyPort=8080"
The HTTP proxy can also be specified in $M2_HOME/conf/settings.xml:
<proxies>
  <proxy>
    <id>optional</id>
    <active>true</active>
    <protocol>http</protocol>
    <host>127.0.0.1</host>
    <port>8118</port>
    <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
  </proxy>
</proxies>
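For the mirror option listed first under a), a minimal sketch of a settings file follows. The mirror id and the URL https://maven.aliyun.com/repository/public are assumptions and should be checked against the mirror provider's current documentation. The helper name write_mirror_settings is illustrative, and the file is written to a scratch location so that an existing settings.xml is not overwritten.

```shell
# write_mirror_settings FILE: write a minimal Maven settings file that routes
# requests for the central repository through a mirror.
write_mirror_settings() {
    cat > "$1" <<'EOF'
<settings>
  <mirrors>
    <mirror>
      <id>aliyun-public</id>
      <mirrorOf>central</mirrorOf>
      <name>Aliyun public mirror (assumed URL)</name>
      <url>https://maven.aliyun.com/repository/public</url>
    </mirror>
  </mirrors>
</settings>
EOF
}

# Write to a scratch file and confirm the mirror entry is there.
settings_file="$(mktemp)"
write_mirror_settings "$settings_file"
grep '<url>' "$settings_file"
```

The scratch file can be passed to a single build with `mvn -s "$settings_file" ...`, which leaves the global configuration untouched.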
b) help2man: command not found
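Building and installing isa-l fails with the following error:
/bin/bash: line 1: help2man: command not found
make[2]: [Makefile:4791: programs/igzip.1] Error 127 (ignored)
This happens because the dependencies were not installed as required, so help2man is missing. Install it manually.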
c) libisal.so.2: cannot open shared object file
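After building and installing isa-l, the error libisal.so.2: cannot open shared object file: No such file or directory still appears.
This is because Debian-based and RHEL-based distributions use different default locations for 64-bit libraries: Debian uses /lib/ and Red Hat uses /lib64/. Creating a symlink manually fixes it: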
sudo ln -s /usr/lib64/libisal.so.2 /usr/lib/libisal.so.2
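Before creating the symlink, it helps to confirm where the library was actually installed. A minimal sketch, with find_shared_lib as an illustrative helper name and the listed directories as assumed candidates:

```shell
# find_shared_lib NAME DIR...: print every path under the given directories
# (up to two levels deep) whose file name matches NAME.
find_shared_lib() {
    lib_name="$1"; shift
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        find "$dir" -maxdepth 2 \( -type f -o -type l \) -name "$lib_name" 2>/dev/null
    done
}

# Example: look for ISA-L in the usual library directories.
find_shared_lib 'libisal.so*' /usr/lib /usr/lib64 /usr/local/lib /usr/local/lib64
```

If the library only shows up under a lib64 directory, the symlink above (or rebuilding isa-l with --libdir=/usr/lib) makes it visible to the loader on Debian.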
d) checknative reports false for some native components
For example:
$ hadoop checknative
Native library checking:
hadoop: true /opt/hadoop-3.3.4/lib/native/libhadoop.so.1.0.0
zlib: true /lib/x86_64-linux-gnu/libz.so.1
zstd : true /lib/x86_64-linux-gnu/libzstd.so.1
bzip2: true /lib/x86_64-linux-gnu/libbz2.so.1
openssl: false Cannot load libcrypto.so (libcrypto.so: cannot open shared object file: No such file or directory)!
ISA-L: false Loading ISA-L failed: Failed to load libisal.so.2 (libisal.so.2: cannot open shared object file: No such file or directory)
PMDK: false The native code was built without PMDK support.
The components and their corresponding package names are listed in the table below:
Object Name | Package Name | Source Name |
---|---|---|
zlib | zlib1g-dev | / |
zstd | libzstd-dev | / |
bzip2 | libbz2-dev | / |
openssl | libssl-dev | / |
ISA-L | / | https://github.com |
PMDK | / | https://pmem.io |
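To see quickly which rows of the table need attention, the checknative output can be filtered for components reported as false. This is a sketch that assumes the output layout shown above; failed_native_components is an illustrative helper name.

```shell
# failed_native_components: read `hadoop checknative` output on stdin and
# print the name of every component reported as false.
failed_native_components() {
    awk '
        /^Native library checking:/ { in_table = 1; next }
        in_table && /:/ {
            name = $0; sub(/[ \t]*:.*/, "", name)      # text before the first colon
            sub(/^[ \t]+/, "", name)                   # drop leading whitespace
            rest = $0; sub(/^[^:]*:[ \t]*/, "", rest)  # text after the first colon
            if (rest ~ /^false/) print name
        }
    '
}

# Example with a shortened copy of the output shown above.
failed_native_components <<'EOF'
Native library checking:
hadoop: true /opt/hadoop-3.3.4/lib/native/libhadoop.so.1.0.0
zstd : true /lib/x86_64-linux-gnu/libzstd.so.1
ISA-L: false Loading ISA-L failed: Failed to load libisal.so.2
PMDK: false The native code was built without PMDK support.
EOF
```

On a deployed node the input would come from `hadoop checknative 2>&1 | failed_native_components`.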
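Appendix
Reference links
This article was written by 柒 and is licensed under the Creative Commons Attribution 4.0 International License.
Please credit the source before reposting articles from this site; the author reserves all rights.
Last edited: 2023-10-12 14:10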