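I have recently been studying basic assembly instructions. Disassembling binaries along the way taught me a good deal about assembly-level functions, the stack, and machine instructions that can speed up code. For example, a string can be copied one character at a time with MOV, or with the string instructions, which need fewer instructions and can run faster.
1. String instructions
1.1 String load: LODS
The load instruction reads the element addressed by ESI into the accumulator (AL, AX, EAX, or RAX) and then adjusts ESI by the operand size: ESI is incremented when the direction flag DF is 0 and decremented when DF is 1. The forms are LODSB, LODSW, LODSD, and LODSQ.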
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description |
AC | LODS m8 | ZO | Valid | Valid | For legacy mode, Load byte at address DS:(E)SI into AL. For 64-bit mode load byte at address (R)SI into AL. |
AD | LODS m16 | ZO | Valid | Valid | For legacy mode, Load word at address DS:(E)SI into AX. For 64-bit mode load word at address (R)SI into AX. |
AD | LODS m32 | ZO | Valid | Valid | For legacy mode, Load dword at address DS:(E)SI into EAX. For 64-bit mode load dword at address (R)SI into EAX. |
REX.W + AD | LODS m64 | ZO | Valid | N.E. | Load qword at address (R)SI into RAX. |
AC | LODSB | ZO | Valid | Valid | For legacy mode, Load byte at address DS:(E)SI into AL. For 64-bit mode load byte at address (R)SI into AL. |
AD | LODSW | ZO | Valid | Valid | For legacy mode, Load word at address DS:(E)SI into AX. For 64-bit mode load word at address (R)SI into AX. |
AD | LODSD | ZO | Valid | Valid | For legacy mode, Load dword at address DS:(E)SI into EAX. For 64-bit mode load dword at address (R)SI into EAX. |
REX.W + AD | LODSQ | ZO | Valid | N.E. | Load qword at address (R)SI into RAX. |
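The load-then-adjust behavior of LODSB can be modeled in plain C. This is only a minimal sketch: the `lods_state` struct, its field names, and the `sum_bytes` helper are hypothetical names chosen for illustration, and segment registers are ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the state LODSB touches (names are illustrative). */
typedef struct {
    const uint8_t *si;  /* source index: (E)SI or RSI */
    uint8_t al;         /* low byte of the accumulator */
    int df;             /* direction flag: 0 = forward, 1 = backward */
} lods_state;

/* One LODSB step: AL = [SI], then SI moves by one byte in the DF direction. */
static void lodsb(lods_state *s)
{
    s->al = *s->si;
    s->si += s->df ? -1 : 1;
}

/* Sum the bytes of a buffer by issuing `count` LODSB steps with DF = 0. */
static unsigned sum_bytes(const uint8_t *buf, size_t count)
{
    lods_state s = { buf, 0, 0 };
    unsigned total = 0;
    while (count--) {
        lodsb(&s);
        total += s.al;
    }
    return total;
}
```

LODSW, LODSD, and LODSQ would follow the same pattern with a wider accumulator and a step of 2, 4, or 8 bytes.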
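1.2 String store: STOS
The store instruction writes the accumulator to the address in EDI, overwriting whatever is stored there, and then adjusts EDI by the operand size (up or down, depending on DF). The forms are STOSB, STOSW, STOSD, and STOSQ.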
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description |
AA | STOS m8 | NA | Valid | Valid | For legacy mode, store AL at address ES:(E)DI; For 64-bit mode store AL at address RDI or EDI. |
AB | STOS m16 | NA | Valid | Valid | For legacy mode, store AX at address ES:(E)DI; For 64-bit mode store AX at address RDI or EDI. |
AB | STOS m32 | NA | Valid | Valid | For legacy mode, store EAX at address ES:(E)DI; For 64-bit mode store EAX at address RDI or EDI. |
REX.W + AB | STOS m64 | NA | Valid | N.E. | Store RAX at address RDI or EDI. |
AA | STOSB | NA | Valid | Valid | For legacy mode, store AL at address ES:(E)DI; For 64-bit mode store AL at address RDI or EDI. |
AB | STOSW | NA | Valid | Valid | For legacy mode, store AX at address ES:(E)DI; For 64-bit mode store AX at address RDI or EDI. |
AB | STOSD | NA | Valid | Valid | For legacy mode, store EAX at address ES:(E)DI; For 64-bit mode store EAX at address RDI or EDI. |
REX.W + AB | STOSQ | NA | Valid | N.E. | Store RAX at address RDI or EDI. |
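Combined with a REP prefix, STOS fills a block of memory with the accumulator value. The following C sketch models REP STOSB under the assumption that DF is 0; `rep_stosb` is a hypothetical helper name, not a library function.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of REP STOSB with DF = 0: store AL at [DI], advance DI,
   and repeat until the count in CX reaches zero.
   Returns the final DI value. */
static uint8_t *rep_stosb(uint8_t *di, uint8_t al, size_t cx)
{
    while (cx != 0) {
        *di = al;   /* STOSB: [DI] = AL */
        di++;       /* DI += 1 because DF = 0 */
        cx--;       /* REP decrements the counter */
    }
    return di;
}
```

Zeroing the tail of a buffer, for instance, corresponds to calling `rep_stosb(buf + used, 0, size - used)`.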
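1.3 String move: MOVS
The move instruction copies the element addressed by ESI to the address in EDI and then adjusts both ESI and EDI by the operand size (up or down, depending on DF). The forms are MOVSB, MOVSW, MOVSD, and MOVSQ.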
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description |
A4 | MOVS m8, m8 | ZO | Valid | Valid | For legacy mode, Move byte from address DS:(E)SI to ES:(E)DI. For 64-bit mode move byte from address (R|E)SI to (R|E)DI. |
A5 | MOVS m16, m16 | ZO | Valid | Valid | For legacy mode, move word from address DS:(E)SI to ES:(E)DI. For 64-bit mode move word at address (R|E)SI to (R|E)DI. |
A5 | MOVS m32, m32 | ZO | Valid | Valid | For legacy mode, move dword from address DS:(E)SI to ES:(E)DI. For 64-bit mode move dword from address (R|E)SI to (R|E)DI. |
REX.W + A5 | MOVS m64, m64 | ZO | Valid | N.E. | Move qword from address (R|E)SI to (R|E)DI. |
A4 | MOVSB | ZO | Valid | Valid | For legacy mode, Move byte from address DS:(E)SI to ES:(E)DI. For 64-bit mode move byte from address (R|E)SI to (R|E)DI. |
A5 | MOVSW | ZO | Valid | Valid | For legacy mode, move word from address DS:(E)SI to ES:(E)DI. For 64-bit mode move word at address (R|E)SI to (R|E)DI. |
A5 | MOVSD | ZO | Valid | Valid | For legacy mode, move dword from address DS:(E)SI to ES:(E)DI. For 64-bit mode move dword from address (R|E)SI to (R|E)DI. |
REX.W + A5 | MOVSQ | ZO | Valid | N.E. | Move qword from address (R|E)SI to (R|E)DI. |
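A REP-prefixed MOVSB behaves like a forward byte copy. The C sketch below models it with DF assumed to be 0; `movs_state`, `rep_movsb`, and `copy32` are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register model; the field names are illustrative. */
typedef struct {
    const uint8_t *si;  /* source pointer, (E)SI */
    uint8_t *di;        /* destination pointer, (E)DI */
    size_t cx;          /* repeat count, (E)CX */
} movs_state;

/* REP MOVSB with DF = 0: copy one byte, advance both pointers, count down. */
static void rep_movsb(movs_state *s)
{
    while (s->cx != 0) {
        *s->di = *s->si;
        s->si++;
        s->di++;
        s->cx--;
    }
}

/* A 32-byte buffer copy expressed through the model: 32 byte moves. */
static void copy32(char dst[32], const char src[32])
{
    movs_state s = { (const uint8_t *)src, (uint8_t *)dst, 32 };
    rep_movsb(&s);
}
```

Because the copy always runs forward here, overlapping buffers behave differently from `memmove`; the model makes no attempt to handle that case.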
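1.4 String scan: SCAS
The scan instruction compares the accumulator with the byte, word, doubleword, or quadword addressed by EDI, sets the status flags, and then adjusts EDI by the operand size (up or down, depending on DF). The forms are SCASB, SCASW, SCASD, and SCASQ.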
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description |
AE | SCAS m8 | ZO | Valid | Valid | Compare AL with byte at ES:(E)DI or RDI, then set status flags.* |
AF | SCAS m16 | ZO | Valid | Valid | Compare AX with word at ES:(E)DI or RDI, then set status flags.* |
AF | SCAS m32 | ZO | Valid | Valid | Compare EAX with doubleword at ES:(E)DI or RDI then set status flags.* |
REX.W + AF | SCAS m64 | ZO | Valid | N.E. | Compare RAX with quadword at RDI or EDI then set status flags. |
AE | SCASB | ZO | Valid | Valid | Compare AL with byte at ES:(E)DI or RDI then set status flags.* |
AF | SCASW | ZO | Valid | Valid | Compare AX with word at ES:(E)DI or RDI then set status flags.* |
AF | SCASD | ZO | Valid | Valid | Compare EAX with doubleword at ES:(E)DI or RDI then set status flags.* |
REX.W + AF | SCASQ | ZO | Valid | N.E. | Compare RAX with quadword at RDI or EDI then set status flags. |
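A common use of SCASB is searching for a byte, for example the terminating NUL of a C string, under a REPNE prefix. The sketch below models that in C with DF assumed to be 0. `repne_scasb` and `scas_strlen` are hypothetical names, and the model reports ZF as 0 when no iteration runs, whereas the real instruction leaves the flags unchanged in that case.

```c
#include <stddef.h>
#include <stdint.h>

/* REPNE SCASB with DF = 0: compare AL with [DI], advance DI, decrement CX,
   and stop when the bytes match or CX reaches zero.
   Returns the final DI; *zf is 1 if the last comparison matched. */
static const uint8_t *repne_scasb(const uint8_t *di, uint8_t al,
                                  size_t *cx, int *zf)
{
    *zf = 0;  /* modeling simplification, see the note above */
    while (*cx != 0) {
        *zf = (*di == al);
        di++;
        (*cx)--;
        if (*zf)
            break;
    }
    return di;
}

/* Length of a string, scanning at most `max` bytes for the NUL terminator.
   Returns `max` if no terminator was found within that range. */
static size_t scas_strlen(const char *s, size_t max)
{
    size_t cx = max;
    int zf;
    const uint8_t *end = repne_scasb((const uint8_t *)s, 0, &cx, &zf);
    /* DI has already stepped past the matching byte, hence the -1. */
    return zf ? (size_t)(end - (const uint8_t *)s) - 1 : max;
}
```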
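1.5 String compare: CMPS
The compare instruction compares the element addressed by ESI with the element addressed by EDI, sets the status flags, and then adjusts both ESI and EDI by the operand size (up or down, depending on DF). The forms are CMPSB, CMPSW, CMPSD, and CMPSQ.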
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description |
A6 | CMPS m8, m8 | ZO | Valid | Valid | For legacy mode, compare byte at address DS:(E)SI with byte at address ES:(E)DI; For 64-bit mode compare byte at address (R|E)SI to byte at address (R|E)DI. The status flags are set accordingly. |
A7 | CMPS m16, m16 | ZO | Valid | Valid | For legacy mode, compare word at address DS:(E)SI with word at address ES:(E)DI; For 64-bit mode compare word at address (R|E)SI with word at address (R|E)DI. The status flags are set accordingly. |
A7 | CMPS m32, m32 | ZO | Valid | Valid | For legacy mode, compare dword at address DS:(E)SI with dword at address ES:(E)DI; For 64-bit mode compare dword at address (R|E)SI with dword at address (R|E)DI. The status flags are set accordingly. |
REX.W + A7 | CMPS m64, m64 | ZO | Valid | N.E. | Compares quadword at address (R|E)SI with quadword at address (R|E)DI and sets the status flags accordingly. |
A6 | CMPSB | ZO | Valid | Valid | For legacy mode, compare byte at address DS:(E)SI with byte at address ES:(E)DI; For 64-bit mode compare byte at address (R|E)SI with byte at address (R|E)DI. The status flags are set accordingly. |
A7 | CMPSW | ZO | Valid | Valid | For legacy mode, compare word at address DS:(E)SI with word at address ES:(E)DI; For 64-bit mode compare word at address (R|E)SI with word at address (R|E)DI. The status flags are set accordingly. |
A7 | CMPSD | ZO | Valid | Valid | For legacy mode, compare dword at address DS:(E)SI with dword at address ES:(E)DI; For 64-bit mode compare dword at address (R|E)SI with dword at address (R|E)DI. The status flags are set accordingly. |
REX.W + A7 | CMPSQ | ZO | Valid | N.E. | Compares quadword at address (R|E)SI with quadword at address (R|E)DI and sets the status flags accordingly. |
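With a REPE prefix, CMPSB compares two buffers until the first mismatch. The following C sketch models that behavior with DF assumed to be 0; `repe_cmpsb` and `buffers_equal` are hypothetical helper names.

```c
#include <stddef.h>
#include <stdint.h>

/* REPE CMPSB with DF = 0: compare [SI] with [DI], advance both pointers,
   decrement CX, and stop on the first mismatch or when CX reaches zero.
   Returns the count left in CX; *zf is 1 if the last comparison was equal
   and is left untouched if no comparison runs. */
static size_t repe_cmpsb(const uint8_t *si, const uint8_t *di,
                         size_t cx, int *zf)
{
    while (cx != 0) {
        *zf = (*si == *di);
        si++;
        di++;
        cx--;
        if (!*zf)
            break;
    }
    return cx;
}

/* Returns 1 if the first n bytes of a and b are identical, else 0. */
static int buffers_equal(const void *a, const void *b, size_t n)
{
    int zf = 1;  /* an empty comparison is treated as equal */
    repe_cmpsb(a, b, n, &zf);
    return zf;
}
```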
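2. Example
The C string-copy function below compiles to different machine code depending on the compiler and its options, and the statement dst[n] = src[n] may end up implemented with string instructions. Here is the copy_str function from the source file casm.c: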
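#include <stdio.h>

void copy_str(void)
{
    char src[32] = "this is a string";  /* the remaining bytes are zero-filled */
    char dst[32] = "";                   /* all 32 bytes are zero */
    for (int n = 0; n < 32; n++)         /* copy the whole buffer byte by byte */
    {
        dst[n] = src[n];
    }
    printf("copy_str called: %s\n", dst);
}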
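The examples in this article were compiled with CL, the Microsoft (R) C/C++ Optimizing Compiler version 19.29.30133 for x86, and with GCC 4.7.2.
2.1 Microsoft CL
2.1.1 No optimization
Run the following commands to compile and link the executable: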
CL /c casm.c
LINK /SUBSYSTEM:CONSOLE /RELEASE casm.obj
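Without optimization, CL copies the string with a series of MOV instructions, as shown in the figure below:
2.1.2 O1 optimization (favor size)
Run the following commands to compile and link the executable: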
CL /c /O1 /GA /w /TC /nologo /Fo casm.c
LINK /SUBSYSTEM:CONSOLE /RELEASE casm.obj
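The figure below shows the machine code generated for copy_str. MOVS is used five times, at 0x00AC1048 and other addresses, to copy the 17 bytes of "this is a string" (16 characters plus the terminating NUL), and STOS is used five times, at 0x00AC1050, 0x00AC1054, and other addresses, to zero the remaining 15 bytes:
2.1.3 O2 optimization (favor speed)
Run the following commands to compile and link the executable: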
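The counts add up as follows: the string literal occupies 17 bytes including its NUL terminator, which leaves 15 of the 32 bytes to be zeroed. The sketch below assumes the compiler covers each run greedily with the widest element first (dword, then word, then byte). That split is an assumption made for illustration, since the text states only the totals; `op_split`, `split_ops`, and `total_ops` are hypothetical names.

```c
#include <stddef.h>

/* Number of dword, word, and byte string operations covering n bytes. */
typedef struct {
    size_t dwords;
    size_t words;
    size_t bytes;
} op_split;

/* Greedy split: as many 4-byte operations as fit, then one 2-byte
   operation if at least 2 bytes remain, then one 1-byte operation. */
static op_split split_ops(size_t n)
{
    op_split s;
    s.dwords = n / 4;
    s.words = (n % 4) / 2;
    s.bytes = n % 2;
    return s;
}

/* Total number of string instructions in a split. */
static size_t total_ops(op_split s)
{
    return s.dwords + s.words + s.bytes;
}
```

Under this assumption, 17 bytes become four dword moves plus one byte move, and 15 bytes become three dword stores, one word store, and one byte store, which gives five instructions in each case.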
CL /c /O2 /GA /w /TC /nologo /Fo casm.c
LINK /SUBSYSTEM:CONSOLE /RELEASE casm.obj
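With O2, CL produces very compact machine code that copies the string with MOVAPS (Move Aligned Packed Single-Precision Floating-Point Values), as shown in the figure below:
2.2 GCC
2.2.1 No optimization
Run the following command to compile and link the executable: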
gcc -std=c99 .\casm.c -o .\casmgcc.exe
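The figure below shows the machine code GCC generates. It copies the string element by element with MOVS under a REP prefix, executing MOVS 17 times in total:
2.2.2 O1 optimization
Run the following command to compile and link the executable: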
gcc -std=c99 -O1 .\casm.c -o .\casmgcc.exe
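The figure below shows the machine code GCC generates. It again copies the string element by element with MOVS under a REP prefix, executing MOVS 17 times in total. In addition, to call printf, GCC reallocates stack space and executes STOS 43 times:
2.2.3 O2 optimization
Run the following command to compile and link the executable: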
gcc -std=c99 -O2 .\casm.c -o .\casmgcc.exe
The generated machine code, shown in the figure below, is essentially the same as with O1: MOVS is again executed 17 times to copy the string:
3. Summary
String instructions simplify operations such as copying strings. CL and GCC each have their own characteristics: CL with O2 generates very compact machine code, while GCC uses more instructions than CL for the string copy.