【Programming】How to Learn Assembly Language Using C
There are many ways to learn assembly language, but I recommend learning at least one assembly dialect.
Mastering one dialect makes it easier to learn others and allows you to deeply understand how computers work.
Without reading assembly, it's hard to truly understand a computer's mechanics.
Normally, I'd recommend starting with MIPS or RISC-V assembly because they're very simple.
However, this time, I'll assume you're using an Intel or AMD processor and focus on x64 assembly.
Workflow for Learning Assembly from C
1. Write a simple C program.
2. Compile it with cc -o program program.c.
3. Disassemble the binary with objdump -d -Mintel program | less.
4. Find the main function and analyze the assembly instructions.
Compile without optimization, for example with cc main.c -o ftoc (the default is -O0 when no optimization flag is given).
Optimization flags like -O2 can produce very different assembly, so stick with -O0 for learning.
Converting C to Assembly
The easiest way to learn assembly is to first create a simple C program, compile it, and disassemble the binary with objdump.
This article uses GhostBSD, but Linux, illumos, FreeBSD, OpenBSD, NetBSD, or other Unix-like OSes will work fine.
All the tools used in this article are likely already installed on your computer.
We'll use Intel syntax for assembly (specified with -Mintel in objdump).
This makes instructions read like mov eax, ebx instead of AT&T's movl %ebx, %eax.
Below is C code directly quoted from The C Programming Language, 2nd Edition by B.W. Kernighan and D.M. Ritchie, a monumental book for all C programmers.
#include <stdio.h>

int main() {
    int fahr, celsius;
    int lower, upper, step;

    lower = 0;   /* lower limit of temperature table */
    upper = 300; /* upper limit */
    step = 20;   /* step size */

    fahr = lower;
    while (fahr <= upper) {
        celsius = 5 * (fahr-32) / 9;
        printf("%d\t%d\n", fahr, celsius);
        fahr = fahr + step;
    }
}
This C program converts Fahrenheit to Celsius, outputting a table from 0°F to 300°F in 20°F increments.
The formula used is celsius = 5 * (fahr - 32) / 9.
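Because every operand is an int, the division throws away the fractional part (C has truncated toward zero since C99), which is why the table only contains whole degrees. Here is a minimal sketch of just the formula; fahr_to_celsius is a hypothetical helper name of my own, not something from the book's program:

```c
/* Hypothetical helper: the same integer formula as the loop body. */
int fahr_to_celsius(int fahr)
{
    /* Multiply before dividing: writing 5 / 9 first would give 0 in integer math. */
    return 5 * (fahr - 32) / 9;
}
```

For example, fahr_to_celsius(0) evaluates -160 / 9, which truncates to -17, and fahr_to_celsius(300) evaluates 1340 / 9, which truncates to 148.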
Next, compile it with cc main.c -o ftoc.
Then, run objdump -d -Mintel ./ftoc | less.
Press / and type main to find the main function.
What you see may vary slightly depending on your CPU architecture or OS.
In my case, the output looks like this:
00000000002016a0 <main>:
2016a0: 55 push rbp
2016a1: 48 89 e5 mov rbp, rsp
2016a4: 48 83 ec 20 sub rsp, 0x20
2016a8: c7 45 fc 00 00 00 00 mov dword ptr [rbp - 0x4], 0x0
2016af: c7 45 f0 00 00 00 00 mov dword ptr [rbp - 0x10], 0x0
2016b6: c7 45 ec 2c 01 00 00 mov dword ptr [rbp - 0x14], 0x12c
2016bd: c7 45 e8 14 00 00 00 mov dword ptr [rbp - 0x18], 0x14
2016c4: 8b 45 f0 mov eax, dword ptr [rbp - 0x10]
2016c7: 89 45 f8 mov dword ptr [rbp - 0x8], eax
2016ca: 8b 45 f8 mov eax, dword ptr [rbp - 0x8]
2016cd: 3b 45 ec cmp eax, dword ptr [rbp - 0x14]
2016d0: 7f 36 jg 0x201708 <main+0x68>
2016d2: 8b 45 f8 mov eax, dword ptr [rbp - 0x8]
2016d5: 83 e8 20 sub eax, 0x20
2016d8: 6b c0 05 imul eax, eax, 0x5
2016db: b9 09 00 00 00 mov ecx, 0x9
2016e0: 99 cdq
2016e1: f7 f9 idiv ecx
2016e3: 89 45 f4 mov dword ptr [rbp - 0xc], eax
2016e6: 8b 75 f8 mov esi, dword ptr [rbp - 0x8]
2016e9: 8b 55 f4 mov edx, dword ptr [rbp - 0xc]
2016ec: 48 bf d8 04 20 00 00 00 00 00 movabs rdi, 0x2004d8
2016f6: b0 00 mov al, 0x0
2016f8: e8 c3 00 00 00 call 0x2017c0 <printf@plt>
2016fd: 8b 45 f8 mov eax, dword ptr [rbp - 0x8]
201700: 03 45 e8 add eax, dword ptr [rbp - 0x18]
201703: 89 45 f8 mov dword ptr [rbp - 0x8], eax
201706: eb c2 jmp 0x2016ca <main+0x2a>
201708: 8b 45 fc mov eax, dword ptr [rbp - 0x4]
20170b: 48 83 c4 20 add rsp, 0x20
20170f: 5d pop rbp
201710: c3 ret
201711: cc int3
201712: cc int3
201713: cc int3
201714: cc int3
201715: cc int3
201716: cc int3
201717: cc int3
201718: cc int3
201719: cc int3
20171a: cc int3
20171b: cc int3
20171c: cc int3
20171d: cc int3
20171e: cc int3
20171f: cc int3
The int3 instructions (opcode cc) at the end are padding that the toolchain inserts so the next function starts on an aligned address.
Filling the gap with int3 rather than nop means any stray jump into it traps immediately.
They don't affect the program's logic, so we can ignore them for now.
You'll notice all numbers are in hexadecimal.
For those who missed it, I wrote a detailed article on bitwise operations here.
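If you want to double-check the hexadecimal constants in the listing, the C standard library can do the conversion for you. This is just a small sketch; hex_to_long is a hypothetical wrapper name I made up around the standard strtol function:

```c
#include <stdlib.h>

/* Hypothetical helper: parse a hexadecimal string such as "12c" or "0x12c". */
long hex_to_long(const char *text)
{
    /* Base 16 makes strtol accept the digits 0-9 and a-f, with an optional 0x prefix. */
    return strtol(text, NULL, 16);
}
```

With it, hex_to_long("0x12c") gives 300 (upper), hex_to_long("0x14") gives 20 (step), and hex_to_long("0x20") gives 32.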
Assembly Basics
Assembly language is a low-level language that directly corresponds to machine instructions.
Key concepts include:
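Registers: small, fast storage locations inside the CPU (e.g., rax, rbp, rsp).
Stack: memory for local variables and function calls, tracked by the stack pointer (rsp) and base pointer (rbp).
Instructions: operations such as mov (move data), cmp (compare), and jmp (jump to an address).
Simple Parts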
Defining Integers
You might first notice this part:
2016af: c7 45 f0 00 00 00 00 mov dword ptr [rbp - 0x10], 0x0
2016b6: c7 45 ec 2c 01 00 00 mov dword ptr [rbp - 0x14], 0x12c
2016bd: c7 45 e8 14 00 00 00 mov dword ptr [rbp - 0x18], 0x14
2016c4: 8b 45 f0 mov eax, dword ptr [rbp - 0x10]
2016c7: 89 45 f8 mov dword ptr [rbp - 0x8], eax
This corresponds to the following C code:
lower = 0;
upper = 300;
step = 20;
fahr = lower;
rbp - 0x10 holds lower, rbp - 0x14 holds upper, rbp - 0x18 holds step, and rbp - 0x8 holds fahr.
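The instruction mov eax, dword ptr [rbp - 0x10] loads lower (0) into eax (the low 32 bits of the rax register).
rax conventionally holds a function's return value, but here eax just serves as a temporary, because x86 cannot mov directly from one memory location to another.
mov dword ptr [rbp - 0x8], eax assigns lower to fahr.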
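The relationship between eax and rax can be modeled in C with plain 64-bit integers. This is only a sketch of the register behavior, and read_eax and write_eax are hypothetical names of my own:

```c
#include <stdint.h>

/* Hypothetical model of reading eax: keep only the low 32 bits of rax. */
uint32_t read_eax(uint64_t rax)
{
    return (uint32_t)(rax & 0xFFFFFFFFu);
}

/* Hypothetical model of writing eax: in 64-bit mode, a 32-bit register
   write clears the upper 32 bits of the full register. */
uint64_t write_eax(uint64_t rax, uint32_t value)
{
    (void)rax;              /* the old contents do not survive */
    return (uint64_t)value; /* zero-extended to 64 bits */
}
```

So read_eax(0x1122334455667788) yields 0x55667788, and writing 0 to eax leaves the whole of rax at 0.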
rbp is the base pointer for the current stack frame, and rsp is the current stack pointer.
At the start of the program, you'll see something like this:
2016a0: 55 push rbp
2016a1: 48 89 e5 mov rbp, rsp
2016a4: 48 83 ec 20 sub rsp, 0x20
These instructions set up the stack frame for the main function.
push rbp, mov rbp, rsp, and sub rsp, 0x20 form the standard prologue for an x64 function.
push rbp saves the caller's base pointer, mov rbp, rsp sets the current stack pointer as the new base pointer for main's stack frame, and sub rsp, 0x20 allocates 32 bytes for local variables and alignment.
You'll see this prologue in most unoptimized functions, and it doesn't directly correspond to any line of C code.
It's what the machine does in the background.
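Why 32 bytes? The deepest local in this listing sits at rbp - 0x18, so 24 bytes are in use, and the System V AMD64 calling convention wants the stack pointer 16-byte aligned at every call. My guess is that the compiler simply rounds 24 up to the next multiple of 16. A sketch of that rounding, assuming this is how the number came about (align_up is a hypothetical helper, not something the compiler exposes):

```c
#include <stddef.h>

/* Round size up to the next multiple of alignment.
   The alignment must be a power of two for the bit trick to work. */
size_t align_up(size_t size, size_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}
```

Under that assumption, align_up(0x18, 16) gives 0x20, matching sub rsp, 0x20.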
You'll also see notations like [rbp - 0x10].
The square brackets mean "the memory at this address": take the base pointer rbp, subtract the hexadecimal offset 0x10 (16), and access the local variable stored there (in this case, lower).
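To make the [rbp - offset] notation concrete, here is a toy model of main's stack frame in C. It is only a sketch: the Frame type and the store_dword, load_dword, and init_locals names are hypothetical, and a real frame is of course just memory, not a struct.

```c
#include <stdint.h>
#include <string.h>

#define FRAME_SIZE 0x20 /* matches sub rsp, 0x20 */

/* Hypothetical model: bytes[] stands in for the 32 bytes just below rbp. */
typedef struct {
    uint8_t bytes[FRAME_SIZE];
} Frame;

/* Models: mov dword ptr [rbp - offset], value */
void store_dword(Frame *frame, size_t offset, int32_t value)
{
    /* rbp is modeled as one past the end of bytes[], so count back from there. */
    memcpy(frame->bytes + FRAME_SIZE - offset, &value, sizeof value);
}

/* Models: mov eax, dword ptr [rbp - offset] */
int32_t load_dword(const Frame *frame, size_t offset)
{
    int32_t value;
    memcpy(&value, frame->bytes + FRAME_SIZE - offset, sizeof value);
    return value;
}

/* Replays the five instructions that set lower, upper, step, and fahr. */
int32_t init_locals(Frame *frame)
{
    store_dword(frame, 0x10, 0);     /* lower = 0   */
    store_dword(frame, 0x14, 0x12c); /* upper = 300 */
    store_dword(frame, 0x18, 0x14);  /* step = 20   */
    store_dword(frame, 0x08, load_dword(frame, 0x10)); /* fahr = lower */
    return load_dword(frame, 0x08);
}
```

After init_locals runs, load_dword(&frame, 0x14) returns 300 and load_dword(&frame, 0x08) returns 0, just like the variables in the listing.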
printf Function
Next, you can easily spot this part:
2016e6: 8b 75 f8 mov esi, dword ptr [rbp - 0x8]
2016e9: 8b 55 f4 mov edx, dword ptr [rbp - 0xc]
2016ec: 48 bf d8 04 20 00 00 00 00 00 movabs rdi, 0x2004d8
2016f6: b0 00 mov al, 0x0
2016f8: e8 c3 00 00 00 call 0x2017c0 <printf@plt>
This corresponds to the printf() call inside the loop.
mov esi, dword ptr [rbp - 0x8] loads fahr into esi (printf's second argument).
mov edx, dword ptr [rbp - 0xc] loads celsius into edx (printf's third argument).
movabs rdi, 0x2004d8 loads the address of the format string "%d\t%d\n" into rdi (printf's first argument).
mov al, 0x0 sets al to 0, indicating that no floating-point arguments are passed to printf in vector registers.
This is a requirement of the System V AMD64 calling convention for variadic functions, not of the processor itself.
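The al convention applies to any variadic function, not just printf. If you want a smaller target to experiment with, here is a minimal sketch of one; sum_ints is a hypothetical function of my own:

```c
#include <stdarg.h>

/* Hypothetical variadic function: adds up `count` int arguments. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int); /* fetch the next argument as an int */
    }
    va_end(args);

    return total;
}
```

Calling sum_ints(3, 10, 20, 30) returns 60. If you disassemble such a call on a System V x64 system, you should find al being cleared before the call instruction in the same way, though the exact instruction depends on your compiler.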
Next, you'll see call 0x2017c0 <printf@plt>.
This calls printf via the Procedure Linkage Table (PLT) used in dynamically linked binaries, resolving the address of printf in libc.so at runtime.
In statically linked binaries, as we'll discuss later, the full printf implementation is included.
Let's check the address 0x2004d8.
First, press q to exit the current objdump instance.
Then, run objdump -d -Mintel -s -j .rodata ./ftoc.
In my case, the output looks like this:
./ftoc: file format elf64-x86-64
Contents of section .rodata:
2004d8 25640925 640a0000 %d.%d...
Disassembly of section .rodata:
00000000002004d8 <.rodata>:
2004d8: 25 64 09 25 64 and eax, 0x64250964
2004dd: 0a 00 or al, byte ptr [rax]
2004df: 00 <unknown>
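In particular, look at 2004d8: 25 64 09 25 64 and eax, 0x64250964.
objdump is decoding data as if it were code here, so ignore the and and or mnemonics and read only the bytes.
% = 0x25
d = 0x64
A string is stored one byte per character in address order, so the byte sequence 25 64 represents %d.
\t = 0x09
Then, 25 64 repeats, which makes sense since %d is used twice.
The next line shows 2004dd: 0a 00 or al, byte ptr [rax].
\n = 0x0a
The 0x00 that follows is the null terminator inserted by the compiler.
Little-endian byte order only shows up in the bogus immediate 0x64250964, which is the bytes 64 09 25 64 read in reverse.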
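You can confirm both observations from C: the string's bytes sit in memory in reading order, while a 4-byte immediate is assembled with the lowest byte first. A small sketch, where ftoc_format and le32 are hypothetical names of my own:

```c
#include <stdint.h>

/* The format string from the program; these are the bytes stored in .rodata. */
const char ftoc_format[] = "%d\t%d\n";

/* Hypothetical helper: combine four bytes into a 32-bit value the way the
   CPU reads a little-endian immediate (first byte is the least significant). */
uint32_t le32(const uint8_t bytes[4])
{
    return (uint32_t)bytes[0]
         | (uint32_t)bytes[1] << 8
         | (uint32_t)bytes[2] << 16
         | (uint32_t)bytes[3] << 24;
}
```

Here ftoc_format[0] is 0x25, ftoc_format[1] is 0x64, and sizeof ftoc_format is 7 (six characters plus the terminator). Feeding the bytes 64 09 25 64 to le32 produces 0x64250964, the immediate that objdump printed.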
Next, let's see where the call to printf lands.
I won't go into detail, but if you search for address 2017c0, in my case, it's near the end of the disassembly.
You'll see something like this:
00000000002017c0 <printf@plt>:
2017c0: ff 25 aa 21 00 00 jmp qword ptr [rip + 0x21aa] # 0x203970 <printf+0x203970>
2017c6: 68 02 00 00 00 push 0x2
2017cb: e9 c0 ff ff ff jmp 0x201790 <.plt>
As homework, figure out what this means.
However, this is printf for a dynamically linked binary.
In a statically linked binary, the output would look like this:
0000000000227320 <printf>:
227320: 55 push rbp
227321: 48 89 e5 mov rbp, rsp
227324: 48 81 ec d0 00 00 00 sub rsp, 0xd0
22732b: 49 89 fa mov r10, rdi
22732e: 84 c0 test al, al
227330: 74 26 je 0x227358 <printf+0x38>
227332: 0f 29 85 60 ff ff ff movaps xmmword ptr [rbp - 0xa0], xmm0
227339: 0f 29 8d 70 ff ff ff movaps xmmword ptr [rbp - 0x90], xmm1
227340: 0f 29 55 80 movaps xmmword ptr [rbp - 0x80], xmm2
227344: 0f 29 5d 90 movaps xmmword ptr [rbp - 0x70], xmm3
227348: 0f 29 65 a0 movaps xmmword ptr [rbp - 0x60], xmm4
22734c: 0f 29 6d b0 movaps xmmword ptr [rbp - 0x50], xmm5
227350: 0f 29 75 c0 movaps xmmword ptr [rbp - 0x40], xmm6
227354: 0f 29 7d d0 movaps xmmword ptr [rbp - 0x30], xmm7
227358: 48 89 b5 38 ff ff ff mov qword ptr [rbp - 0xc8], rsi
22735f: 48 89 95 40 ff ff ff mov qword ptr [rbp - 0xc0], rdx
227366: 48 89 8d 48 ff ff ff mov qword ptr [rbp - 0xb8], rcx
22736d: 4c 89 85 50 ff ff ff mov qword ptr [rbp - 0xb0], r8
227374: 4c 89 8d 58 ff ff ff mov qword ptr [rbp - 0xa8], r9
22737b: 48 8b 05 ae 3c 07 00 mov rax, qword ptr [rip + 0x73cae] # 0x29b030 <__stack_chk_guard>
227382: 48 89 45 f8 mov qword ptr [rbp - 0x8], rax
227386: 48 8d 85 30 ff ff ff lea rax, [rbp - 0xd0]
22738d: 48 89 45 f0 mov qword ptr [rbp - 0x10], rax
227391: 48 8d 45 10 lea rax, [rbp + 0x10]
227395: 48 89 45 e8 mov qword ptr [rbp - 0x18], rax
227399: 48 b8 08 00 00 00 30 00 00 00 movabs rax, 0x3000000008
2273a3: 48 89 45 e0 mov qword ptr [rbp - 0x20], rax
2273a7: 48 8b 3d 22 ea 06 00 mov rdi, qword ptr [rip + 0x6ea22] # 0x295dd0 <__stdoutp>
2273ae: 48 8d 55 e0 lea rdx, [rbp - 0x20]
2273b2: 4c 89 d6 mov rsi, r10
2273b5: e8 46 40 00 00 call 0x22b400 <vfprintf>
2273ba: 48 8b 0d 6f 3c 07 00 mov rcx, qword ptr [rip + 0x73c6f] # 0x29b030 <__stack_chk_guard>
2273c1: 48 3b 4d f8 cmp rcx, qword ptr [rbp - 0x8]
2273c5: 75 09 jne 0x2273d0 <printf+0xb0>
2273c7: 48 81 c4 d0 00 00 00 add rsp, 0xd0
2273ce: 5d pop rbp
2273cf: c3 ret
2273d0: e8 1b cc 00 00 call 0x233ff0 <__stack_chk_fail_local>
2273d5: 66 66 2e 0f 1f 84 00 00 00 00 00 nop word ptr cs:[rax + rax]
As additional homework, investigate the printf@plt (dynamic) and printf (static) functions.
In static linking, the printf function and all its dependencies are copied into the binary, but in dynamic linking, the binary only refers to printf in a libc.so file somewhere on the system.
In that case, you can check with objdump -d -Mintel /lib/libc.so.7 | less (the filename or path may vary by OS).
You'll see something like this:
00000000001cd370 <printf@plt>:
1cd370: ff 25 82 1d 01 00 jmp qword ptr [rip + 0x11d82] # 0x1df0f8
1cd376: 68 93 01 00 00 push 0x193
1cd37b: e9 b0 e6 ff ff jmp 0x1cba30 <.plt>
Unfortunately, this is only libc's own PLT entry for printf, not the full definition.
However, you can dump the entire library with objdump -d /lib/libc.so.7 > libc_disasm.txt.
Then, run less libc_disasm.txt, press /, and search for <printf>.
You'll see the full definition, which looks like this (in AT&T syntax, because -Mintel was omitted):
000000000011b220 <printf>:
11b220: 55 pushq %rbp
11b221: 48 89 e5 movq %rsp, %rbp
11b224: 53 pushq %rbx
11b225: 48 81 ec d8 00 00 00 subq $0xd8, %rsp
11b22c: 49 89 fa movq %rdi, %r10
11b22f: 84 c0 testb %al, %al
11b231: 74 29 je 0x11b25c <printf+0x3c>
11b233: 0f 29 85 50 ff ff ff movaps %xmm0, -0xb0(%rbp)
11b23a: 0f 29 8d 60 ff ff ff movaps %xmm1, -0xa0(%rbp)
11b241: 0f 29 95 70 ff ff ff movaps %xmm2, -0x90(%rbp)
11b248: 0f 29 5d 80 movaps %xmm3, -0x80(%rbp)
11b24c: 0f 29 65 90 movaps %xmm4, -0x70(%rbp)
11b250: 0f 29 6d a0 movaps %xmm5, -0x60(%rbp)
11b254: 0f 29 75 b0 movaps %xmm6, -0x50(%rbp)
11b258: 0f 29 7d c0 movaps %xmm7, -0x40(%rbp)
11b25c: 48 89 b5 28 ff ff ff movq %rsi, -0xd8(%rbp)
11b263: 48 89 95 30 ff ff ff movq %rdx, -0xd0(%rbp)
11b26a: 48 89 8d 38 ff ff ff movq %rcx, -0xc8(%rbp)
11b271: 4c 89 85 40 ff ff ff movq %r8, -0xc0(%rbp)
11b278: 4c 89 8d 48 ff ff ff movq %r9, -0xb8(%rbp)
11b27f: 48 8b 1d 3a dc 0b 00 movq 0xbdc3a(%rip), %rbx # 0x1d8ec0
11b286: 48 8b 03 movq (%rbx), %rax
11b289: 48 89 45 f0 movq %rax, -0x10(%rbp)
11b28d: 48 8d 85 20 ff ff ff leaq -0xe0(%rbp), %rax
11b294: 48 89 45 e0 movq %rax, -0x20(%rbp)
11b298: 48 8d 45 10 leaq 0x10(%rbp), %rax
11b29c: 48 89 45 d8 movq %rax, -0x28(%rbp)
11b2a0: 48 b8 08 00 00 00 30 00 00 00 movabsq $0x3000000008, %rax # imm = 0x3000000008
11b2aa: 48 89 45 d0 movq %rax, -0x30(%rbp)
11b2ae: 48 8b 05 1b dd 0b 00 movq 0xbdd1b(%rip), %rax # 0x1d8fd0
11b2b5: 48 8b 38 movq (%rax), %rdi
11b2b8: 48 8d 55 d0 leaq -0x30(%rbp), %rdx
11b2bc: 4c 89 d6 movq %r10, %rsi
11b2bf: e8 cc 11 0b 00 callq 0x1cc490 <vfprintf@plt>
11b2c4: 48 8b 0b movq (%rbx), %rcx
11b2c7: 48 3b 4d f0 cmpq -0x10(%rbp), %rcx
11b2cb: 75 0a jne 0x11b2d7 <printf+0xb7>
11b2cd: 48 81 c4 d8 00 00 00 addq $0xd8, %rsp
11b2d4: 5b popq %rbx
11b2d5: 5d popq %rbp
11b2d6: c3 retq
11b2d7: e8 a4 07 0b 00 callq 0x1cba80 <__stack_chk_fail@plt>
11b2dc: 0f 1f 40 00 nopl (%rax)
Simple, right?
Now, let's move on to the trickier parts!
Trickier Parts
The while loop condition is constructed like this:
2016ca: 8b 45 f8 mov eax, dword ptr [rbp - 0x8]
2016cd: 3b 45 ec cmp eax, dword ptr [rbp - 0x14]
2016d0: 7f 36 jg 0x201708 <main+0x68>
This checks the while loop condition, i.e., whether fahr is less than or equal to upper.
mov eax, dword ptr [rbp - 0x8] loads fahr into eax.
cmp eax, dword ptr [rbp - 0x14] compares fahr with upper (stored at rbp - 0x14) and records the result in the CPU flags.
Together with the jump that follows, these implement fahr <= upper.
jg 0x201708 jumps to address 201708 (the end of the loop) if fahr is greater than upper.
This is the negation of fahr <= upper, so the loop continues if fahr is less than or equal to upper.
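The machine code for jg is just 7f 36: an opcode byte and a one-byte signed displacement, counted from the address of the next instruction. Here is a sketch of how the target address falls out of those bytes; short_jump_target is a hypothetical helper name of my own:

```c
#include <stdint.h>

/* Hypothetical helper: target of a 2-byte short jump (opcode + rel8).
   The displacement is signed and relative to the next instruction. */
uint64_t short_jump_target(uint64_t address, uint8_t rel8)
{
    /* Interpret the byte as a signed 8-bit value without relying on casts. */
    int64_t displacement = rel8 < 0x80 ? (int64_t)rel8 : (int64_t)rel8 - 0x100;

    /* +2 skips the opcode byte and the displacement byte itself. */
    return address + 2 + (uint64_t)displacement;
}
```

For the jg at 2016d0 with displacement 0x36, this gives 0x2016d2 + 0x36 = 0x201708. The same arithmetic works backwards for the eb c2 at 201706 in the full listing: 0xc2 is -62 as a signed byte, so the target is 0x201708 - 62 = 0x2016ca.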
Inside the loop, you'll see code like this:
2016d2: 8b 45 f8 mov eax, dword ptr [rbp - 0x8]
2016d5: 83 e8 20 sub eax, 0x20
2016d8: 6b c0 05 imul eax, eax, 0x5
2016db: b9 09 00 00 00 mov ecx, 0x9
2016e0: 99 cdq
2016e1: f7 f9 idiv ecx
2016e3: 89 45 f4 mov dword ptr [rbp - 0xc], eax
This calculates celsius = 5 * (fahr - 32) / 9.
mov eax, dword ptr [rbp - 0x8] loads fahr into eax.
sub eax, 0x20 subtracts 32 (0x20) from eax, corresponding to fahr - 32 in C.
imul eax, eax, 0x5 multiplies eax by 5, corresponding to 5 * (fahr - 32) in C.
mov ecx, 0x9 loads 9 into ecx (the divisor).
cdq sign-extends eax into edx:eax to prepare for the idiv instruction.
idiv performs signed division on the 64-bit value in edx:eax, so the sign extension is what makes negative values of 5 * (fahr - 32) divide correctly.
idiv ecx divides edx:eax by ecx (9), storing the quotient in eax and the remainder in edx.
mov dword ptr [rbp - 0xc], eax stores the result in celsius (rbp - 0xc).
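The cdq and idiv pair can be modeled in C with a 64-bit intermediate. This is a sketch only: DivResult and cdq_idiv are hypothetical names, and the model does not reproduce the divide-error exception that the real instruction raises for a zero divisor or a quotient that doesn't fit in 32 bits.

```c
#include <stdint.h>

typedef struct {
    int32_t quotient;  /* what idiv leaves in eax */
    int32_t remainder; /* what idiv leaves in edx */
} DivResult;

/* Hypothetical model of cdq followed by idiv ecx. */
DivResult cdq_idiv(int32_t eax, int32_t ecx)
{
    int64_t dividend = (int64_t)eax; /* cdq: sign-extend eax into edx:eax */
    DivResult result;

    result.quotient = (int32_t)(dividend / ecx);  /* truncates toward zero */
    result.remainder = (int32_t)(dividend % ecx); /* takes the dividend's sign */
    return result;
}
```

For the first table row (fahr = 0), eax holds -160 before the division, and cdq_idiv(-160, 9) gives a quotient of -17 with a remainder of -7.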
In other words, one line of C code requires seven assembly instructions.
Next, the following calculation is performed:
2016fd: 8b 45 f8 mov eax, dword ptr [rbp - 0x8]
201700: 03 45 e8 add eax, dword ptr [rbp - 0x18]
201703: 89 45 f8 mov dword ptr [rbp - 0x8], eax
With all the explanations so far, you should understand that this means fahr = fahr + step;.
Then, you'll see this:
201706: eb c2 jmp 0x2016ca <main+0x2a>
This jumps back to the start of the while loop (address 2016ca) to check if fahr is still less than or equal to upper.
If the answer is false, the loop ends; if true, it continues.
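If the jumps are hard to picture, it can help to write the loop in C the way the CPU sees it, with goto in place of while. This is a sketch under my own naming: count_rows is a hypothetical function, and a counter stands in for the loop body and the printf call.

```c
/* Hypothetical rewrite of the loop using goto, one label per jump target.
   Returns how many table rows the loop would print.
   Like the original, it never terminates if step is not positive. */
int count_rows(int lower, int upper, int step)
{
    int fahr = lower;
    int rows = 0;

loop_check:                 /* 2016ca: load fahr, cmp with upper */
    if (fahr > upper)       /* 2016d0: jg leaves the loop */
        goto loop_end;

    rows++;                 /* stands in for the formula and printf */
    fahr = fahr + step;     /* 2016fd-201703: fahr = fahr + step */
    goto loop_check;        /* 201706: jmp back to the comparison */

loop_end:                   /* 201708: code after the loop */
    return rows;
}
```

count_rows(0, 300, 20) returns 16, one for each line of the temperature table.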
Finally, there's this:
201708: 8b 45 fc mov eax, dword ptr [rbp - 0x4]
20170b: 48 83 c4 20 add rsp, 0x20
20170f: 5d pop rbp
201710: c3 ret
This cleans up the stack frame and returns from main.
mov eax, dword ptr [rbp - 0x4] loads the value at rbp - 0x4 into eax.
This is the return value for main, implicitly initialized at address 2016a8 (mov dword ptr [rbp - 0x4], 0x0).
In C (since C99), if main reaches its closing brace without a return statement, it returns 0.
Note that rbp - 0x4 is a hidden slot the compiler reserves for the return value; it is not fahr, which lives at rbp - 0x8.
The caller expects the return value in eax, so the final mov eax, dword ptr [rbp - 0x4] is what actually hands the 0 back.
add rsp, 0x20 frees the 32 bytes of stack space.
pop rbp restores the caller's base pointer.
ret returns to the caller.
Still pretty simple, right?
Explicit vs. Implicit Return Value
In this example, we omitted return 0;, which C (since C99) and C++ allow only for main.
However, adding return 0; to the C code slightly changes the assembly output.
00000000002016a0 <main>:
2016a0: 55 push rbp
2016a1: 48 89 e5 mov rbp, rsp
2016a4: 48 83 ec 20 sub rsp, 0x20
2016a8: c7 45 fc 00 00 00 00 mov dword ptr [rbp - 0x4], 0x0
2016af: c7 45 f0 00 00 00 00 mov dword ptr [rbp - 0x10], 0x0
2016b6: c7 45 ec 2c 01 00 00 mov dword ptr [rbp - 0x14], 0x12c
2016bd: c7 45 e8 14 00 00 00 mov dword ptr [rbp - 0x18], 0x14
2016c4: 8b 45 f0 mov eax, dword ptr [rbp - 0x10]
2016c7: 89 45 f8 mov dword ptr [rbp - 0x8], eax
2016ca: 8b 45 f8 mov eax, dword ptr [rbp - 0x8]
2016cd: 3b 45 ec cmp eax, dword ptr [rbp - 0x14]
2016d0: 7f 36 jg 0x201708 <main+0x68>
2016d2: 8b 45 f8 mov eax, dword ptr [rbp - 0x8]
2016d5: 83 e8 20 sub eax, 0x20
2016d8: 6b c0 05 imul eax, eax, 0x5
2016db: b9 09 00 00 00 mov ecx, 0x9
2016e0: 99 cdq
2016e1: f7 f9 idiv ecx
2016e3: 89 45 f4 mov dword ptr [rbp - 0xc], eax
2016e6: 8b 75 f8 mov esi, dword ptr [rbp - 0x8]
2016e9: 8b 55 f4 mov edx, dword ptr [rbp - 0xc]
2016ec: 48 bf d8 04 20 00 00 00 00 00 movabs rdi, 0x2004d8
2016f6: b0 00 mov al, 0x0
2016f8: e8 b3 00 00 00 call 0x2017b0 <printf@plt>
2016fd: 8b 45 f8 mov eax, dword ptr [rbp - 0x8]
201700: 03 45 e8 add eax, dword ptr [rbp - 0x18]
201703: 89 45 f8 mov dword ptr [rbp - 0x8], eax
201706: eb c2 jmp 0x2016ca <main+0x2a>
201708: 31 c0 xor eax, eax
20170a: 48 83 c4 20 add rsp, 0x20
20170e: 5d pop rbp
20170f: c3 ret
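First, the int3 padding is gone: the function is now one byte shorter, so ret sits at 20170f and nothing needs to be filled before the next 16-byte boundary at 201710.
The line that changed was:
201708: 8b 45 fc mov eax, dword ptr [rbp - 0x4]
Now you'll see this instead:
201708: 31 c0 xor eax, eax
In other words, instead of loading the return value from the stack, the code sets eax to 0 directly, since XORing a register with itself always yields 0.
This avoids a memory access and saves one byte (two bytes instead of three).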
Conclusion
Overall, it's not that difficult.
It might seem intimidating at first, but once you understand what's happening, assembly quickly becomes easier to read.
Once you understand assembly, all software becomes open-source.
That's all.