=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=====
=====                                                       =====
==== ASM TUTORIAL FOR LINUX n' ELF FILE FORMAT by LiTlLe VxW ====
=====                                                       =====
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=====

                 A S M   TUTORIAL   FOR LINUX
                       by LiTlLe VxW
                      WRITTEN FOR 29A

   .-,       ,-.
      \ SUMMARY /
      '         '
      '.__.---.__.'

   * Introduction
   * asm syntax
   * HELLO WORLD in asm
   * HELLO WORLD in pure asm
   * So what is a syscall?
   * What is a file descriptor?
   * What is an inode?
   * list of syscall
   * Important remark
   * syscalls descriptions
   * Important STRUCTURES
       - dirent
       - utsname
   * Important FLAGS DESCRIPTIONS
       - file access bits
       - file permissions flags
   * HOW to use some syscall
       - print a text on screen
       - get info about kernel
       - open a file
       - open a directory
       - close a file descriptor
       - create a new file
       - move the file pointer
       - write bytes in a (opened) file
       - read bytes in a (opened) file
       - find files with getdents
       - find files with readdir
       - read the keyboard (read)
   * ERROR CODE
   * Last DWORDS

   .-,            ,-.
      \ Introduction /
      '              '
      '.____.----.____.'

Programming in asm under linux is not the best choice for a 'normal'
programmer, because there is little good documentation and few tutorials,
and because the native language of LINUX is C. I found only a few asm
tutorials that use asm syntax instead of stupid C syntax to describe
structures, data types...
For most people, linux means 'written in C!!!', but how is the C library
coded? in C? ;-)
Every operating system has assembly as its first language! and UNIX-like
platforms too !!!

This article is a good introduction to asm programming on linux and
UNIX-like platforms. With it you have all the information needed to find,
open, read and write files ;-)
You should look for other documentation about syscalls, structures,
advanced asm programming,...
I use the nasm syntax in this tutorial because I also use it in win32.
If you have correctly installed LINUX, you will find all you need for asm
programming on your hard drive.

This tutorial has been tested on RED HAT 7.3, kernel 2.4.18-3

what you need:
   - your eyes to read this tutorial
   - bad english knowledge (english used in vx tutorials)
   - doc about syscalls
   - a stupid text editor
   - nasm (on the installation CDrom of your linux distribution)
   - ld (on the installation CDrom of your linux distribution)
   - basic asm knowledge (intel syntax)
     (if you don't know what mov eax,0000029ah means then go away!!!)
   - doc, source code, .h files (from your linux hard drive)

   .-,          ,-.
      \ asm syntax /
      '            '
      '.____.--.____.'

Intel Syntax and AT&T Syntax:

expl:
        Intel Syntax           |  AT&T Syntax
        -----------------------+----------------------
        mov eax,1              |  movl $1,%eax
        mov ebx,0ffh           |  movl $0xff,%ebx
        int 80h                |  int $0x80
        mov eax,[ebx]          |  movl (%ebx),%eax
        mov eax,[ebx+3]        |  movl 3(%ebx),%eax
        lea eax,[ebx+ecx]      |  leal (%ebx,%ecx),%eax

I have chosen intel syntax to write this tutorial (I don't know AT&T
syntax ;-) and NASM as assembler of course!
There are some progs to convert asm files from AT&T syntax to intel syntax.

Let's start with the famous HELLO WORLD program:

   .-,                  ,-.
      \ HELLO WORLD in asm /
      '                    '
      '.______.-----._______.'

Here it is not 'pure asm'.
It is something between C and ASM. I have to show you this because you can
find some code like this:

;-----------------------------------------------------------+
; To compile this file:
;
;     nasm -f elf hello.asm
;     gcc hello.o -o hello
;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
BITS 32

EXTERN puts

SECTION .text

GLOBAL main

main:
        push dword hello        ; parameter for puts
        call puts               ; C library function
        add  esp, 4             ; clean the stack
        ret

SECTION .data

hello   db "Hello world !", 0

;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
; REMARK: * gcc uses ld to create the executable file
;         * this file is 13470 bytes long when compiled!!!
;-----------------------------------------------------------+

   .-,                       ,-.
      \ HELLO WORLD in pure asm /
      '                         '
      '.________.-------.________.'

here is a HELLO WORLD code in pure asm, using int 80h:

;--------------------------------------------------------------------
; To compile this file:
;
;     nasm -f elf hello.asm
;     ld -o hello hello.o
;- - - - - - - - - - - - -
BITS 32                 ; <--- tell nasm that it is in 32-bit mode

section .text           ; <--- the code section starts here

global _start           ; <---.
_start:                 ; <----\__ entry point, exported by nasm for ld

        mov eax,4       ; syscall #4, 'write'
        mov ebx,1       ; file descriptor (fd) of the screen
        mov ecx,message ; offset of what to write
        mov edx,23      ; number of bytes to write
        int 80h         ; interrupt 80h

        mov eax,1       ; syscall #1, 'exit'
        mov ebx,0       ; exit code
        int 80h         ; interrupt 80h

section .data           ; <--- the data section starts here

message db "hello LINUX world !!!",13,10,0

section .bss

;- - - - - - - - - - - - - - - - - - - - - - - - -
; REMARK: * .text is the code (program) section
;         * .data is the initialized data
;         * .bss is the uninitialized data
;         * with NASM, when you write mov eax,10
;           it means 10 in decimal, not in hexa
;         * The screen is considered as a file,
;           the fd of the screen is 1
;         * this file is 889 bytes long when
;           compiled
;-------------------------------------------------------------------;

create a file hello.asm, write the code, save it.
Next you should type in the console:

   nasm -f elf hello.asm    <---.
   ld -o hello hello.o      <----'--- to compile (read the doc of NASM!)
   ./hello                  <-------- to run the hello program
                                      (don't forget './')

If you are a beginner in the linux world:

   cd Desktop    <--- go in the directory called 'Desktop' (case sensitive)
   cd ..         <--- go to the previous directory (parent)
   ls            <--- print on screen the files of the current directory
   ./RUNit       <--- ./ before the name of the prog to launch

   .-,                     ,-.
      \ So what is a syscall? /
      '                       '
      '._________.----._________.'
in DOS asm you had to put the number of the function in ah, put the
parameters in the other registers, and call the right interrupt
(int 21h, int 22h, int 23h,..., int 10h for BIOS calls)

expl:
        mov ah,4ch              ; function 4ch = 'terminate current program'
        mov al,00h              ; exit code
        int 21h                 ; int 21h (dos interrupt)

in win32 you had to push the parameters on the stack before calling APIs

expl:
        push dword 0            ; exit code
        call ExitProcess        ; function to 'terminate current program'

with LINUX you have to put the number of the syscall in eax, put the
parameters in ebx,ecx,edx,... and call int 80h (only int 80h is used).

expl:
        mov eax,1               ; syscall number 1 = 'terminate current program'
        mov ebx,0               ; exit code
        int 80h                 ; interrupt 80h

a syscall is a call to the kernel to do what we ask it for.
Not all versions of linux have exactly the same syscalls, but the most used
syscalls are the same in all linux versions. The list of the syscalls
available in the installed linux system can be found at:

        /usr/man/man2/unistd.h   (or try to find a sub-directory called 'man')
        /usr/include/sys/syscall.h
        /usr/include/asm/unistd.h

to make a syscall you should:
   - put the number of the syscall in eax
   - put the parameters in ebx,ecx,edx,esi,edi(,ebp)
     (this order is used in the syscall descriptions)
   - int 80h

most of the syscalls return a value in eax, it can be an error code or
something else.

WARNING:
   in kernel 1.0 it seems to be:
        eax=0           ---> success
        eax=0ffffffffh  ---> error
   in kernel 2.4 it seems to be:
        0fffff000h < eax <= 0ffffffffh  ---> error code
        (that is -4095 to -1 when eax is read as a signed value)

   .-,                          ,-.
      \ What is a file descriptor? /
      '                            '
      '.___________.----.___________.'

a file descriptor (fd) is a dword value assigned to a file when it is opened
(open a file ---> get an fd).

There are always 3 default "files" open:
        fd=0 ---> stdin  (the keyboard)
        fd=1 ---> stdout (the screen)
        fd=2 ---> stderr (error messages output to the screen)

   .-,                 ,-.
      \ What is an inode? /
      '                   '
      '._______.---._______.'
The Linux operating system (and the unix-like ones) doesn't work with the
ascii name of the files (to find them, write them, read them,...).
In fact, each file has its own inode number, and the system deals with the
inode number (and not with the name of the file like on DOS/win platforms).
A directory is just a file which contains information about other files
(filename, inode number).

   .-,               ,-.
      \ List of syscall /
      '                 '
      '.______.---.______.'

here are some useful syscalls:

+===========+==========+================================================+
| sys call  | name     | action                                         |
+===========+==========+================================================+
|     1     | exit     | terminate current process                      |
|     3     | read     | read bytes from file descriptor into buffer    |
|     4     | write    | write bytes from buffer to file descriptor     |
|     5     | open     | open, create or truncate a file/device         |
|     6     | close    | close a file                                   |
|    11     | execve   | execute a program                              |
|    12     | chdir    | change current working directory               |
|    14     | mknod    | create a filesystem node (file, device,        |
|           |          | special file, named pipe)                      |
|    15     | chmod    | change file permissions (attributes)           |
|    19     | lseek    | change file pointer of file                    |
|    38     | rename   | move/rename a file                             |
|    39     | mkdir    | create a directory                             |
|    40     | rmdir    | delete an empty directory                      |
|    89     | readdir  | read a directory                               |
|   122     | uname    | get name of & information about current kernel |
|   141     | getdents | get directory entries                          |
+-----------+----------+------------------------------------------------+

DOC and files you should find on your hard drive:

        syscalls list   : unistd.h
                          syscall.h
        data structures : linux/types.h
                          asm/posix_types.h
                          linux/kernel.h
        error numbers   : include/asm/errno.h

According to asmutils-0.14 there are 46 system calls common to
LINUX (2.2/4), FREEBSD, NETBSD, OPENBSD, BEOS, ATHEOS

   .-,                ,-.
      \ Important remark /
      '                  '
      '.______.-----._____.'

Remark:
   * Everything in the linux world is considered as a file: screen,
     directory, keyboard, devices, text files,...
   * In some versions of linux, the readdir syscall may have been replaced
     by the getdents syscall
   * FreeBSD doesn't use int 80h in the same way as LINUX: the parameters
     are pushed on the stack instead of being put in registers
     (that's what I've read but I've never tested it!, perhaps in
     another tutorial):

expl:
        ; to open a file (syscall 5):
        ;
open:
        push dword mode
        push dword flags
        push dword path
        mov  eax,5
        push eax                ; one more dword on the stack
        int  80h
        add  esp, byte 16       ; clean the 4 dwords pushed
        ret

   .-,                     ,-.
      \ syscalls descriptions /
      '                       '
      '.________.-----.________.'
____________________________________________
   1 |
  ___| exit

   input : eax = 1
           ebx = exit code
   output: none (the syscall does not return)
____________________________________________
   3 |
  ___| read

   input : eax = 3
           ebx = file descriptor
           ecx = pointer to buffer
           edx = number of bytes to receive
   output: eax = number of bytes received
           (the file pointer is advanced by the number of bytes read)
   error : EAGAIN, EBADF, EFAULT, EINTR, EINVAL, EIO, EISDIR
____________________________________________
   4 |
  ___| write

   input : eax = 4
           ebx = file descriptor (fd)
           ecx = pointer to buffer to write
           edx = number of bytes to write
   output: eax = number of bytes written
   error : EAGAIN, EBADF, EFAULT, EINTR, EINVAL, EIO, ENOSPC, EPIPE
____________________________________________
   5 |
  ___| open

   input : eax = 5
           ebx = pointer to asciz abs or rel pathname
           ecx = file access bits
           edx = file permissions, mode (only used when a file is created)
   output: eax = fd (file descriptor)
   error : EACCES, EEXIST, EFAULT, EISDIR, ELOOP, EMFILE, ENAMETOOLONG,
           ENFILE, ENOENT, ENODEV, ENOTDIR, ENOMEM, ENOSPC, ENXIO, EROFS,
           ETXTBSY
____________________________________________
   6 |
  ___| close

   input : eax = 6
           ebx = file descriptor of the file to close
   output: eax = 0
   error : EBADF
____________________________________________
  11 |
  ___| execve

   input : eax = 11
           ebx = pointer to zero terminated string of program path and name
           ecx = pointer to zero terminated list of pointers to zero
                 terminated program argument strings
           edx = pointer to zero terminated list of pointers to zero
                 terminated environment strings
   output: none (on success the syscall does not return)
   error : E2BIG, EACCES, EINVAL, EIO, EISDIR, ELIBBAD, ELOOP, ENFILE,
           ENOEXEC, ENOENT, ENOMEM, ENOTDIR, EFAULT, ENAMETOOLONG, EPERM,
           ETXTBSY
____________________________________________
  12 |
  ___| chdir

   input : eax = 12
           ebx = ptr to asciz abs or rel directory name
   output: eax = 0
   error : ENAMETOOLONG, ENOENT, ENOMEM, ENOTDIR
____________________________________________
  14 |
  ___| mknod

   input : eax = 14
           ebx = ptr to asciz abs or rel file name
           ecx = file type and file permissions flags
           edx = specifies the major and minor numbers of the newly
                 created device special file; otherwise it is ignored.
   output: eax = 0
   error : EACCES, EEXIST, EFAULT, EINVAL, ELOOP, ENAMETOOLONG, ENOENT,
           ENOMEM, ENOSPC, ENOTDIR, EPERM, EROFS
____________________________________________
  15 |
  ___| chmod

   input : eax = 15
           ebx = pointer to asciz abs or rel pathname
           ecx = file permissions flags
   output: eax = 0
   error : EACCES, EBADF, EFAULT, EIO, ELOOP, ENAMETOOLONG, ENOENT,
           ENOMEM, ENOTDIR, EPERM, EROFS
____________________________________________
  19 |
  ___| lseek

   input : eax = 19
           ebx = fd
           ecx = offset (number of bytes to move)
           edx = how to move (one of these flags):
                 SEEK_SET 0   distance from the beginning of file
                 SEEK_CUR 1   distance from the current position
                 SEEK_END 2   distance from the end of file
   output: eax = offset of the file pointer from the beginning of file
   error : EBADF, EINVAL, ESPIPE
____________________________________________
  38 |
  ___| rename

   input : eax = 38
           ebx = pointer to old file name
           ecx = pointer to new file name
   output: eax = 0
   error : EBUSY, EEXIST, EISDIR, ENOTEMPTY, EXDEV (and other f.s. errors)
____________________________________________
  39 |
  ___| mkdir

   input : eax = 39
           ebx = pointer to the name of the directory to create
           ecx = file permissions flags
   output: eax = 0
   error : EACCES, EEXIST, EFAULT, ELOOP, ENAMETOOLONG, ENOENT, ENOMEM,
           ENOSPC, ENOTDIR, EPERM, EROFS
____________________________________________
  40 |
  ___| rmdir

   input : eax = 40
           ebx = pointer to the name of the directory to delete
   output: eax = 0
   error : EACCES, EBUSY, EFAULT, ELOOP, ENAMETOOLONG, ENOENT, ENOMEM,
           ENOTDIR, ENOTEMPTY, EPERM, EROFS
____________________________________________
  89 |
  ___| readdir

   input : eax = 89
           ebx = fd
           ecx = pointer to a dirent structure
           edx = count (ignored)
   output: eax = 1 if an entry has been read, 0 at the end of the directory
   error : EBADF, EFAULT, EINVAL, ENOENT, ENOTDIR
____________________________________________
 122 |
  ___| uname

   input : eax = 122
           ebx = pointer to utsname structure
   output: eax = 0
   error : EFAULT
____________________________________________
 141 |
  ___| getdents

   input : eax = 141
           ebx = fd
           ecx = pointer to a buffer which receives the 'dirent' structures
           edx = size of the buffer
   output: eax = number of bytes read (0 at the end of the directory)
   error : EBADF, EFAULT, EINVAL, ENOENT, ENOTDIR
____________________________________________

   .-,                    ,-.
      \ Important STRUCTURES /
      '                      '
      '.________.----.________.'

        **************
        *** dirent ***
        **************

there are two kinds of dirent structure (dirent and dirent64), why???
good question! I'm not sure of that, but if you take a look at the ELF
file format, you will find a byte value (EI_CLASS) at offset 4 which
determines the types of data structures of the ELF file:
        EI_CLASS = 1 ---> 32-bit structures
        EI_CLASS = 2 ---> 64-bit structures

nasm syntax of the dirent structure:

dirent:                 ; maximum size of this structure: 266 bytes
   d_ino     resd 1     ; inode number
   d_off     resd 1     ; dir-file offset (offset from beginning of
                        ; directory file to concerning entry.)
   d_reclen  resw 1     ; length of record
   d_name    resb 256   ; file name, null terminated

C syntax of the dirent64 structure (not from me, just a 'CTRL+C, CTRL+V')
(64 means two dwords, short means a word)

Kernel 2.4.xx:
        struct dirent64 {
                __u64           d_ino;          (resd 2)
                __s64           d_off;          (resd 2)
                unsigned short  d_reclen;       (resw 1)
                unsigned char   d_type;         (resb 1)
                char            d_name[256];
        };

Kernel 2.4.18 (fs/readdir.c):
        struct linux_dirent64 {
                u64             d_ino;
                s64             d_off;
                unsigned short  d_reclen;
                unsigned char   d_type;
                char            d_name[0];
        };

        ***************
        *** utsname ***
        ***************

new_utsname:            | old_utsname:        | oldold_utsname:
                        |                     |
   sysname    resb 65   |  sysname  resb 65   |  sysname  resb 9
   nodename   resb 65   |  nodename resb 65   |  nodename resb 9
   release    resb 65   |  release  resb 65   |  release  resb 9
   version    resb 65   |  version  resb 65   |  version  resb 9
   machine    resb 65   |  machine  resb 65   |  machine  resb 9
   domainname resb 65   |                     |

with kernel 2.4, new_utsname seems to be used

   .-,                            ,-.
      \ Important FLAGS DESCRIPTIONS /
      '.____________.--------.___________.'

Use flags as in win32 or DOS programming: you can add (OR) the flags you
want. Be careful: the values are in octal (as in the C headers), not in
hexadecimal. With nasm, write an octal value with the 'o' suffix
(expl: 644o).
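
To make the 'add the flags you want' idea concrete, here is a small sketch
(not a reference implementation) which ORs three file access bits and four
permission flags to create a file. The values are the octal ones listed in
this section; the names fname, msg, msglen and the file name
'flags_test.txt' are only assumptions for the example.

```nasm
; sketch only: build it like the other examples of this tutorial
;     nasm -f elf flags.asm
;     ld -o flags flags.o
BITS 32

; octal values from the flag tables of this section ('o' suffix = octal)
O_WRONLY equ 1o
O_CREAT  equ 100o
O_TRUNC  equ 1000o
S_IRUSR  equ 400o
S_IWUSR  equ 200o
S_IRGRP  equ 40o
S_IROTH  equ 4o

section .text
global _start
_start:
        mov eax,5                               ; syscall 'open'
        mov ebx,fname                           ; (hypothetical) file name
        mov ecx,O_WRONLY|O_CREAT|O_TRUNC        ; = 1101o = 241h
        mov edx,S_IRUSR|S_IWUSR|S_IRGRP|S_IROTH ; = 644o  = 1a4h
        int 80h
        cmp eax,0fffff000h                      ; error code returned?
        ja  error                               ; (unsigned compare)

        mov ebx,eax                             ; fd returned by 'open'
        mov eax,4                               ; syscall 'write'
        mov ecx,msg
        mov edx,msglen
        int 80h

        mov eax,6                               ; syscall 'close'
        int 80h                                 ; ebx still holds the fd

        xor ebx,ebx                             ; exit code 0
        jmp quit
error:
        mov ebx,1                               ; exit code 1
quit:
        mov eax,1                               ; syscall 'exit'
        int 80h

section .data
fname   db "flags_test.txt",0
msg     db "created with OR-ed flags",10
msglen  equ $-msg
```

After running it, 'ls -l flags_test.txt' should show the permissions
rw-r--r-- (possibly reduced by the umask of your shell).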
        file access bits
        --------------------
        O_ACCMODE       0000003
        O_RDONLY        0000000   <--- read only
        O_WRONLY        0000001   <--- write only
        O_RDWR          0000002   <--- read and write
        O_CREAT         0000100
        O_EXCL          0000200
        O_NOCTTY        0000400
        O_TRUNC         0001000
        O_APPEND        0002000
        O_NONBLOCK      0004000
        O_NDELAY        O_NONBLOCK
        O_SYNC          0010000   specific to ext2 fs and block devices
        FASYNC          0020000   fcntl, for BSD compatibility
        O_DIRECT        0040000   direct disk access hint - currently ignored
        O_LARGEFILE     0100000
        O_DIRECTORY     0200000   must be a directory
        O_NOFOLLOW      0400000   don't follow links

        file permissions flags
        ----------------------
        S_ISUID         04000     set user ID on execution
        S_ISGID         02000     set group ID on execution
        S_ISVTX         01000     sticky bit
        S_IRUSR         00400     read by owner (S_IREAD)
        S_IWUSR         00200     write by owner (S_IWRITE)
        S_IXUSR         00100     execute/search by owner (S_IEXEC)
        S_IRGRP         00040     read by group
        S_IWGRP         00020     write by group
        S_IXGRP         00010     execute/search by group
        S_IROTH         00004     read by others
        S_IWOTH         00002     write by others
        S_IXOTH         00001     execute/search by others
____________________________________________

   .-,                       ,-.
      \ HOW to use some syscall /
      '.__________.-------._________.'

.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-.
'                                                    '
' print a text on screen : syscall number 4 (write)  '
'._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.'

The screen is considered as a file, its fd is 1

        mov eax,4               ; the number of the syscall 'write'
        mov ebx,1               ; file descriptor (fd) of the screen
        mov ecx,text            ; offset of buffer to write
        mov edx,7               ; number of bytes to write
        int 80h                 ; interrupt 80h

text    db "HELLO",13,10,0

.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-.
'                                                    '
' get info about kernel : syscall number 122 (uname) '
'._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.'
mov eax,122 ; the number of the syscall 'uname' mov ebx,utsname ; pointer to utsname int 80h ; interrupt 80h utsname: sysname resb 65 nodename resb 65 release resb 65 version resb 65 machine resb 65 domainname resb 65 expl of the utsname structure return by syscall uname: OFFSET |00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F| ASCII view ---------+-----------------------------------------------+---------------- 0000:0000 4c 69 6e 75 78 00 00 00 00 00 00 00 00 00 00 00 Linux........... 0000:0010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0000:0020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0000:0030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0000:0040 00 6c 6f 63 61 6c 68 6f 73 74 2e 6c 6f 63 61 6c .localhost.local 0000:0050 64 6f 6d 61 69 6e 00 00 00 00 00 00 00 00 00 00 domain.......... 0000:0060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0000:0070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0000:0080 00 00 32 2e 34 2e 31 38 2d 33 00 00 00 00 00 00 ..2.4.18-3...... 0000:0090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0000:00a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0000:00b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0000:00c0 00 00 00 23 31 20 54 68 75 20 41 70 72 20 31 38 ...#1 Thu Apr 18 0000:00d0 20 30 37 3a 33 32 3a 34 31 20 45 44 54 20 32 30 07:32:41 EDT 20 0000:00e0 30 32 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02.............. 0000:00f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0000:0100 00 00 00 00 69 36 38 36 00 00 00 00 00 00 00 00 ....i686........ 0000:0110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0000:0120 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0000:0130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0000:0140 00 00 00 00 00 28 6e 6f 6e 65 29 00 00 00 00 00 .....(none)..... 
0000:0150 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0000:0160 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0000:0170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0000:0180 00 00 00 00 00 00                               ......

.-._.-._.-._.-._.-._.-._.-._.-._.-._.-.
'                                      '
' open a file : syscall number 5 (open) '
'._.-._.-._.-._.-._.-._.-._.-._.-._.-._.'

        mov eax,5               ; the number of the syscall 'open'
        mov ebx,file            ; offset of the ascii name (null terminated)
                                ; of the file to open
        mov ecx,2               ; read and write file access flag (O_RDWR)
        mov edx,7777o           ; all file permissions flags (octal!),
                                ; only used when the file is created
        int 80h                 ; interrupt 80h
        mov dword[fd],eax       ; save file descriptor returned by the syscall

file    db "FiLe",0

.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-.
'                                            '
' open a directory : syscall number 5 (open) '
'._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.'

        mov eax,5               ; the number of the syscall 'open'
        mov ebx,direct          ; offset of the ascii name (null terminated)
                                ; of the directory to open
        mov ecx,200000o         ; directory flag (O_DIRECTORY, octal!)
        mov edx,0               ; permissions (not used here)
        int 80h                 ; interrupt 80h
        mov dword[fd],eax       ; save file descriptor returned by the syscall

direct  db 'DiReCtOrY',0

;REMARK :
; to open the current directory    : open '.',0
; to open previous/parent directory: open '..',0

.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-.
'                                                                  '
' close a file descriptor (an open file) : syscall number 6 (close) '
'._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.'

        mov eax,6               ; the number of the syscall 'close'
        mov ebx,dword[fd]       ; the file descriptor returned by the
                                ; syscall 'open'
        int 80h

.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-.
'                                              '
' create a new file : syscall number 14 (mknod) '
'._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.'
        mov eax,14              ; the number of the syscall 'mknod'
        mov ebx,filename        ; offset of the name of the file to create
        mov ecx,100644o         ; regular file (S_IFREG = 0100000 octal)
                                ; + file permissions flags rw-r--r--
        mov edx,0               ; ignored
        int 80h

filename db "file_test.txt",0

.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-.
'                                                  '
' move the file pointer : syscall number 19 (lseek) '
'._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.'

        mov eax,19              ; the number of the syscall 'lseek'
        mov ebx,dword[fd]       ; the file descriptor returned by the
                                ; syscall 'open'
        mov ecx,15              ; number of bytes to move
        mov edx,0               ; how to move: 0 = from beginning of the file
        int 80h

.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-.
'                                                          '
' write bytes in a (opened) file : syscall number 4 (write) '
'._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.'

        mov eax,4               ; the number of the syscall 'write'
        mov ebx,dword[fd]       ; the file descriptor returned by the
                                ; syscall 'open'
        mov ecx,buffer          ; data to write in the file
        mov edx,10              ; number of bytes to write
        int 80h

buffer  db "WRITE THIS"

.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-.
'                                                          '
' read bytes in a (opened) file : syscall number 3 (read)   '
'._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.'

        mov eax,3               ; the number of the syscall 'read'
        mov ebx,dword[fd]       ; the file descriptor returned by the
                                ; syscall 'open'
        mov ecx,buffer          ; buffer which receives the data read
        mov edx,16              ; number of bytes to read
        int 80h

buffer  resb 16

.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-.
'                                                          '
' find files in directory : syscall number 141 (getdents)   '
'._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.'
;---------------------------------------------------------------------+
; To compile this file:
;
;     nasm -f elf test.asm
;     ld -o test test.o
;- - - - - - - - - - - - -
BITS 32
section .text
global _start
_start:
        ;----------------------------------+
        ; first we should open a directory ;
        ;----------------------------------+
        mov eax,5               ; the number of the syscall 'open'
        mov ebx,dir             ; offset of the ascii name (null terminated)
                                ; of the directory to open
        mov ecx,200000o         ; directory flag (O_DIRECTORY, octal!)
        mov edx,0               ; file permissions flags (not used here)
        int 80h                 ; interrupt 80h
                                ; you should check here for error!
        mov dword[fd],eax       ; save file descriptor returned by the syscall

        call getdents           ; read file entries in the opened directory
        call getdents           ; one more time
        call getdents           ; one more time

        mov eax,1               ; the number of the syscall 'exit'
        mov ebx,0               ; exit code
        int 80h

getdents:
        mov eax,141             ; the number of the syscall 'getdents'
        mov ebx,dword[fd]       ; fd
        mov ecx,dirent          ; pointer to 'dirent' structure
        mov edx,600             ; size of the buffer
        int 80h
        ret

section .data

dir     db '.',0

section .bss

fd      resd 1
dirent  resb 600

;--------------------------------------------------+------------------+
;REMARK :                                          |
;                                                  |
; to open the current directory    : open '.',0    |
; to open previous/parent directory: open '..',0   |
;--------------------------------------------------+

remember this:

dirent:
   d_ino     resd 1     ; inode number
   d_off     resd 1     ; dir-file offset (offset from beginning
                        ; of directory file to concerning entry.)
   d_reclen  resw 1     ; length of record
   d_name    resb 256   ; file name, null terminated

So we have read the entries of the opened directory, let's look in hex at
what happened in our buffer: (I like HEX view!!!)
----------+-------------------------------------------------+------------------+
offset    | 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f | view in ascii    |
----------+-------------------------------------------------+------------------+
0000:0000 | 50 c3 10 00 0c 00 00 00 0c 00 2e 00 63 81 28 00 | P+..........c.(. |
0000:0010 | 18 00 00 00 10 00 2e 2e 00 00 00 00 7d 83 34 00 | ............}.4. |
0000:0020 | 2c 00 00 00 18 00 53 75 62 44 69 72 65 63 74 6f | ,.....SubDirecto |
0000:0030 | 72 79 00 00 ef c3 10 00 38 00 00 00 10 00 66 69 | ry..ï+..8.....fi |
0000:0040 | 6c 65 00 00 f0 c3 10 00 00 10 00 00 18 00 73 74 | le..ð+........st |
0000:0050 | 75 70 69 64 66 69 6c 65 32 00 00 00 00 00 00 00 | upidfile2....... |
0000:0060 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................ |
----------+-------------------------------------------------+------------------+

(the bytes of each value are stored inversed, lowest byte first)

offset 00 to 03 : 50 c3 10 00 : d_ino    = 0010c350 (inode number)
offset 04 to 07 : 0c 00 00 00 : d_off    = 0000000c (dir-file offset)
offset 08 to 09 : 0c 00       : d_reclen = 000c     (length of record)
offset 0a to 0b : 2e 00       : d_name   = ".",0    (filename)

the length of the first dirent is 0ch, so we will find the next dirent at
offset 0+0ch (beginning + length of record)

offset 0c to 0f : 63 81 28 00 : d_ino    = 00288163 (inode number)
offset 10 to 13 : 18 00 00 00 : d_off    = 00000018 (dir-file offset)
offset 14 to 15 : 10 00       : d_reclen = 0010     (length of record)
offset 16 to 18 : 2e 2e 00    : d_name   = "..",0   (filename)

the length of the SECOND dirent is 10h, so we will find the next dirent at
offset 0c+10=1c (offset of dirent #2 + length of record)

offset 1c to 1f : 7d 83 34 00 : d_ino    = 0034837d (inode number)
offset 20 to 23 : 2c 00 00 00 : d_off    = 0000002c (dir-file offset)
offset 24 to 25 : 18 00       : d_reclen = 0018     (length of record)
offset 26 to 32 : 5375624469726563746f727900
                              : d_name   = 'SubDirectory',0 (filename)

the length of the THIRD dirent is 18h, so we will find the next dirent at
offset 1c+18=34 (offset of dirent #3 + length of record)

offset 34 to 37 : ef c3 10 00 : d_ino    = 0010c3ef (inode number)
offset 38 to 3b : 38 00 00 00 : d_off    = 00000038 (dir-file offset)
offset 3c to 3d : 10 00       : d_reclen = 0010     (length of record)
offset 3e to 42 : 66696c6500  : d_name   = 'file',0 (filename)

the length of the FOURTH dirent is 10h, so we will find the next dirent at
offset 34+10=44 (offset of dirent #4 + length of record)

offset 44 to 47 : f0 c3 10 00 : d_ino    = 0010c3f0 (inode number)
offset 48 to 4b : 00 10 00 00 : d_off    = 00001000 (dir-file offset)
offset 4c to 4d : 18 00       : d_reclen = 0018     (length of record)
offset 4e to 59 : 73747570696466696c653200
                              : d_name   = 'stupidfile2',0 (filename)

so we have found those files:
        .
        ..
        SubDirectory
        file
        stupidfile2

when you call getdents, the dirent structures seem to return the names of
the files found and also "." and "..", like in DOS or Win32.

Read the tutorial 'using the getdents(2) Linux syscall to read directory
entries from disk.' from sblip in Metaphase Issue # 2

.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-.
'                                                      '
' find files in directory : syscall number 89 (readdir) '
'._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.'

some tuts and docs will tell you that the readdir syscall is or will be
replaced by the getdents syscall (in my linux installation, both can be
used...)
the difference between readdir and getdents is that readdir returns only
one dirent (and not several successive dirents as getdents does!):
with readdir, it's like in dos with the DTA or in win32 with the
win32_data structure returned by the FindNext API.

.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-.
'                                             '
' read the keyboard : syscall number 3 (read) '
'._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.'

        mov eax,3               ; number of the syscall 'read'
        mov ebx,0               ; fd of the keyboard
        mov ecx,buffer          ; buffer which receives the key pressed
        mov edx,1               ; number of bytes to read
        int 80h                 ; interrupt 80h

   .-,           ,-.
      \ ERROR CODES /
      '._____.--._____.'
in kernel 2.4 it seems to be: 0fffff000h < eax <= 0ffffffffh ---> error code The value of error code are return in eax after int 80h instruction (here they are in decimal: -1 in decimal ---> 0ffffffffh The value of the error codes can be 'not the same' in future kernel or in other linux OS or in freeBSD +--------------+-----+-----------------------------------------------------+ | ERROR |VALUE| DESCRIPTION | | NAME |(DEC)| | +--------------+-----+-----------------------------------------------------+ | E2BIG | -7 | The argument list (or vector) too long | +--------------+-----+-----------------------------------------------------+ | EACCESS | | The taks does not have enough privilege | | | | to perform the operation | +--------------+-----+-----------------------------------------------------+ | EAGAIN | -11 | Try again | +--------------+-----+-----------------------------------------------------+ | EBADF | -9 | Bad file descriptor number | +--------------+-----+-----------------------------------------------------+ | EBADFD | -77 | File descriptor in bad state | +--------------+-----+-----------------------------------------------------+ | EEXIST | -17 | Tried to create a file that already exists | +--------------+-----+-----------------------------------------------------+ | EFAULT | -14 | A pointer points to an invalid address | +--------------+-----+-----------------------------------------------------+ | EINTR | -4 | The system call was interrupted by a signal | +--------------+-----+-----------------------------------------------------+ | EINVAL | -22 | Invalid argument | +--------------+-----+-----------------------------------------------------+ | EIO | -5 | An I/O error occured | +--------------+-----+-----------------------------------------------------+ | EISDIR | -21 | The file descriptor is a directory | +--------------+-----+-----------------------------------------------------+ | ELIBBAD | -80 | Accessing a corrupted shared library | 
+--------------+-----+-----------------------------------------------------+ | ELOOP | -40 | Too many symbolic links or cyclic path encountered | +--------------+-----+-----------------------------------------------------+ | ENAMETOOLONG | -36 | File name too long | +--------------+-----+-----------------------------------------------------+ | ENFILE | -23 | The kernel has too many open files | +--------------+-----+-----------------------------------------------------+ | ENODEV | -19 | No such device | +--------------+-----+-----------------------------------------------------+ | ENOENT | -2 | An element of the path does not exist | +--------------+-----+-----------------------------------------------------+ | ENOEXEC | -8 | Exec format error | +--------------+-----+-----------------------------------------------------+ | ENOMEM | -12 | Out of kernel memory to complete the call | +--------------+-----+-----------------------------------------------------+ | ENOSPC | -28 | Not enough space left on device to complete the call| +--------------+-----+-----------------------------------------------------+ | ENOTDIR | -20 | (NODIR?) An element of the path that should have | | | | been a directory is not, in fact, a directory | +--------------+-----+-----------------------------------------------------+ | ENOTEMPTY | -39 | The destination directory is not empty | +--------------+-----+-----------------------------------------------------+ | ENXIO | -6 | No such device or address | +--------------+-----+-----------------------------------------------------+ | EPERM | -1 | Operation not permitted | +--------------+-----+-----------------------------------------------------+ | EPIPE | -32 | Broken pipe | +--------------+-----+-----------------------------------------------------+ | EROFS | -30 | Attemps to modify a read-only file system | +--------------+-----+-----------------------------------------------------+ | ESPIPE | -29 | (ISPIPE?) 
Illegal seek | +--------------+-----+-----------------------------------------------------+ | ETXTBSY | -26 | Text file busy | +--------------+-----+-----------------------------------------------------+ | EXDEV | -18 | Tried to create a cross-device link | +--------------+-----+-----------------------------------------------------+ ,...... .-, ,-. .' '. \ SUMMARY / . . ' ' : .-. .-. : '.__.---.__.' : : _ : : _ : : : :(_); :(_); : * Introduction : , '. : * ELF header : , :; : * Program header : .-___- . : * SECTION header : '.___.' '. * COMMENTED example ;. ..: * CODE for 'COMMENTED example' :... ....: * HEX view of the CODE for 'COMMENTED example' ;:::: .:::::: :::::; ::::::: ;::::; ::::::: .::::; '::::::. .;::::: ::::;. ;::::; ELF FILE ':::; .:;; '::, ': FORMAT ': : : : by LiTlLe VxW : '. . '. ; '. .--..___..--'. ; , . , , ._._._' ._._._. WRITTEN FOR 29A ______________ _ _ __/ \__ _ _ _ _ __ INTRODUCTION __ _ _ \______________/ ELF files looks like this: +-------------------+ | ELF header | | | +-------------------+ | program header | | (c0h bytes) | +-------------------+ | section #1 | +-------------------+ | section #2 | +-------------------+ . . . . . . . . . . . . . . . . . . +-------------------+ | section #n | +-------------------+ | section header | | (n*20h bytes) | +-------------------+ ____________ _ _ __/ \__ _ _ _ _ __ ELF HEADER __ _ _ \____________/ +==========================================================================+ | ELF header | +=====+===========+=============+==========================================+ | off | size | ref | DESCRIPTION +-----+-----------+-------------+------------------------------------------- | 0 | 10h bytes | e_ident | 'ELF' signature and other values | 10 | word | e_type | file type | 12 | word | e_machine | machine type to run this file | 14 | dword | e_version | ELF header version. 
usually 1
|  18 | dword     | e_entry     | entry point virtual address
|  1c | dword     | e_phoff     | offset of program header
|  20 | dword     | e_shoff     | offset of sections header
|  24 | dword     | e_flags     | flags
|  28 | word      | e_ehsize    | size of the ELF header
|  2A | word      | e_phentsize | size of one entry in the program header
|  2C | word      | e_phnum     | number of entries in the program header
|  2E | word      | e_shentsize | size of one entry in the section header
|  30 | word      | e_shnum     | number of entries in the section header
|  32 | word      | e_shstrndx  | entry number of the name string section
+-----+-----------+-------------+-------------------------------------------

elf_header gives information about the other headers, how to launch the
program file and other info (architecture,...)

 \ e_ident : 'ELF' signature (464c457fh) and other values:
  '-------
   structure of the 10h first bytes:

+-----+---------------+---------+------------------------------------------+
| off | Name          | size    | description                              |
+-----+---------------+---------+------------------------------------------+
|  0  | EI_MAG0-3     | dword   | ELF signature: 7fh,'E','L','F'           |
|  4  | EI_CLASS      | byte    | identifies the file's class, or capacity.|
|  5  | EI_DATA       | byte    | Data encoding                            |
|  6  | EI_VERSION    | byte    | File version                             |
|  7  | EI_OSABI      | byte    | Operating system/ABI identification      |
|  8  | EI_ABIVERSION | byte    | ABI version                              |
|  9  | EI_PAD        | 7 bytes | unused/reserved (up to offset 0F)        |
+-----+---------------+---------+------------------------------------------+

   EI_NIDENT is not a field of the file: it is the constant 10h, the size of
   the e_ident structure.

   EI_MAG0-3 : 'ELF' signature. The bytes in the file are 7f 45 4c 46, which
               gives 464c457fh when read as a dword.

   EI_CLASS  : file's class, or capacity. It defines the basic types used by
               the data structures of the object file container itself.

               ELFCLASSNONE  0  Invalid class
               ELFCLASS32    1  32-bit objects  <--- 32-bit structures
               ELFCLASS64    2  64-bit objects  <--- 64-bit structures

   EI_DATA   : specifies the encoding of both the data structures used by
               object file container and data contained in object file sections.
ELFDATANONE 0 Invalid data encoding ELFDATA2LSB 1 specifies 2's complement values, with the least significant byte occupying the lowest address. ELFDATA2MSB 2 specifies 2's complement values, with the most significant byte occupying the lowest address. EI_VERSION : ELF header version number EI_OSABI : dentifies the operating system and ABI to which the object is targeted. EI_ABIVERSION : identifies the version of the ABI to which the object is targeted EI_PAD : unused/reserved \ e_type : determine the file type. '------ VALUE Meaning 0 No file type 1 Relocatable file 2 Executable file 3 Shared object file 4 Core file fe00 Operating system-specific feff Operating system-specific ff00 Processor-specific ffff Processor-specific \ e_machine : machine type to run this file '--------- VALUE Meaning (DEC?) 0 No machine 1 AT&T WE 32100 2 SPARC 3 Intel 80386 4 Motorola 68000 5 Motorola 88000 7 Intel 80860 8 MIPS I Architecture 9 IBM System/370 Processor 10 MIPS RS3000 Little-endian 15 Hewlett-Packard PA-RISC 17 Fujitsu VPP500 18 Enhanced instruction set SPARC 19 Intel 80960 20 PowerPC 21 64-bit PowerPC 36 NEC V800 37 Fujitsu FR20 38 TRW RH-32 39 Motorola RCE 40 Advanced RISC Machines ARM 41 Digital Alpha 42 Hitachi SH 43 SPARC Version 9 44 Siemens Tricore embedded processor 45 Argonaut RISC Core, Argonaut Technologies Inc. 46 Hitachi H8/300 47 Hitachi H8/300H 48 Hitachi H8S 49 Hitachi H8/500 50 Itanium-based platform 51 Stanford MIPS-X 52 Motorola ColdFire 53 Motorola M68HC12 54 Fujitsu MMA Multimedia Accelerator 55 Siemens PCP 56 Sony nCPU embedded RISC processor 57 Denso NDR1 microprocessor 58 Motorola Star*Core processor 59 Toyota ME16 processor 60 STMicroelectronics ST100 processor 61 Advanced Logic Corp. 
TinyJ embedded processor family
              66       Siemens FX66 microcontroller
              67       STMicroelectronics ST9+ 8/16 bit microcontroller
              68       STMicroelectronics ST7 8-bit microcontroller
              69       Motorola MC68HC16 Microcontroller
              70       Motorola MC68HC11 Microcontroller
              71       Motorola MC68HC08 Microcontroller
              72       Motorola MC68HC05 Microcontroller
              73       Silicon Graphics SVx
              74       STMicroelectronics ST19 8-bit microcontroller
              75       Digital VAX
              76       Axis Communications 32-bit embedded processor
              77       Infineon Technologies 32-bit embedded processor
              78       Element 14 64-bit DSP Processor
              79       LSI Logic 16-bit DSP Processor
              80       Donald Knuth's educational 64-bit processor
              81       Harvard University machine-independent object files
              82       SiTera Prism

 \ e_version : ELF header version.
  '-------
              value    Meaning
              0        invalid version
              1        current version

 \ e_entry : entry point virtual address. It is the address in memory (not
  '-------   the offset in the file) of the first instruction to execute.

 \ e_phoff : offset of program header table. If the file has no program
  '-------   header table, e_phoff is zero.

 \ e_shoff : offset of sections header table. If the file has no section
  '-------   header table, e_shoff is zero.

 \ e_flags : processor-specific flags associated with the file
  '-------

 \ e_ehsize : size of the ELF header in bytes
  '--------

 \ e_phentsize : size of one entry in the program header table, all entries
  '-----------   are the same size

 \ e_phnum : number of entries in the program header table. If a file has no
  '-------   program header table, e_phnum is zero.

 \ e_shentsize : size of one entry in the section header table, all entries
  '-----------   are the same size

 \ e_shnum : number of entries in the section header table. If a file has no
  '-------   section header table, e_shnum is zero.

 \ e_shstrndx : This member holds the section header table index of the entry
  '----------   associated with the section name string table. If the file has
                no section name string table, this member holds the value
                SHN_UNDEF.
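A quick way to check the table above is to decode a real header. The sketch below is not part of the original tutorial: it uses Python only as a small checking tool, the helper name parse_elf32_header is made up for the illustration, and the 34h sample bytes are copied from the hex view of the 'hello' example at the end of this tutorial. It assumes a 32-bit, little-endian file and refuses anything else.

```python
import struct

# First 34h bytes of the 'hello' example (copied from the hex view
# given at the end of this tutorial).
HELLO_EHDR = bytes.fromhex(
    "7f454c46010101000000000000000000"
    "02000300010000008080040834000000"
    "1001000000000000340020000200280008000500"
)

# Field names in the same order as the ELF header table (offsets 10h..33h).
FIELDS = ("e_type", "e_machine", "e_version", "e_entry", "e_phoff",
          "e_shoff", "e_flags", "e_ehsize", "e_phentsize", "e_phnum",
          "e_shentsize", "e_shnum", "e_shstrndx")


def parse_elf32_header(data):
    """Decode a 32-bit little-endian ELF header (hypothetical helper)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("no ELF signature")
    if data[4] != 1 or data[5] != 1:
        # Assumption of this sketch: ELFCLASS32 and ELFDATA2LSB only.
        raise ValueError("only 32-bit little-endian files are handled")
    # word = H (2 bytes), dword = I (4 bytes), '<' = little-endian, no padding
    values = struct.unpack_from("<HHIIIIIHHHHHH", data, 0x10)
    return dict(zip(FIELDS, values))


header = parse_elf32_header(HELLO_EHDR)
for name in FIELDS:
    print(f"{name:12} = {header[name]:#x}")
```

The format string is just the 'size' column of the table written in struct notation, so a mistake in the table would show up as wrong values here.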
________________ _ _ __/ \__ _ _ _ _ __ PROGRAM header __ _ _ \________________/ program header table is a structure which describes for the system how to prepare the program for execution. A file specifies its own program header size with the ELF header's 'e_phentsize' and 'e_phnum' value (in ELF header) remember EI_CLASS in ELF header, you can have structure in 32bits mode or in 64bits mode: +============================================================================+ | program header table (32bits) | +=======+=========+==========+===============================================+ | off | Size | name | meaning | +-------+---------+----------+-----------------------------------------------+ | 00 | dword | p_type | type of segment | | 04 | dword | p_offset | physical offset where to start the segment at | | 08 | dword | p_vaddr | virtual address in memory | | 0c | dword | p_paddr | physical address | | 10 | dword | p_filesz | size of datas read from offset | | 14 | dword | p_memsz | size of the segment in memory | | 18 | dword | p_flags | segment flags (rwx perms) | | 1c | dword | p_align | alignement | +-------+---------+----------+-----------------------------------------------+ +====================================================================+ | program header table (64bits) | +=========+==========+===============================================+ | Size | name | meaning | +=========+==========+===============================================+ | dword | p_type | type of segment | | dword | p_flags | segment flags (rwx perms) | | 8 bytes | p_offset | physical offset where to start the segment at | | 8 bytes | p_vaddr | virtual address in memory | | 8 bytes | p_paddr | physical address | | 8 bytes | p_filesz | size of datas read from offset | | 8 bytes | p_memsz | size of the segment in memory | | 8 bytes | p_align | alignement | +---------+----------+-----------------------------------------------+ \ p_type : type of segment '------ PT_NULL 0 unused PT_LOAD 1 
see below PT_DYNAMIC 2 Dynamic linking information PT_INTERP 3 see below PT_NOTE 4 Location and size of auxiliary information PT_SHLIB 5 reserved PT_PHDR 6 see below PT_LOOS 60000000 reserved for operating system-specific semantics PT_HIOS 6fffffff reserved for operating system-specific semantics PT_LOPROC 70000000 reserved for processor-specific semantics PT_HIPROC 7fffffff reserved for processor-specific semantics PT_LOAD : loadable segment, described by p_filesz and p_memsz. The bytes from the file are mapped to the beginning of the memory segment. If the segment's memory size (p_memsz) is larger than the file size (p_filesz), the ``extra'' bytes are defined to hold the value 0 and to follow the segment's initialized area. The file size may not be larger than the memory size. Loadable segment entries in the program header table appear in ascending order, sorted on the p_vaddr member. PT_INTERP : location and size of a null-terminated path name to invoke as an interpreter. This segment type is meaningful only for executable file PT_PHDR : if present, specifies the location and size of the program header table itself, both in the file and in the memory image of the program.it may occur only if the program header table is part of the memory image of the program. If it is present, it must precede any loadable segment entry. \ p_offset : offset from the beginning of the file at which the first byte '-------- of the segment resides. \ p_vaddr : virtual address at which the first byte of the segment resides in memory. 
  '-------

 \ p_paddr : physical address (if relevant, else equal to p_vaddr)
  '-------

 \ p_filesz : number of bytes in the file image of the segment; it may be zero
  '--------

 \ p_memsz : number of bytes in the memory image of the segment; it may be zero
  '-------

 \ p_flags : segment permission flags:
  '-------
             name          Value      meaning
             PF_X          1          Execute
             PF_W          2          Write
             PF_R          4          Read
             PF_MASKOS     0ff00000   Unspecified
             PF_MASKPROC   f0000000   Unspecified

 \ p_align : alignment
  '-------
             0 and 1 mean no alignment is required. Otherwise, p_align
             should be a positive, integral power of 2, and p_vaddr should
             equal p_offset, modulo p_align.


                   ________________
          _ _   __/                \__   _ _
           _ _ __   SECTION header   __ _ _
                  \________________/


All sections in ELF files can be found using the section header table.
The section header is similar to the program header. Each entry describes
one section of the file.

perhaps I write a bug here ---> the section header seems to be optional.

+---------------------------------------------------------------------------+
|                      section header (32bits mode)                         |
+-----+--------------+-------+----------------------------------------------+
| off | Name         | size  | description                                  |
+-----+--------------+-------+----------------------------------------------+
| 00  | sh_name      | dword | pointer to the ascii name of the section     |
| 04  | sh_type      | dword | section type                                 |
| 08  | sh_flags     | dword | flags                                        |
| 0c  | sh_addr      | dword | virtual address                              |
| 10  | sh_offset    | dword | physical offset                              |
| 14  | sh_size      | dword | size                                         |
| 18  | sh_link      | dword | depends on the section type                  |
| 1c  | sh_info      | dword | depends on the section type                  |
| 20  | sh_addralign | dword | alignment                                    |
| 24  | sh_entsize   | dword | used when section contains fixed size entries|
+-----+--------------+-------+----------------------------------------------+

+-----------------------------------------------------------------------------+
|                       section header (64bits mode)                          |
+-----+--------------+---------+----------------------------------------------+
| off | Name         | size    | description                                  |
+-----+--------------+---------+----------------------------------------------+
| 00  | sh_name      | dword   | pointer to the ascii name of the section     |
| 04  | sh_type      | dword   | section type                                 |
| 08  | sh_flags     | 8 bytes | flags                                        |
| 10  | sh_addr      | 8 bytes | virtual address                              |
| 18  | sh_offset    | 8 bytes | physical offset                              |
| 20  | sh_size      | 8 bytes | size                                         |
| 28  | sh_link      | dword   | depends on the section type                  |
| 2c  | sh_info      | dword   | depends on the section type                  |
| 30  | sh_addralign | 8 bytes | alignment                                    |
| 38  | sh_entsize   | 8 bytes | used when section contains fixed size entries|
+-----+--------------+---------+----------------------------------------------+

 \ sh_name : name of the section. Its value is an index into the section
  '-------   header string table section

 \ sh_type : type of section
  '-------
     Name               Value     description
     SHT_NULL           0         Marks the section header as inactive
     SHT_PROGBITS       1         section holds information defined by the program
     SHT_SYMTAB         2         section holds a symbol table, provides symbols
                                  for link editing
     SHT_STRTAB         3         section holds a string table
     SHT_RELA           4         section holds relocation entries with explicit
                                  addends
     SHT_HASH           5         section holds a symbol hash table
     SHT_DYNAMIC        6         section holds information for dynamic linking
     SHT_NOTE           7         section holds information that marks the file
                                  in some way
     SHT_NOBITS         8         section of this type occupies no space in the file
     SHT_REL            9         section holds relocation entries without
                                  explicit addends
     SHT_SHLIB          a         reserved
     SHT_DYNSYM         b         holds a minimal set of dynamic linking symbols
     SHT_INIT_ARRAY     e         section contains an array of pointers to
                                  initialization functions
     SHT_FINI_ARRAY     f         section contains an array of pointers to
                                  termination functions
     SHT_PREINIT_ARRAY  10        section contains an array of pointers to
                                  functions that are invoked before all other
                                  initialization functions
     SHT_LOOS           60000000  reserved for operating system-specific semantics.
SHT_HIOS 6fffffff reserved for operating system-specific semantics. SHT_LOPROC 70000000 reserved for processor-specific semantics SHT_HIPROC 7fffffff reserved for processor-specific semantics SHT_LOUSER 80000000 value specifies the lower bound of the range of indexes reserved for application programs. SHT_HIUSER ffffffff value specifies the upper bound of the range of indexes reserved for application programs \ sh_flags : 1-bit flags that describe miscellaneous attributes '-------- \ sh_addr : If the section will appear in the memory image of a process, this '------- member gives the address at which the section's first byte should reside. Otherwise, the member contains 0. \ sh_offset : offset from the beginning of the file to the first byte in the section '--------- \ sh_size : section's size in bytes '------- \ sh_link : This member holds a section header table index link, whose interpretation '------- depends on the section type \ sh_info : This member holds extra information, whose interpretation depends on '------- the section type. \ sh_addralign : Some sections have address alignment constraints. For example, if a '------------ section holds a doubleword, the system must ensure doubleword alignment for the entire section. The value of sh_addr must be congruent to 0, modulo the value of sh_addralign. \ sh_entsize : Some sections hold a table of fixed-size entries, such as a symbol table. '---------- For such a section, this member gives the size in bytes of each entry As i said above, i won't waste your HD space with tons of flag descriptions. You'll find them in the - nice - ELF documention (urlz at 4.3). Everything is document. It's the linux world. So use the source dudez !!! 
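To see how sh_name and the section name string table work together, here is a small sketch (again Python as a checking tool, not something from the original tutorial). The helper names parse_shdr32 and section_name are invented for the illustration. The sample bytes are one 28h-byte entry (file offset 138h) and the name string table (file offset 0dbh, 35h bytes) of the 'hello' example dumped at the end of this tutorial; only the 32-bit little-endian layout is assumed.

```python
import struct

# Field names in the same order as the 32-bit section header table.
SHDR_FIELDS = ("sh_name", "sh_type", "sh_flags", "sh_addr", "sh_offset",
               "sh_size", "sh_link", "sh_info", "sh_addralign", "sh_entsize")


def parse_shdr32(entry):
    """Unpack one 28h-byte little-endian section header entry (hypothetical helper)."""
    return dict(zip(SHDR_FIELDS, struct.unpack("<10I", entry)))


def section_name(shstrtab, sh_name):
    """Return the NUL-terminated ASCII string that starts at index sh_name."""
    end = shstrtab.index(b"\x00", sh_name)
    return shstrtab[sh_name:end].decode("ascii")


# Sample data copied from the hex view of the 'hello' example.
entry = bytes.fromhex(
    "1b000000" "01000000" "06000000" "80800408" "80000000"
    "22000000" "00000000" "00000000" "10000000" "00000000"
)
shstrtab = (b"\x00.symtab\x00.strtab\x00.shstrtab\x00"
            b".text\x00.data\x00.bss\x00.comment\x00")

shdr = parse_shdr32(entry)
print(section_name(shstrtab, shdr["sh_name"]),
      hex(shdr["sh_addr"]), hex(shdr["sh_offset"]), hex(shdr["sh_size"]))
```

The point of the design is that names are never stored in the entries themselves: an entry only carries an index, and every name lives once in the string table selected by e_shstrndx.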
______________________________________________________________________________________________________


                     ___________________
            _ _   __/                   \__   _ _
             _ _ __   COMMENTED example   __ _ _
                    \___________________/


Take a look at the end of this tutorial, you will find:

        CODE for 'COMMENTED example'
        HEX view of the CODE for 'COMMENTED example'

I will use them to explain to you the mystery of an ELF file.

Let's look at our hello program file in hexa:

+----------+-------------------------------------------------+------------------+
|  OFFSET  | 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f |    ascii view    |
+----------+-------------------------------------------------+------------------+
|0000:0000 | 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 | .ELF............ |
|0000:0010 | 02 00 03 00 01 00 00 00 80 80 04 08 34 00 00 00 | ............4... |
|0000:0020 | 10 01 00 00 00 00 00 00 34 00 20 00 02 00 28 00 | ........4. ...(. |
|0000:0030 | 08 00 05 00                                     | ....             |
+----------+-------------------------------------------------+------------------+

it is the ELF header, all ELF files begin with this header:

   at offset 00h : ELF signature         : 464c457fh
   at offset 04h : data structure type   : 01h ---> 32bits structure!
   at offset 10h : file type             : 0002h ---> executable
   at offset 12h : machine type          : 0003h ---> Intel 80386
   at offset 14h : version of ELF header : 00000001h
   .  .      18h : entry point virtual address: 08048080h
   .  .      1ch : offset of program header   : 00000034h
   .  .      20h : offset of sections header  : 00000110h
   .  .      28h : size of the ELF header     : 34h bytes
   .  .      2ah : size of one entry in program header table  : 0020h
   .  .      2ch : number of entries in program header table  : 0002h
   .  .      2eh : size of one entry in section header table  : 0028h
   .  .      30h : number of entries in section header table  : 0008h
   .  .      32h : entry number of the name string section table: 0005h

Remark : each entry in the program header table is the description (program
         header structure) of one segment of the ELF file.
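With the values just read from the header, the position of both tables is simple arithmetic: a table starts at its offset and covers (size of one entry) * (number of entries) bytes. The short sketch below only does this multiplication; it is an illustration added to this text (Python used as a calculator, table_range is a made-up helper name), not something taken from a tool.

```python
# Values read from the ELF header of the hello example.
e_phoff, e_phentsize, e_phnum = 0x34, 0x20, 2
e_shoff, e_shentsize, e_shnum = 0x110, 0x28, 8


def table_range(offset, entsize, count):
    """Return (first, last) file offsets covered by a header table."""
    return offset, offset + entsize * count - 1


ph_first, ph_last = table_range(e_phoff, e_phentsize, e_phnum)
sh_first, sh_last = table_range(e_shoff, e_shentsize, e_shnum)
print(f"program header table: {ph_first:#x}..{ph_last:#x}")
print(f"section header table: {sh_first:#x}..{sh_last:#x}")
```

For this file the program header table covers 34h to 73h and the section header table covers 110h to 24fh, which matches the marks in the full hex view at the end of the tutorial.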
we have found in the ELF header that the offset of the program header is
00000034h, so GO to offset 34h !!! (EI_CLASS specifies a 32bits structure!)

+----------+-------------------------------------------------+------------------+
|  OFFSET  | 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f |    ascii view    |
+----------+-------------------------------------------------+------------------+
|0000:0030 |             01 00 00 00 00 00 00 00 00 80 04 08 |     ............ |
|0000:0040 | 00 80 04 08 a2 00 00 00 a2 00 00 00 05 00 00 00 | ................ |
|0000:0050 | 00 10 00 00 01 00 00 00 a4 00 00 00 a4 90 04 08 | ................ |
|0000:0060 | a4 90 04 08 18 00 00 00 18 00 00 00 06 00 00 00 | ................ |
|0000:0070 | 00 10 00 00                                     | ....             |
+----------+-------------------------------------------------+------------------+

e_phnum in ELF header specifies the number of entries in the program header:
here there are 2 entries. Each one describes a loadable segment of the file
(they are called SECTION #1 and #2 in this example).

SECTION #1 :

   at offset 34h : type of segment : 00000001h ----> loadable segment
   at offset 38h : physical offset : 00000000h
   .  .      3ch : virtual address : 08048000h
   .  .      40h : physical address: 08048000h
   .  .      44h : physical size   : 000000a2h
   .  .      48h : virtual size    : 000000a2h
   .  .      4ch : flags           : 00000005h (read/execute)
   .  .      50h : alignment       : 00001000h

SECTION #2 :

   at offset 54h : type of segment : 00000001h ----> loadable segment
   at offset 58h : physical offset : 000000a4h
   .  .      5ch : virtual address : 080490a4h
   .  .      60h : physical address: 080490a4h
   .  .      64h : physical size   : 00000018h
   .  .      68h : virtual size    : 00000018h
   .  .      6ch : flags           : 00000006h (read/write)
   .  .      70h : alignment       : 00001000h

Here is the section table: there are 8 entries.
those 'section table' structures give information about the sections described
in the 'program header' and about the other sections of the file

+----------+-------------------------------------------------+------------------+
|  OFFSET  | 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f |    ascii view    |
+----------+-------------------------------------------------+------------------+
|0000:0110 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................ |
|0000:0120 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................ |
|0000:0130 | 00 00 00 00 00 00 00 00 1b 00 00 00 01 00 00 00 | ................ |
|0000:0140 | 06 00 00 00 80 80 04 08 80 00 00 00 22 00 00 00 | ............"... |
|0000:0150 | 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 | ................ |
|0000:0160 | 21 00 00 00 01 00 00 00 03 00 00 00 a4 90 04 08 | !............... |
|0000:0170 | a4 00 00 00 18 00 00 00 00 00 00 00 00 00 00 00 | ................ |
|0000:0180 | 04 00 00 00 00 00 00 00 27 00 00 00 08 00 00 00 | ........'....... |
|0000:0190 | 03 00 00 00 bc 90 04 08 bc 00 00 00 00 00 00 00 | ................ |
|0000:01a0 | 00 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 | ................ |
|0000:01b0 | 2c 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 | ,............... |
|0000:01c0 | bc 00 00 00 1f 00 00 00 00 00 00 00 00 00 00 00 | ................ |
|0000:01d0 | 01 00 00 00 00 00 00 00 11 00 00 00 03 00 00 00 | ................ |
|0000:01e0 | 00 00 00 00 00 00 00 00 db 00 00 00 35 00 00 00 | ............5... |
|0000:01f0 | 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 | ................ |
|0000:0200 | 01 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 | ................ |
|0000:0210 | 50 02 00 00 f0 00 00 00 07 00 00 00 0b 00 00 00 | P............... |
|0000:0220 | 04 00 00 00 10 00 00 00 09 00 00 00 03 00 00 00 | ................ |
|0000:0230 | 00 00 00 00 00 00 00 00 40 03 00 00 39 00 00 00 | ........@...9... |
|0000:0240 | 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 | ................ |
+----------+-------------------------------------------------+------------------+

   from offset 110h to 137h : entry #1 in section table
   from        138h to 15fh : entry #2 in section table
   from        160h to 187h : entry #3 in section table
   from        188h to 1afh : entry #4 in section table
   from        1b0h to 1d7h : entry #5 in section table
   from        1d8h to 1ffh : entry #6 in section table
   from        200h to 227h : entry #7 in section table
   from        228h to 24fh : entry #8 in section table

to find the ascii name of a section, the value sh_name is relative to the
beginning of the section name string table. e_shstrndx (0005h, counting from 0)
selects entry #6, so this table is at offset dbh (see entry #6 description).
The string table at offset 340h (entry #8) holds the names of the symbols.

entry #1 :
---------
   sh_name   : 00h       : pointer to the name of the section (empty name)
   sh_type   : 00h       : section header inactive
   sh_addr   : 00h       : virtual addr
   sh_offset : 00h       : physical offset
   sh_size   : 00h       : size

entry #2 :
---------
   sh_name   : 1bh       : pointer to the name of the section ('.text')
   sh_type   : 01h       : holds information defined by the program
   sh_addr   : 08048080h : virtual addr
   sh_offset : 80h       : physical offset
   sh_size   : 22h       : size

entry #3 :
---------
   sh_name   : 21h       : pointer to the name of the section ('.data')
   sh_type   : 01h       : holds information defined by the program
   sh_addr   : 080490a4h : virtual addr
   sh_offset : a4h       : physical offset
   sh_size   : 18h       : size

entry #4 :
---------
   sh_name   : 27h       : pointer to the name of the section ('.bss')
   sh_type   : 08h       : section occupies no space in the file
   sh_addr   : 080490bch : virtual addr
   sh_offset : bch       : physical offset
   sh_size   : 00h       : size

entry #5 :
---------
   sh_name   : 2ch       : pointer to the name of the section ('.comment')
   sh_type   : 01h       : holds information defined by the program
   sh_addr   : 00h       : virtual addr
   sh_offset : bch       : physical offset
   sh_size   : 1fh       : size

entry #6 :
---------
   sh_name   : 11h       : pointer to the name of the section ('.shstrtab')
   sh_type   : 03h       : holds a string table (the names of the sections)
   sh_addr   : 00h       : virtual addr
   sh_offset : dbh       : physical offset
   sh_size   : 35h       : size

entry #7 :
---------
   sh_name   : 01h       : pointer to the name of the section ('.symtab')
   sh_type   : 02h       : holds a symbol table, provides symbols for link editing
   sh_addr   : 00h       : virtual addr
   sh_offset : 250h      : physical offset
   sh_size   : f0h       : size

entry #8 :
---------
   sh_name   : 09h       : pointer to the name of the section ('.strtab')
   sh_type   : 03h       : holds a string table (the names of the symbols)
   sh_addr   : 00h       : virtual addr
   sh_offset : 340h      : physical offset
   sh_size   : 39h       : size


               ______________________________
      _ _   __/                              \__   _ _
       _ _ __   CODE for 'COMMENTED example'   __ _ _
              \______________________________/


;----------------------------------------------------.
; To compile this file:   ;                          ;
;                         ;                          ;
;   nasm -f elf hello.asm ;                          ;
;   ld -o hello hello.o   ;                          ;
;- - - - - - - - - - - - -;                          ;
BITS32                    ; read as a label by nasm  ;
section .text             ; code section             ;
global _start             ; <---.                    ;
_start:                   ; <----\__for nasm         ;
   mov eax,4              ; syscall 'write'          ;
   mov ebx,1              ; file descriptor of       ;
                          ; the screen               ;
   mov ecx,message        ; offset of what to write  ;
   mov edx,23             ; number of bytes to write ;
   int 80h                ; interrupt 80h            ;
                          ;                          ;
   mov eax,1              ; syscall 'exit'           ;
   mov ebx,0              ; exit code                ;
   int 80h                ; interrupt 80h            ;
                          ;                          ;
section .data             ; data section             ;
                          ;                          ;
message db "hello LINUX world !!!",13,10,0           ;
;----------------------------------------------------'


          ______________________________________________
 _ _   __/                                              \__   _ _
  _ _ __   HEX view of the CODE for 'COMMENTED example'   __ _ _
         \______________________________________________/


+----------+-------------------------------------------------+------------------+
|  OFFSET  | 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f |    ascii view    |
+----------+-------------------------------------------------+------------------+
|0000:0000 | 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 | .ELF............ | <-.           <--------+
|0000:0010 | 02 00 03 00 01 00 00 00 80 80 04 08 34 00 00 00 | ............4... |    \_ ELF              |
|0000:0020 | 10 01 00 00 00 00 00 00 34 00 20 00 02 00 28 00 | ........4. ...(. |    /  HEADER           |
|0000:0030 | 08 00 05 00 01 00 00 00 00 00 00 00 00 80 04 08 | ................
| <-< (3ch) | |0000:0040 | 00 80 04 08 a2 00 00 00 a2 00 00 00 05 00 00 00 | ....½...½....... | \ | |0000:0050 | 00 10 00 00 01 00 00 00 a4 00 00 00 a4 90 04 08 | ........?...?... | > program +- Section #1 |0000:0060 | a4 90 04 08 18 00 00 00 18 00 00 00 06 00 00 00 | ?............... | / header | |0000:0070 | 00 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................ | <-' (73h) | |0000:0080 | b8 04 00 00 00 bb 01 00 00 00 b9 a4 90 04 08 ba | ?....¯....û?...§ | | |0000:0090 | 17 00 00 00 cd 80 b8 01 00 00 00 bb 00 00 00 00 | ....Ö.?....¯.... | | |0000:00a0 | cd 80 00 00 68 65 6c 6c 6f 20 4c 49 4e 55 58 20 | Ö...hello LINUX | <-------------+_ section # 2 |0000:00b0 | 77 6f 72 6c 64 20 21 21 21 0d 0a 00 00 54 68 65 | world !!!....The | <-------------' |0000:00c0 | 20 4e 65 74 77 69 64 65 20 41 73 73 65 6d 62 6c | Netwide Assembl | |0000:00d0 | 65 72 20 30 2e 39 38 2e 32 32 00 00 2e 73 79 6d | er 0.98.22...sym | |0000:00e0 | 74 61 62 00 2e 73 74 72 74 61 62 00 2e 73 68 73 | tab..strtab..shs | |0000:00f0 | 74 72 74 61 62 00 2e 74 65 78 74 00 2e 64 61 74 | trtab..text..dat | |0000:0100 | 61 00 2e 62 73 73 00 2e 63 6f 6d 6d 65 6e 74 00 | a..bss..comment. | |0000:0110 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................ | <--. |0000:0120 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................ | | |0000:0130 | 00 00 00 00 00 00 00 00 1b 00 00 00 01 00 00 00 | ................ | | |0000:0140 | 06 00 00 00 80 80 04 08 80 00 00 00 22 00 00 00 | ............"... | | |0000:0150 | 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 | ................ | | |0000:0160 | 21 00 00 00 01 00 00 00 03 00 00 00 a4 90 04 08 | !...........?... | | |0000:0170 | a4 00 00 00 18 00 00 00 00 00 00 00 00 00 00 00 | ?............... | | |0000:0180 | 04 00 00 00 00 00 00 00 27 00 00 00 08 00 00 00 | ........'....... | | |0000:0190 | 03 00 00 00 bc 90 04 08 bc 00 00 00 00 00 00 00 | ....?...?....... 
| | |0000:01a0 | 00 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 | ................ | | SECTION table |0000:01b0 | 2c 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 | ,............... | | |0000:01c0 | bc 00 00 00 1f 00 00 00 00 00 00 00 00 00 00 00 | ?............... | | |0000:01d0 | 01 00 00 00 00 00 00 00 11 00 00 00 03 00 00 00 | ................ | | |0000:01e0 | 00 00 00 00 00 00 00 00 db 00 00 00 35 00 00 00 | ........ê...5... | | |0000:01f0 | 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 | ................ | | |0000:0200 | 01 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 | ................ | | |0000:0210 | 50 02 00 00 f0 00 00 00 07 00 00 00 0b 00 00 00 | P...Ð........... | | |0000:0220 | 04 00 00 00 10 00 00 00 09 00 00 00 03 00 00 00 | ................ | | |0000:0230 | 00 00 00 00 00 00 00 00 40 03 00 00 39 00 00 00 | ........@...9... | | |0000:0240 | 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 | ................ | <--' |0000:0250 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................ | <--. |0000:0260 | 00 00 00 00 80 80 04 08 00 00 00 00 03 00 01 00 | ................ | | |0000:0270 | 00 00 00 00 a4 90 04 08 00 00 00 00 03 00 02 00 | ....?........... | | |0000:0280 | 00 00 00 00 bc 90 04 08 00 00 00 00 03 00 03 00 | ....?........... | | |0000:0290 | 00 00 00 00 00 00 00 00 00 00 00 00 03 00 04 00 | ................ | | |0000:02a0 | 00 00 00 00 00 00 00 00 00 00 00 00 03 00 05 00 | ................ | | |0000:02b0 | 00 00 00 00 00 00 00 00 00 00 00 00 03 00 06 00 | ................ | |_ SYMBOL table |0000:02c0 | 00 00 00 00 00 00 00 00 00 00 00 00 03 00 07 00 | ................ | | |0000:02d0 | 01 00 00 00 00 00 00 00 00 00 00 00 04 00 f1 ff | ..............¤˜ | | |0000:02e0 | 0b 00 00 00 80 80 04 08 00 00 00 00 00 00 01 00 | ................ | | |0000:02f0 | 12 00 00 00 a4 90 04 08 00 00 00 00 00 00 02 00 | ....?........... | | |0000:0300 | 1a 00 00 00 80 80 04 08 00 00 00 00 10 00 01 00 | ................ 
| | |0000:0310 | 21 00 00 00 bc 90 04 08 00 00 00 00 11 00 f1 ff | !...?.........¤˜ | | |0000:0320 | 2d 00 00 00 bc 90 04 08 00 00 00 00 11 00 f1 ff | -...?.........¤˜ | | |0000:0330 | 34 00 00 00 bc 90 04 08 00 00 00 00 11 00 f1 ff | 4...?.........¤˜ | <--' |0000:0340 | 00 68 65 6c 6c 6f 2e 61 73 6d 00 42 49 54 53 33 | .hello.asm.BITS3 | <--. |0000:0350 | 32 00 6d 65 73 73 61 67 65 00 5f 73 74 61 72 74 | 2.message._start | |_ STRING table |0000:0360 | 00 5f 5f 62 73 73 5f 73 74 61 72 74 00 5f 65 64 | .__bss_start._ed | | |0000:0370 | 61 74 61 00 5f 65 6e 64 00 | ata._end. | <--' +----------+-------------------------------------------------+------------------+ LAST DWORDS: I like linux, and I ask me why I have written this tut, because I don't want that linux OS will be infected with stupid virii with destructiv payloads. I hate destructiv virus writers! they kill the host of their virus, and they will be tracked and hunted by authorities. Only anti virus company love those guys... and if you say me that there is no life without destruction, I answer you : 'are you GOD to pretend to create life ?!!' I hope you have enjoy reading this litlle tutorial...see you soon. GREETS: Silvio Cesare for your tutorials about linux vx mandragore for having push some asm code in your linux infection tutorial Lord Julus I've learn a lot with you. With VXtazy wonderfull zine and your dead mailling list... are TKT zine ready? vxheavens webmaster for your web site, and to keep it alive... Gigabyte all vx coders want to see you, but next times don't claims to the world 'I'm a female and I write vx!!!' IKX members are you sleeping? everybody wait for a new xine! WAKE UP!!! past/present 29a menbers you have make me insomiac with your articles ;-) YOU for having reading this article... Bad VB coders OK, VB virus was fine because they were a new technic but now stop to write stupid VB virus, LEARN ASM !!! .-, ,-. \ LiTlLe VxW, March 2004 / '._________.-----._________.'