The Portable Executable (PE) format is the file format for executables, object code, DLLs, and other binaries used in Windows operating systems. A PE file is organized into several headers, each serving a specific purpose.
Understanding these headers is crucial for reverse engineering, debugging, and malware analysis. Each header provides specific information that helps determine how the executable will be loaded, what resources it requires, and how it interacts with the operating system.
DOS Header
This is the first part of any PE file and exists for backward compatibility with MS-DOS. It is followed by a small DOS program (the DOS stub) that usually prints “This program cannot be run in DOS mode.”
It has two key fields:
- e_magic – The signature “MZ” (the initials of Mark Zbikowski), indicating a DOS executable.
- e_lfanew – File offset of the PE header, stored at offset 0x3C.
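As a minimal sketch of reading these two fields, the snippet below uses Python's struct module. The helper name read_dos_header is made up for illustration, and it assumes the whole file has already been read into memory as bytes.

```python
import struct

def read_dos_header(data: bytes) -> dict:
    """Extract e_magic and e_lfanew from the start of a PE file (hypothetical helper)."""
    # The DOS header is 64 bytes long; e_lfanew is its last field.
    if len(data) < 64:
        raise ValueError("file is too small to contain a DOS header")
    e_magic = data[0:2]
    if e_magic != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew is a 4-byte little-endian value at offset 0x3C.
    # winnt.h declares it as a signed LONG; it is read as unsigned here
    # because a valid file offset is never negative.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    return {"e_magic": e_magic, "e_lfanew": e_lfanew}
```

Calling read_dos_header on the bytes of a file and looking at the "e_lfanew" entry gives the offset at which the PE signature is expected.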
PE Header
This header begins at the offset given by e_lfanew and marks the start of the PE-specific part of the file.
It has one key field:
- Signature – Identifies the file as a PE file; it is always the four bytes “PE\0\0”.
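The signature check can be sketched as follows. This is an illustrative helper (has_pe_signature is not a library function) that assumes the file contents are already in memory; it follows e_lfanew from the DOS header and compares the four bytes found there.

```python
import struct

PE_SIGNATURE = b"PE\x00\x00"

def has_pe_signature(data: bytes) -> bool:
    """Return True if data starts with MZ and e_lfanew points at the PE signature."""
    if len(data) < 64 or data[:2] != b"MZ":
        return False
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    # Slicing past the end of the buffer yields a short slice, so a bogus
    # offset simply fails the comparison instead of raising.
    return data[e_lfanew:e_lfanew + 4] == PE_SIGNATURE
```

A check like this only says that the file claims to be a PE file; it does not validate the headers that follow.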
COFF File Header
It contains information about the target machine and the characteristics of the PE file.
It has several key fields:
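- Machine – Specifies the target CPU architecture (e.g. 0x014C for x86, 0x8664 for x64).
- NumberOfSections – Number of sections in the file.
- TimeDateStamp – Time the file was created, in seconds since 1970-01-01 UTC. The value can be modified, so it should not be trusted on its own.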
- PointerToSymbolTable – Offset to the symbol table, usually zero.
- NumberOfSymbols – Number of symbols in the symbol table, usually zero.
- SizeOfOptionalHeader – Size of the optional header.
- Characteristics – Flags indicating attributes of the file (e.g. executable, DLL)
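The seven fields above occupy 20 bytes directly after the PE signature, so they can be unpacked in one call. The following is a sketch under that assumption; parse_coff_header and the short MACHINE_NAMES table are illustrative and intentionally incomplete.

```python
import struct
from datetime import datetime, timezone

COFF_HEADER = struct.Struct("<HHIIIHH")  # 20 bytes, little-endian

# A few well-known Machine values; this table is deliberately incomplete.
MACHINE_NAMES = {0x014C: "x86", 0x8664: "x64", 0xAA64: "ARM64"}

IMAGE_FILE_EXECUTABLE_IMAGE = 0x0002
IMAGE_FILE_DLL = 0x2000

def parse_coff_header(data: bytes, offset: int) -> dict:
    """Parse the COFF file header at offset (e_lfanew + 4 in a PE image)."""
    (machine, n_sections, timestamp, symbol_ptr, n_symbols,
     optional_size, characteristics) = COFF_HEADER.unpack_from(data, offset)
    return {
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "number_of_sections": n_sections,
        "timestamp": datetime.fromtimestamp(timestamp, tz=timezone.utc),
        "pointer_to_symbol_table": symbol_ptr,
        "number_of_symbols": n_symbols,
        "size_of_optional_header": optional_size,
        "is_executable": bool(characteristics & IMAGE_FILE_EXECUTABLE_IMAGE),
        "is_dll": bool(characteristics & IMAGE_FILE_DLL),
    }
```

Unknown Machine values are returned as hex strings rather than raising, since the table only covers a handful of architectures.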
Optional Header
Provides the details needed to load and execute the program. Despite its name, it is required for executable images; it is optional only for object files.
It has several key fields:
- Magic – Identifies the image type: 0x010B for PE32 (32-bit) or 0x020B for PE32+ (64-bit).
- MajorLinkerVersion/MinorLinkerVersion – Linker version used.
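- SizeOfCode – Size of the code section, or the sum of all code sections if there are several.
- SizeOfInitializedData – Size of the initialized data section, or the sum of all such sections if there are several.
- AddressOfEntryPoint – Relative Virtual Address (RVA) of the entry point.
- BaseOfCode/BaseOfData – RVAs of the start of the code and data sections. BaseOfData exists only in PE32, not in PE32+.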
- ImageBase – Preferred load address of the image in memory.
- SectionAlignment/FileAlignment – Alignment of sections in memory and on disk, respectively.
- SizeOfImage – Total size of the image in memory.
- SizeOfHeaders – Combined size of all headers.
- Subsystem – The subsystem required to run the executable (e.g. Windows GUI).
- DllCharacteristics – Flags such as ASLR (dynamic base) and DEP (NX compatible) support. Despite the name, they apply to EXEs as well as DLLs.
- SizeOfStackReserve/SizeOfStackCommit – Size of the stack to reserve and commit.
- SizeOfHeapReserve/SizeOfHeapCommit – Size of the heap to reserve and commit.
- NumberOfRvaAndSizes – Number of data directory entries that follow.
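Because PE32 and PE32+ lay out some of these fields differently, a parser has to branch on Magic. The sketch below reads only a handful of fields at their documented offsets; parse_optional_header is an illustrative name, and offset is assumed to be the start of the optional header (e_lfanew + 24).

```python
import struct

PE32_MAGIC = 0x010B
PE32_PLUS_MAGIC = 0x020B

def parse_optional_header(data: bytes, offset: int) -> dict:
    """Read selected optional-header fields starting at offset."""
    (magic,) = struct.unpack_from("<H", data, offset)
    if magic == PE32_MAGIC:
        # PE32: BaseOfData sits at offset 24, ImageBase is 4 bytes at 28.
        (image_base,) = struct.unpack_from("<I", data, offset + 28)
        count_offset = 92
    elif magic == PE32_PLUS_MAGIC:
        # PE32+: no BaseOfData, ImageBase is 8 bytes at offset 24.
        (image_base,) = struct.unpack_from("<Q", data, offset + 24)
        count_offset = 108
    else:
        raise ValueError(f"unexpected optional header magic {magic:#06x}")

    # The fields below sit at the same offsets in both variants.
    (entry_point,) = struct.unpack_from("<I", data, offset + 16)
    section_alignment, file_alignment = struct.unpack_from("<II", data, offset + 32)
    size_of_image, size_of_headers = struct.unpack_from("<II", data, offset + 56)
    subsystem, dll_characteristics = struct.unpack_from("<HH", data, offset + 68)
    (directory_count,) = struct.unpack_from("<I", data, offset + count_offset)
    return {
        "format": "PE32" if magic == PE32_MAGIC else "PE32+",
        "address_of_entry_point": entry_point,
        "image_base": image_base,
        "section_alignment": section_alignment,
        "file_alignment": file_alignment,
        "size_of_image": size_of_image,
        "size_of_headers": size_of_headers,
        "subsystem": subsystem,
        "dll_characteristics": dll_characteristics,
        "number_of_rva_and_sizes": directory_count,
        # The data directory array starts right after NumberOfRvaAndSizes.
        "data_directory_offset": offset + count_offset + 4,
    }
```

The stack and heap sizes are skipped here; they are 4 bytes each in PE32 and 8 bytes each in PE32+, which is why NumberOfRvaAndSizes ends up at a different offset in each variant.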
Data Directories
The data directories form an array at the end of the optional header. Each entry gives the address and size of a table or resource in the file, such as imports, exports, and resources.
The entries, in order, are:
- Export Table – Address and size of the export table.
- Import Table – Address and size of the import table.
- Resource Table – Address and size of the resource table.
- Exception Table – Address and size of the exception table.
- Certificate Table – File offset (not an RVA) and size of the attribute certificate table.
- Base Relocation Table – Address and size of the base relocation table.
- Debug Data – Address and size of debug data.
- Architecture Data – Reserved, should be zero.
- Global Pointer Register – Relative virtual address of the value to be stored in the global pointer register.
- TLS Table – Address and size of the thread-local storage table.
- Load Configuration Table – Address and size of the load configuration table.
- Bound Import Table – Address and size of the bound import table.
- Import Address Table – Address and size of the import address table.
- Delay Import Descriptor – Address and size of the delay import descriptor.
- CLR Runtime Header – Address and size of the CLR runtime header.
- Reserved – Reserved for future use.
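Each entry in this array is 8 bytes: a 4-byte address followed by a 4-byte size. A minimal sketch of walking the array is shown below; parse_data_directories is an illustrative helper, and offset is assumed to point at the first entry (immediately after NumberOfRvaAndSizes).

```python
import struct

# Entry names in the order listed above.
DIRECTORY_NAMES = [
    "Export Table", "Import Table", "Resource Table", "Exception Table",
    "Certificate Table", "Base Relocation Table", "Debug Data",
    "Architecture Data", "Global Pointer Register", "TLS Table",
    "Load Configuration Table", "Bound Import Table", "Import Address Table",
    "Delay Import Descriptor", "CLR Runtime Header", "Reserved",
]

def parse_data_directories(data: bytes, offset: int, count: int) -> dict:
    """Map directory names to (address, size) pairs."""
    directories = {}
    # NumberOfRvaAndSizes may be smaller than 16, so honour the count.
    for index in range(min(count, len(DIRECTORY_NAMES))):
        address, size = struct.unpack_from("<II", data, offset + 8 * index)
        # Note: for the Certificate Table the address is a file offset,
        # not an RVA.
        directories[DIRECTORY_NAMES[index]] = (address, size)
    return directories
```

An entry whose address and size are both zero means the corresponding table is absent from the file.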
Section Headers
The section headers describe each section in the file, including code, data, and other information.
Each header has several key fields:
- Name – Name of the section (up to 8 bytes, null-padded). Common sections include:
  - .text – Contains the executable code, that is, the compiled instructions the CPU executes. Typically marked read and execute, but not writable.
  - .rdata – Contains read-only data, such as constants and string literals, and often the import and export tables. Marked read-only to protect the integrity of the data.
  - .data – Contains initialized read-write data: global and static variables with an explicit initial value (e.g. int a = 5;) that can be modified at runtime. Marked read and write.
  - .bss (Block Started by Symbol) – Contains uninitialized data: global and static variables that are declared but not assigned a value, e.g. ‘int b;’. It occupies no space on disk and is zero-filled when the program is loaded. Some linkers fold this data into .data rather than emitting a separate .bss section.
  - .rsrc (Resource) – Contains the program's resources, such as icons, menus, dialogs, and other UI elements, which are accessed through the Windows API. Marked read-only to prevent modification during execution.
- VirtualSize – Total size of the section in memory.
- VirtualAddress – RVA of the section.
- SizeOfRawData – Size of the section on disk.
- PointerToRawData – File offset to the section’s data.
- PointerToRelocations – File offset to the relocations for this section.
- PointerToLinenumbers – File offset to the line numbers for this section.
- NumberOfRelocations – Number of relocations for this section.
- NumberOfLinenumbers – Number of line numbers for this section.
- Characteristics – Flags indicating the characteristics of the section (e.g. executable, writable).
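Each section header is 40 bytes, and the section table starts right after the optional header (e_lfanew + 24 + SizeOfOptionalHeader). The sketch below unpacks the headers and shows how VirtualAddress and PointerToRawData are used to translate an RVA into a file offset. Both function names are illustrative, not part of any library.

```python
import struct

SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")  # 40 bytes per section

IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000

def parse_section_headers(data: bytes, offset: int, count: int) -> list:
    """Parse count section headers starting at offset."""
    sections = []
    for index in range(count):
        (name, virtual_size, virtual_address, raw_size, raw_pointer,
         _reloc_ptr, _line_ptr, _n_relocs, _n_lines, flags) = \
            SECTION_HEADER.unpack_from(data, offset + index * SECTION_HEADER.size)
        permissions = (
            ("r" if flags & IMAGE_SCN_MEM_READ else "-")
            + ("w" if flags & IMAGE_SCN_MEM_WRITE else "-")
            + ("x" if flags & IMAGE_SCN_MEM_EXECUTE else "-")
        )
        sections.append({
            # Names are null-padded to 8 bytes.
            "name": name.rstrip(b"\x00").decode("ascii", errors="replace"),
            "virtual_size": virtual_size,
            "virtual_address": virtual_address,
            "size_of_raw_data": raw_size,
            "pointer_to_raw_data": raw_pointer,
            "characteristics": flags,
            "permissions": permissions,
        })
    return sections

def rva_to_offset(sections: list, rva: int):
    """Translate an RVA to a file offset, or None if it has no bytes on disk."""
    for section in sections:
        start = section["virtual_address"]
        # Only the first SizeOfRawData bytes of a section exist in the file.
        if start <= rva < start + section["size_of_raw_data"]:
            return section["pointer_to_raw_data"] + (rva - start)
    return None
```

The same translation is what makes the addresses in the data directories usable when working on the file on disk rather than on a loaded image.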
Packing
Packing compresses or encrypts an executable to reduce its size or obfuscate its contents. Common packers and protectors include UPX, ASPack, and Themida.
Identify Packed Executables
Identifying a packed executable involves checking for signs that indicate the file has been compressed or encrypted.
- File size – Packed executables are typically smaller than their unpacked counterparts due to compression.
- File headers – Tools like PEiD or Detect It Easy (DIE) can identify common packers and display the type of packing used.
- Entropy Analysis – High entropy (randomness), close to the maximum of 8 bits per byte, suggests compression or encryption. Tools like Binwalk or PEiD can measure the entropy of a file.
- Suspicious Sections – Packed executables often have sections with unusual names (e.g. UPX0, UPX1) or high entropy. Tools like PE Explorer or CFF Explorer can help inspect the sections of a PE file.
- Static Analysis – Look for signs like a small .text section or code that jumps to a decompression routine.
- Runtime Behavior – Run the executable in a controlled environment and use a debugger to see if the program starts with unpacking code before executing the main logic.
- Others – Execute permission on multiple sections, a significant difference between SizeOfRawData and VirtualSize (exposed as Misc_VirtualSize by the pefile library) in some sections, and very few imported functions.
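Several of these checks are easy to automate. The sketch below computes Shannon entropy in bits per byte and applies three of the heuristics above to a list of sections. The section dictionaries, the function names, and the thresholds (7.0 bits per byte and a 4x size ratio) are assumptions chosen for illustration, not established cutoffs.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(data).values())

def packing_indicators(sections, entropy_threshold=7.0, size_ratio=4.0):
    """List heuristic warnings for sections given as dicts.

    Each dict is assumed to have: name, data (the raw section bytes),
    virtual_size, executable (bool) and writable (bool).
    The thresholds are illustrative defaults.
    """
    findings = []
    for section in sections:
        raw_size = len(section["data"])
        if shannon_entropy(section["data"]) > entropy_threshold:
            findings.append(f"{section['name']}: high entropy")
        if section["executable"] and section["writable"]:
            findings.append(f"{section['name']}: writable and executable")
        # A section that is far larger in memory than on disk leaves room
        # for code to be unpacked into it at runtime.
        if section["virtual_size"] > size_ratio * max(raw_size, 1):
            findings.append(f"{section['name']}: virtual size much larger than raw size")
    return findings
```

None of these signals is conclusive on its own: legitimate installers and programs with compressed resources can also trigger them, so the findings are best treated as prompts for closer inspection.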
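FLARE-VM and REMnux are two analysis environments that bundle the tools needed for PE and malware analysis: FLARE-VM is a collection of tools installed on a Windows virtual machine, and REMnux is a Linux distribution.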