This page is an example of manually crafting Windows PE files with form@fix.
To try it right away, save this page to disk and feed it to the form@fix interpreter. The result is a freshly made .exe for Win95, wHeaveno.exe.
MSDOS is enough for running form@fix. However, the produced PE, wHeaveno.exe, is a Windows95 program. Without Win95 or later, only its MSDOS stub would run, not the Windows code.
That is fine, too. For example, a web server may generate .exe or applet files this way for multiple target configurations, and for a variety of purposes.
For ease of programming, let us first define a few mnemonics, constants, etc.
The resulting .exe is only 0x400 bytes. Therefore, we may place these definitions at a later offset, 0x5000, without any of that area getting written into the resulting .exe.
\a1 0x5000 \\arrive @ 0x5000, & fill with zeroes
\r f0 \* 1
f0 is only a fancier name for filling with zeroes at byte-width, e.g. \a+f0 20 fills with 20 zeroes. That is, the "1" is the fill-width, not the filler-value. The filler is zero, if not explicitly specified.
Although this page does not need all of these, here is a bootstrap, to have a raw remz. The problem, in general, is the flexibility of form@fix. The default relative-offset is resettable. In that case, even an explicit numeric value would mean something different. That is very well when we refer to a pointer, as that pointer would mean at run-time, but we must ensure that we override that default relative-offset in other cases.
//bootstrap int - from fixDef.htm
// to avoid \* variability
// if/after platform was redefined
\r Zero \* Zero Zero \-
\r One \* Zero One \-
\r Two \* Zero Two \-
\r Three \* Zero Three \-
\r Four \* Zero Four \-
\r! =
\\r+ is 0
\a+f0 4
\\width is 1
\@1 One \@1 Zero
\r! =2
\\r+ is 0
\a+f0 4
\\width is 2
\@1 Two \@1 Zero
\r! =4
\\r+ is 0
\a+f0 4
\\width is 4
\@1 Four \@1 Zero
\r! =RVA
\\r+ is 0
\*=4 0xe00
\\width is 4
\@1 Four \@1 Zero
\r! WX
\\for a VA (RVA+ImageBase) of Win
\\The default r+ for a WinExe is this.
\\However, this page is for a format-study.
\\Therefore, every r+ is explicit.
\\r+ is the ImageBase
\*=4 0x400e00
\\width is 4
\@1 Four \@1 Zero
\r char \*=4 1 \\char-width = 1-byte
\r int \*=4 4 \\int-width = 4 bytes (32-bit)
\r NULL \*=4 0
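The constants 0xe00 and 0x400e00 in the bootstrap can be read as the distance from a file offset to an RVA, and to a full virtual address. Here is a minimal Python sketch of that arithmetic, assuming the rdcode section sits at file offset 0x200 and RVA 0x1000, with ImageBase 0x400000 (the values written in the headers later on this page). The function names are hypothetical; they are not part of form@fix.

```python
# Values copied from the headers on this page (assumed, not read from a file).
IMAGE_BASE = 0x400000
SECTION_FILE_OFFSET = 0x200   # RDCode_PtrToRawData
SECTION_RVA = 0x1000          # RDCode_VirtualAddress


def file_offset_to_rva(offset):
    """Map a file offset inside the rdcode section to its RVA."""
    return offset - SECTION_FILE_OFFSET + SECTION_RVA


def file_offset_to_va(offset):
    """Map a file offset inside the rdcode section to its virtual address."""
    return IMAGE_BASE + file_offset_to_rva(offset)


print(hex(file_offset_to_rva(0)))  # 0xe00, the relative offset of =RVA
print(hex(file_offset_to_va(0)))   # 0x400e00, the relative offset of WX
```

Adding a fixed delta works here only because everything that needs an RVA lives in the single file-backed section.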
Here are a few assembly mnemonics for the Intel 80x86.
//for intel 8086 and higher - from fix8086.htm
\r push_eAX \*= 1 0x50
\r [@]=eAX \*= 1 0xa3 \\mov [xyz],eAX
\r eAX=[@] \*= 1 0xa1 \\mov eAX,[xyz]
\r eSP+=b \*= 2 0x83 0xc4 \\add eSP,byteval
\r eSP+=i \*= 2 0x81 0xc4 \\add eSP,intval
\r [f]() \*= 3 0x2E 0xFF 0x15 \\call cs:[f]
\r ret \*= 1 0xc3
//for intel 80386 and higher - from fix80386.htm
\r push_b \*= 1 0x6a \\push byte >= i386
\r push_i \*= 1 0x68 \\push int >= i386
\r! for_ease
With these in place, I start the presentation with the PE file header. Although every PortableExecutable file is, in fact, preceded by an MSDOS stub, I chose to start with the PE part and send the DOS stub to Appendix D. That stub is rarely expected to do anything other than tell the user that Windows is needed to run the application, and quit. That is, it is not meant to run on MSDOS.
In the listings that follow, not every field is identified with a remz. In many cases, there is only a commentary. Having a remz, with \r, lets us refer to that offset later, as a variable or constant. It indicates that the variable/setting that follows, at that offset, is meaningfully modifiable, based on what the program content/configuration is. The other "variables" are rarely modified, if at all.
Two representative cases of label-omission are the Date (TimeStamp) and the SizeOfOptionalHeader fields in the PE file header. The former does not matter. It is for your own reference, if you care. The latter, at least for an exe file, is always 224. Hence, even the name "optional" is out of context.
As a result, I omitted labels for them, to draw attention to what is, indeed, a variable.
\a 0x80
\\PE_signature
\'PE' \a+f0 2
\\Intel i386
\*=2 0x14c
\r NumberOfSections
\*=2 2 \\rdata, code, & runtime
\\Date=Mar 14,2006 :-)
\*=4 0x03142006
\r PointerToSymbolTable
\*=4 0 \\NULL Not 4 .exe
\r NumberOfSymbols
\*=4 0 \\NULL Not 4 .exe
\\SizeOfOptionalHeader.
\*=2 224
\r ExeCharacteristics
\*=2 0x30f
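For readers who want to cross-check the byte layout outside form@fix, the same header can be packed with Python's struct module. This is only a sketch that mirrors the values in the listing above; the variable name is an assumption of this example.

```python
import struct

# PE signature followed by the 20-byte COFF file header, little-endian.
pe_file_header = struct.pack(
    "<4sHHIIIHH",
    b"PE\0\0",     # signature: 'PE' plus two zero bytes
    0x14C,         # Machine: Intel i386
    2,             # NumberOfSections
    0x03142006,    # TimeDateStamp, as written in the listing
    0,             # PointerToSymbolTable
    0,             # NumberOfSymbols
    224,           # SizeOfOptionalHeader
    0x30F,         # Characteristics, as written in the listing
)

print(len(pe_file_header))  # 24
```

Starting at file offset 0x80, these 24 bytes end at 0x98, which is where the optional header begins.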
The next listing is informative only: a few flag definitions (from Microsoft documentation).
\\ IMAGE_FILE_EXECUTABLE_IMAGE 0x0002
\\ Image only. Indicates that the image file is
\\ valid and can be run. If this flag is not set,
\\ it generally indicates a linker error.
\\ IMAGE_FILE_LINE_NUMS_STRIPPED 0x0004
\\ COFF line numbers have been removed.
\\ IMAGE_FILE_LOCAL_SYMS_STRIPPED 0x0008
\\ COFF symbol table entries for local symbols
\\ have been removed.
\\ IMAGE_FILE_BYTES_REVERSED_LO 0x0080
\\ Little endian: LSB precedes MSB in memory.
\\ IMAGE_FILE_32BIT_MACHINE 0x0100
\\ Machine based on 32-bit-word architecture.
\\ IMAGE_FILE_DEBUG_STRIPPED 0x0200
\\ Debugging information removed from image file.
\\The DLL'ness is not true for the current one.
\\The next flag is handy here for info & switching.
\\ IMAGE_FILE_DLL 0x2000
\\ The image file is a dynamic-link library (DLL).
\\ Such files are considered executable files for
\\ almost all purposes, although they cannot be
\\ directly run.
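To see which of these flags a given ExeCharacteristics value contains, a small decoder helps. The following Python sketch knows only the flags listed above; any other bit is returned as a leftover instead of being guessed at. The function name is hypothetical.

```python
# Flag values as listed above (from Microsoft documentation).
FLAGS = {
    0x0002: "IMAGE_FILE_EXECUTABLE_IMAGE",
    0x0004: "IMAGE_FILE_LINE_NUMS_STRIPPED",
    0x0008: "IMAGE_FILE_LOCAL_SYMS_STRIPPED",
    0x0080: "IMAGE_FILE_BYTES_REVERSED_LO",
    0x0100: "IMAGE_FILE_32BIT_MACHINE",
    0x0200: "IMAGE_FILE_DEBUG_STRIPPED",
    0x2000: "IMAGE_FILE_DLL",
}


def decode_characteristics(value):
    """Return (names of the listed flags that are set, leftover bits)."""
    names = [name for bit, name in sorted(FLAGS.items()) if value & bit]
    leftover = value
    for bit in FLAGS:
        leftover &= ~bit
    return names, leftover


names, leftover = decode_characteristics(0x30F)
print(names)
print(hex(leftover))  # 0x1
```

For the value 0x30f written in the PE file header listing, the leftover is bit 0x0001, which Microsoft documents as IMAGE_FILE_RELOCS_STRIPPED; it is not among the flags listed above.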
Here, for Windows95-exe crafting,
\\ PE32 "magic" constant
\*=2 0x10b
\\ MajorLinkerVersion. FYI
\*= 5
\\ MinorLinkerVersion. FYI
\*= 5
\r SizeOfCode
\*=4 0x100
\r SizeOfInit
\*=4 0x100
\r SizeOfUnInit \\.extra
\*=4 0x1000
\r EntryPointRVA \\ .exe start @
\*=4 0x2000
\r CodeOffsetRVA \\ code section
\*=4 0x1000
\r DataOffsetRVA \\ initialized-data
\*=4 0x1000
\\ ImageBase, for Win95 .exe
\*=4 0x400000
\r SectionAlignment \\at multiples of
\*=4 0x1000
\r FileAlignment \\at multiples of
\*=4 0x200
\\ MajorNo of Required Operating System
\*=2 4
\\ MinorNo of Required Operating System
\*=2 0
\\ MajorImageVersion: version 0.0
\*=2 0 \\ 5
\\ MinorImageVersion: version 0.0
\*=2 0
\\ MajorSubsystemVersion: (Win32 4.0)
\*=2 4
\\ MinorSubsystemVersion: (Win32 4.0)
\*=2 0
\\ Reserved
\*=4 0
\r SizeOfImage \\memory-aligned
\*=4 0x3000
\r SizeOfHeaders \\file-aligned/rounded
\*=4 0x200
\\ CheckSum (for drivers)
\*=4 0
\r WinSubsystem \\for CUI(console) ==
\*=2 3
\r DllCharacter \\ Not 4 .exe
\*=2 0
\r StackReserveSize
\*=4 0x100000
\\ SizeOfStackCommit (at start)
\*=4 0x1000
\\ SizeOfHeapReserve
\*=4 0x100000
\\ SizeOfHeapCommit (at start)
\*=4 0x1000
\\ LoaderFlags: Obsolete.
\*=4 0
\\NumberOfRvaAndSizes. Always 16.
\*=4 16
\\pad 16*8 zeroes
\r! z \a+f0 128 \a z
Address (RVA) Size
------------- --------
\r ExportDir \a+ 8
\r ImportDir \a+ 8
\r ResourceDir \a+ 8
\r ExceptionDir \a+ 8
\r SecurityDir \a+ 8
\r BaseRelocDir \a+ 8
\r DebugDir \a+ 8
\r CopyrightDir \a+ 8
\r GlobalPtrDir \a+ 8
\r TLSDir \a+ 8
\r LoadConfigDir \a+ 8
\r BoundImportDir \a+ 8
\r IATDir \a+ 8
\\Three unused directories
\a+ 24
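The 224 written into SizeOfOptionalHeader can be recomputed from the field widths of the two listings above. A minimal Python check follows, with the fields grouped into struct format strings. The grouping and the names are choices of this sketch, not of any official header.

```python
import struct

STANDARD_FIELDS = "<HBB6I"     # magic, linker major/minor, SizeOfCode .. DataOffsetRVA
NT_FIELDS = "<3I6H4I2H6I"      # ImageBase .. NumberOfRvaAndSizes
DIRECTORY_ENTRY = "<II"        # Address (RVA), Size

size_of_optional_header = (
    struct.calcsize(STANDARD_FIELDS)
    + struct.calcsize(NT_FIELDS)
    + 16 * struct.calcsize(DIRECTORY_ENTRY)
)

print(size_of_optional_header)  # 224
```

That is 28 + 68 bytes of fixed fields, plus 16 directory entries of 8 bytes each.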
Notice that, for each section,
The virtual address of a section is (SectionOffsetInFile/FileAlignment)*MemAlignment. Here, FileAlignment is 0x200 and MemAlignment (SectionAlignment) is 0x1000. Therefore, the rdcode section, at Ptr2RawData==0x200, has VirtualAddress==0x1000.
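The alignment arithmetic can be sketched in a few lines of Python, assuming the FileAlignment (0x200) and SectionAlignment (0x1000) values written in the optional header listing. The helper names are hypothetical. The simple offset formula works here because the headers and the file-backed section each fit within one alignment unit.

```python
FILE_ALIGNMENT = 0x200
SECTION_ALIGNMENT = 0x1000


def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def virtual_address(file_offset):
    """The formula of this page: (offset / FileAlignment) * MemAlignment."""
    return file_offset // FILE_ALIGNMENT * SECTION_ALIGNMENT


# Headers (0x200 in file), rdcode (0x200) and @RAM (0x1000), each
# rounded up to whole memory pages, add up to SizeOfImage.
size_of_image = sum(align_up(n, SECTION_ALIGNMENT) for n in (0x200, 0x200, 0x1000))

print(hex(virtual_address(0x200)))  # 0x1000
print(hex(size_of_image))           # 0x3000
```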
\\"rdcode" section: ReadableData&Code
\r RDCode_Name \'rdcode' \a+1 2 \\6+2=8
\r RDCode_VirtualSize \*=4 0x200 \\Unused?
\r RDCode_VirtualAddress \*=4 0x1000
\r RDCode_SizeOfRawData \*=4 0x200
\r RDCode_PtrToRawData \*=4 0x200
\r RDCode_PtrToRelocs \*=4 0
\\PtrToLineNums, for .OBJ
\*=4 0
\r RDCode_NumberOfRelocs \*=2 0
\\NumOfLineNums, for .OBJ
\*=2 0
\r RDCode_Flags \*=4 0x60000040
\\ 0x00000040 container for initialized data
\\ 0x20000000 can be executed as code
\\ 0x40000000 readable
\\"@RAM" section: RAM/Run-time-only data
\\Does not exist in file. FlusZero at load time.
\r RAM_Name \'@RAM' \*=4 0
\r RAM_VirtualSize \*=4 0x1000
\r RAM_VirtualAddress \*=4 0x2000
\r RAM_SizeOfRawData \*=4 0x200 \\WinXP joke?!? (Read the next note.)
\r RAM_PtrToRawData \*=4 0
\r RAM_PtrToRelocs \*=4 0
\\PtrToLineNums, for .OBJ
\*=4 0
\r RAM_NumberOfRelocs \*=2 0
\\NumOfLineNums, for .OBJ
\*=2 0
\r RAM_Characteristics \*=4 0xc0000080
\\ 0x00000080 container for uninitialized (all-zeroes) data
\\ 0x40000000 readable
\\ 0x80000000 writeable
WinXP does not accept RAM_SizeOfRawData==0x1000. It informed me that the file was "not a valid Win32" application, although Win95 was happily running it. For a file-based section, the file-alignment size is fitting. There is no need for it here, though, because the @RAM section exists only in RAM and is not recorded in the file.
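Each of the two section-table entries above is 40 bytes. As a cross-check, here is a Python sketch that packs both with the values from the listings, and verifies that the headers still fit below SizeOfHeaders (0x200). The helper name section_header is an assumption of this example.

```python
import struct


def section_header(name, vsize, vaddr, raw_size, raw_ptr, flags):
    """Pack one 40-byte section-table entry."""
    return struct.pack(
        "<8sIIIIIIHHI",
        name,          # padded with zero bytes to 8
        vsize, vaddr, raw_size, raw_ptr,
        0, 0,          # pointers to relocations and to line numbers
        0, 0,          # counts of relocations and of line numbers
        flags,
    )


rdcode = section_header(b"rdcode", 0x200, 0x1000, 0x200, 0x200, 0x60000040)
ram = section_header(b"@RAM", 0x1000, 0x2000, 0x200, 0, 0xC0000080)

# PE header at 0x80, 24-byte file header, 224-byte optional header.
end_of_headers = 0x80 + 24 + 224 + len(rdcode) + len(ram)
print(hex(end_of_headers))  # 0x1c8
```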
In your code, you refer to each callee through the address of its ptr-to-FuncNameRVA slot within FirstThunk.
Each entry is padded to the next 16-bit (2-byte) boundary.
\r ?GetStdHandle \*=2 0x18a\'GetStdHandle\0' \a/ 2
\r ?WriteConsoleA\*=2 0x2d8\'WriteConsoleA\0'\a/ 2
\r ?GetLastError \*=2 0x153\'GetLastError\0' \a/ 2
\r ?wsprintfA \*=2 0x240\'wsprintfA\0' \a/ 2
\r ?MessageBoxA \*=2 0x182\'MessageBoxA\0' \a/ 2
Pad each .dll name to the next 32-bit (4 bytes) boundary.
\r azK32 \'kernel32.dll\0' \a/ 4
\r azU32 \'user32.dll\0' \a/ 4
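The two padding rules can be tried out in Python. This sketch pads by length, which matches aligning the file position only under the assumption that each entry starts at an aligned offset. The helper names are hypothetical.

```python
import struct


def pad_to(data, boundary):
    """Append zero bytes until len(data) is a multiple of boundary."""
    return data + b"\0" * (-len(data) % boundary)


def hint_name(hint, name):
    """2-byte hint, ASCII name, NUL terminator, padded to an even length."""
    return pad_to(struct.pack("<H", hint) + name.encode("ascii") + b"\0", 2)


def dll_name(name):
    """NUL-terminated DLL name, padded to a multiple of 4 bytes."""
    return pad_to(name.encode("ascii") + b"\0", 4)


print(len(hint_name(0x18A, "GetStdHandle")))   # 16: 15 bytes plus 1 pad byte
print(len(hint_name(0x2D8, "WriteConsoleA")))  # 16: already even
print(len(dll_name("kernel32.dll")))           # 16
print(len(dll_name("user32.dll")))             # 12
```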
\r! z0 \a IATDir \*=RVA z0 \a z0
\r IAT_kernel32 \\for kernel32.dll
\r GetStdHandle() \*=RVA ?GetStdHandle
\r WriteConsole() \*=RVA ?WriteConsoleA
\r GetLastError() \*=RVA ?GetLastError
\@int NULL
\r IAT_kernel32_end
\r IAT_user32 \\for user32.dll
\r wsprintf() \*=RVA ?wsprintfA
\r MessageBox() \*=RVA ?MessageBoxA
\@int NULL
\r IAT_user32_end
\r! z \* z0 \* z \-. \a IATDir \a+ int \@2 z \a z
\r ILT_kernel32 \\for kernel32.dll
\* IAT_kernel32 IAT_kernel32_end \-. \a ILT_kernel32
\@ILT_kernel32 IAT_kernel32
\r ILT_user32 \\for user32.dll
\* IAT_user32 IAT_user32_end \-. \a ILT_user32
\@ILT_user32 IAT_user32
\r! z0 \a ImportDir \*=RVA z0 \a z0
//Kernel32.dll
\r K32_ImportLookupTable \*=RVA ILT_kernel32
\r K32_TimeDateStamp \*=4 0 \\unbound
\r K32_ForwarderChain \*=4 0 \\no forwarders
\r K32_DLLName \*=RVA azK32
\r K32_IAT \*=RVA IAT_kernel32
//User32.dll
\r U32_ImportLookupTable \*=RVA ILT_user32
\r U32_TimeDateStamp \*=4 0 \\unbound
\r U32_ForwarderChain \*=4 0 \\no forwarders
\r U32_DLLName \*=RVA azU32
\r U32_IAT \*=RVA IAT_user32
//next, list-terminator (all-null) "import header"
\a+f0 20 \\20 bytes of all-zeroes
\r! z \* z0 \* z \-. \a ImportDir \a+ int \@2 z \a z
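The import directory above is two 20-byte entries followed by a 20-byte all-zero terminator. A Python sketch of that layout follows. The RVAs in it are hypothetical placeholders chosen only for illustration; form@fix computes the real ones from the labels.

```python
import struct


def import_descriptor(ilt_rva, name_rva, iat_rva):
    """Pack one 20-byte import directory entry (unbound, no forwarders)."""
    return struct.pack("<5I", ilt_rva, 0, 0, name_rva, iat_rva)


# Hypothetical RVAs, for illustration only.
kernel32 = import_descriptor(0x1080, 0x1060, 0x1040)
user32 = import_descriptor(0x1090, 0x1070, 0x1050)
terminator = bytes(20)  # the all-null "import header" that ends the list

import_directory = kernel32 + user32 + terminator
print(len(import_directory))  # 60
```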
\a/ 4 .\r! z \a DataOffset \*=RVA z \a z
\r azHeavenoWorld \'Heaveno, world!\0'
\r azPrintFormat \'\r\nI said "%s"\r\n\0'
\r azHaven4Heaven \'Ready? R-world.\0'
\r u32Heav# \*=4 15
\r u32Print# \*=4 29
Now we have sufficient data in place. Let's go on to give names to the pointers we will use to point at the in-memory objects just declared. That simplifies our references.
There is no need to waste file-space if a data-item does not start with a known initial-value, or if the value is zero. There is the uninitialized-data section for such data. Win95 does initialize it as all-zeroes, at program-load time.
These need not exist in the written PE. Therefore, let them reside next to the for_ease list.
\r! z \a for_ease
\r [azPrintBuf] \*WX 0x2000
\r [u32Result] \*WX 0x2100
\r [hStdout] \*WX 0x2104
\r! for_ease \a z
Starting a function at a 4-byte boundary is a well-known optimization. Therefore, even when not otherwise needed, DWORD alignment is wanted.
Let the PE-optional-header point here, as the program EntryPoint.
\r! z \a EntryPointRVA \*=RVA z \a z
The first piece of code is a legacy of Luevelsmeyer (the PE tutorial paper at wotsit.org's Windows section). Here is the form@fix version of it, with a line added for saving the returned stdout handle. As we are not exiting right away, we will use it later.
\_ push_b \@1 NULL
\_ push_i \@int [u32Result]
\_ push_b \@char u32Heav#
\_ push_i \*WX azHeavenoWorld
\_ push_b \*= 0xf5 \\stdout
\_ [f]() \*WX GetStdHandle()
\_ [@]=eAX \@int [hStdout]
\_ push_eAX
\_ [f]() \*WX WriteConsole()
The line "\_ [@]=eAX \@int [hStdout]" first expands the mnemonic "[@]=eAX" into its equivalent machine-code opcode bytes, then writes the 32-bit value that (at form@fix compile time) is stored at the label [hStdout].
At runtime, the 80386 (or Pentium or later) CPU will first decode the opcode, then will swallow the expected 32-bit number, which is the address of (i.e. pointer to) the variable hStdout that we had manually written at the label [hStdout].
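The bytes that result from such a line can be sketched in Python: one opcode byte, then the 32-bit address in little-endian order. The address used below is a hypothetical value chosen only for illustration, and the function names are assumptions of this sketch.

```python
import struct


def mov_mem_eax(address):
    """Encode 'mov [address], eAX': opcode 0xA3 plus a 32-bit address."""
    return b"\xa3" + struct.pack("<I", address)


def mov_eax_mem(address):
    """Encode 'mov eAX, [address]': opcode 0xA1 plus a 32-bit address."""
    return b"\xa1" + struct.pack("<I", address)


# Hypothetical address of the hStdout variable, for illustration only.
H_STDOUT = 0x402104

print(mov_mem_eax(H_STDOUT).hex())  # a304214000
print(mov_eax_mem(H_STDOUT).hex())  # a104214000
```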
Next, I employ the wsprintf() function of Windows, to demonstrate the difference between the C and Pascal function-call conventions. As opposed to most other Windows functions, wsprintf() uses the C convention. Therefore, the function does not remove the arg-list from the stack. As the program does not need the pushed variables either, I only reset the hardware stack pointer, eSP.
\_ push_i \*WX azHeavenoWorld
\_ push_i \*WX azPrintFormat
\_ push_i \@int [azPrintBuf]
\_ [f]() \*WX wsprintf()
\_ eSP+=b \*= 12 \\the width of the arg-list
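The 12 in the last line is simply three pushed arguments of 4 bytes each. A small Python sketch of the caller-side cleanup encoding, under the assumption that every argument is one 32-bit push; the function name is hypothetical.

```python
def add_esp_byte(arg_count, arg_width=4):
    """Encode 'add eSP, imm8' (0x83 0xC4 imm8) to drop pushed arguments."""
    total = arg_count * arg_width
    if not 0 <= total <= 127:
        # imm8 is sign-extended, so larger totals need the 0x81 0xC4 form.
        raise ValueError("imm8 form only covers 0..127 bytes")
    return bytes([0x83, 0xC4, total])


# wsprintf(azPrintBuf, azPrintFormat, azHeavenoWorld): three arguments.
print(add_esp_byte(3).hex())  # 83c40c
```

With the Pascal-style (callee-cleans) convention of functions such as WriteConsole(), no such instruction is needed after the call.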
As I have kept the [hStdout] handle in memory, I may write to it yet another time.
\_ push_b \@1 NULL
\_ push_i \@int [u32Result]
\_ push_b \@1 u32Print#
\_ push_i \@int [azPrintBuf]
\_ eAX=[@] \@int [hStdout]
\_ push_eAX
\_ [f]() \*WX WriteConsole()
The following code piece is the quickest way to have a window. There is no need to code with RegisterClass(), CreateWindow(), and ShowWindow(). It is really good for informing yourself, and for making any choice at critical points, while the program is running during testing.
\_ push_b \*= 0x24 \\MB_YESNO=4 MB_ICONQUESTION=0x20
\_ push_i \*WX azHaven4Heaven
\_ push_i \*WX azHeavenoWorld
\_ push_b \@1 NULL
\_ [f]() \*WX MessageBox()
Keep in mind to push in the reverse order. That little point was a problem with this function, when the first and the fourth arguments were reversed. The function showed no window on the monitor. No problem, either. That is, although it would not show a message box, the pushed parameters were cleaned up. Therefore, after the function-return, the program was not crashing with any protection-fault. The stack was fine.
Do not forget to put a ret, or a call to ExitProcess(), at the program-halt point of your code. If you regularly program in C or the like, the last closing curly brace in main() stops the action. At the machine (or assembly) code level, you have to be explicit, and, if using ret for exiting, you must have brought eSP back to the point where you had found it at program start.
The code section is the last section to be written into the .exe file. The ".extra" (@RAM) section that follows, in memory, is not written to the file. We pad this last section to the next multiple of the file-alignment size (0x200, as set in FileAlignment).
\af0 0x400 \r @End
Relocations, that is, load/run-time fixups of code, are not needed, thanks to virtual addresses. Pietrek (1994), in his discussion of the ImageBase setting, tells that the older way of setting ImageBase to 0x10000 leads to longer exe-load times.
So far as I know, Windows95 expects 0x400000 anyway. Virtual addressing lets multiple programs with that same base address run at the same time.
Hence, we dump the idea of preparing fix-up records, for the time-being. If some of us, you or I, find a good reason to go back to preparing those records, later versions of this paper may include a section for relocations/fixups, too. Just tell me a good reason for that.
The PortableExecutable quirks exist. Here is a list, probably not exhaustive.
For the ExeCharacteristics field, a mysterious point is how the IMAGE_FILE_BYTES_REVERSED_LO option would relate to IMAGE_FILE_BYTES_REVERSED_HI. Are they antagonistic, or would they co-operate? In Microsoft documentation, they are little-endian and big-endian, respectively. But Win95 QuickView refers to them as "Low bytes of machine word are reversed," and "High bytes of machine word are reversed," and the field in the (older) pedump.exe (published by MSDN of Microsoft) is 0x818e, i.e. with both 0x80 and 0x8000. What do they mean, really? And if I interpret the words of QuickView right, then what would reverse the lower and upper word, so that the full range of four bytes is listed little-endian (or, vice versa, big-endian)?
The ExeCharacteristics field, on this page, is 0x38e. In the NT-based tutorial at wotsit.org, it was 0x102 - probably, a minimal set.
The MS-DOS stub, in Appendix D, does not list any relocations. The field telling the count, and also the offset of the relocation-table, therefore, could acceptably be zero. Or, I thought so. When the offset is zero (that is, NULL), the QuickView of Windows95 insisted that the file is an MSDOS executable, not Windows. Only when I set the "address" of that non-existent table to 0x40 was it recognized as a Windows executable. The quirk was that the file was listed as an MSDOS executable, although it was running as a Windows program. That is, not the DOS stub, but the Windows code was what got executed. (pedump recognized it as a Windows .exe, in both cases.)
By the way, while I was inspecting the MSDOS-header of pedump.exe to learn why it was recognized as a Windows-exe while my program was not, I also noticed that the file-length fields (the two fields right after the 'MZ' signature) of pedump.exe do not reflect its real length. If pedump were not a Windows file, Win95 would not run it as an MSDOS program, either. Win95 does not run an MSDOS program when the file-length reported within that file does not reflect the real file-length. It acts as an exe-verifier, that is. In other words, the compiler which produced pedump.exe waived the requirement, in a gray area where neither Windows nor MSDOS refused to run the produced program. I do not know whether that quirk would have an advantage at all. I keep to the documented way, for the stub in Appendix D.
On this page, until here, we have crafted the Windows-executable portion of the PE. We need to write the MSDOS-stub, too. That is next, in Appendix D. After that, Appendix E is for writing the full PE, as a Windows .exe.
You, the reader, may modify the code on this page to craft your own code. It is a good starting point, I think.
A Win95 exe file has two major divisions: an MS-DOS stub at the start of the file, and the Portable Executable (PE) portion that comes next.
The DOS stub takes effect when/if you run the code without Windows.
For most Windows-based apps, because they do not really care to do anything substantial in a DOS setting, all the stub portion does is to warn the user that he/she has attempted to run a Windows-based executable at some platform that can only run DOS code.
This behavior is a legacy of Windows's DOS-based history.
Now, switching to coding. Here is a minimal DOS-stub header. It is 64 bytes. You can find it in the Windows header file "winnt.h" as the structure IMAGE_DOS_HEADER.
\a 0 \'MZ' \\ MSDOS signature at file start.
\*=2 512 \\ Bytes on last page of file
\*=2 16 \\ Pages in file (each 512 bytes)
\*=2 0 \\ Relocations
\*=2 4 \\ Size of header in paragraphs
\*=2 0 \\ Minimum-extra-needed /16
\*=2 0xffff \\ Maximum-extra-needed /16
\*=2 0 \\ Initial (relative) SS
\*=2 0x100 \\ Initial eSP
\*=2 0 \\ Checksum
\*=2 0 \\ Initial IP value
\*=2 0 \\ Initial (relative) CS
\*=2 0x40 \\ File address of relocation table
\*=2 0 \\ Overlay number
\a+f0 8 0 \\ Reserved words
\*=2 0 \\ OEM identifier (for e_oeminfo)
\*=2 0 \\ OEM information; e_oemid specific
\a+f0 20 0 \\ Reserved
\r PortExe \*=4 0x80\\ point at PE header
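As a cross-check of the 64-byte claim, the same header can be packed with Python's struct module. This sketch mirrors the values of the listing above, field by field; the variable name is an assumption of this example.

```python
import struct

dos_header = struct.pack(
    "<2s13H4H2H10HI",
    b"MZ",            # MSDOS signature
    512, 16,          # bytes on last page, pages in file
    0, 4,             # relocations, header size in paragraphs
    0, 0xFFFF,        # minimum and maximum extra paragraphs
    0, 0x100,         # initial SS, initial SP
    0, 0, 0,          # checksum, initial IP, initial CS
    0x40, 0,          # file address of relocation table, overlay number
    *([0] * 4),       # reserved words
    0, 0,             # OEM identifier, OEM information
    *([0] * 10),      # reserved
    0x80,             # file offset of the PE header
)

print(len(dos_header))                               # 64
print(hex(struct.unpack_from("<I", dos_header, 0x3C)[0]))  # 0x80
```

The last field sits at offset 0x3c, which is where a loader looks for the file offset of the PE header.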
The file-address-of-relocation-table is critical for Win95 to accept this executable as a Windows executable. Although the .exe is run as a Windows executable, the Win95 QuickView does list it as if it were an MS-DOS executable. That is especially strange, as here, and as in pedump.exe, the value 0x40 does not really point to a table at all. That is, it is not really needed information for the loader. A NULL (zero) value would suffice.
After the DOS header, comes the DOS-based application's code/data/etc.
\\@64: Let DS=CS
\*= 0xe 0x1f
//@66: Write azMess to stdout
// through DOS-int21h (AH=0x40)
\*= 0xb4 0x40 0xbb 1 0
\*= 0xb9 19 0 0xba 19 0 0xcd 0x21
\\@79: Quit to DOS w/ DOS-int21h(AH=0x4c).
\*= 0xb4 0x4c 0xcd 0x21
\\@83: azMessage (strlen=17+2=19)
\'This Ain\'t 4 DOS!\r\n\0'
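The offsets in the comments (@64, @66, @79, @83) and the two 19s in the code can be verified by counting bytes. Here is a Python sketch of that bookkeeping; the mnemonics in the comments are the standard 8086 readings of these opcodes.

```python
HEADER_SIZE = 4 * 16  # header size: 4 paragraphs of 16 bytes each

code = bytes([
    0x0E, 0x1F,        # push CS; pop DS
    0xB4, 0x40,        # mov AH, 0x40 (write to a file handle)
    0xBB, 1, 0,        # mov BX, 1 (stdout)
    0xB9, 19, 0,       # mov CX, 19 (byte count)
    0xBA, 19, 0,       # mov DX, 19 (offset of the message)
    0xCD, 0x21,        # int 0x21
    0xB4, 0x4C,        # mov AH, 0x4C (terminate)
    0xCD, 0x21,        # int 0x21
])
message = b"This Ain't 4 DOS!\r\n"

print(HEADER_SIZE + len(code))  # 83, the file offset of the message
print(len(code))                # 19, the value loaded into DX
print(len(message))             # 19, the value loaded into CX
```

DX is relative to the start of the loaded code, not to the start of the file, which is why it is 19 rather than 83.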
Let's write it out as an .exe file of its own. Then, it can be run on the Windows platform.
\\Ae 0 0 \\== the default Arz-extent
\a @End \F== wHeaveno.exe \A--
This page is not a full tutorial about everything. If this page is valuable, that is mainly for the ready code. After this quick introduction, it may not be so much of a tutorial in nature. If it helps, it is because the presentation is clear and the result works on Win95. You may find the references more useful as tutorials, or for WinNT, etc.
Microsoft PE&COFF documentation was at
When it was no longer there, I found it at archive.org - the internet log.
Pietrek, Matt (1994), "Peering Inside the PE: A Tour of the Win32 Portable Executable File Format," Microsoft Systems Journal, March 1994.
Luevelsmeyer (1999), "The PE file format," pe.txt in pe.zip at www.wotsit.org, Windows section. This came with a manually crafted exe. I appreciated that approach for its instructive value. The immediate feedback from manually byte-crafted code is valuable. In general, that is how we study to master a file-format. To empathize with a program is an instructive method, too, for example when teaching pointer/storage types, as was noted in a CUJ article in 1989 (or around 1990). Next, I translated that pe.txt from WinNT to Win95, runnable with MSDOS debug, and later with form@fix. I keep developing independently.
The article "The Portable Executable File Format from Top to Bottom" by Randy Kath was at http://www.microsoft.com/win32dev/base/pefile.htm
Findable, at archive.org, or elsewhere.
After having learned how to code software, you may need to authenticate/distribute your code (or receive/download code). The authentication concept of Microsoft is signature/encryption-based. But I do not think that is enough, or, for that matter, even useful at all. There exist, unfortunately, a lot of ways to steal secrets! How would a web-surfer or software-downloader know when/who/what is using a stolen "authentication"?
I favor a face-to-face authentication through people we already recognize and respect. If we can really ensure them, strictly verified IP addresses (of already respected people) may work, too. That is, a peer-to-peer authentication. Extremely affine for a RRRR, as that is already a relay-architecture. For information on software, the informaze is there, as already cozy for a RRRR. To sum up, I think R-world is really ready for a high level of defense, fending off viruses, etc.
The web-surfer case is reminiscent. Already, there are people who rate pages, e.g. to keep children away from evil.
To do that centrally, for example, as we already know,