:author: Joachim Bauch
:contact: mail@joachim-bauch.de
:copyright: `Creative Commons License (by-sa)`__

__ http://creativecommons.org/licenses/by-sa/2.5/

.. contents::

Overview
=========

The default Windows API functions for loading external libraries into a program
(LoadLibrary, LoadLibraryEx) only work with files on the filesystem. It is
therefore impossible to load a DLL from memory with them.

But sometimes you need exactly this functionality (e.g. you don't want to
distribute a lot of files or you want to make disassembling harder). A common
workaround for this problem is to write the DLL into a temporary file first and
import it from there. When the program terminates, the temporary file gets
deleted.

In this tutorial, I will first describe how DLL files are structured and then
present some code that can be used to load a DLL completely from memory,
without storing it on disk first.

Windows executables - the PE format
====================================

Most Windows binaries that can contain executable code (.exe, .dll, .sys)
share a common file format that consists of the following parts:

+----------------+
| DOS header     |
|                |
| DOS stub       |
+----------------+
| PE header      |
+----------------+
| Section header |
+----------------+
| Section 1      |
+----------------+
| Section 2      |
+----------------+
| . . .          |
+----------------+
| Section n      |
+----------------+

All structures given below can be found in the header file `winnt.h`.

DOS header / stub
------------------

The DOS header is only used for backwards compatibility. It precedes the DOS
stub that normally just displays an error message about the program not being
able to be run from DOS mode.

Microsoft defines the DOS header as follows::

    typedef struct _IMAGE_DOS_HEADER {      // DOS .EXE header
        WORD   e_magic;                     // Magic number
        WORD   e_cblp;                      // Bytes on last page of file
        WORD   e_cp;                        // Pages in file
        WORD   e_crlc;                      // Relocations
        WORD   e_cparhdr;                   // Size of header in paragraphs
        WORD   e_minalloc;                  // Minimum extra paragraphs needed
        WORD   e_maxalloc;                  // Maximum extra paragraphs needed
        WORD   e_ss;                        // Initial (relative) SS value
        WORD   e_sp;                        // Initial SP value
        WORD   e_csum;                      // Checksum
        WORD   e_ip;                        // Initial IP value
        WORD   e_cs;                        // Initial (relative) CS value
        WORD   e_lfarlc;                    // File address of relocation table
        WORD   e_ovno;                      // Overlay number
        WORD   e_res[4];                    // Reserved words
        WORD   e_oemid;                     // OEM identifier (for e_oeminfo)
        WORD   e_oeminfo;                   // OEM information; e_oemid specific
        WORD   e_res2[10];                  // Reserved words
        LONG   e_lfanew;                    // File address of new exe header
    } IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;

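The two fields a loader really needs from this header are `e_magic` and
`e_lfanew`. The following is a minimal sketch, not taken from MemoryModule, of
how they could be used to locate the PE header in a raw buffer. It avoids
`windows.h` so that it compiles anywhere; the function name `find_pe_header`
and the constant names are assumptions made for this example, while the
numeric values are those of `IMAGE_DOS_SIGNATURE` and `IMAGE_NT_SIGNATURE`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DOS_SIGNATURE   0x5A4Du      /* "MZ", the value of IMAGE_DOS_SIGNATURE */
#define NT_SIGNATURE    0x00004550u  /* "PE\0\0", the value of IMAGE_NT_SIGNATURE */
#define DOS_HEADER_SIZE 64u          /* sizeof(IMAGE_DOS_HEADER) */
#define E_LFANEW_OFFSET 60u          /* e_lfanew is the last field of the header */

/* Read little-endian values byte by byte, so alignment does not matter. */
static uint32_t read_le16(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8);
}

static uint32_t read_le32(const uint8_t *p)
{
    return read_le16(p) | (read_le16(p + 2) << 16);
}

/* Hypothetical helper: returns the file offset of the PE header, or -1 if
 * the buffer does not look like a PE file. */
static int64_t find_pe_header(const uint8_t *data, size_t size)
{
    assert(data != NULL);
    if (size < DOS_HEADER_SIZE || read_le16(data) != DOS_SIGNATURE)
        return -1;

    /* e_lfanew holds the file offset of the "new exe header". */
    uint32_t offset = read_le32(data + E_LFANEW_OFFSET);
    if (offset > size || size - offset < 4)
        return -1;

    if (read_le32(data + offset) != NT_SIGNATURE)
        return -1;
    return (int64_t)offset;
}
```

A loader would call this once on the in-memory copy of the DLL and reject the
buffer if the result is negative.
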
PE header
----------

The PE header contains information about the different sections inside the
executable that are used to store code and data or to define imports from other
libraries or exports this library provides.

It's defined as follows::

    typedef struct _IMAGE_NT_HEADERS {
        DWORD Signature;
        IMAGE_FILE_HEADER FileHeader;
        IMAGE_OPTIONAL_HEADER32 OptionalHeader;
    } IMAGE_NT_HEADERS32, *PIMAGE_NT_HEADERS32;

The `FileHeader` describes the *physical* format of the file, i.e. contents,
information about symbols, etc::

    typedef struct _IMAGE_FILE_HEADER {
        WORD    Machine;
        WORD    NumberOfSections;
        DWORD   TimeDateStamp;
        DWORD   PointerToSymbolTable;
        DWORD   NumberOfSymbols;
        WORD    SizeOfOptionalHeader;
        WORD    Characteristics;
    } IMAGE_FILE_HEADER, *PIMAGE_FILE_HEADER;

.. _OptionalHeader:

The `OptionalHeader` contains information about the *logical* format of the
library, including required OS version, memory requirements and entry points::

    typedef struct _IMAGE_OPTIONAL_HEADER {
        //
        // Standard fields.
        //

        WORD    Magic;
        BYTE    MajorLinkerVersion;
        BYTE    MinorLinkerVersion;
        DWORD   SizeOfCode;
        DWORD   SizeOfInitializedData;
        DWORD   SizeOfUninitializedData;
        DWORD   AddressOfEntryPoint;
        DWORD   BaseOfCode;
        DWORD   BaseOfData;

        //
        // NT additional fields.
        //

        DWORD   ImageBase;
        DWORD   SectionAlignment;
        DWORD   FileAlignment;
        WORD    MajorOperatingSystemVersion;
        WORD    MinorOperatingSystemVersion;
        WORD    MajorImageVersion;
        WORD    MinorImageVersion;
        WORD    MajorSubsystemVersion;
        WORD    MinorSubsystemVersion;
        DWORD   Win32VersionValue;
        DWORD   SizeOfImage;
        DWORD   SizeOfHeaders;
        DWORD   CheckSum;
        WORD    Subsystem;
        WORD    DllCharacteristics;
        DWORD   SizeOfStackReserve;
        DWORD   SizeOfStackCommit;
        DWORD   SizeOfHeapReserve;
        DWORD   SizeOfHeapCommit;
        DWORD   LoaderFlags;
        DWORD   NumberOfRvaAndSizes;
        IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
    } IMAGE_OPTIONAL_HEADER32, *PIMAGE_OPTIONAL_HEADER32;

.. _DataDirectory:

The `DataDirectory` contains 16 (`IMAGE_NUMBEROF_DIRECTORY_ENTRIES`) entries
defining the logical components of the library:

===== ==========================
Index Description
===== ==========================
0     Exported functions
----- --------------------------
1     Imported functions
----- --------------------------
2     Resources
----- --------------------------
3     Exception information
----- --------------------------
4     Security information
----- --------------------------
5     Base relocation table
----- --------------------------
6     Debug information
----- --------------------------
7     Architecture specific data
----- --------------------------
8     Global pointer
----- --------------------------
9     Thread local storage
----- --------------------------
10    Load configuration
----- --------------------------
11    Bound imports
----- --------------------------
12    Import address table
----- --------------------------
13    Delay load imports
----- --------------------------
14    COM runtime descriptor
===== ==========================

To load the DLL, we only need the entries describing the imports and the
base relocation table. In order to provide access to the exported functions,
the exports entry is required as well.

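As a rough illustration of how these entries can be looked up, here is a small
sketch that reads one directory entry from a 32-bit `OptionalHeader` held in a
byte buffer. The helper `get_directory`, the `DataDirectoryEntry` type and the
`DIR_*` names are assumptions made for this example; the type mirrors
`IMAGE_DATA_DIRECTORY` (an RVA followed by a size), and the offsets 92 and 96
follow from the structure layout shown above. The code assumes a little-endian
host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirrors IMAGE_DATA_DIRECTORY: RVA and size of one logical component. */
typedef struct {
    uint32_t VirtualAddress;
    uint32_t Size;
} DataDirectoryEntry;

/* The three indices the loader needs (see the table above). */
enum { DIR_EXPORT = 0, DIR_IMPORT = 1, DIR_BASERELOC = 5 };

#define OFFSET_NUMBER_OF_RVA_AND_SIZES 92u  /* within IMAGE_OPTIONAL_HEADER32 */
#define OFFSET_DATA_DIRECTORY          96u  /* within IMAGE_OPTIONAL_HEADER32 */
#define MAX_DIRECTORY_ENTRIES          16u  /* IMAGE_NUMBEROF_DIRECTORY_ENTRIES */

/* Hypothetical helper: `opt` points at the start of the 32-bit optional
 * header. Returns 1 and fills *out if the entry exists and is non-empty,
 * otherwise returns 0. */
static int get_directory(const uint8_t *opt, unsigned index,
                         DataDirectoryEntry *out)
{
    uint32_t count;

    assert(opt != NULL && out != NULL);
    memcpy(&count, opt + OFFSET_NUMBER_OF_RVA_AND_SIZES, sizeof count);
    if (index >= count || index >= MAX_DIRECTORY_ENTRIES)
        return 0;

    memcpy(out, opt + OFFSET_DATA_DIRECTORY + 8u * index, sizeof *out);
    return out->Size != 0;
}
```

A library without a base relocation table would make
`get_directory(opt, DIR_BASERELOC, &entry)` return 0, which matters later when
the image cannot be placed at its preferred address.
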
Section header
---------------

The section header is stored after the OptionalHeader_ structure in the PE
header. Microsoft provides the macro `IMAGE_FIRST_SECTION` to get the start
address based on the PE header.

The section header is actually a list of entries, one for each section in
the file::

    typedef struct _IMAGE_SECTION_HEADER {
        BYTE    Name[IMAGE_SIZEOF_SHORT_NAME];
        union {
            DWORD   PhysicalAddress;
            DWORD   VirtualSize;
        } Misc;
        DWORD   VirtualAddress;
        DWORD   SizeOfRawData;
        DWORD   PointerToRawData;
        DWORD   PointerToRelocations;
        DWORD   PointerToLinenumbers;
        WORD    NumberOfRelocations;
        WORD    NumberOfLinenumbers;
        DWORD   Characteristics;
    } IMAGE_SECTION_HEADER, *PIMAGE_SECTION_HEADER;

A section can contain code, data, relocation information, resources, export or
import definitions, etc.

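To make the position of the section headers concrete, the sketch below
computes what `IMAGE_FIRST_SECTION` yields, expressed as a byte offset from
the PE signature: the 4-byte signature, the 20-byte `FileHeader`, and then
`SizeOfOptionalHeader` bytes. The types `FileHeader` and `SectionHeader` are
stand-ins for the `winnt.h` structures, and `section_header_offset` is a
hypothetical helper for this example only. A little-endian host is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for IMAGE_FILE_HEADER (20 bytes, no padding). */
typedef struct {
    uint16_t Machine;
    uint16_t NumberOfSections;
    uint32_t TimeDateStamp;
    uint32_t PointerToSymbolTable;
    uint32_t NumberOfSymbols;
    uint16_t SizeOfOptionalHeader;
    uint16_t Characteristics;
} FileHeader;

/* Stand-in for IMAGE_SECTION_HEADER (40 bytes, no padding). */
typedef struct {
    uint8_t  Name[8];
    uint32_t VirtualSize;
    uint32_t VirtualAddress;
    uint32_t SizeOfRawData;
    uint32_t PointerToRawData;
    uint32_t PointerToRelocations;
    uint32_t PointerToLinenumbers;
    uint16_t NumberOfRelocations;
    uint16_t NumberOfLinenumbers;
    uint32_t Characteristics;
} SectionHeader;

/* Hypothetical helper: `pe` points at the PE signature. Returns the offset
 * of section header `index` relative to `pe`, or 0 if the index is out of
 * range. */
static size_t section_header_offset(const uint8_t *pe, uint16_t index)
{
    FileHeader fh;

    assert(pe != NULL);
    memcpy(&fh, pe + 4, sizeof fh);   /* FileHeader follows the signature */
    if (index >= fh.NumberOfSections)
        return 0;

    /* Skip signature, file header and the optional header, whose size is
     * taken from the file header rather than from sizeof(). */
    return 4 + sizeof(FileHeader) + fh.SizeOfOptionalHeader
             + (size_t)index * sizeof(SectionHeader);
}
```

Using `SizeOfOptionalHeader` instead of a fixed structure size is what keeps
the calculation valid when the optional header has a different length.
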
Loading the library
====================

To emulate the PE loader, we must first understand which steps are necessary
to load the file into memory and prepare the structures so they can be called
from other programs.

When issuing the API call `LoadLibrary`, Windows basically performs these tasks:

1. Open the given file and check the DOS and PE headers.

2. Try to allocate a memory block of `PEHeader.OptionalHeader.SizeOfImage` bytes
   at position `PEHeader.OptionalHeader.ImageBase`.

3. Parse section headers and copy sections to their addresses. The destination
   address for each section, relative to the base of the allocated memory block,
   is stored in the `VirtualAddress` attribute of the `IMAGE_SECTION_HEADER`
   structure.

4. If the allocated memory block differs from `ImageBase`, various references in
   the code and/or data sections must be adjusted. This is called *Base
   relocation*.

5. The required imports for the library must be resolved by loading the
   corresponding libraries.

6. The memory regions of the different sections must be protected depending on
   the section's characteristics. Some sections are marked as *discardable*
   and therefore can be safely freed at this point. These sections normally
   contain temporary data that is only needed during the import, like the
   information for the base relocation.

7. Now the library is loaded completely. It must be notified about this by
   calling the entry point using the flag `DLL_PROCESS_ATTACH`.

In the following paragraphs, each step is described.

Allocate memory
----------------

All memory required for the library must be reserved / allocated using
`VirtualAlloc`, as Windows provides functions to protect these memory blocks.
This is required to restrict access to the memory, like blocking write access
to the code or constant data.

The OptionalHeader_ structure defines the size of the required memory block
for the library. It must be reserved at the address specified by `ImageBase`
if possible::

    memory = VirtualAlloc((LPVOID)(PEHeader->OptionalHeader.ImageBase),
        PEHeader->OptionalHeader.SizeOfImage,
        MEM_RESERVE,
        PAGE_READWRITE);
    if (memory == NULL) {
        // Preferred address not available: let Windows pick any address.
        memory = VirtualAlloc(NULL,
            PEHeader->OptionalHeader.SizeOfImage,
            MEM_RESERVE,
            PAGE_READWRITE);
    }

If the reserved memory differs from the address given in `ImageBase`, base
relocation as described below must be done.

Copy sections
--------------

Once the memory has been reserved, the file contents can be copied to the
system. The section header must be evaluated in order to determine the
position in the file and the target area in memory.

Before copying the data, the memory block must be committed::

    dest = VirtualAlloc(baseAddress + section->VirtualAddress,
        section->SizeOfRawData,
        MEM_COMMIT,
        PAGE_READWRITE);

Sections without data in the file (like data sections for the used variables)
have a `SizeOfRawData` of `0`, so you can use the `SizeOfInitializedData`
or `SizeOfUninitializedData` of the OptionalHeader_ instead. Which one to
choose depends on the bit flags `IMAGE_SCN_CNT_INITIALIZED_DATA` and
`IMAGE_SCN_CNT_UNINITIALIZED_DATA` that may be set in the section's
characteristics.

Base relocation
----------------

All memory addresses in the code / data sections of a library are stored relative
to the address defined by `ImageBase` in the OptionalHeader_. If the library
can't be loaded at this memory address, the references must be adjusted
(*relocated*). The file format helps with this by storing information about
all these references in the base relocation table, which can be found in the
directory entry 5 of the DataDirectory_ in the OptionalHeader_.

This table consists of a series of blocks, each of which starts with this
structure::

    typedef struct _IMAGE_BASE_RELOCATION {
        DWORD   VirtualAddress;
        DWORD   SizeOfBlock;
    } IMAGE_BASE_RELOCATION;

The structure is followed by `(SizeOfBlock - IMAGE_SIZEOF_BASE_RELOCATION) / 2`
entries of 16 bits each. The upper 4 bits define the type of relocation, the
lower 12 bits define the offset relative to the `VirtualAddress`.

The only types that seem to be used in DLLs are

IMAGE_REL_BASED_ABSOLUTE
    No operation relocation. Used for padding.

IMAGE_REL_BASED_HIGHLOW
    Add the delta between the `ImageBase` and the allocated memory block to the
    32 bits found at the offset.

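The following sketch walks such a table and applies the two relocation types
described above. It is an illustration under stated assumptions rather than
the MemoryModule implementation: the function name `apply_relocations` and the
constant names are invented for this example (the values 0, 3 and 8 are those
of `IMAGE_REL_BASED_ABSOLUTE`, `IMAGE_REL_BASED_HIGHLOW` and
`IMAGE_SIZEOF_BASE_RELOCATION`), a little-endian host is assumed, a block with
`SizeOfBlock` of 0 is treated as the end of the table, and any other
relocation type is reported as an error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REL_BASED_ABSOLUTE 0u  /* value of IMAGE_REL_BASED_ABSOLUTE */
#define REL_BASED_HIGHLOW  3u  /* value of IMAGE_REL_BASED_HIGHLOW */
#define RELOC_HEADER_SIZE  8u  /* value of IMAGE_SIZEOF_BASE_RELOCATION */

/* Hypothetical helper. `image` is the mapped image, `reloc_rva` and
 * `reloc_size` come from directory entry 5, and `delta` is the difference
 * between the actual base address and ImageBase.
 * Returns the number of patched locations, or -1 on malformed input. */
static int apply_relocations(uint8_t *image, size_t image_size,
                             uint32_t reloc_rva, uint32_t reloc_size,
                             uint32_t delta)
{
    uint32_t pos = 0;
    int patched = 0;

    assert(image != NULL);
    if (reloc_rva > image_size || reloc_size > image_size - reloc_rva)
        return -1;

    while (reloc_size - pos >= RELOC_HEADER_SIZE) {
        const uint8_t *block = image + reloc_rva + pos;
        uint32_t page_rva, block_size;

        memcpy(&page_rva, block, 4);        /* VirtualAddress */
        memcpy(&block_size, block + 4, 4);  /* SizeOfBlock */
        if (block_size == 0)
            break;                          /* treated as end of table */
        if (block_size < RELOC_HEADER_SIZE || block_size > reloc_size - pos)
            return -1;

        uint32_t count = (block_size - RELOC_HEADER_SIZE) / 2;
        for (uint32_t i = 0; i < count; i++) {
            uint16_t entry;
            memcpy(&entry, block + RELOC_HEADER_SIZE + 2u * i, 2);

            uint32_t type = (uint32_t)entry >> 12;          /* upper 4 bits */
            uint32_t target = page_rva + (entry & 0x0FFFu); /* lower 12 bits */

            if (type == REL_BASED_ABSOLUTE)
                continue;                   /* padding, nothing to do */
            if (type != REL_BASED_HIGHLOW)
                return -1;                  /* not handled in this sketch */
            if (target > image_size || image_size - target < 4)
                return -1;

            uint32_t value;
            memcpy(&value, image + target, 4);
            value += delta;                 /* wraps modulo 2^32 by design */
            memcpy(image + target, &value, 4);
            patched++;
        }
        pos += block_size;
    }
    return patched;
}
```

Because `delta` is unsigned, the same addition works whether the library ended
up above or below its preferred `ImageBase`.
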
Resolve imports
----------------

The directory entry 1 of the DataDirectory_ in the OptionalHeader_ specifies
a list of libraries to import symbols from. Each entry in this list is defined
as follows::

    typedef struct _IMAGE_IMPORT_DESCRIPTOR {
        union {
            DWORD   Characteristics;            // 0 for terminating null import descriptor
            DWORD   OriginalFirstThunk;         // RVA to original unbound IAT (PIMAGE_THUNK_DATA)
        };
        DWORD   TimeDateStamp;                  // 0 if not bound,
                                                // -1 if bound, and real date\time stamp
                                                //     in IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT (new BIND)
                                                // O.W. date/time stamp of DLL bound to (Old BIND)

        DWORD   ForwarderChain;                 // -1 if no forwarders
        DWORD   Name;
        DWORD   FirstThunk;                     // RVA to IAT (if bound this IAT has actual addresses)
    } IMAGE_IMPORT_DESCRIPTOR;

The `Name` entry describes the offset to the NULL-terminated string of the library
name (e.g. `KERNEL32.DLL`). The `OriginalFirstThunk` entry points to a list
of references to the function names to import from the external library.
`FirstThunk` points to a list of addresses that gets filled with pointers to
the imported symbols.

When we resolve the imports, we walk both lists in parallel, import the function
defined by the name in the first list and store the pointer to the symbol in the
second list::

    nameRef = (DWORD *)(baseAddress + importDesc->OriginalFirstThunk);
    symbolRef = (DWORD *)(baseAddress + importDesc->FirstThunk);
    for (; *nameRef; nameRef++, symbolRef++)
    {
        // Each name reference is an RVA relative to the base of the image.
        PIMAGE_IMPORT_BY_NAME thunkData = (PIMAGE_IMPORT_BY_NAME)(baseAddress + *nameRef);
        // `handle` is the handle of the already loaded import library.
        *symbolRef = (DWORD)GetProcAddress(handle, (LPCSTR)&thunkData->Name);
        if (*symbolRef == 0)
        {
            handleImportError();
            return;
        }
    }

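The loop above only covers imports by name. In 32-bit PE files an entry of the
first list can also request an import by ordinal: if its highest bit
(`IMAGE_ORDINAL_FLAG32`) is set, the low 16 bits hold the ordinal number
instead of an RVA. The sketch below shows one way to tell the two cases apart.
The `ImportRef` type and `decode_thunk` function are assumptions made for this
example and are not part of the code shown in this tutorial.

```c
#include <assert.h>
#include <stdint.h>

#define ORDINAL_FLAG32 0x80000000u  /* value of IMAGE_ORDINAL_FLAG32 */

/* Hypothetical result type describing one entry of the name list. */
typedef struct {
    int      by_ordinal;  /* nonzero: import by ordinal number */
    uint16_t ordinal;     /* valid when by_ordinal is nonzero */
    uint32_t name_rva;    /* RVA of the name string otherwise */
} ImportRef;

/* Hypothetical helper: classify one non-zero entry of the list that
 * OriginalFirstThunk points to. */
static ImportRef decode_thunk(uint32_t thunk)
{
    ImportRef ref = {0, 0, 0};

    assert(thunk != 0);  /* a zero entry terminates the list */
    if (thunk & ORDINAL_FLAG32) {
        ref.by_ordinal = 1;
        ref.ordinal = (uint16_t)(thunk & 0xFFFFu);
    } else {
        /* The entry is the RVA of an IMAGE_IMPORT_BY_NAME structure:
         * a 2-byte Hint followed by the NULL-terminated name. */
        ref.name_rva = thunk + 2;
    }
    return ref;
}
```

For the ordinal case, `GetProcAddress` accepts the ordinal in the low-order
word of its second argument, with the high-order word set to zero.
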
Protect memory
---------------

Every section specifies permission flags in its `Characteristics` entry.
These flags can be one or a combination of

IMAGE_SCN_MEM_EXECUTE
    The section contains data that can be executed.

IMAGE_SCN_MEM_READ
    The section contains data that is readable.

IMAGE_SCN_MEM_WRITE
    The section contains data that is writable.

These flags must be mapped to the protection flags

- PAGE_NOACCESS
- PAGE_WRITECOPY
- PAGE_READONLY
- PAGE_READWRITE
- PAGE_EXECUTE
- PAGE_EXECUTE_WRITECOPY
- PAGE_EXECUTE_READ
- PAGE_EXECUTE_READWRITE

Now, the function `VirtualProtect` can be used to limit access to the memory.
If the program tries to access it in an unauthorized way, an exception gets
raised by Windows.

In addition to the section flags above, the following can be set:

IMAGE_SCN_MEM_DISCARDABLE
    The data in this section can be freed after the import. Usually this is
    specified for relocation data.

IMAGE_SCN_MEM_NOT_CACHED
    The data in this section must not get cached by Windows. Add the bit
    flag `PAGE_NOCACHE` to the protection flags above.

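One compact way to express this mapping is a lookup table indexed by the three
permission bits, in the same order as the list of protection flags above. The
sketch below is self-contained and does not include `windows.h`; the `SK_`
prefixed names are stand-ins invented for this example, with the numeric
values of the corresponding `IMAGE_SCN_MEM_*` and `PAGE_*` constants from
`winnt.h`. The function name `section_protection` is likewise an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Section characteristics (values of the IMAGE_SCN_MEM_* constants). */
#define SK_SCN_MEM_NOT_CACHED 0x04000000u
#define SK_SCN_MEM_EXECUTE    0x20000000u
#define SK_SCN_MEM_READ       0x40000000u
#define SK_SCN_MEM_WRITE      0x80000000u

/* Page protection flags (values of the PAGE_* constants). */
#define SK_PAGE_NOACCESS          0x01u
#define SK_PAGE_READONLY          0x02u
#define SK_PAGE_READWRITE         0x04u
#define SK_PAGE_WRITECOPY         0x08u
#define SK_PAGE_EXECUTE           0x10u
#define SK_PAGE_EXECUTE_READ      0x20u
#define SK_PAGE_EXECUTE_READWRITE 0x40u
#define SK_PAGE_EXECUTE_WRITECOPY 0x80u
#define SK_PAGE_NOCACHE           0x200u

/* Hypothetical helper: translate section characteristics into the value
 * that would be passed to VirtualProtect. */
static uint32_t section_protection(uint32_t characteristics)
{
    /* Indexed as [executable][readable][writable]. */
    static const uint32_t table[2][2][2] = {
        { { SK_PAGE_NOACCESS,     SK_PAGE_WRITECOPY },
          { SK_PAGE_READONLY,     SK_PAGE_READWRITE } },
        { { SK_PAGE_EXECUTE,      SK_PAGE_EXECUTE_WRITECOPY },
          { SK_PAGE_EXECUTE_READ, SK_PAGE_EXECUTE_READWRITE } }
    };
    int executable = (characteristics & SK_SCN_MEM_EXECUTE) != 0;
    int readable   = (characteristics & SK_SCN_MEM_READ) != 0;
    int writable   = (characteristics & SK_SCN_MEM_WRITE) != 0;

    uint32_t protect = table[executable][readable][writable];
    if (characteristics & SK_SCN_MEM_NOT_CACHED)
        protect |= SK_PAGE_NOCACHE;
    return protect;
}
```

A typical code section with the execute and read bits set would therefore end
up as `PAGE_EXECUTE_READ`, while a writable data section becomes
`PAGE_READWRITE`.
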
Notify library
---------------

The last thing to do is to call the DLL entry point (defined by
`AddressOfEntryPoint`), thereby notifying the library about being attached
to a process.

The function at the entry point is defined as
::

    typedef BOOL (WINAPI *DllEntryProc)(HINSTANCE hinstDLL, DWORD fdwReason, LPVOID lpReserved);

So the last code we need to execute is
::

    DllEntryProc entry = (DllEntryProc)(baseAddress + PEHeader->OptionalHeader.AddressOfEntryPoint);
    (*entry)((HINSTANCE)baseAddress, DLL_PROCESS_ATTACH, 0);

Afterwards we can use the exported functions as with any normal library.

Exported functions
===================

If you want to access the functions that are exported by the library, you need
to find the entry point of a symbol, given the name of the function to call.

The directory entry 0 of the DataDirectory_ in the OptionalHeader_ contains
information about the exported functions. It's defined as follows::

    typedef struct _IMAGE_EXPORT_DIRECTORY {
        DWORD   Characteristics;
        DWORD   TimeDateStamp;
        WORD    MajorVersion;
        WORD    MinorVersion;
        DWORD   Name;
        DWORD   Base;
        DWORD   NumberOfFunctions;
        DWORD   NumberOfNames;
        DWORD   AddressOfFunctions;     // RVA from base of image
        DWORD   AddressOfNames;         // RVA from base of image
        DWORD   AddressOfNameOrdinals;  // RVA from base of image
    } IMAGE_EXPORT_DIRECTORY, *PIMAGE_EXPORT_DIRECTORY;

The first thing to do is to map the name of the function to the ordinal number
of the exported symbol. To do so, walk the arrays defined by `AddressOfNames`
and `AddressOfNameOrdinals` in parallel until you find the required name.

Now you can use the ordinal number to read the address by evaluating the n-th
element of the `AddressOfFunctions` array.

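The lookup just described can be sketched as follows. The `ExportDirectory`
type is a stand-in for `IMAGE_EXPORT_DIRECTORY`, and `find_export_rva` is a
hypothetical helper written for this example, not the MemoryModule function.
For brevity the sketch assumes a little-endian host and a well-formed image:
apart from the index check it performs no bounds checking, which real code
handling untrusted input would have to add.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for IMAGE_EXPORT_DIRECTORY (40 bytes, no padding). */
typedef struct {
    uint32_t Characteristics;
    uint32_t TimeDateStamp;
    uint16_t MajorVersion;
    uint16_t MinorVersion;
    uint32_t Name;
    uint32_t Base;
    uint32_t NumberOfFunctions;
    uint32_t NumberOfNames;
    uint32_t AddressOfFunctions;     /* RVA of an array of 32-bit RVAs */
    uint32_t AddressOfNames;         /* RVA of an array of 32-bit name RVAs */
    uint32_t AddressOfNameOrdinals;  /* RVA of an array of 16-bit indices */
} ExportDirectory;

/* Hypothetical helper: `image` is the mapped image, `export_dir_rva` comes
 * from directory entry 0. Returns the RVA of the named function, or 0 if
 * the name is not exported. */
static uint32_t find_export_rva(const uint8_t *image, uint32_t export_dir_rva,
                                const char *name)
{
    ExportDirectory dir;

    assert(image != NULL && name != NULL);
    memcpy(&dir, image + export_dir_rva, sizeof dir);

    /* Walk the name and ordinal arrays in parallel. */
    for (uint32_t i = 0; i < dir.NumberOfNames; i++) {
        uint32_t name_rva;
        memcpy(&name_rva, image + dir.AddressOfNames + 4u * i, 4);
        if (strcmp((const char *)(image + name_rva), name) != 0)
            continue;

        /* The matching ordinal entry is an index into the function array. */
        uint16_t index;
        memcpy(&index, image + dir.AddressOfNameOrdinals + 2u * i, 2);
        if (index >= dir.NumberOfFunctions)
            return 0;

        uint32_t func_rva;
        memcpy(&func_rva, image + dir.AddressOfFunctions + 4u * index, 4);
        return func_rva;
    }
    return 0;
}
```

Adding the returned RVA to the base address of the loaded image gives the
pointer that can be cast to the function's type and called.
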
Freeing the library
====================

To free the custom loaded library, perform the steps

- Call entry point to notify library about being detached::

    DllEntryProc entry = (DllEntryProc)(baseAddress + PEHeader->OptionalHeader.AddressOfEntryPoint);
    (*entry)((HINSTANCE)baseAddress, DLL_PROCESS_DETACH, 0);

- Free external libraries used to resolve imports.
- Free allocated memory.

MemoryModule
=============

MemoryModule is a C library that can be used to load a DLL from memory.

The interface is very similar to the standard methods for loading libraries::

    typedef void *HMEMORYMODULE;

    HMEMORYMODULE MemoryLoadLibrary(const void *, size_t);
    FARPROC MemoryGetProcAddress(HMEMORYMODULE, const char *);
    void MemoryFreeLibrary(HMEMORYMODULE);

Downloads
----------

The latest development release can always be grabbed from GitHub at
http://github.com/fancycode/MemoryModule/

Known issues
-------------

- All memory that is not protected by section flags gets committed using
  `PAGE_READWRITE`. I don't know if this is correct.

License
--------

Since version 0.0.2, the MemoryModule library is released under the Mozilla
Public License (MPL). Version 0.0.1 was released under the Lesser General
Public License (LGPL).

It is provided as-is without ANY warranty. You may use it at your own risk.

Copyright
==========

The MemoryModule library and this tutorial are
Copyright (c) 2004-2015 by Joachim Bauch.