Using the Disassembler



When you [Open] a PE that has no Source Code inside, or that was not written with RosAsm, RosAsm offers to disassemble it. The proposed options are:


Normal Disassembly. This is the default: the Source is built as a simple Assembly Source. All Data and Code Labels take a mechanical form, for example 'Code0403058' or 'Data0405062'.


With Commented Hexa Code. In this Mode, the Hexa Code is given in Comments, at the right of each Instruction.


With Symbolic Analyses. In this Mode, RosAsm tries to point out the Parameters passed on the Stack for each Api call. When they are found, it replaces the mechanical Labels with their true Names, as found in the Win32 Documentation. This is a first step toward full HLL interpretations.


Group the Data at Top. Many Executable Files have Data stored in the Code Section. With this Flag set 'On', RosAsm groups them all at the Top of the Source. With this Flag set 'Off', the Data will be given at the place where they are found, in the .Code Section.


Use a Map File. If RosAsm has already disassembled, say, MyFile.exe, a MyFile.map has been saved beside it. Reusing it may speed up the Disassembling. When such a matching Map File is found beside the Application to disassemble, the default is set 'On'.
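
Two of the options above can be sketched in a few lines of Python. This is only an illustration, not RosAsm's own code: the function names are hypothetical, the seven-digit zero padding is inferred from the two example Labels given above, and the Map File check assumes that "matching" means the same path with a .map extension.

```python
from pathlib import Path


def mechanical_label(kind: str, address: int) -> str:
    """Build a Label such as 'Code0403058' from a kind and an address.

    Assumption: the address is written in hexadecimal, zero padded to
    seven digits, as the two examples in the text suggest.
    """
    return f"{kind}{address:07X}"


def map_file_path(exe_path) -> Path:
    """Path of the Map File expected beside the Application (assumed naming)."""
    return Path(exe_path).with_suffix(".map")


def use_map_file_by_default(exe_path) -> bool:
    """The 'Use a Map File' flag defaults to On only when the file exists."""
    return map_file_path(exe_path).is_file()
```

For example, mechanical_label("Code", 0x403058) gives 'Code0403058', and use_map_file_by_default("MyFile.exe") is True only when a MyFile.map sits in the same folder.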



General approach


RosAsm's Disassembler is, first of all, an Automatic Disassembler that tries to provide a Source that can be re-compiled without any further hand work. This is actually effective on most small Demos. Between, say, 100 and 300 KB, it may also work, but this depends essentially on the quality (clean vs dirty construct) of the PE. Above this size (Megabytes) there is no hope, and probably never will be, unless the PE organization is absolutely standard.



What the Disassembler actually does


Intelligent Recognition of the PE's Sections, even in cases of merged Sections.


Recovery of all Resources (except Version Info Resources, not yet implemented in RosAsm). The Resources saved by Named IDs, instead of Numbered IDs, are computed, but the RosAsm Resources Editors are currently not able to handle them (all RosAsm Resources Editors work only with Resources saved by Numbers). For the Main Menu, the original IDs are replaced by the usual RosAsm Equates Names, if the 'MainWindowProc' branchings are identified.


Recognition of the various Data Formats (Floats, Strings, Pointers to Code or Data) is implemented.


Most small Applications, like the Iczelion Tutorials Demos (all), the Test Department ones (all but Tut_5), the Four-F ''Cocomac'' Demos, and so on, are correctly disassembled, re-assembled and re-run in two Clicks. Even middle-size Applications, like the Iczelion Demo 35 (a RichEdit Editor) or Test Department's biggest Demos, seem to run fine, or at least partially, without any hand work between the Disassembling and the [Run].


A first HLL Interpretation, based on the Api calls Parameters, may be applied. In this case, all the identified Api call Parameters are replaced by the Names found in the Api List Documentation. This process, currently based on manipulations of the final Source text, is very slow.


MainWindowProc and Main are detected and provided in the Source.


Api calls performed through a Jump Table (two Instructions instead of one) are replaced by the usual RosAsm direct calls. The original Api Jump Table is still provided, for cases of moves to Variables. In such cases, the Jump Table Label is used.


A bit of Interactivity has been introduced with RosAsm V.2.022a. See Disassembler_Flags.
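
The Jump Table replacement listed above relies on a common 32-bit x86 pattern: a 'call rel32' (opcode E8) that lands on a 'jmp dword [address]' (opcode FF 25) reading an import slot. A minimal Python sketch of the recognition follows. It is not RosAsm's implementation: the function name, the flat 'image' buffer and the 'iat' dictionary mapping slot addresses to Api names are all assumptions made for the illustration.

```python
import struct


def resolve_thunk_call(image: bytes, base: int, call_addr: int, iat: dict):
    """Return the Api name reached by a call through a jump thunk, or None.

    image: raw bytes of the code, loaded at virtual address 'base'.
    iat:   hypothetical mapping {import slot address: Api name}.
    """
    off = call_addr - base
    if not (0 <= off <= len(image) - 5) or image[off] != 0xE8:
        return None  # not a 'call rel32'
    (rel,) = struct.unpack_from("<i", image, off + 1)
    # The displacement is relative to the instruction following the call.
    target = (call_addr + 5 + rel) & 0xFFFFFFFF
    t = target - base
    if not (0 <= t <= len(image) - 6) or image[t:t + 2] != b"\xFF\x25":
        return None  # the target is not 'jmp dword [address]'
    (slot,) = struct.unpack_from("<I", image, t + 2)
    return iat.get(slot)
```

When a name is returned, the two Instructions (the call and the jump) can be shown as one direct call to that Api. When None is returned, the call is left as it is.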



What it does not do


It will fail on encrypted PEs, on Auto-writeable (self-modifying) Code, and on Code that makes direct use of hard-coded References instead of Pointers.


It does not yet take care of the Menu-Items Equates for Dialogs. Only the first Menu, considered to be the default MainWindow-Menu, is handled.


Another weak point is the Recognition of small Chunks of Data nested inside Code. The Intelligent Recognition may fail to decide whether the Chunk is Data or Code that is never called. In such cases, it provides the DB Bytes, plus several commented Interpretations. (In other cases, when the Chunk is big enough to be identified as true Data, the Chunk is moved into the normal Data.)


The replacement of Structure Member Names and of Win32 Equate Names is not yet implemented.


The HLL Constructs (If, While, and friends) replacements are not yet implemented.
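
The difficulty with small Chunks mentioned above comes from the fact that the same few Bytes often admit several valid readings. The following Python sketch only illustrates that ambiguity. The function name and the returned dictionary are hypothetical, and the format of RosAsm's commented Interpretations is not described here.

```python
import struct


def candidate_readings(chunk: bytes) -> dict:
    """List some plausible readings of a small chunk of bytes."""
    # The raw bytes are always available, as with a plain DB statement.
    readings = {"bytes": " ".join(f"{b:02X}" for b in chunk)}
    # Printable ASCII only: the chunk could be a short string.
    if chunk and all(0x20 <= b < 0x7F for b in chunk):
        readings["ascii"] = chunk.decode("ascii")
    # Exactly four bytes: could be a little-endian dword or a 32-bit float.
    if len(chunk) == 4:
        readings["dword"] = struct.unpack("<I", chunk)[0]
        readings["float"] = struct.unpack("<f", chunk)[0]
    return readings
```

For example, the four Bytes 41 42 43 44 read as the text 'ABCD' and as the dword 0x44434241, and they are also four valid one-byte 32-bit Instructions (inc ecx, inc edx, inc ebx, inc esp). Nothing in the Bytes themselves says which reading is the right one.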



Practice


In practice, if you believe that you will be able to disassemble a big Executable and re-assemble it in two clicks, you will be disappointed. This is not at all the purpose of this Disassembler, and no Disassembler on earth will ever do that. It is simply impossible, unless the complete file is 100% standard, from a Sections point of view, and 100% clean, which is extremely uncommon.


So, work first with small Applications.


With middle-size Applications (100 to 300 KB), you may get a valid Disassembly that does not exactly reflect the disassembled PE, because of minor failures.


The most usual failure cases are erroneous interpretations of small Chunks of Data or Code. In these cases, you may give a try to the [Bad Disassembly] Option of the Float-Menu, when double-clicking on the suspected Label.


Then, once the Application is correctly re-compiled, it may still misbehave because of several minor points that you may have to fix by hand, after analysis.



Purpose and scope


The Disassembler will remain under intensive development for several months. In its final state, it will be a Decompiler outputting a complete restoration of the Targeted File, ready for re-compilation in a significant number of cases, which will make RosAsm a Programmer's Tool without any competitor in that area.


Even in case of failure of the full ''Two-Clicks-Disassembler-ReAssembler'' process, the results will often be usable, at least for study and for helping with translation work.


The Disassembler is a Study and Translation Tool designed for the Open Source Movement. The main goal is to make the translation of Demos and Tuts to RosAsm syntax as easy and fast as possible. Even when the Sources are available, a port to Assembly may not be so easy with big files. We can be sure that the Disassembler will, at least, make fewer translation errors and take much less of our working time than translating by hand would.


By no means do I intend to develop a Disassembler-Reassembler able to analyse any weird or deliberately tricky PE made with the purpose of resisting Disassembly.

 


~~~~~~~