<Method to decompile in DCC>



    <Contents>
    1. Is the description of functions and data correct?
      1.1. Functions
      1.2. Data
    2. Is the description of simple statements correct?
    3. Is the description of multi-goto (switch) statements correct?
    4. Is the description of loop (for-while) statements correct?
    5. Is the description of branch (if-else) statements correct?

This page describes the check points for getting DCC to output complete decompiled C source files.
Whether a decompilation has succeeded can be judged by the following five main check points:
    [1] Is the description of functions and data correct?
    [2] Is the description of simple statements correct?
    [3] Is the description of multi-goto (switch) statements correct?
    [4] Is the description of loop (for-while) statements correct?
    [5] Is the description of branch (if-else) statements correct?
CSM (C simulator) performs the decompilation itself automatically. The five check points correspond to CSM PASS 1 through CSM PASS 5, respectively, which are displayed when CSM is started with the -v start option. The user's actual work is to create a .dcc file and write DCC commands in it, mainly CSM commands.

1. Is the description of functions and data correct?

1.1. Functions

All functions recognized by CSM are listed in detail in the message file, under CSM PASS 1, when CSM is started with the -v start option. If the range of a function is wrong, specify the 'pro' command or the 'cpr' command. (Because 'pro' is an XSIM command, you have to rerun from XSIM after specifying it. With 'cpr', you only need to restart CSM with the -i start option. The difference between the 'pro' and 'cpr' commands is whether the change is reflected in the ASM files or only in the C source.)
Once the range of each function is correct, use the 'cfu' command to specify the type of the function, its parameters ('par' option), and its internal data ('dyn' option). If those data types involve a structure or a typedef, also specify the 'cst' command (structure) and the 'cty' command (typedef).
For external functions, which are shown as Extern Function in the message file, also specify the type of the function and the types of its parameters with the 'cfu' command.
Windows import functions are common to all programs, so once they are added to the command file 'import.dcc', CSM refers to them automatically when it runs. Structures ('cst' command) and typedefs ('cty' command) that are common in Windows should also be added to 'import.dcc'. (PROXYAN 2.60 automatically generates all necessary cfu/cst/cty commands in 'import.dcc'.)
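
To make the list above concrete, the following C fragment marks the pieces of a function description that have to be known: the function type, the parameter types, the internal (local) data, and any structure or typedef they depend on. This is only a sketch of the C side. All names in it (dword_t, struct point, manhattan) are hypothetical, and the syntax of the 'cfu', 'cst' and 'cty' commands is not shown because this page does not give it.

```c
/* Hypothetical typedef: the kind of information a typedef description supplies. */
typedef unsigned long dword_t;

/* Hypothetical structure: the kind of information a structure description supplies. */
struct point {
    int x;
    int y;
};

/* Function type (int) and parameter types (const struct point *, dword_t). */
int manhattan(const struct point *p, dword_t scale)
{
    /* Internal data of the function: two local int variables. */
    int dx = (p->x < 0) ? -p->x : p->x;
    int dy = (p->y < 0) ? -p->y : p->y;

    return (int)((dword_t)(dx + dy) * scale);
}
```

If any one of these pieces is missing or wrong, the statements that use the function or its data cannot be described correctly either.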

1.2. Data

All data recognized by CSM can be identified in the screen output of CBR or in the .h file output by CGN. Specify the type of each data item with the 'cgl' command.
If the range of a data item is wrong, specifying it with the 'cgl' command causes it to be described correctly as data of the size given by that command. The label name of each string in an array of strings is only a temporary name, so there is no particular need to rename it with the 'ren' command. The initial value of external data is not described if it is 0; if it is not 0, it is described as an initializer expression.
An incorrect data type also shows up as an incorrect description of the statements that refer to the data. A typical case is an incorrect reference to a string.
For external data as well, specify the type of the data with the 'cgl' command.
Windows import data is renamed automatically, so refer to that name when you specify the type of the data with the 'cgl' command. Windows import data is common to all programs, so once it is added to the command file 'import.dcc', CSM refers to it automatically when it runs. (PROXYAN 2.60 automatically generates all necessary cgl/cst/cty commands in 'import.dcc'.)
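
The rule about initial values of external data can be pictured in C as follows. This is a minimal sketch and not actual DCC output. The variable names are hypothetical and chosen only for illustration.

```c
/* Hypothetical external data, named only for illustration. */

int zero_counter;                /* initial value is 0: no initializer is written */
int retry_limit = 3;             /* initial value is not 0: written as an initializer */
const char *greeting = "hello";  /* a string reference, correct only if typed as a string */

/* A statement referring to the data; it is described correctly
   only when the types above are correct. */
int sum_globals(void)
{
    return zero_counter + retry_limit;
}
```

In standard C, external data without an initializer starts at 0, so leaving out a zero initializer does not change the meaning of the program.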

2. Is the description of simple statements correct?

The number of simple statements recognized by CSM is displayed in the message file, under CSM PASS 2, when CSM is started with the -v start option. If 2 is specified with the 'cpa' command, CSM stops at CSM PASS 2, so you can check in the screen output of CBR or in the output of CGN whether all simple statements were recognized correctly. If the 'log fil' command is specified, all statements, from simple to compound, are written in order to the logging file (.log) as CSM runs.
If simple statements are not recognized correctly, the compound statements built from them are not recognized either, so correct recognition of simple statements is the key point for the C source description.
If the disassembly in XSIM is complete as described in <Method to disassemble in AGNSS>, the only cause of an erroneous simple statement is an erroneous type of a function or data.
For example, in an assignment statement the value is first held temporarily in a register such as eax and then assigned to data that must be an lvalue. If that data is not defined, CSM does not recognize it as an lvalue, and the statement is not described as a simple statement.
If the types of functions and data are specified correctly, simple statements are also described correctly.
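
The assignment example above can be sketched in C. The first function spells out the register-level steps, with a local variable standing in for eax. The second is the single simple statement they amount to once 'total' is known to be defined data. The names are hypothetical and the assembly in the comments is only a rough illustration, not XSIM output.

```c
/* Hypothetical global data; it must be defined so that it can be an lvalue. */
int total;

/* Register-level view: the value passes through a temporary register. */
void add_to_total_steps(int a, int b)
{
    int eax;       /* stands in for the eax register */

    eax = a;       /* roughly: mov eax, a       */
    eax += b;      /* roughly: add eax, b       */
    total = eax;   /* roughly: mov [total], eax */
}

/* The same work described as one simple statement. */
void add_to_total_simple(int a, int b)
{
    total = a + b;
}
```

Both functions leave the same value in 'total'. Without a definition of 'total' there is no lvalue to assign to, and the three steps cannot be combined into one statement.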

3. Is the description of multi-goto (switch) statements correct?

The number of multi-goto (switch) statements recognized by CSM is displayed in the message file, under CSM PASS 3, when CSM is started with the -v start option. CSM recognizes as multi-goto (switch) statements only multi-jump instructions, which XSIM shows in forms such as jmp [ebx+label]. If the C compiler emitted a switch as a series of if-else tests rather than as a multi-jump instruction, CSM describes it exactly as that series of if-else statements.
C syntax has no simple statement that corresponds to a multi-jump instruction, but in CSM PASS 2 it is described as a simple statement such as 'goto label[8]'.
If the simple statements are described correctly, the multi-goto (switch) statement is also described correctly.
If you want to suppress or alter the output of a multi-goto (switch) statement, use the 'csw' command.
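
The two forms mentioned above can be compared in C. The sketch below writes the same selection once as a switch and once as a series of if-else tests. Which of the two shapes appears in the machine code is decided by the compiler, so either source form may come back from decompilation as the other. The function names and values are hypothetical.

```c
/* Written as a switch: a compiler may turn this into a multi-jump instruction. */
int classify_switch(int code)
{
    switch (code) {
    case 0:  return 10;
    case 1:  return 20;
    case 2:  return 30;
    case 3:  return 40;
    default: return -1;
    }
}

/* Written as a series of if-else tests: same result, no jump table needed. */
int classify_chain(int code)
{
    if (code == 0)
        return 10;
    else if (code == 1)
        return 20;
    else if (code == 2)
        return 30;
    else if (code == 3)
        return 40;
    else
        return -1;
}
```

Both functions return the same value for every input, which is why describing a switch as an if-else series does not change the behavior of the program.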

4. Is the description of loop (for-while) statements correct?

The number of loop (for-while) statements recognized by CSM is displayed in the message file, under CSM PASS 4, when CSM is started with the -v start option. CSM recognizes a loop (for-while) statement only when it detects a goto-loop that can be described as a 'for' or 'while' statement.
If the simple statements are described correctly, the loop statement is also described correctly.
If you want to suppress or alter the output of a loop statement, use the 'cfo' command or the 'cwh' command.
The same loop can be written with 'for', with 'while', or with 'goto' statements, so whether to suppress or alter a loop statement is a matter of the user's preference.
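
The three equivalent forms of a loop can be sketched in C as follows. The example is hypothetical and is not DCC output. It only shows that the goto-loop and the 'for' and 'while' statements describe the same computation.

```c
/* Sum of 0 .. n-1 written as a 'for' statement. */
int sum_for(int n)
{
    int s = 0;
    int i;

    for (i = 0; i < n; i++)
        s += i;
    return s;
}

/* The same loop written as a 'while' statement. */
int sum_while(int n)
{
    int s = 0;
    int i = 0;

    while (i < n) {
        s += i;
        i++;
    }
    return s;
}

/* The same loop left as a goto-loop. */
int sum_goto(int n)
{
    int s = 0;
    int i = 0;

loop:
    if (i >= n)
        goto done;
    s += i;
    i++;
    goto loop;
done:
    return s;
}
```

Since all three forms give the same result, choosing among them affects only the readability of the generated source.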

5. Is the description of branch (if-else) statements correct?

The number of branch (if-else) statements recognized by CSM is displayed in the message file, under CSM PASS 5, when CSM is started with the -v start option. CSM recognizes a branch (if-else) statement only when it detects a goto-branch that can be described as 'if' and 'else' statements.
If you want to suppress or alter the output of a branch statement, use the 'cif' command.
If the simple statements are described correctly, the branch statement is also described correctly.
The same branch can be written with 'if' and 'else' or with 'goto' statements, so whether to suppress or alter a branch statement is a matter of the user's preference.
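
As with loops, the equivalence of the two branch forms can be sketched in C. The functions below are hypothetical examples and are not DCC output.

```c
/* Absolute value written with 'if' and 'else'. */
int abs_if(int x)
{
    int r;

    if (x < 0)
        r = -x;
    else
        r = x;
    return r;
}

/* The same branch left as a goto-branch. */
int abs_goto(int x)
{
    int r;

    if (x >= 0)
        goto nonneg;
    r = -x;
    goto done;
nonneg:
    r = x;
done:
    return r;
}
```

The goto version keeps the inverted test and the jump targets visible, while the if-else version is what the same control flow looks like once the branch has been recognized.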

Bilyzkid Co.,Ltd.
Higashi-Izumi 1-34-19-102
Komae-Shi, Tokyo 201-0014, JAPAN
Phone:81-3-5497-1962