Softpanorama May the source be with you, but remember the KISS principle ;-)	Home	Switchboard	Unix Administration	Red Hat	TCP/IP Networks	Neoliberalism	Toxic Managers
	(slightly skeptical) Educational society promoting "Back to basics" movement against IT overcomplexity and bastardization of classic Unix

Decompilation and Decompilers

News	Recommended Books	Recommended Links	Compilation	Reverse Engineering	Program Understanding
Disassemblers	Decompilers	Decomplation of control structures	Java decompilations	Disassemblers	GDB
The Wisconsin Program-Slicing Tool, Version 1.0.1	Refactoring vs Restructuring	Peephole Refactoring	Code Reviews	Software archeology	Strategies of Defending Microsoft Windows against Malware
Selected projects	Program Slicing as Decomplation/Program Reconstruction Technique	Legal Aspects	Lang history	Humor	Etc

Decompilation is a part or reverse engineering of software which is pretty underdeveloped area, but many methods developed for compilation and especially for the optimization of object code are directly applicable to the problem. It is also related to Program Understanding

Decompilation is the next stage after disassembly the program. See Disassemblers. So we assume that assembler code is already available and verified.

Students that wish to study this area are strongly advised to learn basic theory of compilation and, especially, classic code optimization methods. Generally one needs to augment pattern-based constructs recovery with control flow and data flow analysis.

The program graph analysis is much better suited for recovery of control structures than pure pattern matching although even such a simple approach as peephole refactoring can recover some local control flow constructs. For example dominators can be used for detecting high level control structures. See Decomplation of control structures
Data flow analysis makes it possible to detect major structures and even somewhat understand their usage.

Another way to extract useful information is slicing. Peephole optimization can used as a powerful decompilation method as well. You can use regular expression to detect certain pattern in "sliding window" and decompile them one by one. Prolog can be used as a "regular expression engine on steroids" (see Decompilation of Java bytecode to Prolog by partial evaluation - 2009 ScienceDirect )

If you know using what complier the code was complied, the task of decompilation became much easier. String of compiler generation phase provides you will a lot of useful information about where the boundaries of construct can be located and how typical control structure and data structure re complied.

Decompilation is generally non deterministic. There are many source programs that can compile into the same object code. So what you recover is not the original source code but its equvalent.

One special area (the area I was initially interested in when I studied this problem in 1989-1991). Here the problem is that malware uses special methods of obfuscation of code that make decompilation more difficult and generally impossible without debugger and virtual machine. Now this areas is called Malware analysis. This area produced some interesting approaches to parcial decompilation such as Analyzing malware by API calls - Malwarebytes Labs Malwarebytes Labs which are related to the area of "program understanding"

Even some MS DOS viruses used pretty elaborate techniques of hiding their code by encrypting most of it all the time and decrypting only rather small "active area" that need to be executed. It is incredible how sophisticated was obfuscation in some 4K DOS viruses. They served as a launching pad for mal ware in Windows although self replication generally became less important in Windows world.

Still running a debugger usually gives you capability to reconstruct the code and even defect some higher level control and data structures. But this is a heuristic activity and automating it should generally be considered as a good test of AI capabilities.

With the "next generation malware" (started with Stuxnet worm detection in 2010) decompilation of malware became an important activity that received state support as a part of cyber defense capabilities. Unfortunately much of the activity in this area remains classified.

Dr. Nikolai Bezroukov

Top Visited <p>Your browser does not support iframes.</p>					Switchboard
					Latest
					Past week
					Past month

NEWS CONTENTS

20180613 : reverse engineering - Is there a C++ decompiler ( Jun 13, 2018 , stackoverflow.com )
20180613 : Java Decompiler download ( SourceForge.net )
20180613 : Krakatoa: Decompilation in Java ( Jun 1, 1997 , usenix.org )
20180613 : decompilation of .class file into .java (Java in General forum at Coderanch) ( Jan 1, 2012 )
20120705 : Crowd Sourced Malware Reverse Engineering Platform Launched ( Crowd Sourced Malware Reverse Engineering Platform Launched, Jul 05, 2012 )
20110512 : Flash Decompiler Trillix 5.0 ( Flash Decompiler Trillix 5.0, May 12, 2011 )
20101229 : Re Language inference tool ( Dec 2, 2010 , Comp.compilers )
20101227 : Making C compiler generate obfuscated code ( Dec 07, 2010 , Comp.compilers )
20060116 : computer intelligence assembler disassembler by Albert van der Horst ( computer intelligence assembler disassembler, Jan 16, 2006 )
20060116 : Is Sony in violation of the LGPL - Programming stuff ( Is Sony in violation of the LGPL - Programming stuff, )
20051028 : Program Transformation Wiki - Decompilation Resources ( Program Transformation Wiki - Decompilation Resources, Oct 28, 2005 )
20051028 : freshmeat.net Project details for uncc ( freshmeat.net Project details for uncc, )
20051028 : Pyreverse ( Pyreverse, )
20011228 : Decompilation page and link to a decompiler by Satish Kumar. ( Decompilation page and link to a decompiler, Dec 28, 2001 )
20001218 : Clipper and FoxPro decompilation ( Dec 18, 2000 )
20001010 : Filter Factory Plug-in Decompiler ( Filter Factory Plug-in Decompiler, Oct 10, 2000 )
20000915 : Java decompilers ( Sept 15, 2000 )
20000915 : C. Cifuentes and K. Gough. Decompilation of binary programs. Software -- Practice and Experience , 25(7):811--829, July 1995. ( )
20000915 : C. Cifuentes, Interprocedural data-flow decompilation . Journal of Programming Languages Vol 4 No 2 (1996) pp 77--99 ( )
20000915 : Of Graphs and Coffi Grounds: Decompiling Java - Patrick Lam (1998) ( Of Graphs and Coffi Grounds: Decompiling Java - Patrick Lam (1998), )
20000915 : Making Graphs Reducible with Controlled Node Splitting - Johan Janssen (1997) ( Making Graphs Reducible with Controlled Node Splitting - Johan Janssen (1997), )
20000915 : GOTO Removal Based On Regular Expressions - Paul Morris Ronald ( GOTO Removal Based On Regular Expressions - Paul Morris Ronald, )

Old News ;-)

[Jun 13, 2018] reverse engineering - Is there a C++ decompiler

Jun 13, 2018 | stackoverflow.com

David Holm ,Oct 15, 2008 at 15:08

You can use IDA Pro by Hex-Rays . You will usually not get good C++ out of a binary unless you compiled in debugging information. Prepare to spend a lot of manual labor reversing the code.
If you didn't strip the binaries there is some hope as IDA Pro can produce C-alike code for you to work with. Usually it is very rough though, at least when I used it a couple of years ago.

davenpcj ,May 5, 2012 at

To clarify, IDA will only give the disassembly. There's an add-on to it called Hex-Rays that will decompile the rest of the way into C/C++ source, to the extent that's possible. – davenpcj May 5 '12 at

Dustin Getz ,Oct 15, 2008 at 15:15

information is discarded in the compiling process. Even if a decompiler could produce the logical equivalent code with classes and everything (it probably can't), the self-documenting part is gone in optimized release code. No variable names, no routine names, no class names - just addresses.

Darshan Chaudhary ,Aug 14, 2017 at 17:36

"the soul" of the program is gone, just an empty shell of it's former self..." – Darshan Chaudhary Aug 14 '17 at 17:36

,

Yes, but none of them will manage to produce readable enough code to worth the effort. You will spend more time trying to read the decompiled source with assembler blocks inside, than rewriting your old app from scratch.

Java Decompiler download

Abandonware from 2002

SourceForge.net

The aim of this project is to develope a decompiler for java which is platform independent and has options to obfuscate the class file also.
The project takes class file as input and decompiles it and provides the source file.

Krakatoa: Decompilation in Java

PDF available online.

Jun 1, 1997 | usenix.org

This paper presents our technique for automatically decompiling Java bytecode into Java source.

Our technique reconstructs souree-level expressions from bytecode, and reconstructs readable, high-level control statements from primitive goto-like branches.

Fewer than a dozen simple code-rewriting rules reconstruct the high-level statements.

decompilation of .class file into .java (Java in General forum at Coderanch)

Jan 1, 2012

Mehmet Gunacti
Take a look at http://java.decompiler.free.fr

Meherdad Bomanbehram
many are available here

[Jul 05, 2012] Crowd Sourced Malware Reverse Engineering Platform Launched

Slashdot
wiredmikey writes "Security startup CrowdStrike has launched CrowdRE, a free platform that allows security researchers and analysts to collaborate on malware reverse engineering. CrowdRE is adapting the collaborative model common in the developer world to make it possible to reverse engineer malicious code more quickly and efficiently. Collaborative reverse engineering can take two approaches, where all the analysts are working at the same time and sharing all the information instantly, or in a distributed manner, where different people work on different sections and share the results. This means multiple people can work on different parts simultaneously and the results can be combined to gain a full picture of the malware. Google is planning to add CrowdRE integration to BinNavi, a graph-based reverse engineering tool for malware analysis, and the plan is to integrate with other similar tools. Linux and Mac OS support is expected soon, as well."

[May 12, 2011] Flash Decompiler Trillix 5.0

Flash Decompiler Trillix is a decompiler for SWF files. It allows you to extract objects from a Flash movie, save sounds as WAV and MP3, images as PNG, JPEG, and BMP, videos as FLV (Flash video format),... AVI, or MPEG, and text as RTF, TXT, and HTML. It can also export ActionScripts

Changes: Flash files can now be converted into the Adobe Flash CS5 file format (.xfl, XML-based FLA). Support was added for the "Binary" tag in SWF elements tree. Data binding support... in Flex files was improved. The SWF to FLA conversion process is substantially faster now

[Dec 29, 2010] Re Language inference tool

Dec 2, 2010 | Comp.compilers

glen herrmannsfeldt <[email protected]>

Torben Fgidius Mogensen <[email protected]> wrote:
> Vivien Parlat <[email protected]> writes:

>> My job requires me to work with a no-more-supported and unknown
>> language. I'm considering the idea of building tools in order to
>> facilitate the work with it, but defining its grammar manually could
>> be very tedious.

(snip)

> Grammar inference will not give you anything useful.

That was my first thought, but then I wasn't so sure.

For one, the usual programming languages use only a small subset of
the possible grammars. With the appropriate limitations, it might be
possible to get a good start on one.

> If you only show positive examples, the simplest grammar is the one
> that accepts any text string, and the most precise is the one that
> accepts the examples and nothing else.

OK, consider trying to find the C grammar from a reasonably sized
sample of C code. Most likely, it wouldn't be able to separate
library calls from statements, but maybe that isn't so bad.
Otherwise, it is statistical. If a certain word appears many times at
the beginning of a statement (line?) then it is likely a statement
keyword. In between the two limits, there are some that are likely to
accept other valid examples, and less likely to accept ones that
aren't valid.

> Anything in between these two extremes is extremely difficult to
> define, and I know of no tool that will give a useful result,
> especially if the grammar is inherently large.

Large is a problem, but that is exactly the case where an automated
system is most useful. How about COBOL for an example? How hard
would it be to extract the COBOL grammer from sample code? Is the
extraction program allowed to know that blanks are significant, and
that keywords are reserved? The more hints, the better the result is
likely to be.

> Also, even if you manage to infer a grammar, this will only give you
> a recognizer for valid programs but not any useful structure, as the
> inferred grammar is likely to be highly ambiguous. Making it
> unambiguous requires expert knowledge on par with what is required to
> write a grammar from scratch.

Well, consider it in the way that OCR was not so long ago. (and
likely still is). It isn't perfect, but sometimes editing the output
of OCR is easier than typing it all by hand. The statistics (types of
mistakes) are different, though.

> So while grammar inference may be of theoretical interest, it is not
> useful for writing grammars for programming languages.

How about just a verifier, for a hand written grammar? One could
start writing, then have a program show some statements that it
doesn't accept, to be added by hand. Iterate until good enough.

But another question is, what does one want to do with the result?

If one wants a syntax verifier, then the problem is somewhat
easier than needed for a compiler. For example, a compiler is
usually expected to generate a parse tree with the appropriate
operator precedence already included. A verifier just needs
to verify that the input can be parsed, but doesn't need to
know about precedence. It does seem unlikely that an automated
system would extract precedence from sample input.

Many years ago I used the BNF description of Fortran in the IBM
manual for a Fortran syntax verifier. I don't remember it being
in the compiler manuals (Program Logic Manuals), but it was in
the Syntax Verifier manual. But syntax verifiers aren't used
much anymore.

> It will probably be more useful to try to uncover documentation for the
> language. If you don't know its name or anything, you could try to post
> an example of code on comp.languages.misc and see if anyone recognizes
> it.

-- glen

Valerio Cavadini <[email protected]>:

There is a tool called "Grammar Recovery Kit".http://www.cs.vu.nl/grammarware/grk/

I did not use it so I can not say if it works or not...

Regards

Valerio

[Dec 27, 2010] Making C compiler generate obfuscated code

The thread contains a very interesting discussion...

Dec 07, 2010 | Comp.compilers

About my little attempt to hack Tiny C compiler's codegenerator so it produces obfuscated code:
http://blogs.conus.info/node/58

Martin Ward <[email protected]> :

On Thursday 16 Dec 2010 at 15:23, Joshua Cranmer <[email protected]> wrote:

> I'm not sure what work has been done on deoptimization (perhaps anyone
> with the Hex-Rays decompiler could tell us?), but some of the
> optimization techniques seem relatively easy to reverse.
> > From a purely theoretical standpoint, obfuscation that adds
> non-executing code is going to be less difficult to reverse engineer
> than obfuscation that does the same thing, just... less obviously.

A major theoretical result in this area is the paper "On the (Im)possibility of Obfuscating Programs" by Boaz Barak et al, published in: CRYPTO '01 Proceedings of the 21st Annual International Cryptology Conference on Advances in Cryptology. Boaz Barak gives a non-technical description of the results and their meaning here:

http://www.math.ias.edu/~boaz/Papers/obf_informal.html

The FermaT program transformation system can interactively transform code into semantically equivalent forms: including restructuring, simplification, dataflow analysis , SSA construction, slicing and, in simple cases, abstraction to a specification. The FermaT engine and graphical front end runs on Windows and Linux and can be downloaded from here:

http://www.gkc.org.uk/fermat.html

or:

http://www.cse.dmu.ac.uk/~mward/fermat.html

FermaT's primary commercial application is migrating assembler to structured and maintainable C and COBOL, so the "deobfuscation" transformations are geared towards removing the "clever tricks" introduced by assembler programmers to save a byte here and a cycle there: or just because they were

-- Martin

STRL Reader in Software Engineering and Royal Society Industry Fellow [email protected] http://www.cse.dmu.ac.uk/~mward/ Erdos number: 4 G.K.Chesterton web site: http://www.cse.dmu.ac.uk/~mward/gkc/ Mirrors: http://www.gkc.org.uk and http://www.gkc.org.uk/gkc

glen herrmannsfeldt <[email protected]>:

Martin Ward <[email protected]> wrote: (snip)

> FermaT's primary commercial application is migrating assembler to
> structured and maintainable C and COBOL, so the "deobfuscation"
> transformations are geared towards removing the "clever tricks"
> introduced by assembler programmers to save a byte here and a cycle
> there: or just because they were

One of the tricks I remember from the 8 bit microprocessor days was to put an instruction in the two bytes of a load instruction instead of a branch around the instruction.

I used to have a table driven disassembler for 8080, Z80, and then later 6809 code that would follow along from the entry point, keeping track of conditional branch destinations on a stack. When it reached an unconditional branch it would take an entry off the stack. It keeps flags for each byte as the first byte (opcode), absolute address, relative address, or, if not part of any instruction, data. I suppose one would eventually learn enough tricks though.

Along the line of such tricks, there are stories of Apple II ROM cloners claiming that so many such tricks were used that the Apple implementation was the only one possible in the available number of bytes. Then, since it was the only one possible, it should not protected against cloning.

-- glen

Rob Warnock:

Martin Ward <[email protected]> wrote:
+---------------
| A major theoretical result in this area is the paper "On the
| (Im)possibility of Obfuscating Programs" by Boaz Barak et al,
| published in: CRYPTO '01 Proceedings of the 21st Annual International
| Cryptology Conference on Advances in Cryptology. Boaz Barak gives a
| non-technical description of the results and their meaning here:
|
| http://www.math.ias.edu/~boaz/Papers/obf_informal.html
+---------------

See also more recent results mentioned here:

http://www.wisdom.weizmann.ac.il/~oded/p_obfuscate.html

In particular, there's a 2010 revised version of the CRYPTO'01 full paper here (54 pages, 0.5 MB PDF):

http://www.wisdom.weizmann.ac.il/~oded/PS/obf4.pdf

Some snippets from the abstract:

Our main result is that, even under very weak formalizations of the above intuition, obfuscation is impossible. ... We extend our impossibility result in a number of ways, including even obfuscators that (a) are not necessarily computable in polynomial time, (b) only approximately preserve the functionality, and (c) only need to work for very restricted models of computation (TC0). We also rule out several potential applications of obfuscators, by constructing "unobfuscatable" signature schemes, encryption schemes, and pseudorandom function families.

-Rob

Hans-Peter Diettrich <[email protected]> :

Torben Fgidius Mogensen schrieb:

> Some of the least readable code I have seen has been code where every
> trick in the book was used to make it as short as possible, so using
> extreme optimisaton tricks is probably a much better obfuscator than
> inserting random code -- even random control-structure code.

Best obfuscation can be found in (older) Windows system code. Short jumps have been replaced by dummy instructions (CMP), with other jumps going to the argument of that instruction. What assembly code should a disassembler generate from such ambiguous constructs?

Also nice are vararg-type subroutine parameters inlined after the call statement, so that the location of the next statement after the call is unknown. Even worse when multiple possible return adresses (jump tables) are part of these parameters. Such obfuscation requires to decode and emulate the called subroutine, before the location of the next statement is known.

In practice such interruptions of the control flow make automatic disassembling almost impossible. Instead a good *interactive* disassembler is required (as I was writing when I came across above tricks), and time consuming manual intervention and analysis is required with almost every break in the control flow. The mix of data and instructions not only makes it impossible to generate an assembler listing, but also hides the use of memory locations (variables or constants), with pointers embedded in the inlined parameter blocks. Now tell me how a decompiler or other analysis tool should deal with such constructs, when already the automatic separation of code and data is impossible.

DoDi

Torben Ægidius Mogensen:

Hans-Peter Diettrich <[email protected]> writes:

> In practice such interruptions of the control flow make automatic
> disassembling almost impossible. Instead a good *interactive*
> disassembler is required (as I was writing when I came across above
> tricks), and time consuming manual intervention and analysis is
> required with almost every break in the control flow. The mix of data
> and instructions not only makes it impossible to generate an assembler
> listing, but also hides the use of memory locations (variables or
> constants), with pointers embedded in the inlined parameter
> blocks. Now tell me how a decompiler or other analysis tool should
> deal with such constructs, when already the automatic separation of > code and data is impossible.

Using jump tables and the like is, indeed, going to make unobfuscation hard. Especially if the tables change dynamically.

You might be able to get around this by symbolic execution: You start with a state description which allows arbitrary values of variables. You then symbolically execute and update the state description as you go along. At any test, you split the state description into two new state descriptions and continue symbolic execution on two individual paths. Whenever symbolic execution reaches a previously-visited program point with a state description equivalent to what was seen before, you make a loop. To keep the set of state descriptions finite, you apply generalisation when a state description gets too complicated.

This process is similar to online partial evaluation, which also uses state descriptions and generalisation.

That said, unobfuscation can never be perfect: Equivalence of programs is undecidable, so it is in theory possible to make a program so obfuscated that no automatic process can recover the original.

But what if you know the obfuscation method? Assuming that the obfuscation method is polynomic, deobfuscation is at worst NP-hard, so it is decidable. But it can be so intractable that it doesn't matter.

Torben

George Neuner <[email protected]> :

On Sat, 18 Dec 2010 06:23:36 +0000 (UTC), glen herrmannsfeldt <[email protected]> wrote:

>Along the line of such tricks, there are stories of Apple II ROM
>cloners claiming that so many such tricks were used that the Apple
>implementation was the only one possible in the available number of
>bytes. Then, since it was the only one possible, it should not
>protected against cloning.

I've heard these claims about the Apple II ROM for thirty years and I think they are bogus. I learned assembly programming on a IIe and it was impossible to get much done without a good understanding of the ROM routines. Many subroutines had multiple entry points and many were routinely reused from different contexts, but I don't believe the Apple's ROM was any more complex than any other computer ROM of that era (or indeed any 8-bit device today).

The major stumbling block to cloning the Apple IIe without licensing the ROM code was Applesoft FP BASIC, which was implemented using a bytecoded 16-bit virtual machine interpreter called Sweet16. Approximately 5KB of Sweet16 code was included in the 16KB Apple IIe ROM. The cloners couldn't copy the Sweet16 code, nor could they figure out how to fit a non-infringing work-alike BASIC into the same address space.

There were a few other issues with the ROM code for the custom disk controller and video (in particular the hi-res graphics IIRC). Here the problem was not having Wozniak's hardware ... work-alike code for OTS video display and disk controllers couldn't be fit into the same address space.

There eventually were a number of legal Apple IIe and //c clones, but, AFAIK, all of them had much larger ROMs than the machine they copied and had to use bank-switching and address shadowing to present a compatible memory map for existing Apple programs.

YMMV but I don't consider implementing a virtual machine to be a "trick". Wozniak created Sweet16 to make programming easier ... it ultimately saved some ROM space, but it was never intended to deceive anyone ... the Sweet16 interpreter and the fact that it was used was documented in the Apple ][ Integer BASIC manual. I don't recall if Sweet16 ever was mentioned in the FP BASIC manual, but FP BASIC was an extension of Integer BASIC and so, ISTM, anyone interested should have inferred that Sweet16 had been used in FP BASIC's implementation as well.

George

George Neuner <[email protected]>:

On Mon, 20 Dec 2010 12:53:51 +0100, [email protected] (Torben Fgidius Mogensen) wrote:

>Equivalence of programs is undecidable, ...

I'm not sure that's true ... at least in theory. Turing equivalence guarantees that any program can be expressed in the lambda calculus, and the Church-Rosser theorem proves that two lambda expressions which reduce to the same expression are equivalent.

In practice, though, I agree that deciding equivalence is intractable.

>... so it is in theory possible to make a program so >obfuscated that no automatic process can recover the original.

Without meta information, such as exists in Java and .NET binaries, it is already difficult to accurately reconstruct source for any but the simplest of programs.

>But what if you know the obfuscation method? Assuming that the >obfuscation method is polynomic, deobfuscation is at worst NP-hard, so >it is decidable. But it can be so intractable that it doesn't matter.

I don't think knowing the method will help because any decent implementation likely will have multiple ways to obfuscate any particular construct and the selection will be psuedo-random.

George

Walter Banks <[email protected]>:

George Neuner wrote:

> There were a few other issues with the ROM code for the custom disk > controller and video (in particular the hi-res graphics IIRC). Here > the problem was not having Wozniak's hardware ... work-alike code for > OTS video display and disk controllers couldn't be fit into the same > address space.

Woz's disk controller was very clever. It used a simple state machine to encode the data on and off the disk. At one point I wired up a 3 1/2 drive for my own use that read and write apple ][ disks (and higher density that could not be read on an apple). At the time I was impressed how well thought out Woz's design was. In my "one of" I did something similar and used a 2708 eprom to store the jump tables for a state machine.

> YMMV but I don't consider implementing a virtual machine to be a > "trick". Wozniak created Sweet16 to make programming easier ... it > ultimately saved some ROM space, but it was never intended to deceive > anyone ... the Sweet16 interpreter and the fact that it was used was > documented in the Apple ][ Integer BASIC manual. I don't recall if > Sweet16 ever was mentioned in the FP BASIC manual, but FP BASIC was an > extension of Integer BASIC and so, ISTM, anyone interested should have > inferred that Sweet16 had been used in FP BASIC's implementation as > well.

Woz wrote a Byte article documenting the Sweet16 code and use. I don't think the FP basic used the Sweet 16 library. Sweet 16 was used in the integer basic.

He also wrote a floating point package for the apple. At the time not all 6502's had an unconditional branch which meant that Woz used conditional branches even when one was not needed. This single "feature" obfuscated the sources almost to the point of almost being unreadable.

http://www.6502.org/source/floats/wozfp3.txt

All the best of the season,

Walter Banks

glen herrmannsfeldt <[email protected]>:

Torben Fgidius Mogensen <[email protected]> wrote: (snip)

> Using jump tables and the like is, indeed, going to make > unobfuscation hard. Especially if the tables change dynamically.

Yes for automated systems, but maybe not with a human in the loop.

If, for example, the target code is an interpreter then there is likely a jump table for processing different statements. Finding that table, then, tells you more than finding a bunch of conditional branches.

> You might be able to get around this by symbolic execution: You start > with a state description which allows arbitrary values of variables.

Well, you do have to find the end of the table, which often isn't hard done by hand. (You can see when the addresses stop looking like addresses. Also, there might be a conditional test on the table offset.)

(snip) > This process is similar to online partial evaluation, which also uses > state descriptions and generalisation.

> That said, unobfuscation can never be perfect: Equivalence of programs > is undecidable, so it is in theory possible to make a program so > obfuscated that no automatic process can recover the original.

Again, it helps to have a human in the loop, especially one who know the art of the design of similar programs.

One of my first, larger, disassemblies (for personal use) was the BASIC interpreter for the TRS-80 Color Computer, MC6809 code. I was working on it for a while without finding the main loop that reads statement codes and branches. It turned out that such loop is loaded into RAM at startup, and executes there. When disassembling ROM, one doesn't always expect execution in RAM.

Anyway, I believe that mostly automated but with a human in the loop is the best way to do disassembly and deobfuscation.

-- glen

Martin Ward <[email protected]>:

On Tuesday 21 Dec 2010 at 17:25, George Neuner <[email protected]> wrote:
>
>Equivalence of programs is undecidable, ...
>
> I'm not sure that's true ... at least in theory. Turing equivalence
> guarantees that any program can be expressed in the lambda calculus,
> and the Church-Rosser theorem proves that two lambda expressions which > reduce to the same expression are equivalent.

Any program which takes no input and produces no output can only do one of two things:

(a) Terminate. (b) Not terminate.

So such a program is equivalent to either SKIP or ABORT. If this equivalence were decidable, then termination would also be decidable. But termination is undecidable (http://en.wikipedia.org/wiki/Halting_problem).

-- Martin

Hans-Peter Diettrich <[email protected]>:

Torben Fgidius Mogensen schrieb:

>> In practice such interruptions of the control flow make automatic
>> disassembling almost impossible. Instead a good *interactive*
>> disassembler is required (as I was writing when I came across above
>> tricks), and time consuming manual intervention and analysis is
>> required with almost every break in the control flow. The mix of data
>> and instructions not only makes it impossible to generate an assembler
>> listing, but also hides the use of memory locations (variables or
>> constants), with pointers embedded in the inlined parameter
>> blocks. Now tell me how a decompiler or other analysis tool should
>> deal with such constructs, when already the automatic separation of
>> code and data is impossible.
>
> Using jump tables and the like is, indeed, going to make unobfuscation
> hard. Especially if the tables change dynamically.

In the observed cases the presence of jump tables was unknown, and also the structure and size of the data block, that follows the call instruction :-(

> You might be able to get around this by symbolic execution: You start
> with a state description which allows arbitrary values of variables.

Then you'll end up with a tree of states, in the best case, and a graph (with loops and knots) in the worst case.

> But what if you know the obfuscation method? Assuming that the
> obfuscation method is polynomic, deobfuscation is at worst NP-hard, so
> it is decidable. But it can be so intractable that it doesn't matter.

It may be possible to crack algorithmic obfuscation, but odds are bad with the encountered "handmade" obfuscation. The intended (and achieved) effect was optimization (almost for smaller size), and the resulting obfuscation only was a side effect.

Even if one can produce equivalent assembler code, with some tricks (macros...) for data structures with multiple meanings (instructions in instruction arguments...), that code will remain hard to understand - and that's the primary goal of every obfuscation. Who will be able to tell the *purpose* of a state machine or other automaton, given only its implementation?

More unobfuscation problems come into mind, like the use of modified external code, maybe only different versions of (shared) standard libraries. When some code relies on the values returned from such external subroutines, and the precise implementation in a specific library version, the entire environment (version of the OS and all installed libraries) has to be taken into account.

DoDi

[Jan 16, 2006] computer intelligence assembler disassembler ciasdis tool by Albert van der Horst

This page is about how the Post-It Fix-Up principle works out in practical program code in Forth. For the impatient: jump to the downloads

Actual assemblers

Applying the Post-It Fix-Up principle to a 8086 assembler led to the discovery of problems that had to be solved. It turns out that some types of fixups better be considered not relative to the start of the instruction, but relative to the end. Otherwise there would be diffent fixups for e.g. byte/cell indication (B| X|), dependant on the length of the opcode. It is still there in the fig-forth version of the opcodes, such as B| W| besides B1| and W1| . So a new class of fixup, the "fix up's from behind" or reverse fixups were added. It turned out that other fixup's are not needed for the Intel, up to the Pentium. Other processors require fixup's with build in data. These so called data fixups are needed for the 6809 and the DEC Alpha.
A program was added that generates a PostScript file with the first byte opcodes for 8080 as well as 8086 , and the 80386 , a so called quick reference card. Comparing that to Intels documentation led to the discovery of one more bug. I had to redesign the opcodes, so other people could have trouble using this beast without such a reference card and the `SHOW: MOV|SG,' that lists all forms allowed for the move segment instruction.

Is Sony in violation of the LGPL - Programming stuff

Update: Click here
I'm sure you've already heard about the Sony rootkit that was first revealed by Mark Russinovich of Sysinternals. After the Finnish hacker Matti Nikki (aka muzzy) found some revealing strings in one of the files (go.exe) that are part of the copy protection software, the rootkit is also suspected to be in violation of the open-source license LGPL. The strings indicate that code from the open-source project LAME was used in the copy protection software in a way that's not compatible with the LGPL license which is used by LAME.
On Slashot muzzy mentioned that he doesn't have access to Sabre BinDiff, a tool that can be used to compare binary files. I was in the opposite position as I have BinDiff but I didn't have the file in question (go.exe). I mailed muzzy and he hooked me up with the file.
I compared go.exe with a VC++-compiled version of lame_enc.dll but unfortunately BinDiff didn't find a single relevant matched function. A quick manual check didn't reveal any LAME functions in go.exe either.
Even though go.exe apparently does not contain any LAME code, a considerable amount of tables and constants from the LAME source files can be found in the go.exe file. Here's a list of the LAME tables I've been able to locate. The first column shows the hex address where the table can be found in the go.exe file, the second column shows the name of the table as it appears in the LAME source code and the third column shows the LAME source file where the table can be found.
I have to add though, that not a single table actually seems to be used by the go.exe code. What does that mean? I've asked random people and I've heard speculation ranging between "accidentaly linked" and "encrypted code in go.exe that uses the tables and can't be found in the disassembler". Further analysis needs to be made but at this point I'm leaning towards more or less accidental inclusion.
Posted by sp in Misc at 11:38

Comments
Display comments as (Linear | Threaded)
The code in GO.EXE could be compiled with another compiler. In that case your comparison would probably not find a match, but it may still be there.
#1 Rhialto on Nov 14 2005, 12:41 Reply
This idea is absolutely right and I've thought about it. go.exe was apparently compiled using VC++ 7 (debug build) while lame_enc.dll was compiled using VC++ 6 (release build). That's what PEiD ( http://peid.has.it/ ) says, at least.
In the past I've succesfully used BinDiff to match functions from files compiled with gcc to functions from files compiled with VC++ though. The question is now whether VC++ 7 is so much different from VC++ 6 that BinDiff is less likely (or even unable) to match them even though code produced from VC++ 6 and gcc seem to be similar enough for BinDiff to work.
Furthermore I think the main point of importance is that the tables in go.exe are not referenced by any code (at least not in a way that a static disassembler can detect). I think the reason for this might be the solution to the entire violation question.
#1.1 sp on Nov 14 2005, 12:52 Reply
They are very different, the VC7 compiler was a complete rewrite of the VC6 compiler and has many improvements.
#1.1.1 Anonymous on Nov 15 2005, 17:03 Reply
Heya,
I'm another reverser, and I'd be interested in taking a look at this myself (I also have IDA and BinDiff)... could you possibly send me a copy of the exe?
Cheers,
Will
#2 Will Whistler on Nov 14 2005, 13:06 Reply
This raises an interesting question: are tables originating from LGPLed code enough to make the LGPL apply to the final executable, even though it might not actually use the data?
After all, the tables also have been written and are part of the source code covered by the license. I don't think copyright law would make a difference between the source for executable code and that for the data needed by that code.
#3 Arend Lammertink (http://plone.vrijschrift.org) on Nov 14 2005, 14:38 Reply
Good observation. That's actually exactly why I didn't even make an attempt to answer the question posed in the topic. I don't know enough about license and copyright issues to make an educated guess.
#3.1 sp on Nov 14 2005, 15:27 Reply
Coming to think of it, it's not surprising at all you can't find any code if you compare a dll and a static linked executable on Windows.
Windows' dlls are designed in such a way that function calls between dlls are completely different from their static equivalents. Function calls are adressed using an offset table in the dll. The caller uses special access code. That's why dlls are accompanied by "import" libraries. Every function that can be used from outside of a dll has to be "exported" using some declspec macro's. I'm sure these will also influence name mangling, etc.
To make a long story short: try comparing the executable with a static Lame library...
#4 Arend Lammertink (http://plone.vrijschrift.org) on Nov 14 2005, 15:18 Reply
I assumed that this wouldn't matter because of the level of abstraction BinDiff uses to determine whether code from two files is equal or not. The calling convention shouldn't really matter here.
But alas, assumption is the mother of all fuck-ups. So I went back to check. As I expected I don't get any results from a statically linked LAME either.
I also want to draw attention to another issue. LAME is an application that uses a lot of FPU instructions. Go.exe barely uses any.
I've created an opcode distribution list for the files lame_enc.dll and go.exe. The former uses tens of thousands of FPU instructions with fld being the 2nd most used instruction (only mov is used more often). The latter file, on the other hand, uses only a few hundred FPU instructions and there are 26 more frequently used CPU instructions before the 1st FPU instruction comes in the list.
#4.1 sp on Nov 14 2005, 20:11 Reply
What relevant parts of the LGPL would be infringed if it does contain this? The LGPL doesn't require that things that link to it also be LGPL, unlike the GPL.
#5 Nick Johnson on Nov 14 2005, 21:04 Reply
They still have to offer the source code for any LGPL code they distribute, or modify and distribute, and they still have to include an LGPL license notice. They can link to LPGL code, but they can't hide it.
#5.1 Rodrin on Nov 15 2005, 16:22 Reply
Another, perhaps more logical explanation, given the lack of substantial similarity: Perhaps the Sony software includes LAME signatures so it can detect whether a user is running LAME to encode MP3s.
#6 Ansel (http://www.anselsbrain.com/) on Nov 14 2005, 21:22 Reply
Perhaps the tables in question aren't used to execute anything, but merely to detect LAME and/or programs that use it?
#7 HyperHacker (http://hypernova.amarok-shadow.com) on Nov 15 2005, 07:19 Reply
Let me quote some comment on a slashdot story (http://yro.slashdot.org/yro/05/11/15/1250229.shtml?tid=117&tid=188&tid=17) from muzzy:
That only concerns GO.EXE, and while the analysis is correct for that executable, I checked for LAME references against every binary in the compressed XCP.DAT file after I managed to unpack it (thanks to freedom-to-tinker.com guys for providing description of the format). Turns out, there's more binaries including references to LAME, and this time there's actually code that uses the data as well. And not just LAME, there's also Id3lib included in one dll, and bladeenc and mpglib distributed along with the DRM. All of this is LGPL, it's code, and it's being used.
#8 Cone on Nov 15 2005, 15:06 Reply
Yes, this is correct. We're right now working on the new files and we've already matched code manually. We're now in the process of developing a few tools to match code automatically because there's a lot of code to match.
#8.1 sp on Nov 15 2005, 15:08 Reply
Congratulations, guys!
#8.1.1 Arend Lammertink on Nov 15 2005, 15:26 Reply
What if the tables from LAME are there, to be used to detect a LAME encoder being used on the system? ie, if you try to rip the tracks, it will see that LAME is running, and perhaps corrupt the resulting ripped file?
#9 Ed Felton on Nov 15 2005, 16:30 Reply
Wonders if go.exe makes any systems calls to register itself.
#10 hawkeyeaz1 on Nov 15 2005, 16:44 Reply

[Oct 28, 2005] Program Transformation Wiki - Decompilation Resources

plain This page contains links to projects only peripherally related to decompilation. Links to the actual decompilers are located on the DecompilationGeneralApproach, DecompilationApplicationSpecificApproach (Java, Foxbase, etc) and DecompilationCompilerSpecific pages. An attempt is being made to reduce overlap with the other pages.

freshmeat.net Project details for uncc

uncc is a C decompiler which helps reverse engineers and programmers improve their understanding of assembly code.

Homepage:
http://www.uncc.info/
Tar/GZ:
http://www.uncc.info/uncc-0.1.2.1.tar.gz

Pyreverse v. 0.2

About: pyreverse is a set of tools for reverse engineering Python code. It features dependency analysis tools, documentation generation, and XMI generation for importation in a UML modeling tool. A special module can be used to generate files readable by Argo UML.

Changes: Support was added for links between classes, exceptions raised in functions, and package diagrams. Some documentation with a full example was added.

[Dec 28, 2001] Decompilation page and link to a decompiler by Satish Kumar.

Contains a beta version of DisC - Decompiler for TurboC and a small intro to the problem of decompilation using Intel assembler fragments of small C programs as an example. Compare to Decompilation of Binary Programs - dcc

[Dec 18, 2000] Clipper and FoxPro decompilation

[Oct 10, 2000] Filter Factory Plug-in Decompiler

This little 16bit DOS program generates a ".afs" of any ".8bf" file compiled by the Filter Factory of Adobe Photoshop (PC version only).

[Sept 15, 2000] Java decompilers

Java Decompiler HOW-TO
BackToJava (BTJ) - General tool for manipulation of Java bytecode (disassembly, assembly, statistics, decompilation, debug information...), written in Java. In progress. [LGPL license]
NMI's Java Code Viewer - The user-friendly Java decompiler ...
Ahpah Software | Makers of the SourceAgain Java Decompiler
java decompiler and java disassembler - ClassCracker
Introduction to WingDis 2.15 A Java Decompiler Version 2.15:
Javacafe.Com - Java Decompiler
DJ Java Decompiler
Decafe Pro - The Java decompiler
dis - a Fast Java Disassembler, NOT a Java Decompiler

C. Cifuentes and K. Gough. Decompilation of binary programs. Software -- Practice and Experience, 25(7):811--829, July 1995.

C. Cifuentes, Interprocedural data-flow decompilation. Journal of Programming Languages Vol 4 No 2 (1996) pp 77--99

Of Graphs and Coffi Grounds: Decompiling Java - Patrick Lam (1998)

......expressions, must be combined with the control-flow restructuring. Finally, phase III of decompilation must be implemented; we must eliminate goto for arbitrary control-flow graphs. This will probably be done using work previously done with the Compilers and Concurrency group, described in [Ero95] [EH94]. 6 Summary This paper discusses techniques used in the control-flow structuring phase of the Sculpt decompiler. In particular, the three main phases of control-flow structuring were described. Stage I recognized simple structures in the control-flow graph which could be locally reduced; among ......
[EH94] Ana M. Erosa and Laurie J. Hendren. Taming control flow: A structured approach to eliminating goto statements. In Proceedings of the 1994 International Conference on Computer Languages, pages 229--240, May 1994.

Making Graphs Reducible with Controlled Node Splitting - Johan Janssen (1997)

......control flow graphs is the fact that data flow analysis, that is an essential part of any compiler, can be done more efficiently [3]. Related work: The problem of converting irreducible flow graphs to reducible flow graphs can be tackled at the front-end or at the back-end of the compiler. In [4] and [5] methods for normalizing the control flow graph of a program at the front-end are given. These methods rewrite an intermediate program in a normalized form. During normalization irreducible flow graphs are converted to reducible ones. To make a graph reducible, code has to be duplicated, ......
[4] Ana M. Erosa and Laurie J. Hendren. Taming control flow: A structured approach to eliminating goto statements. In Proceedings of the 1994 International Conference on Computer Languages, pages 229--240, Toulouse, France, May 1994.

GOTO Removal Based On Regular Expressions - Paul Morris Ronald

......our GOTO removal program served as a bridge to extend the range of the software. We have also applied this work to the process of "levitating" IBM 370 assembly language code to higher-level forms (Morris et al., 1996). Previous work (Ashcroft et al., 1972, Peterson et al., 1973, Ramshaw, 1988, Erosa et al., 1994), while highly developed and abstract in nature, treats GOTO removal as an isolated problem in the area of programming languages. This paper is based on the observation that GOTO removal for flowcharts resembles the problem of converting finite-state transition networks to regular expressions, and ......
......preferred over maintaining the existing structure. The algorithm presented here is applicable to nonreducible programs, and meets the lesser path-equivalence standard. Subjectively, this standard appears to produce more readable results than approaches that introduce auxiliary variables, such as (Erosa et al., 1994), and is easier to relate to the source presented for restructuring. These considerations are important for reverse-engineering. While the algorithm discussed in this paper has worked well in practical contexts, we feel its most significant feature is the link to important concepts in computer ......
Erosa, A. M. and Hendron, L. J. (1994). `Taming Control Flow: A structured Approach to Eliminating Goto Statements,' Proc. IEEE International Conference on Computer Languages, 229--240.

Decompiler - Wikipedia
Mocha (decompiler) - Wikipedia
Overview of decompilation techniques
Centre for Software Maintenance
- The Decompilation Page
- The Ethics of Decompiling
- Disassembling
Decompilation of Binary Programs - dcc
Decompilation
History of Decompilation
Decompilation Advice
Decompilation
A Methodology for Decompilation - Cifuentes, Gough (ResearchIndex)
Decompilation- the Enumeration of Types and Grammars - Breuer, Bowen (
White Paper- SourceAgain and Java Decompilation
GeeK- decompilation
Educom Review Decompilation as a "Fair Use" of Copyrighted Programs
Christian Collberg's and Software Watermarking and Obfuscation Page
ip13098
Instruction-Set Simulation and Tracing
Bibliogrphy
**** Decompilation of Binary Programs - dcc

The dcc decompiler decompiles .exe files from the (i386, DOS) platform to C programs. The final C program contains assembler code for any subroutines that are not possible to be decompiled at a higher level than assembler.

The analysis performed by dcc is based on traditional compiler optimization techniques and graph theory. The former is capable of eliminating registers and intermediate instructions to reconstruct high-level statements; the later is capable of determining the control structures in each subroutine.

Please note that at present, only C source is produced; dcc cannot (as yet) produce C++ source.

The structure of a decompiler resembles that of a compiler: a front-, middle-, and back-end which perform separate tasks. The front-end is a machine-language dependent module that reads in machine code for a particular machine and transforms it into an intermediate, machine-independent representation of the program. The middle-end (aka the Universal Decompiling Machine or UDM) is a machine and language independent module that performs the core of the decompiling analysis: data flow and control flow analysis. Finally, the back-end is high-level language dependent and generates code for the program (C in the case of dcc).

In practice, several programs are used with the decompiler to create the high-level program. These programs aid in the detection of compiler and library signatures, hence augmenting the readability of programs and eliminating compiler start-up and library routines from the decompilation analysis.
SAS2000: Efficient Inference of Static Types for Java Bytecode back

Authors: Etienne Gagnon, Laurie Hendren and Guillaume Marceau
Date: January 2000

Abstract
Even though Java bytecode has a significant amount of type information embedded in it, there are no explicit types for local variables. However, knowing types for local variables is very useful for both program optimization and decompilation. In this paper, we present an efficient and practical algorithm for inferring static types for local variables in a 3-address, stackless, representation of Java bytecode.

By decoupling the type inference problem from the low level bytecode representation, and abstracting it into a constraint system, we show that there exists verifiable bytecode that cannot be statically typed. Further, we show that, without transforming the program, the static typing problem is NP-hard. In order to develop a practical approach we have developed an algorithm that works efficiently for the usual cases and then applies efficient program transformations to simplify the hard cases.

Our solution is an multi-stage algorithm. In the first stage, we propose an efficient algorithm that infers static types for most bytecode found in practice. In case this stage fails, the second stage is applied. It consists of a simple and efficient variable splitting operation that renders most bytecode typeable using the algorithm of stage one. Finally, for completeness of the algorithm, we present a final stage that efficiently transforms and infers types for all remaining bytecode (such bytecode is likely to be a contrived example, and not code produced from a compiler).

We have implemented this algorithm in the Soot framework. Our experimental results show that all of the 17,000 methods used in our tests were successfully typed, 99.8% of those required only the first stage, 0.2% required the second stage, and no methods required the third stage.

View the paper (.ps) Download the paper (.ps.gz)

Krakatoa: Decompilation in Java (Does Bytecode Reveal Source?) - Todd A. Proebsting, Scott A..

Abstract: This paper presents our technique for automatically decompiling Java bytecode into Java source. Our technique reconstructs source-level expressions from bytecode, and reconstructs readable, high-level control statements from primitive goto- like branches. Fewer than a dozen simple coderewriting rules reconstruct the high-level statements. 1 Introduction Decompilation transforms a low-level language into a high-level language. The Java Virtual Machine (JVM) specifies a low-level bytecode language for a stack-based machine [LY97]. This language defines 203 operators, with most of the control flow specified by simple explicit transfers and labels. Compiling a Java class yields a class file that contains type information and bytecode. The JVM requires a significant amount of type... (Correct Abstract)

......since the output of the decompiler should be as readable as possible. Her technique structures old FORTRAN programs for readability. As a result, her technique may leave some goto's in the resulting programs, which is not allowed in Java. Other techniques for eliminating goto's have been proposed [EH94, Amm92, AKPW83, AM75]. These techniques may change the structure of the program, and may add condition variables, or create subroutines. 8 Conclusion In this paper, we present a technique for decompiling Java bytecode into Java source. Our decompiler, Krakatoa, produces syntactically legal Java source from legal, ......
[EH94] Ana M. Erosa and Laurie J. Hendren. Taming control flow: A structured approach to eliminating goto statements. pages 229--240. International Conference on Computer Languages, May 1994.

Decompilers

DeCompilers
REVEAL.ORG - First Online Java Class Decompiler and Obfuscator
- Reveal It. Verson 1.0
Java Code Engineering engineer & reverse engineer Java class files (Meurrens 980201) -- (Java bytecode, decompilers, obfuscators, etc)
Decompiling Java Code
Home Page of Jad - the fast Java decompiler
Comparison between various Java decompilers
Todd Proebsting: Decompilation in Java (Does Bytecode Reveal Source?)

Selected open source projects

Boomerang

This is an open source decompiler, with several front ends (two are well developed) and a C back end. It uses an internal representation based on the Static Single Assignment form, and pioneers dataflow-based type analysis. At the time of writing, it is still limited to quite small (toy) binary programs.

Desquirr

This is an IDA Pro Plugin, written by David Eriksson as part of his Master's thesis. It decompiles one function at a time to the IDA output window. While not intended to be a serious decompiler, it illustrates what can be done with the help of a powerful disassembler and about 5000 lines of C++ code. Because a disassembler does not carry semantics for machine instructions, each supported processor requires a module to decode instruction semantics and addressing modes. The X86 and ARM processors are supported. Conditionals and loops are emitted as gotos, there is some simple switch analysis, and some recovery of parameters and returns is implemented.

Legal Aspects

Samuelson - Fair Use for Computer Programs and Other Copyrightable Works in Digital Form The Implications of Sony, Galoob and Sega
TECHNOLOGICAL PROTECTION FOR COPYRIGHTED WORKS by Pamela Samuelson and Benjamin D. Black
E Law Reverse Engineering in Japan and the Global Trend Towards Interoperability
FOR EDUCATIONAL USE ONLY: *479 DRAWING THE BOUNDARY BETWEEN COPYRIGHT AND CONTRACT: COPYRIGHT PREEMPTION OF SOFTWARE LICENSE TERMS by Maureen A. O'Rourke [FNd1]
Cohen, Reverse Engineering and the Rise of Electronic Vigilantism: Intellectual Property Implications of "Lock-Out" Programs
REVERSE ENGINEERING SOFTWARE IS IT SAFE
Reverse Engineering Software to Achieve Compatibility is Fair Use (January 1993)
WIPO-INT-SIN-98-6
SISASubmission1
Decompilation-Disassemblers for DU
E Law Subject - Decompilation
6.6 Reproduction or Decompilation of Copyrighted Computer Programs
cni-copyright- USTR on decompilation of computer programs
cni-copyright- "That Delicate Balance" and decompilation
DECOMPILATION AND COPYING OF COMPUTER PROGRAMS WITHOUT AUTHORISATION OF THE OWNER

Java decompilers

Ir. Marc W.F. Meurrens: "Java Bytecode Engineering and Reverse Engineering" -- very interesting page [Link updated Jan 16, 2000]
Java Decompiler HOW-TO
BackToJava (BTJ) - General tool for manipulation of Java bytecode (disassembly, assembly, statistics, decompilation, debug information...), written in Java. In progress. [LGPL license]
NMI's Java Code Viewer - The user-friendly Java decompiler ...
Ahpah Software | Makers of the SourceAgain Java Decompiler
java decompiler and java disassembler - ClassCracker
Introduction to WingDis 2.15 A Java Decompiler Version 2.15:
Javacafe.Com - Java Decompiler
DJ Java Decompiler
Decafe Pro - The Java decompiler
dis - a Fast Java Disassembler, NOT a Java Decompiler

Clipper and FoxPro decompilation

History

SUBJECT-TABLE.html

Structuring Decompiled Graphs - Cifuentes (ResearchIndex)

Context 1. F.E. Allen. Control flow analysis. SIGPLAN Notices, 5(7):1--19, July 1970.

Context 2. F.E. Allen. A basis for program optimization. In Proc. IFIP Congress, pages 385--390, Amsterdam, Holland, 1972. North-Holland Pub.Co.

Context 3. F.E. Allen. Interprocedural data flow analysis. In Proc. IFIP Congress, pages 398--402, Amsterdam, Holland, 1974. North-Holland Pub.Co.

Context 4. F.E. Allen and J. Cocke. Graph theoretic constructs for program control flow analysis. Technical Report RC 3923 (No. 17789), IBM, Thomas J. Watson Research Center, Yorktown Heights, New York, July 1972.

Context 5. F.E. Allen and J. Cocke. A program data flow analysis procedure. Communications of the ACM, 19(3):137--147, March 1976.

Details Context 6. Z. Ammarguellat. A control-flow normalization algorithm and its complexity. IEEE Transactions on Software Engineering, 18(3):237--251, March 1992.

Context 7. B.S. Baker. An algorithm for structuring flowgraphs. Journal of the ACM, 24(1):98--120, January 1977.

Context 8. C. Bohm and G. Jacopini. Flow diagrams, Turing machines and languages with only two formation rules. Communications of the ACM, 9(5):366--371, May 1966.

Details Context 9. C. Cifuentes. Reverse Compilation Techniques. PhD dissertation, Queensland University of Technology, School of Computing Science, July 1994.

Details Context 10. C. Cifuentes. Interprocedural dataflow decompilation. In print: Journal of Programming Languages, 1996.

Context 11. C. Cifuentes and K.J. Gough. Decompilation of binary programs. Software -- Practice and Experience, 25(7):811--829, July 1995.

Context 12. J. Cocke. Global common subexpression elimination. SIGPLAN Notices, 5(7):20-- 25, July 1970.

Details Context 13. A.M. Erosa and L.J. Hendren. Taming control flow: A structured approach to eliminating goto statements. In Proceedings of the International Conference on Computer Languages, Universit'e Paul Sabatier, Toulouse, France, May 1994. IEEE Computer Society.

Context 14. M.S. Hecht. Flow Analysis of Computer Programs. Elsevier North-Holland, Inc, 52 Vanderbilt Avenue, New York, New York 10017, 1977.

Context 15. B.C. Housel. A Study of Decompiling Machine Languages into High-Level Machine Independent Languages. PhD dissertation, Purdue University, Computer Science, August 1973.

Context 16. G.L. Steele Jr. and G.J. Sussman. Design of a LISP-based microprocessor. Communications of the ACM, 23(11):628--645, November 1980.

Context 17. D.E. Knuth and R.W. Floyd. Notes on avoiding go to statements. Information Processing Letters, 1(1):23--31, 1971.

Context 18. S.R. Kosaraju. Analysis of structured programs. Journal of Computer and System Sciences, 9(3):232--255, 1974.

Context 19. U. Lichtblau. Decompilation of control structures by means of graph transformations. In Proceedings of the International Joint Conference on Theory and Practice of Software Development (TAPSOFT), Berlin, 1985.

Context 20. U. Lichtblau. Recognizing rooted context-free flowgraph languages in polynomial time. In G. Rozenberg H. Ehrig, H.J. Kreowski, editor, Graph Grammars and their application to Computer Science, number 532 in Lecture Notes in Computer Science, pages 538--548. Springer-Verlag, 1991.

Details Context 21. J. McCarthy. Recursive functions of symbolic expressions and their computation by machine, part I. Communications of the ACM, 3(4):184--195, April 1960.

Context 22. G. Oulsnam. Unravelling unstructured programs. The Computer Journal, 25(3):379--387, 1982.

Context 23. D.J. Pavey and L.A. Winsborrow. Demonstrating equivalence of source code and PROM contents. The Computer Language, 36(7):654--667, 1993.

Context 24. L. Ramshaw. Eliminating go to's while preserving program structure. Journal of the ACM, 35(4):893--920, October 1988.

Context 25. M. Sharir. Structural analysis: A new approach to flow analysis in optimizing compilers. Computer Languages, 5:141--153, 1980.

Context 26. R.L. Sites, A. Chernoff, M.B. Kirk, M.P. Marks, and S.G. Robinson. Binary translation. Communications of the ACM, 36(2):69--81, February 1993.

Context 27. M.H. Williams. Generating structured flow diagrams: the nature of unstructuredness. The Computer Journal, 20(1):45--50, 1977.

Context 28. M.H. Williams and G.Chen. Restructuring Pascal programs containing goto statements. The Computer Journal, 28(2):134--137, 1985.

Context 29. M.H. Williams and H.L. Ossher. Conversion of unstructured flow diagrams to structured form. The Computer Journal, 21(2):161--167, 1978. This article was processed using the L A T E X macro package with LLNCS style

Context [AKPW83] J.R. Allen, Ken Kennedy, Carrie Porterfield, and Joe Warren. Conver- sion of control dependence to data dependence. pages 177--189, 1983.

Context [AM75] E. Ashcroft and Z. Manna. Translating programs schemas to while-schemas. SIAM Journal of Computing, 4(2):125-- 146, 1975.

Details Context [Amm92] Zahira Ammarguellat. A control-flow normalization algorithm and its complexity. IEEE Transactions on Software Engineering, 18(2):237--250, 1992.

Context [ASU86] A. V. Aho, R. Sethi, and J. D. Ullman. Compilers: Principles, Techniques, and Tools. Addison-Wesley, Reading, Massachusetts, 1986.

Context [Bak77] Brenda S. Baker. An algorithm for structuring flowgraphs. Journal of the Association for Computing Machinery, 24(1):98--120, January 1977.

Details Context [Cif93] Cristina Cifuentes. A structuring algorithm for decompilation. In Proceedings of the XIX Conferencia Latinoamericana de Informatica, pages 267--276, Buenos Aires, Argentina, August 1993.

Details Context [EH94] Ana M. Erosa and Laurie J. Hendren. Taming control flow: A structured approach to eliminating goto statements. pages 229--240. International Conference on Computer Languages, May 1994.

Context [LY97] Tim Lindholm and Frank Yellin. The Java Virtual Machine Specification. The Java Series. Addison-Wesley, 1997.

Context [PKT73] W.W. Peterson, T. Kasami, and N. Tokura. On the capabilities of while, repeat and exit statements. Communications of the ACM, 16(8):503--512, 1973.

Context [Ram88] Lyle Ramshaw. Eliminating go to's while preserving program structure. Journal of the Association for Computing Machinery, 35(4):893--920, October 1988.

Context [Sri96] KB Sriram. Hashjava. url: http://www.sbktech.org/hashjava.html, 1996.

Context [vV96] Hanpeter van Vliet. Mocha. current url: http://www.brouhaha.com/~eric/ computers/mocha-b1.zip, 1996.

Context [Wil77] M.H. Williams. Generating structured flow diagrams: The nature of unstructuredness. Computer Journal, 20(1):45--50, 1977.

Context [Wil97] U. G. Wilhelm. Cryptographically protected objects, May 1997. A french version appeared in the Proceedings of RenPar'9, Lausanne, CH. http://lsewww.epfl.ch/~wilhelm/ CryPO.html.

Context [WO78] M.H. Williams and H.L. Ossher. Conversion of unstructured flow diagrams to structured. Computer Journal, 21(2):161--167, 1978. A Additional Rewriting Rules We anticipate using a few other tree rewriting rules that might improve readability of our code. The anticipated rules build more natural for-loops.

Etc

Society

Groupthink : Two Party System as Polyarchy : Corruption of Regulators : Bureaucracies : Understanding Micromanagers and Control Freaks : Toxic Managers : Harvard Mafia : Diplomatic Communication : Surviving a Bad Performance Review : Insufficient Retirement Funds as Immanent Problem of Neoliberal Regime : PseudoScience : Who Rules America : Neoliberalism : The Iron Law of Oligarchy : Libertarian Philosophy

Quotes

War and Peace : Skeptical Finance : John Kenneth Galbraith :Talleyrand : Oscar Wilde : Otto Von Bismarck : Keynes : George Carlin : Skeptics : Propaganda : SE quotes : Language Design and Programming Quotes : Random IT-related quotes : Somerset Maugham : Marcus Aurelius : Kurt Vonnegut : Eric Hoffer : Winston Churchill : Napoleon Bonaparte : Ambrose Bierce : Bernard Shaw : Mark Twain Quotes

Bulletin:

Vol 25, No.12 (December, 2013) Rational Fools vs. Efficient Crooks The efficient markets hypothesis : Political Skeptic Bulletin, 2013 : Unemployment Bulletin, 2010 : Vol 23, No.10 (October, 2011) An observation about corporate security departments : Slightly Skeptical Euromaydan Chronicles, June 2014 : Greenspan legacy bulletin, 2008 : Vol 25, No.10 (October, 2013) Cryptolocker Trojan (Win32/Crilock.A) : Vol 25, No.08 (August, 2013) Cloud providers as intelligence collection hubs : Financial Humor Bulletin, 2010 : Inequality Bulletin, 2009 : Financial Humor Bulletin, 2008 : Copyleft Problems Bulletin, 2004 : Financial Humor Bulletin, 2011 : Energy Bulletin, 2010 : Malware Protection Bulletin, 2010 : Vol 26, No.1 (January, 2013) Object-Oriented Cult : Political Skeptic Bulletin, 2011 : Vol 23, No.11 (November, 2011) Softpanorama classification of sysadmin horror stories : Vol 25, No.05 (May, 2013) Corporate bullshit as a communication method : Vol 25, No.06 (June, 2013) A Note on the Relationship of Brooks Law and Conway Law

History:

Fifty glorious years (1950-2000): the triumph of the US computer engineering : Donald Knuth : TAoCP and its Influence of Computer Science : Richard Stallman : Linus Torvalds : Larry Wall : John K. Ousterhout : CTSS : Multix OS Unix History : Unix shell history : VI editor : History of pipes concept : Solaris : MS DOS : Programming Languages History : PL/1 : Simula 67 : C : History of GCC development : Scripting Languages : Perl history : OS History : Mail : DNS : SSH : CPU Instruction Sets : SPARC systems 1987-2006 : Norton Commander : Norton Utilities : Norton Ghost : Frontpage history : Malware Defense History : GNU Screen : OSS early history

Classic books:

The Peter Principle : Parkinson Law : 1984 : The Mythical Man-Month : How to Solve It by George Polya : The Art of Computer Programming : The Elements of Programming Style : The Unix Hater’s Handbook : The Jargon file : The True Believer : Programming Pearls : The Good Soldier Svejk : The Power Elite

Most popular humor pages:

Manifest of the Softpanorama IT Slacker Society : Ten Commandments of the IT Slackers Society : Computer Humor Collection : BSD Logo Story : The Cuckoo's Egg : IT Slang : C++ Humor : ARE YOU A BBS ADDICT? : The Perl Purity Test : Object oriented programmers of all nations : Financial Humor : Financial Humor Bulletin, 2008 : Financial Humor Bulletin, 2010 : The Most Comprehensive Collection of Editor-related Humor : Programming Language Humor : Goldman Sachs related humor : Greenspan humor : C Humor : Scripting Humor : Real Programmers Humor : Web Humor : GPL-related Humor : OFM Humor : Politically Incorrect Humor : IDS Humor : "Linux Sucks" Humor : Russian Musical Humor : Best Russian Programmer Humor : Microsoft plans to buy Catholic Church : Richard Stallman Related Humor : Admin Humor : Perl-related Humor : Linus Torvalds Related humor : PseudoScience Related Humor : Networking Humor : Shell Humor : Financial Humor Bulletin, 2011 : Financial Humor Bulletin, 2012 : Financial Humor Bulletin, 2013 : Java Humor : Software Engineering Humor : Sun Solaris Related Humor : Education Humor : IBM Humor : Assembler-related Humor : VIM Humor : Computer Viruses Humor : Bright tomorrow is rescheduled to a day after tomorrow : Classic Computer Humor

The Last but not Least Technology is dominated by two types of people: those who understand what they do not manage and those who manage what they do not understand ~Archibald Putt. Ph.D

Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.

This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contain some broken links as it develops like a living tree...

You can use PayPal to to buy a cup of coffee for authors of this site

Disclaimer:

The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the Softpanorama society. We do not warrant the correctness of the information provided or its fitness for any purpose. The site uses AdSense so you need to be aware of Google privacy policy. You you do not want to be tracked by Google please disable Javascript for this site. This site is perfectly usable without Javascript.

Last modified: June 13, 2018