Softpanorama
(slightly skeptical) Open Source Software Educational Society

May the source be with you, but remember the KISS principle ;-)

Google   


Finding Dominators in Directed Graphs

Old News See Also Recommended Books Recommended Links Algorithms Implementations Minimal spanning trees Humor Etc

Dominator - Wikipedia, the free encyclopedia

Dominance was first introduced by Prosser in a 1959 paper on analysis of flow diagrams. Prosser did not present an algorithm for computing dominance, which had to wait ten years for Lowry and Medlock. Ron Cytron rekindled in the interest in dominance in 1989 when he applied it to efficient computation of φ functions, which are used in static single assignment form.


Notes:
  • Those pages are written by people for whom English is not a native language. Some amount of grammar and spelling errors should be expected.
  • This is a Spartan WHYFF (We Help You For Free) site. It cannot replace the best teachers and the best books.
  • The site contain some obsolete pages as it develops like a living tree... Some links on older pages are broken. Please try to use Google, Open directory, etc. to find a replacement link (see HOWTO search the WEB for details). We would appreciate if you can mail us a correct link.

Search Amazon by keywords:

Google   
Open directory

Research Index

 

Old News ;-)

[PDF] Microsoft PowerPoint - dfa3

File Format: PDF/Adobe Acrobat - View as HTML
Dominators” needed to explain. reducibility. In reducible flow graphs, loops are well. defined, retreating edges are unique. (and called “back” edges). ...
suif.stanford.edu/~courses/cs243/lectures/dfa3.pdf - Similar pages

[PPT] M.Sc. of Software Engineering for the e-Economy CO7206 System ...

File Format: Microsoft Powerpoint - View as HTML
Dominators / Post-dominators; Control-dependence Analysis. Data Flow Analysis; Program and System Dependence Graphs; Slicing. Program Analysis Outline ...
www.cs.le.ac.uk/~mer14/week5-2.ppt - Similar pages
 

Doug Simon Structuring Assembly Programs

Doug's 1997 Honours thesis, under the supervision of Cristina Cifuentes, involved the creation of a new structuring algorithm to structure assembly-based program flow graphs.  This new algorithm was prompted as a way to simplify the expensive data structures and analysis required by the algorithms used in the dcc project.

Doug implemented the algorithms used by Cristina in the dcc project (see Structuring Decompiled Graphs or Chapter 6 of the PhD thesis Reverse Compilation Techniques) as well as the new, simplified, structuring algorithms we devised in 1997.  Both algorithms are compared in a tool called AST (assembly structuring tool) for SPARC assembly programs.

Abstract

An automated assembly to high level language (e.g. C) translator would be of great benefit to maintainers of legacy assembly programs. A component of such a translator would derive higher level constructs such as conditional, loops and sequences from the original source. This can be done by applying an algorithm to structure the control flow graph of the assembly source. This is commonly referred to as a structuring algorithm.

Most existing techniques for control flow structuring not easily adapted to structuring assembly code. During research into the feasibility of a binary decompiler, an algorithm more suited to the task of structuring assembly code was developed. This seminar focuses on a number of modifications to this algorithm that were developed as part of an honour's thesis.  In particular we discuss techniques for generating `better' high level code from unstructured assembly control flow graphs. Unstructured graphs are those that will require the use of gotos in the high level code. The quality of the code can be measured by the number of gotos generated.

In addition to our attempt to improve the functionality of the original algorithm, we also seek to improve the runtime speed of the translator by investigating less expensive data structures. Ideally, any improvement in speed will not involve a trade off with functionality.

We present the results of a prototype assembly structuring tool. The results were generated by running the tool on a number of well known, large benchmark programs used by those in the compiler community. From these results, we discuss what combination of improvements in speed and functionality are optimal in some well defined sense.

Code

The assembly structuring tool, AST, is distributed under a BSD license and it is (C) Doug Simon, 1997.

The code base was last compiled on a SPARC Solaris platform with gcc 3.1 on 1st March 2002. 

Two versions of the AST tool can be built: ast and ast_old.  The ast tool implements the new algorithms, based on dominators and post-dominators of a graph's underlying tree.  The ast_old tool implements the algorithms used in the dcc project, which are based on intervals and the derived sequence of graphs.  The Makefile provided in the distribution compiles the ast tool, to compile the ast_old tool, you need to uncomment some lines in the Makefile.

Code base: ast.tar.gz  (49 KB)

[PPT] A New Simpler Linear Time Dominators Algorithm

File Format: Microsoft Powerpoint 97 - View as HTML
Use augmented graphs of microtrees to summarize the arcs from G into the trees; determine internal immediate dominators (ie idoms residing inside a ...
www.cs.wustl.edu/~raa4/projects/bkrw/cs531bkrw-2.PPT - Similar pages

[PDF] Finding Dominators in Practice

File Format: PDF/Adobe Acrobat - View as HTML
dominators in the graph obtained from G by reversing all arc orientations. ... memory and compute dominators for each graph in sequence, measuring the total ...
www.daimi.au.dk/~loukas/dominators_esa04.pdf - Similar pages

[PDF] Finding Dominators Revisited

File Format: PDF/Adobe Acrobat - View as HTML
Algorithm for finding dominators in reducible graphs [1]. Lemma 5.1. For any vertex w = r, idom(w) is the. nearest common ancestor of sdom(w) and parent(w) ...
www.daimi.au.dk/~loukas/dominators_soda04.pdf - Similar pages
 

Dominators Dominators at AT&T Labs Research

In the summer of 1998, I worked with Anne Rogers and Adam Buchsbaum at AT&T Labs Research. We studied three major algorithms used to find important vertices called dominators in flowgraphs (directed, rooted graphs). A node d dominates n if every path from the root to n includes d. Dominator analysis is an important problem, with applications ranging from optimizing compilers to network flow analysis.

Programmers primarily use one of three algorithms to find dominators in flowgraphs; these are Purdom-Moore, Bit Vector, and Lengauer-Tarjan. On graphs with n vertices and m edges, the respective running times for each algorithm are O(m n), O(m n^2), and O(m alpha(m,n)) (where alpha is the inverse Ackermann function, which grows extremely slowly). When Lengauer and Tarjan introduced their algorithm in a 1979 paper, they used randomly generated flowgraphs as input to compare the actual running times for each of these algorithms. In their paper, they found the Lengauer-Tarjan algorithm showed better performance, even for small graphs, reporting that their algorithm out-performed Purdome-Moore for all test graphs larger than twelve nodes, and it outperformed Bitvector for all test graphs.

Over the summer, I explored whether modern machines with caches exhibit different results and whether using compiler-generated flowgraphs as input affects runtime. I implemented Purdom Moore, and Bit Vector and modified an existing Lengauer-Tarjan implementation. We collected running times for all three algorithms on two classes of graphs: random graphs and flowgraphs generated using the MACHINE SUIF compiler (from Harvard/Stanford) on programs from the SPEC benchmarks. In my analysis, I found that for both random and compiler generated graphs, Lengauer-Tarjan still proved fastest for large graphs. However, in the experiment, Purdom-Moore lost ground, moving the crossover point by Lengauer-Tarjan to seven nodes. We also found the bit vector algorithm does compete with Lengauer-Tarjan for small graphs. Therefore, the conclusions of the original paper still stand, even with modern machine caches and compiler generated graphs.

Although this work did not overturn existing ideas about dominator algorithms, I gained a great deal from the experience. On my return to school, I was able to leverage my experience with dominators in a class project where I used Visual C++ and MFC to create a Windows visualization tool for animating graph algorithms. The program allows users to create directed graphs and monitor the progress of various algorithms on them, including Purdom-Moore. My experience with dominators was more than just research; it was a unique opportunity for me to gain confidence in my research skills, to work with talented researchers, and to bring my summer experience to the classroom.

 

Recommended Links

Dominator - Wikipedia, the free encyclopedia

[PDF] Finding Dominators in Practice

File Format: PDF/Adobe Acrobat - View as HTML
method. In Proceedings of the 25th International Symposium on Microarchitecture,. pages 260–263, 1992. 23. RE Tarjan. Finding dominators in directed graphs. ...
www.daimi.au.dk/~loukas/dominators_esa04.pdf - Similar pages

 

Stephen Alstrup, Dov Harel, Peter W. Lauridsen and Mikkel Thorup.
Dominators in linear time.
Vol. 28(6), 2117-2132,1999, Siam Journal on Computing.

A New, Simpler Linear-Time Dominators Algorithm - Buchsbaum (ResearchIndex)

Abstract: this article is organized as follows. Section 2 outlines Lengauer and Tarjan's approach. Section 3 gives a broad overview of our algorithm and differentiates it from previous work. Section 4 presents our algorithm in detail, and Section 5 analyzes its running time. Section 6 presents our new path-compression result, on which the analysis relies. Section 7 describes our implementation, and Section 8 reports experimental results. We conclude in Section 9. (Update)

Dominators in Linear Time - Alstrup, Harel, Lauridsen, Thorup (ResearchIndex) Dominators in Linear Time (1996)  (Make Corrections)  (14 citations)
Stephen Alstrup, Peter W. Lauridsen, Mikkel Thorup DIKU technical report

Dominators in Linear Time - Alstrup, Harel, Lauridsen, Thorup (1999)   (14 citations)  (Correct)

.... and program optimization [2, 3, 4, 5, 10, 15] The problem was first raised in 1969 by Lowry and Medlock [15] where an O(n 4 ) algorithm for the problem was proposed (as usual, n is the number of nodes and m the number of edges in a graph) The result has been improved several times (see e.g. [1, 2, 17, 20]) and in 1979 an O(mff(m; n) algorithm was found by Lengauer and Tarjan [14] Finally, at STOC 85, Dov Harel [11] announced a linear time algorithm. Based on Harel s result, linear time algorithms have been found for many other problems (see e.g. 4, 5, 10] Harel s description was, however, ....

R.E. Tarjan. Finding dominators in directed graphs. SIAM Journal on Computing, 3(1):62--89, 1974.

Notes on Graph Algorithms Used in Optimizing Compilers - Offner (ResearchIndex)

Notes on Graph Algorithms Used in Optimizing Compilers (1995)  (Make Corrections)  (2 citations)

A New, Simpler Linear-Time Dominators Algorithm - Adam Buchsbaum Haim (1998)   (1 citation)  (Correct)

....theory, with crucial applications in global flow analysis and program optimization [2, 7, 9, 15] Lorry and Medlock [15] introduced an O(n 4 ) time algorithm, where n = jV j and m = jAj, to find all the immediate dominators in a flowgraph. Successive improvements to this time bound were achieved [2, 20,22], culminating in Lengauer and Tarjan s [13] O(mff(m; n) time algorithm; ff is the standard functional inverse of the Ackermann function and is slightly superlinear [25] The Lengauer Tarjan (LT) algorithm is extremely efficient in practice. Reducing the asymptotic time complexity of finding ....

An O(|V|*|E|) Algorithm for Finding Immediate Multiple-Vertex Dominators (1996) - Alstrup, Clausen.. (1996)   (Correct)

Abstract: We present an O(jV j jEj) algorithm for finding immediate multiple-vertex dominators in a graph with vertices V and edges E. 1 Introduction. Finding dominators in a graph has been investigated in many papers [6, 7, 8, 9, 10] in connection with global flow analysis and program optimization. Recently Gupta extended the problem to find generalized dominators [4, 5], which can be used for e.g. propagating loop invariant statements out of the loop in cases, where no single vertex dominates the... (Update)

Generalized Dominators for Structured Programs - Alstrup, Lauridsen, Thorup (1996)   (Correct)

Abstract: . Recently it has been discovered that control flow graphs of structured programs have bounded treewidth. In this paper we show that this knowledge can be used to design fast algorithms for control flow analysis. We give a linear time algorithm for the problem of finding the immediate multiple-vertex dominator set for all nodes in a control flow graph. The problem was originally proposed by Gupta (Generalized dominators and post-dominators, ACM Symp. on Principles of Programming Languages,... (Update)

Algorithms Implementations

[patch] new dominance calculator

+/* Compute dominator relationships using new flow graph structures in + nearly linear time. */ void compute_flow_dominators (dominators, post_dominators) + ...
gcc.gnu.org/ml/gcc-patches/2000-06/msg00794.html - 8k - Cached - Similar pages
 

soot.toolkits.graph (Soot API)

DominatorsFinder, General interface for a dominators analysis. ExceptionalGraph, Defines the interface for navigating a control flow graph which ...
www.sable.mcgill.ca/soot/doc/ soot/toolkits/graph/package-summary.html - 21k - Cached - Similar pages

DominatorTree (Soot API)

protected DominatorsFinder · dominators. protected HashMap · godeToDode "gode" is a node in the original graph, "dode" is a node in the dominator tree. ...
www.sable.mcgill.ca/soot/doc/ soot/toolkits/graph/DominatorTree.html - 31k - Cached - Similar pages
 

[PDF] The Machine-SUIF Control Flow Analysis Library

File Format: PDF/Adobe Acrobat - View as HTML
Call find dominators to compute the dominator relation in the forward graph. Then access this infor-. mation through the following methods. immed dom(n) ...
www.eecs.harvard.edu/hube/software/nci/cfa.pdf - Similar pages

 

Dominators (joeq 1.0 API)

dom_nav. public static final jwutil.graphs.Navigator dom_nav ... public void calculateDominanceFrontier(Dominators.DominatorNode tree) ...
joeq.sourceforge.net/apidocs/ joeq/Compiler/Quad/Dominators.html - 29k -
Cached - Similar pages

Doug Simon Structuring Assembly Programs

Doug's 1997 Honours thesis, under the supervision of Cristina Cifuentes, involved the creation of a new structuring algorithm to structure assembly-based program flow graphs.  This new algorithm was prompted as a way to simplify the expensive data structures and analysis required by the algorithms used in the dcc project.

Doug implemented the algorithms used by Cristina in the dcc project (see Structuring Decompiled Graphs or Chapter 6 of the PhD thesis Reverse Compilation Techniques) as well as the new, simplified, structuring algorithms we devised in 1997.  Both algorithms are compared in a tool called AST (assembly structuring tool) for SPARC assembly programs.

Abstract

An automated assembly to high level language (e.g. C) translator would be of great benefit to maintainers of legacy assembly programs. A component of such a translator would derive higher level constructs such as conditional, loops and sequences from the original source. This can be done by applying an algorithm to structure the control flow graph of the assembly source. This is commonly referred to as a structuring algorithm.

Most existing techniques for control flow structuring not easily adapted to structuring assembly code. During research into the feasibility of a binary decompiler, an algorithm more suited to the task of structuring assembly code was developed. This seminar focuses on a number of modifications to this algorithm that were developed as part of an honour's thesis.  In particular we discuss techniques for generating `better' high level code from unstructured assembly control flow graphs. Unstructured graphs are those that will require the use of gotos in the high level code. The quality of the code can be measured by the number of gotos generated.

In addition to our attempt to improve the functionality of the original algorithm, we also seek to improve the runtime speed of the translator by investigating less expensive data structures. Ideally, any improvement in speed will not involve a trade off with functionality.

We present the results of a prototype assembly structuring tool. The results were generated by running the tool on a number of well known, large benchmark programs used by those in the compiler community. From these results, we discuss what combination of improvements in speed and functionality are optimal in some well defined sense.

Code

The assembly structuring tool, AST, is distributed under a BSD license and it is (C) Doug Simon, 1997.

The code base was last compiled on a SPARC Solaris platform with gcc 3.1 on 1st March 2002. 

Two versions of the AST tool can be built: ast and ast_old.  The ast tool implements the new algorithms, based on dominators and post-dominators of a graph's underlying tree.  The ast_old tool implements the algorithms used in the dcc project, which are based on intervals and the derived sequence of graphs.  The Makefile provided in the distribution compiles the ast tool, to compile the ast_old tool, you need to uncomment some lines in the Makefile.

Code base: ast.tar.gz  (49 KB)

[PDF] Enterprise Navigator: A System for Visualizing and Analyzing ...

File Format: PDF/Adobe Acrobat - View as HTML
2.4 Graph Dominators (Dominator). The Enterprise Navigator uses the Dominator tool to ... dominators of a graph. In a graph with a selected root node R, ...
www.research.att.com/resources/ trs/TRs/99/99.16/99.16.1.body.pdf - Similar pages

 

History

[PDF] A Fast Algorithm for Finding Dominators in a Flowgraph

File Format: PDF/Adobe Acrobat - View as HTML
TARJAN, R. Finding dominators in directed graphs. SIAM J. Comptng. 3. (1974), 62-89. ACM Transactions on Programming Languages and Systems, Vol. 1, No. ...
www.cs.princeton.edu/courses/ archive/spr03/cs423/download/dominators.pdf - Similar pages

[PDF] The Program Dependence Graph and Its Use in Optimization

File Format: PDF/Adobe Acrobat - View as HTML
and dominators [l, 31. The Appendix contains the basic graph theoretic termi- ... flow graph is equivalent to computing dominators [l] in the reverse ...
www.cs.utexas.edu/users/ less/reading/spring00/ferrante.pdf - Similar pages

 

Finding Dominators in Directed Graphs by Robert Tarjan

Abstract. This paper describes an algorithm for finding dominators in an arbitrary directed graph. The algorithm uses depth-first search and efficient algorithms for computing disjoint set unions and manipulating priority queues to achieve a time bound of $O(V\log V + E)$ if $V$ is the number of vertices and $E$ is the number of edges in the graph. This bound compares favorably with the $O(V(V + E))$ time bound of previously known algorithms for finding dominators in arbitrary directed graphs, and with the $O(V + E\log E)$ time bound of a known algorithm for finding dominators in reducible graphs. If $E \geqq V\log V$, the new algorithm requires $O(E)$ time and is optimal to within a constant factor.

Keywords. algorithm, binary tree, complexity, connectivity, depth-first search, directed graph, dominator, equivalence algorithm, graph, immediate dominator, priority queue, search, set union, stack, topological sorting, tree

Full Text (pdf)

 


Copyright © 1996-2007 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author free time. Submit comments This document is an industrial compilation designed and created exclusively for educational use and is placed under the copyright of the Open Content License(OPL). Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

Standard disclaimer: The statements, views and opinions presented on this web page are those of the author and are not endorsed by, nor do they necessarily reflect, the opinions of the author present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.

Last modified: February 28, 2008

similar except that the transfer is to the statement following the end of the i'th enclosing loop.

The algorithm initially orders the nodes in the graph by constructing a spanning tree of the graph in a depth first manner. The details of the depth-first procedure are omitted here but can be found in any book dealing with algorithms and data structures such as [7].

When structuring loops, loop headers are determined to be those with an edge into them from a node lower in the graph. For each of these loop headers an extra node is introduced. This node does not introduce any extra code in the underlying program and is used when tagging all the nodes within a loop. These are determined to be all the nodes between the new loop header (the introduced node) and the loop tail. We do not detail how the tail is determined here as it is not relevant. Figure 2.5 shows the process of adding the extra node to a graph. In graph (a), B is determined to be a loop header and the results of adding the extra node (B

0

) for this header is shown in

graph (b). The next step would be to introduce a new node for A.

After adding the extra nodes, the algorithm the defines a predicate HEAD

2.3. SEMANTIC PRESERVING STRUCTURING 27

D A

B

C D

A

B'

B C

(a) (b) Figure 2.5: (a) Original CFG and (b) CFG with extra loop header node added.

for each node in the graph such that HEAD (q) is the head of the smallest loop enclosing q in the program. If q is not in a loop, then HEAD (q) is undefined.

Having added the extra nodes, we are guaranteed that for loop header p, HEAD (p) will be either the header of an enclosing loop or undefined. This means that nested loop are correctly determined. Loop nesting occurs when one loop is inside another. An example of this is shown by graph (a) in Figure 2.5. The loop induced by the back edge (D ; A) nests the loop induced by the other back edge (C ; B ).

For conditional structuring, a reach set is constructed for each node p where the members of this set are all the nodes entered by edges from p or from nodes nested within p (i.e. nodes within a loop headed by p or nodes within the clause of a conditional headed by p). The constraint on the set membership for p is that it cannot contain p or any node nested within p. Figure 2.6 gives an example of a CFG and a table showing the reach sets derived from it.

Using these sets, a node is structured as a conditional header if it has at least

28 CHAPTER 2. RELATED WORK

A B

D E

F

G

H

C

Node Reachset

A fBg

B fE,Fg C fE,Fg D fE,Fg

E fGg F fGg G fg H fg

Figure 2.6: A graph with the reach set shown for each node. two out edges, both of which eventually lead to one of the nodes in the reach set with at least one doing so indirectly. It does this by considering a subset of all reachable statements from each statement where the membership of the subset is well defined. For each conditional detected, the first statement of each branch is recorded as well as the follow node of the whole conditional (i.e. the first common statement reached by the branch of the conditional). The predicates of the branching statements are inverted if necessary to prevent generating if..then..else statements where the then clause is empty. As an example consider nodes B and C . Node B will be structured as a conditional as both of its out edges lead indirectly (via c and D ) to the nodes in its reach set. Node C on the other hand, will not be structured as a conditional as both of its out edges lead directly to the nodes of its reach set (i.e. e and F ).

The algorithm is extended to handle irreducible CFGs. The above algorithms ensures that unstructured jumps will only take the form of forward jumps into a clause of an if..then..else statement or into the body of a loop.

This algorithm gives us the useful concept of imposing an ordering between the nodes within a CFG and how such an ordering can be constructed (i.e. by a depth first search traversal of the graph from the entry node). The immediate dominator technique of structuring conditionals provides the basis for how we perform structuring of conditionals.

An adaptation of the technique of inserting nodes when structuring loops is

2.4. ASSEMBLY CONTROL FLOW GRAPH STRUCTURING 29 used by our structuring algorithms. The direct result is that we can structure the one node to be both a loop header and a conditional header.

Baker's algorithm does not take into account the occurrence of nodes which have more than two out edges (i.e. case and switch statements). She states that adapting the algorithm to handle these cases is straightforward but this is not the case. The reason being that when structures headed by one of the mulit-way nodes overlaps with a loop, there is a need for extra analysis to decide which structure will be generated in the HLL code and which will results in goto statement(s). The restriction of goto jumps being only forward is also unacceptable when attempting to minimise the number of goto statements generated. For example, in the case where there is a jump back into the subgraph of a switch statement, it is desirable to structure that backwards jump as a goto rather than as a loop with multiple forward goto jumps into it.

2.4 Assembly Control Flow Graph Structuring

The structuring work performed by Cifuentes [5] as a component of a binary decompiler

1

includes most of the algorithms upon which our work is based.

The algorithms used for structuring rely upon a number on control flow techniques available in the literature dealing with the analysis and manipulation of CFGs such as successors, predecessors, depth first search and more. The definitions of these concepts is postponed until Chapter 4. The control flow analysis is particularly relevant as it is performed on the control flow graphs of assembly-like programs.

Loop structuring is performed through the use of intervals and derived sequence of graphs which had previously been used in both compiler control flow and data flow analyses. Conditional structuring was performed by using the immediate dominator information for each node in the graph.

An algorithm is also given for structuring short-circuit evaluated graphs. These graphs result from the efficient evaluation of compound boolean conditions. For example, the graph shown on the left hand side of Figure 2.7

1

A binary decompiler is a tool that translates a binary program (executable) into a

high level language program.

30 CHAPTER 2. RELATED WORK representing conjunction in a HLL program, can be implemented by the CFG on the right hand side of the same figure. This type of low level implementation is even guaranteed by some high level languages (such as C) and, for example, allows a programmer to test a pointer before dereferencing it without needing to use a lengthier conditional statement.

else t f

t

t

f

f then

elsethen x or y x

y

Figure 2.7: Two possible CFG representations for a compound boolean expression. The short-circuited representation is given on the right hand side of the figure.

The algorithms used for structuring 2way and nway conditionals were given separately in this work. We can modify how this structuring is done so that the two can be performed together. The modification should not only give better performance but also improve the quality of the generated code.

As mentioned previously, the loop structuring algorithm presented by Cifuentes is a major focus of this thesis. We attempt to preserve the functionality of the algorithm presented in this research while improving its execution time by using an alternative to the two data structures she used.

We do not include the structuring of short-circuit evaluated graphs as to do so requires data flow analysis. This is quite feasible in the context of developing a tool (such as a decompiler) that already involves data flow analysis. However, as we are concentrating purely on control flow analysis, we have neglected this type of structuring.

Chapter 3 Assembly to High Level Language

The task of structuring assembly programs involves more than just designing some structuring algorithms. We need to deal with the issue of getting the program into its CFG representation which is required by the structuring algorithms. We also need to provide some means of generating the HLL code from the CFG after it has been augmented with the structuring information. This chapter describes these broader issues of assembly structuring. For the components that are not central to the translation process, a detailed discussion of their algorithmic aspects is given here instead of devoting a whole chapter to them.

The structure of an assembly to HLL translation is similar to standard compilation on a conceptual level (although significantly simpler) as it consists of a series of phases that transform the input (an assembly program) from one representation to another. The phases involved are shown in Figure 3.1. The parser performs both syntax analysis and a simple form of semantic analysis on the input assembly. Any implementation of this phase is architecture dependent because assembly languages are dependent on a particular machine. The CFG generation phase builds the CFG representation of the program. As a result of semantic analysis performed in the parser phase, this phase is both architecture and HLL target independent. The third phase is the structuring process. Both it and the last phase, HLL code generation, are architecture independent but are dependent on the HLL chosen as the target. These last two phases are the subject of the following two chapters

32 CHAPTER 3. ASSEMBLY TO HIGH LEVEL LANGUAGE

Assembly Program

Parser CFG Generation

Structuring HLL Generation

HLL Program

AST

Figure 3.1: The phases of an assembly to HLL translation. respectively and are the core of the structuring process.

3.1 Assembly Parsing An assembly language is in essence just a symbolic syntax for a given machine's binary instruction set. This explains why any given assembly language is machine dependent and programs written in it are not portable to another architecture without significant translation. Only pure binary programs, which are further restricted by binary file formats and operating systems, are less portable. The implications of this architecture dependence for any implementation of the translator is that it will only be able to process assembly programs from one architecture. However, given that the later phases are independent of the given assembly syntax, porting the implementation to a different architecture should only involve rewriting the first phase module.

The semantic analysis performed by the parser is the categorisation of each instruction parsed into one of two categories; transfer instructions and non 3.1. ASSEMBLY PARSING 33 transfer instructions. Knowing what category an instruction is assists in the next phase; control flow graph generation. We partition the instruction set of any architecture into the following two categories:

ffl Control Instructions (TI): this is the set of instructions that change the

value of the program counter. That is, they explicitly set the address of the instruction to be executed. For convenience, we refer to any instruction within this set as a control transfer instruction (CTI). These instructions are:

- Unconditional branches

1

: the flow of control is transferred to the

target branch address.

- Conditional branches: depending on the value of some condition

code(s) set by preceding instructions, the flow of control is transferred either to the target branch address or the next instruction in the sequence.

- Indexed jumps: the target of the jump is defined by the value of

a register and is therefore a range of values, one of which the flow of control is transferred to.

- Subroutine call: the flow of control is transferred to the routine

given in the appropriate operand of the instruction. For our purposes, we assume that the flow of control is transferred back to the next instruction in the sequence after successful completion of the subroutine and only this flow of control is represented in a control flow graph.

- Subroutine return: the flow of control is returned to the caller

subroutine that invoked the current subroutine.

ffl Non transfer instruction (NTI): the set of instructions that implicitly

transfer control to the next instruction in the sequence. That is, all the instructions that are not in TI.

Parsing the assembly input is little more than text file processing. Checking for the correctness of the assembly program is considered to be outside the domain of the AHT process. This phase only reads in the instructions and build a representation of the instruction sequence within memory. When

1

Different architectures use the terms `jump' and `branch' to mean the same thing. We

adhere mainly to the latter but the two are used interchangeably hereafter.

34 CHAPTER 3. ASSEMBLY TO HIGH LEVEL LANGUAGE parsing a control transfer instruction (CTI), the internal representation is tagged as being a CTI as well as being augmented with the location(s) of the transfer control. This simple semantic analysis is tightly coupled to the host architecture but the information it renders is architecture independent. The locations are stored as positions within the internal instruction sequence. Essentially, they are array indices. A parser for assembly in any architecture could be used here instead as long as it performs the architecture independent categorisation of the instructions as discussed above.

3.2 Control Flow Generation As a preliminary to discussing the generation of a control flow graph, we define the concept of a basic block. Working from this definition, we then proceed to formalise the concept of a control flow graph as well as introducing terminology used when discussing control flow graphs.

3.2.1 Basic Blocks This section formally defines the concept of basic blocks which are also the building blocks of a CFG. The definition is given in terms of the sequence of instructions that were parsed into memory by the parser phase.

Definition 3.1 A basic block b = [i

1

; : :; i

n\Gamma 1

; i

n

; i

n+1

], n ? 1 is an instruction sequence that satisfies the following conditions:

1. [i

1

; : :; i

n\Gamma 1

; i

n+1

] 2 NTI

2. 8 i 2 fi

2

: : i

n+1

g ffl i is not the target of a CTI

3. i

n

2 TI

OR 1. [i

1

; : :; i

n+1

] 2 NTI

2. i

n+2

is the target of a CTI

The first alternative in Definition 3.1 specifies that a basic block is a sequence of instructions such that all but the second last are from the NTI set. Additionally, only the first instruction may be the target of a CTI and the second

3.2. CONTROL FLOW GENERATION 35 last instruction must be a CTI. The second alternative defines a basic block that is not delimited by an instruction that explicitly transfers control. In this case, the last instruction has the property that its successor is the target of a CTI. All the instructions in this type of basic block will be from the NTI set.

The above definition is unfortunately dependent upon the underlying architecture. The one we have given is in terms of the SPARC architecture which includes the concept of delayed instructions. These are the instructions that immediately follow most CTIs. This instruction is executed, conditionally in some cases, before the transfer of control takes place. For simplicity, we therefore define these delayed instructions to be a member of the same basic as their preceding CTI. Chapter 6 deals further with the idiosyncrasies of these delayed instruction further when discussing how they were handled by an implementation of our structuring algorithms.

3.2.2 Graph Theory Before defining a control flow graph, we formalise the notion of a graph and its relevant properties. The definitions presented are those given in [5] and are similar to those found in any text dealing with graph theory.

Definition 3.2 A graph G is a tuple (V,E,h) where V is a set of nodes, E is a set of edges, and h is the root of the graph. An edge is a pair of nodes (v,w) such that v,w 2 V.

Definition 3.3 A directed graph G = (N,E,h) is a graph that has directed edges; i.e. each (n

i

; n

j

) 2 E has a direction which is from n

i

to n

j

.

Definition 3.4 A path from n

1

to n

m

in graph G = (N,E,h) , represented by

n ! n

m

, is a sequence of edges (n

1

; n

2

); (n

2

; n

3

); : :; (n

m\Gamma 1

; n

m

) 2 N ; m ? 1.

Definition 3.5 If G = (V,E,h) is a graph such that V = fhg and E = f g, then G is a trivial graph (i.e. a graph with only one node and no edges).

Definition 3.6 If G = (N,E,h) is a graph and 8 n 2 N ; h ! n, then G is a connected graph (i.e. every node in the graph is reachable from the head of the graph).

36 CHAPTER 3. ASSEMBLY TO HIGH LEVEL LANGUAGE 3.2.3 Control Flow Graphs The following definition formally shows that a control flow graph is constructed from the basic blocks of a program. There is one CFG for each subroutine in the program and these CFGs do not overlap

2

. However, the

CFGs do not necessarily partition the basic blocks of a program. Since the edges within a CFG represent the flow of control between the blocks, any unreachable blocks will not be included in any CFG.

Definition 3.7 A control flow graph G = (N,E,h) for a procedure P is a connected, directed graph such that:

1. 9 f : B 7ae N where B is the set of basic blocks for the procedure. 2. 8 e = (n

i

; n

j

) 2 E , e represents a flow of control from one basic block

to another and n

i

; n

j

2 N .

3. 9 b

i

2 B ffl h is the first instruction in b

i

.

Part 1 states that there is a partial one-to-one function mapping basic blocks to nodes in the graph. That is, for every node there is a corresponding unique basic block. However, due to the aforementioned unreachable basic blocks, the reverse is not necessarily true. Part 2 equates the control flow between two basic blocks with an edge in the graph. The third part of the definition defines the head of the graph to be the node containing an instruction that is an entry to a subroutine. This instruction will either be the first instruction in the program or the target of a subroutine call.

For certain types of analysis, it is important that a control flow graph has only one exit node. That is, there is a unique node in the graph such that it is the only one that has no successors. If there is more than one of these in the original graph, we conceptually add a new node and add an edge from each of the candidate exit nodes in the original graph to this new node. This new node does not introduce any code into the underlying program.

2

As we are only concerned with control flow analysis, we do not consider the transfer

of data between subroutines. For this reason, we do not explicitly show the transfer of control from one subroutine to another as an edge in a graph. It suffices for us to simply show the points at which a subroutine call is made.

3.2. CONTROL FLOW GENERATION 37 Definition 3.8 The exit node, e, of a graph is such that 9 e 2 N ffl @n 2 N ffl (e; n) 2 E

As with the instructions, the basic blocks are also classified into different types for the purpose of control flow analysis. The type is determined by the delimiting instruction of a block. Under the SPARC architecture (with its delayed instructions), the delimiting instruction of a basic block will be the second last one in the case where the block is defined by the first part of Definition 3.1. Otherwise it will be the last instruction in the block. The types are:

ffl Unconditional basic block: the second last instruction in the block is an

unconditional branch instruction. This block will have one out-edge.

ffl Conditional basic block: the second last instruction is a conditional

branch. This block will have two out-edges.

ffl Nway basic block: the second last instruction is a indexed jump. The

block will have n out-edges, one for each possible target of the jump.

ffl Call basic block: the second last instruction is a call to a subroutine.

This block will have only one out-edge which will be to the lexical successor of the block's last instruction.

ffl Return basic block: the second last instruction is a return instruction.

This block will have no out-edges.

ffl Fall through basic block: the block is not delimited by a CTI.

The nodes in the graph representing basic blocks

3

of the above types are

given the following names respectively: ucond, cond, nway, call, ret, fall.

To illustrate the process of basic block partitioning and CFG generation consider the example shown in Figure 3.2. The input to the example is an assembly program

4

. Instructions 9 and 12 are examples of unconditional branches

which delimit an unconditional basic block while instruction 6 demonstrates

3

Given this representational relationship between nodes and basic blocks, the two terms

are used interchangeably hereafter.

4

A number of non-transfer instructions have been removed from the code to cut down

the size of the example.

38 CHAPTER 3. ASSEMBLY TO HIGH LEVEL LANGUAGE an indexed jump

5

which delimits an nway basic block. The graph resulting

from these basic blocks and the flow of control between them is shown in Figure 3.3.

1 main: save %sp,-120,%sp 2 or %o0,%lo(.LLC0),%o0 3 call scanf,0 4 or %o0,%lo(.LL3),%o0 5 ld [%o1+%o0],%o0 6 jmp .LL3 .LL12 .LL4 7 nop 8 .LL3: ld [%fp-20],%o1 9 b .LL14 10 or %o0,%lo(.LLC1),%o0 11 .LL4: ld [%fp-20],%o1 12 b .LL14 13 or %o0,%lo(.LLC1),%o0 14 .LL12: sethi %hi(.LLC2),%o0 15 or %o0,%lo(.LLC2),%o0 16 .LL14: bne .LL15 17 nop 18 call printf,0 19 mov 0,%i0 20 .LL15 ret 21 restore

Label Type Instructions

A call 1 - 4

B nway 5 - 7 C ucond 8 - 10 D ucond 11 - 13

E fall 14 - 15 F cond 16 - 17 G call 18 - 19 H ret 20 - 21

Figure 3.2: A sample assembly program. The basic block partitioning is shown by the table on the right hand side.

3.2.4 Unreachable blocks One last refinement has to be performed to the CFGs before they are ready for structuring. We must remove the aforementioned unreachable basic blocks.

To remove the unreachable blocks, a depth first traversal if performed from each node representing the entry point to a subroutine, tagging each node as it is traversed. Upon completion of the traversals, it is simply a matter of removing the untagged nodes.

5

The range of the possible index values have been resolved by a preprocessor.

3.2. CONTROL FLOW GENERATION 39

call nway ucond ucond fall

cond call

ret

A B

E

H G

F

DC

Figure 3.3: The control flow graph for the program given in Figure 3.2

Chapter 4 Structuring Assembly Graphs A control flow graph a representation of the underlying assembly program which does not contain any information that cannot be read directly from the assembly code. To generate high level code from these control flow graphs requires further analysis. The use of structuring algorithms will give us the required information. As a result of these algorithms, the graphs will be annotated with the information describing what high level control constructs are represented in the graph. This chapter presents the control flow analysis performed by the structuring algorithms to attain the high level control flow constructs.

4.1 Notation This chapter includes a number of algorithms presented in pseudocode. While this style is mostly intuitive, the following conventions are used:

ffl attributes of objects are written (with a lower case first letter) as a

function call on the object. For example, iPDom(x) accesses the node x's immediate post-dominator and isBackEdge(v,w) returns whether or not (v ; w ) is an edge.

ffl procedure calls are written in their usual style with the distinguishing

feature of having an uppercase first letter. Return values are implemented as var (i.e. reference) parameters.

4.2. GRAPH STRUCTURING 41

ffl the nesting levels are represented through the use of indentation. ffl a test for equality uses `=' and assignment uses the `:=' symbol.

While it is presumed that the meaning of each node attribute is self-explanatory, a reference for all these attributes is provided in Appendix A.1. This reference relies upon the definitions given in the following sections. Section A.2 defines any predicates used while Section A.3 defines an attribute of a loop when referred to as an object.

4.2 Graph Structuring To structure a control flow graph, a set of high level constructs is required to give the structuring some context. For example, the structuring of control flow graphs could be performed very easily if we were only aiming to describe the flow of control using two-way conditionals (if..then..else) and goto statements. However, it is questionable whether this leaves us much better off than the original assembly program.

The set of high level constructs chosen for structuring needs to be general enough to cater for some of the more common high level languages. This allows any implementation of the assembly structuring process to be easily retargeted for any high level language that at least provides the set of constructs used to do the structuring. Further analysis could then be performed on the output to use additional constructs provided by the language in question. Alternatively, the algorithms could be modified slightly to support new constructs.

A comprehensive review of some common high level languages and the control flow mechanisms they provide is given on pages 138 - 140 in [5]. The constructs embodied in each of the languages is a subset of the constructs shown in Figure 4.1 along with their control flow graph representations. Note that some of the other common control structures such as for loops have not been shown. These omitted structures can usually be implemented using those already shown. For example, in the case of the for loop, it can be implemented by using a sequential statement followed by a while loop.

The set of structures used for our purposes include all those shown in Figure 4.1 except for the last two, multilevel exits and multilevel cycles. Instead,

42 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS

Goto Statement

Repeat loop

Multilevel exit Multilevel cycle Multiexit loop

Multicycle loop

While loop Endless loopSequence

n-way branch conditionalConditionalSingle branch conditional

Figure 4.1: HLL control structures and examples of their control flow graph representations.

goto's will be used to emulate these structures should they appear as most languages provide the former but very few (e.g. Ada) provide the latter. However, the inclusion of gotos also excludes at least one common language, Modula2.

4.3 Control Flow Graph Terminology Before discussing how we gather information about the underlying structures present in a control flow graph, we first define the relevant terminology and definitions used. These definitions are taken directly from [5] (apart from

4.3. CONTROL FLOW GRAPH TERMINOLOGY 43 the post-dominator definitions which are a modification of the immediate dominator definitions given in the same piece of work) and are given in terms of a directed graph G = (N ; E ; h; e). This is an extension of the normal representation of a graph in that we now explicity mention the exit node (see Definition 3.8).

Definition 4.1 A closed path or cycle is a path n1 ! n

v

where n

1

= n

v

.

Definition 4.2 The successors of n

i

2 N are fn

j

2 N j n

i

! n

j

g. The

immediate successors of n

i

2 N are fn

j

2 N j (n

i

; n

j

) 2 E g.

Definition 4.3 The predecessors of n

j

2 N are fn

i

2 N j n

i

! n

j

g. The

immediate predecessors of n

j

2 N are fn

i

2 N j (n

i

; n

j

) 2 E g. A node

n

i

2 N post-dominates a node n

k

2 N if n

i

is on every path n

k

! e.

Definition 4.4 A node n

i

2 N immediately post-dominates n

k

2 N if

@n

j

2 N ffl n

j

post-dominates n

k

and n

i

post-dominates n

j

(i.e. n

i

is the

closest post-dominator of n

k

).

Definition 4.5 A strongly connected region (SCR) is a subgraph S = (N

S

; E

S

; h

S

) ffl 8 n

i

; n

j

2 N

S

ffl 9 n

i

! n

j

and n

j

! n

i

.

Definition 4.6 A strongly connected component of G is a subgraph S = (N

S

; E

S

; h

S

) ffl S is a strongly connected region and @S

2

2 SCR(G) ffl

S ae S

2

(i.e. it is the smallest connect region involving h

S

).

Definition 4.7 A depth first search (DFS) is a method of traversing a graph that selects edges to traverse from the most recently visited node that still has unvisited nodes on at least one of its edges. A depth first spanning tree (DFST) of a flow graph G is a directed, rooted, ordering spanning tree of G grown by a DFS algorithm.

As described on page 482 of [7], the DFST T of a graph partitions the edges of the graph into four categories; tree edges, forward edges, backward edges and crossedges. For the purpose of structuring CFGs, the differentiation between tree, forward and cross edges is uninteresting and so we collapse them all to the one category, forward edges. We end up with the following two categories:

44 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS

1. Backward edges = f(v ; w ) j (v ; w ) 2 E ^ w ! v 2 T g. 2. Forward edges are all the remaining edges.

The backward edges (also referred to as back edges) are used when structuring loops. The forward edges are used for structuring all other constructs as well as loops.

4.3.1 Interval Theory and the Derived Sequence of

Graphs

For the purpose of the describing the structuring algorithm we aim to improve upon, we define the two data structures it uses. The first structure was defined by Cocke in [6] and the algorithm used to build it developed by Cifuentes in [4].

Definition 4.8 Given some node, n 2 N , the interval I (n) is the maximal, single-entry subgraph in which n is the only entry node (i.e. the only node with a predecessor outside the subgraph) and in which all closed paths contain n. The unique interval node n is called the interval head or simply the header node.

We can use intervals to partition the nodes of G into a unique set of disjoint intervals I = fI (h

1

); : :; I (h

n

)g for some N ? 1. The algorithm presented

in Figure 4.2 is used to derive the sequence of intervals for a graph. This algorithm is derived from work done by Allen [2] in the area of control flow analysis. Defining I as a sequence imposes an ordering between the intervals of a graph which is required when building the other data structure required for this algorithm, the derived sequence of graphs.

Interval theory alone is not sufficient for the complete structuring of loops. We use an additional concept called the derived sequence of graphs which was described by Allen [2] and is based upon the intervals of a graph. The construction of the graphs is an iterative method that collapses intervals into nodes. The nodes formed by this process are then the nodes of the next order graph. The edges for this next order graph are both the in edges to the header of the interval from which the new node was derived as well as the out edges from nodes within the same interval whose destination was not

4.3. CONTROL FLOW GRAPH TERMINOLOGY 45

procedure FindIntervals (G = (N ; E ; s); var I) * Pre: G is a graph * Post: the intervals of G are contained in the sequence I

I:= h i H := fhg for (all unprocessed n 2 H ) do

I(n) := fng repeat

I (n) := I (n) + fm 2 N j 8 p 2 pred(m) ffl p 2 I (n)g until (no more nodes can be added to I (n)) H := H + fm 2 N j m 62 H ^ m 62 I (n) ^ (9 p 2 pred(m) ffl p 2 I (n))g I:= I + I (n) end procedure

Figure 4.2: Interval building algorithm.

within the interval. The process of building the intervals is then performed on the new graph after which the whole process starts again. This continues until either the trivial graph is derived or the graph derived is the same as the graph from which it was derived. If the latter is the case, then the graph is said to be irreducible.

An adaptation of the algorithm given by Cifuentes in [4] is shown in Figure 4.3. The adaptation is a result of using an object oriented design. Under this paradigm, an interval is represented by a class derived from the class used to represent an ordinary graph node with a few additional attributes (see Appendix A.4). For readability, we do not explicitly show the necessary type conversion required when treating a sequence as a set. The nodes and edges of the derived graph G

i

are denoted by N

i

and E

i

respectively.

Examples of the derived sequences for two programs are shown in Figures 4.4 and 4.5. They demonstrate how the intervals are defined for each graph within the sequence. Figure 4.5 is an example of an irreducible graph. The final graph in its derived sequence is the canonical irreducible graph. All irreducible graphs will reduce to containing only or more instances of this graph. An irreducible graph will occur when there is a forward edge into a loop subgraph.

For convenience, we use the term DSG (derived sequence of graphs) when describing the loop structuring algorithm that uses these two data structures.

46 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS

procedure BuildDerivedSequences(G = (N ; E ; h); var G) * Pre: G is a graph * Post: the derived sequence of graphs G

1

; : :; G

n

; n ? 1 has

* been constructed and is returned in G

G

1

:= G

G:= hG

1

i

i := 2 repeat /* Construction of G

i

*/

FindIntervals(G

i\Gamma 1

,I)

N

i

:= I

h

i

:= I (1)

E

i

:= f g

if (#N

i\Gamma 1

6= # I) then

for (all I 2 I) do

for (all n 2 nodesOf(I )) do

for (all m 2 (succ(n) - nodesOf(I ))) do

E

i

:= E

i

+ (I ; inInterval(m))

G:= G + G

i

i := i + 1 until (G

i

is the trivial graph or #N

i\Gamma 1

= # I)

end procedure

Figure 4.3: Derived Sequence of Graphs building algorithm.

4.3.2 Parenthesis Theory The proposed alternative loop structuring algorithm does not require any additional data structures beyond the control flow graph itself. However, it requires an extra property to be given to each node within the graph. Given that this extra property can also be used to make another improvement through 2 way conditional restructuring to the structuring algorithms, we define it for the nodes in the graph irrespective of what loop structuring algorithm is used.

We define this property in terms of a graph G = (N ; E ; h). We also rely on a DFS traversal of the graph from h to increment a `time' counter (initialised to 1) after the first and last visit to each node during the traversal. The first of these properties was given by Cormen et al. on pages 477-481 of [7]. The other two are derived from this first one.

Definition 4.9 A parenthesis of n 2 N is denoted t

1

! t

2

and t

1

is the

4.3. CONTROL FLOW GRAPH TERMINOLOGY 47

I(1)

I(2)

G2

G3 1

2

3 4

5

6

G1

2-6

1 1-6

I(1)

I(1)

Figure 4.4: The Derived Sequence of a graph with the intervals of the initial graph and each derived graph shown by the dotted boxes.

4

1 2 1 2

3 3-4

I(2)

I(1) I(1)

I(2) 1(3)

G1

G2

Figure 4.5: An example of an irreducible graph. The final graph is the canonical irreducible graph.

time at which n was first visited during a DFS traversal from h and t

2

is the

time at which n was last visited during the same traversal.

Definition 4.10 A parenthesis n

1

! n

2

is enclosed by parenthesis m

1

!

48 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS m

2

iff both were defined during the same traversal of a graph and m

1

! n

1

!

n

2

! m

2

.

Definition 4.11 9 n

1

; n

2

2 N ffl (n

1

O/ n

2

j (9 P

1

; P

2

ffl P

1

is a parenthesis

of n

1

^ P

2

is a parenthesis of n

2

^ P

1

is enclosed by P

2

)). The O/ relationship

is pronounced "is in".

It is relevant to note that the above definitions imply that there is more than one possible DFS traversal of a graph from the same starting point. This is important in our case as will be shown later. It is sufficient to state now that the two (or more) different traversals are a result of the ordering defined between the out edges of a node. Recall that for a 2way (or cond) node, the edges represent the flow of control taken when the condition is true or false respectively. This gives us a means of performing two symmetrical DFS traversals of the graph, one traversing the true branches first, the other traversing the false branches first. For an nway node, the symmetrical case involves working through the out edges in reverse.

Figure 4.6 shows the recursive algorithm used to set the parenthesis of each node in a graph. Note that it has a boolean parameter, indicating which edges are to be traversed first. For an nway node, when doing the `true' traversal, we work through the sequence of out edges in ascending order and descending order for the `false' traversal. In the context of parenthesis theory, the nodes have a few extra attributes which are described in Appendix A.5.

procedure SetParenthesis(G = (N ; E ; h); brch;var time) * Pre: G is a graph * Post: The brch traversal parenthesis for h has been determined.

traversed(h) := true parenthesis

brch

(h)(1) := time

time := time + 1 for (n 2succ(h) ^ : traversed(n)) do

SetParenthesis(G; brch; time) parenthesis

brch

(h)(2) := time

time := time + 1 end procedure

Figure 4.6: Algorithm used to set the parentheses for each node.

For convenience, we use the term PT (Parenthesis theory) when describing the loop structuring algorithm that uses the parenthesis properties of nodes.

4.4. STRUCTUREDNESS 49 4.4 Structuredness Having defined the context for structuring, we can now classify control flow graphs as either structured or unstructured. Structured graphs are those in which the flow of control represented can be explained completely in terms of a set of structured control structures. All of the structures in our chosen set (see Section 4.2) apart from the goto are structured control structures. Therefore, any control flow graph we can reduce in terms of these structures is a structured graph. In the context of assembly programs, the control flow graphs will quite often be unstructured. We are not recognising subgraphs representing short circuit evaluation of compound conditions

1

, so the graphs

containing these subgraphs will appear to be unstructured.

To deal with cases of unstructuredness in the CFGs, it is important to first identify the types of unstructured cases that we may come across. In the following sections, we describe the categories of structures we want to detect and what information we aim to derive for instances within each of these categories. The type of unstructuredness that may arise in each category is mentioned.

4.4.1 Loops The structured forms of loops are subgraphs that display a single entry/single exit nature. This includes the while, repeat and do loops. We also include multiexit and multicycle loops in this case but this decision is qualified in the following section on unstructured loops.

Examples of structured loops are the pretested, postested, endless, multiexit and multicycle loops shown in Figure 4.1.

For any loop, we want to derive the following information:

header - the node at the entrance to the subgraph representing the loop. latch - the node in the loop whose edge to the header defines the loop. follow - the first node reached after termination of the loop (if any).

1

See Section 2.4 for an explanation of why we make this restriction.

50 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS loop nodes - the nodes that are within the subgraph representing the loop. loop type - is it a preTested (while), postTested (repeat) or endless (do)

loop?

We introduce the notation h \Phi l to represent a loop with the header node h and the latching node l .

Unstructured Loops Some of the categories defined by Williams [11] that were discussed in Chapter 2 are shown in Figure 4.7. We ommit his category of parallel loops as we can structure these as multicycle loops. This is influenced by the target language we have chosen, C. However, both break's and continue's within a loop are only detected when generating the high level code. This means that we consider them as forms of unstructuredness when doing the structuring. When modifying the assembly structuring process for another high level language that provides neither of these control structures, all that is required is a small modification to the code generation phase. The type of control flow transfer occurring in these two structures is often considered to be a restricted form of unstructuredness. That is, they can both be seen as gotos with a limitation on the destinations to which control can be transfered.

Multientry

Overlapping Multiexit Figure 4.7: Sample unstructured loops.

4.4. STRUCTUREDNESS 51 4.4.2 2way Conditionals The control flow graph representation of structured 2way conditionals will also display a single entry/single exit nature. For each 2way conditional header, there will be two branches (one for each out edge from the header). For a structured 2way conditional the nodes reachable along each of the branches will be disjoint from each other except for the one node upon which they must both converge.

Examples of structured 2way conditionals are the two middle graphs in the first row in Figure 4.1.

For any 2way conditional, we need to derive the following information:

header - the node at the entrance to the subgraph representing the conditional.

follow - the unique node at which all paths from each branch converge. conditional type - is it a single clause (if..then) or double clause (if..then..else)

2way conditional?

We introduce the notation h3f to represent a 2way conditional with the header node h and the follow node l .

Unstructured 2way conditional Unstructured 2way conditionals are those that have an abnormal entry or exit to or from a branch. Both of these cases are represented in the canonical abnormal selection path graph, shown in Figure 4.8. Graph (a) contains two potential if..then..else structures, headed by nodes 1 and 3. The former is shown in graph (c) and contains an abnormal exit from its right branch whereas the latter is shown in graph (b) which has an abnormal entry into its left branch. Both types of abnormal exit can also be along a back edge. In that case, the conditional will be overlapping a loop subgraph.

52 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS

1 2 3

4 5

6

4

3

6

5 2

1

3 4

(c)(b)(a) Figure 4.8: Unstructured 2way conditional graphs.

4.4.3 Nway conditionals Similarly, an nway conditional is structured if the nodes reachable from each of the branches corresponding to the out edges from the nway header are all disjoint. Once again, there will also be a unique follow node on which all branches converge. The same set of properties is required for an nway conditional as is required for a 2way conditional plus one additional property. As with loops, we also want to know the set of nodes within the nway subgraph. This helps with unstructured cases.

We use the same notation to represent an nway conditional as a 2way conditional.

Unstructured Nway conditionals Being a generalisation of a 2way conditional, an unstructured nway conditional is similar to an unstructured 2way conditional. The three forms of unstructured nway conditionals we can encounter is shown in Figure 4.9. In graph (a) the nway conditional A3E has an unstructured entry from the back edge (F ; D ). Graph (b) has an unstructured exit from B 3F via the back edge (C ; A). The nway conditional B 3F in graph (c) has an unstructured entry via the edge (A; C ). We cannot have an unstructured exit from an nway conditional via a forward edge. If we did, then the node reached by that edge would be lower in the graph than the immediate post-dominator of the nway header. As this contradicts the definition of an immediate postdominator, such a case cannot occur.

4.4. STRUCTUREDNESS 53 Graphs (a) and (b) raise an interesting issue that must be resolved. The back edge in both of these can potentially be structured as the defining back edge for a loop. At the HLL level, overlapping nway conditionals (switch) and loops will result in a large number of gotos being used. To prevent this, we must make a choice between one or the other when they overlap. We choose to prioritise nways above loops as an nway can have a large number of clauses. If we were not to structure these as clause of an nway, then each one would involve generating a goto. Loops, on the otherhand, will only involve generating one goto for the backward edge.

A

B C D E

FF

A

DCB E

(a) (b) (c)

A

B C D E

F

Figure 4.9: Unstructured nway conditionals.

4.4.4 Multi-structure headers Unlike the original algorithm given by Cifuentes in [4], a node in a control flow graph is not restricted to being a header of only one type of control structure. However, the only case in which this can happen is when a node is the header of a repeat or do loop header and is also the header of a 2way conditional. This exception can occur in both structured and unstructured graphs as shown in Figure 4.10. Node 1 in graph (a) is both a do loop and a single-branch conditional header. This intuitively makes sense as the header of an endless loop does not actually control the loop iterations. It is merely the first node inside the loop. If this kind of multi-structuring was not enabled for a single node, a goto would have to be used instead. The same applies to graph (b). Node 1 here can be the header of both a repeat loop and a 2way conditional. The edge (2,5) will generate a goto in this case

54 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS but without the multi-structuring, another goto would also be required for the edge (4,1).

1

2 3

1 2 3

4

5 (b)(a) Figure 4.10: (a) A structured and (b), an unstructured graph in which node 1 is the header of two different control structures.

4.5 Structuring Algorithms This section presents the algorithms designed to determine the underlying control structures of an arbitrary control flow graph (i.e. structured or unstructured, reducible or irreducible). The nodes within the graph are augmented with the structuring information determined which is used during the following phase when generating the high level code.

As we are considering two variations on the algorithm used to structure loops, the treatment of both algorithms is given in separate sections. It also means that the phase diagram in Figure 4.11 describing the order in which the structuring algorithms are performed is not the usual simple sequence. Instead, there is a branch in the path followed by the structuring process when it uses different algorithms depending on how it does the loop structuring.

The determine ordering phase imposes an ordering between the nodes of the graphs. The second phase, determine reverse ordering, uses the concept of a reverse graph and imposes a similar ordering between the nodes in this graph. The set parenthesis phase prepares the nodes for loop structuring under the PT algorithm of Section 4.3.2. However as mentioned earlier, the parenthesis of a node can be used to make improvement to both quality of the generated code when 2way conditionals overlap with either loops or

4.5. STRUCTURING ALGORITHMS 55

Determine

Ordering

Determine Reverse

Ordering

Conditionals

Structure

Determine Immediate

Post-dominators

Build DerivedSequences

StructuringPhases Unstructured CFG

Restructure Strucured CFG

Structure Structure

ParenthesesSet

LoopsLoops DSG DSGPT Figure 4.11: The phases performed during structuring. Branches taken by PT algorithms are labelled `PT' and branches taken by DSG algorithms are labelled `DSG'.

nway conditionals and so this phase is common to both variations of the overall structuring algorithms. The build derived sequences phase, builds the data structures required for loop structuring under the DSG algorithm Section 4.3.1. The next phase, determine immediate post-dominators does exactly that for each node in the graph. The structure conditionals phase detects the potential conditionals represented within the CFG. The phases diverge again here depending on which loop structuring algorithm is being used. Each alternative, structure loops (PT or DSG), detects the loops represented within the CFG using one of the two loop structuring algorithms. Finally, the restructure phase gives the aforementioned improvement to the overall structuring process. It reanalyses the 2way conditionals detected in

56 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS the conditional structuring phase to see if they can be restructured when they overlap with an nway conditional or a loop. Each phase is next explained in detail.

4.5.1 Determine Node Ordering The first step in structuring a CFG is to define an ordering between the nodes within the graph. Apart from other uses, this provides an efficient means of traversing the graph in a linear but partially-ordered fashion. The ordering we define is the post-order of the nodes. The node that was last visited first during a DFS traversal will be first in the ordering and the start node will have an order equal to the number of nodes within the graph. That is, the nodes are ordered according to when they were last visited. A traversal of the nodes according to this ordering can be visualised as starting from the bottom of the graph and working to the top of the graph.

During the traversal used to determine this ordering, an ordering is also imposed on the in edges to the node from its predecessors. This is the ordering used when later algorithms include a traversal of the in edges in ascending order.

The algorithm designed to perform this traversal is given in Figure 4.12.

procedure DefineOrdering (G = (N ; E ; s); var ord) * Pre: G is a graph * Post: the reverse post ordering has been determined

traversed(s) := true for (n 2 succ(s))

append s to the end of the predecessor ordering for n if (: traversed(s))

DefineOrdering (G = (N ; E ; n); ord) order(s) := ord ord := ord + 1 end procedure

Figure 4.12: Algorithm to establish the reverse post-order within the nodes of the original CFG.

4.5. STRUCTURING ALGORITHMS 57 4.5.2 Determine Reverse Ordering The reverse graph G

rev

= (N

rev

; E

rev

; h

rev

) of the graph G = (N ; E ; h) can be

defined by the following set of equations (we assume that a unique eit node has been added if it did not already exist):

1:N

rev

= N

2:E

rev

= f(n; m) j (m; n) 2 E g

3: 9

1

h

rev

2 N ffl @n 2 N ffl (h

rev

; n) 2 E

The order between the predecessors established by the algorithm in Figure 4.12 gives us an ordering between the successors of each node in the reverse graph. The importance of having this relationship between a node's predecessors in the original graph and it successors in the reverse graph becomes apparent when determining the immediate post-dominator of each node. This process relies upon the reverse ordering of the nodes. There are multiple possible reverse orderings and so the aforementioned relationship enables us to constrain the reverse ordering to be directly related to the forward ordering.

The algorithm for determining the post-ordering of the nodes in the reverse graph in shown in Figure 4.13. Note that the input to this algorithm is the reverse graph. We never actually need to construct this graph as it is just the original graph, viewed in a different way (as shown by the above equations).

procedure DefineReverseOrdering (G

rev

= (N

rev

; E

rev

; h

rev

); var ord)

* Pre: G

rev

is the reverse of graph G and the ordering

* between the predecessors of each n 2 G has been established * Post: the reverse post ordering has been determined

traversed(s) := true for (n 2 succ(h

rev

) in ascending order)

if (: traversed(h

rev

))

DefineReverseOrdering (G = (N

rev

; E

rev

; n); ord)

revOrder(s

rev

) := ord

ord := ord + 1 end procedure

Figure 4.13: Algorithm to establish the reverse post-order for the nodes within the reverse CFG.

58 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS

2 43 11 12

13

5

7 8

9

6

1

10

Node Post Order in

Reverse Graph

1 13 2 11 3 4 4 5 5 8 6 10 7 7 8 9 9 6 10 3 11 2 12 12 13 1

Figure 4.14: A control flow graph with the nodes identified by their post-order value. The table gives the post-order of each node in the reverse graph. The nodes in the `Node' column correspond with the nodes in the graph shown.

The control flow graph shown in Figure 4.14 shows how the post-ordering of the nodes enables a top to bottom (or bottom to top) traversal of the graph. The nodes are labeled by their post-order value. This convention of identifying nodes by their post-order value will be followed for the graphs presented hereafter. In this graph, the true edges for a 2way node are on the left. The accompanying table displays the post-order for each node in the reverse graph.

4.5.3 Immediate Post Dominators The intuitive interpretation of Definition 4.4 is that the immediate postdominator, ipd , of a node, n, is the closest node to n such that all paths from n to the graph's exit node pass through ipd . Hecht and Ullman in [12] give an algorithm for finding the dominator tree of a control flow graph. If we consider the graph to be the reverse of another graph, then this algo 4.5. STRUCTURING ALGORITHMS 59 rithm gives us the post-dominator tree. As we are only concerned with the immediate post-dominators, a simplification of their algorithm is fine for our purposes. This simplified version is given in Figure 4.15 and Figure 4.16.

procedure FindImmPostDominators(G = (N ; E ; s); RevG) * Pre: RevG is the reverse of G and the post-ordering of its nodes has been determined * Post: the immediate post-dominator for each node has been determined.

/* first pass */ for (all n 2 N in descending order in RevG)

iPDom(n) := undefined for (all c 2 pred

rev

(n))

if (revOrder(c) ? revOrder(n))

iPDom(n) := CommPostDom(iPDom(n),c)

/* second pass */ for (all n 2 N in ascending order in G ffl # succ(n) ? 1)

for (all c 2 succ(n))

iPDom(n) := CommPostDom(iPDom(n),c)

/* third and final pass */ for (all n 2 N in ascending order in G ffl # succ(n) ? 1)

for (all c 2 succ(n))

if (isBackEdge(n; c) ^ order(iPDom(c)) ! order(iPDom(n))

iPDom(n) := CommPostDom(iPDom(n),iPDom(c)) else

iPDom(n) := CommPostDom(iPDom(n),c) end procedure

Figure 4.15: Algorithm for calculating the immediate post-dominators.

The first algorithm, FindImmPostDominators, makes use of the ordering defined between the nodes of the reverse graph. To make sure that the immediate-post dominator for each node is correct with respect to the original CFG, we require three passes over the nodes in the graph.

The first pass traverses the nodes using the ordering defined within the reverse graph. Intuitively, we traverse the nodes in the reverse graph from top to bottom. For each node, we consider its predecessors that are higher in the reverse graph (i.e. they are not reached via a back edge). If there is only one such predecessor, then it is determined to be the immediate postdominator. Otherwise, the immediate post-dominator is initialised to be the common post-dominator of all the predecessors. Finding this common postdominator is performed by the CommPostDom algorithm. The result of this

60 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS

procedure CommPostDom (var ipd; s) * Pre: ipd is the current immediate post-dominator and s is a successor of the * node being considered and is lower in the graph * Post: ipd is the immediate post-dominator of the node under consideration.

if (ipd = undefined)

ipd := s else if (s = undefined)

ipd := ipd else

while (ipd and s are defined ^ ipd 6= s)

if (revOrder(ipd) ? revOrder(s))

s := iPDom(s) else

ipd := iPDom(ipd) end procedure

Figure 4.16: Common post-dominator algorithm.

first pass is that every node except for the exit node has now been given an immediate post-dominator.

The second pass is required because the immediate post-dominator information determined in the first pass is not necessarily correct with respect to the original control flow graph. This occurs only when the original graph includes loops, which is the case for most control flow graphs. The CFG shown in Figure 4.17 is an example where we cannot rely upon the immediate postdominator information gained from only one pass. The three graphs depict the state of each node after each pass. Each node is given an alphanumeric identifier. The parenthesised data for each node is its position in the reverse ordering of the graph and its immediate post-dominator after the relevant phase. Shaded nodes are those whose immediate post-dominator changed after the previous pass.

The immediate post-dominators determined for nodes B,C and E after pass 1 are wrong. For node B, there is a path (B,D,H) to the exit node (H) that does not pass through C. Therefore, C is not a post-dominator of B, let alone the immediate post-dominator. The reason for this error is that the reverse ordering algorithm is `confused' by the presence of loops. For instance, one would expect node D to have a higher reverse order than both A and B as it is `higher' than both of these in the reverse graph. The problem arises from the fact that it is impossible to include in the reverse ordering algorithm any

4.5. STRUCTURING ALGORITHMS 61

(a) (b)

(c)

H(8,-) G(7,H)(4,B)F (6,G)E

(3,E)C (1,H)D

B(5,H)

A(2,B)

H(8,-) G(7,H)(4,B)F (6,G)E

(3,E)C (1,H)D

B(5,C)

A(2,B)

H(8,-) G(7,H)(4,B)F (6,H)E

(3,H)C (1,H)D

B(5,H)

A(2,B)

Figure 4.17: A CFG that has different reverse graphs. capabilities for dealing with loops in the reverse graph as it does not know what are back edges until the reverse ordering has been completed.

The second pass attempts to correct the remaining errors by doing another traversal based on the original ordering of the CFG. This time, only nodes that have 2 or more out edges are considered as the immediate postdominators determined for single out edges in pass one will be correct (there is no other choice of immediate post-dominators for these nodes). In the example, only node B's immediate post-dominator is corrected. Nodes C and

62 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS E are left uncorrected as they were traversed in the second pass before B was and they both rely on B to be correct.

After the second pass, there is still a possibility (as shown in the example) that some nodes within a loop (including the latching node) will still have the incorrect immediate post-dominator. The third pass corrects these remaining nodes. Preliminary analysis seems to indicate that a third pass is only required when the graph contains overlapping loops such as the two loops in the example induced by the back edges (D,A) and (F,B). The table in Figure 4.18 gives the immediate post-dominators determined for the nodes in the graph of Figure 4.14.

Node 1 2 3 4 5 6 7 8 9 10 11 12 13 ImmPDom - 12 2 2 2 2 2 6 2 2 2 1 1

Figure 4.18: The immediate post-dominators for each node in the graph of Figure 4.14.

4.5.4 Structuring Conditionals When structuring conditionals, we are analysing nodes that have more than two out edges. For the subgraph headed by one of these nodes, there will be a unique node such that all paths from each of the out edges from the conditional header converge upon it. As mentioned in Sections 4.4.2 and 4.4.3, this is referred to as follow node. Defined in this way, the follow node corresponds with the conditional header's immediate post-dominator and is already determined once this immediate post-dominator is known.

The set of nodes on all paths starting from some successor of a conditional header is called a branch. This set is defined in terms of the conditional h3f and one of the succesors of h such that it is not f .

Definition 4.12 For 9 s 2 (succ(h)nff g), a branch of the conditional h3f is fn 2 N j n is on a path s ! f ^ n 62 fh; f gg.

Note that the above definition implies that a branch may be empty. If a subgraph represents a structured conditional h3f , then it will satisfy the following two properties:

4.5. STRUCTURING ALGORITHMS 63

1. 8 b

i

2 branches of h3f ffl

"

i

(b

i

) = f g

2. 8 b 2 branches of h3f ffl (@n 2 b ffl (m; n) 2 E ^ m 62 (fhg [ b))

The first property states that the membership of the branches must be mutually exclusive (i.e no node may belong to more than one branch of a given conditional). The second property says the predecessor of any node within a branch must be either another node within the same branch or the conditional header. An unstructured conditional will violate at least one of these properties.

Whenever, conditional subgraphs are nested within each other, it may be the case that they share the same follow node. This will mean that both the outer and inner conditional headers will share the same immediate post-dominator. The same will also be true for overlapping (unstructured) conditionals.

2 43 11 12

13

5

7 8

9

6

1

10

Figure 4.19: A control flow graph with the potential conditional nodes shaded.

64 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS Consider the graph in Figure 4.19. The shaded nodes indicate the potential conditional header nodes. Even though some of these may later be structured as loop headers or latching nodes, we still consider them when structuring conditionals. The reason for this is that a node may be both a conditional and a loop header (as discussed in Section 4.4.4).

Determining the type of a 2way conditional For a 2way conditional, we seek to determine exactly what type of 2way conditional it is. The possibilities are:

ffl if..then..else - neither successor of the header is the follow node or at

least one successor is reached by a back edge.

ffl if..then - the false successor is the follow node. ffl if..else - the true successor is the follow node.

Figure 4.19 gives an example of these 2way conditional types. The conditionals (1331) and (1231) are if..else type of 2way conditionals as their true branches (i.e. the left branches) lead directly to their follow nodes. When generating HLL code for this type of conditional, we have the option of either generating an if..then..else that has an empty then clause or we can negate the condition in the header node and generate an if..then statement. The latter is a more natural looking statement and so we choose it. The conditionals (732), (932) and (1132) are if..then..else's as neither of the successors of the respective header nodes are the follow node. The conditional (532) is an interesting case as one of its out edges is a back edge. This type of node will always be initially structured as an if..then..else header. As conditional structuring is performed before loop structuring, there is still a chance it may be detected as the latching node for a loop.

Determining the nodes within an nway conditional For an nway conditional, we want to tag all the nodes on a branch of the conditional with the header node of their most enclosing nway conditional. This information will be used later when performing loop structuring.

4.5. STRUCTURING ALGORITHMS 65 The algorithm used to tag each node within a given nway conditional h3f is essentially a modified depth first traversal from h. All nodes (except for the nway header node and its follow) visited during the traversal are tagged with h as their nway header unless it has already been tagged with another nway header node or it was reached via a back edge. The base case for the recursive call is when the follow node is reached. If h3f encloses another nway conditional, h

2

3f

2

(where f

2

may be f ), then all the nodes within h

2

3f

2

will already have been tagged. This is because we use the post-ordering of

the nodes to traverse the graph in such a way that inner nway conditional will be structured first. Therefore, when we come across the header of an inner (or nested) nway conditional, we can jump directly to the follow node of the inner conditional. This means that each node will only be traversed once during all the nway tagging traversals for the complete CFG. The only nway conditional in Figure 4.19 is (1032). The nway tagging traversal for this conditional will tag nodes 3 : : 9 as having node 10 as their most enclosing nway conditional header. The algorithm that performs this traversal is shown in Figure 4.20.

procedure TagNodesInCase (G = (N ; E ; n); h; f ) * Pre: (h3f ) is an nway conditional and n = h or is in a branch of (n3f ) * Post: all the nodes within the branches of (h3f ) * that are not already members of another nway conditional have been * tagged as belonging to this nway

if (n 6= h)

caseHead(n) := h if (nodeType(n) = nway ^ condFollow(n) 6= f )

TagNodesInCase(G = (N ; E ;condFollow(n)); h; f ) else

for (all c 2 succ(n) ^ caseHead(c) is undefined)

if (c 6= f ^ : isBackEdge(n; c)

TagNodesInCase(G = (N ; E ; c); h; f ) end procedure

Figure 4.20: Algorithm to Tag the Nodes within an Nway Conditional.

Complete Conditional Structuring Algorithm The complete algorithm for structuring both types of conditionals is given in Figure 4.21. The nodes of the graph are traversed by an ascending post 66 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS order traversal. This ensures that inner conditionals are structured before outer ones. Only nodes with at least two out edges are analysed during the traversal.

procedure StructConds (G = (N ; E ; h)) * Pre: The immediate post-dominator has been determined for each node * Post: The conditional headers have been tagged, their follows have been determined. * Each node has been tagged with the header of its most enclosing nway conditional * and the type of a 2way conditional has been determined.

for (all n 2 N in ascending order ffl #succ(n) ? 1)

structType(n) := Cond if (9 m ffl (n; m) 2 E ^ isBackEdge(n; m))

condType(n) := ifThenElse else

condFollow(n) := iPDom(n)

if (n is an nway node)

condType(n) := Case TagNodesInNway(n,condFollow(n)) elseif (succ(n,Then) = condFollow(n))

condType(n) := ifElse elseif (succ(n,Else) = condFollow(n))

condType(n) := ifThen else

condType(n) := ifThenElse end procedure

Figure 4.21: Conditional Structuring Algorithm

4.5.5 Loop Structuring using Derived Sequences of Graphs This section describes how loops are structured using the DSG theory. It is based upon the work on loop structuring given in [5]. Slight modifications have been made as a result of further development on the original algorithm.

The first step in structuring loops is to determine how a loop is represented within a CFG. A loop must involve a back edge from its latching node to its header node. Therefore, we want to know both of these nodes for any loop. We also want to know which nodes are within a given loop as well as a nesting order between the loops. Hecht in [8] made the point that the

4.5. STRUCTURING ALGORITHMS 67 representation of loops by cycles is too fine as not all loops are properly nested or disjoint. The use of strongly connect components is too coarse as there is no nesting order whereas strongly connected regions do not provide a unique coverage of the graph and does not cover the whole graph. The interval and derived sequence of graphs theory discussed in Section 4.3.1 provides us the information we require: one loop per interval and a nesting order given by the derived sequence of graphs.

2 43 11 12

13

5

7 8

9

6

10

I3={3,4,10..12}

I1={13}

I5={2} 1 I2={1}

I4={5..9}

Figure 4.22: The intervals of the Control Flow Graph of Figure 4.14 For some interval I (h) (where h is the interval header), there is a loop headed by h if there is an edge from some node l 2 I (h) to h. These edges will invariably be a back edge and l will be the potential latching node for the loop. Figure 4.22 shows the intervals determined for the graph of Figure 4.14. The interval I

4

is the only interval so far that has a loop in it. This loop,

9 \Phi 5 results from the back edge from 5 to 9.

To detect the loop 12 \Phi 2, we make use of the derived sequence of graphs. To find the next graph in the sequence, we collapse each interval into a single node and form the new graph from these nodes. Figure 4.23 shows the derived sequence of graphs for the original flow graph of Figure 4.22. In the

68 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS first derived graph, G

1

, the loop 12 \Phi 2 is now detectable by the back edge

from the node I

5

(which contains node 2) to the node I

3

(which contains node

12). The derived graph G

2

does not contain any back edges and therefore,

there are no more loops in the graph.

I1 I2 I3

I4 I5

I8={I3,I4,I5} I6={I1}I7={I2} I6

I7 I9={I6,I7,I8} I9

I8

G1

G2

G3

Figure 4.23: Derived Sequence of Graphs for Figure 4.22 The nesting of loop 9 \Phi 5 within loop 12 \Phi 2 is derived from the fact that the former is was detected earlier in the sequence of derived graphs.

We made a note earlier that a back edge from within an interval to the interval header is only a potential latching node. It may be the case that a more suitable latching node can be found. Consider the example in Figure 4.24. In the first graph, G

0

, the interval I

1

contains a back edge to the interval

header, node 1, from node 2 (which is within the same interval). This will result in the initial structuring of the loop 1 \Phi 2. When considering the next graph in the sequence, G

1

, we are presented another possible latching

node, node 4 (from interval I

2

) for a loop headed by node 1. We therefore

restructure the loop to be 1 \Phi 4 as it encloses more nodes. We generalise this decision and state that the latching node found latest in a sequence of derived graphs will always overide an alternative latching node for the same loop header found earlier in the sequence.

Given a control flow graph G = G

0

that has been annotated with its intervals,

its derived sequence of graphs G

0

: : G

n

and the set of intervals of these

graphs, I

0

: : I

n

, we can use the following algorithm to structure loops: each

header of an interval in G

0

is checked for having a back edge from a latching

node that belongs to the same interval and is enclosed by the same nway

4.5. STRUCTURING ALGORITHMS 69

1 2 3

I1={1,2} I3={I1,I2}

G1 I1 I2

4

5

I2={3,4,5}

G0 Figure 4.24: A loop header with two potential latching nodes.

conditional

2

. We choose the latching with the least post-order value if there

is more than one. If a latching node was found, then we check to see if the header has already been structured as a loop header (from a derived graph earlier in the sequence). If so, then this latching node is set back to what is was determined to be (an if..then..else conditional header) before being detected as a latch node and all the nodes within the now defunct loop have their loop header marker set back to undefined. We now set the latch node to be the most recently found latch node and together with the header node, a loop has been found. We now determine its type (preTested, postTested or endless) and mark all the nodes belonging to it. Next, the intervals of G

1

, I

1

are checked for loops and the process is repeated until the intervals in

I

n

have been processed. Whenever there is a potential loop (i.e. an interval

header that has a back edge into from a node within the same interval) such that the loop header and latch node are currently marked as belonging to different loops, the new loop is disregarded as it belongs to an unstructured loop. This back edge will always generate a goto during code generation. The complete algorithm is given in Figure 4.25. This algorithm correctly determines the appropriate nesting