PARUS Project Page

PARUS Project Page / Manual / Utilities / parser

Utility information

Utility name	parser
Usage	parser {filename1}, ... {filenameN}

C program dependencies analyzer

General information

The target is to develop a tool, that converts sequential program text into data dependency graph. The language source code has to be written on is ANSI ISO C. With this implementation it is supposed to investigate abilities and perspectives of a fully automatical approach to creation of parallel programs. The resulting graph can be used as an independent form the machine arcitecture way of program representation.

Text analysis is spent in assumption of static memory distribution. Program couldn't hava dynamical memory allocation, e. g. malloc system call. Surely, statical analysis can't fully consider all C programming language features because of variety of functional features and widely used programming techniques, which programmers apply in their daily practice. For instance, it is very difficult to use statical analysis if pointer type and operations with objects of such a type which are stipulated in C language are included in representation. Therefore, this and similiar features are not analized in current version of developed analizer, and program end its work with diagnostical message when detects it. Existance check for not currently supported feature in C program is made at informational dependencies definition stage, after text recognition.

Main strategy of building graph using C program is strtegy of maximal separating of operations for getting widest representation of graph. In other words, it is maximization of number of independent among themselves operations. Arithmetical equations are converted in tree structure, on which basical number of operations and special transformations are implemented. For instance, it is possible to calculate the result of equation if values of variables which in this equation are are defined, linearize and simplify indexed expression and so on. Next, this structure is used in representation of information about dependencies. It can be noticed that there is no any analysis of parallel execution abitity for expressions, because changing of calculation flow may lead into incorrect or errorneous results (arithmetical floating-point operations are not commutative). But rearranging of calculations in expression is possible. It can be reached by preliminary calculating of functions with subsequent replacement of call with result. Note that such a function hasn't have any collateral effects.

Set of global variables if defined for each function, which operates with it. Weight of function, which determines its calculation comprehensiveness, is calculated. In theory, it is possible to make insertion of body of function with its actual parameters in call position. In С++ such a functions are marked with inline keyword. Unfortunately, this feature isn't implemented in current version. Several significant restrictions are applied on these functions. In our case the most significant restriction is unique return in function. This operator set so called control dependency and can't be interpreted in terms of informational dependencies directly. Unique (unconditional) return operator can be replaced with assigment of return value to a variable. If function body contains conditional operators with return, text can't be inserted in call position of function. Such a style of programming is widely used because of its convience and presentation.