
Software systems and computational methods

New built-in tools for extending the Planning C language

Pekunov Vladimir Viktorovich

Doctor of Technical Sciences

Software Engineer, JSC "Informatika"

153000, Russia, Ivanovo Region, Ivanovo, Tashkentskaya St., 90

pekunov@mail.ru

DOI: 10.7256/2454-0714.2022.1.37240

Received: 30-12-2021

Published: 03-04-2022


Abstract: This paper considers the problem of developing language extensions for Planning C (a dialect of C++). Existing solutions that translate newly introduced constructions into output code, both external tools and facilities built into languages, are reviewed. The analysis suggests that the most natural built-in solution is a combination of improved regular expressions (to recognize new constructions) with code generators based on procedural-syntactic macros, supplemented by elements of direct logic programming (both in the macros and in the regular, more precisely regular-logical, expressions). The proposed approach identifies replaceable constructions more flexibly than template-based approaches and replaces them with output code more simply than approaches that manipulate the syntax tree. The syntax and semantics of the proposed facilities are described. A preprocessing scheme is proposed in which scanners (groups of parameterized regular-logical expressions) recognize the source constructions and deductive macromodules (with possible multiple matching) replace them with output code. The scheme works with arbitrary input and output syntaxes and allows new constructions to be introduced into Planning C quickly, which is especially valuable, for example, when prototyping new extensions. The paper reports the successful testing of the proposed approaches on the development of a number of syntactically non-trivial extensions of Planning C.


Keywords:

programming language, Planning C, language extension, syntactic macros, procedural macros, text transformations, deductive macros, regular expressions, logic programming, preprocessing

This article is an automatic translation of the Russian original.

When developing software in C++ and languages derived from it, one often has to write, repeatedly, rather cumbersome algorithmic constructions that are impossible or difficult to formalize as a function or a macro. One example is a loop with a handler executed when the loop is left via break, or a nested loop with the ability to break out to an arbitrary nesting level. In such cases it would be very useful to have lightweight language-extension schemes that do not require introducing the extensions at the compiler level (as is done, for example, in Clang-based projects) but are limited to direct processing of the program source code, that is, to text transformations. Such transformations can be:

a) direct, of the form "text -> text" (work here is usually done at the level of text templates with variables);

b) mediated, of the form "text -> token stream -> text" (text templates, possibly written with some grammar, or procedural descriptions of the transformation algorithms can be used here as well);

c) mediated, of the form "text -> abstract syntax tree -> text" (this requires a description of at least the input grammar, as well as tree transformation rules written in terms of that grammar).

Such transformations can be performed either by external transformer programs (which is not very convenient to use) or by built-in language facilities such as syntactic (declarative or procedural) macros. Since C++ has no such standard facilities, developing them, at least within C++ dialects, is a topical task. Here this task is solved within the Planning C language (a C++ dialect).
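As a concrete illustration of the motivating problem, the following sketch shows the hand-written C++ workaround for detecting that a loop was left via break; the helper name is hypothetical, chosen only for this example:

```cpp
#include <cstddef>
#include <vector>

// Hand-written emulation of a "while ... else on break" construction: an
// auxiliary flag records whether the loop was left via break. This is the
// kind of boilerplate that a language extension could generate automatically.
bool left_via_break(const std::vector<int>& a) {
    bool broke = false;
    std::size_t k = 0;
    while (k < a.size()) {
        if (a[k] == 0) { broke = true; break; }  // break must be detectable afterwards
        ++k;
    }
    return broke;  // true: the break-handler branch should run
}
```

The flag manipulation has to be repeated, by hand, at every such loop, which is exactly the kind of repetition a text transformation can remove.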

Among transformer programs we mention Stratego/XT ([1]; "text -> abstract syntax tree -> text" transformations, with the tree represented in a textual list form), TXL ([2]; an approach close to "text -> token stream -> text" using grammar-based templates), the DMS Software Reengineering Toolkit ([3]; a "text -> text" template approach, with elements of procedural processing in a Pascal-like language also possible), and some Russian developments based on "text -> text" templates [4] and on manipulations with the abstract syntax tree of the source text [5].

Built-in facilities for rapidly extending a language with new syntactic constructions are usually varieties of syntactic macros. For example, PL/1 has procedural-syntactic macros: fragments of PL/1 code executed at compilation time that take a set of input parameters and generate textual output, which is then embedded in the program at the point where the macro is invoked. The disadvantage of this approach is its exclusive focus on the procedural syntax of extensions.

Classic Lisp macros can also be regarded as a kind of syntactic macro: they, too, effectively modify the source program by inserting flexibly generated fragments into it. Here, however, the input fragments must fit the common list-and-bracket syntax of Lisp, unlike ordinary syntactic macros, which allow deviations from the general syntax of the source language. Note that modern Lisp also has reader macros, which do allow departing from the common basic syntax.

Rust [6] and Nemerle [7] have declarative macros, which execute some code for expressions matching a given pattern, as well as procedural-syntactic macros, which accept a token stream (read directly or unfolded into an abstract syntax tree) and execute code that generates a replacement token stream.

Scala [8] includes procedural-syntactic macros that work with the abstract syntax tree of the macro arguments, replacing the macro call with a configurable output fragment embedded in the program tree in place of the source fragment. As in PL/1, the disadvantage of this approach is its focus on a near-procedural syntax.

Enhanced C# [9] contains LeMP, a macro system whose macros either use parameterized code templates or contain compile-time code that works with an abstract syntax tree. It also contains LLLPG, a system for generating lexical analyzers from regex-like expressions with embedded C# code.

Summarizing this brief review: approaches based on parameterized input templates or on a procedural input syntax cannot describe complex multi-part input constructions whose syntax differs significantly from the source language. Approaches that work with an abstract syntax tree can describe such constructions, but they are rather complex to program. A near-optimal approach therefore seems to be one in which the construction is first recognized by a scheme similar to LLLPG (based on some kind of regular expressions, for example regular-logical expressions (see, e.g., [10, 11]), which go beyond regular grammars), and code generation is then performed by procedural-syntactic macros of the kind implemented in Rust and PL/1.

It should also be taken into account that language extensions with complex (for example, parallelizing) constructions may require fairly intelligent algorithms for analysis and for generating the replacement output code. An approach based on multiple mutual coordination of macros, and on the use of logic programming elements not only at the level of regular-logical expressions but also during output code generation, therefore seems justified.

Thus, the purpose of this work is to increase the efficiency of programming in the Planning C language (a C++ dialect) by introducing built-in extension facilities based on scanning regular-logical expressions combined with logical, code-producing macros. To achieve this goal we set the following tasks: a) define a new preprocessing scheme for Planning C that recognizes the extension constructions introduced into the language and invokes the deductive macromodules corresponding to them; b) describe the syntax and semantics of the facilities for recognizing the extension constructions; c) test the developed approaches.

 

The mechanism for implementing extensions. Improved preprocessing scheme

Note that the introduced syntactic constructions may require mutual coordination even when they are scattered across the program code, for example in order to select the optimal implementation algorithm or to initialize some new auxiliary variables/constants on the fly. Since in Planning C deductive macromodules, which usually perform on-the-fly code synthesis, are written in GNU Prolog, it makes sense to implement the collective coordination of the macromodules that "compile" the new constructions through a single GNU Prolog fact base. The linear nature of program analysis must also be taken into account: constructions occurring earlier in the text must be able to coordinate their behavior with constructions occurring later. The simplest implementation of this is repeated preprocessing, with the single fact base preserved between passes.

It remains to define the mechanism that syntactically recognizes the newly introduced program constructions and "folds" them into a parameterized call of a deductive macromodule, which performs the code synthesis (or some other transformation of the recognized syntactic construction). It seems justified to introduce into the language the concept of a scanning module, or scanner (there may be many such modules).

A scanner has three regular-logical expressions [10, 11]: the left context, the main expression, and the right context. The scanner examines the program text, finds fragments matching the concatenation of these three expressions, and either replaces the fragment matching the main expression with a call of a deductive macromodule (later expanded into some program code), a construct characteristic of Planning C, or leaves it unchanged and merely places an additional fact (to be taken into account by other macromodules) into the common fact base. Note right away that although the syntax of a program is usually not reducible to regular grammars, the program can, at least in most cases, be parsed, with the new constructions recognized, using regular-logical expressions, which can make extensive use of various pluggable external predicates performing complex kinds of parsing. In particular, to parse the constructions mentioned in this paper only one such external predicate was required: BAL, which determines whether its argument is an expression balanced with respect to the various kinds of brackets.
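The check performed by a BAL-style predicate can be sketched in C++ as follows; this is only an illustration of the idea of bracket balance, not the actual GNU Prolog predicate:

```cpp
#include <stack>
#include <string>

// Sketch of a BAL-style check: is the fragment balanced with respect
// to the three bracket kinds (), [] and {}?
bool balanced(const std::string& s) {
    std::stack<char> st;
    for (char c : s) {
        switch (c) {
            case '(': case '[': case '{':
                st.push(c); break;
            case ')':
                if (st.empty() || st.top() != '(') return false;
                st.pop(); break;
            case ']':
                if (st.empty() || st.top() != '[') return false;
                st.pop(); break;
            case '}':
                if (st.empty() || st.top() != '{') return false;
                st.pop(); break;
        }
    }
    return st.empty();  // every opener must have been closed
}
```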

Based on the above, a new preprocessing scheme was implemented in Planning C:

1. Let the required maximum number of preprocessing cycles K be given, along with the source code of the program.

2. First phase. The entire program text is analyzed, executing preprocessor directives and performing macro substitutions in the C++ style.

3. Second phase. The entire program text obtained so far is examined; all scanning macros (in the Planning C style) and the directives specifying the scanners to use, their mode, and their order of application are detected. The scanning macros are then applied to the current program text in parsing or scanning mode (discussed below). The recognized expressions are replaced by macromodule calls or by empty strings (with the generation of a GNU Prolog fact).

4. Third phase. The resulting program text is examined again: a table of named constants is collected and macromodule calls are located, with constant values substituted into their parameters (if a parameter of a macromodule is an expression, it is evaluated). Program code is then generated at the points where the macromodules are invoked. To do this, if the body of a macromodule contains goals, a GNU Prolog program is formed comprising the library predicates (characteristic of Planning C) and, in turn, the predicates and goals of the macromodule. The generated logic program is processed, for each goal, by calling the external standard GNU Prolog interpreter (the logic program takes into account the facts available in the base and may add to it). The resulting console output is inserted into the body of the module in place of the corresponding goal. After the predicates are removed from the module body, we obtain code ready to be inserted into the Planning C program.

5. If the prune predicate was called during the execution of any deductive macromodule, or the maximum number of preprocessing cycles has been reached, preprocessing ends. Otherwise all preprocessing results are discarded (except the single fact base), the program source code is reloaded, and control returns to step 2.
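The control flow of steps 1-5 can be sketched as follows. The phase functions and the prune flag here are hypothetical stand-ins for the real Planning C preprocessor machinery; only the cycle structure is taken from the scheme above:

```cpp
#include <string>
#include <vector>

// Single GNU Prolog fact base, preserved between preprocessing cycles.
struct FactBase { std::vector<std::string> facts; };

// Stub phases (hypothetical): in the real preprocessor these perform
// C++-style macro expansion, scanning-macro application, and deductive
// macromodule expansion respectively.
static std::string phase1_cpp_macros(const std::string& src) { return src; }
static std::string phase2_scanners(const std::string& src, FactBase&, bool& prune) {
    prune = true;  // stub: no macromodule requested another pass
    return src;
}
static std::string phase3_macromodules(const std::string& src, FactBase&) { return src; }

std::string preprocess(const std::string& source, int K) {
    FactBase db;                 // survives all cycles (step 5)
    std::string result = source;
    for (int cycle = 0; cycle < K; ++cycle) {
        bool prune = false;
        std::string text = phase1_cpp_macros(source);  // phase 1
        text = phase2_scanners(text, db, prune);       // phase 2
        result = phase3_macromodules(text, db);        // phase 3
        if (prune) break;  // prune called, or the loop bound enforces K cycles
        // otherwise: everything except db is discarded, source reloaded
    }
    return result;
}
```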

 

Preprocessing management

Let us introduce into Planning C a directive defining the maximum number of preprocessing passes:

"#" "preproc_passes" "(" max_number of stages ")"

Additionally, four new directives are introduced that control the way scanning macros are applied in the second phase of preprocessing:

a) application in scanning mode. Adding scanners to the list:

"#" "add_scan" "("scanner id {","scanner id}")"

Specifying an exhaustive set of scanners:

"#" "scan" "(" scanner id {","scanner id}")"

In this mode all scanners are iterated once, sequentially (in the order given in the directives); each scans the entire program and generates a list of all required substitutions. All scanners work with the same program text as obtained before the second phase of preprocessing. After scanning completes, the accumulated list of substitutions is applied to the current program text. This mode is effective when there are few scanners (applicable only to individual fragments of the program), since scanning the entire program with each scanner can be quite time-consuming;

b) application in parsing mode. Adding scanners to the list:

"#" "add_parse" "(" scanner id {","scanner id}")"

Specifying an exhaustive set of scanners:

"#" "parse" "(" scanner id {"," scanner id}")"

Here the program is examined sequentially, in a single pass. For each successive fragment of the program, the scanners are tried one after another (in the order given in the directives); as soon as some scanner identifies the fragment, the required replacement is added to the substitution list and the analysis of that fragment ends. Unlike the first mode, if a fragment is identified by none of the scanners, an error is raised. All scanners work with the same program text as obtained before the second phase of preprocessing. After the program text has been examined, the accumulated substitution list is applied to it. This mode is usually used when a complete analysis of the program is necessary, for example to translate it into another language or to parallelize it automatically. In such cases parsing is usually more efficient than scanning.
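The parsing mode just described (single pass, first matching scanner wins, error on an unidentified fragment) can be sketched with ordinary regular expressions standing in for the regular-logical ones; the Scanner structure and function name are hypothetical:

```cpp
#include <regex>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical scanner: a pattern plus the replacement text it emits
// (standing in for a macromodule call or fact generation).
struct Scanner { std::regex pattern; std::string replacement; };

// Parsing mode: traverse the text once; at each position try the scanners
// in order, take the first match, and fail on an unidentified fragment.
std::string apply_parse_mode(const std::string& text,
                             const std::vector<Scanner>& scanners) {
    std::string out;
    std::size_t pos = 0;
    while (pos < text.size()) {
        bool matched = false;
        std::string rest = text.substr(pos);
        for (const auto& s : scanners) {
            std::smatch m;
            if (std::regex_search(rest, m, s.pattern,
                                  std::regex_constants::match_continuous)) {
                out += m.format(s.replacement);  // add to the substitution result
                pos += m.length(0);              // analysis of this fragment ends
                matched = true;
                break;                           // first matching scanner wins
            }
        }
        if (!matched)
            throw std::runtime_error("fragment not identified by any scanner");
    }
    return out;
}
```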

 

Scanning macros

scanner = "#" "def_pattern" spaces scanner_id [spaces] "=>"
[spaces] (("[" fact_id "]") | macromodule_id) [spaces] "(" [fact_or_macromodule_parameters] ")" [spaces] "{" scanner_body "}" ";"

fact_or_macromodule_parameters = parameter {[spaces] "," [spaces] parameter}

parameter = string_constant_in_apostrophes | xpath_expression | list

list = "[" [parameter {[spaces] "," [spaces] parameter}] "]"

scanner_body = [left_context] main_expression [right_context]

left_context = regular_logical_expression NL

NL = newline

right_context = regular_logical_expression NL

main_expression = "@" "begin" NL regular_logical_expression NL "@" "end" NL

Some additional explanation is needed. As soon as the scanner finds a fragment matching the concatenation of the left context (if defined), the main expression, and the right context (if defined), the fragment matching the main expression is cut out of the program and replaced either by a macromodule call or by an empty string with the generation of a fact (which is placed into the single fact base). The parameters of the fact or macromodule are determined according to the scheme given in the scanner definition.

An xpath_expression is specified in the parameters, as a rule, either to call standard functions (for example, gid(), which yields a unique identifier of the found fragment matching the main expression, or random() and randomid(), which return a random number and a random identifier respectively) or to extract the values of variables of a regular-logical expression [10]. If the variable has a single value, the result is a scalar string value; if it has multiple values, the result is a list of those values formatted according to GNU Prolog rules; if it has no value, the result is an empty string.

 

Approbation

1. Consider a small example. Suppose we need to introduce into Planning C a while loop with break handling, as is done in Python. Let such a loop have the general form:

while (condition) {
    loop_body
}
else
    break_handler

Figure 1 shows a fragment of a program with a scanner and a directive enabling the scanning mode.

Fig. 1. The initial fragment of the program with the scanning module

The remaining part of the program, with the deductive macromodule and the main part containing the loop with break handling, has the form:

#def_module() while_else(GID, COND, BODY) {
@goal:-brackets_off.
@goal:-
  write('bool __break'), write(GID), write(' = false;'), nl,
  write('while('), write(COND), write(') {'), nl,
  write('  __break'), write(GID), write(' = true;'), nl,
  write(BODY), nl,
  write('  __break'), write(GID), write(' = false;'), nl,
  write('}'), nl,
  write('if (__break'), write(GID), write(') '), nl.
};

 

int main() {
    int A[10];
    int k = 0;
    while (k < 10) {
        cin >> A[k];
        if (A[k] == 0) break;
        k++;
    }
    else
        cout << "Zero was entered!" << endl;
    return 0;
}
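Reading off the write() calls of the while_else macromodule, the loop above expands into the following C++ code (GID is assumed here to be 1; cin is replaced by an arbitrary input stream so the expansion can be exercised directly, and the function wrapper is added only for this illustration):

```cpp
#include <istream>
#include <sstream>
#include <string>

// The code the while_else macromodule emits for the example loop.
std::string run_expanded(std::istream& in) {
    int A[10];
    int k = 0;
    bool __break1 = false;           // generated: bool __break<GID> = false;
    while (k < 10) {                 // generated: while(<COND>) {
        __break1 = true;             // generated:   __break<GID> = true;
        in >> A[k];                  // original loop body...
        if (A[k] == 0) break;        // ...a break leaves the flag set to true
        k++;
        __break1 = false;            // generated:   __break<GID> = false;
    }                                // generated: }
    if (__break1)                    // generated: if (__break<GID>) <else-branch>
        return "Zero was entered!";
    return "";
}
```

The flag is raised at the top of each iteration and lowered at the bottom, so it remains true exactly when the loop was left via break, which is when the else branch must run.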

2. Using the scheme proposed in this paper, a number of extensions were introduced into the Planning C language:

a) directives for enabling memoization of functions/procedures (void functions) with the generation of appropriate auxiliary code;

b) anonymous procedures and functions with re-entry planning (PPPV and FPPV), translated into sets of auxiliary lambda functions and classical PPPV/FPPV;

c) anonymous parallel computational topologies translated into sets of auxiliary lambda functions and ordinary static topologies.

The operation of these extensions has been tested on a number of meaningful examples.
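As an illustration of extension (a), the auxiliary code that a memoization directive could generate for a function might look as follows. The directive syntax and the generated names are not taken from the paper; this is only a hedged sketch of the well-known caching pattern:

```cpp
#include <map>

// Sketch of auxiliary code a memoization directive could generate for a
// recursive function: a static cache keyed by the argument.
long long fib(int n) {
    static std::map<int, long long> cache;      // generated auxiliary storage
    auto it = cache.find(n);
    if (it != cache.end()) return it->second;   // generated cache lookup
    long long r = (n < 2) ? n : fib(n - 1) + fib(n - 2);
    cache[n] = r;                               // generated cache store
    return r;
}
```

With the cache, the otherwise exponential recursion runs in linear time per distinct argument.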

 

Conclusions

In this paper, a number of new solutions for introducing extension facilities into the Planning C language have been proposed. Unlike analogous systems, these solutions are based on elements of direct logic programming and on regular-logical expressions, which identify replaceable constructions more flexibly than template-based approaches and replace them with output code more simply than approaches based on manipulations with the syntax tree. The syntax and semantics of the proposed facilities have been described. A preprocessing scheme has been proposed in which scanners recognize the source constructions and deductive macromodules (with possible multiple matching) replace them with output code. The scheme works with arbitrary input and output syntaxes and allows new constructions to be introduced into Planning C quickly, which is especially valuable, for example, when prototyping new extensions. The proposed approaches have been successfully tested by developing a number of syntactically non-trivial extensions of Planning C.

The following directions for further research on this topic can be noted:

1. Self-modification and self-coordination of a program can be implemented through scanner-macromodule pairs. For example, scanners could detect an implementation of the bubble sort algorithm and call a macromodule that replaces the fragment with an implementation of quicksort. One can also speak of automatic selection of the computational topology for the problem being solved, with insertion of the corresponding implementing code fragments.

2. Verification of the program's algorithms is possible: a) scanners identify the implemented algorithms and fold them into calls of certain concepts; b) the relationships between the concepts are determined; c) these actions are repeated until a final concept graph is obtained, which can be analyzed for conformance with the statement of the problem the program solves.

 

References
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.

Peer Review

Peer reviewers' evaluations remain confidential and are not disclosed to the public. Only external reviews, authorized for publication by the article's author(s), are made public. Typically, these final reviews are conducted after the manuscript's revision. Adhering to our double-blind review policy, the reviewer's identity is kept confidential.

The reviewed material is devoted to solving programming problems connected with the need to write, repeatedly, rather cumbersome algorithmic constructions that are impossible or difficult to formalize as a function or macro in C++ and languages derived from it. The research methodology rests on developing proposals for using a dialect of the C++ programming language to perform text transformations of program code without introducing extensions at the compiler level. The study is relevant because of the practical need to write such cumbersome algorithmic constructions repeatedly in C++, the absence in that language of standard facilities for implementing transformations in the manner of syntactic (declarative or procedural) macros, and the inconvenience of using external transformer programs for these purposes. In the reviewer's opinion, the scientific novelty of the presented research lies in the solutions proposed by the author of the article, which, unlike their analogues, are based on elements of direct logic programming and regular-logical expressions, allowing replaceable constructions to be identified more flexibly than with template approaches and replaced with output code more simply than with approaches based on manipulations with the syntax tree. The article contains the following sections: The mechanism for implementing extensions. Improved preprocessing scheme; Preprocessing management; Scanning macros; Approbation; Conclusions; Bibliography.
The beginning of the article briefly states the essence of the problem addressed, outlines the three main kinds of text transformations of program code, gives a brief overview of transformer programs (Stratego/XT, TXL, DMS Software Reengineering Toolkit) and of syntactic macros (in PL/1, classic Lisp macros, declarative macros in Rust and Nemerle, etc.), and substantiates the advantages of solving the stated problem with the high-level classical and parallel programming language Planning C. The strengths of the article include the presentation of alternative approaches to the problem considered and the inclusion of program code fragments. Among the shortcomings, it should be noted that the initial part of the article (before the section "The mechanism for implementing extensions. Improved preprocessing scheme") is not properly structured: sections such as "Introduction" and "Materials and methods" are not set off. The presented material corresponds to the scope of the journal "Software Systems and Computational Methods" and contains proposals aimed at increasing the efficiency of programming in the Planning C language by adding built-in extension facilities based on scanning regular-logical expressions combined with logical, code-producing macros. The article is recommended for publication after minor revision related to the need to title the initial part (or parts) of the publication.