@(#)userman.html 1.1 - 03/15/02

RECON2 USER'S MANUAL

This software was created as a project by students in the Software Engineering program at the University of West Florida.

TABLE OF CONTENTS

  1. DISCLAIMER OF WARRANTY
  2. INTRODUCTION
    1. Purpose
    2. What You Need
    3. Problem Reporting
    4. Cautions and Warnings
  3. QUICK START TUTORIAL
    1. About the Tutorial Program
    2. Copying and Compiling the RECON2 Source Files
    3. Steps for Tutorial Setup
    4. Instrumenting the Source Code
    5. Compiling the Instrumented Version
    6. Testing the Instrumented Version
    7. Analyzing the Test Results
  4. USING RECON2 OPTIONAL INSTRUMENTATION FEATURES
  5. UNIX MULTI-PROCESS PROGRAMS
  6. RECON2 ENHANCEMENTS
  7. RECON2 COMMANDS AND FILES
  8. GLOSSARY


1. DISCLAIMER OF WARRANTY

In accepting and using RECON2, you agree to the following disclaimer of warranty:

THIS SOFTWARE AND MANUAL ARE PROVIDED "AS IS" AND WITHOUT WARRANTIES AS TO PERFORMANCE OR MERCHANTABILITY OR ANY OTHER WARRANTIES WHETHER EXPRESSED OR IMPLIED. BECAUSE OF THE VARIOUS HARDWARE AND SOFTWARE ENVIRONMENTS INTO WHICH THIS PROGRAM MAY BE PUT, NO WARRANTY OF FITNESS FOR A PARTICULAR PURPOSE IS OFFERED.

GOOD COMPUTER PROCEDURE DICTATES THAT ANY PROGRAM BE THOROUGHLY TESTED WITH NON-CRITICAL DATA BEFORE RELYING ON IT. THE USER MUST ASSUME THE ENTIRE RISK OF USING THE PROGRAM. ANY LIABILITY OF THE PROVIDER WILL BE LIMITED EXCLUSIVELY TO PRODUCT REPLACEMENT.

Some states do not allow the exclusion of the limit of liability for consequential damages, so the above limitation may not apply to you.

This agreement shall be governed by the laws of the State of Florida and shall inure to the benefit of the Wilde Bunch, the University of West Florida, and any successors, administrators, heirs, and assigns. Any action or proceeding brought by either party against the other arising out of or related to this agreement shall be brought only in a STATE or FEDERAL COURT of competent jurisdiction located in Okaloosa or Escambia County, Florida. The parties hereby consent to in personam jurisdiction of said courts.


2. INTRODUCTION

2.1 Purpose

RECON2 is a tool to help software engineers locate the parts of a large target program that implement a particular program feature. First, RECON2 makes an instrumented copy of the user's target program source code with output statements on each branch structure. Then, the software engineer compiles and runs the instrumented code using some test cases that use the feature and others that do not. RECON2 identifies which branches in the program were executed most frequently when the feature was being used. These branches provide good starting points for locating the feature when reading the program's code.

If you have not already done so, read the document RECON - OVERVIEW for a quick introduction to the steps in using RECON2. For information on how to install and build RECON2 on your system, refer to the "readme.txt" file that is included in the distribution file.

2.2 What You Need

To use RECON2, you need a C/C++ compiler. It is assumed that you have a broad knowledge of C programming and are familiar with compilers, editors, and testing. You must know how to recompile and relink the target program. You must also understand the purpose of the target program well enough to be able to write test cases that exhibit the program feature you want to locate.

2.3 Problem Reporting

Please report all documentation and software problems to:

Dr. Norman Wilde
Department of Computer Science
University of West Florida
Pensacola, FL 32514
(904) 474-2548

Let us know if you have any suggestions for improvement to RECON2.

2.4 Cautions and Warnings

Normally, RECON2 does not modify the original source code files, but if the user specifies that the instrumented source path is the same as the original source path, the original source file will be overwritten by the instrumentation program.

RECON2 may substantially slow your software since it keeps track of each time a branch is executed. If performance is important, keep test cases short or avoid instrumenting files that contain the innermost loops of your program. Consider also using the Minimum Trace option described in the Glossary. There may be a modest size increase in your compiled code.


3. QUICK START TUTORIAL

3.1 About the Tutorial Program

All the examples in this document are based on a simple Reverse Polish Notation (RPN) Calculator program adapted from Kernighan and Ritchie, The C Programming Language (2nd edition). The program can be considered to have four features: addition, subtraction, multiplication, and division. We will show how the branches for the multiplication feature are located. The four files that comprise the calculator program are:

rpnmain.c
rpngetop.c
rpnstack.c
rpngetch.c

In reverse Polish notation, each operator follows its set of operands; an infix expression like:

(1 - 2) * (4 + 5)
is entered as:
1 2 - 4 5 + * Q

Each test case is terminated by Q to end execution of the tutorial program.

To find the unique code where the multiplication operation is executed, first define a set of test cases, some of which use the multiplication ("*") operator, some of which do not. For the example, we have selected the test cases in Table 1. Note that there is one test case that exhibits the multiplication feature and three that do not.

Test case    Description              Exhibits feature?
6 3 * Q      multiplication, 6*3=18   Y
6 3 / Q      division, 6/3=2          N
6 3 + Q      addition, 6+3=9          N
6 3 - Q      subtraction, 6-3=3       N

Table 1. Test Cases for the RPN Calculator

3.2 Copying and Compiling the RECON2 Source Files

Copy and compile the RECON2 source files to a directory as described in the "readme.txt" file that is included in the distribution file. For the purpose of this document we will use this path: /usr/cs/<usrname>/<RPN source file dir> . Substitute a directory name of your choice in its place.

3.3 Steps for Tutorial Setup

The tutorial examples are shown for the UNIX operating system. For the MS-DOS operating system, simply add the drive letter and change the " / " to " \ ".

Create a directory to hold the instrumented source files:

/usr/cs/<usrname>/<RPN source file dir>/inst

Also create a directory to hold the output trace files and analysis:

/usr/cs/<usrname>/<RPN source file dir>/inst/out

3.4 Instrumenting the Source Code

Introduction
Your "C/C++" source files will be "instrumented" so that a trace file will be produced every time your program is executed. Instrumentation involves copying your source files and inserting statements that will write to the trace file when a branch is executed.

You may not want to instrument some files of your program, either because you are sure the feature you are looking for is not located in certain files, or because some files contain loops that will produce a large trace file.

Processing
At the command prompt, run r2inst, giving the full path name of the file to be instrumented and the location where the instrumented file is to be placed. It is best to provide the full directory path name of the source files. If a relative path is given as an input to r2inst, r2analyz will fail to find the source file unless run from the same directory used in running r2inst.

For the RPN tutorial, give the following instrumentation commands:

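r2inst /usr/cs/<usrname>/<RPN source file dir>/rpngetch.c /usr/cs/<usrname>/<RPN source file dir>/inst/rpngetch.c
r2inst /usr/cs/<usrname>/<RPN source file dir>/rpngetop.c /usr/cs/<usrname>/<RPN source file dir>/inst/rpngetop.c
r2inst /usr/cs/<usrname>/<RPN source file dir>/rpnmain.c /usr/cs/<usrname>/<RPN source file dir>/inst/rpnmain.c
r2inst /usr/cs/<usrname>/<RPN source file dir>/rpnstack.c /usr/cs/<usrname>/<RPN source file dir>/inst/rpnstack.c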

Hint: When using RECON2 on your own target program, you can create a DOS batch file or UNIX shell script containing the instrumentation commands. Commands to copy other needed files, such as headers or makefiles, could also be included. Then you can reinstrument easily by just running the batch file or shell script.

Output
The result of the instrumentation process is a set of instrumented source code files. For example, an "if" statement on line 5 would be changed from:

if ('a' == c)
to:
if (('a' == c)?R2True(R2srcfile_ptr,5):R2False(R2srcfile_ptr,5))

where R2srcfile_ptr points to the path of the preinstrumented source file.
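The rewrite above leaves the program's behavior unchanged because each helper returns the truth value of the branch it records. The sketch below illustrates the idea with hypothetical stand-ins; it is not the code in r2protos.c. The names r2_sketch_true, r2_sketch_false, r2_sketch_log, and the r2_trace stream are assumptions made for illustration, and the record layout follows the trace example in Section 3.6.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for R2True/R2False; the real routines live in
   r2protos.c and may differ. Records go to r2_trace (an assumption). */
static FILE *r2_trace;

/* Writes "<tag> <line> <name length> <name>" and returns 1 for 'T', 0 for 'F'. */
static int r2_sketch_log(char tag, const char *src, int line)
{
    assert(src != NULL);
    if (r2_trace != NULL)
        fprintf(r2_trace, "%c %d %zu %s\n", tag, line, strlen(src), src);
    return tag == 'T';
}

static int r2_sketch_true(const char *src, int line)
{
    return r2_sketch_log('T', src, line);
}

static int r2_sketch_false(const char *src, int line)
{
    return r2_sketch_log('F', src, line);
}

/* Instrumented form of: if ('a' == c) return 1; return 0; */
static int is_letter_a(int c)
{
    static const char src[] = "rpngetch.c"; /* illustrative file name */

    if (('a' == c) ? r2_sketch_true(src, 5) : r2_sketch_false(src, 5))
        return 1;
    return 0;
}
```

Because the conditional expression evaluates to 1 on the true path and 0 on the false path, the enclosing "if" takes the same branch it would have taken before instrumentation.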

The instrumented RPN Calculator files will be created and placed in the instrumented source directory:

/usr/cs/<usrname>/<RPN source file dir>/inst/rpngetch.c
/usr/cs/<usrname>/<RPN source file dir>/inst/rpngetop.c
/usr/cs/<usrname>/<RPN source file dir>/inst/rpnmain.c
/usr/cs/<usrname>/<RPN source file dir>/inst/rpnstack.c

3.5 Compiling the Instrumented Version

Introduction
After RECON2 instruments your source code, you will need to compile the instrumented code. This may involve linking with libraries and non-instrumented components.

Input
The instrumented source code files will be used as input along with other source files as needed for your particular system.

Processing
The command you must give to compile will vary depending on the compiler you use. Each compiler has a slightly different command line format. Some compilers are part of a programming environment, and it may be easier to use that environment to create a project file, and then let the environment do the compile.

However your compiler works, here are some things to be careful of:

The following is an example of a compile command for the RPN example using the GNU C/C++ compiler on UNIX:

gcc -o rpn -DSYS_UNIX rpnmain.c rpngetch.c rpngetop.c rpnstack.c r2protos.c

Output
Compilation produces an instrumented executable version of the target program. For the RPN tutorial, an instrumented executable file called rpn is produced.

3.6 Testing the Instrumented Version

Introduction
To locate a feature using the instrumented executable, run test cases that exhibit the feature and others that do not. Usually only a few cases are needed. For the RPN Calculator to locate the multiplication feature, we suggest one case for each of the four features: multiply, add, subtract, and divide. (See Table 1). The instrumented executable will create a trace file from each case.

Input
RECON2 needs to know what name to give each trace file. To specify the trace file name, and an optional comment about the test, run r2test before each test run. r2test stores the trace file name and the comment in a temporary file named r2tmp.dat where the instrumented executable can find them. After the test is finished, run r2end to delete this temporary file. Ensure r2test and r2end are in the execution path. (If the instrumented executable does not find the r2tmp.dat file, it will name the trace unknown.r2t. You can rename it after the test.)

Processing
Here is how to run the multiply test case in the RPN example:

r2test -t out/multiply.r2t -c "Testing multiplication, input is 6 3 *"
rpn
>Welcome to the Reverse Polish Calculator
>Enter Two Numbers Separated by White Space
>then an operator. (i.e., 2 3+)
>Enter Q anytime to Quit
6 3 *
>18
Q

r2end

Repeat the process for add, subtract, and divide by running r2test with
-t out/add.r2t,
-t out/subtract.r2t,
and -t out/divide.r2t
respectively, and running rpn with the * replaced by +, -, and /.

Output
Output trace files named multiply.r2t, add.r2t, subtract.r2t, and divide.r2t are created that record each branch that was encountered, the line number, the switch value (for switch statements only), the string length of the source file name, and the source file name. An example of the trace file is shown below:

        #Testing multiplication, input is 6 3 *
        F 14 31 /usr/cs/<usrname>/inst/rpngetop.c
        F 17 31 /usr/cs/<usrname>/inst/rpngetop.c
        T 20 31 /usr/cs/<usrname>/inst/rpngetop.c
        F 21 31 /usr/cs/<usrname>/inst/rpngetop.c
        F 23 31 /usr/cs/<usrname>/inst/rpngetop.c
        T 28 31 /usr/cs/<usrname>/inst/rpngetop.c
        F 15 31 /usr/cs/<usrname>/inst/rpngetch.c
        T 23 30 /usr/cs/<usrname>/inst/rpnmain.c
        S 25 48 30 /usr/cs/<usrname>/inst/rpnmain.c
        T 11 31 /usr/cs/<usrname>/inst/rpnstack.c
        E 12 14 /usr/cs/<usrname>/inst/rpngetch.c
        E 14 14 /usr/cs/<usrname>/inst/rpngetop.c
        T 14 31 /usr/cs/<usrname>/inst/rpngetop.c
        F 14 31 /usr/cs/<usrname>/inst/rpngetop.c
        etc....


3.7 Analyzing the Test Results

Introduction
For the analysis, you must create a list file that specifies which of your test cases exhibited the feature you are looking for. RECON then determines which branch statements seem to be most closely related to that feature.

Input
The inputs are the trace files produced by the test runs and a list file that names them.

For the RPN tutorial, to locate the multiply feature, create a list file called multiply.lst in the out directory with the following contents. Case is important for the "D", "Y", and "N".

D
>>>>>
95
/usr/cs/<usrname>/<RPN source file dir>/inst/out/add.r2t
N
/usr/cs/<usrname>/<RPN source file dir>/inst/out/divide.r2t
N
/usr/cs/<usrname>/<RPN source file dir>/inst/out/subtract.r2t
N
/usr/cs/<usrname>/<RPN source file dir>/inst/out/multiply.r2t
Y

Processing
To run the analysis program r2analyz, type the command:

r2analyz -r /usr/cs/<usrname>/<RPN source file dir>/inst/out/multiply.lst -p /usr/cs/<usrname>/<RPN source file dir>/inst/out/

Here -r specifies the list file to use and -p specifies the directory in which to place the output. For options, refer to the description of r2analyz in Section 7.

Output
The default output of the analysis program is an annotated listing file of the user's source code for each source code file that was instrumented and met the conditions of the analysis. The annotated files have the same names as the original source files except the extension has been changed to .out. Each control statement that met the conditions of the analysis is annotated with >>>>> and the values that are associated with the control statement. (See RECON - OVERVIEW for an example of the annotated listing output file.)


4. USING RECON2 OPTIONAL INSTRUMENTATION FEATURES

We have found that RECON2 users often want greater control over the tracing process. The following additional calls to functions in r2protos.c can be inserted by hand at key points in the user's program to control tracing:

  1. Suspend Tracing:
    The following functions may be used to suspend tracing, for example, around a particularly tight loop:

        int R2Suspend();
        int R2Resume();

    R2Suspend suspends writing to the trace file until R2Resume is called. Both return 0 if no errors occur.

  2. Insert Comment:
    The following function may be used to insert comment messages:

        int R2Comment(char *msg_ptr);

    R2Comment writes a comment message to the trace file. It returns 0 if no errors occur.

  3. Change to a New Trace File:
    The following function may be used to switch to a new trace file:

        int R2NewTrace(char *r2newfile);

    R2NewTrace closes the current trace, if it is open, and opens the new trace file r2newfile. The trace file path name can be any string acceptable to the fopen function. It returns 0 if no errors occur.
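As a minimal sketch of how these calls might be placed in a target program, the fragment below suspends tracing around a tight loop, writes a comment, and switches trace files. The routines sum_without_tracing and start_second_phase, the comment text, and the file name phase2.r2t are hypothetical. The four stub definitions are stand-ins so that the fragment compiles on its own; in a real build they would be removed and the program linked with r2protos.c.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in stubs (assumptions) so this sketch compiles alone. In a real
   build, delete them and link with r2protos.c instead. */
static int tracing_on = 1;
int R2Suspend(void)              { tracing_on = 0; return 0; }
int R2Resume(void)               { tracing_on = 1; return 0; }
int R2Comment(char *msg_ptr)     { (void)msg_ptr; return 0; }
int R2NewTrace(char *r2newfile)  { (void)r2newfile; return 0; }

/* Hypothetical target-program routine: sums n values without tracing the
   tight loop, and marks the phase with a comment in the trace file. */
long sum_without_tracing(const int *values, int n)
{
    long total = 0;
    int i;

    assert(values != NULL || n == 0);
    if (R2Comment("entering sum_without_tracing") != 0)
        fprintf(stderr, "R2Comment failed\n");

    R2Suspend();                 /* stop writing trace records */
    for (i = 0; i < n; i++)
        total += values[i];
    R2Resume();                  /* tracing continues from here */

    return total;
}

/* Hypothetical: direct later trace records to a separate file. */
int start_second_phase(void)
{
    return R2NewTrace("phase2.r2t");  /* file name is illustrative */
}
```

Pairing every R2Suspend with an R2Resume on the same path keeps later branches from silently going unrecorded.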


5. UNIX MULTI-PROCESS PROGRAMS

Multi-process programs are difficult to trace since the different processes may attempt to write to the same trace file and may produce corrupted output. We have provided an option that may work for multi-process systems under UNIX, but since there are many dialects of UNIX and many compilers, we only offer it on a "try it and see" basis. In this option, the code in R2PROTOS.C will try to write to a UNIX message queue which is, in turn, written to the trace file. UNIX handles contention among the different processes.

When this option is used, r2testq and r2endq are substituted for r2test and r2end. If you have not compiled these two programs, do so using, for example, gcc:

gcc -o r2testq -DSYS_UNIX r2testq.c
gcc -o r2endq -DSYS_UNIX r2endq.c

You also need to ensure that the resulting executables are in your execution path.

To use this option, when you compile r2protos.c (and your program), define a macro named MESSAGE_Q. A typical command line to compile your multi-process program named "user" from a file named "user.c" might be:

gcc -o user -DMESSAGE_Q -DSYS_UNIX user.c r2protos.c

Before running each test, run the program r2testq as a background process instead of using r2test as described earlier. r2testq creates the message queue, "listens" for trace records coming from the queue, and writes them to the trace file. Wait about 5 seconds to give UNIX time to create the message queue before starting the test. After the test is complete, run r2endq instead of r2end. r2endq tells r2testq to shut down the queue and close the trace file. Again, it is best to wait a few seconds before doing another test to be sure the queue has shut down.

The complete sequence for one test could thus be similar to the following example:

r2testq -t "trace" -c "test of functionality A" &
{wait 5 seconds}
user {any input or output for your program named "user"}
r2endq
{wait 5 seconds}

Note the "&" on the r2testq command to be sure that r2testq is a background process.


6. RECON2 ENHANCEMENTS

March 1997 - RECON2 is being modified to provide software reconnaissance on Ada programs. This work is being completed by the Pensacola Software Engineering project team.


7. RECON2 COMMANDS AND FILES

7.1 RECON2 Commands

R2ANALYZ - This is the RECON2 program that analyzes your trace files to locate where a feature is implemented. The default is to output the annotated source listing, print limited progress information, and display error messages to your screen. The options are not case sensitive. There must be a space between an option switch that takes an argument (-r or -p) and its argument (e.g. <list file>).


    r2analyz -r <list file> -p <output directory> [ ? | -h |-q | -v ] [-s]
where
        -r = List filename location
        -p = output directory to place analysis
     ?, -h = Brief listing of the options like this list
        -q = No terminal output, not even errors
        -v = Very verbose mode
        -s = Output is the sets each branch is in instead 
             of the annotated source listing

If the "-s" option is used, the analysis program writes a set report to standard output with a list of the executed branches in all files of the source program. Each branch is tagged with the sets it belongs to in the "functionality view" of the software. This option is intended primarily for statistical analysis of the software system. See Section 7.2 for more information on the Set Output Report.

R2END - This program deletes the r2tmp.dat file.

R2INST - This program instruments the target program source code files inserting statements that write to the trace file as each branch is executed. The switch options are not case sensitive.


     r2inst source dest [-V (or -v)[level]] [-E (or -e)]

where
        source - C/C++ language source file to be instrumented.

        dest - destination file for the instrumented source.

        -V or -v (verbose) - write progress messages to the screen.
          level - {'A'|'F'|'S'|'T'} each level adds info to previous level:
             A = Admin, writes only high level messages (file names, etc).
             F = Function names.
             S = Statements instrumented.
             T = Tokens recognized.

        -E or -e (entry) - instrument function entry points.
         

R2PROTOS.C - r2protos.c is a C file that must be compiled with the user's target program. It contains instrumentation routines required by the instrumented source code.

R2TEST - This program is run before each test case. It places the trace file path name and the trace comment in a temporary file named r2tmp.dat in the current working directory. When the instrumented target program executes, it reads this file and uses its contents to set the trace file pathname. If it cannot find r2tmp.dat, it will name the file unknown.r2t. The option switches are not case sensitive and there must be a space between the option switch and the option value.


    r2test -T/-t <trace file name> [-C/-c "trace comment"] 
where
        -T/-t  - use a specific trace file
              <trace file name> - valid file name (We recommend
                 using a ".r2t" extension)
        -C/-c  - Insert a comment
              "trace comment" - A comment, enclosed in double quotes,
                 that is inserted as the first line of the trace file.

The optional comment will become the first line of the trace file to remind you which test case produced the file.

7.2 RECON2 Files

This section gives the format of all of the RECON2 files, including those that the user needs to create. All RECON2 files are ASCII text files (i.e. each line ends with a carriage return/line feed combination in MS-DOS or a line feed in UNIX).

Annotated Listing Outputs - These files are ASCII formatted sequential files with .out extensions. They consist of the original source code (before instrumentation) with analysis information records inserted where branches implementing the feature may be located. These records are inserted directly before the point at which the original code was instrumented (i.e. before the if statements, etc.). The analysis information records will have the following format:


Cols. 1 - 5

These characters delineate the analysis information record. They are specified by the user and are found in record 2 of the list file.

Col. 6

Blank space for readability

Cols. 7 - 8

Branch probability (in percentage) - this field is left blank when using deterministic method of analysis

Cols. 9 - end

Branch Value (T, F, TF, or ordinal value for switch). Additional ordinal value - this field is repeated as many times as needed. A space shall separate each value.

The annotated listing output files will be in the analysis output directory specified by the -p variable when running r2analyz. The annotated listings will be named the same as the original source files except with a .out extension.

An example of an annotated listing output file is shown in the document RECON2 - OVERVIEW.

List File - Analysis phase specification - The list file contains the information necessary to control the analysis phase. The analysis program, r2analyz, reads the test case trace files named in the list file and keeps two counts for each branch that is mentioned in any file: a count of the number of test cases "with" the feature that executed the branch and a count of the number of test cases "total" that executed the branch.

You may select either a "deterministic" or a "probabilistic" analysis. In a deterministic analysis, RECON2 will tell you about branches that are only executed when the feature is present and never otherwise. They are likely to be involved in implementing the feature in some way.

In a probabilistic analysis, you set a probabilistic threshold given as a percentage. RECON2 will tell you about branches for which:

threshold < (test cases "with") x 100 / (test cases "total")

This formula can be interpreted as the conditional probability that a test case exhibits the feature, given that it executed the branch.
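The two selection rules can be sketched in a few lines of C. The function names are hypothetical, and the cross-multiplied integer comparison is an assumption chosen to avoid truncation in integer division; r2analyz may compute the ratio differently.

```c
#include <assert.h>

/* Probabilistic rule: returns 1 if
       threshold < with * 100 / total
   and 0 otherwise. Both sides are multiplied by total so that no
   integer division is needed (an assumption of this sketch). */
int branch_is_reported(int with, int total, int threshold)
{
    assert(total > 0 && with >= 0 && with <= total);
    return threshold * total < with * 100;
}

/* Deterministic rule: the branch was executed only by test cases
   that exhibit the feature. */
int branch_is_deterministic(int with, int total)
{
    return total > 0 && with == total;
}
```

For example, a branch executed by four test cases, three of which exhibit the feature, has a ratio of 75 percent. It is reported with a threshold of 70 but not with a threshold of 95, and it is never reported by a deterministic analysis.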


Record 1

In column 1, an uppercase letter to specify the type of analysis, either "P" for probabilistic or "D" for deterministic.

Record 2

In columns 1 to 5, up to five characters that will be used to mark the branches of interest in the annotated listing output.

Record 3

In columns 1 to 3, the probabilistic threshold, expressed as an integer (1 to 100). This record needs to be present, but the value is disregarded if deterministic analysis is selected.

Subsequent Records

The remaining records are in pairs, one pair for each test case that should be analyzed. The first record of each pair contains the full path name of the test case trace file. The second record of each pair contains, in column 1, an uppercase "Y" if the test case exhibits the feature or an uppercase "N" if it does not.

An example list file could be multiply.lst that contains:

    D
    >>>>>
    95
    /usr/cs/<usrname>/proj/out/add.r2t
    N
    /usr/cs/<usrname>/proj/out/divide.r2t
    N
    /usr/cs/<usrname>/proj/out/subtract.r2t
    N
    /usr/cs/<usrname>/proj/out/multiply.r2t
    Y
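A reader for this layout could look like the following sketch. It assumes the records have already been read into an array of strings, one per record, with line endings removed. The struct r2_list type, the R2_MAX_CASES limit, and the function name r2_parse_list are hypothetical and do not come from the RECON2 sources.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define R2_MAX_CASES 32                /* arbitrary limit for this sketch */

struct r2_list {
    char mode;                         /* 'D' or 'P' (record 1)         */
    char marker[6];                    /* up to five chars (record 2)   */
    int threshold;                     /* 1 to 100 (record 3)           */
    int ncases;                        /* number of test case pairs     */
    const char *trace[R2_MAX_CASES];   /* trace file path of each case  */
    int has_feature[R2_MAX_CASES];     /* 1 for "Y", 0 for "N"          */
};

/* Fills *out from the records of a list file. Returns 0 on success and
   -1 if the record count or any flag is malformed. */
int r2_parse_list(const char *const *rec, int nrec, struct r2_list *out)
{
    int i;

    assert(rec != NULL && out != NULL);
    if (nrec < 5 || (nrec - 3) % 2 != 0)
        return -1;                     /* need 3 records plus whole pairs */
    if (rec[0][0] != 'D' && rec[0][0] != 'P')
        return -1;                     /* case matters */
    out->mode = rec[0][0];
    strncpy(out->marker, rec[1], 5);
    out->marker[5] = '\0';
    out->threshold = atoi(rec[2]);
    out->ncases = 0;
    for (i = 3; i + 1 < nrec && out->ncases < R2_MAX_CASES; i += 2) {
        char flag = rec[i + 1][0];
        if (flag != 'Y' && flag != 'N')
            return -1;
        out->trace[out->ncases] = rec[i];
        out->has_feature[out->ncases] = (flag == 'Y');
        out->ncases++;
    }
    return 0;
}
```

The threshold is stored even for a deterministic analysis, mirroring the rule that record 3 must be present although its value is then disregarded.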

Temporary Data File - File r2tmp.dat contains the trace file path name and the trace comment as a result of running R2TEST.

Set Output Report - This file is produced by the analysis phase when the "-s" option is specified when executing r2analyz. There is one line for each branch in the source program that was executed in any test case. The line shows the sets containing the branch in the functionality view of software, described in the report Augmenting Program Understanding Strategies with Test Case Based Methods, by M. Scully, SERC-TR-68-F, Software Engineering Research Center, University of Florida, Gainesville, Florida 32611, July 1993.

Each output line has the following blank separated fields:

- source file index (3 digits, with leading zeros)

- line number of the if, switch, while, etc. in the source file that created the branch. (5 digit integer with leading zeros)

- type of condition, which is "T" or "F" for if and while statements and "S" for switch statements

- if the type of condition field was "S", then the next field is the switch value (sign plus 9 digit integer, with leading zeros)

- set fields. There will be one field for each set the branch is in, coded as follows:

"#IC#" - is in ICOMPS(f)
"#II#" - is in IICOMPS(f)
"#CC#" - is in CCOMPS
"#RC#" - is in RCOMPS(f)
"#SH#" - is in SHARED(f)
"#UC#" - is in UCOMPS(f)

Three typical output lines could be:

003 00015 T #IC# #II# #CC#
003 00042 S -000000001 #IC# #II# #UC#
002 00123 F

Note that this format is designed to be easily sorted. Also the branches in any particular set can be extracted using grep.
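The fixed-width fields map directly onto printf conversions, and a set test reduces to a substring search, much as grep would do. The sketch below illustrates both under those assumptions; the function names r2_format_set_line and r2_line_in_set are hypothetical and are not part of RECON2.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Formats the leading fields of one set report line into buf.
   kind is 'T', 'F' or 'S'; switch_value is used only when kind is 'S'.
   Returns the value of snprintf. */
int r2_format_set_line(char *buf, size_t size, int file_index, int line,
                       char kind, int switch_value)
{
    assert(buf != NULL);
    if (kind == 'S')                   /* sign plus 9 digits, zero padded */
        return snprintf(buf, size, "%03d %05d S %+010d",
                        file_index, line, switch_value);
    return snprintf(buf, size, "%03d %05d %c", file_index, line, kind);
}

/* Returns 1 if the report line carries the given set tag, e.g. "#UC#". */
int r2_line_in_set(const char *report_line, const char *tag)
{
    return strstr(report_line, tag) != NULL;
}
```

Zero-padded fields of constant width are what make the report sortable as plain text: lines compare in file and line order without any numeric parsing.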

Trace file - This file is produced at run time when the instrumented code is executed with a specific test case. The name of this file is determined by the user on the command line when r2test is run. The following is a description of the format of this file.

Record 1

Contains the trace comment specified by the user when r2test was run just before test case execution. (If no comment was specified, the record contains "#NO COMMENT.")

Remaining records

One record for each time a branch is executed. (If the user specified the Minimum Trace option during compilation, duplicate records will be eliminated.) Each record contains space separated fields with the first field containing a T for true, F for false, E for function entry, S for switch or a # for comment. When a # is used in the first field, the remaining inputs on the line are ignored. If a T, F or S is in the first field, the next field contains the line number for the control statement ("if", "while", "switch", etc.) that created the branch. For a switch, the next field contains the value of the switch. Then for the true, false or switch the next field is the string length of the original source code file name. The last field contains the original source code file name. An example of this is shown in section 3.6.
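A reader for these records might be sketched as follows. The struct r2_record type and the function r2_parse_record are hypothetical. The sketch assumes that each record starts in column 1, that file names contain no spaces (a fuller parser would use the length field instead), and that function entry records have the same layout as true and false records, as the example in Section 3.6 suggests.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct r2_record {
    char kind;          /* 'T', 'F', 'E', 'S' or '#'            */
    int line;           /* line number of the control statement */
    int switch_value;   /* meaningful only when kind is 'S'     */
    int name_len;       /* string length of the file name       */
    char file[256];     /* original source file name            */
};

/* Parses one trace record. Returns 0 on success, -1 if malformed.
   For '#' records only kind is set, since the rest is ignored. */
int r2_parse_record(const char *text, struct r2_record *rec)
{
    assert(text != NULL && rec != NULL);
    memset(rec, 0, sizeof *rec);
    if (text[0] == '#') {
        rec->kind = '#';
        return 0;
    }
    if (text[0] == 'S') {              /* switch: extra value field */
        if (sscanf(text, "S %d %d %d %255s", &rec->line,
                   &rec->switch_value, &rec->name_len, rec->file) != 4)
            return -1;
        rec->kind = 'S';
        return 0;
    }
    if (sscanf(text, "%c %d %d %255s", &rec->kind, &rec->line,
               &rec->name_len, rec->file) != 4)
        return -1;
    if (rec->kind != 'T' && rec->kind != 'F' && rec->kind != 'E')
        return -1;
    return 0;
}
```

Checking the first character before calling sscanf keeps the switch layout, which has one more numeric field, from being confused with the other record types.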

Default Trace File - This file is produced if the instrumented executable cannot find r2tmp.dat. The default trace file name is unknown.r2t.

8. GLOSSARY

Deterministic analysis - This option tells the user the branches that are only executed when the feature is used and never otherwise. They are the most likely to be involved in implementing the feature.

Functionality - A feature of a software system that may be used or not used as determined by the user.

Instrumentation - C language constructs added to the original source code that will provide information to RECON2 as to which components were executed during a test case execution.

Minimum trace - As each decision in the instrumented source program is executed, r2protos.c stores the information in memory. When the program terminates, one line is written to the trace file for each branch (true or false or value of the switch variable) of each decision that has been executed. This option greatly reduces the size of the trace file and may also execute faster. It does require more memory and the user must ensure that the program terminates by calling "exit()" somewhere in the instrumented code. The exit() statement is instrumented to make sure that the trace data are written to the trace file.

Normal trace - As each decision in the instrumented source program is executed, one line is written to the trace file. This option can create a very large trace file (but requires little memory) since each decision will produce one line for each time it is executed. This is the default option.

Probabilistic analysis - The user specifies a probabilistic threshold given as a percentage. The formula given in Section 7.2 can be interpreted as the conditional probability that a test case exhibits the feature, given that it executed the branch. Branches that tend to appear in test cases that exhibit a feature can be viewed as "good indicators" of the feature.

Target program - The program that the user wants to investigate using RECON2. RECON2 helps locate where features of the user's target program are located.


End of Document