This software was created as a project by students in the Software Engineering program at the University of West Florida.
In accepting and using RECON2, you agree to the following disclaimer of warranty:
THIS SOFTWARE AND MANUAL ARE PROVIDED "AS IS" AND WITHOUT WARRANTIES AS TO PERFORMANCE OR MERCHANTABILITY OR ANY OTHER WARRANTIES WHETHER EXPRESSED OR IMPLIED. BECAUSE OF THE VARIOUS HARDWARE AND SOFTWARE ENVIRONMENTS INTO WHICH THIS PROGRAM MAY BE PUT, NO WARRANTY OF FITNESS FOR A PARTICULAR PURPOSE IS OFFERED.
GOOD COMPUTER PROCEDURE DICTATES THAT ANY PROGRAM BE THOROUGHLY TESTED WITH NON-CRITICAL DATA BEFORE RELYING ON IT. THE USER MUST ASSUME THE ENTIRE RISK OF USING THE PROGRAM. ANY LIABILITY OF THE PROVIDER WILL BE LIMITED EXCLUSIVELY TO PRODUCT REPLACEMENT.
Some states do not allow the exclusion of the limit of liability for consequential damages, so the above limitation may not apply to you.
This agreement shall be governed by the laws of the State of Florida and shall inure to the benefit of the Wilde Bunch, the University of West Florida, and any successors, administrators, heirs, and assigns. Any action or proceeding brought by either party against the other arising out of or related to this agreement shall be brought only in a STATE or FEDERAL COURT of competent jurisdiction located in Okaloosa or Escambia County, Florida. The parties hereby consent to in personam jurisdiction of said courts.
2.1 Purpose
RECON2 is a tool to help software engineers locate the parts of a large target program that implement a particular program feature. First, RECON2 makes an instrumented copy of the user's target program source code with output statements on each branch structure. Then, the software engineer compiles and runs the instrumented code using some test cases that use the feature and others that do not. RECON2 identifies which branches in the program were executed most frequently when the feature was being used. These branches provide good starting points for locating the feature when reading the program's code.
If you have not already done so, read the document RECON - OVERVIEW for a quick introduction to the steps in using RECON2. For information on how to install and build RECON on your system, refer to the "readme.txt" file that is included in the distribution file.
To use RECON2, you need a C/C++ compiler. It is assumed that you have a broad knowledge of C programming and are familiar with compilers, editors, and testing. You must know how to recompile and relink the target program. You must also understand the purpose of the target program well enough to write test cases that exhibit the program feature you want to locate.
Please report all documentation and software problems to:
Let us know if you have any suggestions for improvement to RECON2.
Normally, RECON2 does not modify the original source code files, but if the user specifies that the instrumented source path is the same as the original source path, the original source file will be overwritten by the instrumentation program.
RECON2 may substantially slow your software since it keeps track of each time a branch is executed. If performance is important, keep test cases short or avoid instrumenting files that contain the innermost loops of your program. Consider also using the Minimum Trace option described in the Glossary. There may be a modest size increase in your compiled code.
3.1 About the Tutorial Program
All the examples in this document are based on a simple Reverse Polish Notation (RPN) Calculator program adapted from Kernighan and Ritchie, The C Programming Language (2nd edition). The program can be considered to have four features: addition, subtraction, multiplication and division. We will show how the branches for the multiplication feature are located. The four files that comprise the calculator program are:
In reverse Polish notation, each operator follows its set of operands; an infix expression like:
Each test case is terminated by Q
To find the unique code where the multiplication operation is executed, first define a set of test cases, some of which use the multiplication ("*") operator, some of which do not. For the example, we have selected the test cases in Table 1. Note that there is one test case that exhibits the multiplication feature and three that do not.
| test case | description | exhibits feature? |
|---|---|---|
| 6 3 * Q | multiplication 6*3=18 | Y |
| 6 3 / Q | division 6/3=2 | N |
| 6 3 + Q | addition 6+3=9 | N |
| 6 3 - Q | subtraction 6-3=3 | N |
Table 1. Test Cases for the RPN Calculator
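To make the expected results in Table 1 concrete, the following is a minimal sketch of a stack-based RPN evaluator. It is not the tutorial's calculator program: the function name rpn_eval, the stack limit, and the error convention are assumptions made for illustration only.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

#define RPN_STACK_MAX 32   /* illustrative limit, not taken from the tutorial */

/* Evaluate a space-separated RPN expression such as "6 3 * Q".
 * Returns 0 and stores the value in *result on success, or -1 on
 * malformed input (stack underflow or overflow, unknown token,
 * division by zero, or operands left over at the end). */
int rpn_eval(const char *expr, double *result)
{
    double stack[RPN_STACK_MAX];
    int depth = 0;
    const char *p = expr;

    assert(expr != NULL && result != NULL);

    while (*p != '\0') {
        if (isspace((unsigned char)*p)) {
            p++;
        } else if (isdigit((unsigned char)*p)) {
            char *end;
            if (depth == RPN_STACK_MAX)
                return -1;
            stack[depth++] = strtod(p, &end);   /* push the operand */
            p = end;
        } else if (*p == 'Q') {
            break;                              /* Q terminates the test case */
        } else {
            double rhs, lhs;
            if (depth < 2)
                return -1;
            rhs = stack[--depth];               /* operator follows its operands */
            lhs = stack[--depth];
            switch (*p) {
            case '+': stack[depth++] = lhs + rhs; break;
            case '-': stack[depth++] = lhs - rhs; break;
            case '*': stack[depth++] = lhs * rhs; break;
            case '/':
                if (rhs == 0.0)
                    return -1;
                stack[depth++] = lhs / rhs;
                break;
            default:
                return -1;
            }
            p++;
        }
    }
    if (depth != 1)
        return -1;
    *result = stack[0];
    return 0;
}
```

Each operator is handled by its own case of the switch statement, which is the kind of branch that a feature-location tool would be expected to single out for the multiplication test case.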
3.2 Copying and Compiling the RECON2 Source Files
Copy and compile the RECON2 source files to a directory as described in the "readme.txt" file that is included in the distribution file. For the purpose of this document we will use this path: /usr/cs/<usrname>/<RPN source file dir> . Substitute a directory name of your choice in its place.
The tutorial examples are shown for the UNIX operating systems. For the MS DOS operating system, simply add the drive letter and change the " / " to " \ ".
Create a directory to hold the instrumented source files:
Also create a directory to hold the output trace files and analysis:
3.4 Instrumenting the Source Code
Introduction
Your "C/C++" source files will be "instrumented" so
that a trace file will be produced every time your program is executed.
Instrumentation involves copying your source
files and inserting statements that will write to the trace file when a
branch is executed.
You may not want to instrument some files of your program, either because you are sure the feature you are looking for is not located in certain files, or because some files contain loops that will produce a large trace file.
Processing
At the command prompt (DOS or UNIX shell), run r2inst, giving the full path name of the file
to be instrumented and the location in which to place the instrumented
file. It is best to provide the full directory path name of the source
files. If a relative path is given as an input to r2inst, r2analyz will
fail to find the source file unless it is run from the same directory that was
used when running r2inst.
For the RPN tutorial, give the following instrumentation commands:
r2inst /usr/cs/<usrname>/<RPN source file dir>/inst/rpngetch.c
Hint: When using RECON2 on your own target program, you can create a DOS batch file or UNIX shell script containing the instrumentation commands. Commands to copy other needed files, such as headers or makefiles, could also be included. Then you can reinstrument easily by just running the batch file or shell script.
Output
The result of the instrumentation process
is a set of instrumented source code files. For example, an "if"
statement on line 5 would be changed from:
where R2srcfile_ptr points to the path of the preinstrumented source file.
The instrumented RPN Calculator files will be created and placed in the output directory.
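The general idea of branch instrumentation can be sketched as follows. This is not the code that r2inst actually generates: trace_branch, sign_instrumented, and the file name "example.c" are hypothetical names chosen for illustration, and the record layout only loosely follows the trace format described in Section 3.6.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a trace routine. It writes one record
 * (outcome T or F, line number, length of the file name, file name)
 * and returns the condition unchanged, so it can wrap the original
 * test expression without altering the program's logic. */
static int trace_branch(FILE *trace, int cond, int line, const char *srcfile)
{
    assert(trace != NULL && srcfile != NULL);
    fprintf(trace, "%c %d %lu %s\n", cond ? 'T' : 'F', line,
            (unsigned long)strlen(srcfile), srcfile);
    return cond;
}

/* Original code (imagined to be on line 5 of example.c):
 *     if (x > 0) return 1;
 *     return -1;
 * Instrumented sketch: only the condition is wrapped. */
int sign_instrumented(FILE *trace, int x)
{
    if (trace_branch(trace, x > 0, 5, "example.c"))
        return 1;
    return -1;
}
```

Because the wrapper returns the condition's value, the instrumented function behaves exactly like the original apart from the extra output, which is also why tracing slows the program down.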
3.5 Compiling the Instrumented Version
Introduction
After RECON2 instruments your source code, you will need to compile
the instrumented code. This may involve linking with libraries and non-instrumented
components.
Input
The instrumented source code files will be used as input along with
other source files as needed for your particular system.
Processing
The command you must give to compile will vary depending on the compiler
you use. Each compiler has a slightly different command line format. Some
compilers are part of a programming environment, and it may be easier to
use that environment to create a project file, and then let the environment
do the compile.
However your compiler works, here are some things to be careful of:
r2protos.c must be compiled and linked with
the instrumented source code. r2protos.c contains instrumentation
routines required by the instrumented source code.
r2.h, r2protos.h, and r2protos.c must be present in the instrumented
source code directory. For the RPN example, you would copy these files
to /usr/cs/<usrname>/<RPN source file dir>/inst. If instrumented files exist in multiple
directories, r2protos.h needs to be included in each of the directories.
r2.h file.
r2.h.
The following is an example of a compile command for the RPN example using the UNIX Gnu C/C++ compiler:
Output
Compilation produces an instrumented executable version of the target
program. For the RPN tutorial, an instrumented executable file called rpn
is produced.
3.6 Testing the Instrumented Version
Introduction
To locate a feature using the instrumented executable, run test cases
that exhibit the feature and others that do not. Usually only a few cases
are needed. For the RPN Calculator to locate the multiplication feature,
we suggest one case for each of the four features: multiply, add, subtract,
and divide. (See Table 1). The instrumented executable will create a trace
file from each case.
Input
RECON needs to know what name to give each trace file. To specify the
trace file name, and an optional comment about the test, run r2test before
each test run. r2test stores the trace file name and the comment in a temporary
file named r2tmp.dat where the instrumented executable can find them. After
the test is finished, run r2end to delete this temporary file. Ensure r2test
and r2end are in the execution path. (If the
instrumented executable does not find the r2tmp.dat file, it will name
the trace unknown.r2t. You can rename it after the test.)
Processing
Here is how to run the multiply test case in the RPN example:
r2test -t out/multiply.r2t -c "Testing multiplication, input is 6 3 *"
r2end
Repeat the process for add, subtract and divide by running r2test with -t out/add.r2t, -t out/subtract.r2t, and -t out/divide.r2t respectively, and executing rpn with the * replaced by +, -, and /.
Output
Output trace files named multiply.r2t, add.r2t, subtract.r2t, and divide.r2t
are created that record each branch that was encountered, the line number,
the switch value (for switch statements only), the string length of the
source file name, and the source file name. An example of the trace file
is shown below:
#Testing multiplication, input is 6 3 *
F 14 31 /usr/cs/<usrname>/inst/rpngetop.c
F 17 31 /usr/cs/<usrname>/inst/rpngetop.c
T 20 31 /usr/cs/<usrname>/inst/rpngetop.c
F 21 31 /usr/cs/<usrname>/inst/rpngetop.c
F 23 31 /usr/cs/<usrname>/inst/rpngetop.c
T 28 31 /usr/cs/<usrname>/inst/rpngetop.c
F 15 31 /usr/cs/<usrname>/inst/rpngetch.c
T 23 30 /usr/cs/<usrname>/inst/rpnmain.c
S 25 48 30 /usr/cs/<usrname>/inst/rpnmain.c
T 11 31 /usr/cs/<usrname>/inst/rpnstack.c
E 12 14 /usr/cs/<usrname>/inst/rpngetch.c
E 14 14 /usr/cs/<usrname>/inst/rpngetop.c
T 14 31 /usr/cs/<usrname>/inst/rpngetop.c
F 14 31 /usr/cs/<usrname>/inst/rpngetop.c
etc....
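If you want to post-process trace files yourself, the T, F, and S records shown above can be read with a few lines of C. The sketch below is an assumption-laden illustration, not part of RECON2: the struct and function names are invented, file names are assumed to contain no spaces, and E records and comment lines are simply rejected because their layout is not described here.

```c
#include <assert.h>
#include <stdio.h>

#define TRACE_PATH_MAX 256   /* illustrative limit on the file name length */

struct trace_record {
    char type;                  /* 'T', 'F', or 'S' */
    int  line;                  /* source line of the control statement */
    long switch_value;          /* meaningful only when type == 'S' */
    int  name_length;           /* recorded string length of the file name */
    char file[TRACE_PATH_MAX];  /* source file name */
};

/* Parse one trace line. Returns 0 on success, or -1 if the line is a
 * comment (starts with '#') or is not a T, F, or S record. */
int parse_trace_line(const char *text, struct trace_record *rec)
{
    assert(text != NULL && rec != NULL);

    rec->switch_value = 0;
    if (text[0] == 'S') {
        /* S records carry the switch value before the name length. */
        if (sscanf(text, "%c %d %ld %d %255s", &rec->type, &rec->line,
                   &rec->switch_value, &rec->name_length, rec->file) == 5)
            return 0;
        return -1;
    }
    if (text[0] == 'T' || text[0] == 'F') {
        if (sscanf(text, "%c %d %d %255s", &rec->type, &rec->line,
                   &rec->name_length, rec->file) == 4)
            return 0;
    }
    return -1;
}
```

For example, passing the line "S 25 48 30 /usr/cs/<usrname>/inst/rpnmain.c" fills in type 'S', line 25, and switch value 48.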
Introduction
For the analysis, you must create a list file that specifies which
of your test cases exhibited the feature you are looking for. RECON then
determines which branch statements seem to be most closely related to that
feature.
Input
The inputs are:
For the RPN tutorial, to locate the multiply feature, create a list file called MULTIPLY.LST with the following contents. Case is important for the "D", "Y", and "N".
Processing
To run the analysis program r2analyz, type the
command:
r2analyz -r /usr/cs/<usrname>/<RPN source file dir>/inst/out/multiply.lst -p /usr/cs/<usrname>/<RPN source file dir>/inst/out/
where -r specifies the list file to use and -p specifies the location
in which to place the output. For options, refer to the description of r2analyz
in Section 7.
Output
The default output of the analysis program is an annotated listing
file of the user's source code for each source code file that was instrumented
and met the conditions of the analysis. The annotated files have the same
names as the original source files except the extension has been changed
to .out. Each control statement that met the conditions of the
analysis is annotated with >>>>> and the values
that are associated with the control statement. (See RECON
- OVERVIEW for an example of the annotated listing output file.)
We have found that RECON2 users often want greater control over the
tracing process. The following additional calls to functions in r2protos.c can
be inserted by hand at key points in the user's program to control tracing:
int R2Suspend(); int R2Resume();
R2Suspend suspends writing to the trace file until R2Resume, the resume command, is given. Both return a 0 if no errors occur.
int R2Comment(char * msg_ptr);
R2Comment writes a comment message to the trace file. It returns a 0 if no errors occur.
int R2NewTrace(char *r2newfile);
R2NewTrace closes the current trace, if it is open, and opens the new
trace file r2newfile. The trace file path name can be any string acceptable
to the fopen function. It returns a 0 if no errors occur.
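As an illustration of how these calls might be placed by hand, the sketch below suspends tracing around a tight loop and resumes it afterwards. The routine sum_to is hypothetical, and the three function bodies are placeholder stand-ins so that the fragment compiles on its own; in a real build you would omit them and link r2protos.c instead.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder stand-ins, NOT the real implementations. They only mimic
 * the documented convention of returning 0 when no error occurs. */
static int tracing_on = 1;
int R2Suspend(void) { tracing_on = 0; return 0; }
int R2Resume(void)  { tracing_on = 1; return 0; }
int R2Comment(char *msg_ptr) { assert(msg_ptr != NULL); return 0; }

/* Hypothetical user routine with an inner loop that would otherwise
 * flood the trace file with one record per iteration. */
long sum_to(int n)
{
    long total = 0;
    int i;

    if (R2Suspend() != 0)        /* stop tracing before the inner loop */
        return -1;
    for (i = 1; i <= n; i++)
        total += i;
    if (R2Resume() != 0)         /* tracing continues from here */
        return -1;
    (void)R2Comment("inner loop finished");
    return total;
}
```

Checking the return values keeps a tracing failure from going unnoticed, and the comment marks in the trace where the untraced region ended.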
Multi-process programs are difficult to trace since the different processes may attempt to write to the same trace file and may produce corrupted output. We have provided an option that may work for multi-process systems under UNIX, but since there are many dialects of UNIX and many compilers, we only offer it on a "try it and see" basis. In this option, the code in R2PROTOS.C will try to write to a UNIX message queue which is, in turn, written to the trace file. UNIX handles contention among the different processes.
When this option is used, R2Testq and R2Endq are substituted for R2Test and R2End. If you have not compiled these two programs, then do so using, for example, gcc:
gcc -o r2testq -DSYS_UNIX r2testq.c
gcc -o r2endq -DSYS_UNIX r2endq.c
To use this option, when you compile r2protos.c (and your program),
define a macro named MESSAGE_Q. A typical command line to compile your
multi-process program named "user" from a file named "user.c"
might be:
gcc -o user -DMESSAGE_Q -DSYS_UNIX user.c r2protos.c
Before running each test, run the program r2testq as a background process
instead of using r2test as described earlier. r2testq creates the message
queue, "listens" for trace records coming from the queue and
writes them to the trace file. Wait about 5 seconds to give UNIX time to
create the message queue before starting the test. After the test is complete,
run r2endq instead of r2end. r2endq tells r2testq to shut down the queue
and close the trace file. Again, it is best to wait a few seconds before
doing another test to be sure the queue has shut down.
The complete sequence for one test could thus be similar to the following example:
r2testq -t "trace" -c "test of functionality A" &
{wait 5 seconds}
user {any input or output for your program named "user"}
r2endq
{wait 5 seconds}
Note the "&" on the r2testq command to be sure that r2testq is
a background process.
March 1997 - RECON2 is being modified to provide software reconnaissance on ADA programs. This work is being completed by the Pensacola Software Engineering project team.
7.1 RECON Commands
R2ANALYZ - This is the RECON2 program that analyzes your trace files to locate where a feature is implemented. The default is to output the annotated source listing, print limited progress information, and display error messages on your screen. The options are not case sensitive. There must be a space between an option switch that takes an argument (-r or -p) and its argument (e.g. <list file>).
r2analyz -r <list file> -p <output directory> [ ? | -h |-q | -v ] [-s]
where
-r = List filename location
-p = output directory to place analysis
?, -h = Brief listing of the options like this list
-q = No terminal output, not even errors
-v = Very verbose mode
-s = Output is the sets each branch is in instead
of the annotated source listing
If the "-s" option is used, the analysis program writes a set report to standard output with a list of the executed branches in all files of the source program. Each branch is tagged with the sets it belongs to in the "functionality view" of the software. This option is intended primarily for statistical analysis of the software system. See Section 7.2 for more information on the Set Output Report.
R2END - This program deletes the r2tmp.dat file.
R2INST - This program instruments the target program source code files inserting statements that write to the trace file as each branch is executed. The switch options are not case sensitive.
r2inst source dest [-V (or -v)[level]] [-E (or -e)]
where
source - C/C++ language source file to be instrumented.
dest - destination file for the instrumented source.
-V or -v (verbose) - write progress messages to the screen.
level - {'A'|'F'|'S'|'T'} each level adds info to previous level:
A = Admin, writes only high level messages (file names, etc).
F = Function names.
S = Statements instrumented.
T = Tokens recognized.
-E or -e (entry) - instrument function entry points.
R2PROTOS.C - r2protos.c is a C file that must be compiled with the user's target program.
It contains instrumentation routines required by the instrumented source
code.
R2TEST - This program is run before each test case. It places the trace file
path name and the trace comment in a temporary file named r2tmp.dat in
the current working directory. When the instrumented target program executes,
it reads this file and uses its contents to set the trace file pathname.
If it cannot find r2tmp.dat, it will name the file unknown.r2t.
The option switches are not case sensitive and there must be a space between the
option switch and the option value.
r2test -T/-t <trace file name> [-C/-c "trace comment"]
where
-T/-t - use a specific trace file
<trace file name> - valid file name (We recommend
using a ".r2t" extension)
-C/-c - Insert a comment
"trace comment" - A comment, including the quotation marks,
inserted as the first line of a trace file.
The optional comment will become the first line of the trace file to remind you which test case produced the file.
7.2 RECON2 Files
This section gives the format of all of the RECON2 files, including those that the user needs to create. All RECON2 files are ASCII text files (i.e. each line ends with a carriage return/line feed combination in MS-DOS or a line feed in UNIX).
Annotated Listing Outputs - These files are ASCII formatted sequential files with .out extensions. They consist of the original source code (before instrumentation) with analysis information records inserted where branches implementing the feature may be located. These records are inserted directly before the point at which the original code was instrumented (i.e. before the if statements, etc.). The analysis information records will have the following format:
Cols. 1 - 5
Col. 6
Cols. 7 - 8
Cols. 9 - end
The annotated listing output files will be in the analysis output directory
specified by the -p variable when running r2analyz. The annotated listings
will be named the same as the original source files except with a .out
extension.
An example of an annotated listing output file is shown in the document RECON2 - OVERVIEW.
List File - Analysis phase specification - The list file contains the information
necessary to control the analysis phase. The analysis program, r2analyz,
reads the test case trace files named in the list file and keeps two counts
for each branch that is mentioned in any file: a count of the number of
test cases "with" the feature that executed the branch and a
count of the number of test cases "total" that executed the branch.
You may select either a "deterministic" or a "probabilistic" analysis. In a deterministic analysis, RECON2 will tell you about branches that are only executed when the feature is present and never otherwise. They are likely to be involved in implementing the feature in some way.
In a probabilistic analysis, you set a probabilistic threshold given as a percentage. RECON will tell you about branches for which:
This formula can be interpreted as a conditional probability that the feature and the branch appear together.
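The two analysis modes can be expressed in terms of the "with" and "total" counts kept for each branch. The sketch below is one reading that is consistent with those definitions, not a statement of what r2analyz actually computes: it assumes that the probabilistic test compares with/total, taken as a percentage, against the threshold. The function names are invented for illustration.

```c
#include <assert.h>

/* Deterministic rule: the branch was executed by at least one test case
 * with the feature and by no test case without it, so every test case
 * that executed it is a "with" case. */
int reported_deterministic(int with, int total)
{
    assert(with >= 0 && total >= with);
    return with > 0 && with == total;
}

/* Probabilistic rule (assumed form): with/total, as a percentage, is at
 * least the threshold. Cross-multiplying in integer arithmetic avoids
 * floating-point rounding at the boundary. */
int reported_probabilistic(int with, int total, int threshold_percent)
{
    assert(with >= 0 && total >= with);
    if (total == 0)
        return 0;               /* the branch was never executed */
    return 100L * with >= (long)threshold_percent * total;
}
```

Under this reading, a branch executed by 19 test cases with the feature out of 20 that executed it meets a 95 percent threshold, while 3 out of 4 does not.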
Record 1
Record 2
Record 3
Subsequent Records
An example list file could be multiply.lst that contains:
D
>>>>>
95
/usr/cs/<usrname>/proj/out/add.r2t
N
/usr/cs/<usrname>/proj/out/divide.r2t
N
/usr/cs/<usrname>/proj/out/subtract.r2t
N
/usr/cs/<usrname>/proj/out/multiply.r2t
Y
Temporary Data File - File r2tmp.dat contains the trace file path name and the trace comment as
a result of running R2TEST.
Set Output Report - This file is produced by the analysis phase when the
"-s" option is specified when executing r2analyz. There is
one line for each branch in the source program that was executed in any
test case. The line shows the sets containing the branch in the functionality
view of software, described in the report Augmenting Program Understanding
Strategies with Test Case Based Methods, by M. Scully, SERC-TR-68-F,
Software Engineering Research Center, University of Florida, Gainesville,
Florida 32611, July 1993.
Each output line has the following blank separated fields:
- source file index (3 digits, with leading zeros)
- line number of the if, switch, while, etc. in the source file that created the branch. (5 digit integer with leading zeros)
- type of condition, which is "T" or "F" for if and while statements and "S" for switch statements
- if the type of condition field was "S", then the next field is the switch value (sign plus 9 digit integer, with leading zeros)
- set fields. There will be one field for each set the branch is in, coded as follows:
"#IC#" - is in ICOMPS(f)
"#II#" - is in IICOMPS(f)
"#CC#" - is in CCOMPS
"#RC#" - is in RCOMPS(f)
"#SH#" - is in SHARED(f)
"#UC#" - is in UCOMPS(f)
Three typical output lines could be:
003 00015 T #IC# #II# #CC#
003 00042 S -000000001 #IC# #II# #UC#
002 00123 F
Note that this format is designed to be easily sorted. Also the branches in any particular set can be extracted using grep.
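As an alternative to grep, a set report line can be examined from C. The following sketch assumes the blank-separated layout listed above; the struct and function names are illustrative and are not part of RECON2.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct set_line {
    int  file_index;     /* 3-digit source file index */
    int  line;           /* 5-digit line number */
    char type;           /* 'T', 'F', or 'S' */
    long switch_value;   /* set only when type == 'S' */
};

/* Parse the leading fields of one set report line.
 * Returns 0 on success and -1 if the line does not match the layout. */
int parse_set_line(const char *text, struct set_line *out)
{
    assert(text != NULL && out != NULL);

    out->switch_value = 0;
    /* %d reads the zero-padded fields as decimal numbers. */
    if (sscanf(text, "%d %d %c", &out->file_index, &out->line, &out->type) != 3)
        return -1;
    if (out->type == 'S') {
        /* Skip the three fields already read, then take the switch value. */
        if (sscanf(text, "%*d %*d %*c %ld", &out->switch_value) != 1)
            return -1;
    } else if (out->type != 'T' && out->type != 'F') {
        return -1;
    }
    return 0;
}

/* Report whether the line carries a given set tag such as "#IC#". */
int line_in_set(const char *text, const char *tag)
{
    assert(text != NULL && tag != NULL);
    return strstr(text, tag) != NULL;
}
```

Because each tag is delimited by "#" characters, a plain substring search is enough to test set membership, which is the same property that makes the report easy to filter with grep.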
Trace File - This file is produced at run time when the instrumented code
is executed with a specific test case. The name of this file is determined
by the user on the command line when r2test is run. The following is a
description of the format of this file.
Record 1
r2test was run
just before test case execution. (If no comment was specified, the record
contains "#NO COMMENT.") Remaining records
Default Trace File - This file is produced if the instrumented target program cannot find r2tmp.dat. The default
trace file name is unknown.r2t.
Deterministic analysis - This option tells the user the branches that are only executed when the feature is used and never otherwise. They are the most likely to be involved in implementing the feature.
Functionality - A feature of a software system that may be used or not used as determined by the user.
Instrumentation - C language constructs added to the original source code that will provide information to RECON as to which components were executed during a test case execution.
Minimum trace - As each decision in the
instrumented source program is executed, r2protos.c stores the information
in memory. When the program terminates, one line is written to the trace
file for each branch (true or false or value of the switch variable) of
each decision that has been executed. This option greatly reduces the size
of the trace file and may also execute faster. It does require more memory
and the user must ensure that the program terminates by calling "exit()"
somewhere in the instrumented code. The exit() statement is instrumented
to make sure that the trace data are written to the trace file.
Normal trace - As each decision in the instrumented source program is executed, one line is written to the trace file. This option can create a very large trace file (but requires little memory) since each decision will produce one line for each time it is executed. This is the default option.
Probabilistic analysis - The user specifies a probabilistic threshold given as a percentage. The formula can be interpreted as a conditional probability that the feature and the branch appear together. Branches that tend to appear in test cases that exhibit a feature can be viewed as "good indicators" of the feature.
Target program - The program that the user wants to investigate using RECON2. RECON2 helps locate where features of the user's target program are located.