Bits of Learning

Learning sometimes happens in big jumps, but mostly in little tiny steps. I share my baby steps of learning here, mostly on topics around programming, programming languages, software engineering, and computing in general. But occasionally, even on other disciplines of engineering or even science. I mostly learn through examples and doing. And this place is a logbook of my experiences in learning something. You may find several things interesting here: little cute snippets of (hopefully useful) code, a bit of backing theory, and a lot of gyan on how learning can be so much fun.

Wednesday, March 29, 2006

Good Interfaces Is Good for Doing Good Experiments

Perhaps it's a truth that I spend so much effort in giving a professional structure and interface to the prototypes I make simply because I love doing it that way. But, I have seen that it does yield some practical benefits too.

In the couple of days I have spent as much time introducing elements of minimal usability, like command options, and complete end-to-end execution with a single command.

Now when I am actually collecting the experimental data, I am able to do it almost completely automatically using a simple perl script to invoke the tool with the right command line inputs.

Of course, the process involved in designing an automation of an experiment is fairly complex and requires insight regarding the requirement. What are the figures we are are trying to measure? This question may have significant effect on the manner in which the system under test is designed.

This blog will incorporate some of my findings at a high level from designing the experiments for my method.

In short, I was to draw comparison between a specification based regression testing method which I call 'explicit state space enumeration' or ESSE method and another method called 'Legal Random Calls' or LRC method.

Sunday, March 26, 2006

Separate Parsers in The Same Application

It might often happen that you would like to have two parsers coexist in the same program. Here's an example of this situation in my current implementation of Modest -- the Model Based Testing tool.

There're the following two modules:

cg : This reads API specifications (written in a language, say, A) from a spec file. It then generates the GraphMaker code. This, when built and executed, will generate the state space graph (written in a language, say, B) of the given application.

pf : This reads the state space graph, again written in B, reads test specifications, written in a language, say, C, and computes the test sequences.

We observe that there are three languages to be recognised -- The API specification language A, the graph description language B, and the test specification language C. We need parsers for all three of them. Incidentally, it our case, B = C (in context-free grammatical sense). However, the data-structure into which they are read is different. Hence different parsers are anyway required. But the lexical analyser for both B and C is the same.

Say the lexical analyser for A, B and C are l(A), l(B) and l(C), and let the syntax analysers be p(A), p(B), and p(C) respectively. I used yacc (in fact bison) to write the specs for p(A). I hand-coded p(B) and p(C).

l(A) and l(B) were written in lex (in fact flex). And, as mentioned above, l(B) = l(C).

Initially cg and pf were developed separately. Hence, the parsers and the lexical analysers didn't interfere with each. However, when I tried integrating them into Modest, I ran into trouble due to the following:

1. Name Conflicts among globals
--------------------------------------------

When I did flex(Vocab (A)), it generated the lexical analyser function yylex(), which is global. Similarly when I did flex(Vocab(B)), it too generated a lexical analyser function yylex(). Both global, and hence, while linking, gave redefinition error.

Solution:
As mentioned above, the default name of the lexical analyser function generated by flex is yylex(). Similarly, the default name of the syntax analyser function generated by bison is yyparse(). Both these names can, however, be changed with the following.

Running flex as follows:

flex -Pprefixname inputfilename.flex

will generate lexical analyser function with the name prefixnamelex() instead of yylex().

Similarly running bison as follows:

bison --name-prefix prefixname inputfilename.yy

will generate the syntax analyser function with the name prefixnameparse() instead of yyparse(). Corresponding changes happen to many important tokens in the generated parser. For instance, the calls to yylex() in the generated code will all now be to prefixnamelex(). Hence, it is necessary to have the prefixname same for both the flex and the bison commands, so that the linker finds the prefixnamelex() function that the prefixnameparse() functions calls.

This solves the name conflict problems for the lexical analyser and syntax analyser functions for more than than one analysers in the same program. The name coflicts arising between other globals that you might have created can be easily resolved by encapsulating them into namespaces of those modules (I am assuming C++).

2. Name conflict between the input source file-pointer
--------------------------------------------------------------------------

The way to direct flex to generate a lexical analyser that reads from a file pointer of a particular name, say fin, is to embed the following preprocessor directive in the flex input file:

#undef YY_INPUT
#define YY_INPUT(buf,result,max_size) \
if ( (result = fread( (char*)buf, sizeof(char), max_size, fin)) <>
YY_FATAL_ERROR( "read() in flex scanner failed");

For example, for cg, the above was

#undef YY_INPUT
#define YY_INPUT(buf,result,max_size) \
if ( (result = fread( (char*)buf, sizeof(char), max_size, cgfin)) <>
YY_FATAL_ERROR( "read() in flex scanner failed");

And for pf, it was

#undef YY_INPUT
#define YY_INPUT(buf,result,max_size) \
if ( (result = fread( (char*)buf, sizeof(char), max_size, pffin)) <>
YY_FATAL_ERROR( "read() in flex scanner failed");

Of course, it's our responsibility that the lexical analyser finds this FILE * open when it tries to read from it.

Friday, March 17, 2006

Working with CVS

I am facing the versioning issues in proper sense for the first time. Or may be for the first time I trying to solve these problems in that fashion.

The testing directory in my CVS repository contains two tags: after-philips(branch) and with-arguments(main trunk). The after-philips tag is what I had at the end of the Philips Research internship of the last year. with-arguments contains the work that contains the addition that I did afterwards, mainly, the capability of cg to generate code from API specs. with functions taking arguments.

Now, the problem starts. In my funcoding directory, I have a testing directory. This contains the development I was doing in order to incorporate the pointers feature to the API language. This code was checked out from the with-arguments revision tag. Basically the objective of this development is to handle functions accepting pointer arguments. Multiple values are returned through these arguments, and they figure in the preconditions and postconditions of the API functions. That work ran into sticky implementation issues. I digressed from that to do this paper-writing work.

Now, I am supposed to create some results which will hopefully be incorporated in the paper. So, I can checkout yet another copy from with-arguments revision tag. However, I am anticipating that this will contain a series of checkins which I don't want to interfere with my with-pointer(no such tag actually exists in the repository. Here I am refering to it just for explanation sake) branch.

Solution: I created a branch icsm06-demo. And checked out a local copy from this branch. So my tinkering with this branch will keep my with-pointers work unaffected.

Currently, I am not able to foresee if some of this work will need to be merged with the with-pointers branch. We will see later!

Writing algorithms in latex

There're a number of ways in which algorithms can be written in a latex article.

Option 1
One way is to use the \verb command. That would give a type written look to the algorithm. May be OK to use that for small code snippets. But it is very inflexible. In total, it's not advised to use this method.

Option 2
Use listings package. It can be got from here. Please check. It usually comes prepackaged in latex installations. So, it may already be there on your machine. listings is very versatile, giving you the facility to include code snippets of many languages (C, C++, Pascal, pseudocode, HTML...). You can include source-code directly from an external file. You can also inline that. And you could add it right as a part of your regular text.

However, listings appears more appropriate only for inserting code snippets, and not algorithms. Well, I don't think there's any inherent limitation, since it seems to be a very stable package. But, option 3 seems more appropriate for algorithms.

Option 3
To write proper algorithms, one should use one or more of the above. My colleagues seem to prefer a combination of algorithm and algorithmic. Both come bundled in the same package that can be downloaded from here.

A usual way is to nest algorithmic inside algorithm. The latex code will look somewhat like:

\begin{algorithm}
\begin{algorithmic}
...
your algorithm
...
\end
{algorithmic}
\end{algorithm}

However, this seems to have a drawback from what I observed. algorithmic doesn't seem to have a way of having more than one procedure in a single algorithm. And also of invoking other procedures.

That problem gets mitigated if we replace algorithmic with algo. It has got function calls, and multiple procedures. I prefer algorithm, algo combination the best.

Please note that when you are using algo, you must exclude algorithmic. Using both packages together, as in:

\usepackage{algorithm}
\usepackage{algorithmic}
\usepackage{algo}

seems to have some problems. Notably, if you are using algo to write your algorithm, the indentations will disappear if the algorithmic package is used. Just comment out that above line:

%\usepackage{algorithmic}

However, using algo.sty has a severe problem. There seem to be many versions of it available on the web which unfortunately seem to have originated from completely different sources, and therefore are incompatible with each other. In fact, I have lost track of the source and accompanying documentation of the version I am currently using. There's one version available here. I am planning to shift to that next time on.