ERNI Technology Post No. 47: Generating test data for communication protocols and other languages

Situation & problem

How do you efficiently test a program’s communication behavior?

Communication is, next to computation, one of the most important tasks of today’s computer systems, be this computer-supported communication between humans (e.g., e-mail, chat); communication between machines or programs only (e.g., automated clearing house transactions, API or service calls); or between a user and a computer (e.g., graphical user interfaces).

Language. In order to be understood, parties involved in a communication must use a language that is known to all. Language, in a very broad sense, is the definition of a regular structure of both message sequences (protocols) and message content. In order for a language to be tractable, this regularity is paramount. However, among both humans and computers, the memory necessary  to store the program used to understand the language in question is not unlimited

Figure 1: A message is only valid if it is an element of the set defined by the language rules.

This last implication can also be applied in the other direction. Any given program interacting with its environment must apply a language to this communication. This language may have been explicitly stated before implementation or may be implicitly defined by the implementation, but it will be observable nonetheless.

So the question becomes, can we make use of knowledge of the underlying language to test a program’s communication behaviour more efficiently?

Some background on language processing techniques

Parsers. There are many well-defined computer languages, e.g. programming languages, data languages (such as XML), or Internet protocols with their accompanying message structure definitions. Programs that process messages expressed in such languages typically employ a parser to analyse an input message, verify its adherence to the structural rules of the language, and recover that structure from the input. Here is an example.

Sample of subject language – C#

if (a < 9)
       x = 2; 

A C# parser might translate the given input into the following data structure:

Figure 2: Parse tree of the sample above.

Grammars. The definition of a language generally encompasses both structure (syntax) as well as meaning (semantics). As with human languages, the definition of the syntax of a language is called a grammar.

Sample of informal grammar of subject language – C#

An if statement consists of the keyword “if”, followed by an expression (see there) enclosed in parentheses “(”, “)”, a statement (see there), and optionally the keyword “else” followed by another statement.

A statement may be one of the following:

  • An if statement
  • A while statement
  • An assignment; no further nesting, therefore followed by a semicolon “;” as a separator

A block – zero or more statements enclosed in braces “{”, “}”

Although the structure of a language may be arbitrarily complex, and its grammar may grow accordingly in length, the grammar will consist of only a very limited number of different types of elements, as can be seen in the example:

  • A grammar is made up of individual, named (“statement”) rules, which in turn indicate choices (“one of”) between alternatives and sequences (“followed by”) of items
  • Items can be marked as optional and/or as repeating (“zero/one or more”)
  • Items can be simple and given directly (“if”, a keyword and, as such, a fixed terminal "identifier", a variable terminal) or complex and referred to (“statement”, a nonterminal)

This simplicity of structure of a grammar allows us to capture grammars more formally. This is accomplished by a type of language not yet mentioned above—that of language definition languages, or grammar languages.

Sample of formal grammar of subject language – C#

if_statement := "if"  "("  expression ")"  statement  [  "else"  statement  ]  ;
Square brackets “[“, “]” indicate that “else” part is optional. 

statement := if_statement

|  while_statement
|  assignment  ";"
|  …
|  "{"  {  statement  }  "}"  ;

Vertical bar “|” separates alternatives.

Other statement types omitted.
Braces “{“, “}” around “statement” indicate repetition.

The grammar of grammar languages. The very moment we say “grammar language”, there must also be a grammar for that language: This is a meta grammar or grammar grammar. This may sound arcane, but we will need this later on.

Sample of meta grammar 

rule := symbol  ":="  alternatives ";"  ;
Rule terminates in a semicolon. 

alternatives := sequence  {  "|"  sequence  }  ;
One or more "sequences", separated by a vertical bar

sequence := item  {  item  }  ;
One or more "items", not explicitly separated (but implicitly by spaces)

item :=  symbol [  "*"  |  "+"  |  "?"  ]

|   "["  symbol  "]"
|   "{"  symbol  "}"  ;

A symbol, optionally followed by a repetition or option mark,

or a symbol within "optional" brackets,
or a symbol within "repetition" braces.

A parser generator is a program that reads a grammar of a subject language (say, C#)—given in a meta grammar language (say, the one shown above)—and outputs code that implements a parser for that subject language [2, 3]. This parser can then be used for processing texts given in the subject language, e.g., as front end of a C# compiler. Figure 3 illustrates this.

Figure 3: Compiler with parser front end generated from grammar of subject language.

Problem solution

Now we have all the necessary pieces in place. What follows is the construction of a generator for test data that adheres to a given grammar.

The first step is to feed a parser generator with its own meta grammar. This will output a parser that can read grammars. See the top of figure 4, below. (This is exactly like the parser front end of the parser generator that reads grammar files. If the parser generator is available as source code, its front end can be used directly.)

In a second step, feed the generated grammar parser with the grammar of the language for which to generate test data. This will output an in-memory tree representation of that grammar. This is shown on the lower left of figure 4. The parse tree will contain all the information from the grammar—how structures are nested, which elements are optional or can be repeated, which alternatives exist at what point, etc.

Figure 4: Generate parse tree of subject grammar, then output test data according to that grammar.

In a third step, write code (lower right of figure 4) that generates "text" [4] for the subject grammar so that it

  • receives the subject grammar's parse tree as input
  • performs a "depth-first pre-order" traversal ([1], see figure 5 below) on the part of the tree representing the top-most rule of the grammar
  • at each node in the tree
    • when marked as optional, decides if the element should be included or not
    • when marked as repeatable, decides how often the element should be repeated
    • when marked as an alternative between elements, decides which element should be chosen
    • when representing a terminal symbol of the subject language, outputs an instance of the terminal symbol
    • when representing a nonterminal symbol of the subject language, performs a traversal of the tree representing the rule for that nonterminal before continuing from the nonterminal

Figure 5: Depth-first pre-order tree traversal.

Figure 5 illustrates the depth-first pre-order tree traversal. Nodes in the tree are visited in the order given by the arrows and dots on the curved line, starting at the top.

  • Among others, the node labelled “statement” represents a nonterminal of the target language. Here, before continuing to the next node on the same level, a complete traversal is performed of the parse tree of the “statement” rule—including making decisions at all decision points there, as well as traversing parse trees of other rules found there. This is indicated by the dotted line.
  • The closing brace is a decision point; if the decision taken is to repeat the items between the braces again, the traversal repeats from the corresponding opening brace. This is indicated by the dashed line.

Repeating the traversal at each decision point with different decisions made at each repetition will enumerate all the texts that can be expressed in the subject language. Since these are generally infinitely many, provisions must be made to restrict this to a manageable number.

The above generates the positive cases only. Negative cases can be created:

  • by introducing additional decision points and purposefully deviating from the definitions given by the subject grammar at these points:
    • skipping a non-optional element
    • repeating an element where not allowed
    • replacing an element by some other element
    • outputting text that is not an instance of the current terminal symbol
  • by altering the output text after it has been generated.



Parser generators may not be widely used, but are a mature and stable technology. It takes some effort to learn to use this technology, so there certainly is a minimum complexity required in the language before an investment in this approach reaches its break-even point. On the other hand, formalising the grammar or using a given formal grammar helps to better understand the language and therefore arrive at better tests. Once that is achieved, automating test data generation is the logical next step.

Employing a parser generator makes sense especially if the grammar—and therefore the parser code—is changing a lot, e.g., if the grammar is not known and is being discovered in an exploratory manner, or if it is under active development.

The above solution is independent of the subject language. As long as a grammar can be provided, it is possible to generate test data for any language. Applicability of this approach to testing of protocols and APIs is not as obvious, but it works equally well by modelling the allowed flows of messages or API calls as a set of grammar rules. This grammar may be extended by rules that describe the content of the message or the values of the parameters passed, respectively. Depending on the system under test, you may choose to model replies and call-backs as well.

The generated test data can be used for different types of tests, e.g., functional tests and performance or stress tests, as well as on different test levels, from unit tests to acceptance tests.

The test data—because it was generated from a grammar—will, a priori, be good only to detect problems with handling of language structure, and not with semantic problems. This can be remedied by providing an appropriate selection of values for variable terminals. Even if checking for semantic problems is the only goal of testing, the generator is still good for producing the bulk of the message or text frame around the intended test values. 






[4] Chomsky, Noam. 1965. Aspects of the theory of syntax. Cambridge, Massachusetts: MIT Press.

posted on 16.06.2014
by: Oliver Gramberg