Home =>
Articles =>
Test First Challenge: Basic Formulas
Test First Challenge: Basic Formulas
Charlie Poole
January, 2002
Handling More Than One Cell
Part 1 was originally done with only one
string holding the content for all the cells. As expected, this broke on the
first test of Part 2 (testManyCells)and some refactoring was required.
A container of some kind was needed, and there seemed to be a choice to be
made about whether to structure it as a two-dimensional array or a simple map.
Since the array would need to be sparse, it seemed complicated. So I added
a standard map from cell name to value to the Sheet class.
At this point, I considered adding a cell object but didn't. I'm keeping in my
mind the idea that the map may eventually contain Cells rather than Strings. At
this point, the code looks like this (changes in bold).
class Sheet
{
public:
Sheet();
virtual ~Sheet();
String get( const String& cellName );
String getLiteral( const String& cellName );
void put( const String& cellName, const String& cellValue );
private:
std::map(String, String) m_content;
};
String Sheet::get( const String & )
{
String content = m_content[cellName];
if ( content.isNumeric() )
return content.trim();
else
return content;
}
String Sheet::getLiteral( const String& cellName )
{
return m_content[cellName];
}
void Sheet::put( const String & cellName, const String& cellValue )
{
m_content[cellName] = cellValue;
}
Adding a Cell Class
The second test (testFormulaSpec) only required adding a getFormula method with
an identical implementation to getLiteral() from the previous
section. However, the third test (testConstantFormula)
introduced a new behavior of get(). Looking ahead at the rest of the tests, I see
I'm going to be doing a some sort of parsing. For this test to pass, however,
I only need to drop the initial =. I decide to do a little refactoring
so that the function will be in the right place.
I feel driven toward a Cell object that knows how to interpret what's in it. I
refactor all the existing behavior of the Sheet that seems to belong to Cell.
This changes the Sheet class again, but only in its implementation. Now it delegates
everything except keeping track of cells to the Cell class. I like that better.
class Sheet
{
public:
Sheet();
virtual ~Sheet();
String get( const String& cellName );
String getLiteral( const String& cellName );
void put( const String& cellName, const String& cellValue );
private:
std::map(String, Cell) m_content;
};
String Sheet::get( const String & )
{
return m_content[cellName].get();
}
String Sheet::getLiteral( const String& cellName )
{
return m_content[cellName].getLiteral;
}
String Sheet::getFormula( const String& cellName )
{
return m_content[cellName].getLiteral;
}
void Sheet::put( const String & cellName, const String& cellValue )
{
m_content[cellName].put(cellValue);
}
The Cell class looks has code that's actually quite similar to what was in
Sheet for Part1. The main difference is
that the methods don't take the cell name as an argument. I decided that
at this point, the cell wouldn't know it's name. If I need it, I can add
it later.
I added a little function to see if the content of the cell is a formula and
another to evaluate the formula. They're only one line each but I think the
names add value to the code. I'm not entirely happy that isFormula is part
of Cell while isNumeric is in String. But it makes some degree of sense, so
I'm leaving it. There's still no parser, but I'm ready to add one and have
a place to put it.
class Cell
{
public:
Cell();
virtual ~Cell();
String get();
String getLiteral();
String getFormula();
void put( const String& cellValue );
private:
String m_content;
bool isFormula();
String evalFormula();
};
String Cell::get()
{
if ( m_content.isNumeric() )
return m_content.trim();
else
return m_content;
}
String Cell::getLiteral()
{
return m_content;
}
String Cell::getFormula()
{
return m_content;
}
void Cell::put( const String& cellValue)
{
m_content = cellValue;
}
bool Cell::isFormula()
{
return m_content[0] == '=';
}
String Cell::evalFormula()
{
return m_content.substr(1);
}
Simple Expression Parsing
So far, we have been faking it - not really parsing the formula at all.
Begining with testParentheses, it's no longer useful to avoid creating a
parser. But first, I need to create code to call the parser. A small
modification to Cell::evalFormula serves to do this.
String Cell::evalFormula()
{
if ( isFormula() )
{
FormulaParser parser( m_content );
int value;
if ( parser.eval( value ) )
return String::toString( value );
}
return String("#Error");
}
Now evalFormula is for real. The FormulaParser will evaluate the formula and
return an integer. Looking ahead a tiny bit, it seems handy to make the
translation from integer to string here. The integer value is returned using
a reference argument. A more general approach would be to use a getter so
that the type of the value might be determined dynamically. I could also
use a result object for the same purpose but I'm waiting to see what happens.
I also looked ahead to pick the error string to return here.
By the way, the static function String::toString is a template function
I added to my String class in order to keep things at a relatively high level
in this code. I stole the code from CppUnit.
Anyway, Now all we need is a FormulaParser that does matches what evalFormula
expects. Here's my first cut.
class FormulaParser
{
public:
FormulaParser( const String& );
virtual ~FormulaParser();
bool eval( const String& formula, int& value );
}
bool FormulaParser::eval( int& value )
{
value = 7;
return true;
}
So what do you want already? It passes all the tests. The constructor saves the
formula, but doesn't use it anywhere. Next, I add a new test
to testConstantExpression that put's "=37" into the cell. Gee, now it fails so
I have to do some work.
If I'd never done any parsing, I don't know if I'd introduce lexical tokens here.
But I have, and I am. I need to be able to see the text and type of the next
token as I parse the formula. So far, I only need two types of tokens: integer
constants and operators. So that's all I'll do. First, here's the class that
holds information about a token.
class LexicalToken
{
public:
LexicalToken(int type = TT_UNKNOWN, int value = OT_UNKNOWN)
: m_type(type), m_value(value) {}
~LexicalToken();
bool isOperator() { return m_type == TT_OPERATOR; }
bool isOperator(int opType)
{ return m_type == TT_OPERATOR && m_value == optype; }
bool isIntConst() { return m_type == TT_INTCONST; }
int getIntValue(){ return m_value; }
int getOperator(){ return m_value; }
private:
int m_type;
int m_value;
}
It's little more than a struct with some protection. It has no setters and it
can't change once it's created.
The lexical analyzer that produces these tokens could reasonably be another object.
However, I decided to just have a private method
of FormulaParser read the tokens. This turned out to be a mistake, as I'll
explain, but first I'll show the revisions to FormulaParser.
This parser needs to evaluate an expression consisting of an integer wrapped
in one or more levels of parentheses. It needs to prime the pump by reading
a token, and needs to always read a new token whenever it finishes processing
one. Here it is.
class FormulaParser
{
public:
FormulaParser( const String& );
virtual ~FormulaParser();
bool eval( int& value );
private:
std::istringstream m_ist;
LexicalToken m_token;
void readNextToken();
bool evalExpression( int& value );
};
bool FormulaParser::eval( int& value )
{
if ( m_ist.eof() )
return false;
char ch;
m_ist >> ch;
if ( ch != '=' )
return false;
readNextToken();
return evalExpression( value );
}
void FormulaParser::readNextToken()
{
static String operators("()+*");
while ( !m_ist.eof() )
{
char ch;
m_ist >> ch;
if ( !isspace( ch ) )
{
if ( isdigit( ch ) )
{
int value;
m_ist.unget();
m_ist >> value;
m_token = LexicalToken( TT_INTCONST, value );
return;
)
else if ( operators.find( ch ) >= 0 )
{
m_token = LexicalToken( TT_OPERATOR, ch );
return;
}
}
}
m_token = LexicalToken( TT_EOF, 0 );
return;
}
bool FormulaParser::evalExpression( int& value )
{
bool itsOK = false;
if ( m_token.isIntConst() )
{
value = m_token.getIntValue();
itsOK = true;
}
else if ( m_token.isOperator( OT_LPAREN ) )
{
readNextToken();
if ( evalExpression( value ) )
{
readNextToken();
if ( m_token.isOperator( OT_RPAREN ) )
itsOK = true;
}
}
return itsOK;
}
Basically, the eval routine sets things up and calls evalExpression.
If evalExpression finds an integer, it just returns the value. But if
it finds a left parenthesis, it reads a token and calls itself recursively.
After that call succeeds, it expects the next token to be a right parenthesis.
There is probably an error in there regarding reading the next token at all the
right places, but further tests will show that.
Getting the token reading to be correct was hard. Since I made it a private
routine, I would have to go to extra trouble to test it separately. I didn't go
to the trouble, so it took me longer than it might have. If it were a public
method of a separate Lexer object, it could have it's own test suite.
Adding More Operators
Now that the infrastructure for parsing is in place, it's time to make more
use of it. The first test
I added was testMultiply. After it failed, I decided it would be handy to
factor create a method from the code in evalExpression that takes care of
integers. That made the code look like this:
bool FormulaParser::evalExpression( int& value )
{
if ( m_token.isOperator( OT_LPAREN ) )
{
readNextToken();
if ( evalExpression( value ) )
{
readNextToken();
if ( m_token.isOperator( OT_RPAREN ) )
itsOK = true;
}
}
else
return evalFactor();
}
bool FormulaParser::evalIntConst()
{
if ( m_token.isIntConst() )
{
value = m_token.getIntValue();
itsOK = true;
}
else
return false;
}
Of course, testMultiply still fails, but I verified that
everything else works before proceeding. Next, I reorganized the call
relationships between the routines and implemented multiplication and
addition based on the following grammar (informally expressed):
- An expression is a sum of one or more terms.
- A term is a product of one or more factors.
- A factor is either an integer or an expression in parentheses.
So basically, I couldn't stand to keep doing it test by test. Here's the code:
bool FormulaParser::evalExpression( int& value )
{
if ( !evalTerm( value ) )
return false;
while ( m_token.isOperator( OT_PLUS ) )
{
readNextToken();
int termValue;
if ( !evalTerm( termValue ) )
return false;
value += termValue;
}
return true;
}
bool FormulaParser::evalTerm( int& value )
{
if ( !evalFactor( value ) )
return false;
while ( m_token.isOperator( OT_TIMES ) )
{
readNextToken();
int factorValue;
if ( !evalFactor( factorValue ) )
return false;
value *= factorValue;
}
return true;
}
bool FormulaParser::evalFactor( int& value )
{
if ( m_token.isIntConst() )
{
value = m_token.getIntValue();
readNextToken();
return true;
}
else if ( m_token.isOperator( OT_LPAREN ) )
{
readNextToken();
if ( evalExpression( value ) && m_token.isOperator( OT_RPAREN ) )
{
readNextToken();
return true;
}
}
return false;
}
The tests showed a few errors which I proceeded to fix. The full set ran pretty
quickly. My conclusion is that it can be easier to do a group of inter-related
items all at once rather than one test at a time. Obviously, if I got into
trouble, it wouldn't be simple to fix.
Adding subtraction and division was pretty simple. Here's the code for
doing division:
bool FormulaParser::evalTerm( int& value )
{
if ( !evalFactor( value ) )
return false;
while ( m_token.isOperator() )
{
int op = m_token.getOperator();
if ( op != OT_TIMES && op != OT_DIVIDE )
break;
readNextToken();
int factorValue;
if ( !evalFactor( factorValue ) )
return false;
if ( op == OT_TIMES )
value *= factorValue;
else if ( factorValue == 0 )
return false;
else
value /= factorValue;
}
return true;
}
Subtraction is easier since it doesn't need the divide-by-zero check.
At this point, we are able to parse and evaluate constant expressions.
To be useful, our program needs to allow cell references in formulas.
We'll do this in Part 3.
|