writeexcel::formula(3)

NAME

Formula - A class for generating Excel formulas

SYNOPSIS

See the documentation for Spreadsheet::WriteExcel

DESCRIPTION

This module is used by Spreadsheet::WriteExcel. You do not
need to use it directly.

NOTES

The following notes are to help developers and maintainers
understand the sequence of operation. They are also
intended as a pro-memoria for the author. ;-)

Spreadsheet::WriteExcel::Formula converts a textual repre
sentation of a formula into the pre-parsed binary format
that Excel uses to store formulas. For example "1+2*3" is
stored as follows: "1E 01 00 1E 02 00 1E 03 00 05 03".

This string is comprised of operators and operands
arranged in a reverse-Polish format. The meaning of the
tokens in the above example is shown in the following
table:
Token Name Value
1E ptgInt 0001 (stored as 01 00)
1E ptgInt 0002 (stored as 02 00)
1E ptgInt 0003 (stored as 03 00)
05 ptgMul
03 ptgAdd
The tokens and token names are defined in the "Excel
Developer's Kit" from Microsoft Press. "ptg" stands for
Parse ThinG (as in "That lexer can't grok it, it's a parse
thang.")
In general the tokens fall into two categories: operators
such as "ptgMul" and operands such as "ptgInt". When the
formula is evaluated by Excel the operand tokens push val
ues onto a stack. The operator tokens then pop the
required number of operands off of the stack, perform an
operation and push the resulting value back onto the
stack. This methodology is similar to the basic operation
of a reverse-Polish (RPN) calculator.
Spreadsheet::WriteExcel::Formula parses a formula using a
"Parse::RecDescent" parser (at a later stage it may use a
"Parse::Yapp" parser or "Parse::FastDescent").
The parser converts the textual representation of a for
mula into a parse tree. Thus, "1+2*3" is converted into
something like the following, "e" stands for expression:

e
/ | 1 + e
/ | 2 * 3
The function "_reverse_tree()" recurses down through this
structure swapping the order of operators followed by
operands to produce a reverse-Polish tree. In other words
the formula is converted from in-fix notation to post-fix.
Following the above example the resulting tree would look
like this:

e
/ | 1 e +
/ | 2 3 *
The result of the recursion is a single array of tokens.
In our example the simplified form would look like the
following:

(1, 2, 3, *, +)
The actual return value contains some additional informa
tion to help in the secondary parsing stage:

(_num, 1, _num, 2, _num, 3, ptgMul, ptgAdd, _arg, 1)
The additional tokens are:

Token Meaning
_num The next token is a number
_str The next token is a string
_ref2d The next token is a 2d cell reference
_ref3d The next token is a 3d cell reference
_range2d The next token is a 2d range
_range3d The next token is a 3d range
_func The next token is a function
_arg The next token is the number of args for a
function
_class The next token is a function name
The "_arg" token is generated for all lists but is only
used for functions that take a variable number of argu
ments.
The "_class" token indicates the start of the arguments to
a function. This allows the post-processor to decide the
"class" of the ref and range arguments that the function
takes. The class can be reference, value or array. Since
function calls can be nested, the class variable is stored
on a stack in the @class array. The class of the ref or
range is then read as the top element of the stack
$class[-1]. When a "_func" is read it pops the class
value.
Certain Excel functions such as RAND() and NOW() are des ignated as volatile and must be recalculated by Excel
every time that a cell is updated. Any formulas that con
tain one of these functions has a specially formatted
"ptgAttr" tag prepended to it to indicate that it is
volatile.
A secondary parsing stage is carried out by
"_parse_tokens()" which converts these tokens into a
binary string. For the "1+2*3" example this would give:

1E 01 00 1E 02 00 1E 03 00 05 03
This two-pass method could probably have been reduced to a
single pass through the "Parse::RecDescent" parser. How
ever, it was easier to develop and debug this way.
The token values and formula values are stored in the %ptg
and %functions hashes. These hashes and the parser object
$parser are exposed as global data. This breaks the OO
encapsulation, but means that they can be shared by sev
eral instances of Spreadsheet::WriteExcel called from the
same program.
Non-English function names can be added to the %functions
hash using the "function_locale.pl" program in the "exam
ples" directory of the distro. The supported languages
are: German, French, Spanish, Portuguese, Dutch, Finnish,
Italian and Swedish. These languages are not added by
default because there are conflicts between functions
names in different languages.
The parser is initialised by "_init_parser()". The ini
tialisation is delayed until the first formula is parsed.
This eliminates the overhead of generating the parser in
programs that are not processing formulas. (The parser
should really be pre-compiled, this is to-do when the
grammar stabilises).

AUTHOR

John McNamara jmcnamara@cpan.org

COPYRIGHT

© MM-MMII, John McNamara.

All Rights Reserved. This module is free software. It may
be used, redistributed and/or modified under the same
terms as Perl itself.
Copyright © 2010-2025 Platon Technologies, s.r.o.           Home | Man pages | tLDP | Documents | Utilities | About
Design by styleshout