dag_node(3)

NAME

Tree::DAG_Node - (super)class for representing nodes in a
tree

SYNOPSIS

Using as a base class:
  package Game::Tree::Node; # or whatever you're doing
  use Tree::DAG_Node;
  @ISA = qw(Tree::DAG_Node);
  ...your own methods overriding/extending
    the methods in Tree::DAG_Node...
Using as a class of its own:
  use Tree::DAG_Node;
  my $root = Tree::DAG_Node->new();
  $root->name("I'm the tops");
  my $new_daughter = $root->new_daughter;
  $new_daughter->name("More");
  ...

DESCRIPTION

This class encapsulates/makes/manipulates objects that
represent nodes in a tree structure. The tree structure is
not an object itself, but is emergent from the linkages
you create between nodes. This class provides the methods
for making linkages that can be used to build up a tree,
while preventing you from ever making any kinds of linkages
which are not allowed in a tree (such as having a
node be its own mother or ancestor, or having a node have
two mothers).

This is what I mean by a "tree structure", a bit redundantly
stated:

* A tree is a special case of an acyclic directed graph.

* A tree is a network of nodes where there's exactly one
root node (i.e., 'the top'), and the only primary relationship
between nodes is the mother-daughter relationship.

* No node can be its own mother, or its mother's mother,
etc.

* Each node in the tree has exactly one "parent" (node in
the "up" direction) -- except the root, which is parentless.

* Each node can have any number (0 to any finite number)
of daughter nodes. A given node's daughter nodes constitute
an ordered list. (However, you are free to consider this
ordering irrelevant. Some applications do need
daughters to be ordered, so I chose to consider this the
general case.)

* A node can appear in only one tree, and only once in
that tree. Notably (notable because it doesn't follow
from the two above points), a node cannot appear twice in
its mother's daughter list.

* In other words, there's an idea of up (toward the root)
versus down (away from the root), and left (i.e., toward
the start (index 0) of a given node's daughter list)
versus right (toward the end of a given node's daughter
list).

Trees as described above have various applications, among
them: representing syntactic constituency, in formal
linguistics; representing contingencies in a game tree;
representing abstract syntax in the parsing of any computer
language -- whether in expression trees for programming
languages, or constituency in the parse of a markup
language document. (Some of these might not use the fact
that daughters are ordered.)

(Note: B-Trees are a very special case of the above kinds
of trees, and are best treated with their own class.
Check CPAN for modules encapsulating B-Trees; or if you
actually want a database, and for some reason ended up
looking here, go look at AnyDBM_File.)

Many base classes are not usable except as such -- but
Tree::DAG_Node can be used as a normal class. You can go
ahead and say:

  use Tree::DAG_Node;
  my $root = Tree::DAG_Node->new();
  $root->name("I'm the tops");
  my $new_daughter = Tree::DAG_Node->new();
  $new_daughter->name("More");
  $root->add_daughter($new_daughter);

and so on, constructing and linking objects from
Tree::DAG_Node and making useful tree structures out of
them.

A NOTE TO THE READER

This class is big and provides lots of methods. If your
problem is simple (say, just representing a simple parse
tree), this class might seem like using an atomic sledgehammer
to swat a fly. But the complexity of this module's
bells and whistles shouldn't detract from the efficiency
of using this class for a simple purpose. In fact, I'd be
very surprised if any one user ever had use for more than
even a third of the methods in this class. And remember:
an atomic sledgehammer will kill that fly.

OBJECT CONTENTS

Implementationally, each node in a tree is an object, in
the sense of being an arbitrarily complex data structure
that belongs to a class (presumably Tree::DAG_Node, or
ones derived from it) that provides methods.

The attributes of a node-object are:

mother -- this node's mother. undef if this is a root.

daughters -- the (possibly empty) list of daughters of
this node.

name -- the name for this node.
Need not be unique, or even printable. This is
printed in some of the various dumper methods, but
it's up to you if you don't put anything meaningful or
printable here.

attributes -- whatever the user wants to use it for.
Presumably a hashref to whatever other attributes the
user wants to store without risk of colliding with the
object's real attributes. (Example usage: attributes
to an SGML tag -- you definitely wouldn't want the
existence of a "mother=foo" pair in such a tag to collide
with a node object's 'mother' attribute.)
Aside from (by default) initializing it to {}, and
having the access method called "attributes"
(described a ways below), I don't do anything with the
"attributes" in this module. I basically intended
this so that users who don't want/need to bother
deriving a class from Tree::DAG_Node, could still
attach whatever data they wanted in a node.

"mother" and "daughters" are attributes that relate to
linkage -- they are never written to directly, but are
changed as appropriate by the "linkage methods", discussed
below.

The other two (and whatever others you may add in derived
classes) are simply accessed thru the same-named methods,
discussed further below.

ABOUT THE DOCUMENTED INTERFACE

Stick to the documented interface (and comments in the
source -- especially ones saying "undocumented!" and/or
"disfavored!" -- do not count as documentation!), and
don't rely on any behavior that's not in the documented
interface.

Specifically, unless the documentation for a particular
method says "this method returns thus-and-such a value",
then you should not rely on it returning anything meaningful.

A passing acquaintance with at least the broader details of
the source code for this class is assumed for anyone using
this class as a base class -- especially if you're overriding
existing methods, and definitely if you're overriding
linkage methods.

MAIN CONSTRUCTOR, AND INITIALIZER

the constructor CLASS->new() or
CLASS->new({...options...})
This creates a new node object, calls
$object->_init({...options...}) to provide it sane
defaults (like: undef name, undef mother, no daughters,
'attributes' setting of a new empty hashref),
and returns the object created. (If you just said
"CLASS->new()" or "CLASS->new", then it pretends you
called "CLASS->new({})".)
Currently no options for putting in {...options...}
are part of the documented interface, but the option
is here in case you want to add such behavior in a
derived class.
Read on if you plan on using Tree::DAG_Node as a base
class. (Otherwise feel free to skip to the description
of _init.)
There are, in my mind, two ways to do object construction:
Way 1: create an object, knowing that it'll have certain
uninteresting sane default values, and then call
methods to change those values to what you want.
Example:

  $node = Tree::DAG_Node->new;
  $node->name('Supahnode!');
  $root->add_daughter($node);
  $node->add_daughters(@some_others);
Way 2: be able to specify some/most/all the object's
attributes in the call to the constructor. Something
like:

  $node = Tree::DAG_Node->new({
    name => 'Supahnode!',
    mother => $root,
    daughters => \@some_others
  });
After some deliberation, I've decided that the second
way is a Bad Thing. First off, it is not markedly
more concise than the first way. Second off, it often
requires subtly different syntax (e.g., \@some_others
vs @some_others). It just complicates things for the
programmer and the user, without making either appreciably
happier.
(This is not to say that options in general for a constructor
are bad -- "random_network", discussed far
below, necessarily takes options. But note that those
are not options for the default values of attributes.)
Anyway, if you use Tree::DAG_Node as a superclass, and
you add attributes that need to be initialized, what
you need to do is provide an _init method that calls
$this->SUPER::_init($options) to use its superclass's
_init method, and then initializes the new attributes:

  sub _init {
    my($this, $options) = @_[0,1];
    # call my superclass's _init to init all the
    # attributes I'm inheriting
    $this->SUPER::_init($options);
    # Now init /my/ new attributes:
    $this->{'amigos'} = []; # for example
  }
...or, as I prefer when I'm being a neat freak:

  sub _init {
    my($this, $options) = @_[0,1];
    $this->SUPER::_init($options);
    $this->_init_amigos($options);
  }
  sub _init_amigos {
    my $this = $_[0];
    # Or my($this,$options) = @_[0,1]; if I'm using $options
    $this->{'amigos'} = [];
  }
In other words, I like to have each attribute initialized
thru a method named _init_[attribute], which
should expect the object as $_[0] and the options
hashref (or {} if none was given) as $_[1]. If you
insist on having your _init recognize options for
setting attributes, you might as well have them dealt
with by the appropriate _init_[attribute] method, like
this:

  sub _init {
    my($this, $options) = @_[0,1];
    $this->SUPER::_init($options);
    $this->_init_amigos($options);
  }
  sub _init_amigos {
    my($this,$options) = @_[0,1]; # I need options this time
    $this->{'amigos'} = [];
    $this->amigos(@{$options->{'amigos'}}) if $options->{'amigos'};
  }
All this bookkeeping looks silly with just one new
attribute in a class derived straight from
Tree::DAG_Node, but if there's lots of new attributes
running around, and if you're deriving from a class
derived from a class derived from Tree::DAG_Node, then
tidy stratification/modularization like this can keep
you sane.
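The stratified _init pattern above can be put together into one runnable file. This is only a sketch: the package name My::Node, the "amigos" accessor, and the sample names are made up for illustration and are not part of Tree::DAG_Node.

```perl
use strict;
use warnings;

# Hypothetical subclass: a node that also keeps a list of "amigos".
{
    package My::Node;
    use parent 'Tree::DAG_Node';

    sub _init {
        my ($this, $options) = @_[0, 1];
        $this->SUPER::_init($options);    # inherited attributes first
        $this->_init_amigos($options);    # then the new one
    }

    sub _init_amigos {
        my ($this, $options) = @_[0, 1];
        $this->{'amigos'} = [];
        $this->amigos(@{ $options->{'amigos'} }) if $options->{'amigos'};
    }

    # Hypothetical accessor: returns the list; replaces it when given arguments.
    sub amigos {
        my $this = shift;
        @{ $this->{'amigos'} } = @_ if @_;
        return @{ $this->{'amigos'} };
    }
}

my $plain   = My::Node->new;
my $popular = My::Node->new({ amigos => ['Lucky', 'Dusty', 'Ned'] });
$popular->name('El Guapo');

printf "%s has %d amigos\n", $popular->name, scalar($popular->amigos);
printf "default node has %d amigos\n", scalar($plain->amigos);
```

Because "new" hands the options hashref straight to _init, the subclass needs no constructor of its own.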
the constructor $obj->new() or $obj->new({...options...})
Just another way to get at the "new" method. This does
not copy $obj, but merely constructs a new object of
the same class as it. Saves you the bother of going
$class = ref $obj; $obj2 = $class->new;
the method $node->_init({...options...})
Initialize the object's attribute values. See the
discussion above. Presumably this should be called
only by the guts of the "new" constructor -- never by
the end user.
Currently there are no documented options for putting
in {...options...}, but (in case you want to disregard
the above rant) the option exists for you to use
{...options...} for something useful in a derived
class.
Please see the source for more information.
See also (below) the constructors "new_daughter" and
"new_daughter_left".

LINKAGE-RELATED METHODS

$node->daughters
This returns the (possibly empty) list of daughters
for $node.
$node->mother
This returns what node is $node's mother. This is
undef if $node has no mother -- i.e., if it is a root.
$mother->add_daughters( LIST )
This method adds the node objects in LIST to the
(right) end of $mother's "daughter" list. Making a
node N1 the daughter of another node N2 also means
that N1's "mother" attribute is "automatically" set to
N2; it also means that N1 stops being anything else's
daughter as it becomes N2's daughter.
If you try to make a node its own mother, a fatal
error results. If you try to take one of a node
N1's ancestors and make it also a daughter of N1, a
fatal error results. A fatal error results if anything
in LIST isn't a node object.
If you try to make N1 a daughter of N2, but it's
already a daughter of N2, then this is a no-operation --
it won't move such nodes to the end of the list or
anything; it just skips doing anything with them.
$node->add_daughter( LIST )
An exact synonym for $node->add_daughters(LIST)
$mother->add_daughters_left( LIST )
This method is just like "add_daughters", except that
it adds the node objects in LIST to the (left) beginning
of $mother's daughter list, instead of the
(right) end of it.
$node->add_daughter_left( LIST )
An exact synonym for $node->add_daughters_left( LIST )
Note:
The above link-making methods perform basically an
"unshift" or "push" on the mother node's daughter
list. To get the full range of list-handling functionality,
copy the daughter list, and change it, and
then call "set_daughters" on the result:

  @them = $mother->daughters;
  @removed = splice(@them, 0,2, @new_nodes);
  $mother->set_daughters(@them);
Or consider a structure like:

  $mother->set_daughters(
    grep($_->name =~ /NP/, $mother->daughters)
  );
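Both idioms from the note above can be tried on a small tree. This is a minimal sketch; the node names (S, NP1, VP, NP2, PP, X) are invented for the example.

```perl
use strict;
use warnings;
use Tree::DAG_Node;

my $mother = Tree::DAG_Node->new;
$mother->name('S');

# Four daughters named NP1, VP, NP2, PP (names are made up for the example).
for my $name (qw(NP1 VP NP2 PP)) {
    $mother->new_daughter->name($name);
}

# Replace the first two daughters with one fresh node.
my $fresh = Tree::DAG_Node->new;
$fresh->name('X');
my @them    = $mother->daughters;
my @removed = splice(@them, 0, 2, $fresh);
$mother->set_daughters(@them);
my @after_splice = map { $_->name } $mother->daughters;

# Keep only the daughters whose name matches /NP/.
$mother->set_daughters(grep { $_->name =~ /NP/ } $mother->daughters);
my @after_grep = map { $_->name } $mother->daughters;

print "after splice: @after_splice\n";    # X NP2 PP
print "after grep:   @after_grep\n";      # NP2
print "removed:      ", join(' ', map { $_->name } @removed), "\n";   # NP1 VP
```

The nodes left out of the list passed to "set_daughters" end up motherless, since all of $mother's daughters are unlinked before the new list is attached.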
the constructor $daughter = $mother->new_daughter, or
the constructor $daughter = $mother->new_daughter({...options...})
This constructs a new node (of the same class as
$mother), and adds it to the (right) end of the daughter
list of $mother. This is essentially the same as
going

  $daughter = $mother->new;
  $mother->add_daughter($daughter);
but is rather more efficient because (since $daughter
is guaranteed new and isn't linked to/from anything),
it doesn't have to check that $daughter isn't an
ancestor of $mother, isn't already daughter to a
mother it needs to be unlinked from, isn't already in
$mother's daughter list, etc.
As you'd expect for a constructor, it returns the
node-object created.
the constructor $mother->new_daughter_left, or
$mother->new_daughter_left({...options...})
This is just like $mother->new_daughter, but adds the
new daughter to the left (start) of $mother's daughter
list.
$mother->remove_daughters( LIST )
This removes the nodes listed in LIST from $mother's
daughter list. This is a no-operation if LIST is
empty. If there are things in LIST that aren't a current
daughter of $mother, they are ignored.
Not to be confused with $mother->clear_daughters.
$node->remove_daughter( LIST )
An exact synonym for $node->remove_daughters( LIST )
$node->unlink_from_mother
This removes node from the daughter list of its
mother. If it has no mother, this is a no-operation.
Returns the mother unlinked from (if any).
$mother->clear_daughters
This unlinks all $mother's daughters. Returns the
list of what used to be $mother's daughters.
Not to be confused with $mother->remove_daughters(
LIST ).
$mother->set_daughters( LIST )
This unlinks all $mother's daughters, and replaces
them with the daughters in LIST.
Currently implemented as just $mother->clear_daughters
followed by $mother->add_daughters( LIST ).
$node->replace_with( LIST )
This replaces $node in its mother's daughter list, by
unlinking $node and replacing it with the items in
LIST. This returns a list consisting of $node followed
by LIST, i.e., the nodes that replaced it.
LIST can include $node itself (presumably at most
once). LIST can also be empty-list. However, if any
items in LIST are sisters to $node, they are ignored,
and are not in the copy of LIST passed as the return
value.
As you might expect for any linking operation, the
items in LIST cannot be $node's mother, or any ancestor
to it; and items in LIST are, of course, unlinked
from their mothers (if they have any) as they're
linked to $node's mother.
(In the special (and bizarre) case where $node is
root, this simply calls $this->unlink_from_mother on
all the items in LIST, making them roots of their own
trees.)
Note that the daughter-list of $node is not necessarily
affected; nor are the daughter-lists of the items
in LIST. I mention this in case you think
replace_with switches one node for another, with
respect to its mother list and its daughter list,
leaving the rest of the tree unchanged. If that's what
you want, replacing $Old with $New, then you want:

  $New->set_daughters($Old->clear_daughters);
  $Old->replace_with($New);
(I can't say $node's and LIST-items' daughter lists
are never affected by replace_with -- they can be
affected in this case:

  $N1 = ($node->daughters)[0]; # first daughter of $node
  $N2 = ($N1->daughters)[0];   # first daughter of $N1;
  $N3 = Tree::DAG_Node->random_network; # or whatever
  $node->replace_with($N1, $N2, $N3);
As a side effect of attaching $N1 and $N2 to $node's
mother, they're unlinked from their parents ($node,
and $N1, respectively). But $N3's daughter list is
unaffected.)
In other words, this method does what it has to, as
you'd expect it to.
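The "switch one node for another" recipe can be checked on a small tree. This is a sketch under the behavior described above; the tree shape and the node names (root, A, Old, C, x, y, New) are invented for the example.

```perl
use strict;
use warnings;
use Tree::DAG_Node;

# A small tree:  root -> (A, Old, C),  Old -> (x, y)
my $root = Tree::DAG_Node->new;
$root->name('root');
my %node;
for my $name (qw(A Old C)) {
    $node{$name} = $root->new_daughter;
    $node{$name}->name($name);
}
for my $name (qw(x y)) {
    $node{$name} = $node{'Old'}->new_daughter;
    $node{$name}->name($name);
}

my $new = Tree::DAG_Node->new;
$new->name('New');

# Move Old's daughters to New, then put New where Old was.
$new->set_daughters($node{'Old'}->clear_daughters);
$node{'Old'}->replace_with($new);

my @top   = map { $_->name } $root->daughters;
my @under = map { $_->name } $new->daughters;
print "root's daughters: @top\n";     # A New C
print "New's daughters:  @under\n";   # x y
print "Old is now a root\n" unless defined $node{'Old'}->mother;
```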
$node->replace_with_daughters
This replaces $node in its mother's daughter list, by
unlinking $node and replacing it with its daughters.
In other words, $node becomes motherless and daughterless
as its daughters move up and take its place.
This returns a list consisting of $node followed by
the nodes that were its daughters.
In the special (and bizarre) case where $node is root,
this simply unlinks its daughters from it, making them
roots of their own trees.
Effectively the same as
$node->replace_with($node->daughters), but more efficient,
since less checking has to be done. (And I
also think $node->replace_with_daughters is a more
common operation in tree-wrangling than
$node->replace_with(LIST), so deserves a named method
of its own, but that's just me.)
$node->add_left_sisters( LIST )
This adds the elements in LIST (in that order) as
immediate left sisters of $node. In other words,
given that B's mother's daughter-list is (A,B,C,D),
calling B->add_left_sisters(X,Y) makes B's mother's
daughter-list (A,X,Y,B,C,D).
If LIST is empty, this is a no-op, and returns
empty-list.
This is basically implemented as a call to
$node->replace_with(LIST, $node), and so all
replace_with's limitations and caveats apply.
The return value of $node->add_left_sisters( LIST ) is
the elements of LIST that got added, as returned by
replace_with -- minus the copies of $node you'd get
from a straight call to $node->replace_with(LIST,
$node).
$node->add_left_sister( LIST )
An exact synonym for $node->add_left_sisters(LIST)
$node->add_right_sisters( LIST )
Just like add_left_sisters (which see), except that
it adds the elements in LIST (in that order) as immediate
right sisters of $node.
In other words, given that B's mother's daughter-list
is (A,B,C,D), calling B->add_right_sisters(X,Y) makes
B's mother's daughter-list (A,B,X,Y,C,D).
$node->add_right_sister( LIST )
An exact synonym for $node->add_right_sisters(LIST)
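The (A,B,C,D) examples above can be reproduced directly. A minimal sketch: the helper "named_node" and the extra names P and Q are made up for this example.

```perl
use strict;
use warnings;
use Tree::DAG_Node;

# Helper for this example only: make a named, unattached node.
sub named_node {
    my $n = Tree::DAG_Node->new;
    $n->name($_[0]);
    return $n;
}

my $mother = Tree::DAG_Node->new;
my %n = map { $_ => named_node($_) } qw(A B C D X Y P Q);
$mother->add_daughters(@n{qw(A B C D)});

$n{'B'}->add_left_sisters(@n{qw(X Y)});
my $after_left = join ',', map { $_->name } $mother->daughters;

$n{'B'}->add_right_sisters(@n{qw(P Q)});
my $after_right = join ',', map { $_->name } $mother->daughters;

print "$after_left\n";     # A,X,Y,B,C,D
print "$after_right\n";    # A,X,Y,B,P,Q,C,D
```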

OTHER ATTRIBUTE METHODS

$node->name or $node->name(SCALAR)
In the first form, returns the value of the node
object's "name" attribute. In the second form, sets
it to the value of SCALAR.
$node->attributes or $node->attributes(SCALAR)
In the first form, returns the value of the node
object's "attributes" attribute. In the second form,
sets it to the value of SCALAR. I intend this to be
used to store a reference to a (presumably anonymous)
hash the user can use to store whatever attributes he
doesn't want to have to store as object attributes.
In this case, you needn't ever set the value of this.
(_init has already initialized it to {}.) Instead you
can just do...

$node->attributes->{'foo'} = 'bar';
...to write foo => bar.
$node->attribute or $node->attribute(SCALAR)
An exact synonym for $node->attributes or
$node->attributes(SCALAR)

OTHER METHODS TO DO WITH RELATIONSHIPS

$node->is_node
This always returns true. More pertinently,
$object->can('is_node') is true (regardless of what
"is_node" would do if called) for objects belonging to
this class or for any class derived from it.
$node->ancestors
Returns the list of this node's ancestors, starting
with its mother, then grandmother, and ending at the
root. It does this by simply following the 'mother'
attributes up as far as it can. So if $node is the
root, this returns an empty list.
Consider that scalar($node->ancestors) returns the ply
of this node within the tree -- 2 for a granddaughter
of the root, etc., and 0 for root itself.
$node->root
Returns the root of whatever tree $node is a member
of. If $node is the root, then the result is $node
itself.
$node->is_daughter_of($node2)
Returns true iff $node is a daughter of $node2. Currently
implemented as just a test of ($it->mother eq
$node2).
$node->self_and_descendants
Returns a list consisting of itself (as element 0) and
all the descendants of $node. Returns just itself if
$node is a terminal_node.
(Note that it's spelled "descendants", not
"descendents".)
$node->descendants
Returns a list consisting of all the descendants of
$node. Returns empty-list if $node is a terminal_node.
(Note that it's spelled "descendants", not
"descendents".)
$node->leaves_under
Returns a list (going left-to-right) of all the leaf
nodes under $node. ("Leaf nodes" are also called
"terminal nodes" -- i.e., nodes that have no daughters.)
Returns $node in the degenerate case of $node
being a leaf itself.
$node->depth_under
Returns an integer representing the number of branches
between this $node and the most distant leaf under it.
(In other words, this returns the ply of the subtree
starting at $node. Consider scalar($it->ancestors) if
you want the ply of a node within the whole tree.)
$node->generation
Returns a list of all nodes (going left-to-right) that
are in $node's generation -- i.e., that are the same
number of nodes down from the root. $root->generation
is just $root.
Of course, $node is always in its own generation.
$node->generation_under(NODE2)
Like $node->generation, but returns only the nodes in
$node's generation that are also descendants of NODE2
-- in other words,

  @us = $node->generation_under( $node->mother->mother );
is all $node's first cousins (to borrow yet more kinship
terminology) -- assuming $node does indeed have a
grandmother. Actually "cousins" isn't quite an apt
word, because @us ends up including $node's siblings
and $node.
Actually, "generation_under" is just an alias to "generation",
but I figure that this:

  @us = $node->generation_under($way_upline);
is a bit more readable than this:

  @us = $node->generation($way_upline);
But it's up to you.
$node->generation_under($node) returns just $node.
If you call $node->generation_under(NODE2) but NODE2
is not $node or an ancestor of $node, it behaves as if
you called just $node->generation().
$node->self_and_sisters
Returns a list of all nodes (going left-to-right) that
have the same mother as $node -- including $node
itself. This is just like $node->mother->daughters,
except that that fails where $node is root, whereas
$root->self_and_sisters, as a special case, returns
$root.
(Contrary to how you may interpret how this method is
named, "self" is not (necessarily) the first element
of what's returned.)
$node->sisters
Returns a list of all nodes (going left-to-right) that
have the same mother as $node -- not including $node
itself. If $node is root, this returns empty-list.
$node->left_sister
Returns the node that's the immediate left sister of
$node. If $node is the leftmost (or only) daughter of
its mother (or has no mother), then this returns
undef.
(See also $node->add_left_sisters(LIST).)
$node->left_sisters
Returns a list of nodes that're sisters to the left of
$node. If $node is the leftmost (or only) daughter of
its mother (or has no mother), then this returns an
empty list.
(See also $node->add_left_sisters(LIST).)
$node->right_sister
Returns the node that's the immediate right sister of
$node. If $node is the rightmost (or only) daughter
of its mother (or has no mother), then this returns
undef.
(See also $node->add_right_sisters(LIST).)
$node->right_sisters
Returns a list of nodes that're sisters to the right
of $node. If $node is the rightmost (or only) daughter
of its mother (or has no mother), then this returns an
empty list.
(See also $node->add_right_sisters(LIST).)
$node->my_daughter_index
Returns what index this daughter is, in its mother's
"daughter" list. In other words, if $node is
($node->mother->daughters)[3], then
$node->my_daughter_index returns 3.
As a special case, returns 0 if $node has no mother.
$node->address or $anynode->address(ADDRESS)
With the first syntax, returns the address of $node
within its tree, based on its position within the
tree. An address is formed by noting the path between
the root and $node, and concatenating the daughter-indices
of the nodes this passes thru (starting with 0
for the root, and ending with $node).
For example, if to get from node ROOT to node $node,
you pass thru ROOT, A, B, and $node, then the address
is determined as:
* ROOT's my_daughter_index is 0.
* A's my_daughter_index is, suppose, 2. (A is index 2
in ROOT's daughter list.)
* B's my_daughter_index is, suppose, 0. (B is index 0
in A's daughter list.)
* $node's my_daughter_index is, suppose, 4. ($node is
index 4 in B's daughter list.)
The address of the above-described $node is, therefore,
"0:2:0:4".
(As a somewhat special case, the address of the root
is always "0"; and since addresses start from the
root, all addresses start with a "0".)
The second syntax, where you provide an address,
starts from the root of the tree $anynode belongs to,
and returns the node corresponding to that address.
Returns undef if no node corresponds to that address.
Note that this routine may be somewhat liberal in its
interpretation of what can constitute an address;
i.e., it accepts "0.2.0.4", besides "0:2:0:4".
Also note that the address of a node in a tree is
meaningful only in that tree as currently structured.
(Consider how ($address1 cmp $address2) may be magically
meaningful to you, if you want to figure out
what nodes are to the right of what other nodes.)
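Both forms of "address" can be exercised on a three-level tree. A minimal sketch; the tree shape and variable names are assumptions made for the example.

```perl
use strict;
use warnings;
use Tree::DAG_Node;

# root -> (a, b),  b -> (c)
my $root   = Tree::DAG_Node->new;
my $a_node = $root->new_daughter;
my $b_node = $root->new_daughter;
my $c_node = $b_node->new_daughter;
$c_node->name('c');

my $addr = $c_node->address;            # node -> address
my $same = $a_node->address($addr);     # address -> node, from any node in the tree
my $none = $root->address('0:5');       # root has no daughter at index 5

print "address of c: $addr\n";          # 0:1:0
print "found: ", $same->name, "\n";     # c
print "0:5 is ", (defined $none ? 'a node' : 'undef'), "\n";
```

Since an address is only meaningful for the tree as currently structured, it is safest to look nodes up by address before relinking anything.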
$node->common(LIST)
Returns the lowest node in the tree that is ancestor-or-self
to the nodes $node and LIST.
If the nodes are far enough apart in the tree, the
answer is just the root.
If the nodes aren't all in the same tree, the answer
is undef.
As a degenerate case, if LIST is empty, returns $node.
$node->common_ancestor(LIST)
Returns the lowest node that is ancestor to all the
nodes given (in nodes $node and LIST). In other
words, it answers the question: "What node in the
tree, as low as possible, is ancestor to the nodes
given ($node and LIST)?"
If the nodes are far enough apart, the answer is just
the root -- except if any of the nodes are the root
itself, in which case the answer is undef (since the
root has no ancestor).
If the nodes aren't all in the same tree, the answer
is undef.
As a degenerate case, if LIST is empty, returns
$node's mother; that'll be undef if $node is root.
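A few of the cases described for "common" and "common_ancestor" can be tried as follows. This is only a sketch; the tree shape and the names are invented for the example.

```perl
use strict;
use warnings;
use Tree::DAG_Node;

# root -> (a, b),  a -> (a1, a2)
my $root = Tree::DAG_Node->new;
$root->name('root');
my $left  = $root->new_daughter;  $left->name('a');
my $right = $root->new_daughter;  $right->name('b');
my $a1    = $left->new_daughter;  $a1->name('a1');
my $a2    = $left->new_daughter;  $a2->name('a2');

my $near = $a1->common($a2);            # sisters: their mother
my $far  = $a1->common($right);         # only the root is above both
my $self = $a1->common;                 # empty LIST: $a1 itself
my $anc  = $a1->common_ancestor;        # empty LIST: $a1's mother

my $stranger = Tree::DAG_Node->new;     # a node in a different tree
my $nothing  = $a1->common($stranger);  # undef

print join(' ', map { $_->name } $near, $far, $self, $anc), "\n";   # a root a1 a
print "different trees: ", (defined $nothing ? 'defined' : 'undef'), "\n";
```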

YET MORE METHODS

$node->walk_down({ callback => foo, callbackback =>
foo, ... })
Performs a depth-first traversal of the structure at
and under $node. What it does at each node depends on
the value of the options hashref, which you must provide.
There are three options, "callback" and "callbackback"
(at least one of which must be defined, as a
sub reference), and "_depth". This is what
"walk_down" does, in pseudocode form:
* Start at the $node given.
* If there's a "callback", call it with $node as the
first argument, and the options hashref as the second
argument (which contains the potentially useful
"_depth", remember). This function must return true
or false -- if false, it will block the next step:
* If $node has any daughter nodes, increment "_depth",
and call $daughter->walk_down(options_hashref) for
each daughter (in order, of course), where
options_hashref is the same hashref it was called
with. When this returns, decrements "_depth".
* If there's a "callbackback", call it just as with
"callback" (but tossing out the return value). Note
that "callback" returning false blocks traversal below
$node, but doesn't block calling callbackback for
$node. (Incidentally, in the unlikely case that $node
has stopped being a node object, "callbackback" won't
get called.)
* Return.
$node->walk_down is the way to recursively do things
to a tree (if you start at the root) or part of a
tree; if what you're doing is best done via pre-order
traversal, use "callback"; if what you're doing
is best done with post-order traversal, use "callbackback".
"walk_down" is even the basis for plenty of
the methods in this class. See the source code for
examples both simple and horrific.
Note that if you don't specify "_depth", it effectively
defaults to 0. You should set it to
scalar($node->ancestors) if you want "_depth" to
reflect the true depth-in-the-tree for the nodes
called, instead of just the depth below $node. (If
$node is the root, there's no difference, of course.)
And by the way, it's a bad idea to modify the tree
from the callback. Unpredictable things may happen.
I instead suggest having your callback add to a stack
of things that need changing, and then, once
"walk_down" is all finished, changing those nodes from
that stack.
Note that the existence of "walk_down" doesn't mean
you can't write your own special-use traversers.
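The traversal order, the use of "_depth", and the "collect now, change later" advice can be seen together in a short script. A minimal sketch; the tree and the variable names (@pre, @post, @to_rename) are made up for the example.

```perl
use strict;
use warnings;
use Tree::DAG_Node;

# root -> (a, b),  a -> (a1)
my $root = Tree::DAG_Node->new;      $root->name('root');
my $na   = $root->new_daughter;      $na->name('a');
my $nb   = $root->new_daughter;      $nb->name('b');
my $na1  = $na->new_daughter;        $na1->name('a1');

my (@pre, @post, @to_rename);
$root->walk_down({
    callback => sub {                 # runs before the daughters are visited
        my ($node, $options) = @_;
        push @pre, ('  ' x $options->{'_depth'}) . $node->name;
        my @daughters = $node->daughters;
        push @to_rename, $node unless @daughters;   # remember the leaves
        return 1;                     # true: keep descending
    },
    callbackback => sub {             # runs after the daughters are visited
        my ($node) = @_;
        push @post, $node->name;
    },
    _depth => 0,
});

# Change nodes only after the walk has finished.
$_->name(uc $_->name) for @to_rename;

print "$_\n" for @pre;                # root, a, a1, b, indented by depth
print "post-order: @post\n";          # a1 a b root
print "leaves now: ", join(' ', map { $_->name } @to_rename), "\n";   # A1 B
```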
@lines = $node->dump_names({ ...options... });
Dumps, as an indented list, the names of the nodes
starting at $node, and continuing under it. Options
are:
* _depth -- A nonnegative number. Indicating the
depth to consider $node as being at (and so the generation
under that is that plus one, etc.). Defaults to
0. You may choose to set _depth =>
scalar($node->ancestors).
* tick -- a string to preface each entry with, between
the indenting-spacing and the node's name. Defaults
to empty-string. You may prefer "*" or "-> " or
something.
* indent -- the string used to indent with. Defaults
to "  " (two spaces). Another sane value might be
". " (period, space). Setting it to empty-string
suppresses indenting.
The dump is not printed, but is returned as a list,
where each item is a line, with a "\n" at the end.
the constructor CLASS->random_network({...options...})
the method $node->random_network({...options...})
In the first case, constructs a randomly arranged network
under a new node, and returns the root node of
that tree. In the latter case, constructs the network
under $node.
Currently, this is implemented a bit half-heartedly,
and half-wittedly. I basically needed to make up
random-looking networks to stress-test the various
tree-dumper methods, and so wrote this. If you actually
want to rely on this for any application more serious
than that, I suggest examining the source code and
seeing if this does really what you need (say, in
reliability of randomness); and feel totally free to
suggest changes to me (especially in the form of "I
rewrote "random_network", here's the code...")
It takes four options:
* max_node_count -- maximum number of nodes this tree
will be allowed to have (counting the root). Defaults
to 25.
* min_depth -- minimum depth for the tree. Defaults
to 2. Leaves can be generated only after this depth
is reached, so the tree will be at least this deep --
unless max_node_count is hit first.
* max_depth -- maximum depth for the tree. Defaults
to 3 plus min_depth. The tree will not be deeper than
this.
* max_children -- maximum number of children any
mother in the tree can have. Defaults to 4.
the constructor CLASS->lol_to_tree($lol);
Converts something like bracket-notation for "Chomsky
trees" (or rather, the closest you can come with Perl
list-of-lists(-of-lists(-of-lists))) into a tree
structure. Returns the root of the tree converted.
The conversion rules are that: 1) if the last (possibly
the only) item in a given list is a scalar, then
that is used as the "name" attribute for the node
based on this list. 2) All other items in the list
represent daughter nodes of the current node -- recursively
so, if they are list references; otherwise,
(non-terminal) scalars are considered to denote nodes
with that name. So ['Foo', 'Bar', 'N'] is an alternate
way to represent [['Foo'], ['Bar'], 'N'].
An example will illustrate:

  use Tree::DAG_Node;
  $lol =
    [
      [
        [ [ 'Det:The' ],
          [ [ 'dog' ], 'N'], 'NP'],
        [ '/with rabies\\', 'PP'],
        'NP'
      ],
      [ 'died', 'VP'],
      'S'
    ];
  $tree = Tree::DAG_Node->lol_to_tree($lol);
  $diagram = $tree->draw_ascii_tree;
  print map "$_\n", @$diagram;
...returns this tree:

                           |
                          <S>
                           |
                  /-----------------\
                  |                 |
                 <NP>              <VP>
                  |                 |
          /---------------\       <died>
          |               |
         <NP>            <PP>
          |               |
      /-------\    </with rabies\>
      |       |
  <Det:The>  <N>
              |
            <dog>
By the way (and this rather follows from the above
rules), when denoting a LoL tree consisting of just
one node, this:

$tree = Tree::DAG_Node->lol_to_tree( 'Lonely' );
is okay, although it'd probably occur to you to denote
it only as:

$tree = Tree::DAG_Node->lol_to_tree( ['Lonely'] );
which is of course fine, too.
$node->tree_to_lol_notation({...options...})
Dumps a tree (starting at $node) as the sort of LoL-like
bracket notation you see in the above example
code. Returns just one big block of text. The only
option is "multiline" -- if true, it dumps the text as
the sort of indented structure as seen above; if false
(and it defaults to false), dumps it all on one line
(with no indenting, of course).
For example, starting with the tree from the above
example, this:

  print $tree->tree_to_lol_notation, "\n";
prints the following (which I've broken over two lines
for the sake of printability of documentation):

  [[[['Det:The'], [['dog'], 'N'], 'NP'], [["/with rabies"],
  'PP'], 'NP'], [['died'], 'VP'], 'S'],
Doing this:

  print $tree->tree_to_lol_notation({ multiline => 1 });
prints the same content, just spread over many lines,
and prettily indented.
$node->tree_to_lol
Returns that tree (starting at $node) represented as a
LoL, like what $lol, above, holds. (This is as
opposed to "tree_to_lol_notation", which returns the
viewable code like what gets evaluated and stored in
$lol, above.)
Lord only knows what you use this for -- maybe for
feeding to Data::Dumper, in case "tree_to_lol_notation"
doesn't do just what you want?
the constructor CLASS->simple_lol_to_tree($simple_lol);
This is like lol_to_tree, except that rule 1 doesn't
apply -- i.e., all scalars (or really, anything not a
listref) in the LoL-structure end up as named terminal
nodes, and only terminal nodes get names (and, of
course, that name comes from that scalar value). This
method is useful for making things like expression
trees, or at least starting them off. Consider that
this:

  $tree = Tree::DAG_Node->simple_lol_to_tree(
    [ 'foo', ['bar', ['baz'], 'quux'], 'zaz', 'pati' ]
  );
converts from something like a Lispish or Iconish
tree, if you pretend the brackets are parentheses.
Note that there is a (possibly surprising) degenerate
case of what I'm calling a "simple-LoL", and it's like
this:

  $tree = Tree::DAG_Node->simple_lol_to_tree('Lonely');
This is the (only) way you can specify a tree consisting
of only a single node, which here gets the name
'Lonely'.
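To make the difference from lol_to_tree concrete, here is a minimal stand-alone sketch of the simple-LoL rule: every non-listref becomes a named terminal node, and listrefs become unnamed non-terminals. The helper names simple_lol_to_hash_tree and leaf_names, and the plain-hash node shape, are assumptions made for the example and are not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT the module's implementation.
sub simple_lol_to_hash_tree {
    my ($item) = @_;

    # Anything that is not a listref is a named terminal node.
    return { name => $item, daughters => [] } unless ref $item eq 'ARRAY';

    # A listref is an unnamed non-terminal whose items are its daughters.
    return {
        name      => undef,
        daughters => [ map { simple_lol_to_hash_tree($_) } @$item ],
    };
}

# Collect the names of the terminal nodes, left to right.
sub leaf_names {
    my ($node) = @_;
    my @daughters = @{ $node->{daughters} };
    return @daughters ? (map { leaf_names($_) } @daughters) : ($node->{name});
}

my $tree = simple_lol_to_hash_tree(
    [ 'foo', ['bar', ['baz'], 'quux'], 'zaz', 'pati' ]
);
my $lonely = simple_lol_to_hash_tree('Lonely');

print join(' ', leaf_names($tree)), "\n";    # foo bar baz quux zaz pati
print join(' ', leaf_names($lonely)), "\n";  # Lonely
```

Note that in this sketch the root of the first tree has no name at all, which matches the statement that only terminal nodes get names.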
$node->tree_to_simple_lol
Returns that tree (starting at $node) represented as a
simple-LoL -- i.e., one where non-terminal nodes are
represented as listrefs, and terminal nodes are gotten
from the contents of those nodes' "name" attributes.
Note that in the case of $node being terminal, what
you get back is the same as $node->name.
Compare to tree_to_simple_lol_notation.
$node->tree_to_simple_lol_notation({...options...})
A simple-LoL version of tree_to_lol_notation (which
see); takes the same options.
$list_r = $node->draw_ascii_tree({ ... options ... })
Draws a nice ASCII-art representation of the tree
structure at-and-under $node, with $node at the top.
Returns a reference to the list of lines (with no
"\n"s or anything at the end of them) that make up the
picture.
Example usage:

  print map("$_\n", @{$tree->draw_ascii_tree});
draw_ascii_tree takes parameters you set in the
options hashref:
* "no_name" -- if true, "draw_ascii_tree" doesn't
print the name of the node; simply prints a "*".
Defaults to 0 (i.e., print the node name.)
* "h_spacing" -- number 0 or greater. Sets the number
of spaces inserted horizontally between nodes (and
groups of nodes) in a tree. Defaults to 1.
* "h_compact" -- number 0 or 1. Sets the extent to
which "draw_ascii_tree" tries to save horizontal
space. Defaults to 1. If I think of a better
scrunching algorithm, there'll be a "2" setting for
this.
* "v_compact" -- number 0, 1, or 2. Sets the degree
to which "draw_ascii_tree" tries to save vertical
space. Defaults to 1.
This occasionally returns trees that are a bit cock-eyed
in parts; if anyone can suggest a better drawing
algorithm, I'd be appreciative.
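As a rough usage sketch of the options hashref, the following draws one small tree twice: once with the defaults and once with "no_name" and a wider "h_spacing". It assumes Tree::DAG_Node is installed; the little LoL is invented for the example, and the exact layout of the output is deliberately not shown because it depends on the drawing algorithm.

```perl
use strict;
use warnings;
use Tree::DAG_Node;

# A small, made-up tree: S over N (over dog) and VP (over died).
my $tree = Tree::DAG_Node->lol_to_tree(
    [ [ 'dog', 'N' ], [ 'died', 'VP' ], 'S' ]
);

# Default drawing: node names shown, default spacing and compaction.
my $default = $tree->draw_ascii_tree;

# Same tree with names suppressed and more room between nodes.
my $sparse = $tree->draw_ascii_tree({ no_name => 1, h_spacing => 3 });

# The lines come back without newlines, so add them when printing.
print map("$_\n", @$default);
print "\n";
print map("$_\n", @$sparse);
```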
$node->copy_tree or $node->copy_tree({...options...})
This returns the root of a copy of the tree that $node
is a member of. If you pass no options, copy_tree
pretends you've passed {}.
This method is currently implemented as just a call to
$this->root->copy_at_and_under({...options...}), but
magic may be added in the future.
Options you specify are passed down to calls to
$node->copy.
$node->copy_at_and_under or
$node->copy_at_and_under({...options...})
This returns a copy of the subtree consisting of $node
and everything under it.
If you pass no options, copy_at_and_under pretends
you've passed {}.
This works by recursively building up the new tree
from the leaves, duplicating nodes using
$orig_node->copy($options_ref) and then linking them
up into a new tree of the same shape.
Options you specify are passed down to calls to
$node->copy.
the constructor $node->copy or
$node->copy({...options...})
Returns a copy of $node, minus its daughter or mother
attributes (which are set back to default values).
If you pass no options, "copy" pretends you've passed
{}.
Magic happens with the 'attributes' attribute: if it's
a hashref (and it usually is), the new node doesn't
end up with the same hashref, but with ref to a hash
with the content duplicated from the original's
hashref. If 'attributes' is not a hashref, but
instead an object that belongs to a class that provides
a method called "copy", then that method is
called, and the result saved in the clone's
'attributes' attribute. Both of these kinds of magic
are disabled if the options you pass to "copy" (maybe
via "copy_tree", or "copy_at_and_under") include
("no_attribute_copy" => 1).
The options hashref you pass to "copy" (directly or
indirectly) gets changed slightly after you call
"copy" -- it gets an entry called "from_to" added to
it. Chances are you would never know nor care, but
this is reserved for possible future use. See the
source if you are wildly curious.
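A short sketch of what the 'attributes' handling means in practice: after copy_tree, the copy carries the same attribute content in a different hash, so changing one side does not affect the other. This assumes Tree::DAG_Node is installed and that a new node's "attributes" is an empty hashref; the node names and the 'colour' key are made up for the example.

```perl
use strict;
use warnings;
use Tree::DAG_Node;

my $root = Tree::DAG_Node->new;
$root->name('root');
$root->attributes->{colour} = 'red';

my $kid = $root->new_daughter;
$kid->name('kid');

# copy_tree can be called on any node; it returns the root of the copy.
my $copy = $kid->copy_tree;

# Changing the copy's attributes should leave the original alone.
$copy->attributes->{colour} = 'blue';

my @copied_daughters = $copy->daughters;
print "copy root:        ", $copy->name, "\n";
print "copy daughters:   ", scalar(@copied_daughters), "\n";
print "original colour:  ", $root->attributes->{colour}, "\n";
print "copy colour:      ", $copy->attributes->{colour}, "\n";
```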
Note that if you are using $node->copy (whether
directly or via $node->copy_tree or
$node->copy_at_and_under), and it's not properly copying
object attributes containing references, you probably
shouldn't fight it or try to fix it -- simply
override copy_tree with:

  sub copy_tree {
    use Storable qw(dclone);
    my $this = $_[0];
    return dclone($this->root);
     # d for "deep"
  }
or

  sub copy_tree {
    use Data::Dumper;
    my $this = $_[0];
    $Data::Dumper::Purity = 1;
    return eval(Dumper($this->root));
  }
Both of these save you from reinventing the wheel.
How to override copy_at_and_under with something that
uses Storable or Data::Dumper is left as an exercise
to the reader.
Consider that if in a derived class, you add
attributes with really bizarre contents (like a
unique-for-all-time-ID), you may need to override
"copy". Consider:

  sub copy {
    my($it, @etc) = @_;
    my $new = $it->SUPER::copy(@etc);
    $new->{'UID'} = &get_new_UID;  # give the clone its own ID
    return $new;                   # "copy" is a constructor
  }
...or the like. See the source of
Tree::DAG_Node::copy for inspiration.
$node->delete_tree
Destroys the entire tree that $node is a member of
(starting at the root), by nulling out each
node-object's attributes (including, most importantly,
its linkage attributes -- hopefully this is more than
sufficient to eliminate all circularity in the data
structure), and then moving it into the class
DEADNODE.
Use this when you're finished with the tree in question,
and want to free up its memory. (If you don't
do this, it'll get freed up anyway when your program
ends.)
If you try calling any methods on any of the node
objects in the tree you've destroyed, you'll get an
error like:

Can't locate object method "leaves_under"
via package "DEADNODE".
So if you see that, that's what you've done wrong.
(Actually, the class DEADNODE does provide one method:
a no-op method "delete_tree". So if you want to
delete a tree, but think you may have deleted it
already, it's safe to call $node->delete_tree on it
(again).)
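The behaviour described above can be seen in a few lines. This is a sketch that assumes Tree::DAG_Node is installed; it relies only on what the text states, namely that deleted nodes end up in the class DEADNODE and that DEADNODE provides a no-op "delete_tree".

```perl
use strict;
use warnings;
use Tree::DAG_Node;

my $root = Tree::DAG_Node->new;
$root->name('root');
my $leaf = $root->new_daughter;
$leaf->name('leaf');

# Called on any node, delete_tree destroys the whole tree.
$leaf->delete_tree;
my $class_after = ref $root;
print "root is now a $class_after\n";

# Deleting again is safe, because DEADNODE has its own delete_tree.
$root->delete_tree;

# Any other method call on a dead node dies; trap it to show the message.
my $survived = eval { $root->name; 1 };
print "calling name() failed: $@" unless $survived;
```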
The "delete_tree" method is needed because Perl's
garbage collector would never (as currently implemented)
see that it was time to de-allocate the memory
the tree uses -- until either you call
$node->delete_tree, or until the program stops (at
"global destruction" time, when everything is unallocated).
Incidentally, there are better ways to do garbage-collecting
on a tree, ways which don't require the user
to explicitly call a method like "delete_tree" -- they
involve dummy classes, as explained at
"http://mox.perl.com/misc/circle-destroy.pod"
However, introducing a dummy class concept into
Tree::DAG_Node would be rather a distraction. If you
want to do this with your derived classes, via a
DESTROY in a dummy class (or in a tree-metainformation
class, maybe), then feel free to.
The only case where I can imagine "delete_tree" failing
to totally void the tree, is if you use the
hashref in the "attributes" attribute to store
(presumably among other things) references to other nodes'
"attributes" hashrefs -- which 1) is maybe a bit odd,
and 2) is your problem, because it's your hash structure
that's circular, not the tree's. Anyway, consider:

  # null out all my "attributes" hashes
  $anywhere->root->walk_down({
    'callback' => sub {
      my $hr = $_[0]->attributes; %$hr = (); return 1;
    }
  });
  # And then:
  $anywhere->delete_tree;
(I suppose "delete_tree" is a "destructor", or as
close as you can meaningfully come for a circularity-rich
data structure in Perl.)
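Why explicit link-clearing is needed at all can be shown without the module. In the sketch below, Demo::Node is a hypothetical stand-in class (not part of Tree::DAG_Node) whose DESTROY merely counts how many objects Perl has freed; a mother and daughter that point at each other are not freed on scope exit unless their links are cleared first, which is what "delete_tree" is described as doing.

```perl
use strict;
use warnings;

# Demo::Node is a made-up class for this illustration only.
package Demo::Node;
our $destroyed = 0;
sub new     { return bless { mother => undef, daughters => [] }, shift }
sub DESTROY { $destroyed++ }

package main;

# Link a mother and a daughter in both directions, as tree nodes are.
sub linked_pair {
    my ($mother, $daughter) = (Demo::Node->new, Demo::Node->new);
    push @{ $mother->{daughters} }, $daughter;
    $daughter->{mother} = $mother;
    return ($mother, $daughter);
}

{
    my @pair = linked_pair();
    # Scope ends here, but each node still holds a reference to the other.
}
my $after_leak = $Demo::Node::destroyed;

{
    my @pair = linked_pair();
    %$_ = () for @pair;    # null out the linkage by hand before leaving
}
my $after_cleanup = $Demo::Node::destroyed;

print "destroyed after plain scope exit: $after_leak\n";     # 0
print "destroyed after clearing links:   $after_cleanup\n";  # 2
```

The first pair is only reclaimed at global destruction, which is the situation the text warns about.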
When and How to Destroy
It should be clear to you that if you've built a big parse
tree or something, and then you're finished with it, you
should call $some_node->delete_tree on it if you want the
memory back.
But consider this case: you've got this tree:

       A
     / | \
    B  C  D
    |     | \
    E     X  Y
Let's say you decide you don't want D or any of its
descendants in the tree, so you call
D->unlink_from_mother. This does NOT automagically
destroy the tree D-X-Y. Instead it merely splits the tree
into two:

       A          D
      / \        / \
     B   C      X   Y
     |
     E
To destroy D and its little tree, you have to explicitly
call delete_tree on it.
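The same scenario can be played out in code. This sketch assumes Tree::DAG_Node is installed and uses only methods named in this document; the %n lookup hash is just a convenience invented for the example.

```perl
use strict;
use warnings;
use Tree::DAG_Node;

# Build the A/B/C/D/E/X/Y tree from the diagram, keeping nodes by name.
my %n;
$n{A} = Tree::DAG_Node->new;
$n{A}->name('A');
for my $pair ([A => 'B'], [A => 'C'], [A => 'D'],
              [B => 'E'], [D => 'X'], [D => 'Y']) {
    my ($mother, $name) = @$pair;
    $n{$name} = $n{$mother}->new_daughter;
    $n{$name}->name($name);
}

# Unlinking splits the tree in two; D-X-Y is still alive.
$n{D}->unlink_from_mother;
my @remaining    = map { $_->name } $n{A}->daughters;
my $class_before = ref $n{D};

# Only an explicit delete_tree gets rid of D and its descendants.
$n{D}->delete_tree;
my $class_after = ref $n{D};

print "A's daughters after unlinking: @remaining\n";
print "D after unlinking:  $class_before\n";
print "D after delete_tree: $class_after\n";
```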
Note, however, that if you call C->unlink_from_mother, and
if you don't have a link to C anywhere, then it does
magically go away. This is because nothing links to C --
whereas with the D-X-Y tree, D links to X and Y, and X and
Y each link back to D. Note that calling C->delete_tree is
harmless -- after all, a tree of only one node is still a
tree.
So, this is a surefire way of getting rid of all $node's
children and freeing up the memory associated with them
and their descendants:

  foreach my $it ($node->clear_daughters) { $it->delete_tree }
Just be sure not to do this:

foreach my $it ($node->daughters) { $it->delete_tree }
$node->clear_daughters;
That's bad; the first call to $it->delete_tree will climb
to the root of $node's tree, and nuke the whole tree, not
just the bits under $node. You might as well have just
called $node->delete_tree. (Moreover, once $node is dead,
you can't call clear_daughters on it, so you'll get an
error there.)

BUG REPORTS

If you find a bug in this library, report it to me as soon
as possible, at the address listed in the AUTHOR section,
below. Please try to be as specific as possible about how
you got the bug to occur.

HELP!

If you develop a given routine for dealing with trees in
some way, and use it a lot, then if you think it'd be of
use to anyone else, do email me about it; it might be
helpful to others to include that routine, or something
based on it, in a later version of this module.

It's occurred to me that you might like to (and might
yourself develop routines to) draw trees in something
other than ASCII art. If you do so -- say, for PostScript
output, or for output interpretable by some external
plotting program -- I'd be most interested in the results.

RAMBLINGS

This module uses "strict", but I never wrote it with -w
warnings in mind -- so if you use -w, do not be surprised
if you see complaints from the guts of DAG_Node. As long
as there is no way to turn off -w for a given module
(instead of having to do it in every single subroutine
with a "local $^W"), I'm not going to change this.
However, I do, at points, get bursts of ambition, and I try
to fix code in DAG_Node that generates warnings, as I come
across them -- which is only occasionally. Feel free to
email me any patches for any such fixes you come up with,
though.

Currently I don't assume (or enforce) anything about the
class membership of nodes being manipulated, other than by
testing whether each one provides a method "is_node", a
la:
  die "Not a node!!!" unless UNIVERSAL::can($node, "is_node");
So, as far as I'm concerned, a given tree's nodes are free
to belong to different classes, just so long as they
provide/inherit "is_node", the few methods that this class
relies on to navigate the tree, and have the same internal
object structure, or a superset of it. Presumably this
would be the case for any object belonging to a class
derived from "Tree::DAG_Node", or belonging to
"Tree::DAG_Node" itself.
When routines in this class access a node's "mother"
attribute, or its "daughters" attribute, they (generally)
do so directly (via $node->{'mother'}, etc.), for sake of
efficiency. But classes derived from this class should
probably do this instead thru a method (via $node->mother,
etc.), for sake of portability, abstraction, and general
goodness.
However, no routines in this class (aside from, necessarily,
"_init", "_init_name", and "name") access the "name"
attribute directly; routines (like the various tree
draw/dump methods) get the "name" value thru a call to
$obj->name(). So if you want the object's name to not be
a real attribute, but instead have it derived dynamically
from some feature of the object (say, based on some of its
other attributes, or based on its address), you can
override the "name" method, without causing problems. (Be
sure to consider the case of $obj->name as a write method,
as it's used in "lol_to_tree" and "random_network".)

SEE ALSO

HTML::Element

Wirth, Niklaus. 1976. Algorithms + Data Structures =
Programs. Prentice-Hall, Englewood Cliffs, NJ.

Knuth, Donald Ervin. 1997. Art of Computer Programming,
Volume 1, Third Edition: Fundamental Algorithms.
Addison-Wesley, Reading, MA.

Wirth's classic, currently and lamentably out of print,
has a good section on trees. I find it clearer than
Knuth's (if not quite as encyclopedic), probably because
Wirth's example code is in a block-structured high-level
language (basically Pascal), instead of in assembler
(MIX).

Until some kind publisher brings out a new printing of
Wirth's book, try poking around used bookstores (or
"www.abebooks.com") for a copy. I think it was also
republished in the 1980s under the title Algorithms and
Data Structures, and in a German edition called
Algorithmen und Datenstrukturen. (That is, I'm sure books
by Wirth were published under those titles, but I'm
assuming that they're just later printings/editions of
Algorithms + Data Structures = Programs.)

COPYRIGHT AND DISCLAIMER

Copyright 1998,1999,2000,2001 by Sean M. Burke
"sburke@cpan.org", all rights reserved. This program is
free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

This program is distributed in the hope that it will be
useful, but without any warranty; without even the implied
warranty of merchantability or fitness for a particular
purpose.

AUTHOR

Sean M. Burke "sburke@cpan.org"