/dev/null: Coding Practice: Depth-first Tree Traversal

A few weeks ago, I wrote about trees -- binary search trees, to be specific. Working with a tree, for example inserting or retrieving elements, often involves scanning through the elements of the tree, starting at the root. This is known as a traversal.

Unsurprisingly, there are many ways to perform a traversal. A major distinguishing factor is the order in which nodes are visited: if all descendants of a node are visited before any of the node's remaining siblings, then the traversal is known as depth-first. If, on the other hand, a node's siblings are visited before its children, then the traversal is known as breadth-first. In this article, I'll focus on describing depth-first traversals.
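To make the contrast concrete, here is a minimal sketch of a breadth-first traversal driven by a queue. The `Node` struct is redeclared here only to keep the sketch self-contained, and `breadth_first` is a hypothetical helper name, not part of the solution further down. On a tree with root 1, children 2 and 3, and nodes 4 and 5 under node 2, it visits 1 2 3 4 5, whereas a depth-first (pre-order) traversal of the same tree visits 1 2 4 5 3.

```cpp
#include <queue>
#include <vector>

// Minimal node type, redeclared here so the sketch stands on its own.
struct Node {
    int value;
    Node *left;
    Node *right;
    explicit Node(int v) : value(v), left(nullptr), right(nullptr) {}
};

// Hypothetical helper: returns the values in breadth-first order.
// Nodes wait in a FIFO queue, so siblings come out before any grandchildren.
std::vector<int> breadth_first(Node *root) {
    std::vector<int> out;
    std::queue<Node *> pending;
    if (root) pending.push(root);
    while (!pending.empty()) {
        Node *n = pending.front();
        pending.pop();
        out.push_back(n->value);
        if (n->left) pending.push(n->left);
        if (n->right) pending.push(n->right);
    }
    return out;
}
```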

There are three main ways of traversing a binary tree depth-first: pre-order, in-order and post-order. They are typically defined recursively, with each step of the recursion consisting of three sub-steps: do something to the current node (this is referred to as visiting), traverse the left subtree, and traverse the right subtree. By convention, the left subtree is traversed before the right subtree. Pre-order, in-order and post-order traversals perform the visit sub-step before, in between and after the two subtree traversals, respectively. Here's an example of performing each of the three traversals on a small tree.
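The three definitions can be sketched directly as recursive functions. This is an illustrative version that collects values into a vector rather than printing them; the `Node` struct is redeclared so the sketch is self-contained. For a tree with root 1, children 2 and 3, and nodes 4 and 5 as the left and right children of 2, the results are: pre-order 1 2 4 5 3, in-order 4 2 5 1 3, post-order 4 5 2 3 1.

```cpp
#include <vector>

// Minimal node type, redeclared here so the sketch stands on its own.
struct Node {
    int value;
    Node *left;
    Node *right;
    explicit Node(int v) : value(v), left(nullptr), right(nullptr) {}
};

// Pre-order: visit the node, then the left subtree, then the right subtree.
void preorder(const Node *n, std::vector<int> &out) {
    if (!n) return;
    out.push_back(n->value);
    preorder(n->left, out);
    preorder(n->right, out);
}

// In-order: left subtree, then the node, then the right subtree.
void inorder(const Node *n, std::vector<int> &out) {
    if (!n) return;
    inorder(n->left, out);
    out.push_back(n->value);
    inorder(n->right, out);
}

// Post-order: left subtree, then the right subtree, then the node.
void postorder(const Node *n, std::vector<int> &out) {
    if (!n) return;
    postorder(n->left, out);
    postorder(n->right, out);
    out.push_back(n->value);
}
```

The only difference between the three functions is where the `push_back` (the visit) sits relative to the two recursive calls.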

The big three traversal algorithms can be implemented recursively or iteratively. Recursive implementations are easier to understand since they follow directly from the definition, but can be less efficient than iterative implementations due to function call overhead, and a very deep tree can exhaust the call stack (for more details, see http://stackoverflow.com/questions/72209/recursion-or-iteration).
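As an illustration of the iterative style, here is a minimal sketch of a pre-order traversal that replaces the call stack with an explicit `std::stack`. The name `preorder_iterative` is hypothetical and the `Node` struct is redeclared for self-containment; in-order and post-order can be done iteratively too, but need a little more bookkeeping.

```cpp
#include <stack>
#include <vector>

// Minimal node type, redeclared here so the sketch stands on its own.
struct Node {
    int value;
    Node *left;
    Node *right;
    explicit Node(int v) : value(v), left(nullptr), right(nullptr) {}
};

// Hypothetical helper: pre-order traversal without recursion.
std::vector<int> preorder_iterative(Node *root) {
    std::vector<int> out;
    std::stack<Node *> pending;
    if (root) pending.push(root);
    while (!pending.empty()) {
        Node *n = pending.top();
        pending.pop();
        out.push_back(n->value);  // visit
        // Push the right child first so the left child is popped, and
        // therefore visited, before it.
        if (n->right) pending.push(n->right);
        if (n->left) pending.push(n->left);
    }
    return out;
}
```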

Picking the traversal algorithm to use depends on the application. For example, propagating changes from the leaf nodes to the root (e.g. calculating the sum of the tree) is naturally done with a post-order traversal, since the sum of each subtree needs to be known before the sum of the current node can be calculated. In contrast, propagating changes from the root to the leaves would be best done with a pre-order traversal.
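The tree-sum case might look like the sketch below, where `subtree_sum` is a hypothetical helper and `Node` is redeclared for self-containment. The current node's value is only added after both subtree sums are known, which is exactly the post-order visit position.

```cpp
// Minimal node type, redeclared here so the sketch stands on its own.
struct Node {
    int value;
    Node *left;
    Node *right;
    explicit Node(int v) : value(v), left(nullptr), right(nullptr) {}
};

// Hypothetical helper: sum of all values in the subtree rooted at n.
// An empty subtree contributes 0.
int subtree_sum(const Node *n) {
    if (!n) return 0;
    int left_sum = subtree_sum(n->left);    // traverse left subtree
    int right_sum = subtree_sum(n->right);  // traverse right subtree
    return left_sum + right_sum + n->value; // "visit" the node last
}
```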

The problems to solve this week were:

Implement a simple binary (non-search) tree node data structure in your favorite programming language and write the following methods: (1) print nodes pre-order, (2) print nodes in-order, (3) print nodes post-order.
Write a function that, given two nodes and a tree root, finds the two nodes' lowest common ancestor. That is, the function should find the ancestor that both nodes share that is furthest away from the root.

I went with a C++ implementation this time to take advantage of the STL's sets and maps. For finding the lowest common ancestor (LCA), I didn't use a parent pointer in the Node, and instead used an arbitrary traversal to calculate the parent of each node in the tree and store it in a map. This costs $O(n)$ space and $O(n)$ map insertions (each insertion into a `std::map` is itself logarithmic). Once that's done, finding all ancestors of one node and searching for the LCA each walk at most one node-to-root path, which is $O(\log n)$ steps in a balanced tree (and up to $O(n)$ in a degenerate one), given a data structure with a fast membership function (a set). The approach is thus dominated by the $O(n)$ traversal, but can be reduced to the cost of the two upward walks if the results of the traversal are pre-calculated and stored somewhere. This pre-calculation would be practically identical to keeping parent pointers in each Node.

The code is below:

	#include <map>
	#include <set>
	#include <algorithm>
	#include <iostream>

	using namespace std;

	//
	// Lessons learned:
	//
	// - correct notation for function pointers
	// - callbacks often require additional data to be passed to them to be
	//   useful
	// - functions need to be declared before they are called
	// - return value of calloc needs to be cast to the appropriate type
	//   (in C++, C appears to forgive this). For more details, see:
	//   http://stackoverflow.com/questions/3477741/why-does-c-require-a-cast-for-malloc-but-c-doesnt
	// - std::find() not vector.find()
	// - C++ does not require the explicit "struct" qualifier, leaving it off can
	//   make things more readable
	// - typedef'ing a callback can make things significantly more readable
	// - using malloc in C++ appears to be a sin. For more details, see:
	//   http://stackoverflow.com/questions/184537/in-what-cases-do-i-use-malloc-vs-new
	// - classes can be more useful than structs (take advantage of constructor)
	// - class members are private by default
	//
	// Compiles with:
	//
	//   cl traversal.cpp (on Windows using MSVC 2010)
	//

	class Node
	{
	public:
	    int value;
	    Node *left;
	    Node *right;
	    Node(int value_)
	    {
	        value = value_;
	        left = NULL;
	        right = NULL;
	    }
	};

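	// Forward declaration: used by calc_parent_map before it is defined.
	void
	store_parents(Node *node, void *cb_data);

	// Callback invoked on each visited node; cb_data carries caller state.
	typedef void (*VisitCallback)(Node *node, void *cb_data);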

	// Visit the node, then the left subtree, then the right subtree.
	void
	preorder(Node *n, VisitCallback visit, void *cb_data)
	{
	    visit(n, cb_data);
	    if (n->left) preorder(n->left, visit, cb_data);
	    if (n->right) preorder(n->right, visit, cb_data);
	}

	// Visit the left subtree, then the node, then the right subtree.
	void
	inorder(Node *n, VisitCallback visit, void *cb_data)
	{
	    if (n->left) inorder(n->left, visit, cb_data);
	    visit(n, cb_data);
	    if (n->right) inorder(n->right, visit, cb_data);
	}

	// Visit the left subtree, then the right subtree, then the node.
	void
	postorder(Node *n, VisitCallback visit, void *cb_data)
	{
	    if (n->left) postorder(n->left, visit, cb_data);
	    if (n->right) postorder(n->right, visit, cb_data);
	    visit(n, cb_data);
	}

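	// Collect every proper ancestor of node (node itself is not included).
	// The map is taken by value because operator[] inserts a NULL entry
	// when it reaches the root, which has no parent.
	set<Node *>
	find_all_ancestors(Node *node, map<Node *, Node *> parents)
	{
	    set<Node *> ancestors;
	    for (Node *n = parents[node]; n; n = parents[n])
	        ancestors.insert(n);
	    return ancestors;
	}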

	// Build a child -> parent map; any traversal order works here.
	map<Node *, Node *>
	calc_parent_map(Node *root)
	{
	    map<Node *, Node *> parents;
	    inorder(root, store_parents, &parents);
	    return parents;
	}

	// Visit callback: record node as the parent of each of its children.
	void
	store_parents(Node *node, void *cb_data)
	{
	    map<Node *, Node *> *parents = (map<Node *, Node *> *)cb_data;
	    if (node->left) (*parents)[node->left] = node;
	    if (node->right) (*parents)[node->right] = node;
	}

	// Visit callback: print the node's value.
	void
	print_node(Node *node, void *cb_data)
	{
	    (void)cb_data; // unused
	    cout << node->value << ' ';
	}

	Node *
	lowest_common_ancestor(Node *n1, Node *n2, Node *root)
	{
	    map<Node *, Node *> parent = calc_parent_map(root);
	    set<Node *> ancestors = find_all_ancestors(n1, parent);
	    Node *n;
	    // Walk up from n2; the first node that is also an ancestor of n1 is
	    // the LCA. set::count is logarithmic, whereas std::find over a set
	    // would scan it linearly.
	    for (n = n2; n != root; n = parent[n])
	        if (ancestors.count(n))
	            break;
	    return n;
	}

	int
	main(void)
	{
	    Node *n1 = new Node(1);
	    Node *n2 = new Node(2);
	    Node *n3 = new Node(3);
	    Node *n4 = new Node(4);
	    Node *n5 = new Node(5);
	    Node *n6 = new Node(6);
	    Node *n7 = new Node(7);
	    Node *n8 = new Node(8);
	    Node *n9 = new Node(9);
	    Node *n10 = new Node(10);
	    Node *n11 = new Node(11);
	    Node *n12 = new Node(12);
	    Node *n13 = new Node(13);
	    Node *n14 = new Node(14);
	    Node *n15 = new Node(15);

	    n1->left = n2;
	    n2->left = n3;
	    n3->left = n4;
	    n3->right = n5;
	    n2->right = n6;
	    n6->left = n7;
	    n6->right = n8;
	    n1->right = n9;
	    n9->left = n10;
	    n10->left = n11;
	    n10->right = n12;
	    n9->right = n13;
	    n13->left = n14;
	    n13->right = n15;

	    cout << "preorder" << ' ';
	    preorder(n1, print_node, NULL);
	    cout << endl;
	    cout << "inorder" << ' ';
	    inorder(n1, print_node, NULL);
	    cout << endl;
	    cout << "postorder" << ' ';
	    postorder(n1, print_node, NULL);
	    cout << endl;

	    Node *lca = lowest_common_ancestor(n5, n7, n1);
	    cout << "lowest common ancestor: " << lca->value << endl;

	    return 0;
	}
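
For comparison, here is a rough sketch of the parent-pointer variant mentioned earlier, which skips the $O(n)$ traversal entirely. `PNode`, `set_left` and `set_right` are hypothetical names introduced only for this illustration, and unlike the code above, this version treats a node as its own ancestor, so the LCA of a node and one of its descendants is the node itself.

```cpp
#include <set>

// Hypothetical node type that stores a parent pointer
// (the Node class in the listing above does not have one).
struct PNode {
    int value;
    PNode *parent;
    PNode *left;
    PNode *right;
    explicit PNode(int v)
        : value(v), parent(nullptr), left(nullptr), right(nullptr) {}
};

// Hypothetical helpers that keep the parent pointer in sync when linking.
void set_left(PNode *p, PNode *c) { p->left = c; c->parent = p; }
void set_right(PNode *p, PNode *c) { p->right = c; c->parent = p; }

// Returns the lowest common ancestor of a and b, or nullptr if they are
// not in the same tree. Assumption: a node counts as its own ancestor.
PNode *lowest_common_ancestor(PNode *a, PNode *b) {
    // Record a and everything on the path from a up to the root.
    std::set<PNode *> seen;
    for (PNode *n = a; n; n = n->parent)
        seen.insert(n);
    // Walk up from b; the first node already seen is the LCA.
    for (PNode *n = b; n; n = n->parent)
        if (seen.count(n))
            return n;
    return nullptr;
}
```

Both loops follow a single node-to-root path, so the work is proportional to the height of the tree (times the logarithmic cost of each set operation) rather than to the number of nodes.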

Tuesday, February 5, 2013
