Programming languages should have a tree traversal primitive
# Expects:
# init plus a tuple of iterateOn accessor functions, any of which may return None
def fancyIt(init, *options):
    if init is not None:
        yield init
        for f in options:
            newVal = f(init)
            yield from fancyIt(newVal, *options)
class Tree:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

    def left(self):
        return self.left

    def right(self):
        return self.right
myTree = Tree(
    1,
    Tree(2,
         Tree(3),
         Tree(4, None, Tree(5))),
    Tree(6, None, Tree(7)))

for node in fancyIt(myTree, Tree.left, Tree.right):
    print(node.val)
which prints the numbers 1 through 7 in order. Breadth-first is slightly trickier, but you only have to write it once.
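For the curious, here is one way the breadth-first variant could look; a minimal sketch that keeps fancyIt's accessor-function convention (the deque-based queue and the name fancyItBFS are my additions, not from the original comment):

from collections import deque

def fancyItBFS(init, *options):
    # Same contract as fancyIt, but visits nodes level by level instead of depth-first.
    queue = deque([init] if init is not None else [])
    while queue:
        node = queue.popleft()
        yield node
        for f in options:
            child = f(node)
            if child is not None:
                queue.append(child)

# for node in fancyItBFS(myTree, Tree.left, Tree.right): print(node.val)
# would print 1, 2, 6, 3, 4, 7, 5 for the tree above.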
Well a range based for loop requires that your tree exist in memory AND that you have an iterator defined for your tree. With for_tree you could operate on an entirely imperative tree, without needing to define any iterators or generator functions. Here's an example where I'm checking every single string composed of "a", "b", and "c" of length 8 or less.
for_tree(string x = ""; x.size() <= 8; x : {x+"a", x+"b", x+"c"}){
    print(x);
}
You could definitely find every string composed of "a", "b", and "c" of length 8 or less by defining a custom iterator, but it would be a verbose and unpleasant way of writing it:

#include <cstddef>
#include <iostream>
#include <iterator>
#include <stdexcept>
#include <string>

class StringIterator {
public:
    using iterator_category = std::forward_iterator_tag;
    using value_type = std::string;
    using difference_type = std::ptrdiff_t;
    using pointer = const std::string*;
    using reference = const std::string&;

    StringIterator(bool begin = false) : is_end_(!begin) { if (begin) s_ = ""; }

    const std::string& operator*() const {
        if (is_end_) throw std::out_of_range("End iterator");
        return s_;
    }

    StringIterator& operator++() {
        if (is_end_) return *this;
        // Descend: strings shorter than 8 characters get an 'a' appended (pre-order).
        if (s_.size() < 8) return s_.push_back('a'), *this;
        // Backtrack past exhausted 'c' suffixes, then advance the last character.
        while (!s_.empty() && s_.back() == 'c') s_.pop_back();
        if (s_.empty()) is_end_ = true;
        else s_.back() = s_.back() == 'a' ? 'b' : 'c';
        return *this;
    }

    StringIterator operator++(int) { auto tmp = *this; ++(*this); return tmp; }

    bool operator==(const StringIterator& other) const {
        return is_end_ == other.is_end_ && (is_end_ || s_ == other.s_);
    }
    bool operator!=(const StringIterator& other) const { return !(*this == other); }

private:
    std::string s_;
    bool is_end_;
};

int main() {
    StringIterator begin(true), end;
    int count = 0;
    for (auto it = begin; it != end; ++it) ++count;
    // 3^0 + 3^1 + ... + 3^8 = 9841 strings of length 0 through 8.
    std::cout << (count == 9841 ? "Pass" : "Fail") << std::endl;
    return 0;
}
def itWithStop(init, stop, *options):
    if init is not None and not stop(init):
        yield init
        for f in options:
            newVal = f(init)
            yield from itWithStop(newVal, stop, *options)

for s in itWithStop("",
                    lambda x: len(x) > 2,
                    lambda x: x + "a",
                    lambda x: x + "b",
                    lambda x: x + "c"):
    print(s)
yields the strings of length 0 to 2 over a, b, and c. Python has a number of ways to achieve this depending on exactly how you want to pass the arguments: multiple functions, optional arguments, etc. How nice the final call looks is mostly a matter of how your language's closures look.
The main point here is that this will happily iterate on things that don't "exist".
module Tmp where
iter :: forall a. (a -> Bool) -> [a -> a] -> a -> [a]
iter p opts x = if p x then x:concatMap (iter p opts) (opts <*> [x]) else []
ghci> :l tmp.hs
[1 of 1] Compiling Tmp ( tmp.hs, interpreted )
Ok, one module loaded.
ghci> iter (\x -> length x < 3) [(++ "a"), (++ "b"), (++ "c")] ""
["","a","aa","ab","ac","b","ba","bb","bc",
"c","ca","cb","cc"]
(Since things are lazy in Haskell, functions that return lists effectively are iterators. There's probably something in the standard library for (opts <*> [x]) that avoids wrapping x in a singleton list, but my Haskell is rusty.) And yes, Haskell is amazing at this sort of thing.
If the poster wants to particularize this to C++ because C++'s syntax can't support it in any reasonable manner, that's fine, but that's a C++ problem, not a "Programming languages..." problem. Which would be perfectly understandable and I'm not really complaining, more clarifying that most of the rest of the world can just rub together three or four existing constructs in a pretty reasonable manner to get this.
Ceterum censeo this would be a family of simple macros in LISP.
While easy, I think bisect would be a good addition to every stdlib too.
I don't think this is a large problem in practice because you shouldn't be using dozens of tree types in a given code base, so adding iterators to a tree is no big deal. In general there aren't enough types of iteration available to a given data structure that you need to describe how to iterate on it from the "outside". (Generally when you are doing that, it's too non-trivial to fit into this pattern anyhow; see the Visitor pattern in general.) This strikes me as maybe the sort of default tool you might slap in a library somewhere, but it should be a niche tool. If you're using it all the time you're probably doing something wrong. By default your data structures should be providing iteration packaged with them and it should generally be what you need. And your language should support aborting iteration, in whatever that looks like normally. I'm not sure I know a language that doesn't, it's a fairly basic element of iterator support when you get into implementation.
There are also many cases where a tree iterator will perform significantly better, including CPython. I don't have enough experience with PyPy to know if it could inline the Tree.left and Tree.right calls down to zero penalty at JIT time. Rust and C++ and the other static languages with sophisticated compilers might be able to get that down to fully inlined and zero-cost, but even if they can it's probably better not to push that on to the optimizer as the optimizers will eventually give up if this is composed with enough other stuff. Better to just have an efficient implementation in the first place.
(this is for iterating over nested JSON-like objects, which are just weird trees)
[1] https://doc.rust-lang.org/std/vec/struct.Vec.html#method.bin...
Yes, students should absolutely implement the classic algorithms to learn.
Yes, there are some occasions when you need to home grow one at $work.
BUT, in my opinion, most of the time, professional code should use a battle-tested, vuln-hardened library or builtin version. These things are VERY HARD to get exactly right. Jon Bentley's Programming Pearls famously had a latent bug in its binary search for 20 years before someone caught it.
https://research.google/blog/extra-extra-read-all-about-it-n...
So yeah, it looks easy but don't do it. Stand on some giant's shoulders instead.
Anyone who copies and pastes it is welcome to both pieces when it breaks. Others have already alluded to possible improvements that could be made, and I already have my own analysis in a grandchild reply as to why I don't think this is a terribly pressing need or necessarily even a good idea.
The reason I provide code is that it gets past the "oh, you say it's just an iterator, but I still don't believe you, since you haven't spelled it out to the n'th degree". When code is provided, belief ceases to be an issue. It is clearly something an iterator can implement, in existing languages, with existing iterator support.
Unless you're going to claim it is somehow impossible to provide this functionality in a tested manner, you're completely changing the topic in an uninteresting direction, since it is always true that functionality generally needs testing and bits of code slammed into an HN conversation just to make a particular point probably shouldn't be copied wholesale into your production code.
IMO the thing that would be really nice is if control flow like `for` was actually the same as using an iterator. This would really help in Rust too where handling errors inside iterator callbacks is a right pain.
I've seen a few languages try this but it seems to not be very popular. I think it can get a bit confusing how control flow keywords like `return` and `break` work if you turn `if` into syntactic sugar for a function call taking a closure etc.
In PHP you loop through an iterator with the foreach control structure.
In JavaScript you use for of.
In Rust it's for in.
What am I missing?
Major tools that exist today for partial structure traversal and focused manipulation:
- Optics (Lenses, Prisms, Traversals)
Elegant, composable ways to zoom into, modify, and rebuild structures.
Examples: Haskell's `lens`, Scala's Monocle, Clojure's Specter.
Think of these as programmable accessors and updaters.
- Zippers: data structures with a "focused cursor" that allow local edits without manually traversing the whole structure.
Examples: Huet’s original Zipper (1997), Haskell’s `Data.Tree.Zipper`, Clojure’s built-in zippers.
- Query Languages (for semantic traversal and deep search): when paths aren't enough and you need semantic conditionals:
- SPARQL (semantic web graph querying)
- Datalog (logic programming and query over facts)
- Cypher (graph traversal in Neo4j)
- Prolog (pure logic exploration)
These approaches let you declaratively state what you want instead of manually specifying traversal steps.
The problem is that 'lens', 'monocle', etc. are famously abstract and difficult for people to apply to their actual problems. IMO, the solution would be for standard libraries to specify interfaces called 'BreadthFirstTraverse', 'DepthFirstTraverse', etc.
Optics are famously abstract in implementation, but I don't think people have trouble applying them - people seem to like JQuery/CSS selectors, and insist on `object.field` syntax; it's kind of wild that no mainstream language has a first-class way to pass around the description of a location in an arbitrary data structure.
[1] https://ghc-proposals.readthedocs.io/en/latest/proposals/002...
[2] https://ghc-proposals.readthedocs.io/en/latest/proposals/015...
Account acc = getAccount();
QVERIFY(acc.person.address.house == 20);
auto houseLens = personL() to addressL() to houseL();
std::function<int(int)> modifier = [](int old) { return old + 6; };
Account newAcc2 = over(houseLens, newAcc1, modifier);
These also use templating to get something that still feels maybe a little less ergonomic than it could be, though.
[1] https://github.com/graninas/cpp_lenses
[2] https://github.com/jonsterling/Lens.hpp
These are what I think the author is looking for. But it shouldn't be a "primitive" in the sense of code automatically generated by the compiler.
I think people are often too enamored by general purpose languages that can express such abstractions natively. I don't see an issue with a language that provides this as a primitive without being able to express it itself, constraints can be useful for other properties. Once you can traverse trees, most programming problems can be tackled even in such constrained languages, eg. SQL with CTE.
Think of these as programmable accessors and updaters.
How is iterating through something already not 'programmable' ?
If you read a bit more about them, I think you will be surprised to see the breadth of what these abstractions can be used for. To start, they've been used to build a new compositional foundation for game theory[1], and they can be used to model gradient-based learning in machine learning[2].
As for their simpler applications as getters and setters, they are programmable in the sense that you can think of lens/prism types as interfaces that can be implemented arbitrarily. So you can create your own lenses and combine them like legos to construct new, "bigger" getters and setters from the smaller components.
[1] https://arxiv.org/pdf/1603.04641 [2] https://arxiv.org/abs/2103.01931
EDIT: fixed typo
When does someone give up on the pageantry of programming and just get something done by looping through data, instead of "constructing getters and setters to model gradient-based machine learning"?
It really seems like the straightforward way of doing things is to make simple iteration work and worry about the game theory after that is all figured out.
I think your comment is valuable in spirit because it could prompt a more thorough discussion of creating, or popularizing, a syntactically clean and idiomatic solution to traversing trees, leaving the more abstract and general optics discussion as a side dish for commenters to discuss as well.
Also, just a nitpick, but traversing a tree is a broader topic than iteration. Iteration implies, both colloquially and formally, performing a computation on each element of a data structure, while traversal is a more general algorithm that could be the basis of iteration, but may also be more akin to a search through the data structure without expectation of performing computation until the searched for ‘member’ is returned from the traversal. So it’s not an apples-to-oranges comparison with your conflation of iteration and traversal, but does seem a little like Canis Familiaris-to-Mammal comparison.
For C++, you could define yourself a template that expands to the two functions you listed.
For any language, you could write yourself a pre-processor that adds for_tree notation and expands it, either with pre-processor semantics or working on the abstract syntax tree (which is more "proper" but also more work). I would recommend the former to test notations; then you can live with them for a while to experiment with them and see if and how often you really need the construct (recall that is how C++ evolved from C via "C with classes": it was first a pre-processor). Once you and others are 100% convinced of the new syntax, go to your preferred language's working group/ISO committee and propose inclusion.
My own feeling is that this is not something for which I need more than recursion; calling inside traverse() traverse(left) and traverse(right) for binary trees or using a normal for loop to iterate over all this->children() from 0 to this->children.size() is something that occurs in graph libraries and parsers/compilers once in a while but not often enough to warrant its own notation. Rather, when I look at languages like C++, I'd have a language that is simpler, cleaner, more homogeneous and more orthogonal; C++ is complicated, convoluted, too large to implement it correctly by a single person in reasonable time (compared to beauties like C, Pascal, Scheme), so I stand on the side of minimalism.
Be that as it may, for C++, Eric Niebler's [Boost.Proto](https://www.boost.org/doc/libs/1_84_0/doc/html/proto.html) could likely help conveniently connect syntax to implementation to achieve something similar to what the author is talking about.
[1] https://www.cs.ox.ac.uk/jeremy.gibbons/publications/iterator...
https://wiki.haskell.org/index.php?title=Research_papers/Gen...
(defn walk [inner outer form]
  (cond
    (list? form) (outer (with-meta (apply list (map inner form)) (meta form)))
    (instance? clojure.lang.IMapEntry form)
    (outer (clojure.lang.MapEntry/create (inner (key form)) (inner (val form))))
    (seq? form) (outer (with-meta (doall (map inner form)) (meta form)))
    (instance? clojure.lang.IRecord form)
    (outer (reduce (fn [r x] (conj r (inner x))) form form))
    (coll? form) (outer (into (empty form) (map inner form)))
    :else (outer form)))

(defn postwalk [f form]
  (walk (partial postwalk f) f form))

(defn prewalk [f form]
  (walk (partial prewalk f) identity (f form)))
Another reason why this Perlisism holds: "It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures." (Perlis epigram #9)
"Let's move on." (defn tree-seq
"Returns a lazy sequence of the nodes in a tree, via a depth-first walk.
branch? must be a fn of one arg that returns true if passed a node
that can have children (but may not). children must be a fn of one
arg that returns a sequence of the children. Will only be called on
nodes for which branch? returns true. Root is the root node of the
tree."
{:added "1.0"
:static true}
[branch? children root]
(let [walk (fn walk [node]
(lazy-seq
(cons node
(when (branch? node)
(mapcat walk (children node))))))]
(walk root)))
If every ounce of performance matters, e.g. in a database, you want 10000 functions, 100 for each data structure.
Complex data structures absorb lot of the complexity of the problem and reduce the complexity of the rest of the code.
Here is how one could write it in C: https://godbolt.org/z/P3ErP4T4d or, possibly, with the setup code moved into the helper function for convenience: https://godbolt.org/z/soz7W5z1G
If this becomes a C++ feature, imagine how many data structures we would need to support?
Many other languages, especially the FP languages, allow you to do that as a library, even the languages that are only inspired by FP. Example, Ruby:
class BinTree
  include Enumerable

  def initialize v, l, r
    @v, @l, @r = v, l, r
  end

  def each &block
    @l.each(&block) unless @l.nil?
    yield @v
    @r.each(&block) unless @r.nil?
  end
end
Including the Enumerable mixin provides many FP-based methods, such as map, filter and reduce, by only defining each, which in this case is DFS. Then we can proceed to define a binary tree:
tree = BinTree.new(
  1,
  BinTree.new(
    2,
    BinTree.new(4, nil, nil),
    BinTree.new(5, nil, nil)
  ),
  BinTree.new(
    3,
    BinTree.new(6, nil, nil),
    BinTree.new(7, nil, nil)
  )
)
Iterate over all the elements:

tree.each{|v| puts v}

Iterate over the even elements:

tree.filter{|v| v.even?}.each{|v| puts v}

Stop iteration when finding a value:

tree.each do |v|
  break if v == 1
  puts v
end
And so on. The same can be done in Python, Kotlin and many others.

If this becomes a C++ feature, imagine how many data structures we would need to support?
C++ already solved that problem. Iterators are designed so that an algorithm can be written once and used with multiple data structures.
Sequences, trees and graphs are core abstractions to most programming, why try to flatten all trees and graphs to sequences?
Because "trees" and "graphs" aren't actually single concepts, they're families of concepts that can vary along many dimensions. Most programming languages offer one or a small handful of sequence types, because most sequence use patterns are the same. Most languages don't offer a tree or graph type, and as soon as you try to implement one you see why: there are multiple different kinds of trees and graphs, and multiple different representations that are appropriate for different use cases, and any one you might pick would be unsuitable for most programs people want to write.
For example if you literally don’t care about the order then your code can map over a default iterator that is efficient (DFS).
Then, for some cases, depth-first traversal is needed; for others, breadth-first.
Then, there's parallelism, and even the plain old for loops aren't parallel by default.
By the time you specify exactly what you need from a tree traversal, you've written code to do it.
And if you're fine with some default choice — you already can use the default iterator with the for_each loop.
I don't see what need there is for adding an extra for_tree syntax to do that.
Trees are often considered too diverse to include even in the standard library, let alone as a primitive. Even Python doesn't have trees in the standard library. I'm sure it's been proposed and rejected at some point
But to model a general tree, you'd need a dynamically allocated list of children for each node, adding a layer of indirection that can noticeably worsen performance.
I hate to be a broken record, but this again goes back to the assumption of a specific sequential computation model. Modelling relations as in SQL automatically supports for N-ary child relations. There are other computation models! eg. logic, relational, etc.
The same from another angle: there are a lot of trees in the indices of SQL databases (example[1]) but we don't zoom in to that level of detail very often when defining our tables.
To implement Brown's algorithm to optimize class-based language models I had to implement a complex forest (DAG, actually) in Python using lists of fixed length. That was not especially nice to work with.
- Every html document is a tree structure. And css documents have special syntax to address nodes within those trees
- If it's any good, your routing framework probably uses some kind of tree to quickly match the request to its handler
- The database you write to uses a tree to quickly find the location it needs to write to
(And to be clear, none of these involve sitting down and writing some generic Tree<T> structure, they're all cases of "Well, there's some tree-like datastructure that already exists or is a natural fit for this system and I'm going to be traversing it depth-first or breadth-first to do something to the elements of it or calculate something based on what nodes I pass through on a given traversal")
Also I’m slightly confused by this example.
for_tree(string x = ""; x.size() <= 8; x : {x+"a", x+"b", x+"c"}){
    print(x);
}

So, our "next node" operation is to concatenate to x. Won't we either have to have a method for modifying x to go "up" a node, or we'll have to keep a record of what x was upon entering each node? Like in this example we'll end up with x="aaaaaaaa" and then go up a node, over to a "b" node, and get x="aaaaaaaab", right?
I guess we can delete the a node’s copy of x after all of a node’s child nodes are visited, at least.
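As a sketch of how the bookkeeping can work out (my own Python, under the assumption that the construct copies the loop value for each pending branch), no explicit "go up" operation is needed; every frontier entry simply carries its own x:

def for_tree_strings(init, max_len):
    # The pending branches each hold their own copy of x, so there is
    # nothing to undo when the walk moves back "up" the tree.
    stack = [init]
    while stack:
        x = stack.pop()
        if len(x) > max_len:
            continue
        print(x)
        # Push in reverse so the "a" branch is explored before "b" and "c".
        for ch in reversed("abc"):
            stack.append(x + ch)

for_tree_strings("", 8)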
Perhaps it can be optimized to be a little better than the recursive version, depending on how much overhead your language uses for a stack frame that it won't need for this special case.
Or for a possibly more useful comment, constant space tree traversal is genuinely quite difficult to implement. I don't know a general solution to it (other than walk from the start N times, quadratic style), would be interested to hear of one.
Constant space tree traversal is impossible...
Naively I would expect something like left-hand-rule maze traversal to be constant space. The tree may need additional structures to support this sort of travel.
Other thoughts: most state machines are constant space.... there may be something there.
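For what it's worth, one concrete form of the "additional structures" idea is a parent pointer on every node: with that link available, an in-order walk needs only O(1) extra space. A minimal Python sketch, entirely my own illustration and not from this thread:

class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right
        self.parent = None
        for child in (left, right):
            if child is not None:
                child.parent = self

def leftmost(node):
    while node.left is not None:
        node = node.left
    return node

def in_order_constant_space(root):
    # O(1) extra space: only a couple of node references, no stack or recursion.
    node = leftmost(root)
    while node is not None:
        yield node.val
        if node.right is not None:
            node = leftmost(node.right)
        else:
            # Climb until we leave a left subtree (or run off the root).
            while node.parent is not None and node is node.parent.right:
                node = node.parent
            node = node.parent

tree = Node(4, Node(2, Node(1), Node(3)), Node(5))
print(list(in_order_constant_space(tree)))  # [1, 2, 3, 4, 5]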
class InOrderTreeIterator {
    Stack<TreeNode> stack;
    TreeNode cursor;

    InOrderTreeIterator(TreeNode root) {
        cursor = root;
        stack = new Stack<>();
    }

    boolean hasNext() {
        return cursor != null || !stack.empty();
    }

    TreeNode next() {
        if (cursor != null) {
            // Descend, remembering the ancestors we still have to visit.
            while (cursor.left != null) {
                stack.push(cursor);
                cursor = cursor.left;
            }
        } else if (!stack.empty()) {
            cursor = stack.pop();
        } else {
            throw new NoSuchElementException();
        }
        TreeNode ret = cursor;
        cursor = cursor.right;
        return ret;
    }
}
Iterable<TreeNode> inOrder(TreeNode node) sync* {
  if (node.left != null) yield* inOrder(node.left!);
  yield node;
  if (node.right != null) yield* inOrder(node.right!);
}
But it's not obvious to me how you'd write a generic iterator that supports something like finding all free variables in an expression tree, or even just express a tree mapping function that constructs a new tree with the same structure (for example, "make all variables in the expression uppercase"). It's been a while since I worked with iterators so correct me if I'm wrong and this is in fact easy with iterators.
With something like Haskell's recursion-schemes library[0] these operations are all just a few lines long, guaranteed to terminate, and don't need to be updated if your data structure changes (for example you add new expression nodes). I'm not aware of any non-functional programming language that can do that. See for example the freeVars function at the bottom of the linked recursion-schemes doc.
[0] https://hackage.haskell.org/package/recursion-schemes-5.2.3
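For readers who want to see what those two operations are, concretely, here is a plain-recursion Python sketch over a toy expression type (Var/Lam/App are my own illustration; unlike recursion-schemes, this version does have to be edited whenever a new node type is added):

from dataclasses import dataclass

@dataclass
class Var:
    name: str

@dataclass
class Lam:
    param: str
    body: object

@dataclass
class App:
    func: object
    arg: object

def map_vars(f, e):
    # Rebuild the tree with the same shape, applying f to every variable name.
    match e:
        case Var(name):
            return Var(f(name))
        case Lam(param, body):
            return Lam(f(param), map_vars(f, body))
        case App(func, arg):
            return App(map_vars(f, func), map_vars(f, arg))

def free_vars(e, bound=frozenset()):
    # Names that are used but not bound by an enclosing Lam.
    match e:
        case Var(name):
            return {name} - bound
        case Lam(param, body):
            return free_vars(body, bound | {param})
        case App(func, arg):
            return free_vars(func, bound) | free_vars(arg, bound)

expr = App(Lam("x", App(Var("x"), Var("y"))), Var("z"))
print(map_vars(str.upper, expr))  # uppercases every variable, same tree shape
print(free_vars(expr))            # {'y', 'z'}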
At the same time, the way it's implemented in Haskell's recursion-schemes library might be hard to wrap one's head around at first, kind of like how the list functions "foldr" and "foldl" are also often confusing to newbies, even though they're like the go-to, default way to make list functions in Haskell.
type Node = { value: Any, left: Node, right: Node }
type Direction = Left | Right
type TreePosition = { root: Node, currentNode: Node = root, position: Direction[] = [] }

# Implementation left as an exercise, but should be obvious and run in O(1), I believe.
# Returns Nothing when we're out of nodes.
function nextPosition(position: TreePosition): Option<TreePosition>

# The tree you want to iterate through
const myTree: Node = ...

# The loop
for (let position: TreePosition? = TreePosition(root: myTree); position != Nothing; position = nextPosition(position)) {
    node = position!.currentNode
    # Your loop code
}
I'd argue this doesn't belong as a language-level feature, but maybe an API/stdlib-level feature.

"Why not just use recursive functions?"
One great reason not to use recursive functions for traversing trees is that you can allocate your own stack data structure rather than relying on the call stack itself. In most languages/runtimes, the call stack has a maximum depth which limits the depth of trees you can process, usually on the order of thousands of stack frames.
Managing your own stack usually produces weirder looking code (personally I find "naive" recursive approaches more readable) - but having it as a first-class language feature could solve that!
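To make the comparison concrete, here is a minimal sketch (mine, not the parent commenter's) of the same pre-order walk written both ways in Python, assuming nodes with left/right attributes as in the examples above; the explicit-stack version trades a little readability for freedom from the recursion limit:

def walk_recursive(node, visit):
    # Readable, but limited by the interpreter's recursion limit
    # (sys.getrecursionlimit(), about 1000 frames by default in CPython).
    if node is None:
        return
    visit(node)
    walk_recursive(node.left, visit)
    walk_recursive(node.right, visit)

def walk_explicit_stack(root, visit):
    # The same pre-order traversal, with the "call stack" as a plain list on the heap,
    # so depth is bounded only by available memory.
    stack = [root] if root is not None else []
    while stack:
        node = stack.pop()
        visit(node)
        # Push right first so the left child is visited first.
        if node.right is not None:
            stack.append(node.right)
        if node.left is not None:
            stack.append(node.left)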
for recursive (Node t = tree.root; t != NULL;) {
    puts(t.value);
    if (t.value == target) break;
    if (t.value == dontfollow) continue;
    if (t.left) continue t.left;
    if (t.right) continue t.right;
}
return t;
Regular 'break' really breaks out of the structure, just as in a regular for, and regular 'continue' moves to the next iteration. But if continue has a value to recurse on, it re-enters the for loop like a subroutine. As a bonus, I think this is tail-call-optimization friendly.
I use it in every project for data navigation and transformation, and it's more performant than standard Clojure data manipulation, while retaining types (instead of coercing back from seqs).
E.g. if you have a map and you want to increment every value in the map: (require '[com.rpl.specter :as S])
(->> {:a 5, :b 6, :c 7}
     (S/transform [S/MAP-VALS] inc))
=> {:a 6, :b 7, :c 8}

^ try to write that in normal clojure.

Now let's say you have a map of vectors and want to increment all of those?

(->> {:a [1 2], :b [3 4], :c [5 6]}
     (S/transform [S/MAP-VALS S/ALL] inc))  ;; note the navigator just got another level of nesting
=> {:a [2 3], :b [4 5], :c [6 7]}

It works for all clj data types, and of course it has navigators for recursive walking.
It took me a while to get good at Specter, but it was worth it. I hear Rama uses Specter navigators internally.
https://www.metosin.fi/blog/transforming-data-with-malli-and...
Here's the range based for loop counterexample from the blog post, as a python generator:
import itertools

def iterate_tree():
    # Strings over "abc" of length 0 through 8, matching the for_tree example.
    for length in range(9):
        for combo in itertools.product("abc", repeat=length):
            yield ''.join(combo)

for s in iterate_tree():
    print(s)
But tree traversal doesn't have this universal property. There are too many methods and purposes for traversing a tree, sufficient that IMHO no single primitive embodiment could materially improve a language. Also, modern compilers efficiently break down high-level traversal code so well that expressing the idea at a high level incurs no serious penalty compared to having a primitive for that purpose, or a series of them.
[0] https://www.hillelwayne.com/post/graph-types/ and https://news.ycombinator.com/item?id=39592444
Candidates: Racket, Scheme, Rust.
For example the most basic operations of a pointer are to advance and dereference.
std::map is actually implemented as a tree. To iterator its members you can do
for (const auto &pair : map)
The only requirement for your custom data structure to work is to implement begin() and end(), which return iterators: "pointer like" objects.

for (const auto &node : await depth_first_tree(node))

And generators have the added advantage that walking trees is just a special case of the more general case of traversing a directed graph of elements, which is equally easy. Are generators not a thing in other languages?
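As a sketch of that point (my own Python, not from the comment): the same generator shape handles an arbitrary directed graph, with a visited set being the only addition over the tree case:

def walk_graph(start, neighbors):
    # Depth-first generator over a directed graph; "neighbors" maps a node to its successors.
    # The "seen" set is the only addition over the tree case, and it also makes cycles safe.
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        yield node
        # Reversed so the first-listed neighbor is visited first.
        stack.extend(reversed(neighbors(node)))

# A small graph with a cycle (a -> b -> c -> a) and a side branch.
edges = {"a": ["b", "d"], "b": ["c"], "c": ["a"], "d": []}
print(list(walk_graph("a", edges.__getitem__)))  # ['a', 'b', 'c', 'd']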
We don't have syntax and semantics for recursion schemes in any programming language - it's always deferred to library support at best. As far as I'm concerned, this is an open problem in programing language design where we finally replace the vestigial recursion hack with a proper structured programming solution.
For exposition, let's separate loop initialization from the rest, make `continue` mandatory to repeat the loop with another specified value (i.e. the loop implicitly ends without a continue), and let's say `descend` to recurse into the tree.
sum = 0;
loop (node = tree.root) {
    sum += node.value;
    if (node.left) descend node.left;
    if (node.right) descend node.right;
}

Compare with a linear loop:

sum = 0;
loop (i = 0) {
    if (i >= array.length) break;
    sum += array[i];
    continue i + 1;
}
The `descend` keyword differs from `continue` in that it's more of a continuation: control flow will continue from that point once it is done. You could make this more functional and less imperative (and I'd be surprised if such things don't exist as libraries in some functional languages). Yes, ultimately we can transform this to recursive functions and compile it that way... but for imperative languages (especially those without closures) something like the above might be useful.
Arguably the above loops are _more_ imperative and less declarative than the standard ones from the C family. You could add a functional twist by breaking with a value (which becomes the result of the overall `loop` expression).
Thank you... I'll see myself out... lol =3
func Search(Root : Tree; Desired_Value : Tree_Value)
  -> optional Tree_Id is
    var Identifier : optional Tree_Id := null;
    for T => Root then Left(T) || Right(T)
        while T not null concurrent loop
        if Value(T) == Desired_Value then
            // Found desired node,
            // exit with its identifier
            exit loop with Identifier => Key(T);
        end if;
    end loop;
    return Identifier;
end func Search;
I actually don't use this feature much, because usually there is a key of some sort that determines which side of the tree is of interest.

OTOH I read/heard that the beauty of array languages is that they have few data types, but those can be easily worked on. So maybe the answer is not easier tree traversal primitives, but better literature/training on how to transform a tree into a more manageable data type.
Sometimes the answer is to adapt our vision to our world, sometimes the answer is to adapt the world to our vision.
So something similar to `break` and `continue` should be the method of branching (descending into the tree).
I think Scala might be a great language to prototype implementation of such construct.
Const km = 1 Const velocity = 11
And doing km + velocity = 12 is all kinds of silly.
I explored this idea with gstd, a standard library for graphs inspired by the JS Array interface: https://github.com/cdrini/gstd/
Both looping and tree traversal can be done with library functions.
In general, everything that can be done with library functions, should be done with library functions.
Doesn't for_tree(...) look a lot nicer and simpler and less error prone than needing to implement a recursive function for each operation you would want to do on a tree?
Eh, in Haskell you can derive many of these things automatically. And even Rust has some automatic derivation.
I think it's a better idea to write a function that applies a function to the elements of a tree. Ideally you'd make a function for each traversal order. This makes it obvious what is happening.
map_tree_pre(my_tree, my_func);
map_tree_in(my_tree, my_func);
map_tree_post(my_tree, my_func);
map_tree_bfs(my_tree, my_func);
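A minimal Python sketch of what those four helpers could look like (the names follow the calls above; nodes are assumed to carry val/left/right fields, which the comment does not specify):

from collections import deque

def map_tree_pre(node, f):
    if node is None:
        return
    f(node.val)
    map_tree_pre(node.left, f)
    map_tree_pre(node.right, f)

def map_tree_in(node, f):
    if node is None:
        return
    map_tree_in(node.left, f)
    f(node.val)
    map_tree_in(node.right, f)

def map_tree_post(node, f):
    if node is None:
        return
    map_tree_post(node.left, f)
    map_tree_post(node.right, f)
    f(node.val)

def map_tree_bfs(root, f):
    queue = deque([root] if root is not None else [])
    while queue:
        node = queue.popleft()
        f(node.val)
        for child in (node.left, node.right):
            if child is not None:
                queue.append(child)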
Encoding the semantics of a tree traversal operator likewise is difficult in the general case. What exactly would the order be, what if I want to traverse in a non-standard ordering, what about skipping branches; all would be difficult to cleanly represent.
I have seen it done where you return actions, the key ones being recurse, stop, replace, and replace-and-then-carry-out-some-function, but again, this is pretty simple to implement.
foreach (Node node in EnumerateNodes(root, x => x != null, x => [x.Left, x.Right]))
where EnumerateNodes uses `yield return` (i.e. is a generator) and calls itself recursively. Though it'd probably be easier / better performance to write an implementation specific to each node type.

Web browsers have this. It's the DOM and it scares the shit out of most JavaScript developers. The fun part is confronting people about their irrational fears, watching them spin their wheels with all the imagined excuses.
I am a tremendous fan of tree systems. Tree systems should completely replace inheritance in programming to enforce the composition over inheritance principle.
var todo = new List<T>();
todo.append(root);
while (var item = todo.pop_front()) {
    todo.append(item.left);   // or .prepend() for depth-first
    todo.append(item.right);  // or .prepend()
    // do stuff...
}
It allows you to specify traversal in a structure-shy way. Personally, I think it goes a little too far, but it's definitely a good inspiration.
[1] https://www2.ccs.neu.edu/research/demeter/
Presentation: https://www2.ccs.neu.edu/research/demeter/talks/overview/qua...
This kind of imperative iteration seems better served by the traditional visitor design pattern: more verbose (more explicit, not more complex) and more general.
Not as ergonomic as a direct tree-iterator, but I can't see an elegant way to introduce that in an imperative language while keeping the forking/recursion aspect clear.
EnScript had a forall(NodeClass n in tree.GetRoot()){} construct that was very easy. It was essentially a depth-first iterator.
I think allowing for starting this kind of search from a given node would cover most (though not all) of what you'd want to expose the trees structure for directly.
[1] https://lean-lang.org/doc/reference/latest//The-Type-System/...
Bonus: Java's HotSpot magick will NOP (most?) methods of Null Objects, making this a zero cost abstraction.
I should probably write a concrete example for a blog or something.
TLDR: For every base class such as TreeNode, create a NullTreeNode that does nothing, then replace all uses of null with NullTreeNode. Voilà, no more null checks or NPEs.
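A minimal Python sketch of that idea (the names are mine, and this sidesteps the Java/HotSpot specifics entirely):

class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left if left is not None else NULL_NODE
        self.right = right if right is not None else NULL_NODE

    def walk(self, visit):
        visit(self.value)
        self.left.walk(visit)
        self.right.walk(visit)

class NullTreeNode:
    # Stands in for "no child": every operation is a no-op,
    # so callers never have to test for null/None.
    def walk(self, visit):
        pass

NULL_NODE = NullTreeNode()

tree = TreeNode(1, TreeNode(2), TreeNode(3))
tree.walk(print)  # 1 2 3, with no null checks inside walk()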
A TXR Lisp macro for-tree for traversing a tree, given a variable, an expression for the root node, and expressions for how to go left and right relative to the variable.
A node structure that for-tree knows nothing about. No iterator abstraction or API, nothing.
(defstruct node ()
  value
  left
  right)
Example tree (a balanced binary tree whose numeric values happen to be sorted):

(defvar example-tree #S(node value 5
                          left #S(node value 3
                                    left #S(node value 1)
                                    right #S(node value 4))
                          right #S(node value 7
                                    left #S(node value 6)
                                    right #S(node value 9))))
Walk it:

(for-tree nod example-tree nod.left nod.right
  (prinl nod.value))
1
3
4
5
6
7
9
Code:

(defmacro go-leftmost (var root left-expr stack)
  ^(for ((,var ,root)) (,var) ((set ,var ,left-expr))
     (push ,var ,stack)))

(defmacro for-tree (var root left-expr right-expr . body)
  (with-gensyms (stack node left right)
    ^(let (,stack)
       (go-leftmost ,var ,root ,left-expr ,stack)
       (for ((,var (pop ,stack)))
            (,var)
            ((iflet ((,right ,right-expr))
               (go-leftmost ,var ,right ,left-expr ,stack))
             (set ,var (pop ,stack)))
         ,*body))))
The left-expr and right-expr must be expressions that involve the var variable; the construct evaluates these to find the left or right child. When the value is nil it means there is no child. This is almost exactly what the blog is asking for, transliterated to Lisp syntax.
Original concept in fantasy C syntax:

for_tree(Node* N = mytreeroot; N != NULL; N : {N->left, N->right})
{
    print(N->value);
}
Lispified:

(for-tree nod example-tree nod.left nod.right
  (prinl nod.value))
As required, the construct specifies the node variable to be the loop dummy, the root expression giving its initial value, and the two accessor expressions for left and right traversal. The termination test N != NULL is made implicit in the Lisp version, so early termination requires a break out of the loop. It could be arranged.
The fantasy C syntax specifies a variable name in the navigation declarator: N : { N->left, N->right }. Presumably, the variable here can be renamed to anything you want; it just arbitrarily happens to have the same name as the loop dummy.
I didn't replicate this feature exactly because it just adds verbosity for nothing. It's okay if the navigation declarator just refers to the loop's one and only dummy variable.
Anyway, programming languages should definitely not have a tree traversal primitive. Rather, languages should be Lisp.
The left, right and key fields of a node object are accessed with like-named functions. We must call tree-root on a tree object to get to the encapsulated root node:
1> (for-tree n (tree-root #T(() 1 2 3 4 5 6 7 8)) (left n) (right n)
(prinl (key n)))
1
2
3
4
5
6
7
8
How about a tree of integers. Let's say the root integer is 1, and to go left is to multiply by 2, and to go right is to multiply by 2 and mask in a 1. And let's say the values cap out below 16:

2> (for-tree n 1 (if (< n 8) (* 2 n)) (if (< n 8) (+ 1 (* 2 n)))
     (prinl n))
8
4
9
2
10
5
11
1
12
6
13
3
14
7
15
nil
Let's see that in binary:

3> (for-tree n 1 (if (< n 8) (* 2 n)) (if (< n 8) (+ 1 (* 2 n)))
     (format t "~b\n" n))
1000
100
1001
10
1010
101
1011
1
1100
110
1101
11
1110
111
1111
nil
Let's right-align the digits to see different patterns in it:

2> (for-tree n 1 (if (< n 8) (* 2 n)) (if (< n 8) (+ 1 (* 2 n)))
     (format t "~6b\n" n))
  1000
   100
  1001
    10
  1010
   101
  1011
     1
  1100
   110
  1101
    11
  1110
   111
  1111
nil
Nah, left-aligned wins.

3> for-tree n 1 (if (n < 8) (n << 1)) (if (n < 8) (n << 1 | 1))
     (format t "~b\n" n)
1000
100
1001
10
[...]
// Comments for the non-Rust-native reader, regarding this function declaration:
// successors accepts an `Option` container for some value of type T, called `first`,
// and a closure called `succ`, constrained below:
pub fn successors<T, F>(first: Option<T>, succ: F) -> Successors<T, F>
where
    // `succ` must receive the iterated state, and return the next iterated state.
    F: FnMut(&T) -> Option<T>,
// Each time `next()` is called on the returned iterator (a Successors-flavored iterator),
// the state of `first` is yielded, and then
// `succ` is called to progress,
// until a `None` is returned by `succ`.
I'm not sure where the concept came from, but it's not dissimilar to the author's implementation; instead of the ControlFlow enum, it relies simply on the Option enum. I know, though, that it was initially built in the Itertools crate as unfold and then upstreamed some time later. Essentially you use `first` to contain a Queue, Stack, or Level for the different traversals, and define the traversal or activities from there.
It's fairly ergonomic in practice, ergonomic enough for Leetcode.
Here's a BFS: https://leetcode.com/problems/course-schedule-iv/solutions/6...
[0] https://doc.rust-lang.org/std/iter/fn.successors.html
[1] https://docs.rs/itertools/latest/itertools/fn.unfold.html
But I think grafting those two together is the right answer, not inventing a new loop construct.
Trees are easy to write iterators for. DAGs are a bit harder and full graphs are an advanced interview question.