Saumitra's blog

Search / Analytics / Distributed Systems / Machine Learning / DSLs

Antlr4 - Visitor vs Listener Pattern

In previous post, we saw how to create and parse DSL with antlr4. In this post we will compare the two tree walking mechanism provided by the library - Listener vs Visitor. Both approaches have their own advantages, and choice of preferred method depends on what you are using antlr for.

Lets take a look at generated methods in Listener and Visitor interfaces, for grammar from previous post

Listener

1
2
3
4
5
6
7
8
9
10
11
12
public class ArithmeticParserBaseListener implements ArithmeticParserListener {

  @Override public void enterExpr(ArithmeticParser.ExprContext ctx) { }
  @Override public void exitExpr(ArithmeticParser.ExprContext ctx) { }
  @Override public void enterOperation(ArithmeticParser.OperationContext ctx) { }
  @Override public void exitOperation(ArithmeticParser.OperationContext ctx) { }

  @Override public void enterEveryRule(ParserRuleContext ctx) { }
  @Override public void exitEveryRule(ParserRuleContext ctx) { }
  @Override public void visitTerminal(TerminalNode node) { }
  @Override public void visitErrorNode(ErrorNode node) { }
}

Visitor

1
2
3
4
public class ArithmeticParserBaseVisitor<T> extends AbstractParseTreeVisitor<T> implements ArithmeticParserVisitor<T> {
  @Override public T visitExpr(ArithmeticParser.ExprContext ctx) { return visitChildren(ctx); }
  @Override public T visitOperation(ArithmeticParser.OperationContext ctx) { return visitChildren(ctx); }
}

There are 3 primary differences:

  1. Listener methods are called automatically by the ANTLR provided walker object, whereas visitor methods must walk their children with explicit visit calls. Forgetting to invoke visit() on a node’s children means those subtrees don’t get visited
  2. Listener methods can’t return a value, whereas visitor methods can return any custom type. With listener, you will have to use mutable variables to store values, whereas with visitor there is no such need.
  3. Listener uses an explicit stack allocated on the heap, whereas visitor uses call stack to manage tree traversals. This might lead to StackOverFlow exceptions while using visitor on deeply nested ASTs

In previous post we used listener. Lets see a equivalent visitor implementation for same usecase:

First thing which we need to do is identify and define types which will be returned by each vistor method which we will override.

1
2
3
sealed trait Expr
case class Operation(name:String) extends Expr
case class ExprResult(res:Option[Double]) extends Expr

Now we can extend the visitor class with our custom type

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
class ArithmeticVisitorApp extends ArithmeticParserBaseVisitor[Expr] {

  override def visitExpr(ctx: ArithmeticParser.ExprContext): ExprResult = {
    val exprText = ctx.getText
    println(s"Expression after tokenization = $exprText")

    val operands = ctx.NUMBER().toList.map(_.getText)
    val operand1 = parseDouble(operands.lift(0).getOrElse("0.0")).getOrElse(0.0)
    val operand2 = parseDouble(operands.lift(1).getOrElse("0.0")).getOrElse(0.0)

    val operation = visitOperation(ctx.operation())

    ExprResult(calculate(operand1, operand2, operation.name))

  }


  override def visitOperation(ctx: ArithmeticParser.OperationContext): Operation = {
    val op = ctx.getText
    Operation(op)
  }

  def parseDouble(s: String): Option[Double] = Try(s.toDouble).toOption

  def calculate(op1:Double, op2:Double, operation:String):Option[Double] = {
    operation match {
      case "+" => Some(op1 + op2)
      case "-" => Some(op1 - op2)
      case "*" => Some(op1 * op2)
      case "/" => Try(op1 / op2).toOption

      case _ =>
        println(s"Unsupported operation")
        None
    }

  }

}

Notice that both rules visitExpr and visitOperation return some value which can be shared between rules. And for the caller method:

1
2
3
4
5
6
7
8
9
10
11
12
13
  def parse(input:String) = {
    println("\nEvaluating expression " + input)

    val charStream = new ANTLRInputStream(input)
    val lexer = new ArithmeticLexer(charStream)
    val tokens = new CommonTokenStream(lexer)
    val parser = new ArithmeticParser(tokens)

    val arithmeticVisitor = new ArithmeticVisitorApp()
    val res = arithmeticVisitor.visit(parser.expr())
    println(res)

  }

At line 10, We are explicitly calling visit on expr rule and it returns back the result of expression evaluation.

Both approaces have their advantages depending on the usecase. If you want to have full control over the traversal, dont want any mutability and your grammar rules won’t lead to deeply nested trees, then its preferred to go with visitor.

Here’s a figure showing call sequence for listener vs visitor

Listener

Visitor

Good thing about listener pattern is that its all automatic and even if you don’t write any parse tree walker, then also it will figure out and trigger the enter and exit method for each rule. This is a huge benefit for translation type of usecases.

I hope this post gave you some clarity in terms of which pattern to choose while creating your DSL.