Static Dynamics - Expression definable datatypes in Scala

In most programming languages, there are two fundamental types of code: statements and expressions. Statements modify the state of a program dependent on scope. Take for example the following code:

def printHello = println("hello!")

This line of Scala code is a statement. It modifies the state of a program by defining a method named printHello that is bound to a type (which type is dependent on where this line exists). This method is a fundamental part of the program and its changes what the name "printHello" refers to (if anything) in the program.

The expression equivalent of the above line is as follows:

() => println("hello!")

This function has the same behavior as the aforementioned method, but it does not modify program state. It does not belong to some type absent a binding via the "val" keyword (which would make it part of a statement).

The difference between statements and expressions is important to understand in Scala. A statement can be defined only in terms of expressions, and cannot include other statements.

//compile-time error
val x = def printHello = println("hello!")

An expression can contain statements, but the changes to the program wrought by those statements are not visible outside of the expression.

def x = 
  def printHello = println("hello!")
// compile-time error
printHello

Classes in Scala can be defined via statement or expression. However, the members of classes defined via expression (anonymous classes) cannot be accessed unless they are part of the parent class.

//statement definition of a class named A
class A: 
  val c = 5

val a = A()
a.c //works

//expression definition of a class 
val b = new Object { val c = 5 }
b.c //compile-time error; c is not a member of Object

So, can we define a class via an expression in Scala 3 while still allowing full access to the members defined via the expression? In a sense we can using a little-known feature that was introduced in Scala 2: Dynamic.

Dynamic

Dynamic is a special trait that was introduced in 2.10 in order to allow somewhat dynamic types in Scala. The Dynamic trait has 4 methods defined for it.

selectDynamic
applyDynamic
applyDynamicNamed
updateDynamic

All of these methods are magic in much the same way "apply" is in Scala. Let's see how they work:

class Foo extends Dynamic

foo.bar //translates to foo.selectDynamic("a")
foo.baz("qux") //translates to foo.applyDynamic("baz")("qux")
foo.baz(pem = "qux") //translates to foo.applyDynamicNamed("baz")("pem" -> "qux")
foo.bar = 10 //translates to foo.updateDynamic("bar")(10)
foo.arr(10) = 13 //translates to foo.selectDynamic("arr").update(10, 13)
foo.arr(10) // translates to foo.applyDynamic("arr")(10)

In essence, calls to members of classes that extend Dynamic are translated into calls to one of the four-aforementioned methods. The Foo class defined above would give errors for all of these calls since it has no definition for these methods, so lets look at how it might be implemented:

class Foo(var storage: Map[String, Any]) extends Dynamic: 
  def selectDynamic(field: String): Any = storage(field)
  def applyDynamic(method: String)(inputs: Any*) = storage(method).asInstanceOf[Seq[Any] => Any](inputs)
  def updateDynamic(field: String)(value: Any) = 
    storage = storage.updated(field, value)

A few things come to mind looking at this definition. It becomes obvious that the members of Foo are definable via an expression:

val foo = Foo(Map("hello" -> 4, "goodbye" -> 'a'))
foo.hello //compiles and returns 4 at runtime 
foo.goodbye //compiles and returns 6 at runtime

It's also follows from the definition that one can extend the definition of foo via expressions.

foo.size = 10
foo.size //returns 10
foo.printInput = (args: Seq[Any]) => println(args.head)
foo.printInput(3) //prints 3

You may also notice from the definition that we need not implement all 4 members of Dynamic. We chose not to define applyDynamicNamed for `Foo` despite being fully able to do so.

However, this code should make any self-respecting lover of static, strongly typed languages scream in revulsion. Foo has absolutely no type safety. Attempting to use a member that hasn't been pre-defined is a runtime error instead of a compile time one, and there's no guards against bad input.

Is the flexibility of Dynamic worth its cost in a language that prides itself on strong and static typing?

Static Dynamic types

The four methods of Dynamic are magic like apply in a lot of ways. One very important way is that these methods do not have a fixed signature. Rather, the implementation of one of these methods is solely identified based on the method name.

This means that we do not have to write a Dynamic implementing class the way we did above:

class Bar extends Dynamic:
  def selectDynamic[A <: String](str: A): Int = str.length

val bar = Bar()

bar.hello //returns 5 with the return type Int

As shown above, 'bar.hello' transforms into 'bar.selectDynamic("hello")' at compile-time. The only real restrictions on the signature of selectDynamic is that it takes one parameter, and that that one parameter is capable of accepting a String literal. That makes all the difference in the world...

Singleton types!

Singleton types are a feature that was added in Scala 2.13. They are types that are defined for every literal in Scala. If you don't know what a literal is, it's an expression that is pure data. "hello", 3, and 6d are examples of literals. Each of these literals have a corresponding type of the same name since Scala 2.13:

val a: 5 = 5
val b: 4 = 8 //doesn't compile, 4 is not 8
val c: "hello" = "hello"

You might think to yourself that this is a relatively useless feature. These singleton types are narrow, and they can only accept one single piece of data, so what is the use? Well, it turns out they are supremely useful for describing specific data in a type signature. Take for example the following type:

type Baz = (("hello",Int), ("goodbye", Char))

This type is a tuple of pairs. A tuple map if you will. And this specific one could be thought of as corresponding to an anonymous case class defined as such:

case class A(hello: Int, goodbye: Char)

Using a tuple like this, and a couple of helper type classes, we can reintroduce compile-time checking of the existence of members of a class that extends Dynamic.

type ExistsIn[T <: Tuple, S <: Singleton] = T match
  case (S, ?) *: ? => true
  case ? *: tail => ExistsIn[tail, S]
  case EmptyTuple => false 

class Baz[T <: Tuple] extends Dynamic:
  def selectDynamic[S <: Singleton](s: S)(using ExistsIn[T,S] =:= true): Any = ???

val baz = Baz[("hello", Int) *: ("goodbye", Char) *: EmptyTuple]()
//compile-time error: cannot prove that
// ExistsIn[("hello",Int) *: ("goodbye", Char) *: EmptyTuple, "qux"] =:= true 
baz.qux

//not a compile-time error, but a runtime error due to our current Baz definition
baz.hello

Using a few more match types, and switching our selectDynamic implementation to be inline, we can make it fully typesafe as well:

type ValuesOf[T <: Tuple] <: Tuple = T match 
  case (?, v) *: tail => v *: ValuesOf[tail]
  case EmptyTuple => EmptyTuple 

type IndexOf[T <: Tuple, S <: Singleton] <: Int = T match 
  case (S, ?) *: ? => 0
  case ? *: tail => 1 + IndexOf[tail, S]

class Baz[T <: Tuple](members: ValuesOf[T]) extends Dynamic:
  inline def selectDynamic[S <: Singleton](s: S)(using ExistsIn[T,S] =:= true): 
    Tuple.Elem[ValuesOf[T], IndexOf[T, S]]
  = 
    inline members match 
      case membersNonEmpty: NonEmptyTuple => 
        inline membersNonEmpty(constValue[IndexOf[T,S]]) match 
          case result: Tuple.Elem[ValuesOf[T], IndexOf[T,S]] => result

val baz = Baz[("hello",Int) *: ("goodbye", Char) *: EmptyTuple]((4,'c'))

val a: Int = baz.hello //type-checks
val b: String = baz.hello //compile-time error

val c: Char = baz.goodbye
val d: Float = baz.bar //compile-time error, field doesn't exist

As you can see from the test code, we have regained type safety and compile-time errors, and our Dynamic extending Baz is more flexible than typical class definitions.

However, our definition of selectDynamic is quite complex now. What do all the parts mean? The return type `Tuple.Elem[ValuesOf[T], IndexOf[T,S]]` indicates that the return type is the type of the nth element of the value types in tuple T, where n is the index where S appears as a key in tuple T. For the case where S is "hello", IndexOf[T,S] resolves to 0, and Tuple.Elem[(Int, Char), 0] resolves to Int, making the return type Int.

Inside the method body, we have a number of inline matches. These are compile-time data-checks that can verify the type of some data much better than at runtime, which turns out to be very necessary for us. The first inline match checks if the instance of T named members is a NonEmptyTuple or not. membersNonEmpty with the type NonEmptyTuple & T becomes available in one of the cases, allowing access to the apply method of NonEmptyTuple, which allows you to get data with an exact type back from a NonEmptyTuple. We do not define a case for T being `EmptyTuple` because the `using ExistsIn[T,S] =:= true` guard we have before already proves that T is a non-empty tuple and an additional case would be unreachable. The match here is merely a type-safe way of convincing the Scala compiler of that fact.

Next we have an inner match on the result of `membersNonEmpty(constValue[IndexOf[T,S]])`. `constValue[IndexOf[T,S]]` produces an Int value from the calculation of `IndexOf[T,S]`, turning the type result into a value result. The apply method invocation retrieves the element at the index along with its specific type from the value tuple ValuesOf[T]. However, due to a weakness of the Scala compiler, it is not enough to just return this value as the Scala compiler has difficulty proving that the return type `Tuple.Elem[ValuesOf[T], IndexOf[T,S]]` actually aligns with `Tuple.Elem[ValuesOf[T], IndexOf[T,S]]`. I'm not sure why this is, and it may be a compiler bug. In any case, an inline match with a case that matches on `Tuple.Elem[ValuesOf[T], IndexOf[T,S]]` is enough to convince the compiler that our code is correct.

Results and Conclusions

This approach can be extended far beyond what I've written here, to include method members, mutability, etc. Performance-wise, this technique of defining data types is slower than class definition, but not significantly:

Benchmark                              Mode  Cnt    Score    Error   Units
StaticDynamicBench.caseClassTest      thrpt    5  632.794 ± 27.261  ops/us
StaticDynamicBench.staticDynamicTest  thrpt    5  398.554 ±  2.114  ops/us

Why use this?

This technique doesn't seem super necessary at first. It's just another way to create data types that pretends to be a case class, and is less capable than a case class. However, if you think back to the start of this blog post, the answer becomes clearer; these data types are fully definable via expressions instead of statements, and that's very important for macros.

How statements and expressions work influences the implementation of macros in Scala 3. All macros are formed by a pair of definitions:

inline def myMacroHook: Any = ${
  myMacroHookImpl
}

def myMacroHookImpl(using Quotes): Expr[Any] = ???

The first definition is the hook used by code that wants to use the macro. It acts as the intermediary that passes data to the actual macro code, and splices the results of the actual macro into the code that called the hook. The second definition is the actual implementation of the macro. It takes the data given to it and returns an EXPRESSION.

The fact that a macro implementation in Scala 3 returns an expression means that a macro in Scala 3 cannot modify the state of the Scala program on its own. That also means that a macro cannot define data types and allow them to be useable outside of the macro code... except in this case. These static dynamics can be returned from macros and can be used for code generation that Scala 3's macros normally wouldn't allow.

Are there any other downsides aside from performance?

The one major downside of this approach aside from performance loss is the loss of hints from tools like metals, the scala compiler, or IntelliJ. The information needed to give hints exists in these types, but since they are not built in, no editor tools will recognize said information, and I doubt they ever will. That means that you will get compiler errors if you try to use a field that doesn't exist, but can't use autocomplete to discover what fields do exist.

Why not programmatic structural types?

Programmatic structural types seem like they could remedy the last issue, and at the same time provide the type-safety guarantees we want, and more easily to boot. However, it's impossible to construct a programmatic structural type from a match type (unlike with this approach), meaning a macro that generates one must be defined as transparent inline. Transparent inline introduces a number of issues and is best avoided when dealing with macros whenever possible. The resulting type from the macro would also be inherently unknowable to the user, and thus dealing with any kind of generic result from the macro would be a headache.

I hope you found this blog post informative, and I hope you see that `Dynamic` isn't as bad as you may have originally thought.

Happy Scala hacking!!