Containers Redux - Capturing Type Class information for myriad types


Previously, I wrote an article on a datatype I've been calling container.

Containers - Associating items by type class

As a quick review, this data type is used to capture evidence of a type class for a datapoint, for the purposes of storing data that have the same typeclasses in a sequence or list.

Previously I was asked what purpose this datastructure actually had, aside from academic curiosity. At the time I didn't have a purpose for it, but today I can say that I very certainly do... C library bindings using the new foreign interface in Java 17 and up.

Foreign API in Java 17

The problem


In Java 17's Foreign API, C function bindings are described by a FunctionDescriptor and a MethodType. The FunctionDescriptor is constructed from MemoryLayouts that correspond to the input argument types and the return type. The same applies for the MethodType, except it uses Class[?] types. A sample mapping of types to MemoryLayouts follows:


There are more complex constructions, but this gives a basic gist of the input mapping. In a library I'm working on, I have two type classes that are very important for generating these bindings: BindingTransition[A] and ElementContext[A]. BindingTransition[A] handles converting an input of type A into a format that the MethodHandle produced by the Foreign API can understand. It's also used to convert the Object return of the MethodHandle into type A. ElementContext[A] provides information about the type A, including its MemoryLayout and Class[?] mapping. My library currently uses these type classes and some other machinery to have the following work:

val abs: Int => Int = makeBinding[Int => Int]("abs")

Now, one thing to understand about the Foreign API is that it doesn't have any special facility for variadic C functions. The bare minimum binding for C's printf is represented by a simple C_POINTER as an input. However, printf is variadic, and can accept however many arguments you want. In theory then, we need to be able to write code like the following:

printf("hello %s", "world")

However, how do we write that type for the above makeBinding function? Well, Container is the answer, but an upgraded version of it.

Improving Container


As I said in my previous section, we need two type classes for each input in my C binding scheme to generate the appropriate binding, BindingTransition[A] and ElementContext[A]. With the previous iteration of Container, one could have a Seq[Container[ElementContext]] to represent the variadic inputs, if only we needed evidence for that singular type class. Lets expand our definition of Container to allow capturing multiple type classes then!

To start with, we need a data type to represent type classes without type arguments. For that, we will use the following:

sealed trait Capabilities
sealed trait *:::[A[_], B <: Capabilities] extends Capabilities
sealed trait End extends Capabilities


Next, we want a way to have Container be a class instead of a trait. Last time, the need to store the data of the Container without recording its type held us back, but this time we'll use a container for the data called... Data.

class Data[A](a: A):
  type B = A 
  val b: B = a

Now our container class looks like this:

class Container[A <: Capabilities](val data: Data[?])

The type classes captured by the container take on the following syntax:

Container[BindingTransaction *::: ElementContext *::: End]

Now we need storage for the type class evidence captured by our container. This modifies the look of Container like so:

class Container[A <: Capabilities](
  val data: Data[?],
  val evidences: Array[AnyRef]
)

This isn't particularly typesafe, but we can make usage of the Container typesafe by modifying the use method we had before:

class Use[A[_], B](b: B)(using ev: A[B]): 
  def apply[C](fn: A[B] ?=> B => C) = fn(b)
class Container[A <: Capabilities](
  val data: Data[?],
  val evidences: Array[AnyRef]
): 
  inline def fetchIdx[C[_], A <: Capabilities]: Int =
    inline erasedValue[A] match 
      case _: (C *::: ?) => 0
      case _: (? *::: rest) => fetchIdx[C, rest] + 1
      case _: End => error("ended")
    
  inline def fetchable[C[_], A <: Capabilities]: Boolean = 
    inline erasedValue[A] match 
      case _: (C *::: ?) => true
      case _: (? *::: rest) => fetchable[C,rest]
      case _: End => false 

  inline def use[C[_]]: Use[C, data.B] = 
    inline if fetchable[C,A] then 
      val ev: AnyRef = evidences(fetchIdx[C,A])
      Use(data.b)(using ev.asInstanceOf[C[data.B]])
    else error("The requisite capability doesn't exist")

Our Container class is complete, and usage is relatively easy:

val c: Container[Numeric *::: Show *::: End] = ???

c.use[Show](_.show)//would print data

However it's a nightmare to construct a Container from data. Let's implement some macros for apply to make it easy:

object Container:
  type ToTuple[A <: Capabilities, T] <: Tuple = A match
    case head *::: tail => head[T] *: ToTuple[tail, T]
    case End => EmptyTuple
  
  inline def apply[A <: Capabilities](data: Any): Container[A] = ${
    applyImpl[A]('data)
  }

  private def getEvidences[A <: Tuple](using
      Type[A],
      Quotes
  ): Expr[Vector[AnyRef]] =
    import quotes.reflect.*
    Type.of[A] match
      case '[head *: tail] =>
        val ev = Expr
          .summon[head]
          .getOrElse(
            report.errorAndAbort(s"Couldn't find ${Type.show[head]} in scope")
          )
          .asExprOf[AnyRef]
        var rest = getEvidences[tail]
        '{
          $ev +: $rest
        }
      case '[EmptyTuple] =>
        '{ Vector.empty }

  private def applyImpl[A <: Capabilities](
      data: Expr[Any]
  )(using Quotes, Type[A]) =
    import quotes.reflect.*

    data match
      case '{ $a: i } =>
        TypeRepr.of[i].widen.asType match
          case '[j] =>
            val expr = getEvidences[ToTuple[A, j]]
            '{
              new Container[A](Data($data), ${ expr }.toArray)
            }

The getEvidences method deconstructs the stuff needed to be summoned and builds up an array of the typeclass instances (should they exist). appylImpl grabs the true type of a, instead of having an Any, and widens it to make it useful to us.

The result is that we can write code like the following:

val c: Container[Numeric *::: Show *::: End](5)

println(c.use[Show](_.show)) //prints 5

Bringing it back to C


Now that we've extended Container, we can define the following type:

type Variadic = Container[BindingTransition *::: ElementContext *::: End]

And we can define a binding to printf like so:
val printf = makeBinding[(Ptr[Byte], Seq[Variadic]) => Unit]("printf")

//using printf works, but is a bit unwieldly
Scope(){
  printf(
    toNative("Hello %s"), Seq(Container(toNative("world")))
  )
}

//prints "Hello world"
So that works, but can we improve the situation a bit more? Yes we can!!

object Container:
...
  inline implicit def getContainer[A <: Capabilities](data: Any): Container[A] =
    ${
      applyImpl[A]('data)
    }

Adding this one short implicit to the Container companion object improves the usage of Container significantly:

val c: Container[Show *::: Numeric *::: End] = 5

//prints hello world2
Scope() {
  printf(toNative("hello %s %d"), Seq(toNative("world"), 2))
}

Addendum: Why not simple pattern matching


Above I showed a simple mapping from Scala types to MemoryLayout. It would be trivial to pattern match from type of data to MemoryLayout, except for two things:

1. Typically we want to generate the required MethodHandle before the call has even been made, and before we have any data.
2. Pattern matching for complex types like case classes isn't feasible in the end, but the type class system I've developed can destructure these types (and nested case classes and tuples), and return the appropriate MemoryLayout.

Anyway, I hope this has been an interesting and illuminating read! Happy Scala Hacking!!