The biggest semantic mess in Futhark

The Biggest Semantic Mess in Futhark Posted on September 26, 2025 --- Overview Futhark is a high-performance, purely functional data-parallel array programming language designed to keep parallel programming concepts simple—comparable in complexity to basic functional programming taught in universities. Most features of Futhark have straightforward, environment-based semantics, and even complex features like uniqueness types are primarily operationally complicated but semantically clear. However, one notable exception is the feature of size types. Despite seeming simple, size types have caused many edge cases, bugs, and semantic challenges in Futhark’s implementation and interpreter. --- Size Parameters Size types let functions constrain the sizes of their array parameters. Example: A dot product function specifying vectors must have the same length n: The size parameter n is implicitly instantiated at runtime from the argument arrays. Size parameters can appear as term-level variables: Here, the value of n is the size of the array dimension. Challenges with Nested Sizes For multidimensional arrays, like matrices: Problems occur when outer dimensions are zero (e.g., n = 0), since no elements exist to inspect. To solve this, Futhark arrays carry shape metadata separate from content, ensuring size info is available even for empty arrays like replicate 0 (replicate 3 0) (which evaluates to shape [0][3]). --- Constructing Arrays Without Values Consider the polymorphic map function: When mapping over an empty array (n=0), the function f is never applied, so we cannot infer the shape of elements of type b from values. The question arises: How to determine the full shape [0][m] of the resulting array without element samples? "Shape" Semantics Approach Barry Jay’s paper "A Semantics for Shape" suggests functions can be evaluated in two ways: Value evaluation: Applies to concrete values. Shape evaluation: Applies to array shapes producing corresponding result shapes. This works only if the function is shapely—the result shape depends solely on the input shape (excludes non-regular operations like filtering). This means map can infer the shape of the result array from the shapes of its inputs, even with empty inputs. Futhark’s Model Polymorphic functions in Futhark can query the concrete instantiated type of their parameters at runtime. This allows them to extract shape information to annotate constructed arrays with full shapes. The interpreter models this approach faithfully, though the compiler internally uses code transformations (monomorphisation). --- Troubles and Complexity Maintaining full shape information at all times is complex: Requires evaluating types along with expressions. Types may contain expressions depending on variables and environment bindings. Example: Here, C is a size-parameterized type depending on variable cnt. Difficulties The type M.C [n] cannot yield a concrete shape until both n and cnt are known. Instantiating M.C [k] requires: Knowing cnt in the environment of the module Evaluating n cnt in that environment Simple syntactic substitution to resolve this is inefficient and undesirable for an interpreter designed to use environment-based evaluation. Futhark’s Solution Type constructors are implemented as closures** that capture their environment at definition time. When instantiated, the environment