Doug's Compiler Corner

Originally posted on 2024-07-01 06:55:00 +0000

Last updated on 2024-09-28 17:17:06 +0000

Swift for C++ Practitioners, Part 10: Operator Overloading

One of the ways in which C++ lets you define library types that feel like built-in types is operator overloading: you can define your own type (say, a big integer or a matrix type), and along with it a reasonable set of operators like +, -, *, and /. Operator overloading also lets you treat classes like functions and introduce subscripts. Swift provides a rather more... extensive... operator overloading system, along with related features that help with building DSLs. Let's start with operator overloading, and along the way we'll also tackle Swift "key paths", which are conceptually similar to pointer-to-members in C++.

Operator overloading

C++ operator overloading can be (and has been) abused: the standard library's use of the left-shift operator (<<) for output streaming is a little bit sus, because "shift left" and "output to a stream" really have no business being the same operator, and the arithmetic << doesn't necessarily have the precedence you want for output streaming. Some libraries (Boost.Spirit is a favorite example) take operator overloading to the extreme, providing a completely different meaning for each of the standard operators. Sometimes it works, sometimes it creates confusion.

In Swift, we wanted operator overloading for its expressive power in libraries, but were concerned about falling into the same trap where the same set of operators have very different meanings in different libraries. Perhaps we could have created some restrictions to avoid having different meanings for the same operators, but instead we went the completely opposite direction in a delightful bit of over-engineering: in Swift, libraries can define their own sets of operators with their own precedence relationships. If you need some operators to make your library API great, and the meanings don't match with the standard set of operators, no problem: define your own operator so there's no confusion.

Operators & precedence groups

Swift has two kinds of declarations for defining operators (including the standard ones): operator and precedencegroup. The operator declaration spells out the name of an operator along with its kind (infix, prefix, or postfix). For example, let's say we deeply miss C++'s prefix * for pointer syntax. We can declare such an operator like this:

prefix operator *

And then we can go ahead and add a * implementation to UnsafePointer to read the value:

extension UnsafePointer {
  static prefix func *(_ pointer: Self) -> Pointee { pointer.pointee }
}
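To see the operator in action, here's a small sketch. It assumes the operator body reads the value through its pointer parameter; the names value and copy are just for illustration, and withUnsafePointer(to:_:) comes from the standard library.

```swift
// Declare the prefix operator and implement it for UnsafePointer.
prefix operator *

extension UnsafePointer {
  static prefix func *(_ pointer: Self) -> Pointee { pointer.pointee }
}

// Illustration only: borrow a pointer to a local variable and dereference it C-style.
var value = 42
let copy = withUnsafePointer(to: &value) { pointer in *pointer }
print(copy)   // prints "42"
```

The whitespace matters here: * is treated as a prefix operator because it has a space before it and none after it.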

Infix (binary) operators have a precedence group associated with them. The precedence group is given a name, associativity (left or right), and relationship to other precedence groups. For example, let's say we want to create a <<< operator that streams out values but doesn't stomp on the bit-shift operator. It's an infix operator that could look like this:

infix operator <<<: OutputStreaming

Here, OutputStreaming is a precedence group. We could define it to be a standalone precedence group, unrelated to all others, like this:

precedencegroup OutputStreaming { }

Now, we can use <<< as an infix operator. Here's a little OutputStream class to show how we can define such an operator for use:

class OutputStream {
    static func <<<(lhs: OutputStream, rhs: Int) -> OutputStream {
        // stream it
        return lhs
    }
}

Now if we have a value os of type OutputStream, we can write the expression os <<< 17.

But... what happens if we write the expression os <<< 17 <<< 42? We get an error:

error: adjacent operators are in non-associative precedence group 'OutputStreaming'

The problem here is that we haven't specified whether the <<< operator should be read as the left-associative (os <<< 17) <<< 42 or the right-associative os <<< (17 <<< 42). Since our design is mimicking C++ output streaming by returning the left-hand operand, we meant to make it left-associative, so let's specify that:

precedencegroup OutputStreaming { 
    associativity: left
}

Now, our expression parses. The next thing to consider is how <<< works along with other operators. For example, how should the expression os <<< 17 + 25 be handled? It could be treated as (os <<< 17) + 25 or os <<< (17 + 25), or could even be considered an error that requires the user to write parentheses. The default behavior is an error, like this:

error: adjacent operators are in unordered precedence groups 'OutputStreaming' and 'AdditionPrecedence'

What's this AdditionPrecedence thing, you say? Well, it comes... from the Swift standard library.

Standard operators

As I've mentioned before, the "standard" types in Swift are expressed using the same tools that are available to all Swift libraries. We saw it with types like Int and Array being defined in the standard library (not the language), and more recently with extensibility of literals. The same principle applies to the "standard" operators that we think of as being part of the language: the full set of arithmetic and Boolean operators are defined in the standard library with precedence relations. Here is the full set of standard precedence groups:

precedencegroup AssignmentPrecedence {
  associativity: right
  assignment: true
}
precedencegroup FunctionArrowPrecedence {
  associativity: right
  higherThan: AssignmentPrecedence
}
precedencegroup TernaryPrecedence {
  associativity: right
  higherThan: FunctionArrowPrecedence
}
precedencegroup DefaultPrecedence {
  higherThan: TernaryPrecedence
}
precedencegroup LogicalDisjunctionPrecedence {
  associativity: left
  higherThan: TernaryPrecedence
}
precedencegroup LogicalConjunctionPrecedence {
  associativity: left
  higherThan: LogicalDisjunctionPrecedence
}
precedencegroup ComparisonPrecedence {
  higherThan: LogicalConjunctionPrecedence
}
precedencegroup NilCoalescingPrecedence {
  associativity: right
  higherThan: ComparisonPrecedence
}
precedencegroup CastingPrecedence {
  higherThan: NilCoalescingPrecedence
}
precedencegroup RangeFormationPrecedence {
  higherThan: CastingPrecedence
}
precedencegroup AdditionPrecedence {
  associativity: left
  higherThan: RangeFormationPrecedence
}
precedencegroup MultiplicationPrecedence {
  associativity: left
  higherThan: AdditionPrecedence
}
precedencegroup BitwiseShiftPrecedence {
  higherThan: MultiplicationPrecedence
}

The various operators available in the Swift standard library use these precedence groups:

infix operator << : BitwiseShiftPrecedence
infix operator &<< : BitwiseShiftPrecedence
infix operator >> : BitwiseShiftPrecedence
infix operator &>> : BitwiseShiftPrecedence

infix operator * : MultiplicationPrecedence
infix operator &* : MultiplicationPrecedence
infix operator / : MultiplicationPrecedence
infix operator % : MultiplicationPrecedence
infix operator & : MultiplicationPrecedence

infix operator + : AdditionPrecedence
infix operator &+ : AdditionPrecedence
infix operator - : AdditionPrecedence
infix operator &- : AdditionPrecedence
infix operator | : AdditionPrecedence
infix operator ^ : AdditionPrecedence

infix operator ... : RangeFormationPrecedence
infix operator ..< : RangeFormationPrecedence

infix operator ?? : NilCoalescingPrecedence

infix operator < : ComparisonPrecedence
infix operator <= : ComparisonPrecedence
infix operator > : ComparisonPrecedence
infix operator >= : ComparisonPrecedence
infix operator == : ComparisonPrecedence
infix operator != : ComparisonPrecedence

infix operator === : ComparisonPrecedence
infix operator !== : ComparisonPrecedence
infix operator ~= : ComparisonPrecedence

infix operator && : LogicalConjunctionPrecedence

infix operator || : LogicalDisjunctionPrecedence

infix operator *= : AssignmentPrecedence
infix operator &*= : AssignmentPrecedence
infix operator /= : AssignmentPrecedence
infix operator %= : AssignmentPrecedence
infix operator += : AssignmentPrecedence
infix operator &+= : AssignmentPrecedence
infix operator -= : AssignmentPrecedence
infix operator &-= : AssignmentPrecedence
infix operator <<= : AssignmentPrecedence
infix operator &<<= : AssignmentPrecedence
infix operator >>= : AssignmentPrecedence
infix operator &>>= : AssignmentPrecedence
infix operator &= : AssignmentPrecedence
infix operator ^= : AssignmentPrecedence
infix operator |= : AssignmentPrecedence

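There are some Swift-specific ones in there (..., ..<, and ??, for example) that line up with specific precedence groups (RangeFormationPrecedence and NilCoalescingPrecedence), but for the most part these operators reflect those of C(++), and the arithmetic and logical operators keep the precedence and associativity you'd expect. The bitwise operators are a notable exception, as the declarations above show: the shifts bind more tightly than multiplication, & sits with multiplication, and | and ^ sit with addition. So an expression like x & mask == 0 reads as (x & mask) == 0 in Swift, rather than C's x & (mask == 0).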

Most Swift programs just use this standard set of operators, and that's good! It's why they are standard. But if your library calls for its own operators (say, to express another domain), make sure to think about both their associativity and their relationship to the standard operators. For our output streaming operator, we expect it to have a relatively low precedence, so we can use other operators to compute the values we stream out. Making OutputStreaming have a lower precedence than the logical disjunction (||) works well:

precedencegroup OutputStreaming { 
    associativity: left
    lowerThan: LogicalDisjunctionPrecedence
}
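The same reasoning applies to operators that should bind tightly rather than loosely. As a sketch (the ** operator and the ExponentiationPrecedence group are hypothetical names of mine, not part of the standard library), an exponentiation operator probably wants to be right-associative and to sit above multiplication:

```swift
// Hypothetical precedence group: binds tighter than multiplication, groups right-to-left.
precedencegroup ExponentiationPrecedence {
    associativity: right
    higherThan: MultiplicationPrecedence
}

infix operator **: ExponentiationPrecedence

// Integer power by repeated multiplication; this sketch only handles non-negative exponents.
func ** (base: Int, exponent: Int) -> Int {
    precondition(exponent >= 0, "negative exponents are not supported in this sketch")
    var result = 1
    for _ in 0..<exponent { result *= base }
    return result
}

let scaled = 2 * 3 ** 2   // parsed as 2 * (3 ** 2), which is 18
let tower = 2 ** 3 ** 2   // parsed as 2 ** (3 ** 2), which is 2 ** 9 = 512
```

Because the group only states its relationship to MultiplicationPrecedence, mixing ** with an operator from an unrelated group (such as <<, whose group is also above multiplication) would still need parentheses.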

Aside: "unused result" warnings

If you've been coding along with the examples above, you've probably noticed that we get an "unused result" warning for expressions like os <<< 17. By default, Swift emits these warnings for calls that return a non-Void result. To suppress the warning at the use site, assign the expression to the placeholder value _. For our <<< operator, that also means making its precedence higherThan: AssignmentPrecedence.

However, for this particular function, we don't actually care whether the user discards the result, because the important part is the output streaming effect. Therefore, we can put the @discardableResult attribute on the function to squash the warning for all uses:

@discardableResult
static func <<<(lhs: OutputStream, rhs: Int) -> OutputStream { ... }

Think of this as flipping the default for the C++ nodiscard attribute: by default, Swift emits a warning for unused function results, and @discardableResult disables that warning for a given function.
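Putting the pieces of this section together, here's a minimal end-to-end sketch. RecordingStream is a hypothetical stand-in for the OutputStream class (it just collects values in an array so there's something to inspect), and I've given it a different name to stay clear of Foundation's own OutputStream.

```swift
// Left-associative and below ||, so arithmetic on the right binds tighter than streaming.
precedencegroup OutputStreaming {
    associativity: left
    lowerThan: LogicalDisjunctionPrecedence
}

infix operator <<<: OutputStreaming

// Hypothetical stream that records what was "written" to it.
final class RecordingStream {
    private(set) var items: [Int] = []

    // Returning the stream enables chaining; callers may ignore the result.
    @discardableResult
    static func <<<(lhs: RecordingStream, rhs: Int) -> RecordingStream {
        lhs.items.append(rhs)
        return lhs
    }
}

let os = RecordingStream()
os <<< 17 + 25 <<< 7   // parsed as (os <<< (17 + 25)) <<< 7
print(os.items)        // prints "[42, 7]"
```

Neither a precedence error nor an unused-result warning shows up here: the group orders <<< below addition, and @discardableResult covers the dropped return value.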

Calling a value as a function

C++ lets you create classes that behave like functions by implementing an operator() method. Indeed, this is the basis of C++ lambdas and all other C++ "function objects".

Swift has a similar capability, but it's far less commonly-used and offers a somewhat disjoint feature set. The basic idea is the same: in Swift, you can introduce a method named callAsFunction that will be used when calling an instance of the value as a function. For example, here's a Swift "function object" that binds the first value of a function, like the old school C++ bind1st, albeit using variadic generics:

struct BindFirst<Result, First, each Rest> {
    var fn: (First, repeat each Rest) -> Result
    let first: First
    
    func callAsFunction(_ rest: repeat each Rest) -> Result {
        fn(first, repeat each rest)
    }
}

Here's how it works:

func f(x: Int, y: Double) -> String { "\(x) -> \(y)" }

let bf = BindFirst(fn: f, first: 17)
bf(3.14159)

The BindFirst initializer takes a function and a first argument for that function. In our example, the function f has a first parameter of type Int. The resulting BindFirst value then acts as a function that takes the remaining arguments (in our example, one argument of type Double). When invoked, it passes the stored first argument together with the remaining arguments (repeat each rest) and returns the result, so it acts as a forwarding function.

While this illustrates how callAsFunction works, it doesn't really illustrate how it's used: in Swift, one would probably just use a closure instead of BindFirst (indeed, bind1st was removed from C++ after lambdas came along for basically the same reason). More importantly, since there's no "Callable"-style protocol in Swift, you can't really abstract over function objects in the same way that you do in C++. Rather, one tends to use values of function type and closures in Swift, and callAsFunction is fairly rare. It's certainly less commonly used in Swift than operator() is in C++.
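For comparison, here's roughly what the closure version looks like. This is a sketch reusing the f from above; the name bound is just for illustration.

```swift
func f(x: Int, y: Double) -> String { "\(x) -> \(y)" }

// A closure captures the first argument directly; no wrapper type is needed.
let bound: (Double) -> String = { y in f(x: 17, y: y) }
print(bound(3.14159))   // prints "17 -> 3.14159"
```

The closure's type, (Double) -> String, is an ordinary function type, so it can be stored or passed anywhere a function value is expected.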

With callAsFunction, the argument labels at the call site need to match up with the ones declared for the parameters of callAsFunction, just like with any other Swift function. There's actually a way to be more dynamic and accept any keyword arguments at the call site: we'll get back to that after we talk about pointer-to-members.
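To make the label rule concrete, here's a small sketch with a hypothetical Polynomial type (my own example, not something from a library) whose callAsFunction declares an at: label:

```swift
struct Polynomial {
    var coefficients: [Double]   // coefficients[i] multiplies x to the power i

    // The `at:` label becomes part of the call syntax.
    func callAsFunction(at x: Double) -> Double {
        // Horner's rule, starting from the highest-order coefficient.
        coefficients.reversed().reduce(0) { $0 * x + $1 }
    }
}

let poly = Polynomial(coefficients: [1, 2, 3])   // 1 + 2x + 3x^2
let valueAtTwo = poly(at: 2)                      // same as poly.callAsFunction(at: 2)
print(valueAtTwo)                                 // prints "17.0"
// poly(2) would not compile: the `at:` label is required.
```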

Pointer-to-member, Swift style

C++ pointer-to-members are a mechanism for referring to a non-static member (data or function) without specifying the actual instance for this. At a later point, one can supply a this using the .* or ->* operators. Here's a quick refresher:

class Point {
public:
  int x, y;
  Point flippedOverXAxis() { ... }
  Point flippedOverYAxis() { ... }
};

// Form pointers to specific members.
Point (Point::*memberFunction)() = &Point::flippedOverXAxis;
int (Point::*member) = &Point::x;

// Supply an instance to refer to the referenced members.
Point p{1, 2};
p.*member = 3;         // updates x
p = (p.*memberFunction)(); // flipped over X axis

Swift permits the same use cases, but... differently.

Curried instance methods

In Swift, it's possible to refer to an instance method of a type as a member of the type itself. Imagine a Point struct similar to the C++ class above:

struct Point {
  var x: Int
  var y: Int
  
  func flippedOverXAxis() -> Point { ... }
  func flippedOverYAxis() -> Point { ... }
}

If I refer to Point.flippedOverXAxis, I will get back a value of type (Point) -> () -> Point, i.e., a function that accepts a Point (i.e., the self instance) and then returns a function. The returned function takes no arguments and returns a Point for its result.

Aside: Note that -> is right-associative, so this function type is read as (Point) -> (() -> Point). This is documented by the FunctionArrowPrecedence precedence group earlier in this post, which is also important for handling the -> operator when it shows up in expressions.

Because referring to an instance method on the type produces a function, we don't need any special syntax like .* to deal with "pointer-to-member-function" in Swift: it's just function calls.

let memberFunction = Point.flippedOverXAxis

var p = Point(x: 1, y: 2)
p = memberFunction(p)()
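Since these are plain function values, you can store them and pick one at run time. Here's a runnable sketch; the method bodies are my own guess at what "flipped over an axis" means (negating the other coordinate), and the flips array is purely illustrative.

```swift
struct Point: Equatable {
  var x: Int
  var y: Int

  // Assumed implementations: flipping over the X axis negates y, and vice versa.
  func flippedOverXAxis() -> Point { Point(x: x, y: -y) }
  func flippedOverYAxis() -> Point { Point(x: -x, y: y) }
}

// Each element has type (Point) -> () -> Point.
let flips: [(Point) -> () -> Point] = [Point.flippedOverXAxis, Point.flippedOverYAxis]

var p = Point(x: 1, y: 2)
for flip in flips {
  p = flip(p)()   // first supply self, then call the resulting function
}
print(p)          // prints "Point(x: -1, y: -2)"
```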

Key paths

Swift's equivalent to pointer-to-data-members is a feature called key paths. A key path is an abstraction of a chain of nested member accesses starting from a given instance. In its simplest form, a key path is like a pointer to data member, and is formed with the syntax \Type.propertyName. For example, let's reference the x property of Point:

let member = \Point.x

The type of member is WritableKeyPath<Point, Int>, i.e., a key path that starts at an instance of Point and refers to a value of type Int, much like the pointer-to-data-member we'd get from the &Point::x we wrote in C++. WritableKeyPath is used for key paths that can both read and write the value (i.e., because it's a var). If x were instead a let, we would get an instance of KeyPath, which doesn't allow mutation.

Now that we have the key path member, we can use it to access the member with the keyPath subscript on a particular instance. For example:

var p = Point(x: 1, y: 2)
p[keyPath: member] = 3        // update p.x to the value 3

This subscript, which is available on every Swift type, is equivalent to the C++ .* operator: it takes a key path whose root type (in our case, Point) is the same as the type being subscripted, and evaluates to a value of the referenced property (an Int, in our case).

But Swift key paths can do a lot more than C++ pointer-to-data-members, because they can represent an arbitrary path. For one, they aren't limited to stored properties. For example, let's imagine there's an absoluteValue property on Int, like this:

extension Int {
  var absoluteValue: Int {
    self < 0 ? -self : self
  }
}

A key path \Int.absoluteValue will have type KeyPath<Int, Int>, since there's no setter and it's therefore not writable. When passed into the [keyPath:] subscript, the result will be evaluated by calling the getter. As with the rest of Swift, the difference between a stored and computed property is not usually observable to the user, but there's an exception: one can retrieve the physical offset of a key path that refers to a stored property with the MemoryLayout.offset(of:) operation:

if let offset = MemoryLayout<Point>.offset(of: member) {
  print("Offset of stored property referenced by the member is \(offset)")
}
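Here's a sketch of what that operation reports for stored versus computed properties. The sum property is a hypothetical addition of mine, there only to have something computed to ask about.

```swift
struct Point {
  var x: Int
  var y: Int
  var sum: Int { x + y }   // computed, so it has no storage of its own
}

let xOffset = MemoryLayout<Point>.offset(of: \.x)      // 0: x is the first stored property
let yOffset = MemoryLayout<Point>.offset(of: \.y)      // y is laid out right after x
let sumOffset = MemoryLayout<Point>.offset(of: \.sum)  // nil: nothing in memory to point at

print(xOffset as Any, yOffset as Any, sumOffset as Any)
```

The nil result is the one place where the stored/computed distinction leaks through a key path.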

Key paths can also subscript into their instances and perform chains of accesses. For example, let's imagine that we have an array of points like this:

var points: [Point] = ...

We can form a key path that extracts the absolute value of the y component of a particular element in an array of points, like this:

let index = 2
let deepMember = \[Point].[index].y.absoluteValue // has type KeyPath<[Point], Int>

Now, if we evaluate points[keyPath: deepMember], we subscript the array (by index), extract the y value, then compute the absolute value of that. Now, much of the time, you won't be writing out these long key path chains as literals like this. Key paths are composable, so you can take a key path and append another key path to it to form a longer key path. For example, we could create a key path that references a specific point from the array:

let indexedMember = \[Point].[index]   // has type WritableKeyPath<[Point], Point>

Now, we can extend this key path to access whichever member of Point we referenced by member originally:

let pointMember = indexedMember.appending(path: member)   // produces a WritableKeyPath<[Point], Int>

The appending(path:) operation requires that the root type of the second key path (member in our example) be the same as the value type of the first key path (indexedMember in our example), because we're effectively gluing the two chains of accesses together.
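As a runnable sketch of the composition (the sample points are made up for illustration), here the combined key path is used to write through the array and into a member in one step:

```swift
struct Point {
  var x: Int
  var y: Int
}

var points = [Point(x: 1, y: 2), Point(x: 3, y: 4), Point(x: 5, y: 6)]

let index = 2
let indexedMember = \[Point].[index]                      // WritableKeyPath<[Point], Point>
let member = \Point.x                                     // WritableKeyPath<Point, Int>
let pointMember = indexedMember.appending(path: member)   // WritableKeyPath<[Point], Int>

points[keyPath: pointMember] = 50   // same effect as points[2].x = 50
print(points[2].x)                  // prints "50"
```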

Key paths generalize the idea of "pointer to data member" to include computed properties, subscripts, and chains of nested accesses. They're useful for describing how to extract parts of data in an abstract manner. They also play a central role in the ability to proxy properties in Swift.
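One common way that abstraction shows up is in generic helpers that take the "which part" as a parameter. Here's a minimal sketch; the total(_:of:) function is a hypothetical helper of mine, not a standard library API.

```swift
struct Point {
  var x: Int
  var y: Int
}

// Sums whichever Int-valued property the key path selects.
func total<Element>(_ items: [Element], of keyPath: KeyPath<Element, Int>) -> Int {
  items.reduce(0) { $0 + $1[keyPath: keyPath] }
}

let points = [Point(x: 1, y: -2), Point(x: 3, y: 4)]
let sumOfX = total(points, of: \.x)   // 1 + 3 = 4
let sumOfY = total(points, of: \.y)   // -2 + 4 = 2

// A key path literal can also stand in for a function (Element) -> Value.
let xs = points.map(\.x)              // [1, 3]
```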

Proxied members

Sometimes, you want to have a type that acts as a proxy for some other type. For example, perhaps you have a Box<T> type that heap-allocates a value of type T so you can share it. In Swift, that could look like this:

class Box<T> {
  var stored: T
  
  init(stored: T) {
    self.stored = stored
  }
}

However, this can be a bit annoying to use: if I have a Box<Point> named boxedPoint, I can't refer to boxedPoint.x directly: I have to go through boxedPoint.stored.x. If this were C++, we'd likely overload operator-> to return the underlying type. For example, our C++ Box type might look like this:

template<typename T>
class Box {
  T *stored;
  
public:
  T *operator->() { return stored; }
};

In C++, this would let you write boxedPoint->x to access the x field of the stored point. Swift doesn't have a specific equivalent to operator->. However, Swift does have the ability to intercept normal . access to provide similar behavior for accessing properties using @dynamicMemberLookup. Let's update our Box type to use this feature:

@dynamicMemberLookup
class Box<T> {
  var stored: T
  
  init(stored: T) {
    self.stored = stored
  }
    
  subscript<U>(dynamicMember keyPath: KeyPath<T, U>) -> U {
    stored[keyPath: keyPath]
  }

  subscript<U>(dynamicMember keyPath: WritableKeyPath<T, U>) -> U {
    get { stored[keyPath: keyPath] }
    set { stored[keyPath: keyPath] = newValue }
  }
}

The @dynamicMemberLookup feature enables the behavior we want, but the subscript [dynamicMember:] is where the real magic happens. This subscript accepts a KeyPath (or WritableKeyPath) starting at a T (the type of the boxed value) and producing a U. The implementation of the subscript applies the key path to the stored value to get (or set) a value of type U.

Now, when we write something like

boxedPoint.x

the compiler notes that Box is marked @dynamicMemberLookup and that there is no member named x in it. So, it looks to see whether there is a member x in the root type of the key path for the [dynamicMember:] subscript. There is, so the code above is desugared to

boxedPoint[dynamicMember: \Point.x]

This is a rather different approach: we've effectively overloaded the "dot" operator by projecting the properties of T onto Box<T>. In our simple case here, it's just syntactic sugar to avoid writing .stored everywhere. However, we can do more interesting things by manipulating the key paths themselves. For example, we could keep a log of the updates made to the storage:

@dynamicMemberLookup
class LoggingBox<T> {
  var stored: T
  var log: [(PartialKeyPath<T>, oldValue: Any, newValue: Any)] = []
  
  init(stored: T) {
    self.stored = stored
  }
      
  subscript<U>(dynamicMember keyPath: WritableKeyPath<T, U>) -> U {
    get { stored[keyPath: keyPath] }
    set {
      log.append((keyPath, oldValue: stored[keyPath: keyPath], newValue: newValue))
      stored[keyPath: keyPath] = newValue
    }
  }
}
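Here's what using such a logging box might look like. This is a sketch that repeats the class so it can run on its own; the Point values and the printing loop are just for illustration.

```swift
struct Point {
  var x: Int
  var y: Int
}

@dynamicMemberLookup
final class LoggingBox<T> {
  var stored: T
  var log: [(PartialKeyPath<T>, oldValue: Any, newValue: Any)] = []

  init(stored: T) {
    self.stored = stored
  }

  subscript<U>(dynamicMember keyPath: WritableKeyPath<T, U>) -> U {
    get { stored[keyPath: keyPath] }
    set {
      // Record the change before applying it.
      log.append((keyPath, oldValue: stored[keyPath: keyPath], newValue: newValue))
      stored[keyPath: keyPath] = newValue
    }
  }
}

let box = LoggingBox(stored: Point(x: 1, y: 2))
box.x = 10    // goes through the setter with \Point.x
box.y += 5    // getter, then setter with \Point.y

for entry in box.log {
  print("\(entry.oldValue) -> \(entry.newValue)")   // prints "1 -> 10", then "2 -> 7"
}
```

Note that log and stored are real members of LoggingBox, so they are found directly and never go through the dynamic member subscript.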

It's also possible for the subscript [dynamicMember:] to return something other than the result of accessing the key path. For example, an ORM library might want a way to describe an arbitrary database query that produces a value of type T. Drilling down into specific members of the query shouldn't eagerly evaluate it: rather, they should produce another query with the type of the member. It would look something like this:

@dynamicMemberLookup
struct Query<T> {
  subscript<U>(dynamicMember keyPath: KeyPath<T, U>) -> Query<U> { ... }
  
  func evaluate() -> T { ... }
}

Therefore, if we have a let query: Query<Point>, then query.x would produce a Query<Int> that extracts just the x component from the queried point.
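To show the shape of the idea without a real database, here's a toy sketch in which a query is just a deferred closure. The run property and the evaluations counter are assumptions of mine for illustration; a real ORM would build up a query description instead.

```swift
struct Point {
  var x: Int
  var y: Int
}

@dynamicMemberLookup
struct Query<T> {
  // Hypothetical storage: a deferred computation standing in for a real query.
  let run: () -> T

  subscript<U>(dynamicMember keyPath: KeyPath<T, U>) -> Query<U> {
    // Wrap the parent computation; nothing is evaluated until evaluate() is called.
    let parent = run
    return Query<U>(run: { parent()[keyPath: keyPath] })
  }

  func evaluate() -> T { run() }
}

var evaluations = 0
let query = Query<Point>(run: {
  evaluations += 1            // count how often the "database" is hit
  return Point(x: 3, y: 4)
})

let xQuery = query.x          // a Query<Int>; evaluations is still 0 here
let x = xQuery.evaluate()     // runs the deferred work once; x is 3
```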

Going dynamic

Both callAsFunction (like operator()) and @dynamicMemberLookup (sort of like an overloadable operator.) are strongly-typed mechanisms that are suitable when you're working with well-typed information. However, sometimes you want something a bit more... dynamic. For example, you might be interoperating with a more dynamic language like Python or JavaScript where we don't have static information about the various properties and methods. There are existing Swift libraries for both cases (PythonKit and JavaScriptKit, respectively), but I'll illustrate with Python.

In Python, everything is an object, so a Swift library to interoperate with Python will have some wrapper around an arbitrary Python object:

struct PythonObject {
  // hold reference to Python object
}

Now, a Python object can hold a value of any type at runtime. We'd like to be able to access the properties and methods of any Python object, including calling methods, but we don't know the names of any of them in our Swift code. Let's say we want to use a Python class defined like this:

class Dog:
    def __init__(self, name):
        self.name = name
        self.tricks = []    

    def add_trick(self, trick):
        self.tricks.append(trick)

Let's assume we can get an instance of the Python type Dog into a PythonObject somehow:

let pyObj: PythonObject = /*create the Dog instance somehow */

How would we go about accessing its tricks property or adding a new trick? The names tricks and add_trick aren't known in Swift, so we're probably going to have to use string literals for everything, e.g.

pyObj["add_trick"](newTrick)   // Python equivalent: pyObj.add_trick(newTrick)
print(pyObj["tricks"])         // Python equivalent: pyObj.tricks

That can work, but it's fairly ugly. And if we want to support Python features like keyword arguments, it's going to get uglier:

pyObj["add_trick"](("trick", newTrick))    // Python equivalent: pyObj.add_trick(trick=newTrick)

Fortunately, the strongly type-safe Swift features we've talked about for calling a value as a function and accessing a projected property have dynamic versions that work on strings. This lets us get to Swift code that looks a whole lot more like the original Python, making it easier to interoperate. The features we need are @dynamicCallable and the string form of @dynamicMemberLookup. Let's see both together:

@dynamicCallable
@dynamicMemberLookup
struct PythonObject {
  func dynamicallyCall(withKeywordArguments: KeyValuePairs<String, PythonObject>) -> PythonObject { ... }
  
  subscript(dynamicMember name: String) -> PythonObject {
    get { ... }
    set { ... }
  }
}

The dynamicallyCall(withKeywordArguments:) method enables calls to an instance of PythonObject with any arguments, including argument labels (where present), so long as all of the argument values are PythonObjects. This means that if we have a Python object that's a method or has a __call__ method (Python's version of operator()), we can invoke it with something like this:

pyObj(a, limit: b)

And the compiler will translate this into a call to dynamicallyCall with a dictionary literal that stringifies the argument names:

pyObj.dynamicallyCall(withKeywordArguments: ["": a, "limit": b])

We use KeyValuePairs from the standard library because it maintains ordering and allows duplicate keys, unlike Dictionary, but still supports dictionary literal syntax. The actual implementation of dynamicallyCall(withKeywordArguments:) can then extract the Python objects and form the underlying call.

To access members on a Python object, we again define the subscript [dynamicMember:], but this time it takes the name of the member as a String. This trades away the type safety of the key-path solution to provide more flexibility, because we can refer to any name and it'll be resolved dynamically. With this in place, if we write

pyObj.tricks

the Swift compiler will translate that into

pyObj[dynamicMember: "tricks"]

to access the property. With these two customizations in hand, we can now write Swift code that feels a lot like the equivalent Python to access Python properties and methods:

pyObj.add_trick(trick: newTrick)
print(pyObj.tricks)
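If you want to experiment with the desugaring without a Python runtime, here's a toy sketch. CallDescriber and Record are hypothetical stand-ins of mine that work on plain Swift values; a real bridge such as PythonKit would forward to the Python runtime instead.

```swift
@dynamicCallable
struct CallDescriber {
  // Labels arrive as strings; unlabeled arguments arrive with an empty label.
  func dynamicallyCall(withKeywordArguments args: KeyValuePairs<String, Int>) -> String {
    args.map { $0.key.isEmpty ? "\($0.value)" : "\($0.key): \($0.value)" }
        .joined(separator: ", ")
  }
}

@dynamicMemberLookup
struct Record {
  var fields: [String: String] = [:]

  // Any member name is accepted here and resolved at run time.
  subscript(dynamicMember name: String) -> String? {
    get { fields[name] }
    set { fields[name] = newValue }
  }
}

let describe = CallDescriber()
let description = describe(1, limit: 2)   // dynamicallyCall(withKeywordArguments: ["": 1, "limit": 2])
print(description)                        // prints "1, limit: 2"

var record = Record()
record.name = "Fido"                      // record[dynamicMember: "name"] = "Fido"
print(record.name ?? "missing")           // prints "Fido"
print(record.breed ?? "missing")          // prints "missing"
```

The last line is the trade-off in miniature: a misspelled or nonexistent member compiles fine and only shows up as a missing value at run time.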

Of course, one shouldn't be too eager to throw away type information without cause, so don't reach for these string-based features first. However, when you're interacting with something very dynamic, whether it's completely unstructured data or an untyped language, these string-based operator overloading features can provide clearer code than the string-literal-laden alternatives.

Wrap-up and what's next?

Swift's support for operator overloading is fairly extensive, allowing libraries to provide rich APIs. But as in C++, great power comes with great responsibility: it's absolutely possible to go wild with these features and create something that's utterly indecipherable to users of your library.

Next we're going to look at one last set of features for embedding domain-specific languages in Swift.
