Declarative Text Kit: Inserting Strings and Lines With a Result Builder


There’s been progress on the Declarative Text Kit API idea, and today I want to share an insight into building a DSL with Swift Result Builders.

A detail that I didn’t grok until I built it.

Table of Contents
  1. Declaratively Inserting Text
  2. NSString Text Buffer Abstraction
  3. Inserting and Concatenating Strings
  4. Tip: Avoid Opaque Types in Result Builder at First for Maximum Compiler Help
  5. Inserting and Concatenating Whole Lines
  6. An Abstract Look at Our Transformations
  7. Inserting Newline Characters Between Lines As Needed
  8. Mixing Insertable Pieces of Text (This is Where the Big Creative Moment Happened)
  9. Minimum Effective Dose
  10. Takeaways

Declaratively Inserting Text

Inserting text into a string, text view, or other kind of text buffer is simple enough: you need a location and then use the appropriate API to put a string into the target there.

My initial sketch contained a detail I was very excited to build: a way to guarantee a string would always be inserted on its own line. Not just concatenating newlines before and after, no matter what, but be clever about it: ensure that there’ll be sufficient newline characters in the resulting text buffer, depending on the existing content, re-using existing newline characters, so to speak.

The checks to insert newlines as needed are nothing magical:

  • If inserted on an empty line (aka at a location between "\n\n" in a string), just insert the text.
  • If inserted at the end of a line (before "\n" but after any non-newline character), break the line before inserting the text.
  • If inserted at the start of a line (after "\n" but before any non-newline character), insert the text and then break after it.
  • If inserted in the middle of a piece of text (no newlines around the location), break the line first, insert the text, then break again before continuing with the original text.

In isolation, that’s a simple procedure with simple checks, as we’ll see further down.
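The four cases above boil down to two independent checks: is there a line break right before the location, and is there one right after? Here’s a minimal sketch on an immutable string. The function name insertingOnOwnLine is made up for illustration, and treating the very start and end of the text as line boundaries is an assumption of this sketch.

```swift
import Foundation

/// Hypothetical helper: returns `content` with `text` inserted on its own line
/// at the UTF-16 `location`, adding line breaks only where none exist yet.
func insertingOnOwnLine(
    _ text: String,
    into content: String,
    at location: Int
) -> String {
    let string = NSString(string: content)

    // Assumption: the start and end of the text count as line boundaries.
    let hasNewlineBefore = location == 0
        || string.substring(with: NSRange(location: location - 1, length: 1)) == "\n"
    let hasNewlineAfter = location == string.length
        || string.substring(with: NSRange(location: location, length: 1)) == "\n"

    // Only add the line breaks that are missing around the insertion point.
    let insertion = (hasNewlineBefore ? "" : "\n")
        + text
        + (hasNewlineAfter ? "" : "\n")

    return string.replacingCharacters(
        in: NSRange(location: location, length: 0),
        with: insertion
    )
}
```

All four cases from the list end up with the inserted text alone on its line, for example when inserting "X" into "ab" at location 1 or into "a\n\nb" at location 2.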

It becomes more complex when this becomes part of a declarative API that doesn’t know at the time of creating instructions where newline characters are.

The example for that looked like this:

Modifying(selectedRange) { rangeToWrap in
    Insert(rangeToWrap.location) { Line("```") }
    Insert(rangeToWrap.endLocation) { Line("```") }
}

In a selectedRange, insert triple backticks at the start and end, but ensure they are on separate lines.

Here’s a textual demonstration with some Markdown; the selection is represented with a highlight:


Reformat this code example:

let foo = 12, so that
it's in a code block.
Before: A bit of text (let foo = 12) highlighted at the start of a paragraph.

Reformat this code example:

```⮐
let foo = 12⮐
```⮐
, so that
it's in a code block.
After: The selection became a code block. New line breaks are depicted as ⮐.

Note how no new line break is added before the opening backticks, because the selection started right after an existing line break.

So the Line wrapper can’t just concat "\n" left and right; it needs to check whether there are line breaks already.

Here’s how I did that, escalating the complexity of the approach in steps that aren’t overwhelming on their own.

NSString Text Buffer Abstraction

I’m making two assumptions for this example:

  • use integer-based locations, not Swift.String.Index, because text views (my target component) work with NSStrings under the hood;
  • characters in an NSString are also NSString substrings.

To simplify the code, imagine we have an NSMutableString or text view that conforms to this protocol and just does the right thing:

protocol Buffer {
    func character(at location: Int) -> NSString
    func insert(_ content: NSString, at location: Int)
}

We’ll work on this Buffer type for the remainder of the time and ignore the details of NSTextView, UITextView, and NSMutableString.
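If you want something concrete to play with, here’s one possible stand-in. StringBuffer is a hypothetical helper for experiments and tests, not part of the API described in this post. It wraps an NSMutableString instead of extending it, which sidesteps the existing NSString.character(at:) that returns a unichar.

```swift
import Foundation

protocol Buffer {
    func character(at location: Int) -> NSString
    func insert(_ content: NSString, at location: Int)
}

/// Hypothetical in-memory buffer, backed by NSMutableString.
final class StringBuffer: Buffer {
    private let storage: NSMutableString

    init(_ content: String) {
        self.storage = NSMutableString(string: content)
    }

    /// The current text, for inspection in tests.
    var content: String { storage as String }

    func character(at location: Int) -> NSString {
        // A "character" here is a substring of one UTF-16 code unit.
        let range = NSRange(location: location, length: 1)
        return NSString(string: storage.substring(with: range))
    }

    func insert(_ content: NSString, at location: Int) {
        storage.insert(content as String, at: location)
    }
}
```

Because the buffer is a class, insert(_:at:) can mutate the storage without being marked mutating, just like a text view would.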

Inserting and Concatenating Strings

The most simple change in text can be a string literal. To exercise the Result Builder API, let’s start with two strings that should be concatenated:

Insert(1337) { 
    "Hello, "
    "World!"
}

No Line wrapper, yet, just a couple of string literals to insert at a position.

First, we’re declaring a protocol to get to the text to be inserted. It will work on Line, too, and we’re calling it Insertable to describe the trait:

protocol Insertable { 
    var insertableContent: NSString { get }
}

extension String: Insertable { 
    var insertableContent: NSString { self as NSString }
}

That won’t remain as-is, but it’s a good enough start to get the code to compile.

The Insert instruction needs the location and content to be complete:

struct Insert {
    let insertable: Insertable
    let location: Int

    init(_ insertable: Insertable, at location: Int) {
        self.insertable = insertable
        self.location = location
    }
}

This is missing the block-based Result Builder API. But it has an initializer to create the value from scratch (which can be useful for testing). The user-facing API with the Result Builder can be added in an extension.

Applying wishful programming, imagine we have the Result Builder already:

extension Insert {
    init(
        _ location: Int,
        @InsertableBuilder _ body: () -> Insertable
    ) {
        // Forward to the designated initializer declared above.
        self.init(body(), at: location)
    }
}

That would make the code example compile once we add the builder:

@resultBuilder
struct InsertableBuilder { }

To implement a Result Builder in Swift, you need to either provide a single buildBlock function, which won’t work here because we have 2 string literals, or two buildPartialBlock functions that behave similarly to collection reducers. Our 2 string literals are “partial”, each being a part of the to-be-inserted text at a given location, so that’s a good fit.

extension InsertableBuilder {
    static func buildPartialBlock(
        first: String
    ) -> String {
        return first
    }
    
    static func buildPartialBlock(
        accumulated: String,
        next: String
    ) -> String {
        return accumulated + next
    }
}

And with that, we can concatenate two strings into one inside the block. If Insert conformed to Equatable, this equation would now hold in tests:

XCTAssertEqual(
    Insert(1337) { 
        "Hello, "
        "World!"
    },
    Insert("Hello, World!", at: 1337)
)
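To try this outside of a test target, the pieces so far can be assembled into one sketch. The Equatable conformance is an assumption for illustration (it compares location and content); nothing above defines one. The helloWorldInstruction function only exists to show the call site.

```swift
import Foundation

protocol Insertable {
    var insertableContent: NSString { get }
}

extension String: Insertable {
    var insertableContent: NSString { self as NSString }
}

@resultBuilder
struct InsertableBuilder {
    static func buildPartialBlock(first: String) -> String {
        return first
    }

    static func buildPartialBlock(accumulated: String, next: String) -> String {
        return accumulated + next
    }
}

struct Insert {
    let insertable: Insertable
    let location: Int

    init(_ insertable: Insertable, at location: Int) {
        self.insertable = insertable
        self.location = location
    }

    init(_ location: Int, @InsertableBuilder _ body: () -> Insertable) {
        self.init(body(), at: location)
    }
}

// Assumption: two instructions are equal if location and content match.
extension Insert: Equatable {
    static func == (lhs: Insert, rhs: Insert) -> Bool {
        return lhs.location == rhs.location
            && lhs.insertable.insertableContent == rhs.insertable.insertableContent
    }
}

/// The builder reduces both literals into a single "Hello, World!" string.
func helloWorldInstruction() -> Insert {
    Insert(1337) {
        "Hello, "
        "World!"
    }
}
```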

Tip: Avoid Opaque Types in Result Builder at First for Maximum Compiler Help

Note that the parameters of both buildPartialBlock functions are concrete types, not opaque types. I discovered how helpful this would be much later in the process.

This is how it looked first:

static func buildPartialBlock(
    first: some Insertable
) -> some Insertable {
    return first
}

That looks innocent enough. But what do you do with two some Insertables? You group them in a Tuple<A, B> of sorts. For strings, I went with Concat<Left, Right>:

static func buildPartialBlock(
    accumulated: some Insertable,
    next: some Insertable
) -> some Insertable {
    return Concat(left: accumulated, right: next)
}

struct Concat<Left, Right>: Insertable
where Left: Insertable, Right: Insertable {
    let left: Left
    let right: Right
    var insertableContent: NSString {
        // Bridge to String to concatenate, then back to NSString.
        let combined = (left.insertableContent as String)
            + (right.insertableContent as String)
        return combined as NSString
    }
}

While this does compile just fine, it’s tough as nails to later figure out which overload you’re missing, and which will be used (even with @_disfavoredOverload) when you want to add specializations.

This was a dead end to start with.

This could work fine as an API in the end, but for the initial phase of experimentation and discovery, I found concrete types to work much better, like String above.

On the flip side, every Insertable-conforming type that goes into the InsertableBuilder block needs its own build functions.

This sounds like boilerplate, code duplication, overhead and whatever – and it absolutely is, but this is the point to learn and get clarity: to have the compiler tell you that, hey, you didn’t think of this or that combination.

If everything is a special case at first, you can discover shared behavior later.

Inserting and Concatenating Whole Lines

We almost have all the basics in place.

To represent a Line that guarantees newline characters left and right, we need the struct itself:

struct Line {
    let wrapped: String

    init(_ wrapped: String) {
        self.wrapped = wrapped
    }
}

The simplest next step is to make it Insertable, even though this implementation is not (yet) to spec and flawed.

We can recover from that flawed implementation in a minute, and I’ll spare you all the detours I went through. The idea here again is to get something to work, and then make it better.

extension Line: Insertable {
    var insertableContent: NSString { 
        // FIXME: Eagerly appends newline, that's actually wrong 
        return (wrapped + "\n") as NSString
    }
}

With this, we can start with an insertion example to inch forward a bit, slowly, towards a working implementation:

Insert(1337) {
    Line("First")
    Line("Second")
}

This won’t compile. Since we ditched opaque types (some Insertable) in the InsertableBuilder, we need to declare new functions that work on Lines:

extension InsertableBuilder {
    static func buildPartialBlock(
        first: Line
    ) -> Line {
        return first
    }
    
    static func buildPartialBlock(
        accumulated: Line,
        next: Line
    ) -> Line {
        return Line(accumulated.wrapped + "\n" + next.wrapped)
    }
}

I hid an insight that cost me a day to get to in this implementation (because I was still trying to make Concat<Left, Right> work).

To recap:

  • The Line type expresses a requirement: that the wrapped text should end up surrounded by newline characters.
  • The Line type can’t just concatenate newline characters; it’s a lazily evaluated requirement. Only during insertion can we know whether a newline character is needed at a certain point.

So we can think of the Line value as “maybe a newline” plus “some text” plus “maybe a newline”. The “maybe” is the key here.

Concatenating two Line values means we can eagerly make this decision and change the “maybe” to a “definitely”:

Line(accumulated.wrapped + "\n" + next.wrapped)

Concatenating two lines means we can make the decision now and don’t need to wait. This is expressed here with the eager newline in the middle.
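To see the reduction in isolation, here’s a small sketch of a builder with just the two Line overloads. The lines(_:) helper and the threeLines function are hypothetical; they only exist to evaluate a block without the Insert type.

```swift
struct Line: Equatable {
    let wrapped: String

    init(_ wrapped: String) {
        self.wrapped = wrapped
    }
}

@resultBuilder
struct InsertableBuilder {
    static func buildPartialBlock(first: Line) -> Line {
        return first
    }

    static func buildPartialBlock(accumulated: Line, next: Line) -> Line {
        // The newline between two lines is certain, so it is added eagerly.
        // The outer "maybe" newlines stay undecided in the resulting Line.
        return Line(accumulated.wrapped + "\n" + next.wrapped)
    }
}

/// Hypothetical helper to evaluate a builder block on its own.
func lines(@InsertableBuilder _ body: () -> Line) -> Line {
    return body()
}

/// Three lines are reduced pair-wise into one Line value.
func threeLines() -> Line {
    lines {
        Line("First")
        Line("Second")
        Line("Third")
    }
}
```

No matter how many Line values go into the block, a single Line comes out, with its wrapped text spanning multiple lines.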

Again, this took a while for me to realize: I don’t need to carry a Concat<Line, Line> around through the Result Builder and indefinitely defer all decisions, because this essentially still represents a single Line value, only the wrapped string is on two lines.

From this insight will follow all the magic that’s going to unfold.

An Abstract Look at Our Transformations

It will pay off to look at the transformations (concatenations) we support inside our Result Builder from a higher level point of view to reason about the type combinations we really need.

To do that, we can look at the 4 functions of InsertableBuilder as function signatures:

(String) -> String
(String, String) -> String
(Line) -> Line
(Line, Line) -> Line

The first and third one can now actually be combined, as they are merely same-type passthroughs:

extension InsertableBuilder {
    static func buildPartialBlock<I>(
        first: I
    ) -> I 
    where I: Insertable {
        return first
    }
}

Note it’s not an opaque some Insertable, because the input type is strongly required to be the output type. No same-protocol transformations allowed. It’s a stronger guarantee this way, and will again help the compiler to help you.

The remaining transformations are these three, then:

(I) -> I where I: Insertable
(String, String) -> String
(Line, Line) -> Line

From this point on, we will ignore the first function, the “block starter”, buildPartialBlock<I>(first:), because it will remain unchanged.

The two interesting transformations then are same-type concatenations, from two strings reduced into a single string, and from two lines reduced into a single line:

(String, String) -> String
(Line, Line) -> Line

This will come in handy, soon.

Inserting Newline Characters Between Lines As Needed

We now have two ways through the DSL to concatenate values of the same type. But the one for Lines still has a bug I annotated as FIXME:

extension Line: Insertable {
    var insertableContent: NSString { 
        // FIXME: Eagerly appends newline, that's actually wrong
        return (wrapped + "\n") as NSString
    }
}

This eagerly appends a newline character. That’s broken when we concatenate two lines, and it’ll also add superfluous newline characters when we insert text into empty lines. We want empty lines to be reused.

The bad news is we can’t just produce a computed insertableContent. We need context to decide if “maybe” adding a newline reduces into “actually do insert a line break”:

  • Is there a leading line break? If not, insert a newline character before the wrapped string.
  • Is there a trailing line break? If not, insert a newline character after the wrapped string.

The Buffer protocol I introduced allows inspection via character(at:), so we can ask the text buffer for context. Insertable needs to be replaced, though, to support asking for context:

protocol Insertable {
    func insert(into buffer: Buffer, at location: Int)
}

extension String: Insertable {
    func insert(into buffer: Buffer, at location: Int) {
        buffer.insert(self as NSString, at: location)
    }
}

A Swift string is simple: just put it at the location.

With that protocol change, finally, Line values can express their condition lazily:

extension Buffer {
    fileprivate func newline(at location: Int) -> Bool {
        return character(at: location) == "\n"
    }
}

extension Line: Insertable {
    func insert(
        into buffer: Buffer,
        at location: Int
    ) {
        // FIXME: Buffer boundaries not checked for simplicity.
        let hasNewlineBefore = buffer.newline(at: location - 1)
        let hasNewlineAfter = buffer.newline(at: location)

        // Insert back-to-front so every piece can go to the same location.
        if !hasNewlineAfter {
            buffer.insert("\n", at: location)
        }

        buffer.insert(self.wrapped as NSString, at: location)

        if !hasNewlineBefore {
            buffer.insert("\n", at: location)
        }
    }
}

First, check whether a newline before and after the insertion point is required at all.

Then insert the trailing newline, the actual content, and the leading newline from back-to-front.

By starting at the back, we can insert at the same location multiple times and don’t need to compute offsets. Imagine that each insertion shoves the one that came before further to the right, like prepending elements to a list.

With this function, we can write unit tests for Line.insert(into:at:) directly to verify that the lookup works.
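Such a test could look roughly like the following sketch. TestBuffer is a hypothetical test double backed by NSMutableString, the insertion logic is inlined into Line to keep the example self-contained, and the locations are chosen away from the buffer boundaries because those still aren’t checked.

```swift
import Foundation

protocol Buffer {
    func character(at location: Int) -> NSString
    func insert(_ content: NSString, at location: Int)
}

/// Hypothetical test double backed by NSMutableString.
final class TestBuffer: Buffer {
    let storage: NSMutableString

    init(_ content: String) {
        storage = NSMutableString(string: content)
    }

    func character(at location: Int) -> NSString {
        let range = NSRange(location: location, length: 1)
        return NSString(string: storage.substring(with: range))
    }

    func insert(_ content: NSString, at location: Int) {
        storage.insert(content as String, at: location)
    }
}

struct Line {
    let wrapped: String

    init(_ wrapped: String) {
        self.wrapped = wrapped
    }

    func insert(into buffer: Buffer, at location: Int) {
        // Buffer boundaries are not checked, same as in the code above.
        let hasNewlineBefore = buffer.character(at: location - 1).isEqual(to: "\n")
        let hasNewlineAfter = buffer.character(at: location).isEqual(to: "\n")

        // Back-to-front: trailing newline, content, leading newline.
        if !hasNewlineAfter { buffer.insert("\n", at: location) }
        buffer.insert(wrapped as NSString, at: location)
        if !hasNewlineBefore { buffer.insert("\n", at: location) }
    }
}

/// Inserts `Line("X")` into `content` and returns the resulting text.
func result(ofInsertingLineInto content: String, at location: Int) -> String {
    let buffer = TestBuffer(content)
    Line("X").insert(into: buffer, at: location)
    return buffer.storage as String
}
```

With this helper, each of the four cases from the beginning (empty line, end of line, start of line, middle of text) becomes a one-line assertion.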

Mixing Insertable Pieces of Text (This is Where the Big Creative Moment Happened)

With the fixed lazy insertion of newline characters before/after Line values if needed, we can make a more complex use case compile:

Insert(1337) { 
    Line("```")
    "Wait a minute, "
    "this is a string!"
    Line("```")
}

The expected output according to our rules would be:

```
Wait a minute, this is a string!
```

Concatenate the regular strings, make sure to surround the backticks in newline characters as needed.

The transformations we have at this moment don’t support this, and the compiler will complain:

(String, String) -> String   // Concat two inline strings
(Line, Line) -> Line         // Concat two lazy lines with a newline eagerly

To model the sequence of Line, String, String, Line like above was where my approach with Concat<Left, Right> finally came to an end.

Let’s break down the sequence into how the Result Builder sees it:

  1. (Line) -> Line: Start the partial block building sequence.
  2. (Line, String) -> ???: Try to combine Line and String. That’s not defined, yet.
  3. (???, String) -> ???: Try to combine whatever the previous step did with a String. Not defined, either.
  4. (???, Line) -> ???: Try to combine whatever the previous step did with a Line. This is also not defined.

We have learned that concatenating two Line values is simple: we could eagerly make the decision to insert a newline character at the point of fusion. A similar approach to make the decision eagerly during concatenation results in this:

  • To concat a Line with an inline String implies that a newline character needs to go between the two. But the result is not itself a Line, because we don’t want to insert a newline character after the inline string.
  • Similarly, a String concatenated with a Line means that we can decide a newline character at the start of the Line will definitely be needed, and that we only need to check for a trailing newline later. Again, the result is itself not a Line, because we don’t want to enforce a newline before the inline string.

Why not fold String + Line into a new Line?

Because we might want to append text to an already existing paragraph and only afterward add a line break.

For example, given this command:

Insert(123) { 
    "append this inline."
    Line("Start anew.")
}

The expected transformation in a piece of text would be the following:


How can I continue?
Before: End of a paragraph highlighted.

How can I append this inline.⮐
Start anew.⮐

After: Inserting an inline string and then enforcing a new line.

To ensure we don’t add too many line breaks and keep the sequence of inline strings and strings on separate lines, we need to break up the Line itself.

Remember a Line is the representation of two boundary checks, two rules around the insertion of the wrapped text:

  1. “maybe add a newline character before”, and
  2. “maybe add a newline character after”.

It turns out that we can represent both these rules as separate value types, and essentially treat Line as the combination of both:

  • StartsWithNewlineIfNeeded
  • EndsWithNewlineIfNeeded
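As a minimal sketch, each of the two types performs one half of what Line.insert(into:at:) does. The property name wrapped, the StringBuffer class, and the apply(_:to:at:) helper are assumptions to make the example runnable; buffer boundaries are again not checked.

```swift
import Foundation

protocol Buffer {
    func character(at location: Int) -> NSString
    func insert(_ content: NSString, at location: Int)
}

protocol Insertable {
    func insert(into buffer: Buffer, at location: Int)
}

/// "Maybe a newline" before the text; nothing is enforced after it.
struct StartsWithNewlineIfNeeded: Insertable {
    let wrapped: String

    func insert(into buffer: Buffer, at location: Int) {
        let hasNewlineBefore = buffer.character(at: location - 1).isEqual(to: "\n")
        buffer.insert(wrapped as NSString, at: location)
        if !hasNewlineBefore { buffer.insert("\n", at: location) }
    }
}

/// "Maybe a newline" after the text; nothing is enforced before it.
struct EndsWithNewlineIfNeeded: Insertable {
    let wrapped: String

    func insert(into buffer: Buffer, at location: Int) {
        let hasNewlineAfter = buffer.character(at: location).isEqual(to: "\n")
        if !hasNewlineAfter { buffer.insert("\n", at: location) }
        buffer.insert(wrapped as NSString, at: location)
    }
}

/// Hypothetical in-memory buffer to try the types out.
final class StringBuffer: Buffer {
    let storage: NSMutableString
    init(_ content: String) { storage = NSMutableString(string: content) }

    func character(at location: Int) -> NSString {
        let range = NSRange(location: location, length: 1)
        return NSString(string: storage.substring(with: range))
    }

    func insert(_ content: NSString, at location: Int) {
        storage.insert(content as String, at: location)
    }
}

/// Applies `insertable` to a fresh buffer and returns the resulting text.
func apply(_ insertable: Insertable, to content: String, at location: Int) -> String {
    let buffer = StringBuffer(content)
    insertable.insert(into: buffer, at: location)
    return buffer.storage as String
}
```

Inserted into the middle of "ab", the first type only breaks the line before the new text, and the second only after it.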

This is the key insight to make everything else work. If this idea didn’t immediately come to you, don’t worry. It took me a while to realize: I noodled around for about two days with sketches and during conversations before having this idea.

Sure, “letting my subconscious do the work” is not great advice to get reproducible results.

But creative insight is never reproducible like experiments in a lab.

Creative insight comes to us. When it comes, we embrace it, and new paths open up.

Even though this idea-having moment is not reproducible, you know what is? Learning from this particular idea I try to present to you in a way that makes it look like a logical conclusion, so you can go look for similar patterns in your code. That’s all we have.

The missing transformation rules we need to combine inline strings and Line values actualize the potential of a newline character on either side:

(String, Line) -> EndsWithNewlineIfNeeded
(Line, String) -> StartsWithNewlineIfNeeded

That’s great, but now we don’t have two types to mix, we have 4!

To continue the sequence of half-actualized lazy newlines, we need at least a couple more.

Concatenating inline Strings to the side where no newline is requested doesn’t affect the type; it’s a simple string concatenation:

(String, EndsWithNewlineIfNeeded) -> EndsWithNewlineIfNeeded
(StartsWithNewlineIfNeeded, String) -> StartsWithNewlineIfNeeded

Similarly, concatenating an inline string to the opposite end actualizes the lazy rule there, and the result is an inline string with newline characters in the middle:

(String, StartsWithNewlineIfNeeded) -> String
(EndsWithNewlineIfNeeded, String) -> String

Consequently, combining a Line (two potential newlines) with one of the new types (one potential newline) behaves like this when the two potential newlines ‘touch’:

(EndsWithNewlineIfNeeded, Line) -> EndsWithNewlineIfNeeded
(Line, StartsWithNewlineIfNeeded) -> StartsWithNewlineIfNeeded

Concatenated at opposite ends, we end up expanding one potential newline to two potential newlines, aka Line again:

(Line, EndsWithNewlineIfNeeded) -> Line
(StartsWithNewlineIfNeeded, Line) -> Line

That’d be 12 concrete overloads of buildPartialBlock in total, one for each possible transformation.

With these, we can flexibly combine strings and lines and represent each combination as a concrete type.

This vocabulary composes quite beautifully.

Minimum Effective Dose

In practice, we only need 8 of these 12 overloads because some are impossible to form in a Result Builder block.

For example, to get to EndsWithNewlineIfNeeded from the two primitives String and Line, you need to use this exact combination of types in the reduction transformation:

(String, Line) -> EndsWithNewlineIfNeeded

As a consequence, this combination is impossible to reach:

(Line, EndsWithNewlineIfNeeded) -> Line

We can treat this like a simplified term, and de-simplify EndsWithNewlineIfNeeded to our initial, more primitive types:

(Line, (String, Line)) -> Line

That’s a nested input of the form (Line, (String, Line)). We can’t get to that with the builder at all, though; this nesting of three Insertables would have to be formed by this sequence of expressions:

Line("...")
"..."
Line("...")

But since buildPartialBlock reduces pair-wise and processes the sequence from top-to-bottom, the pairing would actually be of the form ((Line, String), Line), which simplifies to (StartsWithNewlineIfNeeded, Line).

The remaining 8 reducer transformations are:

// Self-Concatenation
(String, String) -> String
(Line, Line) -> Line
// Actualizing potential newlines eagerly
(String, Line) -> EndsWithNewlineIfNeeded
(Line, String) -> StartsWithNewlineIfNeeded
// Combining half-actualized newlines
(EndsWithNewlineIfNeeded, String) -> String
(EndsWithNewlineIfNeeded, Line) -> EndsWithNewlineIfNeeded
(StartsWithNewlineIfNeeded, String) -> StartsWithNewlineIfNeeded
(StartsWithNewlineIfNeeded, Line) -> Line

That’s the whole vocabulary we need for the InsertableBuilder.
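As a rough sketch, the eight reducers plus the generic block starter could look like this. To keep it short, the types here only carry their text and Insertable is an empty marker protocol; the lazy checks against a Buffer are left out. The wrapped property names, the build(_:) helper, and the fencedExample function are assumptions for illustration, and the example uses tildes instead of backticks as the fence.

```swift
protocol Insertable {}
extension String: Insertable {}

struct Line: Insertable, Equatable {
    let wrapped: String
    init(_ wrapped: String) { self.wrapped = wrapped }
}
struct StartsWithNewlineIfNeeded: Insertable, Equatable { let wrapped: String }
struct EndsWithNewlineIfNeeded: Insertable, Equatable { let wrapped: String }

@resultBuilder
struct InsertableBuilder {
    typealias Starts = StartsWithNewlineIfNeeded
    typealias Ends = EndsWithNewlineIfNeeded

    // Block starter: same-type passthrough.
    static func buildPartialBlock<I: Insertable>(first: I) -> I { first }

    // Self-concatenation
    static func buildPartialBlock(accumulated: String, next: String) -> String {
        accumulated + next
    }
    static func buildPartialBlock(accumulated: Line, next: Line) -> Line {
        Line(accumulated.wrapped + "\n" + next.wrapped)
    }

    // Actualizing potential newlines eagerly
    static func buildPartialBlock(accumulated: String, next: Line) -> Ends {
        Ends(wrapped: accumulated + "\n" + next.wrapped)
    }
    static func buildPartialBlock(accumulated: Line, next: String) -> Starts {
        Starts(wrapped: accumulated.wrapped + "\n" + next)
    }

    // Combining half-actualized newlines
    static func buildPartialBlock(accumulated: Ends, next: String) -> String {
        accumulated.wrapped + "\n" + next
    }
    static func buildPartialBlock(accumulated: Ends, next: Line) -> Ends {
        Ends(wrapped: accumulated.wrapped + "\n" + next.wrapped)
    }
    static func buildPartialBlock(accumulated: Starts, next: String) -> Starts {
        // No newline is requested on this side: plain concatenation.
        Starts(wrapped: accumulated.wrapped + next)
    }
    static func buildPartialBlock(accumulated: Starts, next: Line) -> Line {
        Line(accumulated.wrapped + "\n" + next.wrapped)
    }
}

/// Hypothetical helper to evaluate a builder block on its own.
func build<I: Insertable>(@InsertableBuilder _ body: () -> I) -> I { body() }

/// Line, String, String, Line reduces to a single Line.
func fencedExample() -> Line {
    build {
        Line("~~~")
        "Wait a minute, "
        "this is a string!"
        Line("~~~")
    }
}
```

The return type of fencedExample documents the reduction: Line and String become StartsWithNewlineIfNeeded, which stays that type when the second String is appended, and the final Line turns it back into a Line.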

Takeaways

Swift Result Builders are cool, but getting started and knowing where to go isn’t apparent if you never used Result Builders before.

Here, I showed one approach to express a somewhat rich Insertable vocabulary in a Result Builder. But I have also skipped a couple of detours, including that I started with buildBlock first, which took a variadic argument of String or Line at first, before figuring out that buildPartialBlock with Concat<Left, Right> would express mixed-type expression lists better, before then discovering that staying with more concrete types trumps even that.

My personal gems from this process are:

  • Do not start with opaque types (some FooProtocol), at least not if you want to express mixing rules. Start Result Builders with concrete types. Then the compiler’s messages will help you find holes in your implementation. You can simplify later – but the more lines of code you have during the exploration phase, the more pieces you can touch to tweak individually. Even at the cost of initial duplication.
  • The eventual vocabulary of 8 pair-wise transformations was not planned like this. It was a discovery. Breaking up the idea of the Line type into StartsWithNewlineIfNeeded and EndsWithNewlineIfNeeded solved all problems and all combinations so far. The pattern underlying this insight is maybe: “Decompose into its parts, then recombine.” Once I was able to stop thinking of a Line value as an atomic, literally indivisible unit, and to see it as the expression of two separate rules instead, the rest became simple.