Declarative Text Kit: Reduction in Ugly Code Size

Jun 2nd, 2024

I started working on DeclarativeTextKit mostly out of desperation, because the naive approach wouldn’t do the trick any longer.

I complained about this on Mastodon in the beginning, but until now, I haven’t actually shown the code that made me want to change things.

The Bad and the Ugly

Please take a minute to check out the code that made me write a Declarative Text Kit API. It’s exploratory, procedural code that I wrote to figure out which steps are necessary to surround a selection in triple backticks for fenced code blocks.

You don’t need to read it to see what I’m getting at. You just need to see the structure: the nested conditions, the procedural transformations. And yet, it couldn’t even scale to cover all interesting cases properly.

func toggleCodeBlockProcedural(textSystem: TextSystem) {
    let textStorage = textSystem.textStorage
    let textView = textSystem.textView

    let insertionPointLocation = min(
        textView.insertionPointLocation,
        max(0, textView.nsString.length - 1))
    guard let tokenLookup = textStorage.token(atPoint: insertionPointLocation) else { return }

    if let codeBlockToken = tokenLookup.lastToken(ofType: .blockCodeFenced) {
        let codeBlockRange = codeBlockToken.range(in: textStorage)
        let nsString = textView.nsString

        textView.setSelectedRange(codeBlockRange)

        let mutableCodeBlockString = NSMutableString(string: nsString.substring(with: codeBlockRange))
        let fences = codeBlockToken.childTokens(ofType: TokenType.codeFenceLine)
        let relativeFenceRanges = fences
            .map { $0.range(in: textStorage) }
            .map { NSRange(
                location: $0.location - codeBlockRange.location,
                length: $0.length)
            }
        for range in relativeFenceRanges.reversed() {
            mutableCodeBlockString.replaceCharacters(in: range, with: "")
        }
        let replacement = mutableCodeBlockString as String

        guard textView.shouldChangeText(in: codeBlockRange, replacementString: replacement) else { return }
        textView.replaceCharacters(in: codeBlockRange, with: replacement)
        textView.didChangeText()

        textView.setSelectedRange(NSRange(
            location: codeBlockRange.location,
            length: mutableCodeBlockString.length))
    } else if let blockToken = tokenLookup.lastBlockToken() {

        // TODO: Consider escalating to the largest, not the smallest, block possible.
        if blockToken.lastTokenType() != .blockEmpty {

            let blockTokenRange = blockToken.range(in: textStorage)
            let nsString = textView.nsString
            let affectedRange = nsString.lineRange(for: blockTokenRange)

            textView.setSelectedRange(affectedRange)

            let markers = "```"
            let selectedText = nsString.substring(with: affectedRange)

            let codeBlock = markers + "\n"
                + selectedText + ((selectedText.last == "\n") ? "" : "\n")  // line range contains trailing newline *except* in the last fragmented line
                + markers

            guard textView.shouldChangeText(in: affectedRange, replacementString: codeBlock) else { return }
            textView.replaceCharacters(in: affectedRange, with: codeBlock)
            textView.didChangeText()

            textView.setSelectedRange(NSRange(
                location: affectedRange.location,
                length: codeBlock.nsLength))
        } else {
            let affectedRange = NSRange(location: insertionPointLocation, length: 0)

            let markers = "```"
            let codeBlock = markers + "\n" + markers

            guard textView.shouldChangeText(in: affectedRange, replacementString: codeBlock) else { return }
            textView.replaceCharacters(in: affectedRange, with: codeBlock)
            textView.didChangeText()

            textView.setSelectedRange(affectedRange.offset(markers.nsLength))
        }
    }
}

Having to do everything in reverse is already a cumbersome start, but you can get used to it. (With multiple locations to mutate, you need to insert and delete characters in a string from back to front, otherwise the first operation will invalidate all that come after it.)
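To illustrate the back-to-front rule, here is a minimal sketch that is independent of the app's code. The helper name `deleting(_:from:)` and the sample string are made up for this example; only `NSMutableString` and `NSRange` come from Foundation.

```swift
import Foundation

/// Hypothetical helper: removes all `ranges` (UTF-16 based) from `text`.
/// Assumes the ranges do not overlap.
func deleting(_ ranges: [NSRange], from text: String) -> String {
    let result = NSMutableString(string: text)
    // Process the range closest to the end first. Every deletion only shifts
    // text *behind* it, so the ranges still pending (all closer to the start)
    // remain valid.
    for range in ranges.sorted(by: { $0.location > $1.location }) {
        result.deleteCharacters(in: range)
    }
    return result as String
}

// A fenced code block: opening fence line, one line of code, closing fence line.
let fencedBlock = "```\ncode\n```\n"
let fenceLineRanges = [
    NSRange(location: 0, length: 4),  // opening fence including its newline
    NSRange(location: 9, length: 4),  // closing fence including its newline
]
let stripped = deleting(fenceLineRanges, from: fencedBlock)
```

Going front to back instead, deleting the opening fence first would shorten the string to 9 UTF-16 units, and the second range (starting at location 9) would then reach past the end of the string.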

But having to apply every change in reverse on top of detecting whether the fenced code block markers need to be removed or added, how long they are (3–5 characters are legal), whether there is a language specifier right afterwards, and how many newlines are required before and after the inserted text to make it a proper block – the combination is just too annoying to deal with in the long run.

I know: I’ve dealt with this for the past 6 years for bold and italics and things like that. It works, but there’s hardly any potential for code re-use or for making subtle changes.
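Just one of these checks, inspecting a fence line for its marker length and an optional language specifier, could be sketched like this. `FenceLine` and `parseFenceLine` are hypothetical names for illustration and not part of DeclarativeTextKit or the app's tokenizer; the 3-to-5 backtick limit is taken from the paragraph above.

```swift
import Foundation

/// Hypothetical value type describing a fence line such as "```swift".
struct FenceLine: Equatable {
    let markerLength: Int
    let language: String?
}

/// Hypothetical helper: returns `nil` if `line` does not start with a
/// run of 3 to 5 backticks.
func parseFenceLine(_ line: String) -> FenceLine? {
    let marker = line.prefix(while: { $0 == "`" })
    guard (3...5).contains(marker.count) else { return nil }

    // Everything after the marker, minus surrounding whitespace and the
    // trailing newline, is treated as the language specifier.
    let rest = line.dropFirst(marker.count)
        .trimmingCharacters(in: .whitespacesAndNewlines)
    return FenceLine(
        markerLength: marker.count,
        language: rest.isEmpty ? nil : rest)
}
```

Even this small piece already has three outcomes (no fence, bare fence, fence with language), and the procedural version has to combine it with the newline bookkeeping and the reverse-order edits.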

Now that you’ve seen the abomination that made me realize things needed to change, what’s the current code like?

The Good

With DeclarativeTextKit v0.3 or newer, the same functionality (plus more robust edge case handling) now looks like this:

func toggleCodeBlock(textSystem: TastefulTextSystem) throws {
    let textStorage = textSystem.textStorage
    let buffer = textSystem.buffer

    let insertionPointLocation = min(
        buffer.insertionLocation,
        // If the cursor is at the EOF position, inspect the location right before that because tokens don't extend up until the EOF position
        max(0, buffer.range.length - 1))
    guard let tokenLookup = textStorage.token(atPoint: insertionPointLocation) else { return }

    if let codeBlockToken = tokenLookup.lastToken(ofType: .blockCodeFenced) {
        // TODO: Add FencedCodeBlock type to not do this child token lookup inline
        let codeBlockRange = codeBlockToken.range(in: textStorage)
        let codeFenceLines = codeBlockToken.childTokens(ofType: TokenType.codeFenceLine)
        let codeFenceLineRanges = codeFenceLines
            .map { $0.range(in: textStorage) }

        try buffer.evaluate {
            Select(LineRange(codeBlockRange)) { selectedRange in
                Modifying(selectedRange) { blockRange in
                    for codeFenceLineRange in codeFenceLineRanges {
                        Delete(codeFenceLineRange)
                    }
                }
            }
        }
    } else if let blockToken = tokenLookup.lastBlockToken() {
        // TODO: Consider escalating to the largest, not the smallest, block possible?
        let blockTokenRange = blockToken.range(in: textStorage)
        try buffer.evaluate {
            Select(LineRange(blockTokenRange)) { selectedRange in
                Modifying(selectedRange) { blockRange in
                    Insert(blockRange.location) { Line("```") }
                    Insert(blockRange.endLocation) { Line("```") }
                }

                Select(selectedRange.location + length(of: "```"))
            }
        }
    }
}

The rough structure is similar. And I haven’t even added abstractions that use the token tree properly. It’s all just “fetch the token at this point”, then “pick the nearest block (paragraph) token”, then wrap the character range accordingly.

For example, I currently still need to grab a token and then translate it to UTF-16 ranges to interface with the DeclarativeTextKit API. I want to add an adapter layer to not work with UTF-16 ranges at all, if I can help it, and instead use tokens.
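As a reminder of why UTF-16 ranges are a leaky detail to carry around, here is a small sketch using only Foundation; the sample string is made up. Text Kit's `NSRange` values count UTF-16 code units, which differ from Swift's `Character` counts as soon as emoji or other non-BMP characters are involved.

```swift
import Foundation

let sample = "👍 code"

// Swift counts user-perceived characters ...
let characterCount = sample.count
// ... while NSString (and thus NSRange) counts UTF-16 code units.
let utf16Length = (sample as NSString).length

// Converting a Swift range into the NSRange that a text view expects:
let swiftRange = sample.range(of: "code")!
let nsRange = NSRange(swiftRange, in: sample)
```

The thumbs-up emoji is a single `Character` but occupies two UTF-16 code units, so `nsRange` starts at location 3 even though "code" begins at the third character offset.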

But Good Lord, the actual code to wrap this is already pleasant to work with!

And it handles so many weird cases automagically.

This isn’t a release announcement, it’s just a shout-out that I’m happy with the current state of the API.

You can play with it on GitHub. I’ve also added DocC-compiled documentation, generously hosted by the Swift Package Index!