Declarative Text Kit: Word Ranges

I figured out a way to consistently change an NSRange, e.g. of a selection of text or the insertion point in a text view, to select surrounding words in DeclarativeTextKit.

But first off:

Dear heavens! This wasn’t easy.

I’m still not happy with the solution. While there are three to four hundred test cases (the vast majority of them generated) that helped me narrow things down, I can tell you up front that I’m not proud of the resulting 160 lines of code.

What’s a Word?

Figure: A sketch from late in the process, made to figure out how to deal with insertion point locations in whitespace.

The rules sounded simple at first.

  • Expand to sequences of consecutive “letters” based on the input range.
  • Skip punctuation marks and whitespace and symbols.
  • If there’s nothing else around, select punctuation marks or symbols, though.
  • If what comes next is a punctuation mark, don’t just select that, but any letters that come after it.
  • But if letters come first, don’t include symbols!
  • Do this for various input languages.[1]
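To make the categories in these rules concrete, here is a minimal sketch of how a single character could be sorted into them with Foundation's CharacterSet. The CharacterClass enum and the characterClass(of:) function are hypothetical names for illustration, not DeclarativeTextKit's API, and counting digits as “letters” is an assumption.

```swift
import Foundation

/// Hypothetical classification for illustration; not DeclarativeTextKit's types.
enum CharacterClass {
    case letter, punctuation, symbol, whitespace, other
}

/// Classifies one composed character sequence by its first Unicode scalar.
/// Assumption: digits are treated like letters.
func characterClass(of character: String) -> CharacterClass {
    guard let scalar = character.unicodeScalars.first else { return .other }
    if CharacterSet.whitespacesAndNewlines.contains(scalar) { return .whitespace }
    if CharacterSet.alphanumerics.contains(scalar) { return .letter }
    if CharacterSet.punctuationCharacters.contains(scalar) { return .punctuation }
    if CharacterSet.symbols.contains(scalar) { return .symbol }
    return .other
}
```

Classifying is the easy part. The rules above are about which class wins depending on what surrounds the input range, and that is where the edge cases come from.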

The challenge is that the base implementation can’t use NSTextView or NSTextStorage. NSAttributedString (and with it NSTextStorage, a mutable subclass) has nextWord(from:forward:) (thanks to Daniel Jalkut for the pointer!), but sadly NSString doesn’t. So if you work with a plain string, you’re on your own.

While Swift.String consists of Characters, which would be a blast to iterate over, at the moment, we need NSString for NSRange compatibility.
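A small sketch of why the two string types don't line up: Swift counts user-perceived characters, while NSRange offsets count UTF-16 code units. The sample text is made up for illustration; the conversions used are Foundation's Range(_:in:) and NSRange(_:in:) initializers.

```swift
import Foundation

let text = "👍🏽 ok"

// Swift.String counts Characters; NSString counts UTF-16 code units.
let characterCount = text.count
let utf16Length = NSString(string: text).length

// The word "ok", expressed in UTF-16 units as an NSRange would be.
let okRange = NSRange(location: 5, length: 2)

// Convert the NSRange to String indices; this is nil if the range is invalid.
let okSubstring = Range(okRange, in: text).map { String(text[$0]) }

// And the other way around, from String indices back to an NSRange.
let roundTrip = text.range(of: "ok").map { NSRange($0, in: text) }
```

Here the emoji is a single Character but takes up four UTF-16 units, so the character count and the NSString length differ. Every offset in an NSRange counts UTF-16 units, so character-wise iteration has to report its positions in those units, too.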

That means NSString.enumerateSubstrings(in:options:using:) with NSString.EnumerationOptions.byComposedCharacterSequences is the best API offered natively. That didn’t look bad at first but became worse as the edge cases piled up.
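For illustration, here is a minimal sketch of that API in use. It only covers the very first rule, expanding an insertion point to the surrounding run of letters, and ignores punctuation, symbols, and whitespace handling entirely. The letterRun(around:in:) helper is a hypothetical name, not the library's implementation, and it assumes the location does not fall inside a composed character sequence.

```swift
import Foundation

/// Hypothetical helper: expands an insertion point to the run of
/// letters and digits that surrounds it. Not the actual implementation.
func letterRun(around location: Int, in string: NSString) -> NSRange {
    let wordCharacters = CharacterSet.alphanumerics
    func isWordCharacter(_ substring: String?) -> Bool {
        guard let substring = substring, !substring.isEmpty else { return false }
        return substring.unicodeScalars.allSatisfy { wordCharacters.contains($0) }
    }

    var start = location
    var end = location

    // Walk forward from the location until a non-letter shows up.
    string.enumerateSubstrings(
        in: NSRange(location: location, length: string.length - location),
        options: .byComposedCharacterSequences
    ) { substring, substringRange, _, stop in
        guard isWordCharacter(substring) else { stop.pointee = true; return }
        end = NSMaxRange(substringRange)
    }

    // Walk backward from the location, again stopping at the first non-letter.
    string.enumerateSubstrings(
        in: NSRange(location: 0, length: location),
        options: [.byComposedCharacterSequences, .reverse]
    ) { substring, substringRange, _, stop in
        guard isWordCharacter(substring) else { stop.pointee = true; return }
        start = substringRange.location
    }

    return NSRange(location: start, length: end - start)
}
```

Two enumerations, one per direction, are already needed for this simplest case. Each additional rule adds more conditions inside the closures.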

I think I’ll be trying out the manual while loop again with a simple state machine to make the code less weird. But not today.

Achievements

This unlocks one foundational feature: Buffer.wordRange(for:) trims whitespace from the selection, then expands it to cover as many characters as required to select what the user may consider to be a “word”.
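The trimming half of that can be sketched in a few lines. The trimmedRange(_:in:) function below is a hypothetical stand-in to show the idea, not what Buffer.wordRange(for:) actually does internally.

```swift
import Foundation

/// Hypothetical helper: shrinks `range` so that it neither starts nor ends
/// with whitespace. Not the library's implementation.
func trimmedRange(_ range: NSRange, in string: NSString) -> NSRange {
    let whitespace = CharacterSet.whitespacesAndNewlines
    func isWhitespace(at index: Int) -> Bool {
        // Surrogate halves don't form a scalar on their own and never count as whitespace.
        guard let scalar = Unicode.Scalar(string.character(at: index)) else { return false }
        return whitespace.contains(scalar)
    }

    var start = range.location
    var end = NSMaxRange(range)
    while start < end, isWhitespace(at: start) { start += 1 }
    while end > start, isWhitespace(at: end - 1) { end -= 1 }
    return NSRange(location: start, length: end - start)
}
```

A range that contains nothing but whitespace collapses to an empty range in this sketch. What should happen from there is exactly the kind of case the word rules have to decide.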

With that, we can now express two things: word selection, and insertions that respect word boundaries.

This will select a word at NSString location 123:

try buffer.evaluate {
    Select(WordRange(location: 123))
}

This will change some text and expand the selection to word boundaries:

try buffer.evaluate {
    // In an arbitrary part of the text ...
    Select(location: 10, length: 20) { selectedRange in
        // ... wrap it in parentheses and end with a triple bang.
        Modifying(selectedRange) { editedRange in 
            Insert(editedRange.location)    { " (" }
            Insert(editedRange.endLocation) { "!!!) " }
        }
        // Finally, expand to full word ranges.
        Select(WordRange(selectedRange))
    }
}

Because inserting spaces left and right can be brittle, producing unwanted double-spacing, I also added an insertion helper:

Insert(someLocation) { Word("excellent") }

This splices the word "excellent" into the buffer at that location and ensures there’s one space, max, left and right. A space is only added on a side where it is needed.

No space needs to be added when there is existing whitespace, for example.
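To illustrate the padding rule, here is a rough sketch that computes the text to insert for a given location. The paddedWord(_:at:in:) function is hypothetical and not how Word is implemented. It also assumes that no padding is added at the very start or end of the text, which is a guess for this sketch.

```swift
import Foundation

/// Hypothetical helper: returns `word` with a space added on each side
/// where the neighboring character is not already whitespace.
/// Assumption: no padding is added at the start or end of the text.
func paddedWord(_ word: String, at location: Int, in string: NSString) -> String {
    let whitespace = CharacterSet.whitespacesAndNewlines
    func isWhitespace(at index: Int) -> Bool {
        guard let scalar = Unicode.Scalar(string.character(at: index)) else { return false }
        return whitespace.contains(scalar)
    }

    // A leading space is needed if something non-whitespace comes right before.
    let needsLeading = location > 0 && !isWhitespace(at: location - 1)
    // A trailing space is needed if something non-whitespace comes right after.
    let needsTrailing = location < string.length && !isWhitespace(at: location)

    return (needsLeading ? " " : "") + word + (needsTrailing ? " " : "")
}
```

Inserting in the middle of "foobar" would pad both sides, while inserting next to an existing space pads only the other side.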

Here’s an example to make the selected text bold in Markdown:

Modifying(buffer.selectedRange) { wrappedRange in
    Insert(wrappedRange.location) { Word.Prepending("**") }
    Insert(wrappedRange.endLocation) { Word.Appending("**") }
}

These are “half-open wrappers”: Prepending is for the leading part, adding space before the location if needed, while Appending is for the trailing part, adding space after the end location if needed.

  1. I tried to add strings with Arabic writing, but Xcode refused to deal with this properly, so I gave up eventually. But Pinyin works.