Logical Ranges and Character Ranges

Jul 3rd, 2024

In my recent post titled “The Rake and Its Prongs” I introduced a function towards the end, called NSRange.isValidInsertionPointLocation(at:). It’s used to consider the after-end location as part of the range.

I have a hard time with that name – and just today renamed it to hasValidInsertionPointLocation, because I found that call sites were getting the subject and object of the ‘sentence’ all wrong.

Read it out loud:

Range, is valid insertion point location at ⎵?

That’s a bit awkward. Since it’s a predicate of sorts, returning a boolean value, the “has” form reads more like an assertion:

Range has valid insertion point at location ⎵.

That works better within an if, I’d like to argue.
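To see how the renamed predicate reads at a call site, here is a minimal sketch. The body of the method is my assumption based on the description above (the after-end location counts as part of the range); the `(at:)` label, the `wordRange` value and the `caret` offset are made up for illustration.

```swift
import Foundation

extension NSRange {
    /// Assumed body: like `contains(_:)`, but the after-end location also counts.
    func hasValidInsertionPointLocation(at location: Int) -> Bool {
        return location >= self.location
            && location <= self.location + self.length
    }
}

// Hypothetical values: a word covering offsets 4..<9, caret right behind it.
let wordRange = NSRange(location: 4, length: 5)
let caret = 9

// Reads as: "if word range has valid insertion point location at caret".
if wordRange.hasValidInsertionPointLocation(at: caret) {
    print("caret touches the word")
}
```

With the "is" form, the same line would read as a question about the range itself, which is what tripped me up.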

But that’s not the real point of me writing this post: it’s the insight that this is all wrong because it’s too simplistic! What we have here is like Primitive Obsession; sticking to a ‘primitive’ value instead of expressing a domain concept.

The point is not that all ranges should perform their contains checks this way, or that Apple’s implementation is wrong. Instead, some call sites need a containment check with different rules, because the range there is not just a character range but something else: a logical range, or semantic range. Implementing this check on NSRange itself is a mistake; it’s the wrong factoring, and a missed opportunity.

Consider a simple next step in pseudo-Swift for a logical word range:

struct WordRange {
    // Replicating NSRange for the moment:
    typealias Offset = Int
    typealias Length = Int
    
    let location: Offset
    let length: Length

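    func contains(insertionLocation: Offset) -> Bool {
        // Same test as before: the after-end location counts as inside.
        return insertionLocation >= location
            && insertionLocation <= location + length
    }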
}

An NSRange should contain whatever is between lower and upper bound, excluding the upper bound. That’s alright.

A logical word range instead represents a range of characters in a text that are considered to be a semantic unit, a word. In this very specific context, the position at the end is very much part of a check that can be called contains(insertionLocation:). An “insertion location” in a buffer, together with the idea of a word range as a potentially expandable, changeable range of characters, fits this check better than the generic NSRange ever could.
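The difference between the two kinds of containment can be sketched side by side. This is a simplified stand-in, not the final type: `WordRange` uses plain `Int` here, the check is my reading of "same test as before", and the buffer contents are hypothetical.

```swift
import Foundation

struct WordRange {
    let location: Int
    let length: Int

    /// Assumed rule: the position right after the last character counts as inside.
    func contains(insertionLocation: Int) -> Bool {
        return insertionLocation >= location
            && insertionLocation <= location + length
    }
}

// Hypothetical buffer "the quick fox": "quick" occupies offsets 4..<9.
let characters = NSRange(location: 4, length: 5)
let word = WordRange(location: 4, length: 5)
let afterEnd = 9

// Character range: the upper bound is excluded, so offset 9 is not a character of the word.
let describesCharacter = characters.contains(afterEnd)

// Word range: typing at offset 9 would still extend the word.
let acceptsTyping = word.contains(insertionLocation: afterEnd)

print(describesCharacter, acceptsTyping)
```

Both answers are correct; they just answer different questions about the same nine offsets.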

Let me try something a bit crazy: I’m spending some conscious effort to push this idea forward, even though the current iteration could be enough to be “more right” than the previous iteration. What could you do to express the limitation of “I’m talking about an editing context, not a text description”? Maybe this:

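protocol BufferRange {
    // Replicating NSRange for the moment:
    typealias Offset = Int
    typealias Length = Int

    var location: Offset { get }
    var length: Length { get }
}

struct WordRange: BufferRange {
    let location: Offset
    let length: Length
}

/// View into a BufferRange for the purpose of performing edits.
struct Editing<R: BufferRange> {
    let range: R

    func contains(insertionLocation: R.Offset) -> Bool {
        // Same test as before: the after-end location counts as inside.
        return insertionLocation >= range.location
            && insertionLocation <= range.location + range.length
    }
}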

extension BufferRange {
    var editing: Editing<Self> { Editing(range: self) }
}

buffer.wordRange(...).editing.contains(insertionLocation: ...)

While this may not be it, I like the idea of expressing a clearer separation between text description (what is inside where?) and text editing (where can I type?) in the domain.
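To make that separation tangible, here is a rough sketch where both questions can be asked of the same range. The `contains(characterAt:)` method is a hypothetical name for the description side that I made up for this example; offsets are plain `Int` to keep it short, and the editing check is again my assumed "after-end counts" rule.

```swift
protocol BufferRange {
    var location: Int { get }
    var length: Int { get }
}

extension BufferRange {
    /// Description side (hypothetical): is the character at `offset` inside?
    func contains(characterAt offset: Int) -> Bool {
        return offset >= location && offset < location + length
    }

    /// Editing side: switch to the view that answers "where can I type?"
    var editing: Editing<Self> { Editing(range: self) }
}

/// View into a BufferRange for the purpose of performing edits.
struct Editing<R: BufferRange> {
    let range: R

    func contains(insertionLocation: Int) -> Bool {
        return insertionLocation >= range.location
            && insertionLocation <= range.location + range.length
    }
}

struct WordRange: BufferRange {
    let location: Int
    let length: Int
}

// Hypothetical word covering offsets 4..<9.
let word = WordRange(location: 4, length: 5)
let end = 9

let isCharacterOfWord = word.contains(characterAt: end)
let canTypeIntoWord = word.editing.contains(insertionLocation: end)
print(isCharacterOfWord, canTypeIntoWord)
```

The call site now has to say which of the two questions it means, which is the whole point of the `editing` hop.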

I’m always amazed by perspective shifts like this and the avalanche of new and related ideas they bring. I could’ve stuck with awkward utility functions like isValidInsertionPointLocation and the boring NSRange for months. Instead, I’m now in a creative and explorative mood, looking for concepts that waited to be expressed and which I could make real, and which suggest follow-ups almost by themselves.