A Short Note on Voice Assistants and Very Old People Who Don't Hear Well

In reaction to my post about the hyperlegible font from earlier this week, people on Hacker News kindly pointed out that voice assistants might be a fix. Thanks for all the ideas and comments, folks.

I didn’t discuss everything about the granny-situation in that post to keep it short, so y’all don’t know everything, of course. I’ll follow up on the topic some later day, because there’s still so much to be said, but here’s the advance summary on voice assistant technology: it only works when you either already know what you have to do, or when your perception is still good enough to learn to use it reliably.

Chances are, if you’re very old, more than one thing will give out with time. My grandmother can’t see well at all, but she hasn’t been able to hear well for even longer. It’s really bad. Her hearing was already way worse than my grandfather’s when he was 90 and she was about 85. Bad luck, bad genes – it is what it is, and you probably don’t know in advance how hard life will hit you. On the plus side, she’s still pretty sharp; on the negative, she’s bored to death because she can’t do anything at all very well to keep herself entertained. So she’s not like a 20-year-old competitive athlete who happens to have bad eyesight but otherwise perfectly fine body functions. Everything’s giving out on her at the same time.

For us young ones, it’s easy to joke about Siri misunderstanding what we said. (It appears to be worse in German, but there are plenty of memes in English, too.) But if you depend on voice input for names, and on audio feedback to notice that the assistant dialed the wrong person, but can’t hear the assistant very well, you’re screwed. My grandmother doesn’t say “Ah bugger off Siri” and try again, or get a laugh out of funny interpretations like the 7-year-olds in our family do. She’s utterly confused because she doesn’t know what happened, why the phone doesn’t seem to dial, or whether it is dialing in the first place, because sometimes she can’t hear the doot … doot sound of the phone when it’s waiting for the other party to pick up. Imagine putting a phone to your ear, and wondering why nobody picks up, while to bystanders it’s obvious that, d’oh, you need to hit the green button to dial the number, first. Which she did try to press, but maybe her thumb hurt that day and she didn’t press hard enough, and the press didn’t register, or whatever.

To her, voice assistants sound like someone’s talking with a potato in their mouth, while being stuffed in a bag and then hidden in the drawer. It’s garbled, hollow, and muffled at the same time.

I’m lucky to be able to talk with her, and to have a sufficiently deep voice that she can pick up, if I’m being loud enough, and pronounce words carefully, and choose words that sound distinct from each other. That’s a lot of ifs, a lot of things that can and do go wrong, and then I have to rephrase something so she can get what I want to get across.

I believe her brain is picking up parts and recognizes pieces of familiar sounds. It then fills in the gaps. Like speed-reading, but without speed, and with a higher error rate. I cannot, I just can not by sound alone tell her the name of the street I live in: “Elpke”. She has never heard that name, so she misunderstands it 100% of the time, even though the syllables don’t seem to be that weird. The p/k sound combo is odd, so I’d totally get if she thought I was living in “Elke”. But she ends up at “Telgte”, with the characteristic “ch” instead of “g”. That word she knows. Her pattern recognition machine still works. The inputs are just too bad.

Anyway, that’s also why my grandmother and her daughter can’t talk about much at all, because my aunt’s voice is too thin and high-pitched. On the upside, my grandmother has known me forever, and my voice hasn’t changed much in the past 15 years or so, I’d guess, so she has had plenty of time to build up expectations of what talking-with-Christian sounds like. Even then, me being the Wizard of Oz-style voice assistant fails.

A robot voice just doesn’t cut it. A pleasant, soft-spoken voice won’t be picked up. You can’t easily make it repeat itself. You can’t make it patient, or come up with ideas to approach the situation from a different angle. Computer says no, and that’s the end of it. It resets its state at some point. My grandmother can’t. She’s changed by the interaction: from a state of wanting to do something to a state of not wanting to do anything at all, because she’s failing, and all the tools are failing her, and the world just got a bit more depressing because of all that.

That’s why voice assistants are not a solution, but huge-ass printouts are. Even illegible printouts are a puzzle she can solve through magnification. The analogue world is one she still understands and can manipulate, to some extent, with her cyborg tools of electronic magnifying glasses and hearing aids.