11 or so Android Ebook Reader Apps for Academic Writing Workflows: Annotations are Hard

Here’s my personal comparison of Android ebook readers for my Boox eink tablet.

I would love to add drawings as annotations. Ratta Supernote devices do this splendidly by storing the pencil input directly, without handwriting recognition. (Example here.) This is the gold standard. Everything else requires multiple apps (to draw a diagram, for example) and import/export (of notes or EPUB book locations), which is also acceptable, but not ideal.

There’s a month-old overview on Reddit that summarizes the situation nicely: you apparently have to use a proprietary cloud/sync subscription, or you’re out of luck, because all annotation exports suck in some way.

Table of Contents
  1. Android Apps
    1. Onyx Boox Neo Reader (built-in)
    2. Moon+ Reader
    3. KOReader
    4. eLibrary Manager
    5. ReadEra
    6. Foliate
    7. EPUB reader (by Bum bum apps)
    8. Lithium
    9. Study Comfort
  2. Subscription/cloud services I didn’t even try
  3. Emacs
  4. Conclusion

Android Apps

Some links to reader recommendations

Onyx Boox Neo Reader (built-in)

Annotations from the Neo Reader app are exported like this, with placeholders for the actual content:

Reading Notes | <<BOOK FILE NAME>> AUTHOR
TIMESTAMP | Page No.: PAGENUMBER
MULTI-LINE QUOTE OF HIGHLIGHT
 【Note】 ANNOTATION TEXT
-------------------
...

The Onyx Boox device is a Chinese product, so of course there are these “thick” square brackets, called “lenticular brackets”.

Here are some actual reading notes:

Reading Notes | <<Out of the Software Crisis - Baldur Bjarnason_2714c74f-8f8d-4b96-af08-d25255acc9f6>>Baldur Bjarnason
2023-03-29 15:48  |  Page No.: 22
Everything becomes harder and harder until the system is effectively thrown out and replaced. It happens in a number of different ways
-------------------
2023-03-29 15:48  |  Page No.: 23
Our systems die because we keep killing them
【Note】Lists of reasons
-------------------
2023-03-29 15:49  |  Page No.: 26
managers think their job is to extract work from the team. They think variations in performance are because of variations in the work they can extract from each employee
【Note】interessante Behauptung für die vielleicht noch Belege o. ä. im Buch folgen?
-------------------
2023-03-29 15:52  |  Page No.: 28
If something keeps going wrong, that means there isn’t a feedback loop telling the team that it’s going wrong. Shouting at them doesn’t count
【Note】change system pieces for productivity

Parsing this would be simple enough. The first line describes the book, and 19 (!) dashes separate annotations. tail -n +2 skips the book line and starts at the 2nd line of the text, and csplit can then split the rest into one file per annotation.

Example using gcsplit (GNU csplit), which has a couple more options; --suffix-format gives the output files their .txt extension:

$ tail -n +2 export.txt | \
    gcsplit --prefix="note" --suffix-format="%02d.txt" \
        - '/-------------------/+1' '{*}'

I end up with 80 note*.txt files that way; it works.
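If separate files are not needed, the same split can be sketched in Python. This is a minimal sketch based only on the sample export above: it assumes the first line is the book line, that every annotation starts with a "TIMESTAMP | Page No.: N" line, and that everything after 【Note】 belongs to the note. The names parse_neo_reader_export and Annotation are made up for this example.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# A separator is a line consisting of exactly 19 dashes.
SEPARATOR = re.compile(r"(?m)^-{19}[ \t]*$")
# Header line of one annotation, e.g. "2023-03-29 15:48  |  Page No.: 22"
HEADER = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2})\s*\|\s*Page No\.:\s*(?P<page>\d+)$"
)
NOTE_MARK = "【Note】"


@dataclass
class Annotation:
    timestamp: str
    page: int
    quote: str
    note: Optional[str] = None


def parse_neo_reader_export(text: str) -> List[Annotation]:
    # Drop the first line, which describes the book.
    _book_line, _, body = text.partition("\n")
    annotations = []
    for chunk in SEPARATOR.split(body):
        lines = chunk.strip().splitlines()
        if not lines:
            continue
        header = HEADER.match(lines[0].strip())
        if header is None:
            continue  # skip anything that does not look like an annotation
        quote_lines, note_lines = [], None
        for line in lines[1:]:
            stripped = line.strip()
            if note_lines is None and stripped.startswith(NOTE_MARK):
                # Assumption: the note starts at the marker and runs to the end.
                note_lines = [stripped[len(NOTE_MARK):].strip()]
            elif note_lines is not None:
                note_lines.append(stripped)
            else:
                quote_lines.append(stripped)
        annotations.append(
            Annotation(
                timestamp=header["time"],
                page=int(header["page"]),
                quote="\n".join(quote_lines),
                note="\n".join(note_lines) if note_lines is not None else None,
            )
        )
    return annotations
```

Calling parse_neo_reader_export(open("export.txt", encoding="utf-8").read()) would return one Annotation per block, with note set to None for plain highlights.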

Screenshot of exported scribbles

The Boox’s killer feature is the amazing pen input. Visual annotations are exported as PDFs with screenshots of the pages.

In the picture above, you see a page with 2 regular highlights (the inverted texts), plus a scribble at the bottom.

Because the export doesn’t contain information about the page the annotation is on, I just tried a handful of these.

This limitation, again, makes the export useless to me. I can use the Neo Reader app to jump to the page with the annotation, and that works fine.

All in all, the annotation features are good, but as I wrote in another post, the export is meh, and I’m better off processing notes from the device on the device. In other words, like with paper-based books: open the book on the reader and then process all annotations one by one.

For this workflow, the most annoying thing is that handwriting/scribbles and textual annotations are on different tabs.

There’s no unified view, so I can’t go through all my annotations. If a scribble is on a page without any highlights, I could process all my highlights one by one and miss the scribble. In the screenshots, you’ll see that I have annotations on pages 30 and 32, and a scribble on page 31. I’d miss the scribble if I jump from annotation to annotation. In other words, I need to keep track of the page numbers in both tabs to process them in sequence. Not a fan.

Moon+ Reader

I also tried the apparently very popular 3rd party app Moon+ Reader. Its export is of the following format, with placeholders in all-caps:

BOOK TITLE (Highlight: N, Note: M)
----------
◆ CHAPTER TITLE

◼︎ HIGHLIGHTED TEXT .... (ANNOTATION)

Instead of positions or page numbers, you get the current chapter title above the exported annotation. That can be helpful to find a highlighted section in the ePub file, if the chapters are short enough. Good luck with huge fantasy novels.

The TXT export also contains these (odd) Unicode characters at the beginning of lines to separate things. Moon+ Reader is also a Chinese product as far as I know.

KOReader

KOReader and Moon+ are mentioned the most inside the Boox community, I’d say. I can see why KOReader is a fan favorite.

The UI is minimal and works very well with eink devices, that’s a big plus. No animations, clear shapes and lines, large touch targets. They nailed that.

Custom KOReader keyboard. It's high-contrast, but not the native one.

But KOReader decided to use a custom keyboard, so you can’t use the Boox keyboard switcher which supports pen input and handwriting recognition. Why?

List of all bookmarks, notes, highlights in one place

Textual annotations are okay. You can check them out in one place (via “Bookmarks”, where you can filter for highlights, notes, and page bookmarks) and editing notes works well enough.

KOReader shines when it comes to exporting highlights. There’s support for Joplin and Readwise, but I’m mostly interested in Markdown, TXT, HTML, and JSON.

Screenshot of KOReader’s HTML output

Markdown is serviceable:

# Out of the Software Crisis
##### Baldur Bjarnason

## It was great until it wasn’t
### Page 12 @ 25 April 2023 08:00 AM
*Churn is devastating for software quality as it destroys institutional memory and sabotages many of the fundamental mechanisms of programming, which require stability and consistency. Churn in manufacturing or physical product design isn’t nearly as disruptive as in software*

---
keep the programmers

TXT looks like it’d be easy to parse:

 Out of the Software Crisis

 It was great until it wasn’t

  -- Page: 12, added on Tue Apr 25 08:00:56 2023
Churn is devastating for software quality as it destroys institutional memory and sabotages many of the fundamental mechanisms of programming, which require stability and consistency. Churn in manufacturing or physical product design isn’t nearly as disruptive as in software
---
keep the programmers
-=-=-=-=-=-

But, of course, JSON is the easiest to process programmatically:

{
    "file": "/storage/emulated/0/Books/Out of the Software Crisis - Baldur Bjarnason_2714c74f-8f8d-4b96-af08-d25255acc9f6.epub",
    "created_on": 1684747456,
    "entries": [
        {
            "chapter": "It was great until it wasn’t",
            "page": 12,
            "time": 1682402456,
            "sort": "highlight",
            "text": "Churn is devastating for software quality as it destroys institutional memory and sabotages many of the fundamental mechanisms of programming, which require stability and consistency. Churn in manufacturing or physical product design isn’t nearly as disruptive as in software",
            "note": "keep the programmers",
            "drawer": "lighten"
        }
    ],
    "title": "Out of the Software Crisis",
    "number_of_pages": 250,
    "author": "Baldur Bjarnason",
    "md5sum": "d3f1220d162570c6b2256e12de868873",
    "version": "json/1.0.0"
}

Using jq syntax here to denote fields, '.entries[0].text' is the highlighted text, '.entries[0].note' is the note, and '.entries[0].drawer' the “style” of the annotation (there’s highlighting as “lighten”, underline, strike-through, and inverting for stark contrast).
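The same fields can be pulled out with a few lines of Python. This is a rough sketch against the sample export above, not a description of KOReader’s schema: it assumes "time" is a Unix timestamp in seconds (which matches the date in the TXT export) and that "note" may be missing for plain highlights. The function name koreader_entries is made up for this example.

```python
import json
from datetime import datetime, timezone


def koreader_entries(raw_json):
    """Flatten a KOReader JSON export into one dict per annotation."""
    export = json.loads(raw_json)
    rows = []
    for entry in export.get("entries", []):
        # Assumption: "time" holds Unix epoch seconds.
        time = entry.get("time")
        added = (
            datetime.fromtimestamp(time, tz=timezone.utc).isoformat()
            if time is not None
            else None
        )
        rows.append(
            {
                "title": export.get("title"),
                "chapter": entry.get("chapter"),
                "page": entry.get("page"),
                "added": added,
                "style": entry.get("drawer"),  # e.g. "lighten"
                "text": entry.get("text", ""),
                "note": entry.get("note"),  # None for plain highlights
            }
        )
    return rows
```

Each row carries the book title along, so entries from several exports can be concatenated into one list later.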

But again – page numbers!

At least the JSON output contains '.number_of_pages' so you can compute the percent offset yourself. Page 12/250 is 4.8%; Calibre’s web reader insists that the same book has 200 “pages”, so the location to go to would be 4.8% × 200 = 9.6, but that’s not the location I’m looking for. Searching for the phrase (“Churn is devastating…”) shows that the actual location is 6.0. So all in all, the page numbers are just as useless unless you find an ebook reader for your real computer that can make sense of the page numbering of KOReader. (Spoiler: Apple’s Books app doesn’t, and Emacs ereader-mode doesn’t either.)
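The arithmetic from the paragraph above, as a small Python sketch. It only does the linear scaling between two page counts; the function names are made up, and as the numbers show, the result (9.6) does not match the location where the phrase actually turns up (6.0), so this is an illustration of the problem rather than a fix.

```python
def percent_offset(page, number_of_pages):
    """Position of a page as a percentage of the whole book."""
    if number_of_pages <= 0:
        raise ValueError("number_of_pages must be positive")
    return 100 * page / number_of_pages


def scale_page(page, source_pages, target_pages):
    """Linearly map a page number from one pagination to another."""
    if source_pages <= 0:
        raise ValueError("source_pages must be positive")
    return page * target_pages / source_pages


print(percent_offset(12, 250))   # 4.8
print(scale_page(12, 250, 200))  # 9.6
```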

The more time I spend on this, the more I wonder if I should’ve used PDFs (and a device with a larger screen) instead.

KOReader’s issue tracker lists 2 issues when you search for “cfi” (EPUB Canonical Fragment Identifiers), but it’s not going to happen at the moment.

For what it’s worth, KOReader also comes with

  • a terminal emulator
  • a text editor

eLibrary Manager

eLibrary Manager Basic is free but doesn’t offer annotations; the eLibrary Manager (Pro) version costs EUR 1.59 at the moment (2023-05-22) so I tried that. We’re still in “cost of a cup of coffee” territory here.

Its ePub reader tells me that Bjarnason’s book has 123 pages in total. That’s almost half of what the others say!

By default, page turns are animated. You can turn this off (good). Page refresh then still looks weird, though, as if the screen clears and then renders in vertical lines from left to right. I wondered if disabling animations just removes half of the transition, and this is a bug, but the 5.0.1 release notes revealed:

  • Due to reader rendering and animation issues that occur on different WebView versions, the following updates have been made:
  • Disable software layer rendering for WebView versions greater than or equal to 110.
  • Allow toggling of software layer WebView rendering through Advanced ePub Reader Setting, in case the default behaviour is not satisfactory.

The “Software Layer” setting was set to “Default”. (It does not indicate whether this means ‘on’ or ‘off’, confusingly.) I tried to disable it; no change. Then I tried to enable it, and now the animations, or rather the page turn glitches, are gone for good.

Reading is nice. It renders the book well, period.

Unlike the Neo Reader application,

  • it correctly displays list bullets vertically centered next to the first line of a list item (which I shouldn’t even need to mention, but the Neo Reader app is that wonky), and
  • it also correctly interprets a bit more advanced CSS, e.g. to display attributes from heading tags as chapter numbers. (Which is likely not useful that much, but Bjarnason’s book does use these.)

So as a good ebook reader, the money is not wasted.

(But KOReader does everything better in my opinion.)

List of bookmarks, highlights, and annotations; no export from there, though. You need to go to another menu for that

Selecting text is clumsy: long-press to open a context menu, select “Highlight”, then select text. That’s not going to age well. It’s the same process for “Bookmark/Note”: the text is highlighted and a note is added. You need to long-press and select “Bookmark/Note” again to edit the highlight, though. The primary interaction doesn’t seem to be on the textual level for some reason. It’s like I only have right-click menus to do anything, and that’s super weird for an ebook reader because what else do you interact with? Links, probably, but distinguishing tap from long-press would do the trick there.

You do notice how eink tablets aren’t the primary use case of this app.

Export options: Does not look like exporting annotations was the primary use case

Export of book information to Calibre requires the app author’s “Calibre Documents Provider” app, which costs the same as the reader. It bridges Calibre’s library to Android’s file system, more or less? I didn’t try that, because I already do sync my Calibre books with, well, Calibre Sync (on iOS and Android). If annotations were exported, I might be interested in switching. But updating book metadata isn’t useful (to me). So I passed.

The export is a JSON file with these contents:

[
   {
      "title": "Out of the Software Crisis",
      "creator": [
         "Baldur Bjarnason"
      ],
      "bookmarks": [
         {
            "page": 12,
            "excerpt": "build software development",
            "note": "bookmark note test"
         }
      ],
      "highlights": [
         {
            "page": 11,
            "excerpt": "software ship begins to si",
            "text": "software ship begins to sink. ",
            "colour": 0
         }
      ]
   }
]

On the plus side, the JSON is very minimal. I don’t care about the root-level separation of “bookmarks”/“highlights”, though. And again, it has useless page numbers.

Please do note that the root object of this file is a JSON array. The file is called export.1684755174801.json. So it will contain all book annotations, I guess? Oof.
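Merging the two lists back into one sequence takes only a few lines of Python. This is a sketch against the sample above and nothing more: it assumes the root array holds one object per book, that every item has a "page", and that highlights carry the full quote in "text" while bookmarks only have an "excerpt". The function name flatten_export is made up for this example.

```python
import json


def flatten_export(raw_json):
    """Merge bookmarks and highlights of all books into one sorted list."""
    rows = []
    for book in json.loads(raw_json):  # the root object is an array of books
        for kind in ("bookmarks", "highlights"):
            for item in book.get(kind, []):
                rows.append(
                    {
                        "title": book.get("title"),
                        "kind": kind[:-1],  # "bookmark" or "highlight"
                        "page": item.get("page"),
                        # Prefer the full text, fall back to the short excerpt.
                        "text": item.get("text") or item.get("excerpt", ""),
                        "note": item.get("note"),
                    }
                )
    # Sort by book first, then by page, to read annotations in sequence.
    rows.sort(key=lambda row: (row["title"] or "", row["page"] or 0))
    return rows
```

For the sample export, the highlight on page 11 would come first, followed by the bookmark with its note on page 12.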

ReadEra

ReadEra highlights in one list; the dark-by-default UI doesn't work well on eink

This app does install, it does find ebooks, and it does do highlights. But the format is:

BOOK TITLE
AUTHOR

HIGHLIGHTED TEXT 1
--
ANNOTATION 1

*****

HIGHLIGHTED TEXT 2

*****

HIGHLIGHTED TEXT 3
--
ANNOTATION 3

No page or location indicators at all. The line-based output and the separators (two dashes -- between highlighted text and the annotation, and five asterisks ***** between highlights/quotes) would make it simple to split the output into pieces. That’s somewhat useful. But without any location indicators, this is mostly good to export highlighted quotes. Otherwise, you’d have to add the location into your annotation manually.
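Splitting on those separators could look like this in Python. It is a minimal sketch that assumes the layout shown above holds exactly: title on the first line, author on the second, a line of five asterisks between highlights, and a line of two dashes before the optional annotation. The function name parse_readera_export is made up for this example.

```python
import re

# Five asterisks on a line of their own separate the highlights.
HIGHLIGHT_SEPARATOR = re.compile(r"(?m)^\*{5}[ \t]*$")
# Two dashes on a line of their own separate quote and annotation.
NOTE_SEPARATOR = re.compile(r"(?m)^--[ \t]*$")


def parse_readera_export(text):
    lines = text.splitlines()
    # Assumption: the first two lines are book title and author.
    title = lines[0].strip() if lines else ""
    author = lines[1].strip() if len(lines) > 1 else ""
    body = "\n".join(lines[2:])

    highlights = []
    for chunk in HIGHLIGHT_SEPARATOR.split(body):
        chunk = chunk.strip()
        if not chunk:
            continue
        parts = NOTE_SEPARATOR.split(chunk, maxsplit=1)
        quote = parts[0].strip()
        note = parts[1].strip() if len(parts) > 1 else None
        highlights.append((quote, note))
    return {"title": title, "author": author, "highlights": highlights}
```

The result still has no location information, so any page reference would have to come from the annotation text itself.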

Foliate

The open source GTK app Foliate was mentioned on Hacker News. I don’t know why it was mentioned in that thread, because it’s not available on mobile/Android. But it’s looking like a great option for Linux. Annotations are stored as JSON with EPUB Canonical Fragment Identifiers instead of page numbers. Great idea. I’d love to have that.

EPUB reader (by Bum bum apps)

The name, “Foliate”, brought me to the FolioReader GitHub Team: they offer Kotlin/Android and Swift/iOS SDKs. Not a lot of releases in recent years, but the absence of PDF support and focus on EPUB and highlighting got me interested. The FolioReader-Android repository does collect Play Store links to apps using the library, fortunately.

EPUB reader (by Bum bum apps) is a more recent addition to that list, uses the FolioReader library, and is not a reader app for comics.

Reading is limited to scrolling; there are no pagination features. Scrolling can work on the Boox Nova Air2 if you change the refresh mode a bit. But it’s not optimal.

You can highlight text, but grayscale eink isn’t supported well, and export is per-highlight: so you can only share quotes, more or less. From the book overview, highlights behave like bookmarks.

There are no textual annotations.

Lithium

This app was recommended here and there, and there’s a very promising HTML annotation export (yes, the export is a structured format!), plus open-source tools to extract info from the export – but the app is not compatible with the Nova Air2, it seems. I can’t install it from the Play Store.

Installing the .apk from elsewhere works, and I can open the app, but I can’t read any book. It won’t display content and remains stuck in the “loading” phase.

Study Comfort

From the app announcement on Reddit and the screenshots, I was looking forward to testing this, but apparently this never left the first public beta stage. Annotations are very bare-bones, and inserting a textual note is error-prone (e.g. “invalid note position” error). So this didn’t work.

Subscription/cloud services I didn’t even try

  • Google Play Books. Because Google.
  • BookFusion: can’t miss this since the founder posts links all over HN and Reddit. (Nothing wrong with that.) CSV and Markdown export sound good. Subscription pricing doesn’t.
  • Readwise’s Reader, because it’s also a SaaS.
  • Zotero, a platform I don’t dislike, syncs your library and annotations if you find an app that speaks ‘Zotero’. I already have my ebooks synced on my NAS and made available via Calibre, so I don’t really want another library cloud storage. Also, Zotero’s annotation seems to be limited to PDF, so no ePub and thus no “reflow” of content.
  • A lot of apps do way more than just book reading and highlighting; so I wasn’t interested.
  • iOS apps are aplenty;
    • MarginNote 3, for example, would be available on iOS and macOS (also via Setapp). That looks like a well-made application, but doesn’t help in this situation.
    • Polar doesn’t do EPUB, yet, but has Anki flash-card export. Not available on Android.

Emacs

… yeah, sorry.

The Boox tablet does run Termux, and you can install Emacs 28 there and use the terminal version.

There’s also the real native Emacs port, offering version 30 (currently being worked on) in a GUI.

Emacs is useless without a keyboard, of course. You can use the GUI version to open .epubs with Emacs from the toolbar, but you can hardly switch the mode to e.g. ereader-mode. If you could, you could annotate your ebook using org-mode notes, maybe.

This is a very nerdy idea, and I’d love to try that once, sometime, but it’s not solving any of my problems. (Not like Emacs ever does, you might say …)

Conclusion

Ryan West was/is on a similar journey. Zotero seems to be superb for PDF annotations. But for ePub, there’s no annotation standard, and each app is doing things differently.

I checked Open Annotation in EPUB, Draft Specification (23 July 2015): it looked like it never went anywhere, but folks on the W3C EPUB 3 Community Group responded kindly on the mailing list. It’s being discussed on the public GitHub repo a lot, and Hypothes.is is mentioned as a notable implementation. But there’s no widespread adoption, that’s certain.

On large (or rather: huge) Boox devices, you could use split screen to combine highlights in your EPUB/PDF files on one side with handwritten notes on the other. This would be similar to reading a real book with a real note-pad. But you need absolute page numbers as references to make both media work together. EPUB readers don’t gel with this, so that’s not the workflow I’m looking for. You could use a reader that produces the Calibre-compatible location identifiers instead of page numbers, but it’s still a chore. Page numbers in books work better for this.

So what’s the verdict, then?

I have no happy answer.

KOReader is a very good eink-enabled EPUB reader. It’s open source, so maybe one can tinker with that. So it’s either that, with the custom keyboard I hate, or the Boox’s native Neo Reader, with the weird vendor lock-in and proprietary format.

The Supernote is unmatched, software-wise, thus far. But the Boox hardware, color pencil input, and Android base are promising.