Transforming JSON to Tables: the Impedance Problem

This is a note from 2022 I found by accident today. Teaser: It will become relevant again, soon, so I want to share the rough idea with you today.


JSON data containing nested object lists within object lists, namely, arrays of objects within other arrays, cannot be mapped directly to a 2D table representation.

You need to cut off the depth at a certain level and make compromises.

The following tablular data, represented with Markdown table syntax:

| title | prices  |
| ----- | ------- |
| Foo   | [1,2,3] |
| Bar   | [4,5]   |

… could be mapped to JSON like so:

[{"title": "Foo",
  "prices": [1,2,3]},
 {"title": "Bar",
  "prices": [4,5]}]

Using this technique, you can also incorporate one extra object level into a comparable table structure, but this is the extent of its capacity.

Here’s an array with an object that basically includes JSON similar to the one above:

[{"shop": "Foo",
  "articles": [{"title": "Bar",
                "price": 100},
               {"title": "Baz",
                "price": 200}]
 }]

You can flatten the array of objects associated with "articles" by collecting both values for the "title" and "price" keys:

| shop | articles.title | articles.price |
| ---- | -------------- | -------------- |
| Foo  | ["Bar","Baz"]  | [100,200]      |

But it quickly grows beyond a scale that is human-readable as a table. You can imagine adding more properties to objects nested in "articles", e.g. a "date" key, that becomes the column "articles.date" in the table.

Adding extra columns is feasible – however, expanding the nested object array to encompass more than a few items renders the table format unwieldy. As shown above, it’s clear that "Bar" and 100, and "Baz" and 200 correlate. But with 10 items? Identifying the 5th or 7th in the sequence becomes arduous, particularly if the titles lengthen.

Although it’s technically feasible to depict a 2-level deep nested JSON structure in such a manner, the method is inadequate.

This is the impedance problem all over again. An object–relational impedance mismatch. How do you map complex objects into (relational) table structures?

JSON itself is already a flexible object store doesn’t have (or need?) a table schema. It quickly grows out of a 2D projection, and using 2D projections limits the real world use of this attempt to very few .json files.