The recently released Google Arts & Culture experiment, Gael Hughes’ An Ocean of Books, is a cute but telling example of the challenges of large heterogeneous datasets. Ostensibly a ‘discovery tool’, it uses basic book metadata from Google Books (itself built on standard library MARC records and classification practices) to group authors into islands in an ‘ocean of books’. But just as ‘the map is not the territory’, the placement of these islands ends up revealing the deep problems with the original classification data.
Here are some example pics. [You might need to tell your mail client to load them!]