Ask the Indexers: can’t a computer do that?

This is the first in a series of blog posts that ask a panel of experienced indexers questions about indexing practice and life as an indexer. These will be appearing regularly over the next few months as our indexers share their expertise and experiences.

For this post, we tackle the question indexers field most often in their work. How do our panel of indexers answer it?

Christopher Phipps, Advanced Professional Member

A computer can index a book in much the same way as a word processor can write one.

Christine Boylan, Advanced Professional Member

Yes, a computer can produce an ‘index’ of sorts that can superficially look professional and suitable for the reader but it may well consist of a list of words with multiple locators without any consideration as to the relevance or importance of these words. It will also lack the critical and analytical approach to entries and cross-references that make an index useful and easy to access.

Marian Aird, Advanced Professional Member

A computer can word spot, compile lists and put things into alphabetical order. But it can’t make decisions on what a reader might be looking for, or assess the significance of a mention of particular topics within a book. Human indexers make decisions based on knowledge of how people use books and indexes and what the expected level of knowledge of the reader might be. We can also assess the relevance of topics in a book, which can alter in different contexts, or depending on the extent of treatment. Of course, this might change with the advent of AI, but it’s unlikely that human indexers will be redundant very soon.

Michelle Brumby, Advanced Professional Member

It depends on who’s asking! If I sense a genuine interest I might launch into an explanation of the difference between an index and a concordance, how while a computer can produce a list of names and terms, for the time being a human brain does a better job of analysing the content of a text for concepts and aboutness in order to inform decisions on the appropriateness of cross-references, subheadings and double entries or how to word entries so that the most important element of the term appears first, all the while bearing in mind the purpose of the book, subject area and its likely readership and assessing the relative importance of information so that passing mentions may be omitted. Or I may just shake my head in pity.

Rohan Bolton, Fellow

No, a computer cannot compile an efficient, comprehensive and easy-to-use index.

For example, in a book about current Irish politics, a computer would not be able to pick up that all these terms refer to the same two people: Prime Minister, Taoiseach, Simon Harris, Leo Varadkar. Nor would it be able to identify which of the men the first two terms are referring to.

A computer-compiled index for this book would contain long lists of undifferentiated page references for the two politicians, leaving the reader to trawl through all of these to identify a particular piece of information. A human indexer would be able to analyse the text and create helpful subheadings to differentiate between the different aspects (e.g. ‘achievements’, ‘family life’, ‘general elections’, ‘leadership elections’, ‘public service reforms’, ‘fiscal policy’).

A computer is not capable of linking related terms. Professional indexers use see also cross-references to alert users to related terms (e.g. ‘Northern Ireland see also Irish border’) and see references to gather together material between synonymous terms (e.g. ‘taxation see fiscal policy’).

Sue Penny, Advanced Professional Member

If you’re asking that question, you’ve never used an ‘index’ created by a computer! Computers are really good at spotting words, and they’ll make something that looks like an index. But it will be a frustrating one to use. Computers don’t understand synonyms, near synonyms, antonyms, irony, subtlety… all the things that humans are really good at using.

Indexers think like humans; computers don’t think, they follow instructions.

Jan Worrall, Fellow

Up to now computers have only been able to select occurrences of words or possibly phrases, and arrange them in an alphabetical sequence with page numbers. There is no ‘intelligent’ indexing or understanding of user needs. Computer programs cannot provide an insightful analysis of the text, identify implicit concepts or information in images, or recognise synonyms or alternative word forms that need to be linked together. Large entries with many references cannot be broken down intelligently into useful subcategories of topic aspects.

However, AI is beginning to be able to do some of these things, so there may come a time when ‘computers’ can do ‘some’ of this, which may make them a useful tool for indexers, who would still provide an enhanced, superior index to meet users’ needs.

Helen Bilton, Advanced Professional Member

It can. But not very well. Yet.

Computers are great at finding specific things in large amounts of text. They are not great at deciding which of those things are important/useful/pertinent and they are not good at sifting through the various possible meanings of a word, phrase or idiom in the context of the whole text. There is a lot of work to be done in computing to be able to pattern match chunks of text to meaning more accurately, in context, and this will only happen when whole books and their indexes are loaded into AI engines, which isn’t happening on a large scale at the moment for lots of reasons, including copyright. However, once those pattern-to-meaning-in-context relationships are better mapped, search tools will become even more powerful; my prediction is that many books will be produced in e-versions only, with no index, because the search function (currently quite poor, as it can only really search at word level with a bit of morphological stemming) will be powerful enough to achieve most goals.

Nic Nicholas, Fellow

Computers cannot index concepts – for example in a biography if the main character talks about their children as ‘daughter’ or ‘son’ – ‘my youngest accompanied me to the Royal Academy for the Hammershoi exhibition …’ a computer cannot work out the name of this child.

Cross-references for readers who may look up a different phrase meaning the same thing are something a computer cannot supply. For example, a book may discuss the ‘First Amendment’ but not ‘bill of rights’. A cross-reference is useful – ‘bill of rights see First Amendment’.

A computer would list #BlackLivesMatter under # – whereas most indexers would put it under ‘B’.
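The filing point above can be illustrated with a minimal Python sketch. It assumes a plain character-by-character sort as the "computer" behaviour and a hypothetical `filing_key` helper that ignores leading punctuation and case; the sample headings are invented for illustration. The rule encoded in the helper is itself an indexer's decision, not something the sort works out on its own.

```python
import string

def filing_key(heading: str) -> str:
    """Hypothetical filing rule: skip leading punctuation and ignore case."""
    return heading.lstrip(string.punctuation + string.whitespace).casefold()

# Invented sample headings, for illustration only.
headings = ["Brexit", "#BlackLivesMatter", "apartheid", "civil rights"]

# A plain sort compares raw characters, so '#' files before every letter
# and capitals file before lower-case letters.
naive = sorted(headings)

# Sorting on the filing key places the hashtag under 'B'.
filed = sorted(headings, key=filing_key)

print(naive)
print(filed)
```

The plain sort puts `#BlackLivesMatter` first, while the keyed sort files it between `apartheid` and `Brexit`.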

Rob Gibson, Advanced Professional Member

A computer might tell you there is a reference to something or someone on page 2. It will also tell you if there are further mentions on pages 10, 15 and 20. But it can’t judge whether such mentions contain useful information, or whether there is straight duplication of points that have been made before. And it won’t take account of indirect references to a subject, or mentions that use alternative names or expressions. And if, of the literal mentions that it does record, there are a large number of them, all it can do is present a long list of undifferentiated page numbers, leaving the reader to trawl through each in the hope of finding the information sought. In contrast, a human indexer would break the entry down into appropriately-worded subheadings so that the reader would know exactly where to look. A human indexer would also include alternative entries for subjects that could be approached from more than one direction, and cross references from one related entry to another. In essence, a human indexer will look to put themselves in the mind of the reader, and will seek to present relevant information in the form that the reader would expect to find it. Content is therefore unlocked in the way that best suits the book and its readers.
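The kind of literal word spotting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `concordance` function, the page texts and the term list are all hypothetical, and real indexing software may work quite differently.

```python
import re
from collections import defaultdict

def concordance(pages: dict[int, str], terms: list[str]) -> dict[str, list[int]]:
    """Hypothetical word-spotter: list every page where each term literally appears."""
    found = defaultdict(list)
    for page_number in sorted(pages):
        text = pages[page_number]
        for term in terms:
            # Whole-word, case-insensitive match on the literal term only.
            if re.search(rf"\b{re.escape(term)}\b", text, flags=re.IGNORECASE):
                found[term].append(page_number)
    # Alphabetical order of headings, each with an undifferentiated locator list.
    return {term: found[term] for term in sorted(found, key=str.casefold)}

# Invented page texts, for illustration only.
pages = {
    2: "The Taoiseach opened the debate.",
    10: "The Prime Minister replied to questions.",
    15: "The Taoiseach and ministers met.",
    20: "A passing mention of the Taoiseach.",
}

result = concordance(pages, ["Taoiseach", "Prime Minister"])
for heading, locators in result.items():
    print(heading, ", ".join(str(n) for n in locators))
```

The output lists page 10 under a separate heading from pages 2, 15 and 20, even though a reader might treat both terms as the same office, and it gives the passing mention on page 20 the same weight as the others. Merging the headings, adding subheadings and dropping trivial mentions are the judgments the sketch does not attempt.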

This post is part of our Ask the Indexers series. The next post asks What aspects of indexing as a career turned out to be very different from your expectations?