Data mining sacred texts

Social media timelines are awash with the results of a textual analysis of the Old Testament, New Testament and Qur’an which, on a very cursory reading, seems to suggest that the Qur’an is a more peaceful text than the Bible. Unfortunately it is one of those feel-good stories, easily shared, that fall apart on closer inspection.

Firstly, the Bible and the Qur’an are very different texts. What would happen if we were to compare biblical oral histories with those of Muslim tradition? Or the Acts of the Apostles to the accounts of early Muslim communities? The New Testament is made up of accounts of the life of Jesus, pseudo-histories and letters of encouragement; though of course it informs the life of the Christian believer, it is of a completely different genre to the Qur’an. The Old Testament is an even more diverse body of literature, containing histories, poetry, canticles, mythology and law, spanning two thousand years.

More pertinently, however, the analysis was undertaken not on original sources in their native languages, but on English translations or interpretations. For the Bible, the New International Version was selected. For the Qur’an, Muhammad Ali’s Ahmadiyya rendering was used. Clearly, data-mining a translation or interpretation rather than the original text is going to skew the results severely, because every count then reflects the translator’s choice of words as much as the source.
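To make the point about translation concrete, here is a toy sketch of a simple word-list count. It is not the method used in the analysis under discussion, whose details are not given here. The word list `VIOLENCE_WORDS`, the function `violence_rate` and the two sample sentences are all invented for illustration; the sentences are not quotations from any scripture or translation.

```python
import re
from collections import Counter

# Hypothetical word list; a serious study would need something far richer.
VIOLENCE_WORDS = {"struck", "destroyed", "killed", "slew"}


def violence_rate(text: str) -> float:
    """Return the share of word tokens in `text` that appear in VIOLENCE_WORDS."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    hits = sum(counts[word] for word in VIOLENCE_WORDS)
    return hits / len(tokens)


# Two invented English renderings of the same imagined source sentence.
rendering_a = "The king struck the city and destroyed it."
rendering_b = "The king overcame the city and brought it down."

print(violence_rate(rendering_a))  # 2 of 8 tokens match the list
print(violence_rate(rendering_b))  # no tokens match the list
```

Both sentences describe the same event, yet one scores a quarter of its words as violent and the other scores none. A count of this kind measures the translator’s vocabulary at least as much as the underlying text.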

It’s true that mining the original texts in Arabic, Hebrew or Aramaic would present its own set of problems. Even in their Hebrew, Aramaic and Greek forms, biblical texts have long histories spanning centuries of oral transmission, the written record and subsequent editing and refinement.

It doesn’t stop there. The nature of language itself is an issue for all traditions. The meanings of words are not independent of religious authority, which itself is not independent of the political establishment; naturally the definitions of words are very often politicised. Even so, a word-for-word analysis of earlier texts would at least avoid some of the layers of interpretational, doctrinal and linguistic bias introduced by the translator.

Textual analysis of this kind no doubt has its place, but it is too limited to be used on its own, other than to generate the kinds of headlines helpful to a small technology company seeking to stand out from the crowd.

A real analysis of sacred texts demands years of very patient work — much more than most of us are willing to pledge — taking in the meanings of surrounding words, grammar, ellipsis, philosophy, practice, historical context, later political developments and so on. On the road to understanding there are no shortcuts: it is a lifetime’s work.

One thought on “Data mining sacred texts”

  1. Assalamu alykum,
    I read about this this morning, in a few places.
    He wrote an algorithm based on the word content of a text.
    He used it on translated texts and lo and behold …
    You can analyse the Quran in its original language Arabic.
    However, as far as I know, we cannot do that for the Bible.
    As far as I can see he had no choice but to use translations.
    Maybe this is one example of a useful algorithm, but one with inherent limitations.
    As far as I could read in the Quran, the Prophets were non violent people.
    Wassalam
