Facebook Used the Bible to Train its AI Reader

Meta, the tech behemoth behind Facebook, is constructing a sophisticated text-to-speech AI language model to push the boundaries of typical language translation. The internet is churning with stories about AI concocting its own unique language, which may have led to a shutdown. Speculations range from the AI model losing efficacy to it devising a secret language to orchestrate a global AI takeover. 

Will Shanklin, a contributing author at Engadget, wrote a simple article introducing Meta's goal and the topic. 1440 shared a single link with the digest line reading, "Meta (Facebook) releases open-source AI platform capable of recognizing more than 4,000 languages and producing speech-to-text and text-to-speech in around 1,100 of them." Given all the buzz with AI today, Shanklin's article and 1440's digest line are forgettable. If not for the work the church I pastor is trying to do to bridge a ministry toward English and Ukrainian speakers, it would have gone by with no interest on my part. But something caught my attention when my eye hit the third paragraph.

It turns out, Meta used audio recordings of religious texts like the Bible to help train their text-to-speech language model. It shouldn't be surprising that Christians are concerned about translating the Bible into as many languages as possible. We desire that the Word of God be preached in every language to every tribe, nation, and people. Christians have dedicated their lives to learning languages for this purpose, teaching reading and literacy. Many have even given their lives at the hands of those who resist the spread of the Gospel of Jesus Christ. With so many available translations of the same material, it makes sense that this would help people and AI models learn to translate more effectively. It's like the Rosetta Stone of our day.

But then, the narrative takes an unexpected turn.  

Shanklin continues, "If you're like me, that approach may raise your eyebrows." Raise my eyebrows? It turns out, we should be concerned about a Christian bias toward a biblical worldview in our AI models. There's also a concern that, because so many of the audio recordings were made by men, a male bias could be trained into the AI models. He assures his readers that we need not be concerned because the constraints of this system compared to other systems kept the model safe from this kind of bias.

This raised more than just my eyebrows; it raised two primary concerns.

First, the article speaks highly of Meta's desire to translate material into local, native, heart languages. Anyone with a goal like that is standing on the tall shoulders of those who came before, and no group of people has devoted more time, money, and effort to translation than Christians. Christians have also been highly involved in literacy and education in other parts of the world, maybe more than any other group. The material wouldn't be available to train AI models without Christians' profound desire for translation, with solid roots as far back as the sixteenth century. What should raise my eyebrows is the lack of any appreciation or respect for the Christian researchers and innovators who have come before.

Second, the specter of bias is inescapable. Are we not aware of a bias for things we're biased against? If the author thinks my eyebrows should be raised because of discrimination against particular data sets in multiple languages across multiple groups across numerous centuries, we may see his bias. Furthermore, no data fed into these models is neutral. There is no unbiased data. If it's audio read by a human, it will be read by a woman or a man or a man claiming to be a woman or a woman claiming to be a man, or a person who is confused about such things, a male child or a female child, or another computer with no gender. The read content will be intended to entertain, educate, or persuade. In one way or another, all data will contain bias.

The only way we can really deal with bias when working with this much data and AI machines is to be aware of bias. Statements like "raised eyebrows" show bias, and that's fine. The author doesn't want any Christian bias, likely because the author is against Christian beliefs and Christian thinking. If we flipped the context, would he still have included a statement about raised eyebrows? Adolf Hitler's Mein Kampf, The Communist Manifesto by Karl Marx and Friedrich Engels, Timothy McVeigh's manifesto, including his "Essay on Hypocrisy," the script for the 1984 Terminator movie, and "Baby Shark" lyrics have all been uploaded to ChatGPT. Should that raise eyebrows? Might that be bias? (And also, might that give AI more training we may not want AI to have? Yes, I recognize my bias.)

In the final analysis, I am pleased to know that all the translation work that has gone into translating the Bible is not only providing God's Word in other languages but also helping train machines to learn new languages. If Meta’s AI model imbibes some values and morals from the Bible, perhaps the much-feared AI apocalypse may just be averted. In any case, it’s good to see Facebook finding some value from the Bible.