ARABIZI: another hurdle for counterterrorism efforts


The wave of terrorist attacks in Europe, North America and Australia in the last two years has brought into the open many intelligence and operational shortcomings in the local and international response to them. Among these are the absence of coherent mechanisms for intelligence sharing, the lack of a consolidated suspects index built on agreed principles and spelling, inter-agency rivalries, severe manpower shortages and legal problems in the human rights and privacy domains.

Despite sincere and intensified efforts by all the actors involved to improve this state of affairs, several obstacles remain with no prospect of a quick fix in the foreseeable future. This paper aims to shed light on just one of them: the use of the Latin alphabet to write Arabic, which is popular among young Arabs and is nicknamed ARABIZI.

Arab youngsters, both in the Arab world and abroad, use their smartphones extensively. They are too lazy to switch frequently between Arabic and Latin keyboards and prefer to type their Arabic correspondence in the Latin alphabet. In terms of counterterrorism, this means that special tools have to be developed to tackle the huge volume of traffic on social media and in the blogosphere, not to mention other clandestine means, and to detect and flag troublesome indicators. The complexity of Arabic transliteration, i.e. the variety of ways to spell an Arabic word in the Latin alphabet, makes those efforts very tricky and exhausting.
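To give a rough sense of how quickly the spelling possibilities multiply, here is a minimal Python sketch. The slot table for the common word usually rendered "habibi" is a hypothetical illustration, not a reference mapping. It assumes only the widely seen Arabizi habit of writing some Arabic letters as digits (such as "7") and of varying the vowels.

```python
from itertools import product

# Hypothetical example: each position in the word lists the Latin renderings
# a writer might plausibly choose. Real usage varies by region and by writer.
HABIBI_SLOTS = [
    ["7", "h"],    # first consonant, often typed as the digit 7
    ["a"],
    ["b"],
    ["i", "ee"],   # long vowel, spelled differently by different writers
    ["b"],
    ["i", "y"],
]

def spelling_variants(slots):
    """Return every spelling obtained by picking one option per slot."""
    return sorted("".join(choice) for choice in product(*slots))

variants = spelling_variants(HABIBI_SLOTS)
print(len(variants), variants)
```

Even this toy table yields eight spellings of a single short word, so a keyword list that contains only one of them would miss the rest.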

Many people in both academia and the tech industry are working hard to develop solutions based, among other things, on machine learning and artificial intelligence, in order to produce an effective and user-friendly tool for this task. The sheer volume of traffic, coupled with the chronic shortage of linguists with security clearance, leaves no choice but to run the texts through automatic translation programs based on principles similar to those of "Google Translate" and the like. The major problem is to define the key words or sequences of words that mark a text for translation into English, French, German or other languages. But even then the automatic translator may miss meaningful parts.

The alternative is to widen the scope of the key words or chains of words, at the risk of being flooded with more material than analysts can handle. Most of the solutions produced so far may look good enough in laboratory tests but do not quite deliver in reality. Painstaking efforts are continuously being made to fine-tune the results and to teach the machines to achieve greater precision.

Another problem is that most Arabic given names and surnames have a linguistic meaning, and automatic translation machines that fail to recognize them as names translate them literally as nouns and adjectives. Moreover, some of these Arabizi texts are written in colloquial Arabic, which differs from Modern Standard Arabic in both grammar and vocabulary and is frequently dotted with abbreviations, icons and exaggerated repetition of letters and vowels, such as (in English) "you are greaaaat". A person of North African background, where French is common, will spell the same word in Arabizi differently from a person who hails from a country where English is more common, like Saudi Arabia, Libya or Iraq.
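One common preprocessing idea for this kind of variation is to reduce each token to a rough canonical form before matching it against key words. The Python sketch below is only an illustration under stated assumptions: the rule table is hypothetical and far from complete, and the example word ("chouf" in a French-influenced spelling, "shoof" in an English-influenced one) is chosen for demonstration.

```python
import re

# Hypothetical equivalence rules between French- and English-influenced
# spellings. A real system would need a much larger, data-driven table.
SPELLING_RULES = [
    ("ch", "sh"),  # French-style spelling of the "sh" sound
    ("ou", "u"),   # French-style spelling of the "u" sound
    ("oo", "u"),   # English-style spelling of the "u" sound
    ("ee", "i"),   # English-style spelling of the long "i" sound
]

def normalize(token):
    """Reduce a Latin-script token to a rough canonical form."""
    token = token.lower()
    # Collapse any character repeated three or more times to a single one,
    # so that "greaaaat" becomes "great".
    token = re.sub(r"(.)\1{2,}", r"\1", token)
    # Apply the spelling rules in order.
    for source, target in SPELLING_RULES:
        token = token.replace(source, target)
    return token

print(normalize("greaaaat"))                  # great
print(normalize("chouf"), normalize("shoof"))  # shuf shuf
```

The approach is deliberately lossy. Collapsing repeated letters can merge words that are genuinely different, and fixed rules of this kind cannot tell a name from an ordinary word, which is why output of this sort still needs a human reader familiar with the dialect.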

All these problems result in a considerable amount of material not being properly dealt with, or even being lost. To avoid this there is no alternative but to employ people who are familiar with the different dialects and who can dig out the vital data. The problem in Europe, North America and, to an extent, Australia is a chronic shortage of such people. There are of course quite a number of individuals who possess the skills to deal with the various forms and dialects of Arabizi, but it is questionable whether, given the present state of affairs, they will be granted a security clearance and access to those materials.

One shouldn't envy the contemporary leaders and politicians who are reaping the results of long years of neglect in developing proper linguistic tools and capabilities. What we hear instead are complaints about the Internet or encryption. The public and private sectors need to get much better at tackling the many challenges to counterterrorism efforts, and they need to do it fast. With such huge sums of money flowing into intelligence, policing and security apparatuses around the world, we really should be doing a better job.