Write-ups

Censorship of social media in Qatar

Note: I have lived in Qatar since 2012, working for a local research institution as a computer scientist specializing in social media. As with everything on this blog, my personal opinions do not reflect the position of the institutions I'm part of.

Despite widespread criticism, Qatari authorities have promulgated a new "Cybercrime Prevention Law". The law basically addresses three very distinct topics. The first topic (Chapter 1) is related to unauthorized access to computer systems, stealing or deletion of data, electronic fraud, etc., which together make up what is usually considered "cybercrime," i.e. crimes that involve a computer or network.

There is, however, a second topic (Chapter 2) that is not cybercrime but what the law refers to as "Content Crimes". Content crimes include helping terrorist organizations or disseminating child pornography, both punished with up to 3 years in prison and a fine of up to QR500,000 (~140K USD). They also include electronic forgery and blackmail.

Prison for "false news" or violating "social values" online

Between the articles about terrorism and the ones about child pornography there is a vague provision regarding "false news" that basically extinguishes freedom of the press in Qatar, which is guaranteed in article 48 of its Constitution:

Article 6.- A sentence of not more than three years and a fine of not more than QR500,000 (~140K USD), or either of these penalties, shall be imposed on any person who through an information network or an information technology technique sets up or runs a website to publish false news to threaten the safety and security of the State or its public order or domestic and foreign security. A sentence of not more than a year in a prison and a fine of not more than QR250,000, or either of these penalties, shall be imposed on any person who promotes, disseminates or publishes in any way such false news for the same purpose.

Next, between the article about child pornography and the one about blackmail, there is an article that ends freedom of expression in Qatar, which is guaranteed in article 47 of its Constitution (emphasis added):

Article 8.- A sentence of not more than three years in prison and a fine of not more than QR100,000 (~27K USD), or either of these penalties, shall be imposed on any person who, through an information network or information technology technique, violates social values or principles, publishes news, photos or video or audio recordings related to the sanctity of people’s private or family life, even if the same is true, or insults or slanders others.

Additionally, a third major topic (Chapter 5) establishes a maximalist view of intellectual property, in which copyright infringement is punished with up to 3 years in prison and a fine of up to QR500,000 (~140K USD). This is approximately the fine that US law provides, which is one of the largest in the world (up to 150K USD per infraction), with the addition of jail time. Copyright law has been repeatedly used in the past in several countries to censor expression; for instance, reproducing a past speech of someone without his/her authorization has been construed as a copyright violation.

What does it mean?

Personally, I find this extremely disheartening and a tremendous setback for a country that is progressing on many fronts.

The opinions of anybody are likely to challenge, in some way or another, the values or principles of somebody else.

As an atheist who believes in the separation of church and state, a vegan who abhors animal sacrifices including religious ones, a supporter of LGBT rights who considers inhumane the laws that punish homosexuality, a person who is pro-legalization of drugs for adults, who defends freedom of expression and a sharing economy of knowledge, etc., I feel that most of my opinions (and those of anyone except drones!) challenge in some way the values or principles of other people. To me, challenging other people's views is part of cosmopolitanism; the opposite (ignoring each other's positions completely) has nothing to do with living together.

As a scientist who has done extensive research on social media credibility, I have to say that false news and rumors are inevitable in social media (and in media in general), particularly in times of crisis. At the same time, there are mechanisms that correct false rumors, in the sense that in a typical crisis misinformation is actually hard to find! Most people who broadcast information that ends up being erroneous are moved by a desire to help. Discouraging people from posting information in social media unless it is verified is dangerous: it creates a blind spot in the awareness that we can get from it during a crisis situation.

Finally, and here I echo what Amnesty International has said on the matter, a key issue is vagueness. The law defines "user", "provider", "network", etc. but does not define false news or which social values people are not supposed to challenge through social media. In that sense, this law has an incredible potential for abuse and will have a chilling effect on the development of information technologies in Qatar.


See: unofficial translation to English [PDF] of the law promulgated on September 15th, 2014. Twitter bird and scissor: Carlos Latuff.

How does automatic classification of documents using machine learning work?

A friend asked me to explain how an automatic system for classifying documents, such as AIDR, works.

We are going to do this in three steps: first a preliminary with an example on the risk of having a heart attack, then a few generalities, then the real thing.

Preliminary: predicting heart attack risk

Imagine a doctor with several patients that she has been following for several years. She has a clinical file for each patient in which she has noted the following: whether the patient smokes or not (which she writes as "smokes=y" or "smokes=n"), whether the patient has high blood pressure or not (which she writes as "hypertensive=y" or "hypertensive=n"), and whether the patient practices sports or not (which she writes as "sports=y" or "sports=n").

Finally, the doctor also notes if the patient has had a heart attack, written as "STROKE=y, STROKE=n":

  • Patient 1: smokes=y, hypertensive=y, sports=n, STROKE=y
  • Patient 2: smokes=y, hypertensive=n, sports=n, STROKE=y
  • Patient 3: smokes=y, hypertensive=n, sports=y, STROKE=n
  • Patient 4: smokes=n, hypertensive=y, sports=y, STROKE=n
  • Patient 5: smokes=n, hypertensive=y, sports=n, STROKE=y

Now, one can extract certain statistics from this data. For instance, patients 3 and 4 practice sports and didn't have a stroke, while patients 1, 2, and 5, don't practice sports and did have a stroke. From this data alone, one could conclude that practicing sports may help prevent a stroke (where the "may help" part doesn't come from this data but just from the recognition that 5 patients is not a lot).

We can also learn that about two thirds of the patients who smoke (2 out of 3) had a heart attack in this sample.
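These statistics are nothing more than counting. As a minimal sketch (the function name `outcome_rate` and the dictionary layout are my own choices for illustration, not part of any particular system), the five records above can be written down and queried like this:

```python
# The five patient records from the example, one dictionary per patient.
patients = [
    {"smokes": "y", "hypertensive": "y", "sports": "n", "STROKE": "y"},
    {"smokes": "y", "hypertensive": "n", "sports": "n", "STROKE": "y"},
    {"smokes": "y", "hypertensive": "n", "sports": "y", "STROKE": "n"},
    {"smokes": "n", "hypertensive": "y", "sports": "y", "STROKE": "n"},
    {"smokes": "n", "hypertensive": "y", "sports": "n", "STROKE": "y"},
]


def outcome_rate(records, factor, value, outcome="STROKE"):
    """Return (n_matching, n_with_outcome, fraction) for records with factor == value."""
    matching = [r for r in records if r[factor] == value]
    with_outcome = [r for r in matching if r[outcome] == "y"]
    fraction = len(with_outcome) / len(matching) if matching else 0.0
    return len(matching), len(with_outcome), fraction


for factor, value in [("smokes", "y"), ("sports", "y"), ("sports", "n")]:
    n, k, fraction = outcome_rate(patients, factor, value)
    print(f"{factor}={value}: {k} of {n} patients had STROKE=y ({fraction:.0%})")
```

With only five patients these fractions are very rough estimates, which is exactly why the "may help" caveat matters.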

Now, if we look for combinations of factors, we can extract more information. For instance, by looking with care at the data, one can realize that everybody in this sample who does not practice sports and who either smokes or is hypertensive has had a heart attack. Obviously, with more data we can be more certain about how good the combinations of factors that we learn are, in terms of how closely they are related to a certain outcome.
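Checking combinations by hand gets tedious quickly. The sketch below is only an illustration under my own assumptions: it looks at "and" combinations of one or two factor values (not "or" combinations), and the helper names are made up. For each combination it counts how many patients match and how many of those had the outcome.

```python
from itertools import combinations, product

# The five patient records from the example, one dictionary per patient.
patients = [
    {"smokes": "y", "hypertensive": "y", "sports": "n", "STROKE": "y"},
    {"smokes": "y", "hypertensive": "n", "sports": "n", "STROKE": "y"},
    {"smokes": "y", "hypertensive": "n", "sports": "y", "STROKE": "n"},
    {"smokes": "n", "hypertensive": "y", "sports": "y", "STROKE": "n"},
    {"smokes": "n", "hypertensive": "y", "sports": "n", "STROKE": "y"},
]
FACTORS = ["smokes", "hypertensive", "sports"]


def matches(record, conditions):
    """True if the record satisfies every (factor, value) condition."""
    return all(record[factor] == value for factor, value in conditions)


def combination_stats(records, factors, outcome="STROKE"):
    """List (conditions, n_matching, n_with_outcome) for all 1- and 2-factor combinations."""
    stats = []
    for size in (1, 2):
        for chosen in combinations(factors, size):
            for values in product("yn", repeat=size):
                conditions = tuple(zip(chosen, values))
                matching = [r for r in records if matches(r, conditions)]
                if matching:  # skip combinations that no patient has
                    hits = sum(r[outcome] == "y" for r in matching)
                    stats.append((conditions, len(matching), hits))
    return stats


for conditions, n, hits in combination_stats(patients, FACTORS):
    label = ", ".join(f"{factor}={value}" for factor, value in conditions)
    print(f"{label}: {hits}/{n} had STROKE=y")
```

Even with three yes/no factors there are already 18 combinations of this kind to check, and the number grows very fast as factors are added.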

Statistical machine learning

There are so many combinations of factors that even in the small dataset above, with five patients, exploring all the combinations and outcomes is very time consuming. Fortunately, there is a well-established research field, statistical machine learning, that studies precisely this problem.

This research field has studied for years different methods to automatically and quickly find relationships between elements in large-scale data. This process is known as learning, and there are many, many, techniques to do it.

In general, what these methods need in order to be able to learn effectively is: (i) a large amount of data, and (ii) the "right" data. In the example above, the medical doctor who interviewed the patients asked the "right" questions. If she had written down instead their eye color or another irrelevant factor, learning something about heart attack risk would have been much more difficult.

Classifying text

Text classification is not much different. Instead of 3 factors (smokes, hypertensive, and sports), we will have hundreds of thousands of factors, one for each word in the dictionary. The factors will take the form "word=y, word=n" where the "word" can be any word, and we write "y" when the document contains the word and "n" when it doesn't.
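To make the "word=y, word=n" idea concrete, here is a small sketch. The vocabulary and the tweet are made up for the example, and the tokenizer is deliberately naive (a real system would handle punctuation, hashtags, URLs, and so on more carefully).

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def word_factors(text, vocabulary):
    """Map every word in the vocabulary to 'y' or 'n' for this document."""
    present = set(tokenize(text))
    return {word: ("y" if word in present else "n") for word in vocabulary}


# Tiny, hypothetical vocabulary; a real one has hundreds of thousands of words.
vocabulary = ["bridge", "building", "collapsed", "playing"]
tweet = "The bridge collapsed after the earthquake"  # made-up example tweet

print(word_factors(tweet, vocabulary))
```

In practice almost every factor is "n" for any given tweet, so implementations usually store only the words that are present.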

The outcomes will be different types of documents. Suppose our documents are tweets and we want to separate those that contain information about damage to infrastructure (DAMAGE=y) from those that don't (DAMAGE=n). Again, you can have the following table, in which for each tweet you have one factor for each word in the tweet, and the outcome has been written by an expert who has looked at the tweet and decided if it contains infrastructure damage or not:

Tweet1: ... building=y, ..., collapsed=y, ..., DAMAGE=y
Tweet2: ... bridge=y, ..., collapsed=y, ..., DAMAGE=y
Tweet3: ... bridge=y, ..., playing=y, ..., DAMAGE=n
...
Tweet1000: ... bridge=y, ..., hearts=y, ..., DAMAGE=n

Again, we can apply any of the statistical machine learning methods to learn which combinations of words indicate the presence of infrastructure damage reports.

That's all. Once we learn those combinations, we can use them automatically to evaluate new tweets. In this case, the learning method will also output a confidence, which you can understand roughly as the percentage of tweets having those factors that were found to have that predicted outcome in the data used to learn (it is more complex than that, but that is a good approximation).
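As an illustration of learning followed by prediction with a confidence, here is a sketch using one simple method, a naive Bayes classifier with add-one smoothing. This is my choice for the example only; I am not claiming it is the method AIDR uses. The training tweets are made up, and a real system would need far more of them.

```python
import math
from collections import Counter

# Made-up labeled tweets: (text, DAMAGE label assigned by a human).
training = [
    ("building collapsed downtown", "y"),
    ("bridge collapsed near the river", "y"),
    ("road destroyed by the flood", "y"),
    ("playing bridge with friends", "n"),
    ("bridge game tonight hearts and spades", "n"),
    ("beautiful view from the bridge", "n"),
]


def train(examples):
    """Count, for each label, how many tweets contain each word."""
    doc_counts = Counter()
    word_counts = {}
    vocabulary = set()
    for text, label in examples:
        words = set(text.lower().split())
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(words)
        vocabulary |= words
    return doc_counts, word_counts, vocabulary


def predict(text, model):
    """Return (label, confidence), where confidence is the estimated probability."""
    doc_counts, word_counts, vocabulary = model
    present = set(text.lower().split())
    total = sum(doc_counts.values())
    log_scores = {}
    for label in doc_counts:
        score = math.log(doc_counts[label] / total)
        for word in vocabulary:
            # Smoothed fraction of tweets with this label that contain the word.
            p = (word_counts[label][word] + 1) / (doc_counts[label] + 2)
            score += math.log(p if word in present else 1 - p)
        log_scores[label] = score
    # Turn log scores into probabilities that sum to one.
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    best = max(weights, key=weights.get)
    return best, weights[best] / sum(weights.values())


model = train(training)
for tweet in ["the old bridge collapsed", "playing hearts tonight"]:
    label, confidence = predict(tweet, model)
    print(f"{tweet!r}: DAMAGE={label} (confidence {confidence:.2f})")
```

Words never seen during training (like "old" here) are simply ignored, and with so little data the confidence values should not be taken seriously.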

When the data is large, in general it is very difficult for a human to be able to spot a pattern better than what a computer algorithm can do. This is why crafting rules by hand (containing "bridge" implies "DAMAGE" unless the tweet also contains "playing" or "play" or "ace" or "heart" or ...) is not only time-consuming but also tends to yield lower accuracy than automatic methods, and is in general a bad idea.

In the case of text, we also use other factors (we call them "features" or "attributes") in addition to words. For instance, we can take all sequences of two words or three words (which we call "word bi-grams" and "word tri-grams"). We can also look at the position of some words in the phrase, whether a word was capitalized or not, how many times it appeared, etc. For the learning method, this is all the same: simply more factors that can be exploited to learn about the data.
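A sketch of what such a feature extractor could look like follows. The feature names ("count:", "bigram:", etc.) and the exact set of features are my own choices for illustration; real systems vary a lot here.

```python
from collections import Counter


def word_ngrams(words, n):
    """Return all sequences of n consecutive words, joined by spaces."""
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def extract_features(text):
    """Return a dictionary mapping feature names to values for one document."""
    original = text.split()
    words = [w.lower() for w in original]
    features = Counter()
    for word in words:
        features[f"count:{word}"] += 1  # how many times the word appears
    for n, name in ((2, "bigram"), (3, "trigram")):
        for gram in word_ngrams(words, n):
            features[f"{name}:{gram}"] = 1
    for word in original:
        if word[0].isupper():
            features[f"capitalized:{word.lower()}"] = 1
    if words:
        features[f"first_word:{words[0]}"] = 1  # a simple position feature
    return dict(features)


features = extract_features("Bridge collapsed the bridge is closed")
for name in sorted(features):
    print(name, features[name])
```

The learning method does not care where a feature came from; each entry in the dictionary is just one more factor.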


Further readings: lots of them, but you can start with the Wikipedia page on decision trees, which is a popular and easy to understand method for statistical machine learning.

News and Social Media (SNOW 2013 Keynote)

Slides from keynote at the Social News on the Web Workshop. Rio de Janeiro, Brazil, May 2013.

Doha II - June 2012

Shortly before it caught fire, I visited the Villagio mall, one of the three largest in Doha. As far as I understand, it is a copy of a place in Las Vegas that intends to give visitors the impression that they are in an "Italian village", including Gucci stores, a Venetian canal with gondoliers, etc.

I also had the opportunity to meet the (admirable) crazy cat ladies from "Cats in Qatar". Doha is full of abandoned cats. One of them is in this picture; we found her in the -3 parking level of Tornado Tower. The maintenance people from the building took care of her for a few days, and for now I am taking care of her.

Fortunately my immigration paperwork is done. This is a record. In Italy my residence permit took 7 months. In Spain it took 3 months, without considering the time when they expelled me from the country. Here it was only 5 weeks.

It is evident that the Qatar Foundation as an employer has a lot of influence. They put us at the front of the queues at every step, and that saves entire days of paperwork. Queues are never well arranged, and often you cannot trust that they will be respected. In my medical checkup my queue was reversible: you could be at the beginning or at the end, depending on the decision of the security guard.

Throughout the city you can hear the loudspeakers of the mosques calling to prayer. SpeakerS, mosqueS, plural. From here I can hear at least three. To me it sounds like a cacophony of Gregorian chant.

Being without Fabiola is weird. I have had some critical days in which I don't even want to eat, especially on weekends. My workmates are practically all foreigners, many from Egypt and India. We hang out together a lot. I have gone to the movies a couple of times ... in sex scenes, they blank the screen and you can hear only the audio.

A little problem derived from the local customs is that in shopping malls and restaurants there are areas for "families". The meaning of this is that men alone cannot enter these areas. This obviously discriminates against the poorer immigrants, because to bring your family here you have to have a well-paid job.

* * *

I met my neighbors, an absolutely lovely British couple who, believing I was not at home, took my garden furniture. But they gave it back ;-) They are very nice. We went for brunch at the Ritz-Carlton, where there is a free buffet and a free bar of sparkling white wine if you want -- meaning, the perfect place for getting drunk on a Friday noon (!) Here people work from Sunday to Thursday, and Friday, especially Friday morning, is the most relaxed time of the week.

* * *

To get my driver's license I have to do a mini-course of 12 sessions. I did not like the idea much at the beginning; I had hoped to just exchange my Chilean license for a Qatari one, and maybe if I had insisted enough I could have done that. But the course has not been bad. The first two classes are in a simulator, which is fun because in the simulator everybody is very imprudent, they don't respect traffic lights, etc. I killed a guy in the simulator, who practically waited for me to get closer before throwing himself into the street. But I saved a camel in the "country" scenario.

Then there are the practical lessons where you get ready for the exam (traffic signs and practical exam with L-parking, parking in a tight spot, and on-the-road test). I think I will get a PhD in L-parking. I am also practicing defensive driving a lot.

It is worth doing it. People are very aggressive when driving and they do stuff I haven't seen elsewhere. For instance, on the way back from work, a colleague who was giving me a ride unintentionally cut in front of a Land Cruiser. Later the Land Cruiser pulled in front of his car and stopped suddenly, to make us crash into it. Fortunately, the other people in the car had warned my colleague that this would happen, because Qataris know that in a trial between them and a foreigner, the foreigner always loses.

The instructors in the driving school work 10 hours a day and I think mine is always about to throw himself out of the car window out of boredom. When instructors are not giving lessons, they wait in a room with air conditioning and watch wrestling matches.

* * *

Last week one of my colleagues was speaking with the son of someone from the office, who was complaining about bullies in school. My colleague told the kid that he should be like Napoleon, who studied a lot in school to be better than the bullies. His answer:

-- I am sorry sir, that is a bad example. I don't want to be like Napoleon.
-- Why?
-- Because I am Egyptian.

Doha I - May 2012

Less than a week ago I moved for work to Doha, Qatar. I was rather worried, I have to say. The reaction from my colleagues was mostly negative: affectionate, but negative. The reaction from my friends was mixed: some found it excellent, others congratulated me, others said they'll miss us (and I'll miss them!)

My first impression of Doha is that I do not have a first impression ;-) So far I've only picked up small clues about how things will be. Probably many of them are wrong. I write them here to laugh about them later.

The first thing is that it is ridiculously hot, 40 Celsius during most of the day, and everybody says this is just a small preview of what is about to come in the next months, when it will go up to 50 Celsius and "you won't be able to stand in the sun for 5 minutes".

The cats hated to travel. They spent 12 hours in the cages, with 6 of them flying. When they were "delivered" to me, I found them alone next to the luggage belt. The good thing: they still had water in their dishes, meaning they were not moved around so much. The cats were weird after the trip; they did not want to eat and were searching for Fabiola, but now they are calmer, eating and wanting to go to the backyard. I will let them do that later on, when they are more used to this place. For now, Panterita and Trufa are a big chunk of my social life.

I was assigned a house in a gated community where there are mostly people related to education/universities, many families with kids. The community has its own pool and gym. The house looks huge compared with what I've experienced in the last years: Rome (a studio of 16 sq m) and Barcelona (flats of 48 and 64 sq m).

* * *

The other part of my social life consists of conversations with taxi drivers. Kind of Groundhog Day: soccer, where is Chile, where is Sri Lanka or Ethiopia, how hot it is in summer, etc. Language is a big barrier. At work everybody speaks English, many of them better than I. Outside work people speak little English or the minimum for basic stuff.

And my activity so far has been ... shopping ;-) To get the home ready, to buy food, a cellphone, etc. There is a mini-mall with a medium-size supermarket attached to the community, and I've also gone to the City Center, which is the largest shopping mall. Prices are similar to the convenience stores in Barcelona. Here is a sample with approximate prices:

  • 500g of pasta = 1 eur.
  • Colgate toothpaste = 1 eur.
  • Two liters of watermelon juice = 2 eur.
  • Shampoo HnS = 3 eur.
  • Small box of champignons = 1.5 eur.
  • One cucumber = 2 eur.
  • Box of tea bags = 1.5 eur.
  • One lettuce = 0.5 eur.
  • Taxi from home to airport, 30 min = 10 eur.
  • Taxi from home to city center, 20 min = 8 eur.

The bills are quite decorated, and as I mix up lilac and blue I tend to confuse the bills of 100 qar (20 eur) and 1 qar (0.2 eur).

* * *

Most local women wear the Abaya (it covers the head and the body, but not the face) and about half wear a Niqab (a veil that allows you to see only the eyes). I have not seen any woman in a Burqa (the one with the net in front of the eyes). Non-Qatari women wear whatever they like: jeans, skirts (below the knee), t-shirts, etc. The only restriction I saw was at the entrance of the Islamic Museum, where a friend of mine was asked to cover her shoulders.

On that count, I have not experienced a "culture shock". I hope to postpone that as much as possible. Well, I was about to experience it: in the toilet at the shopping mall there was a place that looked like a urinal but it was to wash your feet before praying ... fortunately I was suspicious and did not use it for what I thought it was used ;-)

Besides, in many aspects it is a developing country, and in that sense, to a first-class human (I am second class), Qatar may look stranger. For instance, my experience passing some accompanied cargo through customs was strange: there are written rules, unwritten rules, people around asking you for money without it being immediately clear what for, and in general something systematically dysfunctional. But it can be understood. The logic is that you elbow your way to the counter and shove your papers in front of an officer, smile, and wait. Finally, you have to pay the guys who asked you for money initially, because they are legally part of this business, kind of para-officers of this place.

The good side: rules are flexible. For instance the shuttle bus in the airport stops anywhere.

* * *

In the city center there are very pretty buildings. I work at the Tornado Tower, which is a twisted tower. Looking at all the funky buildings I thought they could have built a Sagrada Familia, just replacing animals and people with text and abstract motifs.

From my office (now that I have an office) you can see buildings, cranes, and a piece of the bay. The work looks interesting. I am just starting to decide what exactly I am going to do, but there is a lot of energy, students, engineers, etc. I still don't have a good sense of what to expect of those around me. I don't care if it is much or little, fast or slow. I just want to understand the work rhythm and who the reliable people are.

* * *

What else can I say? I miss Fabiola so much. I have no idea how these two months without her are going to be. And for someone as risk-averse as me, this situation is frightening at times. But I also have a lot of curiosity. I expect to satisfy that curiosity in the next months ;-)

Hugs for everyone. I don't tell you "everybody come to Doha!" because it will be a while before I am certain that it is a good idea ;-)
