Profession and academia

My Christmas message

My Christmas message is this: if you are analyzing data, don't discretize a continuous variable *just* because you're far more familiar with tests and methods for categorical variables than for continuous ones. I'm not saying don't do it, but don't do it just for that reason. It's a bad habit among computing people, and I find myself doing it all the time, but others, such as economists, don't do it as much. Learn methods that can handle continuous variables and use them. Ho ho ho.
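To make the cost concrete, here is a small sketch on synthetic data (the data, the seed, and the helper `pearson` are illustrative assumptions, not taken from any real analysis). It compares the correlation between two variables before and after one of them is split at its median; with this kind of data, the binned version usually shows a weaker association than the continuous one.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Synthetic data: y depends linearly on x, plus noise.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(2000)]
y = [xi + rng.gauss(0, 1) for xi in x]

# Discretize x into "low" (0) and "high" (1) around its median.
median_x = sorted(x)[len(x) // 2]
x_binned = [1.0 if xi > median_x else 0.0 for xi in x]

r_continuous = pearson(x, y)
r_binned = pearson(x_binned, y)

print(f"continuous x: r = {r_continuous:.2f}")
print(f"binned x:     r = {r_binned:.2f}")
```

The binned variable throws away how far each point lies from the cut-off, which is exactly the information a method for continuous variables would use.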

Christmas message

My Christmas message is this: if you measured some quantity with a certain precision (say 2 decimal places) and obtained some trailing zeroes, please don't remove them. Don't mix in a table "0.5" and "0.48". Write "0.50" and "0.48" because that is what you measured. Ho ho ho.
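If the table is produced by code, a fixed-precision format specifier keeps the trailing zeroes. A minimal Python sketch, where the list `measurements` is just an illustrative stand-in for your own values:

```python
# Two measurements taken with a precision of 2 decimal places.
measurements = [0.5, 0.48]

# ".2f" always prints exactly two digits after the decimal point.
formatted = [f"{value:.2f}" for value in measurements]

print(formatted)  # ['0.50', '0.48']
```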

New tools for fair ranking available

With the support of a Data Transparency Lab grant, working with Meike Zehlike and Tom Sühr from TU Berlin, and Ivan Kitanovski from Ss. Cyril and Methodius University of Skopje, we have produced new tools for creating fair rankings.

Reference for both tools:
Meike Zehlike, Tom Sühr, Carlos Castillo, Ivan Kitanovski: "FairSearch: A Tool For Fairness in Ranked Search Results". arXiv:1905.13134 (2019). Homepage: https://github.com/fair-search

FA*IR: fair ranking by post-processing

The first set of tools corresponds to the FA*IR paper in CIKM 2017, which describes a method for ranking post-processing based on a statistical test called the ranking group fairness condition:

Reference for the FA*IR algorithm:
Meike Zehlike, Francesco Bonchi, Carlos Castillo, Sara Hajian, Mohamed Megahed, Ricardo Baeza-Yates: "FA*IR: A Fair Top-k Ranking Algorithm". Proc. of the 2017 ACM on Conference on Information and Knowledge Management (CIKM).
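The following is a rough sketch of the idea behind the test, not the FairSearch implementation. It assumes the condition requires every top-k prefix of a ranking to contain at least as many protected candidates as a lower quantile of a Binomial(k, p) distribution, where p is the target proportion of protected candidates and alpha the significance level. It also leaves out any correction for testing many prefixes at once. The function names `binom_cdf`, `min_protected` and `satisfies_condition` are hypothetical.

```python
from math import comb

def binom_cdf(m, k, p):
    """P[X <= m] for X ~ Binomial(k, p)."""
    return sum(comb(k, i) * p**i * (1 - p) ** (k - i) for i in range(m + 1))

def min_protected(k, p, alpha):
    """Smallest m whose binomial CDF exceeds alpha (assumed form of the test)."""
    m = 0
    while binom_cdf(m, k, p) <= alpha:
        m += 1
    return m

def satisfies_condition(is_protected, p, alpha):
    """Check every prefix of a ranking given as a list of booleans,
    where True marks a protected candidate."""
    count = 0
    for k, flag in enumerate(is_protected, start=1):
        count += flag
        if count < min_protected(k, p, alpha):
            return False
    return True

# Minimum protected candidates required in the top 1..12, for p=0.5, alpha=0.1.
table = [min_protected(k, 0.5, 0.1) for k in range(1, 13)]
print(table)

# A ranking whose first protected candidate appears at position 4.
print(satisfies_condition([False, False, False, True], 0.5, 0.1))
```

Under these assumptions, a ranking with no protected candidate in its first four positions fails the check, while one that places a protected candidate fourth passes it.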

DELTR: fair ranking in-processing by learning-to-rank

The second set of tools corresponds to an unpublished work on Learning To Rank (LTR) while reducing disparate impact, an in-processing algorithm named DELTR:

Reference for the DELTR algorithm:
Meike Zehlike, Gina-Theresa Diehn, Carlos Castillo. "Reducing Disparate Exposure in Ranking: A Learning to Rank Approach" arXiv:1805.08716 (2018).

AI-analyzed tweets could help Europe track floods

The European Commission's Joint Research Center is working on a tool that could use tweets and artificial intelligence to collect real-time data on floods. In a paper released on arXiv.org, EU scientists explain how their Social Media for Flood Risk (SMFR) prototype could help emergency responders better understand what's happening on the ground in flooded areas and determine what trouble spots might need immediate attention.

The tool works in collaboration with Europe's Flood Awareness System (EFAS). When EFAS identifies areas with heightened flood risks, it triggers SMFR to begin collecting flood-related tweets from users in those areas. Gathering reliable information from Twitter is no easy task, especially considering that EFAS covers an area with more than 27 languages. That's where the team put AI to work. To start, the researchers trained SMFR to spot flood-related keywords in English, German, Spanish and French. In a test during floods in Calabria, Italy, last fall, the tool successfully gathered 14,347 tweets over three days, sorted them by relevance and provided geo-location data.

Continue reading in Engadget »

Valerio Lorini, Carlos Castillo, Francesco Dottori, Milan Kalas, Domenico Nappo, Peter Salamon: Integrating Social Media into a Pan-European Flood Awareness System: A Multilingual Approach. To appear in ISCRAM. Valencia, Spain. [arxiv]
