The Data Science for Society workshop tackles societal challenges
This month, on ACM-W Europe social media, we are discussing topics in the domains of data science, AI, gender and ethics. The explosive growth in data science and machine learning roles hides a problematic dynamic: women occupy only a minority of these new positions. We firmly believe that it is time to address the connection between the lack of diversity in the AI community and bias in the technical products that are produced by the community.
The organisers of the Data Science for Society (DS4S) workshop, Dr Letizia Milli, Michela Natilli, Laura Pollacci, and Dr Francesca Pratesi (University of Pisa & KDD Lab, ISTI-CNR Pisa, Italy), have written up the key takeaways from their workshop for us.
The Data Science for Society (DS4S) workshop
During the 6th ACM Celebration of Women in Computing, womENcourage 2019, women engaged in STEAM areas came together to enjoy a multidisciplinary program rich in both educational and networking activities. Taking this opportunity, the Knowledge Discovery and Data Mining Laboratory (KDD Lab) organized the Data Science for Society (DS4S) workshop, which was attended by a diverse audience of about 80 women.
The DS4S workshop explored four different themes of growing interest: migration, online debates, city of citizens, and ethics. The common thread of the works presented in the DS4S workshop is the combined use of big data with social data mining techniques. Big data arises from the digital breadcrumbs of human activities and allows researchers to observe the ground truth of individual and collective behaviors; social data mining then provides the appropriate means to extract knowledge from big data, offering a chance to understand the complexity of our contemporary, globally interconnected society.
This workshop was possible thanks to the efforts of women involved in the European project SoBigData, which focuses on examples of social mining and big data research to answer challenging questions in various domains. SoBigData is a multidisciplinary European infrastructure for big data and social data mining, providing an integrated ecosystem for ethically sensitive scientific discoveries. SoBigData is sensitive to issues of integration and the removal of barriers, and promotes gender equality by encouraging the participation of women in its events, e.g., through scholarships for data-science summer schools.
Migration

In the migration theme, the first presentation was given by Cristina Ioana Muntean (ISTI-CNR) on “The Perception of Social Phenomena through the Multidimensional Analysis of Online Social Networks”. Cristina discussed the refugee crisis and the United Kingdom's referendum on leaving the European Union, highlighting the possibility of analyzing these complex topics by monitoring online social networks such as Twitter. Also working with Twitter data, Alina Sîrbu (University of Pisa) presented “Measuring the “Salad Bowl” – Superdiversity on Twitter”. She highlighted the difficulties in studying human migration, migration flows, and migrant stocks in countries, due to the high cost of data collection, the lack of data, and the delays with which data is updated. Sîrbu proposes a new index, based on sentiment analysis, inspired by the concept of superdiversity in socio-cultural theory. The study takes into account different factors such as geography, language, and emotion. Considering that different cultures assign different emotional values to the same words enables a richer understanding of migration issues that affect the whole population.
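Sîrbu's index itself is built on sentiment analysis; purely as a loose illustration of how a diversity score over Twitter data can be computed, the sketch below measures the normalised Shannon entropy of the language distribution in a sample of tweets. All function names and data here are illustrative, not taken from the talk.

```python
from collections import Counter
from math import log

def diversity_index(labels):
    """Shannon entropy of a label distribution, normalised to [0, 1].

    Higher values mean the labels (e.g. tweet languages in a region)
    are spread more evenly across categories -- a crude proxy for
    'superdiversity'. Illustrative only, not Sirbu's actual index.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if len(counts) < 2:
        return 0.0
    entropy = -sum((c / total) * log(c / total) for c in counts.values())
    return entropy / log(len(counts))  # divide by maximum possible entropy

# Hypothetical per-tweet language tags for two regions
homogeneous = ["en"] * 95 + ["fr"] * 5
mixed = ["en"] * 30 + ["fr"] * 25 + ["ro"] * 25 + ["ar"] * 20

print(round(diversity_index(homogeneous), 2))  # low score
print(round(diversity_index(mixed), 2))        # close to 1.0
```

A real index would combine several such dimensions (language, location, emotional value of words) rather than a single label distribution.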
Online Debates

The first speaker of this theme was Diana Maynard (University of Sheffield), who presented “The language of political tweets: analysing social debates”. She described the possibility of using Twitter to monitor posts from politicians and to analyze the language of political tweets as a means of interpreting social debates. Her work shows an increase over time in the abuse directed at women, especially towards women not in the current governing party. In 2019, online abuse has risen month by month, and the situation is particularly bad for certain communities, e.g., women, ethnic minorities, and LGBT people. The second speaker, Alina Sîrbu, discussed “Algorithmic bias and opinion fragmentation and polarisation”: the problems related to algorithmic bias, which is believed to amplify the fragmentation and polarization of societal debate. Social media platforms typically show their users information and opinions close to their own, to make the user experience on the platform as pleasant as possible. Sîrbu highlights the adverse effects of this algorithmic bias: fragmentation – an increased number of distinct opinion groups; polarization – an increased pairwise distance between opinions; and instability – a slow-down of convergence, with a large number of views coexisting for long periods. Sîrbu says the least we can do is make users aware of algorithmic bias, alongside exploring neutral platforms.
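Effects like these are typically studied with opinion-dynamics simulations. The sketch below is not Sîrbu's model, but a minimal bounded-confidence (Deffuant-style) simulation with an extra bias parameter `gamma`, so that agents are preferentially paired with like-minded peers, as a recommender system would arrange; all parameter names and values are illustrative.

```python
import random

def simulate(n=200, eps=0.3, mu=0.5, gamma=0.0, steps=50_000, seed=0):
    """Bounded-confidence (Deffuant-style) opinion dynamics.

    gamma = 0 reproduces the classic model (random interaction partners);
    gamma > 0 adds 'algorithmic bias': agents are more likely to be
    paired with peers holding similar opinions.
    """
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]  # opinions in [0, 1]
    for _ in range(steps):
        i = rng.randrange(n)
        # Bias: weight candidate partners by closeness of opinion.
        weights = [(abs(x[i] - x[j]) + 1e-6) ** (-gamma) if j != i else 0.0
                   for j in range(n)]
        j = rng.choices(range(n), weights=weights)[0]
        if abs(x[i] - x[j]) < eps:  # interact only within the confidence bound
            x[i], x[j] = x[i] + mu * (x[j] - x[i]), x[j] + mu * (x[i] - x[j])
    return x

def n_clusters(opinions, tol=0.05):
    """Count distinct opinion clusters (groups separated by gaps > tol)."""
    xs = sorted(opinions)
    return 1 + sum(1 for a, b in zip(xs, xs[1:]) if b - a > tol)
```

In runs of such models, raising `gamma` tends to slow convergence and leave more opinion groups coexisting, which is the kind of fragmentation and instability the talk described.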
City of Citizens
Chiara Boldrini (IIT-CNR) kicked off this session with “Shared Mobility and Urban Data Science”, discussing the possibilities and drawbacks introduced by car-sharing. Car-sharing leads to more efficient use of cars, better land use (e.g., fewer parking lots needed), and lower pollution (car-sharing with electric vehicles is even more environmentally friendly). Besides, car-sharing benefits the family economy, as households switching to car-sharing can save $150–400 monthly. One of the challenges in car-sharing is to identify both the sociodemographic and the urban-activity indicators that drive variations in car-sharing requests. Boldrini therefore highlights that it is vital to continually study car-sharing, characterizing urban areas through their usage patterns and developing forecasting approaches, e.g., to predict pickup and drop-off events.
Ethics

Fosca Giannotti (ISTI-CNR) talked about “Data Science and Ethics” and discussed the open challenge of meaningfully explaining opaque artificial intelligence and machine learning systems. In traditional software development, humans wrote the algorithms and took responsibility for their quality and correctness; with AI and machine learning, machines learn algorithms automatically from sufficiently many examples. It is therefore increasingly important to focus on explicability and the questions it implies, such as intelligibility (i.e., how does it work?) and accountability (i.e., who is responsible for it?). Explainable AI investigates how AI algorithms and the results they produce can be made more accessible and interpretable to humans. Despite many open research challenges, explainable AI is crucial for empowering individuals against the undesired effects of automated decision making, implementing the right to explanation, improving industry standards for developing AI-powered products, and preserving (and expanding) human autonomy.
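Actual explanation methods are far more sophisticated than anything that fits here, but the core idea of a local surrogate can be sketched in a few lines: query the opaque model around one instance and fit the simplest rule that mimics it there (the idea behind local surrogate methods, heavily simplified). The `black_box` model, its features, and all numbers below are entirely made up.

```python
import random

def black_box(features):
    """Stand-in for an opaque model (e.g. a loan-approval classifier)."""
    income, debt = features
    return 1 if income * 0.7 - debt * 1.3 + 10 > 0 else 0

def local_surrogate(model, instance, n_samples=500, radius=5.0, seed=0):
    """Fit a one-feature threshold rule that mimics the opaque model
    around one instance -- a toy local surrogate explanation."""
    rng = random.Random(seed)
    # Sample a neighbourhood of the instance and label it with the model.
    samples = [[v + rng.uniform(-radius, radius) for v in instance]
               for _ in range(n_samples)]
    labels = [model(s) for s in samples]
    best = None
    for f in range(len(instance)):      # candidate feature
        for s in samples:               # candidate threshold
            t = s[f]
            for sense in (1, -1):       # rule direction (> or <)
                preds = [1 if sense * (x[f] - t) > 0 else 0 for x in samples]
                acc = sum(p == l for p, l in zip(preds, labels)) / n_samples
                if best is None or acc > best[0]:
                    best = (acc, f, t, sense)
    acc, f, t, sense = best
    op = ">" if sense == 1 else "<"
    return f"feature[{f}] {op} {t:.1f}", acc

rule, fidelity = local_surrogate(black_box, [14.0, 15.0])
print(rule, fidelity)  # simple rule plus how faithfully it mimics the model locally
```

The returned rule is intelligible (a single threshold), and its fidelity score quantifies how well it tracks the opaque model in that neighbourhood, which is exactly the trade-off explainable AI studies at scale.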
You can find all the presentations, which have been kindly provided by the speakers, on the workshop site: Data Science for Society (DS4S).