Data science for the common good
The Data Science group at Radboud University develops theory and methods for machine learning and information retrieval with a strong focus on social responsibility.
With the surge in the production of digital data and the explosion of machine learning applications over the past decade, it is no wonder that the Data Science group at Radboud University has grown to some forty researchers. The group is characterised by a strong focus on social responsibility in general and, through close cooperation with the Radboudumc hospital, by its ties to applications in the health domain in particular.
‘Despite the growth of the Data Science group in recent years, we have decided to stick to our three core themes’, says group leader Tom Heskes. ‘We focus on causal reasoning for machine learning, biomedical applications of machine learning, and information retrieval and recommender systems. With the growth of the group, we considered whether we should split into more subgroups, but we decided not to do so precisely because there is a strong social cohesion running through the subgroups. Even though people are doing different things in terms of content, we do a lot of social activities together.’
One of the weaknesses of most machine learning applications developed and applied in recent years is that they are poor at reasoning about cause and effect, so-called causal reasoning. A machine learning application may conclude from data on smoking and lung cancer that the two are correlated but does not automatically understand that smoking can cause lung cancer. ‘Machine learning techniques are based on making associations’, says Heskes, ‘but because they are bad at causal reasoning, they don’t know what happens if you do an intervention, like banning smoking in public spaces.’ In the field of causal reasoning for machine learning, the Data Science group is one of the largest research groups in the Netherlands.
Heskes and his colleagues try to extract more information from the data to reason about cause and effect. Heskes: ‘One of the basic ideas that we use is that a model that goes from cause to effect is likely less complex than a model that goes from effect to cause. This essentially goes back to the philosophical thesis of Ockham’s razor, which states that the simplest explanations are usually the best ones. For example, we have applied this idea to data about attention deficit hyperactivity disorder, ADHD. The data show that the attention deficit causes the hyperactivity and not the other way around. When machines get better at causal reasoning, it makes them much smarter and more robust in many applications.’
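To make the idea of a 'simpler' causal direction concrete, here is a minimal sketch in Python. It is not the group's own method: it swaps in the textbook additive noise model, in which the effect is a function of the cause plus noise that does not depend on the cause. The data are synthetic, and the function names, the polynomial degree and the number of bins are illustrative assumptions. In the true direction the leftover noise looks the same everywhere; in the reverse direction its spread changes with the input, so a simple 'function plus independent noise' description no longer fits.

```python
import numpy as np


def dependence_score(cause, effect, degree=5, bins=10):
    """Crude check of whether the fit residuals depend on the input.

    Fits effect = polynomial(cause) + residual, sorts the residuals by the
    value of the input, splits them into equally sized bins and returns the
    ratio of the largest to the smallest mean squared residual. A ratio near
    1 means the residual spread is the same everywhere (hypothetical score).
    """
    coeffs = np.polyfit(cause, effect, degree)
    residuals = effect - np.polyval(coeffs, cause)
    order = np.argsort(cause)
    chunks = np.array_split(residuals[order], bins)
    spread = np.array([np.mean(chunk ** 2) for chunk in chunks])
    return spread.max() / spread.min()


def infer_direction(a, b):
    """Prefer the direction in which the residuals look input-independent."""
    if dependence_score(a, b) < dependence_score(b, a):
        return "a -> b"
    return "b -> a"


# Synthetic data (assumption): x causes y through a cubic plus Gaussian noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=5000)
y = x ** 3 + rng.normal(0.0, 0.1, size=5000)

print(infer_direction(x, y))
```

Real data rarely separate this cleanly, and a careful analysis would replace the binned ratio with a proper statistical independence test.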
Artificial immune system
PhD candidate Franka Buytenhuijs has worked since 2020 in the second big theme of the Data Science group: biomedical applications of machine learning. She is part of the computational immunology subgroup. ‘I work on a project called Artificial immune systems’, she states. ‘Just like the brain, the immune system is also a learning system. The brain was the inspiration for the development of neural networks. We are now looking for a system to describe the immune system’s behaviour. How does the immune system learn? How does it remember? How does it forget? For example, we want to use the insights to study how the immune system determines which cells are harmful and which ones are harmless. My research focuses on a specific type of immune cell called T cells. From experimental data from Canadian colleagues, I am trying to find features that determine how strongly T cells bind to viruses and the body’s own cells.’
Before starting her PhD research in the Data Science group, Buytenhuijs had completed her Master’s project in the same group. ‘I have a background in AI, but I like to apply this knowledge in the medical domain’, she says. ‘Furthermore, I enjoy the broad diversity of topics in the Data Science group, the ease with which everybody can be approached and the minimum of hierarchy. And despite the diversity of topics, most group members use some form of AI technique, so we can still learn from each other. We get a chance to share our results in our bi-weekly seminar.’
Learning from clicks
The bi-weekly seminar is co-organised by assistant professor Harrie Oosterhuis, whose research focuses on optimising ranking systems for search engines and recommender systems. His work is part of the third theme of the Data Science group: information retrieval and recommender systems. ‘We develop statistical methods that learn from the click behaviour of users’, says Oosterhuis. ‘One of the applications is that search or recommender results that are displayed lower in the results list, but that people click on frequently, get an extra push to the top.’ In recent years, the work of Oosterhuis has won three best paper awards at the top conferences in the information field.
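The 'extra push' for lower-ranked results can be sketched with inverse propensity weighting, a standard technique for learning from clicks. This is offered as an illustration, not as the method Oosterhuis uses. The click log, the document names and the examination probabilities per rank are all made-up assumptions. The idea is that a click at a position users rarely look at counts for more than a click at the top.

```python
from collections import defaultdict

# Assumed probability that a user even looks at a given rank (hypothetical).
EXAMINATION_PROPENSITY = {1: 1.0, 2: 0.5, 3: 0.25}


def debiased_click_rates(log):
    """Estimate per-document click rates corrected for display position.

    `log` holds (document, rank_shown, clicked) tuples. Each click is
    weighted by 1 / propensity of the rank at which the document was shown.
    """
    weighted_clicks = defaultdict(float)
    impressions = defaultdict(int)
    for doc, rank, clicked in log:
        impressions[doc] += 1
        if clicked:
            weighted_clicks[doc] += 1.0 / EXAMINATION_PROPENSITY[rank]
    return {doc: weighted_clicks[doc] / impressions[doc] for doc in impressions}


def rerank(log):
    """Order documents by their position-corrected click rate, best first."""
    rates = debiased_click_rates(log)
    return sorted(rates, key=rates.get, reverse=True)


# Made-up log: A is always shown first, B second, C third, 100 times each.
log = (
    [("A", 1, True)] * 30 + [("A", 1, False)] * 70
    + [("B", 2, True)] * 10 + [("B", 2, False)] * 90
    + [("C", 3, True)] * 12 + [("C", 3, False)] * 88
)

print(rerank(log))
```

On raw click counts, document C would stay behind A. After the correction its 12 clicks at the rarely examined third position outweigh A's 30 clicks at the top, so C moves to first place.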
People sometimes ask Oosterhuis whether the problem of improving search and recommendation systems has not already been solved by companies like Google or Microsoft. ‘Of course, they are working on that as well’, replies Oosterhuis, ‘but companies like to solve it best for themselves and don’t like to share their results. In our group, we think it is important that what we develop is freely available and open for investigation and improvement.’ For example, group members Arjen de Vries and Djoerd Hiemstra are working on a European search engine that does not depend on large American tech companies. ‘We do not focus solely on publishing papers but also aim to contribute to such socially responsible initiatives’, says Oosterhuis.
Group passport – Data Science at Radboud University
Research fields: Causal reasoning for machine learning, Biomedical applications of machine learning, Information retrieval and recommender systems.
Institution: The Data Science group is a section of the Institute for Computing and Information Sciences at Radboud University.
Labs:
ICAI Lab ‘Radboud AI for Health’
AI for Precision Health, Nutrition & Behavior
AI for Energy Grids
AI for Parkinson
Websites:
Data Science group
Radboud AI (campus-wide initiative connecting all activities on Artificial Intelligence and Data Science within Radboud University and Radboudumc)
By Bennie Mols
Images: Ivar Pel