Accepted to SIGIR ’22 and GLB ’22
Misinformation is becoming increasingly prevalent on social media and in news articles. It has become so widespread that we require algorithmic assistance utilising machine learning to detect such content. Training these machine learning models requires datasets of sufficient scale, diversity and quality. However, datasets in the field of automatic misinformation detection are predominantly monolingual, include a limited number of modalities and are not of sufficient scale and quality. Addressing this, we develop a data collection and linking system (MuMiN-trawl) to build a public misinformation graph dataset (MuMiN), containing rich social media data (tweets, replies, users, images, articles, hashtags) spanning 21 million tweets belonging to 26 thousand Twitter threads, each of which has been semantically linked to 13 thousand fact-checked claims across dozens of topics, events and domains, in 41 different languages, spanning more than a decade. The dataset is made available as a heterogeneous graph via a Python package (mumin). We provide baseline results for two node classification tasks related to the veracity of a claim involving social media, and demonstrate that these are challenging tasks, with the highest macro-average F1-score being 62.55% and 61.45% for the two tasks, respectively. The MuMiN ecosystem is available at this https URL, including the data, documentation, tutorials and leaderboards.
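To make the reported metric concrete, the following is a minimal sketch of how a macro-average F1-score can be computed for a node classification task. It does not use the mumin package or the paper's evaluation code; the function name `macro_f1` and the example label names are assumptions chosen for illustration.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores (illustrative helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        denominator = 2 * tp + fp + fn
        per_class_scores.append(2 * tp / denominator if denominator else 0.0)
    # Every class counts equally, regardless of how many nodes carry it
    return sum(per_class_scores) / len(per_class_scores)


# Hypothetical labels for four claim nodes
y_true = ["misinformation", "misinformation", "factual", "factual"]
y_pred = ["misinformation", "factual", "factual", "factual"]
score = macro_f1(y_true, y_pred)
```

Because each class contributes equally to the average, a classifier cannot reach a high macro-average F1-score by predicting only the majority class, which is one reason the metric suits imbalanced veracity labels.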
Submitted to ICML ’22
Monitoring machine learning models once they are deployed is challenging. It is even more challenging to decide when to retrain models in real-case scenarios when labeled data is beyond reach, and monitoring performance metrics becomes infeasible. In this work, we use non-parametric bootstrapped uncertainty estimates and SHAP values to provide explainable uncertainty estimation as a technique that aims to monitor the deterioration of machine learning models in deployment environments, as well as determine the source of model deterioration when target labels are not available. Classical methods are purely aimed at detecting distribution shift, which can lead to false positives in the sense that the model has not deteriorated despite a shift in the data distribution. To estimate model uncertainty we construct prediction intervals using a novel bootstrap method, which improves upon the work of Kumar & Srivastava (2012). We show that both our model deterioration detection system and our uncertainty estimation method achieve better performance than the current state-of-the-art. Finally, we use explainable AI techniques to gain an understanding of the drivers of model deterioration. We release an open source Python package, doubt, which implements our proposed methods, as well as the code used to reproduce our experiments.
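As a rough illustration of the general idea behind bootstrapped prediction intervals, the sketch below refits a simple least-squares line on resampled data and adds a resampled residual to each prediction. This is a naive textbook-style variant under simplifying assumptions, not the method proposed in the paper and not the API of the doubt package; the names `fit_line` and `bootstrap_prediction_interval` are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for a single feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        # Degenerate resample (all x equal): fall back to a constant model
        return 0.0, mean_y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def bootstrap_prediction_interval(xs, ys, x_new, n_boots=500, alpha=0.1, seed=0):
    """Naive percentile interval covering roughly (1 - alpha) of outcomes."""
    rng = random.Random(seed)
    n = len(xs)
    samples = []
    for _ in range(n_boots):
        # Resample the training data with replacement and refit the model
        idx = [rng.randrange(n) for _ in range(n)]
        boot_x = [xs[i] for i in idx]
        boot_y = [ys[i] for i in idx]
        slope, intercept = fit_line(boot_x, boot_y)
        # Add one residual of the refitted model to account for noise
        j = rng.randrange(n)
        residual = boot_y[j] - (slope * boot_x[j] + intercept)
        samples.append(slope * x_new + intercept + residual)
    samples.sort()
    lower = samples[int((alpha / 2) * (n_boots - 1))]
    upper = samples[int((1 - alpha / 2) * (n_boots - 1))]
    return lower, upper


# Toy data: y = 2x + 1 with alternating noise of +/- 0.5
xs = [float(i) for i in range(20)]
ys = [2 * x + 1 + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(xs)]
lower, upper = bootstrap_prediction_interval(xs, ys, x_new=10.0)
```

In a monitoring setting, one plausible use of such intervals is to track their width on incoming unlabeled data: a widening interval suggests the model is less certain about the new inputs, which can be checked without access to target labels.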
Submitted to ACL ARR February ’22
This paper introduces a Scandinavian benchmarking platform, ScandEval, which
can benchmark any pretrained or finetuned model on 29 datasets in Danish,
Norwegian, Swedish, Icelandic and Faroese, two of which are new. We develop and
release a Python package and Command-Line Interface (CLI), which
can benchmark any model that has been uploaded to the HuggingFace Hub, with
reproducible results. Using this package, we benchmark over 60 Scandinavian or
multilingual models and present the results of these in an interactive online
leaderboard. The benchmarking results show that
the investment in language technology in Norway, Sweden and Iceland has led to
language models that outperform massively multilingual models such as
XLM-RoBERTa and LaBSE. We release the source code for both the
Can we automate the truth? Mapping the contingencies of automated misinformation detection
Submitted to FAccT ’22
The stark rise of online misinformation in recent years has sparked a growing interest in the development of automatic detection of misinformation using machine learning algorithms. In the wake of COVID-19, the issue became even more rampant and harmful, leading major social media companies like Facebook, YouTube and Twitter to rely more on automated and less on human moderation of online content. The use of supervised machine learning models is a promising approach to tackle the sheer volume of misinformation, but it also brings about challenges related to the reproduction of biases in the data, undue censorship, and potential backfire effects. Drawing on an interdisciplinary collaboration between academics from the fields of science and technology studies and data science, we critically unpack the technical and epistemic practices involved in the construction of misinformation classification models. We outline a series of contingencies throughout the stages of problematization, formalization, curation of ground truth datasets and model evaluation. We then suggest three concrete responses and future research paths. This paper contributes to the ongoing scholarly debate on fairness in algorithmic systems, which has not yet systematically looked at the distinctive issues linked to the use of ML algorithms in combatting misinformation.
Submitted to Fundamenta Mathematicae
We continue the study of the virtual large cardinal hierarchy, initiated in Gitman and Schindler (2018), by analysing virtual versions of superstrong, Woodin, Vopěnka, and Berkeley cardinals. Gitman and Schindler showed that virtualizations of strong and supercompact cardinals yield the same large cardinal notion (Gitman and Schindler, 2018). We show the same result for a (weak) virtualization of Woodin and a virtualization of Vopěnka cardinals. We also show that there is a virtually Berkeley cardinal if and only if the virtual Vopěnka principle holds, but On is not Mahlo.
Published in the Journal of Symbolic Logic
We generalise the $\alpha$-Ramsey cardinals introduced in Holy and Schlicht (2018) for cardinals $\alpha$ to arbitrary ordinals $\alpha$, and answer several questions posed in that paper. In particular, we show that $\alpha$-Ramseys are downwards absolute to the core model $K$ for all $\alpha$ of uncountable cofinality, that strategic $\omega$-Ramsey cardinals are equiconsistent with remarkable cardinals and that strategic $\alpha$-Ramsey cardinals are equiconsistent with measurable cardinals for all $\alpha>\omega$. We also show that the $n$-Ramseys satisfy indescribability properties and use them to provide a game-theoretic characterisation of completely ineffable cardinals, as well as establishing further connections between the $\alpha$-Ramsey cardinals and the Ramsey-like cardinals introduced in Gitman (2011), Feng (1990), and Sharpe and Welch (2011).