Local Time (CEST) | Event | Speaker

0900 - 0910

Introduction and opening remarks
Organizers

0910 - 0940

Inequality and Fairness in Social Networks and Algorithms

While algorithms promise many benefits, including efficiency, objectivity, and accuracy, they may also introduce or amplify biases. In this talk, I show how biases in our social networks are fed into and amplified by ranking and recommender systems. Drawing on social theories and the fairness literature, I argue that biases in social connections need to be taken into account when designing people recommender systems.
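
To make the ranking-amplification point concrete, here is a minimal, self-contained sketch of a homophilic preferential-attachment network in which a degree-based ranking can leave the minority group with a smaller share of top positions than its share of the population. This is an illustrative toy, not the speaker's model; the group sizes, homophily strength h, and network size are all assumed parameters.

```python
import random

def homophilic_growth(n=2000, m=2, minority_frac=0.2, h=0.8, seed=0):
    """Toy homophilic preferential attachment: each new node links to m
    existing nodes with probability proportional to degree, up-weighted
    for same-group ties (h) and down-weighted for cross-group ties (1 - h)."""
    rng = random.Random(seed)
    group = [int(rng.random() < minority_frac) for _ in range(n)]  # 1 = minority
    degree = [0] * n
    for i in range(m + 1):                      # small fully connected seed network
        for j in range(i):
            degree[i] += 1
            degree[j] += 1
    for new in range(m + 1, n):
        weights = [(h if group[j] == group[new] else 1 - h) * (degree[j] + 1)
                   for j in range(new)]
        targets = set()
        while len(targets) < m:                 # m distinct attachment targets
            targets.add(rng.choices(range(new), weights=weights)[0])
        for j in targets:
            degree[new] += 1
            degree[j] += 1
    return group, degree

group, degree = homophilic_growth()
top = sorted(range(len(degree)), key=lambda i: -degree[i])[:100]
print(f"minority share of population: {sum(group) / len(group):.2f}")
print(f"minority share of top-100 by degree: {sum(group[i] for i in top) / len(top):.2f}")
```

In this toy setting, a seemingly neutral degree ranking typically reproduces the structural disadvantage already present in the network's connections.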

Bio: Fariba Karimi is a data scientist who develops mathematical and computational models to study inequalities in socio-technical networks and algorithms. She is currently a full professor of Data Science at the Faculty of Computer Science and Biomedical Engineering at Graz University of Technology. She received her doctorate from Umeå University in 2015 and then spent four years as a researcher in the Computational Social Science department at the Leibniz Institute for the Social Sciences in Cologne, Germany. Since March 2021, she has led the "Network Inequality" group at the Complexity Science Hub in Vienna. Before joining TU Graz, she also served as a tenure-track professor in the Department of Computer Science at the Vienna University of Technology. In 2023, she received the prestigious Young Scientist Award from the German Physical Society for her contributions to modeling minorities and inequalities in networks.

0940 - 1010

Sociotechnical Safety Evaluation of AI Systems

Generative AI enables new use cases and modes of human-AI interaction, which create ethical, social, and safety risks that must be assessed in order to be managed or mitigated. However, current approaches to AI safety evaluation may miss relevant hazards because they do not take into account all relevant context, such as who uses the system and to what end. In this talk, I introduce a sociotechnical framework for AI safety evaluation that aims to capture this complexity, providing a holistic approach to safety evaluation. I canvass the current state of AI safety evaluation and point out key gaps. To close these gaps, I discuss how the field can expand beyond current evaluation methods and point out open challenges such as accuracy/cost trade-offs, and representativeness and consent in the context of user studies and simulations. I close by highlighting ways toward implementing a sociotechnical approach to safety evaluation.

Bio: Laura Weidinger is a Staff Research Scientist at Google DeepMind, where she leads research on novel approaches to ethics and safety evaluation. Laura’s work focuses on taxonomising, evaluating, and mitigating risks from generative AI systems. Previously, Laura worked in cognitive science research and as a policy advisor at the UK and EU levels. She holds degrees from Humboldt-Universität zu Berlin and the University of Cambridge.

1010 - 1030

Update from the UK Government's AI Safety Institute: Evals and Advancing AI Governance

As AI becomes more capable, it's crucial that governments have the capability to empirically understand and respond to risks. We've been building a research startup within government to achieve that. Initially, our team of ML researchers focused on building LLM evaluations for societal impacts, dangerous capabilities, the effectiveness of safeguards, and agentic capabilities. We're now broadening our work, for example by studying how we can predict specific capabilities and by launching a systemic safety grants program. In this talk, we'll provide an update on our technical and governance work.
Cozmin Ududec

1030 - 1130

Poster Session (with coffee break)

1130 - 1200

Causal Inference from Competing Treatments

Many applications of randomized controlled trials (RCTs), from field experiments to online advertising, involve multiple treatment administrators who compete for subjects' attention. In the face of competition, estimating a causal effect becomes difficult, as the position at which a subject sees a treatment influences their response, and thus the treatment effect. In this talk, I will present a recent paper in which we build a game-theoretic model of agents who wish to estimate causal effects in the presence of competition, through a bidding system and a utility function that minimizes estimation error. The main technical result establishes an approximation with a tractable objective that maximizes the sample value obtained by strategically allocating budget across subjects. Conceptually, this work combines elements from causal inference and game theory to shed light on the equilibrium behavior of experimentation under competition. We will discuss the societal implications of experimentation that follow from our results, from policy evaluation to fairness in marketing campaigns. This work is joint with Vivian Y. Nastl and Moritz Hardt and will be presented at ICML'24.
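
As a toy illustration of the estimation problem described above (a hedged sketch of the phenomenon, not of the paper's bidding mechanism), the simulation below assumes that when two experimenters treat the same subject, the treatment shown second has an attenuated effect. A naive difference-in-means then understates the experimenter's full-attention effect; the effect sizes and the attention model are made-up assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Experimenters A and B each independently randomise their own treatment.
za = rng.integers(0, 2, n)
zb = rng.integers(0, 2, n)

# Hypothetical attention model: if a subject receives both treatments, only one
# gets the "first" slot, and the second-slot treatment's effect is attenuated.
a_first = rng.integers(0, 2, n).astype(bool)
tau_a, tau_b, attenuation = 2.0, 1.0, 0.3
both = (za & zb).astype(bool)

effect_a = np.where(~both | a_first, tau_a, attenuation * tau_a)
effect_b = np.where(~both | ~a_first, tau_b, attenuation * tau_b)
y = za * effect_a + zb * effect_b + rng.normal(0, 1, n)

# A's naive difference-in-means ignores the competition for attention.
naive_a = y[za == 1].mean() - y[za == 0].mean()
print(f"A's full-attention effect: {tau_a:.2f}")
print(f"A's naive estimate under competition: {naive_a:.2f}")
```

The gap between the two numbers is the kind of distortion that motivates modelling the experimenters as strategic agents.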

Bio: Ana-Andreea Stoica is a Research Group Leader in the Social Foundations of Computation group at the Max Planck Institute for Intelligent Systems, Tübingen. Her work focuses on the social foundations of networked systems, including algorithm design, fairness and inequality evaluation methods, and causal inference under interference. From recommendation algorithms to the way information spreads in networks, Ana is particularly interested in studying the effect of algorithms on people's sense of community and access to information and opportunities. Ana holds a Ph.D. from Columbia University and a B.A. in Mathematics from Princeton University. Since 2019, she has been co-organizing the Mechanism Design for Social Good initiative, and she is a co-founder of the ACM conference series on Equity and Access in Algorithms, Mechanisms, and Optimization (EAAMO).

1200 - 1215

Contributed Talk: Generative AI Misuse: A Taxonomy of Tactics and Insights from Media Data

Generative, multimodal artificial intelligence (GenAI) offers transformative potential across industries, but its misuse poses significant risks. While prior research has shed light on the potential of advanced AI systems to be exploited for malicious purposes, we still lack a concrete understanding of how GenAI models are specifically exploited or abused in practice, including the tactics employed to inflict harm. In this paper, we present a taxonomy of GenAI misuse tactics, informed by existing academic literature and a qualitative analysis of approximately 200 observed incidents of misuse reported between January 2023 and March 2024. Through this analysis, we illuminate key and novel patterns in misuse during this time period, including potential motivations, strategies, and how attackers leverage and abuse system capabilities across modalities (e.g., image, text, audio, video) in the wild. Notably, we find that manipulation of human likeness (i.e., impersonation and sockpuppeting) and falsification of evidence underlie the most common tactics used in real-world cases of misuse. We further show that the majority of reported misuse cases leverage easily accessible GenAI capabilities that require minimal technical expertise, rather than relying on complex attacks or advanced system manipulation.
Rasmi Elasmar

1215 - 1230

Contributed Talk: Models That Prove Their Own Correctness

How can we trust the correctness of a learned model on a particular input of interest? Model accuracy is typically measured *on average* over a distribution of inputs, giving no guarantee for any fixed input. This paper proposes a theoretically founded solution to this problem: to train *Self-Proving models* that prove the correctness of their output to a verification algorithm $V$ via an Interactive Proof. We devise a generic method for learning Self-Proving models, and we prove convergence bounds under certain assumptions. As an empirical exploration, our learning method is used to train a Self-Proving transformer that computes the Greatest Common Divisor (GCD) *and* proves the correctness of its answer.
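
For intuition, here is a minimal sketch of the kind of check a verifier could run in the GCD setting: the model outputs an answer together with Bezout coefficients as a certificate, and the verifier accepts only if the certificate is consistent. The certificate format is an illustrative assumption, not the paper's interactive protocol.

```python
def verify_gcd_claim(a: int, b: int, d: int, u: int, v: int) -> bool:
    """Accept the claim gcd(a, b) = d iff d > 0 divides both inputs and the
    prover supplies Bezout coefficients with u*a + v*b = d. Any d satisfying
    all three conditions must equal gcd(a, b), so wrong answers are rejected."""
    return d > 0 and a % d == 0 and b % d == 0 and u * a + v * b == d

# A (hypothetical) model output for gcd(240, 46): the answer plus its certificate.
a, b = 240, 46
print(verify_gcd_claim(a, b, d=2, u=-9, v=47))   # True:  240*(-9) + 46*47 = 2
print(verify_gcd_claim(a, b, d=4, u=1, v=0))     # False: 4 does not divide 46
```

The soundness argument is elementary: gcd(a, b) divides any integer combination u*a + v*b, so it divides d, while d dividing both a and b forces d to divide gcd(a, b); hence d = gcd(a, b).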

1230 - 1400

Lunch (provided by the venue)

1400 - 1430

User Dynamics in Machine Learning Systems

When machine learning models are deployed, for example in recommender systems, they can affect the distribution on which they operate. Such endogenous distribution shifts arise from the impact of decisions on individuals, and these effects can cause issues like polarization and bias amplification. In this talk, I will discuss models of impact at a variety of levels: users consuming content, producers creating it, and learning-based services that serve it. I will draw on recent work on preference dynamics in personalized recommendation, producer competition under algorithmic curation, and multi-learner participation dynamics. Time permitting, I will introduce a perspective based on the unifying framework of dynamical systems and outline open problems.
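
The sketch below is a toy example of such an endogenous distribution shift (an illustrative assumption, not the speaker's formal model): a greedy recommender repeatedly serves each user the catalogue item closest to their current preference, preferences assimilate toward consumed items, and the preference distribution collapses onto the catalogue.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, steps, drift = 200, 20, 300, 0.05

# Random unit-vector preferences and item embeddings in 2D (assumed toy setup).
items = rng.normal(size=(n_items, 2))
items /= np.linalg.norm(items, axis=1, keepdims=True)
prefs = rng.normal(size=(n_users, 2))
prefs /= np.linalg.norm(prefs, axis=1, keepdims=True)

def mean_gap(p):
    """Average distance from each user's preference to the nearest item."""
    return np.linalg.norm(p[:, None, :] - items[None, :, :], axis=2).min(axis=1).mean()

print(f"mean gap before deployment: {mean_gap(prefs):.3f}")
for _ in range(steps):
    chosen = items[(prefs @ items.T).argmax(axis=1)]   # greedy recommendation
    prefs = (1 - drift) * prefs + drift * chosen       # assimilation toward consumed item
    prefs /= np.linalg.norm(prefs, axis=1, keepdims=True)
print(f"mean gap after deployment:  {mean_gap(prefs):.3f}")
```

The deployed recommender changes the very distribution of preferences it serves, which is the basic feedback loop behind the polarization and bias-amplification concerns above.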

Bio: Sarah Dean is an Assistant Professor in the Computer Science Department at Cornell University. She is interested in the interplay between optimization, machine learning, and dynamics, and her research focuses on understanding the fundamentals of data-driven control and decision-making. This work is grounded in and inspired by applications ranging from robotics to recommendation systems. Sarah has a PhD in EECS from UC Berkeley and did a postdoc at the University of Washington.

1430 - 1500

Optimizing Interventions for Social Impact: A Multi-Armed Bandit Approach to Public Health

1500 - 1530

Human Expertise in Algorithmic Prediction

In many contexts, algorithmic predictions perform comparably to human expert judgement. However, there are plenty of good reasons to want humans to remain involved in decision-making. Here, we explore one such reason: humans can access information that algorithms cannot. For example, in medical settings, algorithms may be used to assess pathologies based on fixed data, but doctors can examine patients directly. We build a framework that incorporates expert judgement to distinguish between instances that are algorithmically indistinguishable, with the goal of producing predictions that outperform both humans and algorithms in isolation. We evaluate our methods in clinical risk prediction settings, finding that while algorithms outperform humans on average, humans add valuable information in identifiable cases.
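
A minimal sketch of this idea on synthetic data (hypothetical variables, not the paper's method or dataset): bin cases by the algorithm's risk score so that cases within a bin look identical to the algorithm, then check whether human assessments, which here draw on an extra signal the algorithm never sees, still predict outcomes within bins.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Assumed data-generating process: the algorithm only sees x1; the human also
# observes x2 (think of information gathered at the bedside but not in the chart).
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = (x1 + 0.5 * x2 + rng.normal(scale=0.5, size=n) > 0).astype(float)

algo = 1 / (1 + np.exp(-2 * x1))            # algorithmic risk score (x1 only)
human = 1 / (1 + np.exp(-(x1 + 0.5 * x2)))  # human assessment (both signals)

# Group cases the algorithm cannot tell apart (20 score bins), then test whether
# the human assessment still carries signal about the outcome inside each bin.
edges = np.quantile(algo, np.linspace(0, 1, 21)[1:-1])
bins = np.digitize(algo, edges)
within = []
for b in np.unique(bins):
    mask = bins == b
    if 0 < y[mask].mean() < 1:              # need outcome variation within the bin
        within.append(np.corrcoef(human[mask], y[mask])[0, 1])
print(f"average within-bin correlation of human assessment with outcome: {np.mean(within):.2f}")
```

The positive within-bin correlation is the signature of information that the algorithm's score alone cannot provide.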

Bio: Manish Raghavan is the Drew Houston (2005) Career Development Professor at the MIT Sloan School of Management and Department of Electrical Engineering and Computer Science. Before that, he was a postdoctoral fellow at the Harvard Center for Research on Computation and Society (CRCS). His research centers on the societal impacts of algorithmic decision making.

1530 - 1700

Poster Session (with coffee break)