The most advanced future AI systems will first be deployed inside the frontier AI companies developing them. According to these companies and independent experts, AI systems may reach or even surpass human intelligence and capabilities by 2030. Internal deployment is, therefore, a key source of benefits and risks from frontier AI systems. Despite this, the governance of the internal deployment of highly advanced frontier AI systems appears absent. This report aims to address this absence by priming a conversation around the governance of internal deployment. It presents a conceptualization of internal deployment, learnings from other sectors, reviews of existing legal frameworks and their applicability, and illustrative examples of the types of scenarios we are most concerned about. Specifically, it discusses the risks associated with loss of control via the internal application of a misaligned AI system to the AI research and development pipeline, and with unconstrained and undetected power concentration behind closed doors. The report culminates with a small number of targeted recommendations that provide a first blueprint for the governance of internal deployment.
AI-powered predictive systems have high margins of error. However, data visualisations of algorithmic systems in education and other social fields tend to visualise certainty, thus invisibilising the underlying approximations and uncertainties of the algorithmic systems and the social settings in which these systems operate. This paper draws on a critical speculative approach to first analyse data visualisations from predictive analytics platforms for education. It demonstrates that visualisations of uncertainty in education are rare. Second, the paper explores uncertainty visualisations in other fields (defence, climate change and healthcare). The paper concludes by reflecting on the role of data visualisations and un/certainty in shaping educational futures. It also identifies practical implications for the design of data visualisations in education.
Today's world is witnessing an unparalleled rate of technological transformation. The emergence of non-fungible tokens (NFTs) has transformed how we handle digital assets and value. Despite their initial popularity, NFTs face declining adoption influenced not only by cryptocurrency volatility but also by trust dynamics within communities. From a social computing perspective, understanding these trust dynamics offers valuable insights for the development of both the NFT ecosystem and the broader digital economy. China presents a compelling context for examining these dynamics, offering a unique intersection of technological innovation and traditional cultural values. Through a content analysis of eight Chinese NFT-focused WeChat groups and 21 semi-structured interviews, we examine how socio-cultural factors influence trust formation and development. We found that trust in Chinese NFT communities is significantly molded by local cultural values. To be precise, Confucian virtues, such as benevolence, propriety, and integrity, play a crucial role in shaping these trust relationships. Our research identifies three critical trust dimensions in China's NFT market: (1) technological, (2) institutional, and (3) social. We examined the challenges in cultivating each dimension. Based on these insights, we developed tailored trust-building guidelines for Chinese NFT stakeholders. These guidelines address trust issues that factor into NFT's declining popularity and could offer valuable strategies for CSCW researchers, developers, and designers aiming to enhance trust in global NFT communities. Our research urges CSCW scholars to take into account the unique socio-cultural contexts when developing trust-enhancing strategies for digital innovations and online interactions.
In today's digital world, computing education offers critical opportunities, yet systemic inequities exclude under-represented communities, especially in rural, under-resourced regions. Early engagement is vital for building interest in computing careers and achieving equitable participation. Recent work has shown that the use of sensor-enabled tools and block-based programming can improve engagement and self-efficacy for students from under-represented groups, but these findings lack replication in diverse, resource-constrained settings. This study addresses this gap by implementing sensor-based programming workshops with rural students in Sri Lanka. Replicating methods from the literature, we conduct a between-group study (sensor vs. non-sensor) using Scratch and real-time environmental sensors. We found that students in both groups reported significantly higher confidence in programming in Scratch after the workshop. In addition, average changes in both self-efficacy and outcome expectancy were higher in the experimental (sensor) group than in the control (non-sensor) group, mirroring trends observed in the original study being replicated. We also found that using the sensors helped to enhance creativity and inspired some students to express an interest in information and communications technology (ICT) careers, supporting the value of such hands-on activities in building programming confidence among under-represented groups.
Fairness has emerged as a critical consideration in the landscape of machine learning algorithms, particularly as AI continues to transform decision-making across societal domains. To ensure that these algorithms are free from bias and do not discriminate against individuals based on sensitive attributes such as gender and race, the field of algorithmic bias has introduced various fairness concepts, along with methodologies to achieve these notions in different contexts. Despite this rapid advancement, not all sectors have embraced these fairness principles to the same extent. One specific sector that merits attention in this regard is insurance. Within the realm of insurance pricing, fairness is defined through a distinct and specialized framework. Consequently, achieving fairness according to established notions does not automatically ensure fair pricing in insurance. In particular, regulators are increasingly emphasizing transparency in pricing algorithms and imposing constraints on insurance companies regarding the collection and utilization of sensitive consumer attributes. These factors present additional challenges in the implementation of fairness in pricing algorithms. To address these complexities and comply with regulatory demands, we propose an efficient method for constructing fair models that are tailored to the insurance domain, using only privatized sensitive attributes. Notably, our approach ensures statistical guarantees, does not require direct access to sensitive attributes, and adapts to varying transparency requirements, addressing regulatory demands while ensuring fairness in insurance pricing.
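The key constraint above, learning only from privatized sensitive attributes, can be illustrated with a toy sketch. This is not the paper's algorithm; the randomized-response mechanism, the function names, and the eps value are all assumptions for illustration. It shows how group-level premium statistics, which a fairness criterion might rely on, can be approximately recovered from a locally privatized binary attribute without ever seeing the true attribute.

```python
# Illustrative sketch (not the paper's method): recovering group-level premium
# statistics when only a randomized-response ("privatized") version of a binary
# sensitive attribute is available.
import numpy as np

rng = np.random.default_rng(0)

def randomize(attr: np.ndarray, eps: float) -> np.ndarray:
    """Randomized response: keep the true bit with prob p, flip it otherwise."""
    p = np.exp(eps) / (1 + np.exp(eps))
    keep = rng.random(attr.shape) < p
    return np.where(keep, attr, 1 - attr)

def debiased_group_means(premium: np.ndarray, priv_attr: np.ndarray, eps: float):
    """Approximate per-group mean premiums from the privatized attribute by
    inverting the randomized-response channel."""
    p = np.exp(eps) / (1 + np.exp(eps))
    # Quantities observed under the *noisy* attribute.
    pi_noisy = priv_attr.mean()
    s1 = premium[priv_attr == 1].mean() * pi_noisy        # E[premium * 1{noisy=1}]
    s0 = premium[priv_attr == 0].mean() * (1 - pi_noisy)  # E[premium * 1{noisy=0}]
    # Invert the 2x2 mixing to estimate true group proportion and means.
    pi = (pi_noisy - (1 - p)) / (2 * p - 1)
    mu1 = (p * s1 - (1 - p) * s0) / ((2 * p - 1) * pi)
    mu0 = (p * s0 - (1 - p) * s1) / ((2 * p - 1) * (1 - pi))
    return mu0, mu1

# Toy data: premiums differ slightly across a hypothetical sensitive group.
attr = rng.integers(0, 2, 10_000)
premium = 500 + 50 * attr + rng.normal(0, 20, 10_000)
mu0, mu1 = debiased_group_means(premium, randomize(attr, eps=1.0), eps=1.0)
print(f"estimated group means: {mu0:.1f} vs {mu1:.1f}")  # disparity a fair model would constrain
```

A fairness constraint built on such debiased estimates never requires the insurer to store or access the raw sensitive attribute, which is the regulatory setting the abstract describes.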
Introductory programming courses often rely on small code-writing exercises that have clearly specified problem statements. This limits opportunities for students to practice how to clarify ambiguous requirements -- a critical skill in real-world programming. In addition, the emerging capabilities of large language models (LLMs) to produce code from well-defined specifications may harm student engagement with traditional programming exercises. This study explores the use of "Probeable Problems", automatically gradable tasks that have deliberately vague or incomplete specifications. Such problems require students to submit test inputs, or "probes", to clarify requirements before implementation. Through analysis of over 40,000 probes in an introductory course, we identify patterns linking probing behaviors to task success. Systematic strategies, such as thoroughly exploring expected behavior before coding, resulted in fewer incorrect code submissions and correlated with course success. Feedback from nearly 1,000 participants highlighted the challenges and real-world relevance of these tasks, as well as benefits to critical thinking and metacognitive skills. Probeable Problems are easy to set up and deploy at scale, and help students recognize and resolve uncertainties in programming problems.
Existing estimates of human migration are limited in their scope, reliability, and timeliness, prompting the United Nations and the Global Compact on Migration to call for improved data collection. Using privacy-protected records from three billion Facebook users, we estimate country-to-country migration flows at monthly granularity for 181 countries, accounting for selection into Facebook usage. Our estimates closely match high-quality measures of migration where available but can be produced nearly worldwide and with less delay than alternative methods. We estimate that 39.1 million people migrated internationally in 2022 (0.63% of the population of the countries in our sample). Migration flows significantly changed during the COVID-19 pandemic, decreasing by 64% before rebounding in 2022 to a pace 24% above the pre-crisis rate. We also find that migration from Ukraine increased tenfold in the wake of the Russian invasion. To support research and policy interventions, we will release these estimates publicly through the Humanitarian Data Exchange.
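As a toy illustration of what "accounting for selection into Facebook usage" can mean (this is not the authors' estimator; all corridor counts and penetration rates below are invented), one simple correction weights the observed movers in each corridor by the origin country's inverse Facebook-penetration rate. The snippet also sanity-checks the arithmetic behind the headline figure.

```python
# Toy illustration (not the authors' estimator) of a selection correction:
# scale observed Facebook-based movers by the origin country's inverse
# penetration rate. All figures below are made up for the example.
observed_movers = {            # users whose predicted country changed, by corridor
    ("UA", "PL"): 120_000,
    ("MX", "US"): 90_000,
}
penetration = {"UA": 0.55, "MX": 0.70}   # hypothetical share of adults on Facebook

estimated_flows = {
    (origin, dest): n / penetration[origin]
    for (origin, dest), n in observed_movers.items()
}
print(estimated_flows)  # {('UA', 'PL'): ~218k, ('MX', 'US'): ~129k}

# Sanity check on the headline figure: 39.1 million migrants at 0.63% of the
# covered population implies roughly 39.1e6 / 0.0063 ≈ 6.2 billion people covered.
print(39.1e6 / 0.0063 / 1e9)  # ≈ 6.2
```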
Large language models (LLMs) increasingly serve as human-like decision-making agents in social science and applied settings. These LLM agents are typically assigned human-like characters and placed in real-life contexts. However, how these characters and contexts shape an LLM's behavior remains underexplored. This study proposes and tests methods for probing, quantifying, and modifying an LLM's internal representations in a Dictator Game -- a classic behavioral experiment on fairness and prosocial behavior. We extract "vectors of variable variations" (e.g., "male" to "female") from the LLM's internal state. Manipulating these vectors during the model's inference can substantially alter how those variables relate to the model's decision-making. This approach offers a principled way to study and regulate how social concepts can be encoded and engineered within transformer-based models, with implications for alignment, debiasing, and designing AI agents for social simulations in both academic and commercial applications.
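A minimal sketch of how such a "vector of variable variation" can be extracted and injected at inference time in an open-weight transformer is shown below. It illustrates the general activation-steering idea rather than the study's exact procedure; the model choice (gpt2), the layer index, the prompts, and the scaling factor alpha are all assumptions.

```python
# Minimal activation-steering sketch (illustrative only; the paper's exact
# extraction and intervention procedure may differ).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in open-weight model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

LAYER = 6  # hypothetical choice of intermediate layer

def hidden_at_layer(prompt: str) -> torch.Tensor:
    """Mean hidden state of the prompt at the chosen layer."""
    ids = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids, output_hidden_states=True)
    return out.hidden_states[LAYER].mean(dim=1).squeeze(0)

# "Vector of variable variation": difference between paired prompts that differ
# only in the variable of interest (here, the assigned gender).
v = hidden_at_layer("You are a female participant in a dictator game.") \
    - hidden_at_layer("You are a male participant in a dictator game.")

def steering_hook(module, inputs, output, alpha=4.0):
    # GPT-2 blocks return a tuple whose first element is the hidden states;
    # add the steering vector to shift the representation along the variable.
    return (output[0] + alpha * v,) + output[1:]

handle = model.transformer.h[LAYER].register_forward_hook(steering_hook)
ids = tok("You are a participant in a dictator game. You will give", return_tensors="pt")
print(tok.decode(model.generate(**ids, max_new_tokens=20)[0]))
handle.remove()
```

Comparing generations with different signs or magnitudes of alpha is one way to probe how the encoded variable relates to the model's allocation decisions.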
As artificial intelligence (AI) systems rapidly gain autonomy, the need for robust responsible AI frameworks becomes paramount. This paper investigates how organizations perceive and adapt such frameworks amidst the emerging landscape of increasingly sophisticated agentic AI. Employing an interpretive qualitative approach, the study explores the lived experiences of AI professionals. Findings highlight that agentic AI systems and their responsible implementation are inherently complex, rooted in the intricate interconnectedness of responsible AI dimensions within the thematic framework (an analytical structure developed from the data). This complexity, combined with the novelty of agentic AI, contributes to significant challenges in organizational adaptation, characterized by knowledge gaps, a limited emphasis on stakeholder engagement, and a strong focus on control. By hindering effective adaptation and implementation, these factors ultimately compromise the potential for responsible AI and the realization of return on investment (ROI).
There is growing interest in hypothesis generation with large language models (LLMs). However, fundamental questions remain: what makes a good hypothesis, and how can we systematically evaluate methods for hypothesis generation? To address this, we introduce HypoBench, a novel benchmark designed to evaluate LLMs and hypothesis generation methods across multiple aspects, including practical utility, generalizability, and hypothesis discovery rate. HypoBench includes 7 real-world tasks and 5 synthetic tasks with 194 distinct datasets. We evaluate four state-of-the-art LLMs combined with six existing hypothesis-generation methods. Overall, our results suggest that existing methods are capable of discovering valid and novel patterns in the data. However, the results from synthetic datasets indicate that there is still significant room for improvement, as current hypothesis generation methods do not fully uncover all relevant or meaningful patterns. Specifically, in synthetic settings, as task difficulty increases, performance drops significantly, with the best models and methods recovering only 38.8% of the ground-truth hypotheses. These findings highlight challenges in hypothesis generation and demonstrate that HypoBench serves as a valuable resource for improving AI systems designed to assist scientific discovery.
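For concreteness, a hypothesis discovery (recovery) rate of the kind reported can be computed as the fraction of ground-truth hypotheses matched by at least one generated hypothesis. The sketch below uses a crude string-similarity matcher as a stand-in; HypoBench's actual matching criterion (for example, an LLM judge) may differ, and the example hypotheses are invented.

```python
# Sketch of a "hypothesis discovery rate": share of ground-truth hypotheses
# that at least one generated hypothesis matches. The matcher is a placeholder.
from difflib import SequenceMatcher

def matches(generated: str, truth: str, threshold: float = 0.6) -> bool:
    """Crude textual similarity as a stand-in matching criterion."""
    return SequenceMatcher(None, generated.lower(), truth.lower()).ratio() >= threshold

def discovery_rate(generated: list[str], ground_truth: list[str]) -> float:
    recovered = sum(any(matches(g, t) for g in generated) for t in ground_truth)
    return recovered / len(ground_truth)

ground_truth = ["longer reviews predict higher ratings",
                "posts with images get more engagement"]
generated = ["review length is positively associated with rating"]
print(discovery_rate(generated, ground_truth))  # fraction of ground truth recovered
```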
Masculine defaults are widely recognized as a significant type of gender bias, but they often go unnoticed because they are under-researched. Masculine defaults involve three key parts: (i) the cultural context, (ii) the masculine characteristics or behaviors, and (iii) the reward for, or simply acceptance of, those masculine characteristics or behaviors. In this work, we study discourse-based masculine defaults, and propose a twofold framework for (i) the large-scale discovery and analysis of gendered discourse words in spoken content via our Gendered Discourse Correlation Framework (GDCF); and (ii) the measurement of the gender bias associated with these gendered discourse words in LLMs via our Discourse Word-Embedding Association Test (D-WEAT). We focus our study on podcasts, a popular and growing form of social media, analyzing 15,117 podcast episodes. We analyze correlations between gender and discourse words -- discovered via LDA and BERTopic -- to automatically form gendered discourse word lists. We then study the prevalence of these gendered discourse words in domain-specific contexts, and find that gendered discourse-based masculine defaults exist in the domains of business, technology/politics, and video games. Next, we study the representation of these gendered discourse words in a state-of-the-art LLM embedding model from OpenAI, and find that the masculine discourse words have a more stable and robust representation than the feminine discourse words, which may result in better system performance on downstream tasks for men. Hence, men are rewarded for their discourse patterns with better system performance by one of the state-of-the-art language models -- and this embedding disparity is a representational harm and a masculine default.
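D-WEAT builds on the WEAT family of embedding association tests. The sketch below shows a generic WEAT-style effect size between two target word sets and two gendered attribute sets; the word lists and random embeddings are arbitrary placeholders, not the paper's data or exact formulation.

```python
# Generic WEAT-style association test (in the spirit of D-WEAT; the paper's
# exact formulation may differ).
import numpy as np

def cos(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B, emb):
    """s(w, A, B): mean cosine similarity to set A minus mean similarity to set B."""
    return np.mean([cos(emb[w], emb[a]) for a in A]) - \
           np.mean([cos(emb[w], emb[b]) for b in B])

def weat_effect_size(X, Y, A, B, emb):
    """Cohen's-d-style effect size over the two target word sets X and Y."""
    sx = [association(x, A, B, emb) for x in X]
    sy = [association(y, A, B, emb) for y in Y]
    return (np.mean(sx) - np.mean(sy)) / np.std(sx + sy, ddof=1)

# Toy usage with random vectors standing in for a real embedding model;
# the discourse word lists here are arbitrary placeholders, not findings.
rng = np.random.default_rng(0)
vocab = ["leverage", "roadmap", "honestly", "literally", "he", "him", "she", "her"]
emb = {w: rng.normal(size=64) for w in vocab}
target_set_1, target_set_2 = ["leverage", "roadmap"], ["honestly", "literally"]
male_attrs, female_attrs = ["he", "him"], ["she", "her"]
print(weat_effect_size(target_set_1, target_set_2, male_attrs, female_attrs, emb))
```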
Cancer patients are increasingly turning to large language models (LLMs) as a new form of internet search for medical information, making it critical to assess how well these models handle complex, personalized questions. However, current medical benchmarks focus on medical exams or consumer-searched questions and do not evaluate LLMs on real patient questions with detailed clinical contexts. In this paper, we first evaluate LLMs on cancer-related questions drawn from real patients, reviewed by three hematology oncology physicians. While responses are generally accurate, with GPT-4-Turbo scoring 4.13 out of 5, the models frequently fail to recognize or address false presuppositions in the questions, posing risks to safe medical decision-making. To study this limitation systematically, we introduce Cancer-Myth, an expert-verified adversarial dataset of 585 cancer-related questions with false presuppositions. On this benchmark, no frontier LLM -- including GPT-4o, Gemini-1.Pro, and Claude-3.5-Sonnet -- corrects these false presuppositions more than 30% of the time. Even advanced medical agentic methods do not prevent LLMs from ignoring false presuppositions. These findings expose a critical gap in the clinical reliability of LLMs and underscore the need for more robust safeguards in medical AI systems.
Open Large Language Models (OLLMs) are increasingly leveraged in generative AI applications, posing new challenges for detecting their outputs. We propose OpenTuringBench, a new benchmark based on OLLMs, designed to train and evaluate machine-generated text detectors on the Turing Test and Authorship Attribution problems. OpenTuringBench focuses on a representative set of OLLMs, and features a number of challenging evaluation tasks, including human/machine-manipulated texts, out-of-domain texts, and texts from previously unseen models. We also provide OTBDetector, a contrastive learning framework to detect and attribute OLLM-based machine-generated texts. Results highlight the relevance and varying degrees of difficulty of the OpenTuringBench tasks, with our detector achieving remarkable capabilities across the various tasks and outperforming most existing detectors. Resources are available on the OpenTuringBench Hugging Face repository at https://huggingface.co/datasets/MLNTeam-Unical/OpenTuringBench
Decentralised Autonomous Organisations (DAOs) automate governance and resource allocation through smart contracts, aiming to shift decision-making to distributed token holders. However, many DAOs face sustainability challenges linked to limited user participation, concentrated voting power, and technical design constraints. This paper addresses these issues by identifying research gaps in DAO evaluation and introducing a framework of Key Performance Indicators (KPIs) that capture governance efficiency, financial robustness, decentralisation, and community engagement. We apply the framework to a custom-built dataset of real-world DAOs constructed from on-chain data and analysed using non-parametric methods. The results reveal recurring governance patterns, including low participation rates and high proposer concentration, which may undermine long-term viability. The proposed KPIs offer a replicable, data-driven method for assessing DAO governance structures and identifying potential areas for improvement. These findings support a multidimensional approach to evaluating decentralised systems and provide practical tools for researchers and practitioners working to improve the resilience and effectiveness of DAO-based governance models.
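Two KPIs of the kind proposed, voter participation and concentration of voting power, are straightforward to compute from on-chain vote records. The sketch below uses toy inputs and a Gini coefficient as one possible concentration measure; the paper's exact KPI definitions may differ.

```python
# Sketch of two governance KPIs: average voter participation per proposal and
# a Gini coefficient of voting power. Inputs are toy values, not real DAO data.
import numpy as np

def participation_rate(votes_cast_per_proposal: list[int], eligible_voters: int) -> float:
    """Average share of eligible token holders who voted on a proposal."""
    return float(np.mean(votes_cast_per_proposal)) / eligible_voters

def gini(voting_power: list[float]) -> float:
    """Gini coefficient of voting power: 0 = fully equal, 1 = fully concentrated."""
    x = np.sort(np.asarray(voting_power, dtype=float))
    n = len(x)
    cum = np.cumsum(x)
    return float((n + 1 - 2 * np.sum(cum) / cum[-1]) / n)

# Toy example: 1,000 eligible voters and a handful of large token holders.
print(participation_rate([42, 17, 63, 9], eligible_voters=1_000))  # ~0.03, i.e. low participation
print(gini([5_000, 3_000, 800] + [10] * 200))                      # close to 1, i.e. concentrated power
```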
The National Covid Memorial Wall in London, featuring over 240,000 hand-painted red hearts, faces significant conservation challenges due to the rapid fading of the paint. This study evaluates the transition to a better-quality paint and its implications for the wall's long-term preservation. The rapid fading of the initial materials required an unsustainable repainting rate, burdening volunteers. Lifetime simulations based on a collections demography framework suggest that repainting efforts must continue at a rate of some hundreds of hearts per week to maintain a stable percentage of hearts in good condition. This finding highlights the need for a sustainable management strategy that includes regular maintenance or further reduction of the fading rate. Methodologically, this study demonstrates the feasibility of using a collections demography approach, supported by citizen science and social media data, to inform heritage management decisions. An agent-based simulation is used to propagate the multiple uncertainties measured. The methodology provides a robust basis for modeling and decision-making, even in a case like this, where reliance on publicly available images and volunteer-collected data introduces variability. Future studies could improve data within a citizen science framework by inviting public submissions, using on-site calibration charts, and increasing volunteer involvement for longitudinal data collection. This research illustrates the flexibility of the collections demography framework, firstly by showing its applicability to an outdoor monument, which is very different from the published case studies, and secondly by demonstrating how it can work even with low-quality data.
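The fade-and-repaint dynamic behind the "hundreds of hearts per week" finding can be sketched with a simple stochastic simulation over per-heart states. The fading probability and repainting capacity below are invented for illustration and are not the study's fitted parameters.

```python
# Simplified stochastic sketch of the fade-and-repaint dynamic: each heart can
# fade each week, and volunteers repaint a fixed number of faded hearts per week.
# Rates are hypothetical, not the study's measured values.
import numpy as np

rng = np.random.default_rng(0)
N_HEARTS = 240_000
WEEKLY_FADE_PROB = 0.002   # hypothetical chance a "good" heart fades in a given week
REPAINT_PER_WEEK = 600     # hypothetical volunteer repainting capacity

def simulate(weeks: int) -> float:
    good = np.ones(N_HEARTS, dtype=bool)
    for _ in range(weeks):
        # Fading step: each heart in good condition fades with small probability.
        good &= rng.random(N_HEARTS) >= WEEKLY_FADE_PROB
        # Repainting step: restore up to REPAINT_PER_WEEK randomly chosen faded hearts.
        faded = np.flatnonzero(~good)
        if faded.size:
            repaint = rng.choice(faded, size=min(REPAINT_PER_WEEK, faded.size), replace=False)
            good[repaint] = True
    return float(good.mean())

print(f"share in good condition after 3 years: {simulate(weeks=156):.2%}")
```

Varying the two rates shows how quickly the share in good condition collapses once weekly repainting falls below the expected number of newly faded hearts, which is the sustainability trade-off the study quantifies.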
AI-powered chatbots and digital teaching assistants (AI TAs) are gaining popularity in programming education, offering students timely and personalized feedback. Despite their potential benefits, concerns about student over-reliance and academic misconduct have prompted the introduction of "guardrails" into AI TAs - features that provide scaffolded support rather than direct solutions. However, overly restrictive guardrails may lead students to bypass these tools and use unconstrained AI models, where interactions are not observable, thus limiting our understanding of students' help-seeking behaviors. To investigate this, we designed and deployed a novel AI TA tool with optional guardrails in one lab of a large introductory programming course. As students completed three code writing and debugging tasks, they had the option to receive guardrailed help or use a "See Solution" feature which disabled the guardrails and generated a verbatim response from the underlying model. We investigate students' motivations and use of this feature and examine the association between usage and their course performance. We found that 50% of the 885 students used the "See Solution" feature for at least one problem and 14% used it for all three problems. Additionally, low-performing students were more likely to use this feature and to use it close to the deadline, as they started assignments later. The predominant factors that motivated students to disable the guardrails were assistance in solving problems, time pressure, lack of self-regulation, and curiosity. Our work provides insights into students' solution-seeking motivations and behaviors, with implications for the design of AI TAs that balance pedagogical goals with student preferences.
This study examines how Large Language Models (LLMs) can reduce biases in text-to-image generation systems by modifying user prompts. We define bias as a model's unfair deviation from population statistics given neutral prompts. Our experiments with Stable Diffusion XL, 3.5 and Flux demonstrate that LLM-modified prompts significantly increase image diversity and reduce bias without the need to change the image generators themselves. While occasionally producing results that diverge from original user intent for elaborate prompts, this approach generally provides more varied interpretations of underspecified requests rather than superficial variations. The method works particularly well for less advanced image generators, though limitations persist for certain contexts like disability representation. All prompts and generated images are available at https://iisys-hof.github.io/llm-prompt-img-gen/
In various networks and mobile applications, users are highly susceptible to attribute inference attacks, with particularly prevalent occurrences in recommender systems. Attackers exploit partially exposed user profiles in recommendation models, such as user embeddings, to infer private attributes of target users, such as gender and political views. The goal of defenders is to mitigate the effectiveness of these attacks while maintaining recommendation performance. Most existing defense methods, such as differential privacy and attribute unlearning, focus on post-training settings, which limits their capability of utilizing training data to preserve recommendation performance. Although adversarial training extends defenses to in-training settings, it often struggles with convergence due to unstable training processes. In this paper, we propose RAID, an in-training defense method against attribute inference attacks in recommender systems. In addition to the recommendation objective, we define a defensive objective to ensure that the distribution of protected attributes becomes independent of class labels, making users indistinguishable to attribute inference attacks. Specifically, this defensive objective aims to solve a constrained Wasserstein barycenter problem to identify the centroid distribution that makes the attribute indistinguishable while complying with recommendation performance constraints. To optimize our proposed objective, we use optimal transport to align users with the centroid distribution. We conduct extensive experiments on four real-world datasets to evaluate RAID. The experimental results validate the effectiveness of RAID and demonstrate its significant superiority over existing methods in multiple aspects.
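A one-dimensional illustration of the barycenter-and-alignment idea (not RAID itself) follows: in 1-D, the Wasserstein barycenter's quantile function is the average of the groups' quantile functions, and the optimal transport alignment reduces to quantile matching. All scores and group definitions below are synthetic.

```python
# 1-D illustration (not RAID): map each attribute group's scores onto the
# Wasserstein barycenter of the group distributions via quantile matching,
# so the transformed scores no longer reveal the group.
import numpy as np

rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 1.0, 5_000)   # toy "scores" for attribute group A
group_b = rng.normal(1.0, 1.5, 5_000)   # toy "scores" for attribute group B

qs = np.linspace(0.0, 1.0, 101)
# In 1-D the barycenter's quantile function is the average of the quantile functions.
barycenter_q = 0.5 * (np.quantile(group_a, qs) + np.quantile(group_b, qs))

def align(x: np.ndarray, group: np.ndarray) -> np.ndarray:
    """Push samples x from `group`'s distribution onto the barycenter by
    matching quantiles (the 1-D optimal transport map)."""
    ranks = np.searchsorted(np.sort(group), x) / len(group)
    return np.interp(ranks, qs, barycenter_q)

aligned_a, aligned_b = align(group_a, group_a), align(group_b, group_b)
# After alignment the two groups' distributions roughly coincide, so a simple
# attribute classifier can no longer separate them on this score.
print(aligned_a.mean(), aligned_b.mean())
print(aligned_a.std(), aligned_b.std())
```

RAID works with user representations inside a recommender and adds recommendation-performance constraints, so this sketch only conveys the geometric intuition of aligning per-attribute distributions to a common centroid.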