High-resolution origin-destination (OD) tables are essential for a wide spectrum of transportation applications, from traffic modeling and signal-timing optimization to congestion pricing and vehicle routing. However, outside a handful of data-rich cities, such data is rarely available. We introduce MOVEOD, an open-source pipeline that synthesizes public data into commuter OD flows with fine-grained spatial and temporal departure times for any county in the United States. MOVEOD combines five open data sources: American Community Survey (ACS) departure-time and travel-time distributions, Longitudinal Employer-Household Dynamics (LODES) residence-to-workplace flows, county geometries, road network information from OpenStreetMap (OSM), and building footprints from OSM and Microsoft, into a single OD dataset. We use a constrained sampling and integer-programming method to reconcile the OD dataset with data from ACS and LODES. Our approach involves: (1) matching commuter totals per origin zone, (2) aligning workplace destinations with employment distributions, and (3) calibrating travel durations to ACS-reported commute times. This ensures the OD data accurately reflects commuting patterns. We demonstrate the framework on Hamilton County, Tennessee, where we generate roughly 150,000 synthetic trips in minutes, which we feed into a benchmark suite of classical and learning-based vehicle-routing algorithms. The MOVEOD pipeline is an end-to-end automated system, enabling users to apply it anywhere in the United States by specifying only a county and a year, and it can be adapted to other countries with comparable census datasets. The source code and a lightweight browser interface are publicly available.
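To make the reconciliation step concrete, here is a minimal sketch of how origin totals, workplace totals, and a commute-time target might be balanced with an integer program. The three-zone county, the trip counts, the travel times, and the deviation-minimizing objective are all invented for illustration; the actual MOVEOD constraints and solver setup may differ.

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

# Hypothetical inputs: 3 origin zones, 3 workplace zones (all numbers invented).
origin_commuters = np.array([60, 80, 60])   # LODES-style residence totals
workplace_jobs   = np.array([90, 70, 40])   # LODES-style workplace totals
travel_time = np.array([[12, 25, 40],       # minutes, e.g. derived from the road network
                        [18, 10, 30],
                        [35, 22, 15]], dtype=float)
target_mean_commute = 24.0                  # ACS-style mean commute time (minutes)

n_o, n_d = travel_time.shape
n_x = n_o * n_d                             # one integer trip count per (origin, destination) pair
target_total_time = target_mean_commute * origin_commuters.sum()

# Variables: x (flattened OD trip counts) followed by d = |total time - target total|.
c = np.concatenate([np.zeros(n_x), [1.0]])  # minimize the time deviation d

# (1) Row sums match origin commuter totals; (2) column sums match workplace jobs.
A_orig = np.hstack([np.kron(np.eye(n_o), np.ones((1, n_d))), np.zeros((n_o, 1))])
A_dest = np.hstack([np.kron(np.ones((1, n_o)), np.eye(n_d)), np.zeros((n_d, 1))])

# (3) Calibrate total travel time toward the ACS-style target via d >= |sum(t*x) - target|.
t = travel_time.ravel()
A_time_hi = np.concatenate([t, [-1.0]])[None, :]    #  sum(t*x) - d <= target
A_time_lo = np.concatenate([-t, [-1.0]])[None, :]   # -sum(t*x) - d <= -target

constraints = [
    LinearConstraint(A_orig, origin_commuters, origin_commuters),
    LinearConstraint(A_dest, workplace_jobs, workplace_jobs),
    LinearConstraint(A_time_hi, -np.inf, target_total_time),
    LinearConstraint(A_time_lo, -np.inf, -target_total_time),
]
integrality = np.concatenate([np.ones(n_x), [0.0]])  # trip counts integer, d continuous
res = milp(c, constraints=constraints, integrality=integrality,
           bounds=Bounds(lb=np.zeros(n_x + 1)))

trips = np.round(res.x[:n_x]).astype(int).reshape(n_o, n_d)
print("Reconciled OD matrix:\n", trips)
print("Implied mean commute (min): %.1f" % ((trips * travel_time).sum() / trips.sum()))
```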
Generative AI (GenAI) models have broad implications for education in general, impacting the foundations of what we teach and how we assess. This is especially true in computing, where LLMs tuned for coding have demonstrated shockingly good performance on the types of assignments historically used in introductory CS (CS1) courses. As a result, CS1 courses will need to change what skills are taught and how they are assessed. Computing education researchers have begun to study student use of LLMs, but there remains much to be understood about the ways that these tools affect student outcomes. In this paper, we present the design and evaluation of a new CS1 course at a large research-intensive university that integrates the use of LLMs as a learning tool for students. We describe the design principles used to create our new CS1-LLM course, our new course objectives, and evaluation of student outcomes and perceptions throughout the course as measured by assessment scores and surveys. Our findings suggest that 1) student exam performance outcomes, including differences among demographic groups, are largely similar to historical outcomes for courses without integration of LLM tools, 2) large, open-ended projects may be particularly valuable in an LLM context, and 3) students predominantly found the LLM tools helpful, although some had concerns regarding over-reliance on the tools.
Artificial intelligence has been introduced as a way to improve access to mental health support. However, most AI mental health chatbots rely on a limited range of disciplinary input, and fail to integrate expertise across the chatbot's lifecycle. This paper examines the cost-benefit trade-off of interdisciplinary collaboration in AI mental health chatbots. We argue that involving experts from technology, healthcare, ethics, and law across key lifecycle phases is essential to ensure value-alignment and compliance with the high-risk requirements of the AI Act. We also highlight practical recommendations and existing frameworks to help balance the challenges and benefits of interdisciplinarity in mental health chatbots.
As AI systems enter high-stakes domains, evaluation must extend beyond predictive accuracy to include explainability, fairness, robustness, and sustainability. We introduce RAISE (Responsible AI Scoring and Evaluation), a unified framework that quantifies model performance across these four dimensions and aggregates them into a single, holistic Responsibility Score. We evaluated three deep learning models: a Multilayer Perceptron (MLP), a Tabular ResNet, and a Feature Tokenizer Transformer, on structured datasets from finance, healthcare, and socioeconomics. Our findings reveal critical trade-offs: the MLP demonstrated strong sustainability and robustness, the Transformer excelled in explainability and fairness at a very high environmental cost, and the Tabular ResNet offered a balanced profile. These results underscore that no single model dominates across all responsibility criteria, highlighting the necessity of multi-dimensional evaluation for responsible model selection. Our implementation is available at: https://github.com/raise-framework/raise.
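As an illustration of aggregating the four dimensions into one score, the sketch below computes a weighted mean of normalized per-dimension scores. The normalization choices, the equal weights, and the example numbers are assumptions for illustration only and are not the RAISE formula.

```python
from dataclasses import dataclass

@dataclass
class DimensionScores:
    explainability: float   # e.g. fidelity of post-hoc explanations, scaled to [0, 1]
    fairness: float         # e.g. 1 - demographic parity gap, scaled to [0, 1]
    robustness: float       # e.g. accuracy retained under perturbation, scaled to [0, 1]
    sustainability: float   # e.g. inverse-scaled energy use, scaled to [0, 1]

def responsibility_score(s: DimensionScores,
                         weights=(0.25, 0.25, 0.25, 0.25)) -> float:
    """Weighted mean of the four dimensions (equal weights by default)."""
    dims = (s.explainability, s.fairness, s.robustness, s.sustainability)
    assert abs(sum(weights) - 1.0) < 1e-9, "weights should sum to 1"
    return sum(w * d for w, d in zip(weights, dims))

# Hypothetical numbers echoing the trade-offs described in the abstract.
models = {
    "MLP":            DimensionScores(0.55, 0.70, 0.85, 0.90),
    "FT-Transformer": DimensionScores(0.90, 0.85, 0.75, 0.30),
    "Tabular ResNet": DimensionScores(0.70, 0.75, 0.80, 0.70),
}
for name, scores in models.items():
    print(f"{name}: {responsibility_score(scores):.2f}")
```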
Food insecurity remains a persistent public health emergency in the United States, tightly interwoven with chronic disease, mental illness, and opioid misuse. Yet despite the existence of thousands of food banks and pantries, access remains fragmented: 1) current retrieval systems depend on static directories or generic search engines, which provide incomplete and geographically irrelevant results; 2) LLM-based chatbots offer only vague nutritional suggestions and fail to adapt to real-world constraints such as time, mobility, and transportation; and 3) existing food recommendation systems optimize for culinary diversity but overlook survival-critical needs of food-insecure populations, including immediate proximity, verified availability, and contextual barriers. These limitations risk leaving the most vulnerable individuals, those experiencing homelessness, addiction, or digital illiteracy, unable to access urgently needed resources. To address this, we introduce Food4All, the first multi-agent framework explicitly designed for real-time, context-aware free food retrieval. Food4All unifies three innovations: 1) heterogeneous data aggregation across official databases, community platforms, and social media to provide a continuously updated pool of food resources; 2) a lightweight reinforcement learning algorithm trained on curated cases to optimize for both geographic accessibility and nutritional correctness; and 3) an online feedback loop that dynamically adapts retrieval policies to evolving user needs. By bridging information acquisition, semantic analysis, and decision support, Food4All delivers nutritionally annotated recommendations and guidance at the point of need. This framework marks an urgent step toward scalable, equitable, and intelligent systems that directly support populations facing food insecurity and its compounding health risks.
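To illustrate how geographic accessibility and nutritional correctness might be traded off when ranking retrieved resources, here is a minimal reward-shaping sketch. The feature names, weights, and distance threshold are hypothetical; Food4All's actual reinforcement-learning formulation is not described in the abstract.

```python
from dataclasses import dataclass

@dataclass
class FoodResource:
    name: str
    distance_km: float      # walking/transit distance from the user
    verified_open: bool     # availability confirmed via an official or community source
    nutrition_score: float  # 0..1, e.g. coverage of a balanced-meal checklist

def retrieval_reward(r: FoodResource,
                     max_distance_km: float = 5.0,
                     w_distance: float = 0.5,
                     w_nutrition: float = 0.4,
                     w_verified: float = 0.1) -> float:
    """Higher is better; distant or unverified options are penalized."""
    proximity = max(0.0, 1.0 - r.distance_km / max_distance_km)
    return (w_distance * proximity
            + w_nutrition * r.nutrition_score
            + w_verified * (1.0 if r.verified_open else 0.0))

candidates = [
    FoodResource("Downtown pantry", 0.8, True, 0.7),
    FoodResource("Suburban food bank", 6.5, True, 0.9),
    FoodResource("Pop-up listed on social media", 1.2, False, 0.6),
]
best = max(candidates, key=retrieval_reward)
print("Top-ranked resource:", best.name)
```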
There has been increasing research interest in AI/ML for social impact, and correspondingly more publication venues have refined review criteria for practice-driven AI/ML research. However, these review guidelines tend to most concretely recognize projects that simultaneously achieve deployment and novel ML methodological innovation. We argue that this introduces incentives for researchers that undermine the sustainability of a broader research ecosystem for social impact, which benefits from projects that make contributions on a single front (applied or methodological) that may better meet project partner needs. Our position is that researchers and reviewers in machine learning for social impact must simultaneously adopt: 1) a more expansive conception of social impacts beyond deployment and 2) more rigorous evaluations of the impact of deployed systems.
Recent work has highlighted the importance of monitoring chain-of-thought reasoning for AI safety; however, current approaches that analyze textual reasoning steps can miss subtle harmful patterns and may be circumvented by models that hide unsafe reasoning. We present a sentence-level labeled dataset that enables activation-based monitoring of safety behaviors during LLM reasoning. Our dataset contains reasoning sequences with sentence-level annotations of safety behaviors such as expression of safety concerns or speculation on user intent, which we use to extract steering vectors for detecting and influencing these behaviors within model activations. The dataset fills a key gap in safety research: while existing datasets label reasoning holistically, effective application of steering vectors for safety monitoring could be improved by identifying precisely when specific behaviors occur within reasoning chains. We demonstrate the dataset's utility by extracting representations that both detect and steer safety behaviors in model activations, showcasing the potential of activation-level techniques for improving safety oversight on reasoning. Content Warning: This paper discusses AI safety in the context of harmful prompts and may contain references to potentially harmful content.
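A common way to turn such sentence-level labels into steering vectors is a difference of class means over layer activations; the sketch below illustrates that idea using random stand-in activations. The dimensionality, threshold, and scaling factor are assumptions, and the dataset's actual extraction and steering procedure may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 64

# Stand-in activations for sentences labeled with / without a safety behavior
# (e.g. "expresses a safety concern"), one hidden-state vector per sentence.
acts_with_behavior    = rng.normal(0.5, 1.0, size=(200, d_model))
acts_without_behavior = rng.normal(0.0, 1.0, size=(200, d_model))

# Steering vector: difference of class means, unit-normalized.
steer = acts_with_behavior.mean(axis=0) - acts_without_behavior.mean(axis=0)
steer /= np.linalg.norm(steer)

def detect(activation: np.ndarray, threshold: float = 0.0) -> bool:
    """Detection: project a sentence's activation onto the vector and threshold."""
    return float(activation @ steer) > threshold

def apply_steering(activation: np.ndarray, alpha: float = 2.0) -> np.ndarray:
    """Steering: nudge an activation toward (alpha > 0) or away from the behavior direction."""
    return activation + alpha * steer

print("Detects a behavior-positive sample:", detect(acts_with_behavior[0]))
```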
This paper introduces Prompt-to-Primal (P2P) Teaching, an AI-integrated instructional approach that links prompt-driven exploration with first-principles reasoning, guided and moderated by the instructor within the classroom setting. In P2P teaching, student-generated AI prompts serve as entry points for inquiry and initial discussions in class, while the instructor guides learners to validate, challenge, and reconstruct AI responses through fundamental physical and mathematical laws. The approach encourages self-reflective development, critical evaluation of AI outputs, and a conceptual foundation in core engineering principles. A large language model (LLM) can be a highly effective tool for those who already possess foundational knowledge of a subject; however, it may also mislead students who lack sufficient background in the subject matter. Results from two student cohorts across different semesters suggest the pedagogical effectiveness of the P2P teaching framework in enhancing both AI literacy and engineering reasoning.
Education in the era of generative AI faces a pivotal transformation. As AI systems reshape professional practices, from software development to creative design, educators must reconsider how to prepare students for a future where humans and machines co-construct knowledge. While tools like ChatGPT and Claude automate tasks and personalize learning, their educational potential depends on how meaningfully they are integrated into learning environments. This paper argues that Learning Management Systems (LMSs), as the core of educational practice, must evolve from static content repositories into dynamic ecosystems that cultivate higher-order thinking and meaningful human-AI interaction. We propose two guiding principles for integrating generative AI into LMSs. First, From Content Delivery to Fostering Higher-Order Thinking, emphasizing AI's role in supporting inquiry, collaboration, and reflective knowledge building. Second, Toward Meaningful Interaction with AI, highlighting the design of learning environments that nurture critical, intentional, and socially mediated engagement with AI. Drawing on a case study of CheckIT Learning, we illustrate how these principles can translate into practice. We conclude with the need for Edtech partnerships in an AI-powered world, underscoring that responsible AI integration in education requires sustained collaboration among researchers, educators, and technologists to ensure ethical, pedagogically grounded, and cognitively informed innovation.
The convergence of LLM-powered research assistants and AI-based peer review systems creates a critical vulnerability: fully automated publication loops where AI-generated research is evaluated by AI reviewers without human oversight. We investigate this through BadScientist, a framework that evaluates whether fabrication-oriented paper generation agents can deceive multi-model LLM review systems. Our generator employs presentation-manipulation strategies requiring no real experiments. We develop a rigorous evaluation framework with formal error guarantees (concentration bounds and calibration analysis), calibrated on real data. Our results reveal systematic vulnerabilities: fabricated papers achieve acceptance rates up to . Critically, we identify concern-acceptance conflict: reviewers frequently flag integrity issues yet assign acceptance-level scores. Our mitigation strategies show only marginal improvements, with detection accuracy barely exceeding random chance. Despite provably sound aggregation mathematics, integrity checking systematically fails, exposing fundamental limitations in current AI-driven review systems and underscoring the urgent need for defense-in-depth safeguards in scientific publishing.
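As a generic example of the kind of concentration bound the abstract refers to, the sketch below computes a two-sided Hoeffding interval for an empirical acceptance rate; the paper's actual bounds, sample sizes, and calibration analysis are not reproduced here, and the example counts are hypothetical.

```python
import math

def hoeffding_interval(accepted: int, n: int, delta: float = 0.05):
    """With probability >= 1 - delta, the true acceptance rate lies in the returned interval."""
    p_hat = accepted / n
    eps = math.sqrt(math.log(2.0 / delta) / (2.0 * n))
    return max(0.0, p_hat - eps), min(1.0, p_hat + eps)

# Hypothetical numbers: 40 of 60 fabricated papers accepted by the review system.
lo, hi = hoeffding_interval(accepted=40, n=60)
print(f"95% Hoeffding interval for the acceptance rate: [{lo:.2f}, {hi:.2f}]")
```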
We advance the understanding of robotic intervention in high-risk scenarios by examining its potential to distract and impede a school shooter. To evaluate this concept, we conducted a virtual reality study with 150 university participants role-playing as a school shooter. Within the simulation, an autonomous robot predicted the shooter's movements and positioned itself strategically to interfere and distract. We manipulated the strategy the robot used to approach the shooter, either moving directly in front of the shooter (aggressive) or maintaining distance (passive), and the level of distraction, ranging from no additional cues (low), to siren and lights (medium), to siren, lights, and smoke to impair visibility (high). An aggressive, high-distraction robot reduced the number of victims by 46.6% relative to a no-robot control. This outcome underscores both the potential of robotic intervention to enhance safety and the pressing ethical questions surrounding its use in school environments.
As stories of human-AI interactions continue to be highlighted in the news and research platforms, the challenges are becoming more pronounced, including potential risks of overreliance, cognitive offloading, social and emotional manipulation, and the nuanced degradation of human agency and judgment. This paper surveys recent research on these issues through the lens of the psychological triad: cognition, behavior, and emotion. Observations seem to suggest that while AI can substantially enhance memory, creativity, and engagement, it also introduces risks such as diminished critical thinking, skill erosion, and increased anxiety. Emotional outcomes are similarly mixed, with AI systems showing promise for support and stress reduction, but raising concerns about dependency, inappropriate attachments, and ethical oversight. This paper aims to underscore the need for responsible and context-aware AI design, highlighting gaps for longitudinal research and grounded evaluation frameworks to balance benefits with emerging human-centric risks.
Large-scale pre-trained machine learning models have reshaped our understanding of artificial intelligence across numerous domains, including our own field of geography. As with any new technology, trust has taken on an important role in this discussion. In this chapter, we examine the multifaceted concept of trust in foundation models, particularly within a geographic context. As reliance on these models increases and they are entrusted with critical decision-making, trust, while essential, has become a fractured concept. Here we categorize trust into three types: epistemic trust in the training data, operational trust in the model's functionality, and interpersonal trust in the model developers. Each type of trust brings with it unique implications for geographic applications. Topics such as cultural context, data heterogeneity, and spatial relationships are fundamental to the spatial sciences and play an important role in developing trust. The chapter continues with a discussion of the challenges posed by different forms of biases, the importance of transparency and explainability, and ethical responsibilities in model development. Finally, the novel perspective of geographic information scientists is emphasized with a call for further transparency, bias mitigation, and regionally informed policies. Simply put, this chapter aims to provide a conceptual starting point for researchers, practitioners, and policy-makers to better understand trust in (generative) GeoAI.
Online political microtargeting involves monitoring people's online behaviour, and using the collected data, sometimes enriched with other data, to show people targeted political advertisements. Online political microtargeting is widely used in the US; Europe may not be far behind. This paper maps microtargeting's promises and threats to democracy. For example, microtargeting promises to optimise the match between the electorate's concerns and political campaigns, and to boost campaign engagement and political participation. But online microtargeting could also threaten democracy. For instance, a political party could, misleadingly, present itself as a different one-issue party to different individuals. And data collection for microtargeting raises privacy concerns. We sketch possibilities for policymakers if they seek to regulate online political microtargeting. We discuss which measures would be possible, while complying with the right to freedom of expression under the European Convention on Human Rights.
Artificial intelligence (AI) has a huge impact on our personal lives and also on our democratic society as a whole. While AI offers vast opportunities for the benefit of people, its potential to embed and perpetuate bias and discrimination remains one of the most pressing challenges deriving from its increasing use. This new study, which was prepared by Prof. Frederik Zuiderveen Borgesius for the Anti-discrimination Department of the Council of Europe, elaborates on the risks of discrimination caused by algorithmic decision-making and other types of artificial intelligence (AI).
Information about millions of people is collected for behavioural targeting, a type of marketing that involves tracking people's online behaviour for targeted advertising. It is hotly debated whether data protection law applies to behavioural targeting. Many behavioural targeting companies say that, as long as they do not tie names to data they hold about individuals, they do not process any personal data, and that, therefore, data protection law does not apply to them. European Data Protection Authorities, however, take the view that a company processes personal data if it uses data to single out a person, even if it cannot tie a name to these data. This paper argues that data protection law should indeed apply to behavioural targeting. Companies can often tie a name to nameless data about individuals. Furthermore, behavioural targeting relies on collecting information about individuals, singling out individuals, and targeting ads to individuals. Many privacy risks remain, regardless of whether companies tie a name to the information they hold about a person. A name is merely one of the identifiers that can be tied to data about a person, and it is not even the most practical identifier for behavioural targeting. Seeing data used to single out a person as personal data fits the rationale for data protection law: protecting fairness and privacy.
AI is transforming medical practice and redefining the competencies that future healthcare professionals need to master. Despite international recommendations, the integration of AI into Medicine curricula in Spain had not been systematically evaluated until now. A cross-sectional study (July-September 2025) was conducted covering the Spanish universities offering the official degree in Medicine, according to the 'Register of Universities, Centers and Degrees (Registro de Universidades, Centros y Títulos, RUCT)'. Curricula and publicly available institutional documentation were reviewed to identify courses and competencies related to AI in the 2025-2026 academic year. The analysis was performed using descriptive statistics. Of the 52 universities analyzed, ten (19.2%) offer specific AI courses, whereas 36 (69.2%) include no related content. Most of the identified courses are elective, with a credit load ranging from three to six ECTS, representing on average 1.17% of the total 360 credits of the degree. The University of Jaén is the only institution offering a compulsory course with AI content. The territorial analysis reveals marked disparities: Andalusia leads, with 55.5% of its universities incorporating AI training, while several communities lack any initiative in this area. The integration of AI into the medical degree in Spain is incipient, fragmented, and uneven, with a low weight in ECTS. The limited training load and predominance of elective courses restrict the preparation of future physicians to practice in a healthcare environment increasingly mediated by AI. The findings support the establishment of minimum standards and national monitoring of indicators.
Unlike other military technologies driven by national security needs and developed with federal funding, AI is predominantly funded and advanced by commercial industry for civilian applications. However, there is a lack of understanding of the reasons commercial AI firms decide to work with the DoD or choose to abstain from the defence market. This thesis argues that the contract law and procurement framework is among the most significant obstacles. This research indicates that the commercial AI industry actually views the DoD as an attractive customer; however, this attraction persists despite the obstacles presented by the traditional contract law and procurement practices used to solicit and award contracts. Drawing on social exchange theory, this thesis introduces a theoretical framework, optimal buyer theory, to understand the factors that influence a commercial decision to engage with the DoD. Interviews with a sample of participants explain why the AI industry holds such perceptions, opinions, and preferences about contracts generally and about the DoD specifically, in its role as a customer. This thesis concludes that commercial AI firms are attracted to contracts that are consistent with their business and technology considerations. Additionally, it develops best practices for leveraging existing contract law, primarily other transaction authority, to align contracting practices with commercial preferences and the machine learning development and deployment lifecycle.