This paper makes the opaque data market in the AI economy empirically legible for the first time, constructing a computational testbed to address a core epistemic failure: regulators must govern a market defined by structural opacity, fragile price discovery, and brittle technical safeguards, conditions that have paralyzed traditional empirics and fragmented policy. The pipeline begins with multi-year fieldwork to extract the market's hidden logic, then embeds these grounded behaviors into a high-fidelity agent-based model (ABM), parameterized via a novel LLM-based discrete-choice experiment that captures the preferences of populations that cannot be surveyed directly. The pipeline is validated against reality, reproducing observed trade patterns. This policy laboratory delivers clear, counter-intuitive results. First, property-style relief is a false promise: "anonymous-data" carve-outs expand trade but ignore risk, so aggregate welfare collapses once external harms are priced in. Second, social welfare peaks when the downstream buyer internalizes the full substantive risk. This least-cost-avoider approach induces efficient safeguards, simultaneously raising welfare and sustaining trade, and provides a robust empirical foundation for the legal drift toward two-sided reachability. The contribution is a reproducible pipeline that converts qualitative insight into testable, comparative policy experiments, replacing armchair conjecture with controlled evidence on how legal rules actually shift risk and surplus and moving the field from competing intuitions to direct computational analysis.
Urban food delivery services have become an integral part of daily life, yet their mobility and environmental externalities remain poorly addressed by planners. Most studies neglect whether consumers pay enough to internalize the broader social costs of these services. This study quantifies the value of access to and use of food delivery services in Beijing, China, through two discrete choice experiments. The first measures willingness to accept compensation for giving up access, with a median value of CNY588 (approximately USD80). The second captures willingness to pay for reduced waiting time and improved reliability, showing valuations far exceeding typical delivery fees (e.g., CNY96.6/hour and CNY4.83/min at work). These results suggest a substantial consumer surplus and a clear underpricing problem. These findings highlight the need for urban planning to integrate digital service economies into pricing and mobility frameworks. We propose a quantity-based pricing model that targets delivery speed rather than order volume, addressing the primary source of externalities while maintaining net welfare gains. This approach offers a pragmatic, equity-conscious strategy to curb delivery-related congestion, emissions, and safety risks, especially in dense urban cores.
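As a point of reference for how valuations of this kind are typically recovered (a generic illustration, not this paper's exact specification): in a linear-in-attributes choice model, the willingness to pay for a one-unit reduction in waiting time is the ratio of the time and cost coefficients,

$$ V_{ij} \;=\; \beta_c\,\mathrm{cost}_{ij} \;+\; \beta_t\,\mathrm{time}_{ij} \;+\; \cdots, \qquad \mathrm{WTP}_{\mathrm{time}} \;=\; -\,\frac{\beta_t}{\beta_c}, $$

which is the form in which per-hour and per-minute time valuations such as those reported above are usually expressed.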
The sustainability of cooperation is crucial for understanding the progress of societies. We study a repeated game in which individuals decide the share of their income to transfer to other group members. A central feature of our model is that individuals may, with some probability, switch incomes across periods (our measure of income mobility), while the overall income distribution remains constant over time. We analyze how income mobility and income inequality affect the sustainability of contribution norms, that is, informal agreements about how much each member should transfer to the group. We find that greater income mobility facilitates cooperation. In contrast, the effect of inequality is ambiguous and depends on the progressivity of the contribution norm and the degree of mobility. We apply our framework to an optimal taxation problem to examine the interaction between public and private redistribution.
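As a generic illustration of the kind of sustainability condition at play (not the paper's exact derivation): a contribution norm supported by reversion to no transfers is sustainable when, for every member and income position, the one-shot gain from withholding the prescribed transfer is outweighed by the discounted expected benefit of keeping the norm alive,

$$ \underbrace{g_i\!\left(y_i^{t}\right)}_{\text{one-shot gain from shirking}} \;\le\; \frac{\delta}{1-\delta}\;\mathbb{E}\!\left[\, b_i\!\left(y_i^{t+1}\right) \;\middle|\; y_i^{t} \,\right], $$

where $\delta$ is the discount factor, $b_i$ the per-period net benefit of the norm, and the expectation runs over future income positions. Greater mobility shifts probability toward positions in which currently rich members benefit from others' transfers, which is one channel through which mobility can relax the binding constraints.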
This study explored how advanced budgeting techniques and economic indicators influence funding levels and strategic alignment in California Community Colleges (CCCs). Despite widespread implementation of budgeting reforms, many CCCs continue to face challenges aligning financial planning with institutional missions, particularly in supporting diversity, equity, and inclusion (DEI) initiatives. The study used a quantitative correlational design, analyzing 30 years of publicly available economic data, including unemployment rates, GDP growth, and CPI, in relation to CCC funding trends. Results revealed a strong positive correlation between GDP growth and CCC funding levels, as well as between CPI and funding levels, underscoring the predictive value of macroeconomic indicators in budget planning. These findings emphasize the need for educational leaders to integrate economic forecasting into budget planning processes to safeguard institutional effectiveness and sustain programs serving underrepresented student populations.
The evolution of global income distribution from 1988 to 2018 is analyzed using purchasing power parity exchange rates and well-established statistical distributions. This research proposes the use of two separate distributions to more accurately represent the overall data, rather than relying on a single distribution. The global income distribution was fitted to log-normal and gamma functions, which are standard tools in econophysics. Despite limitations in data completeness during the early years, the available information covered the vast majority of the world's population. Probability density function (PDF) curves enabled the identification of key peaks in the distribution, while complementary cumulative distribution function (CCDF) curves highlighted general trends in inequality. Initially, the global income distribution exhibited a bimodal pattern; however, the growth of middle classes in highly populated countries such as China and India has driven the transition to a unimodal distribution in recent years. While single-function fits with gamma or log-normal distributions provided reasonable accuracy, the bimodal approach constructed as a sum of log-normal distributions yielded near-perfect fits.
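A minimal sketch of the bimodal fit described above, assuming synthetic income data and a two-component log-normal mixture estimated by maximum likelihood (the paper's data and exact estimation procedure are not reproduced here):

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize

rng = np.random.default_rng(0)
# Synthetic stand-in for per-capita incomes: two log-normal "modes"
incomes = np.concatenate([rng.lognormal(7.0, 0.6, 60_000),   # lower-income component
                          rng.lognormal(9.5, 0.5, 40_000)])  # higher-income component

def neg_log_lik(params, x):
    """Negative log-likelihood of a two-component log-normal mixture."""
    w, mu1, s1, mu2, s2 = params
    pdf = (w * stats.lognorm.pdf(x, s1, scale=np.exp(mu1))
           + (1 - w) * stats.lognorm.pdf(x, s2, scale=np.exp(mu2)))
    return -np.sum(np.log(pdf + 1e-300))

res = minimize(neg_log_lik, x0=[0.5, 7.5, 0.5, 9.0, 0.5], args=(incomes,),
               bounds=[(0.01, 0.99), (0, 15), (0.05, 3), (0, 15), (0.05, 3)],
               method="L-BFGS-B")
w, mu1, s1, mu2, s2 = res.x
print(f"mixture weight = {w:.2f}, component medians = "
      f"{np.exp(mu1):,.0f} and {np.exp(mu2):,.0f}")
```

A unimodal log-normal or gamma fit is the special case where one mixture weight goes to zero, which is consistent with the transition described above.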
This paper reexamines the effects of the Latin Monetary Union (LMU) - a 19th century agreement among several European countries to standardize their currencies through a bimetallic system based on fixed gold and silver content - on trade. Unlike previous studies, this paper adopts the latest advances in gravity modeling and a more rigorous approach to defining the control group by accounting for the diversity of currency regimes during the early years of the LMU. My findings suggest that the LMU had a positive effect on trade between its members until the early 1870s, when bimetallism was still considered a viable monetary system. These effects then faded, converging to zero. Results are robust to the inclusion of additional potential confounders, the use of various samples spanning different countries and trade data sources, and alternative methodological choices.
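As a schematic of the gravity approach referenced above (not the paper's specification), a minimal PPML sketch on synthetic bilateral trade data with an LMU co-membership dummy; in practice, modern estimates absorb exporter-time, importer-time, and pair fixed effects with dedicated high-dimensional FE routines, which this toy example simplifies to separate exporter, importer, and year effects:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
countries, years = list("ABCDEFGH"), range(1865, 1880)
rows = [(e, i, y) for e in countries for i in countries if i != e for y in years]
df = pd.DataFrame(rows, columns=["exporter", "importer", "year"])
# Hypothetical LMU membership: countries A-D form the union from 1866 onward
df["lmu_pair"] = (df["exporter"].isin(list("ABCD"))
                  & df["importer"].isin(list("ABCD"))
                  & (df["year"] >= 1866)).astype(int)
df["trade"] = rng.poisson(lam=np.exp(1.0 + 0.3 * df["lmu_pair"]))

# PPML gravity: Poisson GLM of trade flows on the LMU dummy plus fixed effects
fit = smf.glm("trade ~ lmu_pair + C(exporter) + C(importer) + C(year)",
              data=df, family=sm.families.Poisson()).fit()
print(f"Estimated LMU effect on trade (log points): {fit.params['lmu_pair']:.3f}")
```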
Despite data's central role in AI production, it remains the least understood input. As AI labs exhaust public data and turn to proprietary sources, with deals reaching hundreds of millions of dollars, research across computer science, economics, law, and policy has fragmented. We establish data economics as a coherent field through three contributions. First, we characterize data's distinctive properties -- nonrivalry, context dependence, and emergent rivalry through contamination -- and trace historical precedents for market formation in commodities such as oil and grain. Second, we present systematic documentation of AI training data deals from 2020 to 2025, revealing persistent market fragmentation, five distinct pricing mechanisms (from per-unit licensing to commissioning), and that most deals exclude original creators from compensation. Third, we propose a formal hierarchy of exchangeable data units (token, record, dataset, corpus, stream) and argue for data's explicit representation in production functions. Building on these foundations, we outline four open research problems foundational to data economics: measuring context-dependent value, balancing governance with privacy, estimating data's contribution to production, and designing mechanisms for heterogeneous, compositional goods.
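As an illustrative functional form (not the paper's), explicit representation of data in production could nest a data aggregate $D$ alongside compute $K$ and labor $L$, with $D$ built from heterogeneous exchangeable units whose value is context-dependent:

$$ Y \;=\; A\,K^{\alpha}L^{\beta}D^{\gamma}, \qquad D \;=\; \Big(\sum_{j} \big(q_j\,x_j\big)^{\rho}\Big)^{1/\rho}, $$

where $x_j$ is the quantity of data unit $j$ (token, record, dataset, corpus, or stream), $q_j$ its context-dependent quality weight, and $\rho$ governs substitutability across units; contamination-driven emergent rivalry would appear as $q_j$ declining with prior use.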
An increasingly large number of experiments study the labor productivity effects of automation technologies such as generative algorithms. A popular question in these experiments relates to inequality: does the technology increase output more for high- or low-skill workers? The answer is often used to anticipate the distributional effects of the technology as it continues to improve. In this paper, we formalize the theoretical content of this empirical test, focusing on automation experiments as commonly designed. Worker-level output depends on a task-level production function, and workers are heterogeneous in their task-level skills. Workers perform a task themselves, or they delegate it to the automation technology. The inequality effect of improved automation depends on the interaction of two factors: ($i$) the correlation in task-level skills across workers, and ($ii$) workers' skills relative to the technology's capability. Importantly, the sign of the inequality effect is often non-monotonic -- as technologies improve, inequality may decrease then increase, or vice versa. Finally, we use data and theory to highlight cases when skills are likely to be positively or negatively correlated. The model generally suggests that the diversity of automation technologies will play an important role in the evolution of inequality.
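A minimal simulation sketch of the mechanism described above, assuming two tasks, jointly normal worker skills with adjustable correlation, and delegation of a task to the technology whenever its capability exceeds the worker's own skill (distributions and parameter values are illustrative, not the paper's calibration):

```python
import numpy as np

rng = np.random.default_rng(0)

def output_spread(corr, tech, n=100_000):
    """90-10 spread of worker output when tasks are delegated if the technology is better."""
    cov = [[1.0, corr], [corr, 1.0]]
    skills = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)  # task-level skills
    # Each task is performed by whichever is better: the worker or the technology
    output = np.maximum(skills, tech).sum(axis=1)
    p90, p10 = np.percentile(output, [90, 10])
    return p90 - p10

for corr in (-0.5, 0.0, 0.8):
    spreads = [output_spread(corr, tech) for tech in (-1.0, 0.0, 1.0, 2.0)]
    print(f"skill correlation {corr:+.1f}: ",
          " -> ".join(f"{s:.2f}" for s in spreads),
          " (output spread as technology capability improves)")
```

Tracking the spread as the capability parameter rises illustrates how the inequality effect can change sign along the improvement path, depending on the skill correlation.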
We develop a method to estimate producers' productivity beliefs when output quantities and input prices are unobservable, and we use it to evaluate the market for science. Our model of researchers' labor supply shows how their willingness to pay for inputs reveals their productivity beliefs. We estimate the model's parameters using data from a nationally representative survey of researchers and find the distribution of productivity to be very skewed. Our counterfactuals indicate that a more efficient allocation of the current budget could be worth billions of dollars. There are substantial gains from developing new ways of identifying talented scientists.
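As a generic illustration of the identification idea (not the paper's model): if a researcher with productivity belief $\theta$ chooses an input level $x$ to maximize believed output net of input cost, the first-order condition links the marginal willingness to pay for the input to the belief,

$$ \max_{x}\; \theta\,f(x) - w\,x \quad\Longrightarrow\quad \theta\,f'(x^{*}) = w \quad\Longrightarrow\quad \theta = \frac{w}{f'(x^{*})}, $$

so observed willingness to pay at the chosen input level reveals $\theta$ given the shape of $f$.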
This study applies an optimized XGBoost regression model to estimate district-level expenditures on high-dosage tutoring from incomplete administrative data. The COVID-19 pandemic caused unprecedented learning loss, with K-12 students losing up to half a grade level in certain subjects. To address this, the federal government allocated \$190 billion in relief. Previous research shows that small-group tutoring, summer and after-school programs, and increased support staff were common district expenditures, but not how much was spent in each category. Using a custom scraped dataset of over 7,000 ESSER (Elementary and Secondary School Emergency Relief) plans, we model tutoring allocations as a function of district characteristics such as enrollment, total ESSER funding, urbanicity, and school count. Extending the trained model to districts that mention tutoring but omit cost information yields an estimated aggregate allocation of approximately \$2.2 billion. The model achieved an out-of-sample $R^2$=0.358, demonstrating moderate predictive accuracy given substantial reporting heterogeneity. Methodologically, this work illustrates how gradient-boosted decision trees can reconstruct large-scale fiscal patterns where structured data are sparse or missing. The framework generalizes to other domains where policy evaluation depends on recovering latent financial or behavioral variables from semi-structured text and sparse administrative sources.
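A minimal sketch of the imputation pipeline described above, using synthetic data as a stand-in for the scraped ESSER plans (column names, sample size, and hyperparameters are illustrative, not the paper's):

```python
import numpy as np
import pandas as pd
from xgboost import XGBRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 5000
plans = pd.DataFrame({
    "enrollment": rng.integers(200, 50_000, n),
    "total_esser_funding": rng.lognormal(15, 1, n),
    "urbanicity_code": rng.integers(1, 5, n),
    "school_count": rng.integers(1, 80, n),
})
plans["tutoring_amount"] = 0.02 * plans["total_esser_funding"] + rng.normal(0, 5e4, n)
reported = rng.random(n) < 0.6                     # only some plans report a dollar figure
plans.loc[~reported, "tutoring_amount"] = np.nan

features = ["enrollment", "total_esser_funding", "urbanicity_code", "school_count"]
labeled = plans.dropna(subset=["tutoring_amount"])
X_tr, X_te, y_tr, y_te = train_test_split(labeled[features], labeled["tutoring_amount"],
                                          test_size=0.2, random_state=0)

model = XGBRegressor(n_estimators=400, learning_rate=0.05, max_depth=4,
                     subsample=0.8, colsample_bytree=0.8)
model.fit(X_tr, y_tr)
print("out-of-sample R^2:", round(r2_score(y_te, model.predict(X_te)), 3))

# Extend the fitted model to plans that omit the cost figure and sum the imputations
missing = plans[plans["tutoring_amount"].isna()]
print(f"imputed aggregate allocation: ${model.predict(missing[features]).clip(min=0).sum():,.0f}")
```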
We study optimal monetary policy when a central bank maximizes a quantile utility objective rather than expected utility. In our framework, the central bank's risk attitude is indexed by the quantile level, providing a transparent mapping between hawkish/dovish stances and attention to adverse macroeconomic realizations. We formulate the infinite-horizon problem using a Bellman equation with the quantile operator. Implementing an Euler-equation approach, we derive Taylor-rule-type reaction functions. Using an indirect inference approach, we recover the implicit quantile index that captures the central bank's risk aversion. An empirical implementation for the US is outlined, based on reduced-form laws of motion with conditional heteroskedasticity, enabling estimation of the new monetary policy rule and its dependence on the Fed's risk attitudes. The results reveal that the Fed has mostly behaved dovishly, with some periods of hawkish attitudes.
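A schematic of the recursive formulation described above (notation illustrative; the paper's state variables and period objective are not reproduced): the expectation operator in the standard Bellman equation is replaced by the conditional $\tau$-quantile operator,

$$ V(s_t) \;=\; \max_{i_t}\;\Big\{\, u(s_t, i_t) \;+\; \beta\,\mathcal{Q}_{\tau}\big[\, V(s_{t+1}) \,\big|\, s_t, i_t \,\big] \,\Big\}, $$

so that the quantile index $\tau$ governs how much weight the policymaker places on adverse realizations of the continuation value, and the Euler-equation approach delivers reaction functions indexed by $\tau$.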
This article investigates the fundamental factors influencing the rate and manner of electoral participation using an economic, model-based approach. The structural parameters affecting people's decision making are divided into two categories. The first includes general factors such as economic and livelihood status, cultural factors, and psychological variables. In this part, because voters are analyzed within the framework of consumer behavior theory, inflation and unemployment are treated as the most important economic factors. The second group of factors focuses on the type of voting, with emphasis on government performance. Since the incumbent government and its supportive voters are engaged in a game with two Nash equilibria, and since voters are in most cases retrospective, the government seeks to retain its position through deliberate changes in economic factors, especially the inflation and unemployment rates. Finally, to illustrate the issue, a hypothetical example of a state-run populist employment plan in a developing country is presented and analyzed.
Most studies on the labor market effects of immigration use repeated cross-sectional data to estimate the effects of immigration on regions. This paper shows that such regional effects are composites of effects that address fundamental questions in the immigration debate but remain unidentified with repeated cross-sectional data. We provide a unifying empirical framework that decomposes the regional effects of immigration into their underlying components and show how these are identifiable from data that track workers over time. Our empirical application illustrates that such analysis yields a far more informative picture of immigration's effects on wages, employment, and occupational upgrading.
With escalating macroeconomic uncertainty, the risk interlinkages between energy and food markets have become increasingly complex, posing serious challenges to global energy and food security. This paper proposes an integrated framework combining the GJRSK model, the time-frequency connectedness analysis, and the random forest method to systematically investigate the moment connectedness within the energy-food nexus and explore the key drivers of various spillover effects. The results reveal significant multidimensional risk spillovers with pronounced time variation, heterogeneity, and crisis sensitivity. Return and skewness connectedness are primarily driven by short-term spillovers, kurtosis connectedness is more prominent over the medium term, while volatility connectedness is dominated by long-term dynamics. Notably, crude oil consistently serves as a central transmitter in diverse connectedness networks. Furthermore, the spillover effects are influenced by multiple factors, including macro-financial conditions, oil supply-demand fundamentals, policy uncertainties, and climate-related shocks, with the core drivers of connectedness varying considerably across different moments and timescales. These findings provide valuable insights for the coordinated governance of energy and food markets, the improvement of multilayered risk early-warning systems, and the optimization of investment strategies.
I evaluate San Juan, Puerto Rico's late-night alcohol sales ordinance using a multi-outcome synthetic control that pools economic and public-safety series. I show that a common-weight estimator clarifies mechanisms under low-rank outcome structure. I find economically meaningful reallocations in targeted sectors -- restaurants and bars, gasoline and convenience, and hospitality employment -- while late-night public disorder arrests and violent crime show no clear departures from pre-policy trends. The short post-policy window and small donor pool limit statistical power; joint conformal and permutation tests do not reject the null at conventional thresholds. I therefore emphasize effect magnitudes, temporal persistence, and pre-trend fit over formal significance. Code and diagnostics are available for replication.
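A minimal sketch of a common-weight, multi-outcome synthetic control on synthetic data: pre-treatment series for several outcomes are standardized and stacked, and a single simplex weight vector over donors is chosen to fit all of them jointly (the paper's actual outcomes, donor pool, and standardization choices are not reproduced):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
T_pre, K, J = 36, 3, 12                          # pre-period months, outcomes, donor units
donors = rng.normal(size=(K, T_pre, J))          # donor outcome paths
w_true = rng.dirichlet(np.ones(J))
treated = donors @ w_true + rng.normal(0, 0.05, size=(K, T_pre))

# Standardize each outcome so no single series dominates the pooled fit
mu = treated.mean(axis=1, keepdims=True)
sd = treated.std(axis=1, keepdims=True)
Y1 = ((treated - mu) / sd).reshape(-1)                            # stacked treated path
Y0 = ((donors - mu[:, :, None]) / sd[:, :, None]).reshape(-1, J)  # stacked donor paths

def pooled_loss(w):
    """Pre-treatment fit pooled across all standardized outcomes."""
    return np.sum((Y1 - Y0 @ w) ** 2)

res = minimize(pooled_loss, x0=np.full(J, 1 / J),
               bounds=[(0, 1)] * J,
               constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1}],
               method="SLSQP")
print("common donor weights:", np.round(res.x, 3))
```

Because one weight vector must reconcile all outcomes, the estimator exploits the low-rank structure across series rather than fitting each outcome separately.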
Global climate warming and air pollution pose severe threats to economic development and public safety, presenting significant challenges to sustainable development worldwide. Corporations, as key players in resource utilization and emissions, have drawn increasing attention from policymakers, researchers, and the public regarding their environmental strategies and practices. This study employs a two-way fixed effects panel model to examine the impact of environmental information disclosure on corporate environmental performance, its regional heterogeneity, and the underlying mechanisms. The results demonstrate that environmental information disclosure significantly improves corporate environmental performance, with the effect being more pronounced in areas of high population density and limited green space. These findings provide empirical evidence supporting the role of environmental information disclosure as a critical tool for improving corporate environmental practices. The study highlights the importance of targeted, region-specific policies to maximize the effectiveness of disclosure, offering valuable insights for promoting sustainable development through enhanced corporate transparency.
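A minimal sketch of the two-way fixed effects specification described above, using synthetic firm-year data and the `linearmodels` package (variable names are illustrative; the paper's controls, sample, and clustering scheme are not reproduced):

```python
import numpy as np
import pandas as pd
from linearmodels.panel import PanelOLS

rng = np.random.default_rng(7)
firms, years = 500, 10
idx = pd.MultiIndex.from_product([range(firms), range(2012, 2012 + years)],
                                 names=["firm", "year"])
df = pd.DataFrame(index=idx)
df["disclosure"] = rng.integers(0, 2, len(df))                       # environmental disclosure
df["env_performance"] = 0.3 * df["disclosure"] + rng.normal(0, 1, len(df))

# Two-way fixed effects: firm and year effects absorbed, standard errors clustered by firm
res = PanelOLS.from_formula(
    "env_performance ~ disclosure + EntityEffects + TimeEffects", data=df
).fit(cov_type="clustered", cluster_entity=True)
print(res.params["disclosure"], res.std_errors["disclosure"])
```

Regional heterogeneity of the kind reported above is typically examined by re-estimating this specification on subsamples (e.g., high- versus low-density regions) or by interacting the disclosure variable with region indicators.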
The rapid ascent of Foundation Models (FMs), enabled by the Transformer architecture, drives the current AI ecosystem. Characterized by large-scale training and downstream adaptability, FMs (such as the GPT family) have achieved massive public adoption, fueling a turbulent market shaped by platform economics and intense investment. Assessing the vulnerability of this fast-evolving industry is critical yet challenging due to data limitations. This paper proposes a synthetic AI Vulnerability Index (AIVI) focused on the upstream value chain for FM production, prioritizing publicly available data. We model FM output as a function of five inputs: Compute, Data, Talent, Capital, and Energy, hypothesizing that supply vulnerability in any input threatens the industry. Key vulnerabilities include compute concentration, data scarcity and legal risks, talent bottlenecks, capital intensity and strategic dependencies, and escalating energy demands. Acknowledging imperfect input substitutability, we propose a weighted geometric average of aggregate subindexes, normalized using theoretical or empirical benchmarks. Despite its limitations and room for improvement, this preliminary index aims to quantify systemic risks in AI's core production engine and, implicitly, to shed light on risks for the downstream value chain.
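A minimal sketch of the aggregation described above, assuming hypothetical subindex values and weights (the actual normalization benchmarks and weighting are the paper's choices and are not reproduced here):

```python
import numpy as np

# Hypothetical vulnerability subindexes on [0, 1] after benchmark normalization
subindexes = {"compute": 0.72, "data": 0.65, "talent": 0.48, "capital": 0.55, "energy": 0.40}
weights    = {"compute": 0.30, "data": 0.25, "talent": 0.15, "capital": 0.15, "energy": 0.15}

# Weighted geometric mean of the normalized subindexes (weights rescaled to sum to one)
vals = np.array([subindexes[k] for k in subindexes])
wts  = np.array([weights[k] for k in subindexes])
aivi = float(np.prod(vals ** (wts / wts.sum())))
print(f"AIVI = {aivi:.3f}")
```

The geometric (rather than arithmetic) average reflects imperfect substitutability: the index cannot be propped up by strength in one input if another input's subindex is extreme.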
This study uses the Synthetic Control Method (SCM) to estimate the causal impact of a January 2025 wildfire on housing prices in Altadena, California. We construct a 'synthetic' Altadena from a weighted average of peer cities to serve as a counterfactual; this approach assumes no spillover effects on the donor pool. The results reveal a substantial negative price effect that intensifies over time. Over the six months following the event, we estimate an average monthly loss of $32,125. The statistical evidence for this effect is nuanced. Based on the robust post-to-pre-treatment RMSPE ratio, the result is statistically significant at the 10% level (p = 0.0508). In contrast, the effect is not statistically significant when measured by the average post-treatment gap (p = 0.3220). This analysis highlights the significant financial risks faced by communities in fire-prone regions and demonstrates SCM's effectiveness in evaluating disaster-related economic damages.
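A minimal sketch of the RMSPE-ratio placebo inference used in synthetic control studies like this one, assuming hypothetical gap series (treated-minus-synthetic differences) for the treated unit and for each donor city (the paper's actual series and donor pool are not reproduced):

```python
import numpy as np

def rmspe_ratio(gaps, t0):
    """Post-/pre-treatment root mean squared prediction error ratio for one gap series."""
    pre, post = gaps[:t0], gaps[t0:]
    return np.sqrt(np.mean(post ** 2)) / np.sqrt(np.mean(pre ** 2))

rng = np.random.default_rng(11)
t0, t_post = 48, 6                       # pre-treatment months and post-treatment months
treated_gap = np.concatenate([rng.normal(0, 5, t0), rng.normal(-32, 8, t_post)])
placebo_gaps = [np.concatenate([rng.normal(0, 5, t0), rng.normal(0, 8, t_post)])
                for _ in range(30)]      # placebo gaps from donor cities

ratios = [rmspe_ratio(g, t0) for g in placebo_gaps]
treated_ratio = rmspe_ratio(treated_gap, t0)
# Permutation p-value: share of units with a ratio at least as extreme as the treated unit
p_value = (1 + sum(r >= treated_ratio for r in ratios)) / (1 + len(ratios))
print(f"treated RMSPE ratio = {treated_ratio:.2f}, permutation p = {p_value:.3f}")
```

The contrast reported above (significant by the RMSPE ratio, insignificant by the average post-treatment gap) arises because the ratio rewards a tight pre-treatment fit, whereas the average gap ignores it.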