Convergence of Agnostic Federated Averaging
Abstract
Federated learning (FL) enables decentralized model training without centralizing raw data. However, practical FL deployments often face a key challenge: clients participate in server aggregation only intermittently, with unknown and possibly biased participation probabilities. Most existing convergence results either assume full-device participation or rely on knowledge of the client availability distribution (often taken to be uniform) -- assumptions that rarely hold in practice. In this work, we characterize the optimization problem that is consistent with the stochastic dynamics of the well-known \emph{agnostic Federated Averaging (FedAvg)} algorithm under random (and variably sized) client availability, and rigorously establish its convergence for convex, possibly nonsmooth losses at the standard rate of order $\mathcal{O}(1/\sqrt{T})$, where $T$ denotes the aggregation horizon. Our analysis provides the first convergence guarantees for agnostic FedAvg under general, non-uniform, stochastic client participation, without knowledge of the participation distribution. We further demonstrate empirically that agnostic FedAvg outperforms common (and suboptimal) weighted-aggregation FedAvg variants, even when the server knows the participation weights.
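To fix ideas, the aggregation rules being compared can be sketched as follows (notation here is illustrative and not necessarily that of the paper): at round $t$, let $\mathcal{S}_t$ denote the random, variably sized set of participating clients and $w_k^{t+1}$ the local model returned by client $k \in \mathcal{S}_t$. Agnostic FedAvg averages the available updates uniformly, whereas weighted variants require (estimates of) participation weights $p_k$:
\[
w^{t+1} \;=\; \frac{1}{|\mathcal{S}_t|} \sum_{k \in \mathcal{S}_t} w_k^{t+1}
\quad \text{(agnostic FedAvg)},
\qquad
w^{t+1} \;=\; \sum_{k \in \mathcal{S}_t} \frac{p_k}{\sum_{j \in \mathcal{S}_t} p_j}\, w_k^{t+1}
\quad \text{(weighted variant)}.
\]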