FlowUnits: Extending Dataflow for the Edge-to-Cloud Computing Continuum
Abstract
This paper introduces FlowUnits, a novel programming and deployment model that extends the traditional dataflow paradigm to address the unique challenges of edge-to-cloud computing environments. While conventional dataflow systems offer significant advantages for large-scale data processing in homogeneous cloud settings, they fall short when deployed across distributed, heterogeneous infrastructures. FlowUnits addresses three critical limitations of current approaches: lack of locality awareness, insufficient resource adaptation, and absence of dynamic update mechanisms. FlowUnits organize processing operators into cohesive, independently manageable components that can be transparently replicated across different regions, efficiently allocated on nodes with appropriate hardware capabilities, and dynamically updated without disrupting ongoing computations. We implement and evaluate the FlowUnits model within Renoir, an existing dataflow system, demonstrating significant improvements in deployment flexibility and resource utilization across the computing continuum. Our approach maintains the simplicity of dataflow while enabling seamless integration of edge and cloud resources into unified data processing pipelines.