Transmission With Machine Language Tokens: A Paradigm for Task-Oriented Agent Communication
Abstract
The rapid advancement of large foundation models is propelling paradigm shifts across various industries. One significant change is that agents, rather than humans or conventional machines, will become the primary participants in future production processes, which in turn calls for a novel AI-native communication system tailored to agent communication. Integrating the capabilities of large language models (LLMs) with task-oriented semantic communication is a promising approach. However, existing LLMs output human language, which is highly constrained and suboptimal for agent-to-agent communication. In this paper, we propose a task-oriented agent communication system. Specifically, we leverage the original LLM to learn a specialized machine language represented by token embeddings. Simultaneously, a multi-modal LLM is trained to comprehend the application task, extract essential implicit information from multi-modal inputs, and express it with machine language tokens. This representation is significantly more efficient for transmission over the air interface. Furthermore, to reduce transmission overhead, we introduce a joint token and channel coding (JTCC) scheme that compresses the token sequence by exploiting its sparsity while enhancing robustness against channel noise. Extensive experiments demonstrate that our approach reduces transmission overhead for downstream tasks while improving accuracy relative to state-of-the-art (SOTA) methods.
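
To make the transmission pipeline concrete, the following is a minimal toy sketch, not the paper's actual JTCC design: the token embeddings, sequence length, sparsity level, SNR, and the use of simple repetition coding in place of the learned channel code are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for machine language token embeddings. In the paper
# these would come from the fine-tuned multi-modal LLM; here we draw a sparse
# random sequence purely to illustrate the compress-then-transmit pipeline.
SEQ_LEN, DIM = 16, 32
tokens = rng.standard_normal((SEQ_LEN, DIM))
tokens[rng.random(SEQ_LEN) < 0.6] = 0.0  # most positions carry no information

def compress(x, k=6):
    """Keep only the k highest-energy token positions (exploits sparsity)."""
    idx = np.argsort(-np.linalg.norm(x, axis=1))[:k]
    idx.sort()
    return idx, x[idx]

def channel(x, snr_db=10.0):
    """AWGN channel: add Gaussian noise at a target signal-to-noise ratio."""
    signal_power = np.mean(x ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    return x + rng.standard_normal(x.shape) * np.sqrt(noise_power)

def transmit(tokens, k=6, snr_db=10.0, repeats=3):
    """Toy joint token/channel coding: prune sparse positions, then apply
    repetition coding for robustness; the receiver averages the repeats."""
    idx, kept = compress(tokens, k)
    received = np.mean([channel(kept, snr_db) for _ in range(repeats)], axis=0)
    recon = np.zeros_like(tokens)
    recon[idx] = received
    return recon

recon = transmit(tokens)
mse = np.mean((tokens - recon) ** 2)
print(f"reconstruction MSE after noisy transmission: {mse:.4f}")
```

In the actual system, both the pruning and the redundancy would be learned jointly with the token representation rather than hand-coded as above; the sketch only shows why a sparse token sequence admits aggressive compression before channel coding.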