Optimizing Knowledge Transfer Graph for Deep Collaborative Learning
- Authors: Soma Minami, Naoki Okamoto, Tsubasa Hirakawa, Takayoshi Yamashita, Hironobu Fujiyoshi
- Publication: IEEE Access, 2025
Knowledge transfer among multiple networks, using predicted probabilities or intermediate-layer activations, has evolved significantly through extensive manual design, ranging from simple teacher-student approaches (for example, knowledge distillation) to bidirectional cohort methods (for example, deep mutual learning). However, key factors such as network size, the number of networks, transfer direction, and loss function design interact in complex ways, which limits conventional methods to exploring only a narrow range of the possible combinations. This study proposes a novel training method that enables more flexible and diverse combinations of knowledge transfer strategies. Specifically, a graph-based representation called the knowledge transfer graph is introduced, providing a unified framework for representing a wide variety of knowledge transfer strategies. Furthermore, specialized loss functions are proposed that incorporate five distinct gate functions to dynamically control gradient propagation, thereby enabling a richer set of knowledge transfer strategies. By systematically searching over the structure of the knowledge transfer graph, the proposed method automatically discovers knowledge transfer strategies that are more effective than manually designed ones. Experimental results on eleven datasets demonstrate that the proposed approach consistently yields significant performance improvements over conventional methods that employ basic combinations of the key factors, and that it reliably identifies effective graph structures.
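To make the graph idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes that nodes are networks, that each directed edge carries a loss from a source network to a target network, and that a gate on the edge decides how much of that loss reaches the target. The KL-divergence loss, the two gate names (`through`, `cutoff`), and the helper `node_losses` are illustrative assumptions; the abstract does not name the five gate functions or the exact loss.

```python
import math

def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical gates (assumed for illustration only): each maps an
# edge loss to the loss that is actually applied to the target node.
GATES = {
    "through": lambda loss: loss,  # pass the loss unchanged
    "cutoff": lambda loss: 0.0,    # block transfer along this edge
}

def node_losses(logits, edges):
    """Sum the gated edge losses arriving at each node.

    logits: {node_name: list of logits for one sample}
    edges:  list of (source, target, gate_name) tuples
    """
    probs = {name: softmax(z) for name, z in logits.items()}
    losses = {name: 0.0 for name in logits}
    for source, target, gate_name in edges:
        edge_loss = kl_divergence(probs[source], probs[target])
        losses[target] += GATES[gate_name](edge_loss)
    return losses

# One-way transfer (teacher-student style): only net_b learns from net_a.
logits = {"net_a": [2.0, 0.5, 0.1], "net_b": [1.0, 1.0, 0.2]}
one_way = [("net_a", "net_b", "through"), ("net_b", "net_a", "cutoff")]
print(node_losses(logits, one_way))

# Bidirectional transfer (mutual-learning style): both edges are open.
mutual = [("net_a", "net_b", "through"), ("net_b", "net_a", "through")]
print(node_losses(logits, mutual))
```

Under these assumptions, switching a single gate turns a one-way strategy into a bidirectional one without changing the networks themselves, which is the kind of structural choice a search over graph configurations could explore.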