Replacing Parameters with Preferences: Federated Alignment of Heterogeneous Models
AI Summary: Instead of averaging weights, this method uses preference-based distillation to align diverse model architectures across different clients.
We argue that while replacing raw data with model parameters characterizes the present of Federated Learning (FL), replacing parameters with preferences represents a more scalable and privacy-preserving future. Preferences capture high-level user intent and align with downstream goals, making them well suited to FL settings in which clients share reward signals rather than model parameters. We introduce Mixture-of-Rewards (MoR), which trains a lightweight routing network to integrate preference signals of differing styles and to resolve conflicts among clients' preferences while preserving privacy. Experiments validate that MoR consistently outperforms existing parameter-averaging approaches, especially under client heterogeneity and diverse model architectures.
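The abstract does not specify how the routing network is implemented, but a minimal sketch of one plausible reading follows: a small gating MLP maps an example embedding to softmax weights over the K clients' reward signals, then mixes those signals as a per-example convex combination. The class name `MoRRouter`, the dimensions, and the mixing rule are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class MoRRouter(nn.Module):
    """Hypothetical lightweight router: maps an example embedding to softmax
    weights over K client reward signals and returns their weighted mixture.
    This is a sketch of one possible design, not the paper's implementation."""

    def __init__(self, embed_dim: int, num_clients: int, hidden_dim: int = 64):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_clients),
        )

    def forward(self, x: torch.Tensor, client_rewards: torch.Tensor) -> torch.Tensor:
        # x: (batch, embed_dim) embedding of the example being scored
        # client_rewards: (batch, num_clients) reward signals reported by clients
        weights = torch.softmax(self.gate(x), dim=-1)   # (batch, num_clients)
        # Convex combination: conflicting client preferences are resolved by
        # learned per-example weights rather than a uniform average
        return (weights * client_rewards).sum(dim=-1)   # (batch,)

# Toy usage: 4 clients report (possibly conflicting) rewards for 8 examples
router = MoRRouter(embed_dim=32, num_clients=4)
x = torch.randn(8, 32)
client_rewards = torch.randn(8, 4)
mixed_reward = router(x, client_rewards)
print(mixed_reward.shape)  # torch.Size([8])
```

A per-example convex combination is one simple way such a router could resolve conflicts: when clients disagree, the gate can learn to trust different clients on different inputs, and only the reward signals, never model parameters or raw data, cross the client boundary.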