Consistent Injective Mappings in Pseudo‑RL Encoder–Decoder Training
Develop training procedures that consistently yield injective mappings from a finite color set to a finite name set in the pseudo‑reinforcement learning setup. In this setup, one language model encodes colors as names and a second decodes the names back to colors, with both trained using in‑context rewards followed by supervised fine‑tuning.
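To make the target property concrete: an encoding is injective when no two colors share a name, which is what lets a decoder recover every color unambiguously. The following is a minimal sketch of how a learned encoding could be checked after training, assuming the encoder's and decoder's behaviour have been tabulated as plain dictionaries. The function names (`find_collisions`, `is_injective`, `round_trip_accuracy`) and the example colors and names are hypothetical illustrations, not taken from the cited paper.

```python
from collections import defaultdict
from typing import Dict, List


def find_collisions(encoding: Dict[str, str]) -> Dict[str, List[str]]:
    """Return every name that is assigned to more than one color."""
    colors_by_name = defaultdict(list)
    for color, name in encoding.items():
        colors_by_name[name].append(color)
    return {name: colors for name, colors in colors_by_name.items() if len(colors) > 1}


def is_injective(encoding: Dict[str, str]) -> bool:
    """An encoding is injective when no two colors share a name."""
    return not find_collisions(encoding)


def round_trip_accuracy(encoding: Dict[str, str], decoding: Dict[str, str]) -> float:
    """Fraction of colors recovered after encoding and then decoding."""
    if not encoding:
        return 1.0
    hits = sum(1 for color, name in encoding.items() if decoding.get(name) == color)
    return hits / len(encoding)


# Hypothetical tabulated behaviour: "red" and "blue" collide on "Alice".
encoding = {"red": "Alice", "green": "Bob", "blue": "Alice"}
decoding = {"Alice": "red", "Bob": "green"}

print(find_collisions(encoding))
print(is_injective(encoding))
print(round_trip_accuracy(encoding, decoding))
```

In this example the decoder cannot distinguish "red" from "blue", so the round-trip accuracy is capped at 2/3 whatever the decoder does. A non-injective encoding places this kind of ceiling on recoverability, which is why a check like this is a natural success criterion for a training run.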
References
It remains to be determined how to consistently achieve injective mappings.
— Secret Collusion among Generative AI Agents: Multi-Agent Deception via Steganography
(2402.07510 - Motwani et al., 2024) in Appendix, Additional Case Studies — Pseudo‑RL Optimisation