Three storylines defined this week's recommendation systems research. First, Semantic ID-based generative recommendation moved from paradigm validation into hard engineering problems: cold-start signal balancing, ad monetization, out-of-distribution robustness, and reasoning over item tokens. Alibaba's OneSearch-V2 delivered CTR +3.98% and conversion rate +3.05% in production. Second, LLM agents in recommendation and search shifted from "end-to-end replacement" toward "layered collaboration": reasoning stays with the LLM, execution goes to deterministic modules, and reinforcement learning aligns intermediate steps with final objectives. Third, industrial search ranking hit an efficiency wall. Taobao's KARMA uses semantic regularization to keep LLM fine-tuning from destroying the model's existing knowledge, UniScale argues that data and model scaling must be co-designed, and DIET compresses training data to 1–2% of its original size while preserving performance trends.
This week's recommendation systems research runs along three technical threads. First, Semantic ID-driven generative retrieval keeps gaining momentum. Spotify released two papers simultaneously: one deploys a Semantic ID (SID) system in production with A/B test results (new show discovery rate +14.3%), and the other treats SID as a standalone modality unifying search, recommendation, and reasoning. Industrial SID systems have moved past "can this work?" into "how do we make it work better?" Second, multimodal retrieval and representation compression advanced: Apple delivered a production-grade unified retrieval architecture for text, images, and video; Aalto University distilled a 2B-parameter VLM into a 69M-parameter text encoder (50x latency reduction); POSTECH identified and fixed a modality collapse problem in VLM embedders for recommendation.
The central narrative this week: generative recommendation is moving from single-scenario proof-of-concept to full-pipeline production deployment. Papers from Meituan, Snapchat, and Meta no longer debate whether Semantic IDs work. Instead, they tackle the real operational pain points: multi-business expansion, codebook fairness, incremental training, and reranking integration. MBGR (2604.02684), the top-rated paper this week, delivers CTR +1.24% online across Meituan's multi-business food delivery platform.
Industrial recommendation ranking is shifting to systematic scaling engineering. Alibaba's SORT achieves a +6.35% lift in orders, Kuaishou's FlashEvaluator and SOLAR optimize evaluator and attention efficiency, and ByteDance's HAP enables adaptive compute budget allocation. Generative recommendation is entering an objective-alignment phase. 36 papers analyzed.