AI Weekly 2026-W08

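The defining feature of AI this week was a kind of "synchronized acceleration": capital, models, infrastructure, and research all moved to a new order of magnitude at once. OpenAI announced a record $110 billion funding round, NVIDIA took a direct $30 billion equity stake, and Anthropic had just closed a $30 billion Series G; within three days, more than $140 billion flowed into leading AI companies. Meanwhile, three flagship models, Qwen3.5-397B, Claude Sonnet 4.6, and Gemini 3.1 Pro, shipped in the same week, setting up a rare three-way contest. But the change that deserves the most attention happened below the surface. Microsoft, Cloudflare, GitHub, and HuggingFace all released agent infrastructure frameworks in the same week, signaling that the industry's center of gravity is shifting from "stronger models" to "more reliable agent systems." In sharp contrast, five safety research papers jointly exposed, along geometric, structural, and modality dimensions, the fundamental fragility of current LLM safety alignment. With agents on the verge of large-scale deployment, this tension is especially glaring.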

RecSys Weekly 2026-W16

Across 17 recommendation-system papers this week, industry teams used live deployments as the argument. Three technical storylines stand out.

RecSys Weekly 2026-W15

The central narrative this week: generative recommendation is moving from single-scenario proof-of-concept to full-pipeline production deployment. Papers from Meituan, Snapchat, and Meta no longer debate whether Semantic IDs work — they tackle the real operational pain points: multi-business expansion, codebook fairness, incremental training, and reranking integration. MBGR (2604.02684) delivers CTR +1.24% online across Meituan's multi-business food delivery platform, the top-rated paper this week.

AI Weekly 2026-W15

2026-W15 (April 5-11) marked a cognitive shift in AI engineering: the orchestration infrastructure built around models — what the industry now calls the "harness" — moved from backstage to center stage. OpenAI disclosed a million-line zero-human-code experiment. Meta built a code pre-computation engine with 50+ agents. A Claude Code source leak exposed the sophistication of this architecture. All three point to the same conclusion: the 2026 AI engineering race is no longer about models — it is about everything around them.

AI Weekly 2026-W14

If one word captures this week in AI, it's "engineering." Coding agents had a collective awakening. Internal architectures got laid bare, engineering methodology got codified, toolchains proliferated, and model-layer catch-up intensified. Coding agents have officially entered the era of systematic engineering discipline. Meanwhile, agent memory discourse — sparked by Karpathy's personal Wiki experiment — rippled through academia and the open-source community, making "how should agents persist knowledge" the week's most debated question.

RecSys Weekly 2026-W14

This week's recommendation systems research centers on three technical threads: engineering generative recommendation for production, agent-driven system self-evolution, and efficient scaling of ranking models.

AI Weekly 2026-W13

Week 13 of 2026 (March 22–28) surfaced three parallel but interconnected narratives in AI. The first is a concentrated burst of multi-agent orchestration tooling. Cline Kanban, Scion, DeerFlow 2.0, and several others all shipped in the same week, marking an industry-wide pivot from "single-agent capability" to "engineering multi-agent collaboration."

RecSys Weekly 2026-W11

Two technical threads dominate Week 11 of 2026 (March 8–14) in recommendation system research. First, generative recommendation (GR) is undergoing full-stack optimization, moving from "making it work" to "making it work well, fast, and fairly": Netflix/Meta's exponential reward-weighted SFT addresses post-training alignment, LinkedIn's causal attention reformulation halves sequence length, Kuaishou's FP8 quantization reduces OneRec-V2 inference latency by 49%, and Alibaba's differentiable geometric indexing eliminates long-tail bias at its root. Five papers advance GR's industrial maturity across five dimensions. Second, LLM-based recommendation is shifting from "single-pass inference" toward an agentic paradigm: Meta's VRec inserts verification steps into reasoning chains, Meituan's RecPilot replaces traditional recommendation lists with a multi-agent framework, USTC's TriRec introduces tri-party coordination for the first time, and RUC/JD's RecThinker enables autonomous tool invocation.

RecSys Weekly 2026-W10

Industrial recommendation ranking is shifting to systematic scaling engineering. Alibaba's SORT lifts orders by 6.35%, Kuaishou's FlashEvaluator and SOLAR optimize evaluator and attention efficiency, and ByteDance's HAP enables adaptive compute budget allocation. Generative recommendation is entering an objective-alignment phase. 36 papers analyzed.

Recommendation Algorithms Daily - 2026-03-06

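Multimodal fusion becomes practical: industry is beginning to systematically integrate visual information deep into the core recommendation pipeline (such as retrieval), moving beyond the traditional text-dominated approach and using concrete techniques such as domain fine-tuning and multi-stage alignment to improve fusion quality, in response to the needs of rich-media scenarios such as e-commerce; Scientific, predictable systems engineering: academia is beginning to bring systematic analysis methods such as "scaling laws" into recommender systems, aiming to build predictive models of the relationship between model size, data volume, and performance, so that resource investment in key stages such as reranking rests on a principled basis and trial-and-error costs fall; Finer-grained, dynamic bias mitigation: for exposure and selection bias in sequential recommendation, research is moving from static causal debiasing methods toward dynamic, temporally aware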

Recommendation Algorithms Daily - 2026-03-05

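Optimizing industrial Transformer ranking systems: several industry papers today focus on deeply adapting and optimizing the Transformer architecture for the ranking stage of recommender systems. The core challenge is handling the high feature sparsity, low label density, and strict latency requirements specific to industrial settings. Alibaba's SORT and ByteDance's HAP approach this from the fine-ranking and coarse-ranking stages respectively; through systematic design choices such as request-centric sample organization, local attention, and adaptive compute budget allocation, they deliver significant business-metric gains together with better inference efficiency, marking a new stage in which Transformers in industrial recommendation move from "usable" to "efficiently usable"; Fine-grained sample and compute management in multi-stage recommendation: multi-stage recommender system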