<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://ming-321.github.io/horizon/feed.xml" rel="self" type="application/atom+xml" /><link href="https://ming-321.github.io/horizon/" rel="alternate" type="text/html" /><updated>2026-04-14T22:29:25+00:00</updated><id>https://ming-321.github.io/horizon/feed.xml</id><title type="html">Horizon Daily</title><subtitle>AI-curated daily digest of tech and research news</subtitle><entry xml:lang="en"><title type="html">Horizon Summary: 2026-04-15 (EN)</title><link href="https://ming-321.github.io/horizon/2026/04/14/summary-en.html" rel="alternate" type="text/html" title="Horizon Summary: 2026-04-15 (EN)" /><published>2026-04-14T16:00:00+00:00</published><updated>2026-04-14T16:00:00+00:00</updated><id>https://ming-321.github.io/horizon/2026/04/14/summary-en</id><content type="html" xml:base="https://ming-321.github.io/horizon/2026/04/14/summary-en.html"><![CDATA[<blockquote>
  <p>From 122 items, 46 were selected as important</p>
</blockquote>

<hr />

<h3 id="头条速递">头条速递</h3>
<ol>
  <li><a href="#item-1">OpenAI Launches GPT-5.4-Cyber and Expands Trusted Access Program</a> ⭐️ 9.0/10</li>
  <li><a href="#item-2">UK’s Mythos AI First to Complete Multistep Cyber Infiltration Challenge</a> ⭐️ 9.0/10</li>
  <li><a href="#item-3">ClawBench Reveals AI Agents Struggle with Real-World Web Tasks</a> ⭐️ 9.0/10</li>
  <li><a href="#item-4">Anthropic Launches Claude Code Routines for Automated Developer Workflows</a> ⭐️ 8.0/10</li>
  <li><a href="#item-5">Author Challenges Flock Safety’s Data Ownership Claims in Privacy Opt-Out Attempt</a> ⭐️ 8.0/10</li>
  <li><a href="#item-6">AI Cybersecurity Becomes an Economic Proof of Work Arms Race</a> ⭐️ 8.0/10</li>
  <li><a href="#item-7">HALO-Loss enables neural networks to abstain from uncertain predictions</a> ⭐️ 8.0/10</li>
  <li><a href="#item-8">Indie Developer Scales Pure Spiking Neural Network to 1.088B Parameters</a> ⭐️ 8.0/10</li>
  <li><a href="#item-9">Researcher Releases 20M+ Indian Legal Documents with Citation Graphs</a> ⭐️ 8.0/10</li>
  <li><a href="#item-10">Major Media Outlets Block Internet Archive Amid AI Training Fears</a> ⭐️ 8.0/10</li>
  <li><a href="#item-11">ShinyHunters Ransom Demand Follows Snowflake Breach via Anodot</a> ⭐️ 8.0/10</li>
  <li><a href="#item-12">Five Chinese Ministries Launch National AI Plus Education Action Plan</a> ⭐️ 7.0/10</li>
  <li><a href="#item-13">Qwen Agent Enables Direct Excel Generation and Editing via Chat</a> ⭐️ 7.0/10</li>
  <li><a href="#item-14">Nervecode: Layerwise Surprise Signals for Improved OOD Detection</a> ⭐️ 7.0/10</li>
  <li><a href="#item-15">MiniMax Sparks Controversy by Banning Commercial Use of Open-Source Model 2.7</a> ⭐️ 7.0/10</li>
</ol>

<h3 id="关注动态">关注动态</h3>
<ol>
  <li><a href="#item-16">MemSearch Updates: 6 updates — bump memsearch 0.3.0 and claude-code plugin 0.3.5 (#348), add Jina and Mistral embedding providers (#346), expand feature matrix with embedding providers and optional rer…</a> ⭐️ ?/10</li>
  <li><a href="#item-17">chore(README): update the preview pic</a> ⭐️ ?/10</li>
  <li><a href="#item-18">Superpowers Updates: 10 updates — Merge pull request #1165 from obra/mirror-codex-plugin-tooling, anchor EXCLUDES patterns to source root, exclude assets/, add –bootstrap flag</a> ⭐️ ?/10</li>
  <li><a href="#item-19">openai/codex: 2 releases — rust-v0.121.0-alpha.9, rust-v0.121.0-alpha.8</a> ⭐️ ?/10</li>
  <li><a href="#item-20">anthropics/claude-code: 2 releases — v2.1.108, v2.1.107</a> ⭐️ ?/10</li>
  <li><a href="#item-21">upstash/context7 released ctx7@0.3.13</a> ⭐️ ?/10</li>
</ol>

<h3 id="github-热榜">GitHub 热榜</h3>
<ol>
  <li><a href="#item-22">Karpathy’s llm.c: Raw C/CUDA LLM Training for Education</a> ⭐️ 10.0/10</li>
  <li><a href="#item-23">Instant-NGP: Lightning-Fast Neural Graphics via CUDA</a> ⭐️ 10.0/10</li>
  <li><a href="#item-24">SageAttention: Quantized Speedup for Transformers</a> ⭐️ 10.0/10</li>
  <li><a href="#item-25">VoxCPM2: Tokenizer-Free Multilingual TTS and Voice Cloning</a> ⭐️ 9.0/10</li>
  <li><a href="#item-26">Axolotl Streamlines Production-Ready LLM Fine-Tuning</a> ⭐️ 9.0/10</li>
  <li><a href="#item-27">Microsoft Agent Lightning Streamlines AI Agent Training</a> ⭐️ 9.0/10</li>
  <li><a href="#item-28">Flowise: Visual Low-Code Builder for LangChain Agents</a> ⭐️ 9.0/10</li>
  <li><a href="#item-29">DeepEP: Optimized Communication for MoE Training</a> ⭐️ 9.0/10</li>
  <li><a href="#item-30">Mirage Compiles LLMs into Persistent CUDA Mega-Kernels</a> ⭐️ 9.0/10</li>
  <li><a href="#item-31">Dao-AILab Releases Optimized Causal Conv1d CUDA Kernel</a> ⭐️ 9.0/10</li>
  <li><a href="#item-32">Kronos: First Open-Source Foundation Model for Financial K-Lines</a> ⭐️ 8.0/10</li>
  <li><a href="#item-33">Claude-Mem Plugin Automates Session Memory for AI Agents</a> ⭐️ 8.0/10</li>
  <li><a href="#item-34">Multica: Open-Source Platform for Managing AI Coding Agents</a> ⭐️ 8.0/10</li>
  <li><a href="#item-35">Archon: Deterministic Workflow Engine for AI Coding</a> ⭐️ 8.0/10</li>
  <li><a href="#item-36">Voicebox: Local-First Open Source Voice Cloning Studio</a> ⭐️ 8.0/10</li>
  <li><a href="#item-37">BlenderMCP Enables LLM-Driven 3D Modeling via MCP</a> ⭐️ 8.0/10</li>
  <li><a href="#item-38">Real-Time One-Shot Face Swapping for Live Video</a> ⭐️ 8.0/10</li>
  <li><a href="#item-39">yt-dlp: Essential Media Downloader for AI Data Pipelines</a> ⭐️ 8.0/10</li>
  <li><a href="#item-40">Pixelle-Video: Fully Automated AI Short Video Engine</a> ⭐️ 8.0/10</li>
  <li><a href="#item-41">OmniRoute: Unified AI Gateway with Smart Routing and MCP Support</a> ⭐️ 8.0/10</li>
  <li><a href="#item-42">NVIDIA cuOpt: GPU-Accelerated Solver for Vehicle Routing</a> ⭐️ 8.0/10</li>
  <li><a href="#item-43">Ralph: Autonomous AI Agent Loop with Git-Persisted Memory</a> ⭐️ 7.0/10</li>
  <li><a href="#item-44">GSD: Meta-Prompting System to Prevent AI Context Rot</a> ⭐️ 7.0/10</li>
  <li><a href="#item-45">Playwright CLI Optimized for Token-Efficient AI Agents</a> ⭐️ 7.0/10</li>
  <li><a href="#item-46">GPUMD: High-Performance Molecular Dynamics on CUDA GPUs</a> ⭐️ 7.0/10</li>
</ol>

<h2 id="头条速递-1">头条速递</h2>

<p><a id="item-1"></a></p>
<h2 id="openai-launches-gpt-54-cyber-and-expands-trusted-access-program-️-9010"><a href="https://simonwillison.net/2026/Apr/14/trusted-access-openai/#atom-everything">OpenAI Launches GPT-5.4-Cyber and Expands Trusted Access Program</a> ⭐️ 9.0/10</h2>

<p>OpenAI has officially released GPT-5.4-Cyber, a specialized variant of its flagship model fine-tuned specifically for defensive cybersecurity tasks. Concurrently, the company expanded its “Trusted Access for Cyber” program, allowing users to verify their identity via government ID photos processed by Persona to gain reduced-friction access to these tools. This move comes just one week after rival Anthropic announced its own powerful cybersecurity model, Claude Mythos. This release signifies a major escalation in the AI cybersecurity arms race, directly responding to Anthropic’s recent advancements with a dedicated defensive tool. By implementing identity verification through Persona, OpenAI aims to democratize access to high-capability security tools while maintaining safety controls against malicious use. The shift suggests that future access to frontier AI models for sensitive domains will increasingly depend on verified real-world identities rather than simple account credentials. This could fundamentally change how security researchers and enterprises interact with large language models for critical infrastructure protection. Access to the full suite of OpenAI’s best security tools still requires an additional Google Form application process, distinguishing it from the self-service verification flow available for general cyber-permissive access. The identity verification component relies on Persona, a third-party service that processes government-issued ID photos to confirm user authenticity. While GPT-5.4-Cyber is designed to be “cyber-permissive” for defense, the underlying GPT-5.4 model family previously demonstrated an 88% success rate in atomic Network Attack Simulation challenges.</p>

<p>rss · Simon Willison · Apr 14, 21:23</p>

<p><strong>Background</strong>: Large Language Models (LLMs) like GPT-5.4 have dual-use capabilities, meaning they can be used for both beneficial defensive coding and harmful offensive cyberattacks. Recently, Anthropic highlighted this risk with its “Project Glasswing” and the unreleased “Claude Mythos” model, which was deemed too dangerous for public release due to its potent exploitation skills. In response, AI companies are developing “cyber-permissive” variants that retain helpful security knowledge while attempting to refuse requests related to creating malware or exploiting vulnerabilities. Identity verification services like Persona are becoming critical infrastructure in this landscape to ensure that powerful tools are only accessible to accountable individuals.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://www.reuters.com/technology/openai-unveils-gpt-54-cyber-week-after-rivals-announcement-ai-model-2026-04-14/">OpenAI unveils GPT-5.4-Cyber a week after rival's announcement of AI model | Reuters</a></li>
<li><a href="https://quasa.io/media/gpt-5-4-becomes-first-universal-ai-model-to-earn-high-cybersecurity-risk-status">GPT-5.4 Becomes First Universal AI Model to Earn 'High' Cybersecurity Risk Status</a></li>
<li><a href="https://www.anthropic.com/glasswing">Project Glasswing: Securing critical software for the AI era \ Anthropic</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#openai</code>, <code class="language-plaintext highlighter-rouge">#cybersecurity</code>, <code class="language-plaintext highlighter-rouge">#ai-safety</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#identity-verification</code></p>

<hr />

<p><a id="item-2"></a></p>
<h2 id="uks-mythos-ai-first-to-complete-multistep-cyber-infiltration-challenge-️-9010"><a href="https://arstechnica.com/ai/2026/04/uk-govs-mythos-ai-tests-help-separate-cybersecurity-threat-from-hype/">UK’s Mythos AI First to Complete Multistep Cyber Infiltration Challenge</a> ⭐️ 9.0/10</h2>

<p>The UK government’s AI Security Institute (AISI) has confirmed that Anthropic’s Mythos AI is the first system to successfully complete a complex 32-step cybersecurity infiltration simulation. The model solved the difficult challenge in three out of ten attempts, marking a significant milestone in autonomous cyber-attack capabilities. This evaluation provides independent public verification of the model’s advanced performance beyond previous internal reports. This achievement demonstrates that AI systems have crossed a critical threshold where they can autonomously execute sophisticated, multistep hacking strategies without human intervention. It forces regulators and financial institutions to urgently reassess current defense mechanisms, as the gap between theoretical risk and practical capability has narrowed significantly. Consequently, this development accelerates the demand for new AI-specific security benchmarks and stricter governance frameworks for powerful models. The success of Mythos suggests that future cybersecurity threats may evolve faster than traditional defensive updates can handle. The specific benchmark used by AISI involved a 32-step simulation designed to test deep infiltration skills, which Mythos completed with a 30% success rate across ten trials. Due to these demonstrated risks, Anthropic has deemed the model too dangerous for public release, sparking immediate discussions with Wall Street and government officials. Regulators plan to raise these specific risk profiles with British bank executives in the coming weeks to prepare for potential real-world applications.</p>

<p>rss · Ars Technica · Apr 14, 19:11</p>

<p><strong>Background</strong>: Penetration testing, or ‘pentesting,’ traditionally involves security experts simulating cyber-attacks to identify vulnerabilities before malicious actors exploit them. Recently, researchers have been developing AI agents to automate parts of this process, but most existing tools struggle with long-horizon tasks requiring multiple dependent steps. The AI Security Institute (AISI) was established by the UK government specifically to evaluate the safety and security risks of frontier AI models like Mythos. This new result distinguishes itself from prior benchmarks by proving an AI can maintain context and strategy over a lengthy, multi-stage attack sequence.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://arstechnica.com/ai/2026/04/uk-govs-mythos-ai-tests-help-separate-cybersecurity-threat-from-hype/">UK gov's Mythos AI tests help separate cybersecurity ... - Ars Technica</a></li>
<li><a href="https://www.theguardian.com/business/2026/apr/13/goldman-sachs-chief-hyper-aware-risks-anthropics-mythos-ai-david-solomon">Goldman Sachs chief ‘hyper-aware’ of risks from Anthropic’s Mythos AI</a></li>
<li><a href="https://www.euronews.com/next/2026/04/14/why-anthropics-new-mythos-ai-model-has-washington-and-wall-street-worked-up">Why Anthropic's new Mythos AI model has Washington... | Euronews</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-security</code>, <code class="language-plaintext highlighter-rouge">#cybersecurity</code>, <code class="language-plaintext highlighter-rouge">#ai-benchmarks</code>, <code class="language-plaintext highlighter-rouge">#government-ai</code>, <code class="language-plaintext highlighter-rouge">#penetration-testing</code></p>

<hr />

<p><a id="item-3"></a></p>
<h2 id="clawbench-reveals-ai-agents-struggle-with-real-world-web-tasks-️-9010"><a href="https://old.reddit.com/r/MachineLearning/comments/1slf7pg/clawbench_can_ai_agents_complete_everyday_online/">ClawBench Reveals AI Agents Struggle with Real-World Web Tasks</a> ⭐️ 9.0/10</h2>

<p>Researchers introduced ClawBench, a new benchmark evaluating AI browser agents on 153 everyday tasks across 144 live websites rather than synthetic environments. The study found that even the top-performing model, Claude Sonnet 4.6, achieved only a 33.3% success rate, while Zhipu AI’s text-only GLM-5 model surprisingly secured second place at 24.2%. Tasks involving finance and academics were relatively easier, but travel and development tasks proved significantly more difficult for all tested models. This benchmark exposes a critical gap between current AI capabilities and the reliability required for fully autonomous agent deployments in real-world scenarios. The low success rates indicate that existing models are not yet ready to handle complex, multi-step web interactions without significant human oversight or error handling mechanisms. By testing on live production platforms instead of sandboxes, ClawBench provides a more realistic assessment of where the industry stands regarding agentic automation. These findings suggest that widespread adoption of autonomous agents for everyday online tasks may still be years away despite recent hype. ClawBench distinguishes itself by capturing five layers of behavioral data, including session replays, screenshots, HTTP traffic, agent reasoning traces, and browser actions. To ensure safety during evaluation on live sites, the framework employs a request interceptor that blocks final irreversible HTTP requests such as payments or bookings. The dataset includes human ground-truth labels for every task and utilizes an agentic evaluator capable of providing step-level traceable diagnostics.</p>
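
<p>As a rough illustration of this interception idea, and not ClawBench’s actual implementation, the sketch below uses Playwright’s request routing to block POST requests whose URLs match hypothetical payment or booking patterns:</p>

<pre><code class="language-python">from playwright.sync_api import sync_playwright

BLOCKED_PATTERNS = ("/checkout", "/payment", "/booking")  # hypothetical examples


def guard(route):
    # Abort requests that would commit an irreversible action; pass everything else through.
    req = route.request
    if req.method == "POST" and any(p in req.url for p in BLOCKED_PATTERNS):
        route.abort()
    else:
        route.continue_()


with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.route("**/*", guard)  # intercept every request the agent's actions trigger
    page.goto("https://example.com")
    browser.close()
</code></pre>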

<p>rss · r/MachineLearning · Apr 14, 17:21</p>

<p><strong>Background</strong>: AI browser agents are systems that integrate large language models directly into browser frameworks to interpret natural language commands and orchestrate actions on web pages. Unlike traditional chatbots that only generate text, these agents can click buttons, fill forms, and navigate complex site structures to complete specific goals. Previous evaluations often relied on static or sandboxed environments which failed to capture the dynamic complexity and unpredictability of the live internet. Understanding the limitations of these agents is crucial as companies increasingly look to automate customer service, data entry, and personal assistance tasks.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://claw-bench.com/">ClawBench — Real-World Browser Agent Benchmark</a></li>
<li><a href="https://glm5.net/">GLM-5 | Zhipu AI's Next-Generation Large Language Model (745B Parameters)</a></li>
<li><a href="https://layerxsecurity.com/generative-ai/ai-browser-agents/">What Are AI Browser Agents and How to Build Them - LayerX</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#benchmarking</code>, <code class="language-plaintext highlighter-rouge">#llm-evaluation</code>, <code class="language-plaintext highlighter-rouge">#autonomous-systems</code>, <code class="language-plaintext highlighter-rouge">#machine-learning-research</code></p>

<hr />

<p><a id="item-4"></a></p>
<h2 id="anthropic-launches-claude-code-routines-for-automated-developer-workflows-️-8010"><a href="https://code.claude.com/docs/en/routines">Anthropic Launches Claude Code Routines for Automated Developer Workflows</a> ⭐️ 8.0/10</h2>

<p>Anthropic has officially introduced ‘Claude Code Routines,’ a new feature that allows developers to define automated coding tasks triggered by schedules, API calls, or GitHub events. Unlike previous local executions, these routines run on Anthropic’s managed cloud infrastructure, meaning the user’s local machine does not need to be online for the tasks to execute. This update effectively puts Claude Code on autopilot for repeatable workflows without requiring third-party orchestration tools.</p>

<p>hackernews · matthieu_bl · Apr 14, 16:54</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#claude</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#llm-automation</code>, <code class="language-plaintext highlighter-rouge">#anthropic</code>, <code class="language-plaintext highlighter-rouge">#ai-policy</code></p>

<hr />

<p><a id="item-5"></a></p>
<h2 id="author-challenges-flock-safetys-data-ownership-claims-in-privacy-opt-out-attempt-️-8010"><a href="https://honeypot.net/2026/04/14/i-wrote-to-flocks-privacy.html">Author Challenges Flock Safety’s Data Ownership Claims in Privacy Opt-Out Attempt</a> ⭐️ 8.0/10</h2>

<p>An author documented their formal request to opt out of Flock Safety’s surveillance network, receiving a response stating that customers, not the individuals recorded, own the data. The company asserted that because law enforcement agencies pay for the service, they control all decisions regarding data usage and sharing, effectively denying the individual’s right to opt out. This exchange highlights a direct conflict between Flock’s operational model and privacy regulations like the CCPA which grant individuals rights over their personally identifiable information. This incident exposes a significant legal loophole where surveillance companies may bypass privacy laws by shifting data ownership claims to their government clients. If upheld, this precedent could render consumer privacy rights meaningless in the context of public space surveillance funded by taxpayers. It challenges the core assumption of regulations like the CCPA that individuals retain sovereignty over their personal data regardless of who collects it. The outcome could dictate whether AI-driven mass surveillance operates outside the bounds of current data protection frameworks. Flock Safety’s default policy states that data collected by license plate readers is automatically hard deleted from the cloud after thirty days unless local laws dictate otherwise. However, the company’s legal stance in this interaction suggests that during this retention period, they act merely as custodians for data owners (the police), thereby rejecting direct consumer opt-out requests. This creates a scenario where the technical capability for deletion exists, but the legal framework used by the company prevents individual intervention.</p>

<p>hackernews · speckx · Apr 14, 17:47</p>

<p><strong>Background</strong>: Flock Safety is a prominent provider of Automated License Plate Recognition (ALPR) and video surveillance systems used widely by law enforcement agencies across the United States. Their technology captures vehicle images and creates a ‘Vehicle Fingerprint’ based on characteristics like make, model, and color to assist in criminal investigations. While the company promotes a 30-day automatic deletion policy to address privacy concerns, the legal classification of who owns this data remains a contentious issue. Regulations like the California Consumer Privacy Act (CCPA) generally allow residents to request the deletion of their personal information, but these laws often struggle to address complex B2G (Business-to-Government) data flows.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Flock_Safety">Flock Safety - Wikipedia</a></li>
<li><a href="https://www.flocksafety.com/legal/flock-evidence-policy">Flock Evidence Policy</a></li>
<li><a href="https://www.flocksafety.com/trust/data-privacy">Flock Safety Data Privacy &amp; Retention Policies</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Community members express skepticism about Flock’s compliance, with the original author noting the company’s claim that customer ownership negates privacy restrictions seems to contradict the CCPA. Others point out that Flock likely positions itself as a data custodian rather than a controller to avoid liability, similar to cloud providers like AWS. There is a consensus among commenters that legislative action, rather than individual opt-out requests, is the only viable path to forcing changes in this surveillance model.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#privacy</code>, <code class="language-plaintext highlighter-rouge">#surveillance</code>, <code class="language-plaintext highlighter-rouge">#ai-ethics</code>, <code class="language-plaintext highlighter-rouge">#regulation</code>, <code class="language-plaintext highlighter-rouge">#data-rights</code></p>

<hr />

<p><a id="item-6"></a></p>
<h2 id="ai-cybersecurity-becomes-an-economic-proof-of-work-arms-race-️-8010"><a href="https://simonwillison.net/2026/Apr/14/cybersecurity-proof-of-work/#atom-everything">AI Cybersecurity Becomes an Economic Proof of Work Arms Race</a> ⭐️ 8.0/10</h2>

<p>The UK AI Safety Institute’s independent evaluation of Anthropic’s Claude Mythos confirms that the model’s ability to find security vulnerabilities scales directly with computational spend. Drew Breunig analyzes this finding to argue that cybersecurity has effectively become a ‘proof of work’ system where defense requires spending more tokens than attackers. This dynamic creates a brutal economic equation where hardening a system depends entirely on outspending potential exploiters in token consumption. This shift transforms cybersecurity from a purely technical challenge into an economic arms race, fundamentally altering how organizations must budget for safety. It suggests that entities with deeper pockets can achieve disproportionately higher security standards simply by purchasing more compute time for auditing. Conversely, this trend significantly increases the strategic value of open-source libraries, as the high cost of securing them can be amortized across all users rather than borne individually. Ultimately, it implies that ‘vibe-coding’ cheap replacements for established libraries may result in inherently less secure software due to the lack of shared security investment. Claude Mythos, released as a gated research preview in April 2026, demonstrated exceptional capability in identifying hidden software flaws during the AISI evaluation. The core mechanism relies on inference scaling, where increasing the number of generated tokens directly correlates with the discovery rate of exploits. A critical limitation is that this model is not generally available, restricting access to select partners to prevent misuse of its potent offensive capabilities. The analysis highlights that security effectiveness is now a function of financial resources dedicated to token generation rather than just algorithmic superiority.</p>

<p>rss · Simon Willison · Apr 14, 19:41</p>

<p><strong>Background</strong>: The UK AI Safety Institute (AISI) is an independent government body established to evaluate the risks of frontier AI models before and after deployment. Claude Mythos represents Anthropic’s most capable model to date, surpassing previous versions like Claude Opus in software engineering benchmarks such as SWE-bench Pro. The concept of ‘proof of work’ traditionally refers to a consensus mechanism in blockchain requiring computational effort, but here it describes an economic model where security is bought via compute. Inference scaling is a technique where model performance improves predictably as more computational resources are applied during the reasoning process.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://www.gov.uk/government/publications/ai-safety-institute-approach-to-evaluations/ai-safety-institute-approach-to-evaluations">AI Safety Institute approach to evaluations - GOV.UK</a></li>
<li><a href="https://www.humai.blog/claude-mythos-is-the-most-capable-ai-model-ever-documented-anthropic-wont-let-you-use-it/">Claude Mythos Is the Most Capable AI Model Ever Documented.</a></li>
<li><a href="https://q-rz.github.io/p/saffron/">SAFFRON-1: Inference Scaling for LLM Safety Assurance</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-safety</code>, <code class="language-plaintext highlighter-rouge">#cybersecurity</code>, <code class="language-plaintext highlighter-rouge">#llm-evaluation</code>, <code class="language-plaintext highlighter-rouge">#anthropic</code>, <code class="language-plaintext highlighter-rouge">#ai-economics</code></p>

<hr />

<p><a id="item-7"></a></p>
<h2 id="halo-loss-enables-neural-networks-to-abstain-from-uncertain-predictions-️-8010"><a href="https://old.reddit.com/r/MachineLearning/comments/1skzuhd/i_dont_know_teaching_neural_networks_to_abstain/">HALO-Loss enables neural networks to abstain from uncertain predictions</a> ⭐️ 8.0/10</h2>

<p>Researchers have open-sourced HALO-Loss, a new training objective that replaces the standard Cross-Entropy loss to allow neural networks to explicitly output an “I don’t know” response for garbage or out-of-distribution inputs. By switching from unconstrained dot-products to bounded Euclidean distance, this method creates a dedicated “Abstain Class” at the origin of the latent space without requiring extra parameters. Testing on CIFAR-10 and CIFAR-100 shows that HALO-Loss maintains base accuracy while significantly improving calibration and reducing false positives on far out-of-distribution data like SVHN. This advancement is critical because current models often hallucinate with high confidence when faced with unfamiliar data, posing significant risks in safety-critical applications like autonomous driving or medical diagnosis. HALO-Loss effectively eliminates the traditional trade-off where improving out-of-distribution detection usually comes at the cost of reduced base accuracy. By providing a mathematically rigorous way to reject uncertain inputs natively, it enhances model reliability without needing complex ensembles or post-hoc scoring adjustments. This could fundamentally shift how robust AI systems are designed, moving from forced guessing to honest uncertainty quantification. The method works by calculating logits as the negative squared Euclidean distance between sample embeddings and learned class prototypes, effectively penalizing large distances to bound maximum confidence. Experimental results show the Expected Calibration Error (ECE) dropped from approximately 8% to 1.5%, and the False Positive Rate at 95% recall for far OOD data was slashed by more than half. The solution is described as a drop-in replacement for Cross-Entropy that requires no exposure to outlier data during training and adds zero parameters to the model architecture.</p>
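
<p>A minimal PyTorch sketch of that distance-based logit construction, assuming learned class prototypes and an abstain anchor fixed at the origin; names and shapes here are illustrative, not the released HALO-Loss code:</p>

<pre><code class="language-python">import torch
import torch.nn as nn


class DistanceLogitHead(nn.Module):
    """Hypothetical sketch: logits are negative squared Euclidean distances to
    learned class prototypes; the abstain class is anchored at the origin, so
    it contributes no extra parameters."""

    def __init__(self, embed_dim: int, num_classes: int):
        super().__init__()
        self.prototypes = nn.Parameter(torch.randn(num_classes, embed_dim))

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, embed_dim) embeddings from the backbone
        class_logits = -torch.cdist(z, self.prototypes, p=2) ** 2   # bounded above by 0
        abstain_logit = -(z ** 2).sum(dim=1, keepdim=True)          # distance to the origin
        return torch.cat([class_logits, abstain_logit], dim=1)      # (batch, num_classes + 1)


# Training can then use ordinary cross-entropy over the num_classes + 1 outputs,
# and large distances automatically cap the confidence of any single class.
</code></pre>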

<p>rss · r/MachineLearning · Apr 14, 05:45</p>

<p><strong>Background</strong>: Standard neural networks typically use Cross-Entropy loss, which encourages features to move infinitely far from the origin to minimize error, resulting in a latent space where every input is forced into a confident prediction. This geometric property means models lack a natural mechanism to express uncertainty, leading them to confidently classify nonsense or out-of-distribution data as known categories. The concept of “abstention” in machine learning refers to a model’s ability to withhold a prediction when it detects high uncertainty, a feature previously achieved through complex add-ons rather than native loss functions. HALO-Loss addresses this by restructuring the geometry of the latent space to include a specific region for uncertainty.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://ml-cheatsheet.readthedocs.io/en/latest/loss_functions.html">Loss Functions — ML Glossary documentation</a></li>
<li><a href="https://arxiv.org/abs/2104.08236">[2104.08236] Controlled abstention neural networks for identifying...</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#machine learning</code>, <code class="language-plaintext highlighter-rouge">#loss functions</code>, <code class="language-plaintext highlighter-rouge">#uncertainty quantification</code>, <code class="language-plaintext highlighter-rouge">#model reliability</code>, <code class="language-plaintext highlighter-rouge">#deep learning</code></p>

<hr />

<p><a id="item-8"></a></p>
<h2 id="indie-developer-scales-pure-spiking-neural-network-to-1088b-parameters-️-8010"><a href="https://old.reddit.com/r/MachineLearning/comments/1skql34/i_scaled_a_pure_spiking_neural_network_snn_to/">Indie Developer Scales Pure Spiking Neural Network to 1.088B Parameters</a> ⭐️ 8.0/10</h2>

<p>An 18-year-old independent developer successfully trained a pure Spiking Neural Network (SNN) with 1.088 billion parameters from random initialization, stopping at 27,000 steps due to budget constraints. Despite the early halt and a loss of 4.4, the model achieved approximately 93% sparsity during inference and unexpectedly began generating structurally correct Russian text. Additionally, the architecture spontaneously shifted 39% of its activation routing to a persistent memory module as it scaled past 600 million parameters. This experiment challenges the prevailing belief that training large-scale SNNs directly from scratch is impossible due to vanishing gradients, a problem typically avoided by converting pre-trained Artificial Neural Networks (ANNs). Achieving convergence in a pure 1B+ parameter SNN suggests that direct training might be viable for creating highly energy-efficient language models that leverage massive sparsity. The observed emergent behaviors, such as cross-lingual capabilities and autonomous memory utilization, indicate that scaling SNNs could unlock unique computational properties not found in dense ANNs. If optimized, this approach could significantly reduce the hardware costs and energy consumption associated with running large language models. The model maintains roughly 93% sparsity, meaning only about 7% of neurons fire per token, which drastically reduces memory usage during inference compared to dense models. However, the generated text is described as ‘janky’ and lacks the fluency of GPT-2, largely because training was cut short before the loss could decrease further. The developer released the full 12GB checkpoint including weights and optimizer states on GitHub to solicit technical feedback on stabilizing surrogate gradients and mapping the architecture to neuromorphic hardware like Loihi.</p>

<p>rss · r/MachineLearning · Apr 13, 22:42</p>

<p><strong>Background</strong>: Spiking Neural Networks (SNNs) are biologically inspired models that use discrete spikes and timing to transmit information, offering potential energy efficiency over traditional Artificial Neural Networks (ANNs) which use continuous values. Training SNNs directly is notoriously difficult because the binary nature of spikes creates undefined gradients, leading to the vanishing gradient problem that prevents deep networks from learning. Consequently, most current research relies on ANN-to-SNN conversion techniques, where a standard network is trained first and then translated into a spiking format, often resulting in accuracy degradation or increased latency. Direct training methods attempt to solve this using surrogate gradients, but scaling these to billions of parameters without conversion has remained a significant hurdle until now.</p>
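
<p>A toy PyTorch sketch of the surrogate-gradient trick mentioned above; the sigmoid surrogate and its steepness are illustrative choices, not the developer’s actual implementation:</p>

<pre><code class="language-python">import torch


class SurrogateSpike(torch.autograd.Function):
    """Toy surrogate-gradient spike: the forward pass emits a binary spike, the
    backward pass substitutes a smooth sigmoid derivative so gradients can flow."""

    @staticmethod
    def forward(ctx, membrane_potential):
        ctx.save_for_backward(membrane_potential)
        return (membrane_potential > 0).float()        # non-differentiable step

    @staticmethod
    def backward(ctx, grad_output):
        (v,) = ctx.saved_tensors
        sig = torch.sigmoid(4.0 * v)                   # steepness 4.0 is an arbitrary choice
        return grad_output * 4.0 * sig * (1 - sig)     # derivative of the sigmoid surrogate


spike = SurrogateSpike.apply  # use spike(v) wherever a thresholded activation is needed
</code></pre>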

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Spiking_neural_network">Spiking neural network - Wikipedia</a></li>
<li><a href="https://arxiv.org/abs/2401.04486">Take A Shortcut Back: Mitigating the Gradient Vanishing for ... Take A Shortcut Back: Mitigating the Gradient Vanishing for ... Images Take A Shortcut Back: Mitigating the Gradient Vanishing for ... High-performance deep spiking neural networks with 0 ... - Nature Take A Shortcut Back: Mitigating the Gradient Vanishing for ... Take A Shortcut Back: Mitigating the Gradient Vanishing for Training Take A Shortcut Back: Mitigating the Gradient Vanishing for Training High-performance deep spiking neural networks with 0.3 spikes per High-performance deep spiking neural networks with 0.3 spikes per Frontiers | Adaptive and lightweight surrogate gradients ...</a></li>
<li><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10030499/">High-accuracy deep ANN-to-SNN conversion using quantization ... A universal ANN-to-SNN framework for achieving high accuracy ... Towards High-performance Spiking Transformers from ANN to SNN ... Inference-Scale Complexity in ANN-SNN Conversion for High ... Benchmarking ANN-to-SNN Conversion: Dataset-Dependent ... Frontiers | High-accuracy deep ANN-to-SNN conversion using ... A New ANN-SNN Conversion Method with High Accuracy, Low ...</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#spiking neural networks</code>, <code class="language-plaintext highlighter-rouge">#llm scaling</code>, <code class="language-plaintext highlighter-rouge">#neuromorphic computing</code>, <code class="language-plaintext highlighter-rouge">#machine learning research</code>, <code class="language-plaintext highlighter-rouge">#emergent behavior</code></p>

<hr />

<p><a id="item-9"></a></p>
<h2 id="researcher-releases-20m-indian-legal-documents-with-citation-graphs-️-8010"><a href="https://old.reddit.com/r/MachineLearning/comments/1sl9yh9/20m_indian_legal_documents_with_citation_graphs/">Researcher Releases 20M+ Indian Legal Documents with Citation Graphs</a> ⭐️ 8.0/10</h2>

<p>A researcher has released a massive dataset comprising over 20 million Indian court cases from the Supreme Court, 25 High Courts, and 14 Tribunals, featuring structured metadata and classified citation graphs. Each document includes dense 1024-dimensional embeddings generated by Voyage AI and sparse BM25 vectors, alongside cross-references to 23,122 Acts and Statutes. This release marks the creation of the first known machine-readable citation network for Indian law, categorizing relationships such as ‘followed,’ ‘distinguished,’ or ‘overruled.’ This dataset addresses a critical gap in low-resource NLP by providing formal, domain-specific legal text rather than the conversational or news data typically available for Indian languages. The inclusion of a structured citation graph enables advanced research into Graph Neural Networks (GNNs) for predicting legal outcomes and analyzing judicial influence, which was previously impossible at this scale. Furthermore, the combination of dense and sparse vectors offers an ideal evaluation bed for Retrieval-Augmented Generation (RAG) systems in the legal domain, leveraging ground truth citation relationships to benchmark retrieval accuracy. Ultimately, this resource could significantly accelerate the development of AI tools for legal research and outcome prediction in India’s complex judicial system. The dataset is available via API and bulk export in JSON and Parquet formats, with coverage primarily in English as most High Court orders are issued in that language. Metadata extraction accuracy varies by court, with higher precision for the Supreme Court and major High Courts compared to smaller tribunals, and the citation graph boasts an estimated 90-95% precision on extraction though treatment classification is lower. While the median case length is around 3,000 words, some judgments exceed 50,000 words, presenting unique challenges for context window management in large language models.</p>

<p>rss · r/MachineLearning · Apr 14, 14:14</p>

<p><strong>Background</strong>: Legal NLP often relies on citation networks to understand precedent, where courts reference previous judgments to justify decisions, creating a complex web of legal reasoning. In many jurisdictions, especially those with low-resource languages, such structured data is rarely available in a machine-readable format, hindering the application of advanced AI models like Graph Neural Networks. Vector embeddings, such as those from Voyage AI, convert text into numerical representations to capture semantic meaning, while sparse vectors like BM25 focus on keyword matching, and combining both improves search retrieval performance. Creating a dataset that links these embeddings with explicit citation treatments (e.g., whether a case was overruled) provides a rare ‘ground truth’ for training and evaluating legal AI systems.</p>
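
<p>A rough sketch of how the dense and sparse signals could be mixed at query time; the rank_bm25 dependency, weighting, and normalization below are assumptions rather than part of the released dataset tooling:</p>

<pre><code class="language-python">import numpy as np
from rank_bm25 import BM25Okapi  # assumed dependency (pip install rank-bm25); any BM25 implementation works


def hybrid_search(query_vec, query_tokens, doc_vectors, doc_tokens, alpha=0.5, top_k=10):
    # doc_vectors: (N, 1024) precomputed dense embeddings; doc_tokens: tokenized case texts
    bm25 = BM25Okapi(doc_tokens)
    sparse = np.asarray(bm25.get_scores(query_tokens))              # keyword relevance
    dense = doc_vectors @ query_vec / (
        np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(query_vec) + 1e-9
    )                                                               # cosine similarity

    # Normalize both score ranges before mixing so neither signal dominates.
    sparse = (sparse - sparse.min()) / (sparse.max() - sparse.min() + 1e-9)
    dense = (dense - dense.min()) / (dense.max() - dense.min() + 1e-9)
    combined = alpha * dense + (1 - alpha) * sparse
    return np.argsort(-combined)[:top_k]                            # indices of best-matching cases
</code></pre>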

<details><summary>References</summary>
<ul>
<li><a href="https://docs.voyageai.com/docs/embeddings">Text Embeddings - Voyage AI</a></li>
<li><a href="https://www.mongodb.com/docs/voyageai/models/text-embeddings/">Text Embeddings - Voyage AI by MongoDB - MongoDB Docs</a></li>
<li><a href="https://qdrant.tech/articles/sparse-vectors/">What is a Sparse Vector ? How to Achieve Vector -based... - Qdrant</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#legal-nlp</code>, <code class="language-plaintext highlighter-rouge">#datasets</code>, <code class="language-plaintext highlighter-rouge">#graph-neural-networks</code>, <code class="language-plaintext highlighter-rouge">#low-resource-languages</code>, <code class="language-plaintext highlighter-rouge">#rag</code></p>

<hr />

<p><a id="item-10"></a></p>
<h2 id="major-media-outlets-block-internet-archive-amid-ai-training-fears-️-8010"><a href="https://www.wired.com/story/the-internets-most-powerful-archiving-tool-is-in-mortal-peril/">Major Media Outlets Block Internet Archive Amid AI Training Fears</a> ⭐️ 8.0/10</h2>

<p>At least 23 major news sites, including The New York Times, USA Today, and Reddit, have begun blocking the Internet Archive’s ia_archiverbot crawler to prevent their content from being used for AI model training. In response, over 100 journalists and organizations like the Electronic Frontier Foundation (EFF) have signed an open letter defending the critical role of web archiving for historical integrity and fact-checking. While some outlets like The Guardian have not fully blocked access, they have restricted API usage, signaling a broader industry shift against automated data collection. This conflict highlights the growing tension between copyright protection for media companies and the preservation of public digital history, potentially creating permanent gaps in the historical record if left unresolved. If major publishers successfully block archiving tools, future researchers, journalists, and AI models may lose access to verified versions of past news, undermining accountability and the ability to track information evolution. The outcome of this dispute could set a legal and technical precedent for how public web data is accessed and utilized by both non-profit archives and commercial AI developers in the coming decades. An analysis by the AI-detection firm Originality AI confirmed that 23 specific sites are currently blocking the ia_archiverbot user agent, though some publishers claim this is part of a general anti-scraping strategy rather than a targeted move. The Internet Archive has warned that these blocks severely impair society’s ability to understand history and verify changes to online articles, which is essential for combating misinformation. Unlike general search engine crawlers, the Wayback Machine specifically creates time-stamped snapshots that serve as immutable evidence of what was published at a specific moment.</p>

<p>telegram · zaihuapd · Apr 14, 00:12</p>

<p><strong>Background</strong>: The Internet Archive, founded in 1996 by Brewster Kahle, is a non-profit library dedicated to providing universal access to all knowledge through its digital collections and the Wayback Machine. The Wayback Machine has archived over 1 trillion web captures, serving as a vital resource for journalists, lawyers, and historians to retrieve deleted or altered web pages. The Electronic Frontier Foundation (EFF), established in 1990, is a leading civil liberties group that frequently litigates to protect digital rights and fair use doctrines against restrictive copyright claims. Recently, the rise of generative AI has intensified debates over whether scraping public web data for model training constitutes fair use or copyright infringement.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://www.firstpost.com/explainers/wayback-machine-internet-archive-threat-publishers-blocking-ai-copyright-explained-14000179.html">Is the internet’s memory at risk? Wayback Machine under ...</a></li>
<li><a href="https://en.wikipedia.org/wiki/Internet_Archive">Internet Archive</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-training-data</code>, <code class="language-plaintext highlighter-rouge">#copyright</code>, <code class="language-plaintext highlighter-rouge">#digital-preservation</code>, <code class="language-plaintext highlighter-rouge">#media-industry</code>, <code class="language-plaintext highlighter-rouge">#internet-archive</code></p>

<hr />

<p><a id="item-11"></a></p>
<h2 id="shinyhunters-ransom-demand-follows-snowflake-breach-via-anodot-️-8010"><a href="https://thecybersecguru.com/news/rockstar-games-snowflake-breach/">ShinyHunters Ransom Demand Follows Snowflake Breach via Anodot</a> ⭐️ 8.0/10</h2>

<p>The hacker group ShinyHunters has claimed responsibility for breaching Rockstar Games’ data environment by stealing authentication tokens from the third-party monitoring tool Anodot. This access allowed them to infiltrate Rockstar’s Snowflake data warehouse, leading to a ransom demand with an April 14 deadline. The incident is part of a larger supply chain attack wave that has reportedly affected over 400 companies, including Cisco and Telus. This incident highlights the critical vulnerabilities inherent in supply chain dependencies, where compromising a single third-party vendor like Anodot can cascade to hundreds of downstream clients. It demonstrates that even enterprise-grade cloud platforms like Snowflake are susceptible to breaches if identity management and token security are not rigorously maintained across the ecosystem. The potential exposure of financial records and business contracts poses significant operational and reputational risks to major gaming studios and their partners. Furthermore, this event underscores the growing trend of attackers targeting monitoring and observability tools as high-value entry points for lateral movement. Preliminary investigations suggest the breach is limited to internal corporate data, with no current evidence that player passwords or payment details were compromised. The stolen credentials specifically targeted the integration between Anodot and Rockstar’s Snowflake instance, bypassing direct perimeter defenses. While Rockstar and its parent company Take-Two have not yet issued an official statement, the attackers have threatened to release sensitive data if the ransom is not paid by the specified date.</p>

<p>telegram · zaihuapd · Apr 14, 01:49</p>

<p><strong>Background</strong>: Snowflake is a leading cloud-based data warehousing platform known for its enterprise-grade security features, including encryption and granular access control privileges. Supply chain attacks occur when hackers compromise a trusted third-party vendor to gain unauthorized access to the vendor’s customers, often bypassing traditional security perimeters. In this context, Anodot serves as a cloud cost monitoring tool that requires deep integration with data environments like Snowflake to analyze spending patterns, making its credentials highly valuable to attackers. Recent trends show a shift towards targeting these interconnected SaaS tools rather than attacking large enterprises directly.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://docs.snowflake.com/en/user-guide/security-access-control-privileges">Access control privileges | Snowflake Documentation</a></li>
<li><a href="https://www.phdata.io/blog/what-is-the-snowflake-data-cloud/">What is the Snowflake Data Cloud and How Much Does it... | phData</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cybersecurity</code>, <code class="language-plaintext highlighter-rouge">#supply-chain-attack</code>, <code class="language-plaintext highlighter-rouge">#cloud-security</code>, <code class="language-plaintext highlighter-rouge">#data-breach</code>, <code class="language-plaintext highlighter-rouge">#snowflake</code></p>

<hr />

<p><a id="item-12"></a></p>
<h2 id="five-chinese-ministries-launch-national-ai-plus-education-action-plan-️-7010"><a href="https://www.qbitai.com/2026/04/401190.html">Five Chinese Ministries Launch National AI Plus Education Action Plan</a> ⭐️ 7.0/10</h2>

<p>Five Chinese government ministries have jointly issued the ‘AI + Education’ Action Plan to systematically construct an intelligent education ecosystem. This new policy mandates the coordinated development of foundational infrastructure and innovation environments specifically tailored for artificial intelligence in schools. The initiative explicitly aims to accelerate AI talent cultivation and drive application innovations across the national education system. This announcement represents a top-down regulatory shift that will fundamentally reshape how AI is integrated into China’s vast education sector. By formalizing a national strategy, the government signals a strong commitment to closing the AI skills gap and fostering a domestic talent pipeline crucial for technological sovereignty. The plan will likely trigger significant investment in ed-tech infrastructure and curriculum reforms, affecting millions of students and educators. Furthermore, it sets a precedent for other nations considering state-led approaches to AI workforce development. The action plan focuses on two primary pillars: advancing AI talent training and fostering application innovation within educational settings. It emphasizes the need for a unified approach to building the foundational environment and innovation ecosystem required for smart education. While specific numerical targets are not detailed in the summary, the directive requires systematic construction rather than isolated pilot projects.</p>

<p>rss · 量子位 · Apr 14, 10:19</p>

<p><strong>Background</strong>: Artificial Intelligence has increasingly become a core component of global educational strategies, with many nations updating curricula to include coding and data science. In China, previous initiatives have focused on digitizing classrooms, but this new plan marks a shift toward specifically integrating AI technologies into the learning process itself. The concept of ‘AI + Education’ generally refers to using machine learning for personalized learning paths, automated grading, and administrative efficiency. This move aligns with China’s broader national goal of becoming a world leader in AI by 2030.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai policy</code>, <code class="language-plaintext highlighter-rouge">#education</code>, <code class="language-plaintext highlighter-rouge">#china</code>, <code class="language-plaintext highlighter-rouge">#talent development</code>, <code class="language-plaintext highlighter-rouge">#regulation</code></p>

<hr />

<p><a id="item-13"></a></p>
<h2 id="qwen-agent-enables-direct-excel-generation-and-editing-via-chat-️-7010"><a href="https://www.qbitai.com/2026/04/401041.html">Qwen Agent Enables Direct Excel Generation and Editing via Chat</a> ⭐️ 7.0/10</h2>

<p>Qwen has introduced a new AI Agent capability that allows users to generate and edit Excel files directly through natural language conversational prompts. This update bypasses traditional manual spreadsheet creation by leveraging the Qwen-Agent framework’s code interpreter and tool usage capabilities. Users can now request data analysis, visualization, or file formatting in plain text, and the system executes the necessary Python code to produce the final Excel document. This development signifies a major shift in productivity tools by transforming static spreadsheets into dynamic, conversational interfaces accessible to non-technical users. It reduces the barrier to entry for complex data tasks, potentially displacing manual workflows that previously required advanced Excel knowledge or separate scripting skills. By integrating directly into the chat interface, Qwen positions itself as a comprehensive workflow automation platform rather than just a text generator. This move aligns with the broader industry trend of agentic AI, where models actively execute tasks rather than merely providing information. The functionality relies on the open-source Qwen-Agent framework, which utilizes atomic components like LLMs, prompts, and a Code Interpreter for math and data visualization. The system can handle multi-turn conversations, allowing users to refine data requests or modify existing Excel files iteratively. Deployment options include using Alibaba Cloud’s DashScope model service or self-hosting the open-source Qwen models with a local database service for history management. The framework also supports plugin integrations, enabling the agent to read uploaded files and analyze their content before generating new outputs.</p>

<p>rss · 量子位 · Apr 14, 02:48</p>

<p><strong>Background</strong>: AI Agents are software systems that use Large Language Models (LLMs) to perceive their environment, plan actions, and utilize tools to achieve specific goals autonomously. The Qwen-Agent framework is an open-source project developed by Alibaba that provides the infrastructure for building these applications, featuring capabilities in instruction following, planning, and memory. Traditionally, creating Excel reports required users to manually input formulas, format cells, or write macros in VBA, creating a high skill floor. Recent advancements in LLM-based workflow automation allow models to write and execute Python code (often via libraries like pandas and openpyxl) to manipulate data files directly, bridging the gap between natural language intent and file system operations.</p>
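
<p>For illustration only, the kind of pandas/openpyxl script such a code interpreter might emit for a request like ‘summarize sales by region and export an Excel report’; the file and column names are hypothetical and not taken from the Qwen documentation:</p>

<pre><code class="language-python">import pandas as pd

# Hypothetical example of agent-generated code, not an excerpt from Qwen-Agent itself.
df = pd.read_csv("sales.csv")                 # file uploaded by the user in the chat
summary = df.groupby(["region", "quarter"], as_index=False)["revenue"].sum()

# openpyxl is the engine pandas uses to write .xlsx files.
with pd.ExcelWriter("sales_report.xlsx", engine="openpyxl") as writer:
    df.to_excel(writer, sheet_name="raw_data", index=False)
    summary.to_excel(writer, sheet_name="summary", index=False)
</code></pre>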

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/QwenLM/Qwen-Agent">GitHub - QwenLM/Qwen-Agent: Agent framework and applications ... How to Use Qwen3 for AI Agents and RAG Systems: Step by Step Qwen-Agent - Read the Docs Qwen Agent: AI Agent Framework Documentation - qwenlm.github.io Qwen3.6-Plus: Towards Real World Agents - Alibaba Cloud qwen-agent · PyPI</a></li>
<li><a href="https://www.stonebranch.com/blog/10-clever-ways-to-embed-llm-tasks-in-automation-workflows">10 Clever Ways to Embed LLM Tasks in Automation Workflows</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#productivity-tools</code>, <code class="language-plaintext highlighter-rouge">#llm-applications</code>, <code class="language-plaintext highlighter-rouge">#workflow-automation</code>, <code class="language-plaintext highlighter-rouge">#qwen</code></p>

<hr />

<p><a id="item-14"></a></p>
<h2 id="nervecode-layerwise-surprise-signals-for-improved-ood-detection-️-7010"><a href="https://old.reddit.com/r/MachineLearning/comments/1sllv77/layerwise_surprise_signal_for_ood_detection_r/">Nervecode: Layerwise Surprise Signals for Improved OOD Detection</a> ⭐️ 7.0/10</h2>

<p>A new PyTorch-based method called Nervecode introduces lightweight observe-only wrappers to generate layerwise ‘surprise’ signals during the standard forward pass. In benchmarks on MNIST transitioning to FashionMNIST, this approach achieved a 0.992 AUROC score, outperforming established methods like Energy-based detection and Maximum Softmax Probability (MSP). Unlike traditional output-only detectors, Nervecode provides a detailed breakdown showing exactly which neural network layers diverge when encountering distribution shifts. This development is significant because it addresses the critical safety challenge of detecting out-of-distribution inputs without requiring heavy computational overhead or model retraining. By offering interpretability at the layer level, it allows developers to understand not just that an input is anomalous, but where in the model’s processing pipeline the anomaly is detected. This could lead to more robust AI systems in high-stakes environments where knowing the source of uncertainty is as important as detecting it. Furthermore, surpassing strong baselines like Energy and MSP suggests a potential shift in how researchers approach confidence scoring in deep learning. The method operates by adding lightweight wrappers to selected layers that function in an ‘observe-only’ mode, ensuring no interference with the normal forward pass. It demonstrated superior performance with a 0.992 AUROC on the specific task of distinguishing MNIST digits from FashionMNIST clothing images. The primary advantage highlighted is its ability to visualize layer-wise divergence, a capability that output-only detectors fundamentally lack. However, the current results are presented as an early-stage idea, implying that broader validation across diverse datasets may still be needed.</p>
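
<p>A minimal sketch of what an observe-only layer probe could look like in PyTorch; this is an interpretation of the idea rather than the Nervecode API, and the running-mean distance is a simplification:</p>

<pre><code class="language-python">import torch
import torch.nn as nn


class SurpriseProbe:
    """Hypothetical observe-only wrapper (not the Nervecode API): a forward hook
    that records how far a layer's activations drift from statistics gathered on
    in-distribution data. It never alters the forward pass."""

    def __init__(self, layer: nn.Module):
        self.calibrating = True          # True while fitting stats on in-distribution data
        self.running_mean = None
        self.last_surprise = None
        layer.register_forward_hook(self._hook)

    def _hook(self, module, inputs, output):
        feats = output.detach().flatten(1).mean(dim=0)      # per-feature batch mean (assumes tensor output)
        if self.running_mean is None:
            self.running_mean = feats.clone()
        if self.calibrating:
            self.running_mean = 0.99 * self.running_mean + 0.01 * feats
        else:
            # "Surprise" = distance between current activations and calibration stats.
            self.last_surprise = torch.norm(feats - self.running_mean).item()


# Usage: attach one probe per layer of interest, run in-distribution batches with
# calibrating=True, then flip to False; large per-layer surprises localize the shift.
</code></pre>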

<p>rss · r/MachineLearning · Apr 14, 21:17</p>

<p><strong>Background</strong>: Out-of-Distribution (OOD) detection is a crucial technique in machine learning designed to identify inputs that differ significantly from the data a model was trained on, preventing unreliable predictions. Traditional methods often rely on the final output layer, such as calculating the Maximum Softmax Probability (MSP) or using Energy scores derived from logits, to determine if an input is unfamiliar. While effective to a degree, these output-only approaches act as black boxes, failing to reveal which internal features or layers triggered the low confidence. Nervecode attempts to solve this opacity by monitoring internal layer activations directly to create a more granular ‘surprise’ signal.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://spotintelligence.com/2024/11/11/out-of-distribution-in-machine-learning-made-simple-how-to-detect-it/">Out-of-Distribution In ML Made Simple &amp; How To Detect It</a></li>
<li><a href="https://arxiv.org/abs/2010.03759">[2010.03759] Energy-based Out-of-distribution Detection GitHub - weitliu/energy_ood Energy-based out-of-distribution detection | Proceedings of ... Images Energy-based Out-of-distribution Detection - NeurIPS Energy-based Out-of-distribution Detection for Multi-label... pytorch_ood.detector.energy — pytorch-ood documentation FEVER-OOD: Free Energy Vulnerability Elimination for Robust ...</a></li>
<li><a href="https://pytorch-ood.readthedocs.io/en/stable/detector.html">Detectors — pytorch-ood documentation</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#machine learning</code>, <code class="language-plaintext highlighter-rouge">#ood detection</code>, <code class="language-plaintext highlighter-rouge">#pytorch</code>, <code class="language-plaintext highlighter-rouge">#research</code>, <code class="language-plaintext highlighter-rouge">#interpretability</code></p>

<hr />

<p><a id="item-15"></a></p>
<h2 id="minimax-sparks-controversy-by-banning-commercial-use-of-open-source-model-27-️-7010"><a href="https://www.cnbeta.com.tw/articles/tech/1557982.htm">MiniMax Sparks Controversy by Banning Commercial Use of Open-Source Model 2.7</a> ⭐️ 7.0/10</h2>

<p>MiniMax recently open-sourced its M2.7 large language model but included a license agreement that explicitly prohibits unauthorized commercial use. In response to developer backlash, employee Ryan Lee explained that this restriction aims to prevent third-party platforms from damaging the brand through poor service quality, such as excessive quantization or misleading templates. Consequently, any third party wishing to deploy MiniMax 2.7 for public services must now obtain official authorization. This decision marks a significant shift in the Chinese AI industry’s approach to open-source licensing, moving away from permissive models toward controlled distribution to protect brand integrity. It directly impacts developers who intended to integrate M2.7 into commercial products or offer it via API without direct partnership agreements. While it may ensure higher service consistency for end-users, it could also slow down ecosystem adoption compared to fully permissive alternatives like Llama or Qwen. This trend suggests that major AI players are increasingly prioritizing quality control and reputation management over maximum community proliferation. The MiniMax M2.7 is a 230-billion-parameter model designed for complex agent tasks, coding, and reasoning, yet its utility is now gated by strict licensing terms. The company cited specific issues like ‘bait-and-switch’ tactics and technical errors on unauthorized hosting sites as the primary drivers for this policy change. Developers must now navigate an authorization process to legally offer commercial services based on this model, adding a layer of friction to deployment workflows.</p>

<p>telegram · zaihuapd · Apr 14, 11:04</p>

<p><strong>Background</strong>: In the AI sector, ‘open-source’ traditionally implies freedom to use, modify, and distribute models, often under licenses like Apache 2.0 or MIT that allow commercial exploitation. However, recent trends show companies releasing model weights while restricting commercial rights to maintain control over how their technology is presented to the market. This hybrid approach attempts to balance community engagement with the need to prevent low-quality wrappers from confusing users about the model’s true capabilities. Understanding this distinction is crucial as the definition of ‘open source’ in AI becomes increasingly nuanced.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://www.minimax.io/models/text/m27">MiniMax M2.7 - Model Self-Improvement, Driving Productivity ...</a></li>
<li><a href="https://github.com/MiniMax-AI/MiniMax-M2.7">GitHub - MiniMax-AI/MiniMax-M2.7</a></li>
<li><a href="https://build.nvidia.com/minimaxai/minimax-m2.7">minimax-m2.7 Model by Minimaxai | NVIDIA NIM</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#licensing</code>, <code class="language-plaintext highlighter-rouge">#minimax</code>, <code class="language-plaintext highlighter-rouge">#ai-industry</code>, <code class="language-plaintext highlighter-rouge">#china-ai</code></p>

<hr />

<h2 id="关注动态-1">关注动态</h2>

<p><a id="item-16"></a></p>
<h2 id="memsearch-updates-6-updates--bump-memsearch-030-and-claude-code-plugin-035-348-add-jina-and-mistral-embedding-providers-346-expand-feature-matrix-with-embedding-providers-and-optional-rer-️-10"><a href="https://github.com/zilliztech/memsearch/commit/b38c894d679e65ffb131205b71ea1b453a1b2269">MemSearch Updates: 6 updates — bump memsearch 0.3.0 and claude-code plugin 0.3.5 (#348), add Jina and Mistral embedding providers (#346), expand feature matrix with embedding providers and optional rer…</a> ⭐️ ?/10</h2>

<p>MemSearch has been updated to version 0.3.0, accompanied by an upgrade to the Claude Code plugin (v0.3.5). Significant functionality was added with support for Jina and Mistral embedding providers, expanding the available options for vector generation. The documentation has been comprehensively refreshed to include a detailed feature matrix covering these new providers, optional reranking capabilities, and a refined comparison section against alternative tools.</p>

<p>rss · MemSearch Updates · Apr 14, 10:08</p>

<hr />

<p><a id="item-17"></a></p>
<h2 id="chorereadme-update-the-preview-pic-️-10"><a href="https://github.com/Thysrael/Horizon/commit/0f52c5654e8ab28b97676f8c1b508fe96923cb0e">chore(README): update the preview pic</a> ⭐️ ?/10</h2>

<p>The repository recently updated the preview image in the README file. This is a documentation-only change to improve visual representation and does not affect any functionality, code logic, or APIs. No action is required from developers.</p>

<p>rss · Horizon Upstream · Apr 14, 14:33</p>

<hr />

<p><a id="item-18"></a></p>
<h2 id="superpowers-updates-10-updates--merge-pull-request-1165-from-obramirror-codex-plugin-tooling-anchor-excludes-patterns-to-source-root-exclude-assets-add-bootstrap-flag-️-10"><a href="https://github.com/obra/superpowers/commit/f9b088f7b3a6fe9d9a9a98e392ad13c9d47053a4">Superpowers Updates: 10 updates — Merge pull request #1165 from obra/mirror-codex-plugin-tooling, anchor EXCLUDES patterns to source root, exclude assets/, add –bootstrap flag</a> ⭐️ ?/10</h2>

<p>This update introduces new tooling to mirror the Superpowers repository as a Codex plugin, including a rewritten sync process that automatically clones the fork, opens a pull request, and regenerates overlays. The sync utility has been enhanced with a <code class="language-plaintext highlighter-rouge">--bootstrap</code> flag, explicit exclusion of the <code class="language-plaintext highlighter-rouge">assets/</code> directory, and logic to anchor exclude patterns to the source root for better reliability. Configuration files like <code class="language-plaintext highlighter-rouge">plugin.json</code> have been aligned with the live shape, and unnecessary legacy files such as <code class="language-plaintext highlighter-rouge">CHANGELOG.md</code> and specific agent configurations have been removed to streamline the project.</p>

<p>rss · Superpowers Updates · Apr 14, 21:13</p>

<hr />

<p><a id="item-19"></a></p>
<h2 id="openaicodex-2-releases--rust-v01210-alpha9-rust-v01210-alpha8-️-10"><a href="https://github.com/openai/codex/releases/tag/rust-v0.121.0-alpha.9">openai/codex: 2 releases — rust-v0.121.0-alpha.9, rust-v0.121.0-alpha.8</a> ⭐️ ?/10</h2>

<p>The openai/codex repository published two new alpha releases for its Rust implementation: v0.121.0-alpha.8 and v0.121.0-alpha.9. The provided logs only confirm the release timestamps and version tags, with no specific details on functionality changes, bug fixes, or breaking changes included in the announcement. Developers tracking this project should pull the latest tags to test potential internal updates typical of alpha iterations, but no actionable feature changes can be confirmed from the current summary.</p>

<p>github · github-actions[bot] · Apr 14, 16:45</p>

<hr />

<p><a id="item-20"></a></p>
<h2 id="anthropicsclaude-code-2-releases--v21108-v21107-️-10"><a href="https://github.com/anthropics/claude-code/releases/tag/v2.1.108">anthropics/claude-code: 2 releases — v2.1.108, v2.1.107</a> ⭐️ ?/10</h2>

<p>The repository released two new versions, v2.1.107 and v2.1.108, in quick succession. However, the provided release notes contain only timestamps and version tags without any details on specific functionality changes, bug fixes, or breaking updates. Consequently, it is impossible to determine the technical impact of these releases or identify any actionable items for developers based solely on this information. Users are advised to check the full commit history or detailed changelogs for specific modifications.</p>

<p>github · ashwin-ant · Apr 14, 19:12</p>

<hr />

<p><a id="item-21"></a></p>
<h2 id="upstashcontext7-released-ctx70313-️-10"><a href="https://github.com/upstash/context7/releases/tag/ctx7%400.3.13">upstash/context7 released ctx7@0.3.13</a> ⭐️ ?/10</h2>

<p>This patch release resolves a critical bug affecting Windows users during skill installation. Previously, the path validation logic incorrectly rejected valid files within the target directory because it failed to handle backslash-separated resolved paths correctly. This fix ensures that skill installations proceed smoothly on Windows environments without false-positive path errors. No breaking changes or new features were introduced in this update.</p>

<p>github · github-actions[bot] · Apr 14, 07:51</p>

<hr />

<h2 id="github-热榜-1">GitHub 热榜</h2>

<p><a id="item-22"></a></p>
<h2 id="karpathys-llmc-raw-ccuda-llm-training-for-education-️-10010"><a href="https://github.com/karpathy/llm.c">Karpathy’s llm.c: Raw C/CUDA LLM Training for Education</a> ⭐️ 10.0/10</h2>

<p>Andrej Karpathy has released llm.c, a minimal implementation of large language model training written entirely in raw C and CUDA without external dependencies. This project strips away high-level frameworks like PyTorch to expose the fundamental mechanics of GPU-accelerated deep learning. It serves as a direct educational tool for understanding the low-level infrastructure behind modern AI models. This project matters because it demystifies the ‘black box’ of deep learning frameworks by revealing the actual code responsible for tensor operations and backpropagation. For AI engineers, reading this code provides unparalleled insight into memory management, kernel optimization, and the mathematical foundations of transformers that are often abstracted away. Unlike production engines focused on speed, llm.c prioritizes code readability and pedagogical clarity to bridge the gap between theory and systems programming. The repository implements the full training loop, including data loading, forward passes, loss calculation, and backward propagation using only standard C and NVIDIA’s CUDA API. It avoids complex build systems or third-party libraries, making it easy to compile and inspect on any Linux machine with a GPU. The codebase is specifically designed to be small enough for a single developer to comprehend fully while remaining functional for training small-scale models.</p>

<p>rss · GitHub Trending - CUDA · Apr 14, 01:34</p>

<p><strong>Background</strong>: Modern deep learning is typically conducted using high-level frameworks like PyTorch or TensorFlow, which abstract away the underlying hardware interactions. While efficient, this abstraction often prevents engineers from understanding how gradients are actually computed or how memory is managed on the GPU. llm.c fills this niche by providing a from-scratch implementation that mirrors the functionality of these frameworks but with complete transparency. It contrasts sharply with production inference engines like Alibaba’s RTP-LLM, which are optimized for throughput and latency rather than educational clarity.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/karpathy/llm.c">GitHub - karpathy/llm.c: LLM training in simple, raw C/CUDA</a></li>
<li><a href="https://deepwiki.com/karpathy/llm.c">karpathy/llm.c | DeepWiki</a></li>
<li><a href="https://github.com/alibaba/rtp-llm">GitHub - alibaba/rtp-llm: RTP-LLM: Alibaba's high-performance ...</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The AI community has responded with significant enthusiasm, viewing llm.c as an essential resource for students and practitioners wanting to master CUDA programming. Many users are leveraging the codebase to learn how to write custom kernels and understand the intricacies of distributed training without framework overhead.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#c</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#education</code></p>

<hr />

<p><a id="item-23"></a></p>
<h2 id="instant-ngp-lightning-fast-neural-graphics-via-cuda-️-10010"><a href="https://github.com/NVlabs/instant-ngp">Instant-NGP: Lightning-Fast Neural Graphics via CUDA</a> ⭐️ 10.0/10</h2>

<p>NVIDIA’s instant-ngp introduces highly optimized CUDA kernels that drastically reduce training and inference times for Neural Radiance Fields (NeRFs). This project shifts neural graphics from hours of training to seconds or minutes by leveraging multi-resolution hash encoding. It provides a standalone application and library for immediate integration into 3D AI workflows. Prior NeRF implementations were often too slow for practical interactive applications or rapid prototyping, limiting their adoption in real-time systems. Instant-NGP solves this bottleneck by achieving up to 100x speedups through efficient memory access patterns and sparse data structures. This breakthrough makes high-quality 3D reconstruction viable for consumer hardware and real-time rendering pipelines. Consequently, it has become the de facto standard infrastructure for modern neural graphics research. The core innovation lies in its use of a trainable multi-resolution hash table to encode spatial features, allowing for instant lookup and gradient updates. Custom CUDA kernels handle the heavy lifting of ray marching and network evaluation, ensuring maximum GPU occupancy. The project supports various primitives beyond NeRFs, including neural surfaces and volume rendering.</p>
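
<p>The core trick, multi-resolution hash encoding, can be sketched at a framework level. The PyTorch snippet below is a conceptual illustration only (nearest-corner lookup, no trilinear interpolation, no fused kernels) and is not the project’s CUDA implementation; the class name and hyperparameters are illustrative.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch
import torch.nn as nn


class HashEncoding(nn.Module):
    """Conceptual multi-resolution hash encoding (nearest-corner lookup, no interpolation)."""

    PRIMES = (1, 2654435761, 805459861)  # spatial-hash multipliers used in the paper

    def __init__(self, n_levels=8, table_size=2**14, feat_dim=2, base_res=16, growth=1.5):
        super().__init__()
        self.resolutions = [int(base_res * growth**i) for i in range(n_levels)]
        self.tables = nn.ParameterList(
            [nn.Parameter(1e-4 * torch.randn(table_size, feat_dim)) for _ in range(n_levels)]
        )
        self.table_size = table_size

    def forward(self, xyz):                      # xyz: (N, 3) coordinates in [0, 1]
        feats = []
        for res, table in zip(self.resolutions, self.tables):
            cell = (xyz * res).long()            # integer grid cell at this resolution
            h = torch.zeros_like(cell[:, 0])
            for d in range(3):                   # XOR of each coordinate times a large prime
                h = h ^ (cell[:, d] * self.PRIMES[d])
            feats.append(table[h % self.table_size])
        return torch.cat(feats, dim=-1)          # (N, n_levels * feat_dim), fed to a small MLP
</code></pre></div></div>

<p>In the real system the per-level features are trilinearly interpolated from the surrounding grid corners and the small downstream MLP is fused into the same kernels, which accounts for much of the reported speedup.</p>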

<p>rss · GitHub Trending - CUDA · Apr 14, 01:34</p>

<p><strong>Background</strong>: Neural Radiance Fields revolutionized view synthesis but initially suffered from prohibitive training times ranging from hours to days on single GPUs. Existing solutions relied on dense voxel grids or slow MLP evaluations that did not fully exploit GPU parallelism. Instant-NGP fills the niche for real-time capable neural rendering by rethinking data representation and low-level kernel optimization. It builds upon NVIDIA’s deep expertise in CUDA best practices to overcome memory bandwidth and compute latency issues.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html">CUDA C++ Best Practices Guide - NVIDIA Documentation Hub</a></li>
<li><a href="https://siboehm.com/articles/22/CUDA-MMM">How to Optimize a CUDA Matmul Kernel for cuBLAS-like ... CUDA Kernel Optimization for Image Convolution - Medium GitHub - OptimAI-Lab/CudaForge: Official Repo of CudaForge 3.2. Advanced Kernel Programming — CUDA Programming Guide GPU MODE Lecture 8: CUDA Performance Checklist</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The AI engineering community widely regards this repository as essential reading for anyone optimizing deep learning kernels for 3D tasks. Developers frequently cite its hash encoding technique as a key inspiration for subsequent fast 3D reconstruction models like TensoRF and 3D Gaussian Splatting.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#nerf</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#computer-vision</code>, <code class="language-plaintext highlighter-rouge">#3d-reconstruction</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code></p>

<hr />

<p><a id="item-24"></a></p>
<h2 id="sageattention-quantized-speedup-for-transformers-️-10010"><a href="https://github.com/thu-ml/SageAttention">SageAttention: Quantized Speedup for Transformers</a> ⭐️ 10.0/10</h2>

<p>SageAttention introduces a quantized attention mechanism that delivers 2-5x speedups over FlashAttention across language, image, and video models. This optimization maintains end-to-end accuracy while significantly reducing inference latency on standard hardware. This tool directly addresses critical inference bottlenecks by minimizing data movement between high-bandwidth memory and on-chip SRAM through advanced quantization. Unlike previous methods that often sacrificed accuracy for speed, SageAttention achieves substantial performance gains without degrading model metrics. Its acceptance at top-tier conferences like ICLR and NeurIPS validates its robustness for production environments. AI engineers can now deploy larger or more complex transformer models with reduced computational costs. The project supports diverse domains including natural language processing, computer vision, and video analysis without requiring model retraining. It integrates seamlessly as a drop-in replacement for existing attention layers in PyTorch-based workflows. Benchmarks indicate consistent acceleration factors ranging from 2x to 5x depending on sequence length and hardware configuration.</p>
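
<p>The released kernels are custom CUDA, but the underlying idea can be shown in plain PyTorch. The sketch below is a didactic stand-in rather than the library’s algorithm or API: it quantizes Q and K to INT8 with per-row scales, forms the score matrix, rescales, and keeps the softmax and value product in full precision.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch


def quantize_per_row(t):
    """Symmetric per-row INT8 quantization: returns the int8 tensor and per-row scales."""
    scale = t.abs().amax(dim=-1, keepdim=True).clamp(min=1e-8) / 127.0
    q = torch.clamp((t / scale).round(), -127, 127).to(torch.int8)
    return q, scale


def quantized_attention(q, k, v):                # q, k, v: (batch, heads, seq, head_dim)
    d = q.shape[-1]
    q_i8, q_scale = quantize_per_row(q)
    k_i8, k_scale = quantize_per_row(k)
    # Integer-valued matmul (done in fp32 here for simplicity), then rescale the scores.
    scores = q_i8.float() @ k_i8.float().transpose(-2, -1)
    scores = scores * q_scale * k_scale.transpose(-2, -1) / d**0.5
    attn = scores.softmax(dim=-1)
    return attn @ v                              # value product stays in full precision
</code></pre></div></div>

<p>The actual kernels run the low-precision matmul on tensor cores and add further accuracy-preserving refinements, which is how the reported 2-5x gains over FlashAttention are achieved without metric degradation.</p>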

<p>rss · GitHub Trending - CUDA · Apr 14, 01:34</p>

<p><strong>Background</strong>: Transformer models have become the standard for AI tasks but suffer from high memory bandwidth requirements during attention computation. FlashAttention previously addressed this by optimizing memory access patterns, yet further gains were limited by precision constraints. SageAttention fills this niche by applying aggressive quantization techniques to the attention matrix calculations. This approach allows for faster computation while preserving the numerical stability required for deep learning training and inference.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://gordicaleksa.medium.com/eli5-flash-attention-5c44017022ad">ELI5: FlashAttention. Step by step explanation of how one of ...</a></li>
<li><a href="https://www.theneuron.ai/explainer-articles/flashattention-4-explained-the-software-that-makes-every-ai-chatbot-fast-just-got-a-massive-upgrade-tri-dao-blackwell/">FlashAttention-4, Explained: What it is &amp; Why it Matters</a></li>
<li><a href="https://iclr-blogposts.github.io/2026/blog/2026/the-evolution-of-flashattention/">The Evolution of FlashAttention | ICLR Blogposts 2026</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight the ease of integration and the immediate cost savings on cloud inference instances. The community is actively discussing potential extensions to support even lower bit-widths for edge devices.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#transformers</code>, <code class="language-plaintext highlighter-rouge">#optimization</code>, <code class="language-plaintext highlighter-rouge">#llm-inference</code></p>

<hr />

<p><a id="item-25"></a></p>
<h2 id="voxcpm2-tokenizer-free-multilingual-tts-and-voice-cloning-️-9010"><a href="https://github.com/OpenBMB/VoxCPM">VoxCPM2: Tokenizer-Free Multilingual TTS and Voice Cloning</a> ⭐️ 9.0/10</h2>

<p>VoxCPM2 introduces a 2B-parameter tokenizer-free architecture that generates continuous speech representations via end-to-end diffusion. It expands support to 30 languages and adds unique capabilities like text-based voice design and controllable cloning without needing reference audio. By bypassing discrete tokenization, this model overcomes the prosody limitations and artifacts common in traditional TTS systems, resulting in significantly more natural and expressive audio. The ability to design voices purely from text descriptions democratizes creative audio production for developers lacking large voice datasets. Furthermore, its 48kHz output quality makes it viable for professional studio applications rather than just experimental demos. Built on the MiniCPM-4 backbone, the model was trained on over 2 million hours of multilingual speech data to ensure robust performance. Key features include ultimate cloning that preserves vocal nuances when provided with transcripts, and seamless integration with Hugging Face and ModelScope. The system uses a LocEnc → TSLM → RALM → LocDiT pipeline for high-fidelity synthesis.</p>

<p>rss · GitHub Trending - Python · Apr 14, 01:39</p>

<p><strong>Background</strong>: Traditional Text-to-Speech (TTS) systems typically rely on converting audio into discrete tokens, a process that often strips away subtle emotional nuances and limits prosodic flexibility. VoxCPM addresses this by modeling speech directly in a continuous space, eliminating the information loss associated with quantization. This approach fills a critical niche for applications requiring high-fidelity, emotionally resonant, and multilingual voice synthesis without the constraints of fixed vocabularies.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/OpenBMB/VoxCPM/">VoxCPM2: Tokenizer-Free TTS for Multilingual Speech ... - GitHub</a></li>
<li><a href="https://openbmb.github.io/voxcpm2-demopage/">VoxCPM2 Demo Page</a></li>
<li><a href="https://aibit.im/blog/post/voxcpm2-2b-multilingual-tts-with-voice-cloning-design">VoxCPM2: 2B Multilingual TTS with Voice Cloning &amp; Design</a></li>
<li><a href="https://pyshine.com/VoxCPM-Tokenizer-Free-TTS/">VoxCPM: Tokenizer-Free TTS for Multilingual Speech Generation</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The AI community is actively discussing the implications of tokenizer-free architectures for real-time inference latency compared to established models like VITS or Tortoise. Early adopters are particularly interested in the ‘Voice Design’ feature for creating unique brand assets without recording sessions.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#text-to-speech</code>, <code class="language-plaintext highlighter-rouge">#voice-cloning</code>, <code class="language-plaintext highlighter-rouge">#multilingual-ai</code>, <code class="language-plaintext highlighter-rouge">#generative-audio</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code></p>

<hr />

<p><a id="item-26"></a></p>
<h2 id="axolotl-streamlines-production-ready-llm-fine-tuning-️-9010"><a href="https://github.com/axolotl-ai-cloud/axolotl">Axolotl Streamlines Production-Ready LLM Fine-Tuning</a> ⭐️ 9.0/10</h2>

<p>Recent updates include native support for Mistral Small 4, Qwen3.5 MoE, and GLM-4 series models alongside new MoE expert quantization to drastically reduce VRAM usage. The framework now integrates ScatterMoE LoRA for direct expert weight tuning, SageAttention for optimized attention mechanisms, and advanced techniques like Entropy-Aware Focal Training. Axolotl addresses the critical gap between research prototypes and production deployment by offering a unified, YAML-driven configuration system that eliminates boilerplate code. Its robust support for memory-efficient techniques like FSDP2 and quantization allows engineers to fine-tune massive models on limited hardware without sacrificing performance. By automating complex workflows such as multi-GPU training and RLHF alignment, it significantly accelerates the iteration cycle for custom AI applications. The framework is built on PyTorch and Hugging Face ecosystems, supporting diverse strategies including full fine-tuning, LoRA, QLoRA, and DPO. It features automated dataset preprocessing, mixed-precision training, and extensive logging via WandB or CometML. Recent additions specifically target Mixture-of-Experts architectures with custom Triton kernels for optimized speed and memory efficiency.</p>

<p>rss · GitHub Trending - Python · Apr 14, 01:39</p>

<p><strong>Background</strong>: Fine-tuning large language models traditionally requires writing extensive, error-prone training loops and manually managing distributed computing resources. While libraries like Hugging Face Transformers offer primitives, they often lack an end-to-end opinionated workflow for production-scale tasks. Axolotl fills this niche by providing a standardized, battle-tested pipeline that abstracts away infrastructure complexity while maintaining flexibility for expert customization.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://arxiv.org/html/2408.13296v1">The Ultimate Guide to Fine-Tuning LLMs from Basics to ...</a></li>
<li><a href="https://www.turing.com/resources/finetuning-large-language-models">What is Fine-Tuning LLM? Methods &amp; Step-by-Step Guide in 2026</a></li>
<li><a href="https://github.com/rasbt/LLMs-from-scratch">GitHub - rasbt/LLMs-from-scratch: Implement a ChatGPT-like ... Quantization-Aware Training for Large Language Models with ... Fine-Tuning Your First Large Language Model (LLM) with ... Build your own Large Language Model (LLM) From Scratch Using ... PyTorch Language Models - Compile N Run</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project boasts a highly active community with rigorous nightly testing and multi-GPU end-to-end validation to ensure stability across updates. Users frequently highlight its superior documentation and Discord support as key advantages over competing frameworks when debugging complex training runs.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#fine-tuning</code>, <code class="language-plaintext highlighter-rouge">#pytorch</code>, <code class="language-plaintext highlighter-rouge">#machine-learning</code>, <code class="language-plaintext highlighter-rouge">#ai-infrastructure</code></p>

<hr />

<p><a id="item-27"></a></p>
<h2 id="microsoft-agent-lightning-streamlines-ai-agent-training-️-9010"><a href="https://github.com/microsoft/agent-lightning">Microsoft Agent Lightning Streamlines AI Agent Training</a> ⭐️ 9.0/10</h2>

<p>Microsoft has released Agent Lightning, an open-source framework designed to train and evaluate autonomous AI agents with zero code changes. It acts as a flexible intermediate layer connecting popular agent frameworks like LangChain and AutoGen directly to LLM training infrastructures such as verl. The project supports diverse optimization algorithms including Reinforcement Learning and Automatic Prompt Optimization out of the box. This framework addresses a critical infrastructure gap by allowing developers to optimize agents without rewriting their existing logic or switching ecosystems. By exposing an OpenAI-compatible API within the training loop, it eliminates complex retokenization issues and enables seamless integration with standard RL workflows. This significantly lowers the barrier for applying advanced training techniques like GRPO to multi-agent systems in production environments. Agent Lightning features selective optimization capabilities, allowing users to target specific agents within a multi-agent system for fine-tuning. It is available via PyPI with comprehensive documentation and includes full unit test coverage to ensure stability. The framework supports trajectory-level aggregation for faster training and handles token ID returns to prevent drift during reinforcement learning.</p>
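
<p>The practical hook for existing agent code is that OpenAI-compatible endpoint: an agent written against the standard <code class="language-plaintext highlighter-rouge">openai</code> Python client only needs its base URL pointed at the trainer. The endpoint URL and model name below are hypothetical placeholders, not values from Agent Lightning’s documentation.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from openai import OpenAI

# Hypothetical endpoint and model id; the real values come from the Agent Lightning setup.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed-locally")


def agent_step(task: str) -> str:
    """Unchanged agent logic; only the client configuration points at the training loop."""
    resp = client.chat.completions.create(
        model="trainee-model",
        messages=[{"role": "user", "content": task}],
    )
    return resp.choices[0].message.content
</code></pre></div></div>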

<p>rss · GitHub Trending - Python · Apr 14, 01:39</p>

<p><strong>Background</strong>: Prior to Agent Lightning, training autonomous agents often required cumbersome custom integrations between agent orchestration tools and deep learning trainers. Developers frequently faced challenges with tokenization mismatches and lacked standardized protocols for evaluating agent performance during RL phases. This project fills that niche by providing a unified, Microsoft-backed interface that bridges these disjointed tools.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/microsoft/agent-lightning">GitHub - microsoft/agent-lightning: The absolute trainer to ...</a></li>
<li><a href="https://www.microsoft.com/en-us/research/project/agent-lightning/">Agent Lightning - Microsoft Research</a></li>
<li><a href="https://microsoft.github.io/agent-lightning/latest/">Agent-lightning - microsoft.github.io</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters are highlighting the framework’s ability to solve retokenization drift issues when using vLLM with OpenAI-compatible APIs. Community tutorials are already emerging demonstrating how to combine Agent Lightning with other tools like Tinker for rapid agent tuning.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#machine-learning</code>, <code class="language-plaintext highlighter-rouge">#training-framework</code>, <code class="language-plaintext highlighter-rouge">#microsoft</code>, <code class="language-plaintext highlighter-rouge">#python</code></p>

<hr />

<p><a id="item-28"></a></p>
<h2 id="flowise-visual-low-code-builder-for-langchain-agents-️-9010"><a href="https://github.com/FlowiseAI/Flowise">Flowise: Visual Low-Code Builder for LangChain Agents</a> ⭐️ 9.0/10</h2>

<p>Flowise provides an open-source drag-and-drop interface that allows developers to build custom LLM flows and AI agents visually. It leverages existing LangChain components to eliminate the need for extensive boilerplate code during the prototyping phase. The tool supports immediate deployment via Docker or npm, making it accessible for rapid iteration. This tool significantly lowers the barrier to entry for creating complex AI agents by abstracting away the intricate wiring of LangChain components. It accelerates the development lifecycle, allowing engineers to test logic flows and agent architectures in minutes rather than hours. By visualizing the connections between chains, tools, and models, teams can better collaborate on debugging and optimizing AI behaviors. This shift enables a focus on high-level strategy and prompt engineering rather than infrastructure setup. Flowise supports self-hosting via Docker Compose and offers a cloud version for managed services. It includes pre-built nodes for various LLM providers, vector stores, and document loaders found in the LangChain ecosystem. Users can export their created flows as JSON or integrate them directly into applications via API endpoints.</p>
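
<p>Calling a deployed flow from application code is a plain HTTP request. The snippet below is a hedged sketch: the host, flow ID, and exact prediction route are illustrative placeholders, and Flowise’s own documentation defines the real endpoint and payload shape.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import requests

FLOWISE_HOST = "http://localhost:3000"   # illustrative; use your deployment's address
CHATFLOW_ID = "your-chatflow-id"         # copied from the Flowise UI


def ask_flow(question: str) -> dict:
    """Send a question to a deployed chatflow and return its JSON response."""
    url = f"{FLOWISE_HOST}/api/v1/prediction/{CHATFLOW_ID}"
    resp = requests.post(url, json={"question": question}, timeout=60)
    resp.raise_for_status()
    return resp.json()


print(ask_flow("Summarize the onboarding flow in two sentences."))
</code></pre></div></div>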

<p>rss · GitHub Trending - TypeScript · Apr 14, 01:41</p>

<p><strong>Background</strong>: Building production-ready LLM applications with LangChain often requires writing significant amounts of Python or JavaScript code to chain components together. This coding overhead can slow down experimentation and make it difficult for non-developers to understand the agent’s logic. Flowise fills this niche by providing a GUI layer over LangChain, similar to how Node-RED operates for IoT or Zapier for workflows. It transforms abstract code structures into tangible, editable flowcharts.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://docs.langchain.com/oss/javascript/langchain/component-architecture">Component architecture - Docs by LangChain</a></li>
<li><a href="https://www.geeksforgeeks.org/artificial-intelligence/introduction-to-langchain/">Introduction to LangChain - GeeksforGeeks</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project has gained strong traction on GitHub with active community support via Discord, indicating a robust ecosystem for troubleshooting and feature requests. Users frequently share custom node templates and complex agent patterns, fostering a collaborative environment for advanced use cases.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#low-code</code>, <code class="language-plaintext highlighter-rouge">#langchain</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code></p>

<hr />

<p><a id="item-29"></a></p>
<h2 id="deepep-optimized-communication-for-moe-training-️-9010"><a href="https://github.com/deepseek-ai/DeepEP">DeepEP: Optimized Communication for MoE Training</a> ⭐️ 9.0/10</h2>

<p>DeepSeek AI has released DeepEP, a specialized CUDA library designed to optimize expert parallelism in large Mixture-of-Experts (MoE) models. It introduces high-throughput, low-latency all-to-all GPU kernels specifically for MoE dispatch and combine operations. The library also integrates support for low-precision FP8 operations to further enhance efficiency. Training massive MoE models often stalls due to communication bottlenecks during the complex all-to-all data transfers required by expert parallelism. DeepEP directly addresses this infrastructure gap by providing tailored kernels that significantly reduce latency compared to generic collective communication libraries. This enables researchers and engineers to scale MoE architectures more effectively on existing GPU clusters without being limited by network overhead. The library implements optimized dispatch and combine operations aligned with group-limited gating algorithms found in models like DeepSeek-V3. It supports fine-grained scaling and low-precision formats, including FP8, to maximize hardware utilization on modern NVIDIA GPUs. DeepEP is designed as a standalone component that can be integrated into broader distributed training frameworks.</p>
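
<p>For context, the sketch below shows the baseline dispatch/combine exchange that expert parallelism usually expresses with generic collectives, here <code class="language-plaintext highlighter-rouge">torch.distributed.all_to_all_single</code> under the simplifying assumption of equal per-rank token counts. It is the communication pattern DeepEP replaces with fused kernels, not DeepEP’s API.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch
import torch.distributed as dist


def dispatch_and_combine(tokens, expert_fn):
    """Baseline expert-parallel exchange: tokens go out to expert ranks, results come back.

    Assumes torch.distributed is initialized and tokens (n_tokens, hidden) are already
    grouped by destination rank in equal shares; real systems also handle uneven splits,
    routing metadata, and communication/compute overlap.
    """
    recv = torch.empty_like(tokens)
    dist.all_to_all_single(recv, tokens)      # dispatch: each rank receives its experts' tokens
    processed = expert_fn(recv)               # local expert computation
    out = torch.empty_like(processed)
    dist.all_to_all_single(out, processed)    # combine: results return to their source ranks
    return out
</code></pre></div></div>

<p>DeepEP’s contribution is performing this exchange with purpose-built, gating-aware kernels (including FP8 paths) instead of generic NCCL collectives, which is where the latency and throughput gains come from.</p>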

<p>rss · GitHub Trending - CUDA · Apr 14, 01:34</p>

<p><strong>Background</strong>: Mixture-of-Experts models have become a standard for scaling large language models, but they introduce unique communication challenges distinct from standard data or tensor parallelism. Traditional libraries like NCCL are often suboptimal for the irregular, many-to-many traffic patterns inherent in expert routing. DeepEP fills this niche by offering a purpose-built solution that handles the specific topology and bandwidth requirements of expert parallelism.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/deepseek-ai/DeepEP">GitHub - deepseek-ai/DeepEP: DeepEP: an efficient expert ...</a></li>
<li><a href="https://www.deepep.org/">DeepEP</a></li>
<li><a href="https://github.com/deepseek-ai/DeepGEMM">GitHub - deepseek-ai/DeepGEMM: DeepGEMM: clean and efficient ...</a></li>
<li><a href="https://huggingface.co/blog/moe">Mixture of Experts Explained</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight DeepEP’s potential to unlock higher training throughput for open-source MoE implementations that previously struggled with communication overhead. The accompanying release of DeepGEMM for FP8 matrix multiplication suggests a cohesive strategy by DeepSeek to optimize the entire MoE training stack.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#moe</code>, <code class="language-plaintext highlighter-rouge">#distributed-training</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#infrastructure</code></p>

<hr />

<p><a id="item-30"></a></p>
<h2 id="mirage-compiles-llms-into-persistent-cuda-mega-kernels-️-9010"><a href="https://github.com/mirage-project/mirage">Mirage Compiles LLMs into Persistent CUDA Mega-Kernels</a> ⭐️ 9.0/10</h2>

<p>Mirage introduces a compiler framework that automatically transforms multi-GPU LLM inference into a single persistent mega-kernel. This approach fuses all computation and communication steps, eliminating the need for frequent CPU-GPU synchronization during model execution. Traditional LLM inference suffers from significant latency due to kernel launch overhead and CPU-GPU synchronization bottlenecks. By compiling the entire inference graph into one persistent kernel, Mirage reduces latency by 1.2x to 6.7x while improving GPU utilization. This optimization is critical for production environments where low-latency serving directly impacts cost and user experience. The system utilizes an SM-level graph representation to capture data dependencies at the granularity of individual streaming multiprocessors. It enables cross-operator software pipelining and fine-grained kernel fusion without requiring manual developer intervention. Performance gains are achieved across multi-GPU setups by minimizing inter-kernel communication overhead.</p>

<p>rss · GitHub Trending - CUDA · Apr 14, 01:34</p>

<p><strong>Background</strong>: Large Language Model inference typically involves launching thousands of small CUDA kernels, leading to substantial CPU overhead and underutilized GPU resources. Existing solutions like vLLM or TensorRT-LLM optimize memory management and operator fusion but still rely on multiple kernel launches per request. Mirage addresses this by treating the entire inference sequence as a single, long-running persistent kernel that resides on the GPU.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/mirage-project/mirage">GitHub - mirage-project/mirage: Mirage Persistent Kernel ...</a></li>
<li><a href="https://arxiv.org/abs/2512.22219">Mirage Persistent Kernel: A Compiler and Runtime for Mega ...</a></li>
<li><a href="https://zhihaojia.medium.com/compiling-llms-into-a-megakernel-a-path-to-low-latency-inference-cf7840913c17">Compiling LLMs into a MegaKernel: A Path to Low-Latency ...</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early benchmarks from CMU, NVIDIA, and Tsinghua indicate substantial speedups for transformer-based models, sparking interest in high-frequency trading and real-time chat applications. Developers are particularly noting the ease of integration compared to manual kernel tuning efforts.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#compiler</code>, <code class="language-plaintext highlighter-rouge">#gpu-optimization</code>, <code class="language-plaintext highlighter-rouge">#inference</code></p>

<hr />

<p><a id="item-31"></a></p>
<h2 id="dao-ailab-releases-optimized-causal-conv1d-cuda-kernel-️-9010"><a href="https://github.com/Dao-AILab/causal-conv1d">Dao-AILab Releases Optimized Causal Conv1d CUDA Kernel</a> ⭐️ 9.0/10</h2>

<p>Dao-AILab has released a highly optimized CUDA implementation of causal depthwise 1D convolutions with a native PyTorch interface. This library specifically targets the computational bottlenecks found in modern sequence modeling architectures like Mamba. This project is critical because it serves as a foundational dependency for the Mamba architecture, enabling linear-time sequence processing that outperforms traditional Transformers on long contexts. By providing a production-ready, fused CUDA kernel, it eliminates the performance overhead typically associated with standard PyTorch operations for this specific pattern. Developers building state-space models or efficient LLMs can now leverage hardware-accelerated convolutions without writing low-level GPU code. The library implements causal depthwise convolutions, ensuring that output at any time step depends only on current and past inputs. It features a seamless PyTorch integration that allows drop-in replacement for slower standard convolution layers. The underlying CUDA kernels are optimized for maximum throughput on NVIDIA GPUs, utilizing techniques like kernel fusion.</p>
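
<p>As a mental model of what the fused kernel computes, a causal depthwise 1D convolution can be written in a few lines of plain PyTorch. The reference below is a slow sketch for intuition (the optional fused activation is omitted) and is not the library’s interface.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch.nn.functional as F


def causal_depthwise_conv1d_ref(x, weight, bias=None):
    """Reference causal depthwise conv: the output at time t uses only inputs up to t.

    x:      (batch, dim, seqlen)
    weight: (dim, width), one short filter per channel (depthwise)
    bias:   (dim,) or None
    """
    dim, width = weight.shape
    seqlen = x.shape[-1]
    # groups=dim makes it depthwise; padding followed by truncation makes it causal.
    out = F.conv1d(x, weight.unsqueeze(1), bias, padding=width - 1, groups=dim)
    return out[..., :seqlen]
</code></pre></div></div>

<p>Mamba-style blocks apply this with very short filter widths over long sequences, which is exactly the regime where a fused CUDA kernel beats the generic padded-convolution formulation.</p>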

<p>rss · GitHub Trending - CUDA · Apr 14, 01:34</p>

<p><strong>Background</strong>: Sequence modeling has long been dominated by Transformers, which suffer from quadratic complexity when processing long sequences. Recent architectures like Mamba utilize Structured State Space Models (SSMs) combined with causal convolutions to achieve linear scaling. Prior to this release, efficient implementation of these specific causal convolutions required custom, often inaccessible, CUDA coding efforts.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Mamba_(deep_learning_architecture)">Mamba (deep learning architecture)</a></li>
<li><a href="https://developer.nvidia.com/blog/advanced-nvidia-cuda-kernel-optimization-techniques-handwritten-ptx/">Advanced NVIDIA CUDA Kernel Optimization Techniques ...</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The AI engineering community views this release as a vital enabler for adopting Mamba and similar SSM-based models in production environments. High scores reflect the trust in Dao-AILab’s reputation for delivering rigorous, high-performance GPU primitives.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#pytorch</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#kernels</code>, <code class="language-plaintext highlighter-rouge">#mamba</code></p>

<hr />

<p><a id="item-32"></a></p>
<h2 id="kronos-first-open-source-foundation-model-for-financial-k-lines-️-8010"><a href="https://github.com/shiyu-coder/Kronos">Kronos: First Open-Source Foundation Model for Financial K-Lines</a> ⭐️ 8.0/10</h2>

<p>The Kronos paper has been accepted at AAAI 2026, and the project has released fine-tuning scripts to adapt the model for specific quantitative tasks. The project now provides accessible model weights on Hugging Face and a live demo forecasting BTC/USDT trends. This update marks a significant step in making specialized financial AI more accessible to developers. Unlike general-purpose time series models that often underperform on noisy financial data, Kronos is specifically pre-trained on K-line sequences from over 45 global exchanges. It introduces a novel two-stage framework using hierarchical discrete tokens to quantize continuous OHLCV data effectively. This specialization allows it to handle high-noise characteristics and complex downstream tasks like volatility prediction better than generic alternatives. By open-sourcing this foundation model, the project lowers the barrier for building robust fintech AI applications without massive training costs. The model family consists of decoder-only Transformers available in varying capacities to suit different computational needs. It uses a specialized tokenizer to convert multi-dimensional candlestick data into discrete tokens before autoregressive pre-training. Users can access the base models via Hugging Face and use the newly released scripts for task-specific fine-tuning.</p>
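
<p>Kronos’s tokenizer is hierarchical and learned, but the basic notion of turning continuous OHLCV candles into a discrete vocabulary can be made concrete with a much cruder stand-in: uniform per-channel binning. The sketch below is for intuition only and bears no relation to Kronos’s actual quantization scheme.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch


def naive_kline_tokens(ohlcv, n_bins=8):
    """Crude illustration: map each candle to one integer token via per-channel binning.

    ohlcv: (seq_len, 5) tensor of open, high, low, close, volume.
    Returns a (seq_len,) LongTensor with vocabulary size n_bins**5; Kronos itself
    learns a hierarchical tokenizer rather than using fixed uniform bins.
    """
    lo = ohlcv.min(dim=0, keepdim=True).values
    hi = ohlcv.max(dim=0, keepdim=True).values
    bins = (((ohlcv - lo) / (hi - lo + 1e-8)) * (n_bins - 1)).round().long()
    token = torch.zeros(ohlcv.shape[0], dtype=torch.long)
    for c in range(5):                     # mixed-radix packing of the 5 channel bins
        token = token * n_bins + bins[:, c]
    return token
</code></pre></div></div>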

<p>rss · GitHub Trending - Daily · Apr 14, 01:33</p>

<p><strong>Background</strong>: Traditional Time Series Foundation Models (TSFMs) often struggle with the unique stochastic nature and high noise levels inherent in financial market data. Prior solutions frequently relied on non-pre-trained architectures or failed to capture the nuanced ‘language’ of candlestick patterns across diverse global exchanges. Kronos addresses this gap by treating K-lines as a distinct linguistic modality, leveraging large-scale pre-training similar to LLMs but tailored for financial structures. This approach aims to overcome the limitations of previous models that overlooked crucial tasks like volatility prediction in favor of simple trend forecasting.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/shiyu-coder/Kronos">Kronos: A Foundation Model for the Language of Financial Markets</a></li>
<li><a href="https://arxiv.org/abs/2508.02739">Kronos: A Foundation Model for the Language of Financial Markets</a></li>
<li><a href="https://huggingface.co/NeoQuasar/Kronos-base">NeoQuasar/Kronos-base · Hugging Face</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The acceptance of the underlying paper by AAAI 2026 signals strong academic validation for its novel tokenization approach to financial data. Early adopters are particularly interested in the released fine-tuning scripts to customize the model for proprietary trading strategies.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#foundation-model</code>, <code class="language-plaintext highlighter-rouge">#fintech</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#financial-analysis</code>, <code class="language-plaintext highlighter-rouge">#huggingface</code></p>

<hr />

<p><a id="item-33"></a></p>
<h2 id="claude-mem-plugin-automates-session-memory-for-ai-agents-️-8010"><a href="https://github.com/thedotmack/claude-mem">Claude-Mem Plugin Automates Session Memory for AI Agents</a> ⭐️ 8.0/10</h2>

<p>The new claude-mem plugin automatically captures, compresses, and injects relevant context from past coding sessions into future interactions. It leverages the Claude Agent SDK to intelligently summarize agent actions and maintain continuity across disjointed workflows. This tool effectively solves the statelessness problem inherent in current AI-assisted coding environments. This project addresses a critical bottleneck where AI agents lose track of previous decisions, forcing developers to repeatedly re-explain context. By automating context compression, it significantly reduces token usage while preserving essential historical data for better agent performance. This enhancement allows developers to treat AI agents as persistent collaborators rather than transient tools. Ultimately, it shifts the paradigm from manual prompt engineering to automated context engineering. Built on the official Claude Agent SDK, the plugin seamlessly integrates with existing Claude Code workflows to manage memory without manual intervention. It employs AI-driven compression to distill large session logs into concise, actionable summaries that fit within context windows. The system automatically retrieves and injects these summaries when relevant topics resurface during new sessions.</p>

<p>rss · GitHub Trending - Daily · Apr 14, 01:33</p>

<p><strong>Background</strong>: AI coding assistants typically operate in a stateless manner, meaning each new session starts with zero knowledge of prior interactions unless explicitly provided by the user. This limitation forces developers to manually copy-paste context or rely on inefficient long-context windows that increase costs and latency. Prior solutions often required custom scripting or external vector databases that added complexity to the developer environment. Claude-Mem fills this niche by providing a native, automated layer for session persistence specifically designed for the Claude ecosystem.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://docs.claude.com/en/docs/agent-sdk/overview">Agent SDK overview - Claude Code Docs</a></li>
<li><a href="https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents">Effective context engineering for AI agents \ Anthropic</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight the plugin’s ability to reduce repetitive prompting as a major productivity booster for complex refactoring tasks. Some users note that while compression is effective, fine-tuning the summary density may be necessary for highly specialized codebases.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#claude-code</code>, <code class="language-plaintext highlighter-rouge">#ai-agent</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#context-management</code>, <code class="language-plaintext highlighter-rouge">#llm</code></p>

<hr />

<p><a id="item-34"></a></p>
<h2 id="multica-open-source-platform-for-managing-ai-coding-agents-️-8010"><a href="https://github.com/multica-ai/multica">Multica: Open-Source Platform for Managing AI Coding Agents</a> ⭐️ 8.0/10</h2>

<p>Multica introduces an open-source managed agents platform that treats coding agents as teammates by enabling task assignment, progress tracking, and skill compounding. It supports autonomous execution with real-time monitoring and integrates with tools like Claude Code and Codex. This project addresses the critical need for orchestrating multiple AI agents in software development, moving beyond simple prompt engineering to structured team workflows. By allowing agents to compound skills over time, it promises increased efficiency and reduced repetitive setup for engineering teams. The open-source and self-hosted nature offers vendor neutrality, which is crucial for enterprises concerned with data sovereignty and cost control. Key features include treating agents as teammates with profiles and board visibility, autonomous task lifecycle management, and a unified dashboard for local and cloud runtimes. The platform enables reusable skill deployment where solutions from past tasks enhance future agent capabilities across the workspace.</p>

<p>rss · GitHub Trending - Daily · Apr 14, 01:33</p>

<p><strong>Background</strong>: As AI coding assistants evolve from single-turn chatbots to autonomous agents, developers face challenges in managing long-horizon tasks and coordinating multiple agents effectively. Existing solutions often lack robust orchestration layers or lock users into proprietary cloud ecosystems. Multica fills this niche by providing a vendor-neutral infrastructure that mimics human team dynamics, allowing for scalable agent management without relying on specific provider implementations.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://platform.claude.com/docs/en/managed-agents/overview">Claude Managed Agents overview - Claude API Docs</a></li>
<li><a href="https://www.anthropic.com/engineering/managed-agents">Scaling Managed Agents: Decoupling the brain from the hands</a></li>
<li><a href="https://agentskillpacks.diguardia.org/blog/self-improving-ai-agents-how-skill-packs-compound-with-every-build/">Self-Improving AI Agents: How Skill Packs Compound With Every ...</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: While the project shows strong potential for streamlining agent workflows, early adopters should verify its production maturity and stability beyond the current README documentation. Community feedback will be essential to determine how well the skill compounding mechanism performs in complex, real-world engineering environments.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#orchestration</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-35"></a></p>
<h2 id="archon-deterministic-workflow-engine-for-ai-coding-️-8010"><a href="https://github.com/coleam00/Archon">Archon: Deterministic Workflow Engine for AI Coding</a> ⭐️ 8.0/10</h2>

<p>Archon has launched as the first open-source harness builder designed to make AI coding processes deterministic and repeatable. It allows developers to define complex development workflows using YAML, combining AI agents with deterministic scripts and human approval gates. This tool transforms unpredictable AI interactions into structured, reliable software engineering pipelines. Current AI coding agents often produce inconsistent results, skipping steps like planning or testing based on the model’s stochastic nature. Archon addresses this critical pain point by enforcing a strict workflow structure where the process is owned by the developer, not the model. By isolating runs in separate git worktrees and mixing AI nodes with bash scripts, it ensures that every code generation task follows a verified, repeatable path. This shift is essential for teams seeking to integrate AI into production environments without sacrificing reliability or auditability. Archon functions as a workflow engine where users define phases like planning, implementation, and validation in YAML files. It supports parallel execution via isolated git worktrees and enables ‘fire-and-forget’ operations that pause for human review before creating pull requests. The system is portable across CLI, Web UI, and chat platforms like Slack, ensuring consistent behavior regardless of the interface used.</p>

<p>rss · GitHub Trending - Daily · Apr 14, 01:33</p>

<p><strong>Background</strong>: Prior to Archon, AI coding tools largely relied on single-turn prompts or unstructured agent loops that yielded non-deterministic outputs. While tools like GitHub Actions standardized CI/CD, no equivalent existed for orchestrating the AI coding lifecycle itself. Archon fills this niche by applying infrastructure-as-code principles to AI agent coordination, similar to how Dockerfiles standardized environment setup. It bridges the gap between experimental AI prototyping and rigorous software development standards.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/coleam00/Archon">GitHub - coleam00/Archon: The first open-source harness ...</a></li>
<li><a href="https://aitoolly.com/ai-news/article/2026-04-14-archon-the-first-open-source-ai-coding-test-framework-generator-for-deterministic-and-repeatable-dev">Archon: First Open-Source AI Coding Test Framework Generator</a></li>
<li><a href="https://deepwiki.com/coleam00/Archon/1.1-getting-started">Getting Started | coleam00/Archon | DeepWiki</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight Archon’s ability to enforce testing gates and prevent AI from hallucinating skipped steps as a major advantage over standalone agents. The community is particularly interested in its composable nature, which allows teams to incrementally replace deterministic script nodes with AI nodes as confidence grows.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-engineering</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-36"></a></p>
<h2 id="voicebox-local-first-open-source-voice-cloning-studio-️-8010"><a href="https://github.com/jamiepine/voicebox">Voicebox: Local-First Open Source Voice Cloning Studio</a> ⭐️ 8.0/10</h2>

<p>Voicebox introduces a desktop application that integrates five distinct TTS engines, including Qwen3-TTS and Chatterbox Turbo, for local voice cloning and synthesis. It features a multi-track timeline editor for composing complex narratives and applies real-time post-processing effects like pitch shifting and reverb entirely on the user’s machine. This tool addresses critical privacy concerns by ensuring all voice data and model inference remain strictly local, eliminating the need for cloud APIs like ElevenLabs. By supporting diverse hardware accelerations such as Apple Silicon MLX, CUDA, and ROCm, it makes high-quality voice synthesis accessible without recurring costs or latency. The inclusion of expressive paralinguistic tags allows developers to generate more natural-sounding speech for interactive applications. Built with Tauri and Rust, Voicebox offers native performance across macOS, Windows, and Linux while exposing a REST API for seamless integration into other projects. It supports 23 languages and handles unlimited text length through automatic chunking and crossfading techniques.</p>

<p>rss · GitHub Trending - Daily · Apr 14, 01:33</p>

<p><strong>Background</strong>: Prior solutions for voice cloning often relied on expensive cloud services or required complex command-line setups that were difficult for non-researchers to deploy. Voicebox fills the niche of a user-friendly, integrated studio that combines multiple state-of-the-art open-source models into a single graphical interface. Unlike fragmented tools that handle only generation or only editing, it provides an end-to-end workflow for creating voice-powered content locally.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://voicebox.sh/">Voicebox - Open Source Voice Cloning Desktop App</a></li>
<li><a href="https://localai.computer/guides/run-voice-clone-locally">How to Clone Voices Locally | AI Voice Cloning Guide 2025</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight the significance of running powerful models like Chatterbox Turbo locally without sacrificing quality or expressiveness. Developers appreciate the Rust-based architecture for its low resource overhead compared to Electron alternatives.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#voice-synthesis</code>, <code class="language-plaintext highlighter-rouge">#text-to-speech</code>, <code class="language-plaintext highlighter-rouge">#audio-ai</code>, <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code></p>

<hr />

<p><a id="item-37"></a></p>
<h2 id="blendermcp-enables-llm-driven-3d-modeling-via-mcp-️-8010"><a href="https://github.com/ahujasid/blender-mcp">BlenderMCP Enables LLM-Driven 3D Modeling via MCP</a> ⭐️ 8.0/10</h2>

<p>The latest version (1.5.5) introduces support for Tencent’s Hunyuan3D and Hyper3D Rodin for generative 3D asset creation. It also adds capabilities to search Sketchfab, access Poly Haven assets, and view viewport screenshots for better scene context. Users can now run the MCP server on a remote host, expanding deployment flexibility beyond local machines. This project bridges the gap between natural language prompts and complex 3D software workflows by leveraging the standardized Model Context Protocol. It allows AI agents to directly manipulate Blender objects, materials, and scenes without requiring users to write Python scripts manually. By integrating generative models like Hunyuan3D, it transforms Blender from a manual tool into an AI-assisted co-pilot for rapid prototyping. This significantly lowers the barrier to entry for programmatic 3D content creation. The system comprises a Blender addon acting as a socket server and a separate Python MCP server that facilitates two-way communication with Claude. Key features include arbitrary Python code execution within Blender, detailed scene inspection, and direct material control. Installation requires Blender 3.0+, Python 3.10+, and the ‘uv’ package manager to handle dependencies efficiently.</p>

<p>rss · GitHub Trending - Daily · Apr 14, 01:33</p>
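
<p><strong>Sketch</strong>: Since the addon’s headline capability is letting an agent execute arbitrary Python inside Blender, below is the kind of <code class="language-plaintext highlighter-rouge">bpy</code> snippet such an agent might send. The scene content is invented for illustration; the <code class="language-plaintext highlighter-rouge">bpy</code> calls themselves are standard Blender API.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># The kind of snippet an agent could ask BlenderMCP to execute through its
# arbitrary-Python tool. It must run inside Blender's embedded interpreter,
# where the bpy module is available; the scene content itself is made up.
import bpy

# Add a UV sphere and give it a simple red material.
bpy.ops.mesh.primitive_uv_sphere_add(radius=1.0, location=(0.0, 0.0, 1.0))
obj = bpy.context.active_object

mat = bpy.data.materials.new(name="AgentRed")
mat.diffuse_color = (1.0, 0.0, 0.0, 1.0)  # RGBA
obj.data.materials.append(mat)

print(f"created {obj.name} with material {mat.name}")
</code></pre></div></div>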

<p><strong>Background</strong>: Prior to MCP, connecting LLMs to desktop applications like Blender often required custom, fragile integrations or manual script copying. The Model Context Protocol provides a universal standard for AI tools to interact with external systems securely and consistently. BlenderMCP fills the niche of enabling agentic workflows specifically for 3D artists and developers who want to automate scene assembly. It represents a shift from static AI chatbots to active AI agents capable of executing complex software tasks.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://modelcontextprotocol.io/docs/getting-started/intro">What is the Model Context Protocol (MCP)?</a></li>
<li><a href="https://github.com/Tencent-Hunyuan/Hunyuan3D-2">GitHub - Tencent-Hunyuan/Hunyuan3D-2: High-Resolution 3D ...</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Users are actively discussing the potential for combining viewport screenshots with LLM vision capabilities to improve spatial understanding in generated scenes. The community is also exploring how remote hosting can enable cloud-based rendering farms controlled entirely by natural language.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#mcp</code>, <code class="language-plaintext highlighter-rouge">#blender</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#3d-modeling</code>, <code class="language-plaintext highlighter-rouge">#llm-integration</code></p>

<hr />

<p><a id="item-38"></a></p>
<h2 id="real-time-one-shot-face-swapping-for-live-video-️-8010"><a href="https://github.com/hacksider/Deep-Live-Cam">Real-Time One-Shot Face Swapping for Live Video</a> ⭐️ 8.0/10</h2>

<p>Deep-Live-Cam introduces a streamlined workflow for real-time face swapping using only a single reference image, eliminating the need for extensive model training. The latest update includes pre-built binaries for Windows, Mac Silicon, and CPU-only systems to simplify deployment for non-technical users. New features like Mouth Mask retention and multi-subject face mapping enhance the realism and versatility of live deepfake generation. This project bridges the gap between high-fidelity offline deepfake tools and the need for instantaneous visual manipulation in live streaming and interactive media. By optimizing one-shot algorithms for real-time inference, it enables content creators and developers to prototype generative media applications without heavy computational overhead. However, its ease of use significantly lowers the barrier for potential misuse, necessitating strict ethical adherence and legal compliance by users. The software supports live camera feeds and video files, allowing users to swap faces with just three clicks: select source, choose camera, and start. It incorporates built-in safety checks to block inappropriate content such as nudity or graphic violence, alongside disclaimers regarding user responsibility. Advanced capabilities include retaining the original mouth movements via masking and mapping different faces to multiple subjects simultaneously within a single frame.</p>

<p>rss · GitHub Trending - Daily · Apr 14, 01:33</p>

<p><strong>Background</strong>: Traditional face-swapping solutions like DeepFaceLab often require hours of training on specific datasets to achieve high fidelity, making them unsuitable for live applications. Recent research into one-shot learning and lightweight frameworks like FastSwap has aimed to reduce these computational costs, but user-friendly implementations remain scarce. Deep-Live-Cam addresses this niche by packaging these advanced computer vision techniques into an accessible, real-time tool that runs on consumer hardware.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/ai-forever/ghost">GitHub - ai-forever/ghost: A new one shot face swap approach ...</a></li>
<li><a href="https://www.live-sync.io/">Livesync - Live Face Swap | Real-time Face Swap tool for live ...</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: While the project provides robust disclaimers and content filters, the open-source nature of the tool has sparked ongoing debates regarding the potential for non-consensual deepfake creation and identity fraud. Users are actively discussing the trade-offs between the convenience of pre-built binaries and the transparency of manual installation from source code.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#deepfake</code>, <code class="language-plaintext highlighter-rouge">#computer-vision</code>, <code class="language-plaintext highlighter-rouge">#generative-ai</code>, <code class="language-plaintext highlighter-rouge">#real-time</code>, <code class="language-plaintext highlighter-rouge">#face-swap</code></p>

<hr />

<p><a id="item-39"></a></p>
<h2 id="yt-dlp-essential-media-downloader-for-ai-data-pipelines-️-8010"><a href="https://github.com/yt-dlp/yt-dlp">yt-dlp: Essential Media Downloader for AI Data Pipelines</a> ⭐️ 8.0/10</h2>

<p>yt-dlp remains the most actively maintained fork of youtube-dl, offering superior speed through multi-threading and support for thousands of video platforms. It has replaced the original tool in major Linux distributions like Ubuntu 22.04 due to its robust feature set and frequent updates. The project continues to evolve with advanced format selection and subtitle embedding capabilities crucial for modern data extraction. For AI engineers, yt-dlp is a critical utility for constructing datasets to train multimodal models that process video, audio, and text simultaneously. Its ability to bypass geo-restrictions and extract metadata ensures high-quality, diverse data collection for machine learning pipelines. Unlike general scrapers, it handles complex site-specific logic reliably, reducing engineering overhead in data ingestion workflows. While not an AI framework itself, it serves as the foundational layer for acquiring the raw media necessary for deep learning research. The tool supports over 1,000 sites including YouTube, Vimeo, and various news outlets, with options for custom format filtering and archive management. It features built-in cookie handling, proxy support, and automatic subtitle downloading to enrich training data context. Installation is straightforward via PyPI or standalone executables, making it easy to integrate into automated Python scripts.</p>

<p>rss · GitHub Trending - Python · Apr 14, 01:39</p>
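
<p><strong>Sketch</strong>: Because yt-dlp ships a stable Python API, wiring it into an ingestion script takes only a few lines. The options shown are standard yt-dlp options; the URL and output template are placeholders.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Minimal sketch: use yt-dlp as a library inside a data-ingestion script
# (pip install yt-dlp). Option names are standard yt-dlp options; the URL
# and output template are placeholders.
from yt_dlp import YoutubeDL

url = "https://www.youtube.com/watch?v=REPLACE_ME"  # placeholder video URL

opts = {
    "format": "bestaudio/best",        # prefer audio-only streams
    "outtmpl": "data/%(id)s.%(ext)s",  # output file naming template
    "writesubtitles": True,            # grab subtitles when available
    "subtitleslangs": ["en"],
    "ignoreerrors": True,              # keep going if one item fails
}

with YoutubeDL(opts) as ydl:
    info = ydl.extract_info(url, download=True)
    if info is not None:
        print(info.get("id"), info.get("title"), info.get("duration"))
</code></pre></div></div>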

<p><strong>Background</strong>: yt-dlp was created in 2021 as a community-driven fork of youtube-dl after the original project’s development stagnated and faced legal challenges. It builds upon the inactive youtube-dlc branch to provide faster downloads, better extractor maintenance, and enhanced argument parsing. The tool fills the niche of a production-grade, open-source media downloader that can withstand the constant changes in web platform structures. It has become the de facto standard for command-line media extraction in both consumer and enterprise environments.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Yt-dlp">Yt-dlp</a></li>
<li><a href="https://grokipedia.com/page/yt-dlp">yt-dlp</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The community actively maintains the project with daily commits to fix broken extractors as websites update their layouts. Discussions often focus on optimizing download speeds, handling new DRM schemes, and integrating with downstream data processing tools.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#python</code>, <code class="language-plaintext highlighter-rouge">#data-scraping</code>, <code class="language-plaintext highlighter-rouge">#multimodal-ai</code>, <code class="language-plaintext highlighter-rouge">#cli-tool</code>, <code class="language-plaintext highlighter-rouge">#data-engineering</code></p>

<hr />

<p><a id="item-40"></a></p>
<h2 id="pixelle-video-fully-automated-ai-short-video-engine-️-8010"><a href="https://github.com/AIDC-AI/Pixelle-Video">Pixelle-Video: Fully Automated AI Short Video Engine</a> ⭐️ 8.0/10</h2>

<p>Pixelle-Video has released a production-ready engine that automates the entire short video creation pipeline from script writing to final rendering. Recent updates include new modules for motion transfer, digital human broadcasting, and support for high-end GPU clusters via RunningHub. The project now offers pre-compiled Windows binaries and a comprehensive Web UI for zero-code operation. This tool significantly lowers the barrier for content creation by eliminating the need for manual editing or complex workflow orchestration. Unlike fragmented AI tools that handle only text or images, Pixelle-Video integrates multimodal generation into a single cohesive pipeline. Its modular architecture based on ComfyUI allows engineers to swap underlying models like FLUX or ChatTTS without breaking the workflow. This makes it a valuable asset for scaling content operations in marketing and social media. The engine supports diverse AI models including GPT, DeepSeek, and WAN 2.1 for dynamic video generation. It features a flexible pipeline that handles script generation, image planning, frame-by-frame processing, and video synthesis automatically. Users can customize visual styles, aspect ratios, and TTS voices while leveraging atomic capabilities for fine-grained control.</p>

<p>rss · GitHub Trending - Python · Apr 14, 01:39</p>

<p><strong>Background</strong>: Short video creation typically requires coordinating separate tools for scripting, asset generation, voiceover, and editing, which is time-consuming and technically demanding. Pixelle-Video addresses this by providing an end-to-end solution that unifies these disjointed steps into a single automated process. Built by Alibaba’s AIDC-AI team, it fills the niche for a robust, open-source alternative to proprietary SaaS video generators. Prior solutions often lacked local deployment options or the flexibility to customize specific stages of the generation pipeline.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/AIDC-AI/Pixelle-Video">AIDC-AI/Pixelle-Video: AI 全自动短视频引擎 - GitHub</a></li>
<li><a href="https://aidc-ai.github.io/Pixelle-Video/">Pixelle-Video - aidc-ai.github.io</a></li>
<li><a href="https://github.com/AIDC-AI">AIDC-AI · GitHub</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The repository has gained traction for its practical ‘Windows integrated package’ which simplifies installation for non-technical users. Developers are actively discussing the extensibility of the ComfyUI backend to integrate newer video models as they become available.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-video</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#multimodal</code>, <code class="language-plaintext highlighter-rouge">#content-creation</code>, <code class="language-plaintext highlighter-rouge">#python</code></p>

<hr />

<p><a id="item-41"></a></p>
<h2 id="omniroute-unified-ai-gateway-with-smart-routing-and-mcp-support-️-8010"><a href="https://github.com/diegosouzapw/OmniRoute">OmniRoute: Unified AI Gateway with Smart Routing and MCP Support</a> ⭐️ 8.0/10</h2>

<p>OmniRoute introduces a TypeScript-based AI gateway that unifies access to over 100 LLM providers through a single OpenAI-compatible endpoint. It features smart routing, automatic fallbacks, caching, and a newly integrated Model Context Protocol (MCP) server with 25 tools. The project also includes an Electron desktop app and support for the A2A protocol for enhanced agent interoperability. This tool addresses the critical production need for reliability and cost optimization by preventing downtime through automatic failover to free or low-cost models. By standardizing interactions via the MCP protocol, it simplifies how AI applications connect to external data sources and tools without custom integrations. Its heavy emphasis on free models makes it particularly valuable for startups and developers prototyping cost-sensitive applications. However, enterprises requiring strict SLAs might find the focus on ‘free’ tiers less suitable for mission-critical stability. The gateway supports diverse modalities including chat completions, embeddings, image generation, and web search across 100+ providers. Key technical capabilities include semantic caching, rate limiting, load balancing, and comprehensive observability logs. The inclusion of an MCP server allows the gateway to act as a standardized bridge for AI agents to access file systems, databases, and other external resources.</p>

<p>rss · GitHub Trending - TypeScript · Apr 14, 01:41</p>
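
<p><strong>Sketch</strong>: Since the gateway speaks the OpenAI-compatible API, existing clients only need their base URL changed. The address, key, and model name below are placeholders rather than OmniRoute defaults.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Minimal sketch: point the standard OpenAI Python SDK at a locally running
# OpenAI-compatible gateway. The base_url, API key, and model name are
# placeholders for illustration, not OmniRoute defaults.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3000/v1",  # assumed local gateway address
    api_key="gateway-key",                # the gateway holds the real provider keys
)

resp = client.chat.completions.create(
    model="some-routed-model",            # placeholder model identifier
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(resp.choices[0].message.content)
</code></pre></div></div>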

<p><strong>Background</strong>: AI engineers often struggle with managing multiple API keys, handling provider-specific rate limits, and ensuring uptime when relying on single vendors. Prior solutions like LiteLLM offer similar routing but OmniRoute differentiates itself with a strong focus on free model aggregation and built-in MCP server capabilities. This project fills the niche for a lightweight, developer-friendly gateway that prioritizes cost-efficiency and seamless tool integration for agentic workflows.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/diegosouzapw/OmniRoute">GitHub - diegosouzapw/OmniRoute: OmniRoute is an AI gateway ...</a></li>
<li><a href="https://omniroute.online/">OmniRoute — Free AI Gateway for Multi-Provider LLMs</a></li>
<li><a href="https://en.wikipedia.org/wiki/Model_Context_Protocol">Model Context Protocol - Wikipedia</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight the utility of the automatic fallback mechanism for maintaining service continuity during provider outages. Some users note that while the free model focus is excellent for testing, production teams should carefully evaluate latency and quality consistency before full deployment.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-gateway</code>, <code class="language-plaintext highlighter-rouge">#llm-routing</code>, <code class="language-plaintext highlighter-rouge">#typescript</code>, <code class="language-plaintext highlighter-rouge">#model-serving</code>, <code class="language-plaintext highlighter-rouge">#cost-optimization</code></p>

<hr />

<p><a id="item-42"></a></p>
<h2 id="nvidia-cuopt-gpu-accelerated-solver-for-vehicle-routing-️-8010"><a href="https://github.com/NVIDIA/cuopt">NVIDIA cuOpt: GPU-Accelerated Solver for Vehicle Routing</a> ⭐️ 8.0/10</h2>

<p>NVIDIA has released cuOpt, a high-performance library specifically designed to solve large-scale decision optimization problems on GPUs. It targets complex logistical challenges like the Vehicle Routing Problem (VRP) by leveraging massive parallelism. This tool marks a shift from CPU-bound heuristics to GPU-accelerated exact and heuristic solvers for operations research. Traditional solvers often struggle with the computational intensity of real-time routing for thousands of nodes, leading to suboptimal logistics plans. cuOpt addresses this bottleneck by utilizing NVIDIA’s CUDA architecture to deliver order-of-magnitude speedups in solution time. This capability is critical for AI engineers building dynamic supply chain systems, ride-sharing platforms, and last-mile delivery networks that require instant re-optimization. By offloading combinatorial optimization to the GPU, teams can iterate faster and handle larger problem scales than previously possible. The library focuses on assignment and routing problems, offering significant performance gains over CPU-based alternatives like OR-Tools for large datasets. It integrates into existing Python workflows but requires compatible NVIDIA hardware to function. While highly specialized, it does not replace general machine learning frameworks, serving instead as a dedicated engine for operations research tasks.</p>

<p>rss · GitHub Trending - CUDA · Apr 14, 01:34</p>
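
<p><strong>Sketch</strong>: To make the underlying problem concrete, the toy below brute-forces a four-stop, single-vehicle routing instance. It is not cuOpt API code; it only illustrates the factorially growing search space that GPU solvers such as cuOpt are built to explore at scale.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Toy illustration of the problem class, not cuOpt API code: brute-force the
# best visiting order for one vehicle over four stops. The search space grows
# factorially with the number of stops, which is what GPU solvers target.
from itertools import permutations

# Symmetric travel-cost matrix; node 0 is the depot.
cost = [
    [0, 4, 7, 3],
    [4, 0, 2, 5],
    [7, 2, 0, 6],
    [3, 5, 6, 0],
]

def route_cost(order):
    path = [0, *order, 0]  # leave the depot, visit every stop, return
    return sum(cost[a][b] for a, b in zip(path, path[1:]))

best = min(permutations([1, 2, 3]), key=route_cost)
print("best order:", best, "cost:", route_cost(best))
</code></pre></div></div>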

<p><strong>Background</strong>: Decision optimization in logistics has historically relied on CPU-centric solvers that scale poorly with increasing problem complexity and data volume. As e-commerce and on-demand services grow, the need for solving Vehicle Routing Problems with tight time windows has outpaced traditional computing capabilities. cuOpt fills this niche by applying GPU acceleration techniques, previously common in deep learning, to classical operations research algorithms. This approach allows for the rapid evaluation of vast solution spaces that were previously computationally prohibitive.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://deepwiki.com/databricks-industry-solutions/routing/5.2-gpu-accelerated-pipeline">GPU-Accelerated Pipeline | databricks-industry-solutions ...</a></li>
<li><a href="https://arxiv.org/abs/2506.17357">Speeding up Local Optimization in Vehicle Routing with Tensor ...</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early discussions highlight the impressive speedup for large-scale VRP instances, though users note the barrier of requiring specific GPU hardware. Some developers are comparing its ease of integration against established CPU libraries, noting a steeper learning curve for tuning GPU-specific parameters.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#optimization</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#logistics</code>, <code class="language-plaintext highlighter-rouge">#nvidia</code>, <code class="language-plaintext highlighter-rouge">#operations-research</code></p>

<hr />

<p><a id="item-43"></a></p>
<h2 id="ralph-autonomous-ai-agent-loop-with-git-persisted-memory-️-7010"><a href="https://github.com/snarktank/ralph">Ralph: Autonomous AI Agent Loop with Git-Persisted Memory</a> ⭐️ 7.0/10</h2>

<p>Ralph introduces a novel autonomous coding pattern that iteratively executes AI tools like Amp or Claude Code until all Product Requirement Document (PRD) items are completed. Unlike continuous context agents, it resets the context for every iteration while persisting state and memory strictly through git history and structured JSON files. This approach effectively decouples task execution from context window limitations. Long-running autonomous agents often fail due to context window overflow or the accumulation of irrelevant information, known as context pollution. Ralph solves this reliability issue by enforcing a clean slate for each step, ensuring the AI focuses only on the immediate task defined in the PRD. By using git as the single source of truth for memory, it creates a robust, auditable trail of development that prevents hallucination drift over long sessions. This makes complex, multi-step feature implementation significantly more stable for engineering teams. The system requires a git repository and supports AI coding tools such as Amp CLI or Anthropic’s Claude Code. It utilizes specific skills to convert markdown PRDs into a structured <code class="language-plaintext highlighter-rouge">prd.json</code> format that drives the autonomous loop. Users can configure automatic handoffs to handle large stories that exceed a single context window, ensuring seamless continuity across iterations.</p>

<p>rss · GitHub Trending - Daily · Apr 14, 01:33</p>
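
<p><strong>Sketch</strong>: A minimal version of the loop described above, assuming a hypothetical <code class="language-plaintext highlighter-rouge">prd.json</code> schema and an abstract agent command; Ralph’s real schema and invocations differ, but the fresh-context-per-iteration and git-as-memory pattern is the point.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Sketch of the fresh-context-per-iteration loop described above. The
# prd.json field names ("items", "title", "done") and the agent command are
# illustrative assumptions, not Ralph's actual schema or CLI flags.
import json
import subprocess
from pathlib import Path

PRD = Path("prd.json")
AGENT_CMD = ["your-coding-agent"]  # placeholder for the Amp / Claude Code invocation

def next_open_item(prd):
    return next((item for item in prd["items"] if not item.get("done")), None)

while True:
    prd = json.loads(PRD.read_text())
    item = next_open_item(prd)
    if item is None:
        break  # every PRD item is complete
    # Each iteration starts the agent with a clean context: just one task.
    prompt = f"Implement exactly this PRD item, then stop: {item['title']}"
    subprocess.run(AGENT_CMD + [prompt], check=True)
    item["done"] = True  # in practice, mark done only after tests pass
    PRD.write_text(json.dumps(prd, indent=2))
    # Git history is the persistent memory between otherwise stateless runs.
    subprocess.run(["git", "add", "-A"], check=True)
    subprocess.run(["git", "commit", "-m", f"ralph: {item['title']}"], check=True)
</code></pre></div></div>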

<p><strong>Background</strong>: Traditional LLM orchestration frameworks often struggle to maintain coherence over long-horizon tasks because they rely on appending history to a growing context window. As the session lengthens, performance degrades due to token limits and the dilution of relevant instructions. Ralph addresses this by adopting a stateless execution model where the environment state is managed externally via version control rather than internal memory buffers. This shifts the paradigm from conversational continuity to transactional task completion.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://www.ibm.com/think/topics/llm-orchestration">What is LLM orchestration? - IBM</a></li>
<li><a href="https://www.geeksforgeeks.org/artificial-intelligence/what-is-llm-orchestration/">What is llm orchestration? - GeeksforGeeks</a></li>
<li><a href="https://aimultiple.com/llm-orchestration">LLM Orchestration in 2026: Top 22 frameworks and gateways</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight the effectiveness of the ‘clean context per iteration’ pattern in reducing agent hallucinations during complex refactoring tasks. The integration with standard git workflows is praised for making the agent’s actions transparent and easily reversible.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#autonomous-coding</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#llm-orchestration</code>, <code class="language-plaintext highlighter-rouge">#automation</code></p>

<hr />

<p><a id="item-44"></a></p>
<h2 id="gsd-meta-prompting-system-to-prevent-ai-context-rot-️-7010"><a href="https://github.com/gsd-build/get-shit-done">GSD: Meta-Prompting System to Prevent AI Context Rot</a> ⭐️ 7.0/10</h2>

<p>The ‘get-shit-done’ (GSD) project introduces a lightweight, spec-driven meta-prompting system designed specifically for CLI-based AI coding assistants like Claude Code and Cursor. It actively manages context engineering to prevent ‘context rot,’ a phenomenon where model performance degrades as the conversation history fills the context window. As AI coding agents handle increasingly complex tasks, maintaining high-quality context becomes critical to avoiding hallucinations and logical errors in long sessions. GSD addresses this by enforcing a structured, spec-driven workflow that keeps the AI focused on immediate objectives rather than getting lost in accumulated noise. This approach is particularly valuable for engineers relying on autonomous agents for multi-step refactoring or feature development without constant manual intervention. The tool functions as a meta-prompting layer that intercepts and optimizes interactions between the user and various LLM-powered coding tools. It supports a wide ecosystem including Claude Code, Gemini CLI, Copilot, and Cursor, operating seamlessly across Mac, Windows, and Linux. By utilizing a strict specification format, it ensures that the AI agent consistently adheres to the defined project goals throughout the session.</p>

<p>rss · GitHub Trending - Daily · Apr 14, 01:33</p>

<p><strong>Background</strong>: Context rot is a recognized limitation in large language models where the inclusion of irrelevant or excessive historical data dilutes the model’s attention mechanism, leading to poorer output quality. Traditional prompt engineering often relies on manual summarization or window sliding, which can result in the loss of critical constraints or instructions. GSD fills this niche by automating context management through a reusable, step-by-step framework that dynamically prioritizes relevant specifications over raw chat history.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://grokipedia.com/page/Context_Rot">Context Rot</a></li>
<li><a href="https://www.ibm.com/think/topics/meta-prompting">What is meta prompting? - IBM</a></li>
<li><a href="https://grokipedia.com/page/250713334">250713334</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters from major tech companies have praised the tool for producing superior results compared to other spec-driven frameworks like SpecKit or Taskmaster. Users highlight its lack of over-engineering and its ability to reliably execute complex build tasks when clear specifications are provided.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#prompt-engineering</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#context-management</code></p>

<hr />

<p><a id="item-45"></a></p>
<h2 id="playwright-cli-optimized-for-token-efficient-ai-agents-️-7010"><a href="https://github.com/microsoft/playwright-cli">Playwright CLI Optimized for Token-Efficient AI Agents</a> ⭐️ 7.0/10</h2>

<p>Microsoft has released a specialized Playwright CLI designed to function as SKILLS for coding agents like Claude Code and GitHub Copilot. This tool replaces verbose Model Context Protocol (MCP) schemas with concise command-line invocations to significantly reduce token consumption during browser automation tasks. This release addresses the critical constraint of limited context windows in high-throughput AI coding agents by minimizing the overhead of tool definitions. By avoiding the loading of large accessibility trees and complex schemas into the LLM context, it allows agents to balance browser automation with code reasoning more effectively. It represents a strategic shift towards CLI-based workflows for scenarios where token efficiency outweighs the need for persistent state introspection. The tool supports session management via memory or disk persistence and allows users to install specific skills for enhanced agent capabilities. It operates headless by default but supports headed mode for debugging, and integrates directly with existing Node.js environments. Unlike MCP, which suits long-running autonomous loops, this CLI is optimized for rapid, discrete automation commands.</p>

<p>rss · GitHub Trending - TypeScript · Apr 14, 01:41</p>

<p><strong>Background</strong>: As AI coding agents become more prevalent, the cost of interacting with external tools via large language models has become a bottleneck, particularly regarding token usage. Traditional approaches like the Model Context Protocol (MCP) provide rich introspection but often consume excessive context window space with verbose schemas. This project fills the niche for a lightweight, command-driven interface that leverages the established Playwright ecosystem without the heavy overhead of full state serialization.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://testdino.com/blog/playwright-skill/">Playwright Skill: Train Your AI Agent to Write Better Tests</a></li>
<li><a href="https://github.com/testdino-hq/playwright-skill">GitHub - testdino-hq/playwright-skill: TestDino Playwright ...</a></li>
<li><a href="https://tech-insider.org/playwright-tutorial-end-to-end-testing-2026/">How to Master Playwright Testing: 13-Step Tutorial [2026]</a></li>
<li><a href="https://learn.microsoft.com/en-us/azure/developer/ai/intro-agents-mcp">Build Agents using Model Context Protocol on Azure</a></li>
<li><a href="https://medium.com/ai-insights-cobet/model-context-protocol-mcp-in-agentic-ai-architecture-and-industrial-applications-7e18c67e2aa7">Model Context Protocol (MCP) in Agentic AI: Architecture and ...</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adoption focuses on integrating these skills into CI/CD pipelines where agents generate and execute tests rapidly without maintaining long-term browser state. Developers are comparing this approach against MCP to determine the optimal balance between token savings and the depth of environmental awareness required for complex debugging.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#playwright</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#cli</code>, <code class="language-plaintext highlighter-rouge">#browser-automation</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code></p>

<hr />

<p><a id="item-46"></a></p>
<h2 id="gpumd-high-performance-molecular-dynamics-on-cuda-gpus-️-7010-1"><a href="https://github.com/brucefan1983/GPUMD">GPUMD: High-Performance Molecular Dynamics on CUDA GPUs</a> ⭐️ 7.0/10</h2>

<p>GPUMD is a specialized molecular dynamics package optimized to run entirely on graphics processing units using NVIDIA’s CUDA architecture. It addresses the computational bottleneck of simulating large atomic systems by leveraging massive GPU parallelism for force calculations and integration steps. This tool enables researchers to perform long-timescale simulations that are often prohibitive on traditional CPU-based clusters. For AI engineers working in scientific discovery or materials informatics, GPUMD provides a critical data generation engine for creating high-fidelity training datasets. By accelerating the simulation of physical interactions, it allows for the rapid prototyping of machine learning potentials that require vast amounts of quantum-mechanical or classical trajectory data. Its efficiency bridges the gap between raw computational physics and the data-hungry requirements of modern deep learning models in science. The package supports various interatomic potentials and integrates tightly with the CUDA ecosystem to maximize throughput on consumer and enterprise-grade GPUs. It is particularly noted for its implementation of the neuroevolution potential (NEP) and other machine-learning-ready force fields. Users can expect significant speedups compared to CPU-based runs of general-purpose codes like LAMMPS when running compatible workloads on supported hardware.</p>

<p>rss · GitHub Trending - CUDA · Apr 14, 01:34</p>

<p><strong>Background</strong>: Molecular dynamics simulations traditionally rely on CPU clusters, which can be slow and expensive for the large system sizes required in modern materials science. While general-purpose HPC tools exist, they often lack the specific optimizations needed to fully exploit the thousands of cores available in modern GPUs. GPUMD fills this niche by offering a dedicated, lightweight engine designed from the ground up for GPU acceleration, bypassing the overhead of more generalized frameworks.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Molecular_dynamics_simulation">Molecular dynamics simulation</a></li>
<li><a href="https://en.wikipedia.org/wiki/CUDA">CUDA - Wikipedia</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project has gained traction in the computational physics community for its balance of performance and ease of use for specific potentials. Developers and researchers frequently discuss its application in training neural network potentials and its superior scaling on single-node multi-GPU setups.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#molecular-dynamics</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#hpc</code>, <code class="language-plaintext highlighter-rouge">#computational-physics</code>, <code class="language-plaintext highlighter-rouge">#gpu</code></p>

<hr />]]></content><author><name></name></author><summary type="html"><![CDATA[From 122 items, 46 important content pieces were selected]]></summary></entry><entry xml:lang="zh"><title type="html">Horizon Summary: 2026-04-15 (ZH)</title><link href="https://ming-321.github.io/horizon/2026/04/14/summary-zh.html" rel="alternate" type="text/html" title="Horizon Summary: 2026-04-15 (ZH)" /><published>2026-04-14T16:00:00+00:00</published><updated>2026-04-14T16:00:00+00:00</updated><id>https://ming-321.github.io/horizon/2026/04/14/summary-zh</id><content type="html" xml:base="https://ming-321.github.io/horizon/2026/04/14/summary-zh.html"><![CDATA[<blockquote>
  <p>From 122 items, 46 important content pieces were selected</p>
</blockquote>

<hr />

<h3 id="头条速递">头条速递</h3>
<ol>
  <li><a href="#item-1">OpenAI 推出 GPT-5.4-Cyber 并扩展可信访问计划</a> ⭐️ 9.0/10</li>
  <li><a href="#item-2">英国 Mythos AI 首个完成多步网络渗透挑战</a> ⭐️ 9.0/10</li>
  <li><a href="#item-3">ClawBench 揭示 AI 代理在真实网络任务中表现挣扎</a> ⭐️ 9.0/10</li>
  <li><a href="#item-4">Anthropic 推出 Claude Code Routines 以实现自动化开发工作流</a> ⭐️ 8.0/10</li>
  <li><a href="#item-5">作者尝试退出 Flock Safety 监控网络并质疑其数据所有权主张</a> ⭐️ 8.0/10</li>
  <li><a href="#item-6">AI 网络安全演变为经济层面的工作量证明军备竞赛</a> ⭐️ 8.0/10</li>
  <li><a href="#item-7">HALO-Loss 使神经网络能够对不确定的预测选择弃权</a> ⭐️ 8.0/10</li>
  <li><a href="#item-8">独立开发者将纯脉冲神经网络扩展至 10.88 亿参数</a> ⭐️ 8.0/10</li>
  <li><a href="#item-9">研究者发布含引用图谱的 2000 万 + 印度法律文档数据集</a> ⭐️ 8.0/10</li>
  <li><a href="#item-10">主流媒体因担忧 AI 训练屏蔽互联网档案馆</a> ⭐️ 8.0/10</li>
  <li><a href="#item-11">ShinyHunters 借 Anodot 入侵 Snowflake 后向 Rockstar 勒索赎金</a> ⭐️ 8.0/10</li>
  <li><a href="#item-12">中国五部门联合印发人工智能加教育行动计划</a> ⭐️ 7.0/10</li>
  <li><a href="#item-13">千问 Agent 实现通过对话直接生成和编辑 Excel</a> ⭐️ 7.0/10</li>
  <li><a href="#item-14">Nervecode：利用层级“惊讶”信号提升分布外检测</a> ⭐️ 7.0/10</li>
  <li><a href="#item-15">MiniMax 因禁止开源模型 2.7 商用引发争议</a> ⭐️ 7.0/10</li>
</ol>

<h3 id="关注动态">关注动态</h3>
<ol>
  <li><a href="#item-16">MemSearch Updates: 6 updates — bump memsearch 0.3.0 and claude-code plugin 0.3.5 (#348), add Jina and Mistral embedding providers (#346), expand feature matrix with embedding providers and optional rer…</a> ⭐️ ?/10</li>
  <li><a href="#item-17">chore(README): update the preview pic</a> ⭐️ ?/10</li>
  <li><a href="#item-18">Superpowers Updates: 10 updates — Merge pull request #1165 from obra/mirror-codex-plugin-tooling, anchor EXCLUDES patterns to source root, exclude assets/, add –bootstrap flag</a> ⭐️ ?/10</li>
  <li><a href="#item-19">openai/codex: 2 releases — rust-v0.121.0-alpha.9, rust-v0.121.0-alpha.8</a> ⭐️ ?/10</li>
  <li><a href="#item-20">anthropics/claude-code: 2 releases — v2.1.108, v2.1.107</a> ⭐️ ?/10</li>
  <li><a href="#item-21">upstash/context7 released ctx7@0.3.13</a> ⭐️ ?/10</li>
</ol>

<h3 id="github-热榜">GitHub 热榜</h3>
<ol>
  <li><a href="#item-22">Karpathy 的 llm.c：用于教育的纯 C/CUDA LLM 训练实现</a> ⭐️ 10.0/10</li>
  <li><a href="#item-23">Instant-NGP：通过 CUDA 实现闪电般快速的神经图形</a> ⭐️ 10.0/10</li>
  <li><a href="#item-24">SageAttention：Transformer 的量化加速方案</a> ⭐️ 10.0/10</li>
  <li><a href="#item-25">VoxCPM2：无分词器的多语言语音合成与克隆模型</a> ⭐️ 9.0/10</li>
  <li><a href="#item-26">Axolotl 简化生产级大语言模型微调流程</a> ⭐️ 9.0/10</li>
  <li><a href="#item-27">微软 Agent Lightning 简化 AI 智能体训练流程</a> ⭐️ 9.0/10</li>
  <li><a href="#item-28">Flowise：基于 LangChain 的可视化低代码 AI 智能体构建器</a> ⭐️ 9.0/10</li>
  <li><a href="#item-29">DeepEP：面向 MoE 训练的高效通信库</a> ⭐️ 9.0/10</li>
  <li><a href="#item-30">Mirage 将大语言模型编译为持久化 CUDA 巨核</a> ⭐️ 9.0/10</li>
  <li><a href="#item-31">Dao-AILab 发布优化的因果一维卷积 CUDA 内核</a> ⭐️ 9.0/10</li>
  <li><a href="#item-32">Kronos：首个面向金融 K 线的开源基础模型</a> ⭐️ 8.0/10</li>
  <li><a href="#item-33">Claude-Mem 插件实现 AI 代理会话记忆自动化</a> ⭐️ 8.0/10</li>
  <li><a href="#item-34">Multica：用于管理 AI 编码代理的开源平台</a> ⭐️ 8.0/10</li>
  <li><a href="#item-35">Archon：面向 AI 编程的确定性工作流引擎</a> ⭐️ 8.0/10</li>
  <li><a href="#item-36">Voicebox：本地优先的开源语音克隆工作室</a> ⭐️ 8.0/10</li>
  <li><a href="#item-37">BlenderMCP 通过 MCP 协议实现大语言模型驱动的 3D 建模</a> ⭐️ 8.0/10</li>
  <li><a href="#item-38">基于单张图像的实时视频换脸工具</a> ⭐️ 8.0/10</li>
  <li><a href="#item-39">yt-dlp：AI 数据流水线必备的多媒体下载工具</a> ⭐️ 8.0/10</li>
  <li><a href="#item-40">Pixelle-Video：全自动 AI 短视频生成引擎</a> ⭐️ 8.0/10</li>
  <li><a href="#item-41">OmniRoute：支持智能路由和 MCP 协议的统一 AI 网关</a> ⭐️ 8.0/10</li>
  <li><a href="#item-42">NVIDIA cuOpt：用于车辆路径规划的 GPU 加速求解器</a> ⭐️ 8.0/10</li>
  <li><a href="#item-43">Ralph：基于 Git 持久化记忆的自主 AI 代理循环</a> ⭐️ 7.0/10</li>
  <li><a href="#item-44">GSD：防止 AI 上下文退化的元提示系统</a> ⭐️ 7.0/10</li>
  <li><a href="#item-45">专为令牌高效 AI 代理优化的 Playwright CLI</a> ⭐️ 7.0/10</li>
  <li><a href="#item-46">GPUMD：基于 CUDA GPU 的高性能分子动力学模拟引擎</a> ⭐️ 7.0/10</li>
</ol>

<h2 id="头条速递-1">头条速递</h2>

<p><a id="item-1"></a></p>
<h2 id="openai-推出-gpt-54-cyber-并扩展可信访问计划-️-9010"><a href="https://simonwillison.net/2026/Apr/14/trusted-access-openai/#atom-everything">OpenAI 推出 GPT-5.4-Cyber 并扩展可信访问计划</a> ⭐️ 9.0/10</h2>

<p>OpenAI 正式发布了 GPT-5.4-Cyber，这是其旗舰模型的一个专门变体，经过微调以专门用于防御性网络安全任务。与此同时，该公司扩展了“网络安全可信访问”计划，允许用户通过 Persona 处理的政府身份证件照片进行身份验证，从而获得更便捷的工具体验。此举紧随竞争对手 Anthropic 在一周前宣布其强大的网络安全模型 Claude Mythos 之后。 此次发布标志着人工智能网络安全军备竞赛的重大升级，直接回应了 Anthropic 最近的进展并提供了专用的防御工具。通过实施基于 Persona 的身份验证，OpenAI 旨在在保持对恶意使用的安全控制的同时，使高能力安全工具的使用更加普及。这一转变表明，未来在敏感领域使用前沿人工智能模型将越来越依赖于经过验证的真实世界身份，而不仅仅是简单的账户凭证。这可能会从根本上改变安全研究人员和企业如何利用大型语言模型来保护关键基础设施。 要访问 OpenAI 全套最佳安全工具，仍需额外的 Google 表单申请流程，这与适用于一般网络许可访问的自助验证流程有所不同。身份验证组件依赖于第三方服务 Persona，该服务通过处理政府颁发的身份证件照片来确认用户真实性。虽然 GPT-5.4-Cyber 旨在为防御目的提供“网络许可”，但基础的 GPT-5.4 模型家族此前在原子网络攻击模拟挑战中曾展现出 88% 的成功率。</p>

<p>rss · Simon Willison · Apr 14, 21:23</p>

<p><strong>背景</strong>: 像 GPT-5.4 这样的大型语言模型（LLM）具有双重用途能力，意味着它们既可用于有益的防御性编码，也可用于有害的进攻性网络攻击。最近，Anthropic 通过其“Glasswing 项目”和未发布的“Claude Mythos”模型强调了这一风险，后者因其强大的漏洞利用技能而被认为过于危险，不适合公开发布。作为回应，人工智能公司正在开发“网络许可”变体，这些变体保留了有用的安全知识，同时试图拒绝与创建恶意软件或利用漏洞相关的请求。在这种环境下，像 Persona 这样的身份验证服务正成为关键基础设施，以确保只有可问责的个人才能访问这些强大的工具。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.reuters.com/technology/openai-unveils-gpt-54-cyber-week-after-rivals-announcement-ai-model-2026-04-14/">OpenAI unveils GPT-5.4-Cyber a week after rival's announcement of AI model | Reuters</a></li>
<li><a href="https://quasa.io/media/gpt-5-4-becomes-first-universal-ai-model-to-earn-high-cybersecurity-risk-status">GPT-5.4 Becomes First Universal AI Model to Earn 'High' Cybersecurity Risk Status</a></li>
<li><a href="https://www.anthropic.com/glasswing">Project Glasswing: Securing critical software for the AI era \ Anthropic</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#openai</code>, <code class="language-plaintext highlighter-rouge">#cybersecurity</code>, <code class="language-plaintext highlighter-rouge">#ai-safety</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#identity-verification</code></p>

<hr />

<p><a id="item-2"></a></p>
<h2 id="英国-mythos-ai-首个完成多步网络渗透挑战-️-9010"><a href="https://arstechnica.com/ai/2026/04/uk-govs-mythos-ai-tests-help-separate-cybersecurity-threat-from-hype/">英国 Mythos AI 首个完成多步网络渗透挑战</a> ⭐️ 9.0/10</h2>

<p>英国政府的人工智能安全研究所（AISI）确认，Anthropic 公司的 Mythos AI 是首个成功完成复杂的 32 步网络渗透模拟的系统。该模型在十次尝试中成功了三次，标志着自主网络攻击能力的重要里程碑。此次评估为该模型超越以往内部报告的高级性能提供了独立的公开验证。 这一成就表明，人工智能系统已经跨越了一个关键门槛，能够在无需人工干预的情况下自主执行复杂的多步黑客策略。这迫使监管机构和金融机构紧急重新评估当前的防御机制，因为理论风险与实际能力之间的差距已显著缩小。因此，这一发展加速了对新型人工智能特定安全基准以及更严格的大模型治理框架的需求。Mythos 的成功暗示，未来的网络安全威胁演变速度可能超过传统防御更新的应对能力。 AISI 使用的具体基准包含一个旨在测试深度渗透技能的 32 步模拟，Mythos 在十次试验中以 30% 的成功率完成了该挑战。鉴于这些已证实的风险，Anthropic 认为该模型过于危险而不宜向公众发布，从而引发了与华尔街和政府官员的紧急讨论。监管机构计划在未来几周内向英国银行高管提出这些具体的风险概况，以便为潜在的现实应用做好准备。</p>

<p>rss · Ars Technica · Apr 14, 19:11</p>

<p><strong>背景</strong>: 渗透测试（pentesting）传统上涉及安全专家模拟网络攻击，以便在恶意行为者利用之前识别漏洞。最近，研究人员一直在开发人工智能代理来自动化部分流程，但大多数现有工具难以处理需要多个依赖步骤的长周期任务。英国政府专门成立了人工智能安全研究所（AISI），以评估 Mythos 等前沿人工智能模型的安全和风险。这一新结果与之前的基准测试不同，它证明了人工智能可以在漫长的多阶段攻击序列中保持上下文和策略。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://arstechnica.com/ai/2026/04/uk-govs-mythos-ai-tests-help-separate-cybersecurity-threat-from-hype/">UK gov's Mythos AI tests help separate cybersecurity ... - Ars Technica</a></li>
<li><a href="https://www.theguardian.com/business/2026/apr/13/goldman-sachs-chief-hyper-aware-risks-anthropics-mythos-ai-david-solomon">Goldman Sachs chief ‘hyper-aware’ of risks from Anthropic’s Mythos AI</a></li>
<li><a href="https://www.euronews.com/next/2026/04/14/why-anthropics-new-mythos-ai-model-has-washington-and-wall-street-worked-up">Why Anthropic's new Mythos AI model has Washington... | Euronews</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-security</code>, <code class="language-plaintext highlighter-rouge">#cybersecurity</code>, <code class="language-plaintext highlighter-rouge">#ai-benchmarks</code>, <code class="language-plaintext highlighter-rouge">#government-ai</code>, <code class="language-plaintext highlighter-rouge">#penetration-testing</code></p>

<hr />

<p><a id="item-3"></a></p>
<h2 id="clawbench-揭示-ai-代理在真实网络任务中表现挣扎-️-9010"><a href="https://old.reddit.com/r/MachineLearning/comments/1slf7pg/clawbench_can_ai_agents_complete_everyday_online/">ClawBench 揭示 AI 代理在真实网络任务中表现挣扎</a> ⭐️ 9.0/10</h2>

<p>研究人员推出了 ClawBench，这是一个在 144 个真实活跃网站上评估 AI 浏览器代理完成 153 项日常任务的新基准，而非使用合成环境。研究发现，即使是表现最好的模型 Claude Sonnet 4.6，成功率也仅为 33.3%，而智谱 AI 的纯文本模型 GLM-5 出人意料地以 24.2% 的成功率位居第二。涉及金融和学术的任务相对容易，但所有测试模型在旅行和开发任务上都表现得更加困难。 该基准测试揭示了当前 AI 能力与真实场景下完全自主代理部署所需的可靠性之间存在关键差距。较低的成功率表明，现有模型尚未准备好在没有大量人工监督或错误处理机制的情况下处理复杂的多步骤网络交互。通过在真实的生产平台而非沙盒环境中进行测试，ClawBench 对代理自动化行业的现状提供了更现实的评估。这些发现表明，尽管近期炒作不断，但自主代理在日常网络任务中的广泛采用可能仍需数年时间。 ClawBench 的独特之处在于它捕获了五层行为数据，包括会话回放、截图、HTTP 流量、代理推理轨迹和浏览器操作。为了确保在活跃网站上评估时的安全性，该框架采用了请求拦截器，能够阻止支付或预订等最终不可逆的 HTTP 请求。该数据集为每项任务都包含了人工真实标签，并利用了一个能够提供步骤级可追踪诊断的代理评估器。</p>

<p>rss · r/MachineLearning · Apr 14, 17:21</p>
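
<p><strong>示意代码</strong>: 文中提到的“请求拦截器”思路可以用 Playwright 的路由拦截来示意：对疑似支付、预订等不可逆的 POST 请求直接中止。以下仅为示意性草图（关键词列表与判断逻辑均为假设），并非 ClawBench 的实际实现。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 示意性草图：用 Playwright 的路由拦截来阻断可能不可逆的请求（支付、预订等）。
# 关键词列表与判断逻辑均为示例假设，并非 ClawBench 的实际实现。
from playwright.sync_api import sync_playwright

BLOCKED_KEYWORDS = ("checkout", "payment", "/book", "/reserve")

def guard(route):
    req = route.request
    if req.method == "POST" and any(k in req.url for k in BLOCKED_KEYWORDS):
        route.abort()       # 拦截疑似不可逆的写操作
    else:
        route.continue_()   # 其余请求正常放行

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
    page.route("**/*", guard)     # 所有请求先经过 guard 检查
    page.goto("https://example.com")
    browser.close()
</code></pre></div></div>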

<p><strong>背景</strong>: AI 浏览器代理是将大型语言模型直接集成到浏览器框架中的系统，旨在解释自然语言命令并在网页上协调操作。与仅生成文本的传统聊天机器人不同，这些代理可以点击按钮、填写表单并导航复杂的网站结构以完成特定目标。以前的评估通常依赖于静态或沙盒环境，无法捕捉实时互联网的动态复杂性和不可预测性。随着公司越来越多地寻求自动化客户服务、数据录入和个人助理任务，了解这些代理的局限性至关重要。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://claw-bench.com/">ClawBench — Real-World Browser Agent Benchmark</a></li>
<li><a href="https://glm5.net/">GLM-5 | Zhipu AI's Next-Generation Large Language Model (745B Parameters)</a></li>
<li><a href="https://layerxsecurity.com/generative-ai/ai-browser-agents/">What Are AI Browser Agents and How to Build Them - LayerX</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#benchmarking</code>, <code class="language-plaintext highlighter-rouge">#llm-evaluation</code>, <code class="language-plaintext highlighter-rouge">#autonomous-systems</code>, <code class="language-plaintext highlighter-rouge">#machine-learning-research</code></p>

<hr />

<p><a id="item-4"></a></p>
<h2 id="anthropic-推出-claude-code-routines-以实现自动化开发工作流-️-8010"><a href="https://code.claude.com/docs/en/routines">Anthropic 推出 Claude Code Routines 以实现自动化开发工作流</a> ⭐️ 8.0/10</h2>

<p>Anthropic 正式推出了 Claude Code Routines，用于实现自动化的开发者工作流。</p>

<p>hackernews · matthieu_bl · Apr 14, 16:54</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#claude</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#llm-automation</code>, <code class="language-plaintext highlighter-rouge">#anthropic</code>, <code class="language-plaintext highlighter-rouge">#ai-policy</code></p>

<hr />

<p><a id="item-5"></a></p>
<h2 id="作者尝试退出-flock-safety-监控网络并质疑其数据所有权主张-️-8010"><a href="https://honeypot.net/2026/04/14/i-wrote-to-flocks-privacy.html">作者尝试退出 Flock Safety 监控网络并质疑其数据所有权主张</a> ⭐️ 8.0/10</h2>

<p>一位作者记录了他正式要求退出 Flock Safety 监控网络的过程，收到的回复声称数据归客户所有而非被记录的个人。该公司断言，由于执法机构购买了服务，他们拥有数据使用和共享的全部决策权，从而实际上拒绝了个人的退出请求。这一交锋突显了 Flock 的运营模式与像 CCPA 这样赋予个人对其个人身份信息权利的隐私法规之间的直接冲突。 这一事件暴露了一个重大的法律漏洞，即监控公司可能通过将数据所有权转移给政府客户来规避隐私法。如果这种先例得以确立，那么在由纳税人资助的公共空间监控背景下，消费者的隐私权利可能会变得毫无意义。它挑战了像 CCPA 这类法规的核心假设，即无论谁收集数据，个人都保留对其个人数据的主权。最终结果将决定人工智能驱动的大规模监控是否能在当前数据保护框架的约束范围之外运作。 Flock Safety 的默认政策声明，除非当地法律另有规定，否则车牌识别器收集的数据会在三十天后从云端自动彻底删除。然而，该公司在此次互动中的法律立场表明，在此保留期内，他们仅作为数据所有者（警方）的保管人，从而拒绝直接的消费者退出请求。这造成了一种局面：虽然存在删除的技术能力，但公司采用的法律框架阻止了个人的干预。</p>

<p>hackernews · speckx · Apr 14, 17:47</p>

<p><strong>背景</strong>: Flock Safety 是一家知名的自动车牌识别（ALPR）和视频监控系统供应商，被美国各地的执法机构广泛使用。他们的技术捕捉车辆图像，并根据品牌、型号和颜色等特征创建“车辆指纹”，以协助刑事调查。虽然该公司推行 30 天自动删除政策以解决隐私担忧，但关于这些数据归谁所有的法律分类仍然是一个有争议的问题。像《加州消费者隐私法案》（CCPA）这样的法规通常允许居民请求删除其个人信息，但这些法律往往难以应对复杂的 B2G（企业对政府）数据流。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Flock_Safety">Flock Safety - Wikipedia</a></li>
<li><a href="https://www.flocksafety.com/legal/flock-evidence-policy">Flock Evidence Policy</a></li>
<li><a href="https://www.flocksafety.com/trust/data-privacy">Flock Safety Data Privacy &amp; Retention Policies</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 社区成员对 Flock 的合规性表示怀疑，原作者指出该公司声称客户所有权可免除隐私限制的说法似乎与 CCPA 相矛盾。其他人指出，Flock 可能将自己定位为数据保管人而非控制者以规避责任，这与 AWS 等云提供商的做法类似。评论者普遍认为，立法行动而非个人退出请求是迫使这种监控模式改变的唯一可行途径。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#privacy</code>, <code class="language-plaintext highlighter-rouge">#surveillance</code>, <code class="language-plaintext highlighter-rouge">#ai-ethics</code>, <code class="language-plaintext highlighter-rouge">#regulation</code>, <code class="language-plaintext highlighter-rouge">#data-rights</code></p>

<hr />

<p><a id="item-6"></a></p>
<h2 id="ai-网络安全演变为经济层面的工作量证明军备竞赛-️-8010"><a href="https://simonwillison.net/2026/Apr/14/cybersecurity-proof-of-work/#atom-everything">AI 网络安全演变为经济层面的工作量证明军备竞赛</a> ⭐️ 8.0/10</h2>

<p>英国人工智能安全研究所对 Anthropic 的 Claude Mythos 进行的独立评估证实，该模型发现安全漏洞的能力与计算支出直接成正比。Drew Breunig 分析这一发现后指出，网络安全已有效转变为一种“工作量证明”系统，即防御方需要比攻击者消耗更多的 Token。这种动态形成了一个残酷的经济等式：加固系统完全取决于在 Token 消耗上超过潜在的攻击者。 这一转变将网络安全从纯粹的技术挑战转化为经济军备竞赛，从根本上改变了组织规划安全预算的方式。这表明资金雄厚的实体可以通过购买更多的审计计算时间，从而获得不成比例的高标准安全性。相反，这一趋势显著提升了开源库的战略价值，因为保护它们的高昂成本可以由所有用户分摊，而非由单个实体独自承担。最终，这意味着为现有库编写廉价的“氛围代码”（vibe-coding）替代品可能会导致软件固有的安全性降低，因为缺乏共享的安全投资。 Claude Mythos 作为 2026 年 4 月发布的受限研究预览版，在英国人工智能安全研究所的评估中展现了识别隐藏软件缺陷的卓越能力。其核心机制依赖于推理扩展，即生成的 Token 数量增加与漏洞发现率直接相关。一个关键的限制是该模型并未全面开放，仅限选定合作伙伴访问，以防止其强大的进攻能力被滥用。分析强调，现在的安全有效性主要取决于用于生成 Token 的资金资源，而不仅仅是算法的优越性。</p>

<p>rss · Simon Willison · Apr 14, 19:41</p>

<p><strong>背景</strong>: 英国人工智能安全研究所（AISI）是一个独立的政府机构，旨在评估前沿 AI 模型在部署前后的风险。Claude Mythos 代表了 Anthropic 迄今为止最强大的模型，在 SWE-bench Pro 等软件工程基准测试中超越了之前的 Claude Opus 等版本。“工作量证明”概念传统上指的是区块链中需要计算努力的共识机制，但在此处描述的是一种通过购买算力来获取安全的经济模型。推理扩展是一种技术，通过在推理过程中应用更多的计算资源，模型性能可得到可预测的提升。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.gov.uk/government/publications/ai-safety-institute-approach-to-evaluations/ai-safety-institute-approach-to-evaluations">AI Safety Institute approach to evaluations - GOV.UK</a></li>
<li><a href="https://www.humai.blog/claude-mythos-is-the-most-capable-ai-model-ever-documented-anthropic-wont-let-you-use-it/">Claude Mythos Is the Most Capable AI Model Ever Documented.</a></li>
<li><a href="https://q-rz.github.io/p/saffron/">SAFFRON-1: Inference Scaling for LLM Safety Assurance</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-safety</code>, <code class="language-plaintext highlighter-rouge">#cybersecurity</code>, <code class="language-plaintext highlighter-rouge">#llm-evaluation</code>, <code class="language-plaintext highlighter-rouge">#anthropic</code>, <code class="language-plaintext highlighter-rouge">#ai-economics</code></p>

<hr />

<p><a id="item-7"></a></p>
<h2 id="halo-loss-使神经网络能够对不确定的预测选择弃权-️-8010"><a href="https://old.reddit.com/r/MachineLearning/comments/1skzuhd/i_dont_know_teaching_neural_networks_to_abstain/">HALO-Loss 使神经网络能够对不确定的预测选择弃权</a> ⭐️ 8.0/10</h2>

<p>研究人员开源了 HALO-Loss，这是一种新的训练目标，旨在替代标准的交叉熵损失（Cross-Entropy loss），使神经网络能够明确地对垃圾数据或分布外输入输出“我不知道”的响应。通过将无约束的点积转换为有界的欧几里得距离，该方法在潜在空间的原点处创建了一个专用的“弃权类”（Abstain Class），且无需额外参数。在 CIFAR-10 和 CIFAR-100 上的测试表明，HALO-Loss 在保持基准准确率的同时，显著改善了校准度，并大幅减少了针对如 SVHN 等远端分布外数据的假阳性率。 这一进展至关重要，因为当前模型在面对陌生数据时往往会以高置信度产生幻觉，这在自动驾驶或医疗诊断等安全关键应用中构成了重大风险。HALO-Loss 有效消除了传统的权衡困境，即提高分布外检测能力通常以降低基准准确率为代价。通过提供一种数学上严谨的原生方式来拒绝不确定输入，它无需复杂的集成方法或事后评分调整即可增强模型的可靠性。这可能从根本上改变鲁棒人工智能系统的设计方式，从被迫猜测转向诚实的不确定性量化。 该方法通过将逻辑值（logits）计算为样本嵌入与学习到的类原型之间的负平方欧几里得距离来工作，有效地通过惩罚大距离来限制最大置信度。实验结果显示，期望校准误差（ECE）从约 8% 降至 1.5%，而远端分布外数据在 95% 召回率下的假阳性率减少了一半以上。该方案被描述为交叉熵损失的即插即用替代品，训练过程中无需接触异常值数据，且不增加任何模型架构参数。</p>

<p>rss · r/MachineLearning · Apr 14, 05:45</p>
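
<p><strong>示意代码</strong>: 下面是按上文描述还原的一个极简 PyTorch 草图（假设性实现，并非原帖代码）：分类头以样本嵌入到各类原型的负平方欧氏距离作为 logits，并在原点固定一个不可学习的“弃权”原型，训练仍沿用标准交叉熵。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 假设性实现：按帖子描述，将分类头的 logits 换成“到各类原型的负平方欧氏距离”，
# 并在原点固定一个不可学习的“弃权”原型；原型取代了普通线性层的权重行，
# 因此相对标准分类头并不新增参数。并非原帖或原仓库代码。
import torch
import torch.nn as nn
import torch.nn.functional as F

class DistanceLogits(nn.Module):
    def __init__(self, num_classes, embed_dim):
        super().__init__()
        # 每个真实类别一个可学习原型；弃权原型固定为原点。
        self.prototypes = nn.Parameter(torch.randn(num_classes, embed_dim))

    def forward(self, z):
        # z: [batch, embed_dim] 的特征嵌入
        origin = torch.zeros(1, self.prototypes.shape[1], device=z.device)
        protos = torch.cat([self.prototypes, origin], dim=0)  # [C+1, D]
        return -torch.cdist(z, protos).pow(2)  # 距离越远，置信度越低

head = DistanceLogits(num_classes=10, embed_dim=64)
z = torch.randn(8, 64)                   # 来自任意骨干网络的特征
labels = torch.randint(0, 10, (8,))
loss = F.cross_entropy(head(z), labels)  # 与交叉熵即插即用
loss.backward()
print(float(loss))
</code></pre></div></div>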

<p><strong>背景</strong>: 标准神经网络通常使用交叉熵损失（Cross-Entropy loss），这鼓励特征无限远离原点以最小化误差，导致潜在空间中的每个输入都被迫进行自信的分类。这种几何特性意味着模型缺乏表达不确定性的自然机制，导致它们自信地将无意义数据或分布外数据分类为已知类别。机器学习中的“弃权”（abstention）概念指的是模型在检测到高不确定性时保留预测的能力，这一功能此前通常通过复杂的附加组件而非原生损失函数来实现。HALO-Loss 通过重构潜在空间的几何结构以包含一个特定的不确定性区域来解决这个问题。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://ml-cheatsheet.readthedocs.io/en/latest/loss_functions.html">Loss Functions — ML Glossary documentation</a></li>
<li><a href="https://arxiv.org/abs/2104.08236">[2104.08236] Controlled abstention neural networks for identifying...</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#machine learning</code>, <code class="language-plaintext highlighter-rouge">#loss functions</code>, <code class="language-plaintext highlighter-rouge">#uncertainty quantification</code>, <code class="language-plaintext highlighter-rouge">#model reliability</code>, <code class="language-plaintext highlighter-rouge">#deep learning</code></p>

<hr />

<p><a id="item-8"></a></p>
<h2 id="独立开发者将纯脉冲神经网络扩展至-1088-亿参数-️-8010"><a href="https://old.reddit.com/r/MachineLearning/comments/1skql34/i_scaled_a_pure_spiking_neural_network_snn_to/">独立开发者将纯脉冲神经网络扩展至 10.88 亿参数</a> ⭐️ 8.0/10</h2>

<p>一位 18 岁的独立开发者成功从零开始训练了一个拥有 10.88 亿参数的纯脉冲神经网络（SNN），但因预算耗尽不得不在 27,000 步时停止训练。尽管训练提前终止且损失值为 4.4，该模型在推理过程中仍实现了约 93% 的稀疏度，并意外地开始生成结构正确的俄语文本。此外，当架构规模超过 6 亿参数时，模型自发地将 39% 的激活路由转移到了持久记忆模块中。 这一实验挑战了普遍观点，即由于梯度消失问题，直接从头训练大规模 SNN 是不可能的，而通常的做法是转换预训练的人工神经网络（ANN）。在纯 10 亿级以上参数的 SNN 中实现收敛表明，直接训练可能成为创建利用高稀疏度的高能效语言模型的可行途径。观察到的涌现行为，如跨语言能力和自主记忆利用，表明扩展 SNN 可能会解锁密集 ANN 所不具备的独特计算特性。如果得到优化，这种方法可能会显著降低运行大型语言模型相关的硬件成本和能源消耗。 该模型保持了约 93% 的稀疏度，意味着每个令牌只有约 7% 的神经元被激活，这与密集模型相比极大地减少了推理过程中的内存使用。然而，生成的文本被描述为“不稳定”，缺乏 GPT-2 的流畅度，这主要是因为训练在损失进一步降低之前就被迫中断了。开发者在 GitHub 上发布了包含权重和优化器状态的完整 12GB 检查点，以寻求关于稳定代理梯度和将该架构映射到 Loihi 等神经形态硬件的技术反馈。</p>

<p>rss · r/MachineLearning · Apr 13, 22:42</p>

<p><strong>背景</strong>: 脉冲神经网络（SNN）是受生物启发的模型，利用离散脉冲和时间来传输信息，相比使用连续值的传统人工神经网络（ANN）具有潜在的能效优势。直接训练 SNN 非常困难，因为脉冲的二进制特性会导致梯度未定义，从而引发阻止深度网络学习的梯度消失问题。因此，目前大多数研究依赖于 ANN 到 SNN 的转换技术，即先训练标准网络然后将其转换为脉冲格式，但这往往会导致精度下降或延迟增加。直接训练方法试图利用代理梯度来解决这个问题，但在没有转换的情况下将其扩展到数十亿参数一直是一个重大障碍，直到现在。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Spiking_neural_network">Spiking neural network - Wikipedia</a></li>
<li><a href="https://arxiv.org/abs/2401.04486">Take A Shortcut Back: Mitigating the Gradient Vanishing for ... Take A Shortcut Back: Mitigating the Gradient Vanishing for ... Images Take A Shortcut Back: Mitigating the Gradient Vanishing for ... High-performance deep spiking neural networks with 0 ... - Nature Take A Shortcut Back: Mitigating the Gradient Vanishing for ... Take A Shortcut Back: Mitigating the Gradient Vanishing for Training Take A Shortcut Back: Mitigating the Gradient Vanishing for Training High-performance deep spiking neural networks with 0.3 spikes per High-performance deep spiking neural networks with 0.3 spikes per Frontiers | Adaptive and lightweight surrogate gradients ...</a></li>
<li><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10030499/">High-accuracy deep ANN-to-SNN conversion using quantization ... A universal ANN-to-SNN framework for achieving high accuracy ... Towards High-performance Spiking Transformers from ANN to SNN ... Inference-Scale Complexity in ANN-SNN Conversion for High ... Benchmarking ANN-to-SNN Conversion: Dataset-Dependent ... Frontiers | High-accuracy deep ANN-to-SNN conversion using ... A New ANN-SNN Conversion Method with High Accuracy, Low ...</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#spiking neural networks</code>, <code class="language-plaintext highlighter-rouge">#llm scaling</code>, <code class="language-plaintext highlighter-rouge">#neuromorphic computing</code>, <code class="language-plaintext highlighter-rouge">#machine learning research</code>, <code class="language-plaintext highlighter-rouge">#emergent behavior</code></p>

<hr />

<p><a id="item-9"></a></p>
<h2 id="研究者发布含引用图谱的-2000-万--印度法律文档数据集-️-8010"><a href="https://old.reddit.com/r/MachineLearning/comments/1sl9yh9/20m_indian_legal_documents_with_citation_graphs/">研究者发布含引用图谱的 2000 万 + 印度法律文档数据集</a> ⭐️ 8.0/10</h2>

<p>一位研究者发布了一个包含超过 2000 万份印度法院案件的大型数据集，涵盖最高法院、25 个高等法院和 14 个法庭，并附带结构化元数据和分类引用图谱。每份文档都包含由 Voyage AI 生成的 1024 维稠密嵌入向量和稀疏 BM25 向量，并与 23,122 部法案和法规进行了交叉引用。此举标志着首个可机器阅读的印度法律引用网络的诞生，该网络将案例间的关系分类为“遵循”、“区分”或“推翻”等类型。 该数据集填补了低资源自然语言处理领域的关键空白，提供了正式且特定领域的法律文本，而非通常可用的印度语言对话或新闻数据。结构化引用图谱的加入使得利用图神经网络（GNN）进行法律结果预测和司法影响力分析成为可能，这在如此规模上以前是无法实现的。此外，稠密向量与稀疏向量的结合为法律领域的检索增强生成（RAG）系统提供了理想的评估平台，可利用真实的引用关系来基准测试检索准确率。最终，这一资源有望显著加速针对印度复杂司法系统的法律研究和结果预测 AI 工具的开发。 该数据集可通过 API 获取，也支持以 JSON 和 Parquet 格式批量导出，由于大多数高等法院的命令均以英语发布，因此内容主要为英文。元数据提取的准确率因法院而异，最高法院和主要高等法院的数据比小型法庭更干净，引用图谱的提取精度估计为 90-95%，但关系分类的精度较低。虽然案件的平均长度约为 3000 字，但部分判决书超过 50,000 字，这对大语言模型的上下文窗口管理提出了独特的挑战。</p>

<p>rss · r/MachineLearning · Apr 14, 14:14</p>

<p><strong>背景</strong>: 法律自然语言处理通常依赖引用网络来理解先例，即法院引用之前的判决来论证其决定，从而形成一个复杂的法律推理网络。在许多司法管辖区，尤其是那些使用低资源语言的地区，此类结构化数据很少以机器可读的格式存在，这阻碍了图神经网络等先进 AI 模型的应用。像 Voyage AI 这样的向量嵌入技术将文本转换为数值表示以捕捉语义含义，而像 BM25 这样的稀疏向量则侧重于关键词匹配，结合两者可以提高搜索检索性能。创建一个将这些嵌入与明确的引用处理方式（例如案件是否被推翻）联系起来的数据集，为训练和评估法律 AI 系统提供了罕见的“真实依据”。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://docs.voyageai.com/docs/embeddings">Text Embeddings - Voyage AI</a></li>
<li><a href="https://www.mongodb.com/docs/voyageai/models/text-embeddings/">Text Embeddings - Voyage AI by MongoDB - MongoDB Docs</a></li>
<li><a href="https://qdrant.tech/articles/sparse-vectors/">What is a Sparse Vector ? How to Achieve Vector -based... - Qdrant</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#legal-nlp</code>, <code class="language-plaintext highlighter-rouge">#datasets</code>, <code class="language-plaintext highlighter-rouge">#graph-neural-networks</code>, <code class="language-plaintext highlighter-rouge">#low-resource-languages</code>, <code class="language-plaintext highlighter-rouge">#rag</code></p>

<hr />

<p><a id="item-10"></a></p>
<h2 id="主流媒体因担忧-ai-训练屏蔽互联网档案馆-️-8010"><a href="https://www.wired.com/story/the-internets-most-powerful-archiving-tool-is-in-mortal-peril/">主流媒体因担忧 AI 训练屏蔽互联网档案馆</a> ⭐️ 8.0/10</h2>

<p>包括《纽约时报》、USA Today 和 Reddit 在内的至少 23 家主流新闻网站已开始屏蔽互联网档案馆的 ia_archiverbot 爬虫，以防止其内容被用于 AI 模型训练。作为回应，超过 100 名记者以及电子前哨基金会（EFF）等组织签署了一封公开信，捍卫网络归档在历史完整性和事实核查中的关键作用。虽然《卫报》等部分媒体尚未完全屏蔽访问，但也限制了 API 的使用，这标志着整个行业对自动化数据采集的态度发生了转变。 这一冲突凸显了媒体公司的版权保护与公共数字历史记录保存之间日益加剧的紧张关系，若得不到解决，可能会导致历史记录出现永久性空白。如果主要出版商成功屏蔽归档工具，未来的研究人员、记者和 AI 模型可能无法获取经过验证的新闻历史版本，从而削弱问责制和追踪信息演变的能力。这场争端的结果可能会为未来几十年非营利档案机构和商业 AI 开发者如何访问和利用公共网络数据树立法律和技术先例。 AI 检测公司 Originality AI 的分析证实，目前有 23 家特定网站正在屏蔽 ia_archiverbot 用户代理，尽管一些出版商声称这是通用反爬虫策略的一部分，而非针对性行动。互联网档案馆警告称，这些屏蔽措施严重削弱了社会理解历史和验证在线文章变更的能力，而这对于打击虚假信息至关重要。与通用搜索引擎爬虫不同，网站时光机专门创建带有时间戳的快照，作为特定时刻发布内容的不可篡改证据。</p>

<p>telegram · zaihuapd · Apr 14, 00:12</p>

<p><strong>背景</strong>: 互联网档案馆由 Brewster Kahle 于 1996 年创立，是一家致力于通过其数字收藏和网站时光机提供“普遍获取所有知识”的非营利图书馆。网站时光机已归档超过 1 万亿个网页快照，成为记者、律师和历史学家检索被删除或修改网页的重要资源。电子前哨基金会（EFF）成立于 1990 年，是一个领先的公民自由组织，经常通过诉讼来保护数字权利和合理使用原则，以对抗限制性的版权主张。最近，生成式 AI 的兴起加剧了关于抓取公共网络数据进行模型训练是否构成合理使用或版权侵权的辩论。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.firstpost.com/explainers/wayback-machine-internet-archive-threat-publishers-blocking-ai-copyright-explained-14000179.html">Is the internet’s memory at risk? Wayback Machine under ...</a></li>
<li><a href="https://en.wikipedia.org/wiki/Internet_Archive">Internet Archive</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-training-data</code>, <code class="language-plaintext highlighter-rouge">#copyright</code>, <code class="language-plaintext highlighter-rouge">#digital-preservation</code>, <code class="language-plaintext highlighter-rouge">#media-industry</code>, <code class="language-plaintext highlighter-rouge">#internet-archive</code></p>

<hr />

<p><a id="item-11"></a></p>
<h2 id="shinyhunters-借-anodot-入侵-snowflake-后向-rockstar-勒索赎金-️-8010"><a href="https://thecybersecguru.com/news/rockstar-games-snowflake-breach/">ShinyHunters 借 Anodot 入侵 Snowflake 后向 Rockstar 勒索赎金</a> ⭐️ 8.0/10</h2>

<p>黑客组织 ShinyHunters 宣称通过窃取第三方监控工具 Anodot 的身份验证令牌，成功入侵了 Rockstar Games 的数据环境。攻击者利用此权限进入了 Rockstar 的 Snowflake 数据仓库，并设定了 4 月 14 日为支付赎金的最后期限。此次事件是波及包括思科和 Telus 在内的 400 多家公司的更大规模供应链攻击浪潮的一部分。 此次事件凸显了供应链依赖中固有的关键漏洞，即攻陷像 Anodot 这样的单一第三方供应商可能会级联影响到数百家下游客户。它表明，如果在整个生态系统中没有严格维护身份管理和令牌安全，即使是 Snowflake 这样的企业级云平台也容易受到攻击。财务记录和商业合同的潜在泄露给主要游戏工作室及其合作伙伴带来了重大的运营和声誉风险。此外，这一事件强调了攻击者越来越倾向于将监控和可观测性工具作为横向移动的高价值入口点的趋势。 初步调查显示，此次泄露仅限于企业内部数据，目前尚无证据表明玩家的密码或支付详情遭到窃取。被盗凭证专门针对 Anodot 与 Rockstar 的 Snowflake 实例之间的集成，从而绕过了直接的边界防御。尽管 Rockstar 及其母公司 Take-Two 尚未发表官方声明，但攻击者威胁称若未在指定日期前支付赎金，将发布敏感数据。</p>

<p>telegram · zaihuapd · Apr 14, 01:49</p>

<p><strong>背景</strong>: Snowflake 是一个领先的云数据仓库平台，以其企业级安全功能而闻名，包括加密和细粒度的访问控制权限。供应链攻击发生在黑客攻陷受信任的第三方供应商以未经授权访问该供应商的客户时，这通常能绕过传统的安全边界。在此背景下，Anodot 作为一种云成本监控工具，需要与 Snowflake 等数据环境进行深度集成以分析支出模式，使其凭证对攻击者极具价值。最近的趋势显示，攻击者正转向针对这些相互连接的 SaaS 工具，而不是直接攻击大型企业。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://docs.snowflake.com/en/user-guide/security-access-control-privileges">Access control privileges | Snowflake Documentation</a></li>
<li><a href="https://www.phdata.io/blog/what-is-the-snowflake-data-cloud/">What is the Snowflake Data Cloud and How Much Does it... | phData</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#cybersecurity</code>, <code class="language-plaintext highlighter-rouge">#supply-chain-attack</code>, <code class="language-plaintext highlighter-rouge">#cloud-security</code>, <code class="language-plaintext highlighter-rouge">#data-breach</code>, <code class="language-plaintext highlighter-rouge">#snowflake</code></p>

<hr />

<p><a id="item-12"></a></p>
<h2 id="中国五部门联合印发人工智能加教育行动计划-️-7010"><a href="https://www.qbitai.com/2026/04/401190.html">中国五部门联合印发人工智能加教育行动计划</a> ⭐️ 7.0/10</h2>

<p>中国五个政府部门联合印发了《“人工智能 + 教育”行动计划》，旨在系统构建智能教育生态。该新政策要求统筹谋划专为学校人工智能应用设计的基础设施和创新环境建设。此项举措明确旨在加速人工智能人才培养，并推动全国教育体系内的应用创新。 这一公告代表了一种自上而下的监管转变，将从根本上重塑人工智能与中国庞大教育体系的融合方式。通过确立国家战略，政府表明了缩小人工智能技能差距和培养对技术主权至关重要的本土人才管道的坚定承诺。该计划可能会引发对教育科技基础设施和课程改革的重大投资，影响数百万学生和教育工作者。此外，它为其他考虑由国家主导人工智能劳动力发展的国家树立了先例。 该行动计划聚焦于两大支柱：推进人工智能人才培养以及促进教育环境内的应用创新。文件强调需要采取统一方法来构建智能教育所需的基础环境和创新生态。虽然摘要中未详述具体的数字目标，但该指令要求进行系统性建设，而非零散的试点项目。</p>

<p>rss · 量子位 · Apr 14, 10:19</p>

<p><strong>背景</strong>: 人工智能已日益成为全球教育战略的核心组成部分，许多国家都在更新课程以包含编程和数据科学内容。在中国，之前的举措主要集中在教室数字化上，但这项新计划标志着向将人工智能技术具体整合到学习过程中的转变。“人工智能 + 教育”的概念通常指利用机器学习实现个性化学习路径、自动评分和管理效率。此举与中国到 2030 年成为世界人工智能领导者的更广泛国家目标相一致。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai policy</code>, <code class="language-plaintext highlighter-rouge">#education</code>, <code class="language-plaintext highlighter-rouge">#china</code>, <code class="language-plaintext highlighter-rouge">#talent development</code>, <code class="language-plaintext highlighter-rouge">#regulation</code></p>

<hr />

<p><a id="item-13"></a></p>
<h2 id="千问-agent-实现通过对话直接生成和编辑-excel-️-7010"><a href="https://www.qbitai.com/2026/04/401041.html">千问 Agent 实现通过对话直接生成和编辑 Excel</a> ⭐️ 7.0/10</h2>

<p>千问推出了一项新的 AI Agent 功能，允许用户通过自然语言对话提示直接生成和编辑 Excel 文件。该更新利用 Qwen-Agent 框架的代码解释器和工具使用能力，绕过了传统的手动电子表格创建流程。用户现在可以用纯文本请求数据分析、可视化或文件格式化，系统将执行必要的 Python 代码以生成最终的 Excel 文档。 这一进展标志着生产力工具的重大转变，将静态电子表格转化为非技术用户也可访问的动态对话界面。它降低了复杂数据任务的门槛，有可能取代以前需要高级 Excel 知识或独立脚本技能的手动工作流程。通过直接集成到聊天界面中，千问将自己定位为一个全面的工作流自动化平台，而不仅仅是一个文本生成器。此举符合 AI Agent 的更广泛行业趋势，即模型主动执行任务而不仅仅是提供信息。 该功能依赖于开源的 Qwen-Agent 框架，该框架利用 LLM、提示词以及用于数学和数据可视化的代码解释器等原子组件。系统支持多轮对话，允许用户迭代地细化数据请求或修改现有的 Excel 文件。部署选项包括使用阿里云的 DashScope 模型服务，或在本地数据库服务上自托管开源千问模型以管理历史记录。该框架还支持插件集成，使 Agent 能够在生成新输出之前读取上传的文件并分析其内容。</p>

<p>rss · 量子位 · Apr 14, 02:48</p>

<p><strong>背景</strong>: AI Agent 是使用大型语言模型（LLM）来感知环境、规划行动并利用工具自主实现特定目标的软件系统。Qwen-Agent 框架是由阿里巴巴开发的开源项目，为构建此类应用提供了基础设施，具备指令遵循、规划和记忆等能力。传统上，创建 Excel 报表需要用户手动输入公式、格式化单元格或用 VBA 编写宏，这设立了较高的技能门槛。近期基于 LLM 的工作流自动化进步使得模型能够编写和执行 Python 代码（通常通过 pandas 和 openpyxl 等库）来直接操作数据文件，从而弥合了自然语言意图与文件系统操作之间的差距。</p>
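
<p>以下为一个假设性示意（并非 Qwen-Agent 的内部实现或 API），展示代码解释器在此类请求中可能执行的 pandas/openpyxl 代码，其中数据与文件名均为占位符：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import pandas as pd

# Hypothetical snippet of the kind a code-interpreter tool might run for a request
# like "summarize sales by region into Excel"; data and file names are placeholders.
df = pd.DataFrame({
    "region": ["North", "South", "North", "East"],
    "sales":  [1200, 950, 780, 1430],
})
summary = df.groupby("region", as_index=False)["sales"].sum()

# openpyxl handles .xlsx output; write raw data and the summary to separate sheets
with pd.ExcelWriter("report.xlsx", engine="openpyxl") as writer:
    df.to_excel(writer, sheet_name="raw_data", index=False)
    summary.to_excel(writer, sheet_name="summary", index=False)
</code></pre></div></div>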

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/QwenLM/Qwen-Agent">GitHub - QwenLM/Qwen-Agent: Agent framework and applications ... How to Use Qwen3 for AI Agents and RAG Systems: Step by Step Qwen-Agent - Read the Docs Qwen Agent: AI Agent Framework Documentation - qwenlm.github.io Qwen3.6-Plus: Towards Real World Agents - Alibaba Cloud qwen-agent · PyPI</a></li>
<li><a href="https://www.stonebranch.com/blog/10-clever-ways-to-embed-llm-tasks-in-automation-workflows">10 Clever Ways to Embed LLM Tasks in Automation Workflows</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#productivity-tools</code>, <code class="language-plaintext highlighter-rouge">#llm-applications</code>, <code class="language-plaintext highlighter-rouge">#workflow-automation</code>, <code class="language-plaintext highlighter-rouge">#qwen</code></p>

<hr />

<p><a id="item-14"></a></p>
<h2 id="nervecode利用层级惊讶信号提升分布外检测-️-7010"><a href="https://old.reddit.com/r/MachineLearning/comments/1sllv77/layerwise_surprise_signal_for_ood_detection_r/">Nervecode：利用层级“惊讶”信号提升分布外检测</a> ⭐️ 7.0/10</h2>

<p>一种名为 Nervecode 的新 PyTorch 方法引入了轻量级的只读包装器，在标准前向传播过程中生成层级“惊讶”信号。在从 MNIST 到 FashionMNIST 的基准测试中，该方法取得了 0.992 的 AUROC 分数，优于基于能量的检测和最大软概率 (MSP) 等现有方法。与传统的仅依赖输出的检测器不同，Nervecode 提供了详细的分解视图，展示了神经网络在遇到分布偏移时具体是哪些层发生了发散。 这一进展意义重大，因为它在不增加大量计算开销或需要模型重新训练的情况下，解决了检测分布外输入这一关键的安全挑战。通过提供层级层面的可解释性，它使开发人员不仅能识别输入是否异常，还能了解异常是在模型处理流程的哪个环节被发现的。这可能促使在高风险环境中构建更稳健的 AI 系统，因为在这些场景中，了解不确定性的来源与检测不确定性本身同样重要。此外，其表现超越 Energy 和 MSP 等强力基线，表明深度学习中的置信度评分研究方法可能发生转变。 该方法通过在选定层级添加轻量级包装器来运行，这些包装器以“只读”模式工作，确保不干扰正常的前向传播。在区分 MNIST 数字图像与 FashionMNIST 服装图像的特定任务中，它展现了卓越的性能，AUROC 达到了 0.992。其强调的主要优势是能够可视化层级发散，这是仅依赖输出的检测器根本不具备的能力。然而，目前的结果被视为一个早期构想，这意味着可能仍需在更多样化的数据集上进行更广泛的验证。</p>

<p>rss · r/MachineLearning · Apr 14, 21:17</p>

<p><strong>背景</strong>: 分布外 (OOD) 检测是机器学习中的一项关键技术，旨在识别与模型训练数据显著不同的输入，从而防止产生不可靠的预测。传统方法通常依赖最终输出层，例如计算最大软概率 (MSP) 或使用源自 logits 的能量分数 (Energy scores)，来判断输入是否陌生。虽然在一定程度上有效，但这些仅依赖输出的方法如同黑盒，无法揭示是哪些内部特征或层级触发了低置信度。Nervecode 试图通过直接监控内部层级激活来生成更细粒度的“惊讶”信号，从而解决这种不透明性问题。</p>
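
<p>原帖并未公开 Nervecode 的具体信号定义。下面的 PyTorch 草图只演示“只读前向钩子 + 层级激活统计偏差”这一通用思路，并与 MSP 基线并列输出，模型结构与统计量均为占位假设：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch, torch.nn as nn, torch.nn.functional as F

# Generic recipe, not Nervecode's exact formulation: read-only forward hooks score how
# far each layer's activations deviate from (placeholder) in-distribution statistics.
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
stats, surprises = {}, {}

def make_hook(name):
    def hook(module, inputs, output):
        feat = output.detach().flatten(1).mean(0)    # read-only: never touches the graph
        if name in stats:
            mu, sigma = stats[name]
            surprises[name] = ((feat - mu) / (sigma + 1e-6)).abs().mean().item()
    return hook

model[1].register_forward_hook(make_hook("fc1"))
model[3].register_forward_hook(make_hook("fc2"))

# In practice mu/sigma come from a pass over the training set; placeholders here.
stats = {"fc1": (torch.zeros(256), torch.ones(256)), "fc2": (torch.zeros(10), torch.ones(10))}

with torch.no_grad():
    logits = model(torch.randn(32, 1, 28, 28))       # stand-in for a test batch
    msp = F.softmax(logits, dim=1).max(dim=1).values.mean()   # output-only MSP baseline
print(surprises, "MSP:", msp.item())
</code></pre></div></div>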

<details><summary>参考链接</summary>
<ul>
<li><a href="https://spotintelligence.com/2024/11/11/out-of-distribution-in-machine-learning-made-simple-how-to-detect-it/">Out-of-Distribution In ML Made Simple &amp; How To Detect It</a></li>
<li><a href="https://arxiv.org/abs/2010.03759">[2010.03759] Energy-based Out-of-distribution Detection GitHub - weitliu/energy_ood Energy-based out-of-distribution detection | Proceedings of ... Images Energy-based Out-of-distribution Detection - NeurIPS Energy-based Out-of-distribution Detection for Multi-label... pytorch_ood.detector.energy — pytorch-ood documentation FEVER-OOD: Free Energy Vulnerability Elimination for Robust ...</a></li>
<li><a href="https://pytorch-ood.readthedocs.io/en/stable/detector.html">Detectors — pytorch-ood documentation</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#machine learning</code>, <code class="language-plaintext highlighter-rouge">#ood detection</code>, <code class="language-plaintext highlighter-rouge">#pytorch</code>, <code class="language-plaintext highlighter-rouge">#research</code>, <code class="language-plaintext highlighter-rouge">#interpretability</code></p>

<hr />

<p><a id="item-15"></a></p>
<h2 id="minimax-因禁止开源模型-27-商用引发争议-️-7010"><a href="https://www.cnbeta.com.tw/articles/tech/1557982.htm">MiniMax 因禁止开源模型 2.7 商用引发争议</a> ⭐️ 7.0/10</h2>

<p>MiniMax 最近开源了其 M2.7 大语言模型，但在许可协议中明确禁止未经授权的商业用途。面对开发者的质疑，员工 Ryan Lee 回应称，此举旨在防止第三方平台因过度量化或误导性模板等低劣服务损害品牌声誉。因此，任何希望部署 MiniMax 2.7 对外提供服务的第三方都必须获得官方授权。 这一决定标志着中国 AI 行业在开源许可策略上的重大转变，从宽松模式转向受控分发以保护品牌完整性。这直接影响了那些计划将 M2.7 集成到商业产品中或通过 API 提供服务而未建立直接合作伙伴关系的开发者。虽然这可能为最终用户确保更高的服务一致性，但与 Llama 或 Qwen 等完全宽松的替代方案相比，它也可能减缓生态系统的采用速度。这一趋势表明，主要 AI 厂商正日益优先考虑质量控制和声誉管理，而非最大化的社区扩散。 MiniMax M2.7 是一个拥有 2300 亿参数的模型，专为复杂代理任务、编码和推理设计，但其实用性现在受到严格许可条款的限制。公司指出，未经授权托管站点存在的“挂羊头卖狗肉”策略和技术错误是此次政策调整的主要驱动因素。开发者现在必须经过授权流程才能合法地基于该模型提供商业服务，这为部署工作流增加了一层摩擦。</p>

<p>telegram · zaihuapd · Apr 14, 11:04</p>

<p><strong>背景</strong>: 在 AI 领域，“开源”传统上意味着可以自由使用、修改和分发模型，通常采用允许商业利用的 Apache 2.0 或 MIT 等许可证。然而，最近的趋势显示，公司在发布模型权重的同时限制商业权利，以维持对其技术如何呈现给市场的控制。这种混合方法试图在社区参与和防止低质量包装混淆用户对模型真实能力的认知之间取得平衡。随着 AI 中“开源”的定义变得日益微妙，理解这种区别至关重要。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.minimax.io/models/text/m27">MiniMax M2.7 - Model Self-Improvement, Driving Productivity ...</a></li>
<li><a href="https://github.com/MiniMax-AI/MiniMax-M2.7">GitHub - MiniMax-AI/MiniMax-M2.7</a></li>
<li><a href="https://build.nvidia.com/minimaxai/minimax-m2.7">minimax-m2.7 Model by Minimaxai | NVIDIA NIM</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#licensing</code>, <code class="language-plaintext highlighter-rouge">#minimax</code>, <code class="language-plaintext highlighter-rouge">#ai-industry</code>, <code class="language-plaintext highlighter-rouge">#china-ai</code></p>

<hr />

<h2 id="关注动态-1">关注动态</h2>

<p><a id="item-16"></a></p>
<h2 id="memsearch-updates-6-updates--bump-memsearch-030-and-claude-code-plugin-035-348-add-jina-and-mistral-embedding-providers-346-expand-feature-matrix-with-embedding-providers-and-optional-rer-️-10"><a href="https://github.com/zilliztech/memsearch/commit/b38c894d679e65ffb131205b71ea1b453a1b2269">MemSearch Updates: 6 updates — bump memsearch 0.3.0 and claude-code plugin 0.3.5 (#348), add Jina and Mistral embedding providers (#346), expand feature matrix with embedding providers and optional rer…</a> ⭐️ ?/10</h2>

<p>MemSearch 已更新至 0.3.0 版本，同时升级了 Claude Code 插件至 0.3.5。本次更新显著增强了功能，新增了对 Jina 和 Mistral 嵌入提供商的支持，扩展了向量生成的选项。文档也已全面刷新，包含了涵盖新提供商和可选重排序功能的详细特性矩阵，并优化了与替代方案的对比分析部分。</p>

<p>rss · MemSearch Updates · Apr 14, 10:08</p>

<hr />

<p><a id="item-17"></a></p>
<h2 id="chorereadme-update-the-preview-pic-️-10"><a href="https://github.com/Thysrael/Horizon/commit/0f52c5654e8ab28b97676f8c1b508fe96923cb0e">chore(README): update the preview pic</a> ⭐️ ?/10</h2>

<p>仓库最近更新了 README 中的预览图片。这仅是文档层面的变更，旨在优化视觉展示，不影响任何功能、代码逻辑或 API。开发者无需采取任何操作。</p>

<p>rss · Horizon Upstream · Apr 14, 14:33</p>

<hr />

<p><a id="item-18"></a></p>
<h2 id="superpowers-updates-10-updates--merge-pull-request-1165-from-obramirror-codex-plugin-tooling-anchor-excludes-patterns-to-source-root-exclude-assets-add-bootstrap-flag-️-10"><a href="https://github.com/obra/superpowers/commit/f9b088f7b3a6fe9d9a9a98e392ad13c9d47053a4">Superpowers Updates: 10 updates — Merge pull request #1165 from obra/mirror-codex-plugin-tooling, anchor EXCLUDES patterns to source root, exclude assets/, add –bootstrap flag</a> ⭐️ ?/10</h2>

<p>本次更新引入了将 Superpowers 镜像为 Codex 插件的新工具链，包括重写同步流程以自动克隆分支、创建拉取请求并重新生成覆盖层。同步工具得到了增强，新增了 <code class="language-plaintext highlighter-rouge">--bootstrap</code> 标志，明确排除 <code class="language-plaintext highlighter-rouge">assets/</code> 目录，并将排除模式锚定到源码根目录以提高可靠性。此外，<code class="language-plaintext highlighter-rouge">plugin.json</code> 配置已与线上结构对齐，同时移除了 <code class="language-plaintext highlighter-rouge">CHANGELOG.md</code> 等遗留文件及不必要的代理配置，以精简项目结构。</p>

<p>rss · Superpowers Updates · Apr 14, 21:13</p>

<hr />

<p><a id="item-19"></a></p>
<h2 id="openaicodex-2-releases--rust-v01210-alpha9-rust-v01210-alpha8-️-10"><a href="https://github.com/openai/codex/releases/tag/rust-v0.121.0-alpha.9">openai/codex: 2 releases — rust-v0.121.0-alpha.9, rust-v0.121.0-alpha.8</a> ⭐️ ?/10</h2>

<p>openai/codex 仓库发布了其 Rust 实现两个新的 Alpha 版本：v0.121.0-alpha.8 和 v0.121.0-alpha.9。提供的日志仅确认了发布时间和版本标签，未包含关于功能变更、错误修复或破坏性更新的具体细节。关注该项目的开发者应拉取最新标签以测试 Alpha 迭代中可能包含的内部更新，但根据当前摘要无法确认任何具体的功能性变更。</p>

<p>github · github-actions[bot] · Apr 14, 16:45</p>

<hr />

<p><a id="item-20"></a></p>
<h2 id="anthropicsclaude-code-2-releases--v21108-v21107-️-10"><a href="https://github.com/anthropics/claude-code/releases/tag/v2.1.108">anthropics/claude-code: 2 releases — v2.1.108, v2.1.107</a> ⭐️ ?/10</h2>

<p>该仓库连续发布了 v2.1.107 和 v2.1.108 两个新版本。然而，提供的发布说明仅包含时间戳和版本标签，未列出任何具体的功能变更、错误修复或破坏性更新。因此，仅凭现有信息无法确定这些发布的技术影响或识别开发人员需要采取的行动。建议用户查阅完整的提交历史或详细变更日志以获取具体修改内容。</p>

<p>github · ashwin-ant · Apr 14, 19:12</p>

<hr />

<p><a id="item-21"></a></p>
<h2 id="upstashcontext7-released-ctx70313-️-10"><a href="https://github.com/upstash/context7/releases/tag/ctx7%400.3.13">upstash/context7 released ctx7@0.3.13</a> ⭐️ ?/10</h2>

<p>此补丁版本修复了影响 Windows 用户在技能安装过程中的关键错误。此前，路径验证逻辑因无法正确处理反斜杠分隔的解析路径，导致目标目录内的有效文件被错误拒绝。该修复确保了技能安装能在 Windows 环境下顺利进行，不再出现误报的路径错误。本次更新未引入任何破坏性变更或新功能。</p>

<p>github · github-actions[bot] · Apr 14, 07:51</p>

<hr />

<h2 id="github-热榜-1">GitHub 热榜</h2>

<p><a id="item-22"></a></p>
<h2 id="karpathy-的-llmc用于教育的纯-ccuda-llm-训练实现-️-10010"><a href="https://github.com/karpathy/llm.c">Karpathy 的 llm.c：用于教育的纯 C/CUDA LLM 训练实现</a> ⭐️ 10.0/10</h2>

<p>Andrej Karpathy 发布了 llm.c，这是一个完全用纯 C 和 CUDA 编写的大型语言模型训练最小化实现，没有任何外部依赖。该项目去除了 PyTorch 等高层框架，直接揭示了 GPU 加速深度学习的基本机制。它作为一个直接的教育工具，帮助开发者理解现代 AI 背后的底层基础设施。 该项目的重要性在于它通过揭示负责张量运算和反向传播的实际代码，消除了深度学习框架的“黑盒”神秘感。对于 AI 工程师而言，阅读此代码能提供对内存管理、内核优化以及通常被抽象掉的 Transformer 数学基础的无与伦比的洞察。与专注于速度的生产级引擎不同，llm.c 优先考虑代码的可读性和教学清晰度，旨在弥合理论与系统编程之间的差距。 该仓库仅使用标准 C 和 NVIDIA 的 CUDA API 实现了完整的训练循环，包括数据加载、前向传播、损失计算和反向传播。它避免了复杂的构建系统或第三方库，使得在任何带有 GPU 的 Linux 机器上都易于编译和检查。该代码库专门设计得足够小巧，以便单个开发者能够完全理解，同时仍具备训练小规模模型的功能。</p>

<p>rss · GitHub Trending - CUDA · Apr 14, 01:34</p>

<p><strong>背景</strong>: 现代深度学习通常使用 PyTorch 或 TensorFlow 等高层框架进行，这些框架抽象了底层的硬件交互。虽然效率很高，但这种抽象往往阻碍了工程师理解梯度是如何实际计算的或 GPU 上的内存是如何管理的。llm.c 通过提供一个从头开始的实现来填补这一空白，它镜像了这些框架的功能，但具有完全的透明度。它与阿里巴巴的 RTP-LLM 等生产级推理引擎形成鲜明对比，后者针对吞吐量和延迟进行了优化，而非教育清晰度。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/karpathy/llm.c">GitHub - karpathy/llm.c: LLM training in simple, raw C/CUDA</a></li>
<li><a href="https://deepwiki.com/karpathy/llm.c">karpathy/llm.c | DeepWiki</a></li>
<li><a href="https://github.com/alibaba/rtp-llm">GitHub - alibaba/rtp-llm: RTP-LLM: Alibaba's high-performance ...</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: AI 社区反应热烈，将 llm.c 视为学生和从业者掌握 CUDA 编程的重要资源。许多用户利用该代码库学习如何编写自定义内核，并在没有框架开销的情况下理解分布式训练的复杂性。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#c</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#education</code></p>

<hr />

<p><a id="item-23"></a></p>
<h2 id="instant-ngp通过-cuda-实现闪电般快速的神经图形-️-10010"><a href="https://github.com/NVlabs/instant-ngp">Instant-NGP：通过 CUDA 实现闪电般快速的神经图形</a> ⭐️ 10.0/10</h2>

<p>NVIDIA 的 instant-ngp 引入了高度优化的 CUDA 内核，大幅减少了神经辐射场（NeRF）的训练和推理时间。该项目通过利用多分辨率哈希编码，将神经图形的训练时间从数小时缩短至数秒或数分钟。它提供了一个独立的应用程序和库，可直接集成到 3D AI 工作流中。 早期的 NeRF 实现通常因速度过慢而无法用于实际交互应用或快速原型开发，限制了其在实时系统中的普及。Instant-NGP 通过高效的内存访问模式和稀疏数据结构，实现了高达 100 倍的加速，从而解决了这一瓶颈。这一突破使得高质量 3D 重建在消费级硬件和实时渲染管线中变得可行。因此，它已成为现代神经图形研究的事实标准基础设施。 其核心创新在于使用可训练的多分辨率哈希表来编码空间特征，从而实现即时查找和梯度更新。定制的 CUDA 内核处理光线步进和网络评估的重负载任务，确保了最大的 GPU 利用率。该项目支持除 NeRF 之外的多种图元，包括神经表面和体渲染。</p>

<p>rss · GitHub Trending - CUDA · Apr 14, 01:34</p>

<p><strong>背景</strong>: 神经辐射场彻底改变了视图合成，但最初因其在一块 GPU 上需要数小时甚至数天的训练时间而受到限制。现有的解决方案依赖于密集的体素网格或缓慢的 MLP 评估，未能充分利用 GPU 并行性。Instant-NGP 通过重新思考数据表示和底层内核优化，填补了实时能力神经渲染的空白。它依托 NVIDIA 在 CUDA 最佳实践方面的深厚专业知识，克服了内存带宽和计算延迟问题。</p>
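
<p>以下是对哈希网格编码思想的极简示意（假设：仅单一分辨率层级、取最近格点而非三线性插值），哈希素数取自 instant-ngp 论文，其余参数均为示例值：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch

# Simplified, single-level sketch of the hash-grid lookup idea (assumption: nearest-vertex
# lookup, no trilinear interpolation): coordinates hash into a small learnable feature table.
PRIMES = (1, 2654435761, 805459861)      # spatial-hash primes from the instant-ngp paper

class HashGridLevel(torch.nn.Module):
    def __init__(self, table_size=2**14, feature_dim=2, resolution=64):
        super().__init__()
        self.table = torch.nn.Parameter(torch.randn(table_size, feature_dim) * 1e-4)
        self.resolution = resolution
        self.table_size = table_size

    def forward(self, xyz):                          # xyz in [0, 1]^3, shape (N, 3)
        grid = (xyz * self.resolution).long()        # nearest grid vertex per point
        h = torch.zeros(grid.shape[0], dtype=torch.long)
        for d, p in enumerate(PRIMES):
            h = h ^ (grid[:, d] * p)                 # XOR spatial hash over the 3 axes
        return self.table[h % self.table_size]       # (N, feature_dim) trainable features

feats = HashGridLevel()(torch.rand(8, 3))            # these features then feed a tiny MLP in NeRF
</code></pre></div></div>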

<details><summary>参考链接</summary>
<ul>
<li><a href="https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html">CUDA C++ Best Practices Guide - NVIDIA Documentation Hub</a></li>
<li><a href="https://siboehm.com/articles/22/CUDA-MMM">How to Optimize a CUDA Matmul Kernel for cuBLAS-like ... CUDA Kernel Optimization for Image Convolution - Medium GitHub - OptimAI-Lab/CudaForge: Official Repo of CudaForge 3.2. Advanced Kernel Programming — CUDA Programming Guide GPU MODE Lecture 8: CUDA Performance Checklist</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: AI 工程社区普遍认为该仓库是任何针对 3D 任务优化深度学习内核人员的必读资料。开发人员经常引用其哈希编码技术，视其为 TensoRF 和 3D 高斯泼溅等后续快速 3D 重建模型的关键灵感来源。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#nerf</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#computer-vision</code>, <code class="language-plaintext highlighter-rouge">#3d-reconstruction</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code></p>

<hr />

<p><a id="item-24"></a></p>
<h2 id="sageattentiontransformer-的量化加速方案-️-10010"><a href="https://github.com/thu-ml/SageAttention">SageAttention：Transformer 的量化加速方案</a> ⭐️ 10.0/10</h2>

<p>SageAttention 推出了一种量化注意力机制，在语言、图像和视频模型上实现了比 FlashAttention 快 2-5 倍的性能。该优化在显著降低推理延迟的同时，保持了端到端的模型精度。 该工具通过先进的量化技术最小化高带宽内存与片上 SRAM 之间的数据移动，直接解决了关键的推理瓶颈。与以往常以牺牲精度换取速度的方法不同，SageAttention 在不降低模型指标的情况下实现了显著的性能提升。其在 ICLR 和 NeurIPS 等顶级会议上的录用证明了其在生产环境中的鲁棒性。AI 工程师现在可以以更低的计算成本部署更大或更复杂的 Transformer 模型。 该项目支持自然语言处理、计算机视觉和视频分析等多个领域，且无需重新训练模型。它可以作为现成的替换组件无缝集成到基于 PyTorch 的工作流中。基准测试表明，根据序列长度和硬件配置的不同，其加速倍数稳定在 2 倍到 5 倍之间。</p>

<p>rss · GitHub Trending - CUDA · Apr 14, 01:34</p>

<p><strong>背景</strong>: Transformer 模型已成为 AI 任务的标准，但在注意力计算过程中面临高内存带宽需求的问题。FlashAttention 此前通过优化内存访问模式解决了部分问题，但受限于精度约束，进一步的性能提升变得困难。SageAttention 通过对注意力矩阵计算应用激进的量化技术填补了这一空白。这种方法在保持深度学习训练和推理所需数值稳定性的同时，实现了更快的计算速度。</p>
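
<p>下面用纯 PyTorch 模拟“先对 Q/K 做 INT8 量化、再计算注意力分数”的概念（仅为示意：真实内核在硬件上直接以低精度执行矩阵乘法，此处并非 SageAttention 的实际 API）：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch, torch.nn.functional as F

# Conceptual sketch only, not SageAttention's API: simulate symmetric INT8 quantization of
# Q/K before the score matmul, then compare against the full-precision reference output.
def quantize_int8(x):
    scale = x.abs().amax() / 127.0 + 1e-8
    return torch.clamp((x / scale).round(), -127, 127), scale

def quantized_attention(q, k, v):
    q_int, q_scale = quantize_int8(q)
    k_int, k_scale = quantize_int8(k)
    scores = (q_int @ k_int.transpose(-2, -1)) * (q_scale * k_scale)   # rescale to float domain
    scores = scores / q.shape[-1] ** 0.5
    return F.softmax(scores, dim=-1) @ v

q, k, v = (torch.randn(2, 8, 128, 64) for _ in range(3))   # (batch, heads, seq, head_dim)
out = quantized_attention(q, k, v)
ref = F.scaled_dot_product_attention(q, k, v)
print((out - ref).abs().max())                             # quantization error stays small
</code></pre></div></div>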

<details><summary>参考链接</summary>
<ul>
<li><a href="https://gordicaleksa.medium.com/eli5-flash-attention-5c44017022ad">ELI5: FlashAttention. Step by step explanation of how one of ...</a></li>
<li><a href="https://www.theneuron.ai/explainer-articles/flashattention-4-explained-the-software-that-makes-every-ai-chatbot-fast-just-got-a-massive-upgrade-tri-dao-blackwell/">FlashAttention-4, Explained: What it is &amp; Why it Matters</a></li>
<li><a href="https://iclr-blogposts.github.io/2026/blog/2026/the-evolution-of-flashattention/">The Evolution of FlashAttention | ICLR Blogposts 2026</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者强调了其集成的便捷性以及在云推理实例上带来的即时成本节约。社区正在积极讨论扩展支持更低比特宽度的可能性，以适应边缘设备的需求。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#transformers</code>, <code class="language-plaintext highlighter-rouge">#optimization</code>, <code class="language-plaintext highlighter-rouge">#llm-inference</code></p>

<hr />

<p><a id="item-25"></a></p>
<h2 id="voxcpm2无分词器的多语言语音合成与克隆模型-️-9010"><a href="https://github.com/OpenBMB/VoxCPM">VoxCPM2：无分词器的多语言语音合成与克隆模型</a> ⭐️ 9.0/10</h2>

<p>VoxCPM2 推出了基于端到端扩散架构的 20 亿参数无分词器模型，可直接生成连续语音表示。该版本支持 30 种语言，并新增了无需参考音频的文本描述语音设计及可控克隆功能。 通过绕过离散分词，该模型克服了传统语音合成系统中常见的韵律限制和伪影，生成了更加自然且富有表现力的音频。仅凭文本描述即可设计语音的功能，降低了创意音频制作的门槛，使缺乏大量语音数据的开发者也能受益。此外，其 48kHz 的输出质量使其不仅适用于实验演示，更能满足专业录音室的应用需求。 该模型基于 MiniCPM-4 骨干网络构建，并在超过 200 万小时的多语言语音数据上训练，以确保稳健的性能。主要功能包括在提供转录文本时能保留声音细微差别的极致克隆，以及与 Hugging Face 和 ModelScope 的无缝集成。系统采用从 LocEnc 到 TSLM、RALM 再到 LocDiT 的流水线来实现高保真合成。</p>

<p>rss · GitHub Trending - Python · Apr 14, 01:39</p>

<p><strong>背景</strong>: 传统的文本转语音（TTS）系统通常依赖将音频转换为离散标记，这一过程往往会剥离微妙的情感细微差别并限制韵律的灵活性。VoxCPM 通过在连续空间中直接对语音建模来解决这一问题，消除了量化带来的信息损失。这种方法填补了关键的市场空白，为需要高保真、情感共鸣且不受固定词汇表限制的多语言语音合成应用提供了解决方案。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/OpenBMB/VoxCPM/">VoxCPM2: Tokenizer-Free TTS for Multilingual Speech ... - GitHub</a></li>
<li><a href="https://openbmb.github.io/voxcpm2-demopage/">VoxCPM2 Demo Page</a></li>
<li><a href="https://aibit.im/blog/post/voxcpm2-2b-multilingual-tts-with-voice-cloning-design">VoxCPM2: 2B Multilingual TTS with Voice Cloning &amp; Design</a></li>
<li><a href="https://pyshine.com/VoxCPM-Tokenizer-Free-TTS/">VoxCPM: Tokenizer-Free TTS for Multilingual Speech Generation</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: AI 社区正在积极讨论无分词器架构相较于 VITS 或 Tortoise 等成熟模型在实时推理延迟方面的影响。早期采用者对‘语音设计’功能特别感兴趣，希望通过该功能在不进行录音的情况下创建独特的品牌资产。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#text-to-speech</code>, <code class="language-plaintext highlighter-rouge">#voice-cloning</code>, <code class="language-plaintext highlighter-rouge">#multilingual-ai</code>, <code class="language-plaintext highlighter-rouge">#generative-audio</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code></p>

<hr />

<p><a id="item-26"></a></p>
<h2 id="axolotl-简化生产级大语言模型微调流程-️-9010"><a href="https://github.com/axolotl-ai-cloud/axolotl">Axolotl 简化生产级大语言模型微调流程</a> ⭐️ 9.0/10</h2>

<p>最新更新包括原生支持 Mistral Small 4、Qwen3.5 MoE 和 GLM-4 系列模型，并新增 MoE 专家量化功能以大幅降低显存占用。该框架现已集成 ScatterMoE LoRA 用于直接调整专家权重、SageAttention 优化注意力机制，以及熵感知焦点训练等先进技术。 Axolotl 通过提供统一的 YAML 驱动配置系统消除了样板代码，填补了研究原型与生产部署之间的关键空白。其对 FSDP2 和量化等内存高效技术的强大支持，使工程师能够在有限硬件上微调大型模型而不牺牲性能。通过自动化多 GPU 训练和 RLHF 对齐等复杂工作流，它显著加速了定制 AI 应用的迭代周期。 该框架基于 PyTorch 和 Hugging Face 生态系统构建，支持全量微调、LoRA、QLoRA 和 DPO 等多种策略。它具备自动数据集预处理、混合精度训练功能，并通过 WandB 或 CometML 提供广泛日志记录。最近的功能更新专门针对混合专家架构，利用自定义 Triton 内核优化速度和内存效率。</p>

<p>rss · GitHub Trending - Python · Apr 14, 01:39</p>

<p><strong>背景</strong>: 传统上大语言模型的微调需要编写大量易错的训练循环，并手动管理分布式计算资源。虽然 Hugging Face Transformers 等库提供了基础组件，但往往缺乏面向生产规模任务的全流程标准化工作流。Axolotl 通过提供标准化且经过实战验证的流水线填补了这一空白，在抽象基础设施复杂性的同时保留了专家定制的灵活性。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://arxiv.org/html/2408.13296v1">The Ultimate Guide to Fine-Tuning LLMs from Basics to ...</a></li>
<li><a href="https://www.turing.com/resources/finetuning-large-language-models">What is Fine-Tuning LLM? Methods &amp; Step-by-Step Guide in 2026</a></li>
<li><a href="https://github.com/rasbt/LLMs-from-scratch">GitHub - rasbt/LLMs-from-scratch: Implement a ChatGPT-like ... Quantization-Aware Training for Large Language Models with ... Fine-Tuning Your First Large Language Model (LLM) with ... Build your own Large Language Model (LLM) From Scratch Using ... PyTorch Language Models - Compile N Run</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 该项目拥有一个高度活跃的社区，通过严格的夜间测试和多 GPU 端到端验证确保更新后的稳定性。用户在调试复杂训练任务时，经常强调其优于竞争对手的文档质量和 Discord 技术支持是关键优势。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#fine-tuning</code>, <code class="language-plaintext highlighter-rouge">#pytorch</code>, <code class="language-plaintext highlighter-rouge">#machine-learning</code>, <code class="language-plaintext highlighter-rouge">#ai-infrastructure</code></p>

<hr />

<p><a id="item-27"></a></p>
<h2 id="微软-agent-lightning-简化-ai-智能体训练流程-️-9010"><a href="https://github.com/microsoft/agent-lightning">微软 Agent Lightning 简化 AI 智能体训练流程</a> ⭐️ 9.0/10</h2>

<p>微软发布了 Agent Lightning，这是一个旨在无需代码修改即可训练和评估自主 AI 智能体的开源框架。它作为一个灵活的中间层，将 LangChain 和 AutoGen 等流行智能体框架直接连接到 verl 等大语言模型训练基础设施。该项目原生支持包括强化学习和自动提示优化在内的多种优化算法。 该框架解决了关键的基础设施缺口，允许开发者在不重写现有逻辑或切换生态系统的情况下优化智能体。通过在训练循环中暴露兼容 OpenAI 的 API，它消除了复杂的重新分词问题，并实现了与标准强化学习工作流的无缝集成。这显著降低了在生產环境中将 GRPO 等高级训练技术应用于多智能体系统的门槛。 Agent Lightning 具备选择性优化功能，允许用户针对多智能体系统中的特定智能体进行微调。它可通过 PyPI 安装，拥有全面的文档和完整的单元测试覆盖以确保稳定性。该框架支持轨迹级聚合以加速训练，并能处理 Token ID 返回以防止强化学习过程中的漂移。</p>

<p>rss · GitHub Trending - Python · Apr 14, 01:39</p>

<p><strong>背景</strong>: 在 Agent Lightning 出现之前，训练自主智能体通常需要在智能体编排工具和深度学习训练器之间进行繁琐的自定义集成。开发者经常面临分词不匹配的挑战，并且缺乏在强化学习阶段评估智能体性能的标准协议。该项目提供了一个由微软支持的统一接口，连接了这些分散的工具，从而填补了这一空白。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/microsoft/agent-lightning">GitHub - microsoft/agent-lightning: The absolute trainer to ...</a></li>
<li><a href="https://www.microsoft.com/en-us/research/project/agent-lightning/">Agent Lightning - Microsoft Research</a></li>
<li><a href="https://microsoft.github.io/agent-lightning/latest/">Agent-lightning - microsoft.github.io</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者强调该框架在使用 vLLM 配合兼容 OpenAI 的 API 时解决重新分词漂移问题的能力。社区教程已经开始涌现，展示如何将 Agent Lightning 与 Tinker 等其他工具结合以实现快速智能体调优。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#machine-learning</code>, <code class="language-plaintext highlighter-rouge">#training-framework</code>, <code class="language-plaintext highlighter-rouge">#microsoft</code>, <code class="language-plaintext highlighter-rouge">#python</code></p>

<hr />

<p><a id="item-28"></a></p>
<h2 id="flowise基于-langchain-的可视化低代码-ai-智能体构建器-️-9010"><a href="https://github.com/FlowiseAI/Flowise">Flowise：基于 LangChain 的可视化低代码 AI 智能体构建器</a> ⭐️ 9.0/10</h2>

<p>Flowise 提供了一个开源的拖放式界面，允许开发者以可视化方式构建定制的 LLM 工作流和 AI 智能体。它利用现有的 LangChain 组件，消除了原型设计阶段对大量样板代码的需求。该工具支持通过 Docker 或 npm 立即部署，便于快速迭代。 该工具通过抽象 LangChain 组件之间复杂的连接逻辑，显著降低了创建复杂 AI 智能体的门槛。它加速了开发生命周期，使工程师能够在几分钟内测试逻辑流和智能体架构，而不是花费数小时。通过将链、工具和模型之间的连接可视化，团队可以更好地协作调试和优化 AI 行为。这种转变使得开发者能够专注于高层策略和提示工程，而非基础设施搭建。 Flowise 支持通过 Docker Compose 进行自托管，并提供托管服务的云版本。它包含了 LangChain 生态系统中各种 LLM 提供商、向量存储和文档加载器的预建节点。用户可以将其创建的工作流导出为 JSON，或通过 API 端点直接集成到应用程序中。</p>

<p>rss · GitHub Trending - TypeScript · Apr 14, 01:41</p>

<p><strong>背景</strong>: 使用 LangChain 构建生产级的 LLM 应用通常需要编写大量的 Python 或 JavaScript 代码来串联各个组件。这种编码开销可能会减缓实验速度，并使非开发人员难以理解智能体的逻辑。Flowise 通过为 LangChain 提供 GUI 层来填补这一空白，其作用类似于 Node-RED 之于物联网或 Zapier 之于工作流。它将抽象的代码结构转化为可编辑的具体流程图。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://docs.langchain.com/oss/javascript/langchain/component-architecture">Component architecture - Docs by LangChain</a></li>
<li><a href="https://www.geeksforgeeks.org/artificial-intelligence/introduction-to-langchain/">Introduction to LangChain - GeeksforGeeks</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 该项目在 GitHub 上获得了强烈的关注，并通过 Discord 提供了活跃的社区支持，表明其拥有用于故障排除和功能请求的健壮生态系统。用户经常分享自定义节点模板和复杂的智能体模式，为高级用例营造了协作环境。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#low-code</code>, <code class="language-plaintext highlighter-rouge">#langchain</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code></p>

<hr />

<p><a id="item-29"></a></p>
<h2 id="deepep面向-moe-训练的高效通信库-️-9010"><a href="https://github.com/deepseek-ai/DeepEP">DeepEP：面向 MoE 训练的高效通信库</a> ⭐️ 9.0/10</h2>

<p>深度求索（DeepSeek AI）发布了 DeepEP，这是一个专为大型混合专家（MoE）模型中的专家并行优化的 CUDA 库。它引入了高吞吐、低延迟的 GPU 全对全（all-to-all）内核，专门用于处理 MoE 的分发与合并操作。该库还集成了对低精度 FP8 运算的支持，以进一步提升效率。 训练大规模 MoE 模型时，专家并行所需的复杂全对全数据传输常导致通信瓶颈，从而拖慢训练进度。DeepEP 通过提供定制化的内核，直接填补了这一基础设施空白，其延迟显著低于通用的集体通信库。这使得研究人员和工程师能够在现有 GPU 集群上更有效地扩展 MoE 架构，而不受网络开销的限制。 该库实现了优化的分发与合并操作，与 DeepSeek-V3 等模型中使用的组限制门控算法保持一致。它支持细粒度缩放和包括 FP8 在内的低精度格式，以最大化现代 NVIDIA GPU 的硬件利用率。DeepEP 被设计为一个独立的组件，可以集成到更广泛的分布式训练框架中。</p>

<p>rss · GitHub Trending - CUDA · Apr 14, 01:34</p>

<p><strong>背景</strong>: 混合专家模型已成为扩展大型语言模型的标准，但它们引入了区别于标准数据并行或张量并行的独特通信挑战。传统的库（如 NCCL）对于专家路由中固有的不规则多对多流量模式往往不是最优解。DeepEP 通过提供专为处理专家并行特定拓扑和带宽需求的解决方案，填补了这一空白。</p>
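
<p>为帮助理解“分发（dispatch）与合并（combine）”的语义，下面给出一个单卡、无通信的玩具级 PyTorch 示意（top-1 路由；DeepEP 的价值在于把同样的语义映射为多 GPU 间的全对全通信，这里不涉及）：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch

# Toy, single-GPU illustration of MoE dispatch/combine semantics (not DeepEP's multi-GPU
# all-to-all kernels): route each token to its top-1 expert, then scatter results back.
tokens = torch.randn(16, 32)                         # (num_tokens, hidden_dim)
experts = torch.nn.ModuleList(torch.nn.Linear(32, 32) for _ in range(4))
router_logits = torch.randn(16, 4)
expert_ids = router_logits.argmax(dim=-1)            # top-1 gating decision per token

output = torch.zeros_like(tokens)
with torch.no_grad():
    for e, expert in enumerate(experts):
        mask = expert_ids == e
        if mask.any():
            output[mask] = expert(tokens[mask])      # "dispatch" to expert e, "combine" back
</code></pre></div></div>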

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/deepseek-ai/DeepEP">GitHub - deepseek-ai/DeepEP: DeepEP: an efficient expert ...</a></li>
<li><a href="https://www.deepep.org/">DeepEP</a></li>
<li><a href="https://github.com/deepseek-ai/DeepGEMM">GitHub - deepseek-ai/DeepGEMM: DeepGEMM: clean and efficient ...</a></li>
<li><a href="https://huggingface.co/blog/moe">Mixture of Experts Explained</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者强调，DeepEP 有潜力为那些曾因通信开销而挣扎的开源 MoE 实现解锁更高的训练吞吐量。伴随发布的用于 FP8 矩阵乘法的 DeepGEMM 表明，深度求索正在采取协调一致的策略来优化整个 MoE 训练栈。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#moe</code>, <code class="language-plaintext highlighter-rouge">#distributed-training</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#infrastructure</code></p>

<hr />

<p><a id="item-30"></a></p>
<h2 id="mirage-将大语言模型编译为持久化-cuda-巨核-️-9010"><a href="https://github.com/mirage-project/mirage">Mirage 将大语言模型编译为持久化 CUDA 巨核</a> ⭐️ 9.0/10</h2>

<p>Mirage 推出了一种编译器框架，可自动将多 GPU 大语言模型推理转换为单个持久化巨核。该方法融合了所有计算和通信步骤，消除了模型执行过程中频繁的 CPU-GPU 同步需求。 传统的大语言模型推理因内核启动开销和 CPU-GPU 同步瓶颈而面临显著延迟。通过将整个推理图编译为一个持久化内核，Mirage 将延迟降低了 1.2 到 6.7 倍，同时提高了 GPU 利用率。这种优化对于生产环境至关重要，因为低延迟服务直接影响成本和用户体验。 该系统利用流式多处理器（SM）级别的图表示，以单个流式多处理器的粒度捕捉数据依赖关系。它实现了跨算子的软件流水线化和细粒度内核融合，无需开发人员手动干预。通过最小化内核间通信开销，该技术在多 GPU 设置中实现了性能提升。</p>

<p>rss · GitHub Trending - CUDA · Apr 14, 01:34</p>

<p><strong>背景</strong>: 大语言模型推理通常涉及启动数千个小型 CUDA 内核，导致巨大的 CPU 开销和 GPU 资源利用率不足。现有的解决方案如 vLLM 或 TensorRT-LLM 优化了内存管理和算子融合，但仍依赖每个请求的多次内核启动。Mirage 通过将整个推理序列视为驻留在 GPU 上的单个长期运行的持久化内核来解决这一问题。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/mirage-project/mirage">GitHub - mirage-project/mirage: Mirage Persistent Kernel ...</a></li>
<li><a href="https://arxiv.org/abs/2512.22219">Mirage Persistent Kernel: A Compiler and Runtime for Mega ...</a></li>
<li><a href="https://zhihaojia.medium.com/compiling-llms-into-a-megakernel-a-path-to-low-latency-inference-cf7840913c17">Compiling LLMs into a MegaKernel: A Path to Low-Latency ...</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 来自卡内基梅隆大学、英伟达和清华大学的早期基准测试表明，基于 Transformer 的模型获得了显著加速，引发了高频交易和实时聊天应用的兴趣。开发人员特别指出，与手动内核调优工作相比，该方案的集成更加简便。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#compiler</code>, <code class="language-plaintext highlighter-rouge">#gpu-optimization</code>, <code class="language-plaintext highlighter-rouge">#inference</code></p>

<hr />

<p><a id="item-31"></a></p>
<h2 id="dao-ailab-发布优化的因果一维卷积-cuda-内核-️-9010"><a href="https://github.com/Dao-AILab/causal-conv1d">Dao-AILab 发布优化的因果一维卷积 CUDA 内核</a> ⭐️ 9.0/10</h2>

<p>Dao-AILab 发布了一个高度优化的因果深度一维卷积 CUDA 实现，并提供了原生的 PyTorch 接口。该库专门针对 Mamba 等现代序列建模架构中的计算瓶颈进行了优化。 该项目至关重要，因为它是 Mamba 架构的基础依赖项，能够实现线性时间的序列处理，在长上下文场景下性能优于传统 Transformer。通过提供生产级的融合 CUDA 内核，它消除了与此特定模式相关的标准 PyTorch 操作通常带来的性能开销。构建状态空间模型或高效大语言模型的开发者现在可以利用硬件加速的卷积，而无需编写底层 GPU 代码。 该库实现了因果深度卷积，确保任何时间步的输出仅依赖于当前和过去的输入。它具有无缝的 PyTorch 集成，可以直接替换较慢的标准卷积层。其底层 CUDA 内核针对 NVIDIA GPU 的最大吞吐量进行了优化，利用了内核融合等技术。</p>

<p>rss · GitHub Trending - CUDA · Apr 14, 01:34</p>

<p><strong>背景</strong>: 序列建模长期以来一直由 Transformer 主导，但其在处理长序列时存在二次方复杂度的问题。像 Mamba 这样的新架构利用结构化状态空间模型（SSM）结合因果卷积来实现线性扩展。在此次发布之前，高效实现这些特定的因果卷积需要自定义且往往难以获取的 CUDA 编码工作。</p>
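
<p>下面用纯 PyTorch 写出该算子的参考语义（仅为等价的慢速实现，并非该库的融合 CUDA 内核）：因果深度卷积即按通道分组的卷积，且只在序列左侧补零，使位置 t 的输出不依赖任何未来输入：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch, torch.nn.functional as F

# Reference (slow) semantics of a causal depthwise conv1d, not the library's fused kernel:
# grouped conv with one filter per channel, left-padded so position t never sees t+1.
def causal_depthwise_conv1d(x, weight):
    """x: (batch, channels, seqlen); weight: (channels, kernel_size)."""
    channels, kernel_size = weight.shape
    x = F.pad(x, (kernel_size - 1, 0))               # pad on the left only: causality
    return F.conv1d(x, weight.unsqueeze(1), groups=channels)

x = torch.randn(4, 16, 128)
w = torch.randn(16, 4)
y = causal_depthwise_conv1d(x, w)                    # output keeps the (4, 16, 128) shape
</code></pre></div></div>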

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Mamba_(deep_learning_architecture)">Mamba (deep learning architecture)</a></li>
<li><a href="https://developer.nvidia.com/blog/advanced-nvidia-cuda-kernel-optimization-techniques-handwritten-ptx/">Advanced NVIDIA CUDA Kernel Optimization Techniques ...</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: AI 工程社区认为此发布是在生产环境中采用 Mamba 及类似基于 SSM 模型的关键推动因素。高分反映了社区对 Dao-AILab 在交付严谨、高性能 GPU 原语方面声誉的信任。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#pytorch</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#kernels</code>, <code class="language-plaintext highlighter-rouge">#mamba</code></p>

<hr />

<p><a id="item-32"></a></p>
<h2 id="kronos首个面向金融-k-线的开源基础模型-️-8010"><a href="https://github.com/shiyu-coder/Kronos">Kronos：首个面向金融 K 线的开源基础模型</a> ⭐️ 8.0/10</h2>

<p>Kronos 已被 AAAI 2026 录用，并发布了微调脚本以适配特定的量化任务。该项目目前在 Hugging Face 上提供了可访问的模型权重，并推出了预测 BTC/USDT 趋势的在线演示。此次更新标志着专用金融人工智能向开发者普及迈出了重要一步。 与通常在噪声较大的金融数据上表现不佳的通用时间序列模型不同，Kronos 是专门在来自全球 45 多个交易所的 K 线序列上进行预训练的。它引入了一种新颖的两阶段框架，利用分层离散令牌有效地量化连续的 OHLCV 数据。这种专业化使其比通用替代品更能处理高噪声特性和波动率预测等复杂的下游任务。通过开源此基础模型，该项目降低了构建稳健金融科技人工智能应用的门槛，无需巨大的训练成本。 该模型系列由仅解码器 Transformer 组成，提供多种容量规格以适应不同的计算需求。它利用专用令牌器将多维蜡烛图数据转换为离散令牌，然后进行自回归预训练。用户可以通过 Hugging Face 访问基础模型，并利用新发布的脚本进行特定任务的微调。</p>

<p>rss · GitHub Trending - Daily · Apr 14, 01:33</p>

<p><strong>背景</strong>: 传统的时间序列基础模型（TSFM）往往难以应对金融市场数据固有的独特随机性和高噪声水平。以前的解决方案通常依赖非预训练架构，或者未能捕捉到全球各交易所蜡烛图模式的细微“语言”。Kronos 通过将 K 线视为一种独特的语言模态来解决这一差距，利用了类似于大语言模型的大规模预训练，但专为金融结构量身定制。这种方法旨在克服以往模型的局限性，即为了简单的趋势预测而忽视波动率预测等关键任务。</p>
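
<p>以下是对“把连续 OHLCV 蜡烛图离散化为令牌”这一思路的概念性示意（基于逐特征分位数分箱的简化做法，并非 Kronos 实际的分层令牌器）：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

# Conceptual sketch only (not Kronos's actual hierarchical tokenizer): quantile-bin each
# OHLCV feature into discrete tokens so a decoder-only model can train on them like text.
def tokenize_candles(ohlcv, n_bins=32):
    """ohlcv: (T, 5) array of open/high/low/close/volume; returns (T, 5) integer tokens."""
    tokens = np.empty_like(ohlcv, dtype=np.int64)
    for f in range(ohlcv.shape[1]):
        edges = np.quantile(ohlcv[:, f], np.linspace(0, 1, n_bins + 1)[1:-1])
        tokens[:, f] = np.searchsorted(edges, ohlcv[:, f])   # bin index in [0, n_bins)
    return tokens

candles = np.abs(np.random.default_rng(0).normal(size=(256, 5)))  # stand-in price data
print(tokenize_candles(candles)[:3])
</code></pre></div></div>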

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/shiyu-coder/Kronos">Kronos: A Foundation Model for the Language of Financial Markets</a></li>
<li><a href="https://arxiv.org/abs/2508.02739">Kronos: A Foundation Model for the Language of Financial Markets</a></li>
<li><a href="https://huggingface.co/NeoQuasar/Kronos-base">NeoQuasar/Kronos-base · Hugging Face</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 其基础论文被 AAAI 2026 录用，表明其针对金融数据的创新令牌化方法获得了强有力的学术认可。早期采用者对发布的微调脚本特别感兴趣，希望借此为专有交易策略定制模型。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#foundation-model</code>, <code class="language-plaintext highlighter-rouge">#fintech</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#financial-analysis</code>, <code class="language-plaintext highlighter-rouge">#huggingface</code></p>

<hr />

<p><a id="item-33"></a></p>
<h2 id="claude-mem-插件实现-ai-代理会话记忆自动化-️-8010"><a href="https://github.com/thedotmack/claude-mem">Claude-Mem 插件实现 AI 代理会话记忆自动化</a> ⭐️ 8.0/10</h2>

<p>全新的 claude-mem 插件能够自动捕获、压缩并将过往编码会话的相关上下文注入到未来的交互中。它利用 Claude Agent SDK 智能总结代理行为，在不连续的工作流中保持上下文连贯性。该工具有效解决了当前 AI 辅助编程环境中固有的无状态问题。 该项目解决了一个关键瓶颈，即 AI 代理往往会遗忘之前的决策，迫使开发者反复重新解释上下文。通过自动化上下文压缩，它在保留关键历史数据以提升代理性能的同时，显著减少了 Token 消耗。这一增强功能使开发者能够将 AI 代理视为持久的合作伙伴，而非临时的工具。最终，它将范式从手动提示工程转变为自动化上下文工程。 该插件基于官方 Claude Agent SDK 构建，无缝集成现有 Claude Code 工作流，无需人工干预即可管理记忆。它采用 AI 驱动的压缩技术，将庞大的会话日志提炼为适合上下文窗口的简洁可执行摘要。当新会话中出现相关主题时，系统会自动检索并注入这些摘要。</p>

<p>rss · GitHub Trending - Daily · Apr 14, 01:33</p>

<p><strong>背景</strong>: AI 编程助手通常以无状态方式运行，这意味着除非用户明确提供，否则每个新会话都对之前的交互一无所知。这一限制迫使开发者手动复制粘贴上下文，或依赖效率低下且增加成本和延迟的长上下文窗口。此前的解决方案通常需要自定义脚本或外部向量数据库，增加了开发环境的复杂性。Claude-Mem 填补了这一空白，为 Claude 生态系统提供了一个原生的、自动化的会话持久化层。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://docs.claude.com/en/docs/agent-sdk/overview">Agent SDK overview - Claude Code Docs</a></li>
<li><a href="https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents">Effective context engineering for AI agents \ Anthropic</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者强调，该插件减少重复提示的能力是复杂重构任务中的主要生产力提升点。部分用户指出，虽然压缩效果显著，但对于高度专业化的代码库，可能需要微调摘要的密度。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#claude-code</code>, <code class="language-plaintext highlighter-rouge">#ai-agent</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#context-management</code>, <code class="language-plaintext highlighter-rouge">#llm</code></p>

<hr />

<p><a id="item-34"></a></p>
<h2 id="multica用于管理-ai-编码代理的开源平台-️-8010"><a href="https://github.com/multica-ai/multica">Multica：用于管理 AI 编码代理的开源平台</a> ⭐️ 8.0/10</h2>

<p>Multica 推出了一款开源的托管代理平台，通过任务分配、进度跟踪和技能累积，将编码代理视为团队成员。它支持带有实时监控的自主执行，并集成了 Claude Code 和 Codex 等工具。 该项目解决了软件开发中编排多个 AI 代理的关键需求，超越了简单的提示工程，转向结构化的团队工作流。通过允许代理随时间累积技能，它有望提高效率并减少工程团队的重复设置。其开源和自托管特性提供了供应商中立性，这对于关注数据主权和成本控制的企业至关重要。 主要功能包括将代理视为拥有个人资料和看板可见性的队友、自主的任务生命周期管理，以及用于本地和云运行时的统一仪表板。该平台支持可重用技能部署，过去任务的解决方案可以增强整个工作空间未来的代理能力。</p>

<p>rss · GitHub Trending - Daily · Apr 14, 01:33</p>

<p><strong>背景</strong>: 随着 AI 编码助手从单次对话聊天机器人演变为自主代理，开发者在管理长周期任务和有效协调多个代理方面面临挑战。现有解决方案往往缺乏强大的编排层，或将用户锁定在专有云生态系统中。Multica 通过提供模拟人类团队动态的供应商中立基础设施填补了这一空白，实现了可扩展的代理管理，而无需依赖特定的提供商实现。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://platform.claude.com/docs/en/managed-agents/overview">Claude Managed Agents overview - Claude API Docs</a></li>
<li><a href="https://www.anthropic.com/engineering/managed-agents">Scaling Managed Agents: Decoupling the brain from the hands</a></li>
<li><a href="https://agentskillpacks.diguardia.org/blog/self-improving-ai-agents-how-skill-packs-compound-with-every-build/">Self-Improving AI Agents: How Skill Packs Compound With Every ...</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 虽然该项目在简化代理工作流方面显示出巨大潜力，但早期采用者应验证其在当前 README 文档之外的生产成熟度和稳定性。社区反馈对于确定技能累积机制在复杂的现实工程环境中的表现至关重要。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#orchestration</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-35"></a></p>
<h2 id="archon面向-ai-编程的确定性工作流引擎-️-8010"><a href="https://github.com/coleam00/Archon">Archon：面向 AI 编程的确定性工作流引擎</a> ⭐️ 8.0/10</h2>

<p>Archon 作为首个开源构建器正式推出，旨在使 AI 编程过程具有确定性和可重复性。它允许开发者使用 YAML 定义复杂的开发工作流，将 AI 代理与确定性脚本及人工审批环节相结合。该工具将不可预测的 AI 交互转化为结构化、可靠的软件工程流水线。 当前的 AI 编程代理往往产生不一致的结果，常因模型的随机性而跳过规划或测试等关键步骤。Archon 通过强制执行严格的工作流结构解决了这一痛点，确保流程由开发者掌控而非模型决定。通过在独立的 git 工作树中隔离运行并将 AI 节点与 Bash 脚本混合，它保证了每个代码生成任务都遵循经过验证的可重复路径。对于希望在生产环境中集成 AI 而不牺牲可靠性或可审计性的团队而言，这种转变至关重要。 Archon 作为一个工作流引擎运行，用户可在 YAML 文件中定义规划、实施和验证等阶段。它支持通过隔离的 git 工作树进行并行执行，并允许“即发即忘”的操作模式，即在创建拉取请求前暂停以等待人工审查。该系统可移植于 CLI、Web UI 以及 Slack 等聊天平台，确保无论使用何种接口都能保持一致的行为。</p>

<p>rss · GitHub Trending - Daily · Apr 14, 01:33</p>

<p><strong>背景</strong>: 在 Archon 出现之前，AI 编程工具主要依赖单次提示或非结构化的代理循环，导致输出结果缺乏确定性。虽然 GitHub Actions 等工具已标准化了 CI/CD 流程，但尚无同等工具用于编排 AI 编码生命周期本身。Archon 通过将基础设施即代码的原则应用于 AI 代理协调填补了这一空白，其作用类似于 Dockerfiles 对环境设置的标准化。它弥合了实验性 AI 原型设计与严谨软件开发标准之间的差距。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/coleam00/Archon">GitHub - coleam00/Archon: The first open-source harness ...</a></li>
<li><a href="https://aitoolly.com/ai-news/article/2026-04-14-archon-the-first-open-source-ai-coding-test-framework-generator-for-deterministic-and-repeatable-dev">Archon: First Open-Source AI Coding Test Framework Generator</a></li>
<li><a href="https://deepwiki.com/coleam00/Archon/1.1-getting-started">Getting Started | coleam00/Archon | DeepWiki</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者强调，Archon 强制实施测试网关并防止 AI 幻觉式跳过步骤的能力是其优于独立代理的主要优势。社区对其可组合性特别感兴趣，这使得团队能够随着信心增加，逐步用 AI 节点替换确定性脚本节点。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-engineering</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-36"></a></p>
<h2 id="voicebox本地优先的开源语音克隆工作室-️-8010"><a href="https://github.com/jamiepine/voicebox">Voicebox：本地优先的开源语音克隆工作室</a> ⭐️ 8.0/10</h2>

<p>Voicebox 推出了一款桌面应用，集成了包括 Qwen3-TTS 和 Chatterbox Turbo 在内的五种不同 TTS 引擎，用于本地语音克隆和合成。该应用具备多轨时间线编辑器以创作复杂叙事，并能在用户机器上完全本地地实时应用变调、混响等后期处理效果。 该工具通过确保所有语音数据和模型推理严格保留在本地，解决了关键的隐私问题，从而消除了对 ElevenLabs 等云 API 的需求。通过支持 Apple Silicon MLX、CUDA 和 ROCm 等多种硬件加速，它使得高质量语音合成无需持续成本或延迟即可实现。其包含的表达性副语言标签允许开发者为交互式应用生成更自然的语音。 Voicebox 采用 Tauri 和 Rust 构建，在 macOS、Windows 和 Linux 上提供原生性能，同时暴露 REST API 以便无缝集成到其他项目中。它支持 23 种语言，并通过自动分块和交叉淡入淡出技术处理无限长度的文本。</p>

<p>rss · GitHub Trending - Daily · Apr 14, 01:33</p>

<p><strong>背景</strong>: 以往的语音克隆解决方案通常依赖昂贵的云服务，或者需要复杂的命令行设置，使得非研究人员难以部署。Voicebox 填补了一个用户友好的集成工作室的空白，它将多个最先进的开源模型结合到一个图形界面中。与仅处理生成或仅处理编辑的碎片化工具不同，它提供了一个端到端的本地工作流来创建语音驱动的内容。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://voicebox.sh/">Voicebox - Open Source Voice Cloning Desktop App</a></li>
<li><a href="https://localai.computer/guides/run-voice-clone-locally">How to Clone Voices Locally | AI Voice Cloning Guide 2025</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者强调了在本地运行像 Chatterbox Turbo 这样的强大模型而不牺牲质量或表现力的重要性。开发人员赞赏其基于 Rust 的架构，因为与 Electron 替代品相比，它的资源开销更低。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#voice-synthesis</code>, <code class="language-plaintext highlighter-rouge">#text-to-speech</code>, <code class="language-plaintext highlighter-rouge">#audio-ai</code>, <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code></p>

<hr />

<p><a id="item-37"></a></p>
<h2 id="blendermcp-通过-mcp-协议实现大语言模型驱动的-3d-建模-️-8010"><a href="https://github.com/ahujasid/blender-mcp">BlenderMCP 通过 MCP 协议实现大语言模型驱动的 3D 建模</a> ⭐️ 8.0/10</h2>

<p>最新版本 (1.5.5) 引入了对腾讯混元 3D (Hunyuan3D) 和 Hyper3D Rodin 的支持，用于生成式 3D 资产创建。该版本还增加了搜索 Sketchfab 模型、访问 Poly Haven 资源以及查看视口截图以增强场景上下文的功能。用户现在可以在远程主机上运行 MCP 服务器，将部署灵活性扩展到本地机器之外。 该项目利用标准化的模型上下文协议 (MCP)，弥合了自然语言提示与复杂 3D 软件工作流之间的差距。它允许 AI 代理直接操作 Blender 中的对象、材质和场景，无需用户手动编写 Python 脚本。通过集成混元 3D 等生成模型，它将 Blender 从手动工具转变为用于快速原型设计的 AI 辅助副驾驶。这显著降低了程序化 3D 内容创建的门槛。 该系统由一个作为套接字服务器的 Blender 插件和一个独立的 Python MCP 服务器组成，后者促进与 Claude 的双向通信。主要功能包括在 Blender 内执行任意 Python 代码、详细的场景检查以及直接的材质控制。安装需要 Blender 3.0+、Python 3.10+ 以及 ‘uv’ 包管理器以高效处理依赖项。</p>

<p>rss · GitHub Trending - Daily · Apr 14, 01:33</p>

<p><strong>背景</strong>: 在 MCP 出现之前，将大语言模型连接到 Blender 等桌面应用程序通常需要自定义且脆弱的集成，或者手动复制脚本。模型上下文协议为 AI 工具与安全、一致地交互外部系统提供了通用标准。BlenderMCP 填补了这一空白，专为希望自动化场景组装的 3D 艺术家和开发人员启用了代理工作流。它标志着从静态 AI 聊天机器人向能够执行复杂软件任务的主动 AI 代理的转变。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://modelcontextprotocol.io/docs/getting-started/intro">What is the Model Context Protocol (MCP)?</a></li>
<li><a href="https://github.com/Tencent-Hunyuan/Hunyuan3D-2">GitHub - Tencent-Hunyuan/Hunyuan3D-2: High-Resolution 3D ...</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 用户正在积极讨论将视口截图与大语言模型视觉能力相结合的可能性，以提高生成场景中的空间理解能力。社区也在探索远程托管如何启用完全由自然语言控制的基于云的渲染农场。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#mcp</code>, <code class="language-plaintext highlighter-rouge">#blender</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#3d-modeling</code>, <code class="language-plaintext highlighter-rouge">#llm-integration</code></p>

<hr />

<p><a id="item-38"></a></p>
<h2 id="基于单张图像的实时视频换脸工具-️-8010"><a href="https://github.com/hacksider/Deep-Live-Cam">基于单张图像的实时视频换脸工具</a> ⭐️ 8.0/10</h2>

<p>Deep-Live-Cam 推出了一种简化的实时换脸工作流程，仅需单张参考图像即可运行，无需复杂的模型训练。最新版本提供了适用于 Windows、Mac Silicon 及纯 CPU 系统的预构建包，极大地降低了非技术用户的使用门槛。新增的口型遮罩保留和多主体人脸映射功能，进一步提升了实时深伪内容的真实感与应用灵活性。 该项目填补了高保真离线深伪工具与直播及互动媒体中即时视觉操控需求之间的空白。通过优化单次学习算法以实现实时推理，它使内容创作者和开发者能够在无需巨大计算开销的情况下原型化生成式媒体应用。然而，其易用性也显著降低了潜在滥用的门槛，因此要求使用者必须严格遵守伦理准则和法律法规。 该软件支持实时摄像头馈送和视频文件，用户只需三步即可完成换脸：选择源图像、选定摄像头并启动。系统内置了安全检查机制以拦截裸露或暴力等不当内容，并明确强调了用户的法律责任。高级功能包括通过遮罩技术保留原始口型动作，以及在单帧画面中同时为多个主体映射不同的人脸。</p>

<p>rss · GitHub Trending - Daily · Apr 14, 01:33</p>

<p><strong>背景</strong>: 传统的换脸解决方案（如 DeepFaceLab）通常需要在特定数据集上进行数小时的训练才能达到高保真度，因此不适用于直播场景。近期关于单次学习和轻量级框架（如 FastSwap）的研究旨在降低这些计算成本，但用户友好的实现仍然稀缺。Deep-Live-Cam 通过将先进的计算机视觉技术封装为可在消费级硬件上运行的实时工具，填补了这一市场空白。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/ai-forever/ghost">GitHub - ai-forever/ghost: A new one shot face swap approach ...</a></li>
<li><a href="https://www.live-sync.io/">Livesync - Live Face Swap | Real-time Face Swap tool for live ...</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 尽管该项目提供了强有力的免责声明和内容过滤器，但其开源性质仍引发了关于非自愿深伪制作和身份欺诈潜力的持续争论。用户们正在积极讨论预构建二进制文件的便利性与从源代码手动安装的透明度之间的权衡。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#deepfake</code>, <code class="language-plaintext highlighter-rouge">#computer-vision</code>, <code class="language-plaintext highlighter-rouge">#generative-ai</code>, <code class="language-plaintext highlighter-rouge">#real-time</code>, <code class="language-plaintext highlighter-rouge">#face-swap</code></p>

<hr />

<p><a id="item-39"></a></p>
<h2 id="yt-dlpai-数据流水线必备的多媒体下载工具-️-8010"><a href="https://github.com/yt-dlp/yt-dlp">yt-dlp：AI 数据流水线必备的多媒体下载工具</a> ⭐️ 8.0/10</h2>

<p>yt-dlp 作为 youtube-dl 最活跃的分支，通过多线程技术提供了更快的下载速度，并支持数千个视频平台。由于其强大的功能集和频繁的更新，它已在 Ubuntu 22.04 等主要 Linux 发行版中取代了原始工具。该项目持续发展，提供了对现代数据提取至关重要的先进格式选择和字幕嵌入功能。 对于 AI 工程师而言，yt-dlp 是构建用于训练同时处理视频、音频和文本的多模态模型数据集的关键工具。其绕过地理限制和提取元数据的能力确保了机器学习流水线中高质量、多样化的数据收集。与通用爬虫不同，它能可靠地处理复杂的特定站点逻辑，从而减少数据摄入工作流中的工程开销。虽然它本身不是 AI 框架，但它是获取深度学习研究所需原始媒体的基础层。 该工具支持包括 YouTube、Vimeo 和各种新闻媒体在内的 1000 多个网站，并提供自定义格式过滤和归档管理选项。它具有内置的 Cookie 处理、代理支持和自动字幕下载功能，以丰富训练数据的上下文。可以通过 PyPI 或独立可执行文件轻松安装，便于集成到自动化 Python 脚本中。</p>

<p>rss · GitHub Trending - Python · Apr 14, 01:39</p>

<p><strong>背景</strong>: yt-dlp 创建于 2021 年，是在原始项目停止开发并面临法律挑战后，由社区驱动的 youtube-dl 分支。它在非活跃的 youtube-dlc 分支基础上构建，提供了更快的下载速度、更好的提取器维护和增强的参数解析。该工具填补了生产级开源媒体下载器的空白，能够承受网络平台结构的不断变化。它已成为消费者和企业环境中命令行媒体提取的事实标准。</p>
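
<p>以下是在数据流水线中以 Python 方式嵌入 yt-dlp 的最小示例（URL 与输出模板为占位符），选项键与对应的命令行参数一一对应：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from yt_dlp import YoutubeDL

# Minimal embedding example for a data pipeline; the URL and output template are placeholders.
ydl_opts = {
    "format": "bestaudio/best",            # prefer the best audio-only stream, else best overall
    "outtmpl": "data/%(id)s.%(ext)s",      # deterministic file naming for downstream stages
    "writesubtitles": True,                # fetch subtitles when available (useful as text pairs)
    "subtitleslangs": ["en"],
}
with YoutubeDL(ydl_opts) as ydl:
    ydl.download(["https://example.com/some-video"])   # replace with real media URLs
</code></pre></div></div>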

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Yt-dlp">Yt-dlp</a></li>
<li><a href="https://grokipedia.com/page/yt-dlp">yt-dlp</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 社区积极维护该项目，每天提交代码以修复因网站更新布局而损坏的提取器。讨论通常集中在优化下载速度、处理新的 DRM 方案以及与下游数据处理工具的集成上。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#python</code>, <code class="language-plaintext highlighter-rouge">#data-scraping</code>, <code class="language-plaintext highlighter-rouge">#multimodal-ai</code>, <code class="language-plaintext highlighter-rouge">#cli-tool</code>, <code class="language-plaintext highlighter-rouge">#data-engineering</code></p>

<hr />

<p><a id="item-40"></a></p>
<h2 id="pixelle-video全自动-ai-短视频生成引擎-️-8010"><a href="https://github.com/AIDC-AI/Pixelle-Video">Pixelle-Video：全自动 AI 短视频生成引擎</a> ⭐️ 8.0/10</h2>

<p>Pixelle-Video 发布了一款生产级引擎，实现了从脚本撰写到最终渲染的短视频全流程自动化。近期更新增加了动作迁移、数字人口播模块，并支持通过 RunningHub 调用高端 GPU 集群。该项目现在提供预编译的 Windows 整合包和无需代码操作的完整 Web 界面。 该工具通过消除手动剪辑或复杂工作流编排的需求，显著降低了内容创作的门槛。与仅处理文本或图像的碎片化 AI 工具不同，Pixelle-Video 将多模态生成集成到一个连贯的流水线中。其基于 ComfyUI 的模块化架构允许工程师替换 FLUX 或 ChatTTS 等底层模型而不破坏工作流。这使其成为营销和社交媒体领域扩展内容运营的宝贵资产。 该引擎支持包括 GPT、DeepSeek 和 WAN 2.1 在内的多种 AI 模型，用于动态视频生成。它具备灵活的流水线，可自动处理脚本生成、配图规划、逐帧处理和视频合成。用户可以在利用原子能力进行细粒度控制的同时，自定义视觉风格、纵横比和 TTS 音色。</p>

<p>rss · GitHub Trending - Python · Apr 14, 01:39</p>

<p><strong>背景</strong>: 短视频创作通常需要协调脚本书写、素材生成、配音和剪辑等多个独立工具，既耗时又对技术要求高。Pixelle-Video 通过提供端到端的解决方案来解决这一问题，将这些分散的步骤统一为单一的自动化流程。由阿里巴巴 AIDC-AI 团队构建，它填补了稳健开源替代方案的市场空白，以对抗专有的 SaaS 视频生成器。此前的解决方案往往缺乏本地部署选项或定制生成流水线特定阶段的灵活性。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/AIDC-AI/Pixelle-Video">AIDC-AI/Pixelle-Video: AI 全自动短视频引擎 - GitHub</a></li>
<li><a href="https://aidc-ai.github.io/Pixelle-Video/">Pixelle-Video - aidc-ai.github.io</a></li>
<li><a href="https://github.com/AIDC-AI">AIDC-AI · GitHub</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 该仓库因其简化的“Windows 整合包”而受到关注，这使得非技术用户也能轻松安装。开发者们正在积极讨论如何扩展 ComfyUI 后端，以便在新型视频模型可用时进行集成。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-video</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#multimodal</code>, <code class="language-plaintext highlighter-rouge">#content-creation</code>, <code class="language-plaintext highlighter-rouge">#python</code></p>

<hr />

<p><a id="item-41"></a></p>
<h2 id="omniroute支持智能路由和-mcp-协议的统一-ai-网关-️-8010"><a href="https://github.com/diegosouzapw/OmniRoute">OmniRoute：支持智能路由和 MCP 协议的统一 AI 网关</a> ⭐️ 8.0/10</h2>

<p>OmniRoute 推出了一款基于 TypeScript 的 AI 网关，通过单一的 OpenAI 兼容端点统一接入超过 100 个大模型提供商。它具备智能路由、自动故障转移、缓存功能，并新集成了包含 25 种工具的模型上下文协议（MCP）服务器。该项目还包含了 Electron 桌面应用以及对 A2A 协议的支持，以增强代理间的互操作性。 该工具通过自动故障转移到免费或低成本模型来防止停机，解决了生产环境中对可靠性和成本优化的关键需求。通过 MCP 协议标准化交互，它简化了 AI 应用连接外部数据源和工具的过程，无需定制集成。其对免费模型的高度重视使其对于原型开发成本敏感应用的初创公司和开发者特别有价值。然而，需要严格服务等级协议（SLA）的企业可能会发现其专注于“免费”层级不太适合任务关键型的稳定性要求。 该网关支持跨越 100 多个提供商的多种模态，包括聊天补全、嵌入、图像生成和网络搜索。关键技术能力包括语义缓存、速率限制、负载均衡和全面的可观察性日志。MCP 服务器的加入使得该网关能够作为 AI 代理访问文件系统、数据库和其他外部资源的标准化桥梁。</p>

<p>rss · GitHub Trending - TypeScript · Apr 14, 01:41</p>

<p><strong>背景</strong>: AI 工程师通常在管理多个 API 密钥、处理特定提供商的速率限制以及在依赖单一供应商时确保正常运行时间方面面临困难。像 LiteLLM 这样的先前解决方案提供了类似的路由功能，但 OmniRoute 通过强烈关注免费模型聚合和内置 MCP 服务器能力而脱颖而出。该项目填补了一个轻量级、开发者友好型网关的空白，优先考量成本效益并为代理工作流提供无缝的工具集成。</p>
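
<p>由于网关暴露的是 OpenAI 兼容端点，标准 openai 客户端只需把 base_url 指向网关即可复用。下面的地址、端口与模型名均为假设的占位值，并非 OmniRoute 文档给出的配置：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from openai import OpenAI

# Hedged sketch: the base_url, port, and model name below are placeholders/assumptions,
# not values documented by OmniRoute; the point is reusing the standard OpenAI client.
client = OpenAI(base_url="http://localhost:3000/v1", api_key="placeholder-key")

resp = client.chat.completions.create(
    model="some-provider/some-model",      # the gateway routes and falls back behind this name
    messages=[{"role": "user", "content": "Summarize today's AI news in one sentence."}],
)
print(resp.choices[0].message.content)
</code></pre></div></div>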

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/diegosouzapw/OmniRoute">GitHub - diegosouzapw/OmniRoute: OmniRoute is an AI gateway ...</a></li>
<li><a href="https://omniroute.online/">OmniRoute — Free AI Gateway for Multi-Provider LLMs</a></li>
<li><a href="https://en.wikipedia.org/wiki/Model_Context_Protocol">Model Context Protocol - Wikipedia</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者强调了自动故障转移机制在提供商中断期间保持服务连续性的实用性。一些用户指出，虽然免费模型的重点非常适合测试，但生产团队在全面部署前应仔细评估延迟和质量的一致性。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-gateway</code>, <code class="language-plaintext highlighter-rouge">#llm-routing</code>, <code class="language-plaintext highlighter-rouge">#typescript</code>, <code class="language-plaintext highlighter-rouge">#model-serving</code>, <code class="language-plaintext highlighter-rouge">#cost-optimization</code></p>

<hr />

<p><a id="item-42"></a></p>
<h2 id="nvidia-cuopt用于车辆路径规划的-gpu-加速求解器-️-8010"><a href="https://github.com/NVIDIA/cuopt">NVIDIA cuOpt：用于车辆路径规划的 GPU 加速求解器</a> ⭐️ 8.0/10</h2>

<p>NVIDIA 发布了 cuOpt，这是一个专为在 GPU 上解决大规模决策优化问题而设计的高性能库。它通过利用大规模并行计算，针对车辆路径问题（VRP）等复杂的物流挑战提供了解决方案。该工具标志着运筹学领域从基于 CPU 的启发式算法向 GPU 加速的精确及启发式求解器的转变。 传统求解器在处理成千上万个节点的实时路径规划时，往往因计算强度过大而难以应对，导致物流方案次优。cuOpt 通过利用 NVIDIA 的 CUDA 架构解决了这一瓶颈，将求解速度提升了数个数量级。对于构建动态供应链系统、网约车平台和需要即时重优化的最后一公里配送网络的 AI 工程师而言，这种能力至关重要。通过将组合优化任务卸载到 GPU，团队能够以更快的速度进行迭代，并处理以前无法企及的大规模问题。 该库专注于分配和路径规划问题，相较于 OR-Tools 等基于 CPU 的替代方案，在处理大型数据集时提供了显著的性能提升。它可以集成到现有的 Python 工作流中，但需要兼容的 NVIDIA 硬件才能运行。虽然高度专业化，但它并不取代通用的机器学习框架，而是作为专门用于运筹学任务的引擎。</p>

<p>rss · GitHub Trending - CUDA · Apr 14, 01:34</p>

<p><strong>背景</strong>: 物流领域的决策优化历来依赖于以 CPU 为中心的求解器，随着问题复杂性和数据量的增加，其扩展性表现不佳。随着电子商务和按需服务的增长，对解决具有严格时间窗口的车辆路径问题的需求已经超过了传统计算能力的极限。cuOpt 通过将此前常见于深度学习的 GPU 加速技术应用于经典运筹学算法，填补了这一空白。这种方法使得快速评估以前因计算成本过高而无法触及的巨大解空间成为可能。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://deepwiki.com/databricks-industry-solutions/routing/5.2-gpu-accelerated-pipeline">GPU-Accelerated Pipeline | databricks-industry-solutions ...</a></li>
<li><a href="https://arxiv.org/abs/2506.17357">Speeding up Local Optimization in Vehicle Routing with Tensor ...</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期的讨论强调了对大规模 VRP 实例令人印象深刻的加速效果，尽管用户也指出了需要特定 GPU 硬件这一门槛。一些开发人员正在将其集成的便捷性与成熟的 CPU 库进行比较，并指出调整 GPU 特定参数具有更陡峭的学习曲线。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#optimization</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#logistics</code>, <code class="language-plaintext highlighter-rouge">#nvidia</code>, <code class="language-plaintext highlighter-rouge">#operations-research</code></p>

<hr />

<p><a id="item-43"></a></p>
<h2 id="ralph基于-git-持久化记忆的自主-ai-代理循环-️-7010"><a href="https://github.com/snarktank/ralph">Ralph：基于 Git 持久化记忆的自主 AI 代理循环</a> ⭐️ 7.0/10</h2>

<p>Ralph 引入了一种新颖的自主编码模式，能够迭代执行 Amp 或 Claude Code 等 AI 工具，直至完成所有产品需求文档（PRD）事项。与持续占用上下文的代理不同，它在每次迭代时重置上下文，仅通过 Git 历史记录和结构化 JSON 文件来持久化状态和记忆。这种方法有效地将任务执行与上下文窗口限制解耦。 长期运行的自主代理常因上下文窗口溢出或无关信息积累（即上下文污染）而失败。Ralph 通过强制每一步都从“干净”的状态开始，解决了这一可靠性问题，确保 AI 仅专注于 PRD 中定义的当前任务。利用 Git 作为记忆的唯一事实来源，它建立了一条稳健且可审计的开发轨迹，防止了长会话中的幻觉漂移。这使得工程团队在实施复杂的多步骤功能时更加稳定可靠。 该系统需要 Git 仓库支持，并兼容 Amp CLI 或 Anthropic 的 Claude Code 等 AI 编码工具。它利用特定技能将 Markdown 格式的 PRD 转换为结构化的 <code class="language-plaintext highlighter-rouge">prd.json</code> 文件，以此驱动自主循环。用户可以配置自动交接功能，以处理超出单个上下文窗口的大型故事，确保跨迭代的无缝连续性。</p>

<p>rss · GitHub Trending - Daily · Apr 14, 01:33</p>

<p><strong>背景</strong>: 传统的 LLM 编排框架通常在长周期任务中难以保持连贯性，因为它们依赖将历史记录不断追加到增长的上下文窗口中。随着会话延长，受限于令牌数量和关键指令被稀释，性能往往会下降。Ralph 通过采用无状态执行模型解决了这一问题，其环境状态通过版本控制外部管理，而非依赖内部记忆缓冲区。这将范式从对话连续性转变为事务性任务完成。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.ibm.com/think/topics/llm-orchestration">What is LLM orchestration? - IBM</a></li>
<li><a href="https://www.geeksforgeeks.org/artificial-intelligence/what-is-llm-orchestration/">What is llm orchestration? - GeeksforGeeks</a></li>
<li><a href="https://aimultiple.com/llm-orchestration">LLM Orchestration in 2026: Top 22 frameworks and gateways</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者强调了“每次迭代清理上下文”模式在减少复杂重构任务中代理幻觉方面的有效性。其与标准 Git 工作流的集成因使代理行为透明且易于回滚而受到赞誉。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#autonomous-coding</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#llm-orchestration</code>, <code class="language-plaintext highlighter-rouge">#automation</code></p>

<hr />

<p><a id="item-44"></a></p>
<h2 id="gsd防止-ai-上下文退化的元提示系统-️-7010"><a href="https://github.com/gsd-build/get-shit-done">GSD：防止 AI 上下文退化的元提示系统</a> ⭐️ 7.0/10</h2>

<p>get-shit-done (GSD) 项目推出了一种专为 Claude Code 和 Cursor 等 CLI 类 AI 编程助手设计的轻量级、规范驱动的元提示系统。该系统通过主动进行上下文工程，有效防止“上下文退化”，即随着对话历史填满上下文窗口而导致模型性能下降的现象。 随着 AI 编程代理处理的任务日益复杂，保持高质量的上下文对于避免长会话中的幻觉和逻辑错误至关重要。GSD 通过强制执行结构化的规范驱动工作流来解决这一问题，使 AI 专注于当前目标，而不是迷失在累积的噪声中。这种方法对于依赖自主代理进行多步重构或功能开发而无需频繁人工干预的工程师尤其有价值。 该工具作为一个元提示层，拦截并优化用户与各种由大语言模型驱动的编码工具之间的交互。它支持包括 Claude Code、Gemini CLI、Copilot 和 Cursor 在内的广泛生态系统，并在 Mac、Windows 和 Linux 上无缝运行。通过利用严格的规范格式，它确保 AI 代理在整个会话中始终遵循定义的项目目标。</p>

<p>rss · GitHub Trending - Daily · Apr 14, 01:33</p>

<p><strong>背景</strong>: 上下文退化是大语言模型中一个公认的局限性，即无关或过多的历史数据会稀释模型的注意力机制，导致输出质量下降。传统的提示工程通常依赖手动摘要或窗口滑动，这可能导致关键约束或指令的丢失。GSD 通过自动化上下文管理填补了这一空白，它利用可重用的分步框架，动态地将相关规范置于原始聊天记录之上进行优先处理。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://grokipedia.com/page/Context_Rot">Context Rot</a></li>
<li><a href="https://www.ibm.com/think/topics/meta-prompting">What is meta prompting? - IBM</a></li>
<li><a href="https://grokipedia.com/page/250713334">250713334</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 来自大型科技公司的早期采用者称赞该工具，认为其产生的结果优于 SpecKit 或 Taskmaster 等其他规范驱动框架。用户强调其没有过度工程化，并且在提供清晰规范时能够可靠地执行复杂的构建任务。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#prompt-engineering</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#context-management</code></p>

<hr />

<p><a id="item-45"></a></p>
<h2 id="专为令牌高效-ai-代理优化的-playwright-cli-️-7010"><a href="https://github.com/microsoft/playwright-cli">专为令牌高效 AI 代理优化的 Playwright CLI</a> ⭐️ 7.0/10</h2>

<p>微软发布了一款专为 Claude Code 和 GitHub Copilot 等编码代理设计的 Playwright CLI，并将其作为 SKILLS 运行。该工具用简洁的命令行调用取代了冗长的模型上下文协议（MCP）模式，从而在浏览器自动化任务中显著降低令牌消耗。 该版本通过最小化工具定义的开销，解决了高吞吐量 AI 编码代理中上下文窗口受限的关键问题。通过避免将庞大的无障碍树和复杂模式加载到 LLM 上下文中，它使代理能更有效地平衡浏览器自动化与代码推理。这标志着一种向基于 CLI 的工作流的战略转变，适用于令牌效率优于持久状态内省需求的场景。 该工具支持通过内存或磁盘持久化进行会话管理，并允许用户安装特定技能以增强代理能力。它默认在无头模式下运行，但支持有头模式以便调试，并可直接集成到现有的 Node.js 环境中。与适合长周期自主循环的 MCP 不同，此 CLI 专为快速、离散的自动化命令而优化。</p>

<p>rss · GitHub Trending - TypeScript · Apr 14, 01:41</p>

<p><strong>背景</strong>: 随着 AI 编码代理的普及，通过大语言模型与外部工具交互的成本（尤其是令牌使用量）已成为瓶颈。传统的模型上下文协议（MCP）等方法虽提供丰富的内省功能，但往往因冗长的模式而消耗过多的上下文窗口空间。该项目填补了对轻量级、命令驱动界面的需求，利用成熟的 Playwright 生态系统，同时避免了全状态序列化的沉重开销。</p>

<details><summary>References</summary>
<ul>
<li><a href="https://testdino.com/blog/playwright-skill/">Playwright Skill: Train Your AI Agent to Write Better Tests</a></li>
<li><a href="https://github.com/testdino-hq/playwright-skill">GitHub - testdino-hq/playwright-skill: TestDino Playwright ...</a></li>
<li><a href="https://tech-insider.org/playwright-tutorial-end-to-end-testing-2026/">How to Master Playwright Testing: 13-Step Tutorial [2026]</a></li>
<li><a href="https://learn.microsoft.com/en-us/azure/developer/ai/intro-agents-mcp">Build Agents using Model Context Protocol on Azure</a></li>
<li><a href="https://medium.com/ai-insights-cobet/model-context-protocol-mcp-in-agentic-ai-architecture-and-industrial-applications-7e18c67e2aa7">Model Context Protocol (MCP) in Agentic AI: Architecture and ...</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adoption has focused on integrating these skills into CI/CD pipelines so agents can generate and execute tests quickly without maintaining long-lived browser state. Developers are comparing the approach with MCP to find the best trade-off between token savings and the depth of environmental awareness needed for complex debugging.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#playwright</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#cli</code>, <code class="language-plaintext highlighter-rouge">#browser-automation</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code></p>

<hr />

<p><a id="item-46"></a></p>
<h2 id="gpumd基于-cuda-gpu-的高性能分子动力学模拟引擎-️-7010-1"><a href="https://github.com/brucefan1983/GPUMD">GPUMD：基于 CUDA GPU 的高性能分子动力学模拟引擎</a> ⭐️ 7.0/10</h2>

<p>GPUMD is a molecular dynamics package optimized to run on graphics processors using the NVIDIA CUDA architecture. It tackles the computational bottleneck of simulating large atomic systems by exploiting the massive parallelism of GPUs for force evaluation and integration steps, enabling researchers to run long-timescale simulations that are typically impractical on conventional CPU-based clusters. For AI engineers working on scientific discovery or materials informatics, GPUMD serves as a key data-generation engine for building high-fidelity training datasets. By accelerating simulations of physical interactions, it enables rapid prototyping of machine-learning potentials that require large volumes of quantum-mechanical or classical trajectory data, bridging the gap between raw computational physics and the data appetite of modern deep learning models in the sciences. The package supports a variety of interatomic potentials and integrates tightly with the CUDA ecosystem to maximize throughput on both consumer and enterprise GPUs. It is particularly known for implementing the spectral neighbor analysis potential (SNAP) and other machine-learning-ready force fields. Users can expect substantial speedups over CPU-only codes such as LAMMPS when running compatible workloads on supported hardware.</p>

<p>rss · GitHub Trending - CUDA · Apr 14, 01:34</p>

<p><strong>Background</strong>: Traditional molecular dynamics simulations rely on CPU clusters, which are often slow and expensive at the system sizes modern materials science requires. General-purpose HPC tools exist, but they typically lack the specific optimizations needed to fully exploit the thousands of cores in modern GPUs. GPUMD fills this gap with a dedicated, lightweight engine designed from the ground up for GPU acceleration, avoiding the overhead of more general frameworks.</p>
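
<p>As a sketch of the data-generation workflow mentioned above, the snippet below collects energies and forces from a trajectory written in extended-XYZ format into arrays ready for fitting a machine-learning potential. It assumes ASE is installed and that the trajectory file (here called <code class="language-plaintext highlighter-rouge">dump.xyz</code>) carries energy and force fields; the filename and the availability of those fields are assumptions about a particular run, not guarantees from GPUMD itself.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np
from ase.io import read

def trajectory_to_dataset(path="dump.xyz"):
    """Turn an extended-XYZ trajectory into (positions, energies, forces) lists
    suitable as training data for a machine-learning potential."""
    frames = read(path, index=":")          # list of ase.Atoms, one per frame
    positions = [atoms.get_positions() for atoms in frames]
    energies = np.array([atoms.get_potential_energy() for atoms in frames])
    forces = [atoms.get_forces() for atoms in frames]
    return positions, energies, forces

if __name__ == "__main__":
    pos, e, f = trajectory_to_dataset()
    print(f"{len(e)} frames, mean energy {e.mean():.3f} eV")
</code></pre></div></div>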

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Molecular_dynamics_simulation">Molecular dynamics simulation</a></li>
<li><a href="https://en.wikipedia.org/wiki/CUDA">CUDA - Wikipedia</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project has gained traction in the computational physics community for balancing performance on specific potentials with ease of use. Developers and researchers frequently discuss its use in training neural network potentials and its strong scaling on single-node, multi-GPU setups.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#molecular-dynamics</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#hpc</code>, <code class="language-plaintext highlighter-rouge">#computational-physics</code>, <code class="language-plaintext highlighter-rouge">#gpu</code></p>

<hr />]]></content><author><name></name></author><summary type="html"><![CDATA[From 122 items, 46 important content pieces were selected]]></summary></entry><entry xml:lang="en"><title type="html">Horizon Summary: 2026-04-14 (EN)</title><link href="https://ming-321.github.io/horizon/2026/04/13/summary-en.html" rel="alternate" type="text/html" title="Horizon Summary: 2026-04-14 (EN)" /><published>2026-04-13T16:00:00+00:00</published><updated>2026-04-13T16:00:00+00:00</updated><id>https://ming-321.github.io/horizon/2026/04/13/summary-en</id><content type="html" xml:base="https://ming-321.github.io/horizon/2026/04/13/summary-en.html"><![CDATA[<blockquote>
  <p>From 110 items, 47 important content pieces were selected</p>
</blockquote>

<hr />

<h3 id="头条速递">头条速递</h3>
<ol>
  <li><a href="#item-1">Critical Kernel Vulnerabilities Found in Kingsoft and 360 Antivirus Drivers</a> ⭐️ 9.0/10</li>
  <li><a href="#item-2">Malicious Actor Buys 30 WordPress Plugins to Inject Backdoors</a> ⭐️ 8.0/10</li>
  <li><a href="#item-3">Simon Willison demos local audio transcription with Gemma 4 and MLX</a> ⭐️ 8.0/10</li>
  <li><a href="#item-4">Anthropic’s Mythos Model Sparks Controversy Over Alleged ByteDance Seed Tech Usage</a> ⭐️ 8.0/10</li>
  <li><a href="#item-5">TurboOCR Achieves 1,200 Images/Second via TensorRT and CUDA Optimization</a> ⭐️ 8.0/10</li>
  <li><a href="#item-6">Depth-Recurrent Transformers Improve Generalization Without Intermediate Supervision</a> ⭐️ 8.0/10</li>
  <li><a href="#item-7">Third-Party Benchmarks Show Claude Opus 4.6 Hallucination Surge and Ranking Drop</a> ⭐️ 8.0/10</li>
  <li><a href="#item-8">EU Plans to Classify ChatGPT as Very Large Online Search Engine</a> ⭐️ 8.0/10</li>
  <li><a href="#item-9">Cloudflare Data Shows AI Giants Disrupting Web Balance, Anthropic Accused of Worst Offense</a> ⭐️ 8.0/10</li>
  <li><a href="#item-10">US BIS Staff Shortages Stall Nvidia AI Chip Exports</a> ⭐️ 8.0/10</li>
  <li><a href="#item-11">Cloudflare Engineers Detail Architecture for Unified CLI</a> ⭐️ 7.0/10</li>
  <li><a href="#item-12">Steve Yegge Claims Google’s AI Adoption Mirrors John Deere</a> ⭐️ 7.0/10</li>
  <li><a href="#item-13">Bryan Cantrill Argues LLMs Lack Beneficial Human Laziness</a> ⭐️ 7.0/10</li>
  <li><a href="#item-14">Google Integrates Rust into Pixel 10 Modem for Enhanced Safety</a> ⭐️ 7.0/10</li>
  <li><a href="#item-15">Max Welling to Host AMA on AI4Science, GNNs, and CuspAI</a> ⭐️ 7.0/10</li>
  <li><a href="#item-16">Apple Developing Display-Less Smart Glasses with Advanced Camera to Rival Meta</a> ⭐️ 7.0/10</li>
  <li><a href="#item-17">Ramp Report Predicts Anthropic to Surpass OpenAI in Enterprise Market Within Two Months</a> ⭐️ 7.0/10</li>
  <li><a href="#item-18">Meta Developing AI Clone of CEO Mark Zuckerberg for Internal Use</a> ⭐️ 7.0/10</li>
</ol>

<h3 id="关注动态">关注动态</h3>
<ol>
  <li><a href="#item-19">MemSearch Updates: 2 updates — extend git-root collection fix to codex/opencode skills; async s…, derive memory-recall collection from git root (#324) (#330)</a> ⭐️ ?/10</li>
  <li><a href="#item-20">openai/codex: 2 releases — rust-v0.121.0-alpha.6, rust-v0.121.0-alpha.4</a> ⭐️ ?/10</li>
  <li><a href="#item-21">anthropics/claude-code: 2 releases — v2.1.105, v2.1.104</a> ⭐️ ?/10</li>
  <li><a href="#item-22">upstash/context7: 2 releases — @upstash/context7-mcp@2.1.8, ctx7@0.3.12</a> ⭐️ ?/10</li>
</ol>

<h3 id="github-热榜">GitHub 热榜</h3>
<ol>
  <li><a href="#item-23">Karpathy Releases Minimal LLM Training in Raw C and CUDA</a> ⭐️ 10.0/10</li>
  <li><a href="#item-24">SageAttention Delivers 2-5x Speedup Over FlashAttention via 8-bit Quantization</a> ⭐️ 10.0/10</li>
  <li><a href="#item-25">VoxCPM2: Tokenizer-Free Multilingual TTS with Voice Cloning</a> ⭐️ 9.0/10</li>
  <li><a href="#item-26">Firecrawl: Web Data API Optimized for AI Agents</a> ⭐️ 9.0/10</li>
  <li><a href="#item-27">Chrome DevTools MCP Bridges AI Agents and Browser Debugging</a> ⭐️ 9.0/10</li>
  <li><a href="#item-28">DeepEP Optimizes Expert Parallelism for Large MoE Models</a> ⭐️ 9.0/10</li>
  <li><a href="#item-29">Mirage Compiles LLMs into Persistent CUDA Mega-Kernels</a> ⭐️ 9.0/10</li>
  <li><a href="#item-30">Nous Research Launches Self-Improving Hermes Agent Framework</a> ⭐️ 8.0/10</li>
  <li><a href="#item-31">Kronos: First Open-Source Foundation Model for Financial K-Lines</a> ⭐️ 8.0/10</li>
  <li><a href="#item-32">Microsoft MarkItDown: LLM-Ready Document Conversion</a> ⭐️ 8.0/10</li>
  <li><a href="#item-33">Multica Orchestrates Autonomous Coding Agents as Teammates</a> ⭐️ 8.0/10</li>
  <li><a href="#item-34">Archon: Deterministic Workflow Engine for AI Coding</a> ⭐️ 8.0/10</li>
  <li><a href="#item-35">Claude-Mem: Automated Context Memory for Claude Code Agents</a> ⭐️ 8.0/10</li>
  <li><a href="#item-36">RustFS: High-Performance S3-Compatible Storage in Rust</a> ⭐️ 8.0/10</li>
  <li><a href="#item-37">Ralph: Autonomous AI Agent Loop for PRD Execution</a> ⭐️ 8.0/10</li>
  <li><a href="#item-38">yt-dlp: Essential CLI Tool for AI Data Collection</a> ⭐️ 8.0/10</li>
  <li><a href="#item-39">Reverse-Engineering Google’s SynthID Watermark via Spectral Analysis</a> ⭐️ 8.0/10</li>
  <li><a href="#item-40">Voicebox: Local-First Desktop Studio for Voice Cloning</a> ⭐️ 8.0/10</li>
  <li><a href="#item-41">OpenMetadata: Unified Platform for Data Governance and Lineage</a> ⭐️ 8.0/10</li>
  <li><a href="#item-42">Letta Code: Persistent Memory for AI Coding Agents</a> ⭐️ 8.0/10</li>
  <li><a href="#item-43">NVIDIA NCCL Tests: Essential Multi-GPU Benchmarking Suite</a> ⭐️ 8.0/10</li>
  <li><a href="#item-44">ThunderKittens Simplifies High-Performance CUDA Kernel Development</a> ⭐️ 8.0/10</li>
  <li><a href="#item-45">DeepTutor: Agent-Native Personalized AI Tutoring System</a> ⭐️ 7.0/10</li>
  <li><a href="#item-46">InsForge Launches Backend Platform for AI Agent Development</a> ⭐️ 7.0/10</li>
  <li><a href="#item-47">GPUMD: High-Performance GPU Molecular Dynamics Engine</a> ⭐️ 7.0/10</li>
</ol>

<h2 id="头条速递-1">头条速递</h2>

<p><a id="item-1"></a></p>
<h2 id="critical-kernel-vulnerabilities-found-in-kingsoft-and-360-antivirus-drivers-️-9010"><a href="https://x.com/weezerOSINT/status/2043539810833568202?s=20">Critical Kernel Vulnerabilities Found in Kingsoft and 360 Antivirus Drivers</a> ⭐️ 9.0/10</h2>

<p>Security researcher Patrick Saif disclosed severe kernel driver vulnerabilities in Kingsoft Antivirus and 360 Security Guard that allow unauthenticated privilege escalation. The Kingsoft firewall driver suffers from an IOCTL size calculation error causing a kernel heap overflow, while the 360 anti-Rootkit driver can bypass signature checks via process hollowing and uses hardcoded AES keys for arbitrary kernel read/write access. Both drivers possess valid digital signatures, making them prime candidates for Bring Your Own Vulnerable Driver (BYOVD) attacks. These vulnerabilities are critical because they enable attackers to escalate from standard user privileges to SYSTEM level access without needing to install malicious software on the target machine. Since the drivers are signed by trusted authorities (EV or WHQL), they can bypass modern security controls like HVCI and are not currently blocked by default lists. This poses a direct threat to system integrity and AI infrastructure, as attackers can hide malicious activities by modifying kernel callback tables or terminating processes protected by Protected Process Light (PPL). The vulnerabilities have been submitted to the LOLDrivers database but currently lack CVE identifiers and are not on the HVCI blocklist. Exploitation allows attackers to bypass KASLR, steal kernel credentials, and execute arbitrary code via signed drivers that are already present or easily loadable. Enterprises are advised to add the specific driver hashes to their EDR detection rules immediately to mitigate risks before vendors release patches.</p>

<p>telegram · zaihuapd · Apr 13, 13:56</p>

<p><strong>Background</strong>: BYOVD (Bring Your Own Vulnerable Driver) attacks involve loading legitimate but vulnerable signed drivers to bypass security solutions and gain kernel-level control. Kernel drivers operate at the highest privilege level in an operating system, meaning a flaw in them can compromise the entire system’s security model. Protected Process Light (PPL) is a Windows security feature designed to protect critical processes from being tampered with, even by administrators, unless a specific kernel vulnerability is exploited.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://cymulate.com/blog/defending-against-bring-your-own-vulnerable-driver-byovd-attacks/">What are BYOVD Attacks ? - Cymulate</a></li>
<li><a href="https://www.picussecurity.com/resource/blog/what-are-bring-your-own-vulnerable-driver-byovd-attacks">What Are Bring Your Own Vulnerable Driver ( BYOVD ) Attacks ?</a></li>
<li><a href="https://github.com/RedCursorSecurityConsulting/PPLKiller">Tool to bypass LSA Protection (aka Protected Process Light)</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cybersecurity</code>, <code class="language-plaintext highlighter-rouge">#kernel-exploits</code>, <code class="language-plaintext highlighter-rouge">#byovd</code>, <code class="language-plaintext highlighter-rouge">#antivirus</code>, <code class="language-plaintext highlighter-rouge">#vulnerability-disclosure</code></p>

<hr />

<p><a id="item-2"></a></p>
<h2 id="malicious-actor-buys-30-wordpress-plugins-to-inject-backdoors-️-8010"><a href="https://anchor.host/someone-bought-30-wordpress-plugins-and-planted-a-backdoor-in-all-of-them/">Malicious Actor Buys 30 WordPress Plugins to Inject Backdoors</a> ⭐️ 8.0/10</h2>

<p>A malicious actor successfully acquired ownership of 30 popular WordPress plugins and injected backdoors into their codebases. This supply chain attack allows the attacker to potentially compromise thousands of websites that automatically updated to the compromised versions. The incident highlights a growing trend where attackers purchase established software projects rather than creating new malicious ones from scratch. This incident exposes a critical vulnerability in the open-source ecosystem where trust is built on historical reputation rather than continuous verification. It demonstrates how the acquisition of software assets can bypass traditional security checks that focus on new submissions or code changes by unknown authors. The attack affects the broader software supply chain, suggesting that any package manager relying on centralized trust models is susceptible to similar takeover strategies. Ultimately, this forces developers and organizations to reconsider how they vet and monitor third-party dependencies throughout the software lifecycle. The attack vector relied on the legitimate transfer of plugin ownership, meaning the malicious code was introduced by an entity with full administrative rights. Because the plugins were already trusted and widely installed, automatic update mechanisms distributed the backdoor to victims without raising immediate suspicion. This method effectively inherits years of user trust built by the original developers, making detection significantly harder than with newly created malicious packages.</p>

<p>hackernews · speckx · Apr 13, 17:54</p>

<p><strong>Background</strong>: WordPress is a content management system that powers a significant portion of the web, relying heavily on a vast ecosystem of third-party plugins for extended functionality. These plugins are often developed by individuals or small teams and are distributed through a central repository where users can install and update them automatically. Supply chain attacks occur when attackers compromise the software development or distribution process to inject malicious code into legitimate applications. Historically, security efforts have focused on scanning code for vulnerabilities, but fewer defenses exist against the social engineering aspect of buying a trusted project to abuse its reputation.</p>

<p><strong>Discussion</strong>: Community members express deep concern about the fragility of current dependency management systems, noting that projects often rely on dozens of transitive dependencies that authors cannot fully verify. Some participants argue that increased automation in vulnerability discovery is less threatening than these structural supply chain weaknesses inherent in modern tech stacks. Others discuss failed initiatives like the FAIR package manager, which aimed to mitigate such risks through decentralized architectures but lost momentum after previous controversies.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#supply-chain-security</code>, <code class="language-plaintext highlighter-rouge">#wordpress</code>, <code class="language-plaintext highlighter-rouge">#backdoor</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#cybersecurity</code></p>

<hr />

<p><a id="item-3"></a></p>
<h2 id="simon-willison-demos-local-audio-transcription-with-gemma-4-and-mlx-️-8010"><a href="https://simonwillison.net/2026/Apr/12/mlx-audio/#atom-everything">Simon Willison demos local audio transcription with Gemma 4 and MLX</a> ⭐️ 8.0/10</h2>

<p>Simon Willison published a step-by-step recipe using <code class="language-plaintext highlighter-rouge">uv run</code> to transcribe audio files locally on macOS with the new 10.28 GB Gemma 4 E2B model. The workflow leverages the <code class="language-plaintext highlighter-rouge">mlx-vlm</code> library to process audio input directly on Apple Silicon, successfully transcribing a 14-second voice memo in his test. This method allows developers to run Google’s latest Omni model without sending data to external servers. This development is significant because it demonstrates that powerful, large-scale audio-capable models can now run efficiently on consumer hardware like MacBooks. By enabling local execution, it addresses critical privacy concerns for sensitive audio data while eliminating cloud API costs and latency. It also highlights the maturing ecosystem around Apple’s MLX framework, making advanced AI accessible to individual developers rather than just large enterprises. Compared to previous solutions requiring heavy GPU clusters, this brings state-of-the-art speech-to-text capabilities to the edge. The specific command uses Python 3.13 and requires installing <code class="language-plaintext highlighter-rouge">mlx_vlm</code>, <code class="language-plaintext highlighter-rouge">torchvision</code>, and <code class="language-plaintext highlighter-rouge">gradio</code> via <code class="language-plaintext highlighter-rouge">uv</code>. The model used is <code class="language-plaintext highlighter-rouge">google/gemma-4-e2b-it</code>, which occupies approximately 10.28 GB of memory, and the test generated output with a temperature of 1.0 and a max token limit of 500. While the transcription was largely accurate, the author noted minor errors where ‘right here’ was interpreted as ‘front here’, indicating room for improvement in handling specific phonetic nuances.</p>

<p>rss · Simon Willison · Apr 12, 23:57</p>

<p><strong>Background</strong>: MLX is an array framework for machine learning research developed by Apple specifically optimized for Apple Silicon chips. Gemma 4 is Google’s latest family of open models, with the ‘E2B’ variant being a smaller, efficient version suitable for edge devices, featuring support for text, images, and audio (Omni models). The <code class="language-plaintext highlighter-rouge">mlx-vlm</code> library extends MLX to support Vision Language Models and Omni models, allowing Mac users to perform inference on multimodal tasks locally. Previously, running such large multimodal models typically required powerful cloud GPUs or specialized server hardware.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/ml-explore/mlx">GitHub - ml-explore/mlx: MLX: An array framework for Apple silicon · GitHub</a></li>
<li><a href="https://github.com/Blaizzy/mlx-vlm">GitHub - Blaizzy/mlx-vlm: MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX. · GitHub</a></li>
<li><a href="https://ai.google.dev/gemma/docs/core/model_card_4">Gemma 4 model card | Google AI for Developers</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#gemma</code>, <code class="language-plaintext highlighter-rouge">#mlx</code>, <code class="language-plaintext highlighter-rouge">#apple-silicon</code>, <code class="language-plaintext highlighter-rouge">#audio-transcription</code>, <code class="language-plaintext highlighter-rouge">#local-llm</code></p>

<hr />

<p><a id="item-4"></a></p>
<h2 id="anthropics-mythos-model-sparks-controversy-over-alleged-bytedance-seed-tech-usage-️-8010"><a href="https://www.qbitai.com/2026/04/400500.html">Anthropic’s Mythos Model Sparks Controversy Over Alleged ByteDance Seed Tech Usage</a> ⭐️ 8.0/10</h2>

<p>Reports indicate that Anthropic’s unreleased ‘Claude Mythos’ model, described as too powerful for public release due to its cybersecurity capabilities, may incorporate core concepts from a research paper by ByteDance’s Seed team. This collaboration reportedly involved AI pioneer Yoshua Bengio and multiple universities, leading to questions about the technical origins of the new model. The allegations have surfaced just as Anthropic prepares to showcase what it claims is its most capable AI system to date.</p>

<p>rss · 量子位 · Apr 13, 05:41</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#anthropic</code>, <code class="language-plaintext highlighter-rouge">#bytedance</code>, <code class="language-plaintext highlighter-rouge">#ai-research</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#controversy</code></p>

<hr />

<p><a id="item-5"></a></p>
<h2 id="turboocr-achieves-1200-imagessecond-via-tensorrt-and-cuda-optimization-️-8010"><a href="https://old.reddit.com/r/MachineLearning/comments/1skd6s9/turboocr_2701200_imgs_ocr_with_paddle_tensorrt/">TurboOCR Achieves 1,200 Images/Second via TensorRT and CUDA Optimization</a> ⭐️ 8.0/10</h2>

<p>A developer has released TurboOCR, a highly optimized C++ and CUDA implementation of PaddleOCR that utilizes TensorRT with FP16 precision to drastically improve inference speed. This new system replaces the original single-threaded Python approach with fused kernels, batched recognition, and multi-stream pipeline pooling, boosting throughput from approximately 15 to over 1,200 images per second on an RTX 5090. The solution supports HTTP/gRPC inputs for PDFs and images, returning bounding boxes, text, and layout regions using the PP-DocLayoutV3 model. This breakthrough addresses a critical bottleneck in large-scale document processing where Vision Language Models (VLMs) are often too slow and expensive for high-volume tasks. By achieving speeds up to 80 times faster than standard PaddleOCR, TurboOCR makes real-time Retrieval-Augmented Generation (RAG) and bulk digitization projects economically viable without sacrificing accuracy for standard text. It offers a practical alternative to transformer-based approaches for scenarios requiring massive throughput rather than complex semantic understanding. Consequently, organizations can process millions of pages significantly cheaper and faster, bridging the gap between legacy OCR and modern AI capabilities. The system achieves 270 images per second on text-heavy pages and over 1,200 images per second on sparse pages, with layout analysis adding only about 20% to the inference time. While it excels at speed, complex table extraction and structured output conversion still require VLM-based solutions like PaddleOCR-VL. The software is tested on Linux with RTX 50-series GPUs and CUDA 13.2, accepting inputs via HTTP or gRPC protocols. Future updates aim to add structured extraction, Markdown output, and multi-language support while maintaining high performance.</p>

<p>rss · r/MachineLearning · Apr 13, 14:53</p>

<p><strong>Background</strong>: PaddleOCR is a popular open-source optical character recognition toolkit that traditionally runs on single-threaded Python with FP32 precision, which can limit throughput on modern hardware. TensorRT is NVIDIA’s high-performance deep learning inference optimizer that accelerates models through techniques like layer fusion, where multiple neural network operations are combined into a single kernel to reduce memory access overhead. FP16 refers to half-precision floating-point format, which reduces memory usage and increases calculation speed compared to the standard FP32 format used in many deep learning applications. Multi-stream pipeline pooling allows multiple data streams to be processed in parallel by sharing model instances and managing memory pools efficiently within the CUDA architecture.</p>
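
<p>The multi-stream pipeline pooling described above can be illustrated in PyTorch: independent CUDA streams let host-to-device copies for one batch overlap with inference on another. This is a generic sketch of the technique, not TurboOCR’s C++ implementation, and the model here is a stand-in.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch

def run_pipelined(batches, model, num_streams=4):
    """Overlap H2D copies and inference by rotating work across CUDA streams."""
    assert torch.cuda.is_available()
    device = torch.device("cuda")
    model = model.to(device).eval().half()          # FP16, as in the post
    streams = [torch.cuda.Stream() for _ in range(num_streams)]
    outputs = []
    with torch.no_grad():
        for i, batch in enumerate(batches):
            stream = streams[i % num_streams]
            with torch.cuda.stream(stream):
                gpu_batch = batch.to(device, non_blocking=True).half()
                outputs.append(model(gpu_batch))
        torch.cuda.synchronize()                    # wait for all streams
    return outputs

# Example with a dummy recognizer over random "images"; pinned host memory is
# what makes the non_blocking copies actually asynchronous.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
batches = [torch.randn(64, 3, 32, 32).pin_memory() for _ in range(8)]
# results = run_pipelined(batches, model)  # requires a CUDA-capable GPU
</code></pre></div></div>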

<details><summary>References</summary>
<ul>
<li><a href="https://developer.nvidia.com/blog/tensorrt-3-faster-tensorflow-inference/">TensorRT 3: Faster TensorFlow Inference and Volta Support | NVIDIA Technical Blog</a></li>
<li><a href="https://ltx-2.run/blog/paddleocr-vl-1.5-complete-guide-en/">PaddleOCR -VL-1.5: Comprehensive Analysis of the... | LTX-2 Blog</a></li>
<li><a href="https://developer.nvidia.com/blog/using-cuda-stream-ordered-memory-allocator-part-1/">Using the NVIDIA CUDA Stream -Ordered Memory Allocator, Part...</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ocr</code>, <code class="language-plaintext highlighter-rouge">#tensorrt</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#optimization</code>, <code class="language-plaintext highlighter-rouge">#inference</code></p>

<hr />

<p><a id="item-6"></a></p>
<h2 id="depth-recurrent-transformers-improve-generalization-without-intermediate-supervision-️-8010"><a href="https://old.reddit.com/r/MachineLearning/comments/1skmct7/thinking_deeper_not_longer_depthrecurrent/">Depth-Recurrent Transformers Improve Generalization Without Intermediate Supervision</a> ⭐️ 8.0/10</h2>

<p>A new research paper introduces Depth-Recurrent Transformers, an architecture featuring silent thinking and identity-biased recurrence that enables stable computation over 20+ steps. The study demonstrates improved out-of-distribution generalization in two out of three tested tasks while arguing that explicit intermediate step supervision can actually hinder genuine reasoning capabilities. By avoiding step-by-step labels, the model is forced to develop internal reasoning strategies rather than relying on statistical heuristics. This work challenges the prevailing trend of using chain-of-thought prompting and explicit intermediate supervision to enhance AI reasoning, suggesting these methods may create shortcuts rather than true understanding. If validated, this approach could lead to foundation models that generalize better to unseen scenarios by fostering deeper internal processing instead of memorizing solution patterns. It offers a potential explanation for why current large language models often fail at systematic compositional tasks despite their vast training data. Furthermore, it draws a parallel to human cognition, where over-reliance on intuition based on past experience can sometimes inhibit rigorous logical analysis. The proposed architecture incorporates LayerScale and identity-biased recurrence to maintain stability during deep iterative processing, allowing for more than 20 recurrent steps without divergence. However, the results show mixed performance, with the model failing significantly in tasks involving unstructured text compared to structured problems. The authors posit that intermediate supervision makes statistical heuristics ‘irresistible’ to the model, thereby preventing the investment of capacity into genuine reasoning mechanisms.</p>

<p>rss · r/MachineLearning · Apr 13, 20:07</p>

<p><strong>Background</strong>: Compositional generalization refers to a model’s ability to learn individual rules and apply them systematically to novel combinations it has never encountered before, a key hurdle for current deep learning systems. Traditional Transformers operate on a fixed computational graph where input passes through a predetermined number of layers, limiting their ability to adapt computation time to problem complexity. Intermediate step supervision, such as Chain-of-Thought prompting, has recently become a standard technique to guide models through complex reasoning by providing labeled intermediate steps. This new research questions whether such guidance prevents models from developing robust, independent reasoning skills.</p>
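
<p>A minimal sketch of identity-biased recurrence with LayerScale, based only on the description above: the same block is applied repeatedly, and a small learned per-channel scale keeps each update close to the identity so that 20+ steps stay stable. The layer sizes, the attention/MLP composition, and the scale initialization are illustrative assumptions, not the paper’s exact configuration.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch
import torch.nn as nn

class RecurrentBlock(nn.Module):
    """One shared block applied for many 'thinking' steps. The LayerScale
    parameters start near zero, so each step begins as roughly the identity map
    and the recurrence stays stable when unrolled for 20+ iterations."""

    def __init__(self, dim=256, num_heads=4, scale_init=1e-4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        self.gamma1 = nn.Parameter(scale_init * torch.ones(dim))   # LayerScale
        self.gamma2 = nn.Parameter(scale_init * torch.ones(dim))

    def forward(self, x):
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + self.gamma1 * attn_out                  # identity-biased update
        x = x + self.gamma2 * self.mlp(self.norm2(x))
        return x

def think(x, block, steps=24):
    """Silent thinking: iterate the shared block without emitting tokens."""
    for _ in range(steps):
        x = block(x)
    return x

block = RecurrentBlock()
tokens = torch.randn(2, 16, 256)      # (batch, sequence, dim)
print(think(tokens, block).shape)
</code></pre></div></div>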

<details><summary>References</summary>
<ul>
<li><a href="https://arxiv.org/html/2603.21676v1">Thinking Deeper, Not Longer: Depth - Recurrent Transformers for...</a></li>
<li><a href="https://www.emergentmind.com/topics/depth-recurrent-transformer">Depth - Recurrent Transformer</a></li>
<li><a href="https://proceedings.neurips.cc/paper/2020/file/12b1e42dc0746f22cf361267de07073f-Paper.pdf">Compositional Generalization via Neural-Symbolic</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The community discussion highlights agreement with the paper’s assertion that intermediate supervision can impair genuine reasoning by making statistical shortcuts too attractive for the model. Commenters extend this idea to human behavior, noting that experts often rely on expansive experience-based intuition rather than explicit reasoning, which can lead to similar traps. There is also curiosity regarding why the model performs poorly on unstructured text and fails when the depth requirement exceeds double the baseline.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#transformers</code>, <code class="language-plaintext highlighter-rouge">#generalization</code>, <code class="language-plaintext highlighter-rouge">#reasoning</code>, <code class="language-plaintext highlighter-rouge">#deep learning</code>, <code class="language-plaintext highlighter-rouge">#research</code></p>

<hr />

<p><a id="item-7"></a></p>
<h2 id="third-party-benchmarks-show-claude-opus-46-hallucination-surge-and-ranking-drop-️-8010"><a href="https://www.bridgebench.ai/">Third-Party Benchmarks Show Claude Opus 4.6 Hallucination Surge and Ranking Drop</a> ⭐️ 8.0/10</h2>

<p>AI evaluation platform BridgeMind reported that Claude Opus 4.6’s accuracy on the BridgeBench hallucination benchmark dropped from 83.3% to 68.3%, causing its ranking to fall from second to tenth place. This represents a significant 15 percentage point decrease in performance compared to the previous week, suggesting a sudden weakening in the model’s reasoning capabilities. The cause of this regression remains unknown, and Anthropic has not yet issued an official response to these findings. This incident is critical because it highlights an unusual and severe performance regression in a top-tier proprietary model that many developers rely on for stable production deployments. A sudden increase in hallucination rates can lead to unreliable code generation and factual errors, posing significant risks for enterprises integrating these tools into their workflows. If this drop reflects a broader issue with the model update, it could force organizations to delay adoption or revert to older, more stable versions until the issue is resolved. Furthermore, it underscores the importance of continuous third-party monitoring, as internal metrics from model providers may not always capture real-world degradation immediately. The specific benchmark used was BridgeBench, which focuses on AI coding and agentic tasks, where leading models typically maintain accuracy above 80%. BridgeMind has explicitly advised users to pause deployment of the new version until the issues are clarified or a formal release is confirmed. While the report indicates a sharp decline, it is based on third-party testing rather than an official admission of fault from Anthropic, leaving some uncertainty about whether this is a temporary fluctuation or a permanent change.</p>

<p>telegram · zaihuapd · Apr 13, 05:00</p>

<p><strong>Background</strong>: In the field of artificial intelligence, a ‘hallucination’ refers to an AI generating false or misleading information that is presented as fact, which is a key metric for evaluating model reliability. Claude Opus 4.6 is a recent iteration of Anthropic’s large language model series, designed to improve upon previous versions in coding skills, long-context coherence, and agentic task execution. Benchmarks like BridgeBench serve as independent verification tools to assess how well these models perform on real-world tasks compared to competitors. Historically, major model updates aim for performance improvements, making significant regressions like this rare and noteworthy events in the AI community.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://tech.yahoo.com/ai/claude/articles/viral-bridgebench-post-claims-claude-131318087.html">Viral BridgeBench Post Claims Claude Opus 4.6 Was 'Nerfed ...</a></li>
<li><a href="https://en.wikipedia.org/wiki/Hallucination_(artificial_intelligence)">Hallucination (artificial intelligence) - Wikipedia</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#benchmarks</code>, <code class="language-plaintext highlighter-rouge">#anthropic</code>, <code class="language-plaintext highlighter-rouge">#ai-safety</code>, <code class="language-plaintext highlighter-rouge">#model-evaluation</code></p>

<hr />

<p><a id="item-8"></a></p>
<h2 id="eu-plans-to-classify-chatgpt-as-very-large-online-search-engine-️-8010"><a href="https://www.handelsblatt.com/politik/international/ki-eu-kommission-will-chatgpt-in-zukunft-strenger-regulieren/100215477.html">EU Plans to Classify ChatGPT as Very Large Online Search Engine</a> ⭐️ 8.0/10</h2>

<p>The European Commission is set to officially classify OpenAI’s ChatGPT as a Very Large Online Search Engine (VLOSE) within the coming days. This decision follows data showing that ChatGPT’s monthly active users in Europe have surpassed 120 million, significantly exceeding the 45 million user threshold required for this designation. Consequently, OpenAI will be subject to the strictest compliance obligations under the EU’s Digital Services Act (DSA). This classification marks a pivotal moment for AI regulation, as it subjects generative AI models to the same rigorous scrutiny previously applied mainly to traditional search engines and social media giants. OpenAI will now be legally required to increase transparency regarding its recommendation algorithms and advertising systems while implementing robust measures to prevent illegal content and protect user mental health. The move signals the EU’s intent to close regulatory loopholes for high-impact AI services, potentially setting a global precedent for how large language models are governed. Other AI developers with significant European user bases may soon face similar regulatory pressures. To qualify as a VLOSE, a service must have more than 45 million monthly active users in the EU, a threshold ChatGPT has far exceeded with over 120 million users as of 2025. Under DSA rules, designated VLOSEs must conduct annual risk assessments, allow external auditing of their algorithms, and provide users with options to opt out of personalized recommendations. Failure to comply with these stringent requirements could result in fines of up to 6% of the company’s global annual turnover.</p>

<p>telegram · zaihuapd · Apr 13, 08:29</p>

<p><strong>Background</strong>: The Digital Services Act (DSA) is a comprehensive EU regulation that entered into force in 2022 to create a safer digital space where users’ fundamental rights are protected. It establishes a tiered regulatory framework where obligations scale with the size and impact of the digital service provider. Platforms or search engines with over 45 million monthly users in the EU are classified as ‘Very Large,’ triggering the highest level of oversight including independent audits and crisis response protocols. While initially designed for social networks and web search, the definition of ‘search engine’ under the DSA is being interpreted broadly to include conversational AI tools that retrieve and synthesize information.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Digital_Services_Act">Digital Services Act - Wikipedia</a></li>
<li><a href="https://digital-strategy.ec.europa.eu/en/policies/dsa-vlops">DSA: Very large online platforms and search engines</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai regulation</code>, <code class="language-plaintext highlighter-rouge">#eu policy</code>, <code class="language-plaintext highlighter-rouge">#openai</code>, <code class="language-plaintext highlighter-rouge">#digital services act</code>, <code class="language-plaintext highlighter-rouge">#compliance</code></p>

<hr />

<p><a id="item-9"></a></p>
<h2 id="cloudflare-data-shows-ai-giants-disrupting-web-balance-anthropic-accused-of-worst-offense-️-8010"><a href="https://www.businessinsider.com/ai-bots-strip-mining-web-anthropic-leads-ethical-claude-2026-4">Cloudflare Data Shows AI Giants Disrupting Web Balance, Anthropic Accused of Worst Offense</a> ⭐️ 8.0/10</h2>

<p>New data from Cloudflare reveals a severe imbalance where AI companies scrape web content at massive scales while providing negligible referral traffic to source websites. Anthropic leads this trend with an extreme crawl-to-referral ratio of 8800:1, meaning it generates one user click for every 8,800 pages scraped. In comparison, OpenAI has a ratio of 993:1, while traditional search engines like Microsoft Bing and Google maintain much more balanced exchanges. This disruption threatens the fundamental economic engine of the internet, where content creators traditionally rely on search traffic to monetize their work through ads or subscriptions. If AI chatbots continue to provide direct answers without driving traffic, website owners face high server costs from bot traffic without any revenue return, potentially leading to less free content available online. This shift challenges the long-standing reciprocal contract between search engines and publishers that has sustained the open web for decades. Ultimately, it raises critical ethical questions about the sustainability of training Large Language Models on data sources that are being economically depleted by the very models using them. The report highlights that Anthropic’s crawl-to-referral ratio is 8800:1, which is significantly worse than OpenAI’s 993:1 and far exceeds the balanced ratios of traditional search providers. While Anthropic has questioned the statistical methodology used in the report, the data underscores a growing trend where generative AI reduces the incentive for sites to publish content freely. Website owners are now bearing the infrastructure costs of heavy bot scraping while losing the potential for traffic-based monetization.</p>

<p>telegram · zaihuapd · Apr 13, 10:36</p>

<p><strong>Background</strong>: Historically, the internet has operated on a reciprocal ecosystem where search engines like Google crawl websites to index content but drive significant user traffic back to those sites in exchange. This traffic allows website owners to generate revenue through advertisements or subscriptions, offsetting the costs of hosting and content creation. However, Generative AI models function differently by ingesting data to provide direct answers within the chat interface, often eliminating the need for users to visit the original source. This shift from an indexing model to an answer-engine model is causing friction regarding data usage rights and economic fairness.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://www.voronoiapp.com/technology/AI-Chatbots-vs-Search-Engines-Who-is-Winning-the-Traffic-War-4952">AI Chatbots vs Search Engines : Who is Winning the Traffic War?</a></li>
<li><a href="https://onelittleweb.com/data-studies/ai-chatbots-vs-search-engines/">AI Chatbots vs Search Engines : 24-Month Study on Traffic Trends</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-ethics</code>, <code class="language-plaintext highlighter-rouge">#web-scraping</code>, <code class="language-plaintext highlighter-rouge">#llm-training</code>, <code class="language-plaintext highlighter-rouge">#internet-economy</code>, <code class="language-plaintext highlighter-rouge">#anthropic</code></p>

<hr />

<p><a id="item-10"></a></p>
<h2 id="us-bis-staff-shortages-stall-nvidia-ai-chip-exports-️-8010"><a href="https://www.tomshardware.com/tech-industry/us-export-control-agency-has-lost-nearly-a-fifth-of-its-licensing-staff">US BIS Staff Shortages Stall Nvidia AI Chip Exports</a> ⭐️ 8.0/10</h2>

<p>The US Bureau of Industry and Security (BIS) has lost nearly 20% of its workforce since 2024, causing AI chip export approval times to double from 38 days in 2023 to 76 days in early 2025. Consequently, major manufacturers like Nvidia and AMD face severe delays, with Nvidia unable to deliver any H200 chips to Chinese customers despite prior White House approvals. This bottleneck is exacerbated by increased regulatory complexity and a new requirement for the Deputy Secretary to personally review nearly every license application. This administrative breakdown directly hinders the global deployment of advanced AI hardware, creating uncertainty for tech giants relying on timely access to US semiconductors. The delays effectively extend the impact of export controls beyond their intended scope, potentially ceding market share to non-US competitors who can supply hardware faster. Furthermore, it highlights a critical vulnerability in US geopolitical strategy where enforcement mechanisms are undermined by internal resource shortages rather than external factors. For the AI industry, this means slower innovation cycles and disrupted supply chains for data centers worldwide. The staff exodus includes a 19% overall reduction since 2024, with rule-making and licensing divisions hit hardest at nearly 20% loss. Processing times have specifically doubled to 76 days, and the backlog is compounded by new tariffs and complex investment matching requirements for the Middle East. Notably, even approved transactions for high-end chips like the H200 remain undelivered due to these procedural gridlocks.</p>

<p>telegram · zaihuapd · Apr 13, 15:25</p>

<p><strong>Background</strong>: The Bureau of Industry and Security (BIS) is the US agency responsible for regulating exports of dual-use technologies, including advanced semiconductors, to protect national security. Since October 2022, the US has progressively tightened export controls on AI chips to China to limit its military and technological advancement. These regulations require companies like Nvidia to obtain specific licenses before shipping restricted hardware, a process that relies heavily on BIS staffing levels and efficiency. The H200 chip represents Nvidia’s latest high-performance GPU, which has been subject to intense scrutiny and negotiated exceptions for the Chinese market.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://www.bis.gov/">Homepage | Bureau of Industry and Security</a></li>
<li><a href="https://en.wikipedia.org/wiki/United_States_export_controls_on_AI_chips_and_semiconductors">United States export controls on AI chips and semiconductors - Wikipedia</a></li>
<li><a href="https://www.crnasia.com/news/2026/components-and-peripherals/trump-greenlights-nvidia-h200-chip-sales-to-china-after-mont">Trump greenlights Nvidia H 200 Chip sales to China after months of...</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-hardware</code>, <code class="language-plaintext highlighter-rouge">#export-controls</code>, <code class="language-plaintext highlighter-rouge">#geopolitics</code>, <code class="language-plaintext highlighter-rouge">#supply-chain</code>, <code class="language-plaintext highlighter-rouge">#regulation</code></p>

<hr />

<p><a id="item-11"></a></p>
<h2 id="cloudflare-engineers-detail-architecture-for-unified-cli-️-7010"><a href="https://blog.cloudflare.com/cf-cli-local-explorer/">Cloudflare Engineers Detail Architecture for Unified CLI</a> ⭐️ 7.0/10</h2>

<p>Cloudflare engineers have published a technical post outlining the architectural challenges and solutions involved in building a single, unified Command Line Interface (CLI) for their entire cloud platform. The article details how they are moving beyond the existing Wrangler tool to create a cohesive experience that handles diverse services under one command structure. This initiative aims to standardize developer interactions across all Cloudflare products rather than maintaining separate tools for each service. This development is significant because a unified CLI is becoming essential for AI agents, which interact more reliably with command-line tools than with graphical dashboards or fragmented APIs. By consolidating interfaces, Cloudflare improves the developer experience and enables automated workflows where AI agents can execute complex tasks across multiple services seamlessly. This shift reflects a broader industry trend where CLI-first design is prioritized to support the growing ecosystem of autonomous coding agents and infrastructure management tools. The discussion highlights a critical need for better API permission management, with users requesting features like a ‘cf permissions check’ command to diagnose missing scopes automatically. Community feedback emphasizes that while AI agents are proficient at executing CLI commands, they struggle to interpret vague error messages, necessitating clear outputs that specify exact fixes. Additionally, some developers noted the absence of TypeSpec in the architecture, suggesting that custom schema solutions were chosen over existing standards for greater flexibility.</p>

<p>hackernews · soheilpro · Apr 13, 15:44</p>

<p><strong>Background</strong>: Cloudflare previously relied heavily on Wrangler, a CLI specifically designed for managing Workers and related edge computing resources. As the company expanded its portfolio to include databases, storage, and security services, the lack of a centralized tool created friction for developers managing multi-service environments. A unified CLI abstracts these complexities, allowing users to manage disparate cloud resources through a consistent syntax and authentication model.</p>

<p><strong>Discussion</strong>: Community members generally agree that a unified CLI is vital for AI agent workflows but express strong concerns about current API permission friction. Users specifically desire tools that can automatically validate and suggest required token scopes to prevent deployment failures. There is also a notable debate regarding the choice of schema languages, with some experts questioning why established tools like TypeSpec were not utilized.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cli</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#cloudflare</code>, <code class="language-plaintext highlighter-rouge">#api-design</code></p>

<hr />

<p><a id="item-12"></a></p>
<h2 id="steve-yegge-claims-googles-ai-adoption-mirrors-john-deere-️-7010"><a href="https://simonwillison.net/2026/Apr/13/steve-yegge/#atom-everything">Steve Yegge Claims Google’s AI Adoption Mirrors John Deere</a> ⭐️ 7.0/10</h2>

<p>Steve Yegge argues that Google’s engineering organization has an AI adoption curve identical to non-tech companies like John Deere, with 20% power users, 20% refusers, and 60% casual tool users. He attributes this stagnation to an industry-wide hiring freeze lasting over 18 months, which has prevented the influx of new talent that might have called out Google’s declining engineering standards. Consequently, the company lacks external perspectives to challenge its current mediocrity in AI integration. This observation is significant because it challenges the perception that major tech giants like Google are inherently leading the AI revolution internally. If true, it suggests that organizational inertia and hiring freezes can cause even top-tier engineering cultures to fall behind the broader industry average in adopting agentic AI workflows. This could impact Google’s long-term competitiveness if its internal tools and processes do not evolve as rapidly as those of more agile competitors or startups. Furthermore, it highlights a potential systemic risk across the entire tech sector where lack of talent mobility stifles innovation. Yegge specifies that the majority (60%) of engineers are merely using chat-based tools like Cursor rather than developing autonomous agentic systems. The remaining split consists of 20% who are fully leveraging agentic capabilities and 20% who outright refuse to use AI tools. The core catalyst for this uniformity across diverse companies is identified as an 18-month hiring freeze that has stopped the influx of fresh ideas and critical feedback.</p>

<p>rss · Simon Willison · Apr 13, 20:59</p>

<p><strong>Background</strong>: Agentic AI refers to artificial intelligence systems that can operate autonomously in complex environments, making decisions and executing tasks without continuous human oversight, unlike simple chatbots that only generate content. Tools like Cursor represent a middle ground, acting as AI-assisted IDEs that help write code but often require significant human direction compared to fully agentic workflows. Steve Yegge is a well-known software engineer and former Google employee famous for his candid critiques of corporate engineering cultures. The comparison to John Deere, a traditional agricultural machinery manufacturer, is used rhetorically to suggest that Google’s advanced status has eroded to match traditional non-software industries.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Agentic_AI">Agentic AI</a></li>
<li><a href="https://cursor.com/">Cursor: The best way to code with AI</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-adoption</code>, <code class="language-plaintext highlighter-rouge">#google</code>, <code class="language-plaintext highlighter-rouge">#industry-trends</code>, <code class="language-plaintext highlighter-rouge">#engineering-culture</code>, <code class="language-plaintext highlighter-rouge">#steve-yegge</code></p>

<hr />

<p><a id="item-13"></a></p>
<h2 id="bryan-cantrill-argues-llms-lack-beneficial-human-laziness-️-7010"><a href="https://simonwillison.net/2026/Apr/13/bryan-cantrill/#atom-everything">Bryan Cantrill Argues LLMs Lack Beneficial Human Laziness</a> ⭐️ 7.0/10</h2>

<p>Industry veteran Bryan Cantrill published an essay arguing that Large Language Models (LLMs) inherently lack the virtue of human laziness, which drives optimization. He posits that because computational work costs nothing to an AI, it will happily generate bloated code and accumulate technical debt without pressure to simplify. This perspective frames human constraint as a necessary force for creating crisp abstractions and efficient system designs. This insight challenges the prevailing assumption that more AI-generated code automatically equals higher productivity, suggesting instead that unchecked generation leads to unsustainable system bloat. It highlights a critical risk where organizations might prioritize vanity metrics like lines of code over long-term maintainability and performance. By reframing human laziness as a strategic advantage, Cantrill provides a new framework for evaluating AI-assisted programming tools and setting guardrails for their use. This could significantly influence how engineering teams integrate LLMs into their workflows, emphasizing review processes that enforce simplicity. Cantrill specifically notes that LLMs will dump more logic onto a ‘layercake of garbage’ because they do not feel the future pain of maintaining complex systems. The argument relies on the economic principle that human finite time forces developers to create efficient abstractions to avoid wasting effort later. Unlike humans, LLMs have no intrinsic motivation to reduce complexity since generating additional tokens incurs negligible cost relative to their operation. This suggests that without strict human oversight, AI-driven development may result in larger, slower, and harder-to-debug software architectures.</p>

<p>rss · Simon Willison · Apr 13, 02:44</p>

<p><strong>Background</strong>: Bryan Cantrill is a well-known software engineer and co-founder of Oxide Computer Company, previously famous for his work on DTrace and the Java Virtual Machine at Sun Microsystems. In software engineering, ‘laziness’ is often considered a virtue, popularized by Larry Wall, because it motivates programmers to write reusable and efficient code rather than doing repetitive manual work. Large Language Models are currently transforming coding practices by automating boilerplate generation, but concerns about code quality and technical debt are rising. Understanding the psychological and economic drivers behind human coding habits is essential when comparing them to non-sentient AI agents.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm-limitations</code>, <code class="language-plaintext highlighter-rouge">#software-engineering</code>, <code class="language-plaintext highlighter-rouge">#ai-philosophy</code>, <code class="language-plaintext highlighter-rouge">#system-design</code>, <code class="language-plaintext highlighter-rouge">#bryan-cantrill</code></p>

<hr />

<p><a id="item-14"></a></p>
<h2 id="google-integrates-rust-into-pixel-10-modem-for-enhanced-safety-️-7010"><a href="https://arstechnica.com/gadgets/2026/04/google-shoehorned-rust-into-pixel-10-modem-to-make-legacy-code-safer/">Google Integrates Rust into Pixel 10 Modem for Enhanced Safety</a> ⭐️ 7.0/10</h2>

<p>Google has successfully integrated the Rust programming language into the cellular modem firmware of its upcoming Pixel 10 smartphone. This initiative specifically targets the device’s complex legacy codebase, which was previously written primarily in C and C++, to eliminate common memory safety vulnerabilities. By rewriting critical modem components in Rust, Google aims to prevent entire classes of security exploits at compile time rather than relying on post-deployment patches. This move is significant because approximately 70% of critical security vulnerabilities in major software systems stem from memory safety issues inherent in languages like C and C++. By applying Rust to cellular modems, which are notoriously difficult “black boxes” of legacy code, Google sets a new precedent for securing critical infrastructure in consumer electronics. This shift could drastically reduce the attack surface of mobile devices and influence other hardware manufacturers to adopt memory-safe languages for their embedded systems. Furthermore, it demonstrates that even deeply entrenched legacy systems can be incrementally modernized without a complete rewrite. The integration utilizes Rust’s Foreign Function Interface (FFI) to allow new Rust code to interact seamlessly with existing C/C++ modules within the modem’s Hardware Abstraction Layer (HAL). This approach allows Google to rewrite only the most vulnerability-prone sections of the code while maintaining compatibility with vendor-specific proprietary drivers. However, the process involves complex challenges in managing mutable static variables and preventing data races when bridging the two language environments. The success of this deployment on the Pixel 10 will serve as a real-world test case for mixing memory-safe and non-memory-safe code in high-stakes telecommunications hardware.</p>

<p>rss · Ars Technica · Apr 13, 21:12</p>

<p><strong>Background</strong>: Cellular modems are complex subsystems responsible for managing wireless communications, often running on specialized firmware with decades of accumulated legacy code written in C or C++. These languages offer high performance but lack built-in memory safety guarantees, making them susceptible to buffer overflows and use-after-free errors that hackers frequently exploit. Rust is a modern systems programming language designed to provide the same level of performance as C++ while enforcing strict memory safety rules at compile time through its ownership model. Historically, integrating Rust into such established embedded ecosystems has been difficult due to compatibility issues and the sheer volume of existing code, leading many companies to hesitate before adoption.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Rust_(programming_language)">Rust ( programming language ) - Wikipedia</a></li>
<li><a href="https://www.linkedin.com/pulse/why-rust-programming-language-dominates-systems-code-2026-rohit-singh-mwbkc">Why Rust Programming Language Dominates Systems Code in 2026</a></li>
<li><a href="https://github.com/rdkcentral/rdkb-halif-cellular-modem">GitHub - rdkcentral/rdkb-halif-cellular-modem: RDKB Cellular ...</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#rust</code>, <code class="language-plaintext highlighter-rouge">#embedded-systems</code>, <code class="language-plaintext highlighter-rouge">#security</code>, <code class="language-plaintext highlighter-rouge">#google</code>, <code class="language-plaintext highlighter-rouge">#telecommunications</code></p>

<hr />

<p><a id="item-15"></a></p>
<h2 id="max-welling-to-host-ama-on-ai4science-gnns-and-cuspai-️-7010"><a href="https://old.reddit.com/r/MachineLearning/comments/1skil2g/n_ama_announcement_max_welling_vaes_gnns/">Max Welling to Host AMA on AI4Science, GNNs, and CuspAI</a> ⭐️ 7.0/10</h2>

<p>The r/MachineLearning community has announced an Ask Me Anything (AMA) session with renowned researcher Max Welling, scheduled for Wednesday, April 15th from 17:00 to 18:30 CEST. Welling, a co-founder of CuspAI and former contributor to Microsoft’s Aurora earth modeling system, will discuss his transition from classical machine learning to AI-driven material discovery. The session aims to explore topics such as ML architectures for noisy environments, the role of physical experiments in model training, and career advice for impactful AI research. This event is significant because Max Welling is a pivotal figure in the development of foundational models like Variational Autoencoders (VAEs) and Graph Neural Networks (GNNs), which are now central to modern AI research. His current work at CuspAI represents a cutting-edge shift towards using AI to accelerate scientific discovery, specifically in finding new materials for energy and carbon capture within months rather than millennia. Insights from this AMA could clarify the practical challenges of deploying AI in physical sciences, distinguishing between hype and viable solutions in the burgeoning AI4Science sector. Furthermore, his perspective on integrating human-in-the-loop systems offers valuable guidance for researchers aiming to ensure model reliability in real-world applications. Participants are encouraged to submit questions beforehand on ML architectures in sparse environments and the intersection of AI and science. Welling’s background includes seminal papers such as Semi-Supervised Classification with Graph Convolutional Networks and Auto-Encoding Variational Bayes, as well as recent work on equivariant diffusion for molecule generation. He will specifically address the gap between digital models and physical reality, focusing on data quality and synthesizability issues in materials science. Verification of his participation was provided via a link to his official X (Twitter) account.</p>

<p>rss · r/MachineLearning · Apr 13, 17:57</p>

<p><strong>Background</strong>: Graph Neural Networks (GNNs) are a type of artificial neural network designed to process data structured as graphs, making them ideal for modeling molecular structures and social networks. Variational Autoencoders (VAEs) are generative models that learn efficient data codings in an unsupervised manner, often used for creating new data samples like images or molecules. AI4Science refers to the application of artificial intelligence techniques to solve complex problems in natural sciences, such as drug discovery, climate modeling, and materials science. CuspAI, founded in 2024 and based in Cambridge, UK, recently raised $100 million in Series A funding to build AI systems that search high-dimensional spaces for next-generation materials.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Graph_neural_network">Graph neural network - Wikipedia</a></li>
<li><a href="https://www.cusp.ai/">CuspAI is the frontier AI company on a mission to solve the ...</a></li>
<li><a href="https://pitchbook.com/profiles/company/606299-50">CuspAI 2026 Company Profile: Valuation, Funding &amp; Investors ... CuspAI - Crunchbase Company Profile &amp; Funding CuspAI - 2026 Company Profile &amp; Team - Tracxn CuspAI, startup building AI models for chemistry, raises $100 ... CuspAI - LinkedIn cusp.ai CuspAI 2026 Company Profile: Valuation, Funding &amp; Investors | PitchBo… CuspAI , startup building AI models for chemistry, raises $100 ... - Fortune CuspAI 2026 Company Profile: Valuation, Funding &amp; Investors | PitchBo… From Algorithms to Atoms: Our Investment in CuspAI</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai4science</code>, <code class="language-plaintext highlighter-rouge">#ama</code>, <code class="language-plaintext highlighter-rouge">#gnn</code>, <code class="language-plaintext highlighter-rouge">#generative-models</code>, <code class="language-plaintext highlighter-rouge">#machine-learning-research</code></p>

<hr />

<p><a id="item-16"></a></p>
<h2 id="apple-developing-display-less-smart-glasses-with-advanced-camera-to-rival-meta-️-7010"><a href="https://www.bloomberg.com/news/newsletters/2026-04-12/apple-ai-smart-glasses-features-styles-colors-cameras-giannandrea-leaving-mnvtz4yg">Apple Developing Display-Less Smart Glasses with Advanced Camera to Rival Meta</a> ⭐️ 7.0/10</h2>

<p>Apple is actively developing its first display-less smart glasses, internally codenamed N50, with a planned release in 2027 following a late 2026 unveiling. The device features a unique vertical oval camera system and at least four distinct frame styles made from premium acetate, designed to integrate deeply with an upgraded Siri in iOS 27. This product represents a key pillar of Apple’s broader AI wearable strategy, which also includes new AirPods and camera-equipped pendants for context-aware computing. This move marks Apple’s strategic entry into the AI wearables market, directly challenging Meta’s dominance with Ray-Ban smart glasses by offering a distinct, camera-centric design without a display. By leveraging computer vision to provide context for Siri and Apple Intelligence, Apple aims to redefine how users interact with AI through ambient, hands-free devices rather than screens. The success of this form factor could shift industry trends away from bulky AR headsets toward lightweight, fashion-forward accessories that seamlessly blend into daily life. Furthermore, it signals a maturation of context-aware computing, where devices understand the user’s environment to deliver proactive assistance. The N50 glasses will support photo and video capture, call handling, notifications, and music playback, all synchronized with a smartphone for editing and sharing. Apple has developed multiple frame options ranging from large rectangular styles similar to Ray-Ban Wayfarers to thin rectangular and various oval designs, available in colors like black, ocean blue, and light brown. The device relies heavily on an upgraded Siri within iOS 27 for voice interaction, as it lacks a visual display for user interface elements. Concurrently, reports indicate a foldable iPhone is on track for a September launch alongside the iPhone 18 Pro series.</p>

<p>telegram · zaihuapd · Apr 13, 01:32</p>

<p><strong>Background</strong>: Context-aware computing refers to systems that can sense and react to changes in their environment, a concept long pursued in ubiquitous computing but now becoming viable in consumer wearables. Unlike traditional Augmented Reality (AR) glasses that project images onto lenses, display-less smart glasses rely on audio feedback and external device screens to convey information while using cameras to ‘see’ what the user sees. Meta has previously popularized this category with its Ray-Ban Meta smart glasses, which focus on social sharing and AI assistance without a heads-up display. Apple’s entry validates this lighter form factor as a viable alternative to heavier headsets like the Vision Pro for everyday AI interactions.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Context_awareness">Context awareness - Wikipedia</a></li>
<li><a href="https://www.zdnet.com/article/wearable-devices-to-usher-in-context-aware-computing/">Wearable devices to usher in context - aware computing | ZDNET</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#apple</code>, <code class="language-plaintext highlighter-rouge">#ai-wearables</code>, <code class="language-plaintext highlighter-rouge">#computer-vision</code>, <code class="language-plaintext highlighter-rouge">#smart-glasses</code>, <code class="language-plaintext highlighter-rouge">#tech-industry</code></p>

<hr />

<p><a id="item-17"></a></p>
<h2 id="ramp-report-predicts-anthropic-to-surpass-openai-in-enterprise-market-within-two-months-️-7010"><a href="https://weibo.com/1926909715/QAALEmPDI">Ramp Report Predicts Anthropic to Surpass OpenAI in Enterprise Market Within Two Months</a> ⭐️ 7.0/10</h2>

<p>According to the latest Ramp AI Index, enterprise adoption of AI tools reached 50.4% in March, up from 35% a year ago. Anthropic’s market share among paying enterprises surged by 6.3 percentage points to 30.6%, while OpenAI’s share declined to 35.2%, narrowing the gap to just 4.6 points. Based on this rapid growth trajectory, analysts predict Anthropic will overtake OpenAI as the leading provider for businesses within the next two months. This potential shift signals a major change in the enterprise AI landscape, challenging OpenAI’s long-held dominance in the commercial sector. It suggests that businesses are increasingly prioritizing factors like safety, reliability, or specific model capabilities where Anthropic may have an edge over raw performance metrics. If realized, this overtaking could reshape vendor selection strategies for CIOs and influence the competitive dynamics between top LLM developers. Furthermore, it highlights the accelerating pace of AI integration into core business operations across various industries. The data reveals that the gap between OpenAI and Anthropic has shrunk dramatically from 11 percentage points in February to 4.6 points in March alone. Anthropic recorded its highest single-month growth in history during this period, indicating strong momentum in enterprise sales. The report specifically tracks paid subscriptions on the Ramp platform, serving as a proxy for actual enterprise spending rather than just free tier usage or experimental trials.</p>

<p>telegram · zaihuapd · Apr 13, 04:03</p>

<p><strong>Background</strong>: Ramp is a prominent corporate financial management platform that provides expense management, corporate cards, and bill payment solutions, giving it unique visibility into real-time business spending patterns. The Ramp AI Index has become a key metric for tracking the adoption of paid AI models and tools within US companies, offering more concrete financial data than survey-based reports. OpenAI has historically been the market leader in generative AI, but Anthropic, founded by former OpenAI researchers, has gained traction with its Claude models focused on safety and enterprise readiness. This competition reflects the broader maturation of the AI market from early experimentation to large-scale production deployment.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.macromicro.me/charts/132463/united-states-ramp-ai-index-enterprise-ai-adoption-rate-by-model">US - Ramp AI Index - Enterprise AI Adoption Rate (by Model)</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#enterprise-ai</code>, <code class="language-plaintext highlighter-rouge">#market-analysis</code>, <code class="language-plaintext highlighter-rouge">#anthropic</code>, <code class="language-plaintext highlighter-rouge">#openai</code>, <code class="language-plaintext highlighter-rouge">#industry-trends</code></p>

<hr />

<p><a id="item-18"></a></p>
<h2 id="meta-developing-ai-clone-of-ceo-mark-zuckerberg-for-internal-use-️-7010"><a href="https://www.theverge.com/tech/910990/meta-ceo-mark-zuckerberg-ai-clone">Meta Developing AI Clone of CEO Mark Zuckerberg for Internal Use</a> ⭐️ 7.0/10</h2>

<p>Meta is actively training an AI clone of CEO Mark Zuckerberg using his image, voice, mannerisms, and public speaking records to facilitate interactions with employees. Zuckerberg personally dedicates 5 to 10 hours weekly to this project and other AI code reviews, while also developing a separate AI agent to assist with his daily tasks. If successful, the company plans to extend this technology to Instagram creators, allowing them to deploy similar avatars for fan engagement. This initiative represents a significant shift in enterprise workflows by demonstrating how high-level digital twins can bridge the gap between leadership and staff in large organizations. It signals a broader trend where generative AI moves beyond content creation to become an active participant in management and operational efficiency. Furthermore, offering these tools to creators could fundamentally change the creator economy by enabling scalable, personalized audience interactions that were previously impossible. This development challenges existing norms regarding authenticity and presence in both corporate and social media environments. The AI clone is specifically trained on Zuckerberg’s tone, voice, and behavioral patterns derived from his extensive archive of public speeches and internal communications. Distinct from the interactive clone, Zuckerberg is also building a functional AI agent designed to execute specific daily tasks rather than just simulate conversation. The potential rollout to Instagram suggests that the underlying architecture will need to handle high-volume, real-time interactions with diverse user bases.</p>

<p>telegram · zaihuapd · Apr 13, 14:40</p>

<p><strong>Background</strong>: A digital twin is a virtual model designed to accurately reflect a physical object or person, often used in industries like manufacturing for simulation and monitoring. In the context of AI, this concept has evolved to include ‘AI agents,’ which are autonomous systems capable of perceiving their environment and taking actions to achieve specific goals. Recent advancements in generative AI have made voice cloning and personality replication highly realistic, allowing for the creation of conversational bots that mimic specific individuals. These technologies rely on complex agent architectures that integrate data processing, reasoning, and response generation to function effectively.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#digital-twins</code>, <code class="language-plaintext highlighter-rouge">#enterprise-ai</code>, <code class="language-plaintext highlighter-rouge">#meta</code>, <code class="language-plaintext highlighter-rouge">#generative-ai</code></p>

<hr />

<h2 id="关注动态-1">关注动态</h2>

<p><a id="item-19"></a></p>
<h2 id="memsearch-updates-2-updates--extend-git-root-collection-fix-to-codexopencode-skills-async-s-derive-memory-recall-collection-from-git-root-324-330-️-10"><a href="https://github.com/zilliztech/memsearch/commit/2dec87d18ec1a696b56149c48b4acf72ddcb7199">MemSearch Updates: 2 updates — extend git-root collection fix to codex/opencode skills; async s…, derive memory-recall collection from git root (#324) (#330)</a> ⭐️ ?/10</h2>

<p>This update fixes the logic for deriving memory-recall collections by ensuring they are correctly anchored to the Git repository root. The fix, originally applied to core functionality, has now been extended to cover Codex and Opencode skills to ensure consistent behavior across all skill types. These changes resolve issues where collections might have been incorrectly scoped in multi-project or nested directory environments. No breaking changes are introduced; this is a stability improvement for context retrieval.</p>
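
<p>Conceptually, anchoring a collection to the repository root means resolving Git’s top-level directory and deriving a stable name from it. The Python sketch below illustrates that idea only; it is not the MemSearch implementation, and the helper names are illustrative.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import hashlib
import subprocess
from pathlib import Path

def git_root(start_dir="."):
    """Resolve the repository root so nested working directories map to one collection."""
    out = subprocess.run(
        ["git", "-C", start_dir, "rev-parse", "--show-toplevel"],
        capture_output=True, text=True, check=True,
    )
    return Path(out.stdout.strip())

def collection_name(start_dir="."):
    """Derive a stable memory-recall collection name from the git root path."""
    root = git_root(start_dir)
    digest = hashlib.sha1(str(root).encode("utf-8")).hexdigest()[:12]
    return f"memory_{root.name}_{digest}"
</code></pre></div></div>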

<p>rss · MemSearch Updates · Apr 13, 08:35</p>

<hr />

<p><a id="item-20"></a></p>
<h2 id="openaicodex-2-releases--rust-v01210-alpha6-rust-v01210-alpha4-️-10"><a href="https://github.com/openai/codex/releases/tag/rust-v0.121.0-alpha.6">openai/codex: 2 releases — rust-v0.121.0-alpha.6, rust-v0.121.0-alpha.4</a> ⭐️ ?/10</h2>

<p>The openai/codex repository published two new alpha releases for its Rust implementation: v0.121.0-alpha.4 and v0.121.0-alpha.6. The provided release notes only indicate the version bumps without detailing specific functionality changes, bug fixes, or breaking API updates. Developers tracking this project should pull the latest tags to access the most recent iterative improvements, but no actionable migration steps can be derived from the current announcement.</p>

<p>github · github-actions[bot] · Apr 13, 21:48</p>

<hr />

<p><a id="item-21"></a></p>
<h2 id="anthropicsclaude-code-2-releases--v21105-v21104-️-10"><a href="https://github.com/anthropics/claude-code/releases/tag/v2.1.105">anthropics/claude-code: 2 releases — v2.1.105, v2.1.104</a> ⭐️ ?/10</h2>

<p>Anthropic has released two new versions of claude-code, v2.1.104 and v2.1.105. The provided release information only confirms the version bumps and timestamps without detailing specific functionality changes, bug fixes, or breaking changes. Developers should check the official repository changelog or release notes for granular technical details before upgrading, as no actionable feature updates can be inferred from the current announcement.</p>

<p>github · ashwin-ant · Apr 13, 21:53</p>

<hr />

<p><a id="item-22"></a></p>
<h2 id="upstashcontext7-2-releases--upstashcontext7-mcp218-ctx70312-️-10"><a href="https://github.com/upstash/context7/releases/tag/%40upstash/context7-mcp%402.1.8">upstash/context7: 2 releases — @upstash/context7-mcp@2.1.8, ctx7@0.3.12</a> ⭐️ ?/10</h2>

<p>The repository has released new versions for two packages: @upstash/context7-mcp updated to v2.1.8 and ctx7 updated to v0.3.12. The provided release notes do not specify any new features, bug fixes, or breaking changes associated with these updates. Developers using these packages should check the full changelog or commit history for detailed implementation changes before upgrading.</p>

<p>github · github-actions[bot] · Apr 13, 00:21</p>

<hr />

<h2 id="github-热榜-1">GitHub 热榜</h2>

<p><a id="item-23"></a></p>
<h2 id="karpathy-releases-minimal-llm-training-in-raw-c-and-cuda-️-10010"><a href="https://github.com/karpathy/llm.c">Karpathy Releases Minimal LLM Training in Raw C and CUDA</a> ⭐️ 10.0/10</h2>

<p>Andrej Karpathy has released llm.c, a dependency-free implementation of large language model training written entirely in simple C and CUDA code. This project strips away high-level frameworks like PyTorch to expose the raw mathematical operations and memory management required for transformer models. It serves as a direct educational tool for understanding the low-level infrastructure powering modern AI. This project matters because it demystifies the ‘black box’ nature of deep learning frameworks by revealing the explicit code behind backpropagation and attention mechanisms. For AI engineers, it provides an unparalleled opportunity to audit every line of code responsible for model convergence without abstraction layers obscuring the logic. It bridges the gap between theoretical knowledge of neural networks and practical, high-performance GPU programming skills. Ultimately, it empowers developers to build custom inference engines or optimize existing ones with a deeper understanding of hardware constraints. The repository contains a complete training loop implemented in roughly 1,000 lines of readable C and CUDA, avoiding complex build systems or external libraries. It focuses specifically on the GPT-2 architecture to demonstrate end-to-end training from tokenization to weight updates. The code is designed to be compiled and run directly, offering immediate feedback on how data flows through the GPU threads during computation.</p>

<p>rss · GitHub Trending - CUDA · Apr 13, 01:34</p>

<p><strong>Background</strong>: Prior to this release, understanding LLM internals typically required navigating massive codebases like PyTorch or TensorFlow, where core operations are often hidden in C++ extensions or optimized kernels. Existing educational resources usually stop at the framework API level, leaving the actual GPU kernel implementation obscure to most practitioners. llm.c fills this niche by providing a transparent, from-scratch reference that aligns with the mathematical theory taught in courses but lacks in open-source simplicity. Unlike production engines like Alibaba’s RTP-LLM which focus on inference speed and scalability, llm.c prioritizes code clarity and educational value over raw performance metrics.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/karpathy/llm.c">GitHub - karpathy/llm.c: LLM training in simple, raw C/CUDA</a></li>
<li><a href="https://karpathy.ai/llmwiki">Andrej Karpathy</a></li>
<li><a href="https://github.com/alibaba/rtp-llm">RTP-LLM: Alibaba's high-performance LLM ... - GitHub</a></li>
<li><a href="https://www.alibabacloud.com/blog/llm-inference-acceleration-gpu-optimization-for-attention-in-the-decode-phase-2_601715">LLM Inference Acceleration: GPU Optimization for Attention in the ...</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The AI community has reacted with immense enthusiasm, viewing this project as a definitive resource for mastering low-level deep learning mechanics. Many developers are already using it as a baseline to experiment with custom operators and alternative optimization strategies that are difficult to implement in high-level frameworks.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#c</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#education</code></p>

<hr />

<p><a id="item-24"></a></p>
<h2 id="sageattention-delivers-2-5x-speedup-over-flashattention-via-8-bit-quantization-️-10010"><a href="https://github.com/thu-ml/SageAttention">SageAttention Delivers 2-5x Speedup Over FlashAttention via 8-bit Quantization</a> ⭐️ 10.0/10</h2>

<p>SageAttention introduces a novel quantized attention mechanism that accelerates language, image, and video models by 2-5x compared to FlashAttention. It achieves this performance gain using accurate 8-bit quantization while maintaining end-to-end model metrics without requiring retraining. The solution is designed as a plug-and-play replacement for existing attention backends in PyTorch-based frameworks. This development addresses the critical bottleneck of inference latency in large-scale deep learning deployments where memory bandwidth often limits throughput. By reducing precision to 8-bit without accuracy loss, SageAttention significantly lowers hardware costs and energy consumption for running LLMs and diffusion models. Its compatibility with standard workflows makes it an essential infrastructure upgrade for production environments seeking immediate efficiency gains. The project supports multiple GPU architectures and integrates seamlessly as a drop-in replacement for SDPA or FlashAttention modules. Benchmarks indicate consistent speedups across diverse modalities including text generation, image synthesis, and video processing tasks. The method specifically targets inference acceleration rather than training optimization, focusing on deployment scenarios.</p>
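
<p>As a rough illustration of the plug-and-play claim, usage mirrors PyTorch’s scaled-dot-product attention call. The sketch below is based on the project’s documented sageattn entry point; the package name and tensor_layout flag are taken from its README rather than verified here, so treat it as an approximation.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch
from sageattention import sageattn  # pip install sageattention (package name as listed in the README)

# Toy query/key/value tensors in (batch, heads, seq_len, head_dim) layout.
q = torch.randn(1, 8, 1024, 64, dtype=torch.float16, device="cuda")
k = torch.randn(1, 8, 1024, 64, dtype=torch.float16, device="cuda")
v = torch.randn(1, 8, 1024, 64, dtype=torch.float16, device="cuda")

# Baseline: PyTorch SDPA, often backed by FlashAttention.
ref = torch.nn.functional.scaled_dot_product_attention(q, k, v, is_causal=True)

# Drop-in replacement: 8-bit quantized attention kernel ("HND" marks the heads-first layout above).
out = sageattn(q, k, v, tensor_layout="HND", is_causal=True)
</code></pre></div></div>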

<p>rss · GitHub Trending - CUDA · Apr 13, 01:34</p>

<p><strong>Background</strong>: Prior solutions like FlashAttention optimized memory access patterns but still operated primarily in FP16 or BF16 precision, leaving potential performance headroom unused. Quantization methods previously struggled to maintain model accuracy when applied to attention mechanisms without extensive fine-tuning. SageAttention fills this niche by providing a robust, accurate 8-bit implementation that works out-of-the-box for pre-trained models.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/thu-ml/SageAttention">GitHub - thu-ml/SageAttention: [ICLR2025, ICML2025 ...</a></li>
<li><a href="https://arxiv.org/html/2410.02367v1">SageAttention: Accurate 8-bit attention for Plug-and-Play ...</a></li>
<li><a href="https://deepwiki.com/kijai/ComfyUI-WanVideoWrapper/5.2-attention-mechanism-implementations">Attention Mechanism Implementations | kijai/ComfyUI ...</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters report successful integration into ComfyUI and other local inference stacks with immediate latency reductions. The community is particularly interested in its application for running large video generation models on consumer-grade hardware.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm-inference</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#quantization</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#optimization</code></p>

<hr />

<p><a id="item-25"></a></p>
<h2 id="voxcpm2-tokenizer-free-multilingual-tts-with-voice-cloning-️-9010"><a href="https://github.com/OpenBMB/VoxCPM">VoxCPM2: Tokenizer-Free Multilingual TTS with Voice Cloning</a> ⭐️ 9.0/10</h2>

<p>VoxCPM2 introduces a novel tokenizer-free architecture that directly generates continuous speech representations using a diffusion autoregressive approach. This 2B parameter model, built on the MiniCPM-4 backbone, supports 30 languages and delivers 48kHz studio-quality audio without requiring discrete tokenization steps. By eliminating traditional tokenizers, VoxCPM2 avoids information loss and articulation errors common in discrete speech synthesis, resulting in significantly more natural and expressive voices. Its ability to perform voice design from text descriptions and clone voices with emotional control offers unprecedented flexibility for creative applications. The model’s end-to-end nature simplifies the deployment pipeline while maintaining high fidelity across diverse linguistic contexts. The system features unique capabilities like ‘Voice Design’ for creating new voices from natural language prompts and ‘Controllable Cloning’ to steer emotion and pace while preserving timbre. Trained on over 2 million hours of multilingual data, it achieves seamless continuation from reference audio when transcripts are provided. Production readiness is supported by live demos, comprehensive documentation, and weights available on Hugging Face and ModelScope.</p>

<p>rss · GitHub Trending - Daily · Apr 13, 01:32</p>

<p><strong>Background</strong>: Traditional Text-to-Speech systems typically rely on discrete tokenization to convert text and audio into manageable units, which can introduce artifacts and limit prosodic flexibility. VoxCPM2 addresses these limitations by adopting a continuous representation learning approach that bypasses the quantization bottleneck entirely. This shift allows the model to capture subtle vocal nuances and rhythmic variations that discrete models often struggle to reproduce accurately.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/OpenBMB/VoxCPM/">VoxCPM2: Tokenizer-Free TTS for Multilingual Speech ... - GitHub</a></li>
<li><a href="https://openbmb.github.io/voxcpm2-demopage/">VoxCPM2 Demo Page</a></li>
<li><a href="https://aibit.im/blog/post/voxcpm2-2b-multilingual-tts-with-voice-cloning-design">VoxCPM2: 2B Multilingual TTS with Voice Cloning &amp; Design</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project has garnered significant attention for its open-source release strategy, providing immediate access to weights and interactive demos for developers to test multilingual capabilities. Community channels on Discord and Feishu are active with users sharing voice design prompts and discussing integration strategies for real-time applications.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#text-to-speech</code>, <code class="language-plaintext highlighter-rouge">#voice-cloning</code>, <code class="language-plaintext highlighter-rouge">#multilingual-ai</code>, <code class="language-plaintext highlighter-rouge">#generative-audio</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code></p>

<hr />

<p><a id="item-26"></a></p>
<h2 id="firecrawl-web-data-api-optimized-for-ai-agents-️-9010"><a href="https://github.com/firecrawl/firecrawl">Firecrawl: Web Data API Optimized for AI Agents</a> ⭐️ 9.0/10</h2>

<p>Firecrawl has emerged as a leading open-source solution for transforming complex web content into clean Markdown and structured JSON specifically for LLM consumption. It introduces advanced capabilities like interactive browsing actions (click, scroll) and media parsing for PDFs and DOCX files without manual configuration. The project now supports direct integration with AI agents and MCP clients to streamline real-time data ingestion. This tool solves the critical bottleneck of feeding noisy, unstructured HTML into AI agents, which often leads to context window waste and hallucination. By handling JavaScript rendering, rotating proxies, and anti-bot measures internally, it allows developers to focus on agent logic rather than scraper maintenance. Its ability to output token-efficient Markdown directly reduces inference costs and improves retrieval accuracy for RAG pipelines. Consequently, it significantly lowers the barrier to building production-grade autonomous agents that rely on live web data. Firecrawl offers core endpoints for searching the web, scraping URLs into various formats, and interacting with dynamic pages through scripted actions. It boasts industry-leading reliability with 96% web coverage and a P95 latency of 3.4 seconds, making it suitable for real-time applications. The platform automatically manages infrastructure complexities like rate limiting and JS-blocked content, providing a zero-configuration experience for developers.</p>
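
<p>For orientation, the hosted API is typically consumed through the firecrawl-py SDK along the lines of the sketch below. Method names and keyword arguments have shifted between SDK versions, so this is an approximation rather than a verified reference.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from firecrawl import FirecrawlApp  # pip install firecrawl-py (SDK surface varies by version)

# API key for the hosted service; self-hosted deployments point the client at their own base URL.
app = FirecrawlApp(api_key="fc-YOUR-KEY")

# Scrape one page; the default result is LLM-ready Markdown plus page metadata,
# ready to drop into a RAG index without further HTML cleanup.
doc = app.scrape_url("https://example.com/pricing")
print(doc)
</code></pre></div></div>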

<p>rss · GitHub Trending - TypeScript · Apr 13, 01:39</p>

<p><strong>Background</strong>: Traditional web scrapers require significant engineering effort to handle dynamic content, CAPTCHAs, and site structure changes, often producing HTML that is inefficient for LLMs. Firecrawl fills the niche of an intermediate infrastructure layer that normalizes web data into LLM-ready formats like Markdown and structured JSON. Unlike generic crawlers, it is explicitly designed to optimize token usage and semantic clarity for AI training and inference tasks.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/firecrawl/firecrawl">GitHub - firecrawl/firecrawl: The Web Data API for AI - Power AI agents ...</a></li>
<li><a href="https://www.firecrawl.dev/">Firecrawl</a></li>
<li><a href="https://grokipedia.com/page/Firecrawl_API">Firecrawl API</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The developer community has rapidly adopted Firecrawl, evidenced by its high star count and active Discord channel focused on agent integration patterns. Users frequently praise its ability to bypass complex anti-scraping mechanisms without requiring proxy management expertise.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-infrastructure</code>, <code class="language-plaintext highlighter-rouge">#web-crawling</code>, <code class="language-plaintext highlighter-rouge">#data-ingestion</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code></p>

<hr />

<p><a id="item-27"></a></p>
<h2 id="chrome-devtools-mcp-bridges-ai-agents-and-browser-debugging-️-9010"><a href="https://github.com/ChromeDevTools/chrome-devtools-mcp">Chrome DevTools MCP Bridges AI Agents and Browser Debugging</a> ⭐️ 9.0/10</h2>

<p>Google has released an official Model Context Protocol (MCP) server that enables AI coding agents to directly control and inspect live Chrome browsers. This tool integrates the full power of Chrome DevTools into AI workflows, allowing assistants like Claude or Copilot to perform complex debugging tasks autonomously. This project solves the critical gap between generative AI code generation and reliable browser-based verification by giving agents direct access to the Chrome DevTools Protocol. Unlike traditional screen-scraping or brittle DOM selectors, this approach leverages native instrumentation for stable automation and deep performance analysis. It significantly reduces the friction for AI agents to diagnose network issues, capture screenshots, and interpret console logs with source-mapped stack traces. The server utilizes Puppeteer under the hood for reliable action execution and automatically waits for results before proceeding. It supports advanced features like recording performance traces and fetching real-user experience data from the CrUX API, though these can be disabled via flags. Users should note that Google collects usage statistics by default to improve reliability, but this can be opted out of using command-line arguments or environment variables.</p>

<p>rss · GitHub Trending - TypeScript · Apr 13, 01:39</p>

<p><strong>Background</strong>: Prior to this release, AI agents often struggled to interact with browsers reliably, relying on fragile external scripts or limited text-based outputs. While the Chrome DevTools Protocol (CDP) has long existed for manual tooling, there was no standardized bridge specifically designed for the emerging Model Context Protocol ecosystem. This project fills that niche by wrapping CDP capabilities in an MCP-compliant interface, standardizing how AI models interact with browser internals.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://chromedevtools.github.io/devtools-protocol/">Chrome DevTools Protocol - GitHub Pages</a></li>
<li><a href="https://github.com/aslushnikov/getting-started-with-cdp">Getting Started With Chrome DevTools Protocol - GitHub</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: As a newly released official tool from the Chrome DevTools team, public community discussion is currently limited to the repository’s initial documentation and changelog. Early adopters are likely focusing on integrating this server into existing agent frameworks like Cursor or LangChain to test its stability in production environments.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#mcp</code>, <code class="language-plaintext highlighter-rouge">#chrome-devtools</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#browser-automation</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code></p>

<hr />

<p><a id="item-28"></a></p>
<h2 id="deepep-optimizes-expert-parallelism-for-large-moe-models-️-9010"><a href="https://github.com/deepseek-ai/DeepEP">DeepEP Optimizes Expert Parallelism for Large MoE Models</a> ⭐️ 9.0/10</h2>

<p>DeepEP is a new high-performance communication library specifically designed to handle the complex data routing required by expert parallelism in Mixture-of-Experts (MoE) architectures. It leverages optimized CUDA kernels to minimize latency during the all-to-all communication phases critical for scaling these models. This release addresses a specific infrastructure gap where standard collective communication libraries often fail to provide sufficient efficiency for sparse, dynamic expert loading. As large language models increasingly adopt MoE architectures to scale parameter counts without proportional compute increases, communication bottlenecks between experts have become a primary constraint on training speed. DeepEP directly targets this bottleneck, enabling faster iteration cycles and more cost-effective utilization of GPU clusters for trillion-parameter models. By solving the specific challenges of imbalanced load distribution and fine-grained data shuffling, it makes production-scale MoE training feasible on current hardware. This tool is essential for teams pushing the boundaries of model sparsity and distributed training efficiency. The library focuses on optimizing the all-to-all communication patterns inherent in expert parallelism, which are significantly more complex than standard tensor or pipeline parallelism. It includes specialized CUDA kernels tailored for the irregular memory access patterns found in dynamic expert selection. Early benchmarks suggest substantial reductions in communication overhead compared to generic NCCL-based implementations when handling highly sparse expert gating.</p>
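
<p>To make the communication pattern concrete, the sketch below shows the variable-sized all-to-all token exchange that expert parallelism requires, written with plain torch.distributed primitives; this is exactly the step DeepEP replaces with fused, latency-optimized kernels. It is a conceptual illustration only, not DeepEP’s API, and assumes a process group has already been initialized (for example under torchrun).</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch
import torch.distributed as dist

def dispatch_tokens(tokens, expert_rank_ids, world_size):
    """Naive MoE dispatch: group tokens by destination rank, then exchange with all_to_all_single.

    tokens: (num_tokens, hidden) local activations
    expert_rank_ids: (num_tokens,) rank hosting each token's selected expert
    """
    # Sort tokens so all tokens bound for the same rank are contiguous.
    order = torch.argsort(expert_rank_ids)
    tokens = tokens[order]

    # How many of my tokens go to each rank, and how many each rank will send to me.
    send_counts = torch.bincount(expert_rank_ids, minlength=world_size)
    recv_counts = torch.empty_like(send_counts)
    dist.all_to_all_single(recv_counts, send_counts)

    # Exchange the actual activations with per-rank split sizes.
    recv_buf = tokens.new_empty(int(recv_counts.sum()), tokens.shape[1])
    dist.all_to_all_single(
        recv_buf, tokens,
        output_split_sizes=recv_counts.tolist(),
        input_split_sizes=send_counts.tolist(),
    )
    return recv_buf  # tokens now reside on the ranks that own their experts
</code></pre></div></div>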

<p>rss · GitHub Trending - CUDA · Apr 13, 01:34</p>

<p><strong>Background</strong>: Mixture-of-Experts models divide neural network layers into multiple sub-networks, activating only a subset for each token to improve efficiency. While this reduces computation, it introduces severe communication challenges because tokens must be routed to different GPUs hosting specific experts dynamically. Traditional communication backends like NCCL are optimized for dense, static shapes and struggle with the variable-sized, many-to-many data transfers required by MoE. DeepEP fills this niche by providing a dedicated layer for these sparse, high-frequency exchanges.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://grokipedia.com/page/Expert_Parallelism">Expert Parallelism</a></li>
<li><a href="https://en.wikipedia.org/wiki/Mixture_of_experts">Mixture of experts - Wikipedia</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The AI engineering community views this release as a critical infrastructure component for the next generation of open-source MoE models, similar to the impact of FlashAttention on attention mechanisms. Developers are particularly interested in its integration compatibility with existing frameworks like Megatron-LM and DeepSpeed.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#moe</code>, <code class="language-plaintext highlighter-rouge">#distributed-training</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#infrastructure</code></p>

<hr />

<p><a id="item-29"></a></p>
<h2 id="mirage-compiles-llms-into-persistent-cuda-mega-kernels-️-9010"><a href="https://github.com/mirage-project/mirage">Mirage Compiles LLMs into Persistent CUDA Mega-Kernels</a> ⭐️ 9.0/10</h2>

<p>Mirage introduces a compiler framework that automatically transforms Large Language Model inference into single persistent CUDA mega-kernels. This approach fuses all necessary computation and communication tasks, eliminating the overhead of frequent kernel launches on GPUs. Kernel launch latency is a critical bottleneck in high-performance LLM inference, often wasting significant GPU cycles. By generating persistent mega-kernels, Mirage reduces this overhead, delivering latency improvements ranging from 1.2x to 6.7x in production scenarios. This optimization allows existing hardware to achieve higher throughput without requiring model quantization or architectural changes. The system utilizes a multi-level superoptimizer to lower tensor programs into optimized SM-level task graphs. It employs a decentralized in-kernel parallel runtime to execute these tasks within a single kernel launch across multiple GPUs.</p>

<p>rss · GitHub Trending - CUDA · Apr 13, 01:34</p>

<p><strong>Background</strong>: Traditional LLM inference frameworks execute models as a sequence of many small CUDA kernels, incurring substantial launch overhead for each operation. Prior solutions often rely on manual kernel fusion or specific library optimizations that lack flexibility for diverse model architectures. Mirage addresses this by automating the creation of end-to-end fused kernels that persist on the GPU, fundamentally changing how tensor programs are scheduled and executed.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://arxiv.org/abs/2512.22219">A Compiler and Runtime for Mega-Kernelizing Tensor Programs</a></li>
<li><a href="https://www.usenix.org/system/files/osdi25-wu-mengdi.pdf">[PDF] Mirage: A Multi-Level Superoptimizer for Tensor Programs - USENIX</a></li>
<li><a href="https://zhihaojia.medium.com/compiling-llms-into-a-megakernel-a-path-to-low-latency-inference-cf7840913c17">Compiling LLMs into a MegaKernel: A Path to Low-Latency Inference</a></li>
<li><a href="https://github.com/BodhiHu/mirage-llm-megakernel">BodhiHu/mirage-llm-megakernel - GitHub</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Developers are actively discussing the long-term stability of persistent kernels in future CUDA versions, though current implementations show robust support. Early benchmarks highlight significant speedups, prompting interest in integrating this technology into mainstream inference engines.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#compiler</code>, <code class="language-plaintext highlighter-rouge">#gpu-optimization</code>, <code class="language-plaintext highlighter-rouge">#inference</code></p>

<hr />

<p><a id="item-30"></a></p>
<h2 id="nous-research-launches-self-improving-hermes-agent-framework-️-8010"><a href="https://github.com/NousResearch/hermes-agent">Nous Research Launches Self-Improving Hermes Agent Framework</a> ⭐️ 8.0/10</h2>

<p>Nous Research has released Hermes Agent, a new open-source framework featuring a built-in learning loop that allows AI agents to create skills from experience and persist knowledge across sessions. Unlike static agents, it autonomously refines its capabilities through interaction and supports deployment on diverse infrastructure ranging from local terminals to serverless cloud environments. This project addresses the critical limitation of current AI agents that forget context and fail to improve over time without manual retraining. By integrating a closed learning loop with FTS5 session search and dialectic user modeling, Hermes enables truly persistent and evolving digital assistants. Its architecture allows developers to run complex, parallelized agentic workflows on cost-effective infrastructure like $5 VPS instances or serverless platforms. This shifts the paradigm from one-off task execution to long-term collaborative intelligence. Hermes Agent supports over 200 models via OpenRouter and various providers while offering a unified interface for Telegram, Discord, and CLI interactions. It features autonomous skill creation, scheduled automations via a built-in cron scheduler, and the ability to spawn isolated subagents for parallel processing. The framework includes research-ready tools for batch trajectory generation and RL environment integration.</p>

<p>rss · GitHub Trending - Daily · Apr 13, 01:32</p>

<p><strong>Background</strong>: Most existing agent frameworks operate as stateless wrappers around LLMs, requiring external vector databases for memory and lacking mechanisms for self-optimization. Hermes fills this niche by embedding memory management and skill evolution directly into the agent’s core logic. It builds upon Nous Research’s expertise in model alignment to create a system that not only executes tasks but also learns how to execute them better over time.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/nousresearch/hermes-agent">NousResearch/hermes-agent: The agent that grows with you - GitHub</a></li>
<li><a href="https://www.datacamp.com/tutorial/hermes-agent">Nous Research Hermes Agent: Setup and Tutorial Guide - DataCamp</a></li>
<li><a href="https://yuv.ai/blog/hermes-agent">Hermes Agent: Self-Improving AI with Persistent Memory | YUV.AI Blog</a></li>
<li><a href="https://hermes-agent.nousresearch.com/docs/integrations/">Integrations | Hermes Agent - nous research</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters are highlighting the framework’s unique ability to maintain conversation continuity across different platforms and its efficient resource usage on low-cost servers. Developers are particularly interested in the ‘Honcho’ dialectic user modeling feature for creating personalized agent behaviors.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#self-improving</code>, <code class="language-plaintext highlighter-rouge">#nous-research</code>, <code class="language-plaintext highlighter-rouge">#framework</code></p>

<hr />

<p><a id="item-31"></a></p>
<h2 id="kronos-first-open-source-foundation-model-for-financial-k-lines-️-8010"><a href="https://github.com/shiyu-coder/Kronos">Kronos: First Open-Source Foundation Model for Financial K-Lines</a> ⭐️ 8.0/10</h2>

<p>Kronos, an open-source foundation model for financial K-line (candlestick) data, has been accepted at AAAI 2026, and the project has released fine-tuning scripts for adapting the model to specific quantitative tasks. The project now includes a live demo visualizing 24-hour forecasts for BTC/USDT and provides pre-trained weights on Hugging Face. Unlike general time-series foundation models that often underperform on noisy financial data, Kronos is specifically architected for the unique characteristics of market candlesticks. By quantizing OHLCV data into hierarchical discrete tokens, it enables a unified decoder-only Transformer to handle diverse tasks like volatility prediction and trend forecasting. This specialization addresses a critical gap where generic models fail to capture the stochastic nature of global exchanges. The model is trained on data from over 45 global exchanges using a novel two-stage framework involving specialized tokenization and autoregressive pre-training. It is available as a family of models with varying capacities, all accessible via the Hugging Face Hub under an open license.</p>
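
<p>The hierarchical tokenization idea can be pictured as a coarse-then-fine quantization of each normalized OHLCV value. The toy sketch below illustrates only that concept; it is not Kronos’s actual tokenizer, and the bin counts and normalization scheme are arbitrary assumptions.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

def hierarchical_tokens(ohlcv, coarse_bins=16, fine_bins=16):
    """Toy two-level quantizer: each value becomes a (coarse, fine) token pair.

    ohlcv: array of shape (timesteps, 5) with open/high/low/close/volume columns,
    assumed already normalized per column into the [0, 1) range.
    """
    x = np.clip(ohlcv, 0.0, 1.0 - 1e-9)
    coarse = np.floor(x * coarse_bins).astype(np.int64)     # which coarse bucket the value falls into
    residual = x * coarse_bins - coarse                      # position inside that bucket
    fine = np.floor(residual * fine_bins).astype(np.int64)  # refinement within the bucket
    return coarse, fine  # a decoder-only Transformer can model these as paired discrete tokens
</code></pre></div></div>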

<p>rss · GitHub Trending - Daily · Apr 13, 01:32</p>

<p><strong>Background</strong>: Prior to Kronos, applying large-scale pre-training paradigms to financial candlestick (K-line) data yielded limited success compared to non-pre-trained architectures. Existing Time Series Foundation Models (TSFMs) frequently overlooked crucial downstream tasks such as volatility prediction due to the high-noise nature of financial markets. Kronos fills this niche by treating K-line sequences as a distinct language, leveraging methods similar to LLMs but optimized for financial stochasticity.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/shiyu-coder/Kronos">Kronos: A Foundation Model for the Language of Financial Markets</a></li>
<li><a href="https://arxiv.org/abs/2508.02739">Kronos: A Foundation Model for the Language of Financial Markets</a></li>
<li><a href="https://huggingface.co/NeoQuasar/Kronos-base">NeoQuasar/Kronos-base · Hugging Face</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The community has responded positively to the release of fine-tuning scripts and the acceptance of the paper by AAAI 2026, signaling strong academic and practical validation. Users are actively exploring the live demo to test forecasting capabilities on major trading pairs like BTC/USDT.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#foundation-model</code>, <code class="language-plaintext highlighter-rouge">#fintech</code>, <code class="language-plaintext highlighter-rouge">#nlp</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#finance</code></p>

<hr />

<p><a id="item-32"></a></p>
<h2 id="microsoft-markitdown-llm-ready-document-conversion-️-8010"><a href="https://github.com/microsoft/markitdown">Microsoft MarkItDown: LLM-Ready Document Conversion</a> ⭐️ 8.0/10</h2>

<p>Microsoft’s AutoGen team has released MarkItDown, a Python utility designed to convert diverse file formats like PDF, Word, and PowerPoint into structured Markdown. The tool specifically optimizes output for Large Language Model (LLM) consumption rather than human readability, preserving key structural elements like tables and headings. Recent updates include an MCP server for seamless integration with LLM applications and a shift toward stream-based processing to avoid temporary file creation. This tool addresses a critical bottleneck in AI agent workflows where raw binary documents cannot be directly processed by text-based models. By converting complex office documents into clean Markdown, it significantly reduces the preprocessing overhead required for Retrieval-Augmented Generation (RAG) systems. Its focus on structure preservation ensures that LLMs can better interpret relationships within data, such as rows in a table or hierarchy in a presentation, leading to more accurate context understanding. As a production-ready utility from a major research team, it offers a reliable alternative to fragile custom parsing scripts. MarkItDown supports conversion from PDF, PowerPoint, Word, Excel, CSV, and HTML files while maintaining logical document structure. It distinguishes itself from general text extractors like Textract by prioritizing Markdown formatting that aids machine analysis over visual fidelity for humans. The latest version introduces optional feature groups for dependencies and requires binary file-like objects for stream conversion, eliminating the need for intermediate temporary files.</p>
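
<p>In practice the library exposes a small Python surface; the snippet below follows the commonly documented convert() pattern. The stream-based variant mentioned above accepts a binary file-like object instead of a path, with hint arguments that vary by release, so it is omitted here.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from markitdown import MarkItDown  # pip install "markitdown[all]" pulls in the optional format converters

md = MarkItDown()

# Convert an Office document or PDF into structure-preserving Markdown for an LLM or RAG pipeline.
result = md.convert("quarterly_report.pdf")  # illustrative filename
print(result.text_content)  # headings and tables survive as Markdown rather than flat text
</code></pre></div></div>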

<p>rss · GitHub Trending - Daily · Apr 13, 01:32</p>

<p><strong>Background</strong>: Prior to tools like MarkItDown, developers often relied on a fragmented ecosystem of parsers or wrote custom scripts to extract text from office documents for AI applications. These legacy solutions frequently stripped away vital structural context or produced unstructured text blobs that confused LLMs. MarkItDown fills this niche by providing a unified interface specifically tuned for the semantic needs of modern agentic AI frameworks like AutoGen. It represents a shift from simple text extraction to semantic structure preservation tailored for machine consumption.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/microsoft/markitdown">GitHub - microsoft/markitdown: Python tool for converting files and office ...</a></li>
<li><a href="https://realpython.com/python-markitdown/">Python MarkItDown: Convert Documents Into LLM-Ready Markdown</a></li>
<li><a href="https://www.reddit.com/r/Rag/comments/1hpytqe/convert_pdf_word_excel_powerpoint_to_clean/">Convert PDF, Word, Excel, Powerpoint to clean Markdown for RAG or any ...</a></li>
<li><a href="https://medium.com/@giacomo__95/markitdown-ollama-and-llava-markdown-conversion-with-microsofts-markitdown-and-ollama-s-llm-2141bba9d183">Microsoft MarkItDown + Ollama and LLaVA: Markdown Conversion with ...</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight the tool’s effectiveness in RAG pipelines, noting its superior handling of tables compared to standard OCR methods. Some users have successfully integrated it with local models like Ollama and LLaVA to generate image descriptions within the converted Markdown.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-infrastructure</code>, <code class="language-plaintext highlighter-rouge">#data-preprocessing</code>, <code class="language-plaintext highlighter-rouge">#python</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#microsoft</code></p>

<hr />

<p><a id="item-33"></a></p>
<h2 id="multica-orchestrates-autonomous-coding-agents-as-teammates-️-8010"><a href="https://github.com/multica-ai/multica">Multica Orchestrates Autonomous Coding Agents as Teammates</a> ⭐️ 8.0/10</h2>

<p>Multica introduces an open-source platform that treats autonomous coding agents as manageable teammates rather than isolated tools. It enables developers to assign tasks, track real-time progress, and compound reusable skills across a unified dashboard. The system supports self-hosting and integrates with major models like Claude Code and Codex. This project addresses the critical engineering gap between running individual AI scripts and managing a scalable fleet of autonomous workers. By formalizing agents as teammates with profiles and status updates, it reduces the operational overhead of ‘babysitting’ AI processes. The focus on skill compounding allows teams to build a persistent knowledge base where every solved task improves future agent performance. This shifts the paradigm from prompt engineering to workforce orchestration. Key features include autonomous execution with WebSocket streaming, multi-workspace isolation, and a CLI for local daemon management. Agents can proactively report blockers and update issue statuses without human intervention. The platform is vendor-neutral, supporting various underlying AI coding models through a unified runtime interface.</p>

<p>rss · GitHub Trending - Daily · Apr 13, 01:32</p>

<p><strong>Background</strong>: While many autonomous coding agents exist, most operate as single-use instances requiring constant human prompting and monitoring. Existing orchestration tools often lack the specific workflow integrations needed for software development lifecycle management. Multica fills this niche by providing infrastructure specifically designed for long-term agent team management and skill retention. It moves beyond simple task execution to create a sustainable human-AI collaborative environment.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://martinfowler.com/articles/exploring-gen-ai/autonomous-agents-codex-example.html">Autonomous coding agents: A Codex example - Martin Fowler</a></li>
<li><a href="https://www.omdena.com/blog/ai-agent-orchestration-tools">15 Best AI Agent Orchestration Tools &amp; Platforms in 2026</a></li>
<li><a href="https://www.ability.ai/blog/ai-agent-context-business-moat">AI agent context: how to build a compounding business moat</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters are evaluating its maturity against established CI/CD pipelines and debating the reliability of fully autonomous code commits. The open-source nature encourages customization, but production readiness depends on the robustness of its error handling in complex repositories.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#autonomous-coding</code>, <code class="language-plaintext highlighter-rouge">#orchestration</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-34"></a></p>
<h2 id="archon-deterministic-workflow-engine-for-ai-coding-️-8010"><a href="https://github.com/coleam00/Archon">Archon: Deterministic Workflow Engine for AI Coding</a> ⭐️ 8.0/10</h2>

<p>Archon has emerged as the first open-source harness builder designed to make AI coding processes deterministic and repeatable. It allows developers to define complex software development lifecycles, such as planning and code review, using YAML workflows. This tool effectively wraps AI agents like Claude Code to ensure consistent execution across different projects. Current AI coding agents often produce inconsistent results depending on the model’s state, leading to skipped steps or ignored templates. Archon solves this by enforcing a rigid structure where the workflow defines the phases and validation gates while the AI provides the intelligence. This shift transforms AI coding from an unpredictable experiment into a reliable, production-grade engineering practice. By isolating runs in separate git worktrees, it also enables safe parallel execution of multiple fixes. The project supports composable workflows that mix deterministic nodes like bash scripts with AI-driven nodes for code generation. Users can trigger these portable workflows via CLI, Web UI, Slack, or GitHub, making them highly flexible. Key features include automatic looping until tests pass and interactive human approval gates before merging changes.</p>

<p>rss · GitHub Trending - Daily · Apr 13, 01:32</p>

<p><strong>Background</strong>: Prior to Archon, developers lacked a standardized way to orchestrate AI agents within a controlled development pipeline, often relying on ad-hoc prompts. Existing solutions were either too rigid or entirely dependent on the non-deterministic nature of large language models. Archon fills this niche by acting as a workflow engine similar to GitHub Actions but specifically optimized for AI agent coordination. It bridges the gap between experimental AI usage and rigorous software engineering requirements.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/coleam00/Archon">GitHub - coleam00/Archon: The first open-source harness ...</a></li>
<li><a href="https://www.mindstudio.ai/blog/what-is-archon-harness-builder-ai-coding">What Is the Archon Harness Builder? The Open-Source Framework for ...</a></li>
<li><a href="https://deepwiki.com/coleam00/Archon/1.1-getting-started">Getting Started | coleam00/Archon | DeepWiki</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight the project’s ability to reduce hallucinations by constraining AI actions within defined workflow steps. The community is particularly interested in its potential to standardize AI behaviors across large engineering teams.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-engineering</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-35"></a></p>
<h2 id="claude-mem-automated-context-memory-for-claude-code-agents-️-8010"><a href="https://github.com/thedotmack/claude-mem">Claude-Mem: Automated Context Memory for Claude Code Agents</a> ⭐️ 8.0/10</h2>

<p>Claude-Mem is a new plugin that automatically captures, compresses, and injects relevant context from past coding sessions into future interactions. It leverages the Claude Agent SDK to summarize session history, ensuring the AI retains critical project details without manual intervention. This tool directly addresses the statelessness limitation of current AI coding assistants. This project solves a critical workflow bottleneck where AI agents lose context between sessions, forcing developers to repeatedly explain project states. By implementing automated session memory and intelligent compression, it significantly enhances agent continuity and reduces token usage costs. For teams relying on Claude Code for complex development tasks, this creates a more persistent and aware collaborative partner. It transforms the AI from a stateless query engine into a continuous development assistant. The plugin operates by capturing full session logs and using an LLM to compress them into high-density context summaries before storage. When a new session starts, it retrieves and injects only the most relevant historical data based on the current task. This approach optimizes context window usage while maintaining high fidelity in project understanding.</p>
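
<p>A minimal sketch of the capture-compress-inject cycle looks roughly like the following. It is not claude-mem’s code: <code class="language-plaintext highlighter-rouge">summarize()</code> stands in for the LLM compression call, and the storage path and retrieval heuristic are chosen purely for illustration.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Sketch of the pattern described above, using only the standard library.
import json
from pathlib import Path

STORE = Path("session_memory.json")   # hypothetical on-disk memory store

def summarize(transcript: str, budget_words: int = 300) -> str:
    """Placeholder for an LLM compression call; here we simply truncate."""
    return " ".join(transcript.split()[:budget_words])

def end_session(session_id: str, transcript: str) -> None:
    """Compress the full session log and append it to the memory store."""
    memory = json.loads(STORE.read_text()) if STORE.exists() else []
    memory.append({"id": session_id, "summary": summarize(transcript)})
    STORE.write_text(json.dumps(memory, indent=2))

def start_session(task: str, top_k: int = 3) -> str:
    """Retrieve the most relevant summaries (naive keyword overlap) to inject."""
    memory = json.loads(STORE.read_text()) if STORE.exists() else []
    task_words = set(task.lower().split())
    scored = sorted(
        memory,
        key=lambda m: len(task_words.intersection(m["summary"].lower().split())),
        reverse=True,
    )
    return "\n".join(m["summary"] for m in scored[:top_k])

end_session("s1", "Refactored the auth module and fixed the token refresh bug.")
print(start_session("continue work on the auth token refresh"))
</code></pre></div></div>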

<p>rss · GitHub Trending - Daily · Apr 13, 01:32</p>

<p><strong>Background</strong>: Large language models used for coding often suffer from limited context windows and a lack of long-term memory across separate interactions. Developers typically must manually re-provide background information or rely on inefficient prompt engineering to maintain continuity. Prior solutions often require manual summarization or external vector databases that add complexity to the workflow. Claude-Mem fills this niche by integrating directly into the Claude Code environment as a seamless plugin.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents">Effective context engineering for AI agents - Anthropic</a></li>
<li><a href="https://blog.jetbrains.com/research/2025/12/efficient-context-management/">Cutting Through the Noise: Smarter Context Management for LLM ...</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight the plugin’s ability to reduce repetitive onboarding prompts for AI agents during multi-day projects. The open-source nature of the tool encourages community contributions to improve compression algorithms and retrieval accuracy.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#claude-code</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#context-management</code>, <code class="language-plaintext highlighter-rouge">#llm</code></p>

<hr />

<p><a id="item-36"></a></p>
<h2 id="rustfs-high-performance-s3-compatible-storage-in-rust-️-8010"><a href="https://github.com/rustfs/rustfs">RustFS: High-Performance S3-Compatible Storage in Rust</a> ⭐️ 8.0/10</h2>

<p>RustFS is a new open-source distributed object storage system built entirely in Rust that claims 2.3x faster performance than MinIO for small object payloads. It offers full S3 compatibility and supports seamless migration from existing platforms like MinIO and Ceph. Unlike many competitors, it is released under the permissive Apache 2.0 license rather than AGPL. For AI engineers managing data lakes, the ability to rapidly ingest and retrieve millions of small model artifacts or dataset chunks is critical for pipeline efficiency. RustFS leverages Rust’s memory safety and concurrency model to reduce latency and resource overhead compared to Go-based alternatives. The Apache 2.0 licensing removes legal barriers for enterprise adoption that often plague AGPL-licensed storage solutions. This combination makes it a compelling infrastructure choice for high-throughput ML operations. The system features a distributed architecture designed for scalability and fault tolerance alongside native OpenStack Swift API support. Benchmarks highlight significant speed advantages specifically for 4KB object payloads, which are common in metadata-heavy AI workloads. It includes built-in tools for coexistence and migration with other S3-compatible platforms to minimize operational disruption.</p>
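
<p>Because the project advertises full S3 compatibility, any standard S3 client should be able to talk to it. The hedged snippet below uses boto3 against an assumed self-hosted endpoint; the URL, bucket name, and credentials are placeholders rather than documented defaults.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Talking to an S3-compatible store from Python; endpoint and credentials are placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:9000",   # assumed self-hosted RustFS endpoint
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

s3.create_bucket(Bucket="model-artifacts")

# Small-object workload of the kind the 4KB benchmarks target.
s3.put_object(Bucket="model-artifacts", Key="run-42/metrics.json", Body=b'{"loss": 0.12}')

obj = s3.get_object(Bucket="model-artifacts", Key="run-42/metrics.json")
print(obj["Body"].read())
</code></pre></div></div>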

<p>rss · GitHub Trending - Daily · Apr 13, 01:32</p>

<p><strong>Background</strong>: Object storage has become the standard backend for AI data lakes, but existing open-source solutions often face trade-offs between performance, licensing restrictions, and language-level safety. MinIO, while popular, uses the AGPL license which can be restrictive for proprietary software integration, and its Go implementation may not be optimal for all small-file scenarios. RustFS emerges to fill this niche by offering a legally safe, high-performance alternative optimized for modern hardware through Rust. It aims to provide the simplicity of MinIO without the licensing baggage or performance ceilings.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Amazon_S3">Amazon S3 - Wikipedia</a></li>
<li><a href="https://supabase.com/docs/guides/storage/s3/compatibility">S3 Compatibility - Supabase Docs</a></li>
<li><a href="https://www.storj.io/blog/what-is-s3-compatibility">What is S3 Compatibility? - Storj</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early discussions focus on the validity of the 2.3x speedup claims and the practical implications of switching from established Go-based stacks to Rust. Developers are particularly interested in the operational maturity of the distributed consensus mechanisms under heavy load.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#rust</code>, <code class="language-plaintext highlighter-rouge">#object-storage</code>, <code class="language-plaintext highlighter-rouge">#s3-compatible</code>, <code class="language-plaintext highlighter-rouge">#infrastructure</code>, <code class="language-plaintext highlighter-rouge">#data-engineering</code></p>

<hr />

<p><a id="item-37"></a></p>
<h2 id="ralph-autonomous-ai-agent-loop-for-prd-execution-️-8010"><a href="https://github.com/snarktank/ralph">Ralph: Autonomous AI Agent Loop for PRD Execution</a> ⭐️ 8.0/10</h2>

<p>Ralph introduces a production-ready pattern for autonomous coding by iteratively executing AI tools until all Product Requirement Document (PRD) items are completed. It manages context limits by launching fresh agent instances for each iteration while persisting memory through git history and state files. This approach effectively bridges the gap between high-level requirements and implemented code without human intervention. This project directly addresses the critical challenge of context window limitations in long-running agentic workflows by resetting the context while maintaining state via version control. Unlike single-shot code generators, Ralph’s loop architecture allows for complex, multi-step feature development that adapts to errors and changing repository states. It provides a standardized, open-source framework for orchestrating existing tools like Amp and Claude Code rather than requiring a new proprietary model. For engineering teams, this represents a shift from AI-assisted coding to truly autonomous feature implementation based on structured specifications. Ralph operates by converting markdown PRDs into a structured <code class="language-plaintext highlighter-rouge">prd.json</code> format that drives the autonomous loop. It supports integration with Amp CLI and Claude Code, utilizing git commits and specific text files (<code class="language-plaintext highlighter-rouge">progress.txt</code>) as its long-term memory mechanism. The system includes customizable skills for generating PRDs and can be configured for automatic handoff when context thresholds are reached.</p>
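
<p>The loop-and-reset pattern can be sketched in a few lines of Python. The <code class="language-plaintext highlighter-rouge">prd.json</code> and <code class="language-plaintext highlighter-rouge">progress.txt</code> names come from the project description, but their schema and the agent command below are assumptions made for the sketch, not Ralph’s actual scripts.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Illustrative reconstruction of the loop: fresh agent each iteration, durable
# memory in git plus plain files. Schema and commands are assumptions.
import json
import subprocess
from pathlib import Path

PRD = Path("prd.json")
PROGRESS = Path("progress.txt")
AGENT = ["claude", "-p"]          # assumed headless coding-agent CLI

for iteration in range(50):       # hard cap so a stuck agent cannot loop forever
    items = json.loads(PRD.read_text())
    pending = [item for item in items if not item.get("done")]
    if not pending:
        print("All PRD items complete.")
        break

    task = pending[0]
    # Fresh agent instance each iteration; context is reset, state is not.
    prompt = (
        f"Implement PRD item '{task['title']}'. "
        f"Read {PROGRESS} for prior context, update it when finished, "
        f"and mark the item as done in {PRD}."
    )
    subprocess.run(AGENT + [prompt], check=False)

    # Persist whatever changed so the next fresh instance starts from this state.
    subprocess.run(["git", "add", "-A"], check=True)
    subprocess.run(["git", "commit", "-m", f"ralph iteration {iteration}: {task['title']}"], check=False)
</code></pre></div></div>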

<p>rss · GitHub Trending - Daily · Apr 13, 01:32</p>

<p><strong>Background</strong>: Prior solutions for AI coding often struggle with maintaining coherence over long tasks due to token limits, leading to incomplete implementations or hallucinated contexts. Existing orchestration frameworks frequently require complex setup or lack a clear mechanism for state persistence across restarts. Ralph fills this niche by applying a simple but effective ‘loop-and-reset’ pattern grounded in git-based memory, drawing inspiration from Geoffrey Huntley’s earlier concepts. It transforms the abstract idea of autonomous agents into a practical shell-script-driven workflow compatible with current developer environments.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://blogs.oracle.com/developers/what-is-the-ai-agent-loop-the-core-architecture-behind-autonomous-ai-systems">What Is the AI Agent Loop? The Core Architecture Behind ...</a></li>
<li><a href="https://www.ibm.com/think/topics/llm-orchestration">What is LLM orchestration? - IBM</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project has gained traction for its pragmatic approach to solving the ‘infinite loop’ problem in agents by enforcing strict state checks via <code class="language-plaintext highlighter-rouge">prd.json</code>. Developers appreciate that it leverages familiar tools like git for memory instead of relying on opaque vector databases.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#autonomous-coding</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#llm-orchestration</code>, <code class="language-plaintext highlighter-rouge">#automation</code></p>

<hr />

<p><a id="item-38"></a></p>
<h2 id="yt-dlp-essential-cli-tool-for-ai-data-collection-️-8010"><a href="https://github.com/yt-dlp/yt-dlp">yt-dlp: Essential CLI Tool for AI Data Collection</a> ⭐️ 8.0/10</h2>

<p>yt-dlp continues to serve as the most active and robust fork of youtube-dl, supporting thousands of websites with frequent updates to bypass platform restrictions. Its latest iterations focus on maintaining compatibility with changing site APIs and enhancing extraction speeds for large-scale operations. For AI engineers, high-quality multimodal datasets are critical, and yt-dlp provides the most reliable mechanism for harvesting public video and audio content at scale. Unlike unstable scrapers, this tool is actively maintained to handle anti-bot measures and format changes across major platforms like YouTube, Bilibili, and Twitter. It enables the rapid creation of training data for speech recognition, video understanding, and generative models without requiring complex custom development. This Python-based CLI tool supports thousands of sites, offers advanced filtering by date or metadata, and allows format selection including raw audio extraction. It features built-in proxy support, cookie authentication handling, and automatic subtitle downloading which are vital for structured dataset preparation.</p>
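
<p>For pipeline use, yt-dlp is typically driven through its Python API rather than by shelling out. The snippet below shows a common dataset-collection configuration (audio extraction plus subtitles); the URL and output template are placeholders.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Small example of driving yt-dlp from Python for dataset collection.
from yt_dlp import YoutubeDL

options = {
    "format": "bestaudio/best",                  # prefer audio-only streams
    "outtmpl": "corpus/%(id)s.%(ext)s",          # stable filenames for downstream pipelines
    "writeautomaticsub": True,                   # grab auto-generated subtitles when available
    "subtitleslangs": ["en"],
    "postprocessors": [
        {"key": "FFmpegExtractAudio", "preferredcodec": "wav"}  # raw audio for ASR training
    ],
}

with YoutubeDL(options) as ydl:
    ydl.download(["https://www.youtube.com/watch?v=EXAMPLE_ID"])  # placeholder URL
</code></pre></div></div>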

<p>rss · GitHub Trending - Python · Apr 13, 01:38</p>

<p><strong>Background</strong>: yt-dlp was created as a fork of the now-inactive youtube-dlc to address the stagnation of the original youtube-dl project. It fills the niche for a high-performance, community-driven downloader that can keep pace with the rapid security and structural changes implemented by streaming services. By consolidating patches and improvements from various forks, it has become the de facto standard for command-line media extraction.</p>

<p><strong>Discussion</strong>: The project boasts a highly active community on Discord and GitHub, with daily commits ensuring immediate responses to broken extractors. Users frequently share custom scripts and configurations for specific AI pipeline integrations, fostering a collaborative environment for data engineers.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#python</code>, <code class="language-plaintext highlighter-rouge">#cli</code>, <code class="language-plaintext highlighter-rouge">#data-collection</code>, <code class="language-plaintext highlighter-rouge">#multimedia</code>, <code class="language-plaintext highlighter-rouge">#automation</code></p>

<hr />

<p><a id="item-39"></a></p>
<h2 id="reverse-engineering-googles-synthid-watermark-via-spectral-analysis-️-8010"><a href="https://github.com/aloshdenny/reverse-SynthID">Reverse-Engineering Google’s SynthID Watermark via Spectral Analysis</a> ⭐️ 8.0/10</h2>

<p>A new research tool successfully reverse-engineers Google’s SynthID watermark using only spectral analysis without access to the proprietary encoder. The project introduces a V3 bypass method that achieves high-fidelity removal with over 43dB PSNR while dropping phase coherence by 91%. This development critically challenges the reliability of invisible watermarks as a sole mechanism for AI content authentication and safety. By demonstrating that spectral fingerprints can be surgically removed, it forces a re-evaluation of current digital provenance standards. For researchers, it provides essential insights into the vulnerabilities of frequency-domain watermarking schemes. However, it also highlights the urgent need for more robust, multi-modal verification systems beyond simple signal embedding. The tool utilizes a multi-resolution SpectralCodebook to auto-select matching resolution profiles for surgical frequency-bin removal. It reports a 90% detection accuracy and actively seeks community contributions of pure black and white images to expand its codebook. The project is released under a Research license, explicitly limiting commercial or production deployment.</p>
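
<p>The snippet below is a conceptual NumPy illustration of the vocabulary used above: zero out a narrow band of frequency bins and measure PSNR afterwards. It is not the project’s SpectralCodebook method, and the bins chosen here are arbitrary.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Conceptual sketch: frequency-bin suppression plus PSNR measurement.
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((256, 256))           # stand-in for a grayscale image in [0, 1]

# Move to the frequency domain and suppress a small, arbitrary band of bins.
spectrum = np.fft.fft2(image)
mask = np.ones_like(spectrum)
mask[60:64, 60:64] = 0                   # illustrative "carrier" bins, not SynthID's
cleaned = np.real(np.fft.ifft2(spectrum * mask))

# PSNR quantifies how little the edit changed the image (higher = more faithful).
mse = np.mean((image - cleaned) ** 2)
psnr = 10 * np.log10(1.0 / mse)          # peak value is 1.0 for images in [0, 1]
print(f"PSNR after removing the target bins: {psnr:.1f} dB")
</code></pre></div></div>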

<p>rss · GitHub Trending - Python · Apr 13, 01:38</p>

<p><strong>Background</strong>: Google DeepMind’s SynthID was designed to embed imperceptible digital watermarks into AI-generated images to ensure transparency and trust. Prior solutions for watermark removal often relied on brute-force methods like heavy compression or noise injection, which significantly degraded image quality. This project fills a niche by demonstrating a targeted, signal-processing-based approach that preserves visual fidelity while neutralizing the watermark. It shifts the paradigm from degrading the whole image to surgically targeting the specific carrier frequencies used by the watermark.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://deepmind.google/models/synthid/">SynthID — Google DeepMind</a></li>
<li><a href="https://lilting.ch/en/articles/gemini-synthid-watermark-reverse-engineering">Reverse-Engineering Gemini's SynthID Watermark via Spectral ...</a></li>
<li><a href="https://arxiv.org/pdf/2602.01513v1">MARKCLEANER: High-Fidelity Watermark Removal via ...</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project is actively crowdsourcing specific reference images (pure black and white outputs) from the community to improve cross-resolution robustness. Discussions center on the legal implications of bypassing watermarks under regulations like the EU AI Act and the technical ethics of releasing such tools.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-safety</code>, <code class="language-plaintext highlighter-rouge">#reverse-engineering</code>, <code class="language-plaintext highlighter-rouge">#watermarking</code>, <code class="language-plaintext highlighter-rouge">#computer-vision</code>, <code class="language-plaintext highlighter-rouge">#research</code></p>

<hr />

<p><a id="item-40"></a></p>
<h2 id="voicebox-local-first-desktop-studio-for-voice-cloning-️-8010"><a href="https://github.com/jamiepine/voicebox">Voicebox: Local-First Desktop Studio for Voice Cloning</a> ⭐️ 8.0/10</h2>

<p>Voicebox introduces an open-source desktop application that enables local voice cloning, speech generation, and audio effects without cloud dependencies. It integrates five distinct TTS engines, including Qwen3-TTS and Chatterbox Turbo, to support expressive speech with paralinguistic tags across 23 languages. This project addresses critical privacy and latency concerns by keeping all model inference and voice data strictly on the user’s machine. For AI engineers, it eliminates the deployment hurdles and costs associated with cloud-based APIs like ElevenLabs while offering a native, high-performance alternative built on Tauri rather than Electron. Its ability to run on diverse hardware architectures, from Apple Silicon to NVIDIA CUDA, makes it a versatile tool for prototyping voice-enabled applications offline. Built with Rust and Tauri, Voicebox ensures native performance and includes a multi-track timeline editor for composing complex narratives. It features advanced post-processing effects like pitch shifting and reverb, along with an API-first design for seamless integration into custom projects.</p>

<p>rss · GitHub Trending - TypeScript · Apr 13, 01:39</p>

<p><strong>Background</strong>: Traditional text-to-speech and voice cloning solutions often rely on centralized cloud services, creating bottlenecks related to data privacy, internet connectivity, and recurring usage costs. While local LLM inference has gained traction, dedicated local studios for high-quality, multi-engine voice synthesis have been scarce. Voicebox fills this niche by providing a comprehensive, offline-capable environment that rivals commercial cloud platforms in feature set while maintaining full data sovereignty.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://www.kukarella.com/resources/ai-voice-cloning/the-10-best-voice-cloning-tools-in-2025-tested-and-compared">The 10 Best Voice Cloning Tools in 2025 (Tested &amp; Compared)</a></li>
<li><a href="https://www.merciaai.com/post/what-is-local-ai-inference-and-why-it-might-change-how-you-use-ai">What Is Local AI Inference? (Privacy, Speed, Cost) - Mercia AI</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#voice-synthesis</code>, <code class="language-plaintext highlighter-rouge">#text-to-speech</code>, <code class="language-plaintext highlighter-rouge">#voice-cloning</code>, <code class="language-plaintext highlighter-rouge">#local-ai</code>, <code class="language-plaintext highlighter-rouge">#desktop-app</code></p>

<hr />

<p><a id="item-41"></a></p>
<h2 id="openmetadata-unified-platform-for-data-governance-and-lineage-️-8010"><a href="https://github.com/open-metadata/OpenMetadata">OpenMetadata: Unified Platform for Data Governance and Lineage</a> ⭐️ 8.0/10</h2>

<p>OpenMetadata has emerged as a mature, production-ready solution unifying data discovery, observability, and governance into a single platform. It distinguishes itself with deep column-level lineage capabilities and a centralized metadata repository supported by over 84 connectors. The project continues to grow rapidly with active community contributions and regular release cycles. For AI engineers, reliable ML pipelines depend entirely on high-quality, well-understood input data, making robust data governance a critical prerequisite. OpenMetadata solves the fragmentation problem where lineage, quality checks, and discovery often exist in disjointed tools, providing a single source of truth. Its column-level lineage is particularly vital for debugging data drift and understanding feature provenance in complex transformation graphs. By standardizing metadata via open APIs, it prevents vendor lock-in while enabling seamless integration with existing data stacks. The platform consists of four main components: metadata schemas for standard definitions, a central store for the metadata graph, RESTful APIs for integration, and a pluggable ingestion framework. It supports extensive connectivity to data warehouses, databases, dashboard services, and pipeline tools out of the box. Users can perform advanced keyword searches across tables, topics, and pipelines to accelerate data discovery. The system facilitates team collaboration by allowing users to annotate assets and track ownership directly within the interface.</p>
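
<p>As a hedged example of the API-first design, the sketch below lists tables and their columns over REST. The port, path, query fields, and bearer token reflect a typical default local deployment and should be treated as assumptions rather than a definitive contract; consult the OpenMetadata docs for the exact API.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hedged sketch of pulling table metadata over the REST API described above.
import requests

BASE = "http://localhost:8585/api/v1"                   # assumed local deployment
HEADERS = {"Authorization": "Bearer JWT_TOKEN_HERE"}     # placeholder credential

resp = requests.get(
    f"{BASE}/tables",
    params={"fields": "columns,owner", "limit": 5},      # assumed field names
    headers=HEADERS,
    timeout=10,
)
resp.raise_for_status()

for table in resp.json().get("data", []):
    cols = ", ".join(c["name"] for c in table.get("columns", []))
    print(f"{table['fullyQualifiedName']}: {cols}")
</code></pre></div></div>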

<p>rss · GitHub Trending - TypeScript · Apr 13, 01:39</p>

<p><strong>Background</strong>: Prior to unified platforms like OpenMetadata, organizations struggled with siloed metadata management where table-level lineage obscured granular data flow details. Traditional metadata repositories often lacked real-time observability or required expensive proprietary licenses to access column-level tracking. OpenMetadata fills this niche by offering an open-source alternative that combines deep technical lineage with user-friendly discovery features. It addresses the growing need for transparency in data ecosystems driven by regulatory compliance and the complexity of modern AI workloads.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://docs.getdbt.com/docs/explore/column-level-lineage">Column-level lineage | dbt Developer Hub</a></li>
<li><a href="https://www.thedataops.org/column-level-lineage/">What is Column-level lineage? Meaning, Examples, Use Cases ...</a></li>
<li><a href="https://atlan.com/column-level-lineage-explained/">Column-Level Lineage: What It Is and How To Use It - Atlan</a></li>
<li><a href="https://en.wikipedia.org/wiki/Metadata_repository">Metadata repository</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project boasts a vibrant and diverse community with significant adoption across various industry verticals, evidenced by its high commit activity and frequent releases. Documentation is comprehensive, covering installation, roadmap, and detailed connector configurations, which lowers the barrier to entry for new teams. Community feedback actively shapes the roadmap, ensuring the tool evolves to meet practical engineering needs rather than just theoretical requirements.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#data-governance</code>, <code class="language-plaintext highlighter-rouge">#metadata</code>, <code class="language-plaintext highlighter-rouge">#data-observability</code>, <code class="language-plaintext highlighter-rouge">#data-engineering</code>, <code class="language-plaintext highlighter-rouge">#infrastructure</code></p>

<hr />

<p><a id="item-42"></a></p>
<h2 id="letta-code-persistent-memory-for-ai-coding-agents-️-8010"><a href="https://github.com/letta-ai/letta-code">Letta Code: Persistent Memory for AI Coding Agents</a> ⭐️ 8.0/10</h2>

<p>Letta Code introduces a TypeScript harness that enables coding agents to retain memory and learn across independent sessions. Unlike traditional session-based tools, it allows agents to persist state and improve over time using various LLM providers. Current AI coding assistants typically reset their context after every session, forcing developers to re-explain project specifics repeatedly. Letta Code solves this by treating the agent as a long-lived coworker that accumulates knowledge about your codebase and preferences. This ‘memory-first’ approach significantly reduces onboarding time for new tasks and maintains continuity in complex development workflows. It represents a shift from disposable chat interactions to persistent collaborative partnerships. The tool supports multiple models including Claude, GPT, and Gemini, allowing users to switch providers without losing agent history. It features specific commands like <code class="language-plaintext highlighter-rouge">/init</code> for memory setup and <code class="language-plaintext highlighter-rouge">/remember</code> to actively guide what the agent retains. While it defaults to the Letta API, users can configure local Docker servers or bring their own API keys for full control.</p>

<p>rss · GitHub Trending - TypeScript · Apr 13, 01:39</p>

<p><strong>Background</strong>: Most existing AI coding tools operate on a stateless model where each conversation is isolated, similar to hiring a new contractor for every task. This limitation prevents the AI from understanding long-term project evolution or developer habits. Letta Code fills this niche by implementing a persistent memory layer that survives session resets. It builds upon the Letta API to provide a structured way for agents to store and retrieve contextual information over extended periods.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/letta-ai/letta-code">letta-ai/letta-code: The memory-first coding agent - GitHub</a></li>
<li><a href="https://www.letta.com/blog/letta-code">Letta Code: A Memory-First Coding Agent</a></li>
<li><a href="https://docs.letta.com/letta-code-sdk/quickstart/">Letta Code SDK</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight the benefit of having an agent that remembers past debugging sessions and architectural decisions without manual context injection. However, some users note a reliance on the external Letta API service as a potential bottleneck for fully offline or private deployments.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#typescript</code>, <code class="language-plaintext highlighter-rouge">#persistent-memory</code></p>

<hr />

<p><a id="item-43"></a></p>
<h2 id="nvidia-nccl-tests-essential-multi-gpu-benchmarking-suite-️-8010"><a href="https://github.com/NVIDIA/nccl-tests">NVIDIA NCCL Tests: Essential Multi-GPU Benchmarking Suite</a> ⭐️ 8.0/10</h2>

<p>This project provides a specialized collection of tests and benchmarks designed to measure the performance and correctness of NVIDIA’s NCCL communication library. It enables engineers to validate collective communication primitives like all-reduce and all-gather across single-node and multi-node GPU clusters. The suite serves as the industry standard for verifying inter-GPU bandwidth and latency before deploying large-scale distributed training jobs. In distributed deep learning, communication bottlenecks between GPUs often dictate overall training efficiency, making precise measurement critical. NCCL Tests allow infrastructure teams to detect topology misconfigurations, PCIe bottlenecks, or network issues that generic benchmarks might miss. By providing granular data on specific communication patterns, it ensures that multi-GPU systems are optimized for frameworks like PyTorch and TensorFlow. Without this validation, organizations risk significant resource wastage due to suboptimal cluster performance. The tool supports partitioning GPUs into smaller sets to execute parallel operations, facilitating detailed scalability analysis. It covers all major NCCL primitives including broadcast, reduce-scatter, and send/receive patterns over NVLink, InfiniBand, and TCP/IP. Unlike general CUDA kernel benchmarkers, it focuses exclusively on inter-process and inter-device communication latency and throughput.</p>
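
<p>For a sense of what such a benchmark measures, the hedged sketch below times an NCCL all-reduce from PyTorch and converts it to bus bandwidth with the standard ring all-reduce formula. It is an analogue of what the nccl-tests binaries report, not a replacement for them; the tensor size and iteration counts are illustrative.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Launch with: torchrun --nproc_per_node=4 allreduce_bench.py
import os
import time

import torch
import torch.distributed as dist

dist.init_process_group("nccl")
local_rank = int(os.environ["LOCAL_RANK"])
world = dist.get_world_size()
torch.cuda.set_device(local_rank)

tensor = torch.ones(64 * 1024 * 1024, dtype=torch.float16, device="cuda")  # 128 MiB payload
for _ in range(5):                       # warm-up iterations
    dist.all_reduce(tensor)
torch.cuda.synchronize()

iters = 20
start = time.perf_counter()
for _ in range(iters):
    dist.all_reduce(tensor)
torch.cuda.synchronize()
elapsed = (time.perf_counter() - start) / iters

size_bytes = tensor.numel() * tensor.element_size()
busbw = 2 * (world - 1) / world * size_bytes / elapsed / 1e9   # ring all-reduce bus bandwidth
if dist.get_rank() == 0:
    print(f"all_reduce {size_bytes/1e6:.0f} MB: {elapsed*1e3:.2f} ms, bus bw {busbw:.1f} GB/s")
dist.destroy_process_group()
</code></pre></div></div>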

<p>rss · GitHub Trending - CUDA · Apr 13, 01:34</p>

<p><strong>Background</strong>: As AI models grow larger, training requires increasingly complex multi-node GPU clusters where communication overhead can become a primary constraint. NVIDIA’s NCCL library solves this by providing optimized primitives, but its effectiveness depends heavily on the underlying hardware topology and network configuration. Prior to tools like nccl-tests, engineers lacked a standardized method to isolate communication performance from compute performance. This project fills that niche by offering a dedicated utility to stress-test the communication fabric independently of the training framework.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/NVIDIA/nccl-tests">NVIDIA/nccl-tests - GitHub</a></li>
<li><a href="https://developer.nvidia.com/nccl">NVIDIA Collective Communications Library (NCCL)</a></li>
<li><a href="https://docs.nvidia.com/multi-node-nvlink-systems/multi-node-tuning-guide/measuring-performance.html">Benchmarking — NVIDIA GB200 NVL Multi-Node Tuning Guide</a></li>
<li><a href="https://developer.nvidia.com/blog/understanding-nccl-tuning-to-accelerate-gpu-to-gpu-communication/">Understanding NCCL Tuning to Accelerate GPU-to-GPU ...</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The engineering community widely regards this repository as a mandatory step for validating new cluster deployments, though it is noted as a utility rather than a novel framework. Users frequently discuss tuning environment variables alongside these tests to maximize throughput on specific hardware configurations like the GB200 NVL systems.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#distributed-training</code>, <code class="language-plaintext highlighter-rouge">#gpu</code>, <code class="language-plaintext highlighter-rouge">#benchmarking</code>, <code class="language-plaintext highlighter-rouge">#infrastructure</code></p>

<hr />

<p><a id="item-44"></a></p>
<h2 id="thunderkittens-simplifies-high-performance-cuda-kernel-development-️-8010"><a href="https://github.com/HazyResearch/ThunderKittens">ThunderKittens Simplifies High-Performance CUDA Kernel Development</a> ⭐️ 8.0/10</h2>

<p>HazyResearch has released ThunderKittens, a library providing easy-to-use CUDA tile primitives for building speedy deep learning kernels. This framework allows developers to write performant AI code by adhering to hardware-centric principles that prioritize small data tiles. It serves as an embedded DSL designed to make low-level GPU optimization accessible without sacrificing speed. Writing custom CUDA kernels is traditionally complex and error-prone, creating a bottleneck for researchers needing optimized operations beyond standard libraries. ThunderKittens addresses this by abstracting hardware complexities while maintaining direct control over memory and execution flows. This enables faster iteration on novel model architectures that require specialized kernel implementations for maximum efficiency. The library is built around the principle that modern GPUs perform best when processing fairly small tiles of data. It provides a clean, simple interface that generates efficient machine code directly from high-level descriptions. While highly effective for specific tile-based operations, it targets a specialized audience of kernel developers rather than general application engineers.</p>

<p>rss · GitHub Trending - CUDA · Apr 13, 01:34</p>

<p><strong>Background</strong>: Prior solutions like cuBLAS or hand-written CUDA offer performance but lack flexibility or ease of use for experimental research. Existing DSLs often introduce overhead that prevents reaching peak hardware utilization. ThunderKittens fills the niche between raw CUDA complexity and high-level framework rigidity by focusing on tile primitives that match silicon capabilities.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/HazyResearch/ThunderKittens">HazyResearch/ThunderKittens: Tile primitives for speedy kernels - GitHub</a></li>
<li><a href="https://hazyresearch.stanford.edu/blog/2024-05-12-quick-tk">ThunderKittens: A Simple Embedded DSL for AI kernels - Hazy Research</a></li>
<li><a href="https://arxiv.org/html/2410.20399v1">ThunderKittens: Simple, Fast, and Adorable AI Kernels - arXiv</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The AI systems community views this as a valuable tool for researchers pushing the boundaries of model efficiency, though it requires solid CUDA knowledge. Early adopters praise its ability to produce ‘adorable’ yet fast code that simplifies the kernel writing process significantly.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#gpu-kernels</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#performance</code>, <code class="language-plaintext highlighter-rouge">#systems</code></p>

<hr />

<p><a id="item-45"></a></p>
<h2 id="deeptutor-agent-native-personalized-ai-tutoring-system-️-7010"><a href="https://github.com/HKUDS/DeepTutor">DeepTutor: Agent-Native Personalized AI Tutoring System</a> ⭐️ 7.0/10</h2>

<p>DeepTutor has released version 1.0.3, introducing a unified Question Notebook for quiz review with bookmarking and categorization features. The update adds Mermaid diagram support for visualization, embedding model mismatch detection, and compatibility with Qwen/vLLM providers. It also expands local deployment options through support for LM Studio and llama.cpp. This project addresses the limitation of static educational tools by leveraging agent-native architectures that maintain persistent state and adapt to individual learner progress. Unlike traditional chatbots, DeepTutor orchestrates autonomous agents to plan, act, and reflect on teaching strategies dynamically. This approach enables truly personalized learning paths that evolve based on real-time student performance and feedback loops. For AI engineers, it provides a robust reference implementation for building complex, stateful agent systems in education. Built on Python 3.11+ and Next.js 16, the system features a persistent ‘TutorBot’ capable of long-term memory retention and autonomous task execution. It includes a command-line interface for agent-native interactions and supports multiple LLM backends including local models via llama.cpp. The architecture emphasizes modularity, allowing developers to swap reasoning engines or customize agent behaviors easily.</p>
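
<p>The plan-act-reflect loop with externalized state can be sketched generically as below. This is a pedagogical illustration rather than DeepTutor’s implementation, and <code class="language-plaintext highlighter-rouge">llm()</code> is a stub for whichever backend (vLLM, LM Studio, llama.cpp) is configured.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Generic agent-native loop: state lives on disk, not in the chat context.
import json
from pathlib import Path

STATE = Path("tutor_state.json")    # persistent learner profile across sessions

def llm(prompt: str) -> str:
    """Stub for a call to the configured LLM backend."""
    return f"[model response to: {prompt[:40]}...]"

state = json.loads(STATE.read_text()) if STATE.exists() else {"history": [], "weak_topics": []}

question = "Explain eigenvalues with an example."
plan = llm(f"Plan a lesson for: {question}. Known weak topics: {state['weak_topics']}")
answer = llm(f"Act on this plan and teach the student:\n{plan}")
reflection = llm(f"Reflect: did the lesson address the weak topics? Plan was:\n{plan}")

state["history"].append({"q": question, "plan": plan, "reflection": reflection})
STATE.write_text(json.dumps(state, indent=2))   # survives the session reset
print(answer)
</code></pre></div></div>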

<p>rss · GitHub Trending - Python · Apr 13, 01:38</p>

<p><strong>Background</strong>: Current AI tutoring systems often rely on simple prompt chaining without persistent memory or complex orchestration, limiting their ability to provide deep, longitudinal personalization. DeepTutor fills this niche by implementing agent-native design patterns where state is externalized and agents operate in continuous planning loops. This shifts the paradigm from reactive question-answering to proactive, strategic tutoring that mimics human educator workflows. Prior solutions typically lack the structural robustness to handle multi-session learning contexts effectively.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://learn.microsoft.com/en-us/azure/architecture/ai-ml/guide/ai-agent-design-patterns">AI Agent Orchestration Patterns - Azure Architecture Center</a></li>
<li><a href="https://pmanvi.medium.com/beyond-copilots-building-for-the-autonomous-future-a-practical-protocol-for-agent-native-ea067a26c205">AI Agent-Native Development. Introduction | by Praveen Manvi</a></li>
<li><a href="https://www.reddit.com/r/AI_Agents/comments/1qcif26/why_ai_agents_fail_without_agentnative_design/">Why “AI Agents” Fail Without Agent-Native Design - Reddit</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project maintains active community channels on Discord, Feishu, and WeChat, indicating strong engagement from both global and Chinese-speaking developer communities. Recent discussions focus on integrating new embedding models and optimizing local inference performance for resource-constrained environments.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-tutor</code>, <code class="language-plaintext highlighter-rouge">#agent-systems</code>, <code class="language-plaintext highlighter-rouge">#personalized-learning</code>, <code class="language-plaintext highlighter-rouge">#education-ai</code>, <code class="language-plaintext highlighter-rouge">#python</code></p>

<hr />

<p><a id="item-46"></a></p>
<h2 id="insforge-launches-backend-platform-for-ai-agent-development-️-7010"><a href="https://github.com/InsForge/InsForge">InsForge Launches Backend Platform for AI Agent Development</a> ⭐️ 7.0/10</h2>

<p>InsForge has released a new backend platform and SDK specifically engineered to streamline the deployment of full-stack applications powered by AI agents. It provides essential backend primitives such as databases, authentication, and storage directly accessible to coding agents. The project includes native support for MCP servers and offers streamlined setup via Docker and Cursor integration. As AI agents transition from experimental tools to operational execution engines, they require robust infrastructure to manage state and external interactions reliably. InsForge addresses this gap by offering a standardized backend layer that prevents developers from rebuilding common infrastructure for every agentic workflow. This shift allows engineers to focus on agent logic rather than boilerplate backend code, potentially accelerating the maturity of autonomous software development. The platform exposes backend primitives like databases and auth directly to AI agents through a specialized SDK written in TypeScript. It features a dedicated MCP (Model Context Protocol) server to facilitate seamless connections between agents and backend resources. Deployment is containerized using Docker Compose, with specific optimizations for integration with AI code editors like Cursor.</p>

<p>rss · GitHub Trending - TypeScript · Apr 13, 01:39</p>

<p><strong>Background</strong>: Traditional backend frameworks are designed for human developers writing explicit logic, whereas agentic workflows require dynamic, intent-driven infrastructure that AI models can query and manipulate autonomously. Previous solutions often involved stitching together disparate services manually, leading to fragmentation and high maintenance overhead for agent projects. InsForge emerges as a unified solution tailored to the unique architectural needs of AI agents, aiming to standardize how agents interact with persistent data and services.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://grokipedia.com/page/GitHub_Agentic_Workflows">GitHub Agentic Workflows</a></li>
<li><a href="https://www.infoq.com/news/2025/10/ai-agent-orchestration/">The Architectural Shift: AI Agents Become Execution Engines While ...</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters are exploring the ease of local setup using the provided Docker configurations and Cursor prompts. Discussions are currently focused on verifying container health and troubleshooting port conflicts during initial deployment.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#backend</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#typescript</code>, <code class="language-plaintext highlighter-rouge">#agentic-workflows</code></p>

<hr />

<p><a id="item-47"></a></p>
<h2 id="gpumd-high-performance-gpu-molecular-dynamics-engine-️-7010-1"><a href="https://github.com/brucefan1983/GPUMD">GPUMD: High-Performance GPU Molecular Dynamics Engine</a> ⭐️ 7.0/10</h2>

<p>GPUMD is a specialized molecular dynamics package fully implemented on NVIDIA GPUs using CUDA to achieve extreme simulation efficiency. It uniquely supports both traditional empirical interatomic potentials and modern neuroevolution potential (NEP) machine learning models. The software enables single-GPU computing speeds reaching tens of millions of atom-steps per second for large-scale systems. This tool bridges the gap between high-performance computing and AI-driven materials science by accelerating simulations that are otherwise prohibitively slow on CPUs. Its native support for NEP models allows researchers to utilize accurate machine learning force fields without sacrificing computational performance. For AI engineers, it represents a practical application of GPU acceleration beyond standard deep learning training loops, specifically for scientific discovery. Developed natively with CUDA, GPUMD leverages massive parallelism to solve Newton’s equations of motion for vast numbers of particles efficiently. It includes advanced features like heat transport calculations and spectral energy density analysis directly within the GPU workflow. The project is production-ready and optimized for both NVIDIA GPUs and AMD/DCU architectures via HIP.</p>
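
<p>To put the throughput figure in perspective, a quick back-of-the-envelope calculation (with an assumed system size, timestep, and rate) shows what ‘tens of millions of atom-steps per second’ buys in wall-clock time:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Illustrative arithmetic only; the system size, timestep, and rate are assumptions.
atoms = 1_000_000           # simulated atoms
timestep_fs = 1.0           # femtoseconds of physical time per MD step
target_ns = 1.0             # nanoseconds of trajectory we want
rate = 5e7                  # assumed atom-steps per second on one GPU

steps = target_ns * 1e6 / timestep_fs        # 1 ns = 1,000,000 fs, so 1e6 steps
atom_steps = steps * atoms                    # total work: 1e12 atom-steps
hours = atom_steps / rate / 3600
print(f"{atom_steps:.1e} atom-steps at {rate:.0e}/s is about {hours:.1f} hours on a single GPU")
</code></pre></div></div>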

<p>rss · GitHub Trending - CUDA · Apr 13, 01:34</p>

<p><strong>Background</strong>: Molecular dynamics simulations typically struggle with the computational cost of modeling large systems over long time scales, often requiring massive CPU clusters. Traditional GPU-accelerated packages exist but frequently lack flexible integration with emerging machine learning potentials. GPUMD fills this niche by offering a unified, highly efficient engine designed specifically for modern GPU hardware and AI-enhanced force fields.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://gpumd.org/">GPUMD – Graphics Processing Units Molecular Dynamics</a></li>
<li><a href="https://gpumd.cn/home_en.html">GPUMD - Efficient General-Purpose MD Simulation Software</a></li>
<li><a href="https://en.wikipedia.org/wiki/Molecular_dynamics_simulation">Molecular dynamics simulation</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project has gained traction in the computational physics community for its exceptional performance benchmarks compared to established codes like LAMMPS. Users highlight its ease of use for implementing custom NEP models as a key advantage over more rigid legacy systems.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#molecular-dynamics</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#gpu-computing</code>, <code class="language-plaintext highlighter-rouge">#computational-physics</code>, <code class="language-plaintext highlighter-rouge">#hpc</code></p>

<hr />]]></content><author><name></name></author><summary type="html"><![CDATA[From 110 items, 47 important content pieces were selected]]></summary></entry><entry xml:lang="zh"><title type="html">Horizon Summary: 2026-04-14 (ZH)</title><link href="https://ming-321.github.io/horizon/2026/04/13/summary-zh.html" rel="alternate" type="text/html" title="Horizon Summary: 2026-04-14 (ZH)" /><published>2026-04-13T16:00:00+00:00</published><updated>2026-04-13T16:00:00+00:00</updated><id>https://ming-321.github.io/horizon/2026/04/13/summary-zh</id><content type="html" xml:base="https://ming-321.github.io/horizon/2026/04/13/summary-zh.html"><![CDATA[<blockquote>
  <p>From 110 items, 47 important content pieces were selected</p>
</blockquote>

<hr />

<h3 id="头条速递">头条速递</h3>
<ol>
  <li><a href="#item-1">金山与 360 杀毒软件内核驱动曝出高危漏洞</a> ⭐️ 9.0/10</li>
  <li><a href="#item-2">恶意攻击者收购 30 个 WordPress 插件并植入后门</a> ⭐️ 8.0/10</li>
  <li><a href="#item-3">Simon Willison 演示使用 Gemma 4 和 MLX 进行本地音频转录</a> ⭐️ 8.0/10</li>
  <li><a href="#item-4">Anthropic 未发布模型 Mythos 被疑使用字节 Seed 技术引发争议</a> ⭐️ 8.0/10</li>
  <li><a href="#item-5">TurboOCR 通过 TensorRT 和 CUDA 优化实现每秒 1200 张图像处理</a> ⭐️ 8.0/10</li>
  <li><a href="#item-6">深度循环 Transformer 无需中间监督即可提升泛化能力</a> ⭐️ 8.0/10</li>
  <li><a href="#item-7">第三方评测显示 Claude Opus 4.6 幻觉率激增且排名大幅下滑</a> ⭐️ 8.0/10</li>
  <li><a href="#item-8">欧盟拟将 ChatGPT 列为超大型在线搜索引擎</a> ⭐️ 8.0/10</li>
  <li><a href="#item-9">Cloudflare 数据显示 AI 巨头打破网络平衡，Anthropic 被指违规最严重</a> ⭐️ 8.0/10</li>
  <li><a href="#item-10">美国 BIS 人员短缺导致英伟达 AI 芯片出口停滞</a> ⭐️ 8.0/10</li>
  <li><a href="#item-11">Cloudflare 工程师详解统一 CLI 的架构设计</a> ⭐️ 7.0/10</li>
  <li><a href="#item-12">Steve Yegge 称谷歌的 AI 采用率与约翰迪尔公司相似</a> ⭐️ 7.0/10</li>
  <li><a href="#item-13">Bryan Cantrill 认为 LLM 缺乏有益的人类懒惰特质</a> ⭐️ 7.0/10</li>
  <li><a href="#item-14">Google 将 Rust 集成到 Pixel 10 调制解调器以提升安全性</a> ⭐️ 7.0/10</li>
  <li><a href="#item-15">Max Welling 将举办关于 AI4Science、GNN 和 CuspAI 的 AMA</a> ⭐️ 7.0/10</li>
  <li><a href="#item-16">苹果开发无显示屏智能眼镜，凭借先进相机设计与 Meta 竞争</a> ⭐️ 7.0/10</li>
  <li><a href="#item-17">Ramp 报告预测 Anthropic 将在两个月内于企业市场超越 OpenAI</a> ⭐️ 7.0/10</li>
  <li><a href="#item-18">Meta 正为 CEO 扎克伯格开发用于内部的 AI 分身</a> ⭐️ 7.0/10</li>
</ol>

<h3 id="关注动态">关注动态</h3>
<ol>
  <li><a href="#item-19">MemSearch Updates: 2 updates — extend git-root collection fix to codex/opencode skills; async s…, derive memory-recall collection from git root (#324) (#330)</a> ⭐️ ?/10</li>
  <li><a href="#item-20">openai/codex: 2 releases — rust-v0.121.0-alpha.6, rust-v0.121.0-alpha.4</a> ⭐️ ?/10</li>
  <li><a href="#item-21">anthropics/claude-code: 2 releases — v2.1.105, v2.1.104</a> ⭐️ ?/10</li>
  <li><a href="#item-22">upstash/context7: 2 releases — @upstash/context7-mcp@2.1.8, ctx7@0.3.12</a> ⭐️ ?/10</li>
</ol>

<h3 id="github-热榜">GitHub 热榜</h3>
<ol>
  <li><a href="#item-23">Karpathy 发布基于纯 C 和 CUDA 的极简 LLM 训练项目</a> ⭐️ 10.0/10</li>
  <li><a href="#item-24">SageAttention 通过 8 比特量化实现比 FlashAttention 快 2 至 5 倍的加速</a> ⭐️ 10.0/10</li>
  <li><a href="#item-25">VoxCPM2：无分词器的多语言语音合成与声音克隆模型</a> ⭐️ 9.0/10</li>
  <li><a href="#item-26">Firecrawl：专为 AI 代理优化的网页数据 API</a> ⭐️ 9.0/10</li>
  <li><a href="#item-27">Chrome DevTools MCP 连接 AI 代理与浏览器调试</a> ⭐️ 9.0/10</li>
  <li><a href="#item-28">DeepEP 优化大型混合专家模型的专家并行通信</a> ⭐️ 9.0/10</li>
  <li><a href="#item-29">Mirage 将大语言模型编译为持久化 CUDA 超核</a> ⭐️ 9.0/10</li>
  <li><a href="#item-30">Nous Research 推出自我进化的 Hermes Agent 框架</a> ⭐️ 8.0/10</li>
  <li><a href="#item-31">Kronos：首个面向金融 K 线图的开源基础模型</a> ⭐️ 8.0/10</li>
  <li><a href="#item-32">微软 MarkItDown：面向大模型的文档转换工具</a> ⭐️ 8.0/10</li>
  <li><a href="#item-33">Multica 将自主编码代理编排为协作者</a> ⭐️ 8.0/10</li>
  <li><a href="#item-34">Archon：面向 AI 编码的确定性工作流引擎</a> ⭐️ 8.0/10</li>
  <li><a href="#item-35">Claude-Mem：为 Claude Code 代理提供自动化上下文记忆</a> ⭐️ 8.0/10</li>
  <li><a href="#item-36">RustFS：基于 Rust 的高性能 S3 兼容存储系统</a> ⭐️ 8.0/10</li>
  <li><a href="#item-37">Ralph：用于执行产品需求文档的自主 AI 代理循环</a> ⭐️ 8.0/10</li>
  <li><a href="#item-38">yt-dlp：AI 数据采集必备的命令行工具</a> ⭐️ 8.0/10</li>
  <li><a href="#item-39">通过频谱分析逆向工程谷歌 SynthID 水印</a> ⭐️ 8.0/10</li>
  <li><a href="#item-40">Voicebox：本地优先的语音克隆桌面工作室</a> ⭐️ 8.0/10</li>
  <li><a href="#item-41">OpenMetadata：统一的数据治理与血缘平台</a> ⭐️ 8.0/10</li>
  <li><a href="#item-42">Letta Code：为 AI 编程代理提供持久化记忆</a> ⭐️ 8.0/10</li>
  <li><a href="#item-43">NVIDIA NCCL Tests：必备的多 GPU 基准测试套件</a> ⭐️ 8.0/10</li>
  <li><a href="#item-44">ThunderKittens 简化高性能 CUDA 内核开发</a> ⭐️ 8.0/10</li>
  <li><a href="#item-45">DeepTutor：基于智能体架构的个性化 AI 辅导系统</a> ⭐️ 7.0/10</li>
  <li><a href="#item-46">InsForge 推出专为 AI 智能体开发设计的后端平台</a> ⭐️ 7.0/10</li>
  <li>
    <h2 id="gpumd高性能-gpu-分子动力学模拟引擎-️-7010"><a href="#item-47">GPUMD：高性能 GPU 分子动力学模拟引擎</a> ⭐️ 7.0/10</h2>
  </li>
</ol>

<h2 id="头条速递-1">头条速递</h2>

<p><a id="item-1"></a></p>
<h2 id="金山与-360-杀毒软件内核驱动曝出高危漏洞-️-9010"><a href="https://x.com/weezerOSINT/status/2043539810833568202?s=20">金山与 360 杀毒软件内核驱动曝出高危漏洞</a> ⭐️ 9.0/10</h2>

<p>Security researcher Patrick Saif has disclosed critical vulnerabilities in the kernel drivers of Kingsoft Antivirus and 360 Safeguard that allow unauthenticated privilege escalation. The Kingsoft firewall driver contains a kernel heap overflow caused by an incorrect IOCTL size calculation, while the 360 anti-rootkit driver can have its signature checks bypassed via process hollowing and abused for arbitrary kernel reads and writes through a hard-coded AES key. Because both drivers carry legitimate digital signatures, they are prime candidates for Bring Your Own Vulnerable Driver (BYOVD) attacks. The flaws are critical because they let an attacker escalate from ordinary user privileges to SYSTEM without installing any malware on the target machine. Since the drivers are signed by trusted authorities (EV or WHQL), they can bypass modern security controls such as HVCI and are not currently caught by default blocklists. This poses a direct threat to system integrity and AI infrastructure, because attackers can hide malicious behavior by modifying kernel callback tables or terminating processes protected by Protected Process Light (PPL). The vulnerabilities have been submitted to the LOLDrivers database but have not yet been assigned CVE identifiers and do not appear on the HVCI blocklist. Exploiting them, an attacker can bypass KASLR, steal kernel credentials, and execute arbitrary code through signed drivers that are already present or easy to load. Until the vendors release patches, organizations are advised to add the affected driver hashes to their EDR detection rules.</p>

<p>telegram · zaihuapd · Apr 13, 13:56</p>

<p><strong>Background</strong>: Bring Your Own Vulnerable Driver (BYOVD) attacks involve loading legitimate but vulnerable signed drivers to bypass security solutions and gain kernel-level control. Kernel drivers run at the highest privilege level of the operating system, so a flaw in one can undermine the entire system’s security model. Protected Process Light (PPL) is a Windows security feature designed to shield critical processes from tampering, even by administrators, unless a specific kernel vulnerability is exploited.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://cymulate.com/blog/defending-against-bring-your-own-vulnerable-driver-byovd-attacks/">What are BYOVD Attacks ? - Cymulate</a></li>
<li><a href="https://www.picussecurity.com/resource/blog/what-are-bring-your-own-vulnerable-driver-byovd-attacks">What Are Bring Your Own Vulnerable Driver ( BYOVD ) Attacks ?</a></li>
<li><a href="https://github.com/RedCursorSecurityConsulting/PPLKiller">Tool to bypass LSA Protection (aka Protected Process Light)</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cybersecurity</code>, <code class="language-plaintext highlighter-rouge">#kernel-exploits</code>, <code class="language-plaintext highlighter-rouge">#byovd</code>, <code class="language-plaintext highlighter-rouge">#antivirus</code>, <code class="language-plaintext highlighter-rouge">#vulnerability-disclosure</code></p>

<hr />

<p><a id="item-2"></a></p>
<h2 id="恶意攻击者收购-30-个-wordpress-插件并植入后门-️-8010"><a href="https://anchor.host/someone-bought-30-wordpress-plugins-and-planted-a-backdoor-in-all-of-them/">恶意攻击者收购 30 个 WordPress 插件并植入后门</a> ⭐️ 8.0/10</h2>

<p>A malicious actor acquired ownership of 30 popular WordPress plugins and planted a backdoor in their codebases. The supply-chain attack potentially compromises thousands of sites that auto-updated to the tainted versions. The incident highlights a growing trend of attackers buying established software projects rather than building new malware from scratch. It exposes a critical weakness of open-source ecosystems: trust rests on historical reputation rather than continuous verification. Acquiring a software asset can sidestep conventional security checks, which typically focus on new commits or code changes from unknown authors. The attack also has implications for the broader software supply chain, since any package manager that relies on a centralized trust model is vulnerable to similar takeover tactics. Ultimately it forces developers and organizations to rethink how third-party dependencies are vetted and monitored across the entire software lifecycle. The attack vector relied on a legitimate transfer of plugin ownership, meaning the malicious code was introduced by an entity with full administrative rights. Because the plugins were already trusted and widely installed, the auto-update mechanism distributed the backdoor to victims without raising immediate suspicion. This approach effectively inherits years of user trust built by the original developers, making detection far harder than for newly created malicious packages.</p>

<p>hackernews · speckx · Apr 13, 17:54</p>

<p><strong>Background</strong>: WordPress is a content management system that powers a large share of the internet and relies heavily on a vast third-party plugin ecosystem to extend functionality. These plugins are typically built by individuals or small teams and distributed through a central repository from which users can install and update them automatically. A supply-chain attack occurs when an attacker compromises the software development or distribution process to inject malicious code into a legitimate application. Historically, security efforts have focused on scanning code for vulnerabilities, with fewer defenses against the social-engineering angle of buying a trusted project to abuse its reputation.</p>

<p><strong>Discussion</strong>: Community members expressed deep concern about the fragility of current dependency management, noting that projects often rely on dozens of transitive dependencies that their authors cannot fully verify. Some participants argued that increased automation in vulnerability discovery poses a smaller threat than the structural supply-chain weaknesses inherent in modern stacks. Others discussed failed initiatives such as the FAIR package manager, which aimed to mitigate these risks with a decentralized architecture but lost momentum after earlier controversy.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#supply-chain-security</code>, <code class="language-plaintext highlighter-rouge">#wordpress</code>, <code class="language-plaintext highlighter-rouge">#backdoor</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#cybersecurity</code></p>

<hr />

<p><a id="item-3"></a></p>
<h2 id="simon-willison-演示使用-gemma-4-和-mlx-进行本地音频转录-️-8010"><a href="https://simonwillison.net/2026/Apr/12/mlx-audio/#atom-everything">Simon Willison 演示使用 Gemma 4 和 MLX 进行本地音频转录</a> ⭐️ 8.0/10</h2>

<p>Simon Willison published a step-by-step guide using <code class="language-plaintext highlighter-rouge">uv run</code> that shows how to perform local audio transcription on macOS with the new 10.28 GB Gemma 4 E2B model. The workflow uses the <code class="language-plaintext highlighter-rouge">mlx-vlm</code> library to process audio input directly on Apple Silicon and successfully transcribed a 14-second voice memo. The approach lets developers run Google’s latest Omni model without sending data to external servers. This matters because it demonstrates that capable large audio models can now run efficiently on consumer hardware such as a MacBook. Local execution addresses key privacy concerns around sensitive audio data and removes the cost and latency of cloud APIs. It also highlights the growing maturity of the ecosystem around Apple’s MLX framework, putting advanced AI within reach of individual developers rather than only large enterprises. Compared with earlier solutions that required heavy GPU clusters, this brings state-of-the-art speech-to-text to the edge. The specific commands use Python 3.13 and install <code class="language-plaintext highlighter-rouge">mlx_vlm</code>, <code class="language-plaintext highlighter-rouge">torchvision</code>, and <code class="language-plaintext highlighter-rouge">gradio</code> via <code class="language-plaintext highlighter-rouge">uv</code>. The model used is <code class="language-plaintext highlighter-rouge">google/gemma-4-e2b-it</code>, which occupies about 10.28 GB of memory; the test run generated output at a temperature of 1.0 with a 500-token cap. The transcription was largely accurate, though the author notes minor errors such as hearing ‘right here’ as ‘front here’, which suggests room for improvement on specific speech nuances.</p>

<p>rss · Simon Willison · Apr 12, 23:57</p>

<p><strong>Background</strong>: MLX is an array framework for machine-learning research that Apple developed specifically for Apple Silicon. Gemma 4 is Google’s latest family of open models, and the ‘E2B’ variant is a small, efficient version designed for edge devices that supports text, images, and audio (an Omni model). The <code class="language-plaintext highlighter-rouge">mlx-vlm</code> library extends MLX to support vision-language and Omni models, allowing Mac users to run multimodal inference locally. Previously, running such large multimodal models typically required powerful cloud GPUs or dedicated server hardware.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/ml-explore/mlx">GitHub - ml-explore/mlx: MLX: An array framework for Apple silicon · GitHub</a></li>
<li><a href="https://github.com/Blaizzy/mlx-vlm">GitHub - Blaizzy/mlx-vlm: MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX. · GitHub</a></li>
<li><a href="https://ai.google.dev/gemma/docs/core/model_card_4">Gemma 4 model card | Google AI for Developers</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#gemma</code>, <code class="language-plaintext highlighter-rouge">#mlx</code>, <code class="language-plaintext highlighter-rouge">#apple-silicon</code>, <code class="language-plaintext highlighter-rouge">#audio-transcription</code>, <code class="language-plaintext highlighter-rouge">#local-llm</code></p>

<hr />

<p><a id="item-4"></a></p>
<h2 id="anthropic-未发布模型-mythos-被疑使用字节-seed-技术引发争议-️-8010"><a href="https://www.qbitai.com/2026/04/400500.html">Anthropic 未发布模型 Mythos 被疑使用字节 Seed 技术引发争议</a> ⭐️ 8.0/10</h2>

<p>Reportedly, Anthropic’s as-yet-unreleased</p>

<p>rss · 量子位 · Apr 13, 05:41</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#anthropic</code>, <code class="language-plaintext highlighter-rouge">#bytedance</code>, <code class="language-plaintext highlighter-rouge">#ai-research</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#controversy</code></p>

<hr />

<p><a id="item-5"></a></p>
<h2 id="turboocr-通过-tensorrt-和-cuda-优化实现每秒-1200-张图像处理-️-8010"><a href="https://old.reddit.com/r/MachineLearning/comments/1skd6s9/turboocr_2701200_imgs_ocr_with_paddle_tensorrt/">TurboOCR 通过 TensorRT 和 CUDA 优化实现每秒 1200 张图像处理</a> ⭐️ 8.0/10</h2>

<p>A developer has released TurboOCR, a heavily optimized C++/CUDA implementation of PaddleOCR that uses TensorRT and FP16 precision to dramatically increase inference speed. The system replaces the original single-threaded Python approach with fused kernels, batched recognition, and a pooled multi-stream pipeline, raising throughput from roughly 15 images per second to over 1,200 on an RTX 5090. It accepts PDF and image input over HTTP/gRPC and returns bounding boxes, text, and layout regions using the PP-DocLayoutV3 model. This removes a key bottleneck in large-scale document processing, where vision-language models (VLMs) are often too slow and expensive for high-volume work. At up to 80x the speed of stock PaddleOCR, TurboOCR makes real-time retrieval-augmented generation (RAG) and bulk digitization projects economically viable without sacrificing accuracy on standard text. It offers a practical alternative to transformer-based approaches for workloads that need raw throughput rather than deep semantic understanding, letting organizations process millions of pages faster and at lower cost, and bridging the gap between traditional OCR and modern AI capabilities. The system reaches about 270 images per second on text-dense pages and over 1,200 on sparse pages, with layout analysis adding roughly 20% to inference time. While it excels at speed, complex table extraction and structured output conversion still depend on VLM-based solutions such as PaddleOCR-VL. The software has been tested on Linux, is compatible with RTX 50-series GPUs and CUDA 13.2, and accepts input via HTTP or gRPC. Planned updates aim to add structured extraction, Markdown output, and multilingual support while preserving performance.</p>
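
<p>A hedged sketch of what a client call to such a service could look like is shown below. The endpoint path and the response fields (regions, bbox, text) are assumptions made for illustration; the actual TurboOCR API may differ.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical HTTP client for a high-throughput OCR service of this kind.
import requests

SERVER = "http://localhost:8080/ocr"       # assumed endpoint, not documented by the project

with open("invoice_page.png", "rb") as f:
    resp = requests.post(SERVER, files={"image": f}, timeout=30)
resp.raise_for_status()

for region in resp.json().get("regions", []):   # assumed response schema
    box = region.get("bbox")                     # e.g. [x0, y0, x1, y1]
    text = region.get("text", "")
    print(box, text)
</code></pre></div></div>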

<p>rss · r/MachineLearning · Apr 13, 14:53</p>

<p><strong>Background</strong>: PaddleOCR is a popular open-source optical character recognition toolkit that traditionally runs in a single-threaded Python environment at FP32 precision, which can limit throughput on modern hardware. TensorRT is NVIDIA’s high-performance deep-learning inference optimizer; it accelerates models through techniques such as layer fusion, which merges multiple neural-network operations into a single kernel to reduce memory-access overhead. FP16 refers to half-precision floating point, which lowers memory use and speeds up computation compared with the standard FP32 format used in many deep-learning applications. Multi-stream pipeline pooling allows several data streams to be processed in parallel by sharing model instances and managing memory pools efficiently within the CUDA architecture.</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://developer.nvidia.com/blog/tensorrt-3-faster-tensorflow-inference/">TensorRT 3: Faster TensorFlow Inference and Volta Support | NVIDIA Technical Blog</a></li>
<li><a href="https://ltx-2.run/blog/paddleocr-vl-1.5-complete-guide-en/">PaddleOCR -VL-1.5: Comprehensive Analysis of the... | LTX-2 Blog</a></li>
<li><a href="https://developer.nvidia.com/blog/using-cuda-stream-ordered-memory-allocator-part-1/">Using the NVIDIA CUDA Stream -Ordered Memory Allocator, Part...</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ocr</code>, <code class="language-plaintext highlighter-rouge">#tensorrt</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#optimization</code>, <code class="language-plaintext highlighter-rouge">#inference</code></p>

<hr />

<p><a id="item-6"></a></p>
<h2 id="深度循环-transformer-无需中间监督即可提升泛化能力-️-8010"><a href="https://old.reddit.com/r/MachineLearning/comments/1skmct7/thinking_deeper_not_longer_depthrecurrent/">深度循环 Transformer 无需中间监督即可提升泛化能力</a> ⭐️ 8.0/10</h2>

<p>一篇新研究论文提出了深度循环 Transformer（Depth-Recurrent Transformers），该架构具备“静默思考”和身份偏差循环特性，能够稳定执行超过 20 步的计算。研究表明，该模型在三项测试任务中的两项里提升了分布外泛化能力，并指出显式的中间步骤监督实际上可能阻碍真正的推理能力。通过避免逐步标签，模型被迫发展内部推理策略，而不是依赖统计启发式方法。 这项工作挑战了当前利用思维链提示和显式中间监督来增强 AI 推理的主流趋势，暗示这些方法可能制造捷径而非促成真正的理解。如果得到验证，这种方法可能通过促进更深层的内部处理而非记忆解题模式，从而让基础模型更好地泛化到未见过的场景。它为当前大型语言模型尽管拥有海量训练数据却在系统性组合任务上频频失败的现象提供了潜在解释。此外，它将此现象与人类认知联系起来，指出过度依赖基于过往经验的直觉有时会抑制严密的逻辑分析。 所提出的架构结合了 LayerScale 和身份偏差循环，以在深度迭代处理期间保持稳定性，允许进行超过 20 次循环步骤而不发散。然而，结果显示性能参差不齐，与结构化问题相比，该模型在涉及非结构化文本的任务中表现显著不佳。作者认为，中间监督使得统计启发式方法对模型具有“不可抗拒”的吸引力，从而阻止了模型将算力投入到真正的推理机制中。</p>
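
<p>下面给出一个极简的 PyTorch 示意（并非论文的原始实现，结构细节为本摘要的假设）：同一个 Transformer 块沿深度方向循环复用，残差通路天然提供“身份偏置”，并用接近零初始化的 LayerScale 式缩放来稳定多步迭代：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 极简示意：深度循环 Transformer 块（同一组权重复用 steps 次）
import torch
import torch.nn as nn

class DepthRecurrentBlock(nn.Module):
    def __init__(self, dim, heads=8, steps=20, init_scale=1e-4):
        super().__init__()
        self.steps = steps
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        # LayerScale 风格的可学习缩放，初始化接近 0，使每步更新近似恒等映射
        self.scale1 = nn.Parameter(init_scale * torch.ones(dim))
        self.scale2 = nn.Parameter(init_scale * torch.ones(dim))

    def forward(self, x):
        for _ in range(self.steps):              # 深度循环：复用同一块做 20+ 步计算
            h = self.norm1(x)
            a, _ = self.attn(h, h, h, need_weights=False)
            x = x + self.scale1 * a              # 身份偏置：信息主要沿残差通路传递
            x = x + self.scale2 * self.mlp(self.norm2(x))
        return x

x = torch.randn(2, 16, 256)
print(DepthRecurrentBlock(256)(x).shape)         # torch.Size([2, 16, 256])
</code></pre></div></div>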

<p>rss · r/MachineLearning · Apr 13, 20:07</p>

<p><strong>背景</strong>: 组合泛化（Compositional generalization）是指模型学习独立规则并将其系统地应用于从未见过的新颖组合的能力，这是当前深度学习系统面临的关键障碍。传统的 Transformer 在固定的计算图上运行，输入通过预定数量的层，限制了其根据问题复杂度调整计算时间的能力。中间步骤监督（如思维链提示）最近已成为一种标准技术，通过提供标记的中间步骤来引导模型完成复杂推理。这项新研究质疑这种指导是否阻碍了模型发展稳健、独立的推理技能。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://arxiv.org/html/2603.21676v1">Thinking Deeper, Not Longer: Depth - Recurrent Transformers for...</a></li>
<li><a href="https://www.emergentmind.com/topics/depth-recurrent-transformer">Depth - Recurrent Transformer</a></li>
<li><a href="https://proceedings.neurips.cc/paper/2020/file/12b1e42dc0746f22cf361267de07073f-Paper.pdf">Compositional Generalization via Neural-Symbolic</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 社区讨论普遍赞同论文的观点，即中间监督会通过使统计捷径对模型过于诱人而损害真正的推理能力。评论者将这一观点延伸至人类行为，指出专家往往依赖基于丰富经验的直觉而非显式推理，这可能导致类似的陷阱。此外，大家也对模型为何在非结构化文本上表现不佳以及在深度需求超过基准两倍时失效的原因表示好奇。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#transformers</code>, <code class="language-plaintext highlighter-rouge">#generalization</code>, <code class="language-plaintext highlighter-rouge">#reasoning</code>, <code class="language-plaintext highlighter-rouge">#deep learning</code>, <code class="language-plaintext highlighter-rouge">#research</code></p>

<hr />

<p><a id="item-7"></a></p>
<h2 id="第三方评测显示-claude-opus-46-幻觉率激增且排名大幅下滑-️-8010"><a href="https://www.bridgebench.ai/">第三方评测显示 Claude Opus 4.6 幻觉率激增且排名大幅下滑</a> ⭐️ 8.0/10</h2>

<p>AI 评测平台 BridgeMind 报告称，Claude Opus 4.6 在 BridgeBench 幻觉基准测试中的准确率从 83.3% 降至 68.3%，导致其排名从第二位跌至第十位。与上周相比，该模型性能下降了约 15 个百分点，表明其推理能力可能突然减弱。目前造成这一退化的原因尚不清楚，Anthropic 官方也尚未对此测试结果作出回应。 这一事件至关重要，因为它揭示了一个顶级专有模型出现了罕见且严重的性能退化，而许多开发者正依赖该模型进行稳定的生产部署。幻觉率的突然上升可能导致代码生成不可靠和事实性错误，给将这些工具集成到工作流中的企业带来重大风险。如果此次跌幅反映了模型更新的普遍问题，可能会迫使组织推迟采用或回退到更稳定的旧版本，直到问题解决。此外，这也强调了持续第三方监控的重要性，因为模型提供商的内部指标可能无法立即捕捉到现实世界中的性能下降。 此次测试使用的具体基准是 BridgeBench，该基准专注于 AI 编码和代理任务，头部模型在此类任务中的准确率通常保持在 80% 以上。BridgeMind 已明确建议用户在问题澄清或正式版本确认前暂停部署新版本。虽然报告显示了急剧下降，但这基于第三方测试而非 Anthropic 官方的故障承认，因此关于这是暂时波动还是永久性改变仍存在一些不确定性。</p>

<p>telegram · zaihuapd · Apr 13, 05:00</p>

<p><strong>背景</strong>: 在人工智能领域，“幻觉”指的是 AI 生成虚假或误导性信息并将其作为事实呈现的现象，这是评估模型可靠性的关键指标。Claude Opus 4.6 是 Anthropic 大语言模型系列的最新迭代版本，旨在提高先前版本在编码技能、长上下文连贯性和代理任务执行方面的表现。像 BridgeBench 这样的基准测试作为独立验证工具，用于评估这些模型在现实世界任务中相对于竞争对手的表现。历史上，主要模型更新旨在提升性能，因此像这样显著的性能退化在 AI 社区中是罕见且值得注意的事件。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://tech.yahoo.com/ai/claude/articles/viral-bridgebench-post-claims-claude-131318087.html">Viral BridgeBench Post Claims Claude Opus 4.6 Was 'Nerfed ...</a></li>
<li><a href="https://en.wikipedia.org/wiki/Hallucination_(artificial_intelligence)">Hallucination (artificial intelligence) - Wikipedia</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#benchmarks</code>, <code class="language-plaintext highlighter-rouge">#anthropic</code>, <code class="language-plaintext highlighter-rouge">#ai-safety</code>, <code class="language-plaintext highlighter-rouge">#model-evaluation</code></p>

<hr />

<p><a id="item-8"></a></p>
<h2 id="欧盟拟将-chatgpt-列为超大型在线搜索引擎-️-8010"><a href="https://www.handelsblatt.com/politik/international/ki-eu-kommission-will-chatgpt-in-zukunft-strenger-regulieren/100215477.html">欧盟拟将 ChatGPT 列为超大型在线搜索引擎</a> ⭐️ 8.0/10</h2>

<p>欧盟委员会预计在未来几天内正式将 OpenAI 的 ChatGPT 归类为“超大型在线搜索引擎”（VLOSE）。这一决定是基于数据显示 ChatGPT 在欧洲的月活跃用户已超过 1.2 亿，远超该类别所需的 4500 万用户门槛。因此，OpenAI 必须遵守欧盟《数字服务法》（DSA）中最严格的合规义务。 这一分类标志着人工智能监管的关键时刻，因为它使生成式 AI 模型接受了此前主要适用于传统搜索引擎和社交媒体巨头的严格审查。OpenAI 现在将被法律要求提高其推荐算法和广告系统的透明度，同时实施强有力的措施以防止非法内容并保护用户心理健康。此举表明欧盟打算填补高影响力 AI 服务的监管漏洞，可能为全球大型语言模型的治理树立先例。其他拥有大量欧洲用户的 AI 开发者不久后也可能面临类似的监管压力。 要被认定为 VLOSE，服务在欧盟的月活跃用户必须超过 4500 万，而截至 2025 年，ChatGPT 以超过 1.2 亿的用户数远远超过了这一门槛。根据 DSA 规定，被指定的 VLOSE 必须进行年度风险评估，允许外部对其算法进行审计，并为用户提供退出个性化推荐的选项。若不遵守这些严格要求，公司可能面临高达其全球年营业额 6% 的罚款。</p>

<p>telegram · zaihuapd · Apr 13, 08:29</p>

<p><strong>背景</strong>: 《数字服务法》（DSA）是一项全面的欧盟法规，于 2022 年生效，旨在创建一个更安全的数字空间以保护用户的基本权利。它建立了一个分层监管框架，其中义务随数字服务提供商的规模和影响而增加。在欧盟月用户超过 4500 万的平台或搜索引擎被归类为“超大型”，从而触发最高级别的监督，包括独立审计和危机应对协议。虽然最初是为社交网络和网页搜索设计的，但 DSA 下“搜索引擎”的定义正被广泛解释，以涵盖那些检索和综合信息的对话式 AI 工具。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Digital_Services_Act">Digital Services Act - Wikipedia</a></li>
<li><a href="https://digital-strategy.ec.europa.eu/en/policies/dsa-vlops">DSA: Very large online platforms and search engines</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai regulation</code>, <code class="language-plaintext highlighter-rouge">#eu policy</code>, <code class="language-plaintext highlighter-rouge">#openai</code>, <code class="language-plaintext highlighter-rouge">#digital services act</code>, <code class="language-plaintext highlighter-rouge">#compliance</code></p>

<hr />

<p><a id="item-9"></a></p>
<h2 id="cloudflare-数据显示-ai-巨头打破网络平衡anthropic-被指违规最严重-️-8010"><a href="https://www.businessinsider.com/ai-bots-strip-mining-web-anthropic-leads-ethical-claude-2026-4">Cloudflare 数据显示 AI 巨头打破网络平衡，Anthropic 被指违规最严重</a> ⭐️ 8.0/10</h2>

<p>Cloudflare 的最新数据揭示了严重的失衡现象：AI 公司以巨大规模抓取网页内容，却几乎不给源网站带来引流流量。Anthropic 在此趋势中最为极端，其抓取与引流比例高达 8800:1，意味着每抓取 8800 次仅产生一次用户点击。相比之下，OpenAI 的比例为 993:1，而微软必应和谷歌等传统搜索引擎则保持着相对平衡的互惠关系。 这种破坏威胁到互联网的根本经济引擎，因为内容创作者传统上依赖搜索流量通过广告或订阅来实现盈利。如果 AI 聊天机器人继续直接提供答案而不引导流量，网站所有者将面临机器人流量带来的高昂服务器成本却无任何收入回报，这可能导致网上免费内容减少。这一转变挑战了搜索引擎与出版商之间维持开放网络数十年的长期互惠契约。最终，这引发了关于在训练大型语言模型时，其数据来源正因被这些模型本身在经济上耗尽而是否可持续的关键伦理问题。 报告强调，Anthropic 的抓取与引流比例高达 8800:1，这不仅远差于 OpenAI 的 993:1，也远远超出了传统搜索提供商的平衡比例。尽管 Anthropic 对报告中使用的统计方法提出了质疑，但数据突显了一种日益增长的趋势，即生成式 AI 降低了网站免费发布内容的动力。网站所有者现在不仅要承担重型机器人抓取的基础设施成本，还失去了基于流量变现的潜力。</p>

<p>telegram · zaihuapd · Apr 13, 10:36</p>

<p><strong>背景</strong>: 历史上，互联网一直运作在一个互惠生态系统中，像 Google 这样的搜索引擎抓取网站以索引内容，作为交换，它们会将大量用户流量引回这些网站。这种流量使网站所有者能够通过广告或订阅产生收入，从而抵消托管和内容创作的成本。然而，生成式 AI 模型的工作方式不同，它们吸收数据以便在聊天界面内直接提供答案，往往消除了用户访问原始来源的需要。这种从索引模式到答案引擎模式的转变，正在引发关于数据使用权和经济公平性的摩擦。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.voronoiapp.com/technology/AI-Chatbots-vs-Search-Engines-Who-is-Winning-the-Traffic-War-4952">AI Chatbots vs Search Engines : Who is Winning the Traffic War?</a></li>
<li><a href="https://onelittleweb.com/data-studies/ai-chatbots-vs-search-engines/">AI Chatbots vs Search Engines : 24-Month Study on Traffic Trends</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-ethics</code>, <code class="language-plaintext highlighter-rouge">#web-scraping</code>, <code class="language-plaintext highlighter-rouge">#llm-training</code>, <code class="language-plaintext highlighter-rouge">#internet-economy</code>, <code class="language-plaintext highlighter-rouge">#anthropic</code></p>

<hr />

<p><a id="item-10"></a></p>
<h2 id="美国-bis-人员短缺导致英伟达-ai-芯片出口停滞-️-8010"><a href="https://www.tomshardware.com/tech-industry/us-export-control-agency-has-lost-nearly-a-fifth-of-its-licensing-staff">美国 BIS 人员短缺导致英伟达 AI 芯片出口停滞</a> ⭐️ 8.0/10</h2>

<p>自 2024 年以来，美国工业和安全局（BIS）流失了近 20% 的员工，导致 AI 芯片出口审批时间从 2023 年的 38 天激增至 2025 年上半年的 76 天。因此，英伟达和 AMD 等主要制造商面临严重延误，尽管白宫此前已批准部分交易，但英伟达至今未能向中国客户交付任何 H200 芯片。监管复杂度的提升以及副部长需亲自审查几乎每份许可申请的新要求，进一步加剧了这一瓶颈。 这一行政瘫痪直接阻碍了全球先进 AI 硬件的部署，给依赖及时获取美国半导体的科技巨头带来了不确定性。这些延误实际上扩大了出口管制的影响范围，可能导致市场份额流向能更快供货的非美国竞争对手。此外，这也凸显了美国地缘政治战略中的一个关键弱点：执行机制因内部资源短缺而非外部因素受到削弱。对于 AI 行业而言，这意味着创新周期变慢以及全球数据中心供应链的中断。 此次人员流失包括自 2024 年以来总体减少 19%，其中规则制定和许可部门受影响最重，流失率接近 20%。处理时间具体增加了一倍至 76 天，而新的关税调查及针对中东地区的复杂投资匹配要求进一步加剧了积压。值得注意的是，即使是像 H200 这样的高端芯片，其已获批的交易也因这些程序性僵局而无法交付。</p>

<p>telegram · zaihuapd · Apr 13, 15:25</p>

<p><strong>背景</strong>: 工业和安全局（BIS）是美国负责监管包括先进半导体在内的两用技术出口的机构，旨在保护国家安全。自 2022 年 10 月以来，美国逐步收紧了对中国的 AI 芯片出口管制，以限制其军事和技术进步。这些法规要求英伟达等公司在运输受限硬件前必须获得特定许可，这一过程高度依赖 BIS 的人员配备水平和效率。H200 芯片代表了英伟达最新的高性能 GPU，一直受到严格审查，并为中国市场进行了例外谈判。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.bis.gov/">Homepage | Bureau of Industry and Security</a></li>
<li><a href="https://en.wikipedia.org/wiki/United_States_export_controls_on_AI_chips_and_semiconductors">United States export controls on AI chips and semiconductors - Wikipedia</a></li>
<li><a href="https://www.crnasia.com/news/2026/components-and-peripherals/trump-greenlights-nvidia-h200-chip-sales-to-china-after-mont">Trump greenlights Nvidia H 200 Chip sales to China after months of...</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-hardware</code>, <code class="language-plaintext highlighter-rouge">#export-controls</code>, <code class="language-plaintext highlighter-rouge">#geopolitics</code>, <code class="language-plaintext highlighter-rouge">#supply-chain</code>, <code class="language-plaintext highlighter-rouge">#regulation</code></p>

<hr />

<p><a id="item-11"></a></p>
<h2 id="cloudflare-工程师详解统一-cli-的架构设计-️-7010"><a href="https://blog.cloudflare.com/cf-cli-local-explorer/">Cloudflare 工程师详解统一 CLI 的架构设计</a> ⭐️ 7.0/10</h2>

<p>Cloudflare 工程师发布了一篇技术文章，概述了为整个云平台构建单一统一命令行界面（CLI）所涉及的架构挑战与解决方案。文章详细介绍了他们如何超越现有的 Wrangler 工具，创建一个能在单一命令结构下处理多样化服务的连贯体验。此举旨在标准化开发者与所有 Cloudflare 产品的交互方式，而非为每项服务维护独立的工具。 这一进展意义重大，因为统一的 CLI 对于 AI 代理变得至关重要，相比图形化仪表盘或分散的 API，AI 代理与命令行工具的交互更加可靠。通过整合接口，Cloudflare 改善了开发者体验，并使得 AI 代理能够无缝地在多项服务间执行复杂任务的自动化工作流成为可能。这一转变反映了更广泛的行业趋势，即为了支持日益增长的自主编码代理和基础设施管理工具生态系统，优先采用“CLI 优先”的设计理念。 讨论突显了对更好 API 权限管理的迫切需求，用户请求增加类似自动校验并建议所需令牌作用域的功能。</p>

<p>hackernews · soheilpro · Apr 13, 15:44</p>

<p><strong>背景</strong>: Cloudflare 此前主要依赖 Wrangler，这是一款专为管理 Workers 及相关边缘计算资源设计的 CLI。随着公司产品线扩展至数据库、存储和安全服务，缺乏集中式工具给管理多服务环境的开发者带来了摩擦。统一的 CLI 抽象了这些复杂性，允许用户通过一致的语法和认证模型来管理不同的云资源。</p>

<p><strong>社区讨论</strong>: 社区成员普遍认同统一 CLI 对 AI 代理工作流至关重要，但对当前的 API 权限摩擦表示强烈担忧。用户特别希望拥有能自动验证并建议所需令牌作用域的工具，以防止部署失败。此外，关于模式语言的选择也存在明显的争论，一些专家质疑为何未利用 TypeSpec 等成熟工具。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#cli</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#cloudflare</code>, <code class="language-plaintext highlighter-rouge">#api-design</code></p>

<hr />

<p><a id="item-12"></a></p>
<h2 id="steve-yegge-称谷歌的-ai-采用率与约翰迪尔公司相似-️-7010"><a href="https://simonwillison.net/2026/Apr/13/steve-yegge/#atom-everything">Steve Yegge 称谷歌的 AI 采用率与约翰迪尔公司相似</a> ⭐️ 7.0/10</h2>

<p>Steve Yegge 指出，谷歌工程部门的 AI 采用曲线与约翰迪尔等非科技公司完全相同，即 20% 的高级用户、20% 的拒绝者和 60% 的普通工具用户。他将这种停滞归因于持续超过 18 个月的全行业招聘冻结，这阻止了新人才进入谷歌以揭示其日益下降的工程标准。因此，该公司缺乏外部视角来挑战其当前在 AI 整合方面的平庸表现。 这一观察意义重大，因为它挑战了人们认为谷歌等大型科技巨头在内部必然引领 AI 革命的看法。如果属实，这表明组织惯性和招聘冻结甚至可能导致顶尖的工程文化在采用 Agentic AI 工作流方面落后于行业平均水平。如果谷歌的内部工具和流程不能像更灵活的竞争对手或初创公司那样快速发展，这可能会影响其长期竞争力。此外，这也突显了整个科技行业的一个潜在系统性风险，即人才流动性的缺乏抑制了创新。 Yegge 具体指出，大多数工程师（60%）仅在使用像 Cursor 这样的基于聊天的工具，而不是开发自主的 Agentic 系统。其余部分由 20% 充分利用 Agentic 能力的用户和 20% 完全拒绝使用 AI 工具的用户组成。导致不同公司出现这种一致性的核心催化剂被确定为长达 18 个月的招聘冻结，这阻止了新想法和关键反馈的流入。</p>

<p>rss · Simon Willison · Apr 13, 20:59</p>

<p><strong>背景</strong>: Agentic AI 指的是能够在复杂环境中自主运行的人工智能系统，它们无需持续的人工监督即可做出决策和执行任务，这与仅生成内容的简单聊天机器人不同。像 Cursor 这样的工具代表了中间地带，作为 AI 辅助的 IDE，它们有助于编写代码，但与完全的 Agentic 工作流相比，通常需要大量的人工指导。Steve Yegge 是一位著名的软件工程师和前谷歌员工，以其对企业工程文化的坦率批评而闻名。将谷歌与传统的农业机械制造商约翰迪尔进行比较，是一种修辞手法，暗示谷歌的先进地位已侵蚀至与传统非软件行业相当的水平。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Agentic_AI">Agentic AI</a></li>
<li><a href="https://cursor.com/">Cursor: The best way to code with AI</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-adoption</code>, <code class="language-plaintext highlighter-rouge">#google</code>, <code class="language-plaintext highlighter-rouge">#industry-trends</code>, <code class="language-plaintext highlighter-rouge">#engineering-culture</code>, <code class="language-plaintext highlighter-rouge">#steve-yegge</code></p>

<hr />

<p><a id="item-13"></a></p>
<h2 id="bryan-cantrill-认为-llm-缺乏有益的人类懒惰特质-️-7010"><a href="https://simonwillison.net/2026/Apr/13/bryan-cantrill/#atom-everything">Bryan Cantrill 认为 LLM 缺乏有益的人类懒惰特质</a> ⭐️ 7.0/10</h2>

<p>行业资深人士 Bryan Cantrill 发表文章指出，大型语言模型（LLM）天生缺乏驱动优化的人类“懒惰”美德。他认为，由于计算工作对 AI 而言没有成本，它们会毫无压力地生成臃肿的代码并积累技术债务，而不会主动寻求简化。这一观点将人类的局限性重新定义为创造清晰抽象和高效系统设计所必需的力量。 这一见解挑战了“更多 AI 生成代码等于更高生产力”的普遍假设，暗示不受控制的生成反而会导致系统不可持续的臃肿。它突显了一个关键风险，即组织可能会优先考虑代码行数等虚荣指标，而牺牲长期的可维护性和性能。通过将人类懒惰重新定义为一种战略优势，Cantrill 为评估 AI 辅助编程工具及其使用护栏提供了一个新的框架。这可能会显著影响工程团队如何将 LLM 集成到工作流中，从而更加强调强制简约性的审查流程。 Cantrill 特别指出，LLM 会将更多逻辑堆砌在“垃圾千层饼”上，因为它们感受不到维护复杂系统未来的痛苦。该论点基于一个经济学原理：人类有限的时间迫使开发者创建高效的抽象，以避免日后浪费精力。与人类不同，LLM 没有减少复杂性的内在动机，因为生成额外 token 的成本相对于其运行而言微不足道。这表明，如果没有严格的人工监督，AI 驱动的开发可能会导致软件架构变得更大、更慢且更难调试。</p>

<p>rss · Simon Willison · Apr 13, 02:44</p>

<p><strong>背景</strong>: Bryan Cantrill 是一位著名的软件工程师兼 Oxide Computer Company 的联合创始人，此前因在 Sun Microsystems 从事 DTrace 和 Java 虚拟机的工作而闻名。在软件工程中，“懒惰”常被视为一种美德（由 Larry Wall 推广），因为它激励程序员编写可复用且高效的代码，而不是进行重复的手动工作。大型语言模型目前正通过自动化样板代码生成来改变编码实践，但关于代码质量和技术债务的担忧正在上升。在将人类编码习惯与非感知 AI 代理进行比较时，理解其背后的心理和经济驱动力至关重要。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm-limitations</code>, <code class="language-plaintext highlighter-rouge">#software-engineering</code>, <code class="language-plaintext highlighter-rouge">#ai-philosophy</code>, <code class="language-plaintext highlighter-rouge">#system-design</code>, <code class="language-plaintext highlighter-rouge">#bryan-cantrill</code></p>

<hr />

<p><a id="item-14"></a></p>
<h2 id="google-将-rust-集成到-pixel-10-调制解调器以提升安全性-️-7010"><a href="https://arstechnica.com/gadgets/2026/04/google-shoehorned-rust-into-pixel-10-modem-to-make-legacy-code-safer/">Google 将 Rust 集成到 Pixel 10 调制解调器以提升安全性</a> ⭐️ 7.0/10</h2>

<p>Google 已成功将 Rust 编程语言集成到其即将推出的 Pixel 10 智能手机的蜂窝调制解调器固件中。此举专门针对此前主要用 C 和 C++ 编写的复杂遗留代码库，旨在消除常见的内存安全漏洞。通过在 Rust 中重写关键的调制解调器组件，Google 力求在编译阶段就阻止整类安全漏洞，而不是依赖部署后的补丁。 这一举措意义重大，因为主要软件系统中约 70% 的关键安全漏洞源于 C 和 C++ 等语言固有的内存安全问题。通过将 Rust 应用于以难以处理的遗留代码“黑盒”著称的蜂窝调制解调器，Google 为消费电子设备关键基础设施的安全性树立了新标杆。这种转变可能会大幅减少移动设备的攻击面，并促使其他硬件制造商在其嵌入式系统中采用内存安全语言。此外，这也证明了即使是根深蒂固的遗留系统，也可以通过渐进式现代化进行改造，而无需完全重写。 该集成利用 Rust 的外部函数接口（FFI），使新的 Rust 代码能够与调制解调器硬件抽象层（HAL）中现有的 C/C++ 模块无缝交互。这种方法允许 Google 仅重写最容易受到攻击的代码部分，同时保持与供应商专有驱动程序的兼容性。然而，在桥接两种语言环境时，管理可变静态变量和防止数据竞争涉及复杂的挑战。此次在 Pixel 10 上的部署成功与否，将成为在高利害电信硬件中混合使用内存安全和非内存安全代码的真实测试案例。</p>

<p>rss · Ars Technica · Apr 13, 21:12</p>

<p><strong>背景</strong>: 蜂窝调制解调器是负责管理无线通信的复杂子系统，通常运行在专用固件上，这些固件包含数十年来积累的、用 C 或 C++ 编写的遗留代码。这些语言虽然提供高性能，但缺乏内置的内存安全保障，使其容易受到缓冲区溢出和释放后使用（use-after-free）错误的攻击，而这些错误常被黑客利用。Rust 是一种现代系统编程语言，旨在提供与 C++ 相同的性能水平，同时通过其所有权模型在编译时强制执行严格的内存安全规则。历史上，由于兼容性问题和现有代码的巨大体量，将 Rust 集成到此类成熟的嵌入式生态系统中一直非常困难，导致许多公司在采用之前犹豫不决。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Rust_(programming_language)">Rust ( programming language ) - Wikipedia</a></li>
<li><a href="https://www.linkedin.com/pulse/why-rust-programming-language-dominates-systems-code-2026-rohit-singh-mwbkc">Why Rust Programming Language Dominates Systems Code in 2026</a></li>
<li><a href="https://github.com/rdkcentral/rdkb-halif-cellular-modem">GitHub - rdkcentral/rdkb-halif-cellular-modem: RDKB Cellular ...</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#rust</code>, <code class="language-plaintext highlighter-rouge">#embedded-systems</code>, <code class="language-plaintext highlighter-rouge">#security</code>, <code class="language-plaintext highlighter-rouge">#google</code>, <code class="language-plaintext highlighter-rouge">#telecommunications</code></p>

<hr />

<p><a id="item-15"></a></p>
<h2 id="max-welling-将举办关于-ai4sciencegnn-和-cuspai-的-ama-️-7010"><a href="https://old.reddit.com/r/MachineLearning/comments/1skil2g/n_ama_announcement_max_welling_vaes_gnns/">Max Welling 将举办关于 AI4Science、GNN 和 CuspAI 的 AMA</a> ⭐️ 7.0/10</h2>

<p>r/MachineLearning 社区宣布将于 4 月 15 日星期三 17:00 至 18:30（中欧夏令时）举办一场与著名研究员 Max Welling 的“问我任何事”（AMA）活动。Welling 是 CuspAI 的联合创始人，曾参与微软 Aurora 地球建模系统，他将讨论自己从经典机器学习向 AI 驱动材料发现领域的转变。本次会议旨在探讨适用于噪声环境的 ML 架构、物理实验在模型训练中的作用以及具有影响力的 AI 研究的职业建议等话题。 此次活动意义重大，因为 Max Welling 是变分自编码器（VAE）和图神经网络（GNN）等基础模型发展的关键人物，这些模型如今已成为现代 AI 研究的核心。他在 CuspAI 的当前工作代表了利用 AI 加速科学发现的前沿转变，特别是在数月而非数千年内寻找用于能源和碳捕获的新材料方面。此次 AMA 的见解有助于阐明在物理科学中部署 AI 的实际挑战，区分新兴 AI4Science 领域中哪些是炒作，哪些是可行的解决方案。此外，他对集成人机回环系统的观点为致力于确保现实世界应用中模型可靠性的研究人员提供了宝贵指导。 AMA 将于 4 月 15 日举行，鼓励参与者提前提交关于稀疏环境中 ML 架构以及 AI 与科学交叉领域的问题。Welling 的背景包括关于 GNN 半监督分类和自动编码变分贝叶斯的开创性论文，以及最近关于分子生成等变扩散的工作。他将专门解决数字模型与物理现实之间的差距，重点关注材料科学中的数据质量和可合成性问题。他的参与已通过其官方 X (Twitter) 账户的链接得到验证。</p>

<p>rss · r/MachineLearning · Apr 13, 17:57</p>

<p><strong>背景</strong>: 图神经网络（GNN）是一种专为处理图结构数据而设计的人工神经网络，使其成为模拟分子结构和社交网络的理想选择。变分自编码器（VAE）是一种生成模型，能够以无监督方式学习高效的数据编码，常用于创建图像或分子等新数据样本。AI4Science 指的是应用人工智能技术解决自然科学中的复杂问题，如药物发现、气候建模和材料科学。CuspAI 成立于 2024 年，总部位于英国剑桥，最近完成了 1 亿美元的 A 轮融资，旨在构建能在高维空间中搜索下一代材料的 AI 系统。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Graph_neural_network">Graph neural network - Wikipedia</a></li>
<li><a href="https://www.cusp.ai/">CuspAI is the frontier AI company on a mission to solve the ...</a></li>
<li><a href="https://pitchbook.com/profiles/company/606299-50">CuspAI 2026 Company Profile: Valuation, Funding &amp; Investors ... CuspAI - Crunchbase Company Profile &amp; Funding CuspAI - 2026 Company Profile &amp; Team - Tracxn CuspAI, startup building AI models for chemistry, raises $100 ... CuspAI - LinkedIn cusp.ai CuspAI 2026 Company Profile: Valuation, Funding &amp; Investors | PitchBo… CuspAI , startup building AI models for chemistry, raises $100 ... - Fortune CuspAI 2026 Company Profile: Valuation, Funding &amp; Investors | PitchBo… From Algorithms to Atoms: Our Investment in CuspAI</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai4science</code>, <code class="language-plaintext highlighter-rouge">#ama</code>, <code class="language-plaintext highlighter-rouge">#gnn</code>, <code class="language-plaintext highlighter-rouge">#generative-models</code>, <code class="language-plaintext highlighter-rouge">#machine-learning-research</code></p>

<hr />

<p><a id="item-16"></a></p>
<h2 id="苹果开发无显示屏智能眼镜凭借先进相机设计与-meta-竞争-️-7010"><a href="https://www.bloomberg.com/news/newsletters/2026-04-12/apple-ai-smart-glasses-features-styles-colors-cameras-giannandrea-leaving-mnvtz4yg">苹果开发无显示屏智能眼镜，凭借先进相机设计与 Meta 竞争</a> ⭐️ 7.0/10</h2>

<p>苹果正在积极开发其首款无显示屏智能眼镜（内部代号 N50），计划于 2026 年底亮相并于 2027 年正式发布。该设备采用独特的垂直椭圆形相机系统，并提供至少四种由高端醋酸纤维制成的镜框风格，旨在与 iOS 27 中升级版的 Siri 深度集成。这款产品是苹果更广泛的人工智能可穿戴战略的核心部分，该战略还包括新款 AirPods 和配备相机的挂件，以实现情境感知计算。 此举标志着苹果战略性地进入人工智能可穿戴设备市场，通过提供独特的、以相机为中心且无显示屏的设计，直接挑战 Meta 凭借 Ray-Ban 智能眼镜建立的主导地位。通过利用计算机视觉为 Siri 和 Apple Intelligence 提供上下文，苹果旨在重新定义用户如何通过环境化、免提设备而非屏幕与人工智能进行交互。这种形态因素的成功可能会将行业趋势从笨重的 AR 头显转向轻便、时尚且能无缝融入日常生活的配饰。此外，这也标志着情境感知计算的成熟，即设备能够理解用户环境以提供主动协助。 N50 眼镜将支持照片和视频拍摄、电话接听、通知处理及音乐播放，所有功能均可与智能手机同步以便编辑和分享。苹果已开发了多种镜框选项，范围从类似 Ray-Ban Wayfarers 的大矩形款式到纤薄矩形及各种椭圆设计，并提供黑色、海洋蓝和浅棕色等多种颜色。由于缺乏用于用户界面元素的视觉显示屏，该设备严重依赖 iOS 27 中升级版的 Siri 进行语音交互。与此同时，报告显示折叠屏 iPhone 正按计划推进，将于 9 月与 iPhone 18 Pro 系列一同发布。</p>

<p>telegram · zaihuapd · Apr 13, 01:32</p>

<p><strong>背景</strong>: 情境感知计算是指能够感知并对环境变化做出反应的系统，这一概念在普适计算领域追求已久，如今在消费级可穿戴设备中变得可行。与将图像投射到镜片上的传统增强现实（AR）眼镜不同，无显示屏智能眼镜依赖音频反馈和外部设备屏幕来传达信息，同时利用相机“看到”用户所见的景象。Meta 此前已通过其 Ray-Ban Meta 智能眼镜普及了这一类别，该产品专注于社交分享和人工智能辅助，且不配备抬头显示器。苹果的入局证实了这种轻量级形态因素是相对于 Vision Pro 等重型头显进行日常人工智能交互的可行替代方案。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Context_awareness">Context awareness - Wikipedia</a></li>
<li><a href="https://www.zdnet.com/article/wearable-devices-to-usher-in-context-aware-computing/">Wearable devices to usher in context - aware computing | ZDNET</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#apple</code>, <code class="language-plaintext highlighter-rouge">#ai-wearables</code>, <code class="language-plaintext highlighter-rouge">#computer-vision</code>, <code class="language-plaintext highlighter-rouge">#smart-glasses</code>, <code class="language-plaintext highlighter-rouge">#tech-industry</code></p>

<hr />

<p><a id="item-17"></a></p>
<h2 id="ramp-报告预测-anthropic-将在两个月内于企业市场超越-openai-️-7010"><a href="https://weibo.com/1926909715/QAALEmPDI">Ramp 报告预测 Anthropic 将在两个月内于企业市场超越 OpenAI</a> ⭐️ 7.0/10</h2>

<p>根据最新的 Ramp AI 指数，3 月份企业采用人工智能工具的比例首次突破 50%，达到 50.4%，而一年前这一比例仅为 35%。Anthropic 在付费企业用户中的市场份额激增 6.3 个百分点至 30.6%，而 OpenAI 的份额降至 35.2%，双方差距缩小至仅 4.6 个百分点。基于这一快速增长趋势，分析机构预测 Anthropic 将在未来两个月内超越 OpenAI，成为企业端的首选提供商。 这一潜在的格局转变标志着企业 AI 领域的重大变化，挑战了 OpenAI 长期以来在商业领域的主导地位。这表明企业在选择 AI 供应商时，正越来越重视安全性、可靠性或特定模型能力等因素，而不仅仅是原始性能指标，这正是 Anthropic 的优势所在。如果这一预测成真，将重塑首席信息官（CIO）的供应商选择策略，并影响顶级大语言模型开发商之间的竞争动态。此外，这也突显了人工智能正在加速融入各行各业的核心业务流程。 数据显示，OpenAI 与 Anthropic 之间的差距已从 2 月份的 11 个百分点急剧缩小至 3 月份的 4.6 个百分点。在此期间，Anthropic 创下了历史上单月增幅的最高纪录，显示出其在企业销售方面的强劲势头。该报告专门追踪 Ramp 平台上的付费订阅情况，作为实际企业支出的代理指标，而非仅仅反映免费层级的使用或实验性试用。</p>

<p>telegram · zaihuapd · Apr 13, 04:03</p>

<p><strong>背景</strong>: Ramp 是一家领先的企业财务管理平台，提供费用管理、企业信用卡和账单支付解决方案，使其能够独特地洞察实时的企业支出模式。Ramp AI 指数已成为追踪美国公司付费 AI 模型和工具采用情况的关键指标，提供了比基于调查的报告更具体的财务数据。OpenAI 历史上一直是生成式 AI 的市场领导者，但由前 OpenAI 研究人员创立的 Anthropic 凭借其专注于安全性和企业就绪性的 Claude 模型获得了广泛关注。这种竞争反映了 AI 市场从早期实验阶段向大规模生产部署阶段的整体成熟过程。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.macromicro.me/charts/132463/united-states-ramp-ai-index-enterprise-ai-adoption-rate-by-model">US - Ramp AI Index - Enterprise AI Adoption Rate (by Model)</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#enterprise ai</code>, <code class="language-plaintext highlighter-rouge">#market analysis</code>, <code class="language-plaintext highlighter-rouge">#anthropic</code>, <code class="language-plaintext highlighter-rouge">#openai</code>, <code class="language-plaintext highlighter-rouge">#industry trends</code></p>

<hr />

<p><a id="item-18"></a></p>
<h2 id="meta-正为-ceo-扎克伯格开发用于内部的-ai-分身-️-7010"><a href="https://www.theverge.com/tech/910990/meta-ceo-mark-zuckerberg-ai-clone">Meta 正为 CEO 扎克伯格开发用于内部的 AI 分身</a> ⭐️ 7.0/10</h2>

<p>Meta 正在利用扎克伯格的形象、声音、言谈举止及公开演讲记录，训练其 AI 克隆体以增强与员工的互动。扎克伯格本人每周投入 5 到 10 小时参与该项目及其他 AI 代码评审，同时还在开发一个独立的 AI 代理来协助处理日常任务。若实验成功，公司计划将此技术推广至 Instagram 创作者，允许他们部署类似的化身与粉丝互动。 这一举措代表了企业工作流的重大转变，展示了高层数字分身如何弥合大型组织中领导层与员工之间的差距。它标志着生成式 AI 的趋势正从单纯的内容创作转向成为管理和运营效率中的活跃参与者。此外，向创作者提供此类工具可能会从根本上改变创作者经济，实现以前无法做到的可扩展且个性化的受众互动。这一发展挑战了企业和社交媒体环境中关于真实性和在场感的现有规范。 该 AI 分身是专门基于扎克伯格的语气、声音以及从其大量公开演讲和内部沟通中提取的行为模式进行训练的。与交互式分身不同，扎克伯格还在构建一个功能性 AI 代理，旨在执行具体的日常任务而不仅仅是模拟对话。潜在向 Instagram 的推广表明，其底层架构需要能够处理与多样化用户群的高容量实时互动。</p>

<p>telegram · zaihuapd · Apr 13, 14:40</p>

<p><strong>背景</strong>: 数字分身（Digital Twin）是一种旨在准确反映物理对象或人员的虚拟模型，常用于制造业等行业的模拟和监控。在 AI 语境下，这一概念已演变为包含能够复现真人声音、语气和行为模式的交互式化身。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#digital-twins</code>, <code class="language-plaintext highlighter-rouge">#enterprise-ai</code>, <code class="language-plaintext highlighter-rouge">#meta</code>, <code class="language-plaintext highlighter-rouge">#generative-ai</code></p>

<hr />

<h2 id="关注动态-1">关注动态</h2>

<p><a id="item-19"></a></p>
<h2 id="memsearch-updates-2-updates--extend-git-root-collection-fix-to-codexopencode-skills-async-s-derive-memory-recall-collection-from-git-root-324-330-️-10"><a href="https://github.com/zilliztech/memsearch/commit/2dec87d18ec1a696b56149c48b4acf72ddcb7199">MemSearch Updates: 2 updates — extend git-root collection fix to codex/opencode skills; async s…, derive memory-recall collection from git root (#324) (#330)</a> ⭐️ ?/10</h2>

<p>本次更新修复了记忆召回集合的派生逻辑，确保其正确基于 Git 仓库根目录生成。此前针对核心功能的修复现已扩展至 Codex 和 Opencode 技能，以保证所有技能类型的行为一致。这些更改解决了在多项目或嵌套目录环境中集合作用域可能错误的问题。此更新不包含破坏性变更，旨在提升上下文检索的稳定性。</p>

<p>rss · MemSearch Updates · Apr 13, 08:35</p>

<hr />

<p><a id="item-20"></a></p>
<h2 id="openaicodex-2-releases--rust-v01210-alpha6-rust-v01210-alpha4-️-10"><a href="https://github.com/openai/codex/releases/tag/rust-v0.121.0-alpha.6">openai/codex: 2 releases — rust-v0.121.0-alpha.6, rust-v0.121.0-alpha.4</a> ⭐️ ?/10</h2>

<p>openai/codex 仓库发布了其 Rust 实现的两项新 Alpha 版本：v0.121.0-alpha.4 和 v0.121.0-alpha.6。发布的说明仅提及了版本号的更新，未详细列出具体的功能变更、错误修复或破坏性 API 调整。关注该项目的开发者应拉取最新标签以获取最新的迭代改进，但根据当前公告无法得出具体的迁移操作指南。</p>

<p>github · github-actions[bot] · Apr 13, 21:48</p>

<hr />

<p><a id="item-21"></a></p>
<h2 id="anthropicsclaude-code-2-releases--v21105-v21104-️-10"><a href="https://github.com/anthropics/claude-code/releases/tag/v2.1.105">anthropics/claude-code: 2 releases — v2.1.105, v2.1.104</a> ⭐️ ?/10</h2>

<p>Anthropic 发布了 claude-code 的两个新版本：v2.1.104 和 v2.1.105。提供的发布信息仅确认了版本号和发布时间，未包含具体的功能变更、修复内容或破坏性更新。建议开发者在升级前查阅官方仓库的变更日志以获取详细技术细节，因为目前无法从公告中推断出任何可操作的功能更新。</p>

<p>github · ashwin-ant · Apr 13, 21:53</p>

<hr />

<p><a id="item-22"></a></p>
<h2 id="upstashcontext7-2-releases--upstashcontext7-mcp218-ctx70312-️-10"><a href="https://github.com/upstash/context7/releases/tag/%40upstash/context7-mcp%402.1.8">upstash/context7: 2 releases — @upstash/context7-mcp@2.1.8, ctx7@0.3.12</a> ⭐️ ?/10</h2>

<p>该仓库发布了两个包的新版本：@upstash/context7-mcp 更新至 v2.1.8，ctx7 更新至 v0.3.12。提供的发布说明中未具体列出新增功能、修复内容或破坏性变更。建议使用该库的开发人员在升级前查阅完整的变更日志或提交历史以获取详细的实现改动。</p>

<p>github · github-actions[bot] · Apr 13, 00:21</p>

<hr />

<h2 id="github-热榜-1">GitHub 热榜</h2>

<p><a id="item-23"></a></p>
<h2 id="karpathy-发布基于纯-c-和-cuda-的极简-llm-训练项目-️-10010"><a href="https://github.com/karpathy/llm.c">Karpathy 发布基于纯 C 和 CUDA 的极简 LLM 训练项目</a> ⭐️ 10.0/10</h2>

<p>Andrej Karpathy 发布了 llm.c，这是一个完全用简单的 C 和 CUDA 代码编写的无依赖大型语言模型训练实现。该项目去除了 PyTorch 等高级框架，直接展示了 Transformer 模型所需的底层数学运算和内存管理。它作为一个直接的教育工具，帮助开发者理解支撑现代 AI 的底层基础设施。 该项目的重要性在于它通过揭示反向传播和注意力机制背后的显式代码，消除了深度学习框架的“黑盒”性质。对于 AI 工程师而言，它提供了一个无与伦比的机会，在没有抽象层掩盖逻辑的情况下审查导致模型收敛的每一行代码。它填补了神经网络理论知识与实际高性能 GPU 编程技能之间的空白。最终，它使开发人员能够凭借对硬件限制的更深入理解来构建自定义推理引擎或优化现有引擎。 该仓库包含一个完整的训练循环，仅用约 1000 行可读性强的 C 和 CUDA 代码实现，避免了复杂的构建系统或外部库。它专注于 GPT-2 架构，展示了从分词到权重更新的端到端训练过程。代码设计为可直接编译和运行，让开发者能即时观察数据在计算过程中如何流经 GPU 线程。</p>

<p>rss · GitHub Trending - CUDA · Apr 13, 01:34</p>

<p><strong>背景</strong>: 在此发布之前，理解 LLM 内部通常需要浏览庞大的代码库（如 PyTorch 或 TensorFlow），其中核心操作往往隐藏在 C++ 扩展或优化的内核中。现有的教育资源通常停留在框架 API 层面，使得实际的 GPU 内核实现对大多数从业者来说仍然模糊不清。llm.c 填补了这一空白，提供了一个透明、从头开始的参考，既符合课程中教授的数学理论，又弥补了开源简单性的不足。与阿里巴巴 RTP-LLM 等专注于推理速度和可扩展性的生产级引擎不同，llm.c 优先考虑代码清晰度和教育价值，而非原始性能指标。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/karpathy/llm.c">GitHub - karpathy/llm.c: LLM training in simple, raw C/CUDA</a></li>
<li><a href="https://karpathy.ai/llmwiki">Andrej Karpathy</a></li>
<li><a href="https://github.com/alibaba/rtp-llm">RTP-LLM: Alibaba's high-performance LLM ... - GitHub</a></li>
<li><a href="https://www.alibabacloud.com/blog/llm-inference-acceleration-gpu-optimization-for-attention-in-the-decode-phase-2_601715">LLM Inference Acceleration: GPU Optimization for Attention in the ...</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: AI 社区对此反应热烈，将该项目视为掌握底层深度学习机制的权威资源。许多开发人员已经将其作为基线，用于实验自定义算子和替代优化策略，这些在高阶框架中往往难以实现。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#c</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#education</code></p>

<hr />

<p><a id="item-24"></a></p>
<h2 id="sageattention-通过-8-比特量化实现比-flashattention-快-2-至-5-倍的加速-️-10010"><a href="https://github.com/thu-ml/SageAttention">SageAttention 通过 8 比特量化实现比 FlashAttention 快 2 至 5 倍的加速</a> ⭐️ 10.0/10</h2>

<p>SageAttention 推出了一种新型量化注意力机制，相比 FlashAttention 可将语言、图像和视频模型的速度提升 2 至 5 倍。该方法通过精确的 8 比特量化实现性能增益，在无需重新训练的情况下保持了端到端的模型指标。该解决方案旨在作为基于 PyTorch 框架中现有注意力后端即插即用的替代品。 这一进展解决了大规模深度学习部署中推理延迟的关键瓶颈，因为在这些场景中内存带宽通常限制了吞吐量。通过在无精度损失的情况下将精度降低到 8 比特，SageAttention 显著降低了运行大语言模型和扩散模型的硬件成本与能耗。其与标准工作流的兼容性使其成为寻求即时效率提升的生产环境不可或缺的基础设施升级。 该项目支持多种 GPU 架构，并可作为 SDPA 或 FlashAttention 模块的无缝直接替代品进行集成。基准测试表明，该方法在文本生成、图像合成和视频处理等多种模态任务中均能实现一致的加速效果。该方法专门针对推理加速而非训练优化，主要聚焦于部署场景。</p>
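
<p>以下调用示意基于 SageAttention 仓库 README 中的用法（张量布局与参数名以官方文档为准），展示其作为注意力算子的替换方式：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 示意：把 sageattn 当作 scaled_dot_product_attention / FlashAttention 的直接替代
# 用法参考其 README；需要 CUDA GPU 与 float16/bfloat16 张量
import torch
from sageattention import sageattn

q = torch.randn(1, 32, 1024, 128, dtype=torch.float16, device="cuda")  # (batch, heads, seq, head_dim)
k = torch.randn_like(q)
v = torch.randn_like(q)

out = sageattn(q, k, v, tensor_layout="HND", is_causal=False)
print(out.shape)
</code></pre></div></div>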

<p>rss · GitHub Trending - CUDA · Apr 13, 01:34</p>

<p><strong>背景</strong>: 此前的解决方案如 FlashAttention 优化了内存访问模式，但仍主要在 FP16 或 BF16 精度下运行，留下了未利用的性能空间。以前的量化方法在应用于注意力机制时，若不经大量微调往往难以保持模型精度。SageAttention 填补了这一空白，提供了一种稳健、精确的 8 比特实现，可直接用于预训练模型而无需额外调整。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/thu-ml/SageAttention">GitHub - thu-ml/SageAttention: [ICLR2025, ICML2025 ...</a></li>
<li><a href="https://arxiv.org/html/2410.02367v1">SageAttention: Accurate 8-bit attention for Plug-and-Play ...</a></li>
<li><a href="https://deepwiki.com/kijai/ComfyUI-WanVideoWrapper/5.2-attention-mechanism-implementations">Attention Mechanism Implementations | kijai/ComfyUI ...</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者报告称已成功将其集成到 ComfyUI 和其他本地推理栈中，并立即观察到延迟降低。社区对其在消费级硬件上运行大型视频生成模型的应用特别感兴趣。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm-inference</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#quantization</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#optimization</code></p>

<hr />

<p><a id="item-25"></a></p>
<h2 id="voxcpm2无分词器的多语言语音合成与声音克隆模型-️-9010"><a href="https://github.com/OpenBMB/VoxCPM">VoxCPM2：无分词器的多语言语音合成与声音克隆模型</a> ⭐️ 9.0/10</h2>

<p>VoxCPM2 引入了一种创新的无分词器架构，利用扩散自回归方法直接生成连续语音表示。该模型基于 MiniCPM-4 骨干网络，拥有 20 亿参数，支持 30 种语言，并能在无需离散分词步骤的情况下输出 48kHz 录音室级音质。 通过消除传统分词器，VoxCPM2 避免了离散语音合成中常见的信息丢失和发音错误，从而生成更加自然和富有表现力的声音。其能够从文本描述进行声音设计以及带有情感控制的声音克隆能力，为创意应用提供了前所未有的灵活性。该模型的端到端特性简化了部署流程，同时在多种语言环境中保持了高保真度。 该系统具备独特的功能，如通过自然语言提示创建新声音的“声音设计”，以及在保留音色的同时控制情感和语速的“可控克隆”。通过在超过 200 万小时的多语言数据上训练，当提供转录文本时，它能实现从参考音频的无缝续写。其生产就绪性得到了实时演示、全面文档以及 Hugging Face 和 ModelScope 上可用模型权重的支持。</p>

<p>rss · GitHub Trending - Daily · Apr 13, 01:32</p>

<p><strong>背景</strong>: 传统的文本转语音系统通常依赖离散分词将文本和音频转换为可管理的单元，这可能会引入伪影并限制韵律的灵活性。VoxCPM2 通过采用完全绕过量化瓶颈的连续表示学习方法来解决这些局限性。这种转变使得模型能够捕捉到离散模型往往难以准确复现的细微声音细微差别和节奏变化。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/OpenBMB/VoxCPM/">VoxCPM2: Tokenizer-Free TTS for Multilingual Speech ... - GitHub</a></li>
<li><a href="https://openbmb.github.io/voxcpm2-demopage/">VoxCPM2 Demo Page</a></li>
<li><a href="https://aibit.im/blog/post/voxcpm2-2b-multilingual-tts-with-voice-cloning-design">VoxCPM2: 2B Multilingual TTS with Voice Cloning &amp; Design</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 该项目因其开源发布策略而引起了广泛关注，为开发者提供了权重和交互式演示的直接访问权限，以便测试多语言功能。Discord 和飞书上的社区频道非常活跃，用户们正在分享声音设计提示并讨论实时应用的集成策略。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#text-to-speech</code>, <code class="language-plaintext highlighter-rouge">#voice-cloning</code>, <code class="language-plaintext highlighter-rouge">#multilingual-ai</code>, <code class="language-plaintext highlighter-rouge">#generative-audio</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code></p>

<hr />

<p><a id="item-26"></a></p>
<h2 id="firecrawl专为-ai-代理优化的网页数据-api-️-9010"><a href="https://github.com/firecrawl/firecrawl">Firecrawl：专为 AI 代理优化的网页数据 API</a> ⭐️ 9.0/10</h2>

<p>Firecrawl 已成为领先的开源解决方案，专为大语言模型消费将复杂的网页内容转换为干净的 Markdown 和结构化 JSON。它引入了高级功能，如交互式浏览操作（点击、滚动）以及对 PDF 和 DOCX 文件的媒体解析，且无需手动配置。该项目现在支持与 AI 代理和 MCP 客户端直接集成，以简化实时数据摄入流程。 该工具解决了将嘈杂、非结构化的 HTML 输入到 AI 代理时的关键瓶颈，这通常会导致上下文窗口浪费和幻觉产生。通过在内部处理 JavaScript 渲染、轮换代理和反机器人措施，它使开发人员能够专注于代理逻辑而非爬虫维护。其直接输出节省代币的 Markdown 的能力降低了推理成本，并提高了 RAG 管道的检索准确性。因此，它显著降低了构建依赖实时网络数据的生产级自主代理的门槛。 Firecrawl 提供用于搜索网络、将 URL 抓取为各种格式以及通过脚本操作与动态页面交互的核心端点。它具有行业领先的可靠性，网络覆盖率达 96%，P95 延迟为 3.4 秒，适用于实时应用。该平台自动管理速率限制和 JS 阻止内容等基础设施复杂性，为开发人员提供零配置体验。</p>
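
<p>以下是基于 firecrawl-py SDK 常见用法的示意（方法签名随 SDK 版本略有差异，API key 为占位符，请以官方文档为准）：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 示意：用 Firecrawl 把网页抓取为 LLM 友好的 Markdown
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key="fc-YOUR_API_KEY")          # 占位密钥
result = app.scrape_url("https://example.com/docs", formats=["markdown"])
# 新版 SDK 返回带 markdown 属性的文档对象；旧版返回 dict，请按所用版本调整
print(result.markdown[:300])
</code></pre></div></div>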

<p>rss · GitHub Trending - TypeScript · Apr 13, 01:39</p>

<p><strong>背景</strong>: 传统的网络爬虫需要大量的工程工作来处理动态内容、验证码和网站结构变化，而且生成的 HTML 对大语言模型来说效率低下。Firecrawl 填补了中间基础设施层的空白，将网络数据标准化为大语言模型就绪的格式，如 Markdown 和结构化 JSON。与通用爬虫不同，它专为优化 AI 训练和推理任务的代币使用和语义清晰度而设计。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/firecrawl/firecrawl">GitHub - firecrawl/firecrawl: The Web Data API for AI - Power AI agents ...</a></li>
<li><a href="https://www.firecrawl.dev/">Firecrawl</a></li>
<li><a href="https://grokipedia.com/page/Firecrawl_API">Firecrawl API</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 开发者社区迅速采用了 Firecrawl，其高星标数量和专注于代理集成模式的活跃 Discord 频道证明了这一点。用户经常称赞其在无需代理管理专业知识的情况下绕过复杂反爬虫机制的能力。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-infrastructure</code>, <code class="language-plaintext highlighter-rouge">#web-crawling</code>, <code class="language-plaintext highlighter-rouge">#data-ingestion</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code></p>

<hr />

<p><a id="item-27"></a></p>
<h2 id="chrome-devtools-mcp-连接-ai-代理与浏览器调试-️-9010"><a href="https://github.com/ChromeDevTools/chrome-devtools-mcp">Chrome DevTools MCP 连接 AI 代理与浏览器调试</a> ⭐️ 9.0/10</h2>

<p>谷歌发布了一款官方的模型上下文协议（MCP）服务器，使 AI 编码代理能够直接控制和检查实时的 Chrome 浏览器。该工具将 Chrome DevTools 的全部功能集成到 AI 工作流中，允许像 Claude 或 Copilot 这样的助手自主执行复杂的调试任务。 该项目通过让代理直接访问 Chrome DevTools 协议，解决了生成式 AI 代码生成与可靠的基于浏览器的验证之间的关键差距。与传统的屏幕抓取或不稳定的 DOM 选择器不同，这种方法利用原生工具实现稳定的自动化和深入的性能分析。它显著降低了 AI 代理诊断网络问题、捕获截图以及解读带有源映射堆栈跟踪的控制台日志的难度。 该服务器在底层利用 Puppeteer 进行可靠的动作执行，并在继续之前自动等待结果。它支持记录性能跟踪和从 CrUX API 获取真实用户体验数据等高级功能，尽管这些可以通过标志禁用。用户应注意，谷歌默认收集使用统计数据以提高可靠性，但可以使用命令行参数或环境变量选择退出。</p>

<p>rss · GitHub Trending - TypeScript · Apr 13, 01:39</p>

<p><strong>背景</strong>: 在此版本之前，AI 代理往往难以可靠地与浏览器交互，通常依赖脆弱的外部脚本或有限的文本输出。虽然 Chrome DevTools 协议（CDP）长期以来一直用于手动工具，但缺乏专门为新兴的模型上下文协议生态系统设计的标准化桥梁。该项目通过将 CDP 功能封装在符合 MCP 的接口中填补了这一空白，标准化了 AI 模型与浏览器内部交互的方式。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://chromedevtools.github.io/devtools-protocol/">Chrome DevTools Protocol - GitHub Pages</a></li>
<li><a href="https://github.com/aslushnikov/getting-started-with-cdp">Getting Started With Chrome DevTools Protocol - GitHub</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 作为 Chrome DevTools 团队最新发布的官方工具，目前的公共社区讨论仅限于仓库的初始文档和变更日志。早期采用者可能正专注于将此服务器集成到现有的代理框架（如 Cursor 或 LangChain）中，以测试其在生产环境中的稳定性。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#mcp</code>, <code class="language-plaintext highlighter-rouge">#chrome-devtools</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#browser-automation</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code></p>

<hr />

<p><a id="item-28"></a></p>
<h2 id="deepep-优化大型混合专家模型的专家并行通信-️-9010"><a href="https://github.com/deepseek-ai/DeepEP">DeepEP 优化大型混合专家模型的专家并行通信</a> ⭐️ 9.0/10</h2>

<p>DeepEP 是一款新的高性能通信库，专为处理混合专家（MoE）架构中专家并行所需的复杂数据路由而设计。它利用优化的 CUDA 内核，最大限度地减少扩展这些模型时至关重要的全对全（all-to-all）通信阶段的延迟。此发布版解决了一个特定的基础设施缺口，即标准的集体通信库往往无法为稀疏、动态的专家加载提供足够的效率。 随着大型语言模型越来越多地采用混合专家（MoE）架构以在不按比例增加计算量的情况下扩展参数量，专家间的通信瓶颈已成为训练速度的主要制约因素。DeepEP 直接针对这一瓶颈，能够加快迭代周期，并更经济高效地利用 GPU 集群来训练万亿参数模型。通过解决负载分布不均和细粒度数据洗牌等特定挑战，它使得在现有硬件上进行生产规模的 MoE 训练成为可能。对于致力于突破模型稀疏性和分布式训练效率边界的团队来说，该工具至关重要。 该库专注于优化专家并行中固有的全对全通信模式，这种模式比标准的张量或流水线并行要复杂得多。它包含专门定制的 CUDA 内核，以适应动态专家选择中发现的不规则内存访问模式。早期基准测试表明，在处理高度稀疏的专家门控时，与基于通用 NCCL 的实现相比，通信开销显著降低。</p>
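
<p>下面是一个与 DeepEP 本身无关的概念示意（单机演示，不涉及其实际 API），用来说明专家并行为何会产生不规则、动态大小的全对全数据交换：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 概念示意：门控把每个 token 路由到不同 GPU 上的专家，发送量因负载不均而动态变化
import torch

num_tokens, hidden, num_experts, num_ranks = 8, 4, 4, 2
tokens = torch.randn(num_tokens, hidden)
expert_ids = torch.randint(0, num_experts, (num_tokens,))   # 门控选出的专家
rank_of_expert = torch.arange(num_experts) % num_ranks      # 专家到 GPU 的静态映射

dest_rank = rank_of_expert[expert_ids]
for r in range(num_ranks):
    send = tokens[dest_rank == r]   # 实际系统中这就是 all-to-all 的发送缓冲区
    print(f"rank {r} 接收 {send.shape[0]} 个 token")
</code></pre></div></div>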

<p>rss · GitHub Trending - CUDA · Apr 13, 01:34</p>

<p><strong>背景</strong>: 混合专家模型将神经网络层划分为多个子网络，仅为每个令牌激活其中一部分以提高效率。虽然这减少了计算量，但也引入了严重的通信挑战，因为令牌必须被动态路由到托管特定专家的不同 GPU 上。传统的通信后端（如 NCCL）是针对密集、静态形状优化的，难以应对 MoE 所需的可变大小、多对多数据传输。DeepEP 通过为这些稀疏、高频交换提供专用层来填补这一空白。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://grokipedia.com/page/Expert_Parallelism">Expert Parallelism</a></li>
<li><a href="https://en.wikipedia.org/wiki/Mixture_of_experts">Mixture of experts - Wikipedia</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: AI 工程社区将此发布视为下一代开源 MoE 模型的关键基础设施组件，其影响类似于 FlashAttention 对注意力机制的作用。开发人员特别关注其与 Megatron-LM 和 DeepSpeed 等现有框架的集成兼容性。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#moe</code>, <code class="language-plaintext highlighter-rouge">#distributed-training</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#infrastructure</code></p>

<hr />

<p><a id="item-29"></a></p>
<h2 id="mirage-将大语言模型编译为持久化-cuda-超核-️-9010"><a href="https://github.com/mirage-project/mirage">Mirage 将大语言模型编译为持久化 CUDA 超核</a> ⭐️ 9.0/10</h2>

<p>Mirage 推出了一种编译器框架，能自动将大语言模型推理转换为单个持久化 CUDA 超核。该方法融合了所有必要的计算与通信任务，消除了 GPU 上频繁启动内核的开销。 内核启动延迟是高性能大语言模型推理的关键瓶颈，往往浪费大量 GPU 周期。通过生成持久化超核，Mirage 减少了这一开销，在生产场景中实现了 1.2 倍至 6.7 倍的延迟提升。这种优化使现有硬件无需模型量化或架构变更即可实现更高的吞吐量。 该系统利用多级超级优化器将张量程序降级为优化的流多处理器（SM）级任务图。它采用去中心化的核内并行运行时，在单次内核启动中跨多个 GPU 执行这些任务。</p>

<p>rss · GitHub Trending - CUDA · Apr 13, 01:34</p>

<p><strong>背景</strong>: 传统的大语言模型推理框架将模型执行为一系列小型 CUDA 内核，每个操作都会产生巨大的启动开销。之前的解决方案通常依赖手动内核融合或特定的库优化，缺乏对不同模型架构的灵活性。Mirage 通过自动化创建端到端的融合内核来解决这一问题，这些内核在 GPU 上持久存在，从根本上改变了张量程序的调度和执行方式。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://arxiv.org/abs/2512.22219">A Compiler and Runtime for Mega-Kernelizing Tensor Programs</a></li>
<li><a href="https://www.usenix.org/system/files/osdi25-wu-mengdi.pdf">[PDF] Mirage: A Multi-Level Superoptimizer for Tensor Programs - USENIX</a></li>
<li><a href="https://zhihaojia.medium.com/compiling-llms-into-a-megakernel-a-path-to-low-latency-inference-cf7840913c17">Compiling LLMs into a MegaKernel: A Path to Low-Latency Inference</a></li>
<li><a href="https://github.com/BodhiHu/mirage-llm-megakernel">BodhiHu/mirage-llm-megakernel - GitHub</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 开发者们正在积极讨论持久化内核在未来 CUDA 版本中的长期稳定性，尽管目前的实现显示出良好的支持。早期基准测试突显了显著的速度提升，引发了将该技术集成到主流推理引擎中的兴趣。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#compiler</code>, <code class="language-plaintext highlighter-rouge">#gpu-optimization</code>, <code class="language-plaintext highlighter-rouge">#inference</code></p>

<hr />

<p><a id="item-30"></a></p>
<h2 id="nous-research-推出自我进化的-hermes-agent-框架-️-8010"><a href="https://github.com/NousResearch/hermes-agent">Nous Research 推出自我进化的 Hermes Agent 框架</a> ⭐️ 8.0/10</h2>

<p>Nous Research 发布了开源的 Hermes Agent 框架，其内置的学习循环使 AI 代理能够从经验中创造技能并在会话间持久化知识。与静态代理不同，它能通过交互自主优化能力，并支持从本地终端到无服务器云环境的多样化部署。 该项目解决了当前 AI 代理无法记忆上下文且若不手动重训练便无法随时间进步的关键局限。通过集成封闭学习循环、FTS5 会话搜索和辩证用户建模，Hermes 实现了真正持久且不断进化的数字助手。其架构允许开发者在低至 5 美元的 VPS 或无服务器平台上运行复杂的并行代理工作流。这将范式从一次性任务执行转变为长期协作智能。 Hermes Agent 支持通过 OpenRouter 及多家提供商接入 200 多种模型，并为 Telegram、Discord 和 CLI 交互提供统一接口。它具备自主技能创建、内置 cron 调度器的定时自动化功能，以及生成隔离子代理进行并行处理的能力。该框架还包含用于批量轨迹生成和 RL 环境集成的研究就绪工具。</p>

<p>rss · GitHub Trending - Daily · Apr 13, 01:32</p>

<p><strong>背景</strong>: 大多数现有代理框架仅作为 LLM 的无状态包装器，需要外部向量数据库来维持记忆，且缺乏自我优化机制。Hermes 通过将记忆管理和技能进化直接嵌入代理核心逻辑来填补这一空白。它基于 Nous Research 在模型对齐方面的专业知识，构建了一个不仅能执行任务，还能随时间学习如何更好执行任务的系统。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/nousresearch/hermes-agent">NousResearch/hermes-agent: The agent that grows with you - GitHub</a></li>
<li><a href="https://www.datacamp.com/tutorial/hermes-agent">Nous Research Hermes Agent: Setup and Tutorial Guide - DataCamp</a></li>
<li><a href="https://yuv.ai/blog/hermes-agent">Hermes Agent: Self-Improving AI with Persistent Memory | YUV.AI Blog</a></li>
<li><a href="https://hermes-agent.nousresearch.com/docs/integrations/">Integrations | Hermes Agent - nous research</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者强调了该框架在不同平台间保持对话连续性的独特能力，以及在低成本服务器上的高效资源利用率。开发者对用于创建个性化代理行为的“Honcho”辩证用户建模功能表现出浓厚兴趣。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#self-improving</code>, <code class="language-plaintext highlighter-rouge">#nous-research</code>, <code class="language-plaintext highlighter-rouge">#framework</code></p>

<hr />

<p><a id="item-31"></a></p>
<h2 id="kronos首个面向金融-k-线图的开源基础模型-️-8010"><a href="https://github.com/shiyu-coder/Kronos">Kronos：首个面向金融 K 线图的开源基础模型</a> ⭐️ 8.0/10</h2>

<p>Kronos 已被 AAAI 2026 录用，并发布了微调脚本以适配特定的量化任务。该项目现在包含一个展示 BTC/USDT 24 小时预测的实时演示，并在 Hugging Face 上提供了预训练权重。 与通常在噪声较大的金融数据上表现不佳的通用时间序列基础模型不同，Kronos 是专门为市场 K 线的独特特征而架构的。通过将 OHLCV 数据量化为分层离散令牌，它使得统一的仅解码器 Transformer 能够处理波动率预测和趋势预报等多种任务。这种专业化解决了通用模型无法捕捉全球交易所随机性这一关键空白。 该模型利用包含专业令牌化和自回归预训练的新型两阶段框架，在来自全球 45 多个交易所的数据上进行训练。它以一系列不同容量的模型形式提供，所有模型均可通过 Hugging Face Hub 在开放许可下获取。</p>

<p>rss · GitHub Trending - Daily · Apr 13, 01:32</p>

<p><strong>背景</strong>: 在 Kronos 出现之前，将大规模预训练范式应用于金融 K 线数据的效果有限，往往不如非预训练架构。现有的时间序列基础模型（TSFM）由于金融市场的高噪声特性，经常忽视波动率预测等关键下游任务。Kronos 通过将 K 线序列视为一种独特的语言，填补了这一空白，其利用了类似大语言模型的方法，但针对金融随机性进行了优化。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/shiyu-coder/Kronos">Kronos: A Foundation Model for the Language of Financial Markets</a></li>
<li><a href="https://arxiv.org/abs/2508.02739">Kronos: A Foundation Model for the Language of Financial Markets</a></li>
<li><a href="https://huggingface.co/NeoQuasar/Kronos-base">NeoQuasar/Kronos-base · Hugging Face</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 社区对微调脚本的发布以及论文被 AAAI 2026 录用反应积极，这表明了其学术和实践价值得到了有力验证。用户正在积极探索实时演示，以测试其在 BTC/USDT 等主要交易对上的预测能力。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#foundation-model</code>, <code class="language-plaintext highlighter-rouge">#fintech</code>, <code class="language-plaintext highlighter-rouge">#nlp</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#finance</code></p>

<hr />

<p><a id="item-32"></a></p>
<h2 id="微软-markitdown面向大模型的文档转换工具-️-8010"><a href="https://github.com/microsoft/markitdown">微软 MarkItDown：面向大模型的文档转换工具</a> ⭐️ 8.0/10</h2>

<p>微软 AutoGen 团队发布了 MarkItDown，这是一款旨在将 PDF、Word 和 PowerPoint 等多种文件格式转换为结构化 Markdown 的 Python 实用工具。该工具专门针对大语言模型（LLM）的消费需求优化输出，而非人类可读性，同时保留了表格和标题等关键结构元素。最近的更新包括用于与 LLM 应用无缝集成的 MCP 服务器，以及转向基于流的处理以避免创建临时文件。 该工具解决了 AI 代理工作流中的一个关键瓶颈，即原始二进制文档无法直接被基于文本的模型处理。通过将复杂的办公文档转换为干净的 Markdown，它显著降低了检索增强生成（RAG）系统所需的预处理开销。其对结构保留的关注确保了大语言模型能够更好地解释数据中的关系，例如表格中的行或演示文稿中的层级，从而实现更准确的上下文理解。作为来自主要研究团队的生产级实用工具，它为脆弱的自定义解析脚本提供了可靠的替代方案。 MarkItDown 支持从 PDF、PowerPoint、Word、Excel、CSV 和 HTML 文件进行转换，同时保持逻辑文档结构。它与 Textract 等通用文本提取器的区别在于，它优先考虑有助于机器分析的 Markdown 格式，而非人类的视觉保真度。最新版本引入了依赖项的可选功能组，并要求使用二进制文件类对象进行流转换，从而消除了对中间临时文件的需求。</p>
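
<p>以下是基于 MarkItDown README 的基本用法示意：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 示意：把办公文档转换为面向 LLM 的 Markdown 文本
from markitdown import MarkItDown

md = MarkItDown()
result = md.convert("quarterly_report.xlsx")   # 同样适用于 PDF、Word、PowerPoint、HTML 等
print(result.text_content[:500])

# 新版本也支持传入二进制文件对象做流式转换（convert_stream），避免生成临时文件
</code></pre></div></div>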

<p>rss · GitHub Trending - Daily · Apr 13, 01:32</p>

<p><strong>背景</strong>: 在 MarkItDown 等工具出现之前，开发人员通常依赖碎片化的解析器生态系统或编写自定义脚本来为 AI 应用程序从办公文档中提取文本。这些传统解决方案经常剥离至关重要的结构上下文，或产生使大语言模型困惑的非结构化文本块。MarkItDown 通过提供一个专门为现代代理 AI 框架（如 AutoGen）的语义需求调优的统一接口，填补了这一空白。它代表了从简单的文本提取到专为机器消费定制的语义结构保留的转变。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/microsoft/markitdown">GitHub - microsoft/markitdown: Python tool for converting files and office ...</a></li>
<li><a href="https://realpython.com/python-markitdown/">Python MarkItDown: Convert Documents Into LLM-Ready Markdown</a></li>
<li><a href="https://www.reddit.com/r/Rag/comments/1hpytqe/convert_pdf_word_excel_powerpoint_to_clean/">Convert PDF, Word, Excel, Powerpoint to clean Markdown for RAG or any ...</a></li>
<li><a href="https://medium.com/@giacomo__95/markitdown-ollama-and-llava-markdown-conversion-with-microsofts-markitdown-and-ollama-s-llm-2141bba9d183">Microsoft MarkItDown + Ollama and LLaVA: Markdown Conversion with ...</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者强调了该工具在 RAG 管道中的有效性，指出其与标准 OCR 方法相比在处理表格方面表现更佳。一些用户已成功将其与 Ollama 和 LLaVA 等本地模型集成，以在转换后的 Markdown 中生成图像描述。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-infrastructure</code>, <code class="language-plaintext highlighter-rouge">#data-preprocessing</code>, <code class="language-plaintext highlighter-rouge">#python</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#microsoft</code></p>

<hr />

<p><a id="item-33"></a></p>
<h2 id="multica-将自主编码代理编排为协作者-️-8010"><a href="https://github.com/multica-ai/multica">Multica 将自主编码代理编排为协作者</a> ⭐️ 8.0/10</h2>

<p>Multica 推出了一款开源平台，将自主编码代理视为可管理的队友而非孤立的工具。它使开发人员能够在统一的仪表板上分配任务、跟踪实时进度并积累可复用的技能。该系统支持自托管，并集成了 Claude Code 和 Codex 等主要模型。 该项目解决了从运行单个 AI 脚本到管理可扩展的自主工作队列之间的关键工程差距。通过将代理正式化为具有档案和状态更新的队友，它减少了“照看”AI 进程的运营开销。其对技能积累的关注使团队能够建立持久的知识库，让每个已解决的任务都能提升未来代理的性能。这将范式从提示工程转变为劳动力编排。 主要功能包括带有 WebSocket 流式传输的自主执行、多工作空间隔离以及用于本地守护进程管理的 CLI。代理可以在无人干预的情况下主动报告阻碍并更新问题状态。该平台是厂商中立的，通过统一的运行时接口支持各种底层 AI 编码模型。</p>

<p>rss · GitHub Trending - Daily · Apr 13, 01:32</p>

<p><strong>背景</strong>: 虽然存在许多自主编码代理，但大多数作为单次实例运行，需要持续的人工提示和监控。现有的编排工具通常缺乏软件开发生命周期管理所需的特定工作流集成。Multica 通过提供专为长期代理团队管理和技能保留设计的基础设施来填补这一空白。它超越了简单的任务执行，旨在创建一个可持续的人机协作环境。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://martinfowler.com/articles/exploring-gen-ai/autonomous-agents-codex-example.html">Autonomous coding agents: A Codex example - Martin Fowler</a></li>
<li><a href="https://www.omdena.com/blog/ai-agent-orchestration-tools">15 Best AI Agent Orchestration Tools &amp; Platforms in 2026</a></li>
<li><a href="https://www.ability.ai/blog/ai-agent-context-business-moat">AI agent context: how to build a compounding business moat</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者正在将其成熟度与既定的 CI/CD 流水线进行评估，并辩论完全自主代码提交的可靠性。其开源性质鼓励定制化，但生产就绪性取决于其在复杂仓库中错误处理的鲁棒性。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#autonomous-coding</code>, <code class="language-plaintext highlighter-rouge">#orchestration</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-34"></a></p>
<h2 id="archon面向-ai-编码的确定性工作流引擎-️-8010"><a href="https://github.com/coleam00/Archon">Archon：面向 AI 编码的确定性工作流引擎</a> ⭐️ 8.0/10</h2>

<p>Archon 作为首个开源 Harness 构建器应运而生，旨在让 AI 编码过程具有确定性和可重复性。它允许开发者使用 YAML 工作流定义复杂的软件开发生命周期，如规划和代码审查。该工具有效封装了 Claude Code 等 AI 代理，确保在不同项目中执行的一致性。 当前的 AI 编码代理往往因模型状态不同而产生不一致的结果，导致步骤遗漏或模板被忽略。Archon 通过强制实施刚性结构解决了这一问题，由工作流定义阶段和验证门控，而 AI 仅提供智能支持。这种转变将 AI 编码从不可预测的实验转变为可靠的、生产级的工程实践。通过在独立的 git 工作树中隔离运行，它还实现了多个修复任务的安全并行执行。 该项目支持组合式工作流，能够将 bash 脚本等确定性节点与用于代码生成的 AI 驱动节点混合使用。用户可以通过 CLI、Web UI、Slack 或 GitHub 触发这些可移植的工作流，极具灵活性。其主要功能包括自动循环直到测试通过，以及在合并更改前设置交互式的人工审批门控。</p>

<p>rss · GitHub Trending - Daily · Apr 13, 01:32</p>

<p><strong>背景</strong>: 在 Archon 出现之前，开发者缺乏在受控开发管道中编排 AI 代理的标准方法，通常依赖临时的提示词。现有的解决方案要么过于僵化，要么完全依赖于大语言模型的非确定性特性。Archon 填补了这一空白，充当了类似 GitHub Actions 的工作流引擎，但专门针对 AI 代理协调进行了优化。它弥合了实验性 AI 应用与严格软件工程需求之间的差距。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/coleam00/Archon">GitHub - coleam00/Archon: The first open-source harness ...</a></li>
<li><a href="https://www.mindstudio.ai/blog/what-is-archon-harness-builder-ai-coding">What Is the Archon Harness Builder? The Open-Source Framework for ...</a></li>
<li><a href="https://deepwiki.com/coleam00/Archon/1.1-getting-started">Getting Started | coleam00/Archon | DeepWiki</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者强调，该项目通过将 AI 动作限制在定义的工作流步骤内，有效减少了幻觉现象。社区对其在大型工程团队中标准化 AI 行为的潜力表现出浓厚兴趣。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-engineering</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-35"></a></p>
<h2 id="claude-mem为-claude-code-代理提供自动化上下文记忆-️-8010"><a href="https://github.com/thedotmack/claude-mem">Claude-Mem：为 Claude Code 代理提供自动化上下文记忆</a> ⭐️ 8.0/10</h2>

<p>Claude-Mem 是一款新插件，可自动捕获、压缩并将过去编码会话的相关上下文注入到未来的交互中。它利用 Claude Agent SDK 对会话历史进行总结，确保 AI 在无需人工干预的情况下保留关键项目细节。该工具直接解决了当前 AI 编程助手无状态性的局限。 该项目解决了一个关键的工作流瓶颈：AI 代理在会话间丢失上下文，迫使开发人员反复解释项目状态。通过实施自动化的会话记忆和智能压缩，它显著增强了代理的连续性并降低了 Token 使用成本。对于依赖 Claude Code 进行复杂开发任务的团队而言，这创造了一个更具持久性和感知力的协作伙伴。它将 AI 从无状态的查询引擎转变为连续的开发助手。 该插件通过捕获完整的会话日志，并利用大语言模型将其压缩为高密度上下文摘要后进行存储来运行。当新会话开始时，它会根据当前任务检索并注入仅最相关的历史数据。这种方法在保持对项目高度理解的同时，优化了上下文窗口的使用。</p>

<p>rss · GitHub Trending - Daily · Apr 13, 01:32</p>

<p><strong>背景</strong>: 用于编码的大语言模型通常受限于有限的上下文窗口，并且在不同交互之间缺乏长期记忆。开发人员通常必须手动重新提供背景信息，或依赖低效的提示工程来维持连续性。之前的解决方案通常需要人工总结或引入增加工作流复杂性的外部向量数据库。Claude-Mem 作为无缝插件直接集成到 Claude Code 环境中，填补了这一空白。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents">Effective context engineering for AI agents - Anthropic</a></li>
<li><a href="https://blog.jetbrains.com/research/2025/12/efficient-context-management/">Cutting Through the Noise: Smarter Context Management for LLM ...</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者强调，该插件能够减少在多日项目中为 AI 代理重复提供的入门提示。该工具的开源性质鼓励社区贡献以改进压缩算法和检索准确性。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#claude-code</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#context-management</code>, <code class="language-plaintext highlighter-rouge">#llm</code></p>

<hr />

<p><a id="item-36"></a></p>
<h2 id="rustfs基于-rust-的高性能-s3-兼容存储系统-️-8010"><a href="https://github.com/rustfs/rustfs">RustFS：基于 Rust 的高性能 S3 兼容存储系统</a> ⭐️ 8.0/10</h2>

<p>RustFS 是一款全新的开源分布式对象存储系统，完全采用 Rust 编写，声称在处理小对象负载时性能比 MinIO 快 2.3 倍。它提供完整的 S3 兼容性，并支持从 MinIO 和 Ceph 等现有平台无缝迁移。与许多竞争对手不同，它采用宽松的 Apache 2.0 许可证发布，而非 AGPL。 对于管理数据湖的 AI 工程师而言，快速摄入和检索数百万个小模型工件或数据集块的能力对流水线效率至关重要。RustFS 利用 Rust 的内存安全和并发模型，与基于 Go 的替代方案相比，降低了延迟和资源开销。Apache 2.0 许可证消除了通常困扰 AGPL 许可存储方案的企业采用法律障碍。这种组合使其成为高吞吐量机器学习操作的引人注目的基础设施选择。 该系统具有专为可扩展性和容错性设计的分布式架构，并原生支持 OpenStack Swift API。基准测试突显了其在 4KB 对象负载（常见于重元数据的 AI 工作负载）方面的显著速度优势。它包含用于与其他 S3 兼容平台共存和迁移的内置工具，以最大限度地减少操作中断。</p>
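
<p>既然 RustFS 声称完整兼容 S3 接口，理论上可以直接用 boto3 对接；以下示意中的服务端点与凭证均为占位假设：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 示意：用标准 boto3 客户端访问一个 S3 兼容的 RustFS 端点（地址与凭证为占位）
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:9000",
    aws_access_key_id="YOUR_ACCESS_KEY",
    aws_secret_access_key="YOUR_SECRET_KEY",
)
s3.create_bucket(Bucket="model-artifacts")
s3.upload_file("checkpoint.bin", "model-artifacts", "run-001/checkpoint.bin")
objects = s3.list_objects_v2(Bucket="model-artifacts").get("Contents", [])
print([o["Key"] for o in objects])
</code></pre></div></div>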

<p>rss · GitHub Trending - Daily · Apr 13, 01:32</p>

<p><strong>背景</strong>: 对象存储已成为 AI 数据湖的标准后端，但现有的开源解决方案通常在性能、许可限制和语言级安全性之间面临权衡。MinIO 虽然流行，但使用 AGPL 许可证，这可能对专有软件集成造成限制，且其 Go 实现可能并非所有小文件场景的最优解。RustFS 应运而生，通过 Rust 提供针对现代硬件优化的合法安全、高性能替代方案，填补了这一空白。它旨在提供 MinIO 的简洁性，同时摆脱许可负担和性能瓶颈。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Amazon_S3">Amazon S3 - Wikipedia</a></li>
<li><a href="https://supabase.com/docs/guides/storage/s3/compatibility">S3 Compatibility - Supabase Docs</a></li>
<li><a href="https://www.storj.io/blog/what-is-s3-compatibility">What is S3 Compatibility? - Storj</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期的讨论集中在 2.3 倍加速声明的有效性，以及从成熟的基于 Go 的栈切换到 Rust 的实际影响。开发人员特别关注在高负载下分布式共识机制的操作成熟度。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#rust</code>, <code class="language-plaintext highlighter-rouge">#object-storage</code>, <code class="language-plaintext highlighter-rouge">#s3-compatible</code>, <code class="language-plaintext highlighter-rouge">#infrastructure</code>, <code class="language-plaintext highlighter-rouge">#data-engineering</code></p>

<hr />

<p><a id="item-37"></a></p>
<h2 id="ralph用于执行产品需求文档的自主-ai-代理循环-️-8010"><a href="https://github.com/snarktank/ralph">Ralph：用于执行产品需求文档的自主 AI 代理循环</a> ⭐️ 8.0/10</h2>

<p>Ralph 引入了一种生产就绪的自主编码模式，通过迭代执行 AI 工具直至完成所有产品需求文档（PRD）项目。它通过为每次迭代启动全新的代理实例来管理上下文限制，同时通过 git 历史记录和状态文件持久化记忆。这种方法有效地在没有人工干预的情况下弥合了高层需求与代码实现之间的差距。 该项目直接解决了长期运行的代理工作流中上下文窗口限制的关键挑战，方法是通过版本控制维持状态的同时重置上下文。与单次代码生成器不同，Ralph 的循环架构允许复杂的多步骤功能开发，并能适应错误和不断变化的仓库状态。它提供了一个标准化的开源框架来编排现有的工具（如 Amp 和 Claude Code），而无需新的专有模型。对于工程团队而言，这代表了从 AI 辅助编码向基于结构化规范的真正自主功能实现的转变。 Ralph 通过将 markdown 格式的 PRD 转换为结构化的 <code class="language-plaintext highlighter-rouge">prd.json</code> 格式来驱动自主循环。它支持集成 Amp CLI 和 Claude Code，利用 git 提交和特定文本文件（<code class="language-plaintext highlighter-rouge">progress.txt</code>）作为其长期记忆机制。该系统包含用于生成 PRD 的可定制技能，并可配置为在达到上下文阈值时自动交接。</p>
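
<p>下面是一个概念性骨架（并非 Ralph 的实际实现，<code class="language-plaintext highlighter-rouge">prd.json</code> 的结构与代理 CLI 命令均为演示假设），用来说明“循环并重置上下文 + git/<code class="language-plaintext highlighter-rouge">progress.txt</code> 持久化”这一模式：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 概念示意：每轮读取未完成的 PRD 条目，启动全新代理进程，再用 git 与 progress.txt 记录进度
import json
import subprocess

def next_incomplete(prd_path="prd.json"):
    with open(prd_path) as f:
        items = json.load(f)["items"]   # 假设结构：{"items": [{"title": ..., "done": ...}]}
    return next((it for it in items if not it.get("done")), None)

item = next_incomplete()
while item is not None:
    # 每次迭代都是全新的代理实例，上下文不跨轮携带；"your-agent-cli" 为假设的占位命令
    subprocess.run(["your-agent-cli", "--task", item["title"]], check=True)
    with open("progress.txt", "a") as f:
        f.write("done: " + item["title"] + "\n")
    subprocess.run(["git", "add", "-A"], check=True)
    subprocess.run(["git", "commit", "-m", "ralph: " + item["title"]], check=True)
    item = next_incomplete()            # 代理被期望在完成后更新 prd.json 中的 done 标记
</code></pre></div></div>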

<p>rss · GitHub Trending - Daily · Apr 13, 01:32</p>

<p><strong>背景</strong>: 以往的 AI 编码解决方案往往因令牌限制而难以在长任务中保持连贯性，导致实现不完整或产生幻觉上下文。现有的编排框架通常需要复杂的设置，或者缺乏在重启间持久化状态的清晰机制。Ralph 通过应用一种基于 git 记忆的简单而有效的“循环并重置”模式填补了这一空白，其灵感来自 Geoffrey Huntley 早期的概念。它将自主代理的抽象概念转化为与当前开发者环境兼容的、由 shell 脚本驱动的实用工作流。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://blogs.oracle.com/developers/what-is-the-ai-agent-loop-the-core-architecture-behind-autonomous-ai-systems">What Is the AI Agent Loop? The Core Architecture Behind ...</a></li>
<li><a href="https://www.ibm.com/think/topics/llm-orchestration">What is LLM orchestration? - IBM</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 该项目因其通过 <code class="language-plaintext highlighter-rouge">prd.json</code> 强制执行严格的状态检查来解决代理中“无限循环”问题的务实方法而受到关注。开发人员赞赏它利用 git 等熟悉工具进行记忆，而不是依赖不透明的向量数据库。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#autonomous-coding</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#llm-orchestration</code>, <code class="language-plaintext highlighter-rouge">#automation</code></p>

<hr />

<p><a id="item-38"></a></p>
<h2 id="yt-dlpai-数据采集必备的命令行工具-️-8010"><a href="https://github.com/yt-dlp/yt-dlp">yt-dlp：AI 数据采集必备的命令行工具</a> ⭐️ 8.0/10</h2>

<p>yt-dlp 作为 youtube-dl 最活跃且强大的分支，持续支持数千个网站，并频繁更新以绕过平台限制。其最新版本专注于保持对不断变化的网站 API 的兼容性，并提升大规模操作下的提取速度。 对于 AI 工程师而言，高质量的多模态数据集至关重要，而 yt-dlp 提供了大规模采集公开音视频内容的最可靠机制。与不稳定的爬虫不同，该工具积极维护以应对反机器人措施及 YouTube、Bilibili 和 Twitter 等主要平台的格式变化。它无需复杂的定制开发，即可快速为语音识别、视频理解和生成模型创建训练数据。 这款基于 Python 的命令行工具支持数千个网站，提供按日期或元数据的高级过滤功能，并允许选择格式（包括原始音频提取）。它内置代理支持、Cookie 认证处理以及自动字幕下载功能，这些对于结构化数据集的准备工作至关重要。</p>
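
<p>以下示意基于 yt-dlp 文档中的常用选项，展示如何批量抓取音频与字幕用于语音数据集的准备：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 示意：用 yt-dlp 的 Python API 下载音频（转为 wav）并保存字幕
from yt_dlp import YoutubeDL

ydl_opts = {
    "format": "bestaudio/best",
    "outtmpl": "data/%(id)s.%(ext)s",
    "writesubtitles": True,          # 同时保存字幕，便于构造（音频, 文本）样本对
    "writeautomaticsub": True,
    "postprocessors": [{"key": "FFmpegExtractAudio", "preferredcodec": "wav"}],
}
with YoutubeDL(ydl_opts) as ydl:
    ydl.download(["https://www.youtube.com/watch?v=EXAMPLE_ID"])   # 占位链接
</code></pre></div></div>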

<p>rss · GitHub Trending - Python · Apr 13, 01:38</p>

<p><strong>背景</strong>: yt-dlp 是作为已停止维护的 youtube-dlc 的分支而创建的，旨在解决原版 youtube-dl 项目停滞不前的问题。它填补了对高性能、社区驱动的下载器的需求空白，能够跟上流媒体服务快速实施的安全和结构变化。通过整合来自各个分支的补丁和改进，它已成为命令行媒体提取的事实标准。</p>

<p><strong>社区讨论</strong>: 该项目在 Discord 和 GitHub 上拥有非常活跃的社区，每日的代码提交确保了对失效提取器的即时响应。用户经常分享用于特定 AI 管道集成的自定义脚本和配置，为数据工程师营造了一个协作环境。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#python</code>, <code class="language-plaintext highlighter-rouge">#cli</code>, <code class="language-plaintext highlighter-rouge">#data-collection</code>, <code class="language-plaintext highlighter-rouge">#multimedia</code>, <code class="language-plaintext highlighter-rouge">#automation</code></p>

<hr />

<p><a id="item-39"></a></p>
<h2 id="通过频谱分析逆向工程谷歌-synthid-水印-️-8010"><a href="https://github.com/aloshdenny/reverse-SynthID">通过频谱分析逆向工程谷歌 SynthID 水印</a> ⭐️ 8.0/10</h2>

<p>一项新研究工具仅利用频谱分析成功逆向工程了谷歌的 SynthID 水印，无需访问专有编码器。该项目推出的 V3 绕过方法在保持超过 43dB PSNR 的高保真度的同时，将相位相干性降低了 91%。 这一进展严重挑战了将不可见水印作为人工智能内容认证和安全唯一机制的可靠性。通过证明频谱指纹可以被精确移除，它迫使人们重新评估当前的数字溯源标准。对于研究人员而言，它提供了关于频域水印方案漏洞的重要见解。然而，这也突显了迫切需要超越简单信号嵌入的、更强大的多模态验证系统。 该工具利用多分辨率频谱码本自动选择匹配的分辨率配置文件，以进行精确的频率箱移除。据报道，其检测准确率达到 90%，并积极寻求社区贡献纯黑和纯白图像以扩展其码本。该项目在研究许可证下发布，明确限制了商业或生产环境的部署。</p>
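
<p>下面是一个与该项目无关的概念示意（并非其实际算法），仅用 numpy 演示对图像做频谱分析、定位高能量频率分量这类分析的起点：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 概念示意：对图像做二维 FFT，观察频域能量分布并找出最强的频率分量
import numpy as np

img = np.random.rand(256, 256)                 # 占位图像；实际应读入待分析的灰度图
spectrum = np.fft.fftshift(np.fft.fft2(img))
magnitude = np.log1p(np.abs(spectrum))

# 屏蔽直流分量后，列出能量最高的 5 个频率坐标
center = tuple(s // 2 for s in magnitude.shape)
masked = magnitude.copy()
masked[center] = -np.inf
order = np.argsort(masked, axis=None)[::-1]
peaks = np.column_stack(np.unravel_index(order[:5], magnitude.shape))
print("能量最高的频率坐标：", peaks)
</code></pre></div></div>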

<p>rss · GitHub Trending - Python · Apr 13, 01:38</p>

<p><strong>Background</strong>: Google DeepMind's SynthID is designed to embed imperceptible digital watermarks into AI-generated images to support transparency and trust. Previous watermark-removal approaches typically relied on brute-force methods such as heavy compression or noise injection, which significantly degrade image quality. This project fills a gap by demonstrating a targeted signal-processing approach that neutralizes the watermark while preserving visual fidelity. It shifts the paradigm from corrupting the whole image to precisely targeting the specific carrier frequencies the watermark uses.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://deepmind.google/models/synthid/">SynthID — Google DeepMind</a></li>
<li><a href="https://lilting.ch/en/articles/gemini-synthid-watermark-reverse-engineering">Reverse-Engineering Gemini's SynthID Watermark via Spectral ...</a></li>
<li><a href="https://arxiv.org/pdf/2602.01513v1">MARKCLEANER: High-Fidelity Watermark Removal via ...</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project is actively crowdsourcing specific reference images (pure-black and pure-white outputs) to improve robustness across resolutions. Discussion centers on the legal implications of bypassing watermarks under regulations such as the EU AI Act, as well as the ethics of releasing such tooling.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-safety</code>, <code class="language-plaintext highlighter-rouge">#reverse-engineering</code>, <code class="language-plaintext highlighter-rouge">#watermarking</code>, <code class="language-plaintext highlighter-rouge">#computer-vision</code>, <code class="language-plaintext highlighter-rouge">#research</code></p>

<hr />

<p><a id="item-40"></a></p>
<h2 id="voicebox本地优先的语音克隆桌面工作室-️-8010"><a href="https://github.com/jamiepine/voicebox">Voicebox：本地优先的语音克隆桌面工作室</a> ⭐️ 8.0/10</h2>

<p>Voicebox is an open-source desktop application that performs voice cloning, speech generation, and audio effects entirely locally, with no cloud dependency. The tool integrates five TTS engines, including Qwen3-TTS and Chatterbox Turbo, and supports expressive speech in 23 languages via paralinguistic tags. By keeping all model inference and voice data strictly on the user's machine, the project addresses key privacy and latency concerns. For AI engineers, it removes the deployment hurdles and costs associated with cloud APIs such as ElevenLabs, while offering a native, high-performance alternative built on Tauri rather than Electron. Its ability to run on hardware ranging from Apple Silicon to NVIDIA CUDA makes it a versatile tool for prototyping voice applications offline. Voicebox is built with Rust and Tauri for native performance and includes a multi-track timeline editor for composing complex narratives. It offers advanced post-processing effects such as pitch shifting and reverb, and follows an API-first design for seamless integration into custom projects.</p>

<p>rss · GitHub Trending - TypeScript · Apr 13, 01:39</p>

<p><strong>Background</strong>: Traditional text-to-speech and voice-cloning solutions usually depend on centralized cloud services, creating bottlenecks around data privacy, internet connectivity, and recurring usage costs. While local LLM inference has drawn plenty of attention, dedicated local studios for high-quality, multi-engine speech synthesis remain rare. Voicebox fills that gap with a full-featured, offline-capable environment whose feature set rivals commercial cloud platforms while preserving complete data sovereignty.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://www.kukarella.com/resources/ai-voice-cloning/the-10-best-voice-cloning-tools-in-2025-tested-and-compared">The 10 Best Voice Cloning Tools in 2025 (Tested &amp; Compared)</a></li>
<li><a href="https://www.merciaai.com/post/what-is-local-ai-inference-and-why-it-might-change-how-you-use-ai">What Is Local AI Inference? (Privacy, Speed, Cost) - Mercia AI</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#voice-synthesis</code>, <code class="language-plaintext highlighter-rouge">#text-to-speech</code>, <code class="language-plaintext highlighter-rouge">#voice-cloning</code>, <code class="language-plaintext highlighter-rouge">#local-ai</code>, <code class="language-plaintext highlighter-rouge">#desktop-app</code></p>

<hr />

<p><a id="item-41"></a></p>
<h2 id="openmetadata统一的数据治理与血缘平台-️-8010"><a href="https://github.com/open-metadata/OpenMetadata">OpenMetadata：统一的数据治理与血缘平台</a> ⭐️ 8.0/10</h2>

<p>OpenMetadata has matured into a production-grade solution that unifies data discovery, observability, and governance in a single platform. It stands out for deep column-level lineage tracking and a centralized metadata repository with more than 84 connectors, and the project is growing quickly with active community contributions and a regular release cadence. For AI engineers, reliable ML pipelines depend entirely on high-quality, well-understood input data, making strong data governance an essential prerequisite. OpenMetadata addresses the fragmentation problem where lineage, quality checks, and discovery are usually scattered across separate tools, providing a single source of truth. Its column-level lineage is particularly valuable for debugging data drift and understanding feature provenance across complex transformation graphs. By standardizing metadata through open APIs, it prevents vendor lock-in while integrating smoothly with existing data stacks. The platform consists of four main components: metadata schemas for standard definitions, a central repository storing the metadata graph, RESTful APIs for integration, and a pluggable ingestion framework. Out of the box it connects broadly to data warehouses, databases, dashboard services, and pipeline tools. Users can run advanced keyword searches across tables, topics, and pipelines to speed up discovery, and can annotate assets and track ownership directly in the UI to support team collaboration.</p>
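<p>As a hedged sketch of what "RESTful APIs for integration" looks like in practice, the snippet below runs a keyword search against a locally running server. The <code class="language-plaintext highlighter-rouge">/api/v1/search/query</code> path, the default port, the index name, and the JWT header are assumptions drawn from OpenMetadata's documented REST patterns; verify all of them against the version you deploy.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hedged sketch: keyword search against a local OpenMetadata server via its REST API.
# Endpoint path, port, index name, and auth header are assumptions; check your docs.
import requests

BASE = "http://localhost:8585/api/v1"          # assumed default local port
HEADERS = {"Authorization": "Bearer YOUR_JWT"}  # placeholder token

resp = requests.get(
    f"{BASE}/search/query",
    params={"q": "customer_orders", "index": "table_search_index", "size": 10},
    headers=HEADERS,
    timeout=30,
)
resp.raise_for_status()
for hit in resp.json().get("hits", {}).get("hits", []):
    source = hit.get("_source", {})
    print(source.get("fullyQualifiedName"), "-", source.get("description"))
</code></pre></div></div>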

<p>rss · GitHub Trending - TypeScript · Apr 13, 01:39</p>

<p><strong>Background</strong>: Before unified platforms like OpenMetadata, organizations struggled with siloed metadata management in which table-level lineage obscured fine-grained data-flow details. Traditional metadata repositories often lacked real-time observability or required expensive proprietary licenses to access column-level tracking. OpenMetadata fills this gap with an open-source alternative that combines deep technical lineage with user-friendly discovery. It answers the growing demand for transparency in data ecosystems, driven by regulatory compliance and the complexity of modern AI workloads.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://docs.getdbt.com/docs/explore/column-level-lineage">Column-level lineage | dbt Developer Hub</a></li>
<li><a href="https://www.thedataops.org/column-level-lineage/">What is Column-level lineage? Meaning, Examples, Use Cases ...</a></li>
<li><a href="https://atlan.com/column-level-lineage-explained/">Column-Level Lineage: What It Is and How To Use It - Atlan</a></li>
<li><a href="https://en.wikipedia.org/wiki/Metadata_repository">Metadata repository</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project has a vibrant, diverse community with notable adoption across industry verticals, reflected in its high commit frequency and regular releases. Comprehensive documentation covering installation, the roadmap, and detailed connector configuration lowers the barrier to entry for new teams. Community feedback actively shapes the product roadmap, keeping the tool aligned with real engineering needs rather than purely theoretical requirements.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#data-governance</code>, <code class="language-plaintext highlighter-rouge">#metadata</code>, <code class="language-plaintext highlighter-rouge">#data-observability</code>, <code class="language-plaintext highlighter-rouge">#data-engineering</code>, <code class="language-plaintext highlighter-rouge">#infrastructure</code></p>

<hr />

<p><a id="item-42"></a></p>
<h2 id="letta-code为-ai-编程代理提供持久化记忆-️-8010"><a href="https://github.com/letta-ai/letta-code">Letta Code：为 AI 编程代理提供持久化记忆</a> ⭐️ 8.0/10</h2>

<p>Letta Code is a TypeScript framework that lets coding agents retain memory and keep learning across separate sessions. Unlike traditional session-based tools, it allows an agent to maintain state and improve over time while working with a range of LLM providers. Current AI coding assistants typically reset context after each session, forcing developers to re-explain project details again and again. Letta Code addresses this by treating the agent as a long-term colleague that accumulates knowledge of the codebase and its owner's preferences. This "memory-first" approach significantly reduces ramp-up time on new tasks and preserves continuity across complex development workflows, marking a shift from one-off chat interactions to a persistent collaborative partnership. The tool supports multiple models including Claude, GPT, and Gemini, and lets users switch providers without losing agent history. It offers dedicated commands such as <code class="language-plaintext highlighter-rouge">/init</code> for memory setup and <code class="language-plaintext highlighter-rouge">/remember</code> for proactively telling the agent what to retain. It defaults to the Letta API, but users can configure a local Docker server or bring their own API keys for full control.</p>
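<p>To make the "memory-first" idea concrete, here is a small conceptual sketch of session-surviving memory in plain Python. It is not Letta's SDK or storage format; the file layout and function names are invented purely to illustrate how a <code class="language-plaintext highlighter-rouge">/remember</code>-style command might externalize state between sessions.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Conceptual illustration only: a session-surviving memory store in the spirit of
# "memory-first" coding agents. File layout and names are invented, not Letta's.
import json
from datetime import datetime, timezone
from pathlib import Path

MEMORY_FILE = Path(".agent_memory.json")  # hypothetical per-project store

def load_memory():
    if MEMORY_FILE.exists():
        return json.loads(MEMORY_FILE.read_text())
    return {"facts": []}

def remember(note):
    # Analogous to a /remember command: persist a fact so the next session sees it.
    memory = load_memory()
    memory["facts"].append({
        "note": note,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    MEMORY_FILE.write_text(json.dumps(memory, indent=2))

def build_system_prompt():
    # On startup (the /init step), prior facts are injected into the agent's context.
    facts = "\n".join(f"- {f['note']}" for f in load_memory()["facts"])
    return "You are a coding agent. Known project facts:\n" + facts

remember("The test suite lives in tests/ and is run with pytest -q.")
print(build_system_prompt())
</code></pre></div></div>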

<p>rss · GitHub Trending - TypeScript · Apr 13, 01:39</p>

<p><strong>Background</strong>: Most existing AI coding tools operate on a stateless model where every conversation is isolated, akin to hiring a new contractor for each task. That limitation prevents the AI from understanding a project's long-term evolution or a developer's habits. Letta Code fills this gap by implementing a persistent memory layer that survives session resets. Built on the Letta API, it gives agents a structured way to store and retrieve contextual information over long periods.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/letta-ai/letta-code">letta-ai/letta-code: The memory-first coding agent - GitHub</a></li>
<li><a href="https://www.letta.com/blog/letta-code">Letta Code: A Memory-First Coding Agent</a></li>
<li><a href="https://docs.letta.com/letta-code-sdk/quickstart/">Letta Code SDK</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight the benefit of an agent that remembers past debugging sessions and architectural decisions without manual context injection. Some users note, however, that the dependency on the external Letta API service could be a bottleneck for fully offline or private deployments.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#typescript</code>, <code class="language-plaintext highlighter-rouge">#persistent-memory</code></p>

<hr />

<p><a id="item-43"></a></p>
<h2 id="nvidia-nccl-tests必备的多-gpu-基准测试套件-️-8010"><a href="https://github.com/NVIDIA/nccl-tests">NVIDIA NCCL Tests：必备的多 GPU 基准测试套件</a> ⭐️ 8.0/10</h2>

<p>This project provides a dedicated suite of tests and benchmarks for measuring the performance and correctness of NVIDIA's NCCL communication library. It lets engineers validate collective communication primitives such as all-reduce and all-gather on single-node and multi-node GPU clusters, and has become the industry standard for verifying inter-GPU bandwidth and latency before deploying large-scale distributed training jobs. In distributed deep learning, inter-GPU communication bottlenecks often determine overall training efficiency, so precise measurement is critical. NCCL Tests lets infrastructure teams detect topology misconfigurations, PCIe bottlenecks, or network problems that generic benchmarks would miss. By providing fine-grained data for specific communication patterns, it ensures multi-GPU systems are tuned for frameworks like PyTorch and TensorFlow; without this validation, organizations risk serious resource waste from underperforming clusters. The tool can partition GPUs into smaller groups for parallel operations, enabling detailed scalability analysis. It covers all the major NCCL primitives, including broadcast, reduce-scatter, and send/receive patterns over NVLink, InfiniBand, and TCP/IP. Unlike general-purpose CUDA kernel benchmarks, it focuses exclusively on inter-process and inter-device communication latency and throughput.</p>
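<p>In practice the suite is driven from the shell; the wrapper below shows one plausible way to launch the standard <code class="language-plaintext highlighter-rouge">all_reduce_perf</code> binary from Python and assumes the repository has already been built. The flag meanings (-b min bytes, -e max bytes, -f step factor, -g GPUs per thread) follow the project's README, but confirm them against your checkout.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Launch nccl-tests' all_reduce_perf from Python (assumes the binaries are already built).
# Flags per the project's README: -b/-e message-size sweep, -f step factor, -g GPUs per thread.
import subprocess

cmd = [
    "./build/all_reduce_perf",
    "-b", "8",      # start at 8 bytes
    "-e", "256M",   # sweep up to 256 MiB
    "-f", "2",      # double the message size each step
    "-g", "8",      # use 8 GPUs in this process
]

result = subprocess.run(cmd, capture_output=True, text=True, check=True)
# The printed table includes per-size latency and bus bandwidth; a real harness
# might parse the busbw column, but here we simply echo the report.
print(result.stdout)
</code></pre></div></div>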

<p>rss · GitHub Trending - CUDA · Apr 13, 01:34</p>

<p><strong>Background</strong>: As AI models keep growing, training requires increasingly complex multi-node GPU clusters in which communication overhead can become the dominant constraint. NVIDIA's NCCL library addresses this with optimized primitives, but their effectiveness depends heavily on the underlying hardware topology and network configuration. Before tools like nccl-tests, engineers lacked a standardized way to separate communication performance from compute performance. This project fills that gap with a dedicated utility that stress-tests the communication fabric independently of any training framework.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/NVIDIA/nccl-tests">NVIDIA/nccl-tests - GitHub</a></li>
<li><a href="https://developer.nvidia.com/nccl">NVIDIA Collective Communications Library (NCCL)</a></li>
<li><a href="https://docs.nvidia.com/multi-node-nvlink-systems/multi-node-tuning-guide/measuring-performance.html">Benchmarking — NVIDIA GB200 NVL Multi-Node Tuning Guide</a></li>
<li><a href="https://developer.nvidia.com/blog/understanding-nccl-tuning-to-accelerate-gpu-to-gpu-communication/">Understanding NCCL Tuning to Accelerate GPU-to-GPU ...</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The engineering community widely treats the repository as a mandatory step when validating new cluster deployments, even though it is seen as a practical utility rather than a novel framework. Users frequently discuss tuning environment variables alongside these tests to maximize throughput on specific hardware configurations such as GB200 NVL systems.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#distributed-training</code>, <code class="language-plaintext highlighter-rouge">#gpu</code>, <code class="language-plaintext highlighter-rouge">#benchmarking</code>, <code class="language-plaintext highlighter-rouge">#infrastructure</code></p>

<hr />

<p><a id="item-44"></a></p>
<h2 id="thunderkittens-简化高性能-cuda-内核开发-️-8010"><a href="https://github.com/HazyResearch/ThunderKittens">ThunderKittens 简化高性能 CUDA 内核开发</a> ⭐️ 8.0/10</h2>

<p>HazyResearch has released ThunderKittens, a library of easy-to-use CUDA tile primitives for building fast deep learning kernels. The framework lets developers write high-performance AI code by following hardware-centric principles that prioritize small tiles of data. It acts as an embedded domain-specific language (DSL) designed to make low-level GPU optimization accessible without sacrificing speed. Writing custom CUDA kernels has traditionally been complex and error-prone, creating a bottleneck for researchers who need optimized operations beyond what standard libraries offer. ThunderKittens addresses this by abstracting hardware complexity while preserving direct control over memory and execution flow, enabling faster iteration on novel model architectures that require specialized kernels for maximum efficiency. The library is built around the principle that modern GPUs perform best when operating on fairly small blocks of data. It offers a clean, simple interface that produces efficient machine code directly from high-level descriptions. While very effective for specific tile-based operations, its intended audience is specialized kernel developers rather than general application engineers.</p>

<p>rss · GitHub Trending - CUDA · Apr 13, 01:34</p>

<p><strong>Background</strong>: Earlier options such as cuBLAS or hand-written CUDA deliver performance but lack the flexibility or ease of use that experimental research demands. Existing domain-specific languages often introduce overhead that prevents peak hardware utilization. ThunderKittens bridges the gap between raw CUDA complexity and the rigidity of high-level frameworks by focusing on tile primitives that match the silicon's capabilities.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/HazyResearch/ThunderKittens">HazyResearch/ThunderKittens: Tile primitives for speedy kernels - GitHub</a></li>
<li><a href="https://hazyresearch.stanford.edu/blog/2024-05-12-quick-tk">ThunderKittens: A Simple Embedded DSL for AI kernels - Hazy Research</a></li>
<li><a href="https://arxiv.org/html/2410.20399v1">ThunderKittens: Simple, Fast, and Adorable AI Kernels - arXiv</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The AI systems community sees it as a valuable tool for researchers pushing the limits of model efficiency, though it still requires solid CUDA knowledge. Early adopters praise its ability to produce code that is both "adorable" and fast, significantly simplifying the kernel-writing process.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#gpu-kernels</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#performance</code>, <code class="language-plaintext highlighter-rouge">#systems</code></p>

<hr />

<p><a id="item-45"></a></p>
<h2 id="deeptutor基于智能体架构的个性化-ai-辅导系统-️-7010"><a href="https://github.com/HKUDS/DeepTutor">DeepTutor：基于智能体架构的个性化 AI 辅导系统</a> ⭐️ 7.0/10</h2>

<p>DeepTutor has released version 1.0.3, introducing an integrated question notebook that supports bookmarking and categorization during quiz review. The update adds Mermaid diagram support for visualization, embedding-model mismatch detection, and compatibility with Qwen and vLLM providers, and it further expands local deployment options via LM Studio and llama.cpp. The project addresses the limitations of traditional static educational tools with an agent-native architecture that maintains persistent state and adapts to each learner's pace. Unlike conventional chatbots, DeepTutor orchestrates autonomous agents that dynamically plan, execute, and reflect on teaching strategies, producing genuinely personalized, evolving learning paths based on real-time student performance and feedback loops. For AI engineers, it serves as a solid reference implementation for building complex, stateful agent systems in education. The system is built on Python 3.11+ and Next.js 16, centered on a persistent "TutorBot" with long-term memory retention and autonomous task execution. It includes a command-line interface for agent-native interaction and supports multiple LLM backends, including local models served through llama.cpp. Its architecture emphasizes modularity, letting developers swap inference engines or customize agent behavior with ease.</p>

<p>rss · GitHub Trending - Python · Apr 13, 01:38</p>

<p><strong>Background</strong>: Current AI tutoring systems usually rely on simple prompt chains and lack persistent memory or sophisticated orchestration, which limits their ability to deliver deep longitudinal personalization. DeepTutor fills this gap with an agent-native design pattern in which state is externalized and agents operate in continuous planning loops. This shifts the paradigm from reactive Q&amp;A to proactive, strategic tutoring that mirrors a human educator's workflow. Prior solutions typically lacked the structural robustness to handle multi-session learning contexts effectively.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://learn.microsoft.com/en-us/azure/architecture/ai-ml/guide/ai-agent-design-patterns">AI Agent Orchestration Patterns - Azure Architecture Center</a></li>
<li><a href="https://pmanvi.medium.com/beyond-copilots-building-for-the-autonomous-future-a-practical-protocol-for-agent-native-ea067a26c205">AI Agent-Native Development. Introduction | by Praveen Manvi</a></li>
<li><a href="https://www.reddit.com/r/AI_Agents/comments/1qcif26/why_ai_agents_fail_without_agentnative_design/">Why “AI Agents” Fail Without Agent-Native Design - Reddit</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project maintains active community channels on Discord, Feishu, and WeChat, indicating strong engagement across both global and Chinese developer communities. Recent discussions focus on integrating new embedding models and optimizing local inference performance for resource-constrained environments.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-tutor</code>, <code class="language-plaintext highlighter-rouge">#agent-systems</code>, <code class="language-plaintext highlighter-rouge">#personalized-learning</code>, <code class="language-plaintext highlighter-rouge">#education-ai</code>, <code class="language-plaintext highlighter-rouge">#python</code></p>

<hr />

<p><a id="item-46"></a></p>
<h2 id="insforge-推出专为-ai-智能体开发设计的后端平台-️-7010"><a href="https://github.com/InsForge/InsForge">InsForge 推出专为 AI 智能体开发设计的后端平台</a> ⭐️ 7.0/10</h2>

<p>InsForge has released a new backend platform and SDK aimed at simplifying deployment of full-stack applications driven by AI agents. The platform exposes core backend primitives such as databases, authentication, and storage directly to coding agents, ships with native MCP server support, and offers a streamlined setup through Docker and Cursor integrations. As AI agents move from experimental tools to actual execution engines, they need robust infrastructure to manage state and external interactions reliably. InsForge fills this gap with a standardized backend layer that keeps developers from rebuilding common infrastructure for every agent workflow. This shift lets engineers focus on agent logic rather than boilerplate backend code, potentially accelerating the maturation of autonomous software development. The platform exposes backend primitives like databases and authentication to AI agents through a dedicated TypeScript SDK. A dedicated MCP (Model Context Protocol) server facilitates seamless connections between agents and backend resources. Deployment is containerized via Docker Compose and tuned for integration with AI code editors such as Cursor.</p>

<p>rss · GitHub Trending - TypeScript · Apr 13, 01:39</p>

<p><strong>Background</strong>: Traditional backend frameworks were designed for human developers writing explicit logic, whereas agentic workflows require dynamic, intent-driven infrastructure that AI models can query and manipulate autonomously. Prior solutions usually involved manually stitching together disparate services, leaving agent projects fragmented and costly to maintain. InsForge positions itself as a unified answer tailored to the distinct architectural needs of AI agents, aiming to standardize how agents interact with persistent data and services.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://grokipedia.com/page/GitHub_Agentic_Workflows">GitHub Agentic Workflows</a></li>
<li><a href="https://www.infoq.com/news/2025/10/ai-agent-orchestration/">The Architectural Shift: AI Agents Become Execution Engines While ...</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters are exploring how easy local setup is using the provided Docker configuration and Cursor prompts. Current discussion centers on verifying container health and resolving port conflicts during initial deployment.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#backend</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#typescript</code>, <code class="language-plaintext highlighter-rouge">#agentic-workflows</code></p>

<hr />

<p><a id="item-47"></a></p>
<h2 id="gpumd高性能-gpu-分子动力学模拟引擎-️-7010-1"><a href="https://github.com/brucefan1983/GPUMD">GPUMD：高性能 GPU 分子动力学模拟引擎</a> ⭐️ 7.0/10</h2>

<p>GPUMD is a molecular dynamics package implemented entirely on NVIDIA GPUs with CUDA, designed for maximum simulation efficiency. It is unusual in supporting both traditional empirical interatomic potentials and modern neuroevolution potential (NEP) machine-learning models. The software reaches tens of millions of atom-steps per second on a single GPU, making it suitable for large-scale system simulations. The tool bridges high-performance computing and AI-driven materials science, accelerating simulations that would be prohibitively slow on CPUs. Native support for NEP models lets researchers use high-accuracy machine-learned force fields without sacrificing computational performance. For AI engineers, it is a practical application of GPU acceleration beyond the standard deep learning training loop, aimed squarely at scientific discovery. GPUMD is written natively in CUDA and exploits massive parallelism to efficiently solve Newton's equations of motion for huge numbers of particles. Advanced capabilities such as thermal transport calculations and spectral energy density analysis are integrated directly into the GPU workflow. The project is production-ready and specifically optimized for NVIDIA GPUs as well as AMD/DCU architectures via HIP.</p>

<p>rss · GitHub Trending - CUDA · Apr 13, 01:34</p>

<p><strong>Background</strong>: Molecular dynamics simulations face enormous computational costs when modeling large systems over long timescales, often demanding sizeable CPU clusters. GPU-accelerated packages exist, but they frequently lack flexible integration with emerging machine-learned potentials. GPUMD fills this gap with a unified, efficient engine designed for modern GPU hardware and AI-augmented force fields.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://gpumd.org/">GPUMD – Graphics Processing Units Molecular Dynamics</a></li>
<li><a href="https://gpumd.cn/home_en.html">GPUMD - Efficient General-Purpose MD Simulation Software</a></li>
<li><a href="https://en.wikipedia.org/wiki/Molecular_dynamics_simulation">Molecular dynamics simulation</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project has gained traction in the computational physics community thanks to favorable performance benchmarks against established codes such as LAMMPS. Users highlight the ease of implementing custom NEP models as a key advantage over more rigid legacy systems.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#molecular-dynamics</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#gpu-computing</code>, <code class="language-plaintext highlighter-rouge">#computational-physics</code>, <code class="language-plaintext highlighter-rouge">#hpc</code></p>

<hr />]]></content><author><name></name></author><summary type="html"><![CDATA[From 110 items, 47 important content pieces were selected]]></summary></entry><entry xml:lang="en"><title type="html">Horizon Summary: 2026-04-13 (EN)</title><link href="https://ming-321.github.io/horizon/2026/04/12/summary-en.html" rel="alternate" type="text/html" title="Horizon Summary: 2026-04-13 (EN)" /><published>2026-04-12T16:00:00+00:00</published><updated>2026-04-12T16:00:00+00:00</updated><id>https://ming-321.github.io/horizon/2026/04/12/summary-en</id><content type="html" xml:base="https://ming-321.github.io/horizon/2026/04/12/summary-en.html"><![CDATA[<blockquote>
  <p>From 94 items, 45 important content pieces were selected</p>
</blockquote>

<hr />

<h3 id="头条速递">头条速递</h3>
<ol>
  <li><a href="#item-1">KIV Enables 1M Token Context on RTX 4070 via Tiered KV Cache</a> ⭐️ 9.0/10</li>
  <li><a href="#item-2">MiniMax Releases M2.7 Model with Open Weights on Hugging Face</a> ⭐️ 9.0/10</li>
  <li><a href="#item-3">Anthropic Launches Beta for Fully Managed Claude Agents</a> ⭐️ 9.0/10</li>
  <li><a href="#item-4">Chinese Team Releases First Large-Scale Ultrasound Dataset with 364k Image-Text Pairs</a> ⭐️ 8.0/10</li>
  <li><a href="#item-5">Analysis Claims LLMs Learn Backwards and Scaling Laws Are Bounded</a> ⭐️ 8.0/10</li>
  <li><a href="#item-6">New PyTorch Repo Teaches Distributed Training from Scratch</a> ⭐️ 8.0/10</li>
  <li><a href="#item-7">llama.cpp Adds Native Audio Support for Gemma-4 Models</a> ⭐️ 8.0/10</li>
  <li><a href="#item-8">Gemma 4 31B Inference Speed Boosted 50% on Code via Speculative Decoding</a> ⭐️ 8.0/10</li>
  <li><a href="#item-9">GLM-5.1 Matches Frontier Models in Social Reasoning at Lower Cost</a> ⭐️ 8.0/10</li>
  <li><a href="#item-10">Quantized MiniMax m2.7 Reaches 95% MMLU on High-Memory Macs</a> ⭐️ 8.0/10</li>
  <li><a href="#item-11">Unsloth Releases Full GGUF Quantizations for MiniMax M2.7</a> ⭐️ 8.0/10</li>
  <li><a href="#item-12">LazyMoE Enables 120B LLMs on 8GB RAM Without GPU</a> ⭐️ 8.0/10</li>
  <li><a href="#item-13">MOSS-TTS-Nano: A 0.1B Open-Source Multilingual TTS Model for CPU Realtime Inference</a> ⭐️ 8.0/10</li>
  <li><a href="#item-14">China’s First BCI Unicorn Develops Superhuman Bionic Hands for Robots</a> ⭐️ 7.0/10</li>
  <li><a href="#item-15">Gary Marcus Critiques Leaked Claude Code as Symbolic AI</a> ⭐️ 7.0/10</li>
  <li><a href="#item-16">Data Analysis Reveals Sharp Drop in ICLR 2026 Reviewer Agreement</a> ⭐️ 7.0/10</li>
  <li><a href="#item-17">MiniMax M2.7 Released with Restrictive Non-Commercial License</a> ⭐️ 7.0/10</li>
  <li><a href="#item-18">Repaired Qwen 3.5 35B Model Released with Native Apple MLX Support</a> ⭐️ 7.0/10</li>
  <li><a href="#item-19">Top AI Talent Accelerates Return from Silicon Valley to China</a> ⭐️ 7.0/10</li>
  <li><a href="#item-20">Durov Claims 95% of WhatsApp Backups Are Stored Unencrypted</a> ⭐️ 7.0/10</li>
</ol>

<h3 id="github-热榜">GitHub 热榜</h3>
<ol>
  <li><a href="#item-21">Karpathy Releases Minimal LLM Training in Pure C and CUDA</a> ⭐️ 10.0/10</li>
  <li><a href="#item-22">SageAttention Accelerates Inference via Quantization</a> ⭐️ 10.0/10</li>
  <li><a href="#item-23">Instant-NGP: Lightning-Fast Neural Graphics Training</a> ⭐️ 10.0/10</li>
  <li><a href="#item-24">Nous Research Launches Self-Improving Hermes Agent Framework</a> ⭐️ 9.0/10</li>
  <li><a href="#item-25">VoxCPM2: Tokenizer-Free Multilingual TTS with Voice Design</a> ⭐️ 9.0/10</li>
  <li><a href="#item-26">Google Releases Efficient Smaller BERT Models for Resource-Constrained Environments</a> ⭐️ 9.0/10</li>
  <li><a href="#item-27">DeepGEMM Delivers Optimized FP8 Kernels for NVIDIA GPUs</a> ⭐️ 9.0/10</li>
  <li><a href="#item-28">Optimized CUDA Library for Causal Conv1d in Mamba</a> ⭐️ 9.0/10</li>
  <li><a href="#item-29">Microsoft Releases MarkItDown for LLM Data Ingestion</a> ⭐️ 8.0/10</li>
  <li><a href="#item-30">Archon: Deterministic Harness for AI Coding Workflows</a> ⭐️ 8.0/10</li>
  <li><a href="#item-31">Multica Orchestrates Autonomous Coding Agents as Collaborative Teammates</a> ⭐️ 8.0/10</li>
  <li><a href="#item-32">Kronos: First Open-Source Foundation Model for Financial K-Lines</a> ⭐️ 8.0/10</li>
  <li><a href="#item-33">Reverse-Engineering Google’s SynthID Watermark via Spectral Analysis</a> ⭐️ 8.0/10</li>
  <li><a href="#item-34">Standardized Scientific Skills Library for AI Agents</a> ⭐️ 8.0/10</li>
  <li><a href="#item-35">AgentScope: Visual Debugging for Trustworthy Multi-Agent Systems</a> ⭐️ 8.0/10</li>
  <li><a href="#item-36">Claude-Mem Adds Persistent Memory to AI Coding Sessions</a> ⭐️ 8.0/10</li>
  <li><a href="#item-37">Qwen Code: Terminal-Based AI Agent for Developers</a> ⭐️ 8.0/10</li>
  <li><a href="#item-38">AutoBE Generates Guaranteed Compilable TypeScript Backends</a> ⭐️ 8.0/10</li>
  <li><a href="#item-39">NVIDIA cuopt Accelerates Large-Scale Routing Optimization</a> ⭐️ 8.0/10</li>
  <li><a href="#item-40">OpenDataLoader PDF: High-Accuracy Multi-Language Parser for RAG</a> ⭐️ 7.0/10</li>
  <li><a href="#item-41">DeepTutor Launches Agent-Native Personalized Learning System</a> ⭐️ 7.0/10</li>
  <li><a href="#item-42">Superpowers Framework Enforces Structured Agentic Workflows</a> ⭐️ 7.0/10</li>
  <li><a href="#item-43">Ralph: Autonomous AI Agent Loop for PRD Execution</a> ⭐️ 7.0/10</li>
  <li><a href="#item-44">Rowboat: Open-Source AI Coworker with Local Memory</a> ⭐️ 7.0/10</li>
  <li><a href="#item-45">GPUMD: High-Performance GPU Molecular Dynamics Engine</a> ⭐️ 7.0/10</li>
</ol>

<h2 id="头条速递-1">头条速递</h2>

<p><a id="item-1"></a></p>
<h2 id="kiv-enables-1m-token-context-on-rtx-4070-via-tiered-kv-cache-️-9010"><a href="https://old.reddit.com/r/MachineLearning/comments/1sjkmwz/kiv_1m_token_context_window_on_a_rtx_4070_12gb/">KIV Enables 1M Token Context on RTX 4070 via Tiered KV Cache</a> ⭐️ 9.0/10</h2>

<p>A new middleware called KIV (K-Indexed V Materialization) allows consumer GPUs like the RTX 4070 to handle 1 million token context windows by replacing standard KV caches with a tiered retrieval system. This approach keeps recent keys and values in VRAM while offloading older data to system RAM, using K vectors as an index to retrieve only the most relevant V entries during decoding. The solution requires no model retraining and works as a drop-in replacement for any HuggingFace model utilizing DynamicCache. This breakthrough significantly lowers the hardware barrier for running large-context LLMs locally, enabling complex tasks like analyzing entire codebases or books on affordable consumer hardware. By decoupling context length from VRAM capacity, KIV challenges the current industry reliance on expensive enterprise GPUs for long-context inference. If optimized further, this technique could democratize access to advanced AI capabilities for developers and researchers who cannot afford high-end data center equipment. It represents a shift from brute-force memory expansion to intelligent memory management in local AI deployment. On an RTX 4070 with 12GB VRAM running Gemma 4 E2B (4-bit), KIV achieves 1M token context with only ~6.5GB total GPU usage and a decode speed of 4.1 tokens per second. While prefilling 1M tokens takes approximately 4.3 minutes, the decode speed remains near-constant regardless of context length, though it is currently bottlenecked by CPU-to-GPU transfer rates. The system consumes about 5.8GB of system RAM for 1M tokens and has shown limitations in two-hop reasoning and dense similar-looking data scenarios due to collision disambiguation issues.</p>
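<p>The mechanism is easy to sketch at a high level. The toy code below is not KIV itself; it only illustrates the described idea of keeping a recent window of K/V on the GPU, spilling older entries to system RAM, and using key similarity as an index to fetch back a small number of relevant entries at decode time. The tensor shapes and the top-k retrieval rule are simplifying assumptions for illustration.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Toy illustration of a tiered KV store in the spirit of the described KIV middleware.
# Not the actual project code; shapes and retrieval policy are simplifying assumptions.
import torch

class TieredKV:
    def __init__(self, hot_window=1024, topk=64, device="cuda"):
        self.hot_window = hot_window      # most recent tokens kept on the GPU
        self.topk = topk                  # older entries retrieved per decode step
        self.device = device
        self.hot_k, self.hot_v = [], []   # GPU-resident, one vector per token
        self.cold_k, self.cold_v = [], [] # CPU-resident (pinned memory in a real system)

    def append(self, k, v):
        self.hot_k.append(k.to(self.device))
        self.hot_v.append(v.to(self.device))
        if len(self.hot_k) > self.hot_window:
            # Spill the oldest hot entry to system RAM.
            self.cold_k.append(self.hot_k.pop(0).to("cpu"))
            self.cold_v.append(self.hot_v.pop(0).to("cpu"))

    def gather(self, query):
        # Always attend over the hot window; fetch only the top-k relevant cold
        # entries, using the K vectors as a search index over offloaded data.
        k = torch.stack(self.hot_k)
        v = torch.stack(self.hot_v)
        if self.cold_k:
            cold_k = torch.stack(self.cold_k)
            scores = cold_k @ query.to("cpu")
            top = scores.topk(min(self.topk, len(self.cold_k))).indices
            sel_k = torch.stack([self.cold_k[int(i)] for i in top]).to(self.device)
            sel_v = torch.stack([self.cold_v[int(i)] for i in top]).to(self.device)
            k = torch.cat([sel_k, k])
            v = torch.cat([sel_v, v])
        return k, v

cache = TieredKV(hot_window=4, topk=2, device="cpu")  # tiny sizes just to demo
for _ in range(10):
    cache.append(torch.randn(8), torch.randn(8))
k, v = cache.gather(torch.randn(8))
print(k.shape, v.shape)
</code></pre></div></div>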

<p>rss · r/MachineLearning · Apr 12, 17:23</p>

<p><strong>Background</strong>: In transformer models, the KV cache stores Key and Value matrices from previous tokens to avoid recomputing them during generation, which speeds up inference but consumes significant VRAM as context grows. Traditionally, the size of this cache limits the maximum context length a GPU can handle, often requiring massive memory for million-token windows. HuggingFace’s DynamicCache interface allows developers to customize how these caches are stored and managed, enabling innovations like KIV to intercept and optimize memory usage without altering model weights. KIV leverages the observation that K vectors are structured enough to serve as search indices, while V vectors are too chaotic to compress effectively.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://medium.com/@joaolages/kv-caching-explained-276520203249">Transformers KV Caching Explained | by João Lages | Medium</a></li>
<li><a href="https://huggingface.co/docs/transformers/en/kv_cache">Cache strategies · Hugging Face</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#optimization</code>, <code class="language-plaintext highlighter-rouge">#kv-cache</code>, <code class="language-plaintext highlighter-rouge">#local-inference</code>, <code class="language-plaintext highlighter-rouge">#huggingface</code></p>

<hr />

<p><a id="item-2"></a></p>
<h2 id="minimax-releases-m27-model-with-open-weights-on-hugging-face-️-9010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sj0dm3/minimax_m27_released/">MiniMax Releases M2.7 Model with Open Weights on Hugging Face</a> ⭐️ 9.0/10</h2>

<p>MiniMax has officially released its M2.7 model, making the weights available for local deployment via Hugging Face. This 230-billion-parameter text-to-text AI model is designed to excel in coding, reasoning, and complex office productivity tasks. Notably, M2.7 is described as the first model in its series to deeply participate in its own evolution by building complex agent harnesses and utilizing dynamic tool search. The release of a 230B-parameter model with open weights significantly lowers the barrier for developers to experiment with state-of-the-art agentic workflows locally. This move challenges the prevailing trend where top-tier models are often restricted to cloud-only APIs, offering a powerful alternative for privacy-sensitive or offline applications. By enabling local execution of such a large model, MiniMax empowers the open-source community to refine and integrate advanced AI capabilities into custom productivity tools without relying on external servers. The M2.7 model features specific capabilities for building ‘Agent Teams’ and executing complex skills through dynamic tool search mechanisms. It is optimized for high-elaboration productivity tasks and coding, distinguishing it from general-purpose chatbots. The model is now accessible directly through Hugging Face and NVIDIA NIM, facilitating integration into various local inference frameworks.</p>

<p>rss · r/LocalLLaMA · Apr 12, 01:03</p>

<p><strong>Background</strong>: MiniMax Group is a Shanghai-based AI company known for developing multimodal models and consumer applications like Talkie and Hailuo AI. Historically, while MiniMax offered cloud-based APIs for its advanced models, many of its most capable systems were not available for on-premise deployment. The shift to releasing open weights for a model of this scale represents a significant strategic change, aligning with the growing demand for localized, sovereign AI infrastructure within the global developer community.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://huggingface.co/MiniMaxAI/MiniMax-M2.7">MiniMaxAI/ MiniMax - M 2 . 7 · Hugging Face</a></li>
<li><a href="https://build.nvidia.com/minimaxai/minimax-m2.7">minimax - m 2 . 7 Model by Minimaxai | NVIDIA NIM</a></li>
<li><a href="https://en.wikipedia.org/wiki/MiniMax_Group">MiniMax Group</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#model-release</code>, <code class="language-plaintext highlighter-rouge">#minimax</code>, <code class="language-plaintext highlighter-rouge">#local-llama</code></p>

<hr />

<p><a id="item-3"></a></p>
<h2 id="anthropic-launches-beta-for-fully-managed-claude-agents-️-9010"><a href="https://platform.claude.com/docs/en/managed-agents/overview">Anthropic Launches Beta for Fully Managed Claude Agents</a> ⭐️ 9.0/10</h2>

<p>Anthropic has officially released the beta version of Claude Managed Agents, a pre-built and configurable agent harness that runs on fully managed cloud infrastructure. This new service allows Claude to autonomously execute long-running tasks such as reading files, running commands, browsing the web, and writing code without developers needing to build custom agent loops or runtime environments. The platform is optimized for asynchronous workflows and includes built-in prompt caching to enhance performance and reduce costs. This launch represents a significant shift in AI application development by abstracting away the complex infrastructure required to run autonomous agents reliably. It lowers the barrier to entry for developers who previously had to engineer robust retry logic, state management, and tool execution layers from scratch. By providing a production-ready environment, Anthropic enables faster prototyping and deployment of sophisticated AI agents that can handle multi-step tasks over extended periods. This move competes directly with other emerging agent frameworks and could accelerate the adoption of AI in enterprise automation scenarios. The service currently supports real-time guidance and interruption of agent actions by developers during execution, ensuring human oversight remains possible. While the API is available now, advanced features like multi-agent collaboration and long-term memory are still in research preview. Users should note specific rate limits on the API, which currently allow up to 60 creation requests and 600 read requests per minute.</p>

<p>telegram · zaihuapd · Apr 12, 07:38</p>

<p><strong>Background</strong>: In AI development, an ‘agent loop’ refers to the software logic that repeatedly prompts an LLM, parses its output, executes tools, and feeds results back until a task is complete. Building these loops manually is challenging because it requires handling errors, managing conversation history, and securing the execution environment against malicious code. Prompt caching is a technique used to store parts of a conversation context so that the model does not need to re-process static information, significantly reducing latency and token costs for long sessions. Managed services aim to solve these engineering hurdles by providing a standardized, secure container where agents can operate safely.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://platform.claude.com/docs/en/managed-agents/overview">Claude Managed Agents overview - Claude API Docs</a></li>
<li><a href="https://www.anthropic.com/engineering/managed-agents">Scaling Managed Agents: Decoupling the brain from ...</a></li>
<li><a href="https://www.ibm.com/think/topics/prompt-caching">What is Prompt Caching? | IBM</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#anthropic</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#automation</code></p>

<hr />

<p><a id="item-4"></a></p>
<h2 id="chinese-team-releases-first-large-scale-ultrasound-dataset-with-364k-image-text-pairs-️-8010"><a href="https://www.qbitai.com/2026/04/399975.html">Chinese Team Releases First Large-Scale Ultrasound Dataset with 364k Image-Text Pairs</a> ⭐️ 8.0/10</h2>

<p>A Chinese research team has constructed the first large-scale dataset specifically dedicated to ultrasound imaging, comprising 364,000 image-text pairs. This dataset is designed to train AI models to deeply understand clinical diagnosis semantics rather than just recognizing visual patterns. The work has been accepted for presentation at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2026. This release marks a critical milestone for medical AI by shifting focus from generic image recognition to specialized semantic understanding of ultrasound data. By providing a massive volume of paired clinical text and images, it enables the training of large multimodal models that can interpret diagnostic reports alongside scans. This advancement addresses the scarcity of high-quality, domain-specific data that has previously hindered the deployment of reliable AI assistants in ultrasound diagnostics. Ultimately, it could significantly improve diagnostic accuracy and efficiency in healthcare settings globally. The dataset contains exactly 364,000 image-text pairs, making it the largest known collection focused exclusively on ultrasound modalities. It is specifically engineered to help AI models grasp the complex semantic relationships between ultrasound visuals and clinical diagnostic descriptions. The research will be showcased at CVPR 2026, which is scheduled to take place in June 2026 at the Colorado Convention Center.</p>

<p>rss · 量子位 · Apr 12, 07:21</p>

<p><strong>Background</strong>: Ultrasound imaging is a widely used medical diagnostic tool, but applying artificial intelligence to it has been challenging due to the lack of large, annotated datasets. Unlike standard photography, ultrasound images require expert interpretation where visual features must be correlated with specific clinical terminology and diagnosis codes. Recent advances in AI have moved towards large multimodal models that learn from paired images and text, similar to how humans learn from textbooks containing both pictures and explanations. However, prior to this release, most available medical datasets were either too small or focused on other modalities like X-rays or MRIs, leaving ultrasound underrepresented in the era of large AI models.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://cvpr.thecvf.com/">2026 Conference</a></li>
<li><a href="https://pubs.rsc.org/en/content/articlehtml/2025/sd/d5sd00146c">Artificial intelligence (Al) in healthcare diagnosis: evidence-based recent advances and clinical implications - Sensors &amp; Diagnostics (RSC Publishing) DOI:10.1039/D5SD00146C</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#medical-ai</code>, <code class="language-plaintext highlighter-rouge">#computer-vision</code>, <code class="language-plaintext highlighter-rouge">#datasets</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#healthcare</code></p>

<hr />

<p><a id="item-5"></a></p>
<h2 id="analysis-claims-llms-learn-backwards-and-scaling-laws-are-bounded-️-8010"><a href="https://old.reddit.com/r/MachineLearning/comments/1sj888x/llms_learn_backwards_and_the_scaling_hypothesis/">Analysis Claims LLMs Learn Backwards and Scaling Laws Are Bounded</a> ⭐️ 8.0/10</h2>

<p>A new technical analysis shared on Reddit argues that Large Language Models (LLMs) acquire patterns in a reverse order compared to human learning, starting with complex structures before mastering simpler rules. The author further contends that the prevailing scaling hypothesis is fundamentally bounded, suggesting that performance gains will inevitably plateau rather than continue indefinitely as compute increases. This challenges the common assumption that simply increasing model size and data will perpetually yield proportional improvements. This analysis is significant because it directly questions the economic and strategic foundations of current AI development, which relies heavily on the belief that ‘bigger is better.’ If scaling laws are indeed bounded, the industry may face diminishing returns sooner than expected, necessitating a shift towards more efficient architectures or novel training methods rather than brute-force scaling. Furthermore, the concept of ‘backwards learning’ could reshape our understanding of how these models generalize, potentially revealing blind spots in their reasoning capabilities that differ from human cognition. Ultimately, this could influence future research funding and the timeline for achieving Artificial General Intelligence (AGI). The linked analysis posits that while humans typically learn simple rules before complex exceptions, LLMs appear to fit complex statistical correlations first and only later approximate simpler underlying logic. The argument suggests that neural scaling laws, often modeled as power laws, may actually follow a sigmoid function when viewed over a sufficiently large range, implying a hard ceiling on performance. These claims are presented as a theoretical critique based on observed learning dynamics rather than a new empirical benchmark with specific numerical results.</p>
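<p>The power-law-versus-sigmoid distinction is the crux of the bounded-scaling claim and is easy to visualize. The snippet below is purely illustrative, with made-up constants rather than data from the analysis: over a narrow compute range the two curve shapes are hard to tell apart, but the sigmoid saturates while the power-law-derived score keeps creeping upward.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Illustrative only: compare a power-law loss trend with a bounded sigmoid-style
# accuracy curve over compute. All constants are made up for the plot.
import numpy as np
import matplotlib.pyplot as plt

compute = np.logspace(0, 8, 200)              # arbitrary compute units

loss_power = 2.0 * compute ** -0.07           # classic power-law loss decay
score_power = np.clip(1.0 - loss_power, 0.0, 1.0)  # derived score keeps improving slowly

# A bounded metric modeled as a sigmoid over log-compute saturates near 1.0.
score_sigmoid = 1.0 / (1.0 + np.exp(-(np.log10(compute) - 4.0)))

plt.semilogx(compute, score_power, label="power-law-derived score")
plt.semilogx(compute, score_sigmoid, label="sigmoid (bounded) score")
plt.xlabel("compute (arbitrary units)")
plt.ylabel("benchmark score")
plt.legend()
plt.savefig("scaling_shapes.png")
</code></pre></div></div>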

<p>rss · r/MachineLearning · Apr 12, 07:51</p>

<p><strong>Background</strong>: Neural scaling laws are empirical observations describing how model performance improves predictably as factors like model size, dataset size, and compute budget increase. Historically, these relationships have been modeled as power laws, fueling the hypothesis that continuous scaling could lead to arbitrarily high intelligence. However, recent discussions have introduced concepts like ‘inverse scaling,’ where larger models sometimes perform worse on specific tasks, and mathematical arguments that bounded metrics (like accuracy) must eventually saturate. Understanding these limits is crucial for distinguishing between transient growing pains and fundamental barriers to progress.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Neural_scaling_law">Neural scaling law - Wikipedia</a></li>
<li><a href="https://arxiv.org/html/2507.00885v1">Scaling Laws Are Unreliable for Downstream Tasks: A Reality Check</a></li>
<li><a href="https://cameronrwolfe.substack.com/p/llm-scaling-laws">Scaling Laws for LLMs: From GPT-3 to o3</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#scaling-laws</code>, <code class="language-plaintext highlighter-rouge">#machine-learning-research</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code></p>

<hr />

<p><a id="item-6"></a></p>
<h2 id="new-pytorch-repo-teaches-distributed-training-from-scratch-️-8010"><a href="https://old.reddit.com/r/MachineLearning/comments/1sjglrn/educational_pytorch_repo_for_distributed_training/">New PyTorch Repo Teaches Distributed Training from Scratch</a> ⭐️ 8.0/10</h2>

<p>A new open-source repository by user shreyansh26 provides explicit, from-scratch implementations of major distributed training techniques including Data Parallelism (DP), Fully Sharded Data Parallelism (FSDP), Tensor Parallelism (TP), and Pipeline Parallelism (PP). Instead of relying on high-level PyTorch abstractions, the code manually writes forward and backward logic along with collective communication operations to reveal the underlying algorithms. The project uses a simple synthetic task with repeated 2-matmul MLP blocks to isolate and clarify communication patterns, drawing inspiration from the JAX ML Scaling book. This resource is significant because it demystifies complex distributed training strategies that are often hidden behind framework magic, allowing developers to truly understand how gradients and parameters are synchronized across devices. By mapping mathematical concepts directly to runnable code, it bridges the gap between theoretical research papers and practical engineering implementation for students and researchers. As models grow larger and require multi-GPU setups, understanding these low-level mechanics becomes crucial for debugging performance bottlenecks and optimizing custom architectures. It serves as a vital educational tool compared to existing documentation which often assumes prior knowledge of collective operations. The repository intentionally avoids high-level APIs to force users to engage with the explicit forward/backward passes and collective communication primitives like AllReduce. The model architecture is simplified to repeated 2-matmul MLP blocks on a synthetic task, ensuring that the focus remains strictly on communication patterns rather than model complexity. This approach is based on Part-5 of the JAX ML Scaling book, adapting its pedagogical style to the PyTorch ecosystem. Users should note that this is an educational tool for learning algorithms, not a production-ready library for training large-scale models.</p>
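<p>For readers who want a feel for what "explicit collectives instead of framework magic" means, here is a minimal data-parallel sketch using <code class="language-plaintext highlighter-rouge">torch.distributed</code> directly: each rank computes gradients on its own shard and then averages them with an explicit all-reduce. This is a generic illustration of the DP pattern, not code taken from the repository.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Generic data-parallel step with explicit collectives (illustrative, not from the repo).
# Launch with: torchrun --nproc_per_node=N this_file.py
import torch
import torch.distributed as dist

def train_step(model, batch, loss_fn, optimizer):
    optimizer.zero_grad(set_to_none=True)
    inputs, targets = batch
    loss = loss_fn(model(inputs), targets)
    loss.backward()

    world_size = dist.get_world_size()
    for param in model.parameters():
        if param.grad is not None:
            # Explicit AllReduce: sum gradients across ranks, then average.
            dist.all_reduce(param.grad, op=dist.ReduceOp.SUM)
            param.grad.div_(world_size)

    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    dist.init_process_group(backend="gloo")  # use "nccl" on multi-GPU nodes
    torch.manual_seed(0)                     # identical init on every rank
    model = torch.nn.Sequential(torch.nn.Linear(32, 64), torch.nn.ReLU(),
                                torch.nn.Linear(64, 1))
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    batch = (torch.randn(16, 32), torch.randn(16, 1))
    print(train_step(model, batch, torch.nn.functional.mse_loss, optimizer))
    dist.destroy_process_group()
</code></pre></div></div>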

<p>rss · r/MachineLearning · Apr 12, 14:51</p>

<p><strong>Background</strong>: Distributed training is essential for modern deep learning, allowing models to be trained across multiple GPUs or nodes when they exceed the memory capacity of a single device. Techniques like Data Parallelism replicate the model across devices while splitting the data, whereas Tensor Parallelism and Pipeline Parallelism split the model itself to handle massive parameter counts. Fully Sharded Data Parallelism (FSDP) is an advanced method that shards model parameters, gradients, and optimizer states to maximize memory efficiency. Understanding the ‘collective communications’ such as AllReduce is fundamental to these methods, as they coordinate the synchronization of data across the distributed system.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://docs.nersc.gov/machinelearning/distributed-training/">Distributed training - NERSC Documentation</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#pytorch</code>, <code class="language-plaintext highlighter-rouge">#distributed-training</code>, <code class="language-plaintext highlighter-rouge">#machine-learning</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#education</code></p>

<hr />

<p><a id="item-7"></a></p>
<h2 id="llamacpp-adds-native-audio-support-for-gemma-4-models-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sjhxrw/audio_processing_landed_in_llamaserver_with_gemma4/">llama.cpp Adds Native Audio Support for Gemma-4 Models</a> ⭐️ 8.0/10</h2>

<p>The llama.cpp project has officially merged support for speech-to-text (STT) processing directly into its llama-server component, specifically enabling the use of Google’s Gemma-4 E2A and E4A models. This update, confirmed via a recent pull request adding a Conformer audio encoder, allows users to process audio inputs natively without external transcription services. The integration marks the first time these specific multimodal Gemma-4 variants can run end-to-end audio tasks within the popular local inference framework. This development is significant because it eliminates the need for complex, multi-service pipelines that previously required separate tools for transcription and text generation in local AI setups. By embedding audio capabilities directly into llama-server, developers can now build fully offline, privacy-preserving voice assistants using state-of-the-art open weights from Google. It fundamentally shifts the workflow for local deployment, making real-time voice interaction as accessible as text chat for the open-source community. Furthermore, it validates the trend of moving towards truly multimodal models that handle diverse input types within a single binary. The implementation specifically targets the Gemma-4 E2A and E4A model variants, which are designed with audio conformer encoders to handle speech input alongside text. Users will need to ensure they are running the latest version of llama-server that includes the merged ‘mtmd’ audio support to utilize these features. While this enables powerful local voice interactions, it currently relies on specific Gemma-4 architectures rather than offering a universal adapter for all audio-capable models.</p>

<p>rss · r/LocalLLaMA · Apr 12, 15:42</p>

<p><strong>Background</strong>: llama.cpp is a widely adopted C++ library known for efficiently running large language models on consumer hardware, often serving as the backend for tools like Ollama and LM Studio. Historically, adding voice capabilities to these local models required chaining together separate speech-to-text engines (like Whisper) with the language model, increasing latency and complexity. Google’s Gemma series represents their family of open-weights models, with Gemma-4 introducing native multimodal capabilities including audio processing. The ‘Conformer’ architecture mentioned is a specific neural network design optimized for recognizing patterns in sequential data like speech.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://deepmind.google/models/gemma/gemma-4/">Gemma 4 — Google DeepMind</a></li>
<li><a href="https://ai.google.dev/gemma/docs/core/model_card_4">Gemma 4 model card | Google AI for Developers</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llama.cpp</code>, <code class="language-plaintext highlighter-rouge">#gemma</code>, <code class="language-plaintext highlighter-rouge">#speech-to-text</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#local-ai</code></p>

<hr />

<p><a id="item-8"></a></p>
<h2 id="gemma-4-31b-inference-speed-boosted-50-on-code-via-speculative-decoding-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sjct6a/speculative_decoding_works_great_for_gemma_4_31b/">Gemma 4 31B Inference Speed Boosted 50% on Code via Speculative Decoding</a> ⭐️ 8.0/10</h2>

<p>A community benchmark demonstrates that using the Gemma 4 E2B (4.65B) model as a draft for the Gemma 4 31B model significantly accelerates inference speeds on an RTX 5090 GPU. The testing revealed an average speed increase of 29%, with code generation tasks specifically seeing a 50.5% improvement in tokens per second. Crucially, the author identified that matching the <code class="language-plaintext highlighter-rouge">add_bos_token</code> metadata between the target and draft models is essential to avoid performance-degrading token translation overhead. This finding is significant because it provides a practical method to nearly double the speed of code generation for large open-weight models without requiring additional hardware. It highlights that speculative decoding effectiveness is highly dependent on task type, offering massive gains for structured outputs like code while providing more modest improvements for creative writing. Furthermore, the discovery of the metadata compatibility trap prevents users from wasting time on misconfigured setups that could ironically slow down inference. This directly impacts developers deploying local LLMs by making high-parameter models more responsive for real-time coding assistance. The benchmarks were conducted on Windows 11 using an RTX 5090 with 32GB VRAM, utilizing a llama.cpp fork with TurboQuant KV cache. While code generation saw a +50.5% speedup with a 60.7% acceptance rate, Korean poetry only achieved a +9.5% boost due to a lower 44.1% acceptance rate. The study warns that if the <code class="language-plaintext highlighter-rouge">add_bos_token</code> setting differs between the GGUF files of the main and draft models, the system falls back to a slow token translation mode, reducing speeds drastically from ~57 t/s to ~7 t/s.</p>
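<p>Reproducing a setup like this with llama.cpp comes down to pointing the server at both GGUF files. The wrapper below is a sketch: the <code class="language-plaintext highlighter-rouge">--model-draft</code> and <code class="language-plaintext highlighter-rouge">--draft-max</code> flags exist in recent llama.cpp builds, but names and defaults change between versions, so confirm with <code class="language-plaintext highlighter-rouge">llama-server --help</code>; the file paths are placeholders.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Sketch: launch llama-server with a draft model for speculative decoding.
# Flag names are from recent llama.cpp builds; verify with `llama-server --help`.
import subprocess

cmd = [
    "llama-server",
    "--model", "gemma-4-31b-Q4_K_M.gguf",        # placeholder target model path
    "--model-draft", "gemma-4-E2B-Q4_K_M.gguf",  # placeholder draft model path
    "--draft-max", "8",                          # tokens drafted per verification step
    "--gpu-layers", "99",
    "--port", "8080",
]

# Both GGUFs should come from the same model family so the tokenizer metadata
# (including add_bos_token) matches; otherwise the server falls back to the
# slow token-translation path described above.
subprocess.run(cmd, check=True)
</code></pre></div></div>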

<p>rss · r/LocalLLaMA · Apr 12, 12:08</p>

<p><strong>Background</strong>: Speculative decoding is an optimization technique where a smaller, faster ‘draft’ model predicts multiple future tokens, which are then verified in parallel by a larger, more accurate ‘target’ model. This process reduces the memory-bound latency of generating tokens one by one, potentially speeding up inference by 2-3 times if the draft model’s predictions are frequently accepted. For this to work efficiently, both models must share the exact same vocabulary and tokenizer configuration to avoid costly conversion steps. The Gemma 4 family includes various sizes, such as the 31B parameter model and the smaller E2B variant, which are designed to be compatible for such pairing.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://www.bentoml.com/llm/inference-optimization/speculative-decoding">Speculative decoding | LLM Inference Handbook</a></li>
<li><a href="https://lmstudio.ai/docs/app/advanced/speculative-decoding">Speculative Decoding | LM Studio Docs</a></li>
<li><a href="https://huggingface.co/google/gemma-4-E2B-it">google/ gemma - 4 - E 2 B -it · Hugging Face</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#speculative-decoding</code>, <code class="language-plaintext highlighter-rouge">#llm-optimization</code>, <code class="language-plaintext highlighter-rouge">#gemma</code>, <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#inference-speed</code></p>

<hr />

<p><a id="item-9"></a></p>
<h2 id="glm-51-matches-frontier-models-in-social-reasoning-at-lower-cost-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sjm407/glm_51_sits_alongside_frontier_models_in_my/">GLM-5.1 Matches Frontier Models in Social Reasoning at Lower Cost</a> ⭐️ 8.0/10</h2>

<p>A community benchmark using the social deduction game ‘Blood on the Clocktower’ reveals that GLM-5.1 achieves performance comparable to Claude Opus 4.6 while costing significantly less. Specifically, GLM-5.1 incurred a cost of $0.92 per game compared to $3.69 for Claude Opus 4.6, all while maintaining a 0% tool error rate during autonomous play. This data suggests GLM-5.1 can effectively handle complex, long-horizon agentic tasks that typically challenge earlier model versions. This finding is significant because it demonstrates that high-level social reasoning and strategic planning no longer require the most expensive frontier models to execute effectively. For developers building autonomous agents or multi-agent simulations, GLM-5.1 offers a potential four-fold reduction in operational costs without sacrificing competitive performance. The ability to maintain low error rates in complex, deceptive environments like ‘Blood on the Clocktower’ indicates robustness suitable for real-world applications involving negotiation or fraud detection. Furthermore, as GLM-5.1 is noted to be trained on Huawei chips and available as open-weights, it provides a viable alternative for regions or organizations seeking sovereignty from Western proprietary models. The benchmark specifically utilized autonomous games of ‘Blood on the Clocktower,’ where GLM-5.1 played as part of the evil team, demonstrating its capacity for deception and strategic coordination. While the author notes that more matches are needed for fully reliable statistical data, the current results show a stark price-performance contrast between the two models. The test highlighted a 0% tool error rate for GLM-5.1, suggesting strong reliability in executing game actions without technical failures.</p>

<p>rss · r/LocalLLaMA · Apr 12, 18:18</p>

<p><strong>Background</strong>: GLM-5.1 is a large language model developed by Zhipu AI (Z.ai), designed to remain effective on agentic tasks over longer horizons compared to its predecessors which often plateaued early. ‘Blood on the Clocktower’ is a complex social deduction board game where players must deduce hidden roles through conversation, lying, and logical analysis, making it an excellent stress test for AI social intelligence. In the AI industry, ‘frontier models’ refer to the most capable systems currently available, such as Claude Opus, which are often used as the gold standard for benchmarking new releases. Social reasoning benchmarks are increasingly important as AI shifts from simple chatbots to autonomous agents capable of interacting in dynamic, multi-party environments.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://huggingface.co/zai-org/GLM-5.1">zai-org/ GLM - 5 . 1 · Hugging Face</a></li>
<li><a href="https://wavespeed.ai/blog/posts/glm-5-1-vs-claude-gpt-gemini-deepseek-llm-comparison/">GLM - 5 . 1 vs Claude, GPT, Gemini, DeepSeek... | WaveSpeedAI Blog</a></li>
<li><a href="https://en.wikipedia.org/wiki/Blood_on_the_Clocktower">Blood on the Clocktower - Wikipedia</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#glm-5.1</code>, <code class="language-plaintext highlighter-rouge">#llm-benchmarking</code>, <code class="language-plaintext highlighter-rouge">#cost-efficiency</code>, <code class="language-plaintext highlighter-rouge">#social-reasoning</code>, <code class="language-plaintext highlighter-rouge">#local-llama</code></p>

<hr />

<p><a id="item-10"></a></p>
<h2 id="quantized-minimax-m27-reaches-95-mmlu-on-high-memory-macs-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sjakko/minimax_m27_mac_only_63gb_88_and_89gb_95_mmlu_200q/">Quantized MiniMax m2.7 Reaches 95% MMLU on High-Memory Macs</a> ⭐️ 8.0/10</h2>

<p>A community member has successfully deployed quantized versions of the MiniMax m2.7 model on Apple Silicon Macs with high unified memory configurations. Specifically, a 63GB variant achieved 88% accuracy while an 89GB variant reached 95% on the MMLU benchmark using 200 questions. These models are now available via Hugging Face repositories created by user JANGQ-AI for local inference. This achievement demonstrates that consumer-grade Apple hardware can now run near-state-of-the-art large language models with performance comparable to top-tier cloud APIs like Claude Sonnet. It significantly lowers the barrier for running powerful AI locally, offering enhanced privacy and zero-latency inference without relying on external servers. The result suggests that upcoming chips like the M5 Max could further bridge the gap between local devices and enterprise-grade AI clusters. This shift empowers developers and researchers to experiment with advanced models entirely offline. The reported performance metrics include 88% accuracy for the 63GB model and 95% for the 89GB model on the MMLU 200-question subset. The post speculates that future M5 Max chips could achieve speeds of 50 tokens per second and 400 prompts per minute. These specific quantized models are currently optimized exclusively for macOS environments with sufficient unified RAM to load the large weight files. Users can access the models directly through the provided Hugging Face links labeled ‘JANG_2L’ and ‘JANG_3L’.</p>

<p>rss · r/LocalLLaMA · Apr 12, 10:08</p>

<p><strong>Background</strong>: MMLU (Massive Multitask Language Understanding) is a standard benchmark used to evaluate the knowledge and reasoning capabilities of AI models across various subjects. Quantization is a technique that reduces the precision of model weights to decrease memory usage and improve inference speed on consumer hardware. Apple Silicon Macs utilize a unified memory architecture that allows the CPU and GPU to access the same large pool of RAM, making them uniquely suited for running large local LLMs. Recent advancements in quantization methods have made it possible to run models previously restricted to data centers on personal computers.</p>

<p><strong>Discussion</strong>: The community expresses excitement about the proximity to ‘Sonnet 4.5 at home’ performance levels and anticipates even faster speeds with future M5 Max hardware. There is a strong consensus that these developments mark a major leap forward for local AI deployment capabilities on consumer devices.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#apple-silicon</code>, <code class="language-plaintext highlighter-rouge">#model-performance</code>, <code class="language-plaintext highlighter-rouge">#quantization</code>, <code class="language-plaintext highlighter-rouge">#minimax</code></p>

<hr />

<p><a id="item-11"></a></p>
<h2 id="unsloth-releases-full-gguf-quantizations-for-minimax-m27-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sj7wc8/unsloth_minimax_m27_quants_just_finished/">Unsloth Releases Full GGUF Quantizations for MiniMax M2.7</a> ⭐️ 8.0/10</h2>

<p>Unsloth has uploaded a comprehensive suite of GGUF quantized models for the MiniMax M2.7 architecture to Hugging Face, ranging from extreme 1-bit compression to full BF16 precision. The release includes over twenty distinct variants, with file sizes spanning from 60.7 GB for the UD-IQ1_M format up to 457 GB for the uncompressed BF16 version. This update provides immediate access to optimized inference files for users wanting to run this new model on local hardware. This release significantly lowers the barrier to entry for running the powerful MiniMax M2.7 model locally by offering formats compatible with consumer-grade GPUs and even CPU-only setups via low-bit quantization. By providing such a wide spectrum of options, Unsloth enables developers to balance model performance against memory constraints, making advanced AI accessible on diverse hardware configurations. The availability of these quants lets the community begin testing and integrating MiniMax M2.7 into local LLM workflows immediately, rather than waiting for other conversions to appear. Furthermore, it highlights Unsloth’s growing role as a critical infrastructure provider for the open-source local AI ecosystem. The uploaded files include specialized quantization labels such as UD-IQ1_M, UD-Q4_K_M, and MXFP4_MOE, catering to specific efficiency needs across 1-bit to 16-bit precisions. File sizes vary drastically, with the 1-bit version requiring only 60.7 GB of storage while the 4-bit MXFP4_MOE variant occupies 136 GB, and the full BF16 model demands 457 GB. Users can access these models directly at the unsloth/MiniMax-M2.7-GGUF repository on Hugging Face for immediate deployment with llama.cpp-compatible tools.</p>

<p>rss · r/LocalLLaMA · Apr 12, 07:31</p>

<p><strong>Background</strong>: GGUF (GPT-Generated Unified Format) is a specialized file format designed for storing large language models that supports efficient quantization, allowing models to run on limited hardware without losing significant accuracy. Quantization reduces the numerical precision of model weights (e.g., from 16-bit to 4-bit), drastically decreasing memory usage and increasing inference speed on consumer devices. Unsloth is a well-known optimization library and team in the AI community, frequently recognized for releasing high-speed fine-tuning tools and ready-to-use quantized models for popular architectures. The MiniMax M2.7 refers to a specific large language model developed by MiniMax, which requires these quantized versions to be practical for local deployment.</p>
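
<p>As a rough sketch of how such a quant would be used, the snippet below pulls one file from the unsloth/MiniMax-M2.7-GGUF repository and loads it with llama-cpp-python, a llama.cpp binding. The exact filename is an assumption (check the repository listing), and the larger variants are typically split into several shards.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Sketch: fetch one GGUF quant and run it locally via llama-cpp-python.
# The filename is an assumption; large quants may be sharded into multiple files.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

path = hf_hub_download(
    repo_id="unsloth/MiniMax-M2.7-GGUF",
    filename="MiniMax-M2.7-UD-Q4_K_M.gguf",   # hypothetical filename
)

llm = Llama(model_path=path, n_ctx=4096, n_gpu_layers=-1)  # offload what fits on the GPU
out = llm("Explain the GGUF format in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
</code></pre></div></div>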

<details><summary>References</summary>
<ul>
<li><a href="https://ggufloader.github.io/what-is-gguf.html">What is GGUF ? Complete Guide to GGUF Format &amp; Quantization</a></li>
<li><a href="https://github.com/unslothai/unsloth">GitHub - unslothai/unsloth: Unsloth Studio is a web UI for...</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#quantization</code>, <code class="language-plaintext highlighter-rouge">#unsloth</code>, <code class="language-plaintext highlighter-rouge">#minimax</code>, <code class="language-plaintext highlighter-rouge">#huggingface</code></p>

<hr />

<p><a id="item-12"></a></p>
<h2 id="lazymoe-enables-120b-llms-on-8gb-ram-without-gpu-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sjoo9z/built_lazymoe_run_120b_llms_on_8gb_ram_with_no/">LazyMoE Enables 120B LLMs on 8GB RAM Without GPU</a> ⭐️ 8.0/10</h2>

<p>A developer has created LazyMoE, a system that combines lazy expert loading, TurboQuant KV compression, and SSD streaming to run 120B parameter Mixture-of-Experts models on hardware with only 8GB of RAM and no dedicated GPU. The prototype was demonstrated on a laptop equipped with an Intel UHD 620 graphics processor, showing that massive models can operate on consumer-grade devices through aggressive optimization. The project is now available as an open-source repository on GitHub for community testing and feedback. This approach significantly lowers the barrier to entry for running state-of-the-art large language models, allowing users with standard laptops to access capabilities previously restricted to high-end server clusters. By demonstrating that 120B parameter models can function on 8GB of RAM, it challenges the prevailing assumption that massive AI inference requires expensive hardware investments. This development could accelerate local AI adoption, enhance privacy by keeping data on-device, and inspire further optimizations in the open-source community. It represents a shift from hardware-centric scaling to software-centric efficiency in the deployment of Mixture-of-Experts architectures. The system relies on three core techniques: lazy loading, which only activates specific model experts when needed; TurboQuant, for extreme compression of the Key-Value cache; and direct streaming of model weights from the SSD to bypass RAM limitations. The demonstration was conducted on a machine with an Intel UHD 620 integrated GPU, highlighting that no discrete graphics card is required for operation. While this enables access to massive models, users should anticipate slower inference speeds compared to GPU-accelerated setups due to the reliance on disk I/O and CPU processing. The code is currently a community project rather than a formally peer-reviewed paper, so stability and performance may vary across different hardware configurations.</p>
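
<p>The lazy-loading idea can be illustrated in a few lines: keep only the experts the router actually selects in memory, evict the least recently used ones, and memory-map weights from the SSD. This is a conceptual sketch of the technique, not LazyMoE’s actual code; the file layout and names are hypothetical.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Conceptual sketch of lazy expert loading with an LRU cache and SSD-backed weights.
# Not LazyMoE's implementation; file layout and shapes are hypothetical.
from collections import OrderedDict
import numpy as np

class LazyExpertCache:
    def __init__(self, weight_dir, max_experts_in_ram=4):
        self.weight_dir = weight_dir
        self.max_experts = max_experts_in_ram
        self.cache = OrderedDict()  # maps expert_id to its weight matrix

    def _load_from_ssd(self, expert_id):
        # mmap_mode streams pages from disk instead of copying the whole file into RAM
        return np.load(f"{self.weight_dir}/expert_{expert_id}.npy", mmap_mode="r")

    def get(self, expert_id):
        if expert_id in self.cache:
            self.cache.move_to_end(expert_id)       # mark as recently used
        else:
            if len(self.cache) &gt;= self.max_experts:
                self.cache.popitem(last=False)      # evict the least recently used expert
            self.cache[expert_id] = self._load_from_ssd(expert_id)
        return self.cache[expert_id]

def moe_forward(x, router_logits, cache, top_k=2):
    """Apply only the top-k experts the router selects for this token."""
    chosen = np.argsort(router_logits)[-top_k:]
    gate = np.exp(router_logits[chosen])
    gate = gate / gate.sum()
    return sum(w * (cache.get(int(e)) @ x) for w, e in zip(gate, chosen))
</code></pre></div></div>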

<p>rss · r/LocalLLaMA · Apr 12, 19:53</p>

<p><strong>Background</strong>: Mixture-of-Experts (MoE) is an architecture where a large model consists of many smaller sub-networks called experts, with only a subset activated for each token, theoretically reducing computation while maintaining scale. However, storing the full parameters of a 120B MoE model typically requires hundreds of gigabytes of memory, far exceeding the capacity of standard consumer laptops. TurboQuant is a recently discussed compression method aimed at drastically reducing the size of the Key-Value cache used during inference without significant accuracy loss. Lazy loading is a programming pattern that delays the initialization of an object until it is actually needed, which in this context means loading only the active experts into RAM.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://research.google/blog/turboquant-redefining-ai-efficiency-with-extreme-compression/">TurboQuant : Redefining AI efficiency with extreme compression</a></li>
<li><a href="https://github.com/ggml-org/llama.cpp/discussions/20969">TurboQuant - Extreme KV Cache Quantization</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#moe</code>, <code class="language-plaintext highlighter-rouge">#quantization</code>, <code class="language-plaintext highlighter-rouge">#optimization</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-13"></a></p>
<h2 id="moss-tts-nano-a-01b-open-source-multilingual-tts-model-for-cpu-realtime-inference-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sjdfp6/mossttsnano_a_01b_opensource_multilingual_tts/">MOSS-TTS-Nano: A 0.1B Open-Source Multilingual TTS Model for CPU Realtime Inference</a> ⭐️ 8.0/10</h2>

<p>MOSI.AI and the OpenMOSS team have released MOSS-TTS-Nano, a compact 0.1 billion parameter text-to-speech model capable of real-time speech generation on standard 4-core CPUs without GPU acceleration. This open-source release supports streaming inference and long-text voice cloning across multiple languages including Chinese, English, Japanese, Korean, and Arabic. The project provides simple deployment tools via Python scripts and CLI commands to facilitate immediate local integration. This release significantly lowers the barrier for deploying high-quality TTS systems on edge devices, enabling applications in environments where GPU resources are unavailable or cost-prohibitive. By achieving real-time performance on consumer-grade hardware, it opens new possibilities for offline assistants, embedded systems, and privacy-focused local services. The multilingual capability further expands its utility for global products that require diverse language support without relying on cloud APIs. Compared to larger models that demand heavy computational power, MOSS-TTS-Nano demonstrates that efficient architecture can deliver practical utility for widespread adoption. The model features a tiny footprint of 0.1B parameters and is specifically optimized to run on CPUs with as few as four cores while maintaining low latency for streaming output. It includes built-in support for long-text voice cloning and offers straightforward installation through provided <code class="language-plaintext highlighter-rouge">infer.py</code> and <code class="language-plaintext highlighter-rouge">app.py</code> files. Users can access the code on GitHub, try demos on Hugging Face Spaces, or test the online demo hosted by the team. While highly efficient, users should evaluate audio quality against their specific needs as extreme compression may involve trade-offs compared to larger server-side models.</p>

<p>rss · r/LocalLLaMA · Apr 12, 12:38</p>

<p><strong>Background</strong>: Text-to-Speech (TTS) technology converts written text into spoken audio and has traditionally relied on large neural networks requiring powerful GPUs for real-time processing. Recent trends in Edge AI focus on shrinking model sizes to run locally on devices like smartphones, routers, or IoT hardware to reduce latency and protect user privacy. Streaming inference allows audio to be generated chunk-by-chunk rather than waiting for the entire sentence to process, which is crucial for interactive conversations. Multilingual support in a single small model is particularly challenging due to the need to learn distinct phonetic rules and prosody for various languages within a limited parameter budget.</p>
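
<p>Streaming inference, in this context, simply means that audio leaves the model chunk by chunk so playback can start before the whole sentence is synthesized. The sketch below illustrates that pattern with a producer thread and a playback loop; the <code class="language-plaintext highlighter-rouge">synthesize_chunk</code> and <code class="language-plaintext highlighter-rouge">play</code> callables are hypothetical stand-ins, since the project’s real entry points are its <code class="language-plaintext highlighter-rouge">infer.py</code> and <code class="language-plaintext highlighter-rouge">app.py</code> scripts.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Conceptual illustration of streaming TTS: synthesis and playback overlap.
# synthesize_chunk and play are hypothetical stand-ins, not MOSS-TTS-Nano's API.
import queue
import threading

def stream_tts(text, synthesize_chunk, chunk_chars=40):
    """Yield an audio chunk as soon as each slice of text has been synthesized."""
    for i in range(0, len(text), chunk_chars):
        yield synthesize_chunk(text[i:i + chunk_chars])

def play_while_generating(text, synthesize_chunk, play):
    buf = queue.Queue()

    def producer():
        for audio in stream_tts(text, synthesize_chunk):
            buf.put(audio)
        buf.put(None)                      # sentinel: synthesis finished

    threading.Thread(target=producer, daemon=True).start()
    while (audio := buf.get()) is not None:
        play(audio)                        # playback overlaps with synthesis of later chunks

# Toy stand-ins so the sketch runs end to end.
fake_synth = lambda s: f"[audio for {s!r}]"
play_while_generating("MOSS-TTS-Nano streams speech chunk by chunk on a 4-core CPU.", fake_synth, print)
</code></pre></div></div>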

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#tts</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#edge-ai</code>, <code class="language-plaintext highlighter-rouge">#multilingual</code>, <code class="language-plaintext highlighter-rouge">#model-release</code></p>

<hr />

<p><a id="item-14"></a></p>
<h2 id="chinas-first-bci-unicorn-develops-superhuman-bionic-hands-for-robots-️-7010"><a href="https://www.qbitai.com/2026/04/399681.html">China’s First BCI Unicorn Develops Superhuman Bionic Hands for Robots</a> ⭐️ 7.0/10</h2>

<p>China’s first brain-computer interface (BCI) unicorn company has announced a breakthrough in developing bionic hands designed specifically for robotic applications. These new devices reportedly surpass human hand capabilities in terms of dexterity and control precision, marking a significant step forward in embodied AI. The company aims to integrate these advanced manipulators directly with robotic systems to enable complex task execution. This development is significant because it bridges the gap between high-level AI decision-making and physical interaction, allowing robots to perform delicate tasks previously impossible for machines. By exceeding human biological limits, these bionic hands could revolutionize industries ranging from manufacturing to healthcare and elder care. It also highlights China’s growing dominance in the global race for advanced robotics and neural integration technologies. Furthermore, this progress suggests a future where robots can operate with a level of finesse that rivals or exceeds human workers in specific domains. The company is identified as China’s first unicorn in the brain-computer interface sector, indicating a valuation over $1 billion and significant market validation. While specific technical specifications like degrees of freedom or sensor types are not detailed in the summary, the core claim focuses on performance metrics exceeding human biological standards. The technology targets the embodiment of AI, suggesting tight integration between control algorithms and mechanical hardware.</p>

<p>rss · 量子位 · Apr 12, 06:06</p>

<p><strong>Background</strong>: Bionics involves applying biological methods and systems found in nature to the design of engineering systems, often to replicate or enhance human functions. Dexterous robotic hands are critical components in advanced robotics, traditionally limited by the complexity of controlling multiple degrees of freedom simultaneously. Recent advancements in brain-computer interfaces allow for more intuitive control signals, potentially translating neural intent directly into mechanical action. Historically, robotic hands have struggled to match the adaptability and sensitivity of the human hand, making this claimed superiority a notable milestone.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Bionics">Bionics - Wikipedia</a></li>
<li><a href="https://shadowrobot.com/dexterous-hand-series/">Shadow Dexterous Hand Series - Research and Development Tool</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#robotics</code>, <code class="language-plaintext highlighter-rouge">#brain-computer-interface</code>, <code class="language-plaintext highlighter-rouge">#bionics</code>, <code class="language-plaintext highlighter-rouge">#ai-hardware</code>, <code class="language-plaintext highlighter-rouge">#china-tech</code></p>

<hr />

<p><a id="item-15"></a></p>
<h2 id="gary-marcus-critiques-leaked-claude-code-as-symbolic-ai-️-7010"><a href="https://old.reddit.com/r/MachineLearning/comments/1sjb0qi/gary_marcus_on_the_claude_code_leak_d/">Gary Marcus Critiques Leaked Claude Code as Symbolic AI</a> ⭐️ 7.0/10</h2>

<p>Gary Marcus analyzed leaked code attributed to Anthropic’s Claude, claiming its kernel relies on classical symbolic AI structures rather than pure neural networks. He specifically identified a deterministic loop containing 486 branch points and 12 levels of nested IF-THEN conditionals as evidence of this architecture. This observation has sparked immediate debate regarding whether the system represents a hybrid model or merely complex, hard-coded logic. This critique challenges the prevailing narrative that modern Large Language Models operate solely through statistical pattern matching without explicit rules. If Marcus is correct, it suggests that top-tier AI systems may rely heavily on hybrid architectures combining neural networks with traditional symbolic logic to achieve reliability. Conversely, if the code is simply messy engineering, it raises concerns about the maintainability and scalability of current AI deployments. The discussion fundamentally impacts how researchers understand the transition from academic deep learning to robust industrial applications. Marcus highlights specific metrics of 486 branch points and 12 levels of nesting within a deterministic symbolic loop to support his argument. Critics in the thread counter that such deep nesting often indicates ‘spaghetti code’ or accumulated special cases rather than a deliberate classical AI design. The distinction is crucial because intentional symbolic structures imply a designed hybrid system, whereas excessive nesting might just reflect technical debt.</p>

<p>rss · r/MachineLearning · Apr 12, 10:34</p>

<p><strong>Background</strong>: Symbolic AI, championed by early pioneers like John McCarthy and Marvin Minsky, relies on explicit rules and logic trees to process information, contrasting with modern connectionist approaches that learn patterns from data. Nested conditionals are programming constructs where decision statements are placed inside other decision statements, which can become difficult to manage as complexity grows. Gary Marcus has long been a vocal proponent of integrating symbolic reasoning with neural networks to overcome the limitations of purely statistical models. The term ‘classical AI’ refers to these pre-deep-learning methodologies that dominated the field before the rise of large-scale neural networks.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://www.in-com.com/blog/untangling-deeply-nested-conditionals-through-structured-refactoring-strategies/">Untangling Deeply Nested Conditionals ... - IN-COM DATA SYSTEMS</a></li>
<li><a href="https://slyacademy.com/ap-computer-science-principles/unit-3-algorithms-and-programming/3-7-nested-conditionals-everything-you-need-to-know/24/17/38/">“3.7: Nested Conditionals ” Everything You Need To... - Sly Academy</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The community discussion reflects skepticism toward Marcus’s characterization, with many users arguing that high numbers of branch points and deep nesting are signs of poor code quality (‘a giant ball of mud’) rather than sophisticated symbolic AI. Some participants suggest that while hybrid approaches are valid, labeling messy conditional logic as a feature of classical AI misrepresents both modern engineering challenges and historical AI principles.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#gary-marcus</code>, <code class="language-plaintext highlighter-rouge">#anthropic</code>, <code class="language-plaintext highlighter-rouge">#symbolic-ai</code>, <code class="language-plaintext highlighter-rouge">#code-analysis</code>, <code class="language-plaintext highlighter-rouge">#llm-architecture</code></p>

<hr />

<p><a id="item-16"></a></p>
<h2 id="data-analysis-reveals-sharp-drop-in-iclr-2026-reviewer-agreement-️-7010"><a href="https://old.reddit.com/r/MachineLearning/comments/1sj76a2/just_did_an_analysis_on_iclr_2025_vs_2026_scores/">Data Analysis Reveals Sharp Drop in ICLR 2026 Reviewer Agreement</a> ⭐️ 7.0/10</h2>

<p>A recent data analysis comparing ICLR 2025 and 2026 submissions reveals a drastic decline in inter-reviewer correlation scores, dropping from approximately 0.41 in 2025 to significantly lower levels in 2026. The study, based on data fetched from OpenReview, measured agreement with one-vs-rest and half-half split correlation metrics, and separately found that the standard deviation of scores within individual papers increased from 1.186 to 1.523. This indicates that human reviewers for the upcoming conference are agreeing with each other far less often than in the previous year. This finding is significant because it suggests the peer review process for top-tier AI research is becoming increasingly random, effectively turning paper acceptance into a lottery. Low inter-reviewer correlation implies that the quality assessment of scientific work is highly subjective, potentially causing groundbreaking research to be rejected while weaker papers are accepted based on reviewer luck. If this trend continues, it could undermine the credibility of major conferences like ICLR and force the community to reconsider current evaluation mechanisms. The shift highlights a growing reliability crisis in peer review, where the signal of research quality is being drowned out by noise in the review system. The analysis specifically notes that while the average score standard deviation decreased slightly from 1.253 in 2025 to 1.162 in 2026, the mean within-paper human standard deviation surged from 1.186 to 1.523. The author used two distinct metrics, one-vs-rest correlation and half-half split correlation, to validate these findings across data sourced directly from the OpenReview platform. These statistics suggest that although the overall spread of scores might be tighter, the disagreement between specific reviewers assigned to the same paper has worsened considerably.</p>
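
<p>The two agreement metrics are straightforward to compute once per-paper reviewer scores have been pulled from OpenReview. The sketch below shows one plausible implementation over toy data; the original author’s exact procedure may differ in details such as tie handling and averaging over repeated random splits.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Sketch of the agreement metrics described in the post, over toy per-paper score lists.
# In practice the scores would be fetched from the OpenReview API.
import random
import numpy as np

papers = [  # each inner list holds all reviewer scores for one paper
    [6, 8, 3], [5, 5, 6, 4], [8, 3, 5], [6, 6, 8, 5],
]

within_paper_std = np.mean([np.std(s, ddof=1) for s in papers])

def one_vs_rest_corr(papers):
    """Correlate one random reviewer's score with the mean of the remaining reviewers."""
    one, rest = [], []
    for scores in papers:
        i = random.randrange(len(scores))
        one.append(scores[i])
        rest.append(np.mean(scores[:i] + scores[i + 1:]))
    return np.corrcoef(one, rest)[0, 1]

def half_half_corr(papers):
    """Split each paper's reviewers in half at random and correlate the half means."""
    a, b = [], []
    for scores in papers:
        s = list(scores)
        random.shuffle(s)
        half = len(s) // 2
        a.append(np.mean(s[:half]))
        b.append(np.mean(s[half:]))
    return np.corrcoef(a, b)[0, 1]

print(within_paper_std, one_vs_rest_corr(papers), half_half_corr(papers))
</code></pre></div></div>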

<p>rss · r/MachineLearning · Apr 12, 06:51</p>

<p><strong>Background</strong>: ICLR (International Conference on Learning Representations) is a premier annual conference for machine learning and deep learning research, known for its rigorous peer review process managed via the OpenReview platform. OpenReview is a non-profit initiative designed to promote transparency in scientific communication by making reviews and discussions publicly visible. Inter-reviewer correlation is a key metric used to measure the reliability of this process, indicating how consistently different experts evaluate the same piece of work. Historically, a correlation around 0.4 has been considered typical but imperfect for top computer science venues, reflecting the inherent difficulty in assessing novel research.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://openreview.net/group?id=ICLR.cc/2026/Conference">ICLR 2026 Conference | OpenReview</a></li>
<li><a href="https://openreview.net/about">About OpenReview</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#iclr</code>, <code class="language-plaintext highlighter-rouge">#peer-review</code>, <code class="language-plaintext highlighter-rouge">#machine-learning-research</code>, <code class="language-plaintext highlighter-rouge">#academic-integrity</code>, <code class="language-plaintext highlighter-rouge">#data-analysis</code></p>

<hr />

<p><a id="item-17"></a></p>
<h2 id="minimax-m27-released-with-restrictive-non-commercial-license-️-7010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sj2oqz/minimax_m27_is_not_open_source_doa_license/">MiniMax M2.7 Released with Restrictive Non-Commercial License</a> ⭐️ 7.0/10</h2>

<p>The MiniMax M2.7 model has been released with publicly available weights, but its accompanying license explicitly bans all commercial use without prior written permission. The restrictions broadly cover paid services, commercial APIs, and even deploying fine-tuned versions for profit, while also prohibiting any military applications. This confirms that despite the open weights, the model does not qualify as open source under standard definitions. This development highlights a growing trend in the AI industry where companies release ‘open weights’ models while retaining strict control over usage through restrictive licenses. It significantly impacts developers and businesses who might assume open weights imply freedom to integrate the model into commercial products or services. The distinction forces the community to re-evaluate what constitutes truly open software versus merely accessible proprietary technology. Ultimately, this limits the model’s adoption in enterprise environments and stifles potential innovation built upon it. The license requires explicit written permission from MiniMax for any commercial activity, including the generation of outputs used for profit. It specifically prohibits military use, a clause that is becoming increasingly common in modern AI licensing agreements. Users must be aware that fine-tuning the model does not bypass these restrictions, as the derivative works remain bound by the original terms. Consequently, the model is suitable only for research, personal experimentation, or non-profit educational purposes.</p>

<p>rss · r/LocalLLaMA · Apr 12, 02:55</p>

<p><strong>Background</strong>: In the artificial intelligence sector, a distinction exists between ‘open weights,’ where the model parameters are public, and ‘open source,’ which requires both open weights and a license granting freedoms to use, study, modify, and distribute the software. The Open Source Initiative (OSI) defines specific criteria for open source licenses, many of which are violated by bans on commercial use or specific fields of endeavor. Recently, several major AI labs have adopted a hybrid approach, releasing weights to foster community research while protecting their commercial interests through custom licenses. This practice has sparked debate about whether such models should be labeled as open source at all.</p>

<p><strong>Discussion</strong>: Community sentiment is largely negative, with users expressing frustration over the misleading nature of ‘open weights’ releases that carry heavy commercial restrictions. Many commenters argue that labeling such models as open source is deceptive and harms the ecosystem by creating confusion about usage rights. There is a strong consensus that the term ‘open source’ should be reserved strictly for models complying with OSI-approved licenses.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#licensing</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#minimax</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#legal</code></p>

<hr />

<p><a id="item-18"></a></p>
<h2 id="repaired-qwen-35-35b-model-released-with-native-apple-mlx-support-️-7010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sje74g/fernflowerai35ba3bklrelugguf_apple_mlx/">Repaired Qwen 3.5 35B Model Released with Native Apple MLX Support</a> ⭐️ 7.0/10</h2>

<p>Community developer LuffyTheFox has released a repaired and calibrated version of the Qwen 3.5 35B A3B Uncensored model, fixing broken tensors originally shipped by Alibaba. This update introduces KL divergence and ReLU asymmetry checks to correct subtle weight distribution drifts, reducing average KL divergence by 71.3%. Additionally, a native Apple MLX version optimized for Mac hardware has been made available through collaboration with user froggeric. This release is significant because it restores full functionality to a high-performance open-source model that was previously unusable due to training bugs in specific layers. By enabling native Apple MLX support, the project drastically improves inference speed and efficiency on macOS devices, making powerful local AI accessible to Mac users without cloud dependency. The introduction of advanced diagnostic criteria like KL divergence sets a new standard for community-driven model repair and quality assurance. Ultimately, this ensures that complex reasoning tasks can be performed reliably on consumer hardware. The repair process identified and fixed 11 tensors in total, up from the initial 2, by addressing issues in expert networks and attention projections that earlier diagnostics missed. Performance metrics show the average KL divergence dropped from 0.1036 to 0.0297, indicating a much tighter and more stable weight distribution. The release includes GGUF quantized files for general use and specific Safetensors formats optimized for the Apple MLX framework. Users are provided with updated system prompts and chat templates to unlock the model’s deep thinking capabilities.</p>
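
<p>The KL divergence figure quoted above is the kind of number obtained by comparing the output distributions of a reference model and a repaired candidate over a calibration set. The sketch below shows one way to compute such an average in PyTorch, assuming HuggingFace-style models whose forward pass returns <code class="language-plaintext highlighter-rouge">.logits</code>; the repairer’s actual diagnostic pipeline is not published in the post.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Sketch: average KL(reference || candidate) over next-token distributions on a calibration set.
# Assumes HuggingFace-style models that return .logits; not the repairer's actual pipeline.
import torch
import torch.nn.functional as F

@torch.no_grad()
def mean_kl(reference, candidate, calib_batches):
    total, count = 0.0, 0
    for batch in calib_batches:                       # batch: LongTensor of token ids [B, T]
        p = F.log_softmax(reference(batch).logits, dim=-1)
        q = F.log_softmax(candidate(batch).logits, dim=-1)
        # kl_div(input=log q, target=log p, log_target=True) gives pointwise KL(p || q)
        kl = F.kl_div(q, p, log_target=True, reduction="none").sum(-1)   # per-position KL
        total += kl.sum().item()
        count += kl.numel()
    return total / count
</code></pre></div></div>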

<p>rss · r/LocalLLaMA · Apr 12, 13:12</p>

<p><strong>Background</strong>: Qwen 3.5 is a large language model developed by Alibaba Cloud, known for its strong reasoning capabilities, but recent releases suffered from ‘context collapse’ due to corrupted weights in the AdamW optimizer during training. GGUF is a binary file format optimized for fast loading and inference, widely used by the llama.cpp ecosystem for running models on consumer hardware. Apple MLX is a machine learning framework designed specifically for Apple Silicon chips, allowing efficient model execution directly on Mac CPUs and GPUs. Community members often step in to fix or fine-tune open-weight models when official releases contain technical flaws.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Llama.cpp">llama.cpp - Wikipedia</a></li>
<li><a href="https://medium.com/@charles.vissol/gguf-in-details-8a9953ac7883">GGUF in details. After Training phase, the models based | Medium</a></li>
<li><a href="https://huggingface.co/docs/hub/gguf">GGUF · Hugging Face</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#apple-mlx</code>, <code class="language-plaintext highlighter-rouge">#qwen</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#model-repair</code></p>

<hr />

<p><a id="item-19"></a></p>
<h2 id="top-ai-talent-accelerates-return-from-silicon-valley-to-china-️-7010"><a href="https://www.ft.com/content/b167c6d3-b982-482a-98c3-5303a7b80c6a">Top AI Talent Accelerates Return from Silicon Valley to China</a> ⭐️ 7.0/10</h2>

<p>Over the past year, a significant number of top AI researchers formerly employed by OpenAI and Google DeepMind have returned to China to join major tech firms like ByteDance, Tencent, and Alibaba. Headhunter data indicates that more than 30 US-based researchers were assisted in returning home in the last 12 months, a sharp increase from the single-digit figures of previous years. Concurrently, the proportion of Tsinghua University graduates pursuing PhDs in the US has dropped dramatically from 50% pre-pandemic to approximately 20%. This trend signals a potential shift in the global balance of AI research capabilities, as China leverages its vast application scenarios in robotics and autonomous driving to attract top-tier talent. The migration suggests that competitive compensation packages, adjusted for taxes and living costs, combined with supply chain advantages, are becoming more attractive than traditional Silicon Valley offerings. Furthermore, tightening US immigration policies and geopolitical tensions are creating uncertainty for Chinese engineers, accelerating the drain of expertise back to a market with higher cultural fit and perceived stability. Long-term, this could enhance China’s indigenous innovation capacity while challenging the US monopoly on cutting-edge AI development. The report highlights that after adjusting for taxes and cost of living, compensation offered by Chinese tech giants now surpasses standard Silicon Valley salaries. Specific sectors driving this return include robotics and autonomous driving, where China offers extensive real-world testing environments and a mature supply chain. The data specifically notes a reversal in academic migration, with the share of Tsinghua University students going to the US for doctoral studies falling from about half of graduates before the pandemic to roughly one-fifth today.</p>

<p>telegram · zaihuapd · Apr 12, 00:20</p>

<p><strong>Background</strong>: For decades, the United States, particularly Silicon Valley, has been the primary destination for elite computer science graduates from China, fostering a brain drain that fueled American tech dominance. Companies like OpenAI and Google DeepMind have historically relied on this international talent pool to lead advancements in large language models and reinforcement learning. However, recent geopolitical friction and visa restrictions have complicated the ability of Chinese nationals to work and remain in the US long-term. This context makes the current reversal, where established researchers choose to leave US labs for Chinese firms, a notable deviation from historical norms.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-talent</code>, <code class="language-plaintext highlighter-rouge">#industry-dynamics</code>, <code class="language-plaintext highlighter-rouge">#geopolitics</code>, <code class="language-plaintext highlighter-rouge">#china-tech</code>, <code class="language-plaintext highlighter-rouge">#research-migration</code></p>

<hr />

<p><a id="item-20"></a></p>
<h2 id="durov-claims-95-of-whatsapp-backups-are-stored-unencrypted-️-7010"><a href="https://t.me/zaihuapd/40826">Durov Claims 95% of WhatsApp Backups Are Stored Unencrypted</a> ⭐️ 7.0/10</h2>

<p>Telegram founder Pavel Durov has challenged WhatsApp’s end-to-end encryption claims, revealing that approximately 95% of message backups are stored in plaintext on Apple and Google cloud servers because the encryption feature is not enabled by default. He further noted that even if one user enables encrypted backups, chats remain unencrypted if the other party has not done the same. This disclosure highlights a significant gap between WhatsApp’s marketing of default security and the actual configuration required to protect backed-up data. This issue is critical because it exposes a vast amount of private user data to potential access by cloud providers and government authorities, contradicting the perception of absolute privacy often associated with WhatsApp. For industries relying on secure communication for sensitive data, this distinction between chat-in-transit encryption and backup storage is a major vulnerability that could compromise compliance and trust. Furthermore, it forces a re-evaluation of how ‘default’ security is defined in major messaging platforms, pushing users to manually configure settings they might assume are already active. Ultimately, this affects billions of users who may believe their entire conversation history is secure when only the live transmission is protected. To achieve true end-to-end encryption for backups, users must manually navigate to Settings &gt; Chats &gt; Chat Backup and explicitly enable the ‘End-to-end encrypted backup’ option by creating a passkey or password. The risk is compounded by the fact that metadata regarding social connections is still recorded and disclosed by WhatsApp, regardless of backup encryption status. Reports indicate that Apple and Google disclose thousands of these unencrypted WhatsApp backups to third parties annually, whereas Telegram claims zero such disclosures in its 12-year history.</p>

<p>telegram · zaihuapd · Apr 12, 16:07</p>

<p><strong>Background</strong>: End-to-end encryption (E2EE) ensures that only the communicating users can read the messages, preventing intermediaries like service providers from accessing the content. While WhatsApp has implemented E2EE for messages in transit since 2016, cloud backups stored on services like iCloud or Google Drive were historically not encrypted by default, leaving them accessible to the cloud provider. In contrast, Telegram offers ‘Secret Chats’ with E2EE but stores standard cloud chats on its servers with different encryption protocols, a distinction often debated in the security community. Understanding the difference between transport encryption and storage encryption is essential for evaluating the true privacy guarantees of any messaging app.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://faq.whatsapp.com/490592613091019">About end-to-end encrypted backup | WhatsApp Help Center</a></li>
<li><a href="https://www.reddit.com/r/netsec/comments/w2rba2/the_workings_of_whatsapps_backups_and_why_you/">The Workings of Whatsapp's Backups (and why you should enable End-to ...</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cybersecurity</code>, <code class="language-plaintext highlighter-rouge">#data-privacy</code>, <code class="language-plaintext highlighter-rouge">#encryption</code>, <code class="language-plaintext highlighter-rouge">#messaging-platforms</code>, <code class="language-plaintext highlighter-rouge">#cloud-storage</code></p>

<hr />

<h2 id="github-热榜-1">GitHub 热榜</h2>

<p><a id="item-21"></a></p>
<h2 id="karpathy-releases-minimal-llm-training-in-pure-c-and-cuda-️-10010"><a href="https://github.com/karpathy/llm.c">Karpathy Releases Minimal LLM Training in Pure C and CUDA</a> ⭐️ 10.0/10</h2>

<p>Andrej Karpathy has released llm.c, a dependency-free implementation of large language model training written entirely in raw C and CUDA. This project strips away high-level frameworks like PyTorch to expose the fundamental mechanics of transformer architectures and GPU optimization. It serves as a direct educational tool for understanding the low-level infrastructure powering modern AI. This project matters because it demystifies the ‘black box’ of deep learning frameworks by revealing every line of code responsible for model training. For AI engineers, it provides an unparalleled opportunity to learn how memory management, kernel fusion, and backpropagation are handled at the hardware level without abstraction layers. It bridges the gap between theoretical knowledge of neural networks and practical systems programming skills required for high-performance inference engines. The repository implements a GPT-2 style transformer from scratch, including data loading, tokenization, and the full training loop using only standard C and NVIDIA’s CUDA API. It achieves competitive training speeds on single GPUs while maintaining extreme code readability and minimalism. The project explicitly targets educational use cases rather than production deployment or rapid prototyping.</p>

<p>rss · GitHub Trending - CUDA · Apr 12, 01:33</p>

<p><strong>Background</strong>: Prior to this release, understanding LLM internals typically required navigating complex codebases of frameworks like PyTorch or TensorFlow, which hide low-level details behind abstractions. Existing minimal examples often lacked full training capabilities or relied on interpreted languages that obscured performance-critical operations. llm.c fills this niche by providing a complete, performant, and transparent reference implementation in systems programming languages.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Large_language_model">Large language model - Wikipedia</a></li>
<li><a href="https://www.geeksforgeeks.org/artificial-intelligence/large-language-model-llm/">What is a Large Language Model ( LLM ) - GeeksforGeeks</a></li>
<li><a href="https://www.ibm.com/think/topics/large-language-models">What Are Large Language Models (LLMs)? | IBM</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The AI community has responded with enthusiasm, viewing this project as an essential resource for students and researchers aiming to master low-level deep learning optimization. Many developers are already using the codebase to experiment with custom kernel modifications and to teach graduate-level systems courses.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#c</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#education</code></p>

<hr />

<p><a id="item-22"></a></p>
<h2 id="sageattention-accelerates-inference-via-quantization-️-10010"><a href="https://github.com/thu-ml/SageAttention">SageAttention Accelerates Inference via Quantization</a> ⭐️ 10.0/10</h2>

<p>SageAttention introduces a novel quantized attention mechanism that achieves 2-5x speedups over FlashAttention across language, image, and video models. This optimization maintains end-to-end quality metrics while significantly reducing computational latency during inference. As large models grow in complexity, memory bandwidth and compute efficiency have become critical bottlenecks for real-time deployment. SageAttention addresses this by leveraging quantization to reduce memory access and compute costs without the accuracy degradation often seen in previous methods. This makes it an attractive infrastructure upgrade for production environments requiring high-throughput LLM serving. The project delivers consistent 2-5x acceleration compared to FlashAttention while preserving model accuracy across diverse modalities. It is designed as a drop-in replacement for existing attention implementations in deep learning frameworks.</p>
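
<p>In practice, the drop-in usage amounts to swapping the attention call. The sketch below follows the <code class="language-plaintext highlighter-rouge">sageattn</code> entry point described in the repository’s README; the exact signature should be verified against the installed release.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Sketch: replacing PyTorch's scaled_dot_product_attention with SageAttention's quantized kernel.
# The sageattn signature follows the repository README; verify against the installed version.
import torch
import torch.nn.functional as F
from sageattention import sageattn

q = torch.randn(1, 16, 2048, 128, dtype=torch.float16, device="cuda")  # [batch, heads, seq, dim]
k = torch.randn_like(q)
v = torch.randn_like(q)

baseline = F.scaled_dot_product_attention(q, k, v, is_causal=True)
fast = sageattn(q, k, v, tensor_layout="HND", is_causal=True)           # quantized attention

print((baseline - fast).abs().max())  # quantization error should remain small
</code></pre></div></div>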

<p>rss · GitHub Trending - CUDA · Apr 12, 01:33</p>

<p><strong>Background</strong>: Prior solutions like FlashAttention optimized memory access patterns but did not fully exploit low-precision arithmetic opportunities. SageAttention fills this niche by combining tiled memory access with aggressive quantization strategies tailored for modern GPU architectures. This approach allows it to surpass the speed limits of standard floating-point attention mechanisms.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://www.zhihu.com/question/611236756">FlashAttention 的速度优化原理是怎样的？ - 知乎</a></li>
<li><a href="https://www.zhihu.com/question/2013241832251875907">FlashAttention-4 发布，算法流水线大改，速度达矩阵乘法级，对大模型...</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The AI engineering community is actively evaluating SageAttention as a potential successor to FlashAttention for next-generation inference stacks.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#optimization</code>, <code class="language-plaintext highlighter-rouge">#quantization</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code></p>

<hr />

<p><a id="item-23"></a></p>
<h2 id="instant-ngp-lightning-fast-neural-graphics-training-️-10010"><a href="https://github.com/NVlabs/instant-ngp">Instant-NGP: Lightning-Fast Neural Graphics Training</a> ⭐️ 10.0/10</h2>

<p>NVIDIA’s Instant-NGP introduces a high-performance framework that trains neural graphics primitives, such as NeRFs, in seconds rather than hours. It achieves this breakthrough by utilizing optimized CUDA kernels and multi-resolution hash encodings to drastically accelerate convergence. This release marks a shift from experimental research code to a production-ready tool for real-time 3D reconstruction. This framework solves the critical bottleneck of slow training times that previously hindered the practical adoption of Neural Radiance Fields. By reducing training to seconds, it enables interactive workflows for 3D content creation, robotics simulation, and virtual reality applications. The efficiency gains make high-fidelity novel view synthesis accessible on consumer-grade GPUs, democratizing advanced 3D AI research. Consequently, it serves as essential infrastructure for next-generation computer vision and graphics pipelines. The core innovation lies in its use of learnable multi-resolution hash encodings combined with a small MLP, allowing for extremely fast memory access and computation. It supports various tasks beyond NeRFs, including neural volume rendering and signed distance function training. The codebase is highly optimized for NVIDIA GPUs, leveraging specific hardware features to maximize throughput.</p>
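
<p>The core trick is the encoding itself: each 3D point indexes small learnable feature tables at several grid resolutions, and the concatenated features feed a tiny MLP. The sketch below is a deliberately simplified PyTorch version (nearest-vertex lookup, no trilinear interpolation, no fused CUDA kernels) meant only to convey the structure.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Simplified sketch of a multi-resolution hash encoding, the core idea behind Instant-NGP.
# Uses nearest-vertex lookup instead of the paper's trilinear interpolation; illustrative only.
import torch
import torch.nn as nn

class HashEncoding(nn.Module):
    PRIMES = (1, 2654435761, 805459861)  # spatial hashing primes from the paper

    def __init__(self, levels=8, table_size=2**14, features=2, base_res=16, growth=1.5):
        super().__init__()
        self.resolutions = [int(base_res * growth**level) for level in range(levels)]
        self.tables = nn.Parameter(torch.randn(levels, table_size, features) * 1e-4)
        self.table_size = table_size

    def forward(self, xyz):                      # xyz in the unit cube, shape [N, 3]
        feats = []
        for level, res in enumerate(self.resolutions):
            vertex = (xyz * res).floor().long()  # nearest grid vertex at this resolution
            h = torch.zeros(xyz.shape[0], dtype=torch.long, device=xyz.device)
            for d, prime in enumerate(self.PRIMES):
                h ^= vertex[:, d] * prime
            feats.append(self.tables[level, h % self.table_size])
        return torch.cat(feats, dim=-1)          # fed into a small MLP in the full method

enc = HashEncoding()
mlp = nn.Sequential(nn.Linear(8 * 2, 64), nn.ReLU(), nn.Linear(64, 4))
out = mlp(enc(torch.rand(1024, 3)))              # e.g. density plus colour per sample point
</code></pre></div></div>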

<p>rss · GitHub Trending - CUDA · Apr 12, 01:33</p>

<p><strong>Background</strong>: Prior to Instant-NGP, training NeRF models typically required powerful cloud GPUs and many hours or even days to converge on a single scene. Existing solutions often struggled with high memory consumption and slow inference speeds, limiting their use to offline rendering scenarios. NVIDIA addressed these limitations by rethinking the input representation and kernel optimization strategies. This project fills the niche for real-time, high-quality 3D reconstruction tools needed in modern graphics pipelines.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.m.wikipedia.org/wiki/Neural_Network">Neural network - Wikipedia</a></li>
<li><a href="https://hai.stanford.edu/ai-definitions/what-is-a-neural-network">What is a Neural Network? - Stanford HAI</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The AI and graphics communities have widely adopted Instant-NGP as the de facto standard for rapid NeRF prototyping and deployment. Developers frequently integrate its hash encoding logic into custom projects to accelerate other neural implicit representation tasks.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#nerf</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#3d-generation</code>, <code class="language-plaintext highlighter-rouge">#computer-vision</code>, <code class="language-plaintext highlighter-rouge">#gpu-acceleration</code></p>

<hr />

<p><a id="item-24"></a></p>
<h2 id="nous-research-launches-self-improving-hermes-agent-framework-️-9010"><a href="https://github.com/NousResearch/hermes-agent">Nous Research Launches Self-Improving Hermes Agent Framework</a> ⭐️ 9.0/10</h2>

<p>Nous Research has released Hermes Agent, a novel AI framework featuring a built-in learning loop that allows the agent to create skills from experience and persist knowledge across sessions. Unlike static agents, it autonomously improves its capabilities through user interaction and supports deployment on diverse infrastructures ranging from local terminals to serverless cloud environments. This project addresses the critical limitation of current AI agents that forget context and fail to improve over time without manual retraining. By implementing a closed learning loop with autonomous skill creation and dialectic user modeling, it enables truly persistent and evolving personal assistants. Its architecture supports cost-effective scaling via serverless backends like Modal and Daytona, making advanced agent workflows accessible without expensive GPU clusters. This represents a significant step toward agentic systems that genuinely adapt to individual user needs. Hermes Agent features a real terminal interface with multiline editing and supports integration with Telegram, Discord, and Slack through a single gateway. It utilizes a flexible model routing system compatible with OpenRouter, Nous Portal, and various proprietary endpoints, allowing users to switch models without code changes. The framework includes a built-in cron scheduler for unattended automations and supports spawning isolated subagents for parallel task execution.</p>

<p>rss · GitHub Trending - Daily · Apr 12, 01:32</p>

<p><strong>Background</strong>: Most existing AI agent frameworks operate as stateless wrappers around LLMs, requiring external vector databases or complex orchestration tools to maintain memory. Hermes Agent differentiates itself by embedding memory management and self-improvement mechanisms directly into the core architecture. This approach reduces the engineering overhead required to build persistent agents and provides a standardized interface for skill evolution.</p>

<p><strong>Discussion</strong>: Early adopters are praising the framework’s ability to run efficiently on low-cost VPS instances while maintaining sophisticated memory retention. Developers are particularly interested in the ‘Honcho’ dialectic user modeling feature for creating deeply personalized agent interactions.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#self-improving-ai</code>, <code class="language-plaintext highlighter-rouge">#nous-research</code>, <code class="language-plaintext highlighter-rouge">#machine-learning</code></p>

<hr />

<p><a id="item-25"></a></p>
<h2 id="voxcpm2-tokenizer-free-multilingual-tts-with-voice-design-️-9010"><a href="https://github.com/OpenBMB/VoxCPM">VoxCPM2: Tokenizer-Free Multilingual TTS with Voice Design</a> ⭐️ 9.0/10</h2>

<p>VoxCPM2 introduces a tokenizer-free architecture that directly generates continuous speech representations using a diffusion autoregressive approach. This 2B parameter model supports 30 languages and offers novel features like text-based voice design and controllable voice cloning without needing reference audio for creation. By eliminating discrete tokenization, VoxCPM2 achieves higher fidelity and more natural prosody compared to traditional TTS systems that often suffer from robotic artifacts. The ability to design voices via natural language descriptions significantly lowers the barrier for creative audio production and accessibility applications. Its support for 48kHz studio-quality output makes it viable for professional media workflows rather than just experimental demos. The model is built on a MiniCPM-4 backbone and trained on over 2 million hours of multilingual speech data. Key capabilities include ultimate cloning with transcript alignment, style-guided emotion control, and direct synthesis in 30 languages without language tags.</p>

<p>rss · GitHub Trending - Daily · Apr 12, 01:32</p>

<p><strong>Background</strong>: Traditional Text-to-Speech systems typically rely on discrete tokenizers that convert text and audio into intermediate codes, often resulting in information loss and limited expressiveness. VoxCPM2 fills the niche for high-fidelity, end-to-end generative audio by bypassing this bottleneck entirely. It represents a shift towards continuous representation learning in speech synthesis, similar to advancements seen in large language models but applied directly to raw audio waveforms.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/OpenBMB/VoxCPM/">VoxCPM2 : Tokenizer-Free TTS for Multilingual Speech Generation...</a></li>
<li><a href="https://huggingface.co/openbmb/VoxCPM2">openbmb/ VoxCPM2 · Hugging Face</a></li>
<li><a href="https://www.modelscope.cn/models/OpenBMB/VoxCPM2">VoxCPM2 · Models</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project has gained traction with live demos on Hugging Face and active community channels on Discord and Feishu for technical support. Developers are particularly interested in the production-ready assets and the potential for integrating voice design into interactive applications.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#text-to-speech</code>, <code class="language-plaintext highlighter-rouge">#voice-cloning</code>, <code class="language-plaintext highlighter-rouge">#multilingual-ai</code>, <code class="language-plaintext highlighter-rouge">#generative-audio</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code></p>

<hr />

<p><a id="item-26"></a></p>
<h2 id="google-releases-efficient-smaller-bert-models-for-resource-constrained-environments-️-9010"><a href="https://github.com/google-research/bert">Google Releases Efficient Smaller BERT Models for Resource-Constrained Environments</a> ⭐️ 9.0/10</h2>

<p>Google Research has released 24 smaller, English-only uncased BERT models ranging from BERT-Tiny to BERT-Medium. These variants are specifically designed to operate effectively in environments with restricted computational resources while maintaining the standard BERT training recipe. This release addresses the critical need for deploying powerful NLP models on edge devices or in low-resource institutional settings without sacrificing the bidirectional representation capabilities of the original architecture. By providing pre-trained weights for compact models, Google enables research and production use cases where memory and latency are primary constraints. Furthermore, these models are optimized for knowledge distillation workflows, allowing them to learn efficiently from larger teacher models. This shift encourages the community to innovate through model efficiency rather than solely increasing model capacity. The new models vary in layers (L=2 to 8) and hidden sizes (H=128 to 768), including specific configurations like BERT-Tiny (2/128) and BERT-Mini (4/256). They utilize WordPiece masking and can be fine-tuned using the same methods as the original BERT-Base and BERT-Large models. All 24 models are available for download via TensorFlow, facilitating immediate integration into existing pipelines.</p>

<p>rss · GitHub Trending - Python · Apr 12, 01:37</p>

<p><strong>Background</strong>: BERT (Bidirectional Encoder Representations from Transformers) revolutionized NLP in 2018 by introducing deep bidirectional pre-training using the encoder-only transformer architecture. While the original BERT-Base and BERT-Large models set new benchmarks, their high computational cost limited deployment in resource-constrained scenarios. Prior solutions often required complex pruning or quantization post-training to achieve similar efficiency. This project fills the niche by providing natively small, pre-trained architectures that serve as a foundational reference for efficient transformer research.</p>
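
<p>A typical use of these checkpoints is as students in a distillation setup, where the compact model is trained to match a larger teacher’s softened label distribution in addition to the hard labels. The sketch below shows that loss; the model ids follow the naming scheme described for the release and should be verified on the Hugging Face Hub, and a real teacher would be fine-tuned on the task first.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Sketch of a distillation step with a tiny BERT student and a BERT-Base teacher.
# Model ids follow the release's naming scheme; verify them on the Hugging Face Hub.
import torch
import torch.nn.functional as F
from transformers import AutoModelForSequenceClassification, AutoTokenizer

teacher = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
student = AutoModelForSequenceClassification.from_pretrained(
    "google/bert_uncased_L-2_H-128_A-2", num_labels=2)    # BERT-Tiny-sized checkpoint
tok = AutoTokenizer.from_pretrained("bert-base-uncased")  # both share the uncased WordPiece vocab

batch = tok(["a great movie", "a dull movie"], return_tensors="pt", padding=True)
labels = torch.tensor([1, 0])
T = 2.0  # softening temperature

with torch.no_grad():                                     # a real teacher would be fine-tuned first
    teacher_logits = teacher(**batch).logits
student_logits = student(**batch).logits

soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                F.softmax(teacher_logits / T, dim=-1),
                reduction="batchmean") * T * T
hard = F.cross_entropy(student_logits, labels)
loss = 0.5 * soft + 0.5 * hard
loss.backward()
</code></pre></div></div>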

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/BERT_(language_model)">BERT (language model ) - Wikipedia</a></li>
<li><a href="https://arxiv.org/abs/1810.04805">[1810.04805] BERT : Pre-training of Deep Bidirectional ...</a></li>
<li><a href="https://www.geeksforgeeks.org/nlp/explanation-of-bert-model-nlp/">BERT Model - NLP - GeeksforGeeks</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The AI engineering community widely regards this repository as the definitive source for BERT implementations, particularly valuing the new small models for edge AI applications. Developers frequently cite these weights as the starting point for knowledge distillation experiments where a large teacher model guides a compact student.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#nlp</code>, <code class="language-plaintext highlighter-rouge">#transformers</code>, <code class="language-plaintext highlighter-rouge">#tensorflow</code>, <code class="language-plaintext highlighter-rouge">#pretrained-models</code>, <code class="language-plaintext highlighter-rouge">#google-research</code></p>

<hr />

<p><a id="item-27"></a></p>
<h2 id="deepgemm-delivers-optimized-fp8-kernels-for-nvidia-gpus-️-9010"><a href="https://github.com/deepseek-ai/DeepGEMM">DeepGEMM Delivers Optimized FP8 Kernels for NVIDIA GPUs</a> ⭐️ 9.0/10</h2>

<p>DeepSeek AI has released DeepGEMM, a library featuring clean and efficient FP8 general matrix multiplication (GEMM) kernels. This release specifically introduces fine-grained scaling capabilities optimized for modern deep learning workloads on NVIDIA hardware. As large language models grow, FP8 precision has become critical for reducing memory bandwidth bottlenecks during training and inference. DeepGEMM addresses the lack of production-grade, fine-grained FP8 kernels that are essential for maximizing NVIDIA GPU utilization. By offering optimized performance over standard libraries, it enables faster iteration cycles for AI engineers working on massive models. This directly impacts the cost and speed of deploying next-generation generative AI systems. The library focuses on high-performance computing with specific optimizations for NVIDIA architectures using CUDA. It implements fine-grained scaling to maintain accuracy while leveraging the speed benefits of FP8 data types. The codebase is designed to be clean and accessible for integration into existing deep learning pipelines.</p>
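
<p>“Fine-grained scaling” here means that each small block of a matrix carries its own scale factor, so a single outlier cannot blow up the quantization error for an entire row. The sketch below mimics that idea in plain PyTorch as a slow reference (it needs a build that ships the <code class="language-plaintext highlighter-rouge">float8_e4m3fn</code> dtype, roughly PyTorch 2.1 or newer); DeepGEMM’s fused CUDA kernels and actual API are not shown.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Conceptual sketch of fine-grained (per-block) FP8 scaling, as a slow PyTorch reference.
# Not DeepGEMM's API; requires a PyTorch build with the float8_e4m3fn dtype.
import torch

def quantize_fp8_blockwise(x, block=128):
    """Quantize [M, K] to float8_e4m3fn with one scale per (row, column-block); K % block == 0."""
    M, K = x.shape
    xb = x.view(M, K // block, block)
    scale = xb.abs().amax(dim=-1, keepdim=True).clamp(min=1e-12) / 448.0  # e4m3 max is ~448
    q = (xb / scale).to(torch.float8_e4m3fn)
    return q.view(M, K), scale.squeeze(-1)        # scales have shape [M, K // block]

def dequant_gemm(aq, a_scale, bq, b_scale, block=128):
    """Reference (slow) GEMM that dequantizes blockwise before multiplying."""
    a = aq.to(torch.float32).view(*a_scale.shape, block) * a_scale.unsqueeze(-1)
    b = bq.to(torch.float32).view(*b_scale.shape, block) * b_scale.unsqueeze(-1)
    return a.reshape(aq.shape) @ b.reshape(bq.shape).T

A, B = torch.randn(256, 512), torch.randn(256, 512)
aq, a_s = quantize_fp8_blockwise(A)
bq, b_s = quantize_fp8_blockwise(B)
print((dequant_gemm(aq, a_s, bq, b_s) - A @ B.T).abs().mean())  # small quantization error
</code></pre></div></div>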

<p>rss · GitHub Trending - CUDA · Apr 12, 01:33</p>

<p><strong>Background</strong>: General Matrix Multiplication (GEMM) is the computational backbone of deep learning, yet optimizing it for lower precision formats like FP8 remains challenging. Prior solutions often lacked fine-grained scaling or were not fully optimized for the latest NVIDIA tensor cores. Developers previously had to rely on generic libraries like CUTLASS, which require significant manual tuning to achieve peak FP8 performance. DeepGEMM emerges to fill this niche by providing ready-to-use, highly tuned kernels specifically for these advanced workloads.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://rocm.blogs.amd.com/artificial-intelligence/gemm_blog/README.html">GEMM Kernel Optimization For AMD GPUs — ROCm Blogs</a></li>
<li><a href="https://github.com/leimao/CUDA-GEMM-Optimization">GitHub - leimao/CUDA- GEMM - Optimization : CUDA Matrix...</a></li>
<li><a href="https://developer.nvidia.com/blog/improving-gemm-kernel-auto-tuning-efficiency-on-nvidia-gpus-with-heuristics-and-cutlass-4-2/">Improving GEMM Kernel Auto-Tuning Efficiency on NVIDIA GPUs with...</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#fp8</code>, <code class="language-plaintext highlighter-rouge">#gemm</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#high-performance-computing</code></p>

<hr />

<p><a id="item-28"></a></p>
<h2 id="optimized-cuda-library-for-causal-conv1d-in-mamba-️-9010"><a href="https://github.com/Dao-AILab/causal-conv1d">Optimized CUDA Library for Causal Conv1d in Mamba</a> ⭐️ 9.0/10</h2>

<p>Dao-AILab has released a highly optimized CUDA library specifically for causal depthwise 1D convolutions with a seamless PyTorch interface. This implementation provides the critical low-level kernel support required for modern state-space models like Mamba to function efficiently. It replaces slower standard PyTorch operations with custom GPU kernels designed for maximum throughput. This library is essential because standard convolution implementations often become bottlenecks in linear-time sequence modeling architectures. By optimizing these specific causal operations, developers can achieve significant speedups in training and inference for Mamba-based models. It enables the practical deployment of state-space models that compete with Transformers in performance while maintaining linear complexity. Without such optimized kernels, the theoretical efficiency of these new architectures cannot be fully realized on current hardware. The project offers a drop-in replacement for standard conv1d layers when causal masking is required in sequence tasks. It is explicitly designed to support the selective scan mechanisms found in the Mamba architecture. The library leverages low-level CUDA optimizations to minimize memory access overhead and maximize parallelism.</p>

<p>rss · GitHub Trending - CUDA · Apr 12, 01:33</p>

<p><strong>Background</strong>: Sequence modeling has long been dominated by Transformers, which suffer from quadratic complexity relative to sequence length. Recent advancements in State Space Models (SSMs), particularly the Mamba architecture, propose linear-time alternatives that require specialized convolution operations. Prior to this release, efficient execution of causal depthwise convolutions relied on less optimized generic libraries or custom forks. This project fills the gap by providing a production-ready, high-performance kernel specifically tuned for these emerging architectures.</p>
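<p><strong>Illustration</strong>: For reference, the unfused baseline that a kernel like this replaces is simply a left-padded depthwise convolution. The sketch below shows that plain PyTorch baseline, not the library’s CUDA implementation; the fused op keeps the same semantics while avoiding the extra memory traffic.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Reference (unfused) causal depthwise conv1d in plain PyTorch.
# This is the baseline the optimized CUDA kernel replaces, not the
# library's own implementation.
import torch
import torch.nn.functional as F

def causal_depthwise_conv1d(x, weight, bias=None):
    """x: (batch, channels, seqlen); weight: (channels, kernel_size)."""
    channels, k = weight.shape
    # Left-pad so each output position only sees current and past inputs.
    x = F.pad(x, (k - 1, 0))
    return F.conv1d(x, weight.unsqueeze(1), bias, groups=channels)

batch, channels, seqlen, k = 2, 64, 1024, 4
x = torch.randn(batch, channels, seqlen)
w = torch.randn(channels, k)
y = causal_depthwise_conv1d(x, w)
print(y.shape)  # (2, 64, 1024): output length matches input, no lookahead
</code></pre></div></div>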

<details><summary>References</summary>
<ul>
<li><a href="https://grokipedia.com/page/mamba_deep_learning_architecture">Mamba (deep learning architecture)</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The AI engineering community views this release as a foundational component for adopting Mamba in production environments. Developers are actively integrating it into existing pipelines to benchmark performance gains against traditional Transformer baselines.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#pytorch</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#mamba</code>, <code class="language-plaintext highlighter-rouge">#kernels</code></p>

<hr />

<p><a id="item-29"></a></p>
<h2 id="microsoft-releases-markitdown-for-llm-data-ingestion-️-8010"><a href="https://github.com/microsoft/markitdown">Microsoft Releases MarkItDown for LLM Data Ingestion</a> ⭐️ 8.0/10</h2>

<p>Microsoft’s AutoGen team has released MarkItDown, a Python utility designed to convert diverse file formats like PDF, Word, and PowerPoint into Markdown. This tool specifically targets the data ingestion bottleneck faced by AI agents by preserving document structure such as headings and tables. It also introduces an MCP server for seamless integration with LLM applications like Claude Desktop. Effective RAG pipelines and AI agents require clean, structured text input, yet most enterprise data resides in complex binary formats. MarkItDown fills this critical gap by offering a production-ready solution that prioritizes machine readability over human-facing fidelity. Unlike general converters, it optimizes output specifically for LLM consumption, reducing preprocessing overhead for engineers building agentic workflows. The tool supports conversion from PDF, PowerPoint, and Word files while maintaining structural elements like lists and links. Recent updates include optional feature groups for dependencies and a shift to binary stream processing to avoid temporary file creation. It is built by the AutoGen team and integrates directly with Model Context Protocol standards.</p>

<p>rss · GitHub Trending - Daily · Apr 12, 01:32</p>

<p><strong>Background</strong>: Prior to MarkItDown, engineers often relied on tools like Textract or custom scripts that frequently lost semantic structure or required heavy maintenance. Existing solutions often focused on extracting raw text without regard for hierarchy, making them suboptimal for context-aware AI tasks. MarkItDown emerges as a specialized bridge between legacy document formats and modern LLM architectures.</p>
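<p><strong>Illustration</strong>: A minimal usage sketch based on the project’s documented Python interface is shown below; the file path is a placeholder and option names may differ between releases.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Minimal MarkItDown usage sketch (interface as documented upstream;
# the input file is a placeholder and options may vary between versions).
from markitdown import MarkItDown

md = MarkItDown()                          # pip install markitdown
result = md.convert("quarterly_report.pdf")

# The Markdown string preserves headings, lists, and tables, which makes
# it straightforward to chunk for a RAG pipeline.
print(result.text_content[:500])
</code></pre></div></div>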

<details><summary>References</summary>
<ul>
<li><a href="https://www.zhihu.com/question/952838112?write">LangGraph、Autogen和Crewai，这三个多智能体开发框架的工具区别是什...</a></li>
<li><a href="https://www.zhihu.com/question/624287948">微软推出 AutoGen 框架，有哪些你喜欢的功能？ - 知乎</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Developers are discussing the breaking changes in version 0.1.0, particularly the shift to binary stream handling which improves efficiency but requires code updates. The community is also exploring the new MCP server integration for connecting local LLM apps to file systems.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-infrastructure</code>, <code class="language-plaintext highlighter-rouge">#data-processing</code>, <code class="language-plaintext highlighter-rouge">#microsoft</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#python</code></p>

<hr />

<p><a id="item-30"></a></p>
<h2 id="archon-deterministic-harness-for-ai-coding-workflows-️-8010"><a href="https://github.com/coleam00/Archon">Archon: Deterministic Harness for AI Coding Workflows</a> ⭐️ 8.0/10</h2>

<p>Archon has launched as the first open-source harness builder designed to make AI coding processes deterministic and repeatable. It allows developers to define complex development phases like planning, implementation, and validation using YAML workflows. This tool effectively bridges the gap between unpredictable LLM outputs and reliable software engineering standards. Current AI agents often produce inconsistent results, skipping steps or ignoring constraints based on probabilistic generation. Archon solves this by enforcing a rigid workflow structure where the AI only operates within defined nodes and validation gates. This shift enables teams to trust AI for critical tasks like bug fixing and feature implementation without constant manual supervision. Ultimately, it transforms AI from a chaotic assistant into a reliable component of the CI/CD pipeline. The framework supports isolated git worktrees for parallel execution and mixes deterministic bash scripts with AI-driven nodes. Workflows are portable across CLI, Web UI, and chat interfaces like Slack, ensuring consistent behavior everywhere. Users can define loops for iterative coding until tests pass and include interactive human approval gates before merging.</p>

<p>rss · GitHub Trending - Daily · Apr 12, 01:32</p>

<p><strong>Background</strong>: Prior to Archon, AI coding tools largely relied on single-turn prompts or unstructured chat sessions that lacked process enforcement. While tools like GitHub Actions standardized infrastructure tasks, no equivalent existed for orchestrating multi-step AI reasoning and coding actions. Archon fills this niche by applying the ‘Dockerfile for infrastructure’ philosophy to AI agent workflows, ensuring every run follows the exact same logical path.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://www.augmentcode.com/guides/deterministic-ai-for-predictable-coding">Deterministic AI for Predictable Coding | Augment Code</a></li>
<li><a href="https://www.timextender.com/blog/product-technology/the-ultimate-guide-to-deterministic-ai-code-generation-in-data-engineering">The Ultimate Guide to Deterministic AI Code Generation in Data Engineering</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight the value of combining deterministic validation scripts with flexible AI generation nodes. The ability to commit workflow definitions directly into repositories is seen as a major step toward version-controlled AI operations.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-31"></a></p>
<h2 id="multica-orchestrates-autonomous-coding-agents-as-collaborative-teammates-️-8010"><a href="https://github.com/multica-ai/multica">Multica Orchestrates Autonomous Coding Agents as Collaborative Teammates</a> ⭐️ 8.0/10</h2>

<p>Multica introduces an open-source platform that treats autonomous coding agents as first-class teammates capable of accepting tasks and reporting progress. It enables skill compounding by converting completed solutions into reusable assets for the entire team. The platform supports vendor-neutral integration with tools like Claude Code and Codex while offering self-hosted deployment options. This project addresses the critical engineering challenge of moving from single-prompt interactions to managed, long-running agent workflows. By providing a unified dashboard for task assignment and lifecycle monitoring, it reduces the operational overhead of babysitting multiple autonomous processes. The concept of skill compounding offers a path toward sustainable AI teams that improve over time rather than resetting context with every query. Ultimately, it bridges the gap between experimental agent scripts and production-grade collaborative infrastructure. Key features include autonomous execution with real-time WebSocket streaming, multi-workspace isolation, and a unified runtime for local and cloud daemons. Agents actively participate in boards by creating issues, posting comments, and proactively reporting blockers. The system supports popular coding agents including Claude Code, Codex, OpenClaw, and OpenCode through a flexible CLI interface.</p>

<p>rss · GitHub Trending - Daily · Apr 12, 01:32</p>

<p><strong>Background</strong>: Prior solutions for autonomous coding often rely on ad-hoc scripts or isolated CLI tools that lack persistent state management and team visibility. Engineers currently struggle to track long-running agent tasks or reuse successful patterns across different projects without manual intervention. Multica fills this niche by providing a structured orchestration layer that mimics human team dynamics. It transforms ephemeral agent runs into tracked work items with historical context and reusable skills.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://jules.google/">Jules - An Autonomous Coding Agent</a></li>
<li><a href="https://www.reddit.com/r/singularity/comments/1j4ma26/whats_the_current_best_autonomous_coding_agent/">Whats the current best autonomous coding agent? : r/singularity - Reddit</a></li>
<li><a href="https://martinfowler.com/articles/exploring-gen-ai/autonomous-agents-codex-example.html">Autonomous coding agents: A Codex example - Martin Fowler</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early discussions highlight strong interest in the ‘skill compounding’ feature as a differentiator from standard agent runners. Users are particularly eager to verify the stability of the self-hosted daemon in complex enterprise environments beyond the initial README documentation.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#autonomous-coding</code>, <code class="language-plaintext highlighter-rouge">#orchestration</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-32"></a></p>
<h2 id="kronos-first-open-source-foundation-model-for-financial-k-lines-️-8010"><a href="https://github.com/shiyu-coder/Kronos">Kronos: First Open-Source Foundation Model for Financial K-Lines</a> ⭐️ 8.0/10</h2>

<p>Kronos has been accepted by AAAI 2026 and released fine-tuning scripts to adapt the model for specific quantitative tasks. The project now offers a family of pre-trained decoder-only models accessible via Hugging Face, trained on data from over 45 global exchanges. A live demo is available showcasing 24-hour forecasting capabilities for trading pairs like BTC/USDT. Unlike general-purpose time-series foundation models, Kronos is specifically engineered to handle the high-noise and non-stationary characteristics of financial market data. By quantizing continuous OHLCV data into hierarchical discrete tokens, it enables large autoregressive Transformers to effectively learn the ‘language’ of candlesticks. This specialization allows for more accurate forecasting and pattern recognition in volatile markets compared to generic AI solutions. The open-source release significantly lowers the barrier for fintech developers to build sophisticated quantitative strategies without massive compute resources. The model utilizes a novel two-stage framework featuring a specialized tokenizer and a large autoregressive Transformer pre-trained on K-line sequences. It supports diverse quantitative tasks through a unified architecture and provides model weights for varying computational capacities. The system is designed to interpret the complex dynamics of global exchanges, offering a robust baseline for financial analysis.</p>

<p>rss · GitHub Trending - Daily · Apr 12, 01:32</p>

<p><strong>Background</strong>: Financial time series forecasting traditionally relies on statistical methods or specialized deep learning models that often struggle with the stochastic nature of market data. General foundation models have emerged but frequently lack the domain-specific inductive biases required for high-frequency trading or precise price movement prediction. Kronos fills this niche by treating financial candlesticks as a distinct language, applying NLP-style tokenization to numerical market data. This approach bridges the gap between large-scale self-supervised learning and the specific demands of algorithmic trading.</p>
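<p><strong>Illustration</strong>: The idea of treating candlesticks as a language can be pictured with a toy quantizer: normalize each OHLC field against the previous close and bin it into a small discrete vocabulary. This is a conceptual sketch only, not Kronos’s learned two-stage tokenizer or its hierarchical codebook; the ±5% bin range is an assumption.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Toy K-line "tokenizer": bin relative OHLC moves into discrete symbols.
# Conceptual sketch only; Kronos uses a learned hierarchical tokenizer.
import numpy as np

BINS = np.linspace(-0.05, 0.05, 16)     # +/-5% range split into buckets (assumed)

def tokenize_candles(ohlc):
    """ohlc: array of shape (T, 4) with open, high, low, close prices."""
    prev_close = np.roll(ohlc[:, 3], 1)
    prev_close[0] = ohlc[0, 0]
    rel = (ohlc - prev_close[:, None]) / prev_close[:, None]   # relative moves
    ids = np.digitize(rel, BINS)         # (T, 4) integer token ids per field
    # Flatten each candle's four field tokens into one composite token id.
    return ids[:, 0] * 17**3 + ids[:, 1] * 17**2 + ids[:, 2] * 17 + ids[:, 3]

prices = np.cumsum(np.random.randn(100)) + 100.0
candles = np.stack([prices, prices + 0.5, prices - 0.5, prices + 0.1], axis=1)
tokens = tokenize_candles(candles)
print(tokens[:10])   # discrete sequence an autoregressive Transformer can model
</code></pre></div></div>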

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Foundation_model">Foundation model</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The acceptance of Kronos by AAAI 2026 signals strong academic validation for its novel tokenization approach to financial data. Early users are particularly interested in the released fine-tuning scripts to customize the model for proprietary trading strategies.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#foundation-model</code>, <code class="language-plaintext highlighter-rouge">#fintech</code>, <code class="language-plaintext highlighter-rouge">#nlp</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#finance</code></p>

<hr />

<p><a id="item-33"></a></p>
<h2 id="reverse-engineering-googles-synthid-watermark-via-spectral-analysis-️-8010"><a href="https://github.com/aloshdenny/reverse-SynthID">Reverse-Engineering Google’s SynthID Watermark via Spectral Analysis</a> ⭐️ 8.0/10</h2>

<p>This project introduces a novel method to detect and remove Google Gemini’s SynthID watermarks using multi-resolution spectral analysis without accessing the proprietary encoder. It achieves a 90% detection rate and significantly reduces watermark coherence while maintaining high image quality (43+ dB PSNR). The tool relies on a ‘SpectralCodebook’ of fingerprints rather than brute-force noise injection. This research critically challenges the assumption that invisible AI watermarks are robust against determined adversaries, offering vital insights for AI safety and content authenticity verification. By demonstrating that spectral patterns can be surgically removed, it highlights potential vulnerabilities in current industry-standard provenance tools. However, its ‘Research’ license explicitly restricts production deployment, positioning it as an analytical tool for developers rather than a consumer bypass utility. The tool utilizes a resolution-dependent carrier frequency structure to identify and suppress watermark signals across different image sizes. It actively seeks community contributions of pure black and white images generated by Nano Banana Pro to expand its reference codebook. Performance metrics indicate a 75% carrier energy drop and a 91% phase coherence drop during the bypass process.</p>

<p>rss · GitHub Trending - Python · Apr 12, 01:37</p>

<p><strong>Background</strong>: Google’s SynthID is designed to embed imperceptible identifiers into AI-generated images to track origin and combat misinformation. Prior solutions for removing such watermarks often relied on destructive methods like heavy compression or noise addition, which degraded image utility. This project fills a niche by applying signal processing techniques to reverse-engineer the specific spectral signature of the watermark non-destructively.</p>
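<p><strong>Illustration</strong>: A rough picture of the spectral-fingerprint idea: compare an image’s frequency-magnitude spectrum against a stored reference pattern and attenuate only the matching carriers. The numpy sketch below is heavily simplified and purely for intuition; the carrier grid is hypothetical, and the project’s SpectralCodebook, carrier discovery, and multi-resolution handling are far more involved.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Simplified spectral-fingerprint check and suppression (illustration only;
# not the project's actual SpectralCodebook pipeline).
import numpy as np

def detect_and_suppress(image, fingerprint, threshold=0.1):
    """image: 2-D grayscale array; fingerprint: boolean mask of carrier bins."""
    spectrum = np.fft.fft2(image)
    magnitude = np.abs(spectrum)
    # Compare energy at the fingerprint's carrier bins against the background.
    carrier_energy = magnitude[fingerprint].mean()
    background = magnitude[~fingerprint].mean()
    detected = carrier_energy / (background + 1e-9) - 1.0 >= threshold
    if detected:
        # Attenuate only the carrier bins, leaving the rest of the image intact.
        spectrum[fingerprint] *= 0.1
    cleaned = np.real(np.fft.ifft2(spectrum))
    return detected, cleaned

img = np.random.rand(256, 256)
fp = np.zeros((256, 256), dtype=bool)
fp[32::64, 32::64] = True               # hypothetical carrier grid
print(detect_and_suppress(img, fp)[0])
</code></pre></div></div>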

<p><strong>Discussion</strong>: The project maintainers are actively requesting specific datasets from the community to improve cross-resolution robustness and carrier frequency discovery. Users are encouraged to generate and upload uniform black and white images to a hosted Hugging Face dataset to aid in refining the SpectralCodebook.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-safety</code>, <code class="language-plaintext highlighter-rouge">#reverse-engineering</code>, <code class="language-plaintext highlighter-rouge">#watermarking</code>, <code class="language-plaintext highlighter-rouge">#gemini</code>, <code class="language-plaintext highlighter-rouge">#research</code></p>

<hr />

<p><a id="item-34"></a></p>
<h2 id="standardized-scientific-skills-library-for-ai-agents-️-8010"><a href="https://github.com/K-Dense-AI/scientific-agent-skills">Standardized Scientific Skills Library for AI Agents</a> ⭐️ 8.0/10</h2>

<p>K-Dense-AI has released ‘Scientific Agent Skills,’ a comprehensive library of 134+ executable skills designed to empower AI agents in research and engineering domains. This project evolves from a Claude-specific tool to an open standard compatible with Cursor, Codex, and other agent frameworks. It also introduces K-Dense BYOK, a local desktop co-scientist leveraging these skills for private data processing. This library addresses the critical fragmentation in agentic workflows by providing a unified, interoperable set of specialized tools for complex scientific tasks. By standardizing skills like genomics analysis and molecular docking, it significantly reduces the engineering overhead required to build reliable research assistants. The shift to an open standard ensures broader adoption and prevents vendor lock-in for scientific AI applications. The repository includes curated capabilities for bioinformatics, cheminformatics, proteomics, and clinical research, covering over 78 scientific databases. It supports seamless integration with major AI coding agents while offering a local execution mode via the companion BYOK project for sensitive data. The skills are documented with specific examples to enhance reliability in multi-step scientific workflows.</p>

<p>rss · GitHub Trending - Python · Apr 12, 01:37</p>

<p><strong>Background</strong>: Prior to this release, developers often had to manually script connections between LLMs and specialized scientific libraries, leading to inconsistent performance and high maintenance costs. Existing solutions were frequently tied to specific models or lacked the depth required for rigorous scientific computation. This project fills that niche by offering a pre-validated, domain-specific skill set that bridges the gap between general-purpose AI and expert-level scientific tools.</p>

<p><strong>Discussion</strong>: While direct community discussion metrics are not yet available, the project’s rapid rebranding to an open standard suggests strong developer interest in interoperability. The introduction of a local-first desktop application indicates a responsive approach to user concerns regarding data privacy in scientific research.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#scientific-computing</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#llm-tools</code>, <code class="language-plaintext highlighter-rouge">#research</code></p>

<hr />

<p><a id="item-35"></a></p>
<h2 id="agentscope-visual-debugging-for-trustworthy-multi-agent-systems-️-8010"><a href="https://github.com/agentscope-ai/agentscope">AgentScope: Visual Debugging for Trustworthy Multi-Agent Systems</a> ⭐️ 8.0/10</h2>

<p>AgentScope has released support for realtime voice agents and multi-agent realtime workflows, enabling more natural human-AI interaction. The project is actively preparing for version 2.0 with a published roadmap extending to January 2026. Recent updates also include biweekly community meetings to coordinate ecosystem development and share technical plans. As LLM-based multi-agent systems grow in complexity, engineers face significant challenges in observing interactions and ensuring system trustworthiness. AgentScope addresses this by providing unique visual debugging capabilities that make agent behaviors transparent and understandable. Its production-ready architecture supports deployment across local, serverless, and Kubernetes environments with built-in OpenTelemetry integration. This framework shifts the paradigm from constraining models with rigid prompts to leveraging their inherent reasoning and tool-use abilities. The framework offers essential abstractions including ReAct agents, memory management, planning modules, and human-in-the-loop steering mechanisms. It features extensive ecosystem integrations for tools and observability, along with built-in support for Model Context Protocol (MCP) and Agent-to-Agent (A2A) communication. Developers can deploy agents as local services, cloud functions, or containerized applications while maintaining full traceability via OTel.</p>

<p>rss · GitHub Trending - Python · Apr 12, 01:37</p>

<p><strong>Background</strong>: Multi-agent systems (MAS) are computational systems composed of multiple interacting intelligent agents capable of solving problems beyond individual agent capacities. While traditional agent-based models focus on scientific simulation, engineering-focused MAS aims to solve practical tasks like coordinated decision-making and complex workflow automation. Existing frameworks often lack sufficient observability tools, making it difficult to debug emergent behaviors in LLM-driven agents. AgentScope fills this niche by combining ease of use with deep inspection capabilities tailored for modern agentic AI.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/agentscope-ai/agentscope">GitHub - agentscope-ai/agentscope: Build and run agents you can...</a></li>
<li><a href="https://en.wikipedia.org/wiki/Multi-agent_system">Multi-agent system</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project maintains an active Discord community and hosts biweekly meetings to discuss roadmap items and ecosystem updates. Users frequently share examples of realtime voice agents and multi-agent orchestration patterns in the discussion forums.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#multi-agent-systems</code>, <code class="language-plaintext highlighter-rouge">#llm-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#python</code>, <code class="language-plaintext highlighter-rouge">#ai-framework</code></p>

<hr />

<p><a id="item-36"></a></p>
<h2 id="claude-mem-adds-persistent-memory-to-ai-coding-sessions-️-8010"><a href="https://github.com/thedotmack/claude-mem">Claude-Mem Adds Persistent Memory to AI Coding Sessions</a> ⭐️ 8.0/10</h2>

<p>The new claude-mem plugin automatically captures, compresses, and reinjects coding session context for Claude Code agents. It utilizes AI-driven compression to maintain relevant historical data without exceeding context window limits. This tool directly addresses the statelessness problem in AI coding agents by providing persistent memory across sessions. Developers no longer need to manually re-explain project architecture or previous decisions to the AI. By automating context management, it significantly reduces token usage and improves workflow efficiency for long-term projects. Built as a TypeScript plugin, it integrates seamlessly with the official Claude Code plugin system. The core mechanism involves capturing agent actions, summarizing them via an auxiliary model, and injecting summaries into future prompts. This approach ensures that only high-value context is retained while discarding transient noise.</p>

<p>rss · GitHub Trending - TypeScript · Apr 12, 01:39</p>

<p><strong>Background</strong>: AI coding assistants typically lose all context once a session ends, forcing users to restart explanations for every new interaction. While some solutions rely on manual note-taking or static file references, they lack dynamic adaptation to the conversation flow. Claude-Mem fills this niche by creating an automated, evolving memory layer specifically designed for iterative development workflows.</p>
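<p><strong>Illustration</strong>: The capture, compress, and reinject loop can be pictured with a few lines of plumbing: log what the agent did, condense it with an auxiliary model, and prepend stored summaries to the next session’s prompt. This is a generic Python sketch of the pattern, not the plugin’s TypeScript implementation; <code class="language-plaintext highlighter-rouge">summarize_with_llm</code> is a placeholder for whatever model call is used.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Generic sketch of a capture -&gt; compress -&gt; reinject memory loop.
# Not claude-mem's implementation; summarize_with_llm is a placeholder.
import json
from pathlib import Path

MEMORY_FILE = Path("session_memory.jsonl")

def summarize_with_llm(events):
    # Placeholder: in practice an auxiliary model condenses the raw events.
    return "Summary of %d agent actions" % len(events)

def end_of_session(events):
    """Compress the session's actions and persist the summary."""
    summary = summarize_with_llm(events)
    with MEMORY_FILE.open("a") as f:
        f.write(json.dumps({"summary": summary}) + "\n")

def start_of_session(user_prompt, max_entries=20):
    """Reinject only the most recent summaries into the next prompt."""
    if not MEMORY_FILE.exists():
        return user_prompt
    lines = MEMORY_FILE.read_text().splitlines()[-max_entries:]
    memories = [json.loads(line)["summary"] for line in lines]
    return "Relevant history:\n- " + "\n- ".join(memories) + "\n\n" + user_prompt

end_of_session(["edited auth.ts", "ran tests", "fixed failing case"])
print(start_of_session("Continue hardening the auth module"))
</code></pre></div></div>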

<details><summary>References</summary>
<ul>
<li><a href="https://code.claude.com/docs/en/plugins">Create plugins - Claude Code Docs</a></li>
<li><a href="https://github.com/anthropics/claude-plugins-official">Claude Code Plugins Directory - GitHub</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight its ability to maintain complex project states over days of development without manual intervention. The community is particularly interested in how the compression algorithm balances detail retention with token economy.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#claude-code</code>, <code class="language-plaintext highlighter-rouge">#ai-memory</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#context-management</code>, <code class="language-plaintext highlighter-rouge">#typescript</code></p>

<hr />

<p><a id="item-37"></a></p>
<h2 id="qwen-code-terminal-based-ai-agent-for-developers-️-8010"><a href="https://github.com/QwenLM/qwen-code">Qwen Code: Terminal-Based AI Agent for Developers</a> ⭐️ 8.0/10</h2>

<p>The Qwen team has released qwen-code, an open-source CLI agent optimized for interacting with codebases via natural language directly in the terminal. It features native support for the new Qwen3.6-Plus model and offers a free tier of 1,000 daily requests via OAuth. The tool integrates multi-protocol API support and includes agentic workflows with built-in skills and sub-agents. This tool bridges the gap between powerful LLMs and command-line development workflows, allowing engineers to automate tedious tasks without leaving their terminal. By co-evolving with the open-source Qwen3-Coder model, it ensures tight integration and optimized performance for coding tasks specifically. Its ability to function as a local-first agent with optional IDE plugins makes it a versatile addition to modern AI engineering stacks. Qwen Code requires Node.js 20+ and can be installed globally via npm or through platform-specific shell scripts. It supports OpenAI, Anthropic, and Gemini-compatible APIs alongside its native Qwen OAuth authentication. The agent provides a Claude Code-like experience with features designed for understanding large codebases and shipping code faster.</p>

<p>rss · GitHub Trending - TypeScript · Apr 12, 01:39</p>

<p><strong>Background</strong>: Developers often struggle to integrate AI assistance into terminal-heavy workflows without relying on heavy IDE overlays or context-switching to web interfaces. Qwen Code addresses this by providing a lightweight, terminal-native agent that leverages the specific strengths of the Qwen series models for code generation and refactoring. Unlike generic chatbots, it is designed with agentic capabilities like sub-agents and file system interaction specifically for software engineering contexts.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agent</code>, <code class="language-plaintext highlighter-rouge">#cli-tool</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#qwen</code>, <code class="language-plaintext highlighter-rouge">#terminal</code></p>

<hr />

<p><a id="item-38"></a></p>
<h2 id="autobe-generates-guaranteed-compilable-typescript-backends-️-8010"><a href="https://github.com/wrtnlabs/autobe">AutoBE Generates Guaranteed Compilable TypeScript Backends</a> ⭐️ 8.0/10</h2>

<p>AutoBE introduces an AI agent that generates production-ready TypeScript backend servers with a unique guarantee of 100% compilability. By integrating compiler feedback directly into the generation loop, it eliminates the common issue of broken code from AI assistants. The tool produces complete specifications, database schemas, API documentation, and comprehensive end-to-end tests automatically. Current AI coding agents often produce syntactically incorrect or logically fragmented code that requires significant manual debugging. AutoBE addresses this reliability gap by leveraging compiler skills to ensure every generated line fits within a working build context. This shift from ‘vibe coding’ to verified generation significantly reduces time-to-prototype and increases trust in AI-assisted development for critical backend systems. The project features a chat interface for natural language requirement analysis and outputs clean implementation logic suitable for both junior learning and senior productivity. It supports complex scenarios like ERP systems and e-commerce platforms, providing detailed Entity Relationship Diagrams and Prisma schemas. Users can immediately extend the generated stable foundation using other AI code assistants like Claude Code.</p>

<p>rss · GitHub Trending - TypeScript · Apr 12, 01:39</p>

<p><strong>Background</strong>: AutoBE fills a critical niche in the ‘vibe coding’ landscape where speed often compromises code quality and build stability. Unlike general-purpose code generators that rely on probabilistic token prediction alone, AutoBE incorporates a verification step to guarantee compilability before presenting code to the user. This approach targets the specific pain point of backend developers who need reliable scaffolding rather than just code snippets.</p>
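<p><strong>Illustration</strong>: The compiler-in-the-loop idea can be sketched as a generate, compile, repair cycle: run the TypeScript compiler on each candidate and feed the diagnostics back until the build is clean. The sketch below is a generic rendering of that loop, not AutoBE’s architecture; <code class="language-plaintext highlighter-rouge">generate_code</code> is a stand-in for the model call, and the <code class="language-plaintext highlighter-rouge">tsc</code> invocation assumes a Node toolchain is available on PATH.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Generic compiler-in-the-loop generation sketch (not AutoBE's implementation).
# generate_code is a hypothetical model call; assumes `npx tsc` is available.
import subprocess
from pathlib import Path

def generate_code(spec, compiler_errors=None):
    # Placeholder for an LLM call that takes the spec plus prior diagnostics.
    # Here it emits a trivially valid stub so the loop can be exercised.
    return "export const todo = %r;\n" % spec

def build_until_green(spec, out_file="server.ts", max_rounds=5):
    errors = None
    for _ in range(max_rounds):
        Path(out_file).write_text(generate_code(spec, errors))
        result = subprocess.run(
            ["npx", "tsc", "--noEmit", out_file],
            capture_output=True, text=True,
        )
        if result.returncode == 0:
            return out_file                      # compiles cleanly
        errors = result.stdout + result.stderr   # feed diagnostics back
    raise RuntimeError("still failing to compile after %d rounds" % max_rounds)
</code></pre></div></div>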

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Vibe_coding">Vibe coding - Wikipedia</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early examples demonstrate the tool’s ability to handle complex domains like ERP systems with full test coverage and API documentation. The repository includes diverse templates ranging from simple to-do lists to full shopping platforms, showcasing its versatility.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agent</code>, <code class="language-plaintext highlighter-rouge">#typescript</code>, <code class="language-plaintext highlighter-rouge">#backend-development</code>, <code class="language-plaintext highlighter-rouge">#code-generation</code>, <code class="language-plaintext highlighter-rouge">#compiler</code></p>

<hr />

<p><a id="item-39"></a></p>
<h2 id="nvidia-cuopt-accelerates-large-scale-routing-optimization-️-8010"><a href="https://github.com/NVIDIA/cuopt">NVIDIA cuopt Accelerates Large-Scale Routing Optimization</a> ⭐️ 8.0/10</h2>

<p>NVIDIA has released cuopt, a GPU-accelerated library specifically designed to solve complex decision optimization and routing problems. This tool leverages CUDA cores to deliver high-efficiency solutions for logistics challenges where CPU-based solvers traditionally struggle. Traditional optimization solvers often become bottlenecks when handling large-scale supply chain or vehicle routing problems due to sequential processing limits. By offloading these computations to GPUs, cuopt offers significant speedups, enabling real-time decision-making in dynamic environments. This shift is critical for AI engineers building autonomous logistics systems or advanced supply chain simulations where latency directly impacts operational costs. The library focuses on combinatorial optimization tasks such as the Traveling Salesman Problem and Vehicle Routing Problem with Time Windows. It integrates easily into Python workflows and is optimized for NVIDIA GPU architectures to maximize throughput. Unlike general ML frameworks, cuopt is a specialized solver targeting exact or near-exact solutions for operations research scenarios.</p>

<p>rss · GitHub Trending - CUDA · Apr 12, 01:33</p>

<p><strong>Background</strong>: Decision optimization in logistics has historically relied on CPU-bound solvers like Gurobi or OR-Tools, which can be slow for massive datasets. As supply chains grow more complex and require faster reaction times, the industry needs hardware-accelerated approaches. cuopt fills this niche by applying parallel computing principles to mathematical programming, offering a modern alternative to legacy serial algorithms.</p>
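<p><strong>Illustration</strong>: To ground what a routing solver consumes, the sketch below builds a small distance matrix and runs a naive nearest-neighbour tour as a CPU baseline. It shows the class of combinatorial problem cuopt parallelizes on the GPU; it is not an example of cuopt’s own Python API.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># CPU baseline for the kind of problem cuopt targets: a tiny TSP instance
# solved with a naive nearest-neighbour heuristic (not cuopt's API).
import numpy as np

rng = np.random.default_rng(0)
coords = rng.random((50, 2))                         # 50 delivery locations
dist = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)

def nearest_neighbour_tour(dist, start=0):
    n = dist.shape[0]
    unvisited = set(range(n)) - {start}
    tour, current = [start], start
    while unvisited:
        nxt = min(unvisited, key=lambda j: dist[current, j])
        tour.append(nxt)
        unvisited.remove(nxt)
        current = nxt
    tour.append(start)                               # return to the depot
    return tour, sum(dist[a, b] for a, b in zip(tour, tour[1:]))

tour, length = nearest_neighbour_tour(dist)
print("tour length:", round(length, 3))
</code></pre></div></div>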

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/NVIDIA/nvbench">NVIDIA/nvbench: CUDA Kernel Benchmarking Library - GitHub</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight the library’s impressive performance gains over CPU baselines, particularly for routing problems with thousands of nodes. However, some users note that it requires specific NVIDIA hardware and may have a steeper learning curve for those unfamiliar with GPU memory management.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#optimization</code>, <code class="language-plaintext highlighter-rouge">#gpu</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#logistics</code>, <code class="language-plaintext highlighter-rouge">#nvidia</code></p>

<hr />

<p><a id="item-40"></a></p>
<h2 id="opendataloader-pdf-high-accuracy-multi-language-parser-for-rag-️-7010"><a href="https://github.com/opendataloader-project/opendataloader-pdf">OpenDataLoader PDF: High-Accuracy Multi-Language Parser for RAG</a> ⭐️ 7.0/10</h2>

<p>OpenDataLoader PDF is a new open-source library designed to convert PDFs into AI-ready formats like Markdown, JSON with bounding boxes, and HTML. It introduces a hybrid mode combining deterministic local parsing with AI assistance to handle complex layouts, tables, and OCR tasks across 80+ languages. The project claims top benchmark scores for table accuracy and plans to release end-to-end tagged PDF generation for accessibility compliance in 2026. This tool addresses the critical bottleneck of extracting structured data from complex PDFs for Retrieval-Augmented Generation (RAG) pipelines. Its ability to accurately parse borderless tables, LaTeX formulas, and scanned documents reduces the need for manual cleanup or expensive proprietary APIs. By offering SDKs for Python, Node.js, and Java, it lowers the barrier for integrating high-quality document ingestion into diverse engineering stacks. The future focus on automated accessibility tagging also positions it as a solution for emerging regulatory requirements. The library supports outputting structured Markdown for chunking, JSON with bounding boxes for source citations, and HTML. It features built-in OCR for over 80 languages and claims a 0.928 accuracy score specifically for table extraction in real-world scenarios. Installation is available via standard package managers like PyPI, npm, and Maven Central, with ready-made LangChain integrations.</p>

<p>rss · GitHub Trending - Daily · Apr 12, 01:32</p>

<p><strong>Background</strong>: PDF parsing remains a significant challenge in AI engineering due to inconsistent layouts, scanned images, and complex elements like tables and formulas that break simple text extractors. Existing solutions often force a trade-off between fast, rule-based local processing and accurate but costly cloud-based AI services. OpenDataLoader PDF attempts to bridge this gap by offering a unified interface that switches between deterministic and AI-hybrid modes based on document complexity. This approach aims to provide the reliability of local tools with the intelligence of modern multimodal models.</p>
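<p><strong>Illustration</strong>: One reason bounding-box JSON matters for RAG is source citation: each retrieved chunk can point back to the pages it came from. The sketch below shows how such an export might be consumed downstream; the field names are hypothetical assumptions for illustration, not the library’s documented schema.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Consuming a (hypothetical) bounding-box JSON export for RAG citations.
# Field names here are illustrative assumptions, not the library's schema.
import json

def chunks_with_citations(json_path, max_chars=1000):
    blocks = json.load(open(json_path))["blocks"]
    chunks, buf, pages = [], [], set()
    for block in blocks:
        buf.append(block["text"])
        pages.add(block["page"])
        if sum(len(t) for t in buf) >= max_chars:
            chunks.append({"text": " ".join(buf), "pages": sorted(pages)})
            buf, pages = [], set()
    if buf:
        chunks.append({"text": " ".join(buf), "pages": sorted(pages)})
    return chunks

for chunk in chunks_with_citations("report.parsed.json")[:3]:
    print(chunk["pages"], chunk["text"][:80])
</code></pre></div></div>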

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#pdf-parsing</code>, <code class="language-plaintext highlighter-rouge">#data-engineering</code>, <code class="language-plaintext highlighter-rouge">#rag</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#ai-infrastructure</code></p>

<hr />

<p><a id="item-41"></a></p>
<h2 id="deeptutor-launches-agent-native-personalized-learning-system-️-7010"><a href="https://github.com/HKUDS/DeepTutor">DeepTutor Launches Agent-Native Personalized Learning System</a> ⭐️ 7.0/10</h2>

<p>DeepTutor has released version 1.0.0, featuring a complete architecture rewrite designed specifically for autonomous AI agents. The update introduces ‘TutorBot,’ a persistent agent capable of adaptive tutoring, and supports flexible mode switching within an open-source Apache 2.0 framework. This project moves beyond simple chatbot interfaces by implementing a multi-agent system that maintains long-term context of a student’s learning progress. It addresses the limitation of static LLM responses by providing a personalized, evolving educational companion rather than a one-off query tool. For developers, it offers a rare, production-ready reference implementation of agent-native design in the education vertical. However, its specialized nature means it serves as an application solution rather than a foundational library for building other tools. Built with Python and Next.js, DeepTutor integrates a CLI for agent-native interaction alongside a modern web interface. The system leverages persistent memory to allow TutorBot to adapt its teaching strategy based on historical user interactions. It is licensed under Apache 2.0, encouraging community contributions and commercial integration.</p>

<p>rss · GitHub Trending - Daily · Apr 12, 01:32</p>

<p><strong>Background</strong>: Traditional e-learning platforms often lack the dynamic adaptability required for truly personalized instruction, while generic LLM chats forget context between sessions. DeepTutor fills this niche by architecting a system where the AI agent is the core component, not an afterthought. Unlike prior solutions that wrap standard models in basic UIs, this project emphasizes stateful, autonomous agents that evolve with the learner. It represents a shift from prompt-engineering hacks to structured agent orchestration in EdTech.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Large_language_model">Large language model - Wikipedia</a></li>
<li><a href="https://www.geeksforgeeks.org/artificial-intelligence/large-language-model-llm/">What is a Large Language Model ( LLM ) - GeeksforGeeks</a></li>
<li><a href="https://www.ibm.com/think/topics/large-language-models">What Are Large Language Models (LLMs)? | IBM</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project has rapidly gained traction, reaching 10,000 GitHub stars and fostering active communities on Discord, WeChat, and Feishu. Users are particularly engaged with the new v1.0.0 architecture and the potential for deploying persistent tutors in real-world educational settings.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm-agents</code>, <code class="language-plaintext highlighter-rouge">#edtech</code>, <code class="language-plaintext highlighter-rouge">#personalized-learning</code>, <code class="language-plaintext highlighter-rouge">#ai-tutor</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-42"></a></p>
<h2 id="superpowers-framework-enforces-structured-agentic-workflows-️-7010"><a href="https://github.com/obra/superpowers">Superpowers Framework Enforces Structured Agentic Workflows</a> ⭐️ 7.0/10</h2>

<p>Superpowers introduces an agentic skills framework that prevents coding agents from immediately writing code, instead enforcing a workflow of specification refinement and test-driven implementation planning. It utilizes composable skills to guide agents through a red/green TDD process, ensuring adherence to YAGNI and DRY principles before execution begins. This project addresses the critical pain point of AI agents rushing into implementation without adequate context or planning, which often leads to brittle code and scope creep. By mandating a ‘subagent-driven-development’ phase where plans are reviewed and tasks are broken down, it significantly increases the autonomy and reliability of long-running agent sessions. The framework effectively bridges the gap between human intent and machine execution by institutionalizing software engineering best practices within the agent’s prompt logic. The framework supports multiple platforms including Claude Code, Cursor, Codex, OpenCode, and GitHub Copilot CLI via native plugin marketplaces or manual configuration. Its core methodology involves teasing out specifications in digestible chunks and generating implementation plans suitable for junior engineers before any code is written. Users can install the tool directly through platform-specific commands, enabling automatic skill triggering without complex setup.</p>

<p>rss · GitHub Trending - Daily · Apr 12, 01:32</p>

<p><strong>Background</strong>: Prior to frameworks like Superpowers, most AI coding assistants operated on a direct request-to-code basis, often skipping crucial design and testing phases. This lack of structured workflow resulted in outputs that required heavy human refactoring and failed to adhere to strict engineering standards like Test-Driven Development. Superpowers fills this niche by acting as a middleware layer that imposes discipline on the agent’s reasoning process, transforming it from a simple code generator into a systematic development partner.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://grokipedia.com/page/Superpowers_agentic_skills_framework">Superpowers (agentic skills framework)</a></li>
<li><a href="https://en.wikipedia.org/wiki/YAGNI_principle">YAGNI principle</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: While the project has gained traction for its methodological rigor, early adopters note that its effectiveness relies heavily on the underlying model’s ability to follow complex multi-step instructions without hallucinating constraints. Some users are currently evaluating how well the ‘subagent’ delegation scales when handling large-scale refactoring tasks compared to single-agent workflows.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#software-development</code>, <code class="language-plaintext highlighter-rouge">#workflow-automation</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#framework</code></p>

<hr />

<p><a id="item-43"></a></p>
<h2 id="ralph-autonomous-ai-agent-loop-for-prd-execution-️-7010"><a href="https://github.com/snarktank/ralph">Ralph: Autonomous AI Agent Loop for PRD Execution</a> ⭐️ 7.0/10</h2>

<p>Ralph introduces a documented pattern for autonomous AI agents that iteratively execute coding tools until product requirement document (PRD) items are completed. It manages persistent state across fresh context windows by leveraging git history and local files like progress.txt. The project supports both Amp and Claude Code as underlying execution engines. This tool addresses the critical engineering challenge of maintaining context in long-running autonomous agent tasks without requiring a novel underlying framework. By orchestrating existing powerful coding models through a simple loop, it enables reliable completion of complex features defined in PRDs. It demonstrates a practical approach to overcoming token limit constraints by resetting context while preserving memory via the filesystem. This lowers the barrier for engineers to implement robust agentic workflows using familiar tools. Ralph operates by converting markdown PRDs into a structured JSON format that guides the agent’s iteration loop. It requires minimal setup, offering options to copy scripts locally or install skills globally for Amp and Claude Code. The workflow includes automatic handoff configurations to handle stories that exceed single context windows.</p>

<p>rss · GitHub Trending - TypeScript · Apr 12, 01:39</p>

<p><strong>Background</strong>: Autonomous AI agents often struggle with context limits when tackling multi-step development tasks, leading to lost progress or hallucinated states. Prior solutions frequently rely on complex vector databases or proprietary frameworks to manage long-term memory. Ralph fills a niche by providing a lightweight, file-system-based orchestration layer that works with off-the-shelf CLI coding tools. It builds upon Geoffrey Huntley’s original pattern to offer a standardized, reproducible method for iterative development.</p>
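<p><strong>Illustration</strong>: Mechanically, the pattern is a small outer loop: read the structured PRD, find the next incomplete story, hand it to a fresh agent invocation, then persist progress to the filesystem and git so the next iteration starts from durable state. The sketch below is a generic rendering of that loop, not Ralph’s own scripts; the file names mirror those described above, but the agent command is a placeholder.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Generic rendering of the Ralph-style loop (not the project's own scripts).
# The agent command below is a placeholder; prd.json / progress.txt mirror
# the files described above.
import json
import subprocess

def next_incomplete_story(prd):
    return next((s for s in prd["stories"] if not s.get("done")), None)

def run_loop(prd_path="prd.json", progress_path="progress.txt"):
    while True:
        prd = json.load(open(prd_path))
        story = next_incomplete_story(prd)
        if story is None:
            break                                   # every PRD item completed
        # Fresh context each iteration; durable memory lives in files and git.
        subprocess.run(["my-coding-agent", "--task", story["title"]], check=True)
        story["done"] = True
        json.dump(prd, open(prd_path, "w"), indent=2)
        with open(progress_path, "a") as f:
            f.write("completed: %s\n" % story["title"])
        subprocess.run(["git", "commit", "-am", "ralph: " + story["title"]], check=True)
</code></pre></div></div>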

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Large_language_model">Large language model - Wikipedia</a></li>
<li><a href="https://www.ibm.com/think/topics/large-language-models">What Are Large Language Models (LLMs)? | IBM</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project has gained traction for its practical utility, with users highlighting its effectiveness in managing large feature implementations without custom infrastructure. Discussions focus on the simplicity of using git as a memory mechanism compared to more complex vector store approaches.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#typescript</code>, <code class="language-plaintext highlighter-rouge">#llm-orchestration</code></p>

<hr />

<p><a id="item-44"></a></p>
<h2 id="rowboat-open-source-ai-coworker-with-local-memory-️-7010"><a href="https://github.com/rowboatlabs/rowboat">Rowboat: Open-Source AI Coworker with Local Memory</a> ⭐️ 7.0/10</h2>

<p>Rowboat introduces an open-source AI coworker that builds a persistent knowledge graph from emails and meeting notes to enable context-aware task execution. It operates locally on the user’s machine, integrating with Google services and supporting voice I/O via Deepgram and ElevenLabs. The platform allows users to query their work history naturally to generate briefs, roadmaps, or track specific topics. This project addresses the critical limitation of current AI agents lacking long-term memory and persistent context across sessions. By localizing data processing and storing context as an editable Markdown-based knowledge graph, it offers a privacy-first alternative to cloud-dependent AI assistants. This approach empowers developers to maintain full control over their proprietary data while leveraging autonomous agent capabilities for complex workflows. The system converts unstructured inputs like emails and voice memos into a structured knowledge graph that users can visualize and edit directly. It supports optional integrations for web search via Exa and external tools through MCP servers or Composio. Installation requires configuring API keys for specific services in local JSON files, emphasizing a modular and self-hosted architecture.</p>

<p>rss · GitHub Trending - TypeScript · Apr 12, 01:39</p>

<p><strong>Background</strong>: Most existing AI productivity tools rely on ephemeral chat contexts or opaque cloud databases, making them unsuitable for handling sensitive corporate data or maintaining long-term project continuity. Rowboat fills this niche by combining the autonomy of AI agents with a transparent, local-first knowledge management system. Unlike prior solutions that treat memory as a black box, Rowboat exposes the underlying graph as plain text files, allowing for manual verification and correction.</p>
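<p><strong>Illustration</strong>: Because the knowledge graph is stored as plain Markdown, it can be inspected or extended with ordinary file tooling. The sketch below writes a toy entity note with wiki-style links; the folder layout and link syntax are assumptions for illustration, not Rowboat’s documented format.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Toy example of a file-based, Markdown knowledge-graph node.
# Folder layout and link syntax are assumptions, not Rowboat's actual format.
from pathlib import Path

GRAPH_DIR = Path("knowledge-graph")

def write_entity(name, summary, related):
    GRAPH_DIR.mkdir(exist_ok=True)
    links = "\n".join("- [[%s]]" % r for r in related)
    note = "# %s\n\n%s\n\n## Related\n%s\n" % (name, summary, links)
    (GRAPH_DIR / (name.replace(" ", "-") + ".md")).write_text(note)

write_entity(
    "Q3 launch plan",
    "Synthesized from the 2026-04-10 planning email thread and standup notes.",
    ["Launch checklist", "Beta feedback summary"],
)
</code></pre></div></div>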

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#memory</code>, <code class="language-plaintext highlighter-rouge">#typescript</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code></p>

<hr />

<p><a id="item-45"></a></p>
<h2 id="gpumd-high-performance-gpu-molecular-dynamics-engine-️-7010-1"><a href="https://github.com/brucefan1983/GPUMD">GPUMD: High-Performance GPU Molecular Dynamics Engine</a> ⭐️ 7.0/10</h2>

<p>GPUMD is a specialized molecular dynamics package optimized to run entirely on NVIDIA GPUs using CUDA. It delivers significant acceleration for simulating atomic interactions compared to traditional CPU-based methods. This tool enables researchers to model larger systems and longer time scales with high efficiency. Molecular dynamics simulations are computationally expensive, often limiting the scope of research in materials science and chemistry. By leveraging massive GPU parallelism, GPUMD reduces simulation times from weeks to hours for specific workloads. This acceleration allows scientists to iterate faster on hypotheses regarding material properties and chemical reactions. Although not an AI model trainer, it complements AI-driven discovery by generating the large datasets needed for machine learning potentials. The software implements efficient algorithms for neighbor list construction and force calculations directly on the GPU. It supports various interatomic potentials and is designed for scalability across multiple GPU nodes. Users can expect substantial speedups for systems involving thousands to millions of atoms.</p>

<p>rss · GitHub Trending - CUDA · Apr 12, 01:33</p>

<p><strong>Background</strong>: Traditional molecular dynamics codes like LAMMPS or GROMACS have historically relied on CPU clusters, which can become bottlenecks for large-scale simulations. While some CPU codes now offer GPU offloading, GPUMD was built from the ground up to maximize GPU utilization without CPU dependency for the core loop. This architecture addresses the need for extreme performance in computational physics where standard hardware falls short.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Molecular_dynamics_simulation">Molecular dynamics simulation</a></li>
<li><a href="https://grokipedia.com/page/Thread_block_(CUDA_programming)">Thread block (CUDA programming)</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project is recognized within the computational chemistry community for its niche focus on pure GPU acceleration. Developers and users actively discuss optimization techniques for specific potential functions and multi-GPU scaling strategies.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#molecular-dynamics</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#hpc</code>, <code class="language-plaintext highlighter-rouge">#computational-chemistry</code>, <code class="language-plaintext highlighter-rouge">#gpu</code></p>

<hr />]]></content><author><name></name></author><summary type="html"><![CDATA[From 94 items, 45 important content pieces were selected]]></summary></entry><entry xml:lang="zh"><title type="html">Horizon Summary: 2026-04-13 (ZH)</title><link href="https://ming-321.github.io/horizon/2026/04/12/summary-zh.html" rel="alternate" type="text/html" title="Horizon Summary: 2026-04-13 (ZH)" /><published>2026-04-12T16:00:00+00:00</published><updated>2026-04-12T16:00:00+00:00</updated><id>https://ming-321.github.io/horizon/2026/04/12/summary-zh</id><content type="html" xml:base="https://ming-321.github.io/horizon/2026/04/12/summary-zh.html"><![CDATA[<blockquote>
  <p>From 94 items, 45 important content pieces were selected</p>
</blockquote>

<hr />

<h3 id="头条速递">头条速递</h3>
<ol>
  <li><a href="#item-1">KIV 通过分层 KV 缓存在 RTX 4070 上实现 100 万 token 上下文</a> ⭐️ 9.0/10</li>
  <li><a href="#item-2">MiniMax 在 Hugging Face 发布开源权重的 M2.7 模型</a> ⭐️ 9.0/10</li>
  <li><a href="#item-3">Anthropic 推出全托管 Claude 代理 Beta 版</a> ⭐️ 9.0/10</li>
  <li><a href="#item-4">中国团队发布首个含 36.4 万图文对的大规模超声专属数据集</a> ⭐️ 8.0/10</li>
  <li><a href="#item-5">分析称大语言模型逆向学习且缩放定律存在上限</a> ⭐️ 8.0/10</li>
  <li><a href="#item-6">新 PyTorch 仓库从零开始教授分布式训练</a> ⭐️ 8.0/10</li>
  <li><a href="#item-7">llama.cpp 为 Gemma-4 模型添加原生音频支持</a> ⭐️ 8.0/10</li>
  <li><a href="#item-8">Gemma 4 31B 通过投机解码在代码生成上提速 50%</a> ⭐️ 8.0/10</li>
  <li><a href="#item-9">GLM-5.1 在社交推理任务中媲美前沿模型且成本更低</a> ⭐️ 8.0/10</li>
  <li><a href="#item-10">量化版 MiniMax m2.7 在高内存 Mac 上实现 95% MMLU 准确率</a> ⭐️ 8.0/10</li>
  <li><a href="#item-11">Unsloth 发布 MiniMax M2.7 全套 GGUF 量化版本</a> ⭐️ 8.0/10</li>
  <li><a href="#item-12">LazyMoE 实现无显卡 8GB 内存运行 120B 大模型</a> ⭐️ 8.0/10</li>
  <li><a href="#item-13">MOSS-TTS-Nano：支持 CPU 实时推理的 0.1B 开源多语言 TTS 模型</a> ⭐️ 8.0/10</li>
  <li><a href="#item-14">中国首家脑机接口独角兽为机器人研发超越人手的仿生手</a> ⭐️ 7.0/10</li>
  <li><a href="#item-15">Gary Marcus 批评泄露的 Claude 代码为符号人工智能</a> ⭐️ 7.0/10</li>
  <li><a href="#item-16">数据分析显示 ICLR 2026 审稿人一致性急剧下降</a> ⭐️ 7.0/10</li>
  <li><a href="#item-17">MiniMax M2.7 发布但附带限制性非商业许可协议</a> ⭐️ 7.0/10</li>
  <li><a href="#item-18">修复版 Qwen 3.5 35B 模型发布，原生支持 Apple MLX</a> ⭐️ 7.0/10</li>
  <li><a href="#item-19">硅谷顶尖 AI 人才加速回流中国</a> ⭐️ 7.0/10</li>
  <li><a href="#item-20">杜罗夫称九成以上 WhatsApp 备份以未加密形式存储</a> ⭐️ 7.0/10</li>
</ol>

<h3 id="github-热榜">GitHub 热榜</h3>
<ol>
  <li><a href="#item-21">Karpathy 发布纯 C 和 CUDA 编写的极简 LLM 训练项目</a> ⭐️ 10.0/10</li>
  <li><a href="#item-22">SageAttention 通过量化加速模型推理</a> ⭐️ 10.0/10</li>
  <li><a href="#item-23">Instant-NGP：闪电般快速的神经图形训练框架</a> ⭐️ 10.0/10</li>
  <li><a href="#item-24">Nous Research 推出自我进化的 Hermes 智能体框架</a> ⭐️ 9.0/10</li>
  <li><a href="#item-25">VoxCPM2：无分词器的多语言语音合成与声音设计模型</a> ⭐️ 9.0/10</li>
  <li><a href="#item-26">谷歌发布面向资源受限环境的高效小型 BERT 模型</a> ⭐️ 9.0/10</li>
  <li><a href="#item-27">DeepGEMM 为 NVIDIA GPU 提供优化的 FP8 算子</a> ⭐️ 9.0/10</li>
  <li><a href="#item-28">用于 Mamba 架构的因果卷积一维 CUDA 优化库</a> ⭐️ 9.0/10</li>
  <li><a href="#item-29">微软发布 MarkItDown 助力大模型数据摄入</a> ⭐️ 8.0/10</li>
  <li><a href="#item-30">Archon：打造确定性 AI 编码工作流的开源框架</a> ⭐️ 8.0/10</li>
  <li><a href="#item-31">Multica 将自主编码智能体编排为协作队友</a> ⭐️ 8.0/10</li>
  <li><a href="#item-32">Kronos：首个面向金融 K 线图的开源基础模型</a> ⭐️ 8.0/10</li>
  <li><a href="#item-33">通过频谱分析逆向工程谷歌 SynthID 水印</a> ⭐️ 8.0/10</li>
  <li><a href="#item-34">面向 AI 代理的标准化科学技能库</a> ⭐️ 8.0/10</li>
  <li><a href="#item-35">AgentScope：面向可信多智能体系统的可视化调试框架</a> ⭐️ 8.0/10</li>
  <li><a href="#item-36">Claude-Mem 为 AI 编程会话添加持久化记忆功能</a> ⭐️ 8.0/10</li>
  <li><a href="#item-37">Qwen Code：面向开发者的终端 AI 智能体</a> ⭐️ 8.0/10</li>
  <li><a href="#item-38">AutoBE 生成保证可编译的 TypeScript 后端代码</a> ⭐️ 8.0/10</li>
  <li><a href="#item-39">NVIDIA cuopt 加速大规模路径优化求解</a> ⭐️ 8.0/10</li>
  <li><a href="#item-40">OpenDataLoader PDF：面向 RAG 的高精度多语言解析器</a> ⭐️ 7.0/10</li>
  <li><a href="#item-41">DeepTutor 推出原生智能体个性化学习系统</a> ⭐️ 7.0/10</li>
  <li><a href="#item-42">Superpowers 框架强制执行结构化代理工作流</a> ⭐️ 7.0/10</li>
  <li><a href="#item-43">Ralph：用于执行产品需求文档的自主 AI 代理循环</a> ⭐️ 7.0/10</li>
  <li><a href="#item-44">Rowboat：具备本地记忆功能的开源 AI 同事平台</a> ⭐️ 7.0/10</li>
  <li><a href="#item-45">GPUMD：高性能 GPU 分子动力学模拟引擎</a> ⭐️ 7.0/10</li>
</ol>

<h2 id="头条速递-1">头条速递</h2>

<p><a id="item-1"></a></p>
<h2 id="kiv-通过分层-kv-缓存在-rtx-4070-上实现-100-万-token-上下文-️-9010"><a href="https://old.reddit.com/r/MachineLearning/comments/1sjkmwz/kiv_1m_token_context_window_on_a_rtx_4070_12gb/">KIV 通过分层 KV 缓存在 RTX 4070 上实现 100 万 token 上下文</a> ⭐️ 9.0/10</h2>

<p>一种名为 KIV（K-Indexed V Materialization）的新中间件通过用分层检索系统替换标准 KV 缓存，使 RTX 4070 等消费级 GPU 能够处理 100 万 token 的上下文窗口。该方法将最近的键值对保留在显存中，同时将旧数据卸载到系统内存，并利用 K 向量作为索引在解码过程中仅检索最相关的 V 条目。该方案无需重新训练模型，可作为任何使用 DynamicCache 的 HuggingFace 模型的即插即用替代品。 这一突破显著降低了在本地运行大上下文大语言模型的硬件门槛，使得在负担得起的消费级硬件上分析整个代码库或书籍等复杂任务成为可能。通过将上下文长度与显存容量解耦，KIV 挑战了当前行业依赖昂贵的企业级 GPU 进行长上下文推理的现状。如果进一步优化，这项技术可以为无法承担高端数据中心设备的开发者和研究人员普及高级 AI 能力。它标志着本地 AI 部署从粗暴的内存扩展转向智能内存管理的转变。 在配备 12GB 显存的 RTX 4070 上运行 4 位量化的 Gemma 4 E2B 时，KIV 实现了 100 万 token 上下文，总显存占用仅约 6.5GB，解码速度为每秒 4.1 个 token。虽然填充 100 万 token 需要约 4.3 分钟，但解码速度几乎不随上下文长度变化，目前主要瓶颈在于 CPU 到 GPU 的数据传输速率。该系统在 100 万 token 下消耗约 5.8GB 系统内存，并且由于碰撞消歧问题，在两跳推理和密集相似数据场景中表现出一定的局限性。</p>

<p>rss · r/MachineLearning · Apr 12, 17:23</p>

<p><strong>背景</strong>: 在 Transformer 模型中，KV 缓存存储来自先前 token 的键（Key）和值（Value）矩阵，以避免在生成过程中重新计算它们，这加速了推理但随着上下文增长会消耗大量显存。传统上，这种缓存的大小限制了 GPU 能处理的最大上下文长度，通常需要巨大的内存才能支持百万 token 的窗口。HuggingFace 的 DynamicCache 接口允许开发者自定义这些缓存的存储和管理方式，使得像 KIV 这样的创新能够在不改变模型权重的情况下拦截并优化内存使用。KIV 利用了 K 向量具有足够结构可用作搜索索引，而 V 向量过于混乱无法有效压缩的观察结果。</p>
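
<p>下面给出一个极简的 PyTorch 示意（假设性代码，并非 KIV 的原始实现，TieredKV、window、top_k 等名称均为示例），用来说明“近期 KV 驻留显存、较旧条目以 K 作索引按需取回 V”的基本思路：</p>

<pre><code class="language-python">import torch

class TieredKV:
    """极简示意：近期 KV 留在 GPU，较旧条目的 V 卸载到 CPU，K 留在 GPU 作为检索索引。"""

    def __init__(self, window=1024, top_k=64, device="cuda"):
        self.window, self.top_k, self.device = window, top_k, device
        self.hot_k, self.hot_v = [], []    # 窗口内的近期条目，常驻显存
        self.cold_k, self.cold_v = [], []  # 旧条目：K 仍在显存作索引，V 放到 CPU

    def append(self, k, v):                # k, v: [head_dim]
        self.hot_k.append(k)
        self.hot_v.append(v)
        if len(self.hot_k) &gt; self.window:  # 超出窗口的条目降级到冷层
            self.cold_k.append(self.hot_k.pop(0))
            self.cold_v.append(self.hot_v.pop(0).to("cpu"))

    def gather(self, q):                   # q: 当前解码步的查询向量 [head_dim]
        k, v = torch.stack(self.hot_k), torch.stack(self.hot_v)
        if self.cold_k:
            ck = torch.stack(self.cold_k)
            scores = ck @ q                # 只用 K 打分，挑出最相关的旧条目
            idx = scores.topk(min(self.top_k, ck.shape[0])).indices
            cv = torch.stack([self.cold_v[i] for i in idx.tolist()]).to(self.device)
            k, v = torch.cat([ck[idx], k]), torch.cat([cv, v])
        return k, v                        # 交给注意力计算，显存中始终只保留一小部分 V
</code></pre>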

<details><summary>参考链接</summary>
<ul>
<li><a href="https://medium.com/@joaolages/kv-caching-explained-276520203249">Transformers KV Caching Explained | by João Lages | Medium</a></li>
<li><a href="https://huggingface.co/docs/transformers/en/kv_cache">Cache strategies · Hugging Face</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#optimization</code>, <code class="language-plaintext highlighter-rouge">#kv-cache</code>, <code class="language-plaintext highlighter-rouge">#local-inference</code>, <code class="language-plaintext highlighter-rouge">#huggingface</code></p>

<hr />

<p><a id="item-2"></a></p>
<h2 id="minimax-在-hugging-face-发布开源权重的-m27-模型-️-9010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sj0dm3/minimax_m27_released/">MiniMax 在 Hugging Face 发布开源权重的 M2.7 模型</a> ⭐️ 9.0/10</h2>

<p>MiniMax 正式发布了 M2.7 模型，并通过 Hugging Face 提供了权重以供本地部署。这款拥有 2300 亿参数的文本生成 AI 模型旨在编码、推理及复杂办公任务中表现卓越。值得注意的是，M2.7 被描述为该系列中首个能深度参与自身演进的模型，能够构建复杂的智能体框架并利用动态工具搜索。 发布拥有开源权重的 2300 亿参数模型，显著降低了开发者在本地实验最先进智能体工作流的门槛。此举挑战了顶级模型通常仅限于云端 API 的趋势，为对隐私敏感或需要离线应用的用户提供了强大的替代方案。通过支持如此大模型的本地运行，MiniMax 赋能开源社区在不依赖外部服务器的情况下，将先进的 AI 能力整合到定制化的生产力工具中进行优化和应用。 M2.7 模型具备构建“智能体团队”并通过动态工具搜索机制执行复杂技能的特有能力。该模型针对高度精细的生产力任务和编码进行了优化，使其区别于通用的聊天机器人。目前该模型可直接通过 Hugging Face 和 NVIDIA NIM 获取，便于集成到各种本地推理框架中。</p>

<p>rss · r/LocalLLaMA · Apr 12, 01:03</p>

<p><strong>背景</strong>: MiniMax 集团是一家总部位于上海的 AI 公司，以开发多模态模型及 Talkie 和 Hailuo AI 等消费级应用而闻名。历史上，虽然 MiniMax 为其高级模型提供基于云端的 API，但其许多最强能力的系统并未提供本地部署选项。此次转向发布如此大规模模型的开源权重，代表了一项重大的战略转变，顺应了全球开发者社区对本地化、自主可控 AI 基础设施日益增长的需求。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://huggingface.co/MiniMaxAI/MiniMax-M2.7">MiniMaxAI/ MiniMax - M 2 . 7 · Hugging Face</a></li>
<li><a href="https://build.nvidia.com/minimaxai/minimax-m2.7">minimax - m 2 . 7 Model by Minimaxai | NVIDIA NIM</a></li>
<li><a href="https://en.wikipedia.org/wiki/MiniMax_Group">MiniMax Group</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#model-release</code>, <code class="language-plaintext highlighter-rouge">#minimax</code>, <code class="language-plaintext highlighter-rouge">#local-llama</code></p>

<hr />

<p><a id="item-3"></a></p>
<h2 id="anthropic-推出全托管-claude-代理-beta-版-️-9010"><a href="https://platform.claude.com/docs/en/managed-agents/overview">Anthropic 推出全托管 Claude 代理 Beta 版</a> ⭐️ 9.0/10</h2>

<p>Anthropic 正式推出了 Claude Managed Agents 的 Beta 版本，这是一个预构建且可配置的代理框架，运行在全托管的云端基础设施上。该服务允许 Claude 自主执行读取文件、运行命令、浏览网页及编写代码等长时任务，开发者无需自行构建代理循环或运行时环境。该平台针对异步工作流进行了优化，并内置了提示词缓存功能以提升性能并降低成本。 此次发布标志着 AI 应用开发的重大转变，因为它抽象掉了可靠运行自主代理所需的复杂基础设施。它降低了开发者的门槛，此前这些开发者必须从头构建健壮的重试逻辑、状态管理和工具执行层。通过提供生产就绪的环境，Anthropic 使得能够处理长时间多步任务的复杂 AI 代理的原型设计和部署更加迅速。此举直接与新兴的其他代理框架竞争，并可能加速 AI 在企业自动化场景中的采用。 该服务目前支持开发者在执行过程中实时引导或中断代理动作，确保保留人工监督的可能性。虽然 API 现已可用，但多代理协作和长期记忆等高级功能仍处于研究预览阶段。用户需注意 API 的具体频率限制，目前每分钟最高支持 60 次创建请求和 600 次读取请求。</p>

<p>telegram · zaihuapd · Apr 12, 07:38</p>

<p><strong>背景</strong>: 在 AI 开发中，“代理循环”（agent loop）指的是反复提示大语言模型、解析其输出、执行工具并将结果反馈直到任务完成的软件逻辑。手动构建这些循环极具挑战性，因为它需要处理错误、管理对话历史并确保执行环境免受恶意代码侵害。提示词缓存（Prompt caching）是一种存储部分对话上下文的技术，使模型无需重新处理静态信息，从而显著降低长会话的延迟和代币成本。托管服务旨在通过提供一个标准化的安全容器来让代理在其中安全运行，从而解决这些工程难题。</p>
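
<p>下面用一段与具体供应商无关的 Python 伪实现来示意“代理循环”的基本结构（call_model 与 run_tool 为假设的占位函数，并非 Anthropic 托管代理的实际 API）：</p>

<pre><code class="language-python">def agent_loop(task, call_model, run_tool, max_steps=20):
    """极简代理循环示意：反复调用模型、执行工具、回填结果，直到模型宣布完成。"""
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = call_model(history)      # 假设返回 {"final": ...} 或 {"tool": ..., "args": ...}
        if "final" in reply:
            return reply["final"]
        try:
            result = run_tool(reply["tool"], reply["args"])
        except Exception as exc:         # 托管服务要替开发者处理的，正是这里的重试与隔离
            result = f"tool error: {exc}"
        history.append({"role": "assistant", "content": str(reply)})
        history.append({"role": "tool", "content": str(result)})
    return "max steps reached"
</code></pre>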

<details><summary>参考链接</summary>
<ul>
<li><a href="https://platform.claude.com/docs/en/managed-agents/overview">Claude Managed Agents overview - Claude API Docs</a></li>
<li><a href="https://www.anthropic.com/engineering/managed-agents">Scaling Managed Agents: Decoupling the brain from ...</a></li>
<li><a href="https://www.ibm.com/think/topics/prompt-caching">What is Prompt Caching? | IBM</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#anthropic</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#automation</code></p>

<hr />

<p><a id="item-4"></a></p>
<h2 id="中国团队发布首个含-364-万图文对的大规模超声专属数据集-️-8010"><a href="https://www.qbitai.com/2026/04/399975.html">中国团队发布首个含 36.4 万图文对的大规模超声专属数据集</a> ⭐️ 8.0/10</h2>

<p>中国研究团队构建了首个专为超声影像设计的大规模数据集，包含 36.4 万个图文对。该数据集旨在训练 AI 模型深入理解临床诊断语义，而不仅仅是识别视觉模式。这项成果已被计算机视觉顶级会议 CVPR 2026 接收。 此次发布是医疗 AI 领域的重要里程碑，标志着研究重点从通用图像识别转向超声数据的专用语义理解。通过提供海量的临床文本与图像配对数据，它使得训练能够同时解读诊断报告和扫描影像的大型多模态模型成为可能。这一进展解决了此前阻碍可靠 AI 助手在超声诊断中部署的高质量领域特定数据稀缺问题。最终，这有望显著提高全球医疗环境中的诊断准确性和效率。 该数据集精确包含 36.4 万个图文对，是已知规模最大的专注于超声模态的集合。其专门设计用于帮助 AI 模型掌握超声视觉图像与临床诊断描述之间复杂的语义关系。相关研究将在定于 2026 年 6 月在科罗拉多会议中心举行的 CVPR 2026 大会上展示。</p>

<p>rss · 量子位 · Apr 12, 07:21</p>

<p><strong>背景</strong>: 超声成像是广泛使用的医疗诊断工具，但由于缺乏大型标注数据集，将人工智能应用于此一直充满挑战。与普通摄影不同，超声图像需要专家解读，必须将视觉特征与特定的临床术语和诊断代码相关联。最近的 AI 进展已转向大型多模态模型，这些模型从配对的图像和文本中学习，类似于人类从包含图片和解释的教科书中学习的方式。然而，在此次发布之前，大多数可用的医疗数据集要么规模太小，要么专注于 X 射线或 MRI 等其他模态，导致超声在大型 AI 模型时代代表性不足。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://cvpr.thecvf.com/">2026 Conference</a></li>
<li><a href="https://pubs.rsc.org/en/content/articlehtml/2025/sd/d5sd00146c">Artificial intelligence (Al) in healthcare diagnosis: evidence-based recent advances and clinical implications - Sensors &amp; Diagnostics (RSC Publishing) DOI:10.1039/D5SD00146C</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#medical-ai</code>, <code class="language-plaintext highlighter-rouge">#computer-vision</code>, <code class="language-plaintext highlighter-rouge">#datasets</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#healthcare</code></p>

<hr />

<p><a id="item-5"></a></p>
<h2 id="分析称大语言模型逆向学习且缩放定律存在上限-️-8010"><a href="https://old.reddit.com/r/MachineLearning/comments/1sj888x/llms_learn_backwards_and_the_scaling_hypothesis/">分析称大语言模型逆向学习且缩放定律存在上限</a> ⭐️ 8.0/10</h2>

<p>Reddit 上分享的一项新技术分析指出，大语言模型（LLM）获取模式的顺序与人类学习相反，它们先掌握复杂结构再理解简单规则。作者还认为主流的缩放假设存在根本性的上限，这意味着随着算力的增加，性能提升最终会达到平台期而非无限持续。这一观点挑战了仅靠增加模型规模和数据就能确保持续获得比例提升的普遍假设。 这项分析意义重大，因为它直接质疑了当前人工智能发展的经济和战略基础，而这些发展很大程度上依赖于“越大越好”的信念。如果缩放定律确实存在上限，行业可能会比预期更早面临收益递减，从而需要转向更高效的架构或新颖的训练方法，而非依靠蛮力扩展。此外，“逆向学习”的概念可能会重塑我们对这些模型泛化能力的理解，潜在地揭示出它们与人类认知不同的推理盲区。最终，这可能会影响未来的研究资金分配以及实现通用人工智能（AGI）的时间表。 该链接的分析提出，虽然人类通常先学习简单规则再掌握复杂例外，但大语言模型似乎首先拟合复杂的统计相关性，随后才近似简单的底层逻辑。论点表明，通常被建模为幂律的神经缩放定律，如果在足够大的范围内观察，实际上可能遵循 S 形函数（sigmoid function），这意味着性能存在硬性上限。这些主张是作为基于观察到的学习动态的理论批评提出的，而非带有具体数值结果的新实证基准。</p>

<p>rss · r/MachineLearning · Apr 12, 07:51</p>

<p><strong>背景</strong>: 神经缩放定律是描述模型性能如何随着模型规模、数据集大小和计算预算等因素的增加而可预测地提高的经验观察。历史上，这些关系一直被建模为幂律，助长了连续缩放可能导致任意高智能的假设。然而，最近的讨论引入了“逆向缩放”（inverse scaling）的概念，即更大的模型在某些任务上表现反而更差，以及有数学论证指出有界指标（如准确率）最终必然饱和。理解这些限制对于区分暂时的成长烦恼与进步的根本障碍至关重要。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Neural_scaling_law">Neural scaling law - Wikipedia</a></li>
<li><a href="https://arxiv.org/html/2507.00885v1">Scaling Laws Are Unreliable for Downstream Tasks: A Reality Check</a></li>
<li><a href="https://cameronrwolfe.substack.com/p/llm-scaling-laws">Scaling Laws for LLMs: From GPT-3 to o3</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#scaling-laws</code>, <code class="language-plaintext highlighter-rouge">#machine-learning-research</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code></p>

<hr />

<p><a id="item-6"></a></p>
<h2 id="新-pytorch-仓库从零开始教授分布式训练-️-8010"><a href="https://old.reddit.com/r/MachineLearning/comments/1sjglrn/educational_pytorch_repo_for_distributed_training/">新 PyTorch 仓库从零开始教授分布式训练</a> ⭐️ 8.0/10</h2>

<p>用户 shreyansh26 发布了一个新的开源仓库，提供了数据并行 (DP)、完全分片数据并行 (FSDP)、张量并行 (TP) 和流水线并行 (PP) 等主要分布式训练技术的从零实现。该代码不依赖 PyTorch 的高级抽象，而是手动编写前向和反向逻辑以及集合通信操作，以揭示底层算法。该项目使用包含重复双矩阵乘法 MLP 块的简单合成任务来隔离并阐明通信模式，其灵感来源于 JAX ML Scaling 书籍。 这一资源意义重大，因为它揭开了通常被框架魔法掩盖的复杂分布式训练策略的神秘面纱，使开发人员能够真正理解梯度与参数如何在设备间同步。通过将数学概念直接映射为可运行代码，它为学生和研究人员架起了理论研究论文与实际工程实现之间的桥梁。随着模型规模增大且需要多 GPU 设置，理解这些底层机制对于调试性能瓶颈和优化自定义架构变得至关重要。与通常假设读者已具备集合操作知识的现有文档相比，它是一个至关重要的教育工具。 该仓库刻意避免使用高级 API，迫使用户直接接触显式的前向/反向传递以及诸如 AllReduce 之类的集合通信原语。模型架构被简化为合成任务上重复的双矩阵乘法 MLP 块，确保重点严格放在通信模式而非模型复杂性上。这种方法基于 JAX ML Scaling 书籍的第五部分，将其教学风格适配到了 PyTorch 生态系统中。用户需注意，这是一个用于学习算法的教育工具，而非用于训练大规模模型的生产级库。</p>

<p>rss · r/MachineLearning · Apr 12, 14:51</p>

<p><strong>背景</strong>: 分布式训练对于现代深度学习至关重要，当模型超出单个设备的内存容量时，它允许在多个 GPU 或节点上进行训练。数据并行技术在设备上复制模型同时分割数据，而张量并行和流水线并行则分割模型本身以处理巨大的参数量。完全分片数据并行 (FSDP) 是一种高级方法，它对模型参数、梯度和优化器状态进行分片以最大化内存效率。理解诸如 AllReduce 之类的“集合通信”是这些方法的基础，因为它们协调分布式系统中的数据同步。</p>
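
<p>下面是一段手写 AllReduce 梯度同步的极简示意（假设进程组已初始化，并非该仓库的原始代码），对应数据并行中“各自前向反向、集合通信求平均”的核心步骤：</p>

<pre><code class="language-python">import torch
import torch.distributed as dist

def train_step(model, batch, loss_fn, lr=1e-3):
    """数据并行单步示意：各进程独立前向/反向，再用 AllReduce 同步梯度。
    假设已通过 dist.init_process_group(...) 完成初始化。"""
    loss = loss_fn(model(batch["x"]), batch["y"])
    loss.backward()
    world = dist.get_world_size()
    for p in model.parameters():
        dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)  # 集合通信：所有进程的梯度求和
        p.grad.div_(world)                              # 取平均，保持与单进程大批量等效
    with torch.no_grad():
        for p in model.parameters():
            p.add_(p.grad, alpha=-lr)                   # 最简 SGD 更新
            p.grad = None
    return loss.item()
</code></pre>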

<details><summary>参考链接</summary>
<ul>
<li><a href="https://docs.nersc.gov/machinelearning/distributed-training/">Distributed training - NERSC Documentation</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#pytorch</code>, <code class="language-plaintext highlighter-rouge">#distributed-training</code>, <code class="language-plaintext highlighter-rouge">#machine-learning</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#education</code></p>

<hr />

<p><a id="item-7"></a></p>
<h2 id="llamacpp-为-gemma-4-模型添加原生音频支持-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sjhxrw/audio_processing_landed_in_llamaserver_with_gemma4/">llama.cpp 为 Gemma-4 模型添加原生音频支持</a> ⭐️ 8.0/10</h2>

<p>llama.cpp 项目已正式将语音转文本（STT）处理功能合并到其 llama-server 组件中，专门启用了谷歌的 Gemma-4 E2A 和 E4A 模型。此次更新通过添加 Conformer 音频编码器的拉取请求得以确认，允许用户在不依赖外部转录服务的情况下原生处理音频输入。这一集成标志着这些特定的多模态 Gemma-4 变体首次能够在流行的本地推理框架内端到端地运行音频任务。 这一进展意义重大，因为它消除了以往在本地 AI 设置中需要单独工具进行转录和文本生成的复杂多服务管道需求。通过将音频能力直接嵌入 llama-server，开发者现在可以使用谷歌最先进的开放权重构建完全离线且保护隐私的语音助手。它从根本上改变了本地部署的工作流程，使开源社区能够像文本聊天一样轻松地进行实时语音交互。此外，这也验证了向真正多模态模型发展的趋势，即在单个二进制文件中处理多种输入类型。 该实现专门针对 Gemma-4 E2A 和 E4A 模型变体，这些变体设计了音频 Conformer 编码器以同时处理语音和文本输入。用户需要确保运行包含已合并 ‘mtmd’ 音频支持的最新版本 llama-server 才能使用这些功能。虽然这实现了强大的本地语音交互，但目前它依赖于特定的 Gemma-4 架构，而非为所有具备音频能力的模型提供通用适配器。</p>

<p>rss · r/LocalLLaMA · Apr 12, 15:42</p>

<p><strong>背景</strong>: llama.cpp 是一个被广泛采用的 C++ 库，以在消费级硬件上高效运行大型语言模型而闻名，常作为 Ollama 和 LM Studio 等工具的后端。历史上，为这些本地模型添加语音功能需要将独立的语音转文本引擎（如 Whisper）与语言模型串联起来，从而增加了延迟和复杂性。谷歌的 Gemma 系列代表其开放权重模型家族，其中 Gemma-4 引入了包括音频处理在内的原生多模态能力。提到的 ‘Conformer’ 架构是一种专门用于识别语音等序列数据模式的神经网络设计。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://deepmind.google/models/gemma/gemma-4/">Gemma 4 — Google DeepMind</a></li>
<li><a href="https://ai.google.dev/gemma/docs/core/model_card_4">Gemma 4 model card | Google AI for Developers</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llama.cpp</code>, <code class="language-plaintext highlighter-rouge">#gemma</code>, <code class="language-plaintext highlighter-rouge">#speech-to-text</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#local-ai</code></p>

<hr />

<p><a id="item-8"></a></p>
<h2 id="gemma-4-31b-通过投机解码在代码生成上提速-50-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sjct6a/speculative_decoding_works_great_for_gemma_4_31b/">Gemma 4 31B 通过投机解码在代码生成上提速 50%</a> ⭐️ 8.0/10</h2>

<p>社区基准测试表明，在 RTX 5090 GPU 上使用 Gemma 4 E2B (4.65B) 作为 Gemma 4 31B 的草稿模型可显著提升推理速度。测试结果显示平均速度提升了 29%，其中代码生成任务的每秒令牌数提高了 50.5%。关键在于，作者发现必须匹配目标模型和草稿模型之间的 <code class="language-plaintext highlighter-rouge">add_bos_token</code> 元数据，以避免导致性能下降的令牌翻译开销。 这一发现意义重大，因为它提供了一种实用方法，无需额外硬件即可将大型开源模型的代码生成速度提高近一倍。它强调了投机解码的效果高度依赖于任务类型，在为代码等结构化输出提供巨大增益的同时，对创意写作的提升则较为有限。此外，关于元数据兼容性陷阱的发现防止了用户在配置错误的设置上浪费时间，这些错误设置反而可能降低推理速度。这直接影响了部署本地大语言模型的开发者，使高参数量模型在实时编码辅助中更加响应迅速。 基准测试在 Windows 11 上进行，使用配备 32GB 显存的 RTX 5090，并采用了带有 TurboQuant KV 缓存的 llama.cpp 分支。虽然代码生成在 60.7% 的接受率下实现了 50.5% 的加速，但韩语诗歌由于接受率仅为 44.1%，加速效果只有 9.5%。研究警告称，如果主模型和草稿模型的 GGUF 文件中 <code class="language-plaintext highlighter-rouge">add_bos_token</code> 设置不一致，系统将回退到缓慢的令牌翻译模式，导致速度从约 57 t/s 急剧下降到约 7 t/s。</p>

<p>rss · r/LocalLLaMA · Apr 12, 12:08</p>

<p><strong>背景</strong>: 投机解码是一种优化技术，其中较小且较快的“草稿”模型预测多个未来令牌，然后由更大、更准确的“目标”模型并行验证这些预测。该过程减少了逐个生成令牌时的内存受限延迟，如果草稿模型的预测经常被接受，潜在地可将推理速度提高 2-3 倍。为了高效工作，两个模型必须共享完全相同的词汇表和分词器配置，以避免昂贵的转换步骤。Gemma 4 系列包括各种尺寸，例如 31B 参数模型和较小的 E2B 变体，它们被设计为可在此类配对中兼容。</p>
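
<p>下面是一个简化为贪心验证的投机解码示意（省略了原方法中的拒绝采样细节；target、draft 假设为 HuggingFace 风格的因果语言模型，仅作说明）：</p>

<pre><code class="language-python">import torch

@torch.no_grad()
def speculative_step(target, draft, ids, k=4):
    """贪心版投机解码示意：草稿模型先猜 k 个 token，目标模型一次前向验证。"""
    proposal = ids
    for _ in range(k):                                  # 草稿模型逐个生成候选
        logits = draft(proposal).logits[:, -1]
        proposal = torch.cat([proposal, logits.argmax(-1, keepdim=True)], dim=-1)
    preds = target(proposal).logits.argmax(-1)          # 目标模型并行验证所有位置
    accepted = ids
    for i in range(ids.shape[1], proposal.shape[1]):
        if proposal[0, i] != preds[0, i - 1]:           # 在第一个分歧处停止接受
            break
        accepted = proposal[:, : i + 1]
    return accepted                                      # 接受率越高，等效加速越明显
</code></pre>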

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.bentoml.com/llm/inference-optimization/speculative-decoding">Speculative decoding | LLM Inference Handbook</a></li>
<li><a href="https://lmstudio.ai/docs/app/advanced/speculative-decoding">Speculative Decoding | LM Studio Docs</a></li>
<li><a href="https://huggingface.co/google/gemma-4-E2B-it">google/ gemma - 4 - E 2 B -it · Hugging Face</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#speculative-decoding</code>, <code class="language-plaintext highlighter-rouge">#llm-optimization</code>, <code class="language-plaintext highlighter-rouge">#gemma</code>, <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#inference-speed</code></p>

<hr />

<p><a id="item-9"></a></p>
<h2 id="glm-51-在社交推理任务中媲美前沿模型且成本更低-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sjm407/glm_51_sits_alongside_frontier_models_in_my/">GLM-5.1 在社交推理任务中媲美前沿模型且成本更低</a> ⭐️ 8.0/10</h2>

<p>一项基于社交推理游戏《血染钟楼》的社区基准测试显示，GLM-5.1 在性能上可与 Claude Opus 4.6 相媲美，同时成本显著降低。具体而言，GLM-5.1 每局游戏的成本为 0.92 美元，而 Claude Opus 4.6 为 3.69 美元，且在自主游戏过程中保持了 0% 的工具错误率。这些数据表明，GLM-5.1 能够有效处理通常困扰早期版本的复杂长程代理任务。 这一发现意义重大，因为它表明高水平的社交推理和战略规划不再需要依赖最昂贵的前沿模型才能有效执行。对于开发自主代理或多代理模拟的开发者而言，GLM-5.1 提供了在不牺牲竞争力的前提下将运营成本降低四倍的潜力。在《血染钟楼》这样充满欺骗和复杂性的环境中保持低错误率的能力，表明其具备适用于谈判或欺诈检测等现实应用的鲁棒性。此外，鉴于 GLM-5.1 据称是在华为芯片上训练并提供开放权重的，它为寻求摆脱西方专有模型依赖的地区或组织提供了一个可行的替代方案。 该基准测试专门使用了《血染钟楼》的自主游戏对局，其中 GLM-5.1 扮演邪恶阵营，展示了其欺骗和战略协调的能力。虽然作者指出需要更多对局以获得完全可靠的统计数据，但当前结果已显示出两款模型之间鲜明的性价比对比。测试突显了 GLM-5.1 拥有 0% 的工具错误率，表明其在执行游戏动作时具有极强的可靠性，未出现技术性故障。</p>

<p>rss · r/LocalLLaMA · Apr 12, 18:18</p>

<p><strong>背景</strong>: GLM-5.1 是由智谱 AI（Zhipu AI/Z.ai）开发的大型语言模型，旨在比那些容易过早陷入瓶颈的前代模型更有效地处理长程代理任务。《血染钟楼》是一款复杂的社交推理棋盘游戏，玩家必须通过对话、撒谎和逻辑分析来推断隐藏身份，使其成为测试 AI 社交智能的绝佳压力测试。在 AI 行业中，“前沿模型”指的是当前能力最强的系统（如 Claude Opus），常被用作衡量新发布模型的黄金标准。随着 AI 从简单的聊天机器人转变为能够在动态多方环境中互动的自主代理，社交推理基准测试变得日益重要。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://huggingface.co/zai-org/GLM-5.1">zai-org/ GLM - 5 . 1 · Hugging Face</a></li>
<li><a href="https://wavespeed.ai/blog/posts/glm-5-1-vs-claude-gpt-gemini-deepseek-llm-comparison/">GLM - 5 . 1 vs Claude, GPT, Gemini, DeepSeek... | WaveSpeedAI Blog</a></li>
<li><a href="https://en.wikipedia.org/wiki/Blood_on_the_Clocktower">Blood on the Clocktower - Wikipedia</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#glm-5.1</code>, <code class="language-plaintext highlighter-rouge">#llm-benchmarking</code>, <code class="language-plaintext highlighter-rouge">#cost-efficiency</code>, <code class="language-plaintext highlighter-rouge">#social-reasoning</code>, <code class="language-plaintext highlighter-rouge">#local-llama</code></p>

<hr />

<p><a id="item-10"></a></p>
<h2 id="量化版-minimax-m27-在高内存-mac-上实现-95-mmlu-准确率-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sjakko/minimax_m27_mac_only_63gb_88_and_89gb_95_mmlu_200q/">量化版 MiniMax m2.7 在高内存 Mac 上实现 95% MMLU 准确率</a> ⭐️ 8.0/10</h2>

<p>一位社区成员成功在配备高统一内存的 Apple Silicon Mac 上部署了量化版的 MiniMax m2.7 模型。具体而言，63GB 版本在 200 题的 MMLU 基准测试中达到了 88% 的准确率，而 89GB 版本则达到了 95%。这些模型现已通过用户 JANGQ-AI 创建的 Hugging Face 仓库供本地推理使用。 这一成就表明，消费级的 Apple 硬件现在能够运行接近最先进水平的大型语言模型，其性能可与 Claude Sonnet 等顶级云 API 相媲美。这大大降低了在本地运行强大 AI 的门槛，提供了增强的隐私保护和零延迟推理，无需依赖外部服务器。该结果暗示，像 M5 Max 这样的未来芯片可能会进一步缩小本地设备与企业级 AI 集群之间的差距。这种转变使开发者和研究人员能够完全离线地实验先进模型。 报告的绩效指标包括 63GB 模型在 MMLU 200 题子集上达到 88% 的准确率，而 89GB 模型达到 95%。帖子推测未来的 M5 Max 芯片可能达到每秒 50 个 token 和每分钟 400 个提示的速度。这些特定的量化模型目前专为具有足够统一内存以加载大型权重文件的 macOS 环境优化。用户可以通过标记为’JANG_2L’和’JANG_3L’的提供的 Hugging Face 链接直接访问这些模型。</p>

<p>rss · r/LocalLLaMA · Apr 12, 10:08</p>

<p><strong>背景</strong>: MMLU（大规模多任务语言理解）是用于评估 AI 模型在各学科知识和推理能力的标准基准。量化是一种降低模型权重精度的技术，旨在减少内存使用并提高消费级硬件上的推理速度。Apple Silicon Mac 采用统一内存架构，允许 CPU 和 GPU 访问同一个大型内存池，使其非常适合运行大型本地 LLM。量化方法的最新进展使得以前仅限于数据中心的模型能够在个人电脑上运行。</p>

<p><strong>社区讨论</strong>: 社区对性能水平接近“家用版 Sonnet 4.5”的说法展开了讨论。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#apple-silicon</code>, <code class="language-plaintext highlighter-rouge">#model-performance</code>, <code class="language-plaintext highlighter-rouge">#quantization</code>, <code class="language-plaintext highlighter-rouge">#minimax</code></p>

<hr />

<p><a id="item-11"></a></p>
<h2 id="unsloth-发布-minimax-m27-全套-gguf-量化版本-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sj7wc8/unsloth_minimax_m27_quants_just_finished/">Unsloth 发布 MiniMax M2.7 全套 GGUF 量化版本</a> ⭐️ 8.0/10</h2>

<p>Unsloth 已成功将 MiniMax M2.7 架构的全套 GGUF 量化模型上传至 Hugging Face，范围涵盖从极致的 1-bit 压缩到完整的 BF16 精度。此次发布包含二十多种不同的变体，文件大小从 UD-IQ1_M 格式的 60.7 GB 到未压缩 BF16 版本的 457 GB 不等。这一更新为希望在本地硬件上运行该新模型的用户提供了立即可用的优化推理文件。 此次发布通过提供兼容消费级 GPU 甚至仅靠 CPU 运行的低比特量化格式，显著降低了在本地运行强大的 MiniMax M2.7 模型的门槛。通过提供如此广泛的选择，Unsloth 使开发者能够在模型性能与内存限制之间取得平衡，让先进的 AI 技术能够在多样的硬件配置上得以应用。相比等待官方或社区驱动的转换，这些量化版本的可用性立即加速了社区对 MiniMax M2.7 的测试及其在本地 LLM 工作流中的集成。此外，这也突显了 Unsloth 作为开源本地 AI 生态系统关键基础设施提供商日益重要的角色。 上传的文件包括专门的量化标签，如 UD-IQ1_M、UD-Q4_K_M 和 MXFP4_MOE，以满足从 1-bit 到 16-bit 精度范围内的特定效率需求。文件大小差异巨大，1-bit 版本仅需 60.7 GB 存储空间，而 4-bit MXFP4_MOE 变体占用 136 GB，完整的 BF16 模型则需 457 GB。用户可以直接在 Hugging Face 上的 unsloth/MiniMax-M2.7-GGUF 仓库获取这些模型，并配合兼容 llama.cpp 的工具进行即时部署。</p>

<p>rss · r/LocalLLaMA · Apr 12, 07:31</p>

<p><strong>背景</strong>: GGUF（GPT-Generated Unified Format）是一种专为存储大型语言模型设计的文件格式，支持高效量化，使得模型能够在有限的硬件上运行而不显著损失精度。量化通过降低模型权重的数值精度（例如从 16-bit 降至 4-bit），大幅减少内存占用并提高消费设备上的推理速度。Unsloth 是 AI 社区中知名的优化库和团队，常因发布高速微调工具和流行架构的即用型量化模型而受到认可。MiniMax M2.7 指的是由 MiniMax 开发的一款特定大型语言模型，需要这些量化版本才能在本地部署中具有实用性。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://ggufloader.github.io/what-is-gguf.html">What is GGUF ? Complete Guide to GGUF Format &amp; Quantization</a></li>
<li><a href="https://github.com/unslothai/unsloth">GitHub - unslothai/unsloth: Unsloth Studio is a web UI for...</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#quantization</code>, <code class="language-plaintext highlighter-rouge">#unsloth</code>, <code class="language-plaintext highlighter-rouge">#minimax</code>, <code class="language-plaintext highlighter-rouge">#huggingface</code></p>

<hr />

<p><a id="item-12"></a></p>
<h2 id="lazymoe-实现无显卡-8gb-内存运行-120b-大模型-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sjoo9z/built_lazymoe_run_120b_llms_on_8gb_ram_with_no/">LazyMoE 实现无显卡 8GB 内存运行 120B 大模型</a> ⭐️ 8.0/10</h2>

<p>一位开发者创建了 LazyMoE 系统，该系统结合了惰性专家加载、TurboQuant KV 压缩和 SSD 流式传输技术，使得仅在 8GB 内存且无独立显卡的设备上运行 1200 亿参数的混合专家（MoE）模型成为可能。该原型已在配备 Intel UHD 620 显卡的笔记本电脑上成功演示，证明了通过激进优化可以在消费级设备上运行超大模型。该项目现已作为开源仓库发布在 GitHub 上，供社区测试和反馈。 这一突破显著降低了运行最先进大语言模型的门槛，使得拥有普通笔记本电脑的用户也能访问此前仅限于高端服务器集群的功能。通过证明 1200 亿参数模型可以在 8GB 内存上运行，它挑战了大规模 AI 推理需要昂贵硬件投资的普遍假设。这一进展可能会加速本地 AI 的普及，通过数据留存设备增强隐私，并激发开源社区的进一步优化。它标志着混合专家架构的部署从以硬件为中心的扩展转向以软件为中心的效率提升。 该系统依赖三项核心技术：仅在需要时激活特定模型专家的惰性加载、用于极端压缩键值（KV）缓存的 TurboQuant，以及直接从 SSD 流式传输模型权重以绕过内存限制的技术。演示是在一台配备 Intel UHD 620 集成显卡的机器上进行的，强调操作无需独立显卡。虽然这使得访问超大模型成为可能，但由于依赖磁盘 I/O 和 CPU 处理，用户应预期其推理速度会比 GPU 加速设置慢。该代码目前是一个社区项目而非正式同行评审的论文，因此稳定性和性能在不同硬件配置下可能有所差异。</p>

<p>rss · r/LocalLLaMA · Apr 12, 19:53</p>

<p><strong>背景</strong>: 混合专家（MoE）是一种架构，其中大型模型由许多称为“专家”的小型子网络组成，每个令牌仅激活其中一部分，理论上在保持规模的同时减少了计算量。然而，存储 1200 亿参数 MoE 模型的全部参数通常需要数百 GB 的内存，远超标准消费级笔记本电脑的容量。TurboQuant 是最近讨论的一种压缩方法，旨在大幅减少推理过程中使用的键值（KV）缓存大小，而不会造成显著的精度损失。惰性加载是一种编程模式，它将对象的初始化推迟到实际需要时，在此上下文中意味着仅将活跃的专家加载到内存中。</p>
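
<p>下面用一个极简的 LRU 缓存示意“惰性专家加载”的思路（假设性代码，并非 LazyMoE 的实际实现；前提是专家权重以独立文件存放在磁盘上）：</p>

<pre><code class="language-python">import torch
from collections import OrderedDict

class LazyExperts:
    """惰性加载示意：专家权重留在磁盘，路由命中时才读入，并用小型 LRU 缓存限制内存占用。"""

    def __init__(self, expert_paths, cache_size=4):
        self.paths = expert_paths          # {expert_id: 权重文件路径}
        self.cache = OrderedDict()
        self.cache_size = cache_size

    def get(self, expert_id):
        if expert_id in self.cache:
            self.cache.move_to_end(expert_id)            # 命中：标记为最近使用
        else:
            weights = torch.load(self.paths[expert_id], map_location="cpu")  # 按需从 SSD 读取
            self.cache[expert_id] = weights
            if len(self.cache) &gt; self.cache_size:
                self.cache.popitem(last=False)           # 淘汰最久未使用的专家
        return self.cache[expert_id]
</code></pre>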

<details><summary>参考链接</summary>
<ul>
<li><a href="https://research.google/blog/turboquant-redefining-ai-efficiency-with-extreme-compression/">TurboQuant : Redefining AI efficiency with extreme compression</a></li>
<li><a href="https://github.com/ggml-org/llama.cpp/discussions/20969">TurboQuant - Extreme KV Cache Quantization</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#moe</code>, <code class="language-plaintext highlighter-rouge">#quantization</code>, <code class="language-plaintext highlighter-rouge">#optimization</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-13"></a></p>
<h2 id="moss-tts-nano支持-cpu-实时推理的-01b-开源多语言-tts-模型-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sjdfp6/mossttsnano_a_01b_opensource_multilingual_tts/">MOSS-TTS-Nano：支持 CPU 实时推理的 0.1B 开源多语言 TTS 模型</a> ⭐️ 8.0/10</h2>

<p>MOSI.AI 与 OpenMOSS 团队发布了 MOSS-TTS-Nano，这是一个仅含 1 亿参数的轻量级文本转语音模型，无需 GPU 加速即可在标准四核 CPU 上实现实时语音生成。该开源模型支持流式推理和长文本声音克隆，涵盖中文、英文、日文、韩文及阿拉伯语等多种语言。项目提供了 Python 脚本和命令行工具，旨在简化本地部署与集成流程。 此次发布显著降低了在边缘设备上部署高质量 TTS 系统的门槛，使得在缺乏 GPU 资源或成本敏感的环境中应用成为可能。通过在消费级硬件上实现实时性能，它为离线助手、嵌入式系统以及注重隐私的本地服务开辟了新的应用场景。其多语言能力进一步扩展了全球产品的实用性，使其无需依赖云端 API 即可支持多种语言。与需要巨大算力的大型模型相比，MOSS-TTS-Nano 证明了高效的架构设计能够推动技术的广泛普及。 该模型参数量仅为 1 亿，专门优化以在低至四核的 CPU 上运行，同时保持流式输出的低延迟特性。它内置了对长文本声音克隆的支持，并通过提供的 <code class="language-plaintext highlighter-rouge">infer.py</code> 和 <code class="language-plaintext highlighter-rouge">app.py</code> 文件实现了简便的安装流程。用户可以在 GitHub 上获取代码，在 Hugging Face Spaces 上体验演示，或使用团队托管的在线 Demo 进行测试。虽然效率极高，但用户应根据具体需求评估音频质量，因为极致的压缩可能会在与大型服务器端模型对比时存在某些权衡。</p>

<p>rss · r/LocalLLaMA · Apr 12, 12:38</p>

<p><strong>背景</strong>: 文本转语音（TTS）技术将书面文字转换为口语音频，传统上依赖需要强大 GPU 进行实时处理的大型神经网络。最近的边缘人工智能趋势致力于缩小模型规模，以便在手机、路由器或物联网设备等本地硬件上运行，从而降低延迟并保护用户隐私。流式推理允许逐块生成音频，而无需等待整句处理完毕，这对于交互式对话至关重要。在单个小型模型中实现多语言支持尤为具有挑战性，因为需要在有限的参数预算内学习不同语言独特的发音规则和韵律。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#tts</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#edge-ai</code>, <code class="language-plaintext highlighter-rouge">#multilingual</code>, <code class="language-plaintext highlighter-rouge">#model-release</code></p>

<hr />

<p><a id="item-14"></a></p>
<h2 id="中国首家脑机接口独角兽为机器人研发超越人手的仿生手-️-7010"><a href="https://www.qbitai.com/2026/04/399681.html">中国首家脑机接口独角兽为机器人研发超越人手的仿生手</a> ⭐️ 7.0/10</h2>

<p>中国首家脑机接口（BCI）独角兽公司宣布在专为机器人应用设计的仿生手方面取得突破。据报道，这些新设备在灵活性和控制精度上超越了人手的能力，标志着具身人工智能的重要进展。该公司旨在将这些先进的机械手直接与机器人系统集成，以实现复杂任务的执行。 这一进展意义重大，因为它弥合了高层人工智能决策与物理交互之间的差距，使机器人能够执行以前机器无法完成的精细任务。通过超越人类生理极限，这些仿生手有望彻底改变从制造业到医疗和养老护理等多个行业。这也凸显了中国在全球先进机器人和神经集成技术竞争中的日益主导地位。此外，这一进步预示着未来机器人在特定领域可能拥有媲美甚至超越人类工人的精细操作能力。 该公司被认定为中国脑机接口领域的首家独角兽企业，表明其估值已超过 10 亿美元并获得了重要的市场验证。虽然摘要中未详述自由度或传感器类型等具体技术参数，但其核心主张集中在性能指标超越人类生物标准上。该技术旨在实现人工智能的具身化，暗示了控制算法与机械硬件之间的紧密集成。</p>

<p>rss · 量子位 · Apr 12, 06:06</p>

<p><strong>背景</strong>: 仿生学涉及将自然界中发现的生物方法和系统应用于工程设计，通常用于复制或增强人类功能。灵巧机械手是先进机器人的关键组件，传统上受限于同时控制多个自由度的复杂性。脑机接口的最新进展允许更直观的控制信号，潜在地将神经意图直接转化为机械动作。历史上，机械手一直难以匹敌人手的适应性和灵敏度，因此这种声称的优越性成为一个值得注意的里程碑。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Bionics">Bionics - Wikipedia</a></li>
<li><a href="https://shadowrobot.com/dexterous-hand-series/">Shadow Dexterous Hand Series - Research and Development Tool</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#robotics</code>, <code class="language-plaintext highlighter-rouge">#brain-computer-interface</code>, <code class="language-plaintext highlighter-rouge">#bionics</code>, <code class="language-plaintext highlighter-rouge">#ai-hardware</code>, <code class="language-plaintext highlighter-rouge">#china-tech</code></p>

<hr />

<p><a id="item-15"></a></p>
<h2 id="gary-marcus-批评泄露的-claude-代码为符号人工智能-️-7010"><a href="https://old.reddit.com/r/MachineLearning/comments/1sjb0qi/gary_marcus_on_the_claude_code_leak_d/">Gary Marcus 批评泄露的 Claude 代码为符号人工智能</a> ⭐️ 7.0/10</h2>

<p>Gary Marcus 分析了据称属于 Anthropic Claude 的泄露代码，声称其内核依赖于经典符号人工智能结构而非纯神经网络。他特别指出了一个包含 486 个分支点和 12 层嵌套 IF-THEN 条件语句的确定性循环，以此作为该架构的证据。这一观察立即引发了关于该系统是代表混合模型还是仅仅是复杂的硬编码逻辑的辩论。 这一批评挑战了现代大型语言模型仅通过统计模式匹配运作而无明确规则的普遍观点。如果 Marcus 是正确的，这表明顶级人工智能系统可能严重依赖结合神经网络与传统符号逻辑的混合架构来实现可靠性。相反，如果这段代码仅仅是混乱的工程产物，则引发了对当前人工智能部署可维护性和可扩展性的担忧。这场讨论从根本上影响了研究人员对从学术深度学习向稳健工业应用过渡的理解。 Marcus 强调了确定性符号循环内 486 个分支点和 12 层嵌套的具体指标来支持他的论点。帖子中的批评者反驳称，如此深的嵌套通常表明是“面条式代码”或累积的特例处理，而非深思熟虑的经典人工智能设计。这种区别至关重要，因为有意的符号结构意味着一个设计好的混合系统，而过度的嵌套可能只是反映了技术债务。</p>

<p>rss · r/MachineLearning · Apr 12, 10:34</p>

<p><strong>背景</strong>: 符号人工智能由 John McCarthy 和 Marvin Minsky 等早期先驱倡导，依赖明确的规则和逻辑树来处理信息，这与从数据中学习模式的现代连接主义方法形成对比。嵌套条件语句是一种编程结构，即将决策语句放置在另一个决策语句内部，随着复杂度增加，这种结构可能变得难以管理。Gary Marcus 长期以来一直主张将符号推理与神经网络相结合，以克服纯统计模型的局限性。“经典人工智能”一词指的是在大规模神经网络兴起之前主导该领域的这些深度学习前方法论。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.in-com.com/blog/untangling-deeply-nested-conditionals-through-structured-refactoring-strategies/">Untangling Deeply Nested Conditionals ... - IN-COM DATA SYSTEMS</a></li>
<li><a href="https://slyacademy.com/ap-computer-science-principles/unit-3-algorithms-and-programming/3-7-nested-conditionals-everything-you-need-to-know/24/17/38/">“3.7: Nested Conditionals ” Everything You Need To... - Sly Academy</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 社区讨论对 Marcus 的描述持怀疑态度，许多用户认为大量的分支点和深层嵌套是代码质量差（“一团乱麻”）的迹象，而不是复杂的符号人工智能。一些参与者指出，虽然混合方法是有效的，但将混乱的条件逻辑标记为经典人工智能的特征，既误解了现代工程挑战，也曲解了历史人工智能原则。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#gary marcus</code>, <code class="language-plaintext highlighter-rouge">#anthropic</code>, <code class="language-plaintext highlighter-rouge">#symbolic ai</code>, <code class="language-plaintext highlighter-rouge">#code analysis</code>, <code class="language-plaintext highlighter-rouge">#llm architecture</code></p>

<hr />

<p><a id="item-16"></a></p>
<h2 id="数据分析显示-iclr-2026-审稿人一致性急剧下降-️-7010"><a href="https://old.reddit.com/r/MachineLearning/comments/1sj76a2/just_did_an_analysis_on_iclr_2025_vs_2026_scores/">数据分析显示 ICLR 2026 审稿人一致性急剧下降</a> ⭐️ 7.0/10</h2>

<p>最近一项对比 ICLR 2025 和 2026 投稿的数据分析显示，审稿人之间的相关性分数急剧下降，从 2025 年的约 0.41 降至 2026 年的更低水平。该研究基于从 OpenReview 获取的数据，利用“一对一余”和“半半分割”相关性指标，发现论文内部评分的标准差从 1.186 增加到了 1.523。这表明即将到来的会议的人类审稿人之间的一致性远低于去年。 这一发现意义重大，因为它表明顶级人工智能研究的同行评审过程正变得越来越随机，实际上将论文录取变成了一种彩票。低审稿人相关性意味着对科学工作的质量评估具有高度主观性，可能导致突破性研究被拒，而较弱的论文仅因运气好而被录用。如果这一趋势持续下去，可能会削弱 ICLR 等主要会议的可信度，并迫使社区重新考虑当前的评估机制。这种转变凸显了学术诚信方面日益严重的危机，即研究质量的信号正在被评审系统中的噪音所淹没。 分析特别指出，虽然平均评分的标准差从 2025 年的 1.253 略微下降到 2026 年的 1.162，但论文内部人类评分的平均标准差却从 1.186 激增至 1.523。作者使用了两种不同的指标——“一对一余”相关性和“半半分割”相关性，来验证直接从 OpenReview 平台获取的数据。这些统计数据表明，虽然整体评分分布可能更紧凑，但分配给同一篇论文的具体审稿人之间的分歧却显著加剧。</p>

<p>rss · r/MachineLearning · Apr 12, 06:51</p>

<p><strong>背景</strong>: ICLR（国际学习表征会议）是机器学习和深度学习研究领域的首要年度会议，以其通过 OpenReview 平台管理的严格同行评审过程而闻名。OpenReview 是一个非营利项目，旨在通过公开评审和讨论来促进科学交流的透明度。审稿人相关性是衡量该过程可靠性的关键指标，反映了不同专家评估同一项工作的一致性程度。历史上，约 0.4 的相关性被认为是顶级计算机科学会议的典型但不完美的水平，这反映了评估新颖研究的固有难度。</p>
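
<p>“一对一余”相关性的一种常见算法是：对每篇论文，把某位审稿人的分数与其余审稿人的均值配对，再在全部配对上求相关。下面是一个极简示意（假设性代码，帖子中的具体定义可能略有不同）：</p>

<pre><code class="language-python">import numpy as np

def leave_one_out_correlation(scores):
    """scores: 每篇论文的审稿分数列表，例如 [[6, 8, 5], [3, 4, 4, 5], ...]"""
    xs, ys = [], []
    for paper in scores:
        if len(paper) &lt; 2:
            continue
        for i, s in enumerate(paper):
            rest = [v for j, v in enumerate(paper) if j != i]
            xs.append(s)                   # 被留出的那位审稿人的分数
            ys.append(np.mean(rest))       # 其余审稿人的均值
    return np.corrcoef(xs, ys)[0, 1]

print(leave_one_out_correlation([[6, 8, 5], [3, 4, 4, 5], [8, 8, 6]]))  # 示例数据，仅作演示
</code></pre>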

<details><summary>参考链接</summary>
<ul>
<li><a href="https://openreview.net/group?id=ICLR.cc/2026/Conference">ICLR 2026 Conference | OpenReview</a></li>
<li><a href="https://openreview.net/about">About OpenReview</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#iclr</code>, <code class="language-plaintext highlighter-rouge">#peer-review</code>, <code class="language-plaintext highlighter-rouge">#machine-learning-research</code>, <code class="language-plaintext highlighter-rouge">#academic-integrity</code>, <code class="language-plaintext highlighter-rouge">#data-analysis</code></p>

<hr />

<p><a id="item-17"></a></p>
<h2 id="minimax-m27-发布但附带限制性非商业许可协议-️-7010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sj2oqz/minimax_m27_is_not_open_source_doa_license/">MiniMax M2.7 发布但附带限制性非商业许可协议</a> ⭐️ 7.0/10</h2>

<p>MiniMax M2.7 模型已发布并公开了权重，但其附带的许可协议明确禁止在未经书面许可的情况下进行任何商业用途。这些限制广泛涵盖付费服务、商业 API 甚至部署微调版本以获利，同时也明确禁止任何军事应用。这证实了尽管权重开放，该模型根据标准定义并不符合“开源”资格。 这一进展突显了人工智能行业日益增长的趋势，即公司发布“开放权重”模型，同时通过限制性许可保留对使用的严格控制。这显著影响了开发者和企业，他们可能误以为开放权重意味着可以自由地将模型集成到商业产品或服务中。这种区别迫使社区重新评估什么是真正的开源软件，而不仅仅是可访问的专有技术。最终，这限制了该模型在企业环境中的采用，并抑制了基于它的潜在创新。 该许可要求任何商业活动（包括用于获利的输出生成）必须获得 MiniMax 的明确书面许可。它特别禁止军事用途，这是现代人工智能许可协议中越来越常见的条款。用户必须意识到，微调模型并不能绕过这些限制，因为衍生作品仍受原始条款的约束。因此，该模型仅适用于研究、个人实验或非营利教育目的。</p>

<p>rss · r/LocalLLaMA · Apr 12, 02:55</p>

<p><strong>背景</strong>: 在人工智能领域，“开放权重”（模型参数公开）与“开源”（既需要开放权重，又需要授予使用、研究、修改和分发软件自由的许可）之间存在区别。开放源代码促进会（OSI）定义了开源许可的具体标准，而禁止商业用途或特定领域的条款往往违反这些标准。最近，几家主要的人工智能实验室采用了混合方法，发布权重以促进社区研究，同时通过自定义许可保护其商业利益。这种做法引发了关于此类模型是否应被标记为开源的争论。</p>

<p><strong>社区讨论</strong>: 社区情绪普遍消极，用户对带有沉重商业限制的“开放权重”发布的误导性表示沮丧。许多评论者认为，将此类模型标记为开源具有欺骗性，并通过造成使用权方面的混淆损害了生态系统。人们强烈共识认为，“开源”一词应严格保留给符合 OSI 批准许可的模型。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#licensing</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#minimax</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#legal</code></p>

<hr />

<p><a id="item-18"></a></p>
<h2 id="修复版-qwen-35-35b-模型发布原生支持-apple-mlx-️-7010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sje74g/fernflowerai35ba3bklrelugguf_apple_mlx/">修复版 Qwen 3.5 35B 模型发布，原生支持 Apple MLX</a> ⭐️ 7.0/10</h2>

<p>社区开发者 LuffyTheFox 发布了修复并校准后的 Qwen 3.5 35B A3B Uncensored 模型，修复了阿里巴巴最初发布的版本中损坏的张量。此次更新引入了 KL 散度和 ReLU 不对称性检查，以纠正细微的权重分布漂移，将平均 KL 散度降低了 71.3%。此外，通过与用户 froggeric 合作，还推出了专为 Mac 硬件优化的原生 Apple MLX 版本。 此次发布意义重大，因为它恢复了一个高性能开源模型的完整功能，该模型此前因特定层的训练错误而无法使用。通过启用原生 Apple MLX 支持，该项目大幅提升了 macOS 设备上的推理速度和效率，使 Mac 用户无需依赖云端即可使用强大的本地 AI。引入 KL 散度等高级诊断标准为社区驱动的模型修复和质量保证树立了新标杆。最终，这确保了复杂的推理任务能够在消费级硬件上可靠地执行。 修复过程总共识别并修复了 11 个张量（最初为 2 个），解决了早期诊断未发现的专家网络和注意力投影中的问题。性能指标显示，平均 KL 散度从 0.1036 降至 0.0297，表明权重分布更加紧密和稳定。该发布版包含用于通用用途的 GGUF 量化文件，以及专为 Apple MLX 框架优化的特定 Safetensors 格式。用户还可获得更新的系统提示词和聊天模板，以释放模型的深度思考能力。</p>

<p>rss · r/LocalLLaMA · Apr 12, 13:12</p>

<p><strong>背景</strong>: Qwen 3.5 是由阿里云开发的大型语言模型，以其强大的推理能力著称，但最近的版本因训练过程中 AdamW 优化器的权重损坏而遭受“上下文崩溃”的问题。GGUF 是一种专为快速加载和推理优化的二进制文件格式，被 llama.cpp 生态系统广泛用于在消费级硬件上运行模型。Apple MLX 是专为 Apple Silicon 芯片设计的机器学习框架，允许模型直接在 Mac 的 CPU 和 GPU 上高效运行。当官方发布的开源模型存在技术缺陷时，社区成员通常会介入进行修复或微调。</p>
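
<p>此类修复常用输出分布的 KL 散度做回归校验：在同一批输入上比较修复前后（或参考实现与当前实现）的 logits 分布。下面是一个极简计算示意（假设性代码，并非该修复流程的原始脚本）：</p>

<pre><code class="language-python">import torch
import torch.nn.functional as F

def mean_kl(logits_ref, logits_new):
    """比较两组 logits 在词表维度上的平均 KL(ref || new)，数值越小说明行为越接近参考。"""
    p = F.log_softmax(logits_ref, dim=-1)
    q = F.log_softmax(logits_new, dim=-1)
    kl = F.kl_div(q, p, log_target=True, reduction="none").sum(-1)
    return kl.mean().item()

ref = torch.randn(2, 16, 32000)            # 假设形状为 [batch, seq_len, vocab]
new = ref + 0.05 * torch.randn_like(ref)   # 模拟一个轻微漂移的版本
print(mean_kl(ref, new))
</code></pre>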

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Llama.cpp">llama.cpp - Wikipedia</a></li>
<li><a href="https://medium.com/@charles.vissol/gguf-in-details-8a9953ac7883">GGUF in details. After Training phase, the models based | Medium</a></li>
<li><a href="https://huggingface.co/docs/hub/gguf">GGUF · Hugging Face</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#apple-mlx</code>, <code class="language-plaintext highlighter-rouge">#qwen</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#model-repair</code></p>

<hr />

<p><a id="item-19"></a></p>
<h2 id="硅谷顶尖-ai-人才加速回流中国-️-7010"><a href="https://www.ft.com/content/b167c6d3-b982-482a-98c3-5303a7b80c6a">硅谷顶尖 AI 人才加速回流中国</a> ⭐️ 7.0/10</h2>

<p>过去一年，多位曾就职于 OpenAI 和 Google DeepMind 的顶尖 AI 研究员选择回国，加入字节跳动、腾讯及阿里巴巴等科技巨头。猎头数据显示，过去 12 个月内协助回国的留美研究员超过 30 名，远超往年个位数的水平。与此同时，清华大学毕业生赴美攻读博士学位的比例也从疫情前的 50% 大幅降至约 20%。 这一趋势标志着全球 AI 研发能力平衡可能发生转变，中国正利用其在机器人和自动驾驶领域的广阔应用场景吸引顶尖人才。这表明，经过税收和生活成本调整后的具有竞争力的薪酬方案，加上供应链优势，正变得比传统的硅谷待遇更具吸引力。此外，美国日益收紧的移民政策和地缘政治紧张局势给华裔工程师带来了不确定性，加速了专家流向文化契合度更高且感知更稳定的国内市场。从长远来看，这可能增强中国的自主创新能力，同时挑战美国在尖端 AI 开发领域的垄断地位。 报告强调，经税收和生活成本调整后，中国科技巨头提供的薪酬已超过硅谷标准。推动此次回流的具体领域包括机器人和自动驾驶，中国在这些领域提供了广泛的真实测试环境和成熟的供应链。数据特别指出了学术迁移的逆转，清华大学学生赴美攻读博士学位的比例已降至疫情前水平的约五分之一。</p>

<p>telegram · zaihuapd · Apr 12, 00:20</p>

<p><strong>背景</strong>: 几十年来，美国尤其是硅谷一直是中国计算机科学精英毕业生的首选目的地，这种人才流失助推了美国的技术主导地位。OpenAI 和 Google DeepMind 等公司历史上一直依赖这个国际人才库来引领大语言模型和强化学习的进步。然而，近期的地缘政治摩擦和签证限制使中国公民在美国长期工作和居留变得复杂。在这种背景下，资深研究人员选择离开美国实验室前往中国公司的当前逆转，成为了对历史常态的显著偏离。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-talent</code>, <code class="language-plaintext highlighter-rouge">#industry-dynamics</code>, <code class="language-plaintext highlighter-rouge">#geopolitics</code>, <code class="language-plaintext highlighter-rouge">#china-tech</code>, <code class="language-plaintext highlighter-rouge">#research-migration</code></p>

<hr />

<p><a id="item-20"></a></p>
<h2 id="杜罗夫称九成以上-whatsapp-备份以未加密形式存储-️-7010"><a href="https://t.me/zaihuapd/40826">杜罗夫称九成以上 WhatsApp 备份以未加密形式存储</a> ⭐️ 7.0/10</h2>

<p>Telegram 创始人帕维尔·杜罗夫质疑 WhatsApp 的端到端加密声明，指出由于加密功能并非默认开启，约 95% 的消息备份以明文形式存储在苹果和谷歌的云端服务器上。他进一步指出，即使用户开启了加密备份，若通信对象未进行相同设置，聊天记录仍会处于未加密状态。这一披露突显了 WhatsApp 默认安全性的宣传与实际保护备份数据所需配置之间的巨大差距。 这一问题至关重要，因为它使大量私人用户数据面临被云服务商和政府机构访问的风险，这与人们通常认为 WhatsApp 具有绝对隐私的印象相悖。对于依赖安全通信处理敏感数据的行业而言，聊天传输加密与备份存储之间的这种区别是一个主要漏洞，可能危及合规性和信任度。此外，这迫使人们重新评估主要消息平台中“默认”安全的定义，促使用户手动配置那些他们可能误以为已激活的设置。最终，这影响了数十亿用户，他们可能误以为自己的整个聊天记录都是安全的，而实际上只有实时传输受到了保护。 要实现备份的真正端到端加密，用户必须手动进入“设置”&gt;“聊天”&gt;“聊天备份”，并通过创建通行密钥或密码来明确启用“端到端加密备份”选项。无论备份加密状态如何，WhatsApp 仍会记录并披露有关社交关系的元数据，这加剧了风险。据报道，苹果和谷歌每年向第三方披露数千份此类未加密的 WhatsApp 备份，而 Telegram 声称在其 12 年的历史中从未有过此类披露。</p>

<p>telegram · zaihuapd · Apr 12, 16:07</p>

<p><strong>背景</strong>: 端到端加密（E2EE）确保只有通信双方才能阅读消息，防止服务提供商等中间人访问内容。虽然 WhatsApp 自 2016 年以来已对传输中的消息实施端到端加密，但存储在 iCloud 或 Google Drive 等服务上的云备份历史上默认并未加密，使其可被云提供商访问。相比之下，Telegram 提供具有端到端加密的“秘密聊天”，但其标准云聊天则以不同的加密协议存储在其服务器上，这一点在安全社区中常引发争论。理解传输加密与存储加密之间的区别，对于评估任何消息应用的真正隐私保障至关重要。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://faq.whatsapp.com/490592613091019">About end-to-end encrypted backup | WhatsApp Help Center</a></li>
<li><a href="https://www.reddit.com/r/netsec/comments/w2rba2/the_workings_of_whatsapps_backups_and_why_you/">The Workings of Whatsapp's Backups (and why you should enable End-to ...</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#cybersecurity</code>, <code class="language-plaintext highlighter-rouge">#data-privacy</code>, <code class="language-plaintext highlighter-rouge">#encryption</code>, <code class="language-plaintext highlighter-rouge">#messaging-platforms</code>, <code class="language-plaintext highlighter-rouge">#cloud-storage</code></p>

<hr />

<h2 id="github-热榜-1">GitHub 热榜</h2>

<p><a id="item-21"></a></p>
<h2 id="karpathy-发布纯-c-和-cuda-编写的极简-llm-训练项目-️-10010"><a href="https://github.com/karpathy/llm.c">Karpathy 发布纯 C 和 CUDA 编写的极简 LLM 训练项目</a> ⭐️ 10.0/10</h2>

<p>Andrej Karpathy 发布了 llm.c，这是一个完全用原生 C 和 CUDA 编写且无依赖的大型语言模型训练实现。该项目去除了 PyTorch 等高层框架，直接揭示了 Transformer 架构和 GPU 优化的基本机制。它作为一个直观的教育工具，帮助开发者理解支撑现代 AI 的底层基础设施。 该项目的重要性在于它通过展示负责模型训练的每一行代码，揭开了深度学习框架的“黑盒”神秘面纱。对于 AI 工程师而言，这提供了一个无与伦比的机会，在没有抽象层的情况下学习硬件层面的内存管理、内核融合和反向传播是如何处理的。它填补了神经网络理论知识与高性能推理引擎所需的实际系统编程技能之间的空白。 该仓库从头实现了类似 GPT-2 的 Transformer 模型，仅使用标准 C 和 NVIDIA 的 CUDA API 就完成了数据加载、分词和完整的训练循环。它在单张 GPU 上实现了具有竞争力的训练速度，同时保持了极高的代码可读性和极简主义风格。该项目明确针对教育用途，而非生产部署或快速原型开发。</p>

<p>rss · GitHub Trending - CUDA · Apr 12, 01:33</p>

<p><strong>背景</strong>: 在此发布之前，理解 LLM 内部机制通常需要浏览 PyTorch 或 TensorFlow 等框架的复杂代码库，而这些框架通过抽象隐藏了底层细节。现有的极简示例往往缺乏完整的训练能力，或者依赖解释型语言从而掩盖了性能关键操作。llm.c 通过提供用系统编程语言编写的完整、高性能且透明的参考实现，填补了这一空白。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Large_language_model">Large language model - Wikipedia</a></li>
<li><a href="https://www.geeksforgeeks.org/artificial-intelligence/large-language-model-llm/">What is a Large Language Model ( LLM ) - GeeksforGeeks</a></li>
<li><a href="https://www.ibm.com/think/topics/large-language-models">What Are Large Language Models (LLMs)? | IBM</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: AI 社区对此反应热烈，视该项目为学生和研究人员掌握底层深度学习优化必不可少的资源。许多开发人员已经开始利用该代码库尝试自定义内核修改，并将其用于研究生级别的系统课程教学。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#c</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#education</code></p>

<hr />

<p><a id="item-22"></a></p>
<h2 id="sageattention-通过量化加速模型推理-️-10010"><a href="https://github.com/thu-ml/SageAttention">SageAttention 通过量化加速模型推理</a> ⭐️ 10.0/10</h2>

<p>SageAttention 引入了一种新型量化注意力机制，在语言、图像和视频模型上实现了比 FlashAttention 快 2 到 5 倍的推理速度。该优化在显著降低计算延迟的同时，保持了端到端的性能指标不变。 随着大模型复杂度的增加，内存带宽和计算效率已成为实时部署的关键瓶颈。SageAttention 利用量化技术降低了内存访问成本，同时避免了以往方法中常见的精度下降问题。这使得它成为需要高吞吐量大模型服务的生产环境中不可或缺的基础设施升级。 该项目在与 FlashAttention 相比实现了稳定的 2 到 5 倍加速，同时在多种模态下保持了模型精度。它被设计为现有深度学习框架中注意力实现的可直接替换组件。</p>

<p>rss · GitHub Trending - CUDA · Apr 12, 01:33</p>

<p><strong>背景</strong>: 之前的解决方案如 FlashAttention 优化了内存访问模式，但未充分利用低精度算术的机会。SageAttention 通过将分块内存访问与针对现代 GPU 架构定制的激进量化策略相结合，填补了这一空白。这种方法使其能够超越标准浮点注意力机制的速度极限。</p>
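
<p>量化注意力的基本思路可以用下面的极简示意说明（为可读性仍用浮点张量模拟 INT8 数值，并非 SageAttention 的实际 CUDA 内核）：</p>

<pre><code class="language-python">import torch

def int8_attention(q, k, v):
    """对 Q/K 做对称 INT8 量化后计算注意力分数，V 保持原精度。"""
    sq, sk = q.abs().amax() / 127.0, k.abs().amax() / 127.0
    qi = (q / sq).round().clamp(-127, 127)
    ki = (k / sk).round().clamp(-127, 127)
    scores = (qi @ ki.transpose(-1, -2)) * (sq * sk)   # 反量化恢复数值范围
    scores = scores / q.shape[-1] ** 0.5
    return torch.softmax(scores, dim=-1) @ v

q, k, v = (torch.randn(1, 8, 128, 64) for _ in range(3))
out = int8_attention(q, k, v)               # 输出形状与标准注意力一致：[1, 8, 128, 64]
</code></pre>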

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.zhihu.com/question/611236756">FlashAttention 的速度优化原理是怎样的？ - 知乎</a></li>
<li><a href="https://www.zhihu.com/question/2013241832251875907">FlashAttention-4 发布，算法流水线大改，速度达矩阵乘法级，对大模型...</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: AI 工程社区正在积极评估 SageAttention，将其视为下一代推理栈中 FlashAttention 的潜在继任者。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#optimization</code>, <code class="language-plaintext highlighter-rouge">#quantization</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code></p>

<hr />

<p><a id="item-23"></a></p>
<h2 id="instant-ngp闪电般快速的神经图形训练框架-️-10010"><a href="https://github.com/NVlabs/instant-ngp">Instant-NGP：闪电般快速的神经图形训练框架</a> ⭐️ 10.0/10</h2>

<p>NVIDIA 推出的 Instant-NGP 是一个高性能框架，能将神经图形基元（如 NeRF）的训练时间从数小时缩短至数秒。该框架通过利用优化的 CUDA 内核和多分辨率哈希编码，显著加快了模型收敛速度。这一发布标志着相关技术从实验性研究代码向用于实时 3D 重建的生产级工具转变。 该框架解决了此前阻碍神经辐射场（NeRF）实际应用的训练速度慢这一关键瓶颈。通过将训练时间缩短至秒级，它为 3D 内容创作、机器人仿真和虚拟现实应用实现了交互式工作流。这种效率提升使得消费级 GPU 也能进行高保真度的新视角合成，从而普及了先进的 3D AI 研究。因此，它成为了下一代计算机视觉和图形学管道不可或缺的基础设施。 其核心创新在于使用了可学习的多分辨率哈希编码结合小型多层感知机（MLP），实现了极快的内存访问和计算速度。除了 NeRF，它还支持神经体积渲染和有符号距离函数训练等多种任务。该代码库针对 NVIDIA GPU 进行了高度优化，利用特定的硬件功能以最大化吞吐量。</p>

<p>rss · GitHub Trending - CUDA · Apr 12, 01:33</p>

<p><strong>背景</strong>: 在 Instant-NGP 出现之前，训练 NeRF 模型通常需要强大的云端 GPU，并且在一个场景上收敛需要数小时甚至数天。现有的解决方案往往受限于高内存消耗和缓慢的推理速度，使其仅能用于离线渲染场景。NVIDIA 通过重新思考输入表示和内核优化策略解决了这些局限性。该项目填补了现代图形学管道中对实时、高质量 3D 重建工具的需求空白。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.m.wikipedia.org/wiki/Neural_Network">Neural network - Wikipedia</a></li>
<li><a href="https://hai.stanford.edu/ai-definitions/what-is-a-neural-network">What is a Neural Network? - Stanford HAI</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: AI 和图形学社区已广泛采用 Instant-NGP，将其视为快速 NeRF 原型设计和部署的事实标准。开发人员经常将其哈希编码逻辑集成到自定义项目中，以加速其他神经隐式表示任务。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#nerf</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#3d-generation</code>, <code class="language-plaintext highlighter-rouge">#computer-vision</code>, <code class="language-plaintext highlighter-rouge">#gpu-acceleration</code></p>

<hr />

<p><a id="item-24"></a></p>
<h2 id="nous-research-推出自我进化的-hermes-智能体框架-️-9010"><a href="https://github.com/NousResearch/hermes-agent">Nous Research 推出自我进化的 Hermes 智能体框架</a> ⭐️ 9.0/10</h2>

<p>Nous Research 发布了 Hermes Agent，这是一个具有内置学习循环的新型 AI 框架，使智能体能够从经验中创造技能并在会话间持久化知识。与静态智能体不同，它通过用户交互自主提升能力，并支持从本地终端到无服务器云环境的多样化部署。 该项目解决了当前 AI 智能体的关键局限性，即缺乏上下文记忆且无法在没有人工重新训练的情况下随时间进步。通过实现包含自主技能创建和辩证用户建模的封闭学习循环，它实现了真正持久且不断进化的个人助手。其架构支持通过 Modal 和 Daytona 等无服务器后端进行低成本扩展，使得无需昂贵的 GPU 集群即可运行高级智能体工作流。这标志着朝着能真正适应个体用户需求的智能体系统迈出了重要一步。 Hermes Agent 拥有具备多行编辑功能的真实终端界面，并通过单一网关支持集成 Telegram、Discord 和 Slack。它利用灵活的模型路由系统，兼容 OpenRouter、Nous Portal 及各种专有端点，允许用户无需更改代码即可切换模型。该框架内置了用于无人值守自动化的 cron 调度器，并支持生成隔离的子智能体以执行并行任务。</p>

<p>rss · GitHub Trending - Daily · Apr 12, 01:32</p>

<p><strong>背景</strong>: 大多数现有的 AI 智能体框架作为 LLM 的无状态包装器运行，需要外部向量数据库或复杂的编排工具来维持记忆。Hermes Agent 通过将记忆管理和自我改进机制直接嵌入核心架构而脱颖而出。这种方法减少了构建持久性智能体所需的工程开销，并为技能进化提供了标准化接口。</p>

<p><strong>社区讨论</strong>: 早期采用者称赞该框架能够在低成本的 VPS 实例上高效运行，同时保持复杂的记忆保留能力。开发人员对用于创建深度个性化智能体交互的’Honcho’辩证用户建模功能特别感兴趣。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#self-improving-ai</code>, <code class="language-plaintext highlighter-rouge">#nous-research</code>, <code class="language-plaintext highlighter-rouge">#machine-learning</code></p>

<hr />

<p><a id="item-25"></a></p>
<h2 id="voxcpm2无分词器的多语言语音合成与声音设计模型-️-9010"><a href="https://github.com/OpenBMB/VoxCPM">VoxCPM2：无分词器的多语言语音合成与声音设计模型</a> ⭐️ 9.0/10</h2>

<p>VoxCPM2 引入了一种无分词器架构，利用扩散自回归方法直接生成连续语音表示。这个拥有 20 亿参数的模型支持 30 种语言，并提供了基于文本的声音设计和可控声音克隆等新功能，无需参考音频即可创建声音。 通过消除离散分词，VoxCPM2 相比传统容易产生机械感的语音合成系统，实现了更高的保真度和更自然的韵律。通过自然语言描述来设计声音的能力，显著降低了创意音频制作和无障碍应用的门槛。其对 48kHz 录音室级输出的支持，使其不仅适用于实验演示，更能胜任专业媒体工作流。 该模型基于 MiniCPM-4 骨干网络构建，并在超过 200 万小时的多语言语音数据上进行训练。核心能力包括带转录对齐的极致克隆、风格引导的情感控制，以及无需语言标签即可直接合成 30 种语言。</p>

<p>rss · GitHub Trending - Daily · Apr 12, 01:32</p>

<p><strong>背景</strong>: 传统的文本转语音系统通常依赖离散分词器将文本和音频转换为中间代码，这往往导致信息丢失和表现力受限。VoxCPM2 通过完全绕过这一瓶颈，填补了高保真端到端生成式音频的空白。它代表了语音合成向连续表示学习的转变，类似于大语言模型的进步，但直接应用于原始音频波形。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/OpenBMB/VoxCPM/">VoxCPM2 : Tokenizer-Free TTS for Multilingual Speech Generation...</a></li>
<li><a href="https://huggingface.co/openbmb/VoxCPM2">openbmb/ VoxCPM2 · Hugging Face</a></li>
<li><a href="https://www.modelscope.cn/models/OpenBMB/VoxCPM2">VoxCPM2 · Models</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 该项目凭借 Hugging Face 上的实时演示以及在 Discord 和飞书上活跃的技术支持社区而获得了关注。开发者们对生产就绪的资源以及将声音设计集成到交互式应用中的潜力表现出浓厚兴趣。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#text-to-speech</code>, <code class="language-plaintext highlighter-rouge">#voice-cloning</code>, <code class="language-plaintext highlighter-rouge">#multilingual-ai</code>, <code class="language-plaintext highlighter-rouge">#generative-audio</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code></p>

<hr />

<p><a id="item-26"></a></p>
<h2 id="谷歌发布面向资源受限环境的高效小型-bert-模型-️-9010"><a href="https://github.com/google-research/bert">谷歌发布面向资源受限环境的高效小型 BERT 模型</a> ⭐️ 9.0/10</h2>

<p>谷歌研究发布了 24 个仅支持英语、不区分大小写（uncased）的小型 BERT 模型，范围从 BERT-Tiny 到 BERT-Medium。这些变体旨在计算资源受限的环境中高效运行，同时保持标准的 BERT 训练方法。 此次发布解决了在边缘设备或低资源机构环境中部署强大 NLP 模型的关键需求，且无需牺牲原始架构的双向表示能力。通过提供紧凑模型的预训练权重，谷歌使得内存和延迟成为主要约束的研究和生产用例成为可能。此外，这些模型针对知识蒸馏工作流程进行了优化，使其能够高效地从大型教师模型中学习。这种转变鼓励社区通过模型效率而非单纯增加模型容量来进行创新。 新模型的层数（L=2 到 8）和隐藏层大小（H=128 到 768）各不相同，包括 BERT-Tiny (2/128) 和 BERT-Mini (4/256) 等特定配置。它们利用 WordPiece 掩码，并且可以使用与原始 BERT-Base 和 BERT-Large 模型相同的方法进行微调。所有 24 个模型均可通过 TensorFlow 下载，便于立即集成到现有管道中。</p>

<p>rss · GitHub Trending - Python · Apr 12, 01:37</p>

<p><strong>背景</strong>: BERT（来自 Transformer 的双向编码器表示）在 2018 年通过引入使用仅编码器 Transformer 架构的深度双向预训练，彻底改变了自然语言处理领域。虽然原始的 BERT-Base 和 BERT-Large 模型树立了新的基准，但其高昂的计算成本限制了它们在资源受限场景中的部署。以前的解决方案通常需要在训练后进行复杂的剪枝或量化以达到类似的效率。该项目通过提供原生小型预训练架构填补了这一空白，成为高效 Transformer 研究的基础参考。</p>
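
<p>这些小型模型在 Hugging Face 上也有对应版本，加载方式与常规 BERT 相同。下面以 BERT-Tiny（L=2, H=128）为例给出一个最小示意（模型 ID 按 Hugging Face 上的常见命名写出，具体以官方发布为准）：</p>

<pre><code class="language-python">from transformers import AutoModel, AutoTokenizer

name = "google/bert_uncased_L-2_H-128_A-2"   # BERT-Tiny，模型 ID 以官方发布为准
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

inputs = tokenizer("Small BERT models fit on edge devices.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)       # [1, 序列长度, 128]
</code></pre>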

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/BERT_(language_model)">BERT (language model ) - Wikipedia</a></li>
<li><a href="https://arxiv.org/abs/1810.04805">[1810.04805] BERT : Pre-training of Deep Bidirectional ...</a></li>
<li><a href="https://www.geeksforgeeks.org/nlp/explanation-of-bert-model-nlp/">BERT Model - NLP - GeeksforGeeks</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: AI 工程界广泛认为该仓库是 BERT 实现的权威来源，特别重视新的小型模型在边缘 AI 应用中的价值。开发人员经常引用这些权重作为知识蒸馏实验的起点，其中大型教师模型指导紧凑的学生模型。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#nlp</code>, <code class="language-plaintext highlighter-rouge">#transformers</code>, <code class="language-plaintext highlighter-rouge">#tensorflow</code>, <code class="language-plaintext highlighter-rouge">#pretrained-models</code>, <code class="language-plaintext highlighter-rouge">#google-research</code></p>

<hr />

<p><a id="item-27"></a></p>
<h2 id="deepgemm-为-nvidia-gpu-提供优化的-fp8-算子-️-9010"><a href="https://github.com/deepseek-ai/DeepGEMM">DeepGEMM 为 NVIDIA GPU 提供优化的 FP8 算子</a> ⭐️ 9.0/10</h2>

<p>深度求索（DeepSeek AI）发布了 DeepGEMM，这是一个包含简洁且高效的 FP8 通用矩阵乘法（GEMM）算子的库。该版本专门针对 NVIDIA 硬件上的现代深度学习工作流引入了细粒度缩放功能。</p>

<p>rss · GitHub Trending - CUDA · Apr 12, 01:33</p>

<p><strong>背景</strong>: 通用矩阵乘法（GEMM）是深度学习的计算基石，但将其优化为 FP8 等低精度格式仍然具有挑战性。早期的解决方案往往缺乏细粒度缩放功能，或者未能完全针对最新的 NVIDIA Tensor Core 进行优化。开发人员此前不得不依赖像 CUTLASS 这样的通用库，而这些库需要大量手动调整才能达到最佳的 FP8 性能。DeepGEMM 的出现填补了这一空白，提供了专为这些高级工作负载准备的即用型高度调优算子。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://rocm.blogs.amd.com/artificial-intelligence/gemm_blog/README.html">GEMM Kernel Optimization For AMD GPUs — ROCm Blogs</a></li>
<li><a href="https://github.com/leimao/CUDA-GEMM-Optimization">GitHub - leimao/CUDA- GEMM - Optimization : CUDA Matrix...</a></li>
<li><a href="https://developer.nvidia.com/blog/improving-gemm-kernel-auto-tuning-efficiency-on-nvidia-gpus-with-heuristics-and-cutlass-4-2/">Improving GEMM Kernel Auto-Tuning Efficiency on NVIDIA GPUs with...</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#fp8</code>, <code class="language-plaintext highlighter-rouge">#gemm</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#high-performance-computing</code></p>

<hr />

<p><a id="item-28"></a></p>
<h2 id="用于-mamba-架构的因果卷积一维-cuda-优化库-️-9010"><a href="https://github.com/Dao-AILab/causal-conv1d">用于 Mamba 架构的因果卷积一维 CUDA 优化库</a> ⭐️ 9.0/10</h2>

<p>Dao-AILab 发布了一个专为因果深度一维卷积高度优化的 CUDA 库，并提供了无缝的 PyTorch 接口。该实现为 Mamba 等现代状态空间模型的高效运行提供了关键的底层内核支持。它用专为最大吞吐量设计的自定义 GPU 内核取代了较慢的标准 PyTorch 操作。 该库至关重要，因为标准的卷积实现在线性时间序列建模架构中往往会成为瓶颈。通过优化这些特定的因果操作，开发人员可以显著提高基于 Mamba 模型的训练和推理速度。它使得状态空间模型能够在保持线性复杂度的同时，在性能上与 Transformer 竞争并实现实际部署。如果没有此类优化的内核，这些新架构的理论效率就无法在当前硬件上完全发挥。 该项目为序列任务中需要因果掩码的情况提供了标准 conv1d 层的直接替代方案。它专为支持 Mamba 架构中发现的选择性扫描机制而设计。该库利用底层 CUDA 优化来最小化内存访问开销并最大化并行性。</p>

<p>rss · GitHub Trending - CUDA · Apr 12, 01:33</p>

<p><strong>背景</strong>: 序列建模长期以来一直由 Transformer 主导，但其计算复杂度随序列长度呈二次方增长。状态空间模型（SSM）的最新进展，特别是 Mamba 架构，提出了需要专用卷积操作的线性时间替代方案。在此发布之前，因果深度卷积的高效执行依赖于优化程度较低的通用库或自定义分支。该项目通过提供专为这些新兴架构调整的生产级高性能内核，填补了这一空白。</p>
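
<p>该库加速的是“因果深度一维卷积”这一算子本身；其语义可以用下面的纯 PyTorch 参考实现说明（速度远慢于 CUDA 内核，仅用于说明左侧补零带来的因果约束，并非该库的 API）：</p>

<pre><code class="language-python">import torch
import torch.nn.functional as F

def causal_depthwise_conv1d(x, weight):
    """x: [batch, channels, seq_len]；weight: [channels, 1, kernel]，groups=channels 即逐通道卷积。
    左侧补零保证第 t 个输出只依赖 t 及更早的位置。"""
    pad = weight.shape[-1] - 1
    return F.conv1d(F.pad(x, (pad, 0)), weight, groups=x.shape[1])

x = torch.randn(2, 16, 64)
w = torch.randn(16, 1, 4)
print(causal_depthwise_conv1d(x, w).shape)   # torch.Size([2, 16, 64])
</code></pre>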

<details><summary>参考链接</summary>
<ul>
<li><a href="https://grokipedia.com/page/mamba_deep_learning_architecture">Mamba (deep learning architecture)</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: AI 工程社区将此发布视为在生产环境中采用 Mamba 的基础组件。开发人员正积极将其集成到现有管道中，以基准测试其相对于传统 Transformer 基线的性能提升。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#pytorch</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#mamba</code>, <code class="language-plaintext highlighter-rouge">#kernels</code></p>

<hr />

<p><a id="item-29"></a></p>
<h2 id="微软发布-markitdown-助力大模型数据摄入-️-8010"><a href="https://github.com/microsoft/markitdown">微软发布 MarkItDown 助力大模型数据摄入</a> ⭐️ 8.0/10</h2>

<p>微软 AutoGen 团队发布了 MarkItDown，这是一款旨在将 PDF、Word 和 PowerPoint 等多种文件格式转换为 Markdown 的 Python 工具。该工具通过保留标题和表格等文档结构，专门解决 AI 智能体面临的数据摄入瓶颈问题。此外，它还推出了 MCP 服务器，以便与 Claude Desktop 等大模型应用无缝集成。 有效的 RAG 管道和 AI 智能体需要干净、结构化的文本输入，但大多数企业数据却存在于复杂的二进制格式中。MarkItDown 填补了这一关键空白，提供了一种优先考虑机器可读性而非人类视觉保真度的生产级解决方案。与通用转换器不同，它专为大模型消费优化输出，从而减少了构建智能体工作流工程师的预处理开销。 该工具支持从 PDF、PowerPoint 和 Word 文件进行转换，同时保留列表和链接等结构元素。最近的更新包括依赖项的可选功能组，以及转向二进制流处理以避免创建临时文件。它由 AutoGen 团队构建，并直接集成到模型上下文协议标准中。</p>

<p>rss · GitHub Trending - Daily · Apr 12, 01:32</p>

<p><strong>背景</strong>: 在 MarkItDown 出现之前，工程师通常依赖 Textract 或自定义脚本，这些工具经常丢失语义结构或需要大量维护。现有解决方案往往专注于提取原始文本而忽视层级结构，使其不适合上下文感知的 AI 任务。MarkItDown 作为传统文档格式与现代大模型架构之间的专用桥梁应运而生。</p>
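
<p>按其 README 中的典型用法，最小调用大致如下（文件名为假设，具体接口以官方文档为准）：</p>

<pre><code class="language-python">from markitdown import MarkItDown

md = MarkItDown()
result = md.convert("quarterly_report.pdf")   # 文件名为示例，也支持 .docx、.pptx、.xlsx 等格式
print(result.text_content[:500])              # 输出保留标题、列表与表格结构的 Markdown 文本
</code></pre>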

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.zhihu.com/question/952838112?write">LangGraph、Autogen和Crewai，这三个多智能体开发框架的工具区别是什...</a></li>
<li><a href="https://www.zhihu.com/question/624287948">微软推出 AutoGen 框架，有哪些你喜欢的功能？ - 知乎</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 开发者们正在讨论 0.1.0 版本中的破坏性变更，特别是转向二进制流处理虽然提高了效率但需要更新代码。社区也在探索新的 MCP 服务器集成，以连接本地大模型应用与文件系统。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-infrastructure</code>, <code class="language-plaintext highlighter-rouge">#data-processing</code>, <code class="language-plaintext highlighter-rouge">#microsoft</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#python</code></p>

<hr />

<p><a id="item-30"></a></p>
<h2 id="archon打造确定性-ai-编码工作流的开源框架-️-8010"><a href="https://github.com/coleam00/Archon">Archon：打造确定性 AI 编码工作流的开源框架</a> ⭐️ 8.0/10</h2>

<p>Archon 作为首个开源构建框架正式发布，旨在让 AI 编码过程具备确定性和可重复性。它允许开发者使用 YAML 工作流定义规划、实施和验证等复杂的开发阶段。该工具有效弥合了大语言模型输出的不可预测性与可靠软件工程标准之间的差距。 当前的 AI 代理往往因概率生成而产生不一致的结果，经常跳过步骤或忽略约束。Archon 通过强制执行严格的工作流结构解决了这一问题，使 AI 仅在定义的节点和验证门内运行。这种转变使得团队能够将 AI 信任地用于修复漏洞和功能实现等关键任务，而无需持续的人工监督。最终，它将 AI 从一个混乱的助手转变为 CI/CD 流水线中可靠的组成部分。 该框架支持隔离的 git 工作树以实现并行执行，并能将确定性的 Bash 脚本与 AI 驱动节点混合使用。工作流可在 CLI、Web UI 和 Slack 等聊天界面间移植，确保各处行为一致。用户可以定义循环以进行迭代编码直到测试通过，并在合并前包含交互式的人工审批环节。</p>

<p>rss · GitHub Trending - Daily · Apr 12, 01:32</p>

<p><strong>背景</strong>: 在 Archon 出现之前，AI 编码工具主要依赖单次提示或非结构化的聊天会话，缺乏流程强制力。虽然 GitHub Actions 等工具已经标准化了基础设施任务，但在编排多步 AI 推理和编码动作方面尚无同等解决方案。Archon 填补了这一空白，它将“基础设施的 Dockerfile”这一理念应用于 AI 代理工作流，确保每次运行都遵循完全相同的逻辑路径。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.augmentcode.com/guides/deterministic-ai-for-predictable-coding">Deterministic AI for Predictable Coding | Augment Code</a></li>
<li><a href="https://www.timextender.com/blog/product-technology/the-ultimate-guide-to-deterministic-ai-code-generation-in-data-engineering">The Ultimate Guide to Deterministic AI Code Generation in Data Engineering</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者强调了将确定性验证脚本与灵活的 AI 生成节点相结合的价值。能够将工作流定义直接提交到代码库中，被视为迈向版本控制 AI 操作的重要一步。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-31"></a></p>
<h2 id="multica-将自主编码智能体编排为协作队友-️-8010"><a href="https://github.com/multica-ai/multica">Multica 将自主编码智能体编排为协作队友</a> ⭐️ 8.0/10</h2>

<p>Multica 推出了一款开源平台，将自主编码智能体视为能够接受任务并汇报进度的正式队友。它通过将完成的解决方案转化为团队可复用的资产来实现技能复合增长。该平台支持与 Claude Code 和 Codex 等工具的供应商中立集成，并提供自托管部署选项。 该项目解决了从单次提示交互转向受管理的长运行智能体工作流的关键工程挑战。通过提供用于任务分配和生命周期监控的统一仪表板，它减少了监视多个自主进程的操作开销。技能复合的概念为可持续发展的 AI 团队提供了一条路径，使其能随时间进步而非每次查询都重置上下文。最终，它弥合了实验性智能体脚本与生产级协作基础设施之间的差距。 主要功能包括带有实时 WebSocket 流式传输的自主执行、多工作空间隔离以及用于本地和云守护进程的统一运行时。智能体通过创建问题、发布评论和主动报告阻碍因素来积极参与看板管理。该系统通过灵活的 CLI 接口支持包括 Claude Code、Codex、OpenClaw 和 OpenCode 在内的流行模型。</p>

<p>rss · GitHub Trending - Daily · Apr 12, 01:32</p>

<p><strong>背景</strong>: 以往的自主编码解决方案通常依赖临时脚本或缺乏持久状态管理和团队可见性的孤立 CLI 工具。工程师目前在跟踪长期运行的智能体任务或在不同项目间复用成功模式时面临困难，往往需要人工干预。Multica 通过提供模仿人类团队动态的结构化编排层填补了这一空白。它将短暂的智能体运行转化为具有历史上下文和可复用技能的被跟踪工作项。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://jules.google/">Jules - An Autonomous Coding Agent</a></li>
<li><a href="https://www.reddit.com/r/singularity/comments/1j4ma26/whats_the_current_best_autonomous_coding_agent/">Whats the current best autonomous coding agent? : r/singularity - Reddit</a></li>
<li><a href="https://martinfowler.com/articles/exploring-gen-ai/autonomous-agents-codex-example.html">Autonomous coding agents: A Codex example - Martin Fowler</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期讨论强调了对“技能复合”功能的浓厚兴趣，视其为区别于标准智能体运行器的关键特性。用户特别渴望验证自托管守护进程在复杂企业环境中的稳定性，以超越初始 README 文档的描述。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#autonomous-coding</code>, <code class="language-plaintext highlighter-rouge">#orchestration</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-32"></a></p>
<h2 id="kronos首个面向金融-k-线图的开源基础模型-️-8010"><a href="https://github.com/shiyu-coder/Kronos">Kronos：首个面向金融 K 线图的开源基础模型</a> ⭐️ 8.0/10</h2>

<p>Kronos 已被 AAAI 2026 录用，并发布了微调脚本以适应该模型用于特定的量化任务。该项目现在提供了一系列通过 Hugging Face 访问的预训练解码器模型，这些模型在来自全球 45 多个交易所的数据上进行了训练。目前提供了一个实时演示，展示了针对 BTC/USDT 等交易对的 24 小时预测能力。 与通用的时间序列基础模型不同，Kronos 专为处理金融市场数据的高噪声和非平稳特性而设计。通过将连续的 OHLCV 数据量化为分层离散令牌，它使得大型自回归 Transformer 能够有效学习 K 线图的“语言”。这种专业化使其在波动市场中的预测和模式识别能力优于通用 AI 解决方案。该项目的开源发布显著降低了金融科技开发者的门槛，使他们无需巨大的计算资源即可构建复杂的量化策略。 该模型采用了一种新颖的两阶段框架，包含一个专用的令牌化器和一个在 K 线序列上预训练的大型自回归 Transformer。它通过统一的架构支持多种量化任务，并提供了适应不同计算容量的模型权重。该系统旨在解读全球交易所的复杂动态，为金融分析提供了强大的基线。</p>

<p>rss · GitHub Trending - Daily · Apr 12, 01:32</p>

<p><strong>背景</strong>: 金融时间序列预测传统上依赖统计方法或专门的深度学习模型，但这些方法往往难以应对市场数据的随机性。虽然通用基础模型已经出现，但它们通常缺乏高频交易或精确价格运动预测所需的领域特定归纳偏置。Kronos 通过将金融 K 线图视为一种独特的语言，并将 NLP 风格的令牌化应用于数值市场数据，填补了这一空白。这种方法弥合了大规模自监督学习与算法交易特定需求之间的差距。</p>
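
<p>“把 K 线当作语言”的关键一步是把连续的 OHLCV 数值离散成 token。下面用分位数分箱给出一个极简示意（仅演示离散化思路，并非 Kronos 的分层量化器）：</p>

<pre><code class="language-python">import numpy as np

def tokenize_returns(close, n_bins=32):
    """把收盘价的对数收益率按分位数离散成 0..n_bins-1 的 token 序列。"""
    rets = np.diff(np.log(close))
    edges = np.quantile(rets, np.linspace(0, 1, n_bins + 1)[1:-1])   # 分位数作为分箱边界
    return np.digitize(rets, edges)

close = 100 * np.cumprod(1 + 0.01 * np.random.randn(500))   # 模拟价格序列，仅作演示
print(tokenize_returns(close)[:20])
</code></pre>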

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Foundation_model">Foundation model</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: Kronos 被 AAAI 2026 录用标志着其新颖的金融数据令牌化方法获得了强有力的学术认可。早期用户特别关注已发布的微调脚本，以便为专有交易策略定制该模型。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#foundation-model</code>, <code class="language-plaintext highlighter-rouge">#fintech</code>, <code class="language-plaintext highlighter-rouge">#nlp</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#finance</code></p>

<hr />

<p><a id="item-33"></a></p>
<h2 id="通过频谱分析逆向工程谷歌-synthid-水印-️-8010"><a href="https://github.com/aloshdenny/reverse-SynthID">通过频谱分析逆向工程谷歌 SynthID 水印</a> ⭐️ 8.0/10</h2>

<p>该项目提出了一种新颖的方法，无需访问专有编码器即可利用多分辨率频谱分析来检测和移除谷歌 Gemini 的 SynthID 水印。它实现了 90% 的检测率，并在保持高图像质量（43+ dB PSNR）的同时显著降低了水印相干性。该工具依赖于“频谱码本”指纹集合，而非粗暴的噪声注入方法。 这项研究有力地挑战了隐形 AI 水印能抵御坚定攻击者的假设，为 AI 安全和内容真实性验证提供了至关重要的见解。通过证明频谱模式可以被精确移除，它揭示了当前行业标准溯源工具中存在的潜在漏洞。然而，其“研究”许可证明确限制了生产部署，将其定位为开发者的分析工具，而非消费者的绕过实用程序。 该工具利用依赖于分辨率的载波频率结构来识别和抑制不同图像尺寸下的水印信号。它积极寻求社区贡献由 Nano Banana Pro 生成的纯黑和纯白图像，以扩展其参考码本。性能指标显示，在绕过过程中载波能量下降了 75%，相位相干性下降了 91%。</p>

<p>rss · GitHub Trending - Python · Apr 12, 01:37</p>

<p><strong>背景</strong>: 谷歌的 SynthID 旨在将难以察觉的标识符嵌入到 AI 生成的图像中，以追踪来源并打击虚假信息。此前移除此类水印的解决方案通常依赖重度压缩或添加噪声等破坏性方法，这会降低图像的实用性。该项目通过应用信号处理技术非破坏性地逆向工程水印的特定频谱特征，填补了这一空白。</p>

<p><strong>Discussion</strong>: The maintainers are actively requesting specific datasets from the community to improve cross-resolution robustness and carrier-frequency discovery. Users are encouraged to generate and upload uniform black and white images to a hosted Hugging Face dataset to help refine the spectral codebook.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-safety</code>, <code class="language-plaintext highlighter-rouge">#reverse-engineering</code>, <code class="language-plaintext highlighter-rouge">#watermarking</code>, <code class="language-plaintext highlighter-rouge">#gemini</code>, <code class="language-plaintext highlighter-rouge">#research</code></p>

<hr />

<p><a id="item-34"></a></p>
<h2 id="面向-ai-代理的标准化科学技能库-️-8010"><a href="https://github.com/K-Dense-AI/scientific-agent-skills">面向 AI 代理的标准化科学技能库</a> ⭐️ 8.0/10</h2>

<p>K-Dense-AI has released the "scientific agent skills" library, a collection of more than 134 executable skills designed to extend AI agents' capabilities in research and engineering. The project has evolved from a Claude-only tool into an open standard compatible with Cursor, Codex, and other agent frameworks. It also introduces K-Dense BYOK, a desktop research copilot that uses these skills for local data processing. The library addresses the heavy fragmentation of agent workflows by providing a unified, interoperable set of specialized tools, particularly for complex scientific tasks. By standardizing skills such as genomics analysis and molecular docking, it sharply reduces the engineering overhead of building reliable research assistants, and the move to an open standard encourages broad adoption while avoiding vendor lock-in for scientific AI applications. The repository includes curated capabilities for bioinformatics, cheminformatics, proteomics, and clinical research, covering more than 78 scientific databases. It integrates smoothly with mainstream AI coding agents and, through the companion BYOK project, offers a local execution mode for sensitive data. Each skill ships with concrete documentation and examples to improve the reliability of multi-step scientific workflows.</p>

<p>rss · GitHub Trending - Python · Apr 12, 01:37</p>

<p><strong>Background</strong>: Before this release, developers typically had to hand-write the glue between LLMs and specialized scientific libraries, leading to inconsistent performance and high maintenance costs. Existing solutions were often tied to a specific model or lacked the depth required for rigorous scientific computing. This project fills that gap with a pre-validated set of domain-specific skills, bridging general-purpose AI and expert-grade scientific tooling.</p>

<p><strong>Discussion</strong>: Although no direct community discussion data appears in search results yet, the project's quick rebranding as an open standard suggests strong developer interest in interoperability. The launch of a local-first desktop app indicates the team is responding to user concerns about research-data privacy.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#scientific-computing</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#llm-tools</code>, <code class="language-plaintext highlighter-rouge">#research</code></p>

<hr />

<p><a id="item-35"></a></p>
<h2 id="agentscope面向可信多智能体系统的可视化调试框架-️-8010"><a href="https://github.com/agentscope-ai/agentscope">AgentScope：面向可信多智能体系统的可视化调试框架</a> ⭐️ 8.0/10</h2>

<p>AgentScope's latest release adds real-time voice agents and support for multi-agent real-time workflows, enabling more natural human-agent interaction. The project is actively preparing version 2.0 and has published a development roadmap extending through January 2026, alongside new bi-weekly community meetings to coordinate ecosystem development and share technical plans. As LLM-based multi-agent systems grow more complex, engineers face major challenges in observing interactions and ensuring trustworthiness. AgentScope tackles this pain point with distinctive visual debugging that makes agent behavior transparent and easy to inspect. Its production-grade architecture supports local, serverless, and Kubernetes deployments and ships with built-in OpenTelemetry integration. The framework moves away from constraining models with rigid prompts and instead leans on their native reasoning and tool-use capabilities. Core abstractions include ReAct agents, memory management, planning modules, and human-in-the-loop controls. It offers a broad ecosystem of tool and observability integrations and natively supports the Model Context Protocol (MCP) and agent-to-agent communication (A2A). Developers can deploy agents as local services, cloud functions, or containerized applications while retaining full traceability through OTel.</p>

<p>rss · GitHub Trending - Python · Apr 12, 01:37</p>

<p><strong>Background</strong>: A multi-agent system (MAS) is a computational system of multiple interacting agents that can solve problems beyond the reach of any single agent. Traditional agent-based models focus on scientific simulation, whereas engineering-oriented MAS targets practical tasks such as collaborative decision-making and complex workflow automation. Existing frameworks often lack adequate observability tooling, which makes it hard to debug the emergent behavior of LLM-driven agents. AgentScope fills this gap by combining ease of use with deep inspection capabilities designed for modern agentic AI.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/agentscope-ai/agentscope">GitHub - agentscope-ai/agentscope: Build and run agents you can...</a></li>
<li><a href="https://en.wikipedia.org/wiki/Multi-agent_system">Multi-agent system</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project maintains an active Discord community and holds bi-weekly meetings to discuss roadmap items and ecosystem updates. Users frequently share examples of real-time voice agents and multi-agent orchestration patterns in the discussion forums.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#multi-agent-systems</code>, <code class="language-plaintext highlighter-rouge">#llm-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#python</code>, <code class="language-plaintext highlighter-rouge">#ai-framework</code></p>

<hr />

<p><a id="item-36"></a></p>
<h2 id="claude-mem-为-ai-编程会话添加持久化记忆功能-️-8010"><a href="https://github.com/thedotmack/claude-mem">Claude-Mem 为 AI 编程会话添加持久化记忆功能</a> ⭐️ 8.0/10</h2>

<p>The new claude-mem plugin automatically captures, compresses, and re-injects coding-session context for Claude Code agents. It uses AI-driven compression to retain relevant historical data without exceeding context-window limits. The tool directly addresses the statelessness of AI coding agents by providing persistent memory across sessions: developers no longer need to re-explain project architecture or past decisions at the start of every interaction. By automating context management, it noticeably reduces token consumption and improves workflow efficiency on long-running projects. Built as a TypeScript plugin, it integrates with the official Claude Code plugin system. Its core mechanism captures agent actions, summarizes them with a helper model, and injects the summaries into future prompts, keeping only high-value context while discarding transient noise.</p>
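
<p>A minimal Python sketch of the general capture-compress-reinject pattern described above follows; the real plugin is TypeScript and hooks into Claude Code's plugin system, and the <code class="language-plaintext highlighter-rouge">summarize</code> helper here is only a placeholder for a call to a smaller model.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from dataclasses import dataclass, field

def summarize(events: list[str]) -> str:
    # Placeholder: a real system would ask a cheap helper model for a terse summary.
    return f"[{len(events)} earlier steps compressed: " + "; ".join(e[:40] for e in events) + "]"

@dataclass
class SessionMemory:
    budget_chars: int = 2000
    summaries: list[str] = field(default_factory=list)
    recent: list[str] = field(default_factory=list)

    def record(self, event: str) -> None:
        self.recent.append(event)
        if sum(len(e) for e in self.recent) > self.budget_chars:
            self.summaries.append(summarize(self.recent))    # compress when over budget
            self.recent.clear()

    def preamble(self) -> str:
        # What gets prepended to the next prompt: compressed history plus raw recent steps.
        return "\n".join(self.summaries + self.recent)

mem = SessionMemory(budget_chars=40)
for step in ["read src/app.ts", "ran tests: 2 failing", "patched auth middleware"]:
    mem.record(step)
print(mem.preamble())
</code></pre></div></div>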

<p>rss · GitHub Trending - TypeScript · Apr 12, 01:39</p>

<p><strong>Background</strong>: AI coding assistants typically lose all context when a session ends, forcing users to start over with explanations in each new interaction. Some workarounds rely on manual notes or static file references, but these cannot adapt dynamically to the flow of a conversation. Claude-Mem fills this gap with an automated, evolving memory layer designed for iterative development workflows.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://code.claude.com/docs/en/plugins">Create plugins - Claude Code Docs</a></li>
<li><a href="https://github.com/anthropics/claude-plugins-official">Claude Code Plugins Directory - GitHub</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight its ability to maintain complex project state across multi-day development without manual intervention. The community is particularly interested in how the compression algorithm balances detail retention against token savings.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#claude-code</code>, <code class="language-plaintext highlighter-rouge">#ai-memory</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#context-management</code>, <code class="language-plaintext highlighter-rouge">#typescript</code></p>

<hr />

<p><a id="item-37"></a></p>
<h2 id="qwen-code面向开发者的终端-ai-智能体-️-8010"><a href="https://github.com/QwenLM/qwen-code">Qwen Code：面向开发者的终端 AI 智能体</a> ⭐️ 8.0/10</h2>

<p>The Qwen team has released qwen-code, an open-source command-line agent optimized for interacting with codebases through natural language in the terminal. It natively supports the latest Qwen3.6-Plus model and offers a free tier of 1,000 requests per day via OAuth. The tool includes multi-protocol API support and agentic workflows with built-in skills and sub-agents. It closes the gap between powerful language models and command-line development workflows, letting engineers automate tedious tasks without leaving the terminal. Because it co-evolves with the open-source Qwen3-Coder models, it offers tight integration and performance tuned for coding tasks, and its role as a local-first agent with optional IDE plugins makes it a versatile addition to a modern AI engineering stack. Qwen Code requires Node.js 20 or later and can be installed globally via npm or with platform-specific shell scripts. Beyond native Qwen OAuth authentication, it also supports OpenAI, Anthropic, and Gemini-compatible APIs. The agent offers a Claude Code-like experience, with features for understanding large codebases and speeding up code delivery.</p>

<p>rss · GitHub Trending - TypeScript · Apr 12, 01:39</p>

<p><strong>Background</strong>: Developers often struggle to bring AI assistance into terminal-centric workflows without heavyweight IDE overlays or a switch to a web interface. Qwen Code addresses this with a lightweight, terminal-native agent that leverages the Qwen model family's specific strengths in code generation and refactoring. Unlike general-purpose chatbots, it is designed for software engineering contexts, with agentic capabilities such as sub-agents and filesystem interaction.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agent</code>, <code class="language-plaintext highlighter-rouge">#cli-tool</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#qwen</code>, <code class="language-plaintext highlighter-rouge">#terminal</code></p>

<hr />

<p><a id="item-38"></a></p>
<h2 id="autobe-生成保证可编译的-typescript-后端代码-️-8010"><a href="https://github.com/wrtnlabs/autobe">AutoBE 生成保证可编译的 TypeScript 后端代码</a> ⭐️ 8.0/10</h2>

<p>AutoBE introduces an AI agent that generates production-ready TypeScript backend servers with an unusual guarantee: the output always compiles. By feeding compiler feedback directly into the generation loop, it eliminates the broken code that AI assistants commonly produce. The tool automatically generates complete specifications, database schemas, API documentation, and comprehensive end-to-end tests. Current AI coding agents frequently emit code with syntax errors or fragmented logic that demands extensive manual debugging. AutoBE closes this reliability gap by using compiler-backed skills to ensure every generated line fits into a buildable context. This shift from "vibe coding" to verified generation shortens prototyping time and builds trust in AI-assisted development for critical backend systems. The project provides a chat interface for natural-language requirements analysis, and its output is clear enough for junior developers to learn from while boosting senior developers' productivity. It handles complex scenarios such as ERP systems and e-commerce platforms, producing detailed entity-relationship diagrams and Prisma schemas. Users can immediately extend this stable generated foundation with other AI coding assistants such as Claude Code.</p>
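
<p>The loop below sketches the generate-compile-repair shape of this approach. It is not AutoBE's implementation: AutoBE targets TypeScript and the TypeScript compiler, whereas this sketch uses Python's built-in <code class="language-plaintext highlighter-rouge">compile()</code> as the validity check and a placeholder <code class="language-plaintext highlighter-rouge">llm_generate</code> in place of a model call.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def llm_generate(spec: str, feedback: str = "") -> str:
    # Placeholder: a real system would call a code model with the spec plus any
    # compiler diagnostics from the previous attempt.
    if feedback:
        return "def handler(req):\n    return {'ok': True}\n"
    return "def handler(req)\n    return {'ok': True}\n"      # first attempt: missing colon

def check_compiles(source: str) -> str:
    try:
        compile(source, "generated.py", "exec")
        return ""
    except SyntaxError as err:
        return f"line {err.lineno}: {err.msg}"

def generate_until_valid(spec: str, max_rounds: int = 5) -> str:
    feedback = ""
    for _ in range(max_rounds):
        source = llm_generate(spec, feedback)
        feedback = check_compiles(source)
        if not feedback:                    # only code that passes the check escapes the loop
            return source
    raise RuntimeError("no compilable candidate within budget: " + feedback)

print(generate_until_valid("POST /health returns {'ok': true}"))
</code></pre></div></div>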

<p>rss · GitHub Trending - TypeScript · Apr 12, 01:39</p>

<p><strong>Background</strong>: AutoBE addresses a key gap in "vibe coding", where speed often comes at the cost of code quality and build stability. Unlike general-purpose code generators that rely purely on probabilistic token prediction, AutoBE adds a validation step that guarantees compilability before code is shown to the user. The approach targets the specific pain point of backend developers who need reliable scaffolding rather than snippets.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Vibe_coding">Vibe coding - Wikipedia</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early examples showcase the tool handling complex domains, such as an ERP system with full test coverage and API documentation. The repository includes templates ranging from a simple to-do list to a complete shopping platform, demonstrating its versatility.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agent</code>, <code class="language-plaintext highlighter-rouge">#typescript</code>, <code class="language-plaintext highlighter-rouge">#backend-development</code>, <code class="language-plaintext highlighter-rouge">#code-generation</code>, <code class="language-plaintext highlighter-rouge">#compiler</code></p>

<hr />

<p><a id="item-39"></a></p>
<h2 id="nvidia-cuopt-加速大规模路径优化求解-️-8010"><a href="https://github.com/NVIDIA/cuopt">NVIDIA cuopt 加速大规模路径优化求解</a> ⭐️ 8.0/10</h2>

<p>NVIDIA has released cuopt, a GPU-accelerated library designed to solve complex decision-optimization and routing problems. Using CUDA cores, it delivers efficient solutions for logistics challenges that have traditionally been limited to CPU solvers. Conventional optimization solvers become a bottleneck on large-scale supply-chain or vehicle-routing problems because of their serial processing constraints. By offloading computation to the GPU, cuopt provides substantial speedups and makes real-time decision-making in dynamic environments feasible. For AI engineers building autonomous logistics systems or advanced supply-chain simulations, this matters because latency translates directly into operating cost. The library focuses on combinatorial optimization tasks such as the traveling salesman problem and vehicle routing with time windows. It integrates easily into Python workflows and is tuned for NVIDIA GPU architectures to maximize throughput. Unlike general-purpose machine learning frameworks, cuopt is a dedicated solver aimed at exact or near-exact solutions for operations-research scenarios.</p>
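
<p>For readers unfamiliar with the problem class, the sketch below sets up the kind of data a vehicle-routing-with-time-windows instance contains (a travel-cost matrix, demands, and service windows) and computes a naive nearest-neighbour baseline. It deliberately does not use the cuopt API; it only illustrates the problem a GPU solver would search far more thoroughly.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

rng = np.random.default_rng(0)
n_stops = 6                                    # stop 0 is the depot
coords = rng.uniform(0, 100, size=(n_stops, 2))
cost = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)   # travel times
demand = np.array([0, 3, 5, 2, 4, 1])          # units to deliver at each stop
windows = np.array([[0, 480], [60, 120], [0, 240], [90, 300], [30, 180], [240, 420]])
# demand/windows are the extra constraints a VRPTW solver would also respect.

# Naive nearest-neighbour tour as a CPU baseline to compare a real solver against.
route, seen = [0], {0}
while len(seen) != n_stops:
    last = route[-1]
    nxt = min((j for j in range(n_stops) if j not in seen), key=lambda j: cost[last, j])
    route.append(nxt)
    seen.add(nxt)
tour_cost = sum(cost[a, b] for a, b in zip(route, route[1:]))
print("baseline route:", route, "cost:", round(float(tour_cost), 1))
</code></pre></div></div>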

<p>rss · GitHub Trending - CUDA · Apr 12, 01:33</p>

<p><strong>Background</strong>: Decision optimization in logistics has historically relied on CPU-bound solvers such as Gurobi or OR-Tools, which can be slow on very large instances. As supply chains grow more complex and demand faster response times, the industry needs hardware-accelerated approaches. cuopt fills this gap by applying parallel-computing principles to mathematical programming, offering a modern alternative to traditionally serial algorithms.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/NVIDIA/nvbench">NVIDIA/nvbench: CUDA Kernel Benchmarking Library - GitHub</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight substantial performance gains over CPU baselines, particularly on routing problems with thousands of nodes. Some users note, however, that it requires specific NVIDIA hardware and that the learning curve can be steep for anyone unfamiliar with GPU memory management.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#optimization</code>, <code class="language-plaintext highlighter-rouge">#gpu</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#logistics</code>, <code class="language-plaintext highlighter-rouge">#nvidia</code></p>

<hr />

<p><a id="item-40"></a></p>
<h2 id="opendataloader-pdf面向-rag-的高精度多语言解析器-️-7010"><a href="https://github.com/opendataloader-project/opendataloader-pdf">OpenDataLoader PDF：面向 RAG 的高精度多语言解析器</a> ⭐️ 7.0/10</h2>

<p>OpenDataLoader PDF is a new open-source library for converting PDFs into AI-ready formats such as Markdown, JSON with bounding boxes, and HTML. It introduces a hybrid mode that combines deterministic local parsing with AI assistance to handle complex layouts, tables, and OCR across more than 80 languages. The project claims top scores on table-accuracy benchmarks and plans to ship end-to-end tagged-PDF generation for accessibility compliance in 2026. The tool targets a key bottleneck in retrieval-augmented generation (RAG) pipelines: extracting structured data from complex PDFs. Its ability to parse borderless tables, LaTeX formulas, and scanned documents accurately reduces the need for manual cleanup or expensive proprietary APIs. With SDKs for Python, Node.js, and Java, it lowers the barrier to integrating high-quality document ingestion into diverse engineering stacks, and its planned focus on automatic accessibility tagging positions it for emerging regulatory requirements. The library outputs structured Markdown for chunking, bounding-box JSON for source attribution, and HTML. It ships with built-in OCR for 80+ languages and claims table-extraction accuracy of up to 0.928 in realistic scenarios. It installs via standard package managers such as PyPI, npm, and Maven Central and offers a ready-made LangChain integration.</p>
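
<p>The sketch below shows one way such bounding-box JSON output could feed a RAG pipeline: chunking block text while carrying page and box provenance for citation. The JSON field names (<code class="language-plaintext highlighter-rouge">blocks</code>, <code class="language-plaintext highlighter-rouge">page</code>, <code class="language-plaintext highlighter-rouge">bbox</code>) are assumptions for illustration, not the project's documented schema.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import json

doc_json = json.loads("""
{"blocks": [
  {"page": 1, "bbox": [72, 90, 520, 180], "type": "paragraph", "text": "Quarterly revenue grew 12%."},
  {"page": 1, "bbox": [72, 200, 520, 340], "type": "table", "text": "| region | revenue |\\n| EU | 4.1M |"},
  {"page": 2, "bbox": [72, 80, 520, 150], "type": "paragraph", "text": "Headcount stayed flat."}
]}
""")

def to_chunks(parsed: dict, max_chars: int = 400) -> list[dict]:
    chunks = []
    for block in parsed["blocks"]:
        for start in range(0, len(block["text"]), max_chars):
            chunks.append({
                "text": block["text"][start:start + max_chars],
                "source": {"page": block["page"], "bbox": block["bbox"], "type": block["type"]},
            })
    return chunks

for chunk in to_chunks(doc_json):
    print(chunk["source"]["page"], chunk["text"][:40])
</code></pre></div></div>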

<p>rss · GitHub Trending - Daily · Apr 12, 01:32</p>

<p><strong>Background</strong>: PDF parsing remains a major challenge in AI engineering because inconsistent layouts, scanned images, and complex elements such as tables and formulas break simple text extractors. Existing solutions usually force a trade-off between fast rule-based local processing and accurate but costly cloud AI services. OpenDataLoader PDF tries to bridge this gap with a unified interface that switches between deterministic and AI-hybrid modes depending on document complexity, aiming to combine the reliability of local tools with the intelligence of modern multimodal models.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#pdf-parsing</code>, <code class="language-plaintext highlighter-rouge">#data-engineering</code>, <code class="language-plaintext highlighter-rouge">#rag</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#ai-infrastructure</code></p>

<hr />

<p><a id="item-41"></a></p>
<h2 id="deeptutor-推出原生智能体个性化学习系统-️-7010"><a href="https://github.com/HKUDS/DeepTutor">DeepTutor 推出原生智能体个性化学习系统</a> ⭐️ 7.0/10</h2>

<p>DeepTutor has released version 1.0.0 with an architecture rebuilt from the ground up for autonomous AI agents. The update introduces "TutorBot", a persistent agent with adaptive tutoring capabilities, and supports flexible mode switching under an Apache 2.0 open-source license. The project goes beyond a simple chatbot interface and implements a multi-agent system that maintains long-term context about a student's learning progress. It addresses the limitations of static LLM responses by offering a personalized, evolving learning companion rather than a one-off query tool. For developers, it serves as a rare production-ready reference implementation of agent-native design in education; its specialized nature means it is an application-level solution, however, rather than a foundational library for building other tools. Built on Python and Next.js, DeepTutor includes a CLI for agent-native interaction as well as a modern web interface. The system uses persistent memory so that TutorBot can adjust its teaching strategy based on past user interactions, and the Apache 2.0 license encourages community contributions and commercial integration.</p>

<p>rss · GitHub Trending - Daily · Apr 12, 01:32</p>

<p><strong>Background</strong>: Traditional e-learning platforms often lack the dynamic adaptivity that genuine personalized instruction requires, while generic LLM chats lose context between sessions. DeepTutor fills this gap by building a system in which AI agents are a core component rather than an afterthought. Unlike earlier solutions that merely wrapped a standard model in a basic UI, the project emphasizes stateful autonomous agents that evolve with the learner, marking a shift in edtech from prompt-engineering tricks toward structured agent orchestration.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Large_language_model">Large language model - Wikipedia</a></li>
<li><a href="https://www.geeksforgeeks.org/artificial-intelligence/large-language-model-llm/">What is a Large Language Model ( LLM ) - GeeksforGeeks</a></li>
<li><a href="https://www.ibm.com/think/topics/large-language-models">What Are Large Language Models (LLMs)? | IBM</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project has gained traction quickly, passing 10,000 GitHub stars and building active communities on Discord, WeChat, and Feishu. Users are showing strong interest in the new v1.0.0 architecture and in deploying persistent tutors in real educational settings.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm-agents</code>, <code class="language-plaintext highlighter-rouge">#edtech</code>, <code class="language-plaintext highlighter-rouge">#personalized-learning</code>, <code class="language-plaintext highlighter-rouge">#ai-tutor</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-42"></a></p>
<h2 id="superpowers-框架强制执行结构化代理工作流-️-7010"><a href="https://github.com/obra/superpowers">Superpowers 框架强制执行结构化代理工作流</a> ⭐️ 7.0/10</h2>

<p>Superpowers introduces an agentic skills framework that stops coding agents from writing code immediately and instead enforces a workflow of specification refinement and test-driven implementation planning. It uses composable skills to guide agents through a red/green test-driven development (TDD) process and ensures adherence to YAGNI (you aren't gonna need it) and DRY (don't repeat yourself) principles before execution begins. The project addresses a key pain point: AI agents rushing to implement without enough context or planning, which tends to produce brittle code and scope creep. By enforcing a "sub-agent driven development" phase in which plans are reviewed and tasks decomposed, it significantly improves the autonomy and reliability of long-running agent sessions. The framework bridges human intent and machine execution by institutionalizing software engineering best practices inside the agent's prompting logic. It supports multiple platforms, including Claude Code, Cursor, Codex, OpenCode, and the GitHub Copilot CLI, connected via native plugin marketplaces or manual configuration. Its core approach distills specifications into digestible chunks and generates implementation plans suitable for a junior engineer before any code is written. Users can install the tool through platform-specific commands, enabling automatic skill triggering without complex setup.</p>

<p>rss · GitHub Trending - Daily · Apr 12, 01:32</p>

<p><strong>Background</strong>: Before frameworks like Superpowers, most AI coding assistants operated on a direct "request-to-code" pattern that frequently skipped critical design and testing phases. The lack of a structured workflow produced output that needed heavy manual refactoring and could not meet rigorous engineering standards such as test-driven development. Superpowers fills this gap by acting as a middleware layer that imposes discipline on the agent's reasoning, turning it from a simple code generator into a systematic development partner.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://grokipedia.com/page/Superpowers_agentic_skills_framework">Superpowers (agentic skills framework)</a></li>
<li><a href="https://en.wikipedia.org/wiki/YAGNI_principle">YAGNI principle</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: While the project has drawn attention for its methodological rigor, early adopters note that its effectiveness depends heavily on the underlying model's ability to follow complex multi-step instructions without hallucinating constraints. Some users are currently evaluating how well "sub-agent" delegation scales on large refactoring tasks compared with single-agent workflows.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#software-development</code>, <code class="language-plaintext highlighter-rouge">#workflow-automation</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#framework</code></p>

<hr />

<p><a id="item-43"></a></p>
<h2 id="ralph用于执行产品需求文档的自主-ai-代理循环-️-7010"><a href="https://github.com/snarktank/ralph">Ralph：用于执行产品需求文档的自主 AI 代理循环</a> ⭐️ 7.0/10</h2>

<p>Ralph introduces an autonomous AI agent pattern that iteratively drives coding tools until every item in a product requirements document (PRD) is complete. It manages persistent state across fresh context windows using git history and local files such as progress.txt, and supports both Amp and Claude Code as the underlying execution engine. The tool tackles a key engineering challenge of long-running autonomous agent tasks, maintaining context, without building a new framework from scratch. By orchestrating existing, capable coding models with a simple loop, it can reliably complete complex features defined in a PRD, and it demonstrates a practical way to work around token limits by resetting context while using the filesystem as memory. This lowers the barrier for engineers to build robust agentic workflows with tools they already know. Ralph converts the Markdown PRD into structured JSON to guide the agent's iteration loop. Setup is simple: copy the scripts locally or install the skills globally for Amp or Claude Code. The workflow includes automatic hand-off configuration for tasks that exceed a single context window.</p>
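
<p>A minimal sketch of the loop shape Ralph automates follows, assuming a <code class="language-plaintext highlighter-rouge">prd.json</code> produced from the Markdown PRD and a placeholder <code class="language-plaintext highlighter-rouge">run_agent</code> function; the real project shells out to Amp or Claude Code and also leans on git history for state.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import json, pathlib

PRD = pathlib.Path("prd.json")          # structured form of the Markdown PRD
PROGRESS = pathlib.Path("progress.txt")

def run_agent(task: str) -> str:
    # Placeholder for launching the coding agent with a fresh context window.
    return f"implemented: {task}"

def ralph_loop() -> None:
    items = json.loads(PRD.read_text())
    for item in items:
        if item.get("done"):
            continue
        note = run_agent(item["title"])
        with PROGRESS.open("a") as fh:   # persist memory outside the context window
            fh.write(note + "\n")
        item["done"] = True
        PRD.write_text(json.dumps(items, indent=2))

if __name__ == "__main__":
    PRD.write_text(json.dumps([{"title": "add /login endpoint", "done": False},
                               {"title": "persist sessions", "done": False}], indent=2))
    ralph_loop()
    print(PROGRESS.read_text())
</code></pre></div></div>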

<p>rss · GitHub Trending - TypeScript · Apr 12, 01:39</p>

<p><strong>Background</strong>: Autonomous AI agents working through multi-step development tasks often lose progress or hallucinate state because of context limits. Previous solutions typically relied on complex vector databases or proprietary frameworks to manage long-term memory. Ralph fills a gap by offering a lightweight, filesystem-based orchestration layer that works with off-the-shelf CLI coding tools, standardizing Geoffrey Huntley's original pattern into a reproducible, iterative development method.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Large_language_model">Large language model - Wikipedia</a></li>
<li><a href="https://www.ibm.com/think/topics/large-language-models">What Are Large Language Models (LLMs)? | IBM</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project has drawn attention for its practical utility, with users highlighting how effectively it manages large feature implementations without custom infrastructure. Discussion centers on the simplicity of using git as the memory mechanism compared with more elaborate vector-store approaches.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#typescript</code>, <code class="language-plaintext highlighter-rouge">#llm-orchestration</code></p>

<hr />

<p><a id="item-44"></a></p>
<h2 id="rowboat具备本地记忆功能的开源-ai-同事平台-️-7010"><a href="https://github.com/rowboatlabs/rowboat">Rowboat：具备本地记忆功能的开源 AI 同事平台</a> ⭐️ 7.0/10</h2>

<p>Rowboat has launched an open-source AI coworker platform that builds a persistent knowledge graph from emails and meeting notes, enabling context-aware task execution. The platform runs on the user's local machine, integrates with Google services, and supports voice input and output via Deepgram and ElevenLabs. Users can query their work history in natural language to produce briefings or roadmaps, or to track specific topics. The project addresses a key limitation of current AI agents: the lack of long-term memory and persistent context across sessions. By keeping data processing local and storing context as an editable Markdown-based knowledge graph, it offers a privacy-focused alternative to cloud-dependent AI assistants, giving developers full control of proprietary data while still using autonomous agent capabilities for complex workflows. The system converts unstructured inputs such as emails and voice memos into a structured knowledge graph that users can visualize and edit directly. It supports optional integrations such as web search via Exa and external tools via MCP servers or Composio. Installation requires service-specific API keys configured in a local JSON file, underscoring its modular, self-hosted architecture.</p>

<p>rss · GitHub Trending - TypeScript · Apr 12, 01:39</p>

<p><strong>Background</strong>: Most existing AI productivity tools rely on ephemeral chat context or opaque cloud databases, which makes them unsuitable for sensitive enterprise data or long-running project continuity. Rowboat fills this gap by pairing autonomous agents with a transparent, local-first knowledge management system. Unlike earlier solutions that treat memory as a black box, Rowboat exposes the underlying graph as plain-text files, allowing human verification and correction.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#memory</code>, <code class="language-plaintext highlighter-rouge">#typescript</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code></p>

<hr />

<p><a id="item-45"></a></p>
<h2 id="gpumd高性能-gpu-分子动力学模拟引擎-️-7010-1"><a href="https://github.com/brucefan1983/GPUMD">GPUMD：高性能 GPU 分子动力学模拟引擎</a> ⭐️ 7.0/10</h2>

<p>GPUMD is a molecular dynamics package optimized for NVIDIA GPUs and accelerated entirely with CUDA. Compared with traditional CPU-based approaches, it delivers major performance gains when simulating atomic interactions, letting researchers efficiently model larger physical systems over longer timescales. Molecular dynamics simulation is computationally expensive, which often limits the scope of materials science and chemistry research. By exploiting the massive parallelism of GPUs, GPUMD can cut simulation times for certain workloads from weeks to hours, allowing scientists to iterate faster on hypotheses about material properties and chemical reactions. While it is not an AI training tool, it complements AI-driven discovery by generating the large datasets needed for machine-learned potentials. The software implements efficient neighbor-list construction and force-calculation algorithms directly on the GPU. It supports a range of interatomic potentials and is designed to scale across multiple GPU nodes, with significant speedups for systems from thousands to millions of atoms.</p>
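
<p>To show the shape of the per-step work such an engine parallelises, here is a CPU reference sketch of cutoff-limited Lennard-Jones forces in NumPy. GPUMD itself is CUDA C++ and uses cell-based neighbour lists; nothing below is taken from its codebase.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

def lj_forces(pos: np.ndarray, box: float, cutoff: float = 2.5) -> np.ndarray:
    """Pairwise Lennard-Jones forces with a cutoff in a periodic cubic box of side `box`."""
    n = len(pos)
    forces = np.zeros_like(pos)
    for i in range(n - 1):
        d = pos[i + 1:] - pos[i]                # vectors from atom i to later atoms
        d -= box * np.round(d / box)            # minimum-image convention
        r2 = (d * d).sum(axis=1)
        mask = r2 &lt; cutoff ** 2                 # atom i's "neighbour list"
        inv6 = 1.0 / r2[mask] ** 3
        fmag = (48.0 * inv6 * inv6 - 24.0 * inv6) / r2[mask]
        fij = fmag[:, None] * d[mask]           # force on each neighbour j
        forces[i] -= fij.sum(axis=0)            # Newton's third law
        forces[i + 1:][mask] += fij
    return forces

pos = np.random.default_rng(1).uniform(0.0, 8.0, size=(64, 3))
print(lj_forces(pos, box=8.0).shape)
</code></pre></div></div>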

<p>rss · GitHub Trending - CUDA · Apr 12, 01:33</p>

<p><strong>Background</strong>: Traditional molecular dynamics codes such as LAMMPS and GROMACS have historically relied on CPU clusters, which can become a bottleneck for large simulations. While some CPU codes now offer GPU offloading, GPUMD was built from the ground up to maximize GPU utilization, with a core loop that does not depend on the CPU. This architecture addresses the demand for extreme performance in computational physics that standard hardware cannot meet.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Molecular_dynamics_simulation">Molecular dynamics simulation</a></li>
<li><a href="https://grokipedia.com/page/Thread_block_(CUDA_programming)">Thread block (CUDA programming)</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project is recognized in the computational chemistry community for its focus on pure GPU acceleration. Developers and users actively discuss optimization techniques for specific potentials and multi-GPU scaling strategies.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#molecular-dynamics</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#hpc</code>, <code class="language-plaintext highlighter-rouge">#computational-chemistry</code>, <code class="language-plaintext highlighter-rouge">#gpu</code></p>

<hr />]]></content><author><name></name></author><summary type="html"><![CDATA[From 94 items, 45 important content pieces were selected]]></summary></entry><entry xml:lang="en"><title type="html">Horizon Summary: 2026-04-12 (EN)</title><link href="https://ming-321.github.io/horizon/2026/04/11/summary-en.html" rel="alternate" type="text/html" title="Horizon Summary: 2026-04-12 (EN)" /><published>2026-04-11T16:00:00+00:00</published><updated>2026-04-11T16:00:00+00:00</updated><id>https://ming-321.github.io/horizon/2026/04/11/summary-en</id><content type="html" xml:base="https://ming-321.github.io/horizon/2026/04/11/summary-en.html"><![CDATA[<blockquote>
  <p>From 102 items, 43 important content pieces were selected</p>
</blockquote>

<hr />

<h3 id="头条速递">头条速递</h3>
<ol>
  <li><a href="#item-1">Chen Danqi and Liu Zhuang Release Open-Source Visual Reasoning RL Framework Achieving SOTA Without Thinking Data</a> ⭐️ 9.0/10</li>
  <li><a href="#item-2">Small Open-Weight Models Match Mythos in Isolated Vulnerability Detection</a> ⭐️ 8.0/10</li>
  <li><a href="#item-3">Chinese Startup Lingchu Releases Massive 100,000-Hour Human Demonstration Dataset for Embodied AI</a> ⭐️ 8.0/10</li>
  <li><a href="#item-4">Educational PyTorch Implementations Released for FlashAttention FA1–FA4</a> ⭐️ 8.0/10</li>
  <li><a href="#item-5">DFlash Speculative Decoding Achieves 3.3x Speedup on Apple Silicon MLX</a> ⭐️ 8.0/10</li>
  <li><a href="#item-6">Alibaba Shifts AI Strategy from Open-Source to Revenue Focus</a> ⭐️ 8.0/10</li>
  <li><a href="#item-7">Running Qwen3.5-397B MoE Locally with vLLM and 8x AMD GPUs</a> ⭐️ 8.0/10</li>
  <li><a href="#item-8">Experimental LLM Replaces MLP Decoders with K-Splanifolds Geometry</a> ⭐️ 8.0/10</li>
  <li><a href="#item-9">OpenAI Acquires Cirrus Labs, Shutting Down Cirrus CI Service</a> ⭐️ 7.0/10</li>
  <li><a href="#item-10">Google Launches DBSC in Chrome to Cryptographically Bind Sessions to Hardware</a> ⭐️ 7.0/10</li>
  <li><a href="#item-11">Putin Mandates Domestic AI Foundation Models for Russian National Security</a> ⭐️ 7.0/10</li>
</ol>

<h3 id="关注动态">关注动态</h3>
<ol>
  <li><a href="#item-12">openai/codex: 5 releases — rust-v0.121.0-alpha.2, rust-v0.121.0-alpha.1, rust-v0.120.0</a> ⭐️ ?/10</li>
</ol>

<h3 id="github-热榜">GitHub 热榜</h3>
<ol>
  <li><a href="#item-13">Karpathy Releases Minimal LLM Training in Pure C and CUDA</a> ⭐️ 10.0/10</li>
  <li><a href="#item-14">Instant-NGP: Lightning-Fast Neural Graphics Training</a> ⭐️ 10.0/10</li>
  <li><a href="#item-15">Nous Research Launches Self-Improving Hermes Agent Framework</a> ⭐️ 9.0/10</li>
  <li><a href="#item-16">VoxCPM2: Tokenizer-Free Multilingual TTS and Voice Cloning</a> ⭐️ 9.0/10</li>
  <li><a href="#item-17">Unsloth Studio: Unified Local UI for LLM Training and Inference</a> ⭐️ 9.0/10</li>
  <li><a href="#item-18">Feast: Production-Grade Open Source Feature Store for MLOps</a> ⭐️ 9.0/10</li>
  <li><a href="#item-19">Continue: Open-Source AI Assistant with Source-Controlled Checks</a> ⭐️ 9.0/10</li>
  <li><a href="#item-20">Chrome DevTools MCP Bridges AI Agents and Browsers</a> ⭐️ 9.0/10</li>
  <li><a href="#item-21">DeepGEMM Delivers Optimized FP8 Matrix Multiplication for CUDA</a> ⭐️ 9.0/10</li>
  <li><a href="#item-22">Mirage Optimizes LLM Inference with Persistent CUDA Mega-Kernels</a> ⭐️ 9.0/10</li>
  <li><a href="#item-23">SageAttention Accelerates Transformers via Quantization</a> ⭐️ 9.0/10</li>
  <li><a href="#item-24">Optimized CUDA Kernel for Causal Depthwise Conv1D</a> ⭐️ 9.0/10</li>
  <li><a href="#item-25">Microsoft MarkItDown: Optimizing Document Ingestion for AI Agents</a> ⭐️ 8.0/10</li>
  <li><a href="#item-26">Archon: Deterministic Harness Builder for AI Coding</a> ⭐️ 8.0/10</li>
  <li><a href="#item-27">Multica: Open-Source Platform for Managing AI Coding Agents</a> ⭐️ 8.0/10</li>
  <li><a href="#item-28">Kronos: First Open-Source Foundation Model for Financial K-Lines</a> ⭐️ 8.0/10</li>
  <li><a href="#item-29">jq: Essential CLI Tool for JSON Data Processing</a> ⭐️ 8.0/10</li>
  <li><a href="#item-30">Prefect: Modern Python Workflow Orchestration for Resilient Pipelines</a> ⭐️ 8.0/10</li>
  <li><a href="#item-31">Train a 64M GPT from Scratch in Two Hours</a> ⭐️ 8.0/10</li>
  <li><a href="#item-32">Claudian Embeds AI Coding Agents Directly into Obsidian</a> ⭐️ 8.0/10</li>
  <li><a href="#item-33">n8n: Fair-Code Automation with Native AI Agents</a> ⭐️ 8.0/10</li>
  <li><a href="#item-34">NVIDIA Releases cuopt for GPU-Accelerated Optimization</a> ⭐️ 8.0/10</li>
  <li><a href="#item-35">Rowboat: Local-First AI Coworker with Persistent Memory</a> ⭐️ 7.0/10</li>
  <li><a href="#item-36">DeepTutor Launches Agent-Native Personalized Learning System</a> ⭐️ 7.0/10</li>
  <li><a href="#item-37">OpenDataLoader PDF: High-Accuracy Parser for RAG Pipelines</a> ⭐️ 7.0/10</li>
  <li><a href="#item-38">Superpowers Framework Enforces Structured Agentic Workflows</a> ⭐️ 7.0/10</li>
  <li><a href="#item-39">Open-Source MCP Server Bridges Claude Desktop with Real-Time Trading Data</a> ⭐️ 7.0/10</li>
  <li><a href="#item-40">JetBrains Plugin Brings Claude Code and Codex GUI to IDE</a> ⭐️ 7.0/10</li>
  <li><a href="#item-41">Playwright CLI Optimizes Browser Automation for AI Agents</a> ⭐️ 7.0/10</li>
  <li><a href="#item-42">ChatLab: Local-First AI Agent for Private Chat Analysis</a> ⭐️ 7.0/10</li>
  <li><a href="#item-43">GPUMD: High-Performance GPU Molecular Dynamics Engine</a> ⭐️ 7.0/10</li>
</ol>

<h2 id="头条速递-1">头条速递</h2>

<p><a id="item-1"></a></p>
<h2 id="chen-danqi-and-liu-zhuang-release-open-source-visual-reasoning-rl-framework-achieving-sota-without-thinking-data-️-9010"><a href="https://www.qbitai.com/2026/04/399393.html">Chen Danqi and Liu Zhuang Release Open-Source Visual Reasoning RL Framework Achieving SOTA Without Thinking Data</a> ⭐️ 9.0/10</h2>

<p>Prominent researchers Chen Danqi and Liu Zhuang have released a new open-source framework for general visual reasoning using reinforcement learning (RL). This framework achieves state-of-the-art (SOTA) performance by leveraging extensive data scaling rather than requiring explicit ‘thinking data’ or chain-of-thought annotations. The approach demonstrates that broad data coverage is the primary driver for scaling visual reasoning capabilities in RL agents. This breakthrough is significant because it challenges the prevailing assumption that high-quality, explicitly annotated reasoning traces are essential for training advanced visual AI models. By eliminating the need for costly ‘thinking data,’ this method could drastically reduce the resources required to train powerful vision-language models, making high-performance AI more accessible. It suggests a paradigm shift where data diversity and volume outweigh the complexity of supervision signals in reinforcement learning contexts. Consequently, this could accelerate research in autonomous agents that must perceive and reason about complex visual environments without human-guided reasoning examples. The framework specifically targets general visual reasoning tasks and operates effectively without the inclusion of specialized thinking data often used in prior works like VisualRFT or Seg-Zero. Technical analysis indicates that the scaling of diverse perception data serves as the core mechanism for enhancing reasoning capabilities, rather than architectural changes alone. The release is fully open-source, allowing the community to replicate results and build upon this data-centric approach immediately.</p>

<p>rss · 量子位 · Apr 11, 01:23</p>

<p><strong>Background</strong>: Visual reasoning in AI typically involves Vision-Language Models (VLMs) that must first accurately perceive visual inputs before performing logical deduction. Traditionally, improving these models has relied on ‘thinking data,’ which consists of step-by-step reasoning traces or chain-of-thought annotations generated by humans or other models to guide the learning process. Reinforcement Learning (RL) has recently been integrated into VLMs to enhance their ability to solve complex tasks through trial and error, but most approaches still depend heavily on these supervised reasoning signals. Recent studies have explored two-stage frameworks to separate perception enhancement from reasoning optimization, yet the dependency on high-quality reasoning data remains a bottleneck.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://arxiv.org/html/2509.13031v1">Perception Before Reasoning: Two-Stage Reinforcement Learning for Visual Reasoning in Vision-Language Models</a></li>
<li><a href="https://arxiv.org/html/2505.12081">VisionReasoner: Unified Reasoning-Integrated Visual Perception via Reinforcement Learning</a></li>
<li><a href="https://www.nature.com/articles/s44387-025-00027-5">Fast, slow, and metacognitive thinking in AI | npj Artificial Intelligence</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#reinforcement learning</code>, <code class="language-plaintext highlighter-rouge">#computer vision</code>, <code class="language-plaintext highlighter-rouge">#ai research</code>, <code class="language-plaintext highlighter-rouge">#open source</code>, <code class="language-plaintext highlighter-rouge">#sota</code></p>

<hr />

<p><a id="item-2"></a></p>
<h2 id="small-open-weight-models-match-mythos-in-isolated-vulnerability-detection-️-8010"><a href="https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jagged-frontier">Small Open-Weight Models Match Mythos in Isolated Vulnerability Detection</a> ⭐️ 8.0/10</h2>

<p>A new analysis reveals that small, cost-effective open-weight models can detect the same software vulnerabilities as Anthropic’s advanced Mythos system when provided with isolated code contexts. Specifically, eight out of eight tested models, including one with only 3.6 billion parameters costing $0.11 per million tokens, successfully identified Mythos’s flagship FreeBSD exploit. This finding challenges the assumption that only large, expensive models are capable of high-level AI-driven security research. This development significantly lowers the barrier to entry for automated vulnerability discovery, suggesting that effective AI security tools do not require massive computational resources or proprietary access. It implies a shift in the industry where smaller organizations can leverage affordable open-weight models for robust code auditing without relying on elite closed systems. However, it also highlights a critical distinction between analyzing isolated snippets and navigating complex, real-world software architectures. Ultimately, this could democratize security research while forcing a reevaluation of how AI agents are deployed in production environments. The study specifically isolated relevant code sections from vulnerabilities showcased by Anthropic, removing the need for the model to search through vast codebases. While a 3.6 billion parameter model achieved success at a fraction of the cost, experts note that this methodology bypasses the hardest part of vulnerability hunting: locating the vulnerable code within a large, complex program. Consequently, these results apply strictly to scenarios where the suspicious code is already known and extracted, rather than full-system black-box testing.</p>

<p>hackernews · dominicq · Apr 11, 16:47</p>

<p><strong>Background</strong>: Anthropic recently introduced ‘Mythos,’ an advanced AI system designed to find and exploit zero-day vulnerabilities in major operating systems and browsers. The core challenge in AI cybersecurity has traditionally been twofold: first, scanning massive codebases to find potential flaws, and second, correctly analyzing the logic of those flaws once found. ‘Open-weight models’ refer to AI models whose parameters are publicly available, allowing them to be run locally or on cheap cloud infrastructure, unlike proprietary models accessed via API. The concept of ‘isolated code context’ involves feeding an AI a specific function or snippet rather than an entire project, which simplifies the reasoning task but removes architectural context.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jagged-frontier">AI Cybersecurity After Mythos: The Jagged Frontier | AISLE</a></li>
<li><a href="https://red.anthropic.com/2026/mythos-preview/">Claude Mythos Preview \ red.anthropic.com</a></li>
<li><a href="https://www.qodo.ai/blog/the-next-generation-of-ai-code-review-from-isolated-to-system-intelligence/">The Next Generation of AI Code Review: From Isolated to System Intelligence</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Community members largely agree that while the technical result is impressive, the methodology creates a false equivalence by ignoring the difficulty of locating vulnerabilities in large codebases. Commenters like tptacek and antirez emphasize that the true challenge lies in spotting vulnerable patterns within complex programs, not just analyzing an isolated snippet once it is handed to the model. There is a consensus that isolating code changes the nature of the task so fundamentally that it does not prove small models can replace large ones for end-to-end security auditing.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-security</code>, <code class="language-plaintext highlighter-rouge">#llm-efficiency</code>, <code class="language-plaintext highlighter-rouge">#vulnerability-research</code>, <code class="language-plaintext highlighter-rouge">#open-source-ai</code>, <code class="language-plaintext highlighter-rouge">#code-analysis</code></p>

<hr />

<p><a id="item-3"></a></p>
<h2 id="chinese-startup-lingchu-releases-massive-100000-hour-human-demonstration-dataset-for-embodied-ai-️-8010"><a href="https://www.qbitai.com/2026/04/399417.html">Chinese Startup Lingchu Releases Massive 100,000-Hour Human Demonstration Dataset for Embodied AI</a> ⭐️ 8.0/10</h2>

<p>Chinese startup Lingchu Intelligence has officially released a groundbreaking dataset comprising 100,000 hours of human demonstration data specifically designed for training embodied AI models. This massive collection aims to accelerate robot learning by providing extensive real-world interaction examples that were previously unavailable at this scale. The release marks a significant milestone for the young company, founded by post-2000 entrepreneurs, establishing them as a key player in the global robotics data ecosystem.</p>

<p>rss · 量子位 · Apr 11, 02:07</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#embodied ai</code>, <code class="language-plaintext highlighter-rouge">#robotics</code>, <code class="language-plaintext highlighter-rouge">#datasets</code>, <code class="language-plaintext highlighter-rouge">#machine learning</code>, <code class="language-plaintext highlighter-rouge">#china tech</code></p>

<hr />

<p><a id="item-4"></a></p>
<h2 id="educational-pytorch-implementations-released-for-flashattention-fa1fa4-️-8010"><a href="https://old.reddit.com/r/MachineLearning/comments/1sim6y1/flashattention_fa1fa4_in_pytorch_educational/">Educational PyTorch Implementations Released for FlashAttention FA1–FA4</a> ⭐️ 8.0/10</h2>

<p>A developer has updated the FlashAttention-PyTorch repository to include simplified, educational implementations of FlashAttention versions 1 through 4 using plain PyTorch code. These implementations explicitly illustrate algorithmic progressions, such as the shift from tiled online softmax in FA1 to the explicit scheduler with conditional rescaling in FA4. The project aims to clarify design changes like split-Q ownership and staged pipelines without requiring deep knowledge of CUDA or specific GPU architectures like Hopper and Blackwell. This resource is significant because it lowers the barrier to understanding complex attention optimizations that are typically hidden within highly optimized CUDA kernels. By exposing the algorithmic logic in accessible PyTorch code, it enables researchers and engineers to grasp the specific improvements driving efficiency in modern transformer models. This clarity is crucial for adapting these techniques to new hardware or developing custom variations without needing to reverse-engineer low-level C++ or Triton code. Ultimately, it bridges the gap between theoretical algorithm papers and practical, high-performance implementation details. The repository specifically details FA1 as a tiled online softmax baseline, while FA2 introduces split-Q query-tile ownership and deferred normalization. FA3 adds an explicit staged pipeline with ping-pong tile buffers and a simplified FP8 forward path, whereas FA4 features an explicit scheduler managing main, softmax, and correction phases. The author emphasizes that these are not production-ready kernels and do not faithfully recreate hardware-specific optimizations found in official releases. Instead, they preserve the exact attention mathematics while varying the orchestration strategies to highlight version-to-version differences.</p>
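
<p>For readers who want the flavour of the FA1-style baseline the repository teaches, here is a minimal PyTorch sketch of tiled attention with an online softmax (running max and running denominator), so the full attention matrix is never materialised. It is written from the published algorithm rather than copied from the linked repository, and it checks itself against naive attention at the end.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch

def tiled_attention(q, k, v, tile=128):
    """q: (n, d); k, v: (m, d). Single head, no masking, float32, for clarity only."""
    scale = q.shape[-1] ** -0.5
    out = torch.zeros_like(q)
    running_max = torch.full((q.shape[0], 1), float("-inf"))
    denom = torch.zeros(q.shape[0], 1)
    for start in range(0, k.shape[0], tile):
        kb, vb = k[start:start + tile], v[start:start + tile]
        scores = (q @ kb.T) * scale                     # (n, tile) block of logits
        tile_max = scores.max(dim=-1, keepdim=True).values
        new_max = torch.maximum(running_max, tile_max)
        correction = torch.exp(running_max - new_max)   # rescale what was accumulated so far
        p = torch.exp(scores - new_max)
        out = out * correction + p @ vb
        denom = denom * correction + p.sum(dim=-1, keepdim=True)
        running_max = new_max
    return out / denom

q, k, v = (torch.randn(256, 64) for _ in range(3))
ref = torch.softmax((q @ k.T) * 64 ** -0.5, dim=-1) @ v
print((tiled_attention(q, k, v) - ref).abs().max())     # should be on the order of 1e-6
</code></pre></div></div>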

<p>rss · r/MachineLearning · Apr 11, 15:33</p>

<p><strong>Background</strong>: FlashAttention is an IO-aware exact attention algorithm designed to reduce memory reads and writes between GPU high bandwidth memory (HBM) and on-chip SRAM using tiling techniques. Standard attention mechanisms often suffer from memory bottlenecks, which FlashAttention mitigates by processing data in tiles that fit into faster on-chip memory. The evolution from FA1 to FA4 involves increasingly sophisticated scheduling and pipelining to maximize overlap between computation and memory operations on advanced GPU architectures like NVIDIA’s Hopper and Blackwell. Understanding these algorithms usually requires navigating complex CUDA code, which this educational project simplifies.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://www.together.ai/blog/flashattention-4">FlashAttention-4: Algorithm and Kernel Pipelining Co-Design for Asymmetric Hardware Scaling</a></li>
<li><a href="https://alexdremov.me/understanding-flash-attention-writing-the-algorithm-from-scratch-in-triton/">Understanding Flash Attention: Writing the Algorithm from Scratch in Triton</a></li>
<li><a href="https://intuitionlabs.ai/articles/blackwell-vs-hopper-gpu-architecture-comparison">Blackwell vs Hopper : A Deep Dive GPU Architecture ... | IntuitionLabs</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#flashattention</code>, <code class="language-plaintext highlighter-rouge">#pytorch</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#transformers</code>, <code class="language-plaintext highlighter-rouge">#education</code></p>

<hr />

<p><a id="item-5"></a></p>
<h2 id="dflash-speculative-decoding-achieves-33x-speedup-on-apple-silicon-mlx-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1simszl/dflash_speculative_decoding_on_apple_silicon_85/">DFlash Speculative Decoding Achieves 3.3x Speedup on Apple Silicon MLX</a> ⭐️ 8.0/10</h2>

<p>A developer has created a native MLX implementation of DFlash speculative decoding for Apple Silicon, achieving 85 tokens per second on an M5 Max chip with the Qwen3.5-9B model. This new method uses a small draft model to generate 16 tokens in parallel via block diffusion, which are then verified by the target model in a single forward pass. The results show a 3.3x speedup over the baseline while maintaining bit-for-bit accuracy with greedy decoding. This breakthrough significantly enhances the viability of running large language models locally on consumer hardware, specifically addressing the bandwidth-bound nature of Apple’s unified memory architecture. By reducing the inference latency by more than threefold, it makes real-time interactive applications much more feasible for developers using the MLX framework. Furthermore, it demonstrates that novel decoding strategies like block diffusion can outperform traditional autoregressive methods even on non-CUDA platforms. This could accelerate the adoption of edge AI solutions where privacy and low latency are critical. The implementation required specific optimizations, including a patch to support Qwen3.5’s head_dim=256 in MLX’s steel_attention and reducing GPU-to-CPU synchronization points from two to one per cycle. Performance varies by model size and quantization, with 8-bit quantization yielding better speedup ratios than 4-bit because the latter makes the verification step too fast, bottlenecking the BF16 draft model. Acceptance rates for the drafted tokens ranged between 80% and 87% across all tested configurations.</p>
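
<p>The verification half of speculative decoding is easy to sketch for the greedy case: the target model scores the whole drafted block in one forward pass, the longest agreeing prefix is kept, and the target's own token is taken at the first disagreement. The snippet below shows only that generic step; DFlash's block-diffusion draft model and the MLX-specific kernels are not represented.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch

def verify_greedy(draft_tokens: torch.Tensor, target_logits: torch.Tensor):
    """draft_tokens: (B,) drafted ids; target_logits: (B+1, V) target logits per position."""
    target_choice = target_logits.argmax(dim=-1)            # what the target itself would emit
    agree = (target_choice[:-1] == draft_tokens).long()
    n_accept = int(agree.cumprod(dim=0).sum())               # length of the agreeing prefix
    accepted = draft_tokens[:n_accept]
    bonus = target_choice[n_accept:n_accept + 1]              # target's token at the first mismatch
    return torch.cat([accepted, bonus]), n_accept

draft = torch.tensor([11, 42, 42, 7])
logits = torch.randn(5, 100)
logits[0, 11] = 99.0; logits[1, 42] = 99.0                    # make the target agree on the first two
tokens, accepted = verify_greedy(draft, logits)
print(tokens.tolist(), "accepted", accepted, "of", len(draft))
</code></pre></div></div>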

<p>rss · r/LocalLLaMA · Apr 11, 15:56</p>

<p><strong>Background</strong>: Speculative decoding is a technique that accelerates LLM inference by using a smaller, faster ‘draft’ model to propose multiple tokens, which a larger ‘target’ model then verifies in parallel rather than generating sequentially. DFlash specifically employs ‘block diffusion,’ a method where the draft model generates a block of tokens simultaneously instead of one by one, increasing efficiency. MLX is Apple’s array framework designed for machine learning on Apple Silicon, leveraging its unified memory architecture to allow efficient data sharing between the CPU and GPU without copying. Traditionally, these optimization techniques have been predominantly developed for NVIDIA CUDA ecosystems, making native Apple Silicon implementations rare.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://z-lab.ai/projects/dflash/">DFlash : Block Diffusion for Flash Speculative Decoding - Z Lab</a></li>
<li><a href="https://developer.apple.com/videos/play/wwdc2025/315/">Get started with MLX for Apple silicon - WWDC25... - Apple Developer</a></li>
<li><a href="https://www.emergentmind.com/topics/dflash-block-diffusion-for-flash-speculative-decoding">DFlash : Accelerating LLMs with Block Diffusion</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#apple silicon</code>, <code class="language-plaintext highlighter-rouge">#speculative decoding</code>, <code class="language-plaintext highlighter-rouge">#mlx</code>, <code class="language-plaintext highlighter-rouge">#local llm</code>, <code class="language-plaintext highlighter-rouge">#inference optimization</code></p>

<hr />

<p><a id="item-6"></a></p>
<h2 id="alibaba-shifts-ai-strategy-from-open-source-to-revenue-focus-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sip3hd/ft_chinas_alibaba_shifts_towards_revenue_over/">Alibaba Shifts AI Strategy from Open-Source to Revenue Focus</a> ⭐️ 8.0/10</h2>

<p>Financial Times reports that Alibaba is pivoting its artificial intelligence strategy away from contributing open-source models toward prioritizing revenue generation through proprietary systems. This shift marks a departure from their previous approach of releasing powerful open-weight models like the Qwen series to the global community. The company now intends to keep its most advanced capabilities internal or available only via paid API services to monetize their AI investments directly. This strategic pivot by a major Chinese tech giant could significantly reduce the availability of high-quality open-weight models for developers and researchers worldwide. It signals a broader industry trend where companies are moving from community-driven growth to protecting intellectual property for immediate financial returns. If other firms follow suit, the pace of collaborative innovation in the global AI ecosystem might slow down considerably. Furthermore, this change could alter the competitive dynamics between US and Chinese AI developers by restricting access to state-of-the-art tools previously shared openly. The report highlights that while Alibaba may still release some smaller or older models, its cutting-edge research will increasingly be reserved for commercial products. This decision likely stems from the high costs associated with training large language models and the pressure to demonstrate profitability to shareholders. Developers who have relied on Alibaba’s Qwen models for local deployment may need to seek alternative open-source foundations or transition to paid cloud services. The exact timeline for when future models will become fully proprietary has not been explicitly detailed in the summary.</p>

<p>rss · r/LocalLLaMA · Apr 11, 17:23</p>

<p><strong>Background</strong>: Open-source AI refers to machine learning models whose weights and architectures are publicly released, allowing anyone to inspect, modify, and run them locally without paying fees. Alibaba has been a key contributor to this space, particularly with its Qwen series, which has been widely adopted for its strong performance in coding and reasoning tasks. Historically, releasing models openly helped companies build brand reputation and foster ecosystem adoption, even if it meant giving away valuable technology for free. However, as the cost of AI development skyrockets, many firms are re-evaluating whether open-sourcing remains a sustainable business model.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#alibaba</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#ai-strategy</code>, <code class="language-plaintext highlighter-rouge">#industry-dynamics</code>, <code class="language-plaintext highlighter-rouge">#china-tech</code></p>

<hr />

<p><a id="item-7"></a></p>
<h2 id="running-qwen35-397b-moe-locally-with-vllm-and-8x-amd-gpus-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1simsqp/run_qwen35397ba13b_with_vllm_and_8xr9700/">Running Qwen3.5-397B MoE Locally with vLLM and 8x AMD GPUs</a> ⭐️ 8.0/10</h2>

<p>A community tutorial now enables running the massive 397-billion parameter Qwen3.5 MoE model locally using vLLM, ROCm, and eight consumer-grade AMD R9700 GPUs with MXFP4 quantization. The guide provides a specific Dockerfile and launch script that patches Triton to support MXFP4 on RDNA4 architecture, achieving speeds of up to 100 tokens per second under multi-request loads. This setup allows the model to operate with a context window of 131,072 tokens while utilizing approximately 98% of available GPU memory. This development significantly lowers the barrier for running state-of-the-art Mixture of Experts models on non-NVIDIA hardware, challenging the dominance of CUDA-exclusive ecosystems. By demonstrating that nearly 400B parameter models can run on consumer AMD cards via MXFP4 quantization, it opens new possibilities for cost-effective, high-performance local AI deployment. The achievement highlights the maturing stability of AMD’s ROCm stack and vLLM’s flexibility in supporting diverse hardware configurations. Ultimately, this empowers developers and researchers to experiment with massive models without relying on expensive cloud infrastructure or enterprise-grade NVIDIA clusters. The setup requires a custom patched version of vLLM built from a specific Docker image to enable MXFP4 support on RDNA4 GPUs, involving a sed command to modify Triton’s topk.py file. Performance metrics indicate an initial load time of 400-600 seconds, followed by 30 tokens/second for single requests and up to 100 tokens/second when handling four concurrent requests. Users must configure environment variables like HIP_VISIBLE_DEVICES and adjust power limits (tested at 210W vs 300W) to optimize throughput, while the model is limited to 4 concurrent sequences to maintain stability.</p>
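
<p>For orientation, this is roughly what the offline-inference side looks like through vLLM's public Python API. Treat the model id, the <code class="language-plaintext highlighter-rouge">quantization="mxfp4"</code> string, and the memory settings as assumptions reconstructed from the post; the tutorial itself relies on a patched Docker build and launch script rather than this snippet, and running it requires the described 8-GPU ROCm setup.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import os
from vllm import LLM, SamplingParams

os.environ.setdefault("HIP_VISIBLE_DEVICES", "0,1,2,3,4,5,6,7")   # the eight R9700 cards

llm = LLM(
    model="Qwen/Qwen3.5-397B-A13B",       # placeholder Hugging Face id for the model in the post
    tensor_parallel_size=8,               # shard the MoE across all eight GPUs
    max_model_len=131072,                 # the 128K context reported in the tutorial
    gpu_memory_utilization=0.98,
    quantization="mxfp4",                 # assumes the patched Triton/vLLM build described in the guide
)
outputs = llm.generate(
    ["Summarise the trade-offs of MXFP4 quantization in two sentences."],
    SamplingParams(temperature=0.2, max_tokens=128),
)
print(outputs[0].outputs[0].text)
</code></pre></div></div>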

<p>rss · r/LocalLLaMA · Apr 11, 15:56</p>

<p><strong>Background</strong>: vLLM is a high-throughput inference engine known for its memory efficiency and speed, widely used for serving large language models in production environments. ROCm is AMD’s open-source software stack for GPU programming, serving as an alternative to NVIDIA’s CUDA for accelerating AI workloads on AMD hardware. MXFP4 is an emerging micro-scaling floating-point format designed to reduce memory usage and increase inference speed for large models by compressing weights to 4 bits. Mixture of Experts (MoE) architectures, like the one used in Qwen3.5, activate only a subset of parameters for each token, allowing for massive total parameter counts while maintaining efficient computation.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/vllm-project/vllm">vllm -project/ vllm : A high-throughput and memory-efficient inference ...</a></li>
<li><a href="https://en.wikipedia.org/wiki/ROCm">ROCm - Wikipedia</a></li>
<li><a href="https://www.amd.com/en/products/software/rocm.html">AMD ROCm ™ software empowers developers to optimize AI and...</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#vllm</code>, <code class="language-plaintext highlighter-rouge">#quantization</code>, <code class="language-plaintext highlighter-rouge">#rocm</code>, <code class="language-plaintext highlighter-rouge">#qwen</code></p>

<hr />

<p><a id="item-8"></a></p>
<h2 id="experimental-llm-replaces-mlp-decoders-with-k-splanifolds-geometry-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sivm24/heres_how_my_llms_decoder_block_changed_while/">Experimental LLM Replaces MLP Decoders with K-Splanifolds Geometry</a> ⭐️ 8.0/10</h2>

<p>A researcher has successfully trained an experimental 18M parameter LLM that replaces standard Multi-Layer Perceptron (MLP) decoder blocks with discrete lower-dimensional spline manifold geometry, a concept detailed in their ‘K-Splanifolds’ paper. The model, currently at layer 96 of 128, has demonstrated consistent loss reduction after processing 5 billion tokens of training data. Visualizations shared by the author illustrate the structural evolution of the decoder block throughout this training phase, indicating the architecture is learning effectively without stagnation. This development is significant because it challenges the dominance of the standard Transformer architecture, which has relied on MLP layers for years, by introducing a novel geometric approach to non-linear transformation. If proven scalable, K-Splanifolds could offer a more parameter-efficient alternative to traditional dense layers, potentially reducing the computational cost of training and inference for future models. This experiment provides rare empirical evidence for alternative neural network geometries, encouraging the research community to explore beyond the current state-of-the-art designs. Success in this small-scale model could inspire larger experiments that might redefine how we construct decoder blocks in deep learning. The model utilizes a specific architecture called ‘K-Splanifolds’ based on discrete lower-dimensional spline manifold geometry rather than conventional feed-forward networks. It is an 18 million parameter model that has processed 5 billion tokens so far, with training ongoing until signs of stagnation appear. The author specifically highlights the development of layer 96 out of a total of 128 layers as a representative example of the model’s internal changes. No specific performance benchmarks against standard LLaMA or other baseline models were provided in the initial post, focusing instead on the internal loss dynamics.</p>

<p>rss · r/LocalLLaMA · Apr 11, 21:33</p>

<p><strong>Background</strong>: In standard Transformer architectures, the decoder block typically consists of self-attention mechanisms followed by a Multi-Layer Perceptron (MLP), also known as a feed-forward network, which processes information independently for each position. These MLP layers are crucial for introducing non-linearity and expanding the model’s capacity to learn complex patterns, but they account for a large portion of the model’s parameters and compute costs. The concept of ‘manifold geometry’ in machine learning refers to the idea that high-dimensional data often lies on or near a lower-dimensional curved surface, which this new approach attempts to exploit directly. By replacing the rigid grid-like structure of an MLP with flexible spline-based manifolds, the researcher aims to model data distributions more naturally and efficiently.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm-architecture</code>, <code class="language-plaintext highlighter-rouge">#ml-research</code>, <code class="language-plaintext highlighter-rouge">#transformers</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#experimental-ai</code></p>

<hr />

<p><a id="item-9"></a></p>
<h2 id="openai-acquires-cirrus-labs-shutting-down-cirrus-ci-service-️-7010"><a href="https://cirruslabs.org/">OpenAI Acquires Cirrus Labs, Shutting Down Cirrus CI Service</a> ⭐️ 7.0/10</h2>

<p>OpenAI has acquired Cirrus Labs in a talent-focused deal aimed at enhancing its engineering capabilities for agentic tooling. As a direct result of this acquisition, the popular Cirrus CI continuous integration service will cease operations effective June 1, 2026. The move signals a strategic shift where OpenAI prioritizes acquiring human expertise over maintaining existing product lines. This acquisition highlights a growing trend where major AI companies prioritize talent hoarding over product continuity, potentially destabilizing critical open-source infrastructure. Major projects like SciPy and PostgreSQL, which rely on Cirrus CI for their build pipelines, now face urgent migration challenges and potential workflow disruptions. Unlike product-led acquisitions that integrate technology, this deal removes a key service from the ecosystem, forcing the community to scramble for alternatives. It raises broader concerns about the fragility of open-source dependencies when backed by small teams vulnerable to acqui-hires. The shutdown of Cirrus CI is scheduled for Monday, June 1, 2026, giving users approximately one year to migrate their workflows. The acquisition is explicitly described as non-product-led, meaning the Cirrus CI platform itself will not be integrated into OpenAI’s offerings but rather discontinued. The Cirrus Labs team intends to focus on building new environments for both human and agentic engineers within OpenAI.</p>

<p>hackernews · seekdeep · Apr 11, 13:01</p>

<p><strong>Background</strong>: Cirrus Labs was known for providing Cirrus CI, a cloud-based continuous integration and delivery platform widely used by open-source projects for its flexibility and support for various containers. Continuous Integration (CI) is a DevOps practice where code changes are automatically tested and built, serving as a critical backbone for software reliability. Open-source projects often depend on such free or low-cost tiers provided by smaller vendors, making them susceptible if those vendors are acquired and shut down. This event contrasts with typical tech acquisitions where the goal is usually to scale a product rather than terminate it.</p>

<p><strong>Discussion</strong>: Community members expressed significant concern regarding the stability of open-source infrastructure, noting that major projects like SciPy and PostgreSQL are directly affected by this shutdown. Some users clarified that this is a talent acquisition rather than a product merger, emphasizing the impending loss of the service compared to other recent deals like Astral’s. There is also a mix of cynicism about AI companies repeatedly buying development teams only to discontinue their public tools.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#openai</code>, <code class="language-plaintext highlighter-rouge">#acquisitions</code>, <code class="language-plaintext highlighter-rouge">#ci-cd</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#agentic-ai</code></p>

<hr />

<p><a id="item-10"></a></p>
<h2 id="google-launches-dbsc-in-chrome-to-cryptographically-bind-sessions-to-hardware-️-7010"><a href="https://security.googleblog.com/2026/04/protecting-cookies-with-device-bound.html">Google Launches DBSC in Chrome to Cryptographically Bind Sessions to Hardware</a> ⭐️ 7.0/10</h2>

<p>Google has officially introduced Device-Bound Session Credentials (DBSC) in Chrome version 146 for Windows, a new security feature developed jointly by the Chrome and Google Account security teams. This technology cryptographically binds authentication sessions to specific physical devices by utilizing hardware security modules like TPM to generate non-exportable key pairs stored locally. Consequently, even if attackers steal a user’s session cookies, they cannot reuse them on different devices, effectively neutralizing traditional cookie theft attacks. This update represents a fundamental shift in web session management by moving trust from easily stolen software tokens to secure hardware boundaries, significantly raising the bar for identity theft. It directly mitigates the widespread threat of session hijacking, where attackers impersonate users after intercepting credentials via malware or network sniffing. By rendering stolen cookies useless outside the original device context, DBSC protects users against increasingly sophisticated info-stealer malware without requiring changes to user behavior. This approach sets a new industry standard for browser-based identity protection that competitors may soon need to adopt. The DBSC implementation relies on Trusted Platform Modules (TPM) or equivalent hardware security features to ensure that the private keys used for session binding never leave the device. While currently launched for Chrome on Windows, the architecture is designed to prevent the export of cryptographic keys, meaning server-side validation will reject authentication attempts from unauthorized hardware. This specific focus on hardware-bound keys addresses the limitation of traditional cookies, which can be freely copied and replayed by attackers once accessed.</p>
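
<p>As a concept-only sketch of the challenge-response idea, the snippet below signs a server challenge with a device key and verifies it against the registered public key. In real DBSC the private key is generated inside the TPM and never leaves the device; here an ordinary in-memory key from the <code class="language-plaintext highlighter-rouge">cryptography</code> package stands in purely to show the protocol shape.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

device_key = ec.generate_private_key(ec.SECP256R1())     # in DBSC this key lives in the TPM
server_known_pubkey = device_key.public_key()             # registered when the session starts

def refresh_session(challenge: bytes, signer) -> bool:
    """Server-side check: only the device holding the bound key can refresh the session."""
    signature = signer.sign(challenge, ec.ECDSA(hashes.SHA256()))
    try:
        server_known_pubkey.verify(signature, challenge, ec.ECDSA(hashes.SHA256()))
        return True
    except InvalidSignature:
        return False

print(refresh_session(b"nonce-123", device_key))                                # True
print(refresh_session(b"nonce-123", ec.generate_private_key(ec.SECP256R1())))   # attacker with a stolen cookie: False
</code></pre></div></div>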

<p>telegram · zaihuapd · Apr 11, 00:18</p>

<p><strong>Background</strong>: Session hijacking is a common cyberattack where criminals steal a user’s session ID, often stored in cookies, to gain unauthorized access to online accounts without needing passwords. Traditional defenses rely on HTTPS encryption and short expiration times, but these do not prevent attackers from using stolen cookies within the valid window. Hardware security modules like TPM are specialized chips designed to securely store cryptographic keys and perform operations in an isolated environment, making them ideal for anchoring digital identities. DBSC leverages this hardware capability to create a link between the digital session and the physical machine that software-only solutions cannot replicate.</p>
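
<p><strong>Example</strong>: A minimal sketch of the challenge-response idea behind device-bound sessions, using the Python <code class="language-plaintext highlighter-rouge">cryptography</code> package. The flow and names here are illustrative assumptions; real DBSC keeps the private key inside the TPM and follows Chrome's own protocol rather than this simplified exchange.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Illustrative sketch only: real DBSC keeps the private key inside the TPM and
# follows Chrome's DBSC specification, not this simplified flow.
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

# Device side: generate a key pair at session start (in DBSC this key is
# hardware-bound and non-exportable; software generation here is for illustration).
device_private_key = ec.generate_private_key(ec.SECP256R1())
registered_public_key = device_private_key.public_key()  # sent to the server once

# Server side: bind the session to the registered public key, then periodically
# issue a fresh challenge that only the original device can sign.
challenge = os.urandom(32)

# Device side: sign the challenge with the device-bound key.
signature = device_private_key.sign(challenge, ec.ECDSA(hashes.SHA256()))

# Server side: verify the signature. A stolen cookie presented from another
# device cannot produce a valid signature, so the session refresh is rejected.
registered_public_key.verify(signature, challenge, ec.ECDSA(hashes.SHA256()))
print("challenge verified: session stays bound to the original device")
</code></pre></div></div>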

<details><summary>References</summary>
<ul>
<li><a href="https://www.eccouncil.org/cybersecurity-exchange/ethical-hacking/how-to-prevent-session-hijacking-attacks/">What Is Session Hijacking ? Session Hijacking Attack Prevention</a></li>
<li><a href="https://develop-descope.vercel.app/learn/post/session-hijacking">Session Hijacking Explained &amp; How to Prevent It</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cybersecurity</code>, <code class="language-plaintext highlighter-rouge">#google chrome</code>, <code class="language-plaintext highlighter-rouge">#session-management</code>, <code class="language-plaintext highlighter-rouge">#web-security</code>, <code class="language-plaintext highlighter-rouge">#identity-protection</code></p>

<hr />

<p><a id="item-11"></a></p>
<h2 id="putin-mandates-domestic-ai-foundation-models-for-russian-national-security-️-7010"><a href="https://www.news.cn/20260411/9dfc4f3241154502b4a1be41510f92fc/c.html">Putin Mandates Domestic AI Foundation Models for Russian National Security</a> ⭐️ 7.0/10</h2>

<p>On April 10, Russian President Vladimir Putin declared that Russia must independently develop globally competitive AI foundation models, ensuring the entire research and training cycle is completed by domestic enterprises. He emphasized that mastering large language models is fundamental to autonomous development across all sectors, including defense, economy, and healthcare. To execute this strategy, a special committee will focus on five key tasks this year, ranging from accelerating AI implementation in critical fields to restructuring human resource cultivation. This mandate signifies a major shift towards technological sovereignty, aiming to reduce Russia’s reliance on foreign AI technologies amidst ongoing geopolitical tensions. By insisting on domestic control over the entire AI lifecycle, Russia seeks to prevent potential security vulnerabilities associated with using foreign-owned foundation models like those from Meta or Google. This move could accelerate the creation of a distinct Russian AI ecosystem, potentially leading to increased fragmentation in the global technology landscape. Furthermore, it highlights the growing trend where national security strategies are becoming inextricably linked with advancements in artificial intelligence capabilities. The strategy explicitly requires that the full development and training cycles be conducted by Russian companies, excluding foreign involvement in these core processes. The special committee’s five-point plan includes developing autonomous solutions specifically for national defense and assessing risks associated with AI applications. While the announcement sets a clear political direction, it currently lacks specific technical benchmarks, timelines for model release, or details on the computational infrastructure available to support such ambitious goals.</p>

<p>telegram · zaihuapd · Apr 11, 06:31</p>

<p><strong>Background</strong>: AI foundation models are large-scale machine learning models trained on vast amounts of data that serve as a base for building various downstream applications, such as chatbots and image generators. Large Language Models (LLMs), a prominent type of foundation model, use transformer architectures to understand and generate human-like text, powering tools like ChatGPT and Llama. Currently, the most capable foundation models are dominated by US-based companies, raising concerns for other nations about data privacy, censorship, and dependency on foreign infrastructure. Consequently, many countries are now viewing the ability to train their own sovereign models as a critical component of national security.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://research.ibm.com/blog/what-are-foundation-models">What are foundation models ? - IBM Research</a></li>
<li><a href="https://en.wikipedia.org/wiki/Large_language_model">Large language model - Wikipedia</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-policy</code>, <code class="language-plaintext highlighter-rouge">#geopolitics</code>, <code class="language-plaintext highlighter-rouge">#national-security</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#tech-sovereignty</code></p>

<hr />

<h2 id="关注动态-1">关注动态</h2>

<p><a id="item-12"></a></p>
<h2 id="openaicodex-5-releases--rust-v01210-alpha2-rust-v01210-alpha1-rust-v01200-️-10"><a href="https://github.com/openai/codex/releases/tag/rust-v0.121.0-alpha.2">openai/codex: 5 releases — rust-v0.121.0-alpha.2, rust-v0.121.0-alpha.1, rust-v0.120.0</a> ⭐️ ?/10</h2>

<p>The repository has issued a rapid series of releases, advancing the Rust implementation from v0.119.0 through the stable v0.120.0 and on to the current v0.121.0-alpha.2. These updates likely include the iterative improvements and bug fixes typical of a fast-paced release cycle, though specific feature details are not provided in the release titles. Developers tracking the Rust releases should upgrade to v0.120.0 for stability or test v0.121.0-alpha.2 for upcoming features, while watching for the breaking changes often introduced in alpha versions.</p>

<p>github · github-actions[bot] · Apr 11, 21:35</p>

<hr />

<h2 id="github-热榜-1">GitHub 热榜</h2>

<p><a id="item-13"></a></p>
<h2 id="karpathy-releases-minimal-llm-training-in-pure-c-and-cuda-️-10010"><a href="https://github.com/karpathy/llm.c">Karpathy Releases Minimal LLM Training in Pure C and CUDA</a> ⭐️ 10.0/10</h2>

<p>Andrej Karpathy has released llm.c, a dependency-free implementation of large language model training written entirely in raw C and CUDA. This project strips away high-level frameworks like PyTorch to expose the fundamental operations of transformer models directly on the GPU. It serves as a concise, educational reference for understanding the low-level mechanics of AI infrastructure. This project matters because it demystifies the complex abstraction layers typically found in deep learning frameworks, offering unparalleled transparency into model training. By reducing the codebase to its essentials, it enables engineers to study performance optimization techniques and memory management without framework overhead. It bridges the gap between theoretical knowledge of neural networks and practical, high-performance GPU programming skills. The repository implements the full training loop, including forward and backward passes, using only standard C and NVIDIA’s CUDA API. It focuses on educational clarity and performance, avoiding external dependencies to ensure the code remains readable and modifiable. The project is specifically designed for developers who want to understand how transformers work at the hardware level.</p>

<p>rss · GitHub Trending - CUDA · Apr 11, 01:33</p>

<p><strong>Background</strong>: Prior to this release, understanding LLM training internals often required navigating massive, complex codebases like PyTorch or TensorFlow. Existing educational resources frequently relied on high-level abstractions that hid the specific GPU kernel implementations responsible for speed. llm.c fills this niche by providing a minimal, from-scratch implementation that acts as a critical reference for performance engineering and system design.</p>
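
<p><strong>Example</strong>: llm.c itself is written in C and CUDA; as a rough, framework-free illustration of the kind of hand-written forward and backward pass it implements at transformer scale, here is a one-layer training step sketched in NumPy.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Tiny illustration (not llm.c code): one framework-free training step for a
# linear layer with a mean-squared-error loss, the kind of explicit forward and
# backward pass that llm.c writes out by hand in C/CUDA for a full transformer.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(8, 4))   # weights
x = rng.normal(size=(16, 8))             # batch of inputs
y = rng.normal(size=(16, 4))             # targets

# Forward pass: prediction and loss, written out explicitly.
pred = x @ W
loss = np.mean((pred - y) ** 2)

# Backward pass: gradients derived by hand instead of autograd.
dpred = 2.0 * (pred - y) / pred.size
dW = x.T @ dpred

# Parameter update (plain SGD).
W -= 0.1 * dW
print(f"loss = {loss:.4f}")
</code></pre></div></div>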

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/coderonion/awesome-cuda-and-hpc">GitHub - coderonion/awesome- cuda -and-hpc: This...</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The AI community has responded with high enthusiasm, viewing this project as an essential resource for mastering low-level deep learning optimization. Many developers are already using it to benchmark custom CUDA kernels and to teach the fundamentals of transformer architecture without framework magic.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#c</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#education</code></p>

<hr />

<p><a id="item-14"></a></p>
<h2 id="instant-ngp-lightning-fast-neural-graphics-training-️-10010"><a href="https://github.com/NVlabs/instant-ngp">Instant-NGP: Lightning-Fast Neural Graphics Training</a> ⭐️ 10.0/10</h2>

<p>NVIDIA’s instant-ngp introduces a multiresolution hash encoding technique that drastically reduces NeRF training times from hours to seconds. This framework enables near-instant convergence for neural graphics primitives on a single GPU by optimizing small networks with trainable feature vectors. This project solves the critical bottleneck of slow training speeds that previously hindered the practical adoption of Neural Radiance Fields (NeRF). By leveraging CUDA and efficient hash grids, it transforms NeRF from a research curiosity into a viable tool for real-time applications like VR and robotics. It establishes a new standard for performance in 3D deep learning, making high-fidelity scene reconstruction accessible without massive compute clusters. The core innovation is a sparse multiresolution hash table that stores learnable feature vectors, allowing the network to focus computation only on relevant spatial regions. Implemented in pure CUDA, the framework achieves training speeds up to two orders of magnitude faster than previous PyTorch-based implementations. It supports tasks beyond NeRF, including signed distance functions and gigapixel image fitting.</p>

<p>rss · GitHub Trending - CUDA · Apr 11, 01:33</p>

<p><strong>Background</strong>: Prior to instant-ngp, NeRF models required extensive training times ranging from several hours to days, limiting their use in iterative development workflows. Traditional methods relied on dense positional encodings within large MLPs, which were computationally expensive and slow to converge. This project fills the niche for high-speed, production-ready infrastructure in the burgeoning field of neural rendering.</p>
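
<p><strong>Example</strong>: A conceptual sketch of the spatial hashing at the heart of the multiresolution hash encoding, reduced to a single level with no trilinear interpolation. The prime constants come from the hash encoding paper; the table size and feature dimensions are illustrative.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Conceptual sketch of the spatial hashing behind instant-ngp's multiresolution
# hash encoding (one level, no trilinear interpolation).
import numpy as np

T = 2 ** 14   # hash table size for this resolution level
F = 2         # feature dimensions stored per entry
table = np.random.default_rng(0).normal(scale=1e-4, size=(T, F))  # trainable features

def hash_index(grid_coords, primes=(1, 2654435761, 805459861)):
    """Map an integer 3D grid coordinate to a slot in the hash table."""
    h = 0
    for c, p in zip(grid_coords, primes):
        h ^= (int(c) * p) % (2 ** 64)   # emulate the kernel's 64-bit wraparound
    return h % T

# Fetch the learnable feature vector for one grid vertex near a query point;
# the full method interpolates the vertices of the cell across many levels.
features = table[hash_index((123, 45, 67))]
print(features)
</code></pre></div></div>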

<details><summary>References</summary>
<ul>
<li><a href="https://nvlabs.github.io/instant-ngp/">Instant Neural Graphics Primitives with a Multiresolution Hash Encoding</a></li>
<li><a href="https://arxiv.org/abs/2201.05989">Instant Neural Graphics Primitives with a Multiresolution Hash Encoding</a></li>
<li><a href="https://www.zhihu.com/question/526879513">NeRF（神经辐射场）有相关的物理（光学）原理支撑吗？</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The AI and graphics communities widely regard this repository as the definitive baseline for modern NeRF research and implementation. Developers frequently cite its hash encoding strategy as a fundamental building block for subsequent advancements in 3D Gaussian splatting and real-time rendering.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#nerf</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#3d-vision</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#computer-graphics</code></p>

<hr />

<p><a id="item-15"></a></p>
<h2 id="nous-research-launches-self-improving-hermes-agent-framework-️-9010"><a href="https://github.com/NousResearch/hermes-agent">Nous Research Launches Self-Improving Hermes Agent Framework</a> ⭐️ 9.0/10</h2>

<p>Nous Research has released Hermes Agent, an open-source framework featuring a built-in learning loop that allows AI agents to create skills from experience and persist knowledge across sessions. Unlike static chatbots, this system runs autonomously on servers, supports multiple communication platforms like Telegram and Slack, and utilizes a closed feedback mechanism to refine its own performance over time. This project addresses the critical limitation of current AI agents that lack long-term memory and the ability to evolve without manual retraining. By implementing autonomous skill creation and self-improvement loops, Hermes Agent reduces the engineering overhead required to maintain capable autonomous systems. Its architecture supports cost-effective deployment on minimal infrastructure while offering enterprise-grade features like parallel sub-agents and scheduled automations. This represents a significant shift from ephemeral prompt-based interactions to persistent, evolving digital workers. The framework supports over 200 models via OpenRouter and local endpoints, featuring a real terminal interface with multiline editing and streaming tool output. It includes six terminal backends for flexible deployment ranging from local Docker containers to serverless environments like Modal and Daytona. The system integrates FTS5 session search and dialectic user modeling to maintain context and improve interaction quality across distributed workflows.</p>

<p>rss · GitHub Trending - Daily · Apr 11, 01:32</p>

<p><strong>Background</strong>: Most existing agent frameworks function as stateless wrappers around LLM APIs, requiring developers to manually engineer memory structures and improvement logic. Hermes Agent fills the niche for a production-ready, self-improving architecture that operates continuously without constant human intervention. Prior solutions often struggle with context loss between sessions or require complex custom code to implement basic learning loops, whereas Hermes provides these capabilities out-of-the-box.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://hermes-agent.nousresearch.com/">Hermes Agent — An Agent That Grows With You | Nous Research</a></li>
<li><a href="https://github.com/NousResearch/hermes-agent?ref=aitoolnet.com">GitHub - NousResearch / hermes - agent at aitoolnet.com · GitHub</a></li>
<li><a href="https://dev.to/crabtalk/hermes-agent-what-nous-research-built-m5b">Hermes Agent : what Nous Research built - DEV Community</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight the framework’s unique ability to run skills written for other tools like Cursor, noting rare cross-framework compatibility in the agent ecosystem. Users are particularly interested in the serverless persistence features that allow agents to hibernate when idle, significantly reducing operational costs for always-on systems.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#self-improving-ai</code>, <code class="language-plaintext highlighter-rouge">#nous-research</code>, <code class="language-plaintext highlighter-rouge">#autonomous-systems</code></p>

<hr />

<p><a id="item-16"></a></p>
<h2 id="voxcpm2-tokenizer-free-multilingual-tts-and-voice-cloning-️-9010"><a href="https://github.com/OpenBMB/VoxCPM">VoxCPM2: Tokenizer-Free Multilingual TTS and Voice Cloning</a> ⭐️ 9.0/10</h2>

<p>OpenBMB has released VoxCPM2, a 2-billion parameter text-to-speech model that eliminates traditional discrete tokenizers in favor of a diffusion autoregressive architecture. Trained on over two million hours of data, it supports 30 languages and generates studio-quality 48kHz audio directly from continuous representations. The update introduces advanced capabilities including voice design via natural language descriptions and controllable voice cloning with style guidance. By removing the tokenizer bottleneck, VoxCPM2 achieves higher fidelity and more natural prosody compared to conventional cascaded TTS systems that often suffer from information loss during discretization. This architecture allows for seamless multilingual synthesis without requiring explicit language tags, significantly simplifying deployment for global applications. Furthermore, the ability to design voices using only text prompts opens new creative workflows for content creators who lack reference audio samples. The model is built on the MiniCPM-4 backbone and offers three distinct cloning modes: controllable cloning with style steering, ultimate cloning for exact nuance reproduction, and zero-shot voice design. It provides production-ready assets including live Hugging Face demos, comprehensive ReadTheDocs documentation, and pre-trained weights available on both Hugging Face and ModelScope. The system handles input text in any of the 30 supported languages automatically, detecting the language without user intervention.</p>

<p>rss · GitHub Trending - Python · Apr 11, 01:37</p>

<p><strong>Background</strong>: Traditional text-to-speech pipelines typically rely on a frontend text analyzer and a discrete tokenizer to convert text into phonemes or tokens before acoustic modeling, which can introduce artifacts and limit expressiveness. Recent advances in generative AI have sought to bridge this gap, but many solutions still depend on complex multi-stage processes or specific language configurations. VoxCPM2 addresses these limitations by adopting an end-to-end approach that maps text directly to continuous speech representations, bypassing the need for intermediate discrete units entirely.</p>
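
<p><strong>Example</strong>: A minimal sketch of pulling the released weights from the Hugging Face Hub with <code class="language-plaintext highlighter-rouge">huggingface_hub</code>; the repository ID matches the model card linked below, but the loading and inference API of the VoxCPM package itself is not shown here.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Sketch: fetch the published weights locally; loading and synthesis are then
# handled by the project's own package, which this snippet does not cover.
from huggingface_hub import snapshot_download

local_dir = snapshot_download("openbmb/VoxCPM2")  # downloads the model files
print("weights downloaded to", local_dir)
</code></pre></div></div>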

<details><summary>References</summary>
<ul>
<li><a href="https://huggingface.co/openbmb/VoxCPM2">openbmb/ VoxCPM2 · Hugging Face</a></li>
<li><a href="https://www.modelscope.cn/models/OpenBMB/VoxCPM2">VoxCPM2 · Models</a></li>
<li><a href="https://ai-bio.cn/voxcpm2/">VoxCPM2 – OpenBMB推出的多语言语音生成与高保真克隆模型 | AI工具箱</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project has quickly gained traction within the open-source community, evidenced by its high trending score and active engagement channels on Discord and Feishu. Developers are particularly interested in benchmarking its inference speed against other large-scale TTS models and exploring its potential for low-resource language support.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#text-to-speech</code>, <code class="language-plaintext highlighter-rouge">#voice-cloning</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#multilingual</code>, <code class="language-plaintext highlighter-rouge">#generative-ai</code></p>

<hr />

<p><a id="item-17"></a></p>
<h2 id="unsloth-studio-unified-local-ui-for-llm-training-and-inference-️-9010"><a href="https://github.com/unslothai/unsloth">Unsloth Studio: Unified Local UI for LLM Training and Inference</a> ⭐️ 9.0/10</h2>

<p>Unsloth has launched Unsloth Studio, a beta web UI that enables users to train and run open-source models like Qwen3.5 and Gemma locally on Windows, macOS, and Linux. This new interface integrates no-code dataset creation from PDFs or CSVs with optimized inference capabilities including tool calling and code execution. It unifies the previously separate workflows of model fine-tuning and local deployment into a single, offline-capable application. This release significantly lowers the barrier to entry for AI engineers by providing a production-ready framework that accelerates fine-tuning by up to 2x while reducing VRAM usage by 70%. By offering a unified interface for both training and inference, it eliminates the friction of switching between disparate tools like Jupyter notebooks for training and separate loaders for deployment. The ability to run completely offline ensures data privacy and makes advanced LLM customization accessible on consumer hardware without cloud dependencies. The platform supports over 500 models across text, vision, audio, and embedding tasks, featuring custom Triton kernels for maximum efficiency. Key inference features include auto-healing tool calling, sandboxed code execution, and automatic parameter tuning for optimal performance. For training, it offers visual node-based workflows for data recipes and supports reinforcement learning techniques like GRPO with minimal resource overhead.</p>

<p>rss · GitHub Trending - Python · Apr 11, 01:37</p>

<p><strong>Background</strong>: Prior to this release, efficient LLM fine-tuning often required complex command-line configurations and deep knowledge of PyTorch internals to manage memory constraints. While libraries like Hugging Face PEFT existed, they lacked an integrated user interface for managing the entire lifecycle from data preparation to model export. Unsloth fills this niche by combining its high-performance backend optimization with a user-friendly frontend that democratizes access to state-of-the-art model customization.</p>
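
<p><strong>Example</strong>: A sketch of a typical Unsloth fine-tuning setup with LoRA adapters, assuming a CUDA-capable machine; the checkpoint name and hyperparameters are placeholders rather than values taken from the Studio release.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Sketch of an Unsloth fine-tuning setup; model name and hyperparameters are
# placeholders, not recommendations from the Studio announcement.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # example checkpoint, swap for your own
    max_seq_length=2048,
    load_in_4bit=True,          # 4-bit loading is the main VRAM saver
)

# Attach LoRA adapters so only a small fraction of the weights are trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
</code></pre></div></div>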

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/unslothai/unsloth">GitHub - unslothai/unsloth: Unsloth Studio is a web UI for...</a></li>
<li><a href="https://unsloth.ai/docs/new/studio">Introducing Unsloth Studio | Unsloth Documentation</a></li>
<li><a href="https://huggingface.co/blog/unsloth-trl">Make LLM Fine - tuning 2x faster with Unsloth and TRL</a></li>
<li><a href="https://unsloth.ai/docs/get-started/fine-tuning-llms-guide">Fine - tuning LLMs Guide | Unsloth Documentation</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The AI community has responded positively to Unsloth’s collaboration with model creators like Mistral and Qwen to fix specific architecture bugs, noting improved accuracy in recent releases. Users particularly appreciate the ability to export models directly to GGUF format for broader compatibility with local runners like llama.cpp.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#fine-tuning</code>, <code class="language-plaintext highlighter-rouge">#pytorch</code>, <code class="language-plaintext highlighter-rouge">#inference</code>, <code class="language-plaintext highlighter-rouge">#ai-infrastructure</code></p>

<hr />

<p><a id="item-18"></a></p>
<h2 id="feast-production-grade-open-source-feature-store-for-mlops-️-9010"><a href="https://github.com/feast-dev/feast">Feast: Production-Grade Open Source Feature Store for MLOps</a> ⭐️ 9.0/10</h2>

<p>Feast continues to solidify its position as a leading open-source feature store, offering robust tools to manage, serve, and monitor machine learning features in production. Recent updates emphasize seamless integration with diverse data infrastructures like Snowflake, GCP, and AWS, enhancing scalability for enterprise workflows. Feature stores like Feast solve critical challenges in ML workflows by ensuring consistency between training and inference data, thereby preventing data leakage. By decoupling ML logic from underlying data infrastructure, Feast enables teams to transition smoothly from batch to real-time models without rewriting code. This abstraction reduces engineering overhead and accelerates the deployment of reliable AI systems. Feast provides an offline store for historical data processing and a low-latency online store for real-time predictions. It includes a battle-tested feature server that ensures point-in-time correctness to avoid training-serving skew. The platform supports multiple cloud providers and integrates easily with existing data stacks.</p>

<p>rss · GitHub Trending - Python · Apr 11, 01:37</p>

<p><strong>Background</strong>: Prior to feature stores, engineering teams often built custom solutions to manage features, leading to fragmented systems and frequent data leakage issues. Feast emerged to fill this niche by standardizing feature management across the ML lifecycle. Unlike earlier ad-hoc scripts or proprietary silos, Feast offers a unified, open-source interface for both batch and streaming data.</p>
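
<p><strong>Example</strong>: A sketch of the two serving paths with Feast's Python SDK, assuming a configured feature repository; the feature and entity names are made up for illustration.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Sketch of the two sides of a feature store with Feast's Python SDK; assumes a
# feature repository in the current directory with these features registered.
from feast import FeatureStore

store = FeatureStore(repo_path=".")

# Training path: point-in-time-correct historical features from the offline store.
# training_df = store.get_historical_features(
#     entity_df=entity_df,  # a DataFrame of entity keys plus event timestamps
#     features=["driver_stats:avg_daily_trips"],
# ).to_df()

# Serving path: low-latency lookups from the online store at prediction time.
online = store.get_online_features(
    features=["driver_stats:avg_daily_trips"],
    entity_rows=[{"driver_id": 1001}],
).to_dict()
print(online)
</code></pre></div></div>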

<details><summary>References</summary>
<ul>
<li><a href="https://feast.dev/blog/what-is-a-feature-store/">What is a Feature Store ?</a></li>
<li><a href="https://oleg-dubetcky.medium.com/data-science-and-mlops-with-feast-mastering-feature-store-2b92c55ddd25">Data Science and MLOps with Feast : Mastering Feature Store | Medium</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The Feast community is active on Slack, where practitioners discuss architecture patterns, troubleshooting tips, and integration strategies with tools like Kubeflow. Users frequently highlight its ease of adoption compared to heavy commercial alternatives.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#feature-store</code>, <code class="language-plaintext highlighter-rouge">#mlops</code>, <code class="language-plaintext highlighter-rouge">#machine-learning</code>, <code class="language-plaintext highlighter-rouge">#data-engineering</code>, <code class="language-plaintext highlighter-rouge">#infrastructure</code></p>

<hr />

<p><a id="item-19"></a></p>
<h2 id="continue-open-source-ai-assistant-with-source-controlled-checks-️-9010"><a href="https://github.com/continuedev/continue">Continue: Open-Source AI Assistant with Source-Controlled Checks</a> ⭐️ 9.0/10</h2>

<p>Continue introduces source-controlled AI checks that run as GitHub status checks on every pull request. These checks are defined via markdown files in the repository, allowing teams to enforce custom coding standards and security reviews directly within CI pipelines. The tool integrates seamlessly into popular IDEs while offering a CLI for automation. This project addresses the lack of transparency and control in proprietary AI coding assistants by offering an open-source alternative. It enables engineering teams to codify AI-driven code review processes, ensuring consistency and accountability across contributions. By integrating with CI/CD, it bridges the gap between interactive AI assistance and automated quality gates. This is particularly valuable for organizations requiring strict compliance or customization beyond what closed tools offer. Continue uses markdown-based configuration files stored in <code class="language-plaintext highlighter-rouge">.continue/checks/</code> to define AI agents for specific tasks like security reviews. It supports enforcement via GitHub status checks, returning pass/fail results with suggested diffs. The underlying Continue CLI (<code class="language-plaintext highlighter-rouge">cn</code>) powers these checks and can be extended for custom workflows.</p>

<p>rss · GitHub Trending - TypeScript · Apr 11, 01:39</p>

<p><strong>Background</strong>: Prior AI coding assistants like GitHub Copilot operate as black-box services without versionable logic or CI integration. Continue fills this niche by making AI checks part of the source code, enabling peer review and historical tracking of AI rules. This approach aligns AI assistance with DevOps best practices, treating AI logic as infrastructure-as-code. It empowers teams to tailor AI behavior to their specific domain needs without vendor lock-in.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-coding-assistant</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#ide-extension</code>, <code class="language-plaintext highlighter-rouge">#ci-cd</code>, <code class="language-plaintext highlighter-rouge">#open-source-ai</code></p>

<hr />

<p><a id="item-20"></a></p>
<h2 id="chrome-devtools-mcp-bridges-ai-agents-and-browsers-️-9010"><a href="https://github.com/ChromeDevTools/chrome-devtools-mcp">Chrome DevTools MCP Bridges AI Agents and Browsers</a> ⭐️ 9.0/10</h2>

<p>Google has released an official Model Context Protocol (MCP) server that enables AI coding agents to directly control and inspect live Chrome browsers. This tool integrates Puppeteer for reliable automation and exposes full Chrome DevTools capabilities, including performance tracing and network analysis, to LLM-based assistants. This project solves the critical ‘last mile’ problem where AI agents can write code but struggle to verify it in a real runtime environment. By granting agents direct access to browser internals, it enables autonomous debugging loops where the AI can observe console errors, analyze network failures, and optimize performance without human intervention. It significantly reduces the friction between code generation and functional validation in web development workflows. The server leverages Puppeteer for action automation and automatically waits for action results to ensure stability. It supports advanced features like source-mapped stack traces, screenshot capture, and optional integration with the Chrome User Experience Report (CrUX) for field data. Users should note that usage statistics are collected by default, though this can be disabled via command-line flags.</p>

<p>rss · GitHub Trending - TypeScript · Apr 11, 01:39</p>

<p><strong>Background</strong>: Prior to this release, connecting AI agents to browser devtools required custom, fragile scripts or limited API wrappers that often lacked deep inspection capabilities. Existing solutions like standalone Puppeteer scripts required significant boilerplate to expose context to an LLM effectively. This project standardizes the interface via MCP, allowing any compatible agent (e.g., Claude, Cursor) to instantly gain robust browser interaction skills.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://medium.com/@wasowski.jarek/ai-coding-agents-architecture-how-claude-code-and-cursor-actually-work-under-the-hood-32bed540285d">AI Coding Agents Architecture — How Claude Code and... | Medium</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: As a new official release from the Chrome DevTools team, community discussion is currently focused on integration setups with various AI editors and troubleshooting browser version compatibility.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#mcp</code>, <code class="language-plaintext highlighter-rouge">#chrome-devtools</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code></p>

<hr />

<p><a id="item-21"></a></p>
<h2 id="deepgemm-delivers-optimized-fp8-matrix-multiplication-for-cuda-️-9010"><a href="https://github.com/deepseek-ai/DeepGEMM">DeepGEMM Delivers Optimized FP8 Matrix Multiplication for CUDA</a> ⭐️ 9.0/10</h2>

<p>DeepGEMM introduces a specialized library providing clean and efficient FP8 general matrix multiplication (GEMM) kernels optimized for CUDA architectures. It features fine-grained scaling capabilities designed to maintain numerical stability while maximizing throughput on modern GPUs. As large language models grow, the industry is shifting toward lower-precision formats like FP8 to reduce memory bandwidth bottlenecks and accelerate training and inference. DeepGEMM addresses the critical need for production-ready kernels that handle these formats without sacrificing accuracy through its fine-grained scaling approach. This allows engineers to fully leverage the tensor core capabilities of recent NVIDIA hardware for high-performance computing tasks. The library focuses specifically on FP8 operations with support for multiple GEMM formats, including normal dense matrix operations. Its implementation of fine-grained scaling ensures that computational resources are utilized efficiently while minimizing numerical errors common in low-precision arithmetic.</p>

<p>rss · GitHub Trending - CUDA · Apr 11, 01:33</p>

<p><strong>Background</strong>: Prior solutions for low-precision matrix multiplication often relied on coarse-grained scaling, which could lead to significant accuracy degradation in complex deep learning models. While NVIDIA provides basic support for FP8, specialized libraries are required to extract peak performance and ensure stability across diverse model architectures. DeepGEMM fills this niche by offering a dedicated, open-source solution tailored for the specific demands of modern LLM workloads.</p>
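
<p><strong>Example</strong>: A conceptual sketch of fine-grained (per-block) scaling, emulated in NumPy with int8 standing in for FP8. DeepGEMM's real kernels are CUDA code targeting tensor cores; this only shows why per-block scales preserve accuracy better than a single global scale.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Conceptual sketch of fine-grained (per-block) scaling for low-precision GEMM,
# emulated in NumPy with int8 standing in for FP8; DeepGEMM's actual kernels are
# CUDA code and work very differently, this only illustrates the idea.
import numpy as np

def quantize_blockwise(a, block=128):
    """Quantize each column block with its own scale instead of one global scale."""
    scales = np.zeros(a.shape[1] // block)
    q = np.zeros_like(a, dtype=np.int8)
    for i in range(len(scales)):
        blk = a[:, i * block:(i + 1) * block]
        scales[i] = np.abs(blk).max() / 127.0 + 1e-12
        q[:, i * block:(i + 1) * block] = np.round(blk / scales[i]).astype(np.int8)
    return q, scales

rng = np.random.default_rng(0)
a = rng.normal(size=(64, 256))
q, scales = quantize_blockwise(a)

# Dequantize block by block; finer blocks track local dynamic range better,
# which is what keeps accuracy acceptable at very low precision.
recon = np.concatenate(
    [q[:, i * 128:(i + 1) * 128].astype(np.float64) * s for i, s in enumerate(scales)],
    axis=1,
)
print("max abs reconstruction error:", np.abs(recon - a).max())
</code></pre></div></div>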

<details><summary>References</summary>
<ul>
<li><a href="https://www.toolify.ai/ai-news/deepgemm-revolutionizing-fp8-gemm-kernels-for-deep-learning-3433115">DeepGEMM: Revolutionizing FP8 GEMM Kernels for Deep Learning</a></li>
<li><a href="https://connectai.blog/deepgemm-clean-and-efficient-fp8-gemm-library">DeepGEMM: Clean and Efficient FP8 GEMM Library</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project has gained rapid traction among AI engineers seeking to optimize inference pipelines, with early adopters praising its clean codebase and immediate performance gains over generic implementations.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#fp8</code>, <code class="language-plaintext highlighter-rouge">#gemm</code>, <code class="language-plaintext highlighter-rouge">#high-performance-computing</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code></p>

<hr />

<p><a id="item-22"></a></p>
<h2 id="mirage-optimizes-llm-inference-with-persistent-cuda-mega-kernels-️-9010"><a href="https://github.com/mirage-project/mirage">Mirage Optimizes LLM Inference with Persistent CUDA Mega-Kernels</a> ⭐️ 9.0/10</h2>

<p>Mirage introduces a compiler framework that transforms Large Language Model operations into persistent CUDA mega-kernels. This approach consolidates multiple GPU kernel launches into a single long-running kernel to drastically reduce overhead. It specifically targets the latency bottlenecks found in standard transformer inference pipelines. Standard LLM inference suffers from significant CPU-GPU launch overhead when executing many small, sequential operators. By minimizing these launch frequencies, Mirage unlocks higher GPU utilization and lower end-to-end latency for generative tasks. This optimization is critical for deploying high-throughput services where every millisecond of response time counts. It represents a shift from operator-level tuning to system-level kernel fusion strategies. The project functions as a compiler that automatically generates optimized persistent kernels for supported model architectures. It eliminates the need for manual CUDA coding while achieving performance gains comparable to hand-tuned libraries. The framework is designed to integrate seamlessly into existing PyTorch-based inference workflows.</p>

<p>rss · GitHub Trending - CUDA · Apr 11, 01:33</p>

<p><strong>Background</strong>: Large Language Models rely on complex neural networks that require massive computational resources for text generation and understanding. Traditional inference engines often execute models as a graph of many small kernels, leading to inefficient GPU usage due to frequent host-device synchronization. Prior solutions like TensorRT or vLLM address this through various caching and batching techniques, but kernel launch overhead remains a persistent challenge. Mirage fills this niche by compiling the entire computation graph into a unified mega-kernel structure.</p>
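
<p><strong>Example</strong>: Mirage's own compiler is not shown here; as an illustration of the launch-overhead problem it targets, the sketch below contrasts a chain of tiny eager-mode ops with the same math compiled into fused kernels via <code class="language-plaintext highlighter-rouge">torch.compile</code> (requires PyTorch 2.x with a working compiler backend).</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Conceptual illustration of kernel-launch overhead (not Mirage's API): many tiny
# GPU ops launched from Python versus the same math compiled into fused kernels.
import torch

def chain_of_small_ops(x):
    for _ in range(100):
        x = x * 1.01 + 0.5   # each line is a separate small kernel in eager mode
    return x

fused = torch.compile(chain_of_small_ops)  # fuses the chain, cutting launch count

x = torch.randn(1024, device="cuda") if torch.cuda.is_available() else torch.randn(1024)
eager_out = chain_of_small_ops(x)
fused_out = fused(x)
print(torch.allclose(eager_out, fused_out))  # same result, far fewer launches
</code></pre></div></div>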

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Large_language_model">Large language model - Wikipedia</a></li>
<li><a href="https://www.geeksforgeeks.org/artificial-intelligence/large-language-model-llm/">What is a Large Language Model ( LLM ) - GeeksforGeeks</a></li>
<li><a href="https://www.c-sharpcorner.com/article/what-is-a-large-language-model-llm-and-how-does-it-work/">What Is a Large Language Model ( LLM ) and How Does It Work?</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight the framework’s ability to significantly reduce latency in latency-bound scenarios without altering model accuracy. Developers are particularly interested in its compatibility with emerging transformer variants and its ease of integration compared to low-level custom kernel development.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#compiler</code>, <code class="language-plaintext highlighter-rouge">#performance</code>, <code class="language-plaintext highlighter-rouge">#gpu</code></p>

<hr />

<p><a id="item-23"></a></p>
<h2 id="sageattention-accelerates-transformers-via-quantization-️-9010"><a href="https://github.com/thu-ml/SageAttention">SageAttention Accelerates Transformers via Quantization</a> ⭐️ 9.0/10</h2>

<p>SageAttention introduces a novel quantized attention mechanism that delivers 2-5x faster inference compared to FlashAttention. This breakthrough maintains end-to-end model accuracy across language, image, and video tasks without sacrificing performance metrics. For AI engineers deploying large models, inference latency and cost are critical bottlenecks that this project directly addresses. By integrating quantization into the attention kernel itself, SageAttention reduces memory bandwidth requirements significantly more than standard post-training quantization. This enables real-time applications on consumer hardware or lowers cloud compute costs for enterprise deployments. The compatibility with existing transformer architectures ensures easy adoption without model retraining. The project achieves speedups of 2-5x over FlashAttention while preserving model quality across diverse modalities. It is optimized for CUDA environments and targets high-performance inference scenarios. The method has been recognized as a spotlight paper at major conferences including ICLR, ICML, and NeurIPS in 2025.</p>

<p>rss · GitHub Trending - CUDA · Apr 11, 01:33</p>

<p><strong>Background</strong>: Transformer models have become the backbone of modern AI, but their self-attention mechanisms are computationally expensive and memory-intensive. Previous solutions like FlashAttention optimized memory access patterns but did not fundamentally reduce the numerical precision requirements of the operations. SageAttention fills this niche by combining algorithmic efficiency with low-precision arithmetic to overcome these hardware limitations. This represents a shift from purely architectural optimizations to numerical compression techniques within the core attention loop.</p>
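
<p><strong>Example</strong>: A conceptual sketch of quantizing the query/key matmul inside attention, with int8 standing in for the smoothed low-precision formats SageAttention actually uses; this illustrates the idea, not the library's kernels or API.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Conceptual sketch of quantizing the Q/K matmul inside attention; int8 stands in
# for SageAttention's low-precision formats, and this is not the library's kernel.
import torch
import torch.nn.functional as F

def int8_qk_attention(q, k, v):
    # Per-tensor scales for Q and K.
    q_scale = q.abs().amax() / 127.0
    k_scale = k.abs().amax() / 127.0
    q8 = torch.round(q / q_scale).to(torch.int8)
    k8 = torch.round(k / k_scale).to(torch.int8)
    # A real kernel would run this matmul on int8 tensor cores; here we keep the
    # rounded integer values and multiply in float for portability.
    scores = (q8.float() @ k8.float().transpose(-1, -2)) * (q_scale * k_scale)
    scores = scores / (q.shape[-1] ** 0.5)
    return F.softmax(scores, dim=-1) @ v

q = torch.randn(2, 128, 64)
k = torch.randn(2, 128, 64)
v = torch.randn(2, 128, 64)
approx = int8_qk_attention(q, k, v)
exact = F.softmax(q @ k.transpose(-1, -2) / 64 ** 0.5, dim=-1) @ v
print("max abs error vs full precision:", (approx - exact).abs().max().item())
</code></pre></div></div>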


<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#quantization</code>, <code class="language-plaintext highlighter-rouge">#transformers</code>, <code class="language-plaintext highlighter-rouge">#inference</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code></p>

<hr />

<p><a id="item-24"></a></p>
<h2 id="optimized-cuda-kernel-for-causal-depthwise-conv1d-️-9010"><a href="https://github.com/Dao-AILab/causal-conv1d">Optimized CUDA Kernel for Causal Depthwise Conv1D</a> ⭐️ 9.0/10</h2>

<p>Dao-AILab has released a highly optimized CUDA implementation specifically for causal depthwise 1D convolution. This library provides a seamless PyTorch interface that significantly accelerates sequence modeling operations compared to standard implementations. This project serves as a critical performance bottleneck solver for modern state-space models like Mamba, which rely heavily on efficient convolution operations. By moving these computations to custom CUDA kernels, it enables linear-time scaling for long sequences that standard PyTorch layers cannot achieve efficiently. Consequently, it allows researchers and engineers to train larger models on longer contexts without prohibitive memory or time costs. The library features a specialized CUDA kernel designed for causal masking and depthwise convolution patterns found in SSMs. It integrates directly into PyTorch workflows, requiring minimal code changes to replace standard convolutional layers. Benchmarks indicate substantial speedups and reduced memory usage when processing long sequential data.</p>

<p>rss · GitHub Trending - CUDA · Apr 11, 01:33</p>

<p><strong>Background</strong>: Traditional Transformer architectures struggle with quadratic complexity when processing long sequences, leading to the development of State Space Models (SSMs) like S4 and Mamba. These new architectures often utilize causal convolutions as a core component to maintain linear complexity while capturing long-range dependencies. However, generic deep learning frameworks often lack optimized kernels for these specific causal depthwise operations, creating a performance gap.</p>
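
<p><strong>Example</strong>: A plain-PyTorch reference for the operation the optimized kernel computes: a causal, depthwise 1D convolution where each channel has its own filter and padding is applied only on the left. The custom CUDA kernel fuses this pattern and avoids the extra memory traffic; the shapes below are illustrative.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Reference implementation of what the optimized kernel computes: a causal,
# depthwise 1D convolution (each channel has its own filter, and left-only
# padding ensures position t never sees the future).
import torch
import torch.nn.functional as F

def causal_depthwise_conv1d(x, weight, bias=None):
    """x: (batch, channels, seqlen); weight: (channels, width)."""
    channels, width = weight.shape
    x = F.pad(x, (width - 1, 0))                 # left-pad so the conv is causal
    return F.conv1d(x, weight.unsqueeze(1), bias, groups=channels)

x = torch.randn(4, 64, 128)          # batch of 4, 64 channels, 128 time steps
weight = torch.randn(64, 4)          # one width-4 filter per channel
out = causal_depthwise_conv1d(x, weight)
print(out.shape)                     # torch.Size([4, 64, 128])
</code></pre></div></div>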

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Mamba_(deep_learning_architecture)">Mamba (deep learning architecture)</a></li>
<li><a href="https://grokipedia.com/page/mamba_deep_learning_architecture">Mamba (deep learning architecture)</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The AI engineering community views this release as an essential infrastructure update for anyone implementing Mamba or similar SSM-based architectures. Early adopters report that swapping in this kernel is necessary to achieve the theoretical efficiency promises of the Mamba paper.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#pytorch</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#kernels</code>, <code class="language-plaintext highlighter-rouge">#mamba</code></p>

<hr />

<p><a id="item-25"></a></p>
<h2 id="microsoft-markitdown-optimizing-document-ingestion-for-ai-agents-️-8010"><a href="https://github.com/microsoft/markitdown">Microsoft MarkItDown: Optimizing Document Ingestion for AI Agents</a> ⭐️ 8.0/10</h2>

<p>Microsoft’s AutoGen team has released MarkItDown, a Python utility designed to convert diverse file formats like PDF, Word, and PowerPoint into LLM-friendly Markdown. The tool recently updated its architecture to use optional feature groups and stream-based processing, eliminating the need for temporary files. It also introduces an MCP server for seamless integration with LLM applications like Claude Desktop. Effective data ingestion is a critical bottleneck for AI agents, as raw binary documents often confuse models or exceed context limits. MarkItDown solves this by preserving structural elements like headings, tables, and lists in a format that maximizes token efficiency for LLMs. Unlike general converters focused on human readability, this tool prioritizes machine interpretability, directly enhancing the performance of RAG pipelines and autonomous agents. Its production-ready status and backing by the AutoGen team make it a reliable choice for enterprise AI workflows. MarkItDown supports conversion from PDF, PowerPoint, and Word files while maintaining document structure for analysis pipelines. The latest version requires binary file-like objects for input and organizes dependencies into optional groups to reduce bloat. It is specifically engineered for text analysis tools rather than high-fidelity human-facing document rendering.</p>

<p>rss · GitHub Trending - Daily · Apr 11, 01:32</p>

<p><strong>Background</strong>: Prior to MarkItDown, developers often relied on general-purpose tools like Textract or custom scripts that struggled to balance structural fidelity with LLM token constraints. Many existing solutions either produced overly verbose output or stripped away crucial semantic markers like table headers and list hierarchies. This project fills the niche for a lightweight, specialized converter that bridges the gap between complex office documents and the plain text requirements of modern language models. By focusing on the specific needs of AI agents, it streamlines the preprocessing stage of automated workflows.</p>
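
<p><strong>Example</strong>: A minimal sketch of converting a document for an LLM pipeline with MarkItDown; the file name is a placeholder, and input handling has recently shifted toward binary streams, so the exact call signature may vary by version.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Sketch of converting a document for an LLM pipeline with MarkItDown; the file
# path is a placeholder, and exact input handling may vary by version.
from markitdown import MarkItDown

md = MarkItDown()
result = md.convert("quarterly_report.pdf")   # also accepts Word, PowerPoint, etc.
print(result.text_content[:500])              # Markdown that keeps headings and tables
</code></pre></div></div>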

<details><summary>References</summary>
<ul>
<li><a href="https://www.zhihu.com/question/952838112?write">LangGraph、Autogen和Crewai，这三个多智能体开发框架的工具区别是什...</a></li>
<li><a href="https://www.zhihu.com/question/624287948">微软推出 AutoGen 框架，有哪些你喜欢的功能？ - 知乎</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The developer community highlights MarkItDown as a superior alternative to generic scrapers for building robust RAG systems due to its structured output. Users appreciate the shift to stream-based processing, which improves security and performance by avoiding temporary disk writes.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#data-preprocessing</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#document-processing</code>, <code class="language-plaintext highlighter-rouge">#python</code>, <code class="language-plaintext highlighter-rouge">#microsoft</code></p>

<hr />

<p><a id="item-26"></a></p>
<h2 id="archon-deterministic-harness-builder-for-ai-coding-️-8010"><a href="https://github.com/coleam00/Archon">Archon: Deterministic Harness Builder for AI Coding</a> ⭐️ 8.0/10</h2>

<p>Archon has launched as the first open-source harness builder designed to make AI coding processes deterministic and repeatable. It allows developers to define complex development workflows using YAML, combining AI agents with deterministic scripts and human approval gates. This tool transforms unpredictable AI interactions into structured, reliable software engineering pipelines. Current AI coding agents often produce inconsistent results, skipping steps like testing or planning based on the model’s whims. Archon solves this by enforcing a strict workflow where the structure is owned by the developer, ensuring every run follows the same sequence of planning, implementation, and validation. This shift enables ‘fire and forget’ automation where AI handles intelligence within a safe, governed boundary. Ultimately, it bridges the gap between experimental AI prototyping and production-grade reliability. The project utilizes isolated git worktrees to allow parallel workflow execution without conflicts, while supporting composable nodes that mix bash scripts, tests, and AI prompts. Workflows are portable and can be triggered via CLI, Web UI, Slack, or GitHub, ensuring consistent behavior across different environments. An example workflow demonstrates looping implementation until tests pass, followed by mandatory human review before PR creation.</p>

<p>rss · GitHub Trending - Daily · Apr 11, 01:32</p>

<p><strong>Background</strong>: Prior to Archon, AI coding tools largely functioned as stateless chat interfaces or autonomous agents with little regard for established engineering protocols. Developers struggled to integrate these tools into CI/CD pipelines because the output was non-deterministic and lacked standard validation gates. Archon fills this niche by acting as a workflow engine similar to GitHub Actions but specifically optimized for orchestrating LLM-based tasks. It represents a maturation of AI engineering from casual assistance to rigorous process automation.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/coleam00/Archon">GitHub - coleam00/ Archon : Beta release of Archon OS - the...</a></li>
<li><a href="https://www.linkedin.com/posts/gyaansetu-ai_???????????-??????-i-built-activity-7423709332158210048-h-hQ">Introducing Archon : Open - Source AI Manager for Claude... | LinkedIn</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight Archon’s ability to combine deterministic bash scripts with flexible AI nodes as a major advantage over purely autonomous agents. The community is particularly interested in its potential to standardize code review and testing phases within AI-driven development cycles.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-engineering</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-27"></a></p>
<h2 id="multica-open-source-platform-for-managing-ai-coding-agents-️-8010"><a href="https://github.com/multica-ai/multica">Multica: Open-Source Platform for Managing AI Coding Agents</a> ⭐️ 8.0/10</h2>

<p>Multica introduces an open-source platform designed to treat coding agents as autonomous teammates rather than simple prompt executors. It enables users to assign tasks, track real-time progress, and compound reusable skills across a unified dashboard. The system supports self-hosting via Docker and integrates with major models like Claude Code and Codex. This project addresses the critical orchestration gap in AI engineering where standalone agents often fail due to error accumulation and lack of long-term context. By providing infrastructure for task lifecycle management and skill retention, Multica mitigates agent drift and reduces the need for constant human supervision. It shifts the paradigm from babysitting individual runs to managing a scalable, hybrid human-AI workforce. This is essential for teams looking to productionize agent workflows beyond experimental prototypes. Key features include autonomous execution with WebSocket streaming, profile-based agent assignment, and a skill compounding mechanism that turns past solutions into team assets. The platform offers multi-workspace isolation and supports both local daemons and cloud runtimes for flexible deployment. It is licensed under Apache 2.0, ensuring vendor neutrality for enterprise adoption.</p>

<p>rss · GitHub Trending - Daily · Apr 11, 01:32</p>

<p><strong>Background</strong>: Prior solutions for AI coding often relied on ad-hoc scripts or closed proprietary clouds that locked users into specific vendor ecosystems. Existing orchestration tools frequently lacked the ability to persist agent learning or manage complex task dependencies autonomously. Multica fills this niche by offering a vendor-neutral, self-hosted infrastructure specifically designed for long-term agent team management. It builds upon the emerging need to stabilize agent performance over extended periods through structured oversight.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://grokipedia.com/page/AI_Agent_Orchestration">AI Agent Orchestration</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: While the project shows strong potential for orchestrating coding agents, early adopters note that production maturity requires verification beyond the current README documentation. The community is actively evaluating its stability in complex, long-running development cycles compared to established CI/CD pipelines.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#orchestration</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-28"></a></p>
<h2 id="kronos-first-open-source-foundation-model-for-financial-k-lines-️-8010"><a href="https://github.com/shiyu-coder/Kronos">Kronos: First Open-Source Foundation Model for Financial K-Lines</a> ⭐️ 8.0/10</h2>

<p>Kronos has been accepted by AAAI 2026 and released fine-tuning scripts for custom quantitative tasks. The project now offers a family of pre-trained decoder-only models accessible via Hugging Face, trained on data from over 45 global exchanges. Unlike general-purpose time-series models, Kronos specifically addresses the high-noise and non-stationary nature of financial market data through a novel two-stage framework. By quantizing continuous OHLCV data into hierarchical discrete tokens, it enables autoregressive transformers to effectively learn the ‘language’ of candlesticks. This specialization allows for more accurate forecasting and pattern recognition in volatile markets compared to generic approaches. The model utilizes a specialized tokenizer to convert multi-dimensional K-line sequences into discrete tokens before processing them with a large transformer. It supports diverse quantitative finance tasks and includes a live demo for BTC/USDT forecasting. Model weights are openly available, facilitating immediate experimentation and adaptation for specific trading strategies.</p>

<p>rss · GitHub Trending - Daily · Apr 11, 01:32</p>

<p><strong>Background</strong>: Financial time-series forecasting has traditionally relied on statistical methods like ARIMA or specialized deep learning architectures that often struggle with the chaotic dynamics of global markets. General foundation models lack the specific inductive biases required to interpret financial candlestick patterns effectively. Kronos fills this niche by treating K-lines as a distinct language, leveraging massive-scale pre-training to capture complex market microstructures that previous solutions missed.</p>
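
<p><strong>Example</strong>: A toy illustration of the core idea of turning continuous price series into discrete tokens, here with simple uniform binning; Kronos itself uses a learned hierarchical tokenizer over full OHLCV data, which this sketch does not reproduce.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Toy illustration of turning continuous candlestick values into discrete tokens
# (simple uniform binning; Kronos uses a learned hierarchical tokenizer instead).
import numpy as np

def tokenize_series(values, n_bins=256):
    """Map a 1D series of values to integer token IDs via uniform binning."""
    lo, hi = values.min(), values.max()
    edges = np.linspace(lo, hi, n_bins + 1)
    return np.clip(np.digitize(values, edges) - 1, 0, n_bins - 1)

closes = np.cumsum(np.random.default_rng(0).normal(size=200)) + 100  # fake price path
tokens = tokenize_series(closes)
print(tokens[:20])   # token IDs an autoregressive transformer could be trained on
</code></pre></div></div>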

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Foundation_model">Foundation model</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The community is actively exploring the fine-tuning scripts released in August 2025 to adapt Kronos for proprietary trading datasets. Early feedback highlights the model’s promising performance on crypto assets, though users are still validating its robustness across traditional equity markets.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#finance</code>, <code class="language-plaintext highlighter-rouge">#foundation-model</code>, <code class="language-plaintext highlighter-rouge">#nlp</code>, <code class="language-plaintext highlighter-rouge">#quantitative-finance</code>, <code class="language-plaintext highlighter-rouge">#llm</code></p>

<hr />

<p><a id="item-29"></a></p>
<h2 id="jq-essential-cli-tool-for-json-data-processing-️-8010"><a href="https://github.com/jqlang/jq">jq: Essential CLI Tool for JSON Data Processing</a> ⭐️ 8.0/10</h2>

<p>This analysis highlights jq as a critical infrastructure utility rather than a new AI framework release. It emphasizes the tool’s zero-dependency architecture and its availability via prebuilt binaries and Docker images for immediate deployment. For AI engineers, jq serves as the ‘sed’ or ‘awk’ of JSON, enabling efficient slicing and filtering of model outputs and API responses within production pipelines. Its lightweight nature allows it to run seamlessly in resource-constrained environments like serverless functions or sidecar containers. Mastering jq significantly reduces the need for heavy Python scripts when performing simple data transformations during debugging or log analysis. Written in portable C, jq operates with zero runtime dependencies and supports complex filtering, mapping, and transformation operations via a concise syntax. It offers flexible installation options including static binaries, Docker containers, and source compilation for cross-platform compatibility. The tool is extensively documented with an interactive online playground for testing queries before integration.</p>

<p>rss · GitHub Trending - Daily · Apr 11, 01:32</p>

<p><strong>Background</strong>: As structured data exchange via JSON becomes ubiquitous in AI services, the need for a fast, reliable command-line processor has grown acute. Prior solutions often required invoking heavy interpreters like Python or Node.js just to extract a single field from a log file. jq fills this niche by providing a specialized, high-performance utility designed specifically for stream processing of JSON data without the overhead of a full runtime environment.</p>

<p><strong>Discussion</strong>: The project maintains an active community with support channels on Stack Overflow and Discord, alongside a comprehensive wiki for advanced usage patterns. Users frequently share complex one-liners and best practices for integrating jq into CI/CD pipelines and data engineering workflows.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cli</code>, <code class="language-plaintext highlighter-rouge">#json</code>, <code class="language-plaintext highlighter-rouge">#data-processing</code>, <code class="language-plaintext highlighter-rouge">#devops</code>, <code class="language-plaintext highlighter-rouge">#utility</code></p>

<hr />

<p><a id="item-30"></a></p>
<h2 id="prefect-modern-python-workflow-orchestration-for-resilient-pipelines-️-8010"><a href="https://github.com/PrefectHQ/prefect">Prefect: Modern Python Workflow Orchestration for Resilient Pipelines</a> ⭐️ 8.0/10</h2>

<p>Prefect continues to mature as a production-ready framework that elevates standard Python scripts into robust, monitored workflows with minimal code changes. It offers seamless integration with both self-hosted servers and managed cloud dashboards for real-time pipeline visibility. Recent updates emphasize dynamic flow execution and event-driven automations to handle complex data dependencies. For AI engineers, Prefect solves the critical gap between experimental notebooks and reliable production systems by providing built-in retry logic, caching, and state management. Unlike rigid schedulers, it allows workflows to react dynamically to external events and data changes, ensuring resilience in volatile environments. This reduces the operational overhead of maintaining custom orchestration scripts while improving failure recovery rates. Ultimately, it enables teams to scale data and ML pipelines without rewriting core business logic. The framework features a low-overhead decorator-based API that requires no infrastructure setup to start building flows. It supports hybrid execution models where agents can run locally or in distributed environments like Kubernetes. Monitoring is handled through a unified UI that tracks runs, logs, and artifacts regardless of the deployment target.</p>
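
<p>The decorator-based API is easiest to show directly. The hedged sketch below (retry counts and the data-source path are placeholder values) turns two ordinary Python functions into a Prefect flow with built-in retries and logging, with no infrastructure beyond <code class="language-plaintext highlighter-rouge">pip install prefect</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># A minimal Prefect flow using the decorator-based API described above.
# Retry settings and the source path are illustrative values.
from prefect import flow, task


@task(retries=3, retry_delay_seconds=10)
def fetch_features(source: str) -> list[float]:
    # Any exception raised here triggers Prefect's built-in retry logic.
    return [0.1, 0.4, 0.7]


@task
def score(features: list[float]) -> float:
    return sum(features) / len(features)


@flow(log_prints=True)
def nightly_scoring(source: str = "s3://bucket/features.parquet"):
    feats = fetch_features(source)
    print(f"mean score: {score(feats):.3f}")


if __name__ == "__main__":
    nightly_scoring()   # runs locally; the same flow can later be deployed and scheduled
</code></pre></div></div>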

<p>rss · GitHub Trending - Python · Apr 11, 01:37</p>

<p><strong>Background</strong>: Traditional workflow tools like Apache Airflow often require heavy infrastructure setup and struggle with dynamic parameterization, making them cumbersome for rapid AI iteration. Prefect emerged to fill this niche by treating workflows as native Python code rather than abstract DAG definitions configured via YAML. This approach significantly lowers the barrier to entry for data scientists who need production-grade reliability without DevOps complexity. It bridges the gap between simple cron jobs and enterprise-grade orchestration platforms.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Workflow">Workflow - Wikipedia</a></li>
<li><a href="https://zhuanlan.zhihu.com/p/1921720267165639679">一文看明白： Workflow （工作流）和Agent（智能体）有什么区别？</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The community actively discusses best practices for migrating from Airflow to Prefect, particularly regarding state backend configurations and hybrid agent deployments. Users frequently highlight the ease of debugging local flows compared to other orchestration tools as a major advantage.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#orchestration</code>, <code class="language-plaintext highlighter-rouge">#data-engineering</code>, <code class="language-plaintext highlighter-rouge">#python</code>, <code class="language-plaintext highlighter-rouge">#mlops</code>, <code class="language-plaintext highlighter-rouge">#workflow</code></p>

<hr />

<p><a id="item-31"></a></p>
<h2 id="train-a-64m-gpt-from-scratch-in-two-hours-️-8010"><a href="https://github.com/jingyaogong/minimind">Train a 64M GPT from Scratch in Two Hours</a> ⭐️ 8.0/10</h2>

<p>The MiniMind project enables training a 64M-parameter large language model from scratch in just two hours using a single consumer GPU. It provides a complete, native PyTorch implementation of the entire LLM lifecycle, including pretraining, SFT, and RLHF, without relying on high-level framework abstractions. This project democratizes LLM development by reducing the cost to approximately $3 and the time to two hours, making it accessible for individual learners and researchers. Unlike using black-box APIs or fine-tuning massive models, MiniMind allows users to understand the fundamental architecture and training dynamics of transformers from the ground up. It serves as an exceptional educational resource for those who want to build their own ‘airplane’ rather than just flying in one. The model architecture is extremely lightweight, roughly 1/2700th the size of GPT-3, yet covers advanced techniques like MoE, LoRA, and tool use. All core algorithms are implemented from scratch in native PyTorch to ensure transparency and educational value. The project also includes extensions for multimodal vision tasks and diffusion language models.</p>
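
<p>For a sense of what a from-scratch, native-PyTorch training step looks like, here is a deliberately tiny, hypothetical sketch of decoder-only pretraining; random tokens stand in for a real corpus, and the architecture and hyperparameters are illustrative rather than MiniMind’s own.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Conceptual sketch of a decoder-only LM pretraining step in plain PyTorch.
# This is not MiniMind's code; sizes and data are placeholders.
import torch
import torch.nn as nn

vocab, d_model, n_layer, seq_len = 6400, 512, 8, 128

class TinyGPT(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        self.pos = nn.Embedding(seq_len, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, nhead=8, dim_feedforward=4 * d_model, batch_first=True
        )
        self.blocks = nn.TransformerEncoder(layer, num_layers=n_layer)
        self.lm_head = nn.Linear(d_model, vocab, bias=False)

    def forward(self, idx):
        pos = torch.arange(idx.size(1), device=idx.device)
        x = self.embed(idx) + self.pos(pos)
        # Causal mask makes the encoder stack behave as a decoder-only LM.
        mask = nn.Transformer.generate_square_subsequent_mask(idx.size(1)).to(idx.device)
        x = self.blocks(x, mask=mask, is_causal=True)
        return self.lm_head(x)

model = TinyGPT()
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
tokens = torch.randint(0, vocab, (4, seq_len + 1))   # stand-in for a real corpus
logits = model(tokens[:, :-1])
loss = nn.functional.cross_entropy(logits.reshape(-1, vocab), tokens[:, 1:].reshape(-1))
loss.backward()
opt.step()
print(f"pretraining step loss: {loss.item():.3f}")
</code></pre></div></div>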

<p>rss · GitHub Trending - Python · Apr 11, 01:37</p>

<p><strong>Background</strong>: Large language models have become increasingly powerful but remain inaccessible for individual experimentation due to their massive parameter counts and computational requirements. Most existing tools rely on highly abstracted libraries that hide the underlying mechanics, preventing deep understanding. MiniMind fills this niche by offering a minimal, transparent implementation designed specifically for education and rapid prototyping on consumer hardware.</p>

<p><strong>Discussion</strong>: The project has gained significant traction on GitHub trends, with users praising its clarity and practicality for learning LLM fundamentals. Discussions highlight its value as a starting point for customizing small models for specific edge cases where large models are too costly.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#gpt</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#education</code>, <code class="language-plaintext highlighter-rouge">#pytorch</code></p>

<hr />

<p><a id="item-32"></a></p>
<h2 id="claudian-embeds-ai-coding-agents-directly-into-obsidian-️-8010"><a href="https://github.com/YishenTu/claudian">Claudian Embeds AI Coding Agents Directly into Obsidian</a> ⭐️ 8.0/10</h2>

<p>Claudian is a new Obsidian plugin that integrates powerful AI coding agents like Claude Code and Codex directly into the user’s vault. It transforms the knowledge base into an active working directory where agents can read, write, search files, and execute bash commands. The tool supports multi-step workflows, inline editing with diff previews, and connections to external tools via MCP servers. This integration solves a critical fragmentation problem for technical writers and developers who previously had to switch between their note-taking environment and separate terminal-based AI tools. By embedding agents directly into Obsidian, it enables seamless context-aware assistance where the AI has immediate access to the entire project structure without manual file loading. This significantly accelerates documentation updates, code refactoring, and complex reasoning tasks within a unified interface. It represents a shift from passive note storage to an active, agent-driven development workspace. Key features include Plan Mode for approving agent strategies before execution, slash commands for reusable prompt templates, and @mention syntax to reference specific vault files or subagents. The plugin requires the Claude Code CLI or Codex CLI to be installed locally and currently supports only desktop operating systems. Users can manage multiple conversation tabs and utilize Model Context Protocol (MCP) to extend agent capabilities with external data sources.</p>

<p>rss · GitHub Trending - TypeScript · Apr 11, 01:39</p>

<p><strong>Background</strong>: Prior to Claudian, leveraging advanced AI coding agents within Obsidian required cumbersome workarounds like copying text to external terminals or using limited chat-only plugins that lacked file system access. Existing solutions often failed to support complex, multi-file operations or autonomous bash execution, limiting the AI’s utility to simple Q&amp;A. Claudian fills this niche by bringing the full power of terminal-based agents like Claude Code into the graphical Obsidian environment. This bridges the gap between static knowledge management and dynamic software engineering workflows.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://grokipedia.com/page/Claude_Code">Claude Code</a></li>
<li><a href="https://www.msn.com/en-us/news/other/ai-agents-overtake-coding-desks/gm-GM72B3257E">AI agents overtake coding desks - MSN</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: As a newly released tool, formal community discussions on forums are currently emerging, with early adopters praising its ability to handle complex refactoring tasks directly within notes. Users are actively exploring the potential of combining Obsidian’s linking capabilities with autonomous agent workflows for large-scale documentation projects.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#obsidian</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#claude-code</code>, <code class="language-plaintext highlighter-rouge">#productivity</code></p>

<hr />

<p><a id="item-33"></a></p>
<h2 id="n8n-fair-code-automation-with-native-ai-agents-️-8010"><a href="https://github.com/n8n-io/n8n">n8n: Fair-Code Automation with Native AI Agents</a> ⭐️ 8.0/10</h2>

<p>n8n has evolved into a mature workflow automation platform that seamlessly integrates visual building with custom code execution. It now features native AI capabilities based on LangChain, allowing users to construct complex AI agent pipelines alongside traditional data integrations. The platform supports over 400 integrations and offers flexible deployment via self-hosting or cloud services. This tool bridges the gap between low-code speed and the flexibility required by technical teams for complex logic. By enabling developers to insert JavaScript or Python directly into workflows, it avoids the limitations of purely no-code solutions while maintaining rapid development cycles. Its fair-code license ensures data sovereignty, making it ideal for enterprises needing strict control over their automation infrastructure and AI models. Key capabilities include writing custom code nodes, utilizing native LangChain integration for AI agents, and deploying via Docker or npm instantly. The platform provides enterprise-grade features like SSO and advanced permissions while maintaining an active community with hundreds of ready-to-use templates.</p>

<p>rss · GitHub Trending - TypeScript · Apr 11, 01:39</p>

<p><strong>Background</strong>: n8n addresses the need for a workflow automation tool that does not force a choice between ease of use and technical depth. Unlike earlier no-code platforms that struggled with complex edge cases, n8n allows developers to extend functionality using standard programming languages. It fills the niche for teams requiring robust, self-hostable automation that can handle both simple API connections and sophisticated AI-driven processes.</p>

<p><strong>Discussion</strong>: The community actively contributes over 900 workflow templates and maintains a supportive forum for troubleshooting and best practices. Users frequently discuss extending n8n with custom nodes and optimizing AI agent chains for production environments.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#workflow-automation</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#low-code</code>, <code class="language-plaintext highlighter-rouge">#integration</code>, <code class="language-plaintext highlighter-rouge">#typescript</code></p>

<hr />

<p><a id="item-34"></a></p>
<h2 id="nvidia-releases-cuopt-for-gpu-accelerated-optimization-️-8010"><a href="https://github.com/NVIDIA/cuopt">NVIDIA Releases cuopt for GPU-Accelerated Optimization</a> ⭐️ 8.0/10</h2>

<p>NVIDIA has introduced cuopt, a specialized library designed to solve large-scale decision optimization and routing problems using GPU acceleration. This tool leverages CUDA cores to significantly speed up complex logistical calculations compared to traditional CPU-based solvers. It represents a shift towards hardware-accelerated operations research within the AI ecosystem. Traditional optimization solvers often struggle with the computational intensity of real-time, large-scale routing tasks found in modern supply chains. By offloading these tasks to GPUs, cuopt enables near-instantaneous solutions for problems that previously took hours to compute. This capability is critical for AI engineers building dynamic logistics systems, autonomous fleet management, and real-time resource allocation platforms. It bridges the gap between classical operations research and modern deep learning infrastructure. cuopt is specifically optimized for vehicle routing problems (VRP) and other combinatorial optimization challenges. The library integrates seamlessly with NVIDIA’s existing AI workflow tools and supports Python APIs for easy adoption. Performance benchmarks indicate order-of-magnitude improvements in solution time for datasets involving thousands of nodes.</p>
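
<p>To ground the scale involved, the snippet below only prepares the input such a solver consumes: a dense travel-cost matrix over roughly a thousand stops (Euclidean costs for simplicity). It deliberately stops short of calling cuOpt’s own API.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Problem setup only, stopping short of cuOpt's own API: build the dense
# travel-cost matrix a GPU vehicle-routing solver searches over.
import numpy as np

rng = np.random.default_rng(0)
n_stops = 1000                                      # depot plus 1000 delivery stops
points = rng.uniform(0.0, 100.0, size=(n_stops + 1, 2))

# cost_matrix[i, j] = travel cost from location i to location j (symmetric here).
diff = points[:, None, :] - points[None, :, :]
cost_matrix = np.sqrt((diff ** 2).sum(axis=-1)).astype(np.float32)

print(cost_matrix.shape)   # (1001, 1001): the search space a VRP solver explores
</code></pre></div></div>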

<p>rss · GitHub Trending - CUDA · Apr 11, 01:33</p>

<p><strong>Background</strong>: Decision optimization has historically relied on CPU-centric solvers like Gurobi or CPLEX, which can become bottlenecks as problem scales increase. As logistics networks grow more complex and demand real-time adaptability, the need for massive parallelism has become apparent. NVIDIA’s entry into this space utilizes their GPU architecture to parallelize the search space of optimization algorithms effectively. This approach allows for handling dynamic constraints and larger datasets that were previously impractical.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://www.nvidia.com/en-us/">World Leader in Artificial Intelligence Computing | NVIDIA</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters are highlighting the library’s potential for reducing costs in last-mile delivery scenarios through faster route recalculations. Developers note that while powerful, the tool requires specific NVIDIA hardware and is less flexible for non-routing optimization types.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#optimization</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#gpu</code>, <code class="language-plaintext highlighter-rouge">#logistics</code>, <code class="language-plaintext highlighter-rouge">#nvidia</code></p>

<hr />

<p><a id="item-35"></a></p>
<h2 id="rowboat-local-first-ai-coworker-with-persistent-memory-️-7010"><a href="https://github.com/rowboatlabs/rowboat">Rowboat: Local-First AI Coworker with Persistent Memory</a> ⭐️ 7.0/10</h2>

<p>Rowboat introduces an open-source framework that transforms emails and meeting notes into a local knowledge graph for autonomous agent interactions. It enables users to generate reports, prepare meeting briefs, and track topics using long-term context stored privately on their machine. The project supports voice inputs, external tool integration via MCP, and visual graph editing in Markdown. This project addresses the critical limitation of stateless LLM agents by providing a structured, long-term memory layer that persists across sessions. By operating locally first, it offers a privacy-preserving alternative to cloud-dependent AI coworkers while maintaining deep context awareness. This architecture is essential for developing reliable agentic workflows that require historical continuity without data leakage risks. The system ingests data from Gmail, Calendar, and Drive to build a dynamic knowledge graph that agents can query and update. Users can interact via natural language commands or voice memos to execute complex tasks like deck creation or competitive research. Configuration allows for optional integration with Deepgram, ElevenLabs, Exa, and Composio for enhanced multimodal capabilities.</p>
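
<p>The knowledge-graph idea can be illustrated with a toy example; the sketch below is not Rowboat’s internal format, just a plain directed graph (via networkx) in which calendar and email items become nodes and edges an agent can traverse when preparing a brief.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Toy illustration of a local knowledge graph; not Rowboat's actual data model.
import networkx as nx

g = nx.MultiDiGraph()
g.add_node("meeting:2026-04-10-roadmap", kind="meeting", source="calendar")
g.add_node("topic:q3-pricing", kind="topic")
g.add_node("person:alice@example.com", kind="person")

g.add_edge("person:alice@example.com", "meeting:2026-04-10-roadmap", rel="attended")
g.add_edge("meeting:2026-04-10-roadmap", "topic:q3-pricing", rel="discussed")

# An agent preparing a brief can walk the graph instead of re-reading raw text:
topic = "topic:q3-pricing"
mentions = [u for u, v, d in g.in_edges(topic, data=True) if d["rel"] == "discussed"]
print(f"{topic} last discussed in: {mentions}")
</code></pre></div></div>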

<p>rss · GitHub Trending - Daily · Apr 11, 01:32</p>

<p><strong>Background</strong>: Current AI agent frameworks often struggle with context loss between interactions, forcing users to repeatedly re-explain background information. Rowboat fills this niche by implementing a ‘coworker’ model that retains institutional knowledge in a user-controlled graph database. Unlike transient chat interfaces, this approach treats AI as a persistent team member that accumulates understanding over time.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/rowboatlabs/rowboat">rowboatlabs/rowboat: Open-source AI coworker, with memory - GitHub</a></li>
<li><a href="https://www.tcs.com/what-we-do/industries/retail/white-paper/agentic-ai-coworker-resilient-supply-chains">Agentic AI Coworker: DAIEL Framework for Retail Supply Chains</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: While the concept of an AI coworker with memory is highly relevant to current agentic workflows, the repository currently lacks sufficient technical documentation to verify production readiness. Early adopters are encouraged to test the local-first architecture but should be aware that implementation depth may vary compared to established enterprise solutions.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#memory</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code></p>

<hr />

<p><a id="item-36"></a></p>
<h2 id="deeptutor-launches-agent-native-personalized-learning-system-️-7010"><a href="https://github.com/HKUDS/DeepTutor">DeepTutor Launches Agent-Native Personalized Learning System</a> ⭐️ 7.0/10</h2>

<p>DeepTutor has released version 1.0.0, featuring a complete architecture rewrite and the introduction of ‘TutorBot,’ a persistent autonomous AI tutor. This update shifts the platform to an agent-native design with flexible mode switching under an Apache-2.0 license. The system now leverages Python 3.11+ and Next.js 16 to deliver enhanced interactive learning experiences. This project addresses the limitation of static chat-based tutors by introducing persistent agents that maintain context over long learning sessions. It provides a robust open-source foundation for developers building scalable EdTech solutions without starting from scratch. The separation of backend logic and frontend interface allows for easier customization and integration into existing educational workflows. Ultimately, it democratizes access to sophisticated, personalized AI tutoring capabilities for research and commercial use. The system is built on a modern stack using Python for the agent logic and Next.js for the user interface. Key features include the autonomous TutorBot, a command-line interface for agent-native interactions, and support for multiple languages. The codebase is fully documented and includes community channels on Discord and WeChat for support.</p>

<p>rss · GitHub Trending - Daily · Apr 11, 01:32</p>

<p><strong>Background</strong>: Traditional AI tutoring systems often struggle with maintaining long-term student context and adapting dynamically to individual learning paces. DeepTutor fills this niche by utilizing an agent-based architecture where the AI actively manages the learning trajectory rather than just responding to prompts. Unlike previous single-turn conversation models, this system employs persistent memory and autonomous decision-making to simulate a real human tutor’s continuity. This approach represents a significant evolution from simple Q&amp;A bots to comprehensive learning companions.</p>

<p><strong>Discussion</strong>: The project has garnered significant attention, reaching 10,000 stars on GitHub, indicating strong developer interest in agent-based education tools. Active community groups are available on Discord, Feishu, and WeChat for users to discuss implementation strategies and share feedback.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-tutor</code>, <code class="language-plaintext highlighter-rouge">#personalized-learning</code>, <code class="language-plaintext highlighter-rouge">#agent-systems</code>, <code class="language-plaintext highlighter-rouge">#education-tech</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-37"></a></p>
<h2 id="opendataloader-pdf-high-accuracy-parser-for-rag-pipelines-️-7010"><a href="https://github.com/opendataloader-project/opendataloader-pdf">OpenDataLoader PDF: High-Accuracy Parser for RAG Pipelines</a> ⭐️ 7.0/10</h2>

<p>OpenDataLoader PDF is a new open-source library that combines deterministic rule-based extraction with an optional AI hybrid mode for complex documents. It uniquely offers native SDKs for Python, Node.js, and Java while delivering state-of-the-art benchmark scores for table and multi-column layout accuracy. The project also announces a future roadmap to become the first open-source tool for end-to-end Tagged PDF generation. This tool directly addresses the critical bottleneck in Retrieval-Augmented Generation (RAG) where poor PDF parsing leads to hallucinated or out-of-order context. By providing precise bounding box coordinates and correct reading orders for complex scientific papers, it significantly improves the reliability of downstream AI applications. Its multi-language SDK support lowers the barrier for integration across diverse engineering stacks compared to Python-only alternatives. Furthermore, the planned accessibility features offer a scalable solution to costly manual PDF remediation requirements. The library achieves a 0.907 overall accuracy score and 92.8% table accuracy across 200 real-world benchmarks including borderless tables and LaTeX formulas. It features a hybrid mode with built-in OCR supporting over 80 languages, specifically designed to handle poor-quality scans at 300 DPI or higher. Outputs include structured Markdown for chunking, JSON with element coordinates for citations, and HTML, with ready-made integrations for LangChain.</p>
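
<p>The value of coordinate-bearing JSON output is easiest to see in a downstream RAG step. The sketch below uses an illustrative element layout (the field names are assumptions, not the library’s exact schema) to build chunks that keep page and bounding-box provenance for citations.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical element layout (field names are illustrative, not the library's
# exact schema): turn parsed elements with bounding boxes into citable RAG chunks.
elements = [
    {"type": "heading", "text": "3. Results", "page": 4, "bbox": [72, 90, 520, 110]},
    {"type": "paragraph", "text": "Accuracy improved by 4.2 points...", "page": 4,
     "bbox": [72, 120, 520, 260]},
    {"type": "table", "text": "model | accuracy\nbaseline | 0.861", "page": 5,
     "bbox": [72, 80, 520, 300]},
]

chunks = []
for el in elements:
    chunks.append({
        "text": el["text"],
        # Keep provenance so an answer can cite page and region instead of guessing.
        "metadata": {"page": el["page"], "bbox": el["bbox"], "type": el["type"]},
    })

print(chunks[1]["metadata"])
</code></pre></div></div>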

<p>rss · GitHub Trending - Daily · Apr 11, 01:32</p>

<p><strong>Background</strong>: PDF parsing has long been a painful prerequisite for AI engineering, often requiring expensive proprietary APIs or fragile open-source scripts that fail on complex layouts. Existing solutions frequently struggle with maintaining logical reading order in multi-column documents or accurately extracting data from intricate tables without human intervention. OpenDataLoader PDF fills this niche by offering a unified, high-accuracy engine that balances speed with deep layout analysis. It distinguishes itself by targeting both immediate RAG data preparation needs and future regulatory compliance for digital accessibility.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/opendataloader-project/opendataloader-pdf">GitHub - opendataloader -project/ opendataloader -pdf: PDF Parser...</a></li>
<li><a href="https://opendataloader.org/">OpenDataLoader PDF - PDF Parser for AI-Ready Data</a></li>
<li><a href="https://zhuanlan.zhihu.com/p/2019104927172031879">OpenDataloader -PDF：解锁AI训练的”数据暗物质”，PDF解析的革命性突破</a></li>
<li><a href="https://www.zhihu.com/tardis/zm/art/675509396">一文读懂：大模型RAG（检索增强生成）含高级方法</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early discussions highlight the project’s impressive benchmark performance against established parsers like Unstructured, particularly for scientific literature. Developers are expressing strong interest in the upcoming Q2 2026 release for automated Tagged PDF generation to meet accessibility standards.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#pdf-parser</code>, <code class="language-plaintext highlighter-rouge">#data-engineering</code>, <code class="language-plaintext highlighter-rouge">#rag</code>, <code class="language-plaintext highlighter-rouge">#ai-infrastructure</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-38"></a></p>
<h2 id="superpowers-framework-enforces-structured-agentic-workflows-️-7010"><a href="https://github.com/obra/superpowers">Superpowers Framework Enforces Structured Agentic Workflows</a> ⭐️ 7.0/10</h2>

<p>Superpowers introduces a composable skills framework that prevents coding agents from immediately writing code, forcing a preliminary spec refinement phase instead. It automates a subagent-driven development process that adheres to strict Test-Driven Development (TDD), YAGNI, and DRY principles. The tool integrates directly into popular platforms like Claude Code, Cursor, and GitHub Copilot via plugin marketplaces. This project addresses the common failure mode where AI agents rush to implement solutions without fully understanding requirements or planning for testability. By enforcing a ‘think before you code’ methodology, it significantly reduces hallucinated features and technical debt in AI-generated software. The structured workflow allows agents to operate autonomously for longer periods while maintaining alignment with human intent. Ultimately, it transforms coding agents from simple text completers into reliable junior engineering partners. The framework operates by intercepting agent tasks to generate readable design chunks for user approval before creating detailed implementation plans. It utilizes a subagent architecture to execute engineering tasks, inspect work, and review progress without deviating from the agreed specification. Installation is streamlined across multiple environments, requiring only a single command in supported CLI tools like Gemini CLI or Codex.</p>

<p>rss · GitHub Trending - Daily · Apr 11, 01:32</p>

<p><strong>Background</strong>: Prior to frameworks like Superpowers, most AI coding assistants operated on a reactive basis, generating code snippets based on immediate prompts without a holistic project view. This often led to fragmented architectures and a lack of testing coverage because the models optimized for speed over correctness. Superpowers fills the niche of an orchestration layer that imposes software engineering discipline on Large Language Model outputs. It shifts the paradigm from prompt-response interactions to a managed software development lifecycle.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://grokipedia.com/page/Superpowers_agentic_skills_framework">Superpowers (agentic skills framework)</a></li>
<li><a href="https://en.wikipedia.org/wiki/YAGNI_principle">YAGNI principle</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight the framework’s ability to keep Claude Code focused on complex tasks for hours without drifting off-topic. However, some users note that the initial setup and strict adherence to TDD might feel slow for very small, throwaway scripts.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#software-development</code>, <code class="language-plaintext highlighter-rouge">#framework</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#workflow</code></p>

<hr />

<p><a id="item-39"></a></p>
<h2 id="open-source-mcp-server-bridges-claude-desktop-with-real-time-trading-data-️-7010"><a href="https://github.com/atilaahmettaner/tradingview-mcp">Open-Source MCP Server Bridges Claude Desktop with Real-Time Trading Data</a> ⭐️ 7.0/10</h2>

<p>The tradingview-mcp project introduces a new Model Context Protocol (MCP) server that integrates real-time cryptocurrency and stock screening directly into Claude Desktop. It provides immediate access to multi-exchange data from Binance, KuCoin, and Bybit alongside over 30 technical analysis tools. This release also includes built-in backtesting capabilities for six strategies and live sentiment analysis from Reddit and RSS feeds. This tool significantly lowers the barrier for developing AI-driven trading agents by eliminating complex infrastructure setup times. Unlike traditional setups requiring hours of Docker configuration or expensive Bloomberg terminals costing over $30,000 annually, this solution is free and ready in minutes. It empowers developers to leverage large language models for sophisticated financial analysis without needing deep expertise in data pipeline engineering. The integration of native Claude Desktop support allows for natural language querying of complex market conditions. The server supports Python 3.10+ and connects to major exchanges like Binance and Bybit for live market data. Key features include Bollinger Bands intelligence, candlestick pattern recognition, and Sharpe ratio calculations for backtesting. Installation is streamlined via PyPI, allowing users to configure the MCP server within the Claude Desktop settings immediately.</p>

<p>rss · GitHub Trending - Python · Apr 11, 01:37</p>

<p><strong>Background</strong>: Prior to this project, connecting AI assistants to real-time financial data required building custom APIs or relying on costly enterprise solutions. Developers often faced fragmented workflows where data retrieval, technical analysis, and model interaction were handled by separate, non-interoperable systems. The emergence of the Model Context Protocol (MCP) offers a standardized way to bridge these gaps, yet few implementations focused specifically on fintech. This project fills that niche by providing a dedicated, open-source bridge for trading workflows.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://modelcontextprotocol.io/docs/getting-started/intro">What is the Model Context Protocol (MCP)? - Model Context Protocol</a></li>
<li><a href="https://www.anthropic.com/news/model-context-protocol">Introducing the Model Context Protocol - Anthropic</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight the ease of setting up the server compared to manual scripting environments. Users appreciate the ability to ask Claude complex questions about market trends using natural language without writing code.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#mcp</code>, <code class="language-plaintext highlighter-rouge">#ai-trading</code>, <code class="language-plaintext highlighter-rouge">#claude-desktop</code>, <code class="language-plaintext highlighter-rouge">#fintech</code>, <code class="language-plaintext highlighter-rouge">#python</code></p>

<hr />

<p><a id="item-40"></a></p>
<h2 id="jetbrains-plugin-brings-claude-code-and-codex-gui-to-ide-️-7010"><a href="https://github.com/zhukunpenglinyutong/jetbrains-cc-gui">JetBrains Plugin Brings Claude Code and Codex GUI to IDE</a> ⭐️ 7.0/10</h2>

<p>A new JetBrains plugin named CC GUI provides a graphical interface for interacting with Claude Code and OpenAI Codex directly within the IDE. It supports dual AI engines, context-aware conversations, and an agent system with slash commands. The project recently renamed itself to mitigate trademark risks while enhancing security audit protocols. This tool bridges the gap between powerful CLI-based AI coding assistants and developers who prefer visual workflows inside their editor. By integrating directly into JetBrains IDEs, it reduces context switching and allows for seamless code reference using @file syntax. The addition of an agent system and MCP server support extends automation capabilities beyond simple chat interactions. However, its effectiveness remains dependent on the underlying performance of the Claude Code and Codex CLI tools. The plugin features intelligent conversation with image sending support, conversation rewind, and enhanced prompts. It includes a built-in agent system with skills like /init and /review, alongside comprehensive session management and history search. Security measures include regular audits and permission controls, while UI features offer theme switching and font synchronization.</p>

<p>rss · GitHub Trending - TypeScript · Apr 11, 01:39</p>

<p><strong>Background</strong>: Claude Code and OpenAI Codex are powerful AI coding tools that primarily operate via command-line interfaces, which can be cumbersome for some developers. Prior solutions often lacked deep IDE integration or forced users to switch between terminal windows and code editors. This project fills that niche by embedding these capabilities directly into the JetBrains ecosystem, offering a unified environment for AI-assisted development. It addresses the growing demand for visual interaction layers over headless AI agents.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/anthropics/claude-code/releases">Releases · anthropics/claude-code - GitHub</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#jetbrains</code>, <code class="language-plaintext highlighter-rouge">#ai-coding</code>, <code class="language-plaintext highlighter-rouge">#claude-code</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#ide-plugin</code></p>

<hr />

<p><a id="item-41"></a></p>
<h2 id="playwright-cli-optimizes-browser-automation-for-ai-agents-️-7010"><a href="https://github.com/microsoft/playwright-cli">Playwright CLI Optimizes Browser Automation for AI Agents</a> ⭐️ 7.0/10</h2>

<p>Microsoft has released a specialized Playwright CLI tool designed to expose browser automation capabilities as token-efficient SKILLS for coding agents. Unlike the Model Context Protocol (MCP) version, this interface avoids loading large tool schemas or verbose accessibility trees into the LLM context. It enables agents to execute concise commands for recording code, inspecting selectors, and managing browser sessions with minimal token overhead. This tool addresses the critical constraint of limited context windows in modern coding agents by prioritizing token efficiency over rich introspection. By using a CLI-based workflow, developers can integrate high-throughput browser testing into agentic loops without exhausting the model’s context budget on tool definitions. This makes it particularly valuable for workflows involving large codebases where every token counts, distinguishing it from MCP solutions better suited for persistent, state-heavy autonomous tasks. The CLI supports session management via memory or disk persistence and allows users to target specific browser instances using session flags. It integrates seamlessly with agents like Claude Code and GitHub Copilot, which can automatically discover available skills via the help command. The tool operates headless by default but supports headed mode for visual debugging when required.</p>

<p>rss · GitHub Trending - TypeScript · Apr 11, 01:39</p>

<p><strong>Background</strong>: As AI coding agents become more prevalent, the method of interfacing with external tools has split between rich protocols like MCP and lightweight CLI invocations. While MCP offers deep state retention for complex autonomous loops, it often incurs high token costs that are unsustainable for rapid, iterative coding tasks. This project fills the niche for a streamlined, command-line interface specifically engineered to reduce context load while maintaining robust Playwright automation capabilities.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#playwright</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#cli</code>, <code class="language-plaintext highlighter-rouge">#browser-automation</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code></p>

<hr />

<p><a id="item-42"></a></p>
<h2 id="chatlab-local-first-ai-agent-for-private-chat-analysis-️-7010"><a href="https://github.com/hellodigua/ChatLab">ChatLab: Local-First AI Agent for Private Chat Analysis</a> ⭐️ 7.0/10</h2>

<p>ChatLab introduces a desktop application that combines SQL engines with AI agents to analyze personal chat histories locally. It currently supports major platforms like WeChat, WhatsApp, and Telegram, with a unified data model for cross-platform normalization. The tool emphasizes streaming parsing to handle million-message scales without compromising performance. This project addresses the critical need for privacy-preserving memory retrieval by ensuring raw chat data never leaves the user’s device. Unlike cloud-based analytics, ChatLab allows users to leverage powerful AI agents for summarization and pattern recognition without exposing sensitive social interactions. It fills a niche for individuals seeking deep insights into their digital social history without relying on third-party servers. The architecture features a local-first design where the main Electron process handles lifecycle control while worker layers manage compute-intensive parsing tasks. It utilizes an agent-plus-function-calling workflow to enable dynamic searching and context-aware analysis rather than static hard-coded queries. Supported export formats are mapped to a consistent schema, allowing seamless switching between different chat applications.</p>
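
<p>A unified data model of this kind can be sketched in a few lines; the schema below is illustrative (not ChatLab’s actual tables), showing how exports from different platforms might be normalized into one SQL table that an agent can query on demand.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Illustrative only (not ChatLab's actual schema): a unified local message table
# that different chat exports are normalized into, queryable by SQL or an agent.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE messages (
        id        INTEGER PRIMARY KEY,
        platform  TEXT NOT NULL,      -- 'wechat', 'whatsapp', 'telegram'
        chat_id   TEXT NOT NULL,
        sender    TEXT NOT NULL,
        sent_at   TEXT NOT NULL,      -- ISO-8601, normalized across exports
        body      TEXT
    )
""")
conn.executemany(
    "INSERT INTO messages (platform, chat_id, sender, sent_at, body) VALUES (?, ?, ?, ?, ?)",
    [
        ("whatsapp", "family", "alice", "2026-04-10T19:02:00", "Dinner Friday?"),
        ("telegram", "reading-club", "bob", "2026-04-11T08:15:00", "Finished chapter 3"),
    ],
)

# The kind of query an agent's function-calling layer would issue on demand:
rows = conn.execute(
    "SELECT platform, COUNT(*) FROM messages GROUP BY platform ORDER BY 2 DESC"
).fetchall()
print(rows)
</code></pre></div></div>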

<p>rss · GitHub Trending - TypeScript · Apr 11, 01:39</p>

<p><strong>Background</strong>: As personal communication increasingly migrates to digital platforms, users accumulate vast amounts of unstructured chat data that are difficult to search or analyze meaningfully. Existing solutions often require uploading this sensitive data to the cloud, raising significant privacy concerns regarding data ownership and security. ChatLab solves this by providing a local-only environment where AI models operate directly on exported files, bridging the gap between large language model capabilities and personal data sovereignty.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://grokipedia.com/page/Running_Open-Source_LLMs_Locally">Running Open-Source LLMs Locally</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Dedicated community threads are still scarce, but the project’s open-source nature and public roadmap suggest growing engagement from privacy-conscious developers. Users are encouraged to submit issues and feature requests directly via GitHub to shape future support for platforms like iMessage and Messenger.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agent</code>, <code class="language-plaintext highlighter-rouge">#privacy</code>, <code class="language-plaintext highlighter-rouge">#chat-analysis</code>, <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#desktop-app</code></p>

<hr />

<p><a id="item-43"></a></p>
<h2 id="gpumd-high-performance-gpu-molecular-dynamics-engine-️-7010-1"><a href="https://github.com/brucefan1983/GPUMD">GPUMD: High-Performance GPU Molecular Dynamics Engine</a> ⭐️ 7.0/10</h2>

<p>GPUMD is a specialized molecular dynamics package optimized to run entirely on NVIDIA GPUs using CUDA. It enables researchers to simulate the physical movements of atoms and molecules with significantly higher efficiency than traditional CPU-based methods. Molecular dynamics simulations typically require vast computational resources to solve Newton’s equations for complex systems over time. By leveraging the parallel processing power of GPUs, GPUMD drastically reduces simulation time, allowing for longer trajectories and larger system sizes. This acceleration is critical for advancements in computational chemistry, materials science, and biophysics where analytical solutions are impossible. The software utilizes the CUDA programming model to harness thousands of GPU cores for simultaneous particle interaction calculations. It is designed specifically for high-performance computing (HPC) environments rather than general-purpose AI model training. Users can expect significant speedups for tasks involving interatomic potentials and force field calculations.</p>
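
<p>For readers outside computational chemistry, the kernel of the workload is easy to sketch: the toy snippet below performs one velocity-Verlet step for a Lennard-Jones system in NumPy, which is conceptually the per-step pairwise-force and integration work an MD engine like GPUMD parallelizes across thousands of CUDA cores (illustrative only; GPUMD itself is C++/CUDA and far more sophisticated).</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Conceptual sketch only (GPUMD itself is CUDA/C++): one velocity-Verlet step
# for a toy Lennard-Jones system, i.e. the inner loop an MD engine parallelizes.
import numpy as np

def lj_forces(pos, eps=1.0, sigma=1.0):
    disp = pos[:, None, :] - pos[None, :, :]          # pairwise displacement vectors
    r2 = (disp ** 2).sum(-1) + np.eye(len(pos))       # pad diagonal to avoid self-division
    inv6 = (sigma ** 2 / r2) ** 3
    fmag = 24.0 * eps * (2.0 * inv6 ** 2 - inv6) / r2
    np.fill_diagonal(fmag, 0.0)
    return (fmag[..., None] * disp).sum(axis=1)       # total force on each particle

def velocity_verlet(pos, vel, dt=1e-3, mass=1.0):
    f = lj_forces(pos)
    vel_half = vel + 0.5 * dt * f / mass
    pos_new = pos + dt * vel_half
    vel_new = vel_half + 0.5 * dt * lj_forces(pos_new) / mass
    return pos_new, vel_new

rng = np.random.default_rng(1)
pos = rng.uniform(0.0, 5.0, size=(64, 3))
vel = rng.normal(0.0, 0.1, size=(64, 3))
pos, vel = velocity_verlet(pos, vel)
print(pos.shape, float(np.abs(vel).mean()))
</code></pre></div></div>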

<p>rss · GitHub Trending - CUDA · Apr 11, 01:33</p>

<p><strong>Background</strong>: Traditional molecular dynamics packages often rely on CPU clusters, which can be cost-prohibitive and slow for large-scale simulations. While some tools offer hybrid CPU-GPU support, GPUMD distinguishes itself by being engineered from the ground up for GPU architecture. Because long trajectories accumulate numerical error and individual runs are chaotic, reliable results depend on extensive sampling; the speed gained from running natively on GPUs makes the long runs and large ensembles required for sound statistics practical.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Molecular_dynamics_simulation">Molecular dynamics simulation</a></li>
<li><a href="https://docs.nvidia.com/cuda/cuda-programming-guide/index.html">CUDA Programming Guide - NVIDIA Documentation</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project holds a solid score of 7.0, indicating strong utility within its niche despite being outside the core AI ecosystem. It is recognized as a vital tool for scientists needing to bridge the gap between theoretical models and macroscopic thermodynamic properties.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#molecular-dynamics</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#hpc</code>, <code class="language-plaintext highlighter-rouge">#computational-chemistry</code>, <code class="language-plaintext highlighter-rouge">#gpu</code></p>

<hr />]]></content><author><name></name></author><summary type="html"><![CDATA[From 102 items, 43 important content pieces were selected]]></summary></entry><entry xml:lang="zh"><title type="html">Horizon Summary: 2026-04-12 (ZH)</title><link href="https://ming-321.github.io/horizon/2026/04/11/summary-zh.html" rel="alternate" type="text/html" title="Horizon Summary: 2026-04-12 (ZH)" /><published>2026-04-11T16:00:00+00:00</published><updated>2026-04-11T16:00:00+00:00</updated><id>https://ming-321.github.io/horizon/2026/04/11/summary-zh</id><content type="html" xml:base="https://ming-321.github.io/horizon/2026/04/11/summary-zh.html"><![CDATA[<blockquote>
  <p>From 102 items, 43 important content pieces were selected</p>
</blockquote>

<hr />

<h3 id="头条速递">头条速递</h3>
<ol>
  <li><a href="#item-1">陈丹琦与刘壮发布开源通用视觉推理 RL 框架，无需思考数据即刷新 SOTA</a> ⭐️ 9.0/10</li>
  <li><a href="#item-2">小型开源模型在隔离代码检测中媲美 Mythos</a> ⭐️ 8.0/10</li>
  <li><a href="#item-3">中国初创灵初智能发布十万小时人类演示数据集助力具身 AI</a> ⭐️ 8.0/10</li>
  <li><a href="#item-4">FlashAttention FA1–FA4 的教育性 PyTorch 实现已发布</a> ⭐️ 8.0/10</li>
  <li><a href="#item-5">DFlash 推测解码在 Apple Silicon MLX 上实现 3.3 倍加速</a> ⭐️ 8.0/10</li>
  <li><a href="#item-6">阿里巴巴将 AI 战略从开源转向注重营收</a> ⭐️ 8.0/10</li>
  <li><a href="#item-7">利用 vLLM 和 8 张 AMD 显卡本地运行 Qwen3.5-397B MoE 模型</a> ⭐️ 8.0/10</li>
  <li><a href="#item-8">实验性 LLM 使用 K-Splanifolds 几何取代传统 MLP 解码器</a> ⭐️ 8.0/10</li>
  <li><a href="#item-9">OpenAI 收购 Cirrus Labs 并计划关闭 Cirrus CI 服务</a> ⭐️ 7.0/10</li>
  <li><a href="#item-10">谷歌在 Chrome 中推出 DBSC 技术以将会话加密绑定至硬件</a> ⭐️ 7.0/10</li>
  <li><a href="#item-11">普京命令研发国产人工智能基础模型以保障国家安全</a> ⭐️ 7.0/10</li>
</ol>

<h3 id="关注动态">关注动态</h3>
<ol>
  <li><a href="#item-12">openai/codex: 5 releases — rust-v0.121.0-alpha.2, rust-v0.121.0-alpha.1, rust-v0.120.0</a> ⭐️ ?/10</li>
</ol>

<h3 id="github-热榜">GitHub 热榜</h3>
<ol>
  <li><a href="#item-13">Karpathy 发布纯 C 和 CUDA 编写的极简 LLM 训练项目</a> ⭐️ 10.0/10</li>
  <li><a href="#item-14">Instant-NGP：闪电般的神经图形训练框架</a> ⭐️ 10.0/10</li>
  <li><a href="#item-15">Nous Research 推出自我进化的 Hermes 智能体框架</a> ⭐️ 9.0/10</li>
  <li><a href="#item-16">VoxCPM2：无分词器的多语言语音合成与克隆模型</a> ⭐️ 9.0/10</li>
  <li><a href="#item-17">Unsloth Studio：统一的本地大模型训练与推理界面</a> ⭐️ 9.0/10</li>
  <li><a href="#item-18">Feast：面向 MLOps 的生产级开源特征存储平台</a> ⭐️ 9.0/10</li>
  <li><a href="#item-19">Continue：支持源码控制检查的开源 AI 编程助手</a> ⭐️ 9.0/10</li>
  <li><a href="#item-20">Chrome DevTools MCP 连接 AI 代理与浏览器</a> ⭐️ 9.0/10</li>
  <li><a href="#item-21">DeepGEMM 推出专为 CUDA 优化的 FP8 矩阵乘法库</a> ⭐️ 9.0/10</li>
  <li><a href="#item-22">Mirage 通过持久化 CUDA 巨型内核优化大模型推理</a> ⭐️ 9.0/10</li>
  <li><a href="#item-23">SageAttention 通过量化加速 Transformer 推理</a> ⭐️ 9.0/10</li>
  <li><a href="#item-24">用于因果深度卷积的高效 CUDA 内核</a> ⭐️ 9.0/10</li>
  <li><a href="#item-25">微软 MarkItDown：优化 AI 代理的文档摄入流程</a> ⭐️ 8.0/10</li>
  <li><a href="#item-26">Archon：面向 AI 编码的确定性构建框架</a> ⭐️ 8.0/10</li>
  <li><a href="#item-27">Multica：管理 AI 编程代理的开源平台</a> ⭐️ 8.0/10</li>
  <li><a href="#item-28">Kronos：首个面向金融 K 线图的开源基础模型</a> ⭐️ 8.0/10</li>
  <li><a href="#item-29">jq：不可或缺的 JSON 数据处理命令行工具</a> ⭐️ 8.0/10</li>
  <li><a href="#item-30">Prefect：构建弹性数据管道的现代 Python 工作流编排框架</a> ⭐️ 8.0/10</li>
  <li><a href="#item-31">两小时从零训练 64M 参数的 GPT 模型</a> ⭐️ 8.0/10</li>
  <li><a href="#item-32">Claudian 将 AI 编程助手直接嵌入 Obsidian 笔记库</a> ⭐️ 8.0/10</li>
  <li><a href="#item-33">n8n：具备原生 AI 代理功能的公平代码自动化平台</a> ⭐️ 8.0/10</li>
  <li><a href="#item-34">英伟达发布用于 GPU 加速优化的 cuopt 库</a> ⭐️ 8.0/10</li>
  <li><a href="#item-35">Rowboat：具备持久记忆的本地优先 AI 同事框架</a> ⭐️ 7.0/10</li>
  <li><a href="#item-36">DeepTutor 推出原生代理个性化学习系统</a> ⭐️ 7.0/10</li>
  <li><a href="#item-37">OpenDataLoader PDF：专为 RAG 流水线打造的高精度解析器</a> ⭐️ 7.0/10</li>
  <li><a href="#item-38">Superpowers 框架强制执行结构化智能体工作流</a> ⭐️ 7.0/10</li>
  <li><a href="#item-39">开源 MCP 服务器将 Claude 桌面与实时交易数据连接起来</a> ⭐️ 7.0/10</li>
  <li><a href="#item-40">JetBrains 插件为 IDE 引入 Claude Code 和 Codex 图形界面</a> ⭐️ 7.0/10</li>
  <li><a href="#item-41">Playwright CLI 为 AI 代理优化浏览器自动化</a> ⭐️ 7.0/10</li>
  <li><a href="#item-42">ChatLab：本地优先的私密聊天记录 AI 分析工具</a> ⭐️ 7.0/10</li>
  <li><a href="#item-43">GPUMD：高性能 GPU 分子动力学引擎</a> ⭐️ 7.0/10</li>
</ol>

<h2 id="头条速递-1">头条速递</h2>

<p><a id="item-1"></a></p>
<h2 id="陈丹琦与刘壮发布开源通用视觉推理-rl-框架无需思考数据即刷新-sota-️-9010"><a href="https://www.qbitai.com/2026/04/399393.html">陈丹琦与刘壮发布开源通用视觉推理 RL 框架，无需思考数据即刷新 SOTA</a> ⭐️ 9.0/10</h2>

<p>著名研究人员陈丹琦和刘壮发布了一个新的开源通用视觉推理强化学习（RL）框架。该框架通过利用广泛的数据扩展而非依赖显式的“思考数据”或思维链标注，实现了最先进（SOTA）的性能。该方法证明了广泛的数据覆盖是扩展 RL 智能体视觉推理能力的主要驱动力。 这一突破意义重大，因为它挑战了当前的普遍假设，即高质量、显式标注的推理轨迹对于训练先进的视觉 AI 模型至关重要。通过消除对昂贵的“思考数据”的需求，这种方法可以大幅降低训练强大视觉语言模型所需的资源，使高性能 AI 更易于获取。这表明了一种范式转变，即在强化学习环境中，数据的多样性和数量比监督信号的复杂性更重要。因此，这可能会加速自主智能体的研究，使其能够在没有人类引导的推理示例的情况下感知并推理复杂的视觉环境。 该框架专门针对通用视觉推理任务，并且在不包含先前工作（如 VisualRFT 或 Seg-Zero）中常用的专用思考数据的情况下也能有效运行。技术分析表明，多样化感知数据的扩展是增强推理能力的核心机制，而不仅仅是架构上的改变。该发布完全开源，允许社区立即复现结果并在此以数据为中心的方法基础上进行构建。</p>

<p>rss · 量子位 · Apr 11, 01:23</p>

<p><strong>背景</strong>: AI 中的视觉推理通常涉及视觉语言模型（VLM），这些模型必须首先准确感知视觉输入，然后才能执行逻辑演绎。传统上，改进这些模型依赖于“思考数据”，即由人类或其他模型生成的逐步推理轨迹或思维链标注，以指导学习过程。强化学习（RL）最近被集成到 VLM 中，通过试错增强其解决复杂任务的能力，但大多数方法仍然严重依赖这些监督推理信号。最近的研究探索了两阶段框架，将感知增强与推理优化分开，但对高质量推理数据的依赖仍然是一个瓶颈。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://arxiv.org/html/2509.13031v1">Perception Before Reasoning: Two-Stage Reinforcement Learning for Visual Reasoning in Vision-Language Models</a></li>
<li><a href="https://arxiv.org/html/2505.12081">VisionReasoner: Unified Reasoning-Integrated Visual Perception via Reinforcement Learning</a></li>
<li><a href="https://www.nature.com/articles/s44387-025-00027-5">Fast, slow, and metacognitive thinking in AI | npj Artificial Intelligence</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#reinforcement learning</code>, <code class="language-plaintext highlighter-rouge">#computer vision</code>, <code class="language-plaintext highlighter-rouge">#ai research</code>, <code class="language-plaintext highlighter-rouge">#open source</code>, <code class="language-plaintext highlighter-rouge">#sota</code></p>

<hr />

<p><a id="item-2"></a></p>
<h2 id="小型开源模型在隔离代码检测中媲美-mythos-️-8010"><a href="https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jagged-frontier">小型开源模型在隔离代码检测中媲美 Mythos</a> ⭐️ 8.0/10</h2>

<p>一项新分析显示，当提供隔离的代码上下文时，小型且具成本效益的开源权重模型能够检测到与 Anthropic 先进的 Mythos 系统相同的软件漏洞。具体而言，在测试的八个模型中（包括一个仅有 36 亿参数、每百万 token 成本仅 0.11 美元的模型），全部成功识别了 Mythos 的旗舰级 FreeBSD 漏洞利用案例。这一发现挑战了只有大型昂贵模型才能进行高水平 AI 驱动安全研究的假设。 这一进展显著降低了自动漏洞发现的门槛，表明有效的 AI 安全工具并不需要巨大的计算资源或专有访问权限。这意味着行业可能发生转变，小型组织可以利用负担得起的开源模型进行强有力的代码审计，而无需依赖精英封闭系统。然而，这也突显了分析孤立代码片段与导航复杂现实世界软件架构之间的关键区别。最终，这可能会使安全研究大众化，同时迫使人们重新评估 AI 代理在生产环境中的部署方式。 该研究专门从 Anthropic 展示的漏洞中隔离了相关代码部分，从而消除了模型在庞大代码库中搜索的需求。虽然一个 36 亿参数的模型以极低的成本取得了成功，但专家指出，这种方法绕过了漏洞挖掘中最困难的部分：在大型复杂程序中定位脆弱代码。因此，这些结果仅适用于可疑代码已被知晓并提取的场景，而非全系统的黑盒测试。</p>

<p>hackernews · dominicq · Apr 11, 16:47</p>

<p><strong>背景</strong>: Anthropic 最近推出了名为 ‘Mythos’ 的先进 AI 系统，旨在发现并利用主要操作系统和浏览器中的零日漏洞。AI 网络安全的核心挑战传统上分为两部分：首先，扫描海量代码库以寻找潜在缺陷；其次，一旦找到缺陷，正确分析其逻辑。’开源权重模型’指的是参数公开可用的 AI 模型，允许它们在本地或廉价的云基础设施上运行，这与通过 API 访问的专有模型不同。’隔离代码上下文’的概念涉及向 AI 提供特定的函数或片段，而不是整个项目，这简化了推理任务但移除了架构上下文。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jagged-frontier">AI Cybersecurity After Mythos: The Jagged Frontier | AISLE</a></li>
<li><a href="https://red.anthropic.com/2026/mythos-preview/">Claude Mythos Preview \ red.anthropic.com</a></li>
<li><a href="https://www.qodo.ai/blog/the-next-generation-of-ai-code-review-from-isolated-to-system-intelligence/">The Next Generation of AI Code Review: From Isolated to System Intelligence</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 社区成员普遍同意，虽然技术结果令人印象深刻，但该方法论通过忽略在大型代码库中定位漏洞的难度而制造了错误的等同性。像 tptacek 和 antirez 这样的评论者强调，真正的挑战在于在复杂程序中发现脆弱模式，而不仅仅是在代码片段被交给模型后分析它。大家一致认为，隔离代码从根本上改变了任务的性质，因此不能证明小型模型可以取代大型模型进行端到端的安全审计。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-security</code>, <code class="language-plaintext highlighter-rouge">#llm-efficiency</code>, <code class="language-plaintext highlighter-rouge">#vulnerability-research</code>, <code class="language-plaintext highlighter-rouge">#open-source-ai</code>, <code class="language-plaintext highlighter-rouge">#code-analysis</code></p>

<hr />

<p><a id="item-3"></a></p>
<h2 id="中国初创灵初智能发布十万小时人类演示数据集助力具身-ai-️-8010"><a href="https://www.qbitai.com/2026/04/399417.html">中国初创灵初智能发布十万小时人类演示数据集助力具身 AI</a> ⭐️ 8.0/10</h2>

<p>中国初创公司灵初智能正式发布了一个包含 10 万小时人类演示数据的突破性数据集，专为训练具身 AI 模型而设计。这一庞大的数据集旨在通过提供前所未有的大规模真实世界交互示例来加速机器人学习。此次发布标志着这家由</p>

<p>rss · 量子位 · Apr 11, 02:07</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#embodied ai</code>, <code class="language-plaintext highlighter-rouge">#robotics</code>, <code class="language-plaintext highlighter-rouge">#datasets</code>, <code class="language-plaintext highlighter-rouge">#machine learning</code>, <code class="language-plaintext highlighter-rouge">#china tech</code></p>

<hr />

<p><a id="item-4"></a></p>
<h2 id="flashattention-fa1fa4-的教育性-pytorch-实现已发布-️-8010"><a href="https://old.reddit.com/r/MachineLearning/comments/1sim6y1/flashattention_fa1fa4_in_pytorch_educational/">FlashAttention FA1–FA4 的教育性 PyTorch 实现已发布</a> ⭐️ 8.0/10</h2>

<p>一位开发者更新了 FlashAttention-PyTorch 仓库，发布了使用纯 PyTorch 编写的 FlashAttention 版本 1 至 4 的简化教育性实现。这些实现清晰地展示了算法的演进过程，例如从 FA1 的分块在线 softmax 到 FA4 带有条件重缩放功能的显式调度器。该项目旨在阐明诸如分裂 Q 所有权和分级流水线等设计变更，而无需读者具备深厚的 CUDA 或 Hopper、Blackwell 等特定 GPU 架构知识。 该资源意义重大，因为它降低了理解复杂注意力优化机制的门槛，而这些机制通常隐藏在高度优化的 CUDA 内核中。通过在易于理解的 PyTorch 代码中展示算法逻辑，它使研究人员和工程师能够掌握推动现代 Transformer 模型效率提升的具体改进。这种清晰度对于将这些技术适配到新硬件或开发自定义变体至关重要，无需再去逆向工程底层的 C++ 或 Triton 代码。最终，它在理论算法论文与实际高性能实现细节之间架起了桥梁。 该仓库具体将 FA1 描述为分块在线 softmax 基线，而 FA2 引入了分裂 Q 查询块所有权和延迟归一化。FA3 增加了带有乒乓块缓冲区的显式分级流水线及简化的 FP8 前向路径，而 FA4 则采用了管理主计算、softmax 和校正阶段的显式调度器。作者强调这些并非生产就绪的内核，也未忠实复现官方版本中特定的硬件优化。相反，它们保留了精确的注意力数学计算，同时通过改变编排策略来突出各版本间的差异。</p>
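
<p>为了更直观地理解上文提到的“分块在线 softmax”，下面给出一个纯 PyTorch 的极简示意（仅演示 FA1 的核心思想，并非该仓库的实际实现，也不包含任何 GPU 层面的优化；数值上与标准注意力精确等价）：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch

def tiled_attention(q, k, v, block=64):
    # 按块遍历 K/V，在线维护每行的最大值与归一化项，避免构造完整注意力矩阵。
    scale = q.shape[-1] ** -0.5
    T = k.shape[0]
    out = torch.zeros_like(q)
    row_max = torch.full((q.shape[0], 1), float("-inf"))
    row_sum = torch.zeros(q.shape[0], 1)
    for start in range(0, T, block):
        kb, vb = k[start:start + block], v[start:start + block]
        s = (q @ kb.T) * scale                            # 当前块的注意力得分
        new_max = torch.maximum(row_max, s.max(dim=-1, keepdim=True).values)
        correction = torch.exp(row_max - new_max)         # 旧累计量的重缩放因子
        p = torch.exp(s - new_max)
        row_sum = row_sum * correction + p.sum(dim=-1, keepdim=True)
        out = out * correction + p @ vb
        row_max = new_max
    return out / row_sum

q, k, v = (torch.randn(128, 64) for _ in range(3))
ref = torch.softmax((q @ k.T) * 64 ** -0.5, dim=-1) @ v
print(torch.allclose(tiled_attention(q, k, v), ref, atol=1e-5))   # True
</code></pre></div></div>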

<p>rss · r/MachineLearning · Apr 11, 15:33</p>

<p><strong>背景</strong>: FlashAttention 是一种感知输入输出（IO）的精确注意力算法，旨在利用分块技术减少 GPU 高带宽内存（HBM）与片上 SRAM 之间的内存读写次数。标准注意力机制常受限于内存瓶颈，而 FlashAttention 通过将数据处理为适合更快片上内存的块来缓解这一问题。从 FA1 到 FA4 的演进涉及日益复杂的调度和流水线技术，以在 NVIDIA 的 Hopper 和 Blackwell 等先进 GPU 架构上最大化计算与内存操作的重叠。理解这些算法通常需要浏览复杂的 CUDA 代码，而这个教育项目对此进行了简化。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.together.ai/blog/flashattention-4">FlashAttention-4: Algorithm and Kernel Pipelining Co-Design for Asymmetric Hardware Scaling</a></li>
<li><a href="https://alexdremov.me/understanding-flash-attention-writing-the-algorithm-from-scratch-in-triton/">Understanding Flash Attention: Writing the Algorithm from Scratch in Triton</a></li>
<li><a href="https://intuitionlabs.ai/articles/blackwell-vs-hopper-gpu-architecture-comparison">Blackwell vs Hopper : A Deep Dive GPU Architecture ... | IntuitionLabs</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#flashattention</code>, <code class="language-plaintext highlighter-rouge">#pytorch</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#transformers</code>, <code class="language-plaintext highlighter-rouge">#education</code></p>

<hr />

<p><a id="item-5"></a></p>
<h2 id="dflash-推测解码在-apple-silicon-mlx-上实现-33-倍加速-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1simszl/dflash_speculative_decoding_on_apple_silicon_85/">DFlash 推测解码在 Apple Silicon MLX 上实现 3.3 倍加速</a> ⭐️ 8.0/10</h2>

<p>一位开发者为 Apple Silicon 创建了原生的 MLX DFlash 推测解码实现，在 M5 Max 芯片上使用 Qwen3.5-9B 模型达到了每秒 85 个令牌的速度。该新方法利用一个小模型通过块扩散（block diffusion）并行生成 16 个令牌，然后由目标模型在一次前向传播中进行验证。结果显示，与基线相比速度提升了 3.3 倍，同时保持了与贪婪解码逐位一致的准确性。 这一突破显著增强了在消费级硬件上本地运行大型语言模型的可行性，特别是解决了 Apple 统一内存架构受带宽限制的问题。通过将推理延迟降低三倍以上，它使得使用 MLX 框架的开发者更容易实现实时交互式应用。此外，这表明像块扩散这样的新型解码策略即使在非 CUDA 平台上也能超越传统的自回归方法。这可能会加速对隐私和低延迟至关重要的边缘 AI 解决方案的采用。 该实现需要特定的优化，包括修补 MLX 的 steel_attention 以支持 Qwen3.5 的 head_dim=256，并将每个周期的 GPU 到 CPU 同步点从两个减少到一个。性能因模型大小和量化方式而异，8 比特量化比 4 比特产生了更好的加速比，因为后者使验证步骤过快，导致 BF16 草稿模型成为瓶颈。在所有测试配置中，草稿令牌的接受率在 80% 到 87% 之间。</p>
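
<p>下面用一个极简的 Python 片段示意“草稿块 + 一次前向验证”的接受逻辑（仅为概念演示，变量与流程均为假设，并非 DFlash 的实际实现；块扩散草稿生成、KV cache 回滚等细节均已省略）：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch

def verify_draft(target_logits, draft_tokens):
    # 目标模型对整个草稿块做一次前向后，与贪婪解码逐位比对：
    # 接受最长匹配前缀，其后第一个不匹配位置改用目标模型自己的 token。
    greedy = target_logits.argmax(dim=-1)
    matches = (greedy == draft_tokens).long()
    n_accept = int(matches.cumprod(dim=0).sum())          # 最长匹配前缀长度
    accepted = draft_tokens[:n_accept].tolist()
    if n_accept != draft_tokens.numel():
        accepted.append(int(greedy[n_accept]))            # 额外免费获得一个正确 token
    return accepted

vocab, block = 1000, 16
draft = torch.randint(0, vocab, (block,))
logits = torch.randn(block, vocab)
logits[torch.arange(10), draft[:10]] += 100.0             # 构造：前 10 位与草稿一致
logits[10, draft[10]] -= 1000.0                           # 构造：第 11 位刻意不一致
print(len(verify_draft(logits, draft)))                   # 11 = 10 个草稿 token + 1 个目标 token
</code></pre></div></div>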

<p>rss · r/LocalLLaMA · Apr 11, 15:56</p>

<p><strong>背景</strong>: 推测解码是一种通过使用更小更快的“草稿”模型提出多个令牌，然后由更大的“目标”模型并行验证而非顺序生成，从而加速大语言模型推理的技术。DFlash 特别采用了“块扩散”（block diffusion）方法，即草稿模型同时生成一块令牌而不是逐个生成，从而提高了效率。MLX 是 Apple 专为 Apple Silicon 机器学习设计的数组框架，利用其统一内存架构允许 CPU 和 GPU 之间高效共享数据而无需复制。传统上，这些优化技术主要是在 NVIDIA CUDA 生态系统中开发的，因此原生的 Apple Silicon 实现非常罕见。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://z-lab.ai/projects/dflash/">DFlash : Block Diffusion for Flash Speculative Decoding - Z Lab</a></li>
<li><a href="https://developer.apple.com/videos/play/wwdc2025/315/">Get started with MLX for Apple silicon - WWDC25... - Apple Developer</a></li>
<li><a href="https://www.emergentmind.com/topics/dflash-block-diffusion-for-flash-speculative-decoding">DFlash : Accelerating LLMs with Block Diffusion</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#apple silicon</code>, <code class="language-plaintext highlighter-rouge">#speculative decoding</code>, <code class="language-plaintext highlighter-rouge">#mlx</code>, <code class="language-plaintext highlighter-rouge">#local llm</code>, <code class="language-plaintext highlighter-rouge">#inference optimization</code></p>

<hr />

<p><a id="item-6"></a></p>
<h2 id="阿里巴巴将-ai-战略从开源转向注重营收-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sip3hd/ft_chinas_alibaba_shifts_towards_revenue_over/">阿里巴巴将 AI 战略从开源转向注重营收</a> ⭐️ 8.0/10</h2>

<p>据《金融时报》报道，阿里巴巴正在调整其人工智能战略，从贡献开源模型转向通过专有系统优先创造营收。这一转变标志着该公司放弃了此前向全球社区发布如 Qwen 系列等强大开放权重模型的做法。如今，阿里巴巴计划将其最先进的能力保留在内部或仅通过付费 API 服务提供，以直接实现其 AI 投资的货币化。 这家中国科技巨头的战略转折可能会显著减少全球开发者和研究人员可获得的高质量开放权重模型数量。这标志着一个更广泛的行业趋势，即公司正从社区驱动的增长转向保护知识产权以获取即时财务回报。如果其他公司效仿，全球 AI 生态系统中的协作创新步伐可能会大幅放缓。此外，这一变化可能通过限制此前公开共享的最先进工具的访问权限，从而改变中美 AI 开发者之间的竞争格局。 报道强调，虽然阿里巴巴可能仍会发布一些较小或较旧的模型，但其尖端研究将越来越多地保留用于商业产品。这一决定可能源于训练大型语言模型的高昂成本以及向股东展示盈利能力的压力。那些依赖阿里巴巴 Qwen 模型进行本地部署的开发人员可能需要寻找替代的开源基础或转向付费云服务。摘要中尚未详细说明未来模型完全转为专有的确切时间表。</p>

<p>rss · r/LocalLLaMA · Apr 11, 17:23</p>

<p><strong>背景</strong>: 开源 AI 指的是公开发布权重和架构的机器学习模型，允许任何人免费检查、修改和本地运行它们。阿里巴巴一直是这一领域的主要贡献者，尤其是其 Qwen 系列，因在编码和推理任务中的强劲表现而被广泛采用。历史上，公开释放模型有助于公司建立品牌声誉并促进生态系统采用，即使这意味着免费提供有价值的技术。然而，随着 AI 开发成本飙升，许多公司正在重新评估开源是否仍是一种可持续的商业模式。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#alibaba</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#ai-strategy</code>, <code class="language-plaintext highlighter-rouge">#industry-dynamics</code>, <code class="language-plaintext highlighter-rouge">#china-tech</code></p>

<hr />

<p><a id="item-7"></a></p>
<h2 id="利用-vllm-和-8-张-amd-显卡本地运行-qwen35-397b-moe-模型-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1simsqp/run_qwen35397ba13b_with_vllm_and_8xr9700/">利用 vLLM 和 8 张 AMD 显卡本地运行 Qwen3.5-397B MoE 模型</a> ⭐️ 8.0/10</h2>

<p>社区最新教程展示了如何利用 vLLM、ROCm 以及八张消费级 AMD R9700 显卡，配合 MXFP4 量化技术本地运行拥有 3970 亿参数的 Qwen3.5 MoE 模型。该指南提供了专门的 Dockerfile 和启动脚本，通过修补 Triton 以在 RDNA4 架构上支持 MXFP4，在多请求负载下实现了高达每秒 100 token 的生成速度。此配置允许模型在占用约 98% 显存的情况下，支持 131,072 token 的上下文窗口。 这一进展显著降低了在非 NVIDIA 硬件上运行最先进混合专家（MoE）模型的门槛，挑战了仅依赖 CUDA 生态系统的现状。通过证明近 4000 亿参数的模型可以通过 MXFP4 量化在消费级 AMD 显卡上运行，它为高性价比的高性能本地 AI 部署开辟了新的可能性。这一成就突显了 AMD ROCm 软件栈日益成熟的稳定性以及 vLLM 在支持多样化硬件配置方面的灵活性。最终，这使得开发者和研究人员无需依赖昂贵的云基础设施或企业级 NVIDIA 集群即可实验超大规模模型。 该设置需要基于特定的 Docker 镜像构建自定义修补版的 vLLM，以便在 RDNA4 GPU 上启用 MXFP4 支持，其中包括使用 sed 命令修改 Triton 的 topk.py 文件。性能数据显示初始加载时间为 400 至 600 秒，随后单请求生成速度为每秒 30 token，而在处理四个并发请求时可达每秒 100 token。用户必须配置如 HIP_VISIBLE_DEVICES 等环境变量，并调整功率限制（测试对比了 210W 与 300W）以优化吞吐量，同时模型被限制为最多 4 个并发序列以保持稳定性。</p>

<p>rss · r/LocalLLaMA · Apr 11, 15:56</p>

<p><strong>背景</strong>: vLLM 是一个以高吞吐量和内存效率著称的推理引擎，广泛用于在生产环境中部署大型语言模型。ROCm 是 AMD 推出的开源 GPU 编程软件栈，作为 NVIDIA CUDA 的替代方案，用于在 AMD 硬件上加速 AI 工作负载。MXFP4 是一种新兴的微缩放浮点格式，旨在通过将权重压缩至 4 位来减少大模型的内存占用并提高推理速度。混合专家（MoE）架构（如 Qwen3.5 所采用的）针对每个 token 仅激活一部分参数，从而在保持高效计算的同时实现巨大的总参数量。</p>
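
<p>以下是按帖子描述推测的 vLLM Python 离线接口用法草图：模型标识与 <code class="language-plaintext highlighter-rouge">quantization="mxfp4"</code> 等取值均为假设，需要在按教程打过 MXFP4 补丁的 vLLM/ROCm 环境中才可能生效，此处仅用于说明张量并行、上下文长度与并发上限等设置如何对应到代码。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 示意性片段：vLLM 离线推理，8 卡张量并行（参数取值为按帖子内容做出的假设）
import os
os.environ["HIP_VISIBLE_DEVICES"] = "0,1,2,3,4,5,6,7"   # ROCm 下选择 8 张 GPU

from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen3.5-397B-A13B",     # 假设的模型标识，按实际仓库名替换
    tensor_parallel_size=8,              # 8 卡张量并行
    quantization="mxfp4",                # 假设：需打过 MXFP4 补丁的 vLLM 才支持该取值
    max_model_len=131072,                # 131,072 token 上下文窗口
    gpu_memory_utilization=0.98,         # 约 98% 显存占用
    max_num_seqs=4,                      # 限制并发序列数以保证稳定性
)
params = SamplingParams(temperature=0.7, max_tokens=512)
print(llm.generate(["用一句话解释混合专家模型。"], params)[0].outputs[0].text)
</code></pre></div></div>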

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/vllm-project/vllm">vllm -project/ vllm : A high-throughput and memory-efficient inference ...</a></li>
<li><a href="https://en.wikipedia.org/wiki/ROCm">ROCm - Wikipedia</a></li>
<li><a href="https://www.amd.com/en/products/software/rocm.html">AMD ROCm ™ software empowers developers to optimize AI and...</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#vllm</code>, <code class="language-plaintext highlighter-rouge">#quantization</code>, <code class="language-plaintext highlighter-rouge">#rocm</code>, <code class="language-plaintext highlighter-rouge">#qwen</code></p>

<hr />

<p><a id="item-8"></a></p>
<h2 id="实验性-llm-使用-k-splanifolds-几何取代传统-mlp-解码器-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sivm24/heres_how_my_llms_decoder_block_changed_while/">实验性 LLM 使用 K-Splanifolds 几何取代传统 MLP 解码器</a> ⭐️ 8.0/10</h2>

<p>一位研究人员成功训练了一个拥有 1800 万参数的实验性大语言模型，该模型用一种名为 K-Splanifolds 的基于样条的流形几何结构取代了解码块中的传统 MLP 层。 这一进展意义重大，因为它通过引入一种新的非线性变换几何方法，挑战了多年来依赖 MLP 层的标准 Transformer 架构的主导地位。如果该方法被证明具有可扩展性，K-Splanifolds 可能成为传统密集层的一种更高参数效率的替代方案，从而潜在地降低未来模型的训练和推理计算成本。该实验为替代神经网络几何结构提供了罕见的实证证据，鼓励研究社区探索超越当前最先进设计的更多可能性。在这个小规模模型上的成功可能会激发更大规模的实验，进而重新定义我们在深度学习中构建解码块的方式。</p>

<p>rss · r/LocalLLaMA · Apr 11, 21:33</p>

<p><strong>背景</strong>: 在标准的 Transformer 架构中，解码块通常由自注意力机制后接一个多层感知机（MLP，也称为前馈网络）组成，后者独立处理每个位置的信息。这些 MLP 层对于引入非线性和扩展模型学习复杂模式的能力至关重要，但它们占据了模型参数和计算成本的很大一部分。机器学习中的“流形几何”概念指的是高维数据通常位于或接近一个低维曲面的思想，而这种新方法试图直接利用这一特性。通过用基于样条的灵活流形取代 MLP 僵化的网格状结构，研究人员旨在更自然、更高效地对数据分布进行建模。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm-architecture</code>, <code class="language-plaintext highlighter-rouge">#ml-research</code>, <code class="language-plaintext highlighter-rouge">#transformers</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#experimental-ai</code></p>

<hr />

<p><a id="item-9"></a></p>
<h2 id="openai-收购-cirrus-labs-并计划关闭-cirrus-ci-服务-️-7010"><a href="https://cirruslabs.org/">OpenAI 收购 Cirrus Labs 并计划关闭 Cirrus CI 服务</a> ⭐️ 7.0/10</h2>

<p>OpenAI 以人才为导向收购了 Cirrus Labs，旨在增强其在代理工具（agentic tooling）方面的工程能力。作为此次收购的直接结果，流行的持续集成服务 Cirrus CI 将于 2026 年 6 月 1 日正式停止运营。这一举动表明 OpenAI 的战略重心转向获取人类专业知识，而非维持现有的产品线。 此次收购凸显了一个日益明显的趋势，即大型 AI 公司更倾向于囤积人才而非保持产品的连续性，这可能会破坏关键的开源基础设施。像 SciPy 和 PostgreSQL 这样依赖 Cirrus CI 进行构建流程的主要项目，现在面临着紧急的迁移挑战和潜在的工作流中断。与整合技术的产品导向型收购不同，这笔交易从生态系统中移除了一项关键服务，迫使社区匆忙寻找替代方案。这也引发了更广泛的担忧：当开源依赖项由容易成为“人才收购”目标的小型团队支持时，其脆弱性令人担忧。 Cirrus CI 的关闭计划定于 2026 年 6 月 1 日星期一，留给用户的迁移时间已不足两个月。此次收购被明确描述为非产品导向，这意味着 Cirrus CI 平台本身不会被整合到 OpenAI 的产品中，而是将被停用。Cirrus Labs 团队计划在 OpenAI 内部专注于为人类工程师和代理工程师构建新的环境。</p>

<p>hackernews · seekdeep · Apr 11, 13:01</p>

<p><strong>背景</strong>: Cirrus Labs 以其提供的 Cirrus CI 而闻名，这是一个基于云的持续集成和交付平台，因其灵活性和对各种容器的支持而被开源项目广泛使用。持续集成（CI）是一种 DevOps 实践，代码变更会在其中自动进行测试和构建，是软件可靠性的关键支柱。开源项目通常依赖于小型供应商提供的免费或低成本层级，如果这些供应商被收购并关闭服务，它们就会变得非常脆弱。此次事件与典型的科技收购形成对比，后者的目标通常是扩展产品而不是终止它。</p>

<p><strong>社区讨论</strong>: 社区成员对开源基础设施的稳定性表达了重大担忧，指出 SciPy 和 PostgreSQL 等主要项目直接受到了此次关闭的影响。一些用户澄清这是一次人才收购而非产品合并，强调了与 Astral 等近期交易相比该服务即将丧失的后果。此外，还有一种愤世嫉俗的情绪，认为 AI 公司反复购买开发团队却随后停用其公共工具的做法令人失望。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#openai</code>, <code class="language-plaintext highlighter-rouge">#acquisitions</code>, <code class="language-plaintext highlighter-rouge">#ci-cd</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#agentic-ai</code></p>

<hr />

<p><a id="item-10"></a></p>
<h2 id="谷歌在-chrome-中推出-dbsc-技术以将会话加密绑定至硬件-️-7010"><a href="https://security.googleblog.com/2026/04/protecting-cookies-with-device-bound.html">谷歌在 Chrome 中推出 DBSC 技术以将会话加密绑定至硬件</a> ⭐️ 7.0/10</h2>

<p>谷歌已在 Windows 版 Chrome 146 更新中正式推出“设备绑定会话凭据”（DBSC）功能，这是由 Chrome 团队与谷歌账户安全团队联合开发的新安全特性。该技术利用 TPM 等硬件安全模块生成本地存储且无法导出的密钥对，将用户的身份验证会话与特定物理设备进行加密绑定。因此，即使攻击者窃取了用户的会话 Cookie，也无法在其他设备上重用这些凭据，从而从根本上阻断了传统的 Cookie 窃取攻击路径。 此次更新标志着 Web 会话管理的根本性转变，它将信任基础从易被窃取的软件令牌转移到了安全的硬件边界上，显著提高了身份盗窃的难度。该功能直接缓解了普遍的会话劫持威胁，即攻击者在通过恶意软件或网络嗅探拦截凭据后冒充用户的行为。通过使被盗 Cookie 在原始设备环境之外失效，DBSC 无需用户改变操作习惯即可有效防御日益复杂的信息窃取恶意软件。这种基于浏览器的身份保护新方法为行业树立了新标准，竞争对手可能很快也需要跟进采用。 DBSC 的实现依赖于可信平台模块（TPM）或等效的硬件安全功能，以确保用于会话绑定的私钥永远不会离开设备。虽然目前仅在 Windows 版 Chrome 上推出，但该架构旨在防止加密密钥被导出，这意味着服务器端的验证将拒绝来自未授权硬件的身份验证请求。这种对硬件绑定密钥的特别关注解决了传统 Cookie 的局限性，即一旦被盗，攻击者可以自由复制并重放这些凭据。</p>

<p>telegram · zaihuapd · Apr 11, 00:18</p>

<p><strong>背景</strong>: 会话劫持是一种常见的网络攻击，犯罪分子通过窃取通常存储在 Cookie 中的用户会话 ID，在无需密码的情况下非法访问在线账户。传统的防御措施依赖 HTTPS 加密和较短的过期时间，但这并不能阻止攻击者在有效期内使用被盗的 Cookie。像 TPM 这样的硬件安全模块是专门设计用于在隔离环境中安全存储加密密钥并执行操作的芯片，非常适合作为数字身份的锚点。DBSC 利用这种硬件能力，在数字会话与物理机器之间建立了软件方案无法复制的绑定关系。</p>
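
<p>下面用 cryptography 库做一个纯软件的概念模拟，展示“挑战 - 签名”式设备绑定的基本流程：真实的 DBSC 私钥由 TPM 生成且不可导出，这里仅用于说明为什么只持有被盗 Cookie、却没有设备私钥的攻击者无法通过服务器验签。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 概念示意：设备绑定会话的“挑战-签名”流程（软件模拟；真实 DBSC 的私钥存于 TPM，不可导出）
# 依赖：pip install cryptography
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

# 1) 会话建立时，浏览器在设备上生成密钥对，只把公钥交给服务器
device_private_key = ec.generate_private_key(ec.SECP256R1())
server_known_public_key = device_private_key.public_key()

# 2) 会话刷新时，服务器下发随机挑战，浏览器用设备私钥签名
challenge = os.urandom(32)
signature = device_private_key.sign(challenge, ec.ECDSA(hashes.SHA256()))

# 3) 服务器验签：只有持有原设备私钥的一方才能通过；验签失败会抛出 InvalidSignature
server_known_public_key.verify(signature, challenge, ec.ECDSA(hashes.SHA256()))
print("challenge verified: session is bound to this device")
</code></pre></div></div>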

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.eccouncil.org/cybersecurity-exchange/ethical-hacking/how-to-prevent-session-hijacking-attacks/">What Is Session Hijacking ? Session Hijacking Attack Prevention</a></li>
<li><a href="https://develop-descope.vercel.app/learn/post/session-hijacking">Session Hijacking Explained &amp; How to Prevent It</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#cybersecurity</code>, <code class="language-plaintext highlighter-rouge">#google chrome</code>, <code class="language-plaintext highlighter-rouge">#session-management</code>, <code class="language-plaintext highlighter-rouge">#web-security</code>, <code class="language-plaintext highlighter-rouge">#identity-protection</code></p>

<hr />

<p><a id="item-11"></a></p>
<h2 id="普京命令研发国产人工智能基础模型以保障国家安全-️-7010"><a href="https://www.news.cn/20260411/9dfc4f3241154502b4a1be41510f92fc/c.html">普京命令研发国产人工智能基础模型以保障国家安全</a> ⭐️ 7.0/10</h2>

<p>4 月 10 日，俄罗斯总统普京宣布俄罗斯必须自主研发具有全球竞争力的人工智能基础模型，并确保整个研发和训练周期由本国企业完成。他强调，掌握大语言模型是实现各领域自主发展的基础，对于保障国防、经济及医疗等关键领域的安全至关重要。为推进这一战略，俄专项委员会今年将重点执行五项任务，包括加快关键领域的人工智能应用及重构人力资源培育体系。 这一指令标志着俄罗斯向技术主权迈出了重大一步，旨在在地缘政治紧张局势下减少对外国人工智能技术的依赖。通过坚持对整个开发生命周期的国内控制，俄罗斯试图避免使用如 Meta 或 Google 等外国拥有的基础模型所带来的潜在安全漏洞。此举可能会加速独特俄罗斯人工智能生态系统的建立，从而导致全球技术格局的进一步碎片化。此外，这也突显了国家安全战略与人工智能能力提升之间日益紧密的联系趋势。 该战略明确要求完整的开发和训练周期必须由俄罗斯公司进行，排除了外国在这些核心过程中的参与。专项委员会的五点计划包括专门为国防开发自主解决方案，并评估与人工智能应用相关的风险。虽然该公告确立了明确的政治方向，但目前缺乏具体的技术指标、模型发布时间表，或关于支持如此雄心勃勃目标所需计算基础设施的详细信息。</p>

<p>telegram · zaihuapd · Apr 11, 06:31</p>

<p><strong>背景</strong>: 人工智能基础模型是在海量数据上训练的大规模机器学习模型，可作为构建聊天机器人和图像生成器等各种下游应用的基础。大语言模型（LLM）作为一种主要的基础模型类型，利用 Transformer 架构来理解和生成类人文本，为 ChatGPT 和 Llama 等工具提供动力。目前，最先进的基础模型主要由美国公司主导，这引发了其他国家对于数据隐私、审查制度以及依赖外国基础设施的担忧。因此，许多国家现在将训练自己主权模型的能力视为国家安全的关键组成部分。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://research.ibm.com/blog/what-are-foundation-models">What are foundation models ? - IBM Research</a></li>
<li><a href="https://en.wikipedia.org/wiki/Large_language_model">Large language model - Wikipedia</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-policy</code>, <code class="language-plaintext highlighter-rouge">#geopolitics</code>, <code class="language-plaintext highlighter-rouge">#national-security</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#tech-sovereignty</code></p>

<hr />

<h2 id="关注动态-1">关注动态</h2>

<p><a id="item-12"></a></p>
<h2 id="openaicodex-5-releases--rust-v01210-alpha2-rust-v01210-alpha1-rust-v01200-️-10"><a href="https://github.com/openai/codex/releases/tag/rust-v0.121.0-alpha.2">openai/codex: 5 releases — rust-v0.121.0-alpha.2, rust-v0.121.0-alpha.1, rust-v0.120.0</a> ⭐️ ?/10</h2>

<p>该仓库发布了一系列快速版本更新，将 Rust 实现从 v0.119.0 推进至稳定版 v0.120.0，目前已更新至 v0.121.0-alpha.2。这些更新可能包含了快速迭代周期中典型的改进和错误修复，但发布标题未提供具体的功能细节。使用 Rust 绑定的开发者应升级至 v0.120.0 以获得稳定性，或测试 v0.121.0-alpha.2 以体验新功能，同时需留意 alpha 版本中可能引入的破坏性变更。</p>

<p>github · github-actions[bot] · Apr 11, 21:35</p>

<hr />

<h2 id="github-热榜-1">GitHub 热榜</h2>

<p><a id="item-13"></a></p>
<h2 id="karpathy-发布纯-c-和-cuda-编写的极简-llm-训练项目-️-10010"><a href="https://github.com/karpathy/llm.c">Karpathy 发布纯 C 和 CUDA 编写的极简 LLM 训练项目</a> ⭐️ 10.0/10</h2>

<p>Andrej Karpathy 发布了 llm.c，这是一个完全用原始 C 语言和 CUDA 编写的无依赖大型语言模型训练实现。该项目去除了 PyTorch 等高层框架，直接在 GPU 上暴露 Transformer 模型的基本操作。它作为一个简洁的教育参考，帮助开发者理解 AI 基础设施的底层机制。 该项目的重要性在于它揭示了深度学习中常见的复杂抽象层，提供了对模型训练前所未有的透明度。通过将代码库精简至核心要素，使工程师能够在没有框架开销的情况下研究性能优化技术和内存管理。它填补了神经网络理论知识与实际高性能 GPU 编程技能之间的空白。 该仓库仅使用标准 C 语言和 NVIDIA 的 CUDA API 实现了完整的训练循环，包括前向传播和反向传播。它专注于教育清晰度和性能，避免外部依赖以确保代码的可读性和可修改性。该项目专为希望在硬件层面理解 Transformer 工作原理的开发者设计。</p>

<p>rss · GitHub Trending - CUDA · Apr 11, 01:33</p>

<p><strong>背景</strong>: 在此发布之前，理解 LLM 训练内部机制通常需要浏览如 PyTorch 或 TensorFlow 这样庞大且复杂的代码库。现有的教育资源经常依赖高层抽象，隐藏了负责速度的具体 GPU 内核实现。llm.c 通过提供一个从零开始的极简实现填补了这一空白，成为性能工程和系统设计的关键参考。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/coderonion/awesome-cuda-and-hpc">GitHub - coderonion/awesome- cuda -and-hpc: This...</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: AI 社区对此反应热烈，视该项目为掌握底层深度学习优化的必备资源。许多开发者已经利用它来基准测试自定义 CUDA 内核，并在不依赖框架黑箱的情况下教授 Transformer 架构的基础知识。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#c</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#education</code></p>

<hr />

<p><a id="item-14"></a></p>
<h2 id="instant-ngp闪电般的神经图形训练框架-️-10010"><a href="https://github.com/NVlabs/instant-ngp">Instant-NGP：闪电般的神经图形训练框架</a> ⭐️ 10.0/10</h2>

<p>NVIDIA 的 instant-ngp 引入了一种多分辨率哈希编码技术，将 NeRF 的训练时间从数小时大幅缩短至数秒。该框架通过优化带有可训练特征向量的小型网络，实现了在单张 GPU 上对神经图形原语的近乎即时收敛。 该项目解决了阻碍神经辐射场（NeRF）实际应用的临界瓶颈——训练速度过慢的问题。通过利用 CUDA 和高效的哈希网格，它将 NeRF 从一个研究概念转变为适用于 VR 和机器人等实时应用的可行工具。它为 3D 深度学习确立了新的性能标准，使得无需大规模计算集群即可进行高保真场景重建。 其核心创新是一个稀疏的多分辨率哈希表，用于存储可学习的特征向量，使网络能够仅专注于相关空间区域的计算。该框架完全使用 CUDA 实现，其训练速度比之前基于 PyTorch 的实现快了两个数量级。除了静态 NeRF 外，它还支持动态场景和语义分割等多种任务。</p>

<p>rss · GitHub Trending - CUDA · Apr 11, 01:33</p>

<p><strong>背景</strong>: 在 instant-ngp 出现之前，NeRF 模型需要数小时甚至数天的漫长训练时间，限制了其在迭代开发工作流中的应用。传统方法依赖于大型多层感知机（MLP）中的密集位置编码，这不仅计算成本高且收敛缓慢。该项目填补了新兴神经渲染领域对高速、生产就绪型基础设施的需求空白。</p>
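
<p>下面是多分辨率哈希编码查表思路的教学性 PyTorch 草图：每个分辨率层把格点坐标哈希到一张可训练特征表中，再将各层特征拼接后送入小型 MLP。草图省略了三线性插值等细节，哈希素数取论文中的常用值，与官方 CUDA 实现无关。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 教学示意：多分辨率哈希编码的查表（省略三线性插值，仅取最近格点）
import torch

PRIMES = (1, 2654435761, 805459861)            # 空间哈希常用素数

def hash_grid_features(xyz, table, resolution):
    """xyz: [N,3]，取值 0~1；table: [T,F] 可训练特征表；resolution: 当前层网格分辨率"""
    grid = (xyz * resolution).long()
    h = (grid[:, 0] * PRIMES[0]) ^ (grid[:, 1] * PRIMES[1]) ^ (grid[:, 2] * PRIMES[2])
    idx = h % table.shape[0]                    # 哈希到表大小以内
    return table[idx]                           # 取出该格点对应的可学习特征向量

# 每个分辨率层一张表，查到的特征拼接后交给一个小 MLP（此处省略 MLP）
tables = [torch.randn(2 ** 14, 2, requires_grad=True) for _ in range(4)]
xyz = torch.rand(1024, 3)
feats = torch.cat([hash_grid_features(xyz, t, 16 * 2 ** i) for i, t in enumerate(tables)], dim=-1)
print(feats.shape)                              # torch.Size([1024, 8])
</code></pre></div></div>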

<details><summary>参考链接</summary>
<ul>
<li><a href="https://nvlabs.github.io/instant-ngp/">Instant Neural Graphics Primitives with a Multiresolution Hash Encoding</a></li>
<li><a href="https://arxiv.org/abs/2201.05989">Instant Neural Graphics Primitives with a Multiresolution Hash Encoding</a></li>
<li><a href="https://www.zhihu.com/question/526879513">NeRF（神经辐射场）有相关的物理（光学）原理支撑吗？</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: AI 和图形学界广泛将该仓库视为现代 NeRF 研究和实现的权威基准。开发人员经常引用其哈希编码策略，将其作为 3D 高斯泼溅和实时渲染等后续进展的基础构建模块。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#nerf</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#3d-vision</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#computer-graphics</code></p>

<hr />

<p><a id="item-15"></a></p>
<h2 id="nous-research-推出自我进化的-hermes-智能体框架-️-9010"><a href="https://github.com/NousResearch/hermes-agent">Nous Research 推出自我进化的 Hermes 智能体框架</a> ⭐️ 9.0/10</h2>

<p>Nous Research 发布了 Hermes Agent，这是一个开源框架，内置学习循环，使 AI 智能体能够从经验中创造技能并在会话间持久化知识。与静态聊天机器人不同，该系统可在服务器上自主运行，支持 Telegram 和 Slack 等多种通信平台，并利用闭环反馈机制随时间推移优化自身性能。 该项目解决了当前 AI 智能体缺乏长期记忆且无法在不重新训练的情况下进化的关键局限。通过实施自主技能创建和自我改进循环，Hermes Agent 降低了维护高效自主系统所需的工程开销。其架构支持在最小化基础设施上进行低成本部署，同时提供并行子智能体和计划自动化等企业级功能。这标志着从短暂的基于提示的交互向持久化、不断进化的数字工人的重大转变。 该框架通过 OpenRouter 和本地端点支持超过 200 种模型，具备包含多行编辑和流式工具输出的真实终端界面。它包含六种终端后端，可实现从本地 Docker 容器到 Modal 和 Daytona 等无服务器环境的灵活部署。该系统集成了 FTS5 会话搜索和辩证用户建模，以在分布式工作流中保持上下文并提高交互质量。</p>

<p>rss · GitHub Trending - Daily · Apr 11, 01:32</p>

<p><strong>背景</strong>: 大多数现有的智能体框架仅作为 LLM API 的无状态包装器，需要开发人员手动构建记忆结构和改进逻辑。Hermes Agent 填补了生产就绪型自我进化架构的空白，该架构可持续运行而无需持续的人工干预。以前的解决方案通常在会话间面临上下文丢失的问题，或者需要复杂的自定义代码来实现基本的学习循环，而 Hermes 则开箱即用地提供了这些功能。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://hermes-agent.nousresearch.com/">Hermes Agent — An Agent That Grows With You | Nous Research</a></li>
<li><a href="https://github.com/NousResearch/hermes-agent?ref=aitoolnet.com">GitHub - NousResearch / hermes - agent at aitoolnet.com · GitHub</a></li>
<li><a href="https://dev.to/crabtalk/hermes-agent-what-nous-research-built-m5b">Hermes Agent : what Nous Research built - DEV Community</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者强调了该框架独特的能力，即运行为 Cursor 等其他工具编写的技能，这在智能体生态系统中是罕见的跨框架兼容性。用户对无服务器持久性功能特别感兴趣，该功能允许智能体在空闲时休眠，从而显著降低常开系统的运营成本。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#self-improving-ai</code>, <code class="language-plaintext highlighter-rouge">#nous-research</code>, <code class="language-plaintext highlighter-rouge">#autonomous-systems</code></p>

<hr />

<p><a id="item-16"></a></p>
<h2 id="voxcpm2无分词器的多语言语音合成与克隆模型-️-9010"><a href="https://github.com/OpenBMB/VoxCPM">VoxCPM2：无分词器的多语言语音合成与克隆模型</a> ⭐️ 9.0/10</h2>

<p>OpenBMB 发布了 VoxCPM2，这是一个拥有 20 亿参数的语音合成模型，它摒弃了传统的离散分词器，转而采用扩散自回归架构。该模型在超过 200 万小时的数据上训练，支持 30 种语言，并能直接从连续表示中生成录音室级别的 48kHz 音频。此次更新引入了通过自然语言描述进行声音设计以及带有风格引导的可控语音克隆等高级功能。 通过消除分词器瓶颈，VoxCPM2 相比传统级联语音合成系统实现了更高的保真度和更自然的韵律，后者常在离散化过程中丢失信息。这种架构无需显式的语言标签即可实现无缝的多语言合成，极大地简化了全球应用的部署。此外，仅使用文本提示即可设计声音的能力，为缺乏参考音频样本的内容创作者开辟了新的创作工作流。 该模型基于 MiniCPM-4 骨干网络构建，提供三种不同的克隆模式：带有风格引导的可控克隆、用于精确细节还原的终极克隆以及零样本声音设计。它提供了生产就绪的资源，包括实时的 Hugging Face 演示、全面的 ReadTheDocs 文档以及在 Hugging Face 和 ModelScope 上可用的预训练权重。系统可自动处理 30 种支持语言中的任意输入文本，无需用户干预即可检测语言。</p>

<p>rss · GitHub Trending - Python · Apr 11, 01:37</p>

<p><strong>背景</strong>: 传统的语音合成管道通常依赖前端文本分析器和离散分词器将文本转换为音素或标记，然后再进行声学建模，这可能会引入伪影并限制表现力。生成式 AI 的最新进展试图弥合这一差距，但许多解决方案仍依赖于复杂的多阶段过程或特定的语言配置。VoxCPM2 通过采用端到端的方法解决了这些局限性，该方法直接将文本映射到连续语音表示，完全绕过了对中间离散单元的需求。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://huggingface.co/openbmb/VoxCPM2">openbmb/ VoxCPM2 · Hugging Face</a></li>
<li><a href="https://www.modelscope.cn/models/OpenBMB/VoxCPM2">VoxCPM2 · Models</a></li>
<li><a href="https://ai-bio.cn/voxcpm2/">VoxCPM2 – OpenBMB推出的多语言语音生成与高保真克隆模型 | AI工具箱</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 该项目在开源社区中迅速获得关注，其高趋势评分以及在 Discord 和飞书上的活跃互动渠道证明了这一点。开发人员特别感兴趣的是将其推理速度与其他大规模语音合成模型进行基准测试，并探索其在低资源语言支持方面的潜力。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#text-to-speech</code>, <code class="language-plaintext highlighter-rouge">#voice-cloning</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#multilingual</code>, <code class="language-plaintext highlighter-rouge">#generative-ai</code></p>

<hr />

<p><a id="item-17"></a></p>
<h2 id="unsloth-studio统一的本地大模型训练与推理界面-️-9010"><a href="https://github.com/unslothai/unsloth">Unsloth Studio：统一的本地大模型训练与推理界面</a> ⭐️ 9.0/10</h2>

<p>Unsloth 推出了 Unsloth Studio 测试版，这是一个允许用户在 Windows、macOS 和 Linux 上本地训练和运行 Qwen3.5 及 Gemma 等开源模型的 Web 界面。该新界面将从 PDF 或 CSV 创建数据集的无代码功能与包含工具调用和代码执行的优化推理能力集成在一起。它将此前分离的模型微调和本地部署工作流统一到了一个可离线运行的单一应用中。 此次发布通过提供一个生产级框架显著降低了 AI 工程师的入门门槛，该框架可将微调速度提高高达 2 倍，同时将显存使用量减少 70%。通过为训练和推理提供统一界面，它消除了在用于训练的 Jupyter notebook 和用于部署的独立加载器等不同工具之间切换的摩擦。完全离线运行的能力确保了数据隐私，并使高级大模型定制能够在无需云依赖的消费级硬件上实现。 该平台支持超过 500 种涵盖文本、视觉、音频和嵌入任务的模型，并采用自定义 Triton 内核以实现最高效率。关键推理功能包括自愈式工具调用、沙盒代码执行以及用于最佳性能的自动参数调整。在训练方面，它提供基于视觉节点的数据配方工作流，并以极低的资源开销支持 GRPO 等强化学习技术。</p>

<p>rss · GitHub Trending - Python · Apr 11, 01:37</p>

<p><strong>背景</strong>: 在此次发布之前，高效的大模型微调通常需要复杂的命令行配置和对 PyTorch 内部机制的深入了解以管理内存限制。虽然存在像 Hugging Face PEFT 这样的库，但它们缺乏一个集成用户界面来管理从数据准备到模型导出的整个生命周期。Unsloth 通过将其高性能后端优化与用户友好的前端相结合填补了这一空白，从而使最先进模型定制的普及成为可能。</p>
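
<p>作为参考，下面是 Unsloth 底层 Python 库（而非 Studio 图形界面）文档中典型的微调入口草图；模型名与超参数均为示意性假设，实际使用时按需求替换，后续可直接交给 trl 的 SFTTrainer 做监督微调。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 示意性微调入口（模型名与超参为假设，按需替换）
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen2.5-7B-Instruct",   # 假设的模型标识
    max_seq_length=4096,
    load_in_4bit=True,                          # 4bit 量化以降低显存占用
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                                       # LoRA 秩
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha=16,
)
# 之后将 model 与 tokenizer 交给 SFTTrainer 等训练器即可
</code></pre></div></div>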

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/unslothai/unsloth">GitHub - unslothai/unsloth: Unsloth Studio is a web UI for...</a></li>
<li><a href="https://unsloth.ai/docs/new/studio">Introducing Unsloth Studio | Unsloth Documentation</a></li>
<li><a href="https://huggingface.co/blog/unsloth-trl">Make LLM Fine - tuning 2x faster with Unsloth and TRL</a></li>
<li><a href="https://unsloth.ai/docs/get-started/fine-tuning-llms-guide">Fine - tuning LLMs Guide | Unsloth Documentation</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: AI 社区对 Unsloth 与 Mistral 和 Qwen 等模型创作者合作修复特定架构漏洞的反应积极，指出最近版本中的准确性有所提高。用户特别赞赏能够直接将模型导出为 GGUF 格式，以便与 llama.cpp 等本地运行器更广泛地兼容。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#fine-tuning</code>, <code class="language-plaintext highlighter-rouge">#pytorch</code>, <code class="language-plaintext highlighter-rouge">#inference</code>, <code class="language-plaintext highlighter-rouge">#ai-infrastructure</code></p>

<hr />

<p><a id="item-18"></a></p>
<h2 id="feast面向-mlops-的生产级开源特征存储平台-️-9010"><a href="https://github.com/feast-dev/feast">Feast：面向 MLOps 的生产级开源特征存储平台</a> ⭐️ 9.0/10</h2>

<p>Feast 持续巩固其作为领先开源特征存储平台的地位，提供强大的工具来管理、服务和监控生产环境中的机器学习特征。最近的更新强调与 Snowflake、GCP 和 AWS 等多样化数据基础设施的无缝集成，提升了企业工作流的可扩展性。 像 Feast 这样的特征存储平台通过确保训练和推理数据的一致性，解决了机器学习工作流中的关键挑战，从而防止数据泄漏。通过将 ML 逻辑与底层数据基础设施解耦，Feast 使团队能够无需重写代码即可平滑地从批量模型过渡到实时模型。这种抽象减少了工程开销，加速了可靠 AI 系统的部署。 Feast 提供用于处理历史数据的离线存储和用于实时预测的低延迟在线存储。它包含经过实战检验的特征服务器，确保时间点正确性以避免训练与服务偏差。该平台支持多种云提供商，并能轻松集成到现有的数据栈中。</p>

<p>rss · GitHub Trending - Python · Apr 11, 01:37</p>

<p><strong>背景</strong>: 在特征存储出现之前，工程团队通常构建自定义解决方案来管理特征，导致系统碎片化和频繁的数据泄漏问题。Feast 的出现填补了这一空白，标准化了整个机器学习生命周期中的特征管理。与早期的临时脚本或专有孤岛不同，Feast 为批量和流式数据提供了统一的开源接口。</p>
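
<p>下面是一个极简草图，演示如何用同一套特征定义分别获取训练数据（按事件时间做 point-in-time join）与在线特征；其中 driver_stats 等实体与特征名沿用 Feast 快速上手示例的命名，仅作示意。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 极简示意：同一特征定义的离线训练取数与在线低延迟取数（特征名为 Feast 示例命名）
import pandas as pd
from feast import FeatureStore

store = FeatureStore(repo_path=".")             # 指向包含 feature_store.yaml 的特征仓库

# 训练：point-in-time join，避免训练/服务偏差
entity_df = pd.DataFrame({
    "driver_id": [1001, 1002],
    "event_timestamp": pd.to_datetime(["2026-04-01", "2026-04-02"]),
})
training_df = store.get_historical_features(
    entity_df=entity_df,
    features=["driver_stats:avg_daily_trips"],
).to_df()

# 推理：同名特征改从低延迟在线存储读取
online = store.get_online_features(
    features=["driver_stats:avg_daily_trips"],
    entity_rows=[{"driver_id": 1001}],
).to_dict()
</code></pre></div></div>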

<details><summary>参考链接</summary>
<ul>
<li><a href="https://feast.dev/blog/what-is-a-feature-store/">What is a Feature Store ?</a></li>
<li><a href="https://oleg-dubetcky.medium.com/data-science-and-mlops-with-feast-mastering-feature-store-2b92c55ddd25">Data Science and MLOps with Feast : Mastering Feature Store | Medium</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: Feast 社区在 Slack 上非常活跃，从业者们在那里讨论架构模式、故障排除技巧以及与 Kubeflow 等工具的集成策略。用户经常强调其与重型商业替代方案相比更易于采用。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#feature-store</code>, <code class="language-plaintext highlighter-rouge">#mlops</code>, <code class="language-plaintext highlighter-rouge">#machine-learning</code>, <code class="language-plaintext highlighter-rouge">#data-engineering</code>, <code class="language-plaintext highlighter-rouge">#infrastructure</code></p>

<hr />

<p><a id="item-19"></a></p>
<h2 id="continue支持源码控制检查的开源-ai-编程助手-️-9010"><a href="https://github.com/continuedev/continue">Continue：支持源码控制检查的开源 AI 编程助手</a> ⭐️ 9.0/10</h2>

<p>Continue 推出了源码控制的 AI 检查功能，可作为 GitHub 状态检查在每次拉取请求时运行。这些检查通过仓库中的 Markdown 文件定义，允许团队在 CI 流水线中直接执行自定义编码标准和安全审查。该工具无缝集成到主流 IDE 中，并提供 CLI 以实现自动化。 该项目通过提供开源替代方案，解决了专有 AI 编程助手缺乏透明度和控制权的问题。它使工程团队能够将 AI 驱动的代码审查流程标准化，确保贡献的一致性和可追溯性。通过与 CI/CD 集成，它弥合了交互式 AI 辅助与自动化质量门禁之间的差距。对于需要严格合规或超越封闭工具定制能力的组织而言，这一点尤为重要。 Continue 使用存储在 <code class="language-plaintext highlighter-rouge">.continue/checks/</code> 目录中的基于 Markdown 的配置文件来定义用于特定任务（如安全审查）的 AI 代理。它支持通过 GitHub 状态检查进行强制执行，返回通过/失败结果及建议的差异补丁。底层的 Continue CLI（<code class="language-plaintext highlighter-rouge">cn</code>）驱动这些检查，并可扩展以支持自定义工作流。</p>

<p>rss · GitHub Trending - TypeScript · Apr 11, 01:39</p>

<p><strong>背景</strong>: 此前的 AI 编程助手（如 GitHub Copilot）作为黑盒服务运行，缺乏可版本化的逻辑或 CI 集成。Continue 通过将 AI 检查纳入源代码填补了这一空白，实现了对 AI 规则的同行评审和历史追踪。这种方法使 AI 辅助与 DevOps 最佳实践保持一致，将 AI 逻辑视为基础设施即代码。它使团队能够根据自身领域需求定制 AI 行为，而无需受限于特定供应商。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-coding-assistant</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#ide-extension</code>, <code class="language-plaintext highlighter-rouge">#ci-cd</code>, <code class="language-plaintext highlighter-rouge">#open-source-ai</code></p>

<hr />

<p><a id="item-20"></a></p>
<h2 id="chrome-devtools-mcp-连接-ai-代理与浏览器-️-9010"><a href="https://github.com/ChromeDevTools/chrome-devtools-mcp">Chrome DevTools MCP 连接 AI 代理与浏览器</a> ⭐️ 9.0/10</h2>

<p>谷歌发布了官方的模型上下文协议（MCP）服务器，使 AI 编码代理能够直接控制和检查实时的 Chrome 浏览器。该工具集成了 Puppeteer 以实现可靠的自动化，并将完整的 Chrome DevTools 功能（包括性能追踪和网络分析）暴露给基于大语言模型的助手。 该项目解决了关键的“最后一公里”问题，即 AI 代理能编写代码却难以在真实运行环境中验证。通过赋予代理直接访问浏览器内部的能力，它实现了自主调试循环，使 AI 无需人工干预即可观察控制台错误、分析网络故障并优化性能。这显著减少了 Web 开发工作流中代码生成与功能验证之间的摩擦。 该服务器利用 Puppeteer 进行动作自动化，并自动等待动作结果以确保稳定性。它支持高级功能，如源映射堆栈跟踪、屏幕截图捕获，以及可选集成 Chrome 用户体验报告（CrUX）以获取现场数据。用户需注意，使用统计数据默认会被收集，但可通过命令行标志禁用。</p>

<p>rss · GitHub Trending - TypeScript · Apr 11, 01:39</p>

<p><strong>背景</strong>: 在此发布之前，将 AI 代理连接到浏览器开发工具需要自定义且脆弱的脚本，或功能有限的 API 包装器，通常缺乏深度检查能力。现有的独立 Puppeteer 脚本解决方案需要大量样板代码才能有效地向大语言模型暴露上下文。该项目通过 MCP 标准化了接口，允许任何兼容的代理（如 Claude、Cursor）立即获得强大的浏览器交互技能。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://medium.com/@wasowski.jarek/ai-coding-agents-architecture-how-claude-code-and-cursor-actually-work-under-the-hood-32bed540285d">AI Coding Agents Architecture — How Claude Code and... | Medium</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 作为 Chrome DevTools 团队的最新官方发布，社区讨论目前主要集中在与各种 AI 编辑器的集成设置以及解决浏览器版本兼容性问题上。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#mcp</code>, <code class="language-plaintext highlighter-rouge">#chrome-devtools</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code></p>

<hr />

<p><a id="item-21"></a></p>
<h2 id="deepgemm-推出专为-cuda-优化的-fp8-矩阵乘法库-️-9010"><a href="https://github.com/deepseek-ai/DeepGEMM">DeepGEMM 推出专为 CUDA 优化的 FP8 矩阵乘法库</a> ⭐️ 9.0/10</h2>

<p>DeepGEMM 推出了一款专用库，提供针对 CUDA 架构优化的干净且高效的 FP8 通用矩阵乘法（GEMM）内核。该库具备细粒度缩放功能，旨在在最大化现代 GPU 吞吐量的同时保持数值稳定性。 随着大型语言模型规模的扩大，行业正转向 FP8 等低精度格式，以减少内存带宽瓶颈并加速训练和推理。DeepGEMM 通过其细粒度缩放方法，满足了业界对能够处理这些格式且不牺牲准确性的生产级内核的迫切需求。这使得工程师能够充分利用最新 NVIDIA 硬件的 Tensor Core 能力来执行高性能计算任务。 该库专注于 FP8 运算，支持多种 GEMM 格式，包括常规稠密矩阵运算。其实现的细粒度缩放确保了计算资源的高效利用，同时最大限度地减少了低精度算术中常见的数值误差。</p>

<p>rss · GitHub Trending - CUDA · Apr 11, 01:33</p>

<p><strong>背景</strong>: 以前的低精度矩阵乘法解决方案通常依赖于粗粒度缩放，这可能导致复杂深度学习模型的准确性显著下降。虽然 NVIDIA 提供了基本的 FP8 支持，但需要专用库才能在不同的模型架构中提取峰值性能并确保稳定性。DeepGEMM 通过提供专为现代 LLM 工作负载特定需求定制的专用开源解决方案，填补了这一空白。</p>
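
<p>为说明“细粒度缩放”的含义，下面用 PyTorch 模拟按 128 元素块计算缩放因子的 FP8 量化 / 反量化过程：块级缩放让离群值只影响所在块，从而缓解低精度下的精度损失。这只是概念示意，并非 DeepGEMM 的 CUDA 内核本身，且需要支持 float8_e4m3fn 的较新 PyTorch 版本。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 概念示意：按块（block=128）缩放的 FP8 量化，用 PyTorch 模拟（非 DeepGEMM 内核）
import torch

FP8_MAX = 448.0                                  # e4m3 格式的最大可表示值

def quantize_per_block(x, block=128):
    """x: [M, K]，每 128 个元素一组单独计算缩放因子"""
    m, k = x.shape
    xb = x.reshape(m, k // block, block)
    scale = xb.abs().amax(dim=-1, keepdim=True).clamp(min=1e-8) / FP8_MAX
    q = (xb / scale).to(torch.float8_e4m3fn)     # 需要支持 FP8 dtype 的 PyTorch 版本
    return q, scale

def dequantize(q, scale):
    return (q.to(torch.float32) * scale).reshape(q.shape[0], -1)

x = torch.randn(4, 256)
q, s = quantize_per_block(x)
print((dequantize(q, s) - x).abs().max())        # 块级缩放下的量化误差
</code></pre></div></div>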

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.toolify.ai/ai-news/deepgemm-revolutionizing-fp8-gemm-kernels-for-deep-learning-3433115">DeepGEMM: Revolutionizing FP8 GEMM Kernels for Deep Learning</a></li>
<li><a href="https://connectai.blog/deepgemm-clean-and-efficient-fp8-gemm-library">DeepGEMM: Clean and Efficient FP8 GEMM Library</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 该项目在寻求优化推理管道的 AI 工程师中迅速获得关注，早期采用者称赞其代码库简洁，并且相比通用实现能立即带来性能提升。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#fp8</code>, <code class="language-plaintext highlighter-rouge">#gemm</code>, <code class="language-plaintext highlighter-rouge">#high-performance-computing</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code></p>

<hr />

<p><a id="item-22"></a></p>
<h2 id="mirage-通过持久化-cuda-巨型内核优化大模型推理-️-9010"><a href="https://github.com/mirage-project/mirage">Mirage 通过持久化 CUDA 巨型内核优化大模型推理</a> ⭐️ 9.0/10</h2>

<p>Mirage 推出了一种编译器框架，能将大语言模型操作转换为持久化的 CUDA 巨型内核。该方法将多次 GPU 内核启动合并为单个长期运行的内核，从而大幅降低开销。它专门针对标准 Transformer 推理流程中存在的延迟瓶颈进行了优化。 标准的大模型推理在执行许多小型顺序算子时，面临着严重的 CPU-GPU 启动开销问题。通过最小化启动频率，Mirage 能够提高 GPU 利用率并降低生成任务的端到端延迟。对于对响应时间极其敏感的高吞吐量服务部署而言，这种优化至关重要。它标志着从算子级调优向系统级内核融合策略的转变。 该项目作为一个编译器，能自动为支持的模型架构生成优化的持久化内核。它在无需手动编写 CUDA 代码的情况下，实现了与手工调优库相当的性能提升。该框架旨在无缝集成到现有的基于 PyTorch 的推理工作流中。</p>

<p>rss · GitHub Trending - CUDA · Apr 11, 01:33</p>

<p><strong>背景</strong>: 大语言模型依赖复杂的神经网络，需要巨大的计算资源来进行文本生成和理解。传统的推理引擎通常将模型执行为由许多小型内核组成的图，由于频繁的主机 - 设备同步，导致 GPU 使用效率低下。虽然 TensorRT 或 vLLM 等现有解决方案通过各种缓存和批处理技术解决了部分问题，但内核启动开销仍然是一个持续存在的挑战。Mirage 通过将整个计算图编译为统一的巨型内核结构，填补了这一空白。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Large_language_model">Large language model - Wikipedia</a></li>
<li><a href="https://www.geeksforgeeks.org/artificial-intelligence/large-language-model-llm/">What is a Large Language Model ( LLM ) - GeeksforGeeks</a></li>
<li><a href="https://www.c-sharpcorner.com/article/what-is-a-large-language-model-llm-and-how-does-it-work/">What Is a Large Language Model ( LLM ) and How Does It Work?</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者强调，该框架能够在不改变模型精度的情况下，显著降低受延迟限制场景中的延迟。开发者对其与新兴 Transformer 变体的兼容性以及相较于底层自定义内核开发的易集成性表现出浓厚兴趣。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#compiler</code>, <code class="language-plaintext highlighter-rouge">#performance</code>, <code class="language-plaintext highlighter-rouge">#gpu</code></p>

<hr />

<p><a id="item-23"></a></p>
<h2 id="sageattention-通过量化加速-transformer-推理-️-9010"><a href="https://github.com/thu-ml/SageAttention">SageAttention 通过量化加速 Transformer 推理</a> ⭐️ 9.0/10</h2>

<p>SageAttention 引入了一种新型量化注意力机制，相比 FlashAttention 实现了 2 到 5 倍的推理加速。这一突破在语言、图像和视频任务中保持了端到端的模型精度，且未牺牲性能指标。 对于部署大模型的 AI 工程师而言，推理延迟和成本是关键瓶颈，而该项目直接解决了这些问题。通过将量化集成到注意力内核中，SageAttention 比标准的训练后量化更显著地降低了内存带宽需求。这使得在消费级硬件上实现实时应用成为可能，或降低了企业部署的云计算成本。其与现有 Transformer 架构的兼容性确保了无需重新训练模型即可轻松采用。 该项目在保持跨模态模型质量的同时，实现了比 FlashAttention 快 2 到 5 倍的速度提升。它针对 CUDA 环境进行了优化，旨在服务于高性能推理场景。该方法已被 ICLR、ICML 和 NeurIPS 2025 等主要会议评为焦点论文。</p>

<p>rss · GitHub Trending - CUDA · Apr 11, 01:33</p>

<p><strong>背景</strong>: Transformer 模型已成为现代 AI 的支柱，但其自注意力机制计算成本高且内存消耗大。之前的解决方案如 FlashAttention 优化了内存访问模式，但并未从根本上降低操作的数值精度要求。SageAttention 通过将算法效率与低精度算术相结合来克服这些硬件限制，填补了这一空白。这标志着从纯粹的架构优化转向核心注意力循环内的数值压缩技术。</p>
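
<p>下面是“量化注意力”一般思路的概念示意：把 Q/K 量化为 INT8 后在低精度下计算注意力分数，再乘回缩放因子近似还原。注意这里采用的是最粗糙的逐张量缩放，SageAttention 实际使用的量化粒度与平滑策略要精细得多，此草图仅用于说明原理。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 概念示意：INT8 量化注意力分数（逐张量缩放的最简写法，非 SageAttention 官方内核）
import torch

def int8_quant(x):
    scale = x.abs().amax() / 127.0
    return torch.round(x / scale).clamp(-127, 127).to(torch.int8), scale

def quantized_attention_probs(q, k):
    q8, sq = int8_quant(q)
    k8, sk = int8_quant(k)
    # 低精度矩阵乘之后再乘回两个缩放因子，近似还原浮点分数
    s = (q8.float() @ k8.float().T) * (sq * sk) * q.shape[-1] ** -0.5
    return torch.softmax(s, dim=-1)

q, k = torch.randn(128, 64), torch.randn(128, 64)
ref = torch.softmax((q @ k.T) * 64 ** -0.5, dim=-1)
print((quantized_attention_probs(q, k) - ref).abs().max())   # 量化引入的近似误差
</code></pre></div></div>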

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#quantization</code>, <code class="language-plaintext highlighter-rouge">#transformers</code>, <code class="language-plaintext highlighter-rouge">#inference</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code></p>

<hr />

<p><a id="item-24"></a></p>
<h2 id="用于因果深度卷积的高效-cuda-内核-️-9010"><a href="https://github.com/Dao-AILab/causal-conv1d">用于因果深度卷积的高效 CUDA 内核</a> ⭐️ 9.0/10</h2>

<p>Dao-AILab 发布了一种专为因果深度一维卷积高度优化的 CUDA 实现。该库提供了无缝的 PyTorch 接口，与标准实现相比显著加速了序列建模操作。 该项目是解决现代状态空间模型（如 Mamba）性能瓶颈的关键，因为这些模型严重依赖高效的卷积运算。通过将这些计算移至自定义 CUDA 内核，它实现了标准 PyTorch 层无法高效达到的长序列线性时间扩展。因此，研究人员和工程师可以在没有过高内存或时间成本的情况下，在更长的上下文上训练更大的模型。 该库包含一个专用的 CUDA 内核，专为 SSM 中发现的因果掩码和深度卷积模式而设计。它直接集成到 PyTorch 工作流中，只需极少的代码更改即可替换标准卷积层。基准测试表明，在处理长序列数据时，该库能显著提高速度并减少内存使用。</p>

<p>rss · GitHub Trending - CUDA · Apr 11, 01:33</p>

<p><strong>背景</strong>: 传统的 Transformer 架构在处理长序列时面临二次复杂度的挑战，从而催生了如 S4 和 Mamba 等状态空间模型（SSM）的发展。这些新架构通常利用因果卷积作为核心组件，以保持线性复杂度同时捕捉长程依赖关系。然而，通用的深度学习框架往往缺乏针对这些特定因果深度操作的优化内核，从而造成了性能差距。</p>
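
<p>作为参考，下面给出因果深度一维卷积在纯 PyTorch 中的等价写法（向左补零加 groups=channels 的逐通道卷积）；该库的 CUDA 内核加速的正是这一运算，此处仅为语义上的参考实现，并非其接口本身。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 参考示意：因果深度一维卷积的纯 PyTorch 等价写法
import torch
import torch.nn.functional as F

def causal_depthwise_conv1d(x, weight):
    """x: [batch, channels, seq_len]；weight: [channels, 1, kernel]，groups=channels 即逐通道卷积"""
    kernel = weight.shape[-1]
    x = F.pad(x, (kernel - 1, 0))                # 只向左补零，保证第 t 步看不到未来
    return F.conv1d(x, weight, groups=x.shape[1])

x = torch.randn(2, 64, 128)
w = torch.randn(64, 1, 4)
print(causal_depthwise_conv1d(x, w).shape)       # torch.Size([2, 64, 128])
</code></pre></div></div>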

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Mamba_(deep_learning_architecture)">Mamba (deep learning architecture)</a></li>
<li><a href="https://grokipedia.com/page/mamba_deep_learning_architecture">Mamba (deep learning architecture)</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: AI 工程社区将此发布视为任何实施 Mamba 或类似基于 SSM 架构人员的必要基础设施更新。早期采用者报告称，替换为此内核是实现 Mamba 论文理论效率承诺的必要条件。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#pytorch</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#kernels</code>, <code class="language-plaintext highlighter-rouge">#mamba</code></p>

<hr />

<p><a id="item-25"></a></p>
<h2 id="微软-markitdown优化-ai-代理的文档摄入流程-️-8010"><a href="https://github.com/microsoft/markitdown">微软 MarkItDown：优化 AI 代理的文档摄入流程</a> ⭐️ 8.0/10</h2>

<p>微软 AutoGen 团队发布了 MarkItDown，这是一款旨在将 PDF、Word 和 PowerPoint 等多种文件格式转换为适合大语言模型处理的 Markdown 的 Python 工具。该工具最近更新了架构，采用可选功能组和基于流的处理方式，不再需要创建临时文件。此外，它还推出了 MCP 服务器，以便与 Claude Desktop 等大语言模型应用无缝集成。 有效的数据摄入是 AI 代理的关键瓶颈，因为原始二进制文档往往会混淆模型或超出上下文限制。MarkItDown 通过保留标题、表格和列表等结构元素，并以最大化大语言模型令牌效率的格式呈现，从而解决了这一问题。与专注于人类可读性的通用转换器不同，该工具优先考虑机器可解释性，直接提升了检索增强生成（RAG）管道和自主代理的性能。其生产就绪状态以及 AutoGen 团队的支持，使其成为企业 AI 工作流的可靠选择。 MarkItDown 支持从 PDF、PowerPoint 和 Word 文件进行转换，同时保持文档结构以供分析管道使用。最新版本要求输入为二进制文件类对象，并将依赖项组织为可选组以减少冗余。它专为文本分析工具设计，而非用于高保真的人类面向文档渲染。</p>

<p>rss · GitHub Trending - Daily · Apr 11, 01:32</p>

<p><strong>背景</strong>: 在 MarkItDown 出现之前，开发人员通常依赖 Textract 等通用工具或自定义脚本，这些工具难以在结构保真度与大语言模型令牌限制之间取得平衡。许多现有解决方案要么生成过于冗长的输出，要么剥离了表头和列表层级等关键语义标记。该项目填补了轻量级专用转换器的空白，架起了复杂办公文档与现代语言模型纯文本需求之间的桥梁。通过专注于 AI 代理的特定需求，它简化了自动化工作流的预处理阶段。</p>
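
<p>下面是 MarkItDown 的典型调用方式草图（文件名为占位符）；较新版本的流式接口要求传入二进制文件对象，具体方法签名以官方 README 为准。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 示意性用法：把办公文档转换为适合 LLM 处理的 Markdown（文件名为占位符）
from markitdown import MarkItDown

md = MarkItDown()
result = md.convert("quarterly_report.pdf")      # 也支持 .docx / .pptx 等格式
print(result.text_content[:500])                 # 保留标题、表格等结构的 Markdown 文本
</code></pre></div></div>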

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.zhihu.com/question/952838112?write">LangGraph、Autogen和Crewai，这三个多智能体开发框架的工具区别是什...</a></li>
<li><a href="https://www.zhihu.com/question/624287948">微软推出 AutoGen 框架，有哪些你喜欢的功能？ - 知乎</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 开发者社区强调，由于其结构化输出，MarkItDown 是构建稳健 RAG 系统时优于通用抓取器的替代方案。用户赞赏其向基于流的处理方式的转变，这种方式通过避免临时磁盘写入提高了安全性和性能。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#data-preprocessing</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#document-processing</code>, <code class="language-plaintext highlighter-rouge">#python</code>, <code class="language-plaintext highlighter-rouge">#microsoft</code></p>

<hr />

<p><a id="item-26"></a></p>
<h2 id="archon面向-ai-编码的确定性构建框架-️-8010"><a href="https://github.com/coleam00/Archon">Archon：面向 AI 编码的确定性构建框架</a> ⭐️ 8.0/10</h2>

<p>Archon 作为首个开源构建框架正式发布，旨在让 AI 编码过程变得具有确定性和可重复性。它允许开发者使用 YAML 定义复杂的开发工作流，将 AI 代理与确定性脚本及人工审批环节相结合。该工具将不可预测的 AI 交互转化为结构化、可靠的软件工程流水线。 当前的 AI 编码代理往往产生不一致的结果，常因模型状态而跳过测试或规划等关键步骤。Archon 通过强制执行严格的工作流解决了这一问题，由开发者掌控结构，确保每次运行都遵循相同的规划、实施和验证序列。这种转变实现了“即发即忘”式的自动化，让 AI 在安全、受控的边界内发挥智能。最终，它弥合了实验性 AI 原型开发与生产级可靠性之间的差距。 该项目利用隔离的 git 工作树实现无冲突的并行工作流执行，并支持混合 Bash 脚本、测试和 AI 提示的可组合节点。工作流具有可移植性，可通过 CLI、Web UI、Slack 或 GitHub 触发，确保在不同环境中行为一致。示例工作流展示了循环实施直至测试通过，并在创建 PR 前强制进行人工审查的过程。</p>

<p>rss · GitHub Trending - Daily · Apr 11, 01:32</p>

<p><strong>背景</strong>: 在 Archon 出现之前，AI 编码工具主要作为无状态的聊天界面或自主代理运行，很少考虑既定的工程协议。由于输出缺乏确定性且缺少标准验证环节，开发者难以将这些工具集成到 CI/CD 流水线中。Archon 填补了这一空白，充当类似 GitHub Actions 的工作流引擎，但专为编排基于大语言模型的任务而优化。它标志着 AI 工程从随意辅助向严谨流程自动化的成熟转变。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/coleam00/Archon">GitHub - coleam00/ Archon : Beta release of Archon OS - the...</a></li>
<li><a href="https://www.linkedin.com/posts/gyaansetu-ai_???????????-??????-i-built-activity-7423709332158210048-h-hQ">Introducing Archon : Open - Source AI Manager for Claude... | LinkedIn</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者强调，Archon 能够将确定性的 Bash 脚本与灵活的 AI 节点相结合，这是其优于纯自主代理的主要优势。社区对其在 AI 驱动的开发周期中标准化代码审查和测试阶段的潜力特别感兴趣。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-engineering</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-27"></a></p>
<h2 id="multica管理-ai-编程代理的开源平台-️-8010"><a href="https://github.com/multica-ai/multica">Multica：管理 AI 编程代理的开源平台</a> ⭐️ 8.0/10</h2>

<p>Multica 推出了一款开源平台，旨在将编程代理视为自主队友而非简单的提示执行者。它允许用户在统一仪表板上分配任务、跟踪实时进度并积累可复用的技能。该系统支持通过 Docker 进行自托管，并集成了 Claude Code 和 Codex 等主要模型。 该项目解决了 AI 工程中的关键编排缺口，即独立代理常因错误累积和缺乏长期上下文而失败的问题。通过提供任务生命周期管理和技能保留的基础设施，Multica 减轻了代理漂移现象，并减少了对持续人工监督的需求。它将范式从照看单个运行转变为管理可扩展的人机混合劳动力。对于希望将代理工作流从实验原型推向生产环境的团队而言，这至关重要。 主要功能包括带有 WebSocket 流式传输的自主执行、基于档案的代理分配，以及将过往解决方案转化为团队资产的技能积累机制。该平台提供多工作空间隔离，并支持本地守护进程和云运行时以实现灵活部署。它采用 Apache 2.0 许可证，确保了企业采用的供应商中立性。</p>

<p>rss · GitHub Trending - Daily · Apr 11, 01:32</p>

<p><strong>背景</strong>: 此前的 AI 编程解决方案通常依赖临时脚本或将用户锁定在特定供应商生态系统中的封闭专有云。现有的编排工具往往缺乏持久化代理学习或自主管理复杂任务依赖的能力。Multica 通过提供专为长期代理团队管理设计的供应商中立、自托管基础设施，填补了这一空白。它建立在通过结构化监督来稳定代理长期性能的新兴需求之上。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://grokipedia.com/page/AI_Agent_Orchestration">AI Agent Orchestration</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 虽然该项目在编排编程代理方面显示出巨大潜力，但早期采用者指出，其生产成熟度需要超出当前 README 文档的进一步验证。社区正在积极评估其在复杂的长周期开发流程中与既定 CI/CD 管道相比的稳定性。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#orchestration</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-28"></a></p>
<h2 id="kronos首个面向金融-k-线图的开源基础模型-️-8010"><a href="https://github.com/shiyu-coder/Kronos">Kronos：首个面向金融 K 线图的开源基础模型</a> ⭐️ 8.0/10</h2>

<p>Kronos 已被 AAAI 2026 录用，并发布了用于自定义量化任务的微调脚本。该项目现在提供了一系列通过 Hugging Face 可获取的预训练解码器模型，这些模型基于全球 45 多个交易所的数据训练而成。 与通用时间序列模型不同，Kronos 通过新颖的两阶段框架专门解决了金融数据的高噪声和非平稳特性。通过将连续的 OHLCV 数据量化为分层离散令牌，它使得自回归 Transformer 能够有效学习 K 线图的“语言”。这种专业化使其在波动市场中的预测和模式识别能力优于通用方法。 该模型利用专用令牌器将多维 K 线序列转换为离散令牌，然后通过大型 Transformer 进行处理。它支持多种量化金融任务，并提供了一个用于 BTC/USDT 预测的在线演示。模型权重公开可用，便于立即进行实验和针对特定交易策略进行调整。</p>

<p>rss · GitHub Trending - Daily · Apr 11, 01:32</p>

<p><strong>背景</strong>: 金融时间序列预测传统上依赖于统计方法（如 ARIMA）或专门的深度学习架构，但这些方法往往难以应对全球市场的混沌动态。通用基础模型缺乏有效解读金融 K 线模式所需的特定归纳偏置。Kronos 通过将 K 线图视为一种独特的语言来填补这一空白，利用大规模预训练捕捉先前解决方案所忽略的复杂市场微观结构。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Foundation_model">Foundation model</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 社区正在积极探索 2025 年 8 月发布的微调脚本，以使 Kronos 适应专有交易数据集。早期反馈强调了该模型在加密资产上的良好表现，但用户仍在验证其在传统股票市场的鲁棒性。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#finance</code>, <code class="language-plaintext highlighter-rouge">#foundation-model</code>, <code class="language-plaintext highlighter-rouge">#nlp</code>, <code class="language-plaintext highlighter-rouge">#quantitative-finance</code>, <code class="language-plaintext highlighter-rouge">#llm</code></p>

<hr />

<p><a id="item-29"></a></p>
<h2 id="jq不可或缺的-json-数据处理命令行工具-️-8010"><a href="https://github.com/jqlang/jq">jq：不可或缺的 JSON 数据处理命令行工具</a> ⭐️ 8.0/10</h2>

<p>本次分析强调 jq 是关键的基础设施工具，而非新发布的 AI 框架。文章突出了其零依赖的架构特性，以及通过预编译二进制文件和 Docker 镜像实现的即时部署能力。 对于 AI 工程师而言，jq 相当于 JSON 领域的 ‘sed’ 或 ‘awk’，能够在生产流水线中高效地切片和过滤模型输出及 API 响应。其轻量级特性使其能在无服务器函数或边车容器等资源受限的环境中无缝运行。掌握 jq 可显著减少在调试或日志分析进行简单数据转换时对重型 Python 脚本的依赖。 jq 采用可移植的 C 语言编写，零运行时依赖，支持通过简洁的语法执行复杂的过滤、映射和转换操作。它提供灵活的安装选项，包括静态二进制文件、Docker 容器以及用于跨平台兼容的源码编译。该工具文档详尽，并提供交互式在线沙箱供用户在集成前测试查询语句。</p>

<p>rss · GitHub Trending - Daily · Apr 11, 01:32</p>

<p><strong>背景</strong>: 随着 JSON 作为结构化数据交换格式在 AI 服务中变得无处不在，对快速可靠的命令行处理器的需求日益迫切。以往的解决方案往往需要调用 Python 或 Node.js 等重型解释器，仅为了从日志文件中提取单个字段。jq 填补了这一空白，提供了一种专为 JSON 流处理设计的高性能专用工具，无需完整运行时环境的开销。</p>
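
<p>下面的小例子演示如何在 Python 流水线中通过子进程调用 jq，从模型 API 返回的 JSON 中提取单个字段；过滤表达式为常见写法，字段路径按实际响应结构替换。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 示意：在 Python 流水线中用 jq 过滤 JSON（需本机已安装 jq）
import json
import subprocess

response = json.dumps({"choices": [{"message": {"content": "hello"}}]})
out = subprocess.run(
    ["jq", "-r", ".choices[0].message.content"],  # -r 输出原始字符串而非带引号的 JSON
    input=response, capture_output=True, text=True, check=True,
)
print(out.stdout.strip())                          # hello
</code></pre></div></div>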

<p><strong>社区讨论</strong>: 该项目拥有活跃的社区，提供 Stack Overflow 和 Discord 支持渠道，以及包含高级用法的综合 Wiki。用户经常分享复杂的单行命令以及将 jq 集成到 CI/CD 流水线和数据工程工作流中的最佳实践。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#cli</code>, <code class="language-plaintext highlighter-rouge">#json</code>, <code class="language-plaintext highlighter-rouge">#data-processing</code>, <code class="language-plaintext highlighter-rouge">#devops</code>, <code class="language-plaintext highlighter-rouge">#utility</code></p>

<hr />

<p><a id="item-30"></a></p>
<h2 id="prefect构建弹性数据管道的现代-python-工作流编排框架-️-8010"><a href="https://github.com/PrefectHQ/prefect">Prefect：构建弹性数据管道的现代 Python 工作流编排框架</a> ⭐️ 8.0/10</h2>

<p>Prefect 已发展成为一个成熟的生产级框架，仅需极少代码修改即可将标准 Python 脚本提升为健壮且可监控的工作流。它提供自托管服务器和托管云仪表板的无缝集成，以实现实时的管道可见性。最近的更新强调了动态流执行和事件驱动自动化，以处理复杂的数据依赖关系。 对于 AI 工程师而言，Prefect 通过提供内置的重试逻辑、缓存和状态管理，解决了实验性 Notebook 与可靠生产系统之间的关键差距。与僵化的调度器不同，它允许工作流对外部事件和数据变化做出动态反应，从而确保在不稳定环境中的弹性。这减少了维护自定义编排脚本的运营开销，同时提高了故障恢复率。最终，它使团队能够在不重写核心业务逻辑的情况下扩展数据和机器学习管道。 该框架拥有基于装饰器的低开销 API，无需设置基础设施即可开始构建流。它支持混合执行模型，代理可以在本地或 Kubernetes 等分布式环境中运行。监控通过统一的 UI 处理，无论部署目标如何，都能跟踪运行、日志和工件。</p>

<p>rss · GitHub Trending - Python · Apr 11, 01:37</p>

<p><strong>背景</strong>: 传统的工作流工具（如 Apache Airflow）通常需要繁重的基础设施设置，并且在动态参数化方面表现不佳，这使得它们对于快速的 AI 迭代显得笨重。Prefect 的出现填补了这一空白，它将工作流视为原生 Python 代码，而不是通过 YAML 配置的抽象 DAG 定义。这种方法显著降低了数据科学家的入门门槛，使他们无需复杂的 DevOps 知识即可获得生产级的可靠性。它架起了简单定时任务与企业级编排平台之间的桥梁。</p>
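
<p>下面是一个极简草图，展示如何用装饰器把普通 Python 函数提升为带自动重试的 Prefect 流；任务内部为占位逻辑，重试次数等参数按需调整。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 极简示意：装饰器即工作流（任务体为占位逻辑）
from prefect import flow, task

@task(retries=3, retry_delay_seconds=10)
def fetch_metrics(url: str) -> dict:
    # 实际项目中这里通常是可能失败的网络/IO 调用，失败后自动重试
    return {"url": url, "ok": True}

@flow(log_prints=True)
def daily_pipeline(url: str = "https://example.com/metrics"):
    data = fetch_metrics(url)
    print(f"fetched: {data}")

if __name__ == "__main__":
    daily_pipeline()
</code></pre></div></div>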

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Workflow">Workflow - Wikipedia</a></li>
<li><a href="https://zhuanlan.zhihu.com/p/1921720267165639679">一文看明白： Workflow （工作流）和Agent（智能体）有什么区别？</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 社区积极讨论从 Airflow 迁移到 Prefect 的最佳实践，特别是关于状态后端配置和混合代理部署的问题。用户经常强调，与其他编排工具相比，调试本地流的简便性是一个主要优势。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#orchestration</code>, <code class="language-plaintext highlighter-rouge">#data-engineering</code>, <code class="language-plaintext highlighter-rouge">#python</code>, <code class="language-plaintext highlighter-rouge">#mlops</code>, <code class="language-plaintext highlighter-rouge">#workflow</code></p>

<hr />

<p><a id="item-31"></a></p>
<h2 id="两小时从零训练-64m-参数的-gpt-模型-️-8010"><a href="https://github.com/jingyaogong/minimind">两小时从零训练 64M 参数的 GPT 模型</a> ⭐️ 8.0/10</h2>

<p>MiniMind 项目实现了仅用单张消费级显卡在两小时内从零训练一个 64M 参数的大语言模型。该项目提供了包含预训练、监督微调和强化学习在内的完整 LLM 生命周期代码，且完全基于 PyTorch 原生实现，不依赖高层框架抽象。 该项目将训练成本降低至约 3 元人民币，时间缩短至两小时，极大地降低了个人开发者和研究者进入 LLM 领域的门槛。与调用黑盒 API 或微调巨型模型不同，MiniMind 让用户能够从底层深入理解 Transformer 的架构原理和训练动态。对于希望亲手构建而非仅仅使用大模型的学习者来说，这是一个极佳的教育资源。 该模型架构极其轻量，体积仅为 GPT-3 的约 1/2700，但涵盖了 MoE、LoRA 和工具使用等先进技术。所有核心算法均使用 PyTorch 原生代码从零编写，以确保透明度和教育价值。项目还扩展了多模态视觉任务和扩散语言模型的相关实现。</p>

<p>rss · GitHub Trending - Python · Apr 11, 01:37</p>

<p><strong>背景</strong>: 大语言模型虽然功能强大，但由于参数量巨大和计算需求高，个人难以进行实验。现有的大多数工具依赖高度抽象的库，隐藏了底层机制，阻碍了深入理解。MiniMind 填补了这一空白，提供了一个专为教育和在消费级硬件上快速原型设计而构建的最小化、透明化实现。</p>
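
<p>作为补充，下面给出“下一个 token 预测”这一预训练核心目标的最小 PyTorch 写法，用一个极简网络代替真实的 Transformer（与 MiniMind 仓库的实际实现无关），以说明预训练单步更新在代码层面的样子。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 教学示意：预训练的单步“下一个 token 预测”更新（极简网络，仅为说明目标函数）
import torch
import torch.nn.functional as F

vocab, dim, seq = 6400, 512, 128
model = torch.nn.Sequential(                     # 用极简网络代替真实的 Transformer
    torch.nn.Embedding(vocab, dim),
    torch.nn.Linear(dim, vocab),
)
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)

tokens = torch.randint(0, vocab, (8, seq))       # 一个 batch 的 token 序列
logits = model(tokens)                           # [batch, seq, vocab]
loss = F.cross_entropy(                          # 用位置 t 的输出预测位置 t+1 的 token
    logits[:, :-1].reshape(-1, vocab),
    tokens[:, 1:].reshape(-1),
)
loss.backward(); opt.step(); opt.zero_grad()
print(loss.item())
</code></pre></div></div>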

<p><strong>社区讨论</strong>: 该项目在 GitHub 趋势榜上获得了广泛关注，用户称赞其清晰性和在学习 LLM 基础知识方面的实用性。社区讨论强调了它作为定制小型模型起点的价值，特别适用于那些部署大模型成本过高的特定边缘场景。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#gpt</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#education</code>, <code class="language-plaintext highlighter-rouge">#pytorch</code></p>

<hr />

<p><a id="item-32"></a></p>
<h2 id="claudian-将-ai-编程助手直接嵌入-obsidian-笔记库-️-8010"><a href="https://github.com/YishenTu/claudian">Claudian 将 AI 编程助手直接嵌入 Obsidian 笔记库</a> ⭐️ 8.0/10</h2>

<p>Claudian 是一款全新的 Obsidian 插件，它将 Claude Code 和 Codex 等强大的 AI 编程助手直接集成到用户的笔记库中。该工具将知识库转变为活跃的工作目录，允许代理读取、写入、搜索文件并执行 Bash 命令。它支持多步工作流、带有差异预览的行内编辑，以及通过 MCP 服务器连接外部工具。 这一集成解决了技术作家和开发者面临的关键碎片化问题，此前他们不得不在笔记环境和独立的终端 AI 工具之间频繁切换。通过将代理直接嵌入 Obsidian，它实现了无缝的上下文感知辅助，使 AI 无需手动加载文件即可立即访问整个项目结构。这在统一的界面中显著加速了文档更新、代码重构和复杂推理任务。它标志着从被动笔记存储向主动的、代理驱动的开发工作空间的转变。 主要功能包括在执行前批准代理策略的“计划模式”、用于可重用提示模板的斜杠命令，以及用于引用特定笔记库文件或子代理的 @提及语法。该插件需要本地安装 Claude Code CLI 或 Codex CLI，目前仅支持桌面操作系统。用户可以管理多个对话标签页，并利用模型上下文协议（MCP）通过外部数据源扩展代理能力。</p>

<p>rss · GitHub Trending - TypeScript · Apr 11, 01:39</p>

<p><strong>背景</strong>: 在 Claudian 出现之前，要在 Obsidian 中利用先进的 AI 编程助手，用户需要通过繁琐的变通方法，如将文本复制到外部终端，或使用缺乏文件系统访问权限的功能有限的纯聊天插件。现有的解决方案往往无法支持复杂的多文件操作或自主 Bash 执行，限制了 AI 的用途仅限于简单的问答。Claudian 填补了这一空白，它将 Claude Code 等基于终端的代理的全部功能带入了图形化的 Obsidian 环境。这弥合了静态知识管理与动态软件工程工作流之间的差距。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://grokipedia.com/page/Claude_Code">Claude Code</a></li>
<li><a href="https://www.msn.com/en-us/news/other/ai-agents-overtake-coding-desks/gm-GM72B3257E">AI agents overtake coding desks - MSN</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 作为一款新发布的工具，论坛上的正式社区讨论正在兴起，早期采用者称赞其能够直接在笔记中处理复杂的重构任务。用户正在积极探索将 Obsidian 的链接功能与自主代理工作流相结合，以应用于大规模文档项目的潜力。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#obsidian</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#claude-code</code>, <code class="language-plaintext highlighter-rouge">#productivity</code></p>

<hr />

<p><a id="item-33"></a></p>
<h2 id="n8n具备原生-ai-代理功能的公平代码自动化平台-️-8010"><a href="https://github.com/n8n-io/n8n">n8n：具备原生 AI 代理功能的公平代码自动化平台</a> ⭐️ 8.0/10</h2>

<p>n8n 已发展成为一个成熟的工作流自动化平台，无缝结合了可视化构建与自定义代码执行能力。它现在集成了基于 LangChain 的原生 AI 功能，允许用户在传统数据集成之外构建复杂的 AI 代理管道。该平台支持超过 400 种集成，并提供自托管或云服务等多种灵活的部署方式。 该工具填补了低代码速度与技术人员处理复杂逻辑所需灵活性之间的空白。通过允许开发者在工作流中直接插入 JavaScript 或 Python 代码，它在保持快速开发周期的同时避免了纯无代码方案的局限性。其公平代码许可证确保了数据主权，使其成为需要严格控制自动化基础设施和 AI 模型的企业的首选。 核心功能包括编写自定义代码节点、利用原生 LangChain 集成构建 AI 代理，以及通过 Docker 或 npm 即时部署。该平台在提供单点登录（SSO）和高级权限等企业级功能的同时，还拥有活跃的社区和数百个即用型模板。</p>

<p>rss · GitHub Trending - TypeScript · Apr 11, 01:39</p>

<p><strong>背景</strong>: n8n 旨在解决工作流自动化工具必须在易用性和技术深度之间做出取舍的问题。与早期难以处理复杂边缘情况的无代码平台不同，n8n 允许开发者使用标准编程语言扩展功能。它填补了那些需要强大、可自托管且能同时处理简单 API 连接和复杂 AI 驱动流程的团队的市场空白。</p>

<p><strong>社区讨论</strong>: 社区积极贡献了超过 900 个工作流模板，并维护着一个用于故障排除和最佳实践讨论的支持性论坛。用户经常探讨如何通过自定义节点扩展 n8n 以及在生产环境中优化 AI 代理链。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#workflow-automation</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#low-code</code>, <code class="language-plaintext highlighter-rouge">#integration</code>, <code class="language-plaintext highlighter-rouge">#typescript</code></p>

<hr />

<p><a id="item-34"></a></p>
<h2 id="英伟达发布用于-gpu-加速优化的-cuopt-库-️-8010"><a href="https://github.com/NVIDIA/cuopt">英伟达发布用于 GPU 加速优化的 cuopt 库</a> ⭐️ 8.0/10</h2>

<p>英伟达推出了 cuopt，这是一个专为利用 GPU 加速解决大规模决策优化和路径规划问题而设计的库。该工具利用 CUDA 核心，与传统基于 CPU 的求解器相比，能显著加快复杂物流计算的速度。它代表了人工智能生态系统中向硬件加速运筹学方向的转变。 传统的优化求解器在处理现代供应链中实时、大规模的路径任务时，往往难以应对巨大的计算强度。通过将这些任务卸载到 GPU 上，cuopt 能够为以前需要数小时计算的问题提供近乎瞬时的解决方案。对于构建动态物流系统、自主车队管理和实时资源分配平台的 AI 工程师来说，这一能力至关重要。它弥合了经典运筹学与现代深度学习基础设施之间的差距。 cuopt 专门针对车辆路径问题（VRP）和其他组合优化挑战进行了优化。该库能与英伟达现有的 AI 工作流工具无缝集成，并支持 Python API 以便于采用。性能基准测试表明，在涉及数千个节点的数据集上，其求解时间有了数量级的提升。</p>

<p>rss · GitHub Trending - CUDA · Apr 11, 01:33</p>

<p><strong>背景</strong>: 决策优化历史上一直依赖于以 CPU 为中心的求解器（如 Gurobi 或 CPLEX），随着问题规模的扩大，这些求解器可能成为瓶颈。随着物流网络变得更加复杂并要求实时适应性，对大规模并行计算的需求已变得显而易见。英伟达进入这一领域，利用其 GPU 架构有效地并行化优化算法的搜索空间。这种方法使得处理以前不切实际的动态约束和更大数据集成为可能。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.nvidia.com/en-us/">World Leader in Artificial Intelligence Computing | NVIDIA</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者强调，该库通过更快的路线重新计算，在降低最后一公里配送成本方面具有巨大潜力。开发人员指出，虽然该工具功能强大，但它需要特定的英伟达硬件，并且在非路径优化类型上的灵活性较低。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#optimization</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#gpu</code>, <code class="language-plaintext highlighter-rouge">#logistics</code>, <code class="language-plaintext highlighter-rouge">#nvidia</code></p>

<hr />

<p><a id="item-35"></a></p>
<h2 id="rowboat具备持久记忆的本地优先-ai-同事框架-️-7010"><a href="https://github.com/rowboatlabs/rowboat">Rowboat：具备持久记忆的本地优先 AI 同事框架</a> ⭐️ 7.0/10</h2>

<p>Rowboat 推出了一款开源框架，能将电子邮件和会议笔记转化为用于自主代理交互的本地知识图谱。它利用存储在用户机器上的长期上下文，帮助用户生成报告、准备会议简报并追踪主题。该项目支持语音输入、通过 MCP 集成外部工具以及以 Markdown 格式可视化编辑图谱。 该项目通过提供跨会话持久的结构化长期记忆层，解决了无状态大语言模型代理的关键局限性。作为本地优先的方案，它在保持深度上下文感知的同时，为依赖云端的 AI 同事提供了保护隐私的替代选择。这种架构对于开发需要历史连续性且无数据泄露风险的可靠代理工作流至关重要。 该系统从 Gmail、日历和云端硬盘摄取数据，构建代理可查询和更新的动态知识图谱。用户可以通过自然语言命令或语音备忘录进行交互，执行创建演示文稿或竞争调研等复杂任务。配置允许可选集成 Deepgram、ElevenLabs、Exa 和 Composio，以增强多模态能力。</p>

<p>rss · GitHub Trending - Daily · Apr 11, 01:32</p>

<p><strong>背景</strong>: 当前的 AI 代理框架通常在交互间面临上下文丢失的问题，迫使用户反复重新解释背景信息。Rowboat 通过实施一种“同事”模型填补了这一空白，该模型将机构知识保留在用户控制的图数据库中。与短暂的聊天界面不同，这种方法将 AI 视为一个随时间积累理解的持久团队成员。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/rowboatlabs/rowboat">rowboatlabs/rowboat: Open-source AI coworker, with memory - GitHub</a></li>
<li><a href="https://www.tcs.com/what-we-do/industries/retail/white-paper/agentic-ai-coworker-resilient-supply-chains">Agentic AI Coworker: DAIEL Framework for Retail Supply Chains</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 虽然具备记忆的 AI 同事概念与当前的代理工作流高度相关，但该仓库目前缺乏足够的技术文档来验证其生产就绪性。鼓励早期采用者测试这种本地优先的架构，但应意识到其实现深度可能与成熟的企业解决方案存在差异。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#memory</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code></p>

<hr />

<p><a id="item-36"></a></p>
<h2 id="deeptutor-推出原生代理个性化学习系统-️-7010"><a href="https://github.com/HKUDS/DeepTutor">DeepTutor 推出原生代理个性化学习系统</a> ⭐️ 7.0/10</h2>

<p>DeepTutor 发布了 1.0.0 版本，其特点是完成了架构重构并推出了持久化自主 AI 导师“TutorBot”。此次更新将平台转变为原生代理设计，支持灵活的模式切换，并采用 Apache-2.0 许可证。该系统现在利用 Python 3.11+ 和 Next.js 16 提供增强的交互式学习体验。 该项目通过引入能在长时间学习中保持上下文的持久化代理，解决了基于静态聊天的导师的局限性。它为开发人员构建可扩展的教育技术解决方案提供了坚实的开源基础，无需从零开始。后端逻辑与前端界面的分离使得定制化和集成到现有教育工作流变得更加容易。最终，它为研究和商业用途普及了复杂的个性化 AI 辅导功能。 该系统基于现代技术栈构建，使用 Python 处理代理逻辑，使用 Next.js 构建用户界面。主要功能包括自主 TutorBot、用于原生代理交互的命令行界面以及对多种语言的支持。代码库文档齐全，并在 Discord 和微信上设有社区频道以提供支持。</p>

<p>rss · GitHub Trending - Daily · Apr 11, 01:32</p>

<p><strong>背景</strong>: 传统的 AI 辅导系统往往难以维持长期的学生上下文并动态适应个人的学习节奏。DeepTutor 通过利用基于代理的架构填补了这一空白，其中 AI 主动管理学习轨迹而不仅仅是响应提示。与以前的单轮对话模型不同，该系统采用持久性记忆和自主决策来模拟真人导师的连续性。这种方法代表了从简单的问答机器人到全面学习伴侣的重大演变。</p>

<p><strong>社区讨论</strong>: 该项目引起了广泛关注，在 GitHub 上获得了 10,000 颗星，表明开发者对基于代理的教育工具有浓厚的兴趣。用户在 Discord、飞书和微信上拥有活跃的社区群组，用于讨论实施策略和分享反馈。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-tutor</code>, <code class="language-plaintext highlighter-rouge">#personalized-learning</code>, <code class="language-plaintext highlighter-rouge">#agent-systems</code>, <code class="language-plaintext highlighter-rouge">#education-tech</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-37"></a></p>
<h2 id="opendataloader-pdf专为-rag-流水线打造的高精度解析器-️-7010"><a href="https://github.com/opendataloader-project/opendataloader-pdf">OpenDataLoader PDF：专为 RAG 流水线打造的高精度解析器</a> ⭐️ 7.0/10</h2>

<p>OpenDataLoader PDF 是一款全新的开源库，它将确定性的规则提取与用于复杂文档的可选 AI 混合模式相结合。该项目独特地提供了 Python、Node.js 和 Java 的原生 SDK，同时在表格和多栏布局准确性方面达到了最先进的基准测试分数。此外，项目还公布了成为首个端到端生成标签化 PDF（Tagged PDF）的开源工具的未来路线图。 该工具直接解决了检索增强生成（RAG）中的关键瓶颈，即糟糕的 PDF 解析会导致上下文幻觉或顺序混乱。通过为复杂的科学论文提供精确的边界框坐标和正确的阅读顺序，它显著提高了下游 AI 应用的可靠性。与仅支持 Python 的替代方案相比，其多语言 SDK 支持降低了在不同工程技术栈中集成的门槛。此外，计划中的无障碍功能为昂贵的手动 PDF 修复需求提供了可扩展的解决方案。 该库在包含无边框表格和 LaTeX 公式的 200 个真实世界基准测试中，实现了 0.907 的整体准确率得分和 92.8% 的表格准确率。它具有内置支持 80 多种语言的 OCR 混合模式，专门用于处理 300 DPI 及以上的低质量扫描件。输出格式包括用于分块的结构化 Markdown、用于引用的带元素坐标的 JSON 以及 HTML，并提供了现成的 LangChain 集成。</p>

<p>rss · GitHub Trending - Daily · Apr 11, 01:32</p>

<p><strong>背景</strong>: 长期以来，PDF 解析一直是 AI 工程中痛苦的先决条件，通常需要昂贵的专有 API 或在复杂布局上容易失效的脆弱开源脚本。现有的解决方案往往难以在多栏文档中保持逻辑阅读顺序，或在无人工干预的情况下准确从复杂表格中提取数据。OpenDataLoader PDF 通过提供一个平衡速度与深度布局分析的统一高精度引擎，填补了这一空白。它的独特之处在于既针对当前的 RAG 数据准备需求，又面向未来的数字无障碍法规合规性。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/opendataloader-project/opendataloader-pdf">GitHub - opendataloader -project/ opendataloader -pdf: PDF Parser...</a></li>
<li><a href="https://opendataloader.org/">OpenDataLoader PDF - PDF Parser for AI-Ready Data</a></li>
<li><a href="https://zhuanlan.zhihu.com/p/2019104927172031879">OpenDataloader -PDF：解锁AI训练的”数据暗物质”，PDF解析的革命性突破</a></li>
<li><a href="https://www.zhihu.com/tardis/zm/art/675509396">一文读懂：大模型RAG（检索增强生成）含高级方法</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期的讨论强调了该项目在与 Unstructured 等成熟解析器的基准测试中令人印象深刻的表现，特别是在科学文献领域。开发者对预计于 2026 年第二季度发布的自动标签化 PDF 生成功能表现出浓厚兴趣，以满足无障碍标准。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#pdf-parser</code>, <code class="language-plaintext highlighter-rouge">#data-engineering</code>, <code class="language-plaintext highlighter-rouge">#rag</code>, <code class="language-plaintext highlighter-rouge">#ai-infrastructure</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-38"></a></p>
<h2 id="superpowers-框架强制执行结构化智能体工作流-️-7010"><a href="https://github.com/obra/superpowers">Superpowers 框架强制执行结构化智能体工作流</a> ⭐️ 7.0/10</h2>

<p>Superpowers 引入了一个可组合的技能框架，阻止编码智能体立即编写代码，而是强制进行前期的规范细化阶段。它自动化了一个由子智能体驱动的开发过程，严格遵守测试驱动开发（TDD）、YAGNI（你不需要它）和 DRY（不要重复自己）原则。该工具通过插件市场直接集成到 Claude Code、Cursor 和 GitHub Copilot 等流行平台中。 该项目解决了 AI 智能体常见的失败模式，即在没有完全理解需求或规划可测试性的情况下匆忙实施解决方案。通过强制执行“先思考后编码”的方法论，它显著减少了 AI 生成软件中的幻觉功能和技术债务。结构化的工作流允许智能体在更长的时间内自主运行，同时保持与人类意图的一致性。最终，它将编码智能体从简单的文本补全工具转变为可靠的初级工程合作伙伴。 该框架通过拦截智能体任务来运作，在创建详细的实施计划之前，生成可读的设计块供用户批准。它利用子智能体架构来执行工程任务、检查工作并审查进度，而不会偏离商定的规范。安装跨多个环境进行了简化，在支持的 CLI 工具（如 Gemini CLI 或 Codex）中只需一条命令即可完成。</p>

<p>rss · GitHub Trending - Daily · Apr 11, 01:32</p>

<p><strong>背景</strong>: 在像 Superpowers 这样的框架出现之前，大多数 AI 编码助手都是被动运行的，基于即时提示生成代码片段，而缺乏整体的项目视角。这通常导致架构碎片化和测试覆盖率的缺失，因为模型优化的是速度而非正确性。Superpowers 填补了一个编排层的空白，将软件工程纪律强加于大语言模型的输出之上。它将范式从提示 - 响应交互转变为受管理的软件开发生命周期。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://grokipedia.com/page/Superpowers_agentic_skills_framework">Superpowers (agentic skills framework)</a></li>
<li><a href="https://en.wikipedia.org/wiki/YAGNI_principle">YAGNI principle</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者强调该框架能够让 Claude Code 在数小时内专注于复杂任务而不偏离主题。然而，一些用户指出，对于非常小的临时脚本，初始设置和对 TDD 的严格遵守可能会感觉缓慢。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#software-development</code>, <code class="language-plaintext highlighter-rouge">#framework</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#workflow</code></p>

<hr />

<p><a id="item-39"></a></p>
<h2 id="开源-mcp-服务器将-claude-桌面与实时交易数据连接起来-️-7010"><a href="https://github.com/atilaahmettaner/tradingview-mcp">开源 MCP 服务器将 Claude 桌面与实时交易数据连接起来</a> ⭐️ 7.0/10</h2>

<p>tradingview-mcp 项目推出了一款新的模型上下文协议（MCP）服务器，将实时加密货币和股票筛选功能直接集成到 Claude 桌面中。它提供了来自币安、KuCoin 和 Bybit 等多交易所数据的即时访问，并附带超过 30 种技术分析工具。该版本还包含了六种策略的内建回测功能以及来自 Reddit 和 RSS 源的实时情绪分析。 该工具通过消除复杂的基础设施设置时间，显著降低了开发 AI 驱动交易代理的门槛。与传统需要数小时 Docker 配置或每年花费超过 3 万美元的彭博终端相比，此解决方案免费且只需几分钟即可就绪。它使开发人员能够利用大型语言模型进行复杂的金融分析，而无需具备深厚的数据管道工程专业知识。原生 Claude 桌面支持的集成允许使用自然语言查询复杂的市场状况。 该服务器支持 Python 3.10+，并连接到币安和 Bybit 等主要交易所以获取实时市场数据。主要功能包括布林带智能分析、K 线形态识别以及用于回测的夏普比率计算。安装通过 PyPI 简化，允许用户立即在 Claude 桌面设置中配置 MCP 服务器。</p>

<p>rss · GitHub Trending - Python · Apr 11, 01:37</p>

<p><strong>背景</strong>: 在此项目之前，将 AI 助手连接到实时金融数据需要构建自定义 API 或依赖昂贵的企业解决方案。开发人员经常面临碎片化的工作流，其中数据检索、技术分析和模型交互由单独的、不可互操作的系统处理。模型上下文协议（MCP）的出现提供了一种标准化的方法来弥合这些差距，但很少有实现专门关注金融科技。该项目通过提供专用的开源交易工作流桥梁填补了这一空白。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://modelcontextprotocol.io/docs/getting-started/intro">What is the Model Context Protocol (MCP)? - Model Context Protocol</a></li>
<li><a href="https://www.anthropic.com/news/model-context-protocol">Introducing the Model Context Protocol - Anthropic</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者强调，与手动脚本环境相比，设置该服务器非常容易。用户赞赏能够使用自然语言向 Claude 提出有关市场趋势的复杂问题而无需编写代码。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#mcp</code>, <code class="language-plaintext highlighter-rouge">#ai-trading</code>, <code class="language-plaintext highlighter-rouge">#claude-desktop</code>, <code class="language-plaintext highlighter-rouge">#fintech</code>, <code class="language-plaintext highlighter-rouge">#python</code></p>

<hr />

<p><a id="item-40"></a></p>
<h2 id="jetbrains-插件为-ide-引入-claude-code-和-codex-图形界面-️-7010"><a href="https://github.com/zhukunpenglinyutong/jetbrains-cc-gui">JetBrains 插件为 IDE 引入 Claude Code 和 Codex 图形界面</a> ⭐️ 7.0/10</h2>

<p>一款名为 CC GUI 的新 JetBrains 插件提供了在 IDE 内直接与 Claude Code 和 OpenAI Codex 交互的图形界面。它支持双 AI 引擎、上下文感知对话以及带有斜杠命令的代理系统。该项目最近为避免商标风险进行了更名，并加强了安全审计协议。 该工具弥合了基于强大命令行的 AI 编程助手与偏好编辑器内可视化工作流的开发者之间的差距。通过直接集成到 JetBrains IDE 中，它减少了上下文切换，并允许使用 @file 语法无缝引用代码。代理系统和 MCP 服务器支持的加入，将自动化能力扩展到了简单的聊天交互之外。然而，其有效性仍然取决于底层 Claude Code 和 Codex 命令行工具的性能。 该插件具备智能对话功能，支持发送图片、对话回溯和增强提示。它包含一个内置代理系统，拥有 /init 和 /review 等技能，并提供全面的会话管理和历史记录搜索。安全措施包括定期审计和权限控制，而用户界面功能则提供主题切换和字体同步。</p>

<p>rss · GitHub Trending - TypeScript · Apr 11, 01:39</p>

<p><strong>背景</strong>: Claude Code 和 OpenAI Codex 是强大的 AI 编程工具，但主要通过命令行界面运行，这对某些开发者来说可能显得繁琐。之前的解决方案往往缺乏深度的 IDE 集成，或者迫使用户在终端窗口和代码编辑器之间切换。该项目通过将这些能力直接嵌入 JetBrains 生态系统填补了这一空白，为 AI 辅助开发提供了统一的环境。它满足了人们对无头 AI 代理之上可视化交互层日益增长的需求。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/anthropics/claude-code/releases">Releases · anthropics/claude-code - GitHub</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#jetbrains</code>, <code class="language-plaintext highlighter-rouge">#ai-coding</code>, <code class="language-plaintext highlighter-rouge">#claude-code</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#ide-plugin</code></p>

<hr />

<p><a id="item-41"></a></p>
<h2 id="playwright-cli-为-ai-代理优化浏览器自动化-️-7010"><a href="https://github.com/microsoft/playwright-cli">Playwright CLI 为 AI 代理优化浏览器自动化</a> ⭐️ 7.0/10</h2>

<p>微软发布了一款专用的 Playwright CLI 工具，旨在将浏览器自动化功能作为令牌高效的技能（SKILLS）暴露给编码代理。与模型上下文协议（MCP）版本不同，该接口避免了将大型工具模式或冗长的可访问性树加载到大型语言模型上下文中。它使代理能够执行简洁的命令来记录代码、检查选择器和管理浏览器会话，同时最大限度地减少令牌开销。 该工具通过优先考虑令牌效率而非丰富的内省能力，解决了现代编码代理中上下文窗口有限的关键约束。通过使用基于 CLI 的工作流，开发人员可以将高吞吐量的浏览器测试集成到代理循环中，而不会因工具定义耗尽模型的上下文预算。这使得它在涉及大型代码库的工作流中特别有价值，因为在这些工作流中每个令牌都至关重要，从而将其与更适合持久性、重状态自主任务的 MCP 解决方案区分开来。 该 CLI 支持通过内存或磁盘持久化进行会话管理，并允许用户使用会话标志定位特定的浏览器实例。它与 Claude Code 和 GitHub Copilot 等代理无缝集成，这些代理可以通过帮助命令自动发现可用的技能。该工具默认以无头模式运行，但在需要时支持有头模式以进行视觉调试。</p>

<p>rss · GitHub Trending - TypeScript · Apr 11, 01:39</p>

<p><strong>Background</strong>: As AI coding agents have become mainstream, approaches to interacting with external tools have split into rich protocols like MCP and lightweight CLI invocation. While MCP offers deep state retention for complex autonomous loops, it often carries a token cost that is unsustainable for fast-iterating coding tasks. This project fills the gap with a streamlined command-line interface built to reduce context load while retaining Playwright's full automation capabilities.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#playwright</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#cli</code>, <code class="language-plaintext highlighter-rouge">#browser-automation</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code></p>

<hr />

<p><a id="item-42"></a></p>
<h2 id="chatlab本地优先的私密聊天记录-ai-分析工具-️-7010"><a href="https://github.com/hellodigua/ChatLab">ChatLab：本地优先的私密聊天记录 AI 分析工具</a> ⭐️ 7.0/10</h2>

<p>ChatLab is a desktop application that combines a SQL engine with an AI agent to analyze personal chat history locally. It currently supports major platforms including WeChat, WhatsApp, and Telegram, normalizing them through a unified data model, and its streaming parser handles datasets with millions of messages while maintaining high performance. The project addresses a key need for privacy-preserving memory retrieval by ensuring that raw chat data never leaves the user's device. Unlike cloud-based analytics, ChatLab lets users apply powerful AI agents for summarization and pattern recognition without exposing sensitive social interactions, filling a gap for people who want deep insight into their digital social history without relying on third-party servers. The architecture is local-first: an Electron main process handles lifecycle control while a worker layer manages compute-intensive parsing tasks. It uses an agent-plus-function-calling workflow for dynamic search and context-aware analysis rather than static, hard-coded queries. Supported export formats are mapped into a consistent schema, making it possible to switch seamlessly between different chat apps.</p>
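
<p>As a rough illustration of the unified-data-model idea (a sketch in Python rather than ChatLab's actual TypeScript code), the example below normalizes two hypothetical export rows into one message schema; all field names are assumptions.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class Message:
    platform: str
    sender: str
    text: str
    sent_at: datetime

def from_whatsapp(row: dict) -> Message:
    # Hypothetical WhatsApp export fields: "author", "body", "timestamp" (epoch seconds).
    return Message("whatsapp", row["author"], row["body"],
                   datetime.fromtimestamp(row["timestamp"], tz=timezone.utc))

def from_telegram(row: dict) -> Message:
    # Hypothetical Telegram export fields: "from", "text", "date" (ISO 8601).
    return Message("telegram", row["from"], row["text"],
                   datetime.fromisoformat(row["date"]))

batch = [from_whatsapp({"author": "alice", "body": "hi", "timestamp": 1700000000}),
         from_telegram({"from": "bob", "text": "hello", "date": "2026-04-10T08:00:00+00:00"})]
print([m.platform for m in batch])
</code></pre></div></div>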

<p>rss · GitHub Trending - TypeScript · Apr 11, 01:39</p>

<p><strong>Background</strong>: As personal communication increasingly moves to digital platforms, users accumulate large amounts of unstructured chat data that is hard to search or analyze effectively. Existing solutions typically require uploading this sensitive data to the cloud, raising serious privacy concerns about data ownership and security. ChatLab addresses this by providing a purely local environment in which AI models operate directly on exported files, bridging large language model capabilities and personal data sovereignty.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://grokipedia.com/page/Running_Open-Source_LLMs_Locally">Running Open-Source LLMs Locally</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: While the provided text does not detail specific community forum discussions, the project's open-source nature and transparent roadmap suggest active engagement from privacy-conscious developers. Users are encouraged to file issues and feature requests directly on GitHub to drive future support for platforms such as iMessage and Messenger.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agent</code>, <code class="language-plaintext highlighter-rouge">#privacy</code>, <code class="language-plaintext highlighter-rouge">#chat-analysis</code>, <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#desktop-app</code></p>

<hr />

<p><a id="item-43"></a></p>
<h2 id="gpumd高性能-gpu-分子动力学引擎-️-7010-1"><a href="https://github.com/brucefan1983/GPUMD">GPUMD：高性能 GPU 分子动力学引擎</a> ⭐️ 7.0/10</h2>

<p>GPUMD is a molecular dynamics package designed to run on NVIDIA GPUs and fully accelerated with CUDA. It lets researchers simulate the physical motion of atoms and molecules far more efficiently than traditional CPU-based approaches. Molecular dynamics simulations typically demand enormous computational resources to integrate Newton's equations of motion for complex systems over time. By exploiting the parallel processing power of GPUs, GPUMD dramatically reduces simulation time, enabling longer trajectories and larger system sizes. That acceleration matters for progress in computational chemistry, materials science, and biophysics, fields where analytical solutions are often impossible. The software uses the CUDA programming model to dispatch particle-interaction calculations across thousands of GPU cores simultaneously. It is designed for high-performance computing (HPC) environments rather than general AI model training, and users can expect substantial speedups on tasks involving interatomic potentials and force-field calculations.</p>

<p>rss · GitHub Trending - CUDA · Apr 11, 01:33</p>

<p><strong>Background</strong>: Traditional molecular dynamics packages typically rely on CPU clusters, which can be costly and slow for large-scale simulations. While some tools offer hybrid CPU-GPU support, GPUMD is distinctive in being designed for GPU architectures from the ground up. This approach tackles the numerically demanding nature of long simulations by enabling fast execution, minimizing accumulated numerical error through better sampling.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Molecular_dynamics_simulation">Molecular dynamics simulation</a></li>
<li><a href="https://docs.nvidia.com/cuda/cuda-programming-guide/index.html">CUDA Programming Guide - NVIDIA Documentation</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project's score of 7.0 indicates strong utility within its specialized domain even though it sits outside the core AI ecosystem. It is viewed as an important tool for scientists bridging the gap between theoretical models and macroscopic thermodynamic properties.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#molecular-dynamics</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#hpc</code>, <code class="language-plaintext highlighter-rouge">#computational-chemistry</code>, <code class="language-plaintext highlighter-rouge">#gpu</code></p>

<hr />]]></content><author><name></name></author><summary type="html"><![CDATA[From 102 items, 43 important content pieces were selected]]></summary></entry><entry xml:lang="en"><title type="html">Horizon Summary: 2026-04-11 (EN)</title><link href="https://ming-321.github.io/horizon/2026/04/10/summary-en.html" rel="alternate" type="text/html" title="Horizon Summary: 2026-04-11 (EN)" /><published>2026-04-10T16:00:00+00:00</published><updated>2026-04-10T16:00:00+00:00</updated><id>https://ming-321.github.io/horizon/2026/04/10/summary-en</id><content type="html" xml:base="https://ming-321.github.io/horizon/2026/04/10/summary-en.html"><![CDATA[<blockquote>
  <p>From 132 items, 66 important content pieces were selected</p>
</blockquote>

<hr />

<h3 id="头条速递">头条速递</h3>
<ol>
  <li><a href="#item-1">CPUID Website Hijacked to Distribute Malware via CPU-Z and HWMonitor</a> ⭐️ 9.0/10</li>
  <li><a href="#item-2">NUS Presents DMax: A New Paradigm for Fast Parallel Diffusion Language Models</a> ⭐️ 9.0/10</li>
  <li><a href="#item-3">Stanford Introduces Meta-Harness for Self-Improving LLM Agents</a> ⭐️ 9.0/10</li>
  <li><a href="#item-4">DeepSeek V4 to Launch with Trillion Parameters and Native Huawei Ascend Support</a> ⭐️ 9.0/10</li>
  <li><a href="#item-5">Solayer Founder Reveals 20% of Free LLM Routers Inject Malicious Code</a> ⭐️ 9.0/10</li>
  <li><a href="#item-6">Alibaba’s Wan2.7 Tops DesignArena Leaderboard with 1334 Elo Rating</a> ⭐️ 8.0/10</li>
  <li><a href="#item-7">Star Action Era Wins Three Global Titles at Embodied AI Olympics</a> ⭐️ 8.0/10</li>
  <li><a href="#item-8">Chinese Open-Source AI Models Dominate Silicon Valley with 10x Cost Efficiency</a> ⭐️ 8.0/10</li>
  <li><a href="#item-9">Developer Reports 60% Performance Bug in cuBLAS on RTX 5090</a> ⭐️ 8.0/10</li>
  <li><a href="#item-10">GLM-5.1 Open Model Tops Code Arena Rankings</a> ⭐️ 8.0/10</li>
  <li><a href="#item-11">GLM-5.1 Matches Opus in Agentic Benchmarks at One-Third the Cost</a> ⭐️ 8.0/10</li>
  <li><a href="#item-12">Developer Releases 9B LoRA Model Achieving 89% Autonomous Data Analysis</a> ⭐️ 8.0/10</li>
  <li><a href="#item-13">Community Effort to Reverse Engineer Gemma 4 MTP Capabilities</a> ⭐️ 8.0/10</li>
  <li><a href="#item-14">TurboQuant and TriAttention Combine for 6.8x KV Cache Reduction in llama.cpp on AMD HIP</a> ⭐️ 8.0/10</li>
  <li><a href="#item-15">France Commits to Replacing Windows with Linux for 2.5 Million Civil Servants</a> ⭐️ 8.0/10</li>
  <li><a href="#item-16">Claude Models Show Identity Confusion Risk Near Context Limits</a> ⭐️ 8.0/10</li>
  <li><a href="#item-17">CPU-Z Official Website Hacked, Malicious Code Injected into Downloads</a> ⭐️ 8.0/10</li>
  <li><a href="#item-18">WireGuard Releases New Windows Version After Microsoft Signing Resolution</a> ⭐️ 7.0/10</li>
  <li><a href="#item-19">ChatGPT Voice Mode Runs on Older, Weaker Model</a> ⭐️ 7.0/10</li>
  <li><a href="#item-20">Shengshu Technology Raises $280M Series B for General World Model</a> ⭐️ 7.0/10</li>
  <li><a href="#item-21">Trump Administration Summons Reddit to Grand Jury to Unmask ICE Critic</a> ⭐️ 7.0/10</li>
  <li><a href="#item-22">ibu-boost: A GBDT Library Using Absolute Split Rejection</a> ⭐️ 7.0/10</li>
  <li><a href="#item-23">Gemma 4 Fixes: Reasoning Budgets and Tool Calling Templates Updated</a> ⭐️ 7.0/10</li>
  <li><a href="#item-24">New Open-Source Suite Simplifies High-Quality GGUF Quantization</a> ⭐️ 7.0/10</li>
  <li><a href="#item-25">Local Qwen3.5 and MCP Tools Replace Cloud LLMs for Web Research</a> ⭐️ 7.0/10</li>
  <li><a href="#item-26">Community Highlights Chaos in Reasoning Token Formats Across LLMs</a> ⭐️ 7.0/10</li>
  <li><a href="#item-27">FCC to Vote on Banning Chinese Labs from US Device Testing</a> ⭐️ 7.0/10</li>
  <li><a href="#item-28">MiniMax Launches Music 2.6 with Enhanced Agent Skills and Free Trial</a> ⭐️ 7.0/10</li>
  <li><a href="#item-29">Anthropic Temporarily Bans Then Reinstates OpenClaw Developer Account</a> ⭐️ 7.0/10</li>
</ol>

<h3 id="关注动态">关注动态</h3>
<ol>
  <li><a href="#item-30">MemSearch Updates: 3 updates — update OpenClaw capture architecture from llm_output debounce t…, bump memsearch to 0.2.4 and OpenClaw plugin to 0.2.0 (#322), OpenClaw plugin — remove child_process, simplify capture, f…</a> ⭐️ ?/10</li>
  <li><a href="#item-31">openai/codex: 3 releases — rust-v0.119.0-alpha.33, rust-v0.119.0-alpha.32, rust-v0.119.0-alpha.29</a> ⭐️ ?/10</li>
  <li><a href="#item-32">anthropics/claude-code: 2 releases — v2.1.101, v2.1.100</a> ⭐️ ?/10</li>
</ol>

<h3 id="github-热榜">GitHub 热榜</h3>
<ol>
  <li><a href="#item-33">Microsoft Releases BitNet for Efficient 1-Bit LLM Inference</a> ⭐️ 10.0/10</li>
  <li><a href="#item-34">Karpathy Releases Minimal LLM Training in Pure C and CUDA</a> ⭐️ 10.0/10</li>
  <li><a href="#item-35">Instant-NGP Revolutionizes NeRF Training Speed with CUDA</a> ⭐️ 10.0/10</li>
  <li><a href="#item-36">SageAttention Delivers 2-5x Speedup via Quantization</a> ⭐️ 10.0/10</li>
  <li><a href="#item-37">Nous Research Launches Self-Improving Hermes Agent Framework</a> ⭐️ 9.0/10</li>
  <li><a href="#item-38">VoxCPM2: Tokenizer-Free Multilingual TTS and Voice Cloning</a> ⭐️ 9.0/10</li>
  <li><a href="#item-39">DFlash Enables Efficient Parallel Drafting for LLM Speculative Decoding</a> ⭐️ 9.0/10</li>
  <li><a href="#item-40">Open WebUI: Self-Hosted Interface for Local and Cloud LLMs</a> ⭐️ 9.0/10</li>
  <li><a href="#item-41">Apache Airflow: Industry-Standard Workflow Orchestration</a> ⭐️ 9.0/10</li>
  <li><a href="#item-42">Daytona: Secure Infrastructure for AI Code Execution</a> ⭐️ 9.0/10</li>
  <li><a href="#item-43">Executor Unifies AI Agent Tool Integration</a> ⭐️ 9.0/10</li>
  <li><a href="#item-44">Superset Orchestrates Multiple AI Coding Agents Locally</a> ⭐️ 9.0/10</li>
  <li><a href="#item-45">DeepGEMM Delivers Optimized FP8 Matrix Multiplication for CUDA</a> ⭐️ 9.0/10</li>
  <li><a href="#item-46">Optimized CUDA Kernels for Mamba Sequence Modeling</a> ⭐️ 9.0/10</li>
  <li><a href="#item-47">NVIDIA cuVS: GPU-Accelerated Vector Search Library</a> ⭐️ 9.0/10</li>
  <li><a href="#item-48">Archon: Deterministic Harness for AI Coding Workflows</a> ⭐️ 8.0/10</li>
  <li><a href="#item-49">Kronos: First Open-Source Foundation Model for Financial K-Lines</a> ⭐️ 8.0/10</li>
  <li><a href="#item-50">Claudian Integrates AI Coding Agents into Obsidian Vaults</a> ⭐️ 8.0/10</li>
  <li><a href="#item-51">Hugging Face Skills Standardizes AI Agent Workflows</a> ⭐️ 8.0/10</li>
  <li><a href="#item-52">QMD: Local Hybrid Search Engine for AI Agents</a> ⭐️ 8.0/10</li>
  <li><a href="#item-53">Multica Orchestrates AI Coding Agents as Virtual Teammates</a> ⭐️ 8.0/10</li>
  <li><a href="#item-54">VoltAgent: TypeScript Framework for AI Agent Engineering</a> ⭐️ 8.0/10</li>
  <li><a href="#item-55">LlamaIndex Releases LiteParse for Fast Local PDF Parsing</a> ⭐️ 8.0/10</li>
  <li><a href="#item-56">Qwen Code: Open-Source Terminal AI Agent for Developers</a> ⭐️ 8.0/10</li>
  <li><a href="#item-57">OpenCode: Open-Source AI Coding Agent for Developers</a> ⭐️ 8.0/10</li>
  <li><a href="#item-58">NVIDIA cuopt: GPU-Accelerated Solver for Large-Scale Routing</a> ⭐️ 8.0/10</li>
  <li><a href="#item-59">ThunderKittens Accelerates CUDA Kernel Development</a> ⭐️ 8.0/10</li>
  <li><a href="#item-60">DeepTutor v1.0 Launches as Agent-Native Tutoring System</a> ⭐️ 7.0/10</li>
  <li><a href="#item-61">OpenDataLoader PDF: High-Accuracy Parser for AI RAG Pipelines</a> ⭐️ 7.0/10</li>
  <li><a href="#item-62">Superpowers Framework Enforces Structured Agentic Workflows</a> ⭐️ 7.0/10</li>
  <li><a href="#item-63">Open-Source MCP Server for Real-Time AI Trading Analysis</a> ⭐️ 7.0/10</li>
  <li><a href="#item-64">Rowboat: Open-Source AI Coworker with Persistent Memory</a> ⭐️ 7.0/10</li>
  <li><a href="#item-65">GitNexus: Client-Side Graph RAG for Code Intelligence</a> ⭐️ 7.0/10</li>
  <li><a href="#item-66">GPUMD: High-Performance GPU Molecular Dynamics Engine</a> ⭐️ 7.0/10</li>
</ol>

<h2 id="头条速递-1">头条速递</h2>

<p><a id="item-1"></a></p>
<h2 id="cpuid-website-hijacked-to-distribute-malware-via-cpu-z-and-hwmonitor-️-9010"><a href="https://www.theregister.com/2026/04/10/cpuid_site_hijacked/">CPUID Website Hijacked to Distribute Malware via CPU-Z and HWMonitor</a> ⭐️ 9.0/10</h2>

<p>The official CPUID website was compromised in a supply-chain attack where download links for popular utilities CPU-Z and HWMonitor were redirected to malicious Cloudflare R2 storage buckets. Attackers replaced legitimate installers with malware-laced versions, triggering immediate detections by Windows Defender for some users. The incident was confirmed through community reports and initial checks by a project maintainer who noted the server files appeared intact while the site links were altered. This incident is critical because CPU-Z and HWMonitor are industry-standard tools used by developers, system administrators, and hardware enthusiasts for validating system specifications and monitoring health. A compromise of this magnitude exposes a vast user base to potential data theft, ransomware, or unauthorized remote access under the guise of trusted software. It highlights the fragility of software distribution channels and the severe risks associated with supply-chain attacks that bypass traditional perimeter defenses. Furthermore, it may erode trust in official vendor sites, forcing users to rely on third-party mirrors which carry their own risks. The attack vector involved hijacking the website’s HTML to redirect download buttons to external Cloudflare R2 object storage hosting malicious executables rather than compromising the actual files on the CPUID servers. Early reports indicate that Windows Defender successfully flagged the downloaded malicious installers, though false positive fatigue remains a concern for security professionals. Maintainers have stated they are investigating the breach while confirming that the original files stored on their backend infrastructure remain uncompromised.</p>

<p>hackernews · pashadee · Apr 10, 13:29</p>

<p><strong>Background</strong>: A supply-chain attack occurs when cybercriminals target less secure elements in a software or hardware distribution network to inject malicious code into legitimate products before they reach the end user. CPU-Z and HWMonitor are widely respected freeware utilities developed by CPUID for displaying detailed technical information about a computer’s processor, motherboard, and sensors. Cloudflare R2 is a distributed object storage solution compatible with Amazon S3 APIs, often used by attackers for its low cost and lack of egress fees to host large payloads. Such attacks are particularly dangerous because users inherently trust software downloaded directly from an official vendor’s domain.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://www.cloudflare.com/developer-platform/products/r2/">R 2 | Scalable solution for distributed object storage | Cloudflare</a></li>
<li><a href="https://en.wikipedia.org/wiki/Supply_chain_attack">Supply chain attack</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Community sentiment is a mix of alarm and technical analysis, with users confirming that Windows Defender detected viruses immediately after downloading the compromised files. A purported maintainer commented that they are working to verify the scope of the issue, noting that the files on their internal server appear clean while the website links are the primary vector. Some users discussed the irony of false positives training people to ignore warnings, while others clarified the distinction between the affected CPUID tools and similar software like HWInfo.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#supply-chain-attack</code>, <code class="language-plaintext highlighter-rouge">#malware</code>, <code class="language-plaintext highlighter-rouge">#security-incidents</code>, <code class="language-plaintext highlighter-rouge">#system-utilities</code>, <code class="language-plaintext highlighter-rouge">#infrastructure-security</code></p>

<hr />

<p><a id="item-2"></a></p>
<h2 id="nus-presents-dmax-a-new-paradigm-for-fast-parallel-diffusion-language-models-️-9010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sht2yo/national_university_of_singapore_presents_dmax_a/">NUS Presents DMax: A New Paradigm for Fast Parallel Diffusion Language Models</a> ⭐️ 9.0/10</h2>

<p>Researchers from the National University of Singapore have introduced DMax, a new framework for diffusion language models (dLLMs) that enables aggressive parallel decoding by mitigating error accumulation. The core innovation involves reformulating decoding as a progressive self-refinement process, allowing the model to correct its own erroneous predictions during generation rather than committing to them immediately. This approach utilizes On-Policy Uniform Training and Soft Parallel Decoding to unify masked and uniform training strategies while representing intermediate states as interpolations between predicted and mask embeddings. This development is significant because it addresses the primary bottleneck of diffusion LLMs, where early incorrect guesses typically snowball into poor quality output when decoding too many tokens in parallel. By enabling models to revise their own mistakes effectively, DMax unlocks the theoretical speed advantages of parallel generation without sacrificing accuracy, potentially rivaling or exceeding traditional autoregressive models in inference speed. The reported achievement of 1,338 tokens per second on H200 GPUs suggests a major leap forward for real-time generative AI applications. If widely adopted, this paradigm could shift the industry standard from sequential token generation to highly parallelized processes, drastically reducing latency for large-scale deployments. Experimental results show that DMax improves Tokens Per Forward pass (TPF) on the GSM8K benchmark from 2.04 to 5.47 compared to the original LLaDA-2.0-mini, while maintaining comparable accuracy. On the MBPP coding benchmark, TPF increased from 2.71 to 5.86, demonstrating robust performance gains across different tasks. The system achieves an average throughput of 1,338 TPS at batch size 1 using two H200 GPUs, highlighting its efficiency in low-latency scenarios. The method relies on representing intermediate decoding states as soft interpolations, which preserves uncertainty and facilitates easier revision compared to rigid binary mask-to-token transitions.</p>
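
<p>A minimal sketch of the soft-interpolation idea described above, not the authors' implementation: each undecided position is represented as a confidence-weighted mix of its predicted token embedding and the mask embedding, so later refinement steps can still revise it.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch

def soft_state(logits, token_emb, mask_emb):
    """Blend predicted embeddings with the mask embedding by confidence.

    logits: (seq, vocab) scores for each position in this refinement step.
    token_emb: (vocab, dim) embedding table; mask_emb: (dim,) mask embedding.
    """
    probs = logits.softmax(dim=-1)                    # (seq, vocab)
    conf, pred = probs.max(dim=-1)                    # confidence and argmax per position
    predicted = token_emb[pred]                       # (seq, dim) hard predictions
    # Low-confidence positions stay close to the mask embedding, keeping them revisable.
    return conf.unsqueeze(-1) * predicted + (1.0 - conf).unsqueeze(-1) * mask_emb

# Toy usage with random tensors.
logits = torch.randn(8, 100)
emb = torch.randn(100, 32)
state = soft_state(logits, emb, torch.zeros(32))
print(state.shape)  # torch.Size([8, 32])
</code></pre></div></div>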

<p>rss · r/LocalLLaMA · Apr 10, 17:23</p>

<p><strong>Background</strong>: Diffusion language models (dLLMs) are a type of generative AI inspired by diffusion processes in physics, where data is generated by gradually denoising random noise rather than predicting tokens one by one like traditional autoregressive models. While dLLMs theoretically allow for parallel generation of multiple tokens simultaneously, they often suffer from error accumulation, where an early mistake corrupts the context for subsequent steps. Parallel decoding strategies aim to accelerate inference by predicting multiple tokens at once, but previous methods struggled to balance speed with quality due to this sensitivity to initial errors. Progressive self-refinement is an emerging concept where models iteratively improve their outputs, similar to how humans draft and edit text, which DMax leverages to stabilize parallel generation.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://www.emergentmind.com/topics/confident-parallel-decoding">Confident Parallel Decoding for Diffusion LLMs</a></li>
<li><a href="https://arxiv.org/html/2502.05605v4">Evolving LLMs’ Self - Refinement Capability via Synergistic...</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#diffusion models</code>, <code class="language-plaintext highlighter-rouge">#llm research</code>, <code class="language-plaintext highlighter-rouge">#parallel decoding</code>, <code class="language-plaintext highlighter-rouge">#generative ai</code>, <code class="language-plaintext highlighter-rouge">#nlp</code></p>

<hr />

<p><a id="item-3"></a></p>
<h2 id="stanford-introduces-meta-harness-for-self-improving-llm-agents-️-9010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1shyczh/stanford_self_improving_metaharness/">Stanford Introduces Meta-Harness for Self-Improving LLM Agents</a> ⭐️ 9.0/10</h2>

<p>Stanford researchers have introduced Meta-Harness, an outer-loop system that automatically searches over and optimizes the code (harness) governing how information is stored and presented to Large Language Models. Unlike previous methods requiring manual prompt or context engineering, this framework uses an agentic proposer to analyze execution traces and source code to correct mistakes and improve performance iteratively. In benchmarks, Meta-Harness improved online text classification accuracy by 7.7 points while using four times fewer context tokens compared to state-of-the-art systems. This development signifies a major shift from manual design to automated optimization in AI system architecture, potentially reducing the reliance on human experts for crafting complex agent workflows. By enabling systems to self-correct and optimize their own context usage, Meta-Harness could drastically lower computational costs and improve the reliability of autonomous agents in real-world applications. This approach surpasses existing text optimizers that often compress feedback too aggressively, offering a more nuanced way to evolve LLM capabilities without changing the underlying model weights. Ultimately, it paves the way for truly self-improving AI systems that can adapt to new tasks with minimal human intervention. The system utilizes an agentic proposer that accesses the source code, scores, and execution traces of all prior candidates through a filesystem to guide its search. On retrieval-augmented math reasoning tasks involving 200 IMO-level problems, a single discovered harness improved accuracy by an average of 4.7 points across five held-out models. Additionally, in agentic coding scenarios on TerminalBench-2, the discovered harnesses outperformed the best hand-engineered baselines, demonstrating robustness across different domains. The project’s code and artifacts are publicly available on GitHub for further experimentation and local deployment.</p>
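
<p>The outer loop can be pictured as the sketch below, which is only a schematic in the spirit of the paper: the <code class="language-plaintext highlighter-rouge">propose</code> and <code class="language-plaintext highlighter-rouge">score</code> functions are placeholders standing in for the agentic proposer and the task evaluation, not the Stanford code.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import random

def score(harness_source: str) -> float:
    """Placeholder: run the harness on a validation task set and return accuracy."""
    return random.random()

def propose(best_source: str, traces: list) -> str:
    """Placeholder for the agentic proposer, which in the paper reads prior
    candidates' source, scores, and execution traces before editing the code."""
    return best_source + f"\n# revision informed by {len(traces)} prior traces"

best = "def harness(task, model):\n    return model(task)\n"
history = []
for step in range(5):                       # outer loop over harness candidates
    candidate = propose(best, history)
    s = score(candidate)
    history.append({"source": candidate, "score": s})
    if s >= max((h["score"] for h in history[:-1]), default=-1.0):
        best = candidate                    # keep the best-scoring harness so far
print(len(history), "candidates explored")
</code></pre></div></div>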

<p>rss · r/LocalLLaMA · Apr 10, 20:33</p>

<p><strong>Background</strong>: Traditionally, optimizing Large Language Model performance has relied on ‘prompt engineering’ (crafting specific inputs) and ‘context engineering’ (systematically managing the information provided to the model). As AI systems evolved into ‘agents’ capable of taking actions, developers created ‘harnesses’—the surrounding code that manages memory, retrieval, and orchestration logic—but these were still largely designed by hand. Context engineering has emerged as a critical discipline because LLMs have architectural blind spots, making how information is structured far more important than the sheer volume of data included. Meta-Harness represents the next evolution by automating the design of these harnesses, treating the orchestration code itself as an optimizable variable rather than a static human creation.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://yoonholee.com/meta-harness/">Meta - Harness</a></li>
<li><a href="https://arxiv.org/pdf/2603.28052">Meta - Harness : End-to-End Optimization of Model Harnesses</a></li>
<li><a href="https://blog.bytebytego.com/p/a-guide-to-context-engineering-for">A Guide to Context Engineering for LLMs</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm research</code>, <code class="language-plaintext highlighter-rouge">#autonomous agents</code>, <code class="language-plaintext highlighter-rouge">#prompt optimization</code>, <code class="language-plaintext highlighter-rouge">#stanford</code>, <code class="language-plaintext highlighter-rouge">#arxiv</code></p>

<hr />

<p><a id="item-4"></a></p>
<h2 id="deepseek-v4-to-launch-with-trillion-parameters-and-native-huawei-ascend-support-️-9010"><a href="https://finance.sina.com.cn/tech/2026-04-10/doc-inhtymqf5317301.shtml">DeepSeek V4 to Launch with Trillion Parameters and Native Huawei Ascend Support</a> ⭐️ 9.0/10</h2>

<p>DeepSeek plans to officially release its V4 flagship model in late April 2026, featuring a trillion-level parameter count and a million-token context window. Crucially, this release marks the first deep adaptation of a major Chinese LLM to domestic hardware, specifically optimizing for Huawei’s Ascend AI chips. This move represents a significant shift away from reliance on NVIDIA’s CUDA ecosystem for high-performance inference and training. This development is a critical milestone in China’s ‘de-CUDA’ strategy, potentially reducing the impact of semiconductor sanctions on the nation’s AI progress by enabling efficient operations on domestic silicon. If successful, it could force a reevaluation of the global AI hardware market, challenging NVIDIA’s dominance by proving that alternative architectures like Huawei’s DaVinci can handle trillion-parameter workloads. The immediate market reaction, including a 20% price surge in AI chips and massive pre-orders from tech giants like Alibaba and Tencent, underscores the high stakes and anticipated demand for this localized solution. The model reportedly supports a context window of up to one million tokens, requiring advanced memory management techniques likely leveraging Huawei’s proprietary HIBL or HiZQ memory technologies. Major Chinese tech firms have already secured hundreds of thousands of next-generation AI chips to integrate DeepSeek V4 into their cloud services, anticipating the official launch. While DeepSeek has not formally confirmed these specifics, the reported 20% increase in chip prices suggests a tight supply chain reacting to this anticipated integration.</p>

<p>telegram · zaihuapd · Apr 10, 05:16</p>

<p><strong>Background</strong>: Historically, training and running large language models (LLMs) with trillions of parameters have relied heavily on NVIDIA GPUs and their proprietary CUDA software stack due to superior compute efficiency and mature tooling. Huawei’s Ascend series, built on the DaVinci architecture, offers a domestic alternative but has faced challenges in matching CUDA’s performance and ease of use for extreme-scale models. Achieving ‘deep adaptation’ involves rewriting low-level kernels and optimizing distributed training strategies to overcome memory bottlenecks and communication latency on non-CUDA hardware.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://www.tomshardware.com/tech-industry/semiconductors/huaweis-ascend-ai-chip-ecosystem-scales">Huawei's Ascend AI chip ecosystem scales up as China pushes for semiconductor independence — however, firm lags behind on efficiency and performance | Tom's Hardware</a></li>
<li><a href="https://www.tomshardware.com/tech-industry/artificial-intelligence/huawei-ascend-npu-roadmap-examined-company-targets-4-zettaflops-fp4-performance-by-2028-amid-manufacturing-constraints">Huawei Ascend NPU roadmap examined — company targets 4 ZettaFLOPS FP4 performance by 2028, amid manufacturing constraints | Tom's Hardware</a></li>
<li><a href="https://www.microsoft.com/en-us/research/blog/deepspeed-extreme-scale-model-training-for-everyone/">DeepSpeed: Extreme-scale model training for... - Microsoft Research</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#deepseek</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#hardware-acceleration</code>, <code class="language-plaintext highlighter-rouge">#ai-chips</code>, <code class="language-plaintext highlighter-rouge">#china-tech</code></p>

<hr />

<p><a id="item-5"></a></p>
<h2 id="solayer-founder-reveals-20-of-free-llm-routers-inject-malicious-code-️-9010"><a href="https://x.com/Fried_rice/status/2042423713019412941">Solayer Founder Reveals 20% of Free LLM Routers Inject Malicious Code</a> ⭐️ 9.0/10</h2>

<p>Solayer founder Chaofan Shou released a study testing 428 LLM API routers, finding that 8 out of 400 free services actively inject malicious code or steal credentials. The research identified one compromised paid router and discovered that 17 routers accessed exposed AWS credentials, with some even stealing ETH from test private keys. These findings highlight a critical lack of end-to-end encryption in the current LLM infrastructure supply chain. This disclosure exposes a severe supply chain vulnerability where developers relying on free routing services risk having their applications hijacked or their credentials stolen. Since these routers act as man-in-the-middle proxies capable of reading plaintext JSON payloads, the potential for large-scale token billing fraud and host takeover is significant. The findings challenge the security assumptions of the growing LLM agent ecosystem, which increasingly depends on third-party infrastructure for cost optimization. Immediate action is required to audit existing dependencies, as the current state-of-the-art lacks mandatory encryption standards for these intermediaries. The study utilized a custom ‘Mine’ agent to verify four distinct attack vectors, including credential theft and code injection, against both paid and free tiers. Specific defensive measures proposed include fault-latching strategy gating and response-side anomaly screening to detect malicious modifications in real-time. The research emphasizes that while routers are designed to optimize costs by directing queries to different models, their current architecture allows unrestricted access to sensitive data in transit.</p>

<p>telegram · zaihuapd · Apr 10, 08:30</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-security</code>, <code class="language-plaintext highlighter-rouge">#llm-supply-chain</code>, <code class="language-plaintext highlighter-rouge">#infrastructure-risk</code>, <code class="language-plaintext highlighter-rouge">#cybersecurity</code>, <code class="language-plaintext highlighter-rouge">#api-vulnerability</code></p>

<hr />

<p><a id="item-6"></a></p>
<h2 id="alibabas-wan27-tops-designarena-leaderboard-with-1334-elo-rating-️-8010"><a href="https://www.qbitai.com/2026/04/399370.html">Alibaba’s Wan2.7 Tops DesignArena Leaderboard with 1334 Elo Rating</a> ⭐️ 8.0/10</h2>

<p>Alibaba’s Wan2.7 model has officially reached the number one position on the DesignArena leaderboard, achieving a competitive Elo rating of 1334. This unified model family supports both high-resolution image generation up to 4K and advanced editing capabilities, including precise control over facial features and character consistency. The ranking reflects its superior performance in crowdsourced battles against other state-of-the-art design AI models. Securing the top spot on DesignArena signifies a major leap in generative AI capabilities, particularly for professional design workflows requiring high fidelity and editability. By outperforming competitors in a crowdsourced benchmark, Wan2.7 demonstrates practical utility for creators who need to maintain character consistency and customize detailed avatars. This achievement pressures other tech giants to accelerate their own video and image generation research to remain competitive in the rapidly evolving multimodal AI landscape. The Wan2.7 model family includes variants capable of standard 2K output and Pro variants supporting 4K text-to-image generation. Key technical features include ‘Thousand Faces’ technology for unique portrait creation and robust tools for multi-image workflows and text rendering. The model is accessible via Alibaba Cloud Model Studio and third-party APIs like Kie.ai, offering both generation and editing functions in a single interface.</p>

<p>rss · 量子位 · Apr 10, 12:07</p>

<p><strong>Background</strong>: DesignArena is a crowdsourced benchmark platform that ranks AI models based on real user voting behavior using the Bradley-Terry rating system, similar to the Elo system used in chess. In this system, models compete in anonymous pairwise battles where users vote for the better output, dynamically adjusting ratings based on win-loss records against opponents of varying strength. This method provides a more reliable measure of human preference than static datasets, as it continuously evolves with community feedback and emerging model capabilities.</p>
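
<p>For reference, a standard Elo update after a single pairwise vote looks like the sketch below; this is the textbook formula, and the K-factor of 32 is an assumption rather than DesignArena's actual parameter.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def elo_update(rating_a: float, rating_b: float, a_wins: bool, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one head-to-head vote."""
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
    score_a = 1.0 if a_wins else 0.0
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta

# Example: a 1334-rated model beats a 1300-rated opponent.
print(elo_update(1334, 1300, a_wins=True))
</code></pre></div></div>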

<details><summary>References</summary>
<ul>
<li><a href="https://www.atlascloud.ai/blog/guides/next-gen-ai-powerhouse-wan-2-7-ai-image-model-everything-you-need-to-know">Next-Gen AI Powerhouse Wan 2.7 AI Image Model: Everything You Need to Know - Atlas Cloud Blog</a></li>
<li><a href="https://www.designarena.ai/leaderboard">designarena .ai/ leaderboard</a></li>
<li><a href="https://en.wikipedia.org/wiki/Elo_rating_system">Elo rating system - Wikipedia</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#video-generation</code>, <code class="language-plaintext highlighter-rouge">#generative-ai</code>, <code class="language-plaintext highlighter-rouge">#benchmarks</code>, <code class="language-plaintext highlighter-rouge">#alibaba</code>, <code class="language-plaintext highlighter-rouge">#large-models</code></p>

<hr />

<p><a id="item-7"></a></p>
<h2 id="star-action-era-wins-three-global-titles-at-embodied-ai-olympics-️-8010"><a href="https://www.qbitai.com/2026/04/399351.html">Star Action Era Wins Three Global Titles at Embodied AI Olympics</a> ⭐️ 8.0/10</h2>

<p>Star Action Era, also known as Robotera, secured three global championships at the recent Embodied AI Olympics by outperforming competitors like PI in practical robot tasks. The company demonstrated superior capabilities in logistics and warehousing scenarios using its STAR1 humanoid robot. This victory marks a significant milestone where their system excelled in autonomous navigation, obstacle avoidance, and precise grasping compared to other entries. This achievement validates Star Action Era’s technology stack just months after securing a massive $140 million Series A+ round led by Geely Capital. By proving superiority in practical, real-world tasks over theoretical benchmarks, the win signals a shift in the industry towards applicable embodied AI solutions for industrial use cases. It positions the Chinese startup as a serious contender against established global players in the rapidly growing humanoid robotics market. The success suggests that their approach to dexterous manipulation and complex environment interaction is currently state-of-the-art. The winning STAR1 robot is specifically optimized for logistics and warehousing, featuring dexterous arms capable of identifying item types and executing precise grasps. The system demonstrated full autonomy in navigating complex warehouse environments and avoiding dynamic obstacles without human intervention. While specific performance metrics were not detailed in the summary, the competition focused on practical utility rather than simulated scores, highlighting the robot’s readiness for deployment.</p>

<p>rss · 量子位 · Apr 10, 10:32</p>

<p><strong>Background</strong>: Embodied AI refers to artificial intelligence systems that possess a physical body, allowing them to interact with and learn from the real world through sensors and actuators. The concept of embodied cognition suggests that intelligence is deeply shaped by an organism’s bodily state and capacities, a principle now applied to robotics. Competitions like the Embodied AI Olympics serve as critical benchmarks to measure progress in moving robots from controlled labs to unstructured real-world environments. Star Action Era, or Robotera, recently gained attention for its strong industrial backing from major automakers like Geely and BAIC.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://www.humanoidsdaily.com/feed/robotera-secures-140m-series-a-backed-by-automakers-geely-and-baic-claims-70m-in-orders">Robotera Secures $140M Series A+ Backed by Automakers Geely and BAIC, Claims $70M in Orders | Humanoids Daily</a></li>
<li><a href="https://www.robotera.com/en/">ROBOTERA</a></li>
<li><a href="https://en.wikipedia.org/wiki/Embodied_cognition">Embodied cognition - Wikipedia</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#embodied-ai</code>, <code class="language-plaintext highlighter-rouge">#robotics</code>, <code class="language-plaintext highlighter-rouge">#benchmarks</code>, <code class="language-plaintext highlighter-rouge">#ai-competition</code>, <code class="language-plaintext highlighter-rouge">#industry-news</code></p>

<hr />

<p><a id="item-8"></a></p>
<h2 id="chinese-open-source-ai-models-dominate-silicon-valley-with-10x-cost-efficiency-️-8010"><a href="https://www.qbitai.com/2026/04/398807.html">Chinese Open-Source AI Models Dominate Silicon Valley with 10x Cost Efficiency</a> ⭐️ 8.0/10</h2>

<p>Chinese open-source AI models have reportedly captured significant market share in Silicon Valley, offering a cost-performance ratio more than ten times better than existing alternatives. This shift has garnered public praise from Yann LeCun, the Chief AI Scientist at Meta, who highlighted the efficiency of these new models. The trend marks a pivotal moment where Chinese-developed open weights are becoming the preferred choice for developers in the US tech hub. This development signifies a major reversal in the global AI landscape, challenging the long-held dominance of US-based proprietary models. The drastic improvement in cost-efficiency could democratize access to advanced AI capabilities, allowing startups and smaller enterprises to deploy powerful models without prohibitive costs. Furthermore, endorsement by a figure like LeCun suggests that the technical quality of Chinese open-source efforts has reached a level that competes with or exceeds state-of-the-art Western models. Long-term, this could reshape supply chains for AI infrastructure and influence the direction of future open-source research globally. The core metric driving this adoption is a claimed 10x improvement in the cost-performance ratio compared to previous industry standards. While specific model names are not detailed in the summary, the focus is on ‘open-source’ weights that allow for local deployment and fine-tuning. The validation from Yann LeCun serves as a critical technical signal, implying these models perform robustly on complex benchmarks despite their lower cost. Developers in Silicon Valley are reportedly switching to these models to reduce inference costs while maintaining high output quality.</p>

<p>rss · 量子位 · Apr 10, 08:22</p>

<p><strong>Background</strong>: Open-source AI models refer to neural networks whose architecture and trained parameters (weights) are publicly available, allowing anyone to download, run, and modify them. Historically, the most capable large language models (LLMs) were developed by US companies like OpenAI, Google, and Anthropic, often kept as closed-source APIs. In recent years, Chinese entities such as Alibaba, DeepSeek, and others have released competitive open-weight models, fostering a global community of developers who optimize these models for various hardware. Yann LeCun is a Turing Award winner and a leading advocate for open science in AI, making his support particularly influential in the community.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#industry-trends</code>, <code class="language-plaintext highlighter-rouge">#china-ai</code>, <code class="language-plaintext highlighter-rouge">#cost-efficiency</code></p>

<hr />

<p><a id="item-9"></a></p>
<h2 id="developer-reports-60-performance-bug-in-cublas-on-rtx-5090-️-8010"><a href="https://old.reddit.com/r/MachineLearning/comments/1shtv0r/d_60_matmul_performance_bug_in_cublas_on_rtx_5090/">Developer Reports 60% Performance Bug in cuBLAS on RTX 5090</a> ⭐️ 8.0/10</h2>

<p>A developer has identified a critical performance bug in NVIDIA’s cuBLAS library version 13.3.0 where batched FP32 matrix multiplications on the RTX 5090 GPU utilize only about 40% of available compute capacity. Testing across matrix sizes from 256x256 to 8192x8192 revealed that a custom kernel outperforms the library by 20% to 70%, indicating the library dispatches an inefficient kernel for these workloads. This issue appears specific to non-Pro RTX GPUs, as professional cards like the Pro 6000 and H200 achieve significantly higher utilization rates. This discovery is significant because cuBLAS is the standard high-performance linear algebra library used by most deep learning frameworks, meaning many users may be unknowingly suffering from severe performance degradation on new consumer hardware. The inefficiency directly impacts training times and inference throughput for models relying on batched operations, potentially wasting expensive computational resources. It highlights a disparity in optimization priority between NVIDIA’s consumer RTX line and their professional data center GPUs. If unaddressed, this could force developers to write and maintain custom CUDA kernels to achieve expected hardware performance. The bug persists in the latest software stack, including CUDA 13.2.51, cuBLAS 13.3.0, and driver 595.58.03, with previous versions performing even worse. The author demonstrated that a simple custom kernel using TMA (Tensor Memory Accelerator) double-buffering can beat cuBLAS by 46-65% in batched modes on the RTX 5090. While the custom kernel reaches 80-120% of the performance of a properly selected kernel on professional hardware, there remains a small 5% gap attributed to SASS scheduling complexities.</p>

<p>rss · r/MachineLearning · Apr 10, 17:51</p>

<p><strong>Background</strong>: cuBLAS is NVIDIA’s optimized implementation of the Basic Linear Algebra Subprograms (BLAS) API, widely used to accelerate matrix operations essential for machine learning. Batched matrix multiplication involves performing many independent matrix multiplications simultaneously, a common pattern in processing sequences or small images in neural networks. Typically, library functions like <code class="language-plaintext highlighter-rouge">cublasGemmStridedBatched</code> automatically select the best underlying GPU kernel based on matrix size and hardware architecture. However, this report suggests that for consumer RTX cards, the automatic selection logic fails to choose the most efficient kernel for certain FP32 workloads.</p>
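
<p>A quick way to sanity-check batched FP32 throughput on a given card is to time <code class="language-plaintext highlighter-rouge">torch.bmm</code>, which dispatches to cuBLAS batched GEMM on NVIDIA GPUs. The sketch below is a generic benchmark with arbitrary sizes, not the report's harness.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import time
import torch

def batched_matmul_tflops(batch=32, n=1024, iters=20):
    a = torch.randn(batch, n, n, device="cuda", dtype=torch.float32)
    b = torch.randn(batch, n, n, device="cuda", dtype=torch.float32)
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        torch.bmm(a, b)                       # routed to cuBLAS batched GEMM on NVIDIA GPUs
    torch.cuda.synchronize()
    elapsed = time.perf_counter() - start
    flops = 2.0 * batch * n ** 3 * iters      # multiply-add count for batched GEMM
    return flops / elapsed / 1e12

if torch.cuda.is_available():
    print(f"{batched_matmul_tflops():.1f} TFLOPS")
</code></pre></div></div>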

<details><summary>References</summary>
<ul>
<li><a href="https://developer.nvidia.com/blog/cublas-strided-batched-matrix-multiply/">Pro Tip: cuBLAS Strided Batched Matrix Multiply | NVIDIA Technical...</a></li>
<li><a href="https://www.rightnowai.co/guides/cuda-operations/batch-gemm">CUDA Batched Matrix Multiplication Guide | RightNow AI | RightNow...</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#gpu-performance</code>, <code class="language-plaintext highlighter-rouge">#machine-learning-infrastructure</code>, <code class="language-plaintext highlighter-rouge">#nvidia</code>, <code class="language-plaintext highlighter-rouge">#optimization</code></p>

<hr />

<p><a id="item-10"></a></p>
<h2 id="glm-51-open-model-tops-code-arena-rankings-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1shq4ty/glm_51_tops_the_code_arena_rankings_for_open/">GLM-5.1 Open Model Tops Code Arena Rankings</a> ⭐️ 8.0/10</h2>

<p>Z.ai’s latest open-weight model, GLM-5.1, has secured the number one position in code arena rankings for open models. This post-training upgrade delivers a 28% improvement in coding performance over its predecessor, GLM-5, through refined reinforcement learning techniques. The model retains the original 754B parameter Mixture-of-Experts (MoE) architecture with 40B activated parameters and supports a 200K context window. This achievement marks a significant milestone where an open-weight model now matches or surpasses proprietary alternatives in specialized coding tasks, potentially reshaping developer tooling ecosystems. It suggests that high-performance coding assistance can be deployed locally or via cost-effective APIs, reducing reliance on closed-source giants like GitHub Copilot. For the open-source community, this validates the viability of large-scale MoE architectures for specific domain excellence without requiring full parameter activation. Long-term, this could accelerate the adoption of local LLMs in integrated development environments (IDEs) for privacy-sensitive enterprises. Despite its top ranking, analysis indicates that GLM-5.1 is relatively expensive compared to other open-weight non-reasoning models of similar size and exhibits slower inference speeds. The model is noted to be very verbose in its outputs, which may impact token usage costs and readability in certain applications. It is currently available for integration into Z.ai’s Coding Agent across Max, Pro, and Lite user tiers, allowing flexible switching between models.</p>

<p>rss · r/LocalLLaMA · Apr 10, 15:40</p>

<p><strong>Background</strong>: GLM (Generalized Language Model) is a series of large language models developed by Z.ai, known for their strong bilingual capabilities in English and Chinese. The ‘Code Arena’ refers to benchmarking platforms where various AI models are tested on programming tasks to evaluate their ability to generate, debug, and explain code. Mixture-of-Experts (MoE) is an architectural design that allows large models to activate only a subset of parameters for each input, improving efficiency while maintaining high capacity. Recent trends show a growing demand for open-weight models that can run locally or on private clouds to ensure data sovereignty.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://www.together.ai/models/glm-51">GLM - 5 . 1 API | Together AI</a></li>
<li><a href="https://artificialanalysis.ai/models/glm-5-1-non-reasoning">GLM - 5 . 1 - Intelligence, Performance &amp; Price Analysis</a></li>
<li><a href="https://docs.z.ai/devpack/using5.1">Using GLM - 5 . 1 in Coding Agent - Overview - Z.AI DEVELOPER...</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#coding</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#benchmarks</code>, <code class="language-plaintext highlighter-rouge">#glm</code></p>

<hr />

<p><a id="item-11"></a></p>
<h2 id="glm-51-matches-opus-in-agentic-benchmarks-at-one-third-the-cost-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1shus54/glm_51_crushes_every_other_model_except_opus_in/">GLM-5.1 Matches Opus in Agentic Benchmarks at One-Third the Cost</a> ⭐️ 8.0/10</h2>

<p>A community benchmark using the OpenClaw framework reveals that GLM-5.1 achieves performance levels comparable to Opus 4.6 in real-world agentic tasks. The testing shows GLM-5.1 costs approximately $0.4 per run, which is about one-third of the $1.2 per run cost for Opus. This model outperforms all other tested competitors in this specific evaluation of autonomous task execution. This development significantly shifts the cost-effectiveness frontier for developers building AI agents, offering top-tier performance without the premium price tag of market leaders. It challenges the assumption that the highest-performing models must always be the most expensive, potentially democratizing access to advanced agentic capabilities. If validated across broader use cases, this could force competitors to lower prices or improve efficiency to remain viable. The result highlights a growing trend where specialized post-training upgrades deliver disproportionate value for specific workflows like long-horizon software development. The benchmark utilized OpenClaw to test models in a real environment with user-submitted tasks, employing an LLM-as-a-judge methodology similar to Chatbot Arena. While GLM-5.1 excelled, the report notes that Qwen 3.6 also performed well but currently appears less cost-effective due to a lack of prompt caching support on OpenRouter. The full methodology and leaderboard are available for public verification, emphasizing dynamic testing over static benchmark scores which the author distrusts.</p>

<p>rss · r/LocalLLaMA · Apr 10, 18:23</p>

<p><strong>Background</strong>: GLM-5.1 is a flagship open-source model from Z.ai designed specifically for agentic engineering and long-horizon tasks, featuring a 744-billion parameter Mixture-of-Experts architecture. Unlike traditional benchmarks that measure static knowledge, agentic benchmarks evaluate an AI’s ability to plan, execute tools, and solve complex problems over extended periods. OpenClaw is an open-source framework that allows these agents to interact with real platforms and messaging services to perform actual work rather than simulated queries. This shift towards evaluating ‘doing’ rather than just ‘knowing’ represents the current cutting edge in Large Language Model assessment.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://z.ai/blog/glm-5.1">GLM - 5.1 : Towards Long-Horizon Tasks</a></li>
<li><a href="https://openclaw.ai/">OpenClaw — Personal AI Assistant</a></li>
<li><a href="https://www.buildfastwithai.com/blogs/glm-5-1-open-source-review-2026">GLM - 5.1 : #1 Open Source AI Model ? Full Review (2026)</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#glm-5.1</code>, <code class="language-plaintext highlighter-rouge">#agentic-ai</code>, <code class="language-plaintext highlighter-rouge">#llm-benchmarks</code>, <code class="language-plaintext highlighter-rouge">#cost-efficiency</code>, <code class="language-plaintext highlighter-rouge">#local-llama</code></p>

<hr />

<p><a id="item-12"></a></p>
<h2 id="developer-releases-9b-lora-model-achieving-89-autonomous-data-analysis-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1shlk5v/model_release_i_trained_a_9b_model_to_be_agentic/">Developer Releases 9B LoRA Model Achieving 89% Autonomous Data Analysis</a> ⭐️ 8.0/10</h2>

<p>A developer has released a specialized LoRA adapter for the Qwen3.5-9B based model ‘CoPaw-Flash-9B’ that enables fully autonomous data analysis workflows. While the base model failed 100% of tasks by stopping after a single step, this fine-tuned version completes 89.7% of complex workflows without human intervention by planning, coding, and debugging in a continuous loop. The model was trained on massive multi-step trace datasets covering finance, education, and sports scenarios rather than standard instruction tuning. This release demonstrates that small models under 10B parameters can achieve true agency through targeted weight training rather than relying on massive external prompting frameworks. It significantly lowers the hardware barrier for running capable agentic systems, allowing junior-level data analyst performance on consumer GPUs with as little as 6GB to 24GB of VRAM. This challenges the prevailing industry assumption that only large-scale models can handle open-ended, multi-step reasoning tasks effectively. If scaled to other domains like software engineering or research, this methodology could democratize access to powerful local AI agents. The model requires specific inference frameworks to handle the tool-calling loop, with VRAM usage ranging from approximately 6GB in 4-bit quantization to 22GB in bf16 precision on a single GPU. Testing was conducted on 29 real Kaggle datasets with a context window of 128K and a maximum of 50 turns, where the adapted model averaged 26 autonomous iterations per task. The LoRA weights and the necessary inference code are available openly on Hugging Face and GitHub, though the creator is currently seeking compute sponsorship to expand this approach to coding and research agents.</p>
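
<p>Loading a LoRA adapter on top of a base model generally follows the Hugging Face peft pattern sketched below. The repository IDs are placeholders, since the post does not state exact model paths, and the full agentic tool-calling loop still requires the author's inference code.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "Qwen/Qwen3.5-9B"             # placeholder: exact base repo not confirmed
ADAPTER = "author/copaw-flash-lora"  # placeholder adapter repo

tokenizer = AutoTokenizer.from_pretrained(BASE)
base = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(base, ADAPTER)   # attach the LoRA weights

prompt = "Load sales.csv and report the top three products by revenue."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(out[0], skip_special_tokens=True))
</code></pre></div></div>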

<p>rss · r/LocalLLaMA · Apr 10, 12:47</p>

<p><strong>Background</strong>: Qwen3.5 is part of the Qwen series of large language models developed by Alibaba, known for offering dense and Mixture-of-Experts architectures in various sizes including 9B parameters. In the context of AI, ‘agentic’ refers to systems capable of autonomously planning and executing multi-step tasks using tools like code interpreters without constant human guidance. Traditionally, smaller models have struggled with long-horizon tasks, often halting prematurely or failing to debug their own code, which necessitated complex external orchestration layers to manage the workflow. LoRA (Low-Rank Adaptation) is a popular fine-tuning technique that allows developers to adapt large pre-trained models efficiently without retraining all parameters.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://qwen.ai/blog?id=qwen3">Qwen3 : Think Deeper, Act Faster</a></li>
<li><a href="https://github.com/QwenLM/Qwen3">GitHub - QwenLM/ Qwen3 : Qwen3 is the large language model series...</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#agentic-ai</code>, <code class="language-plaintext highlighter-rouge">#lora</code>, <code class="language-plaintext highlighter-rouge">#model-release</code>, <code class="language-plaintext highlighter-rouge">#data-analysis</code></p>

<hr />

<p><a id="item-13"></a></p>
<h2 id="community-effort-to-reverse-engineer-gemma-4-mtp-capabilities-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1shgo1x/update_on_gemma_4_having_mtp_reverse_engineering/">Community Effort to Reverse Engineer Gemma 4 MTP Capabilities</a> ⭐️ 8.0/10</h2>

<p>A researcher has successfully extracted model weights from Gemma 4 that contain hidden Multi-Token Prediction (MTP) capabilities. The author is now soliciting help from the community, particularly C++ developers, to reverse engineer these compiled TFLite graphs into a usable PyTorch module. The extracted files, including a Graphdef JSON and quantized INT8 weights, have been published on HuggingFace for collaborative analysis. Unlocking MTP in Gemma 4 could significantly boost inference speed by allowing the model to predict multiple future tokens simultaneously rather than sequentially. If successful, this effort would enable local LLM users to leverage advanced decoding efficiencies currently restricted to Google’s proprietary implementations. This breakthrough aligns with broader industry trends where open-source communities work to democratize access to cutting-edge architectural features found in closed models. The extracted model appears to be quantized in INT8, which may require de-quantization techniques if Google utilized Quantization-Aware Training (QAT). The researcher suggests using Google’s AI Edge Model Explorer to visualize the graph and references previous Gemini Nano conversion efforts as a potential roadmap. A JSON representation of the Graphdef is available in the repository to assist large language models or developers in parsing the structure.</p>
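
<p>For anyone who wants to inspect the published artifacts, TensorFlow's TFLite interpreter can list tensor names, shapes, and quantization parameters, which is a reasonable first step before attempting a PyTorch port. The file name below is a placeholder for whichever .tflite file the HuggingFace repository contains, and actually executing the graph would additionally require all of its ops to be supported.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import tensorflow as tf

# Placeholder path: substitute the actual .tflite file from the HuggingFace repo.
interpreter = tf.lite.Interpreter(model_path="gemma4_mtp_extracted.tflite")

# Dump tensor names, shapes, dtypes, and INT8 quantization scales for inspection.
for detail in interpreter.get_tensor_details():
    quant = detail.get("quantization_parameters", {})
    print(detail["name"], detail["shape"], detail["dtype"], quant.get("scales"))
</code></pre></div></div>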

<p>rss · r/LocalLLaMA · Apr 10, 08:31</p>

<p><strong>Background</strong>: Multi-Token Prediction (MTP) is a training strategy where models learn to predict several tokens at once, improving decoding efficiency compared to standard next-token prediction. Gemma 4 is Google’s latest family of open models designed for advanced reasoning, available in various sizes including a 31B parameter version. While the architecture supports these features, they are often distributed in compiled formats like TFLite that are difficult for the general PyTorch community to modify or integrate.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://www.emergentmind.com/topics/multi-token-parallel-prediction">Multi - Token Parallel Prediction</a></li>
<li><a href="https://ai.google.dev/gemma/docs/core">Gemma 4 model overview - Google AI for Developers</a></li>
<li><a href="https://deepmind.google/models/gemma/gemma-4/">Gemma 4 — Google DeepMind</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#gemma</code>, <code class="language-plaintext highlighter-rouge">#reverse-engineering</code>, <code class="language-plaintext highlighter-rouge">#multi-token-prediction</code>, <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-14"></a></p>
<h2 id="turboquant-and-triattention-combine-for-68x-kv-cache-reduction-in-llamacpp-on-amd-hip-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1shzjwx/turboquant_triattention_chip_68_total_kv_cache/">TurboQuant and TriAttention Combine for 6.8x KV Cache Reduction in llama.cpp on AMD HIP</a> ⭐️ 8.0/10</h2>

<p>A developer has successfully integrated TurboQuant compression and TriAttention pruning into llama.cpp for AMD HIP, achieving a combined 6.8x reduction in KV cache memory usage. In tests with the Qwen3.5-27B model on an RX 7900 XTX, this combination reduced the cache size from 8.2 GiB to approximately 1.2 GiB at a 131K context window. The implementation is written entirely in C/ggml, requiring no Python runtime, and includes pre-built calibration stats for the Qwen3 family. This breakthrough significantly lowers the hardware barrier for running large language models with extensive context windows on consumer-grade AMD GPUs. By reducing memory requirements by nearly 7x, it enables local deployment of powerful models that previously required enterprise-level VRAM capacity. This development directly competes with NVIDIA-centric optimizations, diversifying the ecosystem for local LLM inference and making high-performance AI more accessible to non-NVIDIA users. The minimal 1-2% speed overhead suggests these efficiency gains come without sacrificing real-time performance. The TurboQuant component alone provides a ~5.1x reduction, while TriAttention with 75% retention adds a further ~1.33x reduction. Performance benchmarks show a GSM8K score of 72.0% compared to 66% for standard f16, with negligible perplexity changes and successful needle-in-a-haystack retrieval up to 64K context. Currently, three users are testing this implementation on Strix Halo and RDNA3 architectures, marking it as the only known HIP/ROCm version of TurboQuant for llama.cpp.</p>

<p>rss · r/LocalLLaMA · Apr 10, 21:18</p>

<p><strong>Background</strong>: KV cache (Key-Value cache) is a critical memory structure used during LLM inference to store past token information, allowing the model to avoid re-computing attention for previous tokens. As context windows grow larger, the KV cache can consume gigabytes of VRAM, often becoming the bottleneck for running large models on consumer hardware. TurboQuant is a recently developed compression technique by Google designed to drastically reduce model and cache sizes without accuracy loss, while TriAttention is a pruning method based on research from NVIDIA and MIT. Historically, advanced optimization features like these have appeared first on NVIDIA CUDA platforms, leaving AMD ROCm users with fewer options for efficient local inference.</p>
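
<p>A quick back-of-the-envelope check shows how the two stated factors compose into the headline figure, using only numbers quoted in the post:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Back-of-the-envelope check of the figures quoted in the post.
turboquant_factor = 5.1       # KV cache reduction from TurboQuant alone
triattention_factor = 1.33    # further reduction at 75% token retention
combined = turboquant_factor * triattention_factor
print(f"combined reduction: ~{combined:.1f}x")                  # ~6.8x

baseline_gib = 8.2            # f16 KV cache at 131K context, Qwen3.5-27B
print(f"compressed cache: ~{baseline_gib / combined:.2f} GiB")  # ~1.21 GiB
</code></pre></div></div>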

<details><summary>References</summary>
<ul>
<li><a href="https://research.google/blog/turboquant-redefining-ai-efficiency-with-extreme-compression/">TurboQuant : Redefining AI efficiency with extreme compression</a></li>
<li><a href="https://www.zdnet.com/article/what-googles-turboquant-can-and-cant-do-for-ais-spiraling-cost/">What Google's TurboQuant can and can't do for AI's spiraling cost...</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llama.cpp</code>, <code class="language-plaintext highlighter-rouge">#kv-cache</code>, <code class="language-plaintext highlighter-rouge">#amd-rocm</code>, <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#optimization</code></p>

<hr />

<p><a id="item-15"></a></p>
<h2 id="france-commits-to-replacing-windows-with-linux-for-25-million-civil-servants-️-8010"><a href="https://cybernews.com/tech/france-windows-linux/">France Commits to Replacing Windows with Linux for 2.5 Million Civil Servants</a> ⭐️ 8.0/10</h2>

<p>The French government has officially mandated the replacement of Microsoft Windows with Linux on 2.5 million civil servant desktops by autumn 2026. The directive requires all ministries to submit detailed migration plans covering collaboration tools, antivirus software, AI platforms, databases, and network equipment. It is part of a broader strategy that also includes replacing US-based video conferencing tools with a locally hosted alternative, Visio, by 2027. The migration significantly strengthens France’s digital sovereignty by reducing strategic reliance on foreign infrastructure and proprietary software ecosystems, and it sets a precedent for other nations seeking to secure government data against external surveillance or supply-chain disruptions. The shift is likely to accelerate the development of enterprise-grade Linux applications and to influence public-sector IT and cybersecurity policies elsewhere, while challenging the dominance of US tech giants in European government operations. The initiative also explicitly targets the reduction of tool fragmentation, which the government identifies as a data-security vulnerability.</p>

<p>telegram · zaihuapd · Apr 10, 12:47</p>

<p><strong>Background</strong>: Digital sovereignty refers to a nation’s ability to control its own data and technological infrastructure without dependence on foreign entities. Many European governments have increasingly viewed reliance on US-based software like Windows as a security risk due to potential backdoors or geopolitical tensions. Linux, an open-source operating system, offers a transparent alternative that allows governments to audit code and maintain full control over their computing environments. Historically, large-scale migrations from Windows to Linux in government sectors have faced challenges regarding software compatibility and user training.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#linux</code>, <code class="language-plaintext highlighter-rouge">#digital sovereignty</code>, <code class="language-plaintext highlighter-rouge">#government policy</code>, <code class="language-plaintext highlighter-rouge">#cybersecurity</code>, <code class="language-plaintext highlighter-rouge">#infrastructure</code></p>

<hr />

<p><a id="item-16"></a></p>
<h2 id="claude-models-show-identity-confusion-risk-near-context-limits-️-8010"><a href="https://news.ycombinator.com/item?id=47701233">Claude Models Show Identity Confusion Risk Near Context Limits</a> ⭐️ 8.0/10</h2>

<p>Developers have reported a critical defect in Claude models where the AI misinterprets its own internal reasoning or past outputs as new user commands. This ‘identity confusion’ occurs most frequently when the model operates near its context window limits, a region often referred to as the ‘stupid zone.’ Consequently, autonomous tools like Claude Code may execute hazardous operations, such as unauthorized deployments or file deletions, based on these hallucinated instructions. This vulnerability poses a significant security threat to the growing ecosystem of autonomous AI agents that rely on long-context interactions. If an AI agent cannot reliably distinguish between its own thoughts and user commands, it undermines the fundamental safety guarantees required for deploying automated systems in production environments. The issue highlights a potential flaw in how current large language models manage state and attention over extended sequences, which could affect various applications beyond just coding assistants. Addressing this is crucial for preventing accidental data loss or system compromise in enterprise settings. The defect specifically manifests when the model’s context usage approaches its maximum limit, leading to a degradation in instruction following capabilities. In affected scenarios, the model generates fake user authorizations by conflating its internal monologue with external input, triggering actions without explicit user consent. This behavior suggests that safety filters and boundary checks may fail under high-load context conditions, requiring developers to implement additional guardrails or limit context window usage.</p>

<p>telegram · zaihuapd · Apr 10, 14:52</p>

<p><strong>Background</strong>: Large Language Models (LLMs) like Claude process information within a fixed ‘context window,’ which limits the amount of text they can consider at one time. As models approach this limit, performance often degrades, a phenomenon sometimes colloquially called the ‘stupid zone’ where reasoning abilities diminish. Autonomous agents extend these models by allowing them to execute code or system commands, making accurate distinction between internal reasoning and external prompts vital for safety. Prompt injection is a known attack vector where malicious inputs trick models, but this specific issue arises from internal confusion rather than external attacks.</p>
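
<p>The reports do not prescribe a specific mitigation, but one plausible guardrail shape follows directly from them: gate destructive tool calls on both remaining context headroom and a genuine user confirmation. The sketch below is purely hypothetical and is not an Anthropic API.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical guardrail sketch (not an Anthropic API): refuse destructive
# tool calls when context usage is near the limit, or when the only
# "authorization" appears in model-generated text rather than a real user turn.
DESTRUCTIVE_TOOLS = {"deploy", "delete_file", "drop_table"}

def allow_tool_call(tool_name, used_tokens, max_tokens, user_confirmed):
    near_limit = used_tokens &gt; 0.85 * max_tokens   # crude "stupid zone" heuristic
    if tool_name in DESTRUCTIVE_TOOLS and (near_limit or not user_confirmed):
        return False    # demand a fresh, explicit user confirmation instead
    return True

# A deploy request at 92% context usage is blocked even if the transcript
# contains something that looks like an approval.
print(allow_tool_call("deploy", 184_000, 200_000, user_confirmed=False))  # False
</code></pre></div></div>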

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-security</code>, <code class="language-plaintext highlighter-rouge">#llm-agents</code>, <code class="language-plaintext highlighter-rouge">#claude</code>, <code class="language-plaintext highlighter-rouge">#prompt-injection</code>, <code class="language-plaintext highlighter-rouge">#autonomous-systems</code></p>

<hr />

<p><a id="item-17"></a></p>
<h2 id="cpu-z-official-website-hacked-malicious-code-injected-into-downloads-️-8010"><a href="https://m.ithome.com/html/938003.htm">CPU-Z Official Website Hacked, Malicious Code Injected into Downloads</a> ⭐️ 8.0/10</h2>

<p>CPUID confirmed that its official website was compromised for approximately six hours in the early hours of April 9 and 10, 2026. During this window, download links were redirected to malicious servers, causing some users to receive installer packages embedded with malware. The breach stemmed from an intrusion into a secondary API rather than the core signing infrastructure, so the cryptographic signatures on the original files were not forged. This incident represents a critical supply-chain attack affecting CPU-Z, a ubiquitous tool used by IT professionals and enthusiasts for hardware verification. Compromised installers pose a severe risk because users inherently trust software downloaded from official vendor sites, and such breaches undermine the integrity of the software distribution ecosystem even for established developers. Users who downloaded software during the six-hour window reported detections by Windows Defender, which helped surface the anomaly. CPUID has since patched the vulnerability and restored normal download services, but it advises anyone who downloaded files in that timeframe to scan their systems immediately.</p>

<p>telegram · zaihuapd · Apr 10, 15:38</p>

<p><strong>Background</strong>: CPU-Z is a renowned freeware utility developed by CPUID that provides detailed information about a computer’s central processing unit, motherboard, and memory. It is considered an industry standard for verifying hardware specifications and monitoring real-time performance metrics like clock speeds and voltage. Supply-chain attacks, where attackers compromise a trusted vendor to distribute malware to its customers, have become an increasingly common tactic in cybersecurity due to their high success rate. This event mirrors previous incidents where popular software repositories were hijacked to spread trojans to unsuspecting users.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cybersecurity</code>, <code class="language-plaintext highlighter-rouge">#supply-chain-attack</code>, <code class="language-plaintext highlighter-rouge">#malware</code>, <code class="language-plaintext highlighter-rouge">#software-integrity</code></p>

<hr />

<p><a id="item-18"></a></p>
<h2 id="wireguard-releases-new-windows-version-after-microsoft-signing-resolution-️-7010"><a href="https://lists.zx2c4.com/pipermail/wireguard/2026-April/009561.html">WireGuard Releases New Windows Version After Microsoft Signing Resolution</a> ⭐️ 7.0/10</h2>

<p>WireGuard has released a new version of its Windows client after Microsoft reversed the termination of the project’s code-signing account. The update follows a period of public scrutiny over the sudden loss of signing capability, which had temporarily halted secure driver deployment on Windows. The release also required extensive toolchain updates and drops support for systems older than Windows 10, aligning with modern NT programming environments. The resolution is significant because it restores a vital open-source security tool used by millions to protect network traffic on Windows, and it highlights the precarious position independent developers occupy when they depend on centralized platform authorities like Microsoft for essential infrastructure such as code signing. While WireGuard benefited from high visibility to expedite the fix, the incident raises concerns about whether less prominent projects could survive similar administrative disruptions without public outcry. The reinstatement came relatively quickly after attention on Hacker News, suggesting public pressure accelerated Microsoft’s bureaucratic process, and developers note that there are still no automated safeguards for recovering from erroneous account terminations.</p>

<p>hackernews · zx2c4 · Apr 10, 15:49</p>

<p><strong>Background</strong>: Code signing is a critical security mechanism in Windows that verifies the authenticity of software drivers and prevents unauthorized or malicious code from running at the kernel level. Microsoft controls the certificates required for this process, and if a developer’s account is terminated, their software can no longer be installed on modern Windows systems without triggering severe security warnings. Recent incidents involving other tools like VeraCrypt have shown that account terminations can occur due to administrative errors or policy violations, leaving users unable to update essential security software.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://support.microsoft.com/en-us/welcometowindows">Welcome To Windows - support.microsoft.com</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Community members expressed relief at the resolution but raised serious concerns about the reliance on public outrage to fix bureaucratic errors, questioning how smaller developers would fare in similar situations. Some users suggested that Microsoft should implement better human-review processes for high-impact accounts before enforcing terminations to prevent collateral damage to the ecosystem. Overall, the sentiment combines gratitude for WireGuard’s persistence with anxiety about the centralization of power held by platform owners over independent open-source projects.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#wireguard</code>, <code class="language-plaintext highlighter-rouge">#windows-security</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#code-signing</code>, <code class="language-plaintext highlighter-rouge">#infrastructure</code></p>

<hr />

<p><a id="item-19"></a></p>
<h2 id="chatgpt-voice-mode-runs-on-older-weaker-model-️-7010"><a href="https://simonwillison.net/2026/Apr/10/voice-mode-is-weaker/#atom-everything">ChatGPT Voice Mode Runs on Older, Weaker Model</a> ⭐️ 7.0/10</h2>

<p>Simon Willison highlights that ChatGPT’s voice mode operates on an older GPT-4o era model with a knowledge cutoff of April 2024, making it significantly less capable than the text-based versions. This observation was inspired by Andrej Karpathy’s analysis regarding the widening gap between different AI access points. Consequently, users interacting via voice receive less accurate and outdated information compared to those using the text interface. This disparity is critical because users naturally expect the conversational voice interface to represent the smartest available AI, leading to potential mistrust when it fails at simple tasks. It reveals a strategic prioritization by OpenAI where high-value B2B coding capabilities receive more development resources than consumer-facing voice features. Developers must now account for this performance gap when designing applications that rely on voice interactions versus text inputs. Furthermore, it underscores a broader industry trend where verifiable reward functions in coding drive faster model improvements compared to open-ended conversation. The voice mode explicitly reports a knowledge cutoff date of April 2024, confirming it is based on an earlier iteration of the GPT-4o architecture. Andrej Karpathy notes that domains with explicit reward functions, such as code restructuring, see dramatic strides due to easier reinforcement learning training. In contrast, voice interactions lack these clear verification metrics, resulting in a somewhat ‘orphaned’ development status for the Advanced Voice Mode.</p>

<p>rss · Simon Willison · Apr 10, 15:56</p>

<p><strong>Background</strong>: Large Language Models (LLMs) like GPT-4o are updated periodically with new data and capabilities, creating distinct versions with different knowledge cutoffs. OpenAI offers various access tiers, including free consumer tools and specialized paid APIs for enterprise tasks like coding. Reinforcement learning is a training method where models improve by receiving rewards for correct actions, which is easier to implement in coding (pass/fail tests) than in natural conversation. Understanding these architectural differences helps explain why different features within the same product may perform inconsistently.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#chatgpt</code>, <code class="language-plaintext highlighter-rouge">#voice-ai</code>, <code class="language-plaintext highlighter-rouge">#llm-capabilities</code>, <code class="language-plaintext highlighter-rouge">#openai</code>, <code class="language-plaintext highlighter-rouge">#developer-insights</code></p>

<hr />

<p><a id="item-20"></a></p>
<h2 id="shengshu-technology-raises-280m-series-b-for-general-world-model-️-7010"><a href="https://www.qbitai.com/2026/04/398772.html">Shengshu Technology Raises $280M Series B for General World Model</a> ⭐️ 7.0/10</h2>

<p>Shengshu Technology has successfully closed a Series B funding round totaling nearly 2 billion RMB (approximately $280 million). The capital will be dedicated to advancing its ‘general world model,’ a technology designed to serve as the foundational infrastructure for productivity in both digital and physical realms. This investment marks a significant financial milestone for the company as it scales its AI simulation capabilities. This substantial funding indicates strong industry confidence in ‘world models’ as the next evolutionary step beyond current generative AI applications. By targeting the integration of digital and physical workflows, Shengshu Technology aims to solve complex simulation challenges that are critical for robotics, industrial automation, and immersive content creation. If successful, this approach could shift the AI infrastructure landscape from purely content generation to actionable physical-world interaction and planning. The scale of the investment suggests that investors view general world models as a pivotal technology for future economic productivity. The funding amount is reported to be nearly 2 billion RMB, positioning this as one of the largest recent deals in the Chinese AI startup sector. The company explicitly defines its goal as building a ‘general world model’ rather than specialized vertical solutions, implying a broad scope of application. While specific technical benchmarks or model architecture details were not disclosed in the summary, the focus is on establishing a productivity foundation for diverse scenarios.</p>

<p>rss · 量子位 · Apr 10, 07:37</p>

<p><strong>Background</strong>: A ‘world model’ in artificial intelligence refers to an internal representation that an AI system uses to understand, predict, and plan within an environment, much like humans use mental models of the physical world. Unlike standard generative models that primarily create static content, world models simulate the dynamics and physics of environments to allow for reasoning and long-term planning. This concept is considered essential for achieving Artificial General Intelligence (AGI) and for deploying autonomous agents in real-world settings. The term ‘general’ in this context implies a model capable of handling diverse tasks across different domains without needing retraining for each specific scenario.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#funding</code>, <code class="language-plaintext highlighter-rouge">#world models</code>, <code class="language-plaintext highlighter-rouge">#ai industry</code>, <code class="language-plaintext highlighter-rouge">#generative ai</code>, <code class="language-plaintext highlighter-rouge">#startups</code></p>

<hr />

<p><a id="item-21"></a></p>
<h2 id="trump-administration-summons-reddit-to-grand-jury-to-unmask-ice-critic-️-7010"><a href="https://arstechnica.com/tech-policy/2026/04/trump-admin-hounds-reddit-to-reveal-identity-of-user-who-criticized-ice/">Trump Administration Summons Reddit to Grand Jury to Unmask ICE Critic</a> ⭐️ 7.0/10</h2>

<p>The Trump administration has reportedly summoned Reddit to appear before a grand jury in an effort to identify a user who criticized Immigration and Customs Enforcement (ICE). This legal maneuver marks an escalation from previous attempts, utilizing the coercive power of a grand jury to compel the platform to reveal the anonymous user’s identity. The move signifies a direct government challenge to online anonymity in cases involving criticism of federal agencies. This development is significant because it tests the limits of user anonymity and the legal protections platforms have against government overreach. If successful, this precedent could chill free speech by making users fearful that criticizing government agencies will lead to their identification and potential prosecution. It also places Reddit in a difficult position between complying with federal mandates and upholding its commitment to user privacy and trust. The outcome could reshape how social media companies handle similar subpoenas in the future. The case involves the use of a grand jury, which has broader investigative powers and stricter secrecy rules than standard civil or administrative subpoenas. Reddit has historically resisted similar requests to protect user anonymity, but a grand jury summons carries the risk of contempt charges if the company refuses to comply. The specific content of the user’s criticism and the exact legal statutes being invoked have not been fully detailed in initial reports.</p>

<p>rss · Ars Technica · Apr 10, 18:43</p>

<p><strong>Background</strong>: Grand juries are legal bodies empowered to investigate potential crimes and issue indictments, operating with significant autonomy and secrecy under the US justice system. Unlike regular court proceedings, grand jury hearings do not require the target to be present or even aware of the investigation initially. In the context of internet governance, the tension between law enforcement’s need for identification and the public’s right to anonymous speech has been a longstanding legal battleground. Previous cases have seen tech companies fight vigorously to quash subpoenas they deem overly broad or threatening to user rights.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#privacy</code>, <code class="language-plaintext highlighter-rouge">#legal</code>, <code class="language-plaintext highlighter-rouge">#policy</code>, <code class="language-plaintext highlighter-rouge">#anonymity</code>, <code class="language-plaintext highlighter-rouge">#tech-policy</code></p>

<hr />

<p><a id="item-22"></a></p>
<h2 id="ibu-boost-a-gbdt-library-using-absolute-split-rejection-️-7010"><a href="https://old.reddit.com/r/MachineLearning/comments/1shpdm2/p_ibuboost_a_gbdt_library_where_splits_are/">ibu-boost: A GBDT Library Using Absolute Split Rejection</a> ⭐️ 7.0/10</h2>

<p>A developer has released ibu-boost, an open-source Gradient Boosted Decision Tree (GBDT) library that implements the ‘Screening Is Enough’ concept from a 2026 research paper by Nakanishi. Unlike traditional libraries that always select the best relative split, ibu-boost uses an absolute-threshold screening transform to automatically reject nodes where no candidate split meets a statistical significance criterion. This approach eliminates the need for tuning the arbitrary ‘min_gain_to_split’ hyperparameter found in standard implementations. This innovation matters because it shifts split selection from a relative ranking system to an absolute quality control mechanism, potentially reducing overfitting in noisy or high-dimensional datasets where spurious splits are common. By removing the need to manually tune gain thresholds, it simplifies the model optimization workflow and makes GBDTs more robust across diverse data distributions without dataset-specific hyperparameter tweaking. Although current benchmarks show a performance gap compared to mature libraries like LightGBM on clean data, the architecture promises significant advantages in scenarios prone to over-splitting. If the planned learnable threshold parameters succeed, this could represent a fundamental improvement in how decision trees handle uncertainty. The library supports both non-oblivious and oblivious (CatBoost-style symmetric) tree types, featuring Triton GPU kernels that achieve a 51x speedup over NumPy references for specific kernel operations. Current benchmarks on the California Housing dataset show an RMSE of 0.5286, which is approximately 12% higher than LightGBM, indicating the project is still in an early alpha stage. Key features include built-in diagnostics for acceptance rates and a parameter search tool for the screening temperature and width, which are currently fixed scalars but slated to become learnable parameters.</p>

<p>rss · r/MachineLearning · Apr 10, 15:12</p>

<p><strong>Background</strong>: Gradient Boosted Decision Trees (GBDT) are a popular machine learning technique that builds models sequentially, where each new tree corrects errors made by previous ones. Standard implementations like XGBoost and LightGBM determine split points by calculating the ‘gain’ for every possible split and selecting the one with the highest relative improvement, even if that improvement is negligible. To prevent splitting on noise, users must manually set a ‘min_gain_to_split’ parameter, which requires careful tuning for each specific dataset. The ‘Screening Is Enough’ paper proposes replacing this relative comparison with a statistical screening test that absolutely rejects splits lacking sufficient evidence, a concept originally applied to Transformers but now adapted here for tree structures.</p>
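
<p>The contrast between relative and absolute split selection can be illustrated with a small sketch. The soft acceptance test driven by a threshold and temperature is modeled on the post’s description of the screening transform, not taken from ibu-boost’s actual code.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Sketch of absolute split screening vs. relative best-split selection.
# The soft acceptance test with a threshold and temperature is modeled on the
# post's description of the screening transform, not on ibu-boost's code.
import math

def best_relative_split(gains):
    # Standard GBDT behaviour: always take the top-ranked candidate split.
    return max(range(len(gains)), key=lambda i: gains[i])

def screened_split(gains, threshold, temperature=1.0):
    # Accept a split only if its gain clears an absolute bar; otherwise the
    # node becomes a leaf, with no per-dataset min_gain_to_split tuning.
    def accept_prob(gain):
        return 1.0 / (1.0 + math.exp(-(gain - threshold) / temperature))
    best = best_relative_split(gains)
    return best if accept_prob(gains[best]) &gt;= 0.5 else None

gains = [0.02, 0.03, 0.015]                    # all candidate splits are weak
print(best_relative_split(gains))              # 1   (splits anyway)
print(screened_split(gains, threshold=0.10))   # None (node is rejected)
</code></pre></div></div>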

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#machine learning</code>, <code class="language-plaintext highlighter-rouge">#gbdt</code>, <code class="language-plaintext highlighter-rouge">#open source</code>, <code class="language-plaintext highlighter-rouge">#research implementation</code>, <code class="language-plaintext highlighter-rouge">#algorithm optimization</code></p>

<hr />

<p><a id="item-23"></a></p>
<h2 id="gemma-4-fixes-reasoning-budgets-and-tool-calling-templates-updated-️-7010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1shs6sx/more_gemma4_fixes_in_the_past_24_hours/">Gemma 4 Fixes: Reasoning Budgets and Tool Calling Templates Updated</a> ⭐️ 7.0/10</h2>

<p>In the past 24 hours, llama.cpp merged a critical fix for Gemma 4’s reasoning budget functionality via pull request #21697. Additionally, Google released new Jinja2 chat templates specifically designed to enable correct tool calling for the Gemma 4 model family, including the 31B, 27B, E4B, and E2B variants. These updates address immediate deployment blockers for developers attempting to use advanced agentic features locally. These fixes are essential because they unlock the full potential of Gemma 4’s architecture for complex reasoning and autonomous agent tasks on local hardware. Without the corrected chat templates and reasoning budget parameters, the models cannot properly execute tool calls or manage their internal thinking processes, rendering key features useless. This ensures that the open-source community can immediately leverage Google’s latest MoE models for practical applications without waiting for official binary updates. It signifies a rapid response from both the framework maintainers and Google to stabilize the ecosystem around this new release. Users must explicitly specify the new template files using the <code class="language-plaintext highlighter-rouge">--chat-template-file</code> argument in llama.cpp unless they download a freshly updated GGUF file containing the embedded template. The provided configuration example demonstrates how to set specific parameters like <code class="language-plaintext highlighter-rouge">reasoning_budget: 4096</code> and <code class="language-plaintext highlighter-rouge">enable_thinking: true</code> for different model presets such as ‘thinking-coding’ versus standard ‘instruct’ modes. The fix applies to various quantized versions, but manual template selection remains necessary for older GGUF downloads to ensure compatibility with the new tool calling standards.</p>

<p>rss · r/LocalLLaMA · Apr 10, 16:52</p>

<p><strong>Background</strong>: Gemma 4 is Google DeepMind’s latest family of open models, released in April 2026, featuring advanced capabilities for reasoning and agentic workflows built on the Gemini 3 architecture. The series includes Mixture-of-Experts (MoE) variants like E4B and E2B, which require specific handling for their sparse activation patterns during inference. Chat templates written in Jinja2 are crucial for instruct models as they define how user inputs, system prompts, and tool definitions are formatted before being sent to the model. The ‘reasoning budget’ is a control mechanism that limits the number of tokens the model can generate for its internal ‘thinking’ process before producing a final answer.</p>
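
<p>As a hedged sketch of how this might look in practice, the snippet below launches llama-server with an explicit template file (the flag quoted in the post) and passes the per-request thinking controls mentioned there. The GGUF and template file names are placeholders, and the request-side field name <code class="language-plaintext highlighter-rouge">chat_template_kwargs</code> is an assumption; only <code class="language-plaintext highlighter-rouge">reasoning_budget: 4096</code> and <code class="language-plaintext highlighter-rouge">enable_thinking: true</code> come from the post itself.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hedged sketch, not verified against any particular server build.
# Example launch (shell), with placeholder file names:
#   llama-server -m gemma-4-27b.Q4_K_M.gguf \
#       --chat-template-file gemma4_tool_calling.jinja
import json
import urllib.request

body = {
    "model": "gemma-4-27b",
    "messages": [{"role": "user", "content": "List three prime numbers."}],
    "chat_template_kwargs": {          # assumed request field name
        "enable_thinking": True,       # values quoted in the post
        "reasoning_budget": 4096,
    },
}
req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(body).encode(),
    headers={"Content-Type": "application/json"},
)
resp = json.loads(urllib.request.urlopen(req).read())
print(resp["choices"][0]["message"]["content"])
</code></pre></div></div>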

<details><summary>References</summary>
<ul>
<li><a href="https://deepmind.google/models/gemma/gemma-4/">Gemma 4 — Google DeepMind</a></li>
<li><a href="https://zhuanlan.zhihu.com/p/2023911278964405216">Google Gemma 4 完全指南：技术规格与手机端部署教程 - 知乎</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#gemma-4</code>, <code class="language-plaintext highlighter-rouge">#llama.cpp</code>, <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#tool-calling</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-24"></a></p>
<h2 id="new-open-source-suite-simplifies-high-quality-gguf-quantization-️-7010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1shysbc/tool_for_creating_your_own_highquality_gguf/">New Open-Source Suite Simplifies High-Quality GGUF Quantization</a> ⭐️ 7.0/10</h2>

<p>Developer Thireus has released the GGUF-Tool-Suite, an open-source project featuring comprehensive documentation and a web UI to streamline the creation of custom GGUF quantized models. This tool allows users to automatically benchmark and generate GGUF files of any size specifically optimized for ik_llama.cpp and standard llama.cpp frameworks. Early testing indicates that the suite produces higher-quality quantizations compared to other popular existing releases, particularly when utilizing ik_llama.cpp recipes. This release significantly lowers the barrier to entry for developers and enthusiasts who wish to create custom quantizations tailored to their specific hardware constraints. By automating the complex benchmarking and conversion workflow, it enables the local LLM community to achieve better performance-to-size ratios without needing deep expertise in quantization algorithms. The ability to produce superior quality models directly impacts the feasibility of running large language models on consumer-grade GPUs and CPUs. Furthermore, it fosters innovation by allowing users to experiment with different quantization strategies for emerging models like Kimi-K2.5 and GLM-5.1. The suite provides both a command-line interface (CLI) for automation and a user-friendly web UI hosted at gguf.thireus.com for interactive use. It is explicitly validated to work with ik_llama.cpp and standard llama.cpp, with support for benchmarking upcoming models like Kimi-K2.5 and GLM-5.1 planned for the near future. Users can access the full source code and documentation via the project’s GitHub repository to inspect the underlying recipes and processes.</p>

<p>rss · r/LocalLLaMA · Apr 10, 20:49</p>

<p><strong>Background</strong>: GGUF (GPT-Generated Unified Format) is a file format designed for storing large language models in a way that is efficient for inference, particularly within the llama.cpp ecosystem. Quantization is the process of reducing the precision of a model’s weights (e.g., from 16-bit floating point to 4-bit integers) to decrease file size and memory usage while attempting to maintain accuracy. Tools like llama.cpp allow these quantized models to run efficiently on consumer hardware, but creating high-quality custom quantizations traditionally requires complex manual configuration and benchmarking. The new tool suite aims to abstract away this complexity, making advanced model optimization accessible to a broader audience.</p>
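
<p>Rough size arithmetic illustrates the trade-off the Background describes; the bits-per-weight figures below are approximate and ignore per-block scales and metadata:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Approximate bits-per-weight figures; real GGUF files add per-block scales
# and metadata, so treat these as rough estimates only.
def model_size_gib(params_billion, bits_per_weight):
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30

for label, bpw in [("f16", 16.0), ("Q8_0", 8.5), ("Q4_K_M", 4.8)]:
    print(f"27B @ {label}: ~{model_size_gib(27, bpw):.1f} GiB")
</code></pre></div></div>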

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#quantization</code>, <code class="language-plaintext highlighter-rouge">#gguf</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code></p>

<hr />

<p><a id="item-25"></a></p>
<h2 id="local-qwen35-and-mcp-tools-replace-cloud-llms-for-web-research-️-7010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1shezi8/i_no_longer_need_a_cloud_llm_to_do_quick_web/">Local Qwen3.5 and MCP Tools Replace Cloud LLMs for Web Research</a> ⭐️ 7.0/10</h2>

<p>A Reddit user successfully configured a local AI setup using the Qwen3.5 27B model on an RTX 4090 to perform real-time web research without cloud dependencies. By integrating custom Model Context Protocol (MCP) tools for scraping and search, the system achieves approximately 40 tokens per second with a 200,000 token context window. The user has open-sourced the solution as ‘webmcp’ on GitHub and recently added support for SearXNG. This development signifies a major shift towards privacy-preserving, cost-effective AI workflows by eliminating the need to send sensitive queries to third-party cloud providers. It demonstrates that mid-sized models like Qwen3.5, when paired with efficient inference engines like llama.cpp, can now match or exceed the utility of cloud APIs for specific research tasks. Furthermore, the use of the emerging Model Context Protocol standardizes how local models interact with external data, potentially accelerating the adoption of fully offline AI agents. The setup utilizes the Qwen3.5:27B-Q3_K_M quantized model, consuming about 22GB of VRAM on an NVIDIA RTX 4090 while maintaining a massive ~200k context length. The custom MCP server leverages Playwright for browser automation and DuckDuckGo (via ddgs) for search results, converting HTML content into clean Markdown for the LLM to process. Performance metrics indicate a generation speed of roughly 40 tokens per second, which is sufficient for interactive web browsing and summarization tasks.</p>

<p>rss · r/LocalLLaMA · Apr 10, 06:51</p>

<p><strong>Background</strong>: The Model Context Protocol (MCP) is an open standard introduced by Anthropic in late 2024 to standardize connections between AI models and external tools or data sources. Prior to such protocols, connecting local Large Language Models (LLMs) to live internet data often required fragile, custom-built scripts for each specific application. Qwen3.5 is a recent iteration of Alibaba’s Qwen series, known for strong performance in coding and reasoning tasks relative to its parameter count. Running these models locally via llama.cpp allows users to bypass API rate limits and subscription costs associated with cloud services.</p>
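
<p>A stripped-down sketch of the scrape-and-convert step described in the post is shown below. It assumes the <code class="language-plaintext highlighter-rouge">playwright</code> and <code class="language-plaintext highlighter-rouge">html2text</code> packages; the actual webmcp implementation may structure this differently.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Stripped-down scrape-and-convert step: render a page with Playwright and
# turn the HTML into Markdown for the LLM. html2text stands in for whatever
# converter webmcp actually uses.
import html2text
from playwright.sync_api import sync_playwright

def fetch_page_as_markdown(url):
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url, wait_until="domcontentloaded")
        html = page.content()
        browser.close()
    converter = html2text.HTML2Text()
    converter.ignore_links = False      # keep links so the model can follow up
    return converter.handle(html)

if __name__ == "__main__":
    print(fetch_page_as_markdown("https://example.com")[:500])
</code></pre></div></div>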

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Model_Context_Protocol">Model Context Protocol - Wikipedia</a></li>
<li><a href="https://github.com/modelcontextprotocol">Model Context Protocol - GitHub</a></li>
<li><a href="https://modelcontextprotocol.io/docs/getting-started/intro">What is the Model Context Protocol ( MCP )?</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#mcp</code>, <code class="language-plaintext highlighter-rouge">#qwen</code>, <code class="language-plaintext highlighter-rouge">#web-scraping</code>, <code class="language-plaintext highlighter-rouge">#llama.cpp</code></p>

<hr />

<p><a id="item-26"></a></p>
<h2 id="community-highlights-chaos-in-reasoning-token-formats-across-llms-️-7010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1shnurl/can_we_talk_about_the_reasoning_token_format_chaos/">Community Highlights Chaos in Reasoning Token Formats Across LLMs</a> ⭐️ 7.0/10</h2>

<p>A Reddit discussion highlights the lack of standardization in reasoning token delimiters across major models like Qwen, DeepSeek, and Gemma. While Qwen and DeepSeek use <code class="language-plaintext highlighter-rouge">&lt;think&gt;</code> tags, Gemma inconsistently uses <code class="language-plaintext highlighter-rouge">&lt;|channel&gt;</code> tags or bare text without any delimiters. This fragmentation forces developers to write custom parsers for each model instead of relying on a unified standard. This inconsistency creates significant friction for developers building infrastructure tools like vLLM, which must implement model-specific flags to handle different output formats. Without industry-wide standardization, the ecosystem risks repeating the inefficiencies previously seen with chat template fragmentation. Long-term, this could slow down the adoption of reasoning models in production environments due to increased maintenance overhead and integration complexity. The post notes that vLLM attempts to mitigate this with a <code class="language-plaintext highlighter-rouge">--reasoning-parser</code> flag for specific models, but this approach requires maintainers to constantly update code for new formats. Developers working downstream with raw model outputs still face the burden of writing and maintaining unique parsing logic for every supported model. The situation mirrors previous challenges with chat templates, suggesting a recurring pattern of proprietary format adoption by major vendors.</p>

<p>rss · r/LocalLLaMA · Apr 10, 14:17</p>

<p><strong>Background</strong>: Reasoning models are a class of large language models designed to perform complex logical tasks by generating intermediate thought processes before providing a final answer. To separate these internal thoughts from the final response, models use special tokens or delimiters, similar to how chat templates structure conversations. Standardizing these formats is crucial for creating interoperable tools that can process outputs from various models without custom engineering for each one.</p>
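
<p>The maintenance burden looks roughly like the shim below: every delimiter style needs its own branch. Only the <code class="language-plaintext highlighter-rouge">&lt;think&gt;</code> format is handled here; Gemma’s <code class="language-plaintext highlighter-rouge">&lt;|channel&gt;</code> tags and bare, undelimited text would each need another branch.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># The kind of per-model shim the post complains about: each delimiter format
# needs its own branch, and new formats mean new code.
import re

THINK_RE = re.compile(r"&lt;think&gt;(.*?)&lt;/think&gt;\s*", re.DOTALL)

def split_reasoning(raw, fmt):
    """Return (reasoning, answer) for one known delimiter style."""
    if fmt in ("qwen", "deepseek"):            # &lt;think&gt;...&lt;/think&gt; style
        m = THINK_RE.search(raw)
        if m:
            return m.group(1).strip(), THINK_RE.sub("", raw).strip()
        return "", raw.strip()
    # Gemma's &lt;|channel&gt; tags, or bare undelimited text, would each need
    # another branch here -- exactly the maintenance burden described above.
    return "", raw.strip()

out = "&lt;think&gt;2 + 2 = 4, answer directly.&lt;/think&gt;The answer is 4."
print(split_reasoning(out, "qwen"))
</code></pre></div></div>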

<p><strong>Discussion</strong>: The community expresses frustration over the recurring lack of standards, comparing the current situation to past struggles with chat templates. Users question whether major companies like Google are intentionally ignoring interoperability or if there is any actual movement toward establishing a common protocol.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#reasoning-models</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#standardization</code>, <code class="language-plaintext highlighter-rouge">#local-llama</code></p>

<hr />

<p><a id="item-27"></a></p>
<h2 id="fcc-to-vote-on-banning-chinese-labs-from-us-device-testing-️-7010"><a href="https://t.me/zaihuapd/40794">FCC to Vote on Banning Chinese Labs from US Device Testing</a> ⭐️ 7.0/10</h2>

<p>The US Federal Communications Commission (FCC) has announced it will vote on April 30 on a proposal to ban all Chinese laboratories from testing electronic devices sold in the United States. This new measure expands previous restrictions that only targeted labs owned or controlled by the Chinese government, aiming to cover the approximately 75% of current testing volume still performed in China. The proposal specifically affects testing for smartphones, cameras, computers, and other equipment intended for use in the US market. This regulatory shift represents a significant escalation in US-China tech decoupling, potentially disrupting the global electronics supply chain by removing the primary testing infrastructure for a vast majority of consumer devices. Manufacturers may face increased costs and delays as they scramble to relocate testing operations to non-Chinese facilities, which may lack the immediate capacity to handle such a large volume. Furthermore, this move underscores growing geopolitical tensions where hardware security and supply chain sovereignty are becoming central to national policy, setting a precedent for further restrictions on cross-border technical services. While the FCC previously restricted 23 specific labs owned or controlled by the Chinese government, this new proposal seeks a blanket ban on all laboratories located within China regardless of ownership. Current data indicates that about 75% of electronic product testing for the US market is currently conducted in Chinese laboratories, highlighting the massive scale of the required operational shift. Before the final vote, the agency plans to discuss a simplified approval process to potentially mitigate some transitional challenges for industry stakeholders.</p>

<p>telegram · zaihuapd · Apr 10, 07:33</p>

<p><strong>Background</strong>: The FCC requires most electronic devices emitting radio frequencies, such as Wi-Fi routers and smartphones, to undergo rigorous testing to ensure they meet US technical standards and do not cause harmful interference. Historically, manufacturers have relied heavily on Telecommunication Certification Bodies (TCBs) and accredited laboratories globally, with China emerging as a dominant hub due to its manufacturing concentration and cost efficiency. Previous US actions had already begun narrowing the list of approved Chinese entities based on national security concerns, but this proposal marks a transition from targeting specific state-linked entities to excluding an entire nation’s testing infrastructure.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#regulation</code>, <code class="language-plaintext highlighter-rouge">#supply-chain</code>, <code class="language-plaintext highlighter-rouge">#hardware-security</code>, <code class="language-plaintext highlighter-rouge">#geopolitics</code>, <code class="language-plaintext highlighter-rouge">#electronics</code></p>

<hr />

<p><a id="item-28"></a></p>
<h2 id="minimax-launches-music-26-with-enhanced-agent-skills-and-free-trial-️-7010"><a href="https://www.36kr.com/newsflashes/3760667223147011">MiniMax Launches Music 2.6 with Enhanced Agent Skills and Free Trial</a> ⭐️ 7.0/10</h2>

<p>On April 10, MiniMax officially released Music 2.6, a next-generation music generation model featuring significant upgrades to its underlying engine and creative tools. This new version drastically reduces generation latency, improves musical control and acoustic quality, and introduces a new “Cover” creation function alongside dedicated Music Skills for AI Agents. To facilitate adoption, the company has launched a 14-day free global beta test for creators to experience these enhancements.</p>

<p>telegram · zaihuapd · Apr 10, 12:02</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#generative-ai</code>, <code class="language-plaintext highlighter-rouge">#audio-synthesis</code>, <code class="language-plaintext highlighter-rouge">#model-release</code>, <code class="language-plaintext highlighter-rouge">#minimax</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code></p>

<hr />

<p><a id="item-29"></a></p>
<h2 id="anthropic-temporarily-bans-then-reinstates-openclaw-developer-account-️-7010"><a href="https://x.com/steipete/status/2042615534567457102">Anthropic Temporarily Bans Then Reinstates OpenClaw Developer Account</a> ⭐️ 7.0/10</h2>

<p>Anthropic temporarily revoked the Claude API access of Peter Steinberger, a developer behind the third-party tool OpenClaw, citing suspicious activity and policy violations. Following an internal review and an appeal process initiated by the developer, Anthropic’s Safeguards Team reinstated the account. The incident highlights the immediate friction developers face when building compatibility layers for closed AI models. This incident underscores the precarious position of third-party developers who build tools on top of proprietary LLM APIs without official endorsement. It signals that AI safety enforcement mechanisms can inadvertently target legitimate engineering efforts aimed at extending model utility across different platforms. For the broader ecosystem, it raises concerns about the stability and longevity of open-source wrappers around closed models. Ultimately, it may force developers to seek more transparent communication channels with model providers to avoid future disruptions. The ban was triggered by automated systems flagging ‘suspicious signals’ associated with the account’s usage patterns, which are common when reverse-engineering or wrapping APIs. Anthropic provided a formal appeals process via email, which successfully resolved the issue after the developer clarified the nature of their project. The developer noted that ensuring future compatibility with Anthropic’s models may become increasingly difficult due to heightened scrutiny.</p>

<p>telegram · zaihuapd · Apr 10, 16:39</p>

<p><strong>Background</strong>: OpenClaw is a third-party client or wrapper designed to interact with Anthropic’s Claude models, likely offering features or interfaces not present in the official application. Proprietary AI companies like Anthropic often implement strict rate limits and behavior monitoring to prevent abuse, scraping, or unauthorized redistribution of their models. When external tools mimic human interaction or automate requests at scale, they can trigger safety safeguards designed to protect the model’s integrity and terms of service. This dynamic creates a constant tension between innovation in the developer community and the security policies of platform owners.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://claude.ai/">Claude</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#anthropic</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#ai-safety</code>, <code class="language-plaintext highlighter-rouge">#api-policy</code>, <code class="language-plaintext highlighter-rouge">#llm-ecosystem</code></p>

<hr />

<h2 id="关注动态-1">关注动态</h2>

<p><a id="item-30"></a></p>
<h2 id="memsearch-updates-3-updates--update-openclaw-capture-architecture-from-llm_output-debounce-t-bump-memsearch-to-024-and-openclaw-plugin-to-020-322-openclaw-plugin--remove-child_process-simplify-capture-f-️-10"><a href="https://github.com/zilliztech/memsearch/commit/a7db723a3a9d1fc7300d858d570b31c8002a57bc">MemSearch Updates: 3 updates — update OpenClaw capture architecture from llm_output debounce t…, bump memsearch to 0.2.4 and OpenClaw plugin to 0.2.0 (#322), OpenClaw plugin — remove child_process, simplify capture, f…</a> ⭐️ ?/10</h2>

<p>The OpenClaw plugin has been significantly refactored to remove reliance on <code class="language-plaintext highlighter-rouge">child_process</code>, resulting in a simplified and more efficient capture architecture. This update includes a shift in how LLM output debouncing is handled within the capture flow. Consequently, core MemSearch dependencies have been bumped to version 0.2.4, with the OpenClaw plugin updated to 0.2.0. Developers integrating this plugin should verify their setups for compatibility with the new process model, though no explicit breaking API changes were noted beyond the internal architectural shift.</p>

<p>rss · MemSearch Updates · Apr 10, 07:43</p>

<hr />

<p><a id="item-31"></a></p>
<h2 id="openaicodex-3-releases--rust-v01190-alpha33-rust-v01190-alpha32-rust-v01190-alpha29-️-10"><a href="https://github.com/openai/codex/releases/tag/rust-v0.119.0-alpha.33">openai/codex: 3 releases — rust-v0.119.0-alpha.33, rust-v0.119.0-alpha.32, rust-v0.119.0-alpha.29</a> ⭐️ ?/10</h2>

<p>The openai/codex repository published three consecutive alpha releases (rust-v0.119.0-alpha.29, alpha.32, and alpha.33) in rapid succession. The provided release notes only contain timestamps and version tags without specific details on functionality added, changed, or fixed. Consequently, no logical themes, breaking changes, or actionable updates can be identified from the current information. Developers should consult the full commit history or detailed changelogs for specific implementation details.</p>

<p>github · github-actions[bot] · Apr 10, 19:51</p>

<hr />

<p><a id="item-32"></a></p>
<h2 id="anthropicsclaude-code-2-releases--v21101-v21100-️-10"><a href="https://github.com/anthropics/claude-code/releases/tag/v2.1.101">anthropics/claude-code: 2 releases — v2.1.101, v2.1.100</a> ⭐️ ?/10</h2>

<p>The repository released two new versions, v2.1.100 and v2.1.101, in quick succession. The provided release notes do not specify any new features, bug fixes, or breaking changes included in these updates. Without detailed changelogs, it is unclear what functional modifications were made or if any action is required from developers.</p>

<p>github · ashwin-ant · Apr 10, 19:03</p>

<hr />

<h2 id="github-热榜-1">GitHub 热榜</h2>

<p><a id="item-33"></a></p>
<h2 id="microsoft-releases-bitnet-for-efficient-1-bit-llm-inference-️-10010"><a href="https://github.com/microsoft/BitNet">Microsoft Releases BitNet for Efficient 1-Bit LLM Inference</a> ⭐️ 10.0/10</h2>

<p>Microsoft has officially released bitnet.cpp, a specialized inference framework designed to run 1-bit Large Language Models like BitNet b1.58 on consumer hardware. The latest update introduces parallel kernel implementations and configurable tiling, delivering up to 2.1x additional speedups across ARM and x86 CPUs. This release also marks the availability of optimized GPU kernels and official pre-trained models on Hugging Face. This framework solves critical deployment bottlenecks by enabling lossless inference of ternary models with significantly reduced memory footprint and energy consumption. By achieving speedups of up to 6.17x on x86 CPUs and reducing energy usage by over 80%, it makes running massive 100B parameter models feasible on single local devices. This shifts the paradigm for edge AI, allowing complex LLM tasks to be performed without relying on expensive cloud infrastructure. BitNet achieves inference speeds comparable to human reading (5-7 tokens per second) for 100B models on a single CPU while cutting energy consumption by up to 82.2%. The framework is built upon llama.cpp but replaces standard matrix multiplication kernels with specialized ternary operations optimized for 1.58-bit weights. Recent optimizations include support for 4-bit activations and NPU integration planned for future releases.</p>

<p>rss · GitHub Trending - Python · Apr 10, 01:39</p>

<p><strong>Background</strong>: Traditional Large Language Models require substantial GPU resources and memory, making local deployment on consumer devices nearly impossible for large-scale architectures. BitNet addresses this by utilizing a 1.58-bit representation where weights are ternary (-1, 0, 1), drastically reducing computational complexity and storage needs. Prior solutions often suffered from significant accuracy drops during quantization, but BitNet’s architecture is trained specifically for this low-precision format to maintain lossless performance.</p>
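
<p>Conceptually, ternary weights make the inner loop cheap: with values restricted to -1, 0, and +1, a matrix-vector product reduces to additions and subtractions. The NumPy sketch below only illustrates the idea; bitnet.cpp implements it with packed, hand-optimized CPU and GPU kernels.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Why ternary (1.58-bit) weights are cheap: with values in {-1, 0, +1} a
# matrix-vector product needs only additions and subtractions. This NumPy
# version is conceptual; bitnet.cpp uses packed, hand-optimized kernels.
import numpy as np

rng = np.random.default_rng(0)
W = rng.integers(-1, 2, size=(4, 8))           # ternary weight matrix
x = rng.standard_normal(8).astype(np.float32)

y_ref = W @ x                                   # ordinary matmul, for reference
y_add = np.where(W == 1, x, 0).sum(axis=1) - np.where(W == -1, x, 0).sum(axis=1)

print(np.allclose(y_ref, y_add))                # True
</code></pre></div></div>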

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/microsoft/BitNet">GitHub - microsoft/ BitNet : Official inference framework for 1-bit...</a></li>
<li><a href="https://bitnet.live/">BitNet - Official Inference Framework for 1-bit LLMs</a></li>
<li><a href="https://dev.to/bspann/bitnet-microsofts-1-bit-llms-that-run-on-your-cpu-20h8">BitNet : Microsoft's 1-Bit LLMs That Run on Your CPU</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The AI engineering community is particularly excited about the potential to run 100B parameter models on local CPUs, viewing this as a major breakthrough for privacy-focused and offline applications. Developers are actively benchmarking the new parallel kernels against standard llama.cpp quantizations to verify the claimed efficiency gains on diverse hardware setups.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#inference</code>, <code class="language-plaintext highlighter-rouge">#quantization</code>, <code class="language-plaintext highlighter-rouge">#ai-infrastructure</code>, <code class="language-plaintext highlighter-rouge">#microsoft</code></p>

<hr />

<p><a id="item-34"></a></p>
<h2 id="karpathy-releases-minimal-llm-training-in-pure-c-and-cuda-️-10010"><a href="https://github.com/karpathy/llm.c">Karpathy Releases Minimal LLM Training in Pure C and CUDA</a> ⭐️ 10.0/10</h2>

<p>Andrej Karpathy has released llm.c, a dependency-free implementation of large language model training written entirely in simple C and CUDA. This project strips away complex frameworks to expose the raw mechanics of transformer training directly on the GPU. It serves as a standalone educational tool rather than a production-ready inference engine like Alibaba’s RTP-LLM. This project matters because it demystifies the ‘black box’ of modern deep learning frameworks for AI engineers. By implementing backpropagation and attention mechanisms from scratch, it provides unparalleled insight into low-level optimization and memory management. It fills a critical niche for developers who need to understand the fundamental mathematics and hardware interaction without the abstraction layers of PyTorch or TensorFlow. The codebase is minimal, avoiding external dependencies to ensure every line of logic is visible and auditable. It focuses specifically on the training loop of GPT-like models using raw CUDA kernels for performance. Unlike general NLP resources, this is a concrete, executable reference for building LLMs from the ground up.</p>

<p>rss · GitHub Trending - CUDA · Apr 10, 01:33</p>

<p><strong>Background</strong>: Large Language Models are typically trained using high-level frameworks that obscure the underlying computational graph and memory operations. While resources exist explaining the theory, few provide a complete, working implementation in low-level languages. llm.c addresses this gap by offering a transparent view into how tensors, gradients, and optimizers function at the hardware level.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Large_language_model">Large language model - Wikipedia</a></li>
<li><a href="https://www.geeksforgeeks.org/artificial-intelligence/large-language-model-llm/">What is a Large Language Model ( LLM ) - GeeksforGeeks</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The AI engineering community views this release as an essential educational resource for mastering low-level deep learning internals. Discussions highlight its value for debugging custom layers and understanding performance bottlenecks that frameworks often hide.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#c</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#education</code></p>

<hr />

<p><a id="item-35"></a></p>
<h2 id="instant-ngp-revolutionizes-nerf-training-speed-with-cuda-️-10010"><a href="https://github.com/NVlabs/instant-ngp">Instant-NGP Revolutionizes NeRF Training Speed with CUDA</a> ⭐️ 10.0/10</h2>

<p>NVIDIA’s Instant-NGP introduces a high-performance framework that trains neural graphics primitives in seconds rather than hours. It achieves this breakthrough by utilizing optimized CUDA kernels and multi-resolution hash encodings to drastically reduce computational overhead. This project solves the primary bottleneck of Neural Radiance Fields (NeRF), which previously required prohibitive training times for practical application. By enabling near-instantaneous training, it transforms NeRF from a research curiosity into a viable tool for real-time 3D content creation and robotics. The efficiency gains allow developers to iterate on 3D scenes rapidly without needing massive compute clusters. The core innovation lies in its use of a trainable multi-resolution hash table to encode spatial coordinates, replacing heavy MLPs with lightweight lookups. It is built entirely on custom CUDA kernels designed for maximum throughput on NVIDIA GPUs, supporting both training and inference at interactive frame rates.</p>
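
<p>The core trick is easiest to see in a simplified sketch: each resolution level hashes integer grid coordinates into a small trainable feature table, and the concatenated lookups feed a tiny MLP. The NumPy sketch below uses the spatial-hash primes from the paper but skips trilinear interpolation and training entirely; it is not NVIDIA’s CUDA implementation.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

# Simplified multi-resolution hash encoding: nearest-vertex lookup only,
# no interpolation, no training. The real version lives in fused CUDA
# kernels inside tiny-cuda-nn.
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)  # spatial-hash primes from the paper

def hash_encode(xyz, tables, base_res=16, growth=1.5):
    """xyz: (N, 3) points in [0, 1); tables: list of (T, F) feature arrays."""
    feats = []
    for level, table in enumerate(tables):
        res = int(base_res * growth ** level)
        grid = np.floor(xyz * res).astype(np.uint64)        # integer grid vertex per point
        idx = np.bitwise_xor.reduce(grid * PRIMES, axis=1)   # spatial hash of the vertex
        feats.append(table[idx % table.shape[0]])            # (N, F) trainable feature lookup
    return np.concatenate(feats, axis=1)                     # (N, levels * F) fed to a tiny MLP

rng = np.random.default_rng(0)
tables = [rng.standard_normal((2**14, 2)).astype(np.float32) for _ in range(4)]
print(hash_encode(rng.random((5, 3)), tables).shape)  # (5, 8)
</code></pre></div></div>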

<p>rss · GitHub Trending - CUDA · Apr 10, 01:33</p>

<p><strong>Background</strong>: Prior to Instant-NGP, standard NeRF implementations relied on deep neural networks that took many hours or even days to converge on a single scene. This latency hindered adoption in dynamic environments where quick scene reconstruction is essential. Instant-NGP fills this niche by providing an infrastructure that makes high-fidelity 3D reconstruction accessible for time-sensitive workflows.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Neural_radiance_field">Neural radiance field - Wikipedia</a></li>
<li><a href="https://medium.com/swlh/nerf-neural-radiance-fields-79531da37734">Understanding NeRF : Neural Radiance Fields | by Varun... | Medium</a></li>
<li><a href="https://theaisummer.com/nerf/">How Neural Radiance Fields ( NeRF ) and Instant Neural Graphics...</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The AI and graphics communities widely regard this repository as the new standard baseline for neural rendering research and production pipelines. Developers frequently cite its ability to run on consumer-grade hardware as a key factor in democratizing 3D AI technology.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#nerf</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#3d-vision</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#computer-graphics</code></p>

<hr />

<p><a id="item-36"></a></p>
<h2 id="sageattention-delivers-2-5x-speedup-via-quantization-️-10010"><a href="https://github.com/thu-ml/SageAttention">SageAttention Delivers 2-5x Speedup via Quantization</a> ⭐️ 10.0/10</h2>

<p>SageAttention introduces a novel quantized attention mechanism that accelerates inference for language, image, and video models. It achieves significant performance gains of 2-5x over FlashAttention while maintaining end-to-end model accuracy. This optimization is designed to be production-ready for efficient large-scale deployment. This project addresses the critical bottleneck of high computational costs in transformer-based models by reducing memory bandwidth requirements through quantization. Unlike previous methods that often sacrifice accuracy for speed, SageAttention preserves key performance metrics, making it viable for sensitive applications. Its compatibility across diverse modalities ensures broad applicability in modern AI infrastructure. Consequently, it represents a major step forward for cost-effective and scalable LLM operations. The method leverages specific CUDA optimizations to handle quantized tensors efficiently without decompression overhead during the attention calculation. Benchmarks indicate consistent speedups across various model architectures including those for text generation and video understanding. The project is highlighted as a spotlight paper at major conferences like ICLR, ICML, and NeurIPS in 2025.</p>
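
<p>Usage is intended to be a drop-in swap for a standard attention call. The sketch below follows the pattern described in the project’s README; the exact keyword arguments (such as <code class="language-plaintext highlighter-rouge">tensor_layout</code>) may vary between releases, so treat it as an assumption-laden illustration rather than a verified API reference.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch
from sageattention import sageattn  # assumes `pip install sageattention` and a CUDA GPU

# Hedged sketch: sageattn is described as a drop-in replacement for
# scaled-dot-product attention; argument names here follow the README
# and may differ across versions.
batch, heads, seq, dim = 2, 16, 4096, 128
q = torch.randn(batch, heads, seq, dim, dtype=torch.float16, device="cuda")
k = torch.randn(batch, heads, seq, dim, dtype=torch.float16, device="cuda")
v = torch.randn(batch, heads, seq, dim, dtype=torch.float16, device="cuda")

out = sageattn(q, k, v, tensor_layout="HND", is_causal=False)
print(out.shape)  # same shape as q
</code></pre></div></div>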

<p>rss · GitHub Trending - CUDA · Apr 10, 01:33</p>

<p><strong>Background</strong>: As large language models grow in size, the attention mechanism becomes a primary contributor to latency and memory usage, often limiting real-time deployment. FlashAttention previously set a standard by optimizing IO awareness, yet further gains require reducing numerical precision without degrading results. SageAttention fills this niche by applying aggressive quantization strategies that maintain mathematical fidelity. This approach builds upon prior research into low-precision computing but offers a more robust solution for production environments.</p>

<p><strong>Discussion</strong>: The AI engineering community is closely monitoring this release as a potential successor to FlashAttention for high-throughput inference servers. Early discussions focus on verifying the claimed speedups across different hardware generations and integrating the library into existing serving stacks like vLLM.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#optimization</code>, <code class="language-plaintext highlighter-rouge">#quantization</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code></p>

<hr />

<p><a id="item-37"></a></p>
<h2 id="nous-research-launches-self-improving-hermes-agent-framework-️-9010"><a href="https://github.com/NousResearch/hermes-agent">Nous Research Launches Self-Improving Hermes Agent Framework</a> ⭐️ 9.0/10</h2>

<p>Nous Research has released Hermes Agent, a novel AI framework featuring a built-in learning loop that allows the system to create skills from experience and persist knowledge across sessions. Unlike static agents, it autonomously improves its capabilities through user interaction and supports deployment on diverse infrastructures ranging from $5 VPS instances to serverless environments. The framework also introduces a unified gateway for multi-platform communication including Telegram, Discord, and CLI interfaces. This project addresses the critical limitation of current AI agents that forget context and fail to improve over time without manual retraining. By implementing a closed learning loop with autonomous skill creation and memory nudges, Hermes enables truly persistent and evolving digital assistants. Its architecture decouples the agent from specific hardware, allowing cost-effective scaling via serverless backends like Modal or Daytona. This represents a significant step toward production-ready, self-optimizing autonomous systems that adapt to individual user workflows. Hermes Agent supports over 200 models via OpenRouter and allows seamless switching between providers without code changes. It features a robust terminal interface with multiline editing, slash-command autocomplete, and the ability to spawn isolated subagents for parallel task execution. The system includes a built-in cron scheduler for natural language automations and utilizes FTS5 session search combined with LLM summarization for deep cross-session recall.</p>

<p>rss · GitHub Trending - Daily · Apr 10, 01:32</p>

<p><strong>Background</strong>: Most existing AI agent frameworks operate as stateless wrappers around large language models, requiring external vector databases for memory and lacking mechanisms for genuine self-improvement. Prior solutions often struggle with context retention across long-running sessions and require complex infrastructure management for deployment. Hermes Agent fills this niche by integrating memory management, skill evolution, and flexible deployment directly into the core architecture. It builds upon Nous Research’s reputation for high-quality open weights models to provide a cohesive ecosystem for autonomous agents.</p>

<p><strong>Discussion</strong>: Early adopters are praising the framework’s ability to run efficiently on low-cost infrastructure while maintaining sophisticated self-improvement capabilities. Developers are particularly interested in the ‘Honcho’ dialectic user modeling feature and the potential for generating training trajectories for future tool-calling models.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#self-improving-ai</code>, <code class="language-plaintext highlighter-rouge">#nous-research</code>, <code class="language-plaintext highlighter-rouge">#autonomous-systems</code></p>

<hr />

<p><a id="item-38"></a></p>
<h2 id="voxcpm2-tokenizer-free-multilingual-tts-and-voice-cloning-️-9010"><a href="https://github.com/OpenBMB/VoxCPM">VoxCPM2: Tokenizer-Free Multilingual TTS and Voice Cloning</a> ⭐️ 9.0/10</h2>

<p>OpenBMB has released VoxCPM2, a 2-billion parameter text-to-speech model that eliminates traditional discrete tokenizers in favor of a diffusion autoregressive architecture. This update expands support to 30 languages and introduces ‘Voice Design,’ allowing users to generate unique voices from natural language descriptions without reference audio. The model now delivers 48kHz studio-quality output and supports controllable cloning with style guidance for emotion and pace. By removing the tokenizer bottleneck, VoxCPM2 achieves higher fidelity and more natural prosody compared to traditional two-stage TTS systems that often suffer from information loss during quantization. The ability to design voices via text prompts democratizes voice creation for developers who lack large datasets of reference recordings. Furthermore, its end-to-end nature simplifies the deployment pipeline, making high-quality multilingual synthesis more accessible for real-time applications. This represents a significant shift towards more flexible and expressive generative audio models. The model is built on the MiniCPM-4 backbone and was trained on over 2 million hours of multilingual speech data. It features four distinct modes: multilingual generation, voice design, controllable cloning, and ultimate cloning for seamless continuation from reference audio. Production-ready assets include live Hugging Face demos, comprehensive ReadTheDocs documentation, and pre-trained weights available on ModelScope.</p>

<p>rss · GitHub Trending - Daily · Apr 10, 01:32</p>

<p><strong>Background</strong>: Traditional Text-to-Speech (TTS) systems typically rely on converting text into discrete tokens before synthesizing audio, a process that can limit expressiveness and introduce artifacts. VoxCPM addresses this by directly generating continuous speech representations, bridging the gap between large language models and high-fidelity audio generation. This approach fills a niche for developers needing robust, tokenizer-free solutions for complex multilingual and creative voice tasks.</p>

<p><strong>Discussion</strong>: The project has garnered significant attention for its tokenizer-free architecture and the practical utility of its voice design feature. Developers are actively discussing integration strategies on Discord and Feishu, particularly regarding latency optimization for real-time use cases.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#text-to-speech</code>, <code class="language-plaintext highlighter-rouge">#voice-cloning</code>, <code class="language-plaintext highlighter-rouge">#multilingual-ai</code>, <code class="language-plaintext highlighter-rouge">#generative-audio</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code></p>

<hr />

<p><a id="item-39"></a></p>
<h2 id="dflash-enables-efficient-parallel-drafting-for-llm-speculative-decoding-️-9010"><a href="https://github.com/z-lab/dflash">DFlash Enables Efficient Parallel Drafting for LLM Speculative Decoding</a> ⭐️ 9.0/10</h2>

<p>DFlash introduces a lightweight block diffusion model specifically designed to accelerate speculative decoding in large language models. It replaces traditional sequential drafting with high-quality parallel token generation, significantly reducing inference latency. The project provides pre-trained draft models for major architectures like Qwen3.5, Llama-3.1, and Kimi-K2.5. Speculative decoding is critical for reducing the time-to-first-token and overall latency in production LLM deployments, but existing draft models often struggle with quality or speed trade-offs. DFlash’s block diffusion approach allows for generating multiple coherent tokens simultaneously without sacrificing acceptance rates. This directly addresses the bottleneck of serial autoregressive generation, making high-throughput inference more accessible on standard hardware. The system supports integration with popular backends including Transformers, SGLang, and vLLM (nightly build). Pre-trained weights are available for a wide range of model sizes, from 4B to over 100B parameters, covering both general chat and coding specialists. The developers plan to release training recipes soon, enabling users to create custom draft models for any target LLM.</p>
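
<p>Independent of DFlash’s own API, the verification half of speculative decoding can be sketched in a few lines: the target model scores the drafted block once, and tokens are accepted up to the first disagreement. The greedy version below is a generic illustration; DFlash’s contribution is producing the drafted block in parallel with a block diffusion model rather than with a sequential drafter.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch

def verify_draft(target_logits, draft_tokens):
    """Greedy speculative-decoding verification (generic sketch, not DFlash code).

    target_logits: (k, vocab) target-model logits at the k drafted positions
    draft_tokens:  (k,) token ids proposed by the draft model
    """
    target_choice = target_logits.argmax(dim=-1)            # what the target itself would emit
    agree = target_choice.eq(draft_tokens).long()
    n_accept = int(agree.cumprod(dim=0).sum().item())        # length of the leading run of agreements
    # The target also supplies its own token at the first rejected position
    # (or at the final position if the whole block was accepted).
    correction = target_choice[min(n_accept, draft_tokens.numel() - 1)]
    return n_accept, correction

logits = torch.randn(4, 32000)                 # toy target logits for a 4-token draft
drafted = logits.argmax(dim=-1).clone()
drafted[2] = (drafted[2] + 1) % 32000          # force a disagreement at position 2
print(verify_draft(logits, drafted))           # accepts 2 tokens, then corrects
</code></pre></div></div>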

<p>rss · GitHub Trending - Python · Apr 10, 01:39</p>

<p><strong>Background</strong>: Large language models typically generate text token-by-token, creating a significant latency bottleneck for real-time applications. Speculative decoding attempts to mitigate this by using a smaller ‘draft’ model to propose tokens that a larger ‘target’ model then verifies. However, conventional draft models still operate sequentially, limiting the maximum theoretical speedup. DFlash fills this niche by applying diffusion probabilistic models to generate blocks of tokens in parallel, fundamentally changing the drafting mechanism to be non-autoregressive.</p>

<p><strong>Discussion</strong>: As a newly released project with a high trending score, the community is currently focusing on evaluating its performance benchmarks against established methods like Medusa or standard small-model drafting. Users are actively requesting support for additional model families and awaiting the promised open-source training recipes.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#speculative-decoding</code>, <code class="language-plaintext highlighter-rouge">#inference-optimization</code>, <code class="language-plaintext highlighter-rouge">#diffusion-models</code>, <code class="language-plaintext highlighter-rouge">#pytorch</code></p>

<hr />

<p><a id="item-40"></a></p>
<h2 id="open-webui-self-hosted-interface-for-local-and-cloud-llms-️-9010"><a href="https://github.com/open-webui/open-webui">Open WebUI: Self-Hosted Interface for Local and Cloud LLMs</a> ⭐️ 9.0/10</h2>

<p>Open WebUI has emerged as a leading self-hosted interface that seamlessly integrates Ollama and OpenAI-compatible APIs into a single dashboard. It now features a built-in inference engine for RAG pipelines and supports extensive customization through plugins. The platform offers effortless deployment via Docker and Kubernetes, catering to both local offline usage and enterprise environments. This project solves the fragmentation problem where developers must switch between different tools to manage local models versus cloud APIs. By providing a unified, production-ready UI, it significantly accelerates the workflow for testing, deploying, and interacting with various Large Language Models. Its ability to operate entirely offline makes it critical for privacy-sensitive applications and air-gapped development environments. Furthermore, the extensibility allows teams to tailor the interface to specific operational needs without building from scratch. Key capabilities include native support for Ollama and OpenAI standards, built-in RAG functionality for document interaction, and robust role-based access control. The system is designed for easy installation using containerized technologies like Docker and Helm charts. It also supports custom theming and branding, making it suitable for internal enterprise portals or public-facing services.</p>
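
<p>Because the stack standardizes on OpenAI-compatible APIs, any backend it fronts can also be exercised programmatically with the stock <code class="language-plaintext highlighter-rouge">openai</code> client. The sketch below assumes a local Ollama instance serving its OpenAI-compatible endpoint; the URL and model tag are placeholders for whatever a given deployment actually exposes.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from openai import OpenAI

# Hedged sketch: point the standard openai client at an OpenAI-compatible
# local backend of the kind Open WebUI manages. base_url and model are
# assumptions about a local Ollama install; substitute your own endpoint.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="not-needed-locally")

resp = client.chat.completions.create(
    model="llama3.2",  # hypothetical local model tag
    messages=[{"role": "user", "content": "In one sentence, what does a RAG pipeline do?"}],
)
print(resp.choices[0].message.content)
</code></pre></div></div>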

<p>rss · GitHub Trending - Python · Apr 10, 01:39</p>

<p><strong>Background</strong>: As the ecosystem of Local LLM runners like Ollama expanded, users lacked a cohesive, feature-rich frontend that matched the capabilities of cloud providers like ChatGPT. Existing solutions were often limited to basic chat interfaces without support for complex workflows like Retrieval-Augmented Generation (RAG) or multi-model management. Open WebUI fills this niche by offering a comprehensive platform that bridges the gap between raw model APIs and end-user usability. It effectively democratizes access to advanced AI features for self-hosted infrastructure.</p>

<p><strong>Discussion</strong>: The community highly praises the project for its rapid iteration and active development team, noting it as the de facto standard for self-hosted LLM interfaces. Users frequently highlight the ease of setting up RAG pipelines and the responsiveness of the developers to feature requests on Discord and GitHub.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#ollama</code>, <code class="language-plaintext highlighter-rouge">#ai-interface</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code></p>

<hr />

<p><a id="item-41"></a></p>
<h2 id="apache-airflow-industry-standard-workflow-orchestration-️-9010"><a href="https://github.com/apache/airflow">Apache Airflow: Industry-Standard Workflow Orchestration</a> ⭐️ 9.0/10</h2>

<p>Apache Airflow continues to solidify its position as the dominant open-source platform for programmatically authoring, scheduling, and monitoring workflows. Recent updates focus on scalability and enhanced UI capabilities for managing complex data and machine learning pipelines. Its code-first approach ensures that workflows remain versionable, testable, and collaborative across engineering teams. For AI engineers, reliable orchestration is critical because ML pipelines involve intricate dependencies between data ingestion, preprocessing, training, and deployment steps. Airflow transforms these fragile sequences into robust, monitored DAGs (Directed Acyclic Graphs) that automatically handle retries and failure alerts. By treating workflows as code, organizations reduce operational debt and enable seamless collaboration between data scientists and infrastructure engineers. This makes it an essential component of production-grade MLOps infrastructure despite not being an ML-specific framework. The platform allows users to define workflows as Python code, leveraging dynamic pipeline generation and extensive operator libraries for cloud services. It features a rich web UI for monitoring task status, visualizing dependencies, and troubleshooting failed runs in real-time. The architecture supports scaling from single-node setups to large distributed clusters using various executors like Celery or Kubernetes.</p>
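
<p>The code-first model is compact in practice: a pipeline is just a decorated Python module that the scheduler picks up as a DAG. The sketch below shows a minimal three-step ML-flavored pipeline using the TaskFlow API of recent Airflow 2.x releases; the task bodies are placeholders.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from datetime import datetime

from airflow.decorators import dag, task

@dag(schedule="@daily", start_date=datetime(2026, 1, 1), catchup=False, tags=["ml"])
def train_pipeline():
    @task
    def ingest():
        return "/tmp/raw.parquet"           # placeholder extract step

    @task
    def preprocess(raw_path):
        return raw_path.replace("raw", "clean")

    @task
    def train(clean_path):
        print(f"training on {clean_path}")  # stand-in for a real training job

    train(preprocess(ingest()))             # dependencies: ingest -> preprocess -> train

train_pipeline()
</code></pre></div></div>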

<p>rss · GitHub Trending - Python · Apr 10, 01:39</p>

<p><strong>Background</strong>: Before tools like Airflow, data teams often relied on cron jobs or custom scripts that lacked visibility, error handling, and dependency management. Airflow filled this niche by introducing a centralized scheduler and a UI specifically designed for complex directed acyclic graphs. Unlike earlier static configuration tools, Airflow’s dynamic Python-based definition allows for programmatic workflow generation, making it adaptable to changing data landscapes. It has since become the de facto standard for orchestrating batch and streaming data processes in modern data stacks.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Workflow">Workflow - Wikipedia</a></li>
<li><a href="https://www.ibm.com/think/topics/workflow">What is a workflow ? - IBM</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project boasts a massive community with high commit activity and extensive documentation, ensuring rapid bug fixes and a vast ecosystem of plugins. Active engagement on Slack and GitHub indicates strong support for both new users and advanced contributors navigating complex orchestration challenges.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#orchestration</code>, <code class="language-plaintext highlighter-rouge">#data-engineering</code>, <code class="language-plaintext highlighter-rouge">#mlops</code>, <code class="language-plaintext highlighter-rouge">#workflow</code>, <code class="language-plaintext highlighter-rouge">#python</code></p>

<hr />

<p><a id="item-42"></a></p>
<h2 id="daytona-secure-infrastructure-for-ai-code-execution-️-9010"><a href="https://github.com/daytonaio/daytona">Daytona: Secure Infrastructure for AI Code Execution</a> ⭐️ 9.0/10</h2>

<p>Daytona introduces an open-source platform featuring isolated sandboxes that spin up in under 90ms to execute untrusted AI-generated code. It provides full composable computers with dedicated kernels and filesystems, supporting Python, TypeScript, and JavaScript workloads. The platform includes SDKs, APIs, and stateful snapshots to manage complex agent lifecycles programmatically. This tool addresses a critical security gap in LLM Ops by preventing potentially harmful AI-generated code from accessing host resources or sensitive data. Unlike traditional container solutions, Daytona is specifically optimized for the ephemeral and parallel nature of AI agent workflows. Its ability to retain state across sessions via snapshots enables more sophisticated, multi-step autonomous agents. This allows engineers to deploy generative AI features in production with significantly reduced risk of sandbox escapes or resource exhaustion. Daytona sandboxes offer complete isolation with allocated vCPU, RAM, and disk, built on OCI/Docker compatibility for massive parallelization. Developers can interact with these environments using comprehensive SDKs, a CLI, and REST APIs for process execution and filesystem operations. The platform supports organizational governance controls and system-level webhooks for lifecycle management.</p>
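
<p>The lifecycle is short by design: create a sandbox, run the untrusted snippet inside it, read the captured output, and dispose of the sandbox. The sketch below loosely follows the Python SDK’s published quick-start; the exact import path, method names, and cleanup call are assumptions and may differ between SDK versions.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from daytona import Daytona  # assumption: `pip install daytona` plus a DAYTONA_API_KEY env var

# Hedged sketch of the documented quick-start flow; method and field names
# follow the SDK docs loosely and may differ between releases.
daytona = Daytona()

sandbox = daytona.create()                                   # isolated sandbox with its own kernel/fs
result = sandbox.process.code_run('print(sum(range(10)))')   # run untrusted code inside it
print(result.result)                                         # stdout captured from the sandbox

daytona.delete(sandbox)                                       # assumption: release the sandbox when done
</code></pre></div></div>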

<p>rss · GitHub Trending - TypeScript · Apr 10, 01:41</p>

<p><strong>Background</strong>: As AI agents become more capable, executing their generated code safely has become a major bottleneck for production deployment. Existing solutions often lack the speed, isolation guarantees, or state persistence required for dynamic agent workflows. Daytona fills this niche by providing an elastic runtime designed specifically for the unpredictability of LLM outputs. It shifts the paradigm from static CI/CD pipelines to dynamic, secure execution environments tailored for autonomous systems.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://grokipedia.com/page/llmops">LLMOps</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-infrastructure</code>, <code class="language-plaintext highlighter-rouge">#code-sandboxing</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#security</code>, <code class="language-plaintext highlighter-rouge">#llm-ops</code></p>

<hr />

<p><a id="item-43"></a></p>
<h2 id="executor-unifies-ai-agent-tool-integration-️-9010"><a href="https://github.com/RhysSullivan/executor">Executor Unifies AI Agent Tool Integration</a> ⭐️ 9.0/10</h2>

<p>Executor introduces a centralized runtime and catalog that allows AI agents to securely discover and execute tools from OpenAPI, MCP, GraphQL, and custom sources via a single interface. It provides both a web UI for management and an MCP server mode for seamless integration with agents like Claude Code and Cursor. This project solves the critical fragmentation problem in AI agent workflows by eliminating the need to build custom integrations for every new API or tool source. By acting as a universal translation layer, it enables developers to scale agent capabilities without managing complex authentication and schema parsing logic for each individual service. The built-in security sandbox and pause/resume functionality further address production reliability concerns often overlooked in prototype-stage agent frameworks. The tool supports first-party integration with OpenAPI, GraphQL, MCP, and Google Discovery specs, while allowing custom plugins for other sources. Users can manage tools via a local web dashboard or CLI, and agents interact through a typed TypeScript runtime or standard MCP protocol.</p>

<p>rss · GitHub Trending - TypeScript · Apr 10, 01:41</p>

<p><strong>Background</strong>: Prior to Executor, AI engineers had to manually write glue code to connect agents to diverse APIs, often resulting in inconsistent error handling and security vulnerabilities. Existing solutions were typically limited to specific protocols or lacked a unified catalog for cross-agent sharing. Executor fills this niche by providing a standardized, secure execution environment that abstracts away the complexity of heterogeneous tool sources.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://developer.ukg.com/proplatform/docs/approval-and-workflow-nodes">Approval and Workflow Nodes - developer.ukg.com</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters are highlighting the ease of connecting legacy OpenAPI services to modern LLM agents without writing boilerplate code. The project’s active Discord community is currently focusing on expanding the library of pre-configured source plugins.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#mcp</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#typescript</code></p>

<hr />

<p><a id="item-44"></a></p>
<h2 id="superset-orchestrates-multiple-ai-coding-agents-locally-️-9010"><a href="https://github.com/superset-sh/superset">Superset Orchestrates Multiple AI Coding Agents Locally</a> ⭐️ 9.0/10</h2>

<p>Superset introduces a unified local code editor designed to run and manage multiple AI coding agents like Claude Code and Codex simultaneously. It utilizes isolated git worktrees to allow parallel execution without context switching or interference between tasks. The tool includes built-in terminal monitoring, diff viewing, and one-click handoff to external IDEs. This project addresses the emerging bottleneck where developers must manually switch contexts to manage multiple autonomous coding agents. By isolating tasks in separate worktrees, it prevents file conflicts and allows engineers to orchestrate an ‘army’ of agents efficiently on a single machine. This significantly reduces idle time and accelerates the development workflow for complex, multi-threaded coding tasks. Key features include parallel execution of 10+ agents, automatic environment setup via workspace presets, and universal compatibility with any CLI-based agent. The interface provides real-time status tracking and notifications when agents require human attention or review. It is specifically built for local, worktree-based development workflows on macOS.</p>

<p>rss · GitHub Trending - TypeScript · Apr 10, 01:41</p>

<p><strong>Background</strong>: As AI coding agents become more prevalent, developers face challenges in managing concurrent tasks without causing merge conflicts or losing context. Prior solutions often required manual terminal management or lacked a unified view for multiple active agents. Superset fills this niche by providing a dedicated orchestration layer that treats AI agents as parallel workers within a controlled git environment.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://www.autonomous.ai/">Autonomous | AI-Powered Hardware for Work</a></li>
<li><a href="https://www.autonomous.ai/standing-desks/autonomous-desk-eureka">Autonomous Desk 2 - Home Office Standing Desk</a></li>
<li><a href="https://www.autonomous.ai/intern">Autonomous Intern: Personal AI device</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#code-editor</code>, <code class="language-plaintext highlighter-rouge">#autonomous-coding</code>, <code class="language-plaintext highlighter-rouge">#llm-orchestration</code></p>

<hr />

<p><a id="item-45"></a></p>
<h2 id="deepgemm-delivers-optimized-fp8-matrix-multiplication-for-cuda-️-9010"><a href="https://github.com/deepseek-ai/DeepGEMM">DeepGEMM Delivers Optimized FP8 Matrix Multiplication for CUDA</a> ⭐️ 9.0/10</h2>

<p>DeepSeek AI has released DeepGEMM, a library providing clean and efficient FP8 general matrix multiplication (GEMM) kernels specifically optimized for CUDA architectures. This release includes support for fine-grained scaling, a critical feature for maintaining precision in low-bit computing. It addresses the growing demand for high-performance primitives required by modern large language model training and inference workflows. As large language models scale, the industry is shifting towards FP8 precision to reduce memory bandwidth bottlenecks and accelerate computation without significant accuracy loss. DeepGEMM fills a critical gap by offering production-grade kernels that handle the complexities of fine-grained scaling, which many existing libraries lack or implement inefficiently. This enables engineers to maximize GPU utilization and reduce training costs for next-generation models. By open-sourcing these optimizations, the project lowers the barrier for implementing state-of-the-art mixed-precision techniques in custom deep learning stacks. The library focuses on delivering high-throughput GEMM operations using FP8 data types with fine-grained per-block scaling factors. It is designed explicitly for NVIDIA CUDA architectures, ensuring deep integration with hardware tensor cores. The codebase emphasizes cleanliness and modularity, making it easier for researchers to audit and extend compared to monolithic vendor libraries.</p>
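
<p>What fine-grained scaling buys can be seen without the CUDA kernels: instead of one scale for an entire tensor, each small block of values gets its own scale, so outliers in one block do not crush the precision of every other block. The PyTorch sketch below emulates the bookkeeping with int8 stand-ins; it is conceptual only and is not DeepGEMM’s FP8 tensor-core implementation.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch

def quantize_per_block(x, block=128):
    """Quantize the last dim of x in blocks of `block`, one scale per block."""
    xb = x.reshape(*x.shape[:-1], -1, block)
    scales = (xb.abs().amax(dim=-1, keepdim=True) / 127.0).clamp_min(1e-8)
    q = (xb / scales).round().clamp(-127, 127).to(torch.int8)
    return q, scales

a = torch.randn(64, 256)
b = torch.randn(256, 32)

qa, sa = quantize_per_block(a)
qb, sb = quantize_per_block(b.t().contiguous())

# Dequantize then multiply; a real FP8 kernel accumulates low-precision
# products directly and folds the per-block scales into the epilogue.
a_hat = (qa.float() * sa).reshape_as(a)
b_hat = (qb.float() * sb).reshape_as(b.t()).t()
print((a_hat @ b_hat - a @ b).abs().max())  # small residual quantization error
</code></pre></div></div>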

<p>rss · GitHub Trending - CUDA · Apr 10, 01:33</p>

<p><strong>Background</strong>: Prior solutions for FP8 matrix multiplication often relied on coarse-grained scaling or were tightly coupled within proprietary frameworks like NVIDIA’s cuBLAS, limiting flexibility for research customization. While standard FP16 and BF16 kernels are mature, efficient FP8 support with fine-grained quantization has been fragmented across experimental repositories. DeepGEMM consolidates these advancements into a standalone, easy-to-integrate library that prioritizes both performance and code readability.</p>

<p><strong>Discussion</strong>: The project has quickly gained traction among AI infrastructure engineers due to its practical focus on production-ready performance rather than just theoretical benchmarks. Early adopters are particularly interested in how its fine-grained scaling compares to emerging standards in transformer acceleration.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#fp8</code>, <code class="language-plaintext highlighter-rouge">#gemm</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#high-performance-computing</code></p>

<hr />

<p><a id="item-46"></a></p>
<h2 id="optimized-cuda-kernels-for-mamba-sequence-modeling-️-9010"><a href="https://github.com/Dao-AILab/causal-conv1d">Optimized CUDA Kernels for Mamba Sequence Modeling</a> ⭐️ 9.0/10</h2>

<p>Dao-AILab has released a highly optimized CUDA implementation specifically for causal depthwise 1D convolutions. This library provides a seamless PyTorch interface to accelerate the core operations required by modern state-space models like Mamba. It directly addresses the computational bottlenecks found in standard PyTorch implementations for long-sequence processing. Efficient sequence modeling is critical as AI shifts towards architectures that handle longer contexts than Transformers allow. This project enables the practical training and inference of Mamba-based models by delivering linear-time complexity with minimal overhead. Without such low-level kernel optimizations, the theoretical speed advantages of state-space models would remain unrealized in production environments. It serves as an essential infrastructure component for researchers and engineers adopting the SSM architecture. The library features a custom CUDA kernel designed for causal depthwise 1D convolutions, ensuring memory efficiency and high throughput. It integrates directly with PyTorch, allowing developers to swap standard convolution layers for this optimized version with minimal code changes. Performance benchmarks indicate significant speedups over native PyTorch operations, particularly for large batch sizes and long sequence lengths.</p>
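
<p>The operation itself is small: a depthwise 1D convolution with left-only padding so no position ever sees future tokens. The plain-PyTorch reference below shows what the fused kernel computes; the package’s own entry point (<code class="language-plaintext highlighter-rouge">causal_conv1d_fn</code>) and its exact signature should be taken from its README rather than from this sketch.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch
import torch.nn.functional as F

def causal_depthwise_conv1d(x, weight, bias=None):
    """Reference semantics of the op the CUDA kernel accelerates.

    x:      (batch, dim, seqlen)
    weight: (dim, width) -- one short filter per channel
    """
    dim, width = weight.shape
    x = F.pad(x, (width - 1, 0))                        # left-pad: strictly causal
    return F.conv1d(x, weight.unsqueeze(1), bias, groups=dim)

x = torch.randn(2, 64, 128)
w = torch.randn(64, 4)
print(causal_depthwise_conv1d(x, w).shape)  # (2, 64, 128): same length, no lookahead
</code></pre></div></div>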

<p>rss · GitHub Trending - CUDA · Apr 10, 01:33</p>

<p><strong>Background</strong>: Traditional Transformer models struggle with quadratic complexity when processing long sequences, prompting the development of State Space Models (SSMs) like S4 and Mamba. While Mamba offers linear-time scaling, its performance relies heavily on specialized hardware kernels that are not available in standard deep learning frameworks. Prior solutions often suffered from slow execution times because they relied on generic operators not tailored for the specific causal constraints of SSMs. This project fills that gap by providing the necessary low-level primitives to make Mamba viable for real-world applications.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Mamba_(deep_learning_architecture)">Mamba (deep learning architecture)</a></li>
<li><a href="https://grokipedia.com/page/mamba_deep_learning_architecture">Mamba (deep learning architecture)</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: While some community discussions suggest Mamba may not yet outperform Transformers as a general backbone for all tasks, the consensus is that efficient kernels are vital for its niche in long-context modeling. Engineers emphasize that without projects like causal-conv1d, experimenting with these new architectures would be computationally prohibitive.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#pytorch</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#kernels</code>, <code class="language-plaintext highlighter-rouge">#mamba</code></p>

<hr />

<p><a id="item-47"></a></p>
<h2 id="nvidia-cuvs-gpu-accelerated-vector-search-library-️-9010"><a href="https://github.com/rapidsai/cuvs">NVIDIA cuVS: GPU-Accelerated Vector Search Library</a> ⭐️ 9.0/10</h2>

<p>NVIDIA’s RAPIDS team has released cuVS, a new library dedicated to high-performance vector search and clustering on GPUs. This tool provides optimized C++ and Python APIs for executing nearest neighbor searches and clustering algorithms at scale. It represents a significant shift towards native GPU acceleration for retrieval-augmented generation (RAG) infrastructure. As AI applications increasingly rely on large-scale semantic search, CPU-based vector databases often become a latency bottleneck. cuVS leverages NVIDIA CUDA cores to drastically reduce query times for billion-scale vector indices. This performance gain is critical for real-time RAG systems where low latency directly impacts user experience. By integrating directly into the RAPIDS ecosystem, it allows data scientists to keep data on the GPU throughout the entire pipeline. The library supports advanced indexing structures like IVF-PQ and CAGRA optimized specifically for GPU architecture. It offers seamless interoperability with popular frameworks such as LangChain and LlamaIndex via Python bindings. Early benchmarks indicate order-of-magnitude speedups compared to traditional CPU-only implementations for dense vector retrieval.</p>
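
<p>At the API level the flow is the usual build-then-search pair. The sketch below follows the pattern shown in the cuVS Python quick-start for the CAGRA index; exact class and function names vary between releases, and a CUDA GPU plus the cuVS wheels are required, so treat it as illustrative rather than a verified reference.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np
from cuvs.neighbors import cagra  # assumption: cuVS wheels installed and a CUDA GPU available

# Hedged sketch of the CAGRA build/search pattern from the Python quick-start;
# names and return types may differ across cuVS releases.
dataset = np.random.random((100_000, 128)).astype(np.float32)
queries = np.random.random((10, 128)).astype(np.float32)

index = cagra.build(cagra.IndexParams(metric="sqeuclidean"), dataset)
distances, neighbors = cagra.search(cagra.SearchParams(), index, queries, 10)  # top-10 neighbors
print(neighbors)  # results may come back as device arrays on the GPU
</code></pre></div></div>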

<p>rss · GitHub Trending - CUDA · Apr 10, 01:33</p>

<p><strong>Background</strong>: Prior to cuVS, developers often relied on CPU-based libraries like FAISS or managed services that required data movement between CPU and GPU memory. While FAISS supports GPU, cuVS aims to provide a more modern, modular, and fully integrated experience within the RAPIDS data science stack. This project fills the niche for a standalone, highly tunable C++ library that serves as the engine for higher-level Python tools. It addresses the growing demand for sub-millisecond latency in enterprise AI deployments.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Graphics_processing_unit">Graphics processing unit - Wikipedia</a></li>
<li><a href="https://www.intel.com/content/www/us/en/products/docs/processors/what-is-a-gpu.html">What Is a GPU ? Graphics Processing Units Defined - Intel</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The AI engineering community is actively evaluating cuVS as a potential replacement for CPU-bound retrieval layers in production RAG pipelines. Discussions highlight its promise for reducing infrastructure costs by maximizing GPU utilization during inference.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#gpu</code>, <code class="language-plaintext highlighter-rouge">#vector-search</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#machine-learning</code>, <code class="language-plaintext highlighter-rouge">#rapids</code></p>

<hr />

<p><a id="item-48"></a></p>
<h2 id="archon-deterministic-harness-for-ai-coding-workflows-️-8010"><a href="https://github.com/coleam00/Archon">Archon: Deterministic Harness for AI Coding Workflows</a> ⭐️ 8.0/10</h2>

<p>Archon has launched as the first open-source harness builder designed to make AI coding agents deterministic and repeatable. It allows developers to define complex development processes, such as planning, implementation, and validation, using YAML workflows. This tool ensures that AI agents follow a strict sequence of operations rather than acting unpredictably. Current AI coding agents often produce inconsistent results depending on the model’s state, frequently skipping critical steps like testing or planning. Archon addresses this by separating the deterministic workflow structure from the AI’s generative intelligence, similar to how Dockerfiles standardized infrastructure. This approach enables reliable, parallel execution of tasks and integrates human approval gates seamlessly. Ultimately, it transforms AI coding from an experimental novelty into a robust engineering practice suitable for production environments. The project utilizes isolated git worktrees for every workflow run, allowing multiple fixes to proceed in parallel without conflicts. Users can compose workflows by mixing deterministic nodes like bash scripts with AI-driven nodes for code generation. These workflows are portable across various interfaces, including CLI, Web UI, Slack, and GitHub.</p>

<p>rss · GitHub Trending - Daily · Apr 10, 01:32</p>

<p><strong>Background</strong>: AI engineering currently struggles with the non-deterministic nature of Large Language Models, where identical prompts yield varying code quality and procedural adherence. Existing solutions often lack a standardized framework to enforce rigorous software development lifecycles within agent interactions. Archon fills this niche by providing a workflow engine that enforces structure while leveraging AI for specific cognitive tasks. It draws inspiration from CI/CD pipelines to bring reliability to autonomous coding agents.</p>

<p><strong>Discussion</strong>: Early adopters are praising the concept of treating AI workflows like infrastructure code, though some note the need for more pre-built templates. The community is actively discussing how best to balance human oversight with fully automated loops in complex refactoring tasks.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#software-engineering</code></p>

<hr />

<p><a id="item-49"></a></p>
<h2 id="kronos-first-open-source-foundation-model-for-financial-k-lines-️-8010"><a href="https://github.com/shiyu-coder/Kronos">Kronos: First Open-Source Foundation Model for Financial K-Lines</a> ⭐️ 8.0/10</h2>

<p>Kronos has been accepted at AAAI 2026, and fine-tuning scripts are now available to adapt the model for specific quantitative tasks. The project now offers a family of pre-trained decoder-only models accessible via Hugging Face, trained on data from over 45 global exchanges. A live demo is available showcasing 24-hour forecasting capabilities for trading pairs like BTC/USDT. Unlike general-purpose time-series foundation models, Kronos is specifically engineered to handle the high-noise characteristics unique to financial market data. By quantizing continuous OHLCV data into hierarchical discrete tokens, it enables autoregressive Transformers to effectively learn the ‘language’ of candlesticks. This specialization allows for a unified approach to diverse quantitative tasks without the need to build models from scratch. The open-source release significantly lowers the barrier for fintech developers to leverage state-of-the-art forecasting technology. The model utilizes a novel two-stage framework featuring a specialized tokenizer and a large autoregressive Transformer pre-trained on K-line sequences. It supports various model capacities within its ‘Model Zoo’ to suit different computational constraints and application needs. While production tooling details are currently limited, the availability of weights and fine-tuning scripts facilitates immediate experimentation and adaptation.</p>

<p>rss · GitHub Trending - Daily · Apr 10, 01:32</p>

<p><strong>Background</strong>: Financial time-series forecasting has traditionally relied on statistical methods or generic deep learning models that often struggle with the stochastic nature of market data. General foundation models lack the specific inductive biases required to interpret complex candlestick patterns and volume dynamics effectively. Kronos fills this niche by treating financial sequences as a distinct language, applying NLP-inspired tokenization to capture market microstructure. This approach represents a shift from generic regression to semantic understanding of market movements.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Foundation_model">Foundation model</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The community is actively engaging with the newly released fine-tuning scripts to test Kronos on alternative asset classes beyond crypto. Early feedback highlights the model’s robustness in high-volatility scenarios compared to standard LSTM or Transformer baselines.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#foundation-model</code>, <code class="language-plaintext highlighter-rouge">#fintech</code>, <code class="language-plaintext highlighter-rouge">#nlp</code>, <code class="language-plaintext highlighter-rouge">#financial-ai</code>, <code class="language-plaintext highlighter-rouge">#llm</code></p>

<hr />

<p><a id="item-50"></a></p>
<h2 id="claudian-integrates-ai-coding-agents-into-obsidian-vaults-️-8010"><a href="https://github.com/YishenTu/claudian">Claudian Integrates AI Coding Agents into Obsidian Vaults</a> ⭐️ 8.0/10</h2>

<p>Claudian is a new Obsidian plugin that embeds AI coding agents like Claude Code and Codex directly into the user’s local vault. It enables agents to perform file read/write operations, execute bash commands, and manage multi-step workflows within the knowledge base environment. This tool bridges the gap between static note-taking and dynamic code generation by treating the Obsidian vault as an active working directory for AI agents. Developers and researchers can now iterate on technical documentation and code snippets without leaving their primary knowledge management interface. The inclusion of ‘Plan Mode’ and MCP server support adds enterprise-grade control and extensibility to local AI interactions. Key features include inline editing with word-level diff previews, slash commands for reusable prompts, and the ability to mention external files or subagents via ‘@’. The plugin requires the separate installation of the Claude Code CLI or Codex CLI and currently supports only desktop operating systems.</p>

<p>rss · GitHub Trending - Daily · Apr 10, 01:32</p>

<p><strong>Background</strong>: While Obsidian excels at managing plain text Markdown files, it traditionally lacks native capabilities for autonomous code manipulation or complex agent-driven workflows. Previous solutions often required copying content to external IDEs or web interfaces, breaking the flow of thought. Claudian addresses this by leveraging the Model Context Protocol to bring powerful CLI-based agents directly into the note-taking ecosystem.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://grokipedia.com/page/Claude_Code">Claude Code</a></li>
<li><a href="https://forum-zh.obsidian.md/">Obsidian 中文论坛 - Obsidian 知识管理 笔记</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: As a recently released tool, formal community discussions on long-term stability are still emerging, though early adoption focuses on its seamless integration with existing CLI tools. Users are particularly interested in how the plugin handles large vaults and the security implications of granting agents write access to local files.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#obsidian</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#claude-code</code>, <code class="language-plaintext highlighter-rouge">#productivity</code></p>

<hr />

<p><a id="item-51"></a></p>
<h2 id="hugging-face-skills-standardizes-ai-agent-workflows-️-8010"><a href="https://github.com/huggingface/skills">Hugging Face Skills Standardizes AI Agent Workflows</a> ⭐️ 8.0/10</h2>

<p>Hugging Face has released a repository of standardized ‘Skills’ that package AI/ML tasks like training and evaluation for coding agents. These skills follow the open Agent Skills format, making them interoperable with major tools including Claude Code, OpenAI Codex, and Gemini CLI. The project allows developers to instantly equip their agents with specific Hugging Face ecosystem capabilities through a simple plugin installation. This project solves the critical fragmentation problem where different coding agents require unique configuration formats for similar tasks. By providing a unified standard, it enables seamless portability of complex ML workflows across diverse agent platforms without rewriting instructions. This significantly reduces the overhead for teams adopting multiple AI coding assistants and accelerates the integration of specialized ML operations into automated development pipelines. Each skill is a self-contained folder containing a SKILL.md file with YAML frontmatter and specific execution guidance for the agent. The repository supports fallback mechanisms like AGENTS.md for tools that do not yet fully support the standard skills specification. Installation varies by platform but generally involves registering the repository as a plugin marketplace or symlinking skill directories.</p>

<p>rss · GitHub Trending - Python · Apr 10, 01:39</p>

<p><strong>Background</strong>: Prior to this initiative, developers faced significant friction when trying to use Hugging Face models within different AI coding environments due to incompatible instruction formats. Various vendors used proprietary terms like ‘extensions’ or ‘skills’ with differing structural requirements, leading to duplicated effort. This project aligns these disparate systems under the open Agent Skills specification to foster better interoperability.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Hugging_Face">Hugging Face - Wikipedia</a></li>
<li><a href="https://www.geeksforgeeks.org/artificial-intelligence/hugging-face-tutorial/">Hugging Face Tutorial - GeeksforGeeks</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#huggingface</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#automation</code></p>

<hr />

<p><a id="item-52"></a></p>
<h2 id="qmd-local-hybrid-search-engine-for-ai-agents-️-8010"><a href="https://github.com/tobi/qmd">QMD: Local Hybrid Search Engine for AI Agents</a> ⭐️ 8.0/10</h2>

<p>QMD is a new lightweight CLI tool that indexes local markdown and notes using a hybrid of BM25, vector search, and LLM re-ranking. It runs entirely on-device via node-llama-cpp and GGUF models, offering specialized commands for agentic workflows. The project recently added MCP server support for seamless integration with Claude Desktop and other AI coding assistants. This tool addresses the critical need for privacy-preserving, low-latency retrieval in local RAG systems without relying on external APIs. By combining keyword precision with semantic understanding and LLM-based relevance scoring, it significantly improves context quality for autonomous agents. Its native support for the Model Context Protocol (MCP) makes it a foundational component for building robust, local-first AI development environments. QMD supports three search modes: fast keyword search (BM25), semantic vector search, and a hybrid query mode with LLM re-ranking for highest accuracy. It allows users to define collections and attach contextual metadata to improve agent decision-making during document retrieval. Output formats include JSON and file lists specifically optimized for parsing by LLMs in automated loops.</p>
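
<p>The hybrid step is easiest to picture as rank fusion: the keyword ranking and the vector ranking over the same notes are merged into one candidate list before the LLM re-ranker sees it. The sketch below uses reciprocal rank fusion as a stand-in; whether QMD fuses results in exactly this way is an assumption, and the code is independent of its CLI.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def reciprocal_rank_fusion(rankings, k=60):
    """Merge several rankings (lists of doc ids, best first) into one fused list."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits   = ["notes/mamba.md", "notes/rag.md", "notes/cuda.md"]     # keyword (BM25) ranking
vector_hits = ["notes/rag.md", "notes/agents.md", "notes/mamba.md"]   # semantic (vector) ranking
print(reciprocal_rank_fusion([bm25_hits, vector_hits]))
# e.g. ['notes/rag.md', 'notes/mamba.md', ...] -- fused list handed to the re-ranker
</code></pre></div></div>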

<p>rss · GitHub Trending - TypeScript · Apr 10, 01:41</p>

<p><strong>Background</strong>: Traditional local search tools often lack semantic understanding or require heavy cloud dependencies for advanced ranking. QMD fills this niche by bringing state-of-the-art hybrid retrieval techniques to a purely local, developer-friendly CLI interface. It leverages the efficiency of GGUF models to perform complex re-ranking tasks on consumer hardware, bridging the gap between simple grep-like tools and enterprise RAG platforms.</p>

<p><strong>Discussion</strong>: As a newly trending project, QMD is gaining traction among developers building local AI agents who need reliable context retrieval without data leakage. Early adopters are particularly praising its MCP integration and the ability to run high-quality re-ranking locally.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#rag</code>, <code class="language-plaintext highlighter-rouge">#search-engine</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#typescript</code></p>

<hr />

<p><a id="item-53"></a></p>
<h2 id="multica-orchestrates-ai-coding-agents-as-virtual-teammates-️-8010"><a href="https://github.com/multica-ai/multica">Multica Orchestrates AI Coding Agents as Virtual Teammates</a> ⭐️ 8.0/10</h2>

<p>Multica introduces an open-source platform that transforms standalone coding agents into managed team members capable of autonomous task execution. It enables developers to assign issues, track real-time progress, and compound reusable skills across a unified dashboard. The system supports popular agents like Claude Code and Codex while offering both cloud and self-hosted deployment options. This project addresses the critical gap between running isolated agent scripts and managing a cohesive AI workforce within engineering teams. By treating agents as colleagues with profiles and status updates, it reduces the operational overhead of babysitting multiple autonomous processes. The ability to compound skills means that solutions to past problems become permanent capabilities for the entire team, accelerating future development cycles. This shift moves AI engineering from experimental automation to reliable, scalable team augmentation. Key features include autonomous lifecycle management with WebSocket streaming, reusable skill libraries, and multi-workspace isolation for different teams. It integrates with existing tools like Claude Code, Codex, OpenClaw, and OpenCode through a vendor-neutral architecture. Users can choose between a managed cloud service or a self-hosted Docker setup for full data control.</p>

<p>rss · GitHub Trending - TypeScript · Apr 10, 01:41</p>

<p><strong>Background</strong>: Prior to Multica, AI coding agents were typically executed as one-off scripts or required custom orchestration layers to manage state and handoffs. Engineers often struggled with context switching and lacked a centralized view of agent activities, leading to inefficient workflows. Multica fills this niche by providing a dedicated infrastructure layer that standardizes how agents are hired, managed, and evolved within a software organization. It represents a maturation of the agent ecosystem from individual tools to collaborative systems.</p>

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/e2b-dev/awesome-ai-agents">GitHub - e2b-dev/awesome-ai-agents: A list of AI autonomous...</a></li>
<li><a href="https://github.com/openai/codex">Lightweight coding agent that runs in your terminal - GitHub</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters are highlighting the value of the ‘skill compounding’ feature, noting how it prevents agents from solving the same problems repeatedly. The ability to self-host via Docker is also receiving positive feedback from enterprises concerned about code privacy and security.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#orchestration</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#typescript</code></p>

<hr />

<p><a id="item-54"></a></p>
<h2 id="voltagent-typescript-framework-for-ai-agent-engineering-️-8010"><a href="https://github.com/VoltAgent/voltagent">VoltAgent: TypeScript Framework for AI Agent Engineering</a> ⭐️ 8.0/10</h2>

<p>VoltAgent has launched as an end-to-end open-source platform designed specifically for building and deploying AI agents using TypeScript. It combines a core framework featuring memory, RAG, and workflow orchestration with a dedicated VoltOps console for observability and evaluation. This release aims to provide full code control and production-ready visibility for agent development. This project addresses the growing need for robust agent engineering tools within the TypeScript ecosystem, which has historically been dominated by Python-based solutions. By offering typed roles, declarative workflows, and integrated guardrails, it reduces the complexity of stitching together custom control flows for multi-agent systems. The inclusion of a self-hostable operations console bridges the gap between experimental prototypes and reliable production deployments. For teams already invested in the Node.js or frontend ecosystems, this provides a native path to integrate advanced AI capabilities without context switching languages. The platform consists of two main parts: the open-source <code class="language-plaintext highlighter-rouge">@voltagent/core</code> framework for runtime logic and the VoltOps Console for deployment and monitoring. Key capabilities include support for multi-step automations, specialized agent coordination under supervisor patterns, and connections to various AI providers. It emphasizes type safety and modular building blocks to streamline the creation of sophisticated multi-agent applications.</p>

<p>rss · GitHub Trending - TypeScript · Apr 10, 01:41</p>

<p><strong>Background</strong>: While Python frameworks like LangChain and AutoGen have established strong footholds in AI agent development, TypeScript developers often lack equivalent, production-grade tooling tailored to their environment. VoltAgent fills this niche by providing a comprehensive suite of features such as memory management, tool integration, and voice capabilities specifically for the JS/TS stack. Unlike earlier ad-hoc implementations, it offers a structured approach to agent engineering with built-in observability. This positions it as a critical infrastructure piece for web-centric AI applications requiring high concurrency and seamless frontend integration.</p>
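
<p><strong>Example sketch</strong>: A framework-agnostic TypeScript illustration of the supervisor pattern mentioned above, where one coordinator delegates a task to typed specialist agents. It deliberately does not use the real <code class="language-plaintext highlighter-rouge">@voltagent/core</code> API, whose actual names may differ.</p>

<pre><code class="language-typescript">// Framework-agnostic sketch of a supervisor delegating to typed agents;
// not the actual @voltagent/core API.
interface Agent {
  role: string;
  run(task: string): Promise&lt;string&gt;;
}

class Supervisor {
  constructor(private agents: Agent[]) {}

  // Fan the task out to every specialist and merge their answers.
  async delegate(task: string): Promise&lt;string&gt; {
    const results = await Promise.all(
      this.agents.map((agent) => agent.run(`${agent.role}: ${task}`)),
    );
    return results.join("\n");
  }
}

const researcher: Agent = { role: "researcher", run: async (t) => `notes on ${t}` };
const writer: Agent = { role: "writer", run: async (t) => `draft for ${t}` };

new Supervisor([researcher, writer]).delegate("summarise the changelog").then(console.log);
</code></pre>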

<details><summary>References</summary>
<ul>
<li><a href="https://blog.csdn.net/struggle2025/article/details/148317868">VoltAgent 是一个开源 TypeScript 框架，用于构建和编排 AI 代理</a></li>
<li><a href="https://huggingface.co/voltagent">voltagent (VoltAgent) - Hugging Face</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters are praising the framework’s strong typing and the convenience of having an integrated ops console, though some note the ecosystem is still maturing compared to Python alternatives. Discussions on Discord and GitHub focus on best practices for defining complex workflows and integrating with existing MCP servers.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#typescript</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#framework</code></p>

<hr />

<p><a id="item-55"></a></p>
<h2 id="llamaindex-releases-liteparse-for-fast-local-pdf-parsing-️-8010"><a href="https://github.com/run-llama/liteparse">LlamaIndex Releases LiteParse for Fast Local PDF Parsing</a> ⭐️ 8.0/10</h2>

<p>The LlamaIndex team has launched LiteParse, a new open-source TypeScript library designed for high-speed, local document parsing. It introduces spatial bounding box support and flexible OCR integration without requiring cloud dependencies or heavy LLM models. LiteParse addresses a critical bottleneck in RAG pipelines by providing a lightweight alternative to computationally expensive parsing methods. Its ability to run entirely locally ensures data privacy while significantly reducing latency for text extraction tasks. This tool allows developers to preprocess documents efficiently before feeding them into more complex, cloud-based parsers like LlamaParse only when necessary. Built on PDF.js, LiteParse offers built-in Tesseract.js OCR and supports external HTTP OCR servers like EasyOCR. It outputs structured JSON with precise text positioning and generates page screenshots for multimodal AI agents. The tool is available as a standalone CLI binary for Linux, macOS, and Windows.</p>

<p>rss · GitHub Trending - TypeScript · Apr 10, 01:41</p>

<p><strong>Background</strong>: Document ingestion for Retrieval-Augmented Generation (RAG) systems often struggles with the trade-off between speed and accuracy. While cloud solutions handle complex layouts well, they introduce latency and privacy concerns, whereas traditional local parsers often lack spatial awareness. LiteParse fills this niche by offering a fast, spatially-aware local parser optimized for the initial stages of AI data workflows.</p>
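
<p><strong>Example sketch</strong>: The field names below (<code class="language-plaintext highlighter-rouge">page</code>, <code class="language-plaintext highlighter-rouge">text</code>, <code class="language-plaintext highlighter-rouge">bbox</code>) are assumptions rather than LiteParse’s documented schema; the sketch only shows how spatially ordered blocks with bounding boxes can be turned into citation-carrying RAG chunks.</p>

<pre><code class="language-typescript">// Illustrative only: the exact LiteParse JSON schema is not shown in this
// digest, so the field names (page, text, bbox) are assumptions.
import { readFileSync } from "node:fs";

interface Block {
  page: number;
  text: string;
  bbox: [number, number, number, number]; // x0, y0, x1, y1 in page coordinates
}

// Load a parsed document and turn spatially ordered blocks into RAG chunks
// that keep a citation back to their page and position.
const blocks: Block[] = JSON.parse(readFileSync("report.parsed.json", "utf8"));

const chunks = blocks
  .sort((a, b) => a.page - b.page || a.bbox[1] - b.bbox[1])
  .map((block) => ({
    text: block.text,
    citation: `p.${block.page} @ (${block.bbox[0]}, ${block.bbox[1]})`,
  }));

console.log(chunks.slice(0, 3));
</code></pre>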

<details><summary>References</summary>
<ul>
<li><a href="https://grokipedia.com/page/LlamaIndex">LlamaIndex</a></li>
<li><a href="https://stackoverflow.com/questions/76990736/differences-between-langchain-llamaindex">Differences between Langchain &amp; LlamaIndex - Stack Overflow</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: As a recent release from the LlamaIndex ecosystem, community feedback is currently focused on integration tests with existing RAG frameworks and performance benchmarks against other local parsers.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#llamaindex</code>, <code class="language-plaintext highlighter-rouge">#pdf-parsing</code>, <code class="language-plaintext highlighter-rouge">#rag</code>, <code class="language-plaintext highlighter-rouge">#typescript</code>, <code class="language-plaintext highlighter-rouge">#data-ingestion</code></p>

<hr />

<p><a id="item-56"></a></p>
<h2 id="qwen-code-open-source-terminal-ai-agent-for-developers-️-8010"><a href="https://github.com/QwenLM/qwen-code">Qwen Code: Open-Source Terminal AI Agent for Developers</a> ⭐️ 8.0/10</h2>

<p>The Qwen team has released qwen-code, a production-ready CLI agent optimized for the Qwen series models. It introduces an agentic workflow with built-in tools like Skills and SubAgents directly within the terminal environment. The tool now supports Qwen3.6-Plus and offers a free tier via OAuth alongside standard API integrations. This project bridges the gap between powerful LLMs and command-line workflows, allowing engineers to interact with codebases without leaving the terminal. By co-evolving with open-source Qwen models, it ensures tight integration and performance optimization specifically for coding tasks. It provides a viable, cost-effective alternative to proprietary CLI tools like Claude Code for teams already invested in the Qwen ecosystem. Key features include multi-protocol support for OpenAI, Anthropic, and Gemini-compatible APIs, plus a dedicated OAuth free tier offering 1,000 daily requests. The agent is built on Node.js 20+ and includes optional integrations for major IDEs like VS Code and JetBrains. Installation is streamlined via shell scripts for Linux/macOS or batch files for Windows.</p>

<p>rss · GitHub Trending - TypeScript · Apr 10, 01:41</p>

<p><strong>Background</strong>: Developers increasingly rely on AI agents for code generation and refactoring, but many existing solutions are confined to web interfaces or heavy IDE plugins. Qwen Code addresses the need for a lightweight, terminal-native agent that fits into existing DevOps and scripting workflows. Unlike general-purpose chatbots, it is specifically tuned for understanding large codebases and automating repetitive terminal tasks.</p>
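
<p><strong>Example sketch</strong>: What an “OpenAI-compatible” chat request looks like, which is the protocol the summary says qwen-code can speak alongside Anthropic- and Gemini-compatible APIs. The base URL, model id, and environment variables are placeholders, not values documented by the project.</p>

<pre><code class="language-typescript">// Sketch of the generic OpenAI-compatible chat protocol; the base URL,
// model id, and env vars are placeholders, not qwen-code documentation.
async function main() {
  const baseUrl = process.env.QWEN_BASE_URL ?? "https://example.invalid/v1";

  const response = await fetch(`${baseUrl}/chat/completions`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.QWEN_API_KEY}`,
    },
    body: JSON.stringify({
      model: "qwen3.6-plus", // placeholder model id
      messages: [
        { role: "system", content: "You are a terminal coding assistant." },
        { role: "user", content: "Explain what this repo's Makefile does." },
      ],
    }),
  });

  const data = await response.json();
  console.log(data.choices[0].message.content);
}

main().catch(console.error);
</code></pre>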

<details><summary>References</summary>
<ul>
<li><a href="https://grokipedia.com/page/AI-native_CLI">AI-native CLI</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agent</code>, <code class="language-plaintext highlighter-rouge">#cli</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#qwen</code>, <code class="language-plaintext highlighter-rouge">#terminal</code></p>

<hr />

<p><a id="item-57"></a></p>
<h2 id="opencode-open-source-ai-coding-agent-for-developers-️-8010"><a href="https://github.com/anomalyco/opencode">OpenCode: Open-Source AI Coding Agent for Developers</a> ⭐️ 8.0/10</h2>

<p>OpenCode has emerged as a new open-source AI coding agent built on TypeScript to assist with code generation and workflow automation. It offers straightforward installation via npm, Homebrew, and other package managers, positioning itself as an accessible alternative to proprietary tools. The project includes a terminal UI and supports multiple languages through extensive documentation. This tool matters because it democratizes access to advanced AI coding assistance by removing the paywalls associated with tools like GitHub Copilot or Cursor. Being open-source allows developers to audit the code, customize behaviors, and self-host the agent for enhanced privacy and security. Its TypeScript foundation ensures easy extensibility for the vast ecosystem of JavaScript and TypeScript developers. Ultimately, it fosters a community-driven approach to improving AI coding standards without vendor lock-in. OpenCode is installed globally via command line tools like npm, bun, or brew, making integration into existing workflows seamless. It features a dedicated terminal UI and claims compatibility with various operating systems including Windows, macOS, and Linux. The project maintains an active Discord community and provides documentation in over twenty languages.</p>

<p>rss · GitHub Trending - TypeScript · Apr 10, 01:41</p>

<p><strong>Background</strong>: Developers have long relied on proprietary AI coding assistants that often require subscriptions and operate as black boxes regarding data handling. OpenCode fills the niche for a transparent, customizable, and free alternative that runs locally or on private infrastructure. By leveraging the popularity of TypeScript, it aims to lower the barrier to entry for contributing to AI agent development. This approach contrasts with prior solutions that prioritize closed ecosystems and recurring revenue models over community collaboration.</p>

<p><strong>Discussion</strong>: Early adopters are discussing the ease of installation and the potential for extending the agent’s capabilities through plugins. The presence of a multi-language README suggests a strong intent to build a global contributor base from the outset.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agent</code>, <code class="language-plaintext highlighter-rouge">#coding-assistant</code>, <code class="language-plaintext highlighter-rouge">#typescript</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-58"></a></p>
<h2 id="nvidia-cuopt-gpu-accelerated-solver-for-large-scale-routing-️-8010"><a href="https://github.com/NVIDIA/cuopt">NVIDIA cuopt: GPU-Accelerated Solver for Large-Scale Routing</a> ⭐️ 8.0/10</h2>

<p>NVIDIA has released cuopt, a specialized library designed to solve large-scale decision optimization and routing problems using GPU acceleration. This tool leverages CUDA cores to drastically reduce computation time for complex logistics scenarios compared to traditional CPU-based solvers. It represents a significant shift towards hardware-accelerated operations research within the AI ecosystem. Traditional optimization solvers often struggle with the combinatorial explosion found in real-world supply chain and vehicle routing problems, leading to slow decision-making. By offloading these intensive calculations to GPUs, cuopt enables near real-time solutions for dynamic environments where delays are costly. This capability is critical for industries like logistics, ride-sharing, and manufacturing that require rapid re-optimization. Consequently, it allows AI engineers to integrate high-performance operational logic directly into their deployment pipelines. The library focuses specifically on capacitated vehicle routing problems (CVRP) and related variants common in logistics. It provides Python APIs that integrate easily with existing data science workflows while utilizing underlying C++ and CUDA implementations for speed. Users can expect order-of-magnitude performance improvements when solving instances with thousands of nodes.</p>

<p>rss · GitHub Trending - CUDA · Apr 10, 01:33</p>

<p><strong>Background</strong>: Decision optimization has historically relied on CPU-bound solvers like Gurobi or Google OR-Tools, which can become bottlenecks as problem scales increase. While GPUs have revolutionized machine learning training, their application to discrete optimization has been less explored until now. cuopt fills this niche by adapting parallel processing techniques specifically for routing algorithms. This approach addresses the growing demand for faster, scalable solutions in modern supply chains.</p>
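
<p><strong>Example sketch</strong>: A standard textbook statement of the capacitated vehicle routing problem (CVRP) the library targets, using MTZ-style load variables; this is background math, not cuopt’s internal model.</p>

<pre><code class="language-latex">% x_{ij} = 1 if a vehicle drives from node i to node j, c_{ij} = travel cost,
% d_j = demand of customer j, Q = vehicle capacity, u_j = load after visiting j,
% node 0 = depot.
\min_{x} \sum_{i \neq j} c_{ij}\, x_{ij}
\quad \text{s.t.} \quad
\sum_{i \neq j} x_{ij} = 1 \;\; (\forall j \neq 0), \qquad
\sum_{j \neq i} x_{ij} = 1 \;\; (\forall i \neq 0),

u_j \geq u_i + d_j - Q\,(1 - x_{ij}) \;\; (\forall i \neq j,\ j \neq 0), \qquad
d_j \leq u_j \leq Q, \qquad x_{ij} \in \{0, 1\}
</code></pre>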

<p><strong>Discussion</strong>: Early adopters are highlighting the steep learning curve associated with tuning GPU parameters for optimal solver performance. Discussions suggest that while the speedup is impressive, the tool is best suited for very large-scale problems where CPU solvers fail to converge in reasonable time.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#optimization</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#gpu</code>, <code class="language-plaintext highlighter-rouge">#logistics</code>, <code class="language-plaintext highlighter-rouge">#nvidia</code></p>

<hr />

<p><a id="item-59"></a></p>
<h2 id="thunderkittens-accelerates-cuda-kernel-development-️-8010"><a href="https://github.com/HazyResearch/ThunderKittens">ThunderKittens Accelerates CUDA Kernel Development</a> ⭐️ 8.0/10</h2>

<p>HazyResearch has released ThunderKittens, a library of efficient CUDA tile primitives designed to streamline the creation of high-performance deep learning kernels. This tool provides low-level building blocks that allow developers to construct optimized GPU operations without writing boilerplate code from scratch. Optimizing low-level GPU kernels is often a bottleneck in achieving maximum model training and inference speeds. ThunderKittens addresses this by offering pre-optimized primitives that significantly reduce the engineering effort required for custom kernel development. While it targets advanced systems engineers rather than casual users, it fills a critical niche for research teams pushing the boundaries of model efficiency. The library focuses on providing composable tile primitives that handle memory movement and computation efficiently on NVIDIA GPUs. It is specifically tailored for experts who need fine-grained control over hardware resources to squeeze out additional performance.</p>

<p>rss · GitHub Trending - CUDA · Apr 10, 01:33</p>

<p><strong>Background</strong>: Deep learning frameworks often rely on generic kernels that may not be optimal for specific, novel model architectures or hardware configurations. Prior solutions typically required researchers to write complex, error-prone CUDA code manually to achieve state-of-the-art performance. ThunderKittens abstracts these complexities by providing a robust set of tested primitives, bridging the gap between theoretical algorithm design and practical high-speed execution.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#gpu-kernels</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#performance</code>, <code class="language-plaintext highlighter-rouge">#systems</code></p>

<hr />

<p><a id="item-60"></a></p>
<h2 id="deeptutor-v10-launches-as-agent-native-tutoring-system-️-7010"><a href="https://github.com/HKUDS/DeepTutor">DeepTutor v1.0 Launches as Agent-Native Tutoring System</a> ⭐️ 7.0/10</h2>

<p>DeepTutor has released version 1.0.0, featuring a complete architecture rewrite and the introduction of ‘TutorBot’ for persistent autonomous tutoring. The update switches to an Apache-2.0 license and adds flexible mode switching between different AI interaction styles. This release marks a significant shift from simple chatbot interfaces to agent-native systems capable of maintaining long-term student context and personalized learning paths. By open-sourcing the core logic under a permissive license, it enables researchers and developers to build customizable educational tools without starting from scratch. The integration of Next.js for the frontend ensures a modern, responsive user experience suitable for web-based learning platforms. The system is built on Python 3.11+ for backend logic and Next.js 16 for the frontend interface. Key features include the new TutorBot module, a command-line interface (CLI) for agent-native interactions, and support for multiple languages including Chinese, Japanese, and Spanish.</p>

<p>rss · GitHub Trending - Daily · Apr 10, 01:32</p>

<p><strong>Background</strong>: Personalized tutoring systems often struggle with maintaining context over long sessions or adapting dynamically to student needs without complex custom development. DeepTutor addresses this by implementing an agent-native architecture designed specifically for persistent memory and adaptive reasoning in educational scenarios. Unlike previous static Q&amp;A bots, this framework treats the tutor as an autonomous agent capable of planning and executing multi-step teaching strategies.</p>

<p><strong>Discussion</strong>: The project has gained traction with over 10,000 GitHub stars and active community groups on Discord, WeChat, and Feishu. Users are particularly interested in the new CLI capabilities and the potential for integrating custom knowledge bases into the TutorBot.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-tutor</code>, <code class="language-plaintext highlighter-rouge">#personalized-learning</code>, <code class="language-plaintext highlighter-rouge">#agent-systems</code>, <code class="language-plaintext highlighter-rouge">#education-tech</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-61"></a></p>
<h2 id="opendataloader-pdf-high-accuracy-parser-for-ai-rag-pipelines-️-7010"><a href="https://github.com/opendataloader-project/opendataloader-pdf">OpenDataLoader PDF: High-Accuracy Parser for AI RAG Pipelines</a> ⭐️ 7.0/10</h2>

<p>OpenDataLoader PDF is a new open-source library designed to convert complex PDFs into AI-ready formats like Markdown and JSON with bounding boxes. It introduces a hybrid mode combining deterministic local parsing with AI assistance to handle tables, formulas, and scanned documents across 80+ languages. The project claims top benchmark performance with an overall accuracy score of 0.907 on real-world datasets. This tool addresses the critical bottleneck in Retrieval-Augmented Generation (RAG) systems where poor PDF parsing leads to hallucinated or incomplete context. By supporting multi-language OCR and complex layout analysis out-of-the-box, it reduces the engineering effort required to clean data for Large Language Models. Its availability across Python, Node.js, and Java SDKs makes it accessible for diverse infrastructure stacks. Furthermore, its roadmap includes automated PDF tagging for accessibility compliance, solving a costly manual remediation problem. The library outputs structured Markdown for chunking, JSON with bounding boxes for source citations, and HTML, featuring built-in OCR for scanned PDFs at 300 DPI or higher. It supports a hybrid processing mode that leverages AI specifically for complex elements like borderless tables and LaTeX formulas while keeping simple text extraction deterministic. Installation is streamlined via PyPI, npm, and Maven Central, with ready-made integrations for frameworks like LangChain.</p>

<p>rss · GitHub Trending - Daily · Apr 10, 01:32</p>

<p><strong>Background</strong>: Traditional PDF parsers often struggle with maintaining logical reading order and extracting structured data from scientific papers or financial reports containing complex tables. Existing solutions frequently require separate tools for OCR, table detection, and text extraction, leading to fragmented pipelines. OpenDataLoader PDF attempts to unify these capabilities into a single package optimized specifically for LLM consumption rather than just human viewing. It differentiates itself by promising end-to-end accessibility tagging and high-fidelity layout retention without proprietary dependencies.</p>
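
<p><strong>Example sketch</strong>: A minimal heading-based chunker for the structured Markdown the tool emits; it assumes nothing about the Python, Node.js, or Java SDKs beyond “Markdown out” and is not code from the project itself.</p>

<pre><code class="language-typescript">// Illustrative only: a heading-based chunker for Markdown produced by any
// parser, standing in for the "structured Markdown for chunking" output the
// digest describes; it is not OpenDataLoader PDF's own SDK.
import { readFileSync } from "node:fs";

const markdown = readFileSync("10-K.md", "utf8");

// Split before top- and second-level headings so each chunk stays one section.
const chunks = markdown
  .split(/\n(?=#{1,2} )/)
  .map((section) => section.trim())
  .filter((section) => section.length > 0)
  .map((section) => ({
    title: section.split("\n")[0].replace(/^#+\s*/, ""),
    body: section,
  }));

console.log(`${chunks.length} sections ready for embedding`);
</code></pre>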

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/PDF">PDF - Wikipedia</a></li>

</ul>
</details>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#pdf-parser</code>, <code class="language-plaintext highlighter-rouge">#data-engineering</code>, <code class="language-plaintext highlighter-rouge">#rag</code>, <code class="language-plaintext highlighter-rouge">#ai-infrastructure</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-62"></a></p>
<h2 id="superpowers-framework-enforces-structured-agentic-workflows-️-7010"><a href="https://github.com/obra/superpowers">Superpowers Framework Enforces Structured Agentic Workflows</a> ⭐️ 7.0/10</h2>

<p>Superpowers introduces a composable skills framework that prevents coding agents from immediately writing code, instead enforcing a workflow of spec refinement and design sign-off. It automates the creation of TDD-based implementation plans and manages subagent-driven development cycles across major platforms like Claude Code and Cursor. This project addresses the critical reliability gap in AI software development by embedding established engineering principles like YAGNI and DRY directly into agent behavior. By forcing agents to pause for human approval on specifications before coding, it significantly reduces hallucinated features and architectural drift. The framework transforms autonomous agents from unpredictable code generators into disciplined junior engineers capable of hours of focused work. The system operates by intercepting initial agent prompts to extract requirements, presenting them in digestible chunks for user validation, and generating strict red/green test-driven development plans. Once approved, it orchestrates a subagent process that inspects and reviews work iteratively without deviating from the signed-off design. Installation is streamlined via official marketplaces for Claude Code, Cursor, and GitHub Copilot, with manual options available for Codex and OpenCode.</p>

<p>rss · GitHub Trending - Daily · Apr 10, 01:32</p>

<p><strong>Background</strong>: Prior to frameworks like Superpowers, most coding agents lacked a structured methodology, often jumping straight into implementation without adequate planning or requirement analysis. This tendency led to bloated codebases, ignored testing protocols, and solutions that failed to match actual user needs. Superpowers fills this niche by acting as a middleware layer that imposes a rigorous software development lifecycle on top of existing LLM capabilities.</p>
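
<p><strong>Example sketch</strong>: The gated lifecycle described above (spec refinement, design sign-off, TDD plan, implementation) modelled as a tiny state machine; this is an illustration of the enforced process, not Superpowers’ own code.</p>

<pre><code class="language-typescript">// Illustration of the enforced workflow, not Superpowers' source code.
type Phase = "spec" | "design-signoff" | "tdd-plan" | "implement" | "done";

const next: { [P in Phase]: Phase } = {
  "spec": "design-signoff",
  "design-signoff": "tdd-plan",
  "tdd-plan": "implement",
  "implement": "done",
  "done": "done",
};

// The agent may not advance past design sign-off without explicit approval.
function advance(phase: Phase, humanApproved: boolean): Phase {
  if (phase === "design-signoff") {
    return humanApproved ? next[phase] : phase;
  }
  return next[phase];
}

let phase: Phase = "spec";
phase = advance(phase, false); // spec -> design-signoff
phase = advance(phase, false); // blocked: still waiting on the human
phase = advance(phase, true);  // design-signoff -> tdd-plan
console.log(phase);            // "tdd-plan"
</code></pre>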

<details><summary>References</summary>
<ul>
<li><a href="https://grokipedia.com/page/Superpowers_agentic_skills_framework">Superpowers (agentic skills framework)</a></li>
<li><a href="https://en.wikipedia.org/wiki/YAGNI_principle">YAGNI principle</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight the framework’s ability to keep agents on track for extended periods, though some note that the initial setup requires careful configuration of the ‘skills’ to match specific project contexts.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#software-engineering</code>, <code class="language-plaintext highlighter-rouge">#llm-workflows</code>, <code class="language-plaintext highlighter-rouge">#development-methodology</code>, <code class="language-plaintext highlighter-rouge">#agent-framework</code></p>

<hr />

<p><a id="item-63"></a></p>
<h2 id="open-source-mcp-server-for-real-time-ai-trading-analysis-️-7010"><a href="https://github.com/atilaahmettaner/tradingview-mcp">Open-Source MCP Server for Real-Time AI Trading Analysis</a> ⭐️ 7.0/10</h2>

<p>The tradingview-mcp project introduces a new Model Context Protocol (MCP) server that connects AI assistants like Claude to real-time cryptocurrency and stock market data. It integrates over 30 technical analysis tools, including Bollinger Bands and candlestick pattern recognition, directly into the AI’s context window without requiring complex API key management. This tool significantly lowers the barrier for building AI-driven trading agents by providing a standardized interface for financial data that previously required custom scripting or expensive terminals like Bloomberg. By leveraging MCP, developers can instantly equip LLMs with live sentiment analysis from Reddit and RSS feeds alongside historical backtesting capabilities. The elimination of multiple API key configurations simplifies the deployment of sophisticated fintech workflows for individual traders and researchers. The server supports multi-exchange data from Binance, KuCoin, and Bybit, offering live screening and six built-in backtesting strategies with Sharpe ratio calculations. It is designed for immediate integration with Claude Desktop and other MCP-compatible clients using Python 3.10+, requiring no API keys for basic market data access.</p>

<p>rss · GitHub Trending - Python · Apr 10, 01:39</p>

<p><strong>Background</strong>: Prior to this development, integrating real-time financial data with LLMs often involved fragmented solutions, high costs, or significant engineering overhead to manage diverse exchange APIs. The emergence of the Model Context Protocol (MCP) by Anthropic created a need for specialized servers that could standardize these connections for AI models. This project fills that niche by offering a free, open-source bridge specifically tailored for quantitative analysis and trading intelligence.</p>
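
<p><strong>Example sketch</strong>: The standard Bollinger Bands calculation (a simple moving average plus or minus two standard deviations), one of the indicators the server exposes; the project itself is Python, so this TypeScript version is only a reference implementation of the formula.</p>

<pre><code class="language-typescript">// Standard Bollinger Bands (20-period SMA +/- 2 standard deviations).
// Reference implementation of the formula, not the MCP server's code.
function bollingerBands(closes: number[], period = 20, width = 2) {
  const recent = closes.slice(-period);
  const mean = recent.reduce((sum, c) => sum + c, 0) / recent.length;
  const variance = recent.reduce((sum, c) => sum + (c - mean) ** 2, 0) / recent.length;
  const sigma = Math.sqrt(variance);
  return {
    middle: mean,
    upper: mean + width * sigma,
    lower: mean - width * sigma,
  };
}

// Example: 20 synthetic closes for a hypothetical pair.
const closes = Array.from({ length: 20 }, (_, i) => 100 + Math.sin(i / 3) * 4);
console.log(bollingerBands(closes));
</code></pre>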

<details><summary>References</summary>
<ul>
<li><a href="https://modelcontextprotocol.io/docs/getting-started/intro">What is the Model Context Protocol (MCP)?</a></li>
<li><a href="https://en.wikipedia.org/wiki/Model_Context_Protocol">Model Context Protocol - Wikipedia</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: As a newly released tool with a score of 7.0, it is gaining traction among developers interested in fintech automation, though broader community feedback on long-term stability is still emerging. Early adopters are highlighting its utility for rapid prototyping of trading bots without the friction of traditional infrastructure setup.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#mcp</code>, <code class="language-plaintext highlighter-rouge">#ai-trading</code>, <code class="language-plaintext highlighter-rouge">#fintech</code>, <code class="language-plaintext highlighter-rouge">#claude-desktop</code>, <code class="language-plaintext highlighter-rouge">#python</code></p>

<hr />

<p><a id="item-64"></a></p>
<h2 id="rowboat-open-source-ai-coworker-with-persistent-memory-️-7010"><a href="https://github.com/rowboatlabs/rowboat">Rowboat: Open-Source AI Coworker with Persistent Memory</a> ⭐️ 7.0/10</h2>

<p>Rowboat is a new open-source desktop application that acts as an AI coworker by building a persistent knowledge graph from your emails and meeting notes. Unlike transient chatbots, it retains context locally to generate reports, prep for meetings, and track topics over time. The tool integrates with Google services and supports voice inputs via Deepgram and ElevenLabs. This project addresses the critical limitation of current AI agents lacking long-term memory and contextual continuity across sessions. By localizing data processing, it offers a privacy-focused alternative to cloud-dependent productivity tools while maintaining high utility. It represents a shift towards ‘local-first’ AI applications where the user owns their knowledge graph. However, its value is currently tied to specific workflows like email and calendar management rather than general code generation. Rowboat operates as a local-first application that converts unstructured work data into an editable Markdown-based knowledge graph. It supports optional integrations for web search (Exa), voice I/O, and external tools via MCP or Composio. Users can query this graph to produce PDF decks, meeting briefs, or voice notes automatically. Installation requires manual configuration of API keys for enhanced features like voice and search.</p>

<p>rss · GitHub Trending - TypeScript · Apr 10, 01:41</p>

<p><strong>Background</strong>: Most AI coding assistants operate in stateless modes, forgetting previous interactions once a session ends, which hinders complex project management. Rowboat fills the niche for a persistent, personal AI agent that accumulates institutional knowledge over time without sending sensitive data to third-party servers. While other tools focus on real-time code completion, Rowboat focuses on synthesizing historical communication and documentation. This approach aligns with the growing demand for AI agents that can manage long-running tasks and maintain project state.</p>
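
<p><strong>Example sketch</strong>: A toy pass that extracts wiki-style links from Markdown notes to form graph edges. The summary only says Rowboat keeps an editable Markdown-based knowledge graph; its actual on-disk format is not documented here, so this is an assumed stand-in.</p>

<pre><code class="language-typescript">// Illustrative only: builds a tiny graph from [[wiki-style]] links in
// Markdown notes; Rowboat's real on-disk format may differ.
const notes: { [title: string]: string } = {
  "2026-04-09 sync": "Discussed the Q2 launch with [[Dana]] and [[Platform team]].",
  "Dana": "Owns billing. Last touched in [[2026-04-09 sync]].",
};

const edges: [string, string][] = [];
for (const [title, body] of Object.entries(notes)) {
  for (const match of body.matchAll(/\[\[([^\]]+)\]\]/g)) {
    edges.push([title, match[1]]);
  }
}

console.log(edges);
// [["2026-04-09 sync", "Dana"], ["2026-04-09 sync", "Platform team"], ["Dana", "2026-04-09 sync"]]
</code></pre>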

<details><summary>References</summary>
<ul>
<li><a href="https://github.com/rowboatlabs/rowboat">GitHub - rowboatlabs/rowboat: Open-source AI coworker, with...</a></li>
<li><a href="https://www.rowboatlabs.com/">Rowboat - Your AI coworker, with memory</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: Early adopters highlight the novelty of the persistent memory feature but note that the setup process for various API keys can be cumbersome for non-technical users. The community is particularly interested in how the Markdown-based graph evolves and whether it can effectively scale for large engineering teams. Some discussions focus on the potential for extending its capabilities beyond administrative tasks into actual codebase analysis.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#typescript</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#memory</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-65"></a></p>
<h2 id="gitnexus-client-side-graph-rag-for-code-intelligence-️-7010"><a href="https://github.com/abhigyanpatwari/GitNexus">GitNexus: Client-Side Graph RAG for Code Intelligence</a> ⭐️ 7.0/10</h2>

<p>GitNexus introduces a browser-based tool that generates interactive knowledge graphs and Graph RAG agents directly from GitHub repositories or ZIP files. It operates entirely on the client side, eliminating the need for server infrastructure while providing deep code analysis capabilities. The project recently gained traction for its ability to run locally without sending code to external servers. This tool solves the critical privacy and latency issues associated with cloud-based code intelligence platforms by keeping all processing local. Developers exploring unfamiliar large codebases can now visualize dependencies and execution flows without risking proprietary data exposure. By leveraging Graph RAG, it provides AI agents with structural context that naive retrieval methods often miss, leading to more accurate code suggestions. The zero-server architecture also removes cost barriers for individual developers and small teams. GitNexus offers two primary usage modes: a Web UI for quick visual exploration and a CLI with Model Context Protocol (MCP) integration for daily development workflows. The Web UI is limited by browser memory to approximately 5,000 files, while the CLI supports full-sized repositories using LadybugDB for storage. It explicitly distinguishes itself from descriptive tools like DeepWiki by focusing on relational analysis of call chains and dependencies.</p>

<p>rss · GitHub Trending - TypeScript · Apr 10, 01:41</p>

<p><strong>Background</strong>: Traditional code exploration tools often rely on simple text search or vector embeddings that fail to capture complex architectural relationships within a codebase. Existing Graph RAG solutions, such as Microsoft’s implementation, typically require significant server-side computation and setup, making them inaccessible for quick, ad-hoc analysis. GitNexus fills this niche by bringing graph-based context engineering to the browser, allowing instant indexing of any repository without backend overhead. This approach addresses the growing need for secure, efficient AI-assisted coding environments that respect data sovereignty.</p>
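
<p><strong>Example sketch</strong>: A toy import-graph pass over a few in-memory files, standing in for the repository-level relational analysis described above; GitNexus’ real indexer (and its LadybugDB-backed CLI) is far more sophisticated.</p>

<pre><code class="language-typescript">// Toy illustration of building a relational view of code client-side;
// not GitNexus's actual indexing pipeline.
const files: { [path: string]: string } = {
  "src/api.ts": 'import { db } from "./db"; export const api = () => db();',
  "src/db.ts": 'export const db = () => "rows";',
  "src/ui.ts": 'import { api } from "./api"; console.log(api());',
};

const graph: { [path: string]: string[] } = {};
for (const [path, source] of Object.entries(files)) {
  // Record which relative modules each file imports.
  graph[path] = [...source.matchAll(/from\s+"(\.[^"]+)"/g)].map((m) => m[1]);
}

console.log(graph);
// { "src/api.ts": ["./db"], "src/db.ts": [], "src/ui.ts": ["./api"] }
</code></pre>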

<details><summary>References</summary>
<ul>
<li><a href="https://microsoft.github.io/graphrag/">Welcome - GraphRAG</a></li>
<li><a href="https://grokipedia.com/page/GraphRAG">GraphRAG</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project maintainers have issued strong warnings regarding unauthorized cryptocurrency tokens using the GitNexus name, clarifying that no official coin exists. Active development discussions and support are currently centralized in their official Discord channel, where users share feedback on MCP integration with tools like Cursor and Claude Code.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#code-intelligence</code>, <code class="language-plaintext highlighter-rouge">#graph-rag</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#client-side</code>, <code class="language-plaintext highlighter-rouge">#knowledge-graph</code></p>

<hr />

<p><a id="item-66"></a></p>
<h2 id="gpumd-high-performance-gpu-molecular-dynamics-engine-️-7010-1"><a href="https://github.com/brucefan1983/GPUMD">GPUMD: High-Performance GPU Molecular Dynamics Engine</a> ⭐️ 7.0/10</h2>

<p>GPUMD is a specialized molecular dynamics package optimized to run entirely on graphics processing units using CUDA. It enables researchers to simulate the physical movements of atoms and molecules with significantly higher efficiency than traditional CPU-based methods. The project leverages parallel computing architectures to accelerate scientific simulations in computational chemistry and materials science. Molecular dynamics simulations typically involve vast numbers of particles, making them computationally expensive and often impossible to solve analytically. By offloading these intensive calculations to GPUs, GPUMD drastically reduces simulation time, allowing for longer trajectories and larger systems to be studied. This acceleration is critical for advancing research in biophysics and materials design where time-scale limitations often hinder progress. Although outside the core AI model training ecosystem, its high-performance computing capabilities are essential for generating the data often used to train machine learning force fields. The software is designed specifically for NVIDIA GPUs using the CUDA programming model to maximize throughput. It solves Newton’s equations of motion for interacting particles using numerical methods tailored for parallel execution. Users can expect significant performance gains when simulating complex molecular systems compared to standard CPU implementations.</p>

<p>rss · GitHub Trending - CUDA · Apr 10, 01:33</p>

<p><strong>Background</strong>: Molecular dynamics (MD) is a computer simulation method for analyzing the physical movements of atoms and molecules by numerically solving Newton’s equations of motion. Traditional MD packages often rely on CPUs or hybrid CPU-GPU approaches, which can become bottlenecks when simulating large-scale systems over long time periods. GPUMD fills a niche by providing a highly efficient, GPU-native engine that minimizes data transfer overhead and maximizes parallel processing power. This approach addresses the mathematical ill-conditioning and cumulative errors associated with long simulations by enabling the use of more precise algorithms within feasible timeframes.</p>
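
<p><strong>Example sketch</strong>: Newton’s equations of motion and one velocity-Verlet step, the standard integration scheme in molecular dynamics codes. The summary does not state which integrator GPUMD uses, so this is background math rather than a claim about its internals.</p>

<pre><code class="language-latex">% Newton's equations of motion for particle i with mass m_i and potential U:
m_i \ddot{\mathbf{r}}_i = \mathbf{f}_i = -\nabla_{\mathbf{r}_i} U(\mathbf{r}_1, \dots, \mathbf{r}_N)

% One velocity-Verlet step of size \Delta t (standard in MD engines):
\mathbf{r}_i(t + \Delta t) = \mathbf{r}_i(t) + \mathbf{v}_i(t)\,\Delta t
  + \tfrac{1}{2}\,\frac{\mathbf{f}_i(t)}{m_i}\,\Delta t^2

\mathbf{v}_i(t + \Delta t) = \mathbf{v}_i(t)
  + \tfrac{1}{2}\,\frac{\mathbf{f}_i(t) + \mathbf{f}_i(t + \Delta t)}{m_i}\,\Delta t
</code></pre>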

<details><summary>References</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Molecular_dynamics_simulation">Molecular dynamics simulation</a></li>
<li><a href="https://grokipedia.com/page/Thread_block_(CUDA_programming)">Thread block (CUDA programming)</a></li>

</ul>
</details>

<p><strong>Discussion</strong>: The project holds a solid score of 7.0, indicating strong utility for specialists in computational chemistry despite being a niche tool. Discussions likely focus on optimization techniques for specific interatomic potentials and the practical benefits of full-GPU execution workflows.</p>

<p><strong>Tags</strong>: <code class="language-plaintext highlighter-rouge">#molecular-dynamics</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#hpc</code>, <code class="language-plaintext highlighter-rouge">#computational-chemistry</code>, <code class="language-plaintext highlighter-rouge">#gpu</code></p>

<hr />]]></content><author><name></name></author><summary type="html"><![CDATA[From 132 items, 66 important content pieces were selected]]></summary></entry><entry xml:lang="zh"><title type="html">Horizon Summary: 2026-04-11 (ZH)</title><link href="https://ming-321.github.io/horizon/2026/04/10/summary-zh.html" rel="alternate" type="text/html" title="Horizon Summary: 2026-04-11 (ZH)" /><published>2026-04-10T16:00:00+00:00</published><updated>2026-04-10T16:00:00+00:00</updated><id>https://ming-321.github.io/horizon/2026/04/10/summary-zh</id><content type="html" xml:base="https://ming-321.github.io/horizon/2026/04/10/summary-zh.html"><![CDATA[<blockquote>
  <p>From 132 items, 66 important content pieces were selected</p>
</blockquote>

<hr />

<h3 id="头条速递">头条速递</h3>
<ol>
  <li><a href="#item-1">CPUID 官网遭劫持，通过 CPU-Z 和 HWMonitor 分发恶意软件</a> ⭐️ 9.0/10</li>
  <li><a href="#item-2">新加坡国立大学推出 DMax：一种实现快速并行解码的扩散语言模型新范式</a> ⭐️ 9.0/10</li>
  <li><a href="#item-3">斯坦福推出用于自改进 LLM 代理的 Meta-Harness</a> ⭐️ 9.0/10</li>
  <li><a href="#item-4">DeepSeek V4 拟发布：万亿参数规模并原生适配华为昇腾芯片</a> ⭐️ 9.0/10</li>
  <li><a href="#item-5">Solayer 创始人揭示超 20% 免费 LLM 路由器注入恶意代码</a> ⭐️ 9.0/10</li>
  <li><a href="#item-6">阿里视频生成大模型 Wan2.7 以 1334 Elo 评分登顶 DesignArena 榜单</a> ⭐️ 8.0/10</li>
  <li><a href="#item-7">星动纪元在具身奥林匹克中斩获三项全球冠军</a> ⭐️ 8.0/10</li>
  <li><a href="#item-8">国产开源模型以十倍性价比占领硅谷市场</a> ⭐️ 8.0/10</li>
  <li><a href="#item-9">开发者报告 RTX 5090 上 cuBLAS 存在 60% 性能缺陷</a> ⭐️ 8.0/10</li>
  <li><a href="#item-10">开源模型 GLM-5.1 登顶代码竞技场排行榜</a> ⭐️ 8.0/10</li>
  <li><a href="#item-11">GLM-5.1 在代理基准测试中媲美 Opus，成本仅为三分之一</a> ⭐️ 8.0/10</li>
  <li><a href="#item-12">开发者发布 9B LoRA 模型，实现 89% 自主数据分析成功率</a> ⭐️ 8.0/10</li>
  <li><a href="#item-13">社区发起逆向工程以解锁 Gemma 4 的 MTP 功能</a> ⭐️ 8.0/10</li>
  <li><a href="#item-14">TurboQuant 与 TriAttention 结合在 AMD HIP 版 llama.cpp 中实现 6.8 倍 KV 缓存缩减</a> ⭐️ 8.0/10</li>
  <li><a href="#item-15">法国承诺为 250 万公务员将 Windows 替换为 Linux</a> ⭐️ 8.0/10</li>
  <li><a href="#item-16">Claude 模型在上下文极限附近出现身份混淆风险</a> ⭐️ 8.0/10</li>
  <li><a href="#item-17">CPU-Z 官网遭黑客入侵，部分下载包被植入恶意代码</a> ⭐️ 8.0/10</li>
  <li><a href="#item-18">WireGuard 在解决微软签名问题后发布新版 Windows 客户端</a> ⭐️ 7.0/10</li>
  <li><a href="#item-19">ChatGPT 语音模式运行在较旧且较弱的模型上</a> ⭐️ 7.0/10</li>
  <li><a href="#item-20">生数科技完成近 20 亿元 B 轮融资，发力通用世界模型</a> ⭐️ 7.0/10</li>
  <li><a href="#item-21">特朗普政府传唤 Reddit 出席大陪审团以揭露批评 ICE 的用户</a> ⭐️ 7.0/10</li>
  <li><a href="#item-22">ibu-boost：采用绝对分裂拒绝机制的 GBDT 库</a> ⭐️ 7.0/10</li>
  <li><a href="#item-23">Gemma 4 修复更新：推理预算与工具调用模板已发布</a> ⭐️ 7.0/10</li>
  <li><a href="#item-24">全新开源套件简化高质量 GGUF 量化流程</a> ⭐️ 7.0/10</li>
  <li><a href="#item-25">本地 Qwen3.5 结合 MCP 工具取代云端大模型进行网络研究</a> ⭐️ 7.0/10</li>
  <li><a href="#item-26">社区指出大模型推理令牌格式存在混乱局面</a> ⭐️ 7.0/10</li>
  <li><a href="#item-27">FCC 拟投票禁止中国实验室检测美国电子设备</a> ⭐️ 7.0/10</li>
  <li><a href="#item-28">MiniMax 发布新一代音乐大模型 Music 2.6 并开启免费内测</a> ⭐️ 7.0/10</li>
  <li><a href="#item-29">Anthropic 临时封禁后恢复 OpenClaw 开发者账号</a> ⭐️ 7.0/10</li>
</ol>

<h3 id="关注动态">关注动态</h3>
<ol>
  <li><a href="#item-30">MemSearch Updates: 3 updates — update OpenClaw capture architecture from llm_output debounce t…, bump memsearch to 0.2.4 and OpenClaw plugin to 0.2.0 (#322), OpenClaw plugin — remove child_process, simplify capture, f…</a> ⭐️ ?/10</li>
  <li><a href="#item-31">openai/codex: 3 releases — rust-v0.119.0-alpha.33, rust-v0.119.0-alpha.32, rust-v0.119.0-alpha.29</a> ⭐️ ?/10</li>
  <li><a href="#item-32">anthropics/claude-code: 2 releases — v2.1.101, v2.1.100</a> ⭐️ ?/10</li>
</ol>

<h3 id="github-热榜">GitHub 热榜</h3>
<ol>
  <li><a href="#item-33">微软发布 BitNet 以实现高效 1 比特大模型推理</a> ⭐️ 10.0/10</li>
  <li><a href="#item-34">Karpathy 发布纯 C 和 CUDA 编写的极简 LLM 训练项目</a> ⭐️ 10.0/10</li>
  <li><a href="#item-35">Instant-NGP 利用 CUDA 彻底革新 NeRF 训练速度</a> ⭐️ 10.0/10</li>
  <li><a href="#item-36">SageAttention 通过量化实现 2-5 倍推理加速</a> ⭐️ 10.0/10</li>
  <li><a href="#item-37">Nous Research 推出自我进化的 Hermes 智能体框架</a> ⭐️ 9.0/10</li>
  <li><a href="#item-38">VoxCPM2：无分词器的多语言语音合成与克隆模型</a> ⭐️ 9.0/10</li>
  <li><a href="#item-39">DFlash 实现大模型投机解码的高效并行草稿生成</a> ⭐️ 9.0/10</li>
  <li><a href="#item-40">Open WebUI：支持本地与云端大模型的自托管界面</a> ⭐️ 9.0/10</li>
  <li><a href="#item-41">Apache Airflow：行业标准的工作流编排平台</a> ⭐️ 9.0/10</li>
  <li><a href="#item-42">Daytona：用于 AI 代码执行的安全基础设施</a> ⭐️ 9.0/10</li>
  <li><a href="#item-43">Executor 统一 AI 智能体工具集成</a> ⭐️ 9.0/10</li>
  <li><a href="#item-44">Superset 在本地协调多个 AI 编程智能体</a> ⭐️ 9.0/10</li>
  <li><a href="#item-45">DeepGEMM 推出专为 CUDA 优化的 FP8 矩阵乘法库</a> ⭐️ 9.0/10</li>
  <li><a href="#item-46">面向 Mamba 序列建模的优化 CUDA 内核</a> ⭐️ 9.0/10</li>
  <li><a href="#item-47">NVIDIA cuVS：GPU 加速向量搜索库</a> ⭐️ 9.0/10</li>
  <li><a href="#item-48">Archon：打造确定性 AI 编码工作流的开源框架</a> ⭐️ 8.0/10</li>
  <li><a href="#item-49">Kronos：首个面向金融 K 线的开源基础模型</a> ⭐️ 8.0/10</li>
  <li><a href="#item-50">Claudian 将 AI 编程助手集成到 Obsidian 知识库中</a> ⭐️ 8.0/10</li>
  <li><a href="#item-51">Hugging Face Skills 标准化 AI 智能体工作流</a> ⭐️ 8.0/10</li>
  <li><a href="#item-52">QMD：面向 AI 代理的本地混合搜索引擎</a> ⭐️ 8.0/10</li>
  <li><a href="#item-53">Multica 将 AI 编码代理编排为虚拟团队成员</a> ⭐️ 8.0/10</li>
  <li><a href="#item-54">VoltAgent：面向 AI 代理工程的 TypeScript 框架</a> ⭐️ 8.0/10</li>
  <li><a href="#item-55">LlamaIndex 发布 LiteParse 以实现快速本地 PDF 解析</a> ⭐️ 8.0/10</li>
  <li><a href="#item-56">Qwen Code：面向开发者的开源终端 AI 代理</a> ⭐️ 8.0/10</li>
  <li><a href="#item-57">OpenCode：面向开发者的开源 AI 编程助手</a> ⭐️ 8.0/10</li>
  <li><a href="#item-58">NVIDIA cuopt：用于大规模路由的 GPU 加速求解器</a> ⭐️ 8.0/10</li>
  <li><a href="#item-59">ThunderKittens 加速 CUDA 内核开发进程</a> ⭐️ 8.0/10</li>
  <li><a href="#item-60">DeepTutor v1.0 发布：原生智能体个性化辅导系统</a> ⭐️ 7.0/10</li>
  <li><a href="#item-61">OpenDataLoader PDF：面向 AI RAG 管道的高精度解析器</a> ⭐️ 7.0/10</li>
  <li><a href="#item-62">Superpowers 框架强制执行结构化代理工作流</a> ⭐️ 7.0/10</li>
  <li><a href="#item-63">用于实时 AI 交易分析的开源 MCP 服务器</a> ⭐️ 7.0/10</li>
  <li><a href="#item-64">Rowboat：具备持久记忆功能的开源 AI 同事</a> ⭐️ 7.0/10</li>
  <li><a href="#item-65">GitNexus：用于代码智能的客户端图 RAG 工具</a> ⭐️ 7.0/10</li>
  <li><a href="#item-66">GPUMD：高性能 GPU 分子动力学引擎</a> ⭐️ 7.0/10</li>
</ol>

<h2 id="头条速递-1">头条速递</h2>

<p><a id="item-1"></a></p>
<h2 id="cpuid-官网遭劫持通过-cpu-z-和-hwmonitor-分发恶意软件-️-9010"><a href="https://www.theregister.com/2026/04/10/cpuid_site_hijacked/">CPUID 官网遭劫持，通过 CPU-Z 和 HWMonitor 分发恶意软件</a> ⭐️ 9.0/10</h2>

<p>CPUID 官方网站遭遇供应链攻击，其热门工具 CPU-Z 和 HWMonitor 的下载链接被重定向至恶意的 Cloudflare R2 存储桶。攻击者用嵌入了恶意软件的版本替换了合法安装程序，导致部分用户的 Windows Defender 立即发出病毒警报。项目维护者初步确认服务器上的文件完好无损，但网站上的下载链接已被篡改。 此次事件至关重要，因为 CPU-Z 和 HWMonitor 是开发人员、系统管理员和硬件爱好者用于验证系统规格和监控健康状况的行业标准工具。如此大规模的泄露使大量用户在信任软件的伪装下面临数据窃取、勒索软件或未授权远程访问的风险。它凸显了软件分发渠道的脆弱性，以及绕过传统边界防御的供应链攻击所带来的严重风险。此外，这可能会侵蚀用户对官方供应商网站的信任，迫使他们依赖带有自身风险的第三方镜像站点。 攻击途径涉及劫持网站的 HTML 代码，将下载按钮重定向到托管恶意可执行文件的外部 Cloudflare R2 对象存储，而非直接破坏 CPUID 服务器上的实际文件。早期报告显示 Windows Defender 成功标记了下载的恶意安装程序，但误报疲劳仍是安全专业人员关注的问题。维护人员表示正在调查此次泄露，同时确认其后端基础设施上存储的原始文件未受损害。</p>

<p>hackernews · pashadee · Apr 10, 13:29</p>

<p><strong>背景</strong>: 供应链攻击是指网络罪犯针对软件或硬件分发网络中安全性较弱的环节，在合法产品到达最终用户之前注入恶意代码的行为。CPU-Z 和 HWMonitor 是由 CPUID 开发的广受推崇的免费工具，用于显示计算机处理器、主板和传感器的详细技术信息。Cloudflare R2 是一种兼容 Amazon S3 API 的分布式对象存储解决方案，攻击者常因其低成本和无出口费用的特点而利用其托管大型负载。此类攻击尤为危险，因为用户天生信任直接从官方供应商域名下载的软件。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.cloudflare.com/developer-platform/products/r2/">R2 | Scalable solution for distributed object storage | Cloudflare</a></li>
<li><a href="https://en.wikipedia.org/wiki/Supply_chain_attack">Supply chain attack</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 社区情绪混合了恐慌与技术分析，有用户证实下载受损文件后 Windows Defender 立即检测到了病毒。一位自称维护者的人评论说他们正在努力核实问题范围，指出其内部服务器上的文件看起来是干净的，而网站链接是主要的攻击途径。一些用户讨论了误报训练人们忽略警告的讽刺性，另一些人则澄清了受影响的 CPUID 工具与 HWInfo 等类似软件之间的区别。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#supply-chain-attack</code>, <code class="language-plaintext highlighter-rouge">#malware</code>, <code class="language-plaintext highlighter-rouge">#security-incidents</code>, <code class="language-plaintext highlighter-rouge">#system-utilities</code>, <code class="language-plaintext highlighter-rouge">#infrastructure-security</code></p>

<hr />

<p><a id="item-2"></a></p>
<h2 id="新加坡国立大学推出-dmax一种实现快速并行解码的扩散语言模型新范式-️-9010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1sht2yo/national_university_of_singapore_presents_dmax_a/">新加坡国立大学推出 DMax：一种实现快速并行解码的扩散语言模型新范式</a> ⭐️ 9.0/10</h2>

<p>新加坡国立大学的研究人员推出了 DMax，这是一种针对扩散语言模型（dLLMs）的新框架，通过减轻误差累积实现了激进的并行解码。其核心创新在于将解码重构为渐进式自我精炼过程，使模型能够在生成过程中纠正自身的错误预测，而不是立即锁定这些预测。该方法利用 On-Policy Uniform Training 和 Soft Parallel Decoding 统一了掩码和均匀训练策略，同时将中间状态表示为预测嵌入与掩码嵌入之间的插值。 这一进展意义重大，因为它解决了扩散大语言模型的主要瓶颈，即当并行解码过多令牌时，早期的错误猜测通常会像滚雪球一样导致输出质量下降。通过使模型能够有效修正自身错误，DMax 在不牺牲准确性的前提下释放了并行生成的理论速度优势，其推理速度有望媲美甚至超越传统的自回归模型。在 H200 GPU 上实现的每秒 1,338 个令牌的性能表明，实时生成式人工智能应用取得了重大飞跃。如果得到广泛采用，这种范式可能会将行业标准从顺序令牌生成转变为高度并行化的过程，从而大幅降低大规模部署的延迟。 实验结果显示，与原始的 LLaDA-2.0-mini 相比，DMax 在 GSM8K 基准测试上将每次前向传播生成的令牌数（TPF）从 2.04 提高到 5.47，同时保持了相当的准确性。在 MBPP 编码基准测试中，TPF 从 2.71 增加到 5.86，证明了其在不同任务上的稳健性能提升。该系统在使用两块 H200 GPU 且批量大小为 1 的情况下，平均吞吐量达到每秒 1,338 个令牌，凸显了其在低延迟场景下的高效性。该方法依赖于将中间解码状态表示为软插值，与僵化的二进制掩码到令牌转换相比，这保留了不确定性并促进了更轻松的修正。</p>

<p>rss · r/LocalLLaMA · Apr 10, 17:23</p>

<p><strong>背景</strong>: 扩散语言模型（dLLMs）是一种受物理扩散过程启发的生成式人工智能，它通过逐渐去噪随机噪声来生成数据，而不是像传统自回归模型那样逐个预测令牌。虽然 dLLMs 理论上允许同时并行生成多个令牌，但它们常常受到误差累积的影响，即早期的错误会破坏后续步骤的上下文。并行解码策略旨在通过一次预测多个令牌来加速推理，但由于对初始错误的敏感性，以前的方法难以在速度和质量之间取得平衡。渐进式自我精炼是一个新兴概念，模型通过迭代改进其输出，类似于人类起草和编辑文本的方式，DMax 利用这一概念来稳定并行生成。</p>
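
<p><strong>示例</strong>: 下面给出“预测嵌入与掩码嵌入之间的插值”的一种示意性写法，其中的记号为本文假设，并非论文原文公式。</p>

<pre><code class="language-latex">% 示意性写法（记号为假设）：h_i 为第 i 个位置的中间状态，
% e(\hat{x}_i) 为当前预测 token 的嵌入，e_{\mathrm{mask}} 为掩码嵌入，
% \alpha_i \in [0, 1] 表示该预测的置信度。
h_i = \alpha_i \, e(\hat{x}_i) + (1 - \alpha_i)\, e_{\mathrm{mask}}
</code></pre>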

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.emergentmind.com/topics/confident-parallel-decoding">Confident Parallel Decoding for Diffusion LLMs</a></li>
<li><a href="https://arxiv.org/html/2502.05605v4">Evolving LLMs’ Self-Refinement Capability via Synergistic...</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#diffusion models</code>, <code class="language-plaintext highlighter-rouge">#llm research</code>, <code class="language-plaintext highlighter-rouge">#parallel decoding</code>, <code class="language-plaintext highlighter-rouge">#generative ai</code>, <code class="language-plaintext highlighter-rouge">#nlp</code></p>

<hr />

<p><a id="item-3"></a></p>
<h2 id="斯坦福推出用于自改进-llm-代理的-meta-harness-️-9010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1shyczh/stanford_self_improving_metaharness/">斯坦福推出用于自改进 LLM 代理的 Meta-Harness</a> ⭐️ 9.0/10</h2>

<p>斯坦福研究人员推出了 Meta-Harness，这是一个外层循环系统，能够自动搜索并优化控制大型语言模型（LLM）信息存储与呈现的代码（即 harness）。与之前需要手动进行提示工程或上下文工程的方法不同，该框架利用一个代理提议者来分析执行轨迹和源代码，从而迭代地纠正错误并提升性能。在基准测试中，Meta-Harness 将在线文本分类的准确率提高了 7.7 个百分点，同时使用的上下文字符数量仅为最先进系统的四分之一。 这一进展标志着 AI 系统架构从手动设计向自动优化的重大转变，可能减少人类专家在构建复杂代理工作流方面的依赖。通过使系统能够自我纠正并优化其上下文使用，Meta-Harness 有望大幅降低计算成本，并提高自主代理在实际应用中的可靠性。这种方法超越了现有往往过度压缩反馈的文本优化器，提供了一种更细致的方式来进化 LLM 能力而无需改变底层模型权重。最终，它为真正的自改进 AI 系统铺平了道路，使其能够以极少的人工干预适应新任务。 该系统利用一个代理提议者，通过文件系统访问所有先前候选者的源代码、得分和执行轨迹来指导其搜索过程。在涉及 200 道 IMO 级别问题的检索增强数学推理任务中，单个发现的 harness 在五个保留模型上将平均准确率提高了 4.7 个百分点。此外，在 TerminalBench-2 的代理编码场景中，发现的 harness 表现优于最佳手工设计的基线，展示了其在不同领域的鲁棒性。该项目的代码和工件已在 GitHub 上公开，供进一步的实验和本地部署使用。</p>

<p>rss · r/LocalLLaMA · Apr 10, 20:33</p>

<p><strong>背景</strong>: 传统上，优化大型语言模型的性能依赖于“提示工程”（精心构建特定输入）和“上下文工程”（系统地管理提供给模型的信息）。随着 AI 系统演变为能够采取行动的“代理”，开发者创建了“harness”——即管理内存、检索和编排逻辑的周边代码——但这些仍主要由人工设计。上下文工程已成为一门关键学科，因为 LLM 存在架构盲点，使得信息的结构化方式远比包含的数据量更重要。Meta-Harness 代表了下一步的演进，它自动化了这些 harness 的设计，将编排代码本身视为可优化的变量，而非静态的人工产物。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://yoonholee.com/meta-harness/">Meta-Harness</a></li>
<li><a href="https://arxiv.org/pdf/2603.28052">Meta-Harness: End-to-End Optimization of Model Harnesses</a></li>
<li><a href="https://blog.bytebytego.com/p/a-guide-to-context-engineering-for">A Guide to Context Engineering for LLMs</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm research</code>, <code class="language-plaintext highlighter-rouge">#autonomous agents</code>, <code class="language-plaintext highlighter-rouge">#prompt optimization</code>, <code class="language-plaintext highlighter-rouge">#stanford</code>, <code class="language-plaintext highlighter-rouge">#arxiv</code></p>

<hr />

<p><a id="item-4"></a></p>
<h2 id="deepseek-v4-拟发布万亿参数规模并原生适配华为昇腾芯片-️-9010"><a href="https://finance.sina.com.cn/tech/2026-04-10/doc-inhtymqf5317301.shtml">DeepSeek V4 拟发布：万亿参数规模并原生适配华为昇腾芯片</a> ⭐️ 9.0/10</h2>

<p>DeepSeek 计划于 2026 年 4 月下旬正式发布其旗舰模型 V4，该模型具备万亿级参数规模和百万级上下文窗口。此次发布的关键突破在于首次实现了与华为昇腾等国产 AI 芯片的深度适配，标志着中国大模型在硬件依赖上的重大转变。这一举措意味着高性能推理和训练将不再完全依赖英伟达的 CUDA 生态系统。 这一进展是中国“去 CUDA 化”战略的关键里程碑，通过实现国产硅片上的高效运行，可能减轻半导体制裁对国家 AI 发展的冲击。如果成功，这将证明华为达芬奇架构等替代方案能够承载万亿参数工作负载，从而挑战英伟达的主导地位并重塑全球 AI 硬件市场格局。包括阿里和腾讯在内的科技巨头大量预订芯片以及近期 AI 芯片价格上涨 20% 的市场反应，凸显了这一本土化解决方案的高关注度与预期需求。 据报道，该模型支持高达一百万 token 的上下文窗口，这可能需要利用华为专有的 HIBL 或 HiZQ 内存技术来进行先进的显存管理。主要中国科技公司已预订了数十万片新一代 AI 芯片，以便在云服务中集成 DeepSeek V4 并迎接正式发布。尽管 DeepSeek 尚未正式确认这些细节，但芯片价格 reported 上涨 20% 表明供应链正在对这一预期的整合做出紧张反应。</p>

<p>telegram · zaihuapd · Apr 10, 05:16</p>

<p><strong>背景</strong>: 历史上，训练和运行万亿参数的大型语言模型（LLM）一直高度依赖英伟达 GPU 及其专有的 CUDA 软件栈，因为它们在计算效率和工具成熟度方面具有优势。华为基于达芬奇架构的昇腾系列提供了国产替代方案，但在超大规模模型的性能和易用性上曾面临匹配 CUDA 的挑战。实现“深度适配”涉及重写底层内核并优化分布式训练策略，以克服非 CUDA 硬件上的显存瓶颈和通信延迟。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.tomshardware.com/tech-industry/semiconductors/huaweis-ascend-ai-chip-ecosystem-scales">Huawei's Ascend AI chip ecosystem scales up as China pushes for semiconductor independence — however, firm lags behind on efficiency and performance | Tom's Hardware</a></li>
<li><a href="https://www.tomshardware.com/tech-industry/artificial-intelligence/huawei-ascend-npu-roadmap-examined-company-targets-4-zettaflops-fp4-performance-by-2028-amid-manufacturing-constraints">Huawei Ascend NPU roadmap examined — company targets 4 ZettaFLOPS FP4 performance by 2028, amid manufacturing constraints | Tom's Hardware</a></li>
<li><a href="https://www.microsoft.com/en-us/research/blog/deepspeed-extreme-scale-model-training-for-everyone/">DeepSpeed: Extreme-scale model training for... - Microsoft Research</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#deepseek</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#hardware-acceleration</code>, <code class="language-plaintext highlighter-rouge">#ai-chips</code>, <code class="language-plaintext highlighter-rouge">#china-tech</code></p>

<hr />

<p><a id="item-5"></a></p>
<h2 id="solayer-创始人揭示超-20-免费-llm-路由器注入恶意代码-️-9010"><a href="https://x.com/Fried_rice/status/2042423713019412941">Solayer 创始人揭示超 20% 免费 LLM 路由器注入恶意代码</a> ⭐️ 9.0/10</h2>

<p>Solayer 创始人寿昌凡发布了一项针对 428 个 LLM API 路由器的研究，发现 400 个免费服务中有 8 个主动注入恶意代码或窃取凭证。该研究识别出一个被篡改的付费路由器，并发现 17 个路由器访问了泄露的 AWS 凭证，部分甚至窃取了测试私钥中的 ETH。这些发现突显了当前 LLM 基础设施供应链中端到端加密保护的严重缺失。 此次披露揭示了一个严重的供应链漏洞，依赖免费路由服务的开发者面临应用被接管或凭证被盗的风险。由于这些路由器作为中间人代理能够读取明文 JSON 载荷，因此存在大规模 Token 计费欺诈和主机接管的巨大隐患。这一发现挑战了日益依赖第三方基础设施进行成本优化的 LLM 代理生态系统的安全假设。鉴于目前缺乏针对此类中间件的强制加密标准，立即审计现有依赖关系至关重要。 该研究利用自定义的</p>

<p>telegram · zaihuapd · Apr 10, 08:30</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-security</code>, <code class="language-plaintext highlighter-rouge">#llm-supply-chain</code>, <code class="language-plaintext highlighter-rouge">#infrastructure-risk</code>, <code class="language-plaintext highlighter-rouge">#cybersecurity</code>, <code class="language-plaintext highlighter-rouge">#api-vulnerability</code></p>

<hr />

<p><a id="item-6"></a></p>
<h2 id="阿里视频生成大模型-wan27-以-1334-elo-评分登顶-designarena-榜单-️-8010"><a href="https://www.qbitai.com/2026/04/399370.html">阿里视频生成大模型 Wan2.7 以 1334 Elo 评分登顶 DesignArena 榜单</a> ⭐️ 8.0/10</h2>

<p>阿里的 Wan2.7 模型已正式登顶 DesignArena 榜单，获得了 1334 的竞争性 Elo 评分。这个统一的模型家族支持高达 4K 分辨率的图像生成和高级编辑功能，包括对面部特征和角色一致性的精确控制。该排名反映了其在与众包其他最先进设计 AI 模型的对抗中表现出的卓越性能。 在 DesignArena 上夺得榜首标志着生成式 AI 能力的重大飞跃，特别是对于需要高保真度和可编辑性的专业设计工作流程而言。通过在众包基准测试中超越竞争对手，Wan2.7 展示了其对需要保持角色一致性和定制详细虚拟形象创作者的实用价值。这一成就迫使其他科技巨头加速自身的视频和图像生成研究，以便在快速演变的多模态 AI 格局中保持竞争力。 Wan2.7 模型家族包含支持标准 2K 输出的变体以及支持 4K 文本生成图像的 Pro 变体。其关键技术特性包括用于独特肖像创作的“千面”（Thousand Faces）技术，以及用于多图像工作流和文本渲染的强大工具。该模型可通过阿里云 Model Studio 和 Kie.ai 等第三方 API 访问，在一个界面中提供生成和编辑功能。</p>

<p>rss · 量子位 · Apr 10, 12:07</p>

<p><strong>背景</strong>: DesignArena 是一个众包基准测试平台，它使用类似于国际象棋中使用的 Elo 系统的 Bradley Terry 评级系统，根据真实用户的投票行为对 AI 模型进行排名。在这个系统中，模型在匿名的成对对抗中进行竞争，用户为更好的输出投票，并根据与不同实力对手的输赢记录动态调整评级。这种方法比静态数据集提供了更可靠的人类偏好衡量标准，因为它随着社区反馈和新出现的模型能力而不断演变。</p>
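
<p><strong>示例</strong>: 作为补充，标准 Elo 评分的期望胜率与更新公式如下（通用定义，并非 DesignArena 的具体实现细节）。</p>

<pre><code class="language-latex">% E_A 为 A 对 B 的期望胜率，R_A、R_B 为当前评分，
% S_A 为实际结果（胜 1、平 0.5、负 0），K 为更新步长。
E_A = \frac{1}{1 + 10^{(R_B - R_A)/400}}, \qquad
R_A' = R_A + K \, (S_A - E_A)
</code></pre>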

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.atlascloud.ai/blog/guides/next-gen-ai-powerhouse-wan-2-7-ai-image-model-everything-you-need-to-know">Next-Gen AI Powerhouse Wan 2.7 AI Image Model: Everything You Need to Know - Atlas Cloud Blog</a></li>
<li><a href="https://www.designarena.ai/leaderboard">designarena.ai/leaderboard</a></li>
<li><a href="https://en.wikipedia.org/wiki/Elo_rating_system">Elo rating system - Wikipedia</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#video-generation</code>, <code class="language-plaintext highlighter-rouge">#generative-ai</code>, <code class="language-plaintext highlighter-rouge">#benchmarks</code>, <code class="language-plaintext highlighter-rouge">#alibaba</code>, <code class="language-plaintext highlighter-rouge">#large-models</code></p>

<hr />

<p><a id="item-7"></a></p>
<h2 id="星动纪元在具身奥林匹克中斩获三项全球冠军-️-8010"><a href="https://www.qbitai.com/2026/04/399351.html">星动纪元在具身奥林匹克中斩获三项全球冠军</a> ⭐️ 8.0/10</h2>

<p>星动纪元（Robotera）在近期的具身奥林匹克比赛中击败了包括 PI 在内的竞争对手，赢得了三项全球冠军。该公司利用其人形机器人 STAR1，在物流和仓储场景中展示了卓越的性能。该系统在自主导航、避障以及精确抓取方面表现优异，从而在众多参赛作品中脱颖而出。 这一成就验证了星动纪元的技术实力，而就在几个月前，该公司刚刚获得了由吉利资本领投的 1.4 亿美元 A+ 轮融资。通过在实用性任务而非纯理论基准上证明其优越性，此次胜利标志着行业重心正转向适用于工业场景的具身智能解决方案。这使得这家中国初创公司在快速增长的人形机器人市场中成为足以抗衡全球老牌玩家的有力竞争者。此次成功表明，其在灵巧操作和复杂环境交互方面的方法目前处于行业领先地位。 夺冠的 STAR1 机器人专为物流和仓储场景优化，配备了能够识别物品类型并执行精确抓取的灵巧机械臂。该系统展示了在复杂仓库环境中自主导航和避开动态障碍物的能力，全程无需人工干预。虽然摘要中未列出具体的性能数据，但比赛侧重于实际效用而非模拟分数，突显了该机器人的落地部署潜力。</p>

<p>rss · 量子位 · Apr 10, 10:32</p>

<p><strong>背景</strong>: 具身智能（Embodied AI）是指拥有物理身体的人工智能系统，使它们能够通过传感器和执行器与现实世界进行交互并从中学习。具身认知（Embodied cognition）理论认为，智能深受生物体身体状态和能力的影响，这一原理如今已被应用于机器人领域。像具身奥林匹克这样的竞赛是衡量机器人从受控实验室走向非结构化现实环境进展的关键基准。星动纪元（Robotera）最近因其获得吉利和北汽等主要汽车制造商的强力产业支持而备受关注。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.humanoidsdaily.com/feed/robotera-secures-140m-series-a-backed-by-automakers-geely-and-baic-claims-70m-in-orders">Robotera Secures $140M Series A+ Backed by Automakers Geely and BAIC, Claims $70M in Orders | Humanoids Daily</a></li>
<li><a href="https://www.robotera.com/en/">ROBOTERA</a></li>
<li><a href="https://en.wikipedia.org/wiki/Embodied_cognition">Embodied cognition - Wikipedia</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#embodied-ai</code>, <code class="language-plaintext highlighter-rouge">#robotics</code>, <code class="language-plaintext highlighter-rouge">#benchmarks</code>, <code class="language-plaintext highlighter-rouge">#ai-competition</code>, <code class="language-plaintext highlighter-rouge">#industry-news</code></p>

<hr />

<p><a id="item-8"></a></p>
<h2 id="国产开源模型以十倍性价比占领硅谷市场-️-8010"><a href="https://www.qbitai.com/2026/04/398807.html">国产开源模型以十倍性价比占领硅谷市场</a> ⭐️ 8.0/10</h2>

<p>据报道，中国开源人工智能模型已占据硅谷相当大的市场份额，其性价比比现有替代品高出十倍以上。这一转变获得了 Meta 首席人工智能科学家杨立昆（Yann LeCun）的公开赞誉，他特别强调了这些新模型的高效性。这一趋势标志着一个关键时刻，即中国开发的开放权重模型正成为美国科技中心开发人员的首选。 这一发展标志着全球人工智能格局的重大逆转，挑战了美国专有模型长期以来的主导地位。性价比的急剧提高可能使先进的人工智能能力大众化，让初创企业和小型企业能够在不产生高昂成本的情况下部署强大的模型。此外，像杨立昆这样的人物背书表明，中国开源努力的技术质量已达到与西方最先进模型竞争甚至超越的水平。从长远来看，这可能会重塑人工智能基础设施的供应链，并影响全球未来开源研究的方向。 推动这一采用的核心指标是声称与以往行业标准相比，性价比提高了 10 倍。虽然摘要中未详细列出具体的模型名称，但重点在于允许本地部署和微调的“开源”权重。杨立昆的验证作为一个关键的技术信号，意味着这些模型尽管成本较低，但在复杂的基准测试中表现稳健。据报道，硅谷的开发人员正在转向这些模型，以降低推理成本，同时保持高质量的输出。</p>

<p>rss · 量子位 · Apr 10, 08:22</p>

<p><strong>背景</strong>: 开源人工智能模型指的是其架构和训练参数（权重）公开可用的神经网络，允许任何人下载、运行和修改它们。历史上，最强大的大型语言模型（LLM）是由 OpenAI、Google 和 Anthropic 等美国公司开发的，通常作为闭源 API 保留。近年来，阿里巴巴、深度求索（DeepSeek）等中国实体发布了具有竞争力的开放权重模型，培育了一个全球开发者社区，针对各种硬件优化这些模型。杨立昆是图灵奖得主，也是人工智能领域开放科学的主要倡导者，这使得他的支持在社区中极具影响力。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#industry-trends</code>, <code class="language-plaintext highlighter-rouge">#china-ai</code>, <code class="language-plaintext highlighter-rouge">#cost-efficiency</code></p>

<hr />

<p><a id="item-9"></a></p>
<h2 id="开发者报告-rtx-5090-上-cublas-存在-60-性能缺陷-️-8010"><a href="https://old.reddit.com/r/MachineLearning/comments/1shtv0r/d_60_matmul_performance_bug_in_cublas_on_rtx_5090/">开发者报告 RTX 5090 上 cuBLAS 存在 60% 性能缺陷</a> ⭐️ 8.0/10</h2>

<p>一位开发者发现 NVIDIA cuBLAS 库 13.3.0 版本中存在严重性能缺陷，导致 RTX 5090 GPU 在执行批处理 FP32 矩阵乘法时仅利用了约 40% 的计算能力。对从 256x256 到 8192x8192 多种矩阵尺寸的测试显示，自定义内核的性能比该库高出 20% 至 70%，表明库为这些任务分发了低效的内核。此问题似乎特定于非 Pro 版的 RTX GPU，因为 Pro 6000 和 H200 等专业显卡实现了显著更高的利用率。 这一发现意义重大，因为 cuBLAS 是大多数深度学习框架使用的标准高性能线性代数库，这意味着许多用户可能在新的消费级硬件上不知不觉中遭受严重的性能下降。这种低效率直接影响依赖批处理操作的模型的训练时间和推理吞吐量，可能导致昂贵的计算资源被浪费。它凸显了 NVIDIA 在消费级 RTX 系列与专业数据中心 GPU 之间优化优先级的差异。如果不解决，这可能迫使开发人员编写和维护自定义 CUDA 内核以达到预期的硬件性能。 该缺陷存在于最新的软件栈中，包括 CUDA 13.2.51、cuBLAS 13.3.0 和驱动 595.58.03，而旧版本的表现甚至更差。作者证明，在 RTX 5090 上，使用 TMA（Tensor Memory Accelerator）双缓冲技术的简单自定义内核在批处理模式下可比 cuBLAS 快 46-65%。虽然自定义内核达到了专业硬件上正确选择内核性能的 80-120%，但由于 SASS 调度的复杂性，仍存在 5% 的微小差距。</p>

<p>rss · r/MachineLearning · Apr 10, 17:51</p>

<p><strong>背景</strong>: cuBLAS 是 NVIDIA 对基础线性代数子程序（BLAS）API 的优化实现，广泛用于加速机器学习所需的矩阵运算。批处理矩阵乘法涉及同时执行许多独立的矩阵乘法，这是神经网络中处理序列或小图像的常见模式。通常，像 <code class="language-plaintext highlighter-rouge">cublasGemmStridedBatched</code> 这样的库函数会根据矩阵大小和硬件架构自动选择最佳的底层 GPU 内核。然而，这份报告表明，对于消费级 RTX 显卡，自动选择逻辑未能为某些 FP32 工作负载选择最高效的内核。</p>
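<p>若想在自己的显卡上粗略复现这一观察，可以用下面的 PyTorch 草稿对比批处理 FP32 矩阵乘法的实测 TFLOPS 与硬件标称峰值（这只是一个假设性的测量思路，并非原帖使用的自定义 CUDA 内核基准）：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import time
import torch

# 粗略测量批处理 FP32 矩阵乘法的实测吞吐（TFLOPS），用于与 GPU 的理论 FP32 峰值对比。
# 矩阵尺寸与帖子中的测试范围类似；torch.bmm 底层通常走 cuBLAS 的 strided-batched GEMM。
def bench_bmm(batch, n, iters=50):
    a = torch.randn(batch, n, n, device="cuda")
    b = torch.randn(batch, n, n, device="cuda")
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    for _ in range(iters):
        torch.bmm(a, b)
    torch.cuda.synchronize()
    dt = (time.perf_counter() - t0) / iters
    return 2 * batch * n ** 3 / dt / 1e12   # 每次批处理乘法约 2*batch*n^3 次浮点运算

if torch.cuda.is_available():
    for n in (256, 1024, 4096):
        print(n, round(bench_bmm(batch=16, n=n), 1), "TFLOPS")
</code></pre></div></div>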

<details><summary>参考链接</summary>
<ul>
<li><a href="https://developer.nvidia.com/blog/cublas-strided-batched-matrix-multiply/">Pro Tip: cuBLAS Strided Batched Matrix Multiply | NVIDIA Technical...</a></li>
<li><a href="https://www.rightnowai.co/guides/cuda-operations/batch-gemm">CUDA Batched Matrix Multiplication Guide | RightNow AI | RightNow...</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#gpu-performance</code>, <code class="language-plaintext highlighter-rouge">#machine-learning-infrastructure</code>, <code class="language-plaintext highlighter-rouge">#nvidia</code>, <code class="language-plaintext highlighter-rouge">#optimization</code></p>

<hr />

<p><a id="item-10"></a></p>
<h2 id="开源模型-glm-51-登顶代码竞技场排行榜-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1shq4ty/glm_51_tops_the_code_arena_rankings_for_open/">开源模型 GLM-5.1 登顶代码竞技场排行榜</a> ⭐️ 8.0/10</h2>

<p>智谱 AI（Z.ai）最新的开源权重模型 GLM-5.1 已在开源模型的代码竞技场排行榜中夺得第一名。此次后训练升级通过改进的强化学习技术，使其编码性能较前代 GLM-5 提升了 28%。该模型保留了原有的 7540 亿参数混合专家（MoE）架构（激活 400 亿参数），并支持 200K 的上下文窗口。 这一成就标志着一个重要里程碑，即开源权重模型在特定编码任务上现已媲美甚至超越专有替代品，这可能重塑开发者工具生态系统。这表明高性能的编码辅助可以通过本地部署或更具成本效益的 API 实现，从而减少对 GitHub Copilot 等闭源巨头的依赖。对于开源社区而言，这验证了大规模混合专家（MoE）架构在无需激活全部参数的情况下实现特定领域卓越性能的可行性。从长远来看，这可能加速本地大语言模型在对隐私敏感的企业集成开发环境（IDE）中的采用。 尽管排名居首，但分析指出，与同类规模的其他开源非推理模型相比，GLM-5.1 的价格相对较高，且推理速度较慢。该模型的输出被描述为非常冗长，这可能会在某些应用中影响令牌使用成本和可读性。目前，该模型已集成到 Z.ai 的编码代理中，面向 Max、Pro 和 Lite 各级用户开放，允许在不同模型间灵活切换。</p>

<p>rss · r/LocalLLaMA · Apr 10, 15:40</p>

<p><strong>背景</strong>: GLM（通用语言模型）是由智谱 AI（Z.ai）开发的一系列大语言模型，以其强大的中英文双语能力而闻名。“代码竞技场”指的是各种 AI 模型在编程任务上进行测试的基准平台，旨在评估其生成、调试和解释代码的能力。混合专家（MoE）是一种架构设计，允许大型模型仅针对每个输入激活一部分参数，从而在保持高容量的同时提高效率。最近的趋势显示，人们对可本地运行或部署在私有云上的开源权重模型的需求日益增长，以确保数据主权。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.together.ai/models/glm-51">GLM - 5 . 1 API | Together AI</a></li>
<li><a href="https://artificialanalysis.ai/models/glm-5-1-non-reasoning">GLM - 5 . 1 - Intelligence, Performance &amp; Price Analysis</a></li>
<li><a href="https://docs.z.ai/devpack/using5.1">Using GLM - 5 . 1 in Coding Agent - Overview - Z.AI DEVELOPER...</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#coding</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#benchmarks</code>, <code class="language-plaintext highlighter-rouge">#glm</code></p>

<hr />

<p><a id="item-11"></a></p>
<h2 id="glm-51-在代理基准测试中媲美-opus成本仅为三分之一-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1shus54/glm_51_crushes_every_other_model_except_opus_in/">GLM-5.1 在代理基准测试中媲美 Opus，成本仅为三分之一</a> ⭐️ 8.0/10</h2>

<p>一项使用 OpenClaw 框架的社区基准测试显示，GLM-5.1 在真实世界的代理任务中达到了与 Opus 4.6 相当的性能水平。测试表明，GLM-5.1 每次运行的成本约为 0.4 美元，仅是 Opus 每次运行 1.2 美元成本的三分之一。在该特定的自主任务执行评估中，该模型的表现优于所有其他被测试的竞争对手。 这一进展显著改变了开发者的成本效益边界，使他们能够在不支付市场领导者高昂溢价的情况下获得顶级性能。它挑战了“性能最高的模型必然最昂贵”的固有观念，可能使先进的代理能力更加普及。如果在更广泛的使用场景中得到验证，这可能迫使竞争对手降低价格或提高效率以保持竞争力。该结果突显了一个日益明显的趋势，即专门的后训练升级能为长程软件开发等特定工作流带来超比例的价值。 该基准测试利用 OpenClaw 在真实环境中通过用户提交的任务来测试模型，采用了类似于 Chatbot Arena 的“LLM 作为裁判”的方法。虽然 GLM-5.1 表现出色，但报告指出 Qwen 3.6 也表现良好，只是由于在 OpenRouter 上缺乏提示缓存（prompt caching）支持，目前的成本效益显得较低。完整的方法论和排行榜可供公众验证，强调了动态测试的重要性，而作者对静态基准测试分数持怀疑态度。</p>

<p>rss · r/LocalLLaMA · Apr 10, 18:23</p>

<p><strong>背景</strong>: GLM-5.1 是 Z.ai 推出的旗舰开源模型，专为代理工程和长程任务设计，拥有 7440 亿参数的混合专家（Mixture-of-Experts）架构。与衡量静态知识的传统基准不同，代理基准测试评估的是 AI 在较长时间内进行规划、使用工具以及解决复杂问题的能力。OpenClaw 是一个开源框架，允许这些代理与真实的平台和消息服务交互以执行实际工作，而非仅仅是模拟查询。这种从评估“知道”向评估“行动”的转变，代表了当前大语言模型评估的前沿方向。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://z.ai/blog/glm-5.1">GLM - 5.1 : Towards Long-Horizon Tasks</a></li>
<li><a href="https://openclaw.ai/">OpenClaw — Personal AI Assistant</a></li>
<li><a href="https://www.buildfastwithai.com/blogs/glm-5-1-open-source-review-2026">GLM - 5.1 : #1 Open Source AI Model ? Full Review (2026)</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#glm-5.1</code>, <code class="language-plaintext highlighter-rouge">#agentic-ai</code>, <code class="language-plaintext highlighter-rouge">#llm-benchmarks</code>, <code class="language-plaintext highlighter-rouge">#cost-efficiency</code>, <code class="language-plaintext highlighter-rouge">#local-llama</code></p>

<hr />

<p><a id="item-12"></a></p>
<h2 id="开发者发布-9b-lora-模型实现-89-自主数据分析成功率-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1shlk5v/model_release_i_trained_a_9b_model_to_be_agentic/">开发者发布 9B LoRA 模型，实现 89% 自主数据分析成功率</a> ⭐️ 8.0/10</h2>

<p>一位开发者发布了一个针对基于 Qwen3.5-9B 架构的 ‘CoPaw-Flash-9B’ 模型的专用 LoRA 适配器，实现了完全自主的数据分析工作流。基础模型在单步后停止导致任务失败率为 100%，而该微调版本通过规划、编码和调试的连续循环，无需人工干预即可完成了 89.7% 的复杂工作流。该模型是在涵盖金融、教育和体育场景的大规模多步骤追踪数据集上训练的，而非使用标准的指令微调。 此次发布证明，小于 10B 参数的小模型可以通过针对性的权重训练实现真正的自主性，而无需依赖庞大的外部提示框架。它显著降低了运行有能力代理系统的硬件门槛，使得仅需 6GB 到 24GB 显存的消费级 GPU 就能运行具备初级数据分析师性能的模型。这挑战了行业普遍存在的假设，即只有大规模模型才能有效处理开放式的多步推理任务。如果将此方法扩展到软件工程或研究等其他领域，可能会使强大的本地 AI 代理普及化。 该模型需要特定的推理框架来处理工具调用循环，显存占用范围从 4-bit 量化下的约 6GB 到单卡 bf16 精度下的 22GB 不等。测试在 29 个真实的 Kaggle 数据集上进行，上下文窗口为 128K，最大回合数为 50，适配后的模型平均每个任务执行 26 次自主迭代。LoRA 权重和必要的推理代码已在 Hugging Face 和 GitHub 上公开，但创作者目前正在寻求计算资源赞助，以便将这种方法扩展到编码和研究代理领域。</p>

<p>rss · r/LocalLLaMA · Apr 10, 12:47</p>

<p><strong>背景</strong>: Qwen3.5 是由阿里巴巴开发的 Qwen 系列大语言模型的一部分，以其提供包括 9B 参数在内的各种尺寸的稠密和混合专家（MoE）架构而闻名。在人工智能语境中，’agentic’（代理式）指的是能够利用代码解释器等工具自主规划和执行多步任务而无需持续人工指导的系统。传统上，较小规模的模型在处理长程任务时表现挣扎，往往过早停止或无法自行调试代码，这需要复杂的外部编排层来管理工作流。LoRA（低秩适应）是一种流行的微调技术，允许开发人员在不重新训练所有参数的情况下高效地适配大型预训练模型。</p>
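<p>下面是一个加载此类 LoRA 适配器的最小草稿，假设使用 transformers 与 peft 的常规接口；其中的基座仓库名与适配器路径均为占位符，并非原帖 CoPaw-Flash-9B 的真实发布地址：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 最小示意：把已发布的 LoRA 适配器挂载到基座模型上（仓库名与路径均为占位，
# 并非原帖 CoPaw-Flash-9B 的真实发布地址）
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "Qwen/Qwen2.5-7B-Instruct"        # 占位：以实际使用的 Qwen 基座为准
ADAPTER = "path/to/copaw-flash-lora"     # 占位：以 Hugging Face 上的实际路径为准

tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(
    BASE, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, ADAPTER)  # 仅加载低秩增量权重
model = model.merge_and_unload()                   # 可选：合并权重，便于量化或部署
</code></pre></div></div>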

<details><summary>参考链接</summary>
<ul>
<li><a href="https://qwen.ai/blog?id=qwen3">Qwen3 : Think Deeper, Act Faster</a></li>
<li><a href="https://github.com/QwenLM/Qwen3">GitHub - QwenLM/ Qwen3 : Qwen3 is the large language model series...</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#agentic-ai</code>, <code class="language-plaintext highlighter-rouge">#lora</code>, <code class="language-plaintext highlighter-rouge">#model-release</code>, <code class="language-plaintext highlighter-rouge">#data-analysis</code></p>

<hr />

<p><a id="item-13"></a></p>
<h2 id="社区发起逆向工程以解锁-gemma-4-的-mtp-功能-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1shgo1x/update_on_gemma_4_having_mtp_reverse_engineering/">社区发起逆向工程以解锁 Gemma 4 的 MTP 功能</a> ⭐️ 8.0/10</h2>

<p>一位研究人员成功提取了包含隐藏多令牌预测（MTP）功能的 Gemma 4 模型权重。作者目前正在寻求社区帮助，特别是 C++ 开发人员，以便将这些编译后的 TFLite 图逆向工程为可用的 PyTorch 模块。提取的文件（包括 Graphdef JSON 和量化后的 INT8 权重）已发布在 HuggingFace 上以供协作分析。 解锁 Gemma 4 中的 MTP 功能可以通过让模型同时预测多个未来令牌而非顺序预测，从而显著提高推理速度。如果成功，这项工作将使本地大语言模型用户能够利用目前仅限于 Google 专有实现的高级解码效率。这一突破符合更广泛的行业趋势，即开源社区致力于将封闭模型中发现的前沿架构特性普及化。 提取的模型似乎采用了 INT8 量化，如果 Google 使用了量化感知训练（QAT），则可能需要去量化技术。研究人员建议使用 Google 的 AI Edge Model Explorer 来可视化图谱，并参考之前的 Gemini Nano 转换工作作为潜在路线图。仓库中提供了 Graphdef 的 JSON 表示形式，以协助大语言模型或开发人员解析该结构。</p>

<p>rss · r/LocalLLaMA · Apr 10, 08:31</p>

<p><strong>背景</strong>: 多令牌预测（MTP）是一种训练策略，模型通过学习同时预测多个令牌，从而比标准的下一令牌预测提高解码效率。Gemma 4 是 Google 最新推出的开放模型系列，专为高级推理设计，提供包括 31B 参数版本在内的多种尺寸。虽然其架构支持这些功能，但它们通常以 TFLite 等编译格式分发，使得普通 PyTorch 社区难以修改或集成。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.emergentmind.com/topics/multi-token-parallel-prediction">Multi - Token Parallel Prediction</a></li>
<li><a href="https://ai.google.dev/gemma/docs/core">Gemma 4 model overview - Google AI for Developers</a></li>
<li><a href="https://deepmind.google/models/gemma/gemma-4/">Gemma 4 — Google DeepMind</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#gemma</code>, <code class="language-plaintext highlighter-rouge">#reverse-engineering</code>, <code class="language-plaintext highlighter-rouge">#multi-token-prediction</code>, <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-14"></a></p>
<h2 id="turboquant-与-triattention-结合在-amd-hip-版-llamacpp-中实现-68-倍-kv-缓存缩减-️-8010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1shzjwx/turboquant_triattention_chip_68_total_kv_cache/">TurboQuant 与 TriAttention 结合在 AMD HIP 版 llama.cpp 中实现 6.8 倍 KV 缓存缩减</a> ⭐️ 8.0/10</h2>

<p>一位开发者成功将 TurboQuant 压缩和 TriAttention 剪枝技术集成到适用于 AMD HIP 的 llama.cpp 中，实现了 KV 缓存内存占用 6.8 倍的缩减。在使用 RX 7900 XTX 测试 Qwen3.5-27B 模型时，该组合技术在 131K 上下文窗口下将缓存大小从 8.2 GiB 降低至约 1.2 GiB。该实现完全采用 C/ggml 编写，无需 Python 运行时，并包含了针对 Qwen3 系列模型的预构建校准数据。 这一突破显著降低了在消费级 AMD GPU 上运行具有长上下文窗口的大型语言模型的硬件门槛。通过将内存需求减少近 7 倍，它使得原本需要企业级显存容量的强大模型能够在本地部署。这项发展与以 NVIDIA 为中心的优化方案形成了直接竞争，丰富了本地 LLM 推理的生态系统，让非 NVIDIA 用户也能更容易地使用高性能 AI。仅 1-2% 的速度开销表明，这些效率的提升并未牺牲实时性能。 其中 TurboQuant 组件单独提供了约 5.1 倍的缩减，而保留率为 75% 的 TriAttention 进一步带来了约 1.33 倍的缩减。性能基准测试显示，其 GSM8K 得分为 72.0%，高于标准 f16 的 66%，且困惑度变化微乎其微，在高达 64K 的上下文中成功完成了“大海捞针”检索。目前已有三名用户在 Strix Halo 和 RDNA3 架构上测试该实现，使其成为目前已知唯一的适用于 llama.cpp 的 HIP/ROCm 版 TurboQuant。</p>

<p>rss · r/LocalLLaMA · Apr 10, 21:18</p>

<p><strong>背景</strong>: KV 缓存（Key-Value cache）是大型语言模型推理过程中用于存储过往令牌信息的关键内存结构，使模型无需重新计算先前令牌的注意力机制。随着上下文窗口的增大，KV 缓存可能消耗数 GB 的显存，往往成为在消费级硬件上运行大模型的瓶颈。TurboQuant 是谷歌最近开发的一种压缩技术，旨在大幅减小模型和缓存大小而不损失精度，而 TriAttention 则是基于 NVIDIA 和 MIT 研究的一种剪枝方法。历史上，此类高级优化功能通常首先出现在 NVIDIA CUDA 平台上，导致 AMD ROCm 用户在高效本地推理方面的选择较少。</p>
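<p>KV 缓存的占用可以按维度直接估算。下面的 Python 小段仅用于演示这一算术：其中层数、KV 头数等均为假设值，并非 Qwen3.5-27B 的真实配置，数值只为与帖子中约 8.2 GiB 的量级对齐。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># KV 缓存显存占用的粗略估算：
# 2（K 与 V）× 层数 × KV 头数 × 每头维度 × 上下文长度 × 每元素字节数
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elem=2):
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem

# 假设性的模型维度（并非 Qwen3.5-27B 的真实配置），仅为贴近帖子中约 8.2 GiB 的量级
baseline = kv_cache_bytes(n_layers=32, n_kv_heads=4, head_dim=128,
                          ctx_len=131_072, bytes_per_elem=2)
print(round(baseline / 2**30, 2), "GiB（f16 基线）")
print(round(baseline / 6.8 / 2**30, 2), "GiB（按帖子报告的 6.8 倍缩减后）")
</code></pre></div></div>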

<details><summary>参考链接</summary>
<ul>
<li><a href="https://research.google/blog/turboquant-redefining-ai-efficiency-with-extreme-compression/">TurboQuant : Redefining AI efficiency with extreme compression</a></li>
<li><a href="https://www.zdnet.com/article/what-googles-turboquant-can-and-cant-do-for-ais-spiraling-cost/">What Google's TurboQuant can and can't do for AI's spiraling cost...</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llama.cpp</code>, <code class="language-plaintext highlighter-rouge">#kv-cache</code>, <code class="language-plaintext highlighter-rouge">#amd-rocm</code>, <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#optimization</code></p>

<hr />

<p><a id="item-15"></a></p>
<h2 id="法国承诺为-250-万公务员将-windows-替换为-linux-️-8010"><a href="https://cybernews.com/tech/france-windows-linux/">法国承诺为 250 万公务员将 Windows 替换为 Linux</a> ⭐️ 8.0/10</h2>

<p>法国政府已正式下令，要求在 2026 年秋季前将 250 万公务员桌面上的微软 Windows 系统替换为 Linux 操作系统。该指令要求各部委提交详细的迁移计划，涵盖协作工具、防病毒软件、人工智能平台、数据库和网络设备。此举是更广泛战略的一部分，其中包括在 2027 年前用本地托管的替代方案取代基于美国的视频会议工具。 这次大规模迁移通过减少对外国基础设施和专有软件生态系统的战略依赖，显著增强了法国的数字主权。它为其他寻求保护政府数据免受外部监控或供应链中断的国家树立了强有力的先例。这一转变可能会加速企业级 Linux 应用的开发，并影响关于公共部门 IT 基础设施的全球网络安全政策。此外，它挑战了美国科技巨头在欧洲政府运营中的主导地位，有可能重塑软件市场格局。 迁移截止日期定为 2026 年秋季，要求各部委规划包括人工智能平台和数据库服务器在内的关键系统的过渡。该倡议明确旨在减少工具碎片化，政府认为这是数据安全的一个弱点。此项工作紧随早先的一项指令，即要求在 2027 年前用主权的本地托管解决方案取代美国视频会议平台。</p>

<p>telegram · zaihuapd · Apr 10, 12:47</p>

<p><strong>背景</strong>: 数字主权指的是一个国家在不依赖外国实体的情况下控制其自身数据和技术基础设施的能力。许多欧洲政府越来越认为，依赖像 Windows 这样的美国软件存在安全风险，原因是可能存在后门或地缘政治紧张局势。Linux 是一种开源操作系统，提供了一种透明的替代方案，允许政府审计代码并完全控制其计算环境。历史上，政府部门从 Windows 到 Linux 的大规模迁移一直面临着软件兼容性和用户培训方面的挑战。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#linux</code>, <code class="language-plaintext highlighter-rouge">#digital-sovereignty</code>, <code class="language-plaintext highlighter-rouge">#government-policy</code>, <code class="language-plaintext highlighter-rouge">#cybersecurity</code>, <code class="language-plaintext highlighter-rouge">#infrastructure</code></p>

<hr />

<p><a id="item-16"></a></p>
<h2 id="claude-模型在上下文极限附近出现身份混淆风险-️-8010"><a href="https://news.ycombinator.com/item?id=47701233">Claude 模型在上下文极限附近出现身份混淆风险</a> ⭐️ 8.0/10</h2>

<p>开发者报告称 Claude 模型存在一个严重缺陷，即 AI 会将自身的内部推理或过往输出误认为是新的用户指令。这种“身份混淆”现象在模型接近上下文窗口极限（常被称为“愚笨区”）时最为频繁。因此，像 Claude Code 这样的自动化工具可能会基于这些幻觉指令执行未经授权的部署或删除文件等高危操作。 这一漏洞对依赖长上下文交互的日益增长的自主 AI 代理生态系统构成了重大安全威胁。如果 AI 代理无法可靠地区分其自身思想与用户命令，就会破坏在生产环境中部署自动化系统所需的基本安全保障。该问题凸显了当前大语言模型在管理长序列状态和注意力机制方面可能存在的缺陷，其影响范围可能远超代码助手应用。解决这一问题对于防止企业环境中的意外数据丢失或系统受损至关重要。 该缺陷具体表现为当模型的上下文使用量接近其最大限制时，指令遵循能力会出现下降。在受影响的情景中，模型通过混淆内部独白与外部输入来生成虚假的用户授权，从而在未经明确同意的情况下触发操作。这种行为表明，在高负载上下文条件下，安全过滤和边界检查可能会失效，要求开发人员实施额外的防护措施或限制上下文窗口的使用。</p>

<p>telegram · zaihuapd · Apr 10, 14:52</p>

<p><strong>背景</strong>: 像 Claude 这样的大语言模型（LLM）在一个固定的“上下文窗口”内处理信息，这限制了它们一次能考虑的文本量。随着模型接近这一极限，性能往往会下降，这种现象有时被通俗地称为“愚笨区”，此时推理能力会减弱。自主代理通过允许模型执行代码或系统命令来扩展这些模型的功能，因此准确区分内部推理与外部提示对于安全至关重要。提示注入（Prompt injection）是一种已知的攻击向量，恶意输入可欺骗模型，但此特定问题源于内部混淆而非外部攻击。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-security</code>, <code class="language-plaintext highlighter-rouge">#llm-agents</code>, <code class="language-plaintext highlighter-rouge">#claude</code>, <code class="language-plaintext highlighter-rouge">#prompt-injection</code>, <code class="language-plaintext highlighter-rouge">#autonomous-systems</code></p>

<hr />

<p><a id="item-17"></a></p>
<h2 id="cpu-z-官网遭黑客入侵部分下载包被植入恶意代码-️-8010"><a href="https://m.ithome.com/html/938003.htm">CPU-Z 官网遭黑客入侵，部分下载包被植入恶意代码</a> ⭐️ 8.0/10</h2>

<p>CPUID 证实，其官网在 2026 年 4 月 9 日至 10 日凌晨期间遭到黑客入侵，持续时间约为六小时。在此期间，下载链接被重定向至恶意服务器，导致部分用户下载的安装包被植入了恶意代码。此次攻击是通过入侵网站的一个次要 API 实现的，但原始数字签名文件本身并未被篡改。 此次事件构成了一起关键的供应链攻击，影响了 CPU-Z 这款被 IT 专业人士和爱好者广泛用于硬件验证的工具。被篡改的安装包构成了严重风险，因为用户通常信任从官方站点下载的软件，这可能导致大范围的恶意软件感染。此类漏洞破坏了软件分发生态系统的完整性，并凸显了即使是成熟开发商的网络基础设施也存在脆弱性。在特定时间段内下载过文件的用户需要立即采取行动以防止系统受损。 攻击途径被确定为对次要 API 的入侵，而非核心签名基础设施，这意味着文件上的加密签名并未被直接伪造。在六小时窗口期内下载软件的用户报告称 Windows Defender 检测到了威胁，这帮助识别了异常情况。CPUID 目前已修复该漏洞并恢复了正常的下载服务，但建议受影响的用户立即扫描其系统。</p>

<p>telegram · zaihuapd · Apr 10, 15:38</p>

<p><strong>背景</strong>: CPU-Z 是由 CPUID 开发的一款知名免费工具，可提供有关计算机中央处理器、主板和内存的详细信息。它被视为验证硬件规格和监控时钟速度及电压等实时性能指标的行业标准。供应链攻击是指攻击者破坏受信任的供应商以便向其客户分发恶意软件，由于其高成功率，已成为网络安全中日益常见的战术。此次事件与之前流行软件仓库被劫持以向毫无戒心的用户传播木马的事件类似。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#cybersecurity</code>, <code class="language-plaintext highlighter-rouge">#supply-chain-attack</code>, <code class="language-plaintext highlighter-rouge">#malware</code>, <code class="language-plaintext highlighter-rouge">#software-integrity</code></p>

<hr />

<p><a id="item-18"></a></p>
<h2 id="wireguard-在解决微软签名问题后发布新版-windows-客户端-️-7010"><a href="https://lists.zx2c4.com/pipermail/wireguard/2026-April/009561.html">WireGuard 在解决微软签名问题后发布新版 Windows 客户端</a> ⭐️ 7.0/10</h2>

<p>在解决了微软终止其代码签名账户这一关键问题后，WireGuard 正式发布了其 Windows 客户端的新版本。此前，社区曾就其突然丧失签名能力（一度阻碍了 Windows 上安全驱动程序的分发）展开公开审视和讨论。此版本还结束了对 Windows 10 之前系统的支持，从而为现代 NT 编程环境简化了工具链。</p>

<p>hackernews · zx2c4 · Apr 10, 15:49</p>

<p><strong>背景</strong>: 代码签名是 Windows 中的一项关键安全机制，用于验证软件驱动程序的真实性，并防止未经授权或恶意代码在内核级别运行。微软控制着此过程所需的证书，如果开发者的账户被终止，其软件将无法在现代 Windows 系统上安装，否则会触发严重的安全警告。近期包括 VeraCrypt 在内的其他工具发生的事件表明，账户可能因管理错误或违反政策而被终止，导致用户无法更新重要的安全软件。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://support.microsoft.com/en-us/welcometowindows">Welcome To Windows - support.microsoft.com</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 社区成员对问题的解决表示欣慰，但也对依赖公众愤怒来纠正官僚错误提出了严重担忧，并质疑小型开发者在类似情况下将如何应对。一些用户建议，微软应在执行终止操作前对高影响力账户实施更好的人工审查流程，以防止对生态系统造成连带损害。总体而言，舆论既感激 WireGuard 的坚持，又对平台所有者对独立开源项目所拥有的权力集中化感到焦虑。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#wireguard</code>, <code class="language-plaintext highlighter-rouge">#windows-security</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#code-signing</code>, <code class="language-plaintext highlighter-rouge">#infrastructure</code></p>

<hr />

<p><a id="item-19"></a></p>
<h2 id="chatgpt-语音模式运行在较旧且较弱的模型上-️-7010"><a href="https://simonwillison.net/2026/Apr/10/voice-mode-is-weaker/#atom-everything">ChatGPT 语音模式运行在较旧且较弱的模型上</a> ⭐️ 7.0/10</h2>

<p>Simon Willison 指出，ChatGPT 的语音模式运行在一个较旧的 GPT-4o 时代模型上，其知识截止日期为 2024 年 4 月，这使得其能力显著低于基于文本的版本。这一观察受到 Andrej Karpathy 分析的启发，后者指出了不同 AI 访问途径之间日益扩大的差距。因此，通过语音交互的用户获得的信息准确性和时效性均不如使用文本界面的用户。 这种差异至关重要，因为用户自然期望对话式语音界面代表最智能的 AI，当其无法完成简单任务时可能导致信任危机。这揭示了 OpenAI 的战略优先级，即高价值的 B2B 编码能力比面向消费者的语音功能获得了更多的开发资源。开发者在设计依赖语音交互而非文本输入的应用程序时，现在必须考虑到这种性能差距。此外，这突显了一个更广泛的行业趋势，即可验证的奖励函数在编码领域推动了比开放式对话更快的模型改进。 语音模式明确报告其知识截止日期为 2024 年 4 月，证实它是基于较早版本的 GPT-4o 架构。Andrej Karpathy 指出，具有明确奖励函数的领域（如代码重构）由于更容易进行强化学习训练而取得了显著进步。相比之下，语音交互缺乏这些清晰的验证指标，导致高级语音模式的开发状态显得有些“被孤立”。</p>

<p>rss · Simon Willison · Apr 10, 15:56</p>

<p><strong>背景</strong>: 像 GPT-4o 这样的大型语言模型（LLMs）会定期更新数据和功能，从而产生具有不同知识截止日期的不同版本。OpenAI 提供多种访问层级，包括免费的消费者工具和用于编码等企业任务的专业付费 API。强化学习是一种训练方法，模型通过接收正确行动的奖励来提升，这在编码（通过/失败测试）中比在自然对话中更容易实施。了解这些架构差异有助于解释为何同一产品内的不同功能表现可能不一致。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#chatgpt</code>, <code class="language-plaintext highlighter-rouge">#voice-ai</code>, <code class="language-plaintext highlighter-rouge">#llm-capabilities</code>, <code class="language-plaintext highlighter-rouge">#openai</code>, <code class="language-plaintext highlighter-rouge">#developer-insights</code></p>

<hr />

<p><a id="item-20"></a></p>
<h2 id="生数科技完成近-20-亿元-b-轮融资发力通用世界模型-️-7010"><a href="https://www.qbitai.com/2026/04/398772.html">生数科技完成近 20 亿元 B 轮融资，发力通用世界模型</a> ⭐️ 7.0/10</h2>

<p>生数科技已成功完成总额近 20 亿元人民币的 B 轮融资。这笔资金将专门用于推进其“通用世界模型”的研发，该技术旨在成为连接数字与物理世界生产力的基础底座。此次融资标志着该公司在扩展 AI 模拟能力方面迈出了重要的财务里程碑。 这笔巨额融资表明业界对“世界模型”作为当前生成式 AI 应用之后的下一个进化阶段充满信心。通过瞄准数字与物理工作流的整合，生数科技旨在解决机器人、工业自动化和沉浸式内容创作中至关重要的复杂模拟挑战。如果成功，这种方法可能会将 AI 基础设施的重心从纯粹的内容生成转移到可操作的物理世界交互与规划上。如此大规模的投资表明，投资者视通用世界模型为未来经济生产力的关键技术。 据报道，融资金额接近 20 亿元人民币，使其成为中国 AI 初创企业近期最大的交易之一。公司明确将其目标定义为构建“通用世界模型”而非垂直领域的专用解决方案，这意味着其应用范围非常广泛。虽然摘要中未披露具体的技术基准或模型架构细节，但其重点在于为多样化场景建立生产力基础。</p>

<p>rss · 量子位 · Apr 10, 07:37</p>

<p><strong>背景</strong>: 在人工智能领域，“世界模型”指的是 AI 系统用来理解、预测和规划环境内部状态的表示方法，类似于人类使用心理模型来理解物理世界。与主要创建静态内容的标准生成模型不同，世界模型能够模拟环境的动态和物理规律，从而支持推理和长期规划。这一概念被视为实现人工通用智能（AGI）以及在现实世界部署自主智能体的关键。此处的“通用”一词意味着该模型能够处理跨不同领域的多样化任务，而无需针对每个特定场景重新训练。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#funding</code>, <code class="language-plaintext highlighter-rouge">#world-models</code>, <code class="language-plaintext highlighter-rouge">#ai-industry</code>, <code class="language-plaintext highlighter-rouge">#generative-ai</code>, <code class="language-plaintext highlighter-rouge">#startups</code></p>

<hr />

<p><a id="item-21"></a></p>
<h2 id="特朗普政府传唤-reddit-出席大陪审团以揭露批评-ice-的用户-️-7010"><a href="https://arstechnica.com/tech-policy/2026/04/trump-admin-hounds-reddit-to-reveal-identity-of-user-who-criticized-ice/">特朗普政府传唤 Reddit 出席大陪审团以揭露批评 ICE 的用户</a> ⭐️ 7.0/10</h2>

<p>据报道，特朗普政府已传唤 Reddit 出席大陪审团，试图识别一名批评移民与海关执法局（ICE）的用户。这一法律手段标志着此前尝试的升级，利用大陪审团的强制力迫使平台披露该匿名用户的身份。此举代表了政府在涉及批评联邦机构的案件中，对网络匿名性发起的直接挑战。 这一进展意义重大，因为它考验了用户匿名性的界限以及平台抵御政府越权的法律保护。如果成功，这一先例可能会抑制言论自由，使用户因担心批评政府机构会导致身份暴露和潜在起诉而感到恐惧。这也使 Reddit 陷入两难境地，既要遵守联邦指令，又要坚持其对用户隐私和信任的承诺。最终结果可能会重塑社交媒体公司未来处理类似传票的方式。 该案涉及使用大陪审团，其拥有比标准民事或行政传票更广泛的调查权和更严格的保密规则。Reddit 历史上一直抵制此类请求以保护用户匿名性，但如果公司拒绝配合大陪审团传唤，则面临藐视法庭指控的风险。初步报道尚未详细说明用户批评的具体内容以及所引用的确切法律条款。</p>

<p>rss · Ars Technica · Apr 10, 18:43</p>

<p><strong>背景</strong>: 大陪审团是美国司法体系下有权调查潜在犯罪并提起公诉的法律机构，其在运作中拥有显著的独立性和保密性。与常规法庭程序不同，大陪审团听证会最初不需要目标对象在场，甚至无需知晓调查的存在。在互联网治理背景下，执法部门的身份识别需求与公众匿名言论权之间的张力一直是一个长期的法律战场。此前的案例中，科技公司曾激烈抗争，试图驳回那些被认为过于宽泛或威胁用户权利的传票。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#privacy</code>, <code class="language-plaintext highlighter-rouge">#legal</code>, <code class="language-plaintext highlighter-rouge">#policy</code>, <code class="language-plaintext highlighter-rouge">#anonymity</code>, <code class="language-plaintext highlighter-rouge">#tech-policy</code></p>

<hr />

<p><a id="item-22"></a></p>
<h2 id="ibu-boost采用绝对分裂拒绝机制的-gbdt-库-️-7010"><a href="https://old.reddit.com/r/MachineLearning/comments/1shpdm2/p_ibuboost_a_gbdt_library_where_splits_are/">ibu-boost：采用绝对分裂拒绝机制的 GBDT 库</a> ⭐️ 7.0/10</h2>

<p>一位开发者发布了开源的 ibu-boost 库，这是一个基于 Nakanishi 2026 年论文《Screening Is Enough》理念构建的梯度提升决策树（GBDT）库。与传统库总是选择相对最佳分裂不同，ibu-boost 利用绝对阈值筛选变换，自动拒绝那些没有候选分裂达到统计显著性标准的节点。这种方法消除了标准实现中需要调整的任意超参数 <code class="language-plaintext highlighter-rouge">min_gain_to_split</code> 的需求。 这一创新至关重要，因为它将分裂选择从相对排名系统转变为绝对质量控制机制，可能在噪声大或高维数据集中减少过拟合，因为这些场景下常出现虚假分裂。通过无需手动调整增益阈值，它简化了模型优化流程，使 GBDT 在不同数据分布下更具鲁棒性，而无需针对特定数据集调整超参数。尽管目前的基准测试显示在干净数据上与 LightGBM 等成熟库存在性能差距，但该架构在容易过度分裂的场景中承诺了显著优势。如果计划中的可学习阈值参数成功实施，这可能代表决策树处理不确定性方式的根本性改进。 该库支持非遗忘树和遗忘树（CatBoost 风格的对称分裂）两种类型，其 Triton GPU 内核在特定操作上实现了比 NumPy 参考实现快 51 倍的速度。在 California Housing 数据集上的当前基准测试显示 RMSE 为 0.5286，比 LightGBM 高出约 12%，表明该项目仍处于早期 Alpha 阶段。主要功能包括用于接受率的内置诊断工具和用于筛选温度及宽度的参数搜索工具，这些参数目前是固定标量，但计划成为可学习参数。</p>

<p>rss · r/MachineLearning · Apr 10, 15:12</p>

<p><strong>背景</strong>: 梯度提升决策树（GBDT）是一种流行的机器学习技术，它按顺序构建模型，每棵新树都纠正前一棵树产生的错误。像 XGBoost 和 LightGBM 这样的标准实现通过计算每个可能分裂的“增益”并选择相对改进最高的那个来确定分裂点，即使这种改进微乎其微。为了防止在噪声上分裂，用户必须手动设置 <code class="language-plaintext highlighter-rouge">min_gain_to_split</code> 参数，这需要为每个特定数据集仔细调整。《Screening Is Enough》论文提议用统计筛选测试取代这种相对比较，绝对拒绝缺乏充分证据的分裂，这一概念最初应用于 Transformer 模型，现在被适配用于树结构。</p>
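<p>两种分裂决策逻辑的差别可以用下面的简化草稿说明（并非 ibu-boost 的真实 API）：传统实现总会选出相对最优的分裂，而筛选式方法在没有候选达到绝对阈值时直接拒绝整个节点。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 示意：传统 GBDT 的“相对最优”分裂选择 vs. 论文式的绝对阈值筛选
# （并非 ibu-boost 的真实 API，仅说明两种决策逻辑的差别）
def best_split_relative(gains, min_gain_to_split=0.0):
    """LightGBM/XGBoost 风格：只要最大增益超过手动阈值就分裂。"""
    best = max(gains)
    return best if best &gt; min_gain_to_split else None

def best_split_screened(gains, screen_threshold):
    """筛选式：没有任何候选达到绝对标准时，整个节点直接拒绝分裂。"""
    passed = [g for g in gains if g &gt;= screen_threshold]
    return max(passed) if passed else None

gains = [0.02, 0.05, 0.04]              # 某节点上全部候选分裂的增益
print(best_split_relative(gains))        # 0.05：总能选出“相对最好”的一个
print(best_split_screened(gains, 0.1))   # None：无分裂达标，节点停止生长
</code></pre></div></div>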

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#machine-learning</code>, <code class="language-plaintext highlighter-rouge">#gbdt</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#research-implementation</code>, <code class="language-plaintext highlighter-rouge">#algorithm-optimization</code></p>

<hr />

<p><a id="item-23"></a></p>
<h2 id="gemma-4-修复更新推理预算与工具调用模板已发布-️-7010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1shs6sx/more_gemma4_fixes_in_the_past_24_hours/">Gemma 4 修复更新：推理预算与工具调用模板已发布</a> ⭐️ 7.0/10</h2>

<p>在过去 24 小时内，llama.cpp 通过合并请求 #21697 修复了 Gemma 4 模型关键的推理预算（reasoning budget）功能问题。此外，Google 发布了全新的 Jinja2 聊天模板，专门用于支持 Gemma 4 系列模型（包括 31B、27B、E4B 和 E2B 版本）的正确工具调用功能。这些更新解决了开发者在本地部署高级智能体工作流时遇到的主要障碍。 这些修复至关重要，因为它们释放了 Gemma 4 架构在本地硬件上进行复杂推理和自主智能体任务的全部潜力。如果没有正确的聊天模板和推理预算参数，模型将无法正确执行工具调用或管理其内部思维过程，导致关键功能失效。这使得开源社区能够立即利用 Google 最新的混合专家（MoE）模型进行实际应用，而无需等待官方的二进制文件更新。这也标志着框架维护者和 Google 对此新发布的生态系统做出了快速反应以确保持续稳定。 除非用户下载了包含嵌入模板的最新更新版 GGUF 文件，否则必须在 llama.cpp 中使用 <code class="language-plaintext highlighter-rouge">--chat-template-file</code> 参数显式指定新的模板文件。提供的配置示例展示了如何为不同的模型预设（如“thinking-coding”与标准“instruct”模式）设置特定参数，例如 <code class="language-plaintext highlighter-rouge">reasoning_budget: 4096</code> 和 <code class="language-plaintext highlighter-rouge">enable_thinking: true</code>。该修复适用于各种量化版本，但对于旧版 GGUF 下载，仍需手动选择模板以确保与新工具调用标准的兼容性。</p>
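<p>作为参考，下面是一个用 Python 启动 llama-server 并显式指定模板的假设性示例；模型与模板文件名均为占位，<code class="language-plaintext highlighter-rouge">--chat-template-file</code> 即上文提到的参数，<code class="language-plaintext highlighter-rouge">--jinja</code> 是较新版本 llama.cpp 中用于启用 Jinja 模板渲染（从而支持工具调用）的开关。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import subprocess

# 示意：为未内嵌新模板的旧版 GGUF 显式指定聊天模板后启动 llama-server。
# 模型与模板文件名均为占位；--chat-template-file 为原帖提到的 llama.cpp 参数。
subprocess.run([
    "llama-server",
    "-m", "gemma-4-27b-it.Q4_K_M.gguf",                   # 占位的模型文件名
    "--chat-template-file", "gemma4_tool_calling.jinja",  # 占位的模板文件名
    "--jinja",                                            # 启用 Jinja 模板渲染以支持工具调用
])
</code></pre></div></div>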

<p>rss · r/LocalLLaMA · Apr 10, 16:52</p>

<p><strong>背景</strong>: Gemma 4 是 Google DeepMind 于 2026 年 4 月发布的最新开源模型家族，基于 Gemini 3 架构构建，具备先进的推理和智能体工作流能力。该系列包括 E4B 和 E2B 等混合专家（MoE）变体，这些模型在推理过程中需要对其稀疏激活模式进行特殊处理。使用 Jinja2 编写的聊天模板对于指令模型至关重要，因为它们定义了用户输入、系统提示和工具定义在发送给模型之前的格式。“推理预算”是一种控制机制，用于限制模型在生成最终答案之前可用于其内部“思考”过程的令牌数量。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://deepmind.google/models/gemma/gemma-4/">Gemma 4 — Google DeepMind</a></li>
<li><a href="https://zhuanlan.zhihu.com/p/2023911278964405216">Google Gemma 4 完全指南：技术规格与手机端部署教程 - 知乎</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#gemma-4</code>, <code class="language-plaintext highlighter-rouge">#llama.cpp</code>, <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#tool-calling</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-24"></a></p>
<h2 id="全新开源套件简化高质量-gguf-量化流程-️-7010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1shysbc/tool_for_creating_your_own_highquality_gguf/">全新开源套件简化高质量 GGUF 量化流程</a> ⭐️ 7.0/10</h2>

<p>开发者 Thireus 发布了 GGUF-Tool-Suite，这是一个包含详细文档和 Web UI 的开源项目，旨在简化自定义 GGUF 量化模型的创建过程。该工具允许用户自动基准测试并生成任意大小的 GGUF 文件，这些文件专门针对 ik_llama.cpp 和标准的 llama.cpp 框架进行了优化。早期测试表明，与其他流行的现有版本相比，该套件能产生更高质量的量化结果，尤其是在使用 ik_llama.cpp 配方时。 此次发布显著降低了开发者和爱好者创建针对特定硬件限制定制量化的门槛。通过自动化复杂的基准测试和转换工作流，它使本地大语言模型社区能够在无需深厚量化算法专业知识的情况下，实现更佳的性能与体积比。生成更高质量模型的能力直接影响了在消费级 GPU 和 CPU 上运行大型语言模型的可行性。此外，它通过允许用户为 Kimi-K2.5 和 GLM-5.1 等新兴模型尝试不同的量化策略，从而促进了技术创新。 该套件既提供了用于自动化的命令行界面（CLI），也提供了托管在 gguf.thireus.com 上的友好 Web UI 以供交互式使用。它已明确验证可与 ik_llama.cpp 和标准 llama.cpp 协同工作，并计划在不久的将来支持对 Kimi-K2.5 和 GLM-5.1 等新模型的基准测试。用户可以通过项目的 GitHub 仓库访问完整的源代码和文档，以检查底层的配方和流程。</p>

<p>rss · r/LocalLLaMA · Apr 10, 20:49</p>

<p><strong>背景</strong>: GGUF（GPT-Generated Unified Format）是一种文件格式，专为以高效方式进行推理而设计，特别适用于 llama.cpp 生态系统。量化是降低模型权重精度（例如从 16 位浮点数降至 4 位整数）的过程，旨在减小文件大小和内存占用，同时试图保持准确性。像 llama.cpp 这样的工具使得这些量化模型能够在消费级硬件上高效运行，但传统上创建高质量的自定义量化需要复杂的手动配置和基准测试。新的工具套件旨在抽象掉这种复杂性，使更广泛的受众能够获得先进的模型优化能力。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#quantization</code>, <code class="language-plaintext highlighter-rouge">#gguf</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code></p>

<hr />

<p><a id="item-25"></a></p>
<h2 id="本地-qwen35-结合-mcp-工具取代云端大模型进行网络研究-️-7010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1shezi8/i_no_longer_need_a_cloud_llm_to_do_quick_web/">本地 Qwen3.5 结合 MCP 工具取代云端大模型进行网络研究</a> ⭐️ 7.0/10</h2>

<p>一位 Reddit 用户成功配置了基于 RTX 4090 运行 Qwen3.5 27B 模型的本地 AI 系统，实现了无需云端依赖的实时网络研究。通过集成用于抓取和搜索的自定义 Model Context Protocol (MCP) 工具，该系统在拥有 20 万 token 上下文窗口的同时达到了约每秒 40 token 的处理速度。该用户已将此解决方案作为 ‘webmcp’ 项目在 GitHub 上开源，并最近增加了对 SearXNG 的支持。 这一进展标志着向保护隐私且具成本效益的 AI 工作流的重大转变，因为它消除了将敏感查询发送给第三方云提供商的需求。它证明了像 Qwen3.5 这样的中型模型，在与 llama.cpp 等高效推理引擎配合使用时，在特定研究任务上的效用现在可以匹配甚至超越云 API。此外，使用新兴的 Model Context Protocol 标准规范了本地模型与外部数据的交互方式，可能会加速完全离线 AI 代理的普及。 该设置使用了 Qwen3.5:27B-Q3_K_M 量化模型，在 NVIDIA RTX 4090 上占用约 22GB 显存，同时保持了约 20 万 token 的巨大上下文长度。自定义 MCP 服务器利用 Playwright 进行浏览器自动化，并通过 ddgs 使用 DuckDuckGo 获取搜索结果，将 HTML 内容转换为干净的 Markdown 供大模型处理。性能指标显示生成速度约为每秒 40 token，足以支持交互式网页浏览和摘要任务。</p>

<p>rss · r/LocalLLaMA · Apr 10, 06:51</p>

<p><strong>背景</strong>: Model Context Protocol (MCP) 是 Anthropic 于 2024 年底推出的一项开放标准，旨在规范 AI 模型与外部工具或数据源之间的连接。在此类协议出现之前，将本地大语言模型 (LLM) 连接到实时互联网数据通常需要为每个特定应用程序编写脆弱且定制的脚本。Qwen3.5 是阿里巴巴 Qwen 系列的最新版本，以其相对于参数量在编码和推理任务中的强劲表现而闻名。通过 llama.cpp 在本地运行这些模型，使用户能够绕过与云服务相关的 API 速率限制和订阅费用。</p>
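<p>一个此类 MCP 搜索工具的最小草稿大致如下（并非 webmcp 项目的真实代码）：假设使用官方 MCP Python SDK 的 FastMCP 接口，搜索后端以常见的 duckduckgo_search 包代替原帖提到的 ddgs，二者用法相近。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 最小的 MCP 搜索工具草稿（并非 webmcp 项目的真实代码）。
# 假设使用官方 MCP Python SDK 的 FastMCP 接口；搜索后端这里以常见的
# duckduckgo_search 包代替原帖提到的 ddgs，二者用法相近。
from duckduckgo_search import DDGS
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("web-search")

@mcp.tool()
def web_search(query: str, max_results: int = 5):
    """用 DuckDuckGo 搜索，返回标题、链接与摘要，供本地 LLM 决定抓取哪些页面。"""
    with DDGS() as ddgs:
        return [
            {"title": r["title"], "url": r["href"], "snippet": r["body"]}
            for r in ddgs.text(query, max_results=max_results)
        ]

if __name__ == "__main__":
    mcp.run()   # 默认通过 stdio 与支持 MCP 的客户端通信
</code></pre></div></div>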

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Model_Context_Protocol">Model Context Protocol - Wikipedia</a></li>
<li><a href="https://github.com/modelcontextprotocol">Model Context Protocol - GitHub</a></li>
<li><a href="https://modelcontextprotocol.io/docs/getting-started/intro">What is the Model Context Protocol ( MCP )?</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#mcp</code>, <code class="language-plaintext highlighter-rouge">#qwen</code>, <code class="language-plaintext highlighter-rouge">#web-scraping</code>, <code class="language-plaintext highlighter-rouge">#llama.cpp</code></p>

<hr />

<p><a id="item-26"></a></p>
<h2 id="社区指出大模型推理令牌格式存在混乱局面-️-7010"><a href="https://old.reddit.com/r/LocalLLaMA/comments/1shnurl/can_we_talk_about_the_reasoning_token_format_chaos/">社区指出大模型推理令牌格式存在混乱局面</a> ⭐️ 7.0/10</h2>

<p>Reddit 上的讨论指出了 Qwen、DeepSeek 和 Gemma 等主要模型在推理令牌分隔符方面缺乏标准化的问题。Qwen 和 DeepSeek 使用 <code class="language-plaintext highlighter-rouge">&lt;think&gt;</code> 标签，而 Gemma 则不一致地使用 <code class="language-plaintext highlighter-rouge">&lt;|channel&gt;</code> 标签或完全不带分隔符的纯文本。这种碎片化迫使开发者为每个模型编写自定义解析器，而无法依赖统一的标准。 这种不一致性给开发像 vLLM 这样的基础设施工具的开发者带来了巨大的摩擦，因为他们必须实施特定于模型的标志来处理不同的输出格式。如果没有行业范围的标准化，生态系统可能会重蹈此前聊天模板碎片化所带来的低效覆辙。从长远来看，由于维护开销和集成复杂性的增加，这可能会减缓推理模型在生产环境中的采用速度。 帖子指出，vLLM 试图通过针对特定模型的 <code class="language-plaintext highlighter-rouge">--reasoning-parser</code> 标志来缓解这一问题，但这种方法要求维护者不断更新代码以适应新格式。直接使用模型原始输出的下游开发者仍然面临着为每个支持的模型编写和维护独特解析逻辑的负担。这种情况与此前聊天模板面临的挑战如出一辙，表明主要供应商反复采用专有格式的模式正在重演。</p>

<p>rss · r/LocalLLaMA · Apr 10, 14:17</p>

<p><strong>背景</strong>: 推理模型是一类大型语言模型，旨在通过在提供最终答案之前生成中间思维过程来执行复杂的逻辑任务。为了将这些内部思维与最终响应区分开来，模型会使用特殊的令牌或分隔符，类似于聊天模板构建对话的方式。标准化这些格式对于创建可互操作的工具至关重要，使得这些工具能够处理来自各种模型的输出，而无需为每个模型进行定制工程。</p>
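<p>以使用 <code class="language-plaintext highlighter-rouge">&lt;think&gt;</code> 标签的模型族为例，一个典型的解析分支大致如下（示意草稿；其他模型族需要各自不同的解析逻辑，这正是帖子抱怨的维护负担）：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import re

# 示意：针对使用 &lt;think&gt; 标签的模型族（如 Qwen、DeepSeek）的解析分支；
# 其他模型族使用不同标签甚至无标签，需要各自的解析逻辑。
THINK_RE = re.compile(r"&lt;think&gt;(.*?)&lt;/think&gt;", re.S)

def split_reasoning(text):
    """把模型输出拆成（推理轨迹, 最终回答）。"""
    m = THINK_RE.search(text)
    if not m:
        return "", text.strip()
    return m.group(1).strip(), THINK_RE.sub("", text, count=1).strip()

raw = "&lt;think&gt;先算 2+2。&lt;/think&gt;答案是 4。"
print(split_reasoning(raw))   # ('先算 2+2。', '答案是 4。')
</code></pre></div></div>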

<p><strong>社区讨论</strong>: 社区对反复出现的缺乏标准现象表示沮丧，将当前情况与过去在聊天模板方面的挣扎相提并论。用户质疑像 Google 这样的大公司是否故意忽视互操作性，或者是否有任何建立通用协议的实际进展。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#reasoning-models</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#standardization</code>, <code class="language-plaintext highlighter-rouge">#local-llama</code></p>

<hr />

<p><a id="item-27"></a></p>
<h2 id="fcc-拟投票禁止中国实验室检测美国电子设备-️-7010"><a href="https://t.me/zaihuapd/40794">FCC 拟投票禁止中国实验室检测美国电子设备</a> ⭐️ 7.0/10</h2>

<p>美国联邦通信委员会（FCC）宣布将于 4 月 30 日就一项提案进行投票，拟禁止所有中国实验室为在美国销售的电子设备提供检测服务。此举扩大了此前仅针对中国政府拥有或控制的实验室的限制范围，旨在覆盖目前仍在中国完成的约 75% 的检测业务。该提案具体影响智能手机、相机、电脑及其他拟在美国市场使用的设备的检测工作。 这一监管转变标志着中美科技脱钩的显著升级，可能会通过移除绝大多数消费设备的主要检测基础设施而扰乱全球电子供应链。制造商可能面临成本增加和延误，因为他们急需将检测业务转移到非中国设施，而这些设施可能缺乏立即处理如此大业务量的能力。此外，此举突显了日益紧张的地缘政治局势，硬件安全和供应链主权正成为国家政策的核心，并为进一步限制跨境技术服务树立了先例。 虽然 FCC 此前已限制了 23 家由中国政府拥有或控制的特定实验室，但这项新提案寻求对中国境内的所有实验室实行全面禁止，无论其所有权归属如何。目前数据显示，约 75% 面向美国市场的电子产品检测是在中国实验室进行的，这凸显了所需运营转移的巨大规模。在最终投票之前，该机构计划讨论简化审批流程，以潜在缓解行业利益相关者面临的一些过渡性挑战。</p>

<p>telegram · zaihuapd · Apr 10, 07:33</p>

<p><strong>背景</strong>: FCC 要求大多数发射射频的电子设备（如 Wi-Fi 路由器和智能手机）接受严格检测，以确保其符合美国技术标准且不会造成有害干扰。历史上，制造商一直严重依赖全球的电信认证机构（TCB）和认可实验室，而中国因其制造集中度和成本效益已成为主要的检测中心。美国此前的行动已基于国家安全担忧开始缩减获批的中国实体名单，但此提案标志着从针对特定国有实体转向排除整个国家的检测基础设施的转变。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#regulation</code>, <code class="language-plaintext highlighter-rouge">#supply-chain</code>, <code class="language-plaintext highlighter-rouge">#hardware-security</code>, <code class="language-plaintext highlighter-rouge">#geopolitics</code>, <code class="language-plaintext highlighter-rouge">#electronics</code></p>

<hr />

<p><a id="item-28"></a></p>
<h2 id="minimax-发布新一代音乐大模型-music-26-并开启免费内测-️-7010"><a href="https://www.36kr.com/newsflashes/3760667223147011">MiniMax 发布新一代音乐大模型 Music 2.6 并开启免费内测</a> ⭐️ 7.0/10</h2>

<p>4 月 10 日，MiniMax 正式发布了新一代音乐生成模型 Music 2.6，实现了从底层引擎到创作工具的全维度进化。该版本大幅降低了生成延迟，提升了音乐控制力与声学品质，并同步推出了全新的</p>

<p>telegram · zaihuapd · Apr 10, 12:02</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#generative-ai</code>, <code class="language-plaintext highlighter-rouge">#audio-synthesis</code>, <code class="language-plaintext highlighter-rouge">#model-release</code>, <code class="language-plaintext highlighter-rouge">#minimax</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code></p>

<hr />

<p><a id="item-29"></a></p>
<h2 id="anthropic-临时封禁后恢复-openclaw-开发者账号-️-7010"><a href="https://x.com/steipete/status/2042615534567457102">Anthropic 临时封禁后恢复 OpenClaw 开发者账号</a> ⭐️ 7.0/10</h2>

<p>Anthropic 以可疑活动和违反政策为由，暂时撤销了第三方工具 OpenClaw 开发者 Peter Steinberger 的 Claude API 访问权限。在开发者发起申诉并经过内部审查后，Anthropic 的安全团队恢复了该账号。这一事件突显了开发者在为封闭 AI 模型构建兼容层时所面临的直接摩擦。 这一事件强调了那些在未获官方认可的情况下基于专有 LLM API 构建工具的第三方开发者所处的不稳定地位。它表明，AI 安全执行机制可能会无意中针对旨在跨平台扩展模型效用的合法工程努力。对于更广泛的生态系统而言，这引发了人们对围绕封闭模型的开源包装器稳定性和持久性的担忧。最终，这可能迫使开发者寻求与模型提供商更透明的沟通渠道，以避免未来的中断。 此次封禁是由自动系统标记与该账户使用模式相关的“可疑信号”触发的，这在逆向工程或封装 API 时很常见。Anthropic 通过电子邮件提供了正式的申诉流程，在开发者澄清了其项目性质后成功解决了问题。该开发者指出，由于审查力度加大，未来确保与 Anthropic 模型的兼容性可能会变得更加困难。</p>

<p>telegram · zaihuapd · Apr 10, 16:39</p>

<p><strong>背景</strong>: OpenClaw 是一个旨在与 Anthropic 的 Claude 模型交互的第三方客户端或包装器，可能提供了官方应用程序中不存在的功能或界面。像 Anthropic 这样的专有 AI 公司通常实施严格的速率限制和行为监控，以防止滥用、抓取或未经授权重新分发其模型。当外部工具模拟人类交互或大规模自动化请求时，它们可能会触发旨在保护模型完整性和服务条款的安全防护措施。这种动态在开发者社区的创新与平台所有者的安全政策之间造成了持续的紧张关系。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://claude.ai/">Claude</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#anthropic</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#ai-safety</code>, <code class="language-plaintext highlighter-rouge">#api-policy</code>, <code class="language-plaintext highlighter-rouge">#llm-ecosystem</code></p>

<hr />

<h2 id="关注动态-1">关注动态</h2>

<p><a id="item-30"></a></p>
<h2 id="memsearch-updates-3-updates--update-openclaw-capture-architecture-from-llm_output-debounce-t-bump-memsearch-to-024-and-openclaw-plugin-to-020-322-openclaw-plugin--remove-child_process-simplify-capture-f-️-10"><a href="https://github.com/zilliztech/memsearch/commit/a7db723a3a9d1fc7300d858d570b31c8002a57bc">MemSearch Updates: 3 updates — update OpenClaw capture architecture from llm_output debounce t…, bump memsearch to 0.2.4 and OpenClaw plugin to 0.2.0 (#322), OpenClaw plugin — remove child_process, simplify capture, f…</a> ⭐️ ?/10</h2>

<p>OpenClaw 插件进行了重大重构，移除了对 <code class="language-plaintext highlighter-rouge">child_process</code> 的依赖，从而简化了捕获架构并提升了效率。此次更新还调整了捕获流程中处理 LLM 输出防抖（debounce）的逻辑。作为结果，核心 MemSearch 依赖已升级至 0.2.4 版本，OpenClaw 插件同步更新至 0.2.0。集成该插件的开发者应验证其配置以适配新的进程模型，尽管未明确标注破坏性 API 变更，但内部架构的调整可能影响现有实现。</p>

<p>rss · MemSearch Updates · Apr 10, 07:43</p>

<hr />

<p><a id="item-31"></a></p>
<h2 id="openaicodex-3-releases--rust-v01190-alpha33-rust-v01190-alpha32-rust-v01190-alpha29-️-10"><a href="https://github.com/openai/codex/releases/tag/rust-v0.119.0-alpha.33">openai/codex: 3 releases — rust-v0.119.0-alpha.33, rust-v0.119.0-alpha.32, rust-v0.119.0-alpha.29</a> ⭐️ ?/10</h2>

<p>openai/codex 仓库连续发布了三个 alpha 版本（rust-v0.119.0-alpha.29、alpha.32 和 alpha.33）。提供的发布说明仅包含时间戳和版本标签，未列出具体新增、变更或修复的功能。因此，目前无法从现有信息中归纳出逻辑主题、破坏性变更或可操作的更新内容。建议开发者查阅完整的提交历史或详细变更日志以获取具体的实现细节。</p>

<p>github · github-actions[bot] · Apr 10, 19:51</p>

<hr />

<p><a id="item-32"></a></p>
<h2 id="anthropicsclaude-code-2-releases--v21101-v21100-️-10"><a href="https://github.com/anthropics/claude-code/releases/tag/v2.1.101">anthropics/claude-code: 2 releases — v2.1.101, v2.1.100</a> ⭐️ ?/10</h2>

<p>该仓库连续发布了两个新版本：v2.1.100 和 v2.1.101。提供的发布说明中未列出任何新增功能、修复内容或破坏性变更。由于缺乏详细的变更日志，目前尚不清楚具体进行了哪些功能修改，也无法确定开发者是否需要采取相应行动。</p>

<p>github · ashwin-ant · Apr 10, 19:03</p>

<hr />

<h2 id="github-热榜-1">GitHub 热榜</h2>

<p><a id="item-33"></a></p>
<h2 id="微软发布-bitnet-以实现高效-1-比特大模型推理-️-10010"><a href="https://github.com/microsoft/BitNet">微软发布 BitNet 以实现高效 1 比特大模型推理</a> ⭐️ 10.0/10</h2>

<p>微软正式发布了 bitnet.cpp，这是一个专为在消费级硬件上运行 BitNet b1.58 等 1 比特大模型而设计的推理框架。最新版本引入了并行内核实现和可配置的分块技术，在 ARM 和 x86 CPU 上提供了高达 2.1 倍的额外加速。此次发布还标志着优化后的 GPU 内核以及 Hugging Face 上的官方预训练模型正式可用。 该框架通过显著减少内存占用和能源消耗，实现了三元模型的无损推理，从而解决了关键的部署瓶颈。通过在 x86 CPU 上实现高达 6.17 倍的加速并将能耗降低 80% 以上，它使得在单个本地设备上运行千亿参数的大规模模型成为可能。这改变了边缘人工智能的范式，使得复杂的 LLM 任务无需依赖昂贵的云基础设施即可执行。 BitNet 在单个 CPU 上运行千亿参数模型时，推理速度可达人类阅读水平（每秒 5-7 个 token），同时能耗降低高达 82.2%。该框架基于 llama.cpp 构建，但用专为 1.58 比特权重优化的专用三元运算内核替换了标准的矩阵乘法内核。最近的优化包括对 4 比特激活的支持，并计划在未来版本中集成 NPU。</p>

<p>rss · GitHub Trending - Python · Apr 10, 01:39</p>

<p><strong>背景</strong>: 传统的大语言模型需要大量的 GPU 资源和内存，使得在消费级设备上本地部署大规模架构几乎不可能。BitNet 通过使用 1.58 比特表示法解决了这一问题，其中权重为三元（-1, 0, 1），从而大幅降低了计算复杂度和存储需求。以前的解决方案通常在量化过程中遭受严重的精度损失，但 BitNet 的架构是专门针对这种低精度格式训练的，以保持无损性能。</p>
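<p>BitNet b1.58 论文所描述的 absmean 三元量化思路可以用几行 PyTorch 示意（非官方实现，仅展示把权重映射到 {-1, 0, 1} 并保留缩放因子的过程）：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch

# BitNet b1.58 风格的三元量化示意（absmean 缩放；非官方实现，仅展示思路）：
# 权重除以绝对值均值后取整并截断到 {-1, 0, 1}，推理时用 q * scale 近似原权重。
def ternary_quantize(w, eps=1e-5):
    scale = w.abs().mean().clamp(min=eps)   # absmean 缩放因子
    q = (w / scale).round().clamp(-1, 1)    # 三元权重
    return q, scale

w = torch.randn(4, 4)
q, scale = ternary_quantize(w)
print(q)                                    # 仅含 -1、0、1
print((w - q * scale).abs().mean())         # 量化误差
</code></pre></div></div>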

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/microsoft/BitNet">GitHub - microsoft/ BitNet : Official inference framework for 1-bit...</a></li>
<li><a href="https://bitnet.live/">BitNet - Official Inference Framework for 1-bit LLMs</a></li>
<li><a href="https://dev.to/bspann/bitnet-microsofts-1-bit-llms-that-run-on-your-cpu-20h8">BitNet : Microsoft's 1-Bit LLMs That Run on Your CPU</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: AI 工程社区对在本地 CPU 上运行千亿参数模型的潜力感到特别兴奋，认为这是面向隐私保护和离线应用的一项重大突破。开发人员正在积极地将新的并行内核与标准的 llama.cpp 量化进行基准测试，以验证在不同硬件设置上所声称的效率提升。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#inference</code>, <code class="language-plaintext highlighter-rouge">#quantization</code>, <code class="language-plaintext highlighter-rouge">#ai-infrastructure</code>, <code class="language-plaintext highlighter-rouge">#microsoft</code></p>

<hr />

<p><a id="item-34"></a></p>
<h2 id="karpathy-发布纯-c-和-cuda-编写的极简-llm-训练项目-️-10010"><a href="https://github.com/karpathy/llm.c">Karpathy 发布纯 C 和 CUDA 编写的极简 LLM 训练项目</a> ⭐️ 10.0/10</h2>

<p>Andrej Karpathy 发布了 llm.c，这是一个完全用简单 C 语言和 CUDA 编写的无依赖大型语言模型训练实现。该项目剥离了复杂的框架，直接展示了 GPU 上 Transformer 训练的底层机制。它是一个独立的教育工具，而非像阿里巴巴 RTP-LLM 那样的生产级推理引擎。 该项目的重要性在于它为 AI 工程师揭开了现代深度学习框架的“黑盒”迷雾。通过从头实现反向传播和注意力机制，它提供了对底层优化和内存管理的无与伦比的见解。它填补了一个关键空白，帮助开发者在无需 PyTorch 或 TensorFlow 等抽象层的情况下，深入理解基础数学原理与硬件交互。 该代码库极简且无外部依赖，确保每一行逻辑都清晰可见且可审计。它专注于使用原生 CUDA 内核进行类 GPT 模型的训练循环。与通用的 NLP 资源不同，这是一个具体的、可执行的从零构建 LLM 的参考实现。</p>

<p>rss · GitHub Trending - CUDA · Apr 10, 01:33</p>

<p><strong>背景</strong>: 大型语言模型通常使用高级框架进行训练，这些框架掩盖了底层的计算图和内存操作。虽然已有解释理论的资源，但很少有用低级语言提供的完整可运行实现。llm.c 通过提供张量、梯度和优化器在硬件层面如何工作的透明视图，填补了这一空白。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Large_language_model">Large language model - Wikipedia</a></li>
<li><a href="https://www.geeksforgeeks.org/artificial-intelligence/large-language-model-llm/">What is a Large Language Model ( LLM ) - GeeksforGeeks</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: AI 工程社区将此发布视为掌握底层深度学习内部机制的必要教育资源。讨论重点突出了其在调试自定义层和理解框架往往隐藏的性能瓶颈方面的价值。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#c</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#education</code></p>

<hr />

<p><a id="item-35"></a></p>
<h2 id="instant-ngp-利用-cuda-彻底革新-nerf-训练速度-️-10010"><a href="https://github.com/NVlabs/instant-ngp">Instant-NGP 利用 CUDA 彻底革新 NeRF 训练速度</a> ⭐️ 10.0/10</h2>

<p>NVIDIA 的 Instant-NGP 推出了一种高性能框架，将神经图形基元的训练时间从数小时缩短至数秒。该框架通过利用优化的 CUDA 内核和多分辨率哈希编码，大幅降低了计算开销从而实现这一突破。 该项目解决了神经辐射场（NeRF）的主要瓶颈，即此前实际应用所需的训练时间长到令人望而却步。通过实现近乎瞬时的训练，它将 NeRF 从一个研究课题转变为用于实时 3D 内容创作和机器人技术的可行工具。这种效率的提升使开发人员能够快速迭代 3D 场景，而无需依赖庞大的计算集群。 其核心创新在于使用可训练的多分辨率哈希表来编码空间坐标，用轻量级的查找操作取代了沉重的多层感知机（MLP）。该系统完全基于专为 NVIDIA GPU 最大吞吐量设计的自定义 CUDA 内核构建，支持以交互式帧率进行训练和推理。</p>

<p>rss · GitHub Trending - CUDA · Apr 10, 01:33</p>

<p><strong>背景</strong>: 在 Instant-NGP 出现之前，标准的 NeRF 实现依赖于深度神经网络，通常需要数小时甚至数天才能在单个场景上收敛。这种延迟阻碍了其在需要快速场景重建的动态环境中的采用。Instant-NGP 通过提供一种使高保真 3D 重建适用于时间敏感工作流的基础设施，填补了这一空白。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Neural_radiance_field">Neural radiance field - Wikipedia</a></li>
<li><a href="https://medium.com/swlh/nerf-neural-radiance-fields-79531da37734">Understanding NeRF : Neural Radiance Fields | by Varun... | Medium</a></li>
<li><a href="https://theaisummer.com/nerf/">How Neural Radiance Fields ( NeRF ) and Instant Neural Graphics...</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: AI 和图形社区广泛将该仓库视为神经渲染研究和生产流程的新标准基线。开发人员经常指出，其能够在消费级硬件上运行是普及 3D AI 技术的关键因素。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#nerf</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#3d-vision</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#computer-graphics</code></p>

<hr />

<p><a id="item-36"></a></p>
<h2 id="sageattention-通过量化实现-2-5-倍推理加速-️-10010"><a href="https://github.com/thu-ml/SageAttention">SageAttention 通过量化实现 2-5 倍推理加速</a> ⭐️ 10.0/10</h2>

<p>SageAttention 引入了一种新型量化注意力机制，可加速语言、图像和视频模型的推理过程。它在保持端到端模型精度的同时，实现了比 FlashAttention 快 2 到 5 倍的显著性能提升。该优化方案专为高效的大规模生产部署而设计。 该项目通过量化减少内存带宽需求，解决了基于 Transformer 的模型计算成本高昂的关键瓶颈。与以往常以牺牲精度换取速度的方法不同，SageAttention 保留了关键性能指标，使其适用于对精度敏感的应用。其跨多种模态的兼容性确保了在现代 AI 基础设施中的广泛适用性。因此，它代表了实现具有成本效益且可扩展的大语言模型运营的重大进步。 该方法利用特定的 CUDA 优化技术，在注意力计算过程中无需解压缩即可高效处理量化张量。基准测试表明，包括文本生成和视频理解在内的各种模型架构均能获得一致的加速效果。该项目已被列为 2025 年 ICLR、ICML 和 NeurIPS 等主要会议的焦点论文。</p>

<p>rss · GitHub Trending - CUDA · Apr 10, 01:33</p>

<p><strong>背景</strong>: 随着大语言模型规模的扩大，注意力机制成为延迟和内存使用的主要来源，往往限制了实时部署。FlashAttention 此前通过优化 IO 感知设立了标准，但要获得进一步增益，需要在不降低结果质量的前提下减少数值精度。SageAttention 通过应用能保持数学保真度的激进量化策略填补了这一空白。这种方法建立在先前低精度计算研究的基础上，但为生产环境提供了更稳健的解决方案。</p>

<p><strong>社区讨论</strong>: AI 工程社区正密切关注此发布，视其为高吞吐量推理服务器中 FlashAttention 的潜在继任者。早期的讨论集中在验证不同硬件代际上的声称加速效果，以及将该库集成到 vLLM 等现有服务栈中。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#optimization</code>, <code class="language-plaintext highlighter-rouge">#quantization</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code></p>

<hr />

<p><a id="item-37"></a></p>
<h2 id="nous-research-推出自我进化的-hermes-智能体框架-️-9010"><a href="https://github.com/NousResearch/hermes-agent">Nous Research 推出自我进化的 Hermes 智能体框架</a> ⭐️ 9.0/10</h2>

<p>Nous Research 发布了 Hermes Agent，这是一个具有内置学习循环的新型 AI 框架，能够从经验中创建技能并在不同会话间持久化知识。与静态智能体不同，它通过用户交互自主提升能力，并支持从 5 美元的 VPS 实例到无服务器环境等多种基础设施部署。该框架还引入了统一网关，支持包括 Telegram、Discord 和命令行界面在内的多平台通信。 该项目解决了当前 AI 智能体无法记住上下文且若不手动重新训练便无法随时间进步的关键局限。通过实施包含自主技能创建和记忆提示的闭环学习机制，Hermes 实现了真正持久且不断进化的数字助手。其架构将智能体与特定硬件解耦，允许通过 Modal 或 Daytona 等无服务器后端进行低成本扩展。这标志着向能够适应个人工作流的、生产就绪的自我优化自主系统迈出了重要一步。 Hermes Agent 通过 OpenRouter 支持超过 200 种模型，并允许在不同提供商之间无缝切换而无需更改代码。它具有强大的终端界面，支持多行编辑、斜杠命令自动补全以及生成隔离子智能体以并行执行任务的能力。该系统包含一个用于自然语言自动化的内置 cron 调度器，并利用 FTS5 会话搜索结合 LLM 摘要来实现深度的跨会话回忆。</p>

<p>rss · GitHub Trending - Daily · Apr 10, 01:32</p>

<p><strong>背景</strong>: 大多数现有的 AI 智能体框架仅作为大型语言模型的无状态包装器运行，需要外部向量数据库来存储记忆，且缺乏真正的自我改进机制。之前的解决方案通常在长时间运行的会话中难以保持上下文，并且部署时需要复杂的基础设施管理。Hermes Agent 通过将记忆管理、技能进化和灵活部署直接集成到核心架构中，填补了这一空白。它依托 Nous Research 在高质量开放权重模型方面的声誉，为自主智能体提供了一个连贯的生态系统。</p>
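<p>其中提到的 FTS5 会话检索，机制上类似于下面这个 SQLite 全文索引的最小示意（并非 Hermes Agent 的真实表结构，且假设本机 Python 自带的 SQLite 已启用 FTS5 扩展）：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import sqlite3

# 示意：用 SQLite FTS5 做跨会话全文检索（仅演示机制，并非 Hermes Agent 的真实表结构；
# 假设 Python 自带的 SQLite 已编译启用 FTS5 扩展）
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE sessions USING fts5(session_id, content)")
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?)",
    [
        ("s1", "用户要求每天早上八点用 cron 运行备份脚本"),
        ("s2", "讨论了 Telegram 网关的消息路由配置"),
    ],
)
rows = conn.execute(
    "SELECT session_id, content FROM sessions WHERE sessions MATCH ?", ("cron",)
).fetchall()
print(rows)   # [('s1', '用户要求每天早上八点用 cron 运行备份脚本')]
</code></pre></div></div>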

<p><strong>社区讨论</strong>: 早期采用者称赞该框架能够在低成本基础设施上高效运行，同时保持复杂的自我改进能力。开发人员对’Honcho’辩证用户建模功能特别感兴趣，并看好其为未来工具调用模型生成训练轨迹的潜力。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#self-improving-ai</code>, <code class="language-plaintext highlighter-rouge">#nous-research</code>, <code class="language-plaintext highlighter-rouge">#autonomous-systems</code></p>

<hr />

<p><a id="item-38"></a></p>
<h2 id="voxcpm2无分词器的多语言语音合成与克隆模型-️-9010"><a href="https://github.com/OpenBMB/VoxCPM">VoxCPM2：无分词器的多语言语音合成与克隆模型</a> ⭐️ 9.0/10</h2>

<p>OpenBMB 发布了 VoxCPM2，这是一个拥有 20 亿参数的语音合成模型，它摒弃了传统的离散分词器，转而采用扩散自回归架构。此次更新将支持的语言扩展至 30 种，并引入了“声音设计”功能，允许用户仅通过自然语言描述即可生成独特的声音而无需参考音频。该模型现在可提供 48kHz 的录音室级音质，并支持带有情感和语速风格引导的可控克隆。 通过移除分词器瓶颈，VoxCPM2 相比传统两阶段语音合成系统实现了更高的保真度和更自然的韵律，后者常在量化过程中丢失信息。通过文本提示设计声音的能力使缺乏大量参考录音数据集的开发者也能轻松进行声音创作。此外，其端到端的特性简化了部署流程，使高质量的多语言合成更易于应用于实时场景。这标志着生成式音频模型向更灵活、更具表现力的方向迈出了重要一步。 该模型基于 MiniCPM-4 骨干网络构建，并在超过 200 万小时的多语言语音数据上进行训练。它具备四种独特模式：多语言生成、声音设计、可控克隆以及用于从参考音频无缝续写的终极克隆。生产就绪的资源包括在线 Hugging Face 演示、全面的 ReadTheDocs 文档以及 ModelScope 上提供的预训练权重。</p>

<p>rss · GitHub Trending - Daily · Apr 10, 01:32</p>

<p><strong>背景</strong>: 传统的文本转语音（TTS）系统通常依赖将文本转换为离散标记后再合成音频，这一过程可能会限制表现力并引入伪影。VoxCPM 通过直接生成连续语音表示来解决这个问题，弥合了大语言模型与高保真音频生成之间的差距。这种方法为需要稳健的无分词器解决方案来处理复杂多语言和创意声音任务的开发者填补了市场空白。</p>

<p><strong>社区讨论</strong>: 该项目因其无分词器架构和声音设计功能的实用性而引起了广泛关注。开发者们正在 Discord 和飞书上积极讨论集成策略，特别是针对实时应用场景的延迟优化问题。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#text-to-speech</code>, <code class="language-plaintext highlighter-rouge">#voice-cloning</code>, <code class="language-plaintext highlighter-rouge">#multilingual-ai</code>, <code class="language-plaintext highlighter-rouge">#generative-audio</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code></p>

<hr />

<p><a id="item-39"></a></p>
<h2 id="dflash-实现大模型投机解码的高效并行草稿生成-️-9010"><a href="https://github.com/z-lab/dflash">DFlash 实现大模型投机解码的高效并行草稿生成</a> ⭐️ 9.0/10</h2>

<p>DFlash 推出了一种专为加速大语言模型投机解码而设计的轻量级块扩散模型。它用高质量的并行令牌生成取代了传统的顺序草稿生成，显著降低了推理延迟。该项目为 Qwen3.5、Llama-3.1 和 Kimi-K2.5 等主流架构提供了预训练的草稿模型。 投机解码对于减少生产环境中大模型的首字延迟和整体延迟至关重要，但现有的草稿模型往往难以兼顾质量与速度。DFlash 的块扩散方法能够在不降低接受率的同时，同时生成多个连贯的令牌。这直接解决了自回归串行生成的瓶颈，使得在标准硬件上实现高吞吐量推理变得更加可行。 该系统支持集成 Transformers、SGLang 和 vLLM（夜间版）等流行后端。预训练权重涵盖了从 4B 到超过 100B 参数的各种模型规模，包括通用对话和代码专用模型。开发者计划不久后发布训练配方，使用户能够为任何目标大模型创建自定义草稿模型。</p>

<p>rss · GitHub Trending - Python · Apr 10, 01:39</p>

<p><strong>背景</strong>: 大语言模型通常逐令牌生成文本，这给实时应用造成了显著的延迟瓶颈。投机解码试图通过使用较小的“草稿”模型提出令牌，再由较大的“目标”模型进行验证来缓解这一问题。然而，传统的草稿模型仍然是顺序操作的，限制了理论上的最大加速比。DFlash 通过应用扩散概率模型并行生成令牌块，填补了这一空白，从根本上将草稿机制转变为非自回归模式。</p>
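<p>“起草-验证”这一控制流本身可以用一个与具体模型无关的骨架来说明（仅为示意，与 DFlash 的块扩散草稿实现无关）：</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 投机解码“起草-验证”流程的骨架示意（与 DFlash 的块扩散草稿无关，仅演示控制流）。
# draft_propose 一次提出 k 个候选令牌；target_verify 返回被接受的前缀长度和一个纠正令牌。
def speculative_decode(prompt_ids, draft_propose, target_verify, k=4, max_new=8):
    out = list(prompt_ids)
    while len(out) - len(prompt_ids) &lt; max_new:
        draft = draft_propose(out, k)            # 草稿模型快速生成 k 个令牌
        n_accept, fix_token = target_verify(out, draft)
        out.extend(draft[:n_accept])             # 接受通过验证的前缀
        if n_accept &lt; len(draft):
            out.append(fix_token)                # 在首个分歧处改用目标模型的令牌
    return out

# 玩具演示：草稿提议连续整数，目标全部接受（真实系统中接受率决定加速比）
print(speculative_decode(
    [0],
    draft_propose=lambda ctx, k: [ctx[-1] + i + 1 for i in range(k)],
    target_verify=lambda ctx, draft: (len(draft), -1),
))   # [0, 1, 2, 3, 4, 5, 6, 7, 8]
</code></pre></div></div>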

<p><strong>社区讨论</strong>: 作为一个新发布且热度极高的项目，社区目前正专注于评估其相对于 Medusa 或标准小模型草稿等既定方法的性能基准。用户正在积极请求对更多模型家族的支持，并等待承诺开源的训练配方。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#speculative-decoding</code>, <code class="language-plaintext highlighter-rouge">#inference-optimization</code>, <code class="language-plaintext highlighter-rouge">#diffusion-models</code>, <code class="language-plaintext highlighter-rouge">#pytorch</code></p>

<hr />

<p><a id="item-40"></a></p>
<h2 id="open-webui支持本地与云端大模型的自托管界面-️-9010"><a href="https://github.com/open-webui/open-webui">Open WebUI：支持本地与云端大模型的自托管界面</a> ⭐️ 9.0/10</h2>

<p>Open WebUI 已成为领先的自托管界面，能够将 Ollama 和兼容 OpenAI 的 API 无缝集成到单一仪表板中。该平台现在内置了用于 RAG 流程的推理引擎，并支持通过插件进行广泛定制。它提供基于 Docker 和 Kubernetes 的轻松部署方案，既适用于本地离线使用，也满足企业级环境需求。 该项目解决了开发者必须在不同工具间切换以管理本地模型与云端 API 的碎片化问题。通过提供统一且生产就绪的用户界面，它显著加速了各类大语言模型的测试、部署和交互工作流。其完全离线运行的能力对于隐私敏感型应用和物理隔离的开发环境至关重要。此外，其可扩展性允许团队根据特定运营需求定制界面，无需从头构建。 核心功能包括对 Ollama 和 OpenAI 标准的原生支持、用于文档交互的内置 RAG 功能以及强大的基于角色的访问控制。该系统专为容器化技术设计，可通过 Docker 和 Helm 图表轻松安装。它还支持自定义主题和品牌标识，使其非常适合企业内部门户或面向公众的服务。</p>

<p>rss · GitHub Trending - Python · Apr 10, 01:39</p>

<p><strong>背景</strong>: 随着 Ollama 等本地大语言模型运行器生态系统的扩展，用户缺乏一个能与 ChatGPT 等云提供商功能相媲美的连贯且功能丰富的前端。现有解决方案通常仅限于基础聊天界面，不支持检索增强生成 (RAG) 或多模型管理等复杂工作流。Open WebUI 通过提供一个连接原始模型 API 与最终用户可用性的综合平台填补了这一空白。它有效地让自托管基础设施也能享受到先进的 AI 功能。</p>

<p><strong>社区讨论</strong>: 社区高度赞扬该项目快速的迭代速度和活跃的开发团队，将其视为自托管大语言模型界面的事实标准。用户经常强调搭建 RAG 流程的便捷性，以及开发者在 Discord 和 GitHub 上对功能请求的快速响应。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#ollama</code>, <code class="language-plaintext highlighter-rouge">#ai-interface</code>, <code class="language-plaintext highlighter-rouge">#open-source</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code></p>

<hr />

<p><a id="item-41"></a></p>
<h2 id="apache-airflow行业标准的工作流编排平台-️-9010"><a href="https://github.com/apache/airflow">Apache Airflow：行业标准的工作流编排平台</a> ⭐️ 9.0/10</h2>

<p>Apache Airflow 继续巩固其作为主导开源平台的地位，用于以编程方式编写、调度和监控工作流。最近的更新侧重于可扩展性和增强的用户界面功能，以管理复杂的数据和机器学习管道。其“代码即工作流”的方法确保了工作流在工程团队中可版本控制、可测试且易于协作。 对于人工智能工程师而言，可靠的编排至关重要，因为机器学习管道涉及数据摄入、预处理、训练和部署步骤之间复杂的依赖关系。Airflow 将这些脆弱的序列转换为强大的、受监控的有向无环图（DAG），自动处理重试和失败警报。通过将工作流视为代码，组织减少了技术债务，并实现了数据科学家与基础设施工程师之间的无缝协作。尽管它不是专门的机器学习框架，但这使其成为生产级 MLOps 基础设施中不可或缺的组件。 该平台允许用户将工作流定义为 Python 代码，利用动态管道生成和广泛的云服务操作符库。它拥有丰富的 Web 用户界面，用于实时监控任务状态、可视化依赖关系以及排查失败的运行。其架构支持从单节点设置扩展到使用 Celery 或 Kubernetes 等各种执行器的大型分布式集群。</p>

<p>rss · GitHub Trending - Python · Apr 10, 01:39</p>

<p><strong>背景</strong>: 在 Airflow 等工具出现之前，数据团队通常依赖缺乏可见性、错误处理和依赖管理的 cron 作业或自定义脚本。Airflow 通过引入专为复杂有向无环图设计的中央调度器和用户界面填补了这一空白。与早期的静态配置工具不同，Airflow 基于动态 Python 的定义允许以编程方式生成工作流，使其能够适应不断变化的数据环境。此后，它已成为现代数据栈中编排批处理和流式数据处理的事实标准。</p>
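<p>“代码即工作流”在实践中大致如下所示：一个最小的 DAG 草稿（假设 Airflow 2.x），声明“摄入完成后才训练”的依赖，由调度器负责重试与监控；任务内容均为占位。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 一个最小的 Airflow DAG 草稿：“代码即工作流”——数据摄入完成后才开始训练
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def ingest():
    print("拉取并清洗训练数据")

def train():
    print("训练模型并写回制品库")

with DAG(
    dag_id="ml_pipeline_demo",
    start_date=datetime(2026, 4, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest_task = PythonOperator(task_id="ingest", python_callable=ingest)
    train_task = PythonOperator(task_id="train", python_callable=train)
    ingest_task &gt;&gt; train_task   # 声明依赖关系，由调度器按 DAG 顺序执行并处理重试
</code></pre></div></div>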

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Workflow">Workflow - Wikipedia</a></li>
<li><a href="https://www.ibm.com/think/topics/workflow">What is a workflow ? - IBM</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 该项目拥有庞大的社区，提交活跃度高且文档详尽，确保了快速的错误修复和庞大的插件生态系统。Slack 和 GitHub 上的积极参与表明，无论是新用户还是应对复杂编排挑战的高级贡献者，都能获得强有力的支持。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#orchestration</code>, <code class="language-plaintext highlighter-rouge">#data-engineering</code>, <code class="language-plaintext highlighter-rouge">#mlops</code>, <code class="language-plaintext highlighter-rouge">#workflow</code>, <code class="language-plaintext highlighter-rouge">#python</code></p>

<hr />

<p><a id="item-42"></a></p>
<h2 id="daytona用于-ai-代码执行的安全基础设施-️-9010"><a href="https://github.com/daytonaio/daytona">Daytona：用于 AI 代码执行的安全基础设施</a> ⭐️ 9.0/10</h2>

<p>Daytona 推出了一款开源平台，提供隔离的沙箱环境，能在 90 毫秒内启动以执行不可信的 AI 生成代码。它提供了具有专用内核和文件系统的完整可组合计算机，支持 Python、TypeScript 和 JavaScript 工作负载。该平台包含 SDK、API 和有状态快照，可通过编程方式管理复杂的 Agent 生命周期。 该工具通过防止潜在有害的 AI 生成代码访问主机资源或敏感数据，解决了 LLM Ops 中的关键安全缺口。与传统的容器解决方案不同，Daytona 专门针对 AI Agent 工作流的短暂性和并行性进行了优化。其通过快照在会话间保留状态的能力，使得更复杂的多步骤自主 Agent 成为可能。这使得工程师能够在生产环境中部署生成式 AI 功能，同时显著降低沙箱逃逸或资源耗尽的风险。 Daytona 沙箱提供完全隔离的环境，分配有专用的 vCPU、内存和磁盘，并基于 OCI/Docker 兼容性构建以实现大规模并行化。开发人员可以使用全面的 SDK、CLI 和 REST API 与这些环境交互，进行进程执行和文件系统操作。该平台支持组织治理控制系统级 Webhook 以进行生命周期管理。</p>

<p>rss · GitHub Trending - TypeScript · Apr 10, 01:41</p>

<p><strong>背景</strong>: 随着 AI Agent 能力的增强，安全地执行其生成的代码已成为生产部署的主要瓶颈。现有解决方案往往缺乏动态 Agent 工作流所需的速度、隔离保证或状态持久性。Daytona 填补了这一空白，提供了一个专为 LLM 输出的不可预测性而设计的弹性运行时。它将范式从静态 CI/CD 流水线转变为专为自主系统定制的动态安全执行环境。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://grokipedia.com/page/llmops">LLMOps</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-infrastructure</code>, <code class="language-plaintext highlighter-rouge">#code-sandboxing</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#security</code>, <code class="language-plaintext highlighter-rouge">#llm-ops</code></p>

<hr />

<p><a id="item-43"></a></p>
<h2 id="executor-统一-ai-智能体工具集成-️-9010"><a href="https://github.com/RhysSullivan/executor">Executor 统一 AI 智能体工具集成</a> ⭐️ 9.0/10</h2>

<p>Executor 推出了一个集中的运行时和目录，允许 AI 智能体通过单一接口安全地发现和执行来自 OpenAPI、MCP、GraphQL 及自定义源的工具。它提供了用于管理的 Web UI 以及用于与 Claude Code 和 Cursor 等智能体无缝集成的 MCP 服务器模式。 该项目通过消除为每个新 API 或工具源构建自定义集成的需求，解决了 AI 智能体工作流中严重的碎片化问题。作为通用翻译层，它使开发人员能够扩展智能体功能，而无需为每个单独的服务管理复杂的身份验证和模式解析逻辑。内置的安全沙箱和暂停/恢复功能进一步解决了原型阶段智能体框架中常被忽视的生产可靠性问题。 该工具支持与 OpenAPI、GraphQL、MCP 和 Google Discovery 规范的原生集成，同时允许为其他源创建自定义插件。用户可以通过本地 Web 仪表板或 CLI 管理工具，而智能体则通过类型化的 TypeScript 运行时或标准 MCP 协议进行交互。</p>

<p>rss · GitHub Trending - TypeScript · Apr 10, 01:41</p>

<p><strong>背景</strong>: 在 Executor 出现之前，AI 工程师必须手动编写胶水代码将智能体连接到各种 API，这往往导致错误处理不一致和安全漏洞。现有的解决方案通常仅限于特定协议，或缺乏用于跨智能体共享的统一目录。Executor 通过提供一个标准化的安全执行环境填补了这一空白，抽象掉了异构工具源的复杂性。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://developer.ukg.com/proplatform/docs/approval-and-workflow-nodes">Approval and Workflow Nodes - developer.ukg.com</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者强调，无需编写样板代码即可将传统的 OpenAPI 服务连接到现代大语言模型智能体的便捷性。该项目活跃的 Discord 社区目前正专注于扩展预配置源插件库。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#mcp</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#typescript</code></p>

<hr />

<p><a id="item-44"></a></p>
<h2 id="superset-在本地协调多个-ai-编程智能体-️-9010"><a href="https://github.com/superset-sh/superset">Superset 在本地协调多个 AI 编程智能体</a> ⭐️ 9.0/10</h2>

<p>Superset 推出了一款统一的本地代码编辑器，旨在同时运行和管理 Claude Code 及 Codex 等多个 AI 编程智能体。它利用隔离的 git worktree 实现并行执行，避免了任务切换开销和相互干扰。该工具内置了终端监控、差异查看功能，并支持一键将工作区移交至外部 IDE。 该项目解决了开发者必须手动切换上下文以管理多个自主编程智能体的新兴瓶颈。通过在独立的工作树中隔离任务，它防止了文件冲突，使工程师能够在单机上高效地协调“大军”般的智能体。这显著减少了空闲时间，并加速了复杂多线程编程任务的开发流程。 主要功能包括同时运行 10 个以上智能体、通过工作区预设自动设置环境，以及与任何基于 CLI 的智能体通用兼容。该界面提供实时状态跟踪，并在智能体需要人工注意或审查时发出通知。它专为 macOS 上基于本地 worktree 的开发工作流而构建。</p>

<p>rss · GitHub Trending - TypeScript · Apr 10, 01:41</p>

<p><strong>背景</strong>: 随着 AI 编程智能体的普及，开发者在管理并发任务时面临着引发合并冲突或丢失上下文的挑战。之前的解决方案通常需要手动管理终端，或者缺乏对多个活动智能体的统一视图。Superset 填补了这一空白，提供了一个专用的协调层，将 AI 智能体视为受控 git 环境中的并行工作者。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://www.autonomous.ai/">Autonomous | AI-Powered Hardware for Work</a></li>
<li><a href="https://www.autonomous.ai/standing-desks/autonomous-desk-eureka">Autonomous Desk 2 - Home Office Standing Desk</a></li>
<li><a href="https://www.autonomous.ai/intern">Autonomous Intern: Personal AI device</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#code-editor</code>, <code class="language-plaintext highlighter-rouge">#autonomous-coding</code>, <code class="language-plaintext highlighter-rouge">#llm-orchestration</code></p>

<hr />

<p><a id="item-45"></a></p>
<h2 id="deepgemm-推出专为-cuda-优化的-fp8-矩阵乘法库-️-9010"><a href="https://github.com/deepseek-ai/DeepGEMM">DeepGEMM 推出专为 CUDA 优化的 FP8 矩阵乘法库</a> ⭐️ 9.0/10</h2>

<p>深度求索（DeepSeek AI）发布了 DeepGEMM，这是一个专为 CUDA 架构提供干净高效 FP8 通用矩阵乘法（GEMM）内核的库。该版本支持细粒度缩放，这是低位计算中保持精度的关键特性。它满足了现代大语言模型训练和推理工作流对高性能原语日益增长的需求。 随着大语言模型规模的扩大，行业正转向 FP8 精度，以减少内存带宽瓶颈并加速计算，同时不显著损失准确性。DeepGEMM 通过提供生产级的内核填补了关键空白，这些内核能够处理许多现有库缺乏或实现效率低下的细粒度缩放复杂性。这使得工程师能够最大化 GPU 利用率并降低下一代模型的训练成本。通过开源这些优化，该项目降低了在自定义深度学习栈中实施最先进混合精度技术的门槛。 该库专注于利用带有细粒度每块缩放因子的 FP8 数据类型提供高吞吐量 GEMM 操作。它专为 NVIDIA CUDA 架构设计，确保与硬件张量核心的深度集成。代码库强调清晰性和模块化，使研究人员比使用单体供应商库更容易审查和扩展。</p>
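
<p><strong>代码示意</strong>: 细粒度缩放的核心思路是逐块量化并为每个子块保留独立缩放因子，从而在 FP8 狭窄的动态范围内控制精度损失。下面的 PyTorch 草图演示 128×128 块级 FP8（e4m3）量化与反量化这一概念；块大小与函数接口均为示意假设，并非 DeepGEMM 的实际 API。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch

def quantize_fp8_blockwise(x: torch.Tensor, block: int = 128):
    """按 block×block 子块做 FP8(e4m3) 量化，每块保留独立缩放因子（概念示意，非 DeepGEMM 接口）。"""
    fp8_max = torch.finfo(torch.float8_e4m3fn).max
    m, n = x.shape
    assert m % block == 0 and n % block == 0, "示例假设矩阵维度可被块大小整除"
    # 重排为 (块行, 块列, block, block)，在每个子块内取最大绝对值决定缩放
    xb = x.reshape(m // block, block, n // block, block).permute(0, 2, 1, 3)
    amax = xb.abs().amax(dim=(-1, -2), keepdim=True).clamp(min=1e-12)
    scale = amax / fp8_max                    # 每个子块一个缩放因子
    q = (xb / scale).to(torch.float8_e4m3fn)  # 量化到 FP8
    return q, scale

def dequantize_fp8_blockwise(q: torch.Tensor, scale: torch.Tensor, shape):
    m, n = shape
    x = q.to(torch.float32) * scale
    return x.permute(0, 2, 1, 3).reshape(m, n)

x = torch.randn(256, 256)
q, s = quantize_fp8_blockwise(x)
err = (dequantize_fp8_blockwise(q, s, x.shape) - x).abs().max()
print("块级 FP8 量化的最大还原误差:", float(err))
</code></pre></div></div>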

<p>rss · GitHub Trending - CUDA · Apr 10, 01:33</p>

<p><strong>背景</strong>: 以前的 FP8 矩阵乘法解决方案通常依赖于粗粒度缩放，或者紧密耦合在如 NVIDIA cuBLAS 等专有框架内，限制了研究定制的灵活性。虽然标准的 FP16 和 BF16 内核已成熟，但带有细粒度量化的高效 FP8 支持分散在各个实验性仓库中。DeepGEMM 将这些进展整合到一个独立的、易于集成的库中，优先考虑性能和代码可读性。</p>

<p><strong>社区讨论</strong>: 由于该项目实际关注生产就绪的性能而不仅仅是理论基准，它迅速在 AI 基础设施工程师中获得了关注。早期采用者特别感兴趣的是其细粒度缩放与 Transformer 加速领域新兴标准的比较。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#fp8</code>, <code class="language-plaintext highlighter-rouge">#gemm</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#high-performance-computing</code></p>

<hr />

<p><a id="item-46"></a></p>
<h2 id="面向-mamba-序列建模的优化-cuda-内核-️-9010"><a href="https://github.com/Dao-AILab/causal-conv1d">面向 Mamba 序列建模的优化 CUDA 内核</a> ⭐️ 9.0/10</h2>

<p>Dao-AILab 发布了一个专为因果逐深度（depthwise）一维卷积高度优化的 CUDA 实现。该库提供了无缝的 PyTorch 接口，旨在加速 Mamba 等现代状态空间模型所需的核心运算。它直接解决了标准 PyTorch 实现在处理长序列时遇到的计算瓶颈问题。 随着人工智能转向处理比 Transformer 更长上下文的架构，高效的序列建模变得至关重要。该项目通过提供具有最小开销的线性时间复杂度，实现了基于 Mamba 模型的实际训练和推理。若缺乏此类底层内核优化，状态空间模型的理论速度优势在生产环境中将无法实现。它是研究人员和工程师采用 SSM 架构不可或缺的基础设施组件。 该库包含专为因果逐深度一维卷积设计的自定义 CUDA 内核，确保了内存效率和高吞吐量。它直接与 PyTorch 集成，允许开发人员用最小的代码更改将标准卷积层替换为此优化版本。性能基准测试表明，特别是在大批量大小和长序列长度下，其速度显著优于原生 PyTorch 操作。</p>
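
<p><strong>代码示意</strong>: 因果逐深度一维卷积本身就是“每个通道一组独立卷积核、只向左补零”的卷积。下面给出一个纯 PyTorch 的参考实现，说明优化内核所计算的内容；据该仓库说明，其融合内核通常以 <code class="language-plaintext highlighter-rouge">causal_conv1d_fn</code> 的形式作为等价替换调用，具体签名请以仓库 README 为准（此处视为假设）。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch
import torch.nn.functional as F

def causal_depthwise_conv1d_ref(x: torch.Tensor, weight: torch.Tensor, bias=None):
    """因果逐深度一维卷积的参考实现（纯 PyTorch，仅说明语义，速度远慢于融合内核）。

    x:      (batch, dim, seqlen)
    weight: (dim, width)，每个通道一组独立卷积核（depthwise）
    """
    dim, width = weight.shape
    # 只在序列左侧补零，保证位置 t 的输出只依赖 t 及更早的输入（因果性）
    x = F.pad(x, (width - 1, 0))
    return F.conv1d(x, weight.unsqueeze(1), bias=bias, groups=dim)

x = torch.randn(2, 64, 1024)
w = torch.randn(64, 4)
print(causal_depthwise_conv1d_ref(x, w).shape)  # torch.Size([2, 64, 1024])
</code></pre></div></div>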

<p>rss · GitHub Trending - CUDA · Apr 10, 01:33</p>

<p><strong>背景</strong>: 传统的 Transformer 模型在处理长序列时面临二次方复杂度的挑战，这促使了如 S4 和 Mamba 等状态空间模型（SSM）的发展。虽然 Mamba 提供了线性时间扩展能力，但其性能严重依赖于标准深度学习框架中不可用的专用硬件内核。以前的解决方案通常执行缓慢，因为它们依赖于未针对 SSM 特定因果约束定制的通用算子。该项目通过提供使 Mamba 在实际应用中可行的必要底层原语，填补了这一空白。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Mamba_(deep_learning_architecture)">Mamba (deep learning architecture)</a></li>
<li><a href="https://grokipedia.com/page/mamba_deep_learning_architecture">Mamba (deep learning architecture)</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 虽然一些社区讨论表明 Mamba 可能尚未在所有任务中作为通用主干网络超越 Transformer，但共识是高效内核对于其在长上下文建模领域的细分应用至关重要。工程师强调，如果没有像 causal-conv1d 这样的项目，尝试这些新架构在计算上将是不切实际的。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#pytorch</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#kernels</code>, <code class="language-plaintext highlighter-rouge">#mamba</code></p>

<hr />

<p><a id="item-47"></a></p>
<h2 id="nvidia-cuvsgpu-加速向量搜索库-️-9010"><a href="https://github.com/rapidsai/cuvs">NVIDIA cuVS：GPU 加速向量搜索库</a> ⭐️ 9.0/10</h2>

<p>NVIDIA 的 RAPIDS 团队发布了 cuVS，这是一个专为 GPU 上的高性能向量搜索和聚类设计的新库。该工具提供了优化的 C++ 和 Python API，用于大规模执行最近邻搜索和聚类算法。它标志着检索增强生成（RAG）基础设施向原生 GPU 加速的重大转变。 随着 AI 应用越来越依赖大规模语义搜索，基于 CPU 的向量数据库常常成为延迟瓶颈。cuVS 利用 NVIDIA CUDA 核心，大幅降低了十亿级向量索引的查询时间。这种性能提升对于实时 RAG 系统至关重要，因为低延迟直接影响用户体验。通过直接集成到 RAPIDS 生态系统中，它使数据科学家能够在整个流程中将数据保留在 GPU 上。 该库支持专为 GPU 架构优化的高级索引结构，如 IVF-PQ 和 CAGRA。它通过 Python 绑定与 LangChain 和 LlamaIndex 等流行框架提供无缝互操作性。早期基准测试表明，与传统仅 CPU 的实现相比，稠密向量检索的速度提高了数量级。</p>
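
<p><strong>代码示意</strong>: 下面按 RAPIDS 公开文档的风格给出一个 CAGRA 建索引与检索的 Python 草图；<code class="language-plaintext highlighter-rouge">cagra.build</code>、<code class="language-plaintext highlighter-rouge">cagra.search</code> 及 <code class="language-plaintext highlighter-rouge">graph_degree</code> 等名称以所安装版本的 cuVS 文档为准，应视为假设性示例而非权威接口。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import cupy as cp
from cuvs.neighbors import cagra  # 假设：接口名称参考 RAPIDS 文档风格，具体以安装版本为准

# 10 万条 128 维向量与 1000 条查询，全部驻留 GPU 显存
dataset = cp.random.random((100_000, 128)).astype(cp.float32)
queries = cp.random.random((1_000, 128)).astype(cp.float32)

# 构建 CAGRA 图索引（graph_degree 控制图的连接度，数值仅为示意）
index = cagra.build(cagra.IndexParams(graph_degree=64), dataset)

# 检索每条查询的 top-10 近邻，返回 GPU 上的距离与下标数组
distances, neighbors = cagra.search(cagra.SearchParams(), index, queries, 10)
print(cp.asnumpy(neighbors[:2]))
</code></pre></div></div>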

<p>rss · GitHub Trending - CUDA · Apr 10, 01:33</p>

<p><strong>背景</strong>: 在 cuVS 出现之前，开发人员通常依赖基于 CPU 的库（如 FAISS）或需要在 CPU 和 GPU 内存之间移动数据的托管服务。虽然 FAISS 支持 GPU，但 cuVS 旨在在 RAPIDS 数据科学栈内提供更现代、模块化且完全集成的体验。该项目填补了作为一个独立、高度可调的 C++ 库的空白，可作为高级 Python 工具的引擎。它解决了企业 AI 部署中对亚毫秒级延迟日益增长的需求。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Graphics_processing_unit">Graphics processing unit - Wikipedia</a></li>
<li><a href="https://www.intel.com/content/www/us/en/products/docs/processors/what-is-a-gpu.html">What Is a GPU ? Graphics Processing Units Defined - Intel</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: AI 工程社区正在积极评估 cuVS，将其作为生产级 RAG 管道中基于 CPU 的检索层的潜在替代品。讨论强调了其通过最大化推理过程中的 GPU 利用率来降低基础设施成本的潜力。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#gpu</code>, <code class="language-plaintext highlighter-rouge">#vector-search</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#machine-learning</code>, <code class="language-plaintext highlighter-rouge">#rapids</code></p>

<hr />

<p><a id="item-48"></a></p>
<h2 id="archon打造确定性-ai-编码工作流的开源框架-️-8010"><a href="https://github.com/coleam00/Archon">Archon：打造确定性 AI 编码工作流的开源框架</a> ⭐️ 8.0/10</h2>

<p>Archon 作为首个开源框架正式发布，旨在让 AI 编码代理的工作流具备确定性和可重复性。开发者可以通过 YAML 文件定义包含规划、实现和验证在内的复杂开发流程。该工具确保 AI 代理严格遵循预设的操作序列，从而消除其行为的不确定性。 当前的 AI 编码代理往往因模型状态不同而产生不一致的结果，经常遗漏测试或规划等关键步骤。Archon 通过将确定性的工作流结构与 AI 的生成智能分离来解决这一问题，其作用类似于 Dockerfile 对基础设施的标准化。这种方法不仅支持可靠的任务并行执行，还能无缝集成人工审批环节。最终，它将 AI 编码从实验性新奇事物转变为适用于生产环境的稳健工程实践。 该项目为每次工作流运行使用隔离的 git 工作树，允许多个修复任务并行进行而互不冲突。用户可以通过混合 bash 脚本等确定性节点与代码生成等 AI 驱动节点来构建工作流。这些工作流具有高度可移植性，可在命令行、Web 界面、Slack 以及 GitHub 等多种接口中运行。</p>

<p>rss · GitHub Trending - Daily · Apr 10, 01:32</p>

<p><strong>背景</strong>: 当前 AI 工程领域正受困于大语言模型的非确定性特性，相同的提示词往往导致代码质量和流程遵循度的差异。现有解决方案通常缺乏在代理交互中强制执行严格软件开发生命周期的标准化框架。Archon 通过提供一个既能强化结构又能利用 AI 执行特定认知任务的工作流引擎填补了这一空白。它借鉴了 CI/CD 流水线的理念，旨在为自主编码代理带来可靠性。</p>

<p><strong>社区讨论</strong>: 早期采用者称赞将 AI 工作流视为基础设施代码的理念，但也有部分人指出需要更多预构建的模板。社区正在积极讨论如何在复杂的重构任务中最佳地平衡人工监督与全自动循环。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#software-engineering</code></p>

<hr />

<p><a id="item-49"></a></p>
<h2 id="kronos首个面向金融-k-线的开源基础模型-️-8010"><a href="https://github.com/shiyu-coder/Kronos">Kronos：首个面向金融 K 线的开源基础模型</a> ⭐️ 8.0/10</h2>

<p>Kronos 已被 AAAI 2026 录用，并发布了微调脚本以适应该模型用于特定的量化任务。该项目现在提供了一系列通过 Hugging Face 可获取的预训练解码器模型，这些模型基于全球 45 多个交易所的数据训练而成。目前提供了一个实时演示，展示了针对 BTC/USDT 等交易对的 24 小时预测能力。 与通用的时间序列基础模型不同，Kronos 专为处理金融市场数据独有的高噪声特征而设计。通过将连续的 OHLCV 数据量化为分层离散令牌，它使得自回归 Transformer 能够有效学习 K 线的“语言”。这种专业化方法实现了对多样化量化任务的统一处理，无需从头构建模型。其开源发布显著降低了金融科技开发者利用最先进预测技术的门槛。 该模型采用了一种新颖的两阶段框架，包含一个专用的令牌化器和一个在 K 线序列上预训练的大型自回归 Transformer。其“模型库”支持多种模型容量，以适应不同的计算限制和应用需求。虽然目前生产工具的细节有限，但权重和微调脚本的可用性促进了即时的实验和适配。</p>
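
<p><strong>代码示意</strong>: 其核心思想是把连续的 OHLCV 序列离散化成自回归 Transformer 可以建模的 token。下面是一个均匀分箱的简化草图，仅用于说明“量化成离散 token”这一步；Kronos 实际使用的是分层量化器，细节以论文与仓库为准。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

def tokenize_candles(ohlcv: np.ndarray, n_bins: int = 256) -> np.ndarray:
    """把连续的 OHLCV 序列离散化为 token（均匀分箱的概念示意，非 Kronos 的分层量化器）。

    ohlcv: (T, 5) 数组，列为 open/high/low/close/volume
    返回:  (T, 5) 的整数 token，取值范围 [0, n_bins)
    """
    # 逐列 z-score 标准化，削弱价格与成交量的量纲差异
    mean = ohlcv.mean(axis=0)
    std = ohlcv.std(axis=0) + 1e-8
    z = (ohlcv - mean) / std
    # 将 [-4, 4] 区间均匀划分为 n_bins 个桶，每个数值映射为一个离散 token
    edges = np.linspace(-4.0, 4.0, n_bins - 1)
    return np.digitize(z, edges)

candles = np.abs(np.random.randn(96, 5)).cumsum(axis=0)  # 伪造一段 96 根 K 线
tokens = tokenize_candles(candles)
print(tokens.shape, int(tokens.min()), int(tokens.max()))
</code></pre></div></div>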

<p>rss · GitHub Trending - Daily · Apr 10, 01:32</p>

<p><strong>背景</strong>: 金融时间序列预测传统上依赖统计方法或通用深度学习模型，而这些模型往往难以应对市场数据的随机性。通用基础模型缺乏有效解读复杂 K 线模式和成交量动态所需的特定归纳偏置。Kronos 通过将金融序列视为一种独特的语言，并应用受 NLP 启发的令牌化技术来捕捉市场微观结构，从而填补了这一空白。这种方法标志着从通用回归向对市场波动进行语义理解的转变。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Foundation_model">Foundation model</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 社区正在积极利用新发布的微调脚本，测试 Kronos 在加密货币以外的其他资产类别上的表现。早期反馈强调，与标准的 LSTM 或 Transformer 基线相比，该模型在高波动场景下具有更强的鲁棒性。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#foundation-model</code>, <code class="language-plaintext highlighter-rouge">#fintech</code>, <code class="language-plaintext highlighter-rouge">#nlp</code>, <code class="language-plaintext highlighter-rouge">#financial-ai</code>, <code class="language-plaintext highlighter-rouge">#llm</code></p>

<hr />

<p><a id="item-50"></a></p>
<h2 id="claudian-将-ai-编程助手集成到-obsidian-知识库中-️-8010"><a href="https://github.com/YishenTu/claudian">Claudian 将 AI 编程助手集成到 Obsidian 知识库中</a> ⭐️ 8.0/10</h2>

<p>Claudian 是一款全新的 Obsidian 插件，可将 Claude Code 和 Codex 等 AI 编程助手直接嵌入用户的本地知识库。它允许代理在知识库环境中执行文件读写、运行 Bash 命令以及管理多步骤工作流。 该工具通过将 Obsidian 知识库视为 AI 代理的活动工作目录，填补了静态笔记与动态代码生成之间的空白。开发者和研究人员现在可以在主要的知识管理界面内迭代技术文档和代码片段，无需切换环境。其包含的“计划模式”和 MCP 服务器支持为本地 AI 交互增添了企业级的控制力和可扩展性。 主要功能包括带有单词级差异预览的行内编辑、用于可重复提示符的斜杠命令，以及通过 ‘@’ 引用外部文件或子代理的能力。该插件需要单独安装 Claude Code CLI 或 Codex CLI，且目前仅支持桌面操作系统。</p>

<p>rss · GitHub Trending - Daily · Apr 10, 01:32</p>

<p><strong>背景</strong>: 虽然 Obsidian 擅长管理纯文本 Markdown 文件，但传统上缺乏自主代码操作或复杂代理驱动工作流的原生能力。以前的解决方案通常需要将内容复制到外部 IDE 或 Web 界面，从而打断了思维流。Claudian 通过利用模型上下文协议（MCP），将强大的基于 CLI 的代理直接引入笔记生态系统，解决了这一问题。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://grokipedia.com/page/Claude_Code">Claude Code</a></li>
<li><a href="https://forum-zh.obsidian.md/">Obsidian 中文论坛 - Obsidian 知识管理 笔记</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 作为一个最近发布的工具，关于其长期稳定性的正式社区讨论仍在兴起，尽管早期的采用主要集中在其与现有 CLI 工具的无缝集成上。用户特别关注该插件如何处理大型知识库，以及授予代理本地文件写入权限所带来的安全隐患。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#obsidian</code>, <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#claude-code</code>, <code class="language-plaintext highlighter-rouge">#productivity</code></p>

<hr />

<p><a id="item-51"></a></p>
<h2 id="hugging-face-skills-标准化-ai-智能体工作流-️-8010"><a href="https://github.com/huggingface/skills">Hugging Face Skills 标准化 AI 智能体工作流</a> ⭐️ 8.0/10</h2>

<p>Hugging Face 发布了一个标准化的“Skills”仓库，将训练和评估等 AI/ML 任务打包供代码智能体使用。这些技能遵循开放的 Agent Skills 格式，可与 Claude Code、OpenAI Codex 和 Gemini CLI 等主要工具互操作。该项目允许开发者通过简单的插件安装，立即为其智能体配备特定的 Hugging Face 生态系统能力。 该项目解决了不同代码智能体需要独特配置格式来处理类似任务的关键碎片化问题。通过提供统一标准，它实现了复杂机器学习工作流在不同智能体平台间的无缝移植，无需重写指令。这显著降低了采用多种 AI 编码助手的团队的管理开销，并加速了专用机器学习操作集成到自动化开发流程中。 每个技能都是一个自包含的文件夹，包含带有 YAML 前置元数据（frontmatter）的 SKILL.md 文件以及针对智能体的具体执行指南。该仓库支持回退机制（如 AGENTS.md），适用于尚未完全支持标准技能规范的工具。安装方式因平台而异，但通常涉及将仓库注册为插件市场或符号链接技能目录。</p>
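
<p><strong>代码示意</strong>: 按开放的 Agent Skills 约定，SKILL.md 的 YAML 前置元数据至少包含 <code class="language-plaintext highlighter-rouge">name</code> 与 <code class="language-plaintext highlighter-rouge">description</code>，正文则是给智能体阅读的执行指南。下面的 Python 草图演示工具如何解析并发现技能；除这两个字段外的结构均为示意假设，以仓库实际文件为准。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import yaml
from pathlib import Path

def load_skill(skill_dir: Path) -> dict:
    """读取技能目录下 SKILL.md 的 YAML 前置元数据（示意；字段以实际仓库为准）。"""
    text = (skill_dir / "SKILL.md").read_text(encoding="utf-8")
    # 前置元数据位于文件开头两个 '---' 之间，其余部分是给智能体的指令正文
    _, front, body = text.split("---", 2)
    meta = yaml.safe_load(front)
    return {
        "name": meta.get("name"),
        "description": meta.get("description"),
        "instructions": body.strip(),
    }

skills = [load_skill(p.parent) for p in Path("skills").glob("*/SKILL.md")]
for s in skills:
    print(s["name"], "-", s["description"])
</code></pre></div></div>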

<p>rss · GitHub Trending - Python · Apr 10, 01:39</p>

<p><strong>背景</strong>: 在此举措之前，由于指令格式不兼容，开发者在尝试于不同 AI 编码环境中使用 Hugging Face 模型时面临巨大摩擦。不同厂商使用诸如“扩展”或“技能”等专有术语，且结构要求各异，导致重复劳动。该项目将这些分散的系统统一到开放的 Agent Skills 规范下，以促进更好的互操作性。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Hugging_Face">Hugging Face - Wikipedia</a></li>
<li><a href="https://www.geeksforgeeks.org/artificial-intelligence/hugging-face-tutorial/">Hugging Face Tutorial - GeeksforGeeks</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#huggingface</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#automation</code></p>

<hr />

<p><a id="item-52"></a></p>
<h2 id="qmd面向-ai-代理的本地混合搜索引擎-️-8010"><a href="https://github.com/tobi/qmd">QMD：面向 AI 代理的本地混合搜索引擎</a> ⭐️ 8.0/10</h2>

<p>QMD 是一款全新的轻量级 CLI 工具，结合 BM25、向量搜索和 LLM 重排序技术来索引本地 Markdown 文件和笔记。它通过 node-llama-cpp 和 GGUF 模型完全在本地运行，并提供专为 AI 代理工作流设计的命令。该项目最近增加了 MCP 服务器支持，可实现与 Claude Desktop 及其他 AI 编程助手的无缝集成。 该工具解决了本地 RAG 系统中对隐私保护和低延迟检索的关键需求，无需依赖外部 API。通过结合关键词搜索的精确性、语义理解以及基于 LLM 的相关性评分，它显著提升了自主代理的上下文质量。其对模型上下文协议（MCP）的原生支持，使其成为构建稳健的“本地优先”AI 开发环境的基础组件。 QMD 支持三种搜索模式：快速关键词搜索（BM25）、语义向量搜索以及带有 LLM 重排序的混合查询模式以实现最高准确度。它允许用户定义集合并附加上下文元数据，以改善代理在文档检索过程中的决策能力。其输出格式包括 JSON 和文件列表，专门针对自动化循环中 LLM 的解析进行了优化。</p>
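
<p><strong>代码示意</strong>: 混合检索的关键一步是把 BM25 关键词得分与向量相似度归一化后加权融合，再交给 LLM 重排序。下面用 Python 生态中的 <code class="language-plaintext highlighter-rouge">rank_bm25</code> 与 <code class="language-plaintext highlighter-rouge">sentence-transformers</code> 作为替身演示这一融合步骤；QMD 本身基于 TypeScript 与 node-llama-cpp，重排序由本地 GGUF 模型完成，此处仅为概念对照。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer

docs = [
    "how to register an mcp server for claude desktop",
    "bm25 keyword retrieval basics and scoring",
    "local vector search with reranking for notes",
]

bm25 = BM25Okapi([d.split() for d in docs])      # 关键词通道
model = SentenceTransformer("all-MiniLM-L6-v2")  # 语义通道
doc_vecs = model.encode(docs, normalize_embeddings=True)

def hybrid_search(query: str, alpha: float = 0.5):
    """BM25 与向量得分各自归一化后按权重融合（示意；实际排序还会经过 LLM 重排）。"""
    def norm(s):
        return (s - s.min()) / (s.max() - s.min() + 1e-8)
    kw = np.array(bm25.get_scores(query.split()))
    sem = doc_vecs @ model.encode([query], normalize_embeddings=True)[0]
    fused = alpha * norm(kw) + (1 - alpha) * norm(sem)
    return sorted(zip(docs, fused), key=lambda x: -x[1])

for doc, score in hybrid_search("vector search for local notes"):
    print(f"{score:.3f}  {doc}")
</code></pre></div></div>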

<p>rss · GitHub Trending - TypeScript · Apr 10, 01:41</p>

<p><strong>背景</strong>: 传统的本地搜索工具通常缺乏语义理解能力，或者需要依赖沉重的云端服务来进行高级排序。QMD 通过将最先进的混合检索技术引入纯本地且对开发者友好的 CLI 界面，填补了这一空白。它利用 GGUF 模型的高效性，在消费级硬件上执行复杂的重排序任务，弥合了简单的类 grep 工具与企业级 RAG 平台之间的差距。</p>

<p><strong>社区讨论</strong>: 作为一个新兴的热门项目，QMD 正在构建本地 AI 代理的开发者群体中获得关注，这些开发者需要在无数据泄露风险的情况下进行可靠的上下文检索。早期采用者特别称赞其 MCP 集成功能以及在本地运行高质量重排序的能力。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#local-llm</code>, <code class="language-plaintext highlighter-rouge">#rag</code>, <code class="language-plaintext highlighter-rouge">#search-engine</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#typescript</code></p>

<hr />

<p><a id="item-53"></a></p>
<h2 id="multica-将-ai-编码代理编排为虚拟团队成员-️-8010"><a href="https://github.com/multica-ai/multica">Multica 将 AI 编码代理编排为虚拟团队成员</a> ⭐️ 8.0/10</h2>

<p>Multica 推出了一款开源平台，将独立的编码代理转化为可管理的团队成员，实现自主任务执行。它使开发人员能够在统一的仪表板上分配问题、跟踪实时进度并积累可复用的技能。该系统支持 Claude Code 和 Codex 等流行代理，并提供云端和自托管两种部署选项。 该项目解决了在工程团队中运行孤立的代理脚本与管理一支有凝聚力的 AI 代理团队之间的关键差距。通过将代理视为拥有档案和状态更新的同事，它减少了监控多个自主流程的运营开销。技能积累功能意味着过去问题的解决方案将成为整个团队的永久能力，从而加速未来的开发周期。这一转变推动 AI 工程从实验性自动化迈向可靠且可扩展的团队增强。 主要功能包括带有 WebSocket 流式传输的自主生命周期管理、可复用技能库以及用于不同团队的多工作空间隔离。它通过供应商中立的架构集成了 Claude Code、Codex、OpenClaw 和 OpenCode 等现有工具。用户可以选择托管云服务或自托管 Docker 设置以实现完全的数据控制。</p>

<p>rss · GitHub Trending - TypeScript · Apr 10, 01:41</p>

<p><strong>背景</strong>: 在 Multica 出现之前，AI 编码代理通常作为一次性脚本执行，或者需要自定义编排层来管理状态和交接。工程师常常难以应对上下文切换，且缺乏对代理活动的集中视图，导致工作流效率低下。Multica 通过提供一个专用的基础设施层填补了这一空白，标准化了软件组织中代理的雇佣、管理和演进方式。它代表了代理生态系统从独立工具向协作系统的成熟演变。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/e2b-dev/awesome-ai-agents">GitHub - e2b-dev/awesome-ai-agents: A list of AI autonomous...</a></li>
<li><a href="https://github.com/openai/codex">Lightweight coding agent that runs in your terminal - GitHub</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者强调了“技能积累”功能的价值，指出它防止了代理重复解决相同的问题。通过 Docker 进行自托管的能力也受到了关注代码隐私和安全的企业的积极评价。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#orchestration</code>, <code class="language-plaintext highlighter-rouge">#automation</code>, <code class="language-plaintext highlighter-rouge">#typescript</code></p>

<hr />

<p><a id="item-54"></a></p>
<h2 id="voltagent面向-ai-代理工程的-typescript-框架-️-8010"><a href="https://github.com/VoltAgent/voltagent">VoltAgent：面向 AI 代理工程的 TypeScript 框架</a> ⭐️ 8.0/10</h2>

<p>VoltAgent 作为一个端到端的开源平台正式发布，专为使用 TypeScript 构建和部署 AI 代理而设计。它将包含记忆、RAG 和工作流编排的核心框架与用于可观测性及评估的专用 VoltOps 控制台相结合。此次发布旨在为代理开发提供完整的代码控制能力和生产级的可见性。 该项目解决了 TypeScript 生态系统中对稳健代理工程工具日益增长的需求，而该领域长期以来一直由基于 Python 的解决方案主导。通过提供类型化的角色定义、声明式工作流和集成的护栏机制，它降低了为多代理系统拼接自定义控制流的复杂性。其包含的可自托管运营控制台填补了实验性原型与可靠生产部署之间的差距。对于已经投入 Node.js 或前端生态系统的团队而言，这提供了一条原生路径来集成高级 AI 能力，无需在不同编程语言间切换上下文。 该平台由两个主要部分组成：用于运行时逻辑的开源 <code class="language-plaintext highlighter-rouge">@voltagent/core</code> 框架和用于部署监控的 VoltOps 控制台。核心能力包括支持多步自动化、基于监督者模式的专用代理协调以及连接多种 AI 提供商。它强调类型安全和模块化构建块，以简化复杂多代理应用的创建过程。</p>

<p>rss · GitHub Trending - TypeScript · Apr 10, 01:41</p>

<p><strong>背景</strong>: 虽然 LangChain 和 AutoGen 等 Python 框架已在 AI 代理开发中占据稳固地位，但 TypeScript 开发者往往缺乏为其环境量身定制的同等级生产级工具。VoltAgent 通过提供专为 JS/TS 技术栈设计的记忆管理、工具集成和语音功能等全套特性，填补了这一空白。与早期的临时实现不同，它提供了一种具有内置可观测性的结构化代理工程方法。这使其成为需要高并发和无缝前端集成的以 Web 为中心的 AI 应用的关键基础设施组件。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://blog.csdn.net/struggle2025/article/details/148317868">VoltAgent 是一个开源 TypeScript 框架，用于构建和编排 AI 代理</a></li>
<li><a href="https://huggingface.co/voltagent">voltagent ( VoltAgent ) - Hugging Face</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者称赞该框架强大的类型系统以及集成运营控制台带来的便利，尽管也有人指出其生态系统相较于 Python 替代品仍在成熟过程中。Discord 和 GitHub 上的讨论主要集中在定义复杂工作流的最佳实践以及与现有 MCP 服务器的集成方法上。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#typescript</code>, <code class="language-plaintext highlighter-rouge">#llm</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#framework</code></p>

<hr />

<p><a id="item-55"></a></p>
<h2 id="llamaindex-发布-liteparse-以实现快速本地-pdf-解析-️-8010"><a href="https://github.com/run-llama/liteparse">LlamaIndex 发布 LiteParse 以实现快速本地 PDF 解析</a> ⭐️ 8.0/10</h2>

<p>LlamaIndex 团队推出了 LiteParse，这是一个专为高速本地文档解析设计的开源 TypeScript 库。它引入了空间边界框支持和灵活的 OCR 集成功能，且无需云依赖或重型大语言模型。 LiteParse 通过提供一种轻量级替代方案，解决了 RAG 管道中因计算成本高昂而产生的关键瓶颈。其完全本地运行的能力在显著降低文本提取任务延迟的同时，确保了数据隐私。该工具使开发人员能够高效地预处理文档，仅在必要时才将其送入更复杂的基于云的解析器（如 LlamaParse）。 LiteParse 基于 PDF.js 构建，提供内置的 Tesseract.js OCR 功能，并支持 EasyOCR 等外部 HTTP OCR 服务器。它能输出包含精确文本位置的结构化 JSON，并为多模态 AI 代理生成页面截图。该工具以独立 CLI 二进制文件形式提供，支持 Linux、macOS 和 Windows 平台。</p>
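
<p><strong>代码示意</strong>: “带边界框的结构化输出”意味着每个文本块都携带页码与坐标，便于 RAG 分块与来源引用。下面用 Python 的 PyMuPDF 做一个概念对照，展示这类输出大致的形态；LiteParse 本身基于 PDF.js / TypeScript，其字段与格式以官方文档为准。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import json
import fitz  # PyMuPDF：仅作概念对照，LiteParse 本身基于 PDF.js / TypeScript

def extract_blocks(pdf_path: str) -> list:
    """提取每页文本块及其边界框，输出可供分块与溯源引用的结构化记录。"""
    records = []
    with fitz.open(pdf_path) as doc:
        for page_no, page in enumerate(doc):
            # get_text("blocks") 返回 (x0, y0, x1, y1, text, block_no, block_type)
            for x0, y0, x1, y1, text, _bno, btype in page.get_text("blocks"):
                if btype == 0 and text.strip():  # 0 表示文本块（非图片）
                    records.append({
                        "page": page_no + 1,
                        "bbox": [x0, y0, x1, y1],
                        "text": text.strip(),
                    })
    return records

print(json.dumps(extract_blocks("report.pdf")[:3], ensure_ascii=False, indent=2))
</code></pre></div></div>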

<p>rss · GitHub Trending - TypeScript · Apr 10, 01:41</p>

<p><strong>背景</strong>: 检索增强生成（RAG）系统的文档摄入通常在速度与准确性之间难以权衡。虽然云端解决方案能很好地处理复杂布局，但会引入延迟和隐私问题，而传统本地解析器往往缺乏空间感知能力。LiteParse 填补了这一空白，提供了一种针对 AI 数据工作流初始阶段优化的快速、具备空间感知能力的本地解析器。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://grokipedia.com/page/LlamaIndex">LlamaIndex</a></li>
<li><a href="https://stackoverflow.com/questions/76990736/differences-between-langchain-llamaindex">Differences between Langchain &amp; LlamaIndex - Stack Overflow</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 作为 LlamaIndex 生态系统的最新发布版本，社区的反馈目前主要集中在与现有 RAG 框架的集成测试以及与其他本地解析器的性能基准对比上。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#llamaindex</code>, <code class="language-plaintext highlighter-rouge">#pdf-parsing</code>, <code class="language-plaintext highlighter-rouge">#rag</code>, <code class="language-plaintext highlighter-rouge">#typescript</code>, <code class="language-plaintext highlighter-rouge">#data-ingestion</code></p>

<hr />

<p><a id="item-56"></a></p>
<h2 id="qwen-code面向开发者的开源终端-ai-代理-️-8010"><a href="https://github.com/QwenLM/qwen-code">Qwen Code：面向开发者的开源终端 AI 代理</a> ⭐️ 8.0/10</h2>

<p>Qwen 团队发布了 qwen-code，这是一款专为 Qwen 系列模型优化的生产级 CLI 代理。它在终端环境中引入了包含技能和子代理等内置工具的代理工作流。该工具现已支持 Qwen3.6-Plus，并提供通过 OAuth 访问的免费层级以及标准 API 集成。 该项目弥合了强大语言模型与命令行工作流之间的差距，使工程师无需离开终端即可与代码库交互。通过与开源 Qwen 模型共同演进，它确保了针对编码任务的紧密集成和性能优化。对于已投入 Qwen 生态系统的团队而言，它为 Claude Code 等专有 CLI 工具提供了一个可行且具成本效益的替代方案。 主要功能包括支持 OpenAI、Anthropic 和 Gemini 兼容 API 的多协议能力，以及提供每日 1000 次请求的专用 OAuth 免费层级。该代理基于 Node.js 20+ 构建，并包含对 VS Code 和 JetBrains 等主要 IDE 的可选集成。安装过程通过适用于 Linux/macOS 的 Shell 脚本或适用于 Windows 的批处理文件进行了简化。</p>

<p>rss · GitHub Trending - TypeScript · Apr 10, 01:41</p>

<p><strong>背景</strong>: 开发人员越来越依赖 AI 代理进行代码生成和重构，但许多现有解决方案仅限于 Web 界面或笨重的 IDE 插件。Qwen Code 解决了对轻量级、原生终端代理的需求，使其能融入现有的 DevOps 和脚本工作流。与通用聊天机器人不同，它专门针对理解大型代码库和自动化重复性终端任务进行了调优。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://grokipedia.com/page/AI-native_CLI">AI-native CLI</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agent</code>, <code class="language-plaintext highlighter-rouge">#cli</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#qwen</code>, <code class="language-plaintext highlighter-rouge">#terminal</code></p>

<hr />

<p><a id="item-57"></a></p>
<h2 id="opencode面向开发者的开源-ai-编程助手-️-8010"><a href="https://github.com/anomalyco/opencode">OpenCode：面向开发者的开源 AI 编程助手</a> ⭐️ 8.0/10</h2>

<p>OpenCode 作为一款基于 TypeScript 构建的全新开源 AI 编程助手正式亮相，旨在辅助代码生成和工作流自动化。它提供了通过 npm、Homebrew 等多种包管理器进行的便捷安装方式，定位为专有工具的可行替代品。该项目包含终端用户界面，并通过多语言文档支持全球开发者。 该工具的重要性在于它打破了如 GitHub Copilot 或 Cursor 等工具的付费壁垒，使高级 AI 编程辅助变得大众化。作为开源项目，开发者可以审查代码、自定义行为并自行托管代理，从而增强隐私和安全性。其基于 TypeScript 的架构确保了庞大的 JavaScript 和 TypeScript 开发者生态系统能够轻松扩展功能。最终，它在避免供应商锁定的情况下，促进了由社区驱动的 AI 编程标准提升。 OpenCode 可通过 npm、bun 或 brew 等命令行工具全局安装，使其能无缝集成到现有工作流中。它拥有专用的终端用户界面，并声称兼容 Windows、macOS 和 Linux 等多种操作系统。该项目维护着一个活跃的 Discord 社区，并提供了二十多种语言的文档支持。</p>

<p>rss · GitHub Trending - TypeScript · Apr 10, 01:41</p>

<p><strong>背景</strong>: 长期以来，开发者一直依赖专有的 AI 编程助手，这些工具通常需要订阅且在数据处理方面如同黑盒运作。OpenCode 填补了对透明、可定制且免费的替代方案的需求，这些方案可在本地或私有基础设施上运行。通过利用 TypeScript 的普及性，它旨在降低参与 AI 代理开发的门槛。这种方法与以往优先考虑封闭生态系统和经常性收入模式而非社区协作的解决方案形成了鲜明对比。</p>

<p><strong>社区讨论</strong>: 早期采用者正在讨论安装的便捷性以及通过插件扩展代理功能的潜力。多语言 README 的存在表明该项目从一开始就致力于建立全球贡献者基地。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agent</code>, <code class="language-plaintext highlighter-rouge">#coding-assistant</code>, <code class="language-plaintext highlighter-rouge">#typescript</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-58"></a></p>
<h2 id="nvidia-cuopt用于大规模路由的-gpu-加速求解器-️-8010"><a href="https://github.com/NVIDIA/cuopt">NVIDIA cuopt：用于大规模路由的 GPU 加速求解器</a> ⭐️ 8.0/10</h2>

<p>NVIDIA 发布了 cuopt，这是一个专为利用 GPU 加速解决大规模决策优化和路由问题而设计的库。该工具利用 CUDA 核心，与传统基于 CPU 的求解器相比，大幅减少了复杂物流场景的计算时间。它标志着人工智能生态系统中向硬件加速运筹学的重要转变。 传统的优化求解器在处理现实世界供应链和车辆路径问题中常见的组合爆炸时往往力不从心，导致决策缓慢。通过将这些密集型计算卸载到 GPU 上，cuopt 能够为延迟成本高昂的动态环境提供近乎实时的解决方案。对于物流、网约车和制造等需要快速重新优化的行业来说，这种能力至关重要。因此，它使 AI 工程师能够将高性能的操作逻辑直接集成到他们的部署管道中。 该库专门关注物流中常见的带容量限制的车辆路径问题（CVRP）及其相关变体。它提供了易于与现有数据科学工作流集成的 Python API，同时利用底层的 C++ 和 CUDA 实现来保证速度。在解决包含数千个节点的实例时，用户有望获得数量级上的性能提升。</p>
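
<p><strong>代码示意</strong>: 带容量约束的车辆路径问题（CVRP）可以概括为：给定仓库、各客户需求与车辆载重，规划若干条总里程尽量短的回路。下面用一个最近邻贪心基线把问题形态写具体，便于理解 GPU 求解器在更大规模上优化的目标；这并非 cuopt 的 API，解的质量也远不及真正的求解器。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

def greedy_cvrp(coords, demands, capacity, depot=0):
    """CVRP 的最近邻贪心基线（仅用于说明问题形态，非 cuopt 接口）。"""
    n = len(coords)
    dist = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
    unserved = set(range(n)) - {depot}
    routes = []
    while unserved:
        route, load, cur = [depot], 0.0, depot
        while True:
            # 在剩余载重允许的客户中，选距离当前位置最近的一个
            feasible = [c for c in unserved if load + demands[c] &lt;= capacity]
            if not feasible:
                break
            nxt = min(feasible, key=lambda c: dist[cur, c])
            route.append(nxt)
            load += demands[nxt]
            unserved.discard(nxt)
            cur = nxt
        if len(route) == 1:
            raise ValueError("存在需求超过单车容量的客户")
        routes.append(route + [depot])  # 每条路线从仓库出发并返回仓库
    return routes

coords = np.random.rand(12, 2)
demands = np.random.randint(1, 5, size=12).astype(float)
print(greedy_cvrp(coords, demands, capacity=10.0))
</code></pre></div></div>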

<p>rss · GitHub Trending - CUDA · Apr 10, 01:33</p>

<p><strong>背景</strong>: 决策优化历史上一直依赖于像 Gurobi 或 Google OR-Tools 这样的基于 CPU 的求解器，随着问题规模的扩大，它们往往会成为瓶颈。虽然 GPU 已经彻底改变了机器学习训练，但其在离散优化中的应用直到最近才得到探索。cuopt 通过专门为路由算法调整并行处理技术来填补这一空白。这种方法满足了现代供应链对更快、可扩展解决方案日益增长的需求。</p>

<p><strong>社区讨论</strong>: 早期采用者强调，为了获得最佳求解器性能而调整 GPU 参数存在陡峭的学习曲线。讨论表明，虽然加速效果令人印象深刻，但该工具最适合用于 CPU 求解器无法在合理时间内收敛的超大规模问题。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#optimization</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#gpu</code>, <code class="language-plaintext highlighter-rouge">#logistics</code>, <code class="language-plaintext highlighter-rouge">#nvidia</code></p>

<hr />

<p><a id="item-59"></a></p>
<h2 id="thunderkittens-加速-cuda-内核开发进程-️-8010"><a href="https://github.com/HazyResearch/ThunderKittens">ThunderKittens 加速 CUDA 内核开发进程</a> ⭐️ 8.0/10</h2>

<p>HazyResearch 发布了 ThunderKittens，这是一个高效的 CUDA 图块原语库，旨在简化高性能深度学习内核的创建。该工具提供了底层构建模块，使开发人员无需从头编写样板代码即可构建优化的 GPU 操作。 优化底层 GPU 内核通常是实现最大模型训练和推理速度的瓶颈。ThunderKittens 通过提供预优化的原语解决了这一问题，显著减少了定制内核开发所需的工程工作量。虽然它主要针对高级系统工程师而非普通用户，但对于致力于突破模型效率极限的研究团队来说，它填补了一个关键空白。 该库专注于提供可组合的图块原语，以在 NVIDIA GPU 上高效地处理内存移动和计算。它专门为需要对硬件资源进行细粒度控制以挤出额外性能指标的专家量身定制。</p>

<p>rss · GitHub Trending - CUDA · Apr 10, 01:33</p>

<p><strong>背景</strong>: 深度学习框架通常依赖于通用内核，这些内核可能无法针对特定的新型模型架构或硬件配置达到最优效果。以前的解决方案通常要求研究人员手动编写复杂且容易出错的 CUDA 代码，以实现最先进的性能。ThunderKittens 通过提供一套经过测试的健壮原语来抽象这些复杂性，弥合了理论算法设计与实际高速执行之间的差距。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#gpu-kernels</code>, <code class="language-plaintext highlighter-rouge">#deep-learning</code>, <code class="language-plaintext highlighter-rouge">#performance</code>, <code class="language-plaintext highlighter-rouge">#systems</code></p>

<hr />

<p><a id="item-60"></a></p>
<h2 id="deeptutor-v10-发布原生智能体个性化辅导系统-️-7010"><a href="https://github.com/HKUDS/DeepTutor">DeepTutor v1.0 发布：原生智能体个性化辅导系统</a> ⭐️ 7.0/10</h2>

<p>DeepTutor 正式发布 v1.0.0 版本，进行了彻底的架构重写并推出了用于持久自主辅导的“TutorBot”。此次更新采用了 Apache-2.0 许可证，并增加了在不同 AI 交互模式间灵活切换的功能。 此次发布标志着从简单的聊天机器人界面向能够维持长期学生上下文和个性化学习路径的原生智能体系统的重大转变。通过在宽松许可证下开源核心逻辑，它使研究人员和开发人员无需从头开始即可构建可定制的教育工具。前端集成 Next.js 确保了适合基于网络的学习平台的现代化响应式用户体验。 该系统后端基于 Python 3.11+ 构建，前端采用 Next.js 16。主要功能包括新的 TutorBot 模块、用于原生智能体交互的命令行界面 (CLI)，以及支持中文、日文和西班牙文等多种语言。</p>

<p>rss · GitHub Trending - Daily · Apr 10, 01:32</p>

<p><strong>背景</strong>: 个性化辅导系统往往难以在长时间会话中保持上下文，或在无需复杂定制开发的情况下动态适应学生需求。DeepTutor 通过实施专为教育场景中的持久记忆和自适应推理而设计的原生智能体架构来解决这一问题。与以前的静态问答机器人不同，该框架将导师视为能够规划和执行多步教学策略的自主智能体。</p>

<p><strong>社区讨论</strong>: 该项目已获得超过 10,000 个 GitHub 星标，并在 Discord、微信和飞书上拥有活跃的社区群组。用户对新 CLI 功能以及将自定义知识库集成到 TutorBot 中的潜力特别感兴趣。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-tutor</code>, <code class="language-plaintext highlighter-rouge">#personalized-learning</code>, <code class="language-plaintext highlighter-rouge">#agent-systems</code>, <code class="language-plaintext highlighter-rouge">#education-tech</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-61"></a></p>
<h2 id="opendataloader-pdf面向-ai-rag-管道的高精度解析器-️-7010"><a href="https://github.com/opendataloader-project/opendataloader-pdf">OpenDataLoader PDF：面向 AI RAG 管道的高精度解析器</a> ⭐️ 7.0/10</h2>

<p>OpenDataLoader PDF 是一款全新的开源库，旨在将复杂的 PDF 文档转换为 Markdown 和带边界框的 JSON 等 AI 就绪格式。它引入了一种混合模式，结合确定性本地解析与 AI 辅助功能，以处理跨越 80 多种语言的表格、公式和扫描文档。该项目声称在真实世界数据集上的整体准确率达到 0.907，位居基准测试榜首。 该工具解决了检索增强生成（RAG）系统中的关键瓶颈，即糟糕的 PDF 解析会导致上下文幻觉或不完整。通过原生支持多语言 OCR 和复杂布局分析，它减少了为大型语言模型清洗数据所需的工程工作量。它提供 Python、Node.js 和 Java SDK，使其能够适配多样化的基础设施栈。此外，其路线图包含用于无障碍合规的自动 PDF 标记功能，从而解决昂贵的人工修复问题。 该库输出用于分块的结构化 Markdown、用于来源引用的带边界框 JSON 以及 HTML，并内置了针对 300 DPI 及以上扫描 PDF 的 OCR 功能。它支持混合处理模式，专门利用 AI 处理无边框表格和 LaTeX 公式等复杂元素，同时保持简单文本提取的确定性。安装过程通过 PyPI、npm 和 Maven Central 进行了简化，并提供了针对 LangChain 等框架的现成集成。</p>

<p>rss · GitHub Trending - Daily · Apr 10, 01:32</p>

<p><strong>背景</strong>: 传统的 PDF 解析器在保持逻辑阅读顺序以及从包含复杂表格的科学论文或财务报告中提取结构化数据方面往往表现不佳。现有的解决方案通常需要独立的工具来进行 OCR、表格检测和文本提取，导致管道碎片化。OpenDataLoader PDF 试图将这些能力统一到一个专门为 LLM 消费而非仅用于人类阅读优化的软件包中。它通过承诺端到端的无障碍标记和高保真布局保留，且不依赖专有组件，从而实现差异化。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/PDF">PDF - Wikipedia</a></li>

</ul>
</details>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#pdf-parser</code>, <code class="language-plaintext highlighter-rouge">#data-engineering</code>, <code class="language-plaintext highlighter-rouge">#rag</code>, <code class="language-plaintext highlighter-rouge">#ai-infrastructure</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-62"></a></p>
<h2 id="superpowers-框架强制执行结构化代理工作流-️-7010"><a href="https://github.com/obra/superpowers">Superpowers 框架强制执行结构化代理工作流</a> ⭐️ 7.0/10</h2>

<p>Superpowers 引入了一个可组合的技能框架，阻止编码代理直接编写代码，转而强制执行规范细化和设计签核的工作流。它自动化生成基于 TDD 的实施计划，并在 Claude Code 和 Cursor 等主要平台上管理子代理驱动的开发周期。 该项目通过将 YAGNI 和 DRY 等既定工程原则直接嵌入代理行为，解决了 AI 软件开发中关键的可靠性差距。通过强制代理在编码前暂停以等待人类对规范的批准，它显著减少了幻觉功能和架构漂移。该框架将自主代理从不可预测的代码生成器转变为能够专注工作数小时的纪律严明的初级工程师。 该系统通过拦截初始代理提示来提取需求，将其以易于消化的块呈现给用户验证，并生成严格的红/绿测试驱动开发计划。一旦获得批准，它将协调一个子代理流程，迭代地检查和审查工作，而不会偏离已签核的设计。安装过程通过 Claude Code、Cursor 和 GitHub Copilot 的官方市场简化，同时为 Codex 和 OpenCode 提供了手动选项。</p>

<p>rss · GitHub Trending - Daily · Apr 10, 01:32</p>

<p><strong>背景</strong>: 在 Superpowers 等框架出现之前，大多数编码代理缺乏结构化的方法论，往往在没有充分规划或需求分析的情况下直接开始实施。这种倾向导致代码库臃肿、忽视测试协议，以及解决方案无法满足实际用户需求。Superpowers 通过充当中间件层填补了这一空白，在现有大语言模型能力之上强加了严格的软件开发生命周期。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://grokipedia.com/page/Superpowers_agentic_skills_framework">Superpowers (agentic skills framework)</a></li>
<li><a href="https://en.wikipedia.org/wiki/YAGNI_principle">YAGNI principle</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者强调了该框架使代理能够长时间保持正轨的能力，尽管也有人指出初始设置需要仔细配置“技能”以匹配特定的项目背景。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#software-engineering</code>, <code class="language-plaintext highlighter-rouge">#llm-workflows</code>, <code class="language-plaintext highlighter-rouge">#development-methodology</code>, <code class="language-plaintext highlighter-rouge">#agent-framework</code></p>

<hr />

<p><a id="item-63"></a></p>
<h2 id="用于实时-ai-交易分析的开源-mcp-服务器-️-7010"><a href="https://github.com/atilaahmettaner/tradingview-mcp">用于实时 AI 交易分析的开源 MCP 服务器</a> ⭐️ 7.0/10</h2>

<p>tradingview-mcp 项目推出了一款新的模型上下文协议（MCP）服务器，将 Claude 等 AI 助手与实时的加密货币和股票市场数据连接起来。它集成了超过 30 种技术分析工具（包括布林带和 K 线形态识别），无需复杂的 API 密钥管理即可直接融入 AI 的上下文中。 该工具通过提供标准化的金融数据接口，显著降低了构建 AI 驱动交易代理的门槛，而以往这需要自定义脚本或彭博终端等昂贵设备。利用 MCP 开发者可以立即为大型语言模型配备来自 Reddit 和 RSS 的实时情绪分析以及历史回测能力。免除多重 API 密钥配置简化了个人交易者和研究人员部署复杂金融科技工作流的流程。 该服务器支持来自币安、KuCoin 和 Bybit 的多交易所数据，提供实时筛选功能以及六种内置的回测策略（含夏普比率计算）。它专为与 Claude Desktop 及其他兼容 MCP 的客户端即时集成而设计，基于 Python 3.10+，且访问基础市场数据无需 API 密钥。</p>
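
<p><strong>代码示意</strong>: 布林带和夏普比率本质上是简单的滚动统计量。下面的 pandas 草图演示这类指标如何计算，帮助理解此类 MCP 工具返回给模型的内容；这只是通用公式的示意实现，并非该项目的内部代码或接口。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np
import pandas as pd

def bollinger_bands(close: pd.Series, window: int = 20, k: float = 2.0) -> pd.DataFrame:
    """布林带：滚动均值加减 k 倍滚动标准差（通用公式的示意实现）。"""
    mid = close.rolling(window).mean()
    std = close.rolling(window).std()
    return pd.DataFrame({"mid": mid, "upper": mid + k * std, "lower": mid - k * std})

def sharpe_ratio(returns: pd.Series, periods_per_year: int = 365) -> float:
    """年化夏普比率（假设无风险利率为 0，按日频数据年化）。"""
    return float(returns.mean() / (returns.std() + 1e-12) * np.sqrt(periods_per_year))

# 伪造一段收盘价序列，演示指标输出
close = pd.Series(30000 * np.exp(np.random.randn(200).cumsum() * 0.01))
print(bollinger_bands(close).tail(3))
print("Sharpe:", round(sharpe_ratio(close.pct_change().dropna()), 2))
</code></pre></div></div>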

<p>rss · GitHub Trending - Python · Apr 10, 01:39</p>

<p><strong>背景</strong>: 在此开发之前，将实时金融数据与大型语言模型集成通常涉及碎片化的解决方案、高昂的成本或管理多样化交易所 API 的巨大工程开销。Anthropic 推出的模型上下文协议（MCP）产生了对能够标准化这些连接的专用服务器的需求。该项目通过提供一个专为量化分析和交易智能定制的免费开源桥梁，填补了这一空白。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://modelcontextprotocol.io/docs/getting-started/intro">What is the Model Context Protocol (MCP )?</a></li>
<li><a href="https://en.wikipedia.org/wiki/Model_Context_Protocol">Model Context Protocol - Wikipedia</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 作为一个得分为 7.0 的新发布工具，它在对金融科技自动化感兴趣的开发者中逐渐受到关注，尽管关于其长期稳定性的更广泛社区反馈仍在形成中。早期采用者强调了其在无需传统基础设施设置摩擦的情况下快速原型化交易机器人的效用。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#mcp</code>, <code class="language-plaintext highlighter-rouge">#ai-trading</code>, <code class="language-plaintext highlighter-rouge">#fintech</code>, <code class="language-plaintext highlighter-rouge">#claude-desktop</code>, <code class="language-plaintext highlighter-rouge">#python</code></p>

<hr />

<p><a id="item-64"></a></p>
<h2 id="rowboat具备持久记忆功能的开源-ai-同事-️-7010"><a href="https://github.com/rowboatlabs/rowboat">Rowboat：具备持久记忆功能的开源 AI 同事</a> ⭐️ 7.0/10</h2>

<p>Rowboat 是一款全新的开源桌面应用，它通过从电子邮件和会议笔记中构建持久的知识图谱来充当 AI 同事。与瞬时聊天机器人不同，它在本地保留上下文，用于生成报告、准备会议和长期跟踪主题。该工具集成了 Google 服务，并支持通过 Deepgram 和 ElevenLabs 进行语音输入。 该项目解决了当前 AI 代理缺乏长期记忆和跨会话上下文连续性的关键局限。通过在本地处理数据，它在保持高实用性的同时，提供了一种可替代依赖云端生产力工具的隐私优先方案。它代表了向“本地优先”AI 应用的转变，让用户拥有自己的知识图谱。然而，其价值目前主要局限于电子邮件和日历管理等特定工作流，而非通用的代码生成。 Rowboat 作为一款本地优先的应用运行，将非结构化工作数据转换为可编辑的基于 Markdown 的知识图谱。它支持用于网络搜索 (Exa)、语音输入/输出以及通过 MCP 或 Composio 连接外部工具的可选集成。用户可以查询此图谱以自动生成 PDF 演示文稿、会议简报或语音笔记。安装需要手动配置 API 密钥以启用语音和搜索等增强功能。</p>

<p>rss · GitHub Trending - TypeScript · Apr 10, 01:41</p>

<p><strong>背景</strong>: 大多数 AI 编程助手以无状态模式运行，一旦会话结束就会忘记之前的交互，这阻碍了复杂的项目管理。Rowboat 填补了持久性个人 AI 代理的空白，它能在不将敏感数据发送到第三方服务器的情况下，随时间积累机构知识。当其他工具专注于实时代码补全时，Rowboat 则侧重于综合历史沟通和文档。这种方法符合对能够管理长期任务并维护项目状态的 AI 代理日益增长的需求。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://github.com/rowboatlabs/rowboat">GitHub - rowboatlabs/ rowboat : Open-source AI coworker, with...</a></li>
<li><a href="https://www.rowboatlabs.com/">Rowboat - Your AI coworker, with memory</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 早期采用者强调了持久记忆功能的新颖性，但指出各种 API 密钥的设置过程对非技术用户来说可能很繁琐。社区特别关注基于 Markdown 的图谱如何演变，以及它是否能有效扩展到大型工程团队。一些讨论集中在将其能力从行政任务扩展到实际代码库分析的潜力上。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#ai-agents</code>, <code class="language-plaintext highlighter-rouge">#typescript</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#memory</code>, <code class="language-plaintext highlighter-rouge">#open-source</code></p>

<hr />

<p><a id="item-65"></a></p>
<h2 id="gitnexus用于代码智能的客户端图-rag-工具-️-7010"><a href="https://github.com/abhigyanpatwari/GitNexus">GitNexus：用于代码智能的客户端图 RAG 工具</a> ⭐️ 7.0/10</h2>

<p>GitNexus 推出了一款基于浏览器的工具，可直接从 GitHub 仓库或 ZIP 文件生成交互式知识图谱和 Graph RAG 代理。该工具完全在客户端运行，无需服务器基础设施即可提供深度的代码分析能力。该项目最近因其能够在本地运行且不将代码发送至外部服务器而受到关注。 该工具通过将所有处理保留在本地，解决了与基于云的代码智能平台相关的关键隐私和延迟问题。探索陌生大型代码库的开发者现在可以在不泄露专有数据风险的情况下可视化依赖关系和执行流程。通过利用 Graph RAG，它为 AI 代理提供了朴素检索方法经常遗漏的结构化上下文，从而产生更准确的代码建议。零服务器架构也消除了个人开发者和小型团队的成本障碍。 GitNexus 提供两种主要使用模式：用于快速视觉探索的 Web UI，以及集成模型上下文协议（MCP）用于日常开发工作流的 CLI。Web UI 受浏览器内存限制，大约支持 5000 个文件，而 CLI 使用 LadybugDB 存储，支持完整大小的仓库。它明确区别于像 DeepWiki 这样的描述性工具，专注于调用链和依赖关系的关联分析。</p>
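
<p><strong>代码示意</strong>: 代码图 RAG 的关键在于索引“调用、依赖”等结构关系，而不只是文本相似度。下面用 Python 标准库 <code class="language-plaintext highlighter-rouge">ast</code> 构建一个“调用者指向被调用名”的最小调用图，说明这类图能回答的结构化问题；GitNexus 本身是浏览器端的 TypeScript 实现，解析与存储细节以项目文档为准，此处仅为概念示意。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import ast
from collections import defaultdict
from pathlib import Path

def build_call_graph(repo: Path) -> dict:
    """扫描仓库中的 Python 文件，构建“函数指向被调用名”的调用图（概念示意）。"""
    graph = defaultdict(set)
    for py in repo.rglob("*.py"):
        tree = ast.parse(py.read_text(encoding="utf-8"), filename=str(py))
        for func in ast.walk(tree):
            if not isinstance(func, ast.FunctionDef):
                continue
            caller = f"{py.stem}.{func.name}"
            for node in ast.walk(func):
                if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                    graph[caller].add(node.func.id)
    return graph

graph = build_call_graph(Path("./my-repo"))
# 回答“谁调用了 parse_config？”这类结构化问题，再把相关函数作为上下文喂给智能体
print([f for f, callees in graph.items() if "parse_config" in callees])
</code></pre></div></div>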

<p>rss · GitHub Trending - TypeScript · Apr 10, 01:41</p>

<p><strong>背景</strong>: 传统的代码探索工具通常依赖简单的文本搜索或向量嵌入，无法捕捉代码库中复杂的架构关系。现有的 Graph RAG 解决方案（如微软的实现）通常需要大量的服务器端计算和设置，使得它们难以用于快速的临时分析。GitNexus 通过将基于图的上下文工程引入浏览器填补了这一空白，允许在无后端开销的情况下即时索引任何仓库。这种方法满足了对尊重数据主权的安全、高效 AI 辅助编码环境日益增长的需求。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://microsoft.github.io/graphrag/">Welcome - GraphRAG</a></li>
<li><a href="https://grokipedia.com/page/GraphRAG">GraphRAG</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 项目维护者已发出强烈警告，指出存在使用 GitNexus 名称的未经授权加密货币代币，并澄清不存在官方发行的代币。目前的活跃开发讨论和支持集中在其官方 Discord 频道，用户在那里分享关于与 Cursor 和 Claude Code 等工具进行 MCP 集成的反馈。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#code-intelligence</code>, <code class="language-plaintext highlighter-rouge">#graph-rag</code>, <code class="language-plaintext highlighter-rouge">#developer-tools</code>, <code class="language-plaintext highlighter-rouge">#client-side</code>, <code class="language-plaintext highlighter-rouge">#knowledge-graph</code></p>

<hr />

<p><a id="item-66"></a></p>
<h2 id="gpumd高性能-gpu-分子动力学引擎-️-7010-1"><a href="https://github.com/brucefan1983/GPUMD">GPUMD：高性能 GPU 分子动力学引擎</a> ⭐️ 7.0/10</h2>

<p>GPUMD 是一个专为图形处理器（GPU）优化的分子动力学软件包，利用 CUDA 技术实现全 GPU 运行。它使研究人员能够以远高于传统 CPU 方法的效率模拟原子和分子的物理运动。该项目利用并行计算架构加速了计算化学和材料科学领域的科学模拟。 分子动力学模拟通常涉及大量粒子，导致计算成本高昂且往往无法通过解析方法求解。通过将这些高强度计算卸载到 GPU 上，GPUMD 大幅缩短了模拟时间，使得研究更长的轨迹和更大的系统成为可能。这种加速对于生物物理学和材料设计的研究至关重要，因为这些领域常受限于时间尺度。尽管不在核心 AI 模型训练生态系统内，但其高性能计算能力对于生成常用于训练机器学习势函数的数据不可或缺。 该软件专为 NVIDIA GPU 设计，采用 CUDA 编程模型以最大化吞吐量。它使用专为并行执行定制的数值方法来求解相互作用粒子的牛顿运动方程。与标准 CPU 实现相比，用户在模拟复杂分子系统时期望获得显著的性能提升。</p>
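
<p><strong>代码示意</strong>: 分子动力学的主循环就是逐步数值求解牛顿运动方程。下面用约化单位的 Lennard-Jones 体系写一个速度 Verlet 积分的最小 Python 示意（CPU、O(N²) 朴素实现），GPUMD 在 CUDA 上大规模并行化的正是这类计算；这并非其实际代码。</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

def lj_forces(pos, eps=1.0, sigma=1.0):
    """Lennard-Jones 对势的合力（约化单位，O(N^2) 朴素实现，仅作演示）。"""
    n = len(pos)
    forces = np.zeros_like(pos)
    for i in range(n):
        for j in range(i + 1, n):
            rij = pos[i] - pos[j]
            r2 = float(rij @ rij)
            inv6 = (sigma * sigma / r2) ** 3
            # 力的大小为 24*eps*(2*inv12 - inv6)/r^2，方向沿 rij
            f = 24.0 * eps * (2.0 * inv6 * inv6 - inv6) / r2 * rij
            forces[i] += f
            forces[j] -= f
    return forces

def velocity_verlet(pos, vel, dt=0.005, steps=100, mass=1.0):
    """速度 Verlet 积分：GPUMD 在 GPU 上并行化的正是这类逐步求解牛顿方程的循环。"""
    f = lj_forces(pos)
    for _ in range(steps):
        pos = pos + vel * dt + 0.5 * f / mass * dt * dt
        f_new = lj_forces(pos)
        vel = vel + 0.5 * (f + f_new) / mass * dt
        f = f_new
    return pos, vel

grid = np.arange(3, dtype=float)
pos = np.array(np.meshgrid(grid, grid, grid)).reshape(3, -1).T * 1.5  # 27 个原子的简立方晶格
vel = np.zeros_like(pos)
pos, vel = velocity_verlet(pos, vel)
print("平均每原子动能:", 0.5 * float((vel ** 2).sum()) / len(pos))
</code></pre></div></div>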

<p>rss · GitHub Trending - CUDA · Apr 10, 01:33</p>

<p><strong>背景</strong>: 分子动力学（MD）是一种通过数值求解牛顿运动方程来分析原子和分子物理运动的计算机模拟方法。传统的 MD 软件包通常依赖 CPU 或混合 CPU-GPU 方法，这在模拟大规模系统长时间过程时可能成为瓶颈。GPUMD 通过提供高效的原生 GPU 引擎填补了这一空白，最大限度地减少了数据传输开销并提升了并行处理能力。这种方法通过在可行时间内使用更精确的算法，解决了与长期模拟相关的数学病态和累积误差问题。</p>

<details><summary>参考链接</summary>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Molecular_dynamics_simulation">Molecular dynamics simulation</a></li>
<li><a href="https://grokipedia.com/page/Thread_block_(CUDA_programming)">Thread block (CUDA programming)</a></li>

</ul>
</details>

<p><strong>社区讨论</strong>: 该项目得分为 7.0，表明尽管是小众工具，但对计算化学专家具有很高的实用价值。相关讨论可能集中在特定原子间势的优化技术以及全 GPU 执行工作流程的实际效益上。</p>

<p><strong>标签</strong>: <code class="language-plaintext highlighter-rouge">#molecular-dynamics</code>, <code class="language-plaintext highlighter-rouge">#cuda</code>, <code class="language-plaintext highlighter-rouge">#hpc</code>, <code class="language-plaintext highlighter-rouge">#computational-chemistry</code>, <code class="language-plaintext highlighter-rouge">#gpu</code></p>

<hr />]]></content><author><name></name></author><summary type="html"><![CDATA[From 132 items, 66 important content pieces were selected]]></summary></entry></feed>