<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>뉴스 on Ted Factory</title><link>https://tedfactory.com/en/tags/%EB%89%B4%EC%8A%A4/</link><description>Recent content in 뉴스 on Ted Factory</description><generator>Hugo</generator><language>en</language><lastBuildDate>Sat, 13 Jun 2026 11:31:46 +0900</lastBuildDate><atom:link href="https://tedfactory.com/en/tags/%EB%89%B4%EC%8A%A4/index.xml" rel="self" type="application/rss+xml"/><item><title>AI News</title><link>https://tedfactory.com/en/news/ai-news/</link><pubDate>Wed, 29 Apr 2026 00:00:00 +0900</pubDate><guid>https://tedfactory.com/en/news/ai-news/</guid><description>&lt;h1 id="ai-news"&gt;AI News&lt;a class="anchor" href="#ai-news"&gt;#&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;&lt;img src="https://tedfactory.com/images/news/ai-news.png" alt="AI News" /&gt;&lt;/p&gt;
&lt;p&gt;This group collects AI technology, product, developer tool, infrastructure, and policy updates that seem worth checking from the author&amp;rsquo;s perspective.&lt;/p&gt;
&lt;p&gt;This page acts as the index for individual AI News briefs. Brief pages are not shown directly in the left sidebar; instead, they are listed below in reverse chronological order.&lt;/p&gt;
&lt;h2 id="what-this-covers"&gt;What This Covers&lt;a class="anchor" href="#what-this-covers"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;AI models, agents, inference, multimodal systems, and on-device AI&lt;/li&gt;
&lt;li&gt;Major announcements from OpenAI, Anthropic, Google DeepMind, Meta AI, Microsoft, NVIDIA, and Hugging Face&lt;/li&gt;
&lt;li&gt;Developer tools such as Cursor, Claude Code, GitHub Copilot, MCP, evaluation tools, and deployment tools&lt;/li&gt;
&lt;li&gt;AI product launches, pricing changes, API updates, and changes that affect real usage&lt;/li&gt;
&lt;li&gt;AI infrastructure trends such as GPUs, inference cost, cloud services, and data centers&lt;/li&gt;
&lt;li&gt;Copyright, regulation, safety, and data usage policy&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="how-to-read"&gt;How To Read&lt;a class="anchor" href="#how-to-read"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Each brief is written to be skimmed in about five minutes.&lt;/li&gt;
&lt;li&gt;When more context is needed, follow the original article or video link inside each item.&lt;/li&gt;
&lt;li&gt;When interpretation matters more than the headline, the brief adds a short note on why the story is worth tracking.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="latest-news"&gt;Latest News&lt;a class="anchor" href="#latest-news"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;div class="tf-news-card-list"&gt;
 &lt;a class="tf-news-card" href="https://tedfactory.com/en/news/ai-news/20260613-ai-brief/"&gt;
 &lt;div class="tf-news-card__body"&gt;
 &lt;h3 class="tf-news-card__title"&gt;2026-06-13 AI News Brief&lt;/h3&gt;
 &lt;p class="tf-news-card__summary"&gt;Anthropic's Claude Fable 5 / Mythos 5 launch and the US government directive to suspend access, OpenAI's Ona acquisition and Oracle Cloud partnership, Google DeepMind's multi-agent safety research fund, the AI subscription / token price war, and Xiaomi's open-source MiMo Code agent.&lt;/p&gt;</description></item><item><title>News</title><link>https://tedfactory.com/en/news/</link><pubDate>Wed, 29 Apr 2026 00:00:00 +0900</pubDate><guid>https://tedfactory.com/en/news/</guid><description>&lt;h1 id="news"&gt;News&lt;a class="anchor" href="#news"&gt;#&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;&lt;img src="https://tedfactory.com/images/news/news-hero.png" alt="News" /&gt;&lt;/p&gt;
&lt;p&gt;This section collects technology updates that seem worth checking directly.&lt;/p&gt;
&lt;p&gt;It is not meant to cover every story like a general news site. Instead, each topic group curates a small number of important updates, summarizes them briefly, and links to the original article or video for readers who want more detail.&lt;/p&gt;
&lt;h2 id="news-groups"&gt;News Groups&lt;a class="anchor" href="#news-groups"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;div class="tf-news-group-list"&gt;
 &lt;a class="tf-news-group-card" href="https://tedfactory.com/en/news/ai-news/"&gt;
 &lt;div class="tf-news-group-card__media"&gt;
 &lt;img src="https://tedfactory.com/images/news/ai-news.png" alt="AI News" loading="lazy" /&gt;
 &lt;/div&gt;
 &lt;div class="tf-news-group-card__body"&gt;
 &lt;h3 class="tf-news-group-card__title"&gt;AI News&lt;/h3&gt;
 &lt;p class="tf-news-group-card__summary"&gt;Short briefings on AI models, agents, developer tools, product updates, infrastructure, and policy changes worth checking.&lt;/p&gt;</description></item><item><title>2026-04-30 AI News Brief</title><link>https://tedfactory.com/en/news/ai-news/20260430-ai-brief/</link><pubDate>Thu, 30 Apr 2026 00:00:00 +0900</pubDate><guid>https://tedfactory.com/en/news/ai-news/20260430-ai-brief/</guid><description>&lt;h1 id="2026-04-30-ai-news-brief"&gt;2026-04-30 AI News Brief&lt;a class="anchor" href="#2026-04-30-ai-news-brief"&gt;#&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;Here is a short summary of AI technology news and videos worth checking today. Since there was no previous brief, this edition uses the last seven days as the default review window.&lt;/p&gt;
&lt;h2 id="quick-summary"&gt;Quick Summary&lt;a class="anchor" href="#quick-summary"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Cursor released a TypeScript SDK for the same agent runtime used across its desktop app, CLI, and web app.&lt;/li&gt;
&lt;li&gt;OpenAI models, Codex, and Managed Agents are coming to Amazon Bedrock, widening the enterprise deployment path.&lt;/li&gt;
&lt;li&gt;OpenAI published Symphony, a spec for orchestrating Codex runs around issue trackers and isolated workspaces.&lt;/li&gt;
&lt;li&gt;NVIDIA introduced Nemotron 3 Nano Omni, an open multimodal model for vision, audio, image, and text reasoning.&lt;/li&gt;
&lt;li&gt;YouTube is testing Ask YouTube, a conversational search experience that blends text answers and video results.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="top-stories"&gt;Top Stories&lt;a class="anchor" href="#top-stories"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="cursor-releases-its-sdk"&gt;Cursor Releases Its SDK&lt;a class="anchor" href="#cursor-releases-its-sdk"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; Cursor released a TypeScript SDK that exposes the agent runtime and models behind its desktop app, CLI, and web app. Developers can install &lt;code&gt;@cursor/sdk&lt;/code&gt;, run agents locally or on Cursor cloud VMs, and stream events into their own workflows.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Cursor is moving beyond an IDE product toward an agent execution platform. For developer tool builders, this is another signal that the runtime layer for launching, observing, and controlling agents is becoming a product category of its own.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; For Ted Factory-style personal projects, the SDK approach may make it easier to attach task-level agents to repeatable workflows.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://cursor.com/changelog/sdk-release" target="_blank"&gt;Read the Cursor SDK announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="openai-models-codex-and-managed-agents-come-to-aws"&gt;OpenAI Models, Codex, and Managed Agents Come to AWS&lt;a class="anchor" href="#openai-models-codex-and-managed-agents-come-to-aws"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; OpenAI and AWS expanded their partnership: OpenAI models, Codex, and Amazon Bedrock Managed Agents powered by OpenAI are entering limited preview. AWS customers can use models such as GPT-5.5 and Codex inside Bedrock while relying on AWS security, billing, and governance controls.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; OpenAI agents and models are moving directly into enterprise cloud infrastructure. That gives companies a more familiar path to adoption without building a separate security and procurement model from scratch.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; Codex support through the Bedrock API, starting with CLI, desktop app, and VS Code extension access, shows how quickly coding agents are becoming enterprise deployment targets.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://openai.com/index/openai-on-aws/" target="_blank"&gt;Read the OpenAI announcement&lt;/a&gt;, &lt;a href="https://aws.amazon.com/about-aws/whats-new/2026/04/bedrock-openai-models-codex-managed-agents/" target="_blank"&gt;Read the AWS announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="openai-publishes-symphony-for-codex-orchestration"&gt;OpenAI Publishes Symphony for Codex Orchestration&lt;a class="anchor" href="#openai-publishes-symphony-for-codex-orchestration"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; OpenAI published Symphony, an open-source spec for orchestrating Codex runs. The spec describes a long-running service that polls an issue tracker, creates an isolated workspace per issue, and launches a coding-agent session for that issue.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; The coding-agent bottleneck is shifting from “can the model write code?” to “which task should run, in which isolated environment, with what observability and retry behavior?” Symphony treats that operational layer as an explicit system design problem.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; This is closely connected to harness engineering. Agent work is becoming less like a single prompt and more like a system of issues, workspaces, retries, and observable runs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://openai.com/index/open-source-codex-orchestration-symphony/" target="_blank"&gt;Read the OpenAI announcement&lt;/a&gt;, &lt;a href="https://github.com/openai/symphony/blob/main/SPEC.md" target="_blank"&gt;Read the Symphony spec&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="nvidia-introduces-nemotron-3-nano-omni"&gt;NVIDIA Introduces Nemotron 3 Nano Omni&lt;a class="anchor" href="#nvidia-introduces-nemotron-3-nano-omni"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; NVIDIA introduced Nemotron 3 Nano Omni, an open multimodal model that combines vision, audio, image, and text reasoning. NVIDIA says the model reduces latency and cost versus stitching together separate perception models, with up to 9x higher throughput under comparable interactive conditions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Agents that work with screens, documents, audio, and video need fast multimodal perception. Nemotron 3 Nano Omni points toward a pattern where efficient perception submodels support larger agent workflows instead of handing every step to a frontier model.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; It is worth tracking as a potential lower-level component for computer-use agents, document intelligence, and audio / video automation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://blogs.nvidia.com/blog/nemotron-3-nano-omni-multimodal-ai-agents/" target="_blank"&gt;Read the NVIDIA announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="youtube-tests-ask-youtube"&gt;YouTube Tests Ask YouTube&lt;a class="anchor" href="#youtube-tests-ask-youtube"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; YouTube is testing Ask YouTube, a conversational search experiment for U.S. Premium subscribers aged 18 or older. The feature returns text summaries, long-form videos, Shorts, and relevant video segments in response to natural-language questions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Video search is moving from a list of videos toward a blended answer interface with summaries, evidence, and follow-up questions. That could change both content discovery and creator visibility.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; When using YouTube as a source for future briefs, the important artifact may become not only the video itself but also the AI-generated segments and summaries around it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://www.theverge.com/streaming/919441/google-ask-youtube-ai-chatbot-search" target="_blank"&gt;Read The Verge coverage&lt;/a&gt;, &lt;a href="https://techcrunch.com/2026/04/28/youtube-is-testing-an-ai-powered-search-feature-that-shows-guided-answers/" target="_blank"&gt;Read TechCrunch coverage&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="youtube-brief"&gt;YouTube Brief&lt;a class="anchor" href="#youtube-brief"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="autoresearch-agent-loops-and-the-future-of-work"&gt;Autoresearch, Agent Loops and the Future of Work&lt;a class="anchor" href="#autoresearch-agent-loops-and-the-future-of-work"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Channel&lt;/strong&gt;: The AI Daily Brief&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Key idea&lt;/strong&gt; The episode uses Andrej Karpathy&amp;rsquo;s Autoresearch project to explain a loop-based workflow where agents run experiments, keep only improvements, and revert failed attempts. It connects fixed time budgets, single evaluation metrics, rollback behavior, and committed improvements to the future of research and product experimentation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why watch&lt;/strong&gt; It is useful for understanding that agent work is becoming less about one-off answers and more about repeatable experiment loops. That connects directly to harnesses, workspace isolation, and evaluation design.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Video&lt;/strong&gt;: &lt;a href="https://www.youtube.com/watch?v=nt9j1k2IhUY" target="_blank"&gt;Watch the video&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>2026-05-02 AI News Brief</title><link>https://tedfactory.com/en/news/ai-news/20260502-ai-brief/</link><pubDate>Sat, 02 May 2026 00:00:00 +0900</pubDate><guid>https://tedfactory.com/en/news/ai-news/20260502-ai-brief/</guid><description>&lt;h1 id="2026-05-02-ai-news-brief"&gt;2026-05-02 AI News Brief&lt;a class="anchor" href="#2026-05-02-ai-news-brief"&gt;#&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;Here is a short summary of AI technology news and videos worth checking today. This edition focuses on updates from May 1-2, after the previous brief. It also includes Claude Security&amp;rsquo;s April 30 public beta, which that brief did not cover.&lt;/p&gt;
&lt;h2 id="quick-summary"&gt;Quick Summary&lt;a class="anchor" href="#quick-summary"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Cursor now lets admins create team marketplaces for plugins without first connecting a repository.&lt;/li&gt;
&lt;li&gt;GitHub Copilot will deprecate GPT-5.2 and GPT-5.2-Codex on June 1 and has named replacement models.&lt;/li&gt;
&lt;li&gt;Claude Security is now in public beta for Enterprise customers, offering vulnerability scans and proposed fixes.&lt;/li&gt;
&lt;li&gt;The U.S. Department of Defense expanded AI agreements for classified networks across several major AI providers.&lt;/li&gt;
&lt;li&gt;Anthropic&amp;rsquo;s MCP video explains how the Model Context Protocol works with the Claude API and agent systems.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="top-stories"&gt;Top Stories&lt;a class="anchor" href="#top-stories"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="cursor-strengthens-team-marketplace-settings"&gt;Cursor Strengthens Team Marketplace Settings&lt;a class="anchor" href="#cursor-strengthens-team-marketplace-settings"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; Cursor now lets admins create a team marketplace without connecting a repository first. Team marketplaces can distribute plugins that bundle MCP servers, skills, subagents, rules, and hooks, with each plugin set to Default Off, Default On, or Required.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Agent tooling is moving from individual preference into team-level operations. For organizations, the question of which tools and permissions agents should receive can now be managed as policy instead of being left to each developer&amp;rsquo;s local setup.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; For harness engineering, plugin bundles, execution permissions, and team defaults are becoming part of the system design.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://cursor.com/changelog/05-01-26" target="_blank"&gt;Read the Cursor announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="github-copilot-plans-gpt-52-model-deprecations"&gt;GitHub Copilot Plans GPT-5.2 Model Deprecations&lt;a class="anchor" href="#github-copilot-plans-gpt-52-model-deprecations"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; GitHub announced that GPT-5.2 and GPT-5.2-Codex will be deprecated across Copilot experiences on June 1, 2026. GitHub recommends GPT-5.5 as the replacement for GPT-5.2 and GPT-5.3-Codex as the replacement for GPT-5.2-Codex.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Coding-agent workflows depend on model choice for quality, cost, speed, and policy. Copilot Enterprise admins in particular need to check model policies and make sure their workflows are not pinned to models that are going away.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; Teams running long-lived agents or automated code review should avoid hardcoding model names into operational workflows.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.blog/changelog/2026-05-01-upcoming-deprecation-of-gpt-5-2-and-gpt-5-2-codex/" target="_blank"&gt;Read the GitHub Changelog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="claude-security-enters-public-beta"&gt;Claude Security Enters Public Beta&lt;a class="anchor" href="#claude-security-enters-public-beta"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; Anthropic released Claude Security in public beta for Claude Enterprise customers. Claude Security scans codebases for vulnerabilities, explains severity and reproduction details, proposes patch directions, and can hand off fixes into Claude Code on the Web.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Security review is expanding from static pattern detection toward agentic analysis that understands code flow and business logic. At the same time, the same capabilities can increase exploitability if misused, so Anthropic also highlights cyber safeguards and its Cyber Verification Program.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; For development teams, the real productivity metric may be the time from scan to a mergeable patch, not just raw finding count.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://claude.com/blog/claude-security-public-beta" target="_blank"&gt;Read the Claude announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="pentagon-expands-classified-network-ai-deals"&gt;Pentagon Expands Classified-Network AI Deals&lt;a class="anchor" href="#pentagon-expands-classified-network-ai-deals"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; According to TechCrunch and The Verge, the U.S. Department of Defense signed agreements with NVIDIA, Microsoft, Amazon Web Services, and Reflection AI to deploy their AI technology and models on classified networks for &amp;ldquo;lawful operational use.&amp;rdquo; The reports say the broader set of agreements includes seven companies, including OpenAI, Google, and xAI, while Anthropic remains excluded amid a dispute over safety terms.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; AI models and infrastructure are moving quickly into military and national-security environments. This is a live example of AI company use policies, government procurement, safety guardrails, and cloud security requirements colliding.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; The usable scope of commercial AI tools can change dramatically based on contract language and policy decisions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://techcrunch.com/2026/05/01/pentagon-inks-deals-with-nvidia-microsoft-and-aws-to-deploy-ai-on-classified-networks/" target="_blank"&gt;Read TechCrunch coverage&lt;/a&gt;, &lt;a href="https://www.theverge.com/ai-artificial-intelligence/922113/pentagon-ai-classified-openai-google-nvidia" target="_blank"&gt;Read The Verge coverage&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="youtube-brief"&gt;YouTube Brief&lt;a class="anchor" href="#youtube-brief"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="building-with-mcp-and-the-claude-api"&gt;Building with MCP and the Claude API&lt;a class="anchor" href="#building-with-mcp-and-the-claude-api"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Channel&lt;/strong&gt;: Anthropic&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Key idea&lt;/strong&gt; Anthropic&amp;rsquo;s Alex Albert, John Welsh, and Michael Cohen explain the origins of the Model Context Protocol (MCP) and how MCP works with the Claude API. They frame MCP as a universal connector between models and external tools or data sources, then cover remote MCP, registries, the Claude API MCP connector, and tool-design principles.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why watch&lt;/strong&gt; Agents need more than stronger models to work inside real business systems; they need connection patterns, permissions, and well-described tools. This is a useful overview for readers tracking Claude, Cursor, and other agent runtimes together.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Video&lt;/strong&gt;: &lt;a href="https://www.youtube.com/watch?v=aZLr962R6Ag" target="_blank"&gt;Watch the video&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>2026-05-09 AI News Brief</title><link>https://tedfactory.com/en/news/ai-news/20260509-ai-brief/</link><pubDate>Sat, 09 May 2026 00:00:00 +0900</pubDate><guid>https://tedfactory.com/en/news/ai-news/20260509-ai-brief/</guid><description>&lt;h1 id="2026-05-09-ai-news-brief"&gt;2026-05-09 AI News Brief&lt;a class="anchor" href="#2026-05-09-ai-news-brief"&gt;#&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;Here is a short summary of AI technology news worth checking today. This edition focuses on official announcements from May 3-9 after the previous brief; no YouTube item is included because no suitable video could be verified beyond title and description-level evidence.&lt;/p&gt;
&lt;h2 id="quick-summary"&gt;Quick Summary&lt;a class="anchor" href="#quick-summary"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;OpenAI released three new Realtime API models for realtime voice agents, live translation, and streaming transcription.&lt;/li&gt;
&lt;li&gt;OpenAI expanded Trusted Access for Cyber and introduced a limited preview of GPT-5.5-Cyber for verified defenders.&lt;/li&gt;
&lt;li&gt;Anthropic announced a SpaceX compute deal and raised Claude Code and Claude API usage limits.&lt;/li&gt;
&lt;li&gt;Cursor 3.3 added PR review, parallel plan execution, and a way to split multitasking changes into PRs.&lt;/li&gt;
&lt;li&gt;GitHub Copilot&amp;rsquo;s VS Code updates strengthened semantic code search, browser tab sharing, terminal access, and remote CLI session steering.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="top-stories"&gt;Top Stories&lt;a class="anchor" href="#top-stories"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="openai-releases-three-new-voice-models-for-the-realtime-api"&gt;OpenAI Releases Three New Voice Models for the Realtime API&lt;a class="anchor" href="#openai-releases-three-new-voice-models-for-the-realtime-api"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; OpenAI released GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper in the API. GPT-Realtime-2 is a realtime voice model with GPT-5-class reasoning, Translate handles live translation from 70+ input languages into 13 output languages, and Whisper provides streaming speech-to-text while someone is still speaking.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Voice AI is moving beyond simple call-and-response toward interfaces that can listen, reason, call tools, and take action. That can change product experiences in customer support, travel, education, meetings, and live events where typing is inconvenient.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; The important part is not only natural-sounding speech, but the balance between tool calling, interruption recovery, latency, and safety controls.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://openai.com/index/advancing-voice-intelligence-with-new-models-in-the-api/" target="_blank"&gt;Read the OpenAI announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="openai-expands-gpt-55-cyber-and-trusted-access-for-cyber"&gt;OpenAI Expands GPT-5.5-Cyber and Trusted Access for Cyber&lt;a class="anchor" href="#openai-expands-gpt-55-cyber-and-trusted-access-for-cyber"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; OpenAI explained its Trusted Access for Cyber framework and introduced GPT-5.5-Cyber in limited preview. Verified defenders get fewer refusals for approved security work such as vulnerability identification, malware analysis, detection engineering, and patch validation, while requests involving credential theft or real-world harm remain blocked.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Strong models can speed up security work, but the same capabilities can be misused. That makes access control increasingly important: who is using the model, with which permissions, and in what environment.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; Secure code review and automated vulnerability validation can directly improve developer productivity, but only when account security, audit logs, and approved target scope are designed together.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://openai.com/index/gpt-5-5-with-trusted-access-for-cyber/" target="_blank"&gt;Read the OpenAI announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="anthropic-raises-claude-limits-with-a-spacex-compute-deal"&gt;Anthropic Raises Claude Limits With a SpaceX Compute Deal&lt;a class="anchor" href="#anthropic-raises-claude-limits-with-a-spacex-compute-deal"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; Anthropic announced an agreement to use SpaceX&amp;rsquo;s Colossus 1 data center capacity. The company says the deal gives it more than 300 megawatts of new capacity and over 220,000 NVIDIA GPUs within the month. Anthropic is also doubling Claude Code&amp;rsquo;s five-hour rate limits and removing peak-hour limit reductions for Pro and Max accounts.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; AI product quality depends not only on model capability but also on dependable inference capacity. For developer tools such as Claude Code, rate limits and peak-hour policies directly shape real workflows.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; Frontier-model competition is now also an operations race across power, GPUs, data centers, and regional infrastructure.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://www.anthropic.com/news/higher-limits-spacex" target="_blank"&gt;Read the Anthropic announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="cursor-33-strengthens-pr-review-and-parallel-build-flows"&gt;Cursor 3.3 Strengthens PR Review and Parallel Build Flows&lt;a class="anchor" href="#cursor-33-strengthens-pr-review-and-parallel-build-flows"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; Cursor 3.3 added a new PR review experience for reviewing and moving PRs toward merge inside Cursor. It also introduced Build in Parallel, which finds independent parts of a plan and runs them with async subagents, and Split changes into PRs, which turns multitasking changes into logical PR slices.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Coding agents are moving from tools that only write code into tools that plan work, execute parts in parallel, and package changes into reviewable units. In team development, reviewability and change separation matter as much as raw generation speed.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; For harness engineering, the operating problem is how to verify parallel-agent output and split it into small, understandable PRs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://cursor.com/changelog/05-07-26" target="_blank"&gt;Read the Cursor Changelog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="github-copilot-expands-the-vs-code-agent-experience"&gt;GitHub Copilot Expands the VS Code Agent Experience&lt;a class="anchor" href="#github-copilot-expands-the-vs-code-agent-experience"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; GitHub summarized Copilot updates for VS Code releases from April through early May, including semantic search across any workspace, grep-style search across GitHub repositories and organizations, and the experimental &lt;code&gt;/chronicle&lt;/code&gt; chat-history feature. Agents also gained inline diffs in chat, browser tab sharing, read/write access to open terminals, and remote monitoring and steering for Copilot CLI sessions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Agents need reliable access to code, browser state, terminals, and prior conversation context to produce useful work. Copilot&amp;rsquo;s direction looks less like a chatbot inside the IDE and more like an operator across the full development environment.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; Enterprises should track Bring Your Own Key and domain access policies alongside these capabilities. As agents gain more context, productivity and security policy need to be designed together.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.blog/changelog/2026-05-06-github-copilot-in-visual-studio-code-april-releases/" target="_blank"&gt;Read the GitHub Changelog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>2026-05-12 AI News Brief</title><link>https://tedfactory.com/en/news/ai-news/20260512-ai-brief/</link><pubDate>Tue, 12 May 2026 00:00:00 +0900</pubDate><guid>https://tedfactory.com/en/news/ai-news/20260512-ai-brief/</guid><description>&lt;h1 id="2026-05-12-ai-news-brief"&gt;2026-05-12 AI News Brief&lt;a class="anchor" href="#2026-05-12-ai-news-brief"&gt;#&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;Here is a short summary of AI technology news worth checking today. This edition focuses on official announcements and security reports from May 10-12 after the previous brief; no YouTube item is included because no suitable recent video could be verified beyond title and description-level evidence.&lt;/p&gt;
&lt;h2 id="quick-summary"&gt;Quick Summary&lt;a class="anchor" href="#quick-summary"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;OpenAI launched the OpenAI Deployment Company, a dedicated organization for deploying AI into real enterprise workflows.&lt;/li&gt;
&lt;li&gt;Google Threat Intelligence Group published examples of AI-assisted zero-day exploitation and broader adversarial AI usage.&lt;/li&gt;
&lt;li&gt;GitHub MCP Server secret scanning is now generally available, letting AI coding agents check for secrets before commits.&lt;/li&gt;
&lt;li&gt;GitHub Copilot cloud agent now supports organization-level dedicated secrets and variables.&lt;/li&gt;
&lt;li&gt;NVIDIA&amp;rsquo;s 2026 State of AI report shows enterprise AI moving from pilots toward operations and agent deployment.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="top-stories"&gt;Top Stories&lt;a class="anchor" href="#top-stories"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="openai-launches-an-enterprise-ai-deployment-company"&gt;OpenAI Launches an Enterprise AI Deployment Company&lt;a class="anchor" href="#openai-launches-an-enterprise-ai-deployment-company"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; OpenAI launched the OpenAI Deployment Company to design, test, and deploy AI systems in core enterprise workflows. The company will place Forward Deployed Engineers (FDEs) inside customer organizations to connect OpenAI models with data, tools, permissions, and operating processes, and OpenAI expects to add about 150 deployment specialists through its acquisition of Tomoro.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; AI competition is shifting from model capability to whether systems can reliably fit into real work. For enterprises, the hard part is no longer only building demos, but turning security, permissions, governance, evaluation, and operating change into production systems.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; The FDE model blurs the line between AI product companies and consulting firms, while repeatable deployment patterns can flow back into product capabilities.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://openai.com/index/openai-launches-the-deployment-company/" target="_blank"&gt;Read the OpenAI announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="google-publishes-a-security-report-on-adversarial-ai-use"&gt;Google Publishes a Security Report on Adversarial AI Use&lt;a class="anchor" href="#google-publishes-a-security-report-on-adversarial-ai-use"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; Google Threat Intelligence Group (GTIG) published a report on how AI is being used for vulnerability discovery, malware development, defense evasion, information operations, and account abuse. GTIG says it identified, for the first time, a zero-day exploit likely developed with AI support, related to bypassing two-factor authentication (2FA) in a web-based system administration tool.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; AI gives defenders stronger tools for code security and vulnerability remediation, but it also helps attackers find high-level logic flaws and automate parts of the attack lifecycle. The key point is that models can reason about contradictions between developer intent and implementation, which traditional static analysis and fuzzing may miss.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; AI security cannot stop at model refusal policies. Authentication and authorization invariants, secret management, agent tool permissions, and audit logs all need to be designed together.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://cloud.google.com/blog/topics/threat-intelligence/ai-vulnerability-exploitation-initial-access" target="_blank"&gt;Read the Google Cloud report&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="github-mcp-server-secret-scanning-reaches-general-availability"&gt;GitHub MCP Server Secret Scanning Reaches General Availability&lt;a class="anchor" href="#github-mcp-server-secret-scanning-reaches-general-availability"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; GitHub made secret scanning in the GitHub MCP (Model Context Protocol) Server generally available. MCP-compatible AI coding tools such as GitHub Copilot CLI and Visual Studio Code can now scan for exposed tokens, keys, and credentials before a commit or pull request.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; When agents modify code and prepare commits, secret leaks need to be caught earlier in the workflow. Because the MCP tools honor existing push protection customization, teams can apply the same security policies to agent work that they already use for human workflows.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; In AI coding environments, a pre-commit secret scan may become as basic as linting and tests.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.blog/changelog/2026-05-11-secret-scanning-with-github-mcp-server-is-now-generally-available/" target="_blank"&gt;Read the GitHub Changelog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="github-copilot-cloud-agent-adds-organization-level-secrets-and-variables"&gt;GitHub Copilot Cloud Agent Adds Organization-Level Secrets and Variables&lt;a class="anchor" href="#github-copilot-cloud-agent-adds-organization-level-secrets-and-variables"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; GitHub Copilot cloud agent now supports dedicated &amp;ldquo;Agents&amp;rdquo; secrets and variables. Organizations can configure internal package registry tokens, shared Model Context Protocol (MCP) server settings, and environment variables at the organization level, then control which repositories can access them.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Cloud agents need access to private packages, internal APIs, and MCP servers to work inside real company repositories. Centralized organization-level configuration reduces the operational overhead of repeating the same setup across many repositories.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; Features that expand access should be paired with least privilege, repository-scoped access, and auditability. Operational control matters more than convenience.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.blog/changelog/2026-05-11-more-flexible-secrets-and-variables-for-copilot-cloud-agent/" target="_blank"&gt;Read the GitHub Changelog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="nvidia-summarizes-enterprise-ai-adoption-in-its-2026-state-of-ai-report"&gt;NVIDIA Summarizes Enterprise AI Adoption in Its 2026 State of AI Report&lt;a class="anchor" href="#nvidia-summarizes-enterprise-ai-adoption-in-its-2026-state-of-ai-report"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; NVIDIA published its 2026 State of AI report, based on more than 3,200 respondents across financial services, retail, healthcare, telecommunications, and manufacturing. Sixty-four percent of respondents said their organizations are actively using AI in operations, and 44% said they are deploying or assessing AI agents.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Enterprise AI is moving from experimentation toward measured productivity, cost reduction, and revenue impact. The report frames agentic AI, open-source and open-weight models, data readiness, and the shortage of AI experts as key variables for enterprise AI strategy this year.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; From a harness engineering perspective, the important question is not only whether an organization uses AI, but how it verifies AI-generated output and controls cost and permissions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://blogs.nvidia.com/blog/state-of-ai-report-2026/" target="_blank"&gt;Read the NVIDIA Blog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>2026-05-16 AI News Brief</title><link>https://tedfactory.com/en/news/ai-news/20260516-ai-brief/</link><pubDate>Sat, 16 May 2026 00:00:00 +0900</pubDate><guid>https://tedfactory.com/en/news/ai-news/20260516-ai-brief/</guid><description>&lt;h1 id="2026-05-16-ai-news-brief"&gt;2026-05-16 AI News Brief&lt;a class="anchor" href="#2026-05-16-ai-news-brief"&gt;#&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;Today&amp;rsquo;s brief covers AI technology news along with developer tools, open source, infrastructure, and organizational shifts in the AI era. This edition combines official announcements from May 13-16 with technical signals that resurfaced in developer communities.&lt;/p&gt;
&lt;h2 id="quick-summary"&gt;Quick Summary&lt;a class="anchor" href="#quick-summary"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;OpenAI brought Codex into the ChatGPT mobile app so developers can monitor, steer, and approve long-running coding-agent work from a phone.&lt;/li&gt;
&lt;li&gt;Anthropic introduced Claude for Small Business, connecting Claude workflows to tools such as QuickBooks, PayPal, HubSpot, and Canva.&lt;/li&gt;
&lt;li&gt;Cursor 3.4 lets teams configure, version, and audit the development environments used by cloud agents.&lt;/li&gt;
&lt;li&gt;GitHub introduced the Copilot app technical preview and a REST API for starting Copilot cloud agent tasks.&lt;/li&gt;
&lt;li&gt;DeerFlow 2.0, Bun&amp;rsquo;s Rust rewrite, Learning Opportunities, and the &amp;ldquo;Emacsification&amp;rdquo; of software show broader patterns around agent harnesses, large code changes, learning, and personal software.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="top-stories"&gt;Top Stories&lt;a class="anchor" href="#top-stories"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="openai-brings-codex-into-the-chatgpt-mobile-app"&gt;OpenAI Brings Codex Into the ChatGPT Mobile App&lt;a class="anchor" href="#openai-brings-codex-into-the-chatgpt-mobile-app"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; OpenAI released a preview of Codex inside the ChatGPT mobile app. From a phone, users can inspect active Codex threads, review outputs, diffs, test results, and screenshots, approve commands, change models, and start new work.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; The point is not &amp;ldquo;coding on a phone,&amp;rdquo; but coordinating long-running agent work that is already running on a laptop, Mac mini, or remote development environment. Files, credentials, permissions, and local setup stay on the machine where Codex is operating, while the phone receives state and approval flows through a secure relay layer.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; The next layer of coding-agent competition is not only model capability, but when human judgment enters the loop and how approvals are split across mobile, desktop, and remote environments.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://openai.com/index/work-with-codex-from-anywhere/" target="_blank"&gt;Read the OpenAI announcement&lt;/a&gt;, &lt;a href="https://chatgpt.com/codex/mobile/" target="_blank"&gt;Open the Codex mobile page&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="anthropic-introduces-claude-for-small-business"&gt;Anthropic Introduces Claude for Small Business&lt;a class="anchor" href="#anthropic-introduces-claude-for-small-business"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; Anthropic introduced Claude for Small Business. Inside Claude Cowork, businesses can connect tools such as QuickBooks, PayPal, HubSpot, Canva, DocuSign, Google Workspace, and Microsoft 365, then use 15 agentic workflows and 15 skills across finance, operations, sales, marketing, HR, and customer service.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Enterprise AI adoption has centered on permissions, data, and workflows, and the same problems show up in smaller teams with less operational capacity. Anthropic is trying to move AI beyond the chat window and into concrete work units such as month-end close, payroll planning, campaign execution, and invoice chasing.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; The design choice to keep humans in the loop before plans are approved, messages are sent, or payments are made matters. For small businesses, a single automation failure can directly affect cash flow and customer trust.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://www.anthropic.com/news/claude-for-small-business" target="_blank"&gt;Read the Anthropic announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="cursor-34-strengthens-development-environments-for-cloud-agents"&gt;Cursor 3.4 Strengthens Development Environments for Cloud Agents&lt;a class="anchor" href="#cursor-34-strengthens-development-environments-for-cloud-agents"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; Cursor 3.4 gives teams more control over the development environments used by cloud agents and automations. The release includes multi-repo environments, Dockerfile-based environment-as-code, build secrets, layer caching, agent-led setup, environment-level egress and secret scoping, version history, and audit logs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; For an agent to finish engineering work, it needs repositories, dependencies, internal packages, build systems, and credentials in a usable runtime environment. The competition is expanding from &amp;ldquo;does the agent answer well?&amp;rdquo; to &amp;ldquo;does the agent work in a reproducible and governable development environment?&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; Environment versioning and audit logs may become as important as tests for cloud-agent operations. When an agent fails, teams need to know whether the problem came from the model, the environment, or permissions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://cursor.com/changelog/05-13-26" target="_blank"&gt;Read the Cursor Changelog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="github-introduces-the-copilot-app-and-agent-tasks-rest-api"&gt;GitHub Introduces the Copilot App and Agent Tasks REST API&lt;a class="anchor" href="#github-introduces-the-copilot-app-and-agent-tasks-rest-api"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; GitHub released a technical preview of the GitHub Copilot app, a GitHub-native desktop experience for starting work from issues, pull requests, prompts, or previous sessions, reviewing plans and diffs, validating changes with an integrated terminal and browser, and moving the work into pull requests. Separately, Copilot Business and Enterprise users can now start Copilot cloud agent tasks through a REST API in public preview.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; GitHub is turning coding agents into a work system connected to issues, reviews, checks, and pull requests rather than a side feature inside an IDE. The REST API lets teams use agents in automations such as multi-repository refactors, internal developer-portal repository setup, and weekly release preparation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; Once agent tasks can be launched through APIs, success criteria, cost, permissions, and failure recovery need to be designed together. Automated agent work can scale faster than tasks started by a human click.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.blog/changelog/2026-05-14-github-copilot-app-is-now-available-in-technical-preview" target="_blank"&gt;Read the GitHub Copilot app announcement&lt;/a&gt;, &lt;a href="https://github.blog/changelog/2026-05-13-start-copilot-cloud-agent-tasks-via-the-rest-api/" target="_blank"&gt;Read the Agent tasks REST API announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="related-trends"&gt;Related Trends&lt;a class="anchor" href="#related-trends"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="deerflow-20-a-long-horizon-superagent-harness"&gt;DeerFlow 2.0, a Long-Horizon SuperAgent Harness&lt;a class="anchor" href="#deerflow-20-a-long-horizon-superagent-harness"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt; ByteDance&amp;rsquo;s DeerFlow 2.0 is an open-source harness for decomposing tasks that can take minutes to hours, such as research, coding, and content creation, across subagents, sandboxes, memory, skills, and message gateways. The project describes itself as a long-horizon agent harness that combines skills, sandboxes, memory, tools, and subagents to handle complex work.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it is worth reading&lt;/strong&gt; DeerFlow is a useful reference for what agent systems need beyond closed commercial products. Sandboxes, filesystem offloading, and isolated context per subagent are patterns that keep appearing when long-running work needs to be made reliable.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; DeerFlow is worth reading as a harness-design checklist even if you do not adopt it directly. The bigger design problem is not only model calls, but work environments, memory, permissions, and observability.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.com/bytedance/deer-flow" target="_blank"&gt;Open the GitHub repository&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="bun-merges-its-rust-rewrite-pr"&gt;Bun Merges Its Rust Rewrite PR&lt;a class="anchor" href="#bun-merges-its-rust-rewrite-pr"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt; Bun PR #30412 was merged on May 14, 2026, rewriting a large part of Bun in Rust. The PR shows 6,755 commits, 2,188 changed files, and roughly one million added lines. It says the change passes Bun&amp;rsquo;s existing test suite on all platforms, reduces binary size by 3-8 MB, and benchmarks as neutral to faster.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it is worth reading&lt;/strong&gt; This is not strictly AI news, but it raises practical questions about software change at agent-era scale. Because of the &lt;code&gt;claude/phase-a-port&lt;/code&gt; branch name and the community discussion around the change, the merge has become a case study in AI-assisted large rewrites, quality, test trust, reviewability, and release strategy.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; For large automated changes, &amp;ldquo;the tests pass&amp;rdquo; is not the end of the evaluation. Backward compatibility, real workloads, gradual rollout, and explainability of the change all need scrutiny.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.com/oven-sh/bun/pull/30412" target="_blank"&gt;Open the Bun PR&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="learning-opportunities-helps-developers-learn-during-ai-coding"&gt;Learning Opportunities Helps Developers Learn During AI Coding&lt;a class="anchor" href="#learning-opportunities-helps-developers-learn-during-ai-coding"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt; Learning Opportunities is a Claude Code and Codex skill designed to help users develop expertise while doing AI-assisted coding. After work such as creating new files, changing schemas, or refactoring, it offers optional 10-15 minute learning exercises based on learning-science techniques such as prediction, generation, retrieval practice, and spaced repetition.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it is worth reading&lt;/strong&gt; Coding agents can raise productivity, but users may lose understanding if they passively accept generated code. This project positions an agent not only as a tool that does work, but as a tutor that helps the user understand the work better.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; The more often developers use AI tools, the more intentional the learning loop needs to be. Short exercises that make the user explain design decisions, failure modes, and test intent can keep agent reliance healthier.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.com/DrCatHicks/learning-opportunities" target="_blank"&gt;Open the GitHub repository&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="the-emacsification-of-software"&gt;The Emacsification of Software&lt;a class="anchor" href="#the-emacsification-of-software"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt; Quarrelsome argues that AI agents are moving software toward Emacs-style personal customization because individuals can now build native apps for their own problems in hours. The author&amp;rsquo;s example is MDV.app, a macOS Markdown viewer built with Claude that offers search, SQLite FTS indexing, bookmarks, table-of-contents navigation, and remembered reading position.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it is worth reading&lt;/strong&gt; The essay is more useful than broad claims that AI agents will &amp;ldquo;replace developers&amp;rdquo; because it focuses on a smaller, practical shift. If people can improve awkward terminal tools, oversized Electron apps, and personal workflow tools for themselves, the boundary between consuming and making software gets blurrier.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; More personal software may be valuable less for its source code than for its ideas, observations, prompts, and work logs. Ted Factory&amp;rsquo;s widgets and experimental tools fit naturally into this pattern.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://sockpuppet.org/blog/2026/05/12/emacsification/" target="_blank"&gt;Read the original essay&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>2026-05-20 AI News Brief</title><link>https://tedfactory.com/en/news/ai-news/20260520-ai-brief/</link><pubDate>Wed, 20 May 2026 00:00:00 +0900</pubDate><guid>https://tedfactory.com/en/news/ai-news/20260520-ai-brief/</guid><description>&lt;h1 id="2026-05-20-ai-news-brief"&gt;2026-05-20 AI News Brief&lt;a class="anchor" href="#2026-05-20-ai-news-brief"&gt;#&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;Today&amp;rsquo;s brief covers AI technology news along with developer tools, open source, infrastructure, and organizational shifts in the AI era. This edition focuses on official announcements from May 17-20 and agent-operations trends that are worth reading from developer communities.&lt;/p&gt;
&lt;h2 id="quick-summary"&gt;Quick Summary&lt;a class="anchor" href="#quick-summary"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;OpenAI and Dell Technologies announced a collaboration to bring Codex into hybrid and on-premises enterprise environments.&lt;/li&gt;
&lt;li&gt;Anthropic acquired Stainless, a company that builds SDK and MCP server tooling, strengthening Claude&amp;rsquo;s tool connectivity and developer experience.&lt;/li&gt;
&lt;li&gt;Cursor introduced Composer 2.5, a coding model aimed at better long-running work, complex instruction following, and collaboration.&lt;/li&gt;
&lt;li&gt;GitHub made GPT-5.3-Codex the base model for Copilot Business and Enterprise, and expanded Copilot cloud agent with lower-cost models, one-click Actions fixes, and remote control.&lt;/li&gt;
&lt;li&gt;agentmemory, MCP Gateway &amp;amp; Registry, and Simon Willison&amp;rsquo;s six-month LLM recap show what memory, governance, and real-world usefulness now mean for agents.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="top-stories"&gt;Top Stories&lt;a class="anchor" href="#top-stories"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="openai-and-dell-extend-codex-into-hybrid-and-on-premises-enterprise-environments"&gt;OpenAI and Dell Extend Codex Into Hybrid and On-Premises Enterprise Environments&lt;a class="anchor" href="#openai-and-dell-extend-codex-into-hybrid-and-on-premises-enterprise-environments"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; OpenAI and Dell Technologies announced a collaboration to connect Codex with enterprise infrastructure such as the Dell AI Data Platform and Dell AI Factory. OpenAI says more than 4 million developers now use Codex every week, across code review, test coverage, incident response, large-repository reasoning, and increasingly non-coding workflows such as report preparation, lead qualification, and work coordination.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Large enterprises cannot adopt agents on model capability alone. Their codebases, documentation, operational knowledge, and customer data often live inside internal systems, while data sovereignty, security, and cost control need to be handled at the same time.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; Coding-agent adoption in the enterprise is moving from &amp;ldquo;using one cloud service&amp;rdquo; toward placing agents next to internal data and permission systems.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://openai.com/index/dell-codex-enterprise-partnership/" target="_blank"&gt;Read the OpenAI announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="anthropic-acquires-stainless-a-company-behind-sdk-and-mcp-tooling"&gt;Anthropic Acquires Stainless, a Company Behind SDK and MCP Tooling&lt;a class="anchor" href="#anthropic-acquires-stainless-a-company-behind-sdk-and-mcp-tooling"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; Anthropic acquired Stainless. Stainless turns API specifications into SDKs, CLIs (Command-Line Interfaces), and MCP (Model Context Protocol) servers across TypeScript, Python, Go, Java, Kotlin, and other languages, and has helped generate Anthropic&amp;rsquo;s official SDKs since the early days of the API.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; For agents to do real work, models need more than strong answers. They need safe, consistent access to APIs and tools. Anthropic created MCP, and Stainless helps developers make that connection layer less painful.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; Agent-platform competition may increasingly depend on the quality of connections: SDKs, tool schemas, MCP server generation, and permission models, not only model-call pricing.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://www.anthropic.com/news/anthropic-acquires-stainless" target="_blank"&gt;Read the Anthropic announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="cursor-introduces-composer-25"&gt;Cursor Introduces Composer 2.5&lt;a class="anchor" href="#cursor-introduces-composer-25"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; Cursor introduced Composer 2.5. Cursor describes it as a substantial improvement over Composer 2 in intelligence and behavior, with better sustained work on long-running tasks, more reliable complex instruction following, and a more pleasant collaboration experience.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; The practical value of a coding model depends less on one benchmark score and more on whether it keeps context during long tasks, follows instructions until the end, and collaborates smoothly when the user changes direction. Pricing also matters for teams: Cursor lists Standard at $0.50 per million input tokens and $2.50 per million output tokens.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; As lower-cost coding models improve, the operating question shifts from &amp;ldquo;use the most expensive model for important work&amp;rdquo; to &amp;ldquo;route tasks to models based on difficulty.&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://cursor.com/changelog/composer-2-5" target="_blank"&gt;Read the Cursor Changelog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="github-copilot-expands-enterprise-base-models-and-cloud-agent-operations"&gt;GitHub Copilot Expands Enterprise Base Models and Cloud Agent Operations&lt;a class="anchor" href="#github-copilot-expands-enterprise-base-models-and-cloud-agent-operations"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; GitHub changed the base model for Copilot Business and Copilot Enterprise organizations from GPT-4.1 to GPT-5.3-Codex. GPT-5.3-Codex is GitHub and OpenAI&amp;rsquo;s first long-term support (LTS) model and will remain available through February 4, 2027. GitHub also added Claude Haiku 4.5 and GPT-5.4-mini as 0.33x request-unit models for Copilot cloud agent, and introduced one-click delegation for failing GitHub Actions jobs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Enterprises often need security reviews, safety reviews, and internal approvals before using a new model. LTS models reduce that review burden, while lower-cost model choices let teams separate simple fixes from complex work with different cost structures.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; Also worth tracking: remote control for Copilot CLI sessions is now available across mobile, web, VS Code, and JetBrains. Long-running agent work is becoming an operational flow where people monitor and approve progress across multiple surfaces, not just inside an IDE.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.blog/changelog/2026-05-17-gpt-5-3-codex-is-now-the-base-model-for-copilot-business-and-enterprise/" target="_blank"&gt;Read the base model update&lt;/a&gt;, &lt;a href="https://github.blog/changelog/2026-05-18-copilot-cloud-agent-fast-cost-efficient-models-for-simple-tasks/" target="_blank"&gt;Read the lower-cost model update&lt;/a&gt;, &lt;a href="https://github.blog/changelog/2026-05-18-one-click-fixes-for-failing-actions-with-copilot-cloud-agent/" target="_blank"&gt;Read the Actions fix update&lt;/a&gt;, &lt;a href="https://github.blog/changelog/2026-05-18-remote-control-for-copilot-cli-sessions-now-generally-available-on-mobile-web-and-vs-code/" target="_blank"&gt;Read the Copilot CLI remote control update&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="related-trends"&gt;Related Trends&lt;a class="anchor" href="#related-trends"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="agentmemory-experiments-with-persistent-memory-for-ai-coding-agents"&gt;agentmemory Experiments With Persistent Memory for AI Coding Agents&lt;a class="anchor" href="#agentmemory-experiments-with-persistent-memory-for-ai-coding-agents"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt; agentmemory is an open-source project that lets AI coding agents such as Claude Code, Cursor, Gemini CLI, Codex CLI, Hermes, and OpenClaw share the same memory server. The project says it captures session context through hooks, MCP, and REST APIs, then retrieves prior work using a combination of BM25 search, vector search, and knowledge graphs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it is worth reading&lt;/strong&gt; If agents are going to work on the same codebase over a long period, users cannot keep re-explaining background context every session. Memory can raise productivity, but it also creates risks when outdated information, incorrect reasoning, or sensitive content keeps being reused.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; When adopting agent memory, teams should decide not only what to remember, but what to forget, who can edit it, and which tasks should receive it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.com/rohitg00/agentmemory" target="_blank"&gt;Open the GitHub repository&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="mcp-gateway--registry-highlights-tool-governance"&gt;MCP Gateway &amp;amp; Registry Highlights Tool Governance&lt;a class="anchor" href="#mcp-gateway--registry-highlights-tool-governance"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt; MCP Gateway &amp;amp; Registry is an open-source project that brings access to multiple MCP servers and AI agents behind a single gateway and registry. It aims to manage scattered tool connections through OAuth authentication, dynamic tool discovery, access control, audit logs, and A2A (Agent-to-Agent) communication registration.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it is worth reading&lt;/strong&gt; As MCP adoption grows, per-developer local configuration and scattered API keys quickly become risky. In enterprise settings, teams need to track which tools an agent saw, what permissions it used, and who approved that access.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; Even small teams will feel the need for registries, permission boundaries, and audit logs once their MCP server count grows. Governance should be part of the agent harness structure, not a feature bolted on later.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.com/agentic-community/mcp-gateway-registry" target="_blank"&gt;Open the GitHub repository&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="simon-willison-summarizes-six-months-of-llms-in-five-minutes"&gt;Simon Willison Summarizes Six Months of LLMs in Five Minutes&lt;a class="anchor" href="#simon-willison-summarizes-six-months-of-llms-in-five-minutes"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt; Simon Willison published annotated slides from a PyCon US 2026 lightning talk, summarizing the last six months of LLMs around two themes: coding agents became good enough for real daily work, and open-weight models running on laptops started outperforming expectations. He frames November 2025 as the point where coding agents moved from &amp;ldquo;often works&amp;rdquo; to &amp;ldquo;mostly works.&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it is worth reading&lt;/strong&gt; The post is useful because it focuses on how user expectations changed, not only on individual model announcements. Model rankings keep changing, but the important question is increasingly whether the system can be trusted with everyday work.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Point to watch&lt;/strong&gt; Ted Factory&amp;rsquo;s own harness experiments should be guided by the same question. Over time, model names matter less than task definitions, validation loops, failure recovery, and when the user should intervene.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://simonwillison.net/2026/May/19/5-minute-llms/" target="_blank"&gt;Read the original post&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="youtube-brief"&gt;YouTube Brief&lt;a class="anchor" href="#youtube-brief"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="nvidias-jensen-huang-and-dells-michael-dell-discuss-on-premises-agentic-ai"&gt;NVIDIA&amp;rsquo;s Jensen Huang and Dell&amp;rsquo;s Michael Dell Discuss On-Premises Agentic AI&lt;a class="anchor" href="#nvidias-jensen-huang-and-dells-michael-dell-discuss-on-premises-agentic-ai"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Channel&lt;/strong&gt;: Bloomberg Television&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt; In a Bloomberg interview from Dell World, Jensen Huang and Michael Dell discussed agentic AI, memory demand, and enterprise AI infrastructure. Huang emphasized that intelligence should be produced where context and action happen, and that on-premises agents matter for work involving manufacturing, life sciences, security data, and other internal business context.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it is worth watching&lt;/strong&gt; It provides useful background on why enterprises want to run agents near internal infrastructure, not only in the cloud. That context connects directly to the OpenAI and Dell Codex partnership.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Video&lt;/strong&gt;: &lt;a href="https://www.youtube.com/watch?v=oE5lNDhz9oo" target="_blank"&gt;Watch the video&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>2026-05-22 AI News Brief</title><link>https://tedfactory.com/en/news/ai-news/20260522-ai-brief/</link><pubDate>Fri, 22 May 2026 00:00:00 +0900</pubDate><guid>https://tedfactory.com/en/news/ai-news/20260522-ai-brief/</guid><description>&lt;h1 id="2026-05-22-ai-news-brief"&gt;2026-05-22 AI News Brief&lt;a class="anchor" href="#2026-05-22-ai-news-brief"&gt;#&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;Today we look at notable AI technology news, alongside changes in developer tools, open source, infrastructure, and work practices in the AI era. This brief covers major Google I/O 2026 announcements published from May 19 to 22, plus a few official updates that were not included in the previous brief.&lt;/p&gt;
&lt;h2 id="quick-summary"&gt;Quick Summary&lt;a class="anchor" href="#quick-summary"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Google I/O 2026 expanded Google&amp;rsquo;s agent strategy with Gemini 3.5 Flash, AI Search, Gemini Spark, and Antigravity 2.0 / Managed Agents.&lt;/li&gt;
&lt;li&gt;Gemini Omni is coming to YouTube Shorts, the Gemini app, and Google Flow, while Flow Agent, Gemini for Science, Universal Cart, and expanded SynthID verification were also announced.&lt;/li&gt;
&lt;li&gt;NVIDIA introduced Nemotron 3 Nano Omni, an open multimodal model that handles video, audio, images, and text in one model.&lt;/li&gt;
&lt;li&gt;OpenAI said an internal reasoning model produced a proof disproving a longstanding conjecture in discrete geometry.&lt;/li&gt;
&lt;li&gt;Cursor 3.5, Datasette Agent, and the Open Agent Leaderboard show how agents are connecting to developer environments, data tools, and evaluation systems.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="major-news"&gt;Major News&lt;a class="anchor" href="#major-news"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="google-io-2026-puts-gemini-with-action-at-the-center-with-gemini-35-flash"&gt;Google I/O 2026 Puts &amp;ldquo;Gemini With Action&amp;rdquo; at the Center With Gemini 3.5 Flash&lt;a class="anchor" href="#google-io-2026-puts-gemini-with-action-at-the-center-with-gemini-35-flash"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; At I/O 2026, Google announced the Gemini 3.5 model family and introduced the first model, Gemini 3.5 Flash. Google describes it as &amp;ldquo;frontier intelligence with action&amp;rdquo; and is rolling it out across the Gemini app, Google Search&amp;rsquo;s AI Mode, Google Antigravity, the Gemini API, Google AI Studio, Android Studio, and Gemini Enterprise.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; This shows Google moving the Gemini story beyond chatbot answers toward agent execution, coding, long-horizon tasks, and multimodal interfaces. The important shift is that a Flash model is being positioned not just as a fast helper model, but as the default engine for agentic and coding workflows.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch point&lt;/strong&gt; The practical value of Gemini 3.5 Flash will depend less on benchmark numbers and more on how reliably it performs long tasks inside harnesses such as Antigravity, Search, and the Gemini app.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/" target="_blank"&gt;Gemini 3.5 announcement&lt;/a&gt;, &lt;a href="https://blog.google/innovation-and-ai/technology/ai/google-io-2026-all-our-announcements/" target="_blank"&gt;I/O 2026 summary&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="google-search-gets-its-biggest-search-box-upgrade-in-25-years-and-adds-information-agents"&gt;Google Search Gets Its Biggest Search Box Upgrade in 25 Years and Adds Information Agents&lt;a class="anchor" href="#google-search-gets-its-biggest-search-box-upgrade-in-25-years-and-adds-information-agents"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; Google is making Gemini 3.5 Flash the default model for AI Mode in Search and redesigning the Search box around AI. The new Search box can take text, images, files, videos, and Chrome tabs as inputs, while AI Overviews can flow into follow-up conversations in AI Mode.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Search is moving from a place where people find information into an agent platform that can monitor topics and synthesize updates over time. Google says information agents can watch the web, news, blogs, social posts, finance, shopping, and sports data for changes related to a user&amp;rsquo;s question.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch point&lt;/strong&gt; If Antigravity-powered generative UI and mini-app creation reach Search, the search results page starts looking less like a list of links and more like a runtime that creates custom interfaces for each task.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://blog.google/products-and-platforms/products/search/search-io-2026/" target="_blank"&gt;Google Search announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="gemini-spark-and-daily-brief-move-personal-assistants-into-background-agents"&gt;Gemini Spark and Daily Brief Move Personal Assistants Into Background Agents&lt;a class="anchor" href="#gemini-spark-and-daily-brief-move-personal-assistants-into-background-agents"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; Google said the Gemini app now serves more than 900 million monthly users and introduced Gemini Spark and Daily Brief. Gemini Spark is a 24/7 personal agent powered by Gemini 3.5 and the Antigravity harness, integrated with Google Workspace tools such as Gmail, Docs, and Slides, and able to keep working in the cloud even when a device is closed or locked.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Personal AI assistants are shifting from apps that answer questions into systems that monitor and execute recurring tasks with user permission. For actions such as sending email, booking, or spending money, approval design and auditability become central product requirements.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch point&lt;/strong&gt; For Spark to work well, model quality may matter less than permission boundaries, understandable task status, interruption controls, approval flows, and rollback experiences.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://blog.google/innovation-and-ai/products/gemini-app/next-evolution-gemini-app/" target="_blank"&gt;Gemini app update&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="google-antigravity-20-and-managed-agents-expand-googles-developer-agent-platform"&gt;Google Antigravity 2.0 and Managed Agents Expand Google&amp;rsquo;s Developer Agent Platform&lt;a class="anchor" href="#google-antigravity-20-and-managed-agents-expand-googles-developer-agent-platform"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; Google announced the Antigravity 2.0 desktop app, Antigravity CLI, Antigravity SDK, and Managed Agents in the Gemini API. Managed Agents let developers start an agent with a single API call inside an isolated Linux environment that can use tools, execute code, manage files, and browse the web.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; As Cursor, Codex, and Claude Code have shown, developer tool competition is moving from model calls into harnesses, sandboxes, asynchronous work, subagents, skills, and deployment environments. Google is positioning Antigravity as an agent-first development platform optimized with Gemini models.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch point&lt;/strong&gt; Antigravity SDK and Managed Agents connect directly to Ted Factory&amp;rsquo;s harness experiments. The question is not only whether a model writes good code, but how the product packages environment, permissions, verification, and cost tracing.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://blog.google/innovation-and-ai/technology/developers-tools/google-io-2026-developer-highlights/" target="_blank"&gt;developer announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="nvidia-introduces-nemotron-3-nano-omni-as-a-perception-layer-for-multimodal-agents"&gt;NVIDIA Introduces Nemotron 3 Nano Omni as a Perception Layer for Multimodal Agents&lt;a class="anchor" href="#nvidia-introduces-nemotron-3-nano-omni-as-a-perception-layer-for-multimodal-agents"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; NVIDIA introduced Nemotron 3 Nano Omni, an open multimodal model that processes video, audio, images, and text together. It uses a 30B-A3B hybrid MoE (Mixture of Experts) architecture, and NVIDIA says it can deliver up to 9x higher throughput than pipelines that stitch together separate vision and speech models.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; More agents now need to look at screens, listen to recordings, and read documents and charts at the same time. Splitting those tasks across separate models increases latency, cost, and context loss; Nemotron 3 Nano Omni tries to collapse that perception layer into one model.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch point&lt;/strong&gt; From the author&amp;rsquo;s perspective, multimodal models may reach production faster as &amp;ldquo;subagents that read screens, documents, and audio&amp;rdquo; than as final answer models.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://blogs.nvidia.com/blog/nemotron-3-nano-omni-multimodal-ai-agents/" target="_blank"&gt;NVIDIA announcement&lt;/a&gt;, &lt;a href="https://developer.nvidia.com/blog/nvidia-nemotron-3-nano-omni-powers-multimodal-agent-reasoning-in-a-single-efficient-open-model/" target="_blank"&gt;technical blog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="openai-model-disproves-a-longstanding-unit-distance-conjecture-in-discrete-geometry"&gt;OpenAI Model Disproves a Longstanding Unit Distance Conjecture in Discrete Geometry&lt;a class="anchor" href="#openai-model-disproves-a-longstanding-unit-distance-conjecture-in-discrete-geometry"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; OpenAI said an internal general-purpose reasoning model produced a proof that disproves a central conjecture related to Paul Erdős&amp;rsquo;s 1946 planar unit distance problem. The problem asks how many pairs of points in the plane can be exactly one unit apart, and OpenAI says the model found an infinite family of constructions that break the long-held belief that grid-like constructions were essentially optimal.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; The headline is not just &amp;ldquo;AI solved a math problem.&amp;rdquo; The more important point is that a general-purpose reasoning model, rather than a problem-specific search system, produced the proof idea and external mathematicians reviewed it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch point&lt;/strong&gt; The value of research AI will grow around its ability to sustain long verifiable reasoning and suggest connections between fields that humans may not have prioritized.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://openai.com/index/model-disproves-discrete-geometry-conjecture/" target="_blank"&gt;OpenAI announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="cursor-35-integrates-automations-into-the-agents-window"&gt;Cursor 3.5 Integrates Automations Into the Agents Window&lt;a class="anchor" href="#cursor-35-integrates-automations-into-the-agents-window"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; Cursor 3.5 now lets users create and manage Cursor Automations inside the Agents Window. Automations can attach multiple repositories, or run with no repository at all for recurring workflows such as Slack digests, product analytics, FAQ responses, billing metrics, and customer health monitoring.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Coding agents are expanding beyond work inside a single repository into operational automations that span codebases and work tools. No-repo automations are especially interesting because they move agents from &amp;ldquo;code writers&amp;rdquo; toward &amp;ldquo;operators that monitor and summarize signals.&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch point&lt;/strong&gt; Before adopting automations, teams should define triggers, permissions, reviewers, and failure-notification paths as clearly as execution cost.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://cursor.com/changelog/05-20-26" target="_blank"&gt;Cursor Changelog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="youtube-announces-ask-youtube-and-gemini-omni-remix"&gt;YouTube Announces Ask YouTube and Gemini Omni Remix&lt;a class="anchor" href="#youtube-announces-ask-youtube-and-gemini-omni-remix"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; At Google I/O 2026, YouTube announced Ask YouTube and Gemini Omni-powered Shorts Remix. Ask YouTube is a conversational search experience for complex questions and follow-ups, while Gemini Omni Remix lets users transform eligible Shorts with prompts and images while preserving the original video&amp;rsquo;s context.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Search is moving from keywords toward conversational exploration, and video creation is moving toward context-aware editing of existing content rather than only generating new clips from scratch. YouTube also highlighted digital watermarks, identifying metadata, links back to source videos, creator opt-out controls, and expanded likeness detection.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch point&lt;/strong&gt; The first broad use case for generative video may be less about creating cinematic clips from nothing and more about editing existing content with source links and controls intact.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://blog.youtube/news-and-events/youtube-news-google-io-2026/" target="_blank"&gt;YouTube Blog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="worth-watching"&gt;Worth Watching&lt;a class="anchor" href="#worth-watching"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="gemini-for-science-moves-research-workflows-into-agent-harnesses"&gt;Gemini for Science Moves Research Workflows Into Agent Harnesses&lt;a class="anchor" href="#gemini-for-science-moves-research-workflows-into-agent-harnesses"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt; Google announced Gemini for Science, including three experimental tools: Hypothesis Generation, Computational Discovery, and Literature Insights. It also introduced Science Skills, which connect more than 30 life science databases and tools, including UniProt, AlphaFold Database, AlphaGenome API, and InterPro, to agent platforms such as Antigravity.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it is worth reading&lt;/strong&gt; If OpenAI&amp;rsquo;s math result shows that models can contribute research ideas, Gemini for Science shows a product approach to connecting research workflows, data sources, and agent harnesses.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch point&lt;/strong&gt; Scientific agents need sources, reproducibility, and verifiable intermediate outputs more than persuasive final prose. The Literature Insights pattern of structured tables and citations is worth watching for other knowledge-work tools.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://blog.google/innovation-and-ai/technology/research/gemini-for-science-io-2026/" target="_blank"&gt;Gemini for Science&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="google-flow-agent-and-universal-cart-bring-agent-patterns-to-creation-and-shopping"&gt;Google Flow Agent and Universal Cart Bring Agent Patterns to Creation and Shopping&lt;a class="anchor" href="#google-flow-agent-and-universal-cart-bring-agent-patterns-to-creation-and-shopping"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt; Google announced Flow Agent, Flow Tools, Flow Music updates, and Gemini Omni integration for Google Flow. Flow Agent helps with brainstorming, dialogue review, variation generation, batch edits, and asset organization. Separately, Universal Cart creates an intelligent cart across Search, Gemini, YouTube, and Gmail that can reason about product compatibility, pricing, and payment benefits.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it is worth reading&lt;/strong&gt; Agent patterns are spreading beyond developer tools into creative tools and shopping flows. Universal Cart is especially notable because AI moves beyond recommendations and closer to purchase decisions and checkout.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch point&lt;/strong&gt; Creation and shopping agents make work easier, but they also raise operational questions around copyright, source attribution, payment authorization, and accountability.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://blog.google/innovation-and-ai/models-and-research/google-labs/flow-updates/" target="_blank"&gt;Google Flow updates&lt;/a&gt;, &lt;a href="https://blog.google/products-and-platforms/products/shopping/google-shopping-cart/" target="_blank"&gt;Universal Cart&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="expanded-synthid-and-c2pa-support-strengthen-ai-content-provenance"&gt;Expanded SynthID and C2PA Support Strengthen AI Content Provenance&lt;a class="anchor" href="#expanded-synthid-and-c2pa-support-strengthen-ai-content-provenance"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt; In its I/O 2026 summary, Google said it is expanding SynthID verification from the Gemini app into Search and Chrome. It is also adding C2PA Content Credentials to the Gemini app, with Search and Chrome support planned later.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it is worth reading&lt;/strong&gt; As generative AI spreads into search, video, image editing, shopping, and work documents, users need better ways to understand how content was created. Watermarking and content credentials are not perfect, but they are part of the trust infrastructure platforms now need.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch point&lt;/strong&gt; For blogs and news briefs, clearer habits around source links, AI-generated media disclosure, and edit history will become more important as generated images and videos become more common.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://blog.google/innovation-and-ai/technology/ai/google-io-2026-all-our-announcements/" target="_blank"&gt;I/O 2026 summary&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="datasette-agent-brings-a-conversational-open-source-agent-to-sqlite-data"&gt;Datasette Agent Brings a Conversational Open Source Agent to SQLite Data&lt;a class="anchor" href="#datasette-agent-brings-a-conversational-open-source-agent-to-sqlite-data"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt; Datasette released Datasette Agent, an open source plugin for exploring SQLite data through conversation. It connects the LLM Python library with Datasette so users can ask questions in natural language, generate SQL, and extend the agent with plugins for charts, image generation, and Fly Sprites sandbox execution.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it is worth reading&lt;/strong&gt; Agent products do not evolve only as giant general-purpose assistants. A small conversational layer attached to an existing data tool, with plugins for extra tools, can be just as powerful.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch point&lt;/strong&gt; For personal knowledge bases or blog analytics tools, a small and verifiable data interface like Datasette Agent may be a faster starting point than a large agent platform.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://datasette.io/blog/2026/datasette-agent/" target="_blank"&gt;Datasette announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="open-agent-leaderboard-evaluates-full-agent-systems-not-just-models"&gt;Open Agent Leaderboard Evaluates Full Agent Systems, Not Just Models&lt;a class="anchor" href="#open-agent-leaderboard-evaluates-full-agent-systems-not-just-models"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt; IBM Research&amp;rsquo;s Open Agent Leaderboard on Hugging Face evaluates full systems that pair a model with an agent implementation, rather than only reporting model scores. It unifies benchmarks such as SWE-Bench Verified, BrowseComp+, AppWorld, and tau2-Bench under a common protocol, and reports success rates, cost per task, and failure cost.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it is worth reading&lt;/strong&gt; The same model can behave very differently depending on tool selection, planning, memory, and error recovery. In production, &amp;ldquo;how expensively does it fail?&amp;rdquo; can matter more than the top-line score.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch point&lt;/strong&gt; Ted Factory&amp;rsquo;s harness experiments should compare not only model names, but also task definitions, tool constraints, verification logs, and cost traces.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://huggingface.co/blog/ibm-research/open-agent-leaderboard" target="_blank"&gt;Hugging Face article&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="youtube-brief"&gt;YouTube Brief&lt;a class="anchor" href="#youtube-brief"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="datasette-agent-demo"&gt;Datasette Agent Demo&lt;a class="anchor" href="#datasette-agent-demo"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Channel&lt;/strong&gt;: Datasette / Simon Willison&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt; The demo video linked from the Datasette Agent announcement shows a user asking natural language questions of SQLite data while the agent generates SQL and returns results. According to the announcement post, the demo runs against the live &lt;code&gt;agent.datasette.io&lt;/code&gt; instance using example databases and Gemini 3.1 Flash-Lite.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why watch it&lt;/strong&gt; It is a quick way to see what the user experience looks like when an agent interface is added to a small data tool.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Video&lt;/strong&gt;: &lt;a href="https://www.youtube.com/watch?v=AFZKp6hbFjI" target="_blank"&gt;Watch video&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="the-most-important-ai-news-from-google-io"&gt;The Most Important AI News from Google I/O&lt;a class="anchor" href="#the-most-important-ai-news-from-google-io"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Channel&lt;/strong&gt;: The AI Daily Brief: Artificial Intelligence News&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt; This episode explains Google I/O announcements around Omni, Gemini 3.5 Flash, Antigravity 2.0, and Gemini Spark. It also discusses Google&amp;rsquo;s distribution advantage across consumer products and the confusion that can come from having many overlapping AI product names and interfaces.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why watch it&lt;/strong&gt; It is useful for placing YouTube&amp;rsquo;s Ask YouTube and Gemini Omni Remix announcements inside Google&amp;rsquo;s broader AI strategy.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Video&lt;/strong&gt;: &lt;a href="https://www.youtube.com/watch?v=G2HX30CelNk" target="_blank"&gt;Watch video&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>2026-05-27 AI News Brief</title><link>https://tedfactory.com/en/news/ai-news/20260527-ai-brief/</link><pubDate>Wed, 27 May 2026 00:00:00 +0900</pubDate><guid>https://tedfactory.com/en/news/ai-news/20260527-ai-brief/</guid><description>&lt;h1 id="2026-05-27-ai-news-brief"&gt;2026-05-27 AI News Brief&lt;a class="anchor" href="#2026-05-27-ai-news-brief"&gt;#&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;Today we look at notable AI technology news, alongside changes in developer tools, open source, infrastructure, and work practices in the AI era. This brief focuses on official announcements and community signals published from May 23 to 27. Recent video candidates were also checked, but no suitable recent item had enough verified transcript, description, and primary-source context, so this brief skips the YouTube section.&lt;/p&gt;
&lt;h2 id="quick-summary"&gt;Quick Summary&lt;a class="anchor" href="#quick-summary"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Microsoft Copilot Studio made computer-using agents generally available, bringing UI automation to business systems without APIs.&lt;/li&gt;
&lt;li&gt;GitHub Copilot added organization-targeted model rules and stronger Copilot Memory controls, strengthening the governance layer for agents.&lt;/li&gt;
&lt;li&gt;NVIDIA is pushing agent security runtimes, OpenClaw, and AI factory infrastructure through OpenShell and GTC Taipei updates.&lt;/li&gt;
&lt;li&gt;Anthropic appointed a Korea representative ahead of its Seoul office opening and named Korea as one of Claude&amp;rsquo;s most active markets.&lt;/li&gt;
&lt;li&gt;Forge, llama.cpp, and OpenClaw updates show that harness design and isolation matter even for small local models and local agents.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="major-news"&gt;Major News&lt;a class="anchor" href="#major-news"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="microsoft-copilot-studio-makes-computer-using-agents-generally-available"&gt;Microsoft Copilot Studio Makes Computer-Using Agents Generally Available&lt;a class="anchor" href="#microsoft-copilot-studio-makes-computer-using-agents-generally-available"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; Microsoft made computer-using agents generally available in Copilot Studio. These agents can look at and interact with websites and desktop applications through the user interface, so older business systems and tools without APIs can become automation targets.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Enterprise automation works well when APIs and structured workflows exist, but real work often still depends on changing screens, legacy apps, and exceptions. When computer-using agents are combined with workflows, approvals, business logic, remote MCP (Model Context Protocol) servers, and agent-to-agent (A2A) communication, the product starts looking less like a chatbot and more like an execution platform.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch point&lt;/strong&gt; The important question is not only model quality. It is whether the product handles credentials, audit logs, human approval, and failure states clearly enough for real operations.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://www.microsoft.com/en-us/microsoft-copilot/blog/copilot-studio/new-and-improved-computer-using-agents-a-new-workflows-experience-and-real-time-voice-experiences/" target="_blank"&gt;Microsoft Copilot Blog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="github-copilot-adds-organization-level-model-rules-and-stronger-memory-controls"&gt;GitHub Copilot Adds Organization-Level Model Rules and Stronger Memory Controls&lt;a class="anchor" href="#github-copilot-adds-organization-level-model-rules-and-stronger-memory-controls"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; GitHub introduced targeted model rules in public preview for Copilot Business and Copilot Enterprise, allowing enterprise owners to control which Copilot models are available to specific organizations. GitHub also updated the Copilot Memory documentation to cover viewing and deleting repository-level facts and user preferences, Copilot CLI usage, and the 28-day automatic deletion policy.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Once agents use multiple models and persistent memory, &amp;ldquo;which model can this team use?&amp;rdquo; and &amp;ldquo;which memories influence the agent?&amp;rdquo; become operational risks. Model choice and memory are convenience features, but in enterprise settings they also affect cost, compliance, privacy, and the spread of stale context.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch point&lt;/strong&gt; Agent memory is powerful, but a wrong memory can quietly damage productivity. Teams should define scope, retention, deletion rights, and auditability before enabling it broadly.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.blog/changelog/2026-05-26-target-copilot-models-to-organizations-with-model-rules/" target="_blank"&gt;GitHub model rules&lt;/a&gt;, &lt;a href="https://docs.github.com/en/copilot/how-tos/use-copilot-agents/copilot-memory" target="_blank"&gt;Copilot Memory docs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="nvidia-openshell-moves-agent-security-from-prompts-into-the-runtime"&gt;NVIDIA OpenShell Moves Agent Security From Prompts Into the Runtime&lt;a class="anchor" href="#nvidia-openshell-moves-agent-security-from-prompts-into-the-runtime"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; NVIDIA described OpenShell as an open source secure runtime for autonomous agents. It runs each agent inside a sandbox and enforces file access, networking, credentials, and policy at a system layer outside the agent.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; As agents read files, run code, and connect to external services, telling a model to &amp;ldquo;be careful&amp;rdquo; in a prompt is not enough. OpenShell points toward a browser-tab-like model: isolate sessions, enforce policy in the runtime, and prevent the agent from overriding the controls meant to contain it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch point&lt;/strong&gt; For Ted Factory&amp;rsquo;s harness experiments, tool permissions should be runtime invariants rather than prompt instructions. Local files, secrets, and external network access should default to denied, with only the required scope opened.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://blogs.nvidia.com/blog/secure-autonomous-ai-agents-openshell/" target="_blank"&gt;NVIDIA OpenShell article&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="nvidia-gtc-taipei-preview-emphasizes-agents-and-physical-ai-infrastructure"&gt;NVIDIA GTC Taipei Preview Emphasizes Agents and Physical AI Infrastructure&lt;a class="anchor" href="#nvidia-gtc-taipei-preview-emphasizes-agents-and-physical-ai-infrastructure"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; NVIDIA began posting live updates from GTC Taipei at COMPUTEX 2026, including a Meet-a-Claw event with demos around OpenClaw and OpenShell-secured autonomous agents. NVIDIA also noted COMPUTEX 2026 Best Choice Awards for Vera Rubin NVL72, Jetson Thor, and Alpamayo, while revealing plans for a new Taipei research and development campus.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; NVIDIA&amp;rsquo;s message now extends beyond GPUs into the full AI factory stack: CPUs, networking, DPUs, sandboxes, robotics, and manufacturing. Long-running agents need not only model inference, but also infrastructure for tool calls, file work, code execution, simulation, and security isolation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch point&lt;/strong&gt; Developers should evaluate not only which model to use, but where that model can run safely and what cost structure supports long-running work.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://blogs.nvidia.com/blog/nvidia-gtc-taipei-computex-2026-news/" target="_blank"&gt;NVIDIA GTC Taipei updates&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="anthropic-appoints-korea-representative-ahead-of-seoul-office-opening"&gt;Anthropic Appoints Korea Representative Ahead of Seoul Office Opening&lt;a class="anchor" href="#anthropic-appoints-korea-representative-ahead-of-seoul-office-opening"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; Anthropic appointed KiYoung Choi, formerly General Manager for Korea at Snowflake, as Representative Director of Korea ahead of opening a Seoul office. Anthropic said Korea is one of the most active Claude.ai markets, with usage more than 3.5 times what would be expected from population size and skewed heavily toward technical and creative work.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Korea is a market where semiconductors, telecom, games, content, and legal / financial automation meet quickly. By naming SK Telecom and Law&amp;amp;Company as Claude users, Anthropic is signaling enterprise and professional workflows rather than only consumer chat.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch point&lt;/strong&gt; Korean companies will likely compare Claude, OpenAI, Gemini, and Copilot more actively. Data boundaries, internal system integration, and responsible deployment policies may matter as much as model scores.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://www.anthropic.com/news/kiyoung-choi-representative-director-anthropic-korea" target="_blank"&gt;Anthropic announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="openai-signs-content-partnership-with-brazils-folha-and-uol"&gt;OpenAI Signs Content Partnership With Brazil&amp;rsquo;s Folha and UOL&lt;a class="anchor" href="#openai-signs-content-partnership-with-brazils-folha-and-uol"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; Folha de S.Paulo and UOL signed Brazil&amp;rsquo;s first commercial content agreement with OpenAI. The media groups will provide real-time news to the ChatGPT ecosystem so users can receive more current answers grounded in original reporting and source links.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; As generative AI services absorb more news and search behavior, compensation for journalism, attribution, and real-time information quality become central issues. The agreement also ends a 2025 lawsuit from Folha over unauthorized and unpaid use of its content.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch point&lt;/strong&gt; For blog publishing, source links matter more, not less. Even when AI summaries are useful, readers need a clear path back to the original reporting.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://www1.folha.uol.com.br/internacional/en/business/2026/05/folha-and-uol-sign-brazils-first-openai-deal-to-supply-content-to-chatgpt.shtml" target="_blank"&gt;Folha report&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="worth-watching"&gt;Worth Watching&lt;a class="anchor" href="#worth-watching"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="forge-argues-that-small-local-models-need-better-harnesses-not-only-bigger-weights"&gt;Forge Argues That Small Local Models Need Better Harnesses, Not Only Bigger Weights&lt;a class="anchor" href="#forge-argues-that-small-local-models-need-better-harnesses-not-only-bigger-weights"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt; Forge is an open source reliability layer for self-hosted LLM tool-calling. It uses retry nudges, step enforcement, error recovery, and VRAM-aware context management to improve multi-step agent workflows for small local models.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it is worth reading&lt;/strong&gt; The project asks a useful question: not &amp;ldquo;is the model smart enough?&amp;rdquo; but &amp;ldquo;does the system retry well, treat bad tool results as errors, and compact context safely?&amp;rdquo; That connects directly to the growing importance of harness engineering.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch point&lt;/strong&gt; When building local agents, it may be faster to define a small task suite and evaluation harness first, then improve error recovery and logs before swapping models.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.com/antoinezambelli/forge" target="_blank"&gt;Forge repository&lt;/a&gt;, &lt;a href="https://news.ycombinator.com/item?id=48192383" target="_blank"&gt;Hacker News discussion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="llamacpp-built-in-tools-show-both-the-convenience-and-risk-of-local-agents"&gt;llama.cpp Built-In Tools Show Both the Convenience and Risk of Local Agents&lt;a class="anchor" href="#llamacpp-built-in-tools-show-both-the-convenience-and-risk-of-local-agents"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt; &lt;code&gt;llama-server&lt;/code&gt; in llama.cpp now documents an experimental &lt;code&gt;--tools&lt;/code&gt; option for enabling built-in tools such as &lt;code&gt;read_file&lt;/code&gt;, &lt;code&gt;write_file&lt;/code&gt;, &lt;code&gt;edit_file&lt;/code&gt;, &lt;code&gt;exec_shell_command&lt;/code&gt;, &lt;code&gt;grep_search&lt;/code&gt;, and &lt;code&gt;apply_diff&lt;/code&gt;. With &lt;code&gt;--tools all&lt;/code&gt;, a local GGUF model can get close to a file-and-shell agent without a separate MCP server.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it is worth reading&lt;/strong&gt; The barrier to running local agents is falling, but direct host execution is a serious security concern. The official README explicitly warns not to enable the feature in untrusted environments.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch point&lt;/strong&gt; Even in a local development environment, file-write and shell-execution tools should not be enabled without sandboxing, permission checks, and working-directory limits.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.com/ggml-org/llama.cpp/blob/master/tools/server/README.md" target="_blank"&gt;llama.cpp server README&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="openclaw-2026524-beta-adds-agent-diagnostics-and-sandbox-hardening"&gt;OpenClaw 2026.5.24 Beta Adds Agent Diagnostics and Sandbox Hardening&lt;a class="anchor" href="#openclaw-2026524-beta-adds-agent-diagnostics-and-sandbox-hardening"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt; OpenClaw 2026.5.24 beta adds bounded skill usage metrics and spans, tool source / owner labels, Chrome DevTools MCP usage statistics disabled by default, and read-only skill mounts for remote container working-directory operations. It also avoids exposing raw paths or session identifiers in diagnostic output.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it is worth reading&lt;/strong&gt; As long-running agents become common, observability and sandbox policy become part of product quality. If teams cannot tell which tool ran when, or if browser sessions and skill directories are too open, even small experiments can become operational risks.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch point&lt;/strong&gt; When evaluating agent products, release notes should be checked for tool provenance, execution scope, remote session behavior, and telemetry defaults, not just model features.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.com/openclaw/openclaw/releases/tag/v2026.5.24-beta.1" target="_blank"&gt;OpenClaw release&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>2026-05-30 AI News Brief</title><link>https://tedfactory.com/en/news/ai-news/20260530-ai-brief/</link><pubDate>Sat, 30 May 2026 00:00:00 +0900</pubDate><guid>https://tedfactory.com/en/news/ai-news/20260530-ai-brief/</guid><description>&lt;h1 id="2026-05-30-ai-news-brief"&gt;2026-05-30 AI News Brief&lt;a class="anchor" href="#2026-05-30-ai-news-brief"&gt;#&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;A roundup of AI technology news worth checking today, along with shifts in developer tools, open source, infrastructure, and organizations in the AI era. This brief focuses on official announcements and community signals published from May 28 to May 30.&lt;/p&gt;
&lt;h2 id="quick-summary"&gt;Quick Summary&lt;a class="anchor" href="#quick-summary"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Anthropic released Claude Opus 4.8 with effort control, dynamic workflows, and improved honesty.&lt;/li&gt;
&lt;li&gt;GitHub Copilot made Claude Opus 4.8 generally available while signaling a switch to Usage Based Billing on June 1.&lt;/li&gt;
&lt;li&gt;Cursor 3.6 introduced an Auto-review run mode that combines a classifier subagent with sandboxing to work longer with fewer approvals.&lt;/li&gt;
&lt;li&gt;Google released Gemini Embedding 2, mapping text, image, video, audio, and documents into one space to simplify multimodal search and RAG.&lt;/li&gt;
&lt;li&gt;Hexo Labs open-sourced SIA, a self-improving agent that edits both the harness and the model weights.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="top-news"&gt;Top News&lt;a class="anchor" href="#top-news"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="anthropic-releases-claude-opus-48"&gt;Anthropic releases Claude Opus 4.8&lt;a class="anchor" href="#anthropic-releases-claude-opus-48"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; On May 28, Anthropic released Claude Opus 4.8. It improves on Opus 4.7 across coding and agentic benchmarks while keeping the same price: $5 per million input tokens and $25 per million output tokens. A new effort control lets you choose how hard Claude thinks on a task—and how many tokens it spends—across Low / Medium / High / Max. Claude Code adds dynamic workflows as a research preview, letting Claude spin up hundreds of parallel subagents in a single session to tackle large tasks and verify the results.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters?&lt;/strong&gt; The detail this writer finds most notable is honesty rather than raw performance. Anthropic says Opus 4.8 is less likely to &amp;ldquo;confidently claim progress on thin evidence&amp;rdquo; and is roughly 4x less likely to let flaws in its own code pass unremarked. As agents run autonomously for longer, a &amp;ldquo;plausible but wrong report&amp;rdquo; becomes the most expensive failure, so a model that flags its own uncertainty directly helps operational trust.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; Dynamic workflows store orchestration logic in standalone scripts instead of the LLM context window, with checkpointing and resume. When attempting long tasks like large-scale migrations, don&amp;rsquo;t just look at model performance—design how the work is split and where verification loops sit.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://www.anthropic.com/news/claude-opus-4-8" target="_blank"&gt;Read the Anthropic announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="github-copilot-makes-claude-opus-48-ga-and-signals-usage-based-billing"&gt;GitHub Copilot makes Claude Opus 4.8 GA and signals usage-based billing&lt;a class="anchor" href="#github-copilot-makes-claude-opus-48-ga-and-signals-usage-based-billing"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; On May 28, GitHub announced that Claude Opus 4.8 is generally available in GitHub Copilot. Copilot Pro+ / Business / Enterprise users can pick it in the model picker across VS Code, Visual Studio, Copilot CLI, the cloud agent, JetBrains, Xcode, and more. The model launches with a 15x premium request multiplier until Usage Based Billing begins on June 1. Enterprise and Business admins must enable the Opus 4.8 policy in settings.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters?&lt;/strong&gt; Even for the same model, where and how it&amp;rsquo;s billed drives the real cost. The 15x multiplier and the June 1 billing switch are a signal that leaving a high-performance model on by default can run up costs quickly. The shift from per-seat flat pricing to usage-based billing is accelerating across developer tools.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; Before turning Opus 4.8 on for a team, it helps to decide which tasks deserve the high-performance model and which everyday completions can use a lighter one.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.blog/changelog/2026-05-28-claude-opus-4-8-is-generally-available-for-github-copilot/" target="_blank"&gt;Read the GitHub Changelog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="cursor-36-adds-an-auto-review-run-mode"&gt;Cursor 3.6 adds an Auto-review run mode&lt;a class="anchor" href="#cursor-36-adds-an-auto-review-run-mode"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; On May 29, Cursor 3.6 introduced a new run mode called Auto-review. It applies to Shell, MCP, and Fetch tool calls. Allowlisted calls run immediately, calls that can be sandboxed run in the sandbox, and every other agent action goes to a classifier subagent that decides whether to allow the call, try a different approach, or ask for your approval.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters?&lt;/strong&gt; To let agents run autonomously for longer, you need to cut the friction of constant approvals—without letting risky commands run unchecked. Auto-review tries to strike that balance with execution-level safeguards (allowlist + sandbox + classifier) instead of merely telling the model to &amp;ldquo;be careful&amp;rdquo; in a prompt.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; In Ted Factory&amp;rsquo;s harness experiments, tool permissions are more robust as rules of the execution environment than as model prompts. You can give the classifier custom instructions, so it helps to spell out criteria for risky working directories or network calls.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://cursor.com/changelog" target="_blank"&gt;Read the Cursor Changelog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="google-releases-the-multimodal-embedding-model-gemini-embedding-2"&gt;Google releases the multimodal embedding model Gemini Embedding 2&lt;a class="anchor" href="#google-releases-the-multimodal-embedding-model-gemini-embedding-2"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; On May 29, Google released Gemini Embedding 2. An embedding turns data like text or images into numeric vectors that are easy to search and compare, and Google describes Gemini Embedding 2 as its first embedding model to map text, image, video, audio, and documents into a single semantic space. It&amp;rsquo;s available via the Gemini API and Vertex AI and supports over 100 languages.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters?&lt;/strong&gt; Until now, multimodal search meant building separate text and image embeddings and stitching together complex pipelines. When one model maps multiple formats into the same space, building RAG (Retrieval-Augmented Generation) or multimodal search becomes simpler, and agents can cross-reference documents, video, and code more easily.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; When building a personal knowledge base or blog search, it&amp;rsquo;s worth checking whether you can merge separate text and image indexes into one. That said, the balance between output dimensions (3,072 by default) and storage cost is best tested directly.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-embedding-2/" target="_blank"&gt;Read the Google announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="github-copilot-usage-metrics-api-adds-ai-adoption-cohorts"&gt;GitHub Copilot usage metrics API adds AI adoption cohorts&lt;a class="anchor" href="#github-copilot-usage-metrics-api-adds-ai-adoption-cohorts"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; On May 29, GitHub added AI adoption phase classification to the Copilot usage metrics API. Based on which Copilot surfaces a user touched over a rolling 28-day window, each engaged user is sorted into one of four phases: Code first (code completion / IDE agent), Agent first (a single agent surface), Multi-agent (two or more agent surfaces or the new Copilot app), and Phase 0 for users who don&amp;rsquo;t meet the criteria.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters?&lt;/strong&gt; &amp;ldquo;How people use Copilot&amp;rdquo; reveals an organization&amp;rsquo;s AI maturity better than &amp;ldquo;how many people use it.&amp;rdquo; A team stuck on autocomplete and a team chaining multiple agents have different productivity and risk profiles. Cohort metrics like these give a basis for measuring adoption impact and deciding where to invest in training and governance.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; When handling adoption metrics, it&amp;rsquo;s better not to equate usage directly with outcomes. They only become meaningful alongside result metrics like per-phase code acceptance rates and time-to-merge.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.blog/changelog/2026-05-29-copilot-usage-metrics-api-adds-cohorts-for-ai-adoption/" target="_blank"&gt;Read the GitHub Changelog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="threads-to-watch"&gt;Threads to Watch&lt;a class="anchor" href="#threads-to-watch"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="hexo-labs-sia-an-open-source-self-improving-agent-that-edits-both-harness-and-weights"&gt;Hexo Labs SIA, an open-source self-improving agent that edits both harness and weights&lt;a class="anchor" href="#hexo-labs-sia-an-open-source-self-improving-agent-that-edits-both-harness-and-weights"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;The gist&lt;/strong&gt; On May 28, Hexo Labs open-sourced SIA (Self-Improving AI) under an MIT license. Most agents stop improving once a human stops tuning them, but SIA edits both the agent&amp;rsquo;s harness (system prompts / tool dispatch / retry policy) and the model weights (via LoRA, a low-rank adapter) inside a single self-improving loop. A Feedback-Agent reads the full trajectory of each run and, based on observed rewards, chooses whether to rewrite the harness or update the weights. The base model is gpt-oss-120b, with the Meta-Agent and Feedback-Agent running on Claude Sonnet 4.6.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it&amp;rsquo;s worth a look&lt;/strong&gt; It captures the shift from &amp;ldquo;is the model smart enough?&amp;rdquo; to &amp;ldquo;how do we evolve the harness and the learning loop around the model together?&amp;rdquo; The authors&amp;rsquo; distinction is especially interesting: harness edits add software-engineering hygiene, while weight updates surface domain knowledge no prompt can reach.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; Rather than marketing lines like &amp;ldquo;350x acceleration,&amp;rdquo; look at how they separately measure harness changes and weight changes—that comparison gives a better sense of what the self-improving loop actually does.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.com/hexo-ai/sia" target="_blank"&gt;View the SIA repository&lt;/a&gt;, &lt;a href="https://arxiv.org/abs/2605.27276" target="_blank"&gt;Read the paper&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="the-missing-quality-layer-for-ai-coding-agents"&gt;The missing quality layer for AI coding agents&lt;a class="anchor" href="#the-missing-quality-layer-for-ai-coding-agents"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;The gist&lt;/strong&gt; A post from Generative Programmer argues that teams are moving past the first-order question of &amp;ldquo;can a coding agent write code?&amp;rdquo; to &amp;ldquo;what has to exist around the agent before we can trust the code it merges?&amp;rdquo; The author proposes a quality layer that sits between the agent and the pull request, with five controls: fast feedback, semantic evals, refactor boundaries, provenance tracking, and an agent-surface inventory of what the agent touched.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it&amp;rsquo;s worth a look&lt;/strong&gt; Agents make first drafts cheap, but trust still comes from engineering controls. By focusing on &amp;ldquo;how do you verify, and how do you prove where things came from?&amp;rdquo; rather than model bragging, it offers a perspective you can apply to real-world decisions independent of big-tech launches.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; If your team has started using agents, it&amp;rsquo;s worth starting with two of the five controls, fast feedback and provenance tracking, then layering on the rest.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://generativeprogrammer.com/p/the-missing-quality-layer-for-ai" target="_blank"&gt;Read the Generative Programmer post&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="aislop-a-cli-for-catching-ai-generated-code-smells"&gt;AISlop, a CLI for catching AI-generated code smells&lt;a class="anchor" href="#aislop-a-cli-for-catching-ai-generated-code-smells"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;The gist&lt;/strong&gt; AISlop, posted as a Show HN on Hacker News, is a CLI that catches patterns that show up in AI-generated code—empty catch blocks, useless comments, duplicated helper functions, dead code—the &amp;ldquo;code smells&amp;rdquo; that aren&amp;rsquo;t syntax errors or test failures and so slip past ordinary linters and tests. You can wire it into hooks so the agent checks itself after each tool call.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it&amp;rsquo;s worth a look&lt;/strong&gt; As code generation speeds up, filtering out &amp;ldquo;code that passes but erodes maintainability&amp;rdquo; matters more. AISlop works as a review assistant that catches what a human reviewer missed at the end of the process, and it fits the same theme as the quality-layer discussion above.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; When adding a quality gate to an agent workflow, it&amp;rsquo;s worth considering a lightweight dedicated scanner at the hook stage for fast feedback, instead of a heavy mega-linter.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://news.ycombinator.com/item?id=48322956" target="_blank"&gt;Read the Hacker News thread&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="youtube-brief"&gt;YouTube Brief&lt;a class="anchor" href="#youtube-brief"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="opus-48-just-dropped-heres-how-to-actually-use-it"&gt;Opus 4.8 Just Dropped. Here&amp;rsquo;s How To Actually Use It.&lt;a class="anchor" href="#opus-48-just-dropped-heres-how-to-actually-use-it"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Channel&lt;/strong&gt;: Nate Herk | AI Automation&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The gist&lt;/strong&gt; The video covers how Opus 4.8 layers sharper judgment, more honesty about its own progress, and longer autonomous runs on top of Opus 4.7—at the same price. It walks through what&amp;rsquo;s new from a Claude Code perspective, how 4.8 aims to address pain points people hit with 4.7, and how effort control changes the way you should work with it. It also notes that rate limits for API usage in Claude Code were raised to accommodate higher token use at higher effort levels.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why watch&lt;/strong&gt; Useful for developers wondering how to apply Opus 4.8 to a real coding workflow.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Video&lt;/strong&gt;: &lt;a href="https://www.youtube.com/watch?v=q5lg3npxjAc" target="_blank"&gt;Watch the video&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>2026-06-03 AI News Brief</title><link>https://tedfactory.com/en/news/ai-news/20260603-ai-brief/</link><pubDate>Wed, 03 Jun 2026 00:00:00 +0900</pubDate><guid>https://tedfactory.com/en/news/ai-news/20260603-ai-brief/</guid><description>&lt;h1 id="2026-06-03-ai-news-brief"&gt;2026-06-03 AI News Brief&lt;a class="anchor" href="#2026-06-03-ai-news-brief"&gt;#&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;A roundup of AI technology news worth checking today, along with shifts in developer tools, open source, infrastructure, and organizations in the AI era. This brief focuses on official announcements and community / open-source signals published from May 31 to June 3.&lt;/p&gt;
&lt;h2 id="quick-summary"&gt;Quick Summary&lt;a class="anchor" href="#quick-summary"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;OpenAI is expanding Codex from a coding agent into an organizational work tool with role-specific plugins, Sites, and annotations.&lt;/li&gt;
&lt;li&gt;OpenAI frontier models and Codex are now generally available on Amazon Bedrock, moving the April limited preview into enterprise deployment.&lt;/li&gt;
&lt;li&gt;Anthropic expanded Project Glasswing to about 150 organizations, arguing that the AI security bottleneck is shifting from vulnerability discovery to verification and patching.&lt;/li&gt;
&lt;li&gt;GitHub Copilot SDK is generally available, while Copilot usage-based billing is now active, making agent runtime and cost governance part of the same conversation.&lt;/li&gt;
&lt;li&gt;NVIDIA Rubin-based DGX SuperPOD, Holo3.1, and Mellum2 show where agent-era infrastructure, local agents, and lightweight models are heading.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="top-news"&gt;Top News&lt;a class="anchor" href="#top-news"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="openai-expands-codex-into-a-role-specific-work-platform"&gt;OpenAI expands Codex into a role-specific work platform&lt;a class="anchor" href="#openai-expands-codex-into-a-role-specific-work-platform"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; On June 2, OpenAI added role-specific plugins, Sites, and annotations to Codex. A plugin is a reusable work package that bundles app integrations, skills, and MCP (Model Context Protocol) servers. The new plugins cover data analytics, creative production, sales, product design, public equity investing, and investment banking, with 62 apps and 110 skills combined. Sites lets Codex create interactive web apps such as dashboards, planners, and project boards that can be shared through workspace URLs, while annotations let users point Codex at a specific part of a document, spreadsheet, or site for targeted revision.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters?&lt;/strong&gt; Codex is moving from &amp;ldquo;a tool that writes code&amp;rdquo; toward &amp;ldquo;an execution environment that creates and updates many kinds of organizational work products.&amp;rdquo; The fact that plugins bundle skills, apps, and MCP servers together is a signal that agent product competition is expanding beyond model calls into permissions, tool connections, approval flows, and shared outputs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; Sites are especially interesting from a developer-tools angle. Once agents start producing small web apps that teams can inspect and manipulate, the line between a report and an internal tool gets thinner.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://openai.com/index/codex-for-every-role-tool-workflow/" target="_blank"&gt;Read the OpenAI announcement&lt;/a&gt;, &lt;a href="https://developers.openai.com/codex/plugins" target="_blank"&gt;Read the Codex plugins docs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="follow-up-openai-models-and-codex-are-ga-on-amazon-bedrock"&gt;Follow-up: OpenAI models and Codex are GA on Amazon Bedrock&lt;a class="anchor" href="#follow-up-openai-models-and-codex-are-ga-on-amazon-bedrock"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; On June 1, OpenAI and AWS made OpenAI frontier models and Codex generally available on Amazon Bedrock. This is the next step after the limited preview covered in the April brief. Enterprises can call GPT-5.5 and GPT-5.4 through Bedrock&amp;rsquo;s Responses API and configure the Codex app, CLI (Command-Line Interface), and IDE extensions to use Bedrock as the model provider. Authentication uses a Bedrock API key or AWS IAM credentials instead of ChatGPT sign-in or &lt;code&gt;OPENAI_API_KEY&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters?&lt;/strong&gt; The real barriers to enterprise AI adoption are not only model performance, but also security review, data residency, procurement, billing, and audit controls. The Bedrock path places OpenAI models and Codex inside an AWS operating model enterprises already use, reducing the friction between evaluation and production deployment. That said, OpenAI&amp;rsquo;s docs note that Fast Mode, some first-party plugins, and Codex cloud agents are limited in the initial Bedrock configuration.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; The same Codex product now has meaningful differences depending on whether it runs through OpenAI directly or through Bedrock. When evaluating enterprise adoption, teams should check not only whether the model is available, but which agent features are missing and where logs and permission boundaries sit.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://openai.com/index/openai-frontier-models-and-codex-are-now-available-on-aws/" target="_blank"&gt;Read the OpenAI announcement&lt;/a&gt;, &lt;a href="https://developers.openai.com/codex/amazon-bedrock" target="_blank"&gt;Read the Codex on Bedrock docs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="anthropic-expands-project-glasswing-to-about-150-organizations"&gt;Anthropic expands Project Glasswing to about 150 organizations&lt;a class="anchor" href="#anthropic-expands-project-glasswing-to-about-150-organizations"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; On June 2, Anthropic announced that Project Glasswing is expanding to about 150 new organizations. Project Glasswing is a collaboration program that uses the restricted Claude Mythos Preview model to find vulnerabilities in critical software and move defensive work earlier. The new group spans more than 15 countries and includes organizations in power, water, healthcare, communications, and hardware, along with maintainers of critical open-source software, all areas where a successful attack could create broad social harm.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters?&lt;/strong&gt; Anthropic expects high-capability cyber models to become more widely available within 6 to 12 months, so defenders need to adapt first. The key point is that the bottleneck is becoming verification, disclosure, patching, and deployment rather than discovery itself. As AI finds more bugs, security teams must triage more findings, verify real risk, and turn them into patches maintainers can actually ship.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; Teams should avoid treating AI security scanners as merely smarter linters. The post-discovery workflow, including triage, reproduction, patch validation, and responsible disclosure, has to be designed if model capability is to become real security improvement.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://www.anthropic.com/news/expanding-project-glasswing" target="_blank"&gt;Read the Anthropic announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="github-copilot-sdk-is-generally-available"&gt;GitHub Copilot SDK is generally available&lt;a class="anchor" href="#github-copilot-sdk-is-generally-available"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; On June 2, GitHub made Copilot SDK generally available. The SDK lets developers embed Copilot&amp;rsquo;s agent runtime into applications, services, and internal developer tools. It includes planning, tool invocation, file edits, streaming, and multi-turn session management, with support for Node.js / TypeScript, Python, Go, .NET, Rust, and Java. It also includes MCP server connections, custom tools, partial system prompt customization, OpenTelemetry tracing, BYOK (Bring Your Own Key), and a hook system.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters?&lt;/strong&gt; Teams can bring the same agent runtime used by Copilot into their products instead of rebuilding planners, tool loops, permission handlers, and streaming protocols themselves. This is another sign that developer tools are moving from &amp;ldquo;AI chat panes&amp;rdquo; toward programmable agent execution layers.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; Hooks and permission handlers are especially important. When embedding agents into products, operational quality depends less on answer fluency and more on which tools are allowed, who approves them, and what trace data is left behind.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.blog/changelog/2026-06-02-copilot-sdk-is-now-generally-available/" target="_blank"&gt;Read the GitHub Changelog&lt;/a&gt;, &lt;a href="https://github.com/github/copilot-sdk" target="_blank"&gt;View the Copilot SDK repository&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="github-copilot-usage-based-billing-is-now-active"&gt;GitHub Copilot usage-based billing is now active&lt;a class="anchor" href="#github-copilot-usage-based-billing-is-now-active"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; On June 1, GitHub activated usage-based billing for Copilot across all plans. GitHub AI Credits replace premium request units, and every plan includes a monthly allowance. After included credits are consumed, users need to set an additional spending budget to keep using premium capabilities. Copilot code review now consumes both GitHub AI Credits and GitHub Actions minutes, and organization admins can set a default runner. User-level budget controls are also generally available for organizations and enterprises.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters?&lt;/strong&gt; High-performance models and agentic features are becoming harder to manage as a simple per-seat subscription. Features such as code review and cloud agents consume both model tokens and execution resources. Operating AI tools is now a FinOps (Financial Operations) problem as much as a feature-policy problem.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; Teams should define model access, user budgets, and code review runner policy before opening every premium model to everyone. A default model by task type, plus a clear exception process, will make cost more predictable.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.blog/changelog/2026-06-01-updates-to-github-copilot-billing-and-plans/" target="_blank"&gt;Read the GitHub Changelog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="nvidia-emphasizes-agent-infrastructure-with-rubin-based-dgx-superpod"&gt;NVIDIA emphasizes agent infrastructure with Rubin-based DGX SuperPOD&lt;a class="anchor" href="#nvidia-emphasizes-agent-infrastructure-with-rubin-based-dgx-superpod"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; On June 2, NVIDIA described its Rubin-based DGX SuperPOD configuration. Rubin is an AI infrastructure platform co-designed across the Vera CPU, Rubin GPU, NVLink 6 Switch, ConnectX-9 SuperNIC, BlueField-4 DPU, and Spectrum-6 Ethernet Switch. NVIDIA says Rubin is built to accelerate mixture-of-experts (MoE), long-context reasoning, and agentic AI, with a goal of reducing inference token cost by up to 10x versus the previous generation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters?&lt;/strong&gt; Agents require more intermediate calls, tool use, long context, and verification loops than a single inference pass. AI infrastructure is being redesigned not only for training large models, but also for handling many-step inference reliably and cheaply. It is also notable that NVIDIA emphasizes operational features such as Confidential Computing, RAS (reliability / availability / serviceability), and Mission Control.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; Agent cost is not just model pricing. The real bottlenecks include networking, memory, failure recovery, power, cooling, and operational automation across the whole AI factory.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://blogs.nvidia.com/blog/dgx-superpod-rubin/" target="_blank"&gt;Read the NVIDIA Blog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="threads-to-watch"&gt;Threads to Watch&lt;a class="anchor" href="#threads-to-watch"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="holo31-a-local-computer-use-agent-model"&gt;Holo3.1, a local computer-use agent model&lt;a class="anchor" href="#holo31-a-local-computer-use-agent-model"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;The gist&lt;/strong&gt; H Company released the Holo3.1 model family on June 2. Holo3.1 is a computer-use model for agents that see and operate web, desktop, and mobile interfaces. It comes in 0.8B, 4B, 9B, and 35B-A3B sizes, with quantized checkpoints such as FP8, Q4 GGUF, and NVFP4. The company says Q4 GGUF is aimed at local deployment on consumer hardware, and that agents can be configured on Windows or Mac so execution stays inside the user&amp;rsquo;s own network.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it&amp;rsquo;s worth a look&lt;/strong&gt; Computer-use agents can handle business systems, browsers, and desktop apps that lack APIs, but screen interaction often touches sensitive data. Local execution and smaller model sizes can reduce privacy risk, latency, and cost at the same time.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; The combination of &amp;ldquo;terminal coding agent&amp;rdquo; and &amp;ldquo;GUI-operating local subagent&amp;rdquo; is worth tracking. In real workflow automation, those two agents will likely delegate to each other rather than remain separate products.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://huggingface.co/blog/Hcompany/holo31" target="_blank"&gt;Read the Hugging Face post&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="jetbrains-mellum2-a-lightweight-code-model-for-agent-subtasks"&gt;JetBrains Mellum2, a lightweight code model for agent subtasks&lt;a class="anchor" href="#jetbrains-mellum2-a-lightweight-code-model-for-agent-subtasks"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;The gist&lt;/strong&gt; JetBrains released Mellum2 on June 1. Mellum2 is a 12B-parameter Mixture-of-Experts (MoE) model for natural language and code, activating only 2.5B parameters per token. It is released under Apache 2.0 and positioned for routing, RAG (Retrieval-Augmented Generation), summarization, sub-agents, high-throughput coding features, and private deployment.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it&amp;rsquo;s worth a look&lt;/strong&gt; Agent systems are not made of one giant model alone. Real products call models repeatedly for routing, context compression, validation, and tool selection, and many of those calls do not need the strongest frontier model. Mellum2 captures the trend toward well-scoped models that make frequent intermediate work faster and cheaper.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; Even in personal projects or internal tools, it is worth experimenting with lightweight models as classifiers, summarizers, and validators instead of sending every step to a frontier model.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://huggingface.co/blog/JetBrains/mellum2-launch" target="_blank"&gt;Read the Hugging Face post&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="youtube-brief"&gt;YouTube Brief&lt;a class="anchor" href="#youtube-brief"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="nvidia-gtc-taipei-2026-keynote--full-replay"&gt;NVIDIA GTC Taipei 2026 Keynote | Full Replay&lt;a class="anchor" href="#nvidia-gtc-taipei-2026-keynote--full-replay"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Channel&lt;/strong&gt;: NVIDIA&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The gist&lt;/strong&gt; NVIDIA&amp;rsquo;s GTC Taipei 2026 keynote connects AI factories, agentic AI systems, physical AI, and AI-native personal computing into one story. It introduces Vera Rubin as a multi-rack, pod-scale system for the agent era and frames the Vera CPU as the processor for the agent loop: tool use, data access, and orchestration. It also discusses software and system layers such as OpenShell, Agent Toolkit, and DGX Station.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why watch&lt;/strong&gt; Useful for readers who want the bigger picture of why agents are changing not only model features, but also infrastructure, operations, security, and local computing.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Video&lt;/strong&gt;: &lt;a href="https://www.youtube.com/watch?v=wSp6AiNIrsY" target="_blank"&gt;Watch the video&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>2026-06-07 AI News Brief</title><link>https://tedfactory.com/en/news/ai-news/20260607-ai-brief/</link><pubDate>Sun, 07 Jun 2026 00:00:00 +0900</pubDate><guid>https://tedfactory.com/en/news/ai-news/20260607-ai-brief/</guid><description>&lt;h1 id="2026-06-07-ai-news-brief"&gt;2026-06-07 AI News Brief&lt;a class="anchor" href="#2026-06-07-ai-news-brief"&gt;#&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;A roundup of AI technology news worth checking today, along with shifts in developer tools, open source, infrastructure, and organizations in the agent era. This brief centers on announcements between June 4 and June 7, but also covers Microsoft&amp;rsquo;s Build 2026 MAI model launch, which landed right after the previous brief (June 3).&lt;/p&gt;
&lt;h2 id="quick-summary"&gt;Quick Summary&lt;a class="anchor" href="#quick-summary"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;OpenAI unveiled Dreaming, a system that automatically synthesizes ChatGPT memory, cutting compute by roughly 5x so memory can reach free users too.&lt;/li&gt;
&lt;li&gt;OpenAI expanded Lockdown Mode, a security setting designed to limit data exfiltration from prompt injection attacks, to all logged-in users.&lt;/li&gt;
&lt;li&gt;Microsoft introduced seven in-house MAI models at Build 2026 to reduce OpenAI dependence, putting the coding model MAI-Code-1-Flash straight into GitHub Copilot and VS Code.&lt;/li&gt;
&lt;li&gt;GitHub Copilot opened a 1-million-token context window, configurable reasoning levels, and an Agent tasks REST API for driving cloud agents from code.&lt;/li&gt;
&lt;li&gt;Cursor 3.7 added canvas Design Mode and a context-usage report, plus custom tools, stores, and Auto-review in the SDK.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="top-news"&gt;Top News&lt;a class="anchor" href="#top-news"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="openai-unveils-dreaming-a-rebuilt-chatgpt-memory-system"&gt;OpenAI unveils Dreaming, a rebuilt ChatGPT memory system&lt;a class="anchor" href="#openai-unveils-dreaming-a-rebuilt-chatgpt-memory-system"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; On June 4, OpenAI unveiled Dreaming, a new system that automatically synthesizes ChatGPT memory. The previous approach centered on saved memories that required you to explicitly say &amp;ldquo;remember this.&amp;rdquo; Dreaming runs a background process after conversations to combine many chats into a picture of your preferences, constraints, and ongoing projects, and it revises stale information as circumstances change. For example, it updates &amp;ldquo;going to Singapore in July&amp;rdquo; to &amp;ldquo;went there&amp;rdquo; after the trip. It also adds a memory summary page that shows what&amp;rsquo;s stored and lets you edit or delete it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; OpenAI says it cut the compute needed to serve memory synthesis by roughly 5x in order to offer memory to free users. That shows personalization features like memory are not just a model-quality problem but a cost and scheduling problem of running background work cheaply at the scale of hundreds of millions of users. Once long-term memory reaches free users, an assistant that doesn&amp;rsquo;t make you repeat yourself becomes the norm.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; When building enterprise agents, &amp;ldquo;can the user see and edit what&amp;rsquo;s remembered&amp;rdquo; is becoming an important requirement. An editable memory summary page is close to a baseline expectation in regulated or audited environments.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://openai.com/index/chatgpt-memory-dreaming/" target="_blank"&gt;Read the OpenAI announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="openai-expands-lockdown-mode-to-defend-against-prompt-injection"&gt;OpenAI expands Lockdown Mode to defend against prompt injection&lt;a class="anchor" href="#openai-expands-lockdown-mode-to-defend-against-prompt-injection"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; On June 4, OpenAI expanded Lockdown Mode to all logged-in users. Lockdown Mode is a security setting that deliberately blocks the paths through which data could leave a conversation, to defend against prompt injection (attacks that hide malicious instructions in webpages or files to trick an AI). When on, it limits features such as live web browsing, web image display, Deep Research, Agent Mode, Canvas networking, live connectors, and file downloads. Personal users can turn it on under Settings &amp;gt; Security, and workspace admins can enable it per member.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; The more AI connects to the web and external tools, the more an attacker can exfiltrate sensitive data via hidden instructions without ever hacking the model directly. OpenAI frames Lockdown Mode not as a cure-all but as a last line of defense. It doesn&amp;rsquo;t stop prompt injection itself; it reduces the routes through which data can leave even if an attack succeeds.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; When attaching tools and external connections to an agent, it&amp;rsquo;s safer to design under the assumption that the model can be tricked. Rather than leaving everything on, blocking outbound paths by default for sensitive work and opening them only when needed reduces exfiltration risk.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://openai.com/index/introducing-lockdown-mode-and-elevated-risk-labels-in-chatgpt/" target="_blank"&gt;Read the OpenAI announcement&lt;/a&gt;, &lt;a href="https://techcrunch.com/2026/06/06/openai-unveils-lockdown-mode-to-protect-sensitive-data-from-prompt-injection-attacks/" target="_blank"&gt;Read the TechCrunch article&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="microsoft-unveils-seven-in-house-mai-models-at-build-2026"&gt;Microsoft unveils seven in-house MAI models at Build 2026&lt;a class="anchor" href="#microsoft-unveils-seven-in-house-mai-models-at-build-2026"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; On June 2 at Build 2026, Microsoft introduced seven in-house MAI models spanning image (MAI-Image-2.5 and Flash), voice (MAI-Voice-2 and Flash), transcription (MAI-Transcribe-1.5), reasoning (MAI-Thinking-1), and coding (MAI-Code-1-Flash). MAI-Thinking-1 is a Mixture-of-Experts (MoE) model with 35 billion active parameters and a 256k-token context window; Microsoft says that blind testers preferred it to Claude Sonnet 4.6 and that it approaches Claude Opus 4.6 on the SWE-Bench Pro coding evaluation. MAI-Code-1-Flash is a lightweight 5-billion-active-parameter coding model that shipped the same day as one of the default models in VS Code via Copilot. Microsoft stressed it trained the family from scratch on its own data, with no distillation from third-party models.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Microsoft has been the largest distribution channel for OpenAI models. This launch signals it can now route Copilot, GitHub, Office, and Azure workloads to its own models when it makes sense. Notably, putting a small coding model in as a default reflects a trend toward handling everyday work with cost-efficient models rather than sending everything to a top-tier model.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; Even within the same Copilot, it&amp;rsquo;s worth checking which model is the default for which kind of task. As model providers multiply, choosing per-task default models by cost, performance, and data residency increasingly drives operational quality.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://microsoft.ai/news/building-a-hillclimbing-machine-launching-seven-new-mai-models/" target="_blank"&gt;Read the Microsoft AI announcement&lt;/a&gt;, &lt;a href="https://microsoft.ai/news/introducing-mai-thinking-1/" target="_blank"&gt;See the MAI-Thinking-1 intro&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="github-copilot-adds-a-1m-token-context-and-configurable-reasoning"&gt;GitHub Copilot adds a 1M-token context and configurable reasoning&lt;a class="anchor" href="#github-copilot-adds-a-1m-token-context-and-configurable-reasoning"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; On June 4, GitHub added a 1-million-token context window and configurable reasoning levels to Copilot. The 1M-token context lets you work across larger codebases, longer documents, and multi-file tasks without losing context. Configurable reasoning lets you set the balance of speed and depth, turning on extended thinking for hard architecture and debugging problems. Both are available in VS Code, the Copilot CLI (Command-Line Interface), and the GitHub Copilot app.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Choosing a larger context or higher reasoning level consumes more AI credits per interaction. GitHub recommends defaults for everyday tasks and extended options only for complex multi-file problems. Combined with the usage-based billing that took effect on June 1, &amp;ldquo;how far you push performance&amp;rdquo; now directly maps to &amp;ldquo;how much you spend.&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; At the team level, standardizing on the default context and reasoning levels and reserving extended options for exceptions helps keep costs predictable.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.blog/changelog/2026-06-04-larger-context-windows-and-configurable-reasoning-levels-for-github-copilot/" target="_blank"&gt;Read the GitHub Changelog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="github-copilot-opens-an-agent-tasks-rest-api-for-cloud-agents"&gt;GitHub Copilot opens an Agent tasks REST API for cloud agents&lt;a class="anchor" href="#github-copilot-opens-an-agent-tasks-rest-api-for-cloud-agents"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; On June 4, GitHub opened the Agent tasks REST API in public preview for Copilot Pro / Pro+ / Max users. The API lets you start and track Copilot cloud agent tasks from a program. The cloud agent makes and validates code changes in its own development environment, then opens a pull request. GitHub cited examples like fanning out refactors or migrations across many repositories from a script, setting up new repositories in one click from an internal developer portal, and automatically preparing weekly release notes. It supports personal access tokens and OAuth tokens for authentication.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; This is the shift from agents that work only inside a chat window to agents wired into internal automation and workflows via code. Once you can fan tasks out across many repositories, the human role moves from doing the work to designing who gets delegated which tasks, when, and how they&amp;rsquo;re reviewed.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; When attaching agents to automation, it&amp;rsquo;s safer to decide token permission scope, approval rules for write actions, and how many tasks you fan out at once before you start.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.blog/changelog/2026-06-04-agent-tasks-rest-api-now-available-for-copilot-pro-pro-and-max/" target="_blank"&gt;Read the GitHub Changelog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="cursor-37-brings-canvas-design-mode-and-sdk-updates"&gt;Cursor 3.7 brings canvas Design Mode and SDK updates&lt;a class="anchor" href="#cursor-37-brings-canvas-design-mode-and-sdk-updates"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; On June 4 and 5, Cursor shipped its 3.7 update and SDK improvements. Canvases (interactive artifacts agents create, like dashboards, reports, and internal tools) gained Design Mode, so instead of describing a change in text you can point at a UI element to direct edits. A context-usage report was added that shows, as a canvas, how tokens are allocated across the system prompt, tool definitions, rules, and skills, with a &amp;ldquo;Debug with Agent&amp;rdquo; button that opens a new conversation to diagnose ways to reduce usage. Around the same time, the SDK added custom tool exposure, a choice of metadata store (SQLite or version-controllable JSONL), routing local tool calls through Auto-review, and nested subagents.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; The trend of agents producing interactive tools teams can directly manipulate, rather than plain text, continues. The ability to see and diagnose context usage in particular addresses the fact that agent quality depends heavily not just on model capability but on &amp;ldquo;what you put into context.&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; The more rules, skills, and MCP (Model Context Protocol) servers you add, the more context quietly bloats. Periodically checking where tokens go via the usage report lets you manage cost and response quality together.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://cursor.com/changelog" target="_blank"&gt;Read the Cursor Changelog&lt;/a&gt;, &lt;a href="https://cursor.com/changelog/sdk-updates-jun-2026" target="_blank"&gt;See the Cursor SDK update&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="flows-worth-following"&gt;Flows Worth Following&lt;a class="anchor" href="#flows-worth-following"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="hermes-agent-an-open-source-agent-with-a-self-improvement-loop"&gt;Hermes Agent, an open-source agent with a self-improvement loop&lt;a class="anchor" href="#hermes-agent-an-open-source-agent-with-a-self-improvement-loop"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt; Hermes Agent, the open-source agent from Nous Research, shipped a new release (v2026.6.5) on June 6. With over 180,000 GitHub stars, it&amp;rsquo;s one of the fastest-growing projects of the year. The project says it has a built-in self-improvement loop that creates skills from experience, refines them during use, searches its own past conversations, and builds a deepening model of who you are across sessions. It isn&amp;rsquo;t tied to a specific model and can run on anything from a cheap VPS to a GPU cluster.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it&amp;rsquo;s worth a look&lt;/strong&gt; Separate from large companies&amp;rsquo; closed agent products, community-built open-source agents are maturing fast. Having concepts like memory, skills, and self-improvement open in code lets you directly experiment with how an agent adapts to a user over time.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; When designing how to store and update an agent&amp;rsquo;s memory and skills in an internal tool or personal project, referencing an open-source implementation helps you structure your own.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.com/NousResearch/hermes-agent" target="_blank"&gt;See the Hermes Agent repository&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="draft-us-federal-ai-bill-the-great-american-ai-act"&gt;Draft US federal AI bill, the &amp;lsquo;Great American AI Act&amp;rsquo;&lt;a class="anchor" href="#draft-us-federal-ai-bill-the-great-american-ai-act"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt; On June 4, US Representatives Jay Obernolte and Lori Trahan released a 269-page discussion draft of a federal AI bill, the Great American Artificial Intelligence Act. The core is a clause under which federal law would, for three years, preempt state laws regulating the development of frontier (cutting-edge) AI models. It leaves state laws on post-deployment use in place, and requires companies with over $500M in annual revenue to publish frontier AI safety frameworks, report critical safety incidents, and allow audits. It is a discussion draft, not a formal bill, and labor unions and others pushed back strongly.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it&amp;rsquo;s worth a look&lt;/strong&gt; It&amp;rsquo;s a turning point for whether US AI regulation fragments by state or consolidates into a single federal standard. As an attempt to regulate the building side (development) and the using side (deployment) separately, it helps you gauge in advance what obligations might arise, and where, when bringing AI products to the US market.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; At the discussion-draft stage it may change significantly or never pass. Still, the &amp;ldquo;development vs deployment&amp;rdquo; framing is likely to keep appearing in future debates, so it&amp;rsquo;s worth tracking the trend.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://rollcall.com/2026/06/04/bipartisan-ai-draft-proposes-three-year-preemption-of-state-laws/" target="_blank"&gt;Read the Roll Call article&lt;/a&gt;, &lt;a href="https://fedscoop.com/bipartisan-great-american-ai-act-draft-proposes-new-federal-ai-governance-framework/" target="_blank"&gt;Read the FedScoop article&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="nvidia-rtx-spark-a-signal-toward-on-device-ai"&gt;NVIDIA RTX Spark, a signal toward on-device AI&lt;a class="anchor" href="#nvidia-rtx-spark-a-signal-toward-on-device-ai"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt; On June 1 at Computex 2026 in Taiwan, NVIDIA unveiled the Arm-based RTX Spark chip. The chip is designed to handle AI agents, content creation, and gaming on a single laptop, and NVIDIA said it would reinvent the PC alongside Microsoft. Adobe is rebuilding Photoshop and Premiere Pro for the chip&amp;rsquo;s architecture, and RTX Spark laptops are expected to launch in autumn 2026.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it&amp;rsquo;s worth a look&lt;/strong&gt; The center of gravity for AI compute has been the data center. NVIDIA expanding into client devices means it sees running agents locally, without cloud latency and cost, as a potential next bottleneck. For computer-use agents or sensitive data processing, local execution reduces not just cost but privacy and latency concerns too.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Worth watching&lt;/strong&gt; It&amp;rsquo;s worth watching the split of roles between &amp;ldquo;large cloud models&amp;rdquo; and &amp;ldquo;lightweight on-device agents.&amp;rdquo; Deciding which tasks to push local and which to keep in the cloud becomes a key axis of product design.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://www.cnbc.com/2026/06/02/nvidias-new-pc-chips-jensen-huangs-bid-to-win-at-every-layer.html" target="_blank"&gt;Read the CNBC article&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="youtube-brief"&gt;YouTube Brief&lt;a class="anchor" href="#youtube-brief"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="microsoft-ai-ceo-unveils-7-new-ai-models--mustafa-suleyman-at-microsoft-build-2026"&gt;Microsoft AI CEO unveils 7 new AI models | Mustafa Suleyman at Microsoft Build 2026&lt;a class="anchor" href="#microsoft-ai-ceo-unveils-7-new-ai-models--mustafa-suleyman-at-microsoft-build-2026"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Channel&lt;/strong&gt;: Microsoft&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt; In the Microsoft Build 2026 keynote, Microsoft AI CEO Mustafa Suleyman personally introduces the seven MAI models. He walks through the lineup across image, voice, transcription, reasoning, and coding, presents MAI-Thinking-1 as a reasoning model with 35B active parameters and a 256k context, and MAI-Code-1-Flash as a 5B coding model that scores 51% on SWE-Bench Pro while being tuned for VS Code and the GitHub Copilot CLI. He also mentions optimizing the models on Microsoft&amp;rsquo;s own Maia 200 chip.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it&amp;rsquo;s worth watching&lt;/strong&gt; Useful for readers who want to hear, from the presenter himself, why Microsoft started building its own models and what it aims to achieve by putting small models into default tools.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Video&lt;/strong&gt;: &lt;a href="https://www.youtube.com/watch?v=OvLIae4HCeM" target="_blank"&gt;Watch the video&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>2026-06-10 AI News Brief</title><link>https://tedfactory.com/en/news/ai-news/20260610-ai-brief/</link><pubDate>Wed, 10 Jun 2026 00:00:00 +0900</pubDate><guid>https://tedfactory.com/en/news/ai-news/20260610-ai-brief/</guid><description>&lt;h1 id="2026-06-10-ai-news-brief"&gt;2026-06-10 AI News Brief&lt;a class="anchor" href="#2026-06-10-ai-news-brief"&gt;#&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;Here are the AI technology news items worth checking today, along with shifts in developer tools, open source, infrastructure, and organizations in the AI era. This brief centers on announcements from June 8 to June 10, while also covering the developer news from Apple WWDC 2026, held during the same window.&lt;/p&gt;
&lt;h2 id="quick-summary"&gt;Quick Summary&lt;a class="anchor" href="#quick-summary"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;OpenAI confidentially filed its IPO paperwork (S-1), joining Anthropic and SpaceX in the race for public listings among AI companies.&lt;/li&gt;
&lt;li&gt;At WWDC 2026, Apple added a LanguageModel protocol to Foundation Models, letting developers swap in external models like Claude and Gemini without code changes.&lt;/li&gt;
&lt;li&gt;Google unveiled Gemini 3.5 Live Translate, which interprets 70-plus languages in real time.&lt;/li&gt;
&lt;li&gt;Google NotebookLM moved to Gemini 3.5 and Antigravity, gaining code execution and chart / slide generation.&lt;/li&gt;
&lt;li&gt;We also cover non-big-tech developer signals such as the Nex-N2 open-source agent model and Simon Willison&amp;rsquo;s WASM code sandbox.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="top-news"&gt;Top News&lt;a class="anchor" href="#top-news"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="openai-confidentially-files-its-ipo-s-1"&gt;OpenAI Confidentially Files Its IPO S-1&lt;a class="anchor" href="#openai-confidentially-files-its-ipo-s-1"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; On June 8, OpenAI said it confidentially submitted a draft S-1 for an IPO (Initial Public Offering) to the U.S. Securities and Exchange Commission (SEC). A confidential draft is not a formal listing application; it lets the SEC review the document first, after which the company can decide whether to go public depending on market conditions. OpenAI has not set the offering size, price, or timeline, but reports point to a Q4 2026 listing at a valuation between roughly $850 billion and $1 trillion. Anthropic took the same step on June 1, and SpaceX is set to list on June 12.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; It is the first time AI builders have lined up at the public-market threshold within a single month. Going public means disclosing numbers like revenue, profit and loss, and compute commitments, so the question moves beyond &amp;ldquo;can it build strong models?&amp;rdquo; to &amp;ldquo;can it turn strong models into a durable, profitable business?&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What to watch&lt;/strong&gt; Once the filing becomes public, items like token consumption, inference costs, and GPU rental commitments may be revealed. Even for those who simply use AI services, it offers a way to gauge how a provider&amp;rsquo;s cost structure feeds into pricing and usage limits.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://asia.nikkei.com/business/technology/artificial-intelligence/openai-follows-anthropic-in-filing-for-us-ipo" target="_blank"&gt;Read the Nikkei Asia article&lt;/a&gt;, &lt;a href="https://www.anthropic.com/news/confidential-draft-s1-sec" target="_blank"&gt;Read Anthropic&amp;rsquo;s announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="apple-wwdc-2026-adds-a-model-swapping-protocol-and-xcode-27-agents-to-foundation-models"&gt;Apple WWDC 2026 Adds a Model-Swapping Protocol and Xcode 27 Agents to Foundation Models&lt;a class="anchor" href="#apple-wwdc-2026-adds-a-model-swapping-protocol-and-xcode-27-agents-to-foundation-models"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; Apple held its developer event WWDC 2026 on June 8 and substantially expanded the Foundation Models framework for adding AI to apps. The centerpiece is the new LanguageModel protocol. A protocol is a shared spec that lets Apple&amp;rsquo;s on-device models and external cloud models be called the same way, so developers can switch among Apple&amp;rsquo;s default model, Claude, and Gemini by changing only a Swift Package Manager dependency, with no other code changes. Anthropic and Google each published Swift packages implementing the protocol, and Apple also announced server models usable without account setup (Private Cloud Compute) and the open-sourcing of the framework. The accompanying Xcode 27 brings the latest models and agents from Anthropic, Google, and OpenAI directly into the editor.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Until now, wiring a specific AI into an app often locked you into that vendor. Abstracting models behind a spec makes it easier to switch by task type, cost, or data-processing location. This is Apple cementing, at the operating-system level, the trend of treating AI models like interchangeable parts.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What to watch&lt;/strong&gt; When models become easy to swap, differentiation shifts from the model itself to which task you route to which model and how you review the results. Designing how to split on-device, server, and external-cloud models by task will drive both app quality and cost.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://www.apple.com/newsroom/2026/06/apple-aids-app-development-with-new-intelligence-frameworks-and-advanced-tools/" target="_blank"&gt;Read the Apple Newsroom post&lt;/a&gt;, &lt;a href="https://developer.apple.com/videos/play/wwdc2026/241/" target="_blank"&gt;Watch the WWDC session&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="google-unveils-gemini-35-live-translate-for-real-time-interpretation-across-70-plus-languages"&gt;Google Unveils Gemini 3.5 Live Translate for Real-Time Interpretation Across 70-plus Languages&lt;a class="anchor" href="#google-unveils-gemini-35-live-translate-for-real-time-interpretation-across-70-plus-languages"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; On June 9, Google unveiled Gemini 3.5 Live Translate, a real-time speech translation model. It automatically detects more than 70 languages and generates natural translated speech that preserves the speaker&amp;rsquo;s intonation, pace, and pitch. Older systems waited for a speaker to finish before translating, but this model interprets continuously while staying just a few seconds behind. It opened in public preview for developers via the Gemini Live API and Google AI Studio, in private preview for enterprises in Google Meet, and is rolling out to consumers through the Google Translate app on Android and iOS.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Real-time interpretation directly affects situations where people interact face to face, such as meetings, business travel, and customer service. Because it is also available via API, translation can be embedded as a feature inside one&amp;rsquo;s own app or service.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What to watch&lt;/strong&gt; For voice features, latency shapes the experience. How the model balances &amp;ldquo;wait longer for accuracy&amp;rdquo; against &amp;ldquo;speak sooner for real-time flow&amp;rdquo; determines the perceived quality in actual conversation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-live-3-5-translate/" target="_blank"&gt;Read Google&amp;rsquo;s announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="google-notebooklm-adds-code-execution-and-document-generation-on-gemini-35-and-antigravity"&gt;Google NotebookLM Adds Code Execution and Document Generation on Gemini 3.5 and Antigravity&lt;a class="anchor" href="#google-notebooklm-adds-code-execution-and-document-generation-on-gemini-35-and-antigravity"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; On June 8, Google substantially upgraded its research tool NotebookLM. NotebookLM answers questions based on documents users upload and helps summarize and connect them. With this update, the underlying models move to Gemini 3.5 and Antigravity, and a secure cloud computer for safely running code is added, so it can directly produce formats like charts, spreadsheets, and slides. You can even start with a loose idea and have the tool find and organize relevant web sources. It is rolling out globally to Google AI Ultra users and some Workspace business accounts.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; This is a shift from reading and answering toward running code to analyze and produce finished artifacts. When a research tool expands from &amp;ldquo;reading assistant&amp;rdquo; to &amp;ldquo;analysis / output workbench,&amp;rdquo; it becomes possible to handle everything from research to a draft report inside one tool.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What to watch&lt;/strong&gt; For tools with code execution, it matters whether you can trace the basis of the results. Building a habit of checking which sources and calculations a generated chart or table came from helps preserve reliability.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://blog.google/innovation-and-ai/products/notebooklm/better-research-notebooklm/" target="_blank"&gt;Read Google&amp;rsquo;s announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="claude-code-21169-adds-a-diagnostic-safe-mode-and-cd-command"&gt;Claude Code 2.1.169 Adds a Diagnostic Safe Mode and /cd Command&lt;a class="anchor" href="#claude-code-21169-adds-a-diagnostic-safe-mode-and-cd-command"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; Anthropic&amp;rsquo;s terminal coding tool Claude Code shipped version 2.1.169 on June 9. The new safe mode (the &lt;code&gt;--safe-mode&lt;/code&gt; flag or the &lt;code&gt;CLAUDE_CODE_SAFE_MODE&lt;/code&gt; environment variable) runs with all customizations disabled, including CLAUDE.md, plugins, skills, hooks, and MCP (Model Context Protocol) servers, so you can tell whether a problem comes from your configuration or the tool itself. The &lt;code&gt;/cd&lt;/code&gt; command moves the working directory without breaking the prompt cache mid-session, and the &lt;code&gt;disableBundledSkills&lt;/code&gt; setting hides built-in skills and slash commands from the model. The release also fixed enterprise MCP policy enforcement and remote-session stability.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; As rules, skills, and MCP servers pile up, it gets harder to tell why an agent behaves oddly. Safe mode, which reproduces behavior in a clean state with everything turned off, provides a starting point for debugging in increasingly customized agent setups.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What to watch&lt;/strong&gt; Hiding bundled skills is also a way to reduce context. Since tokens spent on tool definitions and skills affect both response quality and cost, regularly trimming to only what you need is becoming more important.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.com/anthropics/claude-code/blob/main/CHANGELOG.md" target="_blank"&gt;Read the Claude Code changelog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="worth-a-look"&gt;Worth a Look&lt;a class="anchor" href="#worth-a-look"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="nex-n2-an-open-source-agent-model-built-on-qwen35"&gt;Nex-N2, an Open-Source Agent Model Built on Qwen3.5&lt;a class="anchor" href="#nex-n2-an-open-source-agent-model-built-on-qwen35"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;The gist&lt;/strong&gt; On June 9, Nex-AGI open-sourced Nex-N2, a model built for agents. Designed to carry long-running, real-world tasks through to the end, it comes in two variants post-trained on the Qwen3.5 family. The larger Nex-N2-Pro and the lighter Nex-N2-mini are each published on Hugging Face and ModelScope, letting you choose between latency and quality. It emphasizes coding and agentic performance.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it&amp;rsquo;s worth a look&lt;/strong&gt; Apart from big tech&amp;rsquo;s closed models, open-weights agent models keep appearing in the coding and long-horizon task space. Open-weights models can be run on your own servers or fine-tuned, making them an option where cost and data control matter.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What to watch&lt;/strong&gt; When designing in-house agents, it&amp;rsquo;s worth experimenting with routing some tasks to open models to cut costs rather than sending everything to a top-tier closed model.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.com/nex-agi/Nex-N2" target="_blank"&gt;View the Nex-N2 repository&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="simon-willisons-python-code-sandbox-built-with-webassembly"&gt;Simon Willison&amp;rsquo;s Python Code Sandbox Built with WebAssembly&lt;a class="anchor" href="#simon-willisons-python-code-sandbox-built-with-webassembly"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;The gist&lt;/strong&gt; On June 6, developer and blogger Simon Willison shared an experiment in safely executing agent-generated Python code. He released an alpha package, &lt;code&gt;micropython-wasm&lt;/code&gt;, that runs MicroPython on top of WebAssembly (WASM, a technology for safely running code in browsers or isolated environments), and wired it into his tool as a code-execution plugin. He challenged a powerful model to break out of the sandbox, and it has not managed to so far.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it&amp;rsquo;s worth a look&lt;/strong&gt; As agents increasingly run code directly, &amp;ldquo;where do we safely run generated code?&amp;rdquo; has become a real problem. This post shows the choices and limits an individual developer hit while implementing isolated execution, offering a practical reference for anyone tackling the same issue.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What to watch&lt;/strong&gt; As OpenAI&amp;rsquo;s Lockdown Mode and Apple&amp;rsquo;s server-model isolation also show, isolation and permission control are common themes of the agent era. If you&amp;rsquo;re wondering how to set up isolation when adding code execution, this is worth a read.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/" target="_blank"&gt;Read Simon Willison&amp;rsquo;s post&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="google-research-unveils-agentic-rag-that-checks-for-sufficient-context"&gt;Google Research Unveils Agentic RAG That Checks for Sufficient Context&lt;a class="anchor" href="#google-research-unveils-agentic-rag-that-checks-for-sufficient-context"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;The gist&lt;/strong&gt; Google Research, in collaboration with Google Cloud, unveiled an Agentic RAG framework and launched it as the Cross-Corpus Retrieval feature of the Gemini Enterprise Agent Platform in public preview. RAG (Retrieval-Augmented Generation) is an approach where a model searches external sources for grounding before answering. This version has multiple agents collaborate to break down complex questions and, before generating an answer, first confirms whether there is &amp;ldquo;sufficient context,&amp;rdquo; searching again if not. Google says factual accuracy improved by up to 34% over standard RAG.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it&amp;rsquo;s worth a look&lt;/strong&gt; For in-house document-based chatbots or search assistants, the biggest problem is answering plausibly without enough grounding. A structure that checks for sufficient context before answering is a design pattern that will frequently appear in business systems where reliability matters.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What to watch&lt;/strong&gt; For questions that span multiple source collections, the key to real adoption is whether you can trace which sources were used as grounding (auditability).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://research.google/blog/unlocking-dependable-responses-with-gemini-enterprise-agent-platforms-agentic-rag/" target="_blank"&gt;Read the Google Research post&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="youtube-brief"&gt;YouTube Brief&lt;a class="anchor" href="#youtube-brief"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="openai-files-for-ipo-with-spacex-debut-well-oversubscribed--daybreak-europe-6092026"&gt;OpenAI Files for IPO with SpaceX Debut Well Oversubscribed | Daybreak Europe 6/09/2026&lt;a class="anchor" href="#openai-files-for-ipo-with-spacex-debut-well-oversubscribed--daybreak-europe-6092026"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Channel&lt;/strong&gt;: Bloomberg Television&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The gist&lt;/strong&gt; Bloomberg&amp;rsquo;s morning markets show covers OpenAI&amp;rsquo;s confidential IPO filing and its backdrop. It walks through OpenAI joining Anthropic and SpaceX in the public markets, the outlook for a valuation that could top $1 trillion, and reports that demand for this week&amp;rsquo;s SpaceX listing is oversubscribed at around $10 billion.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it&amp;rsquo;s worth watching&lt;/strong&gt; Useful for readers who want a quick take on the AI listing race from a capital-markets angle rather than a technical one.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Video&lt;/strong&gt;: &lt;a href="https://www.youtube.com/watch?v=5aXY_ATy_uM" target="_blank"&gt;Watch the video&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>2026-06-13 AI News Brief</title><link>https://tedfactory.com/en/news/ai-news/20260613-ai-brief/</link><pubDate>Sat, 13 Jun 2026 00:00:00 +0900</pubDate><guid>https://tedfactory.com/en/news/ai-news/20260613-ai-brief/</guid><description>&lt;h1 id="2026-06-13-ai-news-brief"&gt;2026-06-13 AI News Brief&lt;a class="anchor" href="#2026-06-13-ai-news-brief"&gt;#&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;Here are the AI technology news items worth checking today, along with shifts in developer tools, open source, infrastructure, and organizations in the AI era. This brief centers on announcements from June 11 to June 13, while also catching up on Anthropic&amp;rsquo;s June 9 launch of Claude Fable 5, which the previous brief did not cover.&lt;/p&gt;
&lt;h2 id="quick-summary"&gt;Quick Summary&lt;a class="anchor" href="#quick-summary"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Anthropic launched Claude Fable 5, the first Mythos-class model made generally available, alongside the restricted Claude Mythos 5, but disabled both models entirely on June 12 under a US government export-control directive.&lt;/li&gt;
&lt;li&gt;OpenAI is acquiring Ona, a company building secure cloud execution for long-running agents, to expand Codex.&lt;/li&gt;
&lt;li&gt;A new partnership lets Oracle Cloud customers spend their existing committed credits on OpenAI models and Codex.&lt;/li&gt;
&lt;li&gt;Google DeepMind and partners opened a funding call of up to $10 million for multi-agent AI safety research.&lt;/li&gt;
&lt;li&gt;Following Google&amp;rsquo;s subscription price cut, reports say OpenAI and Anthropic are weighing token price cuts as the AI price war intensifies.&lt;/li&gt;
&lt;li&gt;Xiaomi released MiMo Code, an open-source coding agent forked from OpenCode, and Simon Willison analyzed Fable 5&amp;rsquo;s &amp;ldquo;relentlessly proactive&amp;rdquo; character.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="top-news"&gt;Top News&lt;a class="anchor" href="#top-news"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="anthropic-suspends-claude-fable-5--mythos-5-access-days-after-launch-under-us-government-directive"&gt;Anthropic Suspends Claude Fable 5 / Mythos 5 Access Days After Launch Under US Government Directive&lt;a class="anchor" href="#anthropic-suspends-claude-fable-5--mythos-5-access-days-after-launch-under-us-government-directive"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; Anthropic launched Claude Fable 5 on June 9. Fable 5 is the first Mythos-class model (a capability tier above the existing Opus class) made available to general users, and it posts the highest performance of any Claude to date across software engineering, knowledge work, vision, and long-horizon tasks. The key design element is its safety classifier architecture: when separate AI systems detect requests related to cybersecurity, biology / chemistry, or model distillation, Claude Opus 4.8 responds instead of Fable 5. But on June 12, citing national security authorities, the US government issued an export-control directive to suspend access to Fable 5 / Mythos 5 for all foreign nationals inside or outside the US (including Anthropic&amp;rsquo;s own foreign-national employees). To comply, Anthropic immediately disabled both models for all customers; other models are unaffected. It also pushed back, arguing that the &amp;ldquo;jailbreak&amp;rdquo; the government cited amounts to already-known, minor vulnerabilities that other public models like GPT-5.5 can find without any bypass.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Just as the launch pattern of &amp;ldquo;a powerful model plus a classifier that routes risky requests to a safer model&amp;rdquo; drew attention, this became the first case of a government effectively recalling a commercial frontier model. It signals that national security and export controls—separate from a model&amp;rsquo;s technical merit—have emerged as variables that decide whether it can be deployed at all.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What to watch&lt;/strong&gt; If you bind core workflows to a single model, work stalls when that model abruptly disappears by external directive, as it did here. Keeping a setup where you can swap models per task matters not just for cost but for availability.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://www.anthropic.com/news/claude-fable-5-mythos-5" target="_blank"&gt;Read the launch announcement&lt;/a&gt;, &lt;a href="https://www.anthropic.com/news/fable-mythos-access" target="_blank"&gt;Read the access-suspension statement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="openai-to-acquire-ona-a-long-running-agent-infrastructure-company"&gt;OpenAI to Acquire Ona, a Long-Running Agent Infrastructure Company&lt;a class="anchor" href="#openai-to-acquire-ona-a-long-running-agent-infrastructure-company"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; OpenAI announced on June 11 that it will acquire Ona, a company building secure cloud execution and orchestration environments where agents can work for hours or days at a stretch. Orchestration here means technology for coordinating multiple agents and tasks. OpenAI plans to integrate the technology into Codex, its coding agent product line, so organizations can deploy long-running agents that are not tied to a single device or active session. The acquisition still requires regulatory approval, and the two companies will operate independently until it closes.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; It shows the center of gravity in the agent race shifting from model capability to execution infrastructure: where agents run, how safely, and for how long. Handing agents multi-day work like running tests, fixing vulnerabilities, or modernizing applications requires isolated persistent environments and ways to review work in progress.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What to watch&lt;/strong&gt; This follows the same thread as Apple&amp;rsquo;s server model isolation and Simon Willison&amp;rsquo;s WASM sandbox covered in the previous brief. The isolation, permission, and persistence design of agent execution environments is becoming a core competitive area of agent-era infrastructure.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://openai.com/index/openai-to-acquire-ona/" target="_blank"&gt;Read OpenAI&amp;rsquo;s announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="openai-models-and-codex-now-purchasable-with-oracle-cloud-credits"&gt;OpenAI Models and Codex Now Purchasable with Oracle Cloud Credits&lt;a class="anchor" href="#openai-models-and-codex-now-purchasable-with-oracle-cloud-credits"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; OpenAI and Oracle announced a partnership on June 10. In the coming weeks, Oracle Cloud Infrastructure (OCI) customers will be able to apply their existing Oracle Universal Credits—prepaid committed credits usable across cloud services—toward OpenAI frontier models and Codex. There is no new model or feature here; what changes is the purchasing path and billing channel.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Large enterprises do not subscribe with a credit card the way individuals do; they adopt software through legal / security approvals and multi-year commitments. Letting them use OpenAI inside an already-approved Oracle contract removes the biggest adoption barrier: new vendor review. The announcement is a reminder that enterprise AI adoption is driven more by procurement paths than by benchmarks.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What to watch&lt;/strong&gt; OpenAI has steadily widened distribution beyond its own channels—AWS Bedrock, Apple Foundation Models, and now OCI. The pattern of model companies borrowing the existing distribution networks of clouds and operating systems is solidifying.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://openai.com/index/openai-on-oracle-cloud/" target="_blank"&gt;Read OpenAI&amp;rsquo;s announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="google-deepmind-opens-10m-funding-call-for-multi-agent-safety-research"&gt;Google DeepMind Opens $10M Funding Call for Multi-Agent Safety Research&lt;a class="anchor" href="#google-deepmind-opens-10m-funding-call-for-multi-agent-safety-research"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; On June 11, Google DeepMind, together with Schmidt Sciences, the UK&amp;rsquo;s ARIA, the Cooperative AI Foundation, and Google.org, opened a funding call for multi-agent safety research. It offers up to $10 million to researchers worldwide studying the new risks—collusion, conflict, cascading failures—that emerge when millions of AI agents interact with each other online. Applications close August 8, with awardees announced in autumn.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; AI safety research so far has focused on making a single model safe; this call addresses the behavior of agent &amp;ldquo;populations.&amp;rdquo; As agents move toward contracting and transacting with each other, system-level risks that single-agent verification cannot catch are becoming real operational problems.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What to watch&lt;/strong&gt; For anyone designing pipelines where multiple agents collaborate, this is a signal that failure modes arising from agent-to-agent interaction deserve separate scrutiny, apart from verifying each agent individually.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://deepmind.google/blog/investing-in-multi-agent-ai-safety-research/" target="_blank"&gt;Read Google DeepMind&amp;rsquo;s announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="the-ai-subscription--token-price-war-heats-up"&gt;The AI Subscription / Token Price War Heats Up&lt;a class="anchor" href="#the-ai-subscription--token-price-war-heats-up"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; On June 8, Google cut the price of its consumer Google AI Plus subscription from $7.99 to $4.99 per month and doubled the included storage to 400 GB. Then on June 11, analyses citing Wall Street Journal reporting said OpenAI and Anthropic—both preparing to go public—are weighing token price cuts to defend their enterprise customers. The backdrop: as major models converge in performance on common enterprise tasks, corporate buyers increasingly see the tools as somewhat interchangeable and are pushing back on costs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Generative AI burns GPU and power on every query, so its marginal costs are not low the way traditional software&amp;rsquo;s are. If price competition becomes structural, the profitability test for model companies—which have committed to massive infrastructure investments—accelerates, right as they head to public markets.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What to watch&lt;/strong&gt; For users, this is a period when model prices and subscription policies change frequently. Keeping a setup where you can swap models per task, rather than binding deeply to one model, preserves your cost leverage.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://sherwood.news/tech/openai-anthropic-google-price-wars-where-no-one-is-making-money/" target="_blank"&gt;Read the Sherwood News analysis&lt;/a&gt;, &lt;a href="https://9to5google.com/2026/06/08/google-ai-plus-price-drop/" target="_blank"&gt;Read the 9to5Google report&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="openai-backs-the-eu-code-of-practice-on-ai-content-transparency"&gt;OpenAI Backs the EU Code of Practice on AI Content Transparency&lt;a class="anchor" href="#openai-backs-the-eu-code-of-practice-on-ai-content-transparency"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What happened?&lt;/strong&gt; On June 11, OpenAI announced its support for the European Commission&amp;rsquo;s Code of Practice on Transparency of AI-Generated Content. The Code is an implementation step of the EU AI Act, setting shared industry standards for labeling AI-generated content and making its provenance verifiable. OpenAI noted it has worked on provenance since 2024, when it began adding C2PA (Content Credentials) metadata to generated images, and that it contributed to drafting the Code.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters&lt;/strong&gt; Labeling AI-generated content is hardening from a recommendation into a regulation-backed standard. This follows the same thread as Google expanding SynthID watermarking to Search / Chrome: for any service that creates or distributes content, handling provenance metadata is gradually becoming a baseline requirement.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What to watch&lt;/strong&gt; If your blog or product uses AI-generated images, it is worth checking in advance which standards their metadata follows and which platforms verify it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://openai.com/index/supporting-eu-trustworthy-ai-ecosystem/" target="_blank"&gt;Read OpenAI&amp;rsquo;s announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="worth-following"&gt;Worth Following&lt;a class="anchor" href="#worth-following"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="xiaomi-releases-mimo-code-an-open-source-coding-agent-forked-from-opencode"&gt;Xiaomi Releases MiMo Code, an Open-Source Coding Agent Forked from OpenCode&lt;a class="anchor" href="#xiaomi-releases-mimo-code-an-open-source-coding-agent-forked-from-opencode"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Key points&lt;/strong&gt; On June 10, Xiaomi released MiMo Code, a terminal AI coding agent, under the MIT license. It is a fork of the open-source agent OpenCode (a fork is a copy of an existing project that is then developed independently), with additions including SQLite-based persistent memory, session checkpoints, and a separate subagent that periodically maintains the memory. Xiaomi&amp;rsquo;s own evaluation claims it beats Claude Code on ultra-long tasks exceeding 200 steps. Besides Xiaomi&amp;rsquo;s free model, it can connect to external models like DeepSeek, Kimi, and GLM. It hit the Hacker News front page right after release, drawing praise along with criticism that telemetry (usage data reporting) is on by default.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it&amp;rsquo;s worth reading&lt;/strong&gt; A pattern is settling in: Anthropic ships a tool, the open-source community answers with OpenCode, and Chinese manufacturers fork that harness to optimize it for their own models. The design choice of separating the working agent from a memory-maintenance agent is an interesting answer to a shared challenge of long-running agents.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What to watch&lt;/strong&gt; The benchmark claims are self-reported and deserve skepticism; if you try it, disabling telemetry and starting with a personal project is the safe path.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://github.com/XiaomiMiMo/MiMo-Code" target="_blank"&gt;View the MiMo Code repository&lt;/a&gt;, &lt;a href="https://venturebeat.com/technology/xiaomis-new-open-source-agentic-ai-coding-harness-mimo-code-beats-claude-code-at-ultra-long-200-step-tasks" target="_blank"&gt;Read the VentureBeat article&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="simon-willison-claude-fable-is-relentlessly-proactive"&gt;Simon Willison: &amp;ldquo;Claude Fable Is Relentlessly Proactive&amp;rdquo;&lt;a class="anchor" href="#simon-willison-claude-fable-is-relentlessly-proactive"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Key points&lt;/strong&gt; On June 11, developer and blogger Simon Willison published his impressions after two days with Claude Fable 5. He describes the model as &amp;ldquo;relentlessly proactive&amp;rdquo;: it deploys every trick it knows to reach its goal and has a strong tendency to fix surrounding problems it was never asked about. He shares a case where, while he was using one of his own libraries, the model spotted bugs in a dependency and fixed them on its own.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it&amp;rsquo;s worth reading&lt;/strong&gt; This is a firsthand record of how a model&amp;rsquo;s &amp;ldquo;character&amp;rdquo; shows up in real use, beyond official benchmarks. A highly proactive model boosts productivity but also raises the risk of unintended changes, making scope containment a new operational challenge.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What to watch&lt;/strong&gt; It illustrates that harness design—defining the boundaries of an agent&amp;rsquo;s work through rules and permissions—matters more as models grow more proactive.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://simonwillison.net/2026/Jun/11/fable-is-relentlessly-proactive/" target="_blank"&gt;Read Simon Willison&amp;rsquo;s post&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="openrl-an-open-source-model-training-api-for-your-own-kubernetes-cluster"&gt;OpenRL, an Open-Source Model Training API for Your Own Kubernetes Cluster&lt;a class="anchor" href="#openrl-an-open-source-model-training-api-for-your-own-kubernetes-cluster"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Key points&lt;/strong&gt; Google&amp;rsquo;s GKE Labs released a research preview of OpenRL, an open-source, self-hosted training API for fine-tuning LLMs on your own Kubernetes cluster. Researchers define datasets, rewards, and training-loop code locally, while the cluster handles the GPU-heavy work, a deliberate separation of roles. It is compatible with Thinking Machines&amp;rsquo; Tinker API and supports LoRA fine-tuning and reinforcement learning workflows.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it&amp;rsquo;s worth reading&lt;/strong&gt; It shows post-training moving down from a managed-service task to something teams run on their own infrastructure for data control and cost optimization. The design of splitting infrastructure engineers and AI researchers along an API boundary is also worth studying.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What to watch&lt;/strong&gt; For teams refining small models on their own data, this adds one more option between managed training services and full self-hosting.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source&lt;/strong&gt;: &lt;a href="https://opensource.googleblog.com/2026/06/introducing-openrl-a-self-hosted-post-training-api-for-fine-tuning-llms.html" target="_blank"&gt;Read the Google Open Source blog post&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="youtube-brief"&gt;YouTube Brief&lt;a class="anchor" href="#youtube-brief"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="introducing-claude-fable-5"&gt;Introducing Claude Fable 5&lt;a class="anchor" href="#introducing-claude-fable-5"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Channel&lt;/strong&gt;: Anthropic&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Key points&lt;/strong&gt; Anthropic&amp;rsquo;s official introduction video for Fable 5. In under two minutes it explains why the previous Mythos-class model could not be released broadly—its ability to find thousands of cybersecurity vulnerabilities—and how the safeguards automatically review high-risk requests and route them to Opus 4.8. Watched alongside the announcement post, it quickly conveys the intent behind the safety classifier architecture.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why watch&lt;/strong&gt; Useful for readers who want the launch context and safety design of Fable 5 in the official presenters&amp;rsquo; own words, in a short format.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Video&lt;/strong&gt;: &lt;a href="https://www.youtube.com/watch?v=Y9Wz2PV404E" target="_blank"&gt;Watch the video&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item></channel></rss>