| Candidates | Unique | GitHub | HN/Web | Papers |
|---|---|---|---|---|
| 386 | 327 | 91 | 171 | 40 |
1. Executive Snapshot
| Signal | Why it matters | Action |
|---|---|---|
| 386 candidates / 327 unique | Runtime+eval > model news | NEXA harness |
| GitHub 91 repos | OSS is fragmented | 3-repo benchmark |
| HN/Web 171 items | Security/context concern | SYNCA gate |
| Paper 40 items | Enterprise eval gap | Fabbi eval suite |
| YouTube 25 videos | Practitioner education | Track KOLs, not hype |
2. Trend Radar
Hot: agent harness | Hot: sandbox | Emerging: context memory | Watch: CLI/IDE convergence
- Hot now: eval/runtime governance, 3+ source groups.
- Noise: demo-only agent videos, engagement N/A.
- Watchlist: Terminal-Bench/SWE-bench variants.
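The report does not say how a topic earns the "hot" label. One plausible reading of "3+ source groups" is that a topic counts as hot once it shows up in at least three distinct source groups (for example HN, GitHub, and papers). A minimal sketch of that rule follows; the function name, the `(source, topic)` pair format, and the threshold default are all assumptions made for illustration:

```python
from collections import defaultdict


def hot_topics(items, min_groups=3):
    """Return topics seen in at least `min_groups` distinct source groups.

    `items` is an iterable of (source_group, topic) pairs (hypothetical format).
    """
    groups_by_topic = defaultdict(set)
    for source, topic in items:
        groups_by_topic[topic].add(source)
    return sorted(t for t, groups in groups_by_topic.items() if len(groups) >= min_groups)


# Illustrative input only, not data from the scan.
sample = [
    ("hn", "agent harness"), ("github", "agent harness"), ("paper", "agent harness"),
    ("hn", "sandbox"), ("github", "sandbox"),
]
print(hot_topics(sample))
```

Counting distinct groups rather than raw mentions keeps a single noisy source from promoting a topic on its own.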
3. KOL/OG Feed Watch
| Platform | Author/KOL | Timestamp | Engagement | Title | Why CTO cares |
|---|---|---|---|---|---|
| hn | vinhnx | 2026-05-30T03:07:25Z | 10 | Show HN: VT Code – open-source terminal coding agent in Rust | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | agentseal | 2026-05-29T21:37:59Z | 2 | Where AI coding spend goes: 48% code, 40% thinking | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | robert_dds | 2026-05-29T18:28:49Z | 2 | DDS Vibe Academy – 47 free AI coding masterclasses, built by AI agents | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | matt_d | 2026-05-29T18:07:26Z | 2 | MIT EECS/CSAIL Agentic Coding in Practice Seminar Series | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | nike-17 | 2026-05-29T17:25:16Z | 3 | Show HN: Sverklo – repo memory for coding agents | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | swanros | 2026-05-29T17:16:19Z | 2 | My "blocked-by-default" approach to working with coding agents | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | Brajeshwar | 2026-05-29T16:36:21Z | 6 | Nesbitt: Protestware for Coding Agents | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | jimsojim | 2026-05-29T16:13:45Z | 12 | Ask HN: Any advice on how to learn good software architecture practices? | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | ananandreas | 2026-05-29T14:35:42Z | 5 | Show HN: OpenHive – AI agents share solutions so other agents dont re-solve them | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | joozio | 2026-05-29T07:05:31Z | 58 | Undisclosed addition in jqwik instructed AI coding agents to delete app output | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | sparkleMing | 2026-05-29T07:00:42Z | 1 | Show HN: SharkBay – a local macOS workbench for coding-agent CLIs | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | patriceckhart | 2026-05-29T05:48:21Z | 78 | Show HN: Zot – Yet another coding agent harness | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | peterneyra | 2026-05-29T01:18:58Z | 2 | Dis Dat – Loom for AI coding agents | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | aanet | 2026-05-28T22:46:14Z | 1 | Clawd-on-Desk: a pixel desktop pet watching your AI coding agents | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | SVI | 2026-05-28T21:03:24Z | 59 | Protestware for coding agents | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | akashi_dev | 2026-05-28T20:44:37Z | 2 | Show HN: Rig – Local-first code graph for coding agents, in one npx command | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | nkko | 2026-05-28T18:54:47Z | 2 | Coding agent can read your .env file | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | juanre | 2026-05-28T18:30:22Z | 3 | Show HN: Bootstrap a team of coding agents from a template, OSS | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | ltononro | 2026-05-28T18:18:21Z | 3 | Show HN: Notification when coding agent is done, free | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | ramonga | 2026-05-28T16:11:13Z | 3 | Show HN: Free open source coding models in Slack | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | spinchange | 2026-05-30T02:04:12Z | 1 | Show HN: A Claude Code skill that scopes problems like Peter Naur | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | vbutsomesayw | 2026-05-27T04:01:44Z | 3 | Bill Gates AI on AI (one month later) | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | armcat | 2026-05-24T19:37:43Z | 3 | Show HN: Simple Sprite Sheet Generation | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | jeroen_stulen | 2026-05-24T10:07:13Z | 3 | Show HN: My first app, artisanally vibe-coded in 4 months | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | xendo | 2026-05-23T11:13:35Z | 3 | Zero – Programming Language for Agents | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | goodroot | 2026-05-21T14:59:15Z | 2 | Show HN: opub, donated compute for open-source | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | afshinmeh | 2026-05-19T20:19:46Z | 3 | Zero: The Programming Language for Agents | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | amitbidlan | 2026-05-19T17:40:39Z | 1 | Show HN: Korveo – a local firewall for AI agents | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | Marius77 | 2026-05-19T14:09:50Z | 20 | The Programming Language for Agents | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | steveharing1 | 2026-05-17T20:25:40Z | 5 | Vercel's Zero: A Programming Language Designed for AI Agents | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | alex_x | 2026-05-17T14:40:22Z | 1 | The Programming Language for Agents | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | mindwarp | 2026-05-12T13:24:00Z | 1 | Show HN: Telegram/Slack bridge for local Codex agents | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | errata_dev | 2026-05-04T17:27:52Z | 1 | Show HN: [inerrata] – Collective and Causal Knowledge Layer for Coding Agents | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | georgestrakhov | 2026-05-03T18:51:13Z | 2 | AOP: Agent-Oriented Programming | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | is-it-art | 2026-04-27T21:15:46Z | 2 | Show HN: Is it art? An art project for AI agents | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | poshmosh | 2026-04-27T11:34:19Z | 4 | Show HN: Slerp.audio – VDJ with WebGL2 and real-time DSP | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | NickMiladinov | 2026-04-23T19:47:17Z | 7 | Show HN: Chestnut – The antidote to AI-induced skill atrophy | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | cobblr_mosaic | 2026-05-26T17:38:55Z | 3 | Agentic Harness Engineering | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | ramayac | 2026-05-20T04:31:50Z | 2 | Show HN: GoPOSIX – a Go-native POSIX userland, ~97% BusyBox-compatible | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | redbell | 2026-05-18T12:17:04Z | 159 | Learn Harness Engineering | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | Garbage | 2026-05-16T04:59:11Z | 3 | Agent Harness Engineering | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | Lunar5227 | 2026-05-15T05:42:46Z | 1 | Agentic SDLC: How OpenSearch accelerates engineering with its own engine | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | sahil-shubham | 2026-05-14T10:44:17Z | 3 | Show HN: Bhatti – self-hosted runtime for your harness engineering | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | gruyaume | 2026-05-12T14:37:45Z | 1 | Implicit Knowledge Is a Liability | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | pretext | 2026-05-10T05:19:22Z | 8 | Agent Harness Engineering | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | straydusk | 2026-05-08T22:57:31Z | 1 | Ask HN: Is agent-driven QA a thing? | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | nbstme | 2026-05-03T19:03:04Z | 2 | Why does my harness forget me? Agent engineering | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | kumulo | 2026-04-30T11:31:33Z | 1 | Harness engineering: leveraging Codex in an agent-first world | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | anophelon | 2026-04-29T07:30:11Z | 14 | Why Codex works better than Claude Code for my production monolith | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | ElFitz | 2026-04-27T10:59:16Z | 6 | Ask HN: What does your agentic software dark factory look like? | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | kiyanwang | 2026-04-27T05:50:28Z | 1 | Agent Harness Engineering | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | alex000kim | 2026-04-26T18:13:46Z | 7 | You've been doing harness engineering all along | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | jdw64 | 2026-04-19T08:42:37Z | 10 | Ask HN: May be a basic question, but how can I use AI well? | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | alexblackwell_ | 2026-04-16T15:19:54Z | 100 | Launch HN: Kampala (YC W26) – Reverse-Engineer Apps into APIs | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | geopsist | 2026-05-28T12:39:46Z | 6 | We Benchmarked Claude Code, Codex, Semgrep, CodeQL, Trent on 28 CWE-Bench CVEs | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | fittingopposite | 2026-05-28T05:05:59Z | 2 | Mini-SWE-agent scores up to 74% on SWE-bench in 100 lines of Python code | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | kimjune01 | 2026-05-24T18:03:28Z | 2 | Show HN: 97% on SWE-bench Verified with subscription-token agents | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | Sushrutkm | 2026-05-19T10:02:03Z | 2 | Bito's AI Architect Boosts Claude Opus's task success rate by 35% | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | azurewraith | 2026-05-12T14:24:55Z | 126 | Show HN: Statewright – Visual state machines that make AI agents reliable | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | lieret | 2026-05-05T15:10:41Z | 24 | Show HN: New Benchmark from SWE-bench team is 0% solved | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | Philpax | 2026-05-02T21:35:54Z | 2 | talkie-coder: From 1930 to SWE-bench | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | jryio | 2026-04-29T19:16:48Z | 2 | Anthropic's Argument for Mythos SWE-bench improvement contains a fatal error | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | kmdupree | 2026-04-26T13:58:13Z | 343 | SWE-bench Verified no longer measures frontier coding capabilities | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | george_ciobanu | 2026-04-24T21:34:31Z | 10 | Show HN: Codex context bloat? 87% avg reduction on SWE-bench Verified traces | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | nicola_alessi | 2026-04-16T20:19:18Z | 1 | Ask HN: Opus 4.7 – is anyone measuring the real token cost on agentic tasks? | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | stared | 2026-04-14T08:32:45Z | 1 | Compare harnesses not models: Blitzy vs. GPT-5.4 on SWE-Bench Pro | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | sriharis | 2026-04-13T13:14:44Z | 3 | Checking my model vibes against SWE-Bench Pro | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | chenglin97 | 2026-04-10T22:48:29Z | 4 | SWE-Bench Verified Leaderboard March 2026 – Independent vs. Self-Reported Scores | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | raghavchamadiya | 2026-04-06T20:15:26Z | 1 | Show HN: Repowise – Codebase intelligence for AI coding agents (open source) | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | asfsf23423 | 2026-03-29T21:47:54Z | 6 | SWE-bench will hit 90% this year | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | stared | 2026-03-25T14:24:43Z | 4 | Blitzy Scores a Record 66.5% on SWE-Bench Pro | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | neversettles | 2026-05-03T03:40:04Z | 1 | The Terminal Bench 3.0 community is looking for task contributors | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | gk1 | 2026-04-29T18:16:23Z | 4 | ForgeCode: Top open source coding agent in Terminal-Bench 2.0 | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | ubermon | 2026-04-28T19:11:57Z | 6 | Open-weight 27B hits 38% on Terminal-Bench 2.0 (Opus 4.1 hit 38% in Aug 2025) | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | GodelNumbering | 2026-04-27T12:35:55Z | 393 | Show HN: OSS Agent I built topped the TerminalBench on Gemini-3-flash-preview | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | neversupervised | 2026-04-15T00:42:30Z | 6 | Show HN: Terminal-Wrench, a dataset of 331 realistic hackable environments | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | jackykwok | 2026-04-14T20:27:39Z | 1 | A simple test-time method that beats Claude Mythos on Terminal-Bench | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | _nhynes | 2026-04-13T07:48:11Z | 1 | Show HN: Amber, a capability-based runtime/compiler for agent benchmarks | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | joozio | 2026-04-01T12:59:36Z | 4 | Claude Code ranks 39th on terminal bench. The leaked source shows why | Relevant to agentic SDLC/eval/context; score 60-82. |
| hn | bcollins34 | 2026-03-31T19:07:11Z | 4 | Show HN: Wozcode – double Claude Code output | Relevant to agentic SDLC/eval/context; score 60-82. |
4. Repo Watch
| Repo | Metric | Updated | Fabbi move |
|---|---|---|---|
| genusercillaindependentclause781/gstack | 0 stars | 2026-05-30T08:30:54Z | Trial if it has sandbox/logging/test hooks. |
| semionkuksov23/personal-workspace-os | 1 star | 2026-05-30T08:30:52Z | Trial if it has sandbox/logging/test hooks. |
| piqueriastrongbreeze520/claude-and-codex-website | 1 star | 2026-05-30T08:30:48Z | Trial if it has sandbox/logging/test hooks. |
| nolte/claude-shared | 0 stars | 2026-05-30T08:30:47Z | Trial if it has sandbox/logging/test hooks. |
| MyHeavenDyf/UXAI | 0 stars | 2026-05-30T08:30:46Z | Trial if it has sandbox/logging/test hooks. |
| Machineaccessible-ochre867/agenthub | 1 star | 2026-05-30T08:30:39Z | Trial if it has sandbox/logging/test hooks. |
| Acromegaliacanaliculus452/Swift-Testing-Agent-Skill | 0 stars | 2026-05-30T08:30:34Z | Trial if it has sandbox/logging/test hooks. |
| entrepreneurial-cabinetminister913/harness-engineering | 0 stars | 2026-05-30T08:30:34Z | Trial if it has sandbox/logging/test hooks. |
| jeongmk522-netizen/agentlas-desktop | 1 star | 2026-05-30T08:30:48Z | Trial if it has sandbox/logging/test hooks. |
| linny006/agent-eval-harness | 0 stars | 2026-05-30T08:30:29Z | Trial if it has sandbox/logging/test hooks. |
| elgrhy/gx | 2 stars | 2026-05-30T08:30:54Z | Trial if it has sandbox/logging/test hooks. |
| lorenzocl3940/gsd-2 | 0 stars | 2026-05-30T08:29:55Z | Trial if it has sandbox/logging/test hooks. |
| intangible-sidalceamalviflora302/engram | 0 stars | 2026-05-30T08:28:35Z | Trial if it has sandbox/logging/test hooks. |
| laoxs2002/genai-agentes | 0 stars | 2026-05-30T08:27:12Z | Trial if it has sandbox/logging/test hooks. |
| aditxver/engram | 2 stars | 2026-05-30T08:25:58Z | Trial if it has sandbox/logging/test hooks. |
5. Paper / Benchmark Watch
6. Impact Coverage
| Domain | Now 0-2w | Next 1-2m | Later 3-6m | Decision |
|---|---|---|---|---|
| FARE | Context/codebase map eval | Repo memory PoC | Knowledge layer | trial |
| NEXA | Harness for coding agent | Sandbox executor | Multi-agent orchestration | adopt |
| SYNCA | Quality/risk gates | Human-in-loop audit | Compliance dashboard | adopt |
| DOMUS | Monitor | AI ops assistant | Workflow automation | watch |
| Japan/VN/Global | Enterprise caution | Pilot bundles | Managed AI-SDLC service | trial |
7. CTO Evaluation Matrix
| Top signal | Thesis | Evidence | Counter-signal | Fabbi implication | Confidence | Decision | Next validation |
|---|---|---|---|---|---|---|---|
| Harness-first coding agents | Agent value depends on the eval loop | 386 candidates, 91 GitHub | X/FB N/A | NEXA+SYNCA differentiation | 78% | adopt | Run 20-task benchmark |
| CLI/IDE convergence | Workflow moved to terminal+IDE | 25 YT, 171 HN | Video metadata partial | Dev enablement package | 68% | trial | Measure review time |
| Context engineering | Repo understanding bottleneck | 40 paper refs | No universal benchmark | FARE moat | 72% | trial | Codebase Q&A eval |
8. CTO Recommendations
| Action | ROI/time-saving | Risk | Owner | TTV | Validation |
|---|---|---|---|---|---|
| Build NEXA Agent Harness v0: 20 tasks, cost/latency/pass@1/security log. | 18-28% | 2/5 | AI Platform Lead | 2 weeks | Baseline vs agent |
| Add SYNCA HITL gate: risk class, diff summary, rollback checklist. | 12-20% | 2/5 | QA/Governance Lead | 10 days | Defect escape rate |
| Run FARE context-memory PoC on 2 legacy repos. | 15-25% | 3/5 | Solution Architect | 3 weeks | Onboarding time |
| Create Japan/VN packaged AI-SDLC pilot offer. | 8-15% | 3/5 | Delivery Director | 4 weeks | 2 paid pilots / 60 days |
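The harness recommendation names four things to record per task: pass@1, cost, latency, and a security log. A minimal sketch of how one benchmark run could be summarized is given here. The `TaskResult` record, its field names, and `summarize` are hypothetical and not an existing NEXA interface; pass@1 is taken to mean the share of tasks solved on a single first attempt.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class TaskResult:
    """One benchmark task attempted once by the agent (hypothetical schema)."""
    task_id: str
    passed: bool        # did the first attempt pass the task's checks?
    cost_usd: float     # spend attributed to this attempt
    latency_s: float    # wall-clock seconds for this attempt
    security_events: list[str] = field(default_factory=list)  # e.g. ".env read"


def summarize(results: list[TaskResult]) -> dict:
    """Aggregate pass@1, mean cost, mean latency, and security findings."""
    if not results:
        raise ValueError("no task results to summarize")
    return {
        "tasks": len(results),
        "pass_at_1": sum(r.passed for r in results) / len(results),
        "mean_cost_usd": mean(r.cost_usd for r in results),
        "mean_latency_s": mean(r.latency_s for r in results),
        "tasks_with_security_events": sum(1 for r in results if r.security_events),
    }


# Illustrative values only, not measurements.
run = [
    TaskResult("t1", True, 0.25, 30.0),
    TaskResult("t2", False, 0.75, 50.0, [".env read"]),
]
print(summarize(run))
```

Running the same summary once for the human baseline and once for the agent would give the "Baseline vs agent" comparison listed in the Validation column.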
9. Data Quality / Scan Health
Scanned 386 candidates; 327 unique after deduplication. Counts: {'hn': 171, 'youtube': 25, 'github': 91, 'paper': 40}. X direct/Facebook public: N/A (no authenticated API or usable public links in cron); confidence impact -12%. Reddit JSON returned 0 usable items; compensated by HN/GitHub/papers/YouTube. Status: PARTIAL-PASS.
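The scan-health figures can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers reported in this section; the variable names are illustrative.

```python
# Per-source counts as reported in the scan-health line.
counts = {"hn": 171, "youtube": 25, "github": 91, "paper": 40}
candidates = 386  # total items scanned
unique = 327      # items remaining after deduplication

# The per-source counts add up to the unique total, so they are post-dedup figures.
assert sum(counts.values()) == unique

duplicates_removed = candidates - unique
dedup_rate = duplicates_removed / candidates
print(f"{duplicates_removed} duplicates removed ({dedup_rate:.1%} of candidates)")
# prints: 59 duplicates removed (15.3% of candidates)
```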