High-Precision Static Analysis & Performance Intelligence: a research-grade, zero-false-positive engine for Python code that targets "silent performance killers" and produces AI-native findings for autonomous refactoring agents.
Standard linters are built on a philosophy of high recall: catch everything, accept some noise. This is fine for style enforcement, but catastrophic for an AI agent using findings to autonomously refactor production code. A single false positive that causes an AI agent to "fix" code incorrectly destroys trust in the entire system.
OptiScan's philosophy is the inverse: prefer a missed true positive over a single false positive. Every rule is built with multi-layer validation that conservatively flags only when all criteria are definitively met.
| Version | Engine | False Positive Rate | Key Achievement |
|---|---|---|---|
| V1.0 | Basic AST pattern matching | 100% on string rules | Conceptual prototype; proved the problem space |
| V2.0 | LibCST (Concrete Syntax Tree) | 0% | Three-Layer Filter System; "Conservative Flagging" philosophy; production-ready for CI/CD |
| V3.0 (Current) | LibCST + metadata providers | 0% | Architectural & Bottleneck rule tiers; AI-perfect JSON schema; full contextual integrity (5-10 lines of surrounding code) |
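As an illustration of the "contextual integrity" entry for V3.0, the sketch below attaches a window of surrounding source lines to a finding and serialises it as JSON. The function name `build_finding`, the field names, and the `radius` parameter are assumptions made for this example; the actual OptiScan schema is not reproduced here.

```python
import json


def build_finding(source, lineno, rule_id, message, radius=5):
    """Return a JSON-serialisable finding with surrounding source lines.

    Hypothetical shape: the field names are illustrative, not OptiScan's schema.
    `radius` is the number of lines kept on each side of the flagged line.
    """
    lines = source.splitlines()
    start = max(1, lineno - radius)          # clamp at the top of the file
    end = min(len(lines), lineno + radius)   # clamp at the bottom of the file
    return {
        "rule_id": rule_id,
        "line": lineno,
        "message": message,
        "context": [
            {"line": n, "text": lines[n - 1]} for n in range(start, end + 1)
        ],
    }


if __name__ == "__main__":
    code = 'result = ""\nfor item in items:\n    result += item\n'
    finding = build_finding(code, 3, "PY-PERF-001", "string concatenation in loop")
    print(json.dumps(finding, indent=2))
```

Carrying the context inside the finding means a consuming agent would not need to reopen the file to see the code around the flagged line.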
The challenge that broke V1: distinguishing `result_str += item` (inefficient, O(n²)) from `count += 1` (perfectly fine, a numeric increment). The Three-Layer Filter for PY-PERF-001:
1. **Name heuristics.** A dictionary of 30+ counter variable name patterns (`ct`, `cnt`, `index`, `total`, `num`, `i`, `j`, `k`, `count`, …) is checked against the target variable. If the name matches, the finding is suppressed.
2. **Initialisation tracking.** The visitor tracks the first assignment of each variable within its scope. If the variable is initialised as `0` or any integer, it is marked numeric (suppress). If it is initialised as `""` or any string literal, it is marked string (candidate).
3. **Right-hand-side inspection.** The right-hand side of the `+=` is inspected. String literals, f-strings, and `str()` calls confirm the finding. Numeric literals, variable names matching counter heuristics, or arithmetic expressions suppress it.
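The three layers can be sketched with the standard-library `ast` module. This is a simplified illustration, not OptiScan's implementation: the engine described here is built on LibCST, and `COUNTER_NAMES`, `StringConcatVisitor`, and `find_string_concat` are hypothetical names. The sketch also assumes that a plain right-hand-side variable that is not a counter name (such as `item`) does not suppress the finding.

```python
import ast

# Hypothetical subset of the counter-name dictionary (the text mentions 30+).
COUNTER_NAMES = {"ct", "cnt", "index", "total", "num", "i", "j", "k", "count"}


def _is_string_expr(node):
    """True for string literals, f-strings, and str(...) calls."""
    if isinstance(node, ast.Constant) and isinstance(node.value, str):
        return True
    if isinstance(node, ast.JoinedStr):  # f-string
        return True
    return (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            and node.func.id == "str")


def _rhs_confirms(node):
    """Layer 3: inspect the right-hand side of the += statement."""
    if _is_string_expr(node):
        return True
    if isinstance(node, ast.Name):
        # Assumption: a non-counter name neither confirms nor suppresses,
        # so the decision rests on layers 1 and 2.
        return node.id.lower() not in COUNTER_NAMES
    return False  # numeric literals, arithmetic, anything else: suppress


class StringConcatVisitor(ast.NodeVisitor):
    def __init__(self):
        self.first_init = {}  # name -> "string" | "numeric" | "unknown"
        self.loop_depth = 0
        self.findings = []    # (line number, variable name)

    def visit_FunctionDef(self, node):
        # Each function gets a fresh scope for first-assignment tracking.
        saved = (self.first_init, self.loop_depth)
        self.first_init, self.loop_depth = {}, 0
        self.generic_visit(node)
        self.first_init, self.loop_depth = saved

    visit_AsyncFunctionDef = visit_FunctionDef

    def _visit_loop(self, node):
        self.loop_depth += 1
        self.generic_visit(node)
        self.loop_depth -= 1

    visit_For = visit_While = _visit_loop

    def visit_Assign(self, node):
        # Layer 2 bookkeeping: remember only the FIRST assignment per name.
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id not in self.first_init:
                value = node.value
                kind = "unknown"
                if isinstance(value, ast.Constant):
                    if isinstance(value.value, str):
                        kind = "string"
                    elif isinstance(value.value, int):
                        kind = "numeric"
                self.first_init[target.id] = kind
        self.generic_visit(node)

    def visit_AugAssign(self, node):
        target = node.target
        if (self.loop_depth > 0
                and isinstance(node.op, ast.Add)
                and isinstance(target, ast.Name)
                and target.id.lower() not in COUNTER_NAMES      # layer 1
                and self.first_init.get(target.id) == "string"  # layer 2
                and _rhs_confirms(node.value)):                 # layer 3
            self.findings.append((node.lineno, target.id))
        self.generic_visit(node)


def find_string_concat(source):
    visitor = StringConcatVisitor()
    visitor.visit(ast.parse(source))
    return visitor.findings


SAMPLE = '''
result_str = ""
count = 0
for item in ["a", "b"]:
    result_str += item
    count += 1
'''

if __name__ == "__main__":
    print(find_string_concat(SAMPLE))
```

A finding is emitted only when all three layers agree, so a variable whose first assignment is unknown (for example, a function parameter) is never flagged. That trades recall for precision, in line with the conservative-flagging philosophy.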
| Rule ID | Category | Pattern Detected | Impact |
|---|---|---|---|
| PY-BOTTLENECK-201 | 🔴 Critical | N+1 Query Pattern (data-loading calls inside loops) | Eliminates 100 to 10,000+ redundant I/O operations |
| PY-BOTTLENECK-202 | 🔴 Critical | Nested Loops: O(n²) complexity | Reduces to O(n) or O(n log n) |
| PY-BOTTLENECK-203 | 🔴 Critical | Tight Loop Hotspots (regex, complex math in loops) | Enables vectorisation or loop-invariant hoisting |
| PY-PERF-001 | 🟡 Auto-Fix | String Concatenation in loops (`+=` on `str`) | O(n²) → O(n) via `str.join()` |
| PY-PERF-002 | 🟡 Auto-Fix | Inefficient Membership (`x in dict.keys()`) | Avoids view object creation and method lookup |
| PY-PERF-003 | 🟡 Auto-Fix | Manual `.append()` loops (convert to list comprehension) | Bytecode-level optimisation; only if loop is "pure" |
| PY-ARCH-101 | 🟠 Manual | Repeated Computation (method calls as subscripts in loops) | 10-100x reduction via caching or precomputation |
| PY-ARCH-102 | 🟠 Manual | Inefficient List Tests (`x in large_list`) | O(n) → O(1) by converting to set |
| PY-ARCH-103 | 🟠 Manual | Unbatched I/O (`open()` or `write()` inside loops) | 10-1,000x reduction in expensive system calls |
| PY-ARCH-104 | 🟠 Manual | Large List-Building (should return generators) | 90%+ memory reduction for large datasets |
| PY-ARCH-105 | 🟠 Manual | Repeated Attribute Access (`obj.attr` in tight loops) | 2-5x speedup by caching as local variable |
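To make a few of the rows above concrete, here is a minimal before/after sketch for the three Auto-Fix rules and for PY-ARCH-102. The function names are hypothetical and the rewrites are hand-written illustrations of the patterns, not output produced by OptiScan.

```python
def concat_before(items):
    result = ""
    for item in items:
        result += str(item)  # PY-PERF-001: string += inside a loop
    return result


def concat_after(items):
    return "".join(str(item) for item in items)  # single join


def has_key_before(mapping, key):
    return key in mapping.keys()  # PY-PERF-002: redundant .keys() call


def has_key_after(mapping, key):
    return key in mapping  # membership test directly on the dict


def squares_before(values):
    out = []
    for v in values:
        out.append(v * v)  # PY-PERF-003: manual append loop, no side effects
    return out


def squares_after(values):
    return [v * v for v in values]  # list comprehension


def known_ids_before(candidates, allowed_list):
    # PY-ARCH-102: each `in` scans the list, O(n) per lookup
    return [c for c in candidates if c in allowed_list]


def known_ids_after(candidates, allowed_list):
    allowed = set(allowed_list)  # built once; average O(1) per lookup
    return [c for c in candidates if c in allowed]


if __name__ == "__main__":
    data = [1, 2, 3]
    assert concat_before(data) == concat_after(data)
    assert squares_before(data) == squares_after(data)
    assert known_ids_before([2, 9], data) == known_ids_after([2, 9], data)
```

The set conversion in `known_ids_after` assumes the list elements are hashable, which is one reason a rule like PY-ARCH-102 is plausibly left to manual review rather than auto-fixed.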