Refactoroscope provides advanced AST-based duplicate code detection with clone type classification to help identify redundant code in your project.
The duplicate code detection uses Abstract Syntax Tree (AST) analysis to identify similar code patterns while being resilient to superficial differences like variable names, whitespace, and comments.
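To illustrate why AST analysis ignores superficial differences, here is a minimal sketch using Python's standard `ast` module. It is not Refactoroscope's actual implementation; the `_Normalizer` class and `fingerprint` function are hypothetical names introduced only for this example. Parsing already discards comments and whitespace, and the transformer replaces identifiers with a placeholder so that renamed code produces the same structure.

```python
import ast


class _Normalizer(ast.NodeTransformer):
    """Replace identifiers with a placeholder (hypothetical helper).

    Using a single placeholder is a simplification: it also makes
    `x + y` and `x + x` look alike, which a real tool would avoid.
    """

    def visit_Name(self, node):
        return ast.copy_location(ast.Name(id="_", ctx=node.ctx), node)

    def visit_arg(self, node):
        node.arg = "_"
        return node

    def visit_FunctionDef(self, node):
        self.generic_visit(node)  # normalize arguments and body first
        node.name = "_"
        return node


def fingerprint(source: str) -> str:
    """Return a structural dump that ignores names, comments, and layout."""
    tree = _Normalizer().visit(ast.parse(source))
    return ast.dump(tree)  # line/column attributes are excluded by default


original = "def add(x, y):\n    # sum two values\n    return x + y\n"
renamed = "def plus(left,   right):\n    return left + right\n"
different = "def sub(x, y):\n    return x - y\n"

print(fingerprint(original) == fingerprint(renamed))    # same structure
print(fingerprint(original) == fingerprint(different))  # different operator
```

The first comparison matches despite the renamed function, renamed parameters, extra spacing, and comment; the second does not, because the operator node differs.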
The detection classifies code clones by type. The options documented here name four: exact, renamed, modified, and semantic.
To analyze for duplicate code:
uv run refactoroscope duplicates /path/to/your/project
To focus on exact duplicates only:
uv run refactoroscope duplicates /path/to/your/project --type exact
To focus on renamed clones:
uv run refactoroscope duplicates /path/to/your/project --type renamed
To find similar code with a minimum similarity threshold:
uv run refactoroscope duplicates /path/to/your/project --min-similarity 0.8
The duplicate detection works across different files in your project, identifying duplicated code patterns regardless of their location.
Results include a quantitative similarity score for each pair of code blocks, ranging from 0.0 to 1.0; higher values indicate closer matches.
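One way to picture a similarity score in the 0.0 to 1.0 range is to compare the sequences of AST node types in two snippets. The sketch below does this with `difflib.SequenceMatcher` from the standard library. This is an assumed metric chosen for illustration, not necessarily the one Refactoroscope computes, and `node_types` and `similarity` are hypothetical names.

```python
import ast
from difflib import SequenceMatcher


def node_types(source: str) -> list[str]:
    """List the AST node type names of a snippet, in traversal order."""
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]


def similarity(a: str, b: str) -> float:
    """Score two snippets between 0.0 and 1.0 by structural overlap."""
    return SequenceMatcher(None, node_types(a), node_types(b)).ratio()


first = "def f(x):\n    return x + 1\n"
renamed = "def g(y):\n    return y + 1\n"
other = "def h(y):\n    for i in range(y):\n        print(i)\n"

threshold = 0.8  # same value as the --min-similarity example
for label, candidate in [("renamed", renamed), ("other", other)]:
    score = similarity(first, candidate)
    print(label, round(score, 2), "reported" if score >= threshold else "skipped")
```

Because only node types are compared, the renamed snippet scores exactly 1.0, while the structurally different one scores lower and may fall under the threshold.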
Caching and global indexing keep the analysis efficient on large codebases.
Duplicate detection can be configured through the .refactoroscope.yml
configuration file:
duplicates:
  # Minimum similarity threshold (0.0 to 1.0)
  min_similarity: 0.8
  # Clone types to detect
  clone_types:
    - "exact"
    - "renamed"
    - "modified"
    - "semantic"
  # Whether to include comments in comparison
  include_comments: false
  # Whether to include docstrings in comparison
  include_docstrings: false
  # Minimum number of lines for a clone
  min_lines: 3
  # Maximum number of lines for a clone
  max_lines: 100
  # Patterns to ignore
  ignore_patterns:
    - "*.generated.*"
    - "*_pb2.py"
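To make the effect of `min_lines`, `max_lines`, and `ignore_patterns` concrete, the following sketch applies them as filters to candidate clones. It mirrors the values above in a plain dict rather than parsing the YAML file. How Refactoroscope actually matches patterns (file names versus full paths) and whether the line bounds are inclusive are assumptions here, and `is_ignored` and `keep_clone` are hypothetical helpers.

```python
from fnmatch import fnmatch

# Mirrors part of the configuration above as a plain dict (assumed shape).
config = {
    "min_lines": 3,
    "max_lines": 100,
    "ignore_patterns": ["*.generated.*", "*_pb2.py"],
}


def is_ignored(filename: str, patterns: list[str]) -> bool:
    """Return True if the file name matches any glob-style pattern."""
    return any(fnmatch(filename, pattern) for pattern in patterns)


def keep_clone(filename: str, line_count: int, cfg: dict) -> bool:
    """Decide whether a candidate clone passes the configured filters."""
    if is_ignored(filename, cfg["ignore_patterns"]):
        return False
    # Assumption: both bounds are inclusive.
    return cfg["min_lines"] <= line_count <= cfg["max_lines"]


candidates = [
    ("service.py", 10),
    ("user_pb2.py", 10),
    ("api.generated.ts", 10),
    ("service.py", 2),
]
for filename, line_count in candidates:
    print(filename, line_count, keep_clone(filename, line_count, config))
```

Only the first candidate passes: the next two match an ignore pattern, and the last falls below `min_lines`.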
The duplicate detection output includes: