The csharp_analyzer module is a language-specific analyzer responsible for parsing and extracting code components and their relationships from C# source files. It is part of the larger CodeWiki dependency analysis system and enables automated documentation generation by understanding C# code structure, class hierarchies, and component dependencies.
- Parse C# syntax trees using Tree-sitter parser to identify code components
- Extract type definitions including classes, interfaces, structs, enums, records, and delegates
- Analyze dependencies between components based on inheritance, field types, property types, and method parameters
- Generate component metadata for documentation and visualization
- Support repository-wide analysis by processing multiple C# files consistently
graph TB
subgraph "Language Analyzers"
CSH["C# Analyzer<br/>TreeSitterCSharpAnalyzer"]
PY["Python Analyzer<br/>PythonASTAnalyzer"]
JS["JavaScript Analyzer<br/>TreeSitterJSAnalyzer"]
end
subgraph "Dependency Analysis Services"
DP["DependencyParser"]
AS["AnalysisService"]
end
subgraph "Data Models"
NODE["Node Model"]
CR["CallRelationship Model"]
end
subgraph "Documentation Pipeline"
DG["DocumentationGenerator"]
HTML["HTMLGenerator"]
end
CSH -->|analyzes| NODE
CSH -->|creates| CR
DP -->|uses| CSH
AS -->|orchestrates| DP
NODE -->|consumed by| DG
CR -->|consumed by| DG
DG -->|generates| HTML
style CSH fill:#4A90E2,color:#fff
style NODE fill:#50C878,color:#fff
style CR fill:#50C878,color:#fff
style DG fill:#FF6B6B,color:#fff
graph TD
A["TreeSitterCSharpAnalyzer<br/>__init__"]
B["_analyze<br/>Orchestrator"]
C["_extract_nodes<br/>Type Definitions"]
D["_extract_relationships<br/>Dependencies"]
E["Helper Methods<br/>Path, ID, Type Mgmt"]
F["Node & CallRelationship<br/>Data Objects"]
A -->|calls| B
B -->|calls| C
B -->|calls| D
C -->|creates| F
D -->|creates| F
C -->|uses| E
D -->|uses| E
style A fill:#4A90E2,color:#fff
style B fill:#4A90E2,color:#fff
style C fill:#4A90E2,color:#fff
style D fill:#4A90E2,color:#fff
style F fill:#50C878,color:#fff
style E fill:#FFD700,color:#000
Purpose: Main analyzer class that parses C# files and extracts code components and relationships.
Key Attributes:
file_path(Path): Path to the C# file being analyzedcontent(str): Raw content of the C# filerepo_path(str): Repository root path for relative path calculationnodes(List[Node]): Extracted code componentscall_relationships(List[CallRelationship]): Dependencies between components
Key Methods:
Initializes the analyzer and triggers analysis.
Main orchestrator method that:
- Creates a Tree-sitter parser with C# language support
- Parses the file content into an AST
- Extracts top-level nodes (type definitions)
- Extracts relationships between nodes
Recursively traverses the AST and extracts type definitions:
- Class declarations (including abstract and static variants)
- Interface declarations
- Struct declarations
- Enum declarations
- Record declarations
- Delegate declarations
Each extracted node is represented as a Node object with metadata including:
- Component ID (file::typename)
- Display name and type
- Source code snippet
- Line number range
- Docstring information
Recursively analyzes AST nodes to identify dependencies:
Inheritance relationships:
- Extracts base class/interface from
base_listnodes - Creates CallRelationship with
is_resolved=True
Type dependencies:
- Properties: Analyzes property types
- Fields: Analyzes field types
- Method parameters: Analyzes parameter types
Dependencies are marked as is_resolved=False unless the type is a known top-level class.
Filters out C# primitive and common built-in types:
- Primitives:
bool,byte,int,long,string,object, etc. - Built-in types:
List,Dictionary,Task,DateTime, etc.
This prevents unnecessary dependency tracking for system types.
Traverses parent nodes to find the class/struct that contains a given AST node, enabling proper relationship attribution.
Generates unique component identifiers in the format: relative_path::component_name
Converts file path to module path by removing file extension and replacing separators with dots.
graph LR
A["C# File Input<br/>file_path + content"]
B["TreeSitterCSharpAnalyzer<br/>Initialize"]
C["Tree-sitter Parser"]
D["AST Root Node"]
E["_extract_nodes"]
F["Type Definitions"]
G["Node Objects"]
H["_extract_relationships"]
I["Dependencies"]
J["CallRelationship Objects"]
K["Analysis Results"]
A --> B
B --> C
C --> D
D --> E
D --> H
E --> F
F --> G
H --> I
I --> J
G --> K
J --> K
style A fill:#E8F4F8,color:#000
style B fill:#4A90E2,color:#fff
style C fill:#FFD700,color:#000
style G fill:#50C878,color:#fff
style J fill:#50C878,color:#fff
style K fill:#FF6B6B,color:#fff
graph LR
A["AST Node"]
B["Check Node Type"]
C1["Class"]
C2["Interface"]
C3["Struct"]
C4["Enum"]
C5["Record"]
C6["Delegate"]
D["Get Component Name"]
E["Create Node Object"]
F["Extract Metadata"]
A --> B
B --> C1
B --> C2
B --> C3
B --> C4
B --> C5
B --> C6
C1 --> D
C2 --> D
C3 --> D
C4 --> D
C5 --> D
C6 --> D
D --> E
E --> F
style A fill:#E8F4F8,color:#000
style B fill:#FFD700,color:#000
style E fill:#50C878,color:#fff
style F fill:#50C878,color:#fff
graph LR
A["AST Node"]
B["Analyze Type"]
C1["Inheritance"]
C2["Properties"]
C3["Fields"]
C4["Method Params"]
D["Extract Type"]
E["Filter Check<br/>Is Primitive?"]
F["Create<br/>CallRelationship"]
G["Add to<br/>Relationships"]
A --> B
B --> C1
B --> C2
B --> C3
B --> C4
C1 --> D
C2 --> D
C3 --> D
C4 --> D
D --> E
E -->|no| F
F --> G
style A fill:#E8F4F8,color:#000
style B fill:#FFD700,color:#000
style F fill:#50C878,color:#fff
style G fill:#50C878,color:#fff
Uses the tree-sitter library with C# grammar for robust, incremental parsing that handles:
- Complex syntax structures
- Malformed code gracefully
- Large files efficiently
Analyzes various C# type definitions:
- Classes (including abstract and static variants)
- Interfaces
- Structs
- Enums
- Records (C# 9+)
- Delegates
Captures dependencies from:
- Class inheritance (base classes and interfaces)
- Property types used in class definitions
- Field types used in class definitions
- Method parameter types
Intelligently filters out built-in and primitive types to focus on meaningful domain dependencies:
- C# primitives (bool, int, string, etc.)
- Framework types (List, Dictionary, Task, DateTime, etc.)
- Only tracks custom domain types
Handles path conversions for consistent identification:
- Absolute to relative path conversion
- File-to-module path transformation
- Cross-platform path separator handling
For each component, captures:
- Unique component ID
- Component name and type
- Full source code
- Line number range
- Documentation/docstring information
graph LR
DP["DependencyParser"]
CSH["TreeSitterCSharpAnalyzer"]
NODES["Node + CallRelationship"]
GRAPH["Dependency Graph"]
DP -->|instantiates| CSH
CSH -->|returns| NODES
DP -->|builds from results| GRAPH
style DP fill:#FF6B6B,color:#fff
style CSH fill:#4A90E2,color:#fff
style NODES fill:#50C878,color:#fff
style GRAPH fill:#FFD700,color:#000
The DependencyParser orchestrates the analyzer to process all C# files in a repository and constructs a complete dependency graph.
Integration Details:
- Called for each
.csfile found during repository scanning - Results aggregated with other language analyzers
- File path and content provided by parser
- Nodes and relationships collected into unified component model
The AnalysisService (see dependency_analysis_services.md) coordinates:
- Repository structure analysis
- Multi-language file discovery
- Parallel analysis of components
- Call graph construction
Node Model captures component metadata:
Node(
id="MyFile.cs::MyClass",
name="MyClass",
component_type="class",
file_path="/path/to/MyFile.cs",
relative_path="MyFile.cs",
source_code="public class MyClass { ... }",
start_line=5,
end_line=25,
display_name="class MyClass",
base_classes=["BaseClass"], # Inheritance
# ... other metadata
)CallRelationship Model represents dependencies:
CallRelationship(
caller="MyFile.cs::MyClass",
callee="MyFile.cs::IDependency",
call_line=10,
is_resolved=True # Known type vs external
)See dependency_analyzer_models.md for detailed model documentation.
Output from C# analyzer feeds into documentation pipeline:
graph LR
CSH["C# Analyzer"]
DG["DocumentationGenerator"]
AGE["Agent Execution"]
MD["Markdown Docs"]
HTML["HTML Output"]
CSH -->|Nodes + Relationships| DG
DG -->|processes| AGE
AGE -->|generates| MD
MD -->|converts| HTML
style CSH fill:#4A90E2,color:#fff
style DG fill:#FF6B6B,color:#fff
style MD fill:#50C878,color:#fff
style HTML fill:#FFD700,color:#000
See documentation_generation.md for details on how analysis results are used to generate documentation.
from codewiki.src.be.dependency_analyzer.analyzers.csharp import analyze_csharp_file
# Analyze a single C# file
content = open("MyClass.cs").read()
nodes, relationships = analyze_csharp_file(
file_path="src/MyClass.cs",
content=content,
repo_path="/path/to/repo"
)
# Process results
for node in nodes:
print(f"Found: {node.display_name} at line {node.start_line}")
for rel in relationships:
print(f"Dependency: {rel.caller} -> {rel.callee}")from codewiki.src.be.dependency_analyzer.ast_parser import DependencyParser
parser = DependencyParser(repo_path="/path/to/csharp/repo")
components = parser.parse_repository() # Automatically uses C# analyzer
# Results include all C#-specific types and relationships
for component_id, node in components.items():
if "class" in node.component_type:
print(f"Class: {node.name} inherits from {node.base_classes}")Both _extract_nodes and _extract_relationships use recursive traversal of the AST:
def _extract_nodes(self, node, top_level_nodes, lines):
# Process current node
if node.type == "class_declaration":
# Extract this node
...
# Recursively process children
for child in node.children:
self._extract_nodes(child, top_level_nodes, lines)This pattern ensures all components at any nesting level are captured.
Primitive type check centralizes filtering logic:
def _is_primitive_type(self, type_name: str) -> bool:
primitives = {
"bool", "byte", "int", ..., "Task", "List", ...
}
return type_name in primitivesEnables consistent filtering across all relationship extraction methods.
Separates concerns of absolute and relative path handling:
_get_module_path(): File to module path conversion_get_relative_path(): Absolute to relative conversion_get_component_id(): Generates formatted component identifiers
For C# declarations, finds the keyword then the identifier:
def _get_identifier_name_cs(self, node):
if node.type == "class_declaration":
found_class_keyword = False
for child in node.children:
if child.type == "class":
found_class_keyword = True
elif found_class_keyword and child.type == "identifier":
return child.text.decode()Handles variations in AST structure robustly.
The analyzer handles:
- Incomplete/malformed C# code: Tree-sitter continues parsing despite syntax errors
- Missing components: Optional node searches return None instead of raising
- Path normalization: Cross-platform path separator handling
Uses Python's logging module for diagnostic information:
logger.debug(f"Found node: {node_name}")
logger.warning(f"Unknown node type: {node_type}")Relationships marked with is_resolved flag:
True: Callee is a known top-level component in same fileFalse: Callee is external dependency or unresolved type
- File parsing: O(n) where n = file size
- Node extraction: O(m) where m = number of AST nodes
- Relationship extraction: O(m × k) where k = average node children
Overall: O(n) per file - linear in file size
- AST memory: O(n) to store parsed tree
- Nodes & relationships: O(c + r) where c = components, r = relationships
| Feature | C# Analyzer | Python Analyzer | JavaScript Analyzer |
|---|---|---|---|
| Parser | Tree-sitter | Python AST | Tree-sitter |
| Type Support | Classes, Interfaces, Structs, Enums, Records, Delegates | Classes, Functions | Classes, Functions, Interfaces (TS) |
| Inheritance | ✓ (base_list) | ✓ (bases) | ✓ (class_name) |
| Field Types | ✓ | - | ✓ |
| Property Types | ✓ | - | ✓ |
| Method Params | ✓ | - | ✓ |
| Call Tracking | - | ✓ | ✓ |
Key distinction: C# analyzer focuses on type structure and inheritance, while Python analyzer emphasizes function calls and dependencies.
- tree-sitter (>= 0.20): Core parsing engine
- tree-sitter-c-sharp (>=X.Y.Z): C# language grammar
- models.core:
Node,CallRelationshipdata models - dependency_analyzer: Parent package for repository-wide analysis
- Generic Type Support: Better handling of
List<T>,Dictionary<K,V>patterns - Method Call Tracking: Analyze method calls within class bodies
- Namespace Resolution: Track cross-namespace dependencies
- Attribute Extraction: Capture C# attributes and annotations
- Interface Implementation: Explicitly track interface implementations
- Extension Methods: Recognize and track extension method dependencies
- Async/Await Analysis: Track async method relationships
- Using Directive Analysis: Extract namespace-level dependencies
For very large C# projects:
- Consider parallel file processing
- Implement streaming analysis for large files
- Add caching of parsing results
- language_analyzers.md - Overview of all language analyzers
- dependency_analysis_services.md - Service orchestration layer
- dependency_analyzer_models.md - Node and CallRelationship models
- documentation_generation.md - How analysis results are used
- dependency_analyzer.md - High-level dependency analysis module
Location: codewiki/src/be/dependency_analyzer/analyzers/csharp.py
Primary Export: TreeSitterCSharpAnalyzer class
Helper Function: analyze_csharp_file(file_path, content, repo_path) -> Tuple[List[Node], List[CallRelationship]]
- v1.0: Initial implementation with support for classes, interfaces, structs, enums, records, delegates
- v1.1: Added property and field type dependency tracking
- v1.2: Improved method parameter dependency extraction
- v1.3: Enhanced primitive type filtering with expanded built-in types list