Improve performance in packet dissection #5005

Open: polybassa wants to merge 3 commits into secdev:master from polybassa:dissector_performance_increase

@polybassa commented May 27, 2026

I played around with AI tools, and they came up with this proposal for improving packet dissection performance.

Profiled packet dissection and optimized the critical path. Benchmark on Ether/IP/TCP/Raw (10K iterations): 6153 → 7197 pkt/s (+17%), function calls reduced from 6.71M to 4.92M (-27%).

Changes

scapy/fields.py

  • Field.getfield(): Use struct.unpack_from(buf) instead of struct.unpack(buf[:n]) to avoid temporary slice allocation. Falls back to slice for bytes subclasses (e.g. TrailerBytes) that override __getitem__.
  • Field.m2i/h2i/i2m: Remove typing.cast() — 260K+ no-op function calls per 10K packets.
  • _FieldContainer/Field: Add class-level _is_conditional/_may_end flags to avoid isinstance() in tight loops.
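
The `Field.getfield()` change can be illustrated with a minimal sketch. This is not the PR's code: `read_u16_slice`, `read_u16_inplace`, and the exact-type check `type(buf) is bytes` are hypothetical, and only show one plausible way to combine `struct.unpack_from()` with the slice fallback described above.

```python
import struct

# Precompiled format: one unsigned 16-bit big-endian integer.
U16 = struct.Struct("!H")


def read_u16_slice(buf):
    """Baseline pattern: slice first, then unpack (allocates a temporary bytes object)."""
    return buf[U16.size:], U16.unpack(buf[:U16.size])[0]


def read_u16_inplace(buf):
    """Unpack straight from the buffer; fall back to slicing for bytes subclasses."""
    if type(buf) is bytes:
        # Plain bytes: unpack_from reads at offset 0 without slicing the header out.
        value = U16.unpack_from(buf)[0]
    else:
        # A subclass may override __getitem__, so keep the slice-based path.
        value = U16.unpack(buf[:U16.size])[0]
    return buf[U16.size:], value


remaining, value = read_u16_inplace(b"\x01\x02rest")
```

Both helpers return the same `(remaining, value)` pair; the difference is only whether a temporary slice is created for the bytes being unpacked.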

scapy/packet.py

  • do_dissect(): Check pre-computed field flags instead of isinstance(ConditionalField) / isinstance(MayEnd) per iteration.
  • guess_payload_class(): Inline getfieldval with local variable caching of self.fields/self.overloaded_fields/self.default_fields. Original did 3 dict lookups + deprecated field check per field per candidate layer.
  • getfieldval(): Replace if k in d1 ... elif k in d2 ... with single try/except KeyError on fast path.
  • __init__(): Skip time.time() syscall for internal sub-layer packets (_internal=1).
  • _raw_packet_cache_field_value(): Replace per-call lambda with direct attribute access.
  • do_init_cached_fields(): Eliminate redundant dict.get() pattern.

API

No public API changes.

Dissection Throughput (10K iterations each)

| Packet Type | Baseline | Optimized | Δ |
|---|---|---|---|
| Ether/IP/TCP/Raw(100B) | 7,026 pkt/s | 7,508 pkt/s | +6.9% |
| Ether/IP/UDP/DNS(query) | 5,286 pkt/s | 5,685 pkt/s | +7.5% |
| Ether/IP/UDP/DNS(response) | 2,726 pkt/s | 2,889 pkt/s | +6.0% |
| Ether/IP/ICMP | 5,603 pkt/s | 6,000 pkt/s | +7.1% |
| IP/TCP/Raw(50B) | 9,389 pkt/s | 10,063 pkt/s | +7.2% |
| Ether/IP/TCP/Raw(1400B) | 6,870 pkt/s | 7,439 pkt/s | +8.3% |
| Ether (minimal) | 62,548 pkt/s | 64,819 pkt/s | +3.6% |
| IP/UDP/Raw(5B) | 7,998 pkt/s | 8,776 pkt/s | +9.7% |
| Batch 1000×Ether/IP/TCP/Raw (100K pkts) | 7,117 pkt/s | 7,781 pkt/s | +9.3% |
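
For readers who want to reproduce figures of this kind, here is a rough sketch of how a pkt/s rate and the Δ column could be computed. The PR does not show its benchmark script, so everything here is an assumption: `fake_dissect` is a stand-in workload rather than a Scapy dissector, and `packets_per_second` and `percent_change` are hypothetical helper names.

```python
import struct
import time


def fake_dissect(raw):
    """Stand-in for a real dissector: parse a 14-byte Ethernet-like header."""
    dst, src, ethertype = struct.unpack_from("!6s6sH", raw)
    return dst, src, ethertype


def packets_per_second(func, raw, iterations=10_000):
    """Call func(raw) `iterations` times and return the measured rate."""
    start = time.perf_counter()
    for _ in range(iterations):
        func(raw)
    elapsed = time.perf_counter() - start
    return iterations / elapsed


def percent_change(baseline, optimized):
    """Relative change in percent, as used for a delta column."""
    return (optimized - baseline) / baseline * 100.0


frame = bytes(range(12)) + b"\x08\x00" + b"\x00" * 46
rate = packets_per_second(fake_dissect, frame)
delta = round(percent_change(7026, 7508), 1)  # first table row: 6.9
```

Single runs of such a loop are noisy, so comparing baseline and optimized builds would normally mean repeating the measurement several times on an otherwise idle machine.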

Profile Comparison (5,000 Ether/IP/TCP/Raw dissections)

| Metric | Baseline | Optimized | Δ |
|---|---|---|---|
| Total function calls | 2,915,002 | 2,350,002 | −19.4% |
| Total time | 1.559s | 1.357s | −13.0% |
| isinstance() calls | 380,000 | 230,000 | −39.5% |
| guess_payload_class time | 0.137s | 0.062s | −55% |
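
Metrics like these can be collected with the standard-library profiler. The sketch below is an assumption about the method, since the PR does not include its profiling script; `parse_header` and `workload` are hypothetical stand-ins for the dissection code.

```python
import cProfile
import io
import pstats


def parse_header(raw):
    """Stand-in workload: split a buffer into a 2-byte header and a payload."""
    return raw[:2], raw[2:]


def workload(iterations=5_000):
    for _ in range(iterations):
        parse_header(b"\x08\x00payload")


profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

stats = pstats.Stats(profiler, stream=io.StringIO())
total_calls = stats.total_calls  # all function calls recorded by the profiler
total_time = stats.total_tt      # total time spent in profiled functions, seconds

# Per-function call count: stats.stats maps (file, line, name) to
# (primitive calls, total calls, total time, cumulative time, callers).
calls_to_parse = sum(
    nc for (_, _, name), (cc, nc, tt, ct, callers) in stats.stats.items()
    if name == "parse_header"
)
```

Profiling adds per-call overhead, which is one reason the profiled total time is not directly comparable to the throughput numbers measured without a profiler.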

Key Observations

  • 6–10% throughput improvement on every packet type except the minimal Ether case, which gains 3.6%
  • The largest gain is on the small IP/UDP/Raw(5B) packet (+9.7%), where fixed per-packet overhead is proportionally higher
  • DNS response shows less improvement since most time is spent in complex DNS field parsing
  • isinstance() calls reduced by ~40% via pre-computed _is_conditional/_may_end flags
  • guess_payload_class is roughly 2× faster (0.137s → 0.062s) due to inlined field lookups with local variable caching

AI-Assisted: yes (Claude Code Opus 4.6)

codecov Bot commented May 27, 2026

Codecov Report

❌ Patch coverage is 96.22642% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.29%. Comparing base (30c07a1) to head (9b31ee5).
⚠️ Report is 1 commit behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| scapy/fields.py | 92.30% | 1 Missing ⚠️ |
| scapy/packet.py | 97.50% | 1 Missing ⚠️ |

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5005   +/-   ##
=======================================
  Coverage   80.29%   80.29%           
=======================================
  Files         383      383           
  Lines       95163    95205   +42     
=======================================
+ Hits        76407    76445   +38     
- Misses      18756    18760    +4     
| Files with missing lines | Coverage Δ |
|---|---|
| scapy/fields.py | 92.80% <92.30%> (+0.03%) ⬆️ |
| scapy/packet.py | 84.89% <97.50%> (+0.18%) ⬆️ |

... and 9 files with indirect coverage changes

AI-Assisted: no

Copilot AI left a comment

Pull request overview

This PR optimizes Scapy’s packet dissection hot path by reducing temporary allocations, cutting down on per-field type checks, and inlining/caching common lookups during payload-class guessing.

Changes:

  • Optimize Field.getfield() by using struct.unpack_from() for plain bytes to avoid slicing allocations.
  • Reduce overhead in dissection loops by using precomputed field flags (_is_conditional, _may_end) instead of repeated isinstance() checks.
  • Speed up field/value resolution by replacing membership checks with try/except KeyError, and by inlining getfieldval() logic inside guess_payload_class().
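
The lookup change summarized in the last bullet can be sketched as follows. The function names and the three plain dicts are hypothetical stand-ins for `self.fields`, `self.overloaded_fields`, and `self.default_fields`; the real `getfieldval()` also handles cases this sketch ignores.

```python
def lookup_membership(name, fields, overloaded, defaults):
    """Baseline: test membership, then index (two hash lookups on a hit)."""
    if name in fields:
        return fields[name]
    elif name in overloaded:
        return overloaded[name]
    return defaults[name]


def lookup_try(name, fields, overloaded, defaults):
    """Fast path: a single lookup when the key is in the first dict."""
    try:
        return fields[name]
    except KeyError:
        pass
    try:
        return overloaded[name]
    except KeyError:
        return defaults[name]


fields = {"ttl": 32}
overloaded = {"proto": 6}
defaults = {"ttl": 64, "proto": 0, "id": 1}
ttl = lookup_try("ttl", fields, overloaded, defaults)
```

The trade-off is that a raised `KeyError` costs more than a failed membership test, so this pattern pays off mainly when most lookups hit the first dict.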

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| scapy/packet.py | Reduces per-packet overhead in __init__, getfieldval, do_dissect, and guess_payload_class to improve dissection throughput. |
| scapy/fields.py | Optimizes Field.getfield() and adds class-level flags to enable faster checks in the packet dissection loop. |

Comment thread scapy/packet.py
      ):
          # type: (...) -> None
-         self.time = time.time()  # type: Union[EDecimal, float]
+         self.time = 0.0 if _internal else time.time()  # type: Union[EDecimal, float]
Comment thread scapy/packet.py Outdated
AI-Assisted: yes (GitHub Copilot)
@polybassa force-pushed the dissector_performance_increase branch from 699122d to 9b31ee5 on May 28, 2026 09:28