Commit 8266d68

Merge pull request #63 from tcalmant/v3
Addition of a v3 package
2 parents 519fc21 + 01d037f commit 8266d68

13 files changed

Lines changed: 3639 additions & 72 deletions

.github/workflows/build-20.04.yml

Lines changed: 0 additions & 49 deletions
This file was deleted.

.github/workflows/build-24.04.yml

Lines changed: 18 additions & 8 deletions
@@ -5,10 +5,10 @@ name: CI Build - Python 3.8+
55

66
on:
77
push:
8-
branches: [ "master" ]
8+
branches: [ "main", "master" ]
99
tags: '**'
1010
pull_request:
11-
branches: [ "master" ]
11+
branches: [ "main", "master" ]
1212

1313
jobs:
1414
build:
@@ -17,7 +17,7 @@ jobs:
1717
strategy:
1818
fail-fast: false
1919
matrix:
20-
python-version: ["3.8", "3.9", "3.10", "3.11", "3.12", "3.13", "3.14-dev"]
20+
python-version: ["3.8", "3.9", "3.10", "3.11", "3.12", "3.13", "3.14"]
2121

2222
steps:
2323
- uses: actions/checkout@v4
@@ -32,13 +32,23 @@ jobs:
3232
if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
3333
- name: Lint with flake8
3434
run: |
35-
# stop the build if there are Python syntax errors or undefined names
36-
flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
37-
# exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
38-
flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
35+
# javaobj/v3 and tests/test_v3.py require Python 3.12+ syntax; exclude them on older versions
36+
if python -c "import sys; sys.exit(0 if sys.version_info >= (3, 12) else 1)"; then
37+
# stop the build if there are Python syntax errors or undefined names
38+
flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
39+
# exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
40+
flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
41+
else
42+
flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics --exclude=javaobj/v3,tests/test_v3.py
43+
flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics --exclude=javaobj/v3,tests/test_v3.py
44+
fi
3945
- name: Test
4046
run: |
41-
coverage run -m pytest
47+
if python -c "import sys; sys.exit(0 if sys.version_info >= (3, 12) else 1)"; then
48+
coverage run -m pytest
49+
else
50+
coverage run --omit='javaobj/v3/*,tests/test_v3.py' -m pytest --ignore=tests/test_v3.py
51+
fi
4252
- name: Coveralls
4353
env:
4454
COVERALLS_REPO_TOKEN: ${{ secrets.COVERALLS_REPO_TOKEN }}
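
The Lint and Test steps above branch on the same interpreter check. As a minimal sketch, that gate and the resulting pytest arguments could be expressed in Python as follows. `supports_v3` and `pytest_args` are hypothetical helper names used for illustration; they are not part of the repository.

```python
import sys

# Minimum interpreter version for javaobj/v3, as stated in the workflow comment.
V3_MIN_VERSION = (3, 12)


def supports_v3(version_info=None):
    """Return True when the interpreter can parse the v3 sources."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= V3_MIN_VERSION


def pytest_args(version_info=None):
    """Build the extra pytest arguments for the Test step (hypothetical helper)."""
    if supports_v3(version_info):
        return []
    # Older interpreters cannot parse tests/test_v3.py, so it is skipped.
    return ["--ignore=tests/test_v3.py"]
```

Keeping the check in one place would avoid repeating the inline `python -c` expression in each step, at the cost of an extra helper script.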

.gitignore

Lines changed: 4 additions & 0 deletions
@@ -46,3 +46,7 @@ nosetests.xml
4646
/issue*/
4747
/repro*.py
4848
/test*.py
49+
50+
# uv
51+
.venv
52+
uv.lock

README.md

Lines changed: 172 additions & 13 deletions
@@ -28,22 +28,27 @@ This fork intends to work both on Python 2.7 and Python 3.4+.
2828
| Implementations | Version |
2929
|-----------------|----------|
3030
| `v1`, `v2` | `0.4.0+` |
31+
| `v3` | `0.5.0+` |
3132

32-
Since version 0.4.0, two implementations of the parser are available:
33+
Since version 0.5.0, three implementations of the parser are available:
3334

3435
* `v1`: the *classic* implementation of `javaobj`, with a work in progress
3536
implementation of a writer.
36-
* `v2`: the *new* implementation, which is a port of the Java project
37+
* `v2`: a rewritten implementation, which is a port of the Java project
3738
[`jdeserialize`](https://github.com/frohoff/jdeserialize/),
3839
with support of the object transformer (with a new API) and of the `numpy`
3940
arrays loading.
41+
* `v3`: a **new** implementation, written from scratch to benefit from
42+
Python 3.12+ features.
4043

4144
You can use the `v1` parser to ensure that the behaviour of your scripts
4245
doesn't change and to keep the ability to write down files.
4346

44-
You can use the `v2` parser for new developments
45-
*which won't require marshalling* and as a *fallback* if the `v1`
46-
fails to parse a file.
47+
You can use the `v2` parser for developments in Python versions lower
48+
than 3.12 and *which won't require marshalling*, or as a *fallback*
49+
if the `v1` parser fails to parse a file.
50+
51+
For new development, you should use the `v3` parser.
4752

4853
### Object transformers V1
4954

@@ -67,6 +72,18 @@ it, and avoids a mismatch between the referenced object and the transformed one.
6772
The `v2` implementation provides a new API for the object transformers.
6873
Please look at the *Usage (V2)* section in this file.
6974

75+
### Object transformers V3
76+
77+
| Implementations | Version |
78+
|-----------------|----------|
79+
| `v3` | `0.5.0+` |
80+
81+
The `v3` implementation is a full rewrite targeting **Python 3.12+**.
82+
It uses `dataclasses`, structural pattern matching (`match/case`) and PEP 604
83+
union types. Its API is intentionally similar to `v2` but fixes several
84+
correctness issues and adds stricter safety limits.
85+
Please look at the *Usage (V3)* and *Migration to V3* sections in this file.
86+
7087
### Bytes arrays
7188

7289
| Implementations | Version |
@@ -98,7 +115,8 @@ You can find a sample usage in the *Custom Transformer* section in this file.
98115

99116
## Requirements
100117

101-
* Python >= 2.7 or Python >= 3.4
118+
* Python >= 2.7 or Python >= 3.4 for `v1` and `v2`
119+
* Python >= 3.12 for `v3`
102120
* `enum34` and `typing` when using Python <= 3.4 (installable with `pip`)
103121
* Maven 2+ (for building test data of serialized objects.
104122
You can skip it if you do not plan to run `tests.py`)
@@ -134,8 +152,8 @@ with open("objCollections.ser", "rb") as fd:
134152

135153
**Note:** The objects and methods provided by `javaobj` module are shortcuts
136154
to the `javaobj.v1` package, for Compatibility purpose.
137-
It is **recommended** to explicitly import methods and classes from the `v1`
138-
(or `v2`) package when writing new code, in order to be sure that your code
155+
It is **recommended** to explicitly import methods and classes from the `v1`,
156+
`v2`, or `v3` package when writing new code, in order to be sure that your code
139157
won't need import updates in the future.
140158

141159

@@ -391,13 +409,13 @@ class JavaRandomTransformer(BaseTransformer):
391409
values = []
392410
for f_name, f_type in zip(self.field_names, self.field_types):
393411
values.append(parser._read_field_value(f_type))
394-
fields.append(javaobj.beans.JavaField(f_type, f_name))
412+
fields.append(javaobj.v2.beans.JavaField(f_type, f_name))
395413

396-
class_desc = javaobj.beans.JavaClassDesc(
397-
javaobj.beans.ClassDescType.NORMALCLASS
414+
class_desc = javaobj.v2.beans.JavaClassDesc(
415+
javaobj.v2.beans.ClassDescType.NORMALCLASS
398416
)
399417
class_desc.name = self.name
400-
class_desc.desc_flags = javaobj.beans.ClassDataType.EXTERNAL_CONTENTS
418+
class_desc.desc_flags = javaobj.v2.beans.ClassDataType.EXTERNAL_CONTENTS
401419
class_desc.fields = fields
402420
class_desc.field_data = values
403421
return class_desc
@@ -473,10 +491,151 @@ transformers = [
473491
RandomChildTransformer(),
474492
JavaRandomTransformer()
475493
]
476-
pobj = javaobj.loads("custom_objects.ser", *transformers)
494+
with open("custom_objects.ser", "rb") as fd:
495+
pobj = javaobj.load(fd, *transformers)
477496

478497
# Here we show a field that isn't visible from the class description
479498
# The field belongs to the class but it's not serialized by default because
480499
# it's static. See: https://stackoverflow.com/a/16477421/12621168
481500
print(pobj.field_data["int_not_in_fields"])
482501
```
502+
503+
## Usage (V3 implementation)
504+
505+
> **Requires Python 3.12+.**
506+
507+
The `javaobj.v3` package is a full rewrite of the Java object stream parser.
508+
It provides the same two entry-points as `v2`:
509+
510+
* `load(fd, *transformers, use_numpy_arrays=False, max_array_size=…, max_depth=500)`:
511+
Parses a binary file descriptor opened in `rb` mode and returns the top-level
512+
object if the stream contains exactly one, a list of objects if there are
513+
several, or `None` for an empty stream. Pass additional `ObjectTransformer`
514+
instances as positional arguments.
515+
516+
* `loads(data, *transformers, …)`:
517+
Convenience wrapper around `load()` that accepts `bytes`.
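
The return convention described for `load()` can be sketched as below. This only illustrates the documented behaviour; `unwrap_results` is a hypothetical name, and the actual parser may be structured differently.

```python
from typing import Any, List, Optional


def unwrap_results(objects: List[Any]) -> Optional[Any]:
    """Apply the documented return convention to the parsed top-level objects."""
    if not objects:
        return None        # empty stream
    if len(objects) == 1:
        return objects[0]  # exactly one top-level object
    return objects         # several objects: return the whole list
```

Because the return type depends on the stream content, callers that may receive multi-object streams probably want to check for a `list` before accessing fields.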
518+
519+
Sample usage:
520+
521+
```python
522+
import javaobj.v3 as javaobj
523+
524+
with open("obj5.ser", "rb") as fd:
525+
pobj = javaobj.load(fd)
526+
527+
# Access fields by name (preferred)
528+
value = pobj.get_field("myField")
529+
530+
# Or use attribute-style access (issues a warning on ambiguity)
531+
value = pobj.myField
532+
```
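
To illustrate why attribute-style access can be ambiguous, here is a minimal sketch of an object whose fields are grouped per declaring class. `InstanceSketch` is a hypothetical stand-in, not the `javaobj.v3` class, and the lookup order (first declaring class wins) is an assumption.

```python
import warnings


class InstanceSketch:
    """Hypothetical stand-in for a parsed instance; not the javaobj.v3 class."""

    def __init__(self, field_data):
        # Maps a class name to the {field name: value} dict declared by that class.
        self.field_data = field_data

    def get_field(self, name):
        """Return the first value found for `name`, searching classes in order."""
        for fields in self.field_data.values():
            if name in fields:
                return fields[name]
        raise KeyError(name)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name == "field_data":
            raise AttributeError(name)
        owners = [cls for cls, fields in self.field_data.items() if name in fields]
        if not owners:
            raise AttributeError(name)
        if len(owners) > 1:
            warnings.warn(f"field {name!r} is declared by several classes: {owners}")
        return self.field_data[owners[0]][name]
```

With a child and a parent class that both declare `x`, reading `obj.x` in this sketch emits a warning, while a field declared only once is returned silently.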
533+
534+
### New features in V3
535+
536+
| Feature | V1 | V2 | V3 |
537+
|---|---|---|---|
538+
| Python 3.12+ (`match/case`, PEP 604) ||||
539+
| Fully typed (`dataclasses`, PEP 695 `type` aliases) || partial ||
540+
| `TC_RESET` handling ||||
541+
| `TC_EXCEPTION` in object graph ||||
542+
| `TC_PROXYCLASSDESC` ||||
543+
| Security limits (max depth / array size) ||||
544+
| Correct `TYPE_CHAR` numpy dtype (`>u2`) ||||
545+
| Typed exception hierarchy ||||
546+
| `BlockData.__eq__(bytes)` compatibility ||||
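
The `>u2` entry in the table reflects that a Java `char` is an unsigned 16-bit value written in big-endian order. A minimal sketch using only the standard library shows the equivalent decoding; `decode_java_chars` is a hypothetical helper, not a `javaobj` function.

```python
import struct


def decode_java_chars(raw: bytes) -> str:
    """Decode the payload of a Java char[] (big-endian unsigned 16-bit units)."""
    if len(raw) % 2:
        raise ValueError("char[] payload must have an even length")
    count = len(raw) // 2
    # ">H" is struct's big-endian unsigned short, the same layout as numpy's ">u2".
    units = struct.unpack(f">{count}H", raw)
    # Each unit is mapped to one code point; surrogate pairs are not combined here.
    return "".join(chr(unit) for unit in units)
```

Reading the same bytes as a signed or little-endian 16-bit type would yield wrong values for any character above U+00FF, which is presumably what the corrected dtype avoids.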
547+
548+
### Security limits
549+
550+
`v3` adds two optional safety limits that prevent resource exhaustion when
551+
parsing untrusted streams:
552+
553+
```python
554+
import javaobj.v3 as javaobj
555+
556+
with open("untrusted.ser", "rb") as fd:
557+
pobj = javaobj.load(
558+
fd,
559+
max_array_size=10 * 1024 * 1024, # 10 MiB max per array
560+
max_depth=100, # max object-graph depth
561+
)
562+
```
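
As a rough idea of how such a limit can protect a parser, the sketch below validates an array length before anything is allocated. It assumes the limit counts elements and that the length is the 4-byte big-endian signed integer used by the Java serialization stream; `read_array_size` and `SecurityErrorSketch` are hypothetical names, not the `v3` implementation.

```python
import io
import struct


class SecurityErrorSketch(Exception):
    """Hypothetical stand-in for the SecurityError named in the README."""


def read_array_size(stream, max_array_size):
    """Read a 4-byte big-endian array length and reject out-of-range values."""
    raw = stream.read(4)
    if len(raw) != 4:
        raise EOFError("truncated array header")
    (size,) = struct.unpack(">i", raw)
    if size < 0 or size > max_array_size:
        raise SecurityErrorSketch(
            f"array size {size} is outside the allowed range 0..{max_array_size}"
        )
    return size
```

The important design point is that the check happens on the declared size, before the payload is read, so a forged header cannot trigger a huge allocation.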
563+
564+
### Object Transformer V3
565+
566+
The `ObjectTransformer` base class in `v3` has the same three override points
567+
as in `v2`:
568+
569+
* `create_instance(classdesc)` — return a `JavaInstance` subclass (or `None`
570+
to fall back to the next transformer).
571+
* `load_array(reader, type_code, size)` — called for `TC_ARRAY` records;
572+
return the array data (`bytes` or `list`) or `None` to use the default logic.
573+
* `load_custom_writeObject(parser, reader, class_name)` — called when a
574+
class written with `writeObject()` requires fully custom parsing.
575+
576+
The `DefaultObjectTransformer` additionally exposes a public `handles(name)`
577+
method that returns `True` when the transformer knows how to load the given
578+
Java class name.
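
The `None` fallback described for `create_instance` suggests a chain in which each transformer is asked in turn and the first non-`None` result wins. The sketch below illustrates that idea with hypothetical classes; for brevity it passes a class name rather than a full class description, so it does not match the real `ObjectTransformer` signature.

```python
class TransformerSketch:
    """Hypothetical transformer that only knows a fixed set of Java class names."""

    def __init__(self, known):
        # Maps a Java class name to a factory for the Python-side object.
        self.known = dict(known)

    def handles(self, name):
        """Return True when this transformer knows the given Java class name."""
        return name in self.known

    def create_instance(self, class_name):
        """Return a new object, or None to let the next transformer try."""
        factory = self.known.get(class_name)
        return factory() if factory is not None else None


def create_with_fallback(transformers, class_name):
    """Ask each transformer in order; the first non-None result wins."""
    for transformer in transformers:
        instance = transformer.create_instance(class_name)
        if instance is not None:
            return instance
    return None
```

In this sketch, transformers placed earlier in the list take precedence, so a custom transformer can override the handling of a class that a later, more general one also knows.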
579+
580+
### Using NumPy arrays (V3)
581+
582+
```python
583+
import javaobj.v3 as javaobj
584+
585+
with open("arrays.ser", "rb") as fd:
586+
pobj = javaobj.load(fd, use_numpy_arrays=True)
587+
```
588+
589+
When `use_numpy_arrays=True`, a `NumpyArrayTransformer` is appended to the
590+
transformer list and primitive arrays are returned as `numpy.ndarray`.
591+
592+
---
593+
594+
## Migration to V3
595+
596+
### From V1 to V3
597+
598+
| V1 | V3 |
599+
|---|---|
600+
| `import javaobj` | `import javaobj.v3 as javaobj` |
601+
| `pobj.classdesc.name` | `pobj.classdesc.name` (unchanged) |
602+
| `pobj.myField` (direct attribute) | `pobj.get_field("myField")` (preferred) or `pobj.myField` |
603+
| `pobj._data` on arrays | `pobj.data` (public) |
604+
| `javaobj.JavaObjectUnmarshaller` | removed — use `javaobj.v3.parser.JavaStreamParser` |
605+
| `javaobj.JavaObjectMarshaller` | marshalling not available in `v3` |
606+
| Exceptions: bare `Exception` | Typed: `ParseError`, `UnexpectedOpcodeError`, … |
607+
608+
Shallow conversion helper (best-effort, for gradual migration):
609+
610+
```python
611+
from javaobj.v3._compat import v1_to_v3
612+
v3_obj = v1_to_v3(v1_obj)
613+
```
614+
615+
### From V2 to V3
616+
617+
| V2 | V3 |
618+
|---|---|
619+
| `import javaobj.v2 as javaobj` | `import javaobj.v3 as javaobj` |
620+
| `javaobj.load(fd)` | `javaobj.load(fd)` (same signature) |
621+
| `javaobj.loads(data)` | `javaobj.loads(data)` (same signature) |
622+
| `pobj.classdesc.name` | `pobj.classdesc.name` (unchanged) |
623+
| `pobj.field_data[cd][field]` | `pobj.field_data[cd][field]` (unchanged) |
624+
| `pobj.get_field("name")` | `pobj.get_field("name")` (unchanged) |
625+
| `pobj.__getattr__` ambiguity silent | warns when field exists in multiple classes |
626+
| `transformer._type_mapper` (private) | `transformer.handles(name)` (public) |
627+
| `JavaArray.data` (`tuple` of ints for bytes) | `JavaArray.data` (`bytes` for `TYPE_BYTE`) |
628+
| `BlockData` compared with `bytes` | `BlockData.__eq__(bytes)` still works |
629+
| `use_numpy_arrays=True` (v2 option) | `use_numpy_arrays=True` (same) |
630+
| No depth/size limits | `max_depth=500`, `max_array_size=100 MiB` |
631+
| No typed exceptions | `ParseError`, `SecurityError`, … |
632+
633+
Shallow conversion helper (best-effort, for gradual migration):
634+
635+
```python
636+
from javaobj.v3._compat import v2_to_v3
637+
v3_obj = v2_to_v3(v2_obj)
638+
```
639+
640+
> **Note:** `v3` requires **Python 3.12+** and does **not** support marshalling
641+
> (writing). If you need to write Java object streams, use `v1`.
