ConvRDF

Converts a file in some RDF formats such as RDF/XML, JSON-LD, Turtle to it in N-Triples.

ConvRDF converts a file in a streaming fashion, so that it can handle huge numbers of triples (> 1 billion) without consuming a huge amount of memory.

Compressed (.gz, .bz2, and .xz) files including .tar.gz can be processed properly.

NOTICE: This tool uses Apache Jena and Apache Commons libraries, which are released under the Apache License Version 2.0. In addition, XZ for Java is used for processing XZ files.

Requirements

Java 17 or later (LTS recommended)
Apache Jena 5.x (resolved automatically via Maven)

Build

mvn -B package
# produces target/ConvRDF.jar (self-contained, executable)

Usage

java -jar ConvRDF.jar [options] <file-or-directory>...

Options

Option	Description
`-r`	Recursively process directories. Default: off.
`-c`	Enable strict RDF syntax checking by Apache Jena. Default: off.
`-i`	Ignore lexical-level errors (bad IRIs, malformed literals, invalid Unicode escapes) with a warning, and continue. Structural syntax errors still abort the current file. Implies `-c`.
`-b <URI>`	Override the base URI used to resolve relative IRIs. Default: derived from the input file path.
`-o <file>`	Output file. Extensions `.gz` and `.xz` trigger compression. Default: standard output.
`-h`, `--help`	Show help and exit.

Exit status

Code	Meaning
`0`	All inputs were converted successfully.
`1`	One or more errors occurred (invalid arguments, unreadable files, parse failures, or I/O errors). Errors are reported to stderr.

Examples

# Convert a single Turtle file to N-Triples on stdout
java -jar ConvRDF.jar data.ttl > data.nt

# Convert an xz-compressed RDF/XML file, writing gzip-compressed N-Triples
java -jar ConvRDF.jar -o out.nt.gz ontology.owl.xz

# Recursively convert a tree, tolerating bad IRIs but reporting them
java -jar ConvRDF.jar -r -i -o all.nt.xz /data/rdf/

# Override the base URI used to resolve relative IRIs in a streamed input
java -jar ConvRDF.jar -b http://example.org/ontology -o out.nt input.owl.xz

Migration notes (v1 -> v2)

Apache Jena upgraded to 5.x. Requires Java 17 or later.
Streaming pipeline rewritten. The old PipedRDFIterator + thread-based producer/consumer pattern has been replaced with a direct StreamRDF pipeline. Memory use is now constant regardless of input size.
Base URI is set explicitly when reading from compressed streams or tar entries. This fixes the previous {E211} Base URI is null failure on RDF/XML inputs that contain relative IRIs (for example rdf:about="") when the input was a .xz / .gz / .bz2 file.
New option -i ignores lexical-level errors (bad IRIs, malformed literals, invalid Unicode escapes) with a warning, while still aborting on structural syntax errors.
New option -b <URI> overrides the auto-derived base URI.
Exit status is now well-defined: 0 on full success, 1 if any error occurred. Output is flushed and closed cleanly so .gz / .xz trailers are always written.

Name		Name	Last commit message	Last commit date
Latest commit History 94 Commits
src/main/java/jp/ac/rois/dbcls		src/main/java/jp/ac/rois/dbcls
.gitignore		.gitignore
ConvRDF.jar		ConvRDF.jar
ConvRDF.mf		ConvRDF.mf
ConvRDF.png		ConvRDF.png
LICENSE		LICENSE
README.md		README.md
log4j.properties		log4j.properties
pom.xml		pom.xml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ConvRDF

Requirements

Build

Usage

Options

Exit status

Examples

Migration notes (v1 -> v2)

About

Uh oh!

Releases 4

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

ConvRDF

Requirements

Build

Usage

Options

Exit status

Examples

Migration notes (v1 -> v2)

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 4

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages