LessUp/fq-compressor

fq-compressor

High-performance FASTQ compression for the sequencing era

English | 简体中文 (Simplified Chinese) | Rust Implementation


🎯 What is fq-compressor?

fq-compressor is a high-performance FASTQ compression tool that leverages Assembly-based Compression (ABC) and Statistical Context Mixing (SCM) to achieve near-entropy compression ratios while maintaining O(1) random access to compressed data.

Key highlights:

  • 🧬 3.97× compression ratio on Illumina data
  • 11.9 MB/s compression, 62.3 MB/s decompression (multithreaded)
  • 🎯 Random access without full decompression
  • 🚀 Intel oneTBB parallel pipeline
  • 📦 Transparent support for .gz, .bz2, .xz inputs

📦 Quick Installation

Pre-built Binaries (Recommended)

Linux (x86_64, static binary):

wget https://github.com/LessUp/fq-compressor/releases/download/v0.2.0/fq-compressor-v0.2.0-linux-x86_64-musl.tar.gz
tar -xzf fq-compressor-v0.2.0-linux-x86_64-musl.tar.gz
sudo mv fq-compressor-v0.2.0-linux-x86_64-musl/fqc /usr/local/bin/

macOS (Homebrew):

# Coming soon

Other platforms: See Installation Guide

Build from Source

git clone https://github.com/LessUp/fq-compressor.git
cd fq-compressor

# Install dependencies via Conan
conan install . --build=missing -of=build/gcc-release \
    -s build_type=Release -s compiler.cppstd=23

# Build
cmake --preset gcc-release
cmake --build --preset gcc-release -j$(nproc)

# Binary: build/gcc-release/src/fqc

Requirements: GCC 14+ or Clang 18+, CMake 3.28+, Conan 2.x
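
Before building, it can help to confirm that the local toolchain meets these minimums. The sketch below is illustrative and not part of the repository: `version_ge` and `check_tool` are hypothetical helper names, and the comparison relies on `sort -V`, which is available in GNU coreutils and recent BSD `sort`.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not shipped with fq-compressor.

# version_ge A B: succeeds when version A is greater than or equal to B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# check_tool NAME MINIMUM FOUND_VERSION: report whether FOUND_VERSION is new enough.
check_tool() {
    if version_ge "$3" "$2"; then
        echo "ok: $1 $3 (need >= $2)"
    else
        echo "too old: $1 $3 (need >= $2)"
        return 1
    fi
}

# Example calls (uncomment on a machine with the tools installed):
# check_tool gcc   14   "$(gcc -dumpfullversion)"
# check_tool cmake 3.28 "$(cmake --version | head -n 1 | awk '{print $3}')"
# check_tool conan 2.0  "$(conan --version | awk '{print $3}')"
```

The example calls assume the usual `cmake version X.Y.Z` and `Conan version X.Y.Z` output formats; adjust the field extraction if your tools print something different.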


🚀 Basic Usage

Compress & Decompress

# Compress FASTQ to FQC format
fqc compress -i reads.fastq -o reads.fqc

# Verify archive integrity
fqc verify reads.fqc

# Full decompression
fqc decompress -i reads.fqc -o restored.fastq
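
The three commands above can be chained into a round-trip check. This is a minimal sketch, not a script from the repository: `roundtrip_check` is a hypothetical name, and the final byte-for-byte comparison assumes that default settings restore the input exactly, which this page does not state. If the archive was written in a mode that reorders reads or drops fields, the comparison would fail even though `fqc verify` succeeds.

```shell
#!/usr/bin/env bash
# Sketch only: roundtrip_check is a hypothetical helper.
# It uses just the fqc subcommands and flags shown above.

# roundtrip_check INPUT.fastq
# Compress, verify, decompress into a scratch directory, then compare
# the restored file with the original. Returns 0 only if every step passes.
roundtrip_check() {
    local input="$1" workdir
    workdir="$(mktemp -d)" || return 1
    fqc compress -i "$input" -o "$workdir/reads.fqc" &&
    fqc verify "$workdir/reads.fqc" &&
    fqc decompress -i "$workdir/reads.fqc" -o "$workdir/restored.fastq" &&
    cmp -s "$input" "$workdir/restored.fastq"
    local status=$?          # exit status of the chain above
    rm -rf "$workdir"        # clean up scratch files either way
    return "$status"
}
```

A call such as `roundtrip_check reads.fastq && echo "round trip OK"` would then report success only when all four steps pass.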

Advanced Features

# Random access - extract reads 1000-2000
fqc decompress -i reads.fqc --range 1000:2000 -o subset.fastq

# Multi-threaded compression (8 threads)
fqc compress -i reads.fastq -o reads.fqc -t 8 -v

# Paired-end data
fqc compress -i reads_1.fastq -2 reads_2.fastq \
  -o paired.fqc --paired

# Archive inspection
fqc info reads.fqc
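
To cross-check a `--range` extraction, the same slice can be cut from an uncompressed FASTQ file. The sketch below is an illustration under stated assumptions: `fastq_range` is a hypothetical helper, records are assumed to be exactly four lines each (no wrapped sequences), and the range is treated as 1-based and inclusive. This page does not document the exact semantics of `--range`, so the two outputs may differ at the boundaries.

```shell
#!/usr/bin/env bash
# Sketch only: fastq_range is a hypothetical helper, not part of fqc.

# fastq_range START END FILE
# Print records START..END (1-based, inclusive) from a 4-line-per-record FASTQ.
fastq_range() {
    awk -v s="$1" -v e="$2" '
        { rec = int((NR - 1) / 4) + 1 }   # 1-based record index of this line
        rec > e { exit }                  # stop reading once past the range
        rec >= s                          # print lines inside the range
    ' "$3"
}
```

For example, `fastq_range 1000 2000 restored.fastq | cmp - subset.fastq` compares the slice with the file produced by the random-access command above.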

📊 Proof Points

  • 3.97× compression on Illumina data with O(1) random access
  • 11.9 MB/s compression and 62.3 MB/s decompression in multithreaded runs
  • Archive inspection and verification via fqc info and fqc verify
  • Transparent input handling for .gz, .bz2, and .xz FASTQ inputs

For deeper benchmark data, algorithm notes, and file-format details, use the maintained docs rather than this repository entry page.
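
As a rough reading of the figures above, the sketch below turns them into size and time estimates for a given input. It assumes that the 3.97× ratio means uncompressed size divided by archive size, and that both throughput figures are measured against the uncompressed FASTQ size. Neither definition is stated on this page, and real results depend on the data and hardware. `estimate` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: back-of-the-envelope arithmetic from the quoted figures.

# estimate SIZE_MB
# Print the expected archive size and run times for an uncompressed
# FASTQ of SIZE_MB megabytes.
estimate() {
    awk -v mb="$1" 'BEGIN {
        printf "archive size: %.1f MB\n",  mb / 3.97   # ratio = original / archive
        printf "compress time: %.0f s\n",  mb / 11.9   # 11.9 MB/s
        printf "decompress time: %.0f s\n", mb / 62.3  # 62.3 MB/s
    }'
}
```

Under these assumptions, `estimate 10000` (a 10 GB input) gives roughly a 2518.9 MB archive, about 840 s to compress, and about 161 s to decompress.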


📚 Documentation & Project Surfaces

Surface | Role
📖 GitHub Pages | Public landing page and EN/ZH entry paths
🚀 English docs | Whitepaper, academy, architecture, evidence
Simplified Chinese docs (简体中文文档) | Whitepaper, academy, architecture notes, evidence chain
📦 Releases | Prebuilt binaries
🤝 Contributing Guide | Closeout-oriented development workflow

🛠️ Development

fq-compressor is in closeout mode. The development workflow is three scripts, run in order:

./scripts/build.sh clang-debug
./scripts/lint.sh format-check
./scripts/test.sh clang-debug

See AGENTS.md for full project rules and architecture.


🤝 Contributing

Focused contributions are welcome, especially for:

  • documentation cleanup and ownership tightening
  • evidence-driven bug fixes with regression coverage
  • workflow and tooling simplification
  • archive-readiness polish

See the Contributing Guide for the repository workflow.


📄 License

  • Project Code: MIT License — see LICENSE
  • vendor/spring-core/: Spring's original research license (not MIT)

🙏 Acknowledgments

  • Spring (Chandak et al., 2019) — ABC algorithm inspiration
  • fqzcomp5 (Bonfield) — Quality compression reference
  • Intel oneTBB — Parallel computing framework
  • Contributors — Everyone who has helped improve this project
