Add Text Fitter plugin v0.0.1 #2452

Open
AlexMultiAgent wants to merge 6 commits into langgenius:main from AlexMultiAgent:add-text-fitter-0.0.1

AlexMultiAgent (Contributor) commented May 23, 2026

Submission Type

  • New plugin submission
  • Version update to an existing plugin

Checklist

  • I have read and followed the Publish to Dify Marketplace guidelines
  • I have read and comply with the Plugin Developer Agreement
  • I confirm my plugin works properly on both Dify Community Edition and Cloud Version
  • I confirm my plugin has been thoroughly tested for completeness and functionality
  • My plugin brings new value to Dify

Documentation Checklist

  • README includes step-by-step setup instructions
  • README includes detailed usage instructions
  • README clearly lists all parameters and outputs
  • README includes connection requirements and configuration details
  • README includes a link to the plugin source code repository

Privacy Protection

Description

Text Fitter is a Dify tool plugin that shortens text to fit within an LLM's context window using extractive summarization. It supports Chinese, Japanese, and English text.

  • Pure Python stdlib — zero external NLP dependencies
  • MMR diversity penalty to reduce redundancy
  • Boundary-aware fallback for edge cases
  • 80+ unit tests covering all algorithm functions
  • UI localized in 4 languages (en_US, zh_Hans, zh_Hant, ja_JP)
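
For readers unfamiliar with the MMR (Maximal Marginal Relevance) diversity penalty mentioned above, here is a minimal stdlib-only sketch of the general technique. It is not taken from the plugin source: the function names `cosine` and `mmr_select`, the `lam` weight, and the whitespace tokenization are all assumptions made for illustration (whitespace splitting in particular would not segment CJK text).

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def mmr_select(sentences, relevance, k, lam=0.7):
    """Greedily pick up to k sentence indices, trading relevance against redundancy.

    Hypothetical helper: `relevance` holds one precomputed score per sentence,
    and `lam` weights relevance against similarity to already-picked sentences.
    """
    # Assumption: whitespace tokens; a real CJK tokenizer would differ.
    vectors = [Counter(s.lower().split()) for s in sentences]
    selected = []
    candidates = list(range(len(sentences)))

    while candidates and len(selected) < k:

        def score(i):
            # Penalty is the highest similarity to any sentence already chosen.
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected), default=0.0
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)

    return sorted(selected)  # restore original document order
```

With `lam` closer to 1 the selection favors raw relevance; lower values penalize near-duplicate sentences more heavily.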

Source Code

https://github.com/AlexMultiAgent/dify-plugin-text-fitter

AlexMultiAgent force-pushed the add-text-fitter-0.0.1 branch 3 times, most recently from 8373ea4 to a76c8bf (May 24, 2026 17:32)
AlexMultiAgent force-pushed the add-text-fitter-0.0.1 branch from 4365de4 to 30f8e83 (May 24, 2026 20:37)
- Fix length score discontinuities across all boundaries
- Expand tokenizer CJK character ranges
- Improve exception handling with error visibility
- Increase compression_ratio precision
- Use CJK ellipsis for CJK text truncation
- Add .difyignore for clean packaging
- Fix length score piecewise continuity at len=20 boundary
- Add multi-dot abbreviation protection (i.e., e.g.)
- Pre-compile sentence-split regex at module level
- Extract shared CJK Unicode ranges into _is_cjk_char helper
- Fix incomplete CJK detection in boundary truncation fallback
- Guard MMR prefilter empty-selection index mapping