fix: admin-adjustable max columns with a clear overflow message#4669
fix: admin-adjustable max columns with a clear overflow message#4669Ma77Ball wants to merge 12 commits intoapache:mainfrom
Conversation
|
#3825 is already closed? |
We want this parameter to be adjustable by the admins. I am OK with fixing it. |
|
@kunwp1 @SarahAsad23 Feel free to chime in based on a related use case with a collaborator. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #4669 +/- ##
============================================
- Coverage 43.53% 43.52% -0.02%
+ Complexity 2084 2080 -4
============================================
Files 957 958 +1
Lines 34077 34110 +33
Branches 3753 3755 +2
============================================
+ Hits 14836 14846 +10
- Misses 18445 18463 +18
- Partials 796 801 +5
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Let's make it a large fixed number, unless there is a disadvantage for that. |
The disadvantage is that univocity allocates memory per parser based on that number (10,000 is negligible, but something like 1,000,000 is not when you want to scale the platform). |
That PR was closed after we decided in a meeting to keep it static, as @aglinxinyuan suggested (it only fixes the display of CSVs in the result panel with more columns). The problem is that users are hitting the 512 limit and need it increased. |
This makes sense to me now. Why do we also need batch size adjustable for result table? Can we just pick a fixed number? |
That setting I added because the result panel settings menu seems a little empty, and I thought it might be a nice feature for users. I can remove it if it should be in a different PR or if it's not needed. |
Let's don't over complicate the settings. |
|
Agreed. Let's take a simple solution. |
|
@aglinxinyuan, please review again. I removed the other column-per-result-panel setting. |
What changes were proposed in this PR?
csv_parser_max_columnsandresult_table_columns_per_batchto Admin → Settings → Result Panel. The CSV scan operator and the result table read these fromsite_settingsat runtime;defaults live in
default.conf.RuntimeException("Max columns of N exceeded.")Instead of letting Univocity's raw stack trace through. Detection looks at the cause(
ArrayIndexOutOfBoundsException) and parses the index from both the Java 8 and Java 9+ message formats — needed because Univocity's own hint string is silently dropped on Java 9+.site_settingswith fallback" pattern intocommon/dao/SiteSettings, replacing inline jOOQ inConfigResource,CSVScanSourceOpExec, andDatasetResource.Any related issues, documentation, discussions?
closes #4589
Related #3825
How was this PR tested?
CSVScanSourceOpExecSpec: happy path, end-of-input, overflow translation against a real parser, andisColumnOverflowcases for both AIOOBE message formats.admin-settings.component.spec.tsandresult-table-frame.component.spec.tsfor load/save/reset and the columnLimit fallback chain.sbt Test/compileclean across DAO, ConfigService, WorkflowOperator, FileService. Frontend specs need Node ≥ 20.19 to run.Max columns of N exceeded.Was this PR authored or co-authored using generative AI tooling?
Co-authored with: Claude Opus 4.7