Fix/array type ip binary#5516
Conversation
PR Code Analyzer ❗AI-powered 'Code-Diff-Analyzer' found issues on commit dad3901.
The table above displays the top 10 most important findings. Pull Requests Author(s): Please update your Pull Request according to the report above. Repository Maintainer(s): You can Thanks. |
b1f89a9 to
b567be6
Compare
PR Reviewer Guide 🔍(Review updated until commit b491034)Here are some key observations to aid the review process:
|
b567be6 to
69d24da
Compare
PR Code Suggestions ✨Latest suggestions up to b491034 Explore these optional code suggestions:
Previous suggestionsSuggestions up to commit 2734707
Suggestions up to commit 69d24da
Suggestions up to commit b567be6
|
|
Persistent review updated to latest commit 69d24da |
69d24da to
2734707
Compare
|
Persistent review updated to latest commit 2734707 |
|
Please take a look at the CI failures |
2734707 to
dad3901
Compare
Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>
Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>
dad3901 to
b491034
Compare
|
Persistent review updated to latest commit b491034 |
Description
Companion fix for the LIST nullability change in opensearch-project/OpenSearch#21997. After that fix,
stats list(<field>)returns HTTP 200 for 10 of 12 element types — butlist(ip_value)andlist(binary_value)still fail withHTTP 400 ExpressionEvaluationException: unsupported object class [B. This PR fixes those two.Root cause
Two layers, each correct in isolation:
PPLBuiltinOperators.LISTwas registered with return typePPLReturnTypes.STRING_ARRAY, which always returnsARRAY<VARCHAR>regardless of input element type — type information is lost.AnalyticsExecutionEngine.toExprValue) checkstype instanceof IpType/BinaryTypeper cell. With the column type collapsed toARRAY<VARCHAR>it never matches, so per-elementbyte[]values fall through toExprValueUtils.fromObjectValue, which has nobyte[]case and throws.Fix
Two small changes:
PPLBuiltinOperators.LIST: switch return-type inference fromSTRING_ARRAYtoARG0_ARRAY, so the column type at the response boundary becomesARRAY<IpType>/ARRAY<BinaryType>(matching the existingTAKEregistration).AnalyticsExecutionEngine.toExprValue: when the column isARRAY<IpType>/ARRAY<BinaryType>, recurse element-wise and apply the same canonical-IP-string / Base64 conversion already used for scalarIpType/BinaryTypecells. Extracted the two byte[] converters into helpers (ipBytesToExprValue,binaryBytesToExprValue) so the scalar and array paths share them.Testing
stats list(ip_value)unsupported object class [B[["127.0.0.1"]]stats list(binary_value)unsupported object class [B[["U29tZSBiaW5hcnkgYmxvYg=="]]stats list(<other 10 types>)Out of scope
VALUESaggregate registration still usesSTRING_ARRAY. Same fix would apply if anyone exercisesvalues(ip_value)/values(binary_value); flagging for follow-up.ListAggFunction.javastill doesString.valueOf(byte[])— latent on the V3 (non-analytics-engine) path. Analytics path uses DataFusion'sLOCAL_ARRAY_AGG_OPinstead, so it doesn't fire here.