Esql support #233
…esql option, validations to make sure both LS and ES support the ESQL execution.
… adds by default - might be users are looking for by default.
…/info and add docinfo* fields in ineffective fields list.
Fix the condition to correctly compares supported LS version.
…t timestampt converter to LogStash::Timestamp, dotted fields extended to nested fields.
first round of review, overall it looks good, I'll give it a spin today/tomorrow to check on the overall user experience.
…tting the result into target if defined. Debug logs added which can help to investigate query and its result.
  private

  def get_query_object
review note: moved to private area
One use case that concerns me is the common default pattern of ES creating a "field.keyword" for each "field", which results in an error in the plugin during The ways to not have this is to have a dedicated mapping without this overlap or being explicit about what to keep using Also the error is not very helpful given it's coming straight from
Not sure yet what the solution should be, but we could at least catch this particular nesting scenario and bubble up a warning saying "you can't keep top level and nested fields".
I was wrong, if there is a |
Right! |
Overall this looks to be on track.
- I'd like to include the client-side mitigation of queries that come back with inner sub-fields to prevent crashes.
- I'd like to align with the filter plugin for which parameter to specify the ESQL query in; if we determine that it is better to use `esql_query` in the filter due to the filter's inability to distinguish a QueryString query from an ES|QL query, I'd like to use it here too.
- I would prefer more validation of inputs; a user shouldn't be able to configure ESQL with irrelevant things like `slices` or `docinfo`.
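The input validation requested in the last point could look roughly like the sketch below. It is only an illustration: the constant, both method names, and the shape of `original_params` (a Hash of the settings the user set explicitly) are assumptions, and the list contains just the two settings named in this comment.

```ruby
# Hypothetical list of settings that have no effect on an ES|QL request.
# Only the two names from the review comment are included here.
ESQL_INEFFECTIVE_SETTINGS = %w[slices docinfo].freeze

# Returns the user-supplied setting names that are irrelevant for ES|QL.
# `original_params` is assumed to hold only explicitly configured settings.
def ineffective_esql_settings(original_params)
  original_params.keys.map(&:to_s) & ESQL_INEFFECTIVE_SETTINGS
end

# Raises when an ES|QL query is combined with settings that do not apply to it.
def validate_esql_settings!(original_params)
  offending = ineffective_esql_settings(original_params)
  return if offending.empty?

  raise ArgumentError, "Settings not applicable to ES|QL queries: #{offending.join(', ')}"
end
```

Whether the plugin should raise or only log a warning for such settings is a design choice this sketch does not settle.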
NOTE: If your index has a mapping with sub-objects where `status.code` and `status.desc` actually dotted fields, they appear in {ls} events as a nested structure.

[id="plugins-{type}s-{plugin}-esql-multifields"]
===== Conflict on multi-fields
Since this is going to be a pretty common issue with things like text/keyword multi-fields, I've proposed in a separate channel that we could detect and drop sub-fields instead of allowing the plugin to crash, and have provided a prototype for doing so.
I have optimized the prototype with memoization to reduce the time complexity from O(N log N) + O(N^2) to O(N log N) + O(N + K), where K is the max depth.
The (so-far) last commit
- updates the doc to reflect this change;
- logs a warning (one message per query result, not per row) listing the ignored multi-field values, with guidance to use the `RENAME` command if the user wants to include them in the event;
- adds a unit test as well.
Real scenario test (also tested with the `type => keyword` scenario):
// mapping
{
  "my-time-index-000001": {
    "mappings": {
      "properties": {
        "metrics": {
          "subobjects": false,
          "properties": {
            "time": {
              "type": "long"
            },
            "time.max": {
              "type": "long"
            },
            "time.min": {
              "type": "long"
            }
          }
        }
      }
    }
  }
}
// warn message
[2025-05-08T11:44:12,419][WARN ][logstash.inputs.elasticsearch.esql][main][8ff0da15d6ccf4b9d00dbcea466def36aa962864eb8638fa2b28e2f58af6d254] Multi-fields found in ES|QL result and they will not be available in the event. Please use `RENAME` command if you want to include them. {:found_multi_fields=>["metrics.time.max", "metrics.time.min"]}
// output event
{
    "@timestamp" => 2025-05-08T18:44:12.421771Z,
      "@version" => "1",
           "_id" => "metric_2",
       "metrics" => {
        "time" => 100
    }
}
{
    "@timestamp" => 2025-05-08T18:44:12.419639Z,
      "@version" => "1",
           "_id" => "metric_1",
       "metrics" => {
        "time" => 100
    }
}
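The behavior in this scenario can be approximated with a small set-based check. This is a simplified sketch, not the prototype discussed above: the method name is hypothetical, and it tests every dotted prefix of each column name against the set of returned columns rather than using the sorted, memoized approach.

```ruby
require 'set'

# Splits ES|QL column names into [kept, dropped], where dropped names are
# sub-fields of another returned column (e.g. "metrics.time.max" under "metrics.time").
def partition_multi_fields(column_names)
  known = column_names.to_set
  column_names.partition do |name|
    parts = name.split('.')
    # Proper dotted prefixes of "a.b.c" are "a" and "a.b".
    prefixes = (1...parts.size).map { |n| parts.first(n).join('.') }
    prefixes.none? { |prefix| known.include?(prefix) }
  end
end

kept, dropped = partition_multi_fields(%w[_id metrics.time metrics.time.max metrics.time.min])
# kept    => ["_id", "metrics.time"]
# dropped => ["metrics.time.max", "metrics.time.min"]
```

The dropped names correspond to the `found_multi_fields` list in the warning above.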
docs/index.asciidoc
|This plugin |4.23.0+ (4.x series) or 5.2.0+ (5.x series)
|===

To configure ES|QL query in the plugin, set the `response_type` to `esql` and provide your ES|QL query in the `query` parameter.
This feels cumbersome to me.
Could we align with the proposal in the filter PR to provide an ESQL query with `esql_query` instead of requiring the configuration of multiple separate parameters?
In this case, since the input plugin does require a JSON-encoded object for its `query` parameter when using the Query DSL, we could auto-detect that a given `query` parameter is ESQL (unlike the ES filter, which uses a QueryString query as its `query` parameter).
When we discussed this with @jsvd, we had a similar idea to deprecate `response_type` and replace it with `query_type` in the future. In my experience, introducing a new param is not difficult, but deprecation -> obsoletion -> removal is a long and painful process.
From this point of view, I would support making a minimal change, but I am open to applying changes if anyone has a strong opinion.
I've left a separate note on how to do it.
I don't personally care much about removing the `response_type` right away, but if a user starts using ESQL I'd like them to not start new usages of a config that we'd like to deprecate.
Since this is effectively a rename, we can easily use the `with_deprecated_alias` helper from `NormalizeConfigSupport`.
Co-authored-by: Rye Biesemeyer <yaauie@users.noreply.github.com> Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
…yntax fix, unit test errors fix.
# hits: normal search request
# aggregations: aggregation request
# esql: ES|QL request
config :response_type, :validate => %w[hits aggregations esql], :default => 'hits'
Migrating to `query_type` with auto-detection of ESQL queries would be pretty straightforward with the `NormalizeConfigSupport` mixin:
- config :response_type, :validate => %w[hits aggregations esql], :default => 'hits'
+ config :response_type, :validate => %w[hits aggregations], :deprecated => "use `query_type`"
+ config :query_type, :validate => %w[hits aggregations esql] # default depends on query shape
def register
+ @query_type = normalize_config("query_type") do |normalizer|
+ normalizer.with_deprecated_alias("response_type")
+ end || (@query.start_with?('{') ? 'hits' : 'esql')
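The fallback at the end of that suggestion, isolated as a standalone helper, might look like the sketch below. The method name is hypothetical, and the `to_s.lstrip` call is an addition not present in the suggestion, made on the assumption that a Query DSL body may be configured with leading whitespace or newlines.

```ruby
# Guesses the query type when `query_type` is not set explicitly.
# A Query DSL body is a JSON object, so it starts with '{'; anything else is
# treated as ES|QL. Hypothetical helper; the value names follow the suggestion above.
def infer_query_type(query)
  query.to_s.lstrip.start_with?('{') ? 'hits' : 'esql'
end

infer_query_type('{ "query": { "match_all": {} } }') # => "hits"
infer_query_type('FROM logs-* | LIMIT 10')           # => "esql"
```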
I was thinking of adding the deprecation right after this ES|QL change.
One thing we need to agree on is naming. I personally do not like `hits` and `aggregations` alongside `esql`; they indicate different contexts. The options I had in mind were `dsl_search`, `dsl_aggregation` and `esql`.
Please let me know your opinion: I can either apply the change if we quickly come to an agreement, or create a follow-up issue right after this PR.
…pply method to avoid null checks at runtime.
Description
ES|QL support:
- `response_type` accepts the `esql` option to distinguish it from other query types. In the long term this will be deprecated and replaced by `query_type` if the team agrees.
- `METADATA`, which adds `_id`, `_version` to the response entries
- `size`, `search_api`, `target` if users configure
- (`{a.b.c: 'val'}` => `{'a':{'b':{'c':'val'}}}`)

FYI: failed docs CI isn't related to this change.
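The dotted-to-nested expansion shown in the description (`{a.b.c: 'val'}` => `{'a':{'b':{'c':'val'}}}`) can be sketched in a few lines. The method name is hypothetical and this is not necessarily how the plugin implements it; it assumes string keys and that conflicting multi-fields have already been dropped.

```ruby
# Expands dotted keys into nested hashes:
# {"a.b.c" => "val"} becomes {"a" => {"b" => {"c" => "val"}}}.
# Assumes no key is also a dotted prefix of another key (no multi-field conflicts).
def nest_dotted_fields(flat)
  flat.each_with_object({}) do |(key, value), nested|
    *parents, leaf = key.to_s.split('.')
    # Walk (and create as needed) the intermediate hashes, then set the leaf.
    target = parents.reduce(nested) { |hash, part| hash[part] ||= {} }
    target[leaf] = value
  end
end

nest_dotted_fields('a.b.c' => 'val')
# => {"a"=>{"b"=>{"c"=>"val"}}}
```

If both `metrics.time` and `metrics.time.max` were present, the intermediate `time` entry would already hold a scalar and the expansion would fail, which is the multi-field conflict discussed earlier in the review.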
Sample minimal config to test:
Author's check
Logs