Skip to content

Add Ecto.Query.label/2 query expression for tagging queries with an S…#4732

Open
alesasnouski wants to merge 1 commit into
elixir-ecto:masterfrom
alesasnouski:master
Open

Add Ecto.Query.label/2 query expression for tagging queries with an S…#4732
alesasnouski wants to merge 1 commit into
elixir-ecto:masterfrom
alesasnouski:master

Conversation

@alesasnouski
Copy link
Copy Markdown

Hi,
I added a label query expression (and label: keyword) that prepends a /* ... */ comment to the generated statement, right before the SELECT/UPDATE/DELETE keyword, like:

from(p in Post, label: "get_post_titles_q", select: p.title)

Tag (or label) makes it easy to identify a query's call site in database logs, slow-query logs, etc. The comment is rendered leading (not trailing) so it survives truncation of long statements in those logs.

I went with label rather than comment because :comment is already a Postgrex connection option (it appends a trailing comment to the statement), so reusing the name would be confusing.

Tested against Postgres, MySQL and SQL Server.

Comment thread lib/ecto/repo.ex
in database logs and monitoring tools. The string is embedded verbatim into
the SQL and therefore cannot contain `/*`, `*/`, or null bytes. Prefer a
stable identifier per call site, as the label becomes part of the query
cache key.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should support it as a repo option and query option, let's go with one or the other.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It reads more like a query option than a repo option:

from(s in Site, label: "foobar") |> Repo.all()   vs   from(s in Site) |> Repo.all(label: "foobar")

But Repo.insert/update/delete get a struct/changeset and insert_all gets a list of maps, which is why we ended up with the mixed version.

We could switch to the repo option everywhere (same as :prefix). What are your thoughts?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have :comment on Postgrex, so if that's your goal, we can add :comment to MyXQL and TDS, and there is no need to change Ecto? :)

Copy link
Copy Markdown
Author

@alesasnouski alesasnouski Jun 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The truth is that the comment should be at the beginning of the query. By default in Postgres (for instance) track_activity_query_size == 1024 bytes, long queries are truncated (in pg_stat_statements, pg_stat_activity and, as a result, in GCP Query Insights console). Changing track_activity_query_size requires a DB server restart.
GCP allows you to change query length up to 4500 bytes, for instance (+ server restart needed).
image
Also adding a comment at the end (in the current implementation) disables the query cache (prep statements).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is at the end because that’s what sqlcommenter uses. Does it also work at the beginning? Another option is to have a flag to control it.

In any case, the disabling of query caching is not related to the position. If we implement it in Ecto and make it part of query cache, then it can work regardless. One option is to support precomment and postcomment options, and deprecate the Postgres comment one.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did similar change some time ago master...dkuku:ecto:comments_in_query

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With sqlcommenter, comments are dynamic, which is why the prepared-statement cache must be disabled.
So we have three variables: the comment, its position (head/tail), and the cache (on/off) ...

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With sqlcommenter, comments are dynamic, which is why the prepared-statement cache must be disabled.

They are dynamic in some sense but, once the code is compiled, you have a limited set of callers to Ecto.Repo. So it could go through the cache. Unless, of course, you start adding dynamic content to it (such as user_id). Then the amount of combinations are just too high and I think it makes sense to disable cache.

One option is for us to add two macros: pre_comment and post_comment, that comment before and after (although we store it in a single field called comments: [{:pre, string}, {:post, string}]. And yeah, I'd force them to be static (so no repo option).

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding a :cache option ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did work on this some time ago, and one idea I had was to allow caching the query when the comment is:

  • an atom e.g. :get_all_users
  • keyword list where values are only atoms such as [app: :my_app, context: :rpc]
  • compile-time string, e.g. label(query, "my_app") which the macro system is aware of when used as a Query macro.

This way, by design, you have a limited set of tags for a single query. If you are generating atoms dynamically, you probably have bigger problems in your application, and standard code inspection tools would catch it.


alias Ecto.Query.Builder

@forbidden ["*/", "/*", <<0>>]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should let the adapters do the escaping in case some adapters use a different style?

Comment thread lib/ecto/query.ex
tools. The label is rendered *leading* (rather than trailing) so it survives
truncation of long statements in logs.

Because the label becomes part of the generated SQL, it is also part of the
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this should be in an admonition block. there have been a few times the past couple of years where we had to add workarounds for people whose cache's blew up without them expecting it.

maybe we should also highlight that if they want to use high cardinality values they can disable the cache for that query. i never used sql comments for logging though so i'm not too sure if this is some kind of extremely bad practice that we shouldn't be encouraging at all.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@josevalim If comments are always prepended do we need to make them part of the cache? Can we just leave them out and then prepend them to the cached sql in the adapter.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They need to be part of the cache, yes!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants