17 changes: 17 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,22 @@

### Added

- New `aggregate` SETTING on Identity-stat layers (point, line, area, bar, ribbon,
range, segment, arrow, rule, text). By default it collapses each group to a
single row by replacing every numeric mapping in place with its aggregated
value. Accepts a single string or array of strings; entries are either
unprefixed defaults (`'mean'`) or per-aesthetic targets (`'y:max'`,
`'color:median'`). Up to two defaults may be supplied — the first applies to
lower-half aesthetics plus all non-range layers, the second to upper-half
(`max`/`end` suffix). Numeric mappings without a target or applicable default
are dropped with a warning. Targeting the same aesthetic more than once
(e.g. `aggregate => ('y:min', 'y:max')`) produces one row per function with
a synthetic `aggregate` column tagging each row, available for `REMAPPING` to
another aesthetic; targets with a single function and the unprefixed defaults
are reused unchanged across the exploded rows. The `aggregate` column's value
is built by deduplicating the function names of all exploded targets at each
row and joining them with `/` (so `('y:min', 'y:max', 'color:sum', 'color:prod')`
yields `'min/sum'` and `'max/prod'`). Targets whose function counts are above 1 but differ from each other are an error.
- Add cell delimiters and code lens actions to the Positron extension (#366)
- ODBC is now turned on for the CLI as well (#344)
- `FROM` can now come before `VISUALIZE`, mirroring the DuckDB style. This means
Expand Down Expand Up @@ -37,6 +53,7 @@ portion (#364).
- Removed polars from dependency list along with all its transient dependencies. Rewrote DataFrame struct on top of arrow (#350)
- Moved ggsql-python to its own repo (posit-dev/ggsql-python) and cleaned up any additional references to it
- Moved ggsql-r to its own repo (posit-dev/ggsql-r)
- The `orientation` setting on `ribbon` and `range` layers. With explicit `xmin`/`xmax` or `ymin`/`ymax` mappings, orientation is unambiguous and is auto-detected from the mappings; the override is no longer needed.

## [2.7.0] - 2026-04-20

39 changes: 39 additions & 0 deletions doc/syntax/clause/draw.qmd
Expand Up @@ -76,6 +76,45 @@ The `SETTING` clause can be used for two different things:
#### Position
A special setting is `position`, which controls how overlapping objects are repositioned to avoid overlap. Position adjustments have special mapping requirements, so not every position adjustment is relevant for every layer type. Different layers have different defaults, as detailed in their documentation. You can read about each position adjustment in [its own documentation page](../index.qmd#position-adjustments).

#### Aggregate
Some layers support aggregation of their data through the `aggregate` setting; the documentation of each such layer says so. By default, `aggregate` collapses each group to a single row, replacing every numeric mapping in place with its aggregated value. Groups are defined by `PARTITION BY` together with all discrete mappings.

The setting takes a single string or an array of strings. Each string is one of:

* **Default** — `'<func>'` (no prefix). With one default the function applies to every untargeted numeric mapping. With two defaults the first is used for the lower side of range layers (e.g. `x`/`xmin`) plus all non-range layers, and the second is used for the upper side of range layers (e.g. `xend`/`xmax`). More than two defaults is an error.
* **Target** — `'<aes>:<func>'`. Applies `func` to the named aesthetic only (`<aes>` is a user-facing name like `x`, `y`, `xmin`, `xmax`, `xend`, `yend`, `color`, `size`, …). A target overrides any default for that aesthetic.

A numeric mapping that has neither a target nor an applicable default is dropped from the layer with a warning.
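
The default/target resolution described above can be modelled in a few lines of Python. This is only an illustrative sketch of the documented rules, not ggsql's implementation: the helper names `parse_aggregate` and `resolve` are hypothetical, and the sketch assumes that "upper side" aesthetics are exactly those ending in `max` or `end`.

```python
# Illustrative model of the documented rules; not ggsql's actual code.
UPPER_SUFFIXES = ("max", "end")  # assumption: marks the upper side of range layers


def parse_aggregate(setting):
    """Split an aggregate setting into (defaults, targets)."""
    entries = [setting] if isinstance(setting, str) else list(setting)
    defaults, targets = [], {}
    for entry in entries:
        if ":" in entry:
            # '<aes>:<func>' is a per-aesthetic target
            aes, func = entry.split(":", 1)
            targets.setdefault(aes, []).append(func)
        else:
            # an unprefixed '<func>' is a default
            defaults.append(entry)
    if len(defaults) > 2:
        raise ValueError("more than two defaults is an error")
    return defaults, targets


def resolve(aes, defaults, targets):
    """Return the function names used for `aes`, or None if it would be dropped."""
    if aes in targets:
        return targets[aes]  # a target overrides any default
    if not defaults:
        return None  # no target and no default: dropped with a warning
    if len(defaults) == 2 and aes.endswith(UPPER_SUFFIXES):
        return [defaults[1]]  # second default covers the upper side
    return [defaults[0]]
```

For instance, with `('mean-1.96sdev', 'mean+1.96sdev')` the sketch resolves `ymin` to the first entry and `ymax` to the second, while `('mean', 'y:max')` resolves `y` to `max` and every other numeric aesthetic to `mean`.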

You can also target the same aesthetic more than once to produce **multiple rows per group** — one for each function. For example `aggregate => ('y:min', 'y:max')` emits a min row and a max row per group, so a single `DRAW line` produces two summary lines that connect within each group rather than across them.

The stat exposes a synthetic `aggregate` column tagging each row, which you can pick up with a `REMAPPING` to drive another aesthetic — e.g. `REMAPPING aggregate AS stroke` to colour the two lines differently. The column's value is built from the per-row function names of the *exploded* targets, deduplicated, and joined with `/`:

* `aggregate => ('y:min', 'y:max')` → rows tagged `'min'`, `'max'`.
* `aggregate => ('y:min', 'y:max', 'color:sum', 'color:prod')` → rows tagged `'min/sum'`, `'max/prod'`.
* `aggregate => ('y:mean', 'y:max', 'color:mean', 'color:prod')` → rows tagged `'mean'`, `'max/prod'` (the duplicate `'mean'` collapses).
* `aggregate => ('y:min', 'y:max', 'color:median')` → rows tagged `'min'`, `'max'` (the single-function `color` target is recycled across rows and is not part of the label).

When several aesthetics are targeted with the same number of functions, they explode in lockstep (row 1 uses each aesthetic's first function, row 2 the second, and so on); aesthetics with a single function — and the unprefixed defaults — are reused unchanged across every row. Mixing different lengths above 1 is an error.
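
The labelling rule can be checked against the examples above with a small Python model. It is a sketch under stated assumptions rather than ggsql's implementation: `explode_labels` is a hypothetical name, the input is a mapping from aesthetic to its list of targeted functions, and the join order is assumed to follow the order in which aesthetics first appear in the setting.

```python
# Illustrative model of the synthetic `aggregate` label; not ggsql's actual code.
def explode_labels(targets):
    """Return one label per exploded row, or [] in the single-row case."""
    # Only aesthetics targeted more than once take part in the explosion.
    exploded = {aes: funcs for aes, funcs in targets.items() if len(funcs) > 1}
    lengths = {len(funcs) for funcs in exploded.values()}
    if len(lengths) > 1:
        raise ValueError("mixing different lengths above 1 is an error")
    if not exploded:
        return []  # reduction case: no synthetic column is added
    labels = []
    for row in range(lengths.pop()):
        names = []
        for funcs in exploded.values():
            if funcs[row] not in names:  # deduplicate within the row
                names.append(funcs[row])
        labels.append("/".join(names))
    return labels


print(explode_labels({"y": ["min", "max"], "color": ["sum", "prod"]}))
print(explode_labels({"y": ["min", "max"], "color": ["median"]}))
```

Single-function targets such as `color:median` are filtered out before labelling, which is why they never appear in the label.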

The simple functions are:

* `'count'`: Non-null tally of the bound column.
* `'sum'` and `'prod'`: The sum or product
* `'min'`, `'max'`, and `'range'`: The extremes and their difference (max - min)
* `'mean'` and `'median'`: Central tendency
* `'geomean'`, `'harmean'`, and `'rms'`: Geometric mean, harmonic mean, and root mean square
* `'sdev'`, `'var'`, `'iqr'`, and `'se'`: Standard deviation, variance, interquartile range, and standard error
* `'p05'`, `'p10'`, `'p25'`, `'p50'`, `'p75'`, `'p90'`, and `'p95'`: Percentiles
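
For readers unfamiliar with the less common names, the following Python sketch shows the textbook definitions of a few of them using only the standard library. It is not ggsql's implementation, and the conventions it picks are assumptions: `se` is taken as the sample standard deviation divided by the square root of the number of values, and `None` stands in for a null.

```python
# Textbook definitions for a few of the listed functions; conventions are assumptions.
import math
import statistics


def count(values):
    # non-null tally (None models a null here)
    return sum(v is not None for v in values)


def value_range(values):
    # 'range': max - min
    return max(values) - min(values)


def geomean(values):
    return statistics.geometric_mean(values)


def harmean(values):
    return statistics.harmonic_mean(values)


def rms(values):
    # root mean square: square root of the mean of the squares
    return math.sqrt(statistics.fmean([v * v for v in values]))


def se(values):
    # standard error, assuming the sample standard deviation
    return statistics.stdev(values) / math.sqrt(len(values))
```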

Band functions combine an offset with an expansion that is optionally scaled by a multiplier. For example, `'mean-1.96sdev'` evaluates to the mean minus 1.96 standard deviations. The general form is `<offset>±<multiplier><expansion>`, where `±` stands for either `+` or `-` and `<multiplier>` is optional (defaults to `1`).

Allowed offsets are: `'mean'`, `'median'`, `'geomean'`, `'harmean'`, `'rms'`, `'sum'`, `'prod'`, `'min'`, `'max'`, and `'p05'`–`'p95'`

Allowed expansions are: `'sdev'`, `'se'`, `'var'`, `'iqr'`, and `'range'`
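
The band grammar can be illustrated with a small Python parser. This is a sketch of the documented form, not ggsql's parser: `parse_band` and `evaluate_mean_sdev` are hypothetical helpers, the percentile offsets are assumed to be the seven listed among the simple functions, and the evaluation assumes the sample standard deviation.

```python
# Illustrative parser for '<offset>±<multiplier><expansion>'; not ggsql's actual code.
import re
import statistics

OFFSETS = {"mean", "median", "geomean", "harmean", "rms", "sum", "prod",
           "min", "max", "p05", "p10", "p25", "p50", "p75", "p90", "p95"}
EXPANSIONS = {"sdev", "se", "var", "iqr", "range"}

BAND = re.compile(
    r"^(?P<offset>[a-z]+\d*)(?P<sign>[+-])"
    r"(?P<mult>\d+(?:\.\d+)?)?(?P<expansion>[a-z]+)$"
)


def parse_band(spec):
    """Return (offset, signed_multiplier, expansion) for a band function."""
    m = BAND.match(spec)
    if m is None or m["offset"] not in OFFSETS or m["expansion"] not in EXPANSIONS:
        raise ValueError(f"not a band function: {spec!r}")
    sign = 1.0 if m["sign"] == "+" else -1.0
    mult = float(m["mult"]) if m["mult"] else 1.0  # multiplier defaults to 1
    return m["offset"], sign * mult, m["expansion"]


def evaluate_mean_sdev(spec, values):
    """Evaluate the mean/sdev case only, to keep the sketch short."""
    offset, signed_mult, expansion = parse_band(spec)
    if (offset, expansion) != ("mean", "sdev"):
        raise NotImplementedError("sketch covers only mean with sdev")
    return statistics.fmean(values) + signed_mult * statistics.stdev(values)
```

With this model, `'mean-1.96sdev'` and `'mean+1.96sdev'` evaluate to the two ends of an interval centred on the mean, which is the pairing a two-default setting on a range layer would use.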

In the single-row (reduction) case aggregation applies in place — no `REMAPPING` is needed and no synthetic column is added. Only the multi-row (explosion) case described above introduces the synthetic `aggregate` column.

### `FILTER`
```ggsql
FILTER <condition>
5 changes: 4 additions & 1 deletion doc/syntax/layer/type/area.qmd
Expand Up @@ -25,9 +25,12 @@ The following aesthetics are recognised by the area layer.
* `orientation`: The orientation of the layer, see the [Orientation section](#orientation). One of the following:
* `'aligned'` to align the layer's primary axis with the coordinate system's first axis.
* `'transposed'` to align the layer's primary axis with the coordinate system's second axis.
* `aggregate`: Aggregation functions to apply per group. Either a single string or an array of strings. See an overview of the aggregation functions in [the `DRAW` documentation](../../clause/draw.qmd#aggregate) and more information in the *Data transformation* section below.

## Data transformation
The area layer sorts the data along its primary axis
This layer supports aggregation through the `aggregate` setting. Within each group, defined by `PARTITION BY`, all discrete mappings, and the primary axis, every numeric mapping is replaced in place by its aggregated value. Use a default like `'mean'` or target individual aesthetics with `'<aes>:<func>'`. See [the `DRAW` documentation](../../clause/draw.qmd#aggregate) for the full setting shape.

Further, the area layer sorts the data along its primary axis before returning it.

## Orientation
Area plots are sorted and connected along their primary axis. Since the primary axis cannot be deduced from the mapping it must be specified using the `orientation` setting. E.g. if you wish to create a vertical area plot you need to set `orientation => 'transposed'` to indicate that the primary layer axis follows the second axis of the coordinate system.
15 changes: 15 additions & 0 deletions doc/syntax/layer/type/bar.qmd
Expand Up @@ -25,10 +25,13 @@ The bar layer has no required aesthetics
## Settings
* `position`: Position adjustment. One of `'identity'`, `'stack'` (default), `'dodge'`, or `'jitter'`
* `width`: The width of the bars as a proportion of the available width (0 to 1)
* `aggregate`: Aggregation functions to apply per group if the secondary position has been mapped. Either a single string or an array of strings. See an overview of the aggregation functions in [the `DRAW` documentation](../../clause/draw.qmd#aggregate) and more information in the *Data transformation* section below.

## Data transformation
If the secondary axis has not been mapped the layer will calculate counts for you and display these as the secondary axis.

If the secondary axis has been mapped you can apply aggregation through the `aggregate` setting. Within each group, defined by `PARTITION BY`, all discrete mappings, and the primary axis, every numeric mapping is replaced in place by its aggregated value. Use a default like `'mean'` or target individual aesthetics with `'<aes>:<func>'`. See [the `DRAW` documentation](../../clause/draw.qmd#aggregate) for the full setting shape.

### Properties

* `weight`: If mapped, the sum of the weights within each group is calculated instead of the count in each group
Expand Down Expand Up @@ -116,3 +119,15 @@ DRAW bar
MAPPING species AS fill
PROJECT TO polar
```

Use a different type of aggregation for the bars through the `aggregate` setting. The `range` layer needs both `ymin` and `ymax` mapped; with two defaults, the first is applied to the lower bound and the second to the upper bound.

```{ggsql}
VISUALISE species AS x FROM ggsql:penguins
DRAW bar
MAPPING body_mass AS y
SETTING aggregate => 'mean', fill => 'steelblue'
DRAW range
MAPPING body_mass AS ymin, body_mass AS ymax
SETTING aggregate => ('mean-1.96sdev', 'mean+1.96sdev')
```
19 changes: 17 additions & 2 deletions doc/syntax/layer/type/line.qmd
Expand Up @@ -24,11 +24,16 @@ The following aesthetics are recognised by the line layer.
* `orientation`: The orientation of the layer, see the [Orientation section](#orientation). One of the following:
* `'aligned'` to align the layer's primary axis with the coordinate system's first axis.
* `'transposed'` to align the layer's primary axis with the coordinate system's second axis.
* `aggregate`: Aggregation functions to apply per group. Either a single string or an array of strings. See an overview of the aggregation functions in [the `DRAW` documentation](../../clause/draw.qmd#aggregate) and more information in the *Data transformation* section below.

## Data transformation
The line layer sorts the data along its primary axis.
This layer supports aggregation through the `aggregate` setting. Within each group, defined by `PARTITION BY`, all discrete mappings, and the primary axis, every numeric mapping is replaced in place by its aggregated value to produce a summary trace. Use a default like `'mean'` to summarise the secondary axis, or target other aesthetics with `'<aes>:<func>'` (e.g. `'color:median'`). To draw min/max envelope lines, target the secondary axis more than once (e.g. `aggregate => ('y:min', 'y:max')`) to get one line per function, or use a [`range` layer](range.qmd) for a single range mark. See [the `DRAW` documentation](../../clause/draw.qmd#aggregate) for the full setting shape.

Further, the line layer sorts the data along its primary axis before returning it.

If the line has a variable `stroke` or `opacity` aesthetic within groups, the line is broken into segments.
Each segment gets the property of the preceding datapoint, so the last datapoint in a group does not transfer these properties.
This behavior is not compatible with aggregation.

## Orientation
Line plots are sorted and connected along their primary axis. Since the primary axis cannot be deduced from the mapping it must be specified using the `orientation` setting. If you wish to create a vertical line plot, you need to set `orientation => 'transposed'` to indicate that the primary layer axis follows the second axis of the coordinate system.
Expand Down Expand Up @@ -89,4 +94,14 @@ VISUALISE x, y FROM data
DRAW line
MAPPING z AS linewidth
SCALE linewidth TO (0, 30)
```
```

Use aggregation to draw min and max lines from a set of observations on a single layer. Targeting `y` twice produces one summary line per function within the same layer, with a synthetic `aggregate` column tagging each row that you can remap to colour the lines distinctly:

```{ggsql}
VISUALISE Day AS x, Temp AS y FROM ggsql:airquality
DRAW line
REMAPPING aggregate AS stroke
SETTING aggregate => ('y:min', 'y:max')
DRAW point
```
3 changes: 2 additions & 1 deletion doc/syntax/layer/type/point.qmd
Expand Up @@ -23,9 +23,10 @@ The following aesthetics are recognised by the point layer.

## Settings
* `position`: Position adjustment. One of `'identity'` (default), `'stack'`, `'dodge'`, or `'jitter'`
* `aggregate`: Aggregation functions to apply per group. Either a single string or an array of strings. See an overview of the aggregation functions in [the `DRAW` documentation](../../clause/draw.qmd#aggregate) and more information in the *Data transformation* section below.

## Data transformation
The point layer does not transform its data but passes it through unchanged
This layer supports aggregation through the `aggregate` setting. Within each group, defined by `PARTITION BY` and all discrete mappings, every numeric mapping is replaced in place by its aggregated value. Use a default like `'mean'` or target individual aesthetics with `'<aes>:<func>'`. See [the `DRAW` documentation](../../clause/draw.qmd#aggregate) for the full setting shape.

## Orientation
The point layer has no orientation. The axes are treated symmetrically.
3 changes: 2 additions & 1 deletion doc/syntax/layer/type/range.qmd
Expand Up @@ -22,9 +22,10 @@ The following aesthetics are recognised by the range layer.

## Settings
* `width`: The width of the hinges in points (must be >= 0). Defaults to 10. Can be set to `null` to not display hinges.
* `aggregate`: Aggregation functions to apply per group. Either a single string or an array of strings. See [the `DRAW` documentation](../../clause/draw.qmd#aggregate) and the *Data transformation* section below.

## Data transformation
The range layer does not transform its data but passes it through unchanged.
This layer supports aggregation through the `aggregate` setting. Within each group, defined by `PARTITION BY` and all discrete mappings, every numeric mapping is replaced in place by its aggregated value, producing one range per group. Range is a range layer: with two defaults the first applies to the start point (`xmin`/`ymin`) and the second applies to the end point (`xmax`/`ymax`). Use a single default like `'mean'` to apply the same function to all values, or target individual aesthetics with `'<aes>:<func>'`. See [the `DRAW` documentation](../../clause/draw.qmd#aggregate) for the full setting shape.

## Orientation
The orientation of range layers is deduced directly from the mapping, because the interval is mapped to the secondary axis. To create a horizontal range layer, you map the independent variable to `y` instead of `x` and the interval to `xmin` and `xmax` (assuming a default Cartesian coordinate system).
3 changes: 2 additions & 1 deletion doc/syntax/layer/type/ribbon.qmd
Expand Up @@ -23,9 +23,10 @@ The following aesthetics are recognised by the ribbon layer.

## Settings
* `position`: Position adjustment. One of `'identity'` (default), `'stack'`, `'dodge'`, or `'jitter'`
* `aggregate`: Aggregation functions to apply per group. Either a single string or an array of strings. See [the `DRAW` documentation](../../clause/draw.qmd#aggregate) and the *Data transformation* section below.

## Data transformation
The ribbon layer sorts the data along its primary axis
This layer supports aggregation through the `aggregate` setting. Within each group, defined by `PARTITION BY` and all discrete mappings, every numeric mapping is replaced in place by its aggregated value, producing one ribbon per group. Ribbon is a range layer: with two defaults the first applies to the start point (`xmin`/`ymin`) and the second applies to the end point (`xmax`/`ymax`). Use a single default like `'mean'` to apply the same function to all values, or target individual aesthetics with `'<aes>:<func>'`. See [the `DRAW` documentation](../../clause/draw.qmd#aggregate) for the full setting shape.

## Orientation
Ribbon layers are sorted and connected along their primary axis. The orientation is deduced directly from the mapping, because the interval is mapped to the secondary axis. To create a vertical ribbon layer you map the independent variable to `y` instead of `x` and the interval to `xmin` and `xmax` (assuming a default Cartesian coordinate system).
14 changes: 13 additions & 1 deletion doc/syntax/layer/type/rule.qmd
Expand Up @@ -25,8 +25,10 @@ The following aesthetics are recognised by the rule layer.

## Settings
* `position`: Position adjustment. One of `'identity'` (default), `'stack'`, `'dodge'`, or `'jitter'`
* `aggregate`: Aggregation functions to apply per group. Either a single string or an array of strings. See an overview of the aggregation functions in [the `DRAW` documentation](../../clause/draw.qmd#aggregate) and more information in the *Data transformation* section below.

## Data transformation
This layer supports aggregation through the `aggregate` setting. Within each group, defined by `PARTITION BY` and all discrete mappings, every numeric mapping is replaced in place by its aggregated value. Use a default like `'mean'` or target individual aesthetics with `'<aes>:<func>'`. See [the `DRAW` documentation](../../clause/draw.qmd#aggregate) for the full setting shape.

For diagonal lines, the position aesthetic determines the intercept:

Expand Down Expand Up @@ -110,4 +112,14 @@ VISUALISE FROM ggsql:penguins
intercept AS y,
label AS colour
FROM lines
```
```

Show a rule at the maximum of a time series:

```{ggsql}
VISUALISE Temp AS y FROM ggsql:airquality
DRAW line
MAPPING Date AS x
DRAW rule
SETTING aggregate => 'max'
```
3 changes: 2 additions & 1 deletion doc/syntax/layer/type/segment.qmd
Expand Up @@ -25,9 +25,10 @@ For axis-aligned intervals where one coordinate is shared between the start and

## Settings
* `position`: Position adjustment. One of `'identity'` (default), `'stack'`, `'dodge'`, or `'jitter'`
* `aggregate`: Aggregation functions to apply per group. Either a single string or an array of strings. See an overview of the aggregation functions in [the `DRAW` documentation](../../clause/draw.qmd#aggregate) and more information in the *Data transformation* section below.

## Data transformation
The segment layer does not transform its data but passes it through unchanged.
This layer supports aggregation through the `aggregate` setting. Within each group, defined by `PARTITION BY` and all discrete mappings, every numeric mapping is replaced in place by its aggregated value, producing one segment per group. Segment is a range layer: with two defaults the first applies to the start point (`x`/`y`) and the second applies to the end point (`xend`/`yend`). Use a single default like `'mean'` to apply the same function to all four endpoints, or target individual aesthetics with `'<aes>:<func>'`. See [the `DRAW` documentation](../../clause/draw.qmd#aggregate) for the full setting shape.

## Orientation
The segment layer has no orientations. The axes are treated symmetrically.