From 3e1c24c135339c3042f522e091236df303a62a52 Mon Sep 17 00:00:00 2001
From: 0xPoe
Date: Tue, 24 Mar 2026 09:53:55 +0100
Subject: [PATCH 01/14] docs: deprecate stats v1 in 8.5 docs

---
 .../information-schema-analyze-status.md      |  6 +++++-
 sql-statements/sql-statement-analyze-table.md |  4 ++++
 .../sql-statement-show-analyze-status.md      |  4 ++++
 statistics.md                                 | 18 +++++++++++-------
 system-variables.md                           |  4 ++++
 5 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/information-schema/information-schema-analyze-status.md b/information-schema/information-schema-analyze-status.md
index 4a57f1433ed7d..d232f7929385d 100644
--- a/information-schema/information-schema-analyze-status.md
+++ b/information-schema/information-schema-analyze-status.md
@@ -7,6 +7,10 @@ summary: Learn the `ANALYZE_STATUS` information_schema table.
 
 The `ANALYZE_STATUS` table provides information about the running tasks that collect statistics and a limited number of history tasks.
 
+> **Warning:**
+>
+> Statistics Version 1 (`tidb_analyze_version = 1`) is deprecated and will be removed in a future release. It is recommended that you use Statistics Version 2 (`tidb_analyze_version = 2`).
+
 Starting from TiDB v6.1.0, the `ANALYZE_STATUS` table supports showing cluster-level tasks. Even after a TiDB restart, you can still view task records before the restart using this table. Before TiDB v6.1.0, the `ANALYZE_STATUS` table can only show instance-level tasks, and task records are cleared after a TiDB restart.
 
 Starting from TiDB v6.1.0, you can view the history tasks within the last 7 days through the system table `mysql.analyze_jobs`.
@@ -79,4 +83,4 @@ Fields in the `ANALYZE_STATUS` table are described as follows:
 ## See also
 
 - [`ANALYZE TABLE`](/sql-statements/sql-statement-analyze-table.md)
-- [`SHOW ANALYZE STATUS`](/sql-statements/sql-statement-show-analyze-status.md)
\ No newline at end of file
+- [`SHOW ANALYZE STATUS`](/sql-statements/sql-statement-show-analyze-status.md)
diff --git a/sql-statements/sql-statement-analyze-table.md b/sql-statements/sql-statement-analyze-table.md
index b277481c3f634..9525514004055 100644
--- a/sql-statements/sql-statement-analyze-table.md
+++ b/sql-statements/sql-statement-analyze-table.md
@@ -9,6 +9,10 @@ This statement updates the statistics that TiDB builds on tables and indexes. It
 
 TiDB will also automatically update its statistics over time as it discovers that they are inconsistent with its own estimates.
 
+> **Warning:**
+>
+> Statistics Version 1 (`tidb_analyze_version = 1`) is deprecated and will be removed in a future release. It is recommended that you use Statistics Version 2 (`tidb_analyze_version = 2`). For details, see [Introduction to Statistics](/statistics.md#versions-of-statistics).
+
 Currently, TiDB collects statistical information as a full collection by using the `ANALYZE TABLE` statement. For more information, see [Introduction to statistics](/statistics.md).
 
 ## Synopsis
diff --git a/sql-statements/sql-statement-show-analyze-status.md b/sql-statements/sql-statement-show-analyze-status.md
index 59efee4dc4483..0fadf8e6f5f2b 100644
--- a/sql-statements/sql-statement-show-analyze-status.md
+++ b/sql-statements/sql-statement-show-analyze-status.md
@@ -7,6 +7,10 @@ summary: An overview of the usage of SHOW ANALYZE STATUS for the TiDB database.
 
 The `SHOW ANALYZE STATUS` statement shows the statistics collection tasks being executed by TiDB and a limited number of historical task records.
 
+> **Warning:**
+>
+> Statistics Version 1 (`tidb_analyze_version = 1`) is deprecated and will be removed in a future release. The example in this document includes Version 1 output only for comparison with Version 2.
+
 Starting from TiDB v6.1.0, the `SHOW ANALYZE STATUS` statement supports showing cluster-level tasks. Even after a TiDB restart, you can still view task records before the restart using this statement. Before TiDB v6.1.0, the `SHOW ANALYZE STATUS` statement can only show instance-level tasks, and task records are cleared after a TiDB restart.
 
 Starting from TiDB v6.1.0, you can view the history tasks within the last 7 days through the system table `mysql.analyze_jobs`.
diff --git a/statistics.md b/statistics.md
index 39db445c5a660..589a5f36ffac2 100644
--- a/statistics.md
+++ b/statistics.md
@@ -355,13 +355,17 @@ WHERE db_name = 'test' AND table_name = 't' AND last_analyzed_at IS NOT NULL;
 
 ## Versions of statistics
 
-The [`tidb_analyze_version`](/system-variables.md#tidb_analyze_version-new-in-v510) variable controls the statistics collected by TiDB. Currently, two versions of statistics are supported: `tidb_analyze_version = 1` and `tidb_analyze_version = 2`.
+> **Warning:**
+>
+> Statistics Version 1 (`tidb_analyze_version = 1`) is deprecated and will be removed in a future release. It is recommended that you use Statistics Version 2 (`tidb_analyze_version = 2`) and migrate existing analyzed objects to Version 2.
+
+The [`tidb_analyze_version`](/system-variables.md#tidb_analyze_version-new-in-v510) variable controls the statistics collected by TiDB. Currently, TiDB supports two statistics versions: `tidb_analyze_version = 1` and `tidb_analyze_version = 2`.
 
 - For TiDB Self-Managed, the default value of this variable changes from `1` to `2` starting from v5.3.0.
 - For TiDB Cloud, the default value of this variable changes from `1` to `2` starting from v6.5.0.
 - If your cluster is upgraded from an earlier version, the default value of `tidb_analyze_version` does not change after the upgrade.
 
-Version 2 is preferred, and will continue to be enhanced to ultimately replace Version 1 completely. Compared to Version 1, Version 2 improves the accuracy of many of the statistics collected for larger data volumes. Version 2 also improves collection performance by removing the need to collect Count-Min sketch statistics for predicate selectivity estimation, and also supporting automated collection only on selected columns (see [Collecting statistics on some columns](#collect-statistics-on-some-columns)).
+Version 2 is the recommended statistics version. Compared to Version 1, Version 2 improves the accuracy of many statistics for larger data volumes. Version 2 also improves collection performance by removing the need to collect Count-Min sketch statistics for predicate selectivity estimation, and it supports automated collection only on selected columns (see [Collecting statistics on some columns](#collect-statistics-on-some-columns)).
 
 The following table lists the information collected by each version for usage in the optimizer estimates:
 
@@ -376,11 +380,11 @@ The following table lists the information collected by each version for usage in
 
 ### Switch between statistics versions
 
-It is recommended to ensure that all tables/indexes (and partitions) utilize statistics collection from the same version. Version 2 is recommended, however, it is not recommended to switch from one version to another without a justifiable reason such as an issue experienced with the version in use. A switch between versions might take a period of time when no statistics are available until all tables have been analyzed with the new version, which might negatively affect the optimizer plan choices if statistics are not available.
+It is recommended to ensure that all tables, indexes, and partitions use the same statistics version. If your cluster still uses Statistics Version 1, migrate to Statistics Version 2 as soon as possible. During the migration, there might be a period when some objects temporarily do not have statistics until `ANALYZE` finishes, which might negatively affect optimizer plan choices.
 
-Examples of justifications to switch might include - with Version 1, there could be inaccuracies in equal/IN predicate estimation due to hash collisions when collecting Count-Min sketch statistics. Solutions are listed in the [Count-Min Sketch](#count-min-sketch) section. Alternatively, setting `tidb_analyze_version = 2` and rerunning `ANALYZE` on all objects is also a solution. In the early release of Version 2, there was a risk of memory overflow after `ANALYZE`. This issue is resolved, but initially, one solution was to set `tidb_analyze_version = 1` and rerun `ANALYZE` on all objects.
+One common reason to migrate is that Version 1 might produce inaccurate equal/IN predicate estimates because Count-Min sketch can have hash collisions. For details, see [Count-Min Sketch](#count-min-sketch). Setting `tidb_analyze_version = 2` and rerunning `ANALYZE` on all objects avoids this issue.
 
-To prepare `ANALYZE` for switching between versions:
+To prepare `ANALYZE` for migrating from Statistics Version 1 to Statistics Version 2:
 
 - If the `ANALYZE` statement is executed manually, manually analyze every table to be analyzed.
 
@@ -388,7 +392,7 @@ To prepare `ANALYZE` for switching between versions:
     SELECT DISTINCT(CONCAT('ANALYZE TABLE ', table_schema, '.', table_name, ';'))
     FROM information_schema.tables JOIN mysql.stats_histograms
     ON table_id = tidb_table_id
-    WHERE stats_ver = 2;
+    WHERE stats_ver = 1;
     ```
 
 - If TiDB automatically executes the `ANALYZE` statement because the auto-analysis has been enabled, execute the following statement that generates the [`DROP STATS`](/sql-statements/sql-statement-drop-stats.md) statement:
@@ -397,7 +401,7 @@ To prepare `ANALYZE` for switching between versions:
     SELECT DISTINCT(CONCAT('DROP STATS ', table_schema, '.', table_name, ';'))
     FROM information_schema.tables JOIN mysql.stats_histograms
     ON table_id = tidb_table_id
-    WHERE stats_ver = 2;
+    WHERE stats_ver = 1;
     ```
 
 - If the result of the preceding statement is too long to copy and paste, you can export the result to a temporary text file and then perform execution from the file like this:
diff --git a/system-variables.md b/system-variables.md
index dd168da0d23e5..e6e79bf1aada8 100644
--- a/system-variables.md
+++ b/system-variables.md
@@ -1129,6 +1129,10 @@ MPP is a distributed computing framework provided by the TiFlash engine, which a
 
 ### tidb_analyze_version New in v5.1.0
 
+> **Warning:**
+>
+> Statistics Version 1 (`tidb_analyze_version = 1`) is deprecated and will be removed in a future release. It is recommended that you use `tidb_analyze_version = 2`.
+
 - Scope: SESSION | GLOBAL
 - Persists to cluster: Yes
 - Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No

From 928cc5427d4ac1cd3fc1ffb7f94c8505ef9c7604 Mon Sep 17 00:00:00 2001
From: 0xPoe
Date: Tue, 24 Mar 2026 11:08:28 +0100
Subject: [PATCH 02/14] docs: remove extra stats v1 warnings

---
 information-schema/information-schema-analyze-status.md | 4 ----
 sql-statements/sql-statement-analyze-table.md           | 4 ----
 sql-statements/sql-statement-show-analyze-status.md     | 4 ----
 3 files changed, 12 deletions(-)

diff --git a/information-schema/information-schema-analyze-status.md b/information-schema/information-schema-analyze-status.md
index d232f7929385d..837f319a2d408 100644
--- a/information-schema/information-schema-analyze-status.md
+++ b/information-schema/information-schema-analyze-status.md
@@ -7,10 +7,6 @@ summary: Learn the `ANALYZE_STATUS` information_schema table.
 
 The `ANALYZE_STATUS` table provides information about the running tasks that collect statistics and a limited number of history tasks.
 
-> **Warning:**
->
-> Statistics Version 1 (`tidb_analyze_version = 1`) is deprecated and will be removed in a future release. It is recommended that you use Statistics Version 2 (`tidb_analyze_version = 2`).
-
 Starting from TiDB v6.1.0, the `ANALYZE_STATUS` table supports showing cluster-level tasks. Even after a TiDB restart, you can still view task records before the restart using this table. Before TiDB v6.1.0, the `ANALYZE_STATUS` table can only show instance-level tasks, and task records are cleared after a TiDB restart.
 
 Starting from TiDB v6.1.0, you can view the history tasks within the last 7 days through the system table `mysql.analyze_jobs`.
diff --git a/sql-statements/sql-statement-analyze-table.md b/sql-statements/sql-statement-analyze-table.md
index 9525514004055..b277481c3f634 100644
--- a/sql-statements/sql-statement-analyze-table.md
+++ b/sql-statements/sql-statement-analyze-table.md
@@ -9,10 +9,6 @@ This statement updates the statistics that TiDB builds on tables and indexes. It
 
 TiDB will also automatically update its statistics over time as it discovers that they are inconsistent with its own estimates.
 
-> **Warning:**
->
-> Statistics Version 1 (`tidb_analyze_version = 1`) is deprecated and will be removed in a future release. It is recommended that you use Statistics Version 2 (`tidb_analyze_version = 2`). For details, see [Introduction to Statistics](/statistics.md#versions-of-statistics).
-
 Currently, TiDB collects statistical information as a full collection by using the `ANALYZE TABLE` statement. For more information, see [Introduction to statistics](/statistics.md).
 
 ## Synopsis
diff --git a/sql-statements/sql-statement-show-analyze-status.md b/sql-statements/sql-statement-show-analyze-status.md
index 0fadf8e6f5f2b..59efee4dc4483 100644
--- a/sql-statements/sql-statement-show-analyze-status.md
+++ b/sql-statements/sql-statement-show-analyze-status.md
@@ -7,10 +7,6 @@ summary: An overview of the usage of SHOW ANALYZE STATUS for the TiDB database.
 
 The `SHOW ANALYZE STATUS` statement shows the statistics collection tasks being executed by TiDB and a limited number of historical task records.
 
-> **Warning:**
->
-> Statistics Version 1 (`tidb_analyze_version = 1`) is deprecated and will be removed in a future release. The example in this document includes Version 1 output only for comparison with Version 2.
-
 Starting from TiDB v6.1.0, the `SHOW ANALYZE STATUS` statement supports showing cluster-level tasks. Even after a TiDB restart, you can still view task records before the restart using this statement. Before TiDB v6.1.0, the `SHOW ANALYZE STATUS` statement can only show instance-level tasks, and task records are cleared after a TiDB restart.
 
 Starting from TiDB v6.1.0, you can view the history tasks within the last 7 days through the system table `mysql.analyze_jobs`.

From 5fd380538dc67aec3cbc2e024a7f184e0c0d7d87 Mon Sep 17 00:00:00 2001
From: 0xPoe
Date: Tue, 24 Mar 2026 11:20:19 +0100
Subject: [PATCH 03/14] docs: clarify stats v2 summary

---
 statistics.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/statistics.md b/statistics.md
index 589a5f36ffac2..1f52760863986 100644
--- a/statistics.md
+++ b/statistics.md
@@ -365,7 +365,7 @@ The [`tidb_analyze_version`](/system-variables.md#tidb_analyze_version-new-in-v5
 - For TiDB Cloud, the default value of this variable changes from `1` to `2` starting from v6.5.0.
 - If your cluster is upgraded from an earlier version, the default value of `tidb_analyze_version` does not change after the upgrade.
 
-Version 2 is the recommended statistics version. Compared to Version 1, Version 2 improves the accuracy of many statistics for larger data volumes. Version 2 also improves collection performance by removing the need to collect Count-Min sketch statistics for predicate selectivity estimation, and it supports automated collection only on selected columns (see [Collecting statistics on some columns](#collect-statistics-on-some-columns)).
+Version 2 is the recommended statistics version. Compared to Version 1, Version 2 improves the accuracy of many statistics for larger data volumes. Version 2 also improves collection performance by removing the need to collect Count-Min sketch statistics for predicate selectivity estimation. In addition, Version 2 lets you limit column statistics collection to specific columns or `PREDICATE COLUMNS` to reduce collection overhead, but it still collects statistics on the indexed columns and all indexes together. For details, see [Collect statistics on some columns](#collect-statistics-on-some-columns).
 
 The following table lists the information collected by each version for usage in the optimizer estimates:
 

From 903480b67459760aebcb1766682c1ec62f197080 Mon Sep 17 00:00:00 2001
From: 0xPoe
Date: Tue, 24 Mar 2026 11:21:20 +0100
Subject: [PATCH 04/14] docs: trim stats v2 summary

---
 statistics.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/statistics.md b/statistics.md
index 1f52760863986..75437ee932c93 100644
--- a/statistics.md
+++ b/statistics.md
@@ -365,7 +365,7 @@ The [`tidb_analyze_version`](/system-variables.md#tidb_analyze_version-new-in-v5
 - For TiDB Cloud, the default value of this variable changes from `1` to `2` starting from v6.5.0.
 - If your cluster is upgraded from an earlier version, the default value of `tidb_analyze_version` does not change after the upgrade.
 
-Version 2 is the recommended statistics version. Compared to Version 1, Version 2 improves the accuracy of many statistics for larger data volumes. Version 2 also improves collection performance by removing the need to collect Count-Min sketch statistics for predicate selectivity estimation. In addition, Version 2 lets you limit column statistics collection to specific columns or `PREDICATE COLUMNS` to reduce collection overhead, but it still collects statistics on the indexed columns and all indexes together. For details, see [Collect statistics on some columns](#collect-statistics-on-some-columns).
+Version 2 is the recommended statistics version. Compared to Version 1, Version 2 improves the accuracy of many statistics for larger data volumes. Version 2 also improves collection performance by removing the need to collect Count-Min sketch statistics for predicate selectivity estimation.
 
 The following table lists the information collected by each version for usage in the optimizer estimates:
 

From f4c83d4db1867e5cfc6cb62e955aeb56ce1e85b2 Mon Sep 17 00:00:00 2001
From: 0xPoe
Date: Tue, 24 Mar 2026 11:23:07 +0100
Subject: [PATCH 05/14] docs: shorten stats v2 wording

---
 statistics.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/statistics.md b/statistics.md
index 75437ee932c93..9850c1ab1e2c1 100644
--- a/statistics.md
+++ b/statistics.md
@@ -365,7 +365,7 @@ The [`tidb_analyze_version`](/system-variables.md#tidb_analyze_version-new-in-v5
 - For TiDB Cloud, the default value of this variable changes from `1` to `2` starting from v6.5.0.
 - If your cluster is upgraded from an earlier version, the default value of `tidb_analyze_version` does not change after the upgrade.
 
-Version 2 is the recommended statistics version. Compared to Version 1, Version 2 improves the accuracy of many statistics for larger data volumes. Version 2 also improves collection performance by removing the need to collect Count-Min sketch statistics for predicate selectivity estimation.
+Version 2 is the recommended statistics version. Compared to Version 1, Version 2 improves the accuracy of many statistics for larger data volumes. Version 2 also improves collection performance by removing the need to collect Count-Min sketch statistics.
 
 The following table lists the information collected by each version for usage in the optimizer estimates:
 

From 1141461b925cb04e137105327a0d80c75b6e1126 Mon Sep 17 00:00:00 2001
From: 0xPoe
Date: Tue, 24 Mar 2026 11:24:44 +0100
Subject: [PATCH 06/14] docs: remove wrong migration note

---
 statistics.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/statistics.md b/statistics.md
index 9850c1ab1e2c1..37d86918af056 100644
--- a/statistics.md
+++ b/statistics.md
@@ -380,7 +380,7 @@ The following table lists the information collected by each version for usage in
 
 ### Switch between statistics versions
 
-It is recommended to ensure that all tables, indexes, and partitions use the same statistics version. If your cluster still uses Statistics Version 1, migrate to Statistics Version 2 as soon as possible. During the migration, there might be a period when some objects temporarily do not have statistics until `ANALYZE` finishes, which might negatively affect optimizer plan choices.
+It is recommended to ensure that all tables, indexes, and partitions use the same statistics version. If your cluster still uses Statistics Version 1, migrate to Statistics Version 2 as soon as possible.
 
 One common reason to migrate is that Version 1 might produce inaccurate equal/IN predicate estimates because Count-Min sketch can have hash collisions. For details, see [Count-Min Sketch](#count-min-sketch). Setting `tidb_analyze_version = 2` and rerunning `ANALYZE` on all objects avoids this issue.
 
From 32023910907a89b68f9f4046a2a9bafc4c88f56d Mon Sep 17 00:00:00 2001
From: 0xPoe
Date: Tue, 24 Mar 2026 11:26:28 +0100
Subject: [PATCH 07/14] docs: refine stats v1 migration notes

---
 statistics.md | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/statistics.md b/statistics.md
index 37d86918af056..9f07bbf051254 100644
--- a/statistics.md
+++ b/statistics.md
@@ -380,7 +380,7 @@ The following table lists the information collected by each version for usage in
 
 ### Switch between statistics versions
 
-It is recommended to ensure that all tables, indexes, and partitions use the same statistics version. If your cluster still uses Statistics Version 1, migrate to Statistics Version 2 as soon as possible.
+It is recommended to ensure that all tables, indexes, and partitions use the same statistics version. If your cluster still uses Statistics Version 1, migrate to Statistics Version 2 as soon as possible. Before Statistics Version 2 is collected for an object, TiDB can continue to use its existing Statistics Version 1.
 
 One common reason to migrate is that Version 1 might produce inaccurate equal/IN predicate estimates because Count-Min sketch can have hash collisions. For details, see [Count-Min Sketch](#count-min-sketch). Setting `tidb_analyze_version = 2` and rerunning `ANALYZE` on all objects avoids this issue.
 
@@ -395,14 +395,7 @@ To prepare `ANALYZE` for migrating from Statistics Version 1 to Statistics Versi
     WHERE stats_ver = 1;
     ```
 
-- If TiDB automatically executes the `ANALYZE` statement because the auto-analysis has been enabled, execute the following statement that generates the [`DROP STATS`](/sql-statements/sql-statement-drop-stats.md) statement:
-
-    ```sql
-    SELECT DISTINCT(CONCAT('DROP STATS ', table_schema, '.', table_name, ';'))
-    FROM information_schema.tables JOIN mysql.stats_histograms
-    ON table_id = tidb_table_id
-    WHERE stats_ver = 1;
-    ```
+- If TiDB automatically executes the `ANALYZE` statement because auto-analysis is enabled, after you set `tidb_analyze_version = 2`, TiDB gradually refreshes statistics to Version 2 through subsequent auto-analysis. Before Statistics Version 2 is collected for an object, TiDB can continue to use its existing Statistics Version 1. If you need to speed up the migration for important objects, run `ANALYZE` on them manually.
 
 - If the result of the preceding statement is too long to copy and paste, you can export the result to a temporary text file and then perform execution from the file like this:
 

From 188855486cde58efdf6f0b467cc2c80d61ceb8bf Mon Sep 17 00:00:00 2001
From: 0xPoe
Date: Tue, 24 Mar 2026 11:27:45 +0100
Subject: [PATCH 08/14] docs: clarify stats v1 wording

---
 statistics.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/statistics.md b/statistics.md
index 9f07bbf051254..42208f2b52ca1 100644
--- a/statistics.md
+++ b/statistics.md
@@ -380,7 +380,7 @@ The following table lists the information collected by each version for usage in
 
 ### Switch between statistics versions
 
-It is recommended to ensure that all tables, indexes, and partitions use the same statistics version. If your cluster still uses Statistics Version 1, migrate to Statistics Version 2 as soon as possible. Before Statistics Version 2 is collected for an object, TiDB can continue to use its existing Statistics Version 1.
+It is recommended to ensure that all tables, indexes, and partitions use the same statistics version. If your cluster still uses Statistics Version 1, migrate to Statistics Version 2 as soon as possible. Before Statistics Version 2 statistics are collected for an object, TiDB can continue to use its existing Statistics Version 1 statistics.
 
 One common reason to migrate is that Version 1 might produce inaccurate equal/IN predicate estimates because Count-Min sketch can have hash collisions. For details, see [Count-Min Sketch](#count-min-sketch). Setting `tidb_analyze_version = 2` and rerunning `ANALYZE` on all objects avoids this issue.
 
@@ -395,7 +395,7 @@ To prepare `ANALYZE` for migrating from Statistics Version 1 to Statistics Versi
     WHERE stats_ver = 1;
     ```
 
-- If TiDB automatically executes the `ANALYZE` statement because auto-analysis is enabled, after you set `tidb_analyze_version = 2`, TiDB gradually refreshes statistics to Version 2 through subsequent auto-analysis. Before Statistics Version 2 is collected for an object, TiDB can continue to use its existing Statistics Version 1. If you need to speed up the migration for important objects, run `ANALYZE` on them manually.
+- If TiDB automatically executes the `ANALYZE` statement because auto-analysis is enabled, after you set `tidb_analyze_version = 2`, TiDB gradually refreshes statistics to Version 2 through subsequent auto-analysis. Before Statistics Version 2 statistics are collected for an object, TiDB can continue to use its existing Statistics Version 1 statistics. If you need to speed up the migration for important objects, run `ANALYZE` on them manually.
 
 - If the result of the preceding statement is too long to copy and paste, you can export the result to a temporary text file and then perform execution from the file like this:
 

From 611c8e486053c45ff8f58ef0ea06cd48c70c256a Mon Sep 17 00:00:00 2001
From: 0xPoe
Date: Tue, 24 Mar 2026 11:29:35 +0100
Subject: [PATCH 09/14] docs: align stats migration terms

---
 statistics.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/statistics.md b/statistics.md
index 42208f2b52ca1..c363db092f095 100644
--- a/statistics.md
+++ b/statistics.md
@@ -380,7 +380,7 @@ The following table lists the information collected by each version for usage in
 
 ### Switch between statistics versions
 
-It is recommended to ensure that all tables, indexes, and partitions use the same statistics version. If your cluster still uses Statistics Version 1, migrate to Statistics Version 2 as soon as possible. Before Statistics Version 2 statistics are collected for an object, TiDB can continue to use its existing Statistics Version 1 statistics.
+It is recommended to ensure that all tables, indexes, and partitions use the same statistics version. If your cluster still uses Statistics Version 1, migrate to Statistics Version 2 as soon as possible. Before Version 2 statistics are collected for an object, TiDB can continue to use its existing Version 1 statistics.
 
 One common reason to migrate is that Version 1 might produce inaccurate equal/IN predicate estimates because Count-Min sketch can have hash collisions. For details, see [Count-Min Sketch](#count-min-sketch). Setting `tidb_analyze_version = 2` and rerunning `ANALYZE` on all objects avoids this issue.
 
@@ -395,7 +395,7 @@ To prepare `ANALYZE` for migrating from Statistics Version 1 to Statistics Versi
     WHERE stats_ver = 1;
     ```
 
-- If TiDB automatically executes the `ANALYZE` statement because auto-analysis is enabled, after you set `tidb_analyze_version = 2`, TiDB gradually refreshes statistics to Version 2 through subsequent auto-analysis. Before Statistics Version 2 statistics are collected for an object, TiDB can continue to use its existing Statistics Version 1 statistics. If you need to speed up the migration for important objects, run `ANALYZE` on them manually.
+- If TiDB automatically executes the `ANALYZE` statement because auto-analysis is enabled, after you set `tidb_analyze_version = 2`, TiDB gradually refreshes statistics to Version 2 through subsequent auto-analysis. Before Version 2 statistics are collected for an object, TiDB can continue to use its existing Version 1 statistics. If you need to speed up the migration for important objects, run `ANALYZE` on them manually.
 
 - If the result of the preceding statement is too long to copy and paste, you can export the result to a temporary text file and then perform execution from the file like this:
 

From ec61b06bc3ad39971283ec129d084ff2d91e5400 Mon Sep 17 00:00:00 2001
From: Grace Cai
Date: Fri, 27 Mar 2026 18:34:27 +0800
Subject: [PATCH 10/14] minor wording updates

---
 statistics.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/statistics.md b/statistics.md
index c363db092f095..fbaa88c0ad3d8 100644
--- a/statistics.md
+++ b/statistics.md
@@ -380,9 +380,9 @@ The following table lists the information collected by each version for usage in
 
 ### Switch between statistics versions
 
-It is recommended to ensure that all tables, indexes, and partitions use the same statistics version. If your cluster still uses Statistics Version 1, migrate to Statistics Version 2 as soon as possible. Before Version 2 statistics are collected for an object, TiDB can continue to use its existing Version 1 statistics.
+It is recommended that all tables, indexes, and partitions use the same statistics version. If your cluster still uses Statistics Version 1, migrate to Statistics Version 2 as soon as possible. Before Version 2 statistics are collected for an object, TiDB continues to use its existing Version 1 statistics.
 
-One common reason to migrate is that Version 1 might produce inaccurate equal/IN predicate estimates because Count-Min sketch can have hash collisions. For details, see [Count-Min Sketch](#count-min-sketch). Setting `tidb_analyze_version = 2` and rerunning `ANALYZE` on all objects avoids this issue.
+One major reason to migrate is that Version 1 might produce inaccurate estimates for equal/IN predicates because the Count-Min sketch can have hash collisions. For more information, see [Count-Min Sketch](#count-min-sketch). To avoid this issue, set `tidb_analyze_version = 2` and rerun `ANALYZE` on all objects.
 
 To prepare `ANALYZE` for migrating from Statistics Version 1 to Statistics Version 2:
 
@@ -395,7 +395,7 @@ To prepare `ANALYZE` for migrating from Statistics Version 1 to Statistics Versi
     WHERE stats_ver = 1;
     ```
 
-- If TiDB automatically executes the `ANALYZE` statement because auto-analysis is enabled, after you set `tidb_analyze_version = 2`, TiDB gradually refreshes statistics to Version 2 through subsequent auto-analysis. Before Version 2 statistics are collected for an object, TiDB can continue to use its existing Version 1 statistics. If you need to speed up the migration for important objects, run `ANALYZE` on them manually.
+- If TiDB automatically executes the `ANALYZE` statement because auto-analysis is enabled, after you set `tidb_analyze_version = 2`, TiDB gradually refreshes statistics to Version 2 through subsequent auto-analysis. Before Version 2 statistics are collected for an object, TiDB can continue to use its existing Version 1 statistics. To speed up the migration for important objects, run `ANALYZE` on them manually.
 
 - If the result of the preceding statement is too long to copy and paste, you can export the result to a temporary text file and then perform execution from the file like this:
 

From 7b8a7bbe5ce1da118ac63012dc81bb6c179430ca Mon Sep 17 00:00:00 2001
From: Grace Cai
Date: Fri, 27 Mar 2026 18:40:43 +0800
Subject: [PATCH 11/14] add v8.5.6

---
 statistics.md       | 2 +-
 system-variables.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/statistics.md b/statistics.md
index fbaa88c0ad3d8..15a9cb14084f7 100644
--- a/statistics.md
+++ b/statistics.md
@@ -357,7 +357,7 @@ WHERE db_name = 'test' AND table_name = 't' AND last_analyzed_at IS NOT NULL;
 
 > **Warning:**
 >
-> Statistics Version 1 (`tidb_analyze_version = 1`) is deprecated and will be removed in a future release. It is recommended that you use Statistics Version 2 (`tidb_analyze_version = 2`) and migrate existing analyzed objects to Version 2.
+> Starting from v8.5.6, statistics Version 1 (`tidb_analyze_version = 1`) is deprecated and will be removed in a future release. It is recommended that you use Statistics Version 2 (`tidb_analyze_version = 2`) and migrate existing analyzed objects to Version 2.
 
 The [`tidb_analyze_version`](/system-variables.md#tidb_analyze_version-new-in-v510) variable controls the statistics collected by TiDB. Currently, TiDB supports two statistics versions: `tidb_analyze_version = 1` and `tidb_analyze_version = 2`.
 
diff --git a/system-variables.md b/system-variables.md index e6e79bf1aada8..dccb8a057e384 100644 --- a/system-variables.md +++ b/system-variables.md @@ -1131,7 +1131,7 @@ MPP is a distributed computing framework provided by the TiFlash engine, which a > **Warning:** > -> Statistics Version 1 (`tidb_analyze_version = 1`) is deprecated and will be removed in a future release. It is recommended that you use `tidb_analyze_version = 2`. +> Starting from v8.5.6, statistics Version 1 (`tidb_analyze_version = 1`) is deprecated and will be removed in a future release. It is recommended that you use `tidb_analyze_version = 2`. - Scope: SESSION | GLOBAL - Persists to cluster: Yes From d5011c8ed526c8a186f0a7f4bf85558ad7d45107 Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Sat, 28 Mar 2026 10:11:44 +0800 Subject: [PATCH 12/14] refine descriptions --- statistics.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/statistics.md b/statistics.md index 15a9cb14084f7..4b57fa800972d 100644 --- a/statistics.md +++ b/statistics.md @@ -380,7 +380,7 @@ The following table lists the information collected by each version for usage in ### Switch between statistics versions -It is recommended that all tables, indexes, and partitions use the same statistics version. If your cluster still uses Statistics Version 1, migrate to Statistics Version 2 as soon as possible. Before Version 2 statistics are collected for an object, TiDB continues to use its existing Version 1 statistics. +It is recommended that all tables, indexes, and partitions use the same statistics version. If your cluster still uses Statistics Version 1, migrate to Statistics Version 2 as soon as possible. Until Version 2 statistics are collected for an object (such as a table, an index, or a partition), TiDB continues to use the existing Version 1 statistics for that object. 
One major reason to migrate is that Version 1 might produce inaccurate estimates for equal/IN predicates because the Count-Min sketch can have hash collisions. For more information, see [Count-Min Sketch](#count-min-sketch). To avoid this issue, set `tidb_analyze_version = 2` and rerun `ANALYZE` on all objects. From 784d2e6768ef29f85898106a757146b005c08841 Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Sat, 28 Mar 2026 10:20:37 +0800 Subject: [PATCH 13/14] Update statistics.md --- statistics.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/statistics.md b/statistics.md index 4b57fa800972d..53a0c02b476b7 100644 --- a/statistics.md +++ b/statistics.md @@ -357,7 +357,7 @@ WHERE db_name = 'test' AND table_name = 't' AND last_analyzed_at IS NOT NULL; > **Warning:** > -> Starting from v8.5.6, statistics Version 1 (`tidb_analyze_version = 1`) is deprecated and will be removed in a future release. It is recommended that you use Statistics Version 2 (`tidb_analyze_version = 2`) and migrate existing analyzed objects to Version 2. +> Starting from v8.5.6, statistics Version 1 (`tidb_analyze_version = 1`) is deprecated and will be removed in a future release. It is recommended that you use Statistics Version 2 (`tidb_analyze_version = 2`) and [migrate existing objects that use Statistics Version 1 to Version 2](#switch-between-statistics-versions). The [`tidb_analyze_version`](/system-variables.md#tidb_analyze_version-new-in-v510) variable controls the statistics collected by TiDB. Currently, TiDB supports two statistics versions: `tidb_analyze_version = 1` and `tidb_analyze_version = 2`. 
From d6ca7e551a724abadd86fc61f409e90e6f36d070 Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Fri, 10 Apr 2026 20:36:16 +0800 Subject: [PATCH 14/14] Apply suggestions from code review Co-authored-by: Aolin --- statistics.md | 4 ++-- system-variables.md | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/statistics.md b/statistics.md index 53a0c02b476b7..fd4493b6eb18d 100644 --- a/statistics.md +++ b/statistics.md @@ -357,7 +357,7 @@ WHERE db_name = 'test' AND table_name = 't' AND last_analyzed_at IS NOT NULL; > **Warning:** > -> Starting from v8.5.6, statistics Version 1 (`tidb_analyze_version = 1`) is deprecated and will be removed in a future release. It is recommended that you use Statistics Version 2 (`tidb_analyze_version = 2`) and [migrate existing objects that use Statistics Version 1 to Version 2](#switch-between-statistics-versions). +> Starting from v8.5.6, Statistics Version 1 (`tidb_analyze_version = 1`) is deprecated and will be removed in a future release. It is recommended that you use Statistics Version 2 (`tidb_analyze_version = 2`) and [migrate existing objects that use Statistics Version 1 to Version 2](#switch-between-statistics-versions). The [`tidb_analyze_version`](/system-variables.md#tidb_analyze_version-new-in-v510) variable controls the statistics collected by TiDB. Currently, TiDB supports two statistics versions: `tidb_analyze_version = 1` and `tidb_analyze_version = 2`. @@ -382,7 +382,7 @@ The following table lists the information collected by each version for usage in It is recommended that all tables, indexes, and partitions use the same statistics version. If your cluster still uses Statistics Version 1, migrate to Statistics Version 2 as soon as possible. Until Version 2 statistics are collected for an object (such as a table, an index, or a partition), TiDB continues to use the existing Version 1 statistics for that object. 
-One major reason to migrate is that Version 1 might produce inaccurate estimates for equal/IN predicates because the Count-Min sketch can have hash collisions. For more information, see [Count-Min Sketch](#count-min-sketch). To avoid this issue, set `tidb_analyze_version = 2` and rerun `ANALYZE` on all objects. +One major reason to migrate is that Version 1 might produce inaccurate estimates for equal/IN predicates because the Count-Min Sketch can have hash collisions. For more information, see [Count-Min Sketch](#count-min-sketch). To avoid this issue, set `tidb_analyze_version = 2` and rerun `ANALYZE` on all objects. To prepare `ANALYZE` for migrating from Statistics Version 1 to Statistics Version 2: diff --git a/system-variables.md b/system-variables.md index dccb8a057e384..97df7fe64dc6e 100644 --- a/system-variables.md +++ b/system-variables.md @@ -1131,7 +1131,7 @@ MPP is a distributed computing framework provided by the TiFlash engine, which a > **Warning:** > -> Starting from v8.5.6, statistics Version 1 (`tidb_analyze_version = 1`) is deprecated and will be removed in a future release. It is recommended that you use `tidb_analyze_version = 2`. +> Starting from v8.5.6, Statistics Version 1 (`tidb_analyze_version = 1`) is deprecated and will be removed in a future release. It is recommended that you use `tidb_analyze_version = 2`. - Scope: SESSION | GLOBAL - Persists to cluster: Yes
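
Note on the migration steps described in the `statistics.md` hunks above: the sequence "set `tidb_analyze_version = 2`, then rerun `ANALYZE`" can be sketched as follows. This is a minimal sketch and not part of the patch. The table `test.t` is a placeholder taken from the example query in the hunk context, and the check against `mysql.stats_histograms` assumes that its `stats_ver` column is the one the docs' own `WHERE stats_ver = 1` query filters on.

```sql
-- Make Version 2 the default for subsequent manual and automatic ANALYZE runs.
-- The variable has SESSION | GLOBAL scope; GLOBAL is used here so that
-- auto-analysis also picks it up.
SET GLOBAL tidb_analyze_version = 2;

-- Re-collect statistics for an important table right away instead of
-- waiting for auto-analysis (test.t is a placeholder name).
ANALYZE TABLE test.t;

-- Assumption: rows with stats_ver = 1 still hold Version 1 statistics.
-- A non-zero count means some objects have not been migrated yet.
SELECT COUNT(*) AS remaining_v1_objects
FROM mysql.stats_histograms
WHERE stats_ver = 1;
```

Objects that are not analyzed manually keep their Version 1 statistics until auto-analysis refreshes them, as the updated text states.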