diff --git a/docs/sql-manual/sql-functions/scalar-functions/conditional-functions/overview.md b/docs/sql-manual/sql-functions/scalar-functions/conditional-functions/overview.md new file mode 100644 index 0000000000000..4f24fe0458fae --- /dev/null +++ b/docs/sql-manual/sql-functions/scalar-functions/conditional-functions/overview.md @@ -0,0 +1,88 @@ +--- +{ + "title": "Conditional Functions Overview", + "language": "en", + "description": "Conditional functions are built-in functions used to perform conditional logic and branching in SQL queries." +} +--- + +# Conditional Functions Overview + +Conditional functions are built-in functions used to perform conditional logic and branching in SQL queries. They help execute different operations based on specified conditions, such as selecting values, handling NULL values, and performing case-based logic. + +## Vectorized Execution and Conditional Functions + +Doris is a vectorized execution engine. However, conditional functions may behave in ways that seem counterintuitive. + +Consider the following example: + +```sql +mysql> set enable_strict_cast = true; +Query OK, 0 rows affected (0.00 sec) + +mysql> select count( + -> if(number < 128 , + -> cast(number as tinyint), + -> cast(number as String)) + -> ) from numbers("number" = "300"); +ERROR 1105 (HY000): errCode = 2, detailMessage = (127.0.0.1)[INVALID_ARGUMENT]Value 128 out of range for type tinyint +``` + +In this example, even though we only cast to `tinyint` when `number < 128` in the `if` function, an error still occurs. This is because of how conditional functions like `if(cond, colA, colB)` were traditionally executed: + +1. First, both `colA` and `colB` are fully computed +2. Then, based on the value of `cond`, the corresponding result is selected and returned + +So even if `colA`'s value is not actually used in practice, since `colA` is fully computed, it will still trigger an error. 
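The two-step execution described above can be modeled in a few lines of Python. This is only an illustrative sketch, not Doris code: `strict_cast_tinyint` and `eager_if` are hypothetical helpers that stand in for a strict `cast(... as tinyint)` and for the traditional "compute both branches, then select" strategy.

```python
TINYINT_MIN, TINYINT_MAX = -128, 127


def strict_cast_tinyint(value):
    """Hypothetical strict cast: raise on overflow instead of returning NULL."""
    if not TINYINT_MIN <= value <= TINYINT_MAX:
        raise ValueError(f"Value {value} out of range for type tinyint")
    return value


def eager_if(cond, then_fn, else_fn, rows):
    """Model of the traditional execution of if(cond, colA, colB)."""
    # Step 1: both branches are computed for every row, regardless of cond.
    col_a = [then_fn(v) for v in rows]
    col_b = [else_fn(v) for v in rows]
    # Step 2: the result is selected row by row according to cond.
    return [a if cond(v) else b for v, a, b in zip(rows, col_a, col_b)]


rows = list(range(300))  # mirrors numbers("number" = "300"), i.e. 0..299
try:
    eager_if(lambda n: n < 128, strict_cast_tinyint, str, rows)
    error = None
except ValueError as exc:
    error = str(exc)

print(error)
```

In this model the cast fails on the first row that does not fit into a `tinyint`, even though the condition would never have selected that branch for that row.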
+ +Functions like `if`, `ifnull`, `case`, and `coalesce` have similar behavior. + +Note that functions like `LEAST` do not have this issue because they inherently need to compute all parameters to compare values. + +## Short-Circuit Evaluation + +In Doris 4.0.4, we improved the execution logic of conditional functions to allow short-circuit evaluation. + +```sql +mysql> set short_circuit_evaluation = true; +Query OK, 0 rows affected (0.00 sec) + +mysql> select count( + -> if(number < 128 , + -> cast(number as tinyint), + -> cast(number as String)) + -> ) from numbers("number" = "300"); ++-------------------------------------------------------------------------+ +| count(if(number < 128, cast(number as tinyint), cast(number as String)))| ++-------------------------------------------------------------------------+ +| 300 | ++-------------------------------------------------------------------------+ +``` + +With short-circuit evaluation enabled, functions like `if`, `ifnull`, `case`, and `coalesce` can avoid unnecessary computations in many scenarios, thus preventing errors and improving performance. + +### Enabling Short-Circuit Evaluation + +To enable short-circuit evaluation, set the session variable: + +```sql +SET short_circuit_evaluation = true; +``` + +### Benefits of Short-Circuit Evaluation + +1. **Error Prevention**: Avoids executing branches that would cause errors when conditions exclude them +2. **Performance Improvement**: Reduces unnecessary computations by only evaluating branches that are actually needed +3. 
**More Intuitive Behavior**: Makes conditional functions behave more like traditional programming language conditionals + +## Common Conditional Functions + +Common conditional functions that benefit from short-circuit evaluation include: + +- `IF`: Returns one of two values based on a condition +- `IFNULL`: Returns the first argument if it's not NULL, otherwise returns the second argument +- `CASE`: Provides multiple conditional branches similar to switch-case statements +- `COALESCE`: Returns the first non-NULL value from a list of arguments +- `NULLIF`: Returns NULL if two arguments are equal, otherwise returns the first argument + +For detailed information about each function, please refer to their respective documentation pages. diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/conditional-functions/overview.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/conditional-functions/overview.md new file mode 100644 index 0000000000000..177496a49a061 --- /dev/null +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/conditional-functions/overview.md @@ -0,0 +1,88 @@ +--- +{ + "title": "条件函数概述", + "language": "zh-CN", + "description": "条件函数是用于在 SQL 查询中执行条件逻辑和分支的内置函数。" +} +--- + +# 条件函数概述 + +条件函数是用于在 SQL 查询中执行条件逻辑和分支的内置函数。它们帮助根据指定的条件执行不同的操作,例如选择值、处理 NULL 值以及执行基于条件的逻辑判断。 + +## 向量化执行与条件函数 + +Doris 是向量化执行的引擎。但是对于条件函数,可能会有一些反直觉的地方。 + +考虑以下示例: + +```sql +mysql> set enable_strict_cast = true; +Query OK, 0 rows affected (0.00 sec) + +mysql> select count( + -> if(number < 128 , + -> cast(number as tinyint), + -> cast(number as String)) + -> ) from numbers("number" = "300"); +ERROR 1105 (HY000): errCode = 2, detailMessage = (127.0.0.1)[INVALID_ARGUMENT]Value 128 out of range for type tinyint +``` + +上面的例子中,虽然我们在 `if` 函数中,`number < 128` 的分支才会被转换为 `tinyint` 类型,但是还是报错了。这是因为对于 `if(cond, colA, colB)` 这个条件函数,传统的执行方式是: + +1. 
先完整计算 `colA` 和 `colB` +2. 然后根据 `cond` 的值,选择对应的结果返回 + +所以即使在实际执行中,并没有用到 `colA` 的值,但是因为 `colA` 被完整计算了,所以会报错。 + +`if`、`ifnull`、`case`、`coalesce` 等函数都有类似的问题。 + +注意,例如 `LEAST` 这样的函数是没有这样的问题的,因为它本身就需要把所有的参数都计算出来,才能比较大小。 + +## 短路执行 + +在 Doris 4.0.4 版本中,我们对条件函数的执行逻辑进行了改进,允许短路执行。 + +```sql +mysql> set short_circuit_evaluation = true; +Query OK, 0 rows affected (0.00 sec) + +mysql> select count( + -> if(number < 128 , + -> cast(number as tinyint), + -> cast(number as String)) + -> ) from numbers("number" = "300"); ++-------------------------------------------------------------------------+ +| count(if(number < 128, cast(number as tinyint), cast(number as String)))| ++-------------------------------------------------------------------------+ +| 300 | ++-------------------------------------------------------------------------+ +``` + +开启短路执行后,`if`、`ifnull`、`case`、`coalesce` 等函数在很多场景下可以避免不必要的计算,从而避免报错并提升性能。 + +### 开启短路执行 + +要开启短路执行,需要设置会话变量: + +```sql +SET short_circuit_evaluation = true; +``` + +### 短路执行的优势 + +1. **避免错误**:当条件排除某些分支时,避免执行会导致错误的分支 +2. **性能提升**:只计算实际需要的分支,减少不必要的计算 +3. 
**更直观的行为**:使条件函数的行为更接近传统编程语言中的条件语句 + +## 常见条件函数 + +受益于短路执行的常见条件函数包括: + +- `IF`:根据条件返回两个值中的一个 +- `IFNULL`:如果第一个参数不为 NULL 则返回第一个参数,否则返回第二个参数 +- `CASE`:提供多个条件分支,类似于 switch-case 语句 +- `COALESCE`:从参数列表中返回第一个非 NULL 的值 +- `NULLIF`:如果两个参数相等则返回 NULL,否则返回第一个参数 + +有关每个函数的详细信息,请参阅各自的文档页面。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/conditional-functions/overview.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/conditional-functions/overview.md new file mode 100644 index 0000000000000..2587cbbd3802c --- /dev/null +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/conditional-functions/overview.md @@ -0,0 +1,88 @@ +--- +{ + "title": "条件函数概述", + "language": "zh-CN", + "description": "条件函数是用于在 SQL 查询中执行条件逻辑和分支的内置函数。" +} +--- + +# 条件函数概述 + +条件函数是用于在 SQL 查询中执行条件逻辑和分支的内置函数。它们帮助根据指定的条件执行不同的操作,例如选择值、处理 NULL 值以及执行基于条件的逻辑判断。 + +## 向量化执行与条件函数 + +Doris 是向量化执行的引擎。但是对于条件函数,可能会有一些反直觉的地方。 + +考虑以下示例: + +```sql +mysql> set enable_strict_cast = true; +Query OK, 0 rows affected (0.00 sec) + +mysql> select count( + -> if(number < 128 , + -> cast(number as tinyint), + -> cast(number as String)) + -> ) from numbers("number" = "300"); +ERROR 1105 (HY000): errCode = 2, detailMessage = (127.0.0.1)[INVALID_ARGUMENT]Value 128 out of range for type tinyint +``` + +上面的例子中,虽然我们在 `if` 函数中,`number < 128` 的分支才会被转换为 `tinyint` 类型,但是还是报错了。这是因为对于 `if(cond, colA, colB)` 这个条件函数,传统的执行方式是: + +1. 先完整计算 `colA` 和 `colB` +2. 
然后根据 `cond` 的值,选择对应的结果返回 + +所以即使在实际执行中,并没有用到 `colA` 的值,但是因为 `colA` 被完整计算了,所以会报错。 + +`if`、`ifnull`、`case`、`coalesce` 等函数都有类似的问题。 + +注意,例如 `LEAST` 这样的函数是没有这样的问题的,因为它本身就需要把所有的参数都计算出来,才能比较大小。 + +## 短路执行 + +在 Doris 4.0.4 版本中,我们对条件函数的执行逻辑进行了改进,允许短路执行。 + +```sql +mysql> set short_circuit_evaluation = true; +Query OK, 0 rows affected (0.00 sec) + +mysql> select count( + -> if(number < 128 , + -> cast(number as tinyint), + -> cast(number as String)) + -> ) from numbers("number" = "300"); ++-------------------------------------------------------------------------+ +| count(if(number < 128, cast(number as tinyint), cast(number as String)))| ++-------------------------------------------------------------------------+ +| 300 | ++-------------------------------------------------------------------------+ +``` + +开启短路执行后,`if`、`ifnull`、`case`、`coalesce` 等函数在很多场景下可以避免不必要的计算,从而避免报错并提升性能。 + +### 开启短路执行 + +要开启短路执行,需要设置会话变量: + +```sql +SET short_circuit_evaluation = true; +``` + +### 短路执行的优势 + +1. **避免错误**:当条件排除某些分支时,避免执行会导致错误的分支 +2. **性能提升**:只计算实际需要的分支,减少不必要的计算 +3. 
**更直观的行为**:使条件函数的行为更接近传统编程语言中的条件语句 + +## 常见条件函数 + +受益于短路执行的常见条件函数包括: + +- `IF`:根据条件返回两个值中的一个 +- `IFNULL`:如果第一个参数不为 NULL 则返回第一个参数,否则返回第二个参数 +- `CASE`:提供多个条件分支,类似于 switch-case 语句 +- `COALESCE`:从参数列表中返回第一个非 NULL 的值 +- `NULLIF`:如果两个参数相等则返回 NULL,否则返回第一个参数 + +有关每个函数的详细信息,请参阅各自的文档页面。 diff --git a/sidebars.ts b/sidebars.ts index c413d0e89bf05..7ab0a33065403 100644 --- a/sidebars.ts +++ b/sidebars.ts @@ -1824,6 +1824,7 @@ const sidebars: SidebarsConfig = { type: 'category', label: 'Conditional Functions', items: [ + 'sql-manual/sql-functions/scalar-functions/conditional-functions/overview', 'sql-manual/sql-functions/scalar-functions/conditional-functions/coalesce', 'sql-manual/sql-functions/scalar-functions/conditional-functions/greatest', 'sql-manual/sql-functions/scalar-functions/conditional-functions/if', diff --git a/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/conditional-functions/overview.md b/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/conditional-functions/overview.md new file mode 100644 index 0000000000000..4f24fe0458fae --- /dev/null +++ b/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/conditional-functions/overview.md @@ -0,0 +1,88 @@ +--- +{ + "title": "Conditional Functions Overview", + "language": "en", + "description": "Conditional functions are built-in functions used to perform conditional logic and branching in SQL queries." +} +--- + +# Conditional Functions Overview + +Conditional functions are built-in functions used to perform conditional logic and branching in SQL queries. They help execute different operations based on specified conditions, such as selecting values, handling NULL values, and performing case-based logic. + +## Vectorized Execution and Conditional Functions + +Doris is a vectorized execution engine. However, conditional functions may behave in ways that seem counterintuitive. 
+ +Consider the following example: + +```sql +mysql> set enable_strict_cast = true; +Query OK, 0 rows affected (0.00 sec) + +mysql> select count( + -> if(number < 128 , + -> cast(number as tinyint), + -> cast(number as String)) + -> ) from numbers("number" = "300"); +ERROR 1105 (HY000): errCode = 2, detailMessage = (127.0.0.1)[INVALID_ARGUMENT]Value 128 out of range for type tinyint +``` + +In this example, even though we only cast to `tinyint` when `number < 128` in the `if` function, an error still occurs. This is because of how conditional functions like `if(cond, colA, colB)` were traditionally executed: + +1. First, both `colA` and `colB` are fully computed +2. Then, based on the value of `cond`, the corresponding result is selected and returned + +So even if `colA`'s value is not actually used in practice, since `colA` is fully computed, it will still trigger an error. + +Functions like `if`, `ifnull`, `case`, and `coalesce` have similar behavior. + +Note that functions like `LEAST` do not have this issue because they inherently need to compute all parameters to compare values. + +## Short-Circuit Evaluation + +In Doris 4.0.4, we improved the execution logic of conditional functions to allow short-circuit evaluation. 
+ +```sql +mysql> set short_circuit_evaluation = true; +Query OK, 0 rows affected (0.00 sec) + +mysql> select count( + -> if(number < 128 , + -> cast(number as tinyint), + -> cast(number as String)) + -> ) from numbers("number" = "300"); ++-------------------------------------------------------------------------+ +| count(if(number < 128, cast(number as tinyint), cast(number as String)))| ++-------------------------------------------------------------------------+ +| 300 | ++-------------------------------------------------------------------------+ +``` + +With short-circuit evaluation enabled, functions like `if`, `ifnull`, `case`, and `coalesce` can avoid unnecessary computations in many scenarios, thus preventing errors and improving performance. + +### Enabling Short-Circuit Evaluation + +To enable short-circuit evaluation, set the session variable: + +```sql +SET short_circuit_evaluation = true; +``` + +### Benefits of Short-Circuit Evaluation + +1. **Error Prevention**: Avoids executing branches that would cause errors when conditions exclude them +2. **Performance Improvement**: Reduces unnecessary computations by only evaluating branches that are actually needed +3. **More Intuitive Behavior**: Makes conditional functions behave more like traditional programming language conditionals + +## Common Conditional Functions + +Common conditional functions that benefit from short-circuit evaluation include: + +- `IF`: Returns one of two values based on a condition +- `IFNULL`: Returns the first argument if it's not NULL, otherwise returns the second argument +- `CASE`: Provides multiple conditional branches similar to switch-case statements +- `COALESCE`: Returns the first non-NULL value from a list of arguments +- `NULLIF`: Returns NULL if two arguments are equal, otherwise returns the first argument + +For detailed information about each function, please refer to their respective documentation pages. 
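As a rough mental model of short-circuit evaluation, the Python sketch below evaluates each branch only for the rows that the condition selects. It is a simplified illustration under stated assumptions, not the Doris implementation: `strict_cast_tinyint` and `short_circuit_if` are hypothetical names, and a real engine would also unify the result type of the two branches.

```python
TINYINT_MIN, TINYINT_MAX = -128, 127


def strict_cast_tinyint(value):
    """Hypothetical strict cast: raise on overflow instead of returning NULL."""
    if not TINYINT_MIN <= value <= TINYINT_MAX:
        raise ValueError(f"Value {value} out of range for type tinyint")
    return value


def short_circuit_if(cond, then_fn, else_fn, rows):
    """Evaluate cond first, then run each branch only on the rows it covers."""
    mask = [cond(v) for v in rows]
    # then_fn never sees a row where cond is false, so it cannot fail there.
    return [then_fn(v) if selected else else_fn(v)
            for v, selected in zip(rows, mask)]


rows = list(range(300))  # mirrors numbers("number" = "300"), i.e. 0..299
result = short_circuit_if(lambda n: n < 128, strict_cast_tinyint, str, rows)

print(len(result))
```

Because the cast only runs on rows where `number < 128`, no out-of-range value ever reaches it, and all 300 rows produce a result.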
diff --git a/versioned_sidebars/version-4.x-sidebars.json b/versioned_sidebars/version-4.x-sidebars.json index 33f6c724640a7..ce8a46257075f 100644 --- a/versioned_sidebars/version-4.x-sidebars.json +++ b/versioned_sidebars/version-4.x-sidebars.json @@ -1830,6 +1830,7 @@ "type": "category", "label": "Conditional Functions", "items": [ + "sql-manual/sql-functions/scalar-functions/conditional-functions/overview", "sql-manual/sql-functions/scalar-functions/conditional-functions/coalesce", "sql-manual/sql-functions/scalar-functions/conditional-functions/greatest", "sql-manual/sql-functions/scalar-functions/conditional-functions/if",