Spark SQL Usage Examples: An Authoritative Guide to All Functions, Part 1
SparkSession: The Entry Point to Spark
The entry point into all functionality in Spark is the SparkSession class. To create a basic SparkSession, just use SparkSession.builder():
import org.apache.spark.sql.SparkSession
val spark = SparkSession
.builder()
.appName("Spark SQL basic example")
.config("spark.some.config.option", "some-value")
.getOrCreate()
The complete example code can be found in "examples/src/main/scala/org/apache/spark/examples/sql/SparkSQLExample.scala" in the Spark repository.
In Spark 2.0, SparkSession provides built-in support for Hive features, including the ability to write queries using HiveQL, access Hive UDFs, and read data from Hive tables. To use these features, you do not need an existing Hive setup.
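As a minimal sketch (not part of the original example; the app name and warehouse directory below are illustrative assumptions), Hive support is enabled by adding enableHiveSupport() to the builder:
import org.apache.spark.sql.SparkSession
// Sketch: enable Hive support on the session builder.
// The warehouse directory is an illustrative assumption.
val hiveSpark = SparkSession
  .builder()
  .appName("Spark SQL Hive example")
  .config("spark.sql.warehouse.dir", "spark-warehouse")
  .enableHiveSupport()
  .getOrCreate()
// HiveQL statements can then be issued via hiveSpark.sql(...), e.g. against Hive tables.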
1. Creating DataFrames
With a SparkSession, applications can create DataFrames from an existing RDD, from a Hive table, or from Spark data sources.
For example, the following creates a DataFrame based on the content of a JSON file:
val df = spark.read.json("examples/src/main/resources/people.json")
// Displays the content of the DataFrame to stdout
df.show()
// +----+-------+
// | age| name|
// +----+-------+
// |null|Michael|
// | 30| Andy|
// | 19| Justin|
// +----+-------+
The complete example code can be found in "examples/src/main/scala/org/apache/spark/examples/sql/SparkSQLExample.scala" in the Spark repository.
2. Untyped Dataset Operations (aka DataFrame Operations)
DataFrames provide a domain-specific language for structured data manipulation in Scala, Java, Python and R.
As mentioned above, in Spark 2.0, DataFrames are just Datasets of Rows in the Scala and Java APIs. These operations are also referred to as "untyped transformations", in contrast to the "typed transformations" that come with strongly typed Scala/Java Datasets.
Here are some basic examples of structured data processing using Datasets:
// This import is needed to use the $-notation
import spark.implicits._
// Print the schema in a tree format
df.printSchema()
// root
// |-- age: long (nullable = true)
// |-- name: string (nullable = true)
// Select only the "name" column
df.select("name").show()
// +-------+
// | name|
// +-------+
// |Michael|
// | Andy|
// | Justin|
// +-------+
// Select everybody, but increment the age by 1
df.select($"name", $"age" + 1).show()
// +-------+---------+
// | name|(age + 1)|
// +-------+---------+
// |Michael| null|
// | Andy| 31|
// | Justin| 20|
// +-------+---------+
// Select people older than 21
df.filter($"age" > 21).show()
// +---+----+
// |age|name|
// +---+----+
// | 30|Andy|
// +---+----+
// Count people by age
df.groupBy("age").count().show()
// +----+-----+
// | age|count|
// +----+-----+
// | 19| 1|
// |null| 1|
// | 30| 1|
// +----+-----+
The complete example code can be found in "examples/src/main/scala/org/apache/spark/examples/sql/SparkSQLExample.scala" in the Spark repository.
For a complete list of the types of operations that can be performed on a Dataset, refer to the API documentation.
In addition to simple column references and expressions, Datasets also have a rich library of functions including string manipulation, date arithmetic, common math operations and more. The complete list is available in the DataFrame Function Reference.
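As a small illustrative sketch (not from the original text), a couple of these built-in functions from org.apache.spark.sql.functions can be applied to the df defined above; the column aliases are assumptions chosen for readability:
import org.apache.spark.sql.functions._
// Sketch: string manipulation with upper() and a common null-handling
// case with coalesce(); requires the earlier import spark.implicits._ for $-notation.
df.select(
  upper($"name").as("name_upper"),
  coalesce($"age", lit(0)).as("age_or_zero")
).show()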
3. Running SQL Queries Programmatically
The sql function on a SparkSession enables applications to run SQL queries programmatically and returns the result as a DataFrame.
// Register the DataFrame as a SQL temporary view
df.createOrReplaceTempView("people")
val sqlDF = spark.sql("SELECT * FROM people")
sqlDF.show()
// +----+-------+
// | age| name|
// +----+-------+
// |null|Michael|
// | 30| Andy|
// | 19| Justin|
// +----+-------+
The complete example code can be found in "examples/src/main/scala/org/apache/spark/examples/sql/SparkSQLExample.scala" in the Spark repository.
4. Global Temporary View
Temporary views in Spark SQL are session-scoped and will disappear if the session that created them terminates. If you want a temporary view that is shared among all sessions and kept alive until the Spark application terminates, you can create a global temporary view. Global temporary views are tied to the system-preserved database global_temp, and we must use the qualified name to refer to them, e.g. SELECT * FROM global_temp.view1.
// Register the DataFrame as a global temporary view
df.createGlobalTempView("people")
// Global temporary view is tied to the system-preserved database `global_temp`
spark.sql("SELECT * FROM global_temp.people").show()
// +----+-------+
// | age| name|
// +----+-------+
// |null|Michael|
// | 30| Andy|
// | 19| Justin|
// +----+-------+
// Global temporary view is cross-session
spark.newSession().sql("SELECT * FROM global_temp.people").show()
// +----+-------+
// | age| name|
// +----+-------+
// |null|Michael|
// | 30| Andy|
// | 19| Justin|
// +----+-------+
The complete example code can be found in "examples/src/main/scala/org/apache/spark/examples/sql/SparkSQLExample.scala" in the Spark repository.
5. Creating Datasets
Datasets are similar to RDDs, but instead of using Java serialization or Kryo, they use a specialized encoder to serialize the objects for processing or transmitting over the network. While both encoders and standard serialization are responsible for turning an object into bytes, encoders are generated dynamically and use a format that allows Spark to perform many operations like filtering, sorting and hashing without deserializing the bytes back into an object.
case class Person(name: String, age: Long)
// Encoders are created for case classes
val caseClassDS = Seq(Person("Andy", 32)).toDS()
caseClassDS.show()
// +----+---+
// |name|age|
// +----+---+
// |Andy| 32|
// +----+---+
// Encoders for most common types are automatically provided by importing spark.implicits._
val primitiveDS = Seq(1, 2, 3).toDS()
primitiveDS.map(_ + 1).collect() // Returns: Array(2, 3, 4)
// DataFrames can be converted to a Dataset by providing a class. Mapping will be done by name
val path = "examples/src/main/resources/people.json"
val peopleDS = spark.read.json(path).as[Person]
peopleDS.show()
// +----+-------+
// | age| name|
// +----+-------+
// |null|Michael|
// | 30| Andy|
// | 19| Justin|
// +----+-------+
The complete example code can be found in "examples/src/main/scala/org/apache/spark/examples/sql/SparkSQLExample.scala" in the Spark repository.
6. Converting RDDs to Datasets
Spark SQL supports two different methods for converting existing RDDs into Datasets.
The first method uses reflection to infer the schema of an RDD that contains specific types of objects. This reflection-based approach leads to more concise code and works well when you already know the schema while writing your Spark application.
The second method for creating Datasets is through a programmatic interface that allows you to construct a schema and then apply it to an existing RDD. While this method is more verbose, it allows you to construct Datasets when the columns and their types are not known until runtime.
6.1 Inferring the Schema Using Reflection
// For implicit conversions from RDDs to DataFrames
import spark.implicits._
// Create an RDD of Person objects from a text file, convert it to a DataFrame
val peopleDF = spark.sparkContext
.textFile("examples/src/main/resources/people.txt")
.map(_.split(","))
.map(attributes => Person(attributes(0), attributes(1).trim.toInt))
.toDF()
// Register the DataFrame as a temporary view
peopleDF.createOrReplaceTempView("people")
// SQL statements can be run by using the sql method provided by Spark
val teenagersDF = spark.sql("SELECT name, age FROM people WHERE age BETWEEN 13 AND 19")
// The columns of a row in the result can be accessed by field index
teenagersDF.map(teenager => "Name: " + teenager(0)).show()
// +------------+
// | value|
// +------------+
// |Name: Justin|
// +------------+
// or by field name
teenagersDF.map(teenager => "Name: " + teenager.getAs[String]("name")).show()
// +------------+
// | value|
// +------------+
// |Name: Justin|
// +------------+
// No pre-defined encoders for Dataset[Map[K,V]], define them explicitly
implicit val mapEncoder = org.apache.spark.sql.Encoders.kryo[Map[String, Any]]
// Primitive types and case classes can also be defined as
// implicit val stringIntMapEncoder: Encoder[Map[String, Any]] = ExpressionEncoder()
// row.getValuesMap[T] retrieves multiple columns at once into a Map[String, T]
teenagersDF.map(teenager => teenager.getValuesMap[Any](List("name", "age"))).collect()
// Array(Map("name" -> "Justin", "age" -> 19))
The complete example code can be found in "examples/src/main/scala/org/apache/spark/examples/sql/SparkSQLExample.scala" in the Spark repository.
6.2 Programmatically Specifying the Schema
When case classes cannot be defined ahead of time (for example, the structure of records is encoded in a string, or a text dataset will be parsed and fields will be projected differently for different users), a DataFrame can be created programmatically with three steps.
- Create an RDD of Rows from the original RDD;
- Create the schema, represented by a StructType, matching the structure of Rows in the RDD created in step 1;
- Apply the schema to the RDD of Rows via the createDataFrame method provided by SparkSession.
For example:
import org.apache.spark.sql.Row
import org.apache.spark.sql.types._
// Create an RDD
val peopleRDD = spark.sparkContext.textFile("examples/src/main/resources/people.txt")
// The schema is encoded in a string
val schemaString = "name age"
// Generate the schema based on the string of schema
val fields = schemaString.split(" ")
.map(fieldName => StructField(fieldName, StringType, nullable = true))
val schema = StructType(fields)
// Convert records of the RDD (people) to Rows
val rowRDD = peopleRDD
.map(_.split(","))
.map(attributes => Row(attributes(0), attributes(1).trim))
// Apply the schema to the RDD
val peopleDF = spark.createDataFrame(rowRDD, schema)
// Create a temporary view using the DataFrame
peopleDF.createOrReplaceTempView("people")
// SQL can be run over a temporary view created using DataFrames
val results = spark.sql("SELECT name FROM people")
// The results of SQL queries are DataFrames and support all the normal RDD operations
// The columns of a row in the result can be accessed by field index or by field name
results.map(attributes => "Name: " + attributes(0)).show()
// +-------------+
// | value|
// +-------------+
// |Name: Michael|
// | Name: Andy|
// | Name: Justin|
// +-------------+
The complete example code can be found in "examples/src/main/scala/org/apache/spark/examples/sql/SparkSQLExample.scala" in the Spark repository.
7. Scalar Functions
Scalar functions are functions that return a single value per row, as opposed to aggregation functions, which return a value for a group of rows. Spark SQL supports a variety of built-in scalar functions. It also supports user-defined scalar functions.
Spark SQL provides two function features to meet a wide range of user needs: built-in functions and user-defined functions (UDFs). Built-in functions are commonly used routines that Spark SQL predefines; the complete list of these functions can be found in the Built-in Functions API documentation. When the system's built-in functions are not enough to perform the desired task, users can define their own functions with UDFs.
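For instance, a brief sketch of a user-defined scalar function in Scala (the UDF name and logic here are illustrative assumptions, not from the original text):
// Sketch: register a Scala function as a UDF and call it from SQL.
spark.udf.register("plus_one", (x: Int) => x + 1)
spark.sql("SELECT plus_one(5)").show() // the single result row contains 6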
Built-in Functions: Spark SQL has several categories of commonly used built-in functions for aggregation, arrays/maps, date/timestamp, and JSON data processing. This subsection presents the usages and descriptions of these functions.
Array Functions
Function | Description |
---|---|
array(expr, …) | Returns an array with the given elements. |
array_append(array, element) | Adds the element at the end of the array passed as the first argument. The type of the element should be similar to the type of the elements of the array. Null elements are also appended into the array. But if the array passed is NULL, the output is NULL. |
array_compact(array) | Removes null values from the array. |
array_contains(array, value) | Returns true if the array contains the value. |
array_distinct(array) | Removes duplicate values from the array. |
array_except(array1, array2) | Returns an array of the elements in array1 but not in array2, without duplicates. |
array_insert(x, pos, val) | Places val into index pos of array x. Array indices start at 1. The maximum negative index is -1, for which the function inserts the new element after the current last element. An index above the array size appends the array, or prepends the array if the index is negative, with 'null' elements. |
array_intersect(array1, array2) | Returns an array of the elements in the intersection of array1 and array2, without duplicates. |
array_join(array, delimiter[, nullReplacement]) | Concatenates the elements of the given array using the delimiter and an optional string to replace nulls. If no value is set for nullReplacement, any null value is filtered. |
array_max(array) | Returns the maximum value in the array. NaN is greater than any non-NaN elements for double/float type. NULL elements are skipped. |
array_min(array) | Returns the minimum value in the array. NaN is greater than any non-NaN elements for double/float type. NULL elements are skipped. |
array_position(array, element) | Returns the (1-based) index of the first matching element of the array, or 0 if no match is found. |
array_prepend(array, element) | Adds the element at the beginning of the array passed as the first argument. The type of the element should be the same as the type of the elements of the array. Null elements are also prepended to the array. But if the array passed is NULL, the output is NULL. |
array_remove(array, element) | Removes all elements that equal element from the array. |
array_repeat(element, count) | Returns an array containing element count times. |
array_union(array1, array2) | Returns an array of the elements in the union of array1 and array2, without duplicates. |
arrays_overlap(a1, a2) | Returns true if a1 contains at least one non-null element that is also present in a2. If the arrays have no common element, they are both non-empty, and either of them contains a null element, null is returned; false otherwise. |
arrays_zip(a1, a2, …) | Returns a merged array of structs in which the N-th struct contains all N-th values of the input arrays. |
flatten(arrayOfArrays) | Transforms an array of arrays into a single array. |
get(array, index) | Returns the element of the array at the given (0-based) index. If the index points outside of the array boundaries, this function returns NULL. |
sequence(start, stop, step) | Generates an array of elements from start to stop (inclusive), incrementing by step. The type of the returned elements is the same as the type of the argument expressions. Supported types are: byte, short, integer, long, date, timestamp. The start and stop expressions must resolve to the same type. If the start and stop expressions resolve to the 'date' or 'timestamp' type, then the step expression must resolve to the 'interval', 'year-month interval' or 'day-time interval' type; otherwise it resolves to the same type as the start and stop expressions. |
shuffle(array) | Returns a random permutation of the given array. |
slice(x, start, length) | Subsets array x starting from index start (array indices start at 1, or from the end if start is negative) with the specified length. |
sort_array(array[, ascendingOrder]) | Sorts the input array in ascending or descending order according to the natural ordering of the array elements. NaN is greater than any non-NaN elements for double/float type. Null elements are placed at the beginning of the returned array in ascending order, or at the end of the returned array in descending order. |
Examples
-- array
SELECT array(1, 2, 3);
+--------------+
|array(1, 2, 3)|
+--------------+
| [1, 2, 3]|
+--------------+
-- array_append
SELECT array_append(array('b', 'd', 'c', 'a'), 'd');
+----------------------------------+
|array_append(array(b, d, c, a), d)|
+----------------------------------+
| [b, d, c, a, d]|
+----------------------------------+
SELECT array_append(array(1, 2, 3, null), null);
+----------------------------------------+
|array_append(array(1, 2, 3, NULL), NULL)|
+----------------------------------------+
| [1, 2, 3, NULL, N...|
+----------------------------------------+
SELECT array_append(CAST(null as Array<Int>), 2);
+---------------------+
|array_append(NULL, 2)|
+---------------------+
| NULL|
+---------------------+
-- array_compact
SELECT array_compact(array(1, 2, 3, null));
+-----------------------------------+
|array_compact(array(1, 2, 3, NULL))|
+-----------------------------------+
| [1, 2, 3]|
+-----------------------------------+
SELECT array_compact(array("a", "b", "c"));
+-----------------------------+
|array_compact(array(a, b, c))|
+-----------------------------+
| [a, b, c]|
+-----------------------------+
-- array_contains
SELECT array_contains(array(1, 2, 3), 2);
+---------------------------------+
|array_contains(array(1, 2, 3), 2)|
+---------------------------------+
| true|
+---------------------------------+
-- array_distinct
SELECT array_distinct(array(1, 2, 3, null, 3));
+---------------------------------------+
|array_distinct(array(1, 2, 3, NULL, 3))|
+---------------------------------------+
| [1, 2, 3, NULL]|
+---------------------------------------+
-- array_except
SELECT array_except(array(1, 2, 3), array(1, 3, 5));
+--------------------------------------------+
|array_except(array(1, 2, 3), array(1, 3, 5))|
+--------------------------------------------+
| [2]|
+--------------------------------------------+
-- array_insert
SELECT array_insert(array(1, 2, 3, 4), 5, 5);
+-------------------------------------+
|array_insert(array(1, 2, 3, 4), 5, 5)|
+-------------------------------------+
| [1, 2, 3, 4, 5]|
+-------------------------------------+
SELECT array_insert(array(5, 4, 3, 2), -1, 1);
+--------------------------------------+
|array_insert(array(5, 4, 3, 2), -1, 1)|
+--------------------------------------+
| [5, 4, 3, 2, 1]|
+--------------------------------------+
SELECT array_insert(array(5, 3, 2, 1), -4, 4);
+--------------------------------------+
|array_insert(array(5, 3, 2, 1), -4, 4)|
+--------------------------------------+
| [5, 4, 3, 2, 1]|
+--------------------------------------+
-- array_intersect
SELECT array_intersect(array(1, 2, 3), array(1, 3, 5));
+-----------------------------------------------+
|array_intersect(array(1, 2, 3), array(1, 3, 5))|
+-----------------------------------------------+
| [1, 3]|
+-----------------------------------------------+
-- array_join
SELECT array_join(array('hello', 'world'), ' ');
+----------------------------------+
|array_join(array(hello, world), )|
+----------------------------------+
| hello world|
+----------------------------------+
SELECT array_join(array('hello', null ,'world'), ' ');
+----------------------------------------+
|array_join(array(hello, NULL, world), )|
+----------------------------------------+
| hello world|
+----------------------------------------+
SELECT array_join(array('hello', null ,'world'), ' ', ',');
+-------------------------------------------+
|array_join(array(hello, NULL, world), , ,)|
+-------------------------------------------+
| hello , world|
+-------------------------------------------+
-- array_max
SELECT array_max(array(1, 20, null, 3));
+--------------------------------+
|array_max(array(1, 20, NULL, 3))|
+--------------------------------+
| 20|
+--------------------------------+
-- array_min
SELECT array_min(array(1, 20, null, 3));
+--------------------------------+
|array_min(array(1, 20, NULL, 3))|
+--------------------------------+
| 1|
+--------------------------------+
-- array_position
SELECT array_position(array(312, 773, 708, 708), 708);
+----------------------------------------------+
|array_position(array(312, 773, 708, 708), 708)|
+----------------------------------------------+
| 3|
+----------------------------------------------+
SELECT array_position(array(312, 773, 708, 708), 414);
+----------------------------------------------+
|array_position(array(312, 773, 708, 708), 414)|
+----------------------------------------------+
| 0|
+----------------------------------------------+
-- array_prepend
SELECT array_prepend(array('b', 'd', 'c', 'a'), 'd');
+-----------------------------------+
|array_prepend(array(b, d, c, a), d)|
+-----------------------------------+
| [d, b, d, c, a]|
+-----------------------------------+
SELECT array_prepend(array(1, 2, 3, null), null);
+-----------------------------------------+
|array_prepend(array(1, 2, 3, NULL), NULL)|
+-----------------------------------------+
| [NULL, 1, 2, 3, N...|
+-----------------------------------------+
SELECT array_prepend(CAST(null as Array<Int>), 2);
+----------------------+
|array_prepend(NULL, 2)|
+----------------------+
| NULL|
+----------------------+
-- array_remove
SELECT array_remove(array(1, 2, 3, null, 3), 3);
+----------------------------------------+
|array_remove(array(1, 2, 3, NULL, 3), 3)|
+----------------------------------------+
| [1, 2, NULL]|
+----------------------------------------+
-- array_repeat
SELECT array_repeat('123', 2);
+--------------------+
|array_repeat(123, 2)|
+--------------------+
| [123, 123]|
+--------------------+
-- array_union
SELECT array_union(array(1, 2, 3), array(1, 3, 5));
+-------------------------------------------+
|array_union(array(1, 2, 3), array(1, 3, 5))|
+-------------------------------------------+
| [1, 2, 3, 5]|
+-------------------------------------------+
-- arrays_overlap
SELECT arrays_overlap(array(1, 2, 3), array(3, 4, 5));
+----------------------------------------------+
|arrays_overlap(array(1, 2, 3), array(3, 4, 5))|
+----------------------------------------------+
| true|
+----------------------------------------------+
-- arrays_zip
SELECT arrays_zip(array(1, 2, 3), array(2, 3, 4));
+------------------------------------------+
|arrays_zip(array(1, 2, 3), array(2, 3, 4))|
+------------------------------------------+
| [{1, 2}, {2, 3}, ...|
+------------------------------------------+
SELECT arrays_zip(array(1, 2), array(2, 3), array(3, 4));
+-------------------------------------------------+
|arrays_zip(array(1, 2), array(2, 3), array(3, 4))|
+-------------------------------------------------+
| [{1, 2, 3}, {2, 3...|
+-------------------------------------------------+
-- flatten
SELECT flatten(array(array(1, 2), array(3, 4)));
+----------------------------------------+
|flatten(array(array(1, 2), array(3, 4)))|
+----------------------------------------+
| [1, 2, 3, 4]|
+----------------------------------------+
-- get
SELECT get(array(1, 2, 3), 0);
+----------------------+
|get(array(1, 2, 3), 0)|
+----------------------+
| 1|
+----------------------+
SELECT get(array(1, 2, 3), 3);
+----------------------+
|get(array(1, 2, 3), 3)|
+----------------------+
| NULL|
+----------------------+
SELECT get(array(1, 2, 3), -1);
+-----------------------+
|get(array(1, 2, 3), -1)|
+-----------------------+
| NULL|
+-----------------------+
-- sequence
SELECT sequence(1, 5);
+---------------+
| sequence(1, 5)|
+---------------+
|[1, 2, 3, 4, 5]|
+---------------+
SELECT sequence(5, 1);
+---------------+
| sequence(5, 1)|
+---------------+
|[5, 4, 3, 2, 1]|
+---------------+
SELECT sequence(to_date('2018-01-01'), to_date('2018-03-01'), interval 1 month);
+----------------------------------------------------------------------+
|sequence(to_date(2018-01-01), to_date(2018-03-01), INTERVAL '1' MONTH)|
+----------------------------------------------------------------------+
| [2018-01-01, 2018...|
+----------------------------------------------------------------------+
SELECT sequence(to_date('2018-01-01'), to_date('2018-03-01'), interval '0-1' year to month);
+--------------------------------------------------------------------------------+
|sequence(to_date(2018-01-01), to_date(2018-03-01), INTERVAL '0-1' YEAR TO MONTH)|
+--------------------------------------------------------------------------------+
| [2018-01-01, 2018...|
+--------------------------------------------------------------------------------+
-- shuffle
SELECT shuffle(array(1, 20, 3, 5));
+---------------------------+
|shuffle(array(1, 20, 3, 5))|
+---------------------------+
| [3, 5, 20, 1]|
+---------------------------+
SELECT shuffle(array(1, 20, null, 3));
+------------------------------+
|shuffle(array(1, 20, NULL, 3))|
+------------------------------+
| [20, 1, 3, NULL]|
+------------------------------+
-- slice
SELECT slice(array(1, 2, 3, 4), 2, 2);
+------------------------------+
|slice(array(1, 2, 3, 4), 2, 2)|
+------------------------------+
| [2, 3]|
+------------------------------+
SELECT slice(array(1, 2, 3, 4), -2, 2);
+-------------------------------+
|slice(array(1, 2, 3, 4), -2, 2)|
+-------------------------------+
| [3, 4]|
+-------------------------------+
-- sort_array
SELECT sort_array(array('b', 'd', null, 'c', 'a'), true);
+-----------------------------------------+
|sort_array(array(b, d, NULL, c, a), true)|
+-----------------------------------------+
| [NULL, a, b, c, d]|
+-----------------------------------------+
Map Functions
Function | Description |
---|---|
element_at(array, index) | Returns the element of the array at the given (1-based) index. If the index is 0, Spark throws an error. If the index is less than 0, the elements are accessed from the last to the first. If the index exceeds the length of the array and spark.sql.ansi.enabled is set to false, the function returns NULL. If spark.sql.ansi.enabled is set to true, it throws ArrayIndexOutOfBoundsException for invalid indices. |
element_at(map, key) | Returns the value for the given key. The function returns NULL if the key is not contained in the map. |
map(key0, value0, key1, value1, …) | Creates a map with the given key/value pairs. |
map_concat(map, …) | Returns the union of all the given maps. |
map_contains_key(map, key) | Returns true if the map contains the key. |
map_entries(map) | Returns an unordered array of all entries in the given map. |
map_from_arrays(keys, values) | Creates a map from the given arrays of keys and values. All elements in keys should not be null. |
map_from_entries(arrayOfEntries) | Returns a map created from the given array of entries. |
map_keys(map) | Returns an unordered array containing the keys of the map. |
map_values(map) | Returns an unordered array containing the values of the map. |
str_to_map(text[, pairDelim[, keyValueDelim]]) | Creates a map after splitting the text into key/value pairs using delimiters. Default delimiters are ',' for pairDelim and ':' for keyValueDelim. Both pairDelim and keyValueDelim are treated as regular expressions. |
try_element_at(array, index) | Returns the element of the array at the given (1-based) index. If the index is 0, Spark throws an error. If the index is less than 0, the elements are accessed from the last to the first. The function always returns NULL if the index exceeds the length of the array. |
try_element_at(map, key) | Returns the value for the given key. The function always returns NULL if the key is not contained in the map. |
Examples
-- element_at
SELECT element_at(array(1, 2, 3), 2);
+-----------------------------+
|element_at(array(1, 2, 3), 2)|
+-----------------------------+
| 2|
+-----------------------------+
SELECT element_at(map(1, 'a', 2, 'b'), 2);
+------------------------------+
|element_at(map(1, a, 2, b), 2)|
+------------------------------+
| b|
+------------------------------+
-- map
SELECT map(1.0, '2', 3.0, '4');
+--------------------+
| map(1.0, 2, 3.0, 4)|
+--------------------+
|{1.0 -> 2, 3.0 -> 4}|
+--------------------+
-- map_concat
SELECT map_concat(map(1, 'a', 2, 'b'), map(3, 'c'));
+--------------------------------------+
|map_concat(map(1, a, 2, b), map(3, c))|
+--------------------------------------+
| {1 -> a, 2 -> b, ...|
+--------------------------------------+
-- map_contains_key
SELECT map_contains_key(map(1, 'a', 2, 'b'), 1);
+------------------------------------+
|map_contains_key(map(1, a, 2, b), 1)|
+------------------------------------+
| true|
+------------------------------------+
SELECT map_contains_key(map(1, 'a', 2, 'b'), 3);
+------------------------------------+
|map_contains_key(map(1, a, 2, b), 3)|
+------------------------------------+
| false|
+------------------------------------+
-- map_entries
SELECT map_entries(map(1, 'a', 2, 'b'));
+----------------------------+
|map_entries(map(1, a, 2, b))|
+----------------------------+
| [{1, a}, {2, b}]|
+----------------------------+
-- map_from_arrays
SELECT map_from_arrays(array(1.0, 3.0), array('2', '4'));
+---------------------------------------------+
|map_from_arrays(array(1.0, 3.0), array(2, 4))|
+---------------------------------------------+
| {1.0 -> 2, 3.0 -> 4}|
+---------------------------------------------+
-- map_from_entries
SELECT map_from_entries(array(struct(1, 'a'), struct(2, 'b')));
+---------------------------------------------------+
|map_from_entries(array(struct(1, a), struct(2, b)))|
+---------------------------------------------------+
| {1 -> a, 2 -> b}|
+---------------------------------------------------+
-- map_keys
SELECT map_keys(map(1, 'a', 2, 'b'));
+-------------------------+
|map_keys(map(1, a, 2, b))|
+-------------------------+
| [1, 2]|
+-------------------------+
-- map_values
SELECT map_values(map(1, 'a', 2, 'b'));
+---------------------------+
|map_values(map(1, a, 2, b))|
+---------------------------+
| [a, b]|
+---------------------------+
-- str_to_map
SELECT str_to_map('a:1,b:2,c:3', ',', ':');
+-----------------------------+
|str_to_map(a:1,b:2,c:3, ,, :)|
+-----------------------------+
| {a -> 1, b -> 2, ...|
+-----------------------------+
SELECT str_to_map('a');
+-------------------+
|str_to_map(a, ,, :)|
+-------------------+
| {a -> NULL}|
+-------------------+
-- try_element_at
SELECT try_element_at(array(1, 2, 3), 2);
+---------------------------------+
|try_element_at(array(1, 2, 3), 2)|
+---------------------------------+
| 2|
+---------------------------------+
SELECT try_element_at(map(1, 'a', 2, 'b'), 2);
+----------------------------------+
|try_element_at(map(1, a, 2, b), 2)|
+----------------------------------+
| b|
+----------------------------------+
Date and Timestamp Functions
Function | Description |
---|---|
add_months(start_date, num_months) | Returns the date that is num_months after start_date. |
convert_timezone([sourceTz, ]targetTz, sourceTs) | Converts the timestamp without time zone sourceTs from the sourceTz time zone to the targetTz time zone. |
curdate() | Returns the current date at the start of query evaluation. All calls of curdate within the same query return the same value. |
current_date() | Returns the current date at the start of query evaluation. All calls of current_date within the same query return the same value. |
current_date | Returns the current date at the start of query evaluation. |
current_timestamp() | Returns the current timestamp at the start of query evaluation. All calls of current_timestamp within the same query return the same value. |
current_timestamp | Returns the current timestamp at the start of query evaluation. |
current_timezone() | Returns the current session local time zone. |
date_add(start_date, num_days) | Returns the date that is num_days after start_date. |
date_diff(endDate, startDate) | Returns the number of days from startDate to endDate. |
date_format(timestamp, fmt) | Converts timestamp to a string value according to the date format fmt. |
date_from_unix_date(days) | Creates a date from the number of days since 1970-01-01. |
date_part(field, source) | Extracts a part of the date/timestamp or interval source. |
date_sub(start_date, num_days) | Returns the date that is num_days before start_date. |
date_trunc(fmt, ts) | Returns timestamp ts truncated to the unit specified by the format model fmt. |
dateadd(start_date, num_days) | Returns the date that is num_days after start_date. |
datediff(endDate, startDate) | Returns the number of days from startDate to endDate. |
datepart(field, source) | Extracts a part of the date/timestamp or interval source. |
day(date) | Returns the day of month of the date/timestamp. |
dayofmonth(date) | Returns the day of month of the date/timestamp. |
dayofweek(date) | Returns the day of the week of the date/timestamp (1 = Sunday, 2 = Monday, …, 7 = Saturday). |
dayofyear(date) | Returns the day of year of the date/timestamp. |
extract(field FROM source) | Extracts a part of the date/timestamp or interval source. |
from_unixtime(unix_time[, fmt]) | Returns the UNIX timestamp unix_time as a timestamp according to the given fmt. Returns NULL for invalid input. By default, if fmt is omitted, it follows the casting rules to a timestamp. |
from_utc_timestamp(timestamp, timezone) | Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC and renders that time as a timestamp in the given time zone. For example, 'GMT+1' would yield '2017-07-14 03:40:00.0'. |
hour(timestamp) | Returns the hour component of the string/timestamp. |
last_day(date) | Returns the last day of the month that the date belongs to. |
localtimestamp() | Returns the current timestamp without time zone at the start of query evaluation. All calls of localtimestamp within the same query return the same value. |
localtimestamp | Returns the current local date-time at the session time zone at the start of query evaluation. |
make_date(year, month, day) | Creates a date from year, month and day fields. If the configuration spark.sql.ansi.enabled is false, the function returns NULL on invalid inputs. Otherwise, it throws an error. |
make_dt_interval([days[, hours[, mins[, secs]]]]) | Makes a DayTimeIntervalType duration from days, hours, mins and secs. |
make_interval([years[, months[, weeks[, days[, hours[, mins[, secs]]]]]]]) | Makes an interval from years, months, weeks, days, hours, mins and secs. |
make_timestamp(year, month, day, hour, min, sec[, timezone]) | Creates a timestamp from year, month, day, hour, min, sec and timezone fields. The result data type is consistent with the value of the configuration spark.sql.timestampType. If the configuration spark.sql.ansi.enabled is false, the function returns NULL on invalid inputs. Otherwise, it throws an error. |
make_timestamp_ltz(year, month, day, hour, min, sec[, timezone]) | Creates the current timestamp with local time zone from year, month, day, hour, min, sec and timezone fields. If the configuration spark.sql.ansi.enabled is false, the function returns NULL on invalid inputs. Otherwise, it throws an error. |
make_timestamp_ntz(year, month, day, hour, min, sec) | Creates a local date-time from year, month, day, hour, min and sec fields. If the configuration spark.sql.ansi.enabled is false, the function returns NULL on invalid inputs. Otherwise, it throws an error. |
make_ym_interval([years[, months]]) | Makes a year-month interval from years and months. |
minute(timestamp) | Returns the minute component of the string/timestamp. |
month(date) | Returns the month component of the date/timestamp. |
months_between(timestamp1, timestamp2[, roundOff]) | If timestamp1 is later than timestamp2, the result is positive. If timestamp1 and timestamp2 are on the same day of the month, or both are the last day of the month, the time of day is ignored. Otherwise, the difference is calculated based on 31 days per month and rounded to 8 digits unless roundOff=false. |
next_day(start_date, day_of_week) | Returns the first date that is later than start_date and named as indicated. The function returns NULL if at least one of the input parameters is NULL. When both input parameters are not NULL and day_of_week is an invalid input, the function throws IllegalArgumentException if spark.sql.ansi.enabled is set to true; otherwise it returns NULL. |
now() | Returns the current timestamp at the start of query evaluation. |
quarter(date) | Returns the quarter of the year for the date, in the range 1 to 4. |
second(timestamp) | Returns the second component of the string/timestamp. |
session_window(time_column, gap_duration) | Generates session windows given a timestamp-specifying column. See 'Types of time windows' in the Structured Streaming guide doc for detailed explanation and examples. |
timestamp_micros(microseconds) | Creates a timestamp from the number of microseconds since the UTC epoch. |
timestamp_millis(milliseconds) | Creates a timestamp from the number of milliseconds since the UTC epoch. |
timestamp_seconds(seconds) | Creates a timestamp from the number of seconds (can be fractional) since the UTC epoch. |
to_date(date_str[, fmt]) | Parses the date_str expression with the fmt expression to a date. Returns NULL for invalid input. By default, if fmt is omitted, it follows the casting rules to a date. |
to_timestamp(timestamp_str[, fmt]) | Parses the timestamp_str expression with the fmt expression to a timestamp. Returns NULL for invalid input. By default, if fmt is omitted, it follows the casting rules to a timestamp. The result data type is consistent with the value of the configuration spark.sql.timestampType. |
to_timestamp_ltz(timestamp_str[, fmt]) | Parses the timestamp_str expression with the fmt expression to a timestamp with local time zone. Returns NULL for invalid input. By default, if fmt is omitted, it follows the casting rules to a timestamp. |
to_timestamp_ntz(timestamp_str[, fmt]) | Parses the timestamp_str expression with the fmt expression to a timestamp without time zone. Returns NULL for invalid input. By default, if fmt is omitted, it follows the casting rules to a timestamp. |
to_unix_timestamp(timeExp[, fmt]) | Returns the UNIX timestamp of the given time. |
to_utc_timestamp(timestamp, timezone) | Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone and renders that time as a timestamp in UTC. For example, 'GMT+1' would yield '2017-07-14 01:40:00.0'. |
trunc(date, fmt) | Returns the date with the time portion of the day truncated to the unit specified by the format model fmt. |
try_to_timestamp(timestamp_str[, fmt]) | Parses the timestamp_str expression with the fmt expression to a timestamp. The function always returns NULL on invalid input, whether or not ANSI SQL mode is enabled. By default, if fmt is omitted, it follows the casting rules to a timestamp. The result data type is consistent with the value of the configuration spark.sql.timestampType. |
unix_date(date) | Returns the number of days since 1970-01-01. |
unix_micros(timestamp) | Returns the number of microseconds since 1970-01-01 00:00:00 UTC. |
unix_millis(timestamp) | Returns the number of milliseconds since 1970-01-01 00:00:00 UTC. Truncates higher levels of precision. |
unix_seconds(timestamp) | Returns the number of seconds since 1970-01-01 00:00:00 UTC. Truncates higher levels of precision. |
unix_timestamp([timeExp[, fmt]]) | Returns the UNIX timestamp of the current or specified time. |
weekday(date) | Returns the day of the week of the date/timestamp (0 = Monday, 1 = Tuesday, …, 6 = Sunday). |
weekofyear(date) | Returns the week of the year of the given date. A week is considered to start on a Monday and week 1 is the first week with >3 days. |
window(time_column, window_duration[, slide_duration[, start_time]]) | Buckets rows into one or more time windows given a timestamp-specifying column. Window starts are inclusive but window ends are exclusive, e.g. 12:05 will be in the window [12:05,12:10) but not in [12:00,12:05). Windows support microsecond precision. Windows in the order of months are not supported. See 'Window Operations on Event Time' in the Structured Streaming guide doc for detailed explanation and examples. |
window_time(window_column) | Extracts the time value from a time/session window column, which can be used as the event time value of the window. The extracted time is (window end - 1), reflecting the fact that aggregating windows have an exclusive upper bound - [start, end). See 'Window Operations on Event Time' in the Structured Streaming guide doc for detailed explanation and examples. |
year(date) | Returns the year component of the date/timestamp. |
Examples
-- add_months
SELECT add_months('2016-08-31', 1);
+-------------------------+
|add_months(2016-08-31, 1)|
+-------------------------+
| 2016-09-30|
+-------------------------+
-- convert_timezone
SELECT convert_timezone('Europe/Brussels', 'America/Los_Angeles', timestamp_ntz'2021-12-06 00:00:00');
+-------------------------------------------------------------------------------------------+
|convert_timezone(Europe/Brussels, America/Los_Angeles, TIMESTAMP_NTZ '2021-12-06 00:00:00')|
+-------------------------------------------------------------------------------------------+
| 2021-12-05 15:00:00|
+-------------------------------------------------------------------------------------------+
SELECT convert_timezone('Europe/Brussels', timestamp_ntz'2021-12-05 15:00:00');
+------------------------------------------------------------------------------------------+
|convert_timezone(current_timezone(), Europe/Brussels, TIMESTAMP_NTZ '2021-12-05 15:00:00')|
+------------------------------------------------------------------------------------------+
| 2021-12-05 16:00:00|
+------------------------------------------------------------------------------------------+
-- curdate
SELECT curdate();
+--------------+
|current_date()|
+--------------+
| 2023-09-09|
+--------------+
-- current_date
SELECT current_date();
+--------------+
|current_date()|
+--------------+
| 2023-09-09|
+--------------+
SELECT current_date;
+--------------+
|current_date()|
+--------------+
| 2023-09-09|
+--------------+
-- current_timestamp
SELECT current_timestamp();
+--------------------+
| current_timestamp()|
+--------------------+
|2023-09-09 05:38:...|
+--------------------+
SELECT current_timestamp;
+--------------------+
| current_timestamp()|
+--------------------+
|2023-09-09 05:38:...|
+--------------------+
-- current_timezone
SELECT current_timezone();
+------------------+
|current_timezone()|
+------------------+
| Etc/UTC|
+------------------+
-- date_add
SELECT date_add('2016-07-30', 1);
+-----------------------+
|date_add(2016-07-30, 1)|
+-----------------------+
| 2016-07-31|
+-----------------------+
-- date_diff
SELECT date_diff('2009-07-31', '2009-07-30');
+---------------------------------+
|date_diff(2009-07-31, 2009-07-30)|
+---------------------------------+
| 1|
+---------------------------------+
SELECT date_diff('2009-07-30', '2009-07-31');
+---------------------------------+
|date_diff(2009-07-30, 2009-07-31)|
+---------------------------------+
| -1|
+---------------------------------+
-- date_format
SELECT date_format('2016-04-08', 'y');
+--------------------------+
|date_format(2016-04-08, y)|
+--------------------------+
| 2016|
+--------------------------+
-- date_from_unix_date
SELECT date_from_unix_date(1);
+----------------------+
|date_from_unix_date(1)|
+----------------------+
| 1970-01-02|
+----------------------+
-- date_part
SELECT date_part('YEAR', TIMESTAMP '2019-08-12 01:00:00.123456');
+-------------------------------------------------------+
|date_part(YEAR, TIMESTAMP '2019-08-12 01:00:00.123456')|
+-------------------------------------------------------+
| 2019|
+-------------------------------------------------------+
SELECT date_part('week', timestamp'2019-08-12 01:00:00.123456');
+-------------------------------------------------------+
|date_part(week, TIMESTAMP '2019-08-12 01:00:00.123456')|
+-------------------------------------------------------+
| 33|
+-------------------------------------------------------+
SELECT date_part('doy', DATE'2019-08-12');
+---------------------------------+
|date_part(doy, DATE '2019-08-12')|
+---------------------------------+
| 224|
+---------------------------------+
SELECT date_part('SECONDS', timestamp'2019-10-01 00:00:01.000001');
+----------------------------------------------------------+
|date_part(SECONDS, TIMESTAMP '2019-10-01 00:00:01.000001')|
+----------------------------------------------------------+
| 1.000001|
+----------------------------------------------------------+
SELECT date_part('days', interval 5 days 3 hours 7 minutes);
+-------------------------------------------------+
|date_part(days, INTERVAL '5 03:07' DAY TO MINUTE)|
+-------------------------------------------------+
| 5|
+-------------------------------------------------+
SELECT date_part('seconds', interval 5 hours 30 seconds 1 milliseconds 1 microseconds);
+-------------------------------------------------------------+
|date_part(seconds, INTERVAL '05:00:30.001001' HOUR TO SECOND)|
+-------------------------------------------------------------+
| 30.001001|
+-------------------------------------------------------------+
SELECT date_part('MONTH', INTERVAL '2021-11' YEAR TO MONTH);
+--------------------------------------------------+
|date_part(MONTH, INTERVAL '2021-11' YEAR TO MONTH)|
+--------------------------------------------------+
| 11|
+--------------------------------------------------+
SELECT date_part('MINUTE', INTERVAL '123 23:55:59.002001' DAY TO SECOND);
+---------------------------------------------------------------+
|date_part(MINUTE, INTERVAL '123 23:55:59.002001' DAY TO SECOND)|
+---------------------------------------------------------------+
| 55|
+---------------------------------------------------------------+
-- date_sub
SELECT date_sub('2016-07-30', 1);
+-----------------------+
|date_sub(2016-07-30, 1)|
+-----------------------+
| 2016-07-29|
+-----------------------+
-- date_trunc
SELECT date_trunc('YEAR', '2015-03-05T09:32:05.359');
+-----------------------------------------+
|date_trunc(YEAR, 2015-03-05T09:32:05.359)|
+-----------------------------------------+
| 2015-01-01 00:00:00|
+-----------------------------------------+
SELECT date_trunc('MM', '2015-03-05T09:32:05.359');
+---------------------------------------+
|date_trunc(MM, 2015-03-05T09:32:05.359)|
+---------------------------------------+
| 2015-03-01 00:00:00|
+---------------------------------------+
SELECT date_trunc('DD', '2015-03-05T09:32:05.359');
+---------------------------------------+
|date_trunc(DD, 2015-03-05T09:32:05.359)|
+---------------------------------------+
| 2015-03-05 00:00:00|
+---------------------------------------+
SELECT date_trunc('HOUR', '2015-03-05T09:32:05.359');
+-----------------------------------------+
|date_trunc(HOUR, 2015-03-05T09:32:05.359)|
+-----------------------------------------+
| 2015-03-05 09:00:00|
+-----------------------------------------+
SELECT date_trunc('MILLISECOND', '2015-03-05T09:32:05.123456');
+---------------------------------------------------+
|date_trunc(MILLISECOND, 2015-03-05T09:32:05.123456)|
+---------------------------------------------------+
| 2015-03-05 09:32:...|
+---------------------------------------------------+
-- dateadd
SELECT dateadd('2016-07-30', 1);
+-----------------------+
|date_add(2016-07-30, 1)|
+-----------------------+
| 2016-07-31|
+-----------------------+
-- datediff
SELECT datediff('2009-07-31', '2009-07-30');
+--------------------------------+
|datediff(2009-07-31, 2009-07-30)|
+--------------------------------+
| 1|
+--------------------------------+
SELECT datediff('2009-07-30', '2009-07-31');
+--------------------------------+
|datediff(2009-07-30, 2009-07-31)|
+--------------------------------+
| -1|
+--------------------------------+
-- datepart
SELECT datepart('YEAR', TIMESTAMP '2019-08-12 01:00:00.123456');
+----------------------------------------------------------+
|datepart(YEAR FROM TIMESTAMP '2019-08-12 01:00:00.123456')|
+----------------------------------------------------------+
| 2019|
+----------------------------------------------------------+
SELECT datepart('week', timestamp'2019-08-12 01:00:00.123456');
+----------------------------------------------------------+
|datepart(week FROM TIMESTAMP '2019-08-12 01:00:00.123456')|
+----------------------------------------------------------+
| 33|
+----------------------------------------------------------+
SELECT datepart('doy', DATE'2019-08-12');
+------------------------------------+
|datepart(doy FROM DATE '2019-08-12')|
+------------------------------------+
| 224|
+------------------------------------+
SELECT datepart('SECONDS', timestamp'2019-10-01 00:00:01.000001');
+-------------------------------------------------------------+
|datepart(SECONDS FROM TIMESTAMP '2019-10-01 00:00:01.000001')|
+-------------------------------------------------------------+
| 1.000001|
+-------------------------------------------------------------+
SELECT datepart('days', interval 5 days 3 hours 7 minutes);
+----------------------------------------------------+
|datepart(days FROM INTERVAL '5 03:07' DAY TO MINUTE)|
+----------------------------------------------------+
| 5|
+----------------------------------------------------+
SELECT datepart('seconds', interval 5 hours 30 seconds 1 milliseconds 1 microseconds);
+----------------------------------------------------------------+
|datepart(seconds FROM INTERVAL '05:00:30.001001' HOUR TO SECOND)|
+----------------------------------------------------------------+
| 30.001001|
+----------------------------------------------------------------+
SELECT datepart('MONTH', INTERVAL '2021-11' YEAR TO MONTH);
+-----------------------------------------------------+
|datepart(MONTH FROM INTERVAL '2021-11' YEAR TO MONTH)|
+-----------------------------------------------------+
| 11|
+-----------------------------------------------------+
SELECT datepart('MINUTE', INTERVAL '123 23:55:59.002001' DAY TO SECOND);
+------------------------------------------------------------------+
|datepart(MINUTE FROM INTERVAL '123 23:55:59.002001' DAY TO SECOND)|
+------------------------------------------------------------------+
| 55|
+------------------------------------------------------------------+
-- day
SELECT day('2009-07-30');
+---------------+
|day(2009-07-30)|
+---------------+
| 30|
+---------------+
-- dayofmonth
SELECT dayofmonth('2009-07-30');
+----------------------+
|dayofmonth(2009-07-30)|
+----------------------+
| 30|
+----------------------+
-- dayofweek
SELECT dayofweek('2009-07-30');
+---------------------+
|dayofweek(2009-07-30)|
+---------------------+
| 5|
+---------------------+
-- dayofyear
SELECT dayofyear('2016-04-09');
+---------------------+
|dayofyear(2016-04-09)|
+---------------------+
| 100|
+---------------------+
-- extract
SELECT extract(YEAR FROM TIMESTAMP '2019-08-12 01:00:00.123456');
+---------------------------------------------------------+
|extract(YEAR FROM TIMESTAMP '2019-08-12 01:00:00.123456')|
+---------------------------------------------------------+
| 2019|
+---------------------------------------------------------+
SELECT extract(week FROM timestamp'2019-08-12 01:00:00.123456');
+---------------------------------------------------------+
|extract(week FROM TIMESTAMP '2019-08-12 01:00:00.123456')|
+---------------------------------------------------------+
| 33|
+---------------------------------------------------------+
SELECT extract(doy FROM DATE'2019-08-12');
+-----------------------------------+
|extract(doy FROM DATE '2019-08-12')|
+-----------------------------------+
| 224|
+-----------------------------------+
SELECT extract(SECONDS FROM timestamp'2019-10-01 00:00:01.000001');
+------------------------------------------------------------+
|extract(SECONDS FROM TIMESTAMP '2019-10-01 00:00:01.000001')|
+------------------------------------------------------------+
| 1.000001|
+------------------------------------------------------------+
SELECT extract(days FROM interval 5 days 3 hours 7 minutes);
+---------------------------------------------------+
|extract(days FROM INTERVAL '5 03:07' DAY TO MINUTE)|
+---------------------------------------------------+
| 5|
+---------------------------------------------------+
SELECT extract(seconds FROM interval 5 hours 30 seconds 1 milliseconds 1 microseconds);
+---------------------------------------------------------------+
|extract(seconds FROM INTERVAL '05:00:30.001001' HOUR TO SECOND)|
+---------------------------------------------------------------+
| 30.001001|
+---------------------------------------------------------------+
SELECT extract(MONTH FROM INTERVAL '2021-11' YEAR TO MONTH);
+----------------------------------------------------+
|extract(MONTH FROM INTERVAL '2021-11' YEAR TO MONTH)|
+----------------------------------------------------+
| 11|
+----------------------------------------------------+
SELECT extract(MINUTE FROM INTERVAL '123 23:55:59.002001' DAY TO SECOND);
+-----------------------------------------------------------------+
|extract(MINUTE FROM INTERVAL '123 23:55:59.002001' DAY TO SECOND)|
+-----------------------------------------------------------------+
| 55|
+-----------------------------------------------------------------+
-- from_unixtime
SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss');
+-------------------------------------+
|from_unixtime(0, yyyy-MM-dd HH:mm:ss)|
+-------------------------------------+
| 1970-01-01 00:00:00|
+-------------------------------------+
SELECT from_unixtime(0);
+-------------------------------------+
|from_unixtime(0, yyyy-MM-dd HH:mm:ss)|
+-------------------------------------+
| 1970-01-01 00:00:00|
+-------------------------------------+
-- from_utc_timestamp
SELECT from_utc_timestamp('2016-08-31', 'Asia/Seoul');
+------------------------------------------+
|from_utc_timestamp(2016-08-31, Asia/Seoul)|
+------------------------------------------+
| 2016-08-31 09:00:00|
+------------------------------------------+
-- hour
SELECT hour('2009-07-30 12:58:59');
+-------------------------+
|hour(2009-07-30 12:58:59)|
+-------------------------+
| 12|
+-------------------------+
-- last_day
SELECT last_day('2009-01-12');
+--------------------+
|last_day(2009-01-12)|
+--------------------+
| 2009-01-31|
+--------------------+
-- localtimestamp
SELECT localtimestamp();
+--------------------+
| localtimestamp()|
+--------------------+
|2023-09-09 05:38:...|
+--------------------+
-- make_date
SELECT make_date(2013, 7, 15);
+----------------------+
|make_date(2013, 7, 15)|
+----------------------+
| 2013-07-15|
+----------------------+
SELECT make_date(2019, 7, NULL);
+------------------------+
|make_date(2019, 7, NULL)|
+------------------------+
| NULL|
+------------------------+
-- make_dt_interval
SELECT make_dt_interval(1, 12, 30, 01.001001);
+-------------------------------------+
|make_dt_interval(1, 12, 30, 1.001001)|
+-------------------------------------+
| INTERVAL '1 12:30...|
+-------------------------------------+
SELECT make_dt_interval(2);
+-----------------------------------+
|make_dt_interval(2, 0, 0, 0.000000)|
+-----------------------------------+
| INTERVAL '2 00:00...|
+-----------------------------------+
SELECT make_dt_interval(100, null, 3);
+----------------------------------------+
|make_dt_interval(100, NULL, 3, 0.000000)|
+----------------------------------------+
| NULL|
+----------------------------------------+
-- make_interval
SELECT make_interval(100, 11, 1, 1, 12, 30, 01.001001);
+----------------------------------------------+
|make_interval(100, 11, 1, 1, 12, 30, 1.001001)|
+----------------------------------------------+
| 100 years 11 mont...|
+----------------------------------------------+
SELECT make_interval(100, null, 3);
+----------------------------------------------+
|make_interval(100, NULL, 3, 0, 0, 0, 0.000000)|
+----------------------------------------------+
| NULL|
+----------------------------------------------+
SELECT make_interval(0, 1, 0, 1, 0, 0, 100.000001);
+-------------------------------------------+
|make_interval(0, 1, 0, 1, 0, 0, 100.000001)|
+-------------------------------------------+
| 1 months 1 days 1...|
+-------------------------------------------+
-- make_timestamp
SELECT make_timestamp(2014, 12, 28, 6, 30, 45.887);
+-------------------------------------------+
|make_timestamp(2014, 12, 28, 6, 30, 45.887)|
+-------------------------------------------+
| 2014-12-28 06:30:...|
+-------------------------------------------+
SELECT make_timestamp(2014, 12, 28, 6, 30, 45.887, 'CET');
+------------------------------------------------+
|make_timestamp(2014, 12, 28, 6, 30, 45.887, CET)|
+------------------------------------------------+
| 2014-12-28 05:30:...|
+------------------------------------------------+
SELECT make_timestamp(2019, 6, 30, 23, 59, 60);
+---------------------------------------+
|make_timestamp(2019, 6, 30, 23, 59, 60)|
+---------------------------------------+
| 2019-07-01 00:00:00|
+---------------------------------------+
SELECT make_timestamp(2019, 6, 30, 23, 59, 1);
+--------------------------------------+
|make_timestamp(2019, 6, 30, 23, 59, 1)|
+--------------------------------------+
| 2019-06-30 23:59:01|
+--------------------------------------+
SELECT make_timestamp(null, 7, 22, 15, 30, 0);
+--------------------------------------+
|make_timestamp(NULL, 7, 22, 15, 30, 0)|
+--------------------------------------+
| NULL|
+--------------------------------------+
-- make_timestamp_ltz
SELECT make_timestamp_ltz(2014, 12, 28, 6, 30, 45.887);
+-----------------------------------------------+
|make_timestamp_ltz(2014, 12, 28, 6, 30, 45.887)|
+-----------------------------------------------+
| 2014-12-28 06:30:...|
+-----------------------------------------------+
SELECT make_timestamp_ltz(2014, 12, 28, 6, 30, 45.887, 'CET');
+----------------------------------------------------+
|make_timestamp_ltz(2014, 12, 28, 6, 30, 45.887, CET)|
+----------------------------------------------------+
| 2014-12-28 05:30:...|
+----------------------------------------------------+
SELECT make_timestamp_ltz(2019, 6, 30, 23, 59, 60);
+-------------------------------------------+
|make_timestamp_ltz(2019, 6, 30, 23, 59, 60)|
+-------------------------------------------+
| 2019-07-01 00:00:00|
+-------------------------------------------+
SELECT make_timestamp_ltz(null, 7, 22, 15, 30, 0);
+------------------------------------------+
|make_timestamp_ltz(NULL, 7, 22, 15, 30, 0)|
+------------------------------------------+
| NULL|
+------------------------------------------+
-- make_timestamp_ntz
SELECT make_timestamp_ntz(2014, 12, 28, 6, 30, 45.887);
+-----------------------------------------------+
|make_timestamp_ntz(2014, 12, 28, 6, 30, 45.887)|
+-----------------------------------------------+
| 2014-12-28 06:30:...|
+-----------------------------------------------+
SELECT make_timestamp_ntz(2019, 6, 30, 23, 59, 60);
+-------------------------------------------+
|make_timestamp_ntz(2019, 6, 30, 23, 59, 60)|
+-------------------------------------------+
| 2019-07-01 00:00:00|
+-------------------------------------------+
SELECT make_timestamp_ntz(null, 7, 22, 15, 30, 0);
+------------------------------------------+
|make_timestamp_ntz(NULL, 7, 22, 15, 30, 0)|
+------------------------------------------+
| NULL|
+------------------------------------------+
-- make_ym_interval
SELECT make_ym_interval(1, 2);
+----------------------+
|make_ym_interval(1, 2)|
+----------------------+
| INTERVAL '1-2' YE...|
+----------------------+
SELECT make_ym_interval(1, 0);
+----------------------+
|make_ym_interval(1, 0)|
+----------------------+
| INTERVAL '1-0' YE...|
+----------------------+
SELECT make_ym_interval(-1, 1);
+-----------------------+
|make_ym_interval(-1, 1)|
+-----------------------+
| INTERVAL '-0-11' ...|
+-----------------------+
SELECT make_ym_interval(2);
+----------------------+
|make_ym_interval(2, 0)|
+----------------------+
| INTERVAL '2-0' YE...|
+----------------------+
-- minute
SELECT minute('2009-07-30 12:58:59');
+---------------------------+
|minute(2009-07-30 12:58:59)|
+---------------------------+
| 58|
+---------------------------+
-- month
SELECT month('2016-07-30');
+-----------------+
|month(2016-07-30)|
+-----------------+
| 7|
+-----------------+
-- months_between
SELECT months_between('1997-02-28 10:30:00', '1996-10-30');
+-----------------------------------------------------+
|months_between(1997-02-28 10:30:00, 1996-10-30, true)|
+-----------------------------------------------------+
| 3.94959677|
+-----------------------------------------------------+
SELECT months_between('1997-02-28 10:30:00', '1996-10-30', false);
+------------------------------------------------------+
|months_between(1997-02-28 10:30:00, 1996-10-30, false)|
+------------------------------------------------------+
| 3.9495967741935485|
+------------------------------------------------------+
-- next_day
SELECT next_day('2015-01-14', 'TU');
+------------------------+
|next_day(2015-01-14, TU)|
+------------------------+
| 2015-01-20|
+------------------------+
-- now
SELECT now();
+--------------------+
| now()|
+--------------------+
|2023-09-09 05:38:...|
+--------------------+
-- quarter
SELECT quarter('2016-08-31');
+-------------------+
|quarter(2016-08-31)|
+-------------------+
| 3|
+-------------------+
-- second
SELECT second('2009-07-30 12:58:59');
+---------------------------+
|second(2009-07-30 12:58:59)|
+---------------------------+
| 59|
+---------------------------+
-- session_window
SELECT a, session_window.start, session_window.end, count(*) as cnt FROM VALUES ('A1', '2021-01-01 00:00:00'), ('A1', '2021-01-01 00:04:30'), ('A1', '2021-01-01 00:10:00'), ('A2', '2021-01-01 00:01:00') AS tab(a, b) GROUP by a, session_window(b, '5 minutes') ORDER BY a, start;
+---+-------------------+-------------------+---+
| a| start| end|cnt|
+---+-------------------+-------------------+---+
| A1|2021-01-01 00:00:00|2021-01-01 00:09:30| 2|
| A1|2021-01-01 00:10:00|2021-01-01 00:15:00| 1|
| A2|2021-01-01 00:01:00|2021-01-01 00:06:00| 1|
+---+-------------------+-------------------+---+
SELECT a, session_window.start, session_window.end, count(*) as cnt FROM VALUES ('A1', '2021-01-01 00:00:00'), ('A1', '2021-01-01 00:04:30'), ('A1', '2021-01-01 00:10:00'), ('A2', '2021-01-01 00:01:00'), ('A2', '2021-01-01 00:04:30') AS tab(a, b) GROUP by a, session_window(b, CASE WHEN a = 'A1' THEN '5 minutes' WHEN a = 'A2' THEN '1 minute' ELSE '10 minutes' END) ORDER BY a, start;
+---+-------------------+-------------------+---+
| a| start| end|cnt|
+---+-------------------+-------------------+---+
| A1|2021-01-01 00:00:00|2021-01-01 00:09:30| 2|
| A1|2021-01-01 00:10:00|2021-01-01 00:15:00| 1|
| A2|2021-01-01 00:01:00|2021-01-01 00:02:00| 1|
| A2|2021-01-01 00:04:30|2021-01-01 00:05:30| 1|
+---+-------------------+-------------------+---+
-- timestamp_micros
SELECT timestamp_micros(1230219000123123);
+----------------------------------+
|timestamp_micros(1230219000123123)|
+----------------------------------+
| 2008-12-25 15:30:...|
+----------------------------------+
-- timestamp_millis
SELECT timestamp_millis(1230219000123);
+-------------------------------+
|timestamp_millis(1230219000123)|
+-------------------------------+
| 2008-12-25 15:30:...|
+-------------------------------+
-- timestamp_seconds
SELECT timestamp_seconds(1230219000);
+-----------------------------+
|timestamp_seconds(1230219000)|
+-----------------------------+
| 2008-12-25 15:30:00|
+-----------------------------+
SELECT timestamp_seconds(1230219000.123);
+---------------------------------+
|timestamp_seconds(1230219000.123)|
+---------------------------------+
| 2008-12-25 15:30:...|
+---------------------------------+
-- to_date
SELECT to_date('2009-07-30 04:17:52');
+----------------------------+
|to_date(2009-07-30 04:17:52)|
+----------------------------+
| 2009-07-30|
+----------------------------+
SELECT to_date('2016-12-31', 'yyyy-MM-dd');
+-------------------------------+
|to_date(2016-12-31, yyyy-MM-dd)|
+-------------------------------+
| 2016-12-31|
+-------------------------------+
-- to_timestamp
SELECT to_timestamp('2016-12-31 00:12:00');
+---------------------------------+
|to_timestamp(2016-12-31 00:12:00)|
+---------------------------------+
| 2016-12-31 00:12:00|
+---------------------------------+
SELECT to_timestamp('2016-12-31', 'yyyy-MM-dd');
+------------------------------------+
|to_timestamp(2016-12-31, yyyy-MM-dd)|
+------------------------------------+
| 2016-12-31 00:00:00|
+------------------------------------+
-- to_timestamp_ltz
SELECT to_timestamp_ltz('2016-12-31 00:12:00');
+-------------------------------------+
|to_timestamp_ltz(2016-12-31 00:12:00)|
+-------------------------------------+
| 2016-12-31 00:12:00|
+-------------------------------------+
SELECT to_timestamp_ltz('2016-12-31', 'yyyy-MM-dd');
+----------------------------------------+
|to_timestamp_ltz(2016-12-31, yyyy-MM-dd)|
+----------------------------------------+
| 2016-12-31 00:00:00|
+----------------------------------------+
-- to_timestamp_ntz
SELECT to_timestamp_ntz('2016-12-31 00:12:00');
+-------------------------------------+
|to_timestamp_ntz(2016-12-31 00:12:00)|
+-------------------------------------+
| 2016-12-31 00:12:00|
+-------------------------------------+
SELECT to_timestamp_ntz('2016-12-31', 'yyyy-MM-dd');
+----------------------------------------+
|to_timestamp_ntz(2016-12-31, yyyy-MM-dd)|
+----------------------------------------+
| 2016-12-31 00:00:00|
+----------------------------------------+
-- to_unix_timestamp
SELECT to_unix_timestamp('2016-04-08', 'yyyy-MM-dd');
+-----------------------------------------+
|to_unix_timestamp(2016-04-08, yyyy-MM-dd)|
+-----------------------------------------+
| 1460073600|
+-----------------------------------------+
-- to_utc_timestamp
SELECT to_utc_timestamp('2016-08-31', 'Asia/Seoul');
+----------------------------------------+
|to_utc_timestamp(2016-08-31, Asia/Seoul)|
+----------------------------------------+
| 2016-08-30 15:00:00|
+----------------------------------------+
-- trunc
SELECT trunc('2019-08-04', 'week');
+-----------------------+
|trunc(2019-08-04, week)|
+-----------------------+
| 2019-07-29|
+-----------------------+
SELECT trunc('2019-08-04', 'quarter');
+--------------------------+
|trunc(2019-08-04, quarter)|
+--------------------------+
| 2019-07-01|
+--------------------------+
SELECT trunc('2009-02-12', 'MM');
+---------------------+
|trunc(2009-02-12, MM)|
+---------------------+
| 2009-02-01|
+---------------------+
SELECT trunc('2015-10-27', 'YEAR');
+-----------------------+
|trunc(2015-10-27, YEAR)|
+-----------------------+
| 2015-01-01|
+-----------------------+
-- try_to_timestamp
SELECT try_to_timestamp('2016-12-31 00:12:00');
+-------------------------------------+
|try_to_timestamp(2016-12-31 00:12:00)|
+-------------------------------------+
| 2016-12-31 00:12:00|
+-------------------------------------+
SELECT try_to_timestamp('2016-12-31', 'yyyy-MM-dd');
+----------------------------------------+
|try_to_timestamp(2016-12-31, yyyy-MM-dd)|
+----------------------------------------+
| 2016-12-31 00:00:00|
+----------------------------------------+
SELECT try_to_timestamp('foo', 'yyyy-MM-dd');
+---------------------------------+
|try_to_timestamp(foo, yyyy-MM-dd)|
+---------------------------------+
| NULL|
+---------------------------------+
-- unix_date
SELECT unix_date(DATE("1970-01-02"));
+---------------------+
|unix_date(1970-01-02)|
+---------------------+
| 1|
+---------------------+
-- unix_micros
SELECT unix_micros(TIMESTAMP('1970-01-01 00:00:01Z'));
+---------------------------------+
|unix_micros(1970-01-01 00:00:01Z)|
+---------------------------------+
| 1000000|
+---------------------------------+
-- unix_millis
SELECT unix_millis(TIMESTAMP('1970-01-01 00:00:01Z'));
+---------------------------------+
|unix_millis(1970-01-01 00:00:01Z)|
+---------------------------------+
| 1000|
+---------------------------------+
-- unix_seconds
SELECT unix_seconds(TIMESTAMP('1970-01-01 00:00:01Z'));
+----------------------------------+
|unix_seconds(1970-01-01 00:00:01Z)|
+----------------------------------+
| 1|
+----------------------------------+
-- unix_timestamp
SELECT unix_timestamp();
+--------------------------------------------------------+
|unix_timestamp(current_timestamp(), yyyy-MM-dd HH:mm:ss)|
+--------------------------------------------------------+
| 1694237890|
+--------------------------------------------------------+
SELECT unix_timestamp('2016-04-08', 'yyyy-MM-dd');
+--------------------------------------+
|unix_timestamp(2016-04-08, yyyy-MM-dd)|
+--------------------------------------+
| 1460073600|
+--------------------------------------+
-- weekday
SELECT weekday('2009-07-30');
+-------------------+
|weekday(2009-07-30)|
+-------------------+
| 3|
+-------------------+
-- weekofyear
SELECT weekofyear('2008-02-20');
+----------------------+
|weekofyear(2008-02-20)|
+----------------------+
| 8|
+----------------------+
-- window
SELECT a, window.start, window.end, count(*) as cnt FROM VALUES ('A1', '2021-01-01 00:00:00'), ('A1', '2021-01-01 00:04:30'), ('A1', '2021-01-01 00:06:00'), ('A2', '2021-01-01 00:01:00') AS tab(a, b) GROUP by a, window(b, '5 minutes') ORDER BY a, start;
+---+-------------------+-------------------+---+
| a| start| end|cnt|
+---+-------------------+-------------------+---+
| A1|2021-01-01 00:00:00|2021-01-01 00:05:00| 2|
| A1|2021-01-01 00:05:00|2021-01-01 00:10:00| 1|
| A2|2021-01-01 00:00:00|2021-01-01 00:05:00| 1|
+---+-------------------+-------------------+---+
SELECT a, window.start, window.end, count(*) as cnt FROM VALUES ('A1', '2021-01-01 00:00:00'), ('A1', '2021-01-01 00:04:30'), ('A1', '2021-01-01 00:06:00'), ('A2', '2021-01-01 00:01:00') AS tab(a, b) GROUP by a, window(b, '10 minutes', '5 minutes') ORDER BY a, start;
+---+-------------------+-------------------+---+
| a| start| end|cnt|
+---+-------------------+-------------------+---+
| A1|2020-12-31 23:55:00|2021-01-01 00:05:00| 2|
| A1|2021-01-01 00:00:00|2021-01-01 00:10:00| 3|
| A1|2021-01-01 00:05:00|2021-01-01 00:15:00| 1|
| A2|2020-12-31 23:55:00|2021-01-01 00:05:00| 1|
| A2|2021-01-01 00:00:00|2021-01-01 00:10:00| 1|
+---+-------------------+-------------------+---+
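window 函数同样可以在DataFrame API中使用(在结构化流中尤为常见)。下面是一个仅作示意的Scala草图,用org.apache.spark.sql.functions.window 复现上面按5分钟滚动窗口分组计数的效果,数据与列名(a、b)沿用上例:
import spark.implicits._  // 假设spark 已按文首方式创建
import org.apache.spark.sql.functions.{col, window}
// 与上面SQL示例相同的数据,b列先转换为timestamp类型
val events = Seq(
  ("A1", "2021-01-01 00:00:00"),
  ("A1", "2021-01-01 00:04:30"),
  ("A1", "2021-01-01 00:06:00"),
  ("A2", "2021-01-01 00:01:00")
).toDF("a", "b").withColumn("b", col("b").cast("timestamp"))
// 按a和5分钟滚动窗口分组计数,等价于SQL中的window(b, '5 minutes')
events.groupBy(col("a"), window(col("b"), "5 minutes"))
  .count()
  .orderBy(col("a"), col("window.start"))
  .show(truncate = false)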
-- window_time
SELECT a, window.start as start, window.end as end, window_time(window), cnt FROM (SELECT a, window, count(*) as cnt FROM VALUES ('A1', '2021-01-01 00:00:00'), ('A1', '2021-01-01 00:04:30'), ('A1', '2021-01-01 00:06:00'), ('A2', '2021-01-01 00:01:00') AS tab(a, b) GROUP by a, window(b, '5 minutes') ORDER BY a, window.start);
+---+-------------------+-------------------+--------------------+---+
| a| start| end| window_time(window)|cnt|
+---+-------------------+-------------------+--------------------+---+
| A1|2021-01-01 00:00:00|2021-01-01 00:05:00|2021-01-01 00:04:...| 2|
| A1|2021-01-01 00:05:00|2021-01-01 00:10:00|2021-01-01 00:09:...| 1|
| A2|2021-01-01 00:00:00|2021-01-01 00:05:00|2021-01-01 00:04:...| 1|
+---+-------------------+-------------------+--------------------+---+
-- year
SELECT year('2016-07-30');
+----------------+
|year(2016-07-30)|
+----------------+
| 2016|
+----------------+
JSON函数
函数名 | 描述 |
---|---|
from_json(jsonStr, schema[, options]) | 使用给定的jsonStr 和schema 返回一个结构化值。 |
get_json_object(json_txt, path) | 从path 中提取一个JSON对象。 |
json_array_length(jsonArray) | 返回最外层JSON数组中的元素数量。 |
json_object_keys(json_object) | 将最外层JSON对象的所有键作为数组返回。 |
json_tuple(jsonStr, p1, p2, …, pn) | 返回一个类似于get_json_object函数的元组,但它接受多个名称。所有输入参数和输出列类型都是字符串。 |
schema_of_json(json[, options]) | 返回JSON字符串的DDL格式的模式。 |
to_json(expr[, options]) | 使用给定的结构化值返回一个JSON字符串。 |
示例
-- from_json
SELECT from_json('{"a":1, "b":0.8}', 'a INT, b DOUBLE');
+---------------------------+
|from_json({"a":1, "b":0.8})|
+---------------------------+
| {1, 0.8}|
+---------------------------+
SELECT from_json('{"time":"26/08/2015"}', 'time Timestamp', map('timestampFormat', 'dd/MM/yyyy'));
+--------------------------------+
|from_json({"time":"26/08/2015"})|
+--------------------------------+
| {2015-08-26 00:00...|
+--------------------------------+
SELECT from_json('{"teacher": "Alice", "student": [{"name": "Bob", "rank": 1}, {"name": "Charlie", "rank": 2}]}', 'STRUCT<teacher: STRING, student: ARRAY<STRUCT<name: STRING, rank: INT>>>');
+--------------------------------------------------------------------------------------------------------+
|from_json({"teacher": "Alice", "student": [{"name": "Bob", "rank": 1}, {"name": "Charlie", "rank": 2}]})|
+--------------------------------------------------------------------------------------------------------+
| {Alice, [{Bob, 1}...|
+--------------------------------------------------------------------------------------------------------+
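在Scala中,from_json 也可以配合显式的StructType 模式作用于DataFrame的字符串列。下面是一个仅作示意的草图(列名json 为本例自拟),与上面第一个SQL示例等价:
import spark.implicits._  // 假设spark 已按文首方式创建
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types.{StructType, StructField, IntegerType, DoubleType}
// 对应SQL中的模式字符串 'a INT, b DOUBLE'
val jsonSchema = StructType(Seq(
  StructField("a", IntegerType),
  StructField("b", DoubleType)
))
val jsonDF = Seq("""{"a":1, "b":0.8}""").toDF("json")
// 把JSON字符串列解析为struct列
jsonDF.select(from_json($"json", jsonSchema).as("parsed")).show()
// parsed 列预期为 {1, 0.8}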
-- get_json_object
SELECT get_json_object('{"a":"b"}', '$.a');
+-------------------------------+
|get_json_object({"a":"b"}, $.a)|
+-------------------------------+
| b|
+-------------------------------+
-- json_array_length
SELECT json_array_length('[1,2,3,4]');
+----------------------------+
|json_array_length([1,2,3,4])|
+----------------------------+
| 4|
+----------------------------+
SELECT json_array_length('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+------------------------------------------------+
|json_array_length([1,2,3,{"f1":1,"f2":[5,6]},4])|
+------------------------------------------------+
| 5|
+------------------------------------------------+
SELECT json_array_length('[1,2');
+-----------------------+
|json_array_length([1,2)|
+-----------------------+
| NULL|
+-----------------------+
-- json_object_keys
SELECT json_object_keys('{}');
+--------------------+
|json_object_keys({})|
+--------------------+
| []|
+--------------------+
SELECT json_object_keys('{"key": "value"}');
+----------------------------------+
|json_object_keys({"key": "value"})|
+----------------------------------+
| [key]|
+----------------------------------+
SELECT json_object_keys('{"f1":"abc","f2":{"f3":"a", "f4":"b"}}');
+--------------------------------------------------------+
|json_object_keys({"f1":"abc","f2":{"f3":"a", "f4":"b"}})|
+--------------------------------------------------------+
| [f1, f2]|
+--------------------------------------------------------+
-- json_tuple
SELECT json_tuple('{"a":1, "b":2}', 'a', 'b');
+---+---+
| c0| c1|
+---+---+
| 1| 2|
+---+---+
-- schema_of_json
SELECT schema_of_json('[{"col":0}]');
+---------------------------+
|schema_of_json([{"col":0}])|
+---------------------------+
| ARRAY<STRUCT<col:...|
+---------------------------+
SELECT schema_of_json('[{"col":01}]', map('allowNumericLeadingZeros', 'true'));
+----------------------------+
|schema_of_json([{"col":01}])|
+----------------------------+
| ARRAY<STRUCT<col:...|
+----------------------------+
-- to_json
SELECT to_json(named_struct('a', 1, 'b', 2));
+---------------------------------+
|to_json(named_struct(a, 1, b, 2))|
+---------------------------------+
| {"a":1,"b":2}|
+---------------------------------+
SELECT to_json(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
+-----------------------------------------------------------------+
|to_json(named_struct(time, to_timestamp(2015-08-26, yyyy-MM-dd)))|
+-----------------------------------------------------------------+
| {"time":"26/08/20...|
+-----------------------------------------------------------------+
SELECT to_json(array(named_struct('a', 1, 'b', 2)));
+----------------------------------------+
|to_json(array(named_struct(a, 1, b, 2)))|
+----------------------------------------+
| [{"a":1,"b":2}]|
+----------------------------------------+
SELECT to_json(map('a', named_struct('b', 1)));
+-----------------------------------+
|to_json(map(a, named_struct(b, 1)))|
+-----------------------------------+
| {"a":{"b":1}}|
+-----------------------------------+
SELECT to_json(map(named_struct('a', 1),named_struct('b', 2)));
+----------------------------------------------------+
|to_json(map(named_struct(a, 1), named_struct(b, 2)))|
+----------------------------------------------------+
| {"[1]":{"b":2}}|
+----------------------------------------------------+
SELECT to_json(map('a', 1));
+------------------+
|to_json(map(a, 1))|
+------------------+
| {"a":1}|
+------------------+
SELECT to_json(array(map('a', 1)));
+-------------------------+
|to_json(array(map(a, 1)))|
+-------------------------+
| [{"a":1}]|
+-------------------------+
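to_json 与from_json 互为逆操作,常用于把结构化列重新序列化为JSON字符串。下面是一个仅作示意的Scala草图,使用struct 与to_json,效果与SQL中的to_json(named_struct('a', 1, 'b', 2))相当:
import spark.implicits._  // 假设spark 已按文首方式创建
import org.apache.spark.sql.functions.{struct, to_json}
// 把两列打包成struct,再序列化为JSON字符串
val kv = Seq((1, 2)).toDF("a", "b")
kv.select(to_json(struct($"a", $"b")).as("json")).show(truncate = false)
// json 列预期为 {"a":1,"b":2}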
数学函数
函数名 | 描述 |
---|---|
expr1 % expr2 | 返回expr1 除以expr2 的余数。 |
expr1 * expr2 | 返回expr1 乘以expr2 的结果。 |
expr1 + expr2 | 返回expr1 加上expr2 的结果。 |
expr1 - expr2 | 返回expr1 减去expr2 的结果。 |
expr1 / expr2 | 返回expr1 除以expr2 的结果。它始终执行浮点除法。 |
abs(expr) | 返回数字或间隔值的绝对值。 |
acos(expr) | 返回expr 的反余弦,就像由java.lang.Math.acos 计算的一样。 |
acosh(expr) | 返回expr 的反双曲余弦。 |
asin(expr) | 返回expr 的反正弦,就像由java.lang.Math.asin 计算的一样。 |
asinh(expr) | 返回expr 的反双曲正弦。 |
atan(expr) | 返回expr 的反正切,就像由java.lang.Math.atan 计算的一样。 |
atan2(exprY, exprX) | 返回平面上正x轴和由坐标(exprX ,exprY )给出的点之间的角度(以弧度为单位),就像由java.lang.Math.atan2 计算的一样。 |
atanh(expr) | 返回expr 的反双曲正切。 |
bin(expr) | 返回以二进制表示的长整型值expr 的字符串表示形式。 |
bround(expr, d) | 使用HALF_EVEN舍入模式将expr 四舍五入为d 位小数。 |
cbrt(expr) | 返回expr 的立方根。 |
ceil(expr[, scale]) | 返回不小于expr 的最小数。可以指定一个可选的scale 参数来控制舍入行为。 |
ceiling(expr[, scale]) | 返回不小于expr 的最小数。可以指定一个可选的scale 参数来控制舍入行为。 |
conv(num, from_base, to_base) | 将num 从from_base 转换为to_base 。 |
cos(expr) | 返回expr 的余弦,就像由java.lang.Math.cos 计算的一样。 |
cosh(expr) | 返回expr 的双曲余弦,就像由java.lang.Math.cosh 计算的一样。 |
cot(expr) | 返回expr 的余切,就像由1/java.lang.Math.tan 计算的一样。 |
csc(expr) | 返回expr 的余割,就像由1/java.lang.Math.sin 计算的一样。 |
degrees(expr) | 将弧度转换为度。 |
expr1 div expr2 | 将expr1 除以expr2 。如果操作数为NULL或expr2 为0,则返回NULL。结果转换为长整型。 |
e() | 返回自然对数的底数e。 |
exp(expr) | 返回e的expr 次幂。 |
expm1(expr) | 返回e的expr 次幂减去1。 |
factorial(expr) | 返回expr 的阶乘。expr 的取值范围为[0…20]。否则,返回NULL。 |
floor(expr[, scale]) | 返回不大于expr 的最大数。可以指定一个可选的scale 参数来控制舍入行为。 |
greatest(expr, …) | 返回所有参数中的最大值,跳过空值。 |
hex(expr) | 将expr 转换为十六进制。 |
hypot(expr1, expr2) | 返回sqrt(expr1 **2 + expr2 **2)。 |
least(expr, …) | 返回所有参数中的最小值,跳过空值。 |
ln(expr) | 返回expr 的自然对数(以e为底)。 |
log(base, expr) | 返回以base 为底的expr 的对数。 |
log10(expr) | 返回以10为底的expr 的对数。 |
log1p(expr) | 返回log(1 + expr )。 |
log2(expr) | 返回以2为底的expr 的对数。 |
expr1 mod expr2 | 返回expr1 除以expr2 的余数。 |
negative(expr) | 返回expr 的负值。 |
pi() | 返回π。 |
pmod(expr1, expr2) | 返回expr1 模expr2 的正值。 |
positive(expr) | 返回expr 的值。 |
pow(expr1, expr2) | 将expr1 的值提高到expr2 次方。 |
power(expr1, expr2) | 将expr1 的值提高到expr2 次方。 |
radians(expr) | 将度数转换为弧度。 |
rand([seed]) | 返回[0, 1)范围内独立同分布(i.i.d.)的随机值。 |
randn([seed]) | 返回从标准正态分布中抽取的独立同分布(i.i.d.)的随机值。 |
random([seed]) | 返回[0, 1)范围内独立同分布(i.i.d.)的随机值。 |
rint(expr) | 返回与参数最接近的double值,该值等于一个整数。 |
round(expr, d) | 使用HALF_UP舍入模式将expr 四舍五入为d 位小数。 |
sec(expr) | 返回expr 的正割,就像由1/java.lang.Math.cos 计算的一样。 |
shiftleft(base, expr) | 按位左移。 |
sign(expr) | 如果expr 为负数,则返回-1.0;如果expr 为0,则返回0.0;如果expr 为正数,则返回1.0。 |
signum(expr) | 如果expr 为负数,则返回-1.0;如果expr 为0,则返回0.0;如果expr 为正数,则返回1.0。 |
sin(expr) | 返回expr 的正弦,就像由java.lang.Math.sin 计算的一样。 |
sinh(expr) | 返回expr 的双曲正弦,就像由java.lang.Math.sinh 计算的一样。 |
sqrt(expr) | 返回expr 的平方根。 |
tan(expr) | 返回expr 的正切,就像由java.lang.Math.tan 计算的一样。 |
tanh(expr) | 返回expr 的双曲正切,就像由java.lang.Math.tanh 计算的一样。 |
try_add(expr1, expr2) | 返回expr1 和expr2 的和,如果溢出则结果为null。可接受的输入类型与+ 运算符相同。 |
try_divide(dividend, divisor) | 返回dividend 除以divisor 的结果。它始终执行浮点除法。如果divisor 为0,则其结果始终为null。dividend 必须是数字或间隔值。divisor 必须是数字。 |
try_multiply(expr1, expr2) | 返回expr1 乘以expr2 的结果,如果溢出则结果为null。可接受的输入类型与* 运算符相同。 |
try_subtract(expr1, expr2) | 返回expr1 减去expr2 的结果,如果溢出则结果为null。可接受的输入类型与- 运算符相同。 |
unhex(expr) | 将十六进制的expr 转换为二进制。 |
width_bucket(value, min_value, max_value, num_bucket) | 返回在区间min_value 到max_value 上的等宽直方图中,value 被分配到的桶号。 |
示例
-- %
SELECT 2 % 1.8;
+---------+
|(2 % 1.8)|
+---------+
| 0.2|
+---------+
SELECT MOD(2, 1.8);
+-----------+
|mod(2, 1.8)|
+-----------+
| 0.2|
+-----------+
-- *
SELECT 2 * 3;
+-------+
|(2 * 3)|
+-------+
| 6|
+-------+
-- +
SELECT 1 + 2;
+-------+
|(1 + 2)|
+-------+
| 3|
+-------+
-- -
SELECT 2 - 1;
+-------+
|(2 - 1)|
+-------+
| 1|
+-------+
-- /
SELECT 3 / 2;
+-------+
|(3 / 2)|
+-------+
| 1.5|
+-------+
SELECT 2L / 2L;
+-------+
|(2 / 2)|
+-------+
| 1.0|
+-------+
-- abs
SELECT abs(-1);
+-------+
|abs(-1)|
+-------+
| 1|
+-------+
SELECT abs(INTERVAL -'1-1' YEAR TO MONTH);
+----------------------------------+
|abs(INTERVAL '-1-1' YEAR TO MONTH)|
+----------------------------------+
| INTERVAL '1-1' YE...|
+----------------------------------+
-- acos
SELECT acos(1);
+-------+
|ACOS(1)|
+-------+
| 0.0|
+-------+
SELECT acos(2);
+-------+
|ACOS(2)|
+-------+
| NaN|
+-------+
-- acosh
SELECT acosh(1);
+--------+
|ACOSH(1)|
+--------+
| 0.0|
+--------+
SELECT acosh(0);
+--------+
|ACOSH(0)|
+--------+
| NaN|
+--------+
-- asin
SELECT asin(0);
+-------+
|ASIN(0)|
+-------+
| 0.0|
+-------+
SELECT asin(2);
+-------+
|ASIN(2)|
+-------+
| NaN|
+-------+
-- asinh
SELECT asinh(0);
+--------+
|ASINH(0)|
+--------+
| 0.0|
+--------+
-- atan
SELECT atan(0);
+-------+
|ATAN(0)|
+-------+
| 0.0|
+-------+
-- atan2
SELECT atan2(0, 0);
+-----------+
|ATAN2(0, 0)|
+-----------+
| 0.0|
+-----------+
-- atanh
SELECT atanh(0);
+--------+
|ATANH(0)|
+--------+
| 0.0|
+--------+
SELECT atanh(2);
+--------+
|ATANH(2)|
+--------+
| NaN|
+--------+
-- bin
SELECT bin(13);
+-------+
|bin(13)|
+-------+
| 1101|
+-------+
SELECT bin(-13);
+--------------------+
| bin(-13)|
+--------------------+
|11111111111111111...|
+--------------------+
SELECT bin(13.3);
+---------+
|bin(13.3)|
+---------+
| 1101|
+---------+
-- bround
SELECT bround(2.5, 0);
+--------------+
|bround(2.5, 0)|
+--------------+
| 2|
+--------------+
SELECT bround(25, -1);
+--------------+
|bround(25, -1)|
+--------------+
| 20|
+--------------+
-- cbrt
SELECT cbrt(27.0);
+----------+
|CBRT(27.0)|
+----------+
| 3.0|
+----------+
-- ceil
SELECT ceil(-0.1);
+----------+
|CEIL(-0.1)|
+----------+
| 0|
+----------+
SELECT ceil(5);
+-------+
|CEIL(5)|
+-------+
| 5|
+-------+
SELECT ceil(3.1411, 3);
+---------------+
|ceil(3.1411, 3)|
+---------------+
| 3.142|
+---------------+
SELECT ceil(3.1411, -3);
+----------------+
|ceil(3.1411, -3)|
+----------------+
| 1000|
+----------------+
-- ceiling
SELECT ceiling(-0.1);
+-------------+
|ceiling(-0.1)|
+-------------+
| 0|
+-------------+
SELECT ceiling(5);
+----------+
|ceiling(5)|
+----------+
| 5|
+----------+
SELECT ceiling(3.1411, 3);
+------------------+
|ceiling(3.1411, 3)|
+------------------+
| 3.142|
+------------------+
SELECT ceiling(3.1411, -3);
+-------------------+
|ceiling(3.1411, -3)|
+-------------------+
| 1000|
+-------------------+
-- conv
SELECT conv('100', 2, 10);
+----------------+
|conv(100, 2, 10)|
+----------------+
| 4|
+----------------+
SELECT conv(-10, 16, -10);
+------------------+
|conv(-10, 16, -10)|
+------------------+
| -16|
+------------------+
-- cos
SELECT cos(0);
+------+
|COS(0)|
+------+
| 1.0|
+------+
-- cosh
SELECT cosh(0);
+-------+
|COSH(0)|
+-------+
| 1.0|
+-------+
-- cot
SELECT cot(1);
+------------------+
| COT(1)|
+------------------+
|0.6420926159343306|
+------------------+
-- csc
SELECT csc(1);
+------------------+
| CSC(1)|
+------------------+
|1.1883951057781212|
+------------------+
-- degrees
SELECT degrees(3.141592653589793);
+--------------------------+
|DEGREES(3.141592653589793)|
+--------------------------+
| 180.0|
+--------------------------+
-- div
SELECT 3 div 2;
+---------+
|(3 div 2)|
+---------+
| 1|
+---------+
SELECT INTERVAL '1-1' YEAR TO MONTH div INTERVAL '-1' MONTH;
+------------------------------------------------------+
|(INTERVAL '1-1' YEAR TO MONTH div INTERVAL '-1' MONTH)|
+------------------------------------------------------+
| -13|
+------------------------------------------------------+
-- e
SELECT e();
+-----------------+
| E()|
+-----------------+
|2.718281828459045|
+-----------------+
-- exp
SELECT exp(0);
+------+
|EXP(0)|
+------+
| 1.0|
+------+
-- expm1
SELECT expm1(0);
+--------+
|EXPM1(0)|
+--------+
| 0.0|
+--------+
-- factorial
SELECT factorial(5);
+------------+
|factorial(5)|
+------------+
| 120|
+------------+
-- floor
SELECT floor(-0.1);
+-----------+
|FLOOR(-0.1)|
+-----------+
| -1|
+-----------+
SELECT floor(5);
+--------+
|FLOOR(5)|
+--------+
| 5|
+--------+
SELECT floor(3.1411, 3);
+----------------+
|floor(3.1411, 3)|
+----------------+
| 3.141|
+----------------+
SELECT floor(3.1411, -3);
+-----------------+
|floor(3.1411, -3)|
+-----------------+
| 0|
+-----------------+
-- greatest
SELECT greatest(10, 9, 2, 4, 3);
+------------------------+
|greatest(10, 9, 2, 4, 3)|
+------------------------+
| 10|
+------------------------+
-- hex
SELECT hex(17);
+-------+
|hex(17)|
+-------+
| 11|
+-------+
SELECT hex('Spark SQL');
+------------------+
| hex(Spark SQL)|
+------------------+
|537061726B2053514C|
+------------------+
-- hypot
SELECT hypot(3, 4);
+-----------+
|HYPOT(3, 4)|
+-----------+
| 5.0|
+-----------+
-- least
SELECT least(10, 9, 2, 4, 3);
+---------------------+
|least(10, 9, 2, 4, 3)|
+---------------------+
| 2|
+---------------------+
-- ln
SELECT ln(1);
+-----+
|ln(1)|
+-----+
| 0.0|
+-----+
-- log
SELECT log(10, 100);
+------------+
|LOG(10, 100)|
+------------+
| 2.0|
+------------+
-- log10
SELECT log10(10);
+---------+
|LOG10(10)|
+---------+
| 1.0|
+---------+
-- log1p
SELECT log1p(0);
+--------+
|LOG1P(0)|
+--------+
| 0.0|
+--------+
-- log2
SELECT log2(2);
+-------+
|LOG2(2)|
+-------+
| 1.0|
+-------+
-- mod
SELECT 2 % 1.8;
+---------+
|(2 % 1.8)|
+---------+
| 0.2|
+---------+
SELECT MOD(2, 1.8);
+-----------+
|mod(2, 1.8)|
+-----------+
| 0.2|
+-----------+
-- negative
SELECT negative(1);
+-----------+
|negative(1)|
+-----------+
| -1|
+-----------+
-- pi
SELECT pi();
+-----------------+
| PI()|
+-----------------+
|3.141592653589793|
+-----------------+
-- pmod
SELECT pmod(10, 3);
+-----------+
|pmod(10, 3)|
+-----------+
| 1|
+-----------+
SELECT pmod(-10, 3);
+------------+
|pmod(-10, 3)|
+------------+
| 2|
+------------+
-- positive
SELECT positive(1);
+-----+
|(+ 1)|
+-----+
| 1|
+-----+
-- pow
SELECT pow(2, 3);
+---------+
|pow(2, 3)|
+---------+
| 8.0|
+---------+
-- power
SELECT power(2, 3);
+-----------+
|POWER(2, 3)|
+-----------+
| 8.0|
+-----------+
-- radians
SELECT radians(180);
+-----------------+
| RADIANS(180)|
+-----------------+
|3.141592653589793|
+-----------------+
-- rand
SELECT rand();
+------------------+
| rand()|
+------------------+
|0.6627134854017663|
+------------------+
SELECT rand(0);
+------------------+
| rand(0)|
+------------------+
|0.7604953758285915|
+------------------+
SELECT rand(null);
+------------------+
| rand(NULL)|
+------------------+
|0.7604953758285915|
+------------------+
-- randn
SELECT randn();
+-------------------+
| randn()|
+-------------------+
|-0.9888922220705292|
+-------------------+
SELECT randn(0);
+------------------+
| randn(0)|
+------------------+
|1.6034991609278433|
+------------------+
SELECT randn(null);
+------------------+
| randn(NULL)|
+------------------+
|1.6034991609278433|
+------------------+
-- random
SELECT random();
+-----------------+
| rand()|
+-----------------+
|0.714484218718629|
+-----------------+
SELECT random(0);
+------------------+
| rand(0)|
+------------------+
|0.7604953758285915|
+------------------+
SELECT random(null);
+------------------+
| rand(NULL)|
+------------------+
|0.7604953758285915|
+------------------+
-- rint
SELECT rint(12.3456);
+-------------+
|rint(12.3456)|
+-------------+
| 12.0|
+-------------+
-- round
SELECT round(2.5, 0);
+-------------+
|round(2.5, 0)|
+-------------+
| 3|
+-------------+
-- sec
SELECT sec(0);
+------+
|SEC(0)|
+------+
| 1.0|
+------+
-- shiftleft
SELECT shiftleft(2, 1);
+---------------+
|shiftleft(2, 1)|
+---------------+
| 4|
+---------------+
-- sign
SELECT sign(40);
+--------+
|sign(40)|
+--------+
| 1.0|
+--------+
SELECT sign(INTERVAL -'100' YEAR);
+--------------------------+
|sign(INTERVAL '-100' YEAR)|
+--------------------------+
| -1.0|
+--------------------------+
-- signum
SELECT signum(40);
+----------+
|SIGNUM(40)|
+----------+
| 1.0|
+----------+
SELECT signum(INTERVAL -'100' YEAR);
+----------------------------+
|SIGNUM(INTERVAL '-100' YEAR)|
+----------------------------+
| -1.0|
+----------------------------+
-- sin
SELECT sin(0);
+------+
|SIN(0)|
+------+
| 0.0|
+------+
-- sinh
SELECT sinh(0);
+-------+
|SINH(0)|
+-------+
| 0.0|
+-------+
-- sqrt
SELECT sqrt(4);
+-------+
|SQRT(4)|
+-------+
| 2.0|
+-------+
-- tan
SELECT tan(0);
+------+
|TAN(0)|
+------+
| 0.0|
+------+
-- tanh
SELECT tanh(0);
+-------+
|TANH(0)|
+-------+
| 0.0|
+-------+
-- try_add
SELECT try_add(1, 2);
+-------------+
|try_add(1, 2)|
+-------------+
| 3|
+-------------+
SELECT try_add(2147483647, 1);
+----------------------+
|try_add(2147483647, 1)|
+----------------------+
| NULL|
+----------------------+
SELECT try_add(date'2021-01-01', 1);
+-----------------------------+
|try_add(DATE '2021-01-01', 1)|
+-----------------------------+
| 2021-01-02|
+-----------------------------+
SELECT try_add(date'2021-01-01', interval 1 year);
+---------------------------------------------+
|try_add(DATE '2021-01-01', INTERVAL '1' YEAR)|
+---------------------------------------------+
| 2022-01-01|
+---------------------------------------------+
SELECT try_add(timestamp'2021-01-01 00:00:00', interval 1 day);
+----------------------------------------------------------+
|try_add(TIMESTAMP '2021-01-01 00:00:00', INTERVAL '1' DAY)|
+----------------------------------------------------------+
| 2021-01-02 00:00:00|
+----------------------------------------------------------+
SELECT try_add(interval 1 year, interval 2 year);
+---------------------------------------------+
|try_add(INTERVAL '1' YEAR, INTERVAL '2' YEAR)|
+---------------------------------------------+
| INTERVAL '3' YEAR|
+---------------------------------------------+
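try_add 与普通的+ 的差别主要体现在溢出处理上:整型溢出时try_add 返回NULL,而+ 在非ANSI模式下会回绕、在ANSI模式下会抛出异常。下面是一个仅作示意的Scala草图:
// 仅作示意:对比 + 与 try_add 在INT溢出时的行为
spark.conf.set("spark.sql.ansi.enabled", "false")
spark.sql("SELECT 2147483647 + 1 AS plain, try_add(2147483647, 1) AS safe").show()
// 非ANSI模式下,plain 回绕为 -2147483648,而 safe 为 NULL
// 若将spark.sql.ansi.enabled 设为true,则 plain 会抛出溢出异常,try_add 仍返回NULL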
-- try_divide
SELECT try_divide(3, 2);
+----------------+
|try_divide(3, 2)|
+----------------+
| 1.5|
+----------------+
SELECT try_divide(2L, 2L);
+----------------+
|try_divide(2, 2)|
+----------------+
| 1.0|
+----------------+
SELECT try_divide(1, 0);
+----------------+
|try_divide(1, 0)|
+----------------+
| NULL|
+----------------+
SELECT try_divide(interval 2 month, 2);
+---------------------------------+
|try_divide(INTERVAL '2' MONTH, 2)|
+---------------------------------+
| INTERVAL '0-1' YE...|
+---------------------------------+
SELECT try_divide(interval 2 month, 0);
+---------------------------------+
|try_divide(INTERVAL '2' MONTH, 0)|
+---------------------------------+
| NULL|
+---------------------------------+
-- try_multiply
SELECT try_multiply(2, 3);
+------------------+
|try_multiply(2, 3)|
+------------------+
| 6|
+------------------+
SELECT try_multiply(-2147483648, 10);
+-----------------------------+
|try_multiply(-2147483648, 10)|
+-----------------------------+
| NULL|
+-----------------------------+
SELECT try_multiply(interval 2 year, 3);
+----------------------------------+
|try_multiply(INTERVAL '2' YEAR, 3)|
+----------------------------------+
| INTERVAL '6-0' YE...|
+----------------------------------+
-- try_subtract
SELECT try_subtract(2, 1);
+------------------+
|try_subtract(2, 1)|
+------------------+
| 1|
+------------------+
SELECT try_subtract(-2147483648, 1);
+----------------------------+
|try_subtract(-2147483648, 1)|
+----------------------------+
| NULL|
+----------------------------+
SELECT try_subtract(date'2021-01-02', 1);
+----------------------------------+
|try_subtract(DATE '2021-01-02', 1)|
+----------------------------------+
| 2021-01-01|
+----------------------------------+
SELECT try_subtract(date'2021-01-01', interval 1 year);
+--------------------------------------------------+
|try_subtract(DATE '2021-01-01', INTERVAL '1' YEAR)|
+--------------------------------------------------+
| 2020-01-01|
+--------------------------------------------------+
SELECT try_subtract(timestamp'2021-01-02 00:00:00', interval 1 day);
+---------------------------------------------------------------+
|try_subtract(TIMESTAMP '2021-01-02 00:00:00', INTERVAL '1' DAY)|
+---------------------------------------------------------------+
| 2021-01-01 00:00:00|
+---------------------------------------------------------------+
SELECT try_subtract(interval 2 year, interval 1 year);
+--------------------------------------------------+
|try_subtract(INTERVAL '2' YEAR, INTERVAL '1' YEAR)|
+--------------------------------------------------+
| INTERVAL '1' YEAR|
+--------------------------------------------------+
-- unhex
SELECT decode(unhex('537061726B2053514C'), 'UTF-8');
+----------------------------------------+
|decode(unhex(537061726B2053514C), UTF-8)|
+----------------------------------------+
| Spark SQL|
+----------------------------------------+
-- width_bucket
SELECT width_bucket(5.3, 0.2, 10.6, 5);
+-------------------------------+
|width_bucket(5.3, 0.2, 10.6, 5)|
+-------------------------------+
| 3|
+-------------------------------+
SELECT width_bucket(-2.1, 1.3, 3.4, 3);
+-------------------------------+
|width_bucket(-2.1, 1.3, 3.4, 3)|
+-------------------------------+
| 0|
+-------------------------------+
SELECT width_bucket(8.1, 0.0, 5.7, 4);
+------------------------------+
|width_bucket(8.1, 0.0, 5.7, 4)|
+------------------------------+
| 5|
+------------------------------+
SELECT width_bucket(-0.9, 5.2, 0.5, 2);
+-------------------------------+
|width_bucket(-0.9, 5.2, 0.5, 2)|
+-------------------------------+
| 3|
+-------------------------------+
SELECT width_bucket(INTERVAL '0' YEAR, INTERVAL '0' YEAR, INTERVAL '10' YEAR, 10);
+--------------------------------------------------------------------------+
|width_bucket(INTERVAL '0' YEAR, INTERVAL '0' YEAR, INTERVAL '10' YEAR, 10)|
+--------------------------------------------------------------------------+
| 1|
+--------------------------------------------------------------------------+
SELECT width_bucket(INTERVAL '1' YEAR, INTERVAL '0' YEAR, INTERVAL '10' YEAR, 10);
+--------------------------------------------------------------------------+
|width_bucket(INTERVAL '1' YEAR, INTERVAL '0' YEAR, INTERVAL '10' YEAR, 10)|
+--------------------------------------------------------------------------+
| 2|
+--------------------------------------------------------------------------+
SELECT width_bucket(INTERVAL '0' DAY, INTERVAL '0' DAY, INTERVAL '10' DAY, 10);
+-----------------------------------------------------------------------+
|width_bucket(INTERVAL '0' DAY, INTERVAL '0' DAY, INTERVAL '10' DAY, 10)|
+-----------------------------------------------------------------------+
| 1|
+-----------------------------------------------------------------------+
SELECT width_bucket(INTERVAL '1' DAY, INTERVAL '0' DAY, INTERVAL '10' DAY, 10);
+-----------------------------------------------------------------------+
|width_bucket(INTERVAL '1' DAY, INTERVAL '0' DAY, INTERVAL '10' DAY, 10)|
+-----------------------------------------------------------------------+
| 2|
+-----------------------------------------------------------------------+
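width_bucket 常用来做等宽直方图的分桶统计。下面是一个仅作示意的Scala草图(列名v 为本例自拟),先为每行计算桶号,再按桶聚合:
import spark.implicits._  // 假设spark 已按文首方式创建
import org.apache.spark.sql.functions.expr
// 一组假设的数值样本
val samples = Seq(0.5, 1.2, 3.3, 4.8, 7.9, 9.6).toDF("v")
// 将[0, 10)等分为5个桶;落在区间之外的值会被分到0号或num_bucket+1号桶
samples.select($"v", expr("width_bucket(v, 0, 10, 5)").as("bucket"))
  .groupBy("bucket").count()
  .orderBy("bucket")
  .show()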
字符串函数
函数名 | 描述 |
---|---|
ascii(str) | 返回str 的第一个字符的数值。 |
base64(bin) | 将二进制参数bin 转换为Base64字符串。 |
bit_length(expr) | 返回字符串数据的位长度或二进制数据的位数。 |
btrim(str) | 从str 中删除前导和尾随空格字符。 |
btrim(str, trimStr) | 从str 中删除前导和尾随trimStr 字符。 |
char(expr) | 返回与expr 的二进制值等价的ASCII字符。如果expr 大于256,则结果等同于chr(expr % 256)。 |
char_length(expr) | 返回字符串数据的字符长度或二进制数据的字节数。字符串数据的长度包括尾随空格。二进制数据的长度包括二进制零。 |
character_length(expr) | 返回字符串数据的字符长度或二进制数据的字节数。字符串数据的长度包括尾随空格。二进制数据的长度包括二进制零。 |
chr(expr) | 返回与expr 的二进制值等价的ASCII字符。如果expr 大于256,则结果等同于chr(expr % 256)。 |
concat_ws(sep[, str | array(str)]+) | 返回以sep 作为分隔符连接各输入字符串(或字符串数组元素)的结果,跳过NULL值。 |
contains(left, right) | 返回布尔值。如果right 在left 中找到,则值为True。如果任一输入表达式为NULL,则返回NULL。否则,返回False。left 和right 必须为STRING或BINARY类型。 |
decode(bin, charset) | 使用第二个参数字符集解码第一个参数。 |
decode(expr, search, result [, search, result ] … [, default]) | 按顺序将expr 与每个搜索值进行比较。如果expr 等于搜索值,则decode返回相应的结果。如果没有找到匹配项,则返回default。如果省略了default,则返回null。 |
elt(n, input1, input2, …) | 返回第n 个输入值,例如当n 为2时返回input2 。如果索引超过数组长度且spark.sql.ansi.enabled 设置为false,则函数返回NULL。如果spark.sql.ansi.enabled 设置为true,则对于无效的索引会抛出ArrayIndexOutOfBoundsException异常。 |
encode(str, charset) | 使用第二个参数字符集对第一个参数进行编码。 |
endswith(left, right) | 返回布尔值。如果left 以right 结尾,则值为True。如果任一输入表达式为NULL,则返回NULL。否则,返回False。left 和right 必须为STRING或BINARY类型。 |
find_in_set(str, str_array) | 返回给定字符串(str )在逗号分隔列表(str_array )中的索引(从1开始)。如果未找到该字符串或者给定字符串(str )包含逗号,则返回0。 |
format_number(expr1, expr2) | 格式化数字expr1 ,格式为“#,###,###.##”,四舍五入到expr2 位小数。如果expr2 为0,则结果没有小数点或小数部分。expr2 也接受用户指定的格式。这相当于MySQL的FORMAT函数。 |
format_string(strfmt, obj, …) | 使用printf样式的格式字符串返回一个格式化的字符串。 |
initcap(str) | 返回str ,其中每个单词的首字母大写,其余字母均为小写;单词以空格分隔。 |
instr(str, substr) | 返回str 中第一个出现substr 的位置(从1开始计数)。 |
lcase(str) | 返回将所有字符更改为小写的str 。 |
left(str, len) | 返回字符串str 的左侧len (len 可以是字符串类型)个字符,如果len 小于等于0,则结果为空字符串。 |
len(expr) | 返回字符串数据的字符长度或二进制数据的字节数。字符串数据的长度包括尾随空格。二进制数据的长度包括二进制零。 |
length(expr) | 返回字符串数据的字符长度或二进制数据的字节数。字符串数据的长度包括尾随空格。二进制数据的长度包括二进制零。 |
levenshtein(str1, str2[, threshold]) | 返回两个给定字符串之间的Levenshtein距离。如果threshold设置且距离大于它,则返回-1。 |
locate(substr, str[, pos]) | 返回str 中从位置pos 之后第一个出现substr 的位置(从1开始计数)。给定的pos 和返回值都是基于1的。 |
lower(str) | 返回将所有字符更改为小写的str 。 |
lpad(str, len[, pad]) | 将str 左侧填充到长度len (如果str 超过len ,则返回值将缩短到len 个字符或字节)。如果未指定pad ,则如果str 是字符字符串,则在左侧使用空格字符进行填充;如果str 是字节序列,则使用零进行填充。 |
ltrim(str) | 从str 中删除前导空格字符。 |
luhn_check(str) | 检查数字字符串是否符合Luhn算法的校验。该校验函数广泛应用于信用卡号码和政府身份证件号码上,以区分有效号码和输入错误的号码。 |
mask(input[, upperChar, lowerChar, digitChar, otherChar]) | 掩盖给定的字符串值。该函数用’X’或’x’替换字符,并用’n’替换数字。这对于创建不包含敏感信息的表副本非常有用。 |
octet_length(expr) | 返回字符串数据的字节长度或二进制数据的字节数。 |
overlay(input, replace, pos[, len]) | 用replace 替换input 中从pos 开始、长度为len 的部分。 |
position(substr, str[, pos]) | 返回str 中从位置pos 之后第一个出现substr 的位置(从1开始计数)。给定的pos 和返回值都是基于1的。 |
printf(strfmt, obj, …) | 使用printf样式的格式字符串返回一个格式化的字符串。 |
regexp_count(str, regexp) | 返回正则表达式模式regexp 在字符串str 中匹配的次数。 |
regexp_extract(str, regexp[, idx]) | 提取与正则表达式regexp 匹配的字符串str 中的第一个字符串,并对应于正则表达式组索引。 |
regexp_extract_all(str, regexp[, idx]) | 提取与正则表达式regexp 匹配的字符串str 中的所有字符串,并对应于正则表达式组索引。 |
regexp_instr(str, regexp) | 在字符串中搜索正则表达式,并返回指示匹配子字符串开始位置的整数。位置从1开始,而不是从0开始。如果找不到匹配项,则返回0。 |
regexp_replace(str, regexp, rep[, position]) | 用rep 替换str 中与正则表达式regexp 匹配的所有子字符串。 |
regexp_substr(str, regexp) | 返回在字符串str 中与正则表达式regexp 匹配的子字符串。如果未找到正则表达式,则结果为null。 |
repeat(str, n) | 返回重复给定字符串值n次的字符串。 |
replace(str, search[, replace]) | 将str 中所有出现的search 替换为replace 。 |
right(str, len) | 返回字符串str 的右侧len (len 可以是字符串类型)个字符,如果len 小于等于0,则结果为空字符串。 |
rpad(str, len[, pad]) | 将str 右侧填充到长度len (如果str 超过len ,则返回值将缩短到len 个字符)。如果未指定pad ,则如果str 是字符字符串,则在右侧使用空格字符进行填充;如果str 是字节序列,则使用零进行填充。 |
rtrim(str) | 从str 中删除尾随空格字符。 |
sentences(str[, lang, country]) | 将str 拆分为一个单词数组的数组。 |
soundex(str) | 返回字符串的Soundex代码。 |
space(n) | 返回由n 个空格组成的字符串。 |
split(str, regex, limit) | 使用匹配regex 的位置拆分str 并返回最多包含limit 个元素的数组。 |
split_part(str, delimiter, partNum) | 使用分隔符拆分str 并返回所请求的拆分部分(基于1的索引)。如果任一输入为null,则返回null。如果partNum 超出了拆分部分的范围,则返回空字符串。如果partNum 为0,则抛出错误。如果partNum 为负数,则从字符串末尾开始计算部分。如果delimiter 为空字符串,则不拆分str 。 |
startswith(left, right) | 返回布尔值。如果left 以right 开头,则值为True。如果任一输入表达式为NULL,则返回NULL。否则,返回False。left 和right 必须为STRING或BINARY类型。 |
substr(str, pos[, len]) | 返回从str 的位置pos 开始长度为len 的子字符串,或者返回从pos 开始长度为len 的字节数组切片。 |
substring(str, pos[, len]) | 返回从str 的位置pos 开始长度为len 的子字符串,或者返回从pos 开始长度为len 的字节数组切片。 |
substring_index(str, delim, count) | 返回在str 中,在第count 个delim 分隔符之前的子字符串。如果count 为正数,则返回最后一个分隔符(从左边计数)左侧的所有内容。如果count 为负数,则返回最后一个分隔符(从右边计数)右侧的所有内容。函数substring_index在搜索delim 时执行区分大小写的匹配。 |
to_binary(str[, fmt]) | 根据提供的fmt 将输入str 转换为二进制值。fmt 可以是大小写不敏感的字符串字面量,包括"hex"、"utf-8"、"utf8"或"base64"。如果省略了fmt ,则默认按"hex"格式进行转换。如果至少一个输入参数为NULL,则函数返回NULL。 |
to_char(numberExpr, formatExpr) | 根据formatExpr 将numberExpr 转换为字符串。如果转换失败,则抛出异常。格式可以包含以下字符(不区分大小写):'0'或'9':指定0到9之间的期望数字。格式字符串中的0或9序列与输入值中的数字序列相匹配,生成与格式字符串中对应序列长度相同的结果字符串。如果0/9序列比十进制值的匹配部分包含更多位数、以0开头并且位于小数点之前,则结果字符串左侧填充零,否则填充空格。'.'或'D':指定小数点的位置(可选,只允许出现一次)。','或'G':指定分组(千位)分隔符(,)的位置,每个分组分隔符的左右两侧必须有0或9。'$':指定货币符号的位置,该字符只能指定一次。'S'或'MI':指定'-'或'+'符号的位置(可选,只允许在格式字符串的开头或末尾出现一次)。注意,'S'对正值打印'+',而'MI'打印空格。'PR':只允许出现在格式字符串的末尾;表示当输入值为负数时,结果字符串用尖括号括起来(如'<1>')。 |
to_number(expr, fmt) | 根据字符串格式fmt 将字符串expr 转换为数字。如果转换失败,则抛出异常。格式可以包含以下字符(不区分大小写):'0'或'9':指定0到9之间的期望数字。格式字符串中的0或9序列与输入字符串中的数字序列匹配。如果0/9序列以0开头并且位于小数点之前,则它只能匹配位数相同的数字序列;否则,如果序列以9开头或位于小数点之后,则它可以匹配位数相同或更少的数字序列。'.'或'D':指定小数点的位置(可选,只允许出现一次)。','或'G':指定分组(千位)分隔符(,)的位置,每个分组分隔符的左右两侧必须有0或9,且expr 中必须包含与数字大小相符的分组分隔符。'$':指定货币符号的位置,该字符只能指定一次。'S'或'MI':指定'-'或'+'符号的位置(可选,只允许在格式字符串的开头或末尾出现一次)。注意,'S'允许'-',但'MI'不允许。'PR':只允许出现在格式字符串的末尾;表示expr 是用尖括号括起来的负数(如'<1>')。 |
to_varchar(numberExpr, formatExpr) | 根据formatExpr 将numberExpr 转换为字符串。如果转换失败,则抛出异常。格式可以包含以下字符(不区分大小写):'0'或'9':指定0到9之间的期望数字。格式字符串中的0或9序列与输入值中的数字序列相匹配,生成与格式字符串中对应序列长度相同的结果字符串。如果0/9序列比十进制值的匹配部分包含更多位数、以0开头并且位于小数点之前,则结果字符串左侧填充零,否则填充空格。'.'或'D':指定小数点的位置(可选,只允许出现一次)。','或'G':指定分组(千位)分隔符(,)的位置,每个分组分隔符的左右两侧必须有0或9。'$':指定货币符号的位置,该字符只能指定一次。'S'或'MI':指定'-'或'+'符号的位置(可选,只允许在格式字符串的开头或末尾出现一次)。注意,'S'对正值打印'+',而'MI'打印空格。'PR':只允许出现在格式字符串的末尾;表示当输入值为负数时,结果字符串用尖括号括起来(如'<1>')。 |
translate(input, from, to) | 通过使用to 字符串中的字符替换from 字符串中的字符来转换input 字符串。 |
trim(str) | 从str 中删除前导和尾随空格字符。 |
trim(BOTH FROM str) | 从str 中删除前导和尾随空格字符。 |
trim(LEADING FROM str) | 从str 中删除前导空格字符。 |
trim(TRAILING FROM str) | 从str 中删除尾随空格字符。 |
trim(trimStr FROM str) | 从str 中删除前导和尾随trimStr 字符。 |
trim(BOTH trimStr FROM str) | 从str 中删除前导和尾随trimStr 字符。 |
trim(LEADING trimStr FROM str) | 从str 中删除前导trimStr 字符。 |
trim(TRAILING trimStr FROM str) | 从str 中删除尾随trimStr 字符。 |
try_to_binary(str[, fmt]) | 这是to_binary 的特殊版本,执行相同的操作,但如果无法进行转换,则返回NULL值,而不是引发错误。 |
try_to_number(expr, fmt) | 根据字符串格式fmt 将字符串’expr’转换为数字。如果字符串’expr’与预期的格式不匹配,则返回NULL。格式遵循to_number函数的语义。 |
ucase(str) | 返回将所有字符更改为大写的str 。 |
unbase64(str) | 将Base64字符串str 转换为二进制。 |
upper(str) | 返回将所有字符更改为大写的str 。 |
示例
-- ascii
SELECT ascii('222');
+----------+
|ascii(222)|
+----------+
| 50|
+----------+
SELECT ascii(2);
+--------+
|ascii(2)|
+--------+
| 50|
+--------+
-- base64
SELECT base64('Spark SQL');
+-----------------+
|base64(Spark SQL)|
+-----------------+
| U3BhcmsgU1FM|
+-----------------+
SELECT base64(x'537061726b2053514c');
+-----------------------------+
|base64(X'537061726B2053514C')|
+-----------------------------+
| U3BhcmsgU1FM|
+-----------------------------+
-- bit_length
SELECT bit_length('Spark SQL');
+---------------------+
|bit_length(Spark SQL)|
+---------------------+
| 72|
+---------------------+
SELECT bit_length(x'537061726b2053514c');
+---------------------------------+
|bit_length(X'537061726B2053514C')|
+---------------------------------+
| 72|
+---------------------------------+
-- btrim
SELECT btrim(' SparkSQL ');
+----------------------+
|btrim( SparkSQL )|
+----------------------+
| SparkSQL|
+----------------------+
SELECT btrim(encode(' SparkSQL ', 'utf-8'));
+-------------------------------------+
|btrim(encode( SparkSQL , utf-8))|
+-------------------------------------+
| SparkSQL|
+-------------------------------------+
SELECT btrim('SSparkSQLS', 'SL');
+---------------------+
|btrim(SSparkSQLS, SL)|
+---------------------+
| parkSQ|
+---------------------+
SELECT btrim(encode('SSparkSQLS', 'utf-8'), encode('SL', 'utf-8'));
+---------------------------------------------------+
|btrim(encode(SSparkSQLS, utf-8), encode(SL, utf-8))|
+---------------------------------------------------+
| parkSQ|
+---------------------------------------------------+
-- char
SELECT char(65);
+--------+
|char(65)|
+--------+
| A|
+--------+
-- char_length
SELECT char_length('Spark SQL ');
+-----------------------+
|char_length(Spark SQL )|
+-----------------------+
| 10|
+-----------------------+
SELECT char_length(x'537061726b2053514c');
+----------------------------------+
|char_length(X'537061726B2053514C')|
+----------------------------------+
| 9|
+----------------------------------+
SELECT CHAR_LENGTH('Spark SQL ');
+-----------------------+
|char_length(Spark SQL )|
+-----------------------+
| 10|
+-----------------------+
SELECT CHARACTER_LENGTH('Spark SQL ');
+----------------------------+
|character_length(Spark SQL )|
+----------------------------+
| 10|
+----------------------------+
-- character_length
SELECT character_length('Spark SQL ');
+----------------------------+
|character_length(Spark SQL )|
+----------------------------+
| 10|
+----------------------------+
SELECT character_length(x'537061726b2053514c');
+---------------------------------------+
|character_length(X'537061726B2053514C')|
+---------------------------------------+
| 9|
+---------------------------------------+
SELECT CHAR_LENGTH('Spark SQL ');
+-----------------------+
|char_length(Spark SQL )|
+-----------------------+
| 10|
+-----------------------+
SELECT CHARACTER_LENGTH('Spark SQL ');
+----------------------------+
|character_length(Spark SQL )|
+----------------------------+
| 10|
+----------------------------+
-- chr
SELECT chr(65);
+-------+
|chr(65)|
+-------+
| A|
+-------+
-- concat_ws
SELECT concat_ws(' ', 'Spark', 'SQL');
+------------------------+
|concat_ws( , Spark, SQL)|
+------------------------+
| Spark SQL|
+------------------------+
SELECT concat_ws('s');
+------------+
|concat_ws(s)|
+------------+
| |
+------------+
SELECT concat_ws('/', 'foo', null, 'bar');
+----------------------------+
|concat_ws(/, foo, NULL, bar)|
+----------------------------+
| foo/bar|
+----------------------------+
SELECT concat_ws(null, 'Spark', 'SQL');
+---------------------------+
|concat_ws(NULL, Spark, SQL)|
+---------------------------+
| NULL|
+---------------------------+
-- contains
SELECT contains('Spark SQL', 'Spark');
+--------------------------+
|contains(Spark SQL, Spark)|
+--------------------------+
| true|
+--------------------------+
SELECT contains('Spark SQL', 'SPARK');
+--------------------------+
|contains(Spark SQL, SPARK)|
+--------------------------+
| false|
+--------------------------+
SELECT contains('Spark SQL', null);
+-------------------------+
|contains(Spark SQL, NULL)|
+-------------------------+
| NULL|
+-------------------------+
SELECT contains(x'537061726b2053514c', x'537061726b');
+----------------------------------------------+
|contains(X'537061726B2053514C', X'537061726B')|
+----------------------------------------------+
| true|
+----------------------------------------------+
-- decode
SELECT decode(encode('abc', 'utf-8'), 'utf-8');
+---------------------------------+
|decode(encode(abc, utf-8), utf-8)|
+---------------------------------+
| abc|
+---------------------------------+
SELECT decode(2, 1, 'Southlake', 2, 'San Francisco', 3, 'New Jersey', 4, 'Seattle', 'Non domestic');
+----------------------------------------------------------------------------------+
|decode(2, 1, Southlake, 2, San Francisco, 3, New Jersey, 4, Seattle, Non domestic)|
+----------------------------------------------------------------------------------+
| San Francisco|
+----------------------------------------------------------------------------------+
SELECT decode(6, 1, 'Southlake', 2, 'San Francisco', 3, 'New Jersey', 4, 'Seattle', 'Non domestic');
+----------------------------------------------------------------------------------+
|decode(6, 1, Southlake, 2, San Francisco, 3, New Jersey, 4, Seattle, Non domestic)|
+----------------------------------------------------------------------------------+
| Non domestic|
+----------------------------------------------------------------------------------+
SELECT decode(6, 1, 'Southlake', 2, 'San Francisco', 3, 'New Jersey', 4, 'Seattle');
+--------------------------------------------------------------------+
|decode(6, 1, Southlake, 2, San Francisco, 3, New Jersey, 4, Seattle)|
+--------------------------------------------------------------------+
| NULL|
+--------------------------------------------------------------------+
SELECT decode(null, 6, 'Spark', NULL, 'SQL', 4, 'rocks');
+-------------------------------------------+
|decode(NULL, 6, Spark, NULL, SQL, 4, rocks)|
+-------------------------------------------+
| SQL|
+-------------------------------------------+
-- elt
SELECT elt(1, 'scala', 'java');
+-------------------+
|elt(1, scala, java)|
+-------------------+
| scala|
+-------------------+
SELECT elt(2, 'a', 1);
+------------+
|elt(2, a, 1)|
+------------+
| 1|
+------------+
-- encode
SELECT encode('abc', 'utf-8');
+------------------+
|encode(abc, utf-8)|
+------------------+
| [61 62 63]|
+------------------+
-- endswith
SELECT endswith('Spark SQL', 'SQL');
+------------------------+
|endswith(Spark SQL, SQL)|
+------------------------+
| true|
+------------------------+
SELECT endswith('Spark SQL', 'Spark');
+--------------------------+
|endswith(Spark SQL, Spark)|
+--------------------------+
| false|
+--------------------------+
SELECT endswith('Spark SQL', null);
+-------------------------+
|endswith(Spark SQL, NULL)|
+-------------------------+
| NULL|
+-------------------------+
SELECT endswith(x'537061726b2053514c', x'537061726b');
+----------------------------------------------+
|endswith(X'537061726B2053514C', X'537061726B')|
+----------------------------------------------+
| false|
+----------------------------------------------+
SELECT endswith(x'537061726b2053514c', x'53514c');
+------------------------------------------+
|endswith(X'537061726B2053514C', X'53514C')|
+------------------------------------------+
| true|
+------------------------------------------+
-- find_in_set
SELECT find_in_set('ab','abc,b,ab,c,def');
+-------------------------------+
|find_in_set(ab, abc,b,ab,c,def)|
+-------------------------------+
| 3|
+-------------------------------+
-- format_number
SELECT format_number(12332.123456, 4);
+------------------------------+
|format_number(12332.123456, 4)|
+------------------------------+
| 12,332.1235|
+------------------------------+
SELECT format_number(12332.123456, '##################.###');
+---------------------------------------------------+
|format_number(12332.123456, ##################.###)|
+---------------------------------------------------+
| 12332.123|
+---------------------------------------------------+
-- format_string
SELECT format_string("Hello World %d %s", 100, "days");
+-------------------------------------------+
|format_string(Hello World %d %s, 100, days)|
+-------------------------------------------+
| Hello World 100 days|
+-------------------------------------------+
-- initcap
SELECT initcap('sPark sql');
+------------------+
|initcap(sPark sql)|
+------------------+
| Spark Sql|
+------------------+
-- instr
SELECT instr('SparkSQL', 'SQL');
+--------------------+
|instr(SparkSQL, SQL)|
+--------------------+
| 6|
+--------------------+
-- lcase
SELECT lcase('SparkSql');
+---------------+
|lcase(SparkSql)|
+---------------+
| sparksql|
+---------------+
-- left
SELECT left('Spark SQL', 3);
+------------------+
|left(Spark SQL, 3)|
+------------------+
| Spa|
+------------------+
SELECT left(encode('Spark SQL', 'utf-8'), 3);
+---------------------------------+
|left(encode(Spark SQL, utf-8), 3)|
+---------------------------------+
| [53 70 61]|
+---------------------------------+
-- len
SELECT len('Spark SQL ');
+---------------+
|len(Spark SQL )|
+---------------+
| 10|
+---------------+
SELECT len(x'537061726b2053514c');
+--------------------------+
|len(X'537061726B2053514C')|
+--------------------------+
| 9|
+--------------------------+
SELECT CHAR_LENGTH('Spark SQL ');
+-----------------------+
|char_length(Spark SQL )|
+-----------------------+
| 10|
+-----------------------+
SELECT CHARACTER_LENGTH('Spark SQL ');
+----------------------------+
|character_length(Spark SQL )|
+----------------------------+
| 10|
+----------------------------+
-- length
SELECT length('Spark SQL ');
+------------------+
|length(Spark SQL )|
+------------------+
| 10|
+------------------+
SELECT length(x'537061726b2053514c');
+-----------------------------+
|length(X'537061726B2053514C')|
+-----------------------------+
| 9|
+-----------------------------+
SELECT CHAR_LENGTH('Spark SQL ');
+-----------------------+
|char_length(Spark SQL )|
+-----------------------+
| 10|
+-----------------------+
SELECT CHARACTER_LENGTH('Spark SQL ');
+----------------------------+
|character_length(Spark SQL )|
+----------------------------+
| 10|
+----------------------------+
-- levenshtein
SELECT levenshtein('kitten', 'sitting');
+----------------------------+
|levenshtein(kitten, sitting)|
+----------------------------+
| 3|
+----------------------------+
SELECT levenshtein('kitten', 'sitting', 2);
+-------------------------------+
|levenshtein(kitten, sitting, 2)|
+-------------------------------+
| -1|
+-------------------------------+
-- locate
SELECT locate('bar', 'foobarbar');
+-------------------------+
|locate(bar, foobarbar, 1)|
+-------------------------+
| 4|
+-------------------------+
SELECT locate('bar', 'foobarbar', 5);
+-------------------------+
|locate(bar, foobarbar, 5)|
+-------------------------+
| 7|
+-------------------------+
SELECT POSITION('bar' IN 'foobarbar');
+-------------------------+
|locate(bar, foobarbar, 1)|
+-------------------------+
| 4|
+-------------------------+
-- lower
SELECT lower('SparkSql');
+---------------+
|lower(SparkSql)|
+---------------+
| sparksql|
+---------------+
-- lpad
SELECT lpad('hi', 5, '??');
+---------------+
|lpad(hi, 5, ??)|
+---------------+
| ???hi|
+---------------+
SELECT lpad('hi', 1, '??');
+---------------+
|lpad(hi, 1, ??)|
+---------------+
| h|
+---------------+
SELECT lpad('hi', 5);
+--------------+
|lpad(hi, 5, )|
+--------------+
| hi|
+--------------+
SELECT hex(lpad(unhex('aabb'), 5));
+--------------------------------+
|hex(lpad(unhex(aabb), 5, X'00'))|
+--------------------------------+
| 000000AABB|
+--------------------------------+
SELECT hex(lpad(unhex('aabb'), 5, unhex('1122')));
+--------------------------------------+
|hex(lpad(unhex(aabb), 5, unhex(1122)))|
+--------------------------------------+
| 112211AABB|
+--------------------------------------+
-- ltrim
SELECT ltrim(' SparkSQL ');
+----------------------+
|ltrim( SparkSQL )|
+----------------------+
| SparkSQL |
+----------------------+
-- luhn_check
SELECT luhn_check('8112189876');
+----------------------+
|luhn_check(8112189876)|
+----------------------+
| true|
+----------------------+
SELECT luhn_check('79927398713');
+-----------------------+
|luhn_check(79927398713)|
+-----------------------+
| true|
+-----------------------+
SELECT luhn_check('79927398714');
+-----------------------+
|luhn_check(79927398714)|
+-----------------------+
| false|
+-----------------------+
-- mask
SELECT mask('abcd-EFGH-8765-4321');
+----------------------------------------+
|mask(abcd-EFGH-8765-4321, X, x, n, NULL)|
+----------------------------------------+
| xxxx-XXXX-nnnn-nnnn|
+----------------------------------------+
SELECT mask('abcd-EFGH-8765-4321', 'Q');
+----------------------------------------+
|mask(abcd-EFGH-8765-4321, Q, x, n, NULL)|
+----------------------------------------+
| xxxx-QQQQ-nnnn-nnnn|
+----------------------------------------+
SELECT mask('AbCD123-@$#', 'Q', 'q');
+--------------------------------+
|mask(AbCD123-@$#, Q, q, n, NULL)|
+--------------------------------+
| QqQQnnn-@$#|
+--------------------------------+
SELECT mask('AbCD123-@$#');
+--------------------------------+
|mask(AbCD123-@$#, X, x, n, NULL)|
+--------------------------------+
| XxXXnnn-@$#|
+--------------------------------+
SELECT mask('AbCD123-@$#', 'Q');
+--------------------------------+
|mask(AbCD123-@$#, Q, x, n, NULL)|
+--------------------------------+
| QxQQnnn-@$#|
+--------------------------------+
SELECT mask('AbCD123-@$#', 'Q', 'q');
+--------------------------------+
|mask(AbCD123-@$#, Q, q, n, NULL)|
+--------------------------------+
| QqQQnnn-@$#|
+--------------------------------+
SELECT mask('AbCD123-@$#', 'Q', 'q', 'd');
+--------------------------------+
|mask(AbCD123-@$#, Q, q, d, NULL)|
+--------------------------------+
| QqQQddd-@$#|
+--------------------------------+
SELECT mask('AbCD123-@$#', 'Q', 'q', 'd', 'o');
+-----------------------------+
|mask(AbCD123-@$#, Q, q, d, o)|
+-----------------------------+
| QqQQdddoooo|
+-----------------------------+
SELECT mask('AbCD123-@$#', NULL, 'q', 'd', 'o');
+--------------------------------+
|mask(AbCD123-@$#, NULL, q, d, o)|
+--------------------------------+
| AqCDdddoooo|
+--------------------------------+
SELECT mask('AbCD123-@$#', NULL, NULL, 'd', 'o');
+-----------------------------------+
|mask(AbCD123-@$#, NULL, NULL, d, o)|
+-----------------------------------+
| AbCDdddoooo|
+-----------------------------------+
SELECT mask('AbCD123-@$#', NULL, NULL, NULL, 'o');
+--------------------------------------+
|mask(AbCD123-@$#, NULL, NULL, NULL, o)|
+--------------------------------------+
| AbCD123oooo|
+--------------------------------------+
SELECT mask(NULL, NULL, NULL, NULL, 'o');
+-------------------------------+
|mask(NULL, NULL, NULL, NULL, o)|
+-------------------------------+
| NULL|
+-------------------------------+
SELECT mask(NULL);
+-------------------------+
|mask(NULL, X, x, n, NULL)|
+-------------------------+
| NULL|
+-------------------------+
SELECT mask('AbCD123-@$#', NULL, NULL, NULL, NULL);
+-----------------------------------------+
|mask(AbCD123-@$#, NULL, NULL, NULL, NULL)|
+-----------------------------------------+
| AbCD123-@$#|
+-----------------------------------------+
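mask 适合在导出或共享数据前对敏感列做脱敏。下面是一个仅作示意的Scala草图(列名card 为本例自拟),通过selectExpr 调用SQL中的mask:
import spark.implicits._  // 假设spark 已按文首方式创建
// 对假设的卡号列脱敏:大写→X、小写→x、数字→n,其他字符保持不变
val cards = Seq("abcd-EFGH-8765-4321").toDF("card")
cards.selectExpr("card", "mask(card) AS masked").show(truncate = false)
// masked 列预期为 xxxx-XXXX-nnnn-nnnn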
-- octet_length
SELECT octet_length('Spark SQL');
+-----------------------+
|octet_length(Spark SQL)|
+-----------------------+
| 9|
+-----------------------+
SELECT octet_length(x'537061726b2053514c');
+-----------------------------------+
|octet_length(X'537061726B2053514C')|
+-----------------------------------+
| 9|
+-----------------------------------+
-- overlay
SELECT overlay('Spark SQL' PLACING '_' FROM 6);
+----------------------------+
|overlay(Spark SQL, _, 6, -1)|
+----------------------------+
| Spark_SQL|
+----------------------------+
SELECT overlay('Spark SQL' PLACING 'CORE' FROM 7);
+-------------------------------+
|overlay(Spark SQL, CORE, 7, -1)|
+-------------------------------+
| Spark CORE|
+-------------------------------+
SELECT overlay('Spark SQL' PLACING 'ANSI ' FROM 7 FOR 0);
+-------------------------------+
|overlay(Spark SQL, ANSI , 7, 0)|
+-------------------------------+
| Spark ANSI SQL|
+-------------------------------+
SELECT overlay('Spark SQL' PLACING 'tructured' FROM 2 FOR 4);
+-----------------------------------+
|overlay(Spark SQL, tructured, 2, 4)|
+-----------------------------------+
| Structured SQL|
+-----------------------------------+
SELECT overlay(encode('Spark SQL', 'utf-8') PLACING encode('_', 'utf-8') FROM 6);
+----------------------------------------------------------+
|overlay(encode(Spark SQL, utf-8), encode(_, utf-8), 6, -1)|
+----------------------------------------------------------+
| [53 70 61 72 6B 5...|
+----------------------------------------------------------+
SELECT overlay(encode('Spark SQL', 'utf-8') PLACING encode('CORE', 'utf-8') FROM 7);
+-------------------------------------------------------------+
|overlay(encode(Spark SQL, utf-8), encode(CORE, utf-8), 7, -1)|
+-------------------------------------------------------------+
| [53 70 61 72 6B 2...|
+-------------------------------------------------------------+
SELECT overlay(encode('Spark SQL', 'utf-8') PLACING encode('ANSI ', 'utf-8') FROM 7 FOR 0);
+-------------------------------------------------------------+
|overlay(encode(Spark SQL, utf-8), encode(ANSI , utf-8), 7, 0)|
+-------------------------------------------------------------+
| [53 70 61 72 6B 2...|
+-------------------------------------------------------------+
SELECT overlay(encode('Spark SQL', 'utf-8') PLACING encode('tructured', 'utf-8') FROM 2 FOR 4);
+-----------------------------------------------------------------+
|overlay(encode(Spark SQL, utf-8), encode(tructured, utf-8), 2, 4)|
+-----------------------------------------------------------------+
| [53 74 72 75 63 7...|
+-----------------------------------------------------------------+
-- position
SELECT position('bar', 'foobarbar');
+---------------------------+
|position(bar, foobarbar, 1)|
+---------------------------+
| 4|
+---------------------------+
SELECT position('bar', 'foobarbar', 5);
+---------------------------+
|position(bar, foobarbar, 5)|
+---------------------------+
| 7|
+---------------------------+
SELECT POSITION('bar' IN 'foobarbar');
+-------------------------+
|locate(bar, foobarbar, 1)|
+-------------------------+
| 4|
+-------------------------+
-- printf
SELECT printf("Hello World %d %s", 100, "days");
+------------------------------------+
|printf(Hello World %d %s, 100, days)|
+------------------------------------+
| Hello World 100 days|
+------------------------------------+
-- regexp_count
SELECT regexp_count('Steven Jones and Stephen Smith are the best players', 'Ste(v|ph)en');
+------------------------------------------------------------------------------+
|regexp_count(Steven Jones and Stephen Smith are the best players, Ste(v|ph)en)|
+------------------------------------------------------------------------------+
| 2|
+------------------------------------------------------------------------------+
SELECT regexp_count('abcdefghijklmnopqrstuvwxyz', '[a-z]{3}');
+--------------------------------------------------+
|regexp_count(abcdefghijklmnopqrstuvwxyz, [a-z]{3})|
+--------------------------------------------------+
| 8|
+--------------------------------------------------+
-- regexp_extract
SELECT regexp_extract('100-200', '(\\d+)-(\\d+)', 1);
+---------------------------------------+
|regexp_extract(100-200, (\d+)-(\d+), 1)|
+---------------------------------------+
| 100|
+---------------------------------------+
-- regexp_extract_all
SELECT regexp_extract_all('100-200, 300-400', '(\\d+)-(\\d+)', 1);
+----------------------------------------------------+
|regexp_extract_all(100-200, 300-400, (\d+)-(\d+), 1)|
+----------------------------------------------------+
| [100, 300]|
+----------------------------------------------------+
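regexp_extract 在DataFrame API中有同名函数org.apache.spark.sql.functions.regexp_extract。下面是一个仅作示意的Scala草图(列名s 为本例自拟),提取"数字-数字"中的第一个捕获组:
import spark.implicits._  // 假设spark 已按文首方式创建
import org.apache.spark.sql.functions.regexp_extract
val ranges = Seq("100-200", "300-400").toDF("s")
// 捕获组1为左侧的数字
ranges.select($"s", regexp_extract($"s", "(\\d+)-(\\d+)", 1).as("first")).show()
// first 列依次为 100 和 300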
-- regexp_instr
SELECT regexp_instr('user@spark.apache.org', '@[^.]*');
+----------------------------------------------+
|regexp_instr(user@spark.apache.org, @[^.]*, 0)|
+----------------------------------------------+
| 5|
+----------------------------------------------+
-- regexp_replace
SELECT regexp_replace('100-200', '(\\d+)', 'num');
+--------------------------------------+
|regexp_replace(100-200, (\d+), num, 1)|
+--------------------------------------+
| num-num|
+--------------------------------------+
-- regexp_substr
SELECT regexp_substr('Steven Jones and Stephen Smith are the best players', 'Ste(v|ph)en');
+-------------------------------------------------------------------------------+
|regexp_substr(Steven Jones and Stephen Smith are the best players, Ste(v|ph)en)|
+-------------------------------------------------------------------------------+
| Steven|
+-------------------------------------------------------------------------------+
SELECT regexp_substr('Steven Jones and Stephen Smith are the best players', 'Jeck');
+------------------------------------------------------------------------+
|regexp_substr(Steven Jones and Stephen Smith are the best players, Jeck)|
+------------------------------------------------------------------------+
| NULL|
+------------------------------------------------------------------------+
-- repeat
SELECT repeat('123', 2);
+--------------+
|repeat(123, 2)|
+--------------+
| 123123|
+--------------+
-- replace
SELECT replace('ABCabc', 'abc', 'DEF');
+-------------------------+
|replace(ABCabc, abc, DEF)|
+-------------------------+
| ABCDEF|
+-------------------------+
-- right
SELECT right('Spark SQL', 3);
+-------------------+
|right(Spark SQL, 3)|
+-------------------+
| SQL|
+-------------------+
-- rpad
SELECT rpad('hi', 5, '??');
+---------------+
|rpad(hi, 5, ??)|
+---------------+
| hi???|
+---------------+
SELECT rpad('hi', 1, '??');
+---------------+
|rpad(hi, 1, ??)|
+---------------+
| h|
+---------------+
SELECT rpad('hi', 5);
+--------------+
|rpad(hi, 5, )|
+--------------+
| hi |
+--------------+
SELECT hex(rpad(unhex('aabb'), 5));
+--------------------------------+
|hex(rpad(unhex(aabb), 5, X'00'))|
+--------------------------------+
| AABB000000|
+--------------------------------+
SELECT hex(rpad(unhex('aabb'), 5, unhex('1122')));
+--------------------------------------+
|hex(rpad(unhex(aabb), 5, unhex(1122)))|
+--------------------------------------+
| AABB112211|
+--------------------------------------+
-- rtrim
SELECT rtrim(' SparkSQL ');
+----------------------+
|rtrim( SparkSQL )|
+----------------------+
| SparkSQL|
+----------------------+
-- sentences
SELECT sentences('Hi there! Good morning.');
+--------------------------------------+
|sentences(Hi there! Good morning., , )|
+--------------------------------------+
| [[Hi, there], [Go...|
+--------------------------------------+
-- soundex
SELECT soundex('Miller');
+---------------+
|soundex(Miller)|
+---------------+
| M460|
+---------------+
-- space
SELECT concat(space(2), '1');
+-------------------+
|concat(space(2), 1)|
+-------------------+
| 1|
+-------------------+
-- split
SELECT split('oneAtwoBthreeC', '[ABC]');
+--------------------------------+
|split(oneAtwoBthreeC, [ABC], -1)|
+--------------------------------+
| [one, two, three, ]|
+--------------------------------+
SELECT split('oneAtwoBthreeC', '[ABC]', -1);
+--------------------------------+
|split(oneAtwoBthreeC, [ABC], -1)|
+--------------------------------+
| [one, two, three, ]|
+--------------------------------+
SELECT split('oneAtwoBthreeC', '[ABC]', 2);
+-------------------------------+
|split(oneAtwoBthreeC, [ABC], 2)|
+-------------------------------+
| [one, twoBthreeC]|
+-------------------------------+
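split 在DataFrame API中返回数组列,常与explode 搭配把数组元素展开成多行。下面是一个仅作示意的Scala草图,沿用上面的示例字符串:
import spark.implicits._  // 假设spark 已按文首方式创建
import org.apache.spark.sql.functions.{col, explode, split}
val line = Seq("oneAtwoBthreeC").toDF("s")
// 先按正则[ABC]切分成数组,再用explode展开为每行一个元素
line.select(explode(split(col("s"), "[ABC]")).as("word")).show()
// word 列依次为 one、two、three 和一个空字符串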
-- split_part
SELECT split_part('11.12.13', '.', 3);
+--------------------------+
|split_part(11.12.13, ., 3)|
+--------------------------+
| 13|
+--------------------------+
-- startswith
SELECT startswith('Spark SQL', 'Spark');
+----------------------------+
|startswith(Spark SQL, Spark)|
+----------------------------+
| true|
+----------------------------+
SELECT startswith('Spark SQL', 'SQL');
+--------------------------+
|startswith(Spark SQL, SQL)|
+--------------------------+
| false|
+--------------------------+
SELECT startswith('Spark SQL', null);
+---------------------------+
|startswith(Spark SQL, NULL)|
+---------------------------+
| NULL|
+---------------------------+
SELECT startswith(x'537061726b2053514c', x'537061726b');
+------------------------------------------------+
|startswith(X'537061726B2053514C', X'537061726B')|
+------------------------------------------------+
| true|
+------------------------------------------------+
SELECT startswith(x'537061726b2053514c', x'53514c');
+--------------------------------------------+
|startswith(X'537061726B2053514C', X'53514C')|
+--------------------------------------------+
| false|
+--------------------------------------------+
-- substr
SELECT substr('Spark SQL', 5);
+--------------------------------+
|substr(Spark SQL, 5, 2147483647)|
+--------------------------------+
| k SQL|
+--------------------------------+
SELECT substr('Spark SQL', -3);
+---------------------------------+
|substr(Spark SQL, -3, 2147483647)|
+---------------------------------+
| SQL|
+---------------------------------+
SELECT substr('Spark SQL', 5, 1);
+-----------------------+
|substr(Spark SQL, 5, 1)|
+-----------------------+
| k|
+-----------------------+
SELECT substr('Spark SQL' FROM 5);
+-----------------------------------+
|substring(Spark SQL, 5, 2147483647)|
+-----------------------------------+
| k SQL|
+-----------------------------------+
SELECT substr('Spark SQL' FROM -3);
+------------------------------------+
|substring(Spark SQL, -3, 2147483647)|
+------------------------------------+
| SQL|
+------------------------------------+
SELECT substr('Spark SQL' FROM 5 FOR 1);
+--------------------------+
|substring(Spark SQL, 5, 1)|
+--------------------------+
| k|
+--------------------------+
SELECT substr(encode('Spark SQL', 'utf-8'), 5);
+-----------------------------------------------+
|substr(encode(Spark SQL, utf-8), 5, 2147483647)|
+-----------------------------------------------+
| [6B 20 53 51 4C]|
+-----------------------------------------------+
-- substring
SELECT substring('Spark SQL', 5);
+-----------------------------------+
|substring(Spark SQL, 5, 2147483647)|
+-----------------------------------+
| k SQL|
+-----------------------------------+
SELECT substring('Spark SQL', -3);
+------------------------------------+
|substring(Spark SQL, -3, 2147483647)|
+------------------------------------+
| SQL|
+------------------------------------+
SELECT substring('Spark SQL', 5, 1);
+--------------------------+
|substring(Spark SQL, 5, 1)|
+--------------------------+
| k|
+--------------------------+
SELECT substring('Spark SQL' FROM 5);
+-----------------------------------+
|substring(Spark SQL, 5, 2147483647)|
+-----------------------------------+
| k SQL|
+-----------------------------------+
SELECT substring('Spark SQL' FROM -3);
+------------------------------------+
|substring(Spark SQL, -3, 2147483647)|
+------------------------------------+
| SQL|
+------------------------------------+
SELECT substring('Spark SQL' FROM 5 FOR 1);
+--------------------------+
|substring(Spark SQL, 5, 1)|
+--------------------------+
| k|
+--------------------------+
SELECT substring(encode('Spark SQL', 'utf-8'), 5);
+--------------------------------------------------+
|substring(encode(Spark SQL, utf-8), 5, 2147483647)|
+--------------------------------------------------+
| [6B 20 53 51 4C]|
+--------------------------------------------------+
-- substring_index
SELECT substring_index('www.apache.org', '.', 2);
+-------------------------------------+
|substring_index(www.apache.org, ., 2)|
+-------------------------------------+
| www.apache|
+-------------------------------------+
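As a hedged addition (not in the original examples), the documented behavior for a negative count is to count delimiters from the right; the expected value below is inferred from that rule, not a captured result.
SELECT substring_index('www.apache.org', '.', -2);
-- expected: apache.org (the last two '.'-separated parts)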
-- to_binary
SELECT to_binary('abc', 'utf-8');
+---------------------+
|to_binary(abc, utf-8)|
+---------------------+
| [61 62 63]|
+---------------------+
-- to_char
SELECT to_char(454, '999');
+-----------------+
|to_char(454, 999)|
+-----------------+
| 454|
+-----------------+
SELECT to_char(454.00, '000D00');
+-----------------------+
|to_char(454.00, 000D00)|
+-----------------------+
| 454.00|
+-----------------------+
SELECT to_char(12454, '99G999');
+----------------------+
|to_char(12454, 99G999)|
+----------------------+
| 12,454|
+----------------------+
SELECT to_char(78.12, '$99.99');
+----------------------+
|to_char(78.12, $99.99)|
+----------------------+
| $78.12|
+----------------------+
SELECT to_char(-12454.8, '99G999D9S');
+----------------------------+
|to_char(-12454.8, 99G999D9S)|
+----------------------------+
| 12,454.8-|
+----------------------------+
-- to_number
SELECT to_number('454', '999');
+-------------------+
|to_number(454, 999)|
+-------------------+
| 454|
+-------------------+
SELECT to_number('454.00', '000.00');
+-------------------------+
|to_number(454.00, 000.00)|
+-------------------------+
| 454.00|
+-------------------------+
SELECT to_number('12,454', '99,999');
+-------------------------+
|to_number(12,454, 99,999)|
+-------------------------+
| 12454|
+-------------------------+
SELECT to_number('$78.12', '$99.99');
+-------------------------+
|to_number($78.12, $99.99)|
+-------------------------+
| 78.12|
+-------------------------+
SELECT to_number('12,454.8-', '99,999.9S');
+-------------------------------+
|to_number(12,454.8-, 99,999.9S)|
+-------------------------------+
| -12454.8|
+-------------------------------+
-- to_varchar
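-- to_varchar is a synonym of to_char, which is why the result columns below are still labelled to_char.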
SELECT to_varchar(454, '999');
+-----------------+
|to_char(454, 999)|
+-----------------+
| 454|
+-----------------+
SELECT to_varchar(454.00, '000D00');
+-----------------------+
|to_char(454.00, 000D00)|
+-----------------------+
| 454.00|
+-----------------------+
SELECT to_varchar(12454, '99G999');
+----------------------+
|to_char(12454, 99G999)|
+----------------------+
| 12,454|
+----------------------+
SELECT to_varchar(78.12, '$99.99');
+----------------------+
|to_char(78.12, $99.99)|
+----------------------+
| $78.12|
+----------------------+
SELECT to_varchar(-12454.8, '99G999D9S');
+----------------------------+
|to_char(-12454.8, 99G999D9S)|
+----------------------------+
| 12,454.8-|
+----------------------------+
-- translate
SELECT translate('AaBbCc', 'abc', '123');
+---------------------------+
|translate(AaBbCc, abc, 123)|
+---------------------------+
| A1B2C3|
+---------------------------+
-- trim
SELECT trim('    SparkSQL   ');
+---------------------+
|trim(    SparkSQL   )|
+---------------------+
|             SparkSQL|
+---------------------+
SELECT trim(BOTH FROM '    SparkSQL   ');
+---------------------+
|trim(    SparkSQL   )|
+---------------------+
|             SparkSQL|
+---------------------+
SELECT trim(LEADING FROM '    SparkSQL   ');
+----------------------+
|ltrim(    SparkSQL   )|
+----------------------+
|           SparkSQL   |
+----------------------+
SELECT trim(TRAILING FROM '    SparkSQL   ');
+----------------------+
|rtrim(    SparkSQL   )|
+----------------------+
|              SparkSQL|
+----------------------+
SELECT trim('SL' FROM 'SSparkSQLS');
+-----------------------------+
|TRIM(BOTH SL FROM SSparkSQLS)|
+-----------------------------+
| parkSQ|
+-----------------------------+
SELECT trim(BOTH 'SL' FROM 'SSparkSQLS');
+-----------------------------+
|TRIM(BOTH SL FROM SSparkSQLS)|
+-----------------------------+
| parkSQ|
+-----------------------------+
SELECT trim(LEADING 'SL' FROM 'SSparkSQLS');
+--------------------------------+
|TRIM(LEADING SL FROM SSparkSQLS)|
+--------------------------------+
| parkSQLS|
+--------------------------------+
SELECT trim(TRAILING 'SL' FROM 'SSparkSQLS');
+---------------------------------+
|TRIM(TRAILING SL FROM SSparkSQLS)|
+---------------------------------+
| SSparkSQ|
+---------------------------------+
-- try_to_binary
SELECT try_to_binary('abc', 'utf-8');
+-------------------------+
|try_to_binary(abc, utf-8)|
+-------------------------+
| [61 62 63]|
+-------------------------+
SELECT try_to_binary('a!', 'base64');
+-------------------------+
|try_to_binary(a!, base64)|
+-------------------------+
| NULL|
+-------------------------+
SELECT try_to_binary('abc', 'invalidFormat');
+---------------------------------+
|try_to_binary(abc, invalidFormat)|
+---------------------------------+
| NULL|
+---------------------------------+
-- try_to_number
SELECT try_to_number('454', '999');
+-----------------------+
|try_to_number(454, 999)|
+-----------------------+
| 454|
+-----------------------+
SELECT try_to_number('454.00', '000.00');
+-----------------------------+
|try_to_number(454.00, 000.00)|
+-----------------------------+
| 454.00|
+-----------------------------+
SELECT try_to_number('12,454', '99,999');
+-----------------------------+
|try_to_number(12,454, 99,999)|
+-----------------------------+
| 12454|
+-----------------------------+
SELECT try_to_number('$78.12', '$99.99');
+-----------------------------+
|try_to_number($78.12, $99.99)|
+-----------------------------+
| 78.12|
+-----------------------------+
SELECT try_to_number('12,454.8-', '99,999.9S');
+-----------------------------------+
|try_to_number(12,454.8-, 99,999.9S)|
+-----------------------------------+
| -12454.8|
+-----------------------------------+
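Compared with to_number, try_to_number returns NULL instead of raising an error when the input does not match the format. A hypothetical illustration (not from the original examples; the expected value is inferred from that behavior):
SELECT try_to_number('78.12', '$99.99');
-- expected: NULL, because the '$' required by the format string is missing from the input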
-- ucase
SELECT ucase('SparkSql');
+---------------+
|ucase(SparkSql)|
+---------------+
| SPARKSQL|
+---------------+
-- unbase64
SELECT unbase64('U3BhcmsgU1FM');
+----------------------+
|unbase64(U3BhcmsgU1FM)|
+----------------------+
| [53 70 61 72 6B 2...|
+----------------------+
-- upper
SELECT upper('SparkSql');
+---------------+
|upper(SparkSql)|
+---------------+
| SPARKSQL|
+---------------+
Conversion Functions
Function | Description |
---|---|
bigint(expr) | Casts the value expr to the target data type bigint. |
binary(expr) | Casts the value expr to the target data type binary. |
boolean(expr) | Casts the value expr to the target data type boolean. |
cast(expr AS type) | Casts the value expr to the target data type type. |
date(expr) | Casts the value expr to the target data type date. |
decimal(expr) | Casts the value expr to the target data type decimal. |
double(expr) | Casts the value expr to the target data type double. |
float(expr) | Casts the value expr to the target data type float. |
int(expr) | Casts the value expr to the target data type int. |
smallint(expr) | Casts the value expr to the target data type smallint. |
string(expr) | Casts the value expr to the target data type string. |
timestamp(expr) | Casts the value expr to the target data type timestamp. |
tinyint(expr) | Casts the value expr to the target data type tinyint. |
Examples
-- cast
SELECT cast('10' as int);
+---------------+
|CAST(10 AS INT)|
+---------------+
| 10|
+---------------+
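The remaining conversion functions in the table above behave like cast to the corresponding type. A minimal sketch, assuming standard Spark SQL cast semantics; the literals are hypothetical and the expected values are shown as comments instead of full output tables:
SELECT bigint('15');       -- 15
SELECT double('3.14');     -- 3.14
SELECT date('2023-01-01'); -- 2023-01-01
SELECT boolean('true');    -- true
SELECT string(42);         -- 42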
8. Further Reading
Spark SQL usage examples: authoritative guide to all functions
Spark SQL performance tuning: authoritative material from the official site
Where do Spark Dataset/DataFrame improve on RDD?
Differences between the Spark YARN cluster and client submission modes
Spark RDD overview and usage: authoritative material from the official site (worth bookmarking)
Spark on k8s: running Spark on Kubernetes, authoritative official material
Authoritative Spark performance tuning guide
Submitting Spark applications: the official guide explained in detail (worth bookmarking)
Spark data types: the official reference explained in detail
Spark cluster mode overview
Launching Spark on YARN: principles and usage, authoritative official material
Spark SQL Implicits: principles, usage, examples, and source-code analysis
Spark Dataset/DataFrame: principles, methods, examples, and source-code analysis
Spark SparkSession: origin, methods, examples, and source-code analysis
Spark RDD: origin, principles, methods, examples, and source code explained in detail
Spark SparkContext: principles, usage, examples, and source code explained in detail