Part 5: A Brief Look at [ElasticSearch] Internals and Grouped Aggregation Queries


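Cluster node roles

In the ES configuration file (elasticsearch.yml in the config directory):

Master node: node.master: true
Data node: node.data: true
  1. Client node
      When both the master and data settings are false, the node only routes requests, handles searches, and distributes indexing operations. In essence a client node behaves like a smart load balancer. Dedicated client nodes are very useful in larger clusters: they coordinate between master and data nodes, and because a client node joins the cluster it knows the cluster state and can route requests directly based on it.

  2. Data node
      Data nodes store the index data and do most of the work on documents: create, delete, update, and search operations, as well as aggregations. They are demanding on CPU, memory, and I/O. When tuning, monitor the state of the data nodes and add new nodes to the cluster when resources run short.

  3. Master node
      A master-eligible node is mainly responsible for cluster-level operations, such as creating or deleting indices, tracking which nodes are part of the cluster, and deciding which shards are allocated to which nodes. A stable master node is very important for cluster health. By default any node in a cluster may be elected master. Because indexing and searching consume a lot of CPU, memory, and I/O, separating master nodes from data nodes is a good way to keep a cluster stable.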

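I. How ElasticSearch computes a document's _score

1. Boolean model

Step 1: based on the user's query, first filter out the docs (documents) that contain the specified terms (keywords).
For example, a query for "hello world":

query "hello world"  split into terms -->  hello / world / hello & world

Step 2: filter according to your conditions

bool --> must/must_not/should conditions --> filter --> contains / does not contain / may contain

No scoring has happened up to this point.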
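
The two filtering steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Lucene implements it: `bool_filter` and the sample documents are hypothetical, tokenization is a plain regex split, and `should` is treated as required only when no `must` clause is given, which mirrors the default behaviour of the bool query.

```python
import re

def bool_filter(docs, must=(), must_not=(), should=()):
    """Return the docs that pass the boolean conditions; no scoring is done."""
    matched = []
    for doc in docs:
        terms = set(re.findall(r"\w+", doc.lower()))
        if not all(t in terms for t in must):
            continue  # a required term is missing
        if any(t in terms for t in must_not):
            continue  # a forbidden term is present
        if not must and should and not any(t in terms for t in should):
            continue  # with no must clause, at least one should term has to match
        matched.append(doc)
    return matched

docs = [
    "hello you, and world is very good",
    "hello, how are you",
    "hi world, how are you",
]
either = bool_filter(docs, should=["hello", "world"])
no_hi = bool_filter(docs, should=["hello", "world"], must_not=["hi"])
```

Here `either` keeps all three documents, while `no_hi` drops the third one because it contains hi.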

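2. Relevance score algorithm

This algorithm computes how closely the text in an index matches the search text.
The classic Elasticsearch scoring algorithm is term frequency/inverse document frequency, TF/IDF for short (TF is multiplied by IDF, not divided by it). Since version 5.0 the default similarity is BM25, which builds on the same TF and IDF ideas.
Step 3: compute the score

  1. Term frequency (TF): how many times each term of the search text occurs in the field text. The more occurrences, the more relevant.
    For example
    Search request: hello world
    It is split into hello and world, and the occurrences of these terms are counted in each document. The more occurrences, the higher the score. Here doc1 contains both terms and doc2 only hello, so doc1 scores higher.
doc1:hello you, and world is very good

doc2:hello, how are you
  2. Inverse document frequency (IDF): how many times each term of the search text occurs across all documents of the whole index. The more occurrences, the less relevant.
    (Think of searching for very common words such as "的" or "是": they occur in almost every document of the index, and this factor accounts for that.)
    For example
    Search request: hello world
doc1:hello, july is good

doc2:hi world, how are you

    If, say, hello occurs in far more documents of the index than world, a match on world carries more weight and doc2 is the more relevant one.

Besides TF and IDF there is one more factor:
  3. Field-length norm: the length of the field. The longer the field, the weaker the relevance.

For example
Search request: hello world

doc1:{ "title": "hello july", "content": "...... 1000 words" }
doc2:{ "title": "my baby", "content": "...... 1000 words, hi world" }

Assuming hello and world occur equally often in the whole index, doc1 is still more relevant, because its match is in title, a much shorter field.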
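
To make the three factors concrete, here is a minimal Python sketch. It borrows the shapes of the classic Lucene formulas (square root of the count for TF, a logarithm for IDF, one over the square root of the field length for the norm) but combines them in a simplified way, so the numbers will not match what Elasticsearch reports. The function names are hypothetical and the two documents are the ones from the TF example.

```python
import math

def tf(term, tokens):
    # Term frequency: more occurrences in the field give a higher value
    return math.sqrt(tokens.count(term))

def idf(term, docs):
    # Inverse document frequency: terms found in fewer documents weigh more
    doc_freq = sum(1 for tokens in docs if term in tokens)
    return 1.0 + math.log(len(docs) / (doc_freq + 1))

def field_norm(tokens):
    # Field-length norm: shorter fields weigh more
    return 1.0 / math.sqrt(len(tokens)) if tokens else 0.0

def score(query_terms, tokens, docs):
    # Simplified combination: sum over query terms of tf * idf * norm
    return sum(tf(t, tokens) * idf(t, docs) * field_norm(tokens)
               for t in query_terms)

docs = [
    "hello you and world is very good".split(),  # doc1
    "hello how are you".split(),                 # doc2
]
scores = [score(["hello", "world"], tokens, docs) for tokens in docs]
```

In this tiny corpus world is rarer than hello, so it gets the larger IDF, and doc1 ends up with the higher score because it matches both terms.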

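3. Analyzing how the _score of a document is calculated

A simple example using the explain API.

GET /test_index08/_explain/3
{"query":{"match":{"f":"hello"}}}

Result
The output contains the idf and tf scores described above. A quick look is enough for now: the math behind ES scoring is fairly involved and is not covered here.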


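II. How an analyzer works

1. Character filter, tokenizer, token filter

  • Splitting into words and normalization

The configured analyzer splits the data that is about to be stored in ES: given a sentence, it breaks the sentence into individual words and normalizes each word (tense conversion, singular/plural conversion, and so on).

The workflow can be divided roughly into three steps.
Step 1, character filter: preprocess the text before it is tokenized. The most common cases are filtering content (stripping HTML tags) and converting special characters (for example & --> and).

Step 2, tokenizer: split into tokens, hello you and me --> hello, you, and, me

Step 3, token filter: lowercase, stop word, synonym (for example case conversion, stop-word removal, synonym handling).

Only the result of all this processing is used to build the inverted index.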
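
The three stages can be imitated with a small Python pipeline. This is a rough sketch, not the Lucene implementation: the regular expressions, the stop-word list, and the choice to map & to " and " with surrounding spaces are all assumptions made for the example.

```python
import re

STOPWORDS = {"a", "the"}  # hypothetical stop-word list

def char_filter(text):
    # Stage 1: strip HTML tags and map & to "and" before tokenizing
    text = re.sub(r"<[^>]+>", " ", text)
    return text.replace("&", " and ")

def tokenizer(text):
    # Stage 2: split the text into word tokens
    return re.findall(r"\w+", text)

def token_filter(tokens):
    # Stage 3: lowercase every token, then drop stop words
    lowered = [t.lower() for t in tokens]
    return [t for t in lowered if t not in STOPWORDS]

def analyze(text):
    return token_filter(tokenizer(char_filter(text)))

tokens = analyze("<b>Hello</b> You & Me in the House")
# tokens == ["hello", "you", "and", "me", "in", "house"]
```

The order matters: character filters see the raw string, the tokenizer sees the cleaned string, and token filters only ever see individual tokens.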

2. A brief introduction to the built-in analyzers

Test text: Set the shape to semi-transparent by calling set_trans(5)

  • standard analyzer
    Result: set, the, shape, to, semi, transparent, by, calling, set_trans, 5 (standard is the default analyzer)
  • simple analyzer
    Result: set, the, shape, to, semi, transparent, by, calling, set, trans
  • whitespace analyzer
    Result: Set, the, shape, to, semi-transparent, by, calling, set_trans(5)
  • stop analyzer
    Result: removes stop words such as a, the, it

Example

POST _analyze
{
  "analyzer": "standard",
  "text": "Set the shape to semi-transparent by calling set_trans(5)"
}

Detailed result

{
  "tokens" : [
    {
      "token" : "set",
      "start_offset" : 0,
      "end_offset" : 3,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "the",
      "start_offset" : 4,
      "end_offset" : 7,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "shape",
      "start_offset" : 8,
      "end_offset" : 13,
      "type" : "<ALPHANUM>",
      "position" : 2
    },
    {
      "token" : "to",
      "start_offset" : 14,
      "end_offset" : 16,
      "type" : "<ALPHANUM>",
      "position" : 3
    },
    {
      "token" : "semi",
      "start_offset" : 17,
      "end_offset" : 21,
      "type" : "<ALPHANUM>",
      "position" : 4
    },
    {
      "token" : "transparent",
      "start_offset" : 22,
      "end_offset" : 33,
      "type" : "<ALPHANUM>",
      "position" : 5
    },
    {
      "token" : "by",
      "start_offset" : 34,
      "end_offset" : 36,
      "type" : "<ALPHANUM>",
      "position" : 6
    },
    {
      "token" : "calling",
      "start_offset" : 37,
      "end_offset" : 44,
      "type" : "<ALPHANUM>",
      "position" : 7
    },
    {
      "token" : "set_trans",
      "start_offset" : 45,
      "end_offset" : 54,
      "type" : "<ALPHANUM>",
      "position" : 8
    },
    {
      "token" : "5",
      "start_offset" : 55,
      "end_offset" : 56,
      "type" : "<NUM>",
      "position" : 9
    }
  ]
}

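3. Customizing analyzers

3.1 The default analyzer: standard

standard tokenizer: splits on word boundaries

standard token filter: does nothing (it was removed in Elasticsearch 7.0)

lowercase token filter: converts all letters to lowercase

stop token filter (disabled by default): removes stop words such as a, the, it

3.2 Changing the analyzer settings

Enable stop words for English text.
For example
Create an index named my_index, where es_std is the name of the custom analyzer and stopwords enables the English stop-word list.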

PUT /my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "es_std": {
          "type": "standard",
          "stopwords": "_english_"
        }
      }
    }
  }
}

Tokenizing with the default analyzer

GET /my_index/_analyze
{
  "analyzer": "standard", 
  "text": "a dog is in the house"
}

Result

{
  "tokens" : [
    {
      "token" : "a",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "dog",
      "start_offset" : 2,
      "end_offset" : 5,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "is",
      "start_offset" : 6,
      "end_offset" : 8,
      "type" : "<ALPHANUM>",
      "position" : 2
    },
    {
      "token" : "in",
      "start_offset" : 9,
      "end_offset" : 11,
      "type" : "<ALPHANUM>",
      "position" : 3
    },
    {
      "token" : "the",
      "start_offset" : 12,
      "end_offset" : 15,
      "type" : "<ALPHANUM>",
      "position" : 4
    },
    {
      "token" : "house",
      "start_offset" : 16,
      "end_offset" : 21,
      "type" : "<ALPHANUM>",
      "position" : 5
    }
  ]
}

Testing the output of the custom analyzer

GET /my_index/_analyze
{
  "analyzer": "es_std", 
  "text": "a dog is in the house"
}

Result

{
  "tokens" : [
    {
      "token" : "dog",
      "start_offset" : 2,
      "end_offset" : 5,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "house",
      "start_offset" : 16,
      "end_offset" : 21,
      "type" : "<ALPHANUM>",
      "position" : 5
    }
  ]
}

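3.3 Building your own analyzer

Create an index named my_index2 with the following setup. A char filter named &Toand (the name is arbitrary) of type mapping converts & to and, and ! to not; multiple mapping rules are separated by commas. A token filter named my_stopwords (also an arbitrary name) of type stop removes the stop words the and a. my_analyzer is the name of the custom analyzer, and its type is custom. html_strip is a char filter built into ES that strips HTML tags, and lowercase converts uppercase letters to lowercase. "tokenizer": "standard" means the custom analyzer is built on top of the standard tokenizer.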

PUT /my_index2
{
  "settings": {
    "analysis": {
      "char_filter": {
        "&Toand": {
          "type": "mapping",
          "mappings": [
            "&=> and",
            "!=> not"
          ]
        }
      },
      "filter": {
        "my_stopwords": {
          "type": "stop",
          "stopwords": [
            "the",
            "a"
          ]
        }
      },
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "char_filter": [
            "html_strip",
            "&Toand"
          ],
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "my_stopwords"
          ]
        }
      }
    }
  }
}

Test it

GET /my_index2/_analyze
{
  "text": "tom&jerry are a friend in the house, <a>, HAHA!!",
  "analyzer": "my_analyzer"
}

Result

{
  "tokens" : [
    {
      "token" : "tomandjerry",
      "start_offset" : 0,
      "end_offset" : 9,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "are",
      "start_offset" : 10,
      "end_offset" : 13,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "friend",
      "start_offset" : 16,
      "end_offset" : 22,
      "type" : "<ALPHANUM>",
      "position" : 3
    },
    {
      "token" : "in",
      "start_offset" : 23,
      "end_offset" : 25,
      "type" : "<ALPHANUM>",
      "position" : 4
    },
    {
      "token" : "house",
      "start_offset" : 30,
      "end_offset" : 35,
      "type" : "<ALPHANUM>",
      "position" : 6
    },
    {
      "token" : "hahanotnot",
      "start_offset" : 42,
      "end_offset" : 48,
      "type" : "<ALPHANUM>",
      "position" : 7
    }
  ]
}

3.4 The IK analyzer in detail

IK configuration files: located in the config directory
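What the main files are for:

  1. IKAnalyzer.cfg.xml: configures custom dictionaries
  2. main.dic: IK's built-in Chinese dictionary, with more than 270,000 entries. Tokenization follows these entries, so any word listed here is kept together as one token
  3. quantifier.dic: words for units and measures
  4. suffix.dic: suffixes
  5. surname.dic: Chinese surnames
  6. stopword.dic: English stop words

How do you add a custom dictionary to the IK analyzer?
Method 1:
Add the custom dictionary file and register its path in the configuration file.
For example, I created a folder named custom under the config directory, containing a file named custom.dic.
Then edit IKAnalyzer.cfg.xml (this has to be done on every node):

<entry key="ext_dict">custom/custom.dic</entry>

With this method ES must be restarted before the change takes effect.

Method 2 (IK hot update):
Serve the whole custom.dic file from a fixed address, for example 192.168.5.5:8888/custom.dic, and point every ES node's configuration at that address. To update custom.dic you then simply edit the file, and ES does not need to be restarted.

Method 3 (modify the source code):
Download the source code and modify it so that the dictionary is read from MySQL.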


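III. Highlighting

1. Highlighting overview

Highlighting marks the queried content in the results, similar to the results of a Baidu search.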
Highlighting demo
First create an index and add one document.
Specify the analyzer used by individual fields.

PUT /test_highlight
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "content": {
        "type": "text",
        "analyzer": "ik_max_word"
      }
    }
  }
}

Or set the default analyzer for the index

PUT /test_highlight
{
    "settings" : {
        "index" : {
            "analysis.analyzer.default.type": "ik_max_word"
        }
    }
}

Insert data

PUT /test_highlight/_doc/1
{
  "title": "这是july写的第一篇文章",
  "content": "大家好,这是我写的第一篇文章,特别喜欢这个文章"
}

Query with highlighting

GET /test_highlight/_search
{
  "query": {
    "match": {
      "title": "文章"
    }
  },
  "highlight": {
    "fields": {
      "title": {}
    }
  }
}

Result

{
  "took" : 416,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "test_highlight",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "title" : "这是july写的第一篇文章",
          "content" : "大家好,这是我写的第一篇文章,特别喜欢这个文章"
        },
        "highlight" : {
          "title" : [
            "这是july写的第一篇<em>文章</em>"
          ]
        }
      }
    ]
  }
}

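The matched text is wrapped in <em></em> tags, which a page can style in red. So if the specified field contains the search term, the term is highlighted within that field's text.

Note: here only title, the field used in the query, is highlighted. If you also want content highlighted, content must appear in the query as well. With the default settings (require_field_match is true), adding it only under highlight has no effect. See the following example.

GET /test_highlight/_search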
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "title": "文章"
          }
        },
        {
          "match": {
            "content": "文章"
          }
        }
      ]
    }
  },
  "highlight": {
    "fields": {
      "title": {},
      "content": {}
    }
  }
}

Result

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.68324494,
    "hits" : [
      {
        "_index" : "test_highlight",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.68324494,
        "_source" : {
          "title" : "这是july写的第一篇文章",
          "content" : "大家好,这是我写的第一篇文章,特别喜欢这个文章"
        },
        "highlight" : {
          "title" : [
            "这是july写的第一篇<em>文章</em>"
          ],
          "content" : [
            "大家好,这是我写的第一篇<em>文章</em>,特别喜欢这个<em>文章</em>"
          ]
        }
      }
    ]
  }
}

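2. Common highlighters

  • plain highlight: the Lucene highlighter, the default

  • posting highlight: requires index_options=offsets

The posting highlighter performs better than plain because it does not need to re-analyze the text being highlighted, and it needs less disk space than storing term vectors.
(These names follow older Elasticsearch versions. From 6.0 on the default is the unified highlighter, which uses the postings offsets when a field is mapped with index_options set to offsets.)

How to use the posting approach for highlighting
Specify the mapping as follows when creating the index.
For example, to highlight the content field, set "index_options": "offsets".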

PUT /test_highlight
{
  "mappings": {
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "ik_max_word"
        },
        "content": {
          "type": "text",
          "analyzer": "ik_max_word",
          "index_options": "offsets"
        }
      }
  }
}

The query is written the same way as for default highlighting

GET /test_highlight/_search
{
  "query": {
    "match": {
      "content": "文章"
    }
  },
  "highlight": {
    "fields": {
      "content": {}
    }
  }
}

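3. Fast vector highlight

When term vectors are enabled in the mapping at index time, the fast vector highlighter is used.
It performs better on large fields (larger than 1 MB).
How to use it
For example, to highlight the content field, set "term_vector": "with_positions_offsets"

PUT /test_highlight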
{
  "mappings": {
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "ik_max_word"
        },
        "content": {
          "type": "text",
          "analyzer": "ik_max_word",
          "term_vector" : "with_positions_offsets"
        }
      }
  }
}

The query is again written the same way.
How to force a specific highlighter type in a query

GET /test_highlight/_search
{
  "query": {
    "match": {
      "content": "文章"
    }
  },
  "highlight": {
    "fields": {
      "content": {
        "type": "plain"
      }
    }
  }
}

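4. Configuring highlight fragments

Scenario: you want to highlight 'java', and the field holds more than 10,000 characters. You probably do not want the whole text returned, only a small part of it, and you do not need every match shown at once, only the first few highlighted ones.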

GET /test_highlight/_search
{
    "query" : {
        "match": { "content": "文章" }
    },
    "highlight" : {
        "fields" : {
            "content" : {"fragment_size" : 5, "number_of_fragments" : 3 }
        }
    }
}

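fragment_size: the length, in characters, of each returned fragment. The default is 100.
number_of_fragments: the highlighted text may fall into several fragments; this sets how many of them are returned (the default is 5).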

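IV. Aggregations in depth

1. Bucket and metric

In Elasticsearch, bucket and metric are two important kinds of aggregation. They are used to group, filter, and compute over search results.
Bucket: an aggregation that divides documents into segments, or buckets. You can think of it as classification: a bucket aggregation groups the search results by some rule, producing several distinct buckets.

Common bucket types:

  • Terms Bucket: groups by the value of a field, similar to GROUP BY in SQL.
  • Date Histogram Bucket: groups documents by time interval, such as per day, per week, or per month.
  • Range Bucket: groups by numeric range, for example by price band.

Metric: an aggregation that computes a value over the documents in a bucket. Metrics are usually applied to data that has already been grouped, to produce summary figures.
Common metric types:

  • Sum Metric: sums the values of a field.
  • Avg Metric: averages the values of a field.
  • Max Metric: takes the maximum value of a field.
  • Min Metric: takes the minimum value of a field.
  • Cardinality Metric: counts the distinct values of a field (the count is approximate).

For example, suppose we have an index of product sales records with a "category" field for the product type. We can use a Terms Bucket to group by product type and then apply a metric, such as Sum Metric, to compute the total sales of each type.
This can be done with the following Elasticsearch query: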

{
    "aggs": {
        "sales_by_category": {
            "terms": { "field": "category" },
            "aggs": {
                "total_sales": { "sum": { "field": "price" } }
            }
        }
    }
}

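The query first uses a Terms Bucket to group all products by category, then uses a Sum Metric to add up the prices inside each group, giving the total sales per category. sales_by_category is a user-chosen name for the grouping.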
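
The same bucket-then-metric idea can be sketched in plain Python. The records below are invented for illustration, and `sales_by_category` is a hypothetical helper that only mirrors the shape of the response, not how Elasticsearch computes it across shards.

```python
from collections import defaultdict

# Hypothetical sales records; the field names follow the query above
sales = [
    {"category": "phone", "price": 3000},
    {"category": "phone", "price": 5000},
    {"category": "laptop", "price": 8000},
]

def sales_by_category(records):
    buckets = defaultdict(lambda: {"doc_count": 0, "total_sales": 0})
    for rec in records:
        bucket = buckets[rec["category"]]      # terms bucket: one per distinct value
        bucket["doc_count"] += 1
        bucket["total_sales"] += rec["price"]  # sum metric inside the bucket
    return dict(buckets)

result = sales_by_category(sales)
# result["phone"] == {"doc_count": 2, "total_sales": 8000}
```

The bucket decides which documents belong together, and the metric is evaluated separately inside each bucket.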

2. Aggregation examples

Create an index and insert data.

PUT /cars
{
  "mappings": {
    "properties": {
      "price": {
        "type": "long"
      },
      "color": {
        "type": "keyword"
      },
      "brand": {
        "type": "keyword"
      },
      "model": {
        "type": "keyword"
      },
      "sold_date": {
        "type": "date"
      },
      "remark": {
        "type": "text",
        "analyzer": "ik_max_word"
      }
    }
  }
}

Add data

POST /cars/_bulk
{"index":{}}
{"price":258000,"color":"金色","brand":"大众","model":"大众迈腾","sold_date":"2021-10-28","remark":"大众中档车"}
{"index":{}}
{"price":123000,"color":"金色","brand":"大众","model":"大众速腾","sold_date":"2021-11-05","remark":"大众神车"}
{"index":{}}
{"price":239800,"color":"白色","brand":"标志","model":"标志508","sold_date":"2021-05-18","remark":"标志品牌全球上市车型"}
{"index":{}}
{"price":148800,"color":"白色","brand":"标志","model":"标志408","sold_date":"2021-07-02","remark":"比较大的紧凑型车"}
{"index":{}}
{"price":1998000,"color":"黑色","brand":"大众","model":"大众辉腾","sold_date":"2021-08-19","remark":"大众最让人肝疼的车"}
{"index":{}}
{"price":218000,"color":"红色","brand":"奥迪","model":"奥迪A4","sold_date":"2021-11-05","remark":"小资车型"}
{"index":{}}
{"price":489000,"color":"黑色","brand":"奥迪","model":"奥迪A6","sold_date":"2022-01-01","remark":"政府专用?"}
{"index":{}}
{"price":1899000,"color":"黑色","brand":"奥迪","model":"奥迪A 8","sold_date":"2022-02-12","remark":"很贵的大A6。。。"}

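① Count sales per color
This only groups the data, with no further statistics. The most basic aggregation in ES is terms, comparable to GROUP BY combined with COUNT in SQL.
By default ES sorts the buckets by doc_count in descending order. You can order by _key, the field value of each bucket, or by _count, the number of documents in each bucket.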

GET /cars/_search
{
  "aggs": {
    "group_by_color": {
      "terms": {
        "field": "color",
        "order": {
          "_count": "desc"
        }
      }
    }
  }
}

Result: hits holds the matching documents, and aggregations holds the aggregation output.

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "group_by_color" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "黑色",
          "doc_count" : 3
        },
        {
          "key" : "白色",
          "doc_count" : 2
        },
        {
          "key" : "金色",
          "doc_count" : 2
        },
        {
          "key" : "红色",
          "doc_count" : 1
        }
      ]
    }
  }
}

If you do not need the documents themselves, set size to 0.

GET /cars/_search
{
  "size": 0, 
  "aggs": {
    "group_by_color": {
      "terms": {
        "field": "color",
        "order": {
          "_count": "desc"
        }
      }
    }
  }
}
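
As a cross-check of example ①, the grouping can be mimicked in Python with a Counter. `terms_agg` is a hypothetical helper, and the color list is copied from the bulk data above. How Elasticsearch orders buckets with equal counts is not modelled here.

```python
from collections import Counter

# Colors of the eight cars, in insertion order
colors = ["金色", "金色", "白色", "白色", "黑色", "红色", "黑色", "黑色"]

def terms_agg(values, order="_count", direction="desc"):
    counts = Counter(values)
    buckets = [{"key": k, "doc_count": n} for k, n in counts.items()]
    # _count sorts by bucket size, _key sorts by the field value itself
    sort_field = "doc_count" if order == "_count" else "key"
    return sorted(buckets, key=lambda b: b[sort_field],
                  reverse=(direction == "desc"))

by_count = terms_agg(colors)
by_key = terms_agg(colors, order="_key", direction="asc")
```

`by_count` starts with the 黑色 bucket (3 documents) and ends with 红色 (1 document), in line with the response shown earlier.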

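② Average price per color (drill-down analysis, aggs nested inside aggs)
This example first groups by color and then computes a statistic over the documents inside each group. That per-group statistic is the metric. Sorting works here too: the statistic has been given the name avg_by_price, so the buckets can be ordered by that name.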

GET /cars/_search
{
  "size": 0, 
  "aggs": {
    "group_by_color": {
      "terms": {
        "field": "color",
        "order": {
          "avg_by_price": "asc"
        }
      },
      "aggs": {
        "avg_by_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}

Result

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_color" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "金色",
          "doc_count" : 2,
          "avg_by_price" : {
            "value" : 190500.0
          }
        },
        {
          "key" : "白色",
          "doc_count" : 2,
          "avg_by_price" : {
            "value" : 194300.0
          }
        },
        {
          "key" : "红色",
          "doc_count" : 1,
          "avg_by_price" : {
            "value" : 218000.0
          }
        },
        {
          "key" : "黑色",
          "doc_count" : 3,
          "avg_by_price" : {
            "value" : 1462000.0
          }
        }
      ]
    }
  }
}

Setting size to 0 means no documents are returned, only the aggregated data, which speeds up the query. If you do need the documents, set size to whatever fits your case.
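
The averages in example ② can be checked by hand with a short Python sketch. The (color, price) pairs are copied from the bulk data above, and `avg_price_by_color` is a hypothetical helper that groups first and then applies the metric, ordered ascending like the query.

```python
from collections import defaultdict
from statistics import mean

# (color, price) pairs from the cars index
cars = [
    ("金色", 258000), ("金色", 123000), ("白色", 239800), ("白色", 148800),
    ("黑色", 1998000), ("红色", 218000), ("黑色", 489000), ("黑色", 1899000),
]

def avg_price_by_color(rows):
    groups = defaultdict(list)
    for color, price in rows:
        groups[color].append(price)           # bucket: group by color
    buckets = [
        {"key": color, "doc_count": len(prices), "avg_by_price": mean(prices)}
        for color, prices in groups.items()   # metric: average inside each bucket
    ]
    return sorted(buckets, key=lambda b: b["avg_by_price"])  # order asc

result = avg_price_by_color(cars)
```

The resulting order is 金色, 白色, 红色, 黑色, with averages 190500, 194300, 218000 and 1462000, matching the aggregations section of the response.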

③ Average price per color and per brand within each color

Query

GET /cars/_search
{
  "aggs": {
    "group_by_color": {
      "terms": {
        "field": "color",
        "order": {
          "avg_by_price_color": "asc"
        }
      },
      "aggs": {
        "avg_by_price_color": {
          "avg": {
            "field": "price"
          }
        },
        "group_by_brand": {
          "terms": {
            "field": "brand",
            "order": {
              "avg_by_price_brand": "desc"
            }
          },
          "aggs": {
            "avg_by_price_brand": {
              "avg": {
                "field": "price"
              }
            }
          }
        }
      }
    }
  }
}

Result

{
  "took" : 13,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "group_by_color" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "金色",
          "doc_count" : 2,
          "avg_by_price_color" : {
            "value" : 190500.0
          },
          "group_by_brand" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "大众",
                "doc_count" : 2,
                "avg_by_price_brand" : {
                  "value" : 190500.0
                }
              }
            ]
          }
        },
        {
          "key" : "白色",
          "doc_count" : 2,
          "avg_by_price_color" : {
            "value" : 194300.0
          },
          "group_by_brand" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "标志",
                "doc_count" : 2,
                "avg_by_price_brand" : {
                  "value" : 194300.0
                }
              }
            ]
          }
        },
        {
          "key" : "红色",
          "doc_count" : 1,
          "avg_by_price_color" : {
            "value" : 218000.0
          },
          "group_by_brand" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "奥迪",
                "doc_count" : 1,
                "avg_by_price_brand" : {
                  "value" : 218000.0
                }
              }
            ]
          }
        },
        {
          "key" : "黑色",
          "doc_count" : 3,
          "avg_by_price_color" : {
            "value" : 1462000.0
          },
          "group_by_brand" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "大众",
                "doc_count" : 1,
                "avg_by_price_brand" : {
                  "value" : 1998000.0
                }
              },
              {
                "key" : "奥迪",
                "doc_count" : 2,
                "avg_by_price_brand" : {
                  "value" : 1194000.0
                }
              }
            ]
          }
        }
      ]
    }
  }
}

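First group by color, then group again by brand inside each color bucket. This kind of operation is called drill-down analysis (that is, a nested definition).
aggs can also be defined side by side, in the following form.

GET /index_name/_search
{
  "aggs": {
    "aggregation_name_1": {},
    "aggregation_name_2": {}
  }
}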

Example:

GET /cars/_search
{
  "aggs": {
    "group_by_color": {
      "terms": {
        "field": "color"
      }
    },
    "avg_by_price_color": {
      "avg": {
        "field": "price"
      }
    }
  }
}

Result

{
  "took" : 7,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "avg_by_price_color" : {
      "value" : 671700.0
    },
    "group_by_color" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "黑色",
          "doc_count" : 3
        },
        {
          "key" : "白色",
          "doc_count" : 2
        },
        {
          "key" : "金色",
          "doc_count" : 2
        },
        {
          "key" : "红色",
          "doc_count" : 1
        }
      ]
    }
  }
}

④ Maximum price, minimum price, and total price per color
Query

GET /cars/_search
{
  "aggs": {
    "group_by_color": {
      "terms": {
        "field": "color"
      },
      "aggs": {
        "max_price": {
          "max": {
            "field": "price"
          }
        },
        "min_price": {
          "min": {
            "field": "price"
          }
        },
        "sum_price": {
          "sum": {
            "field": "price"
          }
        }
      }
    }
  }
}

Result

{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "group_by_color" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "黑色",
          "doc_count" : 3,
          "max_price" : {
            "value" : 1998000.0
          },
          "min_price" : {
            "value" : 489000.0
          },
          "sum_price" : {
            "value" : 4386000.0
          }
        },
        {
          "key" : "白色",
          "doc_count" : 2,
          "max_price" : {
            "value" : 239800.0
          },
          "min_price" : {
            "value" : 148800.0
          },
          "sum_price" : {
            "value" : 388600.0
          }
        },
        {
          "key" : "金色",
          "doc_count" : 2,
          "max_price" : {
            "value" : 258000.0
          },
          "min_price" : {
            "value" : 123000.0
          },
          "sum_price" : {
            "value" : 381000.0
          }
        },
        {
          "key" : "红色",
          "doc_count" : 1,
          "max_price" : {
            "value" : 218000.0
          },
          "min_price" : {
            "value" : 218000.0
          },
          "sum_price" : {
            "value" : 218000.0
          }
        }
      ]
    }
  }
}
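
To make the bucket-plus-metric idea concrete, here is a minimal Python sketch (plain Python, not Elasticsearch code) that reproduces the numbers above. It groups the eight documents by color as the terms aggregation does, then computes max, min, and sum inside each group. The name stats_by_color and the tuple layout are illustrative assumptions; the prices and colors are copied from the cars index shown in the results.

```python
from collections import defaultdict

# (color, price) pairs copied from the eight documents in the cars index
CARS = [
    ("金色", 258000), ("金色", 123000), ("白色", 239800), ("白色", 148800),
    ("黑色", 1998000), ("红色", 218000), ("黑色", 489000), ("黑色", 1899000),
]

def stats_by_color(cars):
    """Group prices by color (the bucket step), then compute metrics per group."""
    groups = defaultdict(list)
    for color, price in cars:
        groups[color].append(price)
    return {
        color: {
            "doc_count": len(prices),
            "max_price": max(prices),
            "min_price": min(prices),
            "sum_price": sum(prices),
        }
        for color, prices in groups.items()
    }

stats = stats_by_color(CARS)
print(stats["黑色"])
```

The nested aggs in the query play the role of the inner dictionary here: each metric is evaluated once per bucket, not once for the whole index.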

⑤ For each brand, find the model with the highest price
Query

GET /cars/_search
{
  "size": 0,
  "aggs": {
    "group_by_brand": {
      "terms": {
        "field": "brand"
      },
      "aggs": {
        "top_car": {
          "top_hits": {
            "size": 1,
            "sort": [
              {
                "price": {
                  "order": "desc"
                }
              }
            ],
            "_source": {
              "includes": [
                "model",
                "price"
              ]
            }
          }
        }
      }
    }
  }
}

Result

{
  "took" : 11,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_brand" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "大众",
          "doc_count" : 3,
          "top_car" : {
            "hits" : {
              "total" : {
                "value" : 3,
                "relation" : "eq"
              },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "cars",
                  "_type" : "_doc",
                  "_id" : "UYR_-4cBUF6rBrkiDpRJ",
                  "_score" : null,
                  "_source" : {
                    "price" : 1998000,
                    "model" : "大众辉腾"
                  },
                  "sort" : [
                    1998000
                  ]
                }
              ]
            }
          }
        },
        {
          "key" : "奥迪",
          "doc_count" : 3,
          "top_car" : {
            "hits" : {
              "total" : {
                "value" : 3,
                "relation" : "eq"
              },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "cars",
                  "_type" : "_doc",
                  "_id" : "VIR_-4cBUF6rBrkiDpRJ",
                  "_score" : null,
                  "_source" : {
                    "price" : 1899000,
                    "model" : "奥迪A 8"
                  },
                  "sort" : [
                    1899000
                  ]
                }
              ]
            }
          }
        },
        {
          "key" : "标志",
          "doc_count" : 2,
          "top_car" : {
            "hits" : {
              "total" : {
                "value" : 2,
                "relation" : "eq"
              },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "cars",
                  "_type" : "_doc",
                  "_id" : "T4R_-4cBUF6rBrkiDpRJ",
                  "_score" : null,
                  "_source" : {
                    "price" : 239800,
                    "model" : "标志508"
                  },
                  "sort" : [
                    239800
                  ]
                }
              ]
            }
          }
        }
      ]
    }
  }
}
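
The effect of top_hits with size 1 and a descending sort on price can be sketched in plain Python as "keep the most expensive document per brand". This is only an illustration of the semantics, not how ES implements it; top_car_per_brand is a hypothetical helper name, and the data is copied from the index above.

```python
# (brand, model, price) triples copied from the eight documents in the cars index
CARS = [
    ("大众", "大众迈腾", 258000), ("大众", "大众速腾", 123000),
    ("标志", "标志508", 239800), ("标志", "标志408", 148800),
    ("大众", "大众辉腾", 1998000), ("奥迪", "奥迪A4", 218000),
    ("奥迪", "奥迪A6", 489000), ("奥迪", "奥迪A 8", 1899000),
]

def top_car_per_brand(cars):
    """Return {brand: (model, price)} for the highest-priced car of each brand."""
    top = {}
    for brand, model, price in cars:
        # Replace the stored entry only when this car is more expensive
        if brand not in top or price > top[brand][1]:
            top[brand] = (model, price)
    return top

top = top_car_per_brand(CARS)
for brand, (model, price) in top.items():
    print(brand, model, price)
```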

2.1 Aggregations: histogram (interval statistics)

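Like terms, histogram is a bucket aggregation, but it groups documents into fixed-width numeric ranges of a field.
For example: using ranges of 1,000,000, count the cars sold and compute the average price in each range. In the histogram aggregation, field is set to the price field price and the range width is 1,000,000 (interval: 1000000). ES then divides price into the ranges [0, 1000000), [1000000, 2000000), [2000000, 3000000), and so on. While partitioning, histogram also counts the documents in each bucket (doc_count), just as terms does, and nested aggs can run further aggregations on the documents inside each bucket.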

Query

GET /cars/_search
{
  "aggs": {
    "histogram_by_price": {
      "histogram": {
        "field": "price",
        "interval": 1000000
      },
      "aggs": {
        "avg_by_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}

Result

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "histogram_by_price" : {
      "buckets" : [
        {
          "key" : 0.0,
          "doc_count" : 6,
          "avg_by_price" : {
            "value" : 246100.0
          }
        },
        {
          "key" : 1000000.0,
          "doc_count" : 2,
          "avg_by_price" : {
            "value" : 1948500.0
          }
        }
      ]
    }
  }
}
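
The bucket keys 0.0 and 1000000.0 above follow from rounding each price down to a multiple of the interval. A minimal Python sketch of that rule, assuming an offset of 0 (the default), reproduces both the counts and the averages; histogram here is a hypothetical helper, not an ES API.

```python
from collections import defaultdict

# Prices copied from the eight documents in the cars index
PRICES = [258000, 123000, 239800, 148800, 1998000, 218000, 489000, 1899000]

def histogram(values, interval):
    """Assign each value to the bucket [key, key + interval) and average per bucket."""
    buckets = defaultdict(list)
    for value in values:
        key = (value // interval) * interval  # lower bound of the bucket
        buckets[key].append(value)
    return {
        key: {"doc_count": len(vals), "avg_by_price": sum(vals) / len(vals)}
        for key, vals in sorted(buckets.items())
    }

result = histogram(PRICES, 1_000_000)
for key, bucket in result.items():
    print(key, bucket)
```

Six cars fall into [0, 1000000) and two into [1000000, 2000000), which matches the doc_count values in the response.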

2.2 date_histogram (date interval grouping)

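date_histogram performs interval bucketing on a field of type date, for example sales per month or sales per year.
For example: count the cars sold and compute the total sales amount for each month. In a date_histogram aggregation, field names the field to bucket on; interval sets the bucket width (possible values: year, quarter, month, week, day, hour, minute, second); format sets the date format of the bucket key; min_doc_count sets the minimum number of documents a bucket must contain to be returned (default 0, so buckets with no documents are returned as well); extended_bounds sets the start and end of the bucketed range (if omitted, the range runs from the bucket containing the smallest date in the field to the bucket containing the largest). Note that extended_bounds only has a visible effect when min_doc_count is 0: in the queries below min_doc_count is 1, so empty months are dropped even though they lie inside the bounds.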

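Example: compute the total sales amount per month for the period 2021 to 2022.
Legacy syntax (the interval parameter, which still runs on ES 7.x but is deprecated, as the warning in the result shows)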

GET /cars/_search
{
  "aggs": {
    "histogram_by_date": {
      "date_histogram": {
        "field": "sold_date",
        "interval": "month",
        "format": "yyyy-MM-dd",
        "min_doc_count": 1,
        "extended_bounds": {
          "min": "2021-01-01",
          "max": "2022-12-31"
        }
      },
      "aggs": {
        "sum_by_price": {
          "sum": {
            "field": "price"
          }
        }
      }
    }
  }
}

Result

#! Deprecation: [interval] on [date_histogram] is deprecated, use [fixed_interval] or [calendar_interval] in the future.
{
  "took" : 12,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "histogram_by_date" : {
      "buckets" : [
        {
          "key_as_string" : "2021-05-01",
          "key" : 1619827200000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 239800.0
          }
        },
        {
          "key_as_string" : "2021-07-01",
          "key" : 1625097600000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 148800.0
          }
        },
        {
          "key_as_string" : "2021-08-01",
          "key" : 1627776000000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 1998000.0
          }
        },
        {
          "key_as_string" : "2021-10-01",
          "key" : 1633046400000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 258000.0
          }
        },
        {
          "key_as_string" : "2021-11-01",
          "key" : 1635724800000,
          "doc_count" : 2,
          "sum_by_price" : {
            "value" : 341000.0
          }
        },
        {
          "key_as_string" : "2022-01-01",
          "key" : 1640995200000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 489000.0
          }
        },
        {
          "key_as_string" : "2022-02-01",
          "key" : 1643673600000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 1899000.0
          }
        }
      ]
    }
  }
}

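Syntax for ES 7.x and later
Query
Replace the interval keyword with calendar_interval (for calendar units such as month; the deprecation warning above also names fixed_interval as the alternative for fixed-length units).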

GET /cars/_search
{
  "aggs": {
    "histogram_by_date": {
      "date_histogram": {
        "field": "sold_date",
        "calendar_interval": "month",
        "format": "yyyy-MM-dd",
        "min_doc_count": 1,
        "extended_bounds": {
          "min": "2021-01-01",
          "max": "2022-12-31"
        }
      },
      "aggs": {
        "sum_by_price": {
          "sum": {
            "field": "price"
          }
        }
      }
    }
  }
}

Result

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "histogram_by_date" : {
      "buckets" : [
        {
          "key_as_string" : "2021-05-01",
          "key" : 1619827200000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 239800.0
          }
        },
        {
          "key_as_string" : "2021-07-01",
          "key" : 1625097600000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 148800.0
          }
        },
        {
          "key_as_string" : "2021-08-01",
          "key" : 1627776000000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 1998000.0
          }
        },
        {
          "key_as_string" : "2021-10-01",
          "key" : 1633046400000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 258000.0
          }
        },
        {
          "key_as_string" : "2021-11-01",
          "key" : 1635724800000,
          "doc_count" : 2,
          "sum_by_price" : {
            "value" : 341000.0
          }
        },
        {
          "key_as_string" : "2022-01-01",
          "key" : 1640995200000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 489000.0
          }
        },
        {
          "key_as_string" : "2022-02-01",
          "key" : 1643673600000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 1899000.0
          }
        }
      ]
    }
  }
}
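
As a rough illustration of what calendar_interval: month with min_doc_count: 1 does, the Python sketch below truncates each sold_date to the first day of its month and sums the prices per month. Only months that contain at least one sale appear, which is why the response has no empty buckets. The helper name monthly_totals is an assumption for illustration; dates and prices are copied from the index, and time zones are ignored.

```python
from collections import defaultdict
from datetime import date

# (sold_date, price) pairs copied from the eight documents in the cars index
SALES = [
    ("2021-10-28", 258000), ("2021-11-05", 123000), ("2021-05-18", 239800),
    ("2021-07-02", 148800), ("2021-08-19", 1998000), ("2021-11-05", 218000),
    ("2022-01-01", 489000), ("2022-02-12", 1899000),
]

def monthly_totals(sales):
    """Bucket sales by calendar month; the key is the first day of the month."""
    buckets = defaultdict(lambda: {"doc_count": 0, "sum_by_price": 0})
    for day, price in sales:
        key = date.fromisoformat(day).replace(day=1).isoformat()
        buckets[key]["doc_count"] += 1
        buckets[key]["sum_by_price"] += price
    # ISO date strings sort chronologically
    return dict(sorted(buckets.items()))

monthly = monthly_totals(SALES)
for key, bucket in monthly.items():
    print(key, bucket)
```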

2.3 global bucket

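When aggregating, you sometimes need to compare a subset of the data with the whole data set.
For example:
Compute the average price of one brand's cars alongside the average price of all cars. global defines a global bucket: this bucket ignores the conditions in query and runs its nested aggregations over all documents in the index.
Query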

GET /cars/_search
{
  "size": 0,
  "query": {
    "match": {
      "brand": "大众"
    }
  },
  "aggs": {
    "volkswagen_of_avg_price": {
      "avg": {
        "field": "price"
      }
    },
    "all_avg_price": {
      "global": {},
      "aggs": {
        "all_of_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}

Result

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "all_avg_price" : {
      "doc_count" : 8,
      "all_of_price" : {
        "value" : 671700.0
      }
    },
    "volkswagen_of_avg_price" : {
      "value" : 793000.0
    }
  }
}
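
The two averages above can be checked with a small Python sketch: one average over the documents that match the query (brand 大众) and one over every document, which is what the global bucket sees. The variable names are illustrative assumptions; the data is copied from the index.

```python
# (brand, price) pairs copied from the eight documents in the cars index
CARS = [
    ("大众", 258000), ("大众", 123000), ("标志", 239800), ("标志", 148800),
    ("大众", 1998000), ("奥迪", 218000), ("奥迪", 489000), ("奥迪", 1899000),
]

def avg(prices):
    """Arithmetic mean of a non-empty list of prices."""
    return sum(prices) / len(prices)

# Scoped by the query: only documents whose brand matches
volkswagen_avg_price = avg([price for brand, price in CARS if brand == "大众"])

# Global bucket: the query filter is ignored, all documents take part
all_avg_price = avg([price for _, price in CARS])

print(volkswagen_avg_price, all_avg_price)
```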

2.4 aggs + order (aggregation + sorting)

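Sort the aggregated buckets.
For example:
Count the cars sold and compute the total sales amount for each brand, sorted by total sales amount in descending order.
Query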

GET /cars/_search
{
  "aggs": {
    "group_of_brand": {
      "terms": {
        "field": "brand",
        "order": {
          "sum_of_price": "desc"
        }
      },
      "aggs": {
        "sum_of_price": {
          "sum": {
            "field": "price"
          }
        }
      }
    }
  }
}

Result

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "group_of_brand" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "奥迪",
          "doc_count" : 3,
          "sum_of_price" : {
            "value" : 2606000.0
          }
        },
        {
          "key" : "大众",
          "doc_count" : 3,
          "sum_of_price" : {
            "value" : 2379000.0
          }
        },
        {
          "key" : "标志",
          "doc_count" : 2,
          "sum_of_price" : {
            "value" : 388600.0
          }
        }
      ]
    }
  }
}
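
Ordering terms buckets by a sub-aggregation amounts to "sum per group, then sort the groups by that sum". A minimal Python sketch under that reading, with the hypothetical helper brands_by_total and data copied from the index:

```python
from collections import defaultdict

# (brand, price) pairs copied from the eight documents in the cars index
CARS = [
    ("大众", 258000), ("大众", 123000), ("标志", 239800), ("标志", 148800),
    ("大众", 1998000), ("奥迪", 218000), ("奥迪", 489000), ("奥迪", 1899000),
]

def brands_by_total(cars):
    """Sum prices per brand, then sort brands by that sum in descending order."""
    totals = defaultdict(int)
    for brand, price in cars:
        totals[brand] += price
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

ranking = brands_by_total(CARS)
for brand, total in ranking:
    print(brand, total)
```

The sort key is the value named in order (sum_of_price in the query), not doc_count, which is why 奥迪 comes before 大众 even though both have three documents.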

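With multiple levels of aggs (a drill-down aggregation), a terms bucket can also be sorted by a metric computed in its own nested aggs: the order clause refers to that sub-aggregation by its name.
For example:
Compute the total sales amount for each color within each brand, sorted by total sales amount in descending order. As with grouped sorting in SQL, the sort applies only to the data inside each group and never across groups.

Query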

GET /cars/_search
{
  "aggs": {
    "group_by_brand": {
      "terms": {
        "field": "brand"
      },
      "aggs": {
        "group_by_color": {
          "terms": {
            "field": "color",
            "order": {
              "sum_of_price": "desc"
            }
          },
          "aggs": {
            "sum_of_price": {
              "sum": {
                "field": "price"
              }
            }
          }
        }
      }
    }
  }
}

Result

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "group_by_brand" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "大众",
          "doc_count" : 3,
          "group_by_color" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "黑色",
                "doc_count" : 1,
                "sum_of_price" : {
                  "value" : 1998000.0
                }
              },
              {
                "key" : "金色",
                "doc_count" : 2,
                "sum_of_price" : {
                  "value" : 381000.0
                }
              }
            ]
          }
        },
        {
          "key" : "奥迪",
          "doc_count" : 3,
          "group_by_color" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "黑色",
                "doc_count" : 2,
                "sum_of_price" : {
                  "value" : 2388000.0
                }
              },
              {
                "key" : "红色",
                "doc_count" : 1,
                "sum_of_price" : {
                  "value" : 218000.0
                }
              }
            ]
          }
        },
        {
          "key" : "标志",
          "doc_count" : 2,
          "group_by_color" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "白色",
                "doc_count" : 2,
                "sum_of_price" : {
                  "value" : 388600.0
                }
              }
            ]
          }
        }
      ]
    }
  }
}

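2.5 search + aggs (query + aggregation)

An aggregation plays a role similar to the GROUP BY clause in SQL, and the search (query) part plays a role similar to the WHERE clause. In ES the two can be combined in a single request to compute more complex search statistics.
For example:
Count the number of cars sold and the total sales amount per quarter for one brand.
Query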

GET /cars/_search
{
  "query": {
    "match": {
      "brand": "大众"
    }
  },
  "aggs": {
    "histogram_by_date": {
      "date_histogram": {
        "field": "sold_date",
        "calendar_interval": "quarter",
        "min_doc_count": 1
      },
      "aggs": {
        "sum_by_price": {
          "sum": {
            "field": "price"
          }
        }
      }
    }
  }
}

Result

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 0.9444616,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 0.9444616,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 0.9444616,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 0.9444616,
        "_source" : {
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      }
    ]
  },
  "aggregations" : {
    "histogram_by_date" : {
      "buckets" : [
        {
          "key_as_string" : "2021-07-01T00:00:00.000Z",
          "key" : 1625097600000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 1998000.0
          }
        },
        {
          "key_as_string" : "2021-10-01T00:00:00.000Z",
          "key" : 1633046400000,
          "doc_count" : 2,
          "sum_by_price" : {
            "value" : 381000.0
          }
        }
      ]
    }
  }
}
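
The bucket values above can be checked by hand. The following is a minimal plain-Python sketch, not Elasticsearch code. It assumes `brand` matches by exact equality, copies four documents from the results shown in this article, and uses the hypothetical helper names `quarter_start` and `quarterly_sales` to mimic what the query, the quarterly `date_histogram` and the `sum` sub-aggregation compute together.

```python
from collections import defaultdict
from datetime import date

# Four documents copied from the results in this article (only the fields needed here).
CARS = [
    {"brand": "大众", "price": 258000, "sold_date": date(2021, 10, 28)},
    {"brand": "大众", "price": 123000, "sold_date": date(2021, 11, 5)},
    {"brand": "大众", "price": 1998000, "sold_date": date(2021, 8, 19)},
    {"brand": "奥迪", "price": 218000, "sold_date": date(2021, 11, 5)},
]


def quarter_start(d: date) -> date:
    """Return the first day of the calendar quarter that contains d."""
    first_month = 3 * ((d.month - 1) // 3) + 1
    return date(d.year, first_month, 1)


def quarterly_sales(cars, brand):
    """Emulate the request: match on brand (WHERE), bucket by quarter and sum price (GROUP BY)."""
    buckets = defaultdict(lambda: {"doc_count": 0, "sum_by_price": 0.0})
    for car in cars:
        if car["brand"] != brand:  # query part: non-matching documents are skipped
            continue
        bucket = buckets[quarter_start(car["sold_date"])]
        bucket["doc_count"] += 1
        bucket["sum_by_price"] += car["price"]
    # Only non-empty buckets are ever created, which mirrors min_doc_count: 1.
    return dict(sorted(buckets.items()))


if __name__ == "__main__":
    for start, bucket in quarterly_sales(CARS, "大众").items():
        print(start.isoformat(), bucket)
```

With these four documents the sketch yields two buckets, one starting 2021-07-01 (1 document, 1998000.0) and one starting 2021-10-01 (2 documents, 381000.0), which agrees with the `histogram_by_date` buckets in the response.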

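2.6 filter + aggs (filter + aggregation)

A filter can also be combined with aggs to run an aggregation over the filtered documents only.
For example:
Compute the average price of cars priced between 100,000 and 500,000.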

GET /cars/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "range": {
          "price": {
            "gte": 100000,
            "lte": 500000
          }
        }
      }
    }
  },
  "aggs": {
    "avg_by_price": {
      "avg": {
        "field": "price"
      }
    }
  }
}

Result

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 6,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      }
    ]
  },
  "aggregations" : {
    "avg_by_price" : {
      "value" : 246100.0
    }
  }
}
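
As a sanity check on the average above, here is a small plain-Python sketch (again an emulation, not Elasticsearch code). The eight prices are the ones that appear in the results throughout this article; the function name `avg_price_in_range` is hypothetical.

```python
# Prices of the eight cars that appear in this article's results.
PRICES = [258000, 123000, 1998000, 239800, 148800, 218000, 489000, 1899000]


def avg_price_in_range(prices, gte, lte):
    """Emulate a range filter (gte/lte are inclusive) followed by an avg aggregation."""
    matched = [p for p in prices if gte <= p <= lte]
    if not matched:
        return 0, None  # no documents matched, so there is nothing to average
    return len(matched), sum(matched) / len(matched)


if __name__ == "__main__":
    count, average = avg_price_in_range(PRICES, 100000, 500000)
    print(count, average)
```

Six of the eight prices fall inside the range and their sum is 1476600, so the average is 246100.0, matching both `hits.total.value` and `avg_by_price` in the response.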

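2.7 Using filter inside an aggregation

A filter can also be used inside the aggs clause; where the filter is placed determines which documents it applies to.
Example: compute one brand's total sales amount over the last year. A filter placed inside aggs is applied only to the documents already matched by the query, and it narrows only the documents fed into that aggregation; the returned hits are unchanged. A filter placed outside aggs (in the query) narrows the whole search, so it affects both the hits and every aggregation.

Date math expressions that can be used in the range filter:
① now-12M/M: 12 months ago, rounded down to the start of the month.
② now-1y/y: 1 year ago, rounded down to the start of the year.
③ d is the unit for days, for example now-30d.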
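
To make the rounding in these expressions concrete, here is a plain-Python sketch of the arithmetic. It is an illustration only: the helper names `minus_months` and `round_down` are hypothetical, the fixed value of "now" is made up, clamping the day when the target month is shorter is an assumption, and only the rounding-down case (as used with `gte`) is covered.

```python
import calendar
from datetime import datetime


def minus_months(ts: datetime, months: int) -> datetime:
    """Subtract whole months; the day is clamped to the target month's length (assumption)."""
    total = ts.year * 12 + (ts.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    day = min(ts.day, calendar.monthrange(year, month)[1])
    return ts.replace(year=year, month=month, day=day)


def round_down(ts: datetime, unit: str) -> datetime:
    """Round down to the start of the year (y), month (M) or day (d)."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "y":
        return midnight.replace(month=1, day=1)
    if unit == "M":
        return midnight.replace(day=1)
    if unit == "d":
        return midnight
    raise ValueError(f"unsupported unit: {unit}")


if __name__ == "__main__":
    now = datetime(2023, 5, 8, 14, 30)  # hypothetical "now"
    print(round_down(minus_months(now, 12), "M"))  # corresponds to now-12M/M
    print(round_down(minus_months(now, 12), "y"))  # corresponds to now-1y/y
    print(round_down(now, "d"))                    # corresponds to now/d
```

Note that the query below uses `now-12M` without a rounding suffix, so its lower bound is exactly 12 months before the moment the request is executed.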

Query

GET /cars/_search
{
  "query": {
    "match": {
      "brand": "大众"
    }
  },
  "aggs": {
    "count_last_year": {
      "filter": {
        "range": {
          "sold_date": {
            "gte": "now-12M"
          }
        }
      },
      "aggs": {
        "sum_of_price_last_year": {
          "sum": {
            "field": "price"
          }
        }
      }
    }
  }
}

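Result (the filter bucket has doc_count 0, presumably because all three matching cars were sold in 2021, more than 12 months before this query was run; the hits are still returned because the filter sits inside aggs)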

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 0.9444616,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 0.9444616,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 0.9444616,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 0.9444616,
        "_source" : {
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      }
    ]
  },
  "aggregations" : {
    "count_last_year" : {
      "meta" : { },
      "doc_count" : 0,
      "sum_of_price_last_year" : {
        "value" : 0.0
      }
    }
  }
}
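
The difference in scope between the query and the filter aggregation can be sketched in plain Python. This is an emulation under stated assumptions, not Elasticsearch code: `brand` is matched by exact equality, the function name `search_with_filter_agg` is hypothetical, and the two values passed as `today` are made up, since the article does not say when the query was run.

```python
from datetime import date

# Documents copied from the results in this article (only the fields needed here).
CARS = [
    {"brand": "大众", "price": 258000, "sold_date": date(2021, 10, 28)},
    {"brand": "大众", "price": 123000, "sold_date": date(2021, 11, 5)},
    {"brand": "大众", "price": 1998000, "sold_date": date(2021, 8, 19)},
    {"brand": "奥迪", "price": 218000, "sold_date": date(2021, 11, 5)},
]


def search_with_filter_agg(cars, brand, today):
    """Emulate a match query plus a filter aggregation with a nested sum."""
    # Query scope: decides which documents are returned as hits.
    hits = [c for c in cars if c["brand"] == brand]
    # Simplified now-12M: same day one year earlier (would fail for 29 February).
    cutoff = today.replace(year=today.year - 1)
    # Filter aggregation scope: narrows only the documents that get aggregated.
    recent = [c for c in hits if c["sold_date"] >= cutoff]
    return {
        "hits_total": len(hits),
        "doc_count": len(recent),
        "sum_of_price_last_year": float(sum(c["price"] for c in recent)),
    }


if __name__ == "__main__":
    print(search_with_filter_agg(CARS, "大众", date(2023, 5, 8)))  # hypothetical run date
    print(search_with_filter_agg(CARS, "大众", date(2022, 6, 1)))  # hypothetical earlier run date
```

With the later run date the sketch reports 3 hits but an empty filter bucket, the same shape as the response above; with the earlier run date all three cars fall inside the 12-month window and the sum becomes 2379000.0.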
