ES Aggregations and Filtering
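- A natural extension of scoping an aggregation is filtering. Because an aggregation operates on the documents matched by the query, any filter that can be used in a query can also be applied to an aggregation.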
Data preparation
-
PUT cars
{
  "mappings": {
    "transactions": {
      "properties": {
        "color": { "type": "keyword" },
        "make": { "type": "keyword" },
        "price": { "type": "long" },
        "sold": { "type": "date" }
      }
    }
  }
}
-
POST /cars/transactions/_bulk
{ "index": {}}
{ "price" : 10000, "color" : "red", "make" : "honda", "sold" : "2014-10-28" }
{ "index": {}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
{ "index": {}}
{ "price" : 30000, "color" : "green", "make" : "ford", "sold" : "2014-05-18" }
{ "index": {}}
{ "price" : 15000, "color" : "blue", "make" : "toyota", "sold" : "2014-07-02" }
{ "index": {}}
{ "price" : 12000, "color" : "green", "make" : "toyota", "sold" : "2014-08-19" }
{ "index": {}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
{ "index": {}}
{ "price" : 80000, "color" : "red", "make" : "bmw", "sold" : "2014-01-01" }
{ "index": {}}
{ "price" : 25000, "color" : "blue", "make" : "ford", "sold" : "2014-02-12" }
{ "index": {}}
{ "price" : 8000, "color" : "blue", "make" : "bmw", "sold" : "2014-06-10" }
{ "index": {}}
{ "price" : 20000, "color" : "blue", "make" : "bmw", "sold" : "2014-10-10" }
Filtering
-
Requirement: find all cars priced at $10,000 or more, and compute the average price of those cars.
-
GET /cars/transactions/_search
{
  "query" : {
    "constant_score": {
      "filter": {
        "range": {
          "price": { "gte": 10000 }
        }
      }
    }
  },
  "aggs" : {
    "single_avg_price": {
      "avg" : { "field" : "price" }
    }
  }
}
-
constant_score skips relevance scoring, which makes the query cheaper, and allows the filter results to be cached.
-
"hits": {
  "total": 9,
  "max_score": 1,
  "hits": [
    { "_index": "cars", "_type": "transactions", "_id": "AX-0cAvB7h9TQ4Sk42yy", "_score": 1, "_source": { "price": 15000, "color": "blue", "make": "toyota", "sold": "2014-07-02" } },
    { "_index": "cars", "_type": "transactions", "_id": "AX-0cAvB7h9TQ4Sk42yv", "_score": 1, "_source": { "price": 10000, "color": "red", "make": "honda", "sold": "2014-10-28" } },
    { "_index": "cars", "_type": "transactions", "_id": "AX-0cAvB7h9TQ4Sk42yw", "_score": 1, "_source": { "price": 20000, "color": "red", "make": "honda", "sold": "2014-11-05" } },
    { "_index": "cars", "_type": "transactions", "_id": "AX-0cAvB7h9TQ4Sk42y2", "_score": 1, "_source": { "price": 25000, "color": "blue", "make": "ford", "sold": "2014-02-12" } },
    { "_index": "cars", "_type": "transactions", "_id": "AX--vgKqH29RD64k5sAk", "_score": 1, "_source": { "price": 20000, "color": "blue", "make": "bmw", "sold": "2014-10-10" } },
    { "_index": "cars", "_type": "transactions", "_id": "AX-0cAvB7h9TQ4Sk42yz", "_score": 1, "_source": { "price": 12000, "color": "green", "make": "toyota", "sold": "2014-08-19" } },
    { "_index": "cars", "_type": "transactions", "_id": "AX-0cAvB7h9TQ4Sk42y0", "_score": 1, "_source": { "price": 20000, "color": "red", "make": "honda", "sold": "2014-11-05" } },
    { "_index": "cars", "_type": "transactions", "_id": "AX-0cAvB7h9TQ4Sk42y1", "_score": 1, "_source": { "price": 80000, "color": "red", "make": "bmw", "sold": "2014-01-01" } },
    { "_index": "cars", "_type": "transactions", "_id": "AX-0cAvB7h9TQ4Sk42yx", "_score": 1, "_source": { "price": 30000, "color": "green", "make": "ford", "sold": "2014-05-18" } }
  ]
},
"aggregations": {
  "single_avg_price": { "value": 25777.777777777777 }
}
-
@Test
public void test07() {
    // avg metric over the price field
    AvgAggregationBuilder avgAggregationBuilder = AggregationBuilders.avg("single_avg_price").field("price");
    // price >= 10000, run as a filter so no relevance scores are computed
    BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery().filter(QueryBuilders.rangeQuery("price").gte(10000));
    ConstantScoreQueryBuilder constantScoreQueryBuilder = QueryBuilders.constantScoreQuery(boolQueryBuilder);
    SearchResponse searchResponse = elasticsearchTemplate.getClient().prepareSearch("cars")
            .setTypes("transactions")
            .setQuery(constantScoreQueryBuilder)
            .addAggregation(avgAggregationBuilder)
            .execute()
            .actionGet();
    // print every matching document
    SearchHit[] hits = searchResponse.getHits().getHits();
    for (SearchHit searchHit : hits) {
        Map<String, Object> sourceAsMap = searchHit.getSourceAsMap();
        Integer price = (Integer) sourceAsMap.get("price");
        String color = (String) sourceAsMap.get("color");
        String make = (String) sourceAsMap.get("make");
        String sold = (String) sourceAsMap.get("sold");
        System.out.println(price + "-" + color + "-" + make + "-" + sold);
    }
    // print the average price of the matching documents
    InternalAvg internalAvg = searchResponse.getAggregations().get("single_avg_price");
    System.out.println(internalAvg.getValue());
}
-
Result:
15000-blue-toyota-2014-07-02
10000-red-honda-2014-10-28
20000-red-honda-2014-11-05
25000-blue-ford-2014-02-12
20000-blue-bmw-2014-10-10
12000-green-toyota-2014-08-19
20000-red-honda-2014-11-05
80000-red-bmw-2014-01-01
30000-green-ford-2014-05-18
25777.777777777777
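As a sanity check on the average above, the same computation can be modeled in plain Java without a cluster. This is only an in-memory sketch of the semantics: the class `QueryScopeSketch` and its methods `atLeast` and `average` are hypothetical names chosen for illustration, not part of the Elasticsearch API, and returning NaN for an empty input is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// In-memory model of the example; this is not Elasticsearch client code.
final class QueryScopeSketch {

    // The ten prices indexed by the _bulk request, in the same order.
    static final List<Integer> PRICES = Arrays.asList(
            10000, 20000, 30000, 15000, 12000, 20000, 80000, 25000, 8000, 20000);

    // Plays the role of the range query: keep prices >= min.
    static List<Integer> atLeast(List<Integer> prices, int min) {
        List<Integer> kept = new ArrayList<>();
        for (int price : prices) {
            if (price >= min) {
                kept.add(price);
            }
        }
        return kept;
    }

    // Plays the role of the avg metric: it only sees what the query matched.
    // Returns NaN for an empty list (an assumption of this sketch).
    static double average(List<Integer> prices) {
        if (prices.isEmpty()) {
            return Double.NaN;
        }
        long sum = 0;
        for (int price : prices) {
            sum += price;
        }
        return (double) sum / prices.size();
    }

    public static void main(String[] args) {
        List<Integer> hits = atLeast(PRICES, 10000);
        System.out.println(hits.size());   // 9: only the 8000 car is excluded
        System.out.println(average(hits)); // 232000 / 9, about 25777.78
    }
}
```

The point of the sketch is the order of operations: the query narrows the document set first, and the metric is computed over that narrowed set only.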
Filter bucket
-
Filters only the documents that feed the aggregation; the search hits are not affected.
-
Requirement: find the cars whose make is ford, and compute the average price of those sold on or after 2014-03-01.
-
GET /cars/transactions/_search
{
  "query": {
    "match": { "make": "ford" }
  },
  "aggs": {
    "recent_sales": {
      "filter": {
        "range": {
          "sold": { "gte": "2014-03-01" }
        }
      },
      "aggs": {
        "average_price": {
          "avg": { "field": "price" }
        }
      }
    }
  }
}
-
Result:
"hits": {
  "total": 2,
  "max_score": 1.2039728,
  "hits": [
    { "_index": "cars", "_type": "transactions", "_id": "AX-0cAvB7h9TQ4Sk42y2", "_score": 1.2039728, "_source": { "price": 25000, "color": "blue", "make": "ford", "sold": "2014-02-12" } },
    { "_index": "cars", "_type": "transactions", "_id": "AX-0cAvB7h9TQ4Sk42yx", "_score": 0.2876821, "_source": { "price": 30000, "color": "green", "make": "ford", "sold": "2014-05-18" } }
  ]
},
"aggregations": {
  "recent_sales": {
    "doc_count": 1,
    "average_price": { "value": 30000 }
  }
}
-
@Test
public void test08() {
    // query scope: all ford cars
    MatchQueryBuilder matchQueryBuilder = QueryBuilders.matchQuery("make", "ford");
    // filter bucket: only cars sold on or after 2014-03-01 reach the nested avg
    FilterAggregationBuilder filterAggregationBuilder = AggregationBuilders.filter("recent_sales", QueryBuilders.rangeQuery("sold").gte("2014-03-01"));
    AvgAggregationBuilder avgAggregationBuilder = AggregationBuilders.avg("average_price").field("price");
    filterAggregationBuilder.subAggregation(avgAggregationBuilder);
    SearchResponse searchResponse = elasticsearchTemplate.getClient().prepareSearch("cars")
            .setTypes("transactions")
            .setQuery(matchQueryBuilder)
            .addAggregation(filterAggregationBuilder)
            .execute()
            .actionGet();
    // print every matching document
    SearchHit[] hits = searchResponse.getHits().getHits();
    for (SearchHit searchHit : hits) {
        Map<String, Object> sourceAsMap = searchHit.getSourceAsMap();
        Integer price = (Integer) sourceAsMap.get("price");
        String color = (String) sourceAsMap.get("color");
        String make = (String) sourceAsMap.get("make");
        String sold = (String) sourceAsMap.get("sold");
        System.out.println(price + "-" + color + "-" + make + "-" + sold);
    }
    // print the bucket's document count and the nested average
    InternalFilter internalFilter = searchResponse.getAggregations().get("recent_sales");
    long docCount = internalFilter.getDocCount();
    System.out.println(docCount);
    InternalAvg internalAvg = internalFilter.getAggregations().get("average_price");
    System.out.println(internalAvg.getValue());
}
-
Result:
25000-blue-ford-2014-02-12
30000-green-ford-2014-05-18
1
30000.0
-
The query returns two hits, but the aggregation covers only one document, because the car sold on 2014-02-12 was filtered out before aggregating.
-
The filter bucket behaves like any other bucket, so other buckets and metrics can be nested inside it freely.
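The difference between the query scope and the filter bucket can be modeled in plain Java. The following is a minimal in-memory sketch, not Elasticsearch client code: `FilterBucketSketch`, `Car`, `byMake`, `soldOnOrAfter`, and `averagePrice` are hypothetical names introduced for illustration, and the data is the ten documents indexed earlier.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// In-memory model of the filter bucket example; not Elasticsearch client code.
final class FilterBucketSketch {

    static final class Car {
        final int price;
        final String make;
        final LocalDate sold;

        Car(int price, String make, String sold) {
            this.price = price;
            this.make = make;
            this.sold = LocalDate.parse(sold);
        }
    }

    // The ten documents from the _bulk request (color omitted, it is not needed here).
    static final List<Car> CARS = Arrays.asList(
            new Car(10000, "honda", "2014-10-28"),
            new Car(20000, "honda", "2014-11-05"),
            new Car(30000, "ford", "2014-05-18"),
            new Car(15000, "toyota", "2014-07-02"),
            new Car(12000, "toyota", "2014-08-19"),
            new Car(20000, "honda", "2014-11-05"),
            new Car(80000, "bmw", "2014-01-01"),
            new Car(25000, "ford", "2014-02-12"),
            new Car(8000, "bmw", "2014-06-10"),
            new Car(20000, "bmw", "2014-10-10"));

    // Query scope: decides which documents come back as hits.
    static List<Car> byMake(List<Car> cars, String make) {
        List<Car> kept = new ArrayList<>();
        for (Car car : cars) {
            if (car.make.equals(make)) {
                kept.add(car);
            }
        }
        return kept;
    }

    // Filter bucket: narrows only the documents seen by the nested metric.
    static List<Car> soldOnOrAfter(List<Car> cars, LocalDate from) {
        List<Car> kept = new ArrayList<>();
        for (Car car : cars) {
            if (!car.sold.isBefore(from)) {
                kept.add(car);
            }
        }
        return kept;
    }

    // Nested avg metric; NaN for an empty bucket is an assumption of this sketch.
    static double averagePrice(List<Car> cars) {
        if (cars.isEmpty()) {
            return Double.NaN;
        }
        long sum = 0;
        for (Car car : cars) {
            sum += car.price;
        }
        return (double) sum / cars.size();
    }

    public static void main(String[] args) {
        List<Car> hits = byMake(CARS, "ford");
        List<Car> recentSales = soldOnOrAfter(hits, LocalDate.parse("2014-03-01"));
        System.out.println(hits.size());               // 2
        System.out.println(recentSales.size());        // 1
        System.out.println(averagePrice(recentSales)); // 30000.0
    }
}
```

The list of hits is never touched by `soldOnOrAfter`; only the input to the average shrinks, which mirrors why the response shows two hits but a `doc_count` of 1.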
Post filter
-
Filters only the search hits, not the aggregation results. Use post_filter.
-
Requirement: aggregate over all colors of ford cars, but return only the green ones as hits.
-
GET /cars/transactions/_search
{
  "query": {
    "match": { "make": "ford" }
  },
  "post_filter": {
    "term" : { "color" : "green" }
  },
  "aggs" : {
    "all_colors": {
      "terms" : { "field" : "color" }
    }
  }
}
-
"hits": {
  "total": 1,
  "max_score": 0.2876821,
  "hits": [
    { "_index": "cars", "_type": "transactions", "_id": "AX-0cAvB7h9TQ4Sk42yx", "_score": 0.2876821, "_source": { "price": 30000, "color": "green", "make": "ford", "sold": "2014-05-18" } }
  ]
},
"aggregations": {
  "all_colors": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      { "key": "blue", "doc_count": 1 },
      { "key": "green", "doc_count": 1 }
    ]
  }
}
-
@Test
public void test09() {
    // query scope: all ford cars (this is what the aggregation sees)
    MatchQueryBuilder matchQueryBuilder = QueryBuilders.matchQuery("make", "ford");
    // post filter: applied to the hits only, after aggregations are computed
    TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("color", "green");
    TermsAggregationBuilder termsAggregationBuilder = AggregationBuilders.terms("all_colors").field("color");
    SearchResponse searchResponse = elasticsearchTemplate.getClient().prepareSearch("cars")
            .setTypes("transactions")
            .setQuery(matchQueryBuilder)
            .setPostFilter(termQueryBuilder)
            .addAggregation(termsAggregationBuilder)
            .execute()
            .actionGet();
    // print every hit that survived the post filter
    SearchHit[] hits = searchResponse.getHits().getHits();
    for (SearchHit searchHit : hits) {
        Map<String, Object> sourceAsMap = searchHit.getSourceAsMap();
        Integer price = (Integer) sourceAsMap.get("price");
        String color = (String) sourceAsMap.get("color");
        String make = (String) sourceAsMap.get("make");
        String sold = (String) sourceAsMap.get("sold");
        System.out.println(price + "-" + color + "-" + make + "-" + sold);
    }
    // print each color bucket and its document count
    StringTerms stringTerms = searchResponse.getAggregations().get("all_colors");
    List<StringTerms.Bucket> buckets = stringTerms.getBuckets();
    for (StringTerms.Bucket bucket : buckets) {
        String keyAsString = bucket.getKeyAsString();
        long docCount = bucket.getDocCount();
        System.out.println(keyAsString + "-" + docCount);
    }
}
-
30000-green-ford-2014-05-18
blue-1
green-1
-
The search returns only one document, while the aggregation covers two. (Source: https://www.toymoban.com/news/detail-615131.html)
-
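In addition, post_filter is applied after the query has run and is meant to be used together with aggregations, typically when the search hits and the aggregation results need different filtering. Do not use it as a plain query filter, because that hurts performance.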
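The ordering that post_filter relies on (query first, then aggregation, then the post filter on the hits) can be sketched in plain Java. This is an in-memory illustration under stated assumptions, not Elasticsearch client code: `PostFilterSketch`, `where`, and `countByColor` are hypothetical names, and the buckets are sorted by key through a `TreeMap`, whereas a real terms aggregation orders by document count by default.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory model of the post_filter example; not Elasticsearch client code.
final class PostFilterSketch {

    static final int COLOR = 0;
    static final int MAKE = 1;

    // {color, make} for the ten documents from the _bulk request.
    static final String[][] CARS = {
            {"red", "honda"}, {"red", "honda"}, {"green", "ford"}, {"blue", "toyota"},
            {"green", "toyota"}, {"red", "honda"}, {"red", "bmw"}, {"blue", "ford"},
            {"blue", "bmw"}, {"blue", "bmw"}
    };

    // Keep the rows whose given column equals value.
    static List<String[]> where(List<String[]> cars, int column, String value) {
        List<String[]> kept = new ArrayList<>();
        for (String[] car : cars) {
            if (car[column].equals(value)) {
                kept.add(car);
            }
        }
        return kept;
    }

    // Plays the role of the terms aggregation on color.
    static Map<String, Integer> countByColor(List<String[]> cars) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String[] car : cars) {
            counts.merge(car[COLOR], 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // 1. The query decides what both the aggregation and the hits start from.
        List<String[]> matched = where(Arrays.asList(CARS), MAKE, "ford");
        // 2. The aggregation runs on the full query result.
        Map<String, Integer> allColors = countByColor(matched);
        // 3. The post filter trims the hits only.
        List<String[]> hits = where(matched, COLOR, "green");

        System.out.println(allColors);   // {blue=1, green=1}
        System.out.println(hits.size()); // 1
    }
}
```

Because step 3 runs after step 2, the color buckets still include blue even though no blue car is returned as a hit.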