采用了苏宁的这种结构:
查询的语句如下:
SELECT sum(user_number) AS user_number
FROM testdb.ch_agent_dist_user
RIGHT JOIN
(
WITH (
select bm as u1
from testdb.es_user_feature_int where labelname = ‘label1’ and labelvalue = ‘0’
) as u1,
(
select bm as u2
from testdb.es_user_feature_int where labelname = ‘label2’ and labelvalue = ‘1’
) as u2,
(
select bm as u3
from testdb.es_user_feature_int where labelname = ‘label3’ and labelvalue = ‘2’
) as u3
SELECT
’agent’ AS agentname,
bitmapCardinality(bitmapAnd(bitmapAnd(u1,u2),u3)) AS user_number
) T USING (agentname) settings enable_scalar_subquery_optimization = 0;
这个sql执行了32s,但是其中的with带子查询在每台机器上只需要执行3s。各位大佬有知道这个是什么原因么
查询效率没有达到预期的效果
1 回复
表结构如下,其中ch_agent_user是一张空表。 – **表(本地表) CREATE TABLE testdb.ch_agent_user( agentname String ) ENGINE = MergeTree() PARTITION BY agentname ORDER BY (agentname) SETTINGS index_granularity = 8192;
– **表(分布式表) CREATE TABLE testdb.ch_agent_dist_user AS testdb.ch_agent_user ENGINE = Distributed(‘report_shards_replicas’, ‘testdb’, ‘ch_agent_user’, cityHash64(agentname));
– int类型标签表 CREATE TABLE testdb.es_user_feature_int( labelname String, labelvalue String, bm AggregateFunction( groupBitmap, UInt64 ) ) ENGINE = AggregatingMergeTree() PARTITION BY labelname ORDER BY (labelname, labelvalue) SETTINGS index_granularity = 128;