如何在缺少值的联合上分析sql数据?

cu6pst1q  于 2021-07-26  发布在  Java
关注(0)|答案(1)|浏览(255)

注意:这个数据是模糊的,所以可能没有什么意义。
我有两个源数据表。表 meter_increment 保存时间戳和(水表)上自上一记录以来的增量值。表 meter_billing_sessions 包含表示用水计费时间的事件(在单独的会话中)。它包含一个时间戳和会话id。我在这些表中有一个按时间戳排序的数据并集(请确保向右滚动以显示我的笔记):

+----------------+-------------+-----------+-------------------+----------------+
| eventType      | dataSource  | sessionId |            amount |      timestamp |
|----------------+-------------+-----------+-------------------+----------------|
|                | WATER_METER |           |             82700 |  1588047625320 | \
|                | WATER_METER |           |            124928 |  1588049228489 |  |
|                | WATER_METER |           |            749568 |  1588060228780 |  | - SUM these 
|                | WATER_METER |           |            139264 |  1588066837299 | /
| BILLING_ACTIVE |             | eda6a919  |                 0 |  1588071842076 | <-- start of billing
|                | WATER_METER |           |           6601728 |  1588071842250 | \
| BILLING_ACTIVE |             | eda6a919  |                 0 |  1588072173722 |  \
|                | WATER_METER |           |           7202816 |  1588072173930 |   \
| BILLING_ACTIVE |             | eda6a919  |                 0 |  1588072738485 |    | - SUM these
|                | WATER_METER |           |          16607232 |  1588072738634 |   /
| BILLING_ACTIVE |             | eda6a919  |                 0 |  1588073073864 |  /
|                | WATER_METER |           |          12030976 |  1588073074009 | /
| BILLING_ACTIVE |             | eda6a919  |                 0 |  1588073142312 | <-- end of billing
|                | WATER_METER |           |            616448 |  1588073142489 | \
|                | WATER_METER |           |             32768 |  1588075199418 |  | - SUM these
|                | WATER_METER |           |              4096 |  1588075423888 | /
| BILLING_ACTIVE |             | 2d54d918  |                 0 |  1588077038705 | <-- start of billing
|                | WATER_METER |           |           5370880 |  1588077038845 | \
| BILLING_ACTIVE |             | 2d54d918  |                 0 |  1588077039792 |  \
|                | WATER_METER |           |            176128 |  1588077039868 |   \
| BILLING_ACTIVE |             | 2d54d918  |                 0 |  1588077113129 |    | - SUM these
|                | WATER_METER |           |            823296 |  1588077113216 |   / 
| BILLING_ACTIVE |             | 2d54d918  |                 0 |  1588077541749 |  / 
|                | WATER_METER |           |          22238208 |  1588077541880 | /
| BILLING_ACTIVE |             | 2d54d918  |                 0 |  1588077858532 | <-- end of billing
|                | WATER_METER |           |          32056564 |  1588077858642 | \
|                | WATER_METER |           |          24623884 |  1588077858642 |  \
| BILLING_ACTIVE |             | 130759f4  |                 0 |  1588078142145 |   | SPECIAL case,
|                | WATER_METER |           |                 0 |  1588078142229 |  /   SUM together
|                | WATER_METER |           |          66718720 |  1588078142229 | / 
| BILLING_ACTIVE |             | 6178b903  |                 0 |  1588078472832 | <-- start of billing
|                | WATER_METER |           |         105058304 |  1588078472903 | \
|                | WATER_METER |           |                 0 |  1588078472903 |  \
| BILLING_ACTIVE |             | 6178b903  |                 0 |  1588079039102 |   \
|                | WATER_METER |           |                 0 |  1588079039224 |    \
|                | WATER_METER |           |          52585472 |  1588079039224 |     \
| BILLING_ACTIVE |             | 6178b903  |                 0 |  1588079375536 |      | - SUM these
|                | WATER_METER |           |          38905856 |  1588079375649 |     /
|                | WATER_METER |           |                 0 |  1588079375649 |    /
| BILLING_ACTIVE |             | 6178b903  |                 0 |  1588079941631 |   /
|                | WATER_METER |           |                 0 |  1588079941748 |  /
|                | WATER_METER |           |          40554496 |  1588079941748 | /
| BILLING_ACTIVE |             | 6178b903  |                 0 |  1588080274318 | <-- end of billing
|                | WATER_METER |           |                 0 |  1588080274406 | \
|                | WATER_METER |           |          19591168 |  1588080274406 |  \
| BILLING_ACTIVE |             | 902359b4  |                 0 |  1588080842532 |   \ SPECIAL case,
|                | WATER_METER |           |                 0 |  1588080842666 |   /  SUM together
|                | WATER_METER |           |              1024 |  1591372498349 |  /
|                | WATER_METER |           |              2048 |  1591372525838 | /
+----------------+-------------+-----------+-------------------+----------------+

第一个 BILLING_ACTIVE 特定事件 sessionId 是指在给定会话中开始计费的用水量,最后一个事件的用水量相同 sessionId 是指帐单的结尾。
我想总结一下 amount 在计费会话内发生的事件(不必按sessionid拆分),这样我就可以知道在计费会话期间消耗了多少水,以及在计费会话之外消耗了多少水。
有一种特殊情况,如果某个sessionid有一个单独的计费事件,那么“围绕”它的数据将被视为不是计费会话的一部分(将其视为计费的瞬时开始和结束)。只有中间的数据 BILLING_ACTIVE 使用相同的 sessionId 被视为计费会话的一部分。
事先非常感谢。
编辑:我需要能够判断 amountbilled 这样地:

+-----------+----------------+--------------+-------------+-----------+---------------+
|    billed | eventtype      | datasource   | sessionid   |    amount |            ts |
|-----------+----------------+--------------+-------------+-----------+---------------|
|     False |                | WATER_METER  |             |     82700 | 1588047625320 |
|     False |                | WATER_METER  |             |    124928 | 1588049228489 |
|     False |                | WATER_METER  |             |    749568 | 1588060228780 |
|     False |                | WATER_METER  |             |    139264 | 1588066837299 |
|      True | BILLING_ACTIVE |              | eda6a919    |         0 | 1588071842076 |
|      True |                | WATER_METER  |             |   6601728 | 1588071842250 |
|      True | BILLING_ACTIVE |              | eda6a919    |         0 | 1588072173722 |
|      True |                | WATER_METER  |             |   7202816 | 1588072173930 |
|      True | BILLING_ACTIVE |              | eda6a919    |         0 | 1588072738485 |
|      True |                | WATER_METER  |             |  16607232 | 1588072738634 |
|      True | BILLING_ACTIVE |              | eda6a919    |         0 | 1588073073864 |
|      True |                | WATER_METER  |             |  12030976 | 1588073074009 |
|      True | BILLING_ACTIVE |              | eda6a919    |         0 | 1588073142312 |
|     False |                | WATER_METER  |             |    616448 | 1588073142489 |
|     False |                | WATER_METER  |             |     32768 | 1588075199418 |
|     False |                | WATER_METER  |             |      4096 | 1588075423888 |
|      True | BILLING_ACTIVE |              | 2d54d918    |         0 | 1588077038705 |
|      True |                | WATER_METER  |             |   5370880 | 1588077038845 |
|      True | BILLING_ACTIVE |              | 2d54d918    |         0 | 1588077039792 |
|      True |                | WATER_METER  |             |    176128 | 1588077039868 |
|      True | BILLING_ACTIVE |              | 2d54d918    |         0 | 1588077113129 |
|      True |                | WATER_METER  |             |    823296 | 1588077113216 |
|      True | BILLING_ACTIVE |              | 2d54d918    |         0 | 1588077541749 |
|      True |                | WATER_METER  |             |  22238208 | 1588077541880 |
|      True | BILLING_ACTIVE |              | 2d54d918    |         0 | 1588077858532 |
|     False |                | WATER_METER  |             |  32056564 | 1588077858642 |
|     False |                | WATER_METER  |             |  24623884 | 1588077858642 |
|     False | BILLING_ACTIVE |              | 130759f4    |         0 | 1588078142145 |
|     False |                | WATER_METER  |             |         0 | 1588078142229 |
|     False |                | WATER_METER  |             |  66718720 | 1588078142229 |
|      True | BILLING_ACTIVE |              | 6178b903    |         0 | 1588078472832 |
|      True |                | WATER_METER  |             | 105058304 | 1588078472903 |
|      True |                | WATER_METER  |             |         0 | 1588078472903 |
|      True | BILLING_ACTIVE |              | 6178b903    |         0 | 1588079039102 |
|      True |                | WATER_METER  |             |         0 | 1588079039224 |
|      True |                | WATER_METER  |             |  52585472 | 1588079039224 |
|      True | BILLING_ACTIVE |              | 6178b903    |         0 | 1588079375536 |
|      True |                | WATER_METER  |             |  38905856 | 1588079375649 |
|      True |                | WATER_METER  |             |         0 | 1588079375649 |
|      True | BILLING_ACTIVE |              | 6178b903    |         0 | 1588079941631 |
|      True |                | WATER_METER  |             |         0 | 1588079941748 |
|      True |                | WATER_METER  |             |  40554496 | 1588079941748 |
|      True | BILLING_ACTIVE |              | 6178b903    |         0 | 1588080274318 |
|     False |                | WATER_METER  |             |         0 | 1588080274406 |
|     False |                | WATER_METER  |             |  19591168 | 1588080274406 |
|     False | BILLING_ACTIVE |              | 902359b4    |         0 | 1588080842532 |
|     False |                | WATER_METER  |             |         0 | 1588080842666 |
|     False |                | WATER_METER  |             |      1024 | 1591372498349 |
|     False |                | WATER_METER  |             |      2048 | 1591372525838 |
+-----------+----------------+--------------+-------------+-----------+---------------+
5lhxktic

5lhxktic1#

如果没有你的回答,我只能想象你需要找到一个标识符来将你标记为必须求和的组分组在一起。vertica有两个会话标识符生成器,到目前为止我在其他任何地方都没有见过: CONDITIONAL_TRUE_EVENT() 以及 CONDITIONAL_CHANGE_EVENT() .
它们是olap函数。它们每次启动并重置为0 PARTITION BY 当某件事是真的或某件事改变时,分别递增一。
对你来说,我会 UNION SELECT 这两个表与您所做的一样,在不适用的地方使用null。然后围绕这个联合运行select,每次会话id不为null时生成一个新的会话标识符。最后,从这个select中再次选择,根据获得的会话标识符进行分组,sum()计算数量,并添加该组的最小和最大时间戳,以便于文档记录。
下面是两个输入表作为一个内嵌的 SELECT :

WITH
meter_increment(dataSource,amount,timestamp) AS (
          SELECT 'WATER_METER',82700,1588047625320
UNION ALL SELECT 'WATER_METER',124928,1588049228489
UNION ALL SELECT 'WATER_METER',749568,1588060228780
UNION ALL SELECT 'WATER_METER',139264,1588066837299
UNION ALL SELECT 'WATER_METER',6601728,1588071842250
UNION ALL SELECT 'WATER_METER',7202816,1588072173930
UNION ALL SELECT 'WATER_METER',16607232,1588072738634
UNION ALL SELECT 'WATER_METER',12030976,1588073074009
UNION ALL SELECT 'WATER_METER',616448,1588073142489
UNION ALL SELECT 'WATER_METER',32768,1588075199418
UNION ALL SELECT 'WATER_METER',4096,1588075423888
UNION ALL SELECT 'WATER_METER',5370880,1588077038845
UNION ALL SELECT 'WATER_METER',176128,1588077039868
UNION ALL SELECT 'WATER_METER',823296,1588077113216
UNION ALL SELECT 'WATER_METER',22238208,1588077541880
UNION ALL SELECT 'WATER_METER',32056564,1588077858642
UNION ALL SELECT 'WATER_METER',24623884,1588077858642
UNION ALL SELECT 'WATER_METER',0,1588078142229
UNION ALL SELECT 'WATER_METER',66718720,1588078142229
UNION ALL SELECT 'WATER_METER',105058304,1588078472903
UNION ALL SELECT 'WATER_METER',0,1588078472903
UNION ALL SELECT 'WATER_METER',0,1588079039224
UNION ALL SELECT 'WATER_METER',52585472,1588079039224
UNION ALL SELECT 'WATER_METER',38905856,1588079375649
UNION ALL SELECT 'WATER_METER',0,1588079375649
UNION ALL SELECT 'WATER_METER',0,1588079941748
UNION ALL SELECT 'WATER_METER',40554496,1588079941748
UNION ALL SELECT 'WATER_METER',0,1588080274406
UNION ALL SELECT 'WATER_METER',19591168,1588080274406
UNION ALL SELECT 'WATER_METER',0,1588080842666
UNION ALL SELECT 'WATER_METER',1024,1591372498349
UNION ALL SELECT 'WATER_METER',2048,1591372525838
)
,
meter_billing_sessions(eventType,sessionId,amount,timestamp) AS (
          SELECT 'BILLING_ACTIVE','eda6a919',0,1588071842076
UNION ALL SELECT 'BILLING_ACTIVE','eda6a919',0,1588072173722
UNION ALL SELECT 'BILLING_ACTIVE','eda6a919',0,1588072738485
UNION ALL SELECT 'BILLING_ACTIVE','eda6a919',0,1588073073864
UNION ALL SELECT 'BILLING_ACTIVE','eda6a919',0,1588073142312
UNION ALL SELECT 'BILLING_ACTIVE','2d54d918',0,1588077038705
UNION ALL SELECT 'BILLING_ACTIVE','2d54d918',0,1588077039792
UNION ALL SELECT 'BILLING_ACTIVE','2d54d918',0,1588077113129
UNION ALL SELECT 'BILLING_ACTIVE','2d54d918',0,1588077541749
UNION ALL SELECT 'BILLING_ACTIVE','2d54d918',0,1588077858532
UNION ALL SELECT 'BILLING_ACTIVE','130759f4',0,1588078142145
UNION ALL SELECT 'BILLING_ACTIVE','6178b903',0,1588078472832
UNION ALL SELECT 'BILLING_ACTIVE','6178b903',0,1588079039102
UNION ALL SELECT 'BILLING_ACTIVE','6178b903',0,1588079375536
UNION ALL SELECT 'BILLING_ACTIVE','6178b903',0,1588079941631
UNION ALL SELECT 'BILLING_ACTIVE','6178b903',0,1588080274318
UNION ALL SELECT 'BILLING_ACTIVE','902359b4',0,1588080842532
)

下面的选择来自这两个表:
根据您的答案编辑解决方案;重读一遍,您似乎还需要报告中的原始会话id。为此,会话id(仅在计费事件中获取,而不在度量中获取)需要与其前一行位于同一组中,并使用 MAX() 表达;所以布尔表达式改变了 sessionId 必须不为空。

WITH
combo (eventtype,datasource,sessionid,amount,ts) AS (
SELECT
  NULL
, datasource
, NULL
, amount
, timestamp
FROM meter_increment
UNION ALL SELECT
  eventtype
, NULL
, sessionid
, amount
, timestamp
FROM meter_billing_sessions
)
,
sessionized AS (
SELECT
  CONDITIONAL_TRUE_EVENT(LAG(sessionid) IS NOT NULL) OVER(w) AS sess_id
, *
FROM combo
WINDOW w AS (ORDER BY ts)
)
SELECT
  sess_id
, MAX(sessionid) AS sessionid
, MIN(ts) AS measure_start_ts
, MAX(ts) AS measure_end_ts
, SUM(amount) AS billed_consumption
FROM sessionized
GROUP BY 1
ORDER BY 1
;
-- out  sess_id | sessionid | measure_start_ts | measure_end_ts | billed_consumption 
-- out ---------+-----------+------------------+----------------+--------------------
-- out        0 | eda6a919  |    1588047625320 |  1588071842076 |            1096460
-- out        1 | eda6a919  |    1588071842250 |  1588072173722 |            6601728
-- out        2 | eda6a919  |    1588072173930 |  1588072738485 |            7202816
-- out        3 | eda6a919  |    1588072738634 |  1588073073864 |           16607232
-- out        4 | eda6a919  |    1588073074009 |  1588073142312 |           12030976
-- out        5 | 2d54d918  |    1588073142489 |  1588077038705 |             653312
-- out        6 | 2d54d918  |    1588077038845 |  1588077039792 |            5370880
-- out        7 | 2d54d918  |    1588077039868 |  1588077113129 |             176128
-- out        8 | 2d54d918  |    1588077113216 |  1588077541749 |             823296
-- out        9 | 2d54d918  |    1588077541880 |  1588077858532 |           22238208
-- out       10 | 130759f4  |    1588077858642 |  1588078142145 |           56680448
-- out       11 | 6178b903  |    1588078142229 |  1588078472832 |           66718720
-- out       12 | 6178b903  |    1588078472903 |  1588079039102 |          105058304
-- out       13 | 6178b903  |    1588079039224 |  1588079375536 |           52585472
-- out       14 | 6178b903  |    1588079375649 |  1588079941631 |           38905856
-- out       15 | 6178b903  |    1588079941748 |  1588080274318 |           40554496
-- out       16 | 902359b4  |    1588080274406 |  1588080842532 |           19591168
-- out       17 |           |    1588080842666 |  1591372525838 |               3072

相关问题