优化,为什么SQL查询在Oracle数据库19中运行非常慢

enxuqcxy  于 2023-08-04  发布在  Oracle
关注(0)|答案(2)|浏览(263)

想要优化一个查询,但是要花费太多的时间...如果他们能给予感谢的帮助。我有一个查询,运行在一个相当大的表,查询永远,大概做了一个完整的表扫描。这个查询是非常非常慢!

select 
transaction_id as APPL_ID
,cast(reason_desc as VARCHAR2(2000)) as APPL_REJECT_REASON_DESC
,cast(reason_code as VARCHAR2(2000)) as APPL_REJECT_REASON
,cast(user_reject as VARCHAR2(100)) as APPL_REJECT_USER
,cast(step_reject as VARCHAR2(100)) as APPL_REJECT_USER_ROLE
,cast(reason_desc_vn as VARCHAR2(2000)) as APPL_REJECT_REASON_DESC_VN
from(
          select wfi.item_id transaction_id
                ,cq1.name as reason_desc
                ,cq1.response as reason_code
                ,COALESCE(wft.executed_by, wft.recipient_shortname) as user_reject
                ,wft.profile_access_right_sname as step_reject              
                ,row_number() over (partition by wfi.item_id order by wft.start_date desc) stt
                ,cq1.name_1 as reason_desc_vn
               --SELECT 1
        from  STA.STA_ACL_WF_INSTANCE wfi 
        inner join STA.STA_ACL_WF_TASK wft 
            on (wfi.item = 'transaction' and wfi.wf_instance_id = wft.wf_instance_id)           
        inner join (
        SELECT t.task_id
        ,LISTAGG (t.response, ', ') WITHIN GROUP (ORDER BY  t.response) as response
        ,LISTAGG (t11.name, ', ') WITHIN GROUP (ORDER BY  t11.name) as  name
        ,LISTAGG (t11.name_1, ', ') WITHIN GROUP (ORDER BY  t11.name_1) as  name_1
        FROM (
        SELECT task_id,
          CAST(TRIM(regexp_substr(t.RESPONSE, '[^,]+', 1, levels.column_value)) AS VARCHAR2(100)) AS RESPONSE
        FROM sta.sta_acl_wf_task_col_quest t,
          table(cast(multiset(select level from dual connect by  level <= length (
                regexp_replace(t.RESPONSE, '[^,]+'))  + 1) as sys.OdciNumberList)) levels
          WHERE t.question in ('Reject Reason','Cancel Reason','Reject Reason For Recommendation','Reject Reason For Approval')
          AND t.response is not null
          ) t 
         LEFT JOIN STA.STA_ACL_STATIC_DATA_TABLE t11 on (t.response= t11.SHORTNAME)
        WHERE 1=1
        GROUP BY t.task_id
        )cq1 on wft.task_id = cq1.task_id
    ) t8 where stt=1

字符串
解释SQL

PLAN_TABLE_OUTPUT                                                                                                                 |
----------------------------------------------------------------------------------------------------------------------------------+
Plan hash value: 3937279373                                                                                                       |
                                                                                                                                  |
-----------------------------------------------------------------------------------------------------------------------------     |
| Id  | Operation                               | Name                      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |     |
-----------------------------------------------------------------------------------------------------------------------------     |
|   0 | SELECT STATEMENT                        |                           |  9560M|    53T|       |  3339M  (2)| 72:27:52 |     |
|*  1 |  VIEW                                   |                           |  9560M|    53T|       |  3339M  (2)| 72:27:52 |     |
|*  2 |   WINDOW NOSORT                         |                           |  9560M|  8431G|       |  3339M  (2)| 72:27:52 |     |
|   3 |    SORT GROUP BY                        |                           |  9560M|  8431G|  8580G|  3339M  (2)| 72:27:52 |     |
|*  4 |     HASH JOIN RIGHT OUTER               |                           |  9560M|  8431G|  3392K|   650M  (1)| 14:07:31 |     |
|   5 |      TABLE ACCESS FULL                  | STA_ACL_STATIC_DATA_TABLE | 19812 |  3153K|       |   399   (3)| 00:00:01 |     |
|   6 |      NESTED LOOPS                       |                           |  7985M|  5830G|       |    40M  (4)| 00:52:07 |     |
|*  7 |       HASH JOIN                         |                           |   488K|   364M|   446M|   958K  (3)| 00:01:15 |     |
|*  8 |        TABLE ACCESS FULL                | STA_ACL_WF_INSTANCE       |  1592K|   428M|       | 77403   (4)| 00:00:07 |     |
|*  9 |        HASH JOIN                        |                           |   986K|   470M|    57M|   787K  (3)| 00:01:02 |     |
|* 10 |         TABLE ACCESS FULL               | STA_ACL_WF_TASK_COL_QUEST |   986K|    46M|       |   396K  (5)| 00:00:32 |     |
|  11 |         TABLE ACCESS FULL               | STA_ACL_WF_TASK           |  5496K|  2364M|       |   139K  (3)| 00:00:11 |     |
|  12 |       COLLECTION ITERATOR SUBQUERY FETCH|                           | 16360 | 32720 |       |    80   (4)| 00:00:01 |     |
|* 13 |        CONNECT BY WITHOUT FILTERING     |                           |       |       |       |            |          |     |
|  14 |         FAST DUAL                       |                           |     1 |       |       |     3   (0)| 00:00:01 |     |
-----------------------------------------------------------------------------------------------------------------------------     |
                                                                                                                                  |
Predicate Information (identified by operation id):                                                                               |
---------------------------------------------------                                                                               |
                                                                                                                                  |
   1 - filter("STT"=1)                                                                                                            |
   2 - filter(ROW_NUMBER() OVER ( PARTITION BY "WFI"."ITEM_ID" ORDER BY INTERNAL_FUNCTION("WFT"."START_DATE") DESC                |
              )<=1)                                                                                                               |
   4 - access("T11"."SHORTNAME"(+)=SYS_OP_C2C(CAST(TRIM( REGEXP_SUBSTR ("T"."RESPONSE" /*+ LOB_BY_VALUE */                        |
              ,'[^,]+',1,VALUE(KOKBF$))) AS VARCHAR2(100))))                                                                      |
   7 - access("WFI"."WF_INSTANCE_ID"="WFT"."WF_INSTANCE_ID")                                                                      |
   8 - filter("WFI"."ITEM"=U'transaction')                                                                                        |
   9 - access("WFT"."TASK_ID"="TASK_ID")                                                                                          |
  10 - filter(("T"."QUESTION"=U'Cancel Reason' OR "T"."QUESTION"=U'Reject Reason' OR "T"."QUESTION"=U'Reject Reason               |
              For Approval' OR "T"."QUESTION"=U'Reject Reason For Recommendation') AND "T"."RESPONSE" /*+ LOB_BY_VALUE */  IS NOT |
              NULL)                                                                                                               |
  13 - filter(LEVEL<=LENGTH( REGEXP_REPLACE (:B1,'[^,]+'))+1)                                                                     |


您能指导我如何更改查询以获得更好的性能吗?如果你能帮忙的话,我将不胜感激。

xxls0lw8

xxls0lw81#

正则表达式是缓慢的和hierarchical queries may be slower than recursive queries
你可以使用简单的字符串函数来拆分字符串(这是一种更复杂但更高效的方法):

WITH responses (task_id, response, spos, epos) AS (
  SELECT task_id,
         response,
         1,
         INSTR(response, ',', 1)
  FROM   sta.sta_acl_wf_task_col_quest
  WHERE  question in (
           'Reject Reason',
           'Cancel Reason',
           'Reject Reason For Recommendation',
           'Reject Reason For Approval'
         )
  AND    response is not null
UNION ALL
  SELECT task_id,
         response,
         epos + 1,
         INSTR(response, ',', epos + 1)
  FROM   responses
  WHERE  epos > 0
)
SELECT task_id,
       CASE epos
       WHEN 0
       THEN SUBSTR(response, spos)
       ELSE SUBSTR(response, spos, epos - spos)
       END AS response
FROM   responses

字符串
但是,如果不使用逗号分隔变量存储值,而是使用表的单独行,则效果会更好。

7xllpg7q

7xllpg7q2#

尝试删除不必要的TABLE(CAST(MULTISET。更简单的表达式更可能导致更好的基数估计,这更可能导致更好的执行计划。
例如,下面是原始子查询的解释计划。(虽然我不得不添加文字,使代码在我的系统上工作。)查询返回4行,但Oracle认为它将返回8168。这是因为Oracle只是放弃了尝试估计不可预测的表函数,并返回一个接近块大小的数字。这也许可以解释你的解释计划的16360 -我猜你的系统有一个16 K块大小?

explain plan for
select * from
table(cast(multiset(select level from dual connect by  level <= length (
regexp_replace(/*It.RESPONSE*/ 'A,B,C,D', '[^,]+'))  + 1) as sys.OdciNumberList)) levels
WHERE /*t.question*/ 'Cancel Reason' in ('Reject Reason','Cancel Reason','Reject Reason For Recommendation','Reject Reason For Approval');

select * from table(dbms_xplan.display);

Plan hash value: 3985296316
 
-------------------------------------------------------------------------------------------
| Id  | Operation                          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |      |  8168 | 16336 |    29   (0)| 00:00:01 |
|   1 |  COLLECTION ITERATOR SUBQUERY FETCH|      |  8168 | 16336 |    29   (0)| 00:00:01 |
|*  2 |   CONNECT BY WITHOUT FILTERING     |      |       |       |            |          |
|   3 |    FAST DUAL                       |      |     1 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter(LEVEL<=2)

字符串
删除TABLE(CAST(MULTISET有助于Oracle更好地估计1行,这非常接近实际的4行。(但是如果您使用MTO的建议来避免存储逗号分隔的列表,则可以简化代码并进一步改进基数估计。

explain plan for
select * from
(select level from dual connect by  level <= length (
regexp_replace(/*It.RESPONSE*/ 'A,B,C,D', '[^,]+')) + 1) levels
WHERE /*t.question*/ 'Cancel Reason' in ('Reject Reason','Cancel Reason','Reject Reason For Recommendation','Reject Reason For Approval');

select * from table(dbms_xplan.display);

Plan hash value: 2403765415
 
--------------------------------------------------------------------------------------
| Id  | Operation                     | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |      |     1 |    13 |     2   (0)| 00:00:01 |
|   1 |  VIEW                         |      |     1 |    13 |     2   (0)| 00:00:01 |
|*  2 |   CONNECT BY WITHOUT FILTERING|      |       |       |            |          |
|   3 |    FAST DUAL                  |      |     1 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter(LEVEL<=4)


但这只是猜测!我们处理的是解释计划猜测,而不是执行计划的实际值。简化查询以更接近实际值几乎总是有帮助的,但有时它反而会使事情变得更糟。
如果你真的想优化你的查询,请尝试以下步骤并修改你的帖子以获得额外的反馈:
1.查询需要多长时间?你预计要花多长时间?
1.您的表有多大(使用DBA_SEGMENTS.BYTES),从每个表返回的行的百分比是多少?不要太担心全表扫描。索引更适合于从表中检索小百分比的行,而全表扫描更适合于从表中检索大百分比的行。如果查询的一部分必须不可避免地返回一个大表的50%,那么限制因素将是您的硬件读取该表的X GB数据的速度。
1.得到实际的数字而不是猜测。通过查询GV$SQL找到查询的SQL_ID,然后运行select dbms_sqltune.report_sql_monitor('<sql_id>') from dual;。结果将为您提供每个操作的 * 实际 * 时间量和实际行数。它会告诉你哪些操作需要担心,并为你提供线索,说明为什么Oracle选择了一个糟糕的计划。解释计划是一个不错的开始,但与实际价值相比,它们是可悲的。这一步可能要花上几个小时才能理解。

相关问题