How do I get the DAG of a Spark SQL query execution plan?

yfwxisqw · posted 2021-05-24 in Spark

I am doing some analysis of Spark SQL query execution plans. The plan printed by the explain() API is not very readable. In the Spark Web UI, on the other hand, a DAG graph is built and broken down into jobs, stages and tasks, which is much easier to read. Is there any way to build that graph from the execution plan, or from any API in code? If not, is there an API that can read the graph back from the UI?
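For reference, this is the kind of call I mean (a minimal sketch; the session and DataFrame names are just illustrative):

import org.apache.spark.sql.SparkSession

object ExplainDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("explain-demo")
      .master("local[*]")
      .getOrCreate()

    // an arbitrary query, just to have a plan to print
    val df = spark.range(0, 100)
      .selectExpr("id", "id % 10 AS bucket")
      .groupBy("bucket")
      .count()

    df.explain()     // physical plan only
    df.explain(true) // parsed, analyzed, optimized and physical plans

    spark.stop()
  }
}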

bxjv4tth #1

In my opinion, this project (https://github.com/absaoss/spline-spark-agent) is able to interpret the execution plan and render it in a readable form. In this example, the Spark job reads a file, converts it to CSV, and writes it to the local filesystem.
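Here is a minimal sketch of how the agent can be attached to such a job (the import, the enableLineageTracking() call and the spark.spline.producer.url key follow the spline-spark-agent documentation for the 0.5.x line; the producer URL and file paths are placeholders, so verify them against the agent version you use):

import org.apache.spark.sql.SparkSession
// implicit that adds enableLineageTracking() to SparkSession (Spline agent)
import za.co.absa.spline.harvester.SparkLineageInitializer._

object SplineDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("spark-spline-demo-application")
      // where the agent posts the captured plans; placeholder URL
      .config("spark.spline.producer.url", "http://localhost:8080/producer")
      .getOrCreate()

    // start capturing execution plans for every write action
    spark.enableLineageTracking()

    // the job described above: read a file and write it back out as CSV
    spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("src/main/resources/wikidata.csv")
      .write
      .mode("overwrite")
      .csv("output.csv")

    spark.stop()
  }
}

The agent can also be enabled without code changes, via spark-submit configuration (a query-execution listener class plus the same producer URL property); the project's README documents the exact keys.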
The sample output in JSON looks like the following:

{
  "id": "3861a1a7-ca31-4fab-b0f5-6dbcb53387ca",
  "operations": {
    "write": {
      "outputSource": "file:/output.csv",
      "append": false,
      "id": 0,
      "childIds": [
        1
      ],
      "params": {
        "path": "output.csv"
      },
      "extra": {
        "name": "InsertIntoHadoopFsRelationCommand",
        "destinationType": "csv"
      }
    },
    "reads": [
      {
        "inputSources": [
          "file:/Users/liajiang/Downloads/spark-onboarding-demo-application/src/main/resources/wikidata.csv"
        ],
        "id": 2,
        "schema": [
          "6742cfd4-d8b6-4827-89f2-4b2f7e060c57",
          "62c022d9-c506-4e6e-984a-ee0c48f9df11",
          "26f1d7b5-74a4-459c-87f3-46a3df781400",
          "6e4063cf-4fd0-465d-a0ee-0e5c53bd52b0",
          "2e019926-3adf-4ece-8ea7-0e01befd296b"
        ],
        "params": {
          "inferschema": "true",
          "header": "true"
        },
        "extra": {
          "name": "LogicalRelation",
          "sourceType": "csv"
        }
      }
    ],
    "other": [
      {
        "id": 1,
        "childIds": [
          2
        ],
        "params": {
          "name": "`source`"
        },
        "extra": {
          "name": "SubqueryAlias"
        }
      }
    ]
  },
  "systemInfo": {
    "name": "spark",
    "version": "2.4.2"
  },
  "agentInfo": {
    "name": "spline",
    "version": "0.5.5"
  },
  "extraInfo": {
    "appName": "spark-spline-demo-application",
    "dataTypes": [
      {
        "_typeHint": "dt.Simple",
        "id": "f0dede5e-8fe1-4c22-ab24-98f7f44a9a5a",
        "name": "timestamp",
        "nullable": true
      },
      {
        "_typeHint": "dt.Simple",
        "id": "dbe1d206-3d87-442c-837d-dfa47c88b9c1",
        "name": "string",
        "nullable": true
      },
      {
        "_typeHint": "dt.Simple",
        "id": "0d786d1e-030b-4997-b005-b4603aa247d7",
        "name": "integer",
        "nullable": true
      }
    ],
    "attributes": [
      {
        "id": "6742cfd4-d8b6-4827-89f2-4b2f7e060c57",
        "name": "date",
        "dataTypeId": "f0dede5e-8fe1-4c22-ab24-98f7f44a9a5a"
      },
      {
        "id": "62c022d9-c506-4e6e-984a-ee0c48f9df11",
        "name": "domain_code",
        "dataTypeId": "dbe1d206-3d87-442c-837d-dfa47c88b9c1"
      },
      {
        "id": "26f1d7b5-74a4-459c-87f3-46a3df781400",
        "name": "page_title",
        "dataTypeId": "dbe1d206-3d87-442c-837d-dfa47c88b9c1"
      },
      {
        "id": "6e4063cf-4fd0-465d-a0ee-0e5c53bd52b0",
        "name": "count_views",
        "dataTypeId": "0d786d1e-030b-4997-b005-b4603aa247d7"
      },
      {
        "id": "2e019926-3adf-4ece-8ea7-0e01befd296b",
        "name": "total_response_size",
        "dataTypeId": "0d786d1e-030b-4997-b005-b4603aa247d7"
      }
    ]
  }
}