Oozie: Sqoop dynamic target directory

egdjgwm8  posted on 2021-06-04  in  Hadoop

I am executing a Sqoop job from an Oozie workflow. I can create the target directory with a static name using the sqoop command below.

<action name="table1" cred="">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <command>job --exec EMPLOYEE --meta-connect jdbc:hsqldb:hsql://<host>:<port>/sqoop -- --target-dir /user/test/Employee/20150126</command>
        </sqoop>
        <ok to="end" />
        <error to="kill" />
</action>

I need to create a dynamic target directory that includes the date. I tried the following, but it did not work.

<action name="table1" cred="">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <command>job --exec EMPLOYEE --meta-connect jdbc:hsqldb:hsql://<host>:<port>/sqoop -- --target-dir /user/test/Employee/$(date +%Y%m%d)</command>
        </sqoop>
        <ok to="end" />
        <error to="kill" />
</action>

At runtime it fails with the following error.

3622 [main] INFO  org.apache.sqoop.Sqoop  - Running Sqoop version: 1.4.5-cdh5.2.0
  3957 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Error parsing arguments for import:
  3957 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: +%Y%m%d)
  Intercepting System.exit(1)
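
The message is consistent with how the Oozie Sqoop action handles `<command>`: as far as I know, the string is split on whitespace and the pieces are handed to Sqoop directly, without going through a shell, so `$(date +%Y%m%d)` is never evaluated and the space inside it produces a stray argument. A minimal shell sketch of that splitting (the `split_command` helper is hypothetical, and the `--meta-connect` option is left out for brevity):

```shell
#!/bin/sh
# Literal text of the <command> element. Single quotes stop this shell from
# expanding $(date ...), mirroring the fact that no shell expansion happens.
command='job --exec EMPLOYEE -- --target-dir /user/test/Employee/$(date +%Y%m%d)'

# Split a string on whitespace only and print one token per line.
split_command() {
    set -f                      # disable globbing while splitting
    for token in $1; do
        printf '%s\n' "$token"
    done
    set +f
}

split_command "$command"
```

The last token printed is `+%Y%m%d)`, the same text Sqoop reports as an unrecognized argument.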

cwxwcias1#

Capturing the output worked for me!

<action name='custom-var'>
   <shell xmlns="uri:oozie:shell-action:0.1">
      ...
     <exec>set_variable.sh</exec>
     <file>set_variable.sh</file>
     <capture-output/>
   </shell>
</action>

<action name='sqoop-test'>
  <sqoop xmlns="uri:oozie:sqoop-action:0.2">
    ...
    <command> --target-dir /test/${wf:actionData('custom-var')['var1']}  --m 1 </command>
  </sqoop>
</action>

set_variable.sh
echo "var1=$(date +%Y/%m/%d)"
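
With `<capture-output/>`, Oozie reads the script's standard output as Java-properties-style `key=value` lines, which is why the script only has to echo `var1=...`. A slightly fuller sketch of `set_variable.sh`, assuming any diagnostics should go to stderr so they do not mix into the captured output (the `print_var1` function name is only for illustration):

```shell
#!/bin/sh
# Print the date as a key=value pair on stdout.
print_var1() {
    printf 'var1=%s\n' "$(date +%Y/%m/%d)"
}

# Diagnostics go to stderr so stdout holds only key=value lines.
echo "computing var1" >&2
print_var1
```

Because the format contains slashes, the resulting target directory is nested by year, month, and day; use `+%Y%m%d` instead if a single directory level is wanted.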

yuvru6vn2#

You can pass the coordinator time from coordinator.xml to workflow.xml, so in the workflow you can do this:

/user/test/Employee/${timePassedFromCoordinator}

qnyhuwrf3#

In the coordinator, you can get the date and format it as shown below:

<action>
        <workflow>
            <app-path>${WF_Maig_1}</app-path>
            <configuration>

                <property><name>currentbatchtime</name><value>${coord:formatTime(coord:dateOffset(coord:nominalTime(),0,'DAY'),"yyyy-MM-dd")}</value></property>
                <property><name>nextbatchtime</name><value>${coord:formatTime(coord:dateOffset(coord:nominalTime(),1,'DAY'),"yyyy-MM-dd")}</value></property>
            </configuration>
        </workflow>
    </action>

Now you can use /user/test/employee/${currentbatchtime} in workflow.xml and in the properties file.


i2byvkas4#

You can use a shell action to set an env var to the date, as follows.
variable.sh


#!/bin/sh

outputDir=$(date +%Y%m%d)

workflow.xml

<action name='shell1'>
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>variable.sh</exec>
                      (or)
            <env-var>[outputDir=$(date +%Y%m%d)]</env-var>
        </shell>
        <ok to="table1" />
        <error to="fail" />
</action>
<action name="table1" cred="">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <command>job --exec EMPLOYEE --meta-connect jdbc:hsqldb:hsql://<host>:<port>/sqoop -- --target-dir /user/test/Employee/$(outputDir)</command>
        </sqoop>
        <ok to="end" />
        <error to="kill" />
</action>
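
One caveat with the script approach: a variable assigned inside `variable.sh` exists only in that script's process, so a later action cannot read it unless the script also prints it (for example as an `outputDir=...` line for `<capture-output/>`, which is my assumption about how it would be wired up). A small sketch of the scoping behaviour, using a hypothetical `run_child` helper:

```shell
#!/bin/sh
# Run the assignment in a separate process, as executing variable.sh would.
run_child() {
    sh -c 'outputDir=$(date +%Y%m%d); echo "child: $outputDir"'
}

run_child
# The parent shell never receives the child's variable.
echo "parent: [${outputDir:-unset}]"
```

The child prints the date, while the parent still sees the variable as unset.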
