ERROR 2998: Unhandled internal error. null - Apache Pig

zz2j4svz  posted on 2021-06-02 in Hadoop

I have written a Pig script in which I want to match a single column against multiple strings. For example:

A = FOREACH A1 GENERATE
    c1, c2, c3,

--i have substituted junk values--

case
when (  (
       column_name matches '.*abc.*'
    OR column_name matches '.*sdf.*'
    OR column_name matches '.*bcd.*'
    OR column_name MATCHES '.*def.*'
    OR column_name MATCHES '.*efg.*'
    OR column_name MATCHES '.*ggg.*'
    OR column_name MATCHES '.*ghi.*'
    OR column_name MATCHES '.*hij.*'
    OR column_name MATCHES '.*ijk.*'
    OR column_name MATCHES '.*jkl.*'
    OR column_name MATCHES '.*klm.*'
    OR column_name MATCHES '.*lmn.*'
    or column_name matches '.*mno.*'
    or column_name matches '.*mnb.*'
    or column_name matches '.*opq.*'
    or column_name matches '.*pqr.*'
    or column_name matches '.*qrs.*'
    or column_name matches '.*stuv.*'
    or column_name matches '.*tuvw.*'
    or column_name matches '.*wxy.*'
    or column_name matches '.*tuvwx.*'
    or column_name matches '.*xyz.*'
    .
    .
    .
    .
    .
    ) ) then 1
            else 0 end as c4;

I have observed that when the number of or column_name matches '--' clauses exceeds 672, the Pig script fails to run with the following error:

Pig Stack Trace
---------------
ERROR 2998: Unhandled internal error. null

java.lang.StackOverflowError
        at java.util.zip.Deflater.ensureOpen(Deflater.java:543)
        at java.util.zip.Deflater.deflate(Deflater.java:426)
        at java.util.zip.Deflater.deflate(Deflater.java:352)
        at java.util.zip.DeflaterOutputStream.deflate(DeflaterOutputStream.java:251)
        at java.util.zip.DeflaterOutputStream.write(DeflaterOutputStream.java:211)
        at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1876)
        at java.io.ObjectOutputStream$BlockDataOutputStream.write(ObjectOutputStream.java:1840)
        at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1533)
        at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
        at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
        at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
        at java.util.ArrayList.writeObject(ArrayList.java:742)
        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:988)
        at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1495)
        at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
        at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
        at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
        at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
        at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
        at java.util.ArrayList.writeObject(ArrayList.java:742)
        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:988)
        at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1495)
        at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
        at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
        at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
        at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
        at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
        at java.util.ArrayList.writeObject(ArrayList.java:742)
        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

Please suggest a solution or an alternative approach that meets this requirement.

dkqlctbz  #1

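You could consider writing a custom filter function, which gives you better control over memory consumption. Most likely you do not need regular expressions at all, only a substring search.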
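
As a rough illustration of the substring-search idea, the matching logic could be sketched in plain Java as follows. The class name SubstringMatcher, the method flag, and the sample patterns are hypothetical, and treating a null value as "no match" is an assumption of this sketch. The Pig wiring itself is deliberately left out so that the example stays self-contained.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: the core check that a custom Pig filter UDF could delegate to.
// It replaces a long chain of "column_name matches '.*xyz.*'" clauses with a loop
// over plain substrings, so no deeply nested expression is needed in the script.
class SubstringMatcher {
    private final List<String> needles;

    SubstringMatcher(List<String> needles) {
        this.needles = needles;
    }

    // Returns 1 if the value contains any of the substrings, otherwise 0,
    // mirroring the c4 column computed by the CASE expression in the question.
    // Assumption: a null value is treated as "no match".
    int flag(String value) {
        if (value == null) {
            return 0;
        }
        for (String needle : needles) {
            if (value.contains(needle)) {
                return 1;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Sample patterns taken from the placeholder values in the question.
        SubstringMatcher matcher = new SubstringMatcher(Arrays.asList("abc", "sdf", "bcd"));
        System.out.println(matcher.flag("xxabcxx"));      // prints 1
        System.out.println(matcher.flag("nothing here")); // prints 0
    }
}
```

In an actual Pig job, logic like this would typically live in a class extending org.apache.pig.FilterFunc, be packaged in a jar, and be loaded with REGISTER. The list of substrings could then be read from a file instead of being spelled out in the script. Note that String.contains and a pattern such as '.*abc.*' are only roughly equivalent: by default the regex dot does not match line terminators, so values containing newlines may behave differently.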
