Java: How to serialize an object in Hadoop (in HDFS)

Posted by p5fdfcr1 on 2021-06-02 in Hadoop


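I have a HashMap<String, ArrayList>. I want to serialize my HashMap object (hmap) to an HDFS location and later deserialize it in the mapper and reducer to use it.
To serialize the HashMap object to HDFS, I used the ordinary Java object serialization code shown below, but it fails with an error (Permission denied):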

try
{
    FileOutputStream fileOut = new FileOutputStream("hashmap.ser");
    ObjectOutputStream out = new ObjectOutputStream(fileOut);
    out.writeObject(hm);
    out.close();
}
catch (Exception e)
{
    e.printStackTrace();
}

I get the following exception:

java.io.FileNotFoundException: hashmap.ser (Permission denied)
    at java.io.FileOutputStream.open(Native Method)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:110)
    at KMerIndex.createIndex(KMerIndex.java:121)
    at MyDriverClass.formRefIndex(MyDriverClass.java:717)
    at MyDriverClass.main(MyDriverClass.java:768)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

Can someone suggest an approach, or share sample code, for serializing an object to HDFS in Hadoop?

Java hadoop mapreduce bigdata serialization

Source: https://stackoverflow.com/questions/37545602/how-to-serialize-object-in-hadoop-in-hdfs

1 Answer


at0kjp5o1#

Try using SerializationUtils from Apache Commons Lang.
Here are some of its methods:

static Object clone(Serializable object)  // Deep clones an Object using serialization.
static Object deserialize(byte[] objectData)  // Deserializes a single Object from an array of bytes.
static Object deserialize(InputStream inputStream)  // Deserializes an Object from the specified stream.
static byte[] serialize(Serializable obj)  // Serializes an Object to a byte array for storage/serialization.
static void serialize(Serializable obj, OutputStream outputStream)  // Serializes an Object to the specified stream.

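When storing to HDFS, you can store the byte[] returned by serialize. When reading the object back, you can cast the deserialized result to the corresponding type (for example, a File object) and get the original object back.
In my case, I stored a HashMap in an HBase column and then retrieved it as a HashMap, unchanged, in my mapper method. That worked well.
You can, of course, do it the same way.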
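The round trip described above can be sketched with plain JDK streams, which is roughly what the SerializationUtils methods do for you. This is only a minimal sketch under some assumptions: the class name MapSerializer, its method names, the Integer element type, and the sample key and values are all made up for illustration and do not come from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;

// Hypothetical helper class; the names are illustrative, not a library API.
final class MapSerializer {

    /** Writes obj to out using Java serialization, then closes the stream. */
    static void serialize(Serializable obj, OutputStream out) {
        try (ObjectOutputStream oos = new ObjectOutputStream(out)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads a single object from in, then closes the stream. */
    static Object deserialize(InputStream in) {
        try (ObjectInputStream ois = new ObjectInputStream(in)) {
            return ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Serializes obj into a byte array. */
    static byte[] toBytes(Serializable obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        serialize(obj, buffer);
        return buffer.toByteArray();
    }

    /** Rebuilds an object from a byte array produced by toBytes. */
    static Object fromBytes(byte[] data) {
        return deserialize(new ByteArrayInputStream(data));
    }

    public static void main(String[] args) {
        // Sample data, assumed for illustration only.
        HashMap<String, ArrayList<Integer>> hm = new HashMap<>();
        hm.put("ACGT", new ArrayList<>(Arrays.asList(3, 17)));

        byte[] data = toBytes(hm);

        // The deserialized result is an Object, so it has to be cast back.
        @SuppressWarnings("unchecked")
        HashMap<String, ArrayList<Integer>> restored =
                (HashMap<String, ArrayList<Integer>>) fromBytes(data);

        System.out.println(restored.get("ACGT")); // prints [3, 17]
    }
}
```

The stream-based methods are the ones that matter for HDFS. On a cluster, the OutputStream would presumably come from FileSystem.create(Path) and the InputStream from FileSystem.open(Path) in org.apache.hadoop.fs, so the data never goes through a local file.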
Another option is Apache Commons IO (see org.apache.commons.io.FileUtils), but then you need to copy the file to HDFS afterwards, since you want HDFS as the data store.

FileUtils.writeByteArrayToFile(new File("pathname"), myByteArray);
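For comparison, here is a JDK-only sketch of the same local write. It assumes the Permission denied error in the question comes from the relative path "hashmap.ser" being resolved against a working directory on the local file system that the process cannot write to, so it writes to a temporary file instead. The class name LocalStaging and its method are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class; the names are illustrative, not a library API.
final class LocalStaging {

    /** Writes data to a new file in the system temp directory and returns its path. */
    static Path writeToTempFile(byte[] data) {
        try {
            // The temp directory is normally writable by the current user.
            Path file = Files.createTempFile("hashmap", ".ser");
            Files.write(file, data);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] myByteArray = {1, 2, 3}; // stand-in for the serialized map
        Path file = writeToTempFile(myByteArray);
        System.out.println("Wrote " + file);
        // The local file still has to be copied to HDFS afterwards, for example
        // with "hdfs dfs -put" or FileSystem.copyFromLocalFile(src, dst).
    }
}
```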

Note: the Apache Commons IO and Apache Commons Lang jars are normally already available on a Hadoop cluster, although the bundled versions vary by distribution.

Answered on 2021-06-02
