MapReduce exception: LongWritable cannot be cast to Text

2023-06-13

There is a txt file whose contents look like this:

深圳订做T恤	5729944
深圳厂家t恤批发	5729945
深圳定做文化衫	5729944
文化衫厂家	5729944
订做文化衫	5729944
深圳t恤厂家	5729945

Each line is a search keyword followed by the ID of the category it belongs to, separated by a tab. I wanted to count how many keywords fall into each category, so I ran the following MapReduce program:

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.*;
import org.apache.hadoop.mapreduce.lib.output.*;
import org.apache.hadoop.util.*;

public class ClassCount extends Configured implements Tool {

    public static class ClassMap extends Mapper<Text, Text, Text, IntWritable> {
        private static final IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Text key, Text value, Context context)
                throws IOException, InterruptedException {
            String eachLine = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(eachLine, "\n");
            while (tokenizer.hasMoreTokens()) {
                StringTokenizer token = new StringTokenizer(tokenizer.nextToken(), "\t");
                String keyword = token.nextToken(); // unused for now
                String classId = token.nextToken();
                word.set(classId);
                context.write(word, one);
            }
        }
    }

    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public int run(String[] args) throws Exception {
        Job job = new Job(getConf());
        job.setJarByClass(ClassCount.class);
        job.setJobName("classCount");
        job.setMapperClass(ClassMap.class);
        job.setReducerClass(Reduce.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        boolean success = job.waitForCompletion(true);
        return success ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        int ret = ToolRunner.run(new ClassCount(), args);
        System.exit(ret);
    }
}

Running it threw the following exception:

java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text

I had assumed that because the input is text, Text should be the key type, but that is not how it works: with TextInputFormat, the key passed to map is the byte offset of the line within the file, so the key type has to be LongWritable.
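
Concretely, the fix looks like this (a sketch of the corrected mapper; I also dropped the redundant split on "\n", since TextInputFormat already hands map exactly one line per call):

public static class ClassMap extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable one = new IntWritable(1);
    private Text word = new Text();

    // TextInputFormat supplies the byte offset of each line as the key
    // and the line contents as the value.
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer token = new StringTokenizer(value.toString(), "\t");
        String keyword = token.nextToken(); // unused for now
        String classId = token.nextToken();
        word.set(classId);
        context.write(word, one);
    }
}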

But after making that change, the job failed with a different exception:

14/04/25 17:21:15 INFO mapred.JobClient: Task Id : attempt_201404211802_0040_m_000000_1, Status : FAILED
java.io.IOException: Type mismatch in value from map: expected org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.IntWritable

This one is more straightforward: the map output types need to be declared explicitly by adding the following two lines to the run method. If they are not set, Hadoop falls back to the job's final output types, which default to LongWritable and Text; that is exactly the Text-vs-IntWritable value mismatch reported above.

    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(IntWritable.class);
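
For clarity, here is what run() ends up looking like with the fix in place (the same setup as before, plus the two new lines):

public int run(String[] args) throws Exception {
    Job job = new Job(getConf());
    job.setJarByClass(ClassCount.class);
    job.setJobName("classCount");
    job.setMapperClass(ClassMap.class);
    job.setReducerClass(Reduce.class);
    // Declare the map output types explicitly; without this, Hadoop falls
    // back to the job-wide output types, which default to LongWritable/Text.
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(IntWritable.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    FileInputFormat.setInputPaths(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    boolean success = job.waitForCompletion(true);
    return success ? 0 : 1;
}

Run against the sample file at the top, the job should now output:

5729944	4
5729945	2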


