I recently wrote a Spark program that reads data from HBase. My Spark version is 1.6.1 and my HBase version is 0.96.2-hadoop2.
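For reference, here is a minimal sketch of what HBaseRead.scala might look like. Only the HBaseAdmin.isTableAvailable call is confirmed by the stack trace below (HBaseRead.scala:29); the table name and the row-count logic are assumptions for illustration:

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.{HBaseAdmin, Result}
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.spark.{SparkConf, SparkContext}

object HBaseRead {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("HBaseRead"))
    val conf = HBaseConfiguration.create()
    conf.set(TableInputFormat.INPUT_TABLE, "iteblog")  // assumed table name

    // isTableAvailable scans hbase:meta over protobuf-based RPC; this is the
    // call that fails with the IllegalAccessError in the stack trace below
    val admin = new HBaseAdmin(conf)
    if (admin.isTableAvailable("iteblog")) {
      val rdd = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
        classOf[ImmutableBytesWritable], classOf[Result])
      println(s"rows: ${rdd.count()}")
    }
    admin.close()
    sc.stop()
  }
}

After finishing the program, I submitted the job with the following command: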
[iteblog@www.iteblog.com ~]$ bin/spark-submit --master yarn-cluster \
    --executor-memory 4g --num-executors 5 --queue iteblog --executor-cores 2 \
    --class com.iteblog.hbase.HBaseRead \
    --jars spark-hbase-connector_2.10-1.0.3.jar,hbase-common-0.96.2-hadoop2.jar,hbase-server-0.96.2-hadoop2.jar,hbase-client-0.96.2-hadoop2.jar,hbase-protocol-0.96.2-hadoop2.jar,htrace-core-2.04.jar,guava-15.0.jar \
    spark-iteblog-1.0-SNAPSHOT.jar
When the job ran, it failed with the following exception:
16/11/03 09:45:05 ERROR ApplicationMaster: User class threw exception: org.apache.hadoop.hbase.DoNotRetryIOException: java.lang.IllegalAccessError: class com.google.protobuf.HBaseZeroCopyByteString cannot access its superclass com.google.protobuf.LiteralByteString
org.apache.hadoop.hbase.DoNotRetryIOException: java.lang.IllegalAccessError: class com.google.protobuf.HBaseZeroCopyByteString cannot access its superclass com.google.protobuf.LiteralByteString
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.translateException(RpcRetryingCaller.java:215)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:127)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:96)
    at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:264)
    at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:169)
    at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:164)
    at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:107)
    at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:720)
    at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:174)
    at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:82)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.isTableAvailable(HConnectionManager.java:911)
    at org.apache.hadoop.hbase.client.HBaseAdmin.isTableAvailable(HBaseAdmin.java:1074)
    at org.apache.hadoop.hbase.client.HBaseAdmin.isTableAvailable(HBaseAdmin.java:1082)
    at com.iteblog.hbase.HBaseRead$.main(HBaseRead.scala:29)
    at com.iteblog.hbase.HBaseRead.main(HBaseRead.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
Caused by: java.lang.IllegalAccessError: class com.google.protobuf.HBaseZeroCopyByteString cannot access its superclass com.google.protobuf.LiteralByteString
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at org.apache.hadoop.hbase.protobuf.RequestConverter.buildRegionSpecifier(RequestConverter.java:910)
    at org.apache.hadoop.hbase.protobuf.RequestConverter.buildScanRequest(RequestConverter.java:420)
    at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:297)
    at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:157)
    at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:57)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:120)
    ... 18 more
A quick Google search shows that this is caused by a bug in HBase (see https://issues.apache.org/jira/browse/HBASE-10304). The problem stems from an optimization made in HBASE-9867, which inadvertently introduced a classloader dependency: HBaseZeroCopyByteString lives in the com.google.protobuf package precisely so that it can extend the package-private LiteralByteString, and the JVM only permits that access when both classes are defined by the same classloader. The bug affects both jobs that use the -libjars option and "fat jar" jobs. The official description of the bug:
This is caused by an optimization introduced in HBASE-9867 that inadvertently introduced a classloader dependency. This affects both jobs using the -libjars option and "fat jar" jobs, those which package their runtime dependencies in a nested lib folder.
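To make the classloader dependency concrete, here is a small, hypothetical sketch (not from the original post; the jar path is an assumption) showing that the JVM keys class identity, and therefore runtime packages, on the defining classloader:

import java.net.{URL, URLClassLoader}

// Hypothetical sketch: the same class loaded through two independent
// classloaders yields two distinct Class objects, because the JVM keys
// class identity on (class name, defining classloader). Package-private
// access only works within one runtime package, i.e. same package name
// AND same classloader -- which is exactly what the optimization broke.
object ClassLoaderDemo {
  def main(args: Array[String]): Unit = {
    // assumed local path to the protobuf jar that HBase 0.96/0.98 depends on
    val jar = new URL("file:///home/iteblog/protobuf-java-2.5.0.jar")
    val loaderA = new URLClassLoader(Array(jar), null) // null parent: no delegation
    val loaderB = new URLClassLoader(Array(jar), null)
    val a = loaderA.loadClass("com.google.protobuf.LiteralByteString")
    val b = loaderB.loadClass("com.google.protobuf.LiteralByteString")
    println(a == b) // false: same name, but different runtime packages
  }
}

In the failing job, LiteralByteString (from protobuf-java) and HBaseZeroCopyByteString (from hbase-protocol) end up defined by different classloaders, so the package-private superclass is no longer accessible and defineClass throws IllegalAccessError.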
This bug affects HBase versions 0.96.1 through 0.98.4. The fix is to put a newer hbase-protocol jar on the classpath when running the job, so that HBaseZeroCopyByteString and LiteralByteString are loaded by the same classloader. Concretely:
Fix for Spark jobs
For a Spark job, the problem can be solved as follows:
[iteblog@www.iteblog.com ~]$ bin/spark-submit --master yarn-cluster \
    --executor-memory 4g --num-executors 5 --queue iteblog --executor-cores 2 \
    --class com.iteblog.hbase.HBaseRead \
    --jars spark-hbase-connector_2.10-1.0.3.jar,hbase-common-0.98.6-hadoop2.jar,hbase-server-0.98.6-hadoop2.jar,hbase-client-0.98.6-hadoop2.jar,hbase-protocol-0.98.6-hadoop2.jar,htrace-core-2.04.jar,guava-15.0.jar \
    --conf "spark.driver.extraClassPath=/home/iteblog/hbase-protocol-0.98.6-hadoop2.jar" \
    --conf "spark.executor.extraClassPath=/home/iteblog/hbase-protocol-0.98.6-hadoop2.jar" \
    spark-iteblog-1.0-SNAPSHOT.jar
The spark.driver.extraClassPath and spark.executor.extraClassPath settings above put hbase-protocol-0.98.6-hadoop2.jar on the classpath of the Spark driver and the executors, respectively. Note that all of the hbase-* jars must also be upgraded to 0.98.x or later; otherwise the exception persists. If you find these flags cumbersome, you can put the two --conf settings into $SPARK_HOME/conf/spark-defaults.conf instead, so that you do not have to repeat them on every submission.
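The equivalent spark-defaults.conf entries (using the same jar path as in the command above) would be:

spark.driver.extraClassPath      /home/iteblog/hbase-protocol-0.98.6-hadoop2.jar
spark.executor.extraClassPath    /home/iteblog/hbase-protocol-0.98.6-hadoop2.jar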
Fix for MapReduce jobs
If you hit the same exception in a MapReduce job, you can solve it by setting the HADOOP_CLASSPATH environment variable, like this:
[iteblog@www.iteblog.com ~]$ export HADOOP_CLASSPATH="/home/iteblog/hbase-protocol-0.98.6-hadoop2.jar"
[iteblog@www.iteblog.com ~]$ bin/hadoop --config /home/stack/conf_hadoop/ jar /home/iteblog/hbase-protocol-0.98.6-hadoop2.jar org.apache.hadoop.hbase.mapreduce.RowCounter hbaseTable
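If you are unsure which jar is actually winning on the classpath, a quick diagnostic (a hypothetical helper, not from the original post) is to ask the JVM where it loaded the conflicting class from:

// Hypothetical diagnostic: print which jar the conflicting protobuf class
// was loaded from, and by which classloader. Run it inside the affected JVM.
object WhereIsProtobuf {
  def main(args: Array[String]): Unit = {
    val cls = Class.forName("com.google.protobuf.LiteralByteString")
    // getCodeSource can be null for bootstrap-loaded classes
    val source = Option(cls.getProtectionDomain.getCodeSource).map(_.getLocation)
    println(s"LiteralByteString loaded from: ${source.getOrElse("bootstrap classpath")}")
    println(s"defined by classloader: ${cls.getClassLoader}")
  }
}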