HiveServer2(Spark ThriftServer)自定义权限认证

Hive 除了为我们提供一个 CLI 方式来查询数据之外，还给我们提供了基于 JDBC/ODBC 的方式来连接 Hive，这就是 HiveServer2（HiveServer）。但是默认情况下通过 JDBC 连接 HiveServer2 不需要任何的权限认证（hive.server2.authentication = NONE）；这意味着任何知道 ThriftServer 地址的人都可以连接我们的 Hive，并执行一些操作。更可怕的是，这些人甚至可以执行一些诸如 drop table xxx 之类的操作，这势必威胁到我们的数据安全。基于此，本文介绍如何通过自定义的方式来对连接 ThriftServer 的人进行有效的权限验证。

如果想及时了解Spark、Hadoop或者Hbase相关的文章，欢迎关注微信公共帐号：iteblog_hadoop

HiveServer2 支持多种用户安全认证方式：NONE, NOSASL, KERBEROS, LDAP, PAM ,CUSTOM 等等。我们可以通过 hive.server2.authentication 参数进行配置。这篇文章我们涉及到的配置是 hive.server2.authentication=CUSTOM，这时候我们需要通过 hive.server2.custom.authentication.class 参数配置我们自定义的权限认证类，这个类必须实现 org.apache.hive.service.auth.PasswdAuthenticationProvider 接口，org.apache.hive.service.auth.PasswdAuthenticationProvider 接口的定义如下：

package org.apache.hive.service.auth;

import javax.security.sasl.AuthenticationException;

public interface PasswdAuthenticationProvider {

  /**
   * The Authenticate method is called by the HiveServer2 authentication layer
   * to authenticate users for their requests.
   * If a user is to be granted, return nothing/throw nothing.
   * When a user is to be disallowed, throw an appropriate {@link AuthenticationException}.
   * <p/>
   * For an example implementation, see {@link LdapAuthenticationProviderImpl}.
   *
   * @param user     The username received over the connection request
   * @param password The password received over the connection request
   *
   * @throws AuthenticationException When a user is found to be
   *                                 invalid by the implementation
   */
  void Authenticate(String user, String password) throws AuthenticationException;
}

从上可知，我们需要实现 Authenticate 方法（为啥这个方法第一个字母大写？不应该是 authenticate 吗？）。现在我们来定义一个自己的授权认证类 IteblogPasswdAuthenticationProvider。为了简便，我把用户名和密码等数据写到了文件中；你要是愿意，你可以将这些信息存储到数据库中。实现代码非常简单，如下：

package com.iteblog.hive;

import com.google.common.base.Charsets;
import com.google.common.collect.Maps;
import com.google.common.io.Files;
import com.google.common.io.LineProcessor;
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hive.service.auth.PasswdAuthenticationProvider;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import javax.security.sasl.AuthenticationException;
import java.io.File;
import java.io.IOException;
import java.util.Map;

public class IteblogPasswdAuthenticationProvider implements PasswdAuthenticationProvider {
    private static Logger logger = LoggerFactory.getLogger("IteblogPasswdAuthenticationProvider");
    private static Map<String, String> lines = null;

    static {
        HiveConf hiveConf = new HiveConf();
        String filePath = hiveConf.get("hive.server2.custom.authentication.passwd.filepath");
        logger.warn("Start init configuration file: {}.", filePath);
        try {
            File file = new File(filePath);
            lines = Files.readLines(file, Charsets.UTF_8, new LineProcessor<Map<String, String>>() {
                Map<String, String> map = Maps.newHashMap();

                public boolean processLine(String line) {
                    String arr[] = line.split(",");
                    if (arr.length != 2) {
                        logger.error("Configuration error: {}", line);
                        return false;
                    }
                    map.put(arr[0], arr[1]);
                    return true;
                }

                public Map<String, String> getResult() {
                    return map;
                }
            });

        } catch (IOException e) {
            logger.error("Read configuration file error: {}", e.getMessage());
            System.exit(127);
        }
    }

    public void Authenticate(String username, String password) throws AuthenticationException {
        if (lines == null) {
            throw new AuthenticationException("Configuration file parser error!");
        }

        String passwd = lines.get(username);
        if (passwd == null) {
            throw new AuthenticationException("Unauthorized user: " + username + ", please contact iteblog to add an account.");
        } else if (!passwd.equals(password)) {
            throw new AuthenticationException("Incorrect password for " + username + ", please try again.");
        }

        logger.warn("User[{}] authorized success.", username);
    }
}

上面的代码逻辑非常简单，我就不介绍了。我们把用户名和密码信息写到了 /home/iteblog/hive-thrift-passwd 文件中，内容日下:

iteblog,123

并通过参数 hive.server2.custom.authentication.passwd.filepath 参数指定这个文件的路径。现在我们编译上面的类，得到的 jar 包放到 $HIVE_HOME/lib 路径下面（如果你是用 Spark，那你将这个 jar 包放到 $SPARK_HOME/jars 或 $SPARK_HOME/lib 目录下）。并在 hive-site.xml 文件里面加入以下的配置：

<property>
  <name>hive.server2.authentication</name>
  <value>CUSTOM</value>
</property>

<property>
  <name>hive.server2.custom.authentication.class</name>
  <value>com.iteblog.hive.IteblogPasswdAuthenticationProvider</value>
</property>

<property>
  <name>hive.server2.custom.authentication.passwd.filepath</name>
  <value>/home/iteblog/hive-thrift-passwd</value>
</property>

重启 thriftServer，现在我们来测试下这个设置是否生效：

beeline> !connect jdbc:hive2://www.iteblog.com:10000
Connecting to jdbc:hive2://www.iteblog.com:10000
Enter username for jdbc:hive2://www.iteblog.com:10000:
Enter password for jdbc:hive2://www.iteblog.com:10000:
Error: Could not open client transport with JDBC Uri: jdbc:hive2://www.iteblog.com:10000: Peer indicated failure: Error validating the login (state=08S01,code=0)
0: jdbc:hive2://www.iteblog.com:10000 (closed)>

上面测试我们登录的时候没有设置用户名和密码，如果你没有输入用户名系统默认会用 anonymous 用户进行登录，但是我们并没有配置这个用户登录，这时候应该是不能登录的。从上面终端输出也可以看出登录失败，而且日志里面有如下异常输出

18/01/10 16:50:40 WARN IteblogPasswdAuthenticationProvider: Start init configuration file: /home/iteblog/hive-thrift-passwd.
18/01/10 16:50:40 ERROR TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: Error validating the login [Caused by javax.security.sasl.AuthenticationException: Unauthorized user: anonymous, please contact iteblog to add an account.]
  at org.apache.hive.service.auth.PlainSaslServer.evaluateResponse(PlainSaslServer.java:109)
  at org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:539)
  at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:283)
  at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
  at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
  at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
Caused by: javax.security.sasl.AuthenticationException: Unauthorized user: anonymous, please contact iteblog to add an account.
  at com.iteblog.hive.IteblogPasswdAuthenticationProvider.Authenticate(IteblogPasswdAuthenticationProvider.java:58)
  at org.apache.hive.service.auth.CustomAuthenticationProviderImpl.Authenticate(CustomAuthenticationProviderImpl.java:47)
  at org.apache.hive.service.auth.PlainSaslHelper$PlainServerCallbackHandler.handle(PlainSaslHelper.java:106)
  at org.apache.hive.service.auth.PlainSaslServer.evaluateResponse(PlainSaslServer.java:102)
  ... 8 more

现在我们用正确的用户登录，但是密码是错误的，看下结果：

beeline> !connect jdbc:hive2://www.iteblog.com:10000
Connecting to jdbc:hive2://www.iteblog.com:10000
Enter username for jdbc:hive2://www.iteblog.com:10000: iteblog
Enter password for jdbc:hive2://www.iteblog.com:10000: ***
Error: Could not open client transport with JDBC Uri: jdbc:hive2://www.iteblog.com:10000: Peer indicated failure: Error validating the login (state=08S01,code=0)

从上面的结果可以看出，登录同样失败，日志里面也会有以下的异常信息：

18/01/10 16:53:39 ERROR TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: Error validating the login [Caused by javax.security.sasl.AuthenticationException: Incorrect password for iteblog, please try again.]
  at org.apache.hive.service.auth.PlainSaslServer.evaluateResponse(PlainSaslServer.java:109)
  at org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:539)
  at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:283)
  at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
  at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
  at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
Caused by: javax.security.sasl.AuthenticationException: Incorrect password for iteblog, please try again.
  at com.iteblog.hive.IteblogPasswdAuthenticationProvider.Authenticate(IteblogPasswdAuthenticationProvider.java:60)
  at org.apache.hive.service.auth.CustomAuthenticationProviderImpl.Authenticate(CustomAuthenticationProviderImpl.java:47)
  at org.apache.hive.service.auth.PlainSaslHelper$PlainServerCallbackHandler.handle(PlainSaslHelper.java:106)
  at org.apache.hive.service.auth.PlainSaslServer.evaluateResponse(PlainSaslServer.java:102)
  ... 8 more

最后如果用户名和密码都输入正确，将会成功登录：

beeline> !connect jdbc:hive2://www.iteblog.com:10000
Connecting to jdbc:hive2://www.iteblog.com:10000
Enter username for jdbc:hive2://www.iteblog.com:10000: iteblog
Enter password for jdbc:hive2://www.iteblog.com:10000: ***
Connected to: Spark SQL (version 2.2.1)
Driver: Hive JDBC (version 1.2.1.hive2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
1: jdbc:hive2://www.iteblog.com:10000>

并且日志里面有以下的输出：

18/01/10 16:52:00 WARN IteblogPasswdAuthenticationProvider: User[iteblog] authorized success.

本博客文章除特别声明，全部都是原创！
原创文章版权归过往记忆大数据（过往记忆）所有，未经许可不得转载。
本文链接: 【HiveServer2(Spark ThriftServer)自定义权限认证】（https://www.iteblog.com/archives/2318.html）

大佬，问个问题，spark sql可以使用ranger鉴权吗

AXL2021-05-26 22:31 回复

你好，这个权限控制能够控制更加细粒度的权限吗？比如库级、表级、行级等

AXL2021-05-24 11:55 回复

这个不行的，给的接口只能接收用户名和密码。库级、表级、行级这个可以看下 hive.security.authorization.enabled。
w3970907702021-05-24 17:39 回复

您好, 不同不同用户链接上后对数据库的权限是怎么定义的呢?

Nred2018-10-09 16:39 回复

您好，请问一下这种方式的权限只是限制了thriftserver的连接，而不同用户连接上之后对库和表的权限都是超级用户？是不受hive控制的吗？希望您能在百忙之中抽出时间解答一下。谢谢！

Etrone Zhang2018-09-20 16:42 回复