欢迎关注大数据技术架构与案例微信公众号:过往记忆大数据
过往记忆博客公众号iteblog_hadoop
欢迎关注微信公众号:
过往记忆大数据

Presto 自定义函数功能介绍及使用

Presto 内部提供了大量内置的函数,可以满足我们大部分的日常需求。但总是有一些场景需要我们自己写 UDF,为了满足这个需求,Presto 给我们提供了 Function Namespace Managers 模块使得我们可以实现直接的 UDF。本文将给大家介绍一下如何使用 Presto 的 UDF 功能。

如果需要使用 Function Namespace Managers 功能,需要把 presto-catalog-managers 模块里面的 jar 包移到 plugin 目录里面。另外需要在 etc/function-namespace/ 目录下创建一个后缀为 properties 的文件,比如我这里用到的是 mysql.properties(文件名随便设置),配置如下:

function-namespace-manager.name=mysql
database-url=jdbc:mysql://iteblog.com:3306/functions?user=root&password=password
function-namespaces-table-name=function_namespaces
functions-table-name=sql_functions

需要配置 database-url 的原因是因为 Presto 目前使用 MySQL 来存储用户创建的函数。另外,我们需要到上面配置的 MySQL 里面创建好名为 functions 的数据库(这个可以变)。当 Presto 集群启动的时候会到 MySQL 里面创建名为 function_namespaces 和 sql_functions 两张表。其中:

  • function-namespaces-table-name:用于存储函数 namespaces 的表;
  • functions-table-name:用于存储函数的定义

配置设置好之后,我们需要手动到 MySQL 里面插入如下数据:

INSERT INTO function_namespaces (catalog_name, schema_name)
VALUES('mysql', 'default');

上面 SQL 的意思是创建一个名为 mysql.default 的函数 namespace。

现在我们可以启动 Presto 集群了。

如果集群启动失败,并报了如下异常:

2022-03-14T15:21:43.364+0100    ERROR   main    com.facebook.presto.server.PrestoServer Unable to create injector, see the following errors:

1) Explicit bindings are required and com.facebook.drift.transport.client.MethodInvokerFactory<java.lang.annotation.Annotation> is not explicitly bound.
  while locating com.facebook.drift.transport.client.MethodInvokerFactory<java.lang.annotation.Annotation>
    for the 2nd parameter of com.facebook.drift.client.guice.DriftClientBinder$DriftClientBinderModule.getDriftClientFactory(DriftClientBinder.java:229)
  at com.facebook.drift.client.guice.DriftClientBinder$DriftClientBinderModule.getDriftClientFactory(DriftClientBinder.java:229) (via modules: com.facebook.presto.functionNamespace.execution.SimpleAddressSqlFunctionExecutorsModule -> com.facebook.presto.functionNamespace.execution.thrift.SimpleAddressThriftSqlFunctionExecutionModule -> com.facebook.drift.client.guice.DriftClientBinder$DriftClientBinderModule)

1 error
com.google.inject.CreationException: Unable to create injector, see the following errors:

1) Explicit bindings are required and com.facebook.drift.transport.client.MethodInvokerFactory<java.lang.annotation.Annotation> is not explicitly bound.
  while locating com.facebook.drift.transport.client.MethodInvokerFactory<java.lang.annotation.Annotation>
    for the 2nd parameter of com.facebook.drift.client.guice.DriftClientBinder$DriftClientBinderModule.getDriftClientFactory(DriftClientBinder.java:229)
  at com.facebook.drift.client.guice.DriftClientBinder$DriftClientBinderModule.getDriftClientFactory(DriftClientBinder.java:229) (via modules: com.facebook.presto.functionNamespace.execution.SimpleAddressSqlFunctionExecutorsModule -> com.facebook.presto.functionNamespace.execution.thrift.SimpleAddressThriftSqlFunctionExecutionModule -> com.facebook.drift.client.guice.DriftClientBinder$DriftClientBinderModule)

1 error
        at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:543)
        at com.google.inject.internal.InternalInjectorCreator.initializeStatically(InternalInjectorCreator.java:159)
        at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:106)
        at com.google.inject.Guice.createInjector(Guice.java:87)
        at com.facebook.airlift.bootstrap.Bootstrap.initialize(Bootstrap.java:251)
        at com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManagerFactory.create(MySqlFunctionNamespaceManagerFactory.java:59)
        at com.facebook.presto.metadata.FunctionAndTypeManager.loadFunctionNamespaceManager(FunctionAndTypeManager.java:166)
        at com.facebook.presto.metadata.StaticFunctionNamespaceStore.loadFunctionNamespaceManager(StaticFunctionNamespaceStore.java:81)
        at com.facebook.presto.metadata.StaticFunctionNamespaceStore.loadFunctionNamespaceManagers(StaticFunctionNamespaceStore.java:63)
        at com.facebook.presto.server.PrestoServer.run(PrestoServer.java:152)
        at com.facebook.presto.server.PrestoServer.main(PrestoServer.java:79)

如果出现以上异常,可以修改 com.facebook.presto.functionNamespace.execution.thrift.SimpleAddressThriftSqlFunctionExecutionModule 类,并在 configure 方法里面加入一下代码:

binder.bind(new TypeLiteral<MethodInvokerFactory<Annotation>>(){})
                .toInstance((MethodInvokerFactory<Annotation>) DriftNettyMethodInvokerFactory.createStaticDriftNettyMethodInvokerFactory(new DriftNettyClientConfig()));
configBinder(binder).bindConfig(DriftNettyClientConfig.class);

同时加上以下依赖:

<dependency>
      <groupId>com.facebook.drift</groupId>
      <artifactId>drift-transport-netty</artifactId>
</dependency>

如果启动 Presto 集群出现以下异常:

2022-03-14T15:21:43.364+0100    ERROR   main    com.facebook.presto.server.PrestoServer Unable to create injector, see the following errors:

1) Error notifying ProvisionListener com.facebook.airlift.bootstrap.LifeCycleModule$$Lambda$1903/1032026871 of com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManager.
 Reason: com.facebook.airlift.bootstrap.LifeCycleStartException: Exception in PostConstruct method com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManager::initialize()
  at com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManagerModule.configure(MySqlFunctionNamespaceManagerModule.java:43)
  while locating com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManager

1 error
com.google.inject.CreationException: Unable to create injector, see the following errors:

1) Error notifying ProvisionListener com.facebook.airlift.bootstrap.LifeCycleModule$$Lambda$1903/1032026871 of com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManager.
 Reason: com.facebook.airlift.bootstrap.LifeCycleStartException: Exception in PostConstruct method com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManager::initialize()
  at com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManagerModule.configure(MySqlFunctionNamespaceManagerModule.java:43)
  while locating com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManager

1 error
        at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:543)
        at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:186)
        at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:109)
        at com.google.inject.Guice.createInjector(Guice.java:87)
        at com.facebook.airlift.bootstrap.Bootstrap.initialize(Bootstrap.java:251)
        at com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManagerFactory.create(MySqlFunctionNamespaceManagerFactory.java:59)
        at com.facebook.presto.metadata.FunctionAndTypeManager.loadFunctionNamespaceManager(FunctionAndTypeManager.java:166)
        at com.facebook.presto.metadata.StaticFunctionNamespaceStore.loadFunctionNamespaceManager(StaticFunctionNamespaceStore.java:81)
        at com.facebook.presto.metadata.StaticFunctionNamespaceStore.loadFunctionNamespaceManagers(StaticFunctionNamespaceStore.java:63)
        at com.facebook.presto.server.PrestoServer.run(PrestoServer.java:152)
        at com.facebook.presto.server.PrestoServer.main(PrestoServer.java:79)
Caused by: com.facebook.airlift.bootstrap.LifeCycleStartException: Exception in PostConstruct method com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManager::initialize()
        at com.facebook.airlift.bootstrap.LifeCycleManager.startInstance(LifeCycleManager.java:245)
        at com.facebook.airlift.bootstrap.LifeCycleManager.addInstance(LifeCycleManager.java:211)
        at com.facebook.airlift.bootstrap.LifeCycleModule.provision(LifeCycleModule.java:62)
        at com.google.inject.internal.ProvisionListenerStackCallback$Provision.provision(ProvisionListenerStackCallback.java:120)
        at com.google.inject.internal.ProvisionListenerStackCallback.provision(ProvisionListenerStackCallback.java:66)
        at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:93)
        at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:306)
        at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
        at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:168)
        at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:39)
        at com.google.inject.internal.InternalInjectorCreator.loadEagerSingletons(InternalInjectorCreator.java:211)
        at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:182)
        ... 9 more
Caused by: org.jdbi.v3.core.ConnectionException: java.sql.SQLException: No suitable driver found for jdbc:mysql://hostname:3306
        at org.jdbi.v3.core.Jdbi.open(Jdbi.java:300)
        at org.jdbi.v3.core.LazyHandleSupplier.initHandle(LazyHandleSupplier.java:58)
        at org.jdbi.v3.core.LazyHandleSupplier.getHandle(LazyHandleSupplier.java:46)
        at org.jdbi.v3.sqlobject.statement.internal.CustomizingStatementHandler.invoke(CustomizingStatementHandler.java:148)
        at org.jdbi.v3.sqlobject.statement.internal.SqlUpdateHandler.invoke(SqlUpdateHandler.java:30)
        at org.jdbi.v3.sqlobject.SqlObjectFactory.lambda$null$13(SqlObjectFactory.java:163)
        at org.jdbi.v3.core.internal.JdbiThreadLocals.invokeInContext(JdbiThreadLocals.java:27)
        at org.jdbi.v3.core.LazyHandleSupplier.lambda$invokeInContext$1(LazyHandleSupplier.java:72)
        at org.jdbi.v3.core.internal.JdbiThreadLocals.invokeInContext(JdbiThreadLocals.java:27)
        at org.jdbi.v3.core.LazyHandleSupplier.invokeInContext(LazyHandleSupplier.java:71)
        at org.jdbi.v3.sqlobject.SqlObjectFactory.lambda$createInvocationHandler$14(SqlObjectFactory.java:162)
        at org.jdbi.v3.core.OnDemandExtensions.invoke(OnDemandExtensions.java:76)
        at org.jdbi.v3.core.OnDemandExtensions.lambda$null$0(OnDemandExtensions.java:63)
        at org.jdbi.v3.core.internal.JdbiThreadLocals.invokeInContext(JdbiThreadLocals.java:27)
        at org.jdbi.v3.core.OnDemandExtensions.lambda$null$1(OnDemandExtensions.java:62)
        at org.jdbi.v3.core.Jdbi.withExtension(Jdbi.java:439)
        at org.jdbi.v3.core.OnDemandExtensions.lambda$create$2(OnDemandExtensions.java:61)
        at com.sun.proxy.$Proxy201.createFunctionNamespacesTableIfNotExists(Unknown Source)
        at com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManager.initialize(MySqlFunctionNamespaceManager.java:97)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.facebook.airlift.bootstrap.LifeCycleManager.startInstance(LifeCycleManager.java:240)
        ... 20 more
Caused by: java.sql.SQLException: No suitable driver found for jdbc:mysql://hostname:3306
        at java.sql.DriverManager.getConnection(DriverManager.java:689)
        at java.sql.DriverManager.getConnection(DriverManager.java:270)
        at com.facebook.presto.functionNamespace.mysql.MySqlConnectionModule.lambda$setup$0(MySqlConnectionModule.java:41)
        at org.jdbi.v3.core.Jdbi.open(Jdbi.java:285)
        ... 43 more

请在 etc/jvm.properties 文件里加上以下内容:

-Xbootclasspath/a:/path/to/presto-server-0.246/plugin/mysql/mysql-connector-java-5.1.48.jar

启动完 Presto 集群之后,我们可以通过 Presto cli 创建一个 UDF:

presto:iteblog> CREATE FUNCTION mysql.default.my_tan(x double) RETURNS double DETERMINISTIC RETURNS NULL ON NULL INPUT RETURN sin(x) / cos(x);

关于 CREATE FUNCTION 的语法参见这里。创建完函数之后,我们就可以使用它了:

presto:iteblog> select mysql.default.my_tan(100) as xxx;
         xxx
---------------------
 -0.5872139151569291
(1 row)

Query 20220315_061526_00007_skndn, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0:25 [0 rows, 0B] [0 rows/s, 0B/s]

上面的例子是定义的函数内容是 sin(x) / cos(x),其实 Presto 还支持通过 thrift 来调用外部的函数服务,不过这个目前没看到文档有介绍,所以本文暂时不介绍如何使用,感兴趣的同学可以自己去研究。

本博客文章除特别声明,全部都是原创!
原创文章版权归过往记忆大数据(过往记忆)所有,未经许可不得转载。
本文链接: 【Presto 自定义函数功能介绍及使用】(https://www.iteblog.com/archives/10153.html)
喜欢 (1)
分享 (0)
发表我的评论
取消评论

表情
本博客评论系统带有自动识别垃圾评论功能,请写一些有意义的评论,谢谢!