Presto 内部提供了大量内置的函数,可以满足我们大部分的日常需求。但总是有一些场景需要我们自己写 UDF,为了满足这个需求,Presto 给我们提供了 Function Namespace Managers 模块使得我们可以实现直接的 UDF。本文将给大家介绍一下如何使用 Presto 的 UDF 功能。
如果需要使用 Function Namespace Managers 功能,需要把 presto-catalog-managers 模块里面的 jar 包移到 plugin 目录里面。另外需要在 etc/function-namespace/ 目录下创建一个后缀为 properties 的文件,比如我这里用到的是 mysql.properties(文件名随便设置),配置如下:
function-namespace-manager.name=mysql database-url=jdbc:mysql://iteblog.com:3306/functions?user=root&password=password function-namespaces-table-name=function_namespaces functions-table-name=sql_functions
需要配置 database-url 的原因是因为 Presto 目前使用 MySQL 来存储用户创建的函数。另外,我们需要到上面配置的 MySQL 里面创建好名为 functions 的数据库(这个可以变)。当 Presto 集群启动的时候会到 MySQL 里面创建名为 function_namespaces 和 sql_functions 两张表。其中:
- function-namespaces-table-name:用于存储函数 namespaces 的表;
- functions-table-name:用于存储函数的定义
配置设置好之后,我们需要手动到 MySQL 里面插入如下数据:
INSERT INTO function_namespaces (catalog_name, schema_name) VALUES('mysql', 'default');
上面 SQL 的意思是创建一个名为 mysql.default 的函数 namespace。
现在我们可以启动 Presto 集群了。
2022-03-14T15:21:43.364+0100 ERROR main com.facebook.presto.server.PrestoServer Unable to create injector, see the following errors: 1) Explicit bindings are required and com.facebook.drift.transport.client.MethodInvokerFactory<java.lang.annotation.Annotation> is not explicitly bound. while locating com.facebook.drift.transport.client.MethodInvokerFactory<java.lang.annotation.Annotation> for the 2nd parameter of com.facebook.drift.client.guice.DriftClientBinder$DriftClientBinderModule.getDriftClientFactory(DriftClientBinder.java:229) at com.facebook.drift.client.guice.DriftClientBinder$DriftClientBinderModule.getDriftClientFactory(DriftClientBinder.java:229) (via modules: com.facebook.presto.functionNamespace.execution.SimpleAddressSqlFunctionExecutorsModule -> com.facebook.presto.functionNamespace.execution.thrift.SimpleAddressThriftSqlFunctionExecutionModule -> com.facebook.drift.client.guice.DriftClientBinder$DriftClientBinderModule) 1 error com.google.inject.CreationException: Unable to create injector, see the following errors: 1) Explicit bindings are required and com.facebook.drift.transport.client.MethodInvokerFactory<java.lang.annotation.Annotation> is not explicitly bound. while locating com.facebook.drift.transport.client.MethodInvokerFactory<java.lang.annotation.Annotation> for the 2nd parameter of com.facebook.drift.client.guice.DriftClientBinder$DriftClientBinderModule.getDriftClientFactory(DriftClientBinder.java:229) at com.facebook.drift.client.guice.DriftClientBinder$DriftClientBinderModule.getDriftClientFactory(DriftClientBinder.java:229) (via modules: com.facebook.presto.functionNamespace.execution.SimpleAddressSqlFunctionExecutorsModule -> com.facebook.presto.functionNamespace.execution.thrift.SimpleAddressThriftSqlFunctionExecutionModule -> com.facebook.drift.client.guice.DriftClientBinder$DriftClientBinderModule) 1 error at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:543) at com.google.inject.internal.InternalInjectorCreator.initializeStatically(InternalInjectorCreator.java:159) at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:106) at com.google.inject.Guice.createInjector(Guice.java:87) at com.facebook.airlift.bootstrap.Bootstrap.initialize(Bootstrap.java:251) at com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManagerFactory.create(MySqlFunctionNamespaceManagerFactory.java:59) at com.facebook.presto.metadata.FunctionAndTypeManager.loadFunctionNamespaceManager(FunctionAndTypeManager.java:166) at com.facebook.presto.metadata.StaticFunctionNamespaceStore.loadFunctionNamespaceManager(StaticFunctionNamespaceStore.java:81) at com.facebook.presto.metadata.StaticFunctionNamespaceStore.loadFunctionNamespaceManagers(StaticFunctionNamespaceStore.java:63) at com.facebook.presto.server.PrestoServer.run(PrestoServer.java:152) at com.facebook.presto.server.PrestoServer.main(PrestoServer.java:79)
如果出现以上异常,可以修改 com.facebook.presto.functionNamespace.execution.thrift.SimpleAddressThriftSqlFunctionExecutionModule 类,并在 configure 方法里面加入一下代码:
binder.bind(new TypeLiteral<MethodInvokerFactory<Annotation>>(){}) .toInstance((MethodInvokerFactory<Annotation>) DriftNettyMethodInvokerFactory.createStaticDriftNettyMethodInvokerFactory(new DriftNettyClientConfig())); configBinder(binder).bindConfig(DriftNettyClientConfig.class);
同时加上以下依赖:
<dependency> <groupId>com.facebook.drift</groupId> <artifactId>drift-transport-netty</artifactId> </dependency>
如果启动 Presto 集群出现以下异常:
2022-03-14T15:21:43.364+0100 ERROR main com.facebook.presto.server.PrestoServer Unable to create injector, see the following errors: 1) Error notifying ProvisionListener com.facebook.airlift.bootstrap.LifeCycleModule$$Lambda$1903/1032026871 of com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManager. Reason: com.facebook.airlift.bootstrap.LifeCycleStartException: Exception in PostConstruct method com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManager::initialize() at com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManagerModule.configure(MySqlFunctionNamespaceManagerModule.java:43) while locating com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManager 1 error com.google.inject.CreationException: Unable to create injector, see the following errors: 1) Error notifying ProvisionListener com.facebook.airlift.bootstrap.LifeCycleModule$$Lambda$1903/1032026871 of com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManager. Reason: com.facebook.airlift.bootstrap.LifeCycleStartException: Exception in PostConstruct method com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManager::initialize() at com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManagerModule.configure(MySqlFunctionNamespaceManagerModule.java:43) while locating com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManager 1 error at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:543) at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:186) at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:109) at com.google.inject.Guice.createInjector(Guice.java:87) at com.facebook.airlift.bootstrap.Bootstrap.initialize(Bootstrap.java:251) at com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManagerFactory.create(MySqlFunctionNamespaceManagerFactory.java:59) at com.facebook.presto.metadata.FunctionAndTypeManager.loadFunctionNamespaceManager(FunctionAndTypeManager.java:166) at com.facebook.presto.metadata.StaticFunctionNamespaceStore.loadFunctionNamespaceManager(StaticFunctionNamespaceStore.java:81) at com.facebook.presto.metadata.StaticFunctionNamespaceStore.loadFunctionNamespaceManagers(StaticFunctionNamespaceStore.java:63) at com.facebook.presto.server.PrestoServer.run(PrestoServer.java:152) at com.facebook.presto.server.PrestoServer.main(PrestoServer.java:79) Caused by: com.facebook.airlift.bootstrap.LifeCycleStartException: Exception in PostConstruct method com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManager::initialize() at com.facebook.airlift.bootstrap.LifeCycleManager.startInstance(LifeCycleManager.java:245) at com.facebook.airlift.bootstrap.LifeCycleManager.addInstance(LifeCycleManager.java:211) at com.facebook.airlift.bootstrap.LifeCycleModule.provision(LifeCycleModule.java:62) at com.google.inject.internal.ProvisionListenerStackCallback$Provision.provision(ProvisionListenerStackCallback.java:120) at com.google.inject.internal.ProvisionListenerStackCallback.provision(ProvisionListenerStackCallback.java:66) at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:93) at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:306) at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40) at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:168) at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:39) at com.google.inject.internal.InternalInjectorCreator.loadEagerSingletons(InternalInjectorCreator.java:211) at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:182) ... 9 more Caused by: org.jdbi.v3.core.ConnectionException: java.sql.SQLException: No suitable driver found for jdbc:mysql://hostname:3306 at org.jdbi.v3.core.Jdbi.open(Jdbi.java:300) at org.jdbi.v3.core.LazyHandleSupplier.initHandle(LazyHandleSupplier.java:58) at org.jdbi.v3.core.LazyHandleSupplier.getHandle(LazyHandleSupplier.java:46) at org.jdbi.v3.sqlobject.statement.internal.CustomizingStatementHandler.invoke(CustomizingStatementHandler.java:148) at org.jdbi.v3.sqlobject.statement.internal.SqlUpdateHandler.invoke(SqlUpdateHandler.java:30) at org.jdbi.v3.sqlobject.SqlObjectFactory.lambda$null$13(SqlObjectFactory.java:163) at org.jdbi.v3.core.internal.JdbiThreadLocals.invokeInContext(JdbiThreadLocals.java:27) at org.jdbi.v3.core.LazyHandleSupplier.lambda$invokeInContext$1(LazyHandleSupplier.java:72) at org.jdbi.v3.core.internal.JdbiThreadLocals.invokeInContext(JdbiThreadLocals.java:27) at org.jdbi.v3.core.LazyHandleSupplier.invokeInContext(LazyHandleSupplier.java:71) at org.jdbi.v3.sqlobject.SqlObjectFactory.lambda$createInvocationHandler$14(SqlObjectFactory.java:162) at org.jdbi.v3.core.OnDemandExtensions.invoke(OnDemandExtensions.java:76) at org.jdbi.v3.core.OnDemandExtensions.lambda$null$0(OnDemandExtensions.java:63) at org.jdbi.v3.core.internal.JdbiThreadLocals.invokeInContext(JdbiThreadLocals.java:27) at org.jdbi.v3.core.OnDemandExtensions.lambda$null$1(OnDemandExtensions.java:62) at org.jdbi.v3.core.Jdbi.withExtension(Jdbi.java:439) at org.jdbi.v3.core.OnDemandExtensions.lambda$create$2(OnDemandExtensions.java:61) at com.sun.proxy.$Proxy201.createFunctionNamespacesTableIfNotExists(Unknown Source) at com.facebook.presto.functionNamespace.mysql.MySqlFunctionNamespaceManager.initialize(MySqlFunctionNamespaceManager.java:97) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.facebook.airlift.bootstrap.LifeCycleManager.startInstance(LifeCycleManager.java:240) ... 20 more Caused by: java.sql.SQLException: No suitable driver found for jdbc:mysql://hostname:3306 at java.sql.DriverManager.getConnection(DriverManager.java:689) at java.sql.DriverManager.getConnection(DriverManager.java:270) at com.facebook.presto.functionNamespace.mysql.MySqlConnectionModule.lambda$setup$0(MySqlConnectionModule.java:41) at org.jdbi.v3.core.Jdbi.open(Jdbi.java:285) ... 43 more
请在 etc/jvm.properties 文件里加上以下内容:
-Xbootclasspath/a:/path/to/presto-server-0.246/plugin/mysql/mysql-connector-java-5.1.48.jar
启动完 Presto 集群之后,我们可以通过 Presto cli 创建一个 UDF:
presto:iteblog> CREATE FUNCTION mysql.default.my_tan(x double) RETURNS double DETERMINISTIC RETURNS NULL ON NULL INPUT RETURN sin(x) / cos(x);
关于 CREATE FUNCTION 的语法参见这里。创建完函数之后,我们就可以使用它了:
presto:iteblog> select mysql.default.my_tan(100) as xxx; xxx --------------------- -0.5872139151569291 (1 row) Query 20220315_061526_00007_skndn, FINISHED, 1 node Splits: 17 total, 17 done (100.00%) 0:25 [0 rows, 0B] [0 rows/s, 0B/s]
上面的例子是定义的函数内容是 sin(x) / cos(x),其实 Presto 还支持通过 thrift 来调用外部的函数服务,不过这个目前没看到文档有介绍,所以本文暂时不介绍如何使用,感兴趣的同学可以自己去研究。
本博客文章除特别声明,全部都是原创!原创文章版权归过往记忆大数据(过往记忆)所有,未经许可不得转载。
本文链接: 【Presto 自定义函数功能介绍及使用】(https://www.iteblog.com/archives/10153.html)