
Spark 2.0 Technical Preview Officially Released for Download

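  The Spark community has built a technical preview of Spark 2.0, and after several days of voting the preview was officially released today. The article "Spark 2.0 Technical Preview: Easier, Faster, and Smarter" describes the new features of Spark 2.0 in detail. Overall, Spark 2.0 improves on three fronts: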

  1. Standard SQL support and a unified DataFrame and Dataset API. Spark can now run all 99 TPC-DS queries, which require many SQL:2003 features.

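  2. A second-generation Tungsten engine that builds on ideas from modern compilers and MPP databases and applies them to data processing. The main idea is to emit optimized code at runtime that collapses the entire query into a single function, eliminating virtual function calls and keeping intermediate data in CPU registers. Performance improves by up to 10x.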

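  3. A new approach to stream processing: the Structured Streaming APIs, which use the Catalyst optimizer to discover when a static program can be transparently turned into an incremental execution that works on dynamic or unbounded data streams. Viewing data through this structured lens simplifies stream processing.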
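To make the second point more concrete, here is a minimal sketch in plain Python (not Spark code) that contrasts an operator-at-a-time pipeline, where every row passes through several `next()` calls, with a single function generated and compiled at runtime for the same query. The query (count the rows equal to a key) and all class and function names are assumptions chosen for illustration; Tungsten's actual generated code is not shown here.

```python
# Illustrative only: plain Python, not Spark. All names are hypothetical.
from typing import Callable, Iterator, List, Optional


class Scan:
    """Leaf operator: hands out one row per next() call, None when exhausted."""

    def __init__(self, rows: List[int]) -> None:
        self._it: Iterator[int] = iter(rows)

    def next(self) -> Optional[int]:
        return next(self._it, None)


class Filter:
    """Pulls rows from its child until one satisfies the predicate."""

    def __init__(self, child: Scan, pred: Callable[[int], bool]) -> None:
        self.child = child
        self.pred = pred

    def next(self) -> Optional[int]:
        while True:
            row = self.child.next()
            if row is None or self.pred(row):
                return row


class Count:
    """Root operator: counts the rows produced by its child."""

    def __init__(self, child: Filter) -> None:
        self.child = child

    def result(self) -> int:
        n = 0
        while self.child.next() is not None:
            n += 1
        return n


def compile_count_where_equals(key: int) -> Callable[[List[int]], int]:
    """Generate source for the whole query as one function and compile it."""
    source = (
        "def fused(rows):\n"
        "    count = 0\n"
        "    for row in rows:\n"
        f"        if row == {key}:\n"
        "            count += 1\n"
        "    return count\n"
    )
    namespace: dict = {}
    exec(source, namespace)  # runtime code generation, in miniature
    return namespace["fused"]


rows = [1000, 7, 1000, 42, 1000]
operator_result = Count(Filter(Scan(rows), lambda r: r == 1000)).result()
fused_result = compile_count_where_equals(1000)(rows)
assert operator_result == fused_result == 3
```

Both versions return the same count. The fused version does so in one loop with no per-row calls between operators, which is the effect the paragraph above describes.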

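The third point, turning a static program into an incremental one, can also be sketched in plain Python. The example below is not the Structured Streaming API; it only assumes a word-count aggregation and a hypothetical `IncrementalWordCounts` runner to show that folding each new batch into saved state gives the same answer as rerunning the static program over all data seen so far. In Spark 2.0 the goal is for the Catalyst optimizer to perform this conversion for the user.

```python
# Illustrative only: plain Python, not the Structured Streaming API.
from collections import Counter
from typing import Iterable, List


def word_counts(lines: Iterable[str]) -> Counter:
    """The 'static' program: count words over a complete dataset."""
    counts: Counter = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


class IncrementalWordCounts:
    """Hypothetical runner: keeps running state and folds in each new batch."""

    def __init__(self) -> None:
        self.state: Counter = Counter()

    def process(self, batch: List[str]) -> Counter:
        # Only the new batch is scanned; earlier data is never re-read.
        self.state.update(word_counts(batch))
        return self.state


batches = [["spark streaming"], ["spark sql", "spark"], ["streaming"]]
runner = IncrementalWordCounts()
for batch in batches:
    latest = runner.process(batch)

# The incremental result matches the static program run over all the data.
all_lines = [line for batch in batches for line in batch]
assert latest == word_counts(all_lines)
```

The same `word_counts` function serves both modes, which mirrors the idea of writing one query and letting the engine decide whether to run it once or incrementally.
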
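  Excited to try Spark 2.0 after seeing these improvements? You can download it from https://dist.apache.org/repos/dist/release/spark/spark-2.0.0-preview/. This is not the final release; it is intended for early testing only and may contain very serious bugs.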

Spark 2.0.0-preview download link

https://dist.apache.org/repos/dist/release/spark/spark-2.0.0-preview/

Full announcement email

In the past the Spark community has created preview packages (not official releases) and used those as opportunities to ask community members to test the upcoming versions of Apache Spark. Several people in the Apache community have suggested we conduct votes for these preview packages and turn them into formal releases by the Apache foundation's standard. This is a result of that.

Note that this preview release should contain almost all the new features that will be in Apache Spark 2.0.0. However, it is not meant to be functional, i.e. the preview release contains critical bugs and documentation errors. To download, please see the bottom of this web page: http://spark.apache.org/downloads.html

For the list of known issues, please see https://issues.apache.org/jira/browse/SPARK-15520?jql=project%20%3D%20SPARK%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20%22Target%20Version%2Fs%22%20%3D%202.0.0

Note 1: The current download link goes directly to dist.apache.org. Once all the files are propagated to all mirrors, I will update the link to link to the mirror selector instead.

Note 2: This is the first time we are publishing official, voted preview releases. Would love to hear feedback.

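Unless otherwise stated, all articles on this blog are original.
Copyright of original articles belongs to 过往记忆大数据 (过往记忆); reproduction without permission is prohibited.
Permalink: Spark 2.0 Technical Preview Officially Released for Download (https://www.iteblog.com/archives/1678.html)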