的内容

Apache Iceberg 的时间旅行是如何实现的？

为了更好的使用 Apache Iceberg，理解其时间旅行是很有必要的，这个其实也会对 Iceberg 表的读取过程有个大致了解。不过在介绍 Apache Iceberg 的时间旅行（Time travel）之前，我们需要了解 Apache Iceberg 的底层数据组织结构。Apache Iceberg 的底层数据组织我们在《一条数据在 Apache Iceberg 之旅：写过程分析》这篇文章中详细地介绍了 Apache I

w397090770 4年前 (2020-11-29) 3759℃ 0评论4喜欢

Apache Iceberg

Apache Iceberg 小文件合并原理及实践

在《一条数据在 Apache Iceberg 之旅：写过程分析》这篇文章中我们分析了 Apache Iceberg 写数据的源码。如下是我们使用 Spark 写两次数据到 Iceberg 表的数据目录布局（测试代码在这里）：[code lang="bash"]/data/hive/warehouse/default.db/iteblog├── data│ └── ts_year=2020│ ├── id_bucket=0│ │ ├── 00000-0-19603f5a-d38a

w397090770 4年前 (2020-11-20) 6929℃ 6评论8喜欢

Apache Iceberg

一条数据在 Apache Iceberg 之旅：写过程分析

本文基于 Apache Iceberg 0.9.0 最新分支，主要分析 Apache Iceberg 中使用 Spark 2.4.6 来写数据到 Iceberg 表中，也就是对应 iceberg-spark2 模块。当然，Apache Iceberg 也支持 Flink 来读写 Iceberg 表，其底层逻辑也 Spark 类似，感兴趣的同学可以去看看。使用 Spark2 将数据写到 Apache Iceberg在介绍下面文章之前，我们先来看下在 Apache Spark 2.4.6 中写数

w397090770 4年前 (2020-11-12) 6050℃ 0评论9喜欢

过往记忆

专注于大数据技术构架及应用，微信公众号:过往记忆大数据

的内容

Apache Iceberg 的时间旅行是如何实现的？

Apache Iceberg 小文件合并原理及实践

一条数据在 Apache Iceberg 之旅：写过程分析