The Tabularium

February 2023 - Iceberg Community News

Tabular
Feb 28

Iceberg updates

  • Flink support for inspecting metadata tables (Liwei Li)

  • Flink write and read support for branch and tag (Amogh and Liwei Li)

  • Flink read and write support for Avro GenericRecord (Steven Wu)

  • Implemented branch commits for all operations (Namratha Keshavaprakash & Amogh Jahagirdar)

  • DDL support for tags and branches (Amogh, Liwei, Xuwei, and Chidayong)

  • Added branch/tag support to VERSION AS OF in Spark (Jack Ye)

  • Added position deletes metadata table (Szehon Ho)

  • Added a Snowflake catalog (Dennis Huo)

  • REST catalog supports lazy snapshot loading (Daniel Weeks)

  • Updated Spark to remove filters that are completely pushed down (Anton Okolnychyi)

  • Added Delta to Iceberg table conversion (Rushan and Eric)

  • Added SQL view representation implementation (Amogh)

  • Added S3 REST Signer client + REST spec (Eduard Tudenhoefner)

Releases

  • 1.2.0

    • https://github.com/apache/iceberg/milestone/28

    • Spark row-level commands and branches

PyIceberg updates

Version 0.3.0 was released. The new release includes the following; for more details, please refer to the PRs:

  • Full projection by Iceberg Field ID

  • Parallelization of job planning that speeds up the metadata operations by an order of magnitude

  • Overhaul of reading Avro files without the use of Pydantic to improve performance

  • Bugfix when reading a zero-length binary field

  • Bugfix for a PyArrow issue that led to multiple calls to the filesystem

  • Added Python support for static tables (Ruben)

  • Added DynamoDB catalog support in Python (Armin)

More information can be found on the project site, and the installer can be found here

Iceberg in the industry

  • StreamSets is adopting Iceberg support

  • ClickHouse has added Iceberg support

  • Databend has added Iceberg support to Datafuse

Blogs from the community

  • Starburst - Automated maintenance for Apache Iceberg tables in Starburst Galaxy

  • Dremio - Dealing with Data Incidents Using the Rollback Feature in Apache Iceberg

  • Ancestry - Scaling Ancestry.com: How to Optimize Updates for Iceberg Tables with 100 Billion Rows

Iceberg in the news

  • Datanami: Open Table Formats Square Off in Lakehouse Data Smackdown

  • BCG: A New Architecture to Manage Data Costs and Complexity

  • Forbes: An Open Approach To Hybrid Data Clouds

  • InfoQ: Netflix Built a Scalable Annotation Service Using Cassandra, Elasticsearch and Iceberg

Keep up to date on all things Iceberg

Watch for new blog posts added to the Blogs page

See the community Contribute guide to learn how to start contributing to Iceberg

Join the Apache Iceberg workspace on Slack using the invite link

Subscribe to the Apache Iceberg mailing list

Originally published at https://tabular.io on February 28, 2023.
