feat: split libiceberg into iceberg-core and iceberg-data #628

Closed
zhjwpku wants to merge 1 commit into apache:main from zhjwpku:split-iceberg-core-data

Conversation

zhjwpku (Collaborator) commented Apr 25, 2026

Replace the monolithic iceberg library with two targets:

  • iceberg-core — schema/types/partition/sort/transform, table/snapshot/metadata/transactions/updates, manifest handling, expressions, catalog (including in-memory), utilities, file I/O abstractions, and delete_file_index (kept in core because manifest_group holds DeleteFileIndex::Builder by value and only depends on core types).

  • iceberg-data — data/, deletes/, puffin/; publicly links iceberg-core.

iceberg_bundle links iceberg-data; iceberg_rest links iceberg-core only. There is no umbrella iceberg compatibility target.

CMake: optional EXPORT_NAME on add_iceberg_lib; hyphenated target names map to ICEBERG_*_EXPORTING via underscore substitution for MSVC. Per-package iceberg_install_cmake_package installs iceberg-core-config.cmake, iceberg-data-config.cmake, and matching *-targets.cmake under lib/cmake/<package>/; the install namespace remains iceberg:: (e.g. iceberg::iceberg-core_static). Vendored third-party installs are routed to the export set that owns them (core vs bundle vs rest). System find_dependency lists are split per package so iceberg-core consumers do not pull in Arrow/Parquet/Avro/cpr.
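The target-name-to-macro mapping described above can be modeled in a few lines. This is only an illustration of the naming rule (upper-case the target name, replace hyphens with underscores, append `_EXPORTING`); the PR implements it in CMake, and the function name `ExportingMacroFor` is hypothetical.

```cpp
#include <cctype>
#include <string>

// Models the naming rule only; this is not the CMake code from the PR.
// Hypothetical helper: "iceberg-core" becomes "ICEBERG_CORE_EXPORTING".
std::string ExportingMacroFor(const std::string& target_name) {
  std::string macro;
  macro.reserve(target_name.size() + 10);
  for (unsigned char ch : target_name) {
    // Hyphens are not valid in preprocessor identifiers, so map them to '_'.
    macro.push_back(ch == '-' ? '_' : static_cast<char>(std::toupper(ch)));
  }
  return macro + "_EXPORTING";
}
```

Under this rule, `iceberg-core` and `iceberg-data` yield `ICEBERG_CORE_EXPORTING` and `ICEBERG_DATA_EXPORTING`, which are valid macro names that MSVC builds can define per target.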

Symbol visibility: iceberg_export.h defines ICEBERG_CORE_EXPORT / ICEBERG_EXPORT (core-only toggles); iceberg_data_export.h defines ICEBERG_DATA_EXPORT for data-layer public APIs. Cross-boundary symbols used from data (WriterProperties, ArrowArrayGuard, ArrowSchemaGuard) are exported from core.

zhjwpku force-pushed the split-iceberg-core-data branch 4 times, most recently from c4f0097 to 4f31660 April 27, 2026 10:04
zhjwpku marked this pull request as ready for review April 27, 2026 10:48
Move data writers, deletes/, and puffin/ into a separate `iceberg_data`
library that links the existing `iceberg` target. `delete_file_index`
stays in `iceberg` because manifest_group embeds DeleteFileIndex::Builder
with only core dependencies.

* `iceberg` — unchanged target name for metadata/planning, expressions,
  manifests, catalog (incl. in-memory), utilities, file I/O abstractions,
  and delete_file_index.

* `iceberg_data` — data/, deletes/, puffin/; links `iceberg`.

`iceberg_bundle` links `iceberg_data` when the bundle is built.
`iceberg_rest` links `iceberg` and cpr only.

CMake: per-package `iceberg_install_cmake_package` installs
`iceberg-config.cmake`, `iceberg_data-config.cmake`, optional
`iceberg_bundle-config.cmake`, and matching `*-targets.cmake` under
`lib/cmake/<package>/`; install namespace remains `iceberg::`.
Split `find_dependency` lists so `iceberg` consumers do not pull in
Arrow/Parquet/Avro/cpr. Vendored third-party exports are attached to the
package that owns them (core vs bundle vs rest). Add
`iceberg_rest-config.cmake.in` for the REST catalog package.

Meson and tests link the split targets; the example uses
`find_package(iceberg_bundle)` and `iceberg::iceberg_bundle_static`.

Symbol visibility: `iceberg_export.h` documents `ICEBERG_EXPORT` for the
main `iceberg` DLL; `iceberg_data_export.h` adds `ICEBERG_DATA_EXPORT` for
data-layer public APIs. Shared builds define `${UPPER_LIB_NAME}_EXPORTING`
(e.g. ICEBERG_DATA_EXPORTING) for MSVC. Types used across the boundary
(e.g. WriterProperties, Arrow guards) remain exported from `iceberg`.
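
For readers unfamiliar with the pattern, a per-library export header along these lines might look like the sketch below. It is not the contents of `iceberg_data_export.h`: the `ICEBERG_DATA_STATIC` switch, the `demo` namespace, and `ExampleWriter` / `ExampleVersion` are hypothetical names used for illustration. Only `ICEBERG_DATA_EXPORT` and `ICEBERG_DATA_EXPORTING` come from the commit message.

```cpp
#include <string>

// Sketch of a per-library export macro, assuming the usual Windows/ELF split.
// ICEBERG_DATA_STATIC is a hypothetical opt-out for static builds.
#if defined(_WIN32) || defined(__CYGWIN__)
#  if defined(ICEBERG_DATA_STATIC)
#    define ICEBERG_DATA_EXPORT
#  elif defined(ICEBERG_DATA_EXPORTING)
// Defined by the build only while compiling the data library itself.
#    define ICEBERG_DATA_EXPORT __declspec(dllexport)
#  else
// Consumers of the DLL see the symbols as imports.
#    define ICEBERG_DATA_EXPORT __declspec(dllimport)
#  endif
#else
#  define ICEBERG_DATA_EXPORT __attribute__((visibility("default")))
#endif

namespace demo {

// Hypothetical data-layer type marked as part of the public ABI.
class ICEBERG_DATA_EXPORT ExampleWriter {
 public:
  std::string name() const { return "example"; }
};

// Hypothetical free function exported from the data library.
ICEBERG_DATA_EXPORT inline int ExampleVersion() { return 1; }

}  // namespace demo
```

Giving each shared library its own `*_EXPORTING` define matters on MSVC because a symbol must be `dllexport` in the library that defines it and `dllimport` everywhere else. With a single shared macro, the data library would wrongly mark core symbols as exports while compiling against core headers.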
zhjwpku force-pushed the split-iceberg-core-data branch from af1c2f1 to 6b85b4f April 27, 2026 12:01
zhjwpku closed this Apr 27, 2026