⚡ perf: track import bunch size incrementally instead of full serialize #40738

Open
lbajsarowicz wants to merge 2 commits into magento:2.4-develop from lbajsarowicz:perf/importexport-incremental-size

@lbajsarowicz
Contributor

Summary

  • Replace O(n²) strlen(serialize($bunchRows)) per-row check in AbstractEntity::_saveValidatedBunches() with incremental size tracking that serializes only the new row and adds to a running total.
  • The original code re-serialized the entire accumulated bunch array on every row iteration to check the size limit. For a 1000-row bunch, that is 1000 serialize() calls over a growing array, or roughly 500K row serializations in total (1 + 2 + ... + 1000 = 500,500).
  • Includes key serialization overhead in the estimate to prevent drift for non-first bunches with non-sequential array keys.
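
The idea in the bullets above can be sketched as follows. This is a minimal illustration, not the code from this PR: the function name `splitIntoBunchesIncremental`, the `$maxDataSize` parameter, and the exact comparison against the limit are assumptions, and it uses PHP's native `serialize()` rather than any Magento serializer.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not the PR's actual code): split rows into bunches
 * while tracking the serialized size incrementally.
 *
 * @param array<int|string, array> $rows        rows keyed by row number
 * @param int                      $maxDataSize assumed size limit in bytes
 * @return array<int, array<int|string, array>> list of bunches, keys preserved
 */
function splitIntoBunchesIncremental(array $rows, int $maxDataSize): array
{
    $bunches = [];
    $bunchRows = [];
    $runningSize = 0;

    foreach ($rows as $key => $row) {
        // Serialize only the new entry: its key plus its row data.
        $entrySize = strlen(serialize($key)) + strlen(serialize($row));

        // Flush the current bunch if adding this entry would exceed the limit.
        if ($bunchRows !== [] && $runningSize + $entrySize > $maxDataSize) {
            $bunches[] = $bunchRows;
            $bunchRows = [];
            $runningSize = 0;
        }

        $bunchRows[$key] = $row;
        $runningSize += $entrySize;
    }

    if ($bunchRows !== []) {
        $bunches[] = $bunchRows;
    }

    return $bunches;
}
```

Each row is serialized once, so the work per bunch grows linearly with the number of rows instead of quadratically.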

Performance Impact

| Metric | Before | After |
| --- | --- | --- |
| Serialization work per bunch | O(n²): full array per row | O(n): one row per row |
| 1000-row bunch | ~500K row serializations | 1000 row serializations |
| Size estimate accuracy | Exact | Within a few bytes (includes key overhead) |
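
The ~500K figure can be reproduced with a short count. This is a sketch under the assumption that the legacy check runs once per row on all rows accumulated so far and that the bunch is not flushed before row 1000; `countRowsSerialized` is a hypothetical name used only for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Count how many rows pass through serialize() while building one bunch of $n rows.
 *
 * @return array{full: int, incremental: int}
 */
function countRowsSerialized(int $n): array
{
    $full = 0;
    $incremental = 0;

    for ($k = 1; $k <= $n; $k++) {
        $full += $k;        // legacy: re-serialize all k accumulated rows
        $incremental += 1;  // incremental: serialize only the new row
    }

    return ['full' => $full, 'incremental' => $incremental];
}

print_r(countRowsSerialized(1000)); // full => 500500, incremental => 1000
```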

Test Plan

  • Unit test verifies that the incremental approach produces the same bunch splits as the legacy full-serialize check
  • Test covers multi-bunch scenario with non-sequential keys (after flush)
  • PHPCS clean (requires CI)
  • PHPStan clean (requires CI)
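
To illustrate why key overhead matters for bunches whose keys do not start at 0, the following sketch compares a per-entry sum against the full serialized length. It is not the unit test from this PR. The helper names `arrayWrapperSize` and `entriesSize` and the sample rows are made up for illustration, and it assumes PHP's native `serialize()` with plain scalar and array values.

```php
<?php
declare(strict_types=1);

/** Size of the "a:<count>:{" prefix plus the closing "}" that serialize() puts around array entries. */
function arrayWrapperSize(int $count): int
{
    return strlen('a:' . $count . ':{') + 1;
}

/** Sum of serialized key and value lengths, computed one entry at a time. */
function entriesSize(array $rows): int
{
    $size = 0;
    foreach ($rows as $key => $row) {
        $size += strlen(serialize($key)) + strlen(serialize($row));
    }
    return $size;
}

// Keys start at 1000 and skip values, as they might in a bunch that follows a flush.
$bunch = [
    1000 => ['sku' => 'A-1', 'qty' => 3],
    1001 => ['sku' => 'B-22', 'qty' => 10],
    1005 => ['sku' => 'C-333', 'qty' => 0],
];

$incremental = entriesSize($bunch) + arrayWrapperSize(count($bunch));
$full = strlen(serialize($bunch));

var_dump($incremental === $full); // bool(true)
```

A key such as 1000 serializes to `i:1000;` (7 bytes) while 0 serializes to `i:0;` (4 bytes), so an estimate that ignored keys would drift as row numbers grow.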

Ref: #40726

⭐ Support my work

Do you like the fix? Remember to react with "👍🏻" to get it merged faster,
then sponsor me on GitHub so I can spend more time fixing issues like this one.

Learn more at https://github.com/sponsors/lbajsarowicz

Replace O(n²) strlen(serialize($bunchRows)) per-row with incremental
size tracking that serializes only the new row and adds to a running total.

Ref: magento#40726

Include serialized key length in the running size total to prevent
estimate drift for non-first bunches with non-sequential array keys.
@m2-assistant

m2-assistant Bot commented Apr 14, 2026

Hi @lbajsarowicz. Thank you for your contribution!
Here are some useful tips on how you can test your changes using Magento test environment.
❗ Automated tests can be triggered manually with an appropriate comment:

  • @magento run all tests - run or re-run all required tests against the PR changes
  • @magento run <test-build(s)> - run or re-run specific test build(s)
    For example: @magento run Unit Tests

<test-build(s)> is a comma-separated list of build names.

Allowed build names are:
  1. Database Compare
  2. Functional Tests CE
  3. Functional Tests EE
  4. Functional Tests B2B
  5. Integration Tests
  6. Magento Health Index
  7. Sample Data Tests CE
  8. Sample Data Tests EE
  9. Sample Data Tests B2B
  10. Static Tests
  11. Unit Tests
  12. WebAPI Tests
  13. Semantic Version Checker

You can find more information about the builds here
ℹ️ Run only required test builds during development. Run all test builds before sending your pull request for review.


For more details, review the Code Contributions documentation.
Join Magento Community Engineering Slack and ask your questions in #github channel.

@lbajsarowicz
Contributor Author

@magento run all tests

engcom-Dash added the "Priority: P2" label (a defect with this priority could have functionality issues which are not to expectations) on Apr 21, 2026
github-project-automation (bot) moved this to Pending Review in Pull Requests Dashboard on Apr 21, 2026

Labels

Priority: P2 (a defect with this priority could have functionality issues which are not to expectations)
Progress: pending review

Projects

Status: Pending Review

Development

Successfully merging this pull request may close these issues.

2 participants