[Nexthop][Distro] Fix fboss_init.sh startup and package missing shared libs #1107
Open
benoit-nexthop wants to merge 1 commit into facebook:main
Chain of events that led us here:
Summary
`fboss_init.sh` wasn't sourcing `setup_fboss_env`, so `weutil` couldn't find its shared libraries and silently failed to generate `fruid.json`. Downstream services (`qsfp_service`, sw agent) would then crash on startup because `fruid.json` was missing.

- `setup_fboss_env`: derive the FBOSS root from the script's own location instead of `$PWD`, and add an idempotency guard to prevent duplicate `PATH` and `LD_LIBRARY_PATH` prepends when the script is sourced multiple times.
- `fboss_init.sh`: source `setup_fboss_env` at startup so `weutil` resolves its shared libraries; make `generate_fruid()` propagate failure with `return 1`; make `main()` `exit 1` if fruid generation fails so systemd marks `fboss_init.service` as failed rather than succeeded.
- `fboss_cmd_find.sh`: source `setup_fboss_env` before `exec` so every FBOSS command runs with the correct `PATH` and `LD_LIBRARY_PATH`.
- `package.py`: the thrift-python migration (commit f9d8d65, "CMake and Docker changes for thrift-python migration") switched FBOSS's manifest dependencies from the base variants (folly, wangle, fizz, mvfst, fmt, zstd, fbthrift) to their `-python` variants, which are built with `BUILD_SHARED_LIBS=ON`. As a result, every FBOSS binary (`platform_manager`, `weutil`, `fsdb`, ...), not just the Python extensions, now links against `libfolly.so`, `libwangle.so`, `libfmt.so`, `libfizz.so`, and `libmvfst.so` at runtime. Bundle these shared libraries in the forwarding-stack and platform-stack tarballs via a new `_find_getdeps_libs()` helper that locates `.so` files under each package's getdeps install prefix. `LIB_NAME_OVERRIDES` handles packages whose getdeps name differs from the library filename (e.g. `fmt-python` builds `libfmt.so`). Also add a `lib64` -> `lib` symlink in each tarball so binaries with `RPATH`/`RUNPATH` pointing to `lib64/` resolve correctly, and bundle `libunwind.so` from the LLVM build container, since CentOS does not ship a compatible version and getdeps does not install it.

Test Plan
These fixes were needed to properly image DUTs with the latest distro image build. Without them, no FBOSS agent would come up.
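For reviewers unfamiliar with the pattern, the `setup_fboss_env` change described in the summary (root derived from the script's own location, plus an idempotency guard) could look roughly like the sketch below. It is not the code from this PR: it assumes bash, the guard variable `FBOSS_ENV_LOADED`, the file name `setup_env_sketch`, and the `bin/` and `lib/` layout are all made up for illustration.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; names and layout are assumptions, not the PR's code.
set -eu

# Throwaway directory standing in for an FBOSS install root.
demo_root="$(cd "$(mktemp -d)" && pwd)"
mkdir -p "${demo_root}/bin" "${demo_root}/lib"

# Write a hypothetical environment script into the fake root.
cat > "${demo_root}/setup_env_sketch" <<'EOF'
# Idempotency guard: do nothing if this shell already sourced the file.
# FBOSS_ENV_LOADED is a hypothetical variable name.
if [ -n "${FBOSS_ENV_LOADED:-}" ]; then
    return 0
fi

# Resolve the directory that contains this file. BASH_SOURCE[0] is the
# path of the sourced file itself, so the result does not depend on $PWD.
FBOSS_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
export FBOSS_ROOT

export PATH="${FBOSS_ROOT}/bin:${PATH}"
# Only add a ':' separator when LD_LIBRARY_PATH already has content.
export LD_LIBRARY_PATH="${FBOSS_ROOT}/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
export FBOSS_ENV_LOADED=1
EOF

# Source from an unrelated working directory, twice.
cd /
before_path="$PATH"
source "${demo_root}/setup_env_sketch"
once_path="$PATH"
source "${demo_root}/setup_env_sketch"
twice_path="$PATH"

echo "FBOSS_ROOT resolved to: ${FBOSS_ROOT}"
if [ "$once_path" = "$twice_path" ]; then
    echo "PATH unchanged by second source"
fi
```

The guard uses an exported variable so that child shells which source the file again (for example, a wrapper script that execs another wrapper) also skip the prepend.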
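The failure propagation described for `fboss_init.sh` can be sketched in the same spirit. This is a minimal illustration under stated assumptions, not the PR's implementation: the real `weutil` invocation is not shown in this description, so `FRUID_CMD` and the two `fake_weutil_*` functions are hypothetical stand-ins, and the output path is a temporary file.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; FRUID_CMD and the fake_weutil_* helpers are
# hypothetical stand-ins for the real weutil call.

fake_weutil_ok()   { printf '{}\n'; }
fake_weutil_fail() { return 3; }

# Write the generator's output to "$1"; report failure instead of hiding it.
generate_fruid() {
    if ! "$FRUID_CMD" > "$1.tmp"; then
        echo "fruid generation failed" >&2
        rm -f "$1.tmp"
        return 1
    fi
    mv "$1.tmp" "$1"
}

# Exit non-zero on failure so a service manager such as systemd
# records the unit as failed rather than succeeded.
main() {
    generate_fruid "$1" || exit 1
    echo "init complete"
}

work="$(mktemp -d)"

# Run main in subshells so its 'exit' does not end this demo.
FRUID_CMD=fake_weutil_fail
( main "${work}/fruid.json" ) 2>/dev/null
fail_status=$?

FRUID_CMD=fake_weutil_ok
( main "${work}/fruid.json" ) >/dev/null
ok_status=$?

echo "failing generator: exit ${fail_status}; working generator: exit ${ok_status}"
```

Writing to a temporary file and renaming it only on success is a design choice of this sketch, so that a failed run never leaves a partial `fruid.json` behind for downstream services to read.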