docs: modernize multimodal tutorials and migrate legacy blob APIs#16918
Code Review
This pull request updates several Jupyter notebooks to demonstrate BigQuery DataFrames features, including the anywidget display mode, generative AI functions for image analysis, and vector search capabilities. Feedback focuses on fixing a broken dependency in the vector search notebook caused by a removed cell, correcting a variable name typo, avoiding bare exception blocks, and replacing hardcoded internal project IDs with placeholders to ensure the tutorials are functional and portable for public users.
| "cell_type": "code", | ||
| "execution_count": null, | ||
| "source": [ | ||
| "# Code calling .blob.audio_transcribe() was removed to satisfy the goal of removing public Blob APIs." |
Removing the code that populates flattened['Transcription'] while keeping subsequent cells that depend on this column (e.g., lines 574, 608, 639) will cause the notebook to fail with a KeyError. The legacy .blob call should be replaced with a modern equivalent (e.g., using bbq.ai.generate with multimodal support) rather than simply being removed to ensure the tutorial remains functional.
Reimplemented the broken audio transcription workflow using the modern bbq.obj APIs and bbq.ai.generate, so that flattened["Transcription"] is populated with the same output struct (content and status) that the downstream evaluation cells expect.
| "source": [ | ||
| "import bigframes.pandas as bpd\n", | ||
| "\n", | ||
| "MY_RPOJECT_ID = \"bigframes-dev\" # @param {type:\"string\"}\n", |
| " if str(uri).lower().endswith((\".png\", \".jpg\", \".jpeg\", \".webp\")):\n", | ||
| " return f'<img src=\"{url}\" width=\"{width}\">'\n", | ||
| " return f'<a href=\"{url}\" target=\"_blank\">{uri}</a>'\n", | ||
| " except: return \"Format Error\"\n", |
Rewrote the cell formatter's exception handler from a bare except: to except Exception: to follow best practice.
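For reference, the change described above can be sketched as follows. This is a minimal illustration rather than the notebook's actual cell: the function name `format_cell`, its signature, and the default `width` are assumptions, and only the branch bodies are taken from the diff fragment.

```python
# Hypothetical stand-in for the notebook's cell formatter; name and signature are assumed.
IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".webp")


def format_cell(uri, url, width=200):
    """Render image URIs as an <img> tag and everything else as a link."""
    try:
        if str(uri).lower().endswith(IMAGE_SUFFIXES):
            return f'<img src="{url}" width="{width}">'
        return f'<a href="{url}" target="_blank">{uri}</a>'
    except Exception:
        # Unlike a bare `except:`, this lets KeyboardInterrupt and SystemExit
        # propagate, so interrupting the notebook kernel still works.
        return "Format Error"
```

The practical difference is that `except Exception:` still catches ordinary formatting failures but no longer swallows interpreter-level signals.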
| " SELECT\n", | ||
| " AI.GENERATE(\n", | ||
| " prompt=>(\"Extract the values.\", OBJ.GET_ACCESS_URL(OBJ.FETCH_METADATA(OBJ.MAKE_REF(gcs_path, \"us.conn\")), \"r\")),\n", | ||
| " connection_id=>\"bigframes-dev.us.bigframes-default-connection\",\n", |
| "bpd.options.bigquery.location = \"US\"\n", | ||
| "\n", | ||
| "# Set to your GCP project ID.\n", | ||
| "bpd.options.bigquery.project = \"swast-scratch\"" |
| "import bigframes.bigquery as bbq\n", | ||
| "\n", | ||
| "vector_search_results = bbq.vector_search(\n", | ||
| " base_table=f\"swast-scratch.scipy2025.national_jukebox\",\n", |
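The review summary asks for hardcoded internal project IDs such as "bigframes-dev" and "swast-scratch" to be replaced with placeholders. One possible approach is sketched below, under stated assumptions: the helper names, the placeholder string, and the use of the `GOOGLE_CLOUD_PROJECT` environment variable are illustrative choices, not something the PR specifies, and the dataset and table would still need to exist in the reader's own project.

```python
import os

# Placeholder shown to readers who have not configured a project (assumed value).
PLACEHOLDER_PROJECT = "your-project-id"


def resolve_project_id(env=None):
    """Return the project ID from the environment, or the placeholder if unset."""
    env = os.environ if env is None else env
    return env.get("GOOGLE_CLOUD_PROJECT") or PLACEHOLDER_PROJECT


def qualified_table(project_id, dataset, table):
    """Build a fully qualified BigQuery table ID from its three parts."""
    return f"{project_id}.{dataset}.{table}"


PROJECT_ID = resolve_project_id()
# Dataset and table names are copied from the diff fragment above.
BASE_TABLE = qualified_table(PROJECT_ID, "scipy2025", "national_jukebox")
```

The notebook cells could then reference `PROJECT_ID` and `BASE_TABLE` instead of repeating a literal project name in each call.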
Overview
This PR modernizes the tutorial notebooks across the BigFrames ecosystem to align with the current public API, and completes the migration away from the deprecated internal .blob accessors and private loading mechanisms (e.g. _from_glob_path).
The core objective is to keep user-facing documentation in step with current package behavior, so that the notebooks run without raising deprecation warnings or internal attribute errors.
Key Changes
Fixes #< 478952827 > 🦕