Prototype hybrid lazy imports by hebaalazzeh · Pull Request #17572 · googleapis/google-cloud-python

hebaalazzeh · 2026-06-25T07:47:28Z

Overview

This PR resolves the severe initialization bottlenecks (~10s-13s) experienced by customers in serverless environments (Cloud Run/Functions) caused by eager loading cascades across generated service clients and protobuf types.

To deliver immediate latency relief to all current users while bridging the gap to native language performance, this PR adopts a Hybrid Lazy Loading Strategy (PEP 0810 + PEP 562).

Implementation Architecture

We utilize a branched architecture in the generated __init__.py files to provide optimal performance per environment without sacrificing IDE autocompletion or static type checking:

Native Python 3.15+ Path (PEP 0810): We emit a __lazy_modules__ set containing fully-qualified module paths. For users on Python 3.15+, the interpreter natively intercepts these imports and treats them as fast C-level lazy proxies.
Perfect Static Typing: Standard imports are wrapped in an if typing.TYPE_CHECKING: guard so that IDEs (IntelliSense) and static analyzers (MyPy) maintain zero-friction support.
Fallback Lazy Loading (PEP 562): For users on older environments (Python 3.9 - 3.14), eager imports are skipped. Instead, we utilize a centralized lazy_imports.attach_module() helper from google-api-core to dynamically load clients and types on-demand.

Example Structure:

import sys
import typing

# 1. Native C-Level Lazy Proxying for Python 3.15+
__lazy_modules__ = {
   f"{__name__}.services.accelerator_types",
   f"{__name__}.services.addresses",
   f"{__name__}.types.compute",
}

# 2. Perfect tooling support (MyPy / IntelliSense)
if typing.TYPE_CHECKING or sys.version_info >= (3, 15):
   from .services.accelerator_types import AcceleratorTypesClient
   from .services.addresses import AddressesClient
   from .types.compute import Address

# 3. Fallback lazy loading for Python 3.9 - 3.14 (PEP 562)
else:
   from google.api_core import lazy_imports
   _LAZY_IMPORTS = {
       "AcceleratorTypesClient": ".services.accelerator_types",
       "AddressesClient": ".services.addresses",
       "Address": ".types.compute",
   }
   __getattr__, __dir__ = lazy_imports.attach_module(__name__, _LAZY_IMPORTS)

__all__ = (
    "AcceleratorTypesClient",
    "AddressesClient",
    "Address",
)

Related Links:

Design Doc: go/sdk-performance-design
Towards googleapis/python-aiplatform#4749

…metrics

- Add fallback to handle single-iteration quantile calculation crashes - Normalize .pyc file paths during line counting and log exceptions - Expose worker process stderr to facilitate debugging CalledProcessError - Fix absolute paths in documentation.md to use relative directory paths

…g, and encodings - Use importlib.util.source_from_cache for robust .pyc resolution - Move importlib.util and logging imports to module level - Refine json.loads to parse only the last line of stdout - Specify UTF-8 encoding when opening files for writing

- Extract worker logic to _run_worker_and_parse for robust JSON parsing - Add early validation for iterations parameter in run_master - Add non-zero exit code check and warning for run_trace failures

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

- Execute cProfile in a fresh subprocess to ensure cold-start accuracy - Execute tracemalloc in a fresh subprocess to capture all memory allocations - Clean up master process exception handling for missing taskset vs crashed workers

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

- Use json.dumps to safely escape target_module string in all subprocess commands (mprofile, cprofile, trace)

- Replace detailed implementation architecture and alternative solutions with a concise high-level usage and expected output format per PR review feedback

…me latency

- Read /proc/self/statm inside worker before and after import - Exposes pure C-extension memory allocations (e.g. grpc/protobuf) missed by Python's internal tracemalloc - Requires zero dependencies, bypassing the need for psutil

- Recursively wipes __pycache__ and .pyc files via pathlib prior to iteration loops. - Employs Python's -B flag on worker subprocesses to prevent re-caching during the benchmark, enforcing raw .py parsing for every execution.

- Replaces --clear-pycache with an opt-out --keep-pycache flag - The profiler now automatically sweeps and enforces disk-level cold starts without requiring explicit flags from the user.

Address PR review comment by moving setup tracking (rss_before, modules_before) outside of the time.perf_counter() window to prevent artificial latency inflation.

Addresses PR feedback to avoid broad exception catching when parsing physical python files for line counts.

Addresses PR review by explicitly documenting the expected PEP 3147/488 error path for irregular .pyc files.

…ssing modules Addresses PR review by replacing null-checks with a cleaner try/except block that assumes normal module behavior and explicitly logs when a C-extension or built-in is skipped due to a missing __file__ attribute.

Addresses PR review by replacing the 'none' string check with a typed NO_CPU_PINNING = -1 constant for cleaner argument parsing and state tracking.

…askset Addresses PR review by removing the fallback unpinned execution logic when taskset is not installed. The script now fails immediately with a clear error message, preventing code duplication.

Addresses PR review by collapsing the repeated statistics.quantiles logic into a reusable _calculate_percentiles helper method.

…terations Addresses PR review by validating that 'loaded_modules' and 'loaded_lines' remain deterministic across cold-start iterations, warning the user if non-deterministic import paths are triggered.

… tracemalloc profiling Addresses PR review by replacing the hard-to-maintain dynamic code string inside run_mprofile with a properly structured multiprocessing 'spawn' context.

… syntax

…562)

gemini-code-assist

Code Review

This pull request introduces a Python SDK import profiling tool, consisting of a documentation guide, a benchmarking and tracing script (profiler.py), and a utility script (rewrite_init.py) to automate lazy-loading in package init.py files. The review feedback highlights several opportunities to make the scripts more robust, such as handling comments during import parsing, supporting both lists and tuples for all matching, resolving file paths dynamically, adding a dir method to the generated lazy-loaded modules for better IDE support, and removing a redundant local import of tracemalloc.

gemini-code-assist · 2026-06-25T07:50:25Z

+    while i < len(lines):
+        line = lines[i]
+        if line.startswith("from .") and " import " in line:
+            parts = line.split(" import ")
+            mod = parts[0].replace("from ", "").strip()
+            names_part = parts[1].strip()
+            if "(" in names_part and ")" not in names_part:
+                # multi-line import
+                import_names_str = names_part.replace("(", "")
+                i += 1
+                while ")" not in lines[i]:
+                    import_names_str += lines[i]
+                    i += 1
+                import_names_str += lines[i].replace(")", "")
+            else:
+                import_names_str = names_part.replace("(", "").replace(")", "")
+
+            names = [n.strip() for n in import_names_str.split(",") if n.strip()]
+            parsed_imports.append((mod, names))
+        i += 1


The manual import parsing logic is fragile and will fail or produce invalid syntax if there are comments (such as # type: ignore or other annotations) on the import lines. Stripping comments before parsing the imported names makes the script much more robust.

Suggested change

while i < len(lines):

line = lines[i]

if line.startswith("from .") and " import " in line:

parts = line.split(" import ")

mod = parts[0].replace("from ", "").strip()

names_part = parts[1].strip()

if "(" in names_part and ")" not in names_part:

# multi-line import

import_names_str = names_part.replace("(", "")

i += 1

while ")" not in lines[i]:

import_names_str += lines[i]

i += 1

import_names_str += lines[i].replace(")", "")

else:

import_names_str = names_part.replace("(", "").replace(")", "")

names = [n.strip() for n in import_names_str.split(",") if n.strip()]

parsed_imports.append((mod, names))

i += 1

while i < len(lines):

line = lines[i]

if line.startswith("from .") and " import " in line:

parts = line.split(" import ")

mod = parts[0].replace("from ", "").strip()

names_part = parts[1].split("#")[0].strip()

if "(" in names_part and ")" not in names_part:

# multi-line import

import_names_str = names_part.replace("(", "")

i += 1

while ")" not in lines[i]:

import_names_str += lines[i].split("#")[0]

i += 1

import_names_str += lines[i].split("#")[0].replace(")", "")

else:

import_names_str = names_part.replace("(", "").replace(")", "")

names = [n.strip() for n in import_names_str.split(",") if n.strip()]

parsed_imports.append((mod, names))

i += 1

gemini-code-assist · 2026-06-25T07:50:26Z

+    def __getattr__(name):
+        if name in _LAZY_IMPORTS:
+            module = importlib.import_module(_LAZY_IMPORTS[name], package=__name__)
+            attr = getattr(module, name)
+            globals()[name] = attr
+            return attr
+        raise AttributeError(f"module {{__name__!r}} has no attribute {{name!r}}")


The generated __init__.py defines __getattr__ but is missing a corresponding __dir__ function. Without __dir__, interactive environments (like IPython, Jupyter notebooks, or standard REPLs) and IDEs will not be able to auto-complete or inspect the lazy-loaded attributes until they are explicitly accessed.

def __getattr__(name): if name in _LAZY_IMPORTS: module = importlib.import_module(_LAZY_IMPORTS[name], package=__name__) attr = getattr(module, name) globals()[name] = attr return attr raise AttributeError(f"module {__name__!r} has no attribute {name!r}") def __dir__(): return sorted(set(globals().keys()) | set(_LAZY_IMPORTS.keys()))

gemini-code-assist · 2026-06-25T07:50:26Z

+import re
+import os
+
+INIT_PATH = "packages/google-cloud-compute/google/cloud/compute_v1/__init__.py"


The relative path INIT_PATH is hardcoded, which causes the script to fail with a FileNotFoundError if executed from any directory other than the repository root. Resolving the path relative to the script's location makes it much more robust.

Suggested change

INIT_PATH = "packages/google-cloud-compute/google/cloud/compute_v1/__init__.py"

import pathlib

INIT_PATH = pathlib.Path(__file__).resolve().parents[2] / "packages/google-cloud-compute/google/cloud/compute_v1/__init__.py"

gemini-code-assist · 2026-06-25T07:50:26Z

+    all_match = re.search(r'__all__\s*=\s*\((.*?)\)', content, re.DOTALL)
+    if not all_match:
+        print("Could not find __all__")
+        return


The regex pattern for __all__ only matches tuples (...). Some packages in the monorepo define __all__ as a list [...]. Supporting both formats makes the script more robust and reusable.

Suggested change

all_match = re.search(r'__all__\s*=\s*\((.*?)\)', content, re.DOTALL)

if not all_match:

print("Could not find __all__")

return

all_match = re.search(r'__all__\s*=\s*[\(\[](.*?)[\)\]]', content, re.DOTALL)

if not all_match:

print("Could not find __all__")

return

gemini-code-assist · 2026-06-25T07:50:26Z

+def _mprofile_worker(target_module):
+    import tracemalloc
+    tracemalloc.start()


The tracemalloc module is already imported at the module level (line 6). Importing it again inside _mprofile_worker is redundant and violates the general rule against local imports of already module-level imported modules.

Suggested change

def _mprofile_worker(target_module):

import tracemalloc

tracemalloc.start()

def _mprofile_worker(target_module):

tracemalloc.start()

References

Do not import modules or classes inside functions if they are already imported at the module level.

hebaalazzeh and others added 30 commits June 15, 2026 21:27

feat: add import profiler tool with dynamic loaded lines code volume …

7e325db

…metrics

fix: refactor profiler worker execution and handling

9340dc9

- Extract worker logic to _run_worker_and_parse for robust JSON parsing - Add early validation for iterations parameter in run_master - Add non-zero exit code check and warning for run_trace failures

Update scripts/import_profiler/profiler.py

d8c26a2

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Update scripts/import_profiler/profiler.py

f2c998c

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

refactor: replace manual sys.argv parsing with argparse

d7ae73f

Update scripts/import_profiler/profiler.py

6a94207

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Update scripts/import_profiler/profiler.py

e15d972

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Update scripts/import_profiler/profiler.py

051b455

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Update scripts/import_profiler/profiler.py

9f55f1c

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

fix: resolve code injection vulnerability when importing target_module

fb35fd3

- Use json.dumps to safely escape target_module string in all subprocess commands (mprofile, cprofile, trace)

docs: simplify profiler mechanism section

727d44e

- Replace detailed implementation architecture and alternative solutions with a concise high-level usage and expected output format per PR review feedback

update documenation md

70c2a24

docs: add inline comments explaining isolation of interpreter boot-ti…

40ac359

…me latency

feat: make pycache wiping automatic by default

88c2e7f

- Replaces --clear-pycache with an opt-out --keep-pycache flag - The profiler now automatically sweeps and enforces disk-level cold starts without requiring explicit flags from the user.

fix: tighten performance timer to strictly wrap import_module

fa00b05

Address PR review comment by moving setup tracking (rss_before, modules_before) outside of the time.perf_counter() window to prevent artificial latency inflation.

fix: replace bare Exception with OSError during LOC parsing

137085a

Addresses PR feedback to avoid broad exception catching when parsing physical python files for line counts.

docs: explain why ValueError is silently passed during source_from_cache

68465e5

Addresses PR review by explicitly documenting the expected PEP 3147/488 error path for irregular .pyc files.

refactor: replace magic string for CPU pinning with integer constant

9a3e981

Addresses PR review by replacing the 'none' string check with a typed NO_CPU_PINNING = -1 constant for cleaner argument parsing and state tracking.

refactor: remove code duplication and enforce fail-fast for missing t…

aaaa1a4

…askset Addresses PR review by removing the fallback unpinned execution logic when taskset is not installed. The script now fails immediately with a clear error message, preventing code duplication.

refactor: extract percentile calculation into a helper method

352188f

Addresses PR review by collapsing the repeated statistics.quantiles logic into a reusable _calculate_percentiles helper method.

refactor: emit warnings if dynamic import volume fluctuates between i…

c082ba7

…terations Addresses PR review by validating that 'loaded_modules' and 'loaded_lines' remain deterministic across cold-start iterations, warning the user if non-deterministic import paths are triggered.

refactor: use multiprocessing instead of dynamic string execution for…

3cc3c9d

… tracemalloc profiling Addresses PR review by replacing the hard-to-maintain dynamic code string inside run_mprofile with a properly structured multiprocessing 'spawn' context.

Prototype: PEP 0810 lazy imports architecture for GAPIC generator

237f4d8

chore: update rewrite_init.py to use list-based PEP 0810 lazy loading…

ad932ff

… syntax

Implement hybrid lazy loading strategy for compute_v1 (PEP 810 + PEP …

c2246a6

…562)

gemini-code-assist Bot reviewed Jun 25, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Prototype hybrid lazy imports#17572

Prototype hybrid lazy imports#17572
hebaalazzeh wants to merge 31 commits into
mainfrom
prototype-hybrid-lazy-imports

hebaalazzeh commented Jun 25, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Jun 25, 2026

Uh oh!

gemini-code-assist Bot Jun 25, 2026

Uh oh!

gemini-code-assist Bot Jun 25, 2026

Uh oh!

gemini-code-assist Bot Jun 25, 2026

Uh oh!

gemini-code-assist Bot Jun 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	INIT_PATH = "packages/google-cloud-compute/google/cloud/compute_v1/__init__.py"
	import pathlib
	INIT_PATH = pathlib.Path(__file__).resolve().parents[2] / "packages/google-cloud-compute/google/cloud/compute_v1/__init__.py"

Uh oh!

Conversation

hebaalazzeh commented Jun 25, 2026

Overview

Implementation Architecture

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant