czbenchmarks.file_utils

Attributes

log

DEFAULT_CACHE_DIR

DEFAULT_CACHE_EXPIRATION_DAYS

Classes

CacheManager

Centralized cache management for remote files.

Functions

download_file_from_remote(→ str)

Download a remote file to a local cache directory.

Module Contents

czbenchmarks.file_utils.log
czbenchmarks.file_utils.DEFAULT_CACHE_DIR
czbenchmarks.file_utils.DEFAULT_CACHE_EXPIRATION_DAYS
class czbenchmarks.file_utils.CacheManager(cache_dir: str | pathlib.Path = DEFAULT_CACHE_DIR, expiration_days: int = DEFAULT_CACHE_EXPIRATION_DAYS)

Centralized cache management for remote files.

cache_dir
expiration_days
ensure_directory_exists(directory: pathlib.Path) → None

Ensure the given directory exists.

get_cache_path(remote_url: str) → pathlib.Path

Generate a local cache path for a remote file.

is_expired(file_path: pathlib.Path) → bool

Check if a cached file is expired.

clean_expired_cache() → None

Clean up expired cache files.
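
As a rough illustration of what the four methods above describe, the sketch below reimplements them with the standard library only. The class name `SketchCacheManager`, naming the cached file after the last path segment of the URL, and judging expiry by file modification time are all assumptions made for this example; the actual `CacheManager` may behave differently.

```python
import time
from pathlib import Path
from urllib.parse import urlparse


class SketchCacheManager:
    """Hypothetical stand-in for CacheManager; not the library's code."""

    def __init__(self, cache_dir, expiration_days):
        self.cache_dir = Path(cache_dir).expanduser()
        self.expiration_days = expiration_days
        self.ensure_directory_exists(self.cache_dir)

    def ensure_directory_exists(self, directory: Path) -> None:
        # Create the directory (and any missing parents) if needed.
        directory.mkdir(parents=True, exist_ok=True)

    def get_cache_path(self, remote_url: str) -> Path:
        # Assumption: the cached file is named after the URL's last path segment.
        filename = Path(urlparse(remote_url).path).name
        return self.cache_dir / filename

    def is_expired(self, file_path: Path) -> bool:
        # Assumption: age is measured from the file's modification time.
        age_seconds = time.time() - file_path.stat().st_mtime
        return age_seconds > self.expiration_days * 86400

    def clean_expired_cache(self) -> None:
        # Remove every regular file in the cache directory that has expired.
        for path in self.cache_dir.iterdir():
            if path.is_file() and self.is_expired(path):
                path.unlink()
```

With this sketch, `SketchCacheManager("/tmp/cache", 7).get_cache_path("s3://bucket/data/file.h5ad")` would point at `/tmp/cache/file.h5ad`; the bucket and file names here are placeholders.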

czbenchmarks.file_utils.download_file_from_remote(remote_url: str, cache_dir: str | pathlib.Path = None, make_unsigned_request: bool = True) → str

Download a remote file to a local cache directory.

Parameters:
  • remote_url (str) – Remote URL of the file (e.g., S3 path).

  • cache_dir (str | Path, optional) – Local directory to save the file. Defaults to the global cache manager’s directory.

  • make_unsigned_request (bool, optional) – Whether to use unsigned requests for S3 (default: True).

Returns:

Local path to the downloaded file.

Return type:

str

Raises:

Notes

  • If the file already exists in the cache and has not expired, it is not downloaded again.

  • Unsigned requests are tried first; if they fail, signed requests are attempted.
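
The two notes above amount to a cache check followed by a two-step fallback. The sketch below shows that control flow in isolation. The function name `download_with_fallback` and the injected `is_expired`, `fetch_unsigned`, and `fetch_signed` callables are hypothetical and stand in for whatever the library does internally; the sketch also ignores the `make_unsigned_request` flag.

```python
from pathlib import Path
from typing import Callable

# A fetcher takes (remote_url, local_path) and writes the file to local_path.
Fetcher = Callable[[str, Path], None]


def download_with_fallback(
    remote_url: str,
    local_path: Path,
    is_expired: Callable[[Path], bool],
    fetch_unsigned: Fetcher,
    fetch_signed: Fetcher,
) -> str:
    """Hypothetical sketch of the cache-then-fallback flow; not the library's code."""
    # Cache hit: reuse the local copy if it exists and has not expired.
    if local_path.exists() and not is_expired(local_path):
        return str(local_path)

    local_path.parent.mkdir(parents=True, exist_ok=True)
    try:
        # First attempt: anonymous (unsigned) request.
        fetch_unsigned(remote_url, local_path)
    except Exception:
        # Fallback: retry with credentials (signed request).
        fetch_signed(remote_url, local_path)
    return str(local_path)
```

Passing the fetchers in as callables keeps the sketch free of network and credential dependencies, so the fallback order can be exercised with simple stubs.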