gensbi.recipes.utils#

Attributes#

_EMBEDDINGS_1D

_EMBEDDINGS_2D

Functions#

_normalize_patch_size(size)

Normalize a patch-size spec into an (obs, cond) tuple.

_resolve_embedding_ids(dim, strategy, semantic_id[, size])

Resolve ID embeddings by strategy name.

build_edm_path(sde, config)

Build an EDM-family diffusion path from an SDE type string and config.

build_sm_path(sde_type, config)

Build a score-matching path from an SDE type string and config.

depatchify_2d(x[, size, grid])

Inverse of patchify_2d().

init_ids_1d(dim[, semantic_id])

Build 1D positional IDs, returning (ids, dim).

init_ids_2d(dim[, semantic_id, size])

Build 2D positional IDs for a patchified image grid.

init_ids_joint(dim_obs, dim_cond)

parse_training_config(config_path)

Parse training and optimizer configuration from a YAML config file.

patchify_2d(x[, size])

Split x into patches of edge length size; depatchify_2d() is the inverse.

scale_lr(batch_size[, base_lr, reference_batch_size])

Scale learning rate based on batch size using square root scaling.

Module Contents#

gensbi.recipes.utils._normalize_patch_size(size)[source]#

Normalize a patch-size spec into an (obs, cond) tuple.

Parameters:

size (int or tuple of int) – A single int is broadcast to both inputs (8 -> (8, 8)). A length-2 tuple is taken as (obs_size, cond_size) so the two inputs can use different patch sizes. Use 1 for an input that is not patchified.

Returns:

(obs_size, cond_size).

Return type:

tuple of int
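The broadcasting rule described above can be sketched in a few lines. This is a minimal stand-in, not the library's implementation: the name normalize_patch_size_sketch is hypothetical, and the real function may validate its input differently.

```python
def normalize_patch_size_sketch(size):
    """Hypothetical stand-in for _normalize_patch_size (illustration only)."""
    if isinstance(size, int):
        # A single int is broadcast to both inputs.
        return (size, size)
    # Unpacking raises ValueError unless exactly two entries are given.
    obs_size, cond_size = size
    return (int(obs_size), int(cond_size))


print(normalize_patch_size_sketch(8))       # (8, 8)
print(normalize_patch_size_sketch((4, 1)))  # (4, 1): cond input is not patchified
```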

gensbi.recipes.utils._resolve_embedding_ids(dim, strategy, semantic_id, size=2)[source]#

Resolve ID embeddings by strategy name.

Parameters:
  • dim (int or tuple of int) – Dimension specification (number of tokens, or (H, W) for 2D images).

  • strategy (str) – Embedding strategy name (e.g., “absolute”, “pos1d”, “rope1d”, “pos2d”, “rope2d”).

  • semantic_id (int) – Semantic identifier for the token group (0=obs, 1=cond).

  • size (int, optional) – Patch edge length for 2D strategies (default 2). Ignored for 1D strategies. Use 1 for no patchification.

Returns:

  • ids (Array) – Token ID array.

  • resolved_dim (int) – Resolved flat dimension.

Raises:

ValueError – If strategy is not recognized.
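As an illustration of dispatching on a strategy name, the sketch below computes only the resolved flat dimension. Everything in it is an assumption: the grouping of strategy names into 1D and 2D sets, the helper name resolved_dim_sketch, and the rule that a 2D strategy resolves to the number of patches. The actual contents of _EMBEDDINGS_1D and _EMBEDDINGS_2D are not documented on this page.

```python
# Hypothetical groupings; the real _EMBEDDINGS_1D / _EMBEDDINGS_2D may differ.
STRATEGIES_1D = ("absolute", "pos1d", "rope1d")
STRATEGIES_2D = ("pos2d", "rope2d")


def resolved_dim_sketch(dim, strategy, size=2):
    """Return the assumed flat token count for a strategy name."""
    if strategy in STRATEGIES_1D:
        return int(dim)  # size is ignored for 1D strategies
    if strategy in STRATEGIES_2D:
        height, width = dim
        return (height // size) * (width // size)  # one token per patch
    raise ValueError(f"Unknown embedding strategy: {strategy!r}")


print(resolved_dim_sketch(16, "rope1d"))           # 16
print(resolved_dim_sketch((32, 32), "rope2d", 2))  # 256
```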

gensbi.recipes.utils.build_edm_path(sde, config)[source]#

Build an EDM-family diffusion path from an SDE type string and config.

Parameters:
  • sde (str) – SDE type: "EDM", "VE", or "VP".

  • config (dict) – Training configuration dict; scheduler hyperparameters are read from it, and any key that is missing falls back to a default value.

Returns:

Configured diffusion path.

Return type:

EDMPath

Raises:

ValueError – If sde is not one of {"EDM", "VE", "VP"}.

gensbi.recipes.utils.build_sm_path(sde_type, config)[source]#

Build a score-matching path from an SDE type string and config.

Parameters:
  • sde_type (str) – SDE type: "VP" or "VE".

  • config (dict) – Training configuration dict; scheduler hyperparameters are read from it, and any key that is missing falls back to a default value.

Returns:

Configured score-matching path.

Return type:

SMPath

Raises:

ValueError – If sde_type is not one of {"VP", "VE"}.

gensbi.recipes.utils.depatchify_2d(x, size=2, grid=None)[source]#

Inverse of patchify_2d().

Parameters:
  • x (Array) – Patchified tensor of shape (B, h*w, C*size*size).

  • size (int) – Patch edge length used by patchify_2d().

  • grid (tuple of int, optional) – The (h, w) patch grid. The grid cannot be inferred from the token count alone, so it is required for non-square grids. If None, a square grid (h == w) is assumed.
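The round trip and the role of grid can be sketched with NumPy (the same reshape and transpose calls exist in jax.numpy). This is not the library's code: the channels-last input layout (B, H, W, C), the channel-major ordering inside each token, and the _sketch function names are all assumptions made for illustration.

```python
import math

import numpy as np


def patchify_2d_sketch(x, size=2):
    """Assumed layout: (B, H, W, C) -> (B, h*w, C*size*size)."""
    B, H, W, C = x.shape
    h, w = H // size, W // size
    x = x.reshape(B, h, size, w, size, C)
    x = x.transpose(0, 1, 3, 5, 2, 4)  # (B, h, w, C, size, size)
    return x.reshape(B, h * w, C * size * size)


def depatchify_2d_sketch(x, size=2, grid=None):
    """Inverse of patchify_2d_sketch; grid is the (h, w) patch grid."""
    B, n_tokens, feat = x.shape
    if grid is None:
        # Without a grid, the only guess available is a square one.
        h = w = math.isqrt(n_tokens)
        if h * w != n_tokens:
            raise ValueError("token count is not a perfect square; pass grid")
    else:
        h, w = grid
    C = feat // (size * size)
    x = x.reshape(B, h, w, C, size, size)
    x = x.transpose(0, 1, 4, 2, 5, 3)  # (B, h, size, w, size, C)
    return x.reshape(B, h * size, w * size, C)


x = np.arange(1 * 4 * 6 * 3).reshape(1, 4, 6, 3)  # non-square 4x6 image
tokens = patchify_2d_sketch(x, size=2)
print(tokens.shape)  # (1, 6, 12)
restored = depatchify_2d_sketch(tokens, size=2, grid=(2, 3))
print(np.array_equal(restored, x))  # True
```

Note that 16 tokens could come from a 4x4 or a 2x8 grid, so a square-grid guess can be silently wrong for a non-square input; passing grid explicitly avoids this.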

gensbi.recipes.utils.init_ids_1d(dim, semantic_id=None)[source]#

Build 1D positional IDs, returning (ids, dim).

ids has shape (1, dim, 1) when semantic_id is None (position only), or (1, dim, 2) otherwise, with the position at index 0 and the semantic id at index 1 of the last dimension.

FIXME (axis-order footgun): this places the semantic id last among the ID components, which is the reverse of init_ids_2d() (semantic id first, then h, w; the established convention). Each function is safe in isolation, but the two do NOT line up if 1D and 2D ids are ever fed into the same RoPE grid (e.g. FieldDiT Phase-2 obs+cond co-tokenization with a shared EmbedND, where axes_dim[i] is matched to ids axis i). This should be unified to the 2D convention (semantic id at axis 0). It is not changed here because callers that pass semantic_id (notably the Flux1 rope1d path via init_ids()) and any code indexing ids[..., k] must be updated in lockstep.

Parameters:
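  • dim (int) – Number of positions.

  • semantic_id (int or None) – Semantic identifier attached to every position. If None (the default), the IDs carry the position only.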
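The two output shapes can be reproduced with a short NumPy sketch. It is illustrative only: the name init_ids_1d_sketch is hypothetical, and using consecutive integers 0..dim-1 as positions is an assumption about the real function.

```python
import numpy as np


def init_ids_1d_sketch(dim, semantic_id=None):
    """Hypothetical stand-in for init_ids_1d; returns (ids, dim)."""
    pos = np.arange(dim).reshape(1, dim, 1)  # assumed positions 0..dim-1
    if semantic_id is None:
        return pos, dim
    sem = np.full((1, dim, 1), semantic_id)
    # Position first, semantic id last, as described above.
    return np.concatenate([pos, sem], axis=-1), dim


ids, n = init_ids_1d_sketch(4)
print(ids.shape)  # (1, 4, 1)
ids, n = init_ids_1d_sketch(4, semantic_id=1)
print(ids.shape)  # (1, 4, 2)
```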

gensbi.recipes.utils.init_ids_2d(dim, semantic_id=0, size=2)[source]#

Build 2D positional IDs for a patchified image grid.

The grid has one entry per patch, i.e. (dim[0] // size, dim[1] // size), matching patchify_2d(x, size=size). size is the patch edge length; use size=1 for no patchification (one token per pixel).

Parameters:
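  • dim (Tuple[int, int]) – Image size (H, W) before patchification.

  • semantic_id (int) – Semantic identifier for the token group (0=obs, 1=cond). Defaults to 0.

  • size (int) – Patch edge length. Defaults to 2; use 1 for no patchification.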
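A minimal NumPy sketch of the patch-grid IDs follows. The component order (semantic id, then h, then w) comes from the convention noted for init_ids_1d(); the output shape (1, h*w, 3), the return value (ids only), and the name init_ids_2d_sketch are assumptions.

```python
import numpy as np


def init_ids_2d_sketch(dim, semantic_id=0, size=2):
    """Hypothetical stand-in for init_ids_2d (illustration only)."""
    h, w = dim[0] // size, dim[1] // size  # one entry per patch
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    sem = np.full((h, w), semantic_id)
    # Semantic id first, then row and column index of each patch.
    return np.stack([sem, rows, cols], axis=-1).reshape(1, h * w, 3)


ids = init_ids_2d_sketch((4, 6), semantic_id=1, size=2)
print(ids.shape)           # (1, 6, 3)
print(ids[0, -1].tolist())  # [1, 1, 2]
```

With size=1 the grid is (4, 6) and there is one token per pixel, so the same call would yield 24 tokens.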

gensbi.recipes.utils.init_ids_joint(dim_obs, dim_cond)[source]#
Parameters:
  • dim_obs (int)

  • dim_cond (int)

gensbi.recipes.utils.parse_training_config(config_path)[source]#

Parse training and optimizer configuration from a YAML config file.

Reads the training and optimizer sections of the config and returns a flat dictionary consumed by AbstractPipeline.

Parameters:

config_path (str) – Path to the YAML configuration file.

Returns:

training_config – Parsed training configuration dictionary.

Return type:

dict

gensbi.recipes.utils.patchify_2d(x, size=2)[source]#
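Split x into patches of edge length size, giving tokens of shape (B, h*w, C*size*size); depatchify_2d() is the inverse.

Parameters:
  • x (jax.Array)

  • size (int) – Patch edge length. Defaults to 2.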

gensbi.recipes.utils.scale_lr(batch_size, base_lr=0.0001, reference_batch_size=256)[source]#

Scale learning rate based on batch size using square root scaling.

Parameters:
  • batch_size (int) – The current batch size.

  • base_lr (float, optional) – The learning rate used at the reference batch size. Defaults to 0.0001.

  • reference_batch_size (int, optional) – The reference batch size. Defaults to 256.

Returns:

The adjusted learning rate.

Return type:

float
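Square-root scaling is usually written as lr = base_lr * sqrt(batch_size / reference_batch_size). The sketch below assumes that scale_lr follows this standard form; the name scale_lr_sketch is hypothetical and the library's exact expression is not shown on this page.

```python
import math


def scale_lr_sketch(batch_size, base_lr=1e-4, reference_batch_size=256):
    """Assumed square-root rule: lr grows with sqrt of the batch-size ratio."""
    return base_lr * math.sqrt(batch_size / reference_batch_size)


print(scale_lr_sketch(256))   # 0.0001 (reference batch size, unchanged)
print(scale_lr_sketch(1024))  # 0.0002 (4x the batch, 2x the learning rate)
```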

gensbi.recipes.utils._EMBEDDINGS_1D[source]#
gensbi.recipes.utils._EMBEDDINGS_2D[source]#