gensbi.recipes.utils
Attributes
Functions
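_normalize_patch_size(size) – Normalize a patch-size spec into an (obs, cond) tuple.
_resolve_embedding_ids(dim, strategy, semantic_id, size=2) – Resolve ID embeddings by strategy name.
build_edm_path(sde, config) – Build an EDM-family diffusion path from an SDE type string and config.
build_sm_path(sde_type, config) – Build a score-matching path from an SDE type string and config.
depatchify_2d(x, size=2, grid=None) – Inverse of patchify_2d().
init_ids_1d(dim, semantic_id=None) – Build 1D positional IDs, returning (ids, dim).
init_ids_2d(dim, semantic_id=0, size=2) – Build 2D positional IDs for a patchified image grid.
init_ids_joint(dim_obs, dim_cond)
parse_training_config(config_path) – Parse training and optimizer configuration from a YAML config file.
scale_lr(batch_size, base_lr=0.0001, reference_batch_size=256) – Scale learning rate based on batch size using square root scaling.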
Module Contents
- gensbi.recipes.utils._normalize_patch_size(size)
Normalize a patch-size spec into an (obs, cond) tuple.
- Parameters:
size (int or tuple of int) – A single int is broadcast to both inputs (8 -> (8, 8)). A length-2 tuple is taken as (obs_size, cond_size) so the two inputs can use different patch sizes. Use 1 for an input that is not patchified.
- Returns:
(obs_size, cond_size).
- Return type:
tuple of int
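The normalization rule above can be sketched as follows. This is a minimal illustration, not the gensbi source: the name normalize_patch_size_sketch and the length check with its ValueError are assumptions made for the example.

```python
from typing import Tuple, Union


def normalize_patch_size_sketch(size: Union[int, Tuple[int, int]]) -> Tuple[int, int]:
    """Hypothetical stand-in for _normalize_patch_size."""
    # A single int is broadcast to both inputs: 8 -> (8, 8).
    if isinstance(size, int):
        return (size, size)
    # Assumption: anything else must be a length-2 sequence (obs_size, cond_size).
    if len(size) != 2:
        raise ValueError("size must be an int or a length-2 tuple")
    obs_size, cond_size = size
    return (int(obs_size), int(cond_size))


same = normalize_patch_size_sketch(8)        # both inputs share one patch size
mixed = normalize_patch_size_sketch((4, 1))  # cond input is not patchified
```

Passing 1 for one of the two entries matches the documented convention for an input that is not patchified.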
- gensbi.recipes.utils._resolve_embedding_ids(dim, strategy, semantic_id, size=2)
Resolve ID embeddings by strategy name.
- Parameters:
dim (int or tuple of int) – Dimension specification (number of tokens, or (H, W) for 2D images).
strategy (str) – Embedding strategy name (e.g., “absolute”, “pos1d”, “rope1d”, “pos2d”, “rope2d”).
semantic_id (int) – Semantic identifier for the token group (0=obs, 1=cond).
size (int, optional) – Patch edge length for 2D strategies (default 2). Ignored for 1D strategies. Use 1 for no patchification.
- Returns:
ids (Array) – Token ID array.
resolved_dim (int) – Resolved flat dimension.
- Raises:
ValueError – If strategy is not recognized.
- gensbi.recipes.utils.build_edm_path(sde, config)
Build an EDM-family diffusion path from an SDE type string and config.
- Parameters:
sde (str) – SDE type: "EDM", "VE", or "VP".
config (dict) – Training configuration dict; scheduler hyperparameters are read from here with sensible defaults.
- Returns:
Configured diffusion path.
- Return type:
- Raises:
ValueError – If sde is not one of {"EDM", "VE", "VP"}.
- gensbi.recipes.utils.build_sm_path(sde_type, config)
Build a score-matching path from an SDE type string and config.
- Parameters:
sde_type (str) – SDE type: "VP" or "VE".
config (dict) – Training configuration dict; scheduler hyperparameters are read from here with sensible defaults.
- Returns:
Configured score-matching path.
- Return type:
- Raises:
ValueError – If sde_type is not one of {"VP", "VE"}.
- gensbi.recipes.utils.depatchify_2d(x, size=2, grid=None)
Inverse of patchify_2d().
- Parameters:
x (Array) – Patchified tensor of shape (B, h*w, C*size*size).
size (int) – Patch edge length used by patchify_2d().
grid (tuple of int, optional) – The (h, w) patch grid. The grid cannot be inferred from the token count alone, so it is required for non-square grids. If None, a square grid (h == w) is assumed.
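To make the role of grid concrete, here is a NumPy sketch of a patchify/depatchify round trip on a non-square image. It is an illustration under stated assumptions, not the gensbi implementation: the channels-last input layout (B, H, W, C), the ordering of values inside each token, the ValueError raised for inconsistent grids, and the *_sketch names are all hypothetical. Only the token shape (B, h*w, C*size*size) comes from the documentation above.

```python
import math

import numpy as np


def patchify_2d_sketch(x, size=2):
    """Assumed layout: (B, H, W, C) -> (B, h*w, C*size*size)."""
    B, H, W, C = x.shape
    h, w = H // size, W // size
    x = x.reshape(B, h, size, w, size, C)
    x = x.transpose(0, 1, 3, 5, 2, 4)  # (B, h, w, C, size, size)
    return x.reshape(B, h * w, C * size * size)


def depatchify_2d_sketch(x, size=2, grid=None):
    """Inverse of patchify_2d_sketch: (B, h*w, C*size*size) -> (B, H, W, C)."""
    B, n_tokens, feat = x.shape
    if grid is None:
        # Without a grid, the only option is to assume h == w.
        h = w = math.isqrt(n_tokens)
    else:
        h, w = grid
    if h * w != n_tokens:
        # Assumption: fail loudly instead of reshaping into the wrong layout.
        raise ValueError("grid does not match the number of tokens")
    C = feat // (size * size)
    x = x.reshape(B, h, w, C, size, size)
    x = x.transpose(0, 1, 4, 2, 5, 3)  # (B, h, size, w, size, C)
    return x.reshape(B, h * size, w * size, C)


# A 4x6 image with 2x2 patches yields a 2x3 grid, i.e. 6 tokens.
image = np.arange(2 * 4 * 6 * 3, dtype=np.float32).reshape(2, 4, 6, 3)
tokens = patchify_2d_sketch(image, size=2)
restored = depatchify_2d_sketch(tokens, size=2, grid=(2, 3))
```

Six tokens could come from a 2x3, 3x2, 1x6, or 6x1 grid, which is why the token count alone does not determine the layout. In this sketch, omitting grid for the 6-token case raises an error rather than guessing.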
- gensbi.recipes.utils.init_ids_1d(dim, semantic_id=None)
Build 1D positional IDs, returning (ids, dim).
ids is (1, dim, 1) when semantic_id is None (position only), or (1, dim, 2) otherwise, with the position at axis 0 and the semantic id at axis 1.
FIXME (axis-order footgun): this places the semantic id on the LAST axis, which is the reverse of init_ids_2d() (semantic at axis 0, then h, w – the established convention). The two are safe in isolation, but they do NOT line up if 1D and 2D ids are ever fed into the same RoPE grid (e.g. FieldDiT Phase-2 obs+cond co-tokenization with a shared EmbedND, where axes_dim[i] is matched to ids axis i). This should be unified to the 2D convention (semantic at axis 0). It is not changed here because callers that pass semantic_id – notably the Flux1 rope1d path via init_ids() – and any code indexing ids[..., k] must be updated in lockstep.
- Parameters:
dim (int)
semantic_id (Union[int, None])
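The documented shapes can be illustrated with a small NumPy sketch. The function name init_ids_1d_sketch, the use of consecutive integer positions from arange, and the int32 dtype are assumptions for this example. The shapes and the position-then-semantic ordering of the last axis follow the description above.

```python
import numpy as np


def init_ids_1d_sketch(dim, semantic_id=None):
    """Hypothetical stand-in for init_ids_1d; returns (ids, dim)."""
    # Position index for each of the dim tokens, shape (1, dim, 1).
    pos = np.arange(dim, dtype=np.int32).reshape(1, dim, 1)
    if semantic_id is None:
        return pos, dim
    # Semantic id appended AFTER the position on the last axis, shape (1, dim, 2).
    sem = np.full((1, dim, 1), semantic_id, dtype=np.int32)
    return np.concatenate([pos, sem], axis=-1), dim


pos_only, n_tokens = init_ids_1d_sketch(4)
with_semantic, _ = init_ids_1d_sketch(4, semantic_id=1)
```

With this ordering, with_semantic[..., 0] holds positions and with_semantic[..., 1] holds the semantic id. That ordering is the one the FIXME flags as reversed relative to the 2D convention.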
- gensbi.recipes.utils.init_ids_2d(dim, semantic_id=0, size=2)
Build 2D positional IDs for a patchified image grid.
The grid has one entry per patch, i.e. (dim[0] // size, dim[1] // size), matching patchify_2d(x, size=size). size is the patch edge length; use size=1 for no patchification (one token per pixel).
- Parameters:
dim (Tuple[int, int])
semantic_id (int)
size (int)
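A NumPy sketch of the patch grid described above, using the (semantic, h, w) ordering that the init_ids_1d() notes call the established 2D convention. The return value is not documented here, so the (1, h*w, 3) output shape, the row-major token order, the int32 dtype, and the name init_ids_2d_sketch are all assumptions.

```python
import numpy as np


def init_ids_2d_sketch(dim, semantic_id=0, size=2):
    """Hypothetical stand-in for init_ids_2d; returns ids of shape (1, h*w, 3)."""
    # One grid entry per patch.
    h, w = dim[0] // size, dim[1] // size
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    sem = np.full((h, w), semantic_id)
    # Last axis ordering: (semantic, row, col).
    ids = np.stack([sem, rows, cols], axis=-1).astype(np.int32)
    # Flatten the grid row by row into a token sequence.
    return ids.reshape(1, h * w, 3)


# A 4x6 image with 2x2 patches gives a 2x3 grid, i.e. 6 tokens.
ids = init_ids_2d_sketch((4, 6), semantic_id=1, size=2)
# size=1 keeps one token per pixel.
per_pixel = init_ids_2d_sketch((4, 6), size=1)
```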
- gensbi.recipes.utils.init_ids_joint(dim_obs, dim_cond)
- Parameters:
dim_obs (int)
dim_cond (int)
- gensbi.recipes.utils.parse_training_config(config_path)
Parse training and optimizer configuration from a YAML config file.
Reads the training and optimizer sections of the config and returns a flat dictionary consumed by AbstractPipeline.
- Parameters:
config_path (str) – Path to the YAML configuration file.
- Returns:
training_config – Parsed training configuration dictionary.
- Return type:
dict
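The flattening step can be pictured with the sketch below, which starts from a nested dict such as one produced by loading the YAML file (for example with yaml.safe_load from PyYAML). Everything specific here is hypothetical: the helper name flatten_training_config, the example keys, and the choice that optimizer keys win on a name collision are assumptions. The real key names and precedence rules are defined by gensbi.

```python
from typing import Any, Dict


def flatten_training_config(raw: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Merge the 'training' and 'optimizer' sections into one flat dict."""
    flat: Dict[str, Any] = {}
    for section in ("training", "optimizer"):
        # Assumption: a missing section contributes nothing.
        flat.update(raw.get(section, {}))
    return flat


# Placeholder config; these keys are illustrative only.
raw_config = {
    "training": {"num_steps": 1000, "batch_size": 256},
    "optimizer": {"max_lr": 1e-4},
    "model": {"depth": 4},  # ignored: not a training or optimizer section
}
training_config = flatten_training_config(raw_config)
```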
- gensbi.recipes.utils.scale_lr(batch_size, base_lr=0.0001, reference_batch_size=256)
Scale learning rate based on batch size using square root scaling.
- Parameters:
batch_size (int) – The current batch size.
base_lr (float) – The base learning rate for the reference batch size.
reference_batch_size (int, optional) – The reference batch size. Defaults to 256.
- Returns:
The adjusted learning rate.
- Return type:
float
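Square root scaling is commonly written as lr = base_lr * sqrt(batch_size / reference_batch_size). The sketch below assumes scale_lr follows that standard form. The name scale_lr_sketch and the input validation are illustrative and are not taken from the gensbi source.

```python
import math


def scale_lr_sketch(batch_size: int, base_lr: float = 1e-4,
                    reference_batch_size: int = 256) -> float:
    """Assumed form: base_lr * sqrt(batch_size / reference_batch_size)."""
    if batch_size <= 0 or reference_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * math.sqrt(batch_size / reference_batch_size)


lr_same = scale_lr_sketch(256)   # reference batch size: learning rate unchanged
lr_4x = scale_lr_sketch(1024)    # 4x the batch size: learning rate doubles
lr_small = scale_lr_sketch(64)   # a quarter of the batch size: learning rate halves
```

Under this rule the learning rate grows with the square root of the batch-size ratio, so quadrupling the batch only doubles the step size.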