V0.7 Scope Freeze

Summary

v0.7 is the first structured external-data ingestion release for pdelie.

Its purpose is to ingest external structured 1D uniform rectilinear PDE data into canonical FieldBatch, so that the existing stable scalar Heat/Burgers symmetry and discovery stack can run on imported data rather than only on internally generated synthetic fixtures.

v0.7 is intentionally narrow. It is not a broad dataset-adapter release.


Stable Scope

Stable v0.7 scope is limited to:

  • pdelie.data.from_numpy(...)

  • pdelie.data.from_xarray(...)

  • strict conversion into canonical FieldBatch

  • structured 1D uniform rectilinear trajectory data only

  • scalar-variable stable slice only

  • explicit dims, coords, metadata, mask, and provenance validation

  • parity with the existing Heat/Burgers symmetry and discovery pipeline

Stable v0.7 release definition:

external structured arrays -> canonical FieldBatch -> existing PDELie pipeline


Exact Public API Contracts

Planned stable public APIs:

  • pdelie.data.from_numpy(values, *, dims, coords, var_name, metadata, mask=None, preprocess_log=None) -> FieldBatch

  • pdelie.data.from_xarray(data_array, *, var_name=None, metadata, mask=None, preprocess_log=None) -> FieldBatch

from_xarray(...) is frozen to xarray.DataArray only in v0.7. xarray.Dataset support is out of scope for the stable slice.
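As an illustration of the inputs the planned from_numpy(...) signature expects, the following sketch builds a synthetic ("time", "x") array together with its keyword arguments. It assumes that coords is a mapping from dimension name to a 1D array, which the contract above does not spell out, and the variable name "u" is only a placeholder. The call itself is left as a comment so the sketch runs without pdelie installed.

```python
import numpy as np

# Synthetic stand-in for externally produced data: a decaying sine wave on a
# periodic 1D grid, laid out as ("time", "x").
time = np.linspace(0.0, 1.0, 11)                        # 11 uniform time samples
x = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)   # 64 uniform periodic points
values = np.exp(-time)[:, None] * np.sin(x)[None, :]    # shape (11, 64)

# Keyword arguments following the planned from_numpy signature.
# ASSUMPTION: coords is a {dim name: 1D array} mapping.
kwargs = {
    "dims": ("time", "x"),
    "coords": {"time": time, "x": x},
    "var_name": "u",  # placeholder name; from_numpy always needs one
    "metadata": {
        "boundary_conditions": {"x": "periodic"},
        "grid_type": "rectilinear",
        "coordinate_system": "cartesian",
        "grid_regularity": "uniform",
        "parameter_tags": {},
    },
}

# Planned call once the importer exists (commented out on purpose):
# batch = pdelie.data.from_numpy(values, **kwargs)
```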

Variable-name rules:

  • from_numpy(...) always requires explicit var_name

  • from_xarray(...) resolves var_name by:

    • explicit var_name argument first

    • otherwise the single var coordinate value, when an explicit var axis exists

    • otherwise DataArray.name

    • otherwise validation failure
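The resolution order above can be sketched as a small pure-Python function. The helper name resolve_var_name and its arguments are hypothetical and not part of the frozen API; var_coord stands for the values of the var coordinate and is None when the source has no var axis.

```python
from typing import Optional, Sequence


def resolve_var_name(
    explicit: Optional[str],
    var_coord: Optional[Sequence[str]],
    array_name: Optional[str],
) -> str:
    """Hypothetical helper mirroring the documented precedence."""
    # 1. An explicit var_name argument always wins.
    if explicit is not None:
        return explicit
    # 2. Otherwise use the single value of the var coordinate, if present.
    if var_coord is not None:
        if len(var_coord) != 1:
            raise ValueError("var axis must have length exactly 1")
        return str(var_coord[0])
    # 3. Otherwise fall back to DataArray.name.
    if array_name is not None:
        return str(array_name)
    # 4. Nothing left to try: validation failure.
    raise ValueError("cannot resolve var_name; pass var_name explicitly")
```

For example, resolve_var_name(None, ["v"], "w") picks the coordinate value over the array name.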


Accepted Layouts

Stable accepted source layouts are:

  • ("time", "x")

  • ("batch", "time", "x")

  • ("time", "x", "var")

  • ("batch", "time", "x", "var")

Frozen axis/layout rules:

  • time is required

  • x is required

  • var may be omitted only for the scalar stable slice

  • if var is omitted, the importer injects a trailing singleton var axis

  • if var is present, its length must be exactly 1

  • no static / no-time layouts in stable v0.7

  • no dim aliases in stable v0.7

  • no y / z ingestion in stable v0.7
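A minimal sketch of the layout rules above, working on dimension names and shapes only. The function name normalize_layout and its return convention are hypothetical. The sketch only covers var-axis injection, which is the one normalization step this section describes, and leaves any batch-axis handling alone.

```python
from typing import Sequence, Tuple

ACCEPTED_LAYOUTS = {
    ("time", "x"),
    ("batch", "time", "x"),
    ("time", "x", "var"),
    ("batch", "time", "x", "var"),
}


def normalize_layout(
    dims: Sequence[str], shape: Sequence[int]
) -> Tuple[Tuple[str, ...], Tuple[int, ...], bool]:
    """Return (dims, shape, injected_var_axis) after var-axis injection."""
    dims = tuple(dims)
    shape = tuple(shape)
    # No aliases, no static layouts, no y / z: only the four exact layouts.
    if dims not in ACCEPTED_LAYOUTS:
        raise ValueError(f"unsupported layout {dims!r}")
    if len(dims) != len(shape):
        raise ValueError("dims and shape must have the same length")
    if dims[-1] == "var":
        # An explicit var axis must be a singleton in the scalar stable slice.
        if shape[-1] != 1:
            raise ValueError("var axis must have length exactly 1")
        return dims, shape, False
    # var omitted: inject a trailing singleton var axis.
    return dims + ("var",), shape + (1,), True
```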


Coordinate and Metadata Validation

Coordinate validation:

  • time and x coordinates are required

  • both must be 1D finite numeric arrays

  • time must be strictly increasing, uniform, and length >= 3

  • x must be strictly increasing, uniform, and length >= 4

  • x uniformity uses the current FieldBatch spatial tolerance policy

  • time uniformity is required because stable v0.7 imports target the current trajectory / discovery pipeline

  • no coordinate inference beyond extracting time / x from the provided inputs

  • no normalization, sorting, or repair of malformed coordinates
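The coordinate checks above can be sketched in pure Python as follows. The helper name validate_uniform_coord is hypothetical, it expects a flat sequence of numbers, and the relative tolerance rtol=1e-6 is a placeholder assumption: the actual FieldBatch spatial tolerance policy is not stated in this document.

```python
import math
from typing import Sequence


def validate_uniform_coord(
    name: str, values: Sequence[float], min_length: int, rtol: float = 1e-6
) -> None:
    """Raise ValueError unless values is finite, strictly increasing, uniform."""
    coord = [float(v) for v in values]
    # At least two points are needed to define a step at all.
    if len(coord) < max(min_length, 2):
        raise ValueError(f"{name} needs at least {min_length} points")
    if not all(math.isfinite(v) for v in coord):
        raise ValueError(f"{name} contains non-finite values")
    steps = [b - a for a, b in zip(coord, coord[1:])]
    if any(step <= 0.0 for step in steps):
        raise ValueError(f"{name} must be strictly increasing")
    # ASSUMPTION: uniformity is judged relative to the mean step size.
    mean_step = (coord[-1] - coord[0]) / len(steps)
    if any(abs(step - mean_step) > rtol * mean_step for step in steps):
        raise ValueError(f"{name} must be uniformly spaced")


# Usage following the documented minimum lengths; nothing is repaired or sorted.
validate_uniform_coord("time", [0.0, 0.1, 0.2], min_length=3)
validate_uniform_coord("x", [0.0, 0.5, 1.0, 1.5], min_length=4)
```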

Metadata requirements:

  • the caller must supply the full required FieldBatch metadata mapping

  • there is no stable metadata inference in v0.7

  • required keys remain:

    • boundary_conditions

    • grid_type

    • coordinate_system

    • grid_regularity

    • parameter_tags

  • stable v0.7 imported fields must validate as:

    • grid_type == "rectilinear"

    • grid_regularity == "uniform"

    • coordinate_system == "cartesian"

    • boundary_conditions["x"] == "periodic"

  • parameter_tags must be a mapping but may be empty

  • xarray attrs are not used for metadata inference in the stable slice
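A minimal sketch of the metadata requirements listed above. The function name validate_import_metadata is hypothetical, and the choice of exception types is an assumption; only the required keys and the expected values come from this document.

```python
from collections.abc import Mapping

REQUIRED_KEYS = (
    "boundary_conditions",
    "grid_type",
    "coordinate_system",
    "grid_regularity",
    "parameter_tags",
)

EXPECTED_VALUES = {
    "grid_type": "rectilinear",
    "grid_regularity": "uniform",
    "coordinate_system": "cartesian",
}


def validate_import_metadata(metadata) -> None:
    """Check caller-supplied metadata; nothing is inferred or filled in."""
    if not isinstance(metadata, Mapping):
        raise TypeError("metadata must be a mapping")
    missing = [key for key in REQUIRED_KEYS if key not in metadata]
    if missing:
        raise ValueError(f"missing metadata keys: {missing}")
    for key, expected in EXPECTED_VALUES.items():
        if metadata[key] != expected:
            raise ValueError(f"{key} must be {expected!r}")
    boundary = metadata["boundary_conditions"]
    if not isinstance(boundary, Mapping) or boundary.get("x") != "periodic":
        raise ValueError('boundary_conditions["x"] must be "periodic"')
    # parameter_tags may be empty, but it has to be a mapping.
    if not isinstance(metadata["parameter_tags"], Mapping):
        raise ValueError("parameter_tags must be a mapping")
```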


Mask / NaN Policy

Stable v0.7 importers preserve missing-data signals rather than normalizing them.

Frozen rules:

  • preserve both explicit masks and existing NaN / non-finite values

  • do not normalize NaN values into masks

  • do not normalize masks into NaN values

  • if mask is provided and var is injected, inject the singleton var axis into the mask too

  • from_numpy(...) accepts array-like mask

  • from_xarray(...) accepts xarray.DataArray mask only

  • mask must match the input layout before normalization and, after any var-axis injection, the shape of the imported values
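The preservation rules above can be illustrated with plain NumPy. This is a sketch of the intended behavior, not the importer itself. The mask polarity used here (True flags a cell) is an assumption, since this document does not define it.

```python
import numpy as np

# Source data in ("time", "x") layout with one NaN already present.
values = np.array(
    [
        [1.0, np.nan, 3.0, 4.0],
        [5.0, 6.0, 7.0, 8.0],
        [9.0, 10.0, 11.0, 12.0],
    ]
)

# Explicit mask flagging a different cell (ASSUMPTION: True means flagged).
mask = np.zeros(values.shape, dtype=bool)
mask[2, 3] = True

# Copy both arrays, then inject the trailing singleton var axis into each.
canonical_values = np.array(values, copy=True)[..., np.newaxis]
canonical_mask = np.array(mask, copy=True)[..., np.newaxis]

# The NaN stays a NaN and is not folded into the mask.
nan_preserved = bool(np.isnan(canonical_values[0, 1, 0]))
nan_not_masked = not bool(canonical_mask[0, 1, 0])

# The masked cell keeps its value and is not overwritten with NaN.
masked_value_kept = canonical_values[2, 3, 0] == 12.0
```

Both signals survive side by side, and the mask ends up with the same (3, 4, 1) shape as the values.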


Copy / Provenance Semantics

Stable importers always materialize owned canonical data.

Frozen copy rules:

  • always materialize and copy imported values

  • always copy coordinates

  • always copy masks when present

  • deep-copy metadata

  • if preprocess_log is omitted, start from []

  • if preprocess_log is provided, deep-copy it and then append exactly one new entry

Frozen provenance rule:

  • append exactly one provenance entry:

    • operation = "from_numpy" or "from_xarray"

    • parameters includes at least:

      • source_layout

      • imported_shape

      • injected_var_axis

      • mask_provided

No broader provenance schema is introduced in v0.7 Milestone 0 (M0).
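The preprocess_log rules above can be sketched as follows. The helper name extend_preprocess_log is hypothetical, and representing an entry as a dict with "operation" and "parameters" keys is an assumption about the entry layout; the four parameter names are taken from the list above.

```python
import copy
from typing import Any, Dict, List, Optional, Sequence


def extend_preprocess_log(
    preprocess_log: Optional[Sequence[Dict[str, Any]]],
    *,
    operation: str,
    source_layout: Sequence[str],
    imported_shape: Sequence[int],
    injected_var_axis: bool,
    mask_provided: bool,
) -> List[Dict[str, Any]]:
    """Return a new log: a deep copy of the input plus exactly one entry."""
    if operation not in ("from_numpy", "from_xarray"):
        raise ValueError(f"unexpected operation {operation!r}")
    # Start from [] when omitted; never mutate the caller's log.
    log = [] if preprocess_log is None else copy.deepcopy(list(preprocess_log))
    log.append(
        {
            "operation": operation,
            "parameters": {
                "source_layout": tuple(source_layout),
                "imported_shape": tuple(imported_shape),
                "injected_var_axis": injected_var_axis,
                "mask_provided": mask_provided,
            },
        }
    )
    return log
```

Because the input is deep-copied before the append, a log passed in by the caller keeps its original length and contents.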


Optional xarray Dependency Policy

Stable dependency behavior:

  • from_numpy(...) is core-only

  • from_xarray(...) is a runtime-optional path

  • xarray must be imported lazily inside the function / module path

  • if xarray is unavailable, calling from_xarray(...) raises ImportError with an install message

  • the optional dependency extra name for stable from_xarray(...) support is xarray
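One common way to implement the lazy-import behavior above is a small helper around importlib. The helper name require_optional and the wording of the install message are assumptions; only the ImportError behavior and the extra name xarray come from this document.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency at call time, or fail with a hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # ASSUMPTION: the exact message text is illustrative only.
        raise ImportError(
            f"{module_name} is required for this importer; "
            f"install the optional '{extra}' extra to enable it"
        ) from exc


# Inside from_xarray(...) the import would happen only when the function is
# called, so importing the package itself never needs xarray:
# xr = require_optional("xarray", "xarray")
```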


Explicit Non-goals

Out of stable v0.7 scope:

  • no xarray.Dataset stable support

  • no dim aliases

  • no static-field ingestion

  • no multidimensional ingestion

  • no y / z ingestion

  • no nonuniform-grid support

  • no metadata inference layer

  • no PDEBench-specific loader

  • no The Well adapter

  • no HDF5, netCDF, or Zarr stable loader

  • no weak-form methods

  • no operator methods

  • no stable KdV promotion piggybacked into ingestion work

  • no paper-specific experiment logic


Milestones

Planned v0.7 sequence:

  • Milestone 0 — external ingestion contract freeze

  • Milestone 1 — from_numpy(...)

  • Milestone 2 — from_xarray(...)

  • Milestone 3 — parity tests and compact v0.7 release gate