py_avro_schema

Generate Apache Avro schemas for Python types, including standard library dataclasses and Pydantic data models.

The main API is a single function, generate(). Its first argument is the Python type or class to generate the Avro schema for.

See also

Data types supported by py-avro-schema: Supported data types.

class py_avro_schema.DecimalType

Bases: object

A decimal type for type annotations, including hints for precision and scale.

Example

>>> import decimal
>>> from py_avro_schema import DecimalType
>>> my_decimal: DecimalType[4, 2] = decimal.Decimal("12.34")

Here, the subscript (4, 2) gives the precision (the total number of digits) and the scale (the number of digits after the decimal point) of the decimal numbers.
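As a rough illustration of what the two subscript values mean, the sketch below uses only the standard library decimal module to derive a precision and scale from a value. The helper name precision_and_scale is hypothetical and is not part of py_avro_schema; it only shows how the example value "12.34" corresponds to (4, 2).

```python
import decimal


def precision_and_scale(value: decimal.Decimal) -> tuple[int, int]:
    """Return (precision, scale) needed to hold a finite decimal value.

    Hypothetical helper for illustration only, not a py_avro_schema function.
    """
    _sign, digits, exponent = value.as_tuple()
    if exponent >= 0:
        # No fractional digits; trailing zeros implied by the exponent count too.
        return len(digits) + exponent, 0
    scale = -exponent
    # Values such as 0.01 store a single digit but still need `scale` digits.
    return max(len(digits), scale), scale


print(precision_and_scale(decimal.Decimal("12.34")))
```

A value with more digits than the declared precision, or more fractional digits than the declared scale, would not fit the annotated type.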

class py_avro_schema.Option(value)

Bases: Flag

Schema generation options

Options can be passed in to the function py_avro_schema.generate(). Multiple values are specified like this:

Option.INT_32 | Option.FLOAT_32

AUTO_NAMESPACE_MODULE = 131072

Automatically populate namespaces using full (dotted) module names instead of top-level package names.

DEFAULTS_MANDATORY = 16384

Mandate default values to be specified for all dataclass fields. This option may be used to enforce default values on Avro record fields to support schema evolution/resolution.

FLOAT_32 = 4096

Use float schemas (32-bit) instead of double schemas (64-bit) for Python class float.

INT_32 = 2048

Use int schemas (32-bit) instead of long schemas (64-bit) for Python int.

JSON_APPEND_NEWLINE = 1024

Append a newline character at the end of the JSON data.

JSON_INDENT_2 = 1

Format JSON data using 2-space indentation.

JSON_SORT_KEYS = 32

Sort keys in JSON data.

LOGICAL_JSON_STRING = 32768

Model Dict[str, Any] fields as string schemas instead of bytes schemas (with logical type json, to support JSON serialization inside Avro).

MILLISECONDS = 8192

Use millisecond instead of microsecond precision for (date)time schemas.

NO_AUTO_NAMESPACE = 65536

Do not populate namespaces automatically based on the package a Python class is defined in.

NO_DOC = 262144

Do not populate doc schema attributes based on Python docstrings.
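Because Option derives from Flag, members combine with the bitwise OR operator and the result can be tested for membership. The sketch below shows this with a small stand-in class, here called OptionSketch (a hypothetical name), that copies three of the values listed above. It is not the real py_avro_schema.Option and is only meant to show how the flag values add up.

```python
import enum


class OptionSketch(enum.Flag):
    """Stand-in for illustration; copies a few values from the list above."""

    INT_32 = 2048
    FLOAT_32 = 4096
    MILLISECONDS = 8192


# Combine two options into a single value, as in Option.INT_32 | Option.FLOAT_32.
combined = OptionSketch.INT_32 | OptionSketch.FLOAT_32

print(combined.value)                          # 2048 + 4096 = 6144
print(OptionSketch.INT_32 in combined)         # True
print(OptionSketch.MILLISECONDS in combined)   # False
```

Each member occupies its own bit, so any combination of options maps to a distinct integer value.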

py_avro_schema.generate(py_type: typing.Type, *, namespace: typing.Optional[str] = None, options: py_avro_schema.Option = Option.None) -> bytes

Return an Avro schema as a JSON-formatted bytestring for a given Python class or instance.

This function is cached and can be called repeatedly with the same arguments without any performance penalty.

Parameters:
  • py_type – The Python class to generate a schema for.

  • namespace – The Avro namespace to add to schemas.

  • options – Schema generation options as defined by Option enum values. Specify multiple values like this: Option.INT_32 | Option.FLOAT_32.