py_avro_schema

Generate Apache Avro schemas for Python types, including standard library dataclasses and Pydantic data models.

The main API is a single function, generate(). Its first argument is the Python type or class to generate the Avro schema for.

See also

Data types supported by py-avro-schema: Supported data types.

class py_avro_schema.DecimalMeta(precision: int, scale: int | None = None)

Bases: object

Metadata to annotate a decimal.Decimal with precision and scale information

Example

>>> import decimal
>>> from typing import Annotated
>>> from py_avro_schema import DecimalMeta
>>> my_decimal: Annotated[decimal.Decimal, DecimalMeta(precision=4, scale=2)]

If scale is omitted, it defaults to zero, as per the Avro specification.
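To make the two numbers concrete: in Avro's decimal logical type, precision is the total number of digits and scale is the number of digits after the decimal point. The following is a minimal sketch using only the standard library; the helper name precision_and_scale is hypothetical and not part of py-avro-schema, and it assumes finite values.

```python
import decimal


def precision_and_scale(value: decimal.Decimal) -> tuple[int, int]:
    """Return the smallest (precision, scale) that holds a finite value exactly."""
    _sign, digits, exponent = value.as_tuple()
    # Scale: number of digits to the right of the decimal point
    scale = max(-exponent, 0)
    # Precision: total digits, padded for positive exponents, never below scale
    precision = max(len(digits) + max(exponent, 0), scale)
    return precision, scale


print(precision_and_scale(decimal.Decimal("12.34")))  # (4, 2)
print(precision_and_scale(decimal.Decimal("1200")))   # (4, 0)
```

A value such as decimal.Decimal("12.34") therefore fits an annotation with precision=4 and scale=2, while a whole number needs no scale at all.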

class py_avro_schema.DecimalType

Bases: object

A decimal type for type annotations including hints for precision and scale

Deprecated. See DecimalMeta instead.

Example

>>> import decimal
>>> from py_avro_schema import DecimalType
>>> my_decimal: DecimalType[4, 2] = decimal.Decimal("12.34")

Here, the subscript (4, 2) refers to the precision and scale of decimal numbers.

class py_avro_schema.Option(value)

Bases: Flag

Schema generation options

Options are passed to the function py_avro_schema.generate(). Multiple values are combined like this:

Option.INT_32 | Option.FLOAT_32

AUTO_NAMESPACE_MODULE = 131072

Automatically populate namespaces using full (dotted) module names instead of top-level package names.

DEFAULTS_MANDATORY = 16384

Mandate default values to be specified for all dataclass fields. This option may be used to enforce default values on Avro record fields to support schema evolution/resolution.

FLOAT_32 = 4096

Use float schemas (32-bit) instead of double schemas (64-bit) for Python class float.

INT_32 = 2048

Use int schemas (32-bit) instead of long schemas (64-bit) for Python int.

JSON_APPEND_NEWLINE = 1024

Append a newline character at the end of the JSON data

JSON_INDENT_2 = 1

Format JSON data using 2 spaces indentation

JSON_SORT_KEYS = 32

Sort keys in JSON data

LOGICAL_JSON_STRING = 32768

Model Dict[str, Any] fields as string schemas instead of bytes schemas (with logical type json, to support JSON serialization inside Avro).

MILLISECONDS = 8192

Use milliseconds instead of microseconds precision for (date)time schemas

NO_AUTO_NAMESPACE = 65536

Do not populate namespaces automatically based on the package a Python class is defined in.

NO_DOC = 262144

Do not populate doc schema attributes based on Python docstrings

USE_FIELD_ALIAS = 524288

Use the alias specified in a class’s Field instead of the field’s name. This currently only affects Pydantic models.
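Because Option derives from Flag, members combine with the bitwise OR operator and can be tested for membership with the in operator. The sketch below uses a stand-in enum that copies three of the numeric values listed above, so it runs without the package installed; it is not the library class itself.

```python
import enum


class Option(enum.Flag):
    """Stand-in mirroring three documented values; not the py-avro-schema class."""

    JSON_INDENT_2 = 1
    INT_32 = 2048
    FLOAT_32 = 4096


# Combine several options into a single value
combined = Option.INT_32 | Option.FLOAT_32

print(combined.value)                   # 6144
print(Option.INT_32 in combined)        # True
print(Option.JSON_INDENT_2 in combined) # False
```

Each member occupies its own bit, which is why the documented values are all powers of two and why any combination can be represented by a single integer.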

exception py_avro_schema.TypeNotSupportedError

Bases: TypeError

Error raised when an Avro schema cannot be generated for a given Python type

py_avro_schema.generate(py_type: Type, *, namespace: str | None = None, options: Option = Option.None) → bytes

Return an Avro schema as a JSON-formatted bytestring for a given Python class or instance

This function is cached and can be called repeatedly with the same arguments without any performance penalty.

Parameters:
  • py_type – The Python class to generate a schema for.

  • namespace – The Avro namespace to add to schemas.

  • options – Schema generation options as defined by Option enum values. Specify multiple values like this: Option.INT_32 | Option.FLOAT_32.
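A minimal usage sketch of generate() with the parameters described above. The Ship dataclass and the namespace "example" are hypothetical and chosen only for illustration. The import is guarded so that the snippet still runs where py-avro-schema is not installed; in that case no schema is produced.

```python
import dataclasses
import json

try:
    import py_avro_schema as pas
except ImportError:  # package not installed: only the dataclass is defined
    pas = None


@dataclasses.dataclass
class Ship:
    """A hypothetical record type used only for this example."""

    name: str
    year_launched: int


schema = None
if pas is not None:
    # Combine options with "|", as described for the options parameter
    schema_json = pas.generate(
        Ship,
        namespace="example",
        options=pas.Option.JSON_INDENT_2 | pas.Option.JSON_SORT_KEYS,
    )
    # generate() returns bytes: decode to print, or parse with json.loads
    print(schema_json.decode())
    schema = json.loads(schema_json)
```

Since the return value is a JSON bytestring, it can be written to a file directly or parsed into a dictionary for inspection, as done in the last line.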