pydantic

Current Version: v0.32.2

Data validation and settings management using python type hinting.

Define how data should be in pure, canonical python; validate it with pydantic.

PEP 484 introduced type hinting into python 3.5; PEP 526 extended that with syntax for variable annotation in python 3.6.

pydantic uses those annotations to validate that untrusted data takes the form you want.

There’s also support for an extension to dataclasses where the input data is validated.

Example:

from datetime import datetime
from typing import List
from pydantic import BaseModel

class User(BaseModel):
    id: int
    name = 'John Doe'
    signup_ts: datetime = None
    friends: List[int] = []

external_data = {'id': '123', 'signup_ts': '2017-06-01 12:22', 'friends': [1, '2', b'3']}
user = User(**external_data)
print(user)
# > User id=123 name='John Doe' signup_ts=datetime.datetime(2017, 6, 1, 12, 22) friends=[1, 2, 3]
print(user.id)
# > 123

(This script is complete, it should run “as is”)

What’s going on here:

  • id is of type int; the annotation-only declaration tells pydantic that this field is required. Strings, bytes or floats will be coerced to ints if possible; otherwise an exception will be raised.

  • name is inferred as a string from the default; it is not required as it has a default.

  • signup_ts is a datetime field which is not required (None if it’s not supplied); pydantic will process either a unix timestamp int (e.g. 1496498400) or a string representing the date & time.

  • friends uses python’s typing system; it must be a list of integers. As with id, integer-like objects will be converted to integers.

If validation fails, pydantic will raise an error with a breakdown of what was wrong:

from pydantic import ValidationError
try:
    User(signup_ts='broken', friends=[1, 2, 'not number'])
except ValidationError as e:
    print(e.json())

"""
[
  {
    "loc": [
      "id"
    ],
    "msg": "field required",
    "type": "value_error.missing"
  },
  {
    "loc": [
      "signup_ts"
    ],
    "msg": "invalid datetime format",
    "type": "type_error.datetime"
  },
  {
    "loc": [
      "friends",
      2
    ],
    "msg": "value is not a valid integer",
    "type": "type_error.integer"
  }
]
"""
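
The same breakdown can also be inspected programmatically rather than as a JSON string. The sketch below assumes that ValidationError.errors() returns one dict per failure with the loc, msg and type keys shown above; Account and error_locations are made-up names used only for illustration.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class Account(BaseModel):  # hypothetical model, a cut-down stand-in for User
    id: int
    friends: List[int] = []


def error_locations(data):
    """Return the loc of every validation error, or an empty list if data is valid."""
    try:
        Account(**data)
    except ValidationError as e:
        # errors() is assumed to mirror e.json(): one dict per failure
        return [tuple(err['loc']) for err in e.errors()]
    return []


print(error_locations({'friends': [1, 2, 'not number']}))
```

Here id is missing and the third friend is not a number, so both ('id',) and ('friends', 2) should appear in the result.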

Rationale

So pydantic uses some cool new language feature, but why should I actually go and use it?

no brainfuck

no new schema definition micro-language to learn. If you know python (and perhaps skim read the type hinting docs) you know how to use pydantic.

plays nicely with your IDE/linter/brain

because pydantic data structures are just instances of classes you define; auto-completion, linting, mypy and your intuition should all work properly with your validated data.

dual use

pydantic’s BaseSettings class allows it to be used in both a “validate this request data” context and “load my system settings” context. The main difference being that system settings can have defaults changed by environment variables and more complex objects like DSNs and python objects are often required.

fast

In benchmarks pydantic is faster than all other tested libraries.

validate complex structures

use of recursive pydantic models, typing’s List and Dict etc. and validators allow complex data schemas to be clearly and easily defined and then checked.

extensible

pydantic allows custom data types to be defined or you can extend validation with methods on a model decorated with the validator decorator.

Install

Just:

pip install pydantic

pydantic has no required dependencies except python 3.6 or 3.7 (and the dataclasses package in python 3.6). If you’ve got python 3.6 and pip installed - you’re good to go.

pydantic can optionally be compiled with cython which should give a 30-50% performance improvement. manylinux binaries exist for python 3.6 and 3.7, so if you’re installing from PyPI on linux, you should get pydantic compiled with no extra work. If you’re installing manually, install cython before installing pydantic and you should get pydantic compiled. Compilation with cython is not tested on windows or mac. [issue]

To test if pydantic is compiled run:

import pydantic
print('compiled:', pydantic.compiled)

If you want pydantic to parse JSON faster you can add ujson as an optional dependency. Similarly, pydantic’s email validation relies on email-validator.

pip install pydantic[ujson]
# or
pip install pydantic[email]
# or just
pip install pydantic[ujson,email]

Of course you can also install these requirements manually with pip install ....

Pydantic is also available on conda under the conda-forge channel:

conda install pydantic -c conda-forge

Usage

PEP 484 Types

pydantic uses typing types to define more complex objects.

from typing import Dict, List, Optional, Sequence, Set, Tuple, Union

from pydantic import BaseModel


class Model(BaseModel):
    simple_list: list = None
    list_of_ints: List[int] = None

    simple_tuple: tuple = None
    tuple_of_different_types: Tuple[int, float, str, bool] = None

    simple_dict: dict = None
    dict_str_float: Dict[str, float] = None

    simple_set: set = None
    set_bytes: Set[bytes] = None

    str_or_bytes: Union[str, bytes] = None
    none_or_str: Optional[str] = None

    sequence_of_ints: Sequence[int] = None

    compound: Dict[Union[str, bytes], List[Set[int]]] = None

print(Model(simple_list=['1', '2', '3']).simple_list)  # > ['1', '2', '3']
print(Model(list_of_ints=['1', '2', '3']).list_of_ints)  # > [1, 2, 3]

print(Model(simple_dict={'a': 1, b'b': 2}).simple_dict)  # > {'a': 1, b'b': 2}
print(Model(dict_str_float={'a': 1, b'b': 2}).dict_str_float)  # > {'a': 1.0, 'b': 2.0}

print(Model(simple_tuple=[1, 2, 3, 4]).simple_tuple)  # > (1, 2, 3, 4)
print(Model(tuple_of_different_types=[1, 2, 3, 4]).tuple_of_different_types)  # > (1, 2.0, '3', True)

print(Model(sequence_of_ints=[1, 2, 3, 4]).sequence_of_ints)  # > [1, 2, 3, 4]
print(Model(sequence_of_ints=(1, 2, 3, 4)).sequence_of_ints)  # > (1, 2, 3, 4)

(This script is complete, it should run “as is”)

dataclasses

Note

New in version v0.14.

If you don’t want to use pydantic’s BaseModel you can instead get the same data validation on standard dataclasses (introduced in python 3.7).

Dataclasses work in python 3.6 using the dataclasses backport package.

from datetime import datetime
from pydantic.dataclasses import dataclass

@dataclass
class User:
    id: int
    name: str = 'John Doe'
    signup_ts: datetime = None


user = User(id='42', signup_ts='2032-06-21T12:00')
print(user)
# > User(id=42, name='John Doe', signup_ts=datetime.datetime(2032, 6, 21, 12, 0))

(This script is complete, it should run “as is”)
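
For contrast, the standard library decorator on its own records the annotations but neither checks nor converts the values it is given. A minimal sketch using only the standard library (PlainUser is a made-up name):

```python
from dataclasses import dataclass


@dataclass
class PlainUser:  # hypothetical name; standard library decorator only
    id: int
    name: str = 'John Doe'


plain = PlainUser(id='42')
print(repr(plain.id))
# > '42'
```

The value stays a str here, whereas the pydantic decorator in the example above turns '42' into the int 42.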

You can use all the standard pydantic field types and the resulting dataclass will be identical to the one created by the standard library dataclass decorator.

pydantic.dataclasses.dataclass’s arguments are the same as the standard decorator’s, except for one extra keyword argument, config, which has the same meaning as Config.

Note

As a side effect of getting pydantic dataclasses to play nicely with mypy, the config argument will show as invalid in IDEs and mypy. Use @dataclass(..., config=Config) # type: ignore as a workaround. See python/mypy#6239 for an explanation of why this is.

Nested dataclasses

Since version v0.17 nested dataclasses are supported both in dataclasses and normal models.

from pydantic import UrlStr
from pydantic.dataclasses import dataclass

@dataclass
class NavbarButton:
    href: UrlStr

@dataclass
class Navbar:
    button: NavbarButton

navbar = Navbar(button=('https://example.com',))
print(navbar)
# > Navbar(button=NavbarButton(href='https://example.com'))

(This script is complete, it should run “as is”)

Dataclass attributes can be populated by tuples, dictionaries or instances of that dataclass.

Initialize hooks

Since version v0.28 when you initialize a dataclass, it is possible to execute code after validation with the help of __post_init_post_parse__. This is not the same as __post_init__ which executes code before validation.

from datetime import datetime
from pydantic.dataclasses import dataclass

@dataclass
class Birth:
    year: int
    month: int
    day: int


@dataclass
class User:
    birth: Birth

    def __post_init__(self):
        print(self.birth)
        # > {'year': 1995, 'month': 3, 'day': 2}

    def __post_init_post_parse__(self):
        print(self.birth)
        # > Birth(year=1995, month=3, day=2)


user = User(**{'birth': {'year': 1995, 'month': 3, 'day': 2}})

(This script is complete, it should run “as is”)

Choices

pydantic uses python’s standard enum classes to define choices.

from enum import Enum, IntEnum

from pydantic import BaseModel


class FruitEnum(str, Enum):
    pear = 'pear'
    banana = 'banana'


class ToolEnum(IntEnum):
    spanner = 1
    wrench = 2


class CookingModel(BaseModel):
    fruit: FruitEnum = FruitEnum.pear
    tool: ToolEnum = ToolEnum.spanner


print(CookingModel())
# > CookingModel fruit=<FruitEnum.pear: 'pear'> tool=<ToolEnum.spanner: 1>
print(CookingModel(tool=2, fruit='banana'))
# > CookingModel fruit=<FruitEnum.banana: 'banana'> tool=<ToolEnum.wrench: 2>
print(CookingModel(fruit='other'))
# will raise a validation error

(This script is complete, it should run “as is”)
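
In plain python, looking a raw value up on an enum class is done by calling the class, which raises ValueError when no member matches. The sketch below shows only that standard library behaviour; how pydantic performs the check internally may differ, and to_fruit is a hypothetical helper.

```python
from enum import Enum


class FruitEnum(str, Enum):
    pear = 'pear'
    banana = 'banana'


def to_fruit(value):
    """Look a raw value up by calling the enum class; return None if no member matches."""
    try:
        return FruitEnum(value)
    except ValueError:
        return None


print(to_fruit('banana') is FruitEnum.banana)
# > True
print(to_fruit('other'))
# > None
```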

Validators

Custom validation and complex relationships between objects can be achieved using the validator decorator.

from pydantic import BaseModel, ValidationError, validator


class UserModel(BaseModel):
    name: str
    password1: str
    password2: str

    @validator('name')
    def name_must_contain_space(cls, v):
        if ' ' not in v:
            raise ValueError('must contain a space')
        return v.title()

    @validator('password2')
    def passwords_match(cls, v, values, **kwargs):
        if 'password1' in values and v != values['password1']:
            raise ValueError('passwords do not match')
        return v


print(UserModel(name='samuel colvin', password1='zxcvbn', password2='zxcvbn'))
# > UserModel name='Samuel Colvin' password1='zxcvbn' password2='zxcvbn'

try:
    UserModel(name='samuel', password1='zxcvbn', password2='zxcvbn2')
except ValidationError as e:
    print(e)
"""
2 validation errors
name
  must contain a space (type=value_error)
password2
  passwords do not match (type=value_error)
"""

(This script is complete, it should run “as is”)

A few things to note on validators:

  • validators are “class methods”; the first value they receive here will be the UserModel class, not an instance of UserModel

  • their signature can be (cls, value) or (cls, value, values, config, field). As of v0.20, any subset of values, config and field is also permitted, e.g. (cls, value, field); however, due to the way validators are inspected, the variadic keyword argument (“**kwargs”) must be called kwargs.

  • validators should either return the new value or raise a ValueError or TypeError

  • where validators rely on other values, you should be aware that:

    • Validation is done in the order fields are defined, eg. here password2 has access to password1 (and name), but password1 does not have access to password2. You should heed the warning below regarding field order and required fields.

    • If validation fails on another field (or that field is missing) it will not be included in values, hence if 'password1' in values and ... in this example.
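
The two points about values can be observed directly. This is a small sketch with a made-up Pair model: the validator on second simply records which earlier fields were visible to it on each call.

```python
from pydantic import BaseModel, ValidationError, validator

seen = []  # records which earlier fields each validator call could see


class Pair(BaseModel):  # hypothetical model for this sketch
    first: int
    second: int

    @validator('second')
    def record_visible_fields(cls, v, values):
        seen.append(sorted(values))
        return v


Pair(first=1, second=2)
try:
    # first fails validation, so it should be absent from values
    Pair(first='not a number', second=2)
except ValidationError:
    pass

print(seen)
```

The first call should record ['first'] and the second an empty list, which is why validators that depend on other fields need a check such as if 'password1' in values.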

Note

From v0.18 onwards validators are not called on keys of dictionaries. If you wish to validate keys, use whole (see below).

Pre and Whole Validators

Validators can do a few more complex things:

import json
from typing import List

from pydantic import BaseModel, ValidationError, validator


class DemoModel(BaseModel):
    numbers: List[int] = []
    people: List[str] = []

    @validator('people', 'numbers', pre=True, whole=True)
    def json_decode(cls, v):
        if isinstance(v, str):
            try:
                return json.loads(v)
            except ValueError:
                pass
        return v

    @validator('numbers')
    def check_numbers_low(cls, v):
        if v > 4:
            raise ValueError(f'number too large {v} > 4')
        return v

    @validator('numbers', whole=True)
    def check_sum_numbers_low(cls, v):
        if sum(v) > 8:
            raise ValueError('sum of numbers greater than 8')
        return v


print(DemoModel(numbers='[1, 1, 2, 2]'))
# > DemoModel numbers=[1, 1, 2, 2] people=[]

try:
    DemoModel(numbers='[1, 2, 5]')
except ValidationError as e:
    print(e)
"""
1 validation error
numbers -> 2
  number too large 5 > 4 (type=value_error)
"""

try:
    DemoModel(numbers=[3, 3, 3])
except ValidationError as e:
    print(e)
"""
1 validation error
numbers
  sum of numbers greater than 8 (type=value_error)
"""

(This script is complete, it should run “as is”)

A few more things to note:

  • a single validator can apply to multiple fields, either by defining multiple fields or by the special value '*' which means that validator will be called for all fields.

  • the keyword argument pre will cause validators to be called prior to other validation

  • the whole keyword argument will mean validators are applied to entire objects rather than individual values (applies for complex typing objects eg. List, Dict, Set)

Validate Always

For performance reasons by default validators are not called for fields where the value is not supplied. However there are situations where it’s useful or required to always call the validator, e.g. to set a dynamic default value.

from datetime import datetime

from pydantic import BaseModel, validator


class DemoModel(BaseModel):
    ts: datetime = None

    @validator('ts', pre=True, always=True)
    def set_ts_now(cls, v):
        return v or datetime.now()


print(DemoModel())
# > DemoModel ts=datetime.datetime(2017, 11, 8, 13, 59, 11, 723629)

print(DemoModel(ts='2017-11-08T14:00'))
# > DemoModel ts=datetime.datetime(2017, 11, 8, 14, 0)

(This script is complete, it should run “as is”)

You’ll often want to use this together with pre, since otherwise, with always=True, pydantic would try to validate the default None, which would cause an error.

Dataclass Validators

Validators also work in Dataclasses.

from datetime import datetime

from pydantic import validator
from pydantic.dataclasses import dataclass


@dataclass
class DemoDataclass:
    ts: datetime = None

    @validator('ts', pre=True, always=True)
    def set_ts_now(cls, v):
        return v or datetime.now()


print(DemoDataclass())
# > DemoDataclass(ts=datetime.datetime(2019, 4, 2, 18, 1, 46, 66149))

print(DemoDataclass(ts='2017-11-08T14:00'))
# > DemoDataclass(ts=datetime.datetime(2017, 11, 8, 14, 0))

(This script is complete, it should run “as is”)

Field Checks

On class creation validators are checked to confirm that the fields they specify actually exist on the model.

Occasionally however this is not wanted: when you define a validator to validate fields on inheriting models. In this case you should set check_fields=False on the validator.
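
A minimal sketch of that situation, assuming validators defined on a parent model are inherited by its subclasses; StripNameMixin and Customer are made-up names.

```python
from pydantic import BaseModel, validator


class StripNameMixin(BaseModel):  # hypothetical base model: declares no fields itself
    # 'name' does not exist on this class, so the field check is switched off
    @validator('name', check_fields=False)
    def strip_name(cls, v):
        return v.strip()


class Customer(StripNameMixin):  # the inheriting model supplies the field
    name: str


print(repr(Customer(name='  Anna ').name))
```

Without check_fields=False, defining StripNameMixin would be expected to fail at class creation, since it has no name field of its own.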

Recursive Models

More complex hierarchical data structures can be defined using models as types in annotations themselves.

The ellipsis ... just means “Required”, the same as the annotation-only declarations above.

from typing import List
from pydantic import BaseModel

class Foo(BaseModel):
    count: int = ...
    size: float = None

class Bar(BaseModel):
    apple = 'x'
    banana = 'y'

class Spam(BaseModel):
    foo: Foo = ...
    bars: List[Bar] = ...


m = Spam(foo={'count': 4}, bars=[{'apple': 'x1'}, {'apple': 'x2'}])
print(m)
# > Spam foo=<Foo count=4 size=None> bars=[<Bar apple='x1' banana='y'>, <Bar apple='x2' banana='y'>]
print(m.dict())
# > {'foo': {'count': 4, 'size': None}, 'bars': [{'apple': 'x1', 'banana': 'y'}, {'apple': 'x2', 'banana': 'y'}]}

(This script is complete, it should run “as is”)

Self-referencing Models

Data structures with self-referencing models are also supported, provided the function update_forward_refs() is called once the model is created (you will be reminded with a friendly error message if you don’t).

Within the model, you can refer to the not-yet-constructed model by a string:

from pydantic import BaseModel

class Foo(BaseModel):
    a: int = 123
    #: The sibling of `Foo` is referenced by string
    sibling: 'Foo' = None

Foo.update_forward_refs()

print(Foo())
#> Foo a=123 sibling=None
print(Foo(sibling={'a': '321'}))
#> Foo a=123 sibling=<Foo a=321 sibling=None>

(This script is complete, it should run “as is”)
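
Some background on why a separate resolution step is needed at all: a string annotation is stored as a plain string and can only be turned into the real type once the class exists. The sketch below uses only the standard library and a made-up Node class; it illustrates the general mechanism, not pydantic’s own implementation of update_forward_refs().

```python
from typing import get_type_hints


class Node:  # plain class, hypothetical name
    sibling: 'Node'


# The raw annotation is just the string written in the class body.
print(repr(Node.__annotations__['sibling']))
# > 'Node'

# Resolving it needs the finished class. The namespaces are passed explicitly
# so the lookup does not depend on where this snippet is run from.
hints = get_type_hints(Node, globalns=globals(), localns={'Node': Node})
print(hints['sibling'] is Node)
# > True
```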

Since python 3.7, you can also refer to it by its type, provided you import annotations (see the relevant paragraph for support depending on Python and pydantic versions).

from __future__ import annotations
from pydantic import BaseModel

class Foo(BaseModel):
    a: int = 123
    #: The sibling of `Foo` is referenced directly by type
    sibling: Foo = None

Foo.update_forward_refs()

print(Foo())
#> Foo a=123 sibling=None
print(Foo(sibling={'a': '321'}))
#> Foo a=123 sibling=<Foo a=321 sibling=None>

(This script is complete, it should run “as is”)

Generic Models

Note

New in version v0.29.

This feature requires Python 3.7+.

Pydantic supports the creation of generic models to make it easier to reuse a common model structure.

In order to declare a generic model, you perform the following steps:

  • Declare one or more typing.TypeVar instances to use to parameterize your model.

  • Declare a pydantic model that inherits from pydantic.generics.GenericModel and typing.Generic, where you pass the TypeVar instances as parameters to typing.Generic.

  • Use the TypeVar instances as annotations where you will want to replace them with other types or pydantic models.

Here is an example using GenericModel to create an easily-reused HTTP response payload wrapper:

from typing import Generic, TypeVar, Optional, List

from pydantic import BaseModel, validator, ValidationError
from pydantic.generics import GenericModel


DataT = TypeVar('DataT')


class Error(BaseModel):
    code: int
    message: str


class DataModel(BaseModel):
    numbers: List[int]
    people: List[str]


class Response(GenericModel, Generic[DataT]):
    data: Optional[DataT]
    error: Optional[Error]

    @validator('error', always=True)
    def check_consistency(cls, v, values):
        if v is not None and values['data'] is not None:
            raise ValueError('must not provide both data and error')
        if v is None and values.get('data') is None:
            raise ValueError('must provide data or error')
        return v


data = DataModel(numbers=[1, 2, 3], people=[])
error = Error(code=404, message='Not found')

print(Response[int](data=1))
# > Response[int] data=1 error=None
print(Response[str](data='value'))
# > Response[str] data='value' error=None
print(Response[str](data='value').dict())
# > {'data': 'value', 'error': None}
print(Response[DataModel](data=data).dict())
# > {'data': {'numbers': [1, 2, 3], 'people': []}, 'error': None}
print(Response[DataModel](error=error).dict())
# > {'data': None, 'error': {'code': 404, 'message': 'Not found'}}

try:
    Response[int](data='value')
except ValidationError as e:
    print(e)
"""
4 validation errors
data
  value is not a valid integer (type=type_error.integer)
data
  value is not none (type=type_error.none.allowed)
error
  value is not a valid dict (type=type_error.dict)
error
  must provide data or error (type=value_error)
"""

(This script is complete, it should run “as is”)

If you set Config or make use of validator in your generic model definition, it is applied to concrete subclasses in the same way as when inheriting from BaseModel. Any methods defined on your generic class will also be inherited.

Pydantic’s generics also integrate properly with mypy, so you get all the type checking you would expect mypy to provide if you were to declare the type without using GenericModel.

Note

Internally, pydantic uses create_model to generate a (cached) concrete BaseModel at runtime, so there is essentially zero overhead introduced by making use of GenericModel.

ORM Mode (aka Arbitrary Class Instances)

Pydantic models can be created from arbitrary class instances to support models that map to ORM objects.

To do this:

  1. The Config property orm_mode must be set to True.

  2. The special constructor from_orm must be used to create the model instance.

The example here uses SQLAlchemy but the same approach should work for any ORM.

from typing import List
from sqlalchemy import Column, Integer, String
from sqlalchemy.dialects.postgresql import ARRAY
from sqlalchemy.ext.declarative import declarative_base
from pydantic import BaseModel, constr

Base = declarative_base()

class CompanyOrm(Base):
    __tablename__ = 'companies'
    id = Column(Integer, primary_key=True, nullable=False)
    public_key = Column(String(20), index=True, nullable=False, unique=True)
    name = Column(String(63), unique=True)
    domains = Column(ARRAY(String(255)))

class CompanyModel(BaseModel):
    id: int
    public_key: constr(max_length=20)
    name: constr(max_length=63)
    domains: List[constr(max_length=255)]

    class Config:
        orm_mode = True

co_orm = CompanyOrm(id=123, public_key='foobar', name='Testing', domains=['example.com', 'foobar.com'])
print(co_orm)
#> <__main__.CompanyOrm object at 0x7ff4bf918278>
co_model = CompanyModel.from_orm(co_orm)
print(co_model)
#> CompanyModel id=123 public_key='foobar' name='Testing' domains=['example.com', 'foobar.com']

(This script is complete, it should run “as is”)

ORM instances will be parsed with from_orm recursively as well as at the top level.

Here a vanilla class is used to demonstrate the principle, but any ORM could be used instead.

from typing import List
from pydantic import BaseModel

class PetCls:
    def __init__(self, *, name: str, species: str):
        self.name = name
        self.species = species

class PersonCls:
    def __init__(self, *, name: str, age: float = None, pets: List[PetCls]):
        self.name = name
        self.age = age
        self.pets = pets

class Pet(BaseModel):
    name: str
    species: str

    class Config:
        orm_mode = True

class Person(BaseModel):
    name: str
    age: float = None
    pets: List[Pet]

    class Config:
        orm_mode = True

bones = PetCls(name='Bones', species='dog')
orion = PetCls(name='Orion', species='cat')
anna = PersonCls(name='Anna', age=20, pets=[bones, orion])
anna_model = Person.from_orm(anna)
print(anna_model)
#> Person name='Anna' pets=[<Pet name='Bones' species='dog'>, <Pet name='Orion' species='cat'>] age=20.0

(This script is complete, it should run “as is”)

Schema Creation

Pydantic allows auto creation of JSON Schemas from models:

from enum import Enum
from pydantic import BaseModel, Schema

class FooBar(BaseModel):
    count: int
    size: float = None

class Gender(str, Enum):
    male = 'male'
    female = 'female'
    other = 'other'
    not_given = 'not_given'

class MainModel(BaseModel):
    """
    This is the description of the main model
    """
    foo_bar: FooBar = Schema(...)
    gender: Gender = Schema(
        None,
        alias='Gender',
    )
    snap: int = Schema(
        42,
        title='The Snap',
        description='this is the value of snap',
        gt=30,
        lt=50,
    )

    class Config:
        title = 'Main'

print(MainModel.schema())
# > {
#       'type': 'object',
#       'title': 'Main',
#       'properties': {
#           'foo_bar': {
#           ...
print(MainModel.schema_json(indent=2))

(This script is complete, it should run “as is”)

Outputs:

{
  "title": "Main",
  "description": "This is the description of the main model",
  "type": "object",
  "properties": {
    "foo_bar": {
      "$ref": "#/definitions/FooBar"
    },
    "Gender": {
      "title": "Gender",
      "enum": [
        "male",
        "female",
        "other",
        "not_given"
      ],
      "type": "string"
    },
    "snap": {
      "title": "The Snap",
      "description": "this is the value of snap",
      "default": 42,
      "exclusiveMinimum": 30,
      "exclusiveMaximum": 50,
      "type": "integer"
    }
  },
  "required": [
    "foo_bar"
  ],
  "definitions": {
    "FooBar": {
      "title": "FooBar",
      "type": "object",
      "properties": {
        "count": {
          "title": "Count",
          "type": "integer"
        },
        "size": {
          "title": "Size",
          "type": "number"
        }
      },
      "required": [
        "count"
      ]
    }
  }
}

The generated schemas are compliant with the specifications: JSON Schema Core, JSON Schema Validation and OpenAPI.

BaseModel.schema will return a dict of the schema, while BaseModel.schema_json will return a JSON string representation of that.

Sub-models used are added to the definitions JSON attribute and referenced, as per the spec.

All sub-models (and their sub-models) schemas are put directly in a top-level definitions JSON key for easy re-use and reference.

“sub-models” with modifications (via the Schema class) like a custom title, description or default value, are recursively included instead of referenced.

The description for models is taken from the docstring of the class or the argument description to the Schema class.

Optionally the Schema class can be used to provide extra information about the field and validations, arguments:

  • default (positional argument) - since the Schema is replacing the field’s default, its first argument is used to set the default; use ellipsis (...) to indicate the field is required

  • alias - the public name of the field

  • title - if omitted, field_name.title() is used

  • description - if omitted and the annotation is a sub-model, the docstring of the sub-model will be used

  • const - this field must take its default value if it is present

  • gt - for numeric values (int, float, Decimal), adds a validation of “greater than” and an annotation of exclusiveMinimum to the JSON Schema

  • ge - for numeric values, adds a validation of “greater than or equal” and an annotation of minimum to the JSON Schema

  • lt - for numeric values, adds a validation of “less than” and an annotation of exclusiveMaximum to the JSON Schema

  • le - for numeric values, adds a validation of “less than or equal” and an annotation of maximum to the JSON Schema

  • multiple_of - for numeric values, adds a validation of “a multiple of” and an annotation of multipleOf to the JSON Schema

  • min_items - for list values, adds a corresponding validation and an annotation of minItems to the JSON Schema

  • max_items - for list values, adds a corresponding validation and an annotation of maxItems to the JSON Schema

  • min_length - for string values, adds a corresponding validation and an annotation of minLength to the JSON Schema

  • max_length - for string values, adds a corresponding validation and an annotation of maxLength to the JSON Schema

  • regex - for string values, adds a Regular Expression validation generated from the passed string and an annotation of pattern to the JSON Schema

  • ** any other keyword arguments (e.g. examples) will be added verbatim to the field’s schema
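
The keyword-to-annotation pairs in this list can be summarised as a lookup table. The sketch below is a standard-library illustration of that mapping only: constraint_annotations is a hypothetical helper, not part of pydantic, and it performs no validation.

```python
# Keyword names from the list above mapped to the JSON Schema keys they annotate.
CONSTRAINT_KEYS = {
    'gt': 'exclusiveMinimum',
    'ge': 'minimum',
    'lt': 'exclusiveMaximum',
    'le': 'maximum',
    'multiple_of': 'multipleOf',
    'min_items': 'minItems',
    'max_items': 'maxItems',
    'min_length': 'minLength',
    'max_length': 'maxLength',
    'regex': 'pattern',
}


def constraint_annotations(**constraints):
    """Translate Schema-style keyword arguments into JSON Schema annotations.

    Hypothetical helper for illustration; unknown keywords are copied verbatim,
    mirroring the "any other keyword arguments" rule.
    """
    return {CONSTRAINT_KEYS.get(name, name): value for name, value in constraints.items()}


print(constraint_annotations(gt=30, lt=50))
# > {'exclusiveMinimum': 30, 'exclusiveMaximum': 50}
```

This matches the snap field in the example output above, where gt=30 and lt=50 appear as exclusiveMinimum and exclusiveMaximum.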

Instead of using Schema, the fields property of the Config class can be used to set all the arguments above except default.

By default the schema is generated using aliases as keys; it can also be generated using model property names instead of aliases with MainModel.schema(by_alias=False) or MainModel.schema_json(by_alias=False).

Types, custom field types, and constraints (such as max_length) are mapped to the corresponding spec format when an equivalent is available, in this order of preference: JSON Schema Core, then JSON Schema Validation, then OpenAPI Data Types (which are based on JSON Schema). Otherwise the standard format JSON field is used to define Pydantic extensions for more complex string sub-types.

The field schema mapping from Python / Pydantic to JSON Schema is done as follows:

Python type

JSON Schema Type

Additional JSON Schema

Defined in

Notes

bool

boolean

JSON Schema Core

str

string

JSON Schema Core

float

number

JSON Schema Core

int

integer

JSON Schema Validation

dict

object

JSON Schema Core

list

array

{"items": {}}

JSON Schema Core

tuple

array

{"items": {}}

JSON Schema Core

set

array

{"items": {}, {"uniqueItems": true}

JSON Schema Validation

List[str]

array

{"items": {"type": "string"}}

JSON Schema Validation

And equivalently for any other sub type, e.g. List[int].

Tuple[str, int]

array

{"items": [{"type": "string"}, {"type": "integer"}]}

JSON Schema Validation

And equivalently for any other set of subtypes. Note: If using schemas for OpenAPI, you shouldn’t use this declaration, as it would not be valid in OpenAPI (although it is valid in JSON Schema).

Dict[str, int]

object

{"additionalProperties": {"type": "integer"}}

JSON Schema Validation

And equivalently for any other subfields for dicts. Have in mind that although you can use other types as keys for dicts with Pydantic, only strings are valid keys for JSON, and so, only str is valid as JSON Schema key types.

Union[str, int]

anyOf

{"anyOf": [{"type": "string"}, {"type": "integer"}]}

JSON Schema Validation

And equivalently for any other subfields for unions.

Enum

enum

{"enum": [...]}

JSON Schema Validation

All the literal values in the enum are included in the definition.

SecretStr

string

{"writeOnly": true}

JSON Schema Validation

SecretBytes

string

{"writeOnly": true}

JSON Schema Validation

EmailStr

string

{"format": "email"}

JSON Schema Validation

NameEmail

string

{"format": "name-email"}

Pydantic standard “format” extension

UrlStr

string

{"format": "uri"}

JSON Schema Validation

DSN

string

{"format": "dsn"}

Pydantic standard “format” extension

bytes

string

{"format": "binary"}

OpenAPI

Decimal

number

JSON Schema Core

UUID1

string

{"format": "uuid1"}

Pydantic standard “format” extension

UUID3

string

{"format": "uuid3"}

Pydantic standard “format” extension

UUID4

string

{"format": "uuid4"}

Pydantic standard “format” extension

UUID5

string

{"format": "uuid5"}

Pydantic standard “format” extension

UUID

string

{"format": "uuid"}

Pydantic standard “format” extension

Suggested in OpenAPI.

FilePath

string

{"format": "file-path"}

Pydantic standard “format” extension

DirectoryPath

string

{"format": "directory-path"}

Pydantic standard “format” extension

Path

string

{"format": "path"}

Pydantic standard “format” extension

datetime

string

{"format": "date-time"}

JSON Schema Validation

date

string

{"format": "date"}

JSON Schema Validation

time

string

{"format": "time"}

JSON Schema Validation

timedelta

number

{"format": "time-delta"}

Difference in seconds (a float), with Pydantic standard “format” extension

Suggested in JSON Schema repository’s issues by maintainer.

Json

string

{"format": "json-string"}

Pydantic standard “format” extension

IPv4Address

string

{"format": "ipv4"}

JSON Schema Validation

IPv6Address

string

{"format": "ipv6"}

JSON Schema Validation

IPvAnyAddress

string

{"format": "ipvanyaddress"}

Pydantic standard “format” extension

IPv4 or IPv6 address as used in ipaddress module

IPv4Interface

string

{"format": "ipv4interface"}

Pydantic standard “format” extension

IPv4 interface as used in ipaddress module

IPv6Interface

string

{"format": "ipv6interface"}

Pydantic standard “format” extension

IPv6 interface as used in ipaddress module

IPvAnyInterface

string

{"format": "ipvanyinterface"}

Pydantic standard “format” extension

IPv4 or IPv6 interface as used in ipaddress module

IPv4Network

string

{"format": "ipv4network"}

Pydantic standard “format” extension

IPv4 network as used in ipaddress module

IPv6Network

string

{"format": "ipv6network"}

Pydantic standard “format” extension

IPv6 network as used in ipaddress module

IPvAnyNetwork

string

{"format": "ipvanynetwork"}

Pydantic standard “format” extension

IPv4 or IPv6 network as used in ipaddress module

StrictBool

boolean

JSON Schema Core

StrictStr

string

JSON Schema Core

ConstrainedStr

string

JSON Schema Core

If the type has values declared for the constraints, they are included as validations. See the mapping for constr below.

constr(regex='^text$', min_length=2, max_length=10)

string

{"pattern": "^text$", "minLength": 2, "maxLength": 10}

JSON Schema Validation

Any argument not passed to the function (not defined) will not be included in the schema.

ConstrainedInt

integer

JSON Schema Core

If the type has values declared for the constraints, they are included as validations. See the mapping for conint below.

conint(gt=1, ge=2, lt=6, le=5, multiple_of=2)

integer

{"maximum": 5, "exclusiveMaximum": 6, "minimum": 2, "exclusiveMinimum": 1, "multipleOf": 2}

JSON Schema Validation

Any argument not passed to the function (not defined) will not be included in the schema.

PositiveInt

integer

{"exclusiveMinimum": 0}

JSON Schema Validation

NegativeInt

integer

{"exclusiveMaximum": 0}

JSON Schema Validation

ConstrainedFloat

number

JSON Schema Core

If the type has values declared for the constraints, they are included as validations. See the mapping for confloat below.

confloat(gt=1, ge=2, lt=6, le=5, multiple_of=2)

number

{"maximum": 5, "exclusiveMaximum": 6, "minimum": 2, "exclusiveMinimum": 1, "multipleOf": 2}

JSON Schema Validation

Any argument not passed to the function (not defined) will not be included in the schema.

PositiveFloat

number

{"exclusiveMinimum": 0}

JSON Schema Validation

NegativeFloat

number

{"exclusiveMaximum": 0}

JSON Schema Validation

ConstrainedDecimal

number

JSON Schema Core

If the type has values declared for the constraints, they are included as validations. See the mapping for condecimal below.

condecimal(gt=1, ge=2, lt=6, le=5, multiple_of=2)

number

{"maximum": 5, "exclusiveMaximum": 6, "minimum": 2, "exclusiveMinimum": 1, "multipleOf": 2}

JSON Schema Validation

Any argument not passed to the function (not defined) will not be included in the schema.

BaseModel

object

JSON Schema Core

All the properties defined will be defined with standard JSON Schema, including submodels.

Color

string

{"format": "color"}

Pydantic standard “format” extension
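The constraint rows above follow a regular keyword mapping. As a rough illustration only (a standalone sketch using the standard library, not pydantic's own code; the helper name constraints_to_schema and the two lookup tables are made up for this example), the translation from con* arguments to JSON Schema keywords could look like this:

```python
# Hypothetical lookup tables: con*-style argument -> JSON Schema keyword.
NUMERIC_KEYWORDS = {
    'gt': 'exclusiveMinimum',
    'ge': 'minimum',
    'lt': 'exclusiveMaximum',
    'le': 'maximum',
    'multiple_of': 'multipleOf',
}
STRING_KEYWORDS = {
    'regex': 'pattern',
    'min_length': 'minLength',
    'max_length': 'maxLength',
}


def constraints_to_schema(json_type, **constraints):
    """Build a schema fragment, skipping any constraint left as None."""
    keywords = STRING_KEYWORDS if json_type == 'string' else NUMERIC_KEYWORDS
    fragment = {'type': json_type}
    for name, value in constraints.items():
        if value is not None:
            fragment[keywords[name]] = value
    return fragment


print(constraints_to_schema('integer', gt=1, le=5))
# > {'type': 'integer', 'exclusiveMinimum': 1, 'maximum': 5}
print(constraints_to_schema('string', regex='^text$', min_length=2, max_length=None))
# > {'type': 'string', 'pattern': '^text$', 'minLength': 2}
```

Arguments that are not supplied (or are None) simply never reach the fragment, which mirrors the note that undefined arguments are not included in the schema.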

You can also generate a top-level JSON Schema that only includes a list of models and all their related submodels in its definitions:

import json
from pydantic import BaseModel
from pydantic.schema import schema


class Foo(BaseModel):
    a: str = None


class Model(BaseModel):
    b: Foo


class Bar(BaseModel):
    c: int


top_level_schema = schema([Model, Bar], title='My Schema')
print(json.dumps(top_level_schema, indent=2))

# {
#   "title": "My Schema",
#   "definitions": {
#     "Foo": {
#       "title": "Foo",
#       ...

(This script is complete, it should run “as is”)

Outputs:

{
  "title": "My Schema",
  "definitions": {
    "Foo": {
      "title": "Foo",
      "type": "object",
      "properties": {
        "a": {
          "title": "A",
          "type": "string"
        }
      }
    },
    "Model": {
      "title": "Model",
      "type": "object",
      "properties": {
        "b": {
          "$ref": "#/definitions/Foo"
        }
      },
      "required": [
        "b"
      ]
    },
    "Bar": {
      "title": "Bar",
      "type": "object",
      "properties": {
        "c": {
          "title": "C",
          "type": "integer"
        }
      },
      "required": [
        "c"
      ]
    }
  }
}

You can customize the generated $ref JSON location. The definitions will still be stored under the key definitions, and you can still get them from there, but the references will point to your defined prefix instead of the default.

This is useful if you need to extend or modify JSON Schema default definitions location, e.g. with OpenAPI:

import json
from pydantic import BaseModel
from pydantic.schema import schema

class Foo(BaseModel):
    a: int

class Model(BaseModel):
    a: Foo


top_level_schema = schema([Model], ref_prefix='#/components/schemas/')  # Default location for OpenAPI
print(json.dumps(top_level_schema, indent=2))

# {
#   "definitions": {
#     "Foo": {
#       "title": "Foo",
#       "type": "object",
#       ...

(This script is complete, it should run “as is”)

Outputs:

{
  "definitions": {
    "Foo": {
      "title": "Foo",
      "type": "object",
      "properties": {
        "a": {
          "title": "A",
          "type": "integer"
        }
      },
      "required": [
        "a"
      ]
    },
    "Model": {
      "title": "Model",
      "type": "object",
      "properties": {
        "a": {
          "$ref": "#/components/schemas/Foo"
        }
      },
      "required": [
        "a"
      ]
    }
  }
}
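Because the definitions stay under the definitions key while the references carry the custom prefix, a consumer has to strip that prefix before looking a model up. A minimal sketch of that lookup, assuming a hypothetical helper named resolve_ref (not part of pydantic) and a trimmed-down copy of the output above:

```python
def resolve_ref(top_level_schema, ref, ref_prefix='#/definitions/'):
    """Look up a $ref in the 'definitions' key, whatever prefix it carries."""
    if not ref.startswith(ref_prefix):
        raise ValueError(f'unexpected $ref: {ref!r}')
    name = ref[len(ref_prefix):]
    return top_level_schema['definitions'][name]


# Trimmed-down copy of the schema printed above.
top_level_schema = {
    'definitions': {
        'Foo': {'title': 'Foo', 'type': 'object'},
        'Model': {
            'title': 'Model',
            'type': 'object',
            'properties': {'a': {'$ref': '#/components/schemas/Foo'}},
        },
    }
}

ref = top_level_schema['definitions']['Model']['properties']['a']['$ref']
print(resolve_ref(top_level_schema, ref, ref_prefix='#/components/schemas/'))
# > {'title': 'Foo', 'type': 'object'}
```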

It’s also possible to extend/override the generated JSON schema in a model.

To do it, use the Config sub-class attribute schema_extra.

For example, you could add examples to the JSON Schema:

from pydantic import BaseModel


class Person(BaseModel):
    name: str
    age: int

    class Config:
        schema_extra = {
            'examples': [
                {
                    'name': 'John Doe',
                    'age': 25,
                }
            ]
        }


print(Person.schema())
# {'title': 'Person',
#  'type': 'object',
#  'properties': {'name': {'title': 'Name', 'type': 'string'},
#   'age': {'title': 'Age', 'type': 'integer'}},
#  'required': ['name', 'age'],
#  'examples': [{'name': 'John Doe', 'age': 25}]}
print(Person.schema_json(indent=2))

(This script is complete, it should run “as is”)

Outputs:

{
  "title": "Person",
  "type": "object",
  "properties": {
    "name": {
      "title": "Name",
      "type": "string"
    },
    "age": {
      "title": "Age",
      "type": "integer"
    }
  },
  "required": [
    "name",
    "age"
  ],
  "examples": [
    {
      "name": "John Doe",
      "age": 25
    }
  ]
}

Error Handling

Pydantic will raise ValidationError whenever it finds an error in the data it’s validating.

Note

Validation code should not raise ValidationError itself, but rather raise ValueError or TypeError (or subclasses thereof) which will be caught and used to populate ValidationError.

A single exception will be raised regardless of the number of errors found; that ValidationError will contain information about all the errors and how they happened.

You can access these errors in several ways:

e.errors()

returns a list of the errors found in the input data.

e.json()

returns a JSON representation of the errors.

str(e)

returns a human-readable representation of the errors.

Each error object contains:

loc

the error’s location as a list. The first item in the list is the field where the error occurred; when sub models are used, subsequent items identify the field within the sub model where the error occurred.

type

a machine-readable identifier of the kind of error.

msg

a human-readable explanation of the error.

ctx

an optional object which contains values required to render the error message.

To demonstrate that:

from typing import List
from pydantic import BaseModel, ValidationError, conint

class Location(BaseModel):
    lat = 0.1
    lng = 10.1

class Model(BaseModel):
    is_required: float
    gt_int: conint(gt=42)
    list_of_ints: List[int] = None
    a_float: float = None
    recursive_model: Location = None

data = dict(
    list_of_ints=['1', 2, 'bad'],
    a_float='not a float',
    recursive_model={'lat': 4.2, 'lng': 'New York'},
    gt_int=21,
)

try:
    Model(**data)
except ValidationError as e:
    print(e)
"""
5 validation errors
list_of_ints -> 2
  value is not a valid integer (type=type_error.integer)
a_float
  value is not a valid float (type=type_error.float)
is_required
  field required (type=value_error.missing)
recursive_model -> lng
  value is not a valid float (type=type_error.float)
gt_int
  ensure this value is greater than 42 (type=value_error.number.gt; limit_value=42)
"""

try:
    Model(**data)
except ValidationError as e:
    print(e.json())

"""
[
  {
    "loc": ["is_required"],
    "msg": "field required",
    "type": "value_error.missing"
  },
  {
    "loc": ["gt_int"],
    "msg": "ensure this value is greater than 42",
    "type": "value_error.number.gt",
    "ctx": {
      "limit_value": 42
    }
  },
  {
    "loc": ["list_of_ints", 2],
    "msg": "value is not a valid integer",
    "type": "type_error.integer"
  },
  {
    "loc": ["a_float"],
    "msg": "value is not a valid float",
    "type": "type_error.float"
  },
  {
    "loc": ["recursive_model", "lng"],
    "msg": "value is not a valid float",
    "type": "type_error.float"
  }
]
"""

(This script is complete, it should run “as is”. json() has indent=2 set by default, but I’ve tweaked the JSON here and below to make it slightly more concise.)
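To make the relationship between the error objects and the text form more concrete, here is a small standard-library sketch that renders a list of error dicts in the same layout as the printed output above. The function format_errors is hypothetical and written for this example; it is not how pydantic builds the string internally.

```python
def format_errors(errors):
    """Render error dicts (loc, msg, type, optional ctx) as readable text."""
    plural = '' if len(errors) == 1 else 's'
    lines = [f'{len(errors)} validation error{plural}']
    for error in errors:
        # loc items (field names or list indexes) are joined with ' -> '
        lines.append(' -> '.join(str(part) for part in error['loc']))
        details = [f"type={error['type']}"]
        details += [f'{key}={value}' for key, value in error.get('ctx', {}).items()]
        lines.append(f"  {error['msg']} ({'; '.join(details)})")
    return '\n'.join(lines)


errors = [
    {'loc': ['list_of_ints', 2], 'msg': 'value is not a valid integer',
     'type': 'type_error.integer'},
    {'loc': ['gt_int'], 'msg': 'ensure this value is greater than 42',
     'type': 'value_error.number.gt', 'ctx': {'limit_value': 42}},
]
print(format_errors(errors))
"""
2 validation errors
list_of_ints -> 2
  value is not a valid integer (type=type_error.integer)
gt_int
  ensure this value is greater than 42 (type=value_error.number.gt; limit_value=42)
"""
```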

In your custom data types or validators you should use TypeError and ValueError to raise errors:

from pydantic import BaseModel, ValidationError, validator

class Model(BaseModel):
    foo: str

    @validator('foo')
    def value_must_be_bar(cls, v):
        if v != 'bar':
            raise ValueError('value must be "bar"')

        return v

try:
    Model(foo='ber')
except ValidationError as e:
    print(e.errors())

"""
[
    {
        'loc': ('foo',),
        'msg': 'value must be "bar"',
        'type': 'value_error',
    },
]
"""

(This script is complete, it should run “as is”)

You can also define your own error classes, which can specify a custom error code, message template and context:

from pydantic import BaseModel, PydanticValueError, ValidationError, validator

class NotABarError(PydanticValueError):
    code = 'not_a_bar'
    msg_template = 'value is not "bar", got "{wrong_value}"'

class Model(BaseModel):
    foo: str

    @validator('foo')
    def value_must_be_bar(cls, v):
        if v != 'bar':
            raise NotABarError(wrong_value=v)
        return v

try:
    Model(foo='ber')
except ValidationError as e:
    print(e.json())
"""
[
  {
    "loc": ["foo"],
    "msg": "value is not \"bar\", got \"ber\"",
    "type": "value_error.not_a_bar",
    "ctx": {
      "wrong_value": "ber"
    }
  }
]
"""

(This script is complete, it should run “as is”)
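The way code, msg_template and the keyword arguments combine into the final error object can be sketched without pydantic. The base class TemplatedValueError below is a stand-in invented for this example (it is not PydanticValueError and makes no claim about its internals); it only shows the idea of formatting the template with the context and prefixing the code with value_error.

```python
class TemplatedValueError(ValueError):
    """Hypothetical stand-in: an error carrying a code, a template and context."""
    code = 'generic'
    msg_template = 'invalid value'

    def __init__(self, **ctx):
        self.ctx = ctx
        # The context values fill the placeholders of the template.
        super().__init__(self.msg_template.format(**ctx))

    def as_dict(self, loc):
        error = {'loc': list(loc), 'msg': str(self), 'type': f'value_error.{self.code}'}
        if self.ctx:
            error['ctx'] = self.ctx
        return error


class NotABarError(TemplatedValueError):
    code = 'not_a_bar'
    msg_template = 'value is not "bar", got "{wrong_value}"'


print(NotABarError(wrong_value='ber').as_dict(loc=['foo']))
# > {'loc': ['foo'], 'msg': 'value is not "bar", got "ber"', 'type': 'value_error.not_a_bar', 'ctx': {'wrong_value': 'ber'}}
```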

datetime Types

Pydantic supports the following datetime types:

  • datetime fields can be:

    • datetime, existing datetime object

    • int or float, assumed to be Unix time, i.e. seconds (if <= 2e10) or milliseconds (if > 2e10) since 1 January 1970

    • str, following formats work:

      • YYYY-MM-DD[T]HH:MM[:SS[.ffffff]][Z or [±]HH[:]MM]

      • int or float as a string (assumed as Unix time)

  • date fields can be:

    • date, existing date object

    • int or float, see datetime

    • str, following formats work:

      • YYYY-MM-DD

      • int or float, see datetime

  • time fields can be:

    • time, existing time object

    • str, following formats work:

      • HH:MM[:SS[.ffffff]]

  • timedelta fields can be:

    • timedelta, existing timedelta object

    • int or float, assumed as seconds

    • str, following formats work:

      • [-][DD ][HH:MM]SS[.ffffff]

      • [±]P[DD]DT[HH]H[MM]M[SS]S (ISO 8601 format for timedelta)

from datetime import date, datetime, time, timedelta
from pydantic import BaseModel

class Model(BaseModel):
    d: date = None
    dt: datetime = None
    t: time = None
    td: timedelta = None


m = Model(
    d=1966280412345.6789,
    dt='2032-04-23T10:20:30.400+02:30',
    t=time(4, 8, 16),
    td='P3DT12H30M5S'
)

print(m.dict())
# > {'d': datetime.date(2032, 4, 22),
# 'dt': datetime.datetime(2032, 4, 23, 10, 20, 30, 400000, tzinfo=datetime.timezone(datetime.timedelta(seconds=9000))),
# 't': datetime.time(4, 8, 16),
# 'td': datetime.timedelta(days=3, seconds=45005)}
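Two of the rules listed above, the seconds/milliseconds threshold for Unix times and the ISO 8601 form for timedelta, can be illustrated with the standard library alone. This is a sketch under assumptions, not pydantic's parser: the names MS_WATERSHED, from_unix and parse_duration are invented here, abs() is used so negative numbers are treated symmetrically (an assumption), results are returned in UTC, and the duration pattern is simplified to whole days, hours, minutes and seconds.

```python
import re
from datetime import datetime, timedelta, timezone

MS_WATERSHED = 2e10  # threshold quoted in the list above


def from_unix(value):
    """Interpret a number as seconds, or as milliseconds if it exceeds 2e10."""
    seconds = float(value)
    if abs(seconds) > MS_WATERSHED:  # abs() is an assumption of this sketch
        seconds /= 1000
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Simplified pattern: whole days, hours, minutes and seconds only.
ISO_DURATION = re.compile(
    r'^(?P<sign>[-+]?)P(?:(?P<days>\d+)D)?'
    r'(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$'
)


def parse_duration(text):
    """Parse strings such as 'P3DT12H30M5S' into a timedelta."""
    match = ISO_DURATION.match(text)
    if not match:
        raise ValueError(f'invalid duration: {text!r}')
    parts = match.groupdict()
    sign = -1 if parts.pop('sign') == '-' else 1
    return sign * timedelta(**{k: int(v) for k, v in parts.items() if v is not None})


print(from_unix(1496498400))
# > 2017-06-03 14:00:00+00:00
print(from_unix(1966280412345.6789).date())
# > 2032-04-22
print(repr(parse_duration('P3DT12H30M5S')))
# > datetime.timedelta(days=3, seconds=45005)
```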

Exotic Types

Pydantic comes with a number of utilities for parsing or validating common objects.

import uuid
from decimal import Decimal
from ipaddress import IPv4Address, IPv6Address, IPv4Interface, IPv6Interface, IPv4Network, IPv6Network
from pathlib import Path
from uuid import UUID

from pydantic import (DSN, UUID1, UUID3, UUID4, UUID5, BaseModel, DirectoryPath, EmailStr, FilePath, NameEmail,
                      NegativeFloat, NegativeInt, PositiveFloat, PositiveInt, PyObject, StrictBool, UrlStr, conbytes, condecimal,
                      confloat, conint, conlist, constr, IPvAnyAddress, IPvAnyInterface, IPvAnyNetwork, SecretStr, SecretBytes)


class Model(BaseModel):
    cos_function: PyObject = None

    path_to_something: Path = None
    path_to_file: FilePath = None
    path_to_directory: DirectoryPath = None

    short_bytes: conbytes(min_length=2, max_length=10) = None
    strip_bytes: conbytes(strip_whitespace=True)

    short_str: constr(min_length=2, max_length=10) = None
    regex_str: constr(regex='apple (pie|tart|sandwich)') = None
    strip_str: constr(strip_whitespace=True)

    big_int: conint(gt=1000, lt=1024) = None
    mod_int: conint(multiple_of=5) = None
    pos_int: PositiveInt = None
    neg_int: NegativeInt = None

    big_float: confloat(gt=1000, lt=1024) = None
    unit_interval: confloat(ge=0, le=1) = None
    mod_float: confloat(multiple_of=0.5) = None
    pos_float: PositiveFloat = None
    neg_float: NegativeFloat = None

    short_list: conlist(int, min_items=1, max_items=4)

    email_address: EmailStr = None
    email_and_name: NameEmail = None

    is_really_a_bool: StrictBool = None

    url: UrlStr = None

    password: SecretStr = None
    password_bytes: SecretBytes = None

    db_name = 'foobar'
    db_user = 'postgres'
    db_password: str = None
    db_host = 'localhost'
    db_port = '5432'
    db_driver = 'postgres'
    db_query: dict = None
    dsn: DSN = None
    decimal: Decimal = None
    decimal_positive: condecimal(gt=0) = None
    decimal_negative: condecimal(lt=0) = None
    decimal_max_digits_and_places: condecimal(max_digits=2, decimal_places=2) = None
    mod_decimal: condecimal(multiple_of=Decimal('0.25')) = None
    uuid_any: UUID = None
    uuid_v1: UUID1 = None
    uuid_v3: UUID3 = None
    uuid_v4: UUID4 = None
    uuid_v5: UUID5 = None
    ipvany: IPvAnyAddress = None
    ipv4: IPv4Address = None
    ipv6: IPv6Address = None
    ip_vany_network: IPvAnyNetwork = None
    ip_v4_network: IPv4Network = None
    ip_v6_network: IPv6Network = None
    ip_vany_interface: IPvAnyInterface = None
    ip_v4_interface: IPv4Interface = None
    ip_v6_interface: IPv6Interface = None

m = Model(
    cos_function='math.cos',
    path_to_something='/home',
    path_to_file='/home/file.py',
    path_to_directory='/home/projects',
    short_bytes=b'foo',
    strip_bytes=b'   bar',
    short_str='foo',
    regex_str='apple pie',
    strip_str='   bar',
    big_int=1001,
    mod_int=155,
    pos_int=1,
    neg_int=-1,
    big_float=1002.1,
    mod_float=1.5,
    pos_float=2.2,
    neg_float=-2.3,
    unit_interval=0.5,
    short_list=[1, 2],
    email_address='Samuel Colvin <s@muelcolvin.com >',
    email_and_name='Samuel Colvin <s@muelcolvin.com >',
    is_really_a_bool=True,
    url='http://example.com',
    password='password',
    password_bytes=b'password2',
    decimal=Decimal('42.24'),
    decimal_positive=Decimal('21.12'),
    decimal_negative=Decimal('-21.12'),
    decimal_max_digits_and_places=Decimal('0.99'),
    mod_decimal=Decimal('2.75'),
    uuid_any=uuid.uuid4(),
    uuid_v1=uuid.uuid1(),
    uuid_v3=uuid.uuid3(uuid.NAMESPACE_DNS, 'python.org'),
    uuid_v4=uuid.uuid4(),
    uuid_v5=uuid.uuid5(uuid.NAMESPACE_DNS, 'python.org'),
    ipvany=IPv4Address('192.168.0.1'),
    ipv4=IPv4Address('255.255.255.255'),
    ipv6=IPv6Address('ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff'),
    ip_vany_network=IPv4Network('192.168.0.0/24'),
    ip_v4_network=IPv4Network('192.168.0.0/24'),
    ip_v6_network=IPv6Network('ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff/128'),
    ip_vany_interface=IPv4Interface('192.168.0.0/24'),
    ip_v4_interface=IPv4Interface('192.168.0.0/24'),
    ip_v6_interface=IPv6Interface('ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff/128')
)
print(m.dict())
"""
{
    'cos_function': <built-in function cos>,
    'path_to_something': PosixPath('/home'),
    'path_to_file': PosixPath('/home/file.py'),
    'path_to_directory': PosixPath('/home/projects'),
    'short_bytes': b'foo',
    'strip_bytes': b'bar',
    'short_str': 'foo',
    'regex_str': 'apple pie',
    'strip_str': 'bar',
    'big_int': 1001,
    'mod_int': 155,
    'pos_int': 1,
    'neg_int': -1,
    'big_float': 1002.1,
    'mod_float': 1.5,
    'pos_float': 2.2,
    'neg_float': -2.3,
    'unit_interval': 0.5,
    'short_list': [1, 2],
    'email_address': 's@muelcolvin.com',
    'email_and_name': <NameEmail("Samuel Colvin <s@muelcolvin.com>")>,
    'is_really_a_bool': True,
    'url': 'http://example.com',
    'password': SecretStr('**********'),
    'password_bytes': SecretBytes(b'**********'),
    ...
    'dsn': 'postgres://postgres@localhost:5432/foobar',
    'decimal': Decimal('42.24'),
    'decimal_positive': Decimal('21.12'),
    'decimal_negative': Decimal('-21.12'),
    'decimal_max_digits_and_places': Decimal('0.99'),
    'mod_decimal': Decimal('2.75'),
    'uuid_any': UUID('ebcdab58-6eb8-46fb-a190-d07a33e9eac8'),
    'uuid_v1': UUID('c96e505c-4c62-11e8-a27c-dca90496b483'),
    'uuid_v3': UUID('6fa459ea-ee8a-3ca4-894e-db77e160355e'),
    'uuid_v4': UUID('22209f7a-aad1-491c-bb83-ea19b906d210'),
    'uuid_v5': UUID('886313e1-3b8a-5372-9b90-0c9aee199e5d'),
    'ipvany': IPv4Address('192.168.0.1'),
    'ipv4': IPv4Address('255.255.255.255'),
    'ipv6': IPv6Address('ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff'),
    'ip_vany_network': IPv4Network('192.168.0.0/24'),
    'ip_v4_network': IPv4Network('192.168.0.0/24'),
    'ip_v6_network': IPv6Network('ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff/128'),
    'ip_vany_interface': IPv4Interface('192.168.0.0/24'),
    'ip_v4_interface': IPv4Interface('192.168.0.0/24'),
    'ip_v6_interface': IPv6Interface('ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff/128')
}
"""

(This script is complete, it should run “as is”)

StrictBool

Unlike normal bool fields, StrictBool can be used to require specifically True or False; nothing else is permitted.

Callable

Fields can also be of type Callable:

from typing import Callable
from pydantic import BaseModel

class Foo(BaseModel):
    callback: Callable[[int], int]

m = Foo(callback=lambda x: x)
print(m)
# Foo callback=<function <lambda> at 0x7f16bc73e1e0>

(This script is complete, it should run “as is”)

Warning

Callable fields only perform a simple check that the argument is callable, no validation of arguments, their types or the return type is performed.
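What "a simple check that the argument is callable" means in practice can be shown with the built-in callable(). The helper check_callable is hypothetical and only mimics the behaviour described in the warning; it is not pydantic's validator.

```python
def check_callable(value):
    """Accept anything callable; arguments and return type are not inspected."""
    if not callable(value):
        raise TypeError(f'{value!r} is not callable')
    return value


check_callable(lambda x: x)    # matches Callable[[int], int]
check_callable(len)            # built-ins are fine too
# A function with a different signature still passes,
# because only callability is checked.
check_callable(lambda: 'takes no arguments at all')

try:
    check_callable(42)
except TypeError as e:
    print(e)
# > 42 is not callable
```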

Color Type

You can use the Color data type for storing colors as per CSS3 specification. Color can be defined via:

  • name (e.g. "Black", "azure")

  • hexadecimal value (e.g. "0x000", "#FFFFFF", "7fffd4")

  • RGB/RGBA tuples (e.g. (255, 255, 255), (255, 255, 255, 0.5))

  • RGB/RGBA strings (e.g. "rgb(255, 255, 255)" or "rgba(255, 255, 255, 0.5)")

  • HSL strings (e.g. "hsl(270, 60%, 70%)" or "hsl(270, 60%, 70%, .5)")

from pydantic import BaseModel, ValidationError
from pydantic.color import Color

c = Color('ff00ff')
print(c.as_named())
#> magenta
print(c.as_hex())
#> #f0f

c2 = Color('green')
print(c2.as_rgb_tuple())
#> (0, 128, 0, 1)
print(c2.original())
#> green
print(repr(Color('hsl(180, 100%, 50%)')))
#> <Color('cyan', (0, 255, 255))>

class Model(BaseModel):
    color: Color

print(Model(color='purple'))
# > Model color=<Color('purple', (128, 0, 128))>

try:
    Model(color='hello')
except ValidationError as e:
    print(e)
"""
1 validation error
color
  value is not a valid color: string not recognised as a valid color 
  (type=value_error.color; reason=string not recognised as a valid color)
"""

(This script is complete, it should run “as is”)

Color has the following methods:

original

the original string or tuple passed to Color

as_named

returns a named CSS3 color; fails if the alpha channel is set or no such color exists, unless fallback=True is supplied, in which case it falls back to as_hex

as_hex

string in the format #ffffff or #fff; can also be 4 or 8 hex digits if the alpha channel is set, e.g. #7f33cc26

as_rgb

string in the format rgb(<red>, <green>, <blue>) or rgba(<red>, <green>, <blue>, <alpha>) if the alpha channel is set

as_rgb_tuple

returns a 3- or 4-tuple in RGB(a) format, the alpha keyword argument can be used to define whether the alpha channel should be included, options: True - always include, False - never include, None (the default) - include if set

as_hsl

string in the format hsl(<hue deg>, <saturation %>, <lightness %>) or hsl(<hue deg>, <saturation %>, <lightness %>, <alpha>) if the alpha channel is set

as_hsl_tuple

returns a 3- or 4-tuple in HSL(a) format, the alpha keyword argument can be used to define whether the alpha channel should be included, options: True - always include, False - never include, None (the default) - include if set

The __str__ method for Color returns self.as_named(fallback=True).

Note

The as_hsl* methods refer to hue, saturation, lightness “HSL” as used in html and most of the world, not “HLS” as used in python’s colorsys.
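The difference in ordering is easy to trip over when mixing Color output with colorsys. A short standard-library sketch (the helper rgb_to_hsl is written for this example and is not part of pydantic) that reorders the colorsys result into HSL order:

```python
import colorsys


def rgb_to_hsl(r, g, b):
    """Convert 0-255 RGB to (hue, saturation, lightness), each in the range 0..1."""
    # colorsys returns the components in H, L, S order
    h, l, s = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
    return h, s, l


print(colorsys.rgb_to_hls(1.0, 0.0, 0.0))
# > (0.0, 0.5, 1.0)
print(rgb_to_hsl(255, 0, 0))
# > (0.0, 1.0, 0.5)
```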

Secret Types

You can use the SecretStr and the SecretBytes data types for storing sensitive information that you do not want to be visible in logging or tracebacks. The SecretStr and SecretBytes will be formatted as either ‘**********’ or ‘’ on conversion to json.

from typing import List

from pydantic import BaseModel, SecretStr, SecretBytes, ValidationError

class SimpleModel(BaseModel):
    password: SecretStr
    password_bytes: SecretBytes

sm = SimpleModel(password='IAmSensitive', password_bytes=b'IAmSensitiveBytes')
print(sm)
# > SimpleModel password=SecretStr('**********') password_bytes=SecretBytes(b'**********')

print(sm.password.get_secret_value())
# > IAmSensitive
print(sm.password_bytes.get_secret_value())
# > b'IAmSensitiveBytes'
print(sm.password.display())
# > '**********'
print(sm.json())
# > '{"password": "**********", "password_bytes": "**********"}'


try:
    SimpleModel(password=[1,2,3], password_bytes=[1,2,3])
except ValidationError as e:
    print(e)
"""
2 validation errors
password
  str type expected (type=type_error.str)
password_bytes
  byte type expected (type=type_error.bytes)
"""

(This script is complete, it should run “as is”)
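The masking idea itself is simple: the wrapper never reveals the value through str() or repr(), so it cannot leak through logging or tracebacks, and the real value is only available through an explicit call. The class MaskedStr below is a hypothetical stand-in written for illustration, not the implementation of SecretStr.

```python
class MaskedStr:
    """Hypothetical wrapper that hides its value unless explicitly asked."""

    def __init__(self, value):
        self._secret_value = value

    def display(self):
        # Empty secrets are shown as an empty string, others as asterisks.
        return '**********' if self._secret_value else ''

    def __str__(self):
        return self.display()

    def __repr__(self):
        return f'MaskedStr({self.display()!r})'

    def get_secret_value(self):
        return self._secret_value


token = MaskedStr('IAmSensitive')
print(f'logging in with {token}')
# > logging in with **********
print(repr(token))
# > MaskedStr('**********')
print(token.get_secret_value())
# > IAmSensitive
```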

Json Type

You can use the Json data type: Pydantic will first parse the raw JSON string and then, if a structure is provided (e.g. Json[List[int]]), validate the parsed object against it.

from typing import List

from pydantic import BaseModel, Json, ValidationError

class SimpleJsonModel(BaseModel):
    json_obj: Json

class ComplexJsonModel(BaseModel):
    json_obj: Json[List[int]]

print(SimpleJsonModel(json_obj='{"b": 1}'))
# > SimpleJsonModel json_obj={'b': 1}

print(ComplexJsonModel(json_obj='[1, 2, 3]'))
# > ComplexJsonModel json_obj=[1, 2, 3]


try:
    ComplexJsonModel(json_obj=12)
except ValidationError as e:
    print(e)
"""
1 validation error
json_obj
  JSON object must be str, bytes or bytearray (type=type_error.json)
"""

try:
    ComplexJsonModel(json_obj='[a, b]')
except ValidationError as e:
    print(e)
"""
1 validation error
json_obj
  Invalid JSON (type=value_error.json)
"""

try:
    ComplexJsonModel(json_obj='["a", "b"]')
except ValidationError as e:
    print(e)
"""
2 validation errors
json_obj -> 0
  value is not a valid integer (type=type_error.integer)
json_obj -> 1
  value is not a valid integer (type=type_error.integer)
"""

(This script is complete, it should run “as is”)
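The three failures above correspond to three separate stages: checking the raw type, decoding the JSON, and validating the decoded object. A standard-library sketch of those stages for the List[int] case follows; the function parse_json_int_list is hypothetical, and unlike pydantic it does not coerce values (it is kept strict for brevity).

```python
import json


def parse_json_int_list(raw):
    """Decode a JSON document and check that it is a list of integers."""
    # Stage 1: the raw value must be something json.loads can read.
    if not isinstance(raw, (str, bytes, bytearray)):
        raise TypeError('JSON object must be str, bytes or bytearray')
    # Stage 2: decode; json.JSONDecodeError is a subclass of ValueError.
    try:
        parsed = json.loads(raw)
    except ValueError as exc:
        raise ValueError('Invalid JSON') from exc
    # Stage 3: validate the decoded object against the expected structure.
    if not isinstance(parsed, list):
        raise TypeError('value is not a valid list')
    for index, item in enumerate(parsed):
        if isinstance(item, bool) or not isinstance(item, int):
            raise TypeError(f'item {index}: value is not a valid integer')
    return parsed


print(parse_json_int_list('[1, 2, 3]'))
# > [1, 2, 3]

for bad in (12, '[a, b]', '["a", "b"]'):
    try:
        parse_json_int_list(bad)
    except (TypeError, ValueError) as e:
        print(e)
# > JSON object must be str, bytes or bytearray
# > Invalid JSON
# > item 0: value is not a valid integer
```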

Literal Type

Pydantic supports the use of typing_extensions.Literal as a lightweight way to specify that a field may accept only specific literal values:

from typing_extensions import Literal

from pydantic import BaseModel, ValidationError

class Pie(BaseModel):
    flavor: Literal['apple', 'pumpkin']

Pie(flavor='apple')
Pie(flavor='pumpkin')
try:
    Pie(flavor='cherry')
except ValidationError as e:
    print(str(e))
"""
1 validation error
flavor
  unexpected value; permitted: 'apple', 'pumpkin' (type=value_error.const; given=cherry; permitted=('apple', 'pumpkin'))
"""

(This script is complete, it should run “as is”)

One benefit of this field type is that it can be used to check for equality with one or more specific values without needing to declare custom validators:

from typing import ClassVar, List, Union

from typing_extensions import Literal

from pydantic import BaseModel, ValidationError

class Cake(BaseModel):
    kind: Literal['cake']
    required_utensils: ClassVar[List[str]] = ['fork', 'knife']

class IceCream(BaseModel):
    kind: Literal['icecream']
    required_utensils: ClassVar[List[str]] = ['spoon']

class Meal(BaseModel):
    dessert: Union[Cake, IceCream]

print(type(Meal(dessert={'kind': 'cake'}).dessert).__name__)
# Cake
print(type(Meal(dessert={'kind': 'icecream'}).dessert).__name__)
# IceCream
try:
    Meal(dessert={'kind': 'pie'})
except ValidationError as e:
    print(str(e))
"""
2 validation errors
dessert -> kind
  unexpected value; permitted: 'cake' (type=value_error.const; given=pie; permitted=('cake',))
dessert -> kind
  unexpected value; permitted: 'icecream' (type=value_error.const; given=pie; permitted=('icecream',))
"""

(This script is complete, it should run “as is”)

With proper ordering in an annotated Union, you can use this to parse types of decreasing specificity:

from typing import Optional, Union

from typing_extensions import Literal

from pydantic import BaseModel

class Dessert(BaseModel):
    kind: str

class Pie(Dessert):
    kind: Literal['pie']
    flavor: Optional[str]

class ApplePie(Pie):
    flavor: Literal['apple']

class PumpkinPie(Pie):
    flavor: Literal['pumpkin']

class Meal(BaseModel):
    dessert: Union[ApplePie, PumpkinPie, Pie, Dessert]

print(type(Meal(dessert={'kind': 'pie', 'flavor': 'apple'}).dessert).__name__)
# ApplePie
print(type(Meal(dessert={'kind': 'pie', 'flavor': 'pumpkin'}).dessert).__name__)
# PumpkinPie
print(type(Meal(dessert={'kind': 'pie'}).dessert).__name__)
# Pie
print(type(Meal(dessert={'kind': 'cake'}).dessert).__name__)
# Dessert

(This script is complete, it should run “as is”)
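Why the ordering matters can be seen in a toy version of "try each option in turn and keep the first that validates". The sketch below uses plain functions in place of models; parse_apple_pie, parse_pie, parse_dessert and first_match are all invented for this example and only imitate the behaviour described, not pydantic's Union handling.

```python
def parse_apple_pie(data):
    if data.get('kind') == 'pie' and data.get('flavor') == 'apple':
        return 'ApplePie'
    raise ValueError('not an apple pie')


def parse_pie(data):
    if data.get('kind') == 'pie':
        return 'Pie'
    raise ValueError('not a pie')


def parse_dessert(data):
    if isinstance(data.get('kind'), str):
        return 'Dessert'
    raise ValueError('not a dessert')


def first_match(data, parsers):
    """Return the result of the first parser that accepts the data."""
    for parser in parsers:
        try:
            return parser(data)
        except ValueError:
            continue
    raise ValueError('no parser accepted the data')


apple = {'kind': 'pie', 'flavor': 'apple'}
# Most specific first: the apple pie is recognised as such.
print(first_match(apple, [parse_apple_pie, parse_pie, parse_dessert]))
# > ApplePie
# Least specific first: the general parser wins and the detail is lost.
print(first_match(apple, [parse_dessert, parse_pie, parse_apple_pie]))
# > Dessert
```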

Custom Data Types

You can also define your own data types. The class method __get_validators__ will be called to get validators to parse and validate the input data.

Note

The name of __get_validators__ was changed from get_validators in v0.17, the old name is currently still supported but deprecated and will be removed in future.

from pydantic import BaseModel, ValidationError


class StrictStr(str):
    @classmethod
    def __get_validators__(cls):
        yield cls.validate

    @classmethod
    def validate(cls, v):
        if not isinstance(v, str):
            raise ValueError(f'strict string: str expected not {type(v)}')
        return v


class Model(BaseModel):
    s: StrictStr


print(Model(s='hello'))
# > Model s='hello'

try:
    print(Model(s=123))
except ValidationError as e:
    print(e.json())
"""
[
  {
    "loc": [
      "s"
    ],
    "msg": "strict string: str expected not <class 'int'>",
    "type": "value_error"
  }
]
"""

(This script is complete, it should run “as is”)
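Since __get_validators__ is a generator, a type can yield more than one validator, each receiving the value returned by the previous one. The sketch below shows that chaining with the standard library only; EvenInt is a hypothetical type and run_validators is a made-up stand-in for the work pydantic does when it validates a field.

```python
class EvenInt(int):
    """Hypothetical custom type whose validators run one after another."""

    @classmethod
    def __get_validators__(cls):
        yield cls.to_int
        yield cls.check_even

    @classmethod
    def to_int(cls, v):
        return int(v)

    @classmethod
    def check_even(cls, v):
        if v % 2:
            raise ValueError(f'{v} is not even')
        return v


def run_validators(type_, value):
    """Stand-in for the library: feed each validator's result to the next."""
    for validator in type_.__get_validators__():
        value = validator(value)
    return value


print(run_validators(EvenInt, '42'))
# > 42

try:
    run_validators(EvenInt, '7')
except ValueError as e:
    print(e)
# > 7 is not even
```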

Custom Root Types

Pydantic models which do not represent a dict (“object” in JSON parlance) can have a custom root type defined via the __root__ field. The root type can be of any type: list, float, int etc.

The root type can be defined via the type hint on the __root__ field. The root value can be passed to model __init__ via the __root__ keyword argument or as the first and only argument to parse_obj.

from typing import List
import json
from pydantic import BaseModel
from pydantic.schema import schema

class Pets(BaseModel):
    __root__: List[str]

print(Pets(__root__=['dog', 'cat']))
# > Pets __root__=['dog', 'cat']

print(Pets.parse_obj(['dog', 'cat']))
# > Pets __root__=['dog', 'cat']

print(Pets.schema())
# > {'title': 'Pets', 'type': 'array', 'items': {'type': 'string'}}

pets_schema = schema([Pets])
print(json.dumps(pets_schema, indent=2))

# {
#  "definitions": {
#    "Pets": {
#      "title": "Pets",
#      "type": "array",
#      ...

Outputs:

{
  "definitions": {
    "Pets": {
      "title": "Pets",
      "type": "array",
      "items": {
        "type": "string"
      }
    }
  }
}

Helper Functions

Pydantic provides three classmethod helper functions on models for parsing data:

parse_obj

this is almost identical to the __init__ method of the model, except that if the object passed is not a dict, ValidationError will be raised (rather than python raising a TypeError).

parse_raw

takes a str or bytes, parses it as json or pickle data, and then passes the result to parse_obj. The data type is inferred from the content_type argument; otherwise json is assumed.

parse_file

reads a file and passes the contents to parse_raw; if content_type is omitted, it is inferred from the file’s extension.

import pickle
from datetime import datetime
from pydantic import BaseModel, ValidationError

class User(BaseModel):
    id: int
    name = 'John Doe'
    signup_ts: datetime = None

m = User.parse_obj({'id': 123, 'name': 'James'})
print(m)
# > User id=123 name='James' signup_ts=None

try:
    User.parse_obj(['not', 'a', 'dict'])
except ValidationError as e:
    print(e)
# > error validating input
# > User expected dict not list (error_type=TypeError)

m = User.parse_raw('{"id": 123, "name": "James"}')  # assumes json as no content type passed
print(m)
# > User id=123 name='James' signup_ts=None

pickle_data = pickle.dumps({'id': 123, 'name': 'James', 'signup_ts': datetime(2017, 7, 14)})
m = User.parse_raw(pickle_data, content_type='application/pickle', allow_pickle=True)
print(m)
# > User id=123 name='James' signup_ts=datetime.datetime(2017, 7, 14, 0, 0)

(This script is complete, it should run “as is”)

Note

Since pickle allows complex objects to be encoded, to use it you need to explicitly pass allow_pickle to the parsing function.
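The dispatch described for parse_raw, including the explicit opt-in for pickle, can be sketched as follows. The function load_raw and its error messages are assumptions made for this example; it only returns the decoded object and leaves out the final parse_obj step.

```python
import json
import pickle


def load_raw(data, content_type=None, allow_pickle=False):
    """Decode raw data as json (the default) or pickle (opt-in only)."""
    if content_type is None or content_type.endswith('json'):
        return json.loads(data)
    if content_type.endswith('pickle'):
        # Unpickling can execute arbitrary code, hence the explicit flag.
        if not allow_pickle:
            raise RuntimeError('pickle data passed but allow_pickle is False')
        return pickle.loads(data)
    raise TypeError(f'unknown content type: {content_type}')


print(load_raw('{"id": 123, "name": "James"}'))
# > {'id': 123, 'name': 'James'}

blob = pickle.dumps({'id': 123, 'name': 'James'})
print(load_raw(blob, content_type='application/pickle', allow_pickle=True))
# > {'id': 123, 'name': 'James'}

try:
    load_raw(blob, content_type='application/pickle')
except RuntimeError as e:
    print(e)
# > pickle data passed but allow_pickle is False
```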

Model Config

Behaviour of pydantic can be controlled via the Config class on a model.

Options:

title

title for the generated JSON Schema

anystr_strip_whitespace

whether to strip leading and trailing whitespace for str & byte types (default: False)

min_anystr_length

min length for str & byte types (default: 0)

max_anystr_length

max length for str & byte types (default: 2 ** 16)

validate_all

whether or not to validate field defaults (default: False)

extra

whether to ignore, allow or forbid extra attributes in model. Can use either string values of ignore, allow or forbid, or use Extra enum (default is Extra.ignore)

allow_mutation

whether or not attribute assignment is allowed; if False, models are faux-immutable, i.e. __setattr__ fails (default: True)

use_enum_values

whether to populate models with the value property of enums, rather than the raw enum - useful if you want to serialise model.dict() later (default: False)

fields

schema information on each field, this is equivalent to using the schema class (default: None)

validate_assignment

whether to perform validation on assignment to attributes or not (default: False)

allow_population_by_alias

whether or not an aliased field may be populated by its name as given by the model attribute, rather than strictly the alias; please be sure to read the warning below before enabling this (default: False)

error_msg_templates

lets you override default error message templates. Pass in a dictionary with keys matching the error messages you want to override (default: {})

arbitrary_types_allowed

whether to allow arbitrary user types for fields (they are validated simply by checking if the value is an instance of that type). If False, a RuntimeError will be raised on model declaration (default: False)

json_encoders

customise the way types are encoded to json, see JSON Serialisation for more details.

orm_mode

allows usage of ORM mode

alias_generator

callable that takes field name and returns alias for it

keep_untouched

tuple of types (e.g. descriptors) that won’t change during model creation and won’t be included in the model schemas.

schema_extra

takes a dict to extend/update the generated JSON Schema

Warning

Think twice before enabling allow_population_by_alias! Enabling it could cause previously correct code to become subtly incorrect. As an example, say you have a field named card_number with the alias cardNumber. With population by alias disabled (the default), trying to parse an object with only the key card_number will fail. However, if you enable population by alias, the card_number field can now be populated from cardNumber or card_number, and the previously-invalid example object would now be valid. This may be desired for some use cases, but in others (like the one given here, perhaps!), relaxing strictness with respect to aliases could introduce bugs.
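The difference between the two modes can be sketched without pydantic at all. The helper below is hypothetical (it is not pydantic’s implementation, and the names ALIASES and populate are assumptions made for illustration); it only mimics the lookup rule described in the warning, using plain dicts:

```python
from typing import Any, Dict

# Hypothetical mapping of model attribute names to their aliases.
ALIASES = {'card_number': 'cardNumber'}


def populate(data: Dict[str, Any], by_name: bool = False) -> Dict[str, Any]:
    """Collect field values from `data`, looking each field up by its alias.

    With by_name=True the attribute name is accepted as a fallback, which is
    the relaxation that enabling population by alias introduces.
    """
    values = {}
    for name, alias in ALIASES.items():
        if alias in data:
            values[name] = data[alias]
        elif by_name and name in data:
            values[name] = data[name]
        else:
            raise KeyError(f'field required: {alias}')
    return values


payload = {'card_number': '4242'}  # uses the attribute name, not the alias

try:
    populate(payload)
    strict_ok = True
except KeyError:
    strict_ok = False  # strict mode rejects the un-aliased key

relaxed = populate(payload, by_name=True)  # relaxed mode accepts it
print(strict_ok, relaxed)
```

An object that was rejected in strict mode is silently accepted in relaxed mode, which is exactly the change in behaviour the warning asks you to consider.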

from pydantic import BaseModel, ValidationError


class Model(BaseModel):
    v: str

    class Config:
        max_anystr_length = 10
        error_msg_templates = {
            'value_error.any_str.max_length': 'max_length:{limit_value}',
        }


try:
    Model(v='x' * 20)
except ValidationError as e:
    print(e)
"""
1 validation error
v
  max_length:10 (type=value_error.any_str.max_length; limit_value=10)
"""

(This script is complete, it should run “as is”)
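The {limit_value} placeholder in the template is filled from the context attached to the error (the limit_value=10 shown in the output above). A minimal sketch of that substitution, assuming str.format-style templates and with the context dict written by hand rather than produced by pydantic:

```python
# Context values attached to the error (written by hand for this sketch).
ctx = {'limit_value': 10}

# Same template string as in the Config above.
template = 'max_length:{limit_value}'

# Each {name} placeholder is replaced by the matching context value.
message = template.format(**ctx)
print(message)
# > max_length:10
```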

Version for models based on @dataclass decorator:

from datetime import datetime

from pydantic import ValidationError
from pydantic.dataclasses import dataclass


class MyConfig:
    max_anystr_length = 10
    validate_assignment = True
    error_msg_templates = {
        'value_error.any_str.max_length': 'max_length:{limit_value}',
    }


@dataclass(config=MyConfig)
class User:
    id: int
    name: str = 'John Doe'
    signup_ts: datetime = None


user = User(id='42', signup_ts='2032-06-21T12:00')
try:
    user.name = 'x' * 20
except ValidationError as e:
    print(e)
"""
1 validation error
name
  max_length:10 (type=value_error.any_str.max_length; limit_value=10)
"""

(This script is complete, it should run “as is”)

Alias Generator

If the field names in your data source do not match your code style (e.g. CamelCase fields), you can automatically generate aliases using alias_generator:

from pydantic import BaseModel

def to_camel(string: str) -> str:
    return ''.join(word.capitalize() for word in string.split('_'))

class Voice(BaseModel):
    name: str
    gender: str
    language_code: str

    class Config:
        alias_generator = to_camel

voice = Voice(Name='Filiz', Gender='Female', LanguageCode='tr-TR')
print(voice.language_code)
print(voice.dict(by_alias=True))

"""
tr-TR
{'Name': 'Filiz', 'Gender': 'Female', 'LanguageCode': 'tr-TR'}
"""

(This script is complete, it should run “as is”)
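Since an alias generator is just a callable from field name to alias, other naming styles only need a different function. The sketch below repeats to_camel from the example and adds a lowerCamelCase variant; to_lower_camel is a hypothetical helper written for illustration, not something pydantic provides:

```python
def to_camel(string: str) -> str:
    # 'language_code' -> 'LanguageCode'
    return ''.join(word.capitalize() for word in string.split('_'))


def to_lower_camel(string: str) -> str:
    # 'language_code' -> 'languageCode': the first word is left untouched
    first, *rest = string.split('_')
    return first + ''.join(word.capitalize() for word in rest)


fields = ['name', 'gender', 'language_code']
print({field: to_camel(field) for field in fields})
print({field: to_lower_camel(field) for field in fields})
```

Either function could be assigned to alias_generator in a Config class, as in the example above.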

Settings

One of pydantic’s most useful applications is to define default settings and allow them to be overridden by environment variables or keyword arguments (e.g. in unit tests).

from typing import Set

from pydantic import BaseModel, DSN, BaseSettings, PyObject


class SubModel(BaseModel):
    foo = 'bar'
    apple = 1


class Settings(BaseSettings):
    redis_host = 'localhost'
    redis_port = 6379
    redis_database = 0
    redis_password: str = None

    auth_key: str = ...

    invoicing_cls: PyObject = 'path.to.Invoice'

    db_name = 'foobar'
    db_user = 'postgres'
    db_password: str = None
    db_host = 'localhost'
    db_port = '5432'
    db_driver = 'postgres'
    db_query: dict = None
    dsn: DSN = None

    # to override domains:
    # export MY_PREFIX_DOMAINS='["foo.com", "bar.com"]'
    domains: Set[str] = set()

    # to override more_settings:
    # export MY_PREFIX_MORE_SETTINGS='{"foo": "x", "apple": 1}'
    more_settings: SubModel = SubModel()

    class Config:
        env_prefix = 'MY_PREFIX_'  # defaults to 'APP_'
        fields = {
            'auth_key': {
                'alias': 'my_api_key'
            }
        }

(This script is complete, it should run “as is”)

Here redis_port could be modified via export MY_PREFIX_REDIS_PORT=6380 or auth_key by export my_api_key=6380.

By default BaseSettings considers field values in the following priority (where 3. has the highest priority and overrides the other two):

  1. The default values set in your Settings class

  2. Environment variables, e.g. MY_PREFIX_REDIS_PORT as described above.

  3. Arguments passed to the Settings class on initialisation.

This behaviour can be changed by overriding the _build_values method on BaseSettings.

Complex types like list, set, dict and submodels can be set by using JSON environment variables.
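The priority order and the JSON handling for complex types can be sketched with plain dicts. This is an illustration only: build_values, DEFAULTS and COMPLEX_FIELDS are hypothetical names, the environment is a hand-written dict rather than os.environ, and the code is not pydantic’s _build_values implementation:

```python
import json
from typing import Any, Dict, Mapping

# Hypothetical defaults, standing in for the fields of a Settings class.
DEFAULTS: Dict[str, Any] = {
    'redis_host': 'localhost',
    'redis_port': 6379,
    'domains': set(),
}
# Fields whose environment values are expected to be JSON.
COMPLEX_FIELDS = {'domains'}


def build_values(
    init_kwargs: Mapping[str, Any],
    environ: Mapping[str, str],
    env_prefix: str = 'MY_PREFIX_',
) -> Dict[str, Any]:
    values = dict(DEFAULTS)  # 1. lowest priority: class defaults
    for name in DEFAULTS:
        raw = environ.get(env_prefix + name.upper())
        if raw is None:
            continue
        # 2. environment variables override defaults; complex types are JSON
        values[name] = json.loads(raw) if name in COMPLEX_FIELDS else raw
    values.update(init_kwargs)  # 3. highest priority: init arguments
    return values


fake_environ = {
    'MY_PREFIX_REDIS_PORT': '6380',
    'MY_PREFIX_DOMAINS': '["foo.com", "bar.com"]',
}
print(build_values({}, fake_environ))
print(build_values({'redis_port': 7000}, fake_environ))
```

The values returned here are still raw (the port read from the environment is the string '6380'); in pydantic the merged values then go through normal field validation and coercion.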

Environment variables can be read in a case insensitive manner:

from pydantic import BaseSettings


class Settings(BaseSettings):
    redis_host = 'localhost'

    class Config:
        case_insensitive = True

Here redis_host could be modified via export APP_REDIS_HOST, export app_redis_host, export app_REDIS_host, etc.

Dynamic model creation

There are some occasions where the shape of a model is not known until runtime; for this, pydantic provides the create_model method to allow models to be created on the fly.

from pydantic import BaseModel, create_model

DynamicFoobarModel = create_model('DynamicFoobarModel', foo=(str, ...), bar=123)


class StaticFoobarModel(BaseModel):
    foo: str
    bar: int = 123

Here StaticFoobarModel and DynamicFoobarModel are identical.

Fields are defined by either a tuple of the form (<type>, <default value>) or just a default value. The special keyword arguments __config__ and __base__ can be used to customise the new model. This includes extending a base model with extra fields.

from pydantic import BaseModel, create_model


class FooModel(BaseModel):
    foo: str
    bar: int = 123


BarModel = create_model('BarModel', apple='russet', banana='yellow', __base__=FooModel)
print(BarModel)
# > <class 'pydantic.main.BarModel'>
print(', '.join(BarModel.__fields__.keys()))
# > foo, bar, apple, banana
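The two accepted forms of a field definition can be illustrated with a small stdlib-only sketch. normalise_fields is a hypothetical helper, not part of pydantic, and inferring the type with type() is a simplifying assumption; pydantic’s own inference is richer:

```python
from typing import Any, Dict, Tuple


def normalise_fields(**field_definitions: Any) -> Dict[str, Tuple[type, Any]]:
    """Turn each definition into a (type, default) pair."""
    fields = {}
    for name, definition in field_definitions.items():
        if isinstance(definition, tuple):
            # explicit form: (<type>, <default value>); ... marks "required"
            field_type, default = definition
        else:
            # short form: only a default value, the type is inferred from it
            field_type, default = type(definition), definition
        fields[name] = (field_type, default)
    return fields


print(normalise_fields(foo=(str, ...), bar=123))
# > {'foo': (<class 'str'>, Ellipsis), 'bar': (<class 'int'>, 123)}
```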

Usage with mypy

Pydantic works with mypy provided you use the “annotation only” version of required variables:

from datetime import datetime
from typing import List, Optional
from pydantic import BaseModel, NoneStr

class Model(BaseModel):
    age: int
    first_name = 'John'
    last_name: NoneStr = None
    signup_ts: Optional[datetime] = None
    list_of_ints: List[int]

m = Model(age=42, list_of_ints=[1, '2', b'3'])
print(m.age)
# > 42

Model()
# will raise a validation error for age and list_of_ints

(This script is complete, it should run “as is”)

You can also run it through mypy with:

mypy --ignore-missing-imports --follow-imports=skip --strict-optional pydantic_mypy_test.py

Strict Optional

For your code to pass with --strict-optional you need to use Optional[] or an alias of Optional[] for all fields with a None default; this is standard with mypy.

Pydantic provides a few useful optional or union types:

  • NoneStr aka. Optional[str]

  • NoneBytes aka. Optional[bytes]

  • StrBytes aka. Union[str, bytes]

  • NoneStrBytes aka. Optional[StrBytes]

If these aren’t sufficient you can of course define your own.
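Defining your own alias needs nothing beyond the typing module. The names below (NoneInt, IntFloat, NoneIntFloat) are hypothetical examples that simply follow the naming pattern of the list above:

```python
from typing import Optional, Union

# Hypothetical aliases, named in the style of NoneStr / StrBytes.
NoneInt = Optional[int]
IntFloat = Union[int, float]
NoneIntFloat = Optional[IntFloat]

# typing flattens nested unions, so this is Union[int, float, None]
print(NoneIntFloat)
```

Such aliases can then be used as field annotations in the same way as NoneStr.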

Required Fields and mypy

The ellipsis notation ... will not work with mypy; you need to use annotation only fields as in the example above.

Warning

Be aware that using annotation only fields will alter the order of your fields in metadata and errors: annotation only fields will always come first, but still in the order they were defined.

To get round this you can use the Required (via from pydantic import Required) field as an alias for ellipses or annotation only.

Faux Immutability

Models can be configured to be immutable via allow_mutation = False; this will prevent changing attributes of a model.

Warning

Immutability in python is never strict. If developers are determined/stupid they can always modify a so-called “immutable” object.

from pydantic import BaseModel


class FooBarModel(BaseModel):
    a: str
    b: dict

    class Config:
        allow_mutation = False


foobar = FooBarModel(a='hello', b={'apple': 'pear'})

try:
    foobar.a = 'different'
except TypeError as e:
    print(e)
    # > "FooBarModel" is immutable and does not support item assignment

print(foobar.a)
# > hello

print(foobar.b)
# > {'apple': 'pear'}

foobar.b['apple'] = 'grape'
print(foobar.b)
# > {'apple': 'grape'}

Trying to change a caused an error and it remains unchanged; however, the dict b is mutable, and the immutability of foobar doesn’t stop b from being changed.
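Why the protection is only “faux” can be seen in a stdlib-only sketch. The Frozen class below is hypothetical and is not how pydantic implements allow_mutation; it only assumes, as the Config description says, that assignment is blocked in __setattr__:

```python
class Frozen:
    def __init__(self, **values):
        # writing to __dict__ directly does not go through __setattr__
        self.__dict__.update(values)

    def __setattr__(self, name, value):
        raise TypeError(
            f'"{type(self).__name__}" is immutable and does not support item assignment'
        )


obj = Frozen(a='hello', b={'apple': 'pear'})

try:
    obj.a = 'different'
except TypeError as e:
    print(e)

# Mutating the nested dict never calls Frozen.__setattr__ ...
obj.b['apple'] = 'grape'
print(obj.b)

# ... and a determined developer can bypass __setattr__ entirely.
object.__setattr__(obj, 'a', 'forced')
print(obj.a)
```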

Copying

The dict function returns a dictionary containing the attributes of a model; sub-models are recursively converted to dicts. copy allows models to be duplicated, which is particularly useful for immutable models.

dict, copy, and json (described below) all take the optional include and exclude keyword arguments to control which attributes are returned or copied, respectively. copy accepts two extra keyword arguments: update, a dict mapping attributes to new values that will be applied as the model is duplicated, and deep, which makes a deep copy of the model.

dict and json take the optional skip_defaults keyword argument, which will skip attributes that were not explicitly set. This is useful to reduce the serialized size of models that have many default fields that are not often changed.

from pydantic import BaseModel

class BarModel(BaseModel):
    whatever: int

class FooBarModel(BaseModel):
    banana: float
    foo: str
    bar: BarModel

m = FooBarModel(banana=3.14, foo='hello', bar={'whatever': 123})

print(m.dict())
# (returns a dictionary)
# > {'banana': 3.14, 'foo': 'hello', 'bar': {'whatever': 123}}

print(m.dict(include={'foo', 'bar'}))
# > {'foo': 'hello', 'bar': {'whatever': 123}}

print(m.dict(exclude={'foo', 'bar'}))
# > {'banana': 3.14}

print(m.copy())
# > FooBarModel banana=3.14 foo='hello' bar=<BarModel whatever=123>

print(m.copy(include={'foo', 'bar'}))
# > FooBarModel foo='hello' bar=<BarModel whatever=123>

print(m.copy(exclude={'foo', 'bar'}))
# > FooBarModel banana=3.14

print(m.copy(update={'banana': 0}))
# > FooBarModel banana=0 foo='hello' bar=<BarModel whatever=123>

print(id(m.bar), id(m.copy().bar))
# normal copy gives the same object reference for `bar`
# > 140494497582280 140494497582280

print(id(m.bar), id(m.copy(deep=True).bar))
# deep copy gives a new object reference for `bar`
# > 140494497582280 140494497582856
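The same shallow versus deep distinction exists in Python’s own copy module. The sketch below uses plain classes rather than pydantic models (Bar and FooBar are stand-ins written for illustration) to show what sharing a nested object means in practice:

```python
import copy


class Bar:
    def __init__(self, whatever):
        self.whatever = whatever


class FooBar:
    def __init__(self, banana, bar):
        self.banana = banana
        self.bar = bar


m = FooBar(3.14, Bar(123))

shallow = copy.copy(m)    # new outer object, nested `bar` is shared
deep = copy.deepcopy(m)   # new outer object and a new nested `bar`

print(shallow.bar is m.bar)
# > True
print(deep.bar is m.bar)
# > False

# A change made through the shallow copy is visible on the original.
shallow.bar.whatever = 456
print(m.bar.whatever, deep.bar.whatever)
# > 456 123
```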

Advanced include and exclude

The dict, json and copy methods support include and exclude arguments which can either be sets or dictionaries, allowing nested selection of which fields to export:

from pydantic import BaseModel, SecretStr

class User(BaseModel):
    id: int
    username: str
    password: SecretStr

class Transaction(BaseModel):
    id: str
    user: User
    value: int

transaction = Transaction(
    id="1234567890",
    user=User(
        id=42,
        username="JohnDoe",
        password="hashedpassword"
    ),
    value=9876543210
)

# using a set:
print(transaction.dict(exclude={'user', 'value'}))
#> {'id': '1234567890'}

# using a dict:
print(transaction.dict(exclude={'user': {'username', 'password'}, 'value': ...}))
#> {'id': '1234567890', 'user': {'id': 42}}

print(transaction.dict(include={'id': ..., 'user': {'id'}}))
#> {'id': '1234567890', 'user': {'id': 42}}

The ... value indicates that we want to exclude or include an entire key, just as if we had included it in a set.

Of course, the same can be done at any depth level:

import datetime
from typing import List

from pydantic import BaseModel, SecretStr

class Country(BaseModel):
    name: str
    phone_code: int

class Address(BaseModel):
    post_code: int
    country: Country

class CardDetails(BaseModel):
    number: SecretStr
    expires: datetime.date

class Hobby(BaseModel):
    name: str
    info: str

class User(BaseModel):
    first_name: str
    second_name: str
    address: Address
    card_details: CardDetails
    hobbies: List[Hobby]

user = User(
    first_name='John',
    second_name='Doe',
    address=Address(
        post_code=123456,
        country=Country(
            name='USA',
            phone_code=1
        )
    ),
    card_details=CardDetails(
        number=4212934504460000,
        expires=datetime.date(2020, 5, 1)
    ),
    hobbies=[
        Hobby(name='Programming', info='Writing code and stuff'),
        Hobby(name='Gaming', info='Hell Yeah!!!')
    ]

)

exclude_keys = {
    'second_name': ...,
    'address': {'post_code': ..., 'country': {'phone_code'}},
    'card_details': ...,
    'hobbies': {-1: {'info'}},  # You can exclude values from tuples and lists by indexes
}

include_keys = {
    'first_name': ...,
    'address': {'country': {'name'}},
    'hobbies': {0: ..., -1: {'name'}}
}

print(
    user.dict(include=include_keys) == user.dict(exclude=exclude_keys) == {
        'first_name': 'John',
        'address': {'country': {'name': 'USA'}},
        'hobbies': [
            {'name': 'Programming', 'info': 'Writing code and stuff'},
            {'name': 'Gaming'}
        ]
    }
)
# True

The same goes for the json and copy methods.
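To make the exclude rules concrete, here is a stdlib-only sketch that applies the same kind of nested specification to plain dicts and lists. apply_exclude is a hypothetical helper and not pydantic’s implementation; it assumes a set is shorthand for a dict whose values are all ..., and that negative list indexes count from the end:

```python
from typing import Any


def apply_exclude(value: Any, exclude: Any) -> Any:
    """Return a copy of `value` without the keys described by `exclude`."""
    if isinstance(exclude, set):
        # a set is treated as a dict with ... for every key
        exclude = {key: ... for key in exclude}
    if isinstance(value, list):
        size = len(value)
        # normalise negative indexes so -1 refers to the last item
        rules = {(i + size if i < 0 else i): rule for i, rule in exclude.items()}
        return [
            apply_exclude(item, rules[i]) if i in rules else item
            for i, item in enumerate(value)
            if rules.get(i) is not ...
        ]
    return {
        key: apply_exclude(item, exclude[key]) if key in exclude else item
        for key, item in value.items()
        if exclude.get(key) is not ...
    }


user = {
    'first_name': 'John',
    'second_name': 'Doe',
    'address': {'post_code': 123456, 'country': {'name': 'USA', 'phone_code': 1}},
    'hobbies': [
        {'name': 'Programming', 'info': 'Writing code and stuff'},
        {'name': 'Gaming', 'info': 'Hell Yeah!!!'},
    ],
}
exclude_keys = {
    'second_name': ...,
    'address': {'post_code': ..., 'country': {'phone_code'}},
    'hobbies': {-1: {'info'}},
}
result = apply_exclude(user, exclude_keys)
print(result)
```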

Serialisation

pydantic has native support for serialisation to JSON and Pickle; you can of course serialise to any other format you like by processing the result of dict().

JSON Serialisation

The json() method will serialise a model to JSON; json() in turn calls dict() and serialises its result.

Serialisation can be customised on a model using the json_encoders config property. The keys should be types and the values should be functions which serialise that type; see the example below.

If this is not sufficient, json() takes an optional encoder argument which allows complete control over how non-standard types are encoded to JSON.

from datetime import datetime, timedelta
from pydantic import BaseModel
from pydantic.json import timedelta_isoformat

class BarModel(BaseModel):
    whatever: int

class FooBarModel(BaseModel):
    foo: datetime
    bar: BarModel

m = FooBarModel(foo=datetime(2032, 6, 1, 12, 13, 14), bar={'whatever': 123})
print(m.json())
# (returns a str)
# > {"foo": "2032-06-01T12:13:14", "bar": {"whatever": 123}}

class WithCustomEncoders(BaseModel):
    dt: datetime
    diff: timedelta

    class Config:
        json_encoders = {
            datetime: lambda v: (v - datetime(1970, 1, 1)).total_seconds(),
            timedelta: timedelta_isoformat,
        }

m = WithCustomEncoders(dt=datetime(2032, 6, 1), diff=timedelta(hours=100))
print(m.json())
# > {"dt": 1969660800.0, "diff": "P4DT4H0M0.000000S"}

(This script is complete, it should run “as is”)

By default timedeltas are encoded as a simple float of total seconds. The timedelta_isoformat function is provided as an optional alternative which implements ISO 8601 time diff encoding.
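The idea of a type-to-function mapping can be reproduced with the standard library alone, using the default argument of json.dumps. This is a sketch of the concept, not of pydantic’s encoder; ENCODERS and encode are hypothetical names:

```python
import json
from datetime import datetime, timedelta

# Hypothetical mapping in the style of json_encoders: type -> serialiser.
ENCODERS = {
    datetime: lambda v: (v - datetime(1970, 1, 1)).total_seconds(),
    timedelta: lambda v: v.total_seconds(),
}


def encode(obj):
    """Called by json.dumps for objects it cannot serialise itself."""
    for type_, encoder in ENCODERS.items():
        if isinstance(obj, type_):
            return encoder(obj)
    raise TypeError(f'Object of type {type(obj).__name__} is not JSON serializable')


data = {'dt': datetime(2032, 6, 1), 'diff': timedelta(hours=100)}
encoded = json.dumps(data, default=encode)
print(encoded)
# > {"dt": 1969660800.0, "diff": 360000.0}
```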

Pickle Serialisation

Using the same plumbing as copy(), pydantic models support efficient pickling and unpickling.

import pickle
from pydantic import BaseModel


class FooBarModel(BaseModel):
    a: str
    b: int


m = FooBarModel(a='hello', b=123)
print(m)
# > FooBarModel a='hello' b=123

data = pickle.dumps(m)
print(data)
# > b'\x80\x03c...'

m2 = pickle.loads(data)
print(m2)
# > FooBarModel a='hello' b=123

(This script is complete, it should run “as is”)

Abstract Base Classes

Pydantic models can be used alongside Python’s Abstract Base Classes (ABCs).

import abc
from pydantic import BaseModel


class FooBarModel(BaseModel, abc.ABC):
    a: str
    b: int

    @abc.abstractmethod
    def my_abstract_method(self):
        pass

(This script is complete, it should run “as is”)

Postponed Annotations

Note

Both postponed annotations via the future import and ForwardRef require python 3.7+.

Support for those features starts from pydantic v0.18.

Postponed annotations (as described in PEP 563) “just work”.

from __future__ import annotations
from typing import List
from pydantic import BaseModel

class Model(BaseModel):
    a: List[int]

print(Model(a=('1', 2, 3)))
#> Model a=[1, 2, 3]

(This script is complete, it should run “as is”)

Internally pydantic will call a method similar to typing.get_type_hints to resolve annotations.

In cases where the referenced type is not yet defined, ForwardRef can be used (although referencing the type directly or by its string is a simpler solution in the case of self-referencing models).

You may need to call Model.update_forward_refs() after creating the model. This is because, in the example below, Foo doesn’t exist before it has been created, so the ForwardRef can’t initially be resolved. You have to wait until after Foo is created, then call update_forward_refs to properly set types before the model can be used.

from typing import ForwardRef
from pydantic import BaseModel

Foo = ForwardRef('Foo')

class Foo(BaseModel):
    a: int = 123
    b: Foo = None

Foo.update_forward_refs()

print(Foo())
#> Foo a=123 b=None
print(Foo(b={'a': '321'}))
#> Foo a=123 b=<Foo a=321 b=None>

(This script is complete, it should run “as is”)

Warning

To resolve strings (type names) into annotations (types) pydantic needs a dict to look them up in; for this it uses module.__dict__, just as get_type_hints does. That means pydantic does not play well with types not defined in the global scope of a module.

For example, this works fine:

from __future__ import annotations
from typing import List  # <-- List is defined in the module's global scope
from pydantic import BaseModel

def this_works():
    class Model(BaseModel):
        a: List[int]
    print(Model(a=(1, 2)))

While this will break:

from __future__ import annotations
from pydantic import BaseModel

def this_is_broken():
    from typing import List  # <-- List is defined inside the function so is not in the module's global scope
    class Model(BaseModel):
        a: List[int]
    print(Model(a=(1, 2)))

Resolving this is beyond the call for pydantic: either remove the future import or declare the types globally.
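The underlying limitation can be reproduced with typing.get_type_hints alone, without pydantic. In the sketch below a string annotation stands in for the effect of the future import, and build_model, Item and Model are hypothetical names used only for illustration:

```python
import typing


def build_model():
    class Item:  # defined inside the function, so not a module-level name
        pass

    class Model:
        a: 'Item'  # string annotation, as produced by postponed evaluation

    return Model, {'Item': Item}


Model, local_names = build_model()

try:
    typing.get_type_hints(Model)
    resolved_globally = True
except NameError:
    # 'Item' cannot be found in the module's global namespace
    resolved_globally = False

# Supplying the missing name explicitly lets the lookup succeed.
hints = typing.get_type_hints(Model, localns=local_names)
print(resolved_globally, hints)
```

The lookup only works when the namespace containing the name is available, which is why declaring the types globally avoids the problem.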

Usage of Union in Annotations and Type Order

The Union type allows a model attribute to accept different types, e.g.:

(This script is complete and should run, but its output may not be what you expect; see below)

from uuid import UUID
from typing import Union
from pydantic import BaseModel


class User(BaseModel):
    id: Union[int, str, UUID]
    name: str


user_01 = User(id=123, name='John Doe')
print(user_01)
# > User id=123 name='John Doe'
print(user_01.id)
# > 123

user_02 = User(id='1234', name='John Doe')
print(user_02)
# > User id=1234 name='John Doe'
print(user_02.id)
# > 1234

user_03_uuid = UUID('cf57432e-809e-4353-adbd-9d5c0d733868')
user_03 = User(id=user_03_uuid, name='John Doe')
print(user_03)
# > User id=275603287559914445491632874575877060712 name='John Doe'
print(user_03.id)
# > 275603287559914445491632874575877060712
print(user_03_uuid.int)
# > 275603287559914445491632874575877060712

However, as can be seen above, pydantic will attempt to ‘match’ any of the types defined under Union and will use the first one that matches. In the above example the id of user_03 was defined as a uuid.UUID class (which is defined under the attribute’s Union annotation) but as the uuid.UUID can be marshalled into an int it chose to match against the int type and disregarded the other types.

As such, when defining Union annotations it is recommended that the most specific type comes first, followed by less specific types. In the above example, the UUID class should precede the int and str classes to avoid the unexpected representation:

from uuid import UUID
from typing import Union
from pydantic import BaseModel


class User(BaseModel):
    id: Union[UUID, int, str]
    name: str


user_03_uuid = UUID('cf57432e-809e-4353-adbd-9d5c0d733868')
user_03 = User(id=user_03_uuid, name='John Doe')
print(user_03)
# > User id=UUID('cf57432e-809e-4353-adbd-9d5c0d733868') name='John Doe'
print(user_03.id)
# > cf57432e-809e-4353-adbd-9d5c0d733868
print(user_03_uuid.int)
# > 275603287559914445491632874575877060712

(This script is complete, it should run “as is”)
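The “first type that matches wins” behaviour can be sketched with the standard library. first_match, to_uuid and COERCERS are hypothetical names, and the coercion rules are deliberately simplistic; this is not pydantic’s validation code:

```python
from uuid import UUID


def to_uuid(value):
    # accept UUID instances as-is, otherwise try to parse the string form
    return value if isinstance(value, UUID) else UUID(str(value))


# Hypothetical coercion function for each supported type.
COERCERS = {int: int, str: str, UUID: to_uuid}


def first_match(value, types):
    """Try each type in order and return the first successful coercion."""
    for type_ in types:
        try:
            return COERCERS[type_](value)
        except (TypeError, ValueError):
            continue
    raise ValueError(f'{value!r} matches none of {types}')


uid = UUID('cf57432e-809e-4353-adbd-9d5c0d733868')

# int comes first and a UUID can be converted with int(), so int wins
print(type(first_match(uid, (int, str, UUID))).__name__)
# > int

# UUID comes first, so the value is kept as a UUID
print(type(first_match(uid, (UUID, int, str))).__name__)
# > UUID

# a plain integer is not a valid UUID, so the next type is tried
print(first_match(123, (UUID, int, str)))
# > 123
```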

Benchmarks

Below are the results of crude benchmarks comparing pydantic to other validation libraries.

Package                   Relative Performance  Mean validation time  std. dev.
pydantic                                        12.0μs                0.263μs
toasted-marshmallow       1.9x slower           22.7μs                0.186μs
marshmallow               2.1x slower           25.2μs                0.173μs
trafaret                  2.2x slower           26.6μs                0.259μs
django-restful-framework  20.0x slower          240.7μs               1.410μs

See the benchmarks code for more details on the test case. Feel free to submit more benchmarks or improve an existing one.

Benchmarks were run with python 3.7.2 and the following package versions:

  • pydantic pre v0.27 d473f4a compiled with cython

  • toasted-marshmallow v0.2.6

  • marshmallow the version installed by toasted-marshmallow, see this issue.

  • trafaret v1.2.0

  • django-restful-framework v3.9.4

Contributing to Pydantic

We’d love you to contribute to pydantic; it should be extremely simple to get started and create a Pull Request. pydantic is released regularly, so you should see your improvements released in a matter of days or weeks.

If you’re looking for something to get your teeth into, check out the “help wanted” label on github.

To make contributing as easy and fast as possible, you’ll want to run tests and linting locally. Luckily since pydantic has few dependencies, doesn’t require compiling and tests don’t need access to databases etc., setting up and running tests should be very simple.

You’ll need to have python 3.6 or 3.7, virtualenv, git, and make installed.

# 1. clone your fork and cd into the repo directory
git clone git@github.com:<your username>/pydantic.git
cd pydantic

# 2. Set up a virtualenv for running tests
virtualenv -p `which python3.7` env
source env/bin/activate
# (or however you prefer to setup a python environment, 3.6 will work too)

# 3. Install pydantic, dependencies and test dependencies
make install

# 4. Checkout a new branch and make your changes
git checkout -b my-new-feature-branch
# make your changes...

# 5. Fix formatting and imports
make format
# Pydantic uses black to enforce formatting and isort to fix imports
# (https://github.com/ambv/black, https://github.com/timothycrosley/isort)

# 6. Run tests and linting
make
# there are a few sub-commands in Makefile like `test`, `testcov` and `lint`
# which you might want to use, but generally just `make` should be all you need

# 7. Build documentation
make docs
# if you have changed the documentation make sure it builds successfully

# ... commit, push, and create your pull request

tl;dr: use make format to fix formatting, make to run tests and linting & make docs to build docs.

Using Pydantic

Third party libraries based on pydantic.

  • FastAPI is a high performance API framework, easy to learn, fast to code and ready for production, based on pydantic and Starlette.

  • aiohttp-toolbox numerous utilities for aiohttp including data parsing using pydantic.

  • harrier a better static site generator built with python.

More packages using pydantic can be found by visiting pydantic’s page on libraries.io.

History

v0.32.3 (unreleased)

  • fix error messages for Literal types with multiple allowed values, #770 by @dmontagu

v0.32.2 (2019-08-17)

  • fix __post_init__ usage with dataclass inheritance, fix #739 by @samuelcolvin

  • fix required fields validation on GenericModels classes, #742 by @amitbl

  • fix defining custom Schema on GenericModel fields, #754 by @amitbl

v0.32.1 (2019-08-08)

v0.32 (2019-08-06)

  • add model name to ValidationError error message, #676 by @dmontagu

  • breaking change: remove __getattr__ and rename __values__ to __dict__ on BaseModel, deprecation warning on use __values__ attr, attributes access speed increased up to 14 times, #712 by @MrMrRobat

  • support ForwardRef (without self-referencing annotations) in Python 3.6, #706 by @koxudaxi

  • implement schema_extra in Config sub-class, #663 by @tiangolo

v0.31.1 (2019-07-31)

  • fix json generation for EnumError, #697 by @dmontagu

  • update numerous dependencies

v0.31 (2019-07-24)

v0.30.1 (2019-07-15)

  • fix so nested classes which inherit and change __init__ are correctly processed while still allowing self as a parameter, #644 by @lnaden and @dgasmith

v0.30 (2019-07-07)

v0.29 (2019-06-19)

  • support dataclasses.InitVar, #592 by @pfrederiks

  • Updated documentation to elucidate the usage of Union when defining multiple types under an attribute’s annotation and showcase how the type-order can affect marshalling of provided values, #594 by @somada141

  • add conlist type, #583 by @hmvp

  • add support for generics, #595 by @dmontagu

v0.28 (2019-06-06)

v0.27 (2019-05-30)

  • breaking change _pydantic_post_init to execute dataclass’ original __post_init__ before validation, #560 by @HeavenVolkoff

  • fix handling of generic types without specified parameters, #550 by @dmontagu

  • breaking change (maybe): this is the first release compiled with cython, see the docs and please submit an issue if you run into problems

v0.27.0a1 (2019-05-26)

  • fix JSON Schema for list, tuple, and set, #540 by @tiangolo

  • compiling with cython, manylinux binaries, some other performance improvements, #548 by @samuelcolvin

v0.26 (2019-05-22)

  • fix to schema generation for IPvAnyAddress, IPvAnyInterface, IPvAnyNetwork #498 by @pilosus

  • fix variable length tuples support, #495 by @pilosus

  • fix return type hint for create_model, #526 by @dmontagu

  • Breaking Change: fix .dict(skip_keys=True) skipping values set via alias (this involves changing validate_model() to always returns Tuple[Dict[str, Any], Set[str], Optional[ValidationError]]), #517 by @sommd

  • fix to schema generation for IPv4Address, IPv6Address, IPv4Interface, IPv6Interface, IPv4Network, IPv6Network #532 by @euri10

  • add Color type, #504 by @pilosus and @samuelcolvin

v0.25 (2019-05-05)

v0.24 (2019-04-23)

v0.23 (2019-04-04)

v0.22 (2019-03-29)

v0.21.0 (2019-03-15)

v0.20.1 (2019-02-26)

v0.20.0 (2019-02-18)

  • fix tests for python 3.8, #396 by @samuelcolvin

  • Adds fields to the dir method for autocompletion in interactive sessions, #398 by @dgasmith

  • support ForwardRef (and therefore from __future__ import annotations) with dataclasses, #397 by @samuelcolvin

v0.20.0a1 (2019-02-13)

  • breaking change (maybe): more sophisticated argument parsing for validators, any subset of values, config and field is now permitted, eg. (cls, value, field), however the variadic key word argument (“**kwargs”) must be called kwargs, #388 by @samuelcolvin

  • breaking change: Adds skip_defaults argument to BaseModel.dict() to allow skipping of fields that were not explicitly set, signature of Model.construct() changed, #389 by @dgasmith

  • add py.typed marker file for PEP-561 support, #391 by @je-l

  • Fix extra behaviour for multiple inheritance/mix-ins, #394 by @YaraslauZhylko

v0.19.0 (2019-02-04)

  • Support Callable type hint, fix #279 by @proofit404

  • Fix schema for fields with validator decorator, fix #375 by @tiangolo

  • Add multiple_of constraint to ConstrainedDecimal, ConstrainedFloat, ConstrainedInt and their related types condecimal, confloat, and conint #371, thanks @StephenBrown2

  • Deprecated ignore_extra and allow_extra Config fields in favor of extra, #352 by @liiight

  • Add type annotations to all functions, test fully with mypy, #373 by @samuelcolvin

  • fix for ‘missing’ error with validate_all or validate_always, #381 by @samuelcolvin

  • Change the second/millisecond watershed for date/datetime parsing to 2e10, #385 by @samuelcolvin

v0.18.2 (2019-01-22)

v0.18.1 (2019-01-17)

  • add ConstrainedBytes and conbytes types, #315 @Gr1N

  • adding MANIFEST.in to include license in package .tar.gz, #358 by @samuelcolvin

v0.18.0 (2019-01-13)

  • breaking change: don’t call validators on keys of dictionaries, #254 by @samuelcolvin

  • Fix validators with always=True when the default is None or the type is optional, also prevent whole validators being called for sub-fields, fix #132 by @samuelcolvin

  • improve documentation for settings priority and allow it to be easily changed, #343 by @samuelcolvin

  • fix ignore_extra=False and allow_population_by_alias=True, fix #257 by @samuelcolvin

  • breaking change: Set BaseConfig attributes min_anystr_length and max_anystr_length to None by default, fix #349 in #350 by @tiangolo

  • add support for postponed annotations, #348 by @samuelcolvin

v0.17.0 (2018-12-27)

  • fix schema for timedelta as number, #325 by @tiangolo

  • prevent validators being called repeatedly after inheritance, #327 by @samuelcolvin

  • prevent duplicate validator check in ipython, fix #312 by @samuelcolvin

  • add “Using Pydantic” section to docs, #323 by @tiangolo & #326 by @samuelcolvin

  • fix schema generation for fields annotated as : dict, : list, : tuple and : set, #330 & #335 by @nkonin

  • add support for constrained strings as dict keys in schema, #332 by @tiangolo

  • support for passing Config class in dataclasses decorator, #276 by @jarekkar (breaking change: this supersedes the validate_assignment argument with config)

  • support for nested dataclasses, #334 by @samuelcolvin

  • better errors when getting an ImportError with PyObject, #309 by @samuelcolvin

  • rename get_validators to __get_validators__, deprecation warning on use of old name, #338 by @samuelcolvin

  • support ClassVar by excluding such attributes from fields, #184 by @samuelcolvin

v0.16.1 (2018-12-10)

  • fix create_model to correctly use the passed __config__, #320 by @hugoduncan

v0.16.0 (2018-12-03)

  • breaking change: refactor schema generation to be compatible with JSON Schema and OpenAPI specs, #308 by @tiangolo

  • add schema to schema module to generate top-level schemas from base models, #308 by @tiangolo

  • add additional fields to Schema class to declare validation for str and numeric values, #311 by @tiangolo

  • rename _schema to schema on fields, #318 by @samuelcolvin

  • add case_insensitive option to BaseSettings Config, #277 by @jasonkuhrt

v0.15.0 (2018-11-18)

v0.14.0 (2018-10-02)

v0.13.1 (2018-09-21)

  • fix issue where int_validator doesn’t cast a bool to an int #264 by @nphyatt

  • add deep copy support for BaseModel.copy() #249, @gangefors

v0.13.0 (2018-08-25)

  • raise an exception if a field’s name shadows an existing BaseModel attribute #242

  • add UrlStr and urlstr types #236

  • timedelta json encoding ISO8601 and total seconds, custom json encoders #247, by @cfkanesan and @samuelcolvin

  • allow timedelta objects as values for properties of type timedelta (matches datetime etc. behavior) #247

v0.12.1 (2018-07-31)

  • fix schema generation for fields defined using typing.Any #237

v0.12.0 (2018-07-31)

  • add by_alias argument in .dict() and .json() model methods #205

  • add Json type support #214

  • support tuples #227

  • major improvements and changes to schema #213

v0.11.2 (2018-07-05)

  • add NewType support #115

  • fix list, set & tuple validation #225

  • separate out validate_model method, allow errors to be returned along with valid values #221

v0.11.1 (2018-07-02)

v0.11.0 (2018-06-28)

  • make list, tuple and set types stricter #86

  • breaking change: remove msgpack parsing #201

  • add FilePath and DirectoryPath types #10

  • model schema generation #190

  • JSON serialisation of models and schemas #133

v0.10.0 (2018-06-11)

  • add Config.allow_population_by_alias #160, thanks @bendemaree

  • breaking change: new errors format #179, thanks @Gr1N

  • breaking change: removed Config.min_number_size and Config.max_number_size #183, thanks @Gr1N

  • breaking change: correct behaviour of lt and gt arguments to conint etc. #188 for the old behaviour use le and ge #194, thanks @jaheba

  • added error context and ability to redefine error message templates using Config.error_msg_templates #183, thanks @Gr1N

  • fix typo in validator exception #150

  • copy defaults to model values, so different models don’t share objects #154

v0.9.1 (2018-05-10)

  • allow custom get_field_config on config classes #159

  • add UUID1, UUID3, UUID4 and UUID5 types #167, thanks @Gr1N

  • modify some inconsistent docstrings and annotations #173, thanks @YannLuo

  • fix type annotations for exotic types #171, thanks @Gr1N

  • re-use type validators in exotic types #171

  • scheduled monthly requirements updates #168

  • add Decimal, ConstrainedDecimal and condecimal types #170, thanks @Gr1N

v0.9.0 (2018-04-28)

  • tweak email-validator import error message #145

  • fix parse error of parse_date() and parse_datetime() when input is 0 #144, thanks @YannLuo

  • add Config.anystr_strip_whitespace and strip_whitespace kwarg to constr, both default to False #163, thanks @Gr1N

  • add ConstrainedFloat, confloat, PositiveFloat and NegativeFloat types #166, thanks @Gr1N

v0.8.0 (2018-03-25)

  • fix type annotation for inherit_config #139

  • breaking change: check for invalid field names in validators #140

  • validate attributes of parent models #141

  • breaking change: email validation now uses email-validator #142

v0.7.1 (2018-02-07)

  • fix bug with create_model modifying the base class

v0.7.0 (2018-02-06)

  • added compatibility with abstract base classes (ABCs) #123

  • add create_model method #113 #125

  • breaking change: rename .config to .__config__ on a model

  • breaking change: remove deprecated .values() on a model, use .dict() instead

  • remove use of OrderedDict and use simple dict #126

  • add Config.use_enum_values #127

  • add wildcard validators of the form @validator('*') #128

v0.6.4 (2018-02-01)

  • allow python date and time objects #122

v0.6.3 (2017-11-26)

  • fix direct install without README.rst present

v0.6.2 (2017-11-13)

  • errors for invalid validator use

  • safer check for complex models in Settings

v0.6.1 (2017-11-08)

  • prevent duplicate validators, #101

  • add always kwarg to validators, #102

v0.6.0 (2017-11-07)

  • assignment validation #94, thanks @petroswork

  • JSON in environment variables for complex types, #96

  • add validator decorators for complex validation, #97

  • deprecate values(...) and replace with .dict(...), #99

v0.5.0 (2017-10-23)

  • add UUID validation #89

  • remove index and track from error object (json) if they’re null #90

  • improve the error text when a list is provided rather than a dict #90

  • add benchmarks table to docs #91

v0.4.0 (2017-07-08)

  • show length in string validation error

  • fix aliases in config during inheritance #55

  • simplify error display

  • use unicode ellipsis in truncate

  • add parse_obj, parse_raw and parse_file helper functions #58

  • switch annotation only fields to come first in fields list not last

v0.3.0 (2017-06-21)

  • immutable models via config.allow_mutation = False, associated cleanup and performance improvement #44

  • immutable helper methods construct() and copy() #53

  • allow pickling of models #53

  • setattr is removed as __setattr__ is now intelligent #44

  • raise_exception removed, Models now always raise exceptions #44

  • instance method validators removed

  • django-rest-framework benchmarks added #47

  • fix inheritance bug #49

  • make str type stricter so list, dict etc are not coerced to strings. #52

  • add StrictStr which only allows strings as input #52
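
As a rough illustration of the immutability items in v0.3.0 (an allow_mutation switch enforced through __setattr__, plus a copy() helper), here is a standard-library sketch. FrozenPoint and its copy(**update) signature are hypothetical and only mirror the idea; they are not pydantic's code.

```python
class FrozenPoint:
    """Hypothetical class illustrating immutability via __setattr__."""

    allow_mutation = False

    def __init__(self, x: int, y: int):
        # Bypass our own __setattr__ while populating the initial values.
        object.__setattr__(self, 'x', x)
        object.__setattr__(self, 'y', y)

    def __setattr__(self, name, value):
        # Reject assignment unless mutation is explicitly allowed.
        if not self.allow_mutation:
            raise TypeError(f'{type(self).__name__} is immutable')
        object.__setattr__(self, name, value)

    def copy(self, **update):
        # Build a new instance rather than changing this one.
        values = {'x': self.x, 'y': self.y, **update}
        return type(self)(**values)


point = FrozenPoint(1, 2)
try:
    point.x = 5
    mutated = True
except TypeError:
    mutated = False

moved = point.copy(y=10)
print(mutated, (point.x, point.y), (moved.x, moved.y))
```

Since instances cannot be changed in place, a copy-with-updates helper is the natural way to derive a modified value.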

v0.2.1 (2017-06-07)

  • pypi and travis together messed up the deploy of v0.2; this release should fix it

v0.2.0 (2017-06-07)

  • breaking change: values() on a model is now a method not a property, takes include and exclude arguments

  • allow annotation only fields to support mypy

  • add pretty to_string(pretty=True) method for models

v0.1.0 (2017-06-03)

  • add docs

  • add history