pandas.DataFrame.to_json

DataFrame.to_json(path_or_buf=None, *, orient=None, date_format=None, double_precision=10, force_ascii=True, date_unit='ms', default_handler=None, lines=False, compression='infer', index=None, indent=None, storage_options=None, mode='w')

Convert the object to a JSON string.

Note that NaN and None are converted to null, and datetime objects are converted to UNIX timestamps.
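
As a small illustration of the null conversion described above, the sketch below serializes a frame that contains both NaN and None and parses the result with the standard-library json module. The frame contents and variable names are invented for this example.

```python
import json

import numpy as np
import pandas as pd

# Hypothetical frame with one missing float and one missing string.
df = pd.DataFrame({"x": [1.0, np.nan], "y": ["a", None]})

text = df.to_json(orient="records")
records = json.loads(text)

# Both kinds of missing value are written as JSON null,
# which json.loads turns back into Python None.
print(records)
```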

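Parameters:

path_or_buf : str, path object, file-like object, or None, default None

String, path object (implementing os.PathLike[str]), or file-like object implementing a write() function. If None, the result is returned as a string.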

orient : str

Indication of expected JSON string format.

Series:
- default is 'index'
- allowed values are: {'split', 'records', 'index', 'table'}.
DataFrame:
- default is 'columns'
- allowed values are: {'split', 'records', 'index', 'columns', 'values', 'table'}.
The format of the JSON string:
- 'split' : dict like {'index' -> [index], 'columns' -> [columns], 'data' -> [values]}
- 'records' : list like [{column -> value}, … , {column -> value}]
- 'index' : dict like {index -> {column -> value}}
- 'columns' : dict like {column -> {index -> value}}
- 'values' : just the values array
- 'table' : dict like {'schema': {schema}, 'data': {data}}
  describing the data, where the data component is like orient='records'.

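date_format : {None, 'epoch', 'iso'}

Type of date conversion. 'epoch' = epoch milliseconds, 'iso' = ISO8601. The default depends on the orient. For orient='table', the default is 'iso'. For all other orients, the default is 'epoch'.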
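
To make the two date_format choices concrete, the following sketch encodes a single timestamp both ways. The column name and date are arbitrary; the exact ISO string (for example whether a timezone suffix appears) may differ between pandas versions, so only its leading part is described in the comments.

```python
import json

import pandas as pd

# Hypothetical one-row frame holding midnight on 2024-01-01.
df = pd.DataFrame({"when": pd.to_datetime(["2024-01-01"])})

epoch_value = json.loads(
    df.to_json(orient="records", date_format="epoch")
)[0]["when"]
iso_value = json.loads(
    df.to_json(orient="records", date_format="iso")
)[0]["when"]

print(epoch_value)  # integer count of milliseconds since 1970-01-01
print(iso_value)    # ISO8601 string starting with 2024-01-01T00:00:00
```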

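double_precision : int, default 10

The number of decimal places to use when encoding floating point values. The maximum possible value is 15. Passing a double_precision greater than 15 raises a ValueError.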
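
A brief sketch of how double_precision changes the encoded floats, and of the ValueError mentioned above. The value 1/3 and the variable names are chosen only for illustration.

```python
import json

import pandas as pd

df = pd.DataFrame({"x": [1 / 3]})

# Default precision keeps 10 decimal places.
default_digits = json.loads(df.to_json(orient="values"))
# A smaller precision keeps fewer decimal places.
three_digits = json.loads(df.to_json(orient="values", double_precision=3))

print(default_digits)
print(three_digits)

# Values above 15 are rejected.
try:
    df.to_json(double_precision=16)
    raised = False
except ValueError:
    raised = True
print(raised)
```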

force_ascii : bool, default True

Force the encoded string to be ASCII.
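
The effect of force_ascii is easiest to see with a non-ASCII character. In this sketch (sample data invented for the example) the default output escapes the character, while force_ascii=False writes it as-is; both parse back to the same value.

```python
import json

import pandas as pd

df = pd.DataFrame({"city": ["Zürich"]})

escaped = df.to_json(orient="records")                  # force_ascii=True
raw = df.to_json(orient="records", force_ascii=False)

print(escaped.isascii())  # the default output contains only ASCII characters
print("Zürich" in raw)    # the unescaped character is kept verbatim

# Either form decodes to the same Python object.
same = json.loads(escaped) == json.loads(raw)
```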

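date_unit : str, default 'ms' (milliseconds)

The time unit to encode to; governs timestamp and ISO8601 precision. One of 's', 'ms', 'us', 'ns' for second, millisecond, microsecond, and nanosecond respectively.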
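
As a rough sketch of date_unit with epoch output, the same timestamp is encoded below in seconds and in milliseconds. The timestamp and names are illustrative only.

```python
import json

import pandas as pd

df = pd.DataFrame({"when": pd.to_datetime(["2024-01-01 12:00:00"])})


def encode(unit):
    """Return the epoch value of the single cell for the given date_unit."""
    text = df.to_json(orient="records", date_format="epoch", date_unit=unit)
    return json.loads(text)[0]["when"]


seconds = encode("s")
millis = encode("ms")

# The millisecond value is the second value scaled by 1000.
print(seconds, millis)
```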

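default_handler : callable, default None

Handler to call if the object cannot otherwise be converted to a suitable format for JSON. It should receive a single argument, which is the object to convert, and return a serialisable object.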
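
A minimal sketch of default_handler, assuming a column of complex numbers, which have no native JSON representation. Here the built-in str is used as the handler, so each unsupported value is written as its string form; any callable with the same one-argument contract could be substituted.

```python
import json

import pandas as pd

# Complex values cannot be encoded directly as JSON numbers.
df = pd.DataFrame({"z": [complex(1, 2)]})

# str receives each unsupported object and returns something serialisable.
text = df.to_json(default_handler=str)
parsed = json.loads(text)

print(parsed)
```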

lines : bool, default False

If 'orient' is 'records', write out line-delimited JSON format. A ValueError is raised for any other 'orient', since the other formats are not list-like.
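
The line-delimited format can be sketched as follows: each record becomes one JSON document on its own line. The sample frame is hypothetical, and the last part checks the ValueError described above for a non-'records' orient.

```python
import json

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

text = df.to_json(orient="records", lines=True)

# Every non-empty line is an independent JSON object.
rows = [json.loads(line) for line in text.splitlines() if line]
print(rows)

# lines=True is rejected for orients other than 'records'.
try:
    df.to_json(orient="split", lines=True)
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```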

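compression : str or dict, default 'infer'

For on-the-fly compression of the output data. If 'infer' and 'path_or_buf' is path-like, then detect compression from the following extensions: '.gz', '.bz2', '.zip', '.xz', '.zst', '.tar', '.tar.gz', '.tar.xz' or '.tar.bz2' (otherwise no compression). Set to None for no compression. Can also be a dict with key 'method' set to one of {'zip', 'gzip', 'bz2', 'zstd', 'xz', 'tar'}; the other key-value pairs are forwarded to zipfile.ZipFile, gzip.GzipFile, bz2.BZ2File, zstandard.ZstdCompressor, lzma.LZMAFile or tarfile.TarFile, respectively. As an example, the following could be passed for faster compression and to create a reproducible gzip archive: compression={'method': 'gzip', 'compresslevel': 1, 'mtime': 1}.

New in version 1.5.0: Added support for .tar files.

Changed in version 1.4.0: Zstandard support.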
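
The gzip dictionary quoted above can be tried out with a sketch like the one below. It writes to a temporary directory (the file name frame.json.gz is arbitrary) and reads the file back with the standard-library gzip module to confirm the content.

```python
import gzip
import json
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "frame.json.gz")  # hypothetical file name

    # Extra keys are forwarded to gzip.GzipFile.
    df.to_json(
        path,
        orient="records",
        compression={"method": "gzip", "compresslevel": 1, "mtime": 1},
    )

    # Decompress with the standard library and parse the JSON payload.
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        restored = json.load(fh)

print(restored)
```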

index : bool or None, default None

The index is only used when 'orient' is 'split', 'index', 'columns', or 'table'. Of these, 'index' and 'columns' do not support index=False.
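
A short sketch of the index flag with orient='split', plus the unsupported combination noted above. Row labels and names are made up for the example.

```python
import json

import pandas as pd

df = pd.DataFrame({"a": [1, 2]}, index=["r1", "r2"])

with_index = json.loads(df.to_json(orient="split"))
without_index = json.loads(df.to_json(orient="split", index=False))

print(sorted(with_index))     # includes the 'index' key
print(sorted(without_index))  # the 'index' key is omitted

# orient='index' cannot drop the index, because the index labels are the keys.
try:
    df.to_json(orient="index", index=False)
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```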

indent : int, optional

Length of whitespace used to indent each record.
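
For a quick look at indent, this sketch compares the compact default with an indented rendering of the same hypothetical frame. Only the whitespace differs; the parsed content is identical.

```python
import json

import pandas as pd

df = pd.DataFrame({"a": [1], "b": ["x"]})

compact = df.to_json(orient="records")
pretty = df.to_json(orient="records", indent=2)

print(compact)
print(pretty)

# Indentation adds line breaks but does not change the data.
same_content = json.loads(compact) == json.loads(pretty)
```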

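storage_options : dict, optional

Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to urllib.request.Request as header options. For other URLs (e.g. starting with "s3://" and "gcs://") the key-value pairs are forwarded to fsspec.open. See fsspec and urllib for more details; more examples of storage options are available in the pandas documentation.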

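mode : str, default 'w' (writing)

Specify the IO mode for output when supplying a path_or_buf. Accepted args are 'w' (writing) and 'a' (append) only. mode='a' is only supported when lines is True and orient is 'records'.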
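
Appending can be sketched as two successive writes to the same line-delimited file, under the constraints stated above (lines=True and orient='records'). The file name and frames are hypothetical, and a pandas version that provides the mode parameter is assumed.

```python
import json
import os
import tempfile

import pandas as pd

first = pd.DataFrame({"a": [1]})
second = pd.DataFrame({"a": [2]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "log.jsonl")  # hypothetical file name

    # The first call creates the file; the second appends to it.
    first.to_json(path, orient="records", lines=True)
    second.to_json(path, orient="records", lines=True, mode="a")

    with open(path, encoding="utf-8") as fh:
        rows = [json.loads(line) for line in fh if line.strip()]

print(rows)
```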

Returns:

None or str: If path_or_buf is None, returns the resulting JSON format as a string. Otherwise returns None.

See also

read_json: Convert a JSON string to pandas object.

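Notes

The behavior of indent=0 varies from the stdlib, which does not indent the output but does insert newlines. Currently, indent=0 and the default indent=None are equivalent in pandas, though this may change in a future release.

orient='table' contains a 'pandas_version' field under 'schema'. This stores the version of pandas used in the latest revision of the schema.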

Examples

>>> import pandas as pd
>>> from json import loads, dumps
>>> df = pd.DataFrame(
...     [["a", "b"], ["c", "d"]],
...     index=["row 1", "row 2"],
...     columns=["col 1", "col 2"],
... )

>>> result = df.to_json(orient="split")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)  
{
    "columns": [
        "col 1",
        "col 2"
    ],
    "index": [
        "row 1",
        "row 2"
    ],
    "data": [
        [
            "a",
            "b"
        ],
        [
            "c",
            "d"
        ]
    ]
}

Encoding/decoding a DataFrame using 'records' formatted JSON. Note that index labels are not preserved with this encoding.

>>> result = df.to_json(orient="records")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)  
[
    {
        "col 1": "a",
        "col 2": "b"
    },
    {
        "col 1": "c",
        "col 2": "d"
    }
]

Encoding/decoding a DataFrame using 'index' formatted JSON:

>>> result = df.to_json(orient="index")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)  
{
    "row 1": {
        "col 1": "a",
        "col 2": "b"
    },
    "row 2": {
        "col 1": "c",
        "col 2": "d"
    }
}

Encoding/decoding a DataFrame using 'columns' formatted JSON:

>>> result = df.to_json(orient="columns")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)  
{
    "col 1": {
        "row 1": "a",
        "row 2": "c"
    },
    "col 2": {
        "row 1": "b",
        "row 2": "d"
    }
}

Encoding/decoding a DataFrame using 'values' formatted JSON:

>>> result = df.to_json(orient="values")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)  
[
    [
        "a",
        "b"
    ],
    [
        "c",
        "d"
    ]
]

Encoding with Table Schema:

>>> result = df.to_json(orient="table")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)  
{
    "schema": {
        "fields": [
            {
                "name": "index",
                "type": "string"
            },
            {
                "name": "col 1",
                "type": "string"
            },
            {
                "name": "col 2",
                "type": "string"
            }
        ],
        "primaryKey": [
            "index"
        ],
        "pandas_version": "1.4.0"
    },
    "data": [
        {
            "index": "row 1",
            "col 1": "a",
            "col 2": "b"
        },
        {
            "index": "row 2",
            "col 1": "c",
            "col 2": "d"
        }
    ]
}