What's new in 1.3.0 (July 2, 2021)#

These are the changes in pandas 1.3.0. See the release notes for a full changelog including other versions of pandas.

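Warning

When reading new Excel 2007+ (.xlsx) files, the default argument engine=None to read_excel() will now result in using the openpyxl engine in all cases when the option io.excel.xlsx.reader is set to "auto". Previously, some cases would use the xlrd engine instead. See What's new 1.2.0 for background on this change.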

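Enhancements#

Custom HTTP(s) headers when reading csv or json files#

When reading from a remote URL that is not handled by fsspec (e.g. HTTP and HTTPS), the dictionary passed to storage_options will be used to create the headers included in the request. This can be used to control the User-Agent header or send other custom headers (GH 36688). For example: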

In [1]: headers = {"User-Agent": "pandas"}
In [2]: df = pd.read_csv(
   ...:     "https://download.bls.gov/pub/time.series/cu/cu.item",
   ...:     sep="\t",
   ...:     storage_options=headers
   ...: )

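Read and write XML documents#

We added I/O support to read and render shallow versions of XML documents with read_xml() and DataFrame.to_xml(). Using lxml as the parser, both XPath 1.0 and XSLT 1.0 are available. (GH 27554)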

In [1]: xml = """<?xml version='1.0' encoding='utf-8'?>
   ...: <data>
   ...:  <row>
   ...:     <shape>square</shape>
   ...:     <degrees>360</degrees>
   ...:     <sides>4.0</sides>
   ...:  </row>
   ...:  <row>
   ...:     <shape>circle</shape>
   ...:     <degrees>360</degrees>
   ...:     <sides/>
   ...:  </row>
   ...:  <row>
   ...:     <shape>triangle</shape>
   ...:     <degrees>180</degrees>
   ...:     <sides>3.0</sides>
   ...:  </row>
   ...:  </data>"""

In [2]: df = pd.read_xml(xml)
In [3]: df
Out[3]:
      shape  degrees  sides
0    square      360    4.0
1    circle      360    NaN
2  triangle      180    3.0

In [4]: df.to_xml()
Out[4]:
<?xml version='1.0' encoding='utf-8'?>
<data>
  <row>
    <index>0</index>
    <shape>square</shape>
    <degrees>360</degrees>
    <sides>4.0</sides>
  </row>
  <row>
    <index>1</index>
    <shape>circle</shape>
    <degrees>360</degrees>
    <sides/>
  </row>
  <row>
    <index>2</index>
    <shape>triangle</shape>
    <degrees>180</degrees>
    <sides>3.0</sides>
  </row>
</data>

For more, see Writing XML in the user guide on IO tools.

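Styler enhancements#

We provided some focused development on Styler. See also the Styler documentation, which has been revised and improved (GH 39720, GH 39317, GH 40493).

The method Styler.set_table_styles() can now accept more natural CSS language for arguments, such as 'color:red;' instead of [('color', 'red')] (GH 39563)

The methods Styler.highlight_null(), Styler.highlight_min(), and Styler.highlight_max() now allow custom CSS highlighting instead of the default background coloring (GH 40242)

Styler.apply() now accepts functions that return an ndarray when axis=None, making it consistent with the axis=0 and axis=1 behavior (GH 39359)

When incorrectly formatted CSS is given via Styler.apply() or Styler.applymap(), an error is now raised upon rendering (GH 39660)

Styler.format() now accepts the keyword argument escape for optional HTML and LaTeX escaping (GH 40388, GH 41619)

Styler.background_gradient() has gained the argument gmap to supply a specific gradient map for shading (GH 22727)

Styler.clear() now clears Styler.hidden_index and Styler.hidden_columns as well (GH 40484)

Added the method Styler.highlight_between() (GH 39821)

Added the method Styler.highlight_quantile() (GH 40926)

Added the method Styler.text_gradient() (GH 41098)

Added the method Styler.set_tooltips() to allow hover tooltips; this can be used to enhance interactive displays (GH 21266, GH 40284)

Added the parameter precision to the method Styler.format() to control the display of floating point numbers (GH 40134)

Styler rendered HTML output now follows the w3 HTML Style Guide (GH 39626)

Many features of the Styler class are now either partially or fully usable on a DataFrame with non-unique indexes or columns (GH 41143)

One has greater control of the display through separate sparsification of the index or columns using the new styler options, which are also usable via option_context() (GH 41142)

Added the option styler.render.max_elements to avoid browser overload when styling large DataFrames (GH 40712)

Added the method Styler.to_latex() (GH 21673, GH 42320), which also allows some limited CSS conversion (GH 40731)

Added the method Styler.to_html() (GH 13379)

Added the method Styler.set_sticky() to make index and column headers permanently visible in scrolling HTML frames (GH 29072)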

DataFrame constructor honors `copy=False` with dict#

When passing a dictionary to DataFrame with copy=False, a copy will no longer be made (GH 32960).

In [1]: arr = np.array([1, 2, 3])

In [2]: df = pd.DataFrame({"A": arr, "B": arr.copy()}, copy=False)

In [3]: df
Out[3]: 
   A  B
0  1  1
1  2  2
2  3  3

df["A"] remains a view on arr:

In [4]: arr[0] = 0

In [5]: assert df.iloc[0, 0] == 0

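The default behavior when not passing copy will remain unchanged, i.e. a copy will be made.

PyArrow backed string data type#

We've enhanced the StringDtype, an extension type dedicated to string data. (GH 39908)

It is now possible to specify a storage keyword option to StringDtype. Use pandas options or specify the dtype using dtype='string[pyarrow]' to allow the StringArray to be backed by a PyArrow array instead of a NumPy array of Python objects.

The PyArrow backed StringArray requires pyarrow 1.0.0 or greater to be installed.

Warning

string[pyarrow] is currently considered experimental. The implementation and parts of the API may change without warning.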

In [6]: pd.Series(['abc', None, 'def'], dtype=pd.StringDtype(storage="pyarrow"))
Out[6]: 
0     abc
1    <NA>
2     def
dtype: string

You can use the alias "string[pyarrow]" as well.

In [7]: s = pd.Series(['abc', None, 'def'], dtype="string[pyarrow]")

In [8]: s
Out[8]: 
0     abc
1    <NA>
2     def
dtype: string

You can also create a PyArrow backed string array using pandas options.

In [9]: with pd.option_context("string_storage", "pyarrow"):
   ...:     s = pd.Series(['abc', None, 'def'], dtype="string")
   ...: 

In [10]: s
Out[10]: 
0     abc
1    <NA>
2     def
dtype: string

The usual string accessor methods work. Where appropriate, the return type of the Series or columns of a DataFrame will also have string dtype.

In [11]: s.str.upper()
Out[11]: 
0     ABC
1    <NA>
2     DEF
dtype: string

In [12]: s.str.split('b', expand=True).dtypes
Out[12]: 
0    string[pyarrow]
1    string[pyarrow]
dtype: object

String accessor methods returning integers will return a value with Int64Dtype:

In [13]: s.str.count("a")
Out[13]: 
0       1
1    <NA>
2       0
dtype: Int64

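Centered datetime-like rolling windows#

When performing rolling calculations on DataFrame and Series objects with a datetime-like index, a centered datetime-like window can now be used (GH 38780). For example: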

In [14]: df = pd.DataFrame(
   ....:     {"A": [0, 1, 2, 3, 4]}, index=pd.date_range("2020", periods=5, freq="1D")
   ....: )
   ....: 

In [15]: df
Out[15]: 
            A
2020-01-01  0
2020-01-02  1
2020-01-03  2
2020-01-04  3
2020-01-05  4

In [16]: df.rolling("2D", center=True).mean()
Out[16]: 
              A
2020-01-01  0.5
2020-01-02  1.5
2020-01-03  2.5
2020-01-04  3.5
2020-01-05  4.0

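Other enhancements#

DataFrame.rolling(), Series.rolling(), DataFrame.expanding(), and Series.expanding() now support a method argument with a 'table' option that performs the windowing operation over an entire DataFrame. See Window Overview for performance and functional benefits (GH 15095, GH 38995)
ExponentialMovingWindow now supports an online method that can perform mean calculations in an online fashion. See Window Overview (GH 41673)
Added MultiIndex.dtypes() (GH 37062)
Added end and end_day options for the origin argument in DataFrame.resample() (GH 37804)
Improved error message when usecols and names do not match for read_csv() and engine="c" (GH 29042)
Improved consistency of error messages when passing an invalid win_type argument in Window methods (GH 15969)
read_sql_query() now accepts a dtype argument to cast the columnar data from the SQL database based on user input (GH 10285)
read_csv() now raises ParserWarning if the length of the header or given names does not match the length of the data when usecols is not specified (GH 21768)
Improved integer type mapping from pandas to SQLAlchemy when using DataFrame.to_sql() (GH 35076)
to_numeric() now supports downcasting of nullable ExtensionDtype objects (GH 33013)
Added support for dict-like names in MultiIndex.set_names and MultiIndex.rename (GH 20421)
read_excel() can now auto-detect .xlsb files and older .xls files (GH 35416, GH 41225)
ExcelWriter now accepts an if_sheet_exists parameter to control the behavior of append mode when writing to existing sheets (GH 40230)
Rolling.sum(), Expanding.sum(), Rolling.mean(), Expanding.mean(), ExponentialMovingWindow.mean(), Rolling.median(), Expanding.median(), Rolling.max(), Expanding.max(), Rolling.min(), and Expanding.min() now support Numba execution with the engine keyword (GH 38895, GH 41267)
DataFrame.apply() can now accept NumPy unary operators as strings, e.g. df.apply("sqrt"), which was already the case for Series.apply() (GH 39116)
DataFrame.apply() can now accept non-callable DataFrame properties as strings, e.g. df.apply("size"), which was already the case for Series.apply() (GH 39116)
DataFrame.applymap() can now accept kwargs to pass on to the user-provided func (GH 39987)
Passing a DataFrame indexer to iloc is now disallowed for Series.__getitem__() and DataFrame.__getitem__() (GH 39004)
Series.apply() can now accept list-like or dictionary-like arguments that aren't lists or dictionaries, e.g. ser.apply(np.array(["sum", "mean"])), which was already the case for DataFrame.apply() (GH 39140)
DataFrame.plot.scatter() can now accept a categorical column for the argument c (GH 12380, GH 31357)
Series.loc() now raises a helpful error message when the Series has a MultiIndex and the indexer has too many dimensions (GH 35349)
read_stata() now supports reading data from compressed files (GH 26599)
Added support for parsing ISO 8601-like timestamps with negative signs to Timedelta (GH 37172)
Added support for unary operators in FloatingArray (GH 38749)
RangeIndex can now be constructed by passing a range object directly, e.g. pd.RangeIndex(range(3)) (GH 12067)
Series.round() and DataFrame.round() now work with nullable integer and floating dtypes (GH 38844)
read_csv() and read_json() expose the argument encoding_errors to control how encoding errors are handled (GH 39450)
DataFrameGroupBy.any(), SeriesGroupBy.any(), DataFrameGroupBy.all(), and SeriesGroupBy.all() use Kleene logic with nullable data types (GH 37506)
DataFrameGroupBy.any(), SeriesGroupBy.any(), DataFrameGroupBy.all(), and SeriesGroupBy.all() return a BooleanDtype for columns with nullable data types (GH 33449)
DataFrameGroupBy.any(), SeriesGroupBy.any(), DataFrameGroupBy.all(), and SeriesGroupBy.all() now raise with object data containing pd.NA even when skipna=True (GH 37501)
DataFrameGroupBy.rank() and SeriesGroupBy.rank() now support object-dtype data (GH 38278)
Constructing a DataFrame or Series with the data argument being a Python iterable that is not a NumPy ndarray consisting of NumPy scalars will now result in a dtype with a precision the maximum of the NumPy scalars; this was already the case when data is a NumPy ndarray (GH 40908)
Added keyword sort to pivot_table() to allow non-sorting of the result (GH 39143)
Added keyword dropna to DataFrame.value_counts() to allow counting rows that include NA values (GH 41325)
Series.replace() will now cast results to PeriodDtype where possible instead of object dtype (GH 41526)
Improved error message in the corr and cov methods on Rolling, Expanding, and ExponentialMovingWindow when other is not a DataFrame or Series (GH 41741)
Series.between() can now accept left or right as arguments to inclusive to include only the left or right boundary (GH 40245)
DataFrame.explode() now supports exploding multiple columns. Its column argument now also accepts a list of str or tuples for exploding on multiple columns at the same time (GH 39240)
DataFrame.sample() now accepts the ignore_index argument to reset the index after sampling, similar to DataFrame.drop_duplicates() and DataFrame.sort_values() (GH 38581)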
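
A few of the enhancements listed above can be tried together in a short session. The following is only an illustrative sketch: the variable names and the toy data are assumptions made for this example, not part of the release notes.

```python
import pandas as pd

# RangeIndex built directly from a range object (GH 12067).
idx = pd.RangeIndex(range(3))

# inclusive="left" keeps the left boundary and drops the right one (GH 40245).
ser = pd.Series([1, 2, 3, 4])
left_closed = ser.between(1, 3, inclusive="left")

# Exploding two list-valued columns in one call (GH 39240).
# The lists in each row must have matching lengths across the columns.
df = pd.DataFrame({"A": [[1, 2], [3]], "B": [["x", "y"], ["z"]]})
exploded = df.explode(["A", "B"])

# ignore_index=True relabels the sampled rows 0..n-1 (GH 38581).
sampled = ser.to_frame("v").sample(n=2, random_state=0, ignore_index=True)
```

Which rows sample() picks depends on the random state, so only the reset index is predictable here.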

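Notable bug fixes#

These are bug fixes that might have notable behavior changes.

`Categorical.unique` now always maintains same dtype as original#

Previously, when calling Categorical.unique() with categorical data, unused categories in the new array would be removed, making the dtype of the new array different from the original (GH 18291)

As an example of this, given: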

In [17]: dtype = pd.CategoricalDtype(['bad', 'neutral', 'good'], ordered=True)

In [18]: cat = pd.Categorical(['good', 'good', 'bad', 'bad'], dtype=dtype)

In [19]: original = pd.Series(cat)

In [20]: unique = original.unique()

Previous behavior:

In [1]: unique
['good', 'bad']
Categories (2, object): ['bad' < 'good']
In [2]: original.dtype == unique.dtype
False

New behavior:

In [21]: unique
Out[21]: 
['good', 'bad']
Categories (3, object): ['bad' < 'neutral' < 'good']

In [22]: original.dtype == unique.dtype
Out[22]: True

Preserve dtypes in `DataFrame.combine_first()`#

DataFrame.combine_first() will now preserve dtypes (GH 7509)

In [23]: df1 = pd.DataFrame({"A": [1, 2, 3], "B": [1, 2, 3]}, index=[0, 1, 2])

In [24]: df1
Out[24]: 
   A  B
0  1  1
1  2  2
2  3  3

In [25]: df2 = pd.DataFrame({"B": [4, 5, 6], "C": [1, 2, 3]}, index=[2, 3, 4])

In [26]: df2
Out[26]: 
   B  C
2  4  1
3  5  2
4  6  3

In [27]: combined = df1.combine_first(df2)

Previous behavior:

In [1]: combined.dtypes
Out[2]:
A    float64
B    float64
C    float64
dtype: object

New behavior:

In [28]: combined.dtypes
Out[28]: 
A    float64
B      int64
C    float64
dtype: object

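Groupby methods agg and transform no longer change return dtype for callables#

Previously the methods DataFrameGroupBy.aggregate(), SeriesGroupBy.aggregate(), DataFrameGroupBy.transform(), and SeriesGroupBy.transform() might cast the result dtype when the argument func is callable, possibly leading to undesirable results (GH 21240). The cast would occur if the result is numeric and casting back to the input dtype does not change any values as measured by np.allclose. Now no such casting occurs.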

In [29]: df = pd.DataFrame({'key': [1, 1], 'a': [True, False], 'b': [True, True]})

In [30]: df
Out[30]: 
   key      a     b
0    1   True  True
1    1  False  True

Previous behavior:

In [5]: df.groupby('key').agg(lambda x: x.sum())
Out[5]:
        a  b
key
1    True  2

New behavior:

In [31]: df.groupby('key').agg(lambda x: x.sum())
Out[31]: 
     a  b
key      
1    1  2

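`float` result for `DataFrameGroupBy.mean()`, `DataFrameGroupBy.median()`, `DataFrameGroupBy.var()`, `SeriesGroupBy.mean()`, `SeriesGroupBy.median()`, and `SeriesGroupBy.var()`#

Previously, these methods could result in different dtypes depending on the input values. Now, these methods will always return a float dtype. (GH 41137)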

In [32]: df = pd.DataFrame({'a': [True], 'b': [1], 'c': [1.0]})

Previous behavior:

In [5]: df.groupby(df.index).mean()
Out[5]:
        a  b    c
0    True  1  1.0

New behavior:

In [33]: df.groupby(df.index).mean()
Out[33]: 
     a    b    c
0  1.0  1.0  1.0

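Try operating inplace when setting values with `loc` and `iloc`#

When setting an entire column using loc or iloc, pandas will try to insert the values into the existing data rather than create an entirely new array.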

In [34]: df = pd.DataFrame(range(3), columns=["A"], dtype="float64")

In [35]: values = df.values

In [36]: new = np.array([5, 6, 7], dtype="int64")

In [37]: df.loc[[0, 1, 2], "A"] = new

In both the new and old behavior, the data in values is overwritten, but in the old behavior the dtype of df["A"] changed to int64.

Previous behavior:

In [1]: df.dtypes
Out[1]:
A    int64
dtype: object
In [2]: np.shares_memory(df["A"].values, new)
Out[2]: False
In [3]: np.shares_memory(df["A"].values, values)
Out[3]: False

In pandas 1.3.0, df continues to share data with values.

New behavior:

In [38]: df.dtypes
Out[38]: 
A    float64
dtype: object

In [39]: np.shares_memory(df["A"], new)
Out[39]: False

In [40]: np.shares_memory(df["A"], values)
Out[40]: True

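Never operate inplace when setting `frame[keys] = values`#

When setting multiple columns using frame[keys] = values, new arrays will replace pre-existing arrays for these keys, which will not be overwritten (GH 39510). As a result, the columns will retain the dtype(s) of values, never casting to the dtypes of the existing arrays.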

In [41]: df = pd.DataFrame(range(3), columns=["A"], dtype="float64")

In [42]: df[["A"]] = 5

In the old behavior, 5 was cast to float64 and inserted into the existing array backing df:

Previous behavior:

In [1]: df.dtypes
Out[1]:
A    float64

In the new behavior, we get a new array, and retain an integer-dtyped 5:

New behavior:

In [43]: df.dtypes
Out[43]: 
A    int64
dtype: object

Consistent casting with setting into Boolean Series#

Setting non-boolean values into a Series with dtype=bool now consistently casts to dtype=object (GH 38709)

In [1]: orig = pd.Series([True, False])

In [2]: ser = orig.copy()

In [3]: ser.iloc[1] = np.nan

In [4]: ser2 = orig.copy()

In [5]: ser2.iloc[1] = 2.0

Previous behavior:

In [1]: ser
Out [1]:
0    1.0
1    NaN
dtype: float64

In [2]:ser2
Out [2]:
0    True
1     2.0
dtype: object

New behavior:

In [1]: ser
Out [1]:
0    True
1     NaN
dtype: object

In [2]:ser2
Out [2]:
0    True
1     2.0
dtype: object

DataFrameGroupBy.rolling and SeriesGroupBy.rolling no longer return grouped-by column in values#

The group-by column will now be dropped from the result of a groupby.rolling operation (GH 32262)

In [44]: df = pd.DataFrame({"A": [1, 1, 2, 3], "B": [0, 1, 2, 3]})

In [45]: df
Out[45]: 
   A  B
0  1  0
1  1  1
2  2  2
3  3  3

Previous behavior:

In [1]: df.groupby("A").rolling(2).sum()
Out[1]:
       A    B
A
1 0  NaN  NaN
1    2.0  1.0
2 2  NaN  NaN
3 3  NaN  NaN

New behavior:

In [46]: df.groupby("A").rolling(2).sum()
Out[46]: 
       B
A       
1 0  NaN
  1  1.0
2 2  NaN
3 3  NaN

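Removed artificial truncation in rolling variance and standard deviation#

Rolling.std() and Rolling.var() will no longer artificially truncate results that are less than ~1e-8 and ~1e-15 respectively to zero (GH 37051, GH 40448, GH 39872).

However, floating point artifacts may now exist in the results when rolling over larger values.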

In [47]: s = pd.Series([7, 5, 5, 5])

In [48]: s.rolling(3).var()
Out[48]: 
0         NaN
1         NaN
2    1.333333
3    0.000000
dtype: float64

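DataFrameGroupBy.rolling and SeriesGroupBy.rolling with MultiIndex no longer drop levels in the result#

DataFrameGroupBy.rolling() and SeriesGroupBy.rolling() will no longer drop levels of a DataFrame with a MultiIndex in the result. This can lead to a perceived duplication of levels in the resulting MultiIndex, but this change restores the behavior that was present in version 1.1.3 (GH 38787, GH 38523).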

In [49]: index = pd.MultiIndex.from_tuples([('idx1', 'idx2')], names=['label1', 'label2'])

In [50]: df = pd.DataFrame({'a': [1], 'b': [2]}, index=index)

In [51]: df
Out[51]: 
               a  b
label1 label2      
idx1   idx2    1  2

Previous behavior:

In [1]: df.groupby('label1').rolling(1).sum()
Out[1]:
          a    b
label1
idx1    1.0  2.0

New behavior:

In [52]: df.groupby('label1').rolling(1).sum()
Out[52]: 
                        a    b
label1 label1 label2          
idx1   idx1   idx2    1.0  2.0

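Backwards incompatible API changes#

Increased minimum versions for dependencies#

Some minimum supported versions of dependencies were updated. If installed, we now require:

Package	Minimum Version	Required	Changed
numpy	1.17.3	X	X
pytz	2017.3	X
python-dateutil	2.7.3	X
bottleneck	1.2.1
numexpr	2.7.0		X
pytest (dev)	6.0		X
mypy (dev)	0.812		X
setuptools	38.6.0		X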

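For optional libraries the general recommendation is to use the latest version. The following table lists the lowest version per library that is currently being tested throughout the development of pandas. Optional libraries below the lowest tested version may still work, but are not considered supported.

Package	Minimum Version	Changed
beautifulsoup4	4.6.0
fastparquet	0.4.0	X
fsspec	0.7.4
gcsfs	0.6.0
lxml	4.3.0
matplotlib	2.2.3
numba	0.46.0
openpyxl	3.0.0	X
pyarrow	0.17.0	X
pymysql	0.8.1	X
pytables	3.5.1
s3fs	0.4.0
scipy	1.2.0
sqlalchemy	1.3.0	X
tabulate	0.8.7	X
xarray	0.12.0
xlrd	1.2.0
xlsxwriter	1.0.2
xlwt	1.3.0
pandas-gbq	0.12.0

See Dependencies and Optional dependencies for more.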
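
To compare an environment against the minimum versions in the tables above, a check along the following lines may help. This is a rough sketch rather than anything pandas ships: the helpers parse_version and meets_minimum are hypothetical names introduced here, and the parsing looks only at the leading numeric components of a version string.

```python
import re
from importlib import metadata


def parse_version(text):
    # Keep the leading numeric components only, e.g. "1.17.3" -> (1, 17, 3).
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])


def meets_minimum(package, minimum):
    """Return True or False, or None when the package is not installed."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return parse_version(installed) >= parse_version(minimum)


# Minimum versions copied from the required-dependency table above.
minimums = {"numpy": "1.17.3", "pytz": "2017.3", "python-dateutil": "2.7.3"}
report = {name: meets_minimum(name, minimum) for name, minimum in minimums.items()}
```

For pre-release or otherwise unusual version strings, a dedicated version parser would be more reliable than this regular expression.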

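Other API changes#

Partially initialized CategoricalDtype objects (i.e. those with categories=None) will no longer compare as equal to fully initialized dtype objects (GH 38516)
Accessing _constructor_expanddim on a DataFrame and _constructor_sliced on a Series now raise an AttributeError. Previously a NotImplementedError was raised (GH 38782)
Added new engine and **engine_kwargs parameters to DataFrame.to_sql() to support other future "SQL engines". Currently we still only use SQLAlchemy under the hood, but more engines are planned to be supported, such as turbodbc (GH 36893)
Removed redundant freq from PeriodIndex string representation (GH 41653)
ExtensionDtype.construct_array_type() is now a required method instead of an optional one for ExtensionDtype subclasses (GH 24860)
Calling hash on non-hashable pandas objects will now raise TypeError with the built-in error message (e.g. unhashable type: 'Series'). Previously it would raise a custom message such as 'Series' objects are mutable, thus they cannot be hashed. Furthermore, isinstance(<Series>, collections.abc.Hashable) will now return False (GH 40013)
Styler.from_custom_template() now has two new arguments for template names, and the old name argument was removed, because template inheritance was introduced for better parsing (GH 42053). Subclassing modifications to Styler attributes are also needed.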
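
The hashing change described in the list above can be observed with a short check. This is a minimal sketch; the variable names are illustrative, and the exact wording of the error message comes from Python itself, so it is captured rather than hard-coded.

```python
import collections.abc

import pandas as pd

ser = pd.Series([1, 2, 3])

# hash() on a Series raises the built-in TypeError.
try:
    hash(ser)
except TypeError as err:
    message = str(err)
else:
    message = None

# A Series is no longer reported as an instance of Hashable.
is_hashable = isinstance(ser, collections.abc.Hashable)
```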

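Build#

Documentation in .pptx and .pdf formats is no longer included in wheels or source distributions. (GH 30741)

Deprecations#

Deprecated dropping nuisance columns in DataFrame reductions and DataFrameGroupBy operations#

When calling a reduction (e.g. .min, .max, .sum) on a DataFrame with numeric_only=None (the default), columns where the reduction raises a TypeError are silently ignored and dropped from the result.

This behavior is deprecated. In a future version, the TypeError will be raised, and users will need to select only valid columns before calling the function.

For example: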

In [53]: df = pd.DataFrame({"A": [1, 2, 3, 4], "B": pd.date_range("2016-01-01", periods=4)})

In [54]: df
Out[54]: 
   A          B
0  1 2016-01-01
1  2 2016-01-02
2  3 2016-01-03
3  4 2016-01-04

Old behavior:

In [3]: df.prod()
Out[3]:
A    24
dtype: int64

Future behavior:

In [4]: df.prod()
...
TypeError: 'DatetimeArray' does not implement reduction 'prod'

In [5]: df[["A"]].prod()
Out[5]:
A    24
dtype: int64

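Similarly, when applying a function to DataFrameGroupBy, columns on which the function raises TypeError are currently silently ignored and dropped from the result.

This behavior is deprecated. In a future version, the TypeError will be raised, and users will need to select only valid columns before calling the function.

For example: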

In [55]: df = pd.DataFrame({"A": [1, 2, 3, 4], "B": pd.date_range("2016-01-01", periods=4)})

In [56]: gb = df.groupby([1, 1, 2, 2])

Old behavior:

In [4]: gb.prod(numeric_only=False)
Out[4]:
A
1   2
2  12

Future behavior:

In [5]: gb.prod(numeric_only=False)
...
TypeError: datetime64 type does not support prod operations

In [6]: gb[["A"]].prod(numeric_only=False)
Out[6]:
    A
1   2
2  12
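
One way to prepare for this change is to select the valid columns explicitly before reducing. The sketch below reuses the frame from the example above. Using select_dtypes() for the selection is one option among several and is an assumption of this example, not a recommendation made by the release notes.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": pd.date_range("2016-01-01", periods=4)})

# Keep only the numeric columns, then reduce; the datetime column "B"
# never reaches prod(), so nothing has to be silently dropped.
numeric = df.select_dtypes(include="number")
total = numeric.prod()

# The same idea for a grouped reduction: select the columns first.
grouped = df.groupby([1, 1, 2, 2])[["A"]].prod()
```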

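Other Deprecations#

Deprecated allowing scalars to be passed to the Categorical constructor (GH 38433)
Deprecated constructing CategoricalIndex without passing list-like data (GH 38944)
Deprecated allowing subclass-specific keyword arguments in the Index constructor, use the specific subclass directly instead (GH 14093, GH 21311, GH 22315, GH 26974)
Deprecated the astype() method of datetimelike (timedelta64[ns], datetime64[ns], Datetime64TZDtype, PeriodDtype) to convert to integer dtypes, use values.view(...) instead (GH 38544). This deprecation was later reverted in pandas 1.4.0.
Deprecated MultiIndex.is_lexsorted() and MultiIndex.lexsort_depth(), use MultiIndex.is_monotonic_increasing() instead (GH 32259)
Deprecated keyword try_cast in Series.where(), Series.mask(), DataFrame.where(), DataFrame.mask(); cast results manually if desired (GH 38836)
Deprecated comparison of Timestamp objects with datetime.date objects. Instead of e.g. ts <= mydate use ts <= pd.Timestamp(mydate) or ts.date() <= mydate (GH 36131)
Deprecated Rolling.win_type returning "freq" (GH 38963)
Deprecated Rolling.is_datetimelike (GH 38963)
Deprecated DataFrame indexer for Series.__setitem__() and DataFrame.__setitem__() (GH 39004)
Deprecated ExponentialMovingWindow.vol() (GH 39220)
Using .astype to convert between datetime64[ns] dtype and DatetimeTZDtype is deprecated and will raise in a future version, use obj.tz_localize or obj.dt.tz_localize instead (GH 38622)
Deprecated casting datetime.date objects to datetime64 when used as fill_value in DataFrame.unstack(), DataFrame.shift(), Series.shift(), and DataFrame.reindex(), pass pd.Timestamp(dateobj) instead (GH 39767)
Deprecated Styler.set_na_rep() and Styler.set_precision() in favor of Styler.format() with na_rep and precision as existing and new input arguments respectively (GH 40134, GH 40425)
Deprecated Styler.where() in favor of using an alternative formulation with Styler.applymap() (GH 40821)
Deprecated allowing partial failure in Series.transform() and DataFrame.transform() when func is list-like or dict-like and raises anything but TypeError; func raising anything but a TypeError will raise in a future version (GH 40211)
Deprecated arguments error_bad_lines and warn_bad_lines in read_csv() and read_table() in favor of argument on_bad_lines (GH 15122)
Deprecated support for np.ma.mrecords.MaskedRecords in the DataFrame constructor, pass {name: data[name] for name in data.dtype.names} instead (GH 40363)
Deprecated using merge(), DataFrame.merge(), and DataFrame.join() on a different number of levels (GH 34862)
Deprecated the use of **kwargs in ExcelWriter; use the keyword argument engine_kwargs instead (GH 40430)
Deprecated the level keyword for DataFrame and Series aggregations; use groupby instead (GH 39983)
Deprecated the inplace parameter of Categorical.remove_categories(), Categorical.add_categories(), Categorical.reorder_categories(), Categorical.rename_categories(), Categorical.set_categories(); it will be removed in a future version (GH 37643)
Deprecated merge() producing duplicated columns through the suffixes keyword and already existing columns (GH 22818)
Deprecated setting Categorical._codes, create a new Categorical with the desired codes instead (GH 40606)
Deprecated the convert_float optional argument in read_excel() and ExcelFile.parse() (GH 41127)
Deprecated behavior of DatetimeIndex.union() with mixed timezones; in a future version both will be cast to UTC instead of object dtype (GH 39328)
Deprecated using usecols with out of bounds indices for read_csv() with engine="c" (GH 25623)
Deprecated special treatment of lists with first element a Categorical in the DataFrame constructor; pass as pd.DataFrame({col: categorical, ...}) instead (GH 38845)
Deprecated behavior of the DataFrame constructor when a dtype is passed and the data cannot be cast to that dtype. In a future version, this will raise instead of being silently ignored (GH 24435)
Deprecated the Timestamp.freq attribute. For the properties that use it (is_month_start, is_month_end, is_quarter_start, is_quarter_end, is_year_start, is_year_end), when you have a freq, use e.g. freq.is_month_start(ts) (GH 15146)
Deprecated construction of Series or DataFrame with DatetimeTZDtype data and datetime64[ns] dtype. Use Series(data).dt.tz_localize(None) instead (GH 41555, GH 33401)
Deprecated behavior of Series construction with large-integer values and small-integer dtype silently overflowing; use Series(data).astype(dtype) instead (GH 41734)
Deprecated behavior of DataFrame construction with floating data and integer dtype casting even when lossy; in a future version this will remain floating, matching Series behavior (GH 41770)
Deprecated inference of timedelta64[ns], datetime64[ns], or DatetimeTZDtype dtypes in Series construction when data containing strings is passed and no dtype is passed (GH 33558)
In a future version, constructing Series or DataFrame with datetime64[ns] data and DatetimeTZDtype will treat the data as wall-times instead of as UTC times (matching DatetimeIndex behavior). To treat the data as UTC times, use pd.Series(data).dt.tz_localize("UTC").dt.tz_convert(dtype.tz) or pd.Series(data.view("int64"), dtype=dtype) (GH 33401)
Deprecated passing lists as key to DataFrame.xs() and Series.xs() (GH 41760)
Deprecated boolean arguments of inclusive in Series.between() to have {"left", "right", "neither", "both"} as standard argument values (GH 40628)
Deprecated passing arguments as positional for all of the following, with exceptions noted (GH 41485):
- concat() (other than objs)
- read_csv() (other than filepath_or_buffer)
- read_table() (other than filepath_or_buffer)
- DataFrame.clip() and Series.clip() (other than upper and lower)
- DataFrame.drop_duplicates() (except for subset), Series.drop_duplicates(), Index.drop_duplicates() and MultiIndex.drop_duplicates()
- DataFrame.drop() (other than labels) and Series.drop()
- DataFrame.dropna() and Series.dropna()
- DataFrame.ffill(), Series.ffill(), DataFrame.bfill(), and Series.bfill()
- DataFrame.fillna() and Series.fillna() (apart from value)
- DataFrame.interpolate() and Series.interpolate() (other than method)
- DataFrame.mask() and Series.mask() (other than cond and other)
- DataFrame.reset_index() (other than level) and Series.reset_index()
- DataFrame.set_axis() and Series.set_axis() (other than labels)
- DataFrame.set_index() (other than keys)
- DataFrame.sort_index() and Series.sort_index()
- DataFrame.sort_values() (other than by) and Series.sort_values()
- DataFrame.where() and Series.where() (other than cond and other)
- Index.set_names() and MultiIndex.set_names() (except for names)
- MultiIndex.codes() (except for codes)
- MultiIndex.set_levels() (except for levels)
- Resampler.interpolate() (other than method)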
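
Several of the deprecations listed above have direct replacements. The sketch below shows a few of them under simple assumptions: the data and variable names are made up for illustration, and the in-memory CSV stands in for a real file.

```python
import datetime as dt
import io

import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2], "b": [3.0, None, 5.0]})

# Pass everything beyond the documented positional arguments by keyword.
deduped = df.drop_duplicates(subset="a", keep="first")
dropped = df.drop("b", axis=1)
ordered = df.sort_values("b", ascending=False)

# Compare like with like instead of Timestamp against datetime.date.
ts = pd.Timestamp("2021-07-02 12:00")
mydate = dt.date(2021, 7, 2)
as_timestamps = ts <= pd.Timestamp(mydate)  # noon vs. midnight of the same day
as_dates = ts.date() <= mydate              # same calendar day

# String values for `inclusive` instead of booleans.
both_ends = pd.Series([1, 2, 3]).between(1, 3, inclusive="both")

# on_bad_lines replaces error_bad_lines / warn_bad_lines in read_csv().
raw = "x,y\n1,2\n3,4,5\n6,7\n"
parsed = pd.read_csv(io.StringIO(raw), on_bad_lines="skip")
```

Here the middle data row has three fields where two are expected, so on_bad_lines="skip" leaves it out of the parsed frame.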

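Performance improvements#

Performance improvement in IntervalIndex.isin() (GH 38353)
Performance improvement in Series.mean() for nullable data types (GH 34814)
Performance improvement in Series.isin() for nullable data types (GH 38340)
Performance improvement in DataFrame.fillna() with method="pad" or method="backfill" for nullable floating and nullable integer dtypes (GH 39953)
Performance improvement in DataFrame.corr() for method=kendall (GH 28329)
Performance improvement in DataFrame.corr() for method=spearman (GH 40956, GH 41885)
Performance improvement in Rolling.corr() and Rolling.cov() (GH 39388)
Performance improvement in RollingGroupby.corr(), ExpandingGroupby.corr(), and ExpandingGroupby.cov() (GH 39591)
Performance improvement in unique() for object data type (GH 37615)
Performance improvement in json_normalize() for basic cases (including separators) (GH 40035, GH 15621)
Performance improvement in ExpandingGroupby aggregation methods (GH 39664)
Performance improvement in Styler where render times are more than 50% reduced and now match DataFrame.to_html() (GH 39972, GH 39952, GH 40425)
The method Styler.set_td_classes() is now as performant as Styler.apply() and Styler.applymap(), and even more so in some cases (GH 40453)
Performance improvement in ExponentialMovingWindow.mean() with times (GH 39784)
Performance improvement in DataFrameGroupBy.apply() and SeriesGroupBy.apply() when requiring the Python fallback implementation (GH 40176)
Performance improvement in the conversion of a PyArrow Boolean array to a pandas nullable Boolean array (GH 41051)
Performance improvement for concatenation of data with type CategoricalDtype (GH 40193)
Performance improvement in DataFrameGroupBy.cummin(), SeriesGroupBy.cummin(), DataFrameGroupBy.cummax(), and SeriesGroupBy.cummax() with nullable data types (GH 37493)
Performance improvement in Series.nunique() with nan values (GH 40865)
Performance improvement in DataFrame.transpose() and Series.unstack() with DatetimeTZDtype (GH 40149)
Performance improvement in Series.plot() and DataFrame.plot() with entry point lazy loading (GH 41492)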

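Bug fixes#

Categorical#

Bug in CategoricalIndex incorrectly failing to raise TypeError when scalar data is passed (GH 38614)
Bug in CategoricalIndex.reindex failing when the Index passed was not categorical but its values were all labels in the category (GH 28690)
Bug where constructing a Categorical from an object-dtype array of date objects did not round-trip correctly with astype (GH 38552)
Bug in constructing a DataFrame from an ndarray and a CategoricalDtype (GH 38857)
Bug in setting categorical values into an object-dtype column in a DataFrame (GH 39136)
Bug in DataFrame.reindex() raising an IndexError when the new index contained duplicates and the old index was a CategoricalIndex (GH 38906)
Bug in Categorical.fillna() with a tuple-like category raising NotImplementedError instead of ValueError when filling with a non-category tuple (GH 41914)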

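Datetimelike#

Bug in DataFrame and Series constructors sometimes dropping nanoseconds from Timestamp (resp. Timedelta) data, with dtype=datetime64[ns] (resp. timedelta64[ns]) (GH 38032)
Bug in DataFrame.first() and Series.first() with an offset of one month returning an incorrect result when the first day is the last day of a month (GH 29623)
Bug in constructing a DataFrame or Series with mismatched datetime64 data and timedelta64 dtype, or vice-versa, failing to raise a TypeError (GH 38575, GH 38764, GH 38792)
Bug in constructing a Series or DataFrame with a datetime object out of bounds for datetime64[ns] dtype or a timedelta object out of bounds for timedelta64[ns] dtype (GH 38792, GH 38965)
Bug in DatetimeIndex.intersection(), DatetimeIndex.symmetric_difference(), PeriodIndex.intersection(), PeriodIndex.symmetric_difference() always returning object-dtype when operating with CategoricalIndex (GH 38741)
Bug in DatetimeIndex.intersection() giving incorrect results with non-Tick frequencies with n != 1 (GH 42104)
Bug in Series.where() incorrectly casting datetime64 values to int64 (GH 37682)
Bug in Categorical incorrectly typecasting datetime objects to Timestamp (GH 38878)
Bug in comparisons between Timestamp objects and datetime64 objects just outside the implementation bounds for nanosecond datetime64 (GH 39221)
Bug in Timestamp.round(), Timestamp.floor(), Timestamp.ceil() for values near the implementation bounds of Timestamp (GH 39244)
Bug in Timedelta.round(), Timedelta.floor(), Timedelta.ceil() for values near the implementation bounds of Timedelta (GH 38964)
Bug in date_range() incorrectly creating a DatetimeIndex containing NaT instead of raising OutOfBoundsDatetime in corner cases (GH 24124)
Bug in infer_freq() incorrectly failing to infer 'H' frequency of a DatetimeIndex if the latter has a timezone and crosses DST boundaries (GH 39556)
Bug in Series backed by DatetimeArray or TimedeltaArray sometimes failing to set the array's freq to None (GH 41425)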

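Timedelta#

Bug in constructing Timedelta from np.timedelta64 objects with non-nanosecond units that are out of bounds for timedelta64[ns] (GH 38965)
Bug in constructing a TimedeltaIndex incorrectly accepting np.datetime64("NaT") objects (GH 39462)
Bug in constructing Timedelta from an input string with only symbols and no digits failing to raise an error (GH 39710)
Bug in TimedeltaIndex and to_timedelta() failing to raise when passed non-nanosecond timedelta64 arrays that overflow when converting to timedelta64[ns] (GH 40008)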

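Timezones#

Bug in different tzinfo objects representing UTC not being treated as equivalent (GH 39216)
Bug in dateutil.tz.gettz("UTC") not being recognized as equivalent to other UTC-representing tzinfos (GH 39276)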

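Numeric#

Bug in DataFrame.quantile(), DataFrame.sort_values() causing incorrect subsequent indexing behavior (GH 38351)
Bug in DataFrame.sort_values() raising an IndexError for empty by (GH 40258)
Bug in DataFrame.select_dtypes() with include=np.number dropping numeric ExtensionDtype columns (GH 35340)
Bug in DataFrame.mode() and Series.mode() not keeping a consistent integer Index for empty input (GH 33321)
Bug in DataFrame.rank() when the DataFrame contained np.inf (GH 32593)
Bug in DataFrame.rank() with axis=0 and columns holding incomparable types raising an IndexError (GH 38932)
Bug in Series.rank(), DataFrame.rank(), DataFrameGroupBy.rank(), and SeriesGroupBy.rank() treating the most negative int64 value as missing (GH 32859)
Bug in DataFrame.select_dtypes() behaving differently between Windows and Linux with include="int" (GH 36596)
Bug in DataFrame.apply() and DataFrame.agg() where passing the argument func="size" would operate on the entire DataFrame instead of rows or columns (GH 39934)
Bug in DataFrame.transform() raising a SpecificationError when passed a dictionary and columns were missing; it will now raise a KeyError instead (GH 40004)
Bug in DataFrameGroupBy.rank() and SeriesGroupBy.rank() giving incorrect results with pct=True and equal values between consecutive groups (GH 40518)
Bug in Series.count() producing an int32 result on 32-bit platforms when argument level=None (GH 40908)
Bug in Series and DataFrame reductions with methods any and all not returning Boolean results for object data (GH 12863, GH 35450, GH 27709)
Bug in Series.clip() failing if the Series contains NA values and has nullable int or float as a data type (GH 40851)
Bug in UInt64Index.where() and UInt64Index.putmask() with an np.int64 dtype other incorrectly raising TypeError (GH 41974)
Bug in DataFrame.agg() not sorting the aggregated axis in the order of the provided aggregation functions when one or more aggregation functions fail to produce results (GH 33634)
Bug in DataFrame.clip() not interpreting missing values as no threshold (GH 40420)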

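Conversion#

Bug in Series.to_dict() with orient='records'; it now returns Python native types (GH 25969)
Bug in Series.view() and Index.view() when converting between datetime-like (datetime64[ns], datetime64[ns, tz], timedelta64, period) dtypes (GH 39788)
Bug in creating a DataFrame from an empty np.recarray not retaining the original dtypes (GH 40121)
Bug in DataFrame failing to raise a TypeError when constructing from a frozenset (GH 40163)
Bug in Index construction silently ignoring a passed dtype when the data cannot be cast to that dtype (GH 21311)
Bug in StringArray.astype() falling back to NumPy and raising when converting to dtype='categorical' (GH 40450)
Bug in factorize() where, when given an array with a numeric NumPy dtype lower than int64, uint64 and float64, the unique values did not keep their original dtype (GH 41132)
Bug in DataFrame construction with a dictionary containing an array-like with ExtensionDtype and copy=True failing to make a copy (GH 38939)
Bug in qcut() raising an error when taking Float64DType as input (GH 40730)
Bug in DataFrame and Series construction with datetime64[ns] data and dtype=object resulting in datetime objects instead of Timestamp objects (GH 41599)
Bug in DataFrame and Series construction with timedelta64[ns] data and dtype=object resulting in np.timedelta64 objects instead of Timedelta objects (GH 41599)
Bug in DataFrame construction when given a two-dimensional object-dtype np.ndarray of Period or Interval objects failing to cast to PeriodDtype or IntervalDtype, respectively (GH 41812)
Bug in constructing a Series from a list and a PandasDtype (GH 39357)
Bug in creating a Series from a range object that does not fit in the bounds of int64 dtype (GH 30173)
Bug in creating a Series from a dict with all-tuple keys and an Index that requires reindexing (GH 41707)
Bug in infer_dtype() not recognizing Series, Index, or array with a Period dtype (GH 23553)
Bug in infer_dtype() raising an error for general ExtensionArray objects. It will now return "unknown-array" instead of raising (GH 37367)
Bug in DataFrame.convert_dtypes() incorrectly raising a ValueError when called on an empty DataFrame (GH 40393)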

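Strings#

Bug in the conversion from pyarrow.ChunkedArray to StringArray when the original had zero chunks (GH 41040)
Bug in Series.replace() and DataFrame.replace() ignoring replacements with regex=True for StringDType data (GH 41333, GH 35977)
Bug in Series.str.extract() with StringArray returning object dtype for an empty DataFrame (GH 41441)
Bug in Series.str.replace() where the case argument was ignored when regex=False (GH 41602)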

Interval#

Bug in IntervalIndex.intersection() and IntervalIndex.symmetric_difference() always returning object-dtype when operating with CategoricalIndex (GH 38653, GH 38741)
Bug in IntervalIndex.intersection() returning duplicates when at least one of the Index objects has duplicates which are present in the other (GH 38743)
IntervalIndex.union(), IntervalIndex.intersection(), IntervalIndex.difference(), and IntervalIndex.symmetric_difference() now cast to the appropriate dtype instead of raising a TypeError when operating with another IntervalIndex with incompatible dtype (GH 39267)
PeriodIndex.union(), PeriodIndex.intersection(), PeriodIndex.symmetric_difference(), PeriodIndex.difference() now cast to object dtype instead of raising IncompatibleFrequency when operating with another PeriodIndex with incompatible dtype (GH 39306)
Bug in IntervalIndex.is_monotonic(), IntervalIndex.get_loc(), IntervalIndex.get_indexer_for(), and IntervalIndex.__contains__() when NA values are present (GH 41831)

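Indexing#

Bug in Index.union() and MultiIndex.union() dropping duplicate Index values when Index was not monotonic or sort was set to False (GH 36289, GH 31326, GH 40862)
Bug in CategoricalIndex.get_indexer() failing to raise InvalidIndexError when non-unique (GH 38372)
Bug in IntervalIndex.get_indexer() when target has CategoricalDtype and both the index and the target contain NA values (GH 41934)
Bug in Series.loc() raising a ValueError when input was filtered with a Boolean list and values to set were a list with lower dimension (GH 20438)
Bug in inserting many new columns into a DataFrame causing incorrect subsequent indexing behavior (GH 38380)
Bug in DataFrame.__setitem__() raising a ValueError when setting multiple values to duplicate columns (GH 15695)
Bug in DataFrame.loc(), Series.loc(), DataFrame.__getitem__() and Series.__getitem__() returning incorrect elements for non-monotonic DatetimeIndex for string slices (GH 33146)
Bug in DataFrame.reindex() and Series.reindex() with timezone aware indexes raising a TypeError for method="ffill" and method="bfill" and specified tolerance (GH 38566)
Bug in DataFrame.reindex() with datetime64[ns] or timedelta64[ns] incorrectly casting to integers when the fill_value requires casting to object dtype (GH 39755)
Bug in DataFrame.__setitem__() raising a ValueError when setting on an empty DataFrame using specified columns and a nonempty DataFrame value (GH 38831)
Bug in DataFrame.loc.__setitem__() raising a ValueError when operating on a unique column when the DataFrame has duplicate columns (GH 38521)
Bug in DataFrame.iloc.__setitem__() and DataFrame.loc.__setitem__() with mixed dtypes when setting with a dictionary value (GH 38335)
Bug in Series.loc.__setitem__() and DataFrame.loc.__setitem__() raising KeyError when provided a Boolean generator (GH 39614)
Bug in Series.iloc() and DataFrame.iloc() raising a KeyError when provided a generator (GH 39614)
Bug in DataFrame.__setitem__() not raising a ValueError when the right hand side is a DataFrame with the wrong number of columns (GH 38604)
Bug in Series.__setitem__() raising a ValueError when setting a Series with a scalar indexer (GH 38303)
Bug in DataFrame.loc() dropping levels of a MultiIndex when the DataFrame used as input has only one row (GH 10521)
Bug in DataFrame.__getitem__() and Series.__getitem__() always raising KeyError when slicing with existing strings where the Index has milliseconds (GH 33589)
Bug in setting timedelta64 or datetime64 values into numeric Series failing to cast to object dtype (GH 39086, GH 39619)
Bug in setting Interval values into a Series or DataFrame with mismatched IntervalDtype incorrectly casting the new values to the existing dtype (GH 39120)
Bug in setting datetime64 values into a Series with integer-dtype incorrectly casting the datetime64 values to integers (GH 39266)
Bug in setting np.datetime64("NaT") into a Series with Datetime64TZDtype incorrectly treating the timezone-naive value as timezone-aware (GH 39769)
Bug in Index.get_loc() not raising KeyError when key=NaN and method is specified but NaN is not in the Index (GH 39382)
Bug in DatetimeIndex.insert() when inserting np.datetime64("NaT") into a timezone-aware index incorrectly treating the timezone-naive value as timezone-aware (GH 39769)
Bug in incorrectly raising in Index.insert(), when setting a new column that cannot be held in the existing frame.columns, or in Series.reset_index() or DataFrame.reset_index() instead of casting to a compatible dtype (GH 39068)
Bug in RangeIndex.append() where a single object of length 1 was concatenated incorrectly (GH 39401)
Bug in RangeIndex.astype() where, when converting to CategoricalIndex, the categories became an Int64Index instead of a RangeIndex (GH 41263)
Bug in setting numpy.timedelta64 values into an object-dtype Series using a Boolean indexer (GH 39488)
Bug in setting numeric values into a boolean-dtype Series using at or iat failing to cast to object-dtype (GH 39582)
Bug in DataFrame.__setitem__() and DataFrame.iloc.__setitem__() raising ValueError when trying to index with a row-slice and setting a list as values (GH 40440)
Bug in DataFrame.loc() not raising KeyError when the key was not found in MultiIndex and the levels were not fully specified (GH 41170)
Bug in DataFrame.loc.__setitem__() when setting-with-expansion incorrectly raising when the index in the expanding axis contained duplicates (GH 40096)
Bug in DataFrame.loc.__getitem__() with MultiIndex casting to float when at least one index column has float dtype and we retrieve a scalar (GH 41369)
Bug in DataFrame.loc() incorrectly matching non-Boolean index elements (GH 20432)
Bug in indexing with np.nan on a Series or DataFrame with a CategoricalIndex incorrectly raising KeyError when np.nan keys are present (GH 41933)
Bug in Series.__delitem__() with ExtensionDtype incorrectly casting to ndarray (GH 40386)
Bug in DataFrame.at() with a CategoricalIndex returning incorrect results when passed integer keys (GH 41846)
Bug in DataFrame.loc() returning a MultiIndex in the wrong order if an indexer has duplicates (GH 40978)
Bug in DataFrame.__setitem__() raising a TypeError when using a str subclass as the column name with a DatetimeIndex (GH 37366)
Bug in PeriodIndex.get_loc() failing to raise a KeyError when given a Period with a mismatched freq (GH 41670)
Bug in .loc.__getitem__ with a UInt64Index and negative-integer keys raising OverflowError instead of KeyError in some cases, and wrapping around to positive integers in others (GH 41777)
Bug in Index.get_indexer() failing to raise ValueError in some cases with invalid method, limit, or tolerance arguments (GH 41918)
Bug when slicing a Series or DataFrame with a TimedeltaIndex when passing an invalid string raising ValueError instead of a TypeError (GH 41821)
Bug in the Index constructor sometimes silently ignoring a specified dtype (GH 38879)
Index.where() behavior now mirrors Index.putmask() behavior, i.e. index.where(mask, other) matches index.putmask(~mask, other) (GH 39412)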
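
The equivalence between Index.where() and Index.putmask() stated in the last item above can be checked on a small index. This is a minimal sketch with made-up values; the names index, mask, kept, and patched are illustrative.

```python
import numpy as np
import pandas as pd

index = pd.Index([10, 20, 30, 40])
mask = np.array([True, False, True, False])

# where() keeps the entries where mask is True and fills the rest with 0.
kept = index.where(mask, 0)

# putmask() overwrites the entries where its mask is True, so the
# inverted mask produces the same result.
patched = index.putmask(~mask, 0)
```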

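Missing#

Bug in Grouper not correctly propagating the dropna argument; DataFrameGroupBy.transform() now correctly handles missing values for dropna=True (GH 35612)
Bug in isna(), Series.isna(), Index.isna(), DataFrame.isna(), and the corresponding notna functions not recognizing Decimal("NaN") objects (GH 39409)
Bug in DataFrame.fillna() not accepting a dictionary for the downcast keyword (GH 40809)
Bug in isna() not returning a copy of the mask for nullable types, causing any subsequent mask modification to change the original array (GH 40935)
Bug in DataFrame construction with float data containing NaN and an integer dtype casting instead of retaining the NaN (GH 26919)
Bug in Series.isin() and MultiIndex.isin() not treating all nans as equivalent if they were in tuples (GH 41836)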
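
As an illustration of the Decimal("NaN") fix listed above (GH 39409), the following sketch checks both the scalar and the Series code paths. The data is made up for the example.

```python
from decimal import Decimal

import pandas as pd

# Scalar check: Decimal("NaN") is recognized as missing.
scalar_is_na = pd.isna(Decimal("NaN"))

# Element-wise check on an object-dtype Series that mixes a regular
# Decimal, a Decimal NaN, and None.
ser = pd.Series([Decimal("1.5"), Decimal("NaN"), None], dtype=object)
flags = ser.isna()
```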

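MultiIndex#

Bug in DataFrame.drop() raising a TypeError when the MultiIndex is non-unique and level is not provided (GH 36293)
Bug in MultiIndex.intersection() duplicating NaN in the result (GH 38623)
Bug in MultiIndex.equals() incorrectly returning True when the MultiIndex contained NaN even when they are differently ordered (GH 38439)
Bug in MultiIndex.intersection() always returning an empty result when intersecting with CategoricalIndex (GH 38653)
Bug in MultiIndex.difference() incorrectly raising TypeError when indexes contain non-sortable entries (GH 41915)
Bug in MultiIndex.reindex() raising a ValueError when used on an empty MultiIndex and indexing only a specific level (GH 41170)
Bug in MultiIndex.reindex() raising TypeError when reindexing against a flat Index (GH 41707)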

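I/O#

Bug in Index.__repr__() when display.max_seq_items=1 (GH 38415)
Bug in read_csv() not recognizing scientific notation if the decimal argument is set and engine="python" (GH 31920)
Bug in read_csv() interpreting an NA value as a comment when NA contains the comment string; fixed for engine="python" (GH 34002)
Bug in read_csv() raising an IndexError with multiple header columns and index_col specified when the file has no data rows (GH 38292)
Bug in read_csv() not accepting usecols with a different length than names for engine="python" (GH 16469)
Bug in read_csv() returning object dtype when delimiter="," is used with usecols and parse_dates specified for engine="python" (GH 35873)
Bug in read_csv() raising a TypeError when names and parse_dates are specified for engine="c" (GH 33699)
Bug in read_clipboard() and DataFrame.to_clipboard() not working in WSL (GH 38527)
Allow custom error values for the parse_dates argument of read_sql(), read_sql_query() and read_sql_table() (GH 35185)
Bug in DataFrame.to_hdf() and Series.to_hdf() raising a KeyError when applied to subclasses of DataFrame or Series (GH 33748)
Bug in HDFStore.put() raising the wrong TypeError when saving a DataFrame with a non-string dtype (GH 34274)
Bug in json_normalize() resulting in the first element of a generator object not being included in the returned DataFrame (GH 35923)
Bug in read_csv() applying the thousands separator to date columns when the column should be parsed for dates and usecols is specified for engine="python" (GH 39365)
Bug in read_excel() forward filling MultiIndex names when multiple header and index columns are specified (GH 34673)
Bug in read_excel() not respecting set_option() (GH 34252)
Bug in read_csv() not switching true_values and false_values for nullable Boolean dtype (GH 34655)
Bug in read_json() when orient="split" not maintaining a numeric string index (GH 28556)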
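read_sql() returned an empty generator if chunksize was non-zero and the query returned no results. It now returns a generator with a single empty DataFrame (GH 34411)
Bug in read_hdf() returning unexpected records when filtering on categorical string columns using the where parameter (GH 39189)
Bug in read_sas() raising a ValueError when datetimes were null (GH 39725)
Bug in read_excel() dropping empty values from single-column spreadsheets (GH 39808)
Bug in read_excel() loading trailing empty rows/columns for some file types (GH 41167)
Bug in read_excel() raising an AttributeError when the Excel file had a MultiIndex header followed by two empty rows and no index (GH 40442)
Bug in read_excel(), read_csv(), read_table(), read_fwf(), and read_clipboard() where one blank row after a MultiIndex header with no index would be dropped (GH 40442)
Bug in DataFrame.to_string() misplacing the truncation column when index=False (GH 40904)
Bug in DataFrame.to_string() adding an extra dot and misaligning the truncation row when index=False (GH 40904)
Bug in read_orc() always raising an AttributeError (GH 40918)
Bug in read_csv() and read_table() silently ignoring prefix if both names and prefix are defined; a ValueError is now raised (GH 39123)
Bug in read_csv() and read_excel() not respecting the dtype for a duplicated column name when mangle_dupe_cols is set to True (GH 35211)
Bug in read_csv() silently ignoring sep if both delimiter and sep are defined; a ValueError is now raised (GH 39823)
Bug in read_csv() and read_table() misinterpreting arguments when sys.setprofile had been previously called (GH 41069)
Bug in the conversion from PyArrow to pandas (e.g. for reading Parquet) with nullable dtypes and a PyArrow array whose data buffer size is not a multiple of the dtype size (GH 40896)
Bug in read_excel() raising an error when pandas could not determine the file type even though the user specified the engine argument (GH 41225)
Bug in read_clipboard() where copying from an Excel file shifted values into the wrong column if there were null values in the first column (GH 41108)
Bug in DataFrame.to_hdf() and Series.to_hdf() raising a TypeError when trying to append a string column to an incompatible column (GH 41897)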
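To make the sep/delimiter item above concrete, here is a small sketch that passes both arguments to read_csv() and captures the resulting error. The CSV text is invented for the example.

```python
import io

import pandas as pd

csv_text = "a;b\n1;2\n"

# Passing both `sep` and `delimiter` is ambiguous and is rejected.
try:
    pd.read_csv(io.StringIO(csv_text), sep=";", delimiter=",")
except ValueError as exc:
    error_message = str(exc)
else:
    error_message = None

# Passing only one of the two works as usual.
df = pd.read_csv(io.StringIO(csv_text), sep=";")

print(error_message)
print(df)
```

Supplying only one of the two keywords avoids the error.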

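Period#

Comparisons of Period objects or Index, Series, or DataFrame with mismatched PeriodDtype now behave like other mismatched-type comparisons, returning False for equals, True for not-equal, and raising TypeError for inequality checks (GH 39274)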
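The comparison rules above can be sketched with a monthly PeriodIndex compared against a daily Period. The dates are arbitrary, and the example uses an index rather than two scalar Period objects.

```python
import pandas as pd

monthly = pd.period_range("2021-01", periods=3, freq="M")
daily = pd.Period("2021-01-01", freq="D")

# Mismatched frequencies: equality is all False, inequality is all True.
equal = monthly == daily
not_equal = monthly != daily

# Ordering comparisons between mismatched frequencies are rejected.
try:
    monthly < daily
except TypeError:
    ordering_raised = True
else:
    ordering_raised = False

print(equal.tolist(), not_equal.tolist(), ordering_raised)
```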

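Plotting#

Bug in plotting.scatter_matrix() raising when a 2d ax argument was passed (GH 16253)
Prevent warnings when Matplotlib's constrained_layout is enabled (GH 25261)
Bug in DataFrame.plot() showing the wrong colors in the legend if the function was called repeatedly and some calls used yerr while others did not (GH 39522)
Bug in DataFrame.plot() showing the wrong colors in the legend if the function was called repeatedly and some calls used secondary_y while others used legend=False (GH 40044)
Bug in DataFrame.plot.box() where the caps or min/max markers of the plot were not visible when the dark_background theme was selected (GH 40769)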

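Groupby/resample/rolling#

Bug in DataFrameGroupBy.agg() and SeriesGroupBy.agg() with PeriodDtype columns incorrectly casting results too aggressively (GH 38254)
Bug in SeriesGroupBy.value_counts() where unobserved categories in a grouped categorical Series were not tallied (GH 38672)
Bug in SeriesGroupBy.value_counts() where an error was raised on an empty Series (GH 39172)
Bug in GroupBy.indices() containing non-existent indices when null values were present in the groupby keys (GH 9304)
Fixed bug in DataFrameGroupBy.sum() and SeriesGroupBy.sum() causing a loss of precision; they now use Kahan summation (GH 38778)
Fixed bug in DataFrameGroupBy.cumsum(), SeriesGroupBy.cumsum(), DataFrameGroupBy.mean(), and SeriesGroupBy.mean() causing a loss of precision; they now use Kahan summation (GH 38934)
Bug in Resampler.aggregate() and DataFrame.transform() raising a TypeError instead of SpecificationError when missing keys had mixed dtypes (GH 39025)
Bug in DataFrameGroupBy.idxmin() and DataFrameGroupBy.idxmax() with ExtensionDtype columns (GH 38733)
Bug in Series.resample() raising when the index was a PeriodIndex consisting of NaT (GH 39227)
Bug in RollingGroupby.corr() and ExpandingGroupby.corr() where the groupby column would return 0 instead of np.nan when providing an other that was longer than each group (GH 39591)
Bug in ExpandingGroupby.corr() and ExpandingGroupby.cov() where 1 would be returned instead of np.nan when providing an other that was longer than each group (GH 39591)
Bug in DataFrameGroupBy.mean(), SeriesGroupBy.mean(), DataFrameGroupBy.median(), SeriesGroupBy.median(), and DataFrame.pivot_table() not propagating metadata (GH 28283)
Bug in Series.rolling() and DataFrame.rolling() not calculating window bounds correctly when window is an offset and dates are in descending order (GH 40002)
Bug in Series.groupby() and DataFrame.groupby() on an empty Series or DataFrame losing index, columns, and/or data types when directly using the methods idxmax, idxmin, mad, min, max, sum, prod, and skew, or using them through apply, aggregate, or resample (GH 26411)
Bug in DataFrameGroupBy.apply() and SeriesGroupBy.apply() where a MultiIndex would be created instead of an Index when used on a RollingGroupby object (GH 39732)
Bug in DataFrameGroupBy.sample() where an error was raised when weights was specified and the index was an Int64Index (GH 39927)
Bug in DataFrameGroupBy.aggregate() and Resampler.aggregate() sometimes raising a SpecificationError when passed a dictionary and columns were missing; a KeyError is now always raised instead (GH 40004)
Bug in DataFrameGroupBy.sample() where column selection was not applied before computing the result (GH 39928)
Bug in ExponentialMovingWindow where calling __getitem__ would incorrectly raise a ValueError when providing times (GH 40164)
Bug in ExponentialMovingWindow where calling __getitem__ would not retain the com, span, alpha or halflife attributes (GH 40164)
ExponentialMovingWindow now raises a NotImplementedError when specifying times with adjust=False due to an incorrect calculation (GH 40098)
Bug in ExponentialMovingWindowGroupby.mean() where the times argument was ignored when engine='numba' (GH 40951)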
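Bug in ExponentialMovingWindowGroupby.mean() where the wrong times were used in the case of multiple groups (GH 40951)
Bug in ExponentialMovingWindowGroupby where the times vector and values became out of sync for non-trivial groups (GH 40951)
Bug in Series.asfreq() and DataFrame.asfreq() dropping rows when the index was not sorted (GH 39805)
Bug in aggregation functions for DataFrame not respecting the numeric_only argument when the level keyword was given (GH 40660)
Bug in SeriesGroupBy.aggregate() where using a user-defined function to aggregate a Series with an object-typed Index caused an incorrect Index shape (GH 40014)
Bug in RollingGroupby where the as_index=False argument in groupby was ignored (GH 39433)
Bug in DataFrameGroupBy.any(), SeriesGroupBy.any(), DataFrameGroupBy.all() and SeriesGroupBy.all() raising a ValueError when used with nullable type columns holding NA, even with skipna=True (GH 40585)
Bug in DataFrameGroupBy.cummin(), SeriesGroupBy.cummin(), DataFrameGroupBy.cummax() and SeriesGroupBy.cummax() incorrectly rounding integer values near the int64 implementation bounds (GH 40767)
Bug in DataFrameGroupBy.rank() and SeriesGroupBy.rank() with nullable dtypes incorrectly raising a TypeError (GH 41010)
Bug in DataFrameGroupBy.cummin(), SeriesGroupBy.cummin(), DataFrameGroupBy.cummax() and SeriesGroupBy.cummax() computing wrong results with nullable data types too large to roundtrip when casting to float (GH 37493)
Bug in DataFrame.rolling() returning a mean of zero for an all-NaN window with min_periods=0 if the calculation is not numerically stable (GH 41053)
Bug in DataFrame.rolling() returning a non-zero sum for an all-NaN window with min_periods=0 if the calculation is not numerically stable (GH 41053)
Bug in SeriesGroupBy.agg() failing to retain an ordered CategoricalDtype on order-preserving aggregations (GH 41147)
Bug in DataFrameGroupBy.min(), SeriesGroupBy.min(), DataFrameGroupBy.max() and SeriesGroupBy.max() with multiple object-dtype columns and numeric_only=False incorrectly raising a ValueError (GH 41111)
Bug in DataFrameGroupBy.rank() with the GroupBy object's axis=0 and the rank method's keyword axis=1 (GH 41320)
Bug in DataFrameGroupBy.__getitem__() with non-unique columns incorrectly returning a malformed SeriesGroupBy instead of a DataFrameGroupBy (GH 41427)
Bug in DataFrameGroupBy.transform() with non-unique columns incorrectly raising an AttributeError (GH 41427)
Bug in Resampler.apply() with non-unique columns incorrectly dropping duplicated columns (GH 41445)
Bug in Series.groupby() aggregations incorrectly returning an empty Series instead of raising TypeError on aggregations that are invalid for its dtype, e.g. .prod with datetime64[ns] dtype (GH 41342)
Bug in DataFrameGroupBy aggregations incorrectly failing to drop columns with invalid dtypes for that aggregation when there are no valid columns (GH 41291)
Bug in DataFrame.rolling.__iter__() where on was not assigned to the index of the resulting objects (GH 40373)
Bug in DataFrameGroupBy.transform() and DataFrameGroupBy.agg() with engine="numba" where *args were being cached with the user-passed function (GH 41647)
Bug in the DataFrameGroupBy methods agg, transform, sum, bfill, ffill, pad, pct_change, shift, and ohlc dropping .columns.names (GH 41497)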
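Two of the fixes above mention Kahan summation. The following plain-Python sketch shows the general technique only; it is not pandas's own implementation, and the function names and test data are hypothetical.

```python
def naive_sum(values):
    """Plain left-to-right floating-point accumulation."""
    total = 0.0
    for x in values:
        total += x
    return total


def kahan_sum(values):
    """Compensated (Kahan) summation: carry the rounding error forward."""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for x in values:
        y = x - compensation            # re-apply the previously lost bits
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was just lost
        total = t
    return total


# One large value followed by many values too small to register individually.
data = [1.0] + [1e-16] * 100_000

print(naive_sum(data))
print(kahan_sum(data))
```

In this data set each small term is below half the floating-point spacing at 1.0, so the naive loop never moves off 1.0, while the compensated loop accumulates the small terms and lands close to 1 + 1e-11.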

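Reshaping#

Bug in merge() raising an error when performing an inner join with a partial index and right_index=True when there was no overlap between indices (GH 33814)
Bug in DataFrame.unstack() with missing levels leading to incorrect index names (GH 37510)
Bug in merge_asof() propagating the right Index instead of the left Index with left_index=True and a right_on specification (GH 33463)
Bug in DataFrame.join() on a DataFrame with a MultiIndex returning the wrong result when one of the two indexes had only one level (GH 36909)
merge_asof() now raises a ValueError instead of a cryptic TypeError in the case of non-numerical merge columns (GH 29130)
Bug in DataFrame.join() not assigning values correctly when the DataFrame had a MultiIndex where at least one dimension had dtype Categorical with non-alphabetically sorted categories (GH 38502)
Series.value_counts() and Series.mode() now return consistent keys in original order (GH 12679, GH 11227 and GH 39007)
Bug in DataFrame.stack() not handling NaN in MultiIndex columns correctly (GH 39481)
Bug in DataFrame.apply() giving incorrect results when the argument func was a string, axis=1, and the axis argument was not supported; a ValueError is now raised instead (GH 39211)
Bug in DataFrame.sort_values() not reshaping the index correctly after sorting on columns when ignore_index=True (GH 39464)
Bug in DataFrame.append() returning incorrect dtypes with combinations of ExtensionDtype dtypes (GH 39454)
Bug in DataFrame.append() returning incorrect dtypes when used with combinations of datetime64 and timedelta64 dtypes (GH 39574)
Bug in DataFrame.append() with a DataFrame with a MultiIndex when appending a Series whose Index is not a MultiIndex (GH 41707)
Bug in DataFrame.pivot_table() returning a MultiIndex for a single value when operating on an empty DataFrame (GH 13483)
Index can now be passed to the numpy.all() function (GH 40180)
Bug in DataFrame.stack() not preserving CategoricalDtype in a MultiIndex (GH 36991)
Bug in to_datetime() raising an error when the input sequence contained unhashable items (GH 39756)
Bug in Series.explode() preserving the index when ignore_index was True and the values were scalars (GH 40487)
Bug in to_datetime() raising a ValueError when a Series contains None and NaT and has more than 50 elements (GH 39882)
Bug in Series.unstack() and DataFrame.unstack() with object-dtype values containing timezone-aware datetime objects incorrectly raising TypeError (GH 41875)
Bug in DataFrame.melt() raising InvalidIndexError when the DataFrame has duplicate columns used as value_vars (GH 41951)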
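The Series.explode() item above can be illustrated with a Series of scalars and, for comparison, a Series of lists. The data and index labels are made up.

```python
import pandas as pd

# Scalars only: there is nothing to explode, but ignore_index should still apply.
scalars = pd.Series([1, 2, 3], index=["a", "b", "c"])
result = scalars.explode(ignore_index=True)

# List-likes: the usual case, shown for comparison.
nested = pd.Series([[1, 2], [3]], index=["a", "b"]).explode(ignore_index=True)

print(result.index.tolist())
print(nested.index.tolist())
```

In both cases the original labels are expected to be replaced by a fresh 0..n-1 index.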

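Sparse#

Bug in DataFrame.sparse.to_coo() raising a KeyError with columns that are a numeric Index without a 0 (GH 18414)
Bug in SparseArray.astype() with copy=False producing incorrect results when going from integer dtype to floating dtype (GH 34456)
Bug in SparseArray.max() and SparseArray.min() always returning an empty result (GH 40921)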
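A short sketch of the SparseArray.max() and SparseArray.min() item above, using arbitrary values and an explicit fill value of 0:

```python
import pandas as pd

# Only 5 and -2 are stored explicitly; the zeros are represented by fill_value.
arr = pd.arrays.SparseArray([0, 5, 0, -2], fill_value=0)

largest = arr.max()
smallest = arr.min()

print(largest, smallest)
```

Both reductions should take the stored values and the fill value into account.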

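ExtensionArray#

Bug in DataFrame.where() when other is a Series with an ExtensionDtype (GH 38729)
Fixed bug where Series.idxmax(), Series.idxmin(), Series.argmax(), and Series.argmin() would fail when the underlying data is an ExtensionArray (GH 32749, GH 33719, GH 36566)
Fixed bug where some properties of subclasses of PandasExtensionDtype were improperly cached (GH 40329)
Bug in DataFrame.mask() where masking a DataFrame with an ExtensionDtype raised a ValueError (GH 40941)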
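The idxmax/idxmin/argmax/argmin item above can be checked against a Series backed by an ExtensionArray. This sketch assumes the nullable "Int64" dtype as the example extension type; the values and labels are arbitrary.

```python
import pandas as pd

# "Int64" (capital I) is a nullable integer dtype backed by an ExtensionArray.
s = pd.Series([1, 3, 2], index=["a", "b", "c"], dtype="Int64")

label_of_max = s.idxmax()
label_of_min = s.idxmin()
position_of_max = s.argmax()
position_of_min = s.argmin()

print(label_of_max, label_of_min, position_of_max, position_of_min)
```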

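Styler#

Bug in Styler where the subset argument in methods raised an error for some valid MultiIndex slices (GH 33562)
Styler rendered HTML output has seen minor alterations to support w3 good code standards (GH 39626)
Bug in Styler where rendered HTML was missing a column class identifier for certain header cells (GH 39716)
Bug in Styler.background_gradient() where the text color was not determined correctly (GH 39888)
Bug in Styler.set_table_styles() where multiple elements in CSS selectors of the table_styles argument were not correctly added (GH 34061)
Bug in Styler where copying from Jupyter dropped the top-left cell and misaligned headers (GH 12147)
Bug in Styler.where where kwargs were not passed to the applicable callable (GH 40845)
Bug in Styler causing CSS to be duplicated on multiple renders (GH 39395, GH 40334)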

Other#

inspect.getmembers(Series) no longer raises an AbstractMethodError (GH 38782)
Bug in Series.where() with numeric dtype and other=None not casting to nan (GH 39761)
Bug in assert_series_equal(), assert_frame_equal(), assert_index_equal() and assert_extension_array_equal() incorrectly raising when an attribute has an unrecognized NA type (GH 39461)
Bug in assert_index_equal() with exact=True not raising when comparing CategoricalIndex instances with Int64Index and RangeIndex categories (GH 41263)
Bug in DataFrame.equals(), Series.equals(), and Index.equals() with object-dtype containing np.datetime64("NaT") or np.timedelta64("NaT") (GH 39650)
Bug in show_versions() where console JSON output was not proper JSON (GH 39701)
pandas can now compile on z/OS when using xlc (GH 35826)
Bug in pandas.util.hash_pandas_object() not recognizing hash_key, encoding and categorize when the input object type is a DataFrame (GH 41404)
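As a sketch of the hash_pandas_object() item above, the following hashes the same DataFrame with two different keys. The frame contents and both 16-character keys are invented for the example.

```python
import pandas as pd

df = pd.DataFrame({"name": ["alice", "bob"], "city": ["Oslo", "Lima"]})

# hash_key must encode to exactly 16 bytes.
first = pd.util.hash_pandas_object(df, hash_key="0123456789abcdef")
second = pd.util.hash_pandas_object(df, hash_key="fedcba9876543210")
repeat = pd.util.hash_pandas_object(df, hash_key="0123456789abcdef")

print(first.tolist())
print(second.tolist())
```

If hash_key is honored for DataFrame input, the same key reproduces the same row hashes and a different key changes them.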

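Contributors#

A total of 251 people contributed patches to this release. People with a "+" by their names contributed a patch for the first time.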

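Abhishek R +
Ada Draginda
Adam J. Stewart
Adam Turner +
Aidan Feldman +
Ajitesh Singh +
Akshat Jain +
Albert Villanova del Moral
Alexandre Prince-Levasseur +
Andrew Hawyrluk +
Andrew Wieteska
AnglinaBhambra +
Ankush Dua +
Anna Daglis
Ashlan Parker +
Ashwani +
Avinash Pancham
Ayushman Kumar +
BeanNan
Benoît Vinot
Bharat Raghunathan
Bijay Regmi +
Bobin Mathew +
Bogdan Pilyavets +
Brian Hulette +
Brian Sun +
Brock +
Bryan Cutler
Caleb +
Calvin Ho +
Chathura Widanage +
Chinmay Rane +
Chris Lynch
Chris Withers
Christos Petropoulos
Corentin Girard +
DaPy15 +
Damodara Puddu +
Daniel Hrisca
Daniel Saxton
DanielFEvans
Dare Adewumi +
Dave Willmer
David Schlachter +
David-dmh +
Deepang Raval +
Doris Lee +
Dr. Jan-Philip Gehrcke +
DriesS +
Dylan Percy
Erfan Nariman
Eric Leung
EricLeer +
Eve
Fangchen Li
Felix Divo
Florian Jetter
Fred Reiss
GFJ138 +
Gaurav Sheni +
Geoffrey B. Eisenbarth +
Gesa Stupperich +
Griffin Ansel +
Gustavo C. Maciel +
Heidi +
Henry +
Hung-Yi Wu +
Ian Ozsvald +
Irv Lustig
Isaac Chung +
Isaac Virshup
JHM Darbyshire (MBP) +
JHM Darbyshire (iMac) +
Jack Liu +
James Lamb +
Jeet Parekh
Jeff Reback
Jiezheng2018 +
Jody Klymak
Johan Kåhrström +
John McGuigan
Joris Van den Bossche
Jose
JoseNavy
Josh Dimarsky
Josh Friedlander
Joshua Klein +
Julia Signell
Julian Schnitzler +
Kaiqi Dong
Kasim Panjri +
Katie Smith +
Kelly +
Kenil +
Keppler, Kyle +
Kevin Sheppard
Khor Chean Wei +
Kiley Hewitt +
Larry Wong +
Lightyears +
Lucas Holtz +
Lucas Rodés-Guirao
Lucky Sivagurunathan +
Luis Pinto
Maciej Kos +
Marc Garcia
Marco Edward Gorelli +
Marco Gorelli
MarcoGorelli +
Mark Graham
Martin Dengler +
Martin Grigorov +
Marty Rudolf +
Matt Roeschke
Matthew Roeschke
Matthew Zeitlin
Max Bolingbroke
Maxim Ivanov
Maxim Kupfer +
Mayur +
MeeseeksMachine
Micael Jarniac
Michael Hsieh +
Michel de Ruiter +
Mike Roberts +
Miroslav Šedivý
Mohammad Jafar Mashhadi
Morisa Manzella +
Mortada Mehyar
Muktan +
Naveen Agrawal +
Noah
Nofar Mishraki +
Oleh Kozynets
Olga Matoula +
Oli +
Omar Afifi
Omer Ozarslan +
Owen Lamont +
Ozan Öğreden +
Pandas Development Team
Paolo Lammens
Parfait Gasana +
Patrick Hoefler
Paul McCarthy +
Paulo S. Costa +
Pav A
Peter
Pradyumna Rahul +
Punitvara +
QP Hou +
Rahul Chauhan
Rahul Sathanapalli
Richard Shadrach
Robert Bradshaw
Robin to Roxel
Rohit Gupta
Sam Purkis +
Samuel GIFFARD +
Sean M. Law +
Shahar Naveh +
ShaharNaveh +
Shiv Gupta +
Shrey Dixit +
Shudong Yang +
Simon Boehm +
Simon Hawkins
Sioned Baker +
Stefan Mejlgaard +
Steven Pitman +
Steven Schaerer +
Stéphane Guillou +
TLouf +
Tegar D Pratama +
Terji Petersen
Theodoros Nikolaou +
Thomas Dickson
Thomas Li
Thomas Smith
Thomas Yu +
ThomasBlauthQC +
Tim Hoffmann
Tom Augspurger
Torsten Wörtwein
Tyler Reddy
UrielMaD
Uwe L. Korn
Venaturum +
VirosaLi
Vladimir Podolskiy
Vyom Pathak +
WANG Aiyong
Waltteri Koskinen +
Wenjun Si +
William Ayd
Yeshwanth N +
Yuanhao Geng
Zito Relova +
aflah02 +
arredond +
attack68
cdknox +
chinggg +
fathomer +
ftrihardjo +
github-actions[bot] +
gunjan-solanki +
guru kiran
hasan-yaman
i-aki-y +
jbrockmendel
jmholzer +
jordi-crespo +
jotasi +
jreback
juliansmidek +
kylekeppler
lrepiton +
lucasrodes
maroth96 +
mikeronayne +
mlondschien
moink +
morrme
mschmookler +
mzeitlin11
na2 +
nofarmishraki +
partev
patrick
ptype
realead
rhshadrach
rlukevie +
rosagold +
saucoide +
sdementen +
shawnbrown
sstiijn +
stphnlyd +
sukriti1 +
taytzehao
theOehrly +
theodorju +
thordisstella +
tonyyyyip +
tsinggggg +
tushushu +
vangorade +
vladu +
wertha +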
