pandas.DataFrame.replace

DataFrame.replace(to_replace=None, value=_NoDefault.no_default, *, inplace=False, limit=None, regex=False, method=_NoDefault.no_default)

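Replace values given in to_replace with value.

Values of the Series/DataFrame are replaced with other values dynamically. This differs from updating with .loc or .iloc, which require you to specify a location to update with some value.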
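
To make the contrast with .loc concrete, here is a minimal sketch. The frame and its column names are made up for illustration; the point is only that replace matches on values wherever they occur, while .loc needs an explicit row selection and column for each update.

```python
import pandas as pd

# Hypothetical frame used only for this illustration.
df = pd.DataFrame({"A": [0, 1, 2], "B": [0, 5, 0]})

# Value-based: every 0 is replaced, wherever it appears.
by_value = df.replace(0, -1)

# Location-based: each column needs its own mask and assignment.
by_location = df.copy()
by_location.loc[by_location["A"] == 0, "A"] = -1
by_location.loc[by_location["B"] == 0, "B"] = -1

print(by_value.equals(by_location))
```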

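Parameters:

to_replace : str, regex, list, dict, Series, int, float, or None

How to find the values that will be replaced.

numeric, str or regex:
- numeric: numeric values equal to to_replace will be replaced with value
- str: strings exactly matching to_replace will be replaced with value
- regex: strings matching the regex to_replace will be replaced with value
list of str, regex, or numeric:
- First, if to_replace and value are both lists, they must be the same length.
- Second, if regex=True then all of the strings in both lists will be interpreted as regexes; otherwise they will match directly. This matters little for value, since only a few substitution regexes are possible there.
- str, regex and numeric rules apply as above.
dict:
- Dicts can be used to specify different replacement values for different existing values. For example, {'a': 'b', 'y': 'z'} replaces the value 'a' with 'b' and 'y' with 'z'. To use a dict in this way, the optional value parameter should not be given.
- For a DataFrame a dict can specify that different values should be replaced in different columns. For example, {'a': 1, 'b': 'z'} looks for the value 1 in column 'a' and the value 'z' in column 'b' and replaces these values with whatever is specified in value. The value parameter should not be None in this case. You can treat this as a special case of passing two lists, except that you are specifying the column to search in.
- For a DataFrame, nested dictionaries such as {'a': {'b': np.nan}} are read as follows: look in column 'a' for the value 'b' and replace it with NaN. The optional value parameter should not be specified when using a nested dict in this way. You can nest regular expressions as well. Note that column names (the top-level dictionary keys in a nested dictionary) cannot be regular expressions.
None:
- This means that the regex argument must be a string, compiled regular expression, or list, dict, ndarray or Series of such elements. If value is also None then this must be a nested dictionary or Series.

See the examples section for examples of each of these.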
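
The nested-dict form with regular expressions is the one case above that is easy to get wrong, so a small sketch may help. The data is illustrative; the top-level key is a plain column name and only the inner keys are treated as patterns.

```python
import pandas as pd

df = pd.DataFrame({"A": ["bat", "foo", "bait"],
                   "B": ["abc", "bar", "xyz"]})

# Outer key: column name (never a regex).
# Inner key: pattern to match; inner value: replacement.
nested = df.replace({"A": {r"^ba.$": "new"}}, regex=True)

print(nested)
```

Column B contains 'bar', which also matches the pattern, but it is left alone because the nested dict restricts the search to column A.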

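value : scalar, dict, list, str, regex, default None

Value to replace any values matching to_replace with. For a DataFrame a dict of values can be used to specify which value to use for each column (columns not in the dict will not be filled). Regular expressions, strings and lists or dicts of such objects are also allowed.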
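
As an illustration of a per-column value dict, the sketch below pairs a column-keyed to_replace with a column-keyed value. The frame is made up for the example; column C appears in neither dict and is therefore not touched.

```python
import pandas as pd

df = pd.DataFrame({"A": [0, 1, 2],
                   "B": [5, 6, 7],
                   "C": ["a", "b", "c"]})

# In column A replace 0 with 100; in column B replace 5 with 500.
per_column = df.replace({"A": 0, "B": 5}, {"A": 100, "B": 500})

print(per_column)
```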

inplace : bool, default False

If True, performs operation inplace and returns None.
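
A short sketch of the difference between the default behaviour and inplace=True, using an illustrative frame. The return value of the in-place call is deliberately not used here; only the mutation of df is shown.

```python
import pandas as pd

df = pd.DataFrame({"A": [0, 1, 2], "B": [5, 0, 7]})

# Default: a new object is returned and df is left as it was.
replaced = df.replace(0, -1)
original_unchanged = df["A"].tolist() == [0, 1, 2]

# inplace=True: df itself is modified.
df.replace(0, -1, inplace=True)

print(original_unchanged, df.equals(replaced))
```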

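limit : int, default None

Maximum size gap to forward or backward fill.

Deprecated since version 2.1.0.

regex : bool or same types as to_replace, default False

Whether to interpret to_replace and/or value as regular expressions. Alternatively, this could be a regular expression or a list, dict, or array of regular expressions, in which case to_replace must be None.

method : {'pad', 'ffill', 'bfill'}

The method to use for replacement when to_replace is a scalar, list or tuple and value is None.

Deprecated since version 2.1.0.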
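
Since method is deprecated, code that relied on fill-style replacement needs another route. The sketch below is one possible substitute built from Series.mask, Series.isin and Series.bfill; it is an assumption of this note rather than a migration path stated by the documentation, and it would also fill any NaN values that were already present in the data.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5])

# Blank out the values to be replaced, then back-fill from the next
# remaining value. Masking introduces NaN, so the dtype becomes float;
# cast back to the original dtype once every gap has been filled.
backfilled = s.mask(s.isin([1, 2])).bfill().astype(s.dtype)

print(backfilled.tolist())
```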

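Returns:

Series/DataFrame: Object after replacement.

Raises:

AssertionError

If regex is not a bool and to_replace is not None.

TypeError

If to_replace is not a scalar, array-like, dict, or None
If to_replace is a dict and value is not a list, dict, ndarray, or Series
If to_replace is None and regex is not compilable into a regular expression or is a list, dict, ndarray, or Series.
When replacing multiple bool or datetime64 objects and the arguments to to_replace do not match the type of the value being replaced

ValueError

If a list or an ndarray is passed to to_replace and value but they are not the same length.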
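
The length check is the error most likely to be hit in practice. A minimal sketch with illustrative data, passing two targets but three replacements:

```python
import pandas as pd

df = pd.DataFrame({"A": [0, 1, 2]})

error_message = None
try:
    # Two values to find, three replacements: the lengths do not match.
    df.replace([0, 1], [10, 20, 30])
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```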

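See also

Series.fillna: Fill NA values.
DataFrame.fillna: Fill NA values.
Series.where: Replace values based on boolean condition.
DataFrame.where: Replace values based on boolean condition.
DataFrame.map: Apply a function to a DataFrame elementwise.
Series.map: Map values of Series according to an input mapping or function.
Series.str.replace: Simple string replacement.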

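Notes

Regex substitution is performed under the hood with re.sub. The rules for substitution are the same as for re.sub.
Regular expressions will only substitute on strings, meaning you cannot provide, for example, a regular expression matching floating point numbers and expect the columns in your frame that have a numeric dtype to be matched. However, if those floating point numbers are strings, then you can do this.
This method has a lot of options. You are encouraged to experiment and play with this method to gain intuition about how it works.
When a dict is used as the to_replace value, the key(s) in the dict act as the to_replace part and the value(s) in the dict act as the value parameter.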
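
The two regex notes above can be checked with a small sketch. The column names as_float and as_text are made up for the illustration: the same pattern is applied to a numeric column and to a column holding the same numbers as strings, and a second call shows that, as with re.sub, only the matched part of a string is substituted.

```python
import pandas as pd

df = pd.DataFrame({"as_float": [1.5, 2.5],
                   "as_text": ["1.5", "2.5"]})

# The pattern is only tried against string data; the float column is skipped.
result = df.replace(r"^1\.5$", "matched", regex=True)

# re.sub semantics: the matched substring is replaced, not the whole cell.
partial = pd.Series(["foobar", "barfoo"]).replace("foo", "X", regex=True)

print(result)
print(partial.tolist())
```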

Examples

Scalar `to_replace` and `value`

>>> s = pd.Series([1, 2, 3, 4, 5])
>>> s.replace(1, 5)
0    5
1    2
2    3
3    4
4    5
dtype: int64

>>> df = pd.DataFrame({'A': [0, 1, 2, 3, 4],
...                    'B': [5, 6, 7, 8, 9],
...                    'C': ['a', 'b', 'c', 'd', 'e']})
>>> df.replace(0, 5)
   A  B  C
0  5  5  a
1  1  6  b
2  2  7  c
3  3  8  d
4  4  9  e

List-like `to_replace`

>>> df.replace([0, 1, 2, 3], 4)
   A  B  C
0  4  5  a
1  4  6  b
2  4  7  c
3  4  8  d
4  4  9  e

>>> df.replace([0, 1, 2, 3], [4, 3, 2, 1])
   A  B  C
0  4  5  a
1  3  6  b
2  2  7  c
3  1  8  d
4  4  9  e

>>> s.replace([1, 2], method='bfill')
0    3
1    3
2    3
3    4
4    5
dtype: int64

dict-like `to_replace`

>>> df.replace({0: 10, 1: 100})
     A  B  C
0   10  5  a
1  100  6  b
2    2  7  c
3    3  8  d
4    4  9  e

>>> df.replace({'A': 0, 'B': 5}, 100)
     A    B  C
0  100  100  a
1    1    6  b
2    2    7  c
3    3    8  d
4    4    9  e

>>> df.replace({'A': {0: 100, 4: 400}})
     A  B  C
0  100  5  a
1    1  6  b
2    2  7  c
3    3  8  d
4  400  9  e

Regular expression `to_replace`

>>> df = pd.DataFrame({'A': ['bat', 'foo', 'bait'],
...                    'B': ['abc', 'bar', 'xyz']})
>>> df.replace(to_replace=r'^ba.$', value='new', regex=True)
      A    B
0   new  abc
1   foo  new
2  bait  xyz

>>> df.replace({'A': r'^ba.$'}, {'A': 'new'}, regex=True)
      A    B
0   new  abc
1   foo  bar
2  bait  xyz

>>> df.replace(regex=r'^ba.$', value='new')
      A    B
0   new  abc
1   foo  new
2  bait  xyz

>>> df.replace(regex={r'^ba.$': 'new', 'foo': 'xyz'})
      A    B
0   new  abc
1   xyz  new
2  bait  xyz

>>> df.replace(regex=[r'^ba.$', 'foo'], value='new')
      A    B
0   new  abc
1   new  new
2  bait  xyz

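Compare the behavior of s.replace({'a': None}) and s.replace('a', None) to understand the peculiarities of the to_replace parameter:

>>> s = pd.Series([10, 'a', 'a', 'b', 'a'])

When a dict is used as the to_replace value, the value(s) in the dict play the role of the value parameter. s.replace({'a': None}) is equivalent to s.replace(to_replace={'a': None}, value=None, method=None):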

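>>> s.replace({'a': None})
0      10
1    None
2    None
3       b
4    None
dtype: object

When value is not explicitly passed and to_replace is a scalar, list or tuple, replace uses the method parameter (default 'pad') to do the replacement. This is why, in this case, the 'a' values in rows 1 and 2 are replaced by 10 and the 'a' value in row 4 is replaced by 'b'.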

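>>> s.replace('a')
0    10
1    10
2    10
3     b
4     b
dtype: object

Deprecated since version 2.1.0: The 'method' parameter and padding behavior are deprecated.

On the other hand, if None is explicitly passed for value, it will be respected: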

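>>> s.replace('a', None)
0      10
1    None
2    None
3       b
4    None
dtype: object

Changed in version 1.4.0: Previously the explicit None was silently ignored.

When regex=True, value is not None and to_replace is a string, the replacement will be applied in all columns of the DataFrame.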

>>> df = pd.DataFrame({'A': [0, 1, 2, 3, 4],
...                    'B': ['a', 'b', 'c', 'd', 'e'],
...                    'C': ['f', 'g', 'h', 'i', 'j']})

>>> df.replace(to_replace='^[a-g]', value='e', regex=True)
   A  B  C
0  0  e  e
1  1  e  e
2  2  e  h
3  3  e  i
4  4  e  j

If value is not None and to_replace is a dictionary, the dictionary keys will be the DataFrame columns to which the replacement is applied.

>>> df.replace(to_replace={'B': '^[a-c]', 'C': '^[h-j]'}, value='e', regex=True)
   A  B  C
0  0  e  f
1  1  e  g
2  2  e  e
3  3  d  e
4  4  e  e