DataFrame regex filtering: using regular expressions with a pandas DataFrame/Series via str.match, str.contains, str.extract...

Using regular expressions with a pandas DataFrame/Series: str.match, str.contains, str.extract

pandas.Series.str.match

Series.str.match(pat, case=True, flags=0, na=nan, as_indexer=False)

With the default as_indexer=False (deprecated behavior), find groups in each string in the Series/Index using the passed regular expression. If as_indexer=True, determine whether each string matches the regular expression.

Parameters:

pat : string

Character sequence or regular expression

case : boolean, default True

If True, case sensitive

flags : int, default 0 (no flags)

re module flags, e.g. re.IGNORECASE

na : default NaN, fill value for missing values.

as_indexer : bool, default False

If False (the default), gives the deprecated behavior, which is better achieved using str.extract. If True, returns a boolean indexer.

Returns:

Series/array of boolean values if as_indexer=True

Series/Index of tuples if as_indexer=False (the default, but deprecated)
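
The following is a minimal sketch of using str.match as a boolean filter. It assumes a recent pandas release, where (as far as I know) str.match returns booleans directly and the as_indexer argument no longer exists; on the older version documented above you would pass as_indexer=True to get the same result. The sample data and variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, made up for illustration.
s = pd.Series(["apple", "Apricot", "banana", None])

# str.match only matches at the start of each string (like re.match).
# case=False ignores case; na=False turns missing values into False
# so the result can be used directly as a boolean mask.
starts_with_a = s.str.match(r"a", case=False, na=False)

# Keep only the entries whose value starts with "a" or "A".
matched = s[starts_with_a]
print(matched.tolist())
```

Passing na=False is a design choice here: without it, missing values stay missing in the mask, which boolean indexing may reject.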

Series.str.contains(pat, case=True, flags=0, na=nan, regex=True)

Return boolean Series/array whether given pattern/regex is contained in each string in the Series/Index.

Parameters:

pat : string

Character sequence or regular expression

case : boolean, default True

If True, case sensitive

flags : int, default 0 (no flags)

re module flags, e.g. re.IGNORECASE

na : default NaN, fill value for missing values.

regex : bool, default True

If True, use re.search; otherwise use the Python in operator.

Returns:

contained : Series/array of boolean values
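
To make the regex and case parameters concrete, here is a small sketch that filters DataFrame rows with str.contains. The column names and values are hypothetical, chosen only to show how a regular expression search differs from a literal substring search.

```python
import pandas as pd

# Hypothetical sample data, made up for illustration.
df = pd.DataFrame({
    "name": ["a.b", "axb", "A.B", None],
    "value": [1, 2, 3, 4],
})

# regex=True (the default): "." matches any character, and the pattern
# may occur anywhere in the string.
regex_mask = df["name"].str.contains(r"a.b", na=False)

# regex=False: "a.b" is treated as a literal substring.
literal_mask = df["name"].str.contains("a.b", regex=False, na=False)

# case=False makes the regular expression case-insensitive.
nocase_mask = df["name"].str.contains(r"a\.b", case=False, na=False)

# Use a mask to filter the rows of the DataFrame.
filtered = df[regex_mask]
print(filtered)
```

When the pattern is plain text that happens to contain regex metacharacters, either escape them or pass regex=False, as the second mask does.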

Series.str.extract(pat, flags=0, expand=None)

For each subject string in the Series, extract groups from the first match of regular expression pat.

New in version 0.13.0.

Parameters:

pat : string

Regular expression pattern with capturing groups

flags : int, default 0 (no flags)

re module flags, e.g. re.IGNORECASE

expand : bool, default False

If True, return DataFrame.

If False, return Series/Index/DataFrame.

New in version 0.18.0.

Returns:

DataFrame with one row for each subject string, and one column for each group. Any capture group names in regular expression pat will be used for column names; otherwise capture group numbers will be used. The dtype of each result column is always object, even when no match is found. If expand=False and pat has only one capture group, then return a Series (if subject is a Series) or Index (if subject is an Index).
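
The return rules above can be sketched as follows. The sample strings and the group names letter and digit are hypothetical; expand is passed explicitly so the example does not depend on the default of any particular pandas version.

```python
import pandas as pd

# Hypothetical sample data, made up for illustration.
s = pd.Series(["a1", "b2", "c3", "zz"])

# Two named capture groups: the group names become the column names.
# expand=True always returns a DataFrame, one row per subject string.
parts = s.str.extract(r"(?P<letter>[ab])(?P<digit>\d)", expand=True)

# One capture group with expand=False returns a Series instead.
digits = s.str.extract(r"[ab](\d)", expand=False)

# Strings with no match produce missing values rather than being dropped.
print(parts)
print(digits)
```

Because unmatched rows are kept as missing values, the result stays aligned with the original index and can be assigned back as new columns.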
