Pandas_Numpy_cheatsheet

Pandas Cheatsheet

First, refer to pandas DataFrame cheatsheet.pdf

Series

DataFrame

Initialization

Load and write data from other sources

  • csv
  • MySQL
  • Hadoop (impyla(as_pandas), happybase)
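For the csv case, a minimal round-trip sketch is shown here. The sample data and the variable names (csv_text, round_trip) are made up for illustration, and an in-memory buffer stands in for a real file path. MySQL and Hadoop access depend on the connection setup and are not sketched.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for a real file on disk.
csv_text = "A,B\n1,x\n2,y\n3,z\n"

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# to_csv with no path returns the CSV text as a string;
# index=False leaves the row index out of the output.
round_trip = df.to_csv(index=False)
```

With a real file, the same calls would take a path instead, for example pd.read_csv("data.csv") and df.to_csv("out.csv", index=False).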

Working with row and column index

df.index
df.columns
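As a small illustration of these two attributes, the sketch below builds a toy DataFrame (the labels r1..r3 and columns A, B are made up) and reads and then replaces its row labels.

```python
import pandas as pd

# Toy frame with explicit row labels (hypothetical data).
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=["r1", "r2", "r3"])

row_labels = list(df.index)    # row labels
col_labels = list(df.columns)  # column labels

# Assigning a sequence of the same length relabels the rows.
df.index = ["a", "b", "c"]
new_row_labels = list(df.index)
```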

Work with columns of data (axis=1)

Work with rows of data (axis=0)

Work with cells

  • TODO: write a comprehensive summary of pandas indexing

Join/combine DataFrame

Split DataFrame

  • use list comprehension

target = [x[11] for x in dataset]
train = [x[0:11] for x in dataset]
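If the data is already in a DataFrame, the same split can probably be done with positional indexing instead of a list comprehension. The sketch below assumes a 12-column dataset whose last column (position 11) is the target. The toy data and the names train_df and target_s are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical dataset: 2 rows, 12 columns, last column is the target.
dataset = np.arange(24).reshape(2, 12).tolist()

# List-comprehension split, as in the notes above.
target = [x[11] for x in dataset]
train = [x[0:11] for x in dataset]

# DataFrame equivalent using positional indexing with iloc.
df = pd.DataFrame(dataset)
train_df = df.iloc[:, 0:11]  # all rows, columns 0..10
target_s = df.iloc[:, 11]    # all rows, column 11 as a Series
```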

Work with whole DataFrame

Work with dates, times and their indexes

Work with strings

Work with missing and non-finite values

Basic Statistics

Work with Categorical data

Annoying Part:

Copy vs View

Chained direct indexing may return a new copy of the data, so it is not recommended for modifying values. The explanation below is quoted from this Stack Overflow answer:
http://stackoverflow.com/questions/20625582/how-to-deal-with-this-pandas-warning
From what I gather, SettingWithCopyWarning was created to flag potentially confusing "chained" assignments, such as the following, which don't always work as expected, particularly when the first selection returns a copy. [see GH5390 and GH5597 for background discussion.]

df[df['A'] > 2]['B'] = new_val  # new_val not set in df

The warning offers a suggestion to rewrite as follows:

df.loc[df['A'] > 2, 'B'] = new_val

However, this doesn't fit your usage, which is equivalent to:

df = df[df['A'] > 2]
df['B'] = new_val
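The two safe patterns from the quoted answer can be sketched on a toy frame. The data and the value 99 are made up for illustration. The explicit .copy() is one common way to make the filter-then-modify case unambiguous, not the only one.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [10, 20, 30, 40]})
new_val = 0

# One .loc call with row mask and column label together:
# the assignment is applied to df itself.
df.loc[df["A"] > 2, "B"] = new_val

# Filter, then modify: take an explicit copy so it is clear that
# subset is independent of df.
subset = df[df["A"] > 2].copy()
subset["B"] = 99
```

After this runs, df holds new_val in the rows where A > 2, while the later change to subset does not touch df.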

modify in place vs return a new value
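One way to see the difference is with sort_values, used here only as an example: by default it returns a new object, and with inplace=True it modifies the frame and returns None. The toy data is made up.

```python
import pandas as pd

df = pd.DataFrame({"A": [3, 1, 2]})

# Default: returns a new DataFrame and leaves df untouched.
sorted_df = df.sort_values("A")
before = df["A"].tolist()

# inplace=True: df itself is modified and the call returns None.
result = df.sort_values("A", inplace=True)
after = df["A"].tolist()
```

A common mistake is writing df = df.sort_values("A", inplace=True), which rebinds df to None.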

index of row and column

Selecting rows and selecting columns by direct indexing look extremely similar, with subtle differences.
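A short sketch of how the same bracket syntax behaves with different arguments, on a made-up frame: a label or list of labels selects columns, while a slice or boolean mask selects rows.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

col = df["A"]             # single label -> one column, as a Series
cols = df[["A"]]          # list of labels -> columns, as a DataFrame
rows = df[0:2]            # slice -> rows by position
masked = df[df["A"] > 1]  # boolean mask -> rows where the mask is True
```

For anything less obvious than these cases, df.loc (label based) and df.iloc (position based) make the intent explicit.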

change column name
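A minimal sketch of renaming columns, with made-up names. rename takes a mapping from old names to new ones and returns a new DataFrame, while assigning to df.columns replaces all names at once.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# Rename selected columns via a mapping; returns a new DataFrame.
renamed = df.rename(columns={"A": "alpha"})

# Replace every column name at once (length must match).
df2 = df.copy()
df2.columns = ["x", "y"]
```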

change column order
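Column order can be changed by selecting the columns in the desired order. The sketch below uses made-up columns and shows two equivalent forms.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})

# Select columns in the new order; returns a new DataFrame.
reordered = df[["C", "A", "B"]]

# Same result with reindex along the column axis.
reordered2 = df.reindex(columns=["C", "A", "B"])
```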
