How to Use Pandas Profiling on Google Colab


pandas-profiling is an impressive open-source library for pandas. Exploratory data analysis (EDA) usually starts with df.describe(), df.info(), and similar calls, each of which has to be run separately. pandas_profiling extends this basic data frame summary with a single line of code, df.profile_report(), which generates an interactive report of the statistics. You can read more about it here.

However, pandas_profiling cannot be used on Colab out of the box. Running the code results in the following error:

concat() got an unexpected keyword argument 'join_axes'

This happens because Google Colab comes with an older version of pandas-profiling (v1) pre-installed. That version still passes the join_axes argument to concat(), but join_axes was deprecated and then removed in pandas 1.0, so the newer pandas installed on Colab rejects it.
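One way to confirm the mismatch is to print the installed versions before and after upgrading. The sketch below uses only the standard library; the helper name installed_version is made up for illustration and is not part of pandas or pandas-profiling.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Print the versions of the two packages involved in the error.
for name in ("pandas", "pandas-profiling"):
    print(name, installed_version(name))
```

If pandas-profiling reports a 1.x version, the upgrade described in the steps below has not taken effect yet.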

The two key commands for Google Colab are:

! pip install https://github.com/pandas-profiling/pandas-profiling/archive/master.zip
profile.to_notebook_iframe()

Steps: Install Pandas Profiling on Google Colab

1. Run the command below, which installs the latest version straight from the project's GitHub repository:

! pip install https://github.com/pandas-profiling/pandas-profiling/archive/master.zip 

2. Restart the kernel so that the newly installed version is picked up.

3. Re-import the libraries.

Image by author

4. Import and read your data set.

5. Define your profile report:

Image by author

6. However, profile.to_widgets() does not work properly because it is not yet fully supported on Google Colab, as the snapshot below shows:

Image by author

7. Instead, switch to profile.to_notebook_iframe(), as in the snapshot below:

Image by author

8. And here's your output:

GIF by author

9. Save the output as an HTML file so you can share it as a web page.

Image by author
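Since the screenshots for steps 3 to 9 are not reproduced here, the following is a rough sketch of what those cells might look like, not the author's exact code. The helper names build_report and show_and_save, the default title, and the report.html path are all made up for illustration; ProfileReport, to_notebook_iframe() and to_file() come from pandas-profiling v2.

```python
def build_report(df, title="Pandas Profiling Report"):
    """Build a profile report for a pandas DataFrame (step 5)."""
    # Imported inside the function so this helper can be defined even before
    # the upgraded package is installed and the kernel has been restarted.
    from pandas_profiling import ProfileReport

    return ProfileReport(df, title=title)


def show_and_save(profile, html_path="report.html"):
    """Render the report inline (step 7), then save it as HTML (step 9)."""
    profile.to_notebook_iframe()  # used on Colab instead of to_widgets()
    profile.to_file(html_path)    # hypothetical path; shareable as a web page
    return html_path
```

In a Colab cell, after loading a DataFrame df with pandas (for example with pd.read_csv), calling show_and_save(build_report(df)) should display the interactive report and write report.html next to the notebook.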

pandas_profiling first displays a descriptive overview of the data set: the number of variables, the number of observations, total missing cells, duplicate rows, memory used, and the variable types. It then generates a detailed analysis of each variable, along with class distributions, interactions, correlations, missing values, sample rows, and duplicated rows, each of which you can inspect by clicking the corresponding tab.

I hope this helps you get started with pandas profiling.

Happy exploring!

Translated from: https://medium.com/python-in-plain-english/how-to-use-pandas-profiling-on-google-colab-e34f34ff1c9f
