Python Data Analysis Technology Stack, Chapter 06: Preparing Data with Pandas

05 Creating DataFrames by importing data from other formats

Pandas can read data from a wide variety of formats using its reader functions (the user guide lists all supported formats: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html). The following are some of the most commonly used formats.

From a CSV file

The read_csv function reads data from a CSV file into a DataFrame, as shown in the following.

import pandas as pd
titanic = pd.read_csv('titanic.csv')

Reading data from CSV files is one of the most common ways to create a DataFrame. CSV (comma-separated values) files are plain-text files in which each line holds one row and commas separate the values within that row. If you are working in Jupyter, remember to upload the CSV file using the upload button on the Jupyter home page (Figure 6-1) before calling the read_csv function.
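To try read_csv without uploading a file, the following minimal sketch reads CSV text from an in-memory buffer instead of from disk. The sample rows and column names are made up for illustration and are not taken from titanic.csv.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a file such as titanic.csv.
csv_text = """name,age,survived
Allen,29,1
Braund,22,0
Moran,,0
"""

# read_csv accepts a file path or any file-like object, so a StringIO
# buffer behaves like an open CSV file.
passengers = pd.read_csv(io.StringIO(csv_text))

print(passengers.shape)                 # rows and columns
print(passengers.dtypes)                # column types inferred by pandas
print(passengers["age"].isna().sum())   # empty fields become NaN
```

The first line of the text is used as the header row by default, and the empty age field is read as a missing value.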

From an Excel file

Pandas supports importing data from both the xls and xlsx file formats using the pd.read_excel function, as shown in the following. Reading Excel files relies on optional engine packages (for example, xlrd for xls files and openpyxl for xlsx files), which must be installed separately.

titanic_excel = pd.read_excel('titanic.xls')
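As a rough illustration of read_excel, the sketch below writes a small DataFrame to an in-memory xlsx workbook and reads it back. The data and the sheet name "passengers" are assumptions made for this example. Because the Excel engines are optional packages, the round trip is wrapped in a try block and is skipped if no engine is installed.

```python
import io

import pandas as pd

# Hypothetical data; not taken from titanic.xls.
frame = pd.DataFrame({"name": ["Allen", "Braund"], "age": [29, 22]})

buffer = io.BytesIO()
try:
    # Writing and reading xlsx needs an optional engine such as openpyxl.
    frame.to_excel(buffer, index=False, sheet_name="passengers")
    buffer.seek(0)
    loaded = pd.read_excel(buffer, sheet_name="passengers")
except ImportError:
    loaded = None  # no Excel engine available in this environment

if loaded is not None:
    print(loaded)
```

The sheet_name parameter selects which worksheet to load; when it is omitted, read_excel reads the first sheet.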

From a JSON file

JSON stands for JavaScript Object Notation. It is a text-based, platform-independent format widely used for exchanging data, for example between a client and a server. Pandas provides the read_json function to read data from a JSON file, as shown in the following.

titanic = pd.read_json('titanic-json.json')
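The following minimal sketch shows how read_json maps JSON onto a DataFrame. It assumes the JSON is a list of records (one object per row); the records are invented for illustration and the layout of titanic-json.json may differ.

```python
import io

import pandas as pd

# Hypothetical JSON: a list of objects, each object becoming one row.
json_text = '[{"name": "Allen", "age": 29}, {"name": "Braund", "age": 22}]'

# Wrapping the text in StringIO lets read_json treat it like an open file.
passengers = pd.read_json(io.StringIO(json_text))

print(passengers)
print(passengers.shape)
```

The keys of each object become column names. For JSON laid out differently, the orient parameter of read_json describes the expected structure.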

From an HTML file

We can also import data from a web page using the pd.read_html function.

This function parses the tables on the web page and returns a list of DataFrame objects, one for each table found. In the following example, table[0] corresponds to the first table on the page at the given URL.

url = "https://www.w3schools.com/sql/sql_create_table.asp"
table = pd.read_html(url)
table[0]
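To see the list-of-DataFrames behaviour without network access, the sketch below parses a small HTML string containing two tables. The markup is invented for illustration. read_html depends on an optional parser (lxml, or BeautifulSoup4 with html5lib), so the call is guarded in case none is installed.

```python
import io

import pandas as pd

# Hypothetical page content with two tables.
html_text = """
<table>
  <tr><th>name</th><th>age</th></tr>
  <tr><td>Allen</td><td>29</td></tr>
  <tr><td>Braund</td><td>22</td></tr>
</table>
<table>
  <tr><th>port</th></tr>
  <tr><td>Southampton</td></tr>
</table>
"""

try:
    # One DataFrame is returned per <table> element, in document order.
    tables = pd.read_html(io.StringIO(html_text))
except ImportError:
    tables = None  # no HTML parser available in this environment

if tables is not None:
    print(len(tables))
    print(tables[0])
```

Because the result is always a list, a page with a single table still needs indexing such as tables[0] to get the DataFrame itself.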

Further reading: See the complete list of supported formats in Pandas and the functions for reading data from such formats:
https://pandas.pydata.org/pandas-docs/stable/reference/io.html
