Loading data with pandas [read_csv/read_table/sqlite3.connect/read_sql/pymysql.connect/create_engine]

pandas provides several functions for reading tabular data into a DataFrame object; of these, read_csv and read_table are used most often.

Imports

import pandas as pd
from pandas import Series,DataFrame
import numpy as np

Reading a file with read_csv

# The SMSSpamCollection file has no header row; sep sets the delimiter (a tab here)
sms = pd.read_csv('./data/SMSSpamCollection',sep = '\t',header = None)
sms.columns = ['label','message']
sms
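As a minimal sketch of the same call, the column names can also be supplied at read time through the `names` parameter rather than assigned afterwards. The two-row sample text below is made up for illustration and only stands in for the real SMSSpamCollection file.

```python
import io
import pandas as pd

# Hypothetical two-line sample standing in for ./data/SMSSpamCollection
sample = "ham\tSee you at lunch\nspam\tYou have won a prize\n"

# header=None: the first line is data, not column names
# names=[...]: supply the column names while reading
sms_demo = pd.read_csv(
    io.StringIO(sample), sep="\t", header=None, names=["label", "message"]
)
print(sms_demo)
```

Passing `names` keeps the read and the labelling in one step, which can be convenient when the same file is loaded in several places.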

Reading with read_table (its default delimiter is a tab)

pd.read_table('./data/SMSSpamCollection',header = None)
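The call above works without `sep` because read_table splits on tabs by default. A small sketch, using made-up in-memory text instead of the real file, suggests that the two functions give the same result once read_csv is told to split on tabs:

```python
import io
import pandas as pd

# Hypothetical tab-separated sample; not the real SMSSpamCollection data
sample = "ham\tSee you at lunch\nspam\tYou have won a prize\n"

# read_table splits on '\t' by default
via_table = pd.read_table(io.StringIO(sample), header=None)

# read_csv splits on ',' by default, so the tab must be given explicitly
via_csv = pd.read_csv(io.StringIO(sample), sep="\t", header=None)

same = via_table.equals(via_csv)
print(same)
```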

# The delimiter can be another character, such as ; , . -
# Pass it as sep so the columns are split correctly
pd.read_csv('./type-.txt',sep = '-',header = None)
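When a file mixes more than one delimiter, `sep` can also be a regular expression, in which case pandas uses its Python parsing engine. The sketch below assumes a made-up sample that uses both `-` and `;`; it is an illustration, not the content of type-.txt.

```python
import io
import pandas as pd

# Hypothetical sample mixing two delimiters: '-' and ';'
sample = "fruit-apple;red\nfruit-pear;green\n"

# A regex character class matches either delimiter;
# engine="python" is stated explicitly because regex separators need it
mixed = pd.read_csv(io.StringIO(sample), sep=r"[-;]", header=None, engine="python")
print(mixed)
```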

Reading a SQLite file

  • Import sqlite3 and open a connection
import sqlite3
conn = sqlite3.connect('./data.sqlite')

 

  • pd.read_sql("SQL statement", con)
df_weather = pd.read_sql('select * from Weather_2017',conn)
df_weather.shape
Out: (26352, 9)

df_weather
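The same read_sql pattern can be tried without the data.sqlite file. The sketch below builds a throwaway in-memory database; the table name `weather_demo` and its rows are hypothetical. It also passes a value through `params` rather than formatting it into the SQL string.

```python
import sqlite3
import pandas as pd

# In-memory database, so no file is needed; table and rows are made up
conn_demo = sqlite3.connect(":memory:")
conn_demo.execute("CREATE TABLE weather_demo (day TEXT, temp_c REAL)")
conn_demo.executemany(
    "INSERT INTO weather_demo VALUES (?, ?)",
    [("2017-01-01", -3.5), ("2017-01-02", 1.2), ("2017-01-03", 4.0)],
)
conn_demo.commit()

# '?' is the sqlite3 placeholder; params fills it in safely
warm = pd.read_sql(
    "SELECT * FROM weather_demo WHERE temp_c > ? ORDER BY day",
    conn_demo,
    params=(0,),
)
print(warm.shape)
conn_demo.close()
```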

Writing data out to SQL and other formats

  • ① Convert the format of the time field
df_weather.dtypes
Out:
index                   int64
Date/Time              object
Temp (C)              float64
Dew Point Temp (C)    float64
Rel Hum (%)             int64
Wind Spd (km/h)         int64
Visibility (km)       float64
Stn Press (kPa)       float64
Weather                object
dtype: object

df_weather['Date/Time'] = pd.to_datetime(df_weather['Date/Time'],format='%d/%m/%Y')
df_weather.dtypes
Out:
index                          int64
Date/Time             datetime64[ns]
Temp (C)                     float64
Dew Point Temp (C)           float64
Rel Hum (%)                    int64
Wind Spd (km/h)                int64
Visibility (km)              float64
Stn Press (kPa)              float64
Weather                       object
dtype: object
  • ② Write the data out in csv/json/html/sql format
df_weather.to_csv('./weather.csv')
df_weather.to_json('./weather.json')
df_weather.to_html('./weather.html')
df_weather.to_sql('Weather_2018',conn)

# Read it back
pd.read_json('./weather.json').sort_index()
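The two steps above (convert the date column, then write and read back) can be sketched end to end on a tiny made-up frame. The column names mirror the weather table, but the values, the `weather_demo` table name, and the use of in-memory targets are assumptions made for illustration.

```python
import io
import sqlite3
import pandas as pd

# Hypothetical two-row frame with day/month/year strings
demo = pd.DataFrame({
    "Date/Time": ["01/02/2017", "15/02/2017"],
    "Temp (C)": [-3.5, 1.2],
})

# Explicit format: day first, then month, then year
demo["Date/Time"] = pd.to_datetime(demo["Date/Time"], format="%d/%m/%Y")

# CSV round trip through an in-memory buffer; index=False skips the row index
buffer = io.StringIO()
demo.to_csv(buffer, index=False)
buffer.seek(0)
from_csv = pd.read_csv(buffer, parse_dates=["Date/Time"])

# SQLite round trip; parse_dates turns the stored text back into datetimes
conn_demo = sqlite3.connect(":memory:")
demo.to_sql("weather_demo", conn_demo, index=False)
from_sql = pd.read_sql(
    "SELECT * FROM weather_demo", conn_demo, parse_dates=["Date/Time"]
)
conn_demo.close()

print(from_csv.dtypes)
print(from_sql.dtypes)
```

Writing with `index=False` avoids an extra index column appearing when the file or table is read back.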

Connecting to MySQL with pymysql

  • Import

import pymysql
  • Connect
conn = pymysql.connect(host = 'localhost',port = 3306,
    user = 'softpo',password = 'root',db = 'books',charset='utf8')
conn
  • Read
book = pd.read_sql('select * from shu limit 30',conn)
book.shape
Out: (30, 5)
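  • Writing directly through the pymysql connection raises an error! (to_sql accepts only a SQLAlchemy connectable or a sqlite3 connection.)
# Write data to MySQL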
book.to_sql('read',conn)
Out:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~\Anaconda3\lib\site-packages\pandas\io\sql.py in execute(self, *args, **kwargs)
   1399             else:
-> 1400                 cur.execute(*args)
   1401             return cur
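... (rest of the traceback omitted)
  • The correct way to write
from sqlalchemy import create_engine
'''
To write data into a MySQL database, first create the connection with
sqlalchemy.create_engine, and set the character encoding to utf8;
otherwise some Latin characters cannot be handled.
'''
# Note: the mysql+mysqldb dialect needs the mysqlclient package;
# with only pymysql installed, use mysql+pymysql:// instead
conn = create_engine('mysql+mysqldb://softpo:root@localhost:3306/books?charset=utf8')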

book.to_sql('read',conn,index=False,if_exists='append')
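The `if_exists` argument decides what happens when the target table is already there. The sketch below illustrates its three values, using an in-memory sqlite3 connection as a stand-in for the MySQL engine so that it runs without a server; the frame and the `read_demo` table name are hypothetical.

```python
import sqlite3
import pandas as pd

# sqlite3 stands in for the MySQL engine here (an assumption for the demo)
conn_demo = sqlite3.connect(":memory:")
book_demo = pd.DataFrame({"title": ["A", "B"], "price": [10.0, 12.5]})


def count_rows(connection):
    """Return the number of rows currently in read_demo."""
    result = pd.read_sql("SELECT COUNT(*) AS n FROM read_demo", connection)
    return int(result["n"].iloc[0])


# First write creates the table
book_demo.to_sql("read_demo", conn_demo, index=False)

# 'append' adds rows to the existing table
book_demo.to_sql("read_demo", conn_demo, index=False, if_exists="append")
rows_after_append = count_rows(conn_demo)

# The default, 'fail', raises ValueError when the table already exists
try:
    book_demo.to_sql("read_demo", conn_demo, index=False)
    failed = False
except ValueError:
    failed = True

# 'replace' drops the table and writes it again
book_demo.to_sql("read_demo", conn_demo, index=False, if_exists="replace")
rows_after_replace = count_rows(conn_demo)

print(rows_after_append, failed, rows_after_replace)
conn_demo.close()
```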

 

Fetching data from the web by URL

url = 'https://raw.githubusercontent.com/datasets/investor-flow-of-funds-us/master/data/weekly.csv'

df = pd.read_csv(url)
df
