Using pandas to_dict

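Introduction: the to_dict method in pandas converts a DataFrame into a dictionary (or, for one option, a list of dictionaries).
The orient parameter selects one of six output layouts: 'dict', 'list', 'series', 'split', 'records', and 'index'. Each one is described in turn below.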

Help on method to_dict in module pandas.core.frame:

to_dict(orient='dict') method of pandas.core.frame.DataFrame instance
    Convert DataFrame to dictionary.

    Parameters
    ----------
    orient : str {'dict', 'list', 'series', 'split', 'records', 'index'}
        Determines the type of the values of the dictionary.

        - dict (default) : dict like {column -> {index -> value}}
        - list : dict like {column -> [values]}
        - series : dict like {column -> Series(values)}
        - split : dict like
          {index -> [index], columns -> [columns], data -> [values]}
        - records : list like
          [{column -> value}, ... , {column -> value}]
        - index : dict like {index -> {column -> value}}

          .. versionadded:: 0.17.0

        Abbreviations are allowed. `s` indicates `series` and `sp`
        indicates `split`.

    Returns
    -------
    result : dict like {column -> {index -> value}}
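
As a quick overview before the individual cases, the sketch below calls to_dict once per orient value and records the type of the returned container. The frame df is a hypothetical three-row stand-in, not the data used in the rest of this post. Note that although the Returns section above mentions only a dict, 'records' returns a list. Full orient names are used here rather than the abbreviations mentioned in the help text, on the assumption that newer pandas releases may not accept abbreviations.

```python
import pandas as pd

# Hypothetical three-row frame, used only for illustration.
df = pd.DataFrame(
    {"pclass": ["3rd", "1st", "3rd"], "age": [31.0, 29.0, 32.0]},
    index=[1086, 12, 833],
)

orients = ["dict", "list", "series", "split", "records", "index"]

# Map each orient name to the type name of the object to_dict returns.
container_types = {o: type(df.to_dict(orient=o)).__name__ for o in orients}

for name, type_name in container_types.items():
    print(name, "->", type_name)
```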

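1. orient='dict'
'dict' is the default value. In what follows, data is a DataFrame; the result is a dictionary with the structure {column -> {index -> value}}, which can be seen as a two-level (nested) dictionary:
- each column's values are paired with their index labels to form an inner dictionary
- the column names then become the outer keys, with those inner dictionaries as their values

Lookup takes the form data_dict[key1][key2]:
- data_dict is the result of calling to_dict with orient='dict'
- key1 is a column name (outer key)
- key2 is an index label (key of the inner dictionary)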

data  
Out[9]: 
     pclass        age     embarked                      home.dest     sex
1086    3rd  31.194181      UNKNOWN                        UNKNOWN    male
12      1st  31.194181    Cherbourg                  Paris, France  female
1036    3rd  31.194181      UNKNOWN                        UNKNOWN    male
833     3rd  32.000000  Southampton  Foresvik, Norway Portland, ND    male
1108    3rd  31.194181      UNKNOWN                        UNKNOWN    male
562     2nd  41.000000    Cherbourg                   New York, NY    male
437     2nd  48.000000  Southampton   Somerset / Bernardsville, NJ  female
663     3rd  26.000000  Southampton                        UNKNOWN    male
669     3rd  19.000000  Southampton                        England    male
507     2nd  31.194181  Southampton               Petworth, Sussex    male
In[10]: data_dict=data.to_dict(orient= 'dict')
In[11]: data_dict
Out[11]: 
{'age': {12: 31.19418104265403,
  437: 48.0,
  507: 31.19418104265403,
  562: 41.0,
  663: 26.0,
  669: 19.0,
  833: 32.0,
  1036: 31.19418104265403,
  1086: 31.19418104265403,
  1108: 31.19418104265403},
 'embarked': {12: 'Cherbourg',
  437: 'Southampton',
  507: 'Southampton',
  562: 'Cherbourg',
  663: 'Southampton',
  669: 'Southampton',
  833: 'Southampton',
  1036: 'UNKNOWN',
  1086: 'UNKNOWN',
  1108: 'UNKNOWN'},
 'home.dest': {12: 'Paris, France',
  437: 'Somerset / Bernardsville, NJ',
  507: 'Petworth, Sussex',
  562: 'New York, NY',
  663: 'UNKNOWN',
  669: 'England',
  833: 'Foresvik, Norway Portland, ND',
  1036: 'UNKNOWN',
  1086: 'UNKNOWN',
  1108: 'UNKNOWN'},
 'pclass': {12: '1st',
  437: '2nd',
  507: '2nd',
  562: '2nd',
  663: '3rd',
  669: '3rd',
  833: '3rd',
  1036: '3rd',
  1086: '3rd',
  1108: '3rd'},
 'sex': {12: 'female',
  437: 'female',
  507: 'male',
  562: 'male',
  663: 'male',
  669: 'male',
  833: 'male',
  1036: 'male',
  1086: 'male',
  1108: 'male'}}
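
The data_dict[key1][key2] lookup can be sketched on a smaller frame. The three rows below are an assumed subset of the data shown above (same index labels, fewer columns), rebuilt by hand so the snippet runs on its own.

```python
import pandas as pd

# Assumption: a hand-built three-row subset of the `data` frame above.
data = pd.DataFrame(
    {
        "pclass": ["3rd", "1st", "3rd"],
        "age": [31.194181, 31.194181, 32.0],
        "sex": ["male", "female", "male"],
    },
    index=[1086, 12, 833],
)

data_dict = data.to_dict(orient="dict")

# Outer key: column name. Inner key: original index label.
pclass_of_12 = data_dict["pclass"][12]
age_of_833 = data_dict["age"][833]

print(pclass_of_12, age_of_833)
```

The inner keys are the index labels themselves (1086, 12, 833), not positions, so data_dict["pclass"][0] would raise a KeyError here.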

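2. orient='list'
This is similar to case 1, except that the inner level is a list; the structure is {column -> [values]}
Lookup takes the form data_list[keys][index]:

  • data_list is the result of calling to_dict with orient='list'
  • keys is a column name, such as 'age' or 'embarked' in this example
  • index is an integer position, running from 0 up to the number of rows minus 1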
In[19]: data_list=data.to_dict(orient='list')

In[20]: data_list
Out[20]: 
{'age': [31.19418104265403,
  31.19418104265403,
  31.19418104265403,
  32.0,
  31.19418104265403,
  41.0,
  48.0,
  26.0,
  19.0,
  31.19418104265403],
 'embarked': ['UNKNOWN',
  'Cherbourg',
  'UNKNOWN',
  'Southampton',
  'UNKNOWN',
  'Cherbourg',
  'Southampton',
  'Southampton',
  'Southampton',
  'Southampton'],
 'home.dest': ['UNKNOWN',
  'Paris, France',
  'UNKNOWN',
  'Foresvik, Norway Portland, ND',
  'UNKNOWN',
  'New York, NY',
  'Somerset / Bernardsville, NJ',
  'UNKNOWN',
  'England',
  'Petworth, Sussex'],
 'pclass': ['3rd',
  '1st',
  '3rd',
  '3rd',
  '3rd',
  '2nd',
  '2nd',
  '3rd',
  '3rd',
  '2nd'],
 'sex': ['male',
  'female',
  'male',
  'male',
  'male',
  'male',
  'female',
  'male',
  'male',
  'male']}
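
A minimal sketch of the data_list[keys][index] lookup, again on an assumed three-row subset of the data above. The point it illustrates is that the original index labels are dropped and values are addressed by row position.

```python
import pandas as pd

# Assumption: a hand-built three-row subset of the `data` frame above.
data = pd.DataFrame(
    {
        "pclass": ["3rd", "1st", "3rd"],
        "age": [31.194181, 31.194181, 32.0],
        "sex": ["male", "female", "male"],
    },
    index=[1086, 12, 833],
)

data_list = data.to_dict(orient="list")

# Values keep the row order of the DataFrame; position 1 is the row
# whose original index label was 12.
second_pclass = data_list["pclass"][1]
last_age = data_list["age"][-1]

print(second_pclass, last_age)
```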

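3. orient='series'
The structure is {column -> Series(values)}
Lookup takes the form data_series[key1][key2], or data_series[key1] to get a whole column:

  • data_series is the result of calling to_dict with orient='series'
  • key1 is a column name, such as 'age' or 'embarked' in this example
  • key2 is a label from the original index (optional)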

In[21]: data_series=data.to_dict(orient='series')
In[22]: data_series
Out[22]: 
{'age': 1086    31.194181
 12      31.194181
 1036    31.194181
 833     32.000000
 1108    31.194181
 562     41.000000
 437     48.000000
 663     26.000000
 669     19.000000
 507     31.194181
 Name: age, dtype: float64, 'embarked': 1086        UNKNOWN
 12        Cherbourg
 1036        UNKNOWN
 833     Southampton
 1108        UNKNOWN
 562       Cherbourg
 437     Southampton
 663     Southampton
 669     Southampton
 507     Southampton
 Name: embarked, dtype: object, 'home.dest': 1086                          UNKNOWN
 12                      Paris, France
 1036                          UNKNOWN
 833     Foresvik, Norway Portland, ND
 1108                          UNKNOWN
 562                      New York, NY
 437      Somerset / Bernardsville, NJ
 663                           UNKNOWN
 669                           England
 507                  Petworth, Sussex
 Name: home.dest, dtype: object, 'pclass': 1086    3rd
 12      1st
 1036    3rd
 833     3rd
 1108    3rd
 562     2nd
 437     2nd
 663     3rd
 669     3rd
 507     2nd
 Name: pclass, dtype: object, 'sex': 1086      male
 12      female
 1036      male
 833       male
 1108      male
 562       male
 437     female
 663       male
 669       male
 507       male
 Name: sex, dtype: object}
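
Because each value is a real pandas Series, the usual Series methods remain available after the conversion. The sketch below uses an assumed three-row subset of the data above and uses .loc for the label lookup to make explicit that key2 is a label rather than a position.

```python
import pandas as pd

# Assumption: a hand-built three-row subset of the `data` frame above.
data = pd.DataFrame(
    {
        "pclass": ["3rd", "1st", "3rd"],
        "age": [31.194181, 31.194181, 32.0],
        "sex": ["male", "female", "male"],
    },
    index=[1086, 12, 833],
)

data_series = data.to_dict(orient="series")

age = data_series["age"]        # a pandas Series with the original index
age_of_833 = age.loc[833]       # label-based lookup (key2)
oldest_label = age.idxmax()     # Series methods still work

print(type(age).__name__, age_of_833, oldest_label)
```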

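4. orient='split'
The structure is {index -> [index], columns -> [columns], data -> [values]}: the data, the index, and the column names are separated out and stored under their own keys
The three parts are accessed as data_split['index'], data_split['data'], and data_split['columns']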

data_split=data.to_dict(orient='split')

data_split
Out[38]: 
{'columns': ['pclass', 'age', 'embarked', 'home.dest', 'sex'],
 'data': [['3rd', 31.19418104265403, 'UNKNOWN', 'UNKNOWN', 'male'],
  ['1st', 31.19418104265403, 'Cherbourg', 'Paris, France', 'female'],
  ['3rd', 31.19418104265403, 'UNKNOWN', 'UNKNOWN', 'male'],
  ['3rd', 32.0, 'Southampton', 'Foresvik, Norway Portland, ND', 'male'],
  ['3rd', 31.19418104265403, 'UNKNOWN', 'UNKNOWN', 'male'],
  ['2nd', 41.0, 'Cherbourg', 'New York, NY', 'male'],
  ['2nd', 48.0, 'Southampton', 'Somerset / Bernardsville, NJ', 'female'],
  ['3rd', 26.0, 'Southampton', 'UNKNOWN', 'male'],
  ['3rd', 19.0, 'Southampton', 'England', 'male'],
  ['2nd', 31.19418104265403, 'Southampton', 'Petworth, Sussex', 'male']],
 'index': [1086, 12, 1036, 833, 1108, 562, 437, 663, 669, 507]}
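
Since the three parts map directly onto the arguments of the DataFrame constructor, a 'split' dictionary can be turned back into a frame. The sketch below shows this on an assumed three-row subset of the data above; it is one possible way to do the round trip, not the only one.

```python
import pandas as pd

# Assumption: a hand-built three-row subset of the `data` frame above.
data = pd.DataFrame(
    {
        "pclass": ["3rd", "1st", "3rd"],
        "age": [31.194181, 31.194181, 32.0],
        "sex": ["male", "female", "male"],
    },
    index=[1086, 12, 833],
)

data_split = data.to_dict(orient="split")

# Each part lives under its own key.
columns = data_split["columns"]
index = data_split["index"]
rows = data_split["data"]        # one inner list per row

# Rebuild a DataFrame from the three parts.
rebuilt = pd.DataFrame(rows, index=index, columns=columns)

print(rebuilt)
```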

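5. orient='records'
The structure is [{column -> value}, ... , {column -> value}]
The result as a whole is a list; each element is a dictionary built from one row of the original data
Lookup takes the form data_records[index][key1], where index is the row position (starting at 0) and key1 is a column name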

data_records=data.to_dict(orient='records')

data_records
Out[41]: 
[{'age': 31.19418104265403,
  'embarked': 'UNKNOWN',
  'home.dest': 'UNKNOWN',
  'pclass': '3rd',
  'sex': 'male'},
 {'age': 31.19418104265403,
  'embarked': 'Cherbourg',
  'home.dest': 'Paris, France',
  'pclass': '1st',
  'sex': 'female'},
 {'age': 31.19418104265403,
  'embarked': 'UNKNOWN',
  'home.dest': 'UNKNOWN',
  'pclass': '3rd',
  'sex': 'male'},
 {'age': 32.0,
  'embarked': 'Southampton',
  'home.dest': 'Foresvik, Norway Portland, ND',
  'pclass': '3rd',
  'sex': 'male'},
 {'age': 31.19418104265403,
  'embarked': 'UNKNOWN',
  'home.dest': 'UNKNOWN',
  'pclass': '3rd',
  'sex': 'male'},
 {'age': 41.0,
  'embarked': 'Cherbourg',
  'home.dest': 'New York, NY',
  'pclass': '2nd',
  'sex': 'male'},
 {'age': 48.0,
  'embarked': 'Southampton',
  'home.dest': 'Somerset / Bernardsville, NJ',
  'pclass': '2nd',
  'sex': 'female'},
 {'age': 26.0,
  'embarked': 'Southampton',
  'home.dest': 'UNKNOWN',
  'pclass': '3rd',
  'sex': 'male'},
 {'age': 19.0,
  'embarked': 'Southampton',
  'home.dest': 'England',
  'pclass': '3rd',
  'sex': 'male'},
 {'age': 31.19418104265403,
  'embarked': 'Southampton',
  'home.dest': 'Petworth, Sussex',
  'pclass': '2nd',
  'sex': 'male'}]
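
A list of row dictionaries is convenient for plain Python iteration. The sketch below, on an assumed three-row subset of the data above, shows the data_records[index][key1] lookup and a simple filter; note that the original index labels do not appear in the result.

```python
import pandas as pd

# Assumption: a hand-built three-row subset of the `data` frame above.
data = pd.DataFrame(
    {
        "pclass": ["3rd", "1st", "3rd"],
        "age": [31.194181, 31.194181, 32.0],
        "sex": ["male", "female", "male"],
    },
    index=[1086, 12, 833],
)

data_records = data.to_dict(orient="records")

# First subscript: row position. Second subscript: column name.
second_row_pclass = data_records[1]["pclass"]

# Each row is an ordinary dict, so list comprehensions work directly.
male_ages = [row["age"] for row in data_records if row["sex"] == "male"]

print(second_row_pclass, male_ages)
```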

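6. orient='index'
The structure is {index -> {column -> value}}. Lookup is exactly the reverse of the 'dict' case: the outer key is the index label and the inner key is the column name.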

data_index=data.to_dict(orient='index')

data_index
Out[43]: 
{12: {'age': 31.19418104265403,
  'embarked': 'Cherbourg',
  'home.dest': 'Paris, France',
  'pclass': '1st',
  'sex': 'female'},
 437: {'age': 48.0,
  'embarked': 'Southampton',
  'home.dest': 'Somerset / Bernardsville, NJ',
  'pclass': '2nd',
  'sex': 'female'},
 507: {'age': 31.19418104265403,
  'embarked': 'Southampton',
  'home.dest': 'Petworth, Sussex',
  'pclass': '2nd',
  'sex': 'male'},
 562: {'age': 41.0,
  'embarked': 'Cherbourg',
  'home.dest': 'New York, NY',
  'pclass': '2nd',
  'sex': 'male'},
 663: {'age': 26.0,
  'embarked': 'Southampton',
  'home.dest': 'UNKNOWN',
  'pclass': '3rd',
  'sex': 'male'},
 669: {'age': 19.0,
  'embarked': 'Southampton',
  'home.dest': 'England',
  'pclass': '3rd',
  'sex': 'male'},
 833: {'age': 32.0,
  'embarked': 'Southampton',
  'home.dest': 'Foresvik, Norway Portland, ND',
  'pclass': '3rd',
  'sex': 'male'},
 1036: {'age': 31.19418104265403,
  'embarked': 'UNKNOWN',
  'home.dest': 'UNKNOWN',
  'pclass': '3rd',
  'sex': 'male'},
 1086: {'age': 31.19418104265403,
  'embarked': 'UNKNOWN',
  'home.dest': 'UNKNOWN',
  'pclass': '3rd',
  'sex': 'male'},
 1108: {'age': 31.19418104265403,
  'embarked': 'UNKNOWN',
  'home.dest': 'UNKNOWN',
  'pclass': '3rd',
  'sex': 'male'}}
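
The relationship between 'index' and 'dict' can be checked directly: the same value is reached with the two keys swapped. The sketch below uses an assumed three-row subset of the data above.

```python
import pandas as pd

# Assumption: a hand-built three-row subset of the `data` frame above.
data = pd.DataFrame(
    {
        "pclass": ["3rd", "1st", "3rd"],
        "age": [31.194181, 31.194181, 32.0],
        "sex": ["male", "female", "male"],
    },
    index=[1086, 12, 833],
)

data_index = data.to_dict(orient="index")
data_dict = data.to_dict(orient="dict")

# orient='index': index label first, then column name.
via_index = data_index[12]["pclass"]

# orient='dict': column name first, then index label.
via_dict = data_dict["pclass"][12]

print(via_index, via_dict)
```

Which of the two is more convenient depends on the access pattern: 'index' suits fetching a whole row by its label, while 'dict' suits fetching a whole column by its name.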
