Handling txt files that contain both strings (str) and numbers (float) in a Python project

1. If you use np.genfromtxt, you can specify dtype=None, which tells genfromtxt to infer the dtype of each column from its contents. Most conveniently, it relieves you of the burden of specifying the number of bytes required for the string column. (Omitting the number of bytes, e.g. by specifying np.str, does not work.)

In [58]: np.genfromtxt('data.txt', delimiter=',', dtype=None, names=('sepal length', 'sepal width', 'petal length', 'petal width', 'label'))
Out[58]: array([(5.1, 3.5, 1.4, 0.2, 'Iris-setosa'),
       (4.9, 3.0, 1.4, 0.2, 'Iris-setosa'),
       (5.8, 2.7, 4.1, 1.0, 'Iris-versicolor'),
       (6.2, 2.2, 4.5, 1.5, 'Iris-versicolor'),
       (6.4, 3.1, 5.5, 1.8, 'Iris-virginica'),
       (6.0, 3.0, 4.8, 1.8, 'Iris-virginica')], 
      dtype=[('sepal_length', '
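
The call above can be tried without a file on disk. The following is a minimal sketch, not the original author's code: the inline sample rows, the underscore field names, and the explicit encoding="utf-8" argument are assumptions added for illustration (the encoding is passed so that the label column comes back as str on both older and newer NumPy versions).

```python
import io
import numpy as np

# Hypothetical stand-in for data.txt (three rows in the same comma-separated layout).
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "5.8,2.7,4.1,1.0,Iris-versicolor\n"
    "6.4,3.1,5.5,1.8,Iris-virginica\n"
)

data = np.genfromtxt(
    sample,
    delimiter=",",
    dtype=None,        # let genfromtxt infer a dtype for each column
    encoding="utf-8",  # assumption: treat input as text so labels are str
    names=("sepal_length", "sepal_width", "petal_length", "petal_width", "label"),
)

# The result is a 1-D structured array; each column is accessed by field name.
print(data.dtype.names)
print(data["sepal_length"].mean())
print(data["label"])
```

Because the result is a structured array rather than a 2-D float array, numeric columns and the string column keep their own dtypes and are selected by name.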

2. If you do want to use np.loadtxt, then to fix your code with minimal changes, you could use:

import numpy as np
# np.float was removed in NumPy 1.24; the builtin float is used instead.
np.loadtxt("data.txt", dtype={'names': ('sepal length', 'sepal width', 'petal length', 'petal width', 'label'),
          'formats': (float, float, float, float, '|S15')}, delimiter=',', skiprows=0)

The main difference is simply changing np.str to |S15 (a 15-byte string).

Also note that open("data.txt"), 'r' should be open("data.txt", 'r'). But since np.loadtxt can accept a filename, you don't really need to use open at all.
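
As a runnable illustration of the loadtxt approach, here is a minimal sketch under stated assumptions: the inline sample rows and the underscore field names are hypothetical stand-ins for data.txt and the original column names. It also shows that an 'S15' field holds bytes, which can be converted to str afterwards if needed.

```python
import io
import numpy as np

# Hypothetical stand-in for data.txt.
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "6.2,2.2,4.5,1.5,Iris-versicolor\n"
    "6.0,3.0,4.8,1.8,Iris-virginica\n"
)

# Structured dtype: four float columns and one 15-byte string column.
row_dtype = {
    "names": ("sepal_length", "sepal_width", "petal_length", "petal_width", "label"),
    "formats": (float, float, float, float, "S15"),
}

rows = np.loadtxt(sample, dtype=row_dtype, delimiter=",")

# 'S15' stores raw bytes; convert to a unicode array when str values are wanted.
labels = rows["label"].astype("U15")

print(rows["label"])
print(labels)
```

The width 15 is just large enough for the longest label here ("Iris-versicolor"); a longer label would be truncated, so the width has to be chosen to fit the data.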

Reference: https://stackoverflow.com/questions/23546349/loading-text-file-containing-both-float-and-string-using-numpy-loadtxt
