AlwaysOnline999

cs231n: Reading the experiment image data with Python 3.6.4, with explanations of the assignment code

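Author: AlwaysOnline
Date: March 2018.

Notice: All rights reserved. Please credit the source when reposting. Thank you!
URL: https://blog.csdn.net/a781751136/article/details/79725922

For reference only. Corrections are welcome!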

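1. I assume you have already found a translated version of the cs231n course notes elsewhere. For the theory behind nearest neighbor and kNN, read other people's blogs; I will go straight to the practical problems you run into when actually running the code.

2. Download the data from http://www.cs.toronto.edu/~kriz/cifar.html. On that page, download the Python version of the dataset.

3. Extract the data and place it under your Python working directory.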
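Before running the loader, it can help to confirm that the extracted folder really contains the batch files the code below opens. The following is a minimal sketch, not part of the original post: the helper name missing_cifar_files is hypothetical, and the file names and folder path are the ones used later in this post.

```python
import os

# Batch file names that load_CIFAR10 (further below in this post) tries to open.
EXPECTED_FILES = ['data_batch_%d' % b for b in range(1, 6)] + ['test_batch']

def missing_cifar_files(root):
    """Return the expected batch files that are not present under root."""
    return [name for name in EXPECTED_FILES
            if not os.path.isfile(os.path.join(root, name))]

if __name__ == '__main__':
    # Folder path as used in step 6 of this post; adjust to your own location.
    root = 'cifar-10-python/cifar-10-batches-py'
    missing = missing_cifar_files(root)
    print('all batch files found' if not missing else 'missing: %s' % missing)
```

If the list of missing files is not empty, the path most likely points at the wrong folder or at the archive file itself.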

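4. Now the key point. Beginners starting cs231n often do not know how to read the data: Python has no built-in function for loading this dataset, so you need to download the course's loader code. That code is written for Python 2.x and needs small adjustments before it runs on Python 3.6.4 and can read the data. The adjusted code is given below; save it as a .py file and run it in your Python environment.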

# -*- coding: utf-8 -*-
"""
Created on Tue Mar 27 16:03:04 2018
@author: 78175
"""
import pickle  # Python 2.x uses cPickle; code copied from elsewhere fails on 3.x, where the module is called pickle
import numpy as np
import os
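try:
  from scipy.misc import imread  # removed in SciPy 1.2; only load_tiny_imagenet needs it
except ImportError:
  imread = None  # CIFAR-10 loading below still works without it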
def load_CIFAR_batch(filename):
  """ load single batch of cifar """
  with open(filename, 'rb') as f:
    # encoding='iso-8859-1' is unnecessary on 2.x but required on 3.x;
    # without it pickle.load fails with a decoding error
    datadict = pickle.load(f, encoding='iso-8859-1')
    X = datadict['data']
    Y = datadict['labels']
    X = X.reshape(10000, 3, 32, 32).transpose(0,2,3,1).astype("float")
    Y = np.array(Y)
    return X, Y

def load_CIFAR10(ROOT):  # ROOT is the dataset folder; extract the dataset under your Python working directory
  """ load all of cifar """
  xs = []
  ys = []
  for b in range(1,6):
    f = os.path.join(ROOT, 'data_batch_%d' % (b, ))
    X, Y = load_CIFAR_batch(f)
    xs.append(X)
    ys.append(Y)    
  Xtr = np.concatenate(xs)
  Ytr = np.concatenate(ys)
  del X, Y
  Xte, Yte = load_CIFAR_batch(os.path.join(ROOT, 'test_batch'))
  return Xtr, Ytr, Xte, Yte

def load_tiny_imagenet(path, dtype=np.float32):
  """
  Load TinyImageNet. Each of TinyImageNet-100-A, TinyImageNet-100-B, and
  TinyImageNet-200 have the same directory structure, so this can be used
  to load any of them.

  Inputs:
  - path: String giving path to the directory to load.
  - dtype: numpy datatype used to load the data.

  Returns: A tuple of
  - class_names: A list where class_names[i] is a list of strings giving the
    WordNet names for class i in the loaded dataset.
  - X_train: (N_tr, 3, 64, 64) array of training images
  - y_train: (N_tr,) array of training labels
  - X_val: (N_val, 3, 64, 64) array of validation images
  - y_val: (N_val,) array of validation labels
  - X_test: (N_test, 3, 64, 64) array of testing images.
  - y_test: (N_test,) array of test labels; if test labels are not available
    (such as in student code) then y_test will be None.
  """
  # First load wnids
  with open(os.path.join(path, 'wnids.txt'), 'r') as f:
    wnids = [x.strip() for x in f]

  # Map wnids to integer labels
  wnid_to_label = {wnid: i for i, wnid in enumerate(wnids)}

  # Use words.txt to get names for each class
  with open(os.path.join(path, 'words.txt'), 'r') as f:
    wnid_to_words = dict(line.split('\t') for line in f)
    for wnid, words in wnid_to_words.items():  # iteritems() exists only on Python 2
      wnid_to_words[wnid] = [w.strip() for w in words.split(',')]
  class_names = [wnid_to_words[wnid] for wnid in wnids]

  # Next load training data.
  X_train = []
  y_train = []
  for i, wnid in enumerate(wnids):
    if (i + 1) % 20 == 0:
      print ('loading training data for synset %d / %d' % (i + 1, len(wnids)))
    # To figure out the filenames we need to open the boxes file
    boxes_file = os.path.join(path, 'train', wnid, '%s_boxes.txt' % wnid)
    with open(boxes_file, 'r') as f:
      filenames = [x.split('\t')[0] for x in f]
    num_images = len(filenames)
    
    X_train_block = np.zeros((num_images, 3, 64, 64), dtype=dtype)
    y_train_block = wnid_to_label[wnid] * np.ones(num_images, dtype=np.int64)
    for j, img_file in enumerate(filenames):
      img_file = os.path.join(path, 'train', wnid, 'images', img_file)
      img = imread(img_file)
      if img.ndim == 2:
        ## grayscale file
        img.shape = (64, 64, 1)
      X_train_block[j] = img.transpose(2, 0, 1)
    X_train.append(X_train_block)
    y_train.append(y_train_block)
      
  # We need to concatenate all training data
  X_train = np.concatenate(X_train, axis=0)
  y_train = np.concatenate(y_train, axis=0)
  
  # Next load validation data
  with open(os.path.join(path, 'val', 'val_annotations.txt'), 'r') as f:
    img_files = []
    val_wnids = []
    for line in f:
      img_file, wnid = line.split('\t')[:2]
      img_files.append(img_file)
      val_wnids.append(wnid)
    num_val = len(img_files)
    y_val = np.array([wnid_to_label[wnid] for wnid in val_wnids])
    X_val = np.zeros((num_val, 3, 64, 64), dtype=dtype)
    for i, img_file in enumerate(img_files):
      img_file = os.path.join(path, 'val', 'images', img_file)
      img = imread(img_file)
      if img.ndim == 2:
        img.shape = (64, 64, 1)
      X_val[i] = img.transpose(2, 0, 1)

  # Next load test images
  # Students won't have test labels, so we need to iterate over files in the
  # images directory.
  img_files = os.listdir(os.path.join(path, 'test', 'images'))
  X_test = np.zeros((len(img_files), 3, 64, 64), dtype=dtype)
  for i, img_file in enumerate(img_files):
    img_file = os.path.join(path, 'test', 'images', img_file)
    img = imread(img_file)
    if img.ndim == 2:
      img.shape = (64, 64, 1)
    X_test[i] = img.transpose(2, 0, 1)

  y_test = None
  y_test_file = os.path.join(path, 'test', 'test_annotations.txt')
  if os.path.isfile(y_test_file):
    with open(y_test_file, 'r') as f:
      img_file_to_wnid = {}
      for line in f:
        line = line.split('\t')
        img_file_to_wnid[line[0]] = line[1]
    y_test = [wnid_to_label[img_file_to_wnid[img_file]] for img_file in img_files]
    y_test = np.array(y_test)
  
  return class_names, X_train, y_train, X_val, y_val, X_test, y_test


def load_models(models_dir):
  """
  Load saved models from disk. This will attempt to unpickle all files in a
  directory; any files that give errors on unpickling (such as README.txt) will
  be skipped.

  Inputs:
  - models_dir: String giving the path to a directory containing model files.
    Each model file is a pickled dictionary with a 'model' field.

  Returns:
  A dictionary mapping model file names to models.
  """
  models = {}
  for model_file in os.listdir(models_dir):
    with open(os.path.join(models_dir, model_file), 'rb') as f:
      try:
        models[model_file] = pickle.load(f)['model']
      except pickle.UnpicklingError:
        continue
  return models

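5. You do not really need to understand the long listing above. These are just data-loading functions and do not affect the steps that follow.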
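For readers who are curious why the encoding argument matters, here is a small sketch. It assumes a hand-written pickle byte string (PY2_PICKLE, purely illustrative) that mimics what Python 2 writes for a str containing a non-ASCII byte; the helper name load_py2_pickle is hypothetical.

```python
import pickle

# Hand-written protocol-2 pickle of the one-byte Python 2 str '\xe9':
# PROTO 2, SHORT_BINSTRING of length 1, STOP. Illustrative bytes only.
PY2_PICKLE = b'\x80\x02U\x01\xe9.'

def load_py2_pickle(data):
    """Decode Python 2 str objects as iso-8859-1, as load_CIFAR_batch does."""
    return pickle.loads(data, encoding='iso-8859-1')

# With the default encoding (ASCII) the non-ASCII byte cannot be decoded.
try:
    pickle.loads(PY2_PICKLE)
    default_ok = True
except UnicodeDecodeError:
    default_ok = False

value = load_py2_pickle(PY2_PICKLE)
print(default_ok, repr(value))
```

iso-8859-1 maps every byte to exactly one character, so decoding with it cannot fail, which is presumably why it is a convenient choice for pickles written by Python 2.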

6. Next, the code that reads the data:

# The argument must be the dataset folder, not a file
Xtr, Ytr, Xte, Yte = load_CIFAR10('cifar-10-python/cifar-10-batches-py/')
# Training set, reshaped into a 50000 x 3072 matrix; 3072 = 32*32*3, one feature per color value of each pixel
Xtr_rows = Xtr.reshape(Xtr.shape[0], 32 * 32 * 3)
# Test set, reshaped into a 10000 x 3072 matrix
Xte_rows = Xte.reshape(Xte.shape[0], 32 * 32 * 3)
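The shape changes are easier to follow on a small array. The sketch below uses a made-up batch of 4 "images" filled with consecutive numbers (an assumption for illustration; a real batch has 10000 rows) and applies the same reshape, transpose and flatten steps as the code in this post.

```python
import numpy as np

n = 4  # tiny stand-in for the 10000 images of a real batch
# Each raw row stores 1024 red values, then 1024 green, then 1024 blue.
raw = np.arange(n * 3072).reshape(n, 3072)

# Same steps as load_CIFAR_batch: (N, 3, 32, 32) -> (N, 32, 32, 3).
images = raw.reshape(n, 3, 32, 32).transpose(0, 2, 3, 1).astype("float")

# Same step as in step 6: flatten every image into one row of 3072 features.
rows = images.reshape(images.shape[0], 32 * 32 * 3)

print(images.shape, rows.shape)
# The top-left pixel of image 0 holds the first value of each color plane.
print(images[0, 0, 0])
```

After the transpose, the last axis holds the three color values of one pixel, so images[i] can be shown directly as a 32 x 32 RGB picture.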

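7. Train and predict

nn = NearestNeighbor() # create a nearest neighbor classifier object
nn.train(Xtr_rows, Ytr) # train on the training samples
Yte_predict = nn.predict(Xte_rows) # predict labels for the test set
print ('accuracy: %f' % ( np.mean(Yte_predict == Yte) )) # mean classification accuracy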

8. The NearestNeighbor class used in step 7 above (define it before running the code in step 7)

class NearestNeighbor(object):
  def __init__(self):
    pass

  def train(self, X, y):
    """ X is N x D where each row is an example. Y is 1-dimension of size N """
    # the nearest neighbor classifier simply remembers all the training data
    self.Xtr = X
    self.ytr = y
  def predict(self, X):
    """ X is N x D where each row is an example we wish to predict label for """
    num_test = X.shape[0]  # number of input samples
    # lets make sure that the output type matches the input type
    Ypred = np.zeros(num_test, dtype = self.ytr.dtype)
    # loop over all test rows
    for i in range(num_test):
      # find the nearest training image to the i'th test image
      # using the L1 distance (sum of absolute value differences)
      distances = np.sum(np.abs(self.Xtr - X[i,:]), axis = 1)  # L1 distance from every training image to the i'th test image
      min_index = np.argmin(distances) # index of the smallest distance
      Ypred[i] = self.ytr[min_index] # predict the label of the nearest example
    return Ypred
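To see the L1 nearest neighbor rule at work without loading CIFAR-10, here is a minimal sketch on made-up two-dimensional points. The function predict_l1 is a hypothetical stand-alone version of the predict loop above, and all data values are invented for illustration.

```python
import numpy as np

def predict_l1(Xtr, ytr, X):
    """Label each row of X with the label of its L1-nearest row in Xtr."""
    Ypred = np.zeros(X.shape[0], dtype=ytr.dtype)
    for i in range(X.shape[0]):
        # Sum of absolute differences to every training row.
        distances = np.sum(np.abs(Xtr - X[i, :]), axis=1)
        Ypred[i] = ytr[np.argmin(distances)]
    return Ypred

# Three made-up training points with labels 0, 1, 2.
Xtr = np.array([[0.0, 0.0], [10.0, 10.0], [20.0, 0.0]])
ytr = np.array([0, 1, 2])
# Two made-up queries: one near the first point, one near the third.
Xte = np.array([[1.0, 2.0], [18.0, 1.0]])

print(predict_l1(Xtr, ytr, Xte))  # -> [0 2]
```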

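Choice of distance. There are many ways to compute the distance between vectors. Another common choice is the L2 distance, which geometrically is the Euclidean distance between the two vectors. For two images I_1 and I_2 written as vectors, with p running over all pixel values, the L2 distance is:

$$ d_2(I_1, I_2) = \sqrt{\sum_p \left( I_1^p - I_2^p \right)^2} $$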

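In other words, we still compute the pixel-wise differences, but this time we square each of them, add the squares up, and finally take the square root of the sum. In NumPy, only one line of the code above needs to be replaced: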

distances = np.sqrt(np.sum(np.square(self.Xtr - X[i,:]), axis = 1))

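Note that np.sqrt is used here, but in practice it can be omitted. The square root is a monotonic function: it changes the absolute size of the distances but preserves their ordering, so the nearest neighbor comes out the same with or without it. If you run this model on CIFAR-10, the accuracy is 35.4%, slightly lower than with the L1 distance.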
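The claim about the square root can be checked numerically. The sketch below uses made-up random data with a fixed seed (an assumption for illustration) and compares the index of the nearest training row with and without np.sqrt.

```python
import numpy as np

rng = np.random.RandomState(0)  # made-up data, fixed seed for repeatability
Xtr = rng.rand(50, 8)           # 50 hypothetical training rows
x = rng.rand(8)                 # one hypothetical test row

squared = np.sum(np.square(Xtr - x), axis=1)  # squared L2 distances
with_sqrt = np.sqrt(squared)                  # true L2 distances

# The nearest row is the same either way, because sqrt preserves ordering.
same_nearest = bool(np.argmin(squared) == np.argmin(with_sqrt))
print(same_nearest)
```

Skipping the square root saves one pass over the distance array per test image, which is why it is often dropped when only the ranking matters.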

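L1 vs. L2. Comparing the two metrics is interesting. L2 is less forgiving than L1 of differences between two vectors: it prefers many medium-sized differences to one large one. L1 and L2 are the most commonly used special cases of the p-norm.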
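A small worked example makes the comparison concrete. The two difference vectors below are made up for illustration: one has a single large entry, the other has several medium entries with the same total.

```python
import numpy as np

def l1(d):
    """L1 norm: sum of absolute values."""
    return float(np.sum(np.abs(d)))

def l2(d):
    """L2 norm: square root of the sum of squares."""
    return float(np.sqrt(np.sum(np.square(d))))

one_big = np.array([4.0, 0.0, 0.0, 0.0])      # one large difference
many_medium = np.array([1.0, 1.0, 1.0, 1.0])  # four medium differences

print(l1(one_big), l1(many_medium))  # -> 4.0 4.0
print(l2(one_big), l2(many_medium))  # -> 4.0 2.0
```

L1 rates both vectors as equally far apart, while L2 rates the vector with one large difference as twice as far, which is the sense in which L2 prefers many medium differences.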

9. kNN: the KNearestNeighbor class. Straight to the code:

# -*- coding: utf-8 -*-
"""
Created on Wed Mar 28 09:49:26 2018

@author: 78175
"""
import numpy as np
 
class KNearestNeighbor(object):  # first, define a class that implements kNN
  """ a kNN classifier with L2 distance """
 
  def __init__(self):
    pass
 
  def train(self, X, y):
    """
    Train the classifier. For k-nearest neighbors this is just
    memorizing the training data.
 
    Inputs:
    - X: A numpy array of shape (num_train, D) containing the training data
      consisting of num_train samples each of dimension D.
    - y: A numpy array of shape (N,) containing the training labels, where
         y[i] is the label for X[i].
    """
    self.X_train = X
    self.y_train = y
     
  def predict(self, X, k=1, num_loops=0):
    """
    Predict labels for test data using this classifier.
 
    Inputs:
    - X: A numpy array of shape (num_test, D) containing test data consisting
         of num_test samples each of dimension D.
    - k: The number of nearest neighbors that vote for the predicted labels.
    - num_loops: Determines which implementation to use to compute distances
      between training points and testing points.
 
    Returns:
    - y: A numpy array of shape (num_test,) containing predicted labels for the
      test data, where y[i] is the predicted label for the test point X[i]. 
    """
    if num_loops == 0:  # choose one of three ways of computing the distances
      dists = self.compute_distances_no_loops(X)
    elif num_loops == 1:
      dists = self.compute_distances_one_loop(X)
    elif num_loops == 2:
      dists = self.compute_distances_two_loops(X)
    else:
      raise ValueError('Invalid value %d for num_loops' % num_loops)
 
    return self.predict_labels(dists, k=k)
 
  def compute_distances_two_loops(self, X):  # two nested loops
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using a nested loop over both the training data and the
    test data.
 
    Inputs:
    - X: A numpy array of shape (num_test, D) containing test data.
 
    Returns:
    - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
      is the Euclidean distance between the ith test point and the jth training
      point.
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in  range(num_test):
      for j in  range(num_train):
        dists[i][j] = np.sqrt(np.sum(np.square(self.X_train[j,:] - X[i,:])))
        #####################################################################
        # TODO:                                                             #
        # Compute the l2 distance between the ith test point and the jth    #
        # training point, and store the result in dists[i, j]. You should   #
        # not use a loop over dimension.                                    #
        #####################################################################
        #####################################################################
        #                       END OF YOUR CODE                            #
        #####################################################################
    return dists
 
  def compute_distances_one_loop(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using a single loop over the test data.
 
    Input / Output: Same as compute_distances_two_loops
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
      #######################################################################
      # TODO:                                                               #
      # Compute the l2 distance between the ith test point and all training #
      # points, and store the result in dists[i, :].                        #
      #######################################################################
      dists[i,:] = np.sqrt(np.sum(np.square(self.X_train-X[i,:]),axis = 1)) 
      #######################################################################
      #                         END OF YOUR CODE                            #
      #######################################################################
    return dists
 
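  # No loops, matrix operations only. This may raise MemoryError when earlier
  # runs still hold too much memory: restart the interpreter and try again,
  # preferably with 64-bit Python.
  def compute_distances_no_loops(self, X):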
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using no explicit loops.
 
    Input / Output: Same as compute_distances_two_loops
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    #########################################################################
    # TODO:                                                                 #
    # Compute the l2 distance between all test points and all training      #
    # points without using any explicit loops, and store the result in      #
    # dists.                                                                #
    #                                                                       #
    # You should implement this function using only basic array operations; #
    # in particular you should not use functions from scipy.                #
    #                                                                       #
    # HINT: Try to formulate the l2 distance using matrix multiplication    #
    #       and two broadcast sums.                                         #
    #########################################################################
    dists = np.multiply(np.dot(X,self.X_train.T),-2) 
    sq1 = np.sum(np.square(X),axis=1,keepdims = True)  # keepdims keeps the result two-dimensional, shape (num_test, 1)
    sq2 = np.sum(np.square(self.X_train),axis=1) 
    dists = np.add(dists,sq1) 
    dists = np.add(dists,sq2) 
    dists = np.sqrt(dists)
    #########################################################################
    #                         END OF YOUR CODE                              #
    #########################################################################
    return dists
 
  def predict_labels(self, dists, k=1):
    """
    Given a matrix of distances between test points and training points,
    predict a label for each test point.
 
    Inputs:
    - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
      gives the distance betwen the ith test point and the jth training point.
 
    Returns:
    - y: A numpy array of shape (num_test,) containing predicted labels for the
      test data, where y[i] is the predicted label for the test point X[i]. 
    """
    num_test = dists.shape[0]
    y_pred = np.zeros(num_test)
    for i in range(num_test):
      # A list of length k storing the labels of the k nearest neighbors to
      # the ith test point.
      closest_y = [] 
      ######################################################################### 
      # TODO:                                                                 # 
      # Use the distance matrix to find the k nearest neighbors of the ith    # 
      # training point, and use self.y_train to find the labels of these      # 
      # neighbors. Store these labels in closest_y.                           # 
      # Hint: Look up the function numpy.argsort.                             # 
      ######################################################################### 
      closest_y = self.y_train[np.argsort(dists[i,:])[:k]] 
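      # argsort returns the indices that would sort the distances in ascending
      # order; [:k] keeps the first k of them. Indexing y_train with those
      # indices gives the labels of the k nearest neighbors, stored in closest_y.
      #########################################################################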
      ######################################################################### 
      # TODO:                                                                 # 
      # Now that you have found the labels of the k nearest neighbors, you    # 
      # need to find the most common label in the list closest_y of labels.   # 
      # Store this label in y_pred[i]. Break ties by choosing the smaller     # 
      # label.                                                                # 
      ######################################################################### 
      y_pred[i] = np.argmax(np.bincount(closest_y))       
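      # np.bincount counts how often each label occurs in closest_y;
      # np.argmax returns the label with the highest count (the smallest
      # such label on a tie), and that label is the prediction.
      #########################################################################
      #                           END OF YOUR CODE                            #
      #########################################################################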
    return y_pred
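The no-loop version rests on the expansion ||a - b||^2 = ||a||^2 - 2 a.b + ||b||^2, which turns all pairwise distances into one matrix product plus two broadcast sums. The sketch below is a stand-alone check of that identity against the two-loop version on made-up random data. The function names are hypothetical, and the np.maximum clipping is an addition of this sketch that the class above does not have.

```python
import numpy as np

def l2_no_loops(X, X_train):
    """Pairwise Euclidean distances via the squared-norm expansion."""
    cross = X.dot(X_train.T)                               # (num_test, num_train)
    sq_test = np.sum(np.square(X), axis=1, keepdims=True)  # (num_test, 1)
    sq_train = np.sum(np.square(X_train), axis=1)          # (num_train,)
    sq_dists = sq_test - 2.0 * cross + sq_train            # two broadcast sums
    # Rounding can leave tiny negative values; clipping them is an addition
    # made for this sketch and is not part of the class above.
    return np.sqrt(np.maximum(sq_dists, 0.0))

def l2_two_loops(X, X_train):
    """Reference implementation: one distance at a time."""
    dists = np.zeros((X.shape[0], X_train.shape[0]))
    for i in range(X.shape[0]):
        for j in range(X_train.shape[0]):
            dists[i, j] = np.sqrt(np.sum(np.square(X_train[j, :] - X[i, :])))
    return dists

rng = np.random.RandomState(0)  # made-up data, fixed seed
X_train = rng.rand(6, 5)
X = rng.rand(3, 5)
agree = bool(np.allclose(l2_no_loops(X, X_train), l2_two_loops(X, X_train)))
print(agree)
```

The vectorized form never builds the (num_test, num_train, D) array of differences, but the (num_test, num_train) result itself can still be large for the full dataset.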

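10. The code above can be copied and used as-is. I have annotated the points that are hard for beginners; if something is still unclear, look up the usage of the function in question. kNN training follows below.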

Xval_rows = Xtr_rows[:1000, :] # take first 1000 for validation
Yval = Ytr[:1000]
Xtr_rows = Xtr_rows[1000:, :] # keep last 49,000 for train
Ytr = Ytr[1000:]  # split into 49,000 training and 1,000 validation examples
validation_accuracies = []
for k in [1, 3, 5, 10, 20, 50, 100]:
    # use a particular value of k and evaluate on validation data
    nn = KNearestNeighbor()
    nn.train(Xtr_rows, Ytr)
    # here we assume a modified NearestNeighbor class that can take a k as input
    Yval_predict = nn.predict(Xval_rows, k = k)
    acc = np.mean(Yval_predict == Yval)
    print ('accuracy: %f' % (acc,))
    # keep track of what works on the validation set
    validation_accuracies.append((k, acc))
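Once the loop has filled validation_accuracies, the best k can be read off the list. The sketch below shows one way to do that. The helper best_k is hypothetical, the accuracy values are invented purely to make the example runnable, and breaking ties in favor of the smaller k is an assumption of this sketch.

```python
def best_k(results):
    """Return the k with the highest accuracy; ties go to the smaller k."""
    # Sort key: higher accuracy first, then smaller k.
    return max(results, key=lambda pair: (pair[1], -pair[0]))[0]

# Made-up (k, accuracy) pairs in the same format as validation_accuracies.
example_results = [(1, 0.25), (3, 0.30), (5, 0.30), (10, 0.28)]

print(best_k(example_results))  # -> 3
```

The chosen k would then be used for a single final evaluation on the test set.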

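Closing remarks: I have fully debugged the code above, and it can be run directly on Python 3.6.4. All of this may already exist ready-made, but beginners run into many problems. Compared with other material online, the main point here is going straight from the data-loading code to reading the data and then to training, a path where beginners tend to get lost. I only changed what had to be changed, got it running, and annotated the places I think need extra attention. Experts, please go easy on me!

This is my first blog post. I will keep at it!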
