KagglerWu

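Kaggle Project Practice 1: Digit Recognizer (Top 10% Ranking)

I. Introduction to Kaggle

Kaggle is a crowdsourcing platform for data science problems and a good place to practice on real projects. Kaggle competitions are divided into practice competitions and prize competitions. Digit Recognizer, the subject of this post, is a practice competition: entries are ranked solely by accuracy on the test set, and there is no prize. The Python code for my solution is open-sourced on GitHub.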

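II. The Digit Recognizer Task

The task is to train a digit classifier on MNIST (a labeled collection of digit pixel images). The training set contains 42000 training examples; each example consists of 28*28=784 grayscale pixel values and a label from 0 to 9. The final ranking is based on classification accuracy on the test set.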
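To make the data layout concrete, here is a minimal sketch of how one row of the training file maps to a label and a 28x28 image. The row below is synthetic, and `split_row` is a hypothetical helper for illustration, not part of the author's code; it assumes the label comes first, followed by the 784 pixel values.

```python
import numpy as np

def split_row(row):
    """Split one training row into (label, 28x28 image).

    Assumes the layout described above: the label first,
    followed by 784 grayscale pixel values.
    """
    row = np.asarray(row)
    label = int(row[0])
    image = row[1:].reshape(28, 28)
    return label, image

# Synthetic row: label 7 followed by 784 pixel values (not real MNIST data)
fake_row = np.concatenate(([7], np.arange(784) % 256))
label, image = split_row(fake_row)
print(label, image.shape)
```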

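III. Tools

The project is done in Python and requires the following libraries: ipython, numpy, matplotlib, pandas, scikit-learn, and theano. On Windows, ipython, numpy, matplotlib, and pandas can all be obtained by installing the Canopy integrated development environment, which also spares you from configuring environment variables. Canopy can be downloaded from its official site, and enrolled students can get the academic edition after registering and verifying their status. ipython provides an interactive development and debugging environment, numpy handles scientific computing, matplotlib handles plotting and data visualization, and pandas handles data transformation, import, and export. scikit-learn is a fairly mature machine learning library for Python with extensive, well-organized documentation. theano is a deep learning library; to install it in Canopy, first install libpython and mingw through Canopy's Package Manager.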

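IV. First Attempt: Random Forest

Since this is a classification problem, my first idea was a random forest. A random forest is essentially a bagging method whose base learners are decision trees that are deep and have few examples per leaf. Such deep trees carry high variance, and bagging reduces that variance through two rounds of random sampling: first, training examples are sampled at random to build each tree independently; second, when a tree chooses a split point, only a random subset of the features is considered. Because of these two sources of randomness, each tree memorizes only part of the training set rather than all of it, and the bias this introduces is traded off against the high variance of deep trees. The advantages of random forests are: 1. they can handle binary features and scalar (continuous-valued) features at the same time; 2. they have few hyperparameters, and tuning them has little effect on the result; 3. they are quite accurate. Their most prominent drawback is that, being an ensemble, they are relatively slow.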
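The two sources of randomness described above can be sketched in a few lines of NumPy. This is only an illustration of the sampling, not a tree learner; the sizes and the choice of sqrt(n_features) as the feature-subset size are assumptions (the latter is a common convention for classification).

```python
import numpy as np

rng = np.random.RandomState(0)
n_examples, n_features = 1000, 784   # illustrative sizes

# Randomness 1: a bootstrap sample of the training examples (drawn with
# replacement), used to grow one tree.
bootstrap_idx = rng.randint(0, n_examples, size=n_examples)

# Randomness 2: at each split, only a random subset of the features is
# considered; sqrt(n_features) is a common subset size for classification.
n_sub = int(np.sqrt(n_features))
feature_idx = rng.choice(n_features, size=n_sub, replace=False)

# Fraction of distinct examples one tree actually sees
# (roughly 1 - 1/e, about 63%, for large n).
seen = len(np.unique(bootstrap_idx)) / float(n_examples)
print(seen, n_sub)
```

Because each tree sees only part of the examples and part of the features, the trees are less correlated, which is what lets averaging reduce variance.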

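In most data mining problems feature engineering matters a great deal, but here the dataset already consists of grayscale values laid out pixel by pixel, so little preprocessing is needed. First, read in the data: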

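import csv
import numpy as np

path="./"#placeholder: set to the directory holding train.csv and test.csv

def readCSVFile(file):
    rawData=[]
    with open(path+file,'rb') as csvFile:#'rb' is the Python 2 idiom for the csv module
        reader=csv.reader(csvFile)
        for line in reader:
            rawData.append(line)#train.csv has 42001 lines,the first line is header
    rawData.pop(0)#remove header
    intData=np.array(rawData).astype(np.int32)
    return intData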
    
def loadTrainingData():
    intData=readCSVFile("train.csv")
    label=intData[:,0]
    data=intData[:,1:]
    data=np.where(data>0,1,0)#replace positive in feature vector to 1
    return data,label

def loadTestData():
    intData=readCSVFile("test.csv")
    data=np.where(intData>0,1,0)
    return data

Then train the classifier with scikit-learn's RandomForestClassifier:

from sklearn.ensemble import RandomForestClassifier

def handwritingClassTest():
    #load data (the loaders binarize the pixel values)
    trainData,trainLabel=loadTrainingData()
    testData=loadTestData()
    #train the rf classifier
    clf=RandomForestClassifier(n_estimators=1000,min_samples_split=5)
    clf=clf.fit(trainData,trainLabel)#fit on the 42000 training examples
    m,n=np.shape(testData)
    resultList=[]
    for i in range(m):#predict the test examples one at a time
         classifierResult = clf.predict(testData[i].reshape(1,-1))#predict expects a 2D array
         resultList.append(classifierResult)
    saveResult(resultList)#saveResult (not shown here) writes the predictions to a csv file

The hyperparameters were chosen by cross-validation. After writing the predictions to a csv file and submitting, the accuracy was 96.3%, which is not too bad.
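The post does not show how the cross-validation was run. As a rough sketch of the idea, the following splits the example indices into k folds and averages a score over them. `kfold_indices`, `mean_cv_score`, and the `fit_and_score` callback are hypothetical names introduced here for illustration, and k=5 is an assumed value.

```python
import numpy as np

def kfold_indices(n_examples, k, seed=0):
    """Yield (train_idx, valid_idx) pairs for k-fold cross-validation."""
    rng = np.random.RandomState(seed)
    perm = rng.permutation(n_examples)
    folds = np.array_split(perm, k)
    for i in range(k):
        valid_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, valid_idx

def mean_cv_score(fit_and_score, n_examples, k=5):
    """Average a user-supplied score over the k folds.

    fit_and_score(train_idx, valid_idx) is a hypothetical callback that
    trains on train_idx and returns the accuracy on valid_idx.
    """
    scores = [fit_and_score(tr, va) for tr, va in kfold_indices(n_examples, k)]
    return float(np.mean(scores))

# Size of each validation fold when the 42000 examples are split 5 ways
sizes = [len(va) for _, va in kfold_indices(42000, 5)]
print(sizes)
```

Each candidate setting (for example a value of n_estimators or min_samples_split) would be scored this way, and the setting with the best average kept.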

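V. Second Attempt: Multi-Layer Perceptron

Since this is an image problem, a neural network should do well. To guard against overfitting, we first build a fairly shallow network classifier: using theano, a single perceptron (hidden) layer cascaded with a softmax classifier. For later reuse, it is best to wrap the perceptron layer and the softmax classifier in separate classes. First, read the data into a DataFrame with pandas and split the 42000 examples into 35000+7000: 35000 for training and the remaining 7000 as a validation set (used for early stopping to prevent overfitting, discussed in detail later).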

def shared_dataset(data_xy,borrow=True):
    """
    speed up the calculation by theano,in GPU float computation.
    """
    data_x,data_y=data_xy
    shared_x=theano.shared(np.asarray(data_x,dtype=theano.config.floatX),borrow=borrow)
    shared_y=theano.shared(np.asarray(data_y,dtype=theano.config.floatX),borrow=borrow)
    # When storing data on the GPU it has to be stored as floats
    # therefore we will store the labels as ``floatX`` as well
    # (``shared_y`` does exactly that). But during our computations
    # we need them as ints (we use labels as index, and if they are
    # floats it doesn't make sense) therefore instead of returning
    # ``shared_y`` we will have to cast it to int. This little hack
    # lets us get around this issue
    return shared_x,T.cast(shared_y,'int32')

def load_data(path):
    print '...loading data'
    train_df=DataFrame.from_csv(path+'train.csv',index_col=False).fillna(0).astype(int)
    test_df=DataFrame.from_csv(path+'test.csv',index_col=False).fillna(0).astype(int)
    if debug_mode==False:
        train_set=[train_df.values[0:35000,1:]/255.0,train_df.values[0:35000,0]]
        valid_set=[train_df.values[35000:,1:]/255.0,train_df.values[35000:,0]]
    else:
        train_set=[train_df.values[0:3500,1:]/255.0,train_df.values[0:3500,0]]
        valid_set=[train_df.values[3500:4000,1:]/255.0,train_df.values[3500:4000,0]]
    test_set=test_df.values/255.0
    #print train_set[0][:10][:10],'\n',train_set[1][:10],'\n',valid_set[0][-10:][:10],'\n',valid_set[1][-10:],'\n',test_set[0][10:][:10]
    test_set_x=theano.shared(np.asarray(test_set,dtype=theano.config.floatX),borrow=True)
    valid_set_x,valid_set_y=shared_dataset(valid_set,borrow=True)
    train_set_x,train_set_y=shared_dataset(train_set,borrow=True)
    rval=[(train_set_x,train_set_y),(valid_set_x,valid_set_y),test_set_x]
    return rval

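Symbolic computation and GPU acceleration are the two highlights of theano. Because the computation is meant to run on the GPU, the data must be stored as floatX. Storing it as a shared variable ensures that only a single reference to the data exists, which is economical for large datasets. See the official theano documentation for details of the individual functions; I won't go into them here.

Next, build a multi-class Logistic Regression (softmax) class: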

class LogisticRegression(object):
    """Multi-class Logistic Regression Class"""
    def __init__(self,input,n_in,n_out,p_drop_logistic):
        """
        :type input: theano.tensor.TensorType
        :param input: symbolic variable that describes the input of the architecture(one minibatch)
        
        :type n_in: int
        :param n_in: number of input units,data dimension.
        
        :type n_out: int
        :param n_out: number of output units,label dimension.
        
        :type p_drop_logistic:float
        :param p_drop_logistic: add some noise by dropout by this probability
        """
        input=dropout(input,p_drop_logistic)
        self.W=theano.shared(value=np.zeros((n_in,n_out),dtype=theano.config.floatX),name='W',borrow=True)
        self.b=theano.shared(value=np.zeros((n_out,),dtype=theano.config.floatX),name='b',borrow=True)
        self.p_y_given_x=T.nnet.softmax(T.dot(input,self.W)+self.b)
        self.y_pred=T.argmax(self.p_y_given_x,axis=1)
        self.params=[self.W,self.b]
        
    def negative_log_likelihood(self,y):
        """
        The cost function of multi-class logistic regression
        :type y:theano.tensor.TensorType
        :param y: corresponds to a vector that gives for each example the correct label
        
        """
        # y.shape[0] is (symbolically) the number of rows in y, i.e.,
        # number of examples (call it n) in the minibatch
        # T.arange(y.shape[0]) is a symbolic vector which will contain
        # [0,1,2,... n-1]. T.log(self.p_y_given_x) is a matrix of
        # Log-Probabilities (call it LP) with one row per example and
        # one column per class. LP[T.arange(y.shape[0]),y] is a vector
        # v containing [LP[0,y[0]], LP[1,y[1]], LP[2,y[2]], ...,
        # LP[n-1,y[n-1]]] and T.mean(LP[T.arange(y.shape[0]),y]) is
        # the mean (across minibatch examples) of the elements in v,
        # i.e., the mean log-likelihood across the minibatch.
        return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]),y])

    def errors(self,y):
        """
        Return a float number of error rate: #(errors in minibatch)/#(total minibatch)
        :type y:theano.tensor.TensorType
        :param y: corresponds to a vector that gives for each example the correct label
        """
        if y.ndim!=self.y_pred.ndim:
            raise TypeError('y should have the same shape as self.y_pred')
        if y.dtype.startswith('int'):
            return T.mean(T.neq(self.y_pred,y))
        else:
            raise NotImplementedError()
        
    def predict(self):
        return self.y_pred
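The indexing step in negative_log_likelihood can be checked with plain NumPy. The following is a minimal sketch on a made-up minibatch (3 examples, 4 classes); the function names mirror the class above but are standalone helpers written for this illustration.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def negative_log_likelihood(p_y_given_x, y):
    """Mean over the minibatch of -log p(correct class)."""
    n = y.shape[0]
    return -np.mean(np.log(p_y_given_x[np.arange(n), y]))

# Toy minibatch: 3 examples, 4 classes, all-zero scores -> uniform probabilities
scores = np.zeros((3, 4))
y = np.array([0, 2, 3])
p = softmax(scores)
nll = negative_log_likelihood(p, y)
print(nll)  # log(4), the cost of a uniform prediction over 4 classes
```

With W and b initialized to zero as in the class above, the untrained model starts from exactly this uniform prediction, so for 10 classes the initial cost would be log(10).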

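The cost function used here is the cross-entropy, i.e. the negative log-likelihood. Since the input we feed in later is a mini-batch, the cost is the average cost over the mini-batch. p_drop_logistic is the probability with which input neurons are randomly dropped.

Dropout deserves a closer look. It is a method proposed by Hinton and a powerful tool against overfitting when a network is deep or has many hidden neurons. The idea is somewhat similar to bagging: it improves the robustness of the network by injecting randomness (noise). Concretely, at the input of each layer in the cascade, a fixed proportion of the input neurons is chosen at random and set to zero, and the surviving neurons are rescaled by dividing by the retain probability (so that the total activation is roughly conserved). This makes the network randomly forget part of the training examples and thereby lowers variance. It is worth stressing that dropout should only be used when the network has many weights and tends to overfit. When the model has too few parameters and suffers from high bias, dropout increases the bias further and makes classification worse. For a detailed treatment of dropout, see Hinton's papers on the subject. The implementation is simple: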

def dropout(X,p=0.):
    """
    Add some noise to regularize by dropping out inputs with probability p
    so to prevent overfitting
    """
    if p>0:
        retain_prob=1-p
        srng=RandomStreams()
        X*=srng.binomial(X.shape,p=retain_prob,dtype=theano.config.floatX)
        X/=retain_prob
    return X
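The effect of the rescaling step can be seen in a NumPy counterpart of the function above. This is a sketch for illustration only: `dropout_np` is a hypothetical name, and the input of all ones is chosen so that the mean is easy to read.

```python
import numpy as np

def dropout_np(x, p, rng):
    """Inverted dropout: zero each entry with probability p, rescale the rest."""
    if p <= 0:
        return x
    retain_prob = 1.0 - p
    mask = rng.binomial(1, retain_prob, size=x.shape)
    return x * mask / retain_prob

rng = np.random.RandomState(0)
x = np.ones((1000, 100))
dropped = dropout_np(x, 0.2, rng)

print(dropped.mean())        # close to 1.0: the expected activation is preserved
print(np.unique(dropped))    # surviving entries are scaled up to 1/0.8
```

Because the survivors are scaled up during training, nothing needs to be rescaled at prediction time; the layer is simply used without the mask.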

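Next we build a HiddenLayer class made of perceptrons. Its activation function can be specified. Before 2010 the sigmoid function was used almost everywhere, but in recent papers tanh and the rectifier (ReLU) function are more common. Our dataset is not large, so for now we do not use the rectifier, whose derivative is cheap to compute. The HiddenLayer code is as follows: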

class HiddenLayer(object):
    def __init__(self,rng,input,n_in,n_out,W=None,b=None,activation=T.tanh,p_drop_perceptron=0.2):
        """
        Typical hidden layer of a MLP:units are fully-connected and have 
        tanh activation function.Weight matrix W is of shape (n_in,n_out),
        the bias vector b is of shape(n_out)
        
        :type rng: numpy.random.RandomState
        :param rng: a random number generator used to initialize weights

        :type input: theano.tensor.dmatrix
        :param input: a symbolic tensor of shape (n_examples, n_in)

        :type n_in: int
        :param n_in: dimensionality of input

        :type n_out: int
        :param n_out: number of hidden units

        :type activation: theano.Op or function
        :param activation: Non linearity to be applied in the hidden layer
        
        :type p_drop_perceptron:float
        :param p_drop_perceptron: add some noise by dropout by this probability
        """
        self.input=dropout(input,p_drop_perceptron)
        if W is None:
            W_values=np.asarray(rng.uniform(low=-np.sqrt(6./(n_in+n_out)),
                                high=np.sqrt(6./(n_in+n_out)),size=(n_in,n_out)),
                                dtype=theano.config.floatX)
            if activation==theano.tensor.nnet.sigmoid:
                W_values*=4
            W=theano.shared(value=W_values,name='W',borrow=True)
        if b is None:
            b_values=np.zeros((n_out,),dtype=theano.config.floatX)
            b=theano.shared(value=b_values,name='b',borrow=True)
        self.W=W
        self.b=b
        lin_output=T.dot(self.input,self.W)+self.b #use self.input so that dropout is actually applied
        self.output=(lin_output if activation is None else activation(lin_output))    
        self.params=[self.W,self.b]
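The weight initialization in HiddenLayer draws from a uniform distribution whose range depends on the layer's fan-in and fan-out. A small NumPy sketch of the same rule, with the layer sizes used later in this post (784 inputs, 500 hidden units); `init_weights` is a hypothetical helper name.

```python
import numpy as np

def init_weights(rng, n_in, n_out, sigmoid=False):
    """Uniform init in [-bound, bound], bound = sqrt(6/(n_in+n_out)).

    Mirrors the HiddenLayer code above; the bound is 4 times larger
    when the activation is a sigmoid.
    """
    bound = np.sqrt(6.0 / (n_in + n_out))
    if sigmoid:
        bound *= 4
    return rng.uniform(low=-bound, high=bound, size=(n_in, n_out))

rng = np.random.RandomState(1234567890)
W = init_weights(rng, 784, 500)
bound = np.sqrt(6.0 / (784 + 500))
print(W.shape, bound)
```

Keeping the initial weights in this range keeps the pre-activations small enough that tanh starts out in its near-linear region rather than saturated.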

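For convenience, we also wrap the network formed by cascading HiddenLayer and LogisticRegression into an MLP class. All computation inside the class is symbolic, and the MLP's cost function is simply that of the final softmax classifier. The code is as follows: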

class MLP(object): 
    """Multi-Layer Perceptron Class

    A multilayer perceptron is a feedforward artificial neural network model
    that has one layer or more of hidden units and nonlinear activations.
    Intermediate layers usually have as activation function tanh or the
    sigmoid function (defined here by a ``HiddenLayer`` class)  while the
    top layer is a softmax layer (defined here by a ``LogisticRegression``
    class).
    """  
    def __init__(self,rng,input,n_in,n_hidden,n_out,p_drop_perceptron=0.2,p_drop_logistic=0.2):
        """
        :type rng: numpy.random.RandomState
        :param rng: a random number generator used to initialize weights

        :type input: theano.tensor.TensorType
        :param input: symbolic variable that describes the input of the
        architecture (one minibatch)

        :type n_in: int
        :param n_in: number of input units, the dimension of the space in
        which the datapoints lie

        :type n_hidden: int
        :param n_hidden: number of hidden units

        :type n_out: int
        :param n_out: number of output units, the dimension of the space in
        which the labels lie
        """
        #We are now dealing with "one hidden layer+ logistic regression" ---Old Network
        self.hiddenLayer=HiddenLayer(rng=rng,input=input,n_in=n_in,n_out=n_hidden,activation=T.tanh,p_drop_perceptron=p_drop_perceptron)
        self.logRegressionLayer=LogisticRegression(input=self.hiddenLayer.output,n_in=n_hidden,n_out=n_out,p_drop_logistic=p_drop_logistic)
        #L1 regularization
        self.L1=abs(self.hiddenLayer.W).sum()+abs(self.logRegressionLayer.W).sum()
        #L2 regularization
        self.L2_sqr=(self.hiddenLayer.W**2).sum()+(self.logRegressionLayer.W**2).sum()
        self.negative_log_likelihood=self.logRegressionLayer.negative_log_likelihood
        self.errors=self.logRegressionLayer.errors
        self.params=self.hiddenLayer.params+self.logRegressionLayer.params
        self.predict=self.logRegressionLayer.predict
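To see what the symbolic graph computes, here is a plain NumPy sketch of the same forward pass (tanh hidden layer, then softmax) together with the L2 term. It is an illustration under assumed random inputs, not the author's code; `mlp_forward` and `l2_sqr` are hypothetical names.

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """One tanh hidden layer followed by a softmax output layer."""
    hidden = np.tanh(x.dot(W1) + b1)
    scores = hidden.dot(W2) + b2
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(scores)
    return e / e.sum(axis=1, keepdims=True)

def l2_sqr(W1, W2):
    """Sum of squared weights, as in MLP.L2_sqr (biases are not penalized)."""
    return (W1 ** 2).sum() + (W2 ** 2).sum()

rng = np.random.RandomState(0)
n_in, n_hidden, n_out = 784, 500, 10
W1 = rng.uniform(-0.05, 0.05, size=(n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = np.zeros((n_hidden, n_out))   # the softmax layer starts at zero, as above
b2 = np.zeros(n_out)

x = rng.rand(20, n_in)             # one random minibatch of 20 "images"
p = mlp_forward(x, W1, b1, W2, b2)
print(p.shape, l2_sqr(W1, W2))
```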

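With these basic classes in place, we can start building the actual model. This amounts to instantiating the classes and using the symbolic variables they compute to generate theano functions. The train_model generated here is the training model we are really building, while validate_model evaluates the validation set and is used for early stopping to prevent overfitting. The model-building code is as follows: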

def train_old_net():
    learning_rate=0.001
    L1_reg=0.00
    L2_reg=0.0001
    n_epochs=100
    batch_size=20
    n_hidden=500
    datasets=load_data(path)
    train_set_x,train_set_y=datasets[0]
    valid_set_x,valid_set_y=datasets[1]
    test_set_x=datasets[2]
    
    #compute number of minibatches 
    n_train_batches=train_set_x.get_value(borrow=True).shape[0]/batch_size
    n_valid_batches=valid_set_x.get_value(borrow=True).shape[0]/batch_size
    n_test_batches=test_set_x.get_value(borrow=True).shape[0]/batch_size
    
    print '...building the model'
    index=T.lscalar()
    x=T.matrix('x')
    y=T.ivector('y')
    rng=np.random.RandomState(1234567890)
    #construct the MLP class
    #Attention!!!
    #this line to set p_drop_perceptron and p_drop_logistic
    #if set no dropout then decrease the early stop threshold 
    #improvement_threshold on line 292
    classifier=MLP(rng=rng,input=x,n_in=28*28,n_hidden=n_hidden,n_out=10,p_drop_perceptron=0,p_drop_logistic=0)
    
    # the cost we minimize during training is the negative log likelihood of
    # the model plus the regularization terms (L1 and L2); cost is expressed
    # here symbolically
    cost=classifier.negative_log_likelihood(y)+L1_reg*classifier.L1+L2_reg*classifier.L2_sqr
    
    #compiling a theano function that computes the error rate
    #on one minibatch of the validation set
    validate_model=theano.function(inputs=[index],outputs=classifier.errors(y),
                                    givens={x:valid_set_x[index*batch_size:(index+1)*batch_size],
                                            y:valid_set_y[index*batch_size:(index+1)*batch_size]})
    
    #symbolically compute the gradient of cost with respect to params
    #the resulting gradient will be stored in list gparams
    gparams=[]
    for param in classifier.params:
        gparam=T.grad(cost,param)
        gparams.append(gparam)    
        
    #compiling a Theano function 'train_model' that returns the cost
    #but in the same time updates the parameter of the model based on
    #the rules defined in 'updates'
    train_model=theano.function(inputs=[index],outputs=cost,updates=RMSprop(gparams,classifier.params,learning_rate=0.001),
                                givens={x:train_set_x[index*batch_size:(index+1)*batch_size],
                                        y:train_set_y[index*batch_size:(index+1)*batch_size]
                                        })

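The key point here is that we do not solve the weight optimization with classic SGD (stochastic gradient descent) but with RMSprop, which I learned from Hinton's course. Similar methods include Momentum and adaptive learning rates; all share the goal of faster convergence. The idea of RMSprop is this: because the gradient is computed on a mini-batch, the direction of each step is fairly noisy, and we don't want to go so far in a wrong direction on one step that many later steps are needed to correct it. In short, we want every gradient step to be of the same order of magnitude. How? Scaling! We maintain a running average of the squared gradient and divide each gradient by its square root. The code is as follows:

    #using RMSprop(scaling the gradient based on running average)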
    #to update the parameters of the model as a list of (variable,update expression) pairs
    def RMSprop(gparams,params,learning_rate,rho=0.9,epsilon=1e-6):
        """
        param:rho,the fraction we keep the previous gradient contribution
        """
        updates=[]
        for p,g in zip(params,gparams):
            acc=theano.shared(p.get_value()*0.)
            acc_new=rho*acc+(1-rho)*g**2
            gradient_scaling=T.sqrt(acc_new+epsilon)
            g=g/gradient_scaling
            updates.append((acc,acc_new))
            updates.append((p,p-learning_rate*g))
        return updates
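The same update rule can be tried outside theano. The sketch below applies it to a toy quadratic whose two coordinates have gradients differing by a factor of 1000; the problem, the step count, and the name `rmsprop_step` are assumptions made for illustration.

```python
import numpy as np

def rmsprop_step(p, g, acc, learning_rate=0.001, rho=0.9, epsilon=1e-6):
    """One RMSprop update; returns the new parameter and running average."""
    acc_new = rho * acc + (1 - rho) * g ** 2
    p_new = p - learning_rate * g / np.sqrt(acc_new + epsilon)
    return p_new, acc_new

# Minimize f(p) = 0.5 * sum(scale * p**2), whose gradient is scale * p.
scale = np.array([1.0, 1000.0])
p = np.array([1.0, 1.0])
acc = np.zeros_like(p)
for _ in range(2000):
    g = scale * p
    p, acc = rmsprop_step(p, g, acc, learning_rate=0.001)
print(p)
```

Because each gradient is divided by the root of its own running average, both coordinates move at a similar pace toward zero even though their raw gradients differ by three orders of magnitude. Plain SGD with a single learning rate would either crawl along the first coordinate or diverge along the second.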

Next comes the actual training. When should training stop? Besides fixing a threshold, we use early stopping, a very common technique. Let's see how it works alongside the code:

    #early-stopping parameters
    patience=10000 #look as this many examples regardless
    patience_increase=2 #wait the iter number longer when a new best is found
    #improvement_threshold=0.995 # a relative improvement of this much on validation set 
                                # considered as not overfitting 
                                # if have added drop-out noise,we can increase the value
    improvement_threshold=0.995
    validation_frequency=min(n_train_batches,patience/2)
                                # every this much interval check on the validation set 
                                # to see if the net is overfitting.
                                # patience/2 because we want to at least check twice before getting the patience
                                # include n_train_batches to ensure we at least check on every epoch
    best_validation_error_rate=np.inf
    epoch=0
    done_looping=False
    
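    while(epoch

We initially set the threshold (patience) to 10000, meaning we go through 10000 training examples no matter what. At regular intervals, however, we check the error rate on the validation set. If the improvement is too small, we stop when the threshold is reached (early stop). If the improvement is large enough (above, the validation error must fall below 0.995 times the previous best, i.e. a relative reduction of at least 0.5%), we raise the threshold and let the algorithm keep iterating. Training examples are reused cyclically, and each full pass counts as one epoch. The settings above guarantee at least two validations before the initial threshold is reached (i.e. at least two chances to raise the threshold), and at least one validation per epoch. The last step is to predict and export the results to a csv file with pandas. The most important point here is that dropout must not be applied at prediction time:
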
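    #the dropout probabilities are fixed when the symbolic graph is built;
    #the classifier above was constructed with both set to 0, so no dropout
    #is applied at prediction time (these assignments only restate that)
    classifier.p_drop_perceptron=0
    classifier.p_drop_logistic=0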
    y_x=classifier.predict()
    model_predict=theano.function(inputs=[index],outputs=y_x,givens={x:test_set_x[index*batch_size:(index+1)*batch_size]})
    digit_preds=Series(np.concatenate([model_predict(i) for i in xrange(n_test_batches)]))
    image_ids=Series(np.arange(1,len(digit_preds)+1))
    submission=DataFrame([image_ids,digit_preds]).T
    submission.columns=['ImageId','Label']
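    submission.to_csv(path+'submission_sample.csv',index=False)

Because the network is shallow (only two layers), it runs quickly and produces a result in about half an hour. After submission the accuracy was 97.5%, a clear improvement over the random forest's 96.3%.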
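The training loop itself is cut off in this post, so the following is only a sketch of how a patience-based early-stopping loop of this kind typically proceeds, modeled on the common Theano-tutorial pattern rather than on the author's actual code. The function name and the `valid_errors` iterator (a stand-in for really training and validating a model) are hypothetical.

```python
def train_with_early_stopping(valid_errors, n_train_batches, n_epochs=100,
                              patience=10000, patience_increase=2,
                              improvement_threshold=0.995):
    """Patience-based early stopping driven by a sequence of validation errors.

    valid_errors is an iterator that yields the validation error each time
    the model would be validated (a stand-in for real training).
    Returns (best_error, iterations_run).
    """
    validation_frequency = min(n_train_batches, patience // 2)
    best_error = float("inf")
    iteration = 0
    for epoch in range(n_epochs):
        for batch in range(n_train_batches):
            iteration += 1           # one minibatch update would happen here
            if iteration % validation_frequency == 0:
                error = next(valid_errors)
                if error < best_error:
                    # a sufficiently large improvement extends the patience
                    if error < best_error * improvement_threshold:
                        patience = max(patience, iteration * patience_increase)
                    best_error = error
            if patience <= iteration:
                return best_error, iteration
    return best_error, iteration

# Errors improve for three validations, then plateau.
errors = iter([0.10, 0.05, 0.04] + [0.04] * 1000)
best, iters = train_with_early_stopping(errors, n_train_batches=100, patience=300)
print(best, iters)
```

In this toy run the last real improvement comes at iteration 300, which extends the patience to 600; with no further improvement, training stops there.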
    
    
    
 
    
    
     
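VI. Third Attempt: LeNet5

The MLP above shows that neural networks are indeed well suited to computer vision tasks. A look at the literature shows that the best MNIST results now exceed 99.5% accuracy, so the MLP still has a lot of room to improve. Where does that room lie? My initial judgment is that with only two layers the network has too few parameters and underfits, so making the network deeper is the way to go. The convolutional neural network is widely regarded as one of the most effective approaches to image recognition, so here we build LeNet5 from LeCun's paper. The architecture of LeNet is as follows:

[Figure: LeNet5 architecture]

Treating one convolution layer plus one pooling layer (the sub-sampling layer in the figure) as a LeNetConvPoolLayer, LeNet5 is two cascaded LeNetConvPoolLayers followed by a fully connected perceptron layer and then a softmax classifier. Unlike a multi-layer perceptron, whose parameters are weight matrices applied by multiplication, the parameters of a convolution layer are the filter maps used for convolution. These filter maps are learned and are initialized with random numbers. In essence, then, a convolution layer extracts features. Small filter maps are used so that the same features are filtered out even when the original image is shifted by some amount. The pooling layer can be viewed as a resolution-reduction step: averaging or taking the maximum over a few neighboring pixels keeps the main features while reducing the number of neurons, so that the final fully connected perceptron does not have so many weights that it becomes expensive to compute or overfits. With that, the implementation is fairly clear:

class LeNetConvPoolLayer(object):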
    """
    A layer consists of a convolution layer and a pooling layer
    """
    def __init__(self,rng,input,filter_shape,image_shape,poolsize=(2,2),p_drop_cov=0.2):
        """
        :type rng: numpy.random.RandomState
        :param rng: a random number generator used to initialize weights
        
        :type input:theano.tensor.dtensor4
        :param input: symbolic image tensor,of shape image_shape
        
        :type filter_shape: tuple or list of length 4
        :param filter_shape: (number of filters, num input feature maps,
                              filter height,filter width)
        
        :type image_shape:tuple or list of length 4
        :param image_shape: (batch size,num input feature maps(maps from different channels),
                             image height, image width)
        
        :type poolsize:tuple or list of length 2
        :param poolsize: the downsampling(pooling) factor(#rows,#cols)
        """
        assert image_shape[1]==filter_shape[1]
        self.input=dropout(input,p_drop_cov)
        #there are "num input feature maps(channels) * filter height * filter width"
        #input to each hidden unit
        fan_in=np.prod(filter_shape[1:])
        #each unit in the lower layer receives a gradient from:
        #"num output feature maps * filter height * filter width"
        #  / pooling size
        fan_out=(filter_shape[0]*np.prod(filter_shape[2:])/np.prod(poolsize))
        #initialize weights with random weights
        W_bound=np.sqrt(6./(fan_in+fan_out))
        self.W=theano.shared(np.asarray(rng.uniform(
                                        low=-W_bound,high=W_bound,size=filter_shape),
                                        dtype=theano.config.floatX),borrow=True)
        #the bias is a 1D tensor-- one bias per output feature map
        b_values=np.zeros((filter_shape[0],),dtype=theano.config.floatX)
        self.b=theano.shared(value=b_values,borrow=True)
        
        #convolve input feature maps with filters
        conv_out=conv.conv2d(input=self.input,filters=self.W,filter_shape=filter_shape,image_shape=image_shape) #use self.input so that dropout is actually applied
                             
        #pooling each feature map individually,using maxpooling
        pooled_out=downsample.max_pool_2d(input=conv_out,ds=poolsize,ignore_border=True)
        
        #add the bias term.Since the bias is a vector (1D array),we first 
        #reshape it to a tensor of shape(1,n_filters,1,1).Each bias will thus
        #be broadcasted across mini-batches and feature map width& height
        self.output=T.tanh(pooled_out+self.b.dimshuffle('x',0,'x','x'))
        
        #store parameters of this layer
        self.params=[self.W,self.b]
        
    def return_output(self):
        return self.output 
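What the layer does to a single image can be reproduced in NumPy. This is a sketch for illustration: it uses 3 filters instead of 20, random filter values, and a loop-based filter that computes cross-correlation (no kernel flip), which gives the same output shape as a true convolution. `conv2d_valid` and `max_pool_2x2` are hypothetical helper names.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """'Valid' 2D filtering of one image with one kernel.

    Computes cross-correlation (no kernel flip); for a learned filter the
    distinction from true convolution does not affect the output shape.
    """
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool_2x2(fmap):
    """Non-overlapping 2x2 max pooling; odd trailing rows/columns are dropped."""
    h, w = fmap.shape[0] // 2 * 2, fmap.shape[1] // 2 * 2
    blocks = fmap[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

rng = np.random.RandomState(0)
image = rng.rand(28, 28)
kernels = rng.uniform(-0.1, 0.1, size=(3, 5, 5))   # 3 filters instead of 20, for brevity
b = np.zeros(3)                                     # one bias per feature map

pooled = np.stack([max_pool_2x2(conv2d_valid(image, k)) for k in kernels])
# b has shape (3,); reshaping to (3, 1, 1) broadcasts it over height and width,
# the NumPy counterpart of dimshuffle('x', 0, 'x', 'x') for a single image.
output = np.tanh(pooled + b.reshape(3, 1, 1))
print(pooled.shape, output.shape)
```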
    
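One thing to note is that a bias (b in the class above) has to be added to the feature maps; in the code above it is added to the pooled output, just before the tanh. The feature maps form a 4-dimensional tensor while the bias is a 1-dimensional tensor, so tensor broadcasting is needed. Here theano takes full advantage of Python; for usage, see the documentation of theano's dimshuffle function. With LeNetConvPoolLayer in place, the actual model is built by cascading layers as in the LeNet5 diagram:

def train_lenet():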
    learning_rate=0.001
    #if not using RMSprop learn_rate=0.1 
    #if using RMSprop learn_rate=0.001
    nkerns=[20,50]
    batch_size=500
    """
    :type nkerns:list of ints
    :param nkerns: nkerns[0] is the number of feature maps after the 1st LeNetConvPoolLayer
                   nkerns[1] is the number of feature maps after the 2nd LeNetConvPoolLayer
    """  
    rng=np.random.RandomState(1234567890)
    datasets=load_data(path)
    train_set_x,train_set_y=datasets[0]
    valid_set_x,valid_set_y=datasets[1]
    test_set_x=datasets[2]
   
    #compute number of minibatches
    n_train_batches=train_set_x.get_value(borrow=True).shape[0]
    n_valid_batches=valid_set_x.get_value(borrow=True).shape[0]
    n_test_batches=test_set_x.get_value(borrow=True).shape[0]
    n_train_batches/=batch_size
    n_valid_batches/=batch_size
    n_test_batches/=batch_size
    
    #allocate symbolic variables for the data
    index=T.lscalar() #index to minibatch
    x=T.matrix('x') #image
    y=T.ivector('y') #labels  
    ishape=(28,28) #MNIST image size
    p_drop_cov=0.
    p_drop_perceptron=0.
    p_drop_logistic=0. #probabilities of drop-out to prevent overfitting
    
    #########################
    # Building actual model # 
    #########################
    print '...building the model'
    
    #reshape matrix of rasterized images of shape (batch_size,28*28)
    #to a 4D tensor, compatible with our LeNetConvPoolLayer
    layer0_input=x.reshape((batch_size,1,28,28)) #batch_size* 1 channel* (28*28)size
    
    #Construct the first convolutional pooling layer:
    #filtering reduces the image size to (28-5+1,28-5+1)=(24,24)
    #maxpooling reduces this further to (24/2,24/2)=(12,12)
    #4D output tensor is thus of shape (batch_size,nkerns[0],12,12)
    layer0=LeNetConvPoolLayer(rng,input=layer0_input,
                               image_shape=(batch_size,1,28,28),
                               filter_shape=(nkerns[0],1,5,5),poolsize=(2,2),p_drop_cov=p_drop_cov)
    
    # Construct the second convolutional pooling layer:
    # filtering reduces the image size to (12-5+1,12-5+1)=(8,8)
    # maxpooling reduces this further to (8/2,8/2) = (4,4)
    # 4D output tensor is thus of shape (batch_size,nkerns[1],4,4)
    layer1 = LeNetConvPoolLayer(rng, input=layer0.output,
            image_shape=(batch_size, nkerns[0], 12, 12),
            filter_shape=(nkerns[1], nkerns[0], 5, 5), poolsize=(2, 2),p_drop_cov=p_drop_cov)
    
    #the HiddenLayer being fully-connected,it operates on 2D matrices shape
    #(batch_size,num_pixels)
    #This will generate a matrix of shape (batch_size,nkerns[1]*4*4)=(500,800)
    layer2_input=layer1.output.flatten(2)
    
    #construct a fully-connected perceptron layer
    layer2=HiddenLayer(rng,input=layer2_input,n_in=nkerns[1]*4*4,n_out=500,activation=T.tanh,p_drop_perceptron=p_drop_perceptron)
    
    #classify the values of the fully connected softmax layer
    layer3=LogisticRegression(input=layer2.output,n_in=500,n_out=10,p_drop_logistic=p_drop_logistic)
    
    #the cost we minimize during training
    cost=layer3.negative_log_likelihood(y)
    
    #create a function to compute the error rate on validation set
    validate_model=theano.function([index],layer3.errors(y),
                givens={x:valid_set_x[index*batch_size:(index+1)*batch_size],
                        y:valid_set_y[index*batch_size:(index+1)*batch_size]})
    
    #create a list of gradients for all model parameters
    params=layer3.params+layer2.params+layer1.params+layer0.params
    grads=T.grad(cost,params)
    
    #using RMSprop(scaling the gradient based on running average)
    #to update the parameters of the model as a list of (variable,update expression) pairs
    def RMSprop(gparams,params,learning_rate,rho=0.9,epsilon=1e-6):
        """
        param:rho,the fraction we keep the previous gradient contribution
        """
        updates=[]
        for p,g in zip(params,gparams):
            acc=theano.shared(p.get_value()*0.)
            acc_new=rho*acc+(1-rho)*g**2
            gradient_scaling=T.sqrt(acc_new+epsilon)
            g=g/gradient_scaling
            updates.append((acc,acc_new))
            updates.append((p,p-learning_rate*g))
        return updates
                            
    #iterate to get the optimal solution using minibatch SGD
    #updates=[]
    #for param,grad in zip(params,grads):
        #updates.append((param,param-learning_rate*grad))
    
    train_model=theano.function([index],cost,updates=RMSprop(grads,params,learning_rate),
                givens={ x:train_set_x[index* batch_size:(index+1)*batch_size],
                         y:train_set_y[index* batch_size:(index+1)*batch_size]
                         }) 
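The layer sizes quoted in the comments above follow from simple arithmetic, which the short sketch below reproduces. `conv_pool_output_size` is a hypothetical helper; it assumes a 'valid' convolution followed by non-overlapping pooling, as in the comments.

```python
def conv_pool_output_size(size, filter_size, pool_size):
    """Side length after a 'valid' convolution followed by non-overlapping pooling."""
    conv_size = size - filter_size + 1
    return conv_size // pool_size

size0 = 28
size1 = conv_pool_output_size(size0, 5, 2)   # (28-5+1)/2 = 12
size2 = conv_pool_output_size(size1, 5, 2)   # (12-5+1)/2 = 4
nkerns = [20, 50]
n_in_hidden = nkerns[1] * size2 * size2      # 50*4*4 = 800 inputs to the hidden layer
print(size1, size2, n_in_hidden)
```

This is why the fully connected HiddenLayer is built with n_in=nkerns[1]*4*4: changing the filter size, the pooling factor, or nkerns changes that number and the layer must be resized accordingly.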
    
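Dropout is again wired in here to prevent overfitting (the drop probabilities are set to 0 in the code shown), and RMSprop is used to speed up the convergence of SGD. Training proceeds much as for the MLP, again with early stopping against overfitting, so I won't repeat it. Because LeNet5 is a five-layer network with many parameters, training is much slower than for the MLP; configuring theano to compute on the GPU gives roughly a 5x speedup. On my laptop it took about 2 hours to finish, and the log shows it had essentially converged after about 30 epochs. The accuracy of the Kaggle submission was 99.1%, ranking 28/554, within the top 5%. The shortfall from the 99.5% mentioned above may be due to too few training examples. One could rotate the training examples 5 degrees to the left (or right) to generate more, which might raise accuracy further; I did not do this because I was busy with other things. Personally, though, I don't think shifting images up/down or left/right would improve accuracy, because the convolutional features are unchanged under translation.

VII. Background Needed for This Post

1. Machine learning basics: Andrew Ng's Coursera course. The session has ended, but the materials are still available. It includes some programming assignments; I wrote a set of solutions that passed the submission checks.

2. Machine learning basics: Andrew Ng's UFLDL web tutorial. I also wrote a set of solutions for it.

3. The theano tutorial. I forked a branch on github; take a look if you are interested.

4. My Python code for the project in this post.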

机器学习数据预处理方法（数据重编码） ##2 恒c 机器学习人工智能数据分析
文章目录@[TOC]基于Kaggle电信用户流失案例数据（可在官网进行下载）一、离散字段的数据重编码1.OrdinalEncoder自然数排序2.OneHotEncoder独热编码3.ColumnTransformer转化流水线二、连续字段的特征变换1.标准化（Standardization）和归一化（Normalization）2.连续变量分箱3.连续变量特征转化的ColumnTransform
机器学习逻辑回归模型训练与超参数调优 ##3 恒c 机器学习逻辑回归人工智能
文章目录@[TOC]基于Kaggle电信用户流失案例数据（可在官网进行下载）逻辑回归模型训练逻辑回归的超参数调优基于Kaggle电信用户流失案例数据（可在官网进行下载）数据预处理部分可见：机器学习数据预处理方法（数据重编码）逻辑回归模型训练fromsklearn.metricsimportaccuracy_score,recall_score,precision_score,f1_score,ro
50Kaggle 数据分析项目入门实战--分销商产品未来销售情况预测 Jachin111
分销商产品未来销售情况预测未来销售额预测介绍对于一个产品来说，其未来销售额的预测是一个重要的指标，也是一项重要的任务。例如，对于一部苹果手机来说。在上市之前，得先对销售额进行预测，才能确定出货量的大小。本次实验来源于Kaggle上的一个挑战，即：未来销售额预测，由俄罗斯的1C-Company软件分销公司发起，并提供数据。而本次实验的任务就是根据提供的数据，包含商品类别、商品名称、商店等信息和商品的
机器学习本科课程实验1 线性模型 11egativ1ty 机器学习本科课程机器学习人工智能
第三章线性模型3.1一元线性回归3.2多元线性回归3.3对数几率回归，线性判别分析（二选一）3.4类别不均衡3.1一元线性回归——Kaggle房价预测使用Kaggle房价预测数据集：打乱数据顺序，取前70%的数据作为训练集，后30%的数据作为测试集分别以LotArea,BsmtUnfSF,GarageArea三种特征作为模型的输入，SalePrice作为模型的输出在训练集上，使用最小二乘法求解模型
