CNTK - In-Memory and Large Datasets

In this chapter, we will learn about how to work with the in-memory and large datasets in CNTK.

Training with small in-memory datasets

When we talk about feeding data into the CNTK trainer, there can be many ways, depending upon the size of the dataset and the format of the data. Datasets can be small enough to hold in memory, or they can be large.

In this section, we are going to work with in-memory datasets. For this, we will use the following two libraries −

  • Numpy

  • Pandas

Using Numpy arrays

Here, we will work with a randomly generated, numpy-based dataset in CNTK. In this example, we are going to simulate data for a binary classification problem. Suppose we have a set of observations with 4 features and want to predict two possible labels with our deep learning model.

Implementation Example

For this, first we must generate a set of labels containing a one-hot vector representation of the labels we want to predict. It can be done with the help of the following steps −

Step 1 − Import the numpy package as follows −


import numpy as np
num_samples = 20000

Step 2 − Next, generate a label mapping by using the np.eye function as follows −


label_mapping = np.eye(2)

Step 3 − Now, by using the np.random.choice function, collect 20000 random samples as follows −


y = label_mapping[np.random.choice(2,num_samples)].astype(np.float32)
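
Because label_mapping is the 2×2 identity matrix, indexing it with the random class choices picks out one row per sample, which is exactly a one-hot vector. A quick sanity check (purely illustrative) −


print(label_mapping)    # [[1. 0.]
                        #  [0. 1.]]
print(y.shape)          # (20000, 2) - one one-hot row per sample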

Step 4 − At last, by using the np.random.random function, generate an array of random floating-point values as follows −


x = np.random.random(size=(num_samples, 4)).astype(np.float32)

Note that the arrays are converted to 32-bit floating point numbers with astype(np.float32), so that they match the format expected by CNTK. With the data ready, let's follow the steps below to build and train the network −
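
A quick check that the arrays are now in the format CNTK expects (purely illustrative) −


print(x.dtype, y.dtype)   # float32 float32
print(x.shape, y.shape)   # (20000, 4) (20000, 2)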

Step 5 − Import the Dense and Sequential layer functions from the cntk.layers module as follows −


from cntk.layers import Dense, Sequential

Step 6 − Now, we need to import the activation function for the layers in the network. Let us import sigmoid as the activation function −


from cntk import input_variable, default_options
from cntk.ops import sigmoid

Step 7 − Now, we need to import the loss function to train the network. Let us import binary_cross_entropy as the loss function −


from cntk.losses import binary_cross_entropy

Step 8 − Next, we need to define the default options for the network. Here, we will provide the sigmoid activation function as the default setting. Also, create the model by using the Sequential layer function as follows −


with default_options(activation=sigmoid):
   model = Sequential([Dense(6), Dense(2)])

Step 9 − Next, initialise an input_variable with 4 input features serving as the input for the network.


features = input_variable(4)

Step 10 − Now, in order to complete the network, we need to connect the features variable to it.


z = model(features)

So, now we have a NN. With the help of the following steps, let us train it using the in-memory dataset −

Step 11 − To train this NN, first we need to import a learner from the cntk.learners module. We will import the sgd learner as follows −


from cntk.learners import sgd

Step 12 − Along with that, import ProgressPrinter from the cntk.logging module as well.


from cntk.logging import ProgressPrinter
progress_writer = ProgressPrinter(0)

Step 13 − Next, define a new input variable for the labels as follows −


labels = input_variable(2)

Step 14 − Next, in order to train the NN model, we need to define a loss using the binary_cross_entropy function. Also, provide the model z and the labels variable.


loss = binary_cross_entropy(z, labels)

Step 15 − Next, initialize the sgd learner as follows −


learner = sgd(z.parameters, lr=0.1)

Step 16 − At last, call the train method on the loss function. Also, provide it with the input data, the sgd learner and the progress_writer −


training_summary = loss.train((x, y), parameter_learners=[learner], callbacks=[progress_writer])

Complete implementation example


import numpy as np

# Generate 20000 random samples: 4 features each, with one-hot encoded binary labels
num_samples = 20000
label_mapping = np.eye(2)
y = label_mapping[np.random.choice(2, num_samples)].astype(np.float32)
x = np.random.random(size=(num_samples, 4)).astype(np.float32)

from cntk.layers import Dense, Sequential
from cntk import input_variable, default_options
from cntk.ops import sigmoid
from cntk.losses import binary_cross_entropy

# Build the model: a hidden layer with 6 neurons and an output layer with 2 neurons
with default_options(activation=sigmoid):
   model = Sequential([Dense(6), Dense(2)])
features = input_variable(4)
z = model(features)

from cntk.learners import sgd
from cntk.logging import ProgressPrinter

# Train the model on the in-memory numpy arrays
progress_writer = ProgressPrinter(0)
labels = input_variable(2)
loss = binary_cross_entropy(z, labels)
learner = sgd(z.parameters, lr=0.1)
training_summary = loss.train((x, y), parameter_learners=[learner], callbacks=[progress_writer])

Output


Build info:
     Built time: *** ** **** 21:40:10
     Last modified date: *** *** ** 21:08:46 2019
     Build type: Release
     Build target: CPU-only
     With ASGD: yes
     Math lib: mkl
     Build Branch: HEAD
     Build SHA1:ae9c9c7c5f9e6072cc9c94c254f816dbdc1c5be6 (modified)
     MPI distribution: Microsoft MPI
     MPI version: 7.0.12437.6
-------------------------------------------------------------------
average   since   average   since examples
loss      last    metric    last
------------------------------------------------------
Learning rate per minibatch: 0.1
1.52      1.52      0         0     32
1.51      1.51      0         0     96
1.48      1.46      0         0    224
1.45      1.42      0         0    480
1.42       1.4      0         0    992
1.41      1.39      0         0   2016
1.4       1.39      0         0   4064
1.39      1.39      0         0   8160
1.39      1.39      0         0  16352

Using Pandas DataFrames

Numpy arrays are one of the most basic ways of storing data, and they are very limited in what they can contain: a single n-dimensional array can only hold data of a single data type. For many real-world cases, however, we need a library that can handle more than one data type in a single dataset.

One of the Python libraries, called Pandas, makes it easier to work with such datasets. It introduces the concept of a DataFrame (DF) and allows us to load datasets stored on disk in various formats as DFs. For example, we can read DFs stored as CSV, JSON, Excel, etc.

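For instance, a single DataFrame can mix numeric and string columns, something a plain numpy array cannot do without falling back to a generic object dtype. A minimal illustration (the values here are made up) −


import pandas as pd
df = pd.DataFrame({'petal_length': [1.4, 4.7], 'species': ['Iris-Setosa', 'Iris-Versicolor']})
print(df.dtypes)   # petal_length: float64, species: object (strings)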

You can learn Python Pandas library in more detail at https://www.tutorialspoint.com/python_pandas/index.htm.

Implementation Example

In this example, we are going to classify three possible species of iris flowers based on four properties. We have created this deep learning model in the previous sections too. The model is as follows −


from cntk.layers import Dense, Sequential
from cntk import input_variable, default_options
from cntk.ops import sigmoid, log_softmax
model = Sequential([
   Dense(4, activation=sigmoid),
   Dense(3, activation=log_softmax)
])
features = input_variable(4)
z = model(features)

The above model contains one hidden layer and an output layer with three neurons to match the number of classes we can predict.

Next, we will use the train method and a loss function to train the network. For this, first we must load and preprocess the iris dataset so that it matches the expected layout and data format for the NN. It can be done with the help of the following steps −

Step 1 − Import the numpy and Pandas packages as follows −

步骤1-如下导入numpyPandas包-


import numpy as np
import pandas as pd

Step 2 − Next, use the read_csv function to load the dataset into memory −


df_source = pd.read_csv('iris.csv', names=['sepal_length', 'sepal_width',
 'petal_length', 'petal_width', 'species'], index_col=False)

Step 3 − Now, we need to create a dictionary that maps the labels in the dataset to their corresponding numeric representation.


label_mapping = {'Iris-Setosa': 0, 'Iris-Versicolor': 1, 'Iris-Virginica': 2}

Step 4 − Now, by using the iloc indexer on the DataFrame, select the first four columns as follows −


x = df_source.iloc[:, :4].values

Step 5 − Next, we need to select the species column as the labels for the dataset. It can be done as follows −


y = df_source['species'].values

Step 6 − Now, we need to map the labels in the dataset, which can be done by using label_mapping. Also, use the one_hot helper (defined below) to convert them into one-hot encoded arrays.


y = np.array([one_hot(label_mapping[v], 3) for v in y])

Step 7 − Next, to use the features and the mapped labels with CNTK, we need to convert them both to floats −


x = x.astype(np.float32)
y = y.astype(np.float32)

As we know, the labels are stored in the dataset as strings, and CNTK cannot work with strings directly. That is why it needs one-hot encoded vectors representing the labels. For this, we can define the function one_hot as follows −


def one_hot(index, length):
   result = np.zeros(length)
   result[index] = 1
   return result
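
For example, encoding the second class with this helper yields the expected one-hot vector (a quick illustration) −


print(one_hot(label_mapping['Iris-Versicolor'], 3))   # [0. 1. 0.]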

Now that we have the numpy arrays in the correct format, with the help of the following steps we can use them to train our model −

Step 8 − First, we need to import the loss function to train the network. Since this is a three-class problem, let us import cross_entropy_with_softmax as the loss function −


from cntk.losses import cross_entropy_with_softmax

Step 9 − To train this NN, we also need to import a learner from the cntk.learners module. We will import the sgd learner as follows −


from cntk.learners import sgd

Step 10 − Along with that, import ProgressPrinter from the cntk.logging module as well.


from cntk.logging import ProgressPrinter
progress_writer = ProgressPrinter(0)

Step 11 − Next, define a new input variable for the labels as follows −


labels = input_variable(3)

Step 12 − Next, in order to train the NN model, we need to define a loss using the cross_entropy_with_softmax function. Also, provide the model z and the labels variable.


loss = cross_entropy_with_softmax(z, labels)

Step 13 − Next, initialise the sgd learner as follows −


learner = sgd(z.parameters, 0.1)

Step 14 − At last, call the train method on the loss function. Also, provide it with the input data, the sgd learner and the progress_writer.


training_summary = loss.train((x, y), parameter_learners=[learner],
 callbacks=[progress_writer], minibatch_size=16, max_epochs=5)

Complete implementation example


from cntk.layers import Dense, Sequential
from cntk import input_variable, default_options
from cntk.ops import sigmoid, log_softmax
import numpy as np
import pandas as pd

# Build the model: one hidden layer and a three-neuron output layer
model = Sequential([
   Dense(4, activation=sigmoid),
   Dense(3, activation=log_softmax)
])
features = input_variable(4)
z = model(features)

# Utility to one-hot encode a label index; it must be defined before it is used below
def one_hot(index, length):
   result = np.zeros(length)
   result[index] = 1
   return result

# Load and preprocess the iris dataset
df_source = pd.read_csv('iris.csv', names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'], index_col=False)
label_mapping = {'Iris-Setosa': 0, 'Iris-Versicolor': 1, 'Iris-Virginica': 2}
x = df_source.iloc[:, :4].values
y = df_source['species'].values
y = np.array([one_hot(label_mapping[v], 3) for v in y])
x = x.astype(np.float32)
y = y.astype(np.float32)

# Train the network
from cntk.losses import cross_entropy_with_softmax
from cntk.learners import sgd
from cntk.logging import ProgressPrinter
progress_writer = ProgressPrinter(0)
labels = input_variable(3)
loss = cross_entropy_with_softmax(z, labels)
learner = sgd(z.parameters, 0.1)
training_summary = loss.train((x, y), parameter_learners=[learner], callbacks=[progress_writer], minibatch_size=16, max_epochs=5)

Output


Build info:
     Built time: *** ** **** 21:40:10
     Last modified date: *** *** ** 21:08:46 2019
     Build type: Release
     Build target: CPU-only
     With ASGD: yes
     Math lib: mkl
     Build Branch: HEAD
     Build SHA1:ae9c9c7c5f9e6072cc9c94c254f816dbdc1c5be6 (modified)
     MPI distribution: Microsoft MPI
     MPI version: 7.0.12437.6
-------------------------------------------------------------------
average    since    average   since   examples
loss        last     metric   last
------------------------------------------------------
Learning rate per minibatch: 0.1
1.1         1.1        0       0      16
0.835     0.704        0       0      32
1.993      1.11        0       0      48
1.14       1.14        0       0     112
[………]

Training with large datasets

In the previous sections, we worked with small in-memory datasets using Numpy and pandas, but not all datasets are that small. Datasets containing images, videos or sound samples, especially, tend to be large. MinibatchSource is a component provided by CNTK that can load data in chunks, in order to work with such large datasets. Some of the features of the MinibatchSource component are as follows −

  • MinibatchSource can prevent the NN from overfitting by automatically randomizing the samples read from the data source.

  • It has a built-in transformation pipeline which can be used to augment the data.

  • It loads the data on a background thread, separate from the training process.

In the following sections, we are going to explore how to use a minibatch source with out-of-memory data to work with large datasets. We will also explore how we can use it to feed data for training a NN.

Creating a MinibatchSource instance

In the previous section, we used the iris flower example and worked with a small in-memory dataset using Pandas DataFrames. Here, we will replace the code that uses data from a pandas DF with MinibatchSource. First, we need to create an instance of MinibatchSource with the help of the following steps −

Implementation Example

Step 1 − First, from the cntk.io module, import the components for the minibatch source as follows −


from cntk.io import StreamDef, StreamDefs, MinibatchSource, CTFDeserializer, INFINITELY_REPEAT

Step 2 − Now, by using the StreamDef class, create a stream definition for the labels.


label_stream = StreamDef(field='labels', shape=3, is_sparse=False)

Step 3 − Next, to read the features field from the input file, create another instance of StreamDef as follows.


features_stream = StreamDef(field='features', shape=4, is_sparse=False)

Step 4 − Now, we need to provide the iris.ctf file as input and initialise the deserializer as follows −


deserializer = CTFDeserializer('iris.ctf', StreamDefs(labels=label_stream, features=features_stream))

Step 5 − At last, we need to create an instance of MinibatchSource by using the deserializer as follows −


minibatch_source = MinibatchSource(deserializer, randomize=True)
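
Once the source exists, you can also pull a chunk of samples from it manually. This sketch assumes the iris.ctf file described in the next section is present on disk −


mb = minibatch_source.next_minibatch(16)   # dict mapping stream information to MinibatchData
print(mb[minibatch_source.streams.features].num_samples)   # 16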

Creating a MinibatchSource instance - Complete implementation example


from cntk.io import StreamDef, StreamDefs, MinibatchSource, CTFDeserializer, INFINITELY_REPEAT
label_stream = StreamDef(field='labels', shape=3, is_sparse=False)
features_stream = StreamDef(field='features', shape=4, is_sparse=False)
deserializer = CTFDeserializer('iris.ctf', StreamDefs(labels=label_stream, features=features_stream))
minibatch_source = MinibatchSource(deserializer, randomize=True)

Creating a CTF file

As you have seen above, we are taking the data from the 'iris.ctf' file. This file uses a format called CNTK Text Format (CTF). It is mandatory to create a CTF file to supply the data for the MinibatchSource instance we created above. Let us see how we can create a CTF file.

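Each line of a CTF file holds one sample, and every input stream on the line is prefixed with a pipe and its stream name. For the iris data, a line looks like this (the values shown are illustrative) −


|features 5.1 3.5 1.4 0.2 |labels 1 0 0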

Implementation Example

Step 1 − First, we need to import the pandas and numpy packages as follows −


import pandas as pd
import numpy as np

Step 2 − Next, we need to load our data file, i.e. iris.csv, into memory. Then, store it in the df_source variable.


df_source = pd.read_csv('iris.csv', names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'], index_col=False)

Step 3 − Now, take the content of the first four columns as the features by using the iloc indexer. Also, use the data from the species column as the labels, as follows −


features = df_source.iloc[:, :4].values
labels = df_source['species'].values

Step 4 − Next, we need to create a mapping between the label names and their numeric representation. It can be done by creating label_mapping as follows −


label_mapping = {'Iris-Setosa': 0, 'Iris-Versicolor': 1, 'Iris-Virginica': 2}

Step 5 − Now, convert the labels to a set of one-hot encoded vectors as follows −


labels = [one_hot(label_mapping[v], 3) for v in labels]

Now, as we did before, we need a utility function called one_hot to encode the labels; note that it must be defined before the list comprehension above is executed. It can be done as follows −


def one_hot(index, length):
   result = np.zeros(length)
   result[index] = 1
   return result

Now that we have loaded and preprocessed the data, it's time to store it on disk in the CTF file format. We can do it with the help of the following Python code −


with open('iris.ctf', 'w') as output_file:
   for index in range(0, features.shape[0]):
      feature_values = ' '.join([str(x) for x in np.nditer(features[index])])
      label_values = ' '.join([str(x) for x in np.nditer(labels[index])])
      output_file.write('|features {} |labels {}\n'.format(feature_values, label_values))

Creating a CTF file - Complete implementation example


import pandas as pd
import numpy as np

# Utility to one-hot encode a label index; it must be defined before it is used below
def one_hot(index, length):
   result = np.zeros(length)
   result[index] = 1
   return result

# Load the dataset and encode the labels
df_source = pd.read_csv('iris.csv', names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'], index_col=False)
features = df_source.iloc[:, :4].values
labels = df_source['species'].values
label_mapping = {'Iris-Setosa': 0, 'Iris-Versicolor': 1, 'Iris-Virginica': 2}
labels = [one_hot(label_mapping[v], 3) for v in labels]

# Write one sample per line in CNTK Text Format
with open('iris.ctf', 'w') as output_file:
   for index in range(0, features.shape[0]):
      feature_values = ' '.join([str(x) for x in np.nditer(features[index])])
      label_values = ' '.join([str(x) for x in np.nditer(labels[index])])
      output_file.write('|features {} |labels {}\n'.format(feature_values, label_values))

Feeding the data

Once you have created a MinibatchSource instance, we need to train with it. We can use the same training logic as we used when we worked with small in-memory datasets. Here, we will use the MinibatchSource instance as the input for the train method on the loss function as follows −

Implementation Example

Step 1 − In order to log the output of the training session, first import ProgressPrinter from the cntk.logging module as follows −


from cntk.logging import ProgressPrinter

Step 2 − Next, to set up the training session, import Trainer and training_session from the cntk.train module as follows −


from cntk.train import Trainer, training_session

Step 3 − Now, we need to define a set of constants, namely minibatch_size, samples_per_epoch and num_epochs, as follows −


minibatch_size = 16
samples_per_epoch = 150
num_epochs = 30

Step 4 − Next, in order to tell CNTK how to read data during training, we need to define a mapping between the input variables for the network and the streams in the minibatch source.


input_map = {
   features: minibatch_source.streams.features,
   labels: minibatch_source.streams.labels
}

Step 5 − Next, to log the output of the training process, initialise the progress_writer variable with a new ProgressPrinter instance as follows −


progress_writer = ProgressPrinter(0)

Step 6 − At last, we need to invoke the train method on the loss as follows −


train_history = loss.train(minibatch_source,
   parameter_learners=[learner],
   model_inputs_to_streams=input_map,
   callbacks=[progress_writer],
   epoch_size=samples_per_epoch,
   max_epochs=num_epochs)

Feeding the data - Complete implementation example


from cntk.logging import ProgressPrinter
from cntk.train import Trainer, training_session

# Constants controlling the training session
minibatch_size = 16
samples_per_epoch = 150
num_epochs = 30

# Map the network's input variables to the streams in the minibatch source
input_map = {
   features: minibatch_source.streams.features,
   labels: minibatch_source.streams.labels
}

progress_writer = ProgressPrinter(0)
train_history = loss.train(minibatch_source,
   parameter_learners=[learner],
   model_inputs_to_streams=input_map,
   callbacks=[progress_writer],
   epoch_size=samples_per_epoch,
   max_epochs=num_epochs)

Output


-------------------------------------------------------------------
average   since   average   since  examples
loss      last     metric   last
------------------------------------------------------
Learning rate per minibatch: 0.1
1.21      1.21      0        0       32
1.15      0.12      0        0       96
[………]

Translated from: https://www.tutorialspoint.com/microsoft_cognitive_toolkit/microsoft_cognitive_toolkit_in_memory_and_large_datasets.htm
