cunzai1985

java--ml 时间序列_时间序列-快速指南

java--ml 时间序列

时间序列-快速指南 (Time Series - Quick Guide)

时间序列-简介 (Time Series - Introduction)

A time series is a sequence of observations over a certain period. A univariate time series consists of the values taken by a single variable at periodic time instances over a period, and a multivariate time series consists of the values taken by multiple variables at the same periodic time instances over a period. The simplest example of a time series that all of us come across on a day to day basis is the change in temperature throughout the day or week or month or year.

时间序列是在一定时期内的一系列观察结果。单变量时间序列由一个变量在一个周期内的定期时间实例所取的值组成，而多元时间序列由多个变量在一个周期内的相同周期性时间实例所取的值组成。我们每个人每天遇到的时间序列的最简单示例是整个一天，一周，一个月或一年中温度的变化。

The analysis of temporal data is capable of giving us useful insights on how a variable changes over time, or how it depends on the change in the values of other variable(s). This relationship of a variable on its previous values and/or other variables can be analyzed for time series forecasting and has numerous applications in artificial intelligence.

时间数据的分析能够为我们提供有用的见解，以了解变量如何随时间变化，或者变量如何取决于其他变量的值变化。可以将变量与其先前值和/或其他变量的这种关系进行分析，以进行时间序列预测，并在人工智能中具有许多应用。

时间序列-编程语言 (Time Series - Programming Languages)

A basic understanding of any programming language is essential for a user to work with or develop machine learning problems. A list of preferred programming languages for anyone who wants to work on machine learning is given below −

对任何编程语言的基本理解对于用户处理或发展机器学习问题都是必不可少的。下面列出了想要从事机器学习的任何人的首选编程语言列表-

Python (Python)

It is a high-level interpreted programming language, fast and easy to code. Python can follow either procedural or object-oriented programming paradigms. The presence of a variety of libraries makes implementation of complicated procedures simpler. In this tutorial, we will be coding in Python and the corresponding libraries useful for time series modelling will be discussed in the upcoming chapters.

它是一种高级解释型编程语言，可以快速，轻松地编写代码。 Python可以遵循过程式或面向对象的编程范例。各种库的存在使复杂过程的实现更加简单。在本教程中，我们将使用Python进行编码，并将在接下来的章节中讨论对时间序列建模有用的相应库。

[R (R)

Similar to Python, R is an interpreted multi-paradigm language, which supports statistical computing and graphics. The variety of packages makes it easier to implement machine learning modelling in R.

与Python相似，R是一种解释型多范式语言，支持统计计算和图形。各种软件包使在R中更容易实现机器学习建模。

Java (Java)

It is an interpreted object-oriented programming language, which is widely famous for a large range of package availability and sophisticated data visualization techniques.

它是一种解释型的面向对象的编程语言，以广泛的软件包可用性和复杂的数据可视化技术而闻名。

C / C ++ (C/C++)

These are compiled languages, and two of the oldest programming languages. These languages are often preferred to incorporate ML capabilities in the already existing applications as they allow you to customize the implementation of ML algorithms easily.

这些是编译语言，也是两种最古老的编程语言。通常首选使用这些语言将ML功能集成到现有应用程序中，因为它们使您可以轻松自定义ML算法的实现。

的MATLAB (MATLAB)

MATrix LABoratory is a multi-paradigm language which gives functioning to work with matrices. It allows mathematical operations for complex problems. It is primarily used for numerical operations but some packages also allow the graphical multi-domain simulation and model-based design.

MATrix LABoratory是一种多范式语言，可为使用矩阵提供功能。它允许对复杂问题进行数学运算。它主要用于数值运算，但某些软件包还允许图形化多域仿真和基于模型的设计。

Other preferred programming languages for machine learning problems include JavaScript, LISP, Prolog, SQL, Scala, Julia, SAS etc.

针对机器学习问题的其他首选编程语言包括JavaScript，LISP，Prolog，SQL，Scala，Julia，SAS等。

时间序列-Python库 (Time Series - Python Libraries)

Python has an established popularity among individuals who perform machine learning because of its easy-to-write and easy-to-understand code structure as well as a wide variety of open source libraries. A few of such open source libraries that we will be using in the coming chapters have been introduced below.

Python因其易于编写和易于理解的代码结构以及各种各样的开源库而在执行机器学习的个人中广受欢迎。下面介绍了一些我们将在接下来的章节中使用的开源库。

NumPy (NumPy)

Numerical Python is a library used for scientific computing. It works on an N-dimensional array object and provides basic mathematical functionality such as size, shape, mean, standard deviation, minimum, maximum as well as some more complex functions such as linear algebraic functions and Fourier transform. You will learn more about these as we move ahead in this tutorial.

数值Python是用于科学计算的库。它适用于N维数组对象，并提供基本的数学功能，例如大小，形状，均值，标准差，最小值，最大值以及一些更复杂的函数，例如线性代数函数和傅立叶变换。随着本教程的深入，您将学到更多有关这些的知识。

大熊猫 (Pandas)

This library provides highly efficient and easy-to-use data structures such as series, dataframes and panels. It has enhanced Python’s functionality from mere data collection and preparation to data analysis. The two libraries, Pandas and NumPy, make any operation on small to very large dataset very simple. To know more about these functions, follow this tutorial.

该库提供了高效，易于使用的数据结构，例如系列，数据框和面板。从单纯的数据收集和准备到数据分析，它增强了Python的功能。 Pandas和NumPy这两个库使对小到非常大的数据集的任何操作都非常简单。要了解有关这些功能的更多信息，请遵循本教程。

科学 (SciPy)

Science Python is a library used for scientific and technical computing. It provides functionalities for optimization, signal and image processing, integration, interpolation and linear algebra. This library comes handy while performing machine learning. We will discuss these functionalities as we move ahead in this tutorial.

科学Python是用于科学和技术计算的库。它提供了优化，信号和图像处理，积分，插值和线性代数的功能。该库在执行机器学习时非常方便。在本教程中，我们将讨论这些功能。

Scikit学习 (Scikit Learn)

This library is a SciPy Toolkit widely used for statistical modelling, machine learning and deep learning, as it contains various customizable regression, classification and clustering models. It works well with Numpy, Pandas and other libraries which makes it easier to use.

该库是一个SciPy工具包，因为它包含各种可自定义的回归，分类和聚类模型，因此广泛用于统计建模，机器学习和深度学习。它与Numpy，Pandas和其他库配合使用，使其更易于使用。

统计模型 (Statsmodels)

Like Scikit Learn, this library is used for statistical data exploration and statistical modelling. It also operates well with other Python libraries.

与Scikit Learn一样，该库也用于统计数据探索和统计建模。它也可以与其他Python库一起很好地运行。

Matplotlib (Matplotlib)

This library is used for data visualization in various formats such as line plot, bar graph, heat maps, scatter plots, histogram etc. It contains all the graph related functionalities required from plotting to labelling. We will discuss these functionalities as we move ahead in this tutorial.

该库用于各种格式的数据可视化，例如折线图，条形图，热图，散点图，直方图等。它包含从绘图到标记所需的所有图形相关功能。在本教程中，我们将讨论这些功能。

These libraries are very essential to start with machine learning with any sort of data.

这些库对于从任何类型的数据开始机器学习都是非常重要的。

Beside the ones discussed above, another library especially significant to deal with time series is −

除了上面讨论的，另一个对时间序列特别重要的库是-

约会时间 (Datetime)

This library, with its two modules − datetime and calendar, provides all necessary datetime functionality for reading, formatting and manipulating time.

该库及其两个模块(日期时间和日历)提供了用于读取，格式化和处理时间的所有必需的日期时间功能。

We shall be using these libraries in the coming chapters.

在接下来的章节中，我们将使用这些库。

时间序列-数据处理和可视化 (Time Series - Data Processing and Visualization)

Time Series is a sequence of observations indexed in equi-spaced time intervals. Hence, the order and continuity should be maintained in any time series.

时间序列是按等距时间间隔索引的观察序列。因此，应该在任何时间序列中保持顺序和连续性。

The dataset we will be using is a multi-variate time series having hourly data for approximately one year, for air quality in a significantly polluted Italian city. The dataset can be downloaded from the link given below − https://archive.ics.uci.edu/ml/datasets/air+quality.

我们将使用的数据集是一个多变量时间序列，其中包含大约一年的每小时数据，用于一个污染严重的意大利城市的空气质量。可以从下面给出的链接中下载数据集-https: //archive.ics.uci.edu/ml/datasets/air+quality 。

It is necessary to make sure that −

有必要确保-

The time series is equally spaced, and
时间序列等距分布，并且
There are no redundant values or gaps in it.
没有多余的值或空白。

In case the time series is not continuous, we can upsample or downsample it.

如果时间序列不是连续的，我们可以对其进行上采样或下采样。

显示df.head() (Showing df.head())

In [122]:

在[122]中：


import pandas

In [123]:

在[123]中：


df = pandas.read_csv("AirQualityUCI.csv", sep = ";", decimal = ",")
df = df.iloc[ : , 0:14]

In [124]:

在[124]中：


len(df)

Out[124]:

出[124]：

In [125]:

在[125]中：


df.head()

Out[125]:

出[125]：

For preprocessing the time series, we make sure there are no NaN(NULL) values in the dataset; if there are, we can replace them with either 0 or average or preceding or succeeding values. Replacing is a preferred choice over dropping so that the continuity of the time series is maintained. However, in our dataset the last few values seem to be NULL and hence dropping will not affect the continuity.

为了预处理时间序列，我们确保数据集中没有NaN(NULL)值；如果有的话，我们可以将它们替换为0或平均值或之前或之后的值。与丢弃相比，替换是首选选择，这样可以保持时间序列的连续性。但是，在我们的数据集中，最后几个值似乎为NULL，因此删除不会影响连续性。

删除NaN(非数字) (Dropping NaN(Not-a-Number))

In [126]:

在[126]中：


df.isna().sum()
Out[126]:
Date             114
Time             114
CO(GT)           114
PT08.S1(CO)      114
NMHC(GT)         114
C6H6(GT)         114
PT08.S2(NMHC)    114
NOx(GT)          114
PT08.S3(NOx)     114
NO2(GT)          114
PT08.S4(NO2)     114
PT08.S5(O3)      114
T                114
RH               114
dtype: int64

In [127]:

在[127]中：


df = df[df['Date'].notnull()]

In [128]:

在[128]中：


df.isna().sum()

Out[128]:

出[128]：


Date             0
Time             0
CO(GT)           0
PT08.S1(CO)      0
NMHC(GT)         0
C6H6(GT)         0
PT08.S2(NMHC)    0
NOx(GT)          0
PT08.S3(NOx)     0
NO2(GT)          0
PT08.S4(NO2)     0
PT08.S5(O3)      0
T                0
RH               0
dtype: int64

Time Series are usually plotted as line graphs against time. For that we will now combine the date and time column and convert it into a datetime object from strings. This can be accomplished using the datetime library.

时间序列通常绘制为相对于时间的折线图。为此，我们现在将合并date和time列，并将其从字符串转换为datetime对象。这可以使用datetime库来完成。

转换为日期时间对象 (Converting to datetime object)

In [129]:

在[129]中：


df['DateTime'] = (df.Date) + ' ' + (df.Time)
print (type(df.DateTime[0]))

In [130]:

在[130]中：


import datetime

df.DateTime = df.DateTime.apply(lambda x: datetime.datetime.strptime(x, '%d/%m/%Y %H.%M.%S'))
print (type(df.DateTime[0]))

Let us see how some variables like temperature changes with change in time.

让我们看看温度等一些变量如何随时间变化。

显示情节 (Showing plots)

In [131]:

在[131]中：


df.index = df.DateTime

In [132]:

在[132]中：


import matplotlib.pyplot as plt
plt.plot(df['T'])

Out[132]:

出[132]：

[]

In [208]:

在[208]中：


plt.plot(df['C6H6(GT)'])

Out[208]:

出[208]：

[]

Box-plots are another useful kind of graphs that allow you to condense a lot of information about a dataset into a single graph. It shows the mean, 25% and 75% quartile and outliers of one or multiple variables. In the case when number of outliers is few and is very distant from the mean, we can eliminate the outliers by setting them to mean value or 75% quartile value.

箱形图是另一种有用的图形，它使您可以将有关数据集的许多信息压缩到单个图形中。它显示一个或多个变量的均值，25％和75％的四分位数和离群值。在离群数很少且与均值相距甚远的情况下，我们可以通过将其设置为均值或75％四分位数来消除离群值。

显示箱线图 (Showing Boxplots)

In [134]:

在[134]中：


plt.boxplot(df[['T','C6H6(GT)']].values)

Out[134]:

出[134]：


{'whiskers': [,
   ,
   ,
   ],
   'caps': [,
   ,
   ,
   ],
   'boxes': [,
   ],
   'medians': [,
   ],
   'fliers': [,
   ],'means': []
}

时间序列-建模 (Time Series - Modeling)

介绍 (Introduction)

A time series has 4 components as given below −

时间序列具有4个成分，如下所示-

Level − It is the mean value around which the series varies.
水平 -它是序列变化的平均值。
Trend − It is the increasing or decreasing behavior of a variable with time.
趋势 -它是变量随时间的增加或减少行为。
Seasonality − It is the cyclic behavior of time series.
季节性 -这是时间序列的周期性行为。
Noise − It is the error in the observations added due to environmental factors.
噪声 -这是由于环境因素导致的观测值误差。

时间序列建模技术 (Time Series Modeling Techniques)

To capture these components, there are a number of popular time series modelling techniques. This section gives a brief introduction of each technique, however we will discuss about them in detail in the upcoming chapters −

为了捕获这些组件，有许多流行的时间序列建模技术。本节简要介绍了每种技术，但是我们将在接下来的章节中详细讨论它们-

幼稚的方法 (Naïve Methods)

These are simple estimation techniques, such as the predicted value is given the value equal to mean of preceding values of the time dependent variable, or previous actual value. These are used for comparison with sophisticated modelling techniques.

这些是简单的估算技术，例如，给预测值等于时间相关变量的先前值或先前实际值的平均值。这些用于与复杂的建模技术进行比较。

自回归 (Auto Regression)

Auto regression predicts the values of future time periods as a function of values at previous time periods. Predictions of auto regression may fit the data better than that of naïve methods, but it may not be able to account for seasonality.

自动回归功能将根据先前时间段的值来预测未来时间段的值。自回归的预测可能比朴素的方法更适合数据，但可能无法说明季节性。

ARIMA模型 (ARIMA Model)

An auto-regressive integrated moving-average models the value of a variable as a linear function of previous values and residual errors at previous time steps of a stationary timeseries. However, the real world data may be non-stationary and have seasonality, thus Seasonal-ARIMA and Fractional-ARIMA were developed. ARIMA works on univariate time series, to handle multiple variables VARIMA was introduced.

自回归积分移动平均值将变量的值建模为固定时间序列的先前时间步长上先前值和残差的线性函数。但是，现实世界的数据可能是不稳定的，并且具有季节性，因此开发了Seasonal-ARIMA和Fractional-ARIMA。 ARIMA研究单变量时间序列，以处理多个变量。

指数平滑 (Exponential Smoothing)

It models the value of a variable as an exponential weighted linear function of previous values. This statistical model can handle trend and seasonality as well.

它将变量的值建模为先前值的指数加权线性函数。该统计模型还可以处理趋势和季节性。

LSTM (LSTM)

Long Short-Term Memory model (LSTM) is a recurrent neural network which is used for time series to account for long term dependencies. It can be trained with large amount of data to capture the trends in multi-variate time series.

长短期记忆模型(LSTM)是一个递归神经网络，用于时间序列以解决长期依赖性。可以使用大量数据对其进行训练，以捕获多元时间序列中的趋势。

The said modelling techniques are used for time series regression. In the coming chapters, let us now explore all these one by one.

所述建模技术用于时间序列回归。在接下来的章节中，让我们现在逐一探讨所有这些。

时间序列-参数校准 (Time Series - Parameter Calibration)

介绍 (Introduction)

Any statistical or machine learning model has some parameters which greatly influence how the data is modeled. For example, ARIMA has p, d, q values. These parameters are to be decided such that the error between actual values and modeled values is minimum. Parameter calibration is said to be the most crucial and time-consuming task of model fitting. Hence, it is very essential for us to choose optimal parameters.

任何统计或机器学习模型都具有一些参数，这些参数会极大地影响数据建模的方式。例如，ARIMA具有p，d，q值。确定这些参数，以使实际值和建模值之间的误差最小。据说参数校准是模型拟合的最关键和最耗时的任务。因此，选择最优参数对我们来说至关重要。

参数校准方法 (Methods for Calibration of Parameters)

There are various ways to calibrate parameters. This section talks about some of them in detail.

有多种校准参数的方法。本节详细讨论其中的一些。

尝试 (Hit-and-try)

One common way of calibrating models is hand calibration, where you start by visualizing the time-series and intuitively try some parameter values and change them over and over until you achieve a good enough fit. It requires a good understanding of the model we are trying. For ARIMA model, hand calibration is done with the help of auto-correlation plot for ‘p’ parameter, partial auto-correlation plot for ‘q’ parameter and ADF-test to confirm the stationarity of time-series and setting ‘d’ parameter. We will discuss all these in detail in the coming chapters.

校准模型的一种常见方法是手动校准，从可视化时间序列开始，然后直观地尝试一些参数值，并不断地更改它们，直到达到足够好的拟合度为止。它需要对我们正在尝试的模型有很好的了解。对于ARIMA模型，借助“ p”参数的自相关图，“ q”参数的部分自相关图和ADF测试进行手动校准，以确认时间序列的平稳性并设置“ d”参数。在接下来的章节中，我们将详细讨论所有这些内容。

网格搜索 (Grid Search)

Another way of calibrating models is by grid search, which essentially means you try building a model for all possible combinations of parameters and select the one with minimum error. This is time-consuming and hence is useful when number of parameters to be calibrated and range of values they take are fewer as this involves multiple nested for loops.

校准模型的另一种方法是通过网格搜索，这实际上意味着您尝试为所有可能的参数组合构建模型并选择误差最小的模型。这很耗时，因此在要校准的参数数量和所取值范围较小时很有用，因为这涉及多个嵌套的for循环。

遗传算法 (Genetic Algorithm)

Genetic algorithm works on the biological principle that a good solution will eventually evolve to the most ‘optimal’ solution. It uses biological operations of mutation, cross-over and selection to finally reach to an optimal solution.

遗传算法基于生物学原理，即良好的解决方案最终将演变为最“最佳”的解决方案。它使用突变，交叉和选择的生物学操作最终达到最佳解决方案。

For further knowledge you can read about other parameter optimization techniques like Bayesian optimization and Swarm optimization.

有关更多的知识，您可以阅读有关其他参数优化技术的信息，例如贝叶斯优化和Swarm优化。

时间序列-天真的方法 (Time Series - Naïve Methods)

介绍 (Introduction)

Naïve Methods such as assuming the predicted value at time ‘t’ to be the actual value of the variable at time ‘t-1’ or rolling mean of series, are used to weigh how well do the statistical models and machine learning models can perform and emphasize their need.

单纯的方法(例如假设在时间t的预测值是时间t-1的变量的实际值或序列的滚动平均值)用于衡量统计模型和机器学习模型的执行效果并强调他们的需求。

In this chapter, let us try these models on one of the features of our time-series data.

在本章中，让我们在时间序列数据的功能之一上尝试这些模型。

First we shall see the mean of the ‘temperature’ feature of our data and the deviation around it. It is also useful to see maximum and minimum temperature values. We can use the functionalities of numpy library here.

首先，我们将看到数据“温度”特征的均值及其周围的偏差。查看最大和最小温度值也很有用。我们可以在这里使用numpy库的功能。

显示统计 (Showing statistics)

In [135]:

在[135]中：


import numpy
print (
   'Mean: ',numpy.mean(df['T']), '; 
   Standard Deviation: ',numpy.std(df['T']),'; 
   \nMaximum Temperature: ',max(df['T']),'; 
   Minimum Temperature: ',min(df['T'])
)

We have the statistics for all 9357 observations across equi-spaced timeline which are useful for us to understand the data.

我们拥有等距时间轴上所有9357个观测值的统计信息，这对于我们理解数据很有用。

Now we will try the first naïve method, setting the predicted value at present time equal to actual value at previous time and calculate the root mean squared error(RMSE) for it to quantify the performance of this method.

现在，我们将尝试第一种天真的方法，将当前时间的预测值设置为与先前时间的实际值相等，并为其计算均方根误差(RMSE)以量化该方法的性能。

显示^第一种天真的方法 (Showing 1^st naïve method)

In [136]:

在[136]中：


df['T']
df['T_t-1'] = df['T'].shift(1)

In [137]:

在[137]中：


df_naive = df[['T','T_t-1']][1:]

In [138]:

在[138]中：


from sklearn import metrics
from math import sqrt

true = df_naive['T']
prediction = df_naive['T_t-1']
error = sqrt(metrics.mean_squared_error(true,prediction))
print ('RMSE for Naive Method 1: ', error)

RMSE for Naive Method 1: 12.901140576492974

原始方法1的RMSE：12.901140576492974

Let us see the next naïve method, where predicted value at present time is equated to the mean of the time periods preceding it. We will calculate the RMSE for this method too.

让我们看看下一个朴素的方法，其中当前的预测值等于它之前的时间段的平均值。我们还将为此方法计算RMSE。

显示^第二种天真的方法 (Showing 2^nd naïve method)

In [139]:

在[139]中：


df['T_rm'] = df['T'].rolling(3).mean().shift(1)
df_naive = df[['T','T_rm']].dropna()

In [140]:

在[140]中：


true = df_naive['T']
prediction = df_naive['T_rm']
error = sqrt(metrics.mean_squared_error(true,prediction))
print ('RMSE for Naive Method 2: ', error)

RMSE for Naive Method 2: 14.957633272839242

原始方法2的RMSE：14.957633272839242

Here, you can experiment with various number of previous time periods also called ‘lags’ you want to consider, which is kept as 3 here. In this data it can be seen that as you increase the number of lags and error increases. If lag is kept 1, it becomes same as the naïve method used earlier.

在这里，您可以尝试各种数量的先前时间段，也就是您要考虑的“滞后时间”，此处保持为3。从该数据可以看出，随着滞后次数的增加和误差的增加。如果将滞后保持为1，它将与之前使用的朴素方法相同。

Points to Note

注意事项

You can write a very simple function for calculating root mean squared error. Here, we have used the mean squared error function from the package ‘sklearn’ and then taken its square root.
您可以编写一个非常简单的函数来计算均方根误差。在这里，我们使用了软件包“ sklearn”中的均方误差函数，然后取其平方根。
In pandas df[‘column_name’] can also be written as df.column_name, however for this dataset df.T will not work the same as df[‘T’] because df.T is the function for transposing a dataframe. So use only df[‘T’] or consider renaming this column before using the other syntax.
在熊猫中，df ['column_name']也可以写为df.column_name，但是对于此数据集df.T不能与df ['T']相同，因为df.T是用于转置数据帧的功能。因此，仅使用df ['T']或考虑在使用其他语法之前重命名此列。

时间序列-自动回归 (Time Series - Auto Regression)

For a stationary time series, an auto regression models sees the value of a variable at time ‘t’ as a linear function of values ‘p’ time steps preceding it. Mathematically it can be written as −

对于固定时间序列，自动回归模型将时间“ t”处的变量值视为值“ p”时间步长的线性函数。数学上可以写成-

$$y_{t} = \:C+\:\phi_{1}y_{t-1}\:+\:\phi_{2}Y_{t-2}+...+\phi_{p}y_{t-p}+\epsilon_{t}$$

$$ y_ {t} = \：C + \：\ phi_ {1} y_ {t-1} \：+ \：\ phi_ {2} Y_ {t-2} + ... + \ phi_ {p} y_ {tp} + \ epsilon_ {t} $$

Where,‘p’ is the auto-regressive trend parameter

其中， “ p”是自回归趋势参数

$\epsilon_{t}$ is white noise, and

$ \ epsilon_ {t} $是白噪声，并且

$y_{t-1}, y_{t-2}\:\: ...y_{t-p}$ denote the value of variable at previous time periods.

$ y_ {t-1}，y_ {t-2} \：\：... y_ {tp} $表示前一个时间段的变量值。

The value of p can be calibrated using various methods. One way of finding the apt value of ‘p’ is plotting the auto-correlation plot.

p的值可以使用各种方法进行校准。找到“ p”的合适值的一种方法是绘制自相关图。

Note − We should separate the data into train and test at 8:2 ratio of total data available prior to doing any analysis on the data because test data is only to find out the accuracy of our model and assumption is, it is not available to us until after predictions have been made. In case of time series, sequence of data points is very essential so one should keep in mind not to lose the order during splitting of data.

注意 -在对数据进行任何分析之前，我们应将数据分成总数据的8：2比例进行训练和测试，因为测试数据仅是为了找出我们模型的准确性，而假设是，我们直到做出预测之后。对于时间序列，数据点的顺序非常重要，因此应牢记不要在数据拆分期间丢失顺序。

An auto-correlation plot or a correlogram shows the relation of a variable with itself at prior time steps. It makes use of Pearson’s correlation and shows the correlations within 95% confidence interval. Let’s see how it looks like for ‘temperature’ variable of our data.

自相关图或相关图显示了先前时间步长处变量与自身的关系。它利用了Pearson的相关性，并显示了95％置信区间内的相关性。让我们看看数据的“温度”变量的样子。

显示ACP (Showing ACP)

In [141]:

在[141]中：


split = len(df) - int(0.2*len(df))
train, test = df['T'][0:split], df['T'][split:]

In [142]:

在[142]中：


from statsmodels.graphics.tsaplots import plot_acf

plot_acf(train, lags = 100)
plt.show()

All the lag values lying outside the shaded blue region are assumed to have a csorrelation.

假定位于蓝色阴影区域之外的所有滞后值都具有反相关关系。

时间序列-移动平均 (Time Series - Moving Average)

For a stationary time series, a moving average model sees the value of a variable at time ‘t’ as a linear function of residual errors from ‘q’ time steps preceding it. The residual error is calculated by comparing the value at the time ‘t’ to moving average of the values preceding.

对于固定的时间序列，移动平均模型将时间“ t”处的变量值视为来自其前“ q”个时间步长的残差线性函数。通过将时间“ t”处的值与先前值的移动平均值进行比较，可以计算出残余误差。

Mathematically it can be written as −

数学上可以写成-

$$y_{t} = c\:+\:\epsilon_{t}\:+\:\theta_{1}\:\epsilon_{t-1}\:+\:\theta_{2}\:\epsilon_{t-2}\:+\:...+:\theta_{q}\:\epsilon_{t-q}\:$$

$$ y_ {t} = c \：+ \：\ epsilon_ {t} \：+ \：\ theta_ {1} \：\ epsilon_ {t-1} \：+ \：\ theta_ {2} \：\ epsilon_ {t-2} \：+ \：... +：\ theta_ {q} \：\ epsilon_ {tq} \：$$

Where‘q’ is the moving-average trend parameter

其中“ q”是移动平均趋势参数

$\epsilon_{t}$ is white noise, and

$ \ epsilon_ {t} $是白噪声，并且

$\epsilon_{t-1}, \epsilon_{t-2}...\epsilon_{t-q}$ are the error terms at previous time periods.

$ \ epsilon_ {t-1}，\ epsilon_ {t-2} ... \ epsilon_ {tq} $是先前时间段的误差项。

Value of ‘q’ can be calibrated using various methods. One way of finding the apt value of ‘q’ is plotting the partial auto-correlation plot.

可以使用多种方法来校准“ q”的值。找到“ q”的合适值的一种方法是绘制局部自相关图。

A partial auto-correlation plot shows the relation of a variable with itself at prior time steps with indirect correlations removed, unlike auto-correlation plot which shows direct as well as indirect correlations, let’s see how it looks like for ‘temperature’ variable of our data.

部分自相关图显示了变量在先前时间步长上与自身之间的关系，其中间接相关性已删除，这与自相关图显示了直接相关性和间接相关性不同，让我们来看一下“温度”变量的样子数据。

显示PACP (Showing PACP)

In [143]:

在[143]中：


from statsmodels.graphics.tsaplots import plot_pacf

plot_pacf(train, lags = 100)
plt.show()

A partial auto-correlation is read in the same way as a correlogram.

以与相关图相同的方式读取部分自相关。

时间序列-ARIMA (Time Series - ARIMA)

We have already understood that for a stationary time series a variable at time ‘t’ is a linear function of prior observations or residual errors. Hence it is time for us to combine the two and have an Auto-regressive moving average (ARMA) model.

我们已经了解，对于平稳的时间序列，时间“ t”处的变量是先前观测值或残差的线性函数。因此，现在是我们将两者结合起来并拥有自回归移动平均值(ARMA)模型的时候了。

However, at times the time series is not stationary, i.e the statistical properties of a series like mean, variance changes over time. And the statistical models we have studied so far assume the time series to be stationary, therefore, we can include a pre-processing step of differencing the time series to make it stationary. Now, it is important for us to find out whether the time series we are dealing with is stationary or not.

但是，有时时间序列不是固定的，即序列的统计属性(例如均值，方差)会随时间变化。到目前为止，我们研究的统计模型都假设时间序列是固定的，因此，我们可以包括一个预处理步骤，使该时间序列微分以使其稳定。现在，重要的是要弄清我们正在处理的时间序列是否固定。

Various methods to find the stationarity of a time series are looking for seasonality or trend in the plot of time series, checking the difference in mean and variance for various time periods, Augmented Dickey-Fuller (ADF) test, KPSS test, Hurst’s exponent etc.

查找时间序列平稳性的各种方法是在时间序列图中查找季节性或趋势，检查各个时间段的均值和方差的差异，增强Dickey-Fuller(ADF)测试，KPSS测试，Hurst指数等。

Let us see whether the ‘temperature’ variable of our dataset is a stationary time series or not using ADF test.

让我们使用ADF测试来查看数据集的“温度”变量是否为固定时间序列。

In [74]:

在[74]中：


from statsmodels.tsa.stattools import adfuller

result = adfuller(train)
print('ADF Statistic: %f' % result[0])
print('p-value: %f' % result[1])
print('Critical Values:')
for key, value In result[4].items()
   print('\t%s: %.3f' % (key, value))

ADF Statistic: -10.406056

ADF统计：-10.406056

p-value: 0.000000

p值：0.000000

Critical Values:

关键值：

1%: -3.431

1％：-3.431

5%: -2.862

5％：-2.862

10%: -2.567

10％：-2.567

Now that we have run the ADF test, let us interpret the result. First we will compare the ADF Statistic with the critical values, a lower critical value tells us the series is most likely non-stationary. Next, we see the p-value. A p-value greater than 0.05 also suggests that the time series is non-stationary.

现在我们已经运行了ADF测试，让我们解释结果。首先，我们将ADF统计量与临界值进行比较，较低的临界值告诉我们该序列很可能是不稳定的。接下来，我们看到p值。如果p值大于0.05，则表明时间序列是非平稳的。

Alternatively, p-value less than or equal to 0.05, or ADF Statistic less than critical values suggest the time series is stationary.

或者，p值小于或等于0.05，或ADF统计量小于临界值表明时间序列是固定的。

Hence, the time series we are dealing with is already stationary. In case of stationary time series, we set the ‘d’ parameter as 0.

因此，我们正在处理的时间序列已经静止。在固定时间序列的情况下，我们将“ d”参数设置为0。

We can also confirm the stationarity of time series using Hurst exponent.

我们还可以使用Hurst指数来确认时间序列的平稳性。

In [75]:

在[75]中：


import hurst

H, c,data = hurst.compute_Hc(train)
print("H = {:.4f}, c = {:.4f}".format(H,c))

H = 0.1660, c = 5.0740

H = 0.1660，c = 5.0740

The value of H<0.5 shows anti-persistent behavior, and H>0.5 shows persistent behavior or a trending series. H=0.5 shows random walk/Brownian motion. The value of H<0.5, confirming that our series is stationary.

H <0.5的值表示持久性行为，H> 0.5的值表示持久性行为或趋势序列。 H = 0.5表示随机行走/布朗运动。 H的值<0.5，证实我们的序列是平稳的。

For non-stationary time series, we set ‘d’ parameter as 1. Also, the value of the auto-regressive trend parameter ‘p’ and the moving average trend parameter ‘q’, is calculated on the stationary time series i.e by plotting ACP and PACP after differencing the time series.

对于非平稳时间序列，我们将“ d”参数设置为1。此外，自动回归趋势参数“ p”和移动平均趋势参数“ q”的值是在平稳时间序列上计算的，即通过绘制区分时间序列后的ACP和PACP。

ARIMA Model, which is characterized by 3 parameter, (p,d,q) are now clear to us, so let us model our time series and predict the future values of temperature.

现在已经清楚了以3个参数(p，d，q)为特征的ARIMA模型，因此让我们对时间序列建模并预测温度的未来值。

In [156]:

在[156]中：


from statsmodels.tsa.arima_model import ARIMA

model = ARIMA(train.values, order=(5, 0, 2))
model_fit = model.fit(disp=False)

In [157]:

在[157]中：


predictions = model_fit.predict(len(test))
test_ = pandas.DataFrame(test)
test_['predictions'] = predictions[0:1871]

In [158]:

在[158]中：


plt.plot(df['T'])
plt.plot(test_.predictions)
plt.show()

In [167]:

在[167]中：


error = sqrt(metrics.mean_squared_error(test.values,predictions[0:1871]))
print ('Test RMSE for ARIMA: ', error)

Test RMSE for ARIMA: 43.21252940234892

测试ARIMA的RMSE：43.21252940234892

时间序列-ARIMA的变化 (Time Series - Variations of ARIMA)

In the previous chapter, we have now seen how ARIMA model works, and its limitations that it cannot handle seasonal data or multivariate time series and hence, new models were introduced to include these features.

在上一章中，我们现在看到了ARIMA模型的工作原理，以及它不能处理季节性数据或多元时间序列的局限性，因此引入了包含这些功能的新模型。

A glimpse of these new models is given here −

这些新模型的概览在这里给出-

向量自回归(VAR) (Vector Auto-Regression (VAR))

It is a generalized version of auto regression model for multivariate stationary time series. It is characterized by ‘p’ parameter.

它是用于多元平稳时间序列的自动回归模型的通用版本。它以“ p”参数为特征。

向量移动平均线(VMA) (Vector Moving Average (VMA))

It is a generalized version of moving average model for multivariate stationary time series. It is characterized by ‘q’ parameter.

它是用于多元平稳时间序列的移动平均模型的广义版本。它以“ q”参数为特征。

向量自回归移动平均值(VARMA) (Vector Auto Regression Moving Average (VARMA))

It is the combination of VAR and VMA and a generalized version of ARMA model for multivariate stationary time series. It is characterized by ‘p’ and ‘q’ parameters. Much like, ARMA is capable of acting like an AR model by setting ‘q’ parameter as 0 and as a MA model by setting ‘p’ parameter as 0, VARMA is also capable of acting like an VAR model by setting ‘q’ parameter as 0 and as a VMA model by setting ‘p’ parameter as 0.

它是VAR和VMA的组合，以及用于多元平稳时间序列的ARMA模型的广义版本。它的特征是“ p”和“ q”参数。很像，ARMA可以通过将“ q”参数设置为0来充当AR模型，而通过将“ p”参数设置为0来作为MA模型，VARMA也可以通过设置“ q”参数来充当VAR模型。通过将“ p”参数设置为0，将其设置为0并作为VMA模型。

In [209]:

在[209]中：


df_multi = df[['T', 'C6H6(GT)']]
split = len(df) - int(0.2*len(df))
train_multi, test_multi = df_multi[0:split], df_multi[split:]

In [211]:

在[211]中：


from statsmodels.tsa.statespace.varmax import VARMAX

model = VARMAX(train_multi, order = (2,1))
model_fit = model.fit()
c:\users\naveksha\appdata\local\programs\python\python37\lib\site-packages\statsmodels\tsa\statespace\varmax.py:152: 
   EstimationWarning: Estimation of VARMA(p,q) models is not generically robust, 
   due especially to identification issues. 
   EstimationWarning)
c:\users\naveksha\appdata\local\programs\python\python37\lib\site-packages\statsmodels\tsa\base\tsa_model.py:171: 
   ValueWarning: No frequency information was provided, so inferred frequency H will be used. 
  % freq, ValueWarning)
c:\users\naveksha\appdata\local\programs\python\python37\lib\site-packages\statsmodels\base\model.py:508: 
   ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals 
  "Check mle_retvals", ConvergenceWarning)

In [213]:

在[213]中：


predictions_multi = model_fit.forecast( steps=len(test_multi))
c:\users\naveksha\appdata\local\programs\python\python37\lib\site-packages\statsmodels\tsa\base\tsa_model.py:320: 
   FutureWarning: Creating a DatetimeIndex by passing range endpoints is deprecated.  Use `pandas.date_range` instead.
   freq = base_index.freq)
c:\users\naveksha\appdata\local\programs\python\python37\lib\site-packages\statsmodels\tsa\statespace\varmax.py:152: 
   EstimationWarning: Estimation of VARMA(p,q) models is not generically robust, due especially to identification issues.
   EstimationWarning)

In [231]:

在[231]中：


plt.plot(train_multi['T'])
plt.plot(test_multi['T'])
plt.plot(predictions_multi.iloc[:,0:1], '--')
plt.show()

plt.plot(train_multi['C6H6(GT)'])
plt.plot(test_multi['C6H6(GT)'])
plt.plot(predictions_multi.iloc[:,1:2], '--')
plt.show()

The above code shows how VARMA model can be used to model multivariate time series, although this model may not be best suited on our data.

上面的代码显示了如何使用VARMA模型对多元时间序列进行建模，尽管该模型可能并非最适合我们的数据。

具有外生变量的VARMA(VARMAX) (VARMA with Exogenous Variables (VARMAX))

It is an extension of VARMA model where extra variables called covariates are used to model the primary variable we are interested it.

它是VARMA模型的扩展，其中使用了称为协变量的额外变量来对我们感兴趣的主要变量进行建模。

季节性自回归综合移动平均线(SARIMA) (Seasonal Auto Regressive Integrated Moving Average (SARIMA))

This is the extension of ARIMA model to deal with seasonal data. It divides the data into seasonal and non-seasonal components and models them in a similar fashion. It is characterized by 7 parameters, for non-seasonal part (p,d,q) parameters same as for ARIMA model and for seasonal part (P,D,Q,m) parameters where ‘m’ is the number of seasonal periods and P,D,Q are similar to parameters of ARIMA model. These parameters can be calibrated using grid search or genetic algorithm.

这是ARIMA模型用于处理季节性数据的扩展。它将数据分为季节和非季节成分，并以类似的方式对其进行建模。它的特征在于7个参数，对于非季节性部分(p，d，q)参数与ARIMA模型相同，对于季节性部分(P，D，Q，m)参数，其中'm'是季节性周期数， P，D，Q与ARIMA模型的参数相似。这些参数可以使用网格搜索或遗传算法进行校准。

具有外生变量的SARIMA(SARIMAX) (SARIMA with Exogenous Variables (SARIMAX))

This is the extension of SARIMA model to include exogenous variables which help us to model the variable we are interested in.

这是SARIMA模型的扩展，其中包括外生变量，这些变量有助于我们对感兴趣的变量进行建模。

It may be useful to do a co-relation analysis on variables before putting them as exogenous variables.

在将变量作为外生变量之前，对它们进行关联分析可能会很有用。

In [251]:

在[251]中：


from scipy.stats.stats import pearsonr
x = train_multi['T'].values
y = train_multi['C6H6(GT)'].values

corr , p = pearsonr(x,y)
print ('Corelation Coefficient =', corr,'\nP-Value =',p)
Corelation Coefficient = 0.9701173437269858
P-Value = 0.0

Pearson’s Correlation shows a linear relation between 2 variables, to interpret the results, we first look at the p-value, if it is less that 0.05 then the value of coefficient is significant, else the value of coefficient is not significant. For significant p-value, a positive value of correlation coefficient indicates positive correlation, and a negative value indicates a negative correlation.

皮尔逊相关性显示2个变量之间的线性关系，为了解释结果，我们首先查看p值，如果小于0.05，则系数的值显着，否则系数的值不显着。对于显着的p值，相关系数的正值表示正相关，而负值表示负相关。

Hence, for our data, ‘temperature’ and ‘C6H6’ seem to have a highly positive correlation. Therefore, we will

因此，对于我们的数据，“温度”和“ C6H6”似乎具有高度正相关。因此，我们将

In [297]:

在[297]中：


from statsmodels.tsa.statespace.sarimax import SARIMAX

model = SARIMAX(x, exog = y, order = (2, 0, 2), seasonal_order = (2, 0, 1, 1), enforce_stationarity=False, enforce_invertibility = False)
model_fit = model.fit(disp = False)
c:\users\naveksha\appdata\local\programs\python\python37\lib\site-packages\statsmodels\base\model.py:508: 
   ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
   "Check mle_retvals", ConvergenceWarning)

In [298]:

在[298]中：


y_ = test_multi['C6H6(GT)'].values
predicted = model_fit.predict(exog=y_)
test_multi_ = pandas.DataFrame(test)
test_multi_['predictions'] = predicted[0:1871]

In [299]:

在[299]中：


plt.plot(train_multi['T'])
plt.plot(test_multi_['T'])
plt.plot(test_multi_.predictions, '--')

Out[299]:

出[299]：

[]

The predictions here seem to take larger variations now as opposed to univariate ARIMA modelling.

与单变量ARIMA建模相反，此处的预测现在似乎需要更大的变化。

Needless to say, SARIMAX can be used as an ARX, MAX, ARMAX or ARIMAX model by setting only the corresponding parameters to non-zero values.

不用说，通过仅将相应参数设置为非零值，SARIMAX可以用作ARX，MAX，ARMAX或ARIMAX模型。

分数自回归综合移动平均线(FARIMA) (Fractional Auto Regressive Integrated Moving Average (FARIMA))

At times, it may happen that our series is not stationary, yet differencing with ‘d’ parameter taking the value 1 may over-difference it. So, we need to difference the time series using a fractional value.

有时可能会发生我们的序列不稳定的情况，但是与“ d”参数取值1的差异可能会使它过分差异。因此，我们需要使用小数值来区分时间序列。

In the world of data science there is no one superior model, the model that works on your data depends greatly on your dataset. Knowledge of various models allows us to choose one that work on our data and experimenting with that model to achieve the best results. And results should be seen as plot as well as error metrics, at times a small error may also be bad, hence, plotting and visualizing the results is essential.

在数据科学领域，没有一种上乘的模型，对您的数据起作用的模型在很大程度上取决于您的数据集。对各种模型的了解使我们可以选择一种可以处理数据的模型，并对该模型进行试验以获得最佳结果。结果应视为绘图以及误差度量，有时小误差也可能是不好的，因此，绘图和可视化结果至关重要。

In the next chapter, we will be looking at another statistical model, exponential smoothing.

在下一章中，我们将研究另一种统计模型，即指数平滑。

时间序列-指数平滑 (Time Series - Exponential Smoothing)

In this chapter, we will talk about the techniques involved in exponential smoothing of time series.

在本章中，我们将讨论时间序列的指数平滑所涉及的技术。

简单指数平滑 (Simple Exponential Smoothing)

Exponential Smoothing is a technique for smoothing univariate time-series by assigning exponentially decreasing weights to data over a time period.

指数平滑是一种通过在一段时间内为数据分配指数递减的权重来平滑单变量时间序列的技术。

Mathematically, the value of variable at time ‘t+1’ given value at time t, y_(t+1|t) is defined as −

数学上，在时间t处给定时间t y_(t + 1 | t)的值在时间“ t + 1”处的变量的值定义为-

$$y_{t+1|t}\:=\:\alpha y_{t}\:+\:\alpha\lgroup1 -\alpha\rgroup y_{t-1}\:+\alpha\lgroup1-\alpha\rgroup^{2}\:y_{t-2}\:+\:...+y_{1}$$

$$ y_ {t + 1 | t} \：= \：\ alpha y_ {t} \：+ \：\ alpha \ lgroup1-\ alpha \ rgroup y_ {t-1} \：+ \ alpha \ lgroup1- \ alpha \ rgroup ^ {2} \：y_ {t-2} \：+ \：... + y_ {1} $$

where,$0\leq\alpha \leq1$ is the smoothing parameter, and

其中， $ 0 \ leq \ alpha \ leq1 $是平滑参数，并且

$y_{1},....,y_{t}$ are previous values of network traffic at times 1, 2, 3, … ,t.

$ y_ {1}，....，y_ {t} $是时间1、2、3，...，t的网络流量的先前值。

This is a simple method to model a time series with no clear trend or seasonality. But exponential smoothing can also be used for time series with trend and seasonality.

这是对没有明确趋势或季节性的时间序列建模的简单方法。但是，指数平滑也可以用于具有趋势和季节性的时间序列。

三重指数平滑 (Triple Exponential Smoothing)

Triple Exponential Smoothing (TES) or Holt's Winter method, applies exponential smoothing three times - level smoothing $l_{t}$, trend smoothing $b_{t}$, and seasonal smoothing $S_{t}$, with $\alpha$, $\beta^{*}$ and $\gamma$ as smoothing parameters with ‘m’ as the frequency of the seasonality, i.e. the number of seasons in a year.

三重指数平滑(TES)或Holt的Winter方法，应用了三次指数平滑-水平平滑$ l_ {t} $，趋势平滑$ b_ {t} $和季节性平滑$ S_ {t} $，其中$ \ alpha $ ，$ \ beta ^ {** $和$ \ gamma $作为平滑参数，其中'm'为季节性频率，即一年中的季节数。

According to the nature of the seasonal component, TES has two categories −

根据季节性因素的性质，TES有两类：

Holt-Winter's Additive Method − When the seasonality is additive in nature.
Holt-Winter的加法 -当季节性本质上是加法时。
Holt-Winter’s Multiplicative Method − When the seasonality is multiplicative in nature.
Holt-Winter的乘法法 -当季节性本质上是乘法时。

For non-seasonal time series, we only have trend smoothing and level smoothing, which is called Holt’s Linear Trend Method.

对于非季节时间序列，我们只有趋势平滑和水平平滑，这称为Holt线性趋势方法。

Let’s try applying triple exponential smoothing on our data.

让我们尝试对数据应用三重指数平滑。

In [316]:

在[316]中：


from statsmodels.tsa.holtwinters import ExponentialSmoothing

model = ExponentialSmoothing(train.values, trend= )
model_fit = model.fit()

In [322]:

在[322]中：


predictions_ = model_fit.predict(len(test))

In [325]:

在[325]中：


plt.plot(test.values)
plt.plot(predictions_[1:1871])

Out[325]:

出[325]：

[]

Here, we have trained the model once with training set and then we keep on making predictions. A more realistic approach is to re-train the model after one or more time step(s). As we get the prediction for time ‘t+1’ from training data ‘til time ‘t’, the next prediction for time ‘t+2’ can be made using the training data ‘til time ‘t+1’ as the actual value at ‘t+1’ will be known then. This methodology of making predictions for one or more future steps and then re-training the model is called rolling forecast or walk forward validation.

在这里，我们使用训练集对模型进行了一次训练，然后继续进行预测。一种更现实的方法是在一个或多个时间步长之后重新训练模型。当我们从训练数据'til time't'得到时间't + 1'的预测时，可以使用训练数据'til time't + 1'作为实际的时间来进行时间't + 2'的下一个预测这样就知道了“ t + 1”的值。这种对一个或多个未来步骤进行预测，然后重新训练模型的方法称为滚动预测或前瞻性验证。

时间序列-前移验证 (Time Series - Walk Forward Validation)

In time series modelling, the predictions over time become less and less accurate and hence it is a more realistic approach to re-train the model with actual data as it gets available for further predictions. Since training of statistical models are not time consuming, walk-forward validation is the most preferred solution to get most accurate results.

在时间序列建模中，随着时间的推移，预测变得越来越不准确，因此，当模型可用于进一步的预测时，采用实际数据重新训练模型是一种更为现实的方法。由于训练统计模型并不耗时，因此，前向验证是获得最准确结果的最优选解决方案。

Let us apply one step walk forward validation on our data and compare it with the results we got earlier.

让我们对数据进行一步向前验证，并将其与我们之前获得的结果进行比较。

In [333]:

在[333]中：


prediction = []
data = train.values
for t In test.values:
   model = (ExponentialSmoothing(data).fit())
   y = model.predict()
   prediction.append(y[0])
   data = numpy.append(data, t)

In [335]:

在[335]中：


test_ = pandas.DataFrame(test)
test_['predictionswf'] = prediction

In [341]:

在[341]中：


plt.plot(test_['T'])
plt.plot(test_.predictionswf, '--')
plt.show()

In [340]:

在[340]中：


error = sqrt(metrics.mean_squared_error(test.values,prediction))
print ('Test RMSE for Triple Exponential Smoothing with Walk-Forward Validation: ', error)
Test RMSE for Triple Exponential Smoothing with Walk-Forward Validation:  11.787532205759442

We can see that our model performs significantly better now. In fact, the trend is followed so closely that on the plot predictions are overlapping with the actual values. You can try applying walk-forward validation on ARIMA models too.

我们可以看到我们的模型现在性能明显更好。实际上，趋势是如此接近，以至于绘图上的预测与实际值重叠。您也可以尝试在ARIMA模型上应用前向验证。

时间序列-先知模型 (Time Series - Prophet Model)

In 2017, Facebook open sourced the prophet model which was capable of modelling the time series with strong multiple seasonalities at day level, week level, year level etc. and trend. It has intuitive parameters that a not-so-expert data scientist can tune for better forecasts. At its core, it is an additive regressive model which can detect change points to model the time series.

2017年，Facebook开源了先知模型，该模型能够在日水平，周水平，年水平等和趋势方面对具有多个季节性的时间序列进行建模。它具有直观的参数，不是那么专业的数据科学家可以调整这些参数以获得更好的预测。它的核心是可加回归模型，可以检测变化点以对时间序列建模。

Prophet decomposes the time series into components of trend $g_{t}$, seasonality $S_{t}$ and holidays $h_{t}$.

先知将时间序列分解为趋势$ g_ {t} $，季节性$ S_ {t} $和假期$ h_ {t} $的分量。

$$y_{t}=g_{t}+s_{t}+h_{t}+\epsilon_{t}$$

$$ y_ {t} = g_ {t} + s_ {t} + h_ {t} + \ epsilon_ {t} $$

Where, $\epsilon_{t}$ is the error term.

其中， \\ epsilon_ {t} $是错误项。

Similar packages for time series forecasting such as causal impact and anomaly detection were introduced in R by google and twitter respectively.

谷歌和推特分别在R中引入了类似的时间序列预测软件包，例如因果影响和异常检测。

时间序列-LSTM模型 (Time Series - LSTM Model)

Now, we are familiar with statistical modelling on time series, but machine learning is all the rage right now, so it is essential to be familiar with some machine learning models as well. We shall start with the most popular model in time series domain − Long Short-term Memory model.

现在，我们已经很熟悉时间序列的统计建模，但是机器学习现在非常流行，因此也必须熟悉某些机器学习模型。我们将从时间序列域中最流行的模型开始-长短期记忆模型。

LSTM is a class of recurrent neural network. So before we can jump to LSTM, it is essential to understand neural networks and recurrent neural networks.

LSTM是一类递归神经网络。因此，在进入LSTM之前，必须了解神经网络和递归神经网络。

神经网络 (Neural Networks)

An artificial neural network is a layered structure of connected neurons, inspired by biological neural networks. It is not one algorithm but combinations of various algorithms which allows us to do complex operations on data.

人工神经网络是受生物神经网络启发的连接神经元的分层结构。它不是一种算法，而是多种算法的组合，使我们能够对数据进行复杂的操作。

递归神经网络 (Recurrent Neural Networks)

It is a class of neural networks tailored to deal with temporal data. The neurons of RNN have a cell state/memory, and input is processed according to this internal state, which is achieved with the help of loops with in the neural network. There are recurring module(s) of ‘tanh’ layers in RNNs that allow them to retain information. However, not for a long time, which is why we need LSTM models.

它是为处理时间数据而量身定制的一类神经网络。 RNN的神经元具有细胞状态/内存，并根据此内部状态处理输入，这是借助神经网络中的循环来实现的。 RNN中有“ tanh”层的重复模块，可让它们保留信息。但是，不是很长一段时间，这就是为什么我们需要LSTM模型。

LSTM (LSTM)

It is special kind of recurrent neural network that is capable of learning long term dependencies in data. This is achieved because the recurring module of the model has a combination of four layers interacting with each other.

它是一种特殊的循环神经网络，能够学习数据的长期依赖性。之所以能够实现这一目标，是因为模型的重复模块具有相互交互的四层组合。

The picture above depicts four neural network layers in yellow boxes, point wise operators in green circles, input in yellow circles and cell state in blue circles. An LSTM module has a cell state and three gates which provides them with the power to selectively learn, unlearn or retain information from each of the units. The cell state in LSTM helps the information to flow through the units without being altered by allowing only a few linear interactions. Each unit has an input, output and a forget gate which can add or remove the information to the cell state. The forget gate decides which information from the previous cell state should be forgotten for which it uses a sigmoid function. The input gate controls the information flow to the current cell state using a point-wise multiplication operation of ‘sigmoid’ and ‘tanh’ respectively. Finally, the output gate decides which information should be passed on to the next hidden state

上图显示了黄色方框中的四个神经网络层，绿色圆圈中的点智能算子，黄色圆圈中的输入，蓝色圆圈中的单元状态。 LSTM模块具有单元状态和三个门，这三个门为它们提供了从每个单元中选择性地学习，取消学习或保留信息的能力。 LSTM中的单元状态仅允许一些线性交互作用，从而使信息流经这些单元而不会被更改。每个单元都有一个输入，输出和一个忘记门，可以将信息添加或删除到单元状态。遗忘门决定使用S形函数应忘记先前单元状态中的哪些信息。输入门分别使用“ Sigmoid”和“ tanh”的逐点乘法运算将信息流控制为当前单元状态。最后，输出门决定应将哪些信息传递到下一个隐藏状态

Now that we have understood the internal working of LSTM model, let us implement it. To understand the implementation of LSTM, we will start with a simple example − a straight line. Let us see, if LSTM can learn the relationship of a straight line and predict it.

现在我们已经了解了LSTM模型的内部工作原理，让我们实现它。为了理解LSTM的实现，我们将从一个简单的示例开始-一条直线。让我们看看，LSTM是否可以学习直线的关系并对其进行预测。

First let us create the dataset depicting a straight line.

首先，让我们创建描述直线的数据集。

In [402]:

在[402]中：


x = numpy.arange (1,500,1)
y = 0.4 * x + 30
plt.plot(x,y)

Out[402]:

出[402]：

[]

In [403]:

在[403]中：


trainx, testx = x[0:int(0.8*(len(x)))], x[int(0.8*(len(x))):]
trainy, testy = y[0:int(0.8*(len(y)))], y[int(0.8*(len(y))):]
train = numpy.array(list(zip(trainx,trainy)))
test = numpy.array(list(zip(trainx,trainy)))

Now that the data has been created and split into train and test. Let’s convert the time series data into the form of supervised learning data according to the value of look-back period, which is essentially the number of lags which are seen to predict the value at time ‘t’.

现在已经创建了数据，并将其拆分为训练和测试。让我们根据回溯期的值将时间序列数据转换为监督学习数据的形式，回溯期的值本质上是指可以预测时间“ t”时的滞后次数。

So a time series like this −

所以这样的时间序列-


time variable_x
t1  x1
t2  x2
 :   :
 :   :
T   xT

When look-back period is 1, is converted to −

当回溯期为1时，转换为-


x1   x2
x2   x3
 :    :
 :    :
xT-1 xT

In [404]:

在[404]中：


def create_dataset(n_X, look_back):
   dataX, dataY = [], []
   for i in range(len(n_X)-look_back):
      a = n_X[i:(i+look_back), ]
      dataX.append(a)
      dataY.append(n_X[i + look_back, ])
   return numpy.array(dataX), numpy.array(dataY)

In [405]:

在[405]中：


look_back = 1
trainx,trainy = create_dataset(train, look_back)
testx,testy = create_dataset(test, look_back)

trainx = numpy.reshape(trainx, (trainx.shape[0], 1, 2))
testx = numpy.reshape(testx, (testx.shape[0], 1, 2))

Now we will train our model.

现在，我们将训练模型。

Small batches of training data are shown to network, one run of when entire training data is shown to the model in batches and error is calculated is called an epoch. The epochs are to be run ‘til the time the error is reducing.

将小批量的训练数据显示给网络，一次将整个训练数据分批显示给模型并且计算出误差时的一次运行称为时期。直到错误减少的时间段为止。

In [ ]:

在[]中：


from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(256, return_sequences = True, input_shape = (trainx.shape[1], 2)))
model.add(LSTM(128,input_shape = (trainx.shape[1], 2)))
model.add(Dense(2))
model.compile(loss = 'mean_squared_error', optimizer = 'adam')
model.fit(trainx, trainy, epochs = 2000, batch_size = 10, verbose = 2, shuffle = False)
model.save_weights('LSTMBasic1.h5')

In [407]:

在[407]中：


model.load_weights('LSTMBasic1.h5')
predict = model.predict(testx)

Now let’s see what our predictions look like.

现在，让我们看看我们的预测是什么样的。

In [408]:

在[408]中：


plt.plot(testx.reshape(398,2)[:,0:1], testx.reshape(398,2)[:,1:2])
plt.plot(predict[:,0:1], predict[:,1:2])

Out[408]:

出[408]：

[]

Now, we should try and model a sine or cosine wave in a similar fashion. You can run the code given below and play with the model parameters to see how the results change.

现在，我们应该尝试以类似方式对正弦波或余弦波建模。您可以运行下面给出的代码，并使用模型参数来查看结果如何变化。

In [409]:

在[409]中：


x = numpy.arange (1,500,1)
y = numpy.sin(x)
plt.plot(x,y)

Out[409]:

出[409]：

[]

In [410]:

在[410]中：


trainx, testx = x[0:int(0.8*(len(x)))], x[int(0.8*(len(x))):]
trainy, testy = y[0:int(0.8*(len(y)))], y[int(0.8*(len(y))):]
train = numpy.array(list(zip(trainx,trainy)))
test = numpy.array(list(zip(trainx,trainy)))

In [411]:

在[411]中：


look_back = 1
trainx,trainy = create_dataset(train, look_back)
testx,testy = create_dataset(test, look_back)
trainx = numpy.reshape(trainx, (trainx.shape[0], 1, 2))
testx = numpy.reshape(testx, (testx.shape[0], 1, 2))

In [ ]:

在[]中：


model = Sequential()
model.add(LSTM(512, return_sequences = True, input_shape = (trainx.shape[1], 2)))
model.add(LSTM(256,input_shape = (trainx.shape[1], 2)))
model.add(Dense(2))
model.compile(loss = 'mean_squared_error', optimizer = 'adam')
model.fit(trainx, trainy, epochs = 2000, batch_size = 10, verbose = 2, shuffle = False)
model.save_weights('LSTMBasic2.h5')

In [413]:

在[413]中：


model.load_weights('LSTMBasic2.h5')
predict = model.predict(testx)

In [415]:

在[415]中：


plt.plot(trainx.reshape(398,2)[:,0:1], trainx.reshape(398,2)[:,1:2])
plt.plot(predict[:,0:1], predict[:,1:2])

Out [415]:

出[415]：

[]

Now you are ready to move on to any dataset.

现在您可以继续使用任何数据集了。

时间序列-误差指标 (Time Series - Error Metrics)

It is important for us to quantify the performance of a model to use it as a feedback and comparison. In this tutorial we have used one of the most popular error metric root mean squared error. There are various other error metrics available. This chapter discusses them in brief.

对我们来说，量化模型的性能以将其用作反馈和比较非常重要。在本教程中，我们使用了最流行的误差度量均方根误差之一。还有其他各种错误度量标准。本章简要讨论它们。

均方误差 (Mean Square Error)

It is the average of square of difference between the predicted values and true values. Sklearn provides it as a function. It has the same units as the true and predicted values squared and is always positive.

它是预测值和真实值之间差异的平方的平均值。 Sklearn提供了它的功能。它的单位与真实值和预测值的平方相同，并且始终为正。

$$MSE = \frac{1}{n} \displaystyle\sum\limits_{t=1}^n \lgroup y'_{t}\:-y_{t}\rgroup^{2}$$

$$ MSE = \ frac {1} {n} \ displaystyle \ sum \ limits_ {t = 1} ^ n \ lgroup y'_ {t} \：-y_ {t} \ rgroup ^ {2} $$

Where $y'_{t}$ is the predicted value,

其中$ y'_ {t} $是预测值，

$y_{t}$ is the actual value, and

$ y_ {t} $是实际值，并且

n is the total number of values in test set.

n是测试集中值的总数。

It is clear from the equation that MSE is more penalizing for larger errors, or the outliers.

从方程式中可以明显看出，MSE对于较大的错误或异常值的惩罚更大。

根均方误差 (Root Mean Square Error)

It is the square root of the mean square error. It is also always positive and is in the range of the data.

它是均方误差的平方根。它也总是正的，并且在数据范围内。

$$RMSE = \sqrt{\frac{1}{n} \displaystyle\sum\limits_{t=1}^n \lgroup y'_{t}-y_{t}\rgroup ^2}$$

$$ RMSE = \ sqrt {\ frac {1} {n} \ displaystyle \ sum \ limits_ {t = 1} ^ n \ lgroup y'_ {t} -y_ {t} \ rgroup ^ 2} $$

Where, $y'_{t}$ is predicted value

其中， $ y'_ {t} $是预测值

$y_{t}$ is actual value, and

$ y_ {t} $是实际值，并且

n is total number of values in test set.

n是测试集中值的总数。

It is in the power of unity and hence is more interpretable as compared to MSE. RMSE is also more penalizing for larger errors. We have used RMSE metric in our tutorial.

它具有统一性，因此与MSE相比更具可解释性。对于较大的错误，RMSE也会受到更大的惩罚。我们在本教程中使用了RMSE指标。

平均绝对误差 (Mean Absolute Error)

It is the average of absolute difference between predicted values and true values. It has the same units as predicted and true value and is always positive.

它是预测值和真实值之间的绝对差的平均值。它具有与预测值和真实值相同的单位，并且始终为正。

$$MAE = \frac{1}{n}\displaystyle\sum\limits_{t=1}^{t=n} | y'{t}-y_{t}\lvert$$

$$ MAE = \ frac {1} {n} \ displaystyle \ sum \ limits_ {t = 1} ^ {t = n} | y'{t} -y_ {t} \ lvert $$

Where, $y'_{t}$ is predicted value,

其中， $ y'_ {t} $是预测值，

$y_{t}$ is actual value, and

$ y_ {t} $是实际值，并且

n is total number of values in test set.

n是测试集中值的总数。

平均百分比误差 (Mean Percentage Error)

It is the percentage of average of absolute difference between predicted values and true values, divided by the true value.

它是预测值和真实值之间的绝对差平均值的平均值除以真实值的百分比。

$$MAPE = \frac{1}{n}\displaystyle\sum\limits_{t=1}^n\frac{y'_{t}-y_{t}}{y_{t}}*100\: \%$$

$$ MAPE = \ frac {1} {n} \ displaystyle \ sum \ limits_ {t = 1} ^ n \ frac {y'_ {t} -y_ {t}} {y_ {t}} * 100 \： \％$$

Where, $y'_{t}$ is predicted value,

其中， $ y'_ {t} $是预测值，

$y_{t}$ is actual value and n is total number of values in test set.

$ y_ {t} $是实际值，n是测试集中的值总数。

However, the disadvantage of using this error is that the positive error and negative errors can offset each other. Hence mean absolute percentage error is used.

但是，使用此误差的缺点是正误差和负误差会相互抵消。因此，使用平均绝对百分比误差。

平均绝对百分比误差 (Mean Absolute Percentage Error)

It is the percentage of average of absolute difference between predicted values and true values, divided by the true value.

它是预测值和真实值之间的绝对差平均值的平均值除以真实值的百分比。

$$MAPE = \frac{1}{n}\displaystyle\sum\limits_{t=1}^n\frac{|y'_{t}-y_{t}\lvert}{y_{t}}*100\: \%$$

$$ MAPE = \ frac {1} {n} \ displaystyle \ sum \ limits_ {t = 1} ^ n \ frac {| y'_ {t} -y_ {t} \ lvert} {y_ {t}} * 100 \：\％$$

Where $y'_{t}$ is predicted value

其中$ y'_ {t} $是预测值

$y_{t}$ is actual value, and

$ y_ {t} $是实际值，并且

n is total number of values in test set.

n是测试集中值的总数。

时间序列-应用 (Time Series - Applications)

We discussed time series analysis in this tutorial, which has given us the understanding that time series models first recognize the trend and seasonality from the existing observations and then forecast a value based on this trend and seasonality. Such analysis is useful in various fields such as −

我们在本教程中讨论了时间序列分析，这使我们理解了时间序列模型首先会从现有观察值中识别趋势和季节性，然后根据该趋势和季节性预测值。这种分析在各个领域都非常有用，例如-

Financial Analysis − It includes sales forecasting, inventory analysis, stock market analysis, price estimation.
财务分析 -包括销售预测，库存分析，股票市场分析，价格估计。
Weather Analysis − It includes temperature estimation, climate change, seasonal shift recognition, weather forecasting.
天气分析 -包括温度估计，气候变化，季节性变化识别，天气预报。
Network Data Analysis − It includes network usage prediction, anomaly or intrusion detection, predictive maintenance.
网络数据分析 -它包括网络使用情况预测，异常或入侵检测，预测性维护。
Healthcare Analysis − It includes census prediction, insurance benefits prediction, patient monitoring.
医疗保健分析 -它包括人口普查预测，保险利益预测，患者监测。

时间序列-进一步的范围 (Time Series - Further Scope)

Machine learning deals with various kinds of problems. In fact, almost all fields have a scope to be automatized or improved with the help of machine learning. A few such problems on which a great deal of work is being done are given below.

机器学习处理各种问题。实际上，借助机器学习，几乎所有领域都有自动化或改进的范围。下面给出了一些需要大量工作的问题。

时间序列数据 (Time Series Data)

This is the data which changes according to time, and hence time plays a crucial role in it, which we largely discussed in this tutorial.

这是随时间变化的数据，因此时间在其中起着至关重要的作用，我们在本教程中对此进行了大量讨论。

非时间序列数据 (Non-Time Series Data)

It is the data independent of time, and a major percentage of ML problems are on nontime series data. For simplicity, we shall categorize it further as −

它是与时间无关的数据，并且大部分ML问题都与非时间序列数据有关。为了简单起见，我们将其进一步分类为-

Numerical Data − Computers, unlike humans, only understand numbers, so all kinds of data ultimately is converted to numerical data for machine learning, for example, image data is converted to (r,b,g) values, characters are converted to ASCII codes or words are indexed to numbers, speech data is converted to mfcc files containing numerical data.
数值数据 -计算机与人类不同，计算机只能理解数字，因此最终将各种数据转换为用于机器学习的数值数据，例如，将图像数据转换为(r，b，g)值，将字符转换为ASCII码或将单词索引为数字，则语音数据将转换为包含数字数据的mfcc文件。
Image Data − Computer vision has revolutionized the world of computers, it has various application in the field of medicine, satellite imaging etc.
图像数据 -计算机视觉彻底改变了计算机世界，在医学，卫星成像等领域具有多种应用。
Text Data − Natural Language Processing (NLP) is used for text classification, paraphrase detection and language summarization. This is what makes Google and Facebook smart.
文本数据 -自然语言处理(NLP)用于文本分类，释义检测和语言摘要。这就是使Google和Facebook变得聪明的原因。
Speech Data − Speech Processing involves speech recognition and sentiment understanding. It plays a crucial role in imparting computers the human-like qualities.
语音数据 -语音处理涉及语音识别和情感理解。它在赋予计算机类似人的品质方面起着至关重要的作用。

翻译自: https://www.tutorialspoint.com/time_series/time_series_quick_guide.htm

java--ml 时间序列

你可能感兴趣的:(神经网络,大数据,算法,python,机器学习)

基于人工智能的Python面试题请一直在路上 python 开发语言
基于人工智能的Python面试题1.Python中的元组与列表区别是什么？列表是可变类型，元组不是。列表是引用类型，元组不是。列表使用场景更宽泛，元组更多用于一些数据不可变的场景，例如参数、或者返回值。2.Python中的字典是否有序？python3.6之前字典是无序的，之后是有序的。原因可以参考下这个帖子https://blog.csdn.net/weixin_48629601/article/
17-7 向量数据库之野望7 - PostgreSQL 和pgvector 拉达曼迪斯II AIGC学习数据库管理工具 AI创业数据库 postgresql 人工智能机器学习 AIGC 搜索引擎
PostgreSQL是一款功能强大的开源对象关系数据库系统，它已将其功能扩展到传统数据管理之外，通过pgvector扩展支持矢量数据。这一新增功能满足了对高效处理高维矢量数据日益增长的需求，这些数据通常用于机器学习、自然语言处理(NLP)和推荐系统等应用。https://github.com/mazzasaverio/find-your-opensource-project什么是pgvector？
matlab实现一个雷达信号处理的程序，涉及到对原始图像的模拟、加权、加噪以及通过迭代算法对图像进行恢复和优化处理 max500600 MATLAB 算法算法 matlab 信号处理
clcclearcloseallloadscene3.mat%加载原始图像，自己设计设计为一个300*400的矩阵300是距离向长度，400是方位向长度Map_ori=scene3;[M,N_K]=size(Map_ori);figureimagesc(scene3)v=100;%机载速度，单位m/sbandwidth=30*1e6;%信号带宽，决定距离分辨率，单位Hzc=3*1e8;%光速R_R
海外抖音技术深度解析：算法、AI与全球化的挑战神探阿航计算机产业科普与思考算法人工智能机器学习数据挖掘深度学习
引言2025年1月19日，在美国宣布暂停服务，这一事件引发了全球用户的广泛关注。作为全球最受欢迎的短视频平台之一，其成功离不开其强大的技术支撑，尤其是其个性化推荐算法和AI驱动的创作工具。然而，随着全球市场环境的变化，它面临的技术与运营挑战也日益凸显。本文将深入分析其技术核心、全球化运营中的挑战及其未来发展方向。核心：个性化推荐引擎其算法是其成功的关键，其核心在于个性化推荐引擎。该引擎采用深度学习
JAVA 反射(JAVA面试题) geejkse_seff java 开发语言
5.1.2.JAVA反射5.1.2.1.动态语言动态语言，是指程序在运行时可以改变其结构：新的函数可以引进，已有的函数可以被删除等结构上的变化。比如常见的JavaScript就是动态语言，除此之外Ruby,Python等也属于动态语言，而C、C++则不属于动态语言。从反射角度说JAVA属于半动态语言。5.1.2.2.反射机制概念（运行状态中知道类所有的属性和方法）在Java中的反射机制是指在运行状
如何运用Python爬虫快速获得1688商品详情数据小爬虫程序猿 API python 爬虫开发语言
在数字化时代，数据的价值日益凸显，尤其是在电商领域。对于企业来说，获取竞争对手的商品信息是分析市场趋势、制定营销策略的重要手段。1688作为中国领先的B2B电商平台，拥有海量的商品数据。本文将介绍如何使用Python编写爬虫程序，以合法合规的方式快速获取1688商品详情，为电商企业提供数据支持。1.环境准备在开始编写代码之前，我们需要准备以下开发环境：Python3.x：确保已安装Python3.
如何使用Java爬虫获取阿里巴巴热卖商品推荐：代码示例与实践指南小爬虫程序猿 Java java 爬虫 python
在电商领域，获取热卖商品推荐对于商家和开发者来说至关重要。阿里巴巴提供了热卖商品推荐API接口，能够根据消费者的购买历史、浏览行为、搜索习惯等数据，自动推荐符合其需求的商品。以下将详细介绍如何使用Java爬虫获取阿里巴巴热卖商品推荐，并提供相关的代码示例。一、阿里巴巴热卖商品推荐API接口简介阿里巴巴热卖商品推荐API接口是一种基于人工智能算法的推荐系统，能够根据消费者的购买历史、浏览行为、搜索习
AI与API的融合：构建智能互联技术世界的基石 IT数据V+I7809804594 人工智能数据分析 python 爬虫大数据
在当今科技飞速发展的时代，人工智能（AI）与应用程序接口（API）的融合正在开启智能应用的新纪元。AI以其强大的数据处理和分析能力，正在改变各行各业的工作方式，而API则作为连接技术与应用的桥梁，为AI技术的普及和应用提供了无限可能。本文将深入探讨AI与API的融合如何推动智能应用的创新和发展，以及其在各个领域的应用和前景。一、AI与API融合的背景随着大数据、云计算、物联网等技术的快速发展，人工
python微博关键词爬虫嵌入式开发项目 2025年爬虫精通专栏 python 爬虫开发语言媒体
目录记一次阿里云盾滑块验证分析并通过操作环境数据接口proxy配置根据关键词获取userid根据userid获取信息数据保存数据：记一次阿里云盾滑块验证分析并通过操作环境win10、macPython3.9数据接口搜索https://**********?containerid=100103type%3D{chanenl}%26q%3D{quote(self.words)}&page_type=s
python matplotlib legend()参数详解请一直在路上 python matplotlib 开发语言
在Python的Matplotlib库中，legend函数用于添加图例，帮助解释图表中不同数据系列或数据点的含义。legend函数有很多参数，可以自定义图例的各个方面，从位置到样式，从字体大小到边框。下面是一些常用参数的详细解释：importmatplotlib.pyplotasplt#创建一些数据x=[1,2,3,4]y1=[1,4,9,16]y2=[1,2,3,4]#绘制数据plt.plot(
YOLOv8与Transformer：探索目标检测的新架构 AI架构设计之禅 AI大模型应用入门实战与进阶大数据AI人工智能计算科学神经计算深度学习神经网络大数据人工智能大型语言模型 AI AGI LLM Java Python 架构设计 Agent RPA
YOLOv8与Transformer：探索目标检测的新架构关键词：目标检测，深度学习，YOLOv8，Transformer，计算机视觉，卷积神经网络摘要：目标检测是计算机视觉领域的一项重要任务，其目标是从图像或视频中识别和定位特定对象。近年来，YOLO（YouOnlyLookOnce）系列算法以其高精度和高速度成为目标检测领域的佼佼者。最新版本的YOLOv8引入了Transformer架构，进一步
Python+Pytest+Allure+Git+Jenkins数据驱动接口自动化测试框架_python+pytest+allure+jenkins架构 2401_87378716 python pytest git
接口测试流程1、需求评审，熟悉业务和需求2、开发提供接口文档3、编写接口测试用例4、用例评审5、提测后开始测试6、提交测试报告两种常见的HTTP请求方法：GET和POST二、项目说明本框架是一套基于Python+Pytest+Requests+Allure+Jenkins而设计的数据驱动接口自动化测试的框架。技术栈Python、Pytest、Requests、Pactverity、Excel、Js
python运行方式威胁情报收集站 pycharm ide python
#python代码运行方式第一种：交互式解释器。第二种：命令行运行python源代码。第三种：使用编辑器或集成开发环境（IDE）。比如：pycharm。（IDE：IntegratedDevelopmentEnvironment）详细教程：https://edu.csdn.net/job/pythonbe_01/python-3-3
【强化学习】PyTorch-RL框架大雨淅淅人工智能 pytorch 人工智能 python 深度学习机器学习
目录一、框架简介二、核心功能三、学习环境配置四、学习资源五、实践与应用六、常见问题与解决方案七、深入理解强化学习概念八、构建自己的强化学习环境九、调试与优化十、参与社区与持续学习一、框架简介PyTorch-RL是一个基于PyTorch框架的深度强化学习项目。它充分利用了PyTorch的强大功能，提供了易于使用且高效的深度强化学习算法实现。该项目的主要编程语言是Python，旨在帮助开发者快速实现和
蓝桥杯备赛笔记（九）动态规划（一）小魏´•ﻌ•` 蓝桥杯C++蓝桥杯笔记动态规划
1.动态规划基础(1)线性DP1）什么是DP（动态规划）DP（动态规划）全称DynamicProgramming，是运筹学的一个分支，是一种将复杂问题分解成很多重叠的子问题，并通过子问题的解得到整个问题的解的算法。在动态规划中有一些概念：状态：就是形如dp[i][j]=val的取值，其中i，j为下标，也是用于描述、确定状态所需的变量，val为状态值。状态转移：状态与状态之间的转移关系，一般可以表示
两万字探讨时间轮算法 Damon_0411 算法 java spring
1.引言1.1背景介绍随着分布式系统、微服务架构的流行以及高并发场景的广泛应用，系统中处理延时任务的需求变得愈发重要。延时任务的常见场景包括：任务调度：某些任务需要按照预定时间执行，比如每天的定时数据备份。超时控制：网络连接的超时检测、数据库锁的释放延迟等。缓存管理：缓存数据的过期清理策略。事件驱动场景：如日志系统中，只有当所有日志接收完毕并经过一定延迟后才能触发归档。延时任务的本质是系统需要管理
运行python程序的两种方式交互式和文件式_执行Python程序的两种方式 weixin_39610085
交互式(了解)交互式环境下，敲完一条命令按下enter键马上能看到结果，调试程序方便。程序无法永久保存，关掉cmd窗口数据就消失了。命令行式(了解)打开文本编辑器，在文本编辑器中写入一串字符。文本编辑器写的代码毫无意义，只是一堆字符，并且文件的后缀名没有影响。由于python语言是解释型语言，我们直接使用python打开文件，python会读一行翻译一行，并且这个文件是永久保存在硬盘中的。但是需要
【python基础】python GIL(全局解释器锁) 和多线程锁 shengnan_wsn python python 开发语言后端
文章目录什么是GIL？有了GIL还需要线程锁吗？参考资料1：[终于有人把GIL全局解释器说清楚了](https://zhuanlan.zhihu.com/p/311877485)2：[浅谈Python多线程之GIL描述](https://blog.csdn.net/qq_34359754/article/details/115209158)3：[多线程锁机制](https://www.cnblog
windows下python运行的方法好烦好烦方法
python有2中运行模式，一种是交互运行模式，一种是脚本运行模式。（假设用户已经安装好，网上很多例子）交互运行方和matlab命令窗口有点类似，有2中方法，一种是用它自带的开发环境IDLE，一种是在命令窗口下运行，建议2种方法都掌握，因为我们要在命令窗口模式下运行脚本文件。用开始菜单键win+r直接打开命令窗口，输入python字符回车，应该会出现如下提示： ![打开](https://im
【WRF后处理】基于NCL处理wrf运行结果wrfout_d01 WW、forever WRF模型原理及应用 WRF NCL
【WRF后处理】基于NCL处理wrf运行结果wrfout_d01NCL概述wrf-python和NCL总结WRF后处理数据信息查看诊断变量的获取插值参考NCL概述NCARCommandLanguage（NCL）是由美国大气研究中心（NCAR）推出的一款用于科学数据计算和可视化的免费软件。它有着非常强大的文件输入和输出功能，可读写netCDF-3、netCDF-4classic、HDF4、binar
常用Python GUI库推荐！老男孩IT教育 python 开发语言
tkinter的全称是TkInterface，是Python自带的GUI库，支持跨平台的GUl程序开发，只要安装了python就可以直接使用它。那么pythontkinter是什么?常用PythonGUI库有哪些?具体请看下文。pythontkinter是什么?tkinter是Python的标准GUI库。Python使用tkinter可以快速的创建GUI应用程序。由于tkinter是内置到Pyth
人工智能伦理：技术发展背后的思考 m0_72547478 人工智能
近年来，人工智能技术呈爆发式发展，在医疗、交通、金融等诸多领域取得惊人成果，但与此同时，人工智能伦理问题日益凸显，引发广泛关注。数据隐私与安全首当其冲。AI系统依赖海量数据训练，这些数据包含个人信息、医疗记录等敏感内容。若数据保护不当，极易引发数据泄露风险，侵犯个人隐私。例如，某些智能健康APP，若未能加密传输用户健康数据，一旦遭受黑客攻击，用户的隐私将暴露无遗。算法偏见也是一大痛点。AI算法基于
第 7 课Python 容器类型与相关操作嵌入式老牛 Python入门 python 开发语言
1.容器介绍对象是Python中对数据的抽象，Python程序中的所有数据都是由对象或对象间关系来表示的。Python中，可包含其他对象的引用的对象，称之为“容器”。容器的例子有元组、列表和字典等。这些引用的对象是容器对象值的组成部分。常用的容器主要划分为两种：序列（如：列表、元组等）和映射（如：字典）。序列中，每个元素都有下标，它们是有序的。映射中，每个元素都有名称（又称“键”），它们是无序的。
Python在WRF模型自动化运行及前后处理中实践技术应用数字化信息化智能化解决方案 python
Python在WRF（WeatherResearchandForecasting）模型自动化运行及前后处理中的实践技术应用如下：自动化运行WRF模型：使用Python脚本可以自动化执行WRF模型的运行过程。通过编写脚本来调用WRF模型的输入文件、运行模型并收集输出结果。这样可以在短时间内运行多个模拟，提高工作效率。数据预处理：在运行WRF模型之前，需要进行数据预处理，包括数据格式转换、坐标转换、数
基于区块链的云上数据访问控制模型研究 XLYcmy 论文阅读阅读笔记网络安全论文阅读论文笔记区块链访问控制云数据
论⽂选择理由:汉语论⽂,对于新⼿⼊⼿阅读相对容易之前,进⾏过区块链⽅⾯的研究，有⼀定基础⽅便理解论⽂通读情况:①基本掌握论⽂所提出背景和要解决的问题②⼤致理解论⽂所提出的⽅案和优势收获:⼤致梳理出⼀篇做的架构:(我的理解)背景→现有⽅案不⾜→预备免识→提出⽅案→⽅案核⼼设计与算法→与其他⽅案对比→设计实验环境与实验指标进⾏⽅案验证→总结与展望
使用conda升级到python 3.8 Babayacy python jupyter notebook python conda 开发语言
Python3.8.0已经发布，但我找不到关于如何使用conda更新到python3.8的任何文章-也许他们会等待正式发布？有什么建议么？Answers:打开Anaconda提示（基本）：合理的创建标题，有助于目录的生成1、更新conda：condaupdate-nbase-cdefaultsconda2、使用Python3.8创建新环境：condacreate-npython38python=3
Python-基础-字典（dict） All_Test_Pass Python-基础 python 开发语言
目录1、字典2、字典常用操作3、字典的方法1、字典字典（Dictionary）是一种用于存储键值对（key-valuepairs）数据的可变容器类型。每个字典都包含一组键（key）和值（value），通过键可以快速访问对应的值。字典是无序的，也就是说，它们不保证元素的顺序，直到Python3.7之后，字典会保留插入的顺序my_dict={key1:value1,key2:value2,key3:v
【Java数据结构】二叉树相关算法回响N 算法数据结构 java 开发语言链表
第一题：获取二叉树中结点个数得到二叉树结点个数，如果结点为空则返回0，然后再用递归计算左树结点个数+根结点（1个）+右树结点个数。publicintnodeSize(Noderoot){if(root==null)return0;returnnodeSize1(root.left)+nodeSize1(root.right)+1;}第二题：获取叶子结点的个数得到叶子结点个数和结点总数的做法相同，也
华为OD机试E卷 --热点网站统计--24年OD统一考试（Java & JS & Python & C & C++）飞码创造者最新华为OD机试题库2024 华为od java javascript python c++c语言
文章目录题目描述输入描述输出描述用例题目解析JS算法源码Java算法源码python算法源码c算法源码c++算法源码题目描述企业路由器的统计页面，有一个功能需要动态统计公司访问最多的网页URLtopN。请设计一个算法，可以高效动态统计TopN的页面。输入描述每一行都是一个URL或一个数字•如果是URL，代表一段时间内的网页访问•如果是一个数字N，代表本次需要输出的TopN个URL输入约束：总访问网
《CPython Internals》阅读笔记：p232-p249 python
《CPythonInternals》学习第13天，p232-p249总结，总计18页。一、技术总结无。二、英语总结(生词：1)1.overhead(1)overhead:over-("above")+head(“toppart,uppermostsection”)overhead的字面意思是：abovethehead,后来演变成"representthingssituatedaboveormeta
ASM系列六利用TreeApi 添加和移除类成员 lijingyao8206 jvm 动态代理 ASM 字节码技术 TreeAPI
同生成的做法一样，添加和移除类成员只要去修改fields和methods中的元素即可。这里我们拿一个简单的类做例子，下面这个Task类，我们来移除isNeedRemove方法，并且添加一个int 类型的addedField属性。 package asm.core; /** * Created by yunshen.ljy on 2015/6/
Springmvc-权限设计 bee1314 spring Web jsp
万丈高楼平地起。权限管理对于管理系统而言已经是标配中的标配了吧，对于我等俗人更是不能免俗。同时就目前的项目状况而言，我们还不需要那么高大上的开源的解决方案，如Spring Security，Shiro。小伙伴一致决定我们还是从基本的功能迭代起来吧。目标： 1.实现权限的管理（CRUD） 2.实现部门管理（CRUD) 3.实现人员的管理（CRUD） 4.实现部门和权限
算法竞赛入门经典（第二版）第2章习题 CrazyMizzz c 算法
2.4.1 输出技巧 #include <stdio.h> int main() { int i, n; scanf("%d", &n); for (i = 1; i <= n; i++) printf("%d\n", i); return 0; } 习题2-2 水仙花数(daffodil
struts2中jsp自动跳转到Action 麦田的设计者 jsp webxml struts2 自动跳转
1、在struts2的开发中，经常需要用户点击网页后就直接跳转到一个Action，执行Action里面的方法，利用mvc分层思想执行相应操作在界面上得到动态数据。毕竟用户不可能在地址栏里输入一个Action（不是专业人士） 2、＜jsp:forward page="xxx.action" /＞，这个标签可以实现跳转，page的路径是相对地址,不同与jsp和j
php 操作webservice实例 IT独行者 PHP webservice
首先大家要简单了解了何谓webservice，接下来就做两个非常简单的例子，webservice还是逃不开server端与client端。我测试的环境为：apache2.2.11 php5.2.10做这个测试之前，要确认你的php配置文件中已经将soap扩展打开，即extension=php_soap.dll; OK 现在我们来体验webservice //server端 serve
Windows下使用Vagrant安装linux系统 _wy_ windows vagrant
准备工作：下载安装 VirtualBox ：https://www.virtualbox.org/ 下载安装 Vagrant ：http://www.vagrantup.com/ 下载需要使用的 box ：官方提供的范例：http://files.vagrantup.com/precise32.box 还可以在 http://www.vagrantbox.es/
更改linux的文件拥有者及用户组(chown和chgrp) 无量 c linux chgrp chown
本文（转） http://blog.163.com/yanenshun@126/blog/static/128388169201203011157308/ http://ydlmlh.iteye.com/blog/1435157 一、基本使用：使用chown命令可以修改文件或目录所属的用户：命令
linux下抓包工具矮蛋蛋 linux
原文地址： http://blog.chinaunix.net/uid-23670869-id-2610683.html tcpdump -nn -vv -X udp port 8888 上面命令是抓取udp包、端口为8888 netstat -tln 命令是用来查看linux的端口使用情况 13 . 列出所有的网络连接 lsof -i 14. 列出所有tcp 网络连接信息 l
我觉得mybatis是垃圾！：“每一个用mybatis的男纸，你伤不起” alafqq mybatis
最近看了每一个用mybatis的男纸，你伤不起原文地址：http://www.iteye.com/topic/1073938 发表一下个人看法。欢迎大神拍砖；个人一直使用的是Ibatis框架，公司对其进行过小小的改良；最近换了公司，要使用新的框架。听说mybatis不错；就对其进行了部分的研究；发现多了一个mapper层；个人感觉就是个dao；
解决java数据交换之谜百合不是茶数据交换
交换两个数字的方法有以下三种，其中第一种最常用 /* 输出最小的一个数 */ public class jiaohuan1 { public static void main(String[] args) { int a =4; int b = 3; if(a<b){ // 第一种交换方式 int tmep =
渐变显示 bijian1013 JavaScript
<style type="text/css"> #wxf { FILTER: progid:DXImageTransform.Microsoft.Gradient(GradientType=0, StartColorStr=#ffffff, EndColorStr=#97FF98); height: 25px; } </style>
探索JUnit4扩展：断言语法assertThat bijian1013 java 单元测试 assertThat
一.概述 JUnit 设计的目的就是有效地抓住编程人员写代码的意图，然后快速检查他们的代码是否与他们的意图相匹配。 JUnit 发展至今，版本不停的翻新，但是所有版本都一致致力于解决一个问题，那就是如何发现编程人员的代码意图，并且如何使得编程人员更加容易地表达他们的代码意图。JUnit 4.4 也是为了如何能够
【Gson三】Gson解析{"data":{"IM":["MSN","QQ","Gtalk"]}} bit1129 gson
如何把如下简单的JSON字符串反序列化为Java的POJO对象? {"data":{"IM":["MSN","QQ","Gtalk"]}} 下面的POJO类Model无法完成正确的解析： import com.google.gson.Gson;
【Kafka九】Kafka High Level API vs. Low Level API bit1129 kafka
1. Kafka提供了两种Consumer API High Level Consumer API Low Level Consumer API(Kafka诡异的称之为Simple Consumer API，实际上非常复杂) 在选用哪种Consumer API时，首先要弄清楚这两种API的工作原理，能做什么不能做什么，能做的话怎么做的以及用的时候，有哪些可能的问题
在nginx中集成lua脚本：添加自定义Http头，封IP等 ronin47 nginx lua
Lua是一个可以嵌入到Nginx配置文件中的动态脚本语言，从而可以在Nginx请求处理的任何阶段执行各种Lua代码。刚开始我们只是用Lua 把请求路由到后端服务器，但是它对我们架构的作用超出了我们的预期。下面就讲讲我们所做的工作。强制搜索引擎只索引mixlr.com Google把子域名当作完全独立的网站，我们不希望爬虫抓取子域名的页面，降低我们的Page rank。 location /{
java-归并排序 bylijinnan java
import java.util.Arrays; public class MergeSort { public static void main(String[] args) { int[] a={20,1,3,8,5,9,4,25}; mergeSort(a,0,a.length-1); System.out.println(Arrays.to
Netty源码学习-CompositeChannelBuffer bylijinnan java netty
CompositeChannelBuffer体现了Netty的“Transparent Zero Copy” 查看API（ http://docs.jboss.org/netty/3.2/api/org/jboss/netty/buffer/package-summary.html#package_description）可以看到，所谓“Transparent Zero Copy”是通
Android中给Activity添加返回键 hotsunshine Activity
// this need android:minSdkVersion="11" getActionBar().setDisplayHomeAsUpEnabled(true); @Override public boolean onOptionsItemSelected(MenuItem item) {
静态页面传参 ctrain 静态
$(document).ready(function () { var request = { QueryString : function (val) { var uri = window.location.search; var re = new RegExp("" + val + "=([^&?]*)", &
Windows中查找某个目录下的所有文件中包含某个字符串的命令 daizj windows 查找某个目录下的所有文件包含某个字符串
findstr可以完成这个工作。 [html] view plain copy >findstr /s /i "string" *.* 上面的命令表示，当前目录以及当前目录的所有子目录下的所有文件中查找"string&qu
改善程序代码质量的一些技巧 dcj3sjt126com 编程 PHP 重构
有很多理由都能说明为什么我们应该写出清晰、可读性好的程序。最重要的一点，程序你只写一次，但以后会无数次的阅读。当你第二天回头来看你的代码时，你就要开始阅读它了。当你把代码拿给其他人看时，他必须阅读你的代码。因此，在编写时多花一点时间，你会在阅读它时节省大量的时间。让我们看一些基本的编程技巧：尽量保持方法简短尽管很多人都遵
SharedPreferences对数据的存储 dcj3sjt126com
SharedPreferences简介： &nbs
linux复习笔记之bash shell (2) bash基础 eksliang bash bash shell
转载请出自出处： http://eksliang.iteye.com/blog/2104329 1.影响显示结果的语系变量（locale） 1.1locale这个命令就是查看当前系统支持多少种语系，命令使用如下： [root@localhost shell]# locale LANG=en_US.UTF-8 LC_CTYPE="en_US.UTF-8"
Android零碎知识总结 gqdy365 android
1、CopyOnWriteArrayList add(E) 和remove(int index)都是对新的数组进行修改和新增。所以在多线程操作时不会出现java.util.ConcurrentModificationException错误。所以最后得出结论：CopyOnWriteArrayList适合使用在读操作远远大于写操作的场景里，比如缓存。发生修改时候做copy，新老版本分离，保证读的高
HoverTree.Model.ArticleSelect类的作用 hvt Web .net C#hovertree asp.net
ArticleSelect类在命名空间HoverTree.Model中可以认为是文章查询条件类，用于存放查询文章时的条件，例如HvtId就是文章的id。HvtIsShow就是文章的显示属性，当为-1是，该条件不产生作用，当为0时，查询不公开显示的文章，当为1时查询公开显示的文章。HvtIsHome则为是否在首页显示。HoverTree系统源码完全开放，开发环境为Visual Studio 2013
PHP 判断是否使用代理 PHP Proxy Detector 天梯梦 proxy
1. php 类 I found this class looking for something else actually but I remembered I needed some while ago something similar and I never found one. I'm sure it will help a lot of developers who try to
apache的math库中的回归——regression（翻译） lvdccyb Math apache
这个Math库，虽然不向weka那样专业的ML库，但是用户友好，易用。多元线性回归，协方差和相关性（皮尔逊和斯皮尔曼），分布测试（假设检验，t，卡方，G），统计。数学库中还包含，Cholesky，LU，SVD，QR，特征根分解，真不错。基本覆盖了：线代，统计，矩阵，最优化理论曲线拟合常微分方程遗传算法（GA），还有3维的运算。。。
基础数据结构和算法十三：Undirected Graphs (2) sunwinner Algorithm
Design pattern for graph processing. Since we consider a large number of graph-processing algorithms, our initial design goal is to decouple our implementations from the graph representation
云计算平台最重要的五项技术 sumapp 云计算云平台智城云
云计算平台最重要的五项技术 1、云服务器云服务器提供简单高效，处理能力可弹性伸缩的计算服务，支持国内领先的云计算技术和大规模分布存储技术，使您的系统更稳定、数据更安全、传输更快速、部署更灵活。特性机型丰富通过高性能服务器虚拟化为云服务器，提供丰富配置类型虚拟机，极大简化数据存储、数据库搭建、web服务器搭建等工作；仅需要几分钟，根据CP
《京东技术解密》有奖试读获奖名单公布 ITeye管理员活动
ITeye携手博文视点举办的12月技术图书有奖试读活动已圆满结束，非常感谢广大用户对本次活动的关注与参与。 12月试读活动回顾： http://webmaster.iteye.com/blog/2164754 本次技术图书试读活动获奖名单及相应作品如下：一等奖（两名） Microhardest：http://microhardest.ite

java--ml 时间序列_时间序列-快速指南

时间序列-快速指南 (Time Series - Quick Guide)

时间序列-简介 (Time Series - Introduction)

时间序列-编程语言 (Time Series - Programming Languages)

Python (Python)

[R (R)

Java (Java)

C / C ++ (C/C++)

的MATLAB (MATLAB)

时间序列-Python库 (Time Series - Python Libraries)

NumPy (NumPy)

大熊猫 (Pandas)

科学 (SciPy)

Scikit学习 (Scikit Learn)

统计模型 (Statsmodels)

Matplotlib (Matplotlib)

约会时间 (Datetime)

时间序列-数据处理和可视化 (Time Series - Data Processing and Visualization)

显示df.head() (Showing df.head())

删除NaN(非数字) (Dropping NaN(Not-a-Number))

转换为日期时间对象 (Converting to datetime object)

显示情节 (Showing plots)

显示箱线图 (Showing Boxplots)

时间序列-建模 (Time Series - Modeling)

介绍 (Introduction)

时间序列建模技术 (Time Series Modeling Techniques)

幼稚的方法 (Naïve Methods)

自回归 (Auto Regression)

ARIMA模型 (ARIMA Model)

指数平滑 (Exponential Smoothing)

LSTM (LSTM)

时间序列-参数校准 (Time Series - Parameter Calibration)

介绍 (Introduction)

参数校准方法 (Methods for Calibration of Parameters)

尝试 (Hit-and-try)

网格搜索 (Grid Search)

遗传算法 (Genetic Algorithm)

时间序列-天真的方法 (Time Series - Naïve Methods)

介绍 (Introduction)

显示统计 (Showing statistics)

显示第一种天真的方法 (Showing 1st naïve method)

显示第二种天真的方法 (Showing 2nd naïve method)

时间序列-自动回归 (Time Series - Auto Regression)

显示ACP (Showing ACP)

时间序列-移动平均 (Time Series - Moving Average)

显示PACP (Showing PACP)

时间序列-ARIMA (Time Series - ARIMA)

时间序列-ARIMA的变化 (Time Series - Variations of ARIMA)

向量自回归(VAR) (Vector Auto-Regression (VAR))

向量移动平均线(VMA) (Vector Moving Average (VMA))

向量自回归移动平均值(VARMA) (Vector Auto Regression Moving Average (VARMA))

具有外生变量的VARMA(VARMAX) (VARMA with Exogenous Variables (VARMAX))

季节性自回归综合移动平均线(SARIMA) (Seasonal Auto Regressive Integrated Moving Average (SARIMA))

具有外生变量的SARIMA(SARIMAX) (SARIMA with Exogenous Variables (SARIMAX))

分数自回归综合移动平均线(FARIMA) (Fractional Auto Regressive Integrated Moving Average (FARIMA))

时间序列-指数平滑 (Time Series - Exponential Smoothing)

简单指数平滑 (Simple Exponential Smoothing)

三重指数平滑 (Triple Exponential Smoothing)

时间序列-前移验证 (Time Series - Walk Forward Validation)

时间序列-先知模型 (Time Series - Prophet Model)

时间序列-LSTM模型 (Time Series - LSTM Model)

神经网络 (Neural Networks)

递归神经网络 (Recurrent Neural Networks)

LSTM (LSTM)

时间序列-误差指标 (Time Series - Error Metrics)

均方误差 (Mean Square Error)

根均方误差 (Root Mean Square Error)

平均绝对误差 (Mean Absolute Error)

平均百分比误差 (Mean Percentage Error)

平均绝对百分比误差 (Mean Absolute Percentage Error)

时间序列-应用 (Time Series - Applications)

时间序列-进一步的范围 (Time Series - Further Scope)

时间序列数据 (Time Series Data)

非时间序列数据 (Non-Time Series Data)

你可能感兴趣的:(神经网络,大数据,算法,python,机器学习)

显示^第一种天真的方法 (Showing 1^st naïve method)

显示^第二种天真的方法 (Showing 2^nd naïve method)