1.
We do away with this expensive step by choosing several random orderings of attributes.
As we shall show experimentally, this approach works exceedingly well in practice.
我们通过选择若干随机的属性排序来避免这一昂贵的步骤。正如我们将通过实验证明的那样,这种方法在实践中效果非常好。
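A minimal sketch of drawing several random attribute orderings. The attribute names and the helper `random_orderings` are hypothetical, chosen only for illustration; the note does not say how the per-ordering estimates are combined, so that step is left out.

```python
import random


def random_orderings(attributes, k, seed=0):
    """Return k random permutations of the attribute list."""
    rng = random.Random(seed)
    # Sampling len(attributes) items without replacement yields a shuffled copy.
    return [rng.sample(attributes, len(attributes)) for _ in range(k)]


attrs = ["age", "city", "salary"]  # hypothetical attributes
orderings = random_orderings(attrs, k=4)
for order in orderings:
    print(order)
```

Each ordering would define one autoregressive factorization of the joint distribution over the attributes.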
2.
神经网络强大的学习能力
the powerful learning capacity of neural networks
神经网络强大的学习能力甚至可以轻松地学习具有挑战性的分解
by increasing the depth of the MADE model, the powerful learning capacity of neural networks can readily learn even a challenging decomposition
3.
Of course, different orderings result in different models, each with its corresponding estimate and associated accuracy.
4.
虽然随机排序通常能够提供良好的结果,但是最好还是能避免糟糕排列的最坏情况。
While a random ordering often provides good results, it is desirable to guard against the worst-case scenario of a bad permutation.
5.
thereby easier to learn than the other way around
6.
上述的自回归方法
the autoregressive approach outlined above
7.
leveraging query workload
利用查询工作负载
8.
Naive training could cause catastrophic forgetting, where the model forgets the old data and focuses exclusively on the new data.
这是不可取的,必须被避免
This is undesirable and must be avoided.
9.
障碍是查询和它们的选择性之间的复杂关系,
在这种关系中,简化假设(例如属性值独立性)是不成立的。
The impediment is the complex relationship between queries and their selectivities, where simplifying assumptions such as attribute value independence do not hold.
10.
miscellaneous issues 各种各样的问题
miscellaneous expenses 各种开支
11.
查询的选择性通常遵循一种偏态分布,即其中较少的查询有较大的选择性,而大部分的查询有较小的选择性。
偏态分布:follow a skewed distribution
Selectivities of queries often follow a skewed distribution where few queries have a large selectivity and the vast majority of queries have much smaller selectivities.
不需要the:刚开始出现的或不限定的复数名词(泛指)
需要the:前文已提及的或限定的单数/复数名词(特指)
12.
ReLU是一个简单的非线性激活函数,具有较快的训练速度和更稀疏的表示等优点。
ReLU is a simple non-linear activation function with known advantages such as faster training and sparser representations.
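A small sketch of why ReLU yields sparse representations: every negative input maps to exactly zero. The input values are made up for illustration.

```python
def relu(x):
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)


inputs = [-2.0, -0.5, 0.0, 1.5, 3.0]  # illustrative pre-activations
outputs = [relu(v) for v in inputs]

# Negative inputs are clipped to zero, so part of the output is exactly 0.
num_zero = sum(1 for v in outputs if v == 0.0)
print(outputs, num_zero)
```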
13.
squashes its parameter into the range [0,1]
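The note does not name the squashing function. As an assumption, the logistic sigmoid is used below as a common example of a function that maps any real input into the unit interval.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


values = [sigmoid(v) for v in (-10.0, 0.0, 10.0)]
print(values)
```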
14.
We begin by enumerating all queries with one predicate, which are the atomic units from which multi-predicate queries can be estimated.
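A rough sketch of enumerating single-predicate queries, assuming equality predicates of the form attribute = value over a toy in-memory table. The table contents and the helper `single_predicate_queries` are hypothetical.

```python
def single_predicate_queries(rows):
    """Map every (attribute, value) equality predicate to its selectivity."""
    n = len(rows)
    counts = {}
    for row in rows:
        for attr, value in row.items():
            counts[(attr, value)] = counts.get((attr, value), 0) + 1
    # Selectivity is the fraction of rows that satisfy the predicate.
    return {pred: c / n for pred, c in counts.items()}


rows = [  # hypothetical toy table
    {"city": "Paris", "dept": "HR"},
    {"city": "Paris", "dept": "IT"},
    {"city": "Rome", "dept": "IT"},
    {"city": "Rome", "dept": "IT"},
]
selectivities = single_predicate_queries(rows)
print(selectivities)
```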
15.
如果查询工作负载是可用的,我们描述了一种新的增强策略,这样DL模型可以为类似于查询工作负载的未知查询生成准确的估计。
If a query workload is available, we describe a novel augmentation strategy such that the DL model can generate accurate estimates for unknown queries that are similar to the query workload.
一种新的增强策略:a novel augmentation strategy
16.
然而,通过扩充它可以做得更好,获得一个信息更丰富的查询训练集。
However, one can do much better by augmenting it, obtaining a more informative training set of queries.
17.
关键的思想是从查询工作负载引起的分布中选择查询,这样模型就可以泛化到来自相同分布的未知查询。
The key idea is to select queries from the distribution induced by the query workload such that the model generalizes to unknown queries from the same distribution.
the distribution induced by the query workload
由工作负载引起的分布
18.
This ensures that queries involving popular attributes and attribute values are generated at a higher frequency.
涉及流行属性和属性值的查询: queries involving popular attributes and attribute values
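One way to sketch frequency-proportional generation: sample attributes with probability proportional to how often they appear in the workload. The workload format (one dict of predicates per query) and both helper names are assumptions made for this example.

```python
import random
from collections import Counter


def attribute_weights(workload):
    """Count how often each attribute appears across the workload's queries."""
    return Counter(attr for query in workload for attr in query)


def sample_attributes(workload, n, seed=0):
    """Draw n attributes with probability proportional to workload frequency."""
    counts = attribute_weights(workload)
    attrs = sorted(counts)
    weights = [counts[a] for a in attrs]
    return random.Random(seed).choices(attrs, weights=weights, k=n)


# Hypothetical workload: each query is a dict of attribute -> value predicates.
workload = [{"city": "Paris"}, {"city": "Rome", "dept": "IT"}, {"city": "Rome"}]
counts = attribute_weights(workload)
samples = sample_attributes(workload, n=1000)
print(counts, Counter(samples))
```

The same weighting could be applied to attribute values within each attribute.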
19.
the database community
数据库社区
20.
扩散模型可能看起来是隐变量模型的受限类别,但是它们在实现中允许大量自由度。
Diffusion models might appear to be a restricted class of latent variable models, but they allow a large number of degrees of freedom in implementation.
在实现中允许大量自由度。
allow a large number of degrees of freedom in implementation.
21.
Additionally, we demonstrate that our models learn effective representations via image inpainting experiments.
此外,我们通过图像修复实验证明了我们的模型学习到了有效的表示。
our models learn effective representations: 我们的模型学到了有效的表征
22.
One must choose the variances βt of the forward process and the model architecture and Gaussian distribution parameterization of the reverse process.
必须选择正向过程的方差βt和反向过程的模型结构和高斯分布参数化。
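A sketch of one possible choice for the forward-process variances: a linear schedule for βt, together with the cumulative product ᾱt = ∏(1 − βs) that appears in the closed-form forward marginal. The linear form, the endpoint defaults, and T = 1000 are assumptions for illustration, not something the note specifies.

```python
def linear_beta_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Variances beta_1..beta_T, linearly spaced between two endpoints."""
    if T == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (T - 1)
    return [beta_start + step * t for t in range(T)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


betas = linear_beta_schedule(T=1000)  # assumed number of steps
alpha_bars = cumulative_alphas(betas)
print(betas[0], betas[-1], alpha_bars[-1])
```

As ᾱt shrinks toward zero, the forward process leaves less and less of the original signal.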
23.
We argue that there are two major obstacles that prevent a naive application of this idea.
我们认为有两个主要的障碍阻止简单地应用这一观点。
两个主要的障碍:two major obstacles
简单地应用这一观点:a naive application of this idea
24.
非平稳时间序列的精确变化趋势
the precise variation trend of non-stationary time series
25.
in an arbitrary order
以任意顺序
26.
顶点的基本性质 类比于 物体的质量
the basic property of a vertex, by analogy with the mass of an object
27.
因此,本文研究了时间序列的长期预测问题,其特点是预测的时间序列长度很大。
Thus, in this paper, we study the long-term forecasting problem of time series, which is characterized by the large length of the predicted series.
28.
首先,直接从长期时间序列中发现时间相关性是不可靠的,因为相关性可能被纠缠的时间模式所掩盖。
First, it is unreliable to discover the temporal dependencies directly from the long-term time series because the dependencies can be obscured by entangled temporal patterns.
可能被纠缠的时间模式所掩盖。
can be obscured by entangled temporal patterns
29.
While performance is significantly improved, these models still utilize the point-wise representation aggregation.
逐点表示聚合: the point-wise representation aggregation.
30.
常见的做法:common practice
限制了分解的能力:limit the capabilities of decomposition
忽视了分解组件之间未来可能的交互:
overlooks the potential future interactions among decomposed components