It is consistent with my experience here.
(i.e., this matches my own experience)

intractable: hard to deal with; very difficult to solve (not "unusable")

This re-implementation serves as a proof of concept and is not intended for reproduction of the results reported in.
(serves as: functions as; is not intended for: is not meant for; reproduction of the results: reproducing the reported results)

would significantly accelerate developments in novel medicines and materials.
(to significantly accelerate the development of something)

I will refer to these models as Graph Convolutional Networks (GCNs).
(refer to ... as ...: I will call these models graph convolutional networks)
a canonical form
(canonical, adj.: standard, accepted; a canonical form is a standard or normal form)

reify: to make concrete

differentiate between concepts and instances
(draws a distinction between concepts and instances)

This is clearly inadequate
(inadequate: insufficient, not good enough)

and it is quite conceivable that incorporating temporal scopes during representation learning is likely to yield better KG embeddings
(quite conceivable: easy to imagine; incorporating temporal scopes: taking time spans into account; likely to yield: likely to produce)

In spite of its importance
(despite its importance)

HyTE fragments a temporally-scoped input KG into multiple static subgraphs
(fragments ... into ...: breaks something up into)

This significant gain
(this substantial improvement)

empirically validates our claim that including temporal information in a principled fashion helps to learn richer embeddings of the KG elements.
(empirically validates our claim: confirms experimentally what we claimed; in a principled fashion: in a principled way)
Relation prediction: Again, in this scenario, we show improvement over baselines.
(Note: relation prediction is a task; it is interesting that the authors use the phrase "in this scenario" when introducing it.)

disambiguate among relations
(resolve the ambiguity among relations)

We also thank Chandrahas Dewangan for his indispensable suggestions.
(indispensable: essential, impossible to do without (not "unobtainable"); note that "suggestions" is plural here)

methods also suffer from the oversimplified loss metric
(suffer from: are harmed by a drawback)

Firstly, due to the inflexibility of the loss metric, current translation-based methods apply spherical equipotential hyper-surfaces with different plausibilities,
(due to the inflexibility of the current loss metric)
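A side note on the maths behind "spherical equipotential hyper-surfaces": a generic TransE-style score (a minimal illustration of the family the excerpt refers to, not the exact model of the quoted paper) is f(h, r, t) = -||h + r - t||, and all tail vectors with the same score lie on a sphere centred at h + r.

```python
import numpy as np

def transe_score(h, r, t):
    """TransE-style plausibility: negative L2 distance ||h + r - t||.
    Triples with equal distance lie on a sphere centred at h + r,
    which is the 'spherical equipotential hyper-surface' above."""
    return -np.linalg.norm(h + r - t)

h = np.array([1.0, 0.0])
r = np.array([0.0, 1.0])
t_good = np.array([1.0, 1.0])   # t == h + r: a perfect translation
t_bad = np.array([3.0, -2.0])   # far from h + r: low plausibility

assert transe_score(h, r, t_good) == 0.0
assert transe_score(h, r, t_bad) < transe_score(h, r, t_good)
```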
It is worth noting that
(it is worth pointing out that)

is a nontrivial task
(nontrivial: not trivial, i.e., genuinely demanding; stronger than merely "important")

Inspired by generative adversarial networks (GANs),
(inspired by: taking inspiration from)

but by far the two described above are the most common.
(by far: by a wide margin; here it intensifies "the most common" and does not mean "so far")

The problem is less severe to models using logsoftmax loss function
(the problem is less serious for models that use a log-softmax loss)

we would like to draw an analogy between our model and an RL model
(draw an analogy between A and B: compare A to B)

making models prone to overfitting.
(prone to: liable, susceptible to)

Finding the best ratio between expressiveness and parameter space size is the keystone of embedding models
(keystone: the central, most essential part of embedding models)

embeddings can be a very effective composition function, provided that one uses the right representation.
(provided that: on the condition that)

Using the same embeddings for right and left factors boils down to eigenvalue decomposition
(boils down to: amounts to, reduces to)
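A note on the maths behind this phrase: factorising a matrix X = E W Eᵀ with the same factor E on both sides is, for a symmetric real X, exactly an eigenvalue decomposition. A minimal numpy sketch (the matrix is illustrative, not from the quoted paper):

```python
import numpy as np

# A symmetric "relation matrix": pairwise scores between 4 entities.
X = np.array([[2.0, 1.0, 0.0, 1.0],
              [1.0, 3.0, 1.0, 0.0],
              [0.0, 1.0, 2.0, 1.0],
              [1.0, 0.0, 1.0, 3.0]])

# Eigendecomposition: X = E @ diag(w) @ E.T, with the SAME E
# playing the role of both the left and the right factor.
w, E = np.linalg.eigh(X)
X_rec = E @ np.diag(w) @ E.T
assert np.allclose(X, X_rec)

# Each row of E can then be read as a vectorial representation
# of the corresponding entity.
```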
We are in this work however explicitly interested in problems where matrices — and thus the relations they represent — can also be antisymmetric.
(Note the mid-sentence position of "however", and the paired dashes setting off the aside "and thus the relations they represent".)

conjugate
(as in complex conjugate)

The factorization (5) is useful as the rows of E can be used as vectorial representations of the entities corresponding to rows and columns of the relation matrix X.
(A note on English writing style: the key claim comes first, "the factorization is useful", and the qualification follows: the rows of E serve as entity representations, and those entities correspond to the rows and columns of the relation matrix X.)

While X as a whole is unknown
(as a whole: in its entirety)

presumably due to its simplicity
(presumably: probably, one supposes)

This only indirectly shows the meaningfulness of low-dimensional embeddings.
(this only indirectly demonstrates that low-dimensional embeddings are meaningful)
The rest of this paper is structured as follows. Section 2 discusses related work.
(This formula usually appears at the end of the introduction.)

a canonical link prediction task
(canonical: standard, widely accepted)

Likewise, variations on entity representations also exist.
(variations: variants of something)

a simple variant of
(a variant of: a modified version of)

In terms of the number of parameters,
(in terms of: with respect to, as measured by)

can be helpful to model relations in which the similarity of entities is important
(helpful to model: useful for modelling)

In this scheme
(under this scheme or mechanism)

The features of OpenKE are threefold:
(threefold: consisting of the following three aspects)

OpenKE integrates efficient computing power, training methods, and various acceleration strategies to support KE models
(integrates A, B, and C: combines them)

dissimilarity function
(a function that measures dissimilarity, i.e., a distance-like function; not "a different function")

The answer is affirmative
(the answer is yes)

Our findings cast doubt on the claim that the performance improvements of recent models are due to architectural changes as opposed to hyperparameter tuning or different training objectives.
(cast doubt on: call into question; the claim that: the assertion that; due to: caused by; as opposed to: rather than)

it is prohibitive to consider
(considering ... would be prohibitively expensive)

ProjE has four parts that distinguish it from the related work:
(four respects in which it differs from related work)
In principle
(in principle: basically, in theory)

the 2-layer GCN significantly improves over the 1-layer GCN by a large margin.
(improves over ... by a large margin: outperforms ... by a wide margin)

Certainly not the more the better
(more is certainly not always better)

This is certainly undesirable as it defeats the purpose of semi-supervised learning
(defeats the purpose of: undermines the very goal of; not merely "fails to meet the needs of")

Unlike these prior models, which modified the regular convolutional operations to fit them for generic graph data, we instead propose to transform graphs into grid-like data to enable the use of CNNs directly.
(note the contrast signalled by "instead")

is an operation that performs the k-largest node selection.
(performs: carries out)
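On "the k-largest node selection": in graph pooling this usually means keeping the k nodes with the highest projection scores. A minimal numpy sketch (illustrative only, not the quoted paper's code):

```python
import numpy as np

def top_k_nodes(scores, k):
    """Return the indices of the k highest-scoring nodes,
    ordered from highest to lowest score."""
    order = np.argsort(scores)[::-1]  # sort indices by descending score
    return order[:k]

scores = np.array([0.1, 0.9, 0.4, 0.7])
idx = top_k_nodes(scores, 2)
assert list(idx) == [1, 3]  # the nodes scoring 0.9 and 0.7
```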
I've been blessed with an incredible amount of talent. My work ethic is second to none.
(second to none: unsurpassed; I have been endowed with incredible talent, and my work ethic is first-rate)

suffer from a dearth of linguistic variation,
(a dearth of: a scarcity, a lack of)

These operations are self-explanatory.
(self-explanatory: needing no further explanation)

by nature
(by nature: inherently, in essence)

A case in point is that of fine-grained endangered species recognition
(a case in point: a good example is)

In such specialized problems, it is challenging to effectively train deep networks that are, by nature, data hungry. A case in point is that of fine-grained endangered species recognition [2], where besides scarcity of training data, there are further bottlenecks like subtle inter-class object differences compared to significant randomized background variation both between and within classes.
(besides scarcity of training data: apart from the lack of data, a further bottleneck is that the classes differ only subtly from one another)
In spite of these advances, deep learning of small fine-grained datasets remains one of the open popular challenges of machine vision
(in spite of these advances: despite this progress; remains an open challenge in computer vision)

in an end-to-end fashion.
(in an end-to-end manner)

We attribute this mainly to the L2 normalization involved in the cosine loss,
(attribute A to B: ascribe A mainly to B)

Such regimes arise from practical situations where not only data labeling but also data collection itself is expensive.
(such regimes arise from real-world settings in which not only labeling the data but even collecting it is costly)

All results endorse the superior capability of DADA in enhancing the generalization ability of deep networks trained in practical extremely low data regimes
(endorse the superior capability of: confirm the superiority of; in practical extremely low data regimes: in realistic, extremely data-scarce settings)

In this paper, we place ourself in front of an even more ill-posed and challenging problem:
(we confront an even more ill-posed and challenging problem; "ourselves" would be the standard form)

we have made multi-fold technical contributions in this paper:
(multi-fold: multiple, several)

Hereby we review and categorize several mainstream approaches.
(hereby: here, we survey and classify several mainstream methods)

In practice, pre-training is often accompanied with data augmentation
(accompanied with: combined with; "accompanied by" is the more common collocation)

they are motivated by the same hypothesis that while labeling data is difficult, collecting unlabeled data remains to be a cheap task
(motivated by the same hypothesis that: driven by the same assumption; "remains a cheap task" would be more idiomatic)

overall,
(overall: all in all, in summary)

We conjecture that
(conjecture: speculate, suppose)

that might hamper class-conditional generative modeling.
(hamper: hinder, impede)
many researchers have broadened their interest towards cross-lingual word embeddings
(broadened their interest towards: extended their research interests to)

It was done exclusively for the good of Inter
(exclusively: solely; for the good of: for the benefit of)

without loss of generality
(a standard mathematical phrase, often abbreviated w.l.o.g.)

I was ecstatic for them
(ecstatic: overjoyed)

This stems from the cultural and legal tradition
(stems from: originates in)

In my mind, the 4 main steps are:
(in my view, the four main steps are)

Are these the "Chinese Characteristics" we hear so much about?
(we hear so much about: the much-talked-about)

A number of ways to do this can be envisaged
(envisage: imagine, conceive of; many ways of doing this can be imagined)

To this end
(to this end: to achieve this goal)

is less prone to overfitting.
(less prone to: less likely to suffer from)

in place of
(in place of: instead of)

subtle changes to the background or texture of an image can break a seemingly powerful classifier
(subtle changes to: small modifications of)

we use adversarial learning to constrain the representation to have desired statistical characteristics specific to a prior.
(the infinitive "to have ..." expresses the further purpose of the constraint)
We term our method Mixture Generative Adversarial Nets (MGAN)
(term: to name, to call)

will therefore struggle
(will therefore have great difficulty)

at negligible cost.
(negligible: small enough to be ignored)

For more demanding tasks
(for tasks with higher requirements)

we posit that
(posit: postulate, assume)

which in turn will drive the choice of the architecture and training algorithm.
(in turn: consequently; drive: determine, steer)

without access to supervised labels during training.
(without access to: without the availability of)

Schematic for meta-learning an unsupervised learning algorithm
(schematic: a diagram showing how something works)

Suppose the feature vector is of limited capacity
(is of limited capacity: has limited capacity)

DIM opens new avenues for
(opens new avenues for: creates new possibilities for)

MI is notoriously difficult to compute
(notoriously: famously, in a negative sense)

is tailored to
(tailored to: custom-made for)

we're following a more end-to-end approach, increasing the capabilities of our system considerably.
(the participle "increasing ..." expresses the result: this considerably improves the system's capabilities)
In effect
(in effect: in practice, actually)

Watch Out for Correlated Features
(watch out for: beware of)

shed some light on
(shed some light on: clarify, offer insight into; note the singular "light")

We advocate for cost-sensitive robustness as the criteria for measuring the classifier's performance for tasks where some adversarial transformations are more important than others.
(advocate for: argue in favour of; as the criteria for: as the criterion for; the "for" here marks purpose, not cause)

the first version of this document was drafted by my students. My thanks to them.
(my thanks to them: I am grateful to them)

However, they are hard to train, as they require a delicate balancing act between two deep networks fighting a never-ending duel.
(as: because; a delicate balancing act: a careful equilibrium)

You must do the work as I have told you.
(as: in the way that)

Wasserstein distance is just one out of a large family of objective functions that yield these properties.
(one out of a large family of: just one member of a large family)

The training additionally becomes more robust to suboptimal choices of hyperparameters, model architectures, or objective functions.
(Translation tip: move the adverb forward: "additionally, training becomes more robust to ...")

This is even worse for neural networks
(even worse for: still more problematic for)

it is inferior to CBOW in memorizing word content.
(inferior to ... in ...: worse than CBOW at)

The dominant approach to unsupervised ...
(the dominant approach to: the mainstream method for)

which is independent of the attributes specifying its "style"
(independent of: not dependent on)

Generations should retain as many of the original input characteristics as possible, provided the attribute constraint is not violated
(provided, conj.: on condition that, as long as; violate, v.: to break, to infringe)
Over the past decade
, deep neural networks have become arguably the most popular model choice for a vast number of natural language processing (NLP) tasks and have constantly been delivering state-of-the-art results.
Current approaches to summarization are based on the sequence-to-sequence paradigm over the words of some text.
(approaches to: methods for; a paradigm over: a framework defined on top of)

To mitigate the long-distance relationship problem, we draw inspiration from recent work on highly-structured objects.
(draw inspiration from: take ideas from)

The framework is agnostic to the architecture of the machine reading model provided it has access to the token-level hidden representations of the reader.
(agnostic to: independent of; provided: as long as; has access to: can obtain)

I am fortunate to be supported by the
(fortunate to be supported by: lucky to receive support from)

Besides my research, I am passionate about sports.
(passionate about: enthusiastic about)

We find that the MC model performs better when training time is less of an issue.
(less of an issue: not a major concern)

the technique was not developed for the purpose of explicitly improving either informativeness or diversity.
(for the purpose of: with the aim of)

so as to improve
(so as to: in order to)

The objective in Eqn. (3) also resembles Wasserstein GAN (WGAN)
(resembles: is similar to)

for example
(for example: e.g.)

This problem compounds when generating longer and longer sentences.
(compounds: worsens, intensifies)
The discriminator has an identical architecture to the generator.
(an identical architecture to: the same architecture as)

Analogously
(analogously: similarly, by analogy)

it shows that our proposed approach for learning inference in an adversarial setting is superior to the other approaches investigated.
(is superior to: outperforms the other approaches examined)
For this purpose
(for this purpose: to this end)

In regards to your comment on Gibbs sampling
(in regards to: regarding; "with regard to" is the more standard form)

he is determined to make
(determined to: resolved to)

offers excellence in teaching and learning
(offers excellence in: provides outstanding teaching and learning)

Akin to the use of pre-trained models in computer vision
(akin to: similar to)
We would appreciate it
if you could let us know whether this clarifies the issue, and changes your assessment of the novelty of our work.