The difference between dataset and data set

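When writing a paper in English you run into the term for "数据集". Some English papers write dataset and others write data set, and in what I have read data set is the more common of the two. I looked up the difference between them and found a website with several explanations: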

  1. dataset for certain datasets; data set for any set of data in general. In specific contexts, a dataset needs to satisfy conditions to qualify as a dataset. Any set of any data can be called a data set, unqualified;
    In other words, dataset refers to one particular, specific collection of data, while data set is the general term and can mean any collection of data.
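  2. I note that googling the NIPS website that contains many academic papers with datasets I find that “data set” reports 1,890 results and “dataset” 2,660 results. The same pattern is seen for plural (datasets/data sets). I would suggest using “dataset”;
    That is, a Google search restricted to that site's academic papers returns 1,890 results for data set and 2,660 for dataset, so the commenter recommends dataset. (In the material I read in my own field, however, data set is in the majority, so the preference probably depends on the research area.)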
  3. dataset does not appear in any dictionaries. However, there are 172 incidences in the Corpus of Contemporary American English, and all but a handful are in the “academic” section, representing formal academic writing. Its lack of appearance in dictionaries is probably because it is a fairly new coinage, the two examples from the Corpus of Historical American English are from 2001. Nothing from before then. Interestingly, the British National Corpus has 51 incidences, dating from the 1980s to the mid 1990s.
  4. Wiktionary says they are equivalent, but neither Merriam-Webster nor Dictionary.com has an entry. Given that information, I guess I would classify dataset as technical jargon, but it’s really not much of a jargon term. Any technical audience would have no problem with it; a non-technical audience should still easily understand its meaning.
    That is, Wiktionary treats the two as equivalent, but neither Merriam-Webster nor Dictionary.com lists dataset. On that basis the commenter would call dataset technical jargon, though only barely: anyone working in the field understands it, and someone outside the field can easily work out what it means too.
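
Since the preference seems to vary by field, one way to check your own area is to count both spellings in text you have collected yourself, in the spirit of the search-count comparison in item 2. The following is only a minimal sketch: `count_spellings` is a hypothetical helper name, and the sample sentence is made up for illustration rather than taken from any real paper.

```python
import re

# Closed form: "dataset" / "datasets", any capitalization.
CLOSED_FORM = re.compile(r"\bdatasets?\b", re.IGNORECASE)
# Open form: "data set" / "data sets", allowing any whitespace between the words.
OPEN_FORM = re.compile(r"\bdata\s+sets?\b", re.IGNORECASE)


def count_spellings(text: str) -> dict:
    """Count how often each spelling occurs in the given text."""
    return {
        "dataset": len(CLOSED_FORM.findall(text)),
        "data set": len(OPEN_FORM.findall(text)),
    }


if __name__ == "__main__":
    # Made-up sample text; replace it with abstracts from your own field.
    sample = (
        "We release a new dataset. Earlier data sets were smaller; "
        "this Dataset is larger."
    )
    print(count_spellings(sample))
```

Running it over a folder of abstracts from the journals you plan to submit to would give a rough picture of which form is conventional there.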


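In my view both dataset and data set are acceptable. The only difference is that data set covers a broader range of data collections, while dataset is comparatively narrower.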
