First, the definition from Wikipedia: The arXiv (pronounced "archive", as if the "X" were the Greek letter Chi, χ) is an archive for electronic preprints of scientific papers in the fields of mathematics, physics, computer science, quantitative biology and statistics which can be accessed via the Internet. In many fields of mathematics and physics, almost all scientific papers are placed on the arXiv. As of 3 October 2008, arXiv.org passed the half-million article milestone, with roughly five thousand new e-prints added every month.[1]
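Clearly, arXiv.org is an online database of preprints of scientific papers. It currently holds more than 500,000 articles and is growing by about 5,000 a month. Its main categories are mathematics, physics, computer science, nonlinear sciences, quantitative biology, quantitative finance, and statistics. Its most important feature is "open access": anyone can read the full text free of charge.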
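arXiv.org was founded in 1991, when the World Wide Web (WWW) had only just appeared and was still almost unknown. Even so, Paul Ginsparg's creation proved popular with his peers: high-energy physicists quickly adopted the new channel of communication and took an active part in it, and it soon spread to astrophysics, condensed matter physics, and other fields.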
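Physicists took to arXiv so quickly because of the way professional physicists had long worked. Ever since quantum mechanics was born in the early twentieth century, physicists had been in a state of "excitement". Applying the new quantum mechanics to the smaller subatomic world, to the larger realm of solid-state physics, and even to the universe as a whole turned out to be a great scientific gold rush, with important discoveries following one after another, seemingly without end. Exchanging theoretical and experimental progress as fast as possible therefore became a common need. Publishing a paper and then reading one's peers' papers in journals costs six months to a year, which is intolerable for physicists working on the front line: in half a year, the "glory" that should have been yours, perhaps even a chance at a Nobel Prize, could be handed to a colleague. So physicists at the frontier grew used to exchanging preprints of their latest work. A preprint is a completed scientific paper at the stage before it is submitted to a journal.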
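For example, the scientific careers of T. D. Lee and C. N. Yang show that preprints were already popular in the 1950s, and that they really did promote communication within the scientific community and speed up scientific progress.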
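...In August 1956, C. N. Yang received a letter from R. Oehme of the University of Chicago, written after Oehme had read the preprint by Yang and T. D. Lee on parity nonconservation. The letter led to a paper that the three of them wrote at the end of 1956, which extended the discussion of parity nonconservation to nonconservation under charge conjugation and time reversal. This paper laid the foundation for later discussions of the three kinds of nonconservation in β decay...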
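Physicists at the frontier thus already had the habit of using preprints. The real significance of Ginsparg's work is that it gathered them in one place on the Internet, giving every physicist access to preprints that used to circulate privately and could be read only by a small circle of elite physicists. With arXiv, every physicist, especially those in the "Third World", is no longer so far behind in following the most important research developments. The time lag has almost disappeared, whereas before it was at least a year: by the time you learned of progress made a year earlier, much of the most important work had already been done by others.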
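From this point of view arXiv matters a great deal: it has "integrated" physics research across the whole world. Whether you are in Cambridge, England; Kraków, Poland; or Kolkata (Calcutta), India, you can learn of the latest developments in physics as soon as they appear. The recent major advances in hot fields such as superstring theory and superconductivity are all tied to arXiv. Ginsparg received a MacArthur Fellowship in 2002 for this work.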
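From the above, we can see that Ginsparg's work changed the way physicists communicate almost without intending to, and succeeded at once. Of course, as the Internet spread, the preprint archive began to run into new problems. Leaving aside issues such as copyright, we focus here on the archive's "quality control" problem.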
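In arXiv's early days, "quality control" was not a problem, because its administrators and users were all first-rate high-energy physicists. Uploading and approval of preprints were fully automatic, and even the readers were all high-energy physicists or future ones. As arXiv became better known and more ordinary users gained Internet access, this gradually changed. Still, arXiv adjusted very little: all you needed was a legitimate research affiliation, judged by your email address. In other words, arXiv is not a community fully open to the public: to contribute, you must show that you come from an academic research institution, which means registering with an email address ending in .edu. Everything else stayed the same: submissions were automatic, approval was automatic, and nobody reviewed the quality or relevance of submitted papers. This "governing by non-interference" approach worked quite well: a few junk papers exist, but we very rarely come across them.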
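It was not until January 2004, as more and more preprints were being submitted, that arXiv gradually introduced a review mechanism: researchers who are not active in a field need the endorsement of an active researcher in that field before they can submit a preprint. arXiv's main aim in doing so was to keep the archive useful to scientists in each field by ensuring that papers are relevant and meet a basic standard of quality. From the day it was born, arXiv has been positioned as a service for professional scientists, so its "shutting out of amateur researchers" is understandable.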
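The two gatekeeping rules described above (an academic email address, and since January 2004 an endorsement for researchers who are not active in the field) can be written down as a small decision function. This is only an illustrative sketch of the rules as this article states them, not arXiv's actual implementation: the names `Submitter`, `has_academic_email`, and `may_submit` are hypothetical, and treating ".edu" as the only academic suffix is an assumption taken directly from the text.

```python
from dataclasses import dataclass

# Assumption: the article mentions only the .edu suffix as proof of affiliation.
ACADEMIC_SUFFIXES = (".edu",)


@dataclass
class Submitter:
    """Hypothetical record describing someone who wants to submit a preprint."""
    email: str
    is_active_in_field: bool = False
    endorsed_by_active_researcher: bool = False


def has_academic_email(email: str) -> bool:
    """Return True if the address has a domain ending in an academic suffix."""
    if "@" not in email:
        return False
    domain = email.rsplit("@", 1)[-1].lower()
    return domain.endswith(ACADEMIC_SUFFIXES)


def may_submit(submitter: Submitter, endorsement_required: bool = True) -> bool:
    """Apply the rules described in the article.

    endorsement_required=False models the period before January 2004,
    when an academic email address alone was enough.
    """
    if not has_academic_email(submitter.email):
        return False
    if not endorsement_required:
        return True
    # Since January 2004: inactive researchers need an active endorser.
    return submitter.is_active_in_field or submitter.endorsed_by_active_researcher
```

For instance, `may_submit(Submitter("alice@physics.example.edu"), endorsement_required=False)` is accepted under the older rule, while the same submitter is rejected under the newer rule until an endorsement is recorded.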
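The predecessor of arXiv.org was xxx.lanl.gov, which can fairly be called a pioneer of the Open Access movement. Its founder is Paul Ginsparg; the story of Ginsparg and arXiv.org can be read at the following address: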
http://www.qiji.cn/news/open/2003/11/28/20031128232449.htm
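Appendix: a note on preprints:
A preprint is a research paper, technical report, or similar article that a researcher voluntarily releases at an academic conference or over the Internet, for the purpose of exchanging ideas with peers, before the work has appeared in a formal publication. Compared with journal articles, preprints circulate quickly and encourage scholarly debate; compared with articles posted on ordinary web pages, they are more reliable.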
arXiv.org has many mirror sites around the world, which makes it easy for researchers everywhere to download papers. Many literature search services are also built on arXiv data, such as CiteBase. In China, the Institute of Theoretical Physics of the Chinese Academy of Sciences hosts a mirror (cn.arxiv.org). Some of the overseas access routes are listed in the Wikipedia passage quoted below. An even more convenient method is to search directly with Google, which can restrict a search to a given site. Often, when you simply type a paper's keywords or title into Google, entries from the arXiv database show up as well, which itself reflects how popular arXiv is.
The standard access route is through the arXiv.org website or one of several mirrors. Several other interfaces and access routes have also been created by other un-associated organisations. These include the University of California, Davis's front, a web portal that offers additional search functions and a more self-explanatory interface for arXiv.org, and is referred to by some mathematicians as (the) Front.[8] A similar function is offered by eprintweb.org, launched in September 2006 by the Institute of Physics. Google Scholar and Windows Live Academic can also be used to search for items in arXiv.[9] Finally, researchers can select sub-fields and receive daily e-mailings or rss feeds of all submissions in them.
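The site-restricted Google search mentioned above can be built as an ordinary URL. The sketch below assumes Google's `site:` search operator and the usual `q` query parameter of its search page; the function name `google_site_search_url` is made up for illustration, and the code only constructs the address without sending any request.

```python
from urllib.parse import urlencode


def google_site_search_url(keywords: str, site: str = "arxiv.org") -> str:
    """Build a Google search URL restricted to one site via the site: operator."""
    query = f"site:{site} {keywords}"
    # urlencode percent-encodes the colon and turns spaces into '+'.
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # Paste the printed address into a browser to run the search.
    print(google_site_search_url("parity nonconservation weak interactions"))
```

Passing `site="cn.arxiv.org"` would restrict the search to the Chinese mirror named above, provided Google indexes that mirror.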
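Researchers typeset their papers in the required format and upload them, by email, FTP, or similar means, to the appropriate database by subject category. Note that papers entering the preprint archive have not been peer reviewed, and there are no preconditions deciding which papers may enter the e-print arXiv database; in effect, authors are assumed to bear responsibility for their own work. Papers in the database may be commented on by peers at any time, and authors may rebut such comments. Authors may submit a paper to a journal for formal publication at the same time as submitting it to e-print arXiv. Once the paper appears in a journal, the journal reference is added to its record in e-print arXiv.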
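arXiv.org's open-access nature puts it in opposition to, or competition with, commercial publishers. To survive, it therefore has to be explicit about copyright. The official website states its position clearly: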
arXiv License Information
arXiv is a repository for scholarly material, and perpetual access is necessary to maintain the scholarly record. As such, arXiv keeps a permanent record of every submission and replacement announced.
arXiv does not ask that copyright be transferred. However, we require sufficient rights to allow us to distribute submitted articles in perpetuity. In order to submit an article to arXiv, the submitter must either:
grant arXiv.org a non-exclusive and irrevocable license to distribute the article, and certify that they have the right to grant this license,
certify that the work is available under either the Creative Commons Attribution license, or the Creative Commons Attribution-Noncommercial-ShareAlike license, and that they have the right to grant this license, or
certify that the work is in the public domain (we will store this information by associating the Creative Commons Public Domain Declaration with the submission)
In the most common case authors have the right to grant these licenses because they hold copyright in their own work. We currently support only two of the Creative Commons licenses. If you wish to use another license then it is appropriate to indicate a more restrictive version for arXiv records (both of the licenses we support give us sufficient rights to distribute articles) and then indicate the more permissive license in the actual article.
Note that if you intend to submit, or have submitted, your article to a journal then you should verify that the license you intend to select does not conflict with the journal license or copyright transfer agreement. Many journal agreements permit submission to arXiv with the non-exclusive license to distribute which arXiv has used since 2004. The Creative Commons Attribution license in particular, permits commercial reuse and thus conflicts with many journal agreements.
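As this shows, alongside its own non-exclusive distribution license, arXiv accepts Creative Commons licenses. Under these you are free to share and adapt a work as long as you give attribution in the manner the original author specifies and, under the ShareAlike variant, release derivatives under the same license. This does a great deal to encourage the sharing of knowledge in research. Note, however, that the CC Attribution license does not rule out commercial use, so some articles on arXiv can be used commercially; if the authors of those articles also want to publish with a publisher, there may be a conflict between the agreements.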
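This is also the topic that Simuworld.linkka.com most wanted to discuss in writing this article. To a greater or lesser degree, arXiv.org represents, at a philosophical level, an unstoppable trend in contemporary society. Let us look at the thinking of arXiv.org's creators and maintainers (the quotations come from the home page of the Cornell University Department of Computer Science):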
A crisis has been evolving in the past few years in the realm of scholarly publishing because commercial journals have raised their prices substantially without a proportional benefit to the community of authors or readers. For example, the EMPS (Engineering, Math and Physical Sciences) library at Cornell has seen a 9% subscription increase in just the past year. The worst offender seems to be Elsevier, which publishes many CS journals.
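First, this points out the profit-driven nature of commercial publishers, which to some extent obstructs the traditionally open way in which researchers communicate.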
A second looming concern with scholarly publishing is that commercial publishers are using pricing policies to push libraries into switching to all-electronic subscription. All-electronic subscription gives the commercial publisher unprecedented control over who can read articles and for what purposes those articles are used. Furthermore, an electronic subscription means that the publisher expands its role to become also the archivist of the material. There is no reason to believe that a company like Elsevier is qualified to usurp the role traditionally filled by libraries as the archivist of scholarly work over a period of decades or centuries. For more information about the problems faced by university libraries, please visit the home page of the SPARC project of the Association of Research Libraries.
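Here the text touches on the deeper reason the arXiv project exists: today's publishers use pricing and the electronic resources they control as "weapons" to "invade" the territory of traditional libraries. Libraries, the traditional clearinghouses of public knowledge, are being threatened and constrained. That makes arXiv a "counterattack weapon" in defense of the library's role.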
An obvious solution to these problems is for the academic community as a whole to create its own archive under the control of scholars rather than a corporate board of directors. This is the goal behind arxiv.org. We believe that all academics ought to include their publications in this kind of archive. Therefore, we are establishing this as a departmental policy. We would like to establish it as a policy for the whole world, but we have to start somewhere!
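Here arXiv's goal is finally made explicit: "self-governance by scholars". Because researchers are the original copyright holders of their scholarly articles, they are able to decide the fate of their own work.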
Naturally, a member of the department could easily follow this policy on his or her own initiative without the existence of a departmentwide policy. Indeed, several of us already archive our papers as a matter of course because archiving brings several benefits to the author including enhanced visibility of the result and proof of precedence of discovery. But we believe there are three reasons why it is useful to make archiving an official policy of the department.
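This is a question well worth considering. Many researchers still believe that publication in a recognized commercial journal is the most effective way to spread their influence. But before a work is published, the publisher asks the author to agree to its copyright transfer agreement. These agreements may conflict with arXiv's practices, so if you want to post a paper on arXiv as well, whether before or after publication, you must consider these legal issues. As noted earlier, the spirit of free sharing that arXiv promotes has become a trend, to the point that more and more publishers have come to accept coexistence with free sharing. As the text below says, many publishers' copyright agreements now support, or at least do not object to, authors also submitting their articles to arXiv.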
Doesn't archiving violate a journal's copyright policy?
First, note that you can alter copyright transfer agreements to preserve more rights for yourself. Naturally, a journal might reject the paper if it disagrees with your alterations to the agreement, but we have heard that many people have successfully altered these agreements without adverse consequences. Later, we will post some possible alterations that people have successfully used on copyright transfer agreements.
Assuming you don't alter the agreement, you are subject to the terms of it. Here are the policies of some of the larger CS publishers.
ACM. The relevant ACM policy states that, prior to acceptance you can post the preprint version of the paper anywhere, but that after acceptance you should add an ACM copyright note to the posted version. After acceptance, you can post the accepted paper on your home page but not on a server like arxiv.org unless you first obtain ACM permission. In the case of Cornell University employees, an author-prepared version of the paper can be posted to arxiv.org even after acceptance because ACM allows posting on a "publicly accessible server of their employer". For Cornell employees, arxiv.org counts as an employer's server.
There is a potential difficulty with ACM conferences that use blind reviewing. See further remarks below.
SIAM. The SIAM copyright transfer agreement allows you to post the preprint version anywhere including arxiv.org, and allows you to post the final version on your personal home page but not on a server like arxiv.org.
IEEE. The relevant IEEE policy states that you can post the preprint version of the paper anywhere including arxiv.org, but, upon acceptance, you are required to replace it with the accepted version that includes the IEEE copyright notice.
Elsevier. The relevant Elsevier policy states that you can post a preprint version of the paper anywhere including arxiv.org, and you can post the final version on your home page (but not arxiv.org). You are allowed to post revised versions made during the refereeing process on an employer website. For Cornell University employees, this includes arxiv.org.
Springer. The relevant Springer policy states that you can post a preprint version of the paper anywhere including arxiv.org but not the final version. Springer also has a program called Open Choice in which you pay Springer $3000, and in return, they will make the final PDF version of your paper available on their server for free for anyone with web access.
Wiley. I have not been able to figure out the Wiley copyright transfer policy. I sent email to Wiley in January, 2005 to request clarification but have not yet received a definitive answer.
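The publisher policies quoted above can be condensed into a small lookup table. The sketch below records only what the quoted Cornell text says (which dates from around 2005 and may no longer match current publisher policies); the names `ArxivPolicy`, `POLICIES`, and `can_post` are hypothetical, and `None` marks a policy the quoted text leaves undetermined.

```python
from typing import NamedTuple, Optional


class ArxivPolicy(NamedTuple):
    """What the quoted text says about posting a paper on arxiv.org."""
    preprint_ok: Optional[bool]   # preprint version may be posted on arxiv.org
    final_ok: Optional[bool]      # accepted/final version may be posted there
    note: str


# Summary of the quoted policies only; not an authoritative or current list.
POLICIES = {
    "ACM": ArxivPolicy(True, False,
                       "after acceptance needs ACM permission, unless "
                       "arxiv.org counts as the employer's server"),
    "SIAM": ArxivPolicy(True, False, "final version on personal home page only"),
    "IEEE": ArxivPolicy(True, True,
                        "preprint must be replaced by the accepted version "
                        "carrying the IEEE copyright notice"),
    "Elsevier": ArxivPolicy(True, False,
                            "final version on home page only; refereeing-stage "
                            "revisions allowed on an employer website"),
    "Springer": ArxivPolicy(True, False, "paid Open Choice option for the final PDF"),
    "Wiley": ArxivPolicy(None, None, "policy not determined in the quoted text"),
}


def can_post(publisher: str, version: str) -> Optional[bool]:
    """Return True/False per the quoted text, or None if it is unknown.

    version must be "preprint" or "final"; an unlisted publisher raises KeyError.
    """
    policy = POLICIES[publisher]
    if version == "preprint":
        return policy.preprint_ok
    if version == "final":
        return policy.final_ok
    raise ValueError("version must be 'preprint' or 'final'")
```

Keeping the free-text `note` next to each boolean matters here, because several policies (ACM and Elsevier in particular) carry exceptions that a plain yes/no would hide.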
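SimuWorld wrote this article in the hope that researchers in China will make good use of arXiv as a literature resource. We also greatly admire this fresh spirit of "openness", because sharing knowledge greatly advances human progress, and scientific research in particular depends on communication. Chinese idioms have long expressed how important this is: we hope nobody will "build a cart behind closed doors" (闭门造车) or "stand still and refuse to move forward" (固步自封).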
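Statement from SimuWorld.linkka.com: this article was originally edited and written by this site. Without the administrator's permission, no individual or organization may use it for commercial purposes. For non-profit quotation and reposting, please be sure to cite this site's address. Thank you for your cooperation!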