Can you get the number of lines of code from a GitHub repository?

In a GitHub repository you can see “language statistics”, which display the percentage of the project that's written in each language. They don't, however, show how many lines of code the project consists of. Often, I want to quickly get an impression of the scale and complexity of a project, and the count of lines of code can give a good first impression. 500 lines of code implies a relatively simple project; 100,000 lines of code implies a very large or complicated project.

So, is it possible to get the lines of code written in the various languages from a GitHub repository, preferably without cloning it?


The question “Count number of lines in a git repository” asks how to count the lines of code in a local Git repository, but:

  1. You have to clone the project, which could be massive. Cloning a project like Wine, for example, takes ages.
  2. You would count lines in files that wouldn't necessarily be code, like i18n files.
  3. If you count just (for example) Ruby files, you'd potentially miss massive amounts of code in other languages, like JavaScript. You'd have to know beforehand which languages the project uses. You'd also have to repeat the count for every language the project uses.

All in all, this is potentially far too time-intensive for “quickly checking the scale of a project”.


Reply #1

Reference: https://stackoom.com/question/1on5d/你能从GitHub存储库中获得代码行数吗


Reply #2

You can clone just the latest commit using git clone --depth 1 and then perform your own analysis using Linguist, the same software GitHub uses. That's the only way I know of to get lines of code.

Another option is to use the API to list the languages the project uses. It doesn't give them in lines but in bytes. For example:

$ curl https://api.github.com/repos/evalEmpire/perl5i/languages
{
  "Perl": 274835
}

Take that with a grain of salt, though: that project includes YAML and JSON, which the web site acknowledges but the API does not.
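Since the languages endpoint reports bytes rather than lines, one way to use it is to turn the byte counts into rough percentages, similar in spirit to the language bar on the repository page. The following is a minimal sketch, assuming the response has already been parsed into a plain object; languageShares is a hypothetical helper name used for illustration, not part of any GitHub API.

```javascript
'use strict';

// Hypothetical helper: converts the object returned by
// /repos/OWNER/REPO/languages (language name -> bytes) into
// a list of shares sorted from largest to smallest.
function languageShares(bytesByLanguage) {
  const entries = Object.entries(bytesByLanguage);
  const total = entries.reduce((sum, [, bytes]) => sum + bytes, 0);
  return entries
    .map(([language, bytes]) => ({
      language,
      bytes,
      // Guard against an empty repository (total of 0 bytes).
      percent: total === 0 ? 0 : (bytes / total) * 100,
    }))
    .sort((a, b) => b.bytes - a.bytes);
}

// Example with the response shown above for evalEmpire/perl5i.
console.log(languageShares({ Perl: 274835 }));
```

Keep in mind that these are shares of bytes, not of lines, so they only approximate how the code is distributed across languages.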

Finally, you can use code search to ask which files match a given language. This example asks which files in perl5i are Perl: https://api.github.com/search/code?q=language:perl+repo:evalEmpire/perl5i . It will not give you lines, and you have to ask for the file size separately using the returned url for each file.


Reply #3

Not currently possible on GitHub.com or their APIs

I have talked to customer support and confirmed that this cannot be done on github.com. They have passed the suggestion along to the GitHub team though, so hopefully it will be possible in the future. If so, I'll be sure to edit this answer.

Meanwhile, Rory O'Kane's answer is a brilliant alternative based on cloc and a shallow repo clone.


Reply #4

A shell script, cloc-git

You can use this shell script to count the number of lines in a remote Git repository with one command:

#!/usr/bin/env bash
git clone --depth 1 "$1" temp-linecount-repo &&
  printf "('temp-linecount-repo' will be deleted automatically)\n\n\n" &&
  cloc temp-linecount-repo &&
  rm -rf temp-linecount-repo

Installation

This script requires CLOC (“Count Lines of Code”) to be installed. cloc can probably be installed with your package manager, for example brew install cloc with Homebrew. There is also a Docker image published under mribeiro/cloc.

You can install the script by saving its code to a file cloc-git, running chmod +x cloc-git, and then moving the file to a folder in your $PATH such as /usr/local/bin.

Usage

The script takes one argument, which is any URL that git clone will accept. Examples are https://github.com/evalEmpire/perl5i.git (HTTPS) or git@github.com:evalEmpire/perl5i.git (SSH). You can get this URL from any GitHub project page by clicking “Clone or download”.

Example output:

$ cloc-git https://github.com/evalEmpire/perl5i.git
Cloning into 'temp-linecount-repo'...
remote: Counting objects: 200, done.
remote: Compressing objects: 100% (182/182), done.
remote: Total 200 (delta 13), reused 158 (delta 9), pack-reused 0
Receiving objects: 100% (200/200), 296.52 KiB | 110.00 KiB/s, done.
Resolving deltas: 100% (13/13), done.
Checking connectivity... done.
('temp-linecount-repo' will be deleted automatically)


     171 text files.
     166 unique files.                                          
      17 files ignored.

http://cloc.sourceforge.net v 1.62  T=1.13 s (134.1 files/s, 9764.6 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Perl                           149           2795           1425           6382
JSON                             1              0              0            270
YAML                             2              0              0            198
-------------------------------------------------------------------------------
SUM:                           152           2795           1425           6850
-------------------------------------------------------------------------------

Alternatives

Run the commands manually

If you don't want to bother saving and installing the shell script, you can run the commands manually. An example:

$ git clone --depth 1 https://github.com/evalEmpire/perl5i.git
$ cloc perl5i
$ rm -rf perl5i

Linguist

If you want the results to match GitHub's language percentages exactly, you can try installing Linguist instead of CLOC. According to its README, you need to gem install linguist and then run linguist. I couldn't get it to work (issue #2223).


Reply #5

If the question is "can you quickly get the NUMBER OF LINES of a GitHub repo", the answer is no, as stated by the other answers.

However, if the question is "can you quickly check the SCALE of a project", I usually gauge a project by looking at its size. Of course, the size will include deltas from all active commits, but it is a good metric because the order of magnitude is quite close.

E.g.

How big is the "docker" project?

In your browser, enter api.github.com/repos/ORG_NAME/PROJECT_NAME, i.e. api.github.com/repos/docker/docker

In the response hash, you can find the size attribute:

{
    ...
    size: 161432,
    ...
}

This should give you an idea of the relative scale of the project. The number seems to be in KB, but when I checked it on my computer the checkout was actually smaller, even though the order of magnitude is consistent (161432 KB = 161 MB, while du -s -h docker reported 65 MB).
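To compare repositories at a glance, the size value can be converted into a friendlier unit. This is a minimal sketch, assuming (as this answer does) that the value is in kilobytes and using decimal units so that it matches the arithmetic above; formatRepoSize is a hypothetical name used only for illustration.

```javascript
'use strict';

// Hypothetical helper: formats the "size" attribute of
// api.github.com/repos/OWNER/REPO, assumed here to be in KB.
function formatRepoSize(sizeInKb) {
  const units = ['KB', 'MB', 'GB', 'TB'];
  let value = sizeInKb;
  let unit = 0;
  // Step up one unit per factor of 1000 (decimal units).
  while (value >= 1000 && unit < units.length - 1) {
    value /= 1000;
    unit += 1;
  }
  return `${value.toFixed(1)} ${units[unit]}`;
}

// The docker/docker value quoted above.
console.log(formatRepoSize(161432)); // prints "161.4 MB"
```

Because the reported size covers repository history rather than a single checkout, treat the result as an order-of-magnitude indicator only.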


Reply #6

If you go to the graphs/contributors page, you can see a list of all the contributors to the repo and how many lines they've added and removed.

Unless I'm missing something, subtracting the aggregate number of lines deleted from the aggregate number of lines added among all contributors should yield the total number of lines of code in the repo. (EDIT: it turns out I was missing something after all. Take a look at orbitbot's comment for details.)

UPDATE:

This data is also available in GitHub's API. So I wrote a quick script to fetch the data and do the calculation:

'use strict';

// Replace jquery/jquery with the repo you're interested in.
fetch('https://api.github.com/repos/jquery/jquery/stats/contributors')
    .then(response => response.json())
    .then(contributors => contributors
        .map(contributor => contributor.weeks
            .reduce((lineCount, week) => lineCount + week.a - week.d, 0)))
    .then(lineCounts => lineCounts.reduce((lineTotal, lineCount) => lineTotal + lineCount))
    .then(lines => window.alert(lines));

Just paste it in a Chrome DevTools snippet, change the repo and click run.

Disclaimer (thanks to lovasoa):

Take the results of this method with a grain of salt, because for some repos (sorich87/bootstrap-tour) it results in negative values, which might indicate there's something wrong with the data returned from GitHub's API.

UPDATE:

Looks like this method to calculate total line numbers isn't entirely reliable. Take a look at orbitbot's comment for details.
