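The official wiki gives a brief introduction to setting up OpenGrok; see https://github.com/oracle/opengrok/wiki/How-to-setup-OpenGrok
In practice I still ran into a few small problems, so I am writing them down to save others from hitting the same pitfalls.
This article uses Ubuntu 22.10, the latest release at the time of writing, as the example (in my testing, setting up an AOSP build environment on Ubuntu 22.10 worked without any problems).
OpenGrok needs Tomcat 10.x, but the apt repositories ship 9.x, so Tomcat has to be installed manually.
For the manual installation steps and a helper script, see https://github.com/lashwang2022/tomcat-installation-ubuntu/blob/main/install-tomcat-ubuntu.sh
The command given in the official wiki is: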
opengrok-deploy -c /opengrok/etc/configuration.xml \
/opengrok/dist/lib/source.war /var/lib/tomcat8/webapps
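Pay attention to the Python version: opengrok-deploy must be run under Python 3. You can create a virtual Python environment with venv.
The Tomcat path must be changed to the actual installation path (the official command still points at /var/lib/tomcat8/webapps).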
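The two notes above can be combined into a small shell sketch. It assumes the OpenGrok distribution is unpacked under /opengrok (as in the official command), that the Python tools archive sits at dist/tools/opengrok-tools.tar.gz inside it, and that Tomcat 10 was installed manually under /opt/tomcat. The function name deploy_opengrok and all default paths are illustrative and should be adjusted to your machine.

```shell
#!/bin/sh
# Sketch: create a Python 3 virtual environment, install the OpenGrok tools
# into it, and deploy source.war into Tomcat's webapps directory.
# All default paths are assumptions; override them via environment variables.
OPENGROK_BASE="${OPENGROK_BASE:-/opengrok}"
TOMCAT_HOME="${TOMCAT_HOME:-/opt/tomcat}"   # wherever Tomcat 10 was installed
VENV_DIR="${VENV_DIR:-$OPENGROK_BASE/venv}"

deploy_opengrok() {
    # Create the virtual environment with the standard-library venv module.
    python3 -m venv "$VENV_DIR" || return 1
    # Install the tools archive shipped with the OpenGrok distribution
    # (assumed location: dist/tools/opengrok-tools.tar.gz).
    "$VENV_DIR/bin/python" -m pip install \
        "$OPENGROK_BASE/dist/tools/opengrok-tools.tar.gz" || return 1
    # Same command as the official one, but with the real Tomcat path.
    "$VENV_DIR/bin/opengrok-deploy" -c "$OPENGROK_BASE/etc/configuration.xml" \
        "$OPENGROK_BASE/dist/lib/source.war" "$TOMCAT_HOME/webapps"
}
```

After sourcing the file, calling deploy_opengrok runs the three steps in order. Invoking opengrok-deploy through the virtual environment's bin directory avoids having to activate the environment first.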
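You can symlink the source trees you want indexed into the src directory of the OpenGrok installation instead of copying them there. Symlinks also make it easy to add and remove projects.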
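A minimal sketch of the symlink approach follows. The source root /opengrok/src and the helper names add_project and remove_project are assumptions for illustration, not part of OpenGrok.

```shell
#!/bin/sh
# Sketch: expose a source checkout to OpenGrok through a symlink.
# SRC_ROOT is an assumed location for OpenGrok's source root.
SRC_ROOT="${SRC_ROOT:-/opengrok/src}"

# add_project NAME /path/to/checkout : link the checkout into the source root.
add_project() {
    mkdir -p "$SRC_ROOT" || return 1
    # -s: symbolic link, -f: replace an existing link,
    # -n: do not descend into an existing link that points at a directory.
    ln -sfn "$2" "$SRC_ROOT/$1"
}

# remove_project NAME : drop the link only; the checkout itself is untouched.
remove_project() {
    if [ -L "$SRC_ROOT/$1" ]; then
        rm "$SRC_ROOT/$1"
    fi
}
```

For example, add_project aosp /data/aosp (a hypothetical checkout path) makes the tree visible as the project aosp. According to the indexer's help text, only symlinks directly under the source root are followed by default, which is exactly where these links are created.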
The core of OpenGrok indexing is opengrok.jar. Run "java -jar /opengrok/dist/lib/opengrok.jar -h" to see the supported options:
simon@simon-ubuntu-server:~$ java -jar /opt/opengrok/dist/lib/opengrok.jar -h
Jan 19, 2023 5:05:40 PM org.opengrok.indexer.index.Indexer parseOptions
INFO: Indexer options: [-h]
Usage: java -jar opengrok.jar [options] [subDir1 [...]]
-h, -?, --help [mode]
With no mode specified, display this usage summary. Or specify a mode:
config - display configuration.xml examples.
ctags - display ctags command-line.
guru - display AnalyzerGuru details.
repos - display enabled repositories.
--annotationCache on|off
Annotation cache provides speedup when getting annotation
for files in the webapp at the cost of significantly increased
indexing time (multiple times slower) and slightly increased
disk space (comparable to history cache size).
Can be enabled per project.
--apiTimeout number
Set timeout for asynchronous API requests.
--connectTimeout number
Set connect timeout. Used for API requests.
-A, --analyzer (.ext|prefix.):(-|analyzer)
Associates files with the specified prefix or extension (case-
insensitive) to be analyzed with the given analyzer, where 'analyzer'
may be specified using a class name (case-sensitive e.g. RubyAnalyzer)
or analyzer language name (case-sensitive e.g. C). Option may be
repeated.
Ex: -A .foo:CAnalyzer
will use the C analyzer for all files ending with .FOO
Ex: -A bar.:Perl
will use the Perl analyzer for all files starting with
"BAR" (no full-stop)
Ex: -A .c:-
will disable specialized analyzers for all files ending with .c
-c, --ctags /path/to/ctags
Path to Universal Ctags. Default is ctags in environment PATH.
--canonicalRoot /path/
Allow symlinks to canonical targets starting with the specified root
without otherwise needing to specify -N,--symlink for such symlinks. A
canonical root must end with a file separator. For security, a canonical
root cannot be the root directory. Option may be repeated.
--checkIndex
Check index, exit with 0 on success,
with 1 on failure.
-d, --dataRoot /path/to/data/root
The directory where OpenGrok stores the generated data.
--depth number
Scanning depth for repositories in directory structure relative to
source root. Default is 3.
--disableRepository type_name
Disables operation of an OpenGrok-supported repository. See also
-h,--help repos. Option may be repeated.
Ex: --disableRepository git
will disable the GitRepository
Ex: --disableRepository MercurialRepository
-e, --economical
To consume less disk space, OpenGrok will not generate and save
hypertext cross-reference files but will generate on demand, which could
be slightly slow.
-G, --assignTags
Assign commit tags to all entries in history for all repositories.
-H
Enable history.
--historyBased on|off
If history based reindex is in effect, the set of files
changed/deleted since the last reindex is determined from history
of the repositories. This needs history, history cache and
projects to be enabled. This should be much faster than the
classic way of traversing the directory structure.
The default is on. If you need to e.g. index files untracked by
SCM, set this to off. Currently works only for Git.
All repositories in a project need to support this in order
to be indexed using history.
--historyThreads number
The number of threads to use for history cache generation on repository level.
By default the number of threads will be set to the number of available CPUs.
Assumes -H/--history.
--historyFileThreads number
The number of threads to use for history cache generation
when dealing with individual files.
By default the number of threads will be set to the number of available CPUs.
Assumes -H/--history.
-I, --include pattern
Only files matching this pattern will be examined. Pattern supports
wildcards (example: -I '*.java' -I '*.c'). Option may be repeated.
-i, --ignore pattern
Ignore matching files (prefixed with 'f:' or no prefix) or directories
(prefixed with 'd:'). Pattern supports wildcards (example: -i '*.so'
-i d:'test*'). Option may be repeated.
-l, --lock on|off|simple|native
Set OpenGrok/Lucene locking mode of the Lucene database during index
generation. "on" is an alias for "simple". Default is off.
--leadingWildCards on|off
Allow or disallow leading wildcards in a search. Default is on.
-m, --memory number
Amount of memory (MB) that may be used for buffering added documents and
deletions before they are flushed to the directory (default 16.0).
Please increase JVM heap accordingly too.
--mandoc /path/to/mandoc
Path to mandoc(1) binary.
-N, --symlink /path/to/symlink
Allow the symlink to be followed. Other symlinks targeting the same
canonical target or canonical children will be allowed too. Option may
be repeated. (By default only symlinks directly under the source root
directory are allowed. See also --canonicalRoot)
-n, --noIndex
Do not generate indexes and other data (such as history cache and xref
files), but process all other command line options.
--nestingMaximum number
Maximum depth of nested repositories. Default is 1.
--reduceSegmentCount
Reduce the number of segments in each index database to 1. This might
(or might not) bring some improved performance. Anyhow, this operation
takes non-trivial time to complete.
-o, --ctagOpts path
File with extra command line options for ctags.
-P, --projects
Generate a project for each top-level directory in source root.
-p, --defaultProject path/to/default/project
Path (relative to the source root) to a project that should be selected
by default in the web application (when no other project is set either
in a cookie or in parameter). Option may be repeated to specify several
projects. Use the special value __all__ to indicate all projects.
--profiler
Pause to await profiler or debugger.
--progress
Print per-project percentage progress information.
-Q, --quickScan on|off
Turn on/off quick context scan. By default, only the first 1024KB of a
file is scanned, and a link ('[..all..]') is inserted when the file is
bigger. Activating this may slow the server down. (Note: this setting
only affects the web application.) Default is on.
-q, --quiet
Run as quietly as possible. Sets logging level to WARNING.
-R /path/to/configuration
Read configuration from the specified file.
-r, --remote on|off|uionly|dirbased
Specify support for remote SCM systems.
on - allow retrieval for remote SCM systems.
off - ignore SCM for remote systems.
uionly - support remote SCM for user interface only.
dirbased - allow retrieval during history index only for repositories
which allow getting history for directories.
--renamedHistory on|off
Enable or disable generating history for renamed files.
If set to on, makes history indexing slower for repositories
with lots of renamed files. Default is off.
--repository [path/to/repository|@file_with_paths]
Path (relative to the source root) to a repository for generating
history (if -H,--history is on). By default all discovered repositories
are history-eligible; using --repository limits to only those specified.
File containing paths can be specified via @path syntax.
Option may be repeated.
-S, --search [path/to/repository|@file_with_paths]
Search for source repositories under source root (-s,--source),
and add them. Path (relative to the source root) is optional.
File containing the paths can be specified via @path syntax.
Option may be repeated.
-s, --source /path/to/source/root
The root directory of the source tree.
--style path
Path to the subdirectory in the web application containing the requested
stylesheet. The factory-setting is: "default".
-T, --threads number
The number of threads to use for index generation, repository scan
and repository invalidation.
By default the number of threads will be set to the number of available
CPUs. This influences the number of spawned ctags processes as well.
-t, --tabSize number
Default tab size to use (number of spaces per tab character).
--token string|@file_with_string
Authorization bearer API token to use when making API calls
to the web application
-U, --uri SCHEME://webappURI:port/contextPath
Send the current configuration to the specified web application.
--updateConfig
Populate the web application with a bare configuration, and exit.
--userPage URL
Base URL of the user Information provider.
Example: "https://www.example.org/viewProfile.jspa?username=".
Use "none" to disable link.
--userPageSuffix URL-suffix
URL Suffix for the user Information provider. Default: "".
-V, --version
Print version, and quit.
-v, --verbose
Set logging level to INFO.
-W, --writeConfig /path/to/configuration
Write the current configuration to the specified file (so that the web
application can use the same configuration).
--webappCtags on|off
Web application should run ctags when necessary. Default is off.
Ignore patterns (-i) for directories that do not need to be indexed in an AOSP tree:
d:out
d:prebuilts
d:cts
d:platform_testing
d:autotest
d:*old_codebase*
d:toolchain
d:rockdev
d:pdk
d:.repo
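As a sketch of how these patterns might be passed to the indexer, the function below repeats -i once per directory and uses only options that appear in the help output above (-s, -d, -H, -P, -S, -i, -W, -U). The base directory /opengrok, its src, data and etc subdirectories, the web application URL and the name run_indexer are assumptions; the transcript above, for instance, uses /opt/opengrok instead.

```shell
#!/bin/sh
# Sketch: index the source root while skipping the directories listed above.
# OPENGROK_BASE and WEBAPP_URI are assumed values; adjust them as needed.
OPENGROK_BASE="${OPENGROK_BASE:-/opengrok}"
WEBAPP_URI="${WEBAPP_URI:-http://localhost:8080/source}"

run_indexer() {
    # -H enables history, -P creates one project per top-level directory,
    # -S searches for repositories, -W writes the configuration file and
    # -U sends the configuration to the running web application.
    # The wildcard pattern is quoted so the shell does not expand it.
    java -jar "$OPENGROK_BASE/dist/lib/opengrok.jar" \
        -s "$OPENGROK_BASE/src" -d "$OPENGROK_BASE/data" \
        -H -P -S \
        -i d:out -i d:prebuilts -i d:cts -i d:platform_testing \
        -i d:autotest -i 'd:*old_codebase*' -i d:toolchain \
        -i d:rockdev -i d:pdk -i d:.repo \
        -W "$OPENGROK_BASE/etc/configuration.xml" \
        -U "$WEBAPP_URI"
}
```

The d: prefix limits each pattern to directories, as described under -i in the help output.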
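Periodically updating the source and the index
The indexing command can be added to crontab as a scheduled job so that the index is updated automatically.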
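As a rough sketch of such a scheduled job, the helper below prints a crontab entry that runs an indexing script every night at 03:00. The schedule, the script path and the log path are placeholders, and cron_entry is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: build a crontab entry that re-runs an index script every night.
# Usage: cron_entry /path/to/reindex.sh /path/to/logfile
cron_entry() {
    # Fields: minute hour day-of-month month day-of-week command
    printf '0 3 * * * %s >> %s 2>&1\n' "$1" "$2"
}

# To append the entry to the current user's crontab (paths are placeholders):
#   ( crontab -l 2>/dev/null; cron_entry "$HOME/bin/reindex.sh" "$HOME/reindex.log" ) | crontab -
```

The script the entry points to would typically update the sources first (for AOSP, for example with repo sync) and then run the indexing command.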
Source storage space
A single AOSP project alone takes up several hundred GB