LibKGE is a PyTorch-based library for efficient training, evaluation, and hyperparameter optimization of knowledge graph embeddings (KGE). It is highly configurable, easy to use, and extensible. Other KGE frameworks are listed below.
The key goal of LibKGE is to foster reproducible research into (as well as meaningful comparisons between) KGE models and training methods. As we argue in our ICLR 2020 paper (see video), the choice of training strategy and hyperparameters strongly influences model performance, often more so than the model class itself. LibKGE aims to provide clean implementations of training, hyperparameter optimization, and evaluation strategies that can be used with any model. Every potential knob or heuristic implemented in the framework is exposed explicitly via well-documented configuration files (e.g., see here and here). LibKGE also provides the most common KGE models, and new ones can be easily added (contributions welcome!).
For link prediction tasks, rule-based systems such as AnyBURL are a competitive alternative to KGE.
UPDATE: LibKGE now includes GraSH, an efficient multi-fidelity hyperparameter optimization algorithm for large-scale KGE models. See here for an example of how to use it.
# retrieve and install project in development mode
git clone https://github.com/uma-pi1/kge.git
cd kge
pip install -e .
# download and preprocess datasets
cd data
sh download_all.sh
cd ..
# train an example model on toy dataset (you can omit '--job.device cpu' when you have a gpu)
kge start examples/toy-complex-train.yaml --job.device cpu
All jobs can be interrupted (e.g., via Ctrl-C) and resumed at any time. Memory consumption during training can be reduced with train.subbatch_auto_tune.
We list some example results (filtered MRR and HITS@k on test data) obtained with
LibKGE below. These results are obtained by running automatic hyperparameter
search as described here.
These results are not necessarily the best results that can be achieved using LibKGE,
but they are comparable in that a common experimental setup (and equal amount of work)
has been used for hyperparameter optimization for each model. Since we use filtered MRR
for model selection, our results may not be indicative of the achievable model performance
for other validation metrics (such as HITS@10, which has been used for model selection
elsewhere).
We report performance numbers on the entire test set, including the
triples that contain entities not seen during training. This is not done
consistently throughout existing KGE implementations: some frameworks remove
unseen entities from the test set, which leads to a perceived increase in
performance (e.g., roughly add +3pp to our WN18RR MRR numbers for this method of
evaluation).
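As a rough illustration of the metrics reported here, the following sketch computes a filtered rank, MRR, and Hits@k in plain Python. The function names and the toy scores are assumptions made for this example and are not part of LibKGE's API. Ties are simply ignored, which may differ from how LibKGE handles them.

```python
def filtered_rank(scores, target, known_true):
    """Rank of `target` after dropping other known-true candidates.

    scores: dict mapping candidate entity -> model score (higher is better)
    known_true: other entities that also form a true triple (filtered out)
    """
    target_score = scores[target]
    better = sum(
        1
        for candidate, score in scores.items()
        if candidate != target
        and candidate not in known_true
        and score > target_score
    )
    return better + 1


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all evaluated triples."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of evaluated triples ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Toy scores for one query (s, p, ?): "c" is the correct answer and
# "a" is another correct answer already known from the data.
scores = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.1}
raw = filtered_rank(scores, "c", known_true=set())       # nothing filtered
filtered = filtered_rank(scores, "c", known_true={"a"})  # "a" is filtered out

ranks = [1, filtered, 4]  # hypothetical filtered ranks of three test triples
print(raw, filtered)
print(round(mean_reciprocal_rank(ranks), 3), round(hits_at_k(ranks, 3), 3))
```

Filtering can only improve a rank, which is why filtered metrics are never lower than their raw counterparts.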
We also provide pretrained models for these results. Each pretrained model is
given in the form of a LibKGE checkpoint, which contains the model as well as
additional information (such as the configuration being used). See the
documentation below on how to use checkpoints.

FB15K-237 (Freebase)

Model | MRR | Hits@1 | Hits@3 | Hits@10 | Config file | Pretrained model |
---|---|---|---|---|---|---|
RESCAL | 0.356 | 0.263 | 0.393 | 0.541 | config.yaml | 1vsAll-kl |
TransE | 0.313 | 0.221 | 0.347 | 0.497 | config.yaml | NegSamp-kl |
DistMult | 0.343 | 0.250 | 0.378 | 0.531 | config.yaml | NegSamp-kl |
ComplEx | 0.348 | 0.253 | 0.384 | 0.536 | config.yaml | NegSamp-kl |
ConvE | 0.339 | 0.248 | 0.369 | 0.521 | config.yaml | 1vsAll-kl |
RotatE | 0.333 | 0.240 | 0.368 | 0.522 | config.yaml | NegSamp-bce |

WN18RR (WordNet)

Model | MRR | Hits@1 | Hits@3 | Hits@10 | Config file | Pretrained model |
---|---|---|---|---|---|---|
RESCAL | 0.467 | 0.439 | 0.480 | 0.517 | config.yaml | KvsAll-kl |
TransE | 0.228 | 0.053 | 0.368 | 0.520 | config.yaml | NegSamp-kl |
DistMult | 0.452 | 0.413 | 0.466 | 0.530 | config.yaml | KvsAll-kl |
ComplEx | 0.475 | 0.438 | 0.490 | 0.547 | config.yaml | 1vsAll-kl |
ConvE | 0.442 | 0.411 | 0.451 | 0.504 | config.yaml | KvsAll-kl |
RotatE | 0.478 | 0.439 | 0.494 | 0.553 | config.yaml | NegSamp-bce |

FB15K (Freebase)

Model | MRR | Hits@1 | Hits@3 | Hits@10 | Config file | Pretrained model |
---|---|---|---|---|---|---|
RESCAL | 0.644 | 0.544 | 0.708 | 0.824 | config.yaml | NegSamp-kl |
TransE | 0.676 | 0.542 | 0.787 | 0.875 | config.yaml | NegSamp-bce |
DistMult | 0.841 | 0.806 | 0.863 | 0.903 | config.yaml | 1vsAll-kl |
ComplEx | 0.838 | 0.807 | 0.856 | 0.893 | config.yaml | 1vsAll-kl |
ConvE | 0.825 | 0.781 | 0.855 | 0.896 | config.yaml | KvsAll-bce |
RotatE | 0.783 | 0.727 | 0.820 | 0.877 | config.yaml | NegSamp-kl |

WN18 (WordNet)

Model | MRR | Hits@1 | Hits@3 | Hits@10 | Config file | Pretrained model |
---|---|---|---|---|---|---|
RESCAL | 0.948 | 0.943 | 0.951 | 0.956 | config.yaml | 1vsAll-kl |
TransE | 0.553 | 0.315 | 0.764 | 0.924 | config.yaml | NegSamp-bce |
DistMult | 0.941 | 0.932 | 0.948 | 0.954 | config.yaml | 1vsAll-kl |
ComplEx | 0.951 | 0.947 | 0.953 | 0.958 | config.yaml | KvsAll-kl |
ConvE | 0.947 | 0.943 | 0.949 | 0.953 | config.yaml | 1vsAll-kl |
RotatE | 0.946 | 0.943 | 0.948 | 0.953 | config.yaml | NegSamp-kl |
LibKGE supports large datasets such as Yago3-10 (123k entities) and Wikidata5M (4.8M entities).
The results given below were found by automatic hyperparameter search with a search
space similar to the one above, but with some values fixed (training with shared negative sampling,
embedding dimension: 128, batch size: 1024, optimizer: Adagrad,
regularization: weighted). The Yago3-10 result was obtained by training 30 pseudo-random configurations for
20 epochs, and then rerunning the configuration that performed best on validation
data for 400 epochs.
Model | MRR | Hits@1 | Hits@3 | Hits@10 | Config file | Pretrained model |
---|---|---|---|---|---|---|
ComplEx | 0.551 | 0.476 | 0.596 | 0.682 | config.yaml | NegSamp-kl |
We report two results for Wikidata5m.
The first result was found by the same automatic hyperparameter search as described for
Yago3-10, but we limited the final training to 200 epochs. The second result was
obtained with significantly less resource consumption by using
the multi-fidelity GraSH search.
Model | Search + budget | Final training | MRR | Hits@1 | Hits@3 | Hits@10 | Config file | Pretrained model |
---|---|---|---|---|---|---|---|---|
ComplEx | Random, 600 epochs | 200 epochs | 0.301 | 0.245 | 0.331 | 0.397 | config.yaml | NegSamp-kl |
ComplEx | GraSH, 192 epochs | 64 epochs | 0.300 | 0.247 | 0.328 | 0.390 | config.yaml | - |
GraSH was also applied to Freebase, one of the largest benchmark datasets, which contains 86M entities.
The reported results were obtained by combining GraSH with distributed training implemented in
Dist-KGE.
The respective config files can be found in the GraSH repository as their execution is not yet supported in LibKGE.
Model | MRR | Hits@1 | Hits@3 | Hits@10 |
---|---|---|---|---|
ComplEx | 0.594 | 0.511 | 0.667 | 0.726 |
RotatE | 0.613 | 0.578 | 0.637 | 0.669 |
TransE | 0.553 | 0.520 | 0.571 | 0.614 |
CoDEx is a Wikidata-based KG completion
benchmark. The results here have been obtained using the automatic
hyperparameter search used for the Freebase and WordNet datasets, but with fewer
epochs and Ax trials for CoDEx-M and CoDEx-L. See the CoDEx
paper (EMNLP 2020) for details.

CoDEx-S

Model | MRR | Hits@1 | Hits@3 | Hits@10 | Config file | Pretrained model |
---|---|---|---|---|---|---|
RESCAL | 0.404 | 0.293 | 0.4494 | 0.623 | config.yaml | 1vsAll-kl |
TransE | 0.354 | 0.219 | 0.4218 | 0.634 | config.yaml | NegSamp-kl |
ComplEx | 0.465 | 0.372 | 0.5038 | 0.646 | config.yaml | 1vsAll-kl |
ConvE | 0.444 | 0.343 | 0.4926 | 0.635 | config.yaml | 1vsAll-kl |
TuckER | 0.444 | 0.339 | 0.4975 | 0.638 | config.yaml | KvsAll-kl |

CoDEx-M

Model | MRR | Hits@1 | Hits@3 | Hits@10 | Config file | Pretrained model |
---|---|---|---|---|---|---|
RESCAL | 0.317 | 0.244 | 0.3477 | 0.456 | config.yaml | 1vsAll-kl |
TransE | 0.303 | 0.223 | 0.3363 | 0.454 | config.yaml | NegSamp-kl |
ComplEx | 0.337 | 0.262 | 0.3701 | 0.476 | config.yaml | KvsAll-kl |
ConvE | 0.318 | 0.239 | 0.3551 | 0.464 | config.yaml | NegSamp-kl |
TuckER | 0.328 | 0.259 | 0.3599 | 0.458 | config.yaml | KvsAll-kl |

CoDEx-L

Model | MRR | Hits@1 | Hits@3 | Hits@10 | Config file | Pretrained model |
---|---|---|---|---|---|---|
RESCAL | 0.304 | 0.242 | 0.3313 | 0.419 | config.yaml | 1vsAll-kl |
TransE | 0.187 | 0.116 | 0.2188 | 0.317 | config.yaml | NegSamp-kl |
ComplEx | 0.294 | 0.237 | 0.3179 | 0.400 | config.yaml | 1vsAll-kl |
ConvE | 0.303 | 0.240 | 0.3298 | 0.420 | config.yaml | 1vsAll-kl |
TuckER | 0.309 | 0.244 | 0.3395 | 0.430 | config.yaml | KvsAll-kl |
LibKGE supports training, evaluation, and hyperparameter tuning of KGE models.
The settings for each task can be specified with a configuration file in YAML
format or on the command line. The default values and usage for available
settings can be found in config-default.yaml as well
as the model- and embedder-specific configuration files (such as
lookup_embedder.yaml).
First create a configuration file such as:
job.type: train
dataset.name: fb15k-237
train:
  optimizer: Adagrad
  optimizer_args:
    lr: 0.2
valid:
  every: 5
  metric: mean_reciprocal_rank_filtered
model: complex
lookup_embedder:
  dim: 100
  regularize_weight: 0.8e-7
To begin training, run one of the following:
# Store the file as `config.yaml` in a new folder of your choice. Then initiate or resume
# the training job using:
kge resume <folder>
# Alternatively, store the configuration anywhere and use the start command
# to create a new folder
# /local/experiments/-
# with that config and start training there.
kge start <config-file>
# In both cases, configuration options can be modified on the command line, too: e.g.,
kge start <config-file> --job.device cuda:0 --train.optimizer Adam
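The override mechanism can be pictured as setting dotted keys in a nested configuration. The sketch below is only an illustration in plain Python: `set_option` and `apply_overrides` are hypothetical helpers, not LibKGE functions, and LibKGE's own option handling is more involved.

```python
def set_option(config, dotted_key, value):
    """Set `value` at the nested position named by `dotted_key`."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        node = node.setdefault(name, {})
    node[leaf] = value


def apply_overrides(config, argv):
    """Apply command-line style overrides of the form `--a.b value`."""
    if len(argv) % 2 != 0:
        raise ValueError("every option needs a value")
    for flag, value in zip(argv[0::2], argv[1::2]):
        if not flag.startswith("--"):
            raise ValueError(f"expected an option, got {flag!r}")
        set_option(config, flag[2:], value)
    return config


# Hypothetical starting configuration (values chosen for illustration only).
config = {
    "job": {"type": "train", "device": "cuda"},
    "train": {"optimizer": "Adagrad"},
}
apply_overrides(config, ["--job.device", "cuda:0", "--train.optimizer", "Adam"])
print(config)
```

Options that are not overridden keep the value from the configuration file.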
Various checkpoints (including model parameters and configuration options) will
be created during training. These checkpoints can be used to resume training (or any other job type such as hyperparameter search jobs).
All of LibKGE’s jobs can be interrupted (e.g., via Ctrl-C) and resumed (from one of its checkpoints). To resume a job, use:
kge resume <folder>
# Change the device when resuming
kge resume <folder> --job.device cuda:1
By default, the last checkpoint file is used. A different checkpoint file can be selected using --checkpoint.
To evaluate a trained model, run the following:
# Evaluate a model on the validation split
kge valid <folder>
# Evaluate a model on the test split
kge test <folder>
By default, the checkpoint file named checkpoint_best.pt (which stores the best validation result so far) is used. A different checkpoint file can be selected using --checkpoint.
LibKGE supports various forms of hyperparameter optimization such as grid search,
random search, Bayesian optimization, or resource-efficient multi-fidelity search.
The search type and search space are specified in the configuration file.
For example, you may use Ax for SOBOL
(pseudo-random) and Bayesian optimization. The following config file defines a
search of 10 SOBOL trials (arms) followed by 20 Bayesian optimization trials:
job.type: search
search.type: ax
dataset.name: wnrr
model: complex
valid.metric: mean_reciprocal_rank_filtered
ax_search:
  num_trials: 30
  num_sobol_trials: 10 # remaining trials are Bayesian
  parameters:
    - name: train.batch_size
      type: choice
      values: [256, 512, 1024]
    - name: train.optimizer_args.lr
      type: range
      bounds: [0.0003, 1.0]
    - name: train.type
      type: fixed
      value: 1vsAll
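To make the three parameter types in this search space concrete, here is a minimal sketch that draws trial configurations from such a specification. It uses Python's uniform `random` module as a stand-in, so it is neither a SOBOL sequence nor the way Ax samples. `sample_trial` is a hypothetical helper.

```python
import random

# The search space from the config above, written as Python data.
parameters = [
    {"name": "train.batch_size", "type": "choice", "values": [256, 512, 1024]},
    {"name": "train.optimizer_args.lr", "type": "range", "bounds": [0.0003, 1.0]},
    {"name": "train.type", "type": "fixed", "value": "1vsAll"},
]


def sample_trial(parameters, rng):
    """Draw one value per parameter according to its type."""
    trial = {}
    for param in parameters:
        if param["type"] == "choice":
            trial[param["name"]] = rng.choice(param["values"])
        elif param["type"] == "range":
            low, high = param["bounds"]
            trial[param["name"]] = rng.uniform(low, high)
        elif param["type"] == "fixed":
            trial[param["name"]] = param["value"]
        else:
            raise ValueError(f"unknown parameter type: {param['type']}")
    return trial


rng = random.Random(0)  # fixed seed so the sketch is repeatable
trials = [sample_trial(parameters, rng) for _ in range(10)]
print(trials[0])
```

Each trial is a flat mapping from configuration keys to values, which is the form in which a search job would hand settings to a training job.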
For large graph datasets such as Wikidata5m, you may use
GraSH, which enables resource-efficient
hyperparameter optimization. Full documentation of the GraSH functionality,
useful search configs, and obtained results can
be found in the accompanying repository.
The following example config defines a
search of 64 randomly generated trials with a search budget equivalent
to only 3 full training runs on the whole dataset:
job.type: search
search.type: grash_search
dataset.name: wikidata5m
model: complex
valid.metric: mean_reciprocal_rank_filtered
grash_search:
  num_trials: 64 # initial number of randomly generated trials
  search_budget: 3 # in terms of full training runs on the whole dataset
  eta: 4 # reduction factor - only keep 1/eta best-performing trials per round
  variant: combined # low-fidelity approximation technique - combined = epoch + graph reduction
  parameters:
    - name: train.batch_size
      type: choice
      values: [256, 512, 1024]
    - name: train.optimizer_args.lr
      type: range
      bounds: [0.0003, 1.0]
    - name: train.type
      type: fixed
      value: 1vsAll
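The interplay of num_trials, eta, and search_budget can be sketched as a successive-halving schedule. The code below assumes that the total budget is split evenly across rounds and that each round keeps the best 1/eta of its trials. This is an assumption for illustration; the GraSH repository remains the authoritative description, and `successive_halving_schedule` is a hypothetical helper.

```python
def successive_halving_schedule(num_trials, eta, search_budget):
    """Return (trials in round, budget per trial) for each round.

    Assumption: the total budget is split evenly across rounds, and each
    round keeps the best 1/eta of its trials.
    """
    if num_trials < 2 or eta < 2:
        raise ValueError("need at least 2 trials and eta >= 2")
    trials_per_round = []
    remaining = num_trials
    while remaining > 1:
        trials_per_round.append(remaining)
        remaining = max(1, remaining // eta)
    budget_per_round = search_budget / len(trials_per_round)
    return [(n, budget_per_round / n) for n in trials_per_round]


schedule = successive_halving_schedule(num_trials=64, eta=4, search_budget=3)
for round_no, (n, budget) in enumerate(schedule, start=1):
    print(f"round {round_no}: {n} trials, {budget:.4f} full runs each")
```

Under these assumptions, 64 trials shrink to 16 and then to 4, and every surviving trial receives a larger share of a full training run in each round. If one full training run is taken to be 64 epochs, a budget of 3 amounts to 3 × 64 = 192 epochs.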
Trials can be run in parallel across several devices:
# Run 4 trials in parallel evenly distributed across two GPUs
kge resume <folder> --search.device_pool cuda:0,cuda:1 --search.num_workers 4
# Run 3 trials in parallel, with per-GPU capacity (one on cuda:0, two on cuda:1)
kge resume <folder> --search.device_pool cuda:0,cuda:1,cuda:1 --search.num_workers 3
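One way to picture how a device pool maps to workers is a round-robin assignment, sketched below. This is an assumption made for illustration; LibKGE's actual scheduling may hand out devices differently, and `assign_workers` is a hypothetical helper.

```python
from collections import Counter
from itertools import cycle, islice


def assign_workers(device_pool, num_workers):
    """Assign each worker a device by cycling through the pool."""
    if not device_pool:
        raise ValueError("device pool must not be empty")
    return list(islice(cycle(device_pool), num_workers))


# Two GPUs, four workers: two workers per GPU.
even = assign_workers(["cuda:0", "cuda:1"], 4)
print(even)

# Listing a device twice gives it twice the capacity.
weighted = Counter(assign_workers(["cuda:0", "cuda:1", "cuda:1"], 3))
print(weighted)
```

Repeating a device name in the pool is what expresses per-GPU capacity in the second command above.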
Extensive logs are stored as YAML files (hyperparameter search, training,
validation). LibKGE provides convenience methods to export the log data to
CSV.
kge dump trace <folder>
The command above yields CSV output such as this output for a training
job or this output for a search
job.
Additional configuration options or metrics can be added to the CSV files as
needed (using a keys file).
Information about a checkpoint (such as the configuration that was used,
training loss, validation metrics, or explored hyperparameter configurations)
can also be exported from the command line (as YAML):
kge dump checkpoint <checkpoint>
Configuration files can also be dumped in various formats.
# dump just the configuration options that are different from the default values
kge dump config <config-or-folder-or-checkpoint>
# dump the configuration as is
kge dump config <config-or-folder-or-checkpoint> --raw
# dump the expanded config including all configuration keys
kge dump config <config-or-folder-or-checkpoint> --full
# help on all commands
kge --help
# help on a specific command
kge dump --help
Using a model trained with LibKGE is straightforward. In the following
example, we load a checkpoint and predict the most suitable object for two
subject-relation pairs: (‘Dominican Republic’, ‘has form of government’, ?) and
(‘Mighty Morphin Power Rangers’, ‘is tv show with actor’, ?).
import torch
from kge.model import KgeModel
from kge.util.io import load_checkpoint
# download link for this checkpoint given under results above
checkpoint = load_checkpoint('fb15k-237-rescal.pt')
model = KgeModel.create_from(checkpoint)
s = torch.tensor([0, 2]) # subject indexes
p = torch.tensor([0, 1]) # relation indexes
scores = model.score_sp(s, p) # scores of all objects for (s,p,?)
o = torch.argmax(scores, dim=-1) # index of highest-scoring objects
print(o)
print(model.dataset.entity_strings(s)) # convert indexes to mentions
print(model.dataset.relation_strings(p))
print(model.dataset.entity_strings(o))
# Output (slightly revised for readability):
#
# tensor([8399, 8855])
# ['Dominican Republic' 'Mighty Morphin Power Rangers']
# ['has form of government' 'is tv show with actor']
# ['Republic' 'Johnny Yong Bosch']
For other scoring functions (score_sp, score_po, score_so, score_spo), see KgeModel.
To use your own dataset, create a subfolder mydataset (= dataset name) in the data folder. You can use your dataset later by specifying dataset.name: mydataset in your job’s configuration file.
Each dataset is described by a dataset.yaml file, which needs to be stored in the mydataset folder. After performing the quickstart instructions, have a look at the provided toy example under data/toy/dataset.yaml. The configuration keys and file formats are documented here.
Your data can be automatically preprocessed and converted into the format required by LibKGE. Here is the relevant part for the toy dataset:
# download
curl -O http://web.informatik.uni-mannheim.de/pi1/kge-datasets/toy.tar.gz
tar xvf toy.tar.gz
# preprocess
python preprocess/preprocess_default.py toy
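Conceptually, preprocessing turns triples of strings into triples of integer indexes, since the model works on indexes (as in the prediction example above). The sketch below shows that idea in plain Python. It is not LibKGE's preprocessing script: the tab-separated input format and the `index_triples` helper are assumptions for this example.

```python
def index_triples(lines):
    """Map tab-separated (subject, relation, object) strings to integer ids."""
    entity_ids, relation_ids, triples = {}, {}, []
    for line in lines:
        s, p, o = line.rstrip("\n").split("\t")
        # setdefault assigns the next free index on first occurrence
        s_id = entity_ids.setdefault(s, len(entity_ids))
        p_id = relation_ids.setdefault(p, len(relation_ids))
        o_id = entity_ids.setdefault(o, len(entity_ids))
        triples.append((s_id, p_id, o_id))
    return entity_ids, relation_ids, triples


# Hypothetical raw data: one triple per line.
raw_lines = ["a\tr1\tb", "b\tr2\tc", "a\tr1\tc"]
entity_ids, relation_ids, triples = index_triples(raw_lines)
print(entity_ids)
print(relation_ids)
print(triples)
```

Keeping the two mappings around makes it possible to convert predicted indexes back to readable names later.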
LibKGE currently implements the KGE models listed in features.
The examples folder contains some configuration files as examples of how to train these models.
We welcome contributions to expand the list of supported models! Please see CONTRIBUTING for details and feel free to initially open an issue.
LibKGE can be extended with new training, evaluation, or search jobs as well as
new models and embedders.
KGE models implement the KgeModel class and generally consist of a KgeEmbedder
to associate each subject, relation, and object with an embedding and a KgeScorer
to score triples given their embeddings. All these base classes
are defined in kge_model.py.
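The split into embedders and a scorer can be illustrated with a toy model in plain Python. The classes below are hypothetical and only mirror the idea; they are not LibKGE's KgeModel, KgeEmbedder, or KgeScorer interfaces, which operate on tensors. The scorer uses the DistMult score, the sum over dimensions of the product of the three embeddings.

```python
class ToyEmbedder:
    """Maps integer indexes to embedding vectors (plain Python lists here)."""

    def __init__(self, vectors):
        self.vectors = vectors

    def embed(self, index):
        return self.vectors[index]


class ToyDistMultScorer:
    """Scores a triple as sum_i s_i * p_i * o_i (the DistMult score)."""

    def score(self, s_emb, p_emb, o_emb):
        return sum(s * p * o for s, p, o in zip(s_emb, p_emb, o_emb))


class ToyModel:
    """Combines an entity embedder, a relation embedder, and a scorer."""

    def __init__(self, entity_embedder, relation_embedder, scorer):
        self.entities = entity_embedder
        self.relations = relation_embedder
        self.scorer = scorer

    def score_spo(self, s, p, o):
        return self.scorer.score(
            self.entities.embed(s), self.relations.embed(p), self.entities.embed(o)
        )

    def score_sp(self, s, p):
        """Scores of all entities as objects for the query (s, p, ?)."""
        return [self.score_spo(s, p, o) for o in range(len(self.entities.vectors))]


model = ToyModel(
    ToyEmbedder([[1.0, 0.0], [0.0, 1.0], [0.5, 1.0]]),  # three entities
    ToyEmbedder([[2.0, 0.5]]),                          # one relation
    ToyDistMultScorer(),
)
sp_scores = model.score_sp(0, 0)
print(sp_scores)
print(max(range(len(sp_scores)), key=sp_scores.__getitem__))  # best object index
```

Swapping in a different scorer changes the model class without touching the embedders, which is the point of the decomposition.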
KGE jobs perform training, evaluation, and hyper-parameter search. The relevant base classes are Job, TrainingJob, EvaluationJob, and SearchJob.
To add a component, say mycomp (= a model, embedder, or job) with implementation MyClass, you need to:
1. Create a configuration file mycomp.yaml. You may store this file directly in the LibKGE module folders or in your own module folder. If you plan to contribute your code to LibKGE, we suggest developing directly in the LibKGE module folders. If you just want to play around or publish your code separately from LibKGE, use your own module.
2. Define all required options for your component, their default values, and their types in mycomp.yaml. We suggest following LibKGE’s core philosophy and defining every option that can influence the outcome of an experiment in this way. Please pay attention to integer (0) vs. float (0.0) values; e.g., float_option: 0 is incorrect because it is interpreted as an integer.
3. Implement MyClass in a module of your choice. In mycomp.yaml, add key mycomp.class_name with value MyClass. If you follow LibKGE’s directory structure (mycomp.yaml for configuration and mycomp.py for implementation), then ensure that MyClass is imported in __init__.py (e.g., as done here).
4. To use your component in an experiment, register your module via the modules key and its configuration via the import key in the experiment’s configuration file. See config-default.yaml for a description of those keys. For example, in myexp_config.yaml, add:
modules: [ kge.job, kge.model, kge.model.embedder, mymodule ]
import: [ mycomp ]
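The note on integer versus float values above matters because a default value also fixes the type of its option. The sketch below illustrates this with a strict type check against the defaults. Both the check and the option names are assumptions for this example; LibKGE's own validation may be more lenient or differ in detail.

```python
def check_option_types(defaults, overrides):
    """Raise TypeError if an override's type differs from its default's type."""
    for key, value in overrides.items():
        if key not in defaults:
            raise KeyError(f"unknown option: {key}")
        expected = type(defaults[key])
        if type(value) is not expected:
            raise TypeError(
                f"{key}: expected {expected.__name__}, got {type(value).__name__}"
            )


# Hypothetical defaults as they might be declared in mycomp.yaml.
wrong_defaults = {"mycomp.float_option": 0}    # read as an integer
right_defaults = {"mycomp.float_option": 0.0}  # read as a float

check_option_types(right_defaults, {"mycomp.float_option": 0.5})  # accepted

try:
    check_option_types(wrong_defaults, {"mycomp.float_option": 0.5})
except TypeError as error:
    print(error)
```

Declaring the default as 0.0 avoids the mismatch when an experiment later sets a fractional value.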
All configuration options are documented in config-default.yaml as well as in the configuration files for each component listed above.
Command-line options are documented as well: try kge --help. You may also obtain help for subcommands, e.g., try kge dump --help or kge dump trace --help.
If LibKGE runs out of memory during training, set train.subbatch_auto_tune to true (equivalent result, less memory but slower). If it runs out of memory during evaluation, set entity_ranking.chunk_size to, say, 10000 (equivalent result, less memory but slightly slower, the more so the smaller the chunk size).
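Why chunking gives an equivalent result with less memory can be seen in a small sketch: the rank of the correct answer only needs a running count of better-scoring candidates, so the candidates can be scored one chunk at a time. The functions below are hypothetical and use plain Python instead of tensors; `toy_score_chunk` stands in for a model that scores a range of candidate entities on demand.

```python
def rank_in_chunks(score_chunk, num_entities, target_score, chunk_size):
    """Rank of the target, scoring only `chunk_size` candidates at a time."""
    better = 0
    for start in range(0, num_entities, chunk_size):
        end = min(start + chunk_size, num_entities)
        # Only this chunk's scores are held in memory at once.
        better += sum(1 for s in score_chunk(start, end) if s > target_score)
    return 1 + better


def toy_score_chunk(start, end):
    """Stand-in for a model: deterministic scores for entities start..end-1."""
    return [((7 * i) % 11) / 10 for i in range(start, end)]


num_entities = 11
target_score = toy_score_chunk(4, 5)[0]  # score of the correct entity (index 4)

full = rank_in_chunks(toy_score_chunk, num_entities, target_score, num_entities)
chunked = rank_in_chunks(toy_score_chunk, num_entities, target_score, 3)
print(full, chunked)
```

Smaller chunks mean more scoring calls, which is where the slowdown mentioned above comes from.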
Other KGE frameworks:
KGE projects for publications that also implement a few models:
PRs to this list are welcome.
Please cite the following publication to refer to the experimental study about the impact of training methods on KGE performance:
@inproceedings{
ruffinelli2020you,
title={You {CAN} Teach an Old Dog New Tricks! On Training Knowledge Graph Embeddings},
author={Daniel Ruffinelli and Samuel Broscheit and Rainer Gemulla},
booktitle={International Conference on Learning Representations},
year={2020},
url={https://openreview.net/forum?id=BkxSmlBFvr}
}
If you use LibKGE, please cite the following publication:
@inproceedings{
libkge,
title="{L}ib{KGE} - {A} Knowledge Graph Embedding Library for Reproducible Research",
author={Samuel Broscheit and Daniel Ruffinelli and Adrian Kochsiek and Patrick Betz and Rainer Gemulla},
booktitle={Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations},
year={2020},
url={https://www.aclweb.org/anthology/2020.emnlp-demos.22},
pages = "165--174",
}