diff --git a/README.md b/README.md
index fe27d4e..98c7cac 100644
--- a/README.md
+++ b/README.md
@@ -50,6 +50,7 @@ Can be used to perform:
* flair - Required if you want to use Flair mentions extractor and for TARS linker and TARS Mentions Extractor.
* blink - Required if you want to use Blink for linking to Wikipedia pages.
* gliner - Required if you want to use GLiNER Linker or GLiNER Mentions Extractor.
+* relik - Required if you want to use Relik Linker.
## Installation
@@ -90,7 +91,7 @@ The linguistic approach relies on the idea that mentions will usually be a synta
### Linker
The **linker** will link the detected entities to an existing set of labels. Some of the **linkers**, however, are *end-to-end*, i.e. they don't need the **mentions extractor**, as they detect and link the entities at the same time.
-Again, there are 5 **linkers** available currently, 3 of them are *end-to-end* and 2 are not.
+Again, there are currently 6 **linkers** available; 4 of them are *end-to-end* and 2 are not.
| Linker Name | end-to-end | Source Code | Paper |
|:-----------:|:----------:|----------------------------------------------------------|--------------------------------------------------------------------|
@@ -99,6 +100,7 @@ Again, there are 5 **linkers** available currently, 3 of them are *end-to-end* a
| SMXM | ✓ | [Source Code](https://github.com/Raldir/Zero-shot-NERC) | [Paper](https://aclanthology.org/2021.acl-long.120/) |
| TARS | ✓ | [Source Code](https://github.com/flairNLP/flair) | [Paper](https://kishaloyhalder.github.io/pdfs/tars_coling2020.pdf) |
| GLINER | ✓ | [Source Code](https://github.com/urchade/GLiNER) | [Paper](https://arxiv.org/abs/2311.08526) |
+| RELIK | ✓ | [Source Code](https://github.com/SapienzaNLP/relik) | [Paper](https://arxiv.org/abs/2408.00103) |
### Relations Extractor
The **relations extractor** will extract relations among different entities *previously* extracted by a **linker**.
@@ -241,7 +243,7 @@ from zshot import PipelineConfig
from zshot.linker import LinkerTARS
from zshot.evaluation.dataset import load_ontonotes_zs
from zshot.evaluation.zshot_evaluate import evaluate, prettify_evaluate_report
-from zshot.evaluation.metrics.seqeval.seqeval import Seqeval
+from zshot.evaluation.metrics._seqeval._seqeval import Seqeval
ontonotes_zs = load_ontonotes_zs('validation')
diff --git a/docs/entity_linking.md b/docs/entity_linking.md
index 0eb68d6..668b1c3 100644
--- a/docs/entity_linking.md
+++ b/docs/entity_linking.md
@@ -2,6 +2,16 @@
The **linker** will link the detected entities to an existing set of labels. Some of the **linkers**, however, are *end-to-end*, i.e. they don't need the **mentions extractor**, as they detect and link the entities at the same time.
-There are 5 **linkers** available currently, 3 of them are *end-to-end* and 2 are not.
+There are currently 6 **linkers** available; 4 of them are *end-to-end* and 2 are not.
+
+| Linker Name | end-to-end | Source Code | Paper |
+|:----------------------------------------------------:|:----------:|----------------------------------------------------------|--------------------------------------------------------------------|
+| [Blink](https://ibm.github.io/zshot/blink_linker/) | X | [Source Code](https://github.com/facebookresearch/BLINK) | [Paper](https://arxiv.org/pdf/1911.03814.pdf) |
+| [GENRE](https://ibm.github.io/zshot/genre_linker/) | X | [Source Code](https://github.com/facebookresearch/GENRE) | [Paper](https://arxiv.org/pdf/2010.00904.pdf) |
+| [SMXM](https://ibm.github.io/zshot/smxm_linker/) | ✓ | [Source Code](https://github.com/Raldir/Zero-shot-NERC) | [Paper](https://aclanthology.org/2021.acl-long.120/) |
+| [TARS](https://ibm.github.io/zshot/tars_linker/) | ✓ | [Source Code](https://github.com/flairNLP/flair) | [Paper](https://kishaloyhalder.github.io/pdfs/tars_coling2020.pdf) |
+| [GLINER](https://ibm.github.io/zshot/gliner_linker/) | ✓ | [Source Code](https://github.com/urchade/GLiNER) | [Paper](https://arxiv.org/abs/2311.08526) |
+| [RELIK](https://ibm.github.io/zshot/relik_linker/) | ✓ | [Source Code](https://github.com/SapienzaNLP/relik) | [Paper](https://arxiv.org/abs/2408.00103) |
+
::: zshot.Linker
\ No newline at end of file
diff --git a/docs/relik_linker.md b/docs/relik_linker.md
new file mode 100644
index 0000000..222c4eb
--- /dev/null
+++ b/docs/relik_linker.md
@@ -0,0 +1,13 @@
+# ReLiK Linker
+ReLiK is a lightweight and fast model for Entity Linking and Relation Extraction. It is composed of two main components: a retriever and a reader. The retriever is responsible for retrieving relevant documents from a large collection, while the reader extracts entities and relations from the retrieved documents. A pre-trained ReLiK pipeline can be loaded with the `from_pretrained` method.
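+
+As a minimal sketch of the underlying library (assuming the `relik` package is installed), the pre-trained pipeline used by this linker can be loaded and run directly:
+
+```python
+from relik import Relik
+
+# Load the pre-trained entity-linking pipeline (retriever + reader)
+relik = Relik.from_pretrained("sapienzanlp/relik-entity-linking-large")
+
+# Run it on a plain string; the output contains the linked spans (start, end, label)
+relik_out = relik("Michael Jordan was one of the best players in the NBA.")
+print(relik_out.spans)
+```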
+
+In **Zshot**, we created a linker that uses ReLiK. It works both with and without a predefined set of entities, and it takes the entity descriptions into account when they are provided.
+
+This is an *end-to-end* model, so there is no need to use a **mentions extractor** beforehand.
+
+The ReLiK **linker** will use the **entities** specified in the `zshot.PipelineConfig`, if any.
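+
+A minimal usage sketch with the **Zshot** pipeline (the entity names and descriptions below are illustrative):
+
+```python
+import spacy
+
+from zshot import PipelineConfig
+from zshot.linker import LinkerRelik
+from zshot.utils.data_models import Entity
+
+nlp = spacy.blank("en")
+# Configure the zshot pipeline with the ReLiK linker and a custom set of entities
+config = PipelineConfig(
+    linker=LinkerRelik(),
+    entities=[
+        Entity(name="company", description="The name of a company"),
+        Entity(name="location", description="A physical location, such as a city or a country"),
+    ]
+)
+nlp.add_pipe("zshot", config=config, last=True)
+
+doc = nlp("IBM headquarters are located in Armonk, New York.")
+print(doc.ents)
+```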
+
+- [Paper](https://arxiv.org/abs/2408.00103)
+- [Original Source Code](https://github.com/SapienzaNLP/relik)
+
+::: zshot.linker.LinkerRelik
\ No newline at end of file
diff --git a/mkdocs.yml b/mkdocs.yml
index ddd85c6..30fc8da 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -22,6 +22,7 @@ nav:
- regen.md
- smxm_linker.md
- tars_linker.md
+ - relik_linker.md
- gliner_linker.md
- Relations Extractor:
- relation_extractor.md
diff --git a/requirements/test.txt b/requirements/test.txt
index 9b82063..203a0f7 100644
--- a/requirements/test.txt
+++ b/requirements/test.txt
@@ -7,4 +7,5 @@ gliner>=0.2.9
flake8>=4.0.1
coverage>=6.4.1
pydantic==1.9.2
+relik==1.0.5
IPython
\ No newline at end of file
diff --git a/zshot/linker/__init__.py b/zshot/linker/__init__.py
index 98a6c21..eb69975 100644
--- a/zshot/linker/__init__.py
+++ b/zshot/linker/__init__.py
@@ -4,4 +4,5 @@
from zshot.linker.linker_smxm import LinkerSMXM # noqa: F401
from zshot.linker.linker_tars import LinkerTARS # noqa: F401
from zshot.linker.linker_ensemble import LinkerEnsemble # noqa: F401
+from zshot.linker.linker_relik import LinkerRelik # noqa: F401
from zshot.linker.linker_gliner import LinkerGLINER # noqa: F401
diff --git a/zshot/linker/linker_relik.py b/zshot/linker/linker_relik.py
new file mode 100644
index 0000000..0cac8bc
--- /dev/null
+++ b/zshot/linker/linker_relik.py
@@ -0,0 +1,80 @@
+import contextlib
+import logging
+import pkgutil
+from typing import Iterator, List, Optional
+
+from relik import Relik
+from relik.inference.data.objects import RelikOutput
+from relik.retriever.indexers.document import Document
+from spacy.tokens import Doc
+
+from zshot.config import MODELS_CACHE_PATH
+from zshot.linker.linker import Linker
+from zshot.utils.data_models import Span
+
+logging.getLogger("relik").setLevel(logging.ERROR)
+
+MODEL_NAME = "sapienzanlp/relik-entity-linking-large"
+
+
+class LinkerRelik(Linker):
+ """ Relik linker """
+
+ def __init__(self, model_name=MODEL_NAME):
+ super().__init__()
+
+ if not pkgutil.find_loader("relik"):
+            raise Exception("relik module not installed. You need to install relik in order to use the Relik Linker. "
+                            "Install it with: pip install relik")
+
+ self.model_name = model_name
+ self.model = None
+ # self.device = {
+ # "retriever_device": self.device,
+ # "index_device": self.device,
+ # "reader_device": self.device
+ # }
+
+ @property
+ def is_end2end(self) -> bool:
+ """ relik is end2end """
+ return True
+
+ def load_models(self):
+ """ Load relik model """
+        # Suppress ReLiK's loading output
+        with contextlib.redirect_stdout(None):
+            if self.model is None:
+                if self._entities:
+                    # Entities are configured in the pipeline, so the retriever is not needed:
+                    # candidates are passed directly at prediction time
+                    self.model = Relik.from_pretrained(self.model_name,
+                                                       cache_dir=MODELS_CACHE_PATH,
+                                                       retriever=None, device=self.device)
+                else:
+                    # No entities configured: load the full pipeline, including the retriever,
+                    # keeping the index on CPU
+                    self.model = Relik.from_pretrained(self.model_name,
+                                                       cache_dir=MODELS_CACHE_PATH, device=self.device,
+                                                       index_device='cpu')
+
+    def predict(self, docs: Iterator[Doc], batch_size: Optional[int] = None) -> List[List[Span]]:
+ """
+ Perform the entity prediction
+ :param docs: A list of spacy Document
+        :param batch_size: The batch size (currently not used; documents are processed one at a time)
+ :return: List Spans for each Document in docs
+ """
+        # Build ReLiK candidate documents from the configured entities (name + description)
+        candidates = None
+ if self._entities:
+ candidates = [
+ Document(text=ent.name, id=i, metadata={'definition': ent.description})
+ for i, ent in enumerate(self._entities)
+ ]
+
+ sentences = [doc.text for doc in docs]
+
+ self.load_models()
+ span_annotations = []
+        # Run ReLiK sentence by sentence; when candidates are given, linking is restricted to them
+        for sent in sentences:
+ relik_out: RelikOutput = self.model(sent, candidates=candidates)
+ span_annotations.append([Span(start=relik_span.start, end=relik_span.end, label=relik_span.label)
+ for relik_span in relik_out.spans])
+
+ return span_annotations
diff --git a/zshot/tests/linker/test_gliner_linker.py b/zshot/tests/linker/test_gliner_linker.py
index 24ea27a..42cbd77 100644
--- a/zshot/tests/linker/test_gliner_linker.py
+++ b/zshot/tests/linker/test_gliner_linker.py
@@ -13,7 +13,7 @@
@pytest.fixture(scope="module", autouse=True)
def teardown():
- logger.warning("Starting smxm tests")
+ logger.warning("Starting gliner tests")
yield True
gc.collect()
@@ -25,7 +25,7 @@ def test_gliner_download():
del linker.model, linker
-def test_smxm_linker():
+def test_gliner_linker():
nlp = spacy.blank("en")
gliner_config = PipelineConfig(
linker=LinkerGLINER(),
@@ -43,7 +43,7 @@ def test_smxm_linker():
del doc, nlp, gliner_config
-def test_smxm_linker_no_entities():
+def test_gliner_linker_no_entities():
nlp = spacy.blank("en")
gliner_config = PipelineConfig(
linker=LinkerGLINER(),
diff --git a/zshot/tests/linker/test_relik_linker.py b/zshot/tests/linker/test_relik_linker.py
new file mode 100644
index 0000000..4d51d9a
--- /dev/null
+++ b/zshot/tests/linker/test_relik_linker.py
@@ -0,0 +1,60 @@
+import gc
+import logging
+
+import pytest
+import spacy
+
+from zshot import PipelineConfig, Linker
+from zshot.linker import LinkerRelik
+from zshot.tests.config import EX_DOCS, EX_ENTITIES
+
+logger = logging.getLogger(__name__)
+
+
+@pytest.fixture(scope="module", autouse=True)
+def teardown():
+ logger.warning("Starting relik tests")
+ yield True
+ gc.collect()
+
+
+@pytest.mark.skip(reason="Too expensive to run on every commit")
+def test_relik_download():
+ linker = LinkerRelik()
+ linker.load_models()
+ assert isinstance(linker, Linker)
+ del linker.model, linker
+
+
+@pytest.mark.skip(reason="Too expensive to run on every commit")
+def test_relik_linker():
+ nlp = spacy.blank("en")
+ relik_config = PipelineConfig(
+ linker=LinkerRelik(),
+ entities=EX_ENTITIES
+ )
+ nlp.add_pipe("zshot", config=relik_config, last=True)
+ assert "zshot" in nlp.pipe_names
+
+ doc = nlp(EX_DOCS[1])
+ assert len(doc.ents) > 0
+ del nlp.get_pipe('zshot').linker.model, nlp.get_pipe('zshot').linker
+ nlp.remove_pipe('zshot')
+ del doc, nlp, relik_config
+
+
+@pytest.mark.skip(reason="Too expensive to run on every commit")
+def test_relik_linker_no_entities():
+ nlp = spacy.blank("en")
+ relik_config = PipelineConfig(
+ linker=LinkerRelik(),
+ entities=[]
+ )
+ nlp.add_pipe("zshot", config=relik_config, last=True)
+ assert "zshot" in nlp.pipe_names
+
+ doc = nlp(EX_DOCS[1])
+ assert len(doc.ents) == 0
+ del nlp.get_pipe('zshot').linker.model, nlp.get_pipe('zshot').linker
+ nlp.remove_pipe('zshot')
+ del doc, nlp, relik_config
diff --git a/zshot/utils/download_models.py b/zshot/utils/download_models.py
index ce25422..f7ec006 100644
--- a/zshot/utils/download_models.py
+++ b/zshot/utils/download_models.py
@@ -21,6 +21,10 @@ def load_all():
LinkerGLINER().load_models()
except RuntimeError:
pass
+ # try:
+ # LinkerRelik().load_models()
+ # except RuntimeError:
+ # pass
try:
RelationsExtractorZSRC().load_models()
except RuntimeError: