3 proposed reasons for hallucinations were put to the test,
only 2 held up.
By studying how networks behave while hallucinating, the authors manage to
filter hallucinations (with great success)
https://arxiv.org/abs/2301.07779
#NLProc #neuralEmpty #NLP #deepRead
Hallucination is the case where the network invents information that does not appear in the input at all
For example, "translating" from English to English (hallucinated parts marked with *):
This tweet is the best ->
This *paper* is great *is wonderful is best*
The repetition is also considered a (degenerate) hallucination
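As a toy illustration (mine, not from the paper), degenerate repetitions like the one above can be flagged with a simple repeated-n-gram check:

```python
# Toy check (my illustration, not the paper's method): flag degenerate
# "oscillatory" hallucinations by looking for repeated n-grams.
from collections import Counter

def has_repeated_ngrams(text: str, n: int = 2, min_repeats: int = 2) -> bool:
    """Return True if any n-gram occurs at least `min_repeats` times."""
    tokens = text.lower().split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return any(c >= min_repeats for c in Counter(ngrams).values())

print(has_repeated_ngrams("This paper is great is wonderful is best is wonderful"))  # True
```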
This paper tested 3 previously proposed hypotheses (original papers linked below),
showed that 2 indeed hold,
created a dataset (read the paper for more),
and lastly showed that all this can be harnessed for
filtering hallucinations
So the hypotheses:
When a network hallucinates, it discards most of the source sentence and attends to only a small part of the input
Specifically, that part seems to be not the EOS token but the beginning tokens (a toy probe is sketched below)
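A minimal sketch of one way to probe this (my illustration, not the paper's exact metric): measure how much cross-attention mass lands on the first few source tokens. The attention matrix here is fake and the cutoff k is an assumption.

```python
# Toy probe for hypothesis 1 (my illustration, not the paper's metric):
# how much cross-attention mass falls on the first k source tokens?
import numpy as np

def mass_on_beginning(attn: np.ndarray, k: int = 3) -> float:
    """attn: [target_len, source_len] decoder cross-attention weights,
    each row a distribution over source tokens; returns the average
    mass placed on the first k source tokens."""
    return float(attn[:, :k].sum(axis=1).mean())

rng = np.random.default_rng(0)
attn = rng.dirichlet(np.ones(10), size=7)  # fake 7-step x 10-token attention
print(mass_on_beginning(attn))             # values near 1 -> attends mostly to the beginning
```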
When hallucinating, the relevance of source words is static:
despite the different outputs,
what is considered important information to attend to stays the same
(both hypotheses suggested by the above and by
https://aclanthology.org/W19-5361/)
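A matching toy probe for this second hypothesis (again my proxy, not the paper's implementation): if the attention distribution barely changes across decoding steps, "what is important" is static.

```python
# Toy probe for hypothesis 2 (my proxy, not the paper's method): mean
# cosine similarity between attention rows of consecutive target steps.
import numpy as np

def attention_staticness(attn: np.ndarray) -> float:
    """attn: [target_len, source_len]; rows are attention distributions."""
    rows = attn / np.linalg.norm(attn, axis=1, keepdims=True)
    sims = (rows[:-1] * rows[1:]).sum(axis=1)  # cosine of step t vs. step t+1
    return float(sims.mean())                  # close to 1.0 -> static attention
```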
The third suggested reason for hallucination didn't hold:
Apparently, the network relies on the source and on the target prefix (the translation decoded so far) to a similar degree when hallucinating
The paper that suggested this also proposed how to quantify reliance, and its methods underlie the current paper:
https://aclanthology.org/2021.acl-long.91/
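The linked paper's contribution measure is LRP-based; below is only a crude stand-in (my simplification) that assumes per-step contribution scores are already computed upstream.

```python
# Very rough stand-in (my simplification; the linked paper uses an
# LRP-based contribution measure): given per-step contribution scores
# of the source vs. the decoded prefix, compute the source share.
import numpy as np

def source_reliance(src_contrib: np.ndarray, tgt_contrib: np.ndarray) -> np.ndarray:
    """src_contrib / tgt_contrib: [target_len] nonnegative scores of how much
    the source sentence vs. the target prefix contributed to each prediction."""
    return src_contrib / (src_contrib + tgt_contrib)

# The old hypothesis: hallucinations show unusually low source reliance.
# The new finding: reliance looks similar, so this signal alone won't do.
print(source_reliance(np.array([0.6, 0.5, 0.4]), np.array([0.4, 0.5, 0.6])))
```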
Finally, they find that by using those internal features,
a small classifier can reach great filtering scores
(a LASER-based similarity detector is also good precision-wise)
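A hedged sketch of that filtering step: a small classifier trained on model-internal features. The feature names and labels below are placeholders, not the paper's actual feature set or data.

```python
# Sketch of the filtering step with placeholder features and fake labels.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(0)
# columns stand in for e.g. mass_on_beginning, attention_staticness, source_reliance
X = rng.random((200, 3))
y = (X[:, 0] > 0.6).astype(int)  # fake "is hallucination" annotations

clf = LogisticRegression().fit(X[:150], y[:150])
pred = clf.predict(X[150:])
print(precision_score(y[150:], pred), recall_score(y[150:], pred))
```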
For the first time, removing hallucinations sounds not only interesting but actually practical
I think this is the first time one of my toots got boosted more than the
matching tweet