Technology, November 20, 2022

A Generative Entity-Mention Model for Linking Entities with Knowledge Base



The paper proposes a generative probabilistic model called the entity-mention model.


In our model, each name mention to be linked is modeled as a sample generated through a three-step generative story, and the entity knowledge is encoded in the distribution of entities in documents P(e), the distribution of possible names of a specific entity P(s|e), and the distribution of possible contexts of a specific entity P(c|e). To find the referent entity of a name mention, our method combines the evidence from all three distributions P(e), P(s|e) and P(c|e).

P(e), P(s|e) and P(c|e) are called the entity popularity model, the entity name model and the entity context model, respectively.



Given a set of name mentions M = {m1, m2, …, mk} contained in documents and a knowledge base KB containing a set of entities E = {e1, e2, …, en}, an entity linking system is a function σ : M → E which links these name mentions to their referent entities in KB.

Popularity Knowledge


Name Knowledge


Context Knowledge


3. The Generative Entity-Mention Model for Entity Linking



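  1. First, the model selects the referent entity e of the name mention from the given knowledge base, according to the distribution of entities P(e).
  2. Second, the model outputs the name s of the name mention, according to the distribution of possible names of the referent entity P(s|e).
  3. Finally, the model outputs the context c of the name mention, according to the distribution of possible contexts of the referent entity P(c|e).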


The probability of a name mention m (with context c and name s) referring to a specific entity e can be expressed as the following formula (assuming that s and c are independent given e):

$$P(m, e) = P(s, c, e) = P(e)\,P(s \mid e)\,P(c \mid e)$$

Given a name mention m, to perform entity linking we need to find the entity e which maximizes the probability P(e|m).

$$e^{*} = \arg\max_{e} \frac{P(m, e)}{P(m)} = \arg\max_{e} P(e)\,P(s \mid e)\,P(c \mid e)$$
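To make the decision rule concrete, here is a minimal Python sketch that picks the candidate entity maximizing P(e)P(s|e)P(c|e). It works in log space so that small probabilities do not underflow. The function name `link_mention`, the dictionary layout and all of the numbers are assumptions made for illustration; none of them come from the paper.

```python
import math

def link_mention(candidates, p_e, p_s_given_e, p_c_given_e):
    """Return (entity, log score) for the candidate maximizing P(e) * P(s|e) * P(c|e)."""
    best_entity, best_score = None, -math.inf
    for e in candidates:
        factors = (p_e[e], p_s_given_e[e], p_c_given_e[e])
        if min(factors) <= 0.0:
            continue  # a zero factor makes the whole product zero
        score = sum(math.log(x) for x in factors)
        if score > best_score:
            best_entity, best_score = e, score
    return best_entity, best_score

# Toy probabilities, invented purely for illustration.
candidates = ["Michael Jeffrey Jordan", "Michael I. Jordan"]
p_e = {"Michael Jeffrey Jordan": 0.7, "Michael I. Jordan": 0.3}
p_s = {"Michael Jeffrey Jordan": 0.5, "Michael I. Jordan": 0.4}
p_c = {"Michael Jeffrey Jordan": 0.001, "Michael I. Jordan": 0.02}

best, score = link_mention(candidates, p_e, p_s, p_c)
print(best)
```

With these toy numbers the context evidence outweighs the popularity prior, so the less popular entity wins. That is the kind of trade-off the combined model is meant to capture.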

Candidate Selection

Candidate selection builds a name-to-entity dictionary from the redirect links, disambiguation pages and anchor texts of Wikipedia. The candidate entities of a name mention are then selected by looking up its name in this dictionary.
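The dictionary lookup can be sketched in a few lines of Python. This sketch assumes the three Wikipedia sources have already been extracted into (name, entity) pairs. The function names, the lowercasing of names and the sample pairs are illustrative assumptions, not details taken from the paper.

```python
from collections import defaultdict

def build_name_dictionary(redirects, disambiguations, anchors):
    """Merge (name, entity) pairs from the three sources into one name -> entities map."""
    name_to_entities = defaultdict(set)
    for source in (redirects, disambiguations, anchors):
        for name, entity in source:
            # Lowercasing is an assumption; the paper's normalization is not given here.
            name_to_entities[name.lower()].add(entity)
    return name_to_entities

def candidates_for(name, name_to_entities):
    """Return the candidate entities for a name, or an empty list if it is unknown."""
    return sorted(name_to_entities.get(name.lower(), set()))

# Made-up sample pairs for illustration only.
redirects = [("Michael J. Jordan", "Michael Jeffrey Jordan")]
disambiguations = [
    ("Michael Jordan", "Michael Jeffrey Jordan"),
    ("Michael Jordan", "Michael I. Jordan"),
]
anchors = [("Jordan", "Michael Jeffrey Jordan")]

name_dict = build_name_dictionary(redirects, disambiguations, anchors)
print(candidates_for("Michael Jordan", name_dict))
```

A name with no dictionary entry yields an empty candidate list, which is one of the cases the NIL entity has to cover.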

4. Model Estimation

Entity Popularity Model

$$P(e) = \frac{\mathrm{Count}(e)}{|M|}$$

where Count(e) is the number of name mentions whose referent entity is e, and |M| is the total number of name mentions.
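As a rough illustration, the popularity estimate Count(e) / |M| can be computed from a list of annotated referents as below. The function name `estimate_popularity` and the sample data are assumptions for this sketch. Note that this plain relative-frequency estimate gives zero probability to any entity that never appears as a referent, so in practice some form of smoothing would be needed.

```python
from collections import Counter

def estimate_popularity(referents):
    """referents: the referent entity of each annotated name mention (one item per mention)."""
    counts = Counter(referents)
    total = len(referents)  # |M|, the total number of name mentions
    return {entity: count / total for entity, count in counts.items()}

# Made-up annotations: three mentions of one entity, one of the other.
referents = ["Michael Jeffrey Jordan"] * 3 + ["Michael I. Jordan"]
popularity = estimate_popularity(referents)
print(popularity)
```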

Entity Name Model

For example, we want P(Michael Jordan|Michael Jeffrey Jordan) to be high and P(MJ|Michael Jeffrey Jordan) to be high as well, whereas P(Michael I. Jordan|Michael Jeffrey Jordan) should be 0.




E.g., "MJ" never refers to Michael Jeffrey Jordan in Wikipedia, so a name model estimated directly from the names observed in Wikipedia cannot recognize "MJ" as a name of Michael Jeffrey Jordan.


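To handle such unseen names, the name model treats a name as a translation of the entity's full name, where each word of the full name can be handled in one of four ways:

1) It is retained (translated into itself);

2) It is translated into its acronym;

3) It is omitted (translated into the word NULL);

4) It is translated into another word (misspelling or alias).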

$$P(s \mid e) = \frac{\epsilon}{(l_f + 1)^{l_s}} \prod_{i=1}^{l_s} \sum_{j=0}^{l_f} t(s_i \mid f_j)$$

where ε is a normalization factor, f is the full name of entity e, l_f is the length of f, l_s is the length of the name s, s_i is the i-th word of s, f_j is the j-th word of f (with f_0 the NULL word), and t(s_i|f_j) is the lexical translation probability, i.e. the probability that a word f_j in the full name is written as s_i in the output name.
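The following Python sketch shows how a name probability could be computed from a table of lexical translation probabilities, assuming the IBM Model 1 style form ε / (l_f + 1)^(l_s) · Π_i Σ_j t(s_i|f_j) with a NULL word at position j = 0. The function name, the dictionary-based table `t` and every probability in it are invented for illustration; the paper's learned values are not reproduced here.

```python
NULL = "NULL"

def name_probability(name, full_name, t, epsilon=1.0):
    """P(s|e) under a word-level translation model.

    t maps (name_word, full_name_word) -> translation probability;
    pairs missing from t are treated as probability 0.
    """
    s_words = name.split()
    f_words = [NULL] + full_name.split()  # position j = 0 is the NULL word
    prob = epsilon / (len(f_words) ** len(s_words))  # len(f_words) == l_f + 1
    for s_i in s_words:
        prob *= sum(t.get((s_i, f_j), 0.0) for f_j in f_words)
    return prob

# Hypothetical translation probabilities, for illustration only.
t = {
    ("Michael", "Michael"): 0.8,  # word retained
    ("M", "Michael"): 0.1,        # word translated into its acronym
    ("Jordan", "Jordan"): 0.9,
    ("J", "Jordan"): 0.05,
}

full = "Michael Jeffrey Jordan"
p_common = name_probability("Michael Jordan", full, t)
p_other = name_probability("Michael I. Jordan", full, t)
print(p_common, p_other)
```

Because no word of the full name translates into "I." in this toy table, the name "Michael I. Jordan" gets probability 0 for Michael Jeffrey Jordan, matching the behavior the example above asks for.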

Entity Context Model


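C1: __ wins NBA MVP.

C2: __ is a researcher in machine learning.

P(C1|Michael Jeffrey Jordan) should be high, because the NBA player Michael Jeffrey Jordan often appears in contexts like C1, whereas P(C2|Michael Jeffrey Jordan) should be very low, because he rarely appears in contexts like C2.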

Given a context c containing n terms t1, t2, …, tn (a term may be a word, a named entity or a Wikipedia concept), the entity context model estimates the probability P(c|e) as

$$P(c \mid e) = P(t_1 t_2 \ldots t_n \mid e) = \prod_{i=1}^{n} P_e(t_i)$$

Each term probability is smoothed with a general language model:

$$P_e(t) = \lambda\,\hat{P}_e(t) + (1 - \lambda)\,P_g(t)$$

where P_g(t) is a general language model estimated from the whole of Wikipedia, and the optimal value of λ is set to 0.2. The unsmoothed estimate is

$$\hat{P}_e(t) = \frac{\mathrm{Count}_e(t)}{\sum_{t'} \mathrm{Count}_e(t')}$$

where Count_e(t) is the number of occurrences of the term t in the contexts of the name mentions whose referent entity is e.
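Below is a small Python sketch of the context model as a smoothed unigram language model. It assumes that λ weights the entity-specific estimate and (1 - λ) weights the general language model; that reading, the function name and all counts and probabilities are assumptions for illustration.

```python
import math
from collections import Counter

def context_log_prob(terms, entity_counts, general_lm, lam=0.2):
    """log P(c|e) for a context given as a list of terms.

    entity_counts: term -> frequency in contexts of mentions that refer to e.
    general_lm:    term -> probability under the general language model.
    """
    total = sum(entity_counts.values())
    log_p = 0.0
    for term in terms:
        p_ml = entity_counts.get(term, 0) / total if total else 0.0
        p = lam * p_ml + (1.0 - lam) * general_lm.get(term, 0.0)
        if p <= 0.0:
            return -math.inf  # term unseen by both models
        log_p += math.log(p)
    return log_p

# Made-up counts for the contexts of Michael Jeffrey Jordan.
jordan_counts = Counter({"NBA": 6, "MVP": 3, "wins": 1})
# Made-up general language model probabilities.
general_lm = {"NBA": 0.01, "MVP": 0.01, "wins": 0.05,
              "researcher": 0.02, "machine": 0.02, "learning": 0.02}

lp_c1 = context_log_prob(["wins", "NBA", "MVP"], jordan_counts, general_lm)
lp_c2 = context_log_prob(["researcher", "machine", "learning"], jordan_counts, general_lm)
print(lp_c1 > lp_c2)
```

Thanks to the smoothing, the machine learning context still receives a small non-zero probability for the basketball player instead of zeroing out the whole product.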

The NIL Entity Problem


1. Add a pseudo entity, the NIL entity, to the knowledge base.

2. If the probability that a name mention is generated by the NIL entity is higher than the probability for every other entity in the knowledge base, link the name mention to the NIL entity.
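The two steps above can be sketched as a comparison between the best knowledge-base candidate and the NIL pseudo entity. How the NIL entity's probabilities are estimated is not covered here, so this sketch simply takes the NIL score as an input. The function name and the scores are hypothetical.

```python
NIL = "NIL"

def link_with_nil(kb_scores, nil_score):
    """kb_scores: entity -> log of P(e) * P(s|e) * P(c|e); nil_score: the same quantity for NIL."""
    best = max(kb_scores, key=kb_scores.get, default=None)
    if best is None or nil_score > kb_scores[best]:
        return NIL  # no candidate at all, or NIL explains the mention better
    return best

# Invented log scores for illustration.
print(link_with_nil({"Michael Jeffrey Jordan": -5.0, "Michael I. Jordan": -7.0}, -6.0))
print(link_with_nil({"Michael Jeffrey Jordan": -5.0}, -4.0))
```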

