首页 技术 正文
技术 2022年11月15日
0 收藏 772 点赞 4,783 浏览 5671 个字

Sometime back, I described how I built (among other things) a custom Solr QParser plugin to handle Payload Term Queries. Looking back on this recently, I realized how lame it was – all it could handle were single Payload Term Queries, and a one level deep AND and OR combinations of these queries. More to the point, I discovered that I had to support queries of this form:

1
+concepts:123456 +(concepts:234567 concepts:345678 ...)

The original parsing code simply split up the query by whitespace, then by colon, and depending on whether the key was preceded by a “+” sign, either added it to the Boolean Query as an Occur.MUST or Occur.SHOULD. Obviously, this would not be able to parse the form of the query above.

Coencidentally, a few days ago, I was hunting around for something completely different on my laptop, and I came across the QueryParser Lucene contrib module that replaces the original Lucene JavaCC based QueryParser with a nice little framework that splits the query parsing into 3 phases – syntax parsing, query processing and query building. It has been available since Lucene 2.9.0, and on the version I am using (Lucene 2.9.3/Solr 1.4.1) both QueryParser implementations are supported.

In my case, my Payload Query syntax is identical to the Term Query syntax, so all I really needed to do was to return a PayloadTermQuery instead of a TermQuery in the query building phase. So all I needed to do to build a robust Payload QueryParser was to just implement a custom QueryBuilder and call it from within this framework.

There is not much documentation available on how to use the framework though, apart from the Javadocs, and the advice in there is to take a look at the StandardQueryParser and use that as a template to design your own. So thats what I did. I ended up building a few more classes in order to integrate it into my custom QParser plugin, but it was really quite simple.

Here is the updated code for my QParser plugin. Apart from this code change, all I had to do was add the lucene-queryparser-2.9.3.jar to the Solr classpath. There is no change in its configuration and the associated Solr request handler I used it from – both these are described in my previous post I referred to above.

 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
// $Source: src/java/org/apache/solr/search/ext/PayloadQParserPlugin.java
package org.apache.solr.search.ext;import org.apache.lucene.index.Term;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.core.QueryNodeException;
import org.apache.lucene.queryParser.core.QueryParserHelper;
import org.apache.lucene.queryParser.core.nodes.FieldQueryNode;
import org.apache.lucene.queryParser.core.nodes.QueryNode;
import org.apache.lucene.queryParser.standard.builders.StandardQueryBuilder;
import org.apache.lucene.queryParser.standard.builders.StandardQueryTreeBuilder;
import org.apache.lucene.queryParser.standard.config.StandardQueryConfigHandler;
import org.apache.lucene.queryParser.standard.parser.StandardSyntaxParser;
import org.apache.lucene.queryParser.standard.processors.StandardQueryNodeProcessorPipeline;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.payloads.AveragePayloadFunction;
import org.apache.lucene.search.payloads.PayloadTermQuery;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.search.QParser;
import org.apache.solr.search.QParserPlugin;/**
* Parser plugin to parse payload queries.
*/
public class PayloadQParserPlugin extends QParserPlugin { @Override
public QParser createParser(String qstr, SolrParams localParams,
SolrParams params, SolrQueryRequest req) {
return new PayloadQParser(qstr, localParams, params, req);
} public void init(NamedList args) {
// do nothing
}
}class PayloadQParser extends QParser { public PayloadQParser(String qstr, SolrParams localParams,
SolrParams params, SolrQueryRequest req) {
super(qstr, localParams, params, req);
} @Override
public Query parse() throws ParseException {
PayloadQueryParser parser = new PayloadQueryParser();
try {
Query q = (Query) parser.parse(qstr, "concepts");
return q;
} catch (QueryNodeException e) {
throw new ParseException(e.getMessage());
}
}
}class PayloadQueryParser extends QueryParserHelper { public PayloadQueryParser() {
super(new StandardQueryConfigHandler(), new StandardSyntaxParser(),
new StandardQueryNodeProcessorPipeline(null),
new PayloadQueryTreeBuilder());
}
}class PayloadQueryTreeBuilder extends StandardQueryTreeBuilder { public PayloadQueryTreeBuilder() {
super();
setBuilder(FieldQueryNode.class, new PayloadQueryNodeBuilder());
}
}class PayloadQueryNodeBuilder implements StandardQueryBuilder { @Override
public PayloadTermQuery build(QueryNode queryNode) throws QueryNodeException {
FieldQueryNode node = (FieldQueryNode) queryNode;
return new PayloadTermQuery(
new Term(node.getFieldAsString(), node.getTextAsString()),
new AveragePayloadFunction(), false);
}
}

As you can see, in my QParser.parse() method, I instantiate PayloadQueryParser, which is a subclass of QueryParserHelper. I reuse the same constructor code as StandardQueryParser (another subclass of QueryParserHelper and my template), except I pass in a custom QueryBuilder – the PayloadQueryTreeBuilder. The PayloadQueryTreeBuilder subclasses StandardQueryTreeBuilder, except it redefines what builder to use for FieldQueryNode types – the StandardQueryTreeBuilder is sort of a factory and delegates to the appropriate QueryBuilder depending on the type of the node. Finally, the PayloadQueryNodeBuilder implements the StandardQueryBuilder (similar to the FieldQueryNodeBuilder), and redefines the build() method to produce a PayloadTermQuery instead of a TermQuery as FieldQueryNodeBuilder does.

And thats pretty much it. I tested this by hitting the /concept-search URL and verified that the queries are correctly parsed and returned by printing the queries in the log.

Hopefully this post was useful, if for nothing else that people find out about the new QueryParser framework and begin to use it. The customization I did here is pretty trivial in terms of code, but it saved me a lot of work.

http://sujitpal.blogspot.com/2011/03/using-lucenes-new-queryparser-framework.html

相关推荐
python开发_常用的python模块及安装方法
adodb:我们领导推荐的数据库连接组件bsddb3:BerkeleyDB的连接组件Cheetah-1.0:我比较喜欢这个版本的cheeta…
日期:2022-11-24 点赞:878 阅读:9,000
Educational Codeforces Round 11 C. Hard Process 二分
C. Hard Process题目连接:http://www.codeforces.com/contest/660/problem/CDes…
日期:2022-11-24 点赞:807 阅读:5,512
下载Ubuntn 17.04 内核源代码
zengkefu@server1:/usr/src$ uname -aLinux server1 4.10.0-19-generic #21…
日期:2022-11-24 点赞:569 阅读:6,358
可用Active Desktop Calendar V7.86 注册码序列号
可用Active Desktop Calendar V7.86 注册码序列号Name: www.greendown.cn Code: &nb…
日期:2022-11-24 点赞:733 阅读:6,141
Android调用系统相机、自定义相机、处理大图片
Android调用系统相机和自定义相机实例本博文主要是介绍了android上使用相机进行拍照并显示的两种方式,并且由于涉及到要把拍到的照片显…
日期:2022-11-24 点赞:512 阅读:7,771
Struts的使用
一、Struts2的获取  Struts的官方网站为:http://struts.apache.org/  下载完Struts2的jar包,…
日期:2022-11-24 点赞:671 阅读:4,849