1. Introduction to QueryParser
1) Sometimes we want to pass a String such as "student AND teacher" to execute a query.
It means we want to find the documents in which a certain field contains both "student" and "teacher".
QueryParser makes this easy.
2) When using QueryParser, make sure the field being searched is indexed with "Field.Index.ANALYZED".
Remember that a field indexed with "Field.Index.ANALYZED" is run through the analyzer, and an analyzer such as SimpleAnalyzer converts the value to lower case.
When we build a Query such as TermQuery by hand, we have to make sure the word we search for is lower case ourselves.
With QueryParser we don't have to care about this, because QueryParser runs our String through the same analyzer, which lowercases it.
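To see why the case of the search word matters, here is a minimal plain-Java sketch of what SimpleAnalyzer does to a value (split at non-letters, then lowercase). It does not use Lucene; the class name SimpleAnalyzerSketch and the method tokenize are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class SimpleAnalyzerSketch {

    // Split the text at every non-letter character and lowercase each token.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Terms that end up in the index for one 'gender' value
        List<String> indexed = tokenize("Male aaa Female");
        System.out.println(indexed);                    // [male, aaa, female]
        // A hand-built term "Female" does not equal any indexed term
        System.out.println(indexed.contains("Female")); // false
        // The same word after analysis does
        System.out.println(indexed.contains(tokenize("Female").get(0))); // true
    }
}
```

The last two lines mirror the difference between a hand-built term and a term that went through the analyzer first.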
2. Example of QueryParser
private void testBuildIndex()
{
    // Constructor arguments: id, name, password, gender, score
    List<Student> studentList = new ArrayList<Student>();
    studentList.add(new Student("11", "Davy", "Jones", "Male aaa Female", 100));
    studentList.add(new Student("22", "Davy", "Jones", "Male bbb Female", 110));
    studentList.add(new Student("33", "Jones", "Davy", "Male Female", 120));
    studentList.add(new Student("44", "Calyp", "Jones", "Female aa bb Male", 130));
    studentList.add(new Student("55", "Pso", "Caly", "Female cc dd ee Male", 140));
    searcherUtil.buildIndex(studentList);
}
public void buildIndex(List<Student> studentList)
{
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_35,
            new SimpleAnalyzer(Version.LUCENE_35));
    IndexWriter writer = null;
    try
    {
        writer = new IndexWriter(directory, config);
        for (Student student : studentList)
        {
            Document doc = new Document();
            // id, name and password are indexed as single, unmodified terms
            doc.add(new Field("id", student.getId(), Field.Store.YES, Field.Index.NOT_ANALYZED));
            doc.add(new Field("name", student.getName(), Field.Store.YES, Field.Index.NOT_ANALYZED));
            doc.add(new Field("password", student.getPassword(), Field.Store.YES, Field.Index.NOT_ANALYZED));
            // gender is tokenized and lowercased by SimpleAnalyzer
            doc.add(new Field("gender", student.getGender(), Field.Store.YES, Field.Index.ANALYZED));
            doc.add(new NumericField("score", Field.Store.YES, true)
                    .setIntValue(student.getScore()));
            writer.addDocument(doc);
        }
    } catch (CorruptIndexException e)
    {
        e.printStackTrace();
    } catch (LockObtainFailedException e)
    {
        e.printStackTrace();
    } catch (IOException e)
    {
        e.printStackTrace();
    } finally
    {
        // writer is still null if the IndexWriter constructor threw
        if (writer != null)
        {
            try
            {
                writer.close();
            } catch (CorruptIndexException e)
            {
                e.printStackTrace();
            } catch (IOException e)
            {
                e.printStackTrace();
            }
        }
    }
}
public void searchByQueryParser(Query query, int resultSize)
{
    IndexSearcher searcher = getSearcher();
    try
    {
        TopDocs tds = searcher.search(query, resultSize);
        for (ScoreDoc sd : tds.scoreDocs)
        {
            Document document = searcher.doc(sd.doc);
            System.out.println("id = " + document.get("id")
                    + ", name = " + document.get("name")
                    + ", password = " + document.get("password")
                    + ", gender = " + document.get("gender")
                    + ", score = " + document.get("score"));
        }
    } catch (IOException e)
    {
        e.printStackTrace();
    }
}
@Test
public void testSearchByQueryParser()
{
    testBuildIndex();
    // Create the QueryParser; 'gender' is the default field
    QueryParser parser = new QueryParser(Version.LUCENE_35, "gender",
            new SimpleAnalyzer(Version.LUCENE_35));
    try
    {
        // Search for documents whose 'gender' contains both 'Female' and 'Male'
        Query query = parser.parse("Female AND Male");
        // Only search when parsing succeeded, so a null Query is never passed on
        searcherUtil.searchByQueryParser(query, 100);
    } catch (ParseException e)
    {
        e.printStackTrace();
    }
}
id = 33, name = Jones, password = Davy, gender = Male Female, score = 120
id = 11, name = Davy, password = Jones, gender = Male aaa Female, score = 100
id = 22, name = Davy, password = Jones, gender = Male bbb Female, score = 110
id = 44, name = Calyp, password = Jones, gender = Female aa bb Male, score = 130
id = 55, name = Pso, password = Caly, gender = Female cc dd ee Male, score = 140
Comments:
1) We do not instantiate a specific Query subclass ourselves; the Query is created by QueryParser from the query String.
2) We can use AND, OR and NOT to combine terms in the query String. The operators must be written in upper case.
3) By default, a space between two terms means OR.
parser.setDefaultOperator(Operator.AND);
This overrides the default, so that a space between two terms means AND instead.
4) The default field name is the second constructor argument, as in the statement below:
QueryParser parser = new QueryParser(Version.LUCENE_35, "gender", new SimpleAnalyzer(Version.LUCENE_35));
But we can prefix a term with "fieldName:" in the query String to search another field instead of the default one:
@Test
public void testSearchByQueryParser()
{
    testBuildIndex();
    // Create the QueryParser; 'gender' is the default field
    QueryParser parser = new QueryParser(Version.LUCENE_35, "gender",
            new SimpleAnalyzer(Version.LUCENE_35));
    try
    {
        // The "name:" prefix makes this term search 'name' instead of 'gender'.
        // Note: 'name' is indexed NOT_ANALYZED as "Davy", while SimpleAnalyzer
        // lowercases the query term to "davy", so as written this term is not
        // expected to match (see point 2 of section 1).
        Query query = parser.parse("name: Davy");
        searcherUtil.searchByQueryParser(query, 100);
    } catch (ParseException e)
    {
        e.printStackTrace();
    }
}
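To make comment 3) concrete, the plain-Java sketch below models what OR and AND between terms mean for a single field. It is not Lucene code; DefaultOperatorSketch and matches are hypothetical names, and the term sets are what SimpleAnalyzer would be expected to produce for two of the sample 'gender' values.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DefaultOperatorSketch {

    // requireAll = false models OR between the terms, true models AND.
    public static boolean matches(Set<String> docTerms, List<String> queryTerms,
            boolean requireAll) {
        int hits = 0;
        for (String term : queryTerms) {
            if (docTerms.contains(term)) {
                hits++;
            }
        }
        return requireAll ? hits == queryTerms.size() : hits > 0;
    }

    public static void main(String[] args) {
        Set<String> doc11 = new HashSet<String>(Arrays.asList("male", "aaa", "female"));
        Set<String> doc22 = new HashSet<String>(Arrays.asList("male", "bbb", "female"));
        // Terms of the query String "aaa female"
        List<String> query = Arrays.asList("aaa", "female");

        System.out.println(matches(doc11, query, false)); // true  (OR)
        System.out.println(matches(doc22, query, false)); // true  (OR)
        System.out.println(matches(doc11, query, true));  // true  (AND)
        System.out.println(matches(doc22, query, true));  // false (AND, no 'aaa')
    }
}
```

With the default OR both documents match; after switching the default operator to AND only the document that contains every term remains.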
3. By default, * is not allowed as the first character of a term when using QueryParser.
But we can allow it by calling setAllowLeadingWildcard(true) on the parser.
@Test
public void testSearchByQueryParser()
{
    testBuildIndex();
    // Create the QueryParser; 'gender' is the default field
    QueryParser parser = new QueryParser(Version.LUCENE_35, "gender",
            new SimpleAnalyzer(Version.LUCENE_35));
    // Without this call, a term that starts with * causes a ParseException
    parser.setAllowLeadingWildcard(true);
    try
    {
        // 'name' must end with 'vy' and 'gender' must contain a term starting with 'male'
        Query query = parser.parse("name: *vy AND gender: Male*");
        searcherUtil.searchByQueryParser(query, 100);
    } catch (ParseException e)
    {
        e.printStackTrace();
    }
}
id = 11, name = Davy, password = Jones, gender = Male aaa Female, score = 100
id = 22, name = Davy, password = Jones, gender = Male bbb Female, score = 110
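But how can we allow a leading wildcard when we are using WildcardQuery instead of QueryParser?
No setting is needed: the check is made by QueryParser while it parses the String, so a WildcardQuery that we construct ourselves accepts a term that starts with *.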
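A likely reason for the default is cost: a pattern that starts with * gives no prefix to narrow the candidates, so every term of the field has to be tested. The plain-Java sketch below illustrates the matching rule only (* for any run of characters, ? for exactly one). It is not Lucene's implementation, and WildcardSketch and matches are names invented for the example.

```java
import java.util.regex.Pattern;

public class WildcardSketch {

    // Translate a wildcard pattern into a regular expression and test the whole term.
    public static boolean matches(String wildcard, String term) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");          // any run of characters, possibly empty
            } else if (c == '?') {
                regex.append('.');           // exactly one character
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.matches(regex.toString(), term);
    }

    public static void main(String[] args) {
        // 'name' terms are stored unmodified (NOT_ANALYZED)
        System.out.println(matches("*vy", "Davy"));     // true
        System.out.println(matches("*vy", "Jones"));    // false
        // 'gender' terms are lowercased by the analyzer
        System.out.println(matches("male*", "male"));   // true
        System.out.println(matches("male*", "female")); // false
    }
}
```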
4. A query can also be parsed as a TermRangeQuery using QueryParser
@Test
public void testSearchByQueryParser()
{
    testBuildIndex();
    // Create the QueryParser; 'gender' is the default field
    QueryParser parser = new QueryParser(Version.LUCENE_35, "gender",
            new SimpleAnalyzer(Version.LUCENE_35));
    try
    {
        // Search for documents whose 'id' lies within the range '1' to '3'
        Query query = parser.parse("id:[1 TO 3]");
        searcherUtil.searchByQueryParser(query, 100);
    } catch (ParseException e)
    {
        e.printStackTrace();
    }
}
id = 11, name = Davy, password = Jones, gender = Male aaa Female, score = 100
id = 22, name = Davy, password = Jones, gender = Male bbb Female, score = 110
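1) TO must be upper case.
2) We can use query = parser.parse("id: {1 TO 3}"); instead of parser.parse("id: [1 TO 3]");
[] is an inclusive range: the left value (1) and the right value (3) are part of the range.
{} is an exclusive range: the left value (1) and the right value (3) are not part of the range.
3) The terms are compared as text, not as numbers, which is why the ids 11 and 22 fall between 1 and 3 while 33 does not.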
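The range check can be sketched in plain Java as a comparison of term text, with [] including both bounds and {} excluding them. This is an illustration under the assumption that terms are ordered like String.compareTo orders them; TermRangeSketch and inRange are hypothetical names, not Lucene API.

```java
public class TermRangeSketch {

    // inclusive = true models [lower TO upper], false models {lower TO upper}.
    public static boolean inRange(String term, String lower, String upper,
            boolean inclusive) {
        int lo = term.compareTo(lower);
        int hi = term.compareTo(upper);
        return inclusive ? (lo >= 0 && hi <= 0) : (lo > 0 && hi < 0);
    }

    public static void main(String[] args) {
        // The sample ids against the range 1 TO 3
        System.out.println(inRange("11", "1", "3", true));  // true
        System.out.println(inRange("22", "1", "3", true));  // true
        System.out.println(inRange("33", "1", "3", true));  // false, "33" sorts after "3"
        // A term equal to a bound shows the difference between [] and {}
        System.out.println(inRange("3", "1", "3", true));   // true
        System.out.println(inRange("3", "1", "3", false));  // false
    }
}
```

None of the sample ids equals a bound, so for this index both bracket styles would be expected to return the same two documents.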
5. A query can also be parsed as a PhraseQuery using QueryParser
1) We want to fetch the document whose gender contains exactly the phrase "Male aaa Female". Putting the words in double quotes makes QueryParser treat them as one phrase.
@Test
public void testSearchByQueryParser()
{
    testBuildIndex();
    // Create the QueryParser; 'gender' is the default field
    QueryParser parser = new QueryParser(Version.LUCENE_35, "gender",
            new SimpleAnalyzer(Version.LUCENE_35));
    try
    {
        // Search for documents whose 'gender' contains the phrase 'Male aaa Female'
        Query query = parser.parse("gender: \"Male aaa Female\"");
        searcherUtil.searchByQueryParser(query, 100);
    } catch (ParseException e)
    {
        e.printStackTrace();
    }
}
id = 11, name = Davy, password = Jones, gender = Male aaa Female, score = 100
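What distinguishes a quoted phrase from a list of terms is that the words must appear next to each other and in the same order. A minimal plain-Java sketch of that rule follows; PhraseSketch and containsPhrase are made-up names, and the token lists are what SimpleAnalyzer would be expected to produce for the sample values.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class PhraseSketch {

    // True when the phrase tokens occur consecutively, in order, inside the document tokens.
    public static boolean containsPhrase(List<String> docTokens, List<String> phrase) {
        return Collections.indexOfSubList(docTokens, phrase) >= 0;
    }

    public static void main(String[] args) {
        List<String> doc11 = Arrays.asList("male", "aaa", "female");
        List<String> doc22 = Arrays.asList("male", "bbb", "female");
        List<String> doc33 = Arrays.asList("male", "female");

        List<String> phrase = Arrays.asList("male", "aaa", "female");
        System.out.println(containsPhrase(doc11, phrase)); // true
        System.out.println(containsPhrase(doc22, phrase)); // false

        // "male female" as a phrase needs the two words to be adjacent
        List<String> shortPhrase = Arrays.asList("male", "female");
        System.out.println(containsPhrase(doc11, shortPhrase)); // false
        System.out.println(containsPhrase(doc33, shortPhrase)); // true
    }
}
```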
6. A query cannot be parsed as a NumericRangeQuery by the default QueryParser
@Test
public void testSearchByQueryParser()
{
    testBuildIndex();
    QueryParser parser = new QueryParser(Version.LUCENE_35, "gender",
            new SimpleAnalyzer(Version.LUCENE_35));
    try
    {
        // 'score' was indexed as a NumericField, but this parses as a text range
        Query query = parser.parse("score: [100 TO 130]");
        searcherUtil.searchByQueryParser(query, 100);
    } catch (ParseException e)
    {
        e.printStackTrace();
    }
}
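The result set is empty: QueryParser builds a text range query here, and that does not match the encoded terms that NumericField writes to the index.
To make this work, we have to create a custom query parser that extends QueryParser.
This will be introduced in detail in the next few chapters.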
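Part of the reason numbers need their own query type is that a text range compares strings character by character, and that ordering disagrees with numeric ordering. The plain-Java sketch below shows the disagreement on decimal strings; it does not reproduce NumericField's actual term encoding, and the class and method names are invented for the illustration.

```java
public class NumericVsTextRangeSketch {

    // Inclusive range check on the text of the values, as a text range would do.
    public static boolean inTextRange(String value, String lower, String upper) {
        return value.compareTo(lower) >= 0 && value.compareTo(upper) <= 0;
    }

    // Inclusive range check on the numeric values.
    public static boolean inNumericRange(int value, int lower, int upper) {
        return value >= lower && value <= upper;
    }

    public static void main(String[] args) {
        // 110 is inside the range under both orderings
        System.out.println(inTextRange("110", "100", "130"));  // true
        System.out.println(inNumericRange(110, 100, 130));     // true
        // 12 and 1000 are inside the text range but outside the numeric one
        System.out.println(inTextRange("12", "100", "130"));   // true
        System.out.println(inNumericRange(12, 100, 130));      // false
        System.out.println(inTextRange("1000", "100", "130")); // true
        System.out.println(inNumericRange(1000, 100, 130));    // false
    }
}
```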
Summary:
1) QueryParser can be used to parse a query String written in Lucene's query syntax.
2) Depending on its content, the query String can be parsed into a TermQuery, TermRangeQuery, WildcardQuery, PhraseQuery, BooleanQuery, etc.
3) The query String cannot be parsed into a NumericRangeQuery using the QueryParser provided by Lucene.
4) There are various rules for writing the query String, and they decide which kind of Query it is parsed into.
Reference Links:
1) http://lucene.apache.org/core/old_versioned_docs/versions/2_9_1/queryparsersyntax.html describes all the rules for QueryParser.