See Also
You are here: Full-Text Indexing > Create full-text index
ContentsIndexHome
PreviousUpNext
Create full-text index
Create full-text index

 

Syntax:

CREATE FREETEXT INDEX index_name
(column_definition [, ...]);

 

column_definition syntax :

column_name type_name
[COMMENT {'constant' | ('constant')}]
[STORE|UNSTORE]
[UNINDEX]
[KEYWORD]

 

type_name syntax :

TINYINT | SMALLINT | INT | INTEGER | BIGINT | DOUBLE | FLOAT | DATETIME | CHAR | VARCHAR | TEXT

 

Creates a new Lucene full-text index. index_name specify the name of the index in database. To specify the database use ‘USE db’ command before CREATE statement or set DB name in index name: database.index_name

 

Column Definition 

Each column must have at least a name and a type listed above. 

STORE 

If specified and a column is not TEXT type, the value of the column will be indexed and stored in full-text index. We not recommend to store all columns in index. Usually, to store unique fields, used uniquely identify table’s row, for example primary key field(s). 

UNSTORE 

If specified, index the column values and allow search, sort. However, the full-text search results contain NULL values for the field, since they was not stored. 

UNINDEX 

The field will be stored in full-text index, but not indexed, therefore not searchable. Usually, we store unique table’s fields, for instance primary key fields. The fields have no interest in full-text search query, but they are needed to join the free text results with the linked DB table. Read more in <a>Querying full-text</a> section. 

KEYWORD 

When present, the value will not be tokenized / split. All not character fields is auto set to use KEYWORD. The string ‘http://www.server.com’ will only be searchable by entire string, www, http result in nothing. 

Sorting and range queries can only be applied on fields that has been specified with KEYWORD. 

 

Type conversion 

All data in Lucene internally stored as text. Therefore, the data types: TINYINT | SMALLINT | INT | INTEGER | BIGINT | DOUBLE | FLOAT | DATETIME will be converted into a text representation that can be ordered alphabetically. Numbers will be padded with the leading zeros, so that the numbers has the same width and sort order is the same as the converted number’s order. DATETIME are converted using: YYYYMMDD format. 

 

Setting up “stop words” 

Stop words allows skipping from indexing special words that is meaningless for the search output. Such words are: “and, or, is, the,” or just a single character. By default, full-text index uses English stop words. Developers can customize stop words. Custom stop words have to be enlisted in stopwords.txt file, where each word must be separated by newline. Copy the file to full-text index folder. The folder location can be obtained from “system.systables” table’s column “path”. 

You must to re-index Lucene index after changing stop words. Although, it is not necessary to do so, if full-text index is empty. How to re-index the index, please read in <a>Rebuilding Index</a> section. 

An example:  

use debate
create freetext index debate_search
(
postid int unindex,
threadid int unindex unstore,
dateline varchar unstore,
title varchar unstore,
pagetext text unstore
)