Class:  Ferret::Analysis::RegExpAnalyzer
In:     ext/r_analysis.c
Parent: Ferret::Analysis::Analyzer
Using a RegExpAnalyzer is a simple way to create a custom analyzer. If implemented in Ruby, it would look like this:

    class RegExpAnalyzer
      def initialize(reg_exp, lower = true)
        @lower = lower
        @reg_exp = reg_exp
      end

      def token_stream(field, str)
        if @lower
          return LowerCaseFilter.new(RegExpTokenizer.new(str, @reg_exp))
        else
          return RegExpTokenizer.new(str, @reg_exp)
        end
      end
    end
csv_analyzer = RegExpAnalyzer.new(/[^,]+/, false)
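For illustration, the csv_analyzer above tokenizes on runs of non-comma characters without lowercasing. A minimal pure-Ruby sketch of the equivalent tokenization, using String#scan as a stand-in for Ferret's RegExpTokenizer (the csv_tokens helper is hypothetical, not part of Ferret):

```ruby
# Hypothetical helper mimicking what csv_analyzer's token stream would
# emit for one line of CSV input; String#scan stands in for the tokenizer.
def csv_tokens(str, lower: false)
  tokens = str.scan(/[^,]+/)            # same pattern as csv_analyzer
  lower ? tokens.map(&:downcase) : tokens
end

csv_tokens("One,TWO,three")               # => ["One", "TWO", "three"]
csv_tokens("One,TWO,three", lower: true)  # => ["one", "two", "three"]
```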
Create a new RegExpAnalyzer which will create tokenizers based on the given regular expression, lowercasing the tokens if required.
reg_exp: | the token matcher for the tokenizer to use |
lower: | set to false if you don't want to downcase the tokens |
Create a new TokenStream to tokenize input. The TokenStream created may also depend on the field_name, although this parameter is typically ignored.
field_name: | name of the field to be tokenized |
input: | data from the field to be tokenized |
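A token stream is normally consumed by calling next until it returns nil. The sketch below illustrates that contract with a hypothetical ArrayTokenStream standing in for the stream token_stream would return; it is self-contained and does not require Ferret:

```ruby
# Hypothetical stand-in for the TokenStream returned by token_stream:
# each call to next yields the next token, or nil when exhausted.
class ArrayTokenStream
  def initialize(tokens)
    @tokens = tokens.dup
  end

  def next
    @tokens.shift
  end
end

stream = ArrayTokenStream.new(%w[red green blue])
while token = stream.next
  puts token          # prints "red", "green", "blue" on separate lines
end
```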