drupal - Ideographic space in Solr query


I have an issue that Solr doesn't seem able to cope with...

When searching for "マルチェロ ブラック" (with a normal space between the words) I get the expected results (15 of them). When searching for "マルチェロ　ブラック" (which has an ideographic space, \u3000, between the words instead of a normal one) I get no results.
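To make the difference between the two queries concrete, here is a minimal Python sketch (not part of the original setup) that lists the code point of the separator in each query. It also shows one possible client-side workaround, under the assumption that the query string can be rewritten before it is sent to Solr. The helper names `separator_codepoints` and `normalize_spaces` are hypothetical.

```python
IDEOGRAPHIC_SPACE = "\u3000"


def separator_codepoints(query):
    """Return the code points of all whitespace characters in the query."""
    return ["U+%04X" % ord(ch) for ch in query if ch.isspace()]


def normalize_spaces(query):
    """Hypothetical helper: replace ideographic spaces with ASCII spaces."""
    return query.replace(IDEOGRAPHIC_SPACE, " ")


# The two queries from the question, with the separator written as an escape
# so the difference is visible in the source.
ascii_query = "マルチェロ\u0020ブラック"
ideographic_query = "マルチェロ\u3000ブラック"

print(separator_codepoints(ascii_query))
print(separator_codepoints(ideographic_query))
print(normalize_spaces(ideographic_query) == ascii_query)
```

Normalizing on the client only sidesteps the problem for queries; it does not change how Solr analyzes the field.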

My fieldType configuration is pretty basic:

<fieldType name="text_cjk" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.CJKTokenizerFactory"/>
  </analyzer>
</fieldType>

I've tried adding

<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-japanese.txt"/>

with a mapping like

"\u3000" => "\u0020" 

or even

"\u3000" => " " 

but that didn't help.
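As a sanity check on what such a mapping rule is meant to do, the following Python sketch parses a rule in the `"source" => "target"` form shown above and applies it to the query text. It is only a simplified model for illustration, not Solr's actual implementation; the functions `unescape`, `parse_rules`, and `apply_rules` are hypothetical names.

```python
import re

# Matches a rule line of the form "source" => "target".
RULE = re.compile(r'^\s*"(.*)"\s*=>\s*"(.*)"\s*$')


def unescape(text):
    """Decode \\uXXXX escapes as they are written in the mapping file."""
    return re.sub(
        r"\\u([0-9A-Fa-f]{4})",
        lambda m: chr(int(m.group(1), 16)),
        text,
    )


def parse_rules(lines):
    """Build a source-to-target dictionary from mapping-file lines."""
    rules = {}
    for line in lines:
        match = RULE.match(line)
        if match:
            rules[unescape(match.group(1))] = unescape(match.group(2))
    return rules


def apply_rules(text, rules):
    """Replace every occurrence of each source string with its target."""
    for source, target in rules.items():
        text = text.replace(source, target)
    return text


# The rule from the question, as it would appear in mapping-japanese.txt.
rules = parse_rules(['"\\u3000" => "\\u0020"'])
mapped = apply_rules("マルチェロ\u3000ブラック", rules)
print(mapped == "マルチェロ\u0020ブラック")
```

In this model the rule itself turns the second query into the first, so the rule syntax looks reasonable. That may suggest the problem lies in whether and where the charFilter is applied rather than in the mapping, though this is an assumption and not something the sketch can confirm.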

I also tried adding

<filter class="solr.PositionFilterFactory" />

as suggested in "Language Analysis: Chinese, Japanese, Korean", but then I started getting 200+ results for the first search and 1000+ results for the second. No good either.

I'm running Solr version 3.5, so using CJKBigramFilterFactory is out of the question. (Just saying; I have no idea whether it would help anyhow.)

I've read quite a lot of Japanese blogs on Solr configuration (thanks, Google Chrome, for making that easy!). The examples use CJKBigramFilterFactory and LowerCaseFilterFactory, but nothing seems to help in my case.

Any ideas what else I could try to make this work?

We use Basis Tech's Rosette for Lucene & Solr, but it is not free.

