java - Regex not to give overflow error


Sample code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Regex {
    public static void main(String[] args) {
        String data = "Shyam , you. You are 2.3 km away from home. Lakshmi , you. Ram , you. You are Mike. ";
        // Match each sentence that contains "are"; dots inside tokens such as "2.3" do not end a sentence.
        Pattern pattern = Pattern.compile(
                "\\s*((?:[^\\.]|(?:\\w+\\.)+\\w)*are.*?)(?:\\.\\s|\\.$)",
                Pattern.DOTALL);
        Matcher matcher = pattern.matcher(data);
        while (matcher.find()) {
            System.out.println(matcher.group(0));
        }
    }
}

Output:

You are 2.3 km away from home.
You are Mike.

I am getting the expected output when I execute the code above. The problem is that when I test the same regex against a larger string, it throws an overflow error. I searched for this and came to know that alternation inside a repetition, such as (a|b)*, in a regular expression causes this problem. Is there a way to solve this issue? Please help.

I have tried to refactor your regex to avoid backtracking. You can try out this regex:

Pattern pattern = Pattern.compile(
        "(?>[^.]|(?:\\w+\\.)+\\w)+\\sare\\s.*?(?>\\.\\s|\\.$)",
        Pattern.DOTALL);
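As a quick check, the refactored pattern can be run against a short input. The following is only a minimal sketch: the class name `AtomicRegexDemo`, the helper `findAreSentences`, and the sample string are assumptions made for illustration (the string is modelled on the question's data). It only checks the matches on a small sample and does not by itself show how the pattern behaves on very large inputs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AtomicRegexDemo {
    // The refactored pattern from the answer, compiled once and reused.
    private static final Pattern ARE_SENTENCE = Pattern.compile(
            "(?>[^.]|(?:\\w+\\.)+\\w)+\\sare\\s.*?(?>\\.\\s|\\.$)",
            Pattern.DOTALL);

    // Hypothetical helper: collects every match, trimmed of surrounding whitespace.
    static List<String> findAreSentences(String data) {
        List<String> sentences = new ArrayList<>();
        Matcher matcher = ARE_SENTENCE.matcher(data);
        while (matcher.find()) {
            sentences.add(matcher.group().trim());
        }
        return sentences;
    }

    public static void main(String[] args) {
        // Assumed sample input, modelled on the question's data string.
        String data = "Shyam , you. You are 2.3 km away from home. "
                + "Lakshmi , you. Ram , you. You are Mike. ";
        for (String sentence : findAreSentences(data)) {
            System.out.println(sentence);
        }
    }
}
```

On this sample the sketch should print the two sentences that contain "are". Note that the pattern now requires whitespace on both sides of "are", which is slightly stricter than the original regex.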

(?>group) is called atomic grouping.

As per: http://www.regular-expressions.info/atomic.html

Atomic Grouping

An atomic group is a group that, when the regex engine exits from it, automatically throws away all backtracking positions remembered by any tokens inside the group. Atomic groups are non-capturing. The syntax is (?>group).
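The effect of discarding backtracking positions is easiest to see by comparing an atomic group with an ordinary non-capturing group on the same input. The sketch below is illustrative only; the class name `AtomicVsNonAtomic` and the helper `matchesWhole` are hypothetical, and the patterns are small examples rather than the regex from the question.

```java
import java.util.regex.Pattern;

class AtomicVsNonAtomic {
    // Returns true when the whole input matches the regex.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Non-atomic: after "bc" is consumed and the trailing "c" fails,
        // the engine backtracks into the group and retries with "b".
        System.out.println(matchesWhole("a(?:bc|b)c", "abc"));   // true
        // Atomic: the alternative "b" is thrown away once the group exits,
        // so there is nothing left to retry when the trailing "c" fails.
        System.out.println(matchesWhole("a(?>bc|b)c", "abc"));   // false
        // Both forms accept "abcc", where the first alternative succeeds.
        System.out.println(matchesWhole("a(?>bc|b)c", "abcc"));  // true
    }
}
```

The trade-off is that an atomic group can reject input that the non-atomic version would accept, so the order of the alternatives inside it matters.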

