java - Regex not to give overflow error
Sample code:
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Regex {
        public static void main(String[] args) {
            String data = "shyam , you. 2.3 km away home. lakshmi , you. ram , you. mike. ";
            Pattern pattern = Pattern.compile(
                    "\\s*((?:[^\\.]|(?:\\w+\\.)+\\w)*are.*?)(?:\\.\\s|\\.$)",
                    Pattern.DOTALL);
            Matcher matcher = pattern.matcher(data);
            // Print every full match found in the input.
            while (matcher.find()) {
                System.out.println(matcher.group(0));
            }
        }
    }
Output:
you 2.3 km away home. mike.
I am getting the expected output on executing the code above. The problem is that when I test the same regex on a larger string, it shows an overflow error. I searched for this and came to know that alternation of the form (a|b)* in a regular expression causes the problem. Is there a way to solve this issue? Please help.
I have tried to refactor your regex to avoid backtracking. You can try out this regex:
    Pattern pattern = Pattern.compile("(?>[^.]|(?:\\w+\\.)+\\w)+\\sare\\s.*?(?>\\.\\s|\\.$)", Pattern.DOTALL);
(?>group) is called atomic grouping.
As per http://www.regular-expressions.info/atomic.html:
Atomic grouping
An atomic group is a group that, when the regex engine exits from it, automatically throws away all backtracking positions remembered by any tokens inside the group. Atomic groups are non-capturing. The syntax is (?>group).
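To make the effect of discarding backtracking positions concrete, here is a minimal sketch that compares an ordinary non-capturing group with an atomic group on a tiny input. The class name AtomicGroupDemo and the helper matchesWhole are made up for this illustration, and the patterns are deliberately simpler than the one in the question.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the question's code.
class AtomicGroupDemo {

    // Returns true only if the entire input matches the regex.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Non-atomic group: the group first takes "bc", the trailing "c" then
        // fails, so the engine backtracks into the group, tries "b" instead,
        // and the trailing "c" succeeds.
        System.out.println(matchesWhole("a(?:bc|b)c", "abc"));  // true

        // Atomic group: once "bc" has matched and the group is exited, the
        // alternative "b" is thrown away, so the trailing "c" cannot match.
        System.out.println(matchesWhole("a(?>bc|b)c", "abc"));  // false

        // The atomic version still matches when no backtracking is needed.
        System.out.println(matchesWhole("a(?>bc|b)c", "abcc")); // true
    }
}
```

The same trade-off applies to the refactored regex: an atomic group gives the engine fewer positions to return to, but it can also reject input that the non-atomic version would accept, so it is worth re-checking the results on representative data.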