Wednesday, April 2, 2014

Unescaped java not matching in regex matcher.find()






I have the following code, which matches the line "Match this:" and keeps only the text before it (the first sentence). However, Unicode characters sometimes get passed in with the text, and they cause backtracking problems in other, more complicated regexes. Escaping the text seems to alleviate the index-out-of-range exceptions during backtracking. However, now the regex isn't matching.


What I would like to know is why this regex isn't matching when the text is escaped. If you comment out the escapeJava/unescapeJava lines, everything works.



import java.util.regex.Matcher;
import java.util.regex.Pattern;

String text = "Keep this\n\n"
        + "Match this:\n\nDelete 📱 this";
// Escape non-ASCII and control characters before running the regexes
text = org.apache.commons.lang.StringEscapeUtils.escapeJava(text);
Pattern PATTERN = Pattern.compile("^Match this:$",
        Pattern.MULTILINE);
Matcher m = PATTERN.matcher(text);
if (m.find()) {
    // Keep everything before the match, minus trailing newlines
    text = text.substring(0, m.start()).replaceAll("[\\n]+$", "");
}
text = org.apache.commons.lang.StringEscapeUtils.unescapeJava(text);
System.out.println(text);
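
A likely explanation, sketched under assumptions: escapeJava replaces each real line feed with the two characters backslash and n, so the escaped text is one long line and the MULTILINE anchors ^ and $ no longer see a line boundary around "Match this:". The class below imitates the escaped text with a hand-written literal rather than calling StringEscapeUtils, so it runs without Commons Lang. The names EscapedMatchDemo, ESCAPED_AWARE and keepBefore are hypothetical, and the alternative pattern is only one possible workaround, not a confirmed fix for the original regexes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EscapedMatchDemo {
    // Same pattern as in the question: ^ and $ anchor at real line terminators.
    static final Pattern ORIGINAL =
            Pattern.compile("^Match this:$", Pattern.MULTILINE);

    // Hypothetical alternative for escaped text: treat the two-character
    // sequence backslash + 'n' as the line boundary. The leading group also
    // consumes the escaped newlines before the marker, so they are dropped.
    static final Pattern ESCAPED_AWARE =
            Pattern.compile("(?:^|(?:\\\\n)+)Match this:(?=\\\\n|$)");

    // Returns everything before the match, or the input unchanged if no match.
    static String keepBefore(Pattern p, String text) {
        Matcher m = p.matcher(text);
        return m.find() ? text.substring(0, m.start()) : text;
    }

    public static void main(String[] args) {
        // Raw text with real line feeds and the emoji as a surrogate pair.
        String raw = "Keep this\n\nMatch this:\n\nDelete \uD83D\uDCF1 this";
        // Assumed shape of the escapeJava output: literal backslash sequences.
        String escaped =
                "Keep this\\n\\nMatch this:\\n\\nDelete \\uD83D\\uDCF1 this";

        System.out.println(ORIGINAL.matcher(raw).find());       // true
        System.out.println(ORIGINAL.matcher(escaped).find());   // false
        System.out.println(keepBefore(ESCAPED_AWARE, escaped)); // Keep this
    }
}
```

Another option would be to run this particular match before escaping, since the pattern itself contains no Unicode, and escape only for the regexes that actually misbehave.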


asked 1 min ago

JaJ

2,019




