regex - How do you understand the output of re.pm when debug is turned on?
[root@ test]# perl -e 'use re "debug";"a" =~ /.*/';
Compiling REx `.*'
size 3 Got 28 bytes for offset annotations.
first at 2
   1: STAR(3)
   2:   REG_ANY(0)
   3: END(0)
anchored(MBOL) implicit minlen 0
Offsets: [3]
        2[1] 1[1] 3[0]
Matching REx ".*" against "a"
  Setting an EVAL scope, savestack=3
   0 <> <a>               |  1:  STAR
                           REG_ANY can match 1 times out of 2147483647...
  Setting an EVAL scope, savestack=3
   1 <a> <>               |  3:    END
Match successful!
Freeing REx: `".*"'
Can anyone interpret this?
The output has two important parts: pattern compilation and runtime matching.
The first part describes the nodes, of which there are three, in the compiled automaton.
STAR(n)
matches zero or more of the following node, then continues through node n.
REG_ANY
matches any character except newline (i.e., /./).
END
marks the end state of the automaton.
MBOL
matches beginning-of-line in multiline match mode, i.e., /^/m. It is there implicitly because of the .* at the beginning of the pattern. (Remember: regex quantifiers are greedy by default.)
The minimum length of a string that can match the pattern is zero, i.e., the empty string. (Remember: the *
quantifier always succeeds, because it accepts zero repetitions!)
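The offsets are printed as entries of the form
position[length]
one per node in node-number order, and link the nodes to the text of the regex in your program. In this case, .*
(nodes 2 and 1) begins at the first position in the pattern: the entry 1[1] for node 2 is the . at position 1, the entry 2[1] for node 1 is the * at position 2, and the end state (3[0]) is there implicitly, with zero length. Offsets are handy for regex debuggers, e.g., to highlight the subpattern that is attempting to match.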
Now that it's compiled, it can be matched, and the latter part traces execution. The Pragmas and Debugging section of the perlretut documentation explains the form of the lines that describe match progress:
Each step is of the form n <x> <y>, with <x> the part of the string matched and <y> the part not yet matched.
The match in question begins with no text consumed (0 <> <a>), .* matches a (1 <a> <>), and the pattern matches successfully.
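If you want to study the trace at leisure, note that the debug output goes to STDERR rather than STDOUT. A minimal sketch for capturing it follows; the variable name trace is just illustrative, and the exact layout of the output differs between Perl versions, so don't expect it to match the listing in the question character for character.

```shell
# Redirect STDERR into the command substitution so the trace can be saved or searched.
trace=$(perl -e 'use re "debug"; "a" =~ /.*/' 2>&1)

# Print the whole thing, compilation part first, then the match steps.
printf '%s\n' "$trace"
```

From there you can pipe the text through grep or a pager to pick out the n <x> <y> steps.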
The EVAL scope machinery is related to executable code in regexes, which you don't use.
The Debugging regular expressions section of the perldebguts documentation gives more background information, and, as always, use the source, Luke!