Python - regexp and txt file
I have a txt file and a regexp. The regexp seems to work, but the matches have excess characters at the tail.
reg = re.findall(r"source rpm: [ \t\n\r]*(.*?) \s", stdout, re.DOTALL|re.MULTILINE|re.IGNORECASE)
and in the output I get
liblqr-0.4.1-5.src.rpm size gwenhywfar-4.1.0-2.src.rpm size texlive-20110705-1.src.rpm size mandriva-theme-1.4.9-9.2.src.rpm size
or
['liblqr-0.4.1-5.src.rpm\nsize'] ['gwenhywfar-4.1.0-2.src.rpm\nsize'] ['texlive-20110705-1.src.rpm\nsize'] ['mandriva-theme-1.4.9-9.2.src.rpm\nsize']
Where does the trailing "\nsize" come from?
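You're doing a non-greedy match of . ('any character'), which with re.DOTALL includes newlines, until a space is met. In your pattern that means a literal space (' ') followed by \s. A newline is not a literal space, so the match runs past the end of the line, through "size", up to the first space after it. That's why removing the literal space from the regex makes it work: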
r"source rpm: [ \t\n\r]*(.*?)\s" (the literal ' ' that stood before \s is removed)
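
To see the difference, here is a minimal sketch that runs both patterns on the same text. The `sample` string is a made-up stand-in for the question's `stdout` (roughly what `rpm -qi` prints), so its exact layout is an assumption rather than the asker's real data.

```python
import re

# Hypothetical sample standing in for the question's `stdout`;
# the field layout is assumed, not taken from the original post.
sample = (
    "Group       : System/Libraries   Source RPM: liblqr-0.4.1-5.src.rpm\n"
    "Size        : 123456\n"
)

FLAGS = re.DOTALL | re.MULTILINE | re.IGNORECASE

# Original pattern: the lazy group can only stop at a literal space
# followed by another whitespace character, which first occurs after "Size".
broken = re.findall(r"source rpm: [ \t\n\r]*(.*?) \s", sample, FLAGS)

# Fixed pattern: the lazy group stops at the first whitespace of any kind,
# which here is the newline right after the file name.
fixed = re.findall(r"source rpm: [ \t\n\r]*(.*?)\s", sample, FLAGS)

print(broken)  # ['liblqr-0.4.1-5.src.rpm\nSize']
print(fixed)   # ['liblqr-0.4.1-5.src.rpm']
```

Since the file name itself contains no whitespace, capturing `(\S+)` instead of `(.*?)\s` would likely work as well and does not depend on re.DOTALL.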
python regex