[Pyparsing] parse challenge, at least for me :)

Discussion:

Werner

2014-05-23 11:30:54 UTC

Hi,

I would like to parse many .py files and check whether a comment line
like the following is present in them:

test = """# Tags: phoenix-port, unittest, documented, py3-port"""

A file might or might not have this comment line somewhere at the top,
and it might have one or more of the tags.

I would like to report the file name and which tags are present in it.
The purpose is a checklist of which modules have been converted or are done.

On the above test string I tried this, but it only reports the first tag.

import pyparsing as pp

allTags = pp.Literal("# Tags:") +\
    pp.Literal("phoenix-port").setResultsName('phoenix') |\
    pp.FollowedBy("unittest").setResultsName('test') |\
    pp.FollowedBy("py3-port").setResultsName('py3') |\
    pp.FollowedBy("documented").setResultsName('doc')

result = allTags.parseString(test)
print(result)
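
A likely cause of the single result is operator precedence: in Python `+` binds tighter than `|`, so the expression above groups as `("# Tags:" + "phoenix-port") | FollowedBy(...) | ...`, and `FollowedBy` is only a lookahead that consumes no text. The sketch below is one possible alternative, not the only one. The names `tag` and `tag_line` are made up for illustration, and it assumes the tags are always separated by commas.

```python
import pyparsing as pp

test = """# Tags: phoenix-port, unittest, documented, py3-port"""

# Any one of the four known tags (assumed to be the complete set).
tag = pp.oneOf("phoenix-port unittest documented py3-port")

# The "# Tags:" prefix is matched but dropped from the results;
# delimitedList collects the comma-separated tags that follow it.
tag_line = pp.Suppress("# Tags:") + pp.delimitedList(tag)

tags = tag_line.parseString(test).asList()
print(tags)  # ['phoenix-port', 'unittest', 'documented', 'py3-port']
```

Because the result is a plain list, a check such as `"unittest" in tags` replaces the separate results names.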

The other problem I have when using 'parseFile' is how to tell it to
ignore everything before and after the '# Tags:' line, or to ignore the
whole file if that line is not present.
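
For ignoring the surrounding text, one option is to scan for the pattern rather than parse the whole input: `searchString` looks for matches anywhere in a string and skips everything else. This is a minimal sketch under the same assumptions as the tag list above; `find_tags` and `sample` are hypothetical names, and the file contents would be read into a string first.

```python
import pyparsing as pp

tag = pp.oneOf("phoenix-port unittest documented py3-port")
tag_line = pp.Suppress("# Tags:") + pp.delimitedList(tag)


def find_tags(source):
    """Return the tags of the first '# Tags:' line in source, or []."""
    # searchString returns one entry per match found anywhere in the text.
    matches = tag_line.searchString(source)
    if len(matches) == 0:
        return []
    return matches[0].asList()


sample = "import os\n# Tags: unittest, py3-port\nprint('hello')\n"
print(find_tags(sample))
print(find_tags("import os\n"))
```

A file without a '# Tags:' line simply yields an empty list, so no special case is needed for it.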

Hopefully someone can push me in the right direction.

Werner

Werner

2014-05-24 12:43:09 UTC

Hi,

I made a bit of progress. The following works for my test string, but
not yet when I parse files.

tagStart = pp.Literal("# Tags:").setDebug()
otherStuff = pp.lineStart + pp.restOfLine

def aTagLineAction(s, l, t):
    return 'test'

aTagLine = tagStart + pp.restOfLine
aTagLine.setParseAction(aTagLineAction)

allLines = pp.OneOrMore(otherStuff | aTagLine)

result = allLines.parseString(test)
print(result)
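
One possible reason the tag branch never fires: `|` builds a `MatchFirst`, which tries its alternatives in order, and a generic "any line" pattern listed first also matches a tag line. The sketch below only demonstrates that ordering effect on a single line. It leaves out `lineStart` and `OneOrMore` on purpose, and the names `any_line` and `tag_line` are illustrative.

```python
import pyparsing as pp

line = "# Tags: unittest"

any_line = pp.restOfLine
tag_line = pp.Literal("# Tags:") + pp.restOfLine

# Generic pattern first: it swallows the whole line, tag_line is never tried.
first = (any_line | tag_line).parseString(line).asList()

# Specific pattern first: the "# Tags:" prefix is recognised.
second = (tag_line | any_line).parseString(line).asList()

print(first)
print(second)
```

Putting the more specific alternative first, or scanning only for the tag line, avoids the generic pattern taking every line.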

Werner

Werner

2014-05-24 15:04:11 UTC

Hi Diez,

> Hi,
> for that problem (if it's that "simple", meaning one line, strict conventions) I wouldn't bother using pyparsing.
> Just readline, and string-methods.

Good point, I will do that, but I am still intrigued why the code in my
last post does not handle the non-matching lines correctly. I would
like them simply to be ignored.
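
The readline-and-string-methods route suggested in the quoted reply might look roughly like this. It is a sketch, not a tested tool: the function names, the `KNOWN_TAGS` set and the 20-line limit for "somewhere at the top" are all assumptions.

```python
KNOWN_TAGS = {"phoenix-port", "unittest", "documented", "py3-port"}
PREFIX = "# Tags:"


def tags_in_lines(lines, max_lines=20):
    """Return the known tags from the first '# Tags:' line, or []."""
    for number, line in enumerate(lines):
        if number >= max_lines:  # assumed limit for "at the top"
            break
        stripped = line.strip()
        if stripped.startswith(PREFIX):
            found = [t.strip() for t in stripped[len(PREFIX):].split(",")]
            # Keep only recognised tags, in the order they were written.
            return [t for t in found if t in KNOWN_TAGS]
    return []


def tags_in_file(path):
    """Read a file lazily and report its tags."""
    with open(path) as handle:
        return tags_in_lines(handle)


print(tags_in_lines(["import os\n", "# Tags: phoenix-port, unittest\n"]))
```

Lines that do not start with the prefix are skipped, so non-matching lines need no handling of their own.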

Werner