Eric S. Johansson
2012-09-18 02:57:35 UTC
Refining the testing process a little more, I've come up with some simple test cases that represent actual usage. I'm still not grokking the given example. Heck, I'm having trouble even writing a BNF description. Funny how when you design for human speech, you get something that is hard to parse. :-)
How does a parser like this handle recursion? For example:
"test_9": "[nested_text [one text] some [not [very ] plain] text]",
I expect it to walk depth first and, on the way back up, call into my code so I can do "stuff". I expect something like the following calls, in this sequence:
call arg name=one, parent="nested_text", text="text"
call found_plain_text, text="some"
call arg name=very, parent="not", text="plain"
call keyword name=not
call found_plain_text, text="text"
call keyword name=nested_text
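To pin down the call order I have in mind, here is a minimal hand-rolled sketch in plain Python, independent of pyparsing. The names `walk`, `on_element` and `on_text` are made up for illustration, and it assumes whitespace-separated words and a backslash escaping the next character. It is not how pyparsing works internally, just the depth-first "call me on the way back up" behaviour spelled out.

```python
def walk(src, on_element, on_text):
    """Parse bracketed elements depth first, firing callbacks on the way back up."""
    pos = 0

    def word():
        # Read one run of non-space, non-bracket characters.
        nonlocal pos
        start = pos
        while pos < len(src) and not src[pos].isspace() and src[pos] not in "[]":
            pos += 2 if src[pos] == "\\" else 1   # a backslash escapes the next character
        return src[start:pos]

    def element(parent):
        nonlocal pos
        pos += 1                                  # consume '['
        name = word()
        body(name)                                # children are handled first
        if pos >= len(src):
            raise ValueError("missing ']' for element %r" % name)
        pos += 1                                  # consume ']'
        on_element(name, parent)                  # fires only after all children

    def body(parent):
        nonlocal pos
        while pos < len(src) and src[pos] != "]":
            if src[pos] == "[":
                element(parent)
            elif src[pos].isspace():
                pos += 1
            else:
                on_text(word(), parent)

    body(None)


events = []
walk("[nested_text [one text] some [not [very ] plain] text]",
     on_element=lambda name, parent: events.append(("element", name, parent)),
     on_text=lambda text, parent: events.append(("text", text, parent)))
for event in events:
    print(event)
```

The element callback for `nested_text` comes last because every child has to finish before its parent's closing bracket is reached.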
Anyway, here are my latest test cases and results. I'm really lost here; the docs are not helping. I need a mentor chat.
from pyparsing import *
all_tests = {
"test_1": "some plain text",
"test_2": "[simple ]",
"test_3": "[simple_text some plain text]",
"test_4": "[onearg [one ]]",
"test_5": "[twoarg [one ] [two ]]",
"test_6": "[onearg_text [one some plain text]]",
"test_7": "[twoarg_text [one ] [two some plain text arg]]",
"test_8": "[nested_text some [not plain] text]",
"test_9": "[nested_text [one text] some [not [very ] plain] text]",
"test_10": "[nested_text_escaped [one text] some [not [very ] plain] bracketed \[text\]]",
"test_11": """[nested_text_escaped_indented
[one text] some
[not
[very ]
plain
]
bracked \[text\]
]""",
}
LBRACK,RBRACK = map(Suppress,'[]')
escapedChar = Combine('\\' + oneOf(list(printables)))
keyword = Word(alphas,alphanums).setName("keyword").setDebug()
argword = Word(alphas,alphanums).setName("argword").setDebug()
arg = Forward()
dss = Forward()
text = ZeroOrMore(escapedChar | originalTextFor(OneOrMore(Word(printables,excludeChars='[]"\'\\'))) | quotedString | dss)
arg << Group(LBRACK + argword("arg") + Group(text)("text") + RBRACK)
arg.setName("arg").setDebug()
dss << Group(LBRACK + keyword("keyword") + Group(ZeroOrMore(arg))("args") +
Group(text)("text") + RBRACK)
parser = ZeroOrMore(dss)
# Run every test case and print the named results found at the top level.
for i, j in all_tests.items():
    print("------ %s ---------" % i)
    test = parser.parseString(j)
    print("keyword is: %s" % test.keyword)
    print("arg is: %s" % test.arg)
    print("text is: %s" % test.text)
------------------------- results ---------------------------
------ test_11 ---------
Match keyword at loc 1(1,2)
Matched keyword -> ['nested']
Match arg at loc 7(1,8)
Exception raised:Expected "[" (at char 7), (line:1, col:8)
Match keyword at loc 60(2,30)
Matched keyword -> ['one']
Match arg at loc 64(2,34)
Exception raised:Expected "[" (at char 64), (line:2, col:34)
Match keyword at loc 105(3,30)
Matched keyword -> ['not']
Match arg at loc 145(4,36)
Match argword at loc 146(4,37)
Matched argword -> ['very']
Matched arg -> [['very', []]]
Match arg at loc 152(4,43)
Exception raised:Expected "[" (at char 189), (line:5, col:36)
keyword is:
arg is:
text is:
------ test_10 ---------
Match keyword at loc 1(1,2)
Matched keyword -> ['nested']
Match arg at loc 7(1,8)
Exception raised:Expected "[" (at char 7), (line:1, col:8)
Match keyword at loc 22(1,23)
Matched keyword -> ['one']
Match arg at loc 26(1,27)
Exception raised:Expected "[" (at char 26), (line:1, col:27)
Match keyword at loc 38(1,39)
Matched keyword -> ['not']
Match arg at loc 42(1,43)
Match argword at loc 43(1,44)
Matched argword -> ['very']
Matched arg -> [['very', []]]
Match arg at loc 49(1,50)
Exception raised:Expected "[" (at char 50), (line:1, col:51)
keyword is:
arg is:
text is:
------ test_7 ---------
Match keyword at loc 1(1,2)
Matched keyword -> ['twoarg']
Match arg at loc 7(1,8)
Exception raised:Expected "[" (at char 7), (line:1, col:8)
Match keyword at loc 14(1,15)
Matched keyword -> ['one']
Match arg at loc 18(1,19)
Exception raised:Expected "[" (at char 18), (line:1, col:19)
Match keyword at loc 21(1,22)
Matched keyword -> ['two']
Match arg at loc 25(1,26)
Exception raised:Expected "[" (at char 25), (line:1, col:26)
keyword is:
arg is:
text is:
------ test_6 ---------
Match keyword at loc 1(1,2)
Matched keyword -> ['onearg']
Match arg at loc 7(1,8)
Exception raised:Expected "[" (at char 7), (line:1, col:8)
Match keyword at loc 14(1,15)
Matched keyword -> ['one']
Match arg at loc 18(1,19)
Exception raised:Expected "[" (at char 18), (line:1, col:19)
keyword is:
arg is:
text is:
------ test_5 ---------
Match keyword at loc 1(1,2)
Matched keyword -> ['twoarg']
Match arg at loc 8(1,9)
Match argword at loc 9(1,10)
Matched argword -> ['one']
Matched arg -> [['one', []]]
Match arg at loc 14(1,15)
Match argword at loc 16(1,17)
Matched argword -> ['two']
Matched arg -> [['two', []]]
Match arg at loc 21(1,22)
Exception raised:Expected "[" (at char 21), (line:1, col:22)
keyword is:
arg is:
text is:
------ test_4 ---------
Match keyword at loc 1(1,2)
Matched keyword -> ['onearg']
Match arg at loc 8(1,9)
Match argword at loc 9(1,10)
Matched argword -> ['one']
Matched arg -> [['one', []]]
Match arg at loc 14(1,15)
Exception raised:Expected "[" (at char 14), (line:1, col:15)
keyword is:
arg is:
text is:
------ test_3 ---------
Match keyword at loc 1(1,2)
Matched keyword -> ['simple']
Match arg at loc 7(1,8)
Exception raised:Expected "[" (at char 7), (line:1, col:8)
keyword is:
arg is:
text is:
------ test_2 ---------
Match keyword at loc 1(1,2)
Matched keyword -> ['simple']
Match arg at loc 8(1,9)
Exception raised:Expected "[" (at char 8), (line:1, col:9)
keyword is:
arg is:
text is:
------ test_1 ---------
keyword is:
arg is:
text is:
------ test_9 ---------
Match keyword at loc 1(1,2)
Matched keyword -> ['nested']
Match arg at loc 7(1,8)
Exception raised:Expected "[" (at char 7), (line:1, col:8)
Match keyword at loc 14(1,15)
Matched keyword -> ['one']
Match arg at loc 18(1,19)
Exception raised:Expected "[" (at char 18), (line:1, col:19)
Match keyword at loc 30(1,31)
Matched keyword -> ['not']
Match arg at loc 34(1,35)
Match argword at loc 35(1,36)
Matched argword -> ['very']
Matched arg -> [['very', []]]
Match arg at loc 41(1,42)
Exception raised:Expected "[" (at char 42), (line:1, col:43)
keyword is:
arg is:
text is:
------ test_8 ---------
Match keyword at loc 1(1,2)
Matched keyword -> ['nested']
Match arg at loc 7(1,8)
Exception raised:Expected "[" (at char 7), (line:1, col:8)
Match keyword at loc 19(1,20)
Matched keyword -> ['not']
Match arg at loc 23(1,24)
Exception raised:Expected "[" (at char 23), (line:1, col:24)
keyword is:
arg is:
text is:
[Finished in 0.1s]
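Two things stand out in the trace. The keyword always matches as 'nested' or 'onearg', never 'nested_text' or 'onearg_text', since `Word(alphas,alphanums)` stops at the underscore. And the named results are set inside a `Group`, which is one likely reason the top-level `test.keyword` comes back empty. Below is a cut-down sketch, not a fix for the full grammar: it merges keyword and arg into one recursive `element`, allows underscores in names (an assumption on my part), and hangs a parse action on the `Forward` to record the order in which elements complete. The names `element`, `plain`, `tag`, `body` and `on_element` are illustrative only.

```python
from pyparsing import (Forward, Group, Suppress, Word, ZeroOrMore,
                       alphanums, alphas, printables)

calls = []  # records the order in which the parse action fires

LBRACK, RBRACK = map(Suppress, "[]")
# Assumption: element names may contain underscores.
name = Word(alphas, alphanums + "_")
# Plain words are any printable run without brackets or backslashes.
plain = Word(printables, excludeChars="[]\\")

element = Forward()
element <<= Group(LBRACK + name("tag")
                  + Group(ZeroOrMore(element | plain))("body") + RBRACK)


def on_element(tokens):
    # tokens[0] is the Group for this element; by the time this runs,
    # every nested element inside it has already been matched.
    calls.append(tokens[0]["tag"])


element.setParseAction(on_element)

result = element.parseString(
    "[nested_text [one text] some [not [very ] plain] text]")
print(calls)              # names in the order their elements completed
print(result[0]["tag"])   # named results live inside the group, hence [0]
```

If this behaves the way I read the docs, the innermost elements are recorded first and `nested_text` last, which is the "on the way back" order described above.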