[Pyparsing] Dealing with blocks in different order

Discussion:

mlist @dslextreme.com

2015-04-20 19:10:27 UTC

I have defined everything I need to parse a file, but I am running into a
problem. The main part of the file consists of different blocks (see
below), and the blocks can appear in any order or be missing altogether. I
am not sure how to deal with this. For example, it could be

[video block]
[audio block]

or

[audio block]
[video block]

or

[video block]
[audio block]
[audio block]

or any permutation of the above.

Here is the sample text:

Track ID 1 soun (Audio) Enabled Not self-contained
Format soun/aac 48000 Hz aac FormatFlags: 0x00000000 Bytes/Pkt: 0
Frames/Pkt: 1024 Bytes/Frame: 0 Chan/Frame: 2 Bits/Chan: 0 Reserved:
0x00000000
ChannelLayout: Stereo (L R)
Media Timescale: 48000 Duration: 240640/48000 00:00:05.013
MinSampleDuration: 1024/48000 AdvanceDecodeDelta: 0/48000
00:00:00.000
Num data bytes: 80213 Est. data rate: 127.999 kbps Nominal
framerate: 46.875 fps 235 samples
Track volume: 1
Included in auto selection. Language code <und>
Dimensions: 0 x 0 Track Matrix: 1.0 0.0 0.0 / 0.0 1.0 0.0 / 0.0 0.0 1.0
1 edit: Media start 0/48000 00:00:00.000 dur 3000/600
00:00:05.000 Track start 0/600 00:00:00.000 dur 3000/600
00:00:05.000

Track ID 2 vide (Video) Enabled Not self-contained
Format vide/avc1 dimensions: video 1920 x 1080, presentation: 1920
x 1080 (pixelAspect+clean), cleanAperture: 1920 x 1080 @ 0,0
(originTopLeft)
Media Timescale: 600 Duration: 3003/600 00:00:05.005
MinSampleDuration: 20/600 AdvanceDecodeDelta: 21/600 00:00:00.035
Num data bytes: 6555892 Est. data rate: 10.479 Mbps Nominal
framerate: 29.970 fps 150 samples
Frame Reordering Required
Included in auto selection. Language code <und>
Dimensions: 1920 x 1080 CleanAperture: 1920 x 1080
ProductionAperture: 1920 x 1080 EncodedPixels: 1920 x 1080 Track
Matrix: 1.0 0.0 0.0 / 0.0 1.0 0.0 / 0.0 0.0 1.0
1 edit: Media start 0/600 00:00:00.000 dur 3000/600
00:00:05.000 Track start 0/600 00:00:00.000 dur 3000/600
00:00:05.000

And here is what my code looks like:

# Audio Block
self.audio_track_info = Group(
    self.track_id + self.audio_track_format + self.channel_layout +
    self.media_timescale +
    self.track_data + self.audio_track_volume + self.included +
    self.audio_track_dimensions +
    OneOrMore(self.edits).setResultsName('edits')).setResultsName('audio_track')

# Video Block
self.video_track_info = Group(
    self.track_id + self.video_track_format +
    self.media_timescale + self.track_data +
    self.included + self.video_track_dimensions +
    OneOrMore(self.edits).setResultsName('edits')).setResultsName('video_track')

self.tracks = (Group(ZeroOrMore(self.video_track_info)) +
               Group(ZeroOrMore(self.audio_track_info)).setResultsName('tracks'))

When I call parseString on the sample text, the result contains only the
audio block and not the video block.

How do I fix this?

john grant

2015-04-21 05:41:08 UTC

I've been stuck on the same thing. I think I know the answer, but I have not had time to verify it. If you find one of the examples for parsing a C structure, I think it must contain the secret, because that parser works no matter the order of the structure members. If my memory is correct, that example has parsers defined for each type of member (e.g. single variable, array, pointer), and then there is a single parser that ORs each of the other parsers together, and the matches get collected by a repetition container (i.e. OneOrMore).

Let me know if you find a fix!

FYI: I've had great support on the wikispaces site. Paul has answered many of my questions. However, finding that support channel was insanely difficult. You have to go to the home page, click Getting Help in the left side menu, and then, in the body of the page that appears, click the hyperlinked text "pyparsing home page" (which is poorly named).
-John

Hans Meine

2015-04-21 06:58:47 UTC

Hi,

I've been stuck on the same thing. I think I know the answer, … that example has parsers defined for each type of member (e.g. single variable, array, pointer, etc), and then there is a single parser that ORs each of the other parsers together, and the matches get inserted into a container (i.e. OneOrMore).

Exactly that is how I would express it:

OneOrMore(Video | Audio)

Or is there anything I am missing?

Best regards,
Hans

Paul McGuire

2015-04-21 06:59:00 UTC

I agree with John's suggestion - define an overall grammar that is simply OneOrMore(video_block | audio_block), and then use that to parse the mixed listing of blocks.
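
A minimal sketch of that suggestion follows. The block grammars here are heavily simplified, hypothetical stand-ins for the real track definitions (the names audio_block and video_block, the 'kind' results name, and the one-line-per-track sample are assumptions made for illustration). The point is only the final expression: ZeroOrMore(video) + ZeroOrMore(audio) accepts videos followed by audios and nothing else, whereas OneOrMore(video | audio) accepts any interleaving.

```python
from pyparsing import Group, Literal, OneOrMore, Suppress, Word, nums

integer = Word(nums)

# Hypothetical, heavily simplified stand-ins for the real track grammars.
audio_block = Group(
    Suppress("Track ID") + integer("track_id") + Literal("soun")("kind")
    + Suppress("Track volume:") + integer("volume")
)
video_block = Group(
    Suppress("Track ID") + integer("track_id") + Literal("vide")("kind")
    + Suppress("Dimensions:") + integer("width") + Suppress("x") + integer("height")
)

# One or more blocks of either kind, in any order.
tracks = OneOrMore(video_block | audio_block)

sample = """
Track ID 1 soun Track volume: 1
Track ID 2 vide Dimensions: 1920 x 1080
Track ID 3 soun Track volume: 1
"""

# parseAll=True makes the parse fail loudly if any block is left unconsumed.
result = tracks.parseString(sample, parseAll=True)
kinds = [track["kind"] for track in result]
ids = [track["track_id"] for track in result]
for track in result:
    print(track["track_id"], track["kind"])
```

Each alternative is wrapped in its own Group, so every block comes back as a separate sub-result and can be told apart by the fields it contains. Passing parseAll=True is a choice made for this sketch; without it, parseString stops quietly at the first text it cannot match.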

The other way in pyparsing to treat a grammar like "A, B, C and D, in any order" is the Each construct, created using the '&' operator. So "A & B & C & D" will match the 4 elements in any order, but all 4 must be present. If some are optional, mark them with Optional, as in "A & B & Optional(C) & D".

Finally, if some items might appear more than once, use ZeroOrMore or OneOrMore, as in "OneOrMore(A) & B & ZeroOrMore(C) & D". This last expression will match the repeated elements even if they are not all together, so AABADBACC would match, as would DABAAA. ABC would *not* match: both B and D are required, and D is missing.

But for your particular case, I think just OneOrMore(A | B) should be sufficient for any combination of A's and B's.
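
The '&' form with an Optional element can be tried out with a small sketch like the one below. The single-letter literals and the matches() helper are assumptions made for illustration; the repeated forms (OneOrMore / ZeroOrMore inside an Each) are deliberately left out of this sketch.

```python
from pyparsing import Literal, Optional, ParseException

A, B, C, D = (Literal(ch) for ch in "ABCD")

# A, B and D are required, C is optional; the order is free.
any_order = A & B & Optional(C) & D


def matches(expr, text):
    """Hypothetical helper: True if expr matches the whole of text."""
    try:
        expr.parseString(text, parseAll=True)
        return True
    except ParseException:
        return False


print(matches(any_order, "ABCD"))  # True: all four present
print(matches(any_order, "DCBA"))  # True: order does not matter
print(matches(any_order, "BDA"))   # True: C is optional
print(matches(any_order, "ABC"))   # False: required D is missing
```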

I'm glad to see some of the other pyparsing folks stepping up to answer some questions, here and on StackOverflow. And thanks, John, for your kind comments on the help you get on the wikispaces site. As it turns out, between work and family activities for the next month or so, my participation on these lists will be limited, so I appreciate other experienced pyparsing users helping the new folks.

Cheers,
-- Paul
