Python Stateful Parsing
Peter Wad Sackett


Stateful parsing – a text file example

Imagine a file with this content:

    I am a data file.
    Some of my lines don't matter to you.
    Other lines contain data you want to extract, like DNA sequence.
    These data span several lines.
    The lines with the data can be identified by a specific pattern.
    Green
    ATCGTCGATGCATCGATCATCGTGATAGCTACGT
    ACTACGTCATGCTCTGTCGTACGCATGATAGCTCGTACGTCG
    GTAGACCGCTACGATGCACCACACAGCGCGAATACTAGC
    Red
    As can be seen the sequence is between the green and the red line.
    Some number data
    12 3.34 54 7.22
    End of file

DTU Health Tech, Technical University of Denmark


Stateful parsing – method

Stateful parsing is reading a file line by line with the intention of collecting some data from the file by observing a state. The state is checked by a flag (a variable) which is either True or False.

The flag is initially False, denoting that we have not yet seen the green line, which tells us that the data starts now. When we see/pass the green line, the flag is set to True; it is now time to collect the data. When we pass the red line, the flag is set back to False, indicating no more data.

Here is some pseudo code for the process:

    (data, flag) = ('', False)
    for line in file:
        if line is green:
            flag = True
        if flag is True:
            data += line
        if line is red:
            flag = False


Stateful parsing – real Python code

Real code example:

    # statements opening a file
    (data, seqflag) = ('', False)
    for line in infile:
        # line[:-1] strips the trailing newline before comparing
        if line[:-1] == "Red":
            seqflag = False
        if seqflag:
            data += line[:-1]
        if line[:-1] == "Green":
            seqflag = True
    # flag still True here means the red line was never seen
    if seqflag:
        raise ValueError("Format is not right. Can't trust result")
    # statements closing file

By changing the order of the three ifs, the green line and/or the red line will be considered part of the collected data or not. With the order shown here, neither line is collected: the red check switches the flag off before the collecting step, and the green check switches it on only after it.
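The effect of the if-order can be sketched with two small functions run on a shortened version of the example file. This is an illustration under assumptions, not code from the slides: the names `parse_exclusive`, `parse_inclusive` and `SAMPLE` are made up here, `io.StringIO` stands in for an opened file, and `rstrip("\n")` replaces `line[:-1]` so that a last line without a newline is not cut short.

```python
import io

# Shortened, hypothetical stand-in for the example file.
SAMPLE = """I am a data file.
Green
ATCG
GGTA
Red
End of file
"""


def parse_exclusive(infile):
    """Order red, collect, green: neither marker line is collected."""
    data, seqflag = "", False
    for line in infile:
        text = line.rstrip("\n")
        if text == "Red":
            seqflag = False
        if seqflag:
            data += text
        if text == "Green":
            seqflag = True
    if seqflag:  # the red line was never seen
        raise ValueError("Format is not right. Can't trust result")
    return data


def parse_inclusive(infile):
    """Order green, collect, red: both marker lines are collected."""
    data, seqflag = "", False
    for line in infile:
        text = line.rstrip("\n")
        if text == "Green":
            seqflag = True
        if seqflag:
            data += text
        if text == "Red":
            seqflag = False
    if seqflag:  # the red line was never seen
        raise ValueError("Format is not right. Can't trust result")
    return data


print(parse_exclusive(io.StringIO(SAMPLE)))  # ATCGGGTA
print(parse_inclusive(io.StringIO(SAMPLE)))  # GreenATCGGGTARed
```

Mixing the two orders (for example green, red, collect) would keep the green line but drop the red one.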