Lecture Notes 21: File IO
      
        
      
      
      What are files?
       
      
        Files are collections of values saved under a known address. 
        
        Internally, they are just 0s and 1s, but they can be interpreted as different things, like ASCII characters in a text file or just as 0s and 1s in a binary file. 
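        A small sketch of that idea: the same bytes on disk can be read back as text or as raw bytes, depending on how the file is opened. The file name demo.txt and the temporary directory are assumptions made only for this illustration.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory (not part of the lecture files).
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

f = open(path, "w")
f.write("Hi!")
f.close()

# Text mode ('r', the default): the bytes are decoded into a str.
f = open(path, "r")
as_text = f.read()
f.close()

# Binary mode ('rb'): the raw bytes come back undecoded.
f = open(path, "rb")
as_bytes = f.read()
f.close()

print(as_text)         # Hi!
print(as_bytes)        # b'Hi!'
print(list(as_bytes))  # [72, 105, 33]
```

        The last line shows the numeric value stored for each character; the text view is just one interpretation of those numbers.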
        
        When we want to interact with a file, we create a file object by providing the file's path.      
        
        
        
        A file object (more on objects later) has the file's information and allows the user to make use of a series of object methods to interact with that data.
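        As a rough illustration of the "information" a file object carries, the sketch below prints a few of its attributes. The scratch directory is an assumption for the demo; in class you would simply use a file in your project.

```python
import os
import tempfile

# Hypothetical scratch location so the demo does not touch your own files.
path = os.path.join(tempfile.mkdtemp(), "readme.txt")
open(path, "w").close()  # create an empty file to open below

my_file = open(path)
print(my_file.name)    # the path we passed to open
print(my_file.mode)    # r
print(my_file.closed)  # False
my_file.close()
print(my_file.closed)  # True
```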
      
      
      Working with files
       
      
        
Opening and Closing
        The first step is to create a file object by using the open function:
        
my_file = open('readme.txt') 
 
          
          There are many ways to open a file (parameters one can pass to the function), but we will only look at some basic ones:
          
          The second parameter is a string of one or more "flags" that indicate how exactly we wish to open the file.
          
          
          
          
          
          The defaults are 'r' and 't', which correspond to opening the file for reading in text mode.
          
          
          Some things to keep in mind are: 
          
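            - if you want to write to the file, use the 'w' flag, or add the '+' flag to another mode (for example 'r+') to allow both reading and writing
 
            - if you don't want to overwrite the file and just want to append (add) to it, use the 'a' flag
 
            - the 'x' flag creates a new file and fails if a file with that name already exists, which is useful when you must not replace an existing file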
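          A minimal sketch of the 'x' flag in action. The file name only_once.txt and the temporary directory are made up for this demo.

```python
import os
import tempfile

# Hypothetical scratch file; the name is only for illustration.
path = os.path.join(tempfile.mkdtemp(), "only_once.txt")

# First attempt: the file does not exist yet, so 'x' creates it.
f = open(path, "x")
f.write("first version\n")
f.close()

# Second attempt: the file now exists, so 'x' refuses to open it.
try:
    f = open(path, "x")
    f.close()
    second_attempt = "created"
except FileExistsError:
    second_attempt = "refused"

print(second_attempt)  # refused
```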
 
          
          
          There are certain rules for what can be read/written, or created/overwritten, depending on what flags were selected:
          
  
    | Mode | Description | Allow read? | Allow write? | Create missing file? | Overwrite file? |
    | r | Open the file for reading. | Yes | No | No | No |
    | w | Open the file for writing. If file does not exist then the file is created. Contents of an existing file are overwritten. | No | Yes | Yes | Yes |
    | a | Open the file for appending. If file does not exist then the file is created. Writes are added to end of existing file contents. | No | Yes | Yes | No |
  
          
          
          For a deeper look at the open function, read the open documentation.
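          The rows of the table above can be checked with a short experiment. This is only a sketch: the file name modes.txt and the temporary directory are assumptions for the demo.

```python
import io
import os
import tempfile

# Hypothetical scratch file that does not exist yet.
path = os.path.join(tempfile.mkdtemp(), "modes.txt")

# 'r' does not create a missing file.
try:
    open(path, "r")
    r_result = "opened"
except FileNotFoundError:
    r_result = "FileNotFoundError"

# 'w' creates the missing file.
f = open(path, "w")
f.write("one\n")
f.close()

# 'w' again: the old contents are overwritten.
f = open(path, "w")
f.write("two\n")
f.close()

# 'a': the new text is added after the existing contents.
f = open(path, "a")
f.write("three\n")
f.close()

f = open(path)
contents = f.read()
f.close()

# A file opened with 'a' (or 'w') does not allow reading.
f = open(path, "a")
try:
    f.read()
    read_result = "read ok"
except io.UnsupportedOperation:
    read_result = "UnsupportedOperation"
f.close()

print(r_result)     # FileNotFoundError
print(contents)     # two, then three, each on its own line
print(read_result)  # UnsupportedOperation
```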
          
          Reading
          
          Activity 1 [2 minutes]: 
          
            Let's print the contents of a file.
            
            
              - First, open the scratchpad in replit
 
              - Then create a new text file called example.txt
 
              - Copy-paste the following into example.txt:
                
                
                  
                    There are only 10 kinds of people in this world: 
                    those who know binary and those who don’t.
                  
                
               
              - move to main.py and write the following code:
                
                
                  
file_object_handle = open("example.txt")
print('Reading file example.txt.')
contents = file_object_handle.read()  # read file text into a string
print('Closing file example.txt.')
file_object_handle.close()  # close the file
print('\nContents of example.txt:\n', contents)
 
                  
                 
               
            
           
          
          That program reads the whole file into the string contents, which might be useful.
          
          However, one of the most common ways of dealing with structured data is to read a file line by line or between important markers (like commas).
          
          To read line by line, one can use the readlines method, which returns a list of the lines in the file.
          
          
Activity 2 [2 minutes]: 
          
            Comment out the last block of code and try this one out:
            
            Predict what is going to be printed before you run it!
            
          
# Read file contents
print ('Reading in data...')
f = open('example.txt')
lines = f.readlines()
f.close()
# Iterate over each line
for ln in lines:
    print("### {} ###".format(ln))
 
            
            Did it do what you expected?
            
            Why?
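            One possible explanation, sketched below: each string returned by readlines still ends with its newline character, so it gets printed in the middle of the format string. The two-line sample file here is made up for the demo; rstrip is one of several ways to drop the trailing newline.

```python
import os
import tempfile

# Hypothetical two-line sample file in a scratch directory.
path = os.path.join(tempfile.mkdtemp(), "example.txt")
f = open(path, "w")
f.write("first line\nsecond line\n")
f.close()

f = open(path)
lines = f.readlines()
f.close()

# Each element keeps its trailing newline character.
print(lines)  # ['first line\n', 'second line\n']

# Removing the newline before formatting keeps each marker on one line.
for ln in lines:
    print("### {} ###".format(ln.rstrip("\n")))
```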
           
          
          Alternatively, you can iterate through the file object directly, which gives you one line at a time
          
          
Activity 3 [2 minutes]: 
          
            Comment out the last block of code and try this one out:
            
            Predict what is going to be printed before you run it!
            
f = open('example.txt')
for line in f:
    print(line, end="")
f.close()
 
            
            Did it do what you expected?
            
            Why?
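            A small variation on the loop above, as a sketch: because the file object hands back one line per iteration, it combines naturally with enumerate to number the lines. The three-line sample file is an assumption for the demo.

```python
import os
import tempfile

# Hypothetical three-line sample file in a scratch directory.
path = os.path.join(tempfile.mkdtemp(), "example.txt")
f = open(path, "w")
f.write("alpha\nbeta\ngamma\n")
f.close()

f = open(path)
line_count = 0
# The file object yields one line (newline included) per loop iteration.
for number, line in enumerate(f, start=1):
    print(number, line, end="")
    line_count = number
f.close()

print("total lines:", line_count)  # total lines: 3
```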
           
          
          Writing
          
          Activity 4 [2 minutes]: 
          
            Let's modify our code to see what happens when we try to write to our example.txt file like this:
            
            Predict what is going to be printed before you run it!
            
f = open("example.txt", "w")
print('Opening file example.txt for writing')
f.write('Joke #2: \nThe generation of random numbers is too important to be left to chance.')  # Write string
f.close()  # Close the file
f = open('example.txt')
contents = f.read()
print(contents)
f.close()
 
            
            Did it do what you expected?
            
            Why?
           
          
          Appending!
          Let's try adding our first joke back in:
          
          
Activity 5 [2 minutes]: 
          
            We will use a different flag this time: 'a'
            
            Predict what is going to be printed before you run it!
            
f = open("example.txt", "a")
print('Opening file example.txt for appending')
f.write('Joke #1: \nThere are only 10 kinds of people in this world:\nthose who know binary and those who don’t.')  # Write string
f.close()  # Close the file
f = open('example.txt')
contents = f.read()
print(contents)
f.close()
 
            
            Did it do what you expected?
            
            Why?
            
            How can we fix these errors? (let's go one by one)
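            One possible fix, sketched under the assumption that the problem you saw is the two jokes running together on a single line: write does not add a newline for you, so each string needs to end with one. The scratch directory is an assumption for the demo, and the apostrophe is typed as a plain ASCII character here.

```python
import os
import tempfile

# Hypothetical scratch copy of example.txt so your real file is untouched.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Every write ends with "\n" so the next write starts on a fresh line.
f = open(path, "w")
f.write("Joke #2:\nThe generation of random numbers is too important to be left to chance.\n")
f.close()

f = open(path, "a")
f.write("Joke #1:\nThere are only 10 kinds of people in this world:\nthose who know binary and those who don't.\n")
f.close()

f = open(path)
lines = f.readlines()
f.close()

print(len(lines))  # 5
for ln in lines:
    print(ln, end="")
```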
           
        
        The with statement
        
        The with statement encloses a file operation and takes care of closing the file for you, even if an error occurs inside the block.
        
        Example:
        
# Open the file for reading and writing ('r+')
with open('example.txt', 'r+') as f:
  # Read all lines into a list
  lines = f.readlines()
print ("lines:\n",lines)
print()
print ("BEFORE:")
for idx in range(len(lines)):
  lines[idx] = lines[idx].replace("\n", "")
  print(lines[idx])
  
print()
# move lines 2-4 to the front, followed by lines 0-1 (assumes the file has 5 lines)
new_lines = []
new_lines.append(lines[2]+"\n")
new_lines.append(lines[3]+"\n")
new_lines.append(lines[4]+"\n")
new_lines.append(lines[0]+"\n")
new_lines.append(lines[1]+"\n")
print ("new lines:\n",new_lines)
print()
# and write everything back
with open('example.txt', 'w+') as f:
  f.writelines( new_lines )
# Reopen the file and read the lines back
with open('example.txt', 'r+') as f:
  # Read all lines into a list
  lines = f.readlines()
print ("AFTER:")
for li in lines:
  li = li.replace("\n", "")
  print(li)
  
print()
# No need to call f.close() - f closed automatically 
print('Closed example.txt')
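        A shorter sketch of the same behaviour: the file is closed when the with block ends, including when the block is interrupted by an error. The file name greeting.txt, the scratch directory, and the simulated error are assumptions for the demo.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "greeting.txt")

with open(path, "w") as f:
    f.write("hello\n")
print(f.closed)  # True: closed as soon as the block ended

try:
    with open(path) as f:
        first = f.readline()
        # Simulated failure in the middle of processing the file.
        raise ValueError("simulated problem while processing")
except ValueError:
    pass

print(first, end="")  # hello
print(f.closed)       # True: closed even though the block raised an error
```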
 
      
      
      Parsing: Practice with Files and Debugging
      
        In the remaining time, let's work on combining File I/O and String Parsing.
        
        First, open This Link
        
        Now, check out smith-map.
        
        OSM map structure
        
        An OSM map is a collection of nodes (red) and a series of paths made up of sequences of these nodes.
            
            
            
        
        
          - the OSM file format is similar to that of HTML or XML, which follow a tree structure to describe things
            
            something like: 
            
            top level: way 
               |
               |
               | --- middle level: node 
               |
               | --- middle level: node 
                      | 
                      | --- low level: tag
            
            would look like this:
            
            <way>
            
              <node/>
            
              <node>
            
                <tag/>
            
              </node>
            
            </way>
            
            
            
            (More details here)
              
                - 
                  
                    - simple elements open with "<" and close with "/>" in a single tag
 
                    - complex elements start with an opening tag, like "<node>", 
                    and end with a closing tag, like "</node>" 
                  
                 
                - every map has many points called nodes
 
                - a node can be simple or complex
                  
                  
                    - a simple node has all its attributes in a single line
 
                    - a complex node has tag elements
 
                  
                 
                - A way is a complex element composed of many nodes (a path is just a connection of points)
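              The simple/complex distinction above can be detected with basic string methods. The sketch below is only a warm-up, not a solution to the activities that follow: the miniature map, its attribute values, and the file name tiny-map.osm are all made up, and it assumes each element sits on its own line.

```python
import os
import tempfile

# Made-up miniature map, written to a scratch file so we can practice file I/O.
sample = (
    '<osm>\n'
    '  <node id="1" lat="42.31" lon="-72.64"/>\n'
    '  <node id="2" lat="42.32" lon="-72.63">\n'
    '    <tag k="name" v="Main Gate"/>\n'
    '  </node>\n'
    '</osm>\n'
)
path = os.path.join(tempfile.mkdtemp(), "tiny-map.osm")
with open(path, "w") as f:
    f.write(sample)

simple_nodes = 0
complex_nodes = 0
with open(path) as f:
    for line in f:
        text = line.strip()  # drop indentation and the trailing newline
        if text.startswith("<node"):
            if text.endswith("/>"):
                simple_nodes += 1   # opened and closed in a single tag
            else:
                complex_nodes += 1  # closed later by </node>

print("simple nodes:", simple_nodes)    # simple nodes: 1
print("complex nodes:", complex_nodes)  # complex nodes: 1
```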
 
              
            
             
           
        
        
        Activity 6 [2 minutes]: 
        
          Come up with an algorithm to count the number of nodes that have a single tag inside them.
          
          You must make use of file I/O and string processing
          
          (Only design the high-level steps using English)
          
        
        
        Activity 7 [2 minutes]: 
        
          Expand the algorithm to count the number of nodes that have a single tag inside them.
          
          Transform your English-language steps into a flow diagram that uses file I/O and string processing
          
        
        
        Activity 8 [2 minutes]: 
        
          Write the code to count the number of nodes that have a single tag inside them.
          
          Transform your flow diagram into code that uses file I/O and string processing
          
        
        
        
      
      
      Additional readings
      
        Check out this article on working with files
      
      
        
      
      Homework
      [Due for everyone]  
      
      I am extending the due date for HW06 until this Thursday at 5 PM
      
      
[Optional]
      
      ZyBooks Sections 6.12, 6.13, 6.14