CSC 111

Introduction to Computer Science Through Programming

Smith Computer Science



Lecture Notes 21: File IO





What are files?

Files are collections of values saved under a known address.

Internally, they are just 0s and 1s, but those bits can be interpreted in different ways: as characters (for example, ASCII) in a text file, or as raw bytes in a binary file.
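To make the "same bits, different interpretation" idea concrete, here is a minimal sketch. The scratch file name bits_demo.txt is made up for illustration; the sketch writes three ASCII characters and then reads them back once as text and once as raw bytes.

```python
import os
import tempfile

# Hypothetical scratch file; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "bits_demo.txt")

with open(path, "w", encoding="ascii") as f:
    f.write("Hi!")

# Text mode ('t', the default): the bytes are decoded into a string.
with open(path, "r", encoding="ascii") as f:
    as_text = f.read()

# Binary mode ('b'): the same bytes come back undecoded.
with open(path, "rb") as f:
    as_bytes = f.read()

print(as_text)                                # Hi!
print(list(as_bytes))                         # [72, 105, 33]
print([format(b, "08b") for b in as_bytes])   # the underlying 0s and 1s
```

The file on disk never changes between the two reads; only the mode passed to open decides how its contents are presented.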

When we want to interact with a file, we create a file object by providing the file's path.



A file object (more on objects later) has the file's information and allows the user to make use of a series of object methods to interact with that data.




Working with files

Opening and Closing

The first step is to open a file object by using the open function:

my_file = open('readme.txt') 


There are many ways to open a file (parameters one can pass to the function), but we will only look at some basic ones:

The second parameter can be a series of "flags" to indicate how exactly we wish to open the file:







The defaults are 'r' and 't', which correspond to opening a text file for reading.

Keep in mind that the flags you select determine what can be read or written, and whether a file is created or overwritten:

Mode | Description | Allow read? | Allow write? | Create missing file? | Overwrite file?
r | Open the file for reading. | Yes | No | No | No
w | Open the file for writing. If the file does not exist, it is created. Contents of an existing file are overwritten. | No | Yes | Yes | Yes
a | Open the file for appending. If the file does not exist, it is created. Writes are added to the end of the existing file contents. | No | Yes | Yes | No
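The rules in the table can be checked with a short experiment. This is only a sketch: modes_demo.txt is a made-up scratch file in a temporary folder, so nothing in your project is touched.

```python
import io
import os
import tempfile

# Hypothetical scratch path; the file does not exist yet.
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

# 'r' never creates a missing file.
try:
    open(path, "r")
    missing_ok = True
except FileNotFoundError:
    missing_ok = False

# 'w' creates the file (and would erase it if it already existed).
with open(path, "w") as f:
    f.write("first\n")

# 'a' keeps the existing contents and writes at the end.
with open(path, "a") as f:
    f.write("second\n")
    try:
        f.read()              # 'a' does not allow reading
        can_read = True
    except io.UnsupportedOperation:
        can_read = False

with open(path, "r") as f:
    after_append = f.read()

# 'w' on an existing file discards what was there.
with open(path, "w") as f:
    f.write("third\n")

with open(path, "r") as f:
    after_rewrite = f.read()

print(missing_ok, can_read)   # False False
print(repr(after_append))     # 'first\nsecond\n'
print(repr(after_rewrite))    # 'third\n'
```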


For a deeper look at the open function, read the open Documentation

Reading



Activity 1 [2 minutes]:
Let's print the contents of a file.

  1. First, open the scratchpad in replit
  2. Then create a new text file called example.txt
  3. Copy-paste the following into example.txt:
    There are only 10 kinds of people in this world:
    those who know binary and those who don’t.
  4. Move to main.py and write the following code:
    file_object_handle = open("example.txt")
    print('Reading file example.txt.')
    contents = file_object_handle.read()  # read file text into a string
    
    print('Closing file example.txt.')
    file_object_handle.close()  # close the file
    
    print('\nContents of example.txt:\n', contents)
    


That program reads the whole file into the string contents, which is useful when you need all of the text at once.

However, one of the most common ways of dealing with structured data is to read a file line by line, or to split it at important markers (like commas).
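As a small illustration of splitting at markers, the sketch below reads a one-line file and breaks it apart at the commas. The file colors_demo.txt and its contents are invented for this example.

```python
import os
import tempfile

# Hypothetical scratch file containing comma-separated values.
path = os.path.join(tempfile.mkdtemp(), "colors_demo.txt")
f = open(path, "w")
f.write("red,green,blue")
f.close()

# Read the whole file, then split the string at each comma.
f = open(path)
contents = f.read()
f.close()

fields = contents.split(",")
print(fields)   # ['red', 'green', 'blue']
```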

To read line by line, one can use the file object's readlines method, which returns a list of the lines in the file.

Activity 2 [2 minutes]:
Comment out the last block of code and try this one out:

Predict what is going to be printed before you run it!

# Read file contents
print ('Reading in data...')
f = open('example.txt')
lines = f.readlines()
f.close()

# Iterate over each line
for ln in lines:
    print("### {} ###".format(ln))


Did it do what you expected?

Why?


Alternatively, you can iterate directly over the file object, which gives you one line at a time:

Activity 3 [2 minutes]:
Comment out the last block of code and try this one out:

Predict what is going to be printed before you run it!

f = open('example.txt')

for line in f:
    print(line, end="")

f.close()


Did it do what you expected?

Why?


Writing



Activity 4 [2 minutes]:
Let's modify our code to see what happens when we try to write to our example.txt file like this:

Predict what is going to be printed before you run it!

f = open("example.txt", "w")
print('Opening file example.txt for writing')
f.write('Joke #2: \nThe generation of random numbers is too important to be left to chance.')  # Write string
f.close()  # Close the file

f = open('example.txt')
contents = f.read()
print(contents)
f.close()


Did it do what you expected?

Why?


Appending!

Let's try adding our first joke back in:

Activity 5 [2 minutes]:
We will use a different flag this time: a

Predict what is going to be printed before you run it!

f = open("example.txt", "a")
print('Opening file example.txt for appending')
f.write('Joke #1: \nThere are only 10 kinds of people in this world:\nthose who know binary and those who don’t.')  # Write string
f.close()  # Close the file

f = open('example.txt')
contents = f.read()
print(contents)
f.close()


Did it do what you expected?

Why?

How can we fix these errors? (let's go one by one)
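One possible fix, sketched under the assumption that the problem being discussed is the missing line break between the two jokes: write and append do not add a newline for you, so each string should end with its own "\n". The sketch uses a made-up scratch file, jokes_demo.txt, rather than example.txt.

```python
import os
import tempfile

# Hypothetical scratch file so that example.txt is left alone.
path = os.path.join(tempfile.mkdtemp(), "jokes_demo.txt")

# Every string ends with "\n" so the next write starts on a fresh line.
f = open(path, "w")
f.write("Joke #2:\nThe generation of random numbers is too important to be left to chance.\n")
f.close()

f = open(path, "a")
f.write("Joke #1:\nThere are only 10 kinds of people in this world:\nthose who know binary and those who don't.\n")
f.close()

f = open(path)
lines = f.readlines()
f.close()

print(len(lines))   # 5
for ln in lines:
    print(ln, end="")
```

With the newlines in place the file has five separate lines, and "Joke #1:" no longer runs into the end of the previous joke.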


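The with statement



The with statement encloses a block of file operations and takes care of closing the file for you when the block ends, even if an error occurs inside it.

Example (this assumes example.txt contains at least five lines):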

# Open the file for reading and writing ('r+')
with open('example.txt', 'r+') as f:
  # Read all lines into a list
  lines = f.readlines()


print ("lines:\n",lines)
print()

print ("BEFORE:")
for idx in range(len(lines)):
  lines[idx] = lines[idx].replace("\n", "")
  print(lines[idx])
  
print()

# move lines 2, 3, and 4 to the front, followed by lines 0 and 1
new_lines = []
new_lines.append(lines[2]+"\n")
new_lines.append(lines[3]+"\n")
new_lines.append(lines[4]+"\n")
new_lines.append(lines[0]+"\n")
new_lines.append(lines[1]+"\n")

print ("new lines:\n",new_lines)
print()


# and write everything back
with open('example.txt', 'w+') as f:
  f.writelines( new_lines )

# Open the file again for reading and writing ('r+')
with open('example.txt', 'r+') as f:
  # Read all lines into a list
  lines = f.readlines()

print ("AFTER:")
for li in lines:
  li = li.replace("\n", "")
  print(li)
  
print()

# No need to call f.close() - f closed automatically 
print('Closed example.txt')
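To see the automatic closing for yourself, a file object has a closed attribute you can inspect. The sketch below uses a made-up scratch file, with_demo.txt, and also shows a try/finally version that behaves roughly the same way as the with block.

```python
import os
import tempfile

# Hypothetical scratch file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")

with open(path, "w") as f:
    f.write("hello\n")
    inside = f.closed    # still open inside the block
after = f.closed         # closed as soon as the block ends

print(inside, after)     # False True

# Roughly what the with statement does for you:
g = open(path)
try:
    text = g.read()
finally:
    g.close()            # runs even if read() raises an error

print(repr(text))        # 'hello\n'
```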




Parsing: Practice with Files and Debugging

In the remaining time, let's work on combining File I/O and String Parsing.

First, open This Link

Now, open

Check out smith-map.



OSM map structure



An OSM map is a collection of nodes (red) and a series of paths made up of sequences of these nodes.







Activity 6 [2 minutes]:
Come up with an algorithm to count the number of nodes that have a single tag inside them.
You must make use of file I/O and string processing
(Only design the high-level steps, using English)


Activity 7 [2 minutes]:
Expand the algorithm to count the number of nodes that have a single tag inside them.
Transform your English-language steps into a flow diagram that uses file I/O and string processing


Activity 8 [2 minutes]:
Write the code to count the number of nodes that have a single tag inside them.
Transform your flow diagram into code that uses file I/O and string processing
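If you want something to compare your own solution against, here is one possible sketch, not the only valid approach. It rests on several assumptions: that the map is stored as OSM XML with one element per line, that "a single tag" means exactly one tag element inside a node, and that nodes without tags are written as self-closing elements. The sample data, its coordinates, and the file name sample_map.osm are invented for illustration and are not taken from smith-map.

```python
import os
import tempfile

# Invented sample data in the assumed OSM XML layout (one element per line).
sample = """<osm>
 <node id="1" lat="42.31" lon="-72.64"/>
 <node id="2" lat="42.32" lon="-72.63">
  <tag k="amenity" v="cafe"/>
 </node>
 <node id="3" lat="42.33" lon="-72.62">
  <tag k="highway" v="crossing"/>
  <tag k="crossing" v="zebra"/>
 </node>
 <way id="10">
  <nd ref="1"/>
  <nd ref="2"/>
  <tag k="highway" v="footway"/>
 </way>
</osm>
"""

path = os.path.join(tempfile.mkdtemp(), "sample_map.osm")  # hypothetical file
with open(path, "w") as f:
    f.write(sample)

single_tag_nodes = 0   # nodes seen so far with exactly one tag
in_node = False        # are we between <node ...> and </node>?
tag_count = 0          # tags seen inside the current node

with open(path) as f:
    for line in f:
        line = line.strip()
        if line.startswith("<node"):
            if line.endswith("/>"):
                continue           # self-closing node: it has no tags
            in_node = True
            tag_count = 0
        elif line.startswith("<tag") and in_node:
            tag_count += 1         # tags inside ways are ignored
        elif line.startswith("</node>"):
            if tag_count == 1:
                single_tag_nodes += 1
            in_node = False

print(single_tag_nodes)   # 1
```

In the sample, only node 2 qualifies: node 1 has no tags, node 3 has two, and the tag inside the way does not belong to any node.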




Additional readings

Check out this article on working with files




Homework

[Due for everyone]
I am extending the due date for HW06 until this Thursday at 5 PM.

[Optional]
ZyBooks Sections 6.12, 6.13, 6.14