Lecture Notes 21: File IO
What are files?
Files are collections of values saved under a known address.
Internally, they are just 0s and 1s, but they can be interpreted as different things, like ASCII characters in a text file or just as 0s and 1s in a binary file.
When we want to interact with a file, we create a file object by providing the file's path.
A file object (more on objects later) has the file's information and allows the user to make use of a series of object methods to interact with that data.
Working with files
Opening and Closing
The first step is to open a file object by using the open function:
my_file = open('readme.txt')
There are many ways to open a file (parameters one can pass to the function), but we will only look at some basic ones:
The second parameter can be a series of "flags" to indicate how exactly we wish to open the file:
The defaults are 'r' and 't', which correspond to opening the file for reading in text mode.
Some things to keep in mind are:
- if you want to write to the file, you'll need to use the 'w' flag or the '+' flag
- if you don't want to overwrite the file, and you just want to append (add) to it, use the 'a' flag
- the 'x' flag creates a new file and fails if the file already exists, which is useful when you want to avoid replacing a file that is already in the directory
There are certain rules for what can be read/written, or created/overwritten, depending on which flags were selected:
Mode | Description | Allow read? | Allow write? | Create missing file? | Overwrite file? |
r | Open the file for reading. | Yes | No | No | No |
w | Open the file for writing. If file does not exist then the file is created. Contents of an existing file are overwritten. | No | Yes | Yes | Yes |
a | Open the file for appending. If file does not exist then the file is created. Writes are added to end of existing file contents. | No | Yes | Yes | No |
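To see these rules in action, here is a minimal sketch that tries each mode on a throwaway file. The file name demo.txt and the temporary folder are assumptions made for the example, not part of the lecture's replit setup.

```python
import os
import tempfile

# Work in a throwaway folder so no real files are touched
# (demo.txt is a hypothetical file name).
folder = tempfile.mkdtemp()
path = os.path.join(folder, "demo.txt")

# 'r' does not create a missing file: it raises FileNotFoundError.
try:
    open(path, "r")
    missing_file_error = False
except FileNotFoundError:
    missing_file_error = True

# 'w' creates the file, and overwrites it if it already exists.
f = open(path, "w")
f.write("first\n")
f.close()

f = open(path, "w")
f.write("second\n")
f.close()

# 'a' keeps the existing contents and adds to the end.
f = open(path, "a")
f.write("third\n")
f.close()

f = open(path)  # defaults: 'r' and 't'
contents = f.read()
f.close()
print(contents)

# 'x' refuses to open a file that already exists.
try:
    open(path, "x")
    exists_error = False
except FileExistsError:
    exists_error = True

print(missing_file_error, exists_error)
```

Note that "first" never shows up in the printed contents: the second 'w' open replaced it before the 'a' open added "third".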
For a deeper look at the open function, read the open Documentation.
Reading
Activity 1 [2 minutes]:
Let's print the contents of a file.
- First, open the scratchpad in replit
- Then create a new text file called example.txt
- Copy-paste the following into example.txt:
There are only 10 kinds of people in this world:
those who know binary and those who don’t.
- move to main.py and write the following code:
file_object_handle = open("example.txt")
print('Reading file example.txt.')
contents = file_object_handle.read() # read file text into a string
print('Closing file example.txt.')
file_object_handle.close() # close the file
print('\nContents of example.txt:\n', contents)
That program reads the whole file into the string contents, which is useful when the file is small and you want all of its text at once.
However, one of the most common ways of dealing with structured data is to read a file line by line or between important markers (like commas).
To read line by line, one can use the readlines method, which returns a list of the lines within the file.
Activity 2 [2 minutes]:
Comment out the last block of code and try this one out:
Predict what is going to be printed before you run it!
# Read file contents
print('Reading in data...')
f = open('example.txt')
lines = f.readlines()
f.close()
# Iterate over each line
for ln in lines:
    print("### {} ###".format(ln))
Did it do what you expected?
Why?
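One possible explanation, sketched on a small made-up file: readlines keeps the newline character at the end of each line, so the closing "###" lands on the next line of output. The sample text and the temporary file are assumptions for illustration only.

```python
import os
import tempfile

# Build a small two-line file in a temporary folder (hypothetical sample text).
path = os.path.join(tempfile.mkdtemp(), "example.txt")
f = open(path, "w")
f.write("line one\nline two\n")
f.close()

f = open(path)
lines = f.readlines()
f.close()

# Each element still ends with its newline character.
print(lines)

# rstrip("\n") removes the trailing newline before formatting.
formatted = []
for ln in lines:
    formatted.append("### {} ###".format(ln.rstrip("\n")))
    print(formatted[-1])
```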
Alternatively, you can iterate through the file object directly, which gives you one line at a time.
Activity 3 [2 minutes]:
Comment out the last block of code and try this one out:
Predict what is going to be printed before you run it!
f = open('example.txt')
for line in f:
    print(line, end="")
f.close()
Did it do what you expected?
Why?
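The same loop also works for reading between markers such as commas. The sketch below assumes a hypothetical file scores.txt with comma-separated values; it strips the newline from each line and splits the rest on commas.

```python
import os
import tempfile

# Create a small comma-separated file (hypothetical data).
path = os.path.join(tempfile.mkdtemp(), "scores.txt")
f = open(path, "w")
f.write("ada,90,85\ngrace,70,95\n")
f.close()

rows = []
f = open(path)
for line in f:
    # strip() drops the trailing newline, split(",") cuts at each comma
    fields = line.strip().split(",")
    rows.append(fields)
f.close()

print(rows)
```

Every field comes back as a string, so numbers would still need int() or float() before doing arithmetic with them.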
Writing
Activity 4 [2 minutes]:
Let's modify our code to see what happens when we try to write to our example.txt file like this:
Predict what is going to be printed before you run it!
f = open("example.txt", "w")
print('Opening file example.txt for writing')
f.write('Joke #2: \nThe generation of random numbers is too important to be left to chance.') # Write string
f.close() # Close the file
f = open('example.txt')
contents = f.read()
print(contents)
f.close()
Did it do what you expected?
Why?
Appending!
Let's try adding our first joke back in:
Activity 5 [2 minutes]:
We will use a different flag this time: a
Predict what is going to be printed before you run it!
f = open("example.txt", "a")
print('Opening file example.txt for appending')
f.write('Joke #1: \nThere are only 10 kinds of people in this world:\nthose who know binary and those who don’t.') # Write string
f.close() # Close the file
f = open('example.txt')
contents = f.read()
print(contents)
f.close()
Did it do what you expected?
Why?
How can we fix these errors? (let's go one by one)
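One likely culprit is that write does not add a newline for you, so text appended later continues on the same line as the previous write. The sketch below reproduces this with shortened placeholder jokes in a temporary file (both are assumptions for the example) and then fixes it by ending each write with "\n".

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "example.txt")

# First attempt: neither write ends with a newline.
f = open(path, "w")
f.write("Joke #2: \nsecond joke text.")
f.close()
f = open(path, "a")
f.write("Joke #1: \nfirst joke text.")
f.close()

f = open(path)
run_together = f.read()
f.close()
print(run_together)  # "text.Joke #1:" ends up on a single line

# Fix: finish every write with "\n".
f = open(path, "w")
f.write("Joke #2: \nsecond joke text.\n")
f.close()
f = open(path, "a")
f.write("Joke #1: \nfirst joke text.\n")
f.close()

f = open(path)
fixed_lines = f.readlines()
f.close()
print(fixed_lines)
```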
The with statement
The with statement encloses a block of file operations and takes care of closing the file for you, even if an error occurs inside the block.
Example:
# Open the file for reading and writing ('r+')
with open('example.txt', 'r+') as f:
    # Read every line into a list
    lines = f.readlines()
print("lines:\n", lines)
print()
print("BEFORE:")
for idx in range(len(lines)):
    # Remove the newline at the end of each line
    lines[idx] = lines[idx].replace("\n", "")
    print(lines[idx])
print()
# Move lines 2, 3 and 4 ahead of lines 0 and 1
# (this assumes the file has exactly five lines)
new_lines = []
new_lines.append(lines[2]+"\n")
new_lines.append(lines[3]+"\n")
new_lines.append(lines[4]+"\n")
new_lines.append(lines[0]+"\n")
new_lines.append(lines[1]+"\n")
print("new lines:\n", new_lines)
print()
# and write everything back
with open('example.txt', 'w+') as f:
    f.writelines(new_lines)
# Reopen the file and read the reordered lines
with open('example.txt', 'r+') as f:
    lines = f.readlines()
print("AFTER:")
for li in lines:
    li = li.replace("\n", "")
    print(li)
print()
# No need to call f.close() - f closed automatically
print('Closed example.txt')
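To check that the file really is closed, you can look at the file object's closed attribute. The sketch below also shows a roughly equivalent try/finally version for comparison; the temporary file is an assumption for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "example.txt")

# with closes the file as soon as the indented block ends.
with open(path, "w") as f:
    f.write("hello\n")
closed_after_with = f.closed
print(closed_after_with)

# Roughly the same thing written by hand:
# finally runs whether or not read() raises an error.
g = open(path)
try:
    text = g.read()
finally:
    g.close()
closed_after_finally = g.closed
print(text, closed_after_finally)
```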
Parsing: Practice with Files and Debugging
In the remaining time, let's work on combining File I/O and String Parsing.
First, open This Link
Now, open
Check out smith-map.
OSM map structure
An OSM map is a collection of nodes (red) and a series of paths made up of sequences of these nodes.
- the OSM file format is similar to that of HTML or XML, which follow a tree structure to describe things
something like:
top level: way
|
|
| --- middle level: node
|
| --- middle level: node
|
| --- low level: tag
would look like this:
<way>
  <node />
  <node>
    <tag />
  </node>
</way>
(More details here)
- simple elements open with "<" and close with "/>" on the same line
- complex elements are started by an opening element, like "<node>", and closed by a matching closing element, like "</node>"
- every map has many points called nodes
- a node can be simple or complex
- a simple node has all its attributes in a single line
- a complex node has tag elements
- A way is a complex element composed of many nodes (a path is just a connection of points)
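As a rough illustration of the simple/complex distinction, the sketch below classifies node lines in a tiny hand-written sample. It assumes the standard XML convention that a single-line element ends with "/>", and the id, lat, lon, k and v values are made up for the example.

```python
# A tiny hand-written sample in the style of an OSM file (values are made up).
sample = """<node id="1" lat="42.3" lon="-72.6"/>
<node id="2" lat="42.4" lon="-72.7">
  <tag k="name" v="Library"/>
</node>"""

simple_nodes = 0
complex_nodes = 0
for line in sample.split("\n"):
    line = line.strip()
    if line.startswith("<node"):
        if line.endswith("/>"):
            # Everything is on one line: a simple node
            simple_nodes += 1
        else:
            # The node stays open until a later </node>: a complex node
            complex_nodes += 1

print(simple_nodes, complex_nodes)
```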
Activity 6 [2 minutes]:
Come up with an algorithm to count the number of nodes that have a single tag inside them.
You must make use of file I/O and string processing
(Only design the high-level steps using English)
Activity 7 [2 minutes]:
Expand the algorithm to count the number of nodes that have a single tag inside them.
Transform your English-language steps into a flow diagram that uses file I/O and string processing
Activity 8 [2 minutes]:
Write the code to count the number of nodes that have a single tag inside them.
Transform your flow diagram into code that uses file I/O and string processing
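One possible approach, sketched on a tiny made-up file rather than the real smith-map: read the file line by line, remember whether we are inside a complex node, count the tag lines seen, and check the count when the node closes. The file name tiny-map.osm and its contents are assumptions for the example, and the sketch assumes each element sits on its own line.

```python
import os
import tempfile

# Hypothetical miniature map: one simple node, one node with a single tag,
# and one node with two tags.
sample = """<node id="1" lat="42.3" lon="-72.6"/>
<node id="2" lat="42.4" lon="-72.7">
  <tag k="name" v="Library"/>
</node>
<node id="3" lat="42.5" lon="-72.8">
  <tag k="name" v="Cafe"/>
  <tag k="amenity" v="cafe"/>
</node>
"""
path = os.path.join(tempfile.mkdtemp(), "tiny-map.osm")
with open(path, "w") as f:
    f.write(sample)

single_tag_nodes = 0
inside_node = False
tags_seen = 0

with open(path) as f:
    for line in f:
        line = line.strip()
        if line.startswith("<node") and not line.endswith("/>"):
            # A complex node begins: start counting its tags
            inside_node = True
            tags_seen = 0
        elif line.startswith("<tag") and inside_node:
            tags_seen += 1
        elif line.startswith("</node>"):
            # The node ends: keep it only if it had exactly one tag
            if tags_seen == 1:
                single_tag_nodes += 1
            inside_node = False

print(single_tag_nodes)
```

The inside_node check matters because tags can also appear inside other elements such as ways, and those should not be counted toward a node.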
Additional readings
Check out this article on working with files
Homework
[Due for everyone]
I am extending the due date for HW06 until this Thursday at 5PM
[Optional]
ZyBooks Sections 6.12, 6.13, 6.14