That Blue Square Thing

AQA Computer Science GCSE

Programming projects - File Handling

The basics of file handling are dealt with on the basic file handling page. This page takes the techniques learned there a bit further and reads in text files as lists.

Having data files as lists makes it much easier to add or remove items. For example, a high scores list held as a list is easy to add a score to, remove a score from, or sort.

Don't forget that a Python list does the job of an array. An exam will always refer to them as arrays.
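As a quick reminder of what that means in practice, here is a minimal sketch using a made-up list called scores (it isn't part of the countries example). Items are reached by their index, which starts at 0.

```python
# Hypothetical list, just to show array-style access
scores = [12, 7, 30]

print(scores[0])     # the first item is at index 0
print(len(scores))   # how many items are in the list

scores[1] = 9        # change the item at index 1
print(scores)
```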

Step 1 - Reading the file in

The full method will get built up in steps until it gets to a stage where it would actually be useful. You will need to download the countries text file and make sure that this is saved in the same folder as your Python files.

Countries text file - right click and Save As

Open the text file and take a look at it first. It's just a list of countries. Then start with a basic program to read in the text file:

myFile = open("countries.txt", "r")
countryFile = myFile.read()
myFile.close()
print(countryFile)

This just reads in the file and simply prints it. The "r" in the first line specifies that you're going to be reading the file rather than writing to it.
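If you want to check what read actually gives you, something like this sketch might help. It assumes a tiny stand-in file called sample_countries.txt (a made-up name, written to a temporary folder) so that it runs without the real countries file.

```python
import os
import tempfile

# Assumption: a small stand-in file so this runs without countries.txt
fileName = os.path.join(tempfile.gettempdir(), "sample_countries.txt")

myFile = open(fileName, "w")      # "w" means write to the file
myFile.write("France\nGermany\nSpain\n")
myFile.close()

myFile = open(fileName, "r")      # "r" means read from the file
countryFile = myFile.read()
myFile.close()

print(type(countryFile))   # read gives one string, not a list
print(len(countryFile))    # counts characters, not countries
```

The length is the number of characters in the whole file, including each \n, which shows that everything has ended up in a single string.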

As it stands this file isn't much help for us as it's not looking like an array at all. Everything's just stored in one variable.

Step 2 - Using readline

To get the file as a list we need to read in one line at a time. This uses the readline command.

Start with this program which only changes one word in the code. Run it and see what happens.

myFile = open("countries.txt", "r")
countryFile = myFile.readline()
myFile.close()
print(countryFile)

Open the text file and compare it with the output to see what happened.

Clearly that's not quite there yet, but you should be able to figure out what it's done.
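To see how readline behaves, you could call it more than once. This is only a sketch, and it again assumes a small stand-in file (sample_countries.txt is a made-up name) rather than the real countries file.

```python
import os
import tempfile

# Assumption: a small stand-in file so this runs without countries.txt
fileName = os.path.join(tempfile.gettempdir(), "sample_countries.txt")
myFile = open(fileName, "w")
myFile.write("France\nGermany\nSpain\n")
myFile.close()

myFile = open(fileName, "r")
firstLine = myFile.readline()    # reads up to and including the first \n
secondLine = myFile.readline()   # carries on from where the last call stopped
myFile.close()

print(firstLine)
print(secondLine)
```

Each call reads exactly one line and then moves on to the next one, so you would need one call per line to get through the whole file this way.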

To get this usable, we just need to add a single letter - we change readline to readlines, simply adding an s on the end.

myFile = open("countries.txt", "r")
countryFile = myFile.readlines()
myFile.close()
print(countryFile)

And, suddenly, we have an array - square brackets and items separated by commas. There are still some issues - the \n characters need to be dealt with and there are better ways to print this.
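Because the result is now a list, you can use an index or len on it. The sketch below assumes a small stand-in file (sample_countries.txt is a made-up name) and also shows why the \n matters: an item from the list doesn't match the plain country name.

```python
import os
import tempfile

# Assumption: a small stand-in file so this runs without countries.txt
fileName = os.path.join(tempfile.gettempdir(), "sample_countries.txt")
myFile = open(fileName, "w")
myFile.write("France\nGermany\nSpain\n")
myFile.close()

myFile = open(fileName, "r")
countryFile = myFile.readlines()
myFile.close()

print(len(countryFile))             # one list item per line in the file
print(repr(countryFile[0]))         # repr shows the hidden \n on the end
print(countryFile[0] == "France")   # False, because of the \n
```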

Step 2.5 - Improved Printing

To print out the list line by line we can add a simple FOR loop at the end.

myFile = open("countries.txt", "r")
countryFile = myFile.readlines()
myFile.close()
for country in countryFile:
    print(country)

When you run this it looks as if it's solved the \n problem, but it adds a blank line between each country, which you don't want.

Actually it hasn't solved the \n problem, it's just hidden it. When each line is printed the \n doesn't show as text - it gets printed as a line break, and print then adds a line break of its own, which is where the blank lines come from. The \n is still there though - which you really don't want when it comes to dealing with the array.
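One way to convince yourself of this is to tell print not to add its own line break, using its end parameter. This is just a sketch for checking, not a fix, and it assumes a small stand-in file (sample_countries.txt is a made-up name).

```python
import os
import tempfile

# Assumption: a small stand-in file so this runs without countries.txt
fileName = os.path.join(tempfile.gettempdir(), "sample_countries.txt")
myFile = open(fileName, "w")
myFile.write("France\nGermany\nSpain\n")
myFile.close()

myFile = open(fileName, "r")
countryFile = myFile.readlines()
myFile.close()

for country in countryFile:
    # end="" stops print adding a line break of its own,
    # so any line break you see comes from the \n in the item
    print(country, end="")
```

The countries still appear on separate lines, which can only be because each item carries its own \n.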

Step 3 - Stripping out the \n

So, we need to get rid of the \n characters from the list. This is slightly tricky, but doesn't involve too much work.

There's a string method called rstrip that we need to use. It removes the characters you give it from the right-hand end of a string. This code shows you how it works:

myFile = open("countries.txt", "r")
countryFile = myFile.readlines()
myFile.close()
for country in countryFile:
    country = country.rstrip("\r\n")
    print(country)

Now when you run the code the blank lines caused by the \n have disappeared. The problem is that the \n characters are still there in the list - all we did was strip them out of each item just before printing it. You can prove that by adding a line of code to the end to print the whole list again:

myFile = open("countries.txt", "r")
countryFile = myFile.readlines()
myFile.close()
for country in countryFile:
    country = country.rstrip("\r\n")
    print(country)
print(countryFile) # the list itself has not changed

As you can see, the \n characters are still there and will cause chaos when you're dealing with each list item. We need to strip them out for good...
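The reason is that rstrip doesn't change the string it is used on. It hands back a new, stripped string and leaves the original alone. A minimal sketch, using a hard-coded list in place of the file so it runs on its own:

```python
# Stand-in for what readlines gives back (assumed data, not the real file)
countryFile = ["France\n", "Germany\n", "Spain\n"]

original = countryFile[0]
stripped = original.rstrip("\r\n")   # rstrip returns a new string

print(repr(original))   # still has the \n
print(repr(stripped))   # the \n has gone from the new string only

for country in countryFile:
    # this only changes the loop variable, not the item in the list
    country = country.rstrip("\r\n")

print(countryFile)      # every item still ends in \n
```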

Step 4 - The Solution

To strip the \n characters out for good we need to create a new array. We take each country in turn, strip out the \n and then, instead of printing the country, we append it to the new array.

I've called my array strippedCountryList. You have to set up a blank array first and then append after you've stripped out the \n each time.

myFile = open("countries.txt", "r")
countryFile = myFile.readlines()
myFile.close()

strippedCountryList = [] # blank array set up

for country in countryFile:
    country = country.rstrip("\r\n") # strip the \n
    strippedCountryList.append(country) # add it to the array

print(strippedCountryList)

And, eventually, we got there.
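If you need to do this for more than one file, the same steps could be wrapped up in a function. This is only a sketch: the function name readListFromFile and the stand-in file sample_countries.txt are made up for the example. It also shows the list being added to, removed from and sorted once it has been read in.

```python
import os
import tempfile

def readListFromFile(fileName):
    # read every line, strip the line endings, return a clean list
    myFile = open(fileName, "r")
    lines = myFile.readlines()
    myFile.close()

    strippedList = []   # blank array set up
    for line in lines:
        strippedList.append(line.rstrip("\r\n"))
    return strippedList

# Assumption: a small stand-in file so this runs without countries.txt
fileName = os.path.join(tempfile.gettempdir(), "sample_countries.txt")
myFile = open(fileName, "w")
myFile.write("Spain\nFrance\nGermany\n")
myFile.close()

countries = readListFromFile(fileName)
countries.append("Austria")    # add an item
countries.remove("Germany")    # remove an item
countries.sort()               # sort into alphabetical order
print(countries)               # ['Austria', 'France', 'Spain']
```

Note that remove only works here because the \n characters have gone. With them still in place, "Germany" would not match "Germany\n".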

More Complex Lists

This technique only really works for simple lists. If you need to read in a two-dimensional array then things start to get a little more complex. Fortunately there's a really easy solution on the reading 2D lists page.