Files I/O#
Files#
Files are named locations on disk that store related information. They are used to store data permanently in non-volatile memory (e.g. a hard disk).
Random Access Memory (RAM) is volatile: it loses its data when the computer is turned off. To keep data for future use, we store it permanently in files.
When we want to read from or write to a file, we need to open it first. When we are done, the file needs to be closed so that the resources tied to it are freed.
Hence, in Python, a file operation takes place in the following order:
Open a file
Read from or write to the file (perform the operation)
Close the file
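A minimal sketch of the open, operate, close sequence. The file name demo.txt and the use of a temporary directory are assumptions made for illustration, so that the example does not touch any existing file.

```python
import os
import tempfile

# Hypothetical file inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

f = open(path, "w", encoding="utf-8")   # 1. open the file
f.write("hello\n")                      # 2. perform an operation (write)
f.close()                               # 3. close the file

f = open(path, "r", encoding="utf-8")   # 1. open it again, this time for reading
content = f.read()                      # 2. perform an operation (read)
f.close()                               # 3. close the file

print(content)
```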
Opening Files#
Python has a built-in open() function to open a file. This function returns a file object, also called a handle, which is used to read or modify the file.
>>> f = open("test.txt") # open file in current directory
>>> f = open("C:/Python99/README.txt") # specifying full path
We can specify the mode while opening a file. The mode states whether we want to read (r), write (w) or append (a) to the file.
We can also specify whether we want to open the file in text mode or binary mode.
The default is reading in text mode. In this mode, we get strings when reading from the file.
Binary mode returns bytes, and it is the mode to use when dealing with non-text files such as images or executable files.
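The difference between text mode and binary mode can be seen by reading the same file both ways. This is a small sketch; the file name sample.txt and the temporary directory are assumptions for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "sample.txt")  # hypothetical file name

with open(path, "w", encoding="utf-8") as f:
    f.write("caf\u00e9")          # 4 characters; the accented 'e' takes 2 bytes in UTF-8

with open(path, "r", encoding="utf-8") as f:
    text = f.read()               # text mode returns str

with open(path, "rb") as f:
    raw = f.read()                # binary mode returns bytes

print(type(text), len(text))
print(type(raw), len(raw))
```

The string has four characters while the bytes object has five bytes, because text mode decodes the bytes into characters and binary mode returns them untouched.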
| Mode | Description |
|---|---|
| r | Read - Opens a file for reading only. The file pointer is placed at the beginning of the file. This is the default mode. |
| t | Text - Opens in text mode. (default). |
| b | Binary - Opens in binary mode (e.g. images). |
| x | Create - Opens a file for exclusive creation. If the file already exists, the operation fails. |
| rb | Opens a file for reading only in binary format. The file pointer is placed at the beginning of the file. |
| r+ | Opens a file for both reading and writing. The file pointer is placed at the beginning of the file. |
| rb+ | Opens a file for both reading and writing in binary format. The file pointer is placed at the beginning of the file. |
| w | Write - Opens a file for writing only. Overwrites the file if the file exists. If the file does not exist, creates a new file for writing. |
| wb | Opens a file for writing only in binary format. Overwrites the file if the file exists. If the file does not exist, creates a new file for writing. |
| w+ | Opens a file for both writing and reading. Overwrites the existing file if the file exists. If the file does not exist, creates a new file for reading and writing. |
| wb+ | Opens a file for both writing and reading in binary format. Overwrites the existing file if the file exists. If the file does not exist, creates a new file for reading and writing. |
| a | Append - Opens a file for appending. The file pointer is at the end of the file if the file exists. If the file does not exist, it creates a new file for writing. |
| ab | Opens a file for appending in binary format. The file pointer is at the end of the file if the file exists. If the file does not exist, it creates a new file for writing. |
| a+ | Opens a file for both appending and reading. The file pointer is at the end of the file if the file exists. If the file does not exist, it creates a new file for reading and writing. |
| ab+ | Opens a file for both appending and reading in binary format. The file pointer is at the end of the file if the file exists. If the file does not exist, it creates a new file for reading and writing. |
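The behaviour of the x, a and w modes can be checked with a short experiment. This is a sketch only; the file name modes.txt and the temporary directory are assumptions for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes.txt")  # hypothetical file name

with open(path, "x", encoding="utf-8") as f:   # 'x': create, fail if it exists
    f.write("first\n")

try:
    open(path, "x", encoding="utf-8")          # the file exists now, so this fails
    exclusive_failed = False
except FileExistsError:
    exclusive_failed = True

with open(path, "a", encoding="utf-8") as f:   # 'a': add to the end of the file
    f.write("second\n")

with open(path, "r", encoding="utf-8") as f:
    after_append = f.read()

with open(path, "w", encoding="utf-8") as f:   # 'w': discard the old contents
    f.write("third\n")

with open(path, "r", encoding="utf-8") as f:
    after_write = f.read()

print(exclusive_failed)
print(repr(after_append))
print(repr(after_write))
```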
f = open("test.txt",'w') # write in text mode
print(f)
<_io.TextIOWrapper name='test.txt' mode='w' encoding='UTF-8'>
f = open("test.txt") # equivalent to 'r' or 'rt'
print(f) # the reported encoding is platform dependent, e.g. 'cp1252' on Windows
<_io.TextIOWrapper name='test.txt' mode='r' encoding='UTF-8'>
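f = open("logo.png",'wb+') # write and read in binary mode; truncates an existing file
The default encoding is platform dependent: it is typically cp1252 on Windows and UTF-8 on Linux, so code that relies on the default may behave differently across platforms. Hence, when working with files in text mode, it is highly recommended to specify the encoding type.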
f = open("test.txt", mode='r', encoding='utf-8')
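To see why the encoding matters, the sketch below writes a file as UTF-8 and then reads it back with the right and with a wrong encoding. The file name encoded.txt and the choice of latin-1 as the mismatched encoding are assumptions for illustration.

```python
import locale
import os
import tempfile

# The encoding that open() usually falls back to when none is given
# (platform dependent).
print(locale.getpreferredencoding(False))

path = os.path.join(tempfile.mkdtemp(), "encoded.txt")  # hypothetical file name

with open(path, "w", encoding="utf-8") as f:
    f.write("na\u00efve")      # the word "naive" spelled with a diaeresis on the i

with open(path, "r", encoding="utf-8") as f:
    same = f.read()            # decoded with the encoding it was written in

with open(path, "r", encoding="latin-1") as f:
    garbled = f.read()         # wrong encoding: the bytes are misinterpreted

print(same == "na\u00efve")
print(garbled == same)
```

Reading with the wrong encoding does not always raise an error. Here it silently produces different text, which is why an explicit encoding on both the writing and the reading side is the safer habit.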
Closing files#
When we are done performing operations on the file, we need to close it properly.
Closing a file frees up the resources that were tied to it. This is done using the file object's close() method.
Python has a garbage collector to clean up unreferenced objects, but we must not rely on it to close the file.
f = open("test.txt", encoding = 'utf-8')
# perform file operations
f.close()
This method is not entirely safe. If an exception occurs when we are performing some operation with the file, the code exits without closing the file.
To avoid this, use exception handling with a try...finally block.
f = open("test.txt", encoding = 'utf-8')
try:
    # perform file operations
    pass
finally:
    f.close()
This way, we are guaranteeing that the file is properly closed even if an exception is raised that causes program flow to stop.
The best way to close a file is by using the with statement. This ensures that the file is closed when the block inside the with statement is exited.
We don't need to explicitly call the close() method. It is done internally.
with open("test.txt", encoding = 'utf-8') as f:
    # perform file operations
    pass
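The following sketch checks that a with statement closes the file even when the block is left through an exception. The file name closed.txt and the simulated ValueError are assumptions for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "closed.txt")  # hypothetical file name

try:
    with open(path, "w", encoding="utf-8") as f:
        f.write("partial data\n")
        raise ValueError("simulated failure")   # illustrative error
except ValueError:
    pass

# The name f is still bound after the block, but the file is closed.
closed_after_error = f.closed
print(closed_after_error)
```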
Writing to files#
To write to a file in Python, we need to open it in write (w), append (a) or exclusive creation (x) mode.
We need to be careful with the w mode, as it overwrites the file if it already exists. All previous data is erased.
Writing a string (or a sequence of bytes, for binary files) is done using the write() method. This method returns the number of characters written to the file (or the number of bytes, in binary mode).
with open("test_1.txt",'w',encoding = 'utf-8') as f:
    f.write("my first file\n")
    f.write("This file\n\n")
    f.write("contains three lines\n")
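As a small illustration of the return value of write(), and of writelines() as a way to write several strings at once, consider the sketch below. The file name written.txt and the sample strings are assumptions for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "written.txt")  # hypothetical file name
lines = ["alpha\n", "beta\n"]

with open(path, "w", encoding="utf-8") as f:
    count = f.write("my first file\n")   # returns the number of characters written
    f.writelines(lines)                  # writes each string as is; no newline is added

with open(path, "r", encoding="utf-8") as f:
    result = f.read()

print(count)
print(repr(result))
```

Note that writelines() does not append line endings, so each string in the list carries its own newline character.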
Reading files#
To read a file in Python, we must open the file in read (r) mode.
We can use the read(size) method to read up to size characters (or bytes, in binary mode). If the size parameter is not specified, it reads and returns everything up to the end of the file.
f = open("test_1.txt",'r',encoding = 'utf-8')
txt = f.read() # read all the characters in the file
print(type(txt))
print(txt)
f.close()
<class 'str'>
my first file
This file
contains three lines
Alternatively, we can use the readline() method to read individual lines of a file. This method reads up to and including the next newline character. The readlines() method returns a list of all remaining lines in the file.
with open("test_1.txt",'r',encoding = 'utf-8') as f:
    txt = f.readlines()
    print(txt)
['my first file\n', 'This file\n', '\n', 'contains three lines\n']
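The reading methods can be combined on one file object, since each call continues from the current file position. The sketch below also iterates over the file object directly, which yields one line at a time. The file name lines.txt and its contents are assumptions for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "lines.txt")  # hypothetical file name

with open(path, "w", encoding="utf-8") as f:
    f.write("one\ntwo\nthree\n")

with open(path, "r", encoding="utf-8") as f:
    head = f.read(2)                 # the first two characters
    rest_of_line = f.readline()      # the remainder of the first line
    next_line = f.readline()         # the whole second line, newline included
    remaining = [line.rstrip("\n") for line in f]   # iterate over what is left
    at_eof = f.readline()            # an empty string signals the end of the file

print(repr(head), repr(rest_of_line), repr(next_line), remaining, repr(at_eof))
```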
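Here is the complete list of methods in text mode with a brief description:
| Method | Description |
|---|---|
| close() | Closes an opened file. It has no effect if the file is already closed. |
| detach() | Separates the underlying binary buffer from the text wrapper (TextIOBase) and returns it. |
| fileno() | Returns an integer number (file descriptor) of the file. |
| flush() | Flushes the write buffer of the file stream. |
| isatty() | Returns True if the file stream is interactive. |
| read(n) | Reads at most n characters from the file. Reads until the end of the file if n is negative or None. |
| readable() | Returns True if the file stream can be read from. |
| readline(n=-1) | Reads and returns one line from the file. Reads in at most n characters if specified. |
| readlines(n=-1) | Reads and returns a list of lines from the file. If n is specified, stops reading further lines once roughly n characters have been read. |
| seek(offset, whence=SEEK_SET) | Changes the file position to offset, relative to whence (start, current position or end of the file). |
| seekable() | Returns True if the file stream supports random access. |
| tell() | Returns the current file location. |
| truncate(size=None) | Resizes the file stream to size bytes. If size is not specified, resizes to the current location. |
| writable() | Returns True if the file stream can be written to. |
| write(s) | Writes the string s to the file and returns the number of characters written. |
| writelines(lines) | Writes a list of lines to the file. |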
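A few of these methods (seek(), tell(), seekable(), writable() and isatty()) can be tried out on a file opened in w+ mode. This is a sketch only; the file name position.txt and its contents are assumptions for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "position.txt")  # hypothetical file name

with open(path, "w+", encoding="utf-8") as f:
    f.write("hello world")
    f.seek(0)                    # move back to the beginning of the file
    start_pos = f.tell()         # position at the beginning of the file
    first_word = f.read(5)       # read the first five characters
    can_seek = f.seekable()      # regular files support random access
    can_write = f.writable()     # the file was opened with a writing mode
    interactive = f.isatty()     # a file on disk is not a terminal

print(start_pos, first_word, can_seek, can_write, interactive)
```

In text mode, the value returned by tell() should be treated as an opaque number that is only meaningful when passed back to seek(); the position 0 for the start of the file is the exception used here.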
File types#
Text files#
Plain text files, commonly saved with the .txt extension, were covered in the previous sections.
JSON files#
JSON stands for JavaScript Object Notation. It is a text format whose syntax closely resembles a JavaScript object literal or a Python dictionary.
# dictionary
person_dct= {
    "name":"Anukool",
    "country":"England",
    "city":"London",
    "skills":["Python", "ML","AI"]
}
# JSON: a string form of a dictionary (JSON strings must use double quotes)
person_json = '{"name": "Anukool", "country": "England", "city": "London", "skills": ["Python", "ML", "AI"]}'
# we use triple quotes and split it over multiple lines to make it more readable
person_json = '''{
    "name":"Anukool",
    "country":"England",
    "city":"London",
    "skills":["Python", "ML","AI"]
}'''
To convert from JSON to a dictionary, we use the json.loads() method.
import json
# JSON
person_json = '''{
    "name":"Anukool",
    "country":"England",
    "city":"London",
    "skills":["Python", "ML","AI"]
}'''
# let's change JSON to dictionary
person_dct = json.loads(person_json)
print(type(person_dct))
print(person_dct)
print(person_dct['name'])
<class 'dict'>
{'name': 'Anukool', 'country': 'England', 'city': 'London', 'skills': ['Python', 'ML', 'AI']}
Anukool
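json.loads() also converts the other JSON value types to their Python counterparts, and it rejects text that is not valid JSON. The sample document below is an assumption made for illustration.

```python
import json

# Hypothetical JSON text containing a boolean, a number, an array and null.
text = '{"active": true, "score": 9.5, "tags": ["a", "b"], "nickname": null}'
data = json.loads(text)
print(data)

# JSON requires double quotes; a Python-style string with single quotes
# is not valid JSON and cannot be parsed.
try:
    json.loads("{'name': 'Anukool'}")
    single_quotes_ok = True
except json.JSONDecodeError:
    single_quotes_ok = False

print(single_quotes_ok)
```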
To convert the dictionary into JSON, we use the json.dumps() method.
import json
# python dictionary
person = {
    "name":"Anukool",
    "country":"England",
    "city":"London",
    "skills":["Python", "ML","AI"]
}
# let's convert it to JSON
person_json = json.dumps(person, indent=4) # indent sets the number of spaces per nesting level (e.g. 2, 4 or 8) and makes the output readable
print(type(person_json))
print(person_json)
# the printed output shows no surrounding quotes, but person_json is a string
# JSON is not a separate Python type; the result is a value of type str
<class 'str'>
{
    "name": "Anukool",
    "country": "England",
    "city": "London",
    "skills": [
        "Python",
        "ML",
        "AI"
    ]
}
You can save it as a json file using the json.dump() method.
import json
# python dictionary
person = {
    "name":"Anukool",
    "country":"England",
    "city":"London",
    "skills":["Python", "ML","AI"]
}
with open('json_example.json', 'w', encoding='utf-8') as f:
    json.dump(person, f, ensure_ascii=False, indent=4)
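A saved JSON file can be read back into a dictionary with json.load(), the file-based counterpart of json.loads(). The sketch below performs a full round trip; the file name round_trip.json and the temporary directory are assumptions for illustration.

```python
import json
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "round_trip.json")  # hypothetical file name

person = {
    "name": "Anukool",
    "skills": ["Python", "ML", "AI"],
}

# Write the dictionary to a JSON file.
with open(path, "w", encoding="utf-8") as f:
    json.dump(person, f, ensure_ascii=False, indent=4)

# Read the JSON file back into a Python dictionary.
with open(path, "r", encoding="utf-8") as f:
    loaded = json.load(f)

print(loaded == person)
```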
File management#
If there are a large number of files to handle in our Python program, we can arrange our code within different directories to make things more manageable.
A directory or folder is a collection of files and subdirectories. Python has the os module, which provides many useful methods for working with directories (and files as well).
getcwd()#
We can get the present working directory using the getcwd() method of the os module.
import os
print(os.getcwd())
chdir()#
We can change the current working directory using the chdir() method of the os module.
import os
os.chdir(r"C:\Users\Anukool\xyz")
print("Directory changed")
print(os.getcwd())
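Because chdir() affects the whole process, it is often useful to remember the previous directory and switch back afterwards. The sketch below uses a temporary directory as the target, which is an assumption for illustration.

```python
import os
import tempfile

original_dir = os.getcwd()          # remember where we started
new_dir = tempfile.mkdtemp()        # hypothetical target directory

os.chdir(new_dir)                   # change the current working directory
inside = os.getcwd()
print(inside)

os.chdir(original_dir)              # switch back to the starting directory
restored = os.getcwd()
print(restored)
```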
mkdir() & listdir()#
We can create a new directory using the mkdir() method of the os module.
The listdir() method returns a list of all files and sub-directories inside a directory.
import os
os.mkdir('python_study')
print("Directory created")
print(os.listdir())
rmdir()#
We can remove an empty directory using the rmdir() method of the os module. If the directory is not empty, rmdir() raises an OSError.
import os
os.rmdir('python_study')
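The sketch below runs mkdir(), listdir() and rmdir() inside a temporary directory and shows that rmdir() refuses to delete a directory that still contains a file. The temporary base directory and the file name note.txt are assumptions for illustration; os.remove() is used to delete the file.

```python
import os
import tempfile

base = tempfile.mkdtemp()                       # scratch area for the example
target = os.path.join(base, "python_study")

os.mkdir(target)                                # create the directory
listing = os.listdir(base)                      # names inside the base directory
print(listing)

# Put a file into the directory so that it is no longer empty.
note = os.path.join(target, "note.txt")         # hypothetical file name
with open(note, "w", encoding="utf-8") as f:
    f.write("x")

try:
    os.rmdir(target)                            # fails: the directory is not empty
    removed_non_empty = True
except OSError:
    removed_non_empty = False

os.remove(note)                                 # delete the file first
os.rmdir(target)                                # now the directory can be removed
exists_after = os.path.exists(target)
print(removed_non_empty, exists_after)
```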
The os module supports many more functions that make it easier to perform various system-level operations.