The openpyxl module is a Python library for working with Excel files. We can use it to read Excel files as well as write them.
We can install the openpyxl module using the pip command.
$ pip install openpyxl
I have created a sample Excel file (records.xlsx) with three sheets. The data is present in the first two sheets.
We will use this Excel file in the following examples to read data from its sheets.
import openpyxl
excel_file = openpyxl.load_workbook('records.xlsx')
# sheet names
print(excel_file.sheetnames)
Output:
['Employees', 'Cars', 'Numbers']
The sheetnames property returns the list of worksheet names in the workbook. The names are returned in the order of the worksheets in the Excel file.
We can access a specific worksheet by indexing the workbook object with the sheet name.
employees_sheet = excel_file['Employees']
print(type(excel_file))
print(type(employees_sheet))
currently_active_sheet = excel_file.active
Output:
<class 'openpyxl.workbook.workbook.Workbook'>
<class 'openpyxl.worksheet.worksheet.Worksheet'>
If you want to access the currently active sheet, use the active property of the workbook.
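The sheet lookups described above can be tried without the sample file. The following is a minimal sketch that assumes an in-memory workbook with the same three sheet names as records.xlsx; the 'Payroll' name is hypothetical and only there to show what happens when a sheet is missing.

```python
from openpyxl import Workbook

# Build an in-memory workbook with the same sheet names as records.xlsx.
workbook = Workbook()                 # starts with one default sheet
workbook.active.title = 'Employees'   # rename the default sheet
workbook.create_sheet('Cars')
workbook.create_sheet('Numbers')

print(workbook.sheetnames)

# Indexing the workbook by title returns the Worksheet object.
cars_sheet = workbook['Cars']
print(cars_sheet.title)

# A newly created workbook reports its first sheet as the active one.
print(workbook.active.title)

# Indexing with a missing title raises KeyError, so check membership first.
wanted = 'Payroll'  # hypothetical sheet name, not part of the sample file
if wanted in workbook.sheetnames:
    payroll_sheet = workbook[wanted]
else:
    payroll_sheet = None
    print(f'No sheet named {wanted!r}')
```

Checking sheetnames before indexing keeps the script from failing on workbooks whose layout differs from what we expect.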
There are two ways to get a cell value from the Excel sheet. We can get the Cell object using the cell() function, or we can get it by indexing the worksheet with the cell coordinate.
cell_obj = employees_sheet.cell(row=1, column=1)
print(type(cell_obj))
print(f'Employees[A1]={cell_obj.value}')
# second way
print(f'Employees[A1]={employees_sheet["A1"].value}')
Output:
<class 'openpyxl.cell.cell.Cell'>
Employees[A1]=EmpID
Employees[A1]=EmpID
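The two addressing styles refer to the same cells: cell() takes 1-based row and column numbers, while indexing takes a coordinate string such as "C2". Here is a small sketch on a throwaway in-memory sheet (not the sample file) that also uses the conversion helpers from openpyxl.utils.

```python
from openpyxl import Workbook
from openpyxl.utils import get_column_letter, column_index_from_string

sheet = Workbook().active
sheet['C2'] = 'CEO'

# cell() takes 1-based numeric row and column indexes.
by_number = sheet.cell(row=2, column=3)
# Indexing takes the spreadsheet-style coordinate string.
by_name = sheet['C2']

print(by_number.coordinate, by_number.value)
print(by_name.coordinate, by_name.value)

# Helpers to convert between column numbers and column letters.
print(get_column_letter(3))
print(column_index_from_string('C'))
```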
We can get the total number of rows and columns using the max_row and max_column properties of the worksheet.
print(f'Total Rows = {employees_sheet.max_row} and Total Columns = {employees_sheet.max_column}')
Output:
Total Rows = 4 and Total Columns = 3
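To experiment with these properties without records.xlsx, we can rebuild the Employees data in memory. This is only a sketch; the rows are copied from the outputs shown in this article, and the dimensions property is used as an extra way to see the occupied range.

```python
from openpyxl import Workbook

sheet = Workbook().active
rows = [
    ('EmpID', 'EmpName', 'EmpRole'),
    (1, 'Pankaj', 'CEO'),
    (2, 'David Lee', 'Editor'),
    (3, 'Lisa Ray', 'Author'),
]
for row in rows:
    sheet.append(row)  # each call fills the next free row

# Highest row and column numbers that contain cells.
print(sheet.max_row, sheet.max_column)
# The occupied range as a coordinate string.
print(sheet.dimensions)
```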
header_cells_generator = employees_sheet.iter_rows(max_row=1)
for header_cells_tuple in header_cells_generator:
    for cell in header_cells_tuple:
        print(cell.value)
Output:
EmpID
EmpName
EmpRole
The iter_rows() function generates cells from the worksheet, by row. We can use it to get the cells from a specific row.
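When only the values are needed, iter_rows() can return them directly instead of Cell objects. The sketch below assumes a reasonably recent openpyxl release that supports the values_only argument, and uses an in-memory sheet rather than the sample file.

```python
from openpyxl import Workbook

sheet = Workbook().active
sheet.append(('EmpID', 'EmpName', 'EmpRole'))
sheet.append((1, 'Pankaj', 'CEO'))

# values_only=True yields tuples of plain values rather than Cell objects.
header = next(sheet.iter_rows(min_row=1, max_row=1, values_only=True))
print(header)
```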
for x in range(1, employees_sheet.max_row+1):
    print(employees_sheet.cell(row=x, column=1).value)
Output:
EmpID
1
2
3
for x in range(1, employees_sheet.max_column+1):
    print(employees_sheet.cell(row=2, column=x).value)
Output:
1
Pankaj
CEO
We can pass the range of cells to read multiple cells at a time.
cells = employees_sheet['A2':'C3']
# each row of the range is a tuple of three Cell objects
for emp_id, name, role in cells:
    print(f'Employee[{emp_id.value}, {name.value}, {role.value}]')
Output:
Employee[1, Pankaj, CEO]
Employee[2, David Lee, Editor]
for row in employees_sheet.iter_rows(min_row=2, min_col=1, max_row=4, max_col=3):
    for cell in row:
        print(cell.value, end="|")
    print("")
Output:
1|Pankaj|CEO|
2|David Lee|Editor|
3|Lisa Ray|Author|
The arguments passed to the iter_rows() function define the two-dimensional range from which the values are read, row by row. In this example, the values are read from A2 to C4.
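A common follow-up is to turn the rows into dictionaries keyed by the header row. The sketch below is one possible way to do it, assuming the first row holds the column names; it rebuilds the Employees data in memory, and the records name is just illustrative.

```python
from openpyxl import Workbook

sheet = Workbook().active
for row in [
    ('EmpID', 'EmpName', 'EmpRole'),
    (1, 'Pankaj', 'CEO'),
    (2, 'David Lee', 'Editor'),
    (3, 'Lisa Ray', 'Author'),
]:
    sheet.append(row)

# With no bounds, iter_rows() covers the whole occupied range.
rows = sheet.iter_rows(values_only=True)
header = next(rows)  # first row: column names
records = [dict(zip(header, row)) for row in rows]

for record in records:
    print(record)
```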
for col in employees_sheet.iter_cols(min_row=2, min_col=1, max_row=4, max_col=3):
    for cell in col:
        print(cell.value, end="|")
    print("")
Output:
1|2|3|
Pankaj|David Lee|Lisa Ray|
CEO|Editor|Author|
The iter_cols() function works the same way as iter_rows(), except that the values are read column by column.
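Reading column by column is handy when we want all values of one field together. As a sketch (again on an in-memory copy of the Employees data, and assuming a release that supports values_only), each column can be collected under its header name.

```python
from openpyxl import Workbook

sheet = Workbook().active
for row in [
    ('EmpID', 'EmpName', 'EmpRole'),
    (1, 'Pankaj', 'CEO'),
    (2, 'David Lee', 'Editor'),
    (3, 'Lisa Ray', 'Author'),
]:
    sheet.append(row)

columns = {}
for column in sheet.iter_cols(values_only=True):
    name, *values = column  # first value is the header, the rest is data
    columns[name] = values

print(columns['EmpName'])
```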
In this section, we will look at some examples of writing Excel files and cell data.
from openpyxl import Workbook
import datetime
excel_file = Workbook()
excel_sheet = excel_file.create_sheet(title='Holidays 2019', index=0)
# creating header row
excel_sheet['A1'] = 'Holiday Name'
excel_sheet['B1'] = 'Holiday Description'
excel_sheet['C1'] = 'Holiday Date'
# adding data
excel_sheet['A2'] = 'Diwali'
excel_sheet['B2'] = 'Biggest Indian Festival'
excel_sheet['C2'] = datetime.date(year=2019, month=10, day=27).strftime("%m/%d/%y")
excel_sheet['A3'] = 'Christmas'
excel_sheet['B3'] = 'Birth of Jesus Christ'
excel_sheet['C3'] = datetime.date(year=2019, month=12, day=25).strftime("%m/%d/%y")
# save the file
excel_file.save(filename="Holidays.xlsx")
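The example above stores the dates as text produced by strftime(). As an alternative sketch, not taken from the original example, a date object can be assigned to the cell directly and the display format set through number_format, so that Excel treats the cell as a real date. The 'mm/dd/yy' format string is an assumption chosen to match the text format used above.

```python
import datetime
from openpyxl import Workbook

sheet = Workbook().active
sheet['A1'] = 'Holiday Date'

# Assign the date itself rather than a formatted string.
sheet['A2'] = datetime.date(2019, 10, 27)
# Control how Excel displays it (assumed format, similar to the strftime pattern above).
sheet['A2'].number_format = 'mm/dd/yy'

print(sheet['A2'].value, sheet['A2'].is_date)
```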
We can set a value either by indexing the worksheet with the cell coordinate or through the Cell object. Let's change some values in the Excel file created in the last example.
import openpyxl
excel_file = openpyxl.load_workbook('Holidays.xlsx')
excel_sheet = excel_file['Holidays 2019']
# using index
excel_sheet['A2'] = 'Deepawali'
# using cell object
excel_sheet.cell(row=2, column=2).value = 'Biggest Indian Festival for Hindus'
excel_file.save('Holidays.xlsx')
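To check that such updates survive a save and reload without touching Holidays.xlsx, the workbook can be written to an in-memory buffer. This is a sketch under that assumption; BytesIO stands in for the file on disk, and the third header cell is added only to show the value argument of cell().

```python
from io import BytesIO
from openpyxl import Workbook, load_workbook

workbook = Workbook()
sheet = workbook.active
sheet.title = 'Holidays 2019'
sheet.append(('Holiday Name', 'Holiday Description'))
sheet.append(('Diwali', 'Biggest Indian Festival'))

# Update by coordinate, then through the Cell object.
sheet['A2'] = 'Deepawali'
sheet.cell(row=2, column=2).value = 'Biggest Indian Festival for Hindus'
# cell() can also set the value directly through its value argument.
sheet.cell(row=1, column=3, value='Holiday Date')

# Save to a buffer instead of a file, then load it back.
buffer = BytesIO()
workbook.save(buffer)
buffer.seek(0)
reloaded = load_workbook(buffer)['Holidays 2019']

print(reloaded['A2'].value)
print(reloaded['B2'].value)
print(reloaded['C1'].value)
```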
We can use the append() function to add a sequence of values to the bottom of the worksheet.
holiday_rows = (
    ('Black Friday', 'Day after the fourth Thursday of November, Shopping Day', '11/29/19'),
    ('Holi', 'Festival of Colors', '3/20/19')
)
for row in holiday_rows:
    excel_sheet.append(row)
excel_file.save('Holidays.xlsx')
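The behavior of append() is easy to inspect on an in-memory sheet. The sketch below shows that each call fills the row after the last one, and that a dict keyed by column letter can be used to fill only selected columns; the data is illustrative.

```python
from openpyxl import Workbook

sheet = Workbook().active
sheet.append(('Holiday Name', 'Holiday Description', 'Holiday Date'))

# A list or tuple fills columns A, B, C... of the next free row.
sheet.append(['Holi', 'Festival of Colors', '3/20/19'])

# A dict keyed by column letter fills only the named columns.
sheet.append({'A': 'Christmas', 'C': '12/25/19'})

print(sheet.max_row)
print(sheet['A3'].value, sheet['B3'].value, sheet['C3'].value)
```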
We can use the delete_cols() and delete_rows() functions to delete columns and rows from the Excel sheet.
import openpyxl
excel_file = openpyxl.load_workbook('Holidays.xlsx')
excel_sheet = excel_file['Holidays 2019']
# delete column
excel_sheet.delete_cols(idx=2) # B=2
# delete row
excel_sheet.delete_rows(idx=2, amount=2) # rows 2,3 are deleted
excel_file.save('Holidays.xlsx')
The idx parameter is the 1-based index of the first row or column to delete. If we want to delete multiple adjacent rows or columns, we can pass the amount argument, which defaults to 1.
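To see what is left after such deletions, the same calls can be run on an in-memory sheet. This is a sketch with illustrative rows resembling the holiday data from the earlier examples, not the actual contents of Holidays.xlsx.

```python
from openpyxl import Workbook

sheet = Workbook().active
for row in [
    ('Holiday Name', 'Holiday Description', 'Holiday Date'),
    ('Deepawali', 'Biggest Indian Festival for Hindus', '10/27/19'),
    ('Christmas', 'Birth of Jesus Christ', '12/25/19'),
    ('Black Friday', 'Shopping Day', '11/29/19'),
    ('Holi', 'Festival of Colors', '3/20/19'),
]:
    sheet.append(row)

sheet.delete_cols(idx=2)            # drop column B (the descriptions)
sheet.delete_rows(idx=2, amount=2)  # drop rows 2 and 3

# Rows below the deleted ones shift up; columns to the right shift left.
remaining = list(sheet.iter_rows(values_only=True))
for row in remaining:
    print(row)
```

Note that the deletions are applied in order, so the row indexes passed to delete_rows() refer to the sheet as it looks after the column has been removed.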
The openpyxl module is a good choice for working with Excel sheets. We can also add images to a sheet when the pillow library is installed. However, by default it doesn't guard against quadratic blowup or billion laughs XML attacks. So, if you are processing files or values that come from users, validate and sanitize them; the openpyxl documentation also suggests installing the defusedxml package to guard against these attacks.
Translated from: https://www.journaldev.com/33325/openpyxl-python-read-write-excel-files