首页 > 解决方案 > 如何使用pyxl(python)遍历excel中的特定列

问题描述

我是 python 新手,需要你的帮助。我正在尝试使用 pyxl 编写遍历 excel 中特定列的代码

from io import StringIO
import pandas as pd
import pyodbc 
from openpyxl import load_workbook


d=pd.read_excel('workbook.xlsx',header=None)
wb = load_workbook('workbook.xlsx')

所以在上面的例子中,我必须去列 J 并显示列中的所有值。

请帮我解决这个问题。

另外,我在我的 excel 表中重复了相同的列名。例如,“示例”列名在 B2 和 J2 中都可用。但我想获取 J2 的所有列信息。

请让我知道如何解决这个问题...

谢谢..请回复

标签: pythonexcelpython-2.7pandasdataframe

解决方案


由于您是 python 新手,因此您应该学习阅读文档。有大量可用的模块,如果你先努力,它对你来说会更快,对我们其他人来说会更容易。

import openpyxl
from openpyxl.utils import cell as cellutils

## My example book simply has "=Address(Row(),Column())" in A1:J20

## Because my example uses formulae, I am loading my workbook with
## "data_only = True" in order to get the values; if your cells do not
## contain formulae, you can omit data_only
workbook = openpyxl.load_workbook("workbook.xlsx", data_only = True)
worksheet = workbook.active
## Alterntively: worksheet = workbook["sheetname"]


## A container for gathering the cell values
output = []

## Current Row = 2 assumes that Cell 1 (in this case, J1) contains your column header
## Adjust as necessary
column = cellutils.column_index_from_string("J")
currentrow = 2

## Get the first cell
cell = worksheet.cell(column = column, row = currentrow)

## The purpose of "While cell.value" is that I'm assuming the column
## is complete when the cell does not contain a value

## If you know the exact range you need, you can either use a for-loop,
## or look at openpyxl.utils.cell.rows_from_range
while cell.value:
    ## Add Cell value to our list of values for this column
    output.append(cell.value)

    ##  Move to the next row
    currentrow += 1

    ## Get that cell
    cell = worksheet.cell(column = column, row = currentrow)


print(output)
""" output: ['$J$2', '$J$3', '$J$4', '$J$5', '$J$6', '$J$7',
            '$J$8', '$J$9', '$J$10', '$J$11', '$J$12', '$J$13', '$J$14',
            '$J$15', '$J$16', '$J$17', '$J$18', '$J$19', '$J$20']

推荐阅读