Merging Excel files with Python for statistical analysis

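Background: we track project issues in spreadsheets. Every project sends back an Excel workbook, and all of them share the same layout. I have to compile statistics from these workbooks regularly, but there are a lot of them, and tallying each one and then adding up the results by hand is tedious. So I looked for a way to merge all the workbooks into a single one and run the analysis on that, which saves a good deal of time. It turns out Python can do the merging.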

1. Directory structure

util: the utility package; it holds a class that wraps the basic Excel operations
sources: all the Excel files that need to be merged
results: the merged Excel file
main: the entry-point package with the merge logic
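
The code later in this article hard-codes an absolute Windows path. As a minimal sketch of an alternative, the folders can be derived from a single project root. The helper name `project_paths` and the returned keys are hypothetical, not part of the original project; only the folder names `sources` and `results` and the file name `result.xlsx` come from the article.

```python
import tempfile
from pathlib import Path


def project_paths(root):
    """Return the sources/results folders and the result file under root."""
    root = Path(root)
    sources = root / "sources"
    results = root / "results"
    # Create the folders if they are missing so later steps can rely on them.
    sources.mkdir(parents=True, exist_ok=True)
    results.mkdir(parents=True, exist_ok=True)
    return {
        "sources": sources,
        "results": results,
        "result_file": results / "result.xlsx",
    }


# Demo in a throwaway directory instead of a real project folder.
with tempfile.TemporaryDirectory() as tmp:
    paths = project_paths(tmp)
    created = sorted(p.name for p in Path(tmp).iterdir())
    print(created)
```

Passing these paths into the classes below would keep the scripts independent of the machine they run on.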

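2. Implementation

Implementing the ExcelUtil class

Create excel_util.py in the util package.
Import the required modules and write the ExcelUtil initializer.
The initializer takes excel_path, the path of the Excel file, and falls back to a default path when it is omitted. index selects which sheet to work on; when omitted, the first sheet is used.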

# -*- coding: utf-8 -*-
# ! python3
import xlrd
from xlutils.copy import copy
import os
import json
import codecs


class ExcelUtil:
    def __init__(self, excel_path=None, index=None):
        # Fall back to the default result file when no path is given.
        # Note: xlrd 2.0 and later read only .xls files; opening an .xlsx
        # file such as this default requires an xlrd version below 2.0.
        if excel_path is None:
            self.excel_path = "E:\\python\\问题列表文档\\execl合并\\results\\result.xlsx"
        else:
            self.excel_path = excel_path
        if index is None:
            self.index = 0       # 0 is the first sheet
        else:
            self.index = index
        self.data = xlrd.open_workbook(self.excel_path)    # load the whole workbook
        self.table = self.data.sheets()[self.index]  # pick the requested sheet

Get the number of rows in the sheet

    def get_lines(self):
        # Number of rows; None when the sheet is empty
        rows = self.table.nrows
        if rows >= 1:
            return rows
        return None

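Get the value of a single cell, given its row and column

    def get_col_value(self, row, col):
        # get_lines() returns None for an empty sheet, and comparing None
        # with an int raises TypeError in Python 3, so check for None first.
        rows = self.get_lines()
        if rows is not None and rows > row:
            data = self.table.cell(row, col).value
            return data
        return None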
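
To see the bounds check in isolation, here is a small sketch that runs without xlrd or a real workbook. `FakeCell`, `FakeSheet`, and `safe_cell_value` are hypothetical stand-ins written for this illustration; they only mimic the `nrows` attribute and the `cell(row, col).value` access that the method above uses. Only the row index is checked here, as in the original method.

```python
class FakeCell:
    """Hypothetical stand-in for a cell object with a .value attribute."""

    def __init__(self, value):
        self.value = value


class FakeSheet:
    """Hypothetical stand-in exposing nrows and cell(row, col)."""

    def __init__(self, rows):
        self._rows = rows
        self.nrows = len(rows)

    def cell(self, row, col):
        return FakeCell(self._rows[row][col])


def safe_cell_value(sheet, row, col):
    """Return the cell value, or None when the row index is out of range."""
    if 0 <= row < sheet.nrows:
        return sheet.cell(row, col).value
    return None


# Made-up sample data: a header row and one issue row.
sheet = FakeSheet([["id", "title"], [1, "login fails"]])
empty = FakeSheet([])

print(safe_cell_value(sheet, 1, 1))
print(safe_cell_value(sheet, 5, 0))
print(safe_cell_value(empty, 0, 0))
```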

Read the whole sheet into a list: each row becomes one list, and the rows are collected in an outer list

    def get_data(self):
        result = []
        rows = self.get_lines()
        if rows is not None:
            for i in range(rows):
                col = self.table.row_values(i)
                result.append(col)
            return result
        return None
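
A list of row lists is easier to analyse once each row is keyed by its column name. The sketch below assumes the first row holds the column headers, which the article does not state explicitly. The function name `rows_to_records` and the sample columns `project` and `status` are made up for illustration.

```python
def rows_to_records(rows):
    """Turn [[header...], [values...], ...] into a list of dicts.

    Assumes the first row is the header. Returns an empty list for
    None or an empty input, which covers get_data() returning None.
    """
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


# Made-up sample in the same shape that get_data() returns.
sample_rows = [
    ["project", "status"],
    ["A", "open"],
    ["B", "closed"],
]
records = rows_to_records(sample_rows)
print(records)
```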

Write a value to a cell, given the row, the column, and the content to write

    def write_value(self, row, col, value):
        read_value = xlrd.open_workbook(self.excel_path)
        write_data = copy(read_value)   # make a writable copy of the workbook
        write_data.get_sheet(self.index).write(row, col, value)   # write the value
        write_data.save(self.excel_path)   # save back to the same path

Add a new sheet, given the sheet name

    def add_excel_sheet(self, sheetname):
        # formatting_info=True is supported by xlrd for .xls files only
        rb = xlrd.open_workbook(self.excel_path, formatting_info=True)
        # make a copy of it
        wb = copy(rb)
        # add sheet to workbook with existing sheets
        wb.add_sheet(sheetname)
        wb.save(self.excel_path)

Merging the Excel files

Create combine.py in the main package.
Merging relies on the pandas module.

# -*- coding: utf-8 -*-
# ! python3
import os
import pandas as pd

class CombineExcel(object):
    '''Merge all the Excel files under the sources directory'''
    def __init__(self, pwd=None):
        '''pwd is the directory whose files are merged'''
        if pwd is None:
            self.pwd = 'sources'
        else:
            self.pwd = pwd
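
Since the class is meant to pick up every Excel file under this directory, it can help to filter out anything that is not a workbook before reading. The following is only a sketch: the helper name `list_excel_files` and the sample file names are hypothetical, and skipping names that start with `~$` assumes the temporary lock files Excel creates while a workbook is open.

```python
import os
import tempfile

EXCEL_SUFFIXES = (".xls", ".xlsx")


def list_excel_files(folder):
    """Return sorted paths of Excel files under folder, skipping lock files."""
    found = []
    for root, _dirs, files in os.walk(folder):
        for name in files:
            if name.startswith("~$"):
                continue  # temporary lock file, not a real workbook
            if name.lower().endswith(EXCEL_SUFFIXES):
                found.append(os.path.join(root, name))
    return sorted(found)


# Demo with empty placeholder files in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.xlsx", "b.XLS", "~$a.xlsx", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    names = [os.path.basename(p) for p in list_excel_files(tmp)]
    print(names)
```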

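The merge method takes index to select the sheet, the path where the merged file is saved, and the name of the merged sheet.

    def combine_all(self, index, file_name, re_sheetname='sheet1'):
        '''Merge the Excel files under the folder.
            index: which sheet of each Excel file to merge: 0, 1, 2, ...
            file_name: path where the merged Excel file is saved
            re_sheetname: name of the merged sheet, defaults to sheet1
        '''
        # Paths of the files that were read
        file_list = []
        # One DataFrame per file (the files all share the same structure)
        dfs = []
        for root, dirs, files in os.walk(self.pwd):
            for file in files:
                file_path = os.path.join(root, file)
                file_list.append(file_path)
                print(file_path)
                # header=None keeps each file's header row as ordinary data
                df = pd.read_excel(io=file_path, sheet_name=index, header=None)
                dfs.append(df)
        df = pd.concat(dfs)
        # With header=None the column labels are 0, 1, 2, ...; to_excel writes
        # them as the first row unless header=False is also passed.
        df.to_excel(file_name, sheet_name=re_sheetname, index=False)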
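
The effect of reading with `header=None` before concatenating is easier to see on small in-memory frames. In this sketch two hand-built DataFrames stand in for two source workbooks; the column names, the values, and the helper `promote_header` are made up for illustration and are not part of the original code. It contrasts plain stacking, which repeats the header row once per file, with promoting the first row to column names before concatenating.

```python
import pandas as pd

# Stand-ins for two workbooks read with header=None:
# the header row of each file is ordinary data in row 0.
df_a = pd.DataFrame([["project", "status"], ["A", "open"]])
df_b = pd.DataFrame([["project", "status"], ["B", "closed"]])

# Plain concatenation keeps every row, including both header rows.
stacked = pd.concat([df_a, df_b], ignore_index=True)


def promote_header(df):
    """Use row 0 as the column names and drop it from the data."""
    out = df.iloc[1:].copy()
    out.columns = df.iloc[0].tolist()
    return out


# Alternative: promote the header first, so it appears only once.
merged = pd.concat(
    [promote_header(df_a), promote_header(df_b)],
    ignore_index=True,
)

print(len(stacked), len(merged))
print(list(merged.columns))
```

Which variant fits depends on whether the later analysis expects the repeated header rows to be present in the merged sheet.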

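The CombineExcel class takes care of the merging. The merged data still has to be analysed, and that is where the ExcelUtil methods come in.

Statistical analysis

statistics_main.py, created in the main package, analyses the merged Excel file.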

# -*- coding: utf-8 -*-
# ! python3
import sys
# Raw string so the backslashes in the Windows path are not read as escapes
sys.path.append(r'E:\python\问题列表文档\execl合并')
from util.excel_util import ExcelUtil
from main.combine import CombineExcel
import os
import xlrd
from xlutils.copy import copy

class Statistics(object):
    '''Statistics on how each project's issue list is being resolved, and on how each group is resolving its issues'''
    def __init__(self, excel_path, index):
        self.excel_path = excel_path
        self.index = index
        self.ex = ExcelUtil(self.excel_path, self.index)