How do I create a Python script that sends an email when a CSV file in a directory has not been updated in the last 24 hours?

Problem description

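I'm new to Python and trying to learn how to automate things. I have a folder in which 5 CSV files are updated daily, but sometimes one or two of them are not updated on a given day, so I have to check the folder manually. Instead, I would like to automate this so that, if a CSV file has not been updated in the last 24 hours, the script sends me an email alerting me to it.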

My code:

import datetime
import glob
import os
import smtplib
import string
 
now  = datetime.datetime.today() #Get current date

list_of_files = glob.glob('c:/Python/*.csv') # * means all if need specific format then *.csv
latest_file = max(list_of_files, key=os.path.getctime) #get latest file created in folder

newestFileCreationDate = datetime.datetime.utcfromtimestamp(os.path.getctime(latest_file)) # get creation datetime of last file

dif = (now - newestFileCreationDate) #calculating days between actual date and last creation date

logFile = "c:/Python/log.log" #defining a log file

def checkFolder(dif, now, logFile):
    if dif > datetime.timedelta(days = 1): #Check if difference between today and last created file is greater than 1 days
        
        HOST = "12.55.13.12" #This must be your smtp server ip
        SUBJECT = "Alert! At least 1 day wthout a new file in folder xxxxxxx"
        TO = "xx.t@gmail.com"
        FROM = "xx.t@gmail.com"
        text = "%s - The oldest file in folder it's %s old " %(now, dif) 
        BODY = string.join((
            "From: %s" % FROM,
            "To: %s" % TO,
            "Subject: %s" % SUBJECT ,
            "",
            text
            ), "\r\n")
        server = smtplib.SMTP(HOST)
        server.sendmail(FROM, [TO], BODY)
        server.quit()
        
        file = open(logFile,"a") #Open log file in append mode
 
        file.write("%s - [WARNING] The oldest file in folder it's %s old \n" %(now, dif)) #Write a log
 
        file.close() 
        
    else : # If difference between today and last creation file is less than 1 days
                
        file = open(logFile,"a")  #Open log file in append mode
 
        file.write("%s - [OK] The oldest file in folder it's %s old \n" %(now, dif)) #write a log
 
        file.close() 

checkFolder(dif,now,logFile) #Call function and pass 3 arguments defined before
 

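However, this does not run without errors, and I only want to be notified by email about the files in the folder that have not been updated, whether that is 1 of the 5 files or all 5 of them.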

Tags: python, string, operating-system, glob, smtplib

Solution


A clean approach using pure Python

import hashlib
import glob
import json
import smtplib
from email.message import EmailMessage
import time
import schedule #pip install schedule
hasher = hashlib.md5()
size = 65536 #to read large files in chunks 
list_of_files = glob.glob('./*.csv') #absolute path for crontab

Part 1) Run this script once first, then comment it out. It creates a JSON file containing the file hashes.

first_hashes = {}
for x in list_of_files:
    hasher = hashlib.md5() #fresh hasher per file so digests do not accumulate
    with open(x, 'rb') as f:
        buf = f.read(size)
        while len(buf) > 0:
            hasher.update(buf)
            buf = f.read(size)
    first_hashes[x] = hasher.hexdigest() #digest of the complete file

with open('hash.json', 'w') as file:
    file.write(json.dumps(first_hashes, indent=2))

Now comment it out, or even delete it.
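
The chunked hashing in Part 1 can also be wrapped in a small helper, which makes it harder to reuse one hasher across several files by accident. A minimal sketch, assuming the hypothetical names file_md5 and CHUNK_SIZE (neither appears in the answer above):

```python
import hashlib

CHUNK_SIZE = 65536  # hypothetical name; same value as `size` in the answer


def file_md5(path, chunk_size=CHUNK_SIZE):
    """Return the MD5 hex digest of the file at `path`, read in chunks."""
    hasher = hashlib.md5()  # a new hasher for every file
    with open(path, 'rb') as f:
        # iter() with a sentinel keeps calling f.read until it returns b''
        for buf in iter(lambda: f.read(chunk_size), b''):
            hasher.update(buf)
    return hasher.hexdigest()
```

With such a helper, the Part 1 loop could reduce to first_hashes[x] = file_md5(x) for each x in list_of_files.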

Part 2) The automation script:

def send_email():


    check_hash = {} #Contain hashes that have not changed
    
    with open('hash.json') as f: #absolute path for crontab
         data = json.load(f)

    for x in list_of_files:
        hasher = hashlib.md5() #fresh hasher per file so digests do not accumulate
        with open(x, 'rb') as f:
            buf = f.read(size)
            while len(buf) > 0:
                hasher.update(buf)
                buf = f.read(size)
        new_hash = hasher.hexdigest() #digest of the complete file
        #if the hash matches the one stored for this file, the file has not changed
        if data.get(x) == new_hash:
            check_hash[x] = new_hash
        data[x] = new_hash


    #update our hashes
    with open('hash.json', 'w') as file:  #absolute path for crontab
         file.write(json.dumps(data, indent=2))

    if len(check_hash) > 0: #check if there's anything in check_hash

        filename="check_hash.txt" #absolute path for crontab

        #write to a text file named "check_hash.txt"
        with open(filename, 'w') as f: #absolute path for crontab
            f.write(json.dumps(check_hash, indent=2))

        
        # for gmail smtp setup watch youtu.be/JRCJ6RtE3xU 
        EMAIL_ADDRESS = 'SMTPAddress@gmail.com' 
        EMAIL_PASSWORD = 'SMTPPassWord'

        msg = EmailMessage()

        msg['Subject'] = 'Unupdated files'
        msg['From'] = EMAIL_ADDRESS
        msg['To'] = 'receive@gmail.com'
        msg.set_content('These file(s) did not update:')
        with open(filename, "r") as attachment: #close the file after reading it
            msg.add_attachment(attachment.read(), filename=filename)



        with smtplib.SMTP_SSL('smtp.gmail.com', 465) as smtp:
            smtp.login(EMAIL_ADDRESS, EMAIL_PASSWORD)
            smtp.send_message(msg)
 

#for faster testing check other options here github.com/dbader/schedule
schedule.every().day.at("10:30").do(send_email) 
while 1:
    schedule.run_pending()
    time.sleep(1)
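
The comparison and the message construction in Part 2 can be separated from the file and network I/O, which makes them easy to try out without sending anything. The following is only a sketch: find_unchanged and build_alert are hypothetical names, and listing the files in the message body (rather than attaching a text file) is an assumption, not what the answer above does.

```python
from email.message import EmailMessage


def find_unchanged(old_hashes, new_hashes):
    """Return the sorted paths whose hash is identical in both mappings."""
    return sorted(
        path for path, digest in new_hashes.items()
        if old_hashes.get(path) == digest
    )


def build_alert(unchanged, sender, recipient):
    """Build (but do not send) an alert listing the unchanged files."""
    msg = EmailMessage()
    msg['Subject'] = 'Unupdated files'
    msg['From'] = sender
    msg['To'] = recipient
    msg.set_content('These file(s) did not update:\n' + '\n'.join(unchanged))
    return msg
```

The returned message could then be passed to smtp.send_message(msg) exactly as in the script above.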

Edit: If you restart your computer, you need to run this file again to restart the schedule. To avoid that, you can use crontab as follows (see youtu.be/j-KgGVbyU08 to learn how):

# mm hh DOM MON DOW command 
30 10 * * *  python3 path-to-file/email-script.py #Linux
30 10 * * *  python path-to-file/email-script.py #Windows

This runs the script every day at 10:30 AM, provided the computer is on at that time. For faster testing (running once every minute), use:

* * * * *  python3 path-to-file/email-script.py

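Note: If you use crontab, you must use absolute paths for all file references, and replace the scheduling loop below with the if __name__ == "__main__" block that follows it: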

schedule.every().day.at("10:30").do(send_email) 
while 1:
    schedule.run_pending()
    time.sleep(1)

if __name__ == "__main__":
    send_email()
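
One way to get absolute paths without hard-coding them is to derive them from the script's own location. This is a sketch under assumptions: BASE_DIR, HASH_FILE, REPORT_FILE and csv_files are hypothetical names, and it presumes the CSV files and hash.json live in the same folder as the script.

```python
from pathlib import Path

# Directory containing this script; fall back to the current directory
# when __file__ is not defined (for example in an interactive session).
try:
    BASE_DIR = Path(__file__).resolve().parent
except NameError:
    BASE_DIR = Path.cwd()

HASH_FILE = BASE_DIR / 'hash.json'          # hypothetical name
REPORT_FILE = BASE_DIR / 'check_hash.txt'   # hypothetical name


def csv_files(folder=BASE_DIR):
    """Return absolute paths of the CSV files in `folder`, sorted by name."""
    return sorted(str(p) for p in Path(folder).resolve().glob('*.csv'))
```

Because cron usually starts jobs from a different working directory, paths built this way keep pointing at the same files however the script is launched.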

Tested, and it works well!
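
If the goal is literally "not modified in the last 24 hours", a check on each file's modification time is a possible alternative to comparing hashes. This is a different technique from the answer above, shown only as a sketch: stale_files and MAX_AGE_SECONDS are hypothetical names, and a file rewritten with identical content would count as updated here, whereas the hash comparison would report it as unchanged.

```python
import glob
import os
import time

MAX_AGE_SECONDS = 24 * 60 * 60  # 24 hours


def stale_files(pattern, max_age=MAX_AGE_SECONDS, now=None):
    """Return files matching `pattern` not modified within `max_age` seconds."""
    now = time.time() if now is None else now
    return sorted(
        path for path in glob.glob(pattern)
        if now - os.path.getmtime(path) > max_age
    )
```

Calling stale_files('c:/Python/*.csv') once a day would give the list of files to mention in the alert email; an empty list means nothing needs to be sent.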

