How do I fix "AttributeError: 'RDD' object has no attribute 'rfind'"?

Problem description

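I am writing code that attaches a file from HDFS to an email and sends it. The code already works with a file in a local folder (my Linux home directory), but when I change the attachment location to an HDFS path, I get the error AttributeError: 'RDD' object has no attribute 'rfind'. Can anyone help?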

I changed the encoding to

part = MIMEApplication("".join(f.collect()).encode('utf-8').strip(), Name=basename(f))

and also tried

part = MIMEApplication(u"".join(f.collect()), Name=basename(f))

but I still get the same error.

Here is my code:

import smtplib
import traceback  # needed for traceback.format_exc() in the except block
from os.path import basename
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import COMMASPACE, formatdate

def success_mail():
    sender = "no-reply@company.com"
    receivers = 'user@company.com'
    msg = MIMEMultipart()
    msg.attach(MIMEText("Scoring completed. Attached is the latest report"))
    f=sc.textFile("/user/userid/folder/report_20190501.csv")
    part = MIMEApplication("".join(f.collect()).encode('utf-8', 'ignore'), Name=basename(f))
    part['Content-Disposition'] = 'attachment; filename="%s"' % basename(f)
    msg.attach(part)
    try:
        smtp = smtplib.SMTP('smtp.company.com')
        smtp.sendmail(sender, receivers, msg.as_string())  
        smtp.close()
        logMessage("INFO - Successfully sent email with Attachment")
    except:
        emsg =  traceback.format_exc()
        logMessage("ERROR -  Unable to send email because of :"+emsg)

Error:

AttributeError                            Traceback (most recent call last)
<ipython-input-6-5606e23c7cf8> in <module>()
     33         emsg =  traceback.format_exc()
     34         logMessage("ERROR -  Unable to send email because of :"+emsg)
---> 35 success_mail()

<ipython-input-6-5606e23c7cf8> in success_mail()
     22     msg.attach(MIMEText("Scoring completed. Attached is the latest report"))
     23     f=sc.textFile("/user/userid/folder/report_20190501.csv")
---> 24     part = MIMEApplication("".join(f.collect()).encode('utf-8', 'ignore'), Name=basename(f))
     25     part['Content-Disposition'] = 'attachment; filename="%s"' % basename(f)
     26     msg.attach(part)

/hadoop/ipython/userid/pyspark/lib64/python2.7/posixpath.pyc in basename(p)
    112 def basename(p):
    113     """Returns the final component of a pathname"""
--> 114     i = p.rfind('/') + 1
    115     return p[i:]
    116 

AttributeError: 'RDD' object has no attribute 'rfind'

Tags: python, email, pyspark

Solution


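basename() expects a path string, but f is the RDD returned by sc.textFile(), and an RDD has no rfind method. Keep the path in its own variable and pass that to basename() instead. Try replacing these 3 lines: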

f=sc.textFile("/user/userid/folder/report_20190501.csv")
part = MIMEApplication("".join(f.collect()).encode('utf-8', 'ignore'), Name=basename(f))
part['Content-Disposition'] = 'attachment; filename="%s"' % basename(f)

with:

file_path="/user/userid/folder/report_20190501.csv"
f=sc.textFile(file_path)
part = MIMEApplication("".join(f.collect()).encode('utf-8', 'ignore'), Name=basename(file_path))
part['Content-Disposition'] = 'attachment; filename="%s"' % basename(file_path)
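
The same idea can be sketched without a Spark cluster. This is a minimal illustration, not the asker's exact code: build_attachment is a hypothetical helper, the sample rows are made up, and the list of lines stands in for what sc.textFile(file_path).collect() would return. It is written for Python 3. It also assumes you want line breaks kept: textFile() drops the newline from each line, so the sketch joins with "\n" rather than "", which would otherwise merge every CSV row into one line.

```python
from email.mime.application import MIMEApplication
from os.path import basename


def build_attachment(file_path, lines):
    """Build a MIME attachment named after file_path from a list of text lines.

    Hypothetical helper: `lines` plays the role of rdd.collect().
    """
    # Rejoin with "\n" because each collected line has lost its newline.
    payload = "\n".join(lines).encode("utf-8")
    # file_path is a plain string, so basename() works on it.
    name = basename(file_path)
    part = MIMEApplication(payload, Name=name)
    part["Content-Disposition"] = 'attachment; filename="%s"' % name
    return part


# Stand-in for sc.textFile(file_path).collect(); the rows are invented.
file_path = "/user/userid/folder/report_20190501.csv"
part = build_attachment(file_path, ["id,score", "1,0.9"])
```

In the original function you would call build_attachment(file_path, f.collect()) and pass the result to msg.attach(). Note that collect() pulls the whole file onto the driver, so this only suits reports small enough to email anyway.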
