python - IllegalMonthError in Python datefinder
Question
I am trying to use the datefinder Python library to extract dates from email texts.
Below is a snippet of what I am trying to do.
import datefinder

# body has list of email texts
email_dates = []
for b in body:
    dates = datefinder.find_dates(b)
    date = []
    for d in dates:
        date.append(d)

    email_dates.append(date)
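datefinder tries to interpret every number in an email as a date, so I get a lot of false positives. I can filter those out with some logic. For some emails, however, I get an IllegalMonthError, and I cannot get past the error to retrieve the dates from the other emails. The error is below: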
---------------------------------------------------------------------------
IllegalMonthError Traceback (most recent call last)
c:\python\python38\lib\site-packages\dateutil\parser\_parser.py in parse(self, timestr, default, ignoretz, tzinfos, **kwargs)
654 try:
--> 655 ret = self._build_naive(res, default)
656 except ValueError as e:
c:\python\python38\lib\site-packages\dateutil\parser\_parser.py in _build_naive(self, res, default)
1237
-> 1238 if cday > monthrange(cyear, cmonth)[1]:
1239 repl['day'] = monthrange(cyear, cmonth)[1]
c:\python\python38\lib\calendar.py in monthrange(year, month)
123 if not 1 <= month <= 12:
--> 124 raise IllegalMonthError(month)
125 day1 = weekday(year, month, 1)
IllegalMonthError: bad month number 42; must be 1-12
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
<ipython-input-39-1fbacc8ca3f6> in <module>
7 dates = datefinder.find_dates(b)
8 date = []
----> 9 for d in dates:
10 date.append(d)
11
c:\python\python38\lib\site-packages\datefinder\__init__.py in find_dates(self, text, source, index, strict)
30 ):
31
---> 32 as_dt = self.parse_date_string(date_string, captures)
33 if as_dt is None:
34 ## Dateutil couldn't make heads or tails of it
c:\python\python38\lib\site-packages\datefinder\__init__.py in parse_date_string(self, date_string, captures)
100 # otherwise self._find_and_replace method might corrupt them
101 try:
--> 102 as_dt = parser.parse(date_string, default=self.base_date)
103 except (ValueError, OverflowError):
104 # replace tokens that are problematic for dateutil
c:\python\python38\lib\site-packages\dateutil\parser\_parser.py in parse(timestr, parserinfo, **kwargs)
1372 return parser(parserinfo).parse(timestr, **kwargs)
1373 else:
-> 1374 return DEFAULTPARSER.parse(timestr, **kwargs)
1375
1376
c:\python\python38\lib\site-packages\dateutil\parser\_parser.py in parse(self, timestr, default, ignoretz, tzinfos, **kwargs)
655 ret = self._build_naive(res, default)
656 except ValueError as e:
--> 657 six.raise_from(ParserError(e.args[0] + ": %s", timestr), e)
658
659 if not ignoretz:
TypeError: unsupported operand type(s) for +: 'int' and 'str'
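Suppose I hit this error on the 5th email: I then cannot retrieve any dates from the 5th email onward. How can I bypass this error, drop the entries that cause it, and still retrieve all the other dates?
Thanks in advance.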
Solution
Use a try/except block:
from calendar import IllegalMonthError

try:
    datefinder.find_dates(b)
except IllegalMonthError as e:
    # this will print the error, but will not stop the program
    print(e)
except Exception:
    # any other unexpected error will be propagated
    raise
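A note on which exception actually reaches your code: judging from the traceback in the question, dateutil catches the IllegalMonthError as a ValueError and then fails while building its own error message, so the caller sees a TypeError. The short sketch below reproduces that mechanism using only the standard library. It is based on the dateutil lines shown in the traceback; other dateutil versions may behave differently.

```python
import calendar

# Trigger the same error as in the traceback: 42 is not a valid month.
try:
    calendar.monthrange(2020, 42)
except calendar.IllegalMonthError as err:
    caught = err

# IllegalMonthError is a subclass of ValueError, so an
# `except ValueError` handler (as in the traceback) catches it.
is_value_error = isinstance(caught, ValueError)

# Its first argument is the offending month as an int, not a message string.
first_arg = caught.args[0]

# Mirrors `e.args[0] + ": %s"` from the traceback: int + str is not allowed.
try:
    first_arg + ": %s"
    concat_error = None
except TypeError as err:
    concat_error = err

print(is_value_error, repr(first_arg), repr(concat_error))
```

In practice this means that an `except IllegalMonthError` clause alone may never fire here, and TypeError is the exception to catch.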
Update from edit:
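Note that the traceback shows that
----> 9 for d in dates:
is where the exception is raised. Indeed, if you check the documentation of find_dates, you will see that it returns a generator:
Returns a generator that produces datetime.datetime objects, or a tuple with the source text and index, if requested
So the actual parsing of the dates does not happen when you call find_dates, but when you iterate over the results. That makes wrapping it in a try/except trickier, because you have to step through the generator item by item, each step in its own try/except block: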
from datefinder import find_dates

string_with_dates = """
...
entries are due by January 4th, 2017 at 8:00pm
...
created 01/15/2005 by ACME Inc. and associates.
...
Liverpool NY 13088 42 cases
"""

matches = find_dates(string_with_dates)
print(type(matches))  # <class 'generator'>

while True:
    try:
        m = next(matches)
    # this is the exception seen by the program, rather than IllegalMonthError
    except TypeError as e:
        print(f"TypeError {e}")
        continue
    # the generator has no more items
    except StopIteration as e:
        print(f"StopIteration {e}")
        break
    # any other unexpected error will be propagated
    except Exception as e:
        raise e
    print(f"m {m}")
You can then do whatever you need with m.
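One caveat: in Python, a generator that raises an exception is finished, so the next call to next() on it raises StopIteration. Dates that come after the failing entry in the same text are therefore lost, but the other emails can still be processed. A minimal sketch of a per-email wrapper follows. The names drain_safely and dates_per_text are hypothetical helpers, not part of datefinder, and find_dates is passed in as a parameter so the sketch does not depend on any particular datefinder version.

```python
from typing import Any, Callable, Iterable, Iterator, List, Optional, Tuple


def drain_safely(gen: Iterator[Any]) -> Tuple[List[Any], Optional[Exception]]:
    """Collect items from `gen` until it is exhausted or raises."""
    found: List[Any] = []
    try:
        for item in gen:
            found.append(item)
    except Exception as exc:
        # Broad on purpose: the traceback shows a TypeError, but the
        # parser may raise other exception types as well.
        return found, exc
    return found, None


def dates_per_text(
    texts: Iterable[str],
    find_dates: Callable[[str], Iterator[Any]],
) -> List[List[Any]]:
    """Return one list of results per text; a failure only cuts short its own text."""
    results: List[List[Any]] = []
    for text in texts:
        found, error = drain_safely(find_dates(text))
        if error is not None:
            print(f"stopped early on one text: {error!r}")
        results.append(found)
    return results


# Intended usage with the question's variables (assumes datefinder is installed):
#   import datefinder
#   email_dates = dates_per_text(body, datefinder.find_dates)
```

Each email gets its own generator, so an error in the 5th email keeps whatever was found before the failure and leaves the 6th email onward unaffected.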
Cheers!