python-3.x - python: UnicodeDecodeError: 'utf-8' codec can't decode byte
Problem description
I am trying to read the latest jstack file and count occurrences of "RUNNABLE", "BLOCKED", and "TIMED_WAITING". This worked at first, but after a few runs and some changes to the word list it stopped working, and the output now shows the error below. I tried opening the file with encoding='utf-8' but got the same error. With encoding='ISO-8859-1' the script runs, but the counts are not correct.
import os

def wordcount(filename, listwords):
    try:
        # file = open(filename, encoding ='ISO-8859-1')
        # file = open(filename, encoding ='utf-8')
        file = open(filename, "r")
        read = file.readlines()
        file.close()
        for word in listwords:
            #lower = word.lower()
            count = 0
            for sentence in read:
                line = sentence.split()
                for each in line:
                    line2 = each.upper()
                    #line2 = line2.strip("java.lang.Thread.State: ")
                    if word == line2:
                        count += 1
            print (word, ":", count)
    except FileNotFoundError:
        print ("Thread dump is not there")

path = '/Users/YEscobar/Desktop/jstack'
filePath = [os.path.join(path, fname) for fname in os.listdir(path)]
lastFile = sorted(filePath, key=os.path.getctime)[-1]
wordcount (lastFile,["RUNNABLE","BLOCKED", "TIMED_WAITING"])
Console output
/Users/YEscobar/.virtualenvs/python_workstation1/bin/python /Users/YEscobar/Library/Preferences/PyCharmCE2018.2/scratches/test6.py
Traceback (most recent call last):
  File "/Users/YEscobar/Library/Preferences/PyCharmCE2018.2/scratches/test6.py", line 32, in <module>
    wordcount (lastFile,["RUNNABLE","BLOCKED","TIMED_WAITING"])
  File "/Users/YEscobar/Library/Preferences/PyCharmCE2018.2/scratches/test6.py", line 9, in wordcount
    read = file.readlines()
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xdb in position 20: invalid continuation byte
Console output with the encoding='ISO-8859-1' line uncommented
RUNNABLE : 2
BLOCKED : 0
TIMED_WAITING : 3
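The difference between the two encodings can be reproduced in isolation. The sketch below uses made-up bytes (not taken from the actual jstack file) in which 0xdb is followed by an ordinary ASCII letter: UTF-8 rejects the sequence, while ISO-8859-1 assigns a character to every possible byte value and therefore never raises. A clean run with ISO-8859-1 says nothing about whether the file really is text in that encoding.

```python
# Hypothetical sample bytes: 0xdb starts a two-byte UTF-8 sequence,
# but the next byte (ASCII "R") is not a valid continuation byte.
raw = b"State: \xdbRUNNABLE"

utf8_reason = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    utf8_reason = exc.reason
    print("utf-8 failed at byte", exc.start, ":", utf8_reason)

# ISO-8859-1 maps each of the 256 byte values to one code point,
# so decoding succeeds for any input, including binary data.
text = raw.decode("iso-8859-1")
print(repr(text))
```

Note that the decoded text contains an extra character glued to the keyword, so a whitespace-split token taken from it would not compare equal to "RUNNABLE".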
Grep on the console
grep -o RUNNABLE jstack.20180802-202002.log | wc -l
14
grep -o BLOCKED jstack.20180802-202002.log | wc -l
0
grep -o TIMED_WAITING jstack.20180802-202002.log | wc -l
24
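For a like-for-like comparison with `grep -o WORD file | wc -l`, the Python side would need to count every case-sensitive, non-overlapping occurrence of the word rather than whitespace-separated tokens that equal it after upper-casing. A minimal sketch follows, assuming the dump is mostly ASCII; the function name `count_like_grep` and the choice of `errors="replace"` are illustrative assumptions, not something the question specifies.

```python
def count_like_grep(path, words, encoding="utf-8"):
    """Count non-overlapping, case-sensitive occurrences of each word."""
    counts = {word: 0 for word in words}
    # errors="replace" turns undecodable bytes into U+FFFD instead of raising;
    # ASCII keywords such as RUNNABLE are unaffected by the substitution.
    with open(path, encoding=encoding, errors="replace") as handle:
        for line in handle:
            for word in words:
                counts[word] += line.count(word)
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_states.py <thread-dump-file>
    if len(sys.argv) > 1:
        result = count_like_grep(
            sys.argv[1], ["RUNNABLE", "BLOCKED", "TIMED_WAITING"]
        )
        for word, count in result.items():
            print(word, ":", count)
```

Running both this and grep against the same explicitly named file is a quick way to check whether the mismatch comes from the counting logic or from the script opening a different file.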
Solution
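One possible cause, which the question itself does not confirm: `os.listdir` returns every entry in the directory, including hidden or binary files (on macOS, for example, `.DS_Store`), so the "latest" file may not be a thread dump at all. That would account for both the decode error and counts that disagree with grep on a named log file. The sketch below restricts the candidates to a filename pattern; the pattern `jstack.*.log` is an assumption inferred from the file name used in the grep commands, and `latest_thread_dump` is a hypothetical helper name.

```python
import glob
import os


def latest_thread_dump(directory, pattern="jstack.*.log"):
    """Return the most recently modified regular file matching the pattern."""
    candidates = [
        path
        for path in glob.glob(os.path.join(directory, pattern))
        if os.path.isfile(path)
    ]
    if not candidates:
        raise FileNotFoundError(
            "no file matching %r in %r" % (pattern, directory)
        )
    # On Unix, getctime is the last metadata change rather than the creation
    # time, so modification time is used here to pick the newest dump.
    return max(candidates, key=os.path.getmtime)
```

Printing the returned path before counting makes it easy to confirm that the script and the grep commands are looking at the same file.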