python - How do I read a log file with pandas?
Problem description
Some of the entries in my log file look like this:
1. IP428702 - - [02/Sep/2017:18:44:27 +0200] "GET /?ln=de HTTP/1.1" 200 4858 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 122026 0 NOSSL
2. 22354 - - [01/Sep/2017:07:12:06 +0200] "GET / HTTP/1.1" 200 18359 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1" 131909 0 NOSSL
3. IP428702 - - [02/Sep/2017:18:42:14 +0200] "GET /search?ln=en&sc=1&p=1&action_search=1 HTTP/1.1" 200 9490 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36\"'`--" 2155371 2 NOSSL
4. IP428702 - - [02/Sep/2017:18:42:43 +0200] "GET /search?ln=en&sc=1&p=&action_search= HTTP/1.1" 200 9796 "http://doc.rero.ch/search?l...\"'`--" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 5776261 5 NOSSL
5. IP173839 - - [02/Sep/2017:12:09:55 +0200] "GET /server/document/get_indexing?page_nr=16&from=&to=&url=http://doc.rero.ch/record/1... HTTP/1.1" 200 131113 "http://doc.rero.ch/client/fr//" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112
6. IP423766 - - [01/Sep/2017:14:30:25 +0200] "GET /record/11876/files/bulletin_vals_asla_2007_085.pdf?version=1'\" HTTP/1.1" 200 6847339 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; iebar; acc=none; SV1; snprtz|S04087544802137; .NET CLR 1.1.4322)" 241381 0 NOSSL
7. IP427 - - [01/Sep/2017:14:30:25 +0200] "GET /record/258826/export/xd?ln=en HTTP/1.1" 200 441 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search..." 114963 0 NOSSL
The code I use to read the log entries is:
import pandas as pd

data = pd.read_csv(
    'path_to logfile',
    # split on whitespace that lies outside double quotes and square brackets
    sep=r'\s+(?=(?:[^"]*"[^"]*")*[^"]*$)(?![^\[]*\])',
    engine='python',
    names=["ip", "time", "request", "status", "size", "referer", "user_agent"],
    skipfooter=1,
    usecols=[0, 3, 4, 5, 6, 7, 8],
)
For some entries, the first field it returns is:
"IP423766 - - [01/Sep/2017:14:30:25 +0200] "GET "
How can I get all of the fields from each entry?
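A likely explanation for the truncated first field can be checked without pandas. The sketch below applies the question's separator pattern to entry 6 with `re.split`, which is assumed here to approximate how the Python parsing engine applies a regex `sep`. The lookahead in the pattern only allows a split when an even number of double quotes remains on the line, and the escaped quote (`\"`) in the request field makes the total odd, so no split happens until after `"GET`.

```python
import re

# Separator copied from the read_csv call in the question.
SEP = r'\s+(?=(?:[^"]*"[^"]*")*[^"]*$)(?![^\[]*\])'

# Entry 6 from the question; its request field contains an escaped quote (\").
line = r'''IP423766 - - [01/Sep/2017:14:30:25 +0200] "GET /record/11876/files/bulletin_vals_asla_2007_085.pdf?version=1'\" HTTP/1.1" 200 6847339 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; iebar; acc=none; SV1; snprtz|S04087544802137; .NET CLR 1.1.4322)" 241381 0 NOSSL'''

fields = re.split(SEP, line)
quote_count = line.count('"')

print("double quotes on the line:", quote_count)
print("first field:", fields[0])
```

If this is the cause, every entry whose quoted fields contain `\"` will be split in the wrong places, while entries without escaped quotes parse as intended.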
Solution
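One possible approach, offered as a sketch rather than a definitive answer, is to stop splitting on a separator and instead match each line against a regular expression that treats a backslash inside a quoted field as an escape, then build the DataFrame from the matches. The names `Q`, `LINE_RE`, `COLUMNS`, and `parse_log` are hypothetical, and the field layout (ip, two ignored fields, bracketed time, quoted request, status, size, quoted referer, quoted user agent) is an assumption taken from the sample entries above.

```python
import re

import pandas as pd

# Body of a double-quoted field: any character except " or \, or a backslash
# followed by any character, so an escaped \" does not end the field.
Q = r'(?:[^"\\]|\\.)*'

# Assumed layout, inferred from the sample entries in the question.
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ '
    r'\[(?P<time>[^\]]+)\] '
    rf'"(?P<request>{Q})" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    rf'"(?P<referer>{Q})" '
    rf'"(?P<user_agent>{Q})"'
)

COLUMNS = ["ip", "time", "request", "status", "size", "referer", "user_agent"]


def parse_log(lines):
    """Return (DataFrame of parsed entries, list of lines that did not match)."""
    rows, skipped = [], []
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            rows.append(m.groupdict())
        else:
            skipped.append(line)  # e.g. truncated entries with an unclosed quote
    df = pd.DataFrame(rows, columns=COLUMNS)
    df["status"] = df["status"].astype(int)
    df["time"] = pd.to_datetime(df["time"], format="%d/%b/%Y:%H:%M:%S %z")
    return df, skipped


# Entries 2 and 6 from the question; entry 6 contains an escaped quote.
SAMPLE = r'''22354 - - [01/Sep/2017:07:12:06 +0200] "GET / HTTP/1.1" 200 18359 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1" 131909 0 NOSSL
IP423766 - - [01/Sep/2017:14:30:25 +0200] "GET /record/11876/files/bulletin_vals_asla_2007_085.pdf?version=1'\" HTTP/1.1" 200 6847339 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; iebar; acc=none; SV1; snprtz|S04087544802137; .NET CLR 1.1.4322)" 241381 0 NOSSL
'''

df, skipped = parse_log(SAMPLE.splitlines())
print(df[["ip", "status", "size", "request"]])
```

For a real file, an open file object can be passed in place of `SAMPLE.splitlines()`, since `parse_log` only iterates over lines. Lines that do not fit the assumed layout are collected in `skipped` instead of being dropped silently, which makes it easier to spot truncated entries such as entry 5.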