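mongodb - Mongo aggregation pipeline for splitting a Rasa conversation array
Problem description
I have been trying to flatten the Rasa chatbot conversations for a particular user with the Mongo aggregation framework, in order to get the conversation flow together with its metrics, such as the recognized intent and its confidence.
This is the user object: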
{'_id': ObjectId('60c9d6d585d09d658dde14c1'),
'sender_id': '4e2a453009e44767bd09f254c230bd37',
'events': [{'event': 'action',
'name': 'action_session_start',
'confidence': 1.0},
{'event': 'session_started', 'timestamp': 1623840469.2076938},
{'event': 'action',
'name': 'action_listen',
'confidence': None},
{'event': 'user',
'text': 'hi',
'parse_data': {'intent': {'id': -7469901240970573106,
'name': 'greet',
'confidence': 0.9363290667533875},
'entities': [],
'text': 'hi',
'message_id': '66ce7731a1934c40be23c3237b611d1f',
'metadata': {}},
'input_channel': 'cmdline',
'message_id': '66ce7731a1934c40be23c3237b611d1f',
'metadata': {}},
{'event': 'action',
'timestamp': 1623840469.3662663,
'name': 'utter_greet',
'policy': 'policy_1_MemoizationPolicy',
'confidence': 1.0},
{'event': 'bot',
'timestamp': 1623840469.3663723,
'metadata': {'template_name': 'utter_greet'},
'text': 'Hey! How are you?',
'data': {'elements': None,
'quick_replies': None,
'buttons': [{'title': 'great', 'payload': '/mood_great'},
{'title': 'super sad', 'payload': '/mood_unhappy'}],
'attachment': None,
'image': None,
'custom': None}},
{'event': 'action',
'timestamp': 1623840469.370795,
'name': 'action_listen',
'policy': 'policy_1_MemoizationPolicy',
'confidence': 1.0},
{'event': 'user',
'timestamp': 1623840577.3517263,
'text': '/mood_great',
'parse_data': {'intent': {'name': 'mood_great', 'confidence': 1.0},
'entities': [],
'text': '/mood_great',
'message_id': '8ed81a9d7cc546fc8d775245d0498213',
'metadata': {},
'intent_ranking': [{'name': 'mood_great', 'confidence': 1.0}]},
'input_channel': 'cmdline',
'message_id': '8ed81a9d7cc546fc8d775245d0498213',
'metadata': {}},
{'event': 'action',
'timestamp': 1623840577.3575015,
'name': 'utter_happy',
'policy': 'policy_1_MemoizationPolicy',
'confidence': 1.0},
{'event': 'bot',
'timestamp': 1623840577.3575854,
'metadata': {'template_name': 'utter_happy'},
'text': 'Great, carry on!',
'data': {'elements': None,
'quick_replies': None,
'buttons': None,
'attachment': None,
'image': None,
'custom': None}},
{'event': 'action',
'timestamp': 1623840577.363018,
'name': 'utter_please_rephrase',
'policy': 'policy_1_MemoizationPolicy',
'confidence': 1.0},
{'event': 'bot',
'timestamp': 1623840584.5869896,
'metadata': {'template_name': 'utter_please_rephrase'},
'text': "I'm sorry, I didn't quite understand that. Could you rephrase?",
'data': {'elements': None,
'quick_replies': None,
'buttons': None,
'attachment': None,
'image': None,
'custom': None}}]}
This is my code for fetching the necessary details:
list(my_records.aggregate([
    # One output document per event, keeping its position in the array
    {"$unwind": {"path": "$events", "includeArrayIndex": "arrayIndex"}},
    # Keep user and bot events, plus actions other than the housekeeping ones
    {"$match": {"$or": [
        {"events.event": {"$in": ["bot", "user"]}},
        {"$and": [{"events.event": "action"},
                  {"events.name": {"$nin": ["action_listen", "action_session_start"]}}]},
    ]}},
    # Project only the fields needed for the conversation flow
    {"$project": {
        "sender_id": 1, "events.text": 1,
        "events.intent": "$events.parse_data.intent.name",
        "events.confidence": "$events.parse_data.intent.confidence",
        "events.name": 1, "events.event": 1,
    }},
]))
This is the output I get:
[{'_id': ObjectId('60c9d6d585d09d658dde14c1'),
'sender_id': '4e2a453009e44767bd09f254c230bd37',
'events': {'event': 'user',
'text': 'hi',
'intent': 'greet',
'confidence': 0.9363290667533875}},
{'_id': ObjectId('60c9d6d585d09d658dde14c1'),
'sender_id': '4e2a453009e44767bd09f254c230bd37',
'events': {'event': 'action', 'name': 'utter_greet'}},
{'_id': ObjectId('60c9d6d585d09d658dde14c1'),
'sender_id': '4e2a453009e44767bd09f254c230bd37',
'events': {'event': 'bot', 'text': 'Hey! How are you?'}},
{'_id': ObjectId('60c9d6d585d09d658dde14c1'),
'sender_id': '4e2a453009e44767bd09f254c230bd37',
'events': {'event': 'user',
'text': '/mood_great',
'intent': 'mood_great',
'confidence': 1.0}},
{'_id': ObjectId('60c9d6d585d09d658dde14c1'),
'sender_id': '4e2a453009e44767bd09f254c230bd37',
'events': {'event': 'action', 'name': 'utter_happy'}},
{'_id': ObjectId('60c9d6d585d09d658dde14c1'),
'sender_id': '4e2a453009e44767bd09f254c230bd37',
'events': {'event': 'bot', 'text': 'Great, carry on!'}},
{'_id': ObjectId('60c9d6d585d09d658dde14c1'),
'sender_id': '4e2a453009e44767bd09f254c230bd37',
'events': {'event': 'action', 'name': 'utter_please_rephrase'}},
{'_id': ObjectId('60c9d6d585d09d658dde14c1'),
'sender_id': '4e2a453009e44767bd09f254c230bd37',
'events': {'event': 'bot',
'text': "I'm sorry, I didn't quite understand that. Could you rephrase?"}}]
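Is there a way, by extending the pipeline further, to get the desired conversation output in the flattened form below? Note: each conversation flow starts with a "user" event. The output format is: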
[sender_id, user input text, intent name, confidence, action name, bot response text]
The exact output I am looking for is:
[['4e2a453009e44767bd09f254c230bd37','hi','greet',0.9363290667533875,'utter_greet','Hey! How are you?'],
['4e2a453009e44767bd09f254c230bd37','/mood_great','mood_great',1.0,'utter_happy','Great, carry on!','utter_please_rephrase',"I'm sorry, I didn't quite understand that. Could you rephrase?"]]
Solution
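No pipeline-only answer is recorded here. As one possible alternative, the final grouping can be done client-side in Python on the list that the aggregation in the question already returns. The following is a minimal sketch under that assumption, not a pipeline-stage solution: the function name `flatten_turns` and the variable `rows` are hypothetical, and the sample rows are copied from the output shown in the question with `_id` omitted. It relies on the rows arriving in array order, which is how `$unwind` emits them for a single document.

```python
def flatten_turns(rows):
    """Group flattened event rows into one list per user turn.

    `rows` is assumed to have the shape of the aggregation output in the
    question: dicts with a 'sender_id' and a single 'events' sub-document.
    """
    turns = []
    current = None  # turn being built; None until a user event is seen
    for row in rows:
        sender = row["sender_id"]
        ev = row["events"]
        kind = ev["event"]
        # Never let events of one sender spill into another sender's turn.
        if current is not None and current[0] != sender:
            current = None
        if kind == "user":
            # A user event opens a new turn.
            current = [sender, ev.get("text"), ev.get("intent"), ev.get("confidence")]
            turns.append(current)
        elif current is not None:
            if kind == "action":
                current.append(ev.get("name"))
            elif kind == "bot":
                current.append(ev.get("text"))
        # Events that occur before the first user event are skipped.
    return turns


SENDER = "4e2a453009e44767bd09f254c230bd37"
rows = [
    {"sender_id": SENDER, "events": {"event": "user", "text": "hi",
                                     "intent": "greet", "confidence": 0.9363290667533875}},
    {"sender_id": SENDER, "events": {"event": "action", "name": "utter_greet"}},
    {"sender_id": SENDER, "events": {"event": "bot", "text": "Hey! How are you?"}},
    {"sender_id": SENDER, "events": {"event": "user", "text": "/mood_great",
                                     "intent": "mood_great", "confidence": 1.0}},
    {"sender_id": SENDER, "events": {"event": "action", "name": "utter_happy"}},
    {"sender_id": SENDER, "events": {"event": "bot", "text": "Great, carry on!"}},
    {"sender_id": SENDER, "events": {"event": "action", "name": "utter_please_rephrase"}},
    {"sender_id": SENDER, "events": {"event": "bot",
     "text": "I'm sorry, I didn't quite understand that. Could you rephrase?"}},
]

result = flatten_turns(rows)
for turn in result:
    print(turn)
```

With the sample rows this yields two lists, one per user message, matching the desired output above. In practice `rows` would be the list returned by the `my_records.aggregate(...)` call from the question.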