How to save scraped data to a MongoDB database

Problem description

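I have been trying different ideas for saving each of my outputs to a database. One key constraint of MongoDB is that it expects a dictionary to be passed in before an item can be saved to a Mongo collection.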

My data comes from this function, and I can easily print it out after an if condition:

# article_title, published_date and hour are set elsewhere in the script (not shown)
article_text, summary_sentences, entity_names, categorized_data, max_img, keywords = get_article_metadata(
    article_url, reject_p_tags, num_sents_in_summary, sentence_threshold)
print(summary_sentences)
print(article_title)
print('%s at %s:00' % (published_date, hour))
print('Entity Names', entity_names)
print('Categorized data', categorized_data)
print('Top images', max_img)
print('Keywords', keywords)

How can I easily save this to a DataFrame? I can easily convert a DataFrame to a dictionary and then insert it into MongoDB in that order.

Here is a sample of the output, although here I have appended it to a list of dictionaries:

[{'title': 'Shahrukh Birthday Countdown, Trimurti! Anil and Jackie and Shahrukh, Oh My!!!!', 'publish_date': datetime.date(2020, 9, 24), 'Summary': ['Anil stills milk and toys for the baby, Jackie yells at him for teaching the baby bad values, lil’ Anil ends up running away from home.', 'He has such zest for life and just wants to share it, none of that boring rules and advice that Jackie gave out.', 'Childish Shahrukh with Jackie felt like a 60 year old talking to a 6 year old, Anil and Shahrukh feels more like a 25 year old and 18 year old, the characters they are actually playing.'], 'Article_category': 'Entertainment', 'image_links': ['https://dontcallitbollywood.files.wordpress.com/2018/08/cropped-akbar.jpg', 'https://d1ba50i68eftrl.cloudfront.net/images/Artifact_Data/CINE/CINE-pos/.resized_img_s3/medium/1018695.CINE.pos.jpg'], 'keywords': ['ghai likes symbolism', 'put shahrukh ’', 'deep character work', 'highly stylized plots', 'two styles sets']}]

Thanks in advance.

Tags: python, pandas, mongodb, dataframe

Solution
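
One possible approach, sketched under assumptions rather than taken from an accepted answer: since the sample output is already a list of dictionaries, it can be passed to pymongo directly, and a DataFrame is not strictly required. The helper names `build_record` and `save_records`, the connection URI, and the database and collection names below are all hypothetical placeholders. The one detail that likely needs care is `publish_date`: BSON has no date-only type, so pymongo cannot encode a plain `datetime.date`, and the sketch promotes it to a `datetime.datetime` first.

```python
import datetime


def build_record(title, publish_date, summary, category, image_links, keywords):
    """Collect one article's metadata into a dict shaped like the sample output."""
    # BSON has no date-only type, so a plain datetime.date cannot be encoded;
    # promote it to a datetime at midnight before inserting.
    if isinstance(publish_date, datetime.date) and not isinstance(
        publish_date, datetime.datetime
    ):
        publish_date = datetime.datetime.combine(publish_date, datetime.time.min)
    return {
        "title": title,
        "publish_date": publish_date,
        "Summary": list(summary),
        "Article_category": category,
        "image_links": list(image_links),
        "keywords": list(keywords),
    }


def save_records(records, uri="mongodb://localhost:27017",
                 db_name="news", coll_name="articles"):
    """Insert a non-empty list of dicts into MongoDB and return the new ids.

    The URI, database name and collection name are placeholder assumptions.
    """
    # Imported here so build_record can be used without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient(uri)
    try:
        result = client[db_name][coll_name].insert_many(records)
        return result.inserted_ids
    finally:
        client.close()


if __name__ == "__main__":
    record = build_record(
        title="Example title",
        publish_date=datetime.date(2020, 9, 24),
        summary=["First summary sentence.", "Second summary sentence."],
        category="Entertainment",
        image_links=["https://example.com/image.jpg"],
        keywords=["example keyword"],
    )
    print(record)
    # save_records([record])  # uncomment when a MongoDB server is reachable
```

In the scraping loop, each call to `get_article_metadata` would produce one such record, which can be appended to a list and inserted in one batch at the end. If the data does end up in a pandas DataFrame, `df.to_dict("records")` yields the same list-of-dicts shape, so the DataFrame step is optional.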

