python - Python Discord.py bot that builds a large list runs out of memory and crashes on PythonAnywhere
Problem description
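I created a bot for my Discord server that queries the Reddit API for a given subreddit and posts the day's top 10 results in the Discord chat, based on the subreddits you enter. It ignores self posts and really only posts images and GIFs. The Discord message command looks something like this: =get funny awww news, which posts the results for each subreddit as it gets them from the Reddit API. This works without any problem. I know the bot's ability to hit the API and post to Discord works.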
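Here is the problem I need help with:
I added another command, =getshuffled, which puts all of the results from the subreddits into one large list and then shuffles them before posting. This works fine for requests of up to about 50 subreddits.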
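Because it can be a large list of results, 100+ results from 100+ subreddits, the bot crashes on very large requests. Now, I think I know where the problem is: the bot is hosted on PythonAnywhere, where I only have 3GB of RAM. In my head, even a large list shouldn't take up that much memory. No way. So, I figure I'm not clearing memory somewhere?
I'm fairly sure that pulling the data from Reddit and building the big results list consumes a lot, and then when it tries to shuffle and post the results in the next function, it runs out of memory and PythonAnywhere kills the process.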
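To put a rough number on how much memory such a list needs, here is a minimal sketch that estimates the size of a list of [title, url] pairs with `sys.getsizeof`. The titles and URLs are made-up placeholders, and the 100 x 15 worst case is an assumption based on the cap of 15 posts per subreddit in the code further down; it measures only the list itself, not anything the Reddit or Discord libraries hold on to.

```python
import sys

def deep_size(results):
    """Approximate bytes used by a list of [title, url] lists."""
    total = sys.getsizeof(results)
    for post in results:
        total += sys.getsizeof(post)
        total += sum(sys.getsizeof(field) for field in post)
    return total

# Hypothetical worst case: 100 subreddits x 15 posts each.
results = [
    [("Example title %d " % i) * 10, "https://i.imgur.com/example%d.gif" % i]
    for i in range(100 * 15)
]

size_bytes = deep_size(results)
print("%d posts use about %.1f KiB" % (len(results), size_bytes / 1024))
```

If the figure printed here is far below the limit, the list on its own is unlikely to be what exhausts memory, which would point the search elsewhere.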
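This is where I'm stuck: I'm not sure how Python manages memory between functions, or whether I'm even doing it right. I assume it's something simple, like the way I'm calling the functions, or that I don't understand how asyncio works behind the scenes.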
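On the question of memory between functions: Python passes arguments as references to the same object, so handing a list to another function does not copy it, and `random.shuffle` reorders the list in place. A small sketch with hypothetical helper names (`build_results`, `shuffle_and_report`) illustrates this:

```python
import random

def build_results():
    # Hypothetical stand-in for the list assembled while fetching posts.
    return [["title %d" % i, "https://example.com/%d" % i] for i in range(5)]

def shuffle_and_report(results):
    # random.shuffle reorders the existing list; no new list is created.
    random.shuffle(results)
    return id(results)

results = build_results()
same_object = shuffle_and_report(results) == id(results)
print(same_object)  # True: the function received the very same list object
```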
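Here is what PythonAnywhere says about RAM:
Since system RAM is one of the scarcest resources on the PythonAnywhere servers, we limit your processes to a maximum memory size of 3GB. This is a per-process limit, not a system-wide one, so if you have larger memory requirements, you may be able to do the processing you need by running multiple smaller processes. If a process exceeds the memory limit, it will be killed.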
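One way to check whether the process really approaches that limit is to log its peak memory from inside the bot. The sketch below uses the standard-library `resource` module, which is available on Unix-like systems only. Reading "3GB" as 3 GiB is an assumption, and `peak_rss_bytes` is a hypothetical helper name.

```python
import resource
import sys

# Assumption: the quoted 3GB limit is interpreted here as 3 GiB.
LIMIT_BYTES = 3 * 1024 ** 3

def peak_rss_bytes():
    """Peak resident set size of this process, in bytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    return peak if sys.platform == "darwin" else peak * 1024

peak = peak_rss_bytes()
print("peak RSS: %.1f MiB of %.0f MiB allowed" % (peak / 2 ** 20, LIMIT_BYTES / 2 ** 20))
```

Printing this value before and after the large request would show whether memory use climbs anywhere near the limit.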
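Any help with this would be greatly appreciated!!!
(Also, if anything in the code makes you shake your head, please let me know. I'm sure I've picked up some bad practices! Thanks, everyone.)
Edit: the Python version is 3.6 and the Discord.py version is 0.16.12.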
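Edit 2: I checked the server logs. This is the error raised when running a large number of results (it is raised in the main_post function; I have marked the line with a comment):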
ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending coro=<Client._run_event() running at /home/GageBrk/.local/lib/python3.6/site-packages/discord/client.py:307> wait_for=<Future pending cb=[BaseSelectorEventLoop._sock_connect_done(13)(), <TaskWakeupMethWrapper object at 0x7f0aa5bbaa38>()]>>
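The log above does not mention memory. asyncio emits this message when a task is destroyed while still pending. One thing that may be worth ruling out (an assumption, not something the log proves) is that synchronous network calls inside a coroutine block the event loop for long stretches, which starves other tasks. The sketch below shows the general technique of moving a blocking call to a thread with `loop.run_in_executor`. `fetch_blocking` and `heartbeat` are hypothetical stand-ins, not part of Discord.py or any Reddit library.

```python
import asyncio
import time

def fetch_blocking(name):
    """Hypothetical stand-in for a synchronous network call."""
    time.sleep(0.3)
    return [name + "-post"]

async def heartbeat(ticks, state):
    # Stand-in for background work the event loop must keep servicing.
    while not state["done"]:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.05)

async def main():
    loop = asyncio.get_event_loop()
    ticks = []
    state = {"done": False}
    beat = loop.create_task(heartbeat(ticks, state))
    # Run the blocking call in the default thread pool so the loop stays free.
    posts = await loop.run_in_executor(None, fetch_blocking, "funny")
    state["done"] = True
    await beat
    return posts, len(ticks)

loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
posts, tick_count = loop.run_until_complete(main())
loop.close()
print(posts, tick_count)
```

Because the blocking call runs in a worker thread, the heartbeat keeps ticking while the fetch is in progress; calling `fetch_blocking` directly inside the coroutine would freeze it for the whole 0.3 seconds.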
Code:
import logging
import socket
import sys
import random
import os
from redditBot_auth import reddit
import discord
import asyncio
from discord.ext.commands import Bot
#from discord.ext import commands
import platform
# Assumed: the posted code never defines `client`; the '=' prefix matches the commands described above.
client = Bot(command_prefix='=')

@client.event
async def on_ready():
    return await client.change_presence(game=discord.Game(name='Getting The Dank Memes'))

def is_number(s):
    # True when the argument can be parsed as an integer.
    try:
        int(s)
        return True
    except (TypeError, ValueError):
        return False

def show_title(s):
    # True when the argument asks for post titles to be shown.
    return s in ('TITLE', 'TITLES', 'title', 'titles')

#This gets results for each subreddit and puts into nested list with format
#[[title1, URL1], [title2, URL2], [title3, URL3]]
#It then passes that results list to the function that posts to Discord
async def main_loop(*args, shuffled=False):
    q=10
    #This takes an integer value argument from the input string.
    #It sets the number variable,
    #Then deletes the number from the arguments list.
    #same with the title variable
    title = False
    for item in args:
        if is_number(item):
            q = item
            q = int(q)
            if q > 15:
                q=15
            args = [x for x in args if not is_number(x)]
        if show_title(item):
            title = True
            args = [x for x in args if not show_title(x)]
    number_of_posts = q * len(args)
    results=[]
    #results = [[] for x in range(number_of_posts)] #create nested lists for each post. This is ALL the links that will be posted.
    TESTING = False #If this is turned to True, the subreddit of each post will be posted. Will use defined list of results
    NoGrabResults = False
    i = 0
    await client.say('*Posting ' + str(number_of_posts) + ' posts from ' + str(len(args))+' subreddits*')
    #This pulls the data and creates a list of links for the bot to post
    if NoGrabResults == False: #This is for testing, ignore
        for item in args:
            try:
                #subreddit_results = [[] for x in range(q)] #create nested lists for each subreddit.
                #This allows for duplicate deletion within the subreddit results,
                #rather than going over the entire 'results' list.
                subreddit_results=[]
                e = 0 #counter for the subreddit_results list.
                #await client.say('<'+item+'>')
                subreddit = reddit.subreddit(item)
                Day = subreddit.top('day', limit= q*2)
                Week = subreddit.top('week', limit = q*2)
                Month = subreddit.top('month', limit = q*2)
                Year = subreddit.top('year', limit = q*2)
                AllTime = subreddit.top('all', limit = q*2)
                print(item)
                for submission in Day:
                    post = []
                    if len(subreddit_results) < q :
                        if submission.is_self is False:
                            if '/v.redd.it/' not in submission.url:
                                #print(submission.url)
                                if '.gif' or 'imgur.com' or 'gfycat' in submission.url:
                                    if submission.url not in subreddit_results:
                                        post.append(submission.title)
                                        post.append(submission.url)
                                        #post.append(item)
                                        subreddit_results.append(post)
                #print('pulled posts from Daily')
                if len(subreddit_results) < q :
                    #print('getting posts from Weekly')
                    for submission in Week:
                        post = []
                        if len(subreddit_results) < q :
                            if submission.is_self is False:
                                if '/v.redd.it/' not in submission.url:
                                    #print(submission.url)
                                    if '.gif' or 'imgur.com' or 'gfycat' in submission.url:
                                        if submission.url not in subreddit_results:
                                            post.append(submission.title)
                                            post.append(submission.url)
                                            #post.append(item)
                                            subreddit_results.append(post)
                if len(subreddit_results) < q :
                    #print('getting posts from Monthly')
                    for submission in Month:
                        post = []
                        if len(subreddit_results) < q :
                            if submission.is_self is False:
                                if '/v.redd.it/' not in submission.url:
                                    #print(submission.url)
                                    if '.gif' or 'imgur.com' or 'gfycat' in submission.url:
                                        if submission.url not in subreddit_results:
                                            post.append(submission.title)
                                            post.append(submission.url)
                                            #post.append(item)
                                            subreddit_results.append(post)
                if len(subreddit_results) < q :
                    #print('getting posts from Yearly')
                    for submission in Year:
                        post = []
                        if len(subreddit_results) < q :
                            if submission.is_self is False:
                                if '/v.redd.it/' not in submission.url:
                                    if '.gif' or 'imgur.com' or 'gfycat' in submission.url:
                                        if submission.url not in subreddit_results:
                                            post.append(submission.title)
                                            post.append(submission.url)
                                            #post.append(item)
                                            subreddit_results.append(post)
                if len(subreddit_results) < q :
                    #print('getting posts from All Time')
                    for submission in AllTime:
                        post = []
                        if len(subreddit_results) < q :
                            if submission.is_self is False:
                                if '/v.redd.it/' not in submission.url:
                                    if '.gif' or 'imgur.com' or 'gfycat' in submission.url:
                                        if submission.url not in subreddit_results:
                                            post.append(submission.title)
                                            post.append(submission.url)
                                            #post.append(item)
                                            subreddit_results.append(post)
                #print (subreddit_results)
                # If they don't want shuffled results, it will post results
                # to Discord as it gets them, instead of creating the nested list
                if shuffled == False:
                    await client.say('<'+item+'>')
                    for link in subreddit_results:
                        if title==True:
                            await client.say(link[0]) #title
                        await client.say(link[1]) #post url 1
                        if TESTING == True:
                            await client.say(link[2]) #subreddit
                        if title==True:
                            await client.say("_") #spacer to separate title from post above
                else:
                    for link in subreddit_results:
                        results.append(link)
            except Exception as e:
                if 'Redirect to /subreddits/search' or '404' in str(e):
                    await client.say('*'+item+' failed...* '+'`Subreddit Does Not Exist`')
                if '403' in str(e):
                    await client.say('*'+item+' failed...* '+'`Access Denied`')
                print(str(e) + ' --> ' + item)
                pass
        print ('results loaded')
        await main_post(results, shuffled, title, TESTING)
    else:
        from Test_args import LargeResults as results
        #print(results)
        await main_post(results, shuffled, title, TESTING)

#this shuffles the posts and posts to Discord.
async def main_post(results, shuffled, title, TESTING):
    try:
        if shuffled == True:
            print('____SHUFFLED___')
            random.shuffle(results)
            random.shuffle(results)
            random.shuffle(results)
        #This posts the links in the 'results' list to Discord
        for post in results:
            try:
                # THIS IS WHERE THE PROGRAM IS FAILING!!
                if title==True:
                    await client.say(post[0]) #title
                await client.say(post[1]) #post url
                if TESTING == True:
                    await client.say(post[2]) #subreddit
                if title==True:
                    await client.say("_") #spacer to separate title from post above
            except Exception as e:
                print(e)
                pass
        await client.say('ALL DONE! ! !')
    except Exception as e:
        print (e)
        pass
        await client.say('`' +str(e) +'`')

@client.command()
async def get(*args, brief="say '=get' followed by a list of subreddits", description="To get the 10 Top posts from a subreddit, say '=get' followed by a list of subreddits:\n'=get funny news pubg'\n would get the top 10 posts for today for each subreddit and post to the chat."):
    #sr = '+'.join(args)
    await main_loop(*args)

#THIS POSTS THE POSTS RANDOMLY
@client.command()
async def getshuffled(*args, brief="say '=getshuffled' followed by a list of subreddits", description="Does the same thing as =get, but grabs ALL of the posts and shuffles them, before posting."):
    await main_loop(*args, shuffled=True)

client.run('my ID')
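Two conditions in the code above do less than they appear to, though neither is shown to cause the crash. The expression `'.gif' or 'imgur.com' or 'gfycat' in submission.url` is always truthy, because Python evaluates it as `'.gif' or (...)` and a non-empty string is truthy. The check `submission.url not in subreddit_results` compares a string against a list of [title, url] lists, so it never finds a duplicate. A hedged sketch of the same per-listing filter, using `any()` and a set of seen URLs, could look like the following. `pick_media_posts`, `MEDIA_MARKERS`, and the fake submissions are illustrative names and data, not part of any library.

```python
from types import SimpleNamespace

MEDIA_MARKERS = ('.gif', 'imgur.com', 'gfycat')

def pick_media_posts(submissions, q):
    """Return up to q unique [title, url] pairs from an iterable of submissions."""
    picked = []
    seen_urls = set()
    for submission in submissions:
        if len(picked) >= q:
            break
        url = submission.url
        if submission.is_self or '/v.redd.it/' in url:
            continue
        # any() tests every marker against the URL, one at a time.
        if not any(marker in url for marker in MEDIA_MARKERS):
            continue
        if url in seen_urls:
            continue
        seen_urls.add(url)
        picked.append([submission.title, url])
    return picked

# Fake submissions exposing only the attributes the filter reads.
fake = [
    SimpleNamespace(title='cat', url='https://i.imgur.com/a.gif', is_self=False),
    SimpleNamespace(title='cat again', url='https://i.imgur.com/a.gif', is_self=False),
    SimpleNamespace(title='text', url='https://reddit.com/r/x/1', is_self=True),
    SimpleNamespace(title='video', url='https://v.redd.it/abc', is_self=False),
    SimpleNamespace(title='article', url='https://example.com/story', is_self=False),
    SimpleNamespace(title='dog', url='https://gfycat.com/dog', is_self=False),
]
posts = pick_media_posts(fake, q=10)
print(posts)
```

With a working filter, fewer posts pass than before, so the marker list may need extending (for example with image file extensions) if plain images should still be posted; which markers to include is a judgment call.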
Solution