python - 如何使用马尔可夫链获得所有可能性?
问题描述
我正在尝试使用马尔可夫链实现语言密码破解器。
这背后的想法是从文本中选择 n-gram,选择一个起始 n-gram(通常是一个位于句子开头的单词)并将其表示为一个状态,使用前 n-1 个字符。例如,对于“the”,我将使用“th”。这将有一个字母列表及其出现次数,并将表示为字典。dict["th"] = [('e', 120), ('a', 79)]
等等。对于这些值中的每一个,我将尝试创建一个满足我的密码或密码长度的马尔可夫链。这意味着当马尔可夫链与我要查找的密码长度相同时,我将停止执行并检查马尔可夫链是否与我的密码相同。我正在尝试使用递归函数来实现这一点,但由于某种原因,我遇到了堆栈溢出。
def ceva(myTry, good_all, pwd, guess, level):
save = myTry
if len(pwd) == len(guess):
if pwd == guess:
return 1
else:
if myTry in good_all.keys():
values = good_all[myTry]
for i in range(0,len(values)):
#print(i, len(values))
letter = values[i][0]
#print("First",myTry, letter)
pwd += letter
if i != len(values)-1:
if len(pwd) == len(guess):
#print("In if", pwd, myTry)
if pwd == guess:
print("I found:", pwd)
return 1
else:
pwd = pwd[0:len(pwd)-1]
else:
myTry += letter
myTry = myTry[1:]
#print("In else: ",pwd, myTry)
return ceva(myTry, good_all, pwd, guess, level)
else:
if len(pwd) == len(guess):
#print("In if", pwd, myTry)
if pwd == guess:
print("I found:", pwd)
return 1
pwd = pwd[0:len(pwd)-1]
for key, letterList in starter_follows.items():
myTry = key.replace("_", "")
# i will not treat the case when the starting phrase
# is a single character
if myTry == "i":
pass
else:
for letter in letterList:
if letter[0] not in "_.-\"!":
myTry += letter[0]
pwd = copy.copy(myTry)
#print("Starter:", pwd)
res=ceva(myTry, good_all, pwd, toGuess, 1)
myTry = myTry[0:len(myTry)-1]
使用这个算法,我达到了最大递归深度。但我正在尝试获取所有马尔可夫链,直到找到密码。
编辑1:现在,使用更新的代码,找到了密码,但这只是因为我正在遍历所有可能的最后一个字母。
例如:“确实”
ind
已经在我的初学者列表中,并且我发现的所有三元组都有“e”作为它们最常见的下一个字母。所以添加了 e,然后是下一个 e,然后是下一个 e,现在密码是“indeee”,但我正在切片最后一个字母并再次遍历,最终找到“确实”,这没关系。问题是,如果我给indedd
它,它将找不到我的密码,因为第二个“d”永远不会循环通过。我怎样才能回到我的迭代中并遍历所有级别的所有可能的字母?
解决方案
由于给出的答案,我最终成功地完成了它。我正在发布工作算法,希望有一天它会对某人有所帮助。
def ceva(myTry, good_all, pwd, guess, flag):
if len(pwd) > len(guess):
return 0
if len(pwd) == len(guess):
if pwd == guess:
flag = 1
return 1
else:
return 0
save = copy.copy(myTry)
#print("Start of functionn:", "[", pwd, ",", myTry, "]")
if flag == 1:
return 1
else:
if myTry in good_all.keys():
# get the list of letters for this specific trigram
values = good_all[myTry]
if len(pwd) <= len(guess):
for i in range(0,len(values)):
#myTry = copy.copy(save)
# get the letter
letter = values[i][0]
# add the letter to the password
pwd += letter
# if i found the password, set flag to 1 and break the loop
if pwd == guess:
flag = 1
print("I found:", pwd)
return 1
# add the new letter to the trigram, and the get rid of the first
# letter, in order to create the new trigram
myTry += letter
myTry = myTry[1:]
#print("Pwd before cutting: [", pwd, "]")
res = ceva(myTry, good_all, pwd, guess, flag)
#print(res)
if res == 0:
pwd = pwd[0:len(pwd)-1]
myTry = pwd[-3:]
# print("This is after stop: [", pwd,",", myTry, "]")
else:
return 1
if flag == 0:
return 0
else:
return 1
flag = 0
for key, letterList in starter_follows.items():
myTry = key.replace("_", "")
# i will not treat the case when the starting phrase
# is a single character
if myTry == "i":
pass
else:
#print("Aaaa")
for letter in letterList:
if letter[0] not in "_.-\"!":
# add the letter to create a (n-1)-gram
myTry += letter[0]
# create a copy of the starting password
pwd = copy.copy(myTry)
# call the recursive function
res=ceva(myTry, good_all, pwd, toGuess, flag)
# go back to the begining, to try a new start
# e.g.: "the" first, "tha" second
myTry = myTry[0:len(myTry)-1]
推荐阅读
- python - 具有评分准确性的 RandomizedSearchCV 会导致错误:评分失败。这些参数在此训练测试分区上的分数将设置为 nan
- java - java - 如何在Java运行时获取从抽象类继承的正确实例?
- r - 使用 kernlabs lssvm 生成预测
- python - 使用 reg exp 分隔打印在一行中的多个文本日志
- c# - c# AutoUpdater.NET - 如果更新可用,应用程序不应关闭
- java - 使用用户输入附加文件名的一部分
- python - 如何在多进程之间共享值?
- python - python绕过二进制if语句并发出全0
- php - 如何在 PHP 函数中传递多个按位值?
- python - 在 OSX 上使用 cx_Freeze 构建后应用程序无法启动