How do I decode this binary string in Python?

Problem description

So, I have this string: 01010011101100000110010101101100011011000110111101110100011010000110010101110010011001010110100001101111011101110111100101101111011101010110010001101111011010010110111001100111011010010110110101100110011010010110111001100101011000010111001001100101011110010110111101110101011001100110100101101110011001010101000000000000

I want to decode it with Python, but I get this error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 280: invalid start byte

According to this website: https://www.binaryhexconverter.com/binary-to-ascii-text-converter

the output should be S�ellotherehowyoudoingimfineareyoufineP

This is my code:

def decodeAscii(bin_string):
    binary_int = int(bin_string, 2);
  
    byte_number = binary_int.bit_length() + 7 // 8
    binary_array = binary_int.to_bytes(byte_number, "big")
    ascii_text = binary_array.decode()
    
    print(ascii_text)

How can I fix it?

Tags: python, binary

Solution


As the error message tells you, your bytes simply cannot be decoded as UTF-8.
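The position 280 in that message is itself a clue, since the 320-bit string only holds 40 bytes. A minimal sketch of what seems to be happening, using only the first 24 bits of the question's string (the variable names here are illustrative, not from the question): `binary_int.bit_length() + 7 // 8` is evaluated as `bit_length() + (7 // 8)`, so `to_bytes` is asked for one byte per bit and pads the front with zero bytes. For the full string the same arithmetic gives 319 bytes instead of 40, which puts 0xb0 at index 280.

```python
# First 24 bits of the question's string: 'S', the byte 0xb0, then 'e'.
bits = "010100111011000001100101"

value = int(bits, 2)

# As written in the question: `+ 7 // 8` is evaluated as `+ (7 // 8)`, i.e. `+ 0`,
# so the "byte count" is really the bit count.
buggy_count = value.bit_length() + 7 // 8
# Intended: round the bit count up to whole bytes.
fixed_count = (value.bit_length() + 7) // 8

buggy_bytes = value.to_bytes(buggy_count, "big")
fixed_bytes = value.to_bytes(fixed_count, "big")

print(buggy_count, fixed_count)   # 23 3
print(fixed_bytes)                # b'S\xb0e'
print(buggy_bytes.index(0xb0))    # 21 (20 zero bytes of padding, then 'S')
```

Fixing the byte count removes the padding, but 0xb0 is still not valid UTF-8 on its own, so the decode error remains. If the input could begin with eight or more zero bits, `len(bits) // 8` would keep those leading zero bytes, which `bit_length()` cannot see.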

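UTF-8 is the default value of the encoding parameter of decode. The best way to pass the right encoding is to know which one was used; otherwise you will have to guess.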
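As a rough illustration of passing these parameters explicitly, the sketch below decodes a short hand-written excerpt of the message (its first six bytes) in three ways. The variable names are illustrative only. `errors="replace"` keeps UTF-8 but substitutes U+FFFD for undecodable bytes, which gives the same kind of S�ello result the website shows; whether the website actually works this way is an assumption. `ascii()` is used when printing so the output does not depend on the console's encoding.

```python
data = b"S\xb0ello"  # excerpt: 0xb0 is not valid on its own in UTF-8

# Default behaviour: strict UTF-8, which raises on the byte 0xb0.
try:
    data.decode()
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Keep UTF-8 but substitute U+FFFD (the replacement character) for bad bytes.
replaced = data.decode("utf-8", errors="replace")

# Name a single-byte encoding explicitly; latin-1 maps every byte to a character.
latin1 = data.decode("latin-1")

print(strict_ok)        # False
print(ascii(replaced))  # 'S\ufffdello'
print(ascii(latin1))    # 'S\xb0ello'
```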

Guessing is probably what the website does too, trying the most common encodings until one does not raise an exception:

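def decodeAscii(bin_string):
    binary_int = int(bin_string, 2)
    # Parenthesize so the bit count is rounded up to whole bytes;
    # without the parentheses, 7 // 8 is evaluated first and adds 0.
    byte_number = (binary_int.bit_length() + 7) // 8
    binary_array = binary_int.to_bytes(byte_number, "big")
    ascii_text = "Bin string cannot be decoded"
    # 'ansi' is a Windows-only alias for the system code page (mbcs);
    # on other platforms it raises LookupError and is skipped.
    for enc in ['utf-8', 'ascii', 'ansi']:
        try:
            ascii_text = binary_array.decode(encoding=enc)
            break
        except (UnicodeDecodeError, LookupError):
            pass
    print(ascii_text)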

s = "01010011101100000110010101101100011011000110111101110100011010000110010101110010011001010110100001101111011101110111100101101111011101010110010001101111011010010110111001100111011010010110110101100110011010010110111001100101011000010111001001100101011110010110111101110101011001100110100101101110011001010101000000000000"
decodeAscii(s)

Output:

S°ellotherehowyoudoingimfineareyoufineP

However, there is no guarantee that guessing will find the "correct" encoding.
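To make that caveat concrete, here is a small sketch showing that several single-byte encodings accept the same bytes without raising, yet do not all agree on what 0xb0 means. The list of encodings is an arbitrary choice for illustration, and the data is a short hand-written excerpt of the message.

```python
data = b"S\xb0ello"

# Each of these codecs decodes the bytes without an exception,
# so "no exception" alone cannot tell you which one is right.
results = {enc: data.decode(enc) for enc in ("latin-1", "cp1252", "cp437")}

for enc, text in results.items():
    # ascii() escapes non-ASCII characters so the differences are visible
    # regardless of the console's own encoding.
    print(f"{enc:8} {ascii(text)}")
```

Here latin-1 and cp1252 both map 0xb0 to the degree sign (U+00B0), while cp437 maps it to a shaded block character (U+2591), so the first encoding that happens not to fail is not necessarily the one the data was written in.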

