Is it possible to use larger lists in Python?

Question

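For school I have to make a project about Wi-Fi signals, and I am trying to put the data into a DataFrame. There are 208,000 rows of data. When execution reaches the code below, it never finishes; it looks as if it were stuck in an infinite loop. But when I use only 1,000 rows, my program works. So I think my lists may be too small. Is there a larger kind of list in Python? Or is it because of poor coding on my part? Thanks in advance.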

Edit 1: (data is the original DataFrame; WifiInfo is one of its columns.) I have this format:

df = pd.DataFrame(columns=['Sender','Time','Date','Place','X','Y','Bezetting','SSID','BSSID','Signal'])

I am trying to fill SSID, BSSID and Signal. To do that I have to split the data from the WifiInfo column.

This is what a single WifiInfo value looks like:

ODISEE@88-1d-fc-41-dc-50:-83,ODISEE@88-1d-fc-2c-c0-00:-72,ODISEE@88-1d-fc-41-d2-d0:-82,CiscoC5976@58-6d-8f-19-14-38:-78,CiscoC5959@58-6d-8f-19-13-f4:-93,SNB@c8-d7-19-6f-be-b7:-99,ODISEE@88-1d-fc-2c-c5-70:-94,HackingDemo@58-6d-8f-19-11-48:-156,ODISEE@88-1d-fc-30-d4-40:-85,ODISEE@88-1d-fc-41-ac-50:-100

My current approach is as follows:

for index, row in data.iterrows():
    bezettingList = list()
    ssidList = list()
    bssidList = list()
    signalList = list()

    #WifiInfo splitting  
    wifis = row.WifiInfo.split(',')
    for wifi in wifis:
        #split wifi and add to List
        ssid, bssid = wifi.split('@')
        bssid, signal = bssid.split(':')
        ssidList.append(ssid)
        bssidList.append(bssid)
        signalList.append(int(signal))

    #add bezettingen to List 
    bezettingen = row.Bezetting.split(',')
    for bezetting in bezettingen:
        bezettingList.append(bezetting) 

    #add list to dataframe
    df.loc[index,'SSID'] = ssidList
    df.loc[index,'BSSID'] = bssidList
    df.loc[index,'Signal'] = signalList
    df.loc[index,'Bezetting'] = bezettingList

df.head()

Tags: python, pandas

Answer


IIUC, you first need to split each row on the commas, so that this:

    SSID    BSSID   Signal  WifiInfo
0   NaN     NaN     NaN     ODISEE@88-1d-fc-41-dc-50:-83,ODISEE@88- ...

becomes this:

    SSID    BSSID   Signal  WifiInfo
0   NaN     NaN     NaN     ODISEE@88-1d-fc-41-dc-50:-83
1   NaN     NaN     NaN     ODISEE@88-1d-fc-2c-c0-00:-72
2   NaN     NaN     NaN     ODISEE@88-1d-fc-41-d2-d0:-82
3   NaN     NaN     NaN     CiscoC5976@58-6d-8f-19-14-38:-78
4   NaN     NaN     NaN     CiscoC5959@58-6d-8f-19-13-f4:-93
5   NaN     NaN     NaN     SNB@c8-d7-19-6f-be-b7:-99
6   NaN     NaN     NaN     ODISEE@88-1d-fc-2c-c5-70:-94
7   NaN     NaN     NaN     HackingDemo@58-6d-8f-19-11-48:-156
8   NaN     NaN     NaN     ODISEE@88-1d-fc-30-d4-40:-85
9   NaN     NaN     NaN     ODISEE@88-1d-fc-41-ac-50:-100
# use `.explode`
data = data.assign(WifiInfo=data.WifiInfo.str.split(',')).explode('WifiInfo')
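
To make the split-and-explode step concrete, here is a minimal sketch on a made-up two-row frame. The Place values and the shortened WifiInfo strings are assumptions for illustration only, and the name exploded is hypothetical. DataFrame.explode is available from pandas 0.25 onward.

```python
import pandas as pd

# Toy stand-in for the real data; the values are made up for illustration.
data = pd.DataFrame({
    "Place": ["A", "B"],
    "WifiInfo": [
        "ODISEE@88-1d-fc-41-dc-50:-83,SNB@c8-d7-19-6f-be-b7:-99",
        "HackingDemo@58-6d-8f-19-11-48:-156",
    ],
})

# Turn each comma-separated string into a list, then emit one row per list item.
exploded = data.assign(WifiInfo=data["WifiInfo"].str.split(",")).explode("WifiInfo")

# explode repeats the original index label for every item of a row,
# so reset the index if you want a plain 0..n-1 numbering.
exploded = exploded.reset_index(drop=True)
print(exploded)
```

The other columns (here Place) are repeated for every access point of the same row, so nothing is lost by going to this long format.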

Now you can use .str.extract:

data['SSID'] = data['WifiInfo'].str.extract(r'(.*)@')
data['BSSID'] = data['WifiInfo'].str.extract(r'@(.*):')
data['Signal'] = data['WifiInfo'].str.extract(r':(.*)')
    SSID        BSSID               Signal  WifiInfo
0   ODISEE      88-1d-fc-41-dc-50   -83     ODISEE@88-1d-fc-41-dc-50:-83
1   ODISEE      88-1d-fc-2c-c0-00   -72     ODISEE@88-1d-fc-2c-c0-00:-72
2   ODISEE      88-1d-fc-41-d2-d0   -82     ODISEE@88-1d-fc-41-d2-d0:-82
3   CiscoC5976  58-6d-8f-19-14-38   -78     CiscoC5976@58-6d-8f-19-14-38:-78
4   CiscoC5959  58-6d-8f-19-13-f4   -93     CiscoC5959@58-6d-8f-19-13-f4:-93
5   SNB         c8-d7-19-6f-be-b7   -99     SNB@c8-d7-19-6f-be-b7:-99
6   ODISEE      88-1d-fc-2c-c5-70   -94     ODISEE@88-1d-fc-2c-c5-70:-94
7   HackingDemo 58-6d-8f-19-11-48   -156    HackingDemo@58-6d-8f-19-11-48:-156
8   ODISEE      88-1d-fc-30-d4-40   -85     ODISEE@88-1d-fc-30-d4-40:-85
9   ODISEE      88-1d-fc-41-ac-50   -100    ODISEE@88-1d-fc-41-ac-50:-100
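
Note that .str.extract returns strings, so Signal is still text at this point, whereas the loop in the question converted it with int(). As a possible variation, the three columns can be pulled out with one pattern using named groups and then cast. The pattern below is my own assumption about the format (SSID, then @, then a BSSID without colons, then : and an integer), so adjust it if your data differs.

```python
import pandas as pd

# A few already-exploded entries, copied from the sample in the question.
wifi = pd.Series([
    "ODISEE@88-1d-fc-41-dc-50:-83",
    "SNB@c8-d7-19-6f-be-b7:-99",
    "HackingDemo@58-6d-8f-19-11-48:-156",
], name="WifiInfo")

# Assumed layout: <SSID>@<BSSID>:<signal>. Named groups become column names.
pattern = r"^(?P<SSID>.*)@(?P<BSSID>[^:@]+):(?P<Signal>-?\d+)$"
parts = wifi.str.extract(pattern)

# extract yields strings; cast Signal so it can be compared and averaged.
# (This cast fails if some entry did not match and left a NaN behind.)
parts["Signal"] = parts["Signal"].astype(int)
print(parts)
```

Entries that do not match the pattern come back as NaN in all three columns, which makes malformed rows easy to spot before casting.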

If you want to keep the data grouped after exploding the column, I would first assign an ID to each group of entries:

data['Group'] = pd.factorize(data['WifiInfo'])[0]+1
    SSID    BSSID   Signal  WifiInfo                                     Group 
0   NaN     NaN     NaN     ODISEE@88-1d-fc-41-dc-50:-83,ODISEE@88- ...  1
1   NaN     NaN     NaN     ASD@22-1d-fc-41-dc-50:-83,QWERTY@88-    ...  2
# after you explode the column
SSID        BSSID           Signal  WifiInfo                        Group 
ODISEE      88-1d-fc-41-dc-50   -83 ODISEE@88-1d-fc-41-dc-50:-83    1
ODISEE      88-1d-fc-2c-c0-00   -72 ODISEE@88-1d-fc-2c-c0-00:-72    1
...
...
ASD         22-1d-fc-41-dc-50   -83 ASD@22-1d-fc-41-dc-50:-83       2
QWERTY      88-1d-fc-2c-c0-00   -72 QWERTY@88-1d-fc-2c-c0-00:-72    2
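
Putting the pieces together, here is a rough end-to-end sketch on made-up data, including one way to collect the values back into per-row lists like the ones the question was building. The names long_df and per_scan are hypothetical. Keep in mind that pd.factorize gives identical WifiInfo strings the same ID; if every original row must be its own group, the original row index could be used instead.

```python
import pandas as pd

# Two made-up scans; the first one sees two access points.
data = pd.DataFrame({
    "WifiInfo": [
        "ODISEE@88-1d-fc-41-dc-50:-83,ODISEE@88-1d-fc-2c-c0-00:-72",
        "SNB@c8-d7-19-6f-be-b7:-99",
    ],
})

# Assign the group ID before exploding, while each row is still one scan.
data["Group"] = pd.factorize(data["WifiInfo"])[0] + 1

# One row per access point, with a fresh unique index.
long_df = (
    data.assign(WifiInfo=data["WifiInfo"].str.split(","))
        .explode("WifiInfo")
        .reset_index(drop=True)
)

# Split every entry into its three fields and attach them as columns.
parts = long_df["WifiInfo"].str.extract(r"(?P<SSID>.*)@(?P<BSSID>.*):(?P<Signal>.*)")
parts["Signal"] = parts["Signal"].astype(int)
long_df = long_df.join(parts)

# Optional: collapse back to one row per scan, with a list in each cell.
per_scan = long_df.groupby("Group")[["SSID", "BSSID", "Signal"]].agg(list)
print(per_scan)
```

The long format is usually easier to filter and aggregate, so the last step is only needed if list-valued cells are really what you want.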
