ValueError: All arrays must be of the same length when appending data to a DataFrame

Problem description

import requests
from bs4 import BeautifulSoup
import pandas as pd
headers ={
    'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36'
}
productlink=[]
n=[]
a=[]
re=[]
ra=[]
w=[]

r =requests.get('https://www.houzz.com/professionals/general-contractor')
soup=BeautifulSoup(r.content, 'html.parser')
tra = soup.find_all('div',class_='hz-pro-search-result__info')
for pro in tra:
    name=pro.find('span',class_='mlm header-5 text-unbold').text
    n.append(name)
    address=pro.find('span',class_='hz-pro-search-result__location-info__text').text
    a.append(address)
    reviews=pro.find('span',class_='hz-star-rate__review-string').text
    re.append(reviews)
    rating=pro.find('span',class_='hz-star-rate__rating-number').text
    ra.append(rating)
for links in tra:
    for link in links.find_all('a',href=True)[2:]:
            if link['href'].startswith('https://www.houzz.com/professionals/general-contractors'):
                productlink.append(link['href'])

for link in productlink:
    r =requests.get(link,headers=headers)
    soup=BeautifulSoup(r.content, 'html.parser')
    for web in soup.find_all('a',attrs={'class':'sc-62xgu6-0 jxCcwv mwxddt-0 bSdLOV hui-link trackMe'}):
        w.append(web['href'])
df = pd.DataFrame({'name':n,'address':a,'reviews':re,'rating':ra,'web':w})
print(df)

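The scraping part of the code runs fine, but when I try to put the data into a DataFrame I get ValueError: All arrays must be of the same length. How can I get this data into a DataFrame, and how do I fix this error? I would be very grateful for any help.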

This is my output:

Capital Remodeling Hanover, Maryland 21076, United States 409 Reviews 4.8
SOD Home Group 367 Santana Heights, Unit #3-3021, San Jose, California 95128, United States 238 Reviews 5.0
Innovative Construction Inc. 3040 Amwiler Rd, Suite B, Peachtree Corners, Georgia 30360, United States 100 Reviews 5.0
Baron Construction & Remodeling Co. Saratoga & Los Angeles, California 95070, United States 69 Reviews 4.8
Luxe Remodel 329 N. Wetherly Dr., Suite 205, Los Angeles, California 90211, United States 79 Reviews 4.9
California Home Builders & Remodeling Inc. STUDIO CITY, California 91604, United States 232 Reviews 5.0
Sneller Custom Homes and Remodeling, LLC 17018 Seven Pines Dr Ste 100, Spring, Texas 77379, United States 77 Reviews 4.9
123 Remodeling Inc. 5070 N. Kimberly Ave Suite C, Chicago, Illinois 60630, United States 83 Reviews 4.7
Professional builders & Remodeling, Inc 15335 Morrison St #325, Sherman Oaks, California 91403, United States 203 Reviews 5.0
Rudloff Custom Builders 896 Breezewood Lane, West Chester, Pennsylvania 19382, United States 111 Reviews 5.0
LAR Construction & Remodeling 6371 canby ave, Tarzana, California 91335, United States 191 Reviews 5.0
Erie Construction Mid West 4271 Monroe St., Toledo, Ohio 43606, United States 231 Reviews 4.8
Regal Construction & Remodeling Inc. 19537 ½ Ventura Blvd., Tarzana, California 91356, United States 96 Reviews 4.8
Mr. & Mrs. Construction & Remodeling 2570 N 1st street, ste 212, San Jose, California 95131, United States 75 Reviews 5.0
Bailey Remodeling and Construction LLC 201 Meridian Ave., Suite 201, Louisville, Kentucky 40207, United States 106 Reviews 5.0

https://www.houzz.com/trk/aHR0cDovL3d3dy5iYWlsZXlyZW1vZGVsLmNvbQ/2f005891e940e2c01021b57733580fa3/ue/NDU3NDcxNQ/a3be682e415d6c23590401e416ee1018

Tags: python, pandas, dataframe, beautifulsoup, python-requests

Solution


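Keep it as simple as possible: instead of storing values from different loops in separate lists, store each record in a single dict. The lists in the question can end up with different lengths (w only gets an entry when a website link is found on a profile page), and that mismatch is what raises the ValueError.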
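To illustrate the point above without touching the network, here is a minimal sketch. The names and example.com URLs are made up for the illustration; they are not scraped data. It shows the failure with separate lists of unequal length, then the same records stored as one dict each.

```python
import pandas as pd

# Made-up stand-ins for the scraped lists: three names, but only two websites,
# because one profile page had no website link.
names = ["Builder A", "Builder B", "Builder C"]
webs = ["https://example.com/a", "https://example.com/c"]

# Comparing lengths shows which column came up short.
lengths = {"name": len(names), "web": len(webs)}
print(lengths)

# Separate lists of unequal length cannot be combined column-wise.
try:
    pd.DataFrame({"name": names, "web": webs})
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# One dict per record keeps the fields of each row together.
records = [
    {"name": "Builder A", "web": "https://example.com/a"},
    {"name": "Builder B", "web": None},  # no website found for this one
    {"name": "Builder C", "web": "https://example.com/c"},
]
df = pd.DataFrame(records)
print(df)
```

With one dict per record, a missing value shows up as an empty cell in its own row instead of shortening a whole column.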

Possible solution

import requests
from bs4 import BeautifulSoup
import pandas as pd

headers = {
    'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36'
}

r = requests.get('https://www.houzz.com/professionals/general-contractor')
soup = BeautifulSoup(r.content, 'html.parser')
tra = soup.find_all('div', class_='hz-pro-search-result__info')

# One dict per professional, so every row always has the same keys
data = []

for pro in tra:
    name = pro.find('span', class_='mlm header-5 text-unbold').text
    address = pro.find('span', class_='hz-pro-search-result__location-info__text').text
    reviews = pro.find('span', class_='hz-star-rate__review-string').text
    rating = pro.find('span', class_='hz-star-rate__rating-number').text

    # Profile link; stored under 'web' as a placeholder until the profile page is visited
    w = pro.find('a')['href']

    data.append({'name':name,'address':address,'reviews':reviews,'rating':rating,'web':w})

# Visit each profile page and overwrite the placeholder with the website link, if one is found
for idx, item in enumerate(data):
    r = requests.get(item['web'], headers=headers)
    soup = BeautifulSoup(r.content, 'html.parser')
    for web in soup.find_all('a', attrs={'class':'sc-62xgu6-0 jxCcwv mwxddt-0 bSdLOV hui-link trackMe'}):
        data[idx]['web'] = web['href']

df = pd.DataFrame(data)
print(df)

Output

    name    address reviews rating  web
0   Capital Remodeling  Hanover, Maryland 21076, United States  409 Reviews 4.8 https://www.houzz.com/trk/aHR0cDovL3d3dy5jYXBp...
1   SOD Home Group  367 Santana Heights, Unit #3-3021, San Jose, C...   238 Reviews 5.0 https://www.houzz.com/trk/aHR0cHM6Ly9zb2RoZy5j...
2   Innovative Construction Inc.    3040 Amwiler Rd, Suite B, Peachtree Corners, G...   100 Reviews 5.0 https://www.houzz.com/trk/aHR0cHM6Ly9pbm5vdmF0...
3   Baron Construction & Remodeling Co. Saratoga & Los Angeles, California 95070, Unit...   69 Reviews  4.8 https://www.houzz.com/trk/aHR0cDovL3d3dy5iYXJv...
4   Luxe Remodel    329 N. Wetherly Dr., Suite 205, Los Angeles, C...   79 Reviews  4.9 https://www.houzz.com/professionals/general-co...
5   California Home Builders & Remodeling Inc.  STUDIO CITY, California 91604, United States    232 Reviews 5.0 https://www.houzz.com/trk/aHR0cDovL3d3dy5teWNh...
6   Sneller Custom Homes and Remodeling, LLC    17018 Seven Pines Dr Ste 100, Spring, Texas 77...   77 Reviews  4.9 https://www.houzz.com/trk/aHR0cDovL3NuZWxsZXJj...
7   123 Remodeling Inc. 5070 N. Kimberly Ave Suite C, Chicago, Illinoi...   83 Reviews  4.7 https://www.houzz.com/trk/aHR0cHM6Ly8xMjNyZW1v...
8   Professional builders & Remodeling, Inc 15335 Morrison St #325, Sherman Oaks, Californ...   203 Reviews 5.0 https://www.houzz.com/trk/aHR0cDovL3d3dy5wcm9m...
9   Rudloff Custom Builders 896 Breezewood Lane, West Chester, Pennsylvani...   111 Reviews 5.0 https://www.houzz.com/trk/aHR0cDovL1J1ZGxvZmZj...
10  LAR Construction & Remodeling   6371 canby ave, Tarzana, California 91335, Uni...   191 Reviews 5.0 https://www.houzz.com/trk/aHR0cDovL3d3dy5sYXJy...
11  Erie Construction Mid West  4271 Monroe St., Toledo, Ohio 43606, United St...   231 Reviews 4.8 https://www.houzz.com/trk/aHR0cDovL3d3dy5lcmll...
12  Regal Construction & Remodeling Inc.    19537 ½ Ventura Blvd., Tarzana, California 913...   96 Reviews  4.8 https://www.houzz.com/trk/aHR0cDovL3JlZ2FscmVu...
13  Mr. & Mrs. Construction & Remodeling    2570 N 1st street, ste 212, San Jose, Californ...   75 Reviews  5.0 https://www.houzz.com/trk/aHR0cDovL3d3dy5NcmFu...
14  Bailey Remodeling and Construction LLC  201 Meridian Ave., Suite 201, Louisville, Kent...   106 Reviews 5.0 https://www.houzz.com/trk/aHR0cDovL3d3dy5iYWls...
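In the output above, row 4 still shows a houzz.com/professionals/... link in the web column. That is presumably because no website anchor was found on that profile page, so the profile link stored as a placeholder was never overwritten. One way to make such rows easier to spot is to keep the profile link under its own key and let web default to None. The sketch below uses made-up records and a hypothetical found_websites lookup in place of the real HTTP requests.

```python
import pandas as pd

# Hypothetical records as they might look after the first loop: each one
# keeps its profile link and starts with no website.
data = [
    {"name": "Builder A", "profile": "https://example.com/profile/a", "web": None},
    {"name": "Builder B", "profile": "https://example.com/profile/b", "web": None},
]

# Stand-in for the second loop's page lookups: maps a profile link to the
# website found on that page. Builder B's page has none.
found_websites = {"https://example.com/profile/a": "https://example.com/a-site"}

for item in data:
    # dict.get returns None when no website was found, so "web" stays empty
    item["web"] = found_websites.get(item["profile"])

df = pd.DataFrame(data)
print(df)
```

Rows without a website then carry an empty web value rather than a profile link that looks like a result.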
