LinkedIn Scraping Returns an Empty DataFrame with Only Column Headers and Empty Rows

Problem Description

I am trying to scrape the LinkedIn jobs search page at this URL:

https://www.linkedin.com/jobs/search?keywords=&location=Egypt&geoId=106155005&trk=public_jobs_jobs-search-bar_search-submit&position=1&pageNum=0&sortBy=DD

I am using this code to collect the information:

# setting up lists for job information
post_title = []
company_name = []

# for loop for job title, company
for job in job_container:
    job_titles = driver.find_elements_by_css_selector("a.job-card-list__title")
    post_title.append(job_titles)

    Company_Names = driver.find_elements_by_css_selector("a.job-card-container__company-name")
    company_name.append(Company_Names)

Checking the length of the collected data shows 1985 entries:

# to check if we have all information
print(len(company_name))
print(len(post_title))

When I create a DataFrame with this code:

# creating a dataframe
job_data = pd.DataFrame({
'Company Name': company_name,
'Post': post_title,

})

print(job_data.info())
job_data

It returns empty rows like this:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1985 entries, 0 to 1984
Data columns (total 2 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   Company Name  1985 non-null   object
 1   Post          1985 non-null   object
dtypes: object(2)
memory usage: 31.1+ KB
None
Company Name    Post
0   []  []
1   []  []
2   []  []
3   []  []
4   []  []
... ... ...
1980    []  []
1981    []  []
1982    []  []
1983    []  []
1984    []  []
1985 rows × 2 columns
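For context on the `[]` cells: `find_elements_by_css_selector` always returns a list of elements (an empty list when the selector matches nothing), so each `append` call stores an entire list object in the column. That is why `info()` reports every cell as non-null even though nothing was scraped. A minimal reproduction of the symptom:

```python
import pandas as pd

# Simulate what the loop did: each iteration appended the (empty) list
# that find_elements_by_css_selector returned, not the element texts.
post_title = [[] for _ in range(3)]
company_name = [[] for _ in range(3)]

job_data = pd.DataFrame({"Company Name": company_name, "Post": post_title})
print(job_data)
# Every cell holds a list object, so pandas counts it as non-null even
# though it contains no scraped text.
```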

How can I fix this?

Tags: python, selenium, web-scraping, selenium-chromedriver

Solution
