Web scraping SpaceX jobs with RStudio

Problem description

Use this link to answer the following questions: "https://www.spacex.com/careers/?department="

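  1. Create a data frame of all SpaceX jobs. Display the first 10 in a table. (Use knitr::kable() to create the table.)

  2. Create another data frame showing how many jobs there are in each state. (If a job lists multiple locations, use the first one.) Display the data frame in a table.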

Tags: r, web-scraping, screen-scraping

Solution


Try the RSelenium package:

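# Launch a Selenium server and open a Firefox session
library(RSelenium)
driver <- rsDriver(browser = "firefox")

# The client object is used to control the browser
remDr <- driver[["client"]]

remDr$navigate("https://www.spacex.com/careers/?department=")
# Locate the jobs table on the page
table_element <- remDr$findElement(using = "xpath", value = '//*[@id="jobs-list"]/table')
# Get the table contents as text (a list holding one string)
df <- table_element$getElementText()

# Convert the text to a table. The text is split on spaces here,
# so multi-word titles and locations spill across columns (see the output below).
library(vroom)
df <- vroom(df[[1]])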
# A tibble: 930 x 5
   JOB         TITLE       LOCATION EMPLOYMENT   TYPE                                                                              
   <chr>       <chr>       <chr>    <chr>        <chr>                                                                             
 1 Application Software    Engineer Hawthorne,   CA, United States Full-Time                                                       
 2 Application Software    Engineer Brownsville, TX, United States Full-Time                                                       
 3 Application Software    Engineer (Developer   Tools) Hawthorne, CA, United States Full-Time                                     
 4 Application Software    Engineer II           Hawthorne, CA, United States Full-Time          
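For the second task, the state can be read from the location string. The following is a minimal sketch in base R, not the original answer's method. It assumes a data frame (called `jobs` here, a hypothetical name) in which each job's full location sits in a single LOCATION column in the form "City, ST, United States", unlike the space-split output above. The four sample rows are reconstructed from the first rows of that output.

```r
# Small sample reconstructed from the first rows printed above.
# In practice `jobs` would hold every scraped job, with the full
# location string in one LOCATION column (an assumption of this sketch).
jobs <- data.frame(
  TITLE = c("Application Software Engineer",
            "Application Software Engineer",
            "Application Software Engineer (Developer Tools)",
            "Application Software Engineer II"),
  LOCATION = c("Hawthorne, CA, United States",
               "Brownsville, TX, United States",
               "Hawthorne, CA, United States",
               "Hawthorne, CA, United States"),
  stringsAsFactors = FALSE
)

# Split each location on commas and keep the second field (the state).
# When a posting lists several locations, the first location leads the
# string, so its state is still the second comma-separated field.
parts <- strsplit(jobs$LOCATION, ",", fixed = TRUE)
jobs$STATE <- vapply(
  parts,
  function(p) if (length(p) >= 2) trimws(p[2]) else NA_character_,
  character(1)
)

# Count jobs per state and sort with the most frequent state first.
state_counts <- as.data.frame(table(STATE = jobs$STATE),
                              stringsAsFactors = FALSE)
names(state_counts) <- c("STATE", "JOBS")
state_counts <- state_counts[order(-state_counts$JOBS), ]
rownames(state_counts) <- NULL

print(state_counts)
```

In an R Markdown document, `knitr::kable(head(jobs, 10))` and `knitr::kable(state_counts)` would render the two tables the question asks for. Locations outside the United States would need separate handling, since their second field is not a US state.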
