How to download data from Jotform using Python?

Question

I am collecting survey data from Jotform. The data includes audio recordings, and the URL for an audio file in the form looks like this:

'https://www.jotform.com/widget-uploads/voiceRecorder/201374133/981221_121.wav'

If I try to download this using Python, it gives an error, because the file can only be downloaded by a user who is logged in to the Jotform account.

Logging in is easy in a browser, but I am working on Google Cloud and trying to access this file from the terminal.

I checked their official API, but that repo was last updated 6 years ago.

I am trying to access the file using requests. I tried this:

import requests

s = requests.Session()
s.post('https://www.jotform.com/login/', data={'username': 'dummy_username', 'password': 'dummy_password'})

s.get( 'https://www.jotform.com/widget-uploads/voiceRecorder/201374133/981221_121.wav')

But it returns <Response [404]>.

I inspected the username and password fields on the login page. Am I using the correct field names for the username and password?

I also tried mechanize, but it gives the same error:

import mechanize

import http.cookiejar as cookielib

browser = mechanize.Browser()

cookiejar = cookielib.LWPCookieJar() 
browser.set_cookiejar( cookiejar ) 


browser.open('https://www.jotform.com/login/')
browser.select_form(nr = 0)

browser.form['username'] = 'dummy_username'
browser.form['password'] = 'dummy_password'
result = browser.submit()
browser.retrieve('https://www.jotform.com/widget-uploads/voiceRecorder/201374133/981221_121.wav')

How can I download audio files using the requests module?

Tags: python, api, python-requests, mechanize, jotform

Answer


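Actually, you are not using the form data correctly. An input element is identified by its name attribute. In your case, the names are loPassword and loUsername. So what you want to do is: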

import requests

sess = requests.Session()

# Keys must match the name attributes of the form's input elements
payload = {
    'loPassword': 'dummy_password',
    'loUsername': 'dummy_username',
}
op = sess.post('https://www.jotform.com/login/', data=payload)

print(op.status_code)
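
One way to confirm which field names a login form expects is to list the name attributes of its input elements. The following is a minimal sketch using only the standard library's html.parser. The helper names InputNameCollector and list_input_names are made up for this example, and sample_html is hypothetical markup for illustration, not Jotform's actual page.

```python
from html.parser import HTMLParser


class InputNameCollector(HTMLParser):
    """Records the name attribute of every <input> tag it sees."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (attribute, value) pairs
        if tag == "input":
            name = dict(attrs).get("name")
            if name:
                self.names.append(name)


def list_input_names(html):
    """Return the names of all named <input> elements in the given HTML."""
    collector = InputNameCollector()
    collector.feed(html)
    return collector.names


# Hypothetical markup for illustration only; the real page may differ.
sample_html = """
<form action="/login/" method="post">
  <input type="text" name="loUsername">
  <input type="password" name="loPassword">
  <input type="submit" value="Login">
</form>
"""

print(list_input_names(sample_html))
```

For the live page, the text of the response from a GET request to the login URL could be passed in place of sample_html.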

Edit: I also see a CSRF token on the site. You have to scrape the csrf-token from the page first, then include it in your payload.

from bs4 import BeautifulSoup
import requests

# Use one session throughout, so any cookies set by the login page
# are sent along with the login POST
sess = requests.Session()
page = sess.get('https://www.jotform.com/login/')
soup = BeautifulSoup(page.text, 'lxml')
csrf = soup.find('input', {'name': 'csrf-token'})['value']

# Now create the payload with this CSRF token
payload = {
    'csrf-token': csrf,
    'loUsername': 'dummy_username',
    'loPassword': 'dummy_password',
}
op = sess.post('https://www.jotform.com/login/', data=payload)
print(op.status_code)
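
Once the login succeeds, the audio file can be requested through the same session so that the login cookies are sent with it. The following is a minimal sketch, assuming sess is the logged-in requests.Session from the snippet above. The function name download_file and the destination file name are hypothetical, and whether Jotform serves the file to a session authenticated this way has not been verified here.

```python
def download_file(session, url, dest_path, chunk_size=8192):
    """Stream `url` to `dest_path` through an already logged-in session.

    `session` is expected to behave like requests.Session.
    """
    response = session.get(url, stream=True)
    # Raise on 4xx/5xx responses (such as the 404 from the question)
    # instead of saving an error page to disk.
    response.raise_for_status()
    with open(dest_path, "wb") as fh:
        # Write the body in chunks so large recordings are not held in memory
        for chunk in response.iter_content(chunk_size=chunk_size):
            fh.write(chunk)
    return dest_path


# Usage with the logged-in session (needs network access and real credentials):
# download_file(
#     sess,
#     'https://www.jotform.com/widget-uploads/voiceRecorder/201374133/981221_121.wav',
#     '981221_121.wav',
# )
```

If the call raises an HTTPError, the login most likely did not succeed, so the response to the login POST is worth inspecting before retrying.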
