Response
The Web Scraper API response is either a JSON object or an HTML page, depending on the response format you have chosen (html by default).
We only charge for successful responses
Please note that we only charge for requests that have one of the following original status codes¹:
- success (200, 201, 204)
- redirect (301, 302) if the follow redirect returned any content
- not found (404)
Requests with any other original status code are not charged.
Timeouts
The average response time of our API is between 5 and 20 seconds. However, we recommend setting a timeout of 90 seconds.
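As a minimal sketch of the timeout recommendation, the following uses only Python's standard library to call the API with a 90-second client-side timeout. The helper names `build_request_url` and `fetch` are illustrative and not part of the API. The endpoint and the `api_key`, `url`, and `response_format` parameters are taken from the request examples on this page. Percent-encoding the target URL is an assumption about what the API accepts.

```python
import urllib.parse
import urllib.request

API_ENDPOINT = "https://webscraperapi.datashake.com/"
RECOMMENDED_TIMEOUT = 90  # seconds, as recommended in the text


def build_request_url(api_key, target_url, response_format=None):
    """Build the GET URL (hypothetical helper).

    Assumption: the API accepts a percent-encoded `url` parameter.
    """
    params = {"api_key": api_key, "url": target_url}
    if response_format is not None:
        params["response_format"] = response_format
    return API_ENDPOINT + "?" + urllib.parse.urlencode(params)


def fetch(api_key, target_url, response_format=None, timeout=RECOMMENDED_TIMEOUT):
    """Perform the request and return (status, headers, body bytes).

    urlopen raises urllib.error.HTTPError for HTTP error statuses;
    timeouts and network failures also surface as exceptions.
    """
    request_url = build_request_url(api_key, target_url, response_format)
    with urllib.request.urlopen(request_url, timeout=timeout) as resp:
        return resp.status, resp.headers, resp.read()
```

The timeout is kept as a parameter so that callers can raise or lower it for individual requests.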
Processing time
Please note that we only charge for requests that were completed within 90 seconds. If the processing of your request takes longer than that, you will not be charged.
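The charging rules described above (original status code plus the 90-second limit) can be sketched as a small predicate. This is one reading of the rules for client-side bookkeeping, not the actual billing implementation. The function name `is_charged` and its parameters are hypothetical, and treating exactly 90 seconds as "within 90 seconds" is an assumption.

```python
CHARGED_SUCCESS = {200, 201, 204}
CHARGED_REDIRECT = {301, 302}
CHARGED_NOT_FOUND = {404}
MAX_BILLABLE_SECONDS = 90


def is_charged(original_status, elapsed_seconds, redirect_returned_content=False):
    """Estimate whether a request is charged under the documented rules.

    original_status: the status code the API received from the target site.
    elapsed_seconds: how long processing took.
    redirect_returned_content: for 301/302, whether the followed redirect
    returned any content.
    """
    # Requests that take longer than 90 seconds are never charged.
    if elapsed_seconds > MAX_BILLABLE_SECONDS:
        return False
    if original_status in CHARGED_SUCCESS or original_status in CHARGED_NOT_FOUND:
        return True
    # Redirects are charged only if following them returned content.
    if original_status in CHARGED_REDIRECT:
        return redirect_returned_content
    return False
```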
Headers
Our API response always contains these custom headers:
NAME | DESCRIPTION |
---|---|
wsa-url | Requested website for scraping |
wsa-original-status | Original status code from the scraped website |
wsa-task-id | ID of the scraping task |
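A short sketch of how a client might collect the three custom headers listed above from any header mapping. The helper name `extract_wsa_headers` is hypothetical. Converting `wsa-original-status` to an integer is a convenience added here, not something the API prescribes.

```python
WSA_HEADERS = ("wsa-url", "wsa-original-status", "wsa-task-id")


def extract_wsa_headers(headers):
    """Return the custom wsa-* headers from a name -> value mapping.

    Header names are matched case-insensitively; missing headers map to None.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    info = {name: lowered.get(name) for name in WSA_HEADERS}
    if info["wsa-original-status"] is not None:
        # Assumption: the header value is a plain decimal status code.
        info["wsa-original-status"] = int(info["wsa-original-status"])
    return info
```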
HTML response
If you selected the html response format, the response body will be the HTML content of the requested website.
GET 'https://webscraperapi.datashake.com/?api_key=1234567890&url=https://www.nytimes.com/subscription/education'
<!DOCTYPE html>
<html lang="en">
<head>
<title>The New York Times Digital Subscription at Academic Rate</title>
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="sourceApp" content="MOPS" />
<meta name="description" content="Discover the value of independent Times journalism today." />
...
wsa-task-id: 1616074757768664452-c8ea8bf2
wsa-original-status: 200
wsa-url: https://www.nytimes.com/subscription/education
Optionally, if you used get_cookies=True, you'll receive:
wsa-task-id: 1616074757768664452-c8ea8bf2
wsa-original-status: 200
wsa-url: https://www.nytimes.com/subscription/education
wsa-set-cookie: nyt-a=HPrBrf2KOk79Qu1FK-5Oj4; expires=Fri, 17 Jun 2022 11:36:40 GMT; path=/
wsa-set-cookie: nyt-gdpr=0; expires=Thu, 17 Jun 2021 17:36:40 GMT; path=/
wsa-set-cookie: nyt-purr=cfhhcfhhhuk; expires=Fri, 17 Jun 2022 11:36:40 GMT; path=/
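Because `wsa-set-cookie` can appear several times in one response, a client needs to read every occurrence rather than a single value. The following is a minimal sketch that parses one header value of the shape shown above into a dictionary. The function name `parse_wsa_set_cookie` is hypothetical, and the parsing is deliberately simple (split on `;`, then on the first `=`) rather than a full cookie parser.

```python
def parse_wsa_set_cookie(header_value):
    """Parse one wsa-set-cookie value into a dict.

    The first segment is the cookie's name=value pair; later segments are
    attributes such as expires or path. Attributes without '=' are stored
    with an empty string as their value.
    """
    parts = [p.strip() for p in header_value.split(";") if p.strip()]
    name, _, value = parts[0].partition("=")
    cookie = {"name": name, "value": value}
    for attr in parts[1:]:
        key, _, attr_value = attr.partition("=")
        cookie[key.strip().lower()] = attr_value.strip()
    return cookie
```

With Python's standard HTTP client, response headers are an `email.message.Message` subclass, so `headers.get_all("wsa-set-cookie")` returns every value as a list that can be passed through this function one at a time.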
JSON response
If you selected the json response format, you will receive a JSON object containing these parameters:
KEY | DATATYPE | DESCRIPTION |
---|---|---|
success | boolean | Success of the API call |
details | string or list | Additional API call details. Set to null by default |
response | JSON | Scraping response object. In case of failure, it's null |
Scraping response object
If the scraping was successful, you will receive a JSON object with information related to the original website response.
KEY | DATATYPE | DESCRIPTION |
---|---|---|
url | string | Scraped website URL |
headers | JSON | Original website response headers |
cookies | list[JSON] | Original website response cookies |
json_body | JSON | Original JSON response body, if the body can be parsed as JSON |
body | string | Original response HTML body |
status_code | integer | Original website response status code |
GET 'https://webscraperapi.datashake.com/?api_key=1234567890&url=https://www.nytimes.com/subscription/education&response_format=json'
{
"success": true,
"details": null,
"response": {
"url": "https://www.nytimes.com/subscription/education",
"headers": {
"content-length": "17201",
"content-type": "text/html;charset=UTF-8",
...
},
"cookies": null,
"json_body": null,
"body": "<!DOCTYPE html>\n<html lang=\"en\">\n<head>\n\n\n<title>The New York Times Digital Subscription at Academic Rate...",
"status_code": 200
}
}
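The JSON response above can be unpacked along these lines. This is a sketch under the field descriptions in the tables: it treats `success` being false or `response` being null as a failure and surfaces `details`. The names `ScrapeFailed`, `parse_json_response`, and `response_payload` are hypothetical, and preferring `json_body` over `body` when both are available is a design choice made here, not documented API behavior.

```python
import json


class ScrapeFailed(Exception):
    """Raised when the API reports an unsuccessful call."""


def parse_json_response(text):
    """Decode the API's JSON response and return the scraping response object."""
    envelope = json.loads(text)
    if not envelope.get("success") or envelope.get("response") is None:
        # `details` may be a string, a list, or null.
        raise ScrapeFailed(envelope.get("details"))
    return envelope["response"]


def response_payload(response):
    """Return json_body when it is present, otherwise the raw body string."""
    if response.get("json_body") is not None:
        return response["json_body"]
    return response.get("body")
```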
Optionally, if you used get_cookies=True, you'll receive:
{
"success": true,
"details": null,
"response": {
"url": "https://www.nytimes.com/subscription/education",
"headers": {
"content-length": "17201",
"content-type": "text/html;charset=UTF-8",
...
},
"cookies": [
{
"name": "nyt-a",
"value": "Hd0coYSkfqyGhOA_mHdyQd",
"expires": "Fri, 17 Jun 2022 11:28:49 GMT",
"domain": "nytimes.com",
"path": "/"
},
{
"name": "nyt-gdpr",
"value": "0",
"expires": "Thu, 17 Jun 2021 17:28:49 GMT",
"domain": "nytimes.com",
"path": "/"
},
{
"name": "nyt-purr",
"value": "cfhhcfhhhck",
"expires": "Fri, 17 Jun 2022 11:28:49 GMT",
"domain": "nytimes.com",
"path": "/"
}
],
"json_body": null,
"body": "<!DOCTYPE html>\n<html lang=\"en\">\n<head>\n\n\n<title>The New York Times Digital Subscription at Academic Rate...",
"status_code": 200
}
}
wsa-task-id: 1616074757768664452-c8ea8bf2
wsa-original-status: 200
wsa-url: https://www.nytimes.com/subscription/education
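The `cookies` list in the JSON response above is either null or a list of objects with `name` and `value` keys. A small sketch for turning it into a name-to-value mapping, or into a `Cookie` request header string for a follow-up request, might look like this. Both helper names are hypothetical, and reusing the cookies in a later request is an illustrative use case rather than a documented workflow.

```python
def cookies_by_name(response):
    """Map cookie names to values; a null or missing cookies field gives {}."""
    cookies = response.get("cookies") or []
    return {cookie["name"]: cookie["value"] for cookie in cookies}


def cookie_header(response):
    """Join the cookies as 'name=value; name=value' in their original order."""
    return "; ".join(
        f"{name}={value}" for name, value in cookies_by_name(response).items()
    )
```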
Failed response
For errors, please check our Errors page.
¹ The original status code is the status code that we (Web Scraper API) received from the requested URL.