Scraping Dynamic JavaScript Websites

First try: inspect network calls

Many “dynamic” sites load data from JSON endpoints.

Use the browser devtools Network tab to find the request that returns the data, then try to:

  • call the JSON endpoint directly with requests
  • avoid full browser automation
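
As a rough illustration of calling such an endpoint directly, here is a minimal sketch. It assumes the session argument is a requests.Session (or any object with a compatible get method); the function name fetch_json, the 10-second timeout, and the Accept header are illustrative choices, not something the page being scraped requires.

```python
from typing import Any, Mapping, Optional


def fetch_json(session: Any, url: str,
               params: Optional[Mapping[str, Any]] = None,
               timeout: float = 10.0) -> Any:
    """Call a JSON endpoint found in the Network tab and return the decoded body.

    `session` is assumed to be a requests.Session; anything exposing a
    compatible .get() also works, which keeps the function easy to test.
    """
    response = session.get(
        url,
        params=params,
        headers={"Accept": "application/json"},
        timeout=timeout,
    )
    # Fail loudly on 4xx/5xx instead of trying to parse an error page.
    response.raise_for_status()
    return response.json()
```

With requests installed, this could be called as fetch_json(requests.Session(), url, params={"page": 1}), where url is the endpoint copied from the Network tab and the page parameter is a hypothetical example. Some endpoints also expect the cookies or headers the browser sent, so the copied request may need to be replicated more closely.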

Selenium approach

  • open page
  • wait for element
  • extract HTML
selenium_get_html.py
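from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
try:
    # Open the page.
    driver.get("https://example.com")
    # Wait up to 10 s for the content to render ("body" is a placeholder selector).
    WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, "body"))
    )
    # Extract the rendered HTML.
    html = driver.page_source
    print(html[:500])
finally:
    # Always close the browser, even if a step above fails.
    driver.quit()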

Risks

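  • more fragile: selectors and wait conditions break when the page layout changes
  • slower: a full browser has to start and render every page
  • easier to get blocked: automated browsers are often detected by anti-bot checks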
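
Given these costs, one way to combine the two approaches is to try the JSON endpoint first and start a browser only when that fails. The sketch below shows the idea; fetch_api and fetch_rendered are hypothetical callables supplied by the caller (for example, a requests call and a Selenium helper), and treating an empty payload as a failure is an assumption that may not suit every site.

```python
from typing import Callable, Optional, Tuple


def fetch_with_fallback(
    fetch_api: Callable[[], Optional[object]],
    fetch_rendered: Callable[[], str],
) -> Tuple[str, object]:
    """Try the cheap JSON call first; use the browser only if it fails."""
    try:
        data = fetch_api()
        if data:  # treat None or an empty payload as "nothing usable"
            return "api", data
    except Exception as exc:  # broad on purpose: any API failure triggers the fallback
        print(f"API path failed ({exc!r}); falling back to the browser")
    return "browser", fetch_rendered()
```

The returned label ("api" or "browser") makes it easy to log how often the slower path is actually needed.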
