Description
Hi Michael and team - first of all thanks so much for an amazing package... I have so many good things to say about it (and your examples... and your videos...) but will save that for another place/time!
I just wanted to check if I'm missing something important with regard to context managers, since most/all of the examples in your documentation start with `with SB(uc=True, test=True, locale_code="en") as sb:` etc.?
For debugging and REPL work, and generally doing more with the `sb` object without one massive `with` block, I prefer to manually create and close the object, which means the browser stays open for as long as I need to tinker. Indeed, many original Selenium scripts include a final `driver.close()`. Here's the wrapper I've come up with, but since I don't know what I don't know, I just wanted to ask if there are any problems you can see with this approach, please? And if not, maybe include something like this as an optional approach in the docs or the package itself?
Also, very trivial points, but could the arguments of `SB.__exit__` be made optional so that `SB.__exit__()` works without specifying `None` three times? Also `SB.activate_cdp_mode()` defaulting to `None`? And `pls=None` instead of `pls="none"`?
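To illustrate the `__exit__` point, here is a minimal sketch using a hypothetical stand-in class (`DemoSession` is not part of SeleniumBase, and I'm not assuming anything about how `SB` is implemented internally). It just shows that giving the three `__exit__` parameters defaults of `None` keeps `with` blocks working while also allowing a bare manual call:

```python
class DemoSession:
    """Hypothetical stand-in for a browser session; NOT the real SB class."""

    def __init__(self):
        self.is_open = False

    def __enter__(self):
        self.is_open = True
        return self

    def __exit__(self, exc_type=None, exc_value=None, traceback=None):
        # Defaults of None allow a bare session.__exit__() outside a with block.
        self.is_open = False
        return False  # never suppress exceptions raised inside a with block


# Manual (REPL-style) use: no need to pass None three times.
session = DemoSession().__enter__()
session.__exit__()

# The normal with-statement use is unaffected by the defaults.
with DemoSession() as inside:
    was_open_inside = inside.is_open
```
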
Many thanks in advance for your time considering this!
import atexit
import base64
import platform

import httpx
import requests
from seleniumbase import SB

# Default keyword arguments; 'cdp' is consumed by get_sb() and not passed to SB()
sb_defaults = {
    'uc': True,
    'headless': False,
    'test': False,
    'locale_code': "en",
    'cdp': False,
    'pls': "none",
    'ad_block': True,
    'xvfb': platform.platform().startswith("Linux")
}


def get_sb(**kwargs):
    """
    Return a SeleniumBase object with Undetected Chromedriver (UC) and/or Chrome Devtools Protocol
    (CDP) enabled, and helper methods added.

    >>> sb = get_sb()
    >>> sb.get("https://www.rightmove.co.uk/properties/157644773")
    >>> sb.click('#onetrust-accept-btn-handler')
    >>> sb.fetch()
    >>> sb.highlight('h1')
    >>> sb.post_message("SeleniumBase wasn't detected", duration=4)
    >>> sb.close()
    """
    kwargs = sb_defaults | kwargs  # dict union needs Python 3.9+
    cdp = kwargs.pop('cdp')

    # Enter the context manager by hand so the browser stays open after returning
    sb_base = SB(**kwargs)
    sb = sb_base.__enter__()
    if cdp:
        sb.activate_cdp_mode(None)

    def _close():
        sb_base.__exit__(None, None, None)

    def _fetch(url=None):
        """Fetch html page source and save as .html attribute"""
        if url:
            sb.get(url)
        sb.html = sb.get_page_source()

    def _get_response(url=None, use_httpx=False):
        """
        Generate requests.Response or httpx.Response object eg for further use by parsel/lxml etc
        """
        sb.fetch(url)  # copy page source to .html
        status_code = sb.get_link_status_code(sb.get_current_url())
        if use_httpx:
            response = httpx.Response(status_code)
        else:
            response = requests.Response()
            response.status_code = status_code
            response._content_consumed = True
            response.url = sb.get_current_url()
        if isinstance(sb.html, bytes):
            response._content = base64.b64decode(sb.html)
        if isinstance(sb.html, str):
            response._content = sb.html.encode("utf-8")
        return response

    # Attach the helpers to the sb object without their leading underscore
    for method in [_close, _fetch, _get_response]:
        setattr(sb, method.__name__.removeprefix("_"), method)

    # Add aliases:
    sb.first = sb.find_element
    sb.all = sb.find_elements
    sb.quit = sb.close

    # Make sure the browser is closed when the interpreter exits
    atexit.register(sb.quit)
    return sb
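
For comparison, a possibly tidier way to keep a context manager open across REPL commands is `contextlib.ExitStack` from the standard library, which avoids calling `__enter__` and `__exit__` directly. This is only a sketch under stated assumptions: `demo_browser` is a hypothetical stand-in so the snippet runs without a browser, and I haven't verified how this interacts with `SB` itself:

```python
from contextlib import ExitStack, contextmanager

events = []  # records the lifecycle so the behaviour is visible


@contextmanager
def demo_browser():
    """Hypothetical stand-in for SB(...); NOT the real SeleniumBase manager."""
    events.append("open")
    try:
        yield "browser"
    finally:
        events.append("closed")


stack = ExitStack()
browser = stack.enter_context(demo_browser())  # runs the setup, stays open

# ... tinker with `browser` for as long as needed ...

stack.close()  # runs the teardown, like leaving the with block
```

If the cleanup should also happen automatically at interpreter exit, `atexit.register(stack.close)` could be added before tinkering; calling `stack.close()` a second time does nothing, because the stack is already empty.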