Python Selenium on Linux reports a page-load timeout:

    Traceback (most recent call last):
      File "/www/wwwroot/hrjob/hr.py", line 34, in <module>
        wait.until(EC.presence_of_element_located((By.ID, 'app')))
      File "/usr/local/lib/python3.10/dist-packages/selenium/webdriver/support/wait.py", line 95, in until
        raise TimeoutException(message, screen, stacktrace)
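The traceback ends inside `wait.until(...)`, which means the condition (an element with id `app` being present) never became true before the wait's timeout ran out. As a rough illustration of that behaviour, here is a simplified pure-Python sketch of a poll-until-timeout loop. `wait_until` and `app_present` are hypothetical names invented for this sketch, not Selenium API, and Selenium's real implementation differs in detail.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Raises TimeoutError if `timeout` seconds pass first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


if __name__ == "__main__":
    # A condition that becomes true on the third poll, standing in for
    # "the element with id 'app' is now present in the DOM".
    calls = {"n": 0}

    def app_present():
        calls["n"] += 1
        return calls["n"] >= 3

    print(wait_until(app_present, timeout=2.0, poll=0.1))
```

Seen this way, the exception itself says only that the element did not appear in time. Whether the page failed to load, rendered differently on the Linux machine, or simply needed a longer timeout has to be checked separately.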
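1. Downloading browser drivers

Selenium needs a browser driver in order to interact with the browser you choose. For example, Firefox requires geckodriver; make sure it is on your PATH. The download addresses for the mainstream browser drivers are as follows. For details, see: "The tangled history of chromedriver, geckodriver, MicrosoftWebDriver, IEDriverServer, and operadriver".

PS: the latest chromedriver can be downloaded from https://googlechromelabs.github.io/chrome-for-testing/#stable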
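A quick way to confirm that a driver really is on PATH is to look it up the same way the operating system would. The sketch below uses only the standard library; the `DRIVER_NAMES` tuple and the `find_drivers` helper are illustrative names chosen here, not part of Selenium.

```python
import shutil

# Executable names of some common drivers. On Windows, shutil.which
# also tries the extensions listed in PATHEXT (such as .exe).
DRIVER_NAMES = ("geckodriver", "chromedriver", "msedgedriver")


def find_drivers(names=DRIVER_NAMES):
    """Map each driver name to its full path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_drivers().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If a driver shows up as not found, either add its directory to PATH or pass the driver's location to Selenium explicitly.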
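Introduction to Selenium

Selenium went through three major versions before Selenium 4: Selenium 1.0, Selenium 2.0, and Selenium 3.0. Selenium is not a single tool but a collection of plugins and libraries, each with its own characteristics and use cases. The Selenium 1.0 family is shown in the figure below.

1.1 Selenium 1.0

(1) Selenium IDE. Selenium IDE is a plugin embedded in the Firefox browser. It provides fairly complete automation features, such as script recording/playback and scheduled tasks, and it can export recorded scripts as Selenium test scripts in different programming languages, which helps beginners a great deal when writing test cases. The old Selenium IDE did not support the APIs of newer Firefox versions, so the Selenium team developed a new Selenium IDE that supports Chrome, Firefox, and other browsers. Project: https://github.com/SeleniumHQ/selenium-ide.

(2) Selenium Grid. Selenium Grid is an auxiliary tool for automated testing. With Grid it is easy to run test cases on multiple machines or in heterogeneous environments.

(3) Selenium RC (Remote Control). Selenium RC is the core of the Selenium family and supports automated test scripts written in many different languages. Testing works by using the Selenium RC server as a proxy server through which the application is accessed. Selenium RC consists of two parts, the Client Libraries and the Selenium Server. The Client Libraries are used to write test scripts and are the libraries that control the Selenium Server. The Selenium Server controls the browser's behavior.

The Selenium Server itself has three main parts: Selenium Core, the Launcher, and the Http Proxy. Selenium Core is a collection of JavaScript functions through which a program can operate the browser. The Launcher starts the browser, loads Selenium Core into the browser page, and sets the browser's proxy to the Http Proxy.

1.2 Selenium 2.0

Selenium 2.0 added WebDriver to the Selenium 1.0 family. As a simple formula:

Selenium 2.0 = Selenium 1.0 + WebDriver

Note that Selenium 2.0 promotes WebDriver, which can be regarded as the replacement for Selenium RC. To stay backward compatible, Selenium 2.0 did not drop Selenium RC completely. Selenium RC and WebDriver work in fundamentally different ways.

(1) How Selenium RC works: Selenium RC starts a Server that translates the API calls operating on web elements into snippets of JavaScript, and injects that JavaScript after the Selenium core has launched the browser. The drawbacks of this JavaScript injection technique are that it is slow and that its stability depends heavily on the quality of the JavaScript the Selenium core generates from the API calls.

(2) How WebDriver works: when Selenium 2.x introduced WebDriver, it offered an entirely different way to interact with the browser. It uses the browser's native API, wrapped in a more object-oriented Selenium WebDriver API, to operate directly on the elements of a page and even on the browser itself (screenshots, window size, starting, closing, installing plugins, configuring certificates, and so on). Because it uses the browser's native API, it is much faster and avoids the restrictions imposed by the JavaScript security model. There is a side effect, though: browser vendors differ somewhat in how they operate on and render web elements, so Selenium WebDriver has to provide a separate implementation per browser vendor. Firefox has its own FirefoxDriver, Chrome has its own ChromeDriver, and so on (there are even an AndroidDriver and an iOS WebDriver).

1.3 Selenium 3.0

Selenium 3.0 made the following changes:

(1) Selenium RC was removed. As a simple formula:

Selenium 3.0 = Selenium 2.0 - Selenium RC

(2) Selenium 3.0 supports only Java 8 and above.

(3) The Firefox driver became a separate component in Selenium 3.0. The Selenium 2.0 test library bundled the Firefox driver by default; in Selenium 3.0, Firefox is like Chrome in that you must download and set up the browser driver before use.

(4) macOS ships with the Safari browser driver, located at /usr/bin/safaridriver by default.

(5) Only IE 9.0 and above are supported.

1.4 Download addresses for each browser driver

GeckoDriver (Firefox): https://github.com/mozilla/geckodriver/releases
ChromeDriver (Chrome): https://sites.google.com/a/chromium.org/chromedriver/home
IEDriverServer (IE): http://selenium-release.storage.googleapis.com/index.html
OperaDriver (Opera): https://github.com/operasoftware/operachromiumdriver/releases
MicrosoftWebDriver (Edge): https://developer.microsoft.com/en-us/microsoft-edge/tools/webdriver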
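The eight Selenium locator strategies

2.1 Locating by ID

The id attribute of an HTML tag is meant to be unique within a page, so locating by id should never match more than one element. The example below types the text "python" into the search box on the Baidu home page. The search box has the id "kw", as shown in Figure 1.1.

The code is as follows. The "find_element_by_id" method is obsolete; use find_element(By.ID, 'kw') instead.

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()  # the browser driver must be on PATH
    # Open Baidu
    driver.get('https://www.baidu.com/')
    # Locate the search box by id and type "python"
    driver.find_element(By.ID, 'kw').send_keys('python')
    # Click Search
    driver.find_element(By.ID, 'su').click()
    # Close the browser and end the session
    driver.quit()

2.2 Locating by name

The Baidu search box can also be located by name. As Figure 1.1 shows, its name attribute is "wd", and "find_element(By.NAME, 'wd')" locates by name. The code is as follows (this and the later snippets reuse the imports from section 2.1):

    driver = webdriver.Firefox()
    # Open Baidu
    driver.get('https://www.baidu.com/')
    # Locate the search box by name and type "自动化测试"
    driver.find_element(By.NAME, 'wd').send_keys('自动化测试')
    # Click Search
    driver.find_element(By.ID, 'su').click()
    # Close the browser and end the session
    driver.quit()

Note: when locating by name, make sure the name value is unique. Otherwise find_element returns the first match, which may not be the element you want.

2.3 Locating by class

Taking the Baidu search box again, Figure 1.1 shows that its class attribute is "s_ipt", and "By.CLASS_NAME, 's_ipt'" locates by class name. The code is as follows:

    driver = webdriver.Firefox()
    # Open Baidu
    driver.get('https://www.baidu.com/')
    # Locate the search box by class and type "web测试"
    driver.find_element(By.CLASS_NAME, 's_ipt').send_keys('web测试')
    # Click Search
    driver.find_element(By.ID, 'su').click()
    # Close the browser and end the session
    driver.quit()

2.4 Locating by link_text

The code is as follows:

    driver = webdriver.Firefox()
    # Open Baidu
    driver.get('https://www.baidu.com/')
    # Locate by link_text and click the "新闻" (News) link
    driver.find_element(By.LINK_TEXT, '新闻').click()
    # Close the browser and end the session
    driver.quit()

Note: when locating a hyperlink this way, the link text must be written out in full.

2.5 Locating by partial_link_text

This locates an element by part of its link text, much like a fuzzy query in a database. For the "新闻" link, the single character "新" is enough, that is, any substring of the full link text. The code is as follows:

    driver = webdriver.Firefox()
    # Open Baidu
    driver.get('https://www.baidu.com/')
    # Locate by partial_link_text, using part of the link text
    driver.find_element(By.PARTIAL_LINK_TEXT, '新').click()
    # Close the browser and end the session
    driver.quit()

2.6 Locating by tag_name

tag_name locating finds an element by its tag name. As shown in Figure 1.6, the example locates the "form" tag and prints the value of its "name" attribute. The code is as follows:

    driver = webdriver.Firefox()
    # Open Baidu
    driver.get('https://www.baidu.com/')
    # Locate by tag name and print the form's name attribute
    print(driver.find_element(By.TAG_NAME, 'form').get_attribute('name'))

On success the console prints "f".

2.7 Locating by CSS

The advantages of CSS locating are speed and concise syntax. The content of Table 1.1 comes from W3School's CSS reference. There are more than a dozen kinds of CSS selector; this section covers a few of the most common. Using the Baidu search box again, the code is as follows:

    driver = webdriver.Firefox()
    # Open Baidu
    driver.get('https://www.baidu.com/')
    # Each find_element call below locates the same search box in a different
    # way; in practice you would use only one of them.
    # Class selector: type "python3" into the search box
    driver.find_element(By.CSS_SELECTOR, '.s_ipt').send_keys('python3')
    # Id selector: the syntax is # followed by the id
    driver.find_element(By.CSS_SELECTOR, '#kw').send_keys('python3')
    # CSS locating mainly uses the class and id attributes. You can also use a
    # plain tag name, such as "input", together with an attribute set on that
    # tag, here name='wd'
    driver.find_element(By.CSS_SELECTOR, "input[name='wd']").send_keys('python3')
    # CSS locating can use the element's absolute path in the page layout. The
    # absolute path of the search input on the Baidu home page is
    # html>body>div>div>div>div>div>form>span>input[name="wd"]
    driver.find_element(By.CSS_SELECTOR, 'html>body>div>div>div>div>div>form>span>input[name="wd"]').send_keys('python3')
    # CSS locating can also use a relative path. The relative-path form and
    # locating directly by tag name do the same thing
    driver.find_element(By.CSS_SELECTOR, "input[name='wd']").send_keys('python3')
    # Click Search
    driver.find_element(By.ID, 'su').click()
    # Close the browser and end the session
    driver.quit()

2.8 Locating by XPath

Locating by XPath is very effective for elements that are hard to locate otherwise and handles almost every case, especially elements that have no id, name, or similar attribute. XPath is short for XML Path Language, a language for addressing parts of an XML document. It searches an XML document by element name and attribute, and its main use is finding nodes in XML documents. XPath locating is more flexible than CSS locating: XPath can search both forward and backward, whereas CSS locating can only search forward. XPath locating is, however, somewhat slower than CSS.

The XPath data model includes the root node, elements, attributes, text, processing instructions, namespaces, and so on. The following XML sample document shows the various node types and makes XPath easier to understand.

    <?xml version="1.0" encoding="utf-8" ?>
    <!-- This is a comment node -->
    <animalList type="mammal">
      <animal category="forest">
        <name>Tiger</name>
        <size>big</size>
        <action>run</action>
      </animal>
    </animalList>

Here <animalList> is the root element of the document, <name> is an element node, and type="mammal" is an attribute node.

Relationships between nodes:

• Parent. Every element has a parent. In the XML sample above, the animal element is the parent of the name, size, and action elements.
• Child. The inverse of parent; no further explanation is needed.
• Sibling. Nodes that share the same parent. In the sample above, the name, size, and action elements are all siblings.
• Ancestor. A node's parent, its parent's parent, and so on. In the sample above, the ancestors of the name element are animal and animalList.
• Descendant. A node's children, its children's children, and so on. In the sample above, the descendants of animalList include animal, name, and so on.

Using the Baidu search box again, the code is as follows:

    driver = webdriver.Firefox()
    # Open Baidu
    driver.get('https://www.baidu.com/')
    # Each find_element call below locates the same search box in a different
    # way; in practice you would use only one of them.
    # XPath has several locating strategies. The simplest and most direct is
    # to write out the element's absolute path.
    driver.find_element(By.XPATH, '/html/body/div/div/div/div/div/form/span/input').send_keys('python3')
    # XPath can also locate by attribute value. //input means some input tag on
    # the current page, and [@id='kw'] means its id is kw.
    driver.find_element(By.XPATH, "//input[@id='kw']").send_keys('python3')
    # If an element has no attribute that identifies it uniquely, look for its
    # parent instead. form[@class='fm has-soutu'] locates the parent by class,
    # and the trailing /span/input walks down to the child element.
    driver.find_element(By.XPATH, "//form[@class='fm has-soutu']/span/input").send_keys('python3')
    # If one attribute is not enough to single out an element, combine several
    # attributes with a logical operator
    driver.find_element(By.XPATH, "//input[@id='kw' and @class='s_ipt']").send_keys('python3')
    # Click Search
    driver.find_element(By.ID, 'su').click()
    # Close the browser and end the session
    driver.quit()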
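The node relationships and path expressions described for XPath can be tried out without a browser. The sketch below runs a few queries against a copy of the animalList sample using Python's standard `xml.etree.ElementTree`, which supports only a limited subset of XPath (for instance, it does not handle the `and` operator used in the last browser example), so treat it as an illustration of the ideas rather than of Selenium's XPath engine.

```python
import xml.etree.ElementTree as ET

XML = """
<animalList type="mammal">
  <animal category="forest">
    <name>Tiger</name>
    <size>big</size>
    <action>run</action>
  </animal>
</animalList>
"""

root = ET.fromstring(XML.strip())

# Path walked step by step from the root element: animalList/animal/name
name = root.find("./animal/name")

# Attribute predicate, the same idea as //input[@id='kw'] in a browser
forest = root.find(".//animal[@category='forest']")

# The children of <animal> are siblings of one another
siblings = [child.tag for child in forest]

print(root.tag, root.get("type"))
print(name.text)
print(siblings)
```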
When building and running seleniumhq.github.io locally, I get an error:

    PS D:\Repositories\Geekmister\seleniumhq.github.io\seleniumhq.github.io.git\website_and_docs> hugo server
    WARN deprecated: config: languages.zh-cn.description: custom params on the language top level was deprecated in Hugo v0.112.0 and will be removed in a future release. Put the value below [languages.zh-cn.params]. See https://gohugo.io/content-management/multilingual/#changes-in-hugo-01120
    WARN deprecated: config: languages.ja.description: custom params on the language top level was deprecated in Hugo v0.112.0 and will be removed in a future release. Put the value below [languages.ja.params]. See https://gohugo.io/content-management/multilingual/#changes-in-hugo-01120
    WARN deprecated: config: languages.other.description: custom params on the language top level was deprecated in Hugo v0.112.0 and will be removed in a future release. Put the value below [languages.other.params]. See https://gohugo.io/content-management/multilingual/#changes-in-hugo-01120
    WARN deprecated: config: languages.en.description: custom params on the language top level was deprecated in Hugo v0.112.0 and will be removed in a future release. Put the value below [languages.en.params]. See https://gohugo.io/content-management/multilingual/#changes-in-hugo-01120
    WARN deprecated: config: languages.pt-br.description: custom params on the language top level was deprecated in Hugo v0.112.0 and will be removed in a future release. Put the value below [languages.pt-br.params]. See https://gohugo.io/content-management/multilingual/#changes-in-hugo-01120
    hugo: downloading modules …
    go: github.com/google/docsy@v0.8.0 requires
        github.com/FortAwesome/Font-Awesome@v0.0.0-20230327165841-0698449d50f2: invalid version: git fetch -f origin refs/heads/*:refs/heads/* refs/tags/*:refs/tags/* in C:\Users\Administrator\AppData\Local\hugo_cache\modules\filecache\modules\pkg\mod\cache\vcs\a81254f0cf90f611158215d1cb50586eccc323b54ab825bb7e9c8b2580ac9fb3: exit status 128:
        error: RPC failed; curl 92 HTTP/2 stream 5 was not closed cleanly: CANCEL (err 8)
        error: 6019 bytes of body are still expected
        fetch-pack: unexpected disconnect while reading sideband packet
        fatal: early EOF
        fatal: fetch-pack: invalid index-pack output
    hugo: collected modules in 182960 ms
    Hugo provides its own webserver which builds and serves the site.
    While hugo server is high performance, it is a webserver with limited options.

    'hugo server' will by default write and server files from disk, but you can
    render to memory by using the '--renderToMemory' flag. This can be faster
    in some cases, but it will consume more memory.

    By default hugo will also watch your files for any changes you make and
    automatically rebuild the site. It will then live reload any open browser pages
    and push the latest content to them. As most Hugo sites are built in a fraction
    of a second, you will be able to save and see your changes nearly instantly.

    Usage:
      hugo server [command] [flags]
      hugo server [command]

    Aliases:
      server, serve

    Available Commands:
      trust       Install the local CA in the system trust store.

    Flags:
          --appendPort             append port to baseURL (default true)
      -b, --baseURL string         hostname (and path) to the root, e.g. https://spf13.com/
          --bind string            interface to which the server will bind (default "127.0.0.1")
      -D, --buildDrafts            include content marked as draft
      -E, --buildExpired           include expired content
      -F, --buildFuture            include content with publishdate in the future
          --cacheDir string        filesystem path to cache directory
          --cleanDestinationDir    remove files from destination not found in static directories
      -c, --contentDir string      filesystem path to content directory
          --disableBrowserError    do not show build errors in the browser
          --disableFastRender      enables full re-renders on changes
          --disableKinds strings   disable different kind of pages (home, RSS etc.)
          --disableLiveReload      watch without enabling live browser reload on rebuild
          --enableGitInfo          add Git revision, date, author, and CODEOWNERS info to the pages
          --forceSyncStatic        copy all files when static is changed.
          --gc                     enable to run some cleanup tasks (remove unused cache files) after the build
      -h, --help                   help for server
          --ignoreCache            ignores the cache directory
      -l, --layoutDir string       filesystem path to layout directory
          --liveReloadPort int     port for live reloading (i.e. 443 in HTTPS proxy situations) (default -1)
          --minify                 minify any supported output format (HTML, XML etc.)
          --navigateToChanged      navigate to changed content file on live browser reload
          --noBuildLock            don't create .hugo_build.lock file
          --noChmod                don't sync permission mode of files
          --noHTTPCache            prevent HTTP caching
          --noTimes                don't sync modification time of files
          --panicOnWarning         panic on first WARNING log
          --poll string            set this to a poll interval, e.g --poll 700ms, to use a poll based approach to watch for file system changes
      -p, --port int               port on which the server will listen (default 1313)
          --pprof                  enable the pprof server (port 8080)
          --printI18nWarnings      print missing translations
          --printMemoryUsage       print memory usage to screen at intervals
          --printPathWarnings      print warnings on duplicate target paths etc.
          --printUnusedTemplates   print warnings on unused templates.
          --renderStaticToDisk     serve static files from disk and dynamic files from memory
          --templateMetrics        display metrics about template executions
          --templateMetricsHints   calculate some improvement hints when combined with --templateMetrics
      -t, --theme strings          themes to use (located in /themes/THEMENAME/)
          --tlsAuto                generate and use locally-trusted certificates.
          --tlsCertFile string     path to TLS certificate file
          --tlsKeyFile string      path to TLS key file
          --trace file             write trace to file (not useful in general)
      -w, --watch                  watch filesystem for changes and recreate as needed (default true)

    Global Flags:
          --clock string               set the clock used by Hugo, e.g. --clock 2021-11-06T22:30:00.00+09:00
          --config string              config file (default is hugo.yaml|json|toml)
          --configDir string           config dir (default "config")
          --debug                      debug output
      -d, --destination string         filesystem path to write files to
      -e, --environment string         build environment
          --ignoreVendorPaths string   ignores any _vendor for module paths matching the given Glob pattern
          --logLevel string            log level (debug|info|warn|error)
          --quiet                      build in quiet mode
          --renderToMemory             render to memory (mostly useful when running the server)
      -s, --source string              filesystem path to read files relative from
          --themesDir string           filesystem path to themes directory
      -v, --verbose                    verbose output

    Use "hugo server [command] --help" for more information about a command.

    Error: command error: failed to load modules: failed to download modules: failed to execute 'go [mod download -modcacherw]': failed to execute binary "go" with args [mod download -modcacherw]: go: github.com/google/docsy@v0.8.0 requires
        github.com/FortAwesome/Font-Awesome@v0.0.0-20230327165841-0698449d50f2: invalid version: git fetch -f origin refs/heads/*:refs/heads/* refs/tags/*:refs/tags/* in C:\Users\Administrator\AppData\Local\hugo_cache\modules\filecache\modules\pkg\mod\cache\vcs\a81254f0cf90f611158215d1cb50586eccc323b54ab825bb7e9c8b2580ac9fb3: exit status 128:
        error: RPC failed; curl 92 HTTP/2 stream 5 was not closed cleanly: CANCEL (err 8)
        error: 6019 bytes of body are still expected
        fetch-pack: unexpected disconnect while reading sideband packet
        fatal: early EOF
        fatal: fetch-pack: invalid index-pack output
     *errors.errorString

Solutions I have tried, all of which failed:

1. git config --global http.postBuffer 1024M

I can't think of any other solution for now.
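The log shows the failure happening while Hugo fetches the Font-Awesome module with a direct `git fetch`, and the connection drops partway through. One thing that may be worth trying, on the assumption that the cause is an unreliable connection to the Git host, is to have Hugo download modules through a Go module proxy instead of cloning with git. Hugo's module settings include a `proxy` option that can be set through the `HUGO_MODULE_PROXY` environment variable. The sketch below sets it before launching `hugo server`; the `hugo_env` helper is a hypothetical name, the proxy URL is only an example, and this is not confirmed to fix the error above.

```python
import os
import subprocess


def hugo_env(proxy="https://proxy.golang.org", base=None):
    """Return a copy of the environment with Hugo's module proxy set."""
    env = dict(os.environ if base is None else base)
    env["HUGO_MODULE_PROXY"] = proxy
    return env


if __name__ == "__main__":
    # Run this from the website_and_docs directory of the checkout.
    subprocess.run(["hugo", "server"], env=hugo_env(), check=True)
```

If the proxy is reachable from your network, the module archive is fetched over plain HTTPS rather than through a full git fetch, which avoids the step that fails in the log.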
Sample code

    from selenium import webdriver
    from selenium.common.exceptions import NoSuchElementException
    from selenium.webdriver.chrome.options import Options


    def initialize_driver():
        options = Options()
        driver = webdriver.Chrome(options=options)
        return driver


    def find_element_by_id(driver, element_id):
        try:
            element = driver.find_element_by_id(element_id)
            print(element)
        except NoSuchElementException:
            print(f"Element with ID '{element_id}' not found.")


    def find_elements_by_id(driver, element_id):
        try:
            elements = driver.find_elements_by_id(element_id)
            print(elements)
        except NoSuchElementException:
            print(f"Elements with ID '{element_id}' not found.")


    def find_element_by_class(driver, class_name):
        try:
            element = driver.find_element_by_class_name(class_name)
            print(element)
        except NoSuchElementException:
            print(f"Element with class '{class_name}' not found.")


    def find_element_by_xpath(driver, xpath):
        try:
            element = driver.find_element_by_xpath(xpath)
            print(element)
        except NoSuchElementException:
            print(f"Element with XPath '{xpath}' not found.")


    def find_element_by_link_text(driver, link_text):
        try:
            element = driver.find_element_by_link_text(link_text)
            print(element)
        except NoSuchElementException:
            print(f"Element with link text '{link_text}' not found.")


    def find_element_by_partial_link_text(driver, partial_link_text):
        try:
            element = driver.find_element_by_partial_link_text(partial_link_text)
            print(element)
        except NoSuchElementException:
            print(f"Element with partial link text '{partial_link_text}' not found.")


    def find_elements_by_tag_name(driver, tag_name):
        try:
            elements = driver.find_elements_by_tag_name(tag_name)
            print(elements)
        except NoSuchElementException:
            print(f"Elements with tag name '{tag_name}' not found.")


    def retrieve_tags():
        driver = initialize_driver()
        driver.get('https://www.douban.com')

        # Look up elements with the wrapper functions
        find_element_by_id(driver, 'anony-nav')
        # find_elements_by_id(driver, 'anony-nav')
        # find_element_by_class(driver, 'anony-nav')
        # find_element_by_xpath(driver, '//*[@id="anony-nav"]/h1/a')
        # find_element_by_link_text(driver, '下载豆瓣 App')
        # find_element_by_partial_link_text(driver, '豆瓣')
        # find_elements_by_tag_name(driver, 'div')
        # find_element_by_tag_name(driver, 'h1')
        # find_element_by_link_text(driver, '下载豆瓣 App')

        # Close the WebDriver
        driver.quit()


    if __name__ == "__main__":
        retrieve_tags()

Analysis: the code above keeps raising an error at the point where the script gets the element object through "find_element_by_id". Stepping through with a breakpoint shows that the "driver" object exists but "find_element_by_id" does not. Is this a version problem?

"image.png" (https://wmprod.oss-cn-shanghai.aliyuncs.com/c/user/20240928/35ed6f018b9fd2fe84cf688161507eac.png)

Could someone experienced take a look? I've only just started learning and don't understand much of this yet. Thanks.
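On the version question: the `find_element_by_*` and `find_elements_by_*` helpers were removed in Selenium 4.3.0, so a driver object from a newer release has no `find_element_by_id` attribute. The replacement is `find_element(by, value)` and `find_elements(by, value)`. The sketch below maps the helpers used in the sample code to their replacements. `LEGACY_LOCATORS`, `locate`, and `FakeDriver` are hypothetical names made up for this illustration; the strategy strings are the values of Selenium's `By` constants (for example, `By.ID` is the string `"id"`).

```python
# Legacy helper name -> (Selenium 4 method, locator strategy string).
LEGACY_LOCATORS = {
    "find_element_by_id": ("find_element", "id"),
    "find_elements_by_id": ("find_elements", "id"),
    "find_element_by_class_name": ("find_element", "class name"),
    "find_element_by_xpath": ("find_element", "xpath"),
    "find_element_by_link_text": ("find_element", "link text"),
    "find_element_by_partial_link_text": ("find_element", "partial link text"),
    "find_elements_by_tag_name": ("find_elements", "tag name"),
}


def locate(driver, legacy_name, value):
    """Call the Selenium 4 equivalent of a removed find_element(s)_by_* helper."""
    method, by = LEGACY_LOCATORS[legacy_name]
    return getattr(driver, method)(by, value)


class FakeDriver:
    """Stand-in for a WebDriver so the sketch runs without a browser."""

    def find_element(self, by, value):
        return ("element", by, value)

    def find_elements(self, by, value):
        return [("element", by, value)]


if __name__ == "__main__":
    print(locate(FakeDriver(), "find_element_by_id", "anony-nav"))
```

With a real driver, the first lookup in the sample becomes `driver.find_element(By.ID, 'anony-nav')` after `from selenium.webdriver.common.by import By`. Note also that `find_elements` returns an empty list when nothing matches rather than raising `NoSuchElementException`, so the `except` branches in the plural wrappers would not fire.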
After Selenium enables the remote debugging port, the CPU temperature spikes. Is this a Chrome problem?
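I'm using Selenium to log in to the Temu website. Every time I enter the account and click Next, the password field never appears; the page refreshes by itself and asks for the account again. Can any expert suggest how to approach this?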
A few days ago I accidentally deleted the Chrome browser. After reinstalling it, problems appeared: running my Python code today produced the error "__init__() got an unexpected keyword argument 'executable_path'", which had never happened before. So I ran

    pip show selenium
    pip install selenium==4.9.0

after which a more serious problem appeared:

    Traceback (most recent call last):
      File "D:\aaa\py3\DayUpdateDatafun.py", line 131, in <module>
        DayUpdateDatafun()
      File "D:\aaa\py3\DayUpdateDatafun.py", line 80, in DayUpdateDatafun
        findx.up_ths_gn_kl_all()
      File "D:\aaa\py3\findx.py", line 33077, in up_ths_gn_kl_all
        up_ths_gn_kl(code,name)
      File "D:\aaa\py3\findx.py", line 33060, in up_ths_gn_kl
        df=get_ths_kl_rt_cs(code,name,1)
      File "D:\aaa\py3\findx.py", line 33038, in get_ths_kl_rt_cs
        return get_ths_gn_kl(platecode)
      File "D:\aaa\py3\findx.py", line 32949, in get_ths_gn_kl
        html = get_page_detail(url)
      File "D:\aaa\py3\findx.py", line 32935, in get_page_detail
        'Cookie': 'v={}'.format(get_cookie())
      File "D:\aaa\py3\findx.py", line 32916, in get_cookie
        driver = webdriver.Chrome(executable_path=CHROME_DRIVER_PATH,chrome_options= options)#
      File "d:\Python37\lib\site-packages\selenium\webdriver\chrome\webdriver.py", line 93, in __init__
        keep_alive,
      File "d:\Python37\lib\site-packages\selenium\webdriver\chromium\webdriver.py", line 112, in __init__
        options=options,
      File "d:\Python37\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 286, in __init__
        self.start_session(capabilities, browser_profile)
      File "d:\Python37\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 378, in start_session
        response = self.execute(Command.NEW_SESSION, parameters)
      File "d:\Python37\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 440, in execute
        self.error_handler.check_response(response)
      File "d:\Python37\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 245, in check_response
        raise exception_class(message, screen, stacktrace)
    selenium.common.exceptions.WebDriverException: Message: unknown error: Failed to create Chrome process.
    Stacktrace:
    Backtrace:
        (No symbol) [0x00496643]
        (No symbol) [0x0042BE21]
        (No symbol) [0x0032DA9D]
        (No symbol) [0x0034D95D]
        (No symbol) [0x0034A899]
        (No symbol) [0x00386917]
        (No symbol) [0x0038655C]
        (No symbol) [0x0037FB76]
        (No symbol) [0x003549C1]
        (No symbol) [0x00355E5D]
        GetHandleVerifier [0x0070A142+2497106]
        GetHandleVerifier [0x007385D3+2686691]
        GetHandleVerifier [0x0073BB9C+2700460]
        GetHandleVerifier [0x00543B10+635936]
        (No symbol) [0x00434A1F]
        (No symbol) [0x0043A418]
        (No symbol) [0x0043A505]
        (No symbol) [0x0044508B]
        BaseThreadInitThunk [0x76B6343D+18]
        RtlInitializeExceptionChain [0x779E9802+99]
        RtlInitializeExceptionChain [0x779E97D5+54]

This problem has been bothering me for several days. After reinstalling the browser yesterday it worked for a while and my Python code ran fine, but today the problem described above came back.
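Two separate things appear in this report. The first error comes from the constructor call: newer Selenium 4 releases no longer accept `executable_path` (or `chrome_options`); the driver path is passed through a `Service` object and the options through `options=`. The second error, "Failed to create Chrome process", is raised by chromedriver when it cannot start Chrome. One commonly suggested check after reinstalling the browser, offered here as an assumption rather than a diagnosis, is whether the installed Chrome and the chromedriver binary share the same major version. The sketch below shows both pieces; `major_version`, `versions_match`, and `make_chrome` are hypothetical helper names, and the version strings in the demo are illustrative only.

```python
import re


def major_version(version_text):
    """Extract the major version from text such as 'ChromeDriver 114.0.5735.90'."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_text)
    if match is None:
        raise ValueError(f"no version number in {version_text!r}")
    return int(match.group(1))


def versions_match(chrome_text, driver_text):
    """True when Chrome and chromedriver report the same major version."""
    return major_version(chrome_text) == major_version(driver_text)


def make_chrome(driver_path):
    """Selenium 4 style: the driver path goes into Service, not executable_path."""
    # Imported here so the version helpers above work without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    return webdriver.Chrome(service=Service(driver_path), options=options)


if __name__ == "__main__":
    # Illustrative version strings, not taken from the report above.
    print(versions_match("Google Chrome 114.0.5735.199",
                         "ChromeDriver 114.0.5735.90"))
```

In the reported code, the call would become `make_chrome(CHROME_DRIVER_PATH)` or an equivalent inline `webdriver.Chrome(service=Service(CHROME_DRIVER_PATH), options=options)`.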
While scraping a target site with Python's Selenium, I found that the site uses Cloudflare. None of the usual techniques get past it; every attempt is blocked.