前言
-
平时做爬虫我比较喜欢用
selenium chrome
,一直困扰我一个问题,就是只要谷歌浏览器更新了,就要重新去下载对应版本的chromedriver_win32
,这让我十分烦恼 -
比如我的谷歌浏览器已经94版本了,但是
chromedriver_win32
还停留在92版本,就会报出下面的错误 -
selenium.common.exceptions.SessionNotCreatedException: Message: session not created: This version of ChromeDriver only supports Chrome version 92
Current browser version is 94.0.4606.71 with binary path C:\Program Files (x86)\Google\Chrome\Application\chrome.exe -
下面就解决这个痛点,实现自动下载和本地谷歌浏览器匹配的
chromedriver_win32
selenium基本代码,抛出问题
① 我们新建一个selenium_test.py
,下面代码是打开百度页面的基本selenium代码,
from selenium import webdriver
import time
class MySelenium(object):
def __init__(self):
self.basic_url = "https://www.baidu.com/"
self.executable_path = r"./chromedriver_win32/chromedriver.exe"
@property
def start_driver(self):
self.browser = webdriver.Chrome(executable_path=self.executable_path)
self.browser.maximize_window()
def request_url(self, url):
"""
:param url:
"""
self.browser.get(url)
if __name__ == '__main__':
start_time = time.time()
ms = MySelenium()
ms.start_driver
ms.request_url(ms.basic_url)
#ms.close()
test_time = time.strftime("%H:%M:%S", time.gmtime(time.time() - start_time))
print(test_time)
② 因为这个代码是我在很早之前写的,当时本地的谷歌浏览器版本还是92,现在已经升级到94了,于是就报了下面的错误,就是版本不匹配
selenium.common.exceptions.SessionNotCreatedException: Message: session not created: This version of ChromeDriver only supports Chrome version 92
Current browser version is 94.0.4606.71 with binary path C:\Program Files (x86)\Google\Chrome\Application\chrome.exe
实现自动下载匹配的chromedriver ,解决问题
问题1,怎么获取本地谷歌浏览器的版本?
①,一般获取本地应用的版本信息,用的都是win32com
库的接口,新建文件 download_driver.py
具体代码如下
- 请先知道谷歌浏览器在本地的安装位置,代码中列出来一般情况,特殊安装位置要加进列表
local_chrome_paths
from win32com.client import Dispatch
class auto_download_chromedrive(object):
def __init__(self):
self.local_chrome_paths = [r"C:\Program Files\Google\Chrome\Application\chrome.exe",
r"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe"]
def get_version_via_com(self, filename):
parser = Dispatch("Scripting.FileSystemObject")
try:
version = parser.GetFileVersion(filename)
except Exception:
return None
return version
def start(self):
version = list(filter(None, [self.get_version_via_com(p) for p in self.local_chrome_paths]))[0]
if not version:
print("check chrome browser version failed!")
return None
print("chrome browser version:", version)
if __name__ == "__main__":
chrome = auto_download_chromedrive()
chrome.start()
②,输出结果可以得到本地浏览器的版本是94版本的。
问题2,怎么知道本地的 chromedriver_win32版本不匹配?
①,我们再看下chrome不匹配的报错,具体的报错位SessionNotCreatedException
我们修改下selenium
的脚本,加上
try except捕捉这个错误
- 更改下
selenium_test.py
文件的脚本, 导入SessionNotCreatedException
用try
,捕捉到这个报错return 0
from selenium import webdriver
from selenium.common.exceptions import SessionNotCreatedException #导入SessionNotCreatedException
import time
class MySelenium(object):
def __init__(self):
self.basic_url = "https://www.baidu.com/"
self.executable_path = r"./chromedriver_win32/chromedriver.exe"
@property
def start_driver(self):
try:
self.browser = webdriver.Chrome(executable_path=self.executable_path)
self.browser.maximize_window()
except SessionNotCreatedException:
print("Chrome version unmatch. ")
return 0
return 1
问题3,怎么下载最新的chromedriver.exe?
①,首先肯定不能用selenium
了,这里我选择了requests
和 lxml
库完成下载任务
如下图 chromedriver.exe 下载地址 和页面结构
②,更新download_driver.py
,添加一个函数 get_chromedriver_urls
,用来解析页面
from win32com.client import Dispatch
import requests
from lxml import etree
class auto_download_chromedrive(object):
def __init__(self):
self.chromedrive_url = "https://chromedriver.chromium.org/downloads"
self.local_chrome_paths = [r"C:\Program Files\Google\Chrome\Application\chrome.exe",
r"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe"]
self.headers = {'content-type': 'application/json',
'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:22.0) Gecko/20100101 Firefox/22.0'}
def get_version_via_com(self, filename):
parser = Dispatch("Scripting.FileSystemObject")
try:
version = parser.GetFileVersion(filename)
except Exception:
return None
return version
def get_chromedriver_urls(self):
try:
r = requests.Session()
response = r.get(self.chromedrive_url, headers=self.headers)
print(response.status_code, response.encoding)
html = etree.HTML(response.text, etree.HTMLParser()) # 解析HTML文本内容
version_href = html.xpath(".//strong//..//@href")
return version_href
except Exception:
return None
def start(self):
'''读取本地chrome version'''
version = list(filter(None, [self.get_version_via_com(p) for p in self.local_chrome_paths]))[0]
if not version:
print("check chrome browser version failed!")
return None
print("chrome browser version:", version)
'''下载网页端与本地匹配的chromedriver.exe'''
version_href = self.get_chromedriver_urls()
if not version_href:
print("request %s failed!"%self.chromedrive_url)
return None
print("all chrome browser versions can be choosed:\n{}".format(version_href))
if __name__ == "__main__":
chrome = auto_download_chromedrive()
chrome.start()
③,输出结果如下图,得到所有的chromedriver
的下载地址
休息下1分钟。。。
④,更新download_driver.py
,目的是在所有的版本中找到本地浏览器匹配的版本,并下载下来
- 新加函数
find_local_version
,在所有的版本中找到本地浏览器匹配的版本 - 然后删减和拼接下url,组成新的url,直指
chromedriver_win32.zip
- 新加函数
download_chromadrive
,根据上面的url,下载指定版本
from win32com.client import Dispatch
import re
import requests
from lxml import etree
class auto_download_chromedrive(object):
def __init__(self):
self.chromedrive_url = "https://chromedriver.chromium.org/downloads"
self.local_chrome_paths = [r"C:\Program Files\Google\Chrome\Application\chrome.exe",
r"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe"]
self.headers = {'content-type': 'application/json',
'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:22.0) Gecko/20100101 Firefox/22.0'}
def get_version_via_com(self, filename):
parser = Dispatch("Scripting.FileSystemObject")
try:
version = parser.GetFileVersion(filename)
except Exception:
return None
return version
def get_chromedriver_urls(self):
try:
r = requests.Session()
response = r.get(self.chromedrive_url, headers=self.headers)
print(response.status_code, response.encoding)
html = etree.HTML(response.text, etree.HTMLParser()) # 解析HTML文本内容
version_href = html.xpath(".//strong//..//@href")
print("all chrome browser versions can be choosed:")
for href in version_href:
print(href)
return version_href
except Exception:
return None
def download_chromadrive(self, url):
try:
r = requests.Session()
response = r.get(url, headers=self.headers)
if response.status_code == 200:
with open("chromedriver_win32.zip", "wb") as f:
f.write(response.content)
print("下载完成")
return 1
else:
print('Url请求返回错误,错误码为: %d' % response.status_code)
return None
except Exception:
print("request download chromedriver_win32.zip failed!")
return None
def find_local_version(self, loc_ver, all_ver):
"""
:param loc_ver: 本地浏览器的版本
:param all_ver: 下载的所有版本浏览器版本
:return: 找到匹配的,return url,否则return None
"""
for href in all_ver:
try:
res = re.search(r"path=(.*?)/", href)
find_ver = res.group(1).split(".")[0] #截取大版本
if loc_ver == find_ver:
return href
except Exception:
continue
print("not find match chrome browser{} version!".format(loc_ver))
return None
def start(self):
'''读取本地chrome version'''
version = list(filter(None, [self.get_version_via_com(p) for p in self.local_chrome_paths]))[0]
if not version:
print("check chrome browser version failed!")
return None
print("chrome browser version:", version)
'''下载网页端与本地匹配的chromedriver.exe'''
version_href = self.get_chromedriver_urls()
if not version_href:
print("request %s failed!"%self.chromedrive_url)
return None
'''找到'''
find_url = self.find_local_version(version.split(".")[0], version_href)
print("找到匹配的版本:\n%s"%find_url)
if not find_url:
return None
version_num = re.search(r"path=(.*?)/", find_url).group(1)
find_url_2 = find_url.rsplit('/', 2)[0]
new_url = "{}/{}/chromedriver_win32.zip".format(find_url_2, version_num)
print("downloading......\n%s"%new_url)
ret = self.download_chromadrive(new_url)
if not ret:
return None
if __name__ == "__main__":
chrome = auto_download_chromedrive()
chrome.start()
⑤,输出结果如下图
问题4,下载的怎么解压并换掉旧版`?
①,直接删除是肯定不行的,因为,当你使用过selenium
之后,必须要在进程找那个杀掉chromedriver.exe才可以替换掉旧版本的。
- 定义一个新函数
kill_process
,如果进程中存在chromedriver.exe
就杀掉
def kill_process(self, process_name):
print("检测{}进程是否存在,存在则杀掉。".format(process_name))
pl = psutil.pids()
for pid in pl:
if psutil.Process(pid).name() == process_name:
print('{} 存在进程中,杀掉'.format(process_name))
os.popen('taskkill /f /im %s' %process_name)
return pid
print('{} 不存在进程中。'.format(process_name))
return None
- 下面一段代码是清除chromedriver_win32文件夹内文件的只读属性(因为我的代码从托管平台同步后就只读了)
old_driver_path = os.path.join(os.getcwd(), "chromedriver_win32")
if os.path.exists(old_driver_path):
for sub_file in os.listdir(old_driver_path):
os.chmod(os.path.join(old_driver_path, sub_file), stat.S_IRWXU)
time.sleep(1) #这个delay必须要有,os操作还是需要时间的
- 下面一段代码解压新版本的
chromedriver
,替换掉旧版本
print('''解压 chromedriver_win32.zip,覆盖旧版本''')
zFile = zipfile.ZipFile(os.path.join(os.getcwd(), "chromedriver_win32.zip"), "r")
for fileM in zFile.namelist():
zFile.extract(fileM, old_driver_path)
zFile.close()
-
download_driver.py
完整代码如下:
from win32com.client import Dispatch
import re
import stat,zipfile,os,psutil
import requests
from lxml import etree
import time
class auto_download_chromedrive(object):
def __init__(self):
self.chromedrive_url = "https://chromedriver.chromium.org/downloads"
self.local_chrome_paths = [r"C:\Program Files\Google\Chrome\Application\chrome.exe",
r"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe"]
self.headers = {'content-type': 'application/json',
'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:22.0) Gecko/20100101 Firefox/22.0'}
def get_version_via_com(self, filename):
parser = Dispatch("Scripting.FileSystemObject")
try:
version = parser.GetFileVersion(filename)
except Exception:
return None
return version
def get_chromedriver_urls(self):
try:
r = requests.Session()
response = r.get(self.chromedrive_url, headers=self.headers)
print(response.status_code, response.encoding)
html = etree.HTML(response.text, etree.HTMLParser()) # 解析HTML文本内容
version_href = html.xpath(".//strong//..//@href")
print("all chrome browser versions can be choosed:")
for href in version_href:
print(href)
return version_href
except Exception:
return None
def download_chromadrive(self, url):
try:
r = requests.Session()
response = r.get(url, headers=self.headers)
if response.status_code == 200:
with open("chromedriver_win32.zip", "wb") as f:
f.write(response.content)
print("下载完成")
return 1
else:
print('Url请求返回错误,错误码为: %d' % response.status_code)
return None
except Exception:
print("request download chromedriver_win32.zip failed!")
return None
def find_local_version(self, loc_ver, all_ver):
"""
:param loc_ver: 本地浏览器的版本
:param all_ver: 下载的所有版本浏览器版本
:return: 找到匹配的,return url,否则return None
"""
for href in all_ver:
try:
res = re.search(r"path=(.*?)/", href)
find_ver = res.group(1).split(".")[0] #截取大版本
if loc_ver == find_ver:
return href
except Exception:
continue
print("not find match chrome browser{} version!".format(loc_ver))
return None
def kill_process(self, process_name):
print("检测{}进程是否存在,存在则杀掉。".format(process_name))
pl = psutil.pids()
for pid in pl:
if psutil.Process(pid).name() == process_name:
print('{} 存在进程中,杀掉'.format(process_name))
os.popen('taskkill /f /im %s' %process_name)
return pid
print('{} 不存在进程中。'.format(process_name))
return None
def unzip(self):
self.kill_process("chromedriver.exe")
print("去除旧版本chromedriver_win32文件夹内文件的只读属性(如果是只读)")
old_driver_path = os.path.join(os.getcwd(), "chromedriver_win32")
if os.path.exists(old_driver_path):
for sub_file in os.listdir(old_driver_path):
os.chmod(os.path.join(old_driver_path, sub_file), stat.S_IRWXU)
time.sleep(1) #这个delay必须要有,os操作还是需要时间的
print('''解压 chromedriver_win32.zip,覆盖旧版本''')
zFile = zipfile.ZipFile(os.path.join(os.getcwd(), "chromedriver_win32.zip"), "r")
for fileM in zFile.namelist():
zFile.extract(fileM, old_driver_path)
zFile.close()
def start(self):
'''读取本地chrome version'''
version = list(filter(None, [self.get_version_via_com(p) for p in self.local_chrome_paths]))[0]
if not version:
print("check chrome browser version failed!")
return None
print("chrome browser version:", version)
'''下载网页端与本地匹配的chromedriver.exe'''
version_href = self.get_chromedriver_urls()
if not version_href:
print("request %s failed!"%self.chromedrive_url)
return None
find_url = self.find_local_version(version.split(".")[0], version_href)
print("找到匹配的版本:\n%s"%find_url)
if not find_url:
return None
version_num = re.search(r"path=(.*?)/", find_url).group(1)
find_url_2 = find_url.rsplit('/', 2)[0]
new_url = "{}/{}/chromedriver_win32.zip".format(find_url_2, version_num)
print("downloading......\n%s"%new_url)
ret = self.download_chromadrive(new_url)
if not ret:
return None
self.unzip()
if __name__ == "__main__":
chrome = auto_download_chromedrive()
chrome.start()
真舒服。。。
完善selenium主代码
我们只需要在selenium_test.py
中引用download_driver.py
中的auto_download_chromedrive
类即可
源码我放在git上,如需自取
from selenium import webdriver
from selenium.common.exceptions import SessionNotCreatedException #导入NoSuchElementException
import time
from download_driver import auto_download_chromedrive
class MySelenium(object):
def __init__(self):
self.basic_url = "https://www.baidu.com/"
self.executable_path = r"./chromedriver_win32/chromedriver.exe"
@property
def start_driver(self):
try:
self.browser = webdriver.Chrome(executable_path=self.executable_path)
self.browser.maximize_window()
except SessionNotCreatedException:
print("Chrome version unmatch. ")
return None
return 1
def request_url(self, url):
self.browser.get(url)
if __name__ == '__main__':
start_time = time.time()
ms = MySelenium()
if not ms.start_driver:
chrome = auto_download_chromedrive()
chrome.start()
ms.start_driver
ms.request_url(ms.basic_url)
#ms.close()
test_time = time.strftime("%H:%M:%S", time.gmtime(time.time() - start_time))
print(test_time)
源码我放在git上,如需自取
总结
文章来源:https://www.toymoban.com/news/detail-658444.html
- 本博客谢绝转载
文章来源地址https://www.toymoban.com/news/detail-658444.html
- 要有最朴素的生活,最遥远的梦想,即使明天天寒地冻,路遥马亡!
- 如果这篇博客对你有帮助,请 “点赞” “评论”“收藏”一键三连 哦!码字不易,大家的支持就是我坚持下去的动力。当然执意选择白嫖也常来哈。
到了这里,关于关于selenium, 你还在因为chromedriver的版本与Chrome的版本不一致,需要手动更新chromedriver而烦恼吗?的文章就介绍完了。如果您还想了解更多内容,请在右上角搜索TOY模板网以前的文章或继续浏览下面的相关文章,希望大家以后多多支持TOY模板网!