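Preface
Someone has built ddddocr, a library for recognizing image captchas. As I have said before, the usual advice when you hit a captcha is to give up or negotiate with the developers, because recognition carries both a development cost and an error cost: accuracy never reaches 100%, not even with ddddocr. Only a universal captcha or a disabled check is truly reliable. But what if the job has to run in a production environment? Then those shortcuts are off the table, and you need a script that solves the captcha. If accuracy has to be guaranteed, pay for a recognition service; if it does not matter much, the ddddocr library is enough.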
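Purpose
This problem mostly comes up in web scraping or UI automation testing, but it is often driven by personal interest and has little to do with actual work. Many beginners dive in and grind away without ever considering the economics. The return on investment is clearly negative, yet they enjoy it and treat it as a way to improve themselves. It is not: at best it looks like hard work, with no real thinking behind it.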
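Introduction
[ddddocr](https://github.com/sml2h3/ddddocr) is a third-party library. It requires Python >= 3.8 and is installed with pip install ddddocr.
Usage
The function below is only a thin wrapper, and it works only on captchas that have been saved as image files. If the captcha is delivered as a base64 string, it has to be written to a file first before this function can handle it.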
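# Relies on os, requestd (urllib.request), screen_dir and ocr, all defined in the full script in the code demo.
def get_captcha(img_url, filename=os.path.join(screen_dir, "captcha.png")):
    """Download the image from the element's src attribute and recognize it."""
    requestd.urlretrieve(img_url, filename=filename)
    # Read back the file that was just downloaded.
    with open(filename, "rb") as pf:
        img_bytes = pf.read()
    captcha = ocr.classification(img_bytes)
    return captcha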
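For the base64 case mentioned above, a temporary file is not strictly required, since the wrapper itself already passes raw bytes to ocr.classification. The following is a minimal sketch, not the author's method: data_uri_to_bytes and captcha_bytes are hypothetical helper names, and it assumes the image element is a Selenium WebElement whose src is either a data: URI or an ordinary URL. For an ordinary URL it falls back to the element's screenshot_as_png property, which captures the captcha exactly as the browser rendered it instead of requesting the URL a second time.

```python
import base64


def data_uri_to_bytes(src):
    """Decode a 'data:image/...;base64,<payload>' URI into raw image bytes."""
    header, sep, payload = src.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URI")
    return base64.b64decode(payload)


def captcha_bytes(element):
    """Return captcha image bytes for an <img> element (hypothetical helper).

    `element` is assumed to behave like a Selenium WebElement:
    it needs get_attribute("src") and the screenshot_as_png property.
    """
    src = element.get_attribute("src") or ""
    if src.startswith("data:"):
        # The image is embedded in the page, so decode it directly.
        return data_uri_to_bytes(src)
    # Otherwise take a PNG screenshot of the element as displayed.
    return element.screenshot_as_png
```

With the ocr object and driver from the demo script, usage would look like captcha = ocr.classification(captcha_bytes(img_element)), where img_element is the element found with the div.login-captcha>img selector.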
Code demo
import os
import urllib.request as requestd
from time import sleep
import ddddocr
from selenium import webdriver
from selenium.webdriver import ActionChains
from selenium.webdriver.common.by import By
from Common.conf_dirs import screen_dir
ocr = ddddocr.DdddOcr()
driver = webdriver.Chrome()
driver.implicitly_wait(30)
driver.maximize_window()
driver.get("http://192.168.2.211/login")
img_url = driver.find_element(By.CSS_SELECTOR, "div.login-captcha>img").get_attribute("src")
def get_captcha(img_url, filename=os.path.join(screen_dir, "captcha.png")):
    """Download the image from the element's src attribute and recognize it."""
    requestd.urlretrieve(img_url, filename=filename)
    # Read back the file that was just downloaded.
    with open(filename, "rb") as pf:
        img_bytes = pf.read()
    captcha = ocr.classification(img_bytes)
    return captcha
captcha = get_captcha(img_url)
driver.find_element(By.XPATH, "//input[@placeholder='请输入用户名']").send_keys("***********")
driver.find_element(By.XPATH, "//input[@placeholder='请输入密码']").send_keys("**********")
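# The extra "1" makes this first attempt wrong, apparently on purpose, so that the retry loop below runs.
driver.find_element(By.XPATH, "//input[@placeholder='请输入验证码']").send_keys(captcha + "1")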
driver.find_element(By.XPATH, "//span[text()='登录']").click()
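# Keep retrying while the "验证码不正确" (incorrect captcha) message is on the page.
# find_elements returns an empty list when nothing matches, whereas find_element
# would raise NoSuchElementException. With implicitly_wait(30), the final check
# that finds no error message can take up to 30 seconds.
while True:
    errors = driver.find_elements(By.XPATH, "//p[contains(text(),'验证码不正确')]")
    if not errors:
        print("Exiting loop")
        break
    print("Captcha error appeared")
    img_url = driver.find_element(By.CSS_SELECTOR, "div.login-captcha>img").get_attribute("src")
    captcha = get_captcha(img_url)
    captcha_input = driver.find_element(By.XPATH, "//input[@placeholder='请输入验证码']")
    captcha_input.clear()
    captcha_input.send_keys(captcha)
    driver.find_element(By.XPATH, "//span[text()='登录']").click()
    sleep(2)
print("Logged in")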
action = ActionChains(driver)
action.move_to_element(driver.find_element(By.CSS_SELECTOR, ".el-popover__reference-wrapper")).click().perform()
driver.find_element(By.XPATH, "//li[contains(text(),'订单管理')]").click()
"""
driver.find_element(By.XPATH, "//input[@placeholder='全部订单状态']").click()
eles = driver.find_elements(By.XPATH, "/html/body/div[4]/div[1]/div[1]/ul/li")
for ele in eles:
    # ele.click()
    driver.execute_script("arguments[0].click()", ele)
    option = ele.get_attribute("key")
    driver.find_element(By.XPATH, "//input[@placeholder='{}']".format(option)).click()
    sleep(0.5)
"""
# driver.find_element(By.XPATH, "//input[@placeholder='开始下单时间']").send_keys("2022-07-04 00:00:00")
# driver.find_element(By.XPATH, "//input[@placeholder='结束下单时间']").send_keys("2022-08-25 00:00:00")
# driver.find_element(By.XPATH,"//span[text()='查询']").click()
# driver.find_element(By.XPATH,"//span[text()='导出Excel']").click()
driver.quit()
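Because recognition accuracy is below 100%, a retry loop with no upper bound can spin forever if the page or the captcha changes. Below is a minimal sketch of a bounded retry, under stated assumptions: retry_captcha is a hypothetical helper, and submit and rejected are callables you would supply yourself (for example, wrapping the Selenium calls from the script above). The default of five attempts is arbitrary.

```python
def retry_captcha(submit, rejected, max_attempts=5):
    """Call submit() until rejected() returns False or attempts run out.

    submit:   callable that reads the captcha, fills the form and clicks login.
    rejected: callable returning True while the error message is still shown.
    Returns the number of attempts used; raises RuntimeError if every
    attempt was rejected.
    """
    for attempt in range(1, max_attempts + 1):
        submit()
        if not rejected():
            return attempt
    raise RuntimeError(f"captcha still rejected after {max_attempts} attempts")


# Possible wiring with the demo script (assumes driver, By and a fill-and-click
# function named submit_login exist; submit_login is hypothetical):
#
# rejected = lambda: bool(
#     driver.find_elements(By.XPATH, "//p[contains(text(),'验证码不正确')]")
# )
# retry_captcha(submit_login, rejected, max_attempts=5)
```

Raising an exception once the limit is reached makes a failed login visible in a test report, instead of leaving the script stuck on the login page.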