Web Crawler Development (5) 01: Advanced Crawling: Introduction to Selenium, Installing the selenium-webdriver Package for Your Platform, and Basic Selenium Usage


Chapter 3: Advanced Crawling

Learning objectives:

Use the Selenium library to crawl pages that are rendered on the front end
Anti-anti-crawler techniques

Introduction to Selenium

From the official documentation:

Selenium automates browsers. That’s it! What you do with that power is entirely up to you. Primarily, it is for automating web applications for testing purposes, but is certainly not limited to just that. Boring web-based administration tasks can (and should!) be automated as well.

Selenium has the support of some of the largest browser vendors who have taken (or are taking) steps to make Selenium a native part of their browser. It is also the core technology in countless other browser automation tools, APIs and frameworks.

From Baidu Baike:

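Selenium is a tool for testing web applications. Selenium tests run directly in the browser, just as if a real user were operating it. Supported browsers include IE (7, 8, 9, 10, 11), Mozilla Firefox, Safari, Google Chrome, Opera, and others. Its main functions include compatibility testing, which checks whether your application works well across different browsers and operating systems, and functional testing, which creates regression tests to verify software functionality and user requirements. It supports automatic recording of actions and automatic generation of test scripts in languages such as .NET, Java, and Perl.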

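In short:

Selenium is an automated testing framework for web applications. It lets you create regression tests to verify software functionality and user requirements, and it lets you write code that launches a browser and drives it automatically. For crawling, this means your code can start a real browser, have it open a page, and then read the information you want from the rendered page. Because the requests come from a real browser, many anti-crawler measures that target simple HTTP clients become much less effective.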

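Basic usage of Selenium

Download the webdriver that matches your platform
Install the selenium-webdriver package in your project
Write a small demo based on the official documentation

Choosing a webdriver for your platform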

Browser	WebDriver
Chrome	chromedriver(.exe)
Internet Explorer	IEDriverServer.exe
Edge	MicrosoftWebDriver.msi (legacy Edge); msedgedriver(.exe) for Chromium-based Edge
Firefox	geckodriver(.exe)
Safari	safaridriver
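
The table above can be expressed as a small lookup. The following is only a sketch: the function name driverFileName and the DRIVERS map are hypothetical helpers, not part of selenium-webdriver, and Edge is omitted for brevity. It assumes that "(.exe)" in the table means the suffix is used on Windows only.

```javascript
// Hypothetical lookup built from the table above (not a selenium-webdriver API).
// exeOnWindows: true means the table shows the name as "name(.exe)".
const DRIVERS = new Map([
  ['chrome', { base: 'chromedriver', exeOnWindows: true }],
  ['firefox', { base: 'geckodriver', exeOnWindows: true }],
  ['internet explorer', { base: 'IEDriverServer.exe', exeOnWindows: false }],
  ['safari', { base: 'safaridriver', exeOnWindows: false }],
]);

// Returns the driver file name for a browser on a given platform.
// platform uses Node's process.platform values ('win32', 'darwin', 'linux').
function driverFileName(browser, platform = process.platform) {
  const entry = DRIVERS.get(browser.toLowerCase());
  if (!entry) {
    throw new Error(`No driver listed for browser: ${browser}`);
  }
  return entry.exeOnWindows && platform === 'win32'
    ? `${entry.base}.exe`
    : entry.base;
}

console.log(driverFileName('chrome', 'win32'));  // chromedriver.exe
console.log(driverFileName('firefox', 'linux')); // geckodriver
```

Passing the platform explicitly makes the helper easy to check on any machine; leaving it out falls back to the platform Node is running on.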

Choose the driver version that matches your browser version and your platform.


After downloading, place the driver in the project root directory.
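
Before starting a browser session, it can help to confirm that the driver file is where you expect it. The sketch below is an illustration only: findDriver and defaultSearchDirs are hypothetical helpers, and the search order (current working directory first, then the directories on PATH) is an assumption about where the driver is likely to be looked up, not a statement of how selenium-webdriver resolves it.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: returns the first path at which fileName exists
// inside one of searchDirs, or null if it is in none of them.
function findDriver(fileName, searchDirs) {
  for (const dir of searchDirs) {
    const candidate = path.join(dir, fileName);
    if (fs.existsSync(candidate)) {
      return candidate;
    }
  }
  return null;
}

// Assumed search order: project root (current working directory), then PATH.
function defaultSearchDirs() {
  const pathDirs = (process.env.PATH || '')
    .split(path.delimiter)
    .filter(Boolean);
  return [process.cwd(), ...pathDirs];
}

const driverName =
  process.platform === 'win32' ? 'chromedriver.exe' : 'chromedriver';
const driverPath = findDriver(driverName, defaultSearchDirs());
console.log(
  driverPath
    ? `Found driver at ${driverPath}`
    : `${driverName} not found; download it and place it in the project root`
);
```

Run it from the project root with node so that process.cwd() points at the directory that holds the driver.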

Installing the selenium-webdriver package

npm i selenium-webdriver

Automatically opening Baidu and searching for "黑马程序员"

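const { Builder, By, Key, until } = require('selenium-webdriver');

(async function example() {
  // Start a Chrome session (requires chromedriver to be available)
  const driver = await new Builder().forBrowser('chrome').build();
  try {
    await driver.get('https://www.baidu.com');
    // 'kw' is the id of Baidu's search box: type the keyword and press Enter
    await driver.findElement(By.id('kw')).sendKeys('黑马程序员', Key.ENTER);
    // Resolves to true once the title matches; rejects if it has not matched after 1 second
    console.log(await driver.wait(until.titleIs('黑马程序员_百度搜索'), 1000));
  } finally {
    // Always close the browser, even if a step above fails
    await driver.quit();
  }
})();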
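
The demo above only checks the page title. For crawling, the next step is to read data out of the rendered page. The following is a minimal sketch of that idea, not a definitive implementation: scrapeResultTitles and cleanTitles are hypothetical names, and the assumption that each search result title sits inside an h3 element is a guess about Baidu's markup that may not hold.

```javascript
// Pure helper: trims each title and drops blanks and duplicates,
// keeping the order in which titles first appear.
function cleanTitles(rawTitles) {
  const seen = new Set();
  const result = [];
  for (const raw of rawTitles) {
    const title = raw.trim();
    if (title && !seen.has(title)) {
      seen.add(title);
      result.push(title);
    }
  }
  return result;
}

// Opens Baidu in a real Chrome window, searches for keyword,
// and returns the cleaned text of the result headings.
async function scrapeResultTitles(keyword) {
  // Loaded here so cleanTitles can be used without selenium-webdriver installed.
  const { Builder, By, Key, until } = require('selenium-webdriver');
  const driver = await new Builder().forBrowser('chrome').build();
  try {
    await driver.get('https://www.baidu.com');
    await driver.findElement(By.id('kw')).sendKeys(keyword, Key.ENTER);
    // Assumption: result titles are rendered as <h3> elements.
    // Wait up to 5 seconds for the first one to appear.
    await driver.wait(until.elementLocated(By.css('h3')), 5000);
    const headings = await driver.findElements(By.css('h3'));
    const texts = await Promise.all(headings.map((h) => h.getText()));
    return cleanTitles(texts);
  } finally {
    await driver.quit();
  }
}

// Usage (starts a real browser, so it is left commented out):
// scrapeResultTitles('黑马程序员').then(console.log, console.error);
```

Waiting with until.elementLocated matters for pages rendered on the front end, because the elements may not exist yet at the moment the navigation finishes.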

Example

[Figure: Selenium project directory]

Step 1: Install the package

Step 2: Run npm i to install the dependencies

package.json

{
  "name": "selenium-demo",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "dependencies": {
    "selenium-webdriver": "^4.0.0-alpha.4"
  }
}

Step 3: Create the demo file

helloworld.js

const { Builder, By, Key, until } = require('selenium-webdriver');

(async function example() {
  const driver = await new Builder().forBrowser('chrome').build();
  try {
    // Open Baidu automatically (By.id('kw') below is Baidu's search box)
    await driver.get('https://www.baidu.com');
    // Find the search box, type a keyword, and press Enter; the first argument of sendKeys is the keyword
    await driver.findElement(By.id('kw')).sendKeys('webdriver', Key.RETURN);
    // Check that the search ran: wait up to 5 seconds for the title to contain the keyword
    await driver.wait(until.titleContains('webdriver'), 5000);
  } finally {
    // Close the browser
    await driver.quit();
  }
})();

Step 4: Run the test

node .\helloworld.js

A new browser window now opens automatically and performs the search.
