Selenium Python in Practice: Scraping Real-Time Data for Individual Stocks


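Selenium is one of the most widely used open-source suites for automated web UI (user interface) testing. It supports common browsers such as Chrome, Edge, and Firefox. Besides automated testing of web applications, Selenium is also well suited to scraping data from pages that are rendered dynamically with JavaScript.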

This article shows how to scrape data for an individual stock with the Selenium Python library.

1. Install the Selenium Python library

Install the selenium library with pip:

pip install selenium
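
Several constructor arguments changed between Selenium 3 and Selenium 4, so it can be worth confirming which version pip actually installed. The following is a minimal sketch that uses only the standard library; the helper name installed_version is made up for this illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string, or None when selenium is not installed.
print("selenium:", installed_version("selenium"))
```

If the printed value is None, the pip command above probably ran against a different Python environment than the one executing the script.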

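Download the browser driver

Decide which browser you will use and download the matching driver. Download pages for the common browser drivers:
Browser | Driver download page
Chrome | https://sites.google.com/chromium.org/driver/
Edge | https://developer.microsoft.com/en-us/microsoft-edge/tools/webdriver/
Firefox | https://github.com/mozilla/geckodriver/releases
Safari | https://webkit.org/blog/6900/webdriver-support-in-safari-10/

Place the downloaded driver in the project directory, or add its location to the system PATH environment variable.
The driver alone is not enough: if, for example, Firefox itself is not installed on the machine, the program fails at run time even though the Firefox driver has been downloaded.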
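
To check ahead of time whether a driver can be found in either of those two places, something like the following may help. It is only a sketch: find_driver is a hypothetical helper, and it assumes the driver file is named chromedriver (chromedriver.exe on Windows).

```python
import os
import shutil


def find_driver(name="chromedriver", project_dir="."):
    """Look for a driver executable in project_dir first, then on PATH.

    Returns the path as a string, or None when nothing is found.
    """
    # On Windows the file carries an .exe suffix; elsewhere it has none.
    candidates = [name + ".exe", name] if os.name == "nt" else [name]
    for candidate in candidates:
        local = os.path.join(project_dir, candidate)
        if os.path.isfile(local):
            return os.path.abspath(local)
    # shutil.which searches the directories listed in the PATH variable.
    return shutil.which(name)


print(find_driver() or "chromedriver not found in project directory or on PATH")
```

This only confirms that the driver file exists; it does not check that the browser is installed or that the driver version matches the browser version.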

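2. Basic Selenium programming workflow

Import the selenium library:

from selenium import webdriver

Import the Keys class:

from selenium.webdriver.common.keys import Keys

Import the By class, which names the element-locating strategies:

from selenium.webdriver.common.by import By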

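Create the WebDriver object. In Selenium 4 the driver path is passed through a Service object; the older executable_path argument was deprecated and later removed.

from selenium.webdriver.chrome.service import Service
driver = webdriver.Chrome(service=Service(r'./chromedriver.exe'))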

Open a web page:

driver.get("http://www.python.org")

Running the program opens a browser window automatically and loads http://www.python.org.

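Locate elements:

Suppose the page contains an input element whose id is passwd-id and whose name is passwd.

Use the find_element() method to locate it:
element = driver.find_element(By.ID, "passwd-id")
element = driver.find_element(By.NAME, "passwd")
element = driver.find_element(By.XPATH, "//input[@id='passwd-id']")
element = driver.find_element(By.CSS_SELECTOR, "input#passwd-id")

Note that find_element() returns only the first match. This matters mostly for XPath and CSS selectors, which can easily match several elements.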
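
When every match is needed rather than just the first, find_elements() (plural) returns a list, which is empty when nothing matches, whereas find_element() raises NoSuchElementException. A small sketch of that difference follows; all_texts is a hypothetical helper name, and driver is assumed to be an already created WebDriver.

```python
def all_texts(driver, by, value):
    """Return the visible text of every element matching the locator.

    find_elements() returns a list (empty if nothing matches), unlike
    find_element(), which raises NoSuchElementException in that case.
    """
    return [element.text for element in driver.find_elements(by, value)]


# Usage with a real browser session (not executed here):
#   from selenium.webdriver.common.by import By
#   print(all_texts(driver, By.CSS_SELECTOR, "input"))
```

Because the helper only calls find_elements() on whatever object it is given, it can be exercised with a stub object in tests, without starting a browser.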

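3. Scrape data for an individual stock

This example starts from a site's home page, types in a stock code automatically, follows the search into a new window, locates the price elements, and prints their values.
The complete code is as follows: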

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

import time

options = webdriver.ChromeOptions()
# Set a custom User-Agent header (no extra quotes around the value)
options.add_argument('--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3')
# Selenium 4: pass the driver path via Service and the options via options=
driver = webdriver.Chrome(service=Service(r'./chromedriver.exe'), options=options)

# Tell Selenium to wait implicitly, for up to 3 seconds, when locating elements
driver.implicitly_wait(3)
driver.get("https://www.eastmoney.com/")
print("title:",driver.title)
elem = driver.find_element(By.ID,'code_suggest')
elem.clear()
elem.send_keys("600332")
driver.find_element(By.ID,'search_view_btn1').click()
# Switch to the newly opened window
driver.switch_to.window(driver.window_handles[-1])
# time.sleep(6)
try:
    price_o = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "/html[1]/body[1]/div[1]/div[1]/div[1]/div[8]/div[2]/div[1]/table[1]/tbody[1]/tr[1]/td[2]/span[1]/span[1]")) )
    print("Stock open at: ",price_o.text)
    price_c = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//div[@class='zxj']/span/span[@class='price_up blinkred']")) )
    #price_c = driver.find_element(By.XPATH,"//div[@class='zxj']/span/span[@class='price_up blinkred']")
    print("stock close at: ",price_c.text)
except Exception as e:
    # Report what went wrong instead of silently swallowing the error
    print("error happened:", e)
finally:
    driver.quit()

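Notes:

1) The options.add_argument(...) call sets the User-Agent request header so that the site is less likely to reject the request.
2) elem.send_keys("600332") types the stock code 600332 into the input box.
3) driver.find_element(By.ID,'search_view_btn1').click() locates the element whose id is search_view_btn1 and clicks it.
4) driver.switch_to.window(driver.window_handles[-1]) switches to the most recently created browser window.
5) WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "/html[1]/body[1]/div[1]/div[1]/div[1]/div[8]/div[2]/div[1]/table[1]/tbody[1]/tr[1]/td[2]/span[1]/span[1]"))) waits for the element to appear, for at most 10 seconds, and raises TimeoutException if it never does. This is typically needed on AJAX pages, where content arrives after the initial page load.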
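
Taking window_handles[-1] relies on the new window being last in the list, which the WebDriver specification does not guarantee. A slightly more defensive variation records the handles before the click and then picks whichever handle is new afterwards. The sketch below keeps that logic in a plain function; new_handle is a hypothetical name, not part of Selenium.

```python
def new_handle(handles_before, handles_after):
    """Return the one handle present after an action but not before it.

    Raises ValueError unless exactly one new window appeared.
    """
    known = set(handles_before)
    added = [handle for handle in handles_after if handle not in known]
    if len(added) != 1:
        raise ValueError(f"expected exactly one new window, found {len(added)}")
    return added[0]


# Usage with a real session (not executed here):
#   before = driver.window_handles
#   driver.find_element(By.ID, "search_view_btn1").click()
#   driver.switch_to.window(new_handle(before, driver.window_handles))
```

The new window may take a moment to open, so in practice one might first wait on a condition such as EC.number_of_windows_to_be(2) before comparing the handle lists.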

Run python demo.py. The output looks like this:

title: 东方财富网：财经门户，提供专业的财经、股票、行情、证券、基金、理财、银行、保险、信托、期货、黄金、股吧、博客等各类财经资讯及数据
[3744:12972:0605/214254.992:ERROR:interface_endpoint_client.cc(696)] Message 0 rejected by interface blink.mojom.WidgetHost
Stock open at:  34.70
stock close at:  34.77
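
The .text values printed above are strings, so any calculation needs a conversion first. A hedged sketch follows: parse_price is a hypothetical helper, and treating "-" or an empty string as missing data is an assumption about how a page might display an unavailable quote, not something confirmed for this site.

```python
from decimal import Decimal, InvalidOperation


def parse_price(text):
    """Convert scraped price text such as ' 34.70' to Decimal, or None if unparseable."""
    cleaned = text.strip().replace(",", "")
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        # Covers placeholders like "-" or "" (assumed, not verified for the site).
        return None


open_price = parse_price("34.70")
close_price = parse_price("34.77")
print(close_price - open_price)  # prints 0.07
```

Decimal is used instead of float so that the two-decimal prices subtract exactly.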
