Web scraping is a practical way to collect real-time financial data such as stock prices, cryptocurrency trends, and market news. Python provides libraries such as requests, BeautifulSoup, Selenium, and Scrapy for financial data extraction.
Key Topics Covered
✔ Scraping stock prices from Yahoo Finance
✔ Extracting financial news headlines
✔ Scraping cryptocurrency data
✔ Handling JavaScript-rendered financial data
✔ Storing financial data in a database
1. Installing Required Libraries
pip install requests beautifulsoup4 selenium pandas yfinance scrapy
2. Scraping Stock Prices from Yahoo Finance
Yahoo Finance provides historical and real-time stock price data.
import requests
from bs4 import BeautifulSoup
# Define stock symbol (e.g., Apple - AAPL)
stock_symbol = "AAPL"
url = f"https://finance.yahoo.com/quote/{stock_symbol}"
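# Send request (a timeout keeps the script from hanging on a stalled connection)
response = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'}, timeout=10)
response.raise_for_status()
soup = BeautifulSoup(response.text, 'html.parser')
# Extract stock price; find() returns None if the element is missing from the HTML
price_tag = soup.find("fin-streamer", {"data-field": "regularMarketPrice"})
if price_tag is None:
    raise ValueError(f"Price element not found for {stock_symbol}")
stock_price = price_tag.text
print(f"{stock_symbol} Stock Price: ${stock_price}")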
✔ Helps track live stock prices
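The scraped price arrives as text, so it usually needs converting before any arithmetic or storage. A minimal sketch, assuming the text may carry a dollar sign, thousands separators, or surrounding whitespace; `parse_price` is a hypothetical helper name and the inputs are made-up samples:

```python
def parse_price(text: str) -> float:
    """Convert scraped price text such as "$1,234.56" to a float."""
    # Strip whitespace, then remove the currency symbol and thousands separators
    cleaned = text.strip().replace("$", "").replace(",", "")
    return float(cleaned)


print(parse_price("189.84"))
print(parse_price(" $1,234.56 "))
```

For money calculations where rounding matters, `decimal.Decimal(cleaned)` may be a better fit than `float`.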
3. Scraping Financial News Headlines
Scraping news from Yahoo Finance News Section
news_url = "https://finance.yahoo.com"
response = requests.get(news_url, headers={'User-Agent': 'Mozilla/5.0'})
soup = BeautifulSoup(response.text, 'html.parser')
# Extract news headlines
headlines = soup.find_all("h3", class_="Mb(5px)")
print("Latest Financial News:")
for i, headline in enumerate(headlines[:5]):
    print(f"{i+1}. {headline.text}")
✔ Used for sentiment analysis and market trend detection
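Scraped headline text often contains stray whitespace, empty strings, or repeats of the same story, which can skew any downstream analysis. A small cleanup sketch, assuming the headlines have already been extracted as plain strings; `clean_headlines` is a hypothetical helper and the sample list is invented for illustration:

```python
def clean_headlines(raw_headlines, limit=5):
    """Collapse whitespace, drop empty strings and case-insensitive duplicates, keep order."""
    seen = set()
    cleaned = []
    for raw in raw_headlines:
        text = " ".join(raw.split())  # collapses newlines and repeated spaces
        key = text.lower()
        if text and key not in seen:
            seen.add(key)
            cleaned.append(text)
    return cleaned[:limit]


# Made-up sample input, not real scraped data
sample = [
    "  Stocks rally\n as inflation cools ",
    "",
    "Stocks rally as inflation cools",
    "Oil prices slip",
]
for i, headline in enumerate(clean_headlines(sample), start=1):
    print(f"{i}. {headline}")
```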
4. Scraping Cryptocurrency Prices from CoinMarketCap
Extract prices for the top five cryptocurrencies, which typically include Bitcoin and Ethereum
crypto_url = "https://coinmarketcap.com/"
response = requests.get(crypto_url, headers={'User-Agent': 'Mozilla/5.0'})
soup = BeautifulSoup(response.text, 'html.parser')
# Extract top 5 cryptocurrencies
cryptos = soup.find_all("tr", class_="cmc-table-row", limit=5)
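for crypto in cryptos:
    name_tag = crypto.find("p", class_="coin-item-symbol")
    # Class names like "sc-..." are auto-generated and may change between site builds
    price_tag = crypto.find("span", class_="sc-f70bb44c-0")
    if name_tag and price_tag:
        print(f"{name_tag.text}: {price_tag.text}")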
✔ Helps in crypto trend analysis
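HTTP requests like the ones above can fail transiently (timeouts, dropped connections), so a retry with exponential backoff is a common safeguard. A minimal sketch, assuming the caller passes in its own fetch function; `fetch_with_retry` and its parameters are hypothetical names, and the demo uses a fake fetcher rather than a real network call:

```python
import time


def fetch_with_retry(fetch, url, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url); on an exception, wait base_delay * 2**attempt and try again."""
    for attempt in range(retries):
        try:
            return fetch(url)
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


# Demo with a fake fetcher that fails once, then succeeds
attempts = []

def fake_fetch(url):
    attempts.append(url)
    if len(attempts) == 1:
        raise ConnectionError("simulated temporary failure")
    return f"response from {url}"

print(fetch_with_retry(fake_fetch, "https://example.com", base_delay=0.01))
```

With requests, the fetch argument could be a function that calls `requests.get(...)` followed by `raise_for_status()`, so that HTTP error codes also trigger a retry.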
5. Scraping JavaScript-Rendered Financial Data Using Selenium
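Some websites use JavaScript to load data dynamically, so the values are missing from the raw HTML that requests receives. Selenium drives a real browser, which runs the JavaScript before the data is read.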
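from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# Set up Selenium WebDriver
driver = webdriver.Chrome()
try:
    driver.get("https://finance.yahoo.com/quote/AAPL")
    # Wait up to 10 seconds for the JavaScript-rendered price element to appear
    price_element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, '//fin-streamer[@data-field="regularMarketPrice"]'))
    )
    print(f"Apple Stock Price: ${price_element.text}")
finally:
    # Always close the browser, even if the lookup fails
    driver.quit()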
✔ Useful for scraping financial data that loads via AJAX
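Dynamically loaded values may not exist the instant the page opens, which is why Selenium offers explicit waits such as `WebDriverWait`. The underlying idea is a polling loop, sketched below without a browser; `wait_until` is a hypothetical helper, and the simulated page that "loads" its price on the third poll is invented for illustration:

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Poll condition() until it returns a truthy value or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Simulated page: the price is absent for two polls, then present
polls = iter([None, None, "189.84"])
print(wait_until(lambda: next(polls), timeout=2.0, interval=0.01))
```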
6. Scraping Historical Stock Data Using yfinance
Fetch historical data for AAPL stock
import yfinance as yf
# Download historical data (the end date is exclusive)
df = yf.download("AAPL", start="2023-01-01", end="2024-01-01")
print(df.head())
✔ Used for trend analysis & backtesting strategies
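A simple moving average is one of the most basic trend indicators that can be computed from historical closing prices. The sketch below uses plain Python lists so it runs without any download; `simple_moving_average` is a hypothetical helper and the closing prices are made-up numbers:

```python
def simple_moving_average(prices, window):
    """Return the average of each full window of consecutive prices."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(prices[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(prices))
    ]


closes = [10.0, 11.0, 12.0, 13.0, 14.0]  # made-up closing prices
print(simple_moving_average(closes, window=3))  # [11.0, 12.0, 13.0]
```

On a pandas DataFrame, `rolling(window).mean()` on the closing-price column should give the same values, preceded by NaN for the rows that lack a full window.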
7. Storing Scraped Data in a Database
Save stock prices into an SQLite database
import sqlite3
# Connect to database
conn = sqlite3.connect("finance.db")
cursor = conn.cursor()
# Create table
cursor.execute("CREATE TABLE IF NOT EXISTS stock_prices (symbol TEXT, price REAL, date TEXT)")
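# Insert data (stock_price is scraped text such as "1,234.56", so convert it for the REAL column)
cursor.execute("INSERT INTO stock_prices VALUES (?, ?, datetime('now'))", ("AAPL", float(stock_price.replace(",", ""))))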
conn.commit()
conn.close()
✔ Useful for building financial dashboards
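A dashboard usually needs many rows written at once and the latest price per symbol read back. A minimal sketch using the same table layout, assuming an in-memory database and invented sample rows in place of real scraped values:

```python
import sqlite3

# Hypothetical sample rows; in practice these would come from the scrapers above
rows = [("AAPL", 189.84), ("MSFT", 402.10), ("AAPL", 190.12)]

conn = sqlite3.connect(":memory:")  # in-memory database, so no file is left behind
conn.execute("CREATE TABLE IF NOT EXISTS stock_prices (symbol TEXT, price REAL, date TEXT)")

# executemany inserts all rows with one parameterized statement
conn.executemany("INSERT INTO stock_prices VALUES (?, ?, datetime('now'))", rows)
conn.commit()

# Latest stored price per symbol (rowid increases with insertion order)
latest = conn.execute(
    "SELECT symbol, price FROM stock_prices "
    "WHERE rowid IN (SELECT MAX(rowid) FROM stock_prices GROUP BY symbol) "
    "ORDER BY symbol"
).fetchall()
print(latest)
conn.close()
```

Swapping `":memory:"` for `"finance.db"` would persist the rows to the same file used above.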
8. Automating Data Extraction with Scrapy
Create a Scrapy spider for financial data
import scrapy
class FinanceSpider(scrapy.Spider):
    name = "finance"
    start_urls = ["https://finance.yahoo.com"]
    def parse(self, response):
        # Parentheses must be escaped in a CSS class selector, so match on the class attribute instead
        for headline in response.css('h3[class*="Mb(5px)"]'):
            yield {"headline": headline.css("a::text").get()}
# Run Scrapy
# scrapy runspider finance_spider.py -o finance_news.json
✔ Automates bulk data extraction
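Bulk extraction can put real load on a site, so a spider is often given throttling settings. The dictionary below is a sketch using standard Scrapy setting names; the values are illustrative assumptions, and `POLITE_SETTINGS` is a hypothetical name:

```python
# Illustrative values; tune them to the target site's terms and rate limits
POLITE_SETTINGS = {
    "ROBOTSTXT_OBEY": True,        # skip URLs disallowed by robots.txt
    "DOWNLOAD_DELAY": 2.0,         # seconds to wait between requests to the same site
    "AUTOTHROTTLE_ENABLED": True,  # adapt the delay to server response times
    "USER_AGENT": "Mozilla/5.0",   # same header value used in the requests examples
}

# In the spider class, this would be assigned as:
#     custom_settings = POLITE_SETTINGS
print(POLITE_SETTINGS)
```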