Web Scraping Financial Data Using Python


Web scraping is a common way to collect financial data such as stock prices, cryptocurrency quotes, and market news. Python offers several libraries for the job: requests for downloading pages, BeautifulSoup for parsing HTML, Selenium for pages rendered by JavaScript, and Scrapy for larger crawls.

Key Topics Covered

✔ Scraping stock prices from Yahoo Finance
✔ Extracting financial news headlines
✔ Scraping cryptocurrency data
✔ Handling JavaScript-rendered financial data
✔ Storing financial data in a database


1. Installing Required Libraries

pip install requests beautifulsoup4 selenium pandas yfinance scrapy

2. Scraping Stock Prices from Yahoo Finance

Yahoo Finance provides historical and real-time stock price data.

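import requests
from bs4 import BeautifulSoup

# Define stock symbol (e.g., Apple - AAPL)
stock_symbol = "AAPL"
url = f"https://finance.yahoo.com/quote/{stock_symbol}"

# Send request; fail early on a timeout or an HTTP error status
response = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'}, timeout=10)
response.raise_for_status()
soup = BeautifulSoup(response.text, 'html.parser')

# Extract stock price; find() returns None if the page layout has changed
price_tag = soup.find("fin-streamer", {"data-field": "regularMarketPrice"})
if price_tag is None:
    raise ValueError(f"Price element not found for {stock_symbol}")
stock_price = price_tag.text.strip()
print(f"{stock_symbol} Stock Price: ${stock_price}")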

This helps track current stock prices.
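
The scraped price arrives as text, and it may contain a currency symbol or thousands separators. Before doing arithmetic or storing it as a number, it usually needs cleaning. The following is a minimal sketch; the helper name `parse_price` and the sample strings are assumptions made for illustration, not part of any library.

```python
def parse_price(text):
    """Convert scraped price text such as '$1,234.56' to a float, or None."""
    if text is None:
        return None
    # Drop currency symbols, thousands separators, and surrounding whitespace
    cleaned = text.strip().replace("$", "").replace(",", "")
    try:
        return float(cleaned)
    except ValueError:
        # Text such as 'N/A' or an empty string is not a usable price
        return None


# Sample strings are made up for illustration
print(parse_price("189.84"))
print(parse_price("$1,234.56"))
print(parse_price("N/A"))
```

Returning None instead of raising lets the caller decide whether a missing price should be skipped or reported.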


3. Scraping Financial News Headlines

Scrape news headlines from the Yahoo Finance home page.

news_url = "https://finance.yahoo.com"
response = requests.get(news_url, headers={'User-Agent': 'Mozilla/5.0'}, timeout=10)
soup = BeautifulSoup(response.text, 'html.parser')

# Extract news headlines (the class name is site-specific and may change)
headlines = soup.find_all("h3", class_="Mb(5px)")

print("Latest Financial News:")
for i, headline in enumerate(headlines[:5], start=1):
    print(f"{i}. {headline.get_text(strip=True)}")

Scraped headlines can feed sentiment analysis and market trend detection.
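
To give a rough idea of what sentiment analysis on headlines can look like, here is a toy keyword-counting sketch. The word lists, the function name `score_headline`, and the sample headlines are all hypothetical; a real project would more likely use a trained model or an established lexicon.

```python
# Hypothetical word lists; a real project would use a trained model or lexicon
POSITIVE = {"gain", "gains", "surge", "surges", "rally", "beats", "record"}
NEGATIVE = {"loss", "losses", "fall", "falls", "plunge", "misses", "slump"}


def score_headline(headline):
    """Return (positive word count - negative word count) for one headline."""
    words = [w.strip(".,!?:;'\"").lower() for w in headline.split()]
    positive = sum(1 for w in words if w in POSITIVE)
    negative = sum(1 for w in words if w in NEGATIVE)
    return positive - negative


# Made-up headlines, standing in for text taken from scraped headline tags
samples = [
    "Tech stocks rally as chipmaker beats estimates",
    "Oil prices plunge on demand fears",
    "Central bank holds rates steady",
]
for text in samples:
    print(score_headline(text), text)
```

A score above zero suggests a positive tone, below zero a negative one, and zero means no listed word was found.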


4. Scraping Cryptocurrency Prices from CoinMarketCap

Extract prices for the top five listed cryptocurrencies, such as Bitcoin and Ethereum

crypto_url = "https://coinmarketcap.com/"
response = requests.get(crypto_url, headers={'User-Agent': 'Mozilla/5.0'}, timeout=10)
soup = BeautifulSoup(response.text, 'html.parser')

# Extract top 5 cryptocurrencies
cryptos = soup.find_all("tr", class_="cmc-table-row", limit=5)

for crypto in cryptos:
    # Class names such as "sc-f70bb44c-0" look auto-generated and may change
    name_tag = crypto.find("p", class_="coin-item-symbol")
    price_tag = crypto.find("span", class_="sc-f70bb44c-0")
    if name_tag is None or price_tag is None:
        continue  # skip rows that do not match the expected layout
    print(f"{name_tag.text}: {price_tag.text}")

This supports crypto trend analysis.
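
Requests to public sites can fail transiently, for example through timeouts or temporary error responses. One common way to cope is to retry with an increasing delay. The sketch below is one possible approach under stated assumptions: `fetch_with_retry` is a hypothetical helper, and the `fetch` argument is any callable that takes a URL and returns the page content.

```python
import time


def fetch_with_retry(fetch, url, retries=3, delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying on any exception with a doubling delay."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(retries):
        try:
            return fetch(url)
        except Exception as exc:  # real code might catch requests.RequestException
            last_error = exc
            if attempt < retries - 1:
                # Wait delay, 2*delay, 4*delay, ... between attempts
                sleep(delay * (2 ** attempt))
    raise last_error


# Example with requests (needs network access, so it is left commented out):
# import requests
# html = fetch_with_retry(
#     lambda u: requests.get(u, headers={"User-Agent": "Mozilla/5.0"}, timeout=10).text,
#     "https://coinmarketcap.com/",
# )
```

Passing the fetch function in as an argument keeps the retry logic independent of the HTTP library and makes it easy to exercise without a network connection.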


5. Scraping JavaScript-Rendered Financial Data Using Selenium

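Some websites load data dynamically with JavaScript, so the values are missing from the HTML that requests downloads. Selenium drives a real browser that runs the page's scripts before the data is read.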

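from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Set up Selenium WebDriver
driver = webdriver.Chrome()
try:
    driver.get("https://finance.yahoo.com/quote/AAPL")

    # Wait up to 10 seconds for the JavaScript-rendered price element
    price_element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located(
            (By.XPATH, '//fin-streamer[@data-field="regularMarketPrice"]')
        )
    )
    print(f"Apple Stock Price: ${price_element.text}")
finally:
    # Close the browser even if the lookup fails
    driver.quit()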

This approach is useful for financial data that loads via AJAX.


6. Scraping Historical Stock Data Using yfinance

Fetch historical data for AAPL with the yfinance library, which returns a pandas DataFrame.

import yfinance as yf

# Download daily historical data (the end date is exclusive)
df = yf.download("AAPL", start="2023-01-01", end="2024-01-01")
print(df.head())

Historical data is used for trend analysis and backtesting strategies.
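
A simple moving average is one of the most basic trend indicators that can be computed from historical closing prices. The following pure-Python sketch shows the idea; the function name `simple_moving_average` and the closing prices are made up for illustration.

```python
def simple_moving_average(prices, window):
    """Return the average of each run of `window` consecutive prices."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(prices[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(prices))
    ]


# Made-up closing prices; real values might come from the downloaded data
closes = [10.0, 11.0, 12.0, 13.0, 14.0]
print(simple_moving_average(closes, 3))
```

With pandas, the same calculation on a Series is typically written as `series.rolling(window).mean()`.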


7. Storing Scraped Data in a Database

Save stock prices into an SQLite database

import sqlite3

# Connect to database
conn = sqlite3.connect("finance.db")
cursor = conn.cursor()

# Create table
cursor.execute("CREATE TABLE IF NOT EXISTS stock_prices (symbol TEXT, price REAL, date TEXT)")

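# Insert data; stock_price is the text scraped in section 2, so convert it to a number
price_value = float(stock_price.replace(",", ""))
cursor.execute("INSERT INTO stock_prices VALUES (?, ?, datetime('now'))", ("AAPL", price_value))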
conn.commit()
conn.close()

A database of stored prices is useful for building financial dashboards.
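
A dashboard typically needs to read the data back, for instance the most recent price per symbol. The sketch below uses the same stock_prices table layout with an in-memory database; the rows are made up for illustration, and the query is just one way to select the latest entry per symbol.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS stock_prices (symbol TEXT, price REAL, date TEXT)"
)

# Made-up rows for illustration
rows = [
    ("AAPL", 189.10, "2024-01-02 15:00:00"),
    ("AAPL", 190.25, "2024-01-02 16:00:00"),
    ("MSFT", 371.50, "2024-01-02 16:00:00"),
]
# executemany inserts all rows with one parameterized statement
conn.executemany("INSERT INTO stock_prices VALUES (?, ?, ?)", rows)
conn.commit()

# Latest stored price per symbol
latest = conn.execute(
    """
    SELECT s.symbol, s.price
    FROM stock_prices AS s
    JOIN (
        SELECT symbol, MAX(date) AS latest_date
        FROM stock_prices
        GROUP BY symbol
    ) AS m ON s.symbol = m.symbol AND s.date = m.latest_date
    ORDER BY s.symbol
    """
).fetchall()
print(latest)
conn.close()
```

Storing dates as ISO-style text, as `datetime('now')` does, means string comparison with MAX also orders them chronologically.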


8. Automating Data Extraction with Scrapy

Create a Scrapy spider for financial data

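import scrapy

class FinanceSpider(scrapy.Spider):
    name = "finance"
    start_urls = ["https://finance.yahoo.com"]

    def parse(self, response):
        # XPath is used because "Mb(5px)" contains parentheses,
        # which are not valid in an unescaped CSS class selector
        for headline in response.xpath('//h3[contains(@class, "Mb(5px)")]'):
            yield {"headline": headline.css("a::text").get()}

# Run Scrapy (assuming the file is saved as finance_spider.py)
# scrapy runspider finance_spider.py -o finance_news.json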

Scrapy automates bulk data extraction.
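
When extracting in bulk, it is usually sensible to limit the request rate. Scrapy exposes settings for this, and a spider can set them through its `custom_settings` class attribute. The dictionary name `POLITE_SETTINGS` and the values below are illustrative assumptions, not tuned recommendations.

```python
# Settings a spider could assign to its `custom_settings` class attribute.
# The values are illustrative assumptions, not tuned recommendations.
POLITE_SETTINGS = {
    "ROBOTSTXT_OBEY": True,               # respect the site's robots.txt rules
    "DOWNLOAD_DELAY": 2,                  # seconds to wait between requests
    "AUTOTHROTTLE_ENABLED": True,         # adapt the delay to server response times
    "CONCURRENT_REQUESTS_PER_DOMAIN": 1,  # one request at a time per site
}

print(POLITE_SETTINGS)
```

Inside a spider class this would read `custom_settings = POLITE_SETTINGS`.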
