Sentiment analysis in finance involves analyzing news articles, reports, and social media to assess whether the market sentiment is positive, negative, or neutral. This technique helps investors and traders make data-driven decisions.
Applications in Finance
✔ Predicting stock market trends
✔ Assessing investor sentiment
✔ Identifying market risks
✔ Automating news analysis
1. Data Collection – Fetching Financial News
To perform sentiment analysis, we first need to gather financial news data.
Here's an example that uses the NewsAPI service to fetch news articles:
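import requests
# API key (get one from https://newsapi.org/)
API_KEY = "your_api_key"
url = "https://newsapi.org/v2/everything"
# Pass the query as params so requests handles URL encoding of "stock market"
params = {"q": "stock market", "language": "en", "apiKey": API_KEY}
# Fetch news articles
response = requests.get(url, params=params, timeout=10)
response.raise_for_status()  # Fail early on HTTP errors such as an invalid key
data = response.json()
# Extract headlines, skipping articles without a title
articles = data.get("articles", [])
headlines = [article["title"] for article in articles if article.get("title")]
# Display sample headlines
for i, headline in enumerate(headlines[:5], 1):
    print(f"{i}. {headline}")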
✔ Retrieves recent stock market news headlines
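Comparing sentiment with daily prices later on needs a date for each headline, not just its title. Here is a minimal sketch of one way to keep both, assuming the response has NewsAPI's article shape with `title` and `publishedAt` fields. The helper name `extract_dated_headlines` and the sample payload are made up for illustration.

```python
from datetime import datetime

def extract_dated_headlines(payload):
    """Return (date, title) pairs from a NewsAPI-style response dict."""
    pairs = []
    for article in payload.get("articles", []):
        title = article.get("title")
        published = article.get("publishedAt")
        if not title or not published:
            continue  # Skip articles missing either field
        # Assumes an ISO 8601 timestamp such as "2024-01-15T10:30:00Z";
        # only the leading YYYY-MM-DD part is used.
        day = datetime.strptime(published[:10], "%Y-%m-%d").date()
        pairs.append((day, title))
    return pairs

# Illustrative payload, not real API output
sample = {
    "articles": [
        {"title": "Stocks rally on earnings", "publishedAt": "2024-01-15T10:30:00Z"},
        {"title": None, "publishedAt": "2024-01-15T11:00:00Z"},
    ]
}

for day, title in extract_dated_headlines(sample):
    print(day, title)
```

Keeping the date next to each title makes it possible to average sentiment per day instead of over the whole batch.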
2. Preprocessing the Text Data
Before performing sentiment analysis, we clean the text:
import re
import nltk
from nltk.corpus import stopwords
nltk.download("stopwords")
stop_words = set(stopwords.words("english"))
def clean_text(text):
    text = text.lower()  # Convert to lowercase
    text = re.sub(r"[^\w\s]", "", text)  # Remove punctuation
    text = " ".join(word for word in text.split() if word not in stop_words)  # Remove stopwords
    return text
# Clean headlines
cleaned_headlines = [clean_text(headline) for headline in headlines]
✔ Lowercases the text and removes punctuation and stopwords
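One caveat with stopword removal: NLTK's English stopword list includes negations such as "not" and "no", and dropping them can invert the meaning of a headline before it is scored. The sketch below shows one possible variant that keeps negations. The short `STOP_WORDS` set is a stand-in so the example runs without downloading NLTK data, and the function name `clean_text_keep_negation` is hypothetical.

```python
import re

# Small stand-in for NLTK's English stopword list (the real list is much larger)
STOP_WORDS = {"the", "a", "an", "is", "are", "did", "to", "of", "not", "no", "nor"}
# Negation words to keep because they can flip the meaning of a headline
NEGATIONS = {"not", "no", "nor", "never"}

def clean_text_keep_negation(text, stop_words=STOP_WORDS):
    """Lowercase, strip punctuation, and drop stopwords except negations."""
    text = text.lower()
    text = re.sub(r"[^\w\s]", "", text)
    removable = stop_words - NEGATIONS
    return " ".join(word for word in text.split() if word not in removable)

print(clean_text_keep_negation("The market did not rally!"))
# → market not rally
```

With the full NLTK list, the same idea would be `set(stopwords.words("english")) - NEGATIONS`.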
3. Sentiment Analysis Using VADER (NLTK)
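VADER (Valence Aware Dictionary and sEntiment Reasoner) is a rule-based sentiment analyzer tuned for social media text. It uses a general-purpose lexicon rather than a finance-specific one, so treat its scores on financial headlines as a rough signal. VADER also reads capitalization, punctuation, and negation words as cues, so the cleaning step above removes some of the information it can use.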
from nltk.sentiment import SentimentIntensityAnalyzer
nltk.download("vader_lexicon")
sia = SentimentIntensityAnalyzer()
# Analyze sentiment of each headline
sentiments = [sia.polarity_scores(headline) for headline in cleaned_headlines]
# Display sample sentiment scores
for i, (headline, sentiment) in enumerate(zip(headlines[:5], sentiments[:5]), 1):
    print(f"{i}. {headline}")
    print(f"   Sentiment Score: {sentiment}\n")
✔ Provides sentiment scores: compound (overall score, normalized to the range -1 to +1) and pos, neu, neg (proportions of the text rated positive, neutral, and negative)
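To make the compound score less of a black box, here is a sketch of the normalization step as I understand it from VADER's reference implementation: the summed word valences are squashed into the range -1 to +1 with a constant `alpha` of 15. Treat the formula and the constant as an assumption to check against the library source. The raw sums below are arbitrary illustrative inputs.

```python
import math

def normalize(raw_sum, alpha=15):
    """Squash an unbounded valence sum into [-1, 1] (assumed VADER-style formula)."""
    norm = raw_sum / math.sqrt(raw_sum * raw_sum + alpha)
    # Clamp to guard against floating-point overshoot
    return max(-1.0, min(1.0, norm))

for raw in (-8.0, -1.0, 0.0, 1.0, 8.0):
    print(f"raw sum {raw:+.1f} -> compound {normalize(raw):+.4f}")
```

The curve is steep near zero and flattens toward the ends, so a headline with several strongly scored words quickly approaches ±1.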
4. Categorizing Sentiment (Positive, Negative, Neutral)
We classify the compound score into three categories:
def classify_sentiment(score):
    if score["compound"] >= 0.05:
        return "Positive"
    elif score["compound"] <= -0.05:
        return "Negative"
    else:
        return "Neutral"
# Categorize sentiment
categories = [classify_sentiment(score) for score in sentiments]
# Display categorized sentiment
for i, (headline, category) in enumerate(zip(headlines[:5], categories[:5]), 1):
    print(f"{i}. {headline} -> Sentiment: {category}")
✔ Classifies news as Positive, Negative, or Neutral
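The ±0.05 cutoffs are a common convention for VADER, not a fixed rule, and the width of the neutral band changes how many headlines land in each class. The sketch below makes the cutoff a parameter so that effect is easy to check. The function name `classify_compound` and the sample values are illustrative only.

```python
def classify_compound(compound, threshold=0.05):
    """Map a compound score to a label using a symmetric neutral band."""
    if compound >= threshold:
        return "Positive"
    if compound <= -threshold:
        return "Negative"
    return "Neutral"

# Illustrative scores, including values right at the boundaries
for value in (0.05, 0.049, 0.0, -0.05, -0.3):
    print(f"{value:+.3f} -> default: {classify_compound(value)}, "
          f"wider band: {classify_compound(value, threshold=0.2)}")
```

A wider band moves borderline headlines into Neutral, which may be preferable when the scores come from a general-purpose lexicon.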
5. Visualizing Sentiment Distribution
We plot the sentiment distribution using matplotlib & seaborn:
import seaborn as sns
import matplotlib.pyplot as plt
# Count sentiment categories
sentiment_counts = { "Positive": categories.count("Positive"),
"Negative": categories.count("Negative"),
"Neutral": categories.count("Neutral") }
# Plot bar chart
plt.figure(figsize=(7,5))
sns.barplot(x=list(sentiment_counts.keys()), y=list(sentiment_counts.values()), palette="coolwarm")
plt.title("Financial News Sentiment Distribution")
plt.ylabel("Count")
plt.show()
✔ Displays a bar chart showing the sentiment distribution
6. Correlating Sentiment with Stock Market Trends
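We overlay the sentiment score on a stock market index to look for a visual relationship:
import pandas as pd
import yfinance as yf
# Get S&P 500 index data for the past week
sp500 = yf.download("^GSPC", period="7d", interval="1d", auto_adjust=True)
# Newer yfinance versions can return MultiIndex columns; keep only the price-field level
if isinstance(sp500.columns, pd.MultiIndex):
    sp500.columns = sp500.columns.get_level_values(0)
# The headlines above carry no dates, so one overall average is repeated for every day
avg_sentiment = sum(s["compound"] for s in sentiments) / len(sentiments)
daily_sentiment = pd.DataFrame({"date": pd.to_datetime(sp500.index), "sentiment": avg_sentiment})
# Merge sentiment with closing prices (already adjusted because auto_adjust=True)
prices = sp500[["Close"]].reset_index()
prices.columns = ["date", "Close"]
merged_data = prices.merge(daily_sentiment, on="date")
# Plot stock price vs sentiment
fig, ax1 = plt.subplots(figsize=(10, 5))
ax2 = ax1.twinx()
ax1.plot(merged_data["date"], merged_data["Close"], "g-", label="S&P 500")
ax2.plot(merged_data["date"], merged_data["sentiment"], "b--", label="Sentiment Score")
ax1.set_xlabel("Date")
ax1.set_ylabel("S&P 500 Price", color="g")
ax2.set_ylabel("Sentiment Score", color="b")
plt.title("Stock Market vs News Sentiment")
# Collect handles from both axes so the legend shows both lines
h1, l1 = ax1.get_legend_handles_labels()
h2, l2 = ax2.get_legend_handles_labels()
ax1.legend(h1 + h2, l1 + l2)
plt.show()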
✔ Compares market trends with sentiment scores
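A plot with a single repeated average cannot show a correlation, because a constant series has no variance. To measure one, sentiment has to vary by day. The following is a minimal standard-library sketch of that idea: average the compound scores per date, align them with daily returns, and compute Pearson's r. All numbers are made up for illustration, and the helper names `daily_mean` and `pearson` are hypothetical.

```python
from collections import defaultdict
from math import sqrt

def daily_mean(scored):
    """Average compound scores per date from (date, compound) pairs."""
    buckets = defaultdict(list)
    for day, compound in scored:
        buckets[day].append(compound)
    return {day: sum(values) / len(values) for day, values in buckets.items()}

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return float("nan")  # Undefined when either series is constant
    return cov / (sd_x * sd_y)

# Made-up (date, compound) pairs and daily index returns, for illustration only
scored = [("2024-01-02", 0.4), ("2024-01-02", 0.2),
          ("2024-01-03", -0.3), ("2024-01-04", 0.1)]
returns = {"2024-01-02": 0.006, "2024-01-03": -0.004, "2024-01-04": 0.001}

sentiment_by_day = daily_mean(scored)
days = sorted(set(sentiment_by_day) & set(returns))  # Keep dates present in both
r = pearson([sentiment_by_day[d] for d in days], [returns[d] for d in days])
print(f"Pearson r over {len(days)} days: {r:.3f}")
```

A handful of days is far too little data for a meaningful estimate, so a real analysis would need a much longer window of dated headlines.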