How to Extract Yahoo Finance Data: Stock Prices, Price Changes, Bids, and More

The share market is a huge database of companies, with millions of records that are updated constantly. Numerous companies offer this financial data, generally through real-time data APIs available in premium tiers. Yahoo Finance is a reliable source of share market data, and Yahoo also offers a premium Yahoo Finance API. As a free alternative, you can access any company’s stock data directly on the website.

Yahoo Finance is very popular among stock traders and has persevered in the market while many big competitors, such as Google Finance, have failed. For people interested in succeeding in the stock markets, Yahoo offers the most up-to-date news on the stock market and its companies.

Steps of Extracting Yahoo Finance

  • Construct the URL of the result page on Yahoo Finance.
  • Download the HTML of that page with Python requests.
  • Parse the page with LXML — LXML lets you navigate the HTML tree structure using XPaths. We have pre-defined the XPaths for the details needed in the code.
  • Save the downloaded data to a JSON file.
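The parsing step above can be sketched on a miniature HTML fragment. This is a minimal, self-contained example, assuming the same two-column td label/value layout that Yahoo Finance's summary table uses (the real page is downloaded over the network instead of being inlined):

```python
from lxml import html

# Tiny stand-in for Yahoo Finance's summary table: each <tr> holds a
# label cell and a value cell, just like the real quote page.
page = """
<table>
  <tr><td>Previous Close</td><td>293.16</td></tr>
  <tr><td>Open</td><td>295.06</td></tr>
</table>
"""

parser = html.fromstring(page)
summary = {}
for row in parser.xpath("//tr"):
    # join all text nodes of the first and second cell of each row
    key = "".join(row.xpath("./td[1]//text()")).strip()
    value = "".join(row.xpath("./td[2]//text()")).strip()
    summary[key] = value

print(summary)
```

The full script below applies exactly this key/value extraction loop to the live page.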

We will scrape the following data fields:

You will need to install Python 3 packages for downloading and parsing the HTML file.
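Assuming pip is available, the two third-party packages the script depends on can be installed with:

```shell
pip install requests lxml
```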

The Script We Have Used

from lxml import html
import requests
import json
import argparse
from collections import OrderedDict


def get_headers():
    # Browser-like headers so Yahoo Finance serves the normal HTML page
    return {"accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
            "accept-encoding": "gzip, deflate, br",
            "accept-language": "en-GB,en;q=0.9,en-US;q=0.8,ml;q=0.7",
            "cache-control": "max-age=0",
            "dnt": "1",
            "sec-fetch-dest": "document",
            "sec-fetch-mode": "navigate",
            "sec-fetch-site": "none",
            "sec-fetch-user": "?1",
            "upgrade-insecure-requests": "1",
            "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.122 Safari/537.36"}


def parse(ticker):
    # Yahoo Finance quote-page URL for the ticker
    url = "https://finance.yahoo.com/quote/%s?p=%s" % (ticker, ticker)
    response = requests.get(
        url, verify=False, headers=get_headers(), timeout=30)
    print("Parsing %s" % (url))
    parser = html.fromstring(response.text)
    # Rows of the two-column summary table on the quote page
    summary_table = parser.xpath(
        '//div[contains(@data-test,"summary-table")]//tr')
    summary_data = OrderedDict()
    # Yahoo's quoteSummary JSON endpoint; note that "&region=US" appears
    # garbled as "®ion=US" in some copies of this snippet
    other_details_json_link = (
        "https://query2.finance.yahoo.com/v10/finance/quoteSummary/{0}"
        "?formatted=true&lang=en-US&region=US"
        "&modules=summaryProfile%2CfinancialData%2CrecommendationTrend"
        "%2CupgradeDowngradeHistory%2Cearnings%2CdefaultKeyStatistics"
        "%2CcalendarEvents").format(ticker)
    summary_json_response = requests.get(other_details_json_link)
    try:
        json_loaded_summary = json.loads(summary_json_response.text)
        summary = json_loaded_summary["quoteSummary"]["result"][0]
        y_Target_Est = summary["financialData"]["targetMeanPrice"]['raw']
        earnings_list = summary["calendarEvents"]['earnings']
        eps = summary["defaultKeyStatistics"]["trailingEps"]['raw']
        datelist = []
        for i in earnings_list['earningsDate']:
            datelist.append(i['fmt'])
        earnings_date = ' to '.join(datelist)
        for table_data in summary_table:
            raw_table_key = table_data.xpath('.//td[1]//text()')
            raw_table_value = table_data.xpath('.//td[2]//text()')
            table_key = ''.join(raw_table_key).strip()
            table_value = ''.join(raw_table_value).strip()
            summary_data.update({table_key: table_value})
        summary_data.update({'1y Target Est': y_Target_Est, 'EPS (TTM)': eps,
                             'Earnings Date': earnings_date, 'ticker': ticker,
                             'url': url})
        return summary_data
    except ValueError:
        print("Failed to parse json response")
        return {"error": "Failed to parse json response"}
    except Exception:
        return {"error": "Unhandled Error"}


if __name__ == "__main__":
    argparser = argparse.ArgumentParser()
    argparser.add_argument('ticker', help='Ticker symbol, e.g. AAPL')
    args = argparser.parse_args()
    ticker = args.ticker
    print("Fetching data for %s" % (ticker))
    scraped_data = parse(ticker)
    print("Writing data to output file")
    with open('%s-summary.json' % (ticker), 'w') as fp:
        json.dump(scraped_data, fp, indent=4)

How Does the Scraper Get Executed?

Suppose the script is saved as yahoofinance.py (the file name is arbitrary). Typing the script name in the command prompt or terminal with the -h flag prints its usage:

python3 yahoofinance.py -h
usage: yahoofinance.py [-h] ticker

positional arguments:
  ticker

optional arguments:
  -h, --help  show this help message and exit

The ticker, often called the stock symbol, is used to identify a company.

To get data on Apple Inc. stock, we pass its ticker as the argument:

python3 yahoofinance.py AAPL

This will create a JSON file called AAPL-summary.json in the same folder as the script.

This is what the output file might look like:

{
    "Previous Close": "293.16",
    "Open": "295.06",
    "Bid": "298.51 x 800",
    "Ask": "298.88 x 900",
    "Day's Range": "294.48 - 301.00",
    "52 Week Range": "170.27 - 327.85",
    "Volume": "36,263,602",
    "Avg. Volume": "50,925,925",
    "Market Cap": "1.29T",
    "Beta (5Y Monthly)": "1.17",
    "PE Ratio (TTM)": "23.38",
    "EPS (TTM)": 12.728,
    "Earnings Date": "2020-07-28 to 2020-08-03",
    "Forward Dividend & Yield": "3.28 (1.13%)",
    "Ex-Dividend Date": "May 08, 2020",
    "1y Target Est": 308.91,
    "ticker": "AAPL",
    "url": ""
}

The code can fetch stock market information for any company. If you want to scrape Yahoo Finance historical data with Python on a regular basis, there are a few more things you should know.
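For example, to fetch several companies in one run, you could loop over a list of tickers and reuse parse() from the script above. This is a sketch; parse() is stubbed here so the example runs without network access, while the real function downloads and parses the Yahoo Finance quote page:

```python
import json

# Stand-in for the parse() function from the script above, which
# returns an ordered dict of summary fields for one ticker.
def parse(ticker):
    return {"ticker": ticker}

tickers = ["AAPL", "MSFT", "GOOG"]  # any ticker symbols
for ticker in tickers:
    summary = parse(ticker)
    # one JSON file per company, matching the script's naming scheme
    with open("%s-summary.json" % ticker, "w") as fp:
        json.dump(summary, fp, indent=4)
```

When running against the live site, consider adding a short delay between requests so the scraper does not hammer the server.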

Why Do You Need to Scrape Yahoo Finance News?

If you are dealing with share market data and need a clear, free, and reliable source, web scraping stock prices can be the best option. Company profile pages all follow the same format, so a script that scrapes data from the Microsoft company page can also be used to scrape data from the Apple company page.

If you cannot decide how to extract Yahoo Finance data yourself, the better option is to hire a skilled web data scraping company like 3i Data Scraping.

For further queries, contact 3i Data Scraping now or ask for a free quote!




3i Data Scraping is an experienced web scraping service provider in the USA. We offer a complete range of data extraction services from websites and online sources.
