Use Web Scraping to Get Data

9 Dec 2023 ~ 9 Dec 2023 • Luciano Nieto

I had always heard about web scraping, but I had no idea how it worked or what I would use it for. That felt strange to me, but in my projects the data source was always a corporate system, so I had never needed scraping in anything I worked on.

After a quick search on the internet, I found some interesting posts about it and decided to try it out. The code here is only a few lines long, but it shows the basic idea of how to navigate a page and pull data out of it with web scraping.

One thing to keep in mind: first read the terms and conditions of the website you're accessing, so that you comply with its rules.
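A practical complement to reading the terms (not a replacement for it) is checking the site's robots.txt file, which lists the paths that automated clients are asked to avoid. Here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `my-scraper` user agent, and the example.com URLs are all made up for illustration, and `is_allowed` is a hypothetical helper, not something from the repo.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up rules for illustration; a real site publishes its own at /robots.txt.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(SAMPLE_ROBOTS, "my-scraper", "https://example.com/quote/ABC"))
print(is_allowed(SAMPLE_ROBOTS, "my-scraper", "https://example.com/private/data"))
```

With these sample rules, the first URL is allowed and the second is not. Against a real site you would load its published rules with `set_url()` and `read()` rather than pasting them in as a string.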

Requirements

To use this repo, you will need to:

Clone this repository to your local machine.
Install the BeautifulSoup library (published on PyPI as beautifulsoup4).

Accessing the website & understanding the structure

The first step is to access the website you plan to get data from and understand its HTML structure. My approach is based on sites that serve their content as plain HTML, and I'm not sure how it works with other kinds of pages, such as those that render their content with JavaScript.

Inspect:

To understand it, you have to identify the nesting level and the pattern of the elements that contain the information.

For this one:

DIV/quote-header-info -> fin-streamer -> DIV/quote-market-notice.
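The path above can be followed with BeautifulSoup roughly as in the sketch below. It is not the code from the repo: it runs against a small hard-coded HTML string, so no request is made. It assumes that `quote-header-info` and `quote-market-notice` are `id` attributes (they could just as well be class names on the real page). The price and notice values are placeholders, and `extract_quote` is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Placeholder HTML that mimics the structure described above; the values are made up.
SAMPLE_HTML = """
<div id="quote-header-info">
  <fin-streamer data-field="regularMarketPrice" value="123.45">123.45</fin-streamer>
  <div id="quote-market-notice"><span>At close: 4:00PM EST</span></div>
</div>
"""


def extract_quote(html: str) -> dict:
    """Walk DIV/quote-header-info -> fin-streamer -> DIV/quote-market-notice."""
    soup = BeautifulSoup(html, "html.parser")

    # Level 1: the container div (assumed to be identified by its id).
    header = soup.find("div", id="quote-header-info")
    if header is None:
        raise ValueError("quote-header-info container not found")

    # Level 2: the first fin-streamer element inside the container.
    price_tag = header.find("fin-streamer")
    # Level 3: the market notice div inside the same container.
    notice_tag = header.find("div", id="quote-market-notice")

    return {
        "price": price_tag.get_text(strip=True) if price_tag else None,
        "notice": notice_tag.get_text(strip=True) if notice_tag else None,
    }


quote = extract_quote(SAMPLE_HTML)
print(quote)
```

Searching inside `header` instead of the whole document keeps the lookup tied to the hierarchy found during inspection, so an unrelated `fin-streamer` elsewhere on the page would not be picked up by mistake.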

Code & Result

Some websites don't let you get the data directly or in real time. In this case there is a delay of a couple of hours (of course, this is financial information that many robots would want to use, and it isn't paid).

GitHub Repo here.
