How does a Python crawler analyze a website before crawling it?
Crawling web data requires a few tools, such as requests, regular expressions, and bs4. Parsing the web page is the first step: once it is parsed, you can grab data through its tags and nodes.
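A minimal sketch of that workflow, combining bs4 and a regular expression. The HTML snippet and the item/price structure here are made up for illustration; in practice you would fetch the page with requests first.

```python
import re

from bs4 import BeautifulSoup

# In a real crawler you would fetch the page, e.g.:
#   html = requests.get("https://example.com", timeout=10).text
# Here we use a small hard-coded page so the sketch runs offline.
html = """
<html><head><title>Demo shop</title></head>
<body>
  <ul>
    <li class="item">Apple - 3 USD</li>
    <li class="item">Pear - 5 USD</li>
  </ul>
</body></html>
"""

# Parse the HTML so we can navigate it by tags and nodes.
soup = BeautifulSoup(html, "html.parser")

# Grab data through tags: every <li> node with class "item".
items = [li.get_text(strip=True) for li in soup.find_all("li", class_="item")]

# Regular expressions complement bs4 for text inside a node,
# e.g. pulling the numeric price out of each item string.
prices = [int(re.search(r"(\d+) USD", item).group(1)) for item in items]

print(items)   # ['Apple - 3 USD', 'Pear - 5 USD']
print(prices)  # [3, 5]
```

The usual division of labor: bs4 locates the right nodes in the document tree, and a regex extracts the exact piece of text you need from inside them.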
As it happens, I recently published an article on extracting data from web pages that walks through the complete crawling steps. Feel free to take a look (sorry for the self-promotion!).