How to crawl data with Python?
Methods/steps

Before grabbing any data, you need two libraries: urllib, which already ships with Python's standard library, and python-docx, which you can install with pip.


Then add the corresponding import statements at the top of your Python script so that both libraries are available.
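For example, assuming Python 3 (where the page-fetching functions live in urllib.request), the imports look roughly like this:

import urllib.request      # standard-library module for fetching web pages
from docx import Document  # provided by the python-docx package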


urllib is the library responsible for fetching web-page data, and grabbing a page is actually very simple: open the URL with urllib, passing the page's link as the argument.
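A minimal sketch of that call, using a placeholder URL:

import urllib.request

# Placeholder URL; substitute the page you actually want to grab.
response = urllib.request.urlopen("https://example.com")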


Opening the page alone is not enough; you must also read the response, otherwise you end up with nothing usable.
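Continuing the sketch above, read() is what actually pulls the page body back:

import urllib.request

response = urllib.request.urlopen("https://example.com")
# urlopen() only gives you an open response object; read() returns
# the page body itself as raw bytes.
raw_bytes = response.read()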



The next step is to decode the data: what read() returns cannot be saved as text until it has been decoded. Call decode() on the result of read() and assign it to a variable with any name you like, such as xa.
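A sketch of that step, assuming the page is UTF-8 encoded and keeping the variable name xa used in this guide:

import urllib.request

response = urllib.request.urlopen("https://example.com")
# read() returns bytes; decode() turns them into an ordinary string.
xa = response.read().decode("utf-8")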



Finally, enter three more statements.

The first creates a new, blank Word document.

The second adds a paragraph of text to that document, containing whatever the variable xa captured.

The third saves the document as a .docx file, with the file name given in the brackets.
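Put together, a rough end-to-end sketch; the URL and the file name capture.docx are only placeholders:

import urllib.request
from docx import Document

# Fetch the page and decode the bytes into text (placeholder URL).
xa = urllib.request.urlopen("https://example.com").read().decode("utf-8")

document = Document()          # first statement: create a blank Word document
document.add_paragraph(xa)     # second: add the captured text as a paragraph
document.save("capture.docx")  # third: save it under the name in the brackets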



That is the whole source code. If you also need to filter the captured content, you will have to add your own regular expressions.
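For instance, a rough sketch that pulls hyperlink targets out of the captured text with the standard re module; the sample string below merely stands in for whatever xa captured:

import re

# Sample HTML standing in for the captured page text.
xa = '<p>Intro</p><a href="https://example.com">Example</a>'

# A deliberately simple pattern that extracts hyperlink targets.
links = re.findall(r'href="(.*?)"', xa)
print(links)  # ['https://example.com']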