Detailed guide to using the WeChat article capture tool
More and more high-quality content is published on WeChat official accounts, and some readers want to collect it. This tutorial shows how to use the Octopus collector to crawl and collect WeChat article information.
The collected fields include the article title, the article keywords, the excerpt shown for the article, the official account the article belongs to, the publishing time, the article URL, and other field data.
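For readers who post-process the collected data in code, the field list above can be sketched as a small record type. This is only an illustration: the class name, the attribute names, and the sample values are assumptions, not names used by the tool itself.

```python
from dataclasses import dataclass, asdict


@dataclass
class WeChatArticle:
    """One collected row; field names are illustrative, not the tool's own."""
    title: str
    keywords: str
    excerpt: str        # the partial content shown in the search result
    account: str        # official account that published the article
    published_at: str   # kept as text, exactly as shown on the page
    url: str


# Placeholder values, for illustration only.
row = WeChatArticle(
    title="Example title",
    keywords="Octopus Big Data",
    excerpt="First lines of the article...",
    account="Example account",
    published_at="2018-01-01",
    url="https://example.com/article",
)
print(asdict(row))
```

Keeping the publishing time as plain text avoids guessing the date format before the exported data has been inspected.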
Collection website:
Step 1: Create an acquisition task.
1) Enter the main interface and select "Custom Mode".
Octopus Tongyun wealth management service platform
2) Copy and paste the URL of the website to be collected into the website input box, and click "Save URL".
Detailed use of WeChat article capture tool Step 2
Step 2: Create a pagination loop
1) In the upper right corner of the page, open "Process" to display "Process Designer" and "Customize Current Operation". Click on the article search box in the page and select "Enter Text" in the operation prompt box on the right.
Detailed use of WeChat article capture tool Step 3
2) Enter the article information to be searched. Take "Octopus Big Data" as an example. Click "OK" after entering.
Detailed use of WeChat article capture tool Step 4
3) "Octopus Big Data" is automatically filled into the search box. Click the "Search for Articles" button and select "Click this button" in the operation prompt box.
Detailed use of WeChat article capture tool Step 5
4) The article search results for "Octopus Big Data" appear in the page. Scroll down to the bottom of the results page, click the "Next Page" button, and select "Click Next Page Cycle" in the operation prompt box on the right.
Detailed use of WeChat article capture tool Step 6
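The loop built in Step 2 follows a common pattern: read the current page, click "Next Page", and repeat until there is no next page. Below is a minimal sketch of that logic in Python. It does not describe how the tool works internally; `iterate_pages`, `fake_fetch`, and the in-memory `FAKE_SITE` are hypothetical stand-ins for a real results page.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# A page fetcher returns (rows on this page, number of the next page or None).
Page = Tuple[List[str], Optional[int]]


def iterate_pages(fetch: Callable[[int], Page], start: int = 1,
                  max_pages: int = 100) -> Iterator[str]:
    """Yield rows page by page until there is no next page or max_pages is hit."""
    page: Optional[int] = start
    visited = 0
    while page is not None and visited < max_pages:
        rows, page = fetch(page)
        visited += 1
        yield from rows


# Stand-in for a real site: three in-memory pages of results.
FAKE_SITE = {1: ["a1", "a2"], 2: ["b1", "b2"], 3: ["c1"]}


def fake_fetch(page: int) -> Page:
    next_page = page + 1 if page + 1 in FAKE_SITE else None
    return FAKE_SITE[page], next_page


print(list(iterate_pages(fake_fetch)))
```

The `max_pages` cap is a safety choice in this sketch, so that a page whose "Next Page" button never disappears cannot keep the loop running forever.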
Step 3: Create a list loop and extract data.
1) Move the mouse and select the block of the first article in the page. The system identifies the child elements in this block. In the operation prompt box, select "Select Child Element".
Detailed use of WeChat article capture tool Step 7
2) Next, select the block of the second article in the page. Its child elements are selected automatically, and the other 10 groups of similar elements in the page are identified. In the operation prompt box, select "Select All".
Detailed use of the WeChat article capture tool Step 8
3) All the elements in the article blocks on the page are now selected and turn green, and a field preview table appears in the operation prompt box on the right. Hover over a column title and click the trash can icon to delete fields you do not need. When the fields are selected, choose "Collect the following data".
Detailed use of WeChat article capture tool Step 9
4) To also collect the URL of each article, one more field must be extracted. Click the link of the first article, then the link of the second article; the system automatically selects the whole group of article links in the page. In the operation prompt box on the right, select "Collect the following link addresses".
Detailed use of WeChat article capture tool Step 10
5) After the fields are selected, select each field to give it a custom name. When finished, click "Save and Start" in the upper left corner to start the collection task.
Detailed use of WeChat article capture tool Step 11
6) Select "Start Local Collection"
Detailed use of WeChat article capture tool Step 12
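The list loop in Step 3 amounts to this: find every repeated article block, then read the same child elements (title, account, link) from each block. Here is a rough sketch of that idea using only Python's standard `html.parser`. The markup in `SAMPLE_HTML`, the `article` and `account` class names, and the `ArticleListParser` class are all assumptions made for illustration; they are not the real page structure.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional

# Hypothetical markup: one <li class="article"> block per search result.
SAMPLE_HTML = """
<ul>
  <li class="article"><a href="https://example.com/a1">First title</a><span class="account">Account A</span></li>
  <li class="article"><a href="https://example.com/a2">Second title</a><span class="account">Account B</span></li>
</ul>
"""


class ArticleListParser(HTMLParser):
    """Collect one row per article block, reading the same children from each."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: List[Dict[str, str]] = []
        self._row: Optional[Dict[str, str]] = None  # block currently being read
        self._field: Optional[str] = None           # child element being read

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "li" and attr.get("class") == "article":
            self._row = {"title": "", "account": "", "url": ""}
        elif self._row is not None and tag == "a":
            self._row["url"] = attr.get("href") or ""
            self._field = "title"
        elif self._row is not None and tag == "span" and attr.get("class") == "account":
            self._field = "account"

    def handle_data(self, data):
        if self._row is not None and self._field is not None:
            self._row[self._field] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("a", "span"):
            self._field = None
        elif tag == "li" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


parser = ArticleListParser()
parser.feed(SAMPLE_HTML)
for row in parser.rows:
    print(row)
```

Selecting the first and second article in the tool plays the same role as the `li` check here: it tells the collector which repeated block defines one row.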
Step 4: Data collection and export
1) When the collection is completed, a prompt appears. Select "Export Data" and choose an appropriate export method to export the collected Sogou WeChat article data.
Detailed use of WeChat article capture tool Step 13
2) Here we choose Excel as the export format. The exported data is shown below.
Detailed use of WeChat article capture tool Step 14
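If the exported rows are later handled in a script, a similar table can be produced with Python's standard `csv` module. Note that this sketch writes CSV, which Excel can open, and not a native Excel workbook as the tool's own export does. The column names in `FIELDS` and the sample row are illustrative assumptions.

```python
import csv
import io
from typing import Dict, List

FIELDS = ["title", "account", "published_at", "url"]  # illustrative column names


def rows_to_csv(rows: List[Dict[str, str]]) -> str:
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO(newline="")  # let the csv module control line endings
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder row; the comma in the title shows that quoting is handled.
rows = [
    {"title": "Example, with comma", "account": "Account A",
     "published_at": "2018-01-01", "url": "https://example.com/a1"},
]
csv_text = rows_to_csv(rows)
print(csv_text)
```

When saving this text to disk, opening the file with `encoding="utf-8-sig"` and `newline=""` generally helps Excel display non-ASCII titles correctly.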
Note: The Sogou WeChat article URLs collected by this method are time-limited and stop working after a while. This is due to restrictions imposed by Sogou WeChat itself.
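Because the collected links expire, it can help to record when each link was collected and to flag old ones before reusing them. The sketch below shows one way to do that. The lifetime value is purely an assumption: this tutorial does not state how long a link stays valid, so `ASSUMED_LINK_LIFETIME` is a placeholder to adjust from your own observations, and `is_probably_stale` is a hypothetical helper.

```python
from datetime import datetime, timedelta

# ASSUMPTION: the real lifetime of a link is not stated in this tutorial;
# this placeholder value should be tuned by observation.
ASSUMED_LINK_LIFETIME = timedelta(hours=6)


def is_probably_stale(collected_at: datetime, now: datetime,
                      lifetime: timedelta = ASSUMED_LINK_LIFETIME) -> bool:
    """Return True when more than `lifetime` has passed since collection."""
    return now - collected_at > lifetime


collected = datetime(2018, 1, 1, 9, 0)
print(is_probably_stale(collected, datetime(2018, 1, 1, 10, 0)))  # one hour later
print(is_probably_stale(collected, datetime(2018, 1, 2, 9, 0)))   # one day later
```

A stale flag only suggests that the article should be collected again; it does not prove that a particular link has actually stopped working.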
Related collection tutorials:
JD.com product information collection
Sina Weibo data collection
58 City Information Collection
Octopus: a web data collector chosen by 700,000 users.
1. Simple to operate, usable by anyone: no technical background is needed to start collecting. The process is fully visual and done with mouse clicks, so you can get started in 2 minutes.
2. Powerful, works on any website: pages that involve clicking, logging in, turning pages, verification codes, waterfall-flow layouts, or data loaded asynchronously by Ajax scripts can all be collected with simple settings.
3. Cloud collection, works even when your computer is off: once a collection task is configured, you can shut down and let the task run in the cloud. A large collection cluster runs 24/7, so you need not worry about IP blocking or network interruptions.
4. Free features plus value-added services, chosen as needed: the free version is fully functional and meets basic collection needs, while value-added services (such as private cloud) serve high-end paying enterprise users.