I. Your own data sources
The volume of data grows every day, and more and more of it is stored in databases. To pull data from your own sources, you need to be able to query it with SQL.
Apart from the statements for inserting, deleting and updating rows, the most commonly used keywords revolve around SELECT: WHERE, FROM, GROUP BY, ORDER BY, HAVING, LIKE, SUM, AS, DISTINCT, JOIN and LIMIT.
Note also that SQL dialects differ between database systems, so a query that runs on one database may need adjusting for another. For example, MySQL, PostgreSQL and SQLite support LIMIT, while SQL Server uses TOP instead.
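The keywords listed above can be seen working together in a small sketch. It uses Python's built-in sqlite3 module with an in-memory database; the orders table, its columns and the sample rows are all hypothetical and exist only for illustration. JOIN is left out to keep the example short.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory SQLite database with a made-up 'orders' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, customer TEXT, city TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer, city, amount) VALUES (?, ?, ?)",
        [
            ("Alice", "Beijing", 120.0),
            ("Alice", "Beijing", 80.0),
            ("Bob", "Shanghai", 200.0),
            ("Brenda", "Shanghai", 50.0),
            ("Carl", "Shenzhen", 30.0),
        ],
    )
    return conn


def total_by_city(conn):
    """Sum order amounts per city for customers whose name starts with A or B."""
    query = """
        SELECT city, SUM(amount) AS total_amount   -- SUM and AS
        FROM orders
        WHERE customer LIKE 'A%' OR customer LIKE 'B%'
        GROUP BY city
        HAVING SUM(amount) >= 100                  -- filter on the aggregate
        ORDER BY total_amount DESC
        LIMIT 2
    """
    return conn.execute(query).fetchall()


def distinct_cities(conn):
    """Return each city once, in alphabetical order."""
    rows = conn.execute("SELECT DISTINCT city FROM orders ORDER BY city")
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_demo_db()
    print(total_by_city(conn))
    print(distinct_cities(conn))
```

The query is written for SQLite; on another database the LIMIT clause in particular may need a different form.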
II. External data sources
1) Web-crawled data
If you know Python, you can scrape data from websites, such as product reviews on JD.com or user reviews on Dazhong Dianping.
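As a rough sketch of the parsing half of a scraper, the example below pulls review text out of an HTML snippet using only Python's standard library. The markup, the "review" class name and the sample reviews are assumptions made up for illustration; a real site such as JD.com will have a different page structure, and many pages load reviews dynamically, so this is not a working scraper for any particular site.

```python
from html.parser import HTMLParser


class ReviewParser(HTMLParser):
    """Collect the text of every <p class="review"> element."""

    def __init__(self):
        super().__init__()
        self.reviews = []
        self._in_review = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "p" and ("class", "review") in attrs:
            self._in_review = True
            self._buffer = []

    def handle_data(self, data):
        # Keep text only while inside a review paragraph.
        if self._in_review:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._in_review:
            self.reviews.append("".join(self._buffer).strip())
            self._in_review = False


def extract_reviews(html):
    """Return the review texts found in an HTML string."""
    parser = ReviewParser()
    parser.feed(html)
    parser.close()
    return parser.reviews


# Hypothetical page content standing in for a downloaded web page.
SAMPLE_HTML = """
<html><body>
  <p class="review">Fast delivery, works well.</p>
  <p class="note">Not a review.</p>
  <p class="review">Battery life is <b>shorter</b> than advertised.</p>
</body></html>
"""

if __name__ == "__main__":
    for text in extract_reviews(SAMPLE_HTML):
        print(text)
```

In practice the HTML string would come from a downloaded page, for instance via urllib.request.urlopen, rather than from a constant in the script.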
2) National Bureau of Statistics data
National-level data covering all aspects of China's economy and people's livelihood, which can be queried by month, quarter or year.
3) Baidu Index data
A Baidu product that shows how much attention a keyword receives over a given period. It is typically used for trend analysis, audience insight and so on. There are also other search index products, such as Sogou Index and 360 Index.
4) Tencent TBI index
A Tencent product that helps surface trending topics on the Internet and understand overall industry trends and audience characteristics.
5) Ali Index
An Alibaba product built on Alibaba's own transaction data from platforms such as Tmall and Taobao. It is one of the more authoritative big data platforms in China.
Other big data products of this kind include iQiyi Index and WeChat Index.