Data analysis or web backend: which should you choose? (Zhihu)
The differences between the "front end" and the "back end" in web development are as follows:

First, the web front end:

1) Proficient in HTML; able to write HTML with sound semantics and a clear, maintainable structure.

2) Proficient in CSS; able to reproduce visual designs faithfully while staying compatible with the mainstream browsers recognized by the industry.

3) Familiar with JavaScript; understands the basics of ECMAScript and has mastered one or two JS frameworks, such as jQuery.

4) Clearly understands common browser compatibility problems and has reliable solutions for them.

5) Cares about performance; understands Yahoo's performance optimization recommendations and applies them effectively in projects.

Second, the web back end:

1) Proficient in JSP, Servlet, JavaBean, JMS, EJB, JDBC and Flex development, or very familiar with related tools, class libraries and frameworks such as Velocity, Spring, Hibernate, iBatis and OSGi, with a deep understanding of web development patterns.

2) Skilled in using common database systems such as Oracle, SQL Server and MySQL, with strong database design ability.

3) Familiar with the Maven project configuration management tool and with application servers such as Tomcat and JBoss; experience in load tuning under high concurrency is preferred.

4) Proficient in object-oriented analysis and design techniques, including design patterns and UML modeling.

5) Familiar with network programming; has the experience and ability to design and develop external API interfaces, including cross-platform API specification design and efficient API call design.

-

The difference between data analysis and data mining (Zhihu)

1. Data analysis focuses on observing data, while data mining focuses on discovering knowledge rules from data, i.e. KDD (Knowledge Discovery in Databases).

2. The conclusions of data analysis are the result of human intellectual activity, while the conclusions of data mining are knowledge rules that a machine discovers from a learning set (also called a training set or sample set).

3. Applying the conclusions of data analysis is again a human intellectual activity, while the knowledge rules discovered by data mining can be applied directly to prediction.

4. Data analysis does not build a mathematical model by itself; the model has to be built manually, whereas data mining produces the model directly. Traditional cybernetic modeling, for example, essentially describes the functional relationship between input variables and output variables; data mining establishes that input-output relationship through machine learning, so that, given a set of input parameters, the rules obtained from KDD yield a set of outputs.
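As a rough illustration of the last point, the sketch below lets a program derive an input-output relationship from a training set and then apply it directly to prediction. It uses ordinary least squares on a single input variable, chosen here only as the simplest possible stand-in for "machine learning"; the function names `fit_line` and `predict` and the sample values are hypothetical.

```python
def fit_line(samples):
    """Learn slope and intercept from (input, output) pairs by least squares."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned rule to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training set (learning set): outputs follow y = 2x + 1.
training_set = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]

model = fit_line(training_set)

# The learned rule is applied directly to inputs it has not seen.
new_inputs = [5.0, 6.0]
predictions = [predict(model, x) for x in new_inputs]
print(model, predictions)
```

Nobody wrote the formula `2x + 1` into the program; it was recovered from the samples, which is the distinction the list above draws between manual modeling and mining.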

-

How to collect back-end data for data analysis

Data collection is generally divided into page data collection and API data collection. Collection is usually done in Python, and the subsequent analysis is usually built on Python frameworks as well. A lot of free, already-collected source data is also available for direct download, which is worth a look if you are interested.
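The two collection styles can be sketched with the Python standard library alone. In the example below, the HTML snippet and the JSON payload are hypothetical stand-ins for content that would normally be fetched over the network (for instance with `urllib.request`); the field names `items`, `id` and `views` are assumptions made for illustration.

```python
import json
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Page data collection: pull every link target out of an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def collect_from_page(html_text):
    parser = LinkCollector()
    parser.feed(html_text)
    return parser.links


def collect_from_api(json_text):
    """API data collection: the response is already structured, so just decode it."""
    payload = json.loads(json_text)
    return [(item["id"], item["views"]) for item in payload["items"]]


# Hypothetical stand-ins for a downloaded page and an API response.
sample_page = '<html><body><a href="/post/1">One</a><a href="/post/2">Two</a></body></html>'
sample_api_response = '{"items": [{"id": 1, "views": 120}, {"id": 2, "views": 45}]}'

page_links = collect_from_page(sample_page)
api_rows = collect_from_api(sample_api_response)
print(page_links, api_rows)
```

Page collection has to dig the data out of markup, while API collection receives it in a structured form, which is why an API is usually the easier source when one exists.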

Data analysis and SQL: which book?

1. Basic statistics: mean, median, mode, percentiles, extreme values, etc.

2. Other descriptive statistics: skewness, variance, standard deviation, significance, etc.

3. Other statistical knowledge: population and sample, parameters and statistics, error bars.

4. Probability distributions and hypothesis testing: the common distributions and the hypothesis testing procedure.

5. Other knowledge of probability theory: conditional probability, Bayes' theorem, etc.
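Several of the quantities listed above can be tried out with Python's built-in `statistics` module. The sketch below is only an illustration: the sample values and the probabilities passed to the hypothetical helper `bayes_posterior` are made up.

```python
import statistics

# Hypothetical sample of eight observations.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(sample)
median = statistics.median(sample)
mode = statistics.mode(sample)
extremes = (min(sample), max(sample))

# Population vs. sample: pstdev divides by n, variance divides by n - 1.
population_stdev = statistics.pstdev(sample)
sample_variance = statistics.variance(sample)

# Quartiles are the 25th, 50th and 75th percentiles.
quartiles = statistics.quantiles(sample, n=4)


def bayes_posterior(prior, likelihood, false_positive_rate):
    """P(A | B) from P(A), P(B | A) and P(B | not A), using Bayes' theorem."""
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Made-up numbers: a condition with 1% prevalence, a test that detects it
# 90% of the time and raises a false alarm 5% of the time.
posterior = bayes_posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)

print(mean, median, mode, extremes, population_stdev, sample_variance, quartiles)
print(posterior)
```

With these made-up numbers the posterior is only about 15%, a typical example of why conditional probability is worth learning before interpreting test results.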

What is the business data analysis major? (Zhihu)

You can enter your GPA, major and other information into a study-abroad application reference system. The system automatically matches cases of students with a similar background from its database, showing which institutions and majors they were admitted to.

In this way, you can see which level of institutions and majors is within reach at your current level, and position yourself accurately.

Which data analysis app is better?

There are many apps for data analysis, covering statistics, analysis and testing. You can browse them on the Prophet app, which offers many functions.

Which is better for data analysis, Python or R?

In 2012 we said that R was the mainstream in academia, but Python is now slowly replacing R there. I don't know whether this is because the era of big data has arrived.

Python is faster than R, and Python can process gigabytes of data directly, while R cannot. To analyze big data, R first needs the database to reduce it to small data (through GROUP BY) and only then receives it for analysis, so R cannot analyze the detailed behavior records directly, only the aggregated statistics. That is why some people say Python = R + SQL/Hive, which makes sense.
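The "Python = R + SQL/Hive" remark can be illustrated by doing both steps in one script: reduce the detailed records with GROUP BY, then run statistics on the reduced result. The sketch below uses the standard-library `sqlite3` module as a small stand-in for a real database or Hive; the table name `events`, its columns and the sample rows are all hypothetical.

```python
import sqlite3
import statistics

# An in-memory SQLite database stands in for the production data store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, amount REAL)")

# Hypothetical detailed behavior records: one row per user action.
rows = [(1, 10.0), (1, 20.0), (2, 5.0), (3, 7.5), (3, 2.5), (3, 5.0)]
conn.executemany("INSERT INTO events VALUES (?, ?)", rows)

# The "SQL" half: reduce the detailed records to one row per user.
per_user = conn.execute(
    "SELECT user_id, COUNT(*), SUM(amount) "
    "FROM events GROUP BY user_id ORDER BY user_id"
).fetchall()
conn.close()

# The "R" half: statistics on the aggregated result.
totals = [total for _, _, total in per_user]
mean_total = statistics.mean(totals)
stdev_total = statistics.stdev(totals)

print(per_user)
print(mean_total, stdev_total)
```

Because the aggregation and the statistics live in the same program, the detailed records never have to be handed over to a second tool.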

One of Python's most obvious advantages is its role as a glue language, which many books also mention. Algorithms written in C and wrapped in a Python package are very efficient. (With the decision tree in Orange Canvas, Python's data mining package, analyzing 500,000 users takes 10 seconds to return a result, whereas R had not finished after several hours and had filled 8 GB of memory.) However, nothing is absolute: if vectorized programming in R is done well (which is somewhat difficult), both the speed of R and the length of the program improve significantly.

The advantage of R is that all kinds of statistical functions are available to call. In time series analysis especially, both classical and cutting-edge methods have corresponding packages that can be used directly.

Python used to be weak in this respect, but now it has pandas. pandas provides a set of standard time series tools and data algorithms, so you can efficiently handle very large time series and easily slice and dice, aggregate, and resample regular or irregular time series. As you may have guessed, most of these tools are particularly useful for financial and economic data, but you can also use them to analyze server log data. So in recent years, thanks to its constantly improving libraries (mainly pandas), Python has become an excellent alternative for data processing tasks.
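A minimal sketch of the time series handling mentioned above, assuming pandas is installed. The timestamps mimic irregular server log entries and, like the values and variable names, are hypothetical.

```python
import pandas as pd

# Hypothetical irregular series: number of hits recorded at arbitrary moments.
timestamps = pd.to_datetime(
    [
        "2024-01-01 09:15",
        "2024-01-01 17:40",
        "2024-01-03 08:05",
        "2024-01-04 23:59",
    ]
)
hits = pd.Series([1, 2, 3, 4], index=timestamps)

# Resample the irregular series onto a regular daily grid and aggregate.
# Days without any entry (here 2024-01-02) get a sum of 0.
daily_hits = hits.resample("D").sum()

# Slice by date: everything from 2024-01-03 onward.
recent_hits = hits.loc["2024-01-03":]

print(daily_hits)
print(recent_hits)
```

Resampling is what turns raw log timestamps into evenly spaced daily or hourly counts that can then be compared or plotted.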

I have done several experiments:

1. I implemented a statistical method in Python, using ctypes and multiprocess.

Later a project needed a method for comparison, so I used R again and found that some packages on Bioconductor already run in parallel by default. (That package was still very slow, though, and it grabbed all the threads at once, leaving the whole computer unusable; even browsing the web became a struggle.)

2. I did some data wrangling with pandas, much like working with a database: looking things up back and forth to match two or three tables. It felt very convenient. R can do this work too, but I expect it would be slower; after all, there were hundreds of thousands of rows.

3. I drew plots with Python's matplotlib. pyplot's approach is very different from R's: in R each command draws a little of the figure, whereas in pyplot everything is prepared first and then rendered together. Color selection in pyplot is a bit awkward: there are few default colors, and although more colors can be used, their names are too long. pyplot's legend is much better than R's, since it is semi-automatic. A pyplot figure can also be zoomed freely after drawing and then saved as an image, which is better than R.
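The table matching described in the second experiment can be sketched with `DataFrame.merge`, assuming pandas is installed. The two tables, their column names and their contents are hypothetical and far smaller than the hundreds of thousands of rows mentioned.

```python
import pandas as pd

# Hypothetical tables that share the key column user_id.
users = pd.DataFrame({"user_id": [1, 2, 3], "name": ["Ann", "Bob", "Cid"]})
orders = pd.DataFrame(
    {
        "order_id": [10, 11, 12],
        "user_id": [1, 1, 3],
        "amount": [9.5, 20.0, 7.0],
    }
)

# Match each order to its user, much like a SQL LEFT JOIN on user_id.
matched = orders.merge(users, on="user_id", how="left")

# Check in the other direction: which users have no order at all?
users_without_orders = users[~users["user_id"].isin(orders["user_id"])]

print(matched)
print(users_without_orders)
```

A third table can be matched by chaining another `merge` call on the result.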

Generally speaking, Python is a fairly balanced language that can be used across the board. Whether for calling other languages, connecting to and reading data sources, operating on the system, or handling regular expressions and text processing, Python has clear advantages.

R is more prominent in statistics. But data analysis is not just statistics: the data collection, data processing, data sampling and data clustering that come first, as well as more complex data mining algorithms and data modeling, are all part of it.

For these tasks, once the data exceeds about 100 MB, R struggles to cope, whereas Python is basically up to the job.

Combined with its great strength in general-purpose programming, Python alone is enough as the language for building data-centric applications.

But there is no best software or program in the world, and few people can push a single language to its limits. In particular, many people learned R earlier and now do not use it at all, so for those who want to put what they learn into practice, Python is the better choice.

Data analysis or big data: which is the bigger platform? A big data training provider answers:

1. Big data:

It refers to data sets that conventional software tools cannot capture, manage and process within an acceptable time. It is a massive, fast-growing and diverse information asset that requires new processing models in order to deliver stronger decision-making, insight discovery and process optimization capabilities.

In Big Data (published in Chinese as The Age of Big Data), co-authored by Viktor Mayer-Schönberger and Kenneth Cukier, big data means analyzing all of the data rather than taking the shortcut of random analysis (sample surveys). The 5V characteristics of big data (proposed by IBM) are Volume (mass), Velocity (high speed), Variety (diversity), Value, and Veracity (authenticity).

2. Data analysis:

It refers to the process of analyzing large amounts of collected data with appropriate statistical methods, extracting useful information, forming conclusions, and studying and summarizing the data in detail. This process is also a supporting process of a quality management system. In practice, data analysis helps people make judgments so that they can take appropriate action.

The mathematical foundation of data analysis was established in the early 20th century, but it was not until the appearance of computers that practical operation became possible and data analysis was popularized. Data analysis is the product of the combination of mathematics and computer science.

A beginner wants to change careers: web front end or data analysis?

With the rapid development of the Internet, the software industry keeps growing in popularity. Almost all high-paying jobs are linked to the software industry, which has become a symbol of high salaries. Web front-end development, a very popular area of software development in recent years, is sought after and favored by many people. Since the web front end is so hot, the prospects for learning it are naturally promising.

As long as you study hard, the future will not be bad. Learning web front-end development well usually takes about 2 weeks. It is best to visit training providers in person according to your actual needs and then choose the one that suits you. I hope this helps.

Do data analysts need to learn Hadoop? (Zhihu)

The Hadoop ecosystem is an important part of big data development and analysis, so it does need to be studied.