1.1 Single search engine (independent search engine). It retrieves information only from the search engine's own database; Yahoo is an example.
1.2 Meta search engine. It performs retrieval by calling other independent search engines, and it can process the results returned by those engines to varying degrees, for example by deleting duplicate results, checking links, and ranking the results by relevance. A meta search engine may or may not have its own database. Because different meta search engines link to different independent search engines, and the query syntax of those engines varies widely, a meta search engine itself supports only simple operators such as AND, OR, and NOT. The returned results can therefore only meet the "lowest common denominator"; that is, the precision of the search results cannot be improved.
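The result handling described above (merging, deleting duplicates, ranking by relevance) can be sketched as follows. This is a minimal illustration, assuming each independent engine returns a list of (URL, relevance score) pairs; the engine names, URLs, scores, and the helper `merge_results` are all made up for the example and do not reflect any real meta search engine.

```python
from typing import Dict, List, Tuple

# Hypothetical results from two independent engines: (url, relevance score).
ENGINE_RESULTS: Dict[str, List[Tuple[str, float]]] = {
    "engine_a": [("http://example.com/a", 0.9), ("http://example.com/b", 0.4)],
    "engine_b": [("http://example.com/b", 0.7), ("http://example.com/c", 0.5)],
}


def merge_results(results_by_engine):
    """Merge result lists, drop duplicate URLs, and rank by best score."""
    best_score = {}
    for results in results_by_engine.values():
        for url, score in results:
            # Keep the highest score seen for a URL; this removes duplicates.
            if url not in best_score or score > best_score[url]:
                best_score[url] = score
    # Sort by score (highest first), then by URL for a stable order.
    return sorted(best_score.items(), key=lambda item: (-item[1], item[0]))


merged = merge_results(ENGINE_RESULTS)
```

Keeping the highest score per URL is only one possible policy; a real system would also have to reconcile scores that different engines compute on different scales.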
1.3 Network search engine (network search software). Network users download the search software to their local computers, install it, and run queries from it. It is an offline browser with a network query function. Compared with a meta search engine, it gives flexible control over the output; its main advantages are that it is convenient to use and can quickly query network resources.
2 Working principle and basic composition of network search engines
When a user retrieves information, the search engine looks up the matching information in its index database according to the user's query and returns it in an order determined by some algorithm. To keep the results accurate and fresh, an independent search engine must build and maintain a huge database. The information in its index database is gathered by a program called a spider, which crawls the Internet at regular intervals. The spider visits every site in the publicly accessible part of the network and collects its information resources; indexing software then indexes the collected information automatically to create a web page index database that can be queried by keyword; and the search software answers user queries from that index database. A general search engine therefore consists of three main parts: the web spider, the index, and the search software.
Web spider. This is a very powerful program that regularly checks web pages, starting from preset addresses. If a page has changed, it fetches the page again; otherwise it moves on, following the links in the page. A spider's visits to web pages amount to a traversal of the information on the Internet. To ensure broad coverage, a spider is usually given a set of important links in advance as starting points. During the traversal it keeps recording the links found in each page and continues until all recorded links have been visited.
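The traversal just described can be sketched on a small in-memory link graph. This is an illustration under stated assumptions: the page names and the `LINKS` table are invented, no network access takes place, and breadth-first order is simply one common choice, since the text does not say which order a spider uses.

```python
from collections import deque

# Hypothetical link graph standing in for the web: page -> links on that page.
LINKS = {
    "seed": ["a", "b"],
    "a": ["b", "c"],
    "b": ["seed"],
    "c": [],
}


def crawl(seeds, links):
    """Visit pages breadth-first from the seeds until no unvisited link remains."""
    visited = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        visited.append(page)  # a real spider would fetch and store the page here
        for target in links.get(page, []):
            if target not in seen:  # record each link only once
                seen.add(target)
                queue.append(target)
    return visited


order = crawl(["seed"], LINKS)
```

The `seen` set is what stops the spider from looping forever when pages link back to each other, as "b" does to "seed" here.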
Index software. The web spider stores the pages it collects during traversal in the database. To make retrieval efficient, an index must be built over them. The index is generally an inverted index, which maps each word to the pages that contain it.
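A minimal sketch of an inverted index, assuming pages are plain strings split on whitespace. The sample pages and the helper name `build_inverted_index` are invented for illustration; real indexing software also handles punctuation, word forms, and positions.

```python
from collections import defaultdict

# Hypothetical crawled pages: document ID -> text.
PAGES = {
    1: "search engine index",
    2: "web spider crawls web pages",
    3: "inverted index for search",
}


def build_inverted_index(pages):
    """Map each word to the sorted list of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


index = build_inverted_index(PAGES)
```

Looking up a keyword is then a single dictionary access, for example `index["search"]`, instead of a scan over every stored page.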
Search software. This software screens the very large number of pages in the index database, selects those that meet the user's query, and sorts them. The ranked results are then displayed to the user.
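The select-and-sort step can be sketched as below. The scoring rule (counting how often the query terms occur in a page) is only an illustrative stand-in, since the text does not say which ranking algorithm is used; the sample pages and the function name `search` are likewise assumptions.

```python
from collections import Counter

# Hypothetical indexed pages: document ID -> text.
PAGES = {
    1: "search engine index",
    2: "web spider crawls web pages",
    3: "web search engine ranks web search results",
}


def search(query, pages):
    """Return IDs of pages matching any query term, best match first."""
    terms = query.lower().split()
    scores = {}
    for doc_id, text in pages.items():
        counts = Counter(text.lower().split())
        score = sum(counts[term] for term in terms)
        if score > 0:  # keep only pages that meet the query
            scores[doc_id] = score
    # Highest score first; ties are broken by document ID.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


ranked = search("web search", PAGES)
```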
3 Main performance evaluation indicators of search engines
3.1 Search engine indexing method. The indexes in the database are generally stored as inverted files, and different search engines make different choices when building them. Some search engines build full-text indexes of pages; others index only a summary or the beginning of each paragraph. Some search engines, such as Google, also take into account the different meanings carried by different hypertext tags when building indexes: text displayed in bold or in a large font is often more important, and the text placed in an anchor is often a summary of the page it points to, so it counts as important information about that page. Google and Infoseek also collect the hyperlinks in pages during indexing. These hyperlinks reflect the link structure of the collected information, and using them can improve the accuracy of page relevance judgments. Because the indexes differ, the same query returns different results on different engines.
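The idea of weighting words by their tags and crediting anchor text to the page it points to can be sketched with Python's standard `html.parser` module. The weights in `TAG_WEIGHTS`, the sample HTML, the URLs, and the class `WeightedIndexer` are all assumptions made for illustration; the text only says that bold or large text and anchor text matter more, not by how much, and this is not how any particular engine does it.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Assumed weights: the text only says bold or large text is more important.
TAG_WEIGHTS = {"b": 3, "h1": 5}


class WeightedIndexer(HTMLParser):
    """Score the words of one page by enclosing tag; credit anchor text to its target."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.open_tags = []  # stack of currently open tags
        self.href = None     # target of the <a> element we are inside, if any
        self.scores = defaultdict(lambda: defaultdict(int))  # url -> word -> score

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)
        if tag == "a":
            self.href = dict(attrs).get("href")

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        if tag == "a":
            self.href = None

    def handle_data(self, data):
        # Use the heaviest enclosing tag; plain text gets weight 1.
        weight = max((TAG_WEIGHTS.get(t, 1) for t in self.open_tags), default=1)
        # Anchor text is credited to the linked page, other text to this page.
        target = self.href or self.page_url
        for word in data.lower().split():
            self.scores[target][word] += weight


HTML = ('<h1>Search</h1><p>A <b>fast</b> engine '
        '<a href="http://example.com/spider">web spider</a></p>')
indexer = WeightedIndexer("http://example.com/page")
indexer.feed(HTML)
```

After parsing, `indexer.scores` holds one word-score table per URL, including a table for the linked page that was built purely from the anchor text.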
3.2 Search functions of search engines. The number of search functions a search engine supports and the quality of their implementation directly determine the quality of retrieval. Therefore, in addition to supporting basic retrieval functions such as Boolean retrieval, proximity retrieval, word truncation retrieval, and field retrieval, network retrieval tools should also be based on online information resources.
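Two of the basic functions named above, Boolean retrieval and word truncation retrieval, can be sketched over a toy inverted index. The index contents and the function names are invented for illustration; set operations are one straightforward way to express AND, OR, and NOT, not a description of any particular engine.

```python
# Hypothetical inverted index: word -> set of document IDs.
INDEX = {
    "search": {1, 3},
    "engine": {1, 2},
    "engines": {4},
    "spider": {2, 3},
}
ALL_DOCS = {1, 2, 3, 4}


def boolean_and(a, b):
    """Documents that contain both words."""
    return INDEX.get(a, set()) & INDEX.get(b, set())


def boolean_or(a, b):
    """Documents that contain at least one of the words."""
    return INDEX.get(a, set()) | INDEX.get(b, set())


def boolean_not(a):
    """Documents that do not contain the word."""
    return ALL_DOCS - INDEX.get(a, set())


def truncation(prefix):
    """Documents that contain any word starting with the prefix."""
    docs = set()
    for word, ids in INDEX.items():
        if word.startswith(prefix):
            docs |= ids
    return docs
```

With this index, `truncation("engin")` matches both "engine" and "engines", which is the point of truncation retrieval: one query stem covers several word forms.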
Question 2: What are the requirements for retrieval tools? A retrieval tool indexes the external features of documents (such as name, author, source, and publication date) and their internal features (such as subject and topic), and organizes all these features systematically in a fixed order, so that it points readers to the location of a document and gives them a means of searching for and finding the documents they need. It can take the form of a book, cards, film, or tape.
Generally speaking, retrieval tools must meet the following conditions:
(1) It describes the external features and content features of the collected literature in detail (including title, author, subject, classification number, abstract, source, and other items).
(2) Each entry is marked with retrieval identifiers, such as classification number, title, keywords, subject terms, document serial number, code, and web address.
(3) All entries are systematically organized into an organic whole. The arrangement of the retrieval tool as a whole should be clearly defined, detailed, and cross-referenced.
(4) It offers the necessary variety of retrieval methods, that is, a classified index, subject index, author index, number index, and other indexes, so that readers can search more conveniently.
(5) It has a clear scope and explains the nature of the tool.
(6) Retrieval should be fast and the retrieval results should be highly accurate.
(7) It has error correction and recommendation functions: it actively prompts the user when a search term is entered incorrectly, and it recommends information related to the search topic.
Question 3: What are the functions of Baidu Advanced Search? Give examples to illustrate its use. I don't know which one you are talking about.
Baidu News advanced search
The search tools in Baidu web search can filter by time, page type, and a specified domain name.
Question 4: What are the main functions of search engines?
Question 5: What are the basic retrieval functions of CNKI? The basic retrieval functions of CNKI (China National Knowledge Infrastructure) include primary retrieval and secondary retrieval, as follows:
1. Primary retrieval
Navigation retrieval: users do not need to enter any search terms; they only choose the column names they are interested in and can directly find articles on the required topics.
Title retrieval: retrieve articles whose titles contain the search terms.
Author retrieval: retrieve articles published by a given author.
Keyword retrieval: retrieve articles whose keywords contain the search terms.
Institution retrieval: enter the name of an institution to retrieve articles published by authors at that institution.
Chinese abstract retrieval: retrieve articles whose Chinese abstracts contain the search terms.
Chinese journal title retrieval: retrieve articles published in a given journal.
Year retrieval: retrieve articles from a given year.
Issue retrieval: retrieve articles from a given issue.
Full-text retrieval: retrieve articles in which the search terms appear anywhere in the full text.
2. Secondary retrieval
For the results of any retrieval method in 1, you can enter new search terms and keep narrowing the search step by step within that result set.
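Secondary retrieval, that is, searching again within an earlier result set, can be sketched as follows. The article records, field names, and the helper `retrieve` are invented for illustration and are not CNKI's actual data model or interface.

```python
# Hypothetical article records; the field names are illustrative only.
ARTICLES = [
    {"title": "Web search engines", "author": "Li", "year": 2004},
    {"title": "Meta search engines", "author": "Wang", "year": 2005},
    {"title": "Library catalogs", "author": "Li", "year": 2005},
]


def retrieve(records, field, term):
    """Return the records whose given field contains the search term."""
    term = str(term).lower()
    return [r for r in records if term in str(r[field]).lower()]


primary = retrieve(ARTICLES, "title", "search")  # primary retrieval by title
secondary = retrieve(primary, "year", 2005)      # refine within those results
```

Because `retrieve` accepts any list of records, each refinement simply takes the previous result as its input, which is what makes repeated narrowing possible.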
Question 6: Which retrieval tools are better for searching the literature? Do you have any search tips? Search online directly with EndNote.
Keywords can generally be separated by spaces, and quotation marks can be used to search for an exact phrase. A "-" in front of a keyword excludes it. You can also use AND, OR, and so on for logical retrieval.
Some databases are searched by field: you fill in a search term for each field and select the required option from a drop-down menu.
The databases above cover almost everything, and other databases are basically unnecessary. Another trick: you can use Google Scholar to search for related literature. You can also go to the journal's own home page to search and download.
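The query conventions above (spaces between keywords, quotation marks for a phrase, "-" to exclude) can be sketched with a small parser. This is an illustration only: the function names are invented, space-separated terms are assumed to be combined with AND, and matching is plain substring matching, which is cruder than what real databases do.

```python
import re

# An optional leading "-", followed by either a quoted phrase or a bare word.
TOKEN = re.compile(r'(-?)(?:"([^"]+)"|(\S+))')


def parse_query(query):
    """Split a query into terms to include and terms to exclude."""
    include, exclude = [], []
    for minus, phrase, word in TOKEN.findall(query):
        term = (phrase or word).lower()
        (exclude if minus else include).append(term)
    return include, exclude


def matches(text, include, exclude):
    """True if the text has every included term and no excluded term."""
    text = text.lower()
    return all(t in text for t in include) and not any(t in text for t in exclude)


include, exclude = parse_query('"web spider" index -yahoo')
```

Here the quoted phrase stays together as one term, so a document must contain the words "web spider" next to each other rather than anywhere in the text.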
Question 7: What is the function of literature retrieval? Literature retrieval is the process of obtaining documents according to the needs of study and work.
Its functions are as follows:
Question 8: What are the components of a search engine and what are its functions? Any website has only two parts, and search engines are no exception:
the client (PC) side and the server side.
What you probably want to ask is how many parts there are on the server side:
1. Spider (crawler)
2. Database
3. Algorithm programs
The crawler and the database are very simple.
The crawler is only responsible for fetching selected pages.
The database is only responsible for storing the fetched pages.
The algorithms are more complicated.
Baidu, for example, probably has more than 300 algorithms, large and small.
The main algorithms fall into the following groups:
link algorithms, content algorithms, domain name algorithms, anti-cheating algorithms, and so on.
Among them, the link algorithms carry the most weight.
Question 9: What are the basic types of commonly used search engines? A search engine is a system that collects and organizes information resources on the Internet and then makes them available for querying. It consists of three parts: information collection, information organization, and user query.
A search engine is a website that provides an information "retrieval" service. It uses programs to classify the information on the Internet and helps people find the information they need in the vast network.
Early search engines collected the addresses of resource servers on the Internet, sorted the resources they provided into directories, and then classified them layer by layer. People looking for information could work down through the classification layer by layer until they reached their destination and found what they wanted. This is really the most primitive approach, and it works only when there is not much information on the Internet. As Internet information grew geometrically, true search engines appeared. These search engines start from the first page of each website, follow all the hyperlinks on the Internet, and put the words that represent the hyperlinks into a database. This is the embryonic form of today's search engines.
With the appearance of Yahoo!, the development of search engines entered a golden age, and their performance became better than before. Today's search engines do not just search for information on web pages; they have become more comprehensive and complete. Take the authoritative search engine Yahoo! as an example: since it was founded in March 1995 by the Chinese American Yang Zhiyuan (Jerry Yang) and others, Yahoo! has grown from a single search engine into a provider of many network services, such as e-commerce, news services, and free personal e-mail, which fully illustrates how search engines have developed from single-purpose to comprehensive.
However, because of the way search engines work and the rapid growth of the Internet, search results are increasingly unsatisfactory. For example, searching for the word "computer" may return millions of pages. This is because a search engine orders its results by their relevance to the query, and a site's relevance is computed by formulas from factors such as the position of the keywords in the site, the name of the site, and its tags. That is why search results are so numerous and mixed. And because the Internet keeps developing and changing, the database of a search engine inevitably contains dead links.
In this paper, we present Google, a prototype of a large-scale search engine that makes heavy use of the structure present in hypertext. Google is designed to crawl and index web pages efficiently and to produce query results that are better than those of other existing systems. The prototype, with a full-text and hyperlink database of at least 24,000,000 web pages, is available at google.stanford.edu/.
Designing a search engine is a challenging task. Search engines index hundreds of millions of web pages, which contain a very large number of distinct terms, and they answer tens of millions of queries every day. Although large search engines are very important on the Web, they have rarely been studied in academic circles. In addition, because of the rapid advance of technology and the growth in the number of web pages, building a search engine today is completely different from building one three years ago.
This paper describes our large-scale search engine in detail. As far as we know, it is the first published description in such detail. Besides the problems of scaling traditional search techniques to this many web pages, there are many new technical challenges, including the use of the additional information present in hypertext to improve search results.
This paper addresses the question of how to build a practical large-scale system that can exploit the additional information in hypertext. Anyone can publish anything on the Internet, and how to deal effectively with such unorganized hypertext is also a problem this paper considers.
Keywords: World Wide Web, search engine, information retrieval, PageRank, Google
1 Introduction
The Web has brought new challenges to information retrieval. The amount of information on the Web is growing rapidly, and so is the number of new users who are inexperienced in the art of web research. People like to surf the Web by following hyperlinks, usually starting from important web pages or from search engines such as Yahoo. Everyone thinks;
Question 10: What is the difference between retrieval reference books and reference-type reference books? A reference-type reference book is a handbook that collects knowledge and materials so that readers can look up explanations of words and terms. It provides specific, practical information directly, such as explanations of difficult words and technical terms, specific documents, geographical profiles, and background information; dictionaries are an example. Retrieval reference books are tools for storing and retrieving literature information, which people use to report and search the literature.
Retrieval reference books only provide clues, encyclopedias and bibliographies.