Keywords: multimedia, image, audio, function
Application of multimedia technology
Multimedia communication technology is among the most dynamic and fastest-growing areas of information technology today. It influences the pace of economic development and of scientific and technological progress, and it changes how people live and the quality of their lives. Multimedia communication is communication that carries several kinds of media information together. It transmits, receives and stores multimedia information over the various existing communication networks, and it draws on almost every field of information technology, including the integrated processing and application of data, audio and video. Its key technologies are the efficient transmission and the interactive processing of multimedia information.
Introduction
With the rapid development of science and technology, multimedia data such as images and video have gradually become the main form of information in the field of information processing. Multimedia communication is a key technology in building the information superhighway, and it is the product of the convergence of multimedia, communication, computer and network technologies. It will greatly improve people's work efficiency, change the way people learn, entertain themselves and live, and become a basic means of communication in the 21st century.
Chapter I A Brief Introduction to Multimedia Communication Technology
Basic concepts and characteristics of multimedia communication
1.1 Basic concepts
A medium is the carrier through which information is represented and transmitted, and it is a central concept here. ITU-T Recommendation I.374 divides media into five categories: perception media, representation media, presentation media, storage media and transmission media.
Multimedia data is data that carries information in several forms, such as text, graphics, images and sound. Its main characteristics are as follows:
(1) There are many kinds of multimedia data (mostly unstructured data), and the media from different sources have completely different forms and formats;
(2) The amount of multimedia data is huge;
(3) Multimedia data has time characteristics and version concept. For example, in a video-on-demand system, time synchronization between media and within media must be considered.
Multimedia data therefore differs from traditional numeric and character data: its storage structure and access methods are special, and so are its data structures and data model. This gave rise to a new kind of database system, the multimedia database system.
A multimedia database is a database system that can effectively store, read and retrieve multimedia data. Its main features are:
(1) It inherits some advantages of traditional databases, such as data independence, advanced querying through a database query language, concurrency control and fault tolerance.
(2) It can synchronize and manage data that have temporal and spatial relationships.
However, there is not yet a consensus on the functions of multimedia databases or on how to implement them, so media databases have appeared in various forms with different implementation methods. Viewed as a whole, the data models of multimedia databases fall into three categories: the relational data model, the object-oriented data model and the hypermedia data model.
The functions of a multimedia database management system (DBMS) vary greatly with the underlying data model. A multimedia DBMS based on the relational data model can usually store and retrieve multimedia data, but it does not handle the semantic, temporal and spatial relationships between multimedia data objects, so that work is left to the application. The object-oriented and hypermedia data models can handle the semantic, temporal and spatial relationships between multimedia data objects at a high level of abstraction, but a DBMS built on them is relatively complicated to implement.
Another term used frequently in multimedia communication systems is "hypermedia". Publications often contain notes, and a note lets the reader find a related paragraph or article. The link from a note to that paragraph or article is called a hyperlink. In the same way, hyperlinks can connect items in several different media, and the resulting collection is called hypermedia.
1.2 Characteristics of multimedia communication
The development of multimedia communication technology has broken the traditional pattern of communication systems built around a single medium and a single telecommunication service. It reflects a trend toward higher-level communication and answers people's expectations for future ways of working and living. Multimedia communication technology is a comprehensive technology that involves multimedia technology, computer technology, communication technology and other fields. A multimedia communication system must have three main characteristics: integration, interactivity and synchronization.
1.2.1 Integration
The integration of multimedia communication system refers to the ability to store, transmit, process and display content data information, multimedia and hypermedia information, script information and specific application information.
(1) Content data information
Information exists in a certain structural form. There are two typical structures: one is the object structure, in which the smallest unit that can be processed is an object; the other is the file structure, in which the smallest processing unit is a file.
(2) Multimedia and hypermedia information
Different from single media information, multimedia and hypermedia information are structured information, which consists of two parts: structural framework and content data. The minimum expression forms of multimedia and hypermedia information are divided into two categories, one is called object, and the other is called file.
(3) Script information
Script information is a set of specific structured multimedia and hypermedia information linked by semantic relations. It is necessary to provide the operation process of this set of multimedia information and its relationship with external processing modules.
(4) Specific application information
The above three types of information are all low-level information, which can be defined and expressed by standards. The specific application information is high-level information, which is closely related to the application and will vary greatly due to different application occasions. Its representation is based on the above three categories.
1.2.2 Interactivity
Interactivity refers to the ability of mutual control between people and systems in communication systems. In multimedia communication system, interactivity has two aspects. One is the man-machine interface, that is, the operation interface provided by the user terminal when people use the terminal of the system; The second is the application layer communication protocol between the user terminal and the system.
Users of multimedia communication terminals have complete interactive control ability over the whole communication process, which is the main feature of multimedia communication systems and the main standard to distinguish multimedia communication systems from non-multimedia communication systems.
1.2.3 Synchronization
Synchronization means that the images, sounds and text appearing on a multimedia communication terminal are presented in step with one another. Suppose a user wants to retrieve a segment about an important historical event: the moving or still images of the event are stored in an image database, while its text description and spoken narration are held in other databases. The multimedia communication terminal extracts the required information from the different databases through different transmission channels and synchronizes the images, sounds and text so that the user receives one complete piece of information.
Synchronization is one of the most important characteristics of a multimedia communication system, and whether the information is synchronized determines whether a system is a multimedia system or a non-multimedia system. Synchronization can be achieved at the link layer, the presentation layer and the application layer.
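The retrieval example above can be sketched in a few lines. This is only an illustration under stated assumptions: it supposes that every media unit carries a presentation timestamp and that each stream arrives already ordered in time. The names MediaUnit and synchronize are hypothetical and do not come from any standard.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class MediaUnit(NamedTuple):
    timestamp_ms: int  # presentation time of the unit (assumed field)
    stream: str        # "image", "sound" or "text"
    payload: str       # stand-in for the actual media data


def synchronize(*streams: Iterable[MediaUnit]) -> Iterator[MediaUnit]:
    """Merge streams that are each sorted by time into one presentation timeline."""
    return heapq.merge(*streams, key=lambda unit: unit.timestamp_ms)


# Three streams, as if fetched from different databases over different channels.
images = [MediaUnit(0, "image", "frame-0"), MediaUnit(40, "image", "frame-1")]
sounds = [MediaUnit(0, "sound", "chunk-0"), MediaUnit(20, "sound", "chunk-1")]
texts = [MediaUnit(30, "text", "caption-0")]

timeline = list(synchronize(images, sounds, texts))
for unit in timeline:
    print(unit.timestamp_ms, unit.stream, unit.payload)
```

A real terminal would also have to buffer units and compensate for different channel delays; the sketch only shows the ordering step.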
Chapter II Multimedia Audio Technology
Audio technology developed relatively early. Some of its technologies matured and were commercialized years ago and have even entered the home, digital audio being one example. Audio technology mainly includes audio digitization, speech processing, speech synthesis and speech recognition.
Audio digitization is now a mature technology. Multimedia sound cards are designed around it, and digital audio equipment has adopted it to replace the traditional analog approach with very good results. Audio sampling has two important parameters: the sampling frequency and the number of bits per sample. The sampling frequency is the number of times the sound is sampled per second. The upper limit of human hearing is about 20 kHz. Commonly used sampling frequencies are 11 kHz, 22 kHz and 44 kHz. The higher the sampling frequency, the better the sound quality and the larger the amount of data to be stored. CD recordings use a sampling frequency of 44.1 kHz, which gives the best listening quality in common use at present. The number of bits per sample is the range of values available to represent each sampling point. Three sizes are commonly used: 8 bits, 12 bits and 16 bits. The more bits per sample, the better the sound quality and the larger the amount of data to be stored. CD recordings use two channels of 16-bit samples at a sampling frequency of 44.1 kHz, which reaches a professional level.
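The relationship between sampling frequency, sample size and data volume can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the formula (sampling frequency times bits per sample times channels) is the usual one for uncompressed PCM audio, and the Nyquist check (sampling at least twice the highest frequency) is background knowledge rather than something stated in the text.

```python
def pcm_data_rate(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Return the uncompressed PCM data rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def satisfies_nyquist(sample_rate_hz: int, highest_frequency_hz: int) -> bool:
    """True if the sampling rate is at least twice the highest frequency."""
    return sample_rate_hz >= 2 * highest_frequency_hz


# CD audio as described in the text: 44.1 kHz, 16 bits, two channels.
cd_bits_per_second = pcm_data_rate(44_100, 16, 2)
cd_bytes_per_second = cd_bits_per_second // 8
cd_bytes_per_minute = cd_bytes_per_second * 60

print(cd_bits_per_second)   # 1411200
print(cd_bytes_per_second)  # 176400
print(cd_bytes_per_minute)  # 10584000

# 44.1 kHz covers the roughly 20 kHz limit of human hearing; 22 kHz does not.
print(satisfies_nyquist(44_100, 20_000))  # True
print(satisfies_nyquist(22_000, 20_000))  # False
```

One minute of CD-quality stereo therefore needs a little over 10 MB before compression, which is why the sampling parameters have such a direct effect on storage.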
Audio processing covers a wide range of topics, but the main focus is audio compression. The latest MPEG audio compression algorithms can compress sound by a factor of about 6. Speech synthesis means converting text into spoken language and playing it. Synthesis for several foreign languages has reached the practical stage, Chinese synthesis has made great progress in recent years, and experimental systems are running. Speech recognition is the most difficult and the most attractive part of audio technology. Although it is still at the stage of experimental research, its broad application prospects make it one of the research hotspots.
Chapter III Multimedia Image and Video Technology
3.1 Video technology
Although video technology has a relatively short history, its products are already widely used, and products that incorporate MPEG compression have begun to enter the home. Video technology includes video digitization and video coding.
Video digitization converts analog video signals into digital signals that a computer can process, by means of analog-to-digital conversion and color space transformation, so that the computer can display and process the video. There are currently two sampling formats: Y:U:V 4:1:1 and Y:U:V 4:2:2. The former is the main format used in early products. The Y:U:V 4:2:2 format doubles the sampling of the chroma signals, which noticeably improves the color, sharpness and stability of digital video, so it is the format of the next generation of products.
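The effect of the two sampling formats on data volume can be estimated as follows. This is a minimal sketch under stated assumptions: 8 bits per sample, the common J:a:b reading of the format (in each block 4 pixels wide and 2 rows high there are 8 luma samples and a + b samples for each of the two chroma channels), and a 720 x 576 frame chosen only as an example. The function names are hypothetical.

```python
def chroma_bytes_per_frame(width: int, height: int, a: int, b: int,
                           bits_per_sample: int = 8) -> int:
    """Bytes used by the two chroma channels (U and V) in one frame."""
    blocks = (width * height) // 8      # number of 4x2 pixel blocks
    chroma_samples = 2 * (a + b)        # U and V samples per block
    return blocks * chroma_samples * bits_per_sample // 8


def bytes_per_frame(width: int, height: int, a: int, b: int,
                    bits_per_sample: int = 8) -> int:
    """Total uncompressed bytes (luma plus chroma) in one frame."""
    luma_bytes = width * height * bits_per_sample // 8
    return luma_bytes + chroma_bytes_per_frame(width, height, a, b, bits_per_sample)


WIDTH, HEIGHT = 720, 576  # example frame size, an assumption

size_411 = bytes_per_frame(WIDTH, HEIGHT, 1, 1)  # Y:U:V 4:1:1
size_422 = bytes_per_frame(WIDTH, HEIGHT, 2, 2)  # Y:U:V 4:2:2

print(size_411)  # 622080
print(size_422)  # 829440
print(chroma_bytes_per_frame(WIDTH, HEIGHT, 2, 2)
      // chroma_bytes_per_frame(WIDTH, HEIGHT, 1, 1))  # 2
```

Under these assumptions 4:2:2 carries twice the chroma data of 4:1:1, at the cost of a frame that is one third larger.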
Video coding technology is to encode digitized video signals into TV signals, which can be recorded on video tapes or played on TV. Different application environments can adopt different technologies. The coding technology from low-end game consoles to TV broadcast levels has matured.
3.2 Image compression technology
Image compression has long been a technical hotspot with considerable potential value. It is an important basis for computer processing of images and video and for their transmission over networks. ISO has so far defined two compression standards, JPEG and MPEG. JPEG is a compression standard for still images and is suitable for continuous-tone color or grayscale images. It has two parts: lossless coding based on DPCM (spatial linear prediction), and a lossy algorithm based on the DCT (discrete cosine transform) and Huffman coding. The former introduces no distortion, but its compression ratio is very small. The latter is the one mainly used at present: it loses some image information, but its compression ratio is large, and at a compression ratio of about 20 no distortion is visible.
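The lossy path described above (transform, then discard precision) can be illustrated with a small sketch. It is a deliberate simplification and not the JPEG algorithm itself: it applies a one-dimensional orthonormal DCT to a single row of 8 samples, whereas JPEG works on two-dimensional 8 x 8 blocks, it uses one uniform quantization step instead of a quantization table, and it omits the Huffman coding stage. The sample values and the step size are made up for illustration.

```python
import math
from typing import List


def dct(block: List[float]) -> List[float]:
    """Orthonormal DCT-II of a 1-D block."""
    n = len(block)
    coeffs = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        total = sum(x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i, x in enumerate(block))
        coeffs.append(scale * total)
    return coeffs


def idct(coeffs: List[float]) -> List[float]:
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(coeffs)
    samples = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt((1 if k == 0 else 2) / n)
            total += scale * c * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        samples.append(total)
    return samples


def quantize(coeffs: List[float], step: float) -> List[int]:
    """Round each coefficient to a multiple of step; this is where information is lost."""
    return [round(c / step) for c in coeffs]


def dequantize(levels: List[int], step: float) -> List[float]:
    return [level * step for level in levels]


samples = [52, 55, 61, 66, 70, 61, 64, 73]  # one row of pixel values (made up)
STEP = 10                                   # quantization step (assumption)

levels = quantize(dct(samples), STEP)
restored = idct(dequantize(levels, STEP))

print(levels)                               # mostly small integers, cheap to encode
print([round(value) for value in restored]) # close to, but not equal to, the input
```

The transform on its own is reversible; the loss comes entirely from the quantization step, and a larger step gives higher compression and more distortion.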
MJPEG stands for Motion JPEG. It compresses moving video by applying the JPEG algorithm to each frame of the video signal at a rate of 25 frames per second.
The MPEG algorithm is a compression algorithm designed for moving video. In addition to coding each individual image, it exploits the correlation between the images in a sequence to remove redundancy between frames, which greatly increases the compression ratio. Image quality is usually high, and the compression ratio can reach 100. The disadvantage of the MPEG algorithm is that it is complex and difficult to implement.
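The idea of removing redundancy between frames can be shown with plain frame differencing. This is a simplified stand-in, not the MPEG method: MPEG uses motion-compensated prediction on blocks, while the sketch below only subtracts the previous frame sample by sample. The frames are tiny one-dimensional lists with made-up values.

```python
from typing import List


def frame_difference(previous: List[int], current: List[int]) -> List[int]:
    """Residual that must be coded when the previous frame is already known."""
    return [c - p for p, c in zip(previous, current)]


def reconstruct(previous: List[int], residual: List[int]) -> List[int]:
    """Rebuild the current frame from the previous frame and the residual."""
    return [p + r for p, r in zip(previous, residual)]


frame1 = [10, 10, 10, 200, 200, 10, 10, 10]
frame2 = [10, 10, 10, 10, 200, 200, 10, 10]  # the bright object moved one sample right

residual = frame_difference(frame1, frame2)
nonzero = sum(1 for r in residual if r != 0)

print(residual)  # [0, 0, 0, -190, 0, 190, 0, 0]
print(nonzero)   # 2
print(reconstruct(frame1, residual) == frame2)  # True
```

Because most samples are unchanged from one frame to the next, the residual is mostly zeros and compresses far better than the frame itself.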
Chapter IV Multimedia Communication System
1. Composition
Multimedia communication is a form of meeting or communication among participants in different geographical locations, in which compressed digital image and sound signals are transmitted over a local area network (LAN), wide area network (WAN), intranet, the Internet or the telephone network. Multicast delivery that resembles television broadcasting, streaming playback that resembles a video recorder, teleconferencing, video conferencing, IP telephony, videophone and IP fax are all concrete applications of multimedia communication technology. Over the years, the International Telecommunication Union (ITU) has issued many Recommendations on multimedia computing and communication systems for public and private telecommunication organizations, in order to promote telecommunication cooperation among countries. Of the ITU's 26 series of Recommendations (series A to Z), the seven series most closely related to multimedia communication are shown in Table 4-1, and the core technical standards of the three types of multimedia communication systems are shown in Table 4-1.
Table 4-1 ITU series of Recommendations
Series      Main contents
G series    Transmission systems and media, digital systems and networks
H series    Audiovisual and multimedia systems
I series    Integrated services digital network (ISDN)
J series    Transmission of television, sound programme and other multimedia signals
Q series    Telephone switching and signalling
T series    Terminals for telematic services
2. The function and structure of the gateway
A gateway is a powerful computer or workstation that handles real-time two-way communication between circuit-switched networks (such as the telephone network) and packet-switched networks (such as the Internet) and provides the connection between heterogeneous networks. It is the bridge between the traditional circuit-switched network and the modern IP network.
The emergence of IP telephony (see "7.4 IP telephone") made it possible to place telephone calls over packet-switched networks and triggered a revolution in the telecommunications industry. However, IP telephony has met many obstacles on the way to becoming a mainstream telephone service. One of the biggest is the lack of connectivity between IP telephone networks and the public switched telephone network (PSTN), and an important reason is the limitations of early gateways. For example, it was difficult to set up a call through a gateway and an unconventional phone number had to be used; incompatibility between different gateways hindered call setup; and sound quality was poor, with echo and long delays. This has driven the development of gateways that allow IP and PSTN clients to communicate with each other, and one of the measures is to increase the processing capacity of the gateway. A low-end gateway has 1 to 6 ports and is generally built on a PC with a high-end Pentium processor, which provides gateway functions such as media processing, call control and packet processing. A high-end gateway distributes the gateway functions over several processors; this is called a CTI platform and can provide more than 100 ports.
The basic functions of gateways can be summarized into three types:
(1) Protocol translation: The gateway acts as an interpreter so that different networks can make contact. For example, it lets the PSTN and an H.323 network talk to each other in order to set up and clear calls.
(2) Information format conversion: Different networks use different coding methods, and the gateway converts the information so that heterogeneous networks can freely exchange information such as voice and video.
(3) Information transfer: The gateway is responsible for transferring information between the different networks.
The main components of the gateway include:
(1) A switched circuit network (SCN) interface card, typically a T1/E1 or PRI ISDN line interface card, which communicates with the SCN. The primary rate interface (PRI) consists of 23 B channels of 64 kb/s each and one 64 kb/s D channel, written 23B+D, which is equivalent to the bandwidth of a T1 line.
(2) A digital signal processor (DSP) card, which performs tasks such as voice signal compression and echo cancellation.
(3) A network interface card, which communicates with the H.323 network. Typically this is a 10/100 network interface card (NIC), or its functions are integrated on the motherboard.
(4) A control processor, which coordinates all activities of the other gateway components and is usually located on the system motherboard.
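The statement that a 23B+D primary rate interface matches a T1 line can be verified with simple arithmetic. In the sketch below the 64 kb/s channel rates come from the text, while the 8 kb/s of T1 framing overhead and the 1544 kb/s T1 line rate are standard figures that the text does not state. The constant and function names are hypothetical.

```python
B_CHANNEL_KBPS = 64    # bearer channel rate
D_CHANNEL_KBPS = 64    # signalling channel rate on a primary rate interface
T1_FRAMING_KBPS = 8    # T1 framing overhead (not stated in the text)
T1_LINE_RATE_KBPS = 1544


def pri_payload_kbps(b_channels: int = 23) -> int:
    """Payload rate of an nB+D primary rate interface in kb/s."""
    return b_channels * B_CHANNEL_KBPS + D_CHANNEL_KBPS


payload = pri_payload_kbps()
print(payload)                                         # 1536
print(payload + T1_FRAMING_KBPS)                       # 1544
print(payload + T1_FRAMING_KBPS == T1_LINE_RATE_KBPS)  # True
```

The 24 channels of 64 kb/s fill the T1 payload exactly, which is why a T1 interface card can serve as the PRI line card.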
The main software of the gateway includes:
(1) Gateway software that performs all the basic and optional gateway functions. For example, an H.323 gateway platform performs basic functions such as protocol conversion, message format conversion and information transfer, and it supports voice compression, real-time fax demodulation and remodulation, and the implementation of the H.323 series of protocols.
(2) Gateway-specific application software, which performs customized functions as well as management and control functions.
3. Function and structure of the gatekeeper
The gatekeeper connects H.323 video conference clients on an IP network and is one of the key components of video conferencing. Many people regard it as the "brain" of a video conference. It provides authorization and authentication, keeps and maintains call records, performs address translation so that users need not remember IP addresses, monitors the network, manages bandwidth by limiting the number of simultaneous calls so as to ensure video conference quality, and provides interfaces to existing systems. The functions of the gatekeeper are usually implemented in software. They are divided into two parts: basic functions and optional functions.
The basic functions that the gatekeeper must provide include:
Address translation: A translation table, which can be updated by registration messages, is used to translate alias addresses into transport addresses. This function is especially important when a telephone on a circuit-switched network calls a PC on an IP network, and it is also important for determining the address of a gateway.
Admission control: Access to the LAN is authorized by means of ARQ/ACF/ARJ (Admission Request, Admission Confirm and Admission Reject) messages. The H.323 standard requires RAS messages for authorizing network services. RAS is the registration/admission/status protocol, but it does not define rules or policies for authorizing access to network resources, so service providers need the gatekeeper to tie into their existing authorization methods. In addition, enterprise managers and service providers may want to authorize according to their own criteria, for example by prepaid deposit or credit card.
Bandwidth control: The gatekeeper supports the RAS bandwidth messages BRQ/BCF/BRJ (Bandwidth Request, Bandwidth Confirm and Bandwidth Reject) to implement bandwidth control. How bandwidth is actually managed is decided by the policies of the service provider or the enterprise manager. In many cases, if the network or a specific gateway is not congested, any bandwidth request should be granted.
Zone management: The gatekeeper manages all registered H.323 endpoints and provides them with the functions above. Which terminals may register, and how a geographical or logical zone (the terminals, gateways and multipoint control units (MCUs) managed by a single gatekeeper) is composed, is left to the network designer.
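The basic functions listed above can be pictured with a toy model. The sketch is illustrative only and makes several assumptions: the class and method names are hypothetical, the strings "ACF" and "ARJ" merely label the outcome of an admission request and do not model the real RAS message encoding, the admission policy (both parties registered, enough bandwidth left in the zone) is one possible policy among many, and the addresses are example values.

```python
from typing import Dict, Optional, Tuple

TransportAddress = Tuple[str, int]  # (IP address, port)


class Gatekeeper:
    """Toy gatekeeper for one zone: address translation, admission, bandwidth."""

    def __init__(self, zone_bandwidth_kbps: int) -> None:
        self.zone_bandwidth_kbps = zone_bandwidth_kbps
        self.used_kbps = 0
        # Translation table from alias to transport address, updated by registration.
        self.translation_table: Dict[str, TransportAddress] = {}

    def register(self, alias: str, address: TransportAddress) -> None:
        """Zone management: record an endpoint that joins the zone."""
        self.translation_table[alias] = address

    def translate(self, alias: str) -> Optional[TransportAddress]:
        """Address translation: alias to transport address, or None if unknown."""
        return self.translation_table.get(alias)

    def admit(self, caller: str, callee: str, requested_kbps: int):
        """Admission and bandwidth control for one call request."""
        if caller not in self.translation_table:
            return ("ARJ", "caller not registered")
        address = self.translate(callee)
        if address is None:
            return ("ARJ", "callee not registered")
        if self.used_kbps + requested_kbps > self.zone_bandwidth_kbps:
            return ("ARJ", "insufficient bandwidth")
        self.used_kbps += requested_kbps
        return ("ACF", address)

    def release(self, kbps: int) -> None:
        """Return bandwidth to the zone when a call ends."""
        self.used_kbps = max(0, self.used_kbps - kbps)


gatekeeper = Gatekeeper(zone_bandwidth_kbps=256)
gatekeeper.register("alice", ("10.0.0.1", 1720))
gatekeeper.register("bob", ("10.0.0.2", 1720))

first = gatekeeper.admit("alice", "bob", 128)
second = gatekeeper.admit("bob", "alice", 128)
third = gatekeeper.admit("alice", "bob", 64)
unknown = gatekeeper.admit("alice", "carol", 64)

print(first)    # ('ACF', ('10.0.0.2', 1720))
print(second)   # ('ACF', ('10.0.0.1', 1720))
print(third)    # ('ARJ', 'insufficient bandwidth')
print(unknown)  # ('ARJ', 'callee not registered')
```

Keeping the policy inside admit() mirrors the point made above: the protocol carries the request and the answer, while the rule that decides the answer belongs to the service provider.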
The optional functions provided by the gatekeeper include:
Call control signaling mode: H.323 defines two call control signaling models, the gatekeeper-routed call signaling model and the direct endpoint call signaling model. The gatekeeper can choose between them according to the requirements of the access provider.
Call authorization: The gatekeeper can authorize or reject a given call according to conditions specified by the service provider. These conditions may include the time of the conference, the type of service subscribed, access rights to restricted gateways, or the available bandwidth.
Bandwidth management: The gatekeeper determines whether there is enough bandwidth for a call according to the bandwidth allocation specified by the service provider.
Call management: The gatekeeper provides intelligent call management. It maintains an H.323 call table that indicates whether the called terminal is busy and supplies information to the bandwidth management function.
Structure of the gatekeeper
The gatekeeper is usually designed with an inner layer and an outer layer, as shown in Figure 4-8. The inner layer is called the core layer and consists of the software that implements the H.323 protocol stack and the software that implements the multipoint control unit (MCU) function. Some software development companies call it the core functional component of the H.323 gatekeeper. The main function of the MCU is to connect multiple lines and switch the video signals, either automatically or manually under the direction of the conference chair. The outer layer of the gatekeeper consists of many application interfaces, which connect to the many existing services on the network. External software