From Eros (silicon) to Gaia (storytelling business): transmitting HEVC-coded video over broadband mobile LTE

Byung K. Yi; Yan Ye

doi:10.1017/ATSIP.2015.20

Published online by Cambridge University Press: 14 December 2015

Byung K. Yi*: Affiliation:
InterDigital Communications Inc., 9710 Scranton Road, Suite 250, San Diego, CA 92121, USA
Yan Ye: Affiliation:
InterDigital Communications Inc., 9710 Scranton Road, Suite 250, San Diego, CA 92121, USA
*: Corresponding author:B. K. Yi Email: Byung.Yi@interdigital.com

Abstract

Being connected “anywhere anytime” has become a way of life for much of the world's population. Thanks to major technological advances in internet, wireless communication, video technology, silicon manufacturing, etc., our mobile devices have become not only faster and more powerful, but also smaller and sleeker. With the popularity of rich media on the rise, video now accounts for the largest share of data traffic over the mobile network. That is why we borrow the title of Freeman Dyson's book “From Eros to Gaia.” Equipped with rich media capabilities, our mobile devices enable a storytelling experience unlike anything the human race has experienced before. In this paper, we review the latest technological evolutions in the wireless space and in the video space, namely long-term evolution (LTE) and high-efficiency video coding (HEVC), respectively. We then discuss how these advanced technologies impact our way of life at present and in years to come.

Keywords

Mobile broadband Hybrid video coding LTE HEVC Video delivery

Type: Industrial Technology Advances
Information: APSIPA Transactions on Signal and Information Processing , Volume 4 , 2015 , e18

DOI: https://doi.org/10.1017/ATSIP.2015.20
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is unaltered and is properly cited. The written permission of Cambridge University Press must be obtained for commercial re-use or in order to create a derivative work.
Copyright: Copyright © The Authors, 2015

I. A BRIEF REVISIT OF THE HISTORY OF OUR DIGITALLY CONNECTED WORLD

We live in an increasingly connected world. Figure 1 provides a conceptual illustration of today's connected world, where a diverse set of end devices are connected to a massive and heterogeneous network infrastructure. Together, network servers, cell phone towers, and satellite receivers provide “anywhere anytime” type of wired (e.g. fiber optics cables) and wireless connections (e.g. cellular and satellite) to portable devices (e.g. phones, tablets, and laptops), wearable devices (e.g. glasses, watches, and clothes), computers, homes, automobiles, and even airplanes. With a connected device, people can do almost anything anywhere at any time: social networking, online TV and movie streaming, video calls, shopping, banking, online education, etc.; even voting online is allowed in some parts of the world. Further, the connections are becoming increasingly multimedia in nature. That is, rather than communicating through a monolithic medium (i.e. voice calls only), people can exchange various forms of information, including video, graphics, images, audio, data, and a combination of these.

Fig. 1. Connected world.

Across the planet Earth, the internet connects people who are far apart and delivers messages and information at high speeds. Invented back in 1965 for military-research purposes, the internet is a complex global communications network with thousands of constituent networks interconnected using fiber optic cables (a.k.a. the Internet backbone) (http://www.cs.ucsb.edu/~almeroth/classes/F04.176A/homework1_good_papers/Alaa-Gharbawi.html). In the 1990s, the World Wide Web as we know it today was invented, and browsers to surf the web became available. These enabled the commercialization of the internet at mass scale [1]. The rapid wave of commercialization sent the internet-based hi-tech world through significant economic turbulence around the turn of the century, as evidenced by the failure of many first-generation internet startup companies. These economic bubble bursts were mainly due to flaws in business models rather than technical shortfalls. Nonetheless, the successful companies survived the turbulence, and nowadays the internet is deeply entrenched in the daily lives of many people across the world.

Starting in the early 1990s, another communication technology that would later see mass-scale commercial success and become deeply entrenched in modern society also began to undergo stable and continuous growth: digital cellular (i.e. mobile) communications. The earliest digital cellular networks are known as 2G networks, with well-known examples being the European-dominated GSM network and the North America- and Asia-dominated IS-95 CDMA network; digital cellular networks start from 2G because the 1G networks were analog. Although consumer mobile phone adoption grew slowly in the early years (1990s), probably mainly due to cost considerations, the mobile phone user base continued to grow steadily throughout the 1990s and early 2000s, thanks to the appeal of being able to communicate and stay in touch anywhere, without being held back by the physical constraint imposed by wires. As the use of 2G phones became more widespread, demand for data access (such as browsing the internet on the phone) beyond simple voice calls grew. In response, the industry worked to continuously improve mobile network connection speed. The GSM family developed 3G (e.g. WCDMA) and 3.5G (e.g. HSPA), while the CDMA family separately developed 3G (e.g. cdma2000) and 3.5G (e.g. 1xEV-DO and 1xEV-DV); the two paths then merged into the latest 4G (e.g. LTE) networks for mobile broadband connection. Compared to the 2G network, which relied on circuit switching for data transmission, 3G and 4G networks support packet switching, with the 4G network being a native all-IP network, enabling data-optimized transmission with significantly increased speed. The 4G LTE network will be discussed in depth in Section II. Figure 2 shows the projected mobile subscriptions by network generation, according to the Ericsson Mobility Report [2]. The latest 4G LTE deployment is expected to grow significantly in the coming years, while the older 2G GSM networks are being phased out.

Fig. 2. Mobile subscription by cellular network generation. Source: Ericsson.

As silicon manufacturing technologies evolved according to Moore's law, mobile phones became smaller, sleeker, faster, and more powerful. Early generations of mobile phones started in the same way as wired telephones and were only capable of making voice calls. As technological innovations continued, mobile phones took on many enhanced features and functionalities. Early smartphones such as the BlackBerry became popular as email and personal digital assistant devices. In the mid-2000s, the smartphone revolution began in full force with the introduction of the first iPhone, quickly followed by many Android-based smartphones (e.g. Samsung's flagship Galaxy series). Nowadays, our mobile phones and tablets are all-in-one handheld computers with impressive software and hardware capabilities, serving our entertainment needs (video and audio/music consumption, gaming, etc.), connecting us with friends (social networking), telling us where to go and what to do (navigation and maps), taking pictures and videos at important moments, etc. For many of these things that we do on our handheld devices, a high-speed connection is required. Figure 3 shows the mobile data traffic by application type [2]. Every month, exabytes (10¹⁸ bytes) of data go through the mobile networks throughout the globe, and the data traffic is expected to continue its explosive growth in the foreseeable future. Among the different types of data-intensive applications in Fig. 3, video traffic takes the lion's share: video accounted for about 45% of data traffic in 2014, and that percentage is expected to grow to about 55% in 2020. Much of the growth in video is due to richer content (for example, embedded video in online news, use of video on social network platforms) and over-the-top (OTT) video consumption such as YouTube and Netflix. Thus, efficient mobile video delivery mechanisms are of utmost importance for reducing network congestion and improving user experience.

Fig. 3. Mobile data traffic by application types. Source: Ericsson.

As the human race evolved out of the Stone Age into the modern days, we went through different phases of human society: the hunting society, the agricultural society, the industrial society, the information society in which we now fully live, and the dream society, which many futurists describe as the storytelling society [Reference Jensen3]. Though much has changed with time, one fundamental desire of human nature, storytelling, remains the same. According to Robert McKee, the famous lecturer of the STORY seminars, “Stories fulfill a profound human need to grasp the patterns of living – not merely as an intellectual exercise, but within a very personal, and emotional experience.” Though the desire to tell stories has never changed, technological advances have greatly enhanced the means of storytelling throughout human history: from tribal gatherings after hunting parties, to primitive art drawn on walls in the Stone Age, to modern media such as books, movies, and TV shows. In today's connected world, stories are being told not only by professional artists and writers, but also by the general population. Websites like YouTube and social networks like Facebook gave the general public the means to share their personal emotional moments. More importantly, this storytelling capability not only consists of rich media (audio and visual information), but also can be exercised anywhere and anytime, through the support of the enormous network infrastructure depicted in Fig. 1. In fact, the human race is now undergoing a migration from the information society into the “storytelling society.”

The remainder of this paper is organized as follows. In Section II, different generations of digital mobile networks are briefly reviewed, including a quick review of the latest 4G LTE technology and LTE-Advanced. In Section III, different generations of video compression standards, including the latest HEVC standard, are reviewed. In Section IV, we discuss and predict, from a storyteller's perspective, how these technological advances will influence the way we live and tell our tales. We conclude the paper in Section V.

II. MOBILE NETWORK EVOLUTION

A) Spectral efficiency

It is a well-known fact that any wireless point to point channel capacity cannot exceed the Shannon limit, as defined in equation (1) below:

(1)

$$C\le W \cdot \log_2 \left(1 + {S \over N} \right),$$

where C is the channel capacity in units of bits/s, W is the spectrum bandwidth in units of Hz, and S/N is the signal-to-noise ratio. According to equation (1), one way to increase channel capacity is to increase the spectrum bandwidth W. However, the spectrum needs to be licensed from government regulatory agencies (such as the Federal Communications Commission in the USA) and can be very costly. Even so, it is the most popular choice in the network operator community. For a given bandwidth, the spectral efficiency, C/W, can be expressed in terms of digital communication parameters by writing the signal-to-noise ratio as S/N = (E_b/N_0) · (R_b/W), as shown in equation (2):

(2)

$$\eqalign{{C \over W} & = \log_2 \left(1 + {S \over N} \right) \cr & = \log_2 \left(1 + {E_b \over N_0} \cdot {R_b \over W} \right),}$$

where R_b/W is the modulation index, which depends on the modulation scheme: R_b/W is 1 for BPSK modulation, 2 for QPSK modulation, 4 for 16QAM modulation, 6 for 64QAM modulation, and so on, and E_b/N_0 is the ratio between bit energy and noise spectral density.
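As a rough numerical illustration of equations (1) and (2), the following minimal Python sketch evaluates the right-hand sides exactly as written above. The function names and the sample values (a 20 MHz channel, an E_b/N_0 of 10 dB) are illustrative assumptions and are not taken from the paper.

```python
import math


def shannon_capacity_bps(bandwidth_hz, snr_linear):
    """Upper bound on capacity from equation (1): C <= W * log2(1 + S/N)."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)


def spectral_efficiency(ebn0_db, rb_over_w):
    """Right-hand side of equation (2), in bits/s/Hz.

    ebn0_db   : E_b/N_0 in dB (converted to a linear ratio below)
    rb_over_w : modulation index R_b/W (1 for BPSK, 2 for QPSK, ...)
    """
    ebn0_linear = 10.0 ** (ebn0_db / 10.0)
    return math.log2(1.0 + ebn0_linear * rb_over_w)


if __name__ == "__main__":
    # Hypothetical example values, chosen only for illustration.
    print(shannon_capacity_bps(20e6, 10.0))  # 20 MHz channel, S/N = 10 (linear)
    for name, index in [("BPSK", 1), ("QPSK", 2), ("16QAM", 4), ("64QAM", 6)]:
        print(name, spectral_efficiency(10.0, index))
```

Because the bound grows only logarithmically with S/N, moving from BPSK to 64QAM at a fixed E_b/N_0 raises the value of equation (2) by far less than the sixfold increase in modulation index.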

A modern cellular system is also subject to another bound, the so-called interference bound. This interference bound is the power limitation from and/or to the adjacent cells; it limits the power that each cell can transmit so that each cell does not cause too much interference to neighboring cells. The interference bound is inversely related to E_b/N_0, as shown in Fig. 4. Figure 4 also shows the theoretical spectral efficiency that could be achieved using WCDMA (the yellow dots) if the interference bound (i.e. frequency reuse equal to 1) was not observed. It is also worth noting that AMPS, an analog cellular phone technology from the early days, has a frequency reuse equal to 1/7; because its frequency reuse is lower, AMPS does not observe the interference bound of frequency reuse equal to 1, and can transmit “louder” than the threshold in Fig. 4.

Fig. 4. Achievable Shannon bound and interference bound.

According to Fig. 4, the wireless industry has achieved spectral efficiency very close to the theoretical bound through advanced coding schemes, hybrid ARQ, and adaptive modulation. In order to improve spectral efficiency further, successive generations of mobile cellular communication standards have focused on two main areas: (1) developing interference mitigation techniques such as interference cancellation, beamforming, and coordinated transmission; (2) using spatial multiplexing mechanisms (which require more transmitters and receivers) to create a number of virtual channels. The theoretical channel capacity of a multiple-antenna system was derived by Telatar [Reference Telatar4], as shown in equation (3).

(3)

$$\eqalign{\left({C \over W} \right)_{MIMO} & = \min (n,k) \log_2 \left(1 + {S \over N} \right)\cr & = \min (n,k) \log_2 \left(1 + {E_b \over N_0} \cdot {R_b \over W} \right),}$$

where n is the number of receive antennas and k is the number of transmit antennas. Equation (3) shows that the channel capacity can be increased linearly with the number of antennas (more precisely, with min(n, k)).
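The linear scaling in equation (3) can be checked with a minimal Python sketch. The function name and the antenna configurations below are illustrative assumptions; the sketch simply evaluates the first line of equation (3) and ignores channel correlation and other practical effects.

```python
import math


def mimo_spectral_efficiency(n_rx, n_tx, snr_linear):
    """First line of equation (3): min(n, k) * log2(1 + S/N), in bits/s/Hz."""
    return min(n_rx, n_tx) * math.log2(1.0 + snr_linear)


if __name__ == "__main__":
    snr = 3.0  # hypothetical linear S/N, so log2(1 + S/N) = 2 bits/s/Hz
    for n_rx, n_tx in [(1, 1), (2, 2), (4, 4), (2, 4)]:
        print(n_rx, n_tx, mimo_spectral_efficiency(n_rx, n_tx, snr))
```

Note that a 2×4 configuration gains nothing over 2×2 under this formula, since the smaller antenna count limits the number of spatial streams.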

Figure 5 shows the achievable bound of spectral efficiency as a function of the degree of spatial multiplexing (number of antennas) and interference reduction. The curves show the spectral efficiency envelopes for different multiplexing factors, which are bounded by the straight lines that indicate different amounts of interference reduction. A system requirement can be met by combining spatial multiplexing with interference reduction. For example, if we want to design a cellular system with a spectral efficiency of 4 bits/s/Hz (dashed horizontal line), we could use a combination of at least 3 dB interference reduction and a multiplexing factor of 12 (gray dot in Fig. 5). Another choice would be to use a multiplexing factor of 6 in combination with interference reduction of at least 6 dB (brown dot in Fig. 5). Though not shown in Fig. 5, many commercial implementations use a combination of at least 6 dB interference reduction and a multiplexing factor of 4. In general, a higher multiplexing factor may be combined with lower interference reduction (or vice versa) to achieve a required spectral efficiency.

Fig. 5. Achievable envelope of spectral efficiency by varying the degrees of spatial multiplexing and interference reduction.

B) Comparison of different generations of digital mobile networks

Historically, the 3rd Generation Partnership Project (3GPP) and the 3rd Generation Partnership Project 2 (3GPP2) have been successful at developing mobile network standards that have enjoyed wide commercial adoption. Well-known standards produced by 3GPP include Global System for Mobile communications (GSM), General Packet Radio Service (GPRS), different releases of Universal Mobile Telecommunications System (UMTS) known as WCDMA (UMTS R99.4), HSDPA (UMTS R5), and HSUPA (UMTS R6), and the latest 4G standard, Long-Term Evolution (LTE). Well-known standards produced by 3GPP2 include cdma2000, 1x EV-DO (Evolution Data Optimized), and 1x EV-DV. Figure 6 shows the evolution of different generations of mobile network standards produced by 3GPP. According to the AT&T study, on average the peak data rates of the Up-Link and Down-Link have doubled through the techniques mentioned above: increasing the modulation index and the number of antennas.

Fig. 6. Evolution path of 3GPP standards peak DL (Down-Link) and UL (Up-Link) Data Rate, Source: AT&T 3GPP presentation.

C) The 4G LTE network

In order to meet the fast-growing demand for fast mobile data access and services, 3GPP, in conjunction with the International Telecommunication Union (ITU), has developed the widely popular 4G LTE standards. Aside from better accommodating mobile video, Web 2.0 personalization, and streaming applications, 4G LTE gives mobile network operators a way to compete on an equal footing with cable and landline telecommunications providers for broadband data users. Under nominal network conditions across a user population in a given wireless sector, the average user throughput falls between 5 and 12 Mbps for downlink and between 2 and 5 Mbps for uplink, which is comparable with landline broadband speeds. LTE also boasts reduced latency of below 50 ms, in comparison with latency of 150–250 ms for 3G networks. Unlike 3G cellular systems, which use CDMA technologies for both links and a circuit-switched core network, 4G LTE adopts orthogonal frequency division multiplexing (OFDM) for the DL, single-carrier frequency division multiple access (SC-FDMA) for the UL, and all-IP packet networking. SC-FDMA adds a discrete Fourier transform (DFT) in front of the OFDM signaling, which allows multiple access from different users and reduces the peak-to-average power ratio.

The new features of the 4G system are carrier aggregation (CA), aggregating many carriers above 100 MHz bands; Enhanced Home Node B for small/femto cell applications; Self-Organizing Network, which reduces capital investment; and multi-user MIMO (multiple-input and multiple-output) using 4×4 antennas. There have also been some discussions on in-band and out-of-band relays and heterogeneous networks.

LTE enjoys strong and widespread support from the mobile carriers, including backing from a majority of the industry's key players. LTE has been selected by more than 297 mobile operators worldwide, including Vodafone, China Mobile, Verizon Wireless, etc.

D) The 5G network

To discuss the 5G system seems premature at this point, as industry-wide consensus is needed on overall performance requirements, service definitions, and spectrum allocations. However, we can share some of InterDigital's research directions. We believe that the 5G network will be an indoor-oriented technology evolution that enhances indoor spectral efficiency and provides seamless connectivity with the outdoor cellular system. Recent statistics showed that 86% of all wireless calls were initiated and terminated indoors; only 14% of calls were originated and terminated outdoors. On reflection, the legacy wireless system was intended for outdoor mobility alone; indoor connectivity relied on signal spill-over from outdoors or on reconnecting to an indoor access network such as WiFi [5]. This means that modern cellular systems have been designed and optimized for only 14% of the overall traffic, that is, only for the outdoor-to-outdoor traffic. Table 1 shows the 5G indoor performance parameters that the industry as a whole would have to come up with in the next 5 years.

Table 1. Suggested 5G network performance parameters.

III. VIDEO COMPRESSION EVOLUTION

Historically, most of the widely deployed video compression standards have been produced by the ITU standardization sector (ITU-T) and the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC). Figure 7 shows the different generations of video compression standards from these two standardization organizations (SDOs) over the years, starting from the early 1990s. On the ITU-T side, the video compression standards are the H.26x series, including H.261 [6], H.262 [7], H.263 [8], H.264 [9], and H.265 [10]. On the ISO/IEC side, the video compression standards are known as the Moving Picture Experts Group (MPEG) standards series, including MPEG-1 [11], MPEG-2 [7], MPEG-4 Part 2 [12], MPEG-4 Part 10 Advanced Video Coding (AVC) [9], and MPEG-H High Efficiency Video Coding (HEVC) [10]. Some of these standards were produced together under a joint committee of the two SDOs, including H.264/MPEG-4 Part 10 AVC [9] and the latest H.265/MPEG-H HEVC [10]. A new generation of video compression standard is developed roughly every 10 years, and generally achieves a 50% bit rate reduction at the same subjective quality relative to the previous generation.

Fig. 7. Generations of video compression standards.

All of these standardized video codecs rely on the block-based hybrid video codec paradigm to compress the input video signal, as depicted in Fig. 8. In a block-based hybrid video coding system, the input video is partitioned into block units. Each block is predicted using either spatial prediction (using already coded spatial neighboring pixels in the current picture), or temporal prediction (using previously coded pictures). The encoder has mode decision logic that decides the best prediction mode (spatial prediction versus temporal prediction) for each block. Then, the prediction block is subtracted from the original input. The prediction residual is transformed and then quantized to reduce the number of bits required to represent the prediction residual. Quantization introduces loss and causes video quality degradation. The prediction mode information, motion information (if temporal prediction is used), the quantized residual transform coefficients, along with other side information, will go through entropy coding to further reduce the bandwidth. Finally, the bits are packed into the coded bitstream. In modern video coding systems, the encoder also has a built-in decoder that reconstructs the video in the same way as the “remote” decoder, in order to guarantee that the encoder and the decoder stay synchronized. The built-in decoder performs inverse quantization and inverse transform to recover the reconstructed prediction residual. Then, the reconstructed prediction residual is added to the prediction block. Afterwards, loop filters such as deblocking filters may be applied to further improve the reconstructed video quality. Such reconstructed blocks are finally stored in the decoded picture buffer, for prediction of future video blocks.
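The encoder loop described above can be illustrated with a deliberately simplified Python sketch. It is a toy, not HEVC or any standardized codec: pictures are one-dimensional lists of samples, the transform and entropy coding stages are omitted, and mode decision uses a plain sum of absolute differences. All names (`BLOCK`, `predict`, `encode_frame`, `decode_frame`) are hypothetical. What the sketch does preserve is the key structural idea that the encoder predicts from its own reconstruction, so encoder and decoder stay synchronized despite lossy quantization.

```python
BLOCK = 4  # samples per block (hypothetical toy value)


def predict(mode, recon_cur, recon_prev, start):
    """Build the prediction block for the block beginning at `start`."""
    if mode == "temporal":
        # Co-located block from the previously reconstructed picture.
        return recon_prev[start:start + BLOCK]
    # Spatial: repeat the last reconstructed neighbor (128 if none exists).
    left = recon_cur[start - 1] if start > 0 else 128
    return [left] * BLOCK


def reconstruct(pred, levels, step):
    """Inverse quantization, then add the residual back to the prediction."""
    return [p + lvl * step for p, lvl in zip(pred, levels)]


def encode_frame(frame, recon_prev, step):
    """Return (stream, reconstruction); len(frame) must be a multiple of BLOCK."""
    stream, recon = [], []
    for start in range(0, len(frame), BLOCK):
        block = frame[start:start + BLOCK]
        modes = ["spatial"] + (["temporal"] if recon_prev is not None else [])

        def cost(mode):
            pred = predict(mode, recon, recon_prev, start)
            return sum(abs(a - p) for a, p in zip(block, pred))

        best = min(modes, key=cost)                # mode decision
        pred = predict(best, recon, recon_prev, start)
        levels = [round((a - p) / step) for a, p in zip(block, pred)]  # lossy
        stream.append((best, levels))
        recon.extend(reconstruct(pred, levels, step))  # built-in decoder
    return stream, recon


def decode_frame(stream, recon_prev, step):
    """Mirror of the encoder's built-in decoder."""
    recon = []
    for index, (mode, levels) in enumerate(stream):
        pred = predict(mode, recon, recon_prev, index * BLOCK)
        recon.extend(reconstruct(pred, levels, step))
    return recon
```

Feeding the stream of each picture to `decode_frame`, together with the decoder's own previous reconstruction, yields exactly the reconstruction the encoder stored, and each sample differs from the input by at most half the quantization step.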

Fig. 8. Video encoder block diagram.

Although the latest HEVC standard follows the same block diagram in Fig. 8 as has been used since the earliest generation of video compression standards such as MPEG-1 and H.261, many incremental improvements in each of the functional blocks in Fig. 8 have been made. A short list of these improvements includes larger block units, quad-tree-based block partitioning, larger transforms, advanced motion vector coding, and a new loop filter called Sample Adaptive Offset. We will not go in-depth to discuss these specific changes in HEVC; interested readers are referred to [Reference Ohm, Sullivan, Schwarz, Tan and Wiegand13,Reference Sullivan, Ohm, Han and Wiegand14] for technical details.

Together, these coding tools, along with many other design improvements, contribute to HEVC's superior coding performance. Table 2 compares the coding performance of HEVC with earlier compression standards. Two types of video applications are considered in Table 2: entertainment applications such as TV broadcasting and video streaming, and interactive applications such as video conferencing and telepresence. The numbers shown are the Bjontegaard Delta rate (BD-rate) [Reference Bjontegaard15], which is a commonly used metric in video coding that measures the percentage of average rate reduction at the same video quality. In Table 2, peak signal-to-noise ratio (PSNR) is the objective quality metric used to calculate the BD-rate. Compared with H.264/AVC, HEVC achieves 35% rate reduction for entertainment applications and 40% rate reduction for interactive applications, respectively. Compared with MPEG-2, 71 and 80% of bit rate savings are achieved for entertainment and interactive applications, respectively. The numbers are less than the desired 50% rate reduction between two generations of video compression standards (i.e., HEVC versus H.264/AVC); this is because PSNR is used as the quality metric in Table 2. Typically, the rate reduction becomes higher when a subjective quality metric such as mean opinion score (MOS) measured using naïve viewers is used instead of PSNR, as we will discuss next.
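For readers unfamiliar with the BD-rate metric, the following Python sketch shows one common way it is computed, under the assumption that each codec is measured at exactly four rate points: the logarithm of the bit rate is modeled as a cubic polynomial of PSNR, both curves are integrated over their overlapping PSNR range, and the average difference is converted back to a percentage. The function names and the sample data are illustrative assumptions; they are not the measurements behind Table 2.

```python
import math


def lagrange(xs, ys, x):
    """Evaluate the polynomial passing through the points (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def simpson(func, lo, hi):
    """Simpson's rule on one interval; exact for polynomials up to degree 3."""
    mid = (lo + hi) / 2.0
    return (hi - lo) / 6.0 * (func(lo) + 4.0 * func(mid) + func(hi))


def bd_rate(anchor, test):
    """Average bit rate change (%) of `test` versus `anchor` at equal PSNR.

    anchor, test: lists of four (bitrate, psnr_db) pairs.
    A negative result means the test codec needs fewer bits.
    """
    a_psnr = [p for _, p in anchor]
    t_psnr = [p for _, p in test]
    a_log = [math.log10(r) for r, _ in anchor]
    t_log = [math.log10(r) for r, _ in test]
    lo = max(min(a_psnr), min(t_psnr))  # overlapping PSNR interval
    hi = min(max(a_psnr), max(t_psnr))
    int_a = simpson(lambda x: lagrange(a_psnr, a_log, x), lo, hi)
    int_t = simpson(lambda x: lagrange(t_psnr, t_log, x), lo, hi)
    avg_log_diff = (int_t - int_a) / (hi - lo)
    return (10.0 ** avg_log_diff - 1.0) * 100.0


if __name__ == "__main__":
    # Made-up rate-distortion points (kbps, dB), for illustration only.
    anchor = [(1000, 32.0), (2000, 35.0), (4000, 38.0), (8000, 41.0)]
    test = [(600, 32.2), (1200, 35.1), (2500, 38.3), (5200, 41.2)]
    print(bd_rate(anchor, test))
```

As a sanity check on the sketch, a test codec that needs exactly half the anchor's bit rate at every PSNR point yields a BD-rate of -50%.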

Table 2. HEVC performance compared with previous generation standards. Numbers shown are % of bit rate savings at the same PSNR.

The rapid increase in computing power has allowed the spatial resolution of mobile video to increase significantly in recent years. At the turn of the century, mobile devices had very small screens with support for merely QCIF (176×144) resolution video. Nowadays, HD (1920×1080) video is widely supported by many mobile devices, and recent trends show that 4K/UHD (3840×2160) video support is starting to migrate from TVs in the living room to handheld devices. Figure 9 shows the percentage of bit rate savings that HEVC achieves compared to H.264/AVC, based on subjective tests using MOS [Reference Tan, Mrak, Baroncini and Ramzan16]. As shown, on average HEVC saves 64% of bit rate for 4K/UHD content, which is a much higher gain than the 52% bit rate saving achieved for VGA (720×480) content. Averaged across all resolutions, HEVC can reduce the video bit rate by 59% without compromising the video quality.

Fig. 9. Average percent of bit rate savings using HEVC compared to H.264/AVC at the same subjective quality.

Looking beyond HEVC, exploration of future video coding technologies that promise to increase video compression efficiency by at least another factor of 2× compared to HEVC is already underway [17].

IV. A STORYTELLER'S PERSPECTIVE

Let us go back to the fundamental human desire of storytelling, and discuss how these most recent wireless and video technological advances have enriched the storytelling experience and influenced the way we live as storytellers.

A) The storytelling society

In his book published more than a decade ago [Reference Jensen3], R. Jensen predicted that human society would be transformed into the Storytelling Society (the Dream Society); some of the predicted trends are now being realized. In his book he said “…any conceivable piece of information will be yours for the asking, and you will be able to get in touch with anybody, anytime, anywhere. And your computer and communications devices will be designed to be exciting electronic companions for you.” In today's society, information can indeed be accessed anytime and anywhere, and our handheld devices have become exactly those “exciting electronic companions.” Thanks to their convenient form factors and the ease of anywhere anytime connection, we use our mobile devices to navigate the world whether we are in town or going out on a trip, and we have become accustomed to telling our stories to our friends, family, and sometimes even “strangers” on the web, almost in real time. Further, the rich media capabilities (high-resolution digital camera, camcorder and screen, high-quality sound system, etc.) on our mobile devices enable a storytelling experience that has become more interactive than ever before. Instead of the traditional one-directional activity with one person narrating and an audience listening, the modern-day storyteller narrates his/her story, oftentimes with the assistance of video to convey the visual information. Almost at the same time (or with minimal delay), the audience can start participating in the story by providing feedback to the storyteller, sometimes to simply agree or disagree with the content of the story, other times to further enhance and enrich the story with more details and narratives. These personal and emotional stories told by the general population through online platforms are sometimes referred to as User Generated Content (UGC).
According to Wikipedia, UGC includes “any form of content such as blogs, wikis, discussion forums, posts, chats, tweets, podcasting, pins, digital images, video, audio files, and other forms of media that was created by users of an online system or service, often made available via social media websites” (http://en.wikipedia.org/wiki/User-generated_content).

Not long ago, most of the content on popular video sharing websites such as YouTube was produced by amateurs. Also, there existed a sharp dividing line between the amateur-produced content available on the internet, and the professionally generated premium content, as the latter was primarily distributed using conventional means such as linear TV programs, cinema, and/or subscription-only access, and was not made widely available online. However, as traditional media companies and publishers started to heed the power of the online video platforms (as well as their potential to attract advertising revenue), they have responded by customizing the content production process for online video distribution. In order to maintain and grow their share of advertising dollars, these traditional content producers became more open to sharing the premium video content through online distribution. As OTT video streaming services became popular, the front runners of the streaming service providers, such as Netflix and Amazon, also saw compelling reasons to break from the status quo of being simply the middleman, and have turned to producing and distributing their own video content. Famous examples included the “House of Cards” and “Orange is the New Black” original series on Netflix. Overall, the unmistakable popularity of web video has democratized the content creation and distribution process, transforming content creation (i.e. storytelling) into a more transparent, level, and sharable experience.

Today, a vast amount of high-quality premium content can be accessed on web-based platforms. As the nature of video storage and video consumption shifts, various independent market research studies have found that mobile video views are trending up. The Ooyala Global Video Index [18] shows the steep growth of video consumption on mobile devices between 2011 and 2013. This kind of explosive growth continues today. Three major factors contribute to this stunning growth rate: (1) the availability of premium online content; (2) the advance in wireless communications technology (as we discussed earlier in Section II) that enables mobile broadband connections; and (3) the more powerful tablets and phones with rich media capabilities.

B) The storytelling cost

Although this new era of the storytelling society is empowering, enabling such a storytelling experience is not without costs. The costs are generally twofold: the cost to infrastructure and service providers, and the cost to the users, including content providers and consumers (who are the storytellers and the audience). In this section, we discuss how the storytelling costs can be mitigated and/or compensated using the advanced wireless and video technologies.

Let us first take a look at the cost to the service providers. According to a brainstorming discussion at the 110th MPEG meeting [19], combining advanced wireless technology (LTE) with advanced video technology (HEVC) can bring costs down for carriers and at the same time improve quality of experience (QoE) for mobile users. Table 3 shows some example calculations of how many video users per cell can be accommodated using 3G and 4G networks. As shown, 4G LTE significantly improves spectral efficiency and data rate compared to 3G: the average downlink per cell is improved from 2 Mbps @5 MHz in 3G to 38.6 Mbps @20 MHz in 4G. Assuming that the users consume 640×480 @30 fps video coded using H.264/AVC, a 4G network is able to accommodate 40 users per cell, compared to only 2 users per cell in a 3G network. Further, if HEVC is used instead of H.264/AVC to code the video, the number of users per cell doubles again (because HEVC can compress the video twice as efficiently). As the video size becomes smaller due to more efficient video compression, it becomes easier to cope with fluctuations in wireless network bandwidth and to reduce video stalls, which are detrimental to QoE. Another important take-away from Table 3 is that advanced wireless technology and advanced video technology give the carriers the ability not only to serve more users, but also to serve those users better: with LTE and HEVC, wireless users can now have access to larger resolution videos such as 1280×720 and 1920×1080. As LTE infrastructure is built up and mobile devices become equipped with advanced chipsets with HEVC capabilities, it becomes cheaper for the carriers to support the storytelling society with higher quality video.

Table 3. Comparison between 3G and 4G networks carrying video coded using H.264/AVC and HEVC.
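The per-cell arithmetic quoted above can be sketched as a simple capacity division. This is only a minimal illustration and not the calculation behind Table 3: the per-stream bit rate of 965 kbps for 640×480 @30 fps H.264/AVC video is an assumption back-computed from the quoted figures (38.6 Mbps shared by 40 users), and the function name `users_per_cell` is hypothetical.

```python
def users_per_cell(cell_capacity_kbps: int, stream_kbps: int, codec_gain: int = 1) -> int:
    """Number of simultaneous video streams an average cell downlink can carry.

    codec_gain is the compression-efficiency factor relative to the codec
    that stream_kbps refers to (1 = H.264/AVC, 2 = HEVC, following the text).
    """
    if cell_capacity_kbps <= 0 or stream_kbps <= 0 or codec_gain <= 0:
        raise ValueError("all inputs must be positive")
    # A codec that is codec_gain times as efficient behaves like a cell
    # with codec_gain times the capacity for the same video quality.
    return (cell_capacity_kbps * codec_gain) // stream_kbps


# Average downlink per cell quoted in the text.
CAPACITY_3G_KBPS = 2_000    # 2 Mbps @ 5 MHz
CAPACITY_4G_KBPS = 38_600   # 38.6 Mbps @ 20 MHz

# ASSUMPTION: per-stream rate for 640x480 @30 fps H.264/AVC video.
AVC_VGA_KBPS = 965

for name, capacity in (("3G", CAPACITY_3G_KBPS), ("4G", CAPACITY_4G_KBPS)):
    avc = users_per_cell(capacity, AVC_VGA_KBPS)
    hevc = users_per_cell(capacity, AVC_VGA_KBPS, codec_gain=2)
    print(f"{name}: {avc} users with H.264/AVC, {hevc} users with HEVC")
```

Under this assumed stream rate the sketch gives 2 and 40 users per cell for H.264/AVC over 3G and 4G, and twice as many when HEVC is used.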

So what about the cost for the storytellers? As we discussed earlier, the storytellers include the general population (that is, the general consumers) who provide the UGC, the traditional media companies and publishers, as well as those service providers who have recently transformed into content providers. Let us roughly categorize the storytellers into amateurs (i.e. the general consumers) and professionals (the traditional and new content providers). As will become evident in our analysis, both will benefit from the advanced technologies, although the specific impacts on the amateurs and the professionals will be quite different.

First let us look at the storytelling cost for the amateurs. To be engaged as storytellers (e.g. sharing moments with friends and family), consumers need to pay for the mobile devices and then pay the wireless carriers to have “anytime anywhere” network access. Data plans that charge based on usage are in widespread use. For consumers on these usage-based plans, the benefit of more efficient video compression is obvious: it directly brings down the cost per video consumed. Even for those who pay a monthly fee for mobile broadband data access (e.g. 2.5 GB of high-speed data per month), the carriers usually enforce a data quota, beyond which the consumers will either have to pay more or forgo access to the high-speed data network. So the reduction in video bandwidth consumption due to HEVC directly benefits the consumers as a whole.
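To make the quota argument concrete, the following minimal sketch estimates how many hours of video fit in the 2.5 GB monthly quota mentioned above. The stream rate of 965 kbps for H.264/AVC is an illustrative assumption (roughly 38.6 Mbps shared by 40 users, the figures quoted in the carrier discussion), the quota is taken as decimal gigabytes, and the function name `viewing_hours` is hypothetical.

```python
def viewing_hours(quota_gb: float, stream_kbps: float) -> float:
    """Hours of video that fit in a data quota at a constant stream bit rate."""
    if quota_gb <= 0 or stream_kbps <= 0:
        raise ValueError("inputs must be positive")
    quota_bits = quota_gb * 1e9 * 8             # decimal gigabytes -> bits
    seconds = quota_bits / (stream_kbps * 1e3)  # kbps -> bits per second
    return seconds / 3600


QUOTA_GB = 2.5            # monthly high-speed quota from the example in the text
AVC_KBPS = 965.0          # ASSUMPTION: illustrative H.264/AVC stream rate
HEVC_KBPS = AVC_KBPS / 2  # same quality at half the bit rate, per the text

print(f"H.264/AVC: {viewing_hours(QUOTA_GB, AVC_KBPS):.1f} h")
print(f"HEVC:      {viewing_hours(QUOTA_GB, HEVC_KBPS):.1f} h")
```

Whatever stream rate is assumed, halving the bit rate doubles the viewing time available within the same quota.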

For the amateur storytellers, the latest technologies offer another less obvious, but equally important, potential cost reduction benefit: reduced power consumption on the mobile device. The form factor of our mobile devices limits the amount of battery power they have. As we take our phones everywhere we go, it is not uncommon (and very frustrating) to run out of battery. A big source of power consumption on the device is uplink transmission. For the mobile device, sending data consumes significantly more power than receiving data. When we upload a video coded using the latest and most efficient video compression standard, fewer bits need to be transmitted. This translates into reduced power consumption, and longer lasting batteries on the mobile device lead to an improved storytelling experience.

It is worth noting that the above analysis on power saving applies only when the storyteller is the amateur directly uploading a video he or she has taken on the mobile device. As wireless and video compression technologies become more sophisticated, the overall system complexity increases, which can translate to increased power consumption. To alleviate the power consumption issue, mobile devices use chipsets that include dedicated hardware implementations of wireless modems and video encoders/decoders. Advances in silicon manufacturing technologies reduce the footprint of ASICs, leading to ever faster and more power-efficient hardware implementations. There is also the balance between modem power and video power consumption to be considered. For example, although HEVC-coded content takes more power to decode than H.264-coded content, only half the bits need to be received over the wireless channel, leading to reduced modem power. That is why, over the years, our mobile devices have been able to support more applications and more advanced wireless and video technologies while the device's overall battery life has mostly remained stable.

Next, let us look at the storytelling cost for the professional storytellers. It has recently been reported that more than half of mobile video viewers are watching long-form video content of more than 30 min [18]. Advertisers are taking notice of the rapid growth in mobile video consumption, and are shifting their resource allocation and creative implementation away from the linear TV distribution model (where interruptive advertising is the norm) toward emerging models that are driven by relevancy and native impressions, which ultimately allow the advertisers to tell their brand's story with authenticity and relevant narratives. As the professional storytellers shift their content creation and production resources toward catering to mobile video viewers, they stand to capture the shift in advertising revenue.

However, catering to an increasingly diverse audience at the same time requires the professional storytellers to maintain a growing database of video content: some content will be made suitable for conventional distribution, other content for distribution to mobile. Due to the difference between the conventional linear TV model and the new mobile video distribution model, the same content needs to be prepared in different spatial resolutions, frame rates, coded bit rates, transport formats, etc. Further, different content may need to be separately produced for mobile distribution and for conventional distribution. For example, a shorter version of the same program may be created for mobile distribution, as studies show that people's mobile video consumption habits tend to focus more on shorter programs (although there is also a recent trend that indicates this may be changing). Another example of customizing the original content for mobile distribution is video retargeting. Because the screen size of a mobile device is smaller, content retargeting can be used to focus on the most interesting part of the scene, e.g. the face of the leading actor/actress, without compromising the storyline. As a multitude of content versions get created, storage, management, and delivery costs will all increase. In order to maintain a reasonable storytelling cost, it is essential for the storytellers to take advantage of the latest and most efficient video compression standards.

C) Storytelling in the future

Up to now our discussion has been focused on the recent formation of the storytelling society in the affluent regions of the world. As people in the developed countries enjoy rapid development and deployment of advanced technologies, a large portion of the population in the developing countries is being left behind. This global disparity is also called the global digital divide. The global digital divide represents an increasing gap between people in the developed and the developing countries in terms of their access to computing and information resources. Today about three in five of the world's population still do not have an internet connection. To further advance the storytelling society, the digital divide must be bridged.

For regions that have little fixed internet infrastructure, it has been reported that mobile broadband is generally less costly than fixed broadband [20]; in some countries, the cost of mobile broadband can be only one-third to one-fourth of that of fixed broadband. In some African countries, proper government policy and regulations, as well as successful private–public partnerships and foreign aid programs, have helped to build up widespread mobile broadband access, including connections in the rural areas. Affordability also continues to improve. As the remote and poorer regions of the world become connected (often through mobile infrastructure), basic human services such as medical care and education can be delivered remotely, using advanced video technologies. These are positive developments that will help to bridge the digital divide and improve quality of life for the rural residents for generations to come.

V. CONCLUDING REMARKS

In this paper, we discussed how multi-disciplinary technological advances in areas such as silicon manufacturing, wireless communications, video communications, etc., have changed our way of life. The technological advances have led us into a new storytelling era with a more democratized process of content creation and sharing. Cost reduction is achieved throughout the ecosystem for all the players, including the service providers, the amateurs, and the professionals alike. Take for example the time it takes to download a 2-h movie coded using HEVC in a 4G network versus the same movie coded using the next generation video codec (2× as efficient as HEVC) in a 5G network. As shown in Table 4, the time to download can be significantly reduced from more than 3.5 min to only about 1 s.

Table 4. Time to download a 2-h UHD movie onto the mobile device.
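The download-time comparison follows from dividing the file size (duration times coded bit rate) by the link throughput. The sketch below illustrates that relationship only; the bit rates and throughputs are hypothetical values chosen so that the results fall in the same range as the figures quoted above, and they are not taken from Table 4. The function name `download_seconds` is likewise an assumption.

```python
def download_seconds(duration_s: float, video_mbps: float, link_mbps: float) -> float:
    """Time to transfer a constant-bit-rate video over a link of fixed throughput."""
    if duration_s <= 0 or video_mbps <= 0 or link_mbps <= 0:
        raise ValueError("inputs must be positive")
    file_megabits = duration_s * video_mbps
    return file_megabits / link_mbps


MOVIE_S = 2 * 3600  # 2-h movie, as in the text

# ASSUMPTIONS (hypothetical, not the values used for Table 4):
HEVC_MBPS = 15.0               # UHD stream coded with HEVC
NEXT_GEN_MBPS = HEVC_MBPS / 2  # codec 2x as efficient as HEVC, per the text
LINK_4G_MBPS = 500.0           # sustained 4G throughput
LINK_5G_MBPS = 50_000.0        # sustained 5G throughput

t_4g = download_seconds(MOVIE_S, HEVC_MBPS, LINK_4G_MBPS)
t_5g = download_seconds(MOVIE_S, NEXT_GEN_MBPS, LINK_5G_MBPS)
print(f"4G + HEVC: {t_4g:.0f} s; 5G + next-generation codec: {t_5g:.2f} s")
```

The point of the sketch is that the two gains multiply: a codec twice as efficient halves the file size, and a faster link divides the remaining transfer time by the throughput ratio.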

Finally, we predict that these advanced technologies will eventually help to bridge the digital divide, and improve quality of life for the less developed regions in the global community.

Byung K. Yi is InterDigital's Executive Vice President, InterDigital Labs, and Chief Technology Officer. As head of InterDigital Labs, Dr. Yi is responsible for directing the development of advanced wireless and network technologies, the evolution of standards-based technologies and the company's participation in wireless standards bodies. Dr. Yi joined InterDigital in April 2014 from the Federal Communications Commission (FCC), where he had served as assistant division chief of engineering since 2012. Prior to his appointment at the FCC, Dr. Yi was at LG Electronics from 2000 to 2012, where as Senior Executive Vice President he headed the company's North American R&D center. A former member of InterDigital's Technology Advisory Council, Dr. Yi contributes more than 30 years of advanced wireless development experience. Dr. Yi also contributes a strong history of industry leadership. He currently serves on the board of directors of the Telecommunications Industry Association and has served on the board of directors or steering committees of a number of professional organizations, including the Center for Wireless Communications, the 3rd Generation Partnership Project 2 Technical Specification Group, and a number of others. He was awarded the prestigious CDG (CDMA Development Group) Industry Leadership award, has been recognized by the National Engineers Week (NEW) Foundation, and was inducted into the Hall of Fame of the School of Engineering and Applied Science of George Washington University. Dr. Yi received his bachelor's degree in electrical engineering from Yonsei University (Korea), his master's degree in electrical engineering from the University of Michigan, and his doctoral degree in electrical engineering from George Washington University.

Yan Ye received her Ph.D. from the Electrical and Computer Engineering Department at the University of California, San Diego in 2002. She received her B.S. and M.S. degrees, both in Electrical Engineering, from the University of Science and Technology of China, in 1994 and 1997, respectively. She is currently with InterDigital Communications, where she manages the video standards project. Previously she was with Image Technology Research at Dolby Laboratories Inc. and Multimedia R&D and Standards at Qualcomm Inc. She has been involved in the development of various video coding standards, including the HEVC standard, its scalable extensions and its screen content coding extensions, the Key Technology Areas of ITU-T/VCEG, and the scalable extensions of H.264/AVC. Her research interests include video coding, processing and streaming.

REFERENCES

[1] Robert H'obbes' Zakon, Hobbes Internet Timeline v7.0, http://www.zakon.org/robert/internet/timeline/

[2] Ericsson Mobility Report, http://www.ericsson.com/ericsson-mobility-report.

[3] Jensen, R.: The Dream Society: How the Coming Shift from Information to Imagination Will Transform Your Business, 2001.

[4] Telatar, E.: Capacity of Multi-antenna Gaussian Channels. Eur. Trans. Telecommun., 10 (1999), 585–595.

[5] Small Cell Forum, 2013 Annual Small Cell Forum Report, 2013.

[6] ITU-T, Video Codec for Audiovisual Services at p×64 Kbit/s, ITU-T Rec. H.261, version 1: 1990, version 2: 1993.

[7] ITU-T and ISO/IEC JTC 1, Generic Coding of Moving Pictures and Associated Audio Information – Part 2: Video, ITU-T Rec. H.262 and ISO/IEC 13818-2 (MPEG-2), version 1: 1994.

[8] ITU-T, Video Coding for Low Bitrate Communication, ITU-T Rec. H.263, version 1: 1995, version 2: 1998, version 3: 2000.

[9] ITU-T and ISO/IEC JTC 1, Advanced Video Coding for generic audio-visual services, ITU-T Rec. H.264 and ISO/IEC 14496-10 (AVC), version 1: 2003, version 2: 2004, versions 3, 4: 2005, versions 5, 6: 2006, versions 7, 8: 2007, versions 9, 10, 11: 2009, versions 12, 13: 2010, versions 14, 15: 2011, version 16: 2012.

[10] ITU-T and ISO/IEC JTC 1, High Efficiency Video Coding, ITU-T Rec. H.265 and ISO/IEC 23008-2 (HEVC), version 1: 2013, version 2: 2014.

[11] ISO/IEC JTC 1, Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to about 1.5 Mbit/s – Part 2: Video, ISO/IEC 11172-2 (MPEG-1), 1993.

[12] ISO/IEC JTC 1, Coding of Audio-Visual Objects – Part 2: Visual, ISO/IEC 14496-2 (MPEG-4 Visual), version 1: 1999, version 2: 2000, version 3: 2004.

[13] Ohm, J.R.; Sullivan, G.J.; Schwarz, H.; Tan, T.K.; Wiegand, T.: Comparison of the coding efficiency of video coding standards – Including High Efficiency Video Coding (HEVC). IEEE Trans. Circuits Syst. Video Technol. (special issue on HEVC), 22 (2012), 1669–1684.

[14] Sullivan, G.J.; Ohm, J.R.; Han, W.-J.; Wiegand, T.: Overview of the High Efficiency Video Coding (HEVC) Standard. IEEE Trans. Circuits Syst. Video Technol. (special issue on HEVC), 22 (2012), 1649–1668.

[15] Bjontegaard, G.: Improvements of the BD-PSNR model. ITU-T/SG16/Q.6/VCEG document VCEG-AI11, Berlin, Germany, Jul. 2008.

[16] Tan, T.K.; Mrak, M.; Baroncini, V.; Ramzan, N.: HEVC verification test results, JCTVC-Q0204, April 2014.

[17] HM KTA software version HM-14.0-KTA-1.0-rc1, https://vceg.hhi.fraunhofer.de/svn/svn_HMKTASoftware/tags/HM-14.0-KTA-1.0-rc1/

[18] Ooyala Global Video Index, Q4, 2013.

[19] Huawei Technologies, Insights into Future Video Codec, the 110th MPEG Meeting, Strasbourg, France, October 2014.

[20] Alliance for Affordable Internet, The Affordability Report, 2013.

